A Performance Study for the LLVM back-end of GHC

Abstract

The objective of this project is to find Haskell programs which are not well optimized with the LLVM backend, resulting in a slower binary, compared with the Native Code Generator of GHC. They will be identified by a series of benchmarks and comparisons of the generated assembly code. These programs, with the assembly code generated by both back-ends of GHC, will be used as a base to identify issues in the LLVM optimization. New optimizations will be written to LLVM to treat the cases that the programs made clear, or bugs will be fixed in LLVM if it's shown that they are responsible for the bad performance.

Background

Haskell is one of the most popular functional programming languages. It is widely taught in the academia as an example of the functional programming paradigm, and it has seen use in the industry. Haskell is developed as an open source project, enjoying the support of a very active community of developers.

The Glasgow Haskell Compiler (GHC) is the de facto standard Haskell compiler^. This is an open source native code compiler, whose project started on the late eighties. GHC has been mostly developed at the University of Glasgow; however, nowadays it contains patches sent by many different developers, spread all over the world. Currently this compiler is distributed together with the Haskell Platform.

The Low Level Virtual Machine (LLVM) provides compiler writers with a set of compilation tools, such as front-ends for many languages, and back-ends for several architectures. LLVM also provides developers a rich set of language and machine independent code optimizations. It has been in development for 10 years now, it is used in a lot of open source projects, and it is mentioned in a increasing number of papers and theses. It receives an Intermediate Representation (IR), which is a portable high-level assembly language, and produces machine code for several architectures.

Recently, David Terei addapted GHC to use LLVM as a back-end[Terei09], in addition to the already existing C and Native Code Generator (NGC) back-ends. The motivation for using LLVM as a back-end was the ease of implementation, since no architecture specific details should be treated by the compiler writers; and the performance gains, since all the optimizations already available in LLVM can be used. It is an intermediate option between having more control and requiring more work, like compiling directly to machine code, as the NGC back-end does; and having less control and work compiling to a high-level language, as the C back-end does.

The LLVM back-end performance

Contrary to the general intuition, the performance gains delivered by LLVM to the Haskell language were not as good as expected. In some cases, like in the case of Data Parallel Haskell, LLVM got an average performance improvement of 25%. LLVM also shines on tight loop optimization [Stewart10-2]. However, the optimization steps of LLVM produced, in some cases, even worse code than the non-optimized Haskell binaries without optimization. Therefore, a study about the performance of the code generated with GHC and LLVM is relevant.

In the conclusion of [Terei09], three possibilities are considered to justify the bad performance of the binaries generated by the LLVM back-end. First, that the LLVM assembly code produced by GHC is not efficient, because of incorrect alignment directives. Second, that the changes introduced in LLVM — necessary for a custom calling conventions to adapt the GHC Intermediate Representation Cmm to LLVM — introduced optimization bugs, because they alter register definitions. Both these possibilities are considered less likely to be interfering with the results, and what is believed to be the cause of the poor performance is a bug in LLVM.

This belief is fundamented with an example where the stack is manipulated, but not used. In the x86 assembly below, produced by LLVM, the %esp register is manipulated, being subtracted by four and then incremented by four, but its value is never used:

Utils_doStatefulOp1_entry:
        subl    $4, %esp        # Redundant instruction
        movl    4(%ebp), %eax
        movl    8(%ebp), %ecx
        movl    (%ebp), %esi
        movl    %eax, 8(%ebp)
        movl    %ecx, 4(%ebp)
        addl    $4, %ebp
        addl    $4, %esp        # Redundant instruction
        jmp     stg_ap_pp_fast

This code was generated after being optimized with LLVM. Similar problems could be happening in a lot of situations. Also, some simple optimizations that the NCG back-end does may not be done by LLVM. This work relies on this assumption to find this(these) bug(s) in LLVM, and correct them when possible.

Choosing the correct parameters

LLVM and GHC support several optimization options, starting with -f, and some options to group them, starting with -O. Although the grouping options are usually more used, they are often not the best choice. Genetic algorithms can be used to determine what are the best choices of options to a particular program. This technique has shown good results with GHC and LLVM[Stewart10]. I plan to base my work on analizing the assembly code of programs generated using the best set of options, as obtained by acovea. This tool implements a genetic algorithm that searches the best options to compile a program using gcc. Acovea uses gcc, which has a similar suite of code optimizations than LLVM.

Timeline

On the first week I plan to work on defining the better optimizations options for each program in the nofib benchmark suite, for the NCG and the LLVM back-ends. In the following week I'll benchmark all the programs with these options, using the Criterion library to measure. I'll publish the results of these activities as soon as I have them in my weblog.

During the rest of June, I will identify cases where the assembly generated by the NCG back-end is more efficient than the one generated with LLVM, and study them, trying to see what is not being done by the LLVM optimizer. I will try to find common patterns in these cases and determine if it's a bug in LLVM that should be fixed or if a new optimization algorithm should be included to handle these cases.

In the mid-term evaluation period, from July 12 to 16, I should have a group of cases with common patterns, and a direction about how they can be improved in LLVM. Until August 9, I will work on implementing a new compiler optimization on LLVM that improves the quality of the code produced for Haskell. Or I will spend this time fixing any bug that I have previously found. From August 9 to 16 I will document what I did in the whole project, joining the contents of the blog posts, and the generated patch.

Biography

I am currently a last year undergrad student at Federal University of Minas Gerais, and working on my graduation project under orientation of professor Dr. Carlos Camarão de Figueiredo, a member of the Haskell 2011 Committee, and co-orientation of professor Dr. Fernando Pereira. I started to use Haskell in a research project about Urban Modelling in 2004, and never stopped since then. I contributed with patches to some widely used Haskell projects, such as XMonad and Darcs. I am a Debian Developer and member of the Debian Haskell Group, maintaining a lot of Haskell libraries, including gtk2hs. I can get a lot of help on LLVM. I work together with Andre Tavares and Andrei Rimsa, two summer of coders last year, 2009, and my co-advisor, Fernando Pereira, was a Summer of Coder in 2006. These three Summer of Code projects were in LLVM.

I believe I will be able to successfully work in this project for a number of reasons. First, I have been studying the GHC code recently, and helping Kari Pahula with his maintenance of the ghc6 package in Debian. I have written some of the patches that are applied in it. Second, I have been programming in Haskell for 5 years, and in C++ for 8 years, and I think I will not have problems to read the x86 assembly generated from the back-ends, the GHC source, or the LLVM source code. Finally, I am really excited about working on GHC, which is a project to which I have the desire to contribute since I started using Haskell, because of my big interest in compilers. This is something I have been preparing myself for, and it is great to see an opportunity to make it happen.

Contact info

Emailmarcot@debian.org
IRC marcot@irc.oftc.net (#debian-haskell, #debconf, #debian-devel, #debian-br)
marcot@irc.freenode.net (#ghc, #linux-bh)
Jabbermarcot@jabber-br.org
Phone+55 31 33181580
Mobile+55 31 98116720
AddressRua Boaventura, 771/303-E Indaiá 31270-020 Belo Horizonte/MG Brasil

Bibliography

  1. [Terei09] Terei, D. A. Low Level Virtual Machine for Glasgow Haskell Compiler. See http://www.cse.unsw.edu.au/~pls/thesis/davidt-thesis.pdf .
  1. [Stewart10] Stewart, D. Evolving Faster Haskell Programs (now with LLVM!). See http://donsbot.wordpress.com/2010/03/01/evolving-faster-haskell-programs-now-with-llvm/ .
  1. [Stewart10-2] Stewart, D. Smoking fast Haskell code using GHC’s new LLVM codegen. See http://donsbot.wordpress.com/2010/02/21/smoking-fast-haskell-code-using-ghcs-new-llvm-codegen/ .
 
marco_soc2.txt · Last modified: 2010/04/09 15:20 by Fernando Pereira