The GHC Commentary - The Mighty Simplifier

Most of the optimising program transformations applied by GHC are performed on an intermediate language called Core, which essentially is a compiler-friendly formulation of rank-2 polymorphic lambda terms defined in the module CoreSyn.lhs. The transformation engine optimising Core programs is called the Simplifier and composed from a couple of modules located in the directory fptools/ghc/compiler/simplCore/. The main engine of the simplifier is contained in Simplify.lhs. and its driver is the routine core2core in SimplCore.lhs.

The program that the simplifier has produced after applying its various optimisations can be obtained by passing the option -ddump-simpl to GHC. Moreover, the various intermediate stages of the optimisation process is printed when passing -dverbose-core2core.

Recursive Definitions

The simplification process has to take special care when handling recursive binding groups; otherwise, the compiler might loop. Therefore, the routine reOrderRec in OccurAnal.lhs computes a set of loop breakers - a set of definitions that together cut any possible loop in the binding group. It marks the identifiers bound by these definitions as loop breakers by enriching their occurence information. Loop breakers will never be inlined by the simplifier; thus, guaranteeing termination of the simplification procedure. (This is not entirely accurate -- see rewrite rules below.) The processes finding loop breakers works as follows: First, the strongly connected components (SCC) of the graph representing all function dependencies is computed. Then, each SCC is inspected in turn. If it contains only a single binding (self-recursive function), this is the loop breaker. In case of multiple recursive bindings, the function attempts to select bindings where the decision not to inline them does cause the least harm - in the sense of inhibiting optimisations in the code. This is achieved by considering each binding in turn and awarding a score between 0 and 4, where a lower score means that the function is less useful for inlining - and thus, a better loop breaker. The evaluation of bingings is performed by the function score locally defined in OccurAnal. Note that, because core programs represent function definitions as one binding choosing between the possibly many equations in the source program with a case construct, a loop breaker cannot inline any of its possibly many alternatives (not even the non-recursive alternatives).

Rewrite Rules

The application of rewrite rules is controlled in the module Simplify.lhs by the function completeCall. This function first checks whether it should inline the function applied at the currently inspected call site, then simplifies the arguments, and finally, checks whether any rewrite rule can be applied (and also whether there is a matching specialised version of the applied function). The actual check for rule application is performed by the function Rules.lookupRule.

It should be note that the application of rewrite rules is not subject to the loop breaker check - i.e., rules of loop breakers will be applied regardless of whether this may cause the simplifier to diverge.

Last modified: Wed Aug 8 19:25:33 EST 2001