[[project @ 2003-03-20 11:53:57 by simonmar] simonmar**20030320115357 Reformat this section a bit, and add a note about the poor performance of mutable arrays in the garbage collector. ] { hunk ./ghc/docs/users_guide/sooner.sgml 1 - -Advice on: sooner, faster, smaller, thriftier - + +Advice on: sooner, faster, smaller, thriftier hunk ./ghc/docs/users_guide/sooner.sgml 4 - -Please advise us of other “helpful hints” that should go here! - +Please advise us of other “helpful hints” that +should go here! hunk ./ghc/docs/users_guide/sooner.sgml 7 - -Sooner: producing a program more quickly - + +Sooner: producing a program more quickly + hunk ./ghc/docs/users_guide/sooner.sgml 11 - -compiling faster -faster compiling - +compiling faster +faster compiling hunk ./ghc/docs/users_guide/sooner.sgml 14 - -Don't use or (especially) : - - -By using them, you are telling GHC that you are willing to suffer -longer compilation times for better-quality code. - + + + Don't use or (especially) : + + By using them, you are telling GHC that you are + willing to suffer longer compilation times for + better-quality code. hunk ./ghc/docs/users_guide/sooner.sgml 22 - -GHC is surprisingly zippy for normal compilations without ! - - - - -Use more memory: - - -Within reason, more memory for heap space means less garbage -collection for GHC, which means less compilation time. If you use the - option, you'll get a garbage-collector -report. (Again, you can use the cheap-and-nasty option to send the GC stats straight to standard error.) - + GHC is surprisingly zippy for normal compilations + without ! + + hunk ./ghc/docs/users_guide/sooner.sgml 27 - -If it says you're using more than 20% of total time in garbage -collecting, then more memory would help. - + + Use more memory: + + Within reason, more memory for heap space means less + garbage collection for GHC, which means less compilation + time. If you use the option, + you'll get a garbage-collector report. 
(Again, you can use + the cheap-and-nasty + option to send the GC stats straight to standard + error.) hunk ./ghc/docs/users_guide/sooner.sgml 38 - -If the heap size is approaching the maximum (64M by default), and you -have lots of memory, try increasing the maximum with the --M<size> option option, e.g.: ghc -c -O --M1024m Foo.hs. - + If it says you're using more than 20% of total + time in garbage collecting, then more memory would + help. hunk ./ghc/docs/users_guide/sooner.sgml 42 - -Increasing the default allocation area size used by the compiler's RTS -might also help: use the -A<size> option -option. - + If the heap size is approaching the maximum (64M by + default), and you have lots of memory, try increasing the + maximum with the + -M<size> + option option, e.g.: ghc -c + -O -M1024m Foo.hs. hunk ./ghc/docs/users_guide/sooner.sgml 49 - -If GHC persists in being a bad memory citizen, please report it as a -bug. - - - - -Don't use too much memory! - - -As soon as GHC plus its “fellow citizens” (other processes on your -machine) start using more than the real memory on your -machine, and the machine starts “thrashing,” the party is -over. Compile times will be worse than terrible! Use something -like the csh-builtin time command to get a report on how many page -faults you're getting. - + Increasing the default allocation area size used by + the compiler's RTS might also help: use the + -A<size> + option option. hunk ./ghc/docs/users_guide/sooner.sgml 54 - -If you don't know what virtual memory, thrashing, and page faults are, -or you don't know the memory configuration of your machine, -don't try to be clever about memory use: you'll just make -your life a misery (and for other people, too, probably). - - - - -Try to use local disks when linking: - - -Because Haskell objects and libraries tend to be large, it can take -many real seconds to slurp the bits to/from a remote filesystem. 
- + If GHC persists in being a bad memory citizen, please + report it as a bug. + + hunk ./ghc/docs/users_guide/sooner.sgml 59 - -It would be quite sensible to compile on a fast machine using -remotely-mounted disks; then link on a slow machine that had -your disks directly mounted. - - - - -Don't derive/use Read unnecessarily: - - -It's ugly and slow. - - - - -GHC compiles some program constructs slowly: - - -Deeply-nested list comprehensions seem to be one such; in the past, -very large constant tables were bad, too. - + + Don't use too much memory! + + As soon as GHC plus its “fellow citizens” + (other processes on your machine) start using more than the + real memory on your machine, and the + machine starts “thrashing,” the party + is over. Compile times will be worse than + terrible! Use something like the csh-builtin + time command to get a report on how many + page faults you're getting. hunk ./ghc/docs/users_guide/sooner.sgml 71 - -We'd rather you reported such behaviour as a bug, so that we can try -to correct it. - + If you don't know what virtual memory, thrashing, and + page faults are, or you don't know the memory configuration + of your machine, don't try to be clever + about memory use: you'll just make your life a misery (and + for other people, too, probably). + + hunk ./ghc/docs/users_guide/sooner.sgml 79 - -The part of the compiler that is occasionally prone to wandering off -for a long time is the strictness analyser. You can turn this off -individually with . --fno-strictness anti-option - + + Try to use local disks when linking: + + Because Haskell objects and libraries tend to be + large, it can take many real seconds to slurp the bits + to/from a remote filesystem. hunk ./ghc/docs/users_guide/sooner.sgml 86 - -To figure out which part of the compiler is badly behaved, the - - option is your friend. 
- + It would be quite sensible to + compile on a fast machine using + remotely-mounted disks; then link on a + slow machine that had your disks directly mounted. + + hunk ./ghc/docs/users_guide/sooner.sgml 93 - -If your module has big wads of constant data, GHC may produce a huge -basic block that will cause the native-code generator's register -allocator to founder. Bring on -fvia-C option -(not that GCC will be that quick about it, either). - - - - -Explicit import declarations: - - -Instead of saying import Foo, say import -Foo (...stuff I want...) You can get GHC to tell you the -minimal set of required imports by using the - option (see ). - + + Don't derive/use Read unnecessarily: + + It's ugly and slow. + + hunk ./ghc/docs/users_guide/sooner.sgml 100 - -Truthfully, the reduction on compilation time will be very small. -However, judicious use of import declarations can make a -program easier to understand, so it may be a good idea anyway. - - - - - + + GHC compiles some program constructs slowly: + + Deeply-nested list comprehensions seem to be one such; + in the past, very large constant tables were bad, + too. hunk ./ghc/docs/users_guide/sooner.sgml 107 - + We'd rather you reported such behaviour as a bug, so + that we can try to correct it. + + The part of the compiler that is occasionally prone to + wandering off for a long time is the strictness analyser. + You can turn this off individually with + . + -fno-strictness + anti-option + + To figure out which part of the compiler is badly + behaved, the + + option is your friend. + + If your module has big wads of constant data, GHC may + produce a huge basic block that will cause the native-code + generator's register allocator to founder. Bring on + -fvia-C + option (not that GCC will be that + quick about it, either). + + + + + Explicit import declarations: + + Instead of saying import Foo, say + import Foo (...stuff I want...) 
You can + get GHC to tell you the minimal set of required imports by + using the option + (see ). + + Truthfully, the reduction on compilation time will be + very small. However, judicious use of + import declarations can make a program + easier to understand, so it may be a good idea + anyway. + + + + + + + Faster: producing a program that runs quicker + + faster programs, how to produce hunk ./ghc/docs/users_guide/sooner.sgml 155 - -Faster: producing a program that runs quicker - + The key tool to use in making your Haskell program run + faster are GHC's profiling facilities, described separately in + . There is no + substitute for finding where your program's time/space + is really going, as opposed to where you + imagine it is going. hunk ./ghc/docs/users_guide/sooner.sgml 162 - -faster programs, how to produce - + Another point to bear in mind: By far the best way to + improve a program's performance dramatically + is to use better algorithms. Once profiling has thrown the + spotlight on the guilty time-consumer(s), it may be better to + re-think your program than to try all the tweaks listed below. hunk ./ghc/docs/users_guide/sooner.sgml 168 - -The key tool to use in making your Haskell program run faster are -GHC's profiling facilities, described separately in . There is no substitute for -finding where your program's time/space is really going, as -opposed to where you imagine it is going. - + Another extremely efficient way to make your program snappy + is to use library code that has been Seriously Tuned By Someone + Else. You might be able to write a better + quicksort than the one in Data.List, but it + will take you much longer than typing import + Data.List. hunk ./ghc/docs/users_guide/sooner.sgml 175 - -Another point to bear in mind: By far the best way to improve a -program's performance dramatically is to use better -algorithms. 
Once profiling has thrown the spotlight on the guilty -time-consumer(s), it may be better to re-think your program than to -try all the tweaks listed below. - + Please report any overly-slow GHC-compiled programs. Since + GHC doesn't have any credible competition in the performance + department these days it's hard to say what overly-slow means, so + just use your judgement! Of course, if a GHC compiled program + runs slower than the same program compiled with NHC or Hugs, then + it's definitely a bug. hunk ./ghc/docs/users_guide/sooner.sgml 182 - -Another extremely efficient way to make your program snappy is to use -library code that has been Seriously Tuned By Someone Else. You -might be able to write a better quicksort than the one in the -HBC library, but it will take you much longer than typing import -QSort. (Incidentally, it doesn't hurt if the Someone Else is Lennart -Augustsson.) - + + + Optimise, using or : + + This is the most basic way to make your program go + faster. Compilation time will be slower, especially with + . hunk ./ghc/docs/users_guide/sooner.sgml 190 - -Please report any overly-slow GHC-compiled programs. The current -definition of “overly-slow” is “the HBC-compiled version ran -faster”… - + At present, is nearly + indistinguishable from . + + hunk ./ghc/docs/users_guide/sooner.sgml 195 - - + + Compile via C and crank up GCC: + + The native code-generator is designed to be quick, not + mind-bogglingly clever. Better to let GCC have a go, as it + tries much harder on register allocation, etc. hunk ./ghc/docs/users_guide/sooner.sgml 202 - -Optimise, using or : - - -This is the most basic way -to make your program go faster. Compilation time will be slower, -especially with . - + At the moment, if you turn on you + get GCC instead. This may change in the future. hunk ./ghc/docs/users_guide/sooner.sgml 205 - -At present, is nearly indistinguishable from . 
- - - - -Compile via C and crank up GCC: - - -The native code-generator is designed to be quick, not mind-bogglingly -clever. Better to let GCC have a go, as it tries much harder on -register allocation, etc. + So, when we want very fast code, we use: . + + hunk ./ghc/docs/users_guide/sooner.sgml 210 -At the moment, if you turn on you get GCC -instead. This may change in the future. + + Overloaded functions are not your friend: + + Haskell's overloading (using type classes) is elegant, + neat, etc., etc., but it is death to performance if left to + linger in an inner loop. How can you squash it? hunk ./ghc/docs/users_guide/sooner.sgml 217 - -So, when we want very fast code, we use: . - - - - -Overloaded functions are not your friend: - - -Haskell's overloading (using type classes) is elegant, neat, etc., -etc., but it is death to performance if left to linger in an inner -loop. How can you squash it? - + + + Give explicit type signatures: + + Signatures are the basic trick; putting them on + exported, top-level functions is good + software-engineering practice, anyway. (Tip: using + -fwarn-missing-signatures + option can help enforce good + signature-practice). hunk ./ghc/docs/users_guide/sooner.sgml 228 - - + The automatic specialisation of overloaded + functions (with ) should take care + of overloaded local and/or unexported functions. + + hunk ./ghc/docs/users_guide/sooner.sgml 234 - -Give explicit type signatures: - - -Signatures are the basic trick; putting them on exported, top-level -functions is good software-engineering practice, anyway. (Tip: using --fwarn-missing-signatures -option can help enforce good signature-practice). - + + Use SPECIALIZE pragmas: + + SPECIALIZE pragma + overloading, death to hunk ./ghc/docs/users_guide/sooner.sgml 240 - -The automatic specialisation of overloaded functions (with ) -should take care of overloaded local and/or unexported functions. 
- - - - -Use SPECIALIZE pragmas: - - -SPECIALIZE pragma -overloading, death to - + Specialize the overloading on key functions in + your program. See + and . + + hunk ./ghc/docs/users_guide/sooner.sgml 246 - -Specialize the overloading on key functions in your program. See - and -. - - - - -“But how do I know where overloading is creeping in?”: - - -A low-tech way: grep (search) your interface files for overloaded -type signatures; e.g.,: + + “But how do I know where overloading is creeping in?”: + + A low-tech way: grep (search) your interface + files for overloaded type signatures. You can view + interface files using the + option (see ). hunk ./ghc/docs/users_guide/sooner.sgml 255 - -% egrep '^[a-z].*::.*=>' *.hi - + +% ghc --show-iface Foo.hi | egrep '^[a-z].*::.*=>' + + + + + + + hunk ./ghc/docs/users_guide/sooner.sgml 265 - - - - - - - - -Strict functions are your dear friends: - - -and, among other things, lazy pattern-matching is your enemy. - + + Strict functions are your dear friends: + + and, among other things, lazy pattern-matching is your + enemy. hunk ./ghc/docs/users_guide/sooner.sgml 271 - -(If you don't know what a “strict function” is, please consult a -functional-programming textbook. A sentence or two of -explanation here probably would not do much good.) - + (If you don't know what a “strict + function” is, please consult a functional-programming + textbook. A sentence or two of explanation here probably + would not do much good.) hunk ./ghc/docs/users_guide/sooner.sgml 276 - -Consider these two code fragments: + Consider these two code fragments: hunk ./ghc/docs/users_guide/sooner.sgml 278 - + hunk ./ghc/docs/users_guide/sooner.sgml 282 - + hunk ./ghc/docs/users_guide/sooner.sgml 284 -The former will result in far better code. - + The former will result in far better code. 
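Note on the hunks at lines 276 to 284: the two code fragments that "Consider these two code fragments" refers to are unchanged context, so they do not appear in this patch. The following is a minimal sketch of the contrast being described, not the guide's own text. The type `Wibble` and the functions `pickStrict` and `pickLazy` are hypothetical names chosen for illustration. The first function matches the constructor in argument position and is therefore strict in that argument; the second binds the same pattern in a `let`, which is lazy, so the argument is only inspected if a component is actually demanded.

```haskell
module Main (main) where

-- Hypothetical single-constructor type, used only for illustration.
data Wibble = Wibble Int Int

-- Pattern match in argument position: the argument is always evaluated
-- to its constructor, so the function is strict in it.
pickStrict :: Wibble -> Bool -> Int
pickStrict (Wibble x y) useIt = if useIt then x + y else 0

-- Pattern match in a let: the binding is lazy, so the argument is only
-- evaluated if x or y is demanded.
pickLazy :: Wibble -> Bool -> Int
pickLazy arg useIt =
  let Wibble x y = arg
  in if useIt then x + y else 0

main :: IO ()
main = do
  print (pickStrict (Wibble 1 2) True)  -- prints 3
  -- The lazy version never inspects its first argument here, so passing
  -- undefined is harmless; pickStrict undefined False would fail instead.
  print (pickLazy undefined False)      -- prints 0
```

The strict version is the "former" form that the text says yields better code: the compiler knows the argument will be evaluated and can avoid building a thunk for it.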
hunk ./ghc/docs/users_guide/sooner.sgml 286 - -A less contrived example shows the use of cases instead -of lets to get stricter code (a good thing): + A less contrived example shows the use of + cases instead of lets + to get stricter code (a good thing): hunk ./ghc/docs/users_guide/sooner.sgml 290 - + hunk ./ghc/docs/users_guide/sooner.sgml 302 - + + + + + + + + GHC loves single-constructor data-types: + + It's all the better if a function is strict in a + single-constructor type (a type with only one + data-constructor; for example, tuples are single-constructor + types). + + + + + Newtypes are better than datatypes: + + If your datatype has a single constructor with a + single field, use a newtype declaration + instead of a data declaration. The + newtype will be optimised away in most + cases. + + + + + “How do I find out a function's strictness?” + + Don't guess—look it up. + + Look for your function in the interface file, then for + the third field in the pragma; it should say + __S <string>. The + <string> gives the strictness of + the function's arguments. L is lazy + (bad), S and E are + strict (good), P is + “primitive” (good), U(...) + is strict and “unpackable” (very good), and + A is absent (very good). + + For an “unpackable” + U(...) argument, the info inside tells + the strictness of its components. So, if the argument is a + pair, and it says U(AU(LSS)), that + means “the first component of the pair isn't used; the + second component is itself unpackable, with three components + (lazy in the first, strict in the second \& + third).” + + If the function isn't exported, just compile with the + extra flag ; next to the + signature for any binder, it will print the self-same + pragmatic information as would be put in an interface file. + (Besides, Core syntax is fun to look at!) + + + + + Force key functions to be INLINEd (esp. monads): + + Placing INLINE pragmas on certain + functions that are used a lot can have a dramatic effect. + See . 
+ + + + + Explicit export list: + + If you do not have an explicit export list in a + module, GHC must assume that everything in that module will + be exported. This has various pessimising effects. For + example, if a bit of code is actually + unused (perhaps because of unfolding + effects), GHC will not be able to throw it away, because it + is exported and some other module may be relying on its + existence. + + GHC can be quite a bit more aggressive with pieces of + code if it knows they are not exported. + + + + + Look at the Core syntax! + + (The form in which GHC manipulates your code.) Just + run your compilation with + (don't forget the ). + + If profiling has pointed the finger at particular + functions, look at their Core code. lets + are bad, cases are good, dictionaries + (d.<Class>.<Unique>) [or + anything overloading-ish] are bad, nested lambdas are + bad, explicit data constructors are good, primitive + operations (e.g., eqInt#) are + good,… + + + + + Use strictness annotations: + + Putting a strictness annotation ('!') on a constructor + field helps in two ways: it adds strictness to the program, + which gives the strictness analyser more to work with, and + it might help to reduce space leaks. + + It can also help in a third way: when used with + (see ), a strict field can be unpacked or + unboxed in the constructor, and one or more levels of + indirection may be removed. Unpacking only happens for + single-constructor datatypes (Int is a + good candidate, for example). + + Using is only + really a good idea in conjunction with , + because otherwise the extra packing and unpacking won't be + optimised away. In fact, it is possible that + may worsen + performance even with + , but this is unlikely (let us know if it + happens to you). 
+ + hunk ./ghc/docs/users_guide/sooner.sgml 433 - - - - -GHC loves single-constructor data-types: - - -It's all the better if a function is strict in a single-constructor -type (a type with only one data-constructor; for example, tuples are -single-constructor types). - - - - -Newtypes are better than datatypes: - - -If your datatype has a single constructor with a single field, use a -newtype declaration instead of a data declaration. The newtype -will be optimised away in most cases. - - - - -“How do I find out a function's strictness?” - - -Don't guess—look it up. - + + Use unboxed types (a GHC extension): + + When you are really desperate for + speed, and you want to get right down to the “raw + bits.” Please see for + some information about using unboxed types. hunk ./ghc/docs/users_guide/sooner.sgml 441 - -Look for your function in the interface file, then for the third field -in the pragma; it should say __S -<string>. The <string> gives -the strictness of the function's arguments. L is -lazy (bad), S and E are -strict (good), P is “primitive” -(good), U(...) is strict and -“unpackable” (very good), and A is -absent (very good). - + Before resorting to explicit unboxed types, try using + strict constructor fields and + first (see above). + That way, your code stays portable. + + hunk ./ghc/docs/users_guide/sooner.sgml 448 - -For an “unpackable” U(...) argument, the info inside -tells the strictness of its components. So, if the argument is a -pair, and it says U(AU(LSS)), that means “the first component of the -pair isn't used; the second component is itself unpackable, with three -components (lazy in the first, strict in the second \& third).” - + + Use foreign import (a GHC extension) to plug into fast libraries: + + This may take real work, but… There exist piles + of massively-tuned library code, and the best thing is not + to compete with it, but link with it. 
hunk ./ghc/docs/users_guide/sooner.sgml 455 - -If the function isn't exported, just compile with the extra flag ; -next to the signature for any binder, it will print the self-same -pragmatic information as would be put in an interface file. -(Besides, Core syntax is fun to look at!) - - - - -Force key functions to be INLINEd (esp. monads): - - -Placing INLINE pragmas on certain functions that are used a lot can -have a dramatic effect. See . - - - - -Explicit export list: - - -If you do not have an explicit export list in a module, GHC must -assume that everything in that module will be exported. This has -various pessimising effects. For example, if a bit of code is actually -unused (perhaps because of unfolding effects), GHC will not be -able to throw it away, because it is exported and some other module -may be relying on its existence. - + describes the foreign function + interface. + + hunk ./ghc/docs/users_guide/sooner.sgml 460 - -GHC can be quite a bit more aggressive with pieces of code if it knows -they are not exported. - - - - -Look at the Core syntax! - - -(The form in which GHC manipulates your code.) Just run your -compilation with (don't forget the ). - + + Don't use Floats: + + If you're using Complex, definitely + use Complex Double rather than + Complex Float (the former is specialised + heavily, but the latter isn't). hunk ./ghc/docs/users_guide/sooner.sgml 468 - -If profiling has pointed the finger at particular functions, look at -their Core code. lets are bad, cases are good, dictionaries -(d.<Class>.<Unique>) [or anything overloading-ish] are bad, -nested lambdas are bad, explicit data constructors are good, primitive -operations (e.g., eqInt#) are good,… - - - - -Use unboxed types (a GHC extension): - - -When you are really desperate for speed, and you want to get -right down to the “raw bits.” Please see for some information about using unboxed -types. 
- - - - -Use foreign import (a GHC extension) to plug into fast libraries: - - -This may take real work, but… There exist piles of -massively-tuned library code, and the best thing is not -to compete with it, but link with it. - + Floats (probably 32-bits) are + almost always a bad idea, anyway, unless you Really Know + What You Are Doing. Use Doubles. + There's rarely a speed disadvantage—modern machines + will use the same floating-point unit for both. With + Doubles, you are much less likely to hang + yourself with numerical errors. hunk ./ghc/docs/users_guide/sooner.sgml 476 - - describes the foreign function interface. - - - + One time when Float might be a good + idea is if you have a lot of them, say + a giant array of Floats. They take up + half the space in the heap compared to + Doubles. However, this isn't true on a + 64-bit machine. + + hunk ./ghc/docs/users_guide/sooner.sgml 485 - -Don't use Floats: - - -If you're using Complex, definitely use -Complex Double rather than Complex -Float (the former is specialised heavily, but the latter -isn't). + + Use unboxed arrays (UArray) + + GHC supports arrays of unboxed elements, for several + basic arithmetic element types including + Int and Char: see the + Data.Array.Unboxed library for details. + These arrays are likely to be much faster than using + standard Haskell 98 arrays from the + Data.Array library. + + hunk ./ghc/docs/users_guide/sooner.sgml 498 - -Floats (probably 32-bits) are almost always a bad idea, anyway, -unless you Really Know What You Are Doing. Use Doubles. There's -rarely a speed disadvantage—modern machines will use the same -floating-point unit for both. With Doubles, you are much less -likely to hang yourself with numerical errors. - + + Use a bigger heap! 
+ + If your program's GC stats + (-S RTS + option RTS option) indicate that it's + doing lots of garbage-collection (say, more than 20% + of execution time), more memory might help—with the + -M<size> + RTS option or + -A<size> + RTS option RTS options (see ). hunk ./ghc/docs/users_guide/sooner.sgml 512 - -One time when Float might be a good idea is if you have a -lot of them, say a giant array of Floats. They take up -half the space in the heap compared to Doubles. However, this isn't -true on a 64-bit machine. - - - - -Use a bigger heap! - - -If your program's GC stats (-S RTS option RTS option) -indicate that it's doing lots of garbage-collection (say, more than -20% of execution time), more memory might help—with the --M<size> RTS option or --A<size> RTS option RTS options (see -). - - - - - + This is especially important if your program uses a + lot of mutable arrays of pointers or mutable variables + (i.e. STArray, + IOArray, STRef and + IORef, but not UArray, + STUArray or IOUArray). + GHC's garbage collector currently scans these objects on + every collection, so your program won't benefit from + generational GC in the normal way if you use lots of + these. Increasing the heap size to reduce the number of + collections will probably help. + + + hunk ./ghc/docs/users_guide/sooner.sgml 529 - -Smaller: producing a program that is smaller - + +Smaller: producing a program that is smaller + hunk ./ghc/docs/users_guide/sooner.sgml 533 - -smaller programs, how to produce - + +smaller programs, how to produce + hunk ./ghc/docs/users_guide/sooner.sgml 537 - + hunk ./ghc/docs/users_guide/sooner.sgml 540 --funfolding-use-threshold0 -option option for the extreme case. (“Only unfoldings with +-funfolding-use-threshold0 +option option for the extreme case. 
(“Only unfoldings with hunk ./ghc/docs/users_guide/sooner.sgml 546 - + hunk ./ghc/docs/users_guide/sooner.sgml 548 - + hunk ./ghc/docs/users_guide/sooner.sgml 550 - + hunk ./ghc/docs/users_guide/sooner.sgml 552 - -Use strip on your executables. - + +Use strip on your executables. + hunk ./ghc/docs/users_guide/sooner.sgml 558 - -Thriftier: producing a program that gobbles less heap space - + +Thriftier: producing a program that gobbles less heap space + hunk ./ghc/docs/users_guide/sooner.sgml 562 - -memory, using less heap -space-leaks, avoiding -heap space, using less - + +memory, using less heap +space-leaks, avoiding +heap space, using less + hunk ./ghc/docs/users_guide/sooner.sgml 568 - + hunk ./ghc/docs/users_guide/sooner.sgml 570 -with , and remove all doubt! (You'll +with , and remove all doubt! (You'll hunk ./ghc/docs/users_guide/sooner.sgml 573 - RTS option; so… ./a.out +RTS + RTS option; so… ./a.out +RTS hunk ./ghc/docs/users_guide/sooner.sgml 575 --G RTS option --Sstderr RTS option - +-G RTS option +-Sstderr RTS option + hunk ./ghc/docs/users_guide/sooner.sgml 579 - + hunk ./ghc/docs/users_guide/sooner.sgml 582 - + hunk ./ghc/docs/users_guide/sooner.sgml 584 - + hunk ./ghc/docs/users_guide/sooner.sgml 590 - + }
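Note on the "Use unboxed arrays" and "Use a bigger heap!" items added by this patch: the sketch below shows one way to keep a mutable array out of the category the new note warns about. It is an illustration under stated assumptions, not part of the patch: the function name `squares` is hypothetical, and the code assumes a reasonably recent GHC with the `array` package. It fills an `STUArray` (unboxed, so it holds no pointers for the collector to follow) and freezes it into a `UArray`.

```haskell
module Main (main) where

import Control.Monad (forM_)
import Data.Array.ST (newArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, (!))

-- Build a table of squares in an unboxed mutable array (an STUArray),
-- then freeze it into an immutable UArray. The elements are stored as
-- raw Ints rather than as pointers to heap objects.
squares :: Int -> UArray Int Int
squares n = runSTUArray (do
  arr <- newArray (0, n - 1) 0
  forM_ [0 .. n - 1] (\i -> writeArray arr i (i * i))
  return arr)

main :: IO ()
main = print (squares 10 ! 9)  -- prints 81
```

When the element type cannot be unboxed and an `STArray`, `IOArray`, `STRef` or `IORef` is unavoidable, the patch's advice is to reduce the number of collections instead, for example by running the program with something like `./a.out +RTS -A64m -Sstderr`, where `64m` is an arbitrary illustrative size. Recent GHC releases may need the program to be linked with `-rtsopts` before they accept these options, and the remark about the collector scanning such objects on every collection describes GHC at the time of this patch.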