Current ideas

Thu Nov 27 08:59:49 EST 2008

> I can't seem to post messages to the lhc mailing list.

OK. I'll cc this to lhc at projects.haskell.org anyway for those
interested eyes.

> > Hi,
> >
> > Since LHC's creation, I have been doing a bit of work on it,
> > mainly cleaning things and replacing things etc. - it's now much
> > easier to build (with cabal), has some cleaned-up code, and general
> > improvements over jhc, I would say (at this point anyway).
> >
> > Here's the current ChangeLog:
> > http://code.haskell.org/lhc/ChangeLog
> >
> > Before any sort of announcement (and a version bump), here are some
> > things that have been on my mind:
> >
> >  * There could be various improvements in performance and memory usage
> >   just about everywhere. I have some initial profiling results from running
> >   the compiler over 'hello world' here:
> >   http://thoughtpolice.stringsandints.com/code/lhc-tests/hello-world/
> >
> >   The files are:
> >   - lhc.prof & lhc-cc.ps - detailed cost centres - we see that about 40%
> >     of *total* time and allocation is spent in E.Binary on two different
> >     get routines for Data.Binary.
> >   - lhc-constr.ps & lhc-type.ps - most of the allocation goes to the
> >     S data type used in the lambda lifter/CPR pass, as well as lazy
> >     tuples.
> >
> >   So I think if we can target E.Binary usage, we can possibly cut
> >   down GC and runtime considerably. We may be holding the GC from
> 
> Compiling HelloWorld is currently dominated by the time it takes to
> load base-1.0.hl. Reducing the size of base-1.0.hl should benefit
> compile times dramatically.

Seems reasonable; do you have any ideas of how we can make
loading faster/file sizes smaller?
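
To make the question concrete, here is a minimal sketch of one generic
size-reduction technique: variable-length integer encoding on top of
Data.Binary. It is an illustration only. VarInt is a hypothetical name, it
is not part of E.Binary, and whether base-1.0.hl is actually dominated by
fixed-width integer fields is an assumption that has not been measured.

```haskell
import Data.Binary (Binary (..), Get, decode, encode, getWord8, putWord8)
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import qualified Data.ByteString.Lazy as L
import Data.Word (Word64)

-- Hypothetical wrapper: serialise a Word64 using 7 bits per byte,
-- with the high bit of each byte meaning "more bytes follow".
newtype VarInt = VarInt Word64 deriving (Eq, Show)

instance Binary VarInt where
  put (VarInt n)
    | n < 0x80  = putWord8 (fromIntegral n)
    | otherwise = do
        -- emit the low 7 bits with the continuation bit set
        putWord8 (fromIntegral (n .&. 0x7f) .|. 0x80)
        put (VarInt (n `shiftR` 7))
  get = VarInt <$> go 0 0
    where
      go :: Int -> Word64 -> Get Word64
      go shift acc = do
        b <- getWord8
        let acc' = acc .|. (fromIntegral (b .&. 0x7f) `shiftL` shift)
        if b .&. 0x80 /= 0 then go (shift + 7) acc' else return acc'

main :: IO ()
main = do
  let n = 300 :: Word64
  -- Compare the default fixed-width encoding with the variable-length one.
  putStrLn ("fixed-width bytes:     " ++ show (L.length (encode n)))
  putStrLn ("variable-length bytes: " ++ show (L.length (encode (VarInt n))))
  -- Round trip to check that decoding inverts encoding.
  print (decode (encode (VarInt n)) == VarInt n)
```

Small values shrink and very large ones grow slightly, so this only helps
if most serialised integers are small; profiling the real file would have
to confirm that first.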

> >   Note that these benchmarks were run before I replaced several other
> >   parts of the code and removed a few things, notably replacing DrIFT
> >   with derive - I noticed GC went up to about 61% from the average
> >   58% for DrIFT. I should probably update them.
> >
> >   (Also, GHC HEAD has had recent improvements to the parallel garbage
> >   collector, etc. so I would like to see if running lhc on top of it
> >   would reduce garbage time with -N2 and -threaded.)
> 
> I tried it on my dual-core AMD. CPU usage did go above 100% but
> it didn't make compiling base-1.0.hl any faster.

I've been having lots of trouble recently building GHC HEAD, so I
still can't report whether GC time goes down or things build faster
there.

> >  * I think we should continue with jhc's goal of sticking with the latest
> >   GHC. I see potential in using quasi-quoting, perhaps for replacing
> >   the FlagDump, Name and PrimitiveOperators information that was part
> >   of the autoconf build system.
> >   We can easily construct a parser with parsec (or even a
> >   regex lib; the perl scripts for this are in util/) and go from
> >   there; this also replaces another external dependency
> >   and makes cabal life better.
> 
> I like this idea.

I think it's a reasonable approach. Currently, using the old perl
scripts in utils/ isn't that bad, but having it done automagically at
compile time using Template Haskell would be considerably easier for everyone.
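
As a rough sketch of what compile-time generation could look like, the
following builds a Flag data type from a textual description using the
template-haskell library and pretty-prints the result. The description
format, flagSpec, flagNames and mkFlagType are all hypothetical; this is
not the format the existing perl scripts read, and it says nothing about
how FlagDump is really laid out.

```haskell
import Data.Char (toUpper)
import Language.Haskell.TH

-- Hypothetical flag description, one "name description..." entry per line.
flagSpec :: String
flagSpec = unlines
  [ "core     dump the E core after each pass"
  , "grin     dump the GRIN intermediate code"
  , "progress show progress while compiling"
  ]

-- Turn the first word of each non-blank line into a constructor name.
flagNames :: String -> [String]
flagNames spec = [ capitalise n | (n : _) <- map words (lines spec) ]
  where
    capitalise []       = []
    capitalise (c : cs) = toUpper c : cs

-- Build a declaration of the form:
--   data Flag = Core | Grin | Progress deriving (Eq, Show)
mkFlagType :: String -> Q [Dec]
mkFlagType spec = do
  dec <- dataD (cxt []) (mkName "Flag") [] Nothing
           [ normalC (mkName n) [] | n <- flagNames spec ]
           [ derivClause Nothing [conT (mkName "Eq"), conT (mkName "Show")] ]
  return [dec]

-- Run the generator in IO and print the declaration it produces.
main :: IO ()
main = runQ (mkFlagType flagSpec) >>= putStrLn . pprint
```

In a real build the generator would live in its own module and be spliced
in with $(mkFlagType ...) from another one, since Template Haskell cannot
splice a function defined in the same module. Flag names that are not
valid constructor names (containing hyphens, for example) would need
extra handling.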

> >  * The region inference algorithm is currently buggy, and code leaks
> >   pretty badly if it runs for a while. If you look here:
> >   http://thoughtpolice.stringsandints.com/code/lhc-tests/bench
> >   and build, loop and startup work fine, but recursive almost immediately
> >   starts gobbling up 800mb+ of memory. This is something to be
> >   addressed for sure.
> 
> What's the right thing to do here? Do we need a generational GC or
> should we piggyback on GHC?

I've been looking through the code at the allocation stuff; there
doesn't seem to be a whole lot of code dedicated to it, but it will
take time to reason about.

In the meantime, an easy way to cope with this problem is to compile
your application with '-fboehm' and run it; this causes the minimal
RTS to use the Boehm GC for allocation/deallocation, instead
of the basic alloc routines inside lhc_rts_alloc.c. With this flag,
recursive, well, still doesn't finish, but it also doesn't immediately take
up 900mb of RAM, which is better than nothing.

> >  * Right now the parser is featureful and works, but we may also
> >   be able to swap it out for, say, haskell-src-exts.
> >
> >   Pros: way more extensions we can support out of the box, many
> >         probably pretty easily with some knowledge of E and GRIN.
> >   Cons: we lose pragmas entirely, and we must effectively put all
> >         extensions on, all the time.
> >
> >   Losing pragmas is a problem, but it is a TODO on the
> >   haskell-src-exts project, and I'm convinced it could be worked in
> >   there in the interests of making lhc more robust.
> >
> >   I am also not convinced that losing the ability to turn language
> >   extensions off is a particularly bad thing; with
> >   haskell-src-exts we immediately gain support for a large variety of
> >   extensions, making lhc compatible with a much larger body of
> >   code, although we will still have to implement code for
> >   extensions like associated types, GADTs, etc.
> >
> >   I think contributing back to haskell-src-exts would probably be a
> >   good idea.
> 
> I like this idea. The parser we have now is slightly buggy.
> Losing pragmas seems like a big problem, though. We might be able to
> go without INLINE and SPECIALIZE, but the RULES pragma is pretty essential.

Right; I think we can hold off on this for right now, until
haskell-src-exts gets pragma support. I will talk to Niklas about
this and see if he and I can work something out - having LHC support
so much more Haskell code would be a wonderful thing to work for!

Also, what's interesting about using haskell-src-exts is that we will
actually support a few things GHC *doesn't*, like regular patterns and
HSX-style XML syntax.

The HSX-style syntax particularly interests me; when the thought of
using haskell-src-exts for LHC came up, this feature, combined with
LHC's ISO-C output, made me think of the possibility of having webapps
supported first-class by LHC. The approach could be similar to urweb,
I think:

http://www.impredicative.com/ur/

For example, the following code is parsed perfectly by
haskell-src-exts:
----------
hello =
 <html>
  <head>
   <title>this is helloworld</title>
  </head>
  <body>
   <h1>hello world!</h1>
  </body>
 </html>
----------

We could then, with lhc, compile it and run it like so:

 $ lhc -fwebapp tester.hs -o test
 $ ./test 8080
 ... started HTTP server on port 8080 ...

Of course, this is a long way off in reality and even then it's still
an idea; we would still need to add support for pragmas to
haskell-src-exts, need the JHC RTS to have some thread support (since
in the above example, we output a full HTTP server), and we would have
to craft a DSL for web programming (handling cookies, DOM, etc.).
However, something like this could be extremely interesting to play
around with, and having first-class support for it in a Haskell
compiler would be great (who says we can't do webapps?)

> >  * We definitely need a testsuite - I think a good first suite to
> >   target and fully compile would be nobench.
> >
> >  * We should eliminate the last bits of the old build system; I have just
> >   eliminated cbits as we don't need it, and we need to replace the
> >   Name, Flag and PrimitiveOperator generation routines, because right
> >   now they're generically hard-coded in there.
> >
> >  * There should be something like an LHC commentary. Starting one on
> >   http://lhc.seize.it seems reasonable.
> 
> Couldn't agree more.

I'll remove the rest of the autoconf artifacts (Makefile.am,
configure.ac, etc) once I can write up some temporary scripts to
update Name.Name, PrimitiveOperators and the Flag* files.

I can also start the beginnings of a rudimentary LHC commentary and
testsuite when I have the time.

> >  * Should find a way to get lhc's base package on hackage so we can do
> >   'cabal install lhc; cabal install base --lhc'
> 
> We'll have to talk with Duncan about this one.

Yeah, it would be really nice to have, though (otherwise we may have
to tie lhc-base into the regular lhc build somehow, which would
probably be gross.)

Austin