[Haddock] Unicode support for Haddock

Ian Lynagh ian at well-typed.com
Sun Feb 10 19:50:56 GMT 2013

[CCing the haddock list]

On Sun, Feb 03, 2013 at 07:07:54PM +0000, Max Bolingbroke wrote:
> Hi GHCers,
> I recently ran into a problem where Haddock does not correctly handle
> Unicode in doc comments. So for example with this file:
> """
> module Example where
> -- | 好
> ok :: Int -> Int
> ok x = x
> -- | 个
> misinterp :: Int -> Int
> misinterp _ = (-1)
> -- | 漢
> failure :: Int -> Int
> failure x = x-1
> """
> Current versions of Haddock will output the documentation for "ok"
> correctly, will output an empty bulleted list as the documentation for
> "misinterp" and not output any documentation at all for "failure"
> (echoing a warning to stderr instead).
> This is kind of sad. There is a very old open ticket about this issue:
> http://trac.haskell.org/haddock/ticket/20. The patches I've attached
> to that ticket fix the problem by using the native Unicode support in
> Alex 3. I've also attached to the ticket a patch which makes the
> necessary changes to GHC's build system required to build this new
> Haddock correctly.
> Do these patches seem OK? Is it fine to insist on Alex 3? I believe it
> was released in 2011, so by now we can assume that it is available on
> all machines that will want to build GHC.
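
[A possible reading of the three symptoms in the example, offered as an
assumption and not something established in this thread: if the old
lexer classified each character by only the low 8 bits of its code
point, then the three characters would look like '}', '*' (a bullet
marker) and '"' (an opening quote that is never closed), which would fit
"fine", "empty bulleted list" and "no documentation at all". The sketch
below only does that arithmetic; lowByteChar and samples are
hypothetical names and are not part of Haddock or Alex.]

```haskell
import Data.Bits ((.&.))
import Data.Char (chr, ord, toUpper)
import Numeric (showHex)

-- Hypothetical helper: the character an 8-bit lexer table would see if
-- it kept only the low byte of the code point (an assumption about the
-- old behaviour, not a documented fact).
lowByteChar :: Char -> Char
lowByteChar c = chr (ord c .&. 0xFF)

-- The three doc-comment characters from the example module, written as
-- escapes so the file does not depend on the source encoding.
samples :: [(String, Char)]
samples =
  [ ("ok",        '\x597D')
  , ("misinterp", '\x4E2A')
  , ("failure",   '\x6F22')
  ]

main :: IO ()
main = mapM_ report samples
  where
    report (name, c) =
      putStrLn $ name ++ ": U+" ++ map toUpper (showHex (ord c) "")
              ++ " -> low byte " ++ show (lowByteChar c)
```

[If this reading is right, a lexer that works on whole code points (or
on their full UTF-8 encoding) would not confuse these characters with
markup.]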

I'll leave reviewing the patches to the Haddock maintainers, but I think
it's reasonable to require that GHC developers have Alex 3 now. If I
understand the Haskell Platform pages correctly, the last release
included Alex 3.0.2, and the one before that included 3.0.1.

> If this patch is accepted, at some point we might want to think about
> switching to Alex 3's Unicode support in GHC's own lexer rather than
> relying on the current hacks. My patches do not make any change along
> those lines.

Yes; if Haddock requires Alex 3.0, then GHC effectively will too, so we
may as well make use of it.

