#20: We don't handle non-ASCII characters in doc comments

haddock at projects.haskell.org
Mon Jan 9 03:59:56 GMT 2012

#20: We don't handle non-ASCII characters in doc comments
Comment(by selinger):

 I agree that this should be fixed. It would be better to assume that all
 files are UTF-8 than to assume that all files are ASCII.

 Either way, users who use another encoding first have to do an offline
 conversion before invoking Haddock. But conversion from, say, Latin-1 to
 UTF-8 is trivial, whereas conversion from Latin-1 to ASCII with HTML
 entities requires parsing the source offline: non-ASCII characters in
 Haddock comments must be converted to HTML entities, and non-ASCII
 characters in the code itself must be converted to something else
 (UTF-8?), because Haddock will croak if it encounters an HTML entity in
 the code itself.
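
 To illustrate why the Latin-1 to UTF-8 direction is trivial, here is a
 minimal sketch using only the encoding support in System.IO from base. The
 name `recode` and the command-line usage are hypothetical, chosen for this
 example; this is not part of Haddock. The point is that the whole file can
 be re-encoded without knowing which parts are comments and which are code.

```haskell
import System.Environment (getArgs)
import System.IO

-- Hypothetical helper: re-encode a whole file from Latin-1 to UTF-8.
-- No parsing is needed; comments and code are treated alike.
recode :: FilePath -> FilePath -> IO ()
recode src dst =
  withFile src ReadMode $ \hIn ->
    withFile dst WriteMode $ \hOut -> do
      hSetEncoding hIn latin1   -- decode input bytes as ISO-8859-1
      hSetEncoding hOut utf8    -- encode output characters as UTF-8
      hGetContents hIn >>= hPutStr hOut

main :: IO ()
main = do
  args <- getArgs
  case args of
    [src, dst] -> recode src dst
    _          -> hPutStrLn stderr "usage: recode SRC DST"
```

 The same conversion can be done with any standard re-encoding tool; the
 sketch only shows that no knowledge of Haskell syntax is involved.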

 Moreover, the current HTML-entity encoding does not even work correctly;
 see bug #191.

