[Haddock] rebuilding haddock docs

Mark Lentczner markl at glyphic.com
Thu Aug 19 22:46:27 EDT 2010


On Aug 9, 2010, at 8:14 AM, Simon Marlow wrote:

> Do the anchors have to change, or would it be possible to make them compatible?
> 
> In the past when the .haddock format changes we have tried to degrade gracefully, still producing documentation but without the features that were enabled by the format change.

Here's the deal: Anchors were broken in that they weren't really compliant with what XHTML and HTML 4.01 allow in an ID. The escaping in links and the double-anchoring were really just hacks to work around this. The details of how anchor ids should be constructed are in a comment in Haddock.Utils:

-------------------------------------------------------------------------------
-- * Anchor and URL utilities
--
-- NB: Anchor IDs, used as the destination of a link within a document must
-- conform to XML's NAME production. That, taken with XHTML and HTML 4.01's
-- various needs and compatibility constraints, means these IDs have to match:
--      [A-Za-z][A-Za-z0-9:_.-]*
-- Such IDs do not need to be escaped in any way when used as the fragment part
-- of a URL. Indeed, %-escaping them can lead to compatibility issues as it
-- isn't clear if such fragment identifiers should, or should not be unescaped
-- before being matched with IDs in the target document.
-------------------------------------------------------------------------------
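
To make the pattern in that comment concrete, here is a minimal sketch of a checker for it. The name isValidAnchorId is made up for illustration and is not something Haddock exports; it only tests a string against [A-Za-z][A-Za-z0-9:_.-]*.

```haskell
import Data.Char (isAscii, isAlpha, isAlphaNum)

-- | Hypothetical helper (not part of Haddock): True when the string
--   matches [A-Za-z][A-Za-z0-9:_.-]*, the pattern quoted in the comment.
isValidAnchorId :: String -> Bool
isValidAnchorId []       = False   -- the pattern needs at least one character
isValidAnchorId (c : cs) = isAsciiAlpha c && all isIdChar cs
  where
    -- First character: an ASCII letter only.
    isAsciiAlpha x = isAscii x && isAlpha x
    -- Remaining characters: ASCII letters, digits, or one of : _ . -
    isIdChar x     = (isAscii x && isAlphaNum x) || x `elem` ":_.-"
```

Under this check, v:insertWith passes as-is, while v:! and v:insertWith' do not, which is why those names need some form of escaping.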

We can compare how the old code treats anchors with the new by looking at three representative functions from Data.Map: !, insertWith, and insertWith':

-- old links have hrefs ending in the fragment --
	v%3A%21
	v%3AinsertWith
	v%3AinsertWith%27
-- old anchor points have two nested(!) A elements with these names --
	v%3A%21             v:!
	v%3AinsertWith      v:insertWith
	v%3AinsertWith%27   v:insertWith'

-- new links and anchor points use these --
	v:-33-
	v:insertWith
	v:insertWith-39-
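
For readers who want to see how names like these could be produced, here is a rough sketch reconstructed only from the three examples above. It is not copied from Haddock, and the name makeAnchorIdSketch is made up. It assumes that any character not allowed at its position is replaced by its decimal code point wrapped in hyphens, and that '-' itself is always escaped so the result stays unambiguous.

```haskell
import Data.Char (isAscii, isAlpha, isAlphaNum, ord)

-- | Hypothetical sketch (not Haddock's actual code): turn a name such as
--   "v:!" into an ID matching [A-Za-z][A-Za-z0-9:_.-]*.
makeAnchorIdSketch :: String -> String
makeAnchorIdSketch []       = []
makeAnchorIdSketch (c : cs) = escape isFirst c ++ concatMap (escape isRest) cs
  where
    -- Keep a character the predicate accepts; otherwise emit -<code point>-.
    escape ok x
      | ok x      = [x]
      | otherwise = '-' : show (ord x) ++ "-"
    -- The first character must be an ASCII letter.
    isFirst x = isAscii x && isAlpha x
    -- Later characters: ASCII letters, digits, ':', '_' or '.'.
    -- Assumption: '-' is legal in an ID but is escaped here because it
    -- doubles as the escape delimiter.
    isRest x  = (isAscii x && isAlphaNum x) || x `elem` ":_."
```

With these assumptions, the inputs v:!, v:insertWith and v:insertWith' map to the three new-style names listed above.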

Thanks to the ambiguity in the specs over the years about % escaping and fragments, most browsers are eager to try anything to get a link to work. For most identifiers, this works in our favor: old style links will find new style anchors, and new style links will find old style anchors, since those had two, one in new(ish) form. For identifiers containing characters other than ASCII alphanumerics, all bets are off, since the new escaping mechanism is necessarily different.

I think this represents an acceptable degradation path, if not completely graceful.

	- Mark

Mark Lentczner
http://www.ozonehouse.com/mark/
mark at glyphic.com





