[haddock] #118: parsing of multiple URLs (greedy matching?)

Wed Jul 22 13:41:44 EDT 2009

#118: parsing of multiple URLs (greedy matching?)
-------------------+--------------------------------------------------------
Reporter:  kowey   |        Owner:       
    Type:  defect  |       Status:  new  
Priority:  major   |    Milestone:  2.5.0
 Version:  2.4.2   |   Resolution:       
Keywords:          |  
-------------------+--------------------------------------------------------
Comment (by duncan):

 Replying to [ticket:569 EricKow]:

 > I get the impression that there's some kind of greedy matching going on,
 > like a "<.*>" in regexp terms.

 Indeed, in the lexer:
 {{{
   \<.*\>         { strtoken $ \s -> TokURL (init (tail s)) }
   \<\<.*\>\>     { strtoken $ \s -> TokPic (init $ init $ tail $ tail s) }
   \#.*\#         { strtoken $ \s -> TokAName (init (tail s)) }
 }}}

 For emphasis like `/blah/` it uses:
 {{{
   \/ [^\/]* \/   { strtoken $ \s -> TokEmphasis (init (tail s)) }
 }}}

 The same trick (a negated character class that excludes the closing
 delimiter) should work for the three cases above. Note that the same
 lexer code is used in ghc, haddock-0.x, hackage-scripts and hackage-server.
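
 To illustrate the difference, here is a minimal sketch in plain Haskell
 (not the actual lexer code). The function names `greedyURL` and `strictURL`
 are hypothetical; they only emulate, on an input that starts at the opening
 `<`, what a longest-match rule like `\<.*\>` does compared with a rule built
 on a negated character class such as `\< [^\>]* \>`. It assumes that neither
 `.` nor a negated set matches a newline.

```haskell
import Data.List (elemIndices)

-- Emulates the longest match of a rule like  \<.*\>  : after the opening
-- '<', run to the LAST '>' on the current line.
-- Returns (token contents, remaining input).
greedyURL :: String -> Maybe (String, String)
greedyURL ('<' : rest) =
  let line = takeWhile (/= '\n') rest      -- '.' stops at a newline
  in case elemIndices '>' line of
       [] -> Nothing                       -- no closing '>' on this line
       is -> let i = last is               -- position of the last '>'
             in Just (take i line, drop (i + 1) rest)
greedyURL _ = Nothing

-- Emulates a rule like  \< [^\>]* \>  : after the opening '<', stop at
-- the FIRST '>' and fail if a newline comes first.
strictURL :: String -> Maybe (String, String)
strictURL ('<' : rest) =
  case break (\c -> c == '>' || c == '\n') rest of
    (url, '>' : after) -> Just (url, after)
    _                  -> Nothing
strictURL _ = Nothing
```

 On the input `<http://a.example> and <http://b.example>` (made-up URLs),
 `greedyURL` yields a single token containing both URLs and the text between
 them, whereas `strictURL` yields `http://a.example` and leaves
 ` and <http://b.example>` for the next token.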

-- 
Ticket URL: <http://trac.haskell.org/haddock/ticket/118#comment:1>
haddock <http://www.haskell.org/haddock>
Haddock, The Haskell Documentation Tool