Network.CGI maintenance and character encoding problem

Jon Fairbairn jon.fairbairn at cl.cam.ac.uk
Tue Dec 24 20:39:55 GMT 2013


I have some old CGI code, and having recompiled it recently, I
find that unicode characters in query strings are getting
mangled (I only get the bottom byte). This used to work, but
since I’ve done many updates between the last time I compiled it
and now, I can’t say what version first went wrong.

I’m currently using haskell platform 2013.2.0.0 and either
cgi-3001.1.7.5 or cgi-3001.1.8.4 (both have the same problem).

CGI is still part of the platform, so I guess it should be
maintained, but when I emailed andersk I got no response.


The following programme illustrates the problem if called with a
query such as

wget -S -O - 'http://localhost/CGI_Problem.hsp?input=%E2%80%98%E2%80%A6%E2%80%99'

or run from the command line with

   QUERY_STRING='input=%E2%80%98%E2%80%A6%E2%80%99' ./CGI_Problem.hsp

> module Main where
> import Network.CGI

> import Data.List (intersperse)
> import Numeric (showHex)
> import Data.Text.Lazy (pack)
> import Data.Text.Lazy.Encoding (encodeUtf8)

> main = runCGI test

> test =
>   do ip <- getInput "input"
>      qs <- queryString
>      setHeader
>        "Content-type"
>        (showContentType (ContentType
>                          "text"
>                          "plain"
>                          [("charset", "UTF-8")]))
>      let decoded_inputs = formDecode qs
>      let got_input = maybe "(not defined)" id $ lookup "input" decoded_inputs

>      outputFPS $ encodeUtf8 $ pack $ 
>        "\nresult from queryString: " ++ qs ++ "\n"
>        ++ "input as decoded by getInput: "
>           ++ (maybe "(not defined)" id ip) ++ "\n"
>        ++ "input as decoded by getInput, numerical: "
>           ++ (maybe "(not defined)" (show_hex . fmap fromEnum) ip) ++ "\n"
>        ++ "input as decoded via formDecode and lookup: "
>           ++ got_input ++ "\n"
>        ++ "input as decoded via formDecode and lookup (numerical): "
>           ++ show_hex (fmap fromEnum got_input) ++ "\n"

> show_hex s =
>   "[" ++
>   foldr (.) id (intersperse (","++) $ fmap sHex s)
>   "]"
>   where sHex n = ("0x"++) . showHex n

-- 
Jón Fairbairn                                 Jon.Fairbairn at cl.cam.ac.uk
http://www.chaos.org.uk/~jf/Stuff-I-dont-want.html  (updated 2012-10-07)




More information about the Haskell-platform mailing list