[[project @ 2003-02-11 17:19:35 by simonpj]
simonpj**20030211171936
Lots of new stuff about data types
] {
hunk ./ghc/docs/comm/genesis/modules.html 68
- FieldLabel( Type)
+ FieldLabel(Type)
hunk ./ghc/docs/comm/genesis/modules.html 70
- PprEnv (loop DataCon.DataCon, Type)
-
+GHC now generates three unique Names for each data constructor: +
+ ---- OccName ------ + String Name space Used for + --------------------------------------------------------------------------- + The "source data con" MkT DataName The DataCon itself + The "worker data con" MkT VarName Its worker Id + aka "representation data con" + The "wrapper data con" $WMkT VarName Its wrapper Id (optional) ++Recall that each occurrence name (OccName) is a pair of a string and a +name space (see The truth about names), and +two OccNames are considered the same only if both components match. +That is what distinguishes the name of the name of the DataCon from +the name of its worker Id. To keep things unambiguous, in what +follows we'll write "MkT{d}" for the source data con, and "MkT{v}" for +the worker Id. (Indeed, when you dump stuff with "-ddumpXXX", if you +also add "-dppr-debug" you'll get stuff like "Foo {- d rMv -}". The +"d" part is the name space; the "rMv" is the unique key.) +
+Each of these three names gets a distinct unique key in GHC's name cache. hunk ./ghc/docs/comm/the-beast/data-types.html 55 +
+ data T a = MkT{d} !(a,a) !Int | Nil{d} hunk ./ghc/docs/comm/the-beast/data-types.html 70 -Here, the wrapper MkT evaluates and takes the argument p, + f x = case x of + Nil{d} -> Nil{d} + MkT{d} p q -> MkT{d} p (q+1) ++Notice that in the Haskell source all data contructors are named via the "source data con" MkT{d}, +whether in pattern matching or in expressions. +
+In the translated source produced by the type checker (-ddump-tc), the program looks like this: +
+ f x = case x of + Nil{d} -> Nil{v} + MkT{d} p q -> $WMkT p (q+1) + ++Notice that the type checker replaces the occurrence of MkT by the wrapper, but +the occurrence of Nil by the worker. Reason: Nil doesn't have a wrapper because there is +nothing to do in the wrapper (this is the vastly common case). +
+Though they are not printed out by "-ddump-tc", behind the scenes, there are +also the following: the data type declaration and the wrapper function for MkT. +
+ data T a = MkT{d} a a Int# | Nil{d} + + $WMkT :: (a,a) -> T a -> T a + $WMkT p t = case p of + (a,b) -> seq t (MkT{v} a b t) ++Here, the wrapper $WMkT evaluates and takes apart the argument p, hunk ./ghc/docs/comm/the-beast/data-types.html 99 -with the worker constructor $wMKT. (There are more notes below -about the unboxing of strict fields.) +with the worker constructor MkT{v}. (There are more notes below +about the unboxing of strict fields.) The worker $WMkT is called an implicit binding, +because it's introduced implicitly by the data type declaration (record selectors +are also implicit bindings, for example). Implicit bindings are injected into the code +just before emitting code or External Core. hunk ./ghc/docs/comm/the-beast/data-types.html 105 -So the original constructors, MkT and Nil are really just -wrappers which perhaps do some work before calling the workers -$wMkT and $wNil. The workers are -the "representation constructors" of -the "representation data type", which we can think of as being defined thus: - +After desugaring into Core (-ddump-ds), the definition of f looks like this: +
+ f x = case x of + Nil{d} -> Nil{v} + MkT{d} a b r -> let { p = (a,b); q = I#r } in + $WMkT p (q+1) ++Notice the way that pattern matching has been desugared to take account of the fact +that the "real" data constructor MkT has three fields. +
+By the time the simplifier has had a go at it, f will be transformed to: hunk ./ghc/docs/comm/the-beast/data-types.html 117 - data T a = $wMkT a a Int | $wNil + f x = case x of + Nil{d} -> Nil{v} + MkT{d} a b r -> MkT{v} a b (r +# 1#) hunk ./ghc/docs/comm/the-beast/data-types.html 121 +Which is highly cool. hunk ./ghc/docs/comm/the-beast/data-types.html 123 -This representation data type, gives the number and types of -fields of the constructors used to represent values of type T. -This representation type is also what is emitted when you print External Core -from GHC. hunk ./ghc/docs/comm/the-beast/data-types.html 124 -
+ MkT{v} = \ p q r -> MkT{v} p q r ++This is a real hack. The occurrence on the RHS is saturated, so the code generator (both the +one that generates abstract C and the byte-code generator) treats it as a special case and +allocates a MkT; it does not make a recursive call! So now there's a top-level curried +version of the worker which is available to anyone who wants it. +
+This strange defintion is not emitted into External Core. Indeed, you might argue that +we should instead pass the list of TyCons to the code generator and have it +generate magic bindings directly. As it stands, it's a real hack: see the code in +CorePrep.mkImplicitBinds. + + +
+ data T a = MkT a a Int# | Nil{d} + + $WMkT :: (a,a) -> T a -> T a + $WMkT p t = case p of + (a,b) -> seq t (MkT a b t) + + f x = case x of + Nil -> Nil + MkT a b r -> MkT a b (r +# 1#) ++Notice that it makes perfect sense as a program all by itself. Constructors +look like constructors (albeit not identical to the original Haskell ones). +
+When reading in External Core, the parser is careful to read it back in just +as it was before it was spat out, namely: +
+ data T a = MkT{d} a a Int# | Nil{d} + + $WMkT :: (a,a) -> T a -> T a + $WMkT p t = case p of + (a,b) -> seq t (MkT{v} a b t) + + f x = case x of + Nil{d} -> Nil{v} + MkT{d} a b r -> MkT{v} a b (r +# 1#) ++ + +
hunk ./ghc/docs/comm/the-beast/prelude.html 11 + One of the trickiest aspects of GHC is the delicate interplay + between what knowledge is baked into the compiler, and what + knowledge it gets by reading the interface files of library + modules. In general, the less that is baked in, the better. +
hunk ./ghc/docs/comm/the-beast/prelude.html 22 -
+ intPrimTyCon :: TyCon + intPrimTyCon = .... ++Examples: +Int#, Float#, Addr#, State#. +
+
+ intTyCon :: TyCon + intTyCon = .... + + intDataCon :: DataCon + intDataCon = .... ++However, since a TyCon value contains the entire type definition inside it, it follows +that the complete definition of Int is thereby baked into the compiler. +
+Nevertheless, the library module GHC.Base still contains a definition for Int +just so that its info table etc get generated somewhere. Chaos will result if the wired-in definition +in TysWiredIn differs from that in GHC.Base. +
+The rule is that only very simple types should be wired in (for example, Ratio is not, +and IO is certainly not). No class is wired in: classes are just too complicated. +
+Examples: Int, Float, List, tuples. + +
+
+Most of these known-key names are defined in module PrelNames; a further swathe concerning +Template Haskell are defined in DsMeta. The allocation of unique keys is done manually; +chaotic things happen if you make a mistake here, which is why they are all together. +
+The next sections elaborate these three classes a bit. + + +