\documentclass{scrartcl}

\usepackage{amstext}
\usepackage{stmaryrd}
\usepackage{mathpazo}
\usepackage{hyperref}
\usepackage{booktabs}
\usepackage{graphicx}
\usepackage{amsmath}

%include polycode.fmt
%include spacing.fmt

%format <*> = "\varoast"
%format not = "\mathit{not}"
\begin{document}

\title{Chalk\\ \Large{An Architecture Description Language}}
\author{Wouter Swierstra}
\date{\today}
\maketitle

\begin{abstract}
  The document describes Chalk, a high-level language for writing
  architecture descriptions. It is strongly inspired by previous
  work on Hawk. In contrast to Hawk, however, we give a deeper
  embedding of our circuit descriptions (using the |Circuit| data
  type defined below). This enables us to further analyse circuits,
  add structural information, and possibly synthesize hardware.
\end{abstract}

\section{About this document}
\label{sec:about-this-document}

This document has several aims:
\begin{itemize}
 \item document the work that has been done so far. This is
   particularly important if we want to involve Carl and Emily, and
  other potential external collaborators.
\item outline about where the work is going
   and what we want to achieve;
 \item and ideally, form the basis for a paper about this project. 
\end{itemize}

Although I will try to keep the API described in this document as
stable as possible, I can't promise things will not change. 

\section{Installing Chalk}
\label{sec:installing-chalk}

I will try to keep a fairly recent code snapshot on the Google Groups page. This document should be in sync with the latest released version. If you want to browse the latest version in the repository, you can do so at:

\begin{verbatim}
http://projects.haskell.org/Chalk/
\end{verbatim}

To use Chalk you will need a working Haskell installation. Start by
downloading and installing the \emph{Haskell Platform}, which
consists of \texttt{ghc} (the Haskell compiler) together with a
standard collection of libraries. You can download binary
distributions of the Haskell Platform from:

\begin{verbatim}
http://hackage.haskell.org/platform/
\end{verbatim}

If no binary is available, you will need to install \texttt{ghc}
manually. There are even more binaries available from the
\texttt{ghc} homepage:

\begin{verbatim}
http://haskell.org/ghc/download.html
\end{verbatim}

Any reasonably recent version should work fine (I'd say anything
more recent than 6.6, but I'm not entirely sure).

You may want to test that \texttt{ghc} is correctly installed.
Here's two ways to do that:

\begin{verbatim}
:~$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 6.10.1
:~$ ghci
GHCi, version 6.10.1: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer ... linking ... done.
Loading package base ... linking ... done.
Prelude> 1 + 2
3
Prelude> :q
Leaving GHCi.
\end{verbatim}
Besides the compiler (\texttt{ghc}), this should also give you the
interactive environment, \texttt{ghci}. Using \texttt{ghci} you can
type in Haskell expressions, which are then evaluated. You can quit
\texttt{ghci} with the command \texttt{:q}.

The \texttt{cabal} command is a tool for install Haskell libraries.
It should come with the Haskell Platform. If you had to install
\texttt{ghc} by hand, you may also need to install \texttt{cabal}
yourself. You can find instructions for installing \texttt{cabal} here:

\begin{verbatim}
http://www.haskell.org/cabal/download.html
\end{verbatim}

Test that you have a working \texttt{cabal} installation, for
example, by running the command \texttt{cabal --version}.

Now download Chalk -- the most recent version is the .tar bundle I
sent out late October. Extract the snapshot. The easiest way to
install the library is by navigating to the extracted folder:

\begin{verbatim}
:~$ cd Chalk
:~/Chalk$ cabal install --user
\end{verbatim}

This should give produce a bit of noise, ending with \texttt{Writing
  new package config file...} If you are behind a firewall,
http-proxy, or have a particularly weird internet connection this
step may not work. The \texttt{cabal} command needs to download
several dependencies. Let me know if you get stuck on this step --
maybe I can help. There may be a flag (see \texttt{cabal install
  --help}) that you can set to fix this.

To test that Chalk has been installed succesfully, navigate to the
\verb!examples! directory, in the Chalk folder. To load one of the
examples into \texttt{ghci}, type the following command:
\begin{verbatim}
:~/Intel/Chalk/examples$ ghci Counter.hs
GHCi, version 6.10.1: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer ... linking ... done.
Loading package base ... linking ... done.
[1 of 1] Compiling Counter             ( Counter.hs, interpreted )
Ok, modules loaded: Counter.
*Counter> 
\end{verbatim}


You can get help about the various commands that \texttt{ghci}
supports by typing \texttt{:?}. The two command- you'll probably use
most are \texttt{:q}, to quit, and \texttt{:r}, to reload the file
you've loaded at the moment.

Here are a few simple tests you can run:
\begin{verbatim}
*Counter> first (counter testInput)
-- computes the value produced on the first clock cycle
*Counter> simulate (counter testInput)
-- simulates the counter with the testInput defined in the file
--  hit Ctr-C to interrupt
\end{verbatim}


\section{Introduction to Chalk}
\label{sec:introduction}

Chalk provides several combinators and types to write architecture
descriptions. In this section, I'll outline how to use these
combinators by means of several small examples. The next sections
will discuss more substantial examples and the underlying
implementation of the library.

The simplest combinator is called |pure|. It has the following type:

< pure :: a -> Signal a

In other words, it boxes any Haskell value as a signal function. A
signal function returning a value of type |a| has the Haskell
type |Signal a|.

For example, we may want to write the signal that is always
|False|:

< zero :: Signal Bool
< zero = pure False

Similarly, if we want to describe a component that inverts a signal,
we can also use the |pure| function:

< inverter :: Signal (Bool -> Bool)
< inverter = pure not

Here |not| is Haskell Boolean negation function. Note that these
signals do not need to be first-order -- here we have a signal of
type |Bool -> Bool|.

Now we may want to `connect' the |inverter| and |zero|
signals. There is a second combinator, written using the infix
operator |<*>| (written \texttt{<*>} in ASCII) that does just this:

< <*> :: Signal (a -> b) -> Signal a -> Signal b

You may want to think of this combinator as making function
application explicit. Using this combinator we can wire together the
inverter and |zero| signal to get a signal that is always |one|

< one :: Signal Bool
< one = inverter <*> zero

Using these two basic combinators, however, it is fairly
straightforward to write more complicated signal functions. For
instance, the |mux| takes three input signals. Based on its
first input signal, it outputs one of its other arguments.

< mux :: Signal Bool -> Signal a -> Signal a -> Signal a
< mux cs ts es = pure cond <*> cs <*> ts <*> es
<  where
<  cont :: Bool -> a -> a -> a
<  cond c t e = if c then t else e

There is one more primitive combinator missing. We need some notion
of delay:

< delay :: a -> Signal a -> Signal a

You can use the |delay| function to write recursive signal
functions. For example, you may want to repeatedly apply a function
(as in dataflow languages like Lustre):

< iterator :: Signal (a -> a) -> a -> Signal a
< iterator h x = delay x (h <*> iterator h x)

In the first clock cycle this will return |x|; in the second cycle
it will return |h(x)|; in the third cycle it will return |h (h(x))|;
etc.

Using this |delay| function, we can define other, more intricate
loop combinators such as:

< loop :: Signal (s -> (a,s)) -> s -> Signal a
< loop h s =  let outs = h <*> delay s (pure snd <*> outs)
<             in pure fst <*> outs

The definition is a little bit tricky. It takes a signal function of
type |s -> (a,s)| and an initial `state' of type |s|. It feeds this
initial state into the signal function. This yields a pair of
type |(a,s)| -- corresponding to the first value the circuit
produces and the next state. There's a little bit of work necessary
(as witnessed by the projections |fst| and |snd|) to wire everything
together.

Finally, you may want to organise the signal functions into a
hierarchy. There is one last combinator, |component|, that
explicitly groups some signal function into one logical unit:

< component :: String -> Signal a -> Signal a

The |String| argument gives some name to the component. It need not
be unique.
\subsection*{Counter example}
\label{sec:counter-example}

We can now show how to write a simple counter using these
combinators. The counter will have the following type:

< counter :: Signal Bool -> Signal Int

In other words, given a boolean signal, it will return the number of
clock cycles since the input signal was high.

< counter reset = component "Counter" (iterator count 0)
<   where
<   count :: Signal (Int -> Int)
<   count = mux reset (pure (const 0)) (pure increment)

This definition uses the |iterator| function to define a recursive
signal function. In the first cycle, it will yield zero. In the
later cycles, it will return |0| when the |reset| signal is high;
otherwise it will increment the previous value.

\section{Simple Hawk Microprocessor}
\label{sec:simple-hawk-arch}

In this section I'd like to give a larger example taken from one of
the Hawk papers. I'll discuss the specification of a simple
microprocessor. This constitutes the first of a series of examples
included with the code I released on the Google Groups page. The
later versions add pipelining to this simple specification.

The first version of the Simple HAwk Microprocessor, or SHAM, is
found in a file called \texttt{Sham.hs}. It defines a microprocessor
consisting of an ALU and a register file. There are four registers.
The ALU is can add, subtract, or increment integers. We start our
specification with the following data types:

\begin{code}
  data Reg = R0 | R1 | R2 | R3 deriving (Show, Eq)

data Cmd = ADD | SUB | INC deriving (Show, Eq)
\end{code}
The |Reg| data type corresponds to the four registers, |R0|
through |R3|, that the SHAM microprocessor has. The |Cmd| data type
corresponds the possible commands that can be given to the ALU.

The ALU spec is extremely simple. Given a command and a pair of
integers, it computes the result of executing that command with
those integers as input:
\begin{code}
alu :: Signal Cmd -> Signal (Int, Int) -> Signal Int
alu cmds (xys) = component "ALU" (pure interpret <*> cmds <*> xys)
  where
  interpret :: Cmd -> (Int, Int) -> Int
  interpret ADD (x,y) = x + y
  interpret SUB (x,y) = x - y
  interpret INC (x,_) = x + 1
\end{code}
Note that by convention, |INC| increments the first integer,
ignoring the second.

The register file is slightly more complicated. The type of register
file is:

> regFile :: Signal Reg -> Signal Int -> Signal Reg -> Signal Reg -> Signal (Int, Int)

The first two arguments contain write information: the register file
should write the value of the second argument to the register given
by the first argument. The |regFile| should return a pair of the
integers stored in the register specified by the third and fourth
argument respectively.

The |regFile| is defined using the |loop| combinator. Here is the
first part of the definition:
\begin{code}
regFile wr val rd1 rd2 = component "RegisterFile" rf
    where
    initRegs = (0,0,0,0)
    rf =   loop (pure regStep <*> wr <*> val <*> rd1 <*> rd2) initRegs
\end{code}
We initialize the registers to |0|, but defer the actual work to
the |regStep| function defined in Figure 1.

The |regStep| function is given the same arguments as |regFile| --
only now we have an additional argument of type |Regs|, representing
the current state of the registers. The |regFile| function computes
the new state of the registers, |reg'|, that results from performing
the required update. It looks up the two argument registers |rd1|
and |rd2| in this newly computed register state. It returns the
results of this read (|res1| and |res2|), together with the new
register state |reg'|. We need two auxiliary definitions |updateReg|
and |lookupReg| to complete the definition of the register file.
These definitions are both fairly unremarkable.
\begin{figure}
\begin{code}
  type Regs = (Int, Int, Int, Int)

  regStep :: Reg -> Int -> Reg -> Reg -> Regs -> ((Int, Int), Regs)
  regStep wr x rd1 rd2 regs =
  let regs' = updateReg (wr,x) regs
      res1 = lookupReg rd1 regs'
      res2 = lookupReg rd2 regs'
  in ((res1, res2), regs')  
 
  updateReg :: (Reg,Int) -> Regs -> Regs
  updateReg (R0,x) (a,b,c,d) = (x,b,c,d)
  updateReg (R1,x) (a,b,c,d) = (a,x,c,d)
  updateReg (R2,x) (a,b,c,d) = (a,b,x,d)
  updateReg (R3,x) (a,b,c,d) = (a,b,c,x)

  lookupReg :: Reg -> Regs -> Int
  lookupReg R0 (a,b,c,d) = a
  lookupReg R1 (a,b,c,d) = b
  lookupReg R2 (a,b,c,d) = c
  lookupReg R3 (a,b,c,d) = d
\end{code}  
  \caption{The |regStep| function}
  \label{fig:regstep}
\end{figure}

Now we can assemble the ALU and register file into a complete
microprocessor. The microprocessor receives a sequence of register
assignments, such as |R3 <- ADD R2 R4|. It proceeds by looking up the
values of the argument registers. The resulting integers are passed
to the ALU; the result of the computation is then written to the
destination register.

Correspondingly, the we define |sham| spec as follows:
\begin{code}
sham :: Signal Cmd -> Signal Reg -> Signal Reg -> Signal Reg -> (Signal Reg, Signal Int)
sham cmd dest regArg1 regArg2 = (dest' , aluOutput')
  where
  aluOutput = alu cmd aluInputs
  aluInputs = regFile dest' aluOutput' regArg1 regArg2
  dest' = delay R0 dest
  aluOutput' = delay 0 aluOutput
\end{code}
The |aluInputs| are the result of consulting the register file.
These are passed to the ALU, together with the commands specified in
the |cmd| signal, yielding the ALU outputs. The assignments
resulting from the ALU are passed back to the register file, that is
updated accordingly.

\section{Implementation}
\label{sec:implementation}

So far I have introduced several combinators and shown how they may
be used. I have not explained how these combinators may be
implemented.

The simplest way to implement the primitive combinators
(|delay|, |pure|, |component|, and |<*>|) is by defining the
following data type:

\begin{code}
type Signal a = Circuit a

data Circuit a where
  Delay :: a -> Circuit a -> Circuit a
  Pure :: a -> Circuit a
  Component :: String -> Circuit a -> Circuit a
  App :: Circuit (b -> a) -> Circuit b -> Circuit a
\end{code}

For every combinator, we introduce a separate constructor.
Using this data type, we can simulate our circuits as follows:

%format x_i = x"_i"
%format f_i = f"_i"
%format f1 = f"_1"
%format f2 = f"_2"
%format f3 = f"_3"
%format x1 = x"_1"
%format x2 = x"_2"
%format x3 = x"_3"

\begin{code}
simulate :: Signal a -> [a]
simulate (Delay x c) = x : simulate c
simulate (Pure x) = repeat x
simulate (Circuit nm c) = simulate c
simulate (App f x) = zipWith ($) (simulate f) (simulate x)
\end{code}%$

The |simulate| function takes a signal function and returns an
infinite list corresponding to the values produced by the circuit at
every clock cycle. Simulating |Delay x c| adds |x| to the head of
the list obtained by simulating |c|. In the case of |Pure x|
constructor, we return an infinite list of |x|s. To simulate a
component, we ignore the |String| and simulate its underlying
circuit. Finally, to handle application we simulate both the
`function' circuit and the `argument' circuit, yielding an infinite
list of functions and an infinite list of arguments. We return the
result of pointwise applying every function to its corresponding
argument, i.e., the list |[f1 x1, f2 x2, f3 x3, ...]|.

\subsection{Problems}
\label{sec:problems}

This simple implementation is fine if all you are interested in is
simulating the circuits. For this project, however, we are
particularly interested in analysing these circuit descriptions.

Suppose we want to count the number of components in one of these
circuit descriptions. To do so, we want to define a function
from |Circuit a -> Int| that counts the number of |Pure|
and |Component| constructors. The obvious way to do this is by
defining the function below.

\begin{code}
size :: Circuit a -> Int
size (Pure x) = 1
size (Delay x c) = size c
size (Component nm c) = 1
size (App f x) = size f + size x
\end{code}

Unfortunately this definition will fail to terminate for the example
circuits we have seen so far. To see this, consider the |counter|
example we defined previously. It uses the |iterator| function to define
a recursive circuit. The |iterator| function itself was defined as follows:

< iterator :: Signal (a -> a) -> a -> Signal a
< iterator h x = delay x (h <*> iterator h x)

Now let's see what happens when we try to compute the |size| of any
function defined using the |iterator| function:

< size (iterator h x)
< = size (Delay x (App h (iterator h x)) 
< = size (App h (iterator h x))
< = size h + size (iterator h x)

This computation evaluates to |size h + size h + size h + ...|
which clearly doesn't terminate.

\subsection{Solution}
\label{sec:solution}

To overcome this problem, we need to add more information to
the |Circuit| data type. In particular, we propose to add a unique
label to the constructors of the |Circuit| data type. 

Conceptually, the |Circuit| data type then becomes:
\begin{code}
type Name = Int

data Circuit a where
  Delay :: Unique -> a -> Circuit a -> Circuit a
  Pure :: Unique -> a -> Circuit a
  App :: Unique -> Circuit (b -> a) -> Circuit b -> Circuit a
  Component :: Unique -> Circuit a -> Circuit a
\end{code}

% The |Signal| functions now carry around a unique name. Although this
% is a bit more complicated than our previous definition, we can keep
% the same API. For example, the definition of the |pure| and |<*>|
% combinators now become:

% < pure x = \nm -> (Pure nm x, nm + 1)
% < f <*> x = \nm -> 
% <   let  nm' = nm + 1
% <        (f',nm'') = f nm'
% <        (x',nm''') = x nm''
% <   in (App nm f' x', nm''')

% (The definition of |<*>| can be a bit cleaner using the state monad
% but I thought I'd try to keep things simple in this document.)

Using such a definition, we can now detect when we have already
visited a node in the |Circuit| data type by keeping track of a list
of previously encountered nodes. We can stop the recursion when we
encounter a node we have visited previously. For example, we can
redefine the |size| function to the code below.

\begin{code}
size :: Circuit a -> Int
size c = visit [] c
  where
  check :: Unique -> [Unique] -> Int -> (Int, [Unique])
  check u us x = if u `elem` us then (0,us) else (x,u:us)
  visit :: [Unique] -> Circuit a -> (Int, [Unique])
  visit us (Pure u x) = check u us 1
  visit us (Delay u x c) = check u us (visit (u:us) c)
  visit us (App u f x) = 
    let  (fSize,us') = check u us (visit (u : us) f)
         (xSize,us'') = check u us (visit us' x)
    in (fSize + xSize, us'')
\end{code}

Using similar traversals, it should be possible to visualise the
component structure of the circuit, for example.

\section{Graph visualisation \& netlist generation}
\label{sec:graph-visualisation}

The structure for circuits we have seen so far corresponds (very
roughly) to (typed) binary trees (with sharing): the leaves store
pure Haskell values, a node applies the left subtree to the right.
To generate netlists or visualise the circuits, we need to do some
work.

The simplest way to visualize the circuit structure is a straightforward

\begin{code}
  simpleVis :: Circuit a -> [Edge] 
  simpleVis c = evalState (wiresC (hashUnique (unique (circuit c))) c) []
  where
  wiresC :: Int -> Circuit a -> State [Unique] [Edge]
  wiresC from (Circuit r) = do
    visited <- get
    if unique r `elem` visited
      then return []
      else modify (unique r :) >> step (hashUnique (unique r)) (deref r)
  step :: Int -> CircuitF Circuit a -> State [Unique] ([DotEdge], [DotNode])
  step now (Pure x) = return []
  step now (App l r) = do
    ledges <- wiresC now l
    redges <- wiresC now r
    let ledge = hashUnique (unique (circuit l)) `to` now
    let redge = hashUnique (unique (circuit r)) `to` now
    return (ledge : redge : ledges ++ redges)
  step now (Component nm c) = do
    edges <- wiresC now c
    let edge = hashUnique (unique (circuit c)) `to` now
    return (edge : edges)
  step now (Input nm) = return []
  step now (Delay x c) = do
    edges <- wiresC now c
    let edge = hashUnique (unique (circuit c)) `to` now
    return (edge : edges)
\end{code}

% Here we'r assuming a function |edge| of type |Tree a -> Tree a ->
% Edge|, that constructs an edge between the two trees it receives as
% argument.

% The problem with this implementation is that it does not handle
% functions with more than one argument well. For example, the
% expression |1 + 2| might be represented as:

% > Node (Leaf "+" ) (Node (Leaf "1") (Leaf "2"))

% The above preorder traversal will yield two edges: one between
% the |+| and the right subtree; the other between |1| and |2|. What
% we would expect however, is a pair of edges from |+| and to its two
% arguments.

% \begin{code}
% traverse :: Show a => Tree a -> String
% traverse tree = t [tree]
%   where
%   t :: Show a => [Tree a] -> String
%   t [] = ""
%   t (Leaf x : xs) = show x ++ (t xs)
%   t (Node l r : xs) = spine r (l : xs)
%   spine (Leaf x) xs = show x ++ t xs
%   spine (Node l r) xs = spine r (l : xs)
% \end{code}


\section{Wild Speculation and other further work}
\label{sec:wild-spec-other}

In this section, I'd like to outline some ideas and sketch the
directions I'd like to take this work from here.

\begin{description} 

\item[Performance analysis] Emily pointed out a
  paper outlining how to perform a simple performance analysis on a
  high-level architecture description. I expect it should be
  possible to do something similar with these descriptions. I
  haven't thought about power analysis too much yet. We could make a
  very rough type-directed estimation of the gate count necessary to
  implement these specs -- which could give us some idea of where
  the power bottlenecks might emerge.


\item[Non-standard interpretation] Given that we have a \emph{typed}
  circuit description, we might be able to perform non-standard
  interpretations -- along the lines of the |size| function -- to
  extract other information from our circuit descriptions.

\item[Integration with Lava/Wired] Besides the purely behavioural
  circuits, we should also incorporate structural descriptions, such
  as those from Lava/Wired. The easiest way to do this would be to
  add a new constructor to the |Circuit| data type:

< LavaCircuit :: Lava a -> Circuit a

That embeds any Lava circuit into the Circuit data type. This is
particularly interesting if we want to perform power/performance
analyses: such analyses are much more accurate on such
structural descriptions.

\item[Refinement] Ideally, we would like to offer architects the
  possibility to refine these behavioural descriptions, adding more
  and more implementation detail until they have a complete
  architecture description. One might achieve this by replacing
  the |Pure| components with Lava circuits. 
\end{description}


\end{document}

%%% Local Variables: 
%%% mode: latex
%%% TeX-command-default: "LiterateHaskell"
%%% TeX-master: t
%%% End: