Database.Enumerator

Takusen-0.8.2: Database library with left-fold interface, for PostgreSQL, Oracle, SQLite, ODBC.

Database.Enumerator

Portability	non-portable
Stability	experimental
Maintainer	oleg@pobox.com, alistair@abayley.org

Contents

Usage

Iteratee Functions
result and result'
Rank-2 types, ($), and the monomorphism restriction
Bind Parameters
Multiple (and nested) Result Sets

Sessions and Transactions

Exceptions and handlers

Preparing and Binding

Iteratees and Cursors

Utilities

Description

Abstract database interface, providing a left-fold enumerator and cursor operations.

There is a stub: Database.Stub.Enumerator. This lets you run the test cases without having a working DBMS installation. This isn't so valuable now, because it's dead easy to install Sqlite, but it's still there if you want to try it.

Additional reading:

Note that there are a few functions that are exported from each DBMS-specific implementation which are exposed to the API user, and which are part of the Takusen API, but are not (necessarily) in this module. They include:

connect (obviously DBMS specific)

prepareQuery, prepareLargeQuery, prepareCommand, sql, sqlbind, prefetch, cmdbind

These functions will typically have the same names and intentions, but their specific types and usage may differ between DBMS.

Synopsis

data DBM mark sess a

class ISession sess

data ConnectA sess

withSession :: (Typeable a, ISession sess) => ConnectA sess -> (forall mark . DBM mark sess a) -> IO a

withContinuedSession :: (Typeable a, ISession sess) => ConnectA sess -> (forall mark . DBM mark sess a) -> IO (a, ConnectA sess)

commit :: ISession s => DBM mark s ()

rollback :: ISession s => DBM mark s ()

beginTransaction :: (MonadReader s (ReaderT s IO), ISession s) => IsolationLevel -> DBM mark s ()

withTransaction :: ISession s => IsolationLevel -> DBM mark s a -> DBM mark s a

data IsolationLevel

execDDL :: Command stmt s => stmt -> DBM mark s ()

execDML :: Command stmt s => stmt -> DBM mark s Int

inquire :: EnvInquiry key s result => key -> DBM mark s result

data DBException

= DBError SqlState Int String

| DBFatal SqlState Int String

| DBUnexpectedNull RowNum ColNum

| DBNoData

formatDBException :: DBException -> String

basicDBExceptionReporter :: CaughtMonadIO m => DBException -> m ()

reportRethrow :: CaughtMonadIO m => DBException -> m a

reportRethrowMsg :: CaughtMonadIO m => String -> DBException -> m a

catchDB :: CaughtMonadIO m => m a -> (DBException -> m a) -> m a

catchDBError :: CaughtMonadIO m => Int -> m a -> (DBException -> m a) -> m a

ignoreDBError :: CaughtMonadIO m => Int -> m a -> m a

throwDB :: DBException -> a

type ColNum = Int

type RowNum = Int

type SqlState = (SqlStateClass, SqlStateSubClass)

type SqlStateClass = String

type SqlStateSubClass = String

data PreparedStmt mark stmt

withPreparedStatement :: (Typeable a, IPrepared stmt sess bstmt bo) => PreparationA sess stmt -> (PreparedStmt mark stmt -> DBM mark sess a) -> DBM mark sess a

withBoundStatement :: (Typeable a, IPrepared stmt s bstmt bo) => PreparedStmt mark stmt -> [BindA s stmt bo] -> (bstmt -> DBM mark s a) -> DBM mark s a

class ISession sess => Statement stmt sess q | stmt sess -> q

class ISession sess => Command stmt sess

class ISession sess => EnvInquiry inquirykey sess result | inquirykey sess -> result

data PreparationA sess stmt

class ISession sess => IPrepared stmt sess bound_stmt bo | stmt -> bound_stmt, stmt -> bo

data BindA sess stmt bo

class ISession sess => DBBind a sess stmt bo | stmt -> bo where

bindP :: a -> BindA sess stmt bo

bindP :: DBBind a sess stmt bo => a -> BindA sess stmt bo

class ISession sess => IQuery q sess b | q -> sess, q -> b where

currentRowNum :: q -> IO Int

doQuery :: (Statement stmt sess q, QueryIteratee (DBM mark sess) q i seed b, IQuery q sess b) => stmt -> i -> seed -> DBM mark sess seed

class DBType a q b | q -> b

type IterResult seedType = Either seedType seedType

type IterAct m seedType = seedType -> m (IterResult seedType)

currentRowNum :: IQuery q sess b => q -> IO Int

data NextResultSet mark stmt = NextResultSet (PreparedStmt mark stmt)

data RefCursor a = RefCursor a

cursorIsEOF :: DBCursor mark (DBM mark s) a -> DBM mark s Bool

cursorCurrent :: DBCursor mark (DBM mark s) a -> DBM mark s a

cursorNext :: DBCursor mark (DBM mark s) a -> DBM mark s (DBCursor mark (DBM mark s) a)

withCursor :: (Typeable a, Statement stmt sess q, QueryIteratee (DBM mark sess) q i seed b, IQuery q sess b) => stmt -> i -> seed -> (DBCursor mark (DBM mark sess) seed -> DBM mark sess a) -> DBM mark sess a

type Position = Int

ifNull :: Maybe a -> a -> a

result :: Monad m => IterAct m a

result' :: Monad m => IterAct m a

Usage

Let's look at some example code:

 -- sample code, doesn't necessarily compile
 module MyDbExample is

 import Database.Oracle.Enumerator
 import Database.Enumerator
 ...

 query1Iteratee :: (Monad m) => Int -> String -> Double -> IterAct m [(Int, String, Double)]
 query1Iteratee a b c accum = result' ((a, b, c):accum)

 -- non-query actions.
 otherActions session = do
   execDDL (sql "create table blah")
   execDML (cmdbind "insert into blah (...) values (?, ?, ?, ...)" [bindP "v1", bindP (1::Int), ...])
   commit
   -- Use withTransaction to delimit a transaction.
   -- It will commit at the end, or rollback if an error occurs.
   withTransaction Serialisable ( do
     execDML (sql "update blah ...")
     execDML (sql "insert into blah ...")
     )

 main :: IO ()
 main = do
   withSession (connect "user" "password" "server") ( do
     -- simple query, returning reversed list of rows.
     r <- doQuery (sql "select a, b, c from x") query1Iteratee []
     liftIO $ putStrLn $ show r
     otherActions session
     )

Notes:

connection is made by withSession, which also disconnects when done i.e. withSession delimits the connection. You must pass it a connection action, which is back-end specific, and created by calling the connect function from the relevant back-end.
inside the session, the usual transaction delimiter commands are usable e.g. beginTransaction IsolationLevel, commit, rollback, and withTransaction. We also provide execDML and execDDL.
non-DML and -DDL commands - i.e. queries - are processed by doQuery (this is the API for our left-fold). See more explanation and examples below in Iteratee Functions and Bind Parameters sections.

The first argument to doQuery must be an instance of Statement. Each back-end will provide a useful set of Statement instances and associated constructor functions for them. For example, currently all back-ends have:

for basic, all-text statements (no bind variables, default row-caching) which can be used as queries or commands:

      sql "select ..."

for a select with bind variables:

      sqlbind "select ?, ... where col = ? and ..." [bindP ..., bindP ...]

for a select with bind variables and row caching:

      prefetch 100 "select ?, ... where col = ? and ..." [bindP ..., bindP ...]

for a DML command with bind variables:

      cmdbind "insert into ... values (?, ?, ...)" [bindP ..., bindP ...]

for a reusable prepared statement: we have to first create the prepared statement, and then bind in a separate step. This separation lets us re-use prepared statements:

      let stmt = prepareQuery (sql "select ? ...")
      withPreparedStatement stmt $ \pstmt ->
        withBoundStatement pstmt [bindP ..., bindP ...] $ \bstmt -> do
          result <- doQuery bstmt iter seed
          ...

The PostgreSQL backend additionally requires that when preparing statements, you (1) give a name to the prepared statement, and (2) specify types for the bind parameters. The list of bind-types is created by applying the bindType function to dummy values of the appropriate types. e.g.

 let stmt = prepareQuery "stmtname" (sql "select ...") [bindType "", bindType (0::Int)]
 withPreparedStatement stmt $ \pstmt -> ...

A longer explanation of prepared statements and bind variables is in the Bind Parameters section below.

Iteratee Functions

doQuery takes an iteratee function, of n arguments. Argument n is the accumulator (or seed). For each row that is returned by the query, the iteratee function is called with the data from that row in arguments 1 to n-1, and the current accumulated value in the argument n.

The iteratee function returns the next value of the accumulator, wrapped in an Either. If the Either value is Left, then the query will terminate, returning the wrapped accumulator/seed value. If the value is Right, then the query will continue, with the next row begin fed to the iteratee function, along with the new accumulator/seed value.

In the example above, query1Iteratee simply conses the new row (as a tuple) to the front of the accumulator. The initial seed passed to doQuery was an empty list. Consing the rows to the front of the list results in a list with the rows in reverse order.

The types of values that can be used as arguments to the iteratee function are back-end specific; they must be instances of the class DBType. Most backends directly support the usual lowest-common-denominator set supported by most DBMS's: Int, String, Double, UTCTime. (Int64 is often, but not always, supported.)

By directly support we mean there is type-specific marshalling code implemented. Indirect support for Read- and Show-able types is supported by marshalling to and from Strings. This is done automatically by the back-end; there is no need for user-code to perform the marshalling, as long as instances of Read and Show are defined.

The iteratee function operates in the DBM monad, so if you want to do IO in it you must use liftIO (e.g. liftIO $ putStrLn "boo" ) to lift the IO action into DBM.

The iteratee function is not restricted to just constructing lists. For example, a simple counter function would ignore its arguments, and the accumulator would simply be the count e.g.

 counterIteratee :: (Monad m) => Int -> IterAct m Int
 counterIteratee _ i = result' $ (1 + i)

The iteratee function that you pass to doQuery needs type information, at least for the arguments if not the return type (which is typically determined by the type of the seed). The type synonyms IterAct and IterResult give some convenience in writing type signatures for iteratee functions:

 type IterResult seedType = Either seedType seedType
 type IterAct m seedType = seedType -> m (IterResult seedType)

Without them, the type for counterIteratee would be:

 counterIteratee :: (Monad m) => Int -> Int -> m (Either Int Int)

which doesn't seem so onerous, but for more elaborate seed types (think large tuples) it certainly helps e.g.

 iter :: Monad m =>
      String -> Double -> UTCTime -> [(String, Double, UTCTime)]
   -> m (Either [(String, Double, UTCTime)] [(String, Double, UTCTime)] )

reduces to (by using IterAct and IterResult):

 iter :: Monad m =>
      String -> Double -> UTCTime -> IterAct m [(String, Double, UTCTime)]

result and result'

The result (lazy) and result' (strict) functions are another convenient shorthand for returning values from iteratee functions. The return type from an iteratee is actually Either seed seed, where you return Right if you want processing to continue, or Left if you want processing to stop before the result-set is exhausted. The common case is:

 query1Iteratee a b c accum = return (Right ((a, b, c):accum))

which we can write as

 query1Iteratee a b c accum = result $ (a, b, c):accum)

We have lazy and strict versions of result. The strict version is almost certainly the one you want to use. If you come across a case where the lazy function is useful, please tell us about it. The lazy function tends to exhaust the stack for large result-sets, whereas the strict function does not. This is due to the accumulation of a large number of unevaluated thunks, and will happen even for simple arithmetic operations such as counting or summing.

If you use the lazy function and you have stack/memory problems, do some profiling. With GHC:

ensure the iteratee has its own cost-centre (make it a top-level function)
compile with -prof -auto-all
run with +RTS -p -hr -RTS
run hp2ps over the resulting .hp file to get a .ps document, and take a look at it. Retainer sets are listed on the RHS, and are prefixed with numbers e.g. (13)CAF, (2)SYSTEM. At the bottom of the .prof file you'll find the full descriptions of the retainer sets. Match the number in parentheses on the .ps graph with a SET in the .prof file; the one at the top of the .ps graph is the one using the most memory.

You'll probably find that the lazy iteratee is consuming all of the stack with lazy thunks, which is why we recommend the strict function.

Rank-2 types, ($), and the monomorphism restriction

In some examples we use the application operator ($) instead of parentheses (some might argue that this is a sign of developer laziness). At first glance, ($) and conventional function application via juxtaposition seem to be interchangeable e.g.

 liftIO (putStrLn (show x))

looks equivalent to

 liftIO $ putStrLn $ show x

But they're not, because Haskell's type system gives us a nice compromise.

In a Hindley-Milner type system (like ML) there is no difference between ($) and function application, because polymorphic functions are not first-class and cannot be passed to other functions. At the other end of the scale, ($) and function application in System F are equivalent, because polymorphic functions can be passed to other functions. However, type inference in System F is undecidable.

Haskell hits the sweet spot: maintaining full inference, and permitting rank-2 polymorphism, in exchange for very few type annotations. Only functions that take polymorphic functions (and thus are higher-rank) need type signatures. Rank-2 types can't be inferred. The function ($) is a regular, rank-1 function, and so it can't take polymorphic functions as arguments and return polymorphic functions.

Here's an example where ($) fails: we supply a simple test program in the README file. If you change the withSession line to use ($), like so (and remove the matching end-parenthese):

   withSession (connect "sqlite_db") $ do

then you get the error:

 Main.hs:7:38:
     Couldn't match expected type `forall mark. DBM mark Session a'
            against inferred type `a1 b'
     In the second argument of `($)', namely
       ...

Another way of rewriting it is like this, where we separate the DBM action into another function:

 {-# OPTIONS -fglasgow-exts #-}
 module Main where
 import Database.Sqlite.Enumerator
 import Control.Monad.Trans (liftIO)
 main = flip catchDB reportRethrow $
   withSession (connect "sqlite_db") hello

 hello = withTransaction RepeatableRead $ do
     let iter (s::String) (_::String) = result s
     result <- doQuery (sql "select 'Hello world.'") iter ""
     liftIO (putStrLn result)

which gives this error:

 Main.hs:9:2:
     Inferred type is less polymorphic than expected
       Quantified type variable `mark' is mentioned in the environment:
         hello :: DBM mark Session () (bound at Main.hs:15:0)
         ...

This is just the monomorphism restriction in action. Sans a type signature, the function hello is monomorphised (that is, mark is replaced with (), per GHC rules). This is easily fixed by adding this type declaration:

 hello :: DBM mark Session ()

Bind Parameters

Support for bind variables varies between DBMS's.

We call withPreparedStatement function to prepare the statement, and then call withBoundStatement to provide the bind values and execute the query. The value returned by withBoundStatement is an instance of the Statement class, so it can be passed to doQuery for result-set processing.

When we call withPreparedStatement, we must pass it a "preparation action", which is simply an action that returns the prepared query. The function to create this action varies between backends, and by convention is called prepareQuery. For DML statements, you must use prepareCommand, as the library needs to do something different depending on whether or not the statement returns a result-set.

For queries with large result-sets, we provide prepareLargeQuery, which takes an extra parameter: the number of rows to prefetch in a network call to the server. This aids performance in two ways: 1. you can limit the number of rows that come back to the client, in order to use less memory, and 2. the client library will cache rows, so that a network call to the server is not required for every row processed.

With PostgreSQL, we must specify the types of the bind parameters when the query is prepared, so the prepareQuery function takes a list of bindType values. Also, PostgreSQL requires that prepared statements are named, although you can use "" as the name.

With Sqlite and Oracle, we simply pass the query text to prepareQuery, so things are slightly simpler for these backends.

Perhaps an example will explain it better:

 postgresBindExample = do
   let
     query = sql "select blah from blahblah where id = ? and code = ?"
     iter :: (Monad m) => String -> IterAct m [String]
     iter s acc = result $ s:acc
     bindVals = [bindP (12345::Int), bindP "CODE123"]
     bindTypes = [bindType (0::Int), bindType ""]
   withPreparedStatement (prepareQuery "stmt1" query bindTypes) $ \pstmt -> do
     withBoundStatement pstmt bindVals $ \bstmt -> do
       actual <- doQuery bstmt iter []
       liftIO (print actual)

Note that we pass bstmt to doQuery; this is the bound statement object created by withBoundStatement.

The Oracle/Sqlite example code is almost the same, except for the call to prepareQuery:

 sqliteBindExample = do
   let
     query = sql "select blah from blahblah where id = ? and code = ?"
     iter :: (Monad m) => String -> IterAct m [String]
     iter s acc = result $ s:acc
     bindVals = [bindP (12345::Int), bindP "CODE123"]
   withPreparedStatement (prepareQuery query) $ \pstmt -> do
     withBoundStatement pstmt bindVals $ \bstmt -> do
       actual <- doQuery bstmt iter []
       liftIO (print actual)

It can be a bit tedious to always use the withPreparedStatement+withBoundStatement combination, so for the case where you don't plan to re-use the query, we support a short-cut for bundling the query text and parameters. The next example is valid for PostgreSQL, Sqlite, and Oracle (the Sqlite implementation provides a dummy prefetch function to ensure we have a consistent API). Sqlite has no facility for prefetching - it's an embedded database, so no network round-trip - so the Sqlite implementation ignores the prefetch count:

 bindShortcutExample = do
   let
     iter :: (Monad m) => String -> IterAct m [String]
     iter s acc = result $ s:acc
     bindVals = [bindP (12345::Int), bindP "CODE123"]
     query = prefetch 1000 "select blah from blahblah where id = ? and code = ?" bindVals
   actual <- doQuery query iter []
   liftIO (print actual)

A caveat of using prefetch with PostgreSQL is that you must be inside a transaction. This is because the PostgreSQL implementation uses a cursor and "FETCH FORWARD" to implement fetching a block of rows in a single network call, and PostgreSQL requires that cursors are only used inside transactions. It can be as simple as wrapping calls to doQuery by withTransaction, or you may prefer to delimit your transactions elsewhere (the API supports beginTransaction and commit, if you prefer to use them):

   withTransaction RepeatableRead $ do
     actual <- doQuery query iter []
     liftIO (print actual)

You may have noticed that for Int and Double literal bind values, we have to tell the compiler the type of the literal. This is due to interaction with the numeric literal defaulting mechanism. For non-numeric literals the compiler can (usually) determine the correct types to use.

If you omit type information for numeric literals, from GHC the error message looks something like this:

 Database/PostgreSQL/Test/Enumerator.lhs:194:4:
     Overlapping instances for Database.InternalEnumerator.DBBind a
                                  Session
                                  Database.PostgreSQL.PGEnumerator.PreparedStmt
                                  Database.PostgreSQL.PGEnumerator.BindObj
       arising from use of `bindP' at Database/PostgreSQL/Test/Enumerator.lhs:194:4-8
     Matching instances:
       Imported from Database.PostgreSQL.PGEnumerator:
     instance (Database.InternalEnumerator.DBBind (Maybe a)
                              Session
                              Database.PostgreSQL.PGEnumerator.PreparedStmt
                              Database.PostgreSQL.PGEnumerator.BindObj) =>
          Database.InternalEnumerator.DBBind a
                             Session
                             Database.PostgreSQL.PGEnumerator.PreparedStmt
                             Database.PostgreSQL.PGEnumerator.BindObj
       Imported from Database.PostgreSQL.PGEnumerator:
     instance Database.InternalEnumerator.DBBind (Maybe Double)
                        ....

Multiple (and nested) Result Sets

Support for returning multiple result sets from a single statement exists for PostgreSQL and Oracle. Such functionality does not exist in Sqlite.

The general idea is to invoke a database procedure or function which returns cursor variables. The variables can be processed by doQuery in one of two styles: linear or nested.

Linear style:

If we assume the existence of the following PostgreSQL function, which is used in the test suite in Database.PostgreSQL.Test.Enumerator:

 CREATE OR REPLACE FUNCTION takusenTestFunc() RETURNS SETOF refcursor AS $$
 DECLARE refc1 refcursor; refc2 refcursor;
 BEGIN
     OPEN refc1 FOR SELECT n*n from t_natural where n < 10 order by 1;
     RETURN NEXT refc1;
     OPEN refc2 FOR SELECT n, n*n, n*n*n from t_natural where n < 10 order by 1;
     RETURN NEXT refc2;
 END;$$ LANGUAGE plpgsql;

... then this code shows how linear processing of cursors would be done:

   withTransaction RepeatableRead $ do
   withPreparedStatement (prepareQuery "stmt1" (sql "select * from takusenTestFunc()") []) $ \pstmt -> do
   withBoundStatement pstmt [] $ \bstmt -> do
     dummy <- doQuery bstmt iterMain []
     result1 <- doQuery (NextResultSet pstmt) iterRS1 []
     result2 <- doQuery (NextResultSet pstmt) iterRS2 []
   where
     iterMain :: (Monad m) => (RefCursor String) -> IterAct m [RefCursor String]
     iterMain c acc = result (acc ++ [c])
     iterRS1 :: (Monad m) => Int -> IterAct m [Int]
     iterRS1 i acc = result (acc ++ [i])
     iterRS2 :: (Monad m) => Int -> Int -> Int -> IterAct m [(Int, Int, Int)]
     iterRS2 i i2 i3 acc = result (acc ++ [(i, i2, i3)])

Notes:

the use of a RefCursor String type in the iteratee function indicates to the backend that it should save each cursor value returned, which it does by stuffing them into a list attached to the prepared statement object. This means that we must use withPreparedStatement to create a prepared statement object; the prepared statament oject is the container for the cursors returned.
in this example we choose to discard the results of the first iteratee. This is not necessary, but in this case the only column is a RefCursor, and the values are already saved in the prepared statement object.
saved cursors are consumed one-at-a-time by calling doQuery, passing NextResultSet pstmt (i.e. passing the prepared statement oject wrapped by NextResultSet). This simply pulls the next cursor off the list - they're processed in the order they were pushed on (FIFO) - and processes it with the given iteratee.
if you try to process too many cursors i.e. make too many calls to doQuery passing NextResultSet pstmt, then an exception will be thrown. OTOH, failing to process returned cursors will not raise errors, but the cursors will remain open on the server according to whatever scoping rules the server applies. For PostgreSQL, this will be until the transaction (or session) ends.

Nested style:

The linear style of cursor processing is the only style supported by MS SQL Server and ODBC (which we do not yet support). However, PostgreSQL and Oracle also support using nested cursors in queries.

Again for PostgreSQL, assuming we have these functions in the database:

 CREATE OR REPLACE FUNCTION takusenTestFunc(lim int4) RETURNS refcursor AS $$
 DECLARE refc refcursor;
 BEGIN
     OPEN refc FOR SELECT n, takusenTestFunc2(n) from t_natural where n < lim order by n;
     RETURN refc;
 END; $$ LANGUAGE plpgsql;

 CREATE OR REPLACE FUNCTION takusenTestFunc2(lim int4) RETURNS refcursor AS $$
 DECLARE refc refcursor;
 BEGIN
     OPEN refc FOR SELECT n from t_natural where n < lim order by n;
     RETURN refc;
 END; $$ LANGUAGE plpgsql;

... then this code shows how nested queries might work:

 selectNestedMultiResultSet = do
   let
     q = "SELECT n, takusenTestFunc(n) from t_natural where n < 10 order by n"
     iterMain   (i::Int) (c::RefCursor String) acc = result' ((i,c):acc)
     iterInner  (i::Int) (c::RefCursor String) acc = result' ((i,c):acc)
     iterInner2 (i::Int) acc = result' (i:acc)
   withTransaction RepeatableRead $ do
     rs <- doQuery (sql q) iterMain []
     flip mapM_ rs $ \(outer, c) -> do
       rs <- doQuery c iterInner []
       flip mapM_ rs $ \(inner, c) -> do
         rs <- doQuery c iterInner2 []
         flip mapM_ rs $ \i -> do
           liftIO (putStrLn (show outer ++ " " ++ show inner ++ " " ++ show i))

Just to make it clear: the outer query returns a result-set that includes a RefCursor column. Each cursor from that column is passed to doQuery to process it's result-set; here we use mapM_ to apply an IO action to the list returned by doQuery.

For Oracle the example is slightly different. The reason it's different is that:

Oracle requires that the parent cursor must remain open while processing the children (in the PostgreSQL example, doQuery closes the parent cursor after constructing the list, before the list is processed. This is OK because PostgreSQL keeps the child cursors open on the server until they are explicitly closed, or the transaction or session ends).
our current Oracle implementation prevents marshalling of the cursor in the result-set buffer to a Haskell value, so each fetch overwrites the buffer value with a new cursor. This means you have to fully process a given cursor before fetching the next one.

Contrast this with the PostgreSQL example above, where the entire result-set is processed to give a list of RefCursor values, and then we run a list of actions over this list with mapM_. This is possible because PostgreSQL refcursors are just the database cursor names, which are Strings, which we can marshal to Haskell values easily.

 selectNestedMultiResultSet = do
   let
     q = "select n, cursor(SELECT nat2.n, cursor"
         ++ "     (SELECT nat3.n from t_natural nat3 where nat3.n < nat2.n order by n)"
         ++ "   from t_natural nat2 where nat2.n < nat.n order by n)"
         ++ " from t_natural nat where n < 10 order by n"
     iterMain   (outer::Int) (c::RefCursor StmtHandle) acc = do
       rs <- doQuery c (iterInner outer) []
       result' ((outer,c):acc)
     iterInner outer (inner::Int) (c::RefCursor StmtHandle) acc = do
       rs <- doQuery c (iterInner2 outer inner) []
       result' ((inner,c):acc)
     iterInner2 outer inner (i::Int) acc = do
       liftIO (putStrLn (show outer ++ " " ++ show inner ++ " " ++ show i))
       result' (i:acc)
   withTransaction RepeatableRead $ do
     rs <- doQuery (sql q) iterMain []
     return ()

Note that the PostgreSQL example can also be written like this (except, of course, that the actual query text is that from the PostgreSQL example).

Sessions and Transactions

data DBM mark sess a

Instances

ISession si => CaughtMonadIO (DBM mark si)

(ISession sess, ??? a sess) => Functor (DBM mark sess a)

(ISession sess, ??? a sess) => Monad (DBM mark sess a)

(ISession sess, ??? a sess) => MonadFix (DBM mark sess a)

(ISession sess, ??? a sess) => MonadIO (DBM mark sess a)

class ISession sess

The ISession class describes a database session to a particular DBMS. Oracle has its own Session object, SQLite has its own session object (which maintains the connection handle to the database engine and other related stuff). Session objects for different databases normally have different types -- yet they all belong to the class ISession so we can do generic operations like commit, execDDL, etc. in a database-independent manner.

Session objects per se are created by database connection/login functions.

The class ISession is thus an interface between low-level (and database-specific) code and the Enumerator, database-independent code. The ISession class is NOT visible to the end user -- neither the class, nor any of its methods.

The ISession class describes the mapping from connection object to the session object. The connection object is created by the end user (and this is how the end user tells which particular back end he wants). The session object is not accessible by the end user in any way. Even the type of the session object should be hidden!

Instances

ISession Session

data ConnectA sess

A wrapper around the action to open the database. That wrapper is not exported to the end user. The only reason for the wrapper is to guarantee that the only thing to do with the result of connect function is to pass it out directly to withSession.

Instances

??? sess => Typeable (ConnectA sess)

withSession :: (Typeable a, ISession sess) => ConnectA sess -> (forall mark . DBM mark sess a) -> IO a

Typeable constraint is to prevent the leakage of Session and other marked objects.

withContinuedSession :: (Typeable a, ISession sess) => ConnectA sess -> (forall mark . DBM mark sess a) -> IO (a, ConnectA sess)

Persistent database connections. This issue has been brought up by Shanky Surana. The following design is inspired by that exchange.

On one hand, implementing persistent connections is easy. One may say we should have added them long time ago, to match HSQL, HDBC, and similar database interfaces. Alas, implementing persistent connection safely is another matter. The simplest design is like the following

 withContinuedSession :: (Typeable a, IE.ISession sess) => 
     IE.ConnectA sess -> (forall mark. DBM mark sess a) -> 
     IO (a, IE.ConnectA sess)
 withContinuedSession (IE.ConnectA connecta) m = do
     conn <- connecta
     r <- runReaderT (unDBM m) conn
     return (r,(return conn))

so that the connection object is returned as the result and can be used again with withContinuedSession or withSession. The problem is that nothing prevents us from writing:

     (r1,conn) <- withContinuedSession (connect "...") query1
     r2        <- withSession conn query2
     r3        <- withSession conn query3

That is, we store the suspended connection and then use it twice. But the first withSession closes the connection. So, the second withSession gets an invalid session object. Invalid in a sense that even memory may be deallocated, so there is no telling what happens next. Also, as we can see, it is difficult to handle errors and automatically dispose of the connections if the fatal error is encountered.

All these problems are present in other interfaces... In the case of a suspended connection, the problem is how to enforce the linear access to a variable. It can be enforced, via a state-changing monad. The implementation below makes the non-linear use of a suspended connection a run-time checkable condition. It will be generic and safe - fatal errors close the connection, an attempt to use a closed connection raises an error, and we cannot reuse a connection. We have to write:

     (r1, conn1) <- withContinuedSession conn  ...
     (r2, conn2) <- withContinuedSession conn1 ...
     (r3, conn3) <- withContinuedSession conn2 ...

etc. If we reuse a suspended connection or use a closed connection, we get a run-time (exception). That is of course not very satisfactory - and yet better than a segmentation fault.

commit :: ISession s => DBM mark s ()

rollback :: ISession s => DBM mark s ()

beginTransaction :: (MonadReader s (ReaderT s IO), ISession s) => IsolationLevel -> DBM mark s ()

withTransaction :: ISession s => IsolationLevel -> DBM mark s a -> DBM mark s a

Perform an action as a transaction: commit afterwards, unless there was an exception, in which case rollback.

data IsolationLevel

Constructors

ReadUncommitted
ReadCommitted
RepeatableRead
Serialisable
Serializable	for alternative spellers

Instances

Enum IsolationLevel

Eq IsolationLevel

Ord IsolationLevel

Show IsolationLevel

execDDL :: Command stmt s => stmt -> DBM mark s ()

DDL operations don't manipulate data, so we return no information. If there is a problem, an exception will be raised.

execDML :: Command stmt s => stmt -> DBM mark s Int

Returns the number of rows affected.

inquire :: EnvInquiry key s result => key -> DBM mark s result

Allows arbitrary actions to be run the DBM monad. The back-end developer must supply instances of EnvInquiry, which is hidden away in Database.InternalEnumerator. An example of this is LastInsertRowid.

Exceptions and handlers

data DBException

Constructors

DBError SqlState Int String	DBMS error message.
DBFatal SqlState Int String
DBUnexpectedNull RowNum ColNum	the iteratee function used for queries accepts both nullable (Maybe) and non-nullable types. If the query itself returns a null in a column where a non-nullable type was specified, we can't handle it, so DBUnexpectedNull is thrown.
DBNoData	Thrown by cursor functions if you try to fetch after the end.

Instances

Show DBException

Typeable DBException

formatDBException :: DBException -> String

A show for DBExceptions.

basicDBExceptionReporter :: CaughtMonadIO m => DBException -> m ()

This simple handler reports the error to stdout and swallows it i.e. it doesn't propagate.

reportRethrow :: CaughtMonadIO m => DBException -> m a

This handler reports the error and propagates it (usually to force the program to halt).

reportRethrowMsg :: CaughtMonadIO m => String -> DBException -> m a

Same as reportRethrow, but you can prefix some text to the error (perhaps to indicate which part of your program raised it).

catchDB :: CaughtMonadIO m => m a -> (DBException -> m a) -> m a

Catch DBExceptions thrown in the DBM monad.

catchDBError :: CaughtMonadIO m => Int -> m a -> (DBException -> m a) -> m a

If you want to trap a specific error number, use this. It passes anything else up.

ignoreDBError :: CaughtMonadIO m => Int -> m a -> m a

Analogous to catchDBError, but ignores specific errors instead (propagates anything else).

throwDB :: DBException -> a

Throw a DBException. It's just a type-specific throwDyn.

type ColNum = Int

type RowNum = Int

type SqlState = (SqlStateClass, SqlStateSubClass)

type SqlStateClass = String

type SqlStateSubClass = String

Preparing and Binding

data PreparedStmt mark stmt

withPreparedStatement

:: (Typeable a, IPrepared stmt sess bstmt bo)
=> PreparationA sess stmt	preparation action to create prepared statement; this action is usually created by `prepareQuery/Command`
-> (PreparedStmt mark stmt -> DBM mark sess a)	DBM action that takes a prepared statement
-> DBM mark sess a
Prepare a statement and run a DBM action over it. This gives us the ability to re-use a statement, for example by passing different bind values for each execution. The Typeable constraint is to prevent the leakage of marked things. The type of bound statements should not be exported (and should not be in Typeable) so the bound statement can't leak either.

withBoundStatement

:: (Typeable a, IPrepared stmt s bstmt bo)
=> PreparedStmt mark stmt	prepared statement created by withPreparedStatement
-> [BindA s stmt bo]	bind values
-> (bstmt -> DBM mark s a)	action to run over bound statement
-> DBM mark s a
Applies a prepared statement to bind variables to get a bound statement, which is passed to the provided action. Note that by the time it is passed to the action, the query or command has usually been executed. A bound statement would normally be an instance of `Statement`, so it can be passed to `doQuery` in order to process the result-set, and also an instance of `Command`, so that we can write re-usable DML statements (inserts, updates, deletes). The Typeable constraint is to prevent the leakage of marked things. The type of bound statements should not be exported (and should not be in Typeable) so the bound statement can't leak either.

class ISession sess => Statement stmt sess q | stmt sess -> q

Statement defines the API for query objects i.e. which types can be queries.

Instances

Statement BoundStmt Session Query

Statement PreparedStmtObj Session Query

Statement QueryString Session Query

Statement QueryStringTuned Session Query

Statement StmtBind Session Query

Statement String Session Query

Statement (RefCursor StmtHandle) Session Query

Statement (RefCursor String) Session Query

Statement (NextResultSet mark PreparedStmtObj) Session Query

class ISession sess => Command stmt sess

Command is not a query: command deletes or updates rows, creates/drops tables, or changes database state. executeCommand returns the number of affected rows (or 0 if DDL i.e. not DML).

Instances

Command BoundStmt Session

Command CommandBind Session

Command QueryString Session

Command QueryStringTuned Session

Command StmtBind Session

Command String Session

class ISession sess => EnvInquiry inquirykey sess result | inquirykey sess -> result

Instances

EnvInquiry LastInsertRowid Session Int64

data PreparationA sess stmt

This type is not visible to the end user (cf. ConnectA). It forms a private `communication channel' between Database.Enumerator and a back end.

Why don't we make a user-visible class with a prepare method? Because it means to standardize the preparation method signature across all databases. Some databases need more parameters, some fewer. There may be several statement preparation functions within one database. So, instead of standardizing the signature of the preparation function, we standardize on the _result_ of that function. To be more precise, we standardize on the properties of the result: whatever it is, the eventual prepared statement should be suitable to be passed to bindRun.

class ISession sess => IPrepared stmt sess bound_stmt bo | stmt -> bound_stmt, stmt -> bo

Instances

IPrepared PreparedStmtObj Session BoundStmt BindObj

data BindA sess stmt bo

The binding object (bo) below is very abstract, on purpose. It may be |IO a|, it may be String, it may be a function, etc. The binding object can hold the result of marshalling, or bo can hold the current counter, etc. Different databases do things very differently: compare PostgreSQL and the Stub (which models Oracle).

class ISession sess => DBBind a sess stmt bo | stmt -> bo where

The class DBBind is not used by the end-user. It is used to tie up low-level database access and the enumerator. A database-specific library must provide a set of instances for DBBind. The latter are the dual of DBType.

Methods

bindP :: a -> BindA sess stmt bo

This is really just a wrapper that lets us write lists of heterogenous bind values e.g. [bindP "string", bindP (0::Int), ...]

Instances

DBBind (Maybe a) Session PreparedStmtObj BindObj => DBBind a Session PreparedStmtObj BindObj

DBBind (Maybe CalendarTime) Session PreparedStmtObj BindObj

DBBind (Maybe Double) Session PreparedStmtObj BindObj

DBBind (Maybe Float) Session PreparedStmtObj BindObj

DBBind (Maybe Int) Session PreparedStmtObj BindObj

DBBind (Maybe Int64) Session PreparedStmtObj BindObj

DBBind (Maybe String) Session PreparedStmtObj BindObj

DBBind (Maybe UTCTime) Session PreparedStmtObj BindObj

Show a => DBBind (Maybe a) Session PreparedStmtObj BindObj

DBBind (Out (Maybe Double)) Session PreparedStmtObj BindObj

DBBind (Out (Maybe Int)) Session PreparedStmtObj BindObj

DBBind (Out (Maybe StmtHandle)) Session PreparedStmtObj BindObj

DBBind (Out (Maybe String)) Session PreparedStmtObj BindObj

DBBind (Out (Maybe UTCTime)) Session PreparedStmtObj BindObj

Show a => DBBind (Out (Maybe a)) Session PreparedStmtObj BindObj

DBBind (Out (Maybe a)) Session PreparedStmtObj BindObj => DBBind (Out a) Session PreparedStmtObj BindObj

bindP :: DBBind a sess stmt bo => a -> BindA sess stmt bo

This is really just a wrapper that lets us write lists of heterogenous bind values e.g. [bindP "string", bindP (0::Int), ...]

Iteratees and Cursors

class ISession sess => IQuery q sess b | q -> sess, q -> b where

The class IQuery describes the class of query objects. Each database (that is, each Session object) has its own Query object. We may assume that a Query object includes (at least, conceptually) a (pointer to) a Session object, so a Query object determines the Session object. A back-end provides an instance (or instances) of IQuery. The end user never seens the IQuery class (let alone its methods).

Can a session have several types of query objects? Let's assume that it can: but a statement plus the session uniquely determine the query,

Note that we explicitly use IO monad because we will have to explicitly do FFI.

Methods

currentRowNum :: q -> IO Int

Instances

IQuery Query Session ColumnBuffer

doQuery

:: (Statement stmt sess q, QueryIteratee (DBM mark sess) q i seed b, IQuery q sess b)
=> stmt	query
-> i	iteratee function
-> seed	seed value
-> DBM mark sess seed
The left-fold interface.

class DBType a q b | q -> b

A 'buffer' means a column buffer: a data structure that points to a block of memory allocated for the values of one particular column. Since a query normally fetches a row of several columns, we typically deal with a list of column buffers. Although the column data are typed (e.g., Integer, CalendarDate, etc), column buffers hide that type. Think of the column buffer as Dynamics. The class DBType below describes marshalling functions, to fetch a typed value out of the 'untyped' columnBuffer.

Different DBMS's (that is, different session objects) have, in general, columnBuffers of different types: the type of Column Buffer is specific to a database. So, ISession (m) uniquely determines the buffer type (b)?? Or, actually, a query uniquely determines the buffer.

The class DBType is not used by the end-user. It is used to tie up low-level database access and the enumerator. A database-specific library must provide a set of instances for DBType.

Instances

DBType (Maybe a) Query ColumnBuffer => DBType a Query ColumnBuffer

DBType (Maybe CalendarTime) Query ColumnBuffer

DBType (Maybe Double) Query ColumnBuffer

DBType (Maybe Float) Query ColumnBuffer

DBType (Maybe Int) Query ColumnBuffer

DBType (Maybe Int64) Query ColumnBuffer

DBType (Maybe String) Query ColumnBuffer

DBType (Maybe UTCTime) Query ColumnBuffer

(Show a, Read a) => DBType (Maybe a) Query ColumnBuffer

DBType (RefCursor StmtHandle) Query ColumnBuffer

DBType (RefCursor String) Query ColumnBuffer

type IterResult seedType = Either seedType seedType

IterResult and IterAct give us some type sugar. Without them, the types of iteratee functions become quite unwieldy.

type IterAct m seedType = seedType -> m (IterResult seedType)

currentRowNum :: IQuery q sess b => q -> IO Int

data NextResultSet mark stmt

Constructors

NextResultSet (PreparedStmt mark stmt)

Instances

Statement (NextResultSet mark PreparedStmtObj) Session Query

data RefCursor a

Constructors

RefCursor a

Instances

DBType (RefCursor StmtHandle) Query ColumnBuffer

DBType (RefCursor String) Query ColumnBuffer

Statement (RefCursor StmtHandle) Session Query

Statement (RefCursor String) Session Query

cursorIsEOF :: DBCursor mark (DBM mark s) a -> DBM mark s Bool

cursorIsEOF's return value tells you if there are any more rows or not. If you call cursorNext when there are no more rows, a DBNoData exception is thrown. Cursors are automatically closed and freed when:

the iteratee returns Left a
the query result-set is exhausted.

To make life easier, we've created a withCursor function, which will clean up if an error (exception) occurs, or the code exits early. You can nest them to get interleaving, if you desire:

  withCursor query1 iter1 [] $ \c1 -> do
    withCursor query2 iter2 [] $ \c2 -> do
      r1 <- cursorCurrent c1
      r2 <- cursorCurrent c2
      ...
      return something

cursorCurrent :: DBCursor mark (DBM mark s) a -> DBM mark s a

Returns the results fetched so far, processed by iteratee function.

cursorNext :: DBCursor mark (DBM mark s) a -> DBM mark s (DBCursor mark (DBM mark s) a)

Advance the cursor. Returns the cursor. The return value is usually ignored.

withCursor

:: (Typeable a, Statement stmt sess q, QueryIteratee (DBM mark sess) q i seed b, IQuery q sess b)
=> stmt	query
-> i	iteratee function
-> seed	seed value
-> (DBCursor mark (DBM mark sess) seed -> DBM mark sess a)	action taking cursor parameter
-> DBM mark sess a
Ensures cursor resource is properly tidied up in exceptional cases. Propagates exceptions after closing cursor. The Typeable constraint is to prevent cursors and other marked values (like cursor computations) from escaping.

type Position = Int

Utilities

ifNull

:: Maybe a	nullable value
-> a	value to substitute if first parameter is null i.e. `Nothing`
-> a
Useful utility function, for SQL weenies.

result :: Monad m => IterAct m a

Another useful utility function. Use this to return a value from an iteratee function (the one passed to doQuery). Note that you should probably nearly always use the strict version.

result' :: Monad m => IterAct m a

A strict version. This is recommended unless you have a specific need for laziness, as the lazy version will gobble stack and heap. If you have a large result-set (in the order of 10-100K rows or more), it is likely to exhaust the standard 1M GHC stack. Whether or not result eats memory depends on what x does: if it's a delayed computation then it almost certainly will. This includes consing elements onto a list, and arithmetic operations (counting, summing, etc).

Produced by Haddock version 0.7