iteratee Iteratee breaks monadic laws
Michael Baikov
manpacket at gmail.com
Wed Feb 15 11:14:47 GMT 2012
Hello John.
I see. That's a very interesting second law, and it will be part of my
next bug report :)
"All iteratees must enter the done state (i.e. call the first
continuation) upon receipt of EOF."
This must be extended to something like "must enter the done state or
emit an exception to
perform seek or anything else to change this EOF state"
Suppose you have a file and you want to do seeking. And you want to
seek only for specific marks.
So first you read whole file, remembering all marks location and then
after getting EOF just seek for first such mark and do some magical
stream processing.
I know, this is a bad example, but only for simplicity sake.
First two options dies after 10 loops, as expected, third one waits
forever because of throwErr in joinI.
import Data.Iteratee.Iteratee
import Control.Monad
import Data.Iteratee.IO
import Control.Monad.IO.Class
import qualified Data.Iteratee as I
enumFileForever :: String -> Enumerator String IO a
enumFileForever fname iter = enumFileRandom 10 fname iter >>= enumEof
>>= enumFileForever fname
consumer :: Iteratee String IO ()
consumer = liftI (go 0)
where
go c (Chunk s) = liftIO (putStr s) >> liftI (go c)
go 10 e = error "I did 10 loops, let's die"
go n e = I.seek 0 >> liftI (go (n+1))
main = do
-- enumFileForever "foo" consumer >>= run
-- enumFileForever "foo" (join $ (mapChunks id) $ consumer) >>= run
-- enumFileForever "foo" (joinI $ (mapChunks id) $ consumer) >>= run
PS: There is something wrong with iteratee maillist against. Feb is missing.
http://projects.haskell.org/pipermail/iteratee
On Wed, Feb 15, 2012 at 6:12 PM, John Lato <jwlato at gmail.com> wrote:
> Hi Michael,
>
> Yes, this is unfortunate. Iteratees that attempt to inject data into
> a stream are unsound, for exactly this reason. Currently, I am aware
> of two laws that well-formed iteratees should follow:
>
> 1. Any data returned to a stream (via the second parameter of
> "idone") should be a subset of data received from the stream.
> 2. All iteratees must enter the done state (i.e. call the first
> continuation) upon receipt of EOF.
>
> The first is (in part) because of this problem; the second (together
> with the first) guarantees a result upon termination.
>
> The first law may in fact be too strict, at least from the perspective
> of monad laws. It's probably possible to write a safe version of
> "inject" provided that you first take some stream data, e.g.
>
> inject :: Int -> Iteratee [Int] IO ()
> inject x = liftI step
> where
> step (Chunk xs) = idone () (Chunk (x:xs))
> step str = idone () str
>
> Of course this won't actually inject values unless you apply it to an
> enumerator:
>
> main = do
> (enumPure1Chunk [] test') >>= run >>= print
> (enumPure1Chunk [] test'') >>= run >>= print
>
> Both of these do print "[0,1,2]". However if your iteratee starts
> injecting data into a stream it will alter the stream length, which
> will cause issues with things like "I.take" and "countConsumed".
> Which is another reason for the first law.
>
> The monadic law violation is a consequence of the current design, in
> which remainder stream data is carried along with a final result. In
> iteratee-0.9, I'm changing the definition of iteratee to something
> closer to (haven't quite decided yet):
>
> newtype Iteratee s m a = Iteratee{ runIter :: forall r.
> (a -> r) ->
> ((Stream s -> m (Iteratee s m a, Stream s)) -> r) ->
> (Iteratee s m a -> SomeException -> r) ->
> (forall b. m b -> (b -> (Iteratee s m a)) -> r) ->
> r}
>
> This should fix the inject problem, although injecting elements will
> still be unsound WRT I.take and similar.
>
> John L.
>
> On Wed, Feb 15, 2012 at 4:47 AM, Michael Baikov <manpacket at gmail.com> wrote:
>> There are three known monadic laws which every monad must obey:
>>
>> "Left identity": return a >>= f ≡ f a
>> "Right identity": m >>= return ≡ m
>> "Associativity": (m >>= f) >>= g ≡ m >>= (\x -> f x >>= g)
>>
>> Iteratee breaks the third one: "do { a; b; c }" gets desugared to this:
>> "a >> (b >> c)" plain usage of "a >> b >> c" is the same as "(a >> b) >> c",
>> since (>>) is defined with infixl.
>>
>> Here is a simple example which works with do notation, but fails to
>> work with (>>)
>>
>> import qualified Data.Iteratee as I
>> import Data.Iteratee.Iteratee
>>
>> inject :: Int -> Iteratee [Int] IO ()
>> inject x = idone () (Chunk [x])
>>
>> test' :: Iteratee [Int] IO [Int]
>> test' = do
>> inject 2
>> inject 1
>> inject 0
>> I.stream2list
>>
>> test'' :: Iteratee [Int] IO [Int]
>> test'' = inject 2 >> inject 1 >> inject 0 >> I.stream2list
>>
>> main = do
>> run test' >>= print -- [0, 1, 2]
>> run test'' >>= print -- [2]
More information about the Iteratee
mailing list