iteratee roll gives incorrect results

John Lato jwlato at gmail.com
Tue Mar 15 11:05:49 GMT 2011


Hi,

Thanks very much for reporting this.  I suppose I'm not too surprised that
one of the very few operations in ListLike I don't yet have a test case for
has a bug.  I wasn't ready to include 'roll' in the distribution because I
wasn't sure it was correct, but accidentally committed it anyway and left
it.

Regarding the modification to (>>=), I think I would rather avoid it due to
the increased complexity.  Modifying 'Stream' may be right way to fix this,
but it would be a big rewrite.  It would affect user code too.

There are two other approaches to consider:
1) Change the semantics so that if EOF is encountered, 'roll' will return
all elements up to EOF and then propagate EOF.  This would treat EOF as a
special case and force the caller to deal with it.  I don't like the
behavioral inconsistency, but it would only be a patch-level change.

2) Make 'roll' an enumeratee, a not-quite-fully-generalized signature could
be:
roll :: Int -> Int -> Iteratee [s] m a -> Iteratee s m (Iteratee [s] m a)

The inner iteratee would be feed appropriate chunks, but 'roll' would
consume as much of the stream as necessary.  The behavior would be something
like this:

iter = do
  s1 <- I.joinI $ roll 4 2 I.stream2list
  s2 <- I.stream2list
  return $ show (s1::[[Int]]) ++ " " ++ show (s2::[Int])

main1 = I.enumPureNChunk [1, 2, 3] 2 iter >>= run >>= putStrLn
>> [[1,2,3],[3]] []
main2 = I.enumPureNChunk [1, 2, 3, 4] 2 iter >>= run >>= putStrLn
>> [[1,2,3,4],[3,4]] []
main3 = I.enumPureNChunk [1..6] 2 iter >>= run >>= putStrLn
>> [[1,2,3,4],[3,4]] [5,6]

This is arguably better for producing a sliding-window view of a stream, but
it would completely rule out using roll as a super-peek.

John

On Mon, Mar 14, 2011 at 10:04 AM, Akio Takano <tkn.akio at gmail.com> wrote:

> Hi,
>
> I've found that `roll' from Data.Iteratee.ListLike does not work as
> expected. The following program demonstrates the problem. it prints
> "[[1,2]] [2,3]" while it should print "[[1,2,3]] [2,3]".
>
> import qualified Data.Iteratee.ListLike as I
> import Data.Iteratee.Iteratee
>
> main = I.enumPureNChunk [1, 2, 3] 2 iter >>= run >>= putStrLn
>    where
>        iter = do
>            xs <- I.roll 4 1
>            ys <- I.stream2list
>            return $ show (xs::[[Int]]) ++ " " ++ show  (ys::[Int])
>
> I tried to fix it, but it turned out to be not easy.  I try to explain
> the situation:
>
> 1. When an iteratee moves to the Done state, it must return the
> unconsumed part of the stream. Perhaps this means it must return a
> stream that is a subsequence of the given stream. Though I'm not sure
> how strictly this rule is enforced, at least (>>=) expects its LHS to
> return EOF when it sends EOF to the LHS. Otherwise the EOF in the
> stream gets lost and the resulting iteratee of (>>=) does not
> terminate on EOF.
>
> 2. An iteratee can return some elements or an EOF marker as the
> unconsumed part, but not both at the same time.
>
> Suppose you are evaluating (enumPure1Chunk [1] (roll 2 0) >>= run).
> The iteratee first receives (Chunk [1]) and continues because it has
> not seen enough number of stream elements to return a result. Next it
> receives (EOF Nothing). It should move to the Done state now, but it's
> not clear what stream it should return. If it returns (Chunk [1]) it
> violates "1." above and will not be able to be used with (>>=). If it
> returns (EOF Nothing), that means it consumed the whole stream, which
> is an incorrect result because roll is called with d=0.
>
> In general there seems to be no way for an iteratee to safely look
> ahead to an element beyond the next one, without consuming any
> elements.
>
> I can think of two options to solve or work around the problem.
>
> - Modify (>>=) so that it will allow its LHS to return some elements
> on EOF. In that case it will explicitly feed EOF to its RHS. I
> attached code showing the approach. I'm not sure this is a correct
> thing, especially regarding the smantics and the performance of (>>=).
> Also I haven't confirmed that (>>=) is the only function that depends
> on the EOF->EOF assumption.
>
> - Change the definition of Stream so that they can contain some
> elements and an EOF marker at the same time.  Then an iteratee will be
> able to "put back" any number of elements onto the stream. This will
> require a global rewrite of the library.
>
> Any ideas?
>
> Takano Akio
>
> _______________________________________________
> Iteratee mailing list
> Iteratee at projects.haskell.org
> http://projects.haskell.org/cgi-bin/mailman/listinfo/iteratee
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://projects.haskell.org/pipermail/iteratee/attachments/20110315/05941528/attachment.htm>


More information about the Iteratee mailing list