[vector] #71: Request to add total size information to Vector

vector vector at projects.haskell.org
Thu Dec 22 04:10:58 GMT 2011


#71: Request to add total size information to Vector
------------------------+---------------------------------------------------
Reporter:  sanketr      |       Owner:     
    Type:  enhancement  |      Status:  new
Priority:  major        |   Milestone:     
 Version:               |    Keywords:     
------------------------+---------------------------------------------------
 It would be very useful to have a field in Vector that stores the total
 number of bytes taken up by the vector's elements. This is especially
 useful for vectors that can be extended (for example, via Storable
 instances) to store pointers to other data, such as CStringLen.

 The current assumption in the vector package design seems to be that each
 element takes constant storage. In that case, the total byte count can be
 readily calculated by multiplying the length of the vector by the sizeOf
 of the element type.
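
 A minimal sketch of that fixed-size calculation, assuming the vector
 package's Data.Vector.Storable module. The names byteSize and sampleBytes
 are made up for illustration and are not part of the vector API.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Int (Int32)
import qualified Data.Vector.Storable as VS
import Foreign.Storable (Storable, sizeOf)

-- Hypothetical helper: payload bytes for a vector whose elements all
-- occupy the same number of bytes (length * sizeOf of the element type).
byteSize :: forall a. Storable a => VS.Vector a -> Int
byteSize v = VS.length v * sizeOf (undefined :: a)

-- Three Int32 elements at 4 bytes each.
sampleBytes :: Int
sampleBytes = byteSize (VS.fromList [1, 2, 3 :: Int32])
```

 This only counts the bytes of the elements themselves. For an element that
 holds a pointer, it says nothing about the size of the data pointed to.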

 But given a vector of (len, Ptr CChar), if I want to find out how much
 storage I need to allocate to convert it to, say, a ByteString, I can't
 calculate it using the above method. Instead, I must either define a
 function that sums over the len field of the vector (inefficient, since it
 traverses memory again for large vectors), or return the total at build
 time by writing a custom build function that builds the vector and keeps
 track of the running total. An example for a Storable vector is below; it
 builds the vector and keeps track of the total length:

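 import Data.Int (Int32)
 import Data.Vector.Storable (Vector, unsafeFreeze)
 import Data.Vector.Storable.Mutable (new, unsafeWrite)
 import Foreign.C.Types (CChar)
 import Foreign.Ptr (Ptr)
 import System.IO.Unsafe (unsafePerformIO)

 -- Assumes a custom Storable instance is defined for (Int32, Ptr CChar).
 create :: [(Int32, Ptr CChar)] -> (Vector (Int32, Ptr CChar), Int32)
 create xs = unsafePerformIO $ do
     v <- new (Prelude.length xs)
     total <- fill v 0 0 xs
     frozen <- unsafeFreeze v
     return (frozen, total)
   where
     -- Write each element at index n, accumulating the sum of the len fields.
     fill _ _ s [] = return s
     fill v n s (y:ys) = unsafeWrite v n y >> fill v (n + 1) (s + size y) ys
     size (len, _) = len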
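
 For comparison, here is a rough sketch of the other option mentioned
 above: summing the len fields by traversing an already-built vector. It
 also shows one possible shape for the custom Storable instance that the
 code above assumes. The Chunk type, its field layout, totalBytes and
 sample are all hypothetical names chosen for illustration; a record is
 used instead of a tuple so the instance can be defined without orphans.

```haskell
import Data.Int (Int32)
import qualified Data.Vector.Storable as VS
import Foreign.C.Types (CChar)
import Foreign.Ptr (Ptr, nullPtr)
import Foreign.Storable (Storable (..))

-- Hypothetical element type standing in for the (len, Ptr CChar) pair.
data Chunk = Chunk
  { chunkLen :: !Int32
  , chunkPtr :: !(Ptr CChar)
  }

-- Assumed layout: len at offset 0, pointer at offset sizeOf (Ptr),
-- so the pointer field stays aligned on both 32-bit and 64-bit targets.
instance Storable Chunk where
  sizeOf _ = 2 * sizeOf (undefined :: Ptr CChar)
  alignment _ = alignment (undefined :: Ptr CChar)
  peek p = do
    len <- peekByteOff p 0
    ptr <- peekByteOff p (sizeOf (undefined :: Ptr CChar))
    return (Chunk len ptr)
  poke p (Chunk len ptr) = do
    pokeByteOff p 0 len
    pokeByteOff p (sizeOf (undefined :: Ptr CChar)) ptr

-- Second pass over the vector's memory: sums the len fields only and
-- never dereferences the stored pointers.
totalBytes :: VS.Vector Chunk -> Int
totalBytes = VS.foldl' (\acc c -> acc + fromIntegral (chunkLen c)) 0

-- Two chunks with lengths 3 and 5; the pointers are placeholders.
sample :: VS.Vector Chunk
sample = VS.fromList [Chunk 3 nullPtr, Chunk 5 nullPtr]
```

 With these definitions, totalBytes sample evaluates to 8. The cost is the
 extra traversal that the build-time approach avoids.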

 This is not an isolated case. Vectors storing pointers to variable-length
 strings are quite common, especially when dealing with real-world data.
 Since speed is one of the great advantages of using unboxed/storable
 vectors, it would be nice to have total byte size information available in
 Vector. This helps with transforming Vectors into other types such as
 ByteString. Right now, it is awkward to compute this information alongside
 a Vector and pass it around.
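
 To make the request concrete, here is one way the information could be
 carried today with a user-side wrapper. This is only a sketch of the idea,
 not a proposed API: Sized, fromListSized and appendSized are hypothetical
 names, and the caller supplies the function that gives each element's
 payload size.

```haskell
import Data.Int (Int32)
import Data.List (foldl')
import qualified Data.Vector.Storable as VS
import Foreign.Storable (Storable)

-- Hypothetical wrapper: a storable vector plus a cached payload total.
data Sized a = Sized
  { sizedVector :: !(VS.Vector a)
  , sizedBytes  :: !Int
  }

-- Build from a list, summing the caller-supplied payload size per element.
-- This walks the input list for the sum, but never re-reads the vector.
fromListSized :: Storable a => (a -> Int) -> [a] -> Sized a
fromListSized payload xs =
  Sized (VS.fromList xs) (foldl' (\acc x -> acc + payload x) 0 xs)

-- Concatenation keeps the cached total up to date without a traversal.
appendSized :: Storable a => Sized a -> Sized a -> Sized a
appendSized (Sized v m) (Sized w n) = Sized (v VS.++ w) (m + n)

-- Example with Int32 elements standing in for length fields.
sampleSized :: Sized Int32
sampleSized =
  appendSized (fromListSized fromIntegral [3, 5]) (fromListSized fromIntegral [4])
```

 Here sizedBytes sampleSized is 12. Every operation that changes the
 vector has to be wrapped like appendSized to keep the total in sync, which
 is the awkwardness described above.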

-- 
Ticket URL: <http://trac.haskell.org/vector/ticket/71>
vector <http://trac.haskell.org/vector>
Package vector

