Haskell Platform proposal: Add the vector package

Thu Jul 5 10:42:09 BST 2012

On 04/07/2012 16:33, Roman Leshchinskiy wrote:
> Simon Marlow wrote:
>> We should be moving towards safe APIs by default, and separating out
>> unsafe APIs into separate modules.
>
> I completely agree with separating out unsafe APIs but I don't understand
> why modules are the right granularity for this, especially given Haskell's
> rather rudimentary module system. As I said, the module-based approach
> results in a significant maintenance burden for vector.

The choice to use the module boundary was made for pragmatic reasons:
it reduces complexity in the implementation, and it also makes things
much simpler from the programmer's point of view.  The programmer has a 
clear idea where the boundary lies: in a Safe module, they can only 
import other Safe/Trustworthy modules.  The Safe subset is a collection 
of modules, not some slice of the contents of all modules.  The Haddock 
docs for a module only have to say in one place whether the module is 
considered safe or not.
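
As a minimal sketch of that rule, assuming a hypothetical client
program (the name sortedSample is made up for illustration), a module
compiled with the Safe pragma can import Data.List from base, but an
import of System.IO.Unsafe would be rejected:

```haskell
{-# LANGUAGE Safe #-}
-- Hypothetical client program, for illustration only.

-- Data.List from base can be imported from Safe code.
import Data.List (sort)

-- Uncommenting the next import should make GHC reject this module,
-- because System.IO.Unsafe is neither Safe nor Trustworthy:
-- import System.IO.Unsafe (unsafePerformIO)

-- A small pure value built only from safe imports.
sortedSample :: [Int]
sortedSample = sort [3, 1, 2]

main :: IO ()
main = print sortedSample
```

The check applies to the import list as a whole, so the boundary is
visible at the top of the file rather than at each use site.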

This is certainly a debatable part of the design, and we went back and 
forth on it once or twice already.  Conceivably it could change in the 
future.  But I don't think this is the right place to discuss the design 
of SafeHaskell, and at least in our experience the current design seems 
to work quite well.

Could you say something more about the maintenance burden?  I imagined 
that you would just separate the unsafe (in the SafeHaskell sense) 
operations into separate modules.
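
A rough sketch of what such a split could look like, under stated
assumptions: the modules Vec, Vec.Internal and Vec.Unsafe and the
toy Vec type are hypothetical and are not vector's actual layout.
The three sections below are three separate files, and the example
uses the array package in place of vector:

```haskell
-- File: Vec/Internal.hs (hypothetical; defines every operation)
module Vec.Internal (Vec, fromList, index, unsafeIndex) where

import Data.Array (Array, bounds, listArray)
import Data.Array.Base (unsafeAt)

newtype Vec a = Vec (Array Int a)

fromList :: [a] -> Vec a
fromList xs = Vec (listArray (0, length xs - 1) xs)

-- Bounds-checked lookup: Nothing for an out-of-range index.
index :: Vec a -> Int -> Maybe a
index v@(Vec arr) i
  | i >= 0 && i <= snd (bounds arr) = Just (unsafeIndex v i)
  | otherwise = Nothing

-- No bounds check: the caller must guarantee the index is in range.
unsafeIndex :: Vec a -> Int -> a
unsafeIndex (Vec arr) i = unsafeAt arr i

-- File: Vec.hs (the safe subset, importable from Safe modules)
{-# LANGUAGE Trustworthy #-}
module Vec (Vec, fromList, index) where

import Vec.Internal (Vec, fromList, index)

-- File: Vec/Unsafe.hs (the remainder, not importable from Safe modules)
{-# LANGUAGE Unsafe #-}
module Vec.Unsafe (unsafeIndex) where

import Vec.Internal (unsafeIndex)
```

In this layout the two public modules are only re-export lists, so
the cost of the split is keeping those lists in sync with the
internal module.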

>> That is what SafeHaskell is about:
>> it's not an obscure feature that is only used by things like "Try
>> Haskell", the boundary between safety and unsafety is something we
>> should all be thinking about.  In that sense, we are all users of
>> SafeHaskell.  We should think of it as "good style" and best practice to
>> separate safe APIs from unsafe ones.
>
> At the risk of being blunt, I do find SafeHaskell's notion of safety
> somewhat obscure. In vector, all unsafe functions have the string "unsafe"
> in their name. Here are two examples of functions that don't do bounds
> checking:
>
> unsafeIndex :: Vector a -> Int -> a
> unsafeRead :: IOVector a -> Int -> IO a
>
> Unless I'm mistaken, SafeHaskell considers the first one unsafe and the
> second one safe. Personally, I find vector's current notion of safety much
> more useful and wouldn't want to weaken it.

SafeHaskell's notion of safety is very clear: it is essentially just 
type safety and referential transparency.  It would be impossible to 
have a clear notion of safety that considers some IO operations unsafe 
and others safe: e.g. do you consider reading a file to be unsafe?  Some 
applications would, and others wouldn't.  Sticking strictly to 
clearly-defined properties like type safety (and a couple of other 
things, including module abstraction) as the definition of safety is the 
only sensible thing you can do.

But this is beside the point.  Since unsafeRead is considered safe by 
SafeHaskell, you have the option of either putting it in the safe API or 
the unsafe API; it's up to you.

>> I would argue against adding any unsafe APIs to the Haskell Platform
>> that aren't in a .Unsafe module.  (to what extent that applies to vector
>> I don't know, so it may be that I'm causing trouble for the proposal
>> here).
>
> To avoid confusion, let's first agree on what an "unsafe API" is. For
> vector, "unsafe" basically means no bounds checking and my understanding
> is that this is quite different from SafeHaskell's notion of safety. As I
> said, such functions have the string "unsafe" in their name. Additionally,
> Data.Vector.Storable is entirely unsafe even in the SafeHaskell sense (as
> in, it unsafePerformIOs essentially arbitrary code) due to the design of
> the Storable class - there are no safe bits there at all. It still uses
> "unsafe" to distinguish between functions that do bounds checking and
> those that don't. What would be the benefit of moving functions like
> unsafeIndex into a separate module (and would it be called
> Unsafe.unsafeIndex then? or would it be Unsafe.index?)? Would you advocate
> renaming Data.Vector.Storable to Data.Vector.Storable.Unsafe?

Since it's the entire module in this case, I think it would be fine to 
just remark in the documentation for the module that the API is unsafe, 
and briefly explain why.

> Also, you seem to be arguing for both using SafeHaskell and having a
> special naming convention for modules with unsafe stuff. Wouldn't one of
> those be sufficient?

Ok, let me relax that a little. I don't care nearly as much about the 
.Unsafe naming convention as I do about separating the unsafe parts of 
the API from the safe parts.  When we have a mostly-type-safe API like 
vector, it is a shame if we can't have the compiler easily check that 
clients are using only the safe subset.

Cheers,
	Simon