Proposal: Add the unordered-containers package and the hashable package to the Haskell Platform

Wed Mar 20 18:00:08 GMT 2013

On 03/20/2013 04:29 PM, Johan Tibell wrote:
> On Wed, Mar 20, 2013 at 6:42 AM, Jan-Willem Maessen
> <jmaessen at alum.mit.edu>wrote:
> 
>> On Wed, Mar 20, 2013 at 6:02 AM, Gábor Lehel <illissius at gmail.com> wrote:
>>
>>> Compatibility issues aside, is there any reason newtypes aren't a good
>>> solution here?
>>>
>>
>> Well, the tricky bit is that there are really two distinct uses of hashing
>> in the field in general and under discussion here in particular.  The first
>> (which is the one that's important for unordered-containers and almost all
>> practical uses of hashing *within* a program) is to provide fast keys for
>> hash tables, where the hash function is backed by equality checks and the
>> like and is thus a question of performance rather than correctness.  For
>> this a reasonably good, but not necessarily cryptographically secure,
>> hashing method is more than sufficient.
>>
> 
> hashable is definitely defined for this use case. Unfortunately there's a
> class of DsS attacks (hash flooding) that work like this:
> 
> 1. The application (e.g. a webserver) stores some user input in a hash
> table (e.g. the HTTP headers received in the request).
> 2. The application uses a weak hash function (i.e. a non-cryptographic hash
> function) in the hash table.
> 3. The user feeds the application a set of keys that all collide, making
> the hash table behave as O(n) or even O(n^2), thereby DoS:ing the server.
> 

AFAICT, the hash function needn't be cryptographically secure (though
that obviously avoids the issue altogether) -- if there is some
determined-at-startup "salt" that's added in to all hashed values then
that should provide good enough protection. Obviously this will mean
that the hashes won't be repeatable across runs of the same program, but
that's usually acceptable for a hash function which isn't used for
content identification(*).

(*) For which you should use a cryptographically secure hash anyway.

> SipHash is one way to address these kinds of attacks. There are other means
> as well. For example, many general DoS protection mechanisms (timeouts, IP
> banning, etc) also work on these kind of attacks.
> 

Timeouts aren't necessarily sufficient -- the application can keep
sending data (e.g. form parameter data) and can cause 100% cpu usage for
a loooooonng time. After that it can just start over.

IP banning can only happen after the problem occurs.

Regards,