Stemming & Hello!

Jakub Waszczuk waszczuk.kuba at gmail.com
Sat Dec 29 18:30:41 GMT 2012


On Dec 28, 2012, at 4:59 AM, Mark Wotton <mwotton at gmail.com> wrote:
> oh, also: I've been playing around with a suggester using levenshtein
> distance to many possible target strings. This isn't the fastest thing
> in the world: is there a better algorithm in the literature?

The problem of finding the dictionary entry nearest to a query word can be
reduced to the shortest-path problem [1].  I'm currently working on a library
that implements this method: http://hackage.haskell.org/package/adict.
There's still a lot of room for improvement (on-the-fly DAWG construction,
a more intuitive cost function definition), but perhaps the current version
of the library will be sufficient for your needs.
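
To give a rough idea of the reduction, here is a minimal sketch. It is not
the adict API: all names below (Trie, fromList, nearest) are made up for
illustration, it uses a plain trie rather than a DAWG, and it assumes unit
edit costs. Each graph node is a pair (remaining query, dictionary prefix),
each edit operation is a weighted edge, and Dijkstra's algorithm finds the
cheapest path to a node where the query is consumed and the prefix is a word.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- A plain trie (hypothetical type; a DAWG would also share suffixes).
data Trie = Trie { isWord :: Bool, children :: M.Map Char Trie }

emptyTrie :: Trie
emptyTrie = Trie False M.empty

insert :: String -> Trie -> Trie
insert []     t = t { isWord = True }
insert (c:cs) t = t { children = M.insert c (insert cs child) (children t) }
  where child = M.findWithDefault emptyTrie c (children t)

fromList :: [String] -> Trie
fromList = foldr insert emptyTrie

-- | Nearest dictionary word and its unit-cost edit distance,
-- or Nothing if the dictionary is empty.
nearest :: Trie -> String -> Maybe (String, Int)
nearest root query = go S.empty (M.singleton (0, query, "") root)
  where
    -- The queue is a Map keyed by (cost, remaining query, reversed prefix),
    -- so the minimum key is always the cheapest unexplored state.
    go seen queue = case M.minViewWithKey queue of
      Nothing -> Nothing
      Just (((cost, rest, path), node), queue')
        | null rest && isWord node   -> Just (reverse path, cost)
        | S.member (rest, path) seen -> go seen queue'
        | otherwise ->
            go (S.insert (rest, path) seen)
               (foldr enqueue queue' (edges cost rest path node))

    enqueue (k, n) = M.insert k n

    edges cost rest path node =
      -- drop the next query character, staying at the same trie node
      [ ((cost + 1, cs, path), node) | (_:cs) <- [rest] ] ++
      -- follow a trie edge without consuming a query character
      [ ((cost + 1, rest, c : path), child)
      | (c, child) <- M.toList (children node) ] ++
      -- consume both: cost 0 on a match, 1 on a substitution
      [ ((cost + fromEnum (c /= q), cs, c : path), child)
      | (q:cs) <- [rest], (c, child) <- M.toList (children node) ]
```

For example, with fromList ["stem", "stemming", "hello"] as the dictionary,
nearest applied to "helo" gives Just ("hello", 1). Because the search expands
states in order of increasing cost, it only explores the part of the
dictionary that lies close to the query.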

Cheers,
Kuba

[1] Staworko, S. and Chomicki, J. (2006) Validity-Sensitive Querying of
XML Databases. EDBT Workshops, pp. 164–177.

