NLP: the missing framework

Eric Kow eric.kow at gmail.com
Sat Jan 19 10:29:52 GMT 2013


For Haskell NLP libs that use annotated data, I'd very much welcome efforts
to seamlessly make use of data provided by other frameworks such as NLTK or
OpenNLP. If somebody wanted to build this sort of thing, it'd be worth
thinking about prioritising that over some other scheme for getting data.



On 19 January 2013 09:57, Daniël de Kok <me at danieldk.eu> wrote:

> On Jan 19, 2013, at 9:39 AM, Eric Kow <eric.kow at gmail.com> wrote:
> > Just thought you might be interested in Edward Yang's call to arms if
> > you haven't seen it already:
> >
> > http://blog.ezyang.com/2013/01/nlp-the-missing-framework/
> >
> > How can we push things a little bit more in the right direction in the
> > Haskell NLP world? What is the right direction?
>
> Summarized, I think there are currently two major problems:
>
> * Interoperability between existing components, which differ in everything
> from their expectations about tokenization to their syntactic annotations.
>
> * High-quality, annotated data is often not available under a permissive
> license.
>
> Some projects, such as OpenNLP and NLTK, aim to provide what the blog post
> asks for: ready-to-use, pre-trained NLP components. However, the blog
> post doesn't really lay out why these frameworks are not acceptable.
>
> The question is whether the world is better off with yet another unfinished
> framework, this time written in Haskell, rather than focusing on improving
> existing frameworks. If somebody started from scratch, I think I would be
> more interested in a set of interoperable C libraries, since they could be
> integrated fairly easily in any language.
>
> Another approach would be to combine different components by adopting
> Apache UIMA everywhere. However, I doubt that such pipelines would be easy
> to install or maintain.
>
> -- Daniël



-- 
Eric Kow <http://erickow.com>