ANNOUNCE: brillig 0.3 - not quite the Brill tagger

Grzegorz Chrupała pitekus at gmail.com
Wed Sep 7 17:07:08 BST 2011


2011/9/7 Eric Kow <eric.kow at gmail.com>:
> Hi!
>
> On Wed, Sep 07, 2011 at 13:32:17 +0200, Grzegorz Chrupała wrote:
>> I think I don't get it. How would you use a discriminatively trained
>> tagger like sequor in combination with the Brill tagger?
>
> So, at the moment, I don't know what a "discriminatively trained" tagger
> is and wouldn't know how to start trying to answer such a question...
>
> But hopefully I won't have to, because I was actually just saying
> something incredibly simple and non-technical, that the brillig
> executable could just provide a thin wrapper around different kinds of
> taggers (as alternatives to each other, completely disjoint).
> You know, files go in, tags come out... but this was before I looked
> at the training file format and understood that this is what sequor
> provides.  Oh well, this probably makes brillig just a bit redundant in
> infrastructure terms. :-)

Right, that makes sense. Even though sequor is a standalone generic
sequence tagger, it would still be worthwhile to use brillig to provide
a standard POS-tagger interface.

>> For what it's worth, I just trained Sequor (using several spelling
>> features as encoded in the data/mlcomp2.features template) on the
>> initial 90% of the Brown corpus, and tested on the final 10%, and got
>> an accuracy of 96.2%. Training takes several hours, but tagging runs
>> at more than 3000 words/second.
>
> Cool!
>
> PS. can we have a small release with '-rtsopts'?

It should be on Hackage now.

Best,
--
Grzegorz


