Michael Collins (home)
Software and Data Sets
- I've now released the source code for the parser described in my PhD
thesis under the GNU General Public License. Follow this
link for a tar file that contains the code.
- You may also want to try
Dan Bikel's
parser. It has a number of great features, including training of the
original (WSJ treebank) models, training of Chinese and Arabic models,
and n-best parsing. In addition, Dan's code has a flag that
outputs the "events" files used by my original parser. If
you need to retrain my parser, this is probably an excellent place
to look.