Please cite this paper when using any of the material on this page.
The annotation guidelines (with relevant examples) described in the paper "Weakly Supervised Learning for Hedge Classification in Scientific Literature" (Medlock and Briscoe 2007) can be downloaded here.
The hedge classification data is provided in three formats:
- Tokenized (.tok extension)
- Tokenized & Stemmed (.stm extension)
- Tokenized & Stemmed + Bigrams (.bgm extension)
The following files are available in all formats:
- spec_seeds / nspec_seeds: Seed data for the spec and nspec classes
- spec_test / nspec_test: Test data for the spec and nspec classes
- pool: Unlabeled pool
The following files are additionally available, in the .stm and .bgm formats only:
- spec_train / nspec_train: Training data sets automatically induced using the probabilistic acquisition model of Medlock and Briscoe (2007)
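The availability rules above can be summarised in a short Python sketch that lists which files to expect for a given format. The naming pattern `<name>.<extension>` (for example `spec_seeds.tok`) is an assumption based on the extensions listed above, not something this page confirms, and the function name `expected_files` is purely illustrative.

```python
# Sketch only: file names are ASSUMED to follow "<name>.<extension>",
# e.g. "spec_seeds.tok". Check the extracted archives for the real names.

# Formats described on this page, keyed by extension.
FORMATS = {
    "tok": "Tokenized",
    "stm": "Tokenized & Stemmed",
    "bgm": "Tokenized & Stemmed + Bigrams",
}

# Files available in all formats.
COMMON_FILES = ["spec_seeds", "nspec_seeds", "spec_test", "nspec_test", "pool"]

# Induced training sets, available in the .stm and .bgm formats.
TRAIN_FILES = ["spec_train", "nspec_train"]
TRAIN_FORMATS = {"stm", "bgm"}


def expected_files(ext):
    """Return the file names expected for one format extension (hypothetical helper)."""
    if ext not in FORMATS:
        raise ValueError(f"unknown format extension: {ext!r}")
    names = list(COMMON_FILES)
    if ext in TRAIN_FORMATS:
        names.extend(TRAIN_FILES)
    return [f"{name}.{ext}" for name in names]


if __name__ == "__main__":
    for ext, description in FORMATS.items():
        print(f"{description} (.{ext}):")
        for filename in expected_files(ext):
            print(f"  {filename}")
```

A listing like this can serve as a quick completeness check after unpacking an archive, provided the assumed naming pattern matches the actual files.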
Downloads (gzipped tar files):
For related resources, see the Flyslip project page from the University of Cambridge.