NLP Tasks

Natural Language Processing

 

 

 

BioNLP Shared Tasks

The BioNLP Shared Task (BioNLP-ST) series represents a community-wide move in bio-textmining toward fine-grained information extraction (IE). The first event, the BioNLP 2009 shared task (December 2008 to March 2009), attracted wide attention, with 24 teams submitting final results. Its task setup and data have since served as the basis for numerous studies and for published event extraction systems and datasets. The BioNLP Shared Task 2011 (BioNLP-ST'11) is the follow-up to the BioNLP 2009 shared task. While it follows the general outline and goals of the previous task in defining biologically relevant extraction targets and taking a linguistically motivated approach to event representation, the upcoming task will generalize and extend the previous one in three key aspects: text type, domain, and targeted event types. Manually annotated data, in which all annotations are bound to specific expressions in the text, will be provided for training, development, and evaluation of extraction methods, and tools for detailed evaluation of system outputs will be made available. Participation in the task is open to all interested parties.
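
To make "annotations bound to specific expressions in text" concrete, the following is a minimal sketch of one way such a text-bound annotation could be represented and checked. The tab-separated line layout (`ID<TAB>TYPE START END<TAB>TEXT`) is an assumption modeled on the standoff style commonly used for shared-task data. The class and function names and the example sentence are hypothetical and do not come from the task materials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextBound:
    """An annotation anchored to a character span of the source text."""
    ann_id: str
    ann_type: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    text: str   # the surface expression the annotation is bound to


def parse_text_bound(line: str) -> TextBound:
    """Parse one line of the assumed form 'ID<TAB>TYPE START END<TAB>TEXT'."""
    ann_id, span, text = line.rstrip("\n").split("\t")
    ann_type, start, end = span.split(" ")
    return TextBound(ann_id, ann_type, int(start), int(end), text)


def is_bound(ann: TextBound, document: str) -> bool:
    """Check that the offsets really select the annotated expression."""
    return document[ann.start:ann.end] == ann.text


# Made-up sentence and annotations, for illustration only.
document = "STAT3 phosphorylation was induced by IL-6."
protein = parse_text_bound("T1\tProtein 0 5\tSTAT3")
trigger = parse_text_bound("T2\tPhosphorylation 6 21\tphosphorylation")
```

Calling `is_bound(trigger, document)` confirms that the stored offsets point at the annotated expression, which is the property that lets an evaluation tool compare system output against the gold annotations span by span.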


Tasks


Main Tasks

While the BioNLP 2009 shared task relied on the GENIA corpus, which contains PubMed abstracts on transcription factors in human blood cells, the theme of BioNLP-ST 2011 is generalization, pursued in three directions: text types, event types, and subject domains. To that end, various tasks and annotations were created in collaboration among several groups, and the event extraction tasks are arranged in four tracks.

  • [GE] GENIA
  • [EPI] Epigenetics and Post-translational Modifications
  • [ID] Infectious Diseases
  • Bacteria Track:
    • [BB] Biotopes
    • [BI] Interactions

 

Supporting Tasks

BioNLP-ST 2011 also includes three supporting tasks.

  • [CO] Co-reference
  • [REL] Entity relations
  • [REN] Gene renaming

Although they are not event extraction tasks themselves, analysis of the BioNLP 2009 shared task results suggests that co-reference resolution and entity relation detection will play an important role in achieving a breakthrough in event extraction performance. Gene renaming has features similar to co-reference.

 

Resources

 

Task  Description  Data sets  Online evaluation
BB    home         Downloads  development set, test set
BI    home                    development set, test set
REN   home                    development set, test set