NLPTasks

Natural Language Processing


Introducing NLPTasks

What is a challenge?

A challenge is an event where the organisers propose a set of tasks (Information Extraction, Question Answering, Translation, etc.) together with the reference data associated with these tasks. During the challenge, several teams work on these data and implement methods for solving the proposed tasks, then simultaneously submit their predictions. The organisers evaluate and rank the submissions using appropriate metrics. The challenge ends with a workshop where the participants present their methods and the organisers summarise the submissions.

Information Extraction challenges typically propose different tasks that can be individual or combined:

  • Named Entity Recognition (NER).
  • Named Entity Linking / Named Entity Normalization (NEL / NEN).
  • Relation Extraction (RE).
  • Knowledge Graph Extraction (KGE).
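Submissions for tasks such as NER are typically scored by comparing predicted entities against the gold annotations. A minimal sketch of entity-level micro-averaged precision, recall and F1, assuming entities are represented as (start, end, type) tuples (a common but not universal convention; real challenges may use partial-match or relaxed criteria):

```python
def evaluate_ner(gold, predicted):
    """Exact-match entity-level precision/recall/F1.

    Entities are (start, end, type) tuples; a prediction counts as a
    true positive only if offsets and type all match a gold entity.
    """
    gold_set, pred_set = set(gold), set(predicted)
    tp = len(gold_set & pred_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical example: the second prediction has the wrong type,
# so only one of two entities is an exact match.
gold = [(0, 5, "Gene"), (10, 18, "Protein")]
pred = [(0, 5, "Gene"), (10, 18, "Gene")]
p, r, f = evaluate_ner(gold, pred)  # → 0.5, 0.5, 0.5
```

Organisers usually distribute a reference script of this kind alongside the data, so that participants can evaluate themselves during development using the same criteria as the final ranking.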

Challenges are regularly organised and frequently grouped into series (for example: BioNLP-ST, BioASQ, CLEF). Within a series, an organiser may propose several challenges, but often the challenges are proposed by separate organisers. Participants may take part in one, several or all of the challenges in a series. Challenges in a series have a common calendar and share the same final workshop (see Lifecycle below).

 

Challenge lifecycle

There are two stages in the life cycle of a challenge:

  1. Challenge time during which the challenge is open and active. This time is punctuated by a defined sequence of exchanges between the organisers and the participants:
    • Pre-challenge communication: the organisers communicate and promote their challenge by explaining the final objectives and the tasks involved. The aim is to make the challenge attractive and to recruit participants. Duration: a few months.
    • Publication of training and development data: the organisers make training data available, enabling participants to train, fine-tune and perfect their prediction systems. Duration: 1-2 months.
    • Publication of test data: the organisers make the test data available (without answers) and the participants submit their predictions. Duration: 1 week.
    • Evaluation: the organisers evaluate the participants' predictions and publish the results. Duration: 1 week.
    • Workshop: the organisers and participants meet in a workshop to present the methods and results. This stage follows the classic publication process (CFP, submission, review, acceptance, presentation). Duration: several months.

During this stage, the website dedicated to the challenge must present new content on a regular basis, following the calendar of the challenge. The total duration of this stage is eight months on average.

  2. Post-challenge period, when the content of the site is slightly reorganised, leaving the data, evaluation services and leaderboard available. The site includes links to publications from the workshop and any associated journal special issue. In this way, teams specialising in Information Extraction continue to use the data for their research and to evaluate and compare the performance of their systems. Updates during this period are very rare. This period lasts for several years, or even decades.

 

Why a website?

A challenge website allows organisers to pass on essential information about the challenge to participants. This includes: 

  • Task descriptions.
  • Important dates (data publication, submissions, workshop).
  • Task data.
  • Evaluation scripts.
  • Rankings.
  • Context and authors.

Each challenge organised by Bibliome in the past is described on a dedicated site hosted by various providers (mainly Google Sites and Migale). These providers do not systematically ensure continuity of service, so a number of challenge sites are now unusable or even inaccessible. In addition, these sites vary in structure and visual appearance.

Bibliome therefore wishes to centralise, preserve and standardise the challenge sites it has organised on a single, permanent platform. Centralisation will make it easier to maintain the sites and increase their visibility. It will also make it easier to set up sites for future challenges.