Question: What kind of external resources can we use?
Answer: You are free to use any external information that you find useful, including unannotated Medline abstracts. For this latter resource, however, you must select abstracts published after the year 2000 to avoid overlap with the test data.

Question: We won't use all the linguistic information you provide. Would that be OK?
Answer: Participants are free to use or ignore the linguistic information associated with the data. Moreover, it will be very interesting to compare the results according to the type of linguistic knowledge that is exploited.

Question: Where are the negative examples?
Answer: The way negative examples are generated for the IE task is left to the participants. A straightforward approach is the Closed-World Assumption: if no interaction is specified between biological objects A and B, then they do not interact (see the sketch at the end of this page).

Question: We have read the description of the challenge and examined the data set. Our question is how the evaluation will be done. Only one data set is available on the site; should we use cross-validation, or will a test set also be published later?
Answer: The results will be evaluated by the participants and by the challenge organizers on a test data set, which will be published on April 1st. The evaluation criteria will be based on recall and precision (see the note at the end of this page). Further details are coming soon.

Question: Systems have to be able to determine the genic_interaction relations, but will agents and targets be supplied, or will these also need to be determined by the entered systems?
Answer: Agents and targets are redundant with respect to the genic_interaction relations, but have been included for readability. They will not be provided in the test data. All potential candidates for agents and targets have already been provided in the named-entity dictionary, and the information extraction task consists of selecting, within the sentences, the correct pairs of agents and targets among the candidates and linking them properly.

Question: We think there are some errors in the training data available from the website. The ID of each sentence should be unique, but genic_interaction_data.txt contains at least three sentences with the same ID number.
Answer: Yes, the duplicate IDs came from sentence segmentation errors. The data sets are now properly segmented.
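
As a rough illustration of the Closed-World Assumption mentioned above, the following sketch enumerates all ordered pairs of candidate entities in a sentence and labels as negative every pair that is not among the annotated genic interactions. The data structures (`sentence_entities`, `positive_pairs`) and the function name are hypothetical and chosen only for illustration; they do not correspond to the actual challenge file formats.

```python
from itertools import permutations

def generate_examples(sentence_entities, positive_pairs):
    """Label every ordered (agent, target) pair of candidate entities.

    sentence_entities: list of entity identifiers found in one sentence
                       (hypothetical structure, not the challenge format).
    positive_pairs: set of (agent, target) tuples annotated as genic
                    interactions for that sentence.
    Returns a list of ((agent, target), label) tuples, where label is 1
    for an annotated interaction and 0 otherwise (Closed-World Assumption).
    """
    examples = []
    for agent, target in permutations(sentence_entities, 2):
        label = 1 if (agent, target) in positive_pairs else 0
        examples.append(((agent, target), label))
    return examples

# Example usage with made-up identifiers:
entities = ["GerE", "sigK", "SpoIIID"]
positives = {("GerE", "sigK")}
for pair, label in generate_examples(entities, positives):
    print(pair, label)
```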
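
For reference, the recall and precision criteria mentioned above can be computed over agent/target pairs as sketched below. This is only an assumption about how predicted pairs might be compared with the reference annotation, not the organizers' official scorer.

```python
def precision_recall(predicted_pairs, gold_pairs):
    """Compute precision and recall over sets of (agent, target) pairs.

    predicted_pairs: set of pairs extracted by a system.
    gold_pairs: set of annotated reference pairs.
    """
    true_positives = len(predicted_pairs & gold_pairs)
    precision = true_positives / len(predicted_pairs) if predicted_pairs else 0.0
    recall = true_positives / len(gold_pairs) if gold_pairs else 0.0
    return precision, recall

# Example: 2 of 3 predictions are correct, and 2 of 4 gold pairs are found.
p, r = precision_recall({("A", "B"), ("B", "C"), ("C", "D")},
                        {("A", "B"), ("B", "C"), ("D", "E"), ("E", "F")})
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.67 recall=0.50
```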