The details of the Automatic Text Summarization task (Text Summarization Challenge 2, TSC2) are explained below. Additions and updates will be announced on this page, so please check it from time to time.
Participants may take part in one or more of the following tasks. You will be asked later which task(s) you will participate in.
Task A: a single text is summarized. Given the texts to be summarized and the summarization rates (summarization lengths), the participants submit summaries of each text. There will be more than one summarization rate for each text.
Summaries should be in plain text. A summarization rate is the ratio between the original text and its summary, based on the number of characters. The rates are given to the participants as a maximum number of characters, and they may vary from text to text. If a submitted summary has more characters than this maximum, only the characters from the beginning of the summary up to the length specified by the rate are used for the evaluation. Please note that a carriage return is not counted as a character. We will first check that the submitted results are indeed in plain text, and then evaluate them.

Task B: multiple texts are summarized. Given a set of texts, the participants produce summaries of it in plain text. The information that was used to produce the document set, such as queries, is given to the participants, as are the summarization lengths. There will be more than one summarization length for each set of texts.
Summaries should be in plain text. Summarization lengths are given for each set as a maximum number of characters. If a submitted summary has more characters than this maximum, only the characters from the beginning of the summary up to the specified length are used for the evaluation. Please note that a carriage return is not counted as a character. We will first check that the submitted results are indeed in plain text, and then evaluate them.
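Since the truncation rule is the same for both tasks, a minimal Python sketch may clarify it (the function name and details are our own illustration, not part of the official evaluation tools):

==Python==
def truncate_summary(summary: str, max_chars: int) -> str:
    # Keep characters from the beginning of the summary up to max_chars,
    # where carriage returns and newlines are not counted as characters.
    kept = []
    counted = 0
    for ch in summary:
        if ch in "\r\n":          # line breaks do not count toward the limit
            kept.append(ch)
        elif counted < max_chars:
            kept.append(ch)
            counted += 1
        else:                     # limit reached: ignore the rest
            break
    return "".join(kept)

For example, truncate_summary(text, 150) keeps the first 150 non-newline characters of text, which is what the evaluation uses for an over-length submission.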
The same evaluation methods (subjective evaluation and degree of revision) are used for both task A and task B. Both methods are intrinsic: the evaluation is done by comparison with human-produced summaries. Strictly speaking, they are not `formal' evaluations. However, comparisons are made with human-produced summaries (free and important-part), and the results will be reported to the participants and at the NTCIR workshop. (Note: important-part summaries are used only in the evaluation for task A.)
- Evaluation Method 1: subjective evaluation
First, the three human judges are provided with the original text and four kinds of summaries: the two types of human-produced summaries (free and important-part), the system summary, and a summary produced by a baseline system. In this subjective evaluation, the judges examine how much of the content of the original text each summary covers and how readable it is, and rank the summaries from one to four. The same evaluation was conducted for task A-2 in TSC1.

- Evaluation Method 2: degree of revision to system results
The judges read the original texts and revise the system summaries in terms of content and readability; after that, the degree of revision is measured. Revisions are made by three operations (insertion, deletion, and replacement). The degree of revision is computed from the number of revision operations and the number of revised characters.
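The exact scoring formula is not given here, but the two quantities it is based on can be sketched as follows (our own illustration, using Python's standard difflib to align the system summary with its revised version):

==Python==
import difflib

def revision_counts(system_summary: str, revised_summary: str):
    # Align the two strings and count each non-matching span as one
    # revision operation (insertion, deletion, or replacement),
    # together with the number of characters it touches.
    operations = 0
    revised_chars = 0
    matcher = difflib.SequenceMatcher(a=system_summary, b=revised_summary)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        operations += 1
        revised_chars += max(i2 - i1, j2 - j1)
    return operations, revised_chars

A summary that needs fewer and smaller revisions scores better under this method.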
TSC provides each participant in the task with their own ID number and the following data.
==BNF==
file := topic*
topic := <TOPIC>topic-contents</TOPIC>
topic-contents := topic-id keywords description ir-result sum-length*
topic-id := <TOPIC-ID>number</TOPIC-ID> {topic query ID}
keywords := <KEYWORDS>keyword*</KEYWORDS> {list of keywords for the topic query}
keyword := <KEYWORD>EUC string</KEYWORD>
description := <DESCRIPTION>EUC string</DESCRIPTION> {short description of the topic query}
ir-result := <IR-RESULT>doc-id*</IR-RESULT>
doc-id := <DOCNO>number</DOCNO> {document ID retrieved for the topic query; the target texts for Task B}
sum-length := <SUMLENGTH-C>number</SUMLENGTH-C> {maximum number of characters for the summary; a carriage return is not counted as a character here}

Example:
<TOPIC>
<TOPIC-ID>0001</TOPIC-ID>
<KEYWORDS>
<KEYWORD>自動</KEYWORD>
<KEYWORD>要約</KEYWORD>
</KEYWORDS>
<DESCRIPTION>自動要約研究の新しい試み</DESCRIPTION>
<IR-RESULT>
<DOCNO>980101002</DOCNO>
<DOCNO>950101008</DOCNO>
...
</IR-RESULT>
<SUMLENGTH-C>150</SUMLENGTH-C>
<SUMLENGTH-C>300</SUMLENGTH-C>
</TOPIC>
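For illustration, a topic file in this format can be read with a few regular expressions, as in the sketch below (it assumes a single well-formed EUC-JP file; the function and field names are our own):

==Python==
import re

def parse_topics(path: str):
    # Read the whole topic file (the data is encoded in EUC-JP).
    with open(path, encoding="euc_jp") as f:
        text = f.read()
    topics = []
    for block in re.findall(r"<TOPIC>(.*?)</TOPIC>", text, re.S):
        topics.append({
            "topic_id": re.search(r"<TOPIC-ID>(.*?)</TOPIC-ID>", block).group(1).strip(),
            "keywords": re.findall(r"<KEYWORD>(.*?)</KEYWORD>", block, re.S),
            "description": re.search(r"<DESCRIPTION>(.*?)</DESCRIPTION>", block, re.S).group(1).strip(),
            "doc_ids": re.findall(r"<DOCNO>(.*?)</DOCNO>", block),
            "sum_lengths": [int(n) for n in re.findall(r"<SUMLENGTH-C>(\d+)</SUMLENGTH-C>", block)],
        })
    return topics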
The participants should submit their results in the following format.
==BNF==
file := system-id topic*
system-id := <SYSTEM-ID>number</SYSTEM-ID> {participant ID provided by TSC}
topic := <TOPIC>topic-id sum-result*</TOPIC>
topic-id := <TOPIC-ID>number</TOPIC-ID>
sum-result := <SUM-RESULT>sum-length sum-text</SUM-RESULT>
sum-length := <SUMLENGTH-C>number</SUMLENGTH-C>
sum-text := <SUMTEXT>EUC string</SUMTEXT> {summary in plain text whose number of characters is less than or equal to the maximum number specified by TSC}

Example:
<SYSTEM-ID>02010001</SYSTEM-ID>
<TOPIC>
<TOPIC-ID>0001</TOPIC-ID>
<SUM-RESULT>
<SUMLENGTH-C>150</SUMLENGTH-C>
<SUMTEXT>TSCという,テキスト自動要約の新しい試みが始まり,現在参加者を募っている.TSCが開催されることにより,日本におけるテキスト自動要約技術の一層の発展が期待されている.</SUMTEXT>
</SUM-RESULT>
<SUM-RESULT>
...
</SUM-RESULT>
</TOPIC>
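Conversely, a results file in this format can be produced as in the sketch below (the layout of the results argument, a mapping from topic IDs to lists of (maximum length, summary) pairs, is our own choice for the example):

==Python==
def write_submission(path: str, system_id: str, results) -> None:
    # results maps topic IDs to lists of (max_length, summary) pairs.
    with open(path, "w", encoding="euc_jp") as f:
        f.write(f"<SYSTEM-ID>{system_id}</SYSTEM-ID>\n")
        for topic_id, pairs in results.items():
            f.write("<TOPIC>\n")
            f.write(f"<TOPIC-ID>{topic_id}</TOPIC-ID>\n")
            for max_len, summary in pairs:
                # Stay within the limit; newlines are not counted as characters.
                assert sum(ch not in "\r\n" for ch in summary) <= max_len
                f.write("<SUM-RESULT>\n")
                f.write(f"<SUMLENGTH-C>{max_len}</SUMLENGTH-C>\n")
                f.write(f"<SUMTEXT>{summary}</SUMTEXT>\n")
                f.write("</SUM-RESULT>\n")
            f.write("</TOPIC>\n")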
We will use newspaper articles from the Mainichi Newspaper Database (1998 and 1999 versions) in TSC2. If you are not sure how to obtain them, please contact us.
2001.10 Dryrun, results due
2001.11-12 Evaluation
2002.1 Analysis
2002.2 Round-table discussion
2002.4 Formal run, results due
2002.5-6 Evaluation
2002.7 Analysis
2002.8 Round-table discussion
2002.10 NTCIR Workshop 3

Please note that the application for participation in NTCIR-3 is due September 30, 2001; however, we will continue to accept applications until the end of February 2002. Even if you do not take part in the dryrun, you can still join the formal run.
Because of some paperwork delays at the NTCIR office, we have changed the dryrun schedule as follows. As a result of this delay, we may also change the dates for next year given above.
Dryrun:
Nov 15-20: we ask whether the participants would like to take part in the dryrun
Nov 26: dryrun tasks revealed
Nov 30: result submission due
January 2002: report of the evaluation
Takahiro FUKUSIMA (Otemon Gakuin University)
Hidetsugu NANBA (The Japan Society for the Promotion of Science)
Manabu OKUMURA (Tokyo Institute of Technology)

Contact: TSC2 organizing committee (tsc-adm@lr.pi.titech.ac.jp)