The details of the Automatic Text Summarization task (Text Summarization Challenge 2, TSC2) are explained below. Additions and updates will be announced on this page, so please check it from time to time.
Participants may take part in one or more of the following tasks. You will be asked later which task(s) you will participate in.
Task A: a single text is summarized. Given the texts to be summarized and the summarization rates (summarization lengths), the participants submit summaries of each text. There will be more than one summarization rate for each text.
Summaries should be in plain text. A summarization rate is the ratio between the original text and its summary, based on the number of characters. The rates are given to the participants as a maximum number of characters, and they may vary from text to text. If a submitted summary has more characters than this maximum, only the characters from the beginning of the summary up to the length specified by the rate are used for the evaluation. Please note that a carriage return is not counted as a character. We will first check that the submitted results are indeed in plain text, and then evaluate them.

Task B: multiple texts are summarized. Given a set of texts, the participants produce summaries of it in plain text. The information that was used to produce the document set, such as queries, is given to the participants, as are the summarization lengths. There will be more than one summarization length for each set of texts.
Summaries should be in plain text. Summarization lengths are given for each set as a maximum number of characters. If a submitted summary has more characters than this maximum, only the characters from the beginning of the summary up to the specified length are used for the evaluation. Please note that a carriage return is not counted as a character. We will first check that the submitted results are indeed in plain text, and then evaluate them.
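Since the truncation rule is the same for both tasks, a minimal Python sketch may clarify it (the function name and details are our own illustration, not part of the official evaluation tools):

==Python==
def truncate_summary(summary: str, max_chars: int) -> str:
    # Keep characters from the beginning of the summary up to max_chars,
    # where carriage returns and newlines are not counted as characters.
    kept = []
    counted = 0
    for ch in summary:
        if ch in "\r\n":          # line breaks do not count toward the limit
            kept.append(ch)
        elif counted < max_chars:
            kept.append(ch)
            counted += 1
        else:                     # limit reached: ignore the rest
            break
    return "".join(kept)

For example, truncate_summary(text, 150) keeps the first 150 non-newline characters of text, which is what the evaluation uses for an over-length submission.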
The same evaluation methods (subjective evaluation and degree of revision) are used for both task A and task B. Both methods are intrinsic: the evaluation is done by comparison with human-produced summaries. Strictly speaking, they are not `formal' evaluations. However, comparisons are made with human-produced summaries (free and important-part), and the results will be reported to the participants and at the NTCIR workshop. (Note: important-part summaries are used only in the evaluation for task A.)
- Evaluation Method 1: subjective evaluation
First, the three human judges are provided with the original text and four kinds of summaries: the two types of human-produced summaries (free and important-part), the system summary, and a summary produced by a baseline system. In this subjective evaluation, the judges examine how much of the content of the original text each summary covers and how readable it is, and rank the summaries from one to four. The same evaluation was conducted for task A-2 in TSC1.

- Evaluation Method 2: degree of revision to system results
The judges read the original texts and revise the system summaries in terms of content and readability; after that, the degree of revision is measured. Revisions are made by three operations (insertion, deletion, and replacement). The degree of revision is computed from the number of revision operations and the number of revised characters.
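The exact scoring formula is not given here, but the two quantities it is based on can be sketched as follows (our own illustration, using Python's standard difflib to align the system summary with its revised version):

==Python==
import difflib

def revision_counts(system_summary: str, revised_summary: str):
    # Align the two strings and count each non-matching span as one
    # revision operation (insertion, deletion, or replacement),
    # together with the number of characters it touches.
    operations = 0
    revised_chars = 0
    matcher = difflib.SequenceMatcher(a=system_summary, b=revised_summary)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        operations += 1
        revised_chars += max(i2 - i1, j2 - j1)
    return operations, revised_chars

A summary that needs fewer and smaller revisions scores better under this method.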
TSC provides each participant in the task with their own ID number and the following data.
==BNF==
file := topic*
topic := <TOPIC>topic-contents</TOPIC>
topic-contents := topic-id keywords description ir-result sum-length*
topic-id := <TOPIC-ID>number</TOPIC-ID> {topic query ID}
keywords := <KEYWORDS>keyword*</KEYWORDS> {list of keywords for the topic query}
keyword := <KEYWORD>EUC string</KEYWORD>
description := <DESCRIPTION>EUC string</DESCRIPTION> {short description of the topic query}
ir-result := <IR-RESULT>doc-id*</IR-RESULT>
doc-id := <DOCNO>number</DOCNO> {document ID retrieved for the topic query; the target texts for Task B}
sum-length := <SUMLENGTH-C>number</SUMLENGTH-C> {maximum number of characters for the summary; a carriage return is not counted as a character here}

Example:
<TOPIC>
<TOPIC-ID>0001</TOPIC-ID>
<KEYWORDS>
<KEYWORD>自動</KEYWORD>
<KEYWORD>要約</KEYWORD>
</KEYWORDS>
<DESCRIPTION>自動要約研究の新しい試み</DESCRIPTION>
<IR-RESULT>
<DOCNO>980101002</DOCNO>
<DOCNO>950101008</DOCNO>
...
</IR-RESULT>
<SUMLENGTH-C>150</SUMLENGTH-C>
<SUMLENGTH-C>300</SUMLENGTH-C>
</TOPIC>
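For illustration, a topic file in this format can be read with a few regular expressions, as in the sketch below (it assumes a single well-formed EUC-JP file; the function and field names are our own):

==Python==
import re

def parse_topics(path: str):
    # Read the whole topic file (the data is encoded in EUC-JP).
    with open(path, encoding="euc_jp") as f:
        text = f.read()
    topics = []
    for block in re.findall(r"<TOPIC>(.*?)</TOPIC>", text, re.S):
        topics.append({
            "topic_id": re.search(r"<TOPIC-ID>(.*?)</TOPIC-ID>", block).group(1).strip(),
            "keywords": re.findall(r"<KEYWORD>(.*?)</KEYWORD>", block, re.S),
            "description": re.search(r"<DESCRIPTION>(.*?)</DESCRIPTION>", block, re.S).group(1).strip(),
            "doc_ids": re.findall(r"<DOCNO>(.*?)</DOCNO>", block),
            "sum_lengths": [int(n) for n in re.findall(r"<SUMLENGTH-C>(\d+)</SUMLENGTH-C>", block)],
        })
    return topics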
The participants should submit their results in the following format.
==BNF==
file := system-id topic*
system-id := <SYSTEM-ID>number</SYSTEM-ID> {participant ID provided by TSC}
topic := <TOPIC>topic-id sum-result*</TOPIC>
topic-id := <TOPIC-ID>number</TOPIC-ID>
sum-result := <SUM-RESULT>sum-length sum-text</SUM-RESULT>
sum-length := <SUMLENGTH-C>number</SUMLENGTH-C>
sum-text := <SUMTEXT>EUC string</SUMTEXT> {summary in plain text whose number of characters is less than or equal to the maximum number specified by TSC}

Example:
<SYSTEM-ID>02010001</SYSTEM-ID>
<TOPIC>
<TOPIC-ID>0001</TOPIC-ID>
<SUM-RESULT>
<SUMLENGTH-C>150</SUMLENGTH-C>
<SUMTEXT>TSCという,テキスト自動要約の新しい試みが始まり,現在参加者を募っている.TSCが開催されることにより,日本におけるテキスト自動要約技術の一層の発展が期待されている.</SUMTEXT>
</SUM-RESULT>
<SUM-RESULT>
...
</SUM-RESULT>
</TOPIC>
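Conversely, a results file in this format can be produced as in the sketch below (the layout of the results argument, a mapping from topic IDs to lists of (maximum length, summary) pairs, is our own choice for the example):

==Python==
def write_submission(path: str, system_id: str, results) -> None:
    # results maps topic IDs to lists of (max_length, summary) pairs.
    with open(path, "w", encoding="euc_jp") as f:
        f.write(f"<SYSTEM-ID>{system_id}</SYSTEM-ID>\n")
        for topic_id, pairs in results.items():
            f.write("<TOPIC>\n")
            f.write(f"<TOPIC-ID>{topic_id}</TOPIC-ID>\n")
            for max_len, summary in pairs:
                # Stay within the limit; newlines are not counted as characters.
                assert sum(ch not in "\r\n" for ch in summary) <= max_len
                f.write("<SUM-RESULT>\n")
                f.write(f"<SUMLENGTH-C>{max_len}</SUMLENGTH-C>\n")
                f.write(f"<SUMTEXT>{summary}</SUMTEXT>\n")
                f.write("</SUM-RESULT>\n")
            f.write("</TOPIC>\n")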
We will use newspaper articles from the Mainichi Newspaper Database (1998 and 1999 versions) in TSC2. If you are not sure how to obtain them, please contact us.
2001.10 Dryrun, results due
2001.11-12 Evaluation
2002.1 Analysis
2002.2 Round-table discussion
2002.4 Formal run, results due
2002.5-6 Evaluation
2002.7 Analysis
2002.8 Round-table discussion
2002.10 NTCIR Workshop 3

Please note that the application for participation in NTCIR-3 is due September 30, 2001; however, we will continue to accept applications until the end of February 2002. Even if you do not take part in the dryrun, you can still join the formal run.
Because of some paperwork delays at the NTCIR office, we have changed the dryrun schedule as follows. As a result of this delay, we may also change the dates for next year given above.
Dryrun:
Nov 15-20: we ask whether the participants would like to take part in the dryrun
Nov 26: dryrun tasks revealed
Nov 30: result submission due
January 2002: report of the evaluation
Takahiro FUKUSIMA (Otemon Gakuin University)
Hidetsugu NANBA (The Japan Society for the Promotion of Science)
Manabu OKUMURA (Tokyo Institute of Technology)

Contact: TSC2 organizing committee (tsc-adm@lr.pi.titech.ac.jp)