Scientific report: Creating a Gold Standard for Sentence Clustering in Multi-Document Summarization
Pages: 9
Document information:
Sentence Clustering is often used as a first step in Multi-Document Summarization (MDS) to find redundant information. All the same, there is no gold standard available. This paper describes the creation of a gold standard for sentence clustering from DUC document sets. The procedure of building the gold standard and the guidelines which were given to six human judges are described. The most widely used and promising evaluation measures are presented and discussed.
Content extracted from the document:

Creating a Gold Standard for Sentence Clustering in Multi-Document Summarization

Johanna Geiss
University of Cambridge
Computer Laboratory
15 JJ Thomson Avenue
Cambridge, CB3 0FD, UK
johanna.geiss@cl.cam.ac.uk

Abstract

Sentence Clustering is often used as a first step in Multi-Document Summarization (MDS) to find redundant information. All the same, there is no gold standard available. This paper describes the creation of a gold standard for sentence clustering from DUC document sets. The procedure of building the gold standard and the guidelines which were given to six human judges are described. The most widely used and promising evaluation measures are presented and discussed.

1 Introduction

The increasing amount of (online) information and the growing number of news websites lead to a debilitating amount of redundant information. Different newswires publish different reports about the same event, resulting in information overlap. Multi-Document Summarization (MDS) can help to reduce the number of documents a user has to read to keep informed. In contrast to single-document summarization, information overlap is one of the biggest challenges to MDS systems. While repeated information is good evidence of importance, this information should be included in a summary only once in order to avoid a repetitive summary. Sentence clustering has therefore often been used as an early step in MDS (Hatzivassiloglou et al., 2001; Marcu and Gerber, 2001; Radev et al., 2000). In sentence clustering, semantically similar sentences are grouped together. Sentences within a cluster overlap in information, but they do not have to be identical in meaning. In contrast to paraphrases, sentences in a cluster do not have to cover the same amount of information. One sentence represents one cluster in the summary. Either a sentence from the cluster is selected (Aliguliyev, 2006) or a new sentence is regenerated from all/some sentences in a cluster (Barzilay and McKeown, 2005). Usually the quality of the sentence clusters is only evaluated indirectly by judging the quality of the generated summary. There is still no standard evaluation method for summarization and no consensus in the summarization community on how to evaluate a summary. The methods at hand are either superficial or time and resource consuming and not easily repeatable. Another argument against indirect evaluation of clustering is that troubleshooting becomes more difficult. If a poor summary was created, it is not clear which component, e.g. information extraction through clustering or summary generation (using for example language regeneration), is responsible for the lack of quality.

However, there is no gold standard for sentence clustering available to which the output of a clustering system can be compared. Another challenge is the evaluation of sentence clusters. There are a lot of evaluation methods available. Each of them focuses on different properties of a set of clusters. We will discuss and evaluate the most widely used and most promising measures. In this paper the main focus is on the development of a gold standard for sentence clustering using DUC clusters. The guidelines and rules that were given to the human annotators are described and the inter-judge agreement is evaluated.

2 Related Work

Sentence Clustering is used for different applications in NLP. Radev et al. (2000) use it in their MDS system MEAD. The centroids of the clusters are used to create a summary. Only the summary is evaluated, not the sentence clusters. The same applies to Wang et al. (2008). They use symmetric matrix factorisation to group similar sentences together and test their system on the DUC2005 and DUC2006 data sets, but do not evaluate the clusterings. However, Zha (2002) created a gold ...
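
The role that the Introduction assigns to sentence clustering in MDS (group overlapping sentences, then let one sentence stand for each cluster) can be illustrated with a minimal sketch. This is not the procedure of the paper, which concerns a gold standard built by human judges. The similarity measure (Jaccard overlap of word sets), the greedy single-pass strategy, the threshold of 0.5, and all function names are assumptions chosen for illustration only; word overlap is a crude stand-in for semantic similarity.

```python
import re


def tokens(sentence):
    """Lowercased word set of a sentence (assumed proxy for its content)."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard overlap of two token sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_sentences(sentences, threshold=0.5):
    """Greedy single-pass clustering (illustrative, not from the paper).

    A sentence joins the first cluster whose seed sentence it overlaps with
    by at least `threshold`; otherwise it starts a new cluster.
    """
    clusters = []  # lists of sentence indices; the first index is the seed
    for i, sent in enumerate(sentences):
        toks = tokens(sent)
        for cluster in clusters:
            if jaccard(toks, tokens(sentences[cluster[0]])) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def representatives(sentences, clusters):
    """Select one sentence per cluster (here simply the seed sentence)."""
    return [sentences[c[0]] for c in clusters]


sentences = [
    "The earthquake struck the city on Monday morning.",
    "On Monday morning the earthquake struck the city.",
    "Rescue teams arrived from neighbouring countries.",
]
clusters = cluster_sentences(sentences)
summary = representatives(sentences, clusters)
print(clusters)
print(summary)
```

The first two sentences carry the same words and fall into one cluster, so the repeated information would appear in the summary only once. A gold standard of the kind the paper develops is what would allow the clusters themselves, rather than only the resulting summary, to be evaluated.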