Scientific report: Topics in Statistical Machine Translation
Kevin Knight
Information Sciences Institute
University of Southern California
knight@isi.edu

Philipp Koehn
School of Informatics
University of Edinburgh
pkoehn@inf.ed.ac.uk

1 Introduction

In the past, we presented tutorials called “Introduction to Statistical Machine Translation”, aimed at people who know little or nothing about the field and want to get acquainted with the basic concepts. This tutorial, by contrast, goes more deeply into selected topics of intense current interest. We aim at two types of participants:

1. People who understand the basic idea of statistical machine translation and want to get a survey of hot-topic current research, in terms that they can understand.
2. People associated with statistical machine translation work, who have not had time to study the most current topics in depth.

We fill the gap between the introductory tutorials that have gone before and the detailed scientific papers presented at ACL sessions.

2 Tutorial Outline

Below is our tutorial structure. We showcase the intuitions behind the algorithms and give examples of how they work on sample data. Our selection of topics focuses on techniques that deliver proven gains in translation accuracy, and we supply empirical results from the literature.

1. QUICK REVIEW (15 minutes)
   • Phrase-based and syntax-based MT.

2. ALGORITHMS (45 minutes)
   • Efficient decoding for phrase-based and syntax-based MT (cube pruning, forward/outside costs).
   • Minimum-Bayes risk.
   • System combination.

3. SCALING TO LARGE DATA (30 minutes)
   • Phrase table pruning, storage, suffix arrays.
   • Large language models (distributed LMs, noisy LMs).

4. NEW MODELS (1 hour and 10 minutes)
   • New methods for word alignment (beyond GIZA++).
   • Factored models.
   • Maximum entropy models for rule selection and re-ordering.
   • Acquisition of syntactic translation rules.
   • Syntax-based language models and target-language dependencies.
   • Lattices for encoding source-language uncertainties.

5. LEARNING TECHNIQUES (20 minutes)
   • Discriminative training (perceptron, MIRA).

Tutorial Abstracts of ACL-IJCNLP 2009, page 2, Suntec, Singapore, 2 August 2009. © 2009 ACL and AFNLP