Empirical methods for exploiting parallel texts
Main author: | Melamed, I. Dan |
---|---|
Format: | Book |
Language: | English |
Published: | Cambridge, Mass. [etc.]: MIT Press, 2001 |
Subjects: | Maschinelle Übersetzung; Linguistik; Modell |
Online access: | Table of contents |
Description: | Partly also a university dissertation, 1998 |
Physical description: | X, 195 pages, diagrams |
ISBN: | 0262133806 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV013635546 | ||
003 | DE-604 | ||
005 | 20140508 | ||
007 | t | ||
008 | 010315s2001 d||| m||| 00||| eng d | ||
020 | |a 0262133806 |9 0-262-13380-6 | ||
035 | |a (OCoLC)231868232 | ||
035 | |a (DE-599)BVBBV013635546 | ||
040 | |a DE-604 |b ger |e rakwb | ||
041 | 0 | |a eng | |
049 | |a DE-739 |a DE-12 |a DE-384 |a DE-19 | ||
050 | 0 | |a P309 | |
084 | |a ES 960 |0 (DE-625)27938: |2 rvk | ||
084 | |a ST 306 |0 (DE-625)143654: |2 rvk | ||
100 | 1 | |a Melamed, I. Dan |e Verfasser |4 aut | |
245 | 1 | 0 | |a Empirical methods for exploiting parallel texts |c I. Dan Melamed |
264 | 1 | |a Cambridge, Mass. [u.a.] |b MIT Press |c 2001 | |
300 | |a X, 195 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
500 | |a Teilw. zugl.: Univ., Diss., 1998 | ||
650 | 0 | 7 | |a Maschinelle Übersetzung |0 (DE-588)4003966-3 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Linguistik |0 (DE-588)4074250-7 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Modell |0 (DE-588)4039798-1 |2 gnd |9 rswk-swf |
655 | 7 | |0 (DE-588)4113937-9 |a Hochschulschrift |2 gnd-content | |
689 | 0 | 0 | |a Maschinelle Übersetzung |0 (DE-588)4003966-3 |D s |
689 | 0 | 1 | |a Linguistik |0 (DE-588)4074250-7 |D s |
689 | 0 | 2 | |a Modell |0 (DE-588)4039798-1 |D s |
689 | 0 | |5 DE-604 | |
856 | 4 | 2 | |m HEBIS Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=009316120&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-009316120 |
Record in the search index
_version_ | 1804128445084467200 |
---|---|
adam_text | Empirical Methods for Exploiting Parallel Texts
I. Dan Melamed
The MIT Press
Cambridge, Massachusetts
London, England
Contents
Acknowledgments
1 Introduction
I TRANSLATIONAL EQUIVALENCE AMONG WORD TOKENS
2 A Geometric Approach to Mapping Bitext Correspondence
2.1 Introduction
2.2 Bitext Geometry
2.3 Previous Work
2.4 The Smooth Injective Map Recognizer (SIMR)
2.4.1 Overview
2.4.2 Point Generation
2.4.3 Noise Filter
2.4.4 Point Selection
2.4.5 Reduction of the Search Space
2.4.6 Enhancements
2.5 Parameter Optimization
2.6 Evaluation
2.7 Implementation of SIMR for New Language Pairs
2.7.1 Step 1: Construct Matching Predicate
2.7.2 Step 2: Construct Axis Generators
2.7.3 Step 3: Reoptimize Parameters
2.8 Conclusion
3 Application: Alignment
3.1 Introduction
3.2 Correspondence is Richer Than Alignment
3.3 The Geometric Segment Alignment (GSA) Algorithm
3.4 Evaluation
3.5 Conclusion
4 Application: Automatic Detection of Omissions in Translations
4.1 Introduction
4.2 The Basic Method
4.3 Noise-Free Bitext Maps
4.4 A Translator's Tool
4.5 Noisy Bitext Maps
4.6 ADOMIT
4.7 Simulation of Omissions
4.8 Evaluation
4.9 Conclusion
II THE TYPE-TOKEN INTERFACE
5 Models of Co-occurrence
5.1 Introduction
5.2 Relevant Regions of the Bitext Space
5.3 Co-occurrence Counting Methods
5.4 Language-Specific Filters
5.5 Conclusion
6 Manual Annotation of Translational Equivalence
6.1 Introduction
6.2 The Gold-Standard Bitext
6.3 The Blinker Annotation Tool
6.4 Methods for Increasing Reliability
6.5 Inter-Annotator Agreement
6.6 Conclusion
III TRANSLATIONAL EQUIVALENCE AMONG WORD TYPES
7 Word-to-Word Models of Translational Equivalence
7.1 Introduction
7.2 Translation Model Decomposition
7.3 The One-to-One Assumption
7.4 Previous Work
7.4.1 Non-Probabilistic Translation Lexicons
7.4.2 Re-estimated Sequence-to-Sequence Translation Models
7.4.3 Re-estimated Bag-to-Bag Translation Models
7.5 Parameter Estimation
7.5.1 Method A: The Competitive Linking Algorithm
7.5.2 Method B: Improved Estimation Using an Explicit Noise Model
7.5.3 Method C: Improved Estimation Using Pre-Existing Word Classes
7.6 Effects of Sparse Data
7.7 Evaluation
7.7.1 Evaluation at the Token Level
7.7.2 Evaluation at the Type Level
7.8 Application to MT Lexicon Development
7.9 Conclusion
8 Automatic Discovery of Non-Compositional Compounds
8.1 Introduction
8.2 Objective Functions
8.3 Search
8.4 Predictive Value Functions
8.5 Iteration
8.6 Credit Estimation
8.7 Single-Best Translation
8.8 Experiments
8.9 Related Work
8.10 Conclusion
9 Sense-to-Sense Models of Translational Equivalence
9.1 Introduction
9.2 Previous Work
9.3 Formulation of the Problem
9.4 Noise Filters
9.5 The SenseClusters Algorithm
9.6 An Application
9.7 Experiments
9.7.1 Quantitative Results
9.7.2 Qualitative Results
9.8 Conclusion
10 Summary and Outlook
A Annotation Style Guide for the Blinker Project
A.1 General Guidelines
A.1.1 Omissions in Translation
A.1.2 Phrasal Correspondence
A.2 Detailed Guidelines
A.2.1 Idioms and Near Idioms
A.2.2 Referring Expressions
A.2.3 Verbs
A.2.4 Prepositions
A.2.5 Determiners
A.2.6 Punctuation
Notes
References
Index
|
any_adam_object | 1 |
author | Melamed, I. Dan |
author_facet | Melamed, I. Dan |
author_role | aut |
author_sort | Melamed, I. Dan |
author_variant | i d m id idm |
building | Verbundindex |
bvnumber | BV013635546 |
callnumber-first | P - Language and Literature |
callnumber-label | P309 |
callnumber-raw | P309 |
callnumber-search | P309 |
callnumber-sort | P 3309 |
callnumber-subject | P - Philology and Linguistics |
classification_rvk | ES 960 ST 306 |
ctrlnum | (OCoLC)231868232 (DE-599)BVBBV013635546 |
discipline | Sprachwissenschaft Informatik Literaturwissenschaft |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01593nam a2200409 c 4500</leader><controlfield tag="001">BV013635546</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20140508 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">010315s2001 d||| m||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">0262133806</subfield><subfield code="9">0-262-13380-6</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)231868232</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV013635546</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-739</subfield><subfield code="a">DE-12</subfield><subfield code="a">DE-384</subfield><subfield code="a">DE-19</subfield></datafield><datafield tag="050" ind1=" " ind2="0"><subfield code="a">P309</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ES 960</subfield><subfield code="0">(DE-625)27938:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 306</subfield><subfield code="0">(DE-625)143654:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Melamed, I. Dan</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Empirical methods for exploiting parallel texts</subfield><subfield code="c">I. 
Dan Melamed</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Cambridge, Mass. [u.a.]</subfield><subfield code="b">MIT Press</subfield><subfield code="c">2001</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">X, 195 S.</subfield><subfield code="b">graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Teilw. zugl.: Univ., Diss., 1998</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Maschinelle Übersetzung</subfield><subfield code="0">(DE-588)4003966-3</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Linguistik</subfield><subfield code="0">(DE-588)4074250-7</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Modell</subfield><subfield code="0">(DE-588)4039798-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4113937-9</subfield><subfield code="a">Hochschulschrift</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Maschinelle Übersetzung</subfield><subfield code="0">(DE-588)4003966-3</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Linguistik</subfield><subfield code="0">(DE-588)4074250-7</subfield><subfield 
code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Modell</subfield><subfield code="0">(DE-588)4039798-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">HEBIS Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=009316120&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-009316120</subfield></datafield></record></collection> |
genre | (DE-588)4113937-9 Hochschulschrift gnd-content |
genre_facet | Hochschulschrift |
id | DE-604.BV013635546 |
illustrated | Illustrated |
indexdate | 2024-07-09T18:49:21Z |
institution | BVB |
isbn | 0262133806 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-009316120 |
oclc_num | 231868232 |
open_access_boolean | |
owner | DE-739 DE-12 DE-384 DE-19 DE-BY-UBM |
owner_facet | DE-739 DE-12 DE-384 DE-19 DE-BY-UBM |
physical | X, 195 S. graph. Darst. |
publishDate | 2001 |
publishDateSearch | 2001 |
publishDateSort | 2001 |
publisher | MIT Press |
record_format | marc |
spelling | Melamed, I. Dan Verfasser aut Empirical methods for exploiting parallel texts I. Dan Melamed Cambridge, Mass. [u.a.] MIT Press 2001 X, 195 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Teilw. zugl.: Univ., Diss., 1998 Maschinelle Übersetzung (DE-588)4003966-3 gnd rswk-swf Linguistik (DE-588)4074250-7 gnd rswk-swf Modell (DE-588)4039798-1 gnd rswk-swf (DE-588)4113937-9 Hochschulschrift gnd-content Maschinelle Übersetzung (DE-588)4003966-3 s Linguistik (DE-588)4074250-7 s Modell (DE-588)4039798-1 s DE-604 HEBIS Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=009316120&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Melamed, I. Dan Empirical methods for exploiting parallel texts Maschinelle Übersetzung (DE-588)4003966-3 gnd Linguistik (DE-588)4074250-7 gnd Modell (DE-588)4039798-1 gnd |
subject_GND | (DE-588)4003966-3 (DE-588)4074250-7 (DE-588)4039798-1 (DE-588)4113937-9 |
title | Empirical methods for exploiting parallel texts |
title_auth | Empirical methods for exploiting parallel texts |
title_exact_search | Empirical methods for exploiting parallel texts |
title_full | Empirical methods for exploiting parallel texts I. Dan Melamed |
title_fullStr | Empirical methods for exploiting parallel texts I. Dan Melamed |
title_full_unstemmed | Empirical methods for exploiting parallel texts I. Dan Melamed |
title_short | Empirical methods for exploiting parallel texts |
title_sort | empirical methods for exploiting parallel texts |
topic | Maschinelle Übersetzung (DE-588)4003966-3 gnd Linguistik (DE-588)4074250-7 gnd Modell (DE-588)4039798-1 gnd |
topic_facet | Maschinelle Übersetzung Linguistik Modell Hochschulschrift |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=009316120&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT melamedidan empiricalmethodsforexploitingparalleltexts |