Syntax-based collocation extraction
Main author: | Seretan, Violeta |
---|---|
Format: | Book |
Language: | English |
Published: | Dordrecht [etc.]: Springer, 2011 |
Series: | Text, speech and language technology, vol. 44 |
Subjects: | Computerlinguistik (computational linguistics); Maschinelle Übersetzung (machine translation) |
Online access: | Content description; Table of contents |
Description: | XI, 217 pages, graphical illustrations |
ISBN: | 9789400701335 9789400701342 |
Internal format
MARC
LEADER | 00000nam a2200000 cb4500 | ||
---|---|---|---|
001 | BV037315197 | ||
003 | DE-604 | ||
005 | 20150910 | ||
007 | t | ||
008 | 110401s2011 gw d||| |||| 00||| eng d | ||
015 | |a 10,N34 |2 dnb | ||
016 | 7 | |a 1005844895 |2 DE-101 | |
020 | |a 9789400701335 |c GB. : EUR 106.95 (freier Pr.), sfr 143.50 (freier Pr.) |9 978-94-0070133-5 | ||
020 | |a 9789400701342 |9 978-94-007-0134-2 | ||
024 | 3 | |a 9789400701335 | |
028 | 5 | 2 | |a 80018175 |
035 | |a (OCoLC)729948813 | ||
035 | |a (DE-599)DNB1005844895 | ||
040 | |a DE-604 |b ger |e rakddb | ||
041 | 0 | |a eng | |
044 | |a gw |c XA-DE-BE | ||
049 | |a DE-19 | ||
084 | |a ET 750 |0 (DE-625)28029: |2 rvk | ||
084 | |a 004 |2 sdnb | ||
100 | 1 | |a Seretan, Violeta |e Verfasser |4 aut | |
245 | 1 | 0 | |a Syntax-based collocation extraction |c by Violeta Seretan |
264 | 1 | |a Dordrecht [u.a.] |b Springer |c 2011 | |
300 | |a XI, 217 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 1 | |a Text, speech and language technology |v 44 | |
650 | 0 | 7 | |a Maschinelle Übersetzung |0 (DE-588)4003966-3 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Computerlinguistik |0 (DE-588)4035843-4 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Computerlinguistik |0 (DE-588)4035843-4 |D s |
689 | 0 | 1 | |a Maschinelle Übersetzung |0 (DE-588)4003966-3 |D s |
689 | 0 | |5 DE-604 | |
830 | 0 | |a Text, speech and language technology |v 44 |w (DE-604)BV011123931 |9 44 | |
856 | 4 | 2 | |m X:MVB |q text/html |u http://deposit.dnb.de/cgi-bin/dokserv?id=3524300&prov=M&dok_var=1&dok_ext=htm |3 Inhaltstext |
856 | 4 | 2 | |m HEBIS Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=022469526&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
943 | 1 | |a oai:aleph.bib-bvb.de:BVB01-022469526 |
Record in the search index
_version_ | 1805095748705976320 |
---|---|
adam_text |
Syntax-Based Collocation Extraction
by Violeta Seretan
University of Geneva, Switzerland
Springer
Contents
1 Introduction
1.1 Collocations and Their Relevance for NLP
1.2 The Need for Syntax-Based Collocation Extraction
1.3 Aims
1.4 Chapters Outline
2 On Collocations
2.1 Introduction
2.2 A Survey of Definitions
2.2.1 Statistical Approaches
2.2.2 Linguistic Approaches
2.2.3 Collocation vs. Co-occurrence
2.3 Towards a Core Collocation Concept
2.4 Theoretical Perspectives on Collocations
2.4.1 Contextualism
2.4.2 Text Cohesion
2.4.3 Meaning-Text Theory
2.4.4 Semantics and Metaphoricity
2.4.5 Lexis-Grammar Interface
2.5 Linguistic Descriptions
2.5.1 Semantic Compositionality
2.5.2 Morpho-Syntactic Characterisation
2.6 What Collocation Means in This Book
2.7 Summary
3 Survey of Extraction Methods
3.1 Introduction
3.2 Extraction Techniques
3.2.1 Collocation Features Modelled
3.2.2 General Extraction Architecture
3.2.3 Contingency Tables
3.2.4 Association Measures
3.2.5 Criteria for the Application of Association Measures
3.3 Linguistic Preprocessing
3.3.1 Lemmatization
3.3.2 POS Tagging
3.3.3 Shallow and Deep Parsing
3.3.4 Beyond Parsing
3.4 Survey of the State of the Art
3.4.1 English
3.4.2 German
3.4.3 French
3.4.4 Other Languages
3.5 Summary
4 Syntax-Based Extraction
4.1 Introduction
4.2 The Fips Multilingual Parser
4.3 Extraction Method
4.3.1 Candidate Identification
4.3.2 Candidate Ranking
4.4 Evaluation
4.4.1 On Collocation Extraction Evaluation
4.4.2 Evaluation Method
4.4.3 Experiment 1: Monolingual Evaluation
4.4.4 Results of Experiment 1
4.4.5 Experiment 2: Cross-Lingual Evaluation
4.4.6 Results of Experiment 2
4.5 Qualitative Analysis
4.5.1 Error Analysis
4.5.2 Intersection and Rank Correlation
4.5.3 Instance-Level Analysis
4.6 Discussion
4.7 Summary
5 Extensions
5.1 Identification of Complex Collocations
5.1.1 The Method
5.1.2 Experimental Results
5.1.3 Related Work
5.2 Data-Driven Induction of Syntactic Patterns
5.2.1 The Method
5.2.2 Experimental Results
5.2.3 Related Work
5.3 Corpus-Based Collocation Translation
5.3.1 The Method
5.3.2 Experimental Results
5.3.3 Related Work
5.4 Summary
6 Conclusion
6.1 Main Contributions
6.2 Future Directions
A List of Collocation Dictionaries
B List of Collocation Definitions
C Association Measures - Mathematical Notes
C.1 χ²
C.2 Log-Likelihood Ratio
D Monolingual Evaluation (Experiment 1)
D.1 Test Data and Annotations
D.2 Results
E Cross-Lingual Evaluation (Experiment 2)
E.1 Test Data and Annotations
E.2 Results
F Output Comparison
References
Index |
any_adam_object | 1 |
author | Seretan, Violeta |
author_facet | Seretan, Violeta |
author_role | aut |
author_sort | Seretan, Violeta |
author_variant | v s vs |
building | Verbundindex |
bvnumber | BV037315197 |
classification_rvk | ET 750 |
ctrlnum | (OCoLC)729948813 (DE-599)DNB1005844895 |
discipline | Linguistics, Computer Science, Literary Studies |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>00000nam a2200000 cb4500</leader><controlfield tag="001">BV037315197</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20150910</controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">110401s2011 gw d||| |||| 00||| eng d</controlfield><datafield tag="015" ind1=" " ind2=" "><subfield code="a">10,N34</subfield><subfield code="2">dnb</subfield></datafield><datafield tag="016" ind1="7" ind2=" "><subfield code="a">1005844895</subfield><subfield code="2">DE-101</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9789400701335</subfield><subfield code="c">GB. : EUR 106.95 (freier Pr.), sfr 143.50 (freier Pr.)</subfield><subfield code="9">978-94-0070133-5</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9789400701342</subfield><subfield code="9">978-94-007-0134-2</subfield></datafield><datafield tag="024" ind1="3" ind2=" "><subfield code="a">9789400701335</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">80018175</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)729948813</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)DNB1005844895</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakddb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="044" ind1=" " ind2=" "><subfield code="a">gw</subfield><subfield code="c">XA-DE-BE</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-19</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ET 750</subfield><subfield code="0">(DE-625)28029:</subfield><subfield 
code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">004</subfield><subfield code="2">sdnb</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Seretan, Violeta</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Syntax-based collocation extraction</subfield><subfield code="c">by Violeta Seretan</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Dordrecht [u.a.]</subfield><subfield code="b">Springer</subfield><subfield code="c">2011</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XI, 217 S.</subfield><subfield code="b">graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="1" ind2=" "><subfield code="a">Text, speech and language technology</subfield><subfield code="v">44</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Maschinelle Übersetzung</subfield><subfield code="0">(DE-588)4003966-3</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Maschinelle 
Übersetzung</subfield><subfield code="0">(DE-588)4003966-3</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="830" ind1=" " ind2="0"><subfield code="a">Text, speech and language technology</subfield><subfield code="v">44</subfield><subfield code="w">(DE-604)BV011123931</subfield><subfield code="9">44</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">X:MVB</subfield><subfield code="q">text/html</subfield><subfield code="u">http://deposit.dnb.de/cgi-bin/dokserv?id=3524300&prov=M&dok_var=1&dok_ext=htm</subfield><subfield code="3">Inhaltstext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">HEBIS Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=022469526&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-022469526</subfield></datafield></record></collection> |
id | DE-604.BV037315197 |
illustrated | Illustrated |
indexdate | 2024-07-20T11:04:12Z |
institution | BVB |
isbn | 9789400701335 9789400701342 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-022469526 |
oclc_num | 729948813 |
open_access_boolean | |
owner | DE-19 DE-BY-UBM |
owner_facet | DE-19 DE-BY-UBM |
physical | XI, 217 S. graph. Darst. |
publishDate | 2011 |
publishDateSearch | 2011 |
publishDateSort | 2011 |
publisher | Springer |
record_format | marc |
series | Text, speech and language technology |
series2 | Text, speech and language technology |
spelling | Seretan, Violeta Verfasser aut Syntax-based collocation extraction by Violeta Seretan Dordrecht [u.a.] Springer 2011 XI, 217 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Text, speech and language technology 44 Maschinelle Übersetzung (DE-588)4003966-3 gnd rswk-swf Computerlinguistik (DE-588)4035843-4 gnd rswk-swf Computerlinguistik (DE-588)4035843-4 s Maschinelle Übersetzung (DE-588)4003966-3 s DE-604 Text, speech and language technology 44 (DE-604)BV011123931 44 X:MVB text/html http://deposit.dnb.de/cgi-bin/dokserv?id=3524300&prov=M&dok_var=1&dok_ext=htm Inhaltstext HEBIS Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=022469526&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Seretan, Violeta Syntax-based collocation extraction Text, speech and language technology Maschinelle Übersetzung (DE-588)4003966-3 gnd Computerlinguistik (DE-588)4035843-4 gnd |
subject_GND | (DE-588)4003966-3 (DE-588)4035843-4 |
title | Syntax-based collocation extraction |
title_auth | Syntax-based collocation extraction |
title_exact_search | Syntax-based collocation extraction |
title_full | Syntax-based collocation extraction by Violeta Seretan |
title_fullStr | Syntax-based collocation extraction by Violeta Seretan |
title_full_unstemmed | Syntax-based collocation extraction by Violeta Seretan |
title_short | Syntax-based collocation extraction |
title_sort | syntax based collocation extraction |
topic | Maschinelle Übersetzung (DE-588)4003966-3 gnd Computerlinguistik (DE-588)4035843-4 gnd |
topic_facet | Maschinelle Übersetzung Computerlinguistik |
url | http://deposit.dnb.de/cgi-bin/dokserv?id=3524300&prov=M&dok_var=1&dok_ext=htm http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=022469526&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
volume_link | (DE-604)BV011123931 |
work_keys_str_mv | AT seretanvioleta syntaxbasedcollocationextraction |