A resource-light approach to morpho-syntactic tagging
Main authors: | Feldman, Anna; Hana, Jirka |
---|---|
Format: | Book |
Language: | English |
Published: | Amsterdam [etc.]: Rodopi, 2010 |
Series: | Language and computers 70 |
Subjects: | |
Online access: | Table of contents |
Description: | XIV, 185 pp., graphs |
ISBN: | 9789042027695 9789042027688 |
Internal format
MARC
LEADER | 00000nam a2200000 cb4500 | ||
---|---|---|---|
001 | BV035985792 | ||
003 | DE-604 | ||
005 | 20100608 | ||
007 | t | ||
008 | 100128s2010 d||| |||| 00||| eng d | ||
020 | |a 9789042027695 |c EBook |9 978-90-420-2769-5 | ||
020 | |a 9789042027688 |9 978-90-420-2768-8 | ||
020 | |z 9042027681 |9 90-420-2768-1 | ||
035 | |a (OCoLC)497573700 | ||
035 | |a (DE-599)BVBBV035985792 | ||
040 | |a DE-604 |b ger |e rakwb | ||
041 | 0 | |a eng | |
049 | |a DE-384 |a DE-12 |a DE-29 | ||
050 | 0 | |a P290 | |
082 | 0 | |a 410.285 | |
084 | |a ES 940 |0 (DE-625)27934: |2 rvk | ||
100 | 1 | |a Feldman, Anna |e Verfasser |0 (DE-588)140601937 |4 aut | |
245 | 1 | 0 | |a A resource-light approach to morpho-syntactic tagging |c Anna Feldman and Jirka Hana |
264 | 1 | |a Amsterdam [u. a.] |b Rodopi |c 2010 | |
300 | |a XIV, 185 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 1 | |a Language and computers |v 70 | |
650 | 4 | |a Grammatik | |
650 | 4 | |a Spanisch | |
650 | 4 | |a Catalan language |x Morphosyntax | |
650 | 4 | |a Cognate words | |
650 | 4 | |a Computational linguistics | |
650 | 4 | |a Corpora (Linguistics) | |
650 | 4 | |a Cross-language information retrieval | |
650 | 4 | |a Czech language |x Morphosyntax | |
650 | 4 | |a Grammar, Comparative and general |x Morphosyntax | |
650 | 4 | |a Language transfer (Language learning) | |
650 | 4 | |a Portuguese language |x Morphosyntax | |
650 | 4 | |a Russian language |x Morphosyntax | |
650 | 4 | |a Spanish language |x Morphosyntax | |
650 | 0 | 7 | |a Computerlinguistik |0 (DE-588)4035843-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Morphosyntax |0 (DE-588)4114635-9 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Morphosyntax |0 (DE-588)4114635-9 |D s |
689 | 0 | 1 | |a Computerlinguistik |0 (DE-588)4035843-4 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Hana, Jirka |e Verfasser |0 (DE-588)140601988 |4 aut | |
830 | 0 | |a Language and computers |v 70 |w (DE-604)BV000833947 |9 70 | |
856 | 4 | 2 | |m HEBIS Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=018878606&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
940 | 1 | |n oe | |
999 | |a oai:aleph.bib-bvb.de:BVB01-018878606 |
Record in the search index
_version_ | 1804141007235710976 |
---|---|
adam_text | A RESOURCE-LIGHT APPROACH TO MORPHO-SYNTACTIC TAGGING ANNA FELDMAN AND
JIRKA HANA AMSTERDAM - NEW YORK, NY 2010 CONTENTS LIST OF TABLES VII
LIST OF FIGURES XI PREFACE XIII 1 INTRODUCTION 1 1.1 ORGANIZATION OF THE
BOOK 4 2 COMMON TAGGING TECHNIQUES 5 2.1 SUPERVISED METHODS 6 2.2
UNSUPERVISED METHODS 17 2.3 COMPARISON OF THE TAGGING APPROACHES 19 2.4
CLASSIFIER COMBINATION 20 2.5 A SPECIAL APPROACH TO TAGGING HIGHLY
INFLECTED LANGUAGES 25 2.6 SUMMARY 29 3 PREVIOUS RESOURCE-LIGHT
APPROACHES TO NLP 31 3.1 UNSUPERVISED OR MINIMALLY SUPERVISED APPROACHES
32 3.2 CROSS-LANGUAGE KNOWLEDGE INDUCTION 36 3.3 SUMMARY 47 4 LANGUAGES,
CORPORA AND TAGSETS 49 4.1 LANGUAGE PROPERTIES 49 4.2 CORPORA 59 4.3
TAGSET DESIGN 60 4.4 TAGSETS IN OUR EXPERIMENTS 64 5 QUANTIFYING
LANGUAGE PROPERTIES 71 5.1 TAGSET SIZE, TAGSET COVERAGE 71 5.2 HOW MUCH
TRAINING DATA IS NECESSARY? 75 5.3 DATA SPARSITY, CONTEXT, AND TAGSET
SIZE 78 5.4 SUMMARY 78 6 RESOURCE-LIGHT MORPHOLOGICAL ANALYSIS 81 6.1
INTRODUCTION 81 6.2 MOTIVATION - LEXICAL STATISTICS OF CZECH
82 6.3 A MORPHOLOGICAL ANALYZER OF CZECH 83 6.4 APPLICATION TO OTHER
LANGUAGES 98 6.5 POSSIBLE ENHANCEMENTS 101 7 CROSS-LANGUAGE
MORPHOLOGICAL TAGGING 103 7.1 WHY A MARKOV MODEL 103 7.2 TAGGING RUSSIAN
USING CZECH 104 7.3 USING SOURCE LANGUAGE DIRECTLY 105 7.4 EXPECTATIONS
107 7.5 USING MA TO APPROXIMATE EMISSIONS 108 7.6 IMPROVING EMISSIONS -
COGNATES 109 7.7 IMPROVING TRANSITIONS - RUSSIFICATIONS 113 7.8
DEALING WITH DATA SPARSITY - TAG DECOMPOSITION 115 7.9 RESULTS ON TEST
CORPUS 118 7.10 CATALAN 121 7.11 PORTUGUESE 123 7.12 CONCLUSION 123 8
SUMMARY AND FURTHER WORK 125 8.1 SUMMARY OF THE BOOK 125 8.2 FUTURE WORK
126 BIBLIOGRAPHY 133 APPENDICES 148 A TAGSETS WE USE 149 A.1 CZECH
TAGSET 149 A.2 RUSSIAN TAGSET 154 A.3 ROMANCE TAGSETS 161 B CORPORA 165
B.1 SLAVIC CORPORA 165 B.2 ROMANCE CORPORA 166 C LANGUAGE PROPERTIES 167
C.1 SLAVIC LANGUAGES 167 C.2 CZECH 167 C.3 RUSSIAN 169 C.4 ROMANCE
LANGUAGES 172 C.5 CATALAN 175 C.6 PORTUGUESE 178 C.7 SPANISH 180
CITATION INDEX 183
|
any_adam_object | 1 |
author | Feldman, Anna Hana, Jirka |
author_GND | (DE-588)140601937 (DE-588)140601988 |
author_facet | Feldman, Anna Hana, Jirka |
author_role | aut aut |
author_sort | Feldman, Anna |
author_variant | a f af j h jh |
building | Verbundindex |
bvnumber | BV035985792 |
callnumber-first | P - Language and Literature |
callnumber-label | P290 |
callnumber-raw | P290 |
callnumber-search | P290 |
callnumber-sort | P 3290 |
callnumber-subject | P - Philology and Linguistics |
classification_rvk | ES 940 |
ctrlnum | (OCoLC)497573700 (DE-599)BVBBV035985792 |
dewey-full | 410.285 |
dewey-hundreds | 400 - Language |
dewey-ones | 410 - Linguistics |
dewey-raw | 410.285 |
dewey-search | 410.285 |
dewey-sort | 3410.285 |
dewey-tens | 410 - Linguistics |
discipline | Sprachwissenschaft Literaturwissenschaft |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02238nam a2200589 cb4500</leader><controlfield tag="001">BV035985792</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20100608 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">100128s2010 d||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9789042027695</subfield><subfield code="c">EBook</subfield><subfield code="9">978-90-420-2769-5</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9789042027688</subfield><subfield code="9">978-90-420-2768-8</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="z">9042027681</subfield><subfield code="9">90-420-2768-1</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)497573700</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV035985792</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-384</subfield><subfield code="a">DE-12</subfield><subfield code="a">DE-29</subfield></datafield><datafield tag="050" ind1=" " ind2="0"><subfield code="a">P290</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">410.285</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ES 940</subfield><subfield code="0">(DE-625)27934:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Feldman, Anna</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)140601937</subfield><subfield 
code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">A resource-light approach to morpho-syntactic tagging</subfield><subfield code="c">Anna Feldman and Jirka Hana</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Amsterdam [u. a.]</subfield><subfield code="b">Rodopi</subfield><subfield code="c">2010</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XIV, 185 S.</subfield><subfield code="b">graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="1" ind2=" "><subfield code="a">Language and computers</subfield><subfield code="v">70</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Grammatik</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Spanisch</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Catalan language</subfield><subfield code="x">Morphosyntax</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Cognate words</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Computational linguistics</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Corpora (Linguistics)</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Cross-language information retrieval</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Czech language</subfield><subfield code="x">Morphosyntax</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Grammar, Comparative and 
general</subfield><subfield code="x">Morphosyntax</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Language transfer (Language learning)</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Portuguese language</subfield><subfield code="x">Morphosyntax</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Russian language</subfield><subfield code="x">Morphosyntax</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Spanish language</subfield><subfield code="x">Morphosyntax</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Morphosyntax</subfield><subfield code="0">(DE-588)4114635-9</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Morphosyntax</subfield><subfield code="0">(DE-588)4114635-9</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Hana, Jirka</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)140601988</subfield><subfield code="4">aut</subfield></datafield><datafield tag="830" ind1=" " ind2="0"><subfield code="a">Language and computers</subfield><subfield code="v">70</subfield><subfield code="w">(DE-604)BV000833947</subfield><subfield code="9">70</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">HEBIS Datenaustausch</subfield><subfield 
code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&amp;doc_library=BVB01&amp;local_base=BVB01&amp;doc_number=018878606&amp;sequence=000001&amp;line_number=0001&amp;func_code=DB_RECORDS&amp;service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="940" ind1="1" ind2=" "><subfield code="n">oe</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-018878606</subfield></datafield></record></collection> |
id | DE-604.BV035985792 |
illustrated | Illustrated |
indexdate | 2024-07-09T22:09:01Z |
institution | BVB |
isbn | 9789042027695 9789042027688 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-018878606 |
oclc_num | 497573700 |
open_access_boolean | |
owner | DE-384 DE-12 DE-29 |
owner_facet | DE-384 DE-12 DE-29 |
physical | XIV, 185 S. graph. Darst. |
publishDate | 2010 |
publishDateSearch | 2010 |
publishDateSort | 2010 |
publisher | Rodopi |
record_format | marc |
series | Language and computers |
series2 | Language and computers |
spelling | Feldman, Anna Verfasser (DE-588)140601937 aut A resource-light approach to morpho-syntactic tagging Anna Feldman and Jirka Hana Amsterdam [u. a.] Rodopi 2010 XIV, 185 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Language and computers 70 Grammatik Spanisch Catalan language Morphosyntax Cognate words Computational linguistics Corpora (Linguistics) Cross-language information retrieval Czech language Morphosyntax Grammar, Comparative and general Morphosyntax Language transfer (Language learning) Portuguese language Morphosyntax Russian language Morphosyntax Spanish language Morphosyntax Computerlinguistik (DE-588)4035843-4 gnd rswk-swf Morphosyntax (DE-588)4114635-9 gnd rswk-swf Morphosyntax (DE-588)4114635-9 s Computerlinguistik (DE-588)4035843-4 s DE-604 Hana, Jirka Verfasser (DE-588)140601988 aut Language and computers 70 (DE-604)BV000833947 70 HEBIS Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=018878606&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Feldman, Anna Hana, Jirka A resource-light approach to morpho-syntactic tagging Language and computers Grammatik Spanisch Catalan language Morphosyntax Cognate words Computational linguistics Corpora (Linguistics) Cross-language information retrieval Czech language Morphosyntax Grammar, Comparative and general Morphosyntax Language transfer (Language learning) Portuguese language Morphosyntax Russian language Morphosyntax Spanish language Morphosyntax Computerlinguistik (DE-588)4035843-4 gnd Morphosyntax (DE-588)4114635-9 gnd |
subject_GND | (DE-588)4035843-4 (DE-588)4114635-9 |
title | A resource-light approach to morpho-syntactic tagging |
title_auth | A resource-light approach to morpho-syntactic tagging |
title_exact_search | A resource-light approach to morpho-syntactic tagging |
title_full | A resource-light approach to morpho-syntactic tagging Anna Feldman and Jirka Hana |
title_fullStr | A resource-light approach to morpho-syntactic tagging Anna Feldman and Jirka Hana |
title_full_unstemmed | A resource-light approach to morpho-syntactic tagging Anna Feldman and Jirka Hana |
title_short | A resource-light approach to morpho-syntactic tagging |
title_sort | a resource light approach to morpho syntactic tagging |
topic | Grammatik Spanisch Catalan language Morphosyntax Cognate words Computational linguistics Corpora (Linguistics) Cross-language information retrieval Czech language Morphosyntax Grammar, Comparative and general Morphosyntax Language transfer (Language learning) Portuguese language Morphosyntax Russian language Morphosyntax Spanish language Morphosyntax Computerlinguistik (DE-588)4035843-4 gnd Morphosyntax (DE-588)4114635-9 gnd |
topic_facet | Grammatik Spanisch Catalan language Morphosyntax Cognate words Computational linguistics Corpora (Linguistics) Cross-language information retrieval Czech language Morphosyntax Grammar, Comparative and general Morphosyntax Language transfer (Language learning) Portuguese language Morphosyntax Russian language Morphosyntax Spanish language Morphosyntax Computerlinguistik Morphosyntax |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=018878606&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
volume_link | (DE-604)BV000833947 |
work_keys_str_mv | AT feldmananna aresourcelightapproachtomorphosyntactictagging AT hanajirka aresourcelightapproachtomorphosyntactictagging |