Memory-based language processing
Main authors: | Daelemans, Walter ; Bosch, Antal van den |
---|---|
Format: | Book |
Language: | English |
Published: | Cambridge [u.a.] : Cambridge Univ. Press, 2009 |
Edition: | Digitally print. version |
Series: | Studies in natural language processing |
Subjects: | Natural language processing (Computer science) ; Sprachverarbeitung ; Natürliche Sprache ; Computerlinguistik ; Maschinelles Lernen |
Online access: | Table of contents ; Blurb |
Description: | VII, 189 pp., diagrams |
ISBN: | 9780521114455 9780521808903 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV035394050 | ||
003 | DE-604 | ||
005 | 20180516 | ||
007 | t | ||
008 | 090326s2009 d||| |||| 00||| eng d | ||
020 | |a 9780521114455 |9 978-0-521-11445-5 | ||
020 | |a 9780521808903 |9 978-0-521-80890-3 | ||
035 | |a (OCoLC)316824099 | ||
035 | |a (DE-599)BVBBV035394050 | ||
040 | |a DE-604 |b ger |e rakwb | ||
041 | 0 | |a eng | |
049 | |a DE-19 |a DE-473 |a DE-355 |a DE-188 | ||
082 | 0 | |a 006.35 |2 22 | |
084 | |a ES 900 |0 (DE-625)27926: |2 rvk | ||
084 | |a ST 300 |0 (DE-625)143650: |2 rvk | ||
100 | 1 | |a Daelemans, Walter |d 1960- |e Verfasser |0 (DE-588)142096660 |4 aut | |
245 | 1 | 0 | |a Memory-based language processing |c Walter Daelemans ; Antal van den Bosch |
250 | |a Digitally print. version | ||
264 | 1 | |a Cambridge [u.a.] |b Cambridge Univ. Press |c 2009 | |
300 | |a VII, 189 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 0 | |a Studies in natural language processing | |
650 | 4 | |a Natural language processing (Computer science) | |
650 | 0 | 7 | |a Sprachverarbeitung |0 (DE-588)4116579-2 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Natürliche Sprache |0 (DE-588)4041354-8 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Computerlinguistik |0 (DE-588)4035843-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Sprachverarbeitung |0 (DE-588)4116579-2 |D s |
689 | 0 | 1 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |D s |
689 | 0 | 2 | |a Natürliche Sprache |0 (DE-588)4041354-8 |D s |
689 | 0 | |5 DE-604 | |
689 | 1 | 0 | |a Computerlinguistik |0 (DE-588)4035843-4 |D s |
689 | 1 | 1 | |a Natürliche Sprache |0 (DE-588)4041354-8 |D s |
689 | 1 | |5 DE-604 | |
700 | 1 | |a Bosch, Antal van den |d 1969- |e Verfasser |0 (DE-588)13902302X |4 aut | |
856 | 4 | 2 | |m Digitalisierung UB Regensburg |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=017314794&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
856 | 4 | 2 | |m Digitalisierung UB Regensburg |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=017314794&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |3 Klappentext |
999 | |a oai:aleph.bib-bvb.de:BVB01-017314794 |
Record in the search index
_version_ | 1804138818269347840 |
---|---|
adam_text | Contents
Preface 1
1 Memory-Based Learning in Natural Language Processing 3
1.1 Natural language processing as classification 6
1.2 A linguistic example 9
1.3 Roadmap and software 12
1.4 Further reading 14
2 Inspirations from linguistics and artificial intelligence 15
2.1 Inspirations from linguistics 15
2.2 Inspirations from artificial intelligence 21
2.3 Memory-based language processing literature 22
2.4 Conclusion 24
3 Memory and Similarity 26
3.1 German plural formation 27
3.2 Similarity metric 28
3.2.1 Information-theoretic feature weighting 29
3.2.2 Alternative feature weighting methods 31
3.2.3 Getting started with TiMBL 32
3.2.4 Feature weighting in TiMBL 36
3.2.5 Modified value difference metric 38
3.2.6 Value clustering in TiMBL 39
3.2.7 Distance-weighted class voting 42
3.2.8 Distance-weighted class voting in TiMBL 44
3.3 Analyzing the output of MBLP 45
3.3.1 Displaying nearest neighbors in TiMBL 45
3.4 Implementation issues 46
3.4.1 TiMBL trees 47
3.5 Methodology 47
3.5.1 Experimental methodology in TiMBL 48
3.5.2 Additional performance measures in TiMBL 52
3.6 Conclusion 55
4 Application to morpho-phonology 57
4.1 Phonemization 59
4.1.1 Memory-based word phonemization 59
4.1.2 TREETALK 60
4.1.3 IGTree in TiMBL 67
4.1.4 Experiments: applying IGTree to word phonemization 69
4.1.5 TRIBL: trading memory for speed 71
4.1.6 TRIBL in TiMBL 73
4.2 Morphological analysis 73
4.2.1 Dutch morphology 74
4.2.2 Feature and class encoding 74
4.2.3 Experiments: MBMA on Dutch wordforms 76
4.3 Conclusion 80
4.4 Further reading 83
5 Application to shallow parsing 85
5.1 Part-of-speech tagging 86
5.1.1 Memory-based tagger architecture 87
5.1.2 Results 88
5.1.3 Memory-based tagging with MBT and MBTG 90
5.2 Constituent chunking 96
5.2.1 Results 96
5.2.2 Using MBT and MBTG for chunking 97
5.3 Relation finding 99
5.3.1 Relation finder architecture 99
5.3.2 Results 100
5.4 Conclusion 101
5.5 Further reading 102
6 Abstraction and generalization 104
6.1 Lazy versus eager learning 106
6.1.1 Benchmark language learning tasks 107
6.1.2 Forgetting by rule induction is harmful in language learning 111
6.2 Editing examples 115
6.3 Why forgetting examples can be harmful 123
6.4 Generalizing examples 128
6.4.1 Careful abstraction in memory-based learning 128
6.4.2 Getting started with FAMBL 135
6.4.3 Experiments with FAMBL 137
6.5 Conclusion 143
6.6 Further reading 145
7 Extensions 148
7.1 Wrapped progressive sampling 149
7.1.1 The wrapped progressive sampling algorithm 150
7.1.2 Getting started with wrapped progressive sampling 152
7.1.3 Wrapped progressive sampling results 154
7.2 Optimizing output sequences 156
7.2.1 Stacking 157
7.2.2 Predicting class n-grams 160
7.2.3 Combining stacking and class n-grams 162
7.2.4 Summary 164
7.3 Conclusion 164
7.4 Further reading 165
Bibliography 168
Index 186
Memory-Based Language Processing
Paperback Re-issue
Memory-based language processing, a machine learning and problem solving method for language technology, is based on the idea that the direct re-use of examples using analogical reasoning is more suited for solving language processing problems than the application of rules extracted from those examples. This book discusses the theory and practice of memory-based language processing, showing its comparative strengths over alternative methods of language modeling. Language is complex, with few generalizations, many sub-regularities and exceptions, and the advantage of memory-based language processing is that it does not abstract away from this valuable low-frequency information.
By applying the model to a range of benchmark problems, the authors show that for linguistic areas ranging from phonology to semantics, it produces excellent results. They also describe TiMBL, a software package for memory-based language processing. The first comprehensive overview of the approach, this book will be invaluable for computational linguists, psycholinguists and language engineers.
The web site to accompany this book, containing instructions for downloading TiMBL and links to other useful information, can be found at http://ilk.uvt.nl/mblp/
|
any_adam_object | 1 |
author | Daelemans, Walter 1960- Bosch, Antal van den 1969- |
author_GND | (DE-588)142096660 (DE-588)13902302X |
author_facet | Daelemans, Walter 1960- Bosch, Antal van den 1969- |
author_role | aut aut |
author_sort | Daelemans, Walter 1960- |
author_variant | w d wd a v d b avd avdb |
building | Verbundindex |
bvnumber | BV035394050 |
classification_rvk | ES 900 ST 300 |
ctrlnum | (OCoLC)316824099 (DE-599)BVBBV035394050 |
dewey-full | 006.35 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.35 |
dewey-search | 006.35 |
dewey-sort | 16.35 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik Sprachwissenschaft Literaturwissenschaft |
edition | Digitally print. version |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02321nam a2200505 c 4500</leader><controlfield tag="001">BV035394050</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20180516 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">090326s2009 d||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9780521114455</subfield><subfield code="9">978-0-521-11445-5</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9780521808903</subfield><subfield code="9">978-0-521-80890-3</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)316824099</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV035394050</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-19</subfield><subfield code="a">DE-473</subfield><subfield code="a">DE-355</subfield><subfield code="a">DE-188</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">006.35</subfield><subfield code="2">22</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ES 900</subfield><subfield code="0">(DE-625)27926:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 300</subfield><subfield code="0">(DE-625)143650:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Daelemans, Walter</subfield><subfield code="d">1960-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)142096660</subfield><subfield 
code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Memory-based language processing</subfield><subfield code="c">Walter Daelemans ; Antal van den Bosch</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">Digitally print. version</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Cambridge [u.a.]</subfield><subfield code="b">Cambridge Univ. Press</subfield><subfield code="c">2009</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">VII, 189 S.</subfield><subfield code="b">graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Studies in natural language processing</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Natural language processing (Computer science)</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Sprachverarbeitung</subfield><subfield code="0">(DE-588)4116579-2</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Natürliche Sprache</subfield><subfield code="0">(DE-588)4041354-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Maschinelles Lernen</subfield><subfield 
code="0">(DE-588)4193754-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Sprachverarbeitung</subfield><subfield code="0">(DE-588)4116579-2</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Natürliche Sprache</subfield><subfield code="0">(DE-588)4041354-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="689" ind1="1" ind2="0"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2="1"><subfield code="a">Natürliche Sprache</subfield><subfield code="0">(DE-588)4041354-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Bosch, Antal van den</subfield><subfield code="d">1969-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)13902302X</subfield><subfield code="4">aut</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=017314794&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg</subfield><subfield code="q">application/pdf</subfield><subfield 
code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=017314794&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Klappentext</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-017314794</subfield></datafield></record></collection> |
id | DE-604.BV035394050 |
illustrated | Illustrated |
indexdate | 2024-07-09T21:34:14Z |
institution | BVB |
isbn | 9780521114455 9780521808903 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-017314794 |
oclc_num | 316824099 |
open_access_boolean | |
owner | DE-19 DE-BY-UBM DE-473 DE-BY-UBG DE-355 DE-BY-UBR DE-188 |
owner_facet | DE-19 DE-BY-UBM DE-473 DE-BY-UBG DE-355 DE-BY-UBR DE-188 |
physical | VII, 189 S. graph. Darst. |
publishDate | 2009 |
publishDateSearch | 2009 |
publishDateSort | 2009 |
publisher | Cambridge Univ. Press |
record_format | marc |
series2 | Studies in natural language processing |
spelling | Daelemans, Walter 1960- Verfasser (DE-588)142096660 aut Memory-based language processing Walter Daelemans ; Antal van den Bosch Digitally print. version Cambridge [u.a.] Cambridge Univ. Press 2009 VII, 189 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Studies in natural language processing Natural language processing (Computer science) Sprachverarbeitung (DE-588)4116579-2 gnd rswk-swf Natürliche Sprache (DE-588)4041354-8 gnd rswk-swf Computerlinguistik (DE-588)4035843-4 gnd rswk-swf Maschinelles Lernen (DE-588)4193754-5 gnd rswk-swf Sprachverarbeitung (DE-588)4116579-2 s Maschinelles Lernen (DE-588)4193754-5 s Natürliche Sprache (DE-588)4041354-8 s DE-604 Computerlinguistik (DE-588)4035843-4 s Bosch, Antal van den 1969- Verfasser (DE-588)13902302X aut Digitalisierung UB Regensburg application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=017314794&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis Digitalisierung UB Regensburg application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=017314794&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA Klappentext |
spellingShingle | Daelemans, Walter 1960- Bosch, Antal van den 1969- Memory-based language processing Natural language processing (Computer science) Sprachverarbeitung (DE-588)4116579-2 gnd Natürliche Sprache (DE-588)4041354-8 gnd Computerlinguistik (DE-588)4035843-4 gnd Maschinelles Lernen (DE-588)4193754-5 gnd |
subject_GND | (DE-588)4116579-2 (DE-588)4041354-8 (DE-588)4035843-4 (DE-588)4193754-5 |
title | Memory-based language processing |
title_auth | Memory-based language processing |
title_exact_search | Memory-based language processing |
title_full | Memory-based language processing Walter Daelemans ; Antal van den Bosch |
title_fullStr | Memory-based language processing Walter Daelemans ; Antal van den Bosch |
title_full_unstemmed | Memory-based language processing Walter Daelemans ; Antal van den Bosch |
title_short | Memory-based language processing |
title_sort | memory based language processing |
topic | Natural language processing (Computer science) Sprachverarbeitung (DE-588)4116579-2 gnd Natürliche Sprache (DE-588)4041354-8 gnd Computerlinguistik (DE-588)4035843-4 gnd Maschinelles Lernen (DE-588)4193754-5 gnd |
topic_facet | Natural language processing (Computer science) Sprachverarbeitung Natürliche Sprache Computerlinguistik Maschinelles Lernen |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=017314794&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=017314794&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT daelemanswalter memorybasedlanguageprocessing AT boschantalvanden memorybasedlanguageprocessing |