Named entities for computational linguistics:
Main authors: | Nouvel, Damien; Ehrmann, Maud; Rosset, Sophie |
---|---|
Format: | Book |
Language: | English |
Published: |
London
ISTE
2016
|
Series: | Cognitive science series
|
Subjects: | Computerlinguistik, Name |
Online access: | Table of contents, Blurb |
Description: | XI, 170 pages |
ISBN: | 1848218389 9781848218383 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV043424896 | ||
003 | DE-604 | ||
005 | 20160607 | ||
007 | t | ||
008 | 160302s2016 |||| 00||| eng d | ||
020 | |a 1848218389 |c hardback |9 1-84821-838-9 | ||
020 | |a 9781848218383 |c hardback |9 978-1-84821-838-3 | ||
035 | |a (OCoLC)949261038 | ||
035 | |a (DE-599)BVBBV043424896 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-12 |a DE-355 |a DE-739 | ||
084 | |a ST 306 |0 (DE-625)143654: |2 rvk | ||
100 | 1 | |a Nouvel, Damien |e Verfasser |4 aut | |
245 | 1 | 0 | |a Named entities for computational linguistics |c Damien Nouvel, Maud Ehrmann, Sophie Rosset |
264 | 1 | |a London |b ISTE |c 2016 | |
300 | |a XI, 170 Seiten | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 0 | |a Cognitive science series | |
650 | 0 | 7 | |a Computerlinguistik |0 (DE-588)4035843-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Name |0 (DE-588)4127959-1 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Computerlinguistik |0 (DE-588)4035843-4 |D s |
689 | 0 | 1 | |a Name |0 (DE-588)4127959-1 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Ehrmann, Maud |e Verfasser |4 aut | |
700 | 1 | |a Rosset, Sophie |e Verfasser |4 aut | |
856 | 4 | 2 | |m Digitalisierung UB Regensburg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028842762&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
856 | 4 | 2 | |m Digitalisierung UB Regensburg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028842762&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |3 Klappentext |
999 | |a oai:aleph.bib-bvb.de:BVB01-028842762 |
Record in search index
_version_ | 1804176020983513088 |
---|---|
adam_text | Contents
Introduction........................................................... ix
Chapter 1. Named Entities for Accessing Information ................ 1
1.1. Research program history.................................. 2
1.1.1. Understanding documents: an ambitious task ............. 2
1.1.2. Detecting basic elements: named entities................ 3
1.1.3. Trend: a return to slot filling......................... 7
1.2. Task using named entities as a basic representation........ 9
1.3. Conclusion.................................................... 10
Chapter 2. Named Entities, Referential Units ....................... 11
2.1. Issues with the named entity concept....................... 12
2.1.1. A heterogeneous set..................................... 12
2.1.2. Existing defining formulas................................. 17
2.1.3. An NLP object.............................................. 21
2.2. The notions of meaning and reference.......................... 22
2.2.1. What is the reference?.................................. 22
2.2.2. What is meaning?........................................... 24
2.3. Proper names.................................................. 27
2.3.1. The traditional criteria for defining a proper name..... 28
2.3.2. Meaning and referential function of proper names........ 30
2.3.3. The “referential load” of proper names.................. 34
2.4. Definite descriptions......................................... 35
2.4.1. What is a definite description?......................... 35
2.4.2. The meaning of definite descriptions....................... 38
2.4.3. Complete and incomplete definite descriptions.......... 39
2.5. The meaning and referential functioning of named entities.... 41
2.5.1. Reference to a particular................................ 42
2.5.2. Referential autonomy..................................... 44
2.5.3. A “natural” heterogeneity................................ 45
2.6. Conclusion.................................................. 46
Chapter 3. Resources Associated with Named Entities .... 47
3.1. Typologies: general and specialist domains................ 48
3.1.1. The notion of category................................... 48
3.1.2. Typology development..................................... 49
3.1.3. Typologies beyond evaluation campaigns................... 53
3.1.4. Other uses of typologies ................................ 54
3.1.5. Illustrated comparison................................... 57
3.1.6. Issues to consider regarding entities.................... 57
3.2. Corpora................................................... 59
3.2.1. Introduction............................................. 59
3.2.2. Corpora and named entities............................... 60
3.2.3. Conclusion............................................... 65
3.3. Lexicons and knowledge databases............................ 65
3.3.1. Lexical databases...................................... 66
3.3.2. Knowledge databases...................................... 72
3.4. Conclusion.................................................. 75
Chapter 4. Recognizing Named Entities ................................ 77
4.1. Detection and classification of named entities............ 78
4.2. Indicators for named entity recognition..................... 79
4.2.1. Describing word morphology .............................. 79
4.2.2. Using lexical databases.................................. 81
4.2.3. Contextual clues......................................... 83
4.2.4. Conclusion............................................... 85
4.3. Rule-based techniques....................................... 85
4.4. Data-driven and machine-learning systems.................... 88
4.4.1. Majority class models.................................... 91
4.4.2. Contextual models (HMM).................................. 92
4.4.3. Multiple feature models (Softmax and MaxEnt)............. 93
4.4.4. Conditional Random Fields (CRFs).................. 95
4.5. Unsupervised enrichment of supervised methods......... 95
4.6. Conclusion.............................................. 96
Chapter 5. Linking Named Entities to References.................. 99
5.1. Knowledge bases .........................................100
5.2. Formalizing polysemy in named entity mentions............102
5.3. Stages in the named entity linking process...............103
5.3.1. Detecting mentions of named entities.................103
5.3.2. Selecting candidates for each mention...............103
5.3.3. Entity disambiguation................................104
5.3.4. Entity linking.......................................106
5.4. System performance...................................... 106
5.4.1. Practical application: DBpedia Spotlight.............107
5.4.2. Future prospects.....................................108
Chapter 6. Evaluating Named Entity Recognition ...................111
6.1. Classic measurements: precision, recall and F-measures .... 112
6.2. Measures using error counts.............................115
6.3. Evaluating associated tasks............................. 120
6.3.1. Detecting entities and mentions......................121
6.3.2. Entity detection and linking ........................122
6.4. Evaluating preprocessing technologies ...................126
6.5. Conclusion...............................................128
Conclusion .......................................................131
Appendices........................................................137
Appendix 1. Glossary .............................................139
Appendix 2. Named Entities: Research Programs.....................141
Appendix 3. Summary of Available Corpora..........................147
Appendix 4. Annotation Formats....................................151
Appendix 5. Named Entities: Current Definitions ..................153
Bibliography .....................................................157
Index ............................................................169
FOCUS SERIES in COGNITIVE SCIENCE
One of the challenges brought on by the digital revolution of
recent decades is the mechanism by which information
carried by texts can be extracted in order to access its
contents.
The processing of named entities remains a very active area
of research, which plays a central role in natural language
processing technologies and their applications. Named
entity recognition, a tool used in information extraction
tasks, focuses on recognizing small pieces of information in
order to extract information on a larger scale.
The authors use written text and examples in French and
English to present the necessary elements for the readers to
familiarize themselves with the main concepts related to
named entities and to discover the problems associated
with them, as well as the methods available in practice for
solving these issues.
Damien Nouvel is Associate Professor at the National
Institute of Oriental Languages and Civilizations (Inalco) in
Paris, France.
Maud Ehrmann is a Research Scientist at EPFL (École polytechnique
fédérale de Lausanne) in Geneva, Switzerland.
Sophie Rosset is a Senior Researcher at the French National
Centre for Scientific Research (CNRS) in Paris, France.
|
any_adam_object | 1 |
author | Nouvel, Damien Ehrmann, Maud Rosset, Sophie |
author_facet | Nouvel, Damien Ehrmann, Maud Rosset, Sophie |
author_role | aut aut aut |
author_sort | Nouvel, Damien |
author_variant | d n dn m e me s r sr |
building | Verbundindex |
bvnumber | BV043424896 |
classification_rvk | ST 306 |
ctrlnum | (OCoLC)949261038 (DE-599)BVBBV043424896 |
discipline | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01788nam a2200397 c 4500</leader><controlfield tag="001">BV043424896</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20160607 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">160302s2016 |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">1848218389</subfield><subfield code="c">hardback</subfield><subfield code="9">1-84821-838-9</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781848218383</subfield><subfield code="c">hardback</subfield><subfield code="9">978-1-84821-838-3</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)949261038</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV043424896</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-12</subfield><subfield code="a">DE-355</subfield><subfield code="a">DE-739</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 306</subfield><subfield code="0">(DE-625)143654:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Nouvel, Damien</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Named entities for computational linguistics</subfield><subfield code="c">Damien Nouvel, Maud Ehrmann, Sophie Rosset</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">London</subfield><subfield code="b">ISTE</subfield><subfield 
code="c">2016</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XI, 170 Seiten</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Cognitive science series</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Name</subfield><subfield code="0">(DE-588)4127959-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Name</subfield><subfield code="0">(DE-588)4127959-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Ehrmann, Maud</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Rosset, Sophie</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield 
code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028842762&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028842762&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Klappentext</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-028842762</subfield></datafield></record></collection> |
id | DE-604.BV043424896 |
illustrated | Not Illustrated |
indexdate | 2024-07-10T07:25:33Z |
institution | BVB |
isbn | 1848218389 9781848218383 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-028842762 |
oclc_num | 949261038 |
open_access_boolean | |
owner | DE-12 DE-355 DE-BY-UBR DE-739 |
owner_facet | DE-12 DE-355 DE-BY-UBR DE-739 |
physical | XI, 170 Seiten |
publishDate | 2016 |
publishDateSearch | 2016 |
publishDateSort | 2016 |
publisher | ISTE |
record_format | marc |
series2 | Cognitive science series |
spelling | Nouvel, Damien Verfasser aut Named entities for computational linguistics Damien Nouvel, Maud Ehrmann, Sophie Rosset London ISTE 2016 XI, 170 Seiten txt rdacontent n rdamedia nc rdacarrier Cognitive science series Computerlinguistik (DE-588)4035843-4 gnd rswk-swf Name (DE-588)4127959-1 gnd rswk-swf Computerlinguistik (DE-588)4035843-4 s Name (DE-588)4127959-1 s DE-604 Ehrmann, Maud Verfasser aut Rosset, Sophie Verfasser aut Digitalisierung UB Regensburg - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028842762&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis Digitalisierung UB Regensburg - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028842762&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA Klappentext |
spellingShingle | Nouvel, Damien Ehrmann, Maud Rosset, Sophie Named entities for computational linguistics Computerlinguistik (DE-588)4035843-4 gnd Name (DE-588)4127959-1 gnd |
subject_GND | (DE-588)4035843-4 (DE-588)4127959-1 |
title | Named entities for computational linguistics |
title_auth | Named entities for computational linguistics |
title_exact_search | Named entities for computational linguistics |
title_full | Named entities for computational linguistics Damien Nouvel, Maud Ehrmann, Sophie Rosset |
title_fullStr | Named entities for computational linguistics Damien Nouvel, Maud Ehrmann, Sophie Rosset |
title_full_unstemmed | Named entities for computational linguistics Damien Nouvel, Maud Ehrmann, Sophie Rosset |
title_short | Named entities for computational linguistics |
title_sort | named entities for computational linguistics |
topic | Computerlinguistik (DE-588)4035843-4 gnd Name (DE-588)4127959-1 gnd |
topic_facet | Computerlinguistik Name |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028842762&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028842762&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT nouveldamien namedentitiesforcomputationallinguistics AT ehrmannmaud namedentitiesforcomputationallinguistics AT rossetsophie namedentitiesforcomputationallinguistics |