Context-specific Consistencies in Information Extraction: Rule-based and Probabilistic Approaches
Saved in:
Main Author: | |
---|---|
Format: | Thesis Book |
Language: | English |
Published: |
Würzburg
Würzburg University Press
2015
|
Subjects: | |
Online Access: | Full text Table of contents |
Description: | XII, 194 pages, illustrations, diagrams |
ISBN: | 9783958260184 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV042802270 | ||
003 | DE-604 | ||
005 | 20190226 | ||
007 | t | ||
008 | 150904s2015 a||| m||| 00||| eng d | ||
020 | |a 9783958260184 |9 978-3-95826-018-4 | ||
035 | |a (OCoLC)922033880 | ||
035 | |a (DE-599)BVBBV042802270 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-384 |a DE-473 |a DE-703 |a DE-1051 |a DE-824 |a DE-29 |a DE-12 |a DE-91 |a DE-19 |a DE-1049 |a DE-92 |a DE-739 |a DE-898 |a DE-355 |a DE-706 |a DE-20 |a DE-1102 | ||
084 | |a ST 302 |0 (DE-625)143652: |2 rvk | ||
084 | |a ST 306 |0 (DE-625)143654: |2 rvk | ||
100 | 1 | |a Klügl, Peter |e Verfasser |0 (DE-588)1076176275 |4 aut | |
245 | 1 | 0 | |a Context-specific Consistencies in Information Extraction |b Rule-based and Probabilistic Approaches |c Peter Klügl |
246 | 1 | 3 | |a Kontextspezifische Konsistenzen in der Informationsextraktion: Regelbasierte und Probabilistische Ansätze |
264 | 1 | |a Würzburg |b Würzburg University Press |c 2015 | |
300 | |a XII, 194 Seiten |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
502 | |b Dissertation |c Julius-Maximilians-Universität Würzburg |d 2014 | ||
650 | 0 | 7 | |a Information Extraction |0 (DE-588)4566641-6 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |2 gnd |9 rswk-swf |
655 | 7 | |0 (DE-588)4113937-9 |a Hochschulschrift |2 gnd-content | |
689 | 0 | 0 | |a Information Extraction |0 (DE-588)4566641-6 |D s |
689 | 0 | 1 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |D s |
689 | 0 | |5 DE-604 | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |o urn:nbn:de:bvb:20-opus-108352 |z 978-3-95826-019-1 |
856 | 4 | 1 | |u https://nbn-resolving.org/urn:nbn:de:bvb:20-opus-108352 |x Resolving-System |z kostenfrei |3 Volltext |
856 | 4 | 2 | |m DNB Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028231942&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
912 | |a ebook | ||
999 | |a oai:aleph.bib-bvb.de:BVB01-028231942 |
Record in the search index
_version_ | 1804175037911007232 |
---|---|
adam_text | CONTENTS
1 INTRODUCTION 1
1.1 MOTIVATION 2
1.2 GOAL 4
1.3 CONTRIBUTIONS 4
1.4 STRUCTURE OF THIS WORK 9
2 INFORMATION EXTRACTION 11
2.1 FOUNDATIONS 12
2.1.1 DEFINITION 12
2.1.2 HISTORICAL DEVELOPMENT 13
2.1.3 EVALUATION MEASURES 15
2.1.4 ARCHITECTURES 17
2.1.4.1 UIMA 17
2.1.4.2 OTHER ARCHITECTURES 18
2.2 RULE-BASED INFORMATION EXTRACTION 19
2.2.1 RULE LANGUAGES 19
2.2.1.1 CPSL 20
2.2.1.2 JAPE 21
2.2.1.3 SPROUT - XTDL 24
2.2.1.4 AFST 27
2.2.1.5 SYSTEMT - AQL 29
2.2.1.6 OTHER LANGUAGES 30
2.2.2 DEVELOPMENT SUPPORT 31
2.2.3 RULE INDUCTION 31
2.2.3.1 BWI 31
2.2.3.2 CRYSTAL 32
2.2.3.3 LP² 32
2.2.3.4 RAPIER 32
2.2.3.5 SRV 32
2.2.3.6 WHISK 32
2.2.3.7 WIEN 33
2.3 MACHINE LEARNING FOR INFORMATION EXTRACTION 33
2.3.1 ESSENTIALS OF MACHINE LEARNING 33
2.3.2 REPRESENTATION AS A MACHINE LEARNING TASK 35
2.3.2.1 CLASSIFY CANDIDATES 35
2.3.2.2 SLIDING WINDOW 36
2.3.2.3 BOUNDARY MODELS 36
2.3.2.4 FINITE STATE MACHINES 36
HTTP://D-NB.INFO/1075928486
2.3.2.5 WRAPPER INDUCTION 37
2.3.3 CONDITIONAL RANDOM FIELDS 37
2.3.3.1 MODELING 37
2.3.3.2 INFERENCE 39
2.3.3.3 PARAMETER ESTIMATION 41
3 CONTEXT-SPECIFIC CONSISTENCIES 43
3.1 CHARACTERISTICS 44
3.2 DOMAINS 46
3.2.1 REFERENCE SECTIONS 46
3.2.1.1 INFORMATION EXTRACTION TASK 47
3.2.1.2 APPLICATIONS 48
3.2.1.3 RELATED WORK 48
3.2.1.4 ASPECTS OF CONTEXT-SPECIFIC CONSISTENCIES 50
3.2.2 CURRICULA VITAE 51
3.2.2.1 INFORMATION EXTRACTION TASK 52
3.2.2.2 APPLICATIONS 53
3.2.2.3 RELATED WORK 54
3.2.2.4 ASPECTS OF CONTEXT-SPECIFIC CONSISTENCIES 55
3.2.3 CLINICAL DISCHARGE LETTERS 55
3.2.3.1 INFORMATION EXTRACTION TASK 58
3.2.3.2 APPLICATIONS 58
3.2.3.3 RELATED WORK 59
3.2.3.4 ASPECTS OF CONTEXT-SPECIFIC CONSISTENCIES 60
3.2.4 OTHER DOMAINS 61
3.3 EXPLOITING CONTEXT-SPECIFIC CONSISTENCIES 62
3.4 RELATED WORK 63
3.4.1 CONTEXT-SPECIFIC CONSISTENCIES 63
3.4.1.1 LEARNING WITH SCOPE 63
3.4.1.2 THE REFPARSE ALGORITHM 66
3.4.1.3 PROPERTIES-BASED COLLECTIVE INFERENCE 67
3.4.1.4 EXPLOITING CONTENT REDUNDANCY 70
3.4.1.5 OTHER PUBLICATIONS 71
3.4.2 COLLECTIVE INFORMATION EXTRACTION 71
4 UIMA RUTA 73
4.1 INTRODUCTION 73
4.1.1 HISTORY AND CURRENT STATE 74
4.2 THE RULE-BASED SCRIPTING LANGUAGE 75
4.2.1 PROVIDED ANNOTATION TYPES 75
4.2.2 SYNTAX AND SEMANTICS 75
4.2.2.1 SCRIPT DEFINITION 77
4.2.2.2 RULE DEFINITION 78
4.2.2.3 EXTENSIBLE LANGUAGE DEFINITION 82
4.2.3 INFERENCE 82
4.2.3.1 RULE EXECUTION 82
4.2.3.2 RULE MATCHING 84
4.2.3.3 BEYOND SEQUENTIAL MATCHING 88
4.2.4 VISIBILITY AND FILTERING 89
4.2.5 BLOCKS AND INLINED RULES 90
4.2.6 ENGINEERING APPROACHES 92
4.2.6.1 CLASSICAL APPROACHES 92
4.2.6.2 TRANSFORMATION-BASED RULES 93
4.2.6.3 SCORING RULES 93
4.2.7 EXEMPLARY SCRIPT 95
4.3 DEVELOPMENT ENVIRONMENT AND TOOLING 99
4.3.1 BASIC DEVELOPMENT SUPPORT 100
4.3.2 EXPLANATION OF RULE EXECUTION 102
4.3.3 INTROSPECTION BY QUERYING 103
4.3.4 AUTOMATIC VALIDATION 104
4.3.5 CONSTRAINT-DRIVEN EVALUATION 106
4.3.6 SUPERVISED RULE INDUCTION 109
4.3.7 SEMI-AUTOMATIC CREATION OF GOLD DOCUMENTS 110
4.4 COMPARISON TO RELATED SYSTEMS 111
5 KNOWLEDGE ENGINEERING APPROACHES 117
5.1 IMPROVING RECALL IN PRECISION-DRIVEN PROTOTYPING 118
5.1.1 RULE SETS 118
5.1.2 EXPERIMENTAL RESULTS 120
5.2 STACKED TRANSFORMATIONS 121
5.2.1 RULE SETS 122
5.2.2 EXPERIMENTAL RESULTS 123
5.3 USAGE IN A COMPLETE APPLICATION 125
5.3.1 RULE SETS 126
5.3.1.1 GENERATING CANDIDATES 126
5.3.1.2 PROPERTIES OF HEADLINES 128
5.3.1.3 SCORE-BASED APPROACH 128
5.3.1.4 KEYWORD-BASED APPROACH 129
5.3.1.5 CONSISTENCY-BASED APPROACH 130
5.3.1.6 CORRECTION-BASED APPROACH 130
5.3.2 EXPERIMENTAL RESULTS 130
5.4 DISCUSSION 133
6 MACHINE LEARNING APPROACHES 135
6.1 LEARNING CONTEXT-SPECIFIC CONSISTENCIES 135
6.1.1 MODELING CONSISTENCIES WITH CLASSIFIERS 136
6.1.1.1 DETERMINE TYPE OF DESCRIPTION FOR CONSISTENCIES 136
6.1.1.2 SELECT CLASSIFIER FOR LEARNING CONSISTENCIES 138
6.1.1.3 PROVIDE PREDICTION OF ENTITIES 140
6.1.1.4 CREATE DATASET FOR CLASSIFIER 141
6.1.1.5 LEARN CLASSIFIERS ON DATASET 142
6.1.1.6 APPLY CLASSIFIERS ON DATASET 142
6.1.2 EXAMPLE 143
6.1.3 EXPERIMENTAL RESULTS 145
6.1.3.1 RANDOM SYNTHETIC ERRORS 146
6.1.3.2 REALISTIC PREDICTION 153
6.2 STACKED CONDITIONAL RANDOM FIELDS 154
6.2.1 STACKED INFERENCE WITH CONSISTENCIES 155
6.2.2 PARAMETER ESTIMATION 157
6.2.3 EXPERIMENTAL RESULTS 158
6.2.3.1 DATASETS 158
6.2.3.2 IMPLEMENTATION DETAILS 159
6.2.3.3 RESULTS 159
6.3 TOWARDS HIGHER-ORDER MODELS 161
6.3.1 COMB-CHAIN CRFS 162
6.3.2 SKYP-CHAIN CRFS 163
6.3.3 PARAMETER ESTIMATION AND INFERENCE 165
6.3.4 EXPERIMENTAL RESULTS 165
6.3.4.1 DATASETS 165
6.3.4.2 SETTINGS 166
6.3.4.3 RESULTS 166
6.4 DISCUSSION 166
7 CONCLUSION 169
7.1 SUMMARY 169
7.2 OUTLOOK 173
BIBLIOGRAPHY 177
|
any_adam_object | 1 |
author | Klügl, Peter |
author_GND | (DE-588)1076176275 |
author_facet | Klügl, Peter |
author_role | aut |
author_sort | Klügl, Peter |
author_variant | p k pk |
building | Verbundindex |
bvnumber | BV042802270 |
classification_rvk | ST 302 ST 306 |
collection | ebook |
ctrlnum | (OCoLC)922033880 (DE-599)BVBBV042802270 |
discipline | Informatik |
format | Thesis Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02065nam a2200421 c 4500</leader><controlfield tag="001">BV042802270</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20190226 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">150904s2015 a||| m||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9783958260184</subfield><subfield code="9">978-3-95826-018-4</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)922033880</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV042802270</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-384</subfield><subfield code="a">DE-473</subfield><subfield code="a">DE-703</subfield><subfield code="a">DE-1051</subfield><subfield code="a">DE-824</subfield><subfield code="a">DE-29</subfield><subfield code="a">DE-12</subfield><subfield code="a">DE-91</subfield><subfield code="a">DE-19</subfield><subfield code="a">DE-1049</subfield><subfield code="a">DE-92</subfield><subfield code="a">DE-739</subfield><subfield code="a">DE-898</subfield><subfield code="a">DE-355</subfield><subfield code="a">DE-706</subfield><subfield code="a">DE-20</subfield><subfield code="a">DE-1102</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 302</subfield><subfield code="0">(DE-625)143652:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 306</subfield><subfield code="0">(DE-625)143654:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" 
ind1="1" ind2=" "><subfield code="a">Klügl, Peter</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1076176275</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Context-specific Consistencies in Information Extraction</subfield><subfield code="b">Rule-based and Probabilistic Approaches</subfield><subfield code="c">Peter Klügl</subfield></datafield><datafield tag="246" ind1="1" ind2="3"><subfield code="a">Kontextspezifische Konsistenzen in der Informationsextraktion: Regelbasierte und Probabilistische Ansätze</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Würzburg</subfield><subfield code="b">Würzburg University Press</subfield><subfield code="c">2015</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XII, 194 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="502" ind1=" " ind2=" "><subfield code="b">Dissertation</subfield><subfield code="c">Julius-Maximilians-Universität Würzburg</subfield><subfield code="d">2014</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Information Extraction</subfield><subfield code="0">(DE-588)4566641-6</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="655" ind1=" " 
ind2="7"><subfield code="0">(DE-588)4113937-9</subfield><subfield code="a">Hochschulschrift</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Information Extraction</subfield><subfield code="0">(DE-588)4566641-6</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="o">urn:nbn:de:bvb:20-opus-108352</subfield><subfield code="z">978-3-95826-019-1</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://nbn-resolving.org/urn:nbn:de:bvb:20-opus-108352</subfield><subfield code="x">Resolving-System</subfield><subfield code="z">kostenfrei</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">DNB Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028231942&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ebook</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-028231942</subfield></datafield></record></collection> |
genre | (DE-588)4113937-9 Hochschulschrift gnd-content |
genre_facet | Hochschulschrift |
id | DE-604.BV042802270 |
illustrated | Illustrated |
indexdate | 2024-07-10T07:09:55Z |
institution | BVB |
isbn | 9783958260184 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-028231942 |
oclc_num | 922033880 |
open_access_boolean | 1 |
owner | DE-384 DE-473 DE-BY-UBG DE-703 DE-1051 DE-824 DE-29 DE-12 DE-91 DE-BY-TUM DE-19 DE-BY-UBM DE-1049 DE-92 DE-739 DE-898 DE-BY-UBR DE-355 DE-BY-UBR DE-706 DE-20 DE-1102 |
owner_facet | DE-384 DE-473 DE-BY-UBG DE-703 DE-1051 DE-824 DE-29 DE-12 DE-91 DE-BY-TUM DE-19 DE-BY-UBM DE-1049 DE-92 DE-739 DE-898 DE-BY-UBR DE-355 DE-BY-UBR DE-706 DE-20 DE-1102 |
physical | XII, 194 Seiten Illustrationen, Diagramme |
psigel | ebook |
publishDate | 2015 |
publishDateSearch | 2015 |
publishDateSort | 2015 |
publisher | Würzburg University Press |
record_format | marc |
spelling | Klügl, Peter Verfasser (DE-588)1076176275 aut Context-specific Consistencies in Information Extraction Rule-based and Probabilistic Approaches Peter Klügl Kontextspezifische Konsistenzen in der Informationsextraktion: Regelbasierte und Probabilistische Ansätze Würzburg Würzburg University Press 2015 XII, 194 Seiten Illustrationen, Diagramme txt rdacontent n rdamedia nc rdacarrier Dissertation Julius-Maximilians-Universität Würzburg 2014 Information Extraction (DE-588)4566641-6 gnd rswk-swf Maschinelles Lernen (DE-588)4193754-5 gnd rswk-swf (DE-588)4113937-9 Hochschulschrift gnd-content Information Extraction (DE-588)4566641-6 s Maschinelles Lernen (DE-588)4193754-5 s DE-604 Erscheint auch als Online-Ausgabe urn:nbn:de:bvb:20-opus-108352 978-3-95826-019-1 https://nbn-resolving.org/urn:nbn:de:bvb:20-opus-108352 Resolving-System kostenfrei Volltext DNB Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028231942&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Klügl, Peter Context-specific Consistencies in Information Extraction Rule-based and Probabilistic Approaches Information Extraction (DE-588)4566641-6 gnd Maschinelles Lernen (DE-588)4193754-5 gnd |
subject_GND | (DE-588)4566641-6 (DE-588)4193754-5 (DE-588)4113937-9 |
title | Context-specific Consistencies in Information Extraction Rule-based and Probabilistic Approaches |
title_alt | Kontextspezifische Konsistenzen in der Informationsextraktion: Regelbasierte und Probabilistische Ansätze |
title_auth | Context-specific Consistencies in Information Extraction Rule-based and Probabilistic Approaches |
title_exact_search | Context-specific Consistencies in Information Extraction Rule-based and Probabilistic Approaches |
title_full | Context-specific Consistencies in Information Extraction Rule-based and Probabilistic Approaches Peter Klügl |
title_fullStr | Context-specific Consistencies in Information Extraction Rule-based and Probabilistic Approaches Peter Klügl |
title_full_unstemmed | Context-specific Consistencies in Information Extraction Rule-based and Probabilistic Approaches Peter Klügl |
title_short | Context-specific Consistencies in Information Extraction |
title_sort | context specific consistencies in information extraction rule based and probabilistic approaches |
title_sub | Rule-based and Probabilistic Approaches |
topic | Information Extraction (DE-588)4566641-6 gnd Maschinelles Lernen (DE-588)4193754-5 gnd |
topic_facet | Information Extraction Maschinelles Lernen Hochschulschrift |
url | https://nbn-resolving.org/urn:nbn:de:bvb:20-opus-108352 http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028231942&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT kluglpeter contextspecificconsistenciesininformationextractionrulebasedandprobabilisticapproaches AT kluglpeter kontextspezifischekonsistenzeninderinformationsextraktionregelbasierteundprobabilistischeansatze |