Language and chronology: text dating by machine learning
Main authors: | Toner, Gregory ; Han, Xiwu |
---|---|
Format: | Book |
Language: | English |
Published: |
Leiden ; Boston
Brill
[2019]
|
Series: | Language and computers
volume 84 |
Subjects: | |
Online access: | Table of contents, Blurb |
Description: | XI, 183 pages, diagrams |
ISBN: | 9789004410039 |
Internal format
MARC
LEADER | 00000nam a2200000 cb4500 | ||
---|---|---|---|
001 | BV046206350 | ||
003 | DE-604 | ||
005 | 20210427 | ||
007 | t | ||
008 | 191021s2019 |||| |||| 10||| eng d | ||
020 | |a 9789004410039 |c (hbk.) |9 978-90-04-41003-9 | ||
035 | |a (OCoLC)1125191078 | ||
035 | |a (DE-599)BVBBV046206350 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-384 |a DE-12 |a DE-19 |a DE-739 |a DE-188 | ||
084 | |a EC 1300 |0 (DE-625)20379: |2 rvk | ||
084 | |a EC 1620 |0 (DE-625)20411: |2 rvk | ||
100 | 1 | |a Toner, Gregory |e Verfasser |0 (DE-588)1046469312 |4 aut | |
245 | 1 | 0 | |a Language and chronology |b text dating by machine learning |c by Gregory Toner, Xiwu Han |
264 | 1 | |a Leiden ; Boston |b Brill |c [2019] | |
264 | 4 | |c © 2019 | |
300 | |a XI, 183 Seiten |b Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 1 | |a Language and computers |v volume 84 | |
650 | 0 | 7 | |a Datenverarbeitung |0 (DE-588)4011152-0 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Datierung |0 (DE-588)4113278-6 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Literatur |0 (DE-588)4035964-5 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Literatur |0 (DE-588)4035964-5 |D s |
689 | 0 | 1 | |a Datierung |0 (DE-588)4113278-6 |D s |
689 | 0 | 2 | |a Datenverarbeitung |0 (DE-588)4011152-0 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Han, Xiwu |e Verfasser |0 (DE-588)1198896744 |4 aut | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |z 978-90-04-41004-6 |w (DE-604)BV046319582 |
830 | 0 | |a Language and computers |v volume 84 |w (DE-604)BV000833947 |9 84 | |
856 | 4 | 2 | |m Digitalisierung UB Augsburg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=031585344&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
856 | 4 | 2 | |m Digitalisierung UB Augsburg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=031585344&sequence=000003&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |3 Klappentext |
999 | |a oai:aleph.bib-bvb.de:BVB01-031585344 |
Record in the search index
_version_ | 1804180593873780736 |
---|---|
adam_text | Contents
List of Figures, Tables and Algorithms
Abbreviations
Introduction
0.1 Automated Dating Methods
0.2 How to Read This Book
1 Dating Texts: Principles and Methods
1.1 Introduction
1.2 Texts by Known Authors
1.3 Internal Evidence
1.4 Manuscripts
1.5 Intertextuality
1.6 Metrics
1.7 Linguistic Dating
1.7.1 Linguistic Strata and Scribal Revision
1.7.2 Dialect
1.7.3 Register
1.7.4 Archaism
1.7.5 Lexical Dating
1.7.6 Methodology
1.8 Conclusion
2 Computational Approaches to Text Dating
2.1 A Brief History
2.1.1 Early Research
2.1.2 Recent Research
2.1.3 DTE Task
2.1.4 Features for Dating
2.1.5 Lazy Method
2.2 The Problem Stated
2.2.1 Problem Formulation
2.2.2 Evaluation Methods
2.3 Previous Solutions
2.3.1 Language Modelling
2.3.2 Ordinal Regression/Ranking
2.3.3 Classification
2.3.4 Feature Selection Methods
2.3.5 Regression vs Classification
2.4 New Solutions
2.4.1 Flexible Time Interval (FTI)
2.4.2 Sliding Time Interval (STI)
2.4.3 Greedy Grouping (GG)
2.4.4 Temporal Landmark Selection (TLS)
2.4.5 Compound Solution of FTI TLS
2.5 Datability
2.6 Conclusion
3 Trials in English and Medieval Irish Texts
3.1 Dating English Texts
3.1.1 Data and Features
3.1.2 Choosing Basic Classifiers
3.1.3 Experiments and Results
3.2 Dating Medieval Irish Texts
3.2.1 The Corpus: the Irish Annals
3.3 Implementation
3.3.1 Data Pre-Processing and Features for Classification
3.3.2 Experiments and Results
3.4 Temporal Parameters
3.5 Datability
3.6 Conclusion
4 Dating Long Documents
4.0 Introduction
4.1 Building a Datable Medieval Irish Corpus
4.2 Dating Long Documents
4.2.1 Test Data and Pre-Processing
4.2.2 Dating Long Documents with the Most Frequent Predict
4.2.3 Dating Long Documents with Multiple Choices
4.2.4 Evaluation of Bias Compensation
4.3 Establishing the Date of Composition
4.3.1 Correlation between the Results and the Accepted Date of Composition
4.3.2 Extending the Range of Texts
4.3.3 Conclusion
4.4 Transmission and Manuscript Dates
4.5 Focussed Dating Predictions
4.5.1 Overall Performance
4.6 Periodisation
4.7 Stratification
4.8 Conclusion
5 Conclusion
5.1 A Temporal Model
5.2 Towards a Tool: Computational Chronometrics
5.3 Applicability to Other Literatures
Appendix A: Conventional Dating of Texts Used in This Study
A.0.1 Book of the Dun Cow
A.0.2 Rawlinson B 502
A.0.3 The Book of Leinster
A.1 Texts
A.1.1 Acallam na Senórach
A.1.2 Aided Derbforgaill
A.1.3 Aided Echach mac Maireda
A.1.4 Aided Guill maic Corbada
A.1.5 Aided Nath Í
A.1.6 Aislinge Óengusso
A.1.7 Aislinge Meic Conglinne
A.1.8 Bethadh Bibuis
A.1.9 Betha Adamnáin
A.1.10 Betha Colmáin
A.1.11 Bethu Brigte
A.1.12 Bórama Laigen
A.1.13 Bruiden Da Choca
A.1.14 Caithréim Thoirdhealbhaigh
A.1.15 Cath Almaine
A.1.16 In Cath Catharda
A.1.17 Cath Maighe Léna
A.1.18 Cath Ruis na Rig
A.1.19 Cogadh Gaedel re Gallaib
A.1.20 De Dosibus Medicorom
A.1.21 Echtra Láegaire
A.1.22 Fingal Rónáin
A.1.23 Guy of Warwick
A.1.24 Genemain Aeda Sláine
A.1.25 Maundeville
A.1.26 Mesca Ulad
A.1.27 Merugud Uilix
A.1.28 Monastery of Tallaght
A.1.29 Regimen na Sláinte
A.1.30 Saltair na Rann
A.1.31 Scél Mucce Maic Dathó
A.1.32 Tain Bó Cúailnge
A.1.33 Tain Bó Fraích
A.1.34 Tochmarc Emire
A.1.35 Treatise on the Psalter
A.1.36 Tucait Indarba na nDéssi
Appendix B: Machine Learning
B.1 Classification, Regression and Clustering
B.1.1 Text Classification
B.1.2 Feature Selection
B.1.3 Training
B.1.4 Evaluation
Other Relevant Statistics
Bibliography
Index
In Language and Chronology, Toner and Han apply innovative Machine Learning techniques to the problem of dating literary texts. Many ancient and medieval literatures lack reliable chronologies which could aid scholars in locating texts in their historical context. The new machine learning method presented here uses chronological information gleaned from annalistic records to date a wide range of texts. The method is also applied to multi-layered texts to aid the identification of different chronological strata within single copies. While the algorithm is here applied to medieval Irish material of the period c.700-c.1700, it can be extended to written texts in any language or alphabet. The authors' approach presents a step change in Digital Humanities, moving us beyond simple querying of electronic texts towards the production of a sophisticated tool for literary and historical studies.
|
any_adam_object | 1 |
author | Toner, Gregory Han, Xiwu |
author_GND | (DE-588)1046469312 (DE-588)1198896744 |
author_facet | Toner, Gregory Han, Xiwu |
author_role | aut aut |
author_sort | Toner, Gregory |
author_variant | g t gt x h xh |
building | Verbundindex |
bvnumber | BV046206350 |
classification_rvk | EC 1300 EC 1620 |
ctrlnum | (OCoLC)1125191078 (DE-599)BVBBV046206350 |
discipline | Literaturwissenschaft |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02114nam a2200445 cb4500</leader><controlfield tag="001">BV046206350</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20210427 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">191021s2019 |||| |||| 10||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9789004410039</subfield><subfield code="c">(hbk.)</subfield><subfield code="9">978-90-04-41003-9</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1125191078</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV046206350</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-384</subfield><subfield code="a">DE-12</subfield><subfield code="a">DE-19</subfield><subfield code="a">DE-739</subfield><subfield code="a">DE-188</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">EC 1300</subfield><subfield code="0">(DE-625)20379:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">EC 1620</subfield><subfield code="0">(DE-625)20411:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Toner, Gregory</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1046469312</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Language and chronology</subfield><subfield code="b">text dating by machine learning</subfield><subfield code="c">by Gregory Toner, Xiwu 
Han</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Leiden ; Boston</subfield><subfield code="b">Brill</subfield><subfield code="c">[2019]</subfield></datafield><datafield tag="264" ind1=" " ind2="4"><subfield code="c">© 2019</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XI, 183 Seiten</subfield><subfield code="b">Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="1" ind2=" "><subfield code="a">Language and computers</subfield><subfield code="v">volume 84</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenverarbeitung</subfield><subfield code="0">(DE-588)4011152-0</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datierung</subfield><subfield code="0">(DE-588)4113278-6</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Literatur</subfield><subfield code="0">(DE-588)4035964-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Literatur</subfield><subfield code="0">(DE-588)4035964-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Datierung</subfield><subfield code="0">(DE-588)4113278-6</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Datenverarbeitung</subfield><subfield 
code="0">(DE-588)4011152-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Han, Xiwu</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1198896744</subfield><subfield code="4">aut</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="z">978-90-04-41004-6</subfield><subfield code="w">(DE-604)BV046319582</subfield></datafield><datafield tag="830" ind1=" " ind2="0"><subfield code="a">Language and computers</subfield><subfield code="v">volume 84</subfield><subfield code="w">(DE-604)BV000833947</subfield><subfield code="9">84</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Augsburg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=031585344&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Augsburg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=031585344&sequence=000003&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Klappentext</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-031585344</subfield></datafield></record></collection> |
id | DE-604.BV046206350 |
illustrated | Not Illustrated |
indexdate | 2024-07-10T08:38:14Z |
institution | BVB |
isbn | 9789004410039 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-031585344 |
oclc_num | 1125191078 |
open_access_boolean | |
owner | DE-384 DE-12 DE-19 DE-BY-UBM DE-739 DE-188 |
owner_facet | DE-384 DE-12 DE-19 DE-BY-UBM DE-739 DE-188 |
physical | XI, 183 Seiten Diagramme |
publishDate | 2019 |
publishDateSearch | 2019 |
publishDateSort | 2019 |
publisher | Brill |
record_format | marc |
series | Language and computers |
series2 | Language and computers |
spelling | Toner, Gregory Verfasser (DE-588)1046469312 aut Language and chronology text dating by machine learning by Gregory Toner, Xiwu Han Leiden ; Boston Brill [2019] © 2019 XI, 183 Seiten Diagramme txt rdacontent n rdamedia nc rdacarrier Language and computers volume 84 Datenverarbeitung (DE-588)4011152-0 gnd rswk-swf Datierung (DE-588)4113278-6 gnd rswk-swf Literatur (DE-588)4035964-5 gnd rswk-swf Literatur (DE-588)4035964-5 s Datierung (DE-588)4113278-6 s Datenverarbeitung (DE-588)4011152-0 s DE-604 Han, Xiwu Verfasser (DE-588)1198896744 aut Erscheint auch als Online-Ausgabe 978-90-04-41004-6 (DE-604)BV046319582 Language and computers volume 84 (DE-604)BV000833947 84 Digitalisierung UB Augsburg - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=031585344&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis Digitalisierung UB Augsburg - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=031585344&sequence=000003&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA Klappentext |
spellingShingle | Toner, Gregory Han, Xiwu Language and chronology text dating by machine learning Language and computers Datenverarbeitung (DE-588)4011152-0 gnd Datierung (DE-588)4113278-6 gnd Literatur (DE-588)4035964-5 gnd |
subject_GND | (DE-588)4011152-0 (DE-588)4113278-6 (DE-588)4035964-5 |
title | Language and chronology text dating by machine learning |
title_auth | Language and chronology text dating by machine learning |
title_exact_search | Language and chronology text dating by machine learning |
title_full | Language and chronology text dating by machine learning by Gregory Toner, Xiwu Han |
title_fullStr | Language and chronology text dating by machine learning by Gregory Toner, Xiwu Han |
title_full_unstemmed | Language and chronology text dating by machine learning by Gregory Toner, Xiwu Han |
title_short | Language and chronology |
title_sort | language and chronology text dating by machine learning |
title_sub | text dating by machine learning |
topic | Datenverarbeitung (DE-588)4011152-0 gnd Datierung (DE-588)4113278-6 gnd Literatur (DE-588)4035964-5 gnd |
topic_facet | Datenverarbeitung Datierung Literatur |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=031585344&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=031585344&sequence=000003&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |
volume_link | (DE-604)BV000833947 |
work_keys_str_mv | AT tonergregory languageandchronologytextdatingbymachinelearning AT hanxiwu languageandchronologytextdatingbymachinelearning |