Literary detective work on the computer
Main author: | Oakes, Michael P. |
---|---|
Format: | Book |
Language: | English |
Published: | Amsterdam [and elsewhere]: Benjamins, 2014 |
Series: | Natural language processing 12 |
Subjects: | Plagiat; Sprachanalyse; Autorschaft; Computerlinguistik; Literatur |
Online access: | http://scans.hebis.de/HEBCGI/show.pl?34084083_toc.pdf Table of contents |
Description: | X, 283 pp., ill., graphs |
ISBN: | 9027249997 9789027249999 9789027270139 |
Internal format
MARC
LEADER | 00000nam a2200000zcb4500 | ||
---|---|---|---|
001 | BV041913981 | ||
003 | DE-604 | ||
005 | 20171221 | ||
007 | t | ||
008 | 140612s2014 ad|| |||| 00||| eng d | ||
010 | |a 2014007366 | ||
020 | |a 9027249997 |9 90-272-4999-7 | ||
020 | |a 9789027249999 |9 978-90-272-4999-9 | ||
020 | |a 9789027270139 |9 978-90-272-7013-9 | ||
035 | |a (OCoLC)884916017 | ||
035 | |a (DE-599)HEB340840838 | ||
040 | |a DE-604 |b ger | ||
041 | 0 | |a eng | |
049 | |a DE-12 |a DE-20 | ||
084 | |a ES 945 |0 (DE-625)27935: |2 rvk | ||
084 | |a 24,1 |2 ssgn | ||
100 | 1 | |a Oakes, Michael P. |e Verfasser |0 (DE-588)1057567302 |4 aut | |
245 | 1 | 0 | |a Literary detective work on the computer |c Michael P. Oakes |
264 | 1 | |a Amsterdam [u.a.] |b Benjamins |c 2014 | |
300 | |a X, 283 S. |b Ill., graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 1 | |a Natural language processing |v 12 | |
650 | 0 | 7 | |a Plagiat |0 (DE-588)4046196-8 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Sprachanalyse |0 (DE-588)4129916-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Autorschaft |0 (DE-588)4130545-0 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Computerlinguistik |0 (DE-588)4035843-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Literatur |0 (DE-588)4035964-5 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Literatur |0 (DE-588)4035964-5 |D s |
689 | 0 | 1 | |a Autorschaft |0 (DE-588)4130545-0 |D s |
689 | 0 | 2 | |a Plagiat |0 (DE-588)4046196-8 |D s |
689 | 0 | 3 | |a Computerlinguistik |0 (DE-588)4035843-4 |D s |
689 | 0 | 4 | |a Sprachanalyse |0 (DE-588)4129916-4 |D s |
689 | 0 | |5 DE-604 | |
830 | 0 | |a Natural language processing |v 12 |w (DE-604)BV013516598 |9 12 | |
856 | 4 | 2 | |m V:DE-603;B:DE-30 |q application/pdf |u http://scans.hebis.de/HEBCGI/show.pl?34084083_toc.pdf |
856 | 4 | 2 | |m HEBIS Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027357628&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
942 | 1 | 1 | |c 002 |e 22/bsb |
Record in the search index
_version_ | 1805076292915167232 |
---|---|
adam_text |
Literary Detective Work
on the Computer
Michael P. Oakes
University of Wolverhampton
John Benjamins Publishing Company
Amsterdam / Philadelphia
Table of contents
Preface
CHAPTER 1
Author identification
1 Introduction 1
2 Feature selection 5
2.1 Evaluation of feature sets for authorship attribution 8
3 Inter-textual distances 11
3.1 Manhattan distance and Euclidean distance 12
3.2 Labbé and Labbé's measure 14
3.3 Chi-squared distance 15
3.4 The cosine similarity measure 16
3.5 Kullback-Leibler Divergence (KLD) 18
3.6 Burrows' Delta 18
3.7 Evaluation of feature-based measures for inter-textual distance
3.8 Inter-textual distance by semantic similarity 26
3.9 Stemmatology as a measure of inter-textual distance 28
4 Clustering techniques 30
4.1 Introduction to factor analysis 31
4.2 Matrix algebra 35
4.3 Use of matrix algebra for PCA 38
4.4 PCA case studies 44
4.5 Correspondence analysis 45
5 Comparisons of classifiers 47
6 Other tasks related to authorship 50
6.1 Stylochronometry 50
6.2 Affect dictionaries and psychological profiling 53
6.3 Evaluation of author profiling 58
7 Conclusion 58
CHAPTER 2
Plagiarism and spam filtering 59
1 Introduction 59
2 Plagiarism detection software 62
2.1 Collusion and plagiarism, external and intrinsic 63
2.2 Preprocessing of corpora and feature extraction 63
2.3 Sequence comparison and exact match 64
2.4 Source-suspicious document similarity measures 65
2.5 Fingerprinting 66
2.6 Language models 67
2.7 Natural language processing 68
2.8 Intrinsic plagiarism detection 70
2.9 Plagiarism of program code 73
2.10 Distance between translated and original text 74
2.11 Direction of plagiarism 76
2.12 The search engine-based approach used at PAN-13 78
2.13 Case study 1: Hidden influences from printed sources in the Gaelic tales of Duncan and Neil MacDonald 81
2.14 Case study 2: General George Pickett and related writings 83
2.15 Evaluation methods 84
2.16 Conclusion 85
3 Spam filters 86
3.1 Content-based techniques 87
3.2 Building a labeled corpus for training 87
3.3 Exact matching techniques 88
3.4 Rule-based methods 89
3.5 Machine learning 90
3.6 Unsupervised machine learning approaches 92
3.7 Other spam-filtering problems 93
3.8 Evaluation of spam filters 94
3.9 Non-linguistic techniques 94
3.10 Conclusion 97
4 Recommendations for further reading 98
CHAPTER 3
Computer studies of Shakespearean authorship 99
1 Introduction 99
2 Shakespeare, Wilkins and Pericles 101
2.1 Correspondence analysis for Pericles and related texts 105
3 Shakespeare, Fletcher and The Two Noble Kinsmen 108
4 King John 110
5 The Raigne of King Edward III 111
5.1 Neural networks in stylometry 111
5.2 Cusum charts in stylometry 113
5.3 Burrows' Zeta and Iota 116
6 Hand D in Sir Thomas More 118
6.1 Elliott, Valenza and the Earl of Oxford 118
6.2 Elliott and Valenza: Hand D 121
6.3 Bayesian approach to questions of Shakespearian authorship 122
6.4 Bayesian analysis of Shakespeare's second person pronouns 127
6.5 Vocabulary differences, LDA and the authorship of Hand D 130
6.6 Hand D: Conclusions 131
7 The three parts of Henry VI 132
8 Timon of Athens 132
9 The Puritan and A Yorkshire Tragedy 133
10 Arden of Faversham 134
11 Estimation of the extent of Shakespeare's vocabulary and the authorship of the Taylor poem 136
12 The chronology of Shakespeare 141
13 Conclusion 147
CHAPTER 4
Stylometric analysis of religious texts 149
1 Introduction 149
1.1 Overview of the New Testament by correspondence analysis 151
1.2 Q 153
1.3 Luke and Acts 169
1.4 Recent approaches to New Testament stylometry 171
1.5 The Pauline Epistles 175
1.6 Hebrews 188
1.7 The Signs Gospel 188
2 Stylometric analysis of the Book of Mormon 190
3 Stylometric studies of the Qur'an 198
4 Conclusion 206
CHAPTER 5
Computers and decipherment 207
1 Introduction 207
1.1 Differences between cryptography and decipherment 208
1.2 Cryptological techniques for automatic language recognition 209
1.3 Dictionary approaches to language recognition 212
1.4 Sinkov's test 212
1.5 Index of coincidence 213
1.6 The log-likelihood ratio 214
1.7 The chi-squared test statistic 215
1.8 Entropy of language 215
1.9 Zipf's Law and Heaps' Law coefficients 218
1.10 Modal token length 219
1.11 Autocorrelation analysis 220
1.12 Vowel identification 221
2 Rongorongo 224
2.1 History of Rongorongo 224
2.2 Characteristics of Rongorongo 226
2.3 Obstacles to decipherment 227
2.4 Encoding of Rongorongo symbols 227
2.5 The Mamari lunar calendar 228
2.6 Basic statistics of the Rongorongo corpus 228
2.7 Alignment of the Rongorongo corpus 229
2.8 A concordance for Rongorongo 231
2.9 Collocations and collostructions 233
2.10 Classification by genre 234
2.11 Vocabulary richness 237
2.12 Pozdniakov's approach to matching frequency curves 241
3 The Indus Valley texts 243
3.1 Why decipherment of the Indus texts is difficult 243
3.2 Are the Indus texts writing? 244
3.3 Other evidence for the Indus Script being writing 248
3.4 Determining the order of the Markov model 248
3.5 Missing symbols 249
3.6 Text segmentation and the log-likelihood measure 249
3.7 Network analysis of the Indus Signs 251
4 Linear A 252
5 The Phaistos disk 255
6 Iron Age Pictish symbols 256
7 Mayan glyphs 256
8 Conclusion 257
References 259
Index 281 |
any_adam_object | 1 |
author | Oakes, Michael P. |
author_GND | (DE-588)1057567302 |
author_facet | Oakes, Michael P. |
author_role | aut |
author_sort | Oakes, Michael P. |
author_variant | m p o mp mpo |
building | Verbundindex |
bvnumber | BV041913981 |
classification_rvk | ES 945 |
ctrlnum | (OCoLC)884916017 (DE-599)HEB340840838 |
discipline | Sprachwissenschaft Literaturwissenschaft |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>00000nam a2200000zcb4500</leader><controlfield tag="001">BV041913981</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20171221</controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">140612s2014 ad|| |||| 00||| eng d</controlfield><datafield tag="010" ind1=" " ind2=" "><subfield code="a">2014007366</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9027249997</subfield><subfield code="9">90-272-4999-7</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9789027249999</subfield><subfield code="9">978-90-272-4999-9</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9789027270139</subfield><subfield code="9">978-90-272-7013-9</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)884916017</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)HEB340840838</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-12</subfield><subfield code="a">DE-20</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ES 945</subfield><subfield code="0">(DE-625)27935:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">24,1</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Oakes, Michael P.</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1057567302</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield 
code="a">Literary detective work on the computer</subfield><subfield code="c">Michael P. Oakes</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Amsterdam [u.a.]</subfield><subfield code="b">Benjamins</subfield><subfield code="c">2014</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">X, 283 S.</subfield><subfield code="b">Ill., graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="1" ind2=" "><subfield code="a">Natural language processing</subfield><subfield code="v">12</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Plagiat</subfield><subfield code="0">(DE-588)4046196-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Sprachanalyse</subfield><subfield code="0">(DE-588)4129916-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Autorschaft</subfield><subfield code="0">(DE-588)4130545-0</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Literatur</subfield><subfield code="0">(DE-588)4035964-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield 
tag="689" ind1="0" ind2="0"><subfield code="a">Literatur</subfield><subfield code="0">(DE-588)4035964-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Autorschaft</subfield><subfield code="0">(DE-588)4130545-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Plagiat</subfield><subfield code="0">(DE-588)4046196-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="3"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="4"><subfield code="a">Sprachanalyse</subfield><subfield code="0">(DE-588)4129916-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="830" ind1=" " ind2="0"><subfield code="a">Natural language processing</subfield><subfield code="v">12</subfield><subfield code="w">(DE-604)BV013516598</subfield><subfield code="9">12</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">V:DE-603;B:DE-30</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://scans.hebis.de/HEBCGI/show.pl?34084083_toc.pdf</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">HEBIS Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027357628&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="942" ind1="1" ind2="1"><subfield code="c">002</subfield><subfield code="e">22/bsb</subfield></datafield></record></collection> |
id | DE-604.BV041913981 |
illustrated | Illustrated |
indexdate | 2024-07-20T05:54:58Z |
institution | BVB |
isbn | 9027249997 9789027249999 9789027270139 |
language | English |
lccn | 2014007366 |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-027357628 |
oclc_num | 884916017 |
open_access_boolean | |
owner | DE-12 DE-20 |
owner_facet | DE-12 DE-20 |
physical | X, 283 S. Ill., graph. Darst. |
publishDate | 2014 |
publishDateSearch | 2014 |
publishDateSort | 2014 |
publisher | Benjamins |
record_format | marc |
series | Natural language processing |
series2 | Natural language processing |
spelling | Oakes, Michael P. Verfasser (DE-588)1057567302 aut Literary detective work on the computer Michael P. Oakes Amsterdam [u.a.] Benjamins 2014 X, 283 S. Ill., graph. Darst. txt rdacontent n rdamedia nc rdacarrier Natural language processing 12 Plagiat (DE-588)4046196-8 gnd rswk-swf Sprachanalyse (DE-588)4129916-4 gnd rswk-swf Autorschaft (DE-588)4130545-0 gnd rswk-swf Computerlinguistik (DE-588)4035843-4 gnd rswk-swf Literatur (DE-588)4035964-5 gnd rswk-swf Literatur (DE-588)4035964-5 s Autorschaft (DE-588)4130545-0 s Plagiat (DE-588)4046196-8 s Computerlinguistik (DE-588)4035843-4 s Sprachanalyse (DE-588)4129916-4 s DE-604 Natural language processing 12 (DE-604)BV013516598 12 V:DE-603;B:DE-30 application/pdf http://scans.hebis.de/HEBCGI/show.pl?34084083_toc.pdf HEBIS Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027357628&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Oakes, Michael P. Literary detective work on the computer Natural language processing Plagiat (DE-588)4046196-8 gnd Sprachanalyse (DE-588)4129916-4 gnd Autorschaft (DE-588)4130545-0 gnd Computerlinguistik (DE-588)4035843-4 gnd Literatur (DE-588)4035964-5 gnd |
subject_GND | (DE-588)4046196-8 (DE-588)4129916-4 (DE-588)4130545-0 (DE-588)4035843-4 (DE-588)4035964-5 |
title | Literary detective work on the computer |
title_auth | Literary detective work on the computer |
title_exact_search | Literary detective work on the computer |
title_full | Literary detective work on the computer Michael P. Oakes |
title_fullStr | Literary detective work on the computer Michael P. Oakes |
title_full_unstemmed | Literary detective work on the computer Michael P. Oakes |
title_short | Literary detective work on the computer |
title_sort | literary detective work on the computer |
topic | Plagiat (DE-588)4046196-8 gnd Sprachanalyse (DE-588)4129916-4 gnd Autorschaft (DE-588)4130545-0 gnd Computerlinguistik (DE-588)4035843-4 gnd Literatur (DE-588)4035964-5 gnd |
topic_facet | Plagiat Sprachanalyse Autorschaft Computerlinguistik Literatur |
url | http://scans.hebis.de/HEBCGI/show.pl?34084083_toc.pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027357628&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
volume_link | (DE-604)BV013516598 |
work_keys_str_mv | AT oakesmichaelp literarydetectiveworkonthecomputer |