Character n-gram-based sentiment analysis
Main author: | Ziegelmayer, Dominique |
---|---|
Format: | Thesis Book |
Language: | English |
Published: | München: Verl. Dr. Hut, 2014 |
Edition: | 1st ed. |
Series: | Informatik |
Subjects: | Stimmungsbild, Hochschulschrift |
Online access: | Content text, Table of contents |
Description: | 184 pp., diagrams |
ISBN: | 9783843917742 3843917744 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV042205403 | ||
003 | DE-604 | ||
007 | t | ||
008 | 141125s2014 d||| m||| 00||| eng d | ||
015 | |a 14,N46 |2 dnb | ||
016 | 7 | |a 1060413329 |2 DE-101 | |
020 | |a 9783843917742 |c Pb. : EUR 42.00 (DE), EUR 43.20 (AT), sfr 55.90 (freier Pr.) |9 978-3-8439-1774-2 | ||
020 | |a 3843917744 |9 3-8439-1774-4 | ||
024 | 3 | |a 9783843917742 | |
035 | |a (OCoLC)897212229 | ||
035 | |a (DE-599)DNB1060413329 | ||
040 | |a DE-604 |b ger |e rakddb | ||
041 | 0 | |a eng | |
049 | |a DE-12 |a DE-91 | ||
084 | |a 004 |2 sdnb | ||
100 | 1 | |a Ziegelmayer, Dominique |e Verfasser |0 (DE-588)1060462060 |4 aut | |
245 | 1 | 0 | |a Character n-gram-based sentiment analysis |c von Dominique Ziegelmayer |
250 | |a 1. Aufl. | ||
264 | 1 | |a München |b Verl. Dr. Hut |c 2014 | |
300 | |a 184 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 0 | |a Informatik | |
502 | |a Zugl.: Köln, Univ., Diss., 2014 | ||
650 | 0 | 7 | |a Stimmungsbild |0 (DE-588)4662876-9 |2 gnd |9 rswk-swf |
655 | 7 | |0 (DE-588)4113937-9 |a Hochschulschrift |2 gnd-content | |
689 | 0 | 0 | |a Stimmungsbild |0 (DE-588)4662876-9 |D s |
689 | 0 | |5 DE-604 | |
856 | 4 | 2 | |m X:MVB |q text/html |u http://deposit.dnb.de/cgi-bin/dokserv?id=4815617&prov=M&dok_var=1&dok_ext=htm |3 Inhaltstext |
856 | 4 | 2 | |m HBZ Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027644197&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
943 | 1 | |a oai:aleph.bib-bvb.de:BVB01-027644197 |
Record in the search index
_version_ | 1806329765853921280 |
---|---|
adam_text |
Title: Character n-gram-based sentiment analysis
Author: Ziegelmayer, Dominique
Year: 2014
Contents
1 Introduction
1.1 Motivation
1.2 Outline
I Background
2 Theoretical foundations
2.1 Text classification
2.1.1 Representations for text
2.1.2 Features
2.1.3 Feature selection
2.1.4 Standard approaches
2.2 Sentiment analysis
2.2.1 Problem definitions
2.2.2 Subtasks of sentiment analysis
2.2.3 Levels of sentiment analysis
2.2.4 Sentiment polarity classification
2.2.5 Approaches to sentiment classification
2.3 Statistical data compression
2.3.1 Lossless compression algorithms
2.3.2 Text classification using compression models
3 Literature review
3.1 Forerunners of sentiment analysis
3.2 Sentiment classification on the rise
3.3 Approaches to Sentiment classification
3.3.1 SVM-based approaches
3.3.2 Other approaches
3.4 Related problems
3.5 Character n-gram-based classification
3.6 Discussion
II Character n-gram based classification
4 Corpus creation and analysis
4.1 Language statistics
4.1.1 Length distributions
4.1.2 Spelling accuracy
4.1.3 Percentage of rare words
4.1.4 Type-token ratio
4.1.5 Corpus homogeneity
4.1.6 Domain similarity
4.1.7 Normalizing the sample size
4.2 Standard corpora
4.2.1 IMDb corpus
4.2.2 Amazon corpus
4.2.3 Twitter corpus
4.3 Multi-Domain corpora
4.3.1 Electronics
4.3.2 DVDs
4.3.3 Books
4.3.4 Kitchens
4.3.5 Domain similarity
4.4 Foreign language corpora
4.4.1 German corpus
4.4.2 Czech corpus
5 Methodology and base results
5.1 Classifiers and experimental design
5.1.1 SVM baseline
5.1.2 Compression-based classification
5.1.3 Cross-entropy measures
5.1.4 Evaluation
5.2 Results
5.2.1 IMDb corpus
5.2.2 Amazon corpus
5.2.3 Twitter corpus
5.3 Learning from domain complexity
5.4 Discussion
6 Feature selection and weighting
6.1 Feature selection
6.1.1 Probability estimation and smoothing
6.1.2 Feature selection metrics
6.2 Weighting schemes
6.3 Evaluation
6.4 Results
6.4.1 IMDb corpus
6.4.2 Amazon corpus
6.4.3 Twitter corpus
6.5 Discussion
III Advantages and Limitations
7 Cross-Domain classification
7.1 Approach
7.2 Results
7.3 Transfer prediction
7.4 Discussion
8 Character n-gram-based methods in foreign languages
8.1 Approach
8.2 Results
8.3 Revisiting correlations
8.4 Discussion
9 Rating prediction and regression
9.1 Preliminary analysis
9.2 Approach
9.3 Results
9.4 Discussion
10 Conclusion
10.1 Scientific contribution
10.2 Further research
A List of symbols
B Detailed baseline results
C Feature selection and weighting
D Foreign languages
References
Index |
any_adam_object | 1 |
author | Ziegelmayer, Dominique |
author_GND | (DE-588)1060462060 |
author_facet | Ziegelmayer, Dominique |
author_role | aut |
author_sort | Ziegelmayer, Dominique |
author_variant | d z dz |
building | Verbundindex |
bvnumber | BV042205403 |
ctrlnum | (OCoLC)897212229 (DE-599)DNB1060413329 |
discipline | Informatik |
edition | 1. Aufl. |
format | Thesis Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>00000nam a2200000 c 4500</leader><controlfield tag="001">BV042205403</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">141125s2014 d||| m||| 00||| eng d</controlfield><datafield tag="015" ind1=" " ind2=" "><subfield code="a">14,N46</subfield><subfield code="2">dnb</subfield></datafield><datafield tag="016" ind1="7" ind2=" "><subfield code="a">1060413329</subfield><subfield code="2">DE-101</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9783843917742</subfield><subfield code="c">Pb. : EUR 42.00 (DE), EUR 43.20 (AT), sfr 55.90 (freier Pr.)</subfield><subfield code="9">978-3-8439-1774-2</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">3843917744</subfield><subfield code="9">3-8439-1774-4</subfield></datafield><datafield tag="024" ind1="3" ind2=" "><subfield code="a">9783843917742</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)897212229</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)DNB1060413329</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakddb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-12</subfield><subfield code="a">DE-91</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">004</subfield><subfield code="2">sdnb</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Ziegelmayer, Dominique</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1060462060</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" 
ind2="0"><subfield code="a">Character n-gram-based sentiment analysis</subfield><subfield code="c">von Dominique Ziegelmayer</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">1. Aufl.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">München</subfield><subfield code="b">Verl. Dr. Hut</subfield><subfield code="c">2014</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">184 S.</subfield><subfield code="b">graph. Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Informatik</subfield></datafield><datafield tag="502" ind1=" " ind2=" "><subfield code="a">Zugl.: Köln, Univ., Diss., 2014</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Stimmungsbild</subfield><subfield code="0">(DE-588)4662876-9</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4113937-9</subfield><subfield code="a">Hochschulschrift</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Stimmungsbild</subfield><subfield code="0">(DE-588)4662876-9</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">X:MVB</subfield><subfield code="q">text/html</subfield><subfield code="u">http://deposit.dnb.de/cgi-bin/dokserv?id=4815617&prov=M&dok_var=1&dok_ext=htm</subfield><subfield 
code="3">Inhaltstext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">HBZ Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027644197&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-027644197</subfield></datafield></record></collection> |
genre | (DE-588)4113937-9 Hochschulschrift gnd-content |
genre_facet | Hochschulschrift |
id | DE-604.BV042205403 |
illustrated | Illustrated |
indexdate | 2024-08-03T01:58:24Z |
institution | BVB |
isbn | 9783843917742 3843917744 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-027644197 |
oclc_num | 897212229 |
open_access_boolean | |
owner | DE-12 DE-91 DE-BY-TUM |
owner_facet | DE-12 DE-91 DE-BY-TUM |
physical | 184 S. graph. Darst. |
publishDate | 2014 |
publishDateSearch | 2014 |
publishDateSort | 2014 |
publisher | Verl. Dr. Hut |
record_format | marc |
series2 | Informatik |
spelling | Ziegelmayer, Dominique Verfasser (DE-588)1060462060 aut Character n-gram-based sentiment analysis von Dominique Ziegelmayer 1. Aufl. München Verl. Dr. Hut 2014 184 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Informatik Zugl.: Köln, Univ., Diss., 2014 Stimmungsbild (DE-588)4662876-9 gnd rswk-swf (DE-588)4113937-9 Hochschulschrift gnd-content Stimmungsbild (DE-588)4662876-9 s DE-604 X:MVB text/html http://deposit.dnb.de/cgi-bin/dokserv?id=4815617&prov=M&dok_var=1&dok_ext=htm Inhaltstext HBZ Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027644197&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Ziegelmayer, Dominique Character n-gram-based sentiment analysis Stimmungsbild (DE-588)4662876-9 gnd |
subject_GND | (DE-588)4662876-9 (DE-588)4113937-9 |
title | Character n-gram-based sentiment analysis |
title_auth | Character n-gram-based sentiment analysis |
title_exact_search | Character n-gram-based sentiment analysis |
title_full | Character n-gram-based sentiment analysis von Dominique Ziegelmayer |
title_fullStr | Character n-gram-based sentiment analysis von Dominique Ziegelmayer |
title_full_unstemmed | Character n-gram-based sentiment analysis von Dominique Ziegelmayer |
title_short | Character n-gram-based sentiment analysis |
title_sort | character n gram based sentiment analysis |
topic | Stimmungsbild (DE-588)4662876-9 gnd |
topic_facet | Stimmungsbild Hochschulschrift |
url | http://deposit.dnb.de/cgi-bin/dokserv?id=4815617&prov=M&dok_var=1&dok_ext=htm http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027644197&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT ziegelmayerdominique characterngrambasedsentimentanalysis |