Text mining with machine learning: principles and techniques
Main authors: | Žižka, Jan ; Dařena, František ; Svoboda, Arnošt |
---|---|
Format: | Book |
Language: | English |
Published: | Boca Raton ; London ; New York : CRC Press, [2020] |
Series: | A science publisher book |
Subjects: | Maschinelles Lernen ; Text Mining ; Computerlinguistik |
Online access: | Table of contents |
Summary: | 1. Introduction to Text Mining with Machine Learning -- 2. Introduction to R -- 3. Structured Text Representations -- 4. Classification -- 5. Bayes Classifier -- 6. Nearest Neighbors -- 7. Decision Trees -- 8. Random Forest -- 9. Adaboost -- 10. Support Vector Machines -- 11. Deep Learning -- 12. Clustering -- 13. Word Embeddings -- 14. Feature Selection -- References -- Index -- Color Section "This book provides a perspective on the application of machine learning-based methods in knowledge discovery from natural languages texts. By analysing various data sets, conclusions, which are not normally evident, emerge and can be used for various purposes and applications. The book provides explanations of principles of time-proven machine learning algorithms applied in text mining together with step-by-step demonstrations of how to reveal the semantic contents in real-world datasets using the popular R-language with its implemented machine learning algorithms. The book is not only aimed at IT specialists, but is meant for a wider audience that needs to process big sets of text documents and has basic knowledge of the subject, e.g. e-mail service providers, online shoppers, librarians, etc"-- |
Note: | Bibliography: pages 323-346 |
Physical description: | xii, 351 pages, illustrations, diagrams |
ISBN: | 9781138601826 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV047640007 | ||
003 | DE-604 | ||
005 | 00000000000000.0 | ||
007 | t | ||
008 | 211214s2020 xxua||| |||| 00||| eng d | ||
020 | |a 9781138601826 |c hardback |9 978-1-138-60182-6 | ||
035 | |a (OCoLC)1289761178 | ||
035 | |a (DE-599)BVBBV047640007 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
044 | |a xxu |c XD-US |a xxk |c XA-GB | ||
049 | |a DE-355 | ||
050 | 0 | |a Q325.5 | |
082 | 0 | |a 006.3/12 | |
084 | |a ST 670 |0 (DE-625)143689: |2 rvk | ||
084 | |a MR 2200 |0 (DE-625)123489: |2 rvk | ||
084 | |a 54.64 |2 bkl | ||
084 | |a 06.74 |2 bkl | ||
100 | 1 | |a Žižka, Jan |d 19XX- |e Verfasser |0 (DE-588)1038323231 |4 aut | |
245 | 1 | 0 | |a Text mining with machine learning |b principles and techniques |c Jan Žižka (Machine learning consultant, Brno, Czech Republic), František Dařena (Department of Informatics, Mendel University, Brno, Czech Republic), Arnošt Svoboda (Department of Applied Mathematics & Computer Science, Masaryk University, Brno, Czech Republic) |
264 | 1 | |a Boca Raton ; London ; New York |b CRC Press |c [2020] | |
300 | |a xii, 351 Seiten |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 0 | |a A science publisher book | |
500 | |a Literaturverzeichnis: Seite 323-346 | ||
520 | 3 | |a 1. Introduction to Text Mining with Machine Learning -- 2. Introduction to R -- 3. Structured Text Representations -- 4. Classification -- 5. Bayes Classifier -- 6. Nearest Neighbors -- 7. Decision Trees -- 8. Random Forest -- 9. Adaboost -- 10. Support Vector Machines -- 11. Deep Learning -- 12. Clustering -- 13. Word Embeddings -- 14. Feature Selection -- References -- Index -- Color Section | |
520 | 3 | |a "This book provides a perspective on the application of machine learning-based methods in knowledge discovery from natural languages texts. By analysing various data sets, conclusions, which are not normally evident, emerge and can be used for various purposes and applications. The book provides explanations of principles of time-proven machine learning algorithms applied in text mining together with step-by-step demonstrations of how to reveal the semantic contents in real-world datasets using the popular R-language with its implemented machine learning algorithms. The book is not only aimed at IT specialists, but is meant for a wider audience that needs to process big sets of text documents and has basic knowledge of the subject, e.g. e-mail service providers, online shoppers, librarians, etc"-- | |
650 | 0 | 7 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Text Mining |0 (DE-588)4728093-1 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Computerlinguistik |0 (DE-588)4035843-4 |2 gnd |9 rswk-swf |
653 | 0 | |a Machine learning | |
653 | 0 | |a Computational linguistics | |
653 | 0 | |a Semantics / Data processing | |
689 | 0 | 0 | |a Text Mining |0 (DE-588)4728093-1 |D s |
689 | 0 | 1 | |a Computerlinguistik |0 (DE-588)4035843-4 |D s |
689 | 0 | 2 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Dařena, František |d 1979- |e Verfasser |0 (DE-588)1082235296 |4 aut | |
700 | 1 | |a Svoboda, Arnošt |d 1949- |e Verfasser |0 (DE-588)1225502187 |4 aut | |
775 | 0 | 8 | |i Äquivalent |n Druck-Ausgabe, Paperback |z 978-1-032-08621-7 |w (DE-604)BV047581831 |
856 | 4 | 2 | |m Digitalisierung UB Regensburg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=033024214&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-033024214 |
Record in the search index
_version_ | 1804183092753072128 |
---|---|
adam_text | Contents
Preface
Authors' Biographies
1. Introduction to Text Mining with Machine Learning
  1.1 Introduction
  1.2 Relation of Text Mining to Data Mining
  1.3 The Text Mining Process
  1.4 Machine Learning for Text Mining
    1.4.1 Inductive Machine Learning
  1.5 Three Fundamental Learning Directions
    1.5.1 Supervised Machine Learning
    1.5.2 Unsupervised Machine Learning
    1.5.3 Semi-supervised Machine Learning
  1.6 Big Data
  1.7 About This Book
2. Introduction to R
  2.1 Installing R
  2.2 Running R
  2.3 RStudio
    2.3.1 Projects
    2.3.2 Getting Help
  2.4 Writing and Executing Commands
  2.5 Variables and Data Types
  2.6 Objects in R
    2.6.1 Assignment
    2.6.2 Logical Values
    2.6.3 Numbers
    2.6.4 Character Strings
    2.6.5 Special Values
  2.7 Functions
  2.8 Operators
  2.9 Vectors
    2.9.1 Creating Vectors
    2.9.2 Naming Vector Elements
    2.9.3 Operations with Vectors
    2.9.4 Accessing Vector Elements
  2.10 Matrices and Arrays
  2.11 Lists
  2.12 Factors
  2.13 Data Frames
  2.14 Functions Useful in Machine Learning
  2.15 Flow Control Structures
    2.15.1 Conditional Statement
    2.15.2 Loops
  2.16 Packages
    2.16.1 Installing Packages
    2.16.2 Loading Packages
  2.17 Graphics
3. Structured Text Representations
  3.1 Introduction
  3.2 The Bag-of-Words Model
  3.3 The Limitations of the Bag-of-Words Model
  3.4 Document Features
  3.5 Standardization
  3.6 Texts in Different Encodings
  3.7 Language Identification
  3.8 Tokenization
  3.9 Sentence Detection
  3.10 Filtering Stop Words, Common, and Rare Terms
  3.11 Removing Diacritics
  3.12 Normalization
    3.12.1 Case Folding
    3.12.2 Stemming and Lemmatization
    3.12.3 Spelling Correction
  3.13 Annotation
    3.13.1 Part of Speech Tagging
    3.13.2 Parsing
  3.14 Calculating the Weights in the Bag-of-Words Model
    3.14.1 Local Weights
    3.14.2 Global Weights
    3.14.3 Normalization Factor
  3.15 Common Formats for Storing Structured Data
    3.15.1 Attribute-Relation File Format (ARFF)
    3.15.2 Comma-Separated Values (CSV)
    3.15.3 C5 Format
    3.15.4 Matrix Files for CLUTO
    3.15.5 SVMlight Format
    3.15.6 Reading Data in R
  3.16 A Complex Example
4. Classification
  4.1 Sample Data
  4.2 Selected Algorithms
  4.3 Classifier Quality Measurement
5. Bayes Classifier
  5.1 Introduction
  5.2 Bayes' Theorem
  5.3 Optimal Bayes Classifier
  5.4 Naïve Bayes Classifier
  5.5 Illustrative Example of Naïve Bayes
  5.6 Naïve Bayes Classifier in R
    5.6.1 Running Naïve Bayes Classifier in RStudio
    5.6.2 Testing with an External Dataset
    5.6.3 Testing with 10-Fold Cross-Validation
6. Nearest Neighbors
  6.1 Introduction
  6.2 Similarity as Distance
  6.3 Illustrative Example of k-NN
  6.4 k-NN in R
7. Decision Trees
  7.1 Introduction
  7.2 Entropy Minimization-Based C5 Algorithm
    7.2.1 The Principle of Generating Trees
    7.2.2 Pruning
  7.3 C5 Tree Generator in R
    7.3.1 Generating a Tree
    7.3.2 Information Acquired from C5-Tree
    7.3.3 Using Testing Samples to Assess Tree Accuracy
    7.3.4 Using Cross-Validation to Assess Tree Accuracy
    7.3.5 Generating Decision Rules
8. Random Forest
  8.1 Introduction
    8.1.1 Bootstrap
    8.1.2 Stability and Robustness
    8.1.3 Which Tree Algorithm?
  8.2 Random Forest in R
9. Adaboost
  9.1 Introduction
  9.2 Boosting Principle
  9.3 Adaboost Principle
  9.4 Weak Learners
  9.5 Adaboost in R
10. Support Vector Machines
  10.1 Introduction
  10.2 Support Vector Machines Principles
    10.2.1 Finding Optimal Separation Hyperplane
    10.2.2 Nonlinear Classification and Kernel Functions
    10.2.3 Multiclass SVM Classification
    10.2.4 SVM Summary
  10.3 SVM in R
11. Deep Learning
  11.1 Introduction
  11.2 Artificial Neural Networks
  11.3 Deep Learning in R
12. Clustering
  12.1 Introduction to Clustering
  12.2 Difficulties of Clustering
  12.3 Similarity Measures
    12.3.1 Cosine Similarity
    12.3.2 Euclidean Distance
    12.3.3 Manhattan Distance
    12.3.4 Chebyshev Distance
    12.3.5 Minkowski Distance
    12.3.6 Jaccard Coefficient
  12.4 Types of Clustering Algorithms
    12.4.1 Partitional (Flat) Clustering
    12.4.2 Hierarchical Clustering
    12.4.3 Graph Based Clustering
  12.5 Clustering Criterion Functions
    12.5.1 Internal Criterion Functions
    12.5.2 External Criterion Function
    12.5.3 Hybrid Criterion Functions
    12.5.4 Graph Based Criterion Functions
  12.6 Deciding on the Number of Clusters
  12.7 K-Means
  12.8 K-Medoids
  12.9 Criterion Function Optimization
  12.10 Agglomerative Hierarchical Clustering
  12.11 Scatter-Gather Algorithm
  12.12 Divisive Hierarchical Clustering
  12.13 Constrained Clustering
  12.14 Evaluating Clustering Results
    12.14.1 Metrics Based on Counting Pairs
    12.14.2 Purity
    12.14.3 Entropy
    12.14.4 F-Measure
    12.14.5 Normalized Mutual Information
    12.14.6 Silhouette
    12.14.7 Evaluation Based on Expert Opinion
  12.15 Cluster Labeling
  12.16 A Few Examples
13. Word Embeddings
  13.1 Introduction
  13.2 Determining the Context and Word Similarity
  13.3 Context Windows
  13.4 Computing Word Embeddings
  13.5 Aggregation of Word Vectors
  13.6 An Example
14. Feature Selection
  14.1 Introduction
  14.2 Feature Selection as State Space Search
  14.3 Feature Selection Methods
    14.3.1 Chi Squared (χ2)
    14.3.2 Mutual Information
    14.3.3 Information Gain
  14.4 Term Elimination Based on Frequency
  14.5 Term Strength
  14.6 Term Contribution
  14.7 Entropy-Based Ranking
  14.8 Term Variance
  14.9 An Example
References
Index
|
any_adam_object | 1 |
any_adam_object_boolean | 1 |
author | Žižka, Jan 19XX- Dařena, František 1979- Svoboda, Arnošt 1949- |
author_GND | (DE-588)1038323231 (DE-588)1082235296 (DE-588)1225502187 |
author_facet | Žižka, Jan 19XX- Dařena, František 1979- Svoboda, Arnošt 1949- |
author_role | aut aut aut |
author_sort | Žižka, Jan 19XX- |
author_variant | j ž jž f d fd a s as |
building | Verbundindex |
bvnumber | BV047640007 |
callnumber-first | Q - Science |
callnumber-label | Q325 |
callnumber-raw | Q325.5 |
callnumber-search | Q325.5 |
callnumber-sort | Q 3325.5 |
callnumber-subject | Q - General Science |
classification_rvk | ST 670 MR 2200 |
ctrlnum | (OCoLC)1289761178 (DE-599)BVBBV047640007 |
dewey-full | 006.3/12 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.3/12 |
dewey-search | 006.3/12 |
dewey-sort | 16.3 212 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik Soziologie |
discipline_str_mv | Informatik Soziologie |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>03667nam a2200553 c 4500</leader><controlfield tag="001">BV047640007</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">00000000000000.0</controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">211214s2020 xxua||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781138601826</subfield><subfield code="c">hardback</subfield><subfield code="9">978-1-138-60182-6</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1289761178</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV047640007</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="044" ind1=" " ind2=" "><subfield code="a">xxu</subfield><subfield code="c">XD-US</subfield><subfield code="a">xxk</subfield><subfield code="c">XA-GB</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-355</subfield></datafield><datafield tag="050" ind1=" " ind2="0"><subfield code="a">Q325.5</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">006.3/12</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 670</subfield><subfield code="0">(DE-625)143689:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">MR 2200</subfield><subfield code="0">(DE-625)123489:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">54.64</subfield><subfield code="2">bkl</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield 
code="a">06.74</subfield><subfield code="2">bkl</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Žižka, Jan</subfield><subfield code="d">19XX-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1038323231</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Text mining with machine learning</subfield><subfield code="b">principles and techniques</subfield><subfield code="c">Jan Žižka (Machine learning consultant, Brno, Czech Republic), František Dařena (Department of Informatics, Mendel University, Brno, Czech Republic), Arnošt Svoboda (Department of Applied Mathematics & Computer Science, Masaryk University, Brno, Czech Republic)</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Boca Raton ; London ; New York</subfield><subfield code="b">CRC Press</subfield><subfield code="c">[2020]</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xii, 351 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">A science publisher book</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Literaturverzeichnis: Seite 323-346</subfield></datafield><datafield tag="520" ind1="3" ind2=" "><subfield code="a">1. Introduction to Text Mining with Machine Learning -- 2. Introduction to R -- 3. Structured Text Representations -- 4. Classification -- 5. Bayes Classifier -- 6. Nearest Neighbors -- 7. Decision Trees -- 8. Random Forest -- 9. 
Adaboost -- 10. Support Vector Machines -- 11. Deep Learning -- 12. Clustering -- 13. Word Embeddings -- 14. Feature Selection -- References -- Index -- Color Section</subfield></datafield><datafield tag="520" ind1="3" ind2=" "><subfield code="a">"This book provides a perspective on the application of machine learning-based methods in knowledge discovery from natural languages texts. By analysing various data sets, conclusions, which are not normally evident, emerge and can be used for various purposes and applications. The book provides explanations of principles of time-proven machine learning algorithms applied in text mining together with step-by-step demonstrations of how to reveal the semantic contents in real-world datasets using the popular R-language with its implemented machine learning algorithms. The book is not only aimed at IT specialists, but is meant for a wider audience that needs to process big sets of text documents and has basic knowledge of the subject, e.g. e-mail service providers, online shoppers, librarians, etc"--</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Text Mining</subfield><subfield code="0">(DE-588)4728093-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Machine learning</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Computational linguistics</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Semantics / 
Data processing</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Text Mining</subfield><subfield code="0">(DE-588)4728093-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Dařena, František</subfield><subfield code="d">1979-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1082235296</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Svoboda, Arnošt</subfield><subfield code="d">1949-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1225502187</subfield><subfield code="4">aut</subfield></datafield><datafield tag="775" ind1="0" ind2="8"><subfield code="i">Äquivalent</subfield><subfield code="n">Druck-Ausgabe, Paperback</subfield><subfield code="z">978-1-032-08621-7</subfield><subfield code="w">(DE-604)BV047581831</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=033024214&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-033024214</subfield></datafield></record></collection> |
id | DE-604.BV047640007 |
illustrated | Illustrated |
index_date | 2024-07-03T18:47:43Z |
indexdate | 2024-07-10T09:17:57Z |
institution | BVB |
isbn | 9781138601826 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-033024214 |
oclc_num | 1289761178 |
open_access_boolean | |
owner | DE-355 DE-BY-UBR |
owner_facet | DE-355 DE-BY-UBR |
physical | xii, 351 Seiten Illustrationen, Diagramme |
publishDate | 2020 |
publishDateSearch | 2020 |
publishDateSort | 2020 |
publisher | CRC Press |
record_format | marc |
series2 | A science publisher book |
spelling | Žižka, Jan 19XX- Verfasser (DE-588)1038323231 aut Text mining with machine learning principles and techniques Jan Žižka (Machine learning consultant, Brno, Czech Republic), František Dařena (Department of Informatics, Mendel University, Brno, Czech Republic), Arnošt Svoboda (Department of Applied Mathematics & Computer Science, Masaryk University, Brno, Czech Republic) Boca Raton ; London ; New York CRC Press [2020] xii, 351 Seiten Illustrationen, Diagramme txt rdacontent n rdamedia nc rdacarrier A science publisher book Literaturverzeichnis: Seite 323-346 1. Introduction to Text Mining with Machine Learning -- 2. Introduction to R -- 3. Structured Text Representations -- 4. Classification -- 5. Bayes Classifier -- 6. Nearest Neighbors -- 7. Decision Trees -- 8. Random Forest -- 9. Adaboost -- 10. Support Vector Machines -- 11. Deep Learning -- 12. Clustering -- 13. Word Embeddings -- 14. Feature Selection -- References -- Index -- Color Section "This book provides a perspective on the application of machine learning-based methods in knowledge discovery from natural languages texts. By analysing various data sets, conclusions, which are not normally evident, emerge and can be used for various purposes and applications. The book provides explanations of principles of time-proven machine learning algorithms applied in text mining together with step-by-step demonstrations of how to reveal the semantic contents in real-world datasets using the popular R-language with its implemented machine learning algorithms. The book is not only aimed at IT specialists, but is meant for a wider audience that needs to process big sets of text documents and has basic knowledge of the subject, e.g. 
e-mail service providers, online shoppers, librarians, etc"-- Maschinelles Lernen (DE-588)4193754-5 gnd rswk-swf Text Mining (DE-588)4728093-1 gnd rswk-swf Computerlinguistik (DE-588)4035843-4 gnd rswk-swf Machine learning Computational linguistics Semantics / Data processing Text Mining (DE-588)4728093-1 s Computerlinguistik (DE-588)4035843-4 s Maschinelles Lernen (DE-588)4193754-5 s DE-604 Dařena, František 1979- Verfasser (DE-588)1082235296 aut Svoboda, Arnošt 1949- Verfasser (DE-588)1225502187 aut Äquivalent Druck-Ausgabe, Paperback 978-1-032-08621-7 (DE-604)BV047581831 Digitalisierung UB Regensburg - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=033024214&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Žižka, Jan 19XX- Dařena, František 1979- Svoboda, Arnošt 1949- Text mining with machine learning principles and techniques Maschinelles Lernen (DE-588)4193754-5 gnd Text Mining (DE-588)4728093-1 gnd Computerlinguistik (DE-588)4035843-4 gnd |
subject_GND | (DE-588)4193754-5 (DE-588)4728093-1 (DE-588)4035843-4 |
title | Text mining with machine learning principles and techniques |
title_auth | Text mining with machine learning principles and techniques |
title_exact_search | Text mining with machine learning principles and techniques |
title_exact_search_txtP | Text mining with machine learning principles and techniques |
title_full | Text mining with machine learning principles and techniques Jan Žižka (Machine learning consultant, Brno, Czech Republic), František Dařena (Department of Informatics, Mendel University, Brno, Czech Republic), Arnošt Svoboda (Department of Applied Mathematics & Computer Science, Masaryk University, Brno, Czech Republic) |
title_fullStr | Text mining with machine learning principles and techniques Jan Žižka (Machine learning consultant, Brno, Czech Republic), František Dařena (Department of Informatics, Mendel University, Brno, Czech Republic), Arnošt Svoboda (Department of Applied Mathematics & Computer Science, Masaryk University, Brno, Czech Republic) |
title_full_unstemmed | Text mining with machine learning principles and techniques Jan Žižka (Machine learning consultant, Brno, Czech Republic), František Dařena (Department of Informatics, Mendel University, Brno, Czech Republic), Arnošt Svoboda (Department of Applied Mathematics & Computer Science, Masaryk University, Brno, Czech Republic) |
title_short | Text mining with machine learning |
title_sort | text mining with machine learning principles and techniques |
title_sub | principles and techniques |
topic | Maschinelles Lernen (DE-588)4193754-5 gnd Text Mining (DE-588)4728093-1 gnd Computerlinguistik (DE-588)4035843-4 gnd |
topic_facet | Maschinelles Lernen Text Mining Computerlinguistik |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=033024214&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT zizkajan textminingwithmachinelearningprinciplesandtechniques AT darenafrantisek textminingwithmachinelearningprinciplesandtechniques AT svobodaarnost textminingwithmachinelearningprinciplesandtechniques |