Mastering natural language processing with Python: maximize your NLP capabilities while creating amazing NLP projects in Python
Main authors: | Chopra, Deepti; Mathur, Iti; Joshi, Nisheeth |
---|---|
Format: | Book |
Language: | English |
Published: | Birmingham: Packt Publishing, June 2016 |
Subjects: | |
Online access: | Table of contents, Blurb |
Description: | Includes index |
Physical description: | viii, 222 pages, illustrations |
ISBN: | 9781783989041 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV043661999 | ||
003 | DE-604 | ||
005 | 20170224 | ||
007 | t | ||
008 | 160708s2016 a||| |||| 00||| eng d | ||
020 | |a 9781783989041 |9 978-1-78398-904-1 | ||
035 | |a (OCoLC)953521496 | ||
035 | |a (DE-599)BVBBV043661999 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-355 |a DE-83 | ||
084 | |a ST 306 |0 (DE-625)143654: |2 rvk | ||
100 | 1 | |a Chopra, Deepti |e Verfasser |4 aut | |
245 | 1 | 0 | |a Mastering natural language processing with Python |b maximize your NLP capabilities while creating amazing NLP projects in Python |c Deepti Chopra, Nisheeth Joshi, Iti Mathur |
264 | 1 | |a Birmingham |b Packt Publishing |c June 2016 | |
300 | |a viii, 222 Seiten |b Illustrationen | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
500 | |a Includes index | ||
650 | 4 | |a Natural language processing (Computer science) | |
650 | 4 | |a Python (Computer program language) | |
650 | 0 | 7 | |a Computerlinguistik |0 (DE-588)4035843-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Natürliche Sprache |0 (DE-588)4041354-8 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Python |g Programmiersprache |0 (DE-588)4434275-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Sprachverarbeitung |0 (DE-588)4116579-2 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Natürliche Sprache |0 (DE-588)4041354-8 |D s |
689 | 0 | 1 | |a Sprachverarbeitung |0 (DE-588)4116579-2 |D s |
689 | 0 | 2 | |a Computerlinguistik |0 (DE-588)4035843-4 |D s |
689 | 0 | 3 | |a Python |g Programmiersprache |0 (DE-588)4434275-5 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Mathur, Iti |e Verfasser |0 (DE-588)1109098308 |4 aut | |
700 | 1 | |a Joshi, Nisheeth |e Verfasser |0 (DE-588)1109098537 |4 aut | |
856 | 4 | 2 | |m Digitalisierung UB Regensburg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029075335&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
856 | 4 | 2 | |m Digitalisierung UB Regensburg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029075335&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |3 Klappentext |
999 | |a oai:aleph.bib-bvb.de:BVB01-029075335 |
Record in the search index
_version_ | 1804176416884916224 |
---|---|
adam_text | Table of Contents
Preface v
Chapter 1: Working with Strings 1
Tokenization 1
Tokenization of text into sentences 2
Tokenization of text in other languages 2
Tokenization of sentences into words 3
Tokenization using TreebankWordTokenizer 4
Tokenization using regular expressions 5
Normalization 8
Eliminating punctuation 8
Dealing with stop words 9
Calculate stopwords in English 10
Substituting and correcting tokens 10
Replacing words using regular expressions 11
Example of the replacement of a text with another text 12
Performing substitution before tokenization 12
Dealing with repeating characters 12
Example of deleting repeating characters 13
Replacing a word with its synonym 14
Example of substituting word a with its synonym 14
Applying Zipf's law to text 15
Similarity measures 16
Applying similarity measures using the edit distance algorithm 16
Applying similarity measures using Jaccard's Coefficient 18
Applying similarity measures using the Smith Waterman distance 19
Other string similarity metrics 19
Summary 21
Chapter 2: Statistical Language Modeling 23
Understanding word frequency 23
Develop MLE for a given text 27
Hidden Markov Model estimation 35
Applying smoothing on the MLE model 36
Add-one smoothing 36
Good Turing 37
Kneser Ney estimation 43
Witten Bell estimation 43
Develop a back-off mechanism for MLE 44
Applying interpolation on data to get mix and match 44
Evaluate a language model through perplexity 45
Applying metropolis hastings in modeling languages 45
Applying Gibbs sampling in language processing 45
Summary 48
Chapter 3: Morphology - Getting Our Feet Wet 49
Introducing morphology 49
Understanding stemmer 50
Understanding lemmatization 53
Developing a stemmer for non-English language 54
Morphological analyzer 56
Morphological generator 58
Search engine 59
Summary 63
Chapter 4: Parts-of-Speech Tagging - Identifying Words 65
Introducing parts-of-speech tagging 65
Default tagging 70
Creating POS-tagged corpora 71
Selecting a machine learning algorithm 73
Statistical modeling involving the n-gram approach 75
Developing a chunker using pos-tagged corpora 81
Summary 84
Chapter 5: Parsing - Analyzing Training Data 85
Introducing parsing 85
Treebank construction 86
Extracting Context Free Grammar (CFG) rules from Treebank 91
Creating a probabilistic Context Free Grammar from CFG 97
CYK chart parsing algorithm 98
Earley chart parsing algorithm 100
Summary 106
Chapter 6: Semantic Analysis - Meaning Matters 107
Introducing semantic analysis 108
Introducing named entity recognition (NER) 111
A NER system using Hidden Markov Model 115
Training NER using Machine Learning Toolkits 121
NER using POS tagging 122
Generation of the synset id from Wordnet 124
Disambiguating senses using Wordnet 127
Summary 131
Chapter 7: Sentiment Analysis - I Am Happy 133
Introducing sentiment analysis 134
Sentiment analysis using NER 139
Sentiment analysis using machine learning 140
Evaluation of the NER system 146
Summary 164
Chapter 8: Information Retrieval - Accessing Information 165
Introducing information retrieval 165
Stop word removal 166
Information retrieval using a vector space model 168
Vector space scoring and query operator interaction 176
Developing an IR system using latent semantic indexing 178
Text summarization 179
Question-answering system 181
Summary 182
Chapter 9: Discourse Analysis - Knowing Is Believing 183
Introducing discourse analysis 183
Discourse analysis using Centering Theory 190
Anaphora resolution 191
Summary 198
Chapter 10: Evaluation of NLP Systems - Analyzing Performance 199
The need for evaluation of NLP systems 199
Evaluation of NLP tools (POS taggers, stemmers, and morphological analyzers) 200
Parser evaluation using gold data 211
Evaluation of IR system 211
Metrics for error identification 212
Metrics based on lexical matching 213
Metrics based on syntactic matching 217
Metrics using shallow semantic matching 218
Summary 218
Index 219
Mastering Natural Language Processing with Python
Natural Language Processing is one of the fields of
computational linguistics and artificial intelligence that is
concerned with human-computer interaction. It provides
seamless interaction between computers and human beings
and gives computers the ability to understand human speech
with the help of machine learning.
This book will show you how to employ various NLP tasks
in Python, and give you an insight into the best practices
when designing and building NLP-based applications using
Python. It will help you become an expert in no time and help
you to create your own NLP projects using NLTK.
You will sequentially be guided through applying machine
learning tools to develop various models. We'll provide
clarity regarding the creation of training data and the
implementation of major NLP applications such as named
entity recognition, question-answering system, discourse
analysis, transliteration, word sense disambiguation,
information retrieval, sentiment analysis, text summarization,
and anaphora resolution.
Who this book is written for
This book is for intermediate level developers in NLP with a
reasonable knowledge level and understanding of Python.
|
any_adam_object | 1 |
author | Chopra, Deepti Mathur, Iti Joshi, Nisheeth |
author_GND | (DE-588)1109098308 (DE-588)1109098537 |
author_facet | Chopra, Deepti Mathur, Iti Joshi, Nisheeth |
author_role | aut aut aut |
author_sort | Chopra, Deepti |
author_variant | d c dc i m im n j nj |
building | Verbundindex |
bvnumber | BV043661999 |
classification_rvk | ST 306 |
ctrlnum | (OCoLC)953521496 (DE-599)BVBBV043661999 |
discipline | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02280nam a2200457 c 4500</leader><controlfield tag="001">BV043661999</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20170224 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">160708s2016 a||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781783989041</subfield><subfield code="9">978-1-78398-904-1</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)953521496</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV043661999</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-355</subfield><subfield code="a">DE-83</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 306</subfield><subfield code="0">(DE-625)143654:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Chopra, Deepti</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Mastering natural language processing with Python</subfield><subfield code="b">maximize your NLP capabilities while creating amazing NLP projects in Python</subfield><subfield code="c">Deepti Chopra, Nisheeth Joshi, Iti Mathur</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Birmingham</subfield><subfield code="b">Packt Publishing</subfield><subfield code="c">June 2016</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">viii, 222 
Seiten</subfield><subfield code="b">Illustrationen</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Includes index</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Natural language processing (Computer science)</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Python (Computer program language)</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Natürliche Sprache</subfield><subfield code="0">(DE-588)4041354-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Python</subfield><subfield code="g">Programmiersprache</subfield><subfield code="0">(DE-588)4434275-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Sprachverarbeitung</subfield><subfield code="0">(DE-588)4116579-2</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Natürliche Sprache</subfield><subfield code="0">(DE-588)4041354-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Sprachverarbeitung</subfield><subfield 
code="0">(DE-588)4116579-2</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Computerlinguistik</subfield><subfield code="0">(DE-588)4035843-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="3"><subfield code="a">Python</subfield><subfield code="g">Programmiersprache</subfield><subfield code="0">(DE-588)4434275-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Mathur, Iti</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1109098308</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Joshi, Nisheeth</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1109098537</subfield><subfield code="4">aut</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029075335&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029075335&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Klappentext</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-029075335</subfield></datafield></record></collection> |
id | DE-604.BV043661999 |
illustrated | Illustrated |
indexdate | 2024-07-10T07:31:50Z |
institution | BVB |
isbn | 9781783989041 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-029075335 |
oclc_num | 953521496 |
open_access_boolean | |
owner | DE-355 DE-BY-UBR DE-83 |
owner_facet | DE-355 DE-BY-UBR DE-83 |
physical | viii, 222 Seiten Illustrationen |
publishDate | 2016 |
publishDateSearch | 2016 |
publishDateSort | 2016 |
publisher | Packt Publishing |
record_format | marc |
spelling | Chopra, Deepti Verfasser aut Mastering natural language processing with Python maximize your NLP capabilities while creating amazing NLP projects in Python Deepti Chopra, Nisheeth Joshi, Iti Mathur Birmingham Packt Publishing June 2016 viii, 222 Seiten Illustrationen txt rdacontent n rdamedia nc rdacarrier Includes index Natural language processing (Computer science) Python (Computer program language) Computerlinguistik (DE-588)4035843-4 gnd rswk-swf Natürliche Sprache (DE-588)4041354-8 gnd rswk-swf Python Programmiersprache (DE-588)4434275-5 gnd rswk-swf Sprachverarbeitung (DE-588)4116579-2 gnd rswk-swf Natürliche Sprache (DE-588)4041354-8 s Sprachverarbeitung (DE-588)4116579-2 s Computerlinguistik (DE-588)4035843-4 s Python Programmiersprache (DE-588)4434275-5 s DE-604 Mathur, Iti Verfasser (DE-588)1109098308 aut Joshi, Nisheeth Verfasser (DE-588)1109098537 aut Digitalisierung UB Regensburg - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029075335&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis Digitalisierung UB Regensburg - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029075335&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA Klappentext |
spellingShingle | Chopra, Deepti Mathur, Iti Joshi, Nisheeth Mastering natural language processing with Python maximize your NLP capabilities while creating amazing NLP projects in Python Natural language processing (Computer science) Python (Computer program language) Computerlinguistik (DE-588)4035843-4 gnd Natürliche Sprache (DE-588)4041354-8 gnd Python Programmiersprache (DE-588)4434275-5 gnd Sprachverarbeitung (DE-588)4116579-2 gnd |
subject_GND | (DE-588)4035843-4 (DE-588)4041354-8 (DE-588)4434275-5 (DE-588)4116579-2 |
title | Mastering natural language processing with Python maximize your NLP capabilities while creating amazing NLP projects in Python |
title_auth | Mastering natural language processing with Python maximize your NLP capabilities while creating amazing NLP projects in Python |
title_exact_search | Mastering natural language processing with Python maximize your NLP capabilities while creating amazing NLP projects in Python |
title_full | Mastering natural language processing with Python maximize your NLP capabilities while creating amazing NLP projects in Python Deepti Chopra, Nisheeth Joshi, Iti Mathur |
title_fullStr | Mastering natural language processing with Python maximize your NLP capabilities while creating amazing NLP projects in Python Deepti Chopra, Nisheeth Joshi, Iti Mathur |
title_full_unstemmed | Mastering natural language processing with Python maximize your NLP capabilities while creating amazing NLP projects in Python Deepti Chopra, Nisheeth Joshi, Iti Mathur |
title_short | Mastering natural language processing with Python |
title_sort | mastering natural language processing with python maximize your nlp capabilities while creating amazing nlp projects in python |
title_sub | maximize your NLP capabilities while creating amazing NLP projects in Python |
topic | Natural language processing (Computer science) Python (Computer program language) Computerlinguistik (DE-588)4035843-4 gnd Natürliche Sprache (DE-588)4041354-8 gnd Python Programmiersprache (DE-588)4434275-5 gnd Sprachverarbeitung (DE-588)4116579-2 gnd |
topic_facet | Natural language processing (Computer science) Python (Computer program language) Computerlinguistik Natürliche Sprache Python Programmiersprache Sprachverarbeitung |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029075335&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=029075335&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT chopradeepti masteringnaturallanguageprocessingwithpythonmaximizeyournlpcapabilitieswhilecreatingamazingnlpprojectsinpython AT mathuriti masteringnaturallanguageprocessingwithpythonmaximizeyournlpcapabilitieswhilecreatingamazingnlpprojectsinpython AT joshinisheeth masteringnaturallanguageprocessingwithpythonmaximizeyournlpcapabilitieswhilecreatingamazingnlpprojectsinpython |