Language Modeling for Information Retrieval:
A statistical language model, or more simply a language model, is a probabilistic mechanism for generating text. Such a definition is general enough to include an endless variety of schemes. However, a distinction should be made between generative models, which can in principle be used to synthesize...
Other authors: | Croft, W. Bruce; Lafferty, John |
---|---|
Format: | Electronic E-Book |
Language: | English |
Published: | Dordrecht: Springer Netherlands, 2003 |
Edition: | 1st ed. 2003 |
Series: | The Information Retrieval Series, 13 |
Subjects: | |
Online access: | UBY01 URL of the original publisher |
Summary: | A statistical language model, or more simply a language model, is a probabilistic mechanism for generating text. Such a definition is general enough to include an endless variety of schemes. However, a distinction should be made between generative models, which can in principle be used to synthesize artificial text, and discriminative techniques to classify text into predefined categories. The first statistical language modeler was Claude Shannon. In exploring the application of his newly founded theory of information to human language, Shannon considered language as a statistical source, and measured how well simple n-gram models predicted or, equivalently, compressed natural text. To do this, he estimated the entropy of English through experiments with human subjects, and also estimated the cross-entropy of the n-gram models on natural text. The ability of language models to be quantitatively evaluated in this way is one of their important virtues. Of course, estimating the true entropy of language is an elusive goal, aiming at many moving targets, since language is so varied and evolves so quickly. Yet fifty years after Shannon's study, language models remain, by all measures, far from the Shannon entropy limit in terms of their predictive power. However, this has not kept them from being useful for a variety of text processing tasks, and moreover can be viewed as encouragement that there is still great room for improvement in statistical language modeling. |
Description: | 1 online resource (XIV, 246 p) |
ISBN: | 9789401701716 |
DOI: | 10.1007/978-94-017-0171-6 |
Internal format
MARC
LEADER | 00000nmm a2200000zcb4500 | ||
---|---|---|---|
001 | BV047064724 | ||
003 | DE-604 | ||
005 | 00000000000000.0 | ||
007 | cr|uuu---uuuuu | ||
008 | 201216s2003 |||| o||u| ||||||eng d | ||
020 | |a 9789401701716 |9 978-94-017-0171-6 | ||
024 | 7 | |a 10.1007/978-94-017-0171-6 |2 doi | |
035 | |a (ZDB-2-SCS)978-94-017-0171-6 | ||
035 | |a (OCoLC)1227476998 | ||
035 | |a (DE-599)BVBBV047064724 | ||
040 | |a DE-604 |b ger |e aacr | ||
041 | 0 | |a eng | |
049 | |a DE-706 | ||
082 | 0 | |a 005.73 |2 23 | |
245 | 1 | 0 | |a Language Modeling for Information Retrieval |c edited by W. Bruce Croft, John Lafferty |
250 | |a 1st ed. 2003 | ||
264 | 1 | |a Dordrecht |b Springer Netherlands |c 2003 | |
300 | |a 1 Online-Ressource (XIV, 246 p) | ||
336 | |b txt |2 rdacontent | ||
337 | |b c |2 rdamedia | ||
338 | |b cr |2 rdacarrier | ||
490 | 0 | |a The Information Retrieval Series |v 13 | |
520 | |a A statistical language model, or more simply a language model, is a probabilistic mechanism for generating text. Such a definition is general enough to include an endless variety of schemes. However, a distinction should be made between generative models, which can in principle be used to synthesize artificial text, and discriminative techniques to classify text into predefined categories. The first statistical language modeler was Claude Shannon. In exploring the application of his newly founded theory of information to human language, Shannon considered language as a statistical source, and measured how well simple n-gram models predicted or, equivalently, compressed natural text. To do this, he estimated the entropy of English through experiments with human subjects, and also estimated the cross-entropy of the n-gram models on natural text. The ability of language models to be quantitatively evaluated in this way is one of their important virtues. Of course, estimating the true entropy of language is an elusive goal, aiming at many moving targets, since language is so varied and evolves so quickly. Yet fifty years after Shannon's study, language models remain, by all measures, far from the Shannon entropy limit in terms of their predictive power. However, this has not kept them from being useful for a variety of text processing tasks, and moreover can be viewed as encouragement that there is still great room for improvement in statistical language modeling. | ||
650 | 4 | |a Data Structures and Information Theory | |
650 | 4 | |a Information Storage and Retrieval | |
650 | 4 | |a Computer Science, general | |
650 | 4 | |a Artificial Intelligence | |
650 | 4 | |a Data structures (Computer science) | |
650 | 4 | |a Information storage and retrieval | |
650 | 4 | |a Computer science | |
650 | 4 | |a Artificial intelligence | |
650 | 0 | 7 | |a Computerlinguistik |0 (DE-588)4035843-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Information Retrieval |0 (DE-588)4072803-1 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Sprache |0 (DE-588)4056449-6 |2 gnd |9 rswk-swf |
655 | 7 | |0 (DE-588)1071861417 |a Konferenzschrift |y 2001 |z Pittsburgh, Pa. |2 gnd-content | |
689 | 0 | 0 | |a Information Retrieval |0 (DE-588)4072803-1 |D s |
689 | 0 | 1 | |a Sprache |0 (DE-588)4056449-6 |D s |
689 | 0 | 2 | |a Computerlinguistik |0 (DE-588)4035843-4 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Croft, W. Bruce |4 edt | |
700 | 1 | |a Lafferty, John |4 edt | |
776 | 0 | 8 | |i Erscheint auch als |n Druck-Ausgabe |z 9789048162635 |
776 | 0 | 8 | |i Erscheint auch als |n Druck-Ausgabe |z 9781402012167 |
776 | 0 | 8 | |i Erscheint auch als |n Druck-Ausgabe |z 9789401701723 |
856 | 4 | 0 | |u https://doi.org/10.1007/978-94-017-0171-6 |x Verlag |z URL des Erstveröffentlichers |3 Volltext |
912 | |a ZDB-2-SCS | ||
940 | 1 | |q ZDB-2-SCS_2000/2004 | |
999 | |a oai:aleph.bib-bvb.de:BVB01-032471836 | ||
966 | e | |u https://doi.org/10.1007/978-94-017-0171-6 |l UBY01 |p ZDB-2-SCS |q ZDB-2-SCS_2000/2004 |x Verlag |3 Volltext |
Record in the search index
_version_ | 1804182063125889024 |
---|---|
adam_txt | |
any_adam_object | |
any_adam_object_boolean | |
author2 | Croft, W. Bruce Lafferty, John |
author2_role | edt edt |
author2_variant | w b c wb wbc j l jl |
author_facet | Croft, W. Bruce Lafferty, John |
building | Verbundindex |
bvnumber | BV047064724 |
collection | ZDB-2-SCS |
ctrlnum | (ZDB-2-SCS)978-94-017-0171-6 (OCoLC)1227476998 (DE-599)BVBBV047064724 |
dewey-full | 005.73 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 005 - Computer programming, programs, data, security |
dewey-raw | 005.73 |
dewey-search | 005.73 |
dewey-sort | 15.73 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
discipline_str_mv | Informatik |
doi_str_mv | 10.1007/978-94-017-0171-6 |
edition | 1st ed. 2003 |
format | Electronic eBook |
genre | (DE-588)1071861417 Konferenzschrift 2001 Pittsburgh, Pa. gnd-content |
genre_facet | Konferenzschrift 2001 Pittsburgh, Pa. |
id | DE-604.BV047064724 |
illustrated | Not Illustrated |
index_date | 2024-07-03T16:12:23Z |
indexdate | 2024-07-10T09:01:35Z |
institution | BVB |
isbn | 9789401701716 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-032471836 |
oclc_num | 1227476998 |
open_access_boolean | |
owner | DE-706 |
owner_facet | DE-706 |
physical | 1 Online-Ressource (XIV, 246 p) |
psigel | ZDB-2-SCS ZDB-2-SCS_2000/2004 ZDB-2-SCS ZDB-2-SCS_2000/2004 |
publishDate | 2003 |
publishDateSearch | 2003 |
publishDateSort | 2003 |
publisher | Springer Netherlands |
record_format | marc |
series2 | The Information Retrieval Series |
spellingShingle | Language Modeling for Information Retrieval Data Structures and Information Theory Information Storage and Retrieval Computer Science, general Artificial Intelligence Data structures (Computer science) Information storage and retrieval Computer science Artificial intelligence Computerlinguistik (DE-588)4035843-4 gnd Information Retrieval (DE-588)4072803-1 gnd Sprache (DE-588)4056449-6 gnd |
subject_GND | (DE-588)4035843-4 (DE-588)4072803-1 (DE-588)4056449-6 (DE-588)1071861417 |
title | Language Modeling for Information Retrieval |
title_auth | Language Modeling for Information Retrieval |
title_exact_search | Language Modeling for Information Retrieval |
title_exact_search_txtP | Language Modeling for Information Retrieval |
title_full | Language Modeling for Information Retrieval edited by W. Bruce Croft, John Lafferty |
title_fullStr | Language Modeling for Information Retrieval edited by W. Bruce Croft, John Lafferty |
title_full_unstemmed | Language Modeling for Information Retrieval edited by W. Bruce Croft, John Lafferty |
title_short | Language Modeling for Information Retrieval |
title_sort | language modeling for information retrieval |
topic | Data Structures and Information Theory Information Storage and Retrieval Computer Science, general Artificial Intelligence Data structures (Computer science) Information storage and retrieval Computer science Artificial intelligence Computerlinguistik (DE-588)4035843-4 gnd Information Retrieval (DE-588)4072803-1 gnd Sprache (DE-588)4056449-6 gnd |
topic_facet | Data Structures and Information Theory Information Storage and Retrieval Computer Science, general Artificial Intelligence Data structures (Computer science) Information storage and retrieval Computer science Artificial intelligence Computerlinguistik Information Retrieval Sprache Konferenzschrift 2001 Pittsburgh, Pa. |
url | https://doi.org/10.1007/978-94-017-0171-6 |
work_keys_str_mv | AT croftwbruce languagemodelingforinformationretrieval AT laffertyjohn languagemodelingforinformationretrieval |