Information retrieval models: foundations and relationships
Main author: | Roelleke, Thomas |
---|---|
Format: | Book |
Language: | English |
Published: | San Rafael: Morgan & Claypool, 2013 |
Series: | Synthesis lectures on information concepts, retrieval, and services; 27 |
Subjects: | Information Retrieval |
Online access: | Table of contents |
Description: | XXI, 141 pp. |
Internal format
MARC
LEADER | 00000nam a2200000 cb4500 | ||
---|---|---|---|
001 | BV041369015 | ||
003 | DE-604 | ||
005 | 20180208 | ||
007 | t | ||
008 | 131021s2013 |||| 00||| eng d | ||
020 | |z 9781627050784 |9 978-1-62705-078-4 | ||
035 | |a (OCoLC)867142568 | ||
035 | |a (DE-599)BVBBV041369015 | ||
040 | |a DE-604 |b ger |e rakwb | ||
041 | 0 | |a eng | |
049 | |a DE-12 | ||
084 | |a 24,1 |2 ssgn | ||
100 | 1 | |a Roelleke, Thomas |e Verfasser |4 aut | |
245 | 1 | 0 | |a Information retrieval models |b foundations and relationships |
264 | 1 | |a San Rafael |b Morgan & Claypool |c 2013 | |
300 | |a XXI, 141 S. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 1 | |a Synthesis lectures on information concepts, retrieval, and services |v 27 | |
650 | 0 | 7 | |a Information Retrieval |0 (DE-588)4072803-1 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Information Retrieval |0 (DE-588)4072803-1 |D s |
689 | 0 | |5 DE-604 | |
830 | 0 | |a Synthesis lectures on information concepts, retrieval, and services |v 27 |w (DE-604)BV035766709 |9 27 | |
856 | 4 | 2 | |m Digitalisierung BSB Muenchen - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026817195&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-026817195 | ||
942 | 1 | 1 | |c 025.04 |e 22/bsb |
Record in the search index
_version_ | 1804151458042478592 |
---|---|
adam_text | Contents
List of Figures ..... xvii
Preface ..... xix
Acknowledgments ..... xxi
1 Introduction ..... 1
1.1 Structure and Contribution of this Book ..... 1
1.2 Background: A Timeline of IR Models ..... 1
1.3 Notation ..... 3
1.3.1 The Notation Issue Term Frequency ..... 6
1.3.2 Notation: Zhai's Book and this Book ..... 7
2 Foundations of IR Models ..... 9
2.1 TF-IDF ..... 9
2.1.1 TF Variants ..... 10
2.1.2 TFlog: Logarithmic TF ..... 12
2.1.3 TFfrac: Fractional (Ratio-based) TF ..... 13
2.1.4 IDF Variants ..... 14
2.1.5 Term Weight and RSV ..... 16
2.1.6 Other TF Variants: Lifted TF and Pivoted TF ..... 16
2.1.7 Semi-subsumed Event Occurrences: A Semantics of the BM25-TF ..... 17
2.1.8 Probabilistic IDF: The Probability of Being Informative ..... 19
2.1.9 Summary ..... 23
2.2 PRF: The Probability of Relevance Framework ..... 23
2.2.1 Feature Independence Assumption ..... 25
2.2.2 Non-Query Term Assumption ..... 26
2.2.3 Term Frequency Split ..... 26
2.2.4 Probability Ranking Principle (PRP) ..... 26
2.2.5 Summary ..... 29
2.3 BIR: Binary Independence Retrieval ..... 29
2.3.1 Term Weight and RSV ..... 30
2.3.2 Missing Relevance Information ..... 31
2.3.3 Variants of the BIR Term Weight ..... 32
2.3.4 Smooth Variants of the BIR Term Weight ..... 33
2.3.5 RSJ Term Weight ..... 33
2.3.6 On Theoretical Arguments for 0.5 in the RSJ Term Weight ..... 33
2.3.7 Summary ..... 35
2.4 Poisson and 2-Poisson ..... 35
2.4.1 Poisson Probability ..... 36
2.4.2 Poisson Analogy: Sunny Days and Term Occurrences ..... 36
2.4.3 Poisson Example: Toy Data ..... 37
2.4.4 Poisson Example: TREC-2 ..... 38
2.4.5 Binomial Probability ..... 39
2.4.6 Relationship between Poisson and Binomial Probability ..... 40
2.4.7 Poisson PRF ..... 40
2.4.8 Term Weight and RSV ..... 42
2.4.9 2-Poisson ..... 43
2.4.10 Summary ..... 44
2.5 BM25 ..... 45
2.5.1 BM25-TF ..... 45
2.5.2 BM25-TF and Pivoted TF ..... 45
2.5.3 BM25: Literature and Wikipedia End 2012 ..... 46
2.5.4 Term Weight and RSV ..... 47
2.5.5 Summary ..... 48
2.6 LM: Language Modeling ..... 49
2.6.1 Probability Mixtures ..... 49
2.6.2 Term Weight and RSV: LM1 ..... 51
2.6.3 Term Weight and RSV: LM (normalized) ..... 52
2.6.4 Term Weight and RSV: JM-LM ..... 54
2.6.5 Term Weight and RSV: Dirich-LM ..... 54
2.6.6 Term Weight and RSV: LM2 ..... 56
2.6.7 Summary ..... 57
2.7 PIN's: Probabilistic Inference Networks ..... 58
2.7.1 The Turtle/Croft Link Matrix ..... 61
2.7.2 Term Weight and RSV ..... 62
2.7.3 Summary ..... 63
2.8 Divergence-based Models and DFR ..... 63
2.8.1 DFR: Divergence from Randomness ..... 63
2.8.2 DFR: Sampling over Documents and Locations ..... 65
2.8.3 DFR: Binomial Transformation Step ..... 66
2.8.4 DFR and KL-Divergence ..... 67
2.8.5 Poisson as a Model of Randomness: $P(k_t > 0 \mid d, c)$: DFR-1 ..... 68
2.8.6 Poisson as a Model of Randomness: $P(k_t = \mathit{tf}_d \mid d, c)$: DFR-2 ..... 68
2.8.7 DFR: Elite Documents ..... 69
2.8.8 DFR: Example ..... 69
2.8.9 Term Weights and RSV's ..... 70
2.8.10 KL-Divergence Retrieval Model ..... 73
2.8.11 Summary ..... 73
2.9 Relevance-based Models ..... 73
2.9.1 Rocchio's Relevance Feedback Model ..... 74
2.9.2 The PRF ..... 74
2.9.3 Lavrenko's Relevance-based Language Models ..... 75
2.10 Precision and Recall ..... 76
2.10.1 Precision and Recall: Conditional Probabilities ..... 76
2.10.2 Averages: Total Probabilities ..... 76
2.11 Summary ..... 77
3 Relationships Between IR Models ..... 79
3.1 PRF: The Probability of Relevance Framework ..... 80
3.1.1 Estimation of Term Probabilities ..... 81
3.2 $P(d \to q)$: The Probability that d Implies q ..... 82
3.3 The Vector-Space Model (VSM) ..... 83
3.3.1 VSM and Probabilities ..... 85
3.4 The Generalised Vector-Space Model (GVSM) ..... 85
3.4.1 GVSM and Probabilities ..... 86
3.5 A General Matrix Framework ..... 88
3.5.1 Term-Document Matrix ..... 88
3.5.2 On the Notation Issue Term Frequency ..... 90
3.5.3 Document-Document Matrix ..... 91
3.5.4 Co-Occurrence Matrices ..... 91
3.6 A Parallel Derivation of Probabilistic Retrieval Models ..... 92
3.7 The Poisson Bridge: $P_D(t \mid u) \cdot \mathrm{avgtf}(t, u) = P_L(t \mid u) \cdot \mathrm{avgdl}(u)$ ..... 93
3.8 Query Term Probability Assumptions ..... 94
3.8.1 Query term mixture assumption ..... 94
3.8.2 Query term burstiness assumption ..... 95
3.8.3 Query term BIR assumption ..... 96
3.9 TF-IDF ..... 96
3.9.1 TF-IDF and BIR ..... 96
3.9.2 TF-IDF and Poisson ..... 98
3.9.3 TF-IDF and BM25 ..... 100
3.9.4 TF-IDF and LM ..... 101
3.9.5 TF-IDF and LM: Side-by-Side ..... 102
3.9.6 TF-IDF and PIN's ..... 104
3.9.7 TF-IDF and Divergence ..... 105
3.9.8 TF-IDF and DFR: Risk times Gain ..... 105
3.9.9 TF-IDF and DFR: Gaps between Term Occurrences ..... 106
3.10 More Relationships: BM25 and LM, LM and PIN's ..... 108
3.11 Information Theory ..... 108
3.11.1 Entropy ..... 109
3.11.2 Joint Entropy ..... 110
3.11.3 Conditional Entropy ..... 110
3.11.4 Mutual Information (MI) ..... 110
3.11.5 Cross Entropy ..... 110
3.11.6 KL-Divergence ..... 111
3.11.7 Query Clarity: Divergence(query || collection) ..... 111
3.11.8 LM = Clarity(query) - Divergence(query || doc) ..... 112
3.11.9 TF-IDF = Clarity(doc) - Divergence(doc || query) ..... 112
3.12 Summary ..... 113
4 Summary & Research Outlook ..... 117
4.1 Summary ..... 117
4.2 Research Outlook ..... 119
4.2.1 Retrieval Models ..... 119
4.2.2 Evaluation Models ..... 120
4.2.3 A Unified Framework for Retrieval AND Evaluation ..... 121
4.2.4 Model Combinations and New Models ..... 122
4.2.5 Dependence-aware Models ..... 123
4.2.6 Query-Log and other More-Evidence Models ..... 124
4.2.7 Phase-2 Models: Retrieval Result Condensation Models ..... 124
4.2.8 A Theoretical Framework to Predict Ranking Quality ..... 124
4.2.9 MIR: Math for IR ..... 125
4.2.10 AIR: Abstraction for IR ..... 125
Bibliography ..... 127
Author's Biography ..... 135
Index ..... 137
|
any_adam_object | 1 |
author | Roelleke, Thomas |
author_facet | Roelleke, Thomas |
author_role | aut |
author_sort | Roelleke, Thomas |
author_variant | t r tr |
building | Verbundindex |
bvnumber | BV041369015 |
ctrlnum | (OCoLC)867142568 (DE-599)BVBBV041369015 |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01409nam a2200349 cb4500</leader><controlfield tag="001">BV041369015</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20180208 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">131021s2013 |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="z">9781627050784</subfield><subfield code="9">978-1-62705-078-4</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)867142568</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV041369015</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-12</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">24,1</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Roelleke, Thomas</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Information retrieval models</subfield><subfield code="b">foundations and relationships</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">San Rafael</subfield><subfield code="b">Morgan & Claypool</subfield><subfield code="c">2013</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XXI, 141 S.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield 
code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="1" ind2=" "><subfield code="a">Synthesis lectures on information concepts, retrieval, and services</subfield><subfield code="v">27</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Information Retrieval</subfield><subfield code="0">(DE-588)4072803-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Information Retrieval</subfield><subfield code="0">(DE-588)4072803-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="830" ind1=" " ind2="0"><subfield code="a">Synthesis lectures on information concepts, retrieval, and services</subfield><subfield code="v">27</subfield><subfield code="w">(DE-604)BV035766709</subfield><subfield code="9">27</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung BSB Muenchen - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026817195&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-026817195</subfield></datafield><datafield tag="942" ind1="1" ind2="1"><subfield code="c">025.04</subfield><subfield code="e">22/bsb</subfield></datafield></record></collection> |
id | DE-604.BV041369015 |
illustrated | Not Illustrated |
indexdate | 2024-07-10T00:55:08Z |
institution | BVB |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-026817195 |
oclc_num | 867142568 |
open_access_boolean | |
owner | DE-12 |
owner_facet | DE-12 |
physical | XXI, 141 S. |
publishDate | 2013 |
publishDateSearch | 2013 |
publishDateSort | 2013 |
publisher | Morgan & Claypool |
record_format | marc |
series | Synthesis lectures on information concepts, retrieval, and services |
series2 | Synthesis lectures on information concepts, retrieval, and services |
spelling | Roelleke, Thomas Verfasser aut Information retrieval models foundations and relationships San Rafael Morgan & Claypool 2013 XXI, 141 S. txt rdacontent n rdamedia nc rdacarrier Synthesis lectures on information concepts, retrieval, and services 27 Information Retrieval (DE-588)4072803-1 gnd rswk-swf Information Retrieval (DE-588)4072803-1 s DE-604 Synthesis lectures on information concepts, retrieval, and services 27 (DE-604)BV035766709 27 Digitalisierung BSB Muenchen - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026817195&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Roelleke, Thomas Information retrieval models foundations and relationships Synthesis lectures on information concepts, retrieval, and services Information Retrieval (DE-588)4072803-1 gnd |
subject_GND | (DE-588)4072803-1 |
title | Information retrieval models foundations and relationships |
title_auth | Information retrieval models foundations and relationships |
title_exact_search | Information retrieval models foundations and relationships |
title_full | Information retrieval models foundations and relationships |
title_fullStr | Information retrieval models foundations and relationships |
title_full_unstemmed | Information retrieval models foundations and relationships |
title_short | Information retrieval models |
title_sort | information retrieval models foundations and relationships |
title_sub | foundations and relationships |
topic | Information Retrieval (DE-588)4072803-1 gnd |
topic_facet | Information Retrieval |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026817195&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
volume_link | (DE-604)BV035766709 |
work_keys_str_mv | AT roellekethomas informationretrievalmodelsfoundationsandrelationships |