The effects of indexing strategy-query term combination on retrieval effectiveness in a Swedish full text database
Main author: | Ahlgren, Per |
---|---|
Format: | Thesis Book |
Language: | English |
Published: | Borås [etc.] : Valfrid, 2004 |
Series: | Skrifter från Valfrid ; 28 |
Subjects: | |
Online access: | Table of contents |
Description: | 166 p., graphs |
ISBN: | 9189416104 |
Internal format
MARC
LEADER | 00000nam a2200000 cb4500 | ||
---|---|---|---|
001 | BV019655654 | ||
003 | DE-604 | ||
005 | 20080429 | ||
007 | t | ||
008 | 050112s2004 d||| m||| 00||| eng d | ||
020 | |a 9189416104 |9 91-89416-10-4 | ||
035 | |a (OCoLC)186632924 | ||
035 | |a (DE-599)BVBBV019655654 | ||
040 | |a DE-604 |b ger |e rakwb | ||
041 | 0 | |a eng | |
049 | |a DE-19 |a DE-12 |a DE-188 | ||
084 | |a 24,1 |2 ssgn | ||
100 | 1 | |a Ahlgren, Per |e Verfasser |4 aut | |
245 | 1 | 0 | |a The effects of indexing strategy-query term combination on retrieval effectiveness in a Swedish full text database |c Per Ahlgren |
264 | 1 | |a Borås [u.a.] |b Valfrid |c 2004 | |
300 | |a 166 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 1 | |a Skrifter från Valfrid |v 28 | |
502 | |a Zugl.: Göteborg, Univ., Diss., 2004 | ||
650 | 7 | |a Fulltextdatabaser |2 sao | |
650 | 7 | |a Informationsåtervinning - metodik |2 sao | |
650 | 0 | 7 | |a Volltextdatenbank |0 (DE-588)4138258-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Abfragesprache |0 (DE-588)4134011-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Morphologie |0 (DE-588)4040289-7 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Schwedisch |0 (DE-588)4116437-4 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Information-Retrieval-System |0 (DE-588)4670557-0 |2 gnd |9 rswk-swf |
655 | 7 | |0 (DE-588)4113937-9 |a Hochschulschrift |2 gnd-content | |
689 | 0 | 0 | |a Schwedisch |0 (DE-588)4116437-4 |D s |
689 | 0 | 1 | |a Volltextdatenbank |0 (DE-588)4138258-4 |D s |
689 | 0 | 2 | |a Information-Retrieval-System |0 (DE-588)4670557-0 |D s |
689 | 0 | |5 DE-188 | |
689 | 1 | 0 | |a Schwedisch |0 (DE-588)4116437-4 |D s |
689 | 1 | 1 | |a Volltextdatenbank |0 (DE-588)4138258-4 |D s |
689 | 1 | 2 | |a Abfragesprache |0 (DE-588)4134011-5 |D s |
689 | 1 | 3 | |a Morphologie |0 (DE-588)4040289-7 |D s |
689 | 1 | |5 DE-188 | |
830 | 0 | |a Skrifter från Valfrid |v 28 |w (DE-604)BV010976203 |9 28 | |
856 | 4 | 2 | |m Digitalisierung BSBMuenchen |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=012984191&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-012984191 |
Record in the search index
_version_ | 1804133033017606144 |
---|---|
adam_text | Contents
PART ONE FRAMEWORK
1 Introduction
2 Central concepts of the research setting
2.1 Automatic indexing
2.1.1 Lexical analysis
2.1.2 Stop words
2.1.3 Stemming and normalization
2.1.4 Removal of additional high frequency terms
2.1.5 The index
2.1.6 Visualization of the outlined automatic indexing process
2.2 Retrieval models
2.2.1 Boolean model
2.2.2 Vector model
2.2.3 Two probabilistic models
2.3 Evaluation of retrieval effectiveness
3 Some linguistic phenomena with relevance to IR, and conflation
3.1 Some linguistic phenomena with relevance to IR
3.1.1 Properties of Swedish related to IR
3.2 Conflation
3.2.1 Stemming
3.2.2 Normalization
4 Research on conflation
4.1 Research on a morphologically simple language
4.2 Research on morphologically more complex languages
4.3 Summary of the main results
PART TWO EXPERIMENT
5 Test documents and the indexing strategies used in the study
5.1 Test documents
5.2 Indexing of the Swedish news articles
5.2.1 Lexical analysis of the Swedish news articles
5.2.2 Stop words
5.2.3 Indexing strategy based on inflected word forms
5.2.4 Indexing strategies based on normalization
5.2.5 Visualization of the indexing process
6 Variables, aim of the study and research questions
7 Data and methods
7.1 InQuery retrieval system
7.2 Topics, queries and pooling
7.2.1 Topics
7.2.2 Queries
7.2.3 Pooling
7.3 Relevance assessments
7.3.1 Relevance scale
7.3.2 Assessment process
7.4 Data for pools, recall bases and for sets of irrelevant documents
7.5 Evaluation
7.5.1 Gain vectors
7.5.2 Binary relevance situation
7.5.3 Multiple degree relevance situation
7.5.4 Significance testing
8 Findings
8.1 Binary relevance situation
8.1.1 Precision at given DCVs of the five indexing strategy-query term combinations
8.1.2 Tests of significance
8.1.3 Effectiveness by topics
8.2 Multiple degree relevance situation
8.2.1 US1.1
8.2.2 US1.2
8.2.3 US2.1
8.2.4 US2.2
8.2.5 Tests of significance
8.2.6 Effectiveness by topics under US2.2
9 Discussion
9.1 Binary relevance situation and US2.2: detailed topic-by-topic analysis
9.2 Binary relevance situation and US2.2: changes in relative effectiveness
9.3 Splitting of compounds in queries
9.4 Expansion of query base forms with derivatives
10 Conclusion
References
Appendix 1 Topics used in the study
Appendix 2 One of the used topics: English version
Appendix 3 Sample word lists and corresponding queries
Appendix 4 Examples of terms not recognized by SWETWOL
Appendix 5 Problem with queries for SSPLIT-bfqt and SSPLIT-EL-bfqt
Appendix 6 Alternative token definition
Appendix 7 Instructions for the assessors (English translation)
|
any_adam_object | 1 |
author | Ahlgren, Per |
author_facet | Ahlgren, Per |
author_role | aut |
author_sort | Ahlgren, Per |
author_variant | p a pa |
building | Verbundindex |
bvnumber | BV019655654 |
ctrlnum | (OCoLC)186632924 (DE-599)BVBBV019655654 |
format | Thesis Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02157nam a2200517 cb4500</leader><controlfield tag="001">BV019655654</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20080429 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">050112s2004 d||| m||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9189416104</subfield><subfield code="9">91-89416-10-4</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)186632924</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV019655654</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-19</subfield><subfield code="a">DE-12</subfield><subfield code="a">DE-188</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">24,1</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Ahlgren, Per</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">The effects of indexing strategy-query term combination on retrieval effectiveness in a Swedish full text database</subfield><subfield code="c">Per Ahlgren</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Borås [u.a.]</subfield><subfield code="b">Valfrid</subfield><subfield code="c">2004</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">166 S.</subfield><subfield code="b">graph. 
Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="1" ind2=" "><subfield code="a">Skrifter från Valfrid</subfield><subfield code="v">28</subfield></datafield><datafield tag="502" ind1=" " ind2=" "><subfield code="a">Zugl.: Göteborg, Univ., Diss., 2004</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Fulltextdatabaser</subfield><subfield code="2">sao</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Informationsåtervinning - metodik</subfield><subfield code="2">sao</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Volltextdatenbank</subfield><subfield code="0">(DE-588)4138258-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Abfragesprache</subfield><subfield code="0">(DE-588)4134011-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Morphologie</subfield><subfield code="0">(DE-588)4040289-7</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Schwedisch</subfield><subfield code="0">(DE-588)4116437-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Information-Retrieval-System</subfield><subfield code="0">(DE-588)4670557-0</subfield><subfield code="2">gnd</subfield><subfield 
code="9">rswk-swf</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4113937-9</subfield><subfield code="a">Hochschulschrift</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Schwedisch</subfield><subfield code="0">(DE-588)4116437-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Volltextdatenbank</subfield><subfield code="0">(DE-588)4138258-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Information-Retrieval-System</subfield><subfield code="0">(DE-588)4670557-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-188</subfield></datafield><datafield tag="689" ind1="1" ind2="0"><subfield code="a">Schwedisch</subfield><subfield code="0">(DE-588)4116437-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2="1"><subfield code="a">Volltextdatenbank</subfield><subfield code="0">(DE-588)4138258-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2="2"><subfield code="a">Abfragesprache</subfield><subfield code="0">(DE-588)4134011-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2="3"><subfield code="a">Morphologie</subfield><subfield code="0">(DE-588)4040289-7</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2=" "><subfield code="5">DE-188</subfield></datafield><datafield tag="830" ind1=" " ind2="0"><subfield code="a">Skrifter från Valfrid</subfield><subfield code="v">28</subfield><subfield code="w">(DE-604)BV010976203</subfield><subfield code="9">28</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung BSBMuenchen</subfield><subfield code="q">application/pdf</subfield><subfield 
code="u">http://bvbr.bib-bvb.de:8991/F?func=service&amp;doc_library=BVB01&amp;local_base=BVB01&amp;doc_number=012984191&amp;sequence=000002&amp;line_number=0001&amp;func_code=DB_RECORDS&amp;service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-012984191</subfield></datafield></record></collection> |
genre | (DE-588)4113937-9 Hochschulschrift gnd-content |
genre_facet | Hochschulschrift |
id | DE-604.BV019655654 |
illustrated | Illustrated |
indexdate | 2024-07-09T20:02:16Z |
institution | BVB |
isbn | 9189416104 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-012984191 |
oclc_num | 186632924 |
open_access_boolean | |
owner | DE-19 DE-BY-UBM DE-12 DE-188 |
owner_facet | DE-19 DE-BY-UBM DE-12 DE-188 |
physical | 166 S. graph. Darst. |
publishDate | 2004 |
publishDateSearch | 2004 |
publishDateSort | 2004 |
publisher | Valfrid |
record_format | marc |
series | Skrifter från Valfrid |
series2 | Skrifter från Valfrid |
spelling | Ahlgren, Per Verfasser aut The effects of indexing strategy-query term combination on retrieval effectiveness in a Swedish full text database Per Ahlgren Borås [u.a.] Valfrid 2004 166 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Skrifter från Valfrid 28 Zugl.: Göteborg, Univ., Diss., 2004 Fulltextdatabaser sao Informationsåtervinning - metodik sao Volltextdatenbank (DE-588)4138258-4 gnd rswk-swf Abfragesprache (DE-588)4134011-5 gnd rswk-swf Morphologie (DE-588)4040289-7 gnd rswk-swf Schwedisch (DE-588)4116437-4 gnd rswk-swf Information-Retrieval-System (DE-588)4670557-0 gnd rswk-swf (DE-588)4113937-9 Hochschulschrift gnd-content Schwedisch (DE-588)4116437-4 s Volltextdatenbank (DE-588)4138258-4 s Information-Retrieval-System (DE-588)4670557-0 s DE-188 Abfragesprache (DE-588)4134011-5 s Morphologie (DE-588)4040289-7 s Skrifter från Valfrid 28 (DE-604)BV010976203 28 Digitalisierung BSBMuenchen application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=012984191&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Ahlgren, Per The effects of indexing strategy-query term combination on retrieval effectiveness in a Swedish full text database Skrifter från Valfrid Fulltextdatabaser sao Informationsåtervinning - metodik sao Volltextdatenbank (DE-588)4138258-4 gnd Abfragesprache (DE-588)4134011-5 gnd Morphologie (DE-588)4040289-7 gnd Schwedisch (DE-588)4116437-4 gnd Information-Retrieval-System (DE-588)4670557-0 gnd |
subject_GND | (DE-588)4138258-4 (DE-588)4134011-5 (DE-588)4040289-7 (DE-588)4116437-4 (DE-588)4670557-0 (DE-588)4113937-9 |
title | The effects of indexing strategy-query term combination on retrieval effectiveness in a Swedish full text database |
title_auth | The effects of indexing strategy-query term combination on retrieval effectiveness in a Swedish full text database |
title_exact_search | The effects of indexing strategy-query term combination on retrieval effectiveness in a Swedish full text database |
title_full | The effects of indexing strategy-query term combination on retrieval effectiveness in a Swedish full text database Per Ahlgren |
title_fullStr | The effects of indexing strategy-query term combination on retrieval effectiveness in a Swedish full text database Per Ahlgren |
title_full_unstemmed | The effects of indexing strategy-query term combination on retrieval effectiveness in a Swedish full text database Per Ahlgren |
title_short | The effects of indexing strategy-query term combination on retrieval effectiveness in a Swedish full text database |
title_sort | the effects of indexing strategy query term combination on retrieval effectiveness in a swedish full text database |
topic | Fulltextdatabaser sao Informationsåtervinning - metodik sao Volltextdatenbank (DE-588)4138258-4 gnd Abfragesprache (DE-588)4134011-5 gnd Morphologie (DE-588)4040289-7 gnd Schwedisch (DE-588)4116437-4 gnd Information-Retrieval-System (DE-588)4670557-0 gnd |
topic_facet | Fulltextdatabaser Informationsåtervinning - metodik Volltextdatenbank Abfragesprache Morphologie Schwedisch Information-Retrieval-System Hochschulschrift |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=012984191&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
volume_link | (DE-604)BV010976203 |
work_keys_str_mv | AT ahlgrenper theeffectsofindexingstrategyquerytermcombinationonretrievaleffectivenessinaswedishfulltextdatabase |