Text information retrieval systems
Format: | Book |
---|---|
Language: | English |
Published: |
Amsterdam [u.a.]
Academic Press
2007
|
Edition: | 3. ed. |
Series: | Library and information science
|
Subjects: | |
Online access: | Table of contents |
Description: | XVII, 371 p. |
ISBN: | 9780123694126 0123694124 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV022243308 | ||
003 | DE-604 | ||
005 | 20070223 | ||
007 | t | ||
008 | 070126s2007 |||| 00||| eng d | ||
020 | |a 9780123694126 |9 978-0-12-369412-6 | ||
020 | |a 0123694124 |9 0-12-369412-4 | ||
035 | |a (OCoLC)85828387 | ||
035 | |a (DE-599)BVBBV022243308 | ||
040 | |a DE-604 |b ger |e rakddb | ||
041 | 0 | |a eng | |
049 | |a DE-473 |a DE-355 | ||
050 | 0 | |a Z667 | |
082 | 0 | |a 025.04 |2 22 | |
084 | |a ST 270 |0 (DE-625)143638: |2 rvk | ||
084 | |a ST 271 |0 (DE-625)143639: |2 rvk | ||
245 | 1 | 0 | |a Text information retrieval systems |c Charles T. Meadow ... |
250 | |a 3. ed. | ||
264 | 1 | |a Amsterdam [u.a.] |b Academic Press |c 2007 | |
300 | |a XVII, 371 S. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 0 | |a Library and information science | |
650 | 4 | |a Systèmes d'information | |
650 | 4 | |a Information retrieval | |
650 | 4 | |a Information storage and retrieval systems | |
650 | 4 | |a Text processing (Computer science) | |
650 | 0 | 7 | |a Information Retrieval |0 (DE-588)4072803-1 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Information Retrieval |0 (DE-588)4072803-1 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Meadow, Charles T. |e Sonstige |4 oth | |
856 | 4 | 2 | |m Digitalisierung UB Regensburg |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=015454202&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-015454202 |
Record in the search index
_version_ | 1804136235982127104 |
---|---|
adam_text | Contents
Preface
xv
1
Introduction
1.1
What Is Information?
1
1.2
What Is Information Retrieval?
2
1.3
How Does Information Retrieval Work?
5
1.3.1
The User Sequence
6
1.3.2
The Database Producer Sequence
10
1.3.3
System Design and Functioning
13
1.3.4
Why the Process Is Not Perfect
15
1.4
Who Uses Information Retrieval?
17
1.4.1
Information Specialists
17
1.4.2
Subject Specialist End Users
18
1.4.3
Non-Subject Specialist End Users
18
1.5
What Are the Problems in IRS Design and Use?
19
1.5.1
Design
19
1.5.2
Understanding User Behavior
20
1.6
A Brief History of Information Retrieval
21
1.6.1
Traditional Information Retrieval Methods
21
1.6.2
Pre-Computer IR Systems
23
1.6.3
Special Purpose Computer Systems
26
1.6.4
General Purpose Computer Systems
27
1.6.5
Online Database Services
29
1.6.6
The World Wide Web
31
Recommended Reading
34
2
Data, Information, and Knowledge
2.1
Introduction
37
2.2
Definitions
37
2.2.1
Data
38
2.2.2
Information
38
2.2.3
News
40
2.2.4
Knowledge
40
2.2.5
Intelligence
41
2.2.6
Meaning
42
2.2.7
Wisdom
42
2.2.8
Relevance and Value
43
2.3
Metadata
43
2.4
Knowledge Base
46
2.5
Credence, Justified Belief, and Point of View
48
2.6
Summary
50
3
Representation of Information
3.1
Information to Be Represented
53
3.2
Types of Representation
58
3.2.1
Natural Language
59
3.2.2
Restricted Natural Language
60
3.2.3
Artificial Language
61
3.2.4
Codes, Measures, and Descriptors
62
3.2.5
Mathematical Models of Text
63
3.3
Characteristics of Information Representations
64
3.3.1
Discriminating Power
65
3.3.2
Identification of Similarity
66
3.3.3
Descriptiveness
66
3.3.4
Ambiguity
66
3.3.5
Conciseness
67
3.4
Relationships Among Entities and Attribute Values
67
3.4.1
Hierarchical Codes
67
3.4.2
Measurements
67
3.4.3
Nominal Descriptors
69
3.4.4
Inflected Language
70
3.4.5
Full Text
70
3.4.6
Explicit Pointers and Links
70
3.5
Summary
71
4
Attribute Content and Values
4.1
Types of Attribute Symbols
73
4.1.1
Numbers
74
4.1.2
Character Strings: Names
74
4.1.3
Other Character Strings
75
4.2
Class Relationships
75
4.2.1
Hierarchical Classification
76
4.2.2
Network Relationships
77
4.2.3
Class Membership: Binary, Probabilistic, or Fuzzy
78
4.3
Transformations of Values
81
4.3.1
Transformation of Words by Stemming
82
4.3.2
Sound-Based Transformation of Words
85
4.3.3
Transformation of Words by Meaning
86
4.3.4
Transformation of Graphics
88
4.3.5
Transformation of Sound
91
4.4
Uniqueness of Values
93
4.5
Ambiguity of Attribute Values
94
4.6
Indexing of Text
96
4.7
Control of Vocabulary
98
4.7.1
Elements of Control
98
4.7.2
Dissemination of Controlled Vocabularies
100
4.8
Importance of Point of View
100
4.9
Summary
102
5
Models of Virtual Data Structure
5.1
Concept of Models of Data
103
5.2
Basic Data Elements and Structures
106
5.2.1
Scalar Variables and Constants
106
5.2.2
Vector Variables
107
5.2.3
Structures
107
5.2.4
Arrays
107
5.2.5
Tuples
107
5.2.6
Relations
109
5.2.7
Text
109
5.3
Common Structural Models
111
5.3.1
Linear Sequential Model
112
5.3.2
Relational Model
112
5.3.3
Hierarchical and Network Models
114
5.4
Applications of the Basic Models
116
5.4.1
Hypertext
116
5.4.2
Spreadsheet Files
118
5.5
Entity-Relationship Model
120
5.6
Summary
121
6
The Physical Structure of Data
6.1
Introduction to Physical Structures
123
6.2
Record Structures and Their Effects
124
6.2.1
Basic Structures
124
6.2.2
Space-Time and Transaction Rate
127
6.3
Basic Concepts of File Structure
127
6.3.1
The Order of Records
128
6.3.2
Finding Records
128
6.4
Organizational Methods
129
6.4.1
Sequential Files
129
6.4.2
Index-File Structures
131
6.4.3
Lists
133
6.4.4
Trees
136
6.4.5
Direct-Access Structures
138
6.5
Parsing of Data Elements
141
6.5.1
Phrase Parsing
142
6.5.2
Word Parsing
143
6.5.3
Word and Phrase Parsing
143
6.6
Combination Structures
144
6.6.1
Nested Indexes
144
6.6.2
Direct Structure with Chains
145
6.6.3
Indexed Sequential Access Method
147
6.7
Summary
148
7
Querying the Information Retrieval System
7.1
Introduction
151
7.2
Language Types
152
7.3
Query Logic
154
7.3.1
Sets and Subsets
155
7.3.2
Relational Statements
155
7.3.3
Boolean Query Logic
156
7.3.4
Ranked and Fuzzy Sets
159
7.3.5
Similarity Measures
162
7.4
Functions Performed
162
7.4.1
Connect to an IRS
162
7.4.2
Select a Database
164
7.4.3
Search the Inverted File or Thesaurus
164
7.4.4
Create a Subset of the Database
167
7.4.5
Search for Strings
168
7.4.6
Analyze a Set
170
7.4.7
Sort, Display, and Format Records
171
7.4.8
Handle the Unstructured Record
172
7.4.9
Download
172
7.4.10
Order Documents
173
7.4.11
Save, Recall, and Edit Searches
173
7.4.12
Current Awareness Search
174
7.4.13
Cost Summary
175
7.4.14
Terminate a Session
175
7.5
The Basis for Charging for Searches
176
8
Interpretation and Execution of Query Statements
8.1
Problems of Query Language Interpretation
177
8.1.1
Parsing Command Language
178
8.1.2
Parsing Natural Language
181
8.1.3
Processing Menu Choices
183
8.2
Executing Retrieval Commands
184
8.2.1
Database Selection
184
8.2.2
Inverted File Search
184
8.2.3
Set or Subset Creation
185
8.2.4
Truncation and Universal Characters
187
8.2.5
Left-Hand Truncation
188
8.3
Executing Record Analysis and Presentation Commands
191
8.3.1
Set Analysis Functions
191
8.3.2
Display, Format, and Sort
193
8.3.3
Offline Printing
195
8.4
Executing Other Commands
196
8.4.1
Ordering
196
8.4.2
Save, Recall, and Edit Searches
196
8.4.3
Current Awareness
197
8.4.4
Cost Summation and Billing
198
8.4.5
Terminate a Session
199
8.5
Feedback to Users and Error Messages
199
8.5.1
Response to Command Errors
199
8.5.2
Set-Size Indication
200
8.5.3
Record Display
200
8.5.4
Set Analysis
201
8.5.5
Cost
201
8.5.6
Help
201
9
Text Searching
9.1
The Special Problems of Text Searching
203
9.1.1
A Note on Terminology and Symbols
204
9.1.2
The Semantic Web
205
9.2
Some Characteristics of Text and Their Applications
207
9.2.1
Components of Text
207
9.2.2
Significant Words
—
Indexing
208
9.2.3
Significant Sentences
—
Abstracting
209
9.2.4
Measures of Complete Texts
213
9.3
Command Language for Text Searching
214
9.3.1
Set Membership Statements
215
9.3.2
Word or String Occurrence Statements
215
9.3.3
Proximity Statements
215
9.3.4
Web Based Text Search
217
9.4
Term Weighting
218
9.4.1
Indexing with Weights
220
9.4.2
Automated Assignment of Weights
220
9.4.3
Improving Weights
221
9.5
Word Association Techniques
221
9.5.1
Dictionaries and Thesauri
221
9.5.2
Mini-Thesauri
222
9.5.3
Word Co-occurrence Statistics
223
9.5.4
Stemming and Conflation
224
9.6
Text or Record Association Techniques
224
9.6.1
Similarity Measures
225
9.6.2
Clustering
228
9.6.3
Signature Matching
230
9.6.4
Discriminant Methods
233
9.7
Other Processes with Words of a Text
234
9.7.1
Stop Words
234
9.7.2
Replacement of Words with Roots or Associated Words
235
9.7.3
Varying Significance as a Function of Frequency
236
9.7.4
Comments on the Computation of the Strength of Document Association
236
10
System-Computed Relevance and Ranking
10.1
The Retrieval Status Value (rsv)
241
10.2
Ranking
241
10.3
Methods of Evaluating the rsv
242
10.3.1
The Vector Space Model
242
10.3.2
The Probabilistic Model
244
10.3.3
The Extended Boolean Model
245
10.4
The rsv in Operational Retrieval
247
11
Search Feedback and Iteration
11.1
Basic Concepts of Feedback and Iteration
249
11.2
Command Sequences
251
11.3
Information Available as Feedback
252
11.3.1
File or Database Selection
252
11.3.2
Term Search or Browsing
253
11.3.3
Record Search and Set Formation
254
11.3.4
Record Display and Browsing
256
11.3.5
Record Acquisition
257
11.3.6
Requests for Information About the Retrieval System
257
11.3.7
Establishing Communications Parameters
258
11.3.8
Trends Over Sequences and Cycles
258
11.4
Adjustments in the Search
259
11.4.1
Improve Term Selection
260
11.4.2
Improve Set Formation Logic
260
11.4.3
Improve Final Set Size
260
11.4.4
Improve Precision, Recall, or Total Utility
260
11.5
Feedback from User to System
261
12
Multi-Database Searching and Mapping
12.1
Basic Concepts
265
12.2
Multi-Database Search
266
12.2.1
Nature of Duplicate Records
266
12.2.2
Detection of Duplicates
269
12.2.3
Scanning Multiple Databases
271
12.3
Mapping
273
12.4
Value of Mapping
275
13
Search Strategy
13.1
The Nature of Searching Reconsidered
277
13.1.1
Known Item Search
278
13.1.2
Specific Information Search
278
13.1.3
General Information Search
278
13.1.4
Exploration of the Database
279
13.2
The Nature of Search Strategy
279
13.2.1
Search Objective
280
13.2.2
General Plan of Operation
280
13.2.3
The Essential Information Elements of a Search
281
13.2.4
Specific Plan of Operation
282
13.3
Types of Strategies
282
13.3.1
Categorizing by Objective
283
13.3.2
Categorizing by Plan of Operation
283
13.4
Tactics
285
13.4.1
Monitoring Tactics
286
13.4.2
File Structure Tactics
286
13.4.3
Search Formulation Tactics
286
13.4.4
Term Tactics
286
13.5
Summary
286
14
The Information Retrieval System Interface
14.1
General Model of Message Flow
287
14.2
Sources of Ambiguity
290
14.3
The Role of a Search Intermediary
291
14.3.1
Establishing the Information Need
292
14.3.2
Development of a Search Strategy
292
14.3.3
Translation of the Need Statement into a Query
292
14.3.4
Interpretation and Evaluation of Output
293
14.3.5
Search Iteration within the Strategic Plan
293
14.3.6
Change of Strategy When Necessary
293
14.3.7
Help in Using an IRS
294
14.4
Automated Search Mediation
294
14.4.1
Early Development
294
14.4.2
Fully Automatic Intermediary Functions
295
14.4.3
Interactive Intermediary Functions
296
14.5
The User Interface as a Component of All Systems
298
14.6
The User Interface in Web Search Engines
299
15
A Sampling of Information Retrieval Systems
15.1
Introduction
301
15.2
Dialog
302
15.2.1
Command Language Using Boolean Logic
303
15.2.2
Target
304
15.2.3
DIALOGWeb: A Web Adaptation
305
15.3
AltaVista
308
15.3.1
Default Query Entry Form
309
15.3.2
Advanced Search Form
310
15.4
Google
311
15.4.1
Web Crawler
311
15.4.2
Searching
312
15.4.3
Google Advanced Search
312
15.5
PubMed
313
15.6
EBSCO Host
314
15.7
Summary
315
16
Measurement and Evaluation
16.1
Basics of Measurement
317
16.1.1
The Data Manager
318
16.1.2
The Query Manager
319
16.1.3
The Query Composition Process
319
16.1.4
Deriving the Information Need
320
16.1.5
The Database
320
16.1.6
Users
321
16.2
Relevance, Value, and Utility
321
16.2.1
Relevance as Relatedness
322
16.2.2
Aspects of Value
322
16.2.3
Relevance as Utility
323
16.2.4
Retaining Two Separate Relevance Measures
323
16.2.5
The Relevance Measurement Scale
325
16.2.6
Taking the Measurements
326
16.2.7
Questions about Relevance as a Measure
327
16.3
Measures Based on Relevance
328
16.3.1
Precision (Pr)
328
16.3.2
Recall (Re)
329
16.3.3
Relationship of Recall and Precision
330
16.3.4
Overall Effectiveness Measures Based on Re and Pr
331
16.4
Measures of Process
334
16.4.1
Query Translation
334
16.4.2
Errors in a Query Statement
334
16.4.3
Average Time per Command or per User Decision
335
16.4.4
Elapsed Time of a Search
335
16.4.5
Number of Commands or Steps in a Search
335
16.4.6
Cost of a Search
335
16.4.7
Size of Final Set Formed
336
16.4.8
Number of Records Reviewed by the User
336
16.4.9
Patterns of Language Use
336
16.4.10
Measures of Rank Order
339
16.5
Measures of Outcome
340
16.5.1
Precision
341
16.5.2
Recall
341
16.5.3
Efficiency
341
16.5.4
Overall User Evaluation
341
16.6
Measures of Environment
342
16.6.1
Database Record Selection
342
16.6.2
Record Content
342
16.6.3
Measures of Users
342
16.7
Conclusion
343
Bibliography
345
Index
357
|
any_adam_object | 1 |
any_adam_object_boolean | 1 |
building | Verbundindex |
bvnumber | BV022243308 |
callnumber-first | Z - Library Science |
callnumber-label | Z667 |
callnumber-raw | Z667 |
callnumber-search | Z667 |
callnumber-sort | Z 3667 |
callnumber-subject | Z - Books and Writing |
classification_rvk | ST 270 ST 271 |
ctrlnum | (OCoLC)85828387 (DE-599)BVBBV022243308 |
dewey-full | 025.04 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 025 - Operations of libraries and archives |
dewey-raw | 025.04 |
dewey-search | 025.04 |
dewey-sort | 225.04 |
dewey-tens | 020 - Library and information sciences |
discipline | General, Computer science |
discipline_str_mv | General, Computer science |
edition | 3. ed. |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01573nam a2200433 c 4500</leader><controlfield tag="001">BV022243308</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20070223 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">070126s2007 |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9780123694126</subfield><subfield code="9">978-0-12-369412-6</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">0123694124</subfield><subfield code="9">0-12-369412-4</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)85828387</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV022243308</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakddb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-473</subfield><subfield code="a">DE-355</subfield></datafield><datafield tag="050" ind1=" " ind2="0"><subfield code="a">Z667</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">025.04</subfield><subfield code="2">22</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 270</subfield><subfield code="0">(DE-625)143638:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 271</subfield><subfield code="0">(DE-625)143639:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Text information retrieval systems</subfield><subfield code="c">Charles T. Meadow ...</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">3. 
ed.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Amsterdam [u.a.]</subfield><subfield code="b">Academic Press</subfield><subfield code="c">2007</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XVII, 371 S.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Library and information science</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Systèmes d'information</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Information retrieval</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Information storage and retrieval systems</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Text processing (Computer science)</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Information Retrieval</subfield><subfield code="0">(DE-588)4072803-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Information Retrieval</subfield><subfield code="0">(DE-588)4072803-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Meadow, Charles T.</subfield><subfield code="e">Sonstige</subfield><subfield code="4">oth</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Regensburg</subfield><subfield 
code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=015454202&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-015454202</subfield></datafield></record></collection> |
id | DE-604.BV022243308 |
illustrated | Not Illustrated |
index_date | 2024-07-02T16:36:46Z |
indexdate | 2024-07-09T20:53:11Z |
institution | BVB |
isbn | 9780123694126 0123694124 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-015454202 |
oclc_num | 85828387 |
open_access_boolean | |
owner | DE-473 DE-BY-UBG DE-355 DE-BY-UBR |
owner_facet | DE-473 DE-BY-UBG DE-355 DE-BY-UBR |
physical | XVII, 371 S. |
publishDate | 2007 |
publishDateSearch | 2007 |
publishDateSort | 2007 |
publisher | Academic Press |
record_format | marc |
series2 | Library and information science |
spelling | Text information retrieval systems Charles T. Meadow ... 3. ed. Amsterdam [u.a.] Academic Press 2007 XVII, 371 S. txt rdacontent n rdamedia nc rdacarrier Library and information science Systèmes d'information Information retrieval Information storage and retrieval systems Text processing (Computer science) Information Retrieval (DE-588)4072803-1 gnd rswk-swf Information Retrieval (DE-588)4072803-1 s DE-604 Meadow, Charles T. Sonstige oth Digitalisierung UB Regensburg application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=015454202&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Text information retrieval systems Systèmes d'information Information retrieval Information storage and retrieval systems Text processing (Computer science) Information Retrieval (DE-588)4072803-1 gnd |
subject_GND | (DE-588)4072803-1 |
title | Text information retrieval systems |
title_auth | Text information retrieval systems |
title_exact_search | Text information retrieval systems |
title_exact_search_txtP | Text information retrieval systems |
title_full | Text information retrieval systems Charles T. Meadow ... |
title_fullStr | Text information retrieval systems Charles T. Meadow ... |
title_full_unstemmed | Text information retrieval systems Charles T. Meadow ... |
title_short | Text information retrieval systems |
title_sort | text information retrieval systems |
topic | Systèmes d'information Information retrieval Information storage and retrieval systems Text processing (Computer science) Information Retrieval (DE-588)4072803-1 gnd |
topic_facet | Systèmes d'information Information retrieval Information storage and retrieval systems Text processing (Computer science) Information Retrieval |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=015454202&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT meadowcharlest textinformationretrievalsystems |