Multiple imputation for nonresponse in surveys:
Main author: | Rubin, Donald B. |
---|---|
Format: | Book |
Language: | English |
Published: | Hoboken, NJ [et al.]: Wiley, 2004 |
Series: | Wiley classics library edition |
Subjects: | |
Online access: | Table of contents |
Description: | XXIX, 287 pp. |
ISBN: | 0471655740 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV021812894 | ||
003 | DE-604 | ||
005 | 20230704 | ||
007 | t | ||
008 | 061115s2004 |||| 00||| eng d | ||
020 | |a 0471655740 |9 0-471-65574-0 | ||
035 | |a (OCoLC)56214438 | ||
035 | |a (DE-599)BVBBV021812894 | ||
040 | |a DE-604 |b ger |e rakddb | ||
041 | 0 | |a eng | |
049 | |a DE-N32 |a DE-473 |a DE-11 |a DE-188 |a DE-578 |a DE-19 | ||
050 | 0 | |a HA31.2 | |
082 | 0 | |a 001.4/22 |2 22 | |
082 | 0 | |a 001.422 | |
084 | |a QH 235 |0 (DE-625)141550: |2 rvk | ||
084 | |a QH 244 |0 (DE-625)141558: |2 rvk | ||
084 | |a SK 840 |0 (DE-625)143261: |2 rvk | ||
084 | |a MAT 625f |2 stub | ||
100 | 1 | |a Rubin, Donald B. |d 1943- |e Verfasser |0 (DE-588)131607618 |4 aut | |
245 | 1 | 0 | |a Multiple imputation for nonresponse in surveys |c Donald B. Rubin |
264 | 1 | |a Hoboken, NJ. [u.a.] |b Wiley |c 2004 | |
300 | |a XXIX, 287 S. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 0 | |a Wiley classics library edition | |
650 | 7 | |a Ontbrekende gegevens |2 gtt | |
650 | 7 | |a Survey-onderzoek |2 gtt | |
650 | 4 | |a Multiple imputation (Statistics) | |
650 | 4 | |a Nonresponse (Statistics) | |
650 | 4 | |a Social surveys |x Response rate | |
650 | 0 | 7 | |a Datenerhebung |0 (DE-588)4155272-6 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Ausreißerwert |0 (DE-588)4143602-7 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Umfrage |0 (DE-588)4005227-8 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Statistik |0 (DE-588)4056995-0 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Imputationstechnik |0 (DE-588)4609617-6 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Schätztheorie |0 (DE-588)4121608-8 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Non-response-Problem |0 (DE-588)4133974-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Antwortverweigerung |0 (DE-588)4343644-4 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Schätztheorie |0 (DE-588)4121608-8 |D s |
689 | 0 | 1 | |a Imputationstechnik |0 (DE-588)4609617-6 |D s |
689 | 0 | |5 DE-604 | |
689 | 1 | 0 | |a Umfrage |0 (DE-588)4005227-8 |D s |
689 | 1 | 1 | |a Statistik |0 (DE-588)4056995-0 |D s |
689 | 1 | |5 DE-604 | |
689 | 2 | 0 | |a Antwortverweigerung |0 (DE-588)4343644-4 |D s |
689 | 2 | 1 | |a Imputationstechnik |0 (DE-588)4609617-6 |D s |
689 | 2 | |5 DE-604 | |
689 | 3 | 0 | |a Umfrage |0 (DE-588)4005227-8 |D s |
689 | 3 | 1 | |a Non-response-Problem |0 (DE-588)4133974-5 |D s |
689 | 3 | 2 | |a Imputationstechnik |0 (DE-588)4609617-6 |D s |
689 | 3 | |5 DE-188 | |
689 | 4 | 0 | |a Datenerhebung |0 (DE-588)4155272-6 |D s |
689 | 4 | |8 1\p |5 DE-604 | |
689 | 5 | 0 | |a Ausreißerwert |0 (DE-588)4143602-7 |D s |
689 | 5 | |8 2\p |5 DE-604 | |
776 | 0 | 8 | |i Erscheint auch als |n Onlineausgabe |z 978-0-470-31669-6 |
856 | 4 | 2 | |m Digitalisierung UB Bamberg |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=015025147&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-015025147 | ||
883 | 1 | |8 1\p |a cgwrk |d 20201028 |q DE-101 |u https://d-nb.info/provenance/plan#cgwrk | |
883 | 1 | |8 2\p |a cgwrk |d 20201028 |q DE-101 |u https://d-nb.info/provenance/plan#cgwrk |
Record in the search index
_version_ | 1804135734667378688 |
---|---|
adam_text | Contents
TABLES AND FIGURES xxiv
GLOSSARY xxvii
1. INTRODUCTION 1
1.1. Overview 1
  Nonresponse in Surveys 1
  Multiple Imputation 2
  Can Multiple Imputation Be Used in Nonsurvey Problems? 3
  Background 4
1.2. Examples of Surveys with Nonresponse 4
  Example 1.1. Educational Testing Service's Sample Survey of Schools 4
  Example 1.2. Current Population Survey and Missing Incomes 5
  Example 1.3. Census Public-Use Data Bases and Missing Occupation Codes 6
  Example 1.4. Normative Aging Study of Drinking 7
1.3. Properly Handling Nonresponse 7
  Handling Nonresponse in Example 1.1 7
  Handling Nonresponse in Example 1.2 9
  Handling Nonresponse in Example 1.3 10
  Handling Nonresponse in Example 1.4 10
  The Variety of Objectives When Handling Nonresponse 11
1.4. Single Imputation 11
  Imputation Allows Standard Complete-Data Methods of Analysis to Be Used 11
  Imputation Can Incorporate Data Collector's Knowledge 12
  The Problem with One Imputation for Each Missing Value 12
  Example 1.5. Best-Prediction Imputation in a Simple Random Sample 13
  Example 1.6. Drawing Imputations from a Distribution (Example 1.5 continued) 14
1.5. Multiple Imputation 15
  Advantages of Multiple Imputation 15
  The General Need to Display Sensitivity to Models of Nonresponse 16
  Disadvantages of Multiple Imputation 17
1.6. Numerical Example Using Multiple Imputation 19
  Analyzing This Multiply-Imputed Data Set 19
  Creating This Multiply-Imputed Data Set 22
1.7. Guidance for the Reader 22
Problems 23
2. STATISTICAL BACKGROUND 27
2.1. Introduction 27
  Random Indexing of Units 27
2.2. Variables in the Finite Population 28
  Covariates X 28
  Outcome Variables Y 29
  Indicator for Inclusion in the Survey I 29
  Indicator for Response in the Survey R 30
  Stable Response 30
  Surveys with Stages of Sampling 31
2.3. Probability Distributions and Related Calculations 31
  Conditional Probability Distributions 32
  Probability Specifications Are Symmetric in Unit Indices 32
  Bayes's Theorem 33
  Finding Means and Variances from Conditional Means and Variances 33
2.4. Probability Specifications for Indicator Variables 35
  Sampling Mechanisms 35
  Examples of Unconfounded Probability Sampling Mechanisms 37
  Examples of Confounded and Nonprobability Sampling Mechanisms 38
  Response Mechanisms 38
2.5. Probability Specifications for (X, Y) 39
  de Finetti's Theorem 40
  Some Intuition 40
  Example 2.1. A Simple Normal Model for Y_i 40
  Lemma 2.1. Distributions Relevant to Example 2.1 41
  Example 2.2. A Generalization of Example 2.1 42
  Example 2.3. An Application of Example 2.2: The Bayesian Bootstrap 44
  Example 2.4. Y_i Approximately Proportional to X_i 46
2.6. Bayesian Inference for a Population Quantity 48
  Notation 48
  The Posterior Distribution for Q(X, Y) 48
  Relating the Posterior Distribution of Q to the Posterior Distribution of Y_nob 49
  Ignorable Sampling Mechanisms 50
  Result 2.1. An Equivalent Definition for Ignorable Sampling Mechanisms 50
  Ignorable Response Mechanisms 51
  Result 2.2. Ignorability of the Response Mechanism When the Sampling Mechanism Is Ignorable 51
  Result 2.3. The Practical Importance of Ignorable Mechanisms 52
  Relating Ignorable Sampling and Response Mechanisms to Standard Terminology in the Literature on Parametric Inference from Incomplete Data 53
2.7. Interval Estimation 54
  General Interval Estimates 55
  Bayesian Posterior Coverage 55
  Example 2.5. Interval Estimation in the Context of Example 2.1 56
  Fixed-Response Randomization-Based Coverage 56
  Random-Response Randomization-Based Coverage 58
  Nominal versus Actual Coverage of Intervals 58
2.8. Bayesian Procedures for Constructing Interval Estimates, Including Significance Levels and Point Estimates 59
  Highest Posterior Density Regions 59
  Significance Levels: p-Values 60
  Point Estimates 62
2.9. Evaluating the Performance of Procedures 62
  A Protocol for Evaluating Procedures 63
  Result 2.4. The Average Coverages Are All Equal to the Probability That C Includes Q 64
  Further Comments on Calibration 64
2.10. Similarity of Bayesian and Randomization-Based Inferences in Many Practical Cases 65
  Standard Asymptotic Results Concerning Bayesian Procedures 66
  Extensions of These Standard Results 66
  Practical Conclusions of Asymptotic Results 67
  Relevance to the Multiple-Imputation Approach to Nonresponse 67
Problems 68
3. UNDERLYING BAYESIAN THEORY 75
3.1. Introduction and Summary of Repeated-Imputation Inferences 75
  Notation 75
  Combining the Repeated Complete-Data Estimates and Variances 76
  Scalar Q 77
  Significance Levels Based on the Combined Estimates and Variances 77
  Significance Levels Based on Repeated Complete-Data Significance Levels 78
  Example 3.1. Inference for Regression Coefficients 79
3.2. Key Results for Analysis When the Multiple Imputations Are Repeated Draws from the Posterior Distribution of the Missing Values 81
  Result 3.1. Averaging the Completed-Data Posterior Distribution of Q over the Posterior Distribution of Y_mis to Obtain the Actual Posterior Distribution of Q 82
  Example 3.2. The Normal Model Continued 82
  The Posterior Cumulative Distribution Function of Q 83
  Result 3.2. Posterior Mean and Variance of Q 84
  Simulating the Posterior Mean and Variance of Q 85
  Missing and Observed Information with Infinite m 85
  Inference for Q from Repeated Completed-Data Means and Variances 86
  Example 3.3. Example 3.2 Continued 87
3.3. Inference for Scalar Estimands from a Modest Number of Repeated Completed-Data Means and Variances 87
  The Plan of Attack 88
  The Sampling Distribution of S_m Given (X, Y_obs, R_inc) 88
  The Conditional Distribution of (Q̄_∞, Ū_∞) Given S_m and B_∞ 89
  The Conditional Distribution of Q Given S_m and B_∞ 89
  The Conditional Distribution of B_∞ Given S_m 90
  The Conditional Distribution of Ū_∞ + (1 + m^-1)B_∞ Given S_m 90
  Approximation 3.1 Relevant to the Behrens-Fisher Distribution 91
  Applying Approximation 3.1 to Obtain (3.3.9) 92
  The Approximating t Reference Distribution for Scalar Q 92
  Example 3.4. Example 3.3 Continued 92
  Fraction of Information Missing Due to Nonresponse 93
3.4. Significance Levels for Multicomponent Estimands from a Modest Number of Repeated Completed-Data Means and Variance-Covariance Matrices 94
  The Conditional Distribution of Q Given S_m and B_∞ 94
  The Bayesian p-Value for a Null Value Q_0 Given S_m: General Expression 95
  The Bayesian p-Value Given S_m with Scalar Q 95
  The Bayesian p-Value Given S_m with Scalar Q: Closed-Form Approximation 96
  p-Values with B_∞ a Priori Proportional to T_∞ 96
  p-Values with B_∞ a Priori Proportional to T_∞: Closed-Form Approximation 97
  p-Values When B_∞ Is Not a Priori Proportional to Ū_∞ 98
3.5. Significance Levels from Repeated Completed-Data Significance Levels 99
  A New Test Statistic 99
  The Asymptotic Equivalence of D̃_m and D̂_m: Proof 100
  Integrating over r_m to Obtain a Significance Level from Repeated Completed-Data Significance Levels 100
3.6. Relating the Completed-Data and Complete-Data Posterior Distributions When the Sampling Mechanism Is Ignorable 102
  Result 3.3. The Completed-Data and Complete-Data Posterior Distributions Are Equal When Sampling and Response Mechanisms Are Ignorable 103
  Using i.i.d. Modeling 104
  Result 3.4. The Equality of Completed-Data and Complete-Data Posterior Distributions When Using i.i.d. Models 104
  Example 3.5. A Situation in Which Conditional on θ_XY, the Completed-Data and Complete-Data Posterior Distributions of Q Are Equal: Condition (3.6.7) 105
  Example 3.6. Cases in Which Condition (3.6.7) Nearly Holds 105
  Example 3.7. Situations in Which the Completed-Data and Complete-Data Posterior Distributions of θ_XY Are Equal: Condition (3.6.8) 106
  Example 3.8. A Simple Case Illustrating the Large-Sample Equivalence of Completed-Data and Complete-Data Posterior Distributions of θ_XY 106
  The General Use of Complete-Data Statistics 106
Problems 107
4. RANDOMIZATION-BASED EVALUATIONS 113
4.1. Introduction 113
  Major Conclusions 113
  Large-Sample Relative Efficiency of Point Estimates 114
  Large-Sample Coverage of t-Based Interval Estimates 114
  Outline of Chapter 115
4.2. General Conditions for the Randomization-Validity of Infinite-m Repeated-Imputation Inferences 116
  Complications in Practice 117
  More General Conditions for Randomization-Validity 117
  Definition: Proper Multiple-Imputation Methods 118
  Result 4.1. If the Complete-Data Inference Is Randomization-Valid and the Multiple-Imputation Procedure Is Proper, Then the Infinite-m Repeated-Imputation Inference Is Randomization-Valid under the Posited Response Mechanism 119
4.3. Examples of Proper and Improper Imputation Methods in a Simple Case with Ignorable Nonresponse 120
  Example 4.1. Simple Random Multiple Imputation 120
  Why Variability Is Underestimated Using the Multiple-Imputation Hot-Deck 122
  Example 4.2. Fully Normal Bayesian Repeated Imputation 123
  Example 4.3. A Nonnormal Bayesian Imputation Procedure That Is Proper for the Standard Inference: The Bayesian Bootstrap 123
  Example 4.4. An Approximately Bayesian yet Proper Imputation Method: The Approximate Bayesian Bootstrap 124
  Example 4.5. The Mean and Variance Adjusted Hot-Deck 124
4.4. Further Discussion of Proper Imputation Methods 125
  Conclusion 4.1. Approximate Repetitions from a Bayesian Model Tend to Be Proper 125
  The Heuristic Argument 126
  Messages of Conclusion 4.1 126
  The Importance of Drawing Repeated Imputations Appropriate for the Posited Response Mechanism 127
  The Role of the Complete-Data Statistics in Determining Whether a Repeated Imputation Method Is Proper 127
4.5. The Asymptotic Distribution of (Q̄_m, Ū_m, B_m) for Proper Imputation Methods 128
  Validity of the Asymptotic Sampling Distribution of S_m 128
  The Distribution of (Q̄_m, Ū_m, B_m) Given (X, Y) for Scalar Q 129
  Random-Response Randomization-Based Justification for the t Reference Distribution 130
  Extension of Results to Multicomponent Q 131
  Asymptotic Efficiency of Q̄_m Relative to Q̄_∞ 131
4.6. Evaluations of Finite-m Inferences with Scalar Estimands 132
  Small-Sample Efficiencies of Asymptotically Proper Imputation Methods from Examples 4.2-4.5 132
  Large-Sample Coverages of Interval Estimates Using a t Reference Distribution and Proper Imputation Methods 134
  Small-Sample Monte Carlo Coverages of Asymptotically Proper Imputation Methods from Examples 4.2-4.5 135
  Evaluation of Significance Levels 135
4.7. Evaluation of Significance Levels from the Moment-Based Statistics D_m and D̃_m with Multicomponent Estimands 137
  The Level of a Significance Testing Procedure 138
  The Level of D_m: Analysis for Proper Imputation Methods and Large Samples 138
  The Level of D_m: Numerical Results 139
  The Level of D̃_m: Analysis 139
  The Effect of Unequal Fractions of Missing Information on D̃_m 141
  Some Numerical Results for D̃_m with k' = (k + 1)ν/2 141
4.8. Evaluation of Significance Levels Based on Repeated Significance Levels 144
  The Statistic D̂_m 144
  The Asymptotic Sampling Distribution of d̄_m and s_d² 144
  Some Numerical Results for D̂_m 145
  The Superiority of Multiple Imputation Significance Levels 145
Problems 148
5. PROCEDURES WITH IGNORABLE NONRESPONSE 154
5.1. Introduction 154
  No Direct Evidence to Contradict Ignorable Nonresponse 155
  Adjust for All Observed Differences and Assume Unobserved Residual Differences Are Random 155
  Univariate Y_i and Many Respondents at Each Distinct Value of X_i That Occurs Among Nonrespondents 156
  The More Common Situation, Even with Univariate Y_i 156
  A Popular Implicit Model: The Census Bureau's Hot-Deck 157
  Metric-Matching Hot-Deck Methods 158
  Least-Squares Regression 159
  Outline of Chapter 159
5.2. Creating Imputed Values under an Explicit Model 160
  The Modeling Task 160
  The Imputation Task 161
  Result 5.1. The Imputation Task with Ignorable Nonresponse 162
  The Estimation Task 163
  Result 5.2. The Estimation Task with Ignorable Nonresponse When θ_{Y|X} and θ_X Are a Priori Independent 164
  Result 5.3. The Estimation Task with Ignorable Nonresponse, θ_{Y|X} and θ_X a Priori Independent, and Univariate Y_i 165
  A Simplified Notation 165
5.3. Some Explicit Imputation Models with Univariate Y_i and Covariates 166
  Example 5.1. Normal Linear Regression Model with Univariate Y_i 166
  Example 5.2. Adding a Hot-Deck Component to the Normal Linear Regression Imputation Model 168
  Extending the Normal Linear Regression Model 168
  Example 5.3. A Logistic Regression Imputation Model for Dichotomous Y_i 169
5.4. Monotone Patterns of Missingness in Multivariate Y_i 170
  Monotone Missingness in Y: Definition 171
  The General Monotone Pattern: Description of General Techniques 171
  Example 5.4. Bivariate Y_i and an Implicit Imputation Model 172
  Example 5.5. Bivariate Y_i with an Explicit Normal Linear Regression Model 173
  Monotone-Distinct Structure 174
  Result 5.4. The Estimation Task with a Monotone-Distinct Structure 175
  Result 5.5. The Imputation Task with a Monotone-Distinct Structure 177
5.5. Missing Social Security Benefits in the Current Population Survey 178
  The CPS-IRS-SSA Exact Match File 178
  The Reduced Data Base 179
  The Modeling Task 179
  The Estimation Task 180
  The Imputation Task 181
  Results Concerning Absolute Accuracies of Prediction 181
  Inferences for the Average OASDI Benefits for the Nonrespondents in the Sample 184
  Results on Inferences for Population Quantities 185
5.6. Beyond Monotone Missingness 186
  Two Outcomes Never Jointly Observed: Statistical Matching of Files 186
  Example 5.6. Two Normal Outcomes Never Jointly Observed 187
  Problems Arising with Nonmonotone Patterns 188
  Discarding Data to Obtain a Monotone Pattern 189
  Assuming Conditional Independence Among Blocks of Variables to Create Independent Monotone Patterns 190
  Using Computationally Convenient Explicit Models 191
  Iteratively Using Methods for Monotone Patterns 192
  The Sampling/Importance Resampling Algorithm 192
  Some Details of SIR 193
  Example 5.7. An Illustrative Application of SIR 194
Problems 195
6. PROCEDURES WITH NONIGNORABLE NONRESPONSE 202
6.1. Introduction 202
  Displaying Sensitivity to Models for Nonresponse 202
  The Need to Use Easily Communicated Models 203
  Transformations to Create Nonignorable Imputed Values from Ignorable Imputed Values 203
  Other Simple Methods for Creating Nonignorable Imputed Values Using Ignorable Imputation Models 203
  Essential Statistical Issues and Outline of Chapter 204
6.2. Nonignorable Nonresponse with Univariate Y_i and No X_i 205
  The Modeling Task 205
  The Imputation Task 206
  The Estimation Task 206
  Two Basic Approaches to the Modeling Task 207
  Example 6.1. The Simple Normal Mixture Model 207
  Example 6.2. The Simple Normal Selection Model 209
6.3. Formal Tasks with Nonignorable Nonresponse 210
  The Modeling Task: Notation 210
  Two General Approaches to the Modeling Task 211
  Similarities with Ignorable Case 211
  The Imputation Task 212
  Result 6.1. The Imputation Task with Nonignorable Nonresponse 212
  Result 6.2. The Imputation Task with Nonignorable Nonresponse When Each Unit Is Either Included in or Excluded from the Survey 212
  The Estimation Task 213
  Result 6.3. The Estimation Task with Nonignorable Nonresponse When θ_{YR|X} Is a Priori Independent of θ_X 213
  Result 6.4. The Estimation Task with Nonignorable Nonresponse When θ_{R|XY} Is a Priori Independent of (θ_{Y|X}, θ_X) and Each Unit Is Either Included in or Excluded from the Survey 213
  Result 6.5. The Imputation and Estimation Tasks with Nonignorable Nonresponse and Univariate Y_i 214
  Monotone Missingness 214
  Result 6.6. The Estimation and Imputation Tasks with a Monotone-Distinct Structure and a Mixture Model for Nonignorable Nonresponse 214
  Selection Modeling and Monotone Missingness 215
6.4. Illustrating Mixture Modeling Using Educational Testing Service Data 215
  The Data Base 216
  The Modeling Task 216
  Clarification of Prior Distribution Relating Nonrespondent and Respondent Parameters 217
  Comments on Assumptions 218
  The Estimation Task 219
  The Imputation Task 219
  Analysis of Multiply-Imputed Data 221
6.5. Illustrating Selection Modeling Using CPS Data 222
  The Data Base 223
  The Modeling Task 224
  The Estimation Task 225
  The Imputation Task 225
  Accuracy of Results for Single Imputation Methods 226
  Estimates and Standard Errors for Average log(wage) for Nonrespondents in the Sample 227
  Inferences for Population Mean log(wage) 229
6.6. Extensions to Surveys with Follow-Ups 229
  Ignorable Nonresponse 231
  Nonignorable Nonresponse with 100% Follow-Up Response 231
  Example 6.3. 100% Follow-Up Response in a Simple Random Sample of Y_i 232
  Ignorable Hard-Core Nonresponse Among Follow-Ups 233
  Nonignorable Hard-Core Nonresponse Among Follow-Ups 233
  Waves of Follow-Ups 234
6.7. Follow-Up Response in a Survey of Drinking Behavior Among Men of Retirement Age 234
  The Data Base 235
  The Modeling Task 235
  The Estimation Task 235
  The Imputation Task 235
  Inference for the Effect of Retirement Status on Drinking Behavior 239
Problems 240
REFERENCES 244
AUTHOR INDEX 251
SUBJECT INDEX 253
APPENDIX I: Report Written for the Social Security Administration in 1977 259
APPENDIX II: Report Written for the Census Bureau in 1983 268
Tables and Figures
Figure 1.1. Data set with m imputations for each missing datum. 3
Table 1.1. Artificial example of survey data and multiple imputation. 20
Table 1.2. Analysis of multiply-imputed data set of Table 1.1. 21
Figure 2.1. Matrix of variables in a finite population of N units. 29
Figure 2.2. Contours of the posterior distribution of Q with the null value Q_0 indicated. The significance level of Q_0 is the posterior probability that Q is in the shaded area and beyond. 62
Table 4.1. Large-sample relative efficiency (in %) when using a finite number of proper imputations, m, rather than an infinite number, as a function of the fraction of missing information, γ_0: RE = (1 + γ_0/m)^(-1/2). 114
Table 4.2. Large-sample coverage probability (in %) of interval estimates based on the t reference distribution, (3.1.8), as a function of the number of proper imputations, m ≥ 2; the fraction of missing information, γ_0; and the nominal level, 1 − α. Also included for contrast are results based on single imputation, m = 1, using the complete-data normal reference distribution (3.1.1) with Q̂ replaced by Q̄_1 = Q̂_*1 and U replaced by Ū_1 = U_*1. 115
Table 4.3. Simulated coverages (in %) of asymptotically proper multiple (m = 2) imputation procedures with nominal levels 90% and 95%, using t-based inferences, response rates n_1/n, and normal and nonnormal data (Laplace, lognormal = exp N(0, 1)); maximum standard error < 1%. 136
Table 4.4. Large-sample level (in %) of D_m with F_{k,ν} reference distribution as a function of nominal level, α; number of components being tested, k; number of proper imputations, m; and fraction of missing information, γ_0. Accuracy of results = 5000 simulations of (4.7.8) with ρ_0 set to 1. 140
Table 4.5. Large-sample level (in %) of D̃_m with F_{k,(k+1)ν/2} reference distribution as a function of number of components being tested, k; number of proper imputations, m; fraction of missing information, γ_0; and variance of fractions of missing information, 0 (zero), S (small), L (large). Accuracy of results = 5000 simulations of (4.7.9). 142
Table 4.6. Large-sample level (in %) of D̂_m with F_{k,(1+k^-1)ν/2} reference distribution as a function of number of components being tested, k; number of proper imputations, m; fraction of missing information, γ_0; and variance of fractions of missing information, 0 (zero), S (small), L (large). Accuracy of results = 5000 simulations of (4.7.7). 146
Table 4.7. Large-sample level (in %) of [symbol illegible] with χ²_k reference distribution as a function of nominal level α; number of components being tested, k; and fraction of missing information, γ_0. 147
Figure 5.1. A monotone pattern of missingness, 1 = observed, 0 = missing. 171
Figure 5.2. Artificial example illustrating hot-deck multiple imputation with a monotone pattern of missing data; parentheses enclose m = 2 imputations. 172
Table 5.1. Multiple imputations of OASDI benefits for nonrespondents 62-71 years of age. 182
Table 5.2. Multiple imputations of OASDI benefits for nonrespondents over 72 years of age. 183
Table 5.3. Accuracies of imputation methods with respect to mean absolute deviation (MAD) and root mean squared deviation (RMS). 183
Table 5.4. Comparison of estimates (standard errors) for mean OASDI benefits implied by imputation methods for nonrespondent groups in the sample. 184
Table 5.5. Comparison of estimates (standard errors) for mean OASDI benefits implied by imputation methods for groups in the population. 185
Table 5.6. Example from Marini, Olsen and Rubin (1980) illustrating how to obtain a monotone pattern of missing data by discarding data; 1 = observed, 0 = missing. 190
Table 6.1. Summary of repeated-imputation intervals for variable 17B in educational example. 221
Table 6.2. Background variables X for GRZ example on imputation of missing incomes. 223
Table 6.3. Root-mean-squared error of imputations of log-wage: Impute posterior mean given θ fixed at MLE, θ̂. 226
Table 6.4. Repeated-imputation estimates (standard errors) for average log(wage) for nonrespondents in the sample under five imputation procedures. 228
Figure 6.1. Schematic data structure with follow-up surveys of nonrespondents: boldface produces Y data. 230
Table 6.5. Mean alcohol consumption level and retirement status for respondents and nonrespondents within birth cohort: Data from 1982 Normative Aging Study drinking questionnaire. 236
Table 6.6. Summary of least-squares estimates of the regression of log(1 + drinks/day) on retirement status (0 = working, 1 = retired), birth year, and retirement status × birth year interaction. 237
Table 6.7. Five values of regression parameters for nonrespondents drawn from their posterior distribution. 237
Table 6.8. Five imputed values of log(1 + drinks/day) for each of the 74 non-followed-up nonrespondents. 238
Table 6.9. Sets of least-squares estimates from the five data sets completed by imputation. 239
Table 6.10. Repeated-imputation estimates, standard errors, and percentages of missing information for the regression of log(1 + drinks/day) on retirement status, birth year, and retirement status × birth year interaction. 239
|
adam_txt |
Contents
TABUES ANB
FIGURES
xxiv
GLOSSARY
xxvii
1.
INTRODUCTION
1
1.1.
Overview
1
Nonresponse
in Surveys
1
Multiple
Imputation
2
Can Multiple Imputation Be Used in Nonsurvey
Problems?
3
Background
4
1.2.
Examples of Surveys with
Nonresponse
4
Example
1.1.
Educational Testing Service's Sample
Survey of Schools
4
Example
1.2.
Current Population Survey and Missing
Incomes
5
Example
1.3.
Census PubEc-Use Data Bases and
Missing Occupation Codes
6
Example
1.4.
Normative Aging Study of Drinking
7
13.
Properly Handling
Nonresponse
7
Handling Noaresponse is Example
1.1
7
Handling
Nonresponse
in Example
1.2
9
Handing Nonresponss in Example 1J
10
Xii CONTENTS
Handling
Nonresponse
in
Example
1.4 10
The Variety of Objectives When Handling
Nonresponse
11
1.4.
Single Imputation
11
Imputation Allows Standard Complete-Data Methods
of Analysis to Be Used
11
Imputation Can Incorporate Data Collectors
Knowledge
12
The Problem with One Imputation for Each Missing
Value
12
Example
1.5.
Best-Prediction Imputation in a Simple
Random Sample
13
Example
1.6.
Drawing Imputations from a Distribution
(Example
1.5
continued)
14
1.5.
Multiple Imputation
15
Advantages of Multiple Imputation
15
The General Need to Display Sensitivity to Models of
Nonresponse
16
Disadvantages of Multiple Imputation
17
1.6.
Numerical Example Using Multiple Imputation
19
Analyzing This Multiply-Imputed Data Set
19
Creating This Multiply-Imputed Data Set
22
1.7.
Guidance for
aie
Reader
22
Problems
23
2.
STATISTICAL BACKGROUND
27
2.1.
Introduction
27
Random Indexing of Units
27
2.2.
Variables in the Finite Population
28
Covariates X
28
Outcome Variables
Y
29
Indicator for Inclusion in the Survey I
29
Indicator for Response in the Survey
R
30
Stable Response
30
Surveys with Stages of Sampling
31
CONTENTS
ХІІІ
13.
Probability Distributions and Related Calculations
31
Conditional Probability Distributions
32
Probability Specifications Are Symmetric in Unit Indices
32
Bayes's Theorem
33
Finding Means and Variances from Conditional Means
and Variances
33
2.4.
Probability Specifications for Indicator Variables
35
Sampling Mechanisms
35
Examples of Unconfounded Probability Sampling
Mechanisms
37
Examples of Confounded and Nonprobabflity Sampling
Mechanisms
38
Response Mechanisms
38
2.5.
Probability Specifications for
(X, F
) 39
de Finetti's
Theorem
40
Some Intuition
40
Example
2.1.
A Simple Normal Model for
Y¡
40
Lemma
2.1,
Distributions Relevant to Example
2.1 41
Example
2.2.
A Generalization of Example
2.1 42
Example
2.3.
An Application of Example
2.2:
The Bayesian Bootstrap
44
Example
2.4.
Y- Approximately Proportional to
X¡
46
2.6.
Bayesian Inference for a Population Quantity
48
Notation
48
The Posterior Distribution for Q{X, Y)
48
Relating the Posterior Distribution of
Q
to the Posterior
Distribution of Ynob
49
Ignorable
Sampling Mechanisms SO
Result
2.1.
An Equivalent Definition for
Ignorable
Sampling Mechanisms SO
Ignorable
Response Mechanisms SI
Result
2.2.
IgHOrabiity of the Response Mechanism
When the Sampling Mechanism Is
Ignorable SI
Result
2.3.
Тће
Practical
Importane»
of IgaoraMe
Mechanisms
52
xiv CONTENTS
Relating
Ignorable
Sampling and Response Mechanisms
to Standard Terminology in the Literature on Parametric
Inference from Incomplete Data
S3
2.7.
Interval
Estimation
54
General Interval Estimates
55
Bayesian Posterior Coverage
55
Example
2.5.
Interval Estimation in the Context of
Example
2.1 56
Fixed-Response Randomization-Based Coverage
56
Random-Response Randomization-Based Coverage
58
Nominal versus Actual Coverage of Intervals
58
2.8.
Bayesian Procedures for Constructing Interval Estimates,
Including Significance Levels and Point Estimates
59
Highest Posterior Density
Repons
59
Significance Levels
—
/»-Values
60
Point Estimates
62
2.9.
Evaluating me Performance of Procedures
62
A Protocol for Evaluating Procedures
63
Result
2.4.
The Average Coverages Are All Equal to
the Probability That
С
Includes
Q
64
Further Comments on Calibration
64
2.10.
Similarity of Bayesian and Randomization-Based
Inferences in
Mány
Practical Cases
65
Standard Asymptotic Results Concerning Bayesian
Procedures
66
Extensions of These Standard Results
66
Practical Conclusions of Asymptotic Results
67
Relevance to tbe Multiple-Imputation Approach to
Nonresponse
67
Problems
68
3.
UNDERLYING BAYESIAN THEORY
75
3.1.
introduction and Summary of Repeated-Imputation
Inferences
75
Notation
75
CONTENTS
XV
Combining the Repeated Complete-Data Estimates
and Variances
76
Scalar
g
77
Significance Levels Based on the Combined
Estimates and Variances
77
Significance Levels Based on Repeated
Compiete-Data
Significance Levels
78
Example
3.1.
Inference for Regression Coefficients
79
3.2.
Key Results for Analysis Wben
tne
Multiple Imputations
Are Repeated Draws from the Posterior
Distribution
of
the Missing Values
81
Result
3.1.
Averaging the Completed-Data Posterior
Distribution of
Q
over the Posterior Distribution of Ymit
to Obtain the Actual Posterior Distribution of
Q
82
Example
32.
The Normal Model Continued
82
The Posterior Cumulative Distribution Function of
Q
83
Result
3.2.
Posterior Mean and Variance of
Q
84
Simulating the Posterior Mean and Variance of
Q
85
Missing and Observed Information with Infinite
m
85
Inference for
Q
from Repeated Completed-Data Means
and Variances
86
Example
3
J. Example
32
Continued
87
33.
Inference for Scalar Estimands from
ш
Modest
Number
of Repeated Completed-Data Means and Variances
87
Тће
Plan of Attack
88
The Samplmg Distribution of Sm Gives (X, Yebs, Riae)
88
The Conditional Distribution of (QmS Um) Given Sm
and
S„
89
The
Condiţional
Distribution of
Q
Given
Ѕт
and JM
89
The Conditional Distribution of
Д»
Given Sm
90
The Conditional
ШїгіЬйїіоа
of
Пт + Џ
+
m~l)Bm
Gives
Ѕ„
90
Approximation
3.1
Relevant to the
Befareas-Fîsner
Distribution
91
Applying
Apfxrosámation
ЗЛ
to
öböia
(3
J
J)
92
The Approximating
t
Refermée
Dìsteibatìoa
for Scalar
Q
92
XVI CONTENTS
Example
3.4.
Example
3.3
Continued
92
Fraction of Information Missing Due to Nonresponse
93
3.4.
Significance Levels for Multicomponent
Estimands
from
a Modest Number of Repeated Completed-Data Means
and Variance-Covariance Matrices
94
The Conditional Distribution of
Q
Given Sm and
B∞
94
The Bayesian p-Value for a Null Value Q0 Given Sm:
General Expression
95
The Bayesian p-Value Given Sm with Scalar
Q
95
The Bayesian
p-Value
Given
Sm with
Scalar
Q
—
Closed-Form Approximation
96
p-Values with
B∞
a Priori Proportional to T∞
96
p-Values with
B∞
a Priori Proportional to
T∞
—
Closed-Form Approximation
97
p-Values When
B∞
Is Not a Priori Proportional
to U∞
98
3.5.
Significance Levels from Repeated Completed-Data
Significance Levels
99
A New Test Statistic
99
The Asymptotic Equivalence of Dm and Dm
—
Proof
100
Integrating over rm to Obtain a Significance Level from
Repeated Completed-Data Significance Levels
100
3.6.
Relating the Completed-Data and Complete-Data
Posterior Distributions When the Sampling Mechanism
Is
Ignorable
102
Result
3.3.
The Completed-Data and Complete-Data
Posterior Distributions Are Equal When Sampling and
Response
Mechanisms Are
Ignorable
103
Using i.i.d. Modeling
104
Result
3.4.
The Equality of Completed-Data and
Complete-Data Posterior Distributions When Using
i.i.d. Models
104
Example
3.5.
A Situation
in
Which Conditional on
θXY, the Completed-Data and Complete-Data Posterior
Distributions of
Q
Are Equal—Condition
(3.6.7) 105
Example
3.6.
Cases in Which Condition
(3.6.7)
Nearly
Holds
105
Example
3.7.
Situations in Which the Completed-Data
and Complete-Data Posterior Distributions of
θXY
Are
Equal—Condition
(3.6.8) 106
Example 3.8. A Simple Case Illustrating the Large-
Sample Equivalence of Completed-Data and Complete-
Data Posterior Distributions of θXY
106
The General Use of Complete-Data Statistics
106
Problems
107
4.
RANDOMIZATION-BASED EVALUATIONS
113
4.1.
Introduction
113
Major Conclusions
113
Large-Sample Relative Efficiency of Point Estimates
114
Large-Sample Coverage of t-Based Interval Estimates
114
Outline of Chapter
115
4.2.
General Conditions for the Randomization-Validity of
Infinite-m
Repeated-Imputation Inferences
116
Complications in Practice
117
More General Conditions for Randomization-Validity
117
Definition: Proper Multiple-Imputation Methods
118
Result
4.1.
If the Complete-Data Inference Is
Randomization-Valid and the Multiple-Imputation
Procedure Is Proper, Then the Infinite-m Repeated-
Imputation Inference Is Randomization-Valid
under
the
Posited Response Mechanism
119
4.3.
Examples of Proper and Improper Imputation Methods
in a Simple Case with
Ignorable
Nonresponse
120
Example
4.1.
Simple Random Multiple Imputation
120
Why Variability Is Underestimated Using the Multiple-
Imputation Hot-Deck
122
Example
4.2.
Fully
Normal
Bayesian
Repeated
Imputation
123
Example
4.3.
A
Nonnormal Bayesian
Imputation
Procedure That Is Proper for the Standard Inference
—
The Bayesian Bootstrap
123
Example
4.4.
An Approximately Bayesian yet Proper
Imputation Method: The Approximate
Bayesian
Bootstrap
124
Example
4.5.
The Mean and Variance Adjusted
Hot-Deck
124
4.4.
Further Discussion of Proper Imputation Methods
125
Conclusion
4.1.
Approximate Repetitions from a
Bayesian Model Tend to Be Proper
125
The Heuristic Argument
126
Messages of Conclusion
4.1 126
The Importance of Drawing Repeated Imputations
Appropriate for the Posited Response Mechanism
127
The Role of the
Complete-Data
Statistics in Determining
Whether a Repeated Imputation Method Is Proper
127
4.5.
The Asymptotic Distribution of (Qm, Um, Bm) for Proper
Imputation
Methods
128
Validity of the Asymptotic Sampling Distribution of Sm
128
The
Distribution
of (Qm,
Um,
Bm)
Given
(X,
Y)
for Scalar Q
129
Random-Response Randomization-Based Justification
for the
t
Reference Distribution
130
Extension of Results to Multicomponent
Q
131
Asymptotic Efficiency of Qm Relative to
Q∞
131
4.6.
Evaluations of Finite-m Inferences with Scalar Estimands
132
Small-Sample Efficiencies of Asymptotically Proper
Imputation Methods from Examples
4.2-4.5 132
Large-Sample Coverages of Interval Estimates Using a
t
Reference Distribution and Proper Imputation
Methods
134
Small-Sample Monte Carlo Coverages of Asymptotically
Proper Imputation Methods from Examples
4.2-4.5 135
Evaluation of Significance Levels
135
4.7.
Evaluation of Significance Levels from the Moment-
Based Statistics Dm and Dm with Multicomponent
Estimands
137
The Level of a Significance Testing
Procedure
138
The Level of Dm
—
Analysis for Proper Imputation
Methods and Large Samples
138
The Level of
Dm:
Numerical Results
139
The Level
of Dm: Analysis
139
The Effect of Unequal Fractions of Missing Information
on Dm
141
Some Numerical Results for
Ďm
with k'
=
(k
+
1)ν/2
141
4.8.
Evaluation of Significance Levels Based on Repeated
Significance Levels
144
The Statistic Dm
144
The Asymptotic Sampling Distribution of dm and
s²d
144
Some Numerical Results for Dm
145
The Superiority of Multiple Imputation Significance
Levels
145
Problems
148
5.
PROCEDURES WITH
IGNORABLE
NONRESPONSE
154
5.1.
Introduction
154
No Direct Evidence to Contradict
Ignorable
Nonresponse
155
Adjust for All Observed Differences and Assume
Unobserved Residual Differences Are Random
155
Univariate
Yi
and Many Respondents at Each Distinct
Value of Xi That Occurs Among Nonrespondents
156
The More Common Situation, Even with Univariate
Yi
156
A Popular Implicit Model
—
The
Census
Bureau's
Hot-Deck
157
Metric-Matching Hot-Deck Methods
158
Least-Squares Regression
159
Outline of Chapter
159
5.2.
Creating Imputed Values under an Explicit Model
160
The Modeling Task
160
The Imputation Task
161
Result
5.1.
The Imputation Task with
Ignorable
Nonresponse
162
The Estimation Task
163
Result
5.2.
The
Estimation Task with
Ignorable
Nonresponse When
θY|X
and θX Are
a
Priori
Independent
164
Result
5.3.
The Estimation Task with
Ignorable
Nonresponse,
θY|X
and
θX
a Priori Independent,
and Univariate
Yi
165
A Simplified Notation
165
5.3.
Some Explicit Imputation Models with Univariate
Yi
and
Covariates
166
Example
5.1.
Normal Linear
Regression
Model with
Univariate
Yi
166
Example
5.2.
Adding a Hot-Deck Component to the
Normal Linear Regression Imputation Model
168
Extending the Normal Linear Regression Model
168
Example
5.3.
A Logistic Regression Imputation Model
for Dichotomous
Yi
169
5.4.
Monotone Patterns of Missingness in Multivariate
Yi
170
Monotone Missingness in
Y
—
Definition
171
The General Monotone Pattern
—
Description of General
Techniques
171
Example
5.4.
Bivariate
Yi
and an Implicit Imputation
Model
172
Example
5.5.
Bivariate
Yi
with an Explicit Normal
Linear Regression Model
173
Monotone-Distinct Structure
174
Result
5.4.
The Estimation Task with a Monotone-
Distinct Structure
175
Result
5.5.
The Imputation Task with a Monotone-
Distinct Structure
177
5.5.
Missing
Social
Security Benefits in the Current
Population Survey
178
The CPS-IRS-SSA Exact Match File
178
The Reduced Data Base
179
The Modeling Task
179
The Estimation Task
180
The Imputation Task
181
Results Concerning Absolute Accuracies of Prediction
181
Inferences for the Average
OASDI
Benefits for the
Nonrespondents in the Sample
184
Results on Inferences for Population Quantities
185
5.6.
Beyond
Monotone
Missingness
186
Two Outcomes Never Jointly Observed
—
Statistical
Matching of Files
186
Example
5.6.
Two Normal Outcomes Never Jointly
Observed
187
Problems Arising with
Nonmonotone
Patterns
188
Discarding Data to Obtain a Monotone Pattern
189
Assuming Conditional Independence Among Blocks of
Variables to Create Independent Monotone Patterns
190
Using Computationally Convenient Explicit Models
191
Iteratively Using Methods for Monotone Patterns
192
The Sampling/Importance Resampling Algorithm
192
Some Details of SIR
193
Example
5.7.
An Illustrative Application of SIR
194
Problems
195
6.
PROCEDURES WITH
NONIGNORABLE
NONRESPONSE
202
6.1.
Introduction
202
Displaying Sensitivity to Models for
Nonresponse
202
The Need to Use Easily Communicated Models
203
Transformations to Create Nonignorable Imputed
Values from
Ignorable
Imputed Values
203
Other Simple Methods for Creating Nonignorable
Imputed Values Using
Ignorable
Imputation Models
203
Essential Statistical Issues and Outline of Chapter
204
6.2.
Nonignorable
Nonresponse
with Univariate
Yi
and No Xi
205
The Modeling Task
205
The Imputation Task
206
The Estimation Task
206
Two Basic Approaches to the
Modeling
Task
207
Example
6.1.
The Simple
Normal
Mixture Model
207
Example
6.2.
The Simple Normal Selection Model
209
6.3.
Formal Tasks with Nonignorable Nonresponse
210
The Modeling Task—Notation
210
Two General Approaches to the Modeling Task
211
Similarities with
Ignorable
Case
211
The Imputation Task
212
Result
6.1.
The Imputation Task with Nonignorable
Nonresponse
212
Result
6.2.
The Imputation Task with Nonignorable
Nonresponse
When Each Unit Is Either Included in or
Excluded from the Survey
212
The Estimation Task
213
Result
6.3.
The Estimation Task with Nonignorable
Nonresponse
When
θYR|X
Is a Priori Independent of
θχ
213
Result
6.4.
The Estimation Task with Nonignorable
Nonresponse
When θR|XY Is a Priori Independent of
(θY|X,
θX) and Each Unit Is Either Included in or
Excluded from the Survey
213
Result 6.5. The Imputation and Estimation Tasks with
Nonignorable
Nonresponse
and Univariate
Yi
214
Monotone Missingness
214
Result
6.6.
The Estimation and Imputation Tasks with
a Monotone-Distinct Structure and a Mixture Model for
Nonignorable
Nonresponse
214
Selection Modeling and Monotone Missingness
215
6.4.
Illustrating Mixture Modeling Using Educational Testing
Service Data
215
The Data Base
216
The Modeling Task
216
Clarification of Prior Distribution Relating
Nonrespondent
and Respondent Parameters
217
Comments on Assumptions
218
The Estimation Task
219
The Imputation Task
219
Analysis of Multiply-Imputed Data
221
6.5.
Illustrating Selection Modeling Using CPS Data
222
The Data Base
223
The Modeling Task
224
The Estimation Task
225
The Imputation Task
225
Accuracy of Results for Single Imputation Methods
226
Estimates and Standard Errors for Average log(wage)
for Nonrespondents in the Sample
22Î
Inferences for Population Mean log(wage)
229
6.6.
Extensions to Surveys with Follow-Ups
229
Ignorable
Nonresponse
231
Nonignorable
Nonresponse
with
100%
Follow-Up
Response
231
Example
6.3. 100%
Follow-Up Response in a Simple
Random Sample of
Yi
232
Ignorable
Hard-Core
Nonresponse
Among Follow-Ups
233
Nonignorable Hard-Core
Nonresponse
Among Follow-Ups
233
Waves of Follow-Ups
234
6.7.
Follow-Up Response in a Survey of Drinking Behavior
Among
Men
of Retirement Age
234
The Data Base
235
The Modeling Task
235
The Estimation Task
235
The Imputation Task
235
Inference for the Effect of Retirement Status on Drinking
Behavior
239
Problems
240
REFERENCES
244
AUTHOR INDEX
251
SUBJECT INDEX 253
APPENDIX I: Report Written for the Social Security
Administration in
1977 259
APPENDIX II: Report Written for the Census
Bureau
in
1983 268
Tables
and Figures
Figure
1.1.
Data set with
m
imputations for each missing datum.
3
Table
1.1.
Artificial example of survey data and multiple imputation.
20
Table 1.2.
Analysis of multiply-imputed data set of Table
1.1. 21
Figure
2.1.
Matrix of variables in a finite population of N units.
29
Figure
2.2.
Contours of the posterior distribution of
Q
with the
null value Q0 indicated. The significance level of Q0 is
the posterior probability that
Q
is in the shaded area
and beyond.
62
Table
4.1.
Large-sample relative efficiency (in
%)
when using a
finite number of proper imputations, m, rather than an
infinite
number, as a function of the fraction of missing
information,
γ0:
RE = (1 + γ0/m)^(-1/2).
114
Table
4.2.
Large-sample coverage probability (in
%)
of interval
estimates based on the
t
reference distribution,
(3.1.8),
as a function of the number of proper imputations,
m ≥ 2;
the fraction of missing information,
γ0;
and the
nominal level,
1 - α. Also included for contrast are
results based on single imputation,
m
= 1,
using the
complete-data
normal reference distribution
(3.1.1)
with
Q
replaced by Qx
=
Qmi and
U
replaced by
Ux
=
Umi.
115
Table
4.3.
Simulated coverages (in
%)
of
asymptotically proper
multiple (m
= 2)
imputation procedures with nominal
levels
90%
and
95%,
using
t-based
inferences, response
rates n1/n, and normal and
nonnormal
data (Laplace,
lognormal = exp N(0, 1));
maximum standard error
< 1%. 136
Table 4.4.
Large-sample level (in %) of Dm with Fk,ν reference
distribution as a function of nominal level, α; number
of components being tested, k; number of proper
imputations, m; and fraction of missing information,
γ0. Accuracy of results = 5000 simulations of (4.7.8)
with p0 set to I.
140
Table 4.5.
Large-sample level (in %) of Dm with Fk,(k+1)ν/2
reference distribution as a function of number of
components being tested, k; number of proper imputations,
m; fraction of missing information, γ0; and variance of
fractions of missing information, 0 (zero), S (small),
L (large). Accuracy of results = 5000 simulations of (4.7.9).
142
Table 4.6.
Large-sample level (in %) of Dm with Fk,(1+k^-1)ν/2
reference distribution as a function of number of
components being tested, k; number of proper imputations,
m; fraction of missing information, γ0; and variance of
fractions of missing information, 0 (zero), S (small),
L (large). Accuracy of results = 5000 simulations of (4.7.7).
146
Table 4.7.
Large-sample level (in %) of Dm with χ²k reference
distribution as a function of nominal level, α; number
of components being tested, k; and fraction of missing
information, γ0.
147
Figure 5.1.
A monotone pattern of missingness, 1 = observed,
0 = missing.
171
Figure 5.2.
Artificial example illustrating hot-deck multiple
imputation with a monotone pattern of missing data;
parentheses enclose m = 2 imputations.
172
Table 5.1.
Multiple imputations of OASDI benefits for
nonrespondents 62-71 years of age.
182
Table 5.2.
Multiple imputations of OASDI benefits for
nonrespondents over 72 years of age.
183
Table 5.3.
Accuracies of imputation methods with respect to
mean absolute deviation (MAD) and root mean squared
deviation (RMS).
183
Table 5.4.
Comparison of estimates (standard errors) for mean
OASDI benefits implied by imputation methods for
nonrespondent groups in the sample.
184
Table 5.5.
Comparison of estimates (standard errors) for mean
OASDI benefits implied by imputation methods for
groups in the population.
185
Table 5.6.
Example from Marini, Olsen and Rubin (1980)
illustrating how to obtain a monotone pattern of missing
data by discarding data; 1 = observed, 0 = missing.
190
Table 6.1.
Summary of repeated-imputation intervals for variable
17B in educational example.
221
Table 6.2.
Background variables X for GRZ example on imputation
of missing incomes.
223
Table 6.3.
Root-mean-squared error of imputations of log-wage:
Impute posterior mean given θ fixed at MLE, θ̂.
226
Table 6.4.
Repeated-imputation estimates (standard errors) for
average log(wage) for nonrespondents in the sample
under five imputation procedures.
228
Figure 6.1.
Schematic data structure with follow-up surveys of
nonrespondents: boldface produces Y data.
230
Table 6.5.
Mean alcohol consumption level and retirement status
for respondents and nonrespondents within birth
cohort: Data from 1982 Normative Aging Study drinking
questionnaire.
236
Table 6.6.
Summary of least-squares estimates of the regression of
log(1 + drinks/day) on retirement status (0 = working,
1 = retired), birth year, and retirement status × birth
year interaction.
237
Table 6.7.
Five values of regression parameters for nonrespondents
drawn from their posterior distribution.
237
Table 6.8.
Five imputed values of log(1 + drinks/day) for each
of the 74 non-followed-up nonrespondents.
238
Table 6.9.
Sets of least-squares estimates from the five data sets
completed by imputation.
239
Table 6.10.
Repeated-imputation estimates, standard errors, and
percentages of missing information for the regression
of log(1 + drinks/day) on retirement status, birth year,
and retirement status × birth year interaction.
239 |
any_adam_object | 1 |
any_adam_object_boolean | 1 |
author | Rubin, Donald B. 1943- |
author_GND | (DE-588)131607618 |
author_facet | Rubin, Donald B. 1943- |
author_role | aut |
author_sort | Rubin, Donald B. 1943- |
author_variant | d b r db dbr |
building | Verbundindex |
bvnumber | BV021812894 |
callnumber-first | H - Social Science |
callnumber-label | HA31 |
callnumber-raw | HA31.2 |
callnumber-search | HA31.2 |
callnumber-sort | HA 231.2 |
callnumber-subject | HA - Statistics |
classification_rvk | QH 235 QH 244 SK 840 |
classification_tum | MAT 625f |
ctrlnum | (OCoLC)56214438 (DE-599)BVBBV021812894 |
dewey-full | 001.4/22 001.422 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 001 - Knowledge |
dewey-raw | 001.4/22 001.422 |
dewey-search | 001.4/22 001.422 |
dewey-sort | 11.4 222 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Allgemeines Mathematik Wirtschaftswissenschaften |
discipline_str_mv | Allgemeines Mathematik Wirtschaftswissenschaften |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>03027nam a2200757 c 4500</leader><controlfield tag="001">BV021812894</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20230704 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">061115s2004 |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">0471655740</subfield><subfield code="9">0-471-65574-0</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)56214438</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV021812894</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakddb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-N32</subfield><subfield code="a">DE-473</subfield><subfield code="a">DE-11</subfield><subfield code="a">DE-188</subfield><subfield code="a">DE-578</subfield><subfield code="a">DE-19</subfield></datafield><datafield tag="050" ind1=" " ind2="0"><subfield code="a">HA31.2</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">001.4/22</subfield><subfield code="2">22</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">001.422</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">QH 235</subfield><subfield code="0">(DE-625)141550:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">QH 244</subfield><subfield code="0">(DE-625)141558:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">SK 840</subfield><subfield code="0">(DE-625)143261:</subfield><subfield 
code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">MAT 625f</subfield><subfield code="2">stub</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Rubin, Donald B.</subfield><subfield code="d">1943-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)131607618</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Multiple imputation for nonresponse in surveys</subfield><subfield code="c">Donald B. Rubin</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Hoboken, NJ. [u.a.]</subfield><subfield code="b">Wiley</subfield><subfield code="c">2004</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XXIX, 287 S.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">Wiley classics library edition</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Ontbrekende gegevens</subfield><subfield code="2">gtt</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Survey-onderzoek</subfield><subfield code="2">gtt</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Multiple imputation (Statistics)</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Nonresponse (Statistics)</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Social surveys</subfield><subfield code="x">Response rate</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield 
code="a">Datenerhebung</subfield><subfield code="0">(DE-588)4155272-6</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Ausreißerwert</subfield><subfield code="0">(DE-588)4143602-7</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Umfrage</subfield><subfield code="0">(DE-588)4005227-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Statistik</subfield><subfield code="0">(DE-588)4056995-0</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Imputationstechnik</subfield><subfield code="0">(DE-588)4609617-6</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Schätztheorie</subfield><subfield code="0">(DE-588)4121608-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Non-response-Problem</subfield><subfield code="0">(DE-588)4133974-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Antwortverweigerung</subfield><subfield code="0">(DE-588)4343644-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Schätztheorie</subfield><subfield code="0">(DE-588)4121608-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Imputationstechnik</subfield><subfield code="0">(DE-588)4609617-6</subfield><subfield 
code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="689" ind1="1" ind2="0"><subfield code="a">Umfrage</subfield><subfield code="0">(DE-588)4005227-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2="1"><subfield code="a">Statistik</subfield><subfield code="0">(DE-588)4056995-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="689" ind1="2" ind2="0"><subfield code="a">Antwortverweigerung</subfield><subfield code="0">(DE-588)4343644-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="2" ind2="1"><subfield code="a">Imputationstechnik</subfield><subfield code="0">(DE-588)4609617-6</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="2" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="689" ind1="3" ind2="0"><subfield code="a">Umfrage</subfield><subfield code="0">(DE-588)4005227-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="3" ind2="1"><subfield code="a">Non-response-Problem</subfield><subfield code="0">(DE-588)4133974-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="3" ind2="2"><subfield code="a">Imputationstechnik</subfield><subfield code="0">(DE-588)4609617-6</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="3" ind2=" "><subfield code="5">DE-188</subfield></datafield><datafield tag="689" ind1="4" ind2="0"><subfield code="a">Datenerhebung</subfield><subfield code="0">(DE-588)4155272-6</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="4" ind2=" "><subfield code="8">1\p</subfield><subfield code="5">DE-604</subfield></datafield><datafield tag="689" ind1="5" ind2="0"><subfield code="a">Ausreißerwert</subfield><subfield 
code="0">(DE-588)4143602-7</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="5" ind2=" "><subfield code="8">2\p</subfield><subfield code="5">DE-604</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Onlineausgabe</subfield><subfield code="z">978-0-470-31669-6</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Bamberg</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=015025147&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-015025147</subfield></datafield><datafield tag="883" ind1="1" ind2=" "><subfield code="8">1\p</subfield><subfield code="a">cgwrk</subfield><subfield code="d">20201028</subfield><subfield code="q">DE-101</subfield><subfield code="u">https://d-nb.info/provenance/plan#cgwrk</subfield></datafield><datafield tag="883" ind1="1" ind2=" "><subfield code="8">2\p</subfield><subfield code="a">cgwrk</subfield><subfield code="d">20201028</subfield><subfield code="q">DE-101</subfield><subfield code="u">https://d-nb.info/provenance/plan#cgwrk</subfield></datafield></record></collection> |
id | DE-604.BV021812894 |
illustrated | Not Illustrated |
index_date | 2024-07-02T15:51:33Z |
indexdate | 2024-07-09T20:45:13Z |
institution | BVB |
isbn | 0471655740 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-015025147 |
oclc_num | 56214438 |
open_access_boolean | |
owner | DE-N32 DE-473 DE-BY-UBG DE-11 DE-188 DE-578 DE-19 DE-BY-UBM |
owner_facet | DE-N32 DE-473 DE-BY-UBG DE-11 DE-188 DE-578 DE-19 DE-BY-UBM |
physical | XXIX, 287 S. |
publishDate | 2004 |
publishDateSearch | 2004 |
publishDateSort | 2004 |
publisher | Wiley |
record_format | marc |
series2 | Wiley classics library edition |
spelling | Rubin, Donald B. 1943- Verfasser (DE-588)131607618 aut Multiple imputation for nonresponse in surveys Donald B. Rubin Hoboken, NJ. [u.a.] Wiley 2004 XXIX, 287 S. txt rdacontent n rdamedia nc rdacarrier Wiley classics library edition Ontbrekende gegevens gtt Survey-onderzoek gtt Multiple imputation (Statistics) Nonresponse (Statistics) Social surveys Response rate Datenerhebung (DE-588)4155272-6 gnd rswk-swf Ausreißerwert (DE-588)4143602-7 gnd rswk-swf Umfrage (DE-588)4005227-8 gnd rswk-swf Statistik (DE-588)4056995-0 gnd rswk-swf Imputationstechnik (DE-588)4609617-6 gnd rswk-swf Schätztheorie (DE-588)4121608-8 gnd rswk-swf Non-response-Problem (DE-588)4133974-5 gnd rswk-swf Antwortverweigerung (DE-588)4343644-4 gnd rswk-swf Schätztheorie (DE-588)4121608-8 s Imputationstechnik (DE-588)4609617-6 s DE-604 Umfrage (DE-588)4005227-8 s Statistik (DE-588)4056995-0 s Antwortverweigerung (DE-588)4343644-4 s Non-response-Problem (DE-588)4133974-5 s DE-188 Datenerhebung (DE-588)4155272-6 s 1\p DE-604 Ausreißerwert (DE-588)4143602-7 s 2\p DE-604 Erscheint auch als Onlineausgabe 978-0-470-31669-6 Digitalisierung UB Bamberg application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=015025147&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis 1\p cgwrk 20201028 DE-101 https://d-nb.info/provenance/plan#cgwrk 2\p cgwrk 20201028 DE-101 https://d-nb.info/provenance/plan#cgwrk |
spellingShingle | Rubin, Donald B. 1943- Multiple imputation for nonresponse in surveys Ontbrekende gegevens gtt Survey-onderzoek gtt Multiple imputation (Statistics) Nonresponse (Statistics) Social surveys Response rate Datenerhebung (DE-588)4155272-6 gnd Ausreißerwert (DE-588)4143602-7 gnd Umfrage (DE-588)4005227-8 gnd Statistik (DE-588)4056995-0 gnd Imputationstechnik (DE-588)4609617-6 gnd Schätztheorie (DE-588)4121608-8 gnd Non-response-Problem (DE-588)4133974-5 gnd Antwortverweigerung (DE-588)4343644-4 gnd |
subject_GND | (DE-588)4155272-6 (DE-588)4143602-7 (DE-588)4005227-8 (DE-588)4056995-0 (DE-588)4609617-6 (DE-588)4121608-8 (DE-588)4133974-5 (DE-588)4343644-4 |
title | Multiple imputation for nonresponse in surveys |
title_auth | Multiple imputation for nonresponse in surveys |
title_exact_search | Multiple imputation for nonresponse in surveys |
title_exact_search_txtP | Multiple imputation for nonresponse in surveys |
title_full | Multiple imputation for nonresponse in surveys Donald B. Rubin |
title_fullStr | Multiple imputation for nonresponse in surveys Donald B. Rubin |
title_full_unstemmed | Multiple imputation for nonresponse in surveys Donald B. Rubin |
title_short | Multiple imputation for nonresponse in surveys |
title_sort | multiple imputation for nonresponse in surveys |
topic | Ontbrekende gegevens gtt Survey-onderzoek gtt Multiple imputation (Statistics) Nonresponse (Statistics) Social surveys Response rate Datenerhebung (DE-588)4155272-6 gnd Ausreißerwert (DE-588)4143602-7 gnd Umfrage (DE-588)4005227-8 gnd Statistik (DE-588)4056995-0 gnd Imputationstechnik (DE-588)4609617-6 gnd Schätztheorie (DE-588)4121608-8 gnd Non-response-Problem (DE-588)4133974-5 gnd Antwortverweigerung (DE-588)4343644-4 gnd |
topic_facet | Ontbrekende gegevens Survey-onderzoek Multiple imputation (Statistics) Nonresponse (Statistics) Social surveys Response rate Datenerhebung Ausreißerwert Umfrage Statistik Imputationstechnik Schätztheorie Non-response-Problem Antwortverweigerung |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=015025147&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT rubindonaldb multipleimputationfornonresponseinsurveys |