Privacy-preserving data mining: models and algorithms
Format: | Book |
---|---|
Language: | English |
Published: | New York ; London: Springer, 2008 |
Series: | Advances in database systems, v. 34 |
Subjects: | |
Online access: | Table of contents |
Description: | xxii, 513 pages, 24 cm |
ISBN: | 9781441943712 1441943714 |
Internal format
MARC
LEADER | 00000nam a22000008cb4500 | ||
---|---|---|---|
001 | BV046751479 | ||
003 | DE-604 | ||
005 | 20200616 | ||
007 | t | ||
008 | 200605s2008 |||| 00||| eng d | ||
020 | |a 9781441943712 |c (pbk.) £89.99 |9 978-1-4419-4371-2 | ||
020 | |a 1441943714 |c (pbk.) £89.99 |9 1-4419-4371-4 | ||
035 | |a (OCoLC)255823401 | ||
035 | |a (DE-599)BVBBV046751479 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-739 | ||
082 | 0 | |a 006.312 | |
084 | |a ST 530 |0 (DE-625)143679: |2 rvk | ||
245 | 1 | 0 | |a Privacy-preserving data mining |b models and algorithms |c edited by Charu C. Aggarwal and Philip S. Yu |
264 | 1 | |a New York ; London |b Springer |c 2008 | |
300 | |a xxii, 513 Seiten |c 24 cm | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 1 | |a Advances in database systems |v v. 34 | |
650 | 0 | 7 | |a Data Mining |0 (DE-588)4428654-5 |2 gnd |9 rswk-swf |
653 | 0 | |a Data protection | |
653 | 0 | |a Data mining | |
653 | 0 | |a Privacy, Right of | |
689 | 0 | 0 | |a Data Mining |0 (DE-588)4428654-5 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Aggarwal, Charu C. |d 1970- |e Sonstige |0 (DE-588)133500101 |4 oth | |
700 | 1 | |a Yu, Philip S. |e Sonstige |0 (DE-588)142917206 |4 oth | |
830 | 0 | |a Advances in database systems |v v. 34 |w (DE-604)BV021653394 |9 v. 34 | |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032161189&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-032161189 |
Record in the search index
_version_ | 1804181511144996864 |
---|---|
adam_text | Contents
Preface
List of Figures
List of Tables

1 An Introduction to Privacy-Preserving Data Mining
Charu C. Aggarwal, Philip S. Yu
1.1. Introduction
1.2. Privacy-Preserving Data Mining Algorithms
1.3. Conclusions and Summary
References

2 A General Survey of Privacy-Preserving Data Mining Models and Algorithms
Charu C. Aggarwal, Philip S. Yu
2.1. Introduction
2.2. The Randomization Method
  2.2.1 Privacy Quantification
  2.2.2 Adversarial Attacks on Randomization
  2.2.3 Randomization Methods for Data Streams
  2.2.4 Multiplicative Perturbations
  2.2.5 Data Swapping
2.3. Group Based Anonymization
  2.3.1 The k-Anonymity Framework
  2.3.2 Personalized Privacy-Preservation
  2.3.3 Utility Based Privacy Preservation
  2.3.4 Sequential Releases
  2.3.5 The l-diversity Method
  2.3.6 The t-closeness Model
  2.3.7 Models for Text, Binary and String Data
2.4. Distributed Privacy-Preserving Data Mining
  2.4.1 Distributed Algorithms over Horizontally Partitioned Data Sets
  2.4.2 Distributed Algorithms over Vertically Partitioned Data
  2.4.3 Distributed Algorithms for k-Anonymity
2.5. Privacy-Preservation of Application Results
  2.5.1 Association Rule Hiding
  2.5.2 Downgrading Classifier Effectiveness
  2.5.3 Query Auditing and Inference Control
2.6. Limitations of Privacy: The Curse of Dimensionality
2.7. Applications of Privacy-Preserving Data Mining
  2.7.1 Medical Databases: The Scrub and Datafly Systems
  2.7.2 Bioterrorism Applications
  2.7.3 Homeland Security Applications
  2.7.4 Genomic Privacy
2.8. Summary
References

3 A Survey of Inference Control Methods for Privacy-Preserving Data Mining
Josep Domingo-Ferrer
3.1. Introduction
3.2. A classification of Microdata Protection Methods
3.3. Perturbative Masking Methods
  3.3.1 Additive Noise
  3.3.2 Microaggregation
  3.3.3 Data Swapping and Rank Swapping
  3.3.4 Rounding
  3.3.5 Resampling
  3.3.6 PRAM
  3.3.7 MASSC
3.4. Non-perturbative Masking Methods
  3.4.1 Sampling
  3.4.2 Global Recoding
  3.4.3 Top and Bottom Coding
  3.4.4 Local Suppression
3.5. Synthetic Microdata Generation
  3.5.1 Synthetic Data by Multiple Imputation
  3.5.2 Synthetic Data by Bootstrap
  3.5.3 Synthetic Data by Latin Hypercube Sampling
  3.5.4 Partially Synthetic Data by Cholesky Decomposition
  3.5.5 Other Partially Synthetic and Hybrid Microdata Approaches
  3.5.6 Pros and Cons of Synthetic Microdata
3.6. Trading off Information Loss and Disclosure Risk
  3.6.1 Score Construction
  3.6.2 R-U Maps
  3.6.3 k-anonymity
3.7. Conclusions and Research Directions
References

4 Measures of Anonymity
Suresh Venkatasubramanian
4.1. Introduction
  4.1.1 What is Privacy?
  4.1.2 Data Anonymization Methods
  4.1.3 A Classification of Methods
4.2. Statistical Measures of Anonymity
  4.2.1 Query Restriction
  4.2.2 Anonymity via Variance
  4.2.3 Anonymity via Multiplicity
4.3. Probabilistic Measures of Anonymity
  4.3.1 Measures Based on Random Perturbation
  4.3.2 Measures Based on Generalization
  4.3.3 Utility vs Privacy
4.4. Computational Measures of Anonymity
  4.4.1 Anonymity via Isolation
4.5. Conclusions and New Directions
  4.5.1 New Directions
References

5 k-Anonymous Data Mining: A Survey
V. Ciriani, S. De Capitani di Vimercati, S. Foresti, and P. Samarati
5.1. Introduction
5.2. k-Anonymity
5.3. Algorithms for Enforcing k-Anonymity
5.4. k-Anonymity Threats from Data Mining
  5.4.1 Association Rules
  5.4.2 Classification Mining
5.5. k-Anonymity in Data Mining
5.6. Anonymize-and-Mine
5.7. Mine-and-Anonymize
  5.7.1 Enforcing k-Anonymity on Association Rules
  5.7.2 Enforcing k-Anonymity on Decision Trees
5.8. Conclusions
Acknowledgments
References

6 A Survey of Randomization Methods for Privacy-Preserving Data Mining
Charu C. Aggarwal, Philip S. Yu
6.1. Introduction
6.2. Reconstruction Methods for Randomization
  6.2.1 The Bayes Reconstruction Method
  6.2.2 The EM Reconstruction Method
  6.2.3 Utility and Optimality of Randomization Models
6.3. Applications of Randomization
  6.3.1 Privacy-Preserving Classification with Randomization
  6.3.2 Privacy-Preserving OLAP
  6.3.3 Collaborative Filtering
6.4. The Privacy-Information Loss Tradeoff
6.5. Vulnerabilities of the Randomization Method
6.6. Randomization of Time Series Data Streams
6.7. Multiplicative Noise for Randomization
  6.7.1 Vulnerabilities of Multiplicative Randomization
  6.7.2 Sketch Based Randomization
6.8. Conclusions and Summary
References

7 A Survey of Multiplicative Perturbation for Privacy-Preserving Data Mining
Keke Chen and Ling Liu
7.1. Introduction
  7.1.1 Data Privacy vs. Data Utility
  7.1.2 Outline
7.2. Definition of Multiplicative Perturbation
  7.2.1 Notations
  7.2.2 Rotation Perturbation
  7.2.3 Projection Perturbation
  7.2.4 Sketch-based Approach
  7.2.5 Geometric Perturbation
7.3. Transformation Invariant Data Mining Models
  7.3.1 Definition of Transformation Invariant Models
  7.3.2 Transformation-Invariant Classification Models
  7.3.3 Transformation-Invariant Clustering Models
7.4. Privacy Evaluation for Multiplicative Perturbation
  7.4.1 A Conceptual Multidimensional Privacy Evaluation Model
  7.4.2 Variance of Difference as Column Privacy Metric
  7.4.3 Incorporating Attack Evaluation
  7.4.4 Other Metrics
7.5. Attack Resilient Multiplicative Perturbations
  7.5.1 Naive Estimation to Rotation Perturbation
  7.5.2 ICA-Based Attacks
  7.5.3 Distance-Inference Attacks
  7.5.4 Attacks with More Prior Knowledge
  7.5.5 Finding Attack-Resilient Perturbations
7.6. Conclusion
Acknowledgment
References

8 A Survey of Quantification of Privacy Preserving Data Mining Algorithms
Elisa Bertino, Dan Lin and Wei Jiang
8.1. Introduction
8.2. Metrics for Quantifying Privacy Level
  8.2.1 Data Privacy
  8.2.2 Result Privacy
8.3. Metrics for Quantifying Hiding Failure
8.4. Metrics for Quantifying Data Quality
  8.4.1 Quality of the Data Resulting from the PPDM Process
  8.4.2 Quality of the Data Mining Results
8.5. Complexity Metrics
8.6. How to Select a Proper Metric
8.7. Conclusion and Research Directions
References

9 A Survey of Utility-based Privacy-Preserving Data Transformation Methods
Ming Hua and Jian Pei
9.1. Introduction
  9.1.1 What is Utility-based Privacy Preservation?
9.2. Types of Utility-based Privacy Preservation Methods
  9.2.1 Privacy Models
  9.2.2 Utility Measures
  9.2.3 Summary of the Utility-Based Privacy Preserving Methods
9.3. Utility-Based Anonymization Using Local Recoding
  9.3.1 Global Recoding and Local Recoding
  9.3.2 Utility Measure
  9.3.3 Anonymization Methods
  9.3.4 Summary and Discussion
9.4. The Utility-based Privacy Preserving Methods in Classification Problems
  9.4.1 The Top-Down Specialization Method
  9.4.2 The Progressive Disclosure Algorithm
  9.4.3 Summary and Discussion
9.5. Anonymized Marginal: Injecting Utility into Anonymized Data Sets
  9.5.1 Anonymized Marginal
  9.5.2 Utility Measure
  9.5.3 Injecting Utility Using Anonymized Marginals
  9.5.4 Summary and Discussion
9.6. Summary
Acknowledgments
References

10 Mining Association Rules under Privacy Constraints
Jayant R. Haritsa
10.1. Introduction
10.2. Problem Framework
  10.2.1 Database Model
  10.2.2 Mining Objective
  10.2.3 Privacy Mechanisms
  10.2.4 Privacy Metric
  10.2.5 Accuracy Metric
10.3. Evolution of the Literature
10.4. The FRAPP Framework
  10.4.1 Reconstruction Model
  10.4.2 Estimation Error
  10.4.3 Randomizing the Perturbation Matrix
  10.4.4 Efficient Perturbation
  10.4.5 Integration with Association Rule Mining
10.5. Sample Results
10.6. Closing Remarks
Acknowledgments
References

11 A Survey of Association Rule Hiding Methods for Privacy
Vassilios S. Verykios and Aris Gkoulalas-Divanis
11.1. Introduction
11.2. Terminology and Preliminaries
11.3. Taxonomy of Association Rule Hiding Algorithms
11.4. Classes of Association Rule Algorithms
  11.4.1 Heuristic Approaches
  11.4.2 Border-based Approaches
  11.4.3 Exact Approaches
11.5. Other Hiding Approaches
11.6. Metrics and Performance Analysis
11.7. Discussion and Future Trends
11.8. Conclusions
References

12 A Survey of Statistical Approaches to Preserving Confidentiality of Contingency Table Entries
Stephen E. Fienberg and Aleksandra B. Slavkovic
12.1. Introduction
12.2. The Statistical Approach Privacy Protection
12.3. Datamining Algorithms, Association Rules, and Disclosure Limitation
12.4. Estimation and Disclosure Limitation for Multi-way Contingency Tables
12.5. Two Illustrative Examples
  12.5.1 Example 1: Data from a Randomized Clinical Trial
  12.5.2 Example 2: Data from the 1993 U.S. Current Population Survey
12.6. Conclusions
Acknowledgments
References

13 A Survey of Privacy-Preserving Methods Across Horizontally Partitioned Data
Murat Kantarcioglu
13.1. Introduction
13.2. Basic Cryptographic Techniques for Privacy-Preserving Distributed Data Mining
13.3. Common Secure Sub-protocols Used in Privacy-Preserving Distributed Data Mining
13.4. Privacy-preserving Distributed Data Mining on Horizontally Partitioned Data
13.5. Comparison to Vertically Partitioned Data Model
13.6. Extension to Malicious Parties
13.7. Limitations of the Cryptographic Techniques Used in Privacy-Preserving Distributed Data Mining
13.8. Privacy Issues Related to Data Mining Results
13.9. Conclusion
References

14 A Survey of Privacy-Preserving Methods Across Vertically Partitioned Data
Jaideep Vaidya
14.1. Introduction
14.2. Classification
  14.2.1 Naive Bayes Classification
  14.2.2 Bayesian Network Structure Learning
  14.2.3 Decision Tree Classification
14.3. Clustering
14.4. Association Rule Mining
14.5. Outlier detection
  14.5.1 Algorithm
  14.5.2 Security Analysis
  14.5.3 Computation and Communication Analysis
14.6. Challenges and Research Directions
References

15 A Survey of Attack Techniques on Privacy-Preserving Data Perturbation Methods
Kun Liu, Chris Giannella, and Hillol Kargupta
15.1. Introduction
15.2. Definitions and Notation
15.3. Attacking Additive Data Perturbation
  15.3.1 Eigen-Analysis and PCA Preliminaries
  15.3.2 Spectral Filtering
  15.3.3 SVD Filtering
  15.3.4 PCA Filtering
  15.3.5 MAP Estimation Attack
  15.3.6 Distribution Analysis Attack
  15.3.7 Summary
15.4. Attacking Matrix Multiplicative Data Perturbation
  15.4.1 Known I/O Attacks
  15.4.2 Known Sample Attack
  15.4.3 Other Attacks Based on ICA
  15.4.4 Summary
15.5. Attacking k-Anonymization
15.6. Conclusion
Acknowledgments
References

16 Private Data Analysis via Output Perturbation
Kobbi Nissim
16.1. Introduction
16.2. The Abstract Model - Statistical Databases, Queries, and Sanitizers
16.3. Privacy
  16.3.1 Interpreting the Privacy Definition
16.4. The Basic Technique: Calibrating Noise to Sensitivity
  16.4.1 Applications: Functions with Low Global Sensitivity
16.5. Constructing Sanitizers for Complex Functionalities
  16.5.1 k-Means Clustering
  16.5.2 SVD and PCA
  16.5.3 Learning in the Statistical Queries Model
16.6. Beyond the Basics
  16.6.1 Instance Based Noise and Smooth Sensitivity
  16.6.2 The Sample-Aggregate Framework
  16.6.3 A General Sanitization Mechanism
16.7. Related Work and Bibliographic Notes
Acknowledgments
References

17 A Survey of Query Auditing Techniques for Data Privacy
Shubha U. Nabar, Krishnaram Kenthapadi, Nina Mishra and Rajeev Motwani
17.1. Introduction
17.2. Auditing Aggregate Queries
  17.2.1 Offline Auditing
  17.2.2 Online Auditing
17.3. Auditing Select-Project-Join Queries
17.4. Challenges in Auditing
17.5. Reading
References

18 Privacy and the Dimensionality Curse
Charu C. Aggarwal
18.1. Introduction
18.2. The Dimensionality Curse and the k-anonymity Method
18.3. The Dimensionality Curse and Condensation
18.4. The Dimensionality Curse and the Randomization Method
  18.4.1 Effects of Public Information
  18.4.2 Effects of High Dimensionality
  18.4.3 Gaussian Perturbing Distribution
  18.4.4 Uniform Perturbing Distribution
18.5. The Dimensionality Curse and l-diversity
18.6. Conclusions and Research Directions
References

19 Personalized Privacy Preservation
Yufei Tao and Xiaokui Xiao
19.1. Introduction
19.2. Formalization of Personalized Anonymity
  19.2.1 Personal Privacy Requirements
  19.2.2 Generalization
19.3. Combinatorial Process of Privacy Attack
  19.3.1 Primary Case
  19.3.2 Non-primary Case
19.4. Theoretical Foundation
  19.4.1 Notations and Basic Properties
  19.4.2 Derivation of the Breach Probability
19.5. Generalization Algorithm
  19.5.1 The Greedy Framework
  19.5.2 Optimal SA-generalization
19.6. Alternative Forms of Personalized Privacy Preservation
  19.6.1 Extension of k-anonymity
  19.6.2 Personalization in Location Privacy Protection
19.7. Summary and Future Work
References

20 Privacy-Preserving Data Stream Classification
Yabo Xu, Ke Wang, Ada Wai-Chee Fu, Rong She, and Jian Pei
20.1. Introduction
  20.1.1 Motivating Example
  20.1.2 Contributions and Paper Outline
20.2. Related Works
20.3. Problem Statement
  20.3.1 Secure Join Stream Classification
  20.3.2 Naive Bayesian Classifiers
20.4. Our Approach
  20.4.1 Initialization
  20.4.2 Bottom-Up Propagation
  20.4.3 Top-Down Propagation
  20.4.4 Using NBC
  20.4.5 Algorithm Analysis
20.5. Empirical Studies
  20.5.1 Real-life Datasets
  20.5.2 Synthetic Datasets
  20.5.3 Discussion
20.6. Conclusions
References

Index
|
any_adam_object | 1 |
any_adam_object_boolean | 1 |
author_GND | (DE-588)133500101 (DE-588)142917206 |
building | Verbundindex |
bvnumber | BV046751479 |
classification_rvk | ST 530 |
ctrlnum | (OCoLC)255823401 (DE-599)BVBBV046751479 |
dewey-full | 006.312 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.312 |
dewey-search | 006.312 |
dewey-sort | 16.312 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
discipline_str_mv | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01636nam a22004098cb4500</leader><controlfield tag="001">BV046751479</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20200616 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">200605s2008 |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781441943712</subfield><subfield code="c">(pbk.) £89.99</subfield><subfield code="9">978-1-4419-4371-2</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">1441943714</subfield><subfield code="c">(pbk.) £89.99</subfield><subfield code="9">1-4419-4371-4</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)255823401</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV046751479</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-739</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">006.312</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 530</subfield><subfield code="0">(DE-625)143679:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Privacy-preserving data mining</subfield><subfield code="b">models and algorithms</subfield><subfield code="c">edited by Charu C. Aggarwal and Philip S. Yu</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">New York ; London</subfield><subfield code="b">Springer</subfield><subfield code="c">2008</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xxii, 513 Seiten</subfield><subfield code="c">24 cm</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="1" ind2=" "><subfield code="a">Advances in database systems</subfield><subfield code="v">v. 34</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Data Mining</subfield><subfield code="0">(DE-588)4428654-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Data protection</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Data mining</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Privacy, Right of</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Data Mining</subfield><subfield code="0">(DE-588)4428654-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Aggarwal, Charu C.</subfield><subfield code="d">1970-</subfield><subfield code="e">Sonstige</subfield><subfield code="0">(DE-588)133500101</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yu, Philip S.</subfield><subfield code="e">Sonstige</subfield><subfield code="0">(DE-588)142917206</subfield><subfield code="4">oth</subfield></datafield><datafield tag="830" ind1=" " ind2="0"><subfield code="a">Advances in database systems</subfield><subfield code="v">v. 34</subfield><subfield code="w">(DE-604)BV021653394</subfield><subfield code="9">v. 34</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032161189&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-032161189</subfield></datafield></record></collection> |
id | DE-604.BV046751479 |
illustrated | Not Illustrated |
index_date | 2024-07-03T14:42:09Z |
indexdate | 2024-07-10T08:52:49Z |
institution | BVB |
isbn | 9781441943712 1441943714 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-032161189 |
oclc_num | 255823401 |
open_access_boolean | |
owner | DE-739 |
owner_facet | DE-739 |
physical | xxii, 513 pages 24 cm |
publishDate | 2008 |
publishDateSearch | 2008 |
publishDateSort | 2008 |
publisher | Springer |
record_format | marc |
series | Advances in database systems |
series2 | Advances in database systems |
spelling | Privacy-preserving data mining models and algorithms edited by Charu C. Aggarwal and Philip S. Yu New York ; London Springer 2008 xxii, 513 Seiten 24 cm txt rdacontent n rdamedia nc rdacarrier Advances in database systems v. 34 Data Mining (DE-588)4428654-5 gnd rswk-swf Data protection Data mining Privacy, Right of Data Mining (DE-588)4428654-5 s DE-604 Aggarwal, Charu C. 1970- Sonstige (DE-588)133500101 oth Yu, Philip S. Sonstige (DE-588)142917206 oth Advances in database systems v. 34 (DE-604)BV021653394 v. 34 Digitalisierung UB Passau - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032161189&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Privacy-preserving data mining models and algorithms Advances in database systems Data Mining (DE-588)4428654-5 gnd |
subject_GND | (DE-588)4428654-5 |
title | Privacy-preserving data mining models and algorithms |
title_auth | Privacy-preserving data mining models and algorithms |
title_exact_search | Privacy-preserving data mining models and algorithms |
title_exact_search_txtP | Privacy-preserving data mining models and algorithms |
title_full | Privacy-preserving data mining models and algorithms edited by Charu C. Aggarwal and Philip S. Yu |
title_fullStr | Privacy-preserving data mining models and algorithms edited by Charu C. Aggarwal and Philip S. Yu |
title_full_unstemmed | Privacy-preserving data mining models and algorithms edited by Charu C. Aggarwal and Philip S. Yu |
title_short | Privacy-preserving data mining |
title_sort | privacy preserving data mining models and algorithms |
title_sub | models and algorithms |
topic | Data Mining (DE-588)4428654-5 gnd |
topic_facet | Data Mining |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032161189&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
volume_link | (DE-604)BV021653394 |
work_keys_str_mv | AT aggarwalcharuc privacypreservingdataminingmodelsandalgorithms AT yuphilips privacypreservingdataminingmodelsandalgorithms |