The art and science of analyzing software data
Other authors: | Bird, Christian ; Menzies, Tim ; Zimmermann, Thomas |
---|---|
Format: | Book |
Language: | English |
Published: |
Amsterdam ; Boston ; Heidelberg ; London ; New York ; Oxford ; Paris ; San Diego ; San Francisco ; Singapore ; Sydney ; Tokyo
Morgan Kaufmann
[2015]
|
Subjects: | Software development ; Data mining ; Data analysis |
Online access: | Blurb, Table of contents |
Description: | xxiii, 660 pages, illustrations, diagrams |
ISBN: | 9780124115194 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV042617204 | ||
003 | DE-604 | ||
005 | 20241127 | ||
007 | t| | ||
008 | 150615s2015 xx a||| |||| 00||| eng d | ||
020 | |a 9780124115194 |c pbk |9 978-0-12-411519-4 | ||
035 | |a (OCoLC)909330141 | ||
035 | |a (DE-599)BVBBV042617204 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-739 |a DE-473 |a DE-898 |a DE-384 |a DE-703 | ||
082 | 0 | |a 005.74 | |
084 | |a ST 233 |0 (DE-625)143620: |2 rvk | ||
084 | |a ST 530 |0 (DE-625)143679: |2 rvk | ||
245 | 1 | 0 | |a The art and science of analyzing software data |c Christian Bird, Microsoft Research, Redmond, WA, USA, Tim Menzies, North Carolina State University Press, Raleigh, NC, USA, Thomas Zimmermann, Microsoft Research, Redmond, WA, USA |
264 | 1 | |a Amsterdam ; Boston ; Heidelberg ; London ; New York ; Oxford ; Paris ; San Diego ; San Francisco ; Singapore ; Sydney ; Tokyo |b Morgan Kaufmann |c [2015] | |
300 | |a xxiii, 660 Seiten |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
650 | 0 | 7 | |a Softwareentwicklung |0 (DE-588)4116522-6 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Data Mining |0 (DE-588)4428654-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Datenanalyse |0 (DE-588)4123037-1 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Datenanalyse |0 (DE-588)4123037-1 |D s |
689 | 0 | 1 | |a Data Mining |0 (DE-588)4428654-5 |D s |
689 | 0 | 2 | |a Softwareentwicklung |0 (DE-588)4116522-6 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Bird, Christian |0 (DE-588)1078047405 |4 edt | |
700 | 1 | |a Menzies, Tim |0 (DE-588)17331967X |4 edt | |
700 | 1 | |a Zimmermann, Thomas |0 (DE-588)1055593705 |4 edt | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |z 978-0-12-411543-9 |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028050001&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Klappentext |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028050001&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
943 | 1 | |a oai:aleph.bib-bvb.de:BVB01-028050001 |
Record in the search index
_version_ | 1816888225411629056 |
---|---|
adam_text |
A comprehensive guide to the art and science of analyzing software data, with best practices
generated by leading data scientists, collected from their experience training software
engineering students and practitioners on how to master data science.
The Art and Science of Analyzing Software Data provides valuable information on analysis techniques
often used to derive insight from software data. This book shares best practices in the field generated
by leading data scientists, collected from their experience training software engineering students and
practitioners to master data science.
The book covers topics such as the analysis of security data, code reviews, app stores, log files, and
user telemetry, among others. It covers a wide variety of techniques such as co-change analysis, text
analysis, topic analysis, and concept analysis, as well as advanced topics such as release planning and
generation of source code comments. It includes stories from the trenches from expert data scientists
illustrating how to apply data analysis in industry and open source, present results to stakeholders,
and drive decisions.
Key Features:
• Presents best practices, hints, and tips to analyze data and apply tools in data science projects
• Presents research methods and case studies that have emerged over the past few years to further
understanding of software data
• Shares stories from the trenches of successful data science initiatives in industry
Contents
List of Contributors
CHAPTER 1 Past, Present, and Future of Analyzing Software Data
1.1 Definitions
1.2 The Past: Origins
1.2.1 Generation 1: Preliminary Work
1.2.2 Generation 2: Academic Experiments
1.2.3 Generation 3: Industrial Experiments
1.2.4 Generation 4: Data Science Everywhere
1.3 Present Day
1.4 Conclusion
Acknowledgments
References
PART 1 TUTORIAL-TECHNIQUES
CHAPTER 2 Mining Patterns and Violations Using Concept Analysis
2.1 Introduction
2.1.1 Contributions
2.2 Patterns and Blocks
2.3 Computing All Blocks
2.3.1 Algorithm in a Nutshell
2.4 Mining Shopping Carts with Colibri
2.5 Violations
2.6 Finding Violations
2.7 Two Patterns or One Violation?
2.8 Performance
2.9 Encoding Order
2.10 Inlining
2.11 Related Work
2.11.1 Mining Patterns
2.11.2 Mining Violations
2.11.3 PR Miner
2.12 Conclusions
Acknowledgments
References
CHAPTER 3 Analyzing Text in Software Projects
3.1 Introduction
3.2 Textual Software Project Data and Retrieval
3.2.1 Textual Data
3.2.2 Text Retrieval
3.3 Manual Coding
3.3.1 Coding Process
3.3.2 Challenges
3.4 Automated Analysis
3.4.1 Topic Modeling
3.4.2 Part-of-Speech Tagging and Relationship Extraction
3.4.3 n-Grams
3.4.4 Clone Detection
3.4.5 Visualization
3.5 Two Industrial Studies
3.5.1 Naming the Pain in Requirements Engineering: A Requirements Engineering Survey
3.5.2 Clone Detection in Requirements Specifications
3.6 Summary
References
CHAPTER 4 Synthesizing Knowledge from Software Development Artifacts
4.1 Problem Statement
4.2 Artifact Lifecycle Models
4.2.1 Example: Patch Lifecycle
4.2.2 Model Extraction
4.3 Code Review
4.3.1 Mozilla Project
4.3.2 WebKit Project
4.3.3 Blink Project
4.4 Lifecycle Analysis
4.4.1 Mozilla Firefox
4.4.2 WebKit
4.4.3 Blink
4.5 Other Applications
4.6 Conclusion
References
CHAPTER 5 A Practical Guide to Analyzing IDE Usage Data
5.1 Introduction
5.2 Usage Data Research Concepts
5.2.1 What is Usage Data and Why Should We Analyze it?
5.2.2 Selecting Relevant Data on the Basis of a Goal
5.2.3 Privacy Concerns
5.2.4 Study Scope
5.3 How to Collect Data
5.3.1 Eclipse Usage Data Collector
5.3.2 Mylyn and the Eclipse Mylyn Monitor
5.3.3 CodingSpectator
5.3.4 Build it Yourself for Visual Studio
5.4 How to Analyze Usage Data
5.4.1 Data Anonymity
5.4.2 Usage Data Format
5.4.3 Magnitude Analysis
5.4.4 Categorization Analysis
5.4.5 Sequence Analysis
5.4.6 State Model Analysis
5.4.7 The Critical Incident Technique
5.4.8 Including Data from Other Sources
5.5 Limits of What You Can Learn from Usage Data
5.6 Conclusion
5.7 Code Listings
Acknowledgments
References
CHAPTER 6 Latent Dirichlet Allocation: Extracting Topics from Software Engineering Data
6.1 Introduction
6.2 Applications of LDA in Software Analysis
6.3 How LDA Works
6.4 LDA Tutorial
6.4.1 Materials
6.4.2 Acquiring Software-Engineering Data
6.4.3 Text Analysis and Data Transformation
6.4.4 Applying LDA
6.4.5 LDA Output Summarization
6.5 Pitfalls and Threats to Validity
6.5.1 Criterion Validity
6.5.2 Construct Validity
6.5.3 Internal Validity
6.5.4 External Validity
6.5.5 Reliability
6.6 Conclusions
References
CHAPTER 7 Tools and Techniques for Analyzing Product and Process Data
7.1 Introduction
7.2 A Rational Analysis Pipeline
7.2.1 Getting the Data
7.2.2 Selecting
7.2.3 Processing
7.2.4 Summarizing
7.2.5 Plumbing
7.3 Source Code Analysis
7.3.1 Heuristics
7.3.2 Lexical Analysis
7.3.3 Parsing and Semantic Analysis
7.3.4 Third-Party Tools
7.4 Compiled Code Analysis
7.4.1 Assembly Language
7.4.2 Machine Code
7.4.3 Dealing with Name Mangling
7.4.4 ByteCode
7.4.5 Dynamic Linking
7.4.6 Libraries
7.5 Analysis of Configuration Management Data
7.5.1 Obtaining Repository Data
7.5.2 Analyzing Metadata
7.5.3 Analyzing Time Series Snapshots
7.5.4 Analyzing a Checked Out Repository
7.5.5 Combining Files with Metadata
7.5.6 Assembling Repositories
7.6 Data Visualization
7.6.1 Graphs
7.6.2 Declarative Diagrams
7.6.3 Charts
7.6.4 Maps
7.7 Concluding Remarks
References
PART 2 DATA/PROBLEM FOCUSSED
CHAPTER 8 Analyzing Security Data
8.1 Vulnerability
8.1.1 Exploits
8.2 Security Data “Gotchas”
8.2.1 Gotcha #1. Having Vulnerabilities is Normal
8.2.2 Gotcha #2. “More Vulnerabilities” Does not Always Mean “Less Secure”
8.2.3 Gotcha #3. Design Level Flaws are not Usually Tracked
8.2.4 Gotcha #4. Security is Negatively Defined
8.3 Measuring Vulnerability Severity
8.3.1 CVSS Overview
8.3.2 Example CVSS Application
8.3.3 Criticisms of the CVSS
8.4 Method of Collecting and Analyzing Vulnerability Data
8.4.1 Step 1. Trace Reported Vulnerabilities Back to Fixes
8.4.2 Step 2. Aggregate Source Control Logs
8.4.3 Step 3a. Determine Vulnerability Coverage
8.4.4 Step 3c. Classify According to Engineering Mistake
8.5 What Security Data has Told Us Thus Far
8.5.1 Vulnerabilities have Socio-Technical Elements
8.5.2 Vulnerabilities have Long, Complex Histories
8.6 Summary
References
CHAPTER 9 A Mixed Methods Approach to Mining Code Review Data: Examples and a Study of Multicommit Reviews and Pull Requests
9.1 Introduction
9.2 Motivation for a Mixed Methods Approach
9.3 Review Process and Data
9.3.1 Software Inspection
9.3.2 OSS Code Review
9.3.3 Code Review at Microsoft
9.3.4 Google Based Gerrit Code Review
9.3.5 GitHub Pull Requests
9.3.6 Data Measures and Attributes
9.4 Quantitative Replication Study: Code Review on Branches
9.4.1 Research Question 1: Commits per Review
9.4.2 Research Question 2: Size of Commits
9.4.3 Research Question 3: Review Interval
9.4.4 Research Question 4: Reviewer Participation
9.4.5 Conclusion
9.5 Qualitative Approaches
9.5.1 Sampling Approaches
9.5.2 Data Collection
9.5.3 Qualitative Analysis of Microsoft Data
9.5.4 Applying Grounded Theory to Archival Data to Understand OSS Review
9.6 Triangulation
9.6.1 Using Surveys to Triangulate Qualitative Findings
9.6.2 How Multicommit Branches are Reviewed in Linux
9.6.3 Closed Coding: Branch or Revision on GitHub and Gerrit
9.6.4 Understanding Why Pull Requests are Rejected
9.7 Conclusion
References
CHAPTER 10 Mining Android Apps for Anomalies
10.1 Introduction
10.2 Clustering Apps by Description
10.2.1 Collecting Applications
10.2.2 Preprocessing Descriptions with NLP
10.2.3 Identifying Topics with LDA
10.2.4 Clustering Apps with K-means
10.2.5 Finding the Best Number of Clusters
10.2.6 Resulting App Clusters
10.3 Identifying Anomalies by APIs
10.3.1 Extracting API Usage
10.3.2 Sensitive and Rare APIs
10.3.3 Distance-Based Outlier Detection
10.3.4 CHABADA as a Malware Detector
10.4 Evaluation
10.4.1 RQ1: Anomaly Detection
10.4.2 RQ2: Feature Selection
10.4.3 RQ3: Malware Detection
10.4.4 Limitations and Threats to Validity
10.5 Related Work
10.5.1 Mining App Descriptions
10.5.2 Behavior/Description Mismatches
10.5.3 Detecting Malicious Apps
10.6 Conclusion and Future Work
Acknowledgments
References
CHAPTER 11 Change Coupling Between Software Artifacts: Learning from Past Changes
11.1 Introduction
11.2 Change Coupling
11.2.1 Why Do Artifacts Co-Change?
11.2.2 Benefits of Using Change Coupling
11.3 Change Coupling Identification Approaches
11.3.1 Raw Counting
11.3.2 Association Rules
11.3.3 Time-Series Analysis
11.4 Challenges in Change Coupling Identification
11.4.1 Impact of Commit Practices
11.4.2 Practical Advice for Change Coupling Detection
11.4.3 Alternative Approaches
11.5 Change Coupling Applications
11.5.1 Change Prediction and Change Impact Analysis
11.5.2 Discovery of Design Flaws and Opportunities for Refactoring
11.5.3 Architecture Evaluation
11.5.4 Coordination Requirements and Socio-Technical Congruence
11.6 Conclusion
References
PART 3 STORIES FROM THE TRENCHES
CHAPTER 12 Applying Software Data Analysis in Industry Contexts: When Research Meets Reality
12.1 Introduction
12.2 Background
12.2.1 Fraunhofer’s Experience in Software Measurement
12.2.2 Terminology
12.2.3 Empirical Methods
12.2.4 Applying Software Measurement in Practice: The General Approach
12.3 Six Key Issues when Implementing a Measurement Program in Industry
12.3.1 Stakeholders, Requirements, and Planning: The Groundwork for a Successful Measurement Program
12.3.2 Gathering Measurements: How, When, and Who
12.3.3 All Data, No Information: When the Data is not What You Need or Expect
12.3.4 The Pivotal Role of Subject Matter Expertise
12.3.5 Responding to Changing Needs
12.3.6 Effective Ways to Communicate Analysis Results to the Consumers
12.4 Conclusions
References
CHAPTER 13 Using Data to Make Decisions in Software Engineering: Providing a Method to our Madness
13.1 Introduction
13.2 Short History of Software Engineering Metrics
13.3 Establishing Clear Goals
13.3.1 Benchmarking
13.3.2 Product Goals
13.4 Review of Metrics
13.4.1 Contextual Metrics
13.4.2 Constraint Metrics
13.4.3 Development Metrics
13.5 Challenges with Data Analysis on Software Projects
13.5.1 Data Collection
13.5.2 Data Interpretation
13.6 Example of Changing Product Development Through the Use of Data
13.7 Driving Software Engineering Processes with Data
References
CHAPTER 14 Community Data for OSS Adoption Risk Management
14.1 Introduction
14.2 Background
14.2.1 Risk and Open Source Software Basic Concepts
14.2.2 Modeling and Analysis Techniques
14.3 An Approach to OSS Risk Adoption Management
14.4 OSS Communities Structure and Behavior Analysis: The XWiki Case
14.4.1 OSS Community Social Network Analysis
14.4.2 Statistical Analytics of Software Quality, OSS Communities’ Behavior and OSS Projects
14.4.3 Risk Indicators Assessment via Bayesian Networks
14.4.4 OSS Ecosystems Modeling and Reasoning in i*
14.4.5 Integrating the Analysis for a Comprehensive Risk Assessment
14.5 A Risk Assessment Example: The Moodbile Case
14.6 Related Work
14.6.1 Data Analysis in OSS Communities
14.6.2 Risk Modeling and Analysis via Goal-oriented Techniques
14.7 Conclusions
Acknowledgments
References
CHAPTER 15 Assessing the State of Software in a Large Enterprise: A 12 Year Retrospective
15.1 Introduction
15.2 Evolution of the Process and the Assessment
15.3 Impact Summary of the State of Avaya Software Report
15.4 Assessment Approach and Mechanisms
15.4.1 Evolution of the Approach Over Time
15.5 Data Sources
15.5.1 Data Accuracy
15.5.2 Types of Data Analyzed
15.6 Examples of Analyses
15.6.1 Demographic Analyses
15.6.2 Analysis of Predictability
15.6.3 Risky File Management
15.7 Software Practices
15.7.1 Original Seven Key Software Areas
15.7.2 Four Practices Tracked as Representative
15.7.3 Example Practice Area: Design Quality In
15.7.4 Example Individual Practice: Static Analysis
15.8 Assessment Follow-up: Recommendations and Impact
15.8.1 Example Recommendations
15.8.2 Deployment of Recommendations
15.9 Impact of the Assessments
15.9.1 Example: Automated Build Management
15.9.2 Example: Deployment of Risky File Management
15.9.3 Improvement in Customer Quality Metric (CQM)
15.10 Conclusions
15.10.1 Impact of the Assessment Process
15.10.2 Factors Contributing to Success
15.10.3 Organizational Attributes
15.10.4 Selling the Assessment Process
15.10.5 Next Steps
15.11 Appendix
15.11.1 Example Questions Used for Input Sessions
Acknowledgments
References
CHAPTER 16 Lessons Learned from Software Analytics in Practice
16.1 Introduction
16.2 Problem Selection
16.3 Data Collection
16.3.1 Datasets
16.3.2 Data Extraction
16.4 Descriptive Analytics
16.4.1 Data Visualization
16.4.2 Reporting via Statistics
16.5 Predictive Analytics
16.5.1 A Predictive Model for all Conditions
16.5.2 Performance Evaluation
16.5.3 Prescriptive Analytics
16.6 Road Ahead
References
PART 4 ADVANCED TOPICS
CHAPTER 17 Code Comment Analysis for Improving Software Quality
17.1 Introduction
17.1.1 Benefits of Studying and Analyzing Code Comments
17.1.2 Challenges of Studying and Analyzing Code Comments
17.1.3 Code Comment Analysis for Specification Mining and Bug Detection
17.2 Text Analytics: Techniques, Tools, and Measures
17.2.1 Natural Language Processing
17.2.2 Machine Learning
17.2.3 Analysis Tools
17.2.4 Evaluation Measures
17.3 Studies of Code Comments
17.3.1 Content of Code Comments
17.3.2 Common Topics of Code Comments
17.4 Automated Code Comment Analysis for Specification Mining and Bug Detection
17.4.1 What Should We Extract?
17.4.2 How Should We Extract Information?
17.4.3 Additional Reading
17.5 Studies and Analysis of API Documentation
17.5.1 Studies of API Documentation
17.5.2 Analysis of API Documentation
17.6 Future Directions and Challenges
References
CHAPTER 18 Mining Software Logs for Goal-Driven Root Cause Analysis
18.1 Introduction
18.2 Approaches to Root Cause Analysis
18.2.1 Rule-Based Approaches
18.2.2 Probabilistic Approaches
18.2.3 Model-Based Approaches
18.3 Root Cause Analysis Framework Overview
18.4 Modeling Diagnostics for Root Cause Analysis
18.4.1 Goal Models
18.4.2 Antigoal Models
18.4.3 Model Annotations
18.4.4 Loan Application Scenario
18.5 Log Reduction
18.5.1 Latent Semantic Indexing
18.5.2 Probabilistic Latent Semantic Indexing
18.6 Reasoning Techniques
18.6.1 Markov Logic Networks
18.7 Root Cause Analysis for Failures Induced by Internal Faults
18.7.1 Knowledge Representation
18.7.2 Diagnosis
18.8 Root Cause Analysis for Failures due to External Threats
18.8.1 Antigoal Model Rules
18.8.2 Inference
18.9 Experimental Evaluations
18.9.1 Detecting Root Causes due to Internal Faults
18.9.2 Detecting Root Causes due to External Actions
18.9.3 Performance Evaluation
18.10 Conclusions
References
CHAPTER 19 Analytical Product Release Planning
19.1 Introduction and Motivation
19.2 Taxonomy of Data-intensive Release Planning Problems
19.2.1 What-to-Release Planning
19.2.2 Theme-Based Release Planning
19.2.3 When-to-Release Problem
19.2.4 Release Planning in Consideration of Quality
19.2.5 Operational Release Planning
19.2.6 Release Planning in Consideration of Technical Debt
19.2.7 Release Planning for Multiple Products
19.3 Information Needs for Software Release Planning
19.3.1 Features
19.3.2 Feature Value
19.3.3 Feature Dependencies
19.3.4 Stakeholders
19.3.5 Stakeholder Opinions and Priorities
19.3.6 Release Readiness
19.3.7 Market Trends
19.3.8 Resource Consumptions and Constraints
19.3.9 Synthesis of Results
19.4 The Paradigm of Analytical Open Innovation
19.4.1 The AOI@RP Platform
19.4.2 Analytical Techniques
19.5 Analytical Release Planning: A Case Study
19.5.1 OTT Case Study: The Context and Content
19.5.2 Formalization of the Problem
19.5.3 The Case Study Process
19.5.4 Release Planning in the Presence of Advanced Feature Dependencies and Synergies
19.5.5 Real-Time What-to-Release Planning
19.5.6 Re-Planning Based on Crowd Clustering
19.5.7 Conclusions and Discussion of Results
19.6 Summary and Future Research
19.7 Appendix: Feature Dependency Constraints
Acknowledgments
References
PART 5 DATA ANALYSIS AT SCALE (BIG DATA)
CHAPTER 20 Boa: An Enabling Language and Infrastructure for Ultra-Large-Scale MSR Studies
20.1 Objectives
20.2 Getting Started with Boa
20.2.1 Boa’s Architecture
20.2.2 Submitting a Task
20.2.3 Obtaining the Results
20.3 Boa’s Syntax and Semantics
20.3.1 Basic and Compound Types
20.3.2 Output Aggregation
20.3.3 Expressing Loops with Quantifiers
20.3.4 User-Defined Functions
20.4 Mining Project and Repository Metadata
20.4.1 Types for Mining Software Repositories
20.4.2 Example 1: Mining Top 10 Programming Languages
20.4.3 Intrinsic Functions
20.4.4 Example 2: Mining Revisions that Fix Bugs
20.4.5 Example 3: Computing Project Churn Rates
20.5 Mining Source Code with Visitors
20.5.1 Types for Mining Source Code
20.5.2 Intrinsic Functions
20.5.3 Visitor Syntax
20.5.4 Example 4: Mining AST Count
20.5.5 Custom Traversal Strategies
20.5.6 Example 5: Mining for Added Null Checks
20.5.7 Example 6: Finding Unreachable Code
20.6 Guidelines for Replicable Research
20.7 Conclusions
20.8 Practice Problems
References
CHAPTER 21 Scalable Parallelization of Specification Mining Using Distributed Computing
21.1 Introduction
21.2 Background
21.2.1 Specification Mining Algorithms
21.2.2 Distributed Computing
21.3 Distributed Specification Mining
21.3.1 Principles
21.3.2 Algorithm-Specific Parallelization
21.4 Implementation and Empirical Evaluation
21.4.1 Dataset and Experimental Settings
21.4.2 Research Questions and Results
21.4.3 Threats to Validity and Current Limitations
21.5 Related Work
21.5.1 Specification Mining and Its Applications
21.5.2 MapReduce in Software Engineering
21.5.3 Parallel Data Mining Algorithms
21.6 Conclusion and Future Work
References |
any_adam_object | 1 |
author2 | Bird, Christian Menzies, Tim Zimmermann, Thomas |
author2_role | edt edt edt |
author2_variant | c b cb t m tm t z tz |
author_GND | (DE-588)1078047405 (DE-588)17331967X (DE-588)1055593705 |
author_facet | Bird, Christian Menzies, Tim Zimmermann, Thomas |
building | Verbundindex |
bvnumber | BV042617204 |
classification_rvk | ST 233 ST 530 |
ctrlnum | (OCoLC)909330141 (DE-599)BVBBV042617204 |
dewey-full | 005.74 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 005 - Computer programming, programs, data, security |
dewey-raw | 005.74 |
dewey-search | 005.74 |
dewey-sort | 15.74 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>00000nam a2200000 c 4500</leader><controlfield tag="001">BV042617204</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20241127</controlfield><controlfield tag="007">t|</controlfield><controlfield tag="008">150615s2015 xx a||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9780124115194</subfield><subfield code="c">pbk</subfield><subfield code="9">978-0-12-411519-4</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)909330141</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV042617204</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-739</subfield><subfield code="a">DE-473</subfield><subfield code="a">DE-898</subfield><subfield code="a">DE-384</subfield><subfield code="a">DE-703</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">005.74</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 233</subfield><subfield code="0">(DE-625)143620:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 530</subfield><subfield code="0">(DE-625)143679:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">The art and science of analyzing software data</subfield><subfield code="c">Christian Bird, Microsoft Research, Redmond, WA, USA, Tim Menzies, North Carolina State University Press, Raleigh, NC, USA, Thomas Zimmermann, Microsoft Research, Redmond, WA, 
USA</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Amsterdam ; Boston ; Heidelberg ; London ; New York ; Oxford ; Paris ; San Diego ; San Francisco ; Singapore ; Sydney ; Tokyo</subfield><subfield code="b">Morgan Kaufmann</subfield><subfield code="c">[2015]</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xxiii, 660 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Softwareentwicklung</subfield><subfield code="0">(DE-588)4116522-6</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Data Mining</subfield><subfield code="0">(DE-588)4428654-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Datenanalyse</subfield><subfield code="0">(DE-588)4123037-1</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Data Mining</subfield><subfield code="0">(DE-588)4428654-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="2"><subfield code="a">Softwareentwicklung</subfield><subfield code="0">(DE-588)4116522-6</subfield><subfield 
code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Bird, Christian</subfield><subfield code="0">(DE-588)1078047405</subfield><subfield code="4">edt</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Menzies, Tim</subfield><subfield code="0">(DE-588)17331967X</subfield><subfield code="4">edt</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Zimmermann, Thomas</subfield><subfield code="0">(DE-588)1055593705</subfield><subfield code="4">edt</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="z">978-0-12-411543-9</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028050001&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Klappentext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028050001&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-028050001</subfield></datafield></record></collection> |
id | DE-604.BV042617204 |
illustrated | Illustrated |
indexdate | 2024-11-27T15:00:35Z |
institution | BVB |
isbn | 9780124115194 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-028050001 |
oclc_num | 909330141 |
open_access_boolean | |
owner | DE-739 DE-473 DE-BY-UBG DE-898 DE-BY-UBR DE-384 DE-703 |
owner_facet | DE-739 DE-473 DE-BY-UBG DE-898 DE-BY-UBR DE-384 DE-703 |
physical | xxiii, 660 Seiten Illustrationen, Diagramme |
publishDate | 2015 |
publishDateSearch | 2015 |
publishDateSort | 2015 |
publisher | Morgan Kaufmann |
record_format | marc |
spelling | The art and science of analyzing software data Christian Bird, Microsoft Research, Redmond, WA, USA, Tim Menzies, North Carolina State University Press, Raleigh, NC, USA, Thomas Zimmermann, Microsoft Research, Redmond, WA, USA Amsterdam ; Boston ; Heidelberg ; London ; New York ; Oxford ; Paris ; San Diego ; San Francisco ; Singapore ; Sydney ; Tokyo Morgan Kaufmann [2015] xxiii, 660 Seiten Illustrationen, Diagramme txt rdacontent n rdamedia nc rdacarrier Softwareentwicklung (DE-588)4116522-6 gnd rswk-swf Data Mining (DE-588)4428654-5 gnd rswk-swf Datenanalyse (DE-588)4123037-1 gnd rswk-swf Datenanalyse (DE-588)4123037-1 s Data Mining (DE-588)4428654-5 s Softwareentwicklung (DE-588)4116522-6 s DE-604 Bird, Christian (DE-588)1078047405 edt Menzies, Tim (DE-588)17331967X edt Zimmermann, Thomas (DE-588)1055593705 edt Erscheint auch als Online-Ausgabe 978-0-12-411543-9 Digitalisierung UB Passau - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028050001&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Klappentext Digitalisierung UB Passau - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028050001&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | The art and science of analyzing software data Softwareentwicklung (DE-588)4116522-6 gnd Data Mining (DE-588)4428654-5 gnd Datenanalyse (DE-588)4123037-1 gnd |
subject_GND | (DE-588)4116522-6 (DE-588)4428654-5 (DE-588)4123037-1 |
title | The art and science of analyzing software data |
title_auth | The art and science of analyzing software data |
title_exact_search | The art and science of analyzing software data |
title_full | The art and science of analyzing software data Christian Bird, Microsoft Research, Redmond, WA, USA, Tim Menzies, North Carolina State University Press, Raleigh, NC, USA, Thomas Zimmermann, Microsoft Research, Redmond, WA, USA |
title_fullStr | The art and science of analyzing software data Christian Bird, Microsoft Research, Redmond, WA, USA, Tim Menzies, North Carolina State University Press, Raleigh, NC, USA, Thomas Zimmermann, Microsoft Research, Redmond, WA, USA |
title_full_unstemmed | The art and science of analyzing software data Christian Bird, Microsoft Research, Redmond, WA, USA, Tim Menzies, North Carolina State University Press, Raleigh, NC, USA, Thomas Zimmermann, Microsoft Research, Redmond, WA, USA |
title_short | The art and science of analyzing software data |
title_sort | the art and science of analyzing software data |
topic | Softwareentwicklung (DE-588)4116522-6 gnd Data Mining (DE-588)4428654-5 gnd Datenanalyse (DE-588)4123037-1 gnd |
topic_facet | Softwareentwicklung Data Mining Datenanalyse |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028050001&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=028050001&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT birdchristian theartandscienceofanalyzingsoftwaredata AT menziestim theartandscienceofanalyzingsoftwaredata AT zimmermannthomas theartandscienceofanalyzingsoftwaredata |