Cross-layer reliability of computing systems:
Other authors: | Giorgio Di Natale, Dimitris Gizopoulos, Stefano Di Carlo, Alberto Bosio, Ramon Canal |
---|---|
Format: | Electronic e-book |
Language: | English |
Published: | Stevenage: The Institution of Engineering and Technology, 2020 |
Series: | IET materials, circuits and devices series; 57 |
Online access: | TUM01 UBY01 UER01 |
Summary: | This book presents state-of-the-art solutions for increasing the resilience of computing systems, both at single levels of abstraction and across multiple layers. It is a valuable resource for researchers, postgraduate students and professional computer architects focusing on the dependability of computing systems. |
Description: | Intro -- Contents -- Part I: Design techniques to improve the resilience of computing systems -- 1. Technological layer | Antonio Rubio and Ramon Canal -- 1.1 Introduction -- 1.1.1 Faults, errors and failures -- 1.2 Technology overview -- 1.2.1 Technologies based on electric charge -- 1.2.2 Roadmap for adoption -- 1.2.3 Sources of unreliability in technology -- 1.3 CPU building blocks -- 1.3.1 Combinatorial circuits -- 1.3.2 Memories -- 1.3.3 Main memory and storage -- 1.3.4 Emerging memories -- 1.4 Characterization -- 1.4.1 Manufacturing -- 1.4.2 Radiation -- 1.5 Conclusions -- References -- 2. Design techniques to improve the resilience of computing systems: logic layer | Lorena Anghel and Michael Nicolaidis -- 2.1 Introduction -- 2.2 Performance and reliability monitors -- 2.2.1 Double-sampling methodology and the basic architecture -- 2.3 Double-sampling-based monitors for detecting performance violations and transient faults -- 2.3.1 External-design monitors -- 2.3.2 Embedded monitors -- 2.3.3 Other types of monitors -- 2.3.4 Discussions -- 2.4 Conclusions -- References -- 3. Design techniques to improve the resilience of computing systems: architectural layer | Aviral Shrivastava, Kyoungwoo Lee, Hwisoo So, Jinhyo Jung, and Prudhvi Gali -- 3.1 Cache protection techniques -- 3.2 Register file protection techniques -- 3.3 Pipeline and core protection -- References -- 4. Design techniques to improve the resilience of computing systems: software layer | Alberto Bosio, Stefano Di Carlo, Giorgio Di Natale, Matteo Sonza Reorda, and Josie E. Rodriguez Condia -- 4.1 Introduction -- 4.2 Fault taxonomy -- 4.2.1 Software faults -- 4.3 Software-Implemented Hardware Fault Tolerance -- 4.3.1 Modify the software in order to reduce the probability of fault occurrences -- 4.3.2 Detecting/tolerating the presence of an error -- 4.4 Software-Based Self-Test -- 4.4.1 Basics on SBST -- 4.5 SBST for GPGPUs -- 4.5.1 Introduction -- 4.5.2 Effects of permanent faults in GPGPU devices -- 4.5.3 SBST techniques for testing the GPGPU scheduler -- References -- 5. Cross-layer resilience | Eric Cheng and Subhasish Mitra -- 5.1 Introduction -- 5.2 CLEAR framework -- 5.2.1 Reliability analysis -- 5.2.2 Execution time -- 5.2.3 Physical design -- 5.2.4 Resilience library -- 5.3 Cross-layer combinations -- 5.3.1 Combinations for general-purpose processors -- 5.3.2 Targeting specific applications -- 5.4 Application benchmark dependence -- 5.5 The design of new resilience techniques -- 5.6 Conclusions -- Acknowledgments -- References -- Part II: Reliability assessment -- 6. Physical stress | Fernando Fernandes dos Santos, Fabio Benevenuti, Gennaro Rodrigues, Fernanda Kastensmidt, and Paolo Rech -- 6.1 Introduction -- 6.2 Effects and physical sources -- 6.3 Reliability metrics -- 6.4 General setup -- 6.5 Neutron beam experiments -- 6.6 Heavy ions and proton experiments -- 6.7 Laser test -- 6.8 Conclusions -- References -- 7. Soft error modeling and simulation | Mojtaba Ebrahimi and Mehdi Tahoori -- 7.1 Introduction -- 7.2 FIT rate analysis at device level -- 7.3 Multiple transient error site identification using layout information -- 7.3.1 Motivation for layout-based MT analysis and mitigation -- 7.3.2 Proposed layout-based MT error site extraction technique -- 7.3.3 Experimental results of MT modeling -- 7.4 Propagating flip-flop errors at circuit level -- 7.4.1 Event-driven logic simulation -- 7.4.2 Error propagation from single flip-flop -- 7.4.3 Concurrent transient error propagation from multiple flip-flops -- 7.4.4 Experimental results -- 7.5 Propagating combinational gates errors at circuit level -- 7.6 Emulation-based fault injection platform -- 7.6.1 Shadow components -- 7.6.2 Shadow components-based fault injection technique -- 7.6.3 Experimental results -- 7.7 Fault injection acceleration -- 7.7.1 Workflow -- 7.7.2 Analytical modeling -- 7.7.3 Case study: fault injection on memory arrays of Leon3 -- 7.8 Conclusions -- References -- 8. Microarchitecture-level reliability assessment of multi-core processors | Athanasios Chatzidimitriou and Dimitris Gizopoulos -- 8.1 Introduction -- 8.2 Background -- 8.2.1 Threats and vulnerability -- 8.3 Fault-effect classes -- 8.4 Statistical fault injection -- 8.5 Cross-layer and single-layer evaluation -- 8.6 Assessment throughput -- 8.6.1 Simulation acceleration -- 8.6.2 Fault list reduction -- 8.7 Estimation accuracy -- 8.8 Conclusions -- References -- 9. Fault injection at the instruction set architecture (ISA) level | Karthik Pattabiraman and Guanpeng Li -- 9.1 Introduction -- 9.2 Background -- 9.2.1 Terms and definitions -- 9.2.2 Failure outcomes -- 9.2.3 Metrics -- 9.2.4 Fault Injection process -- 9.2.5 Fault model -- 9.3 Classification of injection techniques -- 9.3.1 Simulation versus direct -- 9.3.2 Intrusive versus nonintrusive -- 9.3.3 Level of injection -- 9.3.4 Platform -- 9.3.5 Classification results -- 9.4 LLFI and PINFI fault injectors -- 9.4.1 LLVM fault injector: LLFI -- 9.4.2 PINFI -- 9.5 Open challenges and conclusion -- 9.5.1 Challenge 1: level of injection -- 9.5.2 Challenge 2: target platform -- 9.5.3 Challenge 3: bit-flip model -- 9.5.4 Conclusion -- Acknowledgments -- References -- 10. Analytical modeling for crosslayer resiliency | Arijit Biswas -- 10.1 Introduction -- 10.2 ACE lifetime analysis -- 10.2.1 Un-ACE and ACE -- 10.2.2 Little's law -- 10.2.3 Example of ACE lifetime analysis -- 10.2.4 AVFs of various structures and workloads using ACE lifetime analysis -- 10.2.5 Hamming Distance Analysis and bit field analysis -- 10.2.6 Hamming Distance Analysis and multi-bit fault modeling -- 10.3 Sequential AVF analysis -- 10.3.1 port AVF (pAVF) and structure AVF -- 10.3.2 Sequential AVF computation -- 10.4 Program vulnerability factor -- 10.4.1 Cross-layer modeling using AVF and PVF -- 10.5 Artifacts of analytical vulnerability modeling and mitigations -- 10.5.1 Significance of data values in analytical modeling -- 10.5.2 Reducing unknowns-warmup and cooldown -- 10.5.3 Dealing with large and complex models -- 10.6 Future directions for analytical technique -- 10.7 Summary of analytical modeling for vulnerability -- References -- 11. Stochastic methods | Alessandro Savino, Alessandro Vallero, and Stefano Di Carlo -- 11.1 Introduction -- 11.2 Methodologies -- 11.2.1 Reliability Block Diagrams -- 11.2.2 Markov Chains -- 11.2.3 Bayesian Networks -- 11.3 Conclusions -- References -- Index |
Physical description: | 1 online resource: illustrations, diagrams |
ISBN: | 9781785617980 |
Internal format
MARC
LEADER | 00000nmm a2200000zcb4500 | ||
---|---|---|---|
001 | BV047017407 | ||
003 | DE-604 | ||
005 | 20240216 | ||
007 | cr|uuu---uuuuu | ||
008 | 201119s2020 |||| o||u| ||||||eng d | ||
020 | |a 9781785617980 |c PDF |9 978-1-78561-798-0 | ||
024 | 7 | |a 10.1049/PBCS057E |2 doi | |
035 | |a (ZDB-30-PQE)EBC6341977 | ||
035 | |a (ZDB-100-IET)9781785617980 | ||
035 | |a (OCoLC)1224011900 | ||
035 | |a (DE-599)BVBBV047017407 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-91 |a DE-29 |a DE-706 | ||
082 | 0 | |a 004 | |
084 | |a DAT 286 |2 stub | ||
245 | 1 | 0 | |a Cross-layer reliability of computing systems |c edited by Giorgio Di Natale, Dimitris Gizopoulos, Stefano Di Carlo, Alberto Bosio and Ramon Canal |
264 | 1 | |a Stevenage |b The Institution of Engineering and Technology |c 2020 | |
300 | |a 1 Online-Ressource |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b c |2 rdamedia | ||
338 | |b cr |2 rdacarrier | ||
490 | 1 | |a IET materials, circuits and devices series |v 57 | |
500 | |a Intro -- Contents -- Part I: Design techniques to improve the resilience of computing systems -- 1. Technological layer | Antonio Rubio and Ramon Canal -- 1.1 Introduction -- 1.1.1 Faults, errors and failures -- 1.2 Technology overview -- 1.2.1 Technologies based on electric charge -- 1.2.2 Roadmap for adoption -- 1.2.3 Sources of unreliability in technology -- 1.3 CPU building blocks -- 1.3.1 Combinatorial circuits -- 1.3.2 Memories -- 1.3.3 Main memory and storage -- 1.3.4 Emerging memories -- 1.4 Characterization -- 1.4.1 Manufacturing -- 1.4.2 Radiation -- 1.5 Conclusions -- References -- 2. Design techniques to improve the resilience of computing systems: logic layer | Lorena Anghel and Michael Nicolaidis -- 2.1 Introduction -- 2.2 Performance and reliability monitors -- 2.2.1 Double-sampling methodology and the basic architecture -- 2.3 Double-sampling-based monitors for detecting performance violations and transient faults -- 2.3.1 External-design monitors -- 2.3.2 Embedded monitors -- 2.3.3 Other types of monitors -- 2.3.4 Discussions -- 2.4 Conclusions -- References -- 3. Design techniques to improve the resilience of computing systems: architectural layer | Aviral Shrivastava, Kyoungwoo Lee, Hwisoo So, Jinhyo Jung, and Prudhvi Gali -- 3.1 Cache protection techniques -- 3.2 Register file protection techniques -- 3.3 Pipeline and core protection -- References -- 4. Design techniques to improve the resilience of computing systems: software layer | Alberto Bosio, Stefano Di Carlo, Giorgio Di Natale, Matteo Sonza Reorda, and Josie E. Rodriguez Condia -- 4.1 Introduction -- 4.2 Fault taxonomy -- 4.2.1 Software faults -- 4.3 Software-Implemented Hardware Fault Tolerance -- 4.3.1 Modify the software in order to reduce the probability of fault occurrences -- 4.3.2 Detecting/tolerating the presence of an error -- 4.4 Software-Based Self-Test | ||
500 | |a 4.4.1 Basics on SBST -- 4.5 SBST for GPGPUs -- 4.5.1 Introduction -- 4.5.2 Effects of permanent faults in GPGPU devices -- 4.5.3 SBST techniques for testing the GPGPU scheduler -- References -- 5. Cross-layer resilience | Eric Cheng and Subhasish Mitra -- 5.1 Introduction -- 5.2 CLEAR framework -- 5.2.1 Reliability analysis -- 5.2.2 Execution time -- 5.2.3 Physical design -- 5.2.4 Resilience library -- 5.3 Cross-layer combinations -- 5.3.1 Combinations for general-purpose processors -- 5.3.2 Targeting specific applications -- 5.4 Application benchmark dependence -- 5.5 The design of new resilience techniques -- 5.6 Conclusions -- Acknowledgments -- References -- Part II: Reliability assessment -- 6. Physical stress | Fernando Fernandes dos Santos, Fabio Benevenuti, Gennaro Rodrigues, Fernanda Kastensmidt, and Paolo Rech -- 6.1 Introduction -- 6.2 Effects and physical sources -- 6.3 Reliability metrics -- 6.4 General setup -- 6.5 Neutron beam experiments -- 6.6 Heavy ions and proton experiments -- 6.7 Laser test -- 6.8 Conclusions -- References -- 7. Soft error modeling and simulation | Mojtaba Ebrahimi and Mehdi Tahoori -- 7.1 Introduction -- 7.2 FIT rate analysis at device level -- 7.3 Multiple transient error site identification using layout information -- 7.3.1 Motivation for layout-based MT analysis and mitigation -- 7.3.2 Proposed layout-based MT error site extraction technique -- 7.3.3 Experimental results of MT modeling -- 7.4 Propagating flip-flop errors at circuit level -- 7.4.1 Event-driven logic simulation -- 7.4.2 Error propagation from single flip-flop -- 7.4.3 Concurrent transient error propagation from multiple flip-flops -- 7.4.4 Experimental results -- 7.5 Propagating combinational gates errors at circuit level -- 7.6 Emulation-based fault injection platform -- 7.6.1 Shadow components | ||
500 | |a 7.6.2 Shadow components-based fault injection technique -- 7.6.3 Experimental results -- 7.7 Fault injection acceleration -- 7.7.1 Workflow -- 7.7.2 Analytical modeling -- 7.7.3 Case study: fault injection on memory arrays of Leon3 -- 7.8 Conclusions -- References -- 8. Microarchitecture-level reliability assessment of multi-core processors | Athanasios Chatzidimitriou and Dimitris Gizopoulos -- 8.1 Introduction -- 8.2 Background -- 8.2.1 Threats and vulnerability -- 8.3 Fault-effect classes -- 8.4 Statistical fault injection -- 8.5 Cross-layer and single-layer evaluation -- 8.6 Assessment throughput -- 8.6.1 Simulation acceleration -- 8.6.2 Fault list reduction -- 8.7 Estimation accuracy -- 8.8 Conclusions -- References -- 9. Fault injection at the instruction set architecture (ISA) level | Karthik Pattabiraman and Guanpeng Li -- 9.1 Introduction -- 9.2 Background -- 9.2.1 Terms and definitions -- 9.2.2 Failure outcomes -- 9.2.3 Metrics -- 9.2.4 Fault Injection process -- 9.2.5 Fault model -- 9.3 Classification of injection techniques -- 9.3.1 Simulation versus direct -- 9.3.2 Intrusive versus nonintrusive -- 9.3.3 Level of injection -- 9.3.4 Platform -- 9.3.5 Classification results -- 9.4 LLFI and PINFI fault injectors -- 9.4.1 LLVM fault injector: LLFI -- 9.4.2 PINFI -- 9.5 Open challenges and conclusion -- 9.5.1 Challenge 1: level of injection -- 9.5.2 Challenge 2: target platform -- 9.5.3 Challenge 3: bit-flip model -- 9.5.4 Conclusion -- Acknowledgments -- References -- 10. Analytical modeling for crosslayer resiliency | Arijit Biswas -- 10.1 Introduction -- 10.2 ACE lifetime analysis -- 10.2.1 Un-ACE and ACE -- 10.2.2 Little's law -- 10.2.3 Example of ACE lifetime analysis -- 10.2.4 AVFs of various structures and workloads using ACE lifetime analysis -- 10.2.5 Hamming Distance Analysis and bit field analysis | ||
500 | |a 10.2.6 Hamming Distance Analysis and multi-bit fault modeling -- 10.3 Sequential AVF analysis -- 10.3.1 port AVF (pAVF) and structure AVF -- 10.3.2 Sequential AVF computation -- 10.4 Program vulnerability factor -- 10.4.1 Cross-layer modeling using AVF and PVF -- 10.5 Artifacts of analytical vulnerability modeling and mitigations -- 10.5.1 Significance of data values in analytical modeling -- 10.5.2 Reducing unknowns-warmup and cooldown -- 10.5.3 Dealing with large and complex models -- 10.6 Future directions for analytical technique -- 10.7 Summary of analytical modeling for vulnerability -- References -- 11. Stochastic methods | Alessandro Savino, Alessandro Vallero, and Stefano Di Carlo -- 11.1 Introduction -- 11.2 Methodologies -- 11.2.1 Reliability Block Diagrams -- 11.2.2 Markov Chains -- 11.2.3 Bayesian Networks -- 11.3 Conclusions -- References -- Index | ||
520 | |a This book presents state-of-the-art solutions for increasing the resilience of computing systems, both at single levels of abstraction and multi-layers. It is a valuable resource for researchers, postgraduate students and professional computer architects focusing on the dependability of computing systems | ||
700 | 1 | |a Di Natale, Giorgio |4 edt | |
700 | 1 | |a Gizopoulos, Dimitris |4 edt | |
700 | 1 | |a Di Carlo, Stefano |0 (DE-588)1256954519 |4 edt | |
700 | 1 | |a Bosio, Alberto |4 edt | |
700 | 1 | |a Canal, Ramon |4 edt | |
776 | 0 | 8 | |i Erscheint auch als |n Druck-Ausgabe |z 978-1-78561-797-3 |
830 | 0 | |a IET materials, circuits and devices series |v 57 |w (DE-604)BV044007507 |9 57 | |
912 | |a ZDB-30-PQE |a ZDB-100-IET | ||
999 | |a oai:aleph.bib-bvb.de:BVB01-032424941 | ||
966 | e | |u https://ebookcentral.proquest.com/lib/munchentech/detail.action?docID=6341977 |l TUM01 |p ZDB-30-PQE |q TUM_PDA_PQE_Kauf |x Aggregator |3 Volltext | |
966 | e | |u https://doi.org/10.1049/PBCS057E |l UBY01 |p ZDB-100-IET |x Verlag |3 Volltext | |
966 | e | |u https://doi.org/10.1049/PBCS057E |l UER01 |p ZDB-100-IET |x Verlag |3 Volltext |
Record in the search index
_version_ | 1804181979294334976 |
---|---|
adam_txt | |
any_adam_object | |
any_adam_object_boolean | |
author2 | Di Natale, Giorgio Gizopoulos, Dimitris Di Carlo, Stefano Bosio, Alberto Canal, Ramon |
author2_role | edt edt edt edt edt |
author2_variant | n g d ng ngd d g dg c s d cs csd a b ab r c rc |
author_GND | (DE-588)1256954519 |
author_facet | Di Natale, Giorgio Gizopoulos, Dimitris Di Carlo, Stefano Bosio, Alberto Canal, Ramon |
building | Verbundindex |
bvnumber | BV047017407 |
classification_tum | DAT 286 |
collection | ZDB-30-PQE ZDB-100-IET |
ctrlnum | (ZDB-30-PQE)EBC6341977 (ZDB-100-IET)9781785617980 (OCoLC)1224011900 (DE-599)BVBBV047017407 |
dewey-full | 004 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 004 - Computer science |
dewey-raw | 004 |
dewey-search | 004 |
dewey-sort | 14 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
discipline_str_mv | Informatik |
format | Electronic eBook |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>08622nmm a2200505zcb4500</leader><controlfield tag="001">BV047017407</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20240216 </controlfield><controlfield tag="007">cr|uuu---uuuuu</controlfield><controlfield tag="008">201119s2020 |||| o||u| ||||||eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781785617980</subfield><subfield code="c">PDF</subfield><subfield code="9">978-1-78561-798-0</subfield></datafield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1049/PBCS057E</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ZDB-30-PQE)EBC6341977</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ZDB-100-IET)9781785617980</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1224011900</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV047017407</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-91</subfield><subfield code="a">DE-29</subfield><subfield code="a">DE-706</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">004</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">DAT 286</subfield><subfield code="2">stub</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Cross-layer reliability of computing systems</subfield><subfield code="c">edited by Giorgio Di Natale, Dimitris Gizopoulos, Stefano Di Carlo, Alberto Bosio and Ramon Canal</subfield></datafield><datafield 
tag="264" ind1=" " ind2="1"><subfield code="a">Stevenage</subfield><subfield code="b">The Institution of Engineering and Technology</subfield><subfield code="c">2020</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">1 Online-Ressource</subfield><subfield code="b">Illustrationen, Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="1" ind2=" "><subfield code="a">IET materials, circuits and devices series</subfield><subfield code="v">57</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Intro -- Contents -- Part I: Design techniques to improve the resilience of computing systems -- 1. Technological layer | Antonio Rubio and Ramon Canal -- 1.1 Introduction -- 1.1.1 Faults, errors and failures -- 1.2 Technology overview -- 1.2.1 Technologies based on electric charge -- 1.2.2 Roadmap for adoption -- 1.2.3 Sources of unreliability in technology -- 1.3 CPU building blocks -- 1.3.1 Combinatorial circuits -- 1.3.2 Memories -- 1.3.3 Main memory and storage -- 1.3.4 Emerging memories -- 1.4 Characterization -- 1.4.1 Manufacturing -- 1.4.2 Radiation -- 1.5 Conclusions -- References -- 2. 
Design techniques to improve the resilience of computing systems: logic layer | Lorena Anghel and Michael Nicolaidis -- 2.1 Introduction -- 2.2 Performance and reliability monitors -- 2.2.1 Double-sampling methodology and the basic architecture -- 2.3 Double-sampling-based monitors for detecting performance violations and transient faults -- 2.3.1 External-design monitors -- 2.3.2 Embedded monitors -- 2.3.3 Other types of monitors -- 2.3.4 Discussions -- 2.4 Conclusions -- References -- 3. Design techniques to improve the resilience of computing systems: architectural layer | Aviral Shrivastava, Kyoungwoo Lee, Hwisoo So, Jinhyo Jung, and Prudhvi Gali -- 3.1 Cache protection techniques -- 3.2 Register file protection techniques -- 3.3 Pipeline and core protection -- References -- 4. Design techniques to improve the resilience of computing systems: software layer | Alberto Bosio, Stefano Di Carlo, Giorgio Di Natale, Matteo Sonza Reorda, and Josie E. Rodriguez Condia -- 4.1 Introduction -- 4.2 Fault taxonomy -- 4.2.1 Software faults -- 4.3 Software-Implemented Hardware Fault Tolerance -- 4.3.1 Modify the software in order to reduce the probability of fault occurrences -- 4.3.2 Detecting/tolerating the presence of an error -- 4.4 Software-Based Self-Test</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">4.4.1 Basics on SBST -- 4.5 SBST for GPGPUs -- 4.5.1 Introduction -- 4.5.2 Effects of permanent faults in GPGPU devices -- 4.5.3 SBST techniques for testing the GPGPU scheduler -- References -- 5. 
Cross-layer resilience | Eric Cheng and Subhasish Mitra -- 5.1 Introduction -- 5.2 CLEAR framework -- 5.2.1 Reliability analysis -- 5.2.2 Execution time -- 5.2.3 Physical design -- 5.2.4 Resilience library -- 5.3 Cross-layer combinations -- 5.3.1 Combinations for general-purpose processors -- 5.3.2 Targeting specific applications -- 5.4 Application benchmark dependence -- 5.5 The design of new resilience techniques -- 5.6 Conclusions -- Acknowledgments -- References -- Part II: Reliability assessment -- 6. Physical stress | Fernando Fernandes dos Santos, Fabio Benevenuti, Gennaro Rodrigues, Fernanda Kastensmidt, and Paolo Rech -- 6.1 Introduction -- 6.2 Effects and physical sources -- 6.3 Reliability metrics -- 6.4 General setup -- 6.5 Neutron beam experiments -- 6.6 Heavy ions and proton experiments -- 6.7 Laser test -- 6.8 Conclusions -- References -- 7. Soft error modeling and simulation | Mojtaba Ebrahimi and Mehdi Tahoori -- 7.1 Introduction -- 7.2 FIT rate analysis at device level -- 7.3 Multiple transient error site identification using layout information -- 7.3.1 Motivation for layout-based MT analysis and mitigation -- 7.3.2 Proposed layout-based MT error site extraction technique -- 7.3.3 Experimental results of MT modeling -- 7.4 Propagating flip-flop errors at circuit level -- 7.4.1 Event-driven logic simulation -- 7.4.2 Error propagation from single flip-flop -- 7.4.3 Concurrent transient error propagation from multiple flip-flops -- 7.4.4 Experimental results -- 7.5 Propagating combinational gates errors at circuit level -- 7.6 Emulation-based fault injection platform -- 7.6.1 Shadow components</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">7.6.2 Shadow components-based fault injection technique -- 7.6.3 Experimental results -- 7.7 Fault injection acceleration -- 7.7.1 Workflow -- 7.7.2 Analytical modeling -- 7.7.3 Case study: fault injection on memory arrays of Leon3 -- 7.8 Conclusions -- References -- 8. 
Microarchitecture-level reliability assessment of multi-core processors | Athanasios Chatzidimitriou and Dimitris Gizopoulos -- 8.1 Introduction -- 8.2 Background -- 8.2.1 Threats and vulnerability -- 8.3 Fault-effect classes -- 8.4 Statistical fault injection -- 8.5 Cross-layer and single-layer evaluation -- 8.6 Assessment throughput -- 8.6.1 Simulation acceleration -- 8.6.2 Fault list reduction -- 8.7 Estimation accuracy -- 8.8 Conclusions -- References -- 9. Fault injection at the instruction set architecture (ISA) level | Karthik Pattabiraman and Guanpeng Li -- 9.1 Introduction -- 9.2 Background -- 9.2.1 Terms and definitions -- 9.2.2 Failure outcomes -- 9.2.3 Metrics -- 9.2.4 Fault Injection process -- 9.2.5 Fault model -- 9.3 Classification of injection techniques -- 9.3.1 Simulation versus direct -- 9.3.2 Intrusive versus nonintrusive -- 9.3.3 Level of injection -- 9.3.4 Platform -- 9.3.5 Classification results -- 9.4 LLFI and PINFI fault injectors -- 9.4.1 LLVM fault injector: LLFI -- 9.4.2 PINFI -- 9.5 Open challenges and conclusion -- 9.5.1 Challenge 1: level of injection -- 9.5.2 Challenge 2: target platform -- 9.5.3 Challenge 3: bit-flip model -- 9.5.4 Conclusion -- Acknowledgments -- References -- 10. 
Analytical modeling for crosslayer resiliency | Arijit Biswas -- 10.1 Introduction -- 10.2 ACE lifetime analysis -- 10.2.1 Un-ACE and ACE -- 10.2.2 Little's law -- 10.2.3 Example of ACE lifetime analysis -- 10.2.4 AVFs of various structures and workloads using ACE lifetime analysis -- 10.2.5 Hamming Distance Analysis and bit field analysis</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">10.2.6 Hamming Distance Analysis and multi-bit fault modeling -- 10.3 Sequential AVF analysis -- 10.3.1 port AVF (pAVF) and structure AVF -- 10.3.2 Sequential AVF computation -- 10.4 Program vulnerability factor -- 10.4.1 Cross-layer modeling using AVF and PVF -- 10.5 Artifacts of analytical vulnerability modeling and mitigations -- 10.5.1 Significance of data values in analytical modeling -- 10.5.2 Reducing unknowns-warmup and cooldown -- 10.5.3 Dealing with large and complex models -- 10.6 Future directions for analytical technique -- 10.7 Summary of analytical modeling for vulnerability -- References -- 11. Stochastic methods | Alessandro Savino, Alessandro Vallero, and Stefano Di Carlo -- 11.1 Introduction -- 11.2 Methodologies -- 11.2.1 Reliability Block Diagrams -- 11.2.2 Markov Chains -- 11.2.3 Bayesian Networks -- 11.3 Conclusions -- References -- Index</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">This book presents state-of-the-art solutions for increasing the resilience of computing systems, both at single levels of abstraction and multi-layers. 
It is a valuable resource for researchers, postgraduate students and professional computer architects focusing on the dependability of computing systems</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Di Natale, Giorgio</subfield><subfield code="4">edt</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Gizopoulos, Dimitris</subfield><subfield code="4">edt</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Di Carlo, Stefano</subfield><subfield code="0">(DE-588)1256954519</subfield><subfield code="4">edt</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Bosio, Alberto</subfield><subfield code="4">edt</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Canal, Ramon</subfield><subfield code="4">edt</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Druck-Ausgabe</subfield><subfield code="z">978-1-78561-797-3</subfield></datafield><datafield tag="830" ind1=" " ind2="0"><subfield code="a">IET materials, circuits and devices series</subfield><subfield code="v">57</subfield><subfield code="w">(DE-604)BV044007507</subfield><subfield code="9">57</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-30-PQE</subfield><subfield code="a">ZDB-100-IET</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-032424941</subfield></datafield><datafield tag="966" ind1="e" ind2=" "><subfield code="u">https://ebookcentral.proquest.com/lib/munchentech/detail.action?docID=6341977</subfield><subfield code="l">TUM01</subfield><subfield code="p">ZDB-30-PQE</subfield><subfield code="q">TUM_PDA_PQE_Kauf</subfield><subfield code="x">Aggregator</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="966" ind1="e" ind2=" "><subfield 
code="u">https://doi.org/10.1049/PBCS057E</subfield><subfield code="l">UBY01</subfield><subfield code="p">ZDB-100-IET</subfield><subfield code="x">Verlag</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="966" ind1="e" ind2=" "><subfield code="u">https://doi.org/10.1049/PBCS057E</subfield><subfield code="l">UER01</subfield><subfield code="p">ZDB-100-IET</subfield><subfield code="x">Verlag</subfield><subfield code="3">Volltext</subfield></datafield></record></collection> |
id | DE-604.BV047017407 |
illustrated | Not Illustrated |
index_date | 2024-07-03T15:58:22Z |
indexdate | 2024-07-10T09:00:15Z |
institution | BVB |
isbn | 9781785617980 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-032424941 |
oclc_num | 1224011900 |
open_access_boolean | |
owner | DE-91 DE-BY-TUM DE-29 DE-706 |
owner_facet | DE-91 DE-BY-TUM DE-29 DE-706 |
physical | 1 Online-Ressource Illustrationen, Diagramme |
psigel | ZDB-30-PQE ZDB-100-IET ZDB-30-PQE TUM_PDA_PQE_Kauf |
publishDate | 2020 |
publishDateSearch | 2020 |
publishDateSort | 2020 |
publisher | The Institution of Engineering and Technology |
record_format | marc |
series | IET materials, circuits and devices series |
series2 | IET materials, circuits and devices series |
spelling | Cross-layer reliability of computing systems edited by Giorgio Di Natale, Dimitris Gizopoulos, Stefano Di Carlo, Alberto Bosio and Ramon Canal Stevenage The Institution of Engineering and Technology 2020 1 Online-Ressource Illustrationen, Diagramme txt rdacontent c rdamedia cr rdacarrier IET materials, circuits and devices series 57 Intro -- Contents -- Part I: Design techniques to improve the resilience of computing systems -- 1. Technological layer | Antonio Rubio and Ramon Canal -- 1.1 Introduction -- 1.1.1 Faults, errors and failures -- 1.2 Technology overview -- 1.2.1 Technologies based on electric charge -- 1.2.2 Roadmap for adoption -- 1.2.3 Sources of unreliability in technology -- 1.3 CPU building blocks -- 1.3.1 Combinatorial circuits -- 1.3.2 Memories -- 1.3.3 Main memory and storage -- 1.3.4 Emerging memories -- 1.4 Characterization -- 1.4.1 Manufacturing -- 1.4.2 Radiation -- 1.5 Conclusions -- References -- 2. Design techniques to improve the resilience of computing systems: logic layer | Lorena Anghel and Michael Nicolaidis -- 2.1 Introduction -- 2.2 Performance and reliability monitors -- 2.2.1 Double-sampling methodology and the basic architecture -- 2.3 Double-sampling-based monitors for detecting performance violations and transient faults -- 2.3.1 External-design monitors -- 2.3.2 Embedded monitors -- 2.3.3 Other types of monitors -- 2.3.4 Discussions -- 2.4 Conclusions -- References -- 3. Design techniques to improve the resilience of computing systems: architectural layer | Aviral Shrivastava, Kyoungwoo Lee, Hwisoo So, Jinhyo Jung, and Prudhvi Gali -- 3.1 Cache protection techniques -- 3.2 Register file protection techniques -- 3.3 Pipeline and core protection -- References -- 4. Design techniques to improve the resilience of computing systems: software layer | Alberto Bosio, Stefano Di Carlo, Giorgio Di Natale, Matteo Sonza Reorda, and Josie E. 
Rodriguez Condia -- 4.1 Introduction -- 4.2 Fault taxonomy -- 4.2.1 Software faults -- 4.3 Software-Implemented Hardware Fault Tolerance -- 4.3.1 Modify the software in order to reduce the probability of fault occurrences -- 4.3.2 Detecting/tolerating the presence of an error -- 4.4 Software-Based Self-Test -- 4.4.1 Basics on SBST -- 4.5 SBST for GPGPUs -- 4.5.1 Introduction -- 4.5.2 Effects of permanent faults in GPGPU devices -- 4.5.3 SBST techniques for testing the GPGPU scheduler -- References -- 5. Cross-layer resilience | Eric Cheng and Subhasish Mitra -- 5.1 Introduction -- 5.2 CLEAR framework -- 5.2.1 Reliability analysis -- 5.2.2 Execution time -- 5.2.3 Physical design -- 5.2.4 Resilience library -- 5.3 Cross-layer combinations -- 5.3.1 Combinations for general-purpose processors -- 5.3.2 Targeting specific applications -- 5.4 Application benchmark dependence -- 5.5 The design of new resilience techniques -- 5.6 Conclusions -- Acknowledgments -- References -- Part II: Reliability assessment -- 6. Physical stress | Fernando Fernandes dos Santos, Fabio Benevenuti, Gennaro Rodrigues, Fernanda Kastensmidt, and Paolo Rech -- 6.1 Introduction -- 6.2 Effects and physical sources -- 6.3 Reliability metrics -- 6.4 General setup -- 6.5 Neutron beam experiments -- 6.6 Heavy ions and proton experiments -- 6.7 Laser test -- 6.8 Conclusions -- References -- 7.
Soft error modeling and simulation | Mojtaba Ebrahimi and Mehdi Tahoori -- 7.1 Introduction -- 7.2 FIT rate analysis at device level -- 7.3 Multiple transient error site identification using layout information -- 7.3.1 Motivation for layout-based MT analysis and mitigation -- 7.3.2 Proposed layout-based MT error site extraction technique -- 7.3.3 Experimental results of MT modeling -- 7.4 Propagating flip-flop errors at circuit level -- 7.4.1 Event-driven logic simulation -- 7.4.2 Error propagation from single flip-flop -- 7.4.3 Concurrent transient error propagation from multiple flip-flops -- 7.4.4 Experimental results -- 7.5 Propagating combinational gates errors at circuit level -- 7.6 Emulation-based fault injection platform -- 7.6.1 Shadow components -- 7.6.2 Shadow components-based fault injection technique -- 7.6.3 Experimental results -- 7.7 Fault injection acceleration -- 7.7.1 Workflow -- 7.7.2 Analytical modeling -- 7.7.3 Case study: fault injection on memory arrays of Leon3 -- 7.8 Conclusions -- References -- 8. Microarchitecture-level reliability assessment of multi-core processors | Athanasios Chatzidimitriou and Dimitris Gizopoulos -- 8.1 Introduction -- 8.2 Background -- 8.2.1 Threats and vulnerability -- 8.3 Fault-effect classes -- 8.4 Statistical fault injection -- 8.5 Cross-layer and single-layer evaluation -- 8.6 Assessment throughput -- 8.6.1 Simulation acceleration -- 8.6.2 Fault list reduction -- 8.7 Estimation accuracy -- 8.8 Conclusions -- References -- 9.
Fault injection at the instruction set architecture (ISA) level | Karthik Pattabiraman and Guanpeng Li -- 9.1 Introduction -- 9.2 Background -- 9.2.1 Terms and definitions -- 9.2.2 Failure outcomes -- 9.2.3 Metrics -- 9.2.4 Fault Injection process -- 9.2.5 Fault model -- 9.3 Classification of injection techniques -- 9.3.1 Simulation versus direct -- 9.3.2 Intrusive versus nonintrusive -- 9.3.3 Level of injection -- 9.3.4 Platform -- 9.3.5 Classification results -- 9.4 LLFI and PINFI fault injectors -- 9.4.1 LLVM fault injector: LLFI -- 9.4.2 PINFI -- 9.5 Open challenges and conclusion -- 9.5.1 Challenge 1: level of injection -- 9.5.2 Challenge 2: target platform -- 9.5.3 Challenge 3: bit-flip model -- 9.5.4 Conclusion -- Acknowledgments -- References -- 10. Analytical modeling for crosslayer resiliency | Arijit Biswas -- 10.1 Introduction -- 10.2 ACE lifetime analysis -- 10.2.1 Un-ACE and ACE -- 10.2.2 Little's law -- 10.2.3 Example of ACE lifetime analysis -- 10.2.4 AVFs of various structures and workloads using ACE lifetime analysis -- 10.2.5 Hamming Distance Analysis and bit field analysis -- 10.2.6 Hamming Distance Analysis and multi-bit fault modeling -- 10.3 Sequential AVF analysis -- 10.3.1 port AVF (pAVF) and structure AVF -- 10.3.2 Sequential AVF computation -- 10.4 Program vulnerability factor -- 10.4.1 Cross-layer modeling using AVF and PVF -- 10.5 Artifacts of analytical vulnerability modeling and mitigations -- 10.5.1 Significance of data values in analytical modeling -- 10.5.2 Reducing unknowns-warmup and cooldown -- 10.5.3 Dealing with large and complex models -- 10.6 Future directions for analytical technique -- 10.7 Summary of analytical modeling for vulnerability -- References -- 11.
Stochastic methods | Alessandro Savino, Alessandro Vallero, and Stefano Di Carlo -- 11.1 Introduction -- 11.2 Methodologies -- 11.2.1 Reliability Block Diagrams -- 11.2.2 Markov Chains -- 11.2.3 Bayesian Networks -- 11.3 Conclusions -- References -- Index This book presents state-of-the-art solutions for increasing the resilience of computing systems, both at single levels of abstraction and multi-layers. It is a valuable resource for researchers, postgraduate students and professional computer architects focusing on the dependability of computing systems Di Natale, Giorgio edt Gizopoulos, Dimitris edt Di Carlo, Stefano (DE-588)1256954519 edt Bosio, Alberto edt Canal, Ramon edt Also published as print edition 978-1-78561-797-3 IET materials, circuits and devices series 57 (DE-604)BV044007507 57 |
spellingShingle | Cross-layer reliability of computing systems IET materials, circuits and devices series |
title | Cross-layer reliability of computing systems |
title_auth | Cross-layer reliability of computing systems |
title_exact_search | Cross-layer reliability of computing systems |
title_exact_search_txtP | Cross-layer reliability of computing systems |
title_full | Cross-layer reliability of computing systems edited by Giorgio Di Natale, Dimitris Gizopoulos, Stefano Di Carlo, Alberto Bosio and Ramon Canal |
title_fullStr | Cross-layer reliability of computing systems edited by Giorgio Di Natale, Dimitris Gizopoulos, Stefano Di Carlo, Alberto Bosio and Ramon Canal |
title_full_unstemmed | Cross-layer reliability of computing systems edited by Giorgio Di Natale, Dimitris Gizopoulos, Stefano Di Carlo, Alberto Bosio and Ramon Canal |
title_short | Cross-layer reliability of computing systems |
title_sort | cross layer reliability of computing systems |
volume_link | (DE-604)BV044007507 |
work_keys_str_mv | AT dinatalegiorgio crosslayerreliabilityofcomputingsystems AT gizopoulosdimitris crosslayerreliabilityofcomputingsystems AT dicarlostefano crosslayerreliabilityofcomputingsystems AT bosioalberto crosslayerreliabilityofcomputingsystems AT canalramon crosslayerreliabilityofcomputingsystems |