The CUDA handbook: a comprehensive guide to GPU programming
Main author: | Wilt, Nicholas |
---|---|
Format: | Book |
Language: | English |
Published: | Upper Saddle River, NJ ; Munich [et al.] : Addison-Wesley, 2013 |
Edition: | 1. printing |
Subjects: | |
Online access: | Table of contents |
Description: | XXV, 494 pp., diagrams |
ISBN: | 9780321809469 0321809467 |
Internal format
MARC
LEADER | 00000nam a2200000zc 4500 | ||
---|---|---|---|
001 | BV041100348 | ||
003 | DE-604 | ||
005 | 20130716 | ||
007 | t | ||
008 | 130620s2013 d||| |||| 00||| eng d | ||
016 | 7 | |a 016246420 |2 DE-101 | |
020 | |a 9780321809469 |c (pbk.) £51.99 |9 978-0-321-80946-9 | ||
020 | |a 0321809467 |9 0-321-80946-7 | ||
035 | |a (OCoLC)856806625 | ||
035 | |a (DE-599)HBZHT017552895 | ||
040 | |a DE-604 |b ger | ||
041 | 0 | |a eng | |
049 | |a DE-703 |a DE-91G |a DE-29T | ||
084 | |a ST 151 |0 (DE-625)143595: |2 rvk | ||
084 | |a ST 230 |0 (DE-625)143617: |2 rvk | ||
084 | |a ST 320 |0 (DE-625)143657: |2 rvk | ||
084 | |a DAT 516f |2 stub | ||
084 | |a DAT 752f |2 stub | ||
100 | 1 | |a Wilt, Nicholas |d 1970- |e Verfasser |0 (DE-588)172709784 |4 aut | |
245 | 1 | 0 | |a The CUDA handbook |b a comprehensive guide to GPU programming |c Nicholas Wilt |
250 | |a 1. printing | ||
264 | 1 | |a Upper Saddle River, NJ ; Munich [u.a.] |b Addison-Wesley |c 2013 | |
300 | |a XXV, 494 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
650 | 0 | 7 | |a Grafikprozessor |0 (DE-588)4582114-8 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Programmierung |0 (DE-588)4076370-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Parallelverarbeitung |0 (DE-588)4075860-6 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a CUDA |g Informatik |0 (DE-588)7719528-0 |2 gnd |9 rswk-swf |
653 | |a Application software--Development. | ||
653 | |a Computer architecture. | ||
653 | |a Graphics processing units--Programming. | ||
689 | 0 | 0 | |a Parallelverarbeitung |0 (DE-588)4075860-6 |D s |
689 | 0 | 1 | |a Programmierung |0 (DE-588)4076370-5 |D s |
689 | 0 | 2 | |a Grafikprozessor |0 (DE-588)4582114-8 |D s |
689 | 0 | 3 | |a CUDA |g Informatik |0 (DE-588)7719528-0 |D s |
689 | 0 | |5 DE-604 | |
856 | 4 | 2 | |m Digitalisierung UB Bayreuth |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026076746&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-026076746 |
Record in the search index
_version_ | 1804150481933565953 |
---|---|
adam_text | Contents
Preface .......... xxi
Acknowledgments .......... xxiii
About the Author .......... xxv
PART I
_______________________________________1
Chapter 1: Background .......... 3
1.1 Our Approach .......... 5
1.2 Code .......... 6
1.2.1 Microbenchmarks .......... 6
1.2.2 Microdemos .......... 7
1.2.3 Optimization Journeys .......... 7
1.3 Administrative Items .......... 7
1.3.1 Open Source .......... 7
1.3.2 CUDA Handbook Library (chLib) .......... 8
1.3.3 Coding Style .......... 8
1.3.4 CUDA SDK .......... 8
1.4 Road Map .......... 8
Chapter 2: Hardware Architecture .......... 11
2.1 CPU Configurations .......... 11
2.1.1 Front-Side Bus .......... 12
2.1.2 Symmetric Multiprocessors .......... 13
2.1.3 Nonuniform Memory Access .......... 14
2.1.4 PCI Express Integration .......... 17
2.2 Integrated GPUs .......... 17
2.3 Multiple GPUs .......... 19
2.4 Address Spaces in CUDA .......... 22
2.4.1 Virtual Addressing: A Brief History .......... 22
2.4.2 Disjoint Address Spaces .......... 26
2.4.3 Mapped Pinned Memory .......... 28
2.4.4 Portable Pinned Memory .......... 29
2.4.5 Unified Addressing .......... 30
2.4.6 Peer-to-Peer Mappings .......... 31
2.5 CPU/GPU Interactions .......... 32
2.5.1 Pinned Host Memory and Command Buffers .......... 32
2.5.2 CPU/GPU Concurrency .......... 35
2.5.3 The Host Interface and Intra-GPU Synchronization .......... 39
2.5.4 Inter-GPU Synchronization .......... 41
2.6 GPU Architecture .......... 41
2.6.1 Overview .......... 42
2.6.2 Streaming Multiprocessors .......... 46
2.7 Further Reading .......... 50
Chapter 3: Software Architecture .......... 51
3.1 Software Layers .......... 51
3.1.1 CUDA Runtime and Driver .......... 53
3.1.2 Driver Models .......... 54
3.1.3 nvcc, PTX, and Microcode .......... 57
3.2 Devices and Initialization .......... 59
3.2.1 Device Count .......... 60
3.2.2 Device Attributes .......... 60
3.2.3 When CUDA Is Not Present .......... 63
3.3 Contexts .......... 67
3.3.1 Lifetime and Scoping .......... 68
3.3.2 Preallocation of Resources .......... 68
3.3.3 Address Space .......... 69
3.3.4 Current Context Stack .......... 69
3.3.5 Context State .......... 71
3.4 Modules and Functions .......... 71
3.5 Kernels (Functions) .......... 73
3.6 Device Memory .......... 75
3.7 Streams and Events .......... 76
3.7.1 Software Pipelining .......... 76
3.7.2 Stream Callbacks .......... 77
3.7.3 The NULL Stream .......... 77
3.7.4 Events .......... 78
3.8 Host Memory .......... 79
3.8.1 Pinned Host Memory .......... 80
3.8.2 Portable Pinned Memory .......... 81
3.8.3 Mapped Pinned Memory .......... 81
3.8.4 Host Memory Registration .......... 81
3.9 CUDA Arrays and Texturing .......... 82
3.9.1 Texture References .......... 82
3.9.2 Surface References .......... 85
3.10 Graphics Interoperability .......... 86
3.11 The CUDA Runtime and CUDA Driver API .......... 87
Chapter 4: Software Environment .......... 93
4.1 nvcc - CUDA Compiler Driver .......... 93
4.2 ptxas - the PTX Assembler .......... 100
4.3 cuobjdump .......... 105
4.4 nvidia-smi .......... 106
4.5 Amazon Web Services .......... 109
4.5.1 Command-Line Tools .......... 110
4.5.2 EC2 and Virtualization .......... 110
4.5.3 Key Pairs .......... 111
4.5.4 Availability Zones (AZs) and Regions .......... 112
4.5.5 S3 .......... 112
4.5.6 EBS .......... 113
4.5.7 AMIs .......... 113
4.5.8 Linux on EC2 .......... 114
4.5.9 Windows on EC2 .......... 115
PART II
_____________________________________119
Chapter 5: Memory .......... 121
5.1 Host Memory .......... 122
5.1.1 Allocating Pinned Memory .......... 122
5.1.2 Portable Pinned Memory .......... 123
5.1.3 Mapped Pinned Memory .......... 124
5.1.4 Write-Combined Pinned Memory .......... 124
5.1.5 Registering Pinned Memory .......... 125
5.1.6 Pinned Memory and UVA .......... 126
5.1.7 Mapped Pinned Memory Usage .......... 127
5.1.8 NUMA, Thread Affinity, and Pinned Memory .......... 128
5.2 Global Memory .......... 130
5.2.1 Pointers .......... 131
5.2.2 Dynamic Allocations .......... 132
5.2.3 Querying the Amount of Global Memory .......... 137
5.2.4 Static Allocations .......... 138
5.2.5 Memset APIs .......... 139
5.2.6 Pointer Queries .......... 140
5.2.7 Peer-to-Peer Access .......... 143
5.2.8 Reading and Writing Global Memory .......... 143
5.2.9 Coalescing Constraints .......... 143
5.2.10 Microbenchmarks: Peak Memory Bandwidth .......... 147
5.2.11 Atomic Operations .......... 152
5.2.12 Texturing from Global Memory .......... 155
5.2.13 ECC (Error Correcting Codes) .......... 155
5.3 Constant Memory .......... 156
5.3.1 Host and Device __constant__ Memory .......... 157
5.3.2 Accessing __constant__ Memory .......... 157
5.4 Local Memory .......... 158
5.5 Texture Memory .......... 162
5.6 Shared Memory .......... 162
5.6.1 Unsized Shared Memory Declarations .......... 163
5.6.2 Warp-Synchronous Coding .......... 164
5.6.3 Pointers to Shared Memory .......... 164
5.7 Memory Copy
5.7.1 Synchronous versus Asynchronous Memcpy .......... 165
5.7.2 Unified Virtual Addressing .......... 166
5.7.3 CUDA Runtime .......... 166
5.7.4 Driver API .......... 169
Chapter 6: Streams and Events .......... 173
6.1 CPU/GPU Concurrency: Covering Driver Overhead .......... 174
6.1.1 Kernel Launches .......... 174
6.2 Asynchronous Memcpy .......... 178
6.2.1 Asynchronous Memcpy: Host→Device .......... 179
6.2.2 Asynchronous Memcpy: Device→Host .......... 181
6.2.3 The NULL Stream and Concurrency Breaks .......... 181
6.3 CUDA Events: CPU/GPU Synchronization .......... 183
6.3.1 Blocking Events .......... 186
6.3.2 Queries .......... 186
6.4 CUDA Events: Timing .......... 186
6.5 Concurrent Copying and Kernel Processing .......... 187
6.5.1 concurrencyMemcpyKernel.cu .......... 189
6.5.2 Performance Results .......... 194
6.5.3 Breaking Interengine Concurrency .......... 196
6.6 Mapped Pinned Memory .......... 197
6.7 Concurrent Kernel Processing .......... 199
6.8 GPU/GPU Synchronization: cudaStreamWaitEvent() .......... 202
6.8.1 Streams and Events on Multi-GPU: Notes and Limitations .......... 202
6.9 Source Code Reference .......... 202
Chapter 7: Kernel Execution .......... 205
7.1 Overview .......... 205
7.2 Syntax .......... 206
7.2.1 Limitations .......... 208
7.2.2 Caches and Coherency .......... 209
7.2.3 Asynchrony and Error Handling .......... 209
7.2.4 Timeouts .......... 210
7.2.5 Local Memory .......... 210
7.2.6 Shared Memory .......... 211
7.3 Blocks, Threads, Warps, and Lanes .......... 211
7.3.1 Grids of Blocks .......... 211
7.3.2 Execution Guarantees .......... 215
7.3.3 Block and Thread IDs .......... 216
7.4 Occupancy .......... 220
7.5 Dynamic Parallelism .......... 222
7.5.1 Scoping and Synchronization .......... 223
7.5.2 Memory Model .......... 224
7.5.3 Streams and Events .......... 225
7.5.4 Error Handling .......... 225
7.5.5 Compiling and Linking .......... 226
7.5.6 Resource Management .......... 226
7.5.7 Summary .......... 228
Chapter 8: Streaming Multiprocessors .......... 231
8.1 Memory .......... 233
8.1.1 Registers .......... 233
8.1.2 Local Memory .......... 234
8.1.3 Global Memory .......... 235
8.1.4 Constant Memory .......... 237
8.1.5 Shared Memory .......... 237
8.1.6 Barriers and Coherency .......... 240
8.2 Integer Support .......... 241
8.2.1 Multiplication .......... 241
8.2.2 Miscellaneous (Bit Manipulation) .......... 242
8.2.3 Funnel Shift (SM 3.5) .......... 243
8.3 Floating-Point Support .......... 244
8.3.1 Formats .......... 244
8.3.2 Single Precision (32-Bit) .......... 250
8.3.3 Double Precision (64-Bit) .......... 253
8.3.4 Half Precision (16-Bit) .......... 253
8.3.5 Case Study: float→half Conversion .......... 253
8.3.6 Math Library .......... 258
8.3.7 Additional Reading .......... 266
8.4 Conditional Code .......... 267
8.4.1 Predication .......... 267
8.4.2 Divergence and Convergence .......... 268
8.4.3 Special Cases: Min, Max and Absolute Value .......... 269
8.5 Textures and Surfaces .......... 269
8.6 Miscellaneous Instructions .......... 270
8.6.1 Warp-Level Primitives .......... 270
8.6.2 Block-Level Primitives .......... 272
8.6.3 Performance Counter .......... 272
8.6.4 Video Instructions .......... 272
8.6.5 Special Registers .......... 275
8.7 Instruction Sets .......... 275
Chapter 9: Multiple GPUs .......... 287
9.1 Overview .......... 287
9.2 Peer-to-Peer .......... 288
9.2.1 Peer-to-Peer Memcpy .......... 288
9.2.2 Peer-to-Peer Addressing .......... 289
9.3 UVA: Inferring Device from Address .......... 291
9.4 Inter-GPU Synchronization .......... 292
9.5 Single-Threaded Multi-GPU .......... 294
9.5.1 Current Context Stack .......... 294
9.5.2 N-Body .......... 296
9.6 Multithreaded Multi-GPU .......... 299
Chapter 10: Texturing .......... 305
10.1 Overview .......... 305
10.1.1 Two Use Cases .......... 306
10.2 Texture Memory .......... 306
10.2.1 Device Memory .......... 307
10.2.2 CUDA Arrays and Block Linear Addressing .......... 308
10.2.3 Device Memory versus CUDA Arrays .......... 313
10.3 1D Texturing .......... 314
10.3.1 Texture Setup .......... 314
10.4 Texture as a Read Path .......... 317
10.4.1 Increasing Effective Address Coverage .......... 318
10.4.2 Texturing from Host Memory .......... 321
10.5 Texturing with Unnormalized Coordinates .......... 323
10.6 Texturing with Normalized Coordinates .......... 331
10.7 1D Surface Read/Write .......... 333
10.8 2D Texturing .......... 335
10.8.1 Microdemo: tex2d_opengl.cu .......... 335
10.9 2D Texturing: Copy Avoidance .......... 338
10.9.1 2D Texturing from Device Memory .......... 338
10.9.2 2D Surface Read/Write .......... 340
10.10 3D Texturing .......... 340
10.11 Layered Textures .......... 342
10.11.1 1D Layered Textures .......... 343
10.11.2 2D Layered Textures .......... 343
10.12 Optimal Block Sizing and Performance .......... 343
10.12.1 Results .......... 344
10.13 Texturing Quick References .......... 345
10.13.1 Hardware Capabilities .......... 345
10.13.2 CUDA Runtime .......... 347
10.13.3 Driver API .......... 349
PART III
____________________________________351
Chapter 11: Streaming Workloads .......... 353
11.1 Device Memory .......... 355
11.2 Asynchronous Memcpy .......... 358
11.3 Streams .......... 359
11.4 Mapped Pinned Memory .......... 361
11.5 Performance and Summary .......... 362
Chapter 12: Reduction .......... 365
12.1 Overview .......... 365
12.2 Two-Pass Reduction .......... 367
12.3 Single-Pass Reduction .......... 373
12.4 Reduction with Atomics .......... 376
12.5 Arbitrary Block Sizes .......... 377
12.6 Reduction Using Arbitrary Data Types .......... 378
12.7 Predicate Reduction .......... 382
12.8 Warp Reduction with Shuffle .......... 382
Chapter 13: Scan .......... 385
13.1 Definition and Variations .......... 385
13.2 Overview .......... 387
13.3 Scan and Circuit Design .......... 390
13.4 CUDA Implementations .......... 394
13.4.1 Scan-Then-Fan .......... 394
13.4.2 Reduce-Then-Scan (Recursive) .......... 400
13.4.3 Reduce-Then-Scan (Two Pass) .......... 403
13.5 Warp Scans .......... 407
13.5.1 Zero Padding .......... 408
13.5.2 Templated Formulations .......... 409
13.5.3 Warp Shuffle .......... 410
13.5.4 Instruction Counts .......... 412
13.6 Stream Compaction .......... 414
13.7 References (Parallel Scan Algorithms) .......... 418
13.8 Further Reading (Parallel Prefix Sum Circuits) .......... 419
Chapter 14: N-Body .......... 421
14.1 Introduction .......... 423
14.1.1 A Matrix of Forces .......... 424
14.2 Naïve Implementation .......... 428
14.3 Shared Memory .......... 432
14.4 Constant Memory .......... 434
14.5 Warp Shuffle .......... 436
14.6 Multiple GPUs and Scalability .......... 438
14.7 CPU Optimizations .......... 439
14.8 Conclusion .......... 444
14.9 References and Further Reading .......... 446
Chapter 15: Image Processing: Normalized Correlation .......... 449
15.1 Overview .......... 449
15.2 Naive Texture-Texture Implementation .......... 452
15.3 Template in Constant Memory .......... 456
15.4 Image in Shared Memory .......... 459
15.5 Further Optimizations .......... 463
15.5.1 SM-Aware Coding .......... 463
15.5.2 Loop Unrolling .......... 464
15.6 Source Code .......... 465
15.7 Performance and Further Reading .......... 466
15.8 Further Reading .......... 469
Appendix A The CUDA Handbook Library .......... 471
A.1 Timing .......... 471
A.2 Threading .......... 472
A.3 Driver API Facilities .......... 474
A.4 Shmoos .......... 475
A.5 Command Line Parsing .......... 476
A.6 Error Handling .......... 477
Glossary/TLA Decoder .......... 481
Index .......... 487
|
any_adam_object | 1 |
author | Wilt, Nicholas 1970- |
author_GND | (DE-588)172709784 |
author_facet | Wilt, Nicholas 1970- |
author_role | aut |
author_sort | Wilt, Nicholas 1970- |
author_variant | n w nw |
building | Verbundindex |
bvnumber | BV041100348 |
classification_rvk | ST 151 ST 230 ST 320 |
classification_tum | DAT 516f DAT 752f |
ctrlnum | (OCoLC)856806625 (DE-599)HBZHT017552895 |
discipline | Informatik |
edition | 1. printing |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02035nam a2200505zc 4500</leader><controlfield tag="001">BV041100348</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20130716 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">130620s2013 d||| |||| 00||| eng d</controlfield><datafield tag="016" ind1="7" ind2=" "><subfield code="a">016246420</subfield><subfield code="2">DE-101</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9780321809469</subfield><subfield code="c">(pbk.) £51.99</subfield><subfield code="9">978-0-321-80946-9</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">0321809467</subfield><subfield code="9">0-321-80946-7</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)856806625</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)HBZHT017552895</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-703</subfield><subfield code="a">DE-91G</subfield><subfield code="a">DE-29T</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 151</subfield><subfield code="0">(DE-625)143595:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 230</subfield><subfield code="0">(DE-625)143617:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 320</subfield><subfield code="0">(DE-625)143657:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">DAT 516f</subfield><subfield 
code="2">stub</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">DAT 752f</subfield><subfield code="2">stub</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Wilt, Nicholas</subfield><subfield code="d">1970-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)172709784</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">The CUDA handbook</subfield><subfield code="b">a comprehensive guide to GPU programming</subfield><subfield code="c">Nicholas Wilt</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">1. printing</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Upper Saddle River, NJ ; Munich [u.a.]</subfield><subfield code="b">Addison-Wesley</subfield><subfield code="c">2013</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XXV, 494 S.</subfield><subfield code="b">graph. 
Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Grafikprozessor</subfield><subfield code="0">(DE-588)4582114-8</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Programmierung</subfield><subfield code="0">(DE-588)4076370-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Parallelverarbeitung</subfield><subfield code="0">(DE-588)4075860-6</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">CUDA</subfield><subfield code="g">Informatik</subfield><subfield code="0">(DE-588)7719528-0</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="653" ind1=" " ind2=" "><subfield code="a">Application software--Development.</subfield></datafield><datafield tag="653" ind1=" " ind2=" "><subfield code="a">Computer architecture.</subfield></datafield><datafield tag="653" ind1=" " ind2=" "><subfield code="a">Graphics processing units--Programming.</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Parallelverarbeitung</subfield><subfield code="0">(DE-588)4075860-6</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Programmierung</subfield><subfield code="0">(DE-588)4076370-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" 
ind1="0" ind2="2"><subfield code="a">Grafikprozessor</subfield><subfield code="0">(DE-588)4582114-8</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="3"><subfield code="a">CUDA</subfield><subfield code="g">Informatik</subfield><subfield code="0">(DE-588)7719528-0</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Bayreuth</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026076746&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-026076746</subfield></datafield></record></collection> |
id | DE-604.BV041100348 |
illustrated | Illustrated |
indexdate | 2024-07-10T00:39:37Z |
institution | BVB |
isbn | 9780321809469 0321809467 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-026076746 |
oclc_num | 856806625 |
open_access_boolean | |
owner | DE-703 DE-91G DE-BY-TUM DE-29T |
owner_facet | DE-703 DE-91G DE-BY-TUM DE-29T |
physical | XXV, 494 S. graph. Darst. |
publishDate | 2013 |
publishDateSearch | 2013 |
publishDateSort | 2013 |
publisher | Addison-Wesley |
record_format | marc |
spelling | Wilt, Nicholas 1970- Verfasser (DE-588)172709784 aut The CUDA handbook a comprehensive guide to GPU programming Nicholas Wilt 1. printing Upper Saddle River, NJ ; Munich [u.a.] Addison-Wesley 2013 XXV, 494 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Grafikprozessor (DE-588)4582114-8 gnd rswk-swf Programmierung (DE-588)4076370-5 gnd rswk-swf Parallelverarbeitung (DE-588)4075860-6 gnd rswk-swf CUDA Informatik (DE-588)7719528-0 gnd rswk-swf Application software--Development. Computer architecture. Graphics processing units--Programming. Parallelverarbeitung (DE-588)4075860-6 s Programmierung (DE-588)4076370-5 s Grafikprozessor (DE-588)4582114-8 s CUDA Informatik (DE-588)7719528-0 s DE-604 Digitalisierung UB Bayreuth application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026076746&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Wilt, Nicholas 1970- The CUDA handbook a comprehensive guide to GPU programming Grafikprozessor (DE-588)4582114-8 gnd Programmierung (DE-588)4076370-5 gnd Parallelverarbeitung (DE-588)4075860-6 gnd CUDA Informatik (DE-588)7719528-0 gnd |
subject_GND | (DE-588)4582114-8 (DE-588)4076370-5 (DE-588)4075860-6 (DE-588)7719528-0 |
title | The CUDA handbook a comprehensive guide to GPU programming |
title_auth | The CUDA handbook a comprehensive guide to GPU programming |
title_exact_search | The CUDA handbook a comprehensive guide to GPU programming |
title_full | The CUDA handbook a comprehensive guide to GPU programming Nicholas Wilt |
title_fullStr | The CUDA handbook a comprehensive guide to GPU programming Nicholas Wilt |
title_full_unstemmed | The CUDA handbook a comprehensive guide to GPU programming Nicholas Wilt |
title_short | The CUDA handbook |
title_sort | the cuda handbook a comprehensive guide to gpu programming |
title_sub | a comprehensive guide to GPU programming |
topic | Grafikprozessor (DE-588)4582114-8 gnd Programmierung (DE-588)4076370-5 gnd Parallelverarbeitung (DE-588)4075860-6 gnd CUDA Informatik (DE-588)7719528-0 gnd |
topic_facet | Grafikprozessor Programmierung Parallelverarbeitung CUDA Informatik |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=026076746&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT wiltnicholas thecudahandbookacomprehensiveguidetogpuprogramming |