Neural networks and deep learning: a textbook
Main author: | |
---|---|
Format: | Book |
Language: | English |
Published: |
Cham, Switzerland
Springer
[2018]
|
Subjects: | |
Online access: | Content description https://www.springer.com/de/book/9783319944623 Table of contents |
Description: | XXIII, 497 pages, illustrations, diagrams, 23.5 cm x 15.5 cm |
ISBN: | 9783030068561 9783319944623 3319944622 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV045147819 | ||
003 | DE-604 | ||
005 | 20210806 | ||
007 | t | ||
008 | 180827s2018 sz a||| |||| 00||| eng d | ||
016 | 7 | |a 1159914656 |2 DE-101 | |
020 | |a 9783030068561 |c pbk |9 978-3-030-06856-1 | ||
020 | |a 9783319944623 |c Festeinband : circa EUR 64.19 (DE) (freier Preis), circa EUR 65.99 (AT) (freier Preis), circa CHF 66.00 (freier Preis) |9 978-3-319-94462-3 | ||
020 | |a 3319944622 |9 3-319-94462-2 | ||
024 | 3 | |a 9783319944623 | |
035 | |a (OCoLC)1055864028 | ||
035 | |a (DE-599)DNB1159914656 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
044 | |a sz |c XA-CH | ||
049 | |a DE-29T |a DE-945 |a DE-M347 |a DE-1028 |a DE-Aug4 |a DE-11 |a DE-83 |a DE-188 |a DE-19 |a DE-355 |a DE-1043 |a DE-739 |a DE-634 |a DE-20 | ||
084 | |a ST 301 |0 (DE-625)143651: |2 rvk | ||
100 | 1 | |a Aggarwal, Charu C. |d 1970- |e Verfasser |0 (DE-588)133500101 |4 aut | |
245 | 1 | 0 | |a Neural networks and deep learning |b a textbook |c Charu C. Aggarwal |
264 | 1 | |a Cham, Switzerland |b Springer |c [2018] | |
264 | 4 | |c © 2018 | |
300 | |a XXIII, 497 Seiten |b Illustrationen, Diagramme |c 23.5 cm x 15.5 cm | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
650 | 0 | 7 | |a Neuronales Netz |0 (DE-588)4226127-2 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Deep learning |0 (DE-588)1135597375 |2 gnd |9 rswk-swf |
655 | 7 | |0 (DE-588)4123623-3 |a Lehrbuch |2 gnd-content | |
689 | 0 | 0 | |a Neuronales Netz |0 (DE-588)4226127-2 |D s |
689 | 0 | 1 | |a Deep learning |0 (DE-588)1135597375 |D s |
689 | 0 | |5 DE-604 | |
689 | 1 | 0 | |a Maschinelles Lernen |0 (DE-588)4193754-5 |D s |
689 | 1 | |5 DE-604 | |
710 | 2 | |a Springer International Publishing |0 (DE-588)1064344704 |4 pbl | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |z 978-3-319-94463-0 |
856 | 4 | 2 | |m X:MVB |q text/html |u http://deposit.dnb.de/cgi-bin/dokserv?id=d03ee684ae884db698677fd189f4cd49&prov=M&dok_var=1&dok_ext=htm |3 Inhaltstext |
856 | 4 | 2 | |m X:MVB |u https://www.springer.com/de/book/9783319944623 |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030537527&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-030537527 |
Record in the search index
_version_ | 1804178817306066944 |
---|---|
adam_text | Contents
1 An Introduction to Neural Networks
1.1 Introduction
1.1.1 Humans Versus Computers: Stretching the Limits of Artificial Intelligence
1.2 The Basic Architecture of Neural Networks
1.2.1 Single Computational Layer: The Perceptron
1.2.1.1 What Objective Function Is the Perceptron Optimizing?
1.2.1.2 Relationship with Support Vector Machines
1.2.1.3 Choice of Activation and Loss Functions
1.2.1.4 Choice and Number of Output Nodes
1.2.1.5 Choice of Loss Function
1.2.1.6 Some Useful Derivatives of Activation Functions
1.2.2 Multilayer Neural Networks
1.2.3 The Multilayer Network as a Computational Graph
1.3 Training a Neural Network with Backpropagation
1.4 Practical Issues in Neural Network Training
1.4.1 The Problem of Overfitting
1.4.1.1 Regularization
1.4.1.2 Neural Architecture and Parameter Sharing
1.4.1.3 Early Stopping
1.4.1.4 Trading Off Breadth for Depth
1.4.1.5 Ensemble Methods
1.4.2 The Vanishing and Exploding Gradient Problems
1.4.3 Difficulties in Convergence
1.4.4 Local and Spurious Optima
1.4.5 Computational Challenges
1.5 The Secrets to the Power of Function Composition
1.5.1 The Importance of Nonlinear Activation
1.5.2 Reducing Parameter Requirements with Depth
1.5.3 Unconventional Neural Architectures
1.5.3.1 Blurring the Distinctions Between Input, Hidden, and Output Layers
1.5.3.2 Unconventional Operations and Sum-Product Networks
1.6 Common Neural Architectures
1.6.1 Simulating Basic Machine Learning with Shallow Models
1.6.2 Radial Basis Function Networks
1.6.3 Restricted Boltzmann Machines
1.6.4 Recurrent Neural Networks
1.6.5 Convolutional Neural Networks
1.6.6 Hierarchical Feature Engineering and Pretrained Models
1.7 Advanced Topics
1.7.1 Reinforcement Learning
1.7.2 Separating Data Storage and Computations
1.7.3 Generative Adversarial Networks
1.8 Two Notable Benchmarks
1.8.1 The MNIST Database of Handwritten Digits
1.8.2 The ImageNet Database
1.9 Summary
1.10 Bibliographic Notes
1.10.1 Video Lectures
1.10.2 Software Resources
1.11 Exercises
2 Machine Learning with Shallow Neural Networks
2.1 Introduction
2.2 Neural Architectures for Binary Classification Models
2.2.1 Revisiting the Perceptron
2.2.2 Least-Squares Regression
2.2.2.1 Widrow-Hoff Learning
2.2.2.2 Closed Form Solutions
2.2.3 Logistic Regression
2.2.3.1 Alternative Choices of Activation and Loss
2.2.4 Support Vector Machines
2.3 Neural Architectures for Multiclass Models
2.3.1 Multiclass Perceptron
2.3.2 Weston-Watkins SVM
2.3.3 Multinomial Logistic Regression (Softmax Classifier)
2.3.4 Hierarchical Softmax for Many Classes
2.4 Backpropagated Saliency for Feature Selection
2.5 Matrix Factorization with Autoencoders
2.5.1 Autoencoder: Basic Principles
2.5.1.1 Autoencoder with a Single Hidden Layer
2.5.1.2 Connections with Singular Value Decomposition
2.5.1.3 Sharing Weights in Encoder and Decoder
2.5.1.4 Other Matrix Factorization Methods
2.5.2 Nonlinear Activations
2.5.3 Deep Autoencoders
2.5.4 Application to Outlier Detection
2.5.5 When the Hidden Layer Is Broader than the Input Layer
2.5.5.1 Sparse Feature Learning
2.5.6 Other Applications
2.5.7 Recommender Systems: Row Index to Row Value Prediction
2.5.8 Discussion
2.6 Word2vec: An Application of Simple Neural Architectures
2.6.1 Neural Embedding with Continuous Bag of Words
2.6.2 Neural Embedding with Skip-Gram Model
2.6.3 Word2vec (SGNS) Is Logistic Matrix Factorization
2.6.4 Vanilla Skip-Gram Is Multinomial Matrix Factorization
2.7 Simple Neural Architectures for Graph Embeddings
2.7.1 Handling Arbitrary Edge Counts
2.7.2 Multinomial Model
2.7.3 Connections with DeepWalk and Node2vec
2.8 Summary
2.9 Bibliographic Notes
2.9.1 Software Resources
2.10 Exercises
3 Training Deep Neural Networks
3.1 Introduction
3.2 Backpropagation: The Gory Details
3.2.1 Backpropagation with the Computational Graph Abstraction
3.2.2 Dynamic Programming to the Rescue
3.2.3 Backpropagation with Post-Activation Variables
3.2.4 Backpropagation with Pre-activation Variables
3.2.5 Examples of Updates for Various Activations
3.2.5.1 The Special Case of Softmax
3.2.6 A Decoupled View of Vector-Centric Backpropagation
3.2.7 Loss Functions on Multiple Output Nodes and Hidden Nodes
3.2.8 Mini-Batch Stochastic Gradient Descent
3.2.9 Backpropagation Tricks for Handling Shared Weights
3.2.10 Checking the Correctness of Gradient Computation
3.3 Setup and Initialization Issues
3.3.1 Tuning Hyperparameters
3.3.2 Feature Preprocessing
3.3.3 Initialization
3.4 The Vanishing and Exploding Gradient Problems
3.4.1 Geometric Understanding of the Effect of Gradient Ratios
3.4.2 A Partial Fix with Activation Function Choice
3.4.3 Dying Neurons and “Brain Damage”
3.4.3.1 Leaky ReLU
3.4.3.2 Maxout
3.5 Gradient-Descent Strategies
3.5.1 Learning Rate Decay
3.5.2 Momentum-Based Learning
3.5.2.1 Nesterov Momentum
3.5.3 Parameter-Specific Learning Rates
3.5.3.1 AdaGrad
3.5.3.2 RMSProp
3.5.3.3 RMSProp with Nesterov Momentum
3.5.3.4 AdaDelta
3.5.3.5 Adam
3.5.4 Cliffs and Higher-Order Instability
3.5.5 Gradient Clipping
3.5.6 Second-Order Derivatives
3.5.6.1 Conjugate Gradients and Hessian-Free Optimization
3.5.6.2 Quasi-Newton Methods and BFGS
3.5.6.3 Problems with Second-Order Methods: Saddle Points
3.5.7 Polyak Averaging
3.5.8 Local and Spurious Minima
3.6 Batch Normalization
3.7 Practical Tricks for Acceleration and Compression
3.7.1 GPU Acceleration
3.7.2 Parallel and Distributed Implementations
3.7.3 Algorithmic Tricks for Model Compression
3.8 Summary
3.9 Bibliographic Notes
3.9.1 Software Resources
3.10 Exercises
4 Teaching Deep Learners to Generalize
4.1 Introduction
4.2 The Bias-Variance Trade-Off
4.2.1 Formal View
4.3 Generalization Issues in Model Tuning and Evaluation
4.3.1 Evaluating with Hold-Out and Cross-Validation
4.3.2 Issues with Training at Scale
4.3.3 How to Detect Need to Collect More Data
4.4 Penalty-Based Regularization
4.4.1 Connections with Noise Injection
4.4.2 L1-Regularization
4.4.3 L1- or L2-Regularization?
4.4.4 Penalizing Hidden Units: Learning Sparse Representations
4.5 Ensemble Methods
4.5.1 Bagging and Subsampling
4.5.2 Parametric Model Selection and Averaging
4.5.3 Randomized Connection Dropping
4.5.4 Dropout
4.5.5 Data Perturbation Ensembles
4.6 Early Stopping
4.6.1 Understanding Early Stopping from the Variance Perspective
4.7 Unsupervised Pretraining
4.7.1 Variations of Unsupervised Pretraining
4.7.2 What About Supervised Pretraining?
4.8 Continuation and Curriculum Learning
4.8.1 Continuation Learning
4.8.2 Curriculum Learning
4.9 Parameter Sharing
4.10 Regularization in Unsupervised Applications
4.10.1 Value-Based Penalization: Sparse Autoencoders
4.10.2 Noise Injection: De-noising Autoencoders
4.10.3 Gradient-Based Penalization: Contractive Autoencoders
4.10.4 Hidden Probabilistic Structure: Variational Autoencoders
4.10.4.1 Reconstruction and Generative Sampling
4.10.4.2 Conditional Variational Autoencoders
4.10.4.3 Relationship with Generative Adversarial Networks
4.11 Summary
4.12 Bibliographic Notes
4.12.1 Software Resources
4.13 Exercises
5 Radial Basis Function Networks
5.1 Introduction
5.2 Training an RBF Network
5.2.1 Training the Hidden Layer
5.2.2 Training the Output Layer
5.2.2.1 Expression with Pseudo-Inverse
5.2.3 Orthogonal Least-Squares Algorithm
5.2.4 Fully Supervised Learning
5.3 Variations and Special Cases of RBF Networks
5.3.1 Classification with Perceptron Criterion
5.3.2 Classification with Hinge Loss
5.3.3 Example of Linear Separability Promoted by RBF
5.3.4 Application to Interpolation
5.4 Relationship with Kernel Methods
5.4.1 Kernel Regression as a Special Case of RBF Networks
5.4.2 Kernel SVM as a Special Case of RBF Networks
5.4.3 Observations
5.5 Summary
5.6 Bibliographic Notes
5.7 Exercises
6 Restricted Boltzmann Machines
6.1 Introduction
6.1.1 Historical Perspective
6.2 Hopfield Networks
6.2.1 Optimal State Configurations of a Trained Network
6.2.2 Training a Hopfield Network
6.2.3 Building a Toy Recommender and Its Limitations
6.2.4 Increasing the Expressive Power of the Hopfield Network
6.3 The Boltzmann Machine
6.3.1 How a Boltzmann Machine Generates Data
6.3.2 Learning the Weights of a Boltzmann Machine
6.4 Restricted Boltzmann Machines
6.4.1 Training the RBM
6.4.2 Contrastive Divergence Algorithm
6.4.3 Practical Issues and Improvisations
6.5 Applications of Restricted Boltzmann Machines
6.5.1 Dimensionality Reduction and Data Reconstruction
6.5.2 RBMs for Collaborative Filtering
6.5.3 Using RBMs for Classification
6.5.4 Topic Models with RBMs
6.5.5 RBMs for Machine Learning with Multimodal Data
6.6 Using RBMs Beyond Binary Data Types
6.7 Stacking Restricted Boltzmann Machines
6.7.1 Unsupervised Learning
6.7.2 Supervised Learning
6.7.3 Deep Boltzmann Machines and Deep Belief Networks
6.8 Summary
6.9 Bibliographic Notes
6.10 Exercises
7 Recurrent Neural Networks
7.1 Introduction
7.1.1 Expressiveness of Recurrent Networks
7.2 The Architecture of Recurrent Neural Networks
7.2.1 Language Modeling Example of RNN
7.2.1.1 Generating a Language Sample
7.2.2 Backpropagation Through Time
7.2.3 Bidirectional Recurrent Networks
7.2.4 Multilayer Recurrent Networks
7.3 The Challenges of Training Recurrent Networks
7.3.1 Layer Normalization
7.4 Echo-State Networks
7.5 Long Short-Term Memory (LSTM)
7.6 Gated Recurrent Units (GRUs)
7.7 Applications of Recurrent Neural Networks
7.7.1 Application to Automatic Image Captioning
7.7.2 Sequence-to-Sequence Learning and Machine Translation
7.7.2.1 Question-Answering Systems
7.7.3 Application to Sentence-Level Classification
7.7.4 Token-Level Classification with Linguistic Features
7.7.5 Time-Series Forecasting and Prediction
7.7.6 Temporal Recommender Systems
7.7.7 Secondary Protein Structure Prediction
7.7.8 End-to-End Speech Recognition
7.7.9 Handwriting Recognition
7.8 Summary
7.9 Bibliographic Notes
7.9.1 Software Resources
7.10 Exercises
8 Convolutional Neural Networks
8.1 Introduction
8.1.1 Historical Perspective and Biological Inspiration
8.1.2 Broader Observations About Convolutional Neural Networks
8.2 The Basic Structure of a Convolutional Network
8.2.1 Padding
8.2.2 Strides
8.2.3 Typical Settings
8.2.4 The ReLU Layer
8.2.5 Pooling
8.2.6 Fully Connected Layers
8.2.7 The Interleaving Between Layers
8.2.8 Local Response Normalization
8.2.9 Hierarchical Feature Engineering
8.3 Training a Convolutional Network
8.3.1 Backpropagating Through Convolutions
8.3.2 Backpropagation as Convolution with Inverted/Transposed Filter
8.3.3 Convolution/Backpropagation as Matrix Multiplications
8.3.4 Data Augmentation
8.4 Case Studies of Convolutional Architectures
8.4.1 AlexNet
8.4.2 ZFNet
8.4.3 VGG
8.4.4 GoogLeNet
8.4.5 ResNet
8.4.6 The Effects of Depth
8.4.7 Pretrained Models
8.5 Visualization and Unsupervised Learning
8.5.1 Visualizing the Features of a Trained Network
8.5.2 Convolutional Autoencoders
8.6 Applications of Convolutional Networks
8.6.1 Content-Based Image Retrieval
8.6.2 Object Localization
8.6.3 Object Detection
8.6.4 Natural Language and Sequence Learning
8.6.5 Video Classification
8.7 Summary
8.8 Bibliographic Notes
8.8.1 Software Resources and Data Sets
8.9 Exercises
9 Deep Reinforcement Learning
9.1 Introduction
9.2 Stateless Algorithms: Multi-Armed Bandits
9.2.1 Naive Algorithm
9.2.2 ε-Greedy Algorithm
9.2.3 Upper Bounding Methods
9.3 The Basic Framework of Reinforcement Learning
9.3.1 Challenges of Reinforcement Learning
XX CONTENTS 9.3.2 Simple Reinforcement Learning for Tic-Tac-Toe................................. 380 9.3.3 Role of Deep Learning and a Straw-Man Algorithm ........................ 380 9.4 Bootstrapping for Value Function Learning................................................... 383 9.4.1 Deep Learning Models as Function Approximators........................... 384 9.4.2 Example: Neural Network for Atari Setting....................................... 386 9.4.3 On-Policy Versus Off-Policy Methods: SARSA ................................. 387 9.4.4 Modeling States Versus State-Action Pairs.......................................... 389 9.5 Policy Gradient Methods ................................................................................. 391 9.5.1 Finite Difference Methods..................................................................... 392 9.5.2 Likelihood Ratio Methods..................................................................... 393 9.5.3 Combining Supervised Learning with Policy Gradients..................... 395 9.5.4 Actor-Critic Methods ........................................................................... 395 9.5.5 Continuous Action Spaces..................................................................... 397 9.5.6 Advantages and Disadvantages of Policy Gradients........................... 397 9.6 Monte Carlo Tree Search.................................................................................... 398 9.7 Case Studies ...................................................................................................... 399 9.7.1 AlphaGo:
Championship Level Play at Go.......................................... 399 9.7.1.1 Alpha Zero: Enhancements to Zero Human Knowledge . . 402 9.7.2 Self-Learning Robots.............................................................................. 404 9.7.2.1 Deep Learning of Locomotion Skills.................................... 404 9.7.2.2 Deep Learning of Visuomotor Skills..................................... 406 9.7.3 Building Conversational Systems: Deep Learning for Chatbots . . . 407 9.7.4 Self-Driving Cars.................................................................................... 410 9.7.5 Inferring Neural Architectures with Reinforcement Learning............ 412 9.8 Practical Challenges Associated with Safety................................................... 413 9.9 Summary........................................................................................................... 414 9.10 Bibliographic Notes............................................................................................. 414 9.10.1 Software Resources and Testbeds......................................................... 416 9.11 Exercises ........................................................................................................... 416 10 Advanced Topics in Deep Learning 10.1 Introduction......................................................................................................... 10.2 Attention Mechanisms....................................................................................... 10.2.1 Recurrent Models of Visual
Attention................................................ 10.2.1.1 Application to Image Captioning.......................................... 10.2.2 Attention Mechanisms for Machine Translation................................. 10.3 Neural Networks with External Memory........................................................ 10.3.1 A Fantasy Video Game: Sorting by Example .................................... 10.3.1.1 Implementing Swaps with Memory Operations.................. 10.3.2 Neural Turing Machines........................................................................ 10.3.3 Differentiable Neural Computer: A Brief Overview........................... 10.4 Generative Adversarial Networks (GANs)...................................................... 10.4.1 Training a Generative Adversarial Network....................................... 10.4.2 Comparison with Variational Autoencoder.......................................... 10.4.3 Using GANs for Generating Image Data............................................. 10.4.4 Conditional Generative Adversarial Networks.................................... 10.5 Competitive Learning ....................................................................................... 10.5.1 Vector Quantization.............................................................................. 10.5.2 Kohonen Self-Organizing Map ............................................................ 419 419 421 422 424 425 429 430 431 432 437 438 439 442 442 444 449 450 450
CONTENTS 10.6 Limitations of Neural Networks............................................... 10.6.1 An Aspirational Goal: One-Shot Learning................ 10.6.2 An Aspirational Goal: Energy-Efficient Learning...................... 10.7 Summary.................................................................................... 10.8 Bibliographic Notes.............................................................. 10.8.1 Software Resources.................................................. 10.9 Exercises .............................................................................. XXI 4gg 453 455 4gg 4gg ^gg Bibliography 459 Index 493
|
any_adam_object | 1 |
author | Aggarwal, Charu C. 1970- |
author_GND | (DE-588)133500101 |
author_facet | Aggarwal, Charu C. 1970- |
author_role | aut |
author_sort | Aggarwal, Charu C. 1970- |
author_variant | c c a cc cca |
building | Verbundindex |
bvnumber | BV045147819 |
classification_rvk | ST 301 |
ctrlnum | (OCoLC)1055864028 (DE-599)DNB1159914656 |
discipline | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>02390nam a2200505 c 4500</leader><controlfield tag="001">BV045147819</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20210806 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">180827s2018 sz a||| |||| 00||| eng d</controlfield><datafield tag="016" ind1="7" ind2=" "><subfield code="a">1159914656</subfield><subfield code="2">DE-101</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9783030068561</subfield><subfield code="c">pbk</subfield><subfield code="9">978-3-030-06856-1</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9783319944623</subfield><subfield code="c">Festeinband : circa EUR 64.19 (DE) (freier Preis), circa EUR 65.99 (AT) (freier Preis), circa CHF 66.00 (freier Preis)</subfield><subfield code="9">978-3-319-94462-3</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">3319944622</subfield><subfield code="9">3-319-94462-2</subfield></datafield><datafield tag="024" ind1="3" ind2=" "><subfield code="a">9783319944623</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1055864028</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)DNB1159914656</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="044" ind1=" " ind2=" "><subfield code="a">sz</subfield><subfield code="c">XA-CH</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-29T</subfield><subfield code="a">DE-945</subfield><subfield code="a">DE-M347</subfield><subfield code="a">DE-1028</subfield><subfield 
code="a">DE-Aug4</subfield><subfield code="a">DE-11</subfield><subfield code="a">DE-83</subfield><subfield code="a">DE-188</subfield><subfield code="a">DE-19</subfield><subfield code="a">DE-355</subfield><subfield code="a">DE-1043</subfield><subfield code="a">DE-739</subfield><subfield code="a">DE-634</subfield><subfield code="a">DE-20</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 301</subfield><subfield code="0">(DE-625)143651:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Aggarwal, Charu C.</subfield><subfield code="d">1970-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)133500101</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Neural networks and deep learning</subfield><subfield code="b">a textbook</subfield><subfield code="c">Charu C. Aggarwal</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Cham, Switzerland</subfield><subfield code="b">Springer</subfield><subfield code="c">[2018]</subfield></datafield><datafield tag="264" ind1=" " ind2="4"><subfield code="c">© 2018</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XXIII, 497 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield><subfield code="c">23.5 cm x 15.5 cm</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Neuronales Netz</subfield><subfield code="0">(DE-588)4226127-2</subfield><subfield code="2">gnd</subfield><subfield 
code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Deep learning</subfield><subfield code="0">(DE-588)1135597375</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4123623-3</subfield><subfield code="a">Lehrbuch</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Neuronales Netz</subfield><subfield code="0">(DE-588)4226127-2</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2="1"><subfield code="a">Deep learning</subfield><subfield code="0">(DE-588)1135597375</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="689" ind1="1" ind2="0"><subfield code="a">Maschinelles Lernen</subfield><subfield code="0">(DE-588)4193754-5</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="1" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="710" ind1="2" ind2=" "><subfield code="a">Springer International Publishing</subfield><subfield code="0">(DE-588)1064344704</subfield><subfield code="4">pbl</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="z">978-3-319-94463-0</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">X:MVB</subfield><subfield code="q">text/html</subfield><subfield code="u">http://deposit.dnb.de/cgi-bin/dokserv?id=d03ee684ae884db698677fd189f4cd49&prov=M&dok_var=1&dok_ext=htm</subfield><subfield 
code="3">Inhaltstext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">X:MVB</subfield><subfield code="u">https://www.springer.com/de/book/9783319944623</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030537527&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-030537527</subfield></datafield></record></collection> |
genre | (DE-588)4123623-3 Lehrbuch gnd-content |
genre_facet | Lehrbuch |
id | DE-604.BV045147819 |
illustrated | Illustrated |
indexdate | 2024-07-10T08:10:00Z |
institution | BVB |
institution_GND | (DE-588)1064344704 |
isbn | 9783030068561 9783319944623 3319944622 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-030537527 |
oclc_num | 1055864028 |
open_access_boolean | |
owner | DE-29T DE-945 DE-M347 DE-1028 DE-Aug4 DE-11 DE-83 DE-188 DE-19 DE-BY-UBM DE-355 DE-BY-UBR DE-1043 DE-739 DE-634 DE-20 |
owner_facet | DE-29T DE-945 DE-M347 DE-1028 DE-Aug4 DE-11 DE-83 DE-188 DE-19 DE-BY-UBM DE-355 DE-BY-UBR DE-1043 DE-739 DE-634 DE-20 |
physical | XXIII, 497 Seiten Illustrationen, Diagramme 23.5 cm x 15.5 cm |
publishDate | 2018 |
publishDateSearch | 2018 |
publishDateSort | 2018 |
publisher | Springer |
record_format | marc |
spelling | Aggarwal, Charu C. 1970- Verfasser (DE-588)133500101 aut Neural networks and deep learning a textbook Charu C. Aggarwal Cham, Switzerland Springer [2018] © 2018 XXIII, 497 Seiten Illustrationen, Diagramme 23.5 cm x 15.5 cm txt rdacontent n rdamedia nc rdacarrier Neuronales Netz (DE-588)4226127-2 gnd rswk-swf Maschinelles Lernen (DE-588)4193754-5 gnd rswk-swf Deep learning (DE-588)1135597375 gnd rswk-swf (DE-588)4123623-3 Lehrbuch gnd-content Neuronales Netz (DE-588)4226127-2 s Deep learning (DE-588)1135597375 s DE-604 Maschinelles Lernen (DE-588)4193754-5 s Springer International Publishing (DE-588)1064344704 pbl Erscheint auch als Online-Ausgabe 978-3-319-94463-0 X:MVB text/html http://deposit.dnb.de/cgi-bin/dokserv?id=d03ee684ae884db698677fd189f4cd49&prov=M&dok_var=1&dok_ext=htm Inhaltstext X:MVB https://www.springer.com/de/book/9783319944623 Digitalisierung UB Passau - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030537527&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Aggarwal, Charu C. 1970- Neural networks and deep learning a textbook Neuronales Netz (DE-588)4226127-2 gnd Maschinelles Lernen (DE-588)4193754-5 gnd Deep learning (DE-588)1135597375 gnd |
subject_GND | (DE-588)4226127-2 (DE-588)4193754-5 (DE-588)1135597375 (DE-588)4123623-3 |
title | Neural networks and deep learning a textbook |
title_auth | Neural networks and deep learning a textbook |
title_exact_search | Neural networks and deep learning a textbook |
title_full | Neural networks and deep learning a textbook Charu C. Aggarwal |
title_fullStr | Neural networks and deep learning a textbook Charu C. Aggarwal |
title_full_unstemmed | Neural networks and deep learning a textbook Charu C. Aggarwal |
title_short | Neural networks and deep learning |
title_sort | neural networks and deep learning a textbook |
title_sub | a textbook |
topic | Neuronales Netz (DE-588)4226127-2 gnd Maschinelles Lernen (DE-588)4193754-5 gnd Deep learning (DE-588)1135597375 gnd |
topic_facet | Neuronales Netz Maschinelles Lernen Deep learning Lehrbuch |
url | http://deposit.dnb.de/cgi-bin/dokserv?id=d03ee684ae884db698677fd189f4cd49&prov=M&dok_var=1&dok_ext=htm https://www.springer.com/de/book/9783319944623 http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=030537527&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT aggarwalcharuc neuralnetworksanddeeplearningatextbook AT springerinternationalpublishing neuralnetworksanddeeplearningatextbook |