Neuro-dynamic programming
Main authors: | Bertsekas, Dimitri P., Tsitsiklis, John N. |
---|---|
Format: | Book |
Language: | English |
Published: |
Belmont, Mass.
Athena Scientific
1996
|
Series: | Athena Scientific optimization and computation series
3 |
Subjects: | |
Online access: | Table of contents |
Description: | Also covers later unchanged reprints |
Description: | XIII, 491 p., illustrations |
ISBN: | 1886529108 |
Internal format
MARC
LEADER | 00000nam a2200000 cb4500 | ||
---|---|---|---|
001 | BV015264189 | ||
003 | DE-604 | ||
005 | 20210715 | ||
007 | t | ||
008 | 021209s1996 d||| |||| 00||| eng d | ||
020 | |a 1886529108 |9 1-886529-10-8 | ||
035 | |a (OCoLC)35983505 | ||
035 | |a (DE-599)BVBBV015264189 | ||
040 | |a DE-604 |b ger |e rakwb | ||
041 | 0 | |a eng | |
049 | |a DE-20 |a DE-703 |a DE-91G |a DE-91 |a DE-384 |a DE-634 |a DE-83 |a DE-11 |a DE-355 |a DE-706 |a DE-739 |a DE-898 | ||
050 | 0 | |a QA76.87 | |
082 | 1 | |a 006.32 |2 22 | |
082 | 0 | |a 519.7/03 |2 21 | |
084 | |a QH 423 |0 (DE-625)141577: |2 rvk | ||
084 | |a QH 700 |0 (DE-625)141608: |2 rvk | ||
084 | |a SK 870 |0 (DE-625)143265: |2 rvk | ||
084 | |a SK 880 |0 (DE-625)143266: |2 rvk | ||
084 | |a ST 301 |0 (DE-625)143651: |2 rvk | ||
084 | |a DAT 717f |2 stub | ||
084 | |a MAT 917f |2 stub | ||
100 | 1 | |a Bertsekas, Dimitri P. |d 1942- |e Verfasser |0 (DE-588)171165519 |4 aut | |
245 | 1 | 0 | |a Neuro-dynamic programming |c Dimitri P. Bertsekas and John N. Tsitsiklis |
264 | 1 | |a Belmont, Mass. |b Athena Scientific |c 1996 | |
300 | |a XIII, 491 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 1 | |a Athena Scientific optimization and computation series |v 3 | |
500 | |a Hier auch später erschienene, unveränderte Nachdrucke | ||
650 | 7 | |a Approximation stochastique |2 ram | |
650 | 7 | |a Programmation dynamique |2 ram | |
650 | 7 | |a Réseaux neuronaux (informatique) |2 ram | |
650 | 4 | |a Apprentissage par renforcement (Intelligence artificielle) | |
650 | 4 | |a Approximation stochastique | |
650 | 7 | |a Dynamische programmering |2 gtt | |
650 | 7 | |a Inteligencia artificial |2 larpcal | |
650 | 7 | |a Intelligence artificielle |2 ram | |
650 | 7 | |a Neurale netwerken |2 gtt | |
650 | 7 | |a Optimaliseren |2 gtt | |
650 | 7 | |a Optimisation mathématique |2 ram | |
650 | 7 | |a Programacao dinamica |2 larpcal | |
650 | 4 | |a Programmation dynamique | |
650 | 4 | |a Réseaux neuronaux (Informatique) | |
650 | 7 | |a optimisation mathématique |2 inriac | |
650 | 7 | |a programmation dynamique |2 inriac | |
650 | 7 | |a réseau neuronal |2 inriac | |
650 | 4 | |a Künstliche Intelligenz | |
650 | 4 | |a Dynamic programming | |
650 | 4 | |a Mathematical optimization | |
650 | 4 | |a Neural networks (Computer science) | |
650 | 0 | 7 | |a Neuronales Netz |0 (DE-588)4226127-2 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Dynamische Optimierung |0 (DE-588)4125677-3 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Dynamische Optimierung |0 (DE-588)4125677-3 |D s |
689 | 0 | 1 | |a Neuronales Netz |0 (DE-588)4226127-2 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Tsitsiklis, John N. |d 1958- |e Verfasser |0 (DE-588)170583996 |4 aut | |
830 | 0 | |a Athena Scientific optimization and computation series |v 3 |w (DE-604)BV015264203 |9 3 | |
856 | 4 | 2 | |m HBZ Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=010098849&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-010098849 |
Record in the search index
_version_ | 1804129688919998464 |
---|---|
adam_text | Contents
1. Introduction p. 1
1.1. Cost-to-go Approximations in Dynamic Programming p. 3
1.2. Approximation Architectures p. 5
1.3. Simulation and Training p. 6
1.4. Neuro-Dynamic Programming p. 8
1.5. Notes and Sources p. 9
2. Dynamic Programming p. 11
2.1. Introduction p. 12
2.1.1. Finite Horizon Problems p. 13
2.1.2. Infinite Horizon Problems p. 14
2.2. Stochastic Shortest Path Problems p. 17
2.2.1. General Theory p. 18
2.2.2. Value Iteration p. 25
2.2.3. Policy Iteration p. 29
2.2.4. Linear Programming p. 36
2.3. Discounted Problems p. 37
2.3.1. Temporal Difference-Based Policy Iteration p. 41
2.4. Problem Formulation and Examples p. 47
2.5. Notes and Sources p. 57
3. Neural Network Architectures and Training p. 59
3.1. Architectures for Approximation p. 60
3.1.1. An Overview of Approximation Architectures p. 61
3.1.2. Features p. 66
3.1.3. Partitioning p. 70
3.1.4. Using Heuristic Policies to Construct Features p. 72
3.2. Neural Network Training p. 76
3.2.1. Optimality Conditions p. 78
3.2.2. Linear Least Squares Methods p. 81
3.2.3. Gradient Methods p. 89
3.2.4. Incremental Gradient Methods for Least Squares p. 108
3.2.5. Convergence Analysis of Incremental Gradient Methods p. 115
3.2.6. Extended Kalman Filtering p. 124
3.2.7. Comparison of Various Methods p. 128
3.3. Notes and Sources p. 129
4. Stochastic Iterative Algorithms p. 131
4.1. The Basic Model p. 134
4.2. Convergence Based on a Smooth Potential Function p. 139
4.2.1. A Convergence Result p. 139
4.2.2. Two-Pass Methods p. 147
4.2.3. Convergence Proofs p. 148
4.3. Convergence under Contraction or Monotonicity Assumptions p. 154
4.3.1. Algorithmic Model p. 154
4.3.2. Weighted Maximum Norm Contractions p. 155
4.3.3. Time-Dependent Maps and Additional Noise Terms p. 157
4.3.4. Convergence under Monotonicity Assumptions p. 158
4.3.5. Boundedness p. 159
4.3.6. Convergence Proofs p. 161
4.4. The ODE Approach p. 171
4.4.1. The Case of Markov Noise p. 173
4.5. Notes and Sources p. 178
5. Simulation Methods for a Lookup Table Representation p. 179
5.1. Some Aspects of Monte Carlo Simulation p. 181
5.2. Policy Evaluation by Monte Carlo Simulation p. 186
5.2.1. Multiple Visits to the Same State p. 187
5.2.2. Q-Factors and Policy Iteration p. 192
5.3. Temporal Difference Methods p. 193
5.3.1. Monte Carlo Simulation Using Temporal Differences p. 193
5.3.2. TD(λ) p. 195
5.3.3. General Temporal Difference Methods p. 201
5.3.4. Discounted Problems p. 204
5.3.5. Convergence of Off-Line Temporal Difference Methods p. 208
5.3.6. Convergence of On-Line Temporal Difference Methods p. 219
5.3.7. Convergence for Discounted Problems p. 222
5.4. Optimistic Policy Iteration p. 224
5.5. Simulation-Based Value Iteration p. 237
5.6. Q-Learning p. 245
5.7. Notes and Sources p. 251
6. Approximate DP with Cost-to-Go Function Approximation p. 255
6.1. Generic Issues - From Parameters to Policies p. 259
6.1.1. Generic Error Bounds p. 262
6.1.2. Multistage Lookahead Variations p. 264
6.1.3. Rollout Policies p. 266
6.1.4. Trading off Control Space Complexity with State Space Complexity p. 268
6.2. Approximate Policy Iteration p. 269
6.2.1. Approximate Policy Iteration Based on Monte Carlo Simulation p. 270
6.2.2. Error Bounds for Approximate Policy Iteration p. 275
6.2.3. Tightness of the Error Bounds and Empirical Behavior p. 282
6.3. Approximate Policy Evaluation Using TD(λ) p. 284
6.3.1. Approximate Policy Evaluation Using TD(1) p. 285
6.3.2. TD(λ) for General λ p. 287
6.3.3. TD(λ) with Linear Architectures - Discounted Problems p. 294
6.3.4. TD(λ) with Linear Architectures - Stochastic Shortest Path Problems p. 308
6.4. Optimistic Policy Iteration p. 312
6.4.1. Analysis of Optimistic Policy Iteration p. 318
6.4.2. Oscillation of Policies in Optimistic Policy Iteration p. 320
6.5. Approximate Value Iteration p. 329
6.5.1. Sequential Backward Approximation for Finite Horizon Problems p. 329
6.5.2. Sequential Approximation in State Space p. 331
6.5.3. Sequential Backward Approximation for Infinite Horizon Problems p. 331
6.5.4. Incremental Value Iteration p. 335
6.6. Q-Learning and Advantage Updating p. 337
6.6.1. Q-Learning and Policy Iteration p. 338
6.6.2. Advantage Updating p. 339
6.7. Value Iteration with State Aggregation p. 341
6.7.1. A Method Based on Value Iteration p. 342
6.7.2. Relation to an Auxiliary Problem p. 343
6.7.3. Convergence Results p. 344
6.7.4. Error Bounds p. 349
6.7.5. Comparison with TD(0) p. 351
6.7.6. Discussion of Sampling Mechanisms p. 352
6.7.7. The Model-Free Case p. 352
6.8. Euclidean Contractions and Optimal Stopping p. 353
6.8.1. Assumptions and Main Convergence Result p. 353
6.8.2. Error Bounds p. 357
6.8.3. Applicability of the Result p. 358
6.8.4. Q-Learning for Optimal Stopping Problems p. 358
6.9. Value Iteration with Representative States p. 362
6.10. Bellman Error Methods p. 364
6.10.1. The Case of a Single Policy p. 366
6.10.2. Approximation of the Q Factors p. 367
6.10.3. Another Variant p. 368
6.10.4. Discussion and Related Methods p. 369
6.11. Continuous States and the Slope of the Cost-to-Go p. 370
6.12. Approximate Linear Programming p. 375
6.13. Overview p. 377
6.14. Notes and Sources p. 379
7. Extensions p. 385
7.1. Average Cost per Stage Problems p. 386
7.1.1. The Associated Stochastic Shortest Path Problem p. 387
7.1.2. Value Iteration Methods p. 391
7.1.3. Policy Iteration p. 397
7.1.4. Linear Programming p. 398
7.1.5. Simulation-Based Value Iteration and Q-Learning p. 399
7.1.6. Simulation-Based Policy Iteration p. 405
7.1.7. Minimization of the Bellman Equation Error p. 408
7.2. Dynamic Games p. 408
7.2.1. Discounted Games p. 410
7.2.2. Stochastic Shortest Path Games p. 412
7.2.3. Sequential Games, Policy Iteration, and Q-Learning p. 412
7.2.4. Function Approximation Methods p. 416
7.3. Parallel Computation Issues p. 418
7.4. Notes and Sources p. 419
8. Case Studies p. 421
8.1. Parking p. 422
8.2. Football p. 426
8.3. Tetris p. 435
8.4. Combinatorial Optimization - Maintenance and Repair p. 440
8.5. Dynamic Channel Allocation p. 448
8.6. Backgammon p. 452
8.7. Notes and Sources p. 456
Appendix A: Mathematical Review p. 457
A.1. Sets p. 458
A.2. Euclidean Space p. 459
A.3. Matrices p. 460
A.4. Analysis p. 462
A.5. Convex Sets and Functions p. 465
Appendix B: On Probability Theory and Markov Chains p. 467
B.1. Probability Spaces p. 468
B.2. Random Variables p. 469
B.3. Conditional Probability p. 470
B.4. Stationary Markov Chains p. 471
B.5. Classification of States p. 472
B.6. Limiting Probabilities p. 472
B.7. First Passage Times p. 473
References p. 475
Index p. 487
|
any_adam_object | 1 |
author | Bertsekas, Dimitri P. 1942- Tsitsiklis, John N. 1958- |
author_GND | (DE-588)171165519 (DE-588)170583996 |
author_facet | Bertsekas, Dimitri P. 1942- Tsitsiklis, John N. 1958- |
author_role | aut aut |
author_sort | Bertsekas, Dimitri P. 1942- |
author_variant | d p b dp dpb j n t jn jnt |
building | Verbundindex |
bvnumber | BV015264189 |
callnumber-first | Q - Science |
callnumber-label | QA76 |
callnumber-raw | QA76.87 |
callnumber-search | QA76.87 |
callnumber-sort | QA 276.87 |
callnumber-subject | QA - Mathematics |
classification_rvk | QH 423 QH 700 SK 870 SK 880 ST 301 |
classification_tum | DAT 717f MAT 917f |
ctrlnum | (OCoLC)35983505 (DE-599)BVBBV015264189 |
dewey-full | 006.32 519.7/03 |
dewey-hundreds | 000 - Computer science, information, general works 500 - Natural sciences and mathematics |
dewey-ones | 006 - Special computer methods 519 - Probabilities and applied mathematics |
dewey-raw | 006.32 519.7/03 |
dewey-search | 006.32 519.7/03 |
dewey-sort | 16.32 |
dewey-tens | 000 - Computer science, information, general works 510 - Mathematics |
discipline | Computer science, Mathematics, Economics |
format | Book |
id | DE-604.BV015264189 |
illustrated | Illustrated |
indexdate | 2024-07-09T19:09:07Z |
institution | BVB |
isbn | 1886529108 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-010098849 |
oclc_num | 35983505 |
open_access_boolean | |
owner | DE-20 DE-703 DE-91G DE-BY-TUM DE-91 DE-BY-TUM DE-384 DE-634 DE-83 DE-11 DE-355 DE-BY-UBR DE-706 DE-739 DE-898 DE-BY-UBR |
owner_facet | DE-20 DE-703 DE-91G DE-BY-TUM DE-91 DE-BY-TUM DE-384 DE-634 DE-83 DE-11 DE-355 DE-BY-UBR DE-706 DE-739 DE-898 DE-BY-UBR |
physical | XIII, 491 S. graph. Darst. |
publishDate | 1996 |
publishDateSearch | 1996 |
publishDateSort | 1996 |
publisher | Athena Scientific |
record_format | marc |
series | Athena Scientific optimization and computation series |
series2 | Athena Scientific optimization and computation series |
spelling | Bertsekas, Dimitri P. 1942- Verfasser (DE-588)171165519 aut Neuro-dynamic programming Dimitri P. Bertsekas and John N. Tsitsiklis Belmont, Mass. Athena Scientific 1996 XIII, 491 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Athena Scientific optimization and computation series 3 Hier auch später erschienene, unveränderte Nachdrucke Approximation stochastique ram Programmation dynamique ram Réseaux neuronaux (informatique) ram Apprentissage par renforcement (Intelligence artificielle) Approximation stochastique Dynamische programmering gtt Inteligencia artificial larpcal Intelligence artificielle ram Neurale netwerken gtt Optimaliseren gtt Optimisation mathématique ram Programacao dinamica larpcal Programmation dynamique Réseaux neuronaux (Informatique) optimisation mathématique inriac programmation dynamique inriac réseau neuronal inriac Künstliche Intelligenz Dynamic programming Mathematical optimization Neural networks (Computer science) Neuronales Netz (DE-588)4226127-2 gnd rswk-swf Dynamische Optimierung (DE-588)4125677-3 gnd rswk-swf Dynamische Optimierung (DE-588)4125677-3 s Neuronales Netz (DE-588)4226127-2 s DE-604 Tsitsiklis, John N. 1958- Verfasser (DE-588)170583996 aut Athena Scientific optimization and computation series 3 (DE-604)BV015264203 3 HBZ Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=010098849&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Bertsekas, Dimitri P. 1942- Tsitsiklis, John N. 1958- Neuro-dynamic programming Athena Scientific optimization and computation series Approximation stochastique ram Programmation dynamique ram Réseaux neuronaux (informatique) ram Apprentissage par renforcement (Intelligence artificielle) Approximation stochastique Dynamische programmering gtt Inteligencia artificial larpcal Intelligence artificielle ram Neurale netwerken gtt Optimaliseren gtt Optimisation mathématique ram Programacao dinamica larpcal Programmation dynamique Réseaux neuronaux (Informatique) optimisation mathématique inriac programmation dynamique inriac réseau neuronal inriac Künstliche Intelligenz Dynamic programming Mathematical optimization Neural networks (Computer science) Neuronales Netz (DE-588)4226127-2 gnd Dynamische Optimierung (DE-588)4125677-3 gnd |
subject_GND | (DE-588)4226127-2 (DE-588)4125677-3 |
title | Neuro-dynamic programming |
title_auth | Neuro-dynamic programming |
title_exact_search | Neuro-dynamic programming |
title_full | Neuro-dynamic programming Dimitri P. Bertsekas and John N. Tsitsiklis |
title_fullStr | Neuro-dynamic programming Dimitri P. Bertsekas and John N. Tsitsiklis |
title_full_unstemmed | Neuro-dynamic programming Dimitri P. Bertsekas and John N. Tsitsiklis |
title_short | Neuro-dynamic programming |
title_sort | neuro dynamic programming |
topic | Approximation stochastique ram Programmation dynamique ram Réseaux neuronaux (informatique) ram Apprentissage par renforcement (Intelligence artificielle) Approximation stochastique Dynamische programmering gtt Inteligencia artificial larpcal Intelligence artificielle ram Neurale netwerken gtt Optimaliseren gtt Optimisation mathématique ram Programacao dinamica larpcal Programmation dynamique Réseaux neuronaux (Informatique) optimisation mathématique inriac programmation dynamique inriac réseau neuronal inriac Künstliche Intelligenz Dynamic programming Mathematical optimization Neural networks (Computer science) Neuronales Netz (DE-588)4226127-2 gnd Dynamische Optimierung (DE-588)4125677-3 gnd |
topic_facet | Approximation stochastique Programmation dynamique Réseaux neuronaux (informatique) Apprentissage par renforcement (Intelligence artificielle) Dynamische programmering Inteligencia artificial Intelligence artificielle Neurale netwerken Optimaliseren Optimisation mathématique Programacao dinamica Réseaux neuronaux (Informatique) optimisation mathématique programmation dynamique réseau neuronal Künstliche Intelligenz Dynamic programming Mathematical optimization Neural networks (Computer science) Neuronales Netz Dynamische Optimierung |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=010098849&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
volume_link | (DE-604)BV015264203 |
work_keys_str_mv | AT bertsekasdimitrip neurodynamicprogramming AT tsitsiklisjohnn neurodynamicprogramming |