Dynamic programming and optimal control: Volume 2 Approximate dynamic programming
Main author: | Bertsekas, Dimitri P. |
---|---|
Format: | Book |
Language: | English |
Published: | Belmont, Mass. Athena Scientific [2012] |
Edition: | Fourth edition |
Online access: | Table of contents |
Description: | xvii, 694 pages, diagrams |
ISBN: | 1886529442 9781886529441 |
Internal format
MARC
LEADER | 00000nam a2200000 cc4500 | ||
---|---|---|---|
001 | BV040463410 | ||
003 | DE-604 | ||
005 | 20171220 | ||
007 | t | ||
008 | 121010s2012 |||| |||| 00||| eng d | ||
020 | |a 1886529442 |9 1-886529-44-2 | ||
020 | |a 9781886529441 |9 978-1-886529-44-1 | ||
035 | |a (OCoLC)815947736 | ||
035 | |a (DE-599)BVBBV040463410 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-706 |a DE-384 |a DE-739 |a DE-29T |a DE-573 |a DE-188 |a DE-83 |a DE-523 |a DE-634 |a DE-91G |a DE-91 | ||
084 | |a SK 880 |0 (DE-625)143266: |2 rvk | ||
084 | |a SM 613 |0 (DE-625)143297: |2 rvk | ||
084 | |a MAT 917f |2 stub | ||
100 | 1 | |a Bertsekas, Dimitri P. |d 1942- |e Verfasser |0 (DE-588)171165519 |4 aut | |
245 | 1 | 0 | |a Dynamic programming and optimal control |n Volume 2 |p Approximate dynamic programming |c Dimitri P. Bertsekas |
250 | |a Fourth edition | ||
264 | 1 | |a Belmont, Mass. |b Athena Scientific |c [2012] | |
300 | |a xvii, 694 Seiten |b Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
773 | 0 | 8 | |w (DE-604)BV011951112 |g 2 |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=025310808&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-025310808 |
Record in the search index
_version_ | 1804149529558122496 |
---|---|
adam_text | Contents
1. Discounted Problems - Theory
1.1. Minimization of Total Cost - Introduction ... p. 3
1.1.1. The Finite-Horizon DP Algorithm ... p. 5
1.1.2. Shorthand Notation and Monotonicity ... p. 6
1.1.3. A Preview of Infinite Horizon Results ... p. 10
1.1.4. Randomized and History-Dependent Policies ... p. 11
1.2. Discounted Problems - Bounded Cost per Stage ... p. 14
1.3. Scheduling and Multiarmed Bandit Problems ... p. 22
1.4. Discounted Continuous-Time Problems ... p. 32
1.5. The Role of Contraction Mappings ... p. 45
1.5.1. Sup-Norm Contractions ... p. 47
1.5.2. Discounted Problems - Unbounded Cost per Stage ... p. 54
1.6. General Forms of Discounted Dynamic Programming ... p. 57
1.6.1. Basic Results Under Contraction and Monotonicity ... p. 63
1.6.2. Discounted Dynamic Games ... p. 69
1.7. Notes, Sources, and Exercises ... p. 71
2. Discounted Problems - Computational Methods
2.1. Markovian Decision Problems ... p. 82
2.2. Value Iteration ... p. 84
2.2.1. Monotonic Error Bounds for Value Iteration ... p. 85
2.2.2. Variants of Value Iteration ... p. 92
2.2.3. Q-Learning ... p. 95
2.3. Policy Iteration ... p. 97
2.3.1. Policy Iteration for Costs ... p. 97
2.3.2. Policy Iteration for Q-Factors ... p. 102
2.3.3. Optimistic Policy Iteration ... p. 103
2.3.4. Limited Lookahead Policies and Rollout ... p. 106
2.4. Linear Programming Methods ... p. 112
2.5. Methods for General Discounted Problems ... p. 115
2.5.1. Limited Lookahead Policies and Approximations ... p. 117
2.5.2. Generalized Value Iteration ... p. 119
2.5.3. Approximate Value Iteration ... p. 120
2.5.4. Generalized Policy Iteration ... p. 123
2.5.5. Generalized Optimistic Policy Iteration ... p. 126
2.5.6. Approximate Policy Iteration ... p. 132
2.5.7. Mathematical Programming ... p. 137
2.6. Asynchronous Algorithms ... p. 138
2.6.1. Asynchronous Value Iteration ... p. 138
2.6.2. Asynchronous Policy Iteration ... p. 144
2.6.3. Policy Iteration with a Uniform Fixed Point ... p. 149
2.7. Notes, Sources, and Exercises ... p. 156
3. Stochastic Shortest Path Problems
3.1. Problem Formulation ... p. 172
3.2. Main Results ... p. 175
3.3. Underlying Contraction Properties ... p. 182
3.4. Value Iteration ... p. 184
3.4.1. Conditions for Finite Termination ... p. 185
3.4.2. Asynchronous Value Iteration ... p. 188
3.5. Policy Iteration ... p. 189
3.5.1. Optimistic Policy Iteration ... p. 190
3.5.2. Approximate Policy Iteration ... p. 191
3.5.3. Policy Iteration with Improper Policies ... p. 193
3.5.4. Policy Iteration with a Uniform Fixed Point ... p. 197
3.6. Countable-State Problems ... p. 201
3.7. Notes, Sources, and Exercises ... p. 204
4. Undiscounted Problems
4.1. Unbounded Costs per Stage ... p. 214
4.1.1. Main Results ... p. 216
4.1.2. Value Iteration ... p. 224
4.1.3. Other Computational Methods ... p. 230
4.2. Linear Systems and Quadratic Cost ... p. 231
4.3. Inventory Control ... p. 233
4.4. Optimal Stopping ... p. 235
4.5. Optimal Gambling Strategies ... p. 241
4.6. Continuous-Time Problems - Control of Queues ... p. 248
4.7. Nonstationary and Periodic Problems ... p. 256
4.8. Notes, Sources, and Exercises ... p. 261
5. Average Cost per Stage Problems
5.1. Finite-Spaces Average Cost Models ... p. 274
5.1.1. Relation with the Discounted Cost Problem ... p. 278
5.1.2. Blackwell Optimal Policies ... p. 284
5.1.3. Optimality Equations ... p. 294
5.2. Conditions for Equal Average Cost for all Initial States ... p. 298
5.3. Value Iteration ... p. 304
5.3.1. Single-Chain Value Iteration ... p. 307
5.3.2. Multi-Chain Value Iteration ... p. 322
5.4. Policy Iteration ... p. 329
5.4.1. Single-Chain Policy Iteration ... p. 329
5.4.2. Multi-Chain Policy Iteration ... p. 335
5.5. Linear Programming ... p. 339
5.6. Infinite-Spaces Average Cost Models ... p. 345
5.6.1. A Sufficient Condition for Optimality ... p. 353
5.6.2. Finite State Space and Infinite Control Space ... p. 355
5.6.3. Countable States - Vanishing Discount Approach ... p. 364
5.6.4. Countable States - Contraction Approach ... p. 367
5.6.5. Linear Systems with Quadratic Cost ... p. 372
5.7. Notes, Sources, and Exercises ... p. 374
6. Approximate Dynamic Programming - Discounted Models
6.1. General Issues of Simulation-Based Cost Approximation ... p. 391
6.1.1. Approximation Architectures ... p. 391
6.1.2. Simulation-Based Approximate Policy Iteration ... p. 397
6.1.3. Direct and Indirect Approximation ... p. 403
6.1.4. Monte Carlo Simulation ... p. 405
6.1.5. Simplifications ... p. 413
6.2. Direct Policy Evaluation - Gradient Methods ... p. 418
6.3. Projected Equation Methods for Policy Evaluation ... p. 423
6.3.1. The Projected Bellman Equation ... p. 424
6.3.2. The Matrix Form of the Projected Equation ... p. 428
6.3.3. Simulation-Based Methods ... p. 431
6.3.4. LSTD, LSPE, and TD(0) Methods ... p. 433
6.3.5. Optimistic Versions ... p. 437
6.3.6. Multistep Simulation-Based Methods ... p. 438
6.3.7. A Synopsis ... p. 447
6.4. Policy Iteration Issues ... p. 451
6.4.1. Exploration Enhancement by Geometric Sampling ... p. 453
6.4.2. Exploration Enhancement by Off-Policy Methods ... p. 464
6.4.3. Policy Oscillations - Chattering ... p. 467
6.5. Aggregation Methods ... p. 474
6.5.1. Cost Approximation via the Aggregate Problem ... p. 482
6.5.2. Cost Approximation via the Enlarged Problem ... p. 485
6.5.3. Multistep Aggregation ... p. 490
6.5.4. Asynchronous Distributed Aggregation ... p. 491
6.6. Q-Learning ... p. 493
6.6.1. Q-Learning: A Stochastic VI Algorithm ... p. 494
6.6.2. Q-Learning and Policy Iteration ... p. 496
6.6.3. Q-Factor Approximation and Projected Equations ... p. 499
6.6.4. Q-Learning for Optimal Stopping Problems ... p. 502
6.6.5. Q-Learning and Aggregation ... p. 507
6.6.6. Finite Horizon Q-Learning ... p. 509
6.7. Notes, Sources, and Exercises ... p. 511
7. Approximate Dynamic Programming - Nondiscounted Models and Generalizations
7.1. Stochastic Shortest Path Problems ... p. 532
7.2. Average Cost Problems ... p. 537
7.2.1. Approximate Policy Evaluation ... p. 537
7.2.2. Approximate Policy Iteration ... p. 546
7.2.3. Q-Learning for Average Cost Problems ... p. 548
7.3. General Problems and Monte Carlo Linear Algebra ... p. 552
7.3.1. Projected Equations ... p. 562
7.3.2. Matrix Inversion and Iterative Methods ... p. 569
7.3.3. Multistep Methods ... p. 576
7.3.4. Extension of Q-Learning for Optimal Stopping ... p. 584
7.3.5. Equation Error Methods ... p. 586
7.3.6. Oblique Projections ... p. 591
7.3.7. Generalized Aggregation ... p. 593
7.3.8. Deterministic Methods for Singular Linear Systems ... p. 597
7.3.9. Stochastic Methods for Singular Linear Systems ... p. 608
7.4. Approximation in Policy Space ... p. 620
7.4.1. The Gradient Formula ... p. 622
7.4.2. Computing the Gradient by Simulation ... p. 623
7.4.3. Essential Features for Gradient Evaluation ... p. 625
7.4.4. Approximations in Policy and Value Space ... p. 627
7.5. Notes, Sources, and Exercises ... p. 629
Appendix A: Measure-Theoretic Issues in Dynamic Programming
A.1. A Two-Stage Example ... p. 641
A.2. Resolution of the Measurability Issues ... p. 646
References ... p. 657
Index ... p. 691
|
any_adam_object | 1 |
author | Bertsekas, Dimitri P. 1942- |
author_GND | (DE-588)171165519 |
author_facet | Bertsekas, Dimitri P. 1942- |
author_role | aut |
author_sort | Bertsekas, Dimitri P. 1942- |
author_variant | d p b dp dpb |
building | Verbundindex |
bvnumber | BV040463410 |
classification_rvk | SK 880 SM 613 |
classification_tum | MAT 917f |
ctrlnum | (OCoLC)815947736 (DE-599)BVBBV040463410 |
discipline | Mathematik |
edition | Fourth edition |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01400nam a2200337 cc4500</leader><controlfield tag="001">BV040463410</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20171220 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">121010s2012 |||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">1886529442</subfield><subfield code="9">1-886529-44-2</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781886529441</subfield><subfield code="9">978-1-886529-44-1</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)815947736</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV040463410</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-706</subfield><subfield code="a">DE-384</subfield><subfield code="a">DE-739</subfield><subfield code="a">DE-29T</subfield><subfield code="a">DE-573</subfield><subfield code="a">DE-188</subfield><subfield code="a">DE-83</subfield><subfield code="a">DE-523</subfield><subfield code="a">DE-634</subfield><subfield code="a">DE-91G</subfield><subfield code="a">DE-91</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">SK 880</subfield><subfield code="0">(DE-625)143266:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">SM 613</subfield><subfield code="0">(DE-625)143297:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">MAT 917f</subfield><subfield 
code="2">stub</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Bertsekas, Dimitri P.</subfield><subfield code="d">1942-</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)171165519</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Dynamic programming and optimal control</subfield><subfield code="n">Volume 2</subfield><subfield code="p">Approximate dynamic programming</subfield><subfield code="c">Dimitri P. Bertsekas</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">Fourth edition</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Belmont, Mass.</subfield><subfield code="b">Athena Scientific</subfield><subfield code="c">[2012]</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xvii, 694 Seiten</subfield><subfield code="b">Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="w">(DE-604)BV011951112</subfield><subfield code="g">2</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=025310808&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield 
code="a">oai:aleph.bib-bvb.de:BVB01-025310808</subfield></datafield></record></collection> |
id | DE-604.BV040463410 |
illustrated | Not Illustrated |
indexdate | 2024-07-10T00:24:29Z |
institution | BVB |
isbn | 1886529442 9781886529441 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-025310808 |
oclc_num | 815947736 |
open_access_boolean | |
owner | DE-706 DE-384 DE-739 DE-29T DE-573 DE-188 DE-83 DE-523 DE-634 DE-91G DE-BY-TUM DE-91 DE-BY-TUM |
owner_facet | DE-706 DE-384 DE-739 DE-29T DE-573 DE-188 DE-83 DE-523 DE-634 DE-91G DE-BY-TUM DE-91 DE-BY-TUM |
physical | xvii, 694 Seiten Diagramme |
publishDate | 2012 |
publishDateSearch | 2012 |
publishDateSort | 2012 |
publisher | Athena Scientific |
record_format | marc |
spelling | Bertsekas, Dimitri P. 1942- Verfasser (DE-588)171165519 aut Dynamic programming and optimal control Volume 2 Approximate dynamic programming Dimitri P. Bertsekas Fourth edition Belmont, Mass. Athena Scientific [2012] xvii, 694 Seiten Diagramme txt rdacontent n rdamedia nc rdacarrier (DE-604)BV011951112 2 Digitalisierung UB Passau - ADAM Catalogue Enrichment application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=025310808&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Bertsekas, Dimitri P. 1942- Dynamic programming and optimal control |
title | Dynamic programming and optimal control |
title_auth | Dynamic programming and optimal control |
title_exact_search | Dynamic programming and optimal control |
title_full | Dynamic programming and optimal control Volume 2 Approximate dynamic programming Dimitri P. Bertsekas |
title_fullStr | Dynamic programming and optimal control Volume 2 Approximate dynamic programming Dimitri P. Bertsekas |
title_full_unstemmed | Dynamic programming and optimal control Volume 2 Approximate dynamic programming Dimitri P. Bertsekas |
title_short | Dynamic programming and optimal control |
title_sort | dynamic programming and optimal control approximate dynamic programming |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=025310808&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
volume_link | (DE-604)BV011951112 |
work_keys_str_mv | AT bertsekasdimitrip dynamicprogrammingandoptimalcontrolvolume2 |