Examples in Markov decision processes:
Main author: | |
---|---|
Format: | Book |
Language: | English |
Published: | London, Imperial College Press, 2013 |
Series: | Imperial College Press optimization series 2 |
Subjects: | |
Online access: | Table of contents |
Description: | XIII, 293 pp., graphs |
ISBN: | 9781848167933 |
Internal format
MARC
LEADER | 00000nam a22000002cb4500 | ||
---|---|---|---|
001 | BV040487851 | ||
003 | DE-604 | ||
005 | 20121211 | ||
007 | t | ||
008 | 121017s2013 d||| |||| 00||| eng d | ||
020 | |a 9781848167933 |9 978-1-84816-793-3 | ||
035 | |a (OCoLC)816253967 | ||
035 | |a (DE-599)HBZHT017149614 | ||
040 | |a DE-604 |b ger |e rakwb | ||
041 | 0 | |a eng | |
049 | |a DE-824 |a DE-19 | ||
084 | |a SK 820 |0 (DE-625)143258: |2 rvk | ||
100 | 1 | |a Piunovskij, Alexei B. |e Verfasser |0 (DE-588)1029167125 |4 aut | |
245 | 1 | 0 | |a Examples in Markov decision processes |c A. B. Piunovskiy |
264 | 1 | |a London |b Imperial College Press |c 2013 | |
300 | |a XIII, 293 S. |b graph. Darst. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 1 | |a Imperial College Press optimization series |v 2 | |
650 | 0 | 7 | |a Markov-Entscheidungsprozess |0 (DE-588)4168927-6 |2 gnd |9 rswk-swf |
655 | 7 | |0 (DE-588)4144384-6 |a Beispielsammlung |2 gnd-content | |
689 | 0 | 0 | |a Markov-Entscheidungsprozess |0 (DE-588)4168927-6 |D s |
689 | 0 | |5 DE-604 | |
830 | 0 | |a Imperial College Press optimization series |v 2 |w (DE-604)BV035878988 |9 2 | |
856 | 4 | 2 | |m HBZ Datenaustausch |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=025334923&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-025334923 |
Record in the search index
_version_ | 1804149550102872064 |
---|---|
adam_text | Title: Examples in Markov decision processes
Author: Piunovskij, Alexei B
Year: 2013
Contents
Preface v
1. Finite-Horizon Models 1
1.1 Preliminaries......................... 1
1.2 Model Description...................... 3
1.3 Dynamic Programming Approach.............. 5
1.4 Examples........................... 8
1.4.1 Non-transitivity of the correlation......... 8
1.4.2 The more frequently used control is not better . . 9
1.4.3 Voting ........................ 11
1.4.4 The secretary problem............... 13
1.4.5 Constrained optimization.............. 14
1.4.6 Equivalent Markov selectors in non-atomic MDPs . 17
1.4.7 Strongly equivalent Markov selectors in non-
atomic MDPs.................... 20
1.4.8 Stock exchange ................... 25
1.4.9 Markov or non-Markov strategy? Randomized or
not? When is the Bellman principle violated? . . 27
1.4.10 Uniformly optimal, but not optimal strategy ... 31
1.4.11 Martingales and the Bellman principle ...... 32
1.4.12 Conventions on expectation and infinities..... 34
1.4.13 Nowhere-differentiable function v_t(x);
discontinuous function v_t(x)............ 38
1.4.14 The non-measurable Bellman function....... 43
1.4.15 No one strategy is uniformly ε-optimal...... 44
1.4.16 Semi-continuous model............... 46
2. Homogeneous Infinite-Horizon Models: Expected Total Loss 51
2.1 Homogeneous Non-discounted Model............ 51
2.2 Examples........................... 54
2.2.1 Mixed Strategies................... 54
2.2.2 Multiple solutions to the optimality equation ... 56
2.2.3 Finite model: multiple solutions to the optimality
equation; conserving but not equalizing strategy . 58
2.2.4 The single conserving strategy is not equalizing
and not optimal................... 58
2.2.5 When strategy iteration is not successful..... 61
2.2.6 When value iteration is not successful....... 63
2.2.7 When value iteration is not successful: positive
model I........................ 67
2.2.8 When value iteration is not successful: positive
model II....................... 69
2.2.9 Value iteration and stability in optimal stopping
problems....................... 71
2.2.10 A non-equalizing strategy is uniformly optimal . . 73
2.2.11 A stationary uniformly ε-optimal selector does not
exist (positive model)................ 75
2.2.12 A stationary uniformly ε-optimal selector does not
exist (negative model)................ 77
2.2.13 Finite-action negative model where a stationary
uniformly ε-optimal selector does not exist .... 80
2.2.14 Nearly uniformly optimal selectors in negative
models........................ 83
2.2.15 Semi-continuous models and the blackmailer's
dilemma....................... 85
2.2.16 Not a semi-continuous model............ 88
2.2.17 The Bellman function is non-measurable and no
one strategy is uniformly ε-optimal........ 91
2.2.18 A randomized strategy is better than any selector
(finite action space)................. 92
2.2.19 The fluid approximation does not work...... 94
2.2.20 The fluid approximation: refined model...... 97
2.2.21 Occupation measures: phantom solutions..... 101
2.2.22 Occupation measures in transient models..... 104
2.2.23 Occupation measures and duality......... 107
2.2.24 Occupation measures: compactness........ 109
2.2.25 The bold strategy in gambling is not optimal
(house limit)..................... 112
2.2.26 The bold strategy in gambling is not optimal
(inflation) ...................... 115
2.2.27 Search strategy for a moving target........ 119
2.2.28 The three-way duel ("Truel")............ 122
3. Homogeneous Infinite-Horizon Models: Discounted Loss 127
3.1 Preliminaries......................... 127
3.2 Examples........................... 128
3.2.1 Phantom solutions of the optimality equation . . 128
3.2.2 When value iteration is not successful: positive
model......................... 130
3.2.3 A non-optimal strategy π̂ for which v^π̂ solves the
optimality equation................. 132
3.2.4 The single conserving strategy is not equalizing
and not optimal................... 134
3.2.5 Value iteration and convergence of strategies ... 135
3.2.6 Value iteration in countable models........ 137
3.2.7 The Bellman function is non-measurable and no
one strategy is uniformly ε-optimal........ 140
3.2.8 No one selector is uniformly ε-optimal....... 141
3.2.9 Myopic strategies.................. 141
3.2.10 Stable and unstable controllers for linear systems 143
3.2.11 Incorrect optimal actions in the model with partial
information...................... 146
3.2.12 Occupation measures and stationary strategies . . 149
3.2.13 Constrained optimization and the Bellman
principle....................... 152
3.2.14 Constrained optimization and Lagrange
multipliers...................... 153
3.2.15 Constrained optimization: multiple solutions ... 157
3.2.16 Weighted discounted loss and (N, ∞)-stationary
selectors ....................... 158
3.2.17 Non-constant discounting .............. 160
3.2.18 The nearly optimal strategy is not Blackwell
optimal........................ 163
3.2.19 Blackwell optimal strategies and opportunity loss 164
3.2.20 Blackwell optimal and n-discount optimal
strategies....................... 165
3.2.21 No Blackwell (Maitra) optimal strategies..... 168
3.2.22 Optimal strategies as β → 1− and MDPs with the
average loss - I ................... 171
3.2.23 Optimal strategies as β → 1− and MDPs with the
average loss - II................... 172
4. Homogeneous Infinite-Horizon Models: Average Loss and
Other Criteria 177
4.1 Preliminaries......................... 177
4.2 Examples........................... 179
4.2.1 Why lim sup? .................... 179
4.2.2 AC-optimal non-canonical strategies........ 181
4.2.3 Canonical triplets and canonical equations .... 183
4.2.4 Multiple solutions to the canonical equations in
finite models..................... 186
4.2.5 No AC-optimal strategies.............. 187
4.2.6 Canonical equations have no solutions: the finite
action space..................... 188
4.2.7 No AC-ε-optimal stationary strategies in a finite
state model...................... 191
4.2.8 No AC-optimal strategies in a finite-state semi-
continuous model.................. 192
4.2.9 Semi-continuous models and the sufficiency of
stationary selectors................. 194
4.2.10 No AC-optimal stationary strategies in a unichain
model with a finite action space.......... 195
4.2.11 No AC-ε-optimal stationary strategies in a finite
action model..................... 198
4.2.12 No AC-ε-optimal Markov strategies........ 199
4.2.13 Singular perturbation of an MDP......... 201
4.2.14 Blackwell optimal strategies and AC-optimality . 203
4.2.15 Strategy iteration in a unichain model....... 204
4.2.16 Unichain strategy iteration in a finite
communicating model................ 207
4.2.17 Strategy iteration in semi-continuous models . . . 208
4.2.18 When value iteration is not successful....... 211
4.2.19 The finite-horizon approximation does not work . 213
4.2.20 The linear programming approach to finite models 215
4.2.21 Linear programming for infinite models...... 219
4.2.22 Linear programs and expected frequencies in finite
models........................ 223
4.2.23 Constrained optimization.............. 225
4.2.24 AC-optimal, bias optimal, overtaking optimal and
opportunity-cost optimal strategies: periodic
model......................... 229
4.2.25 AC-optimal and average-overtaking optimal
strategies....................... 232
4.2.26 Blackwell optimal, bias optimal, average-
overtaking optimal and AC-optimal strategies . . 235
4.2.27 Nearly optimal and average-overtaking optimal
strategies....................... 238
4.2.28 Strong-overtaking/average optimal, overtaking
optimal, AC-optimal strategies and minimal
opportunity loss................... 239
4.2.29 Strong-overtaking optimal and strong*-overtaking
optimal strategies.................. 242
4.2.30 Parrondo's paradox................. 247
4.2.31 An optimal service strategy in a queueing system 249
Afterword 253
Appendix A Borel Spaces and Other Theoretical Issues 257
A.1 Main Concepts........................ 257
A.2 Probability Measures on Borel Spaces........... 260
A.3 Semi-continuous Functions and Measurable Selection . . . 263
A.4 Abelian (Tauberian) Theorem................ 265
Appendix B Proofs of Auxiliary Statements 267
Notation 281
List of the Main Statements 283
Bibliography 285
Index 291
|
any_adam_object | 1 |
author | Piunovskij, Alexei B. |
author_GND | (DE-588)1029167125 |
author_facet | Piunovskij, Alexei B. |
author_role | aut |
author_sort | Piunovskij, Alexei B. |
author_variant | a b p ab abp |
building | Verbundindex |
bvnumber | BV040487851 |
classification_rvk | SK 820 |
ctrlnum | (OCoLC)816253967 (DE-599)HBZHT017149614 |
discipline | Mathematik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01428nam a22003492cb4500</leader><controlfield tag="001">BV040487851</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20121211 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">121017s2013 d||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781848167933</subfield><subfield code="9">978-1-84816-793-3</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)816253967</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)HBZHT017149614</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-824</subfield><subfield code="a">DE-19</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">SK 820</subfield><subfield code="0">(DE-625)143258:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Piunovskij, Alexei B.</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1029167125</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Examples in Markov decision processes</subfield><subfield code="c">A. B. Piunovskiy</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">London</subfield><subfield code="b">Imperial College Press</subfield><subfield code="c">2013</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XIII, 293 S.</subfield><subfield code="b">graph. 
Darst.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="1" ind2=" "><subfield code="a">Imperial College Press optimization series</subfield><subfield code="v">2</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Markov-Entscheidungsprozess</subfield><subfield code="0">(DE-588)4168927-6</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="655" ind1=" " ind2="7"><subfield code="0">(DE-588)4144384-6</subfield><subfield code="a">Beispielsammlung</subfield><subfield code="2">gnd-content</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Markov-Entscheidungsprozess</subfield><subfield code="0">(DE-588)4168927-6</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="830" ind1=" " ind2="0"><subfield code="a">Imperial College Press optimization series</subfield><subfield code="v">2</subfield><subfield code="w">(DE-604)BV035878988</subfield><subfield code="9">2</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">HBZ Datenaustausch</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=025334923&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield 
code="a">oai:aleph.bib-bvb.de:BVB01-025334923</subfield></datafield></record></collection> |
genre | (DE-588)4144384-6 Beispielsammlung gnd-content |
genre_facet | Beispielsammlung |
id | DE-604.BV040487851 |
illustrated | Illustrated |
indexdate | 2024-07-10T00:24:48Z |
institution | BVB |
isbn | 9781848167933 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-025334923 |
oclc_num | 816253967 |
open_access_boolean | |
owner | DE-824 DE-19 DE-BY-UBM |
owner_facet | DE-824 DE-19 DE-BY-UBM |
physical | XIII, 293 S. graph. Darst. |
publishDate | 2013 |
publishDateSearch | 2013 |
publishDateSort | 2013 |
publisher | Imperial College Press |
record_format | marc |
series | Imperial College Press optimization series |
series2 | Imperial College Press optimization series |
spelling | Piunovskij, Alexei B. Verfasser (DE-588)1029167125 aut Examples in Markov decision processes A. B. Piunovskiy London Imperial College Press 2013 XIII, 293 S. graph. Darst. txt rdacontent n rdamedia nc rdacarrier Imperial College Press optimization series 2 Markov-Entscheidungsprozess (DE-588)4168927-6 gnd rswk-swf (DE-588)4144384-6 Beispielsammlung gnd-content Markov-Entscheidungsprozess (DE-588)4168927-6 s DE-604 Imperial College Press optimization series 2 (DE-604)BV035878988 2 HBZ Datenaustausch application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=025334923&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Piunovskij, Alexei B. Examples in Markov decision processes Imperial College Press optimization series Markov-Entscheidungsprozess (DE-588)4168927-6 gnd |
subject_GND | (DE-588)4168927-6 (DE-588)4144384-6 |
title | Examples in Markov decision processes |
title_auth | Examples in Markov decision processes |
title_exact_search | Examples in Markov decision processes |
title_full | Examples in Markov decision processes A. B. Piunovskiy |
title_fullStr | Examples in Markov decision processes A. B. Piunovskiy |
title_full_unstemmed | Examples in Markov decision processes A. B. Piunovskiy |
title_short | Examples in Markov decision processes |
title_sort | examples in markov decision processes |
topic | Markov-Entscheidungsprozess (DE-588)4168927-6 gnd |
topic_facet | Markov-Entscheidungsprozess Beispielsammlung |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=025334923&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
volume_link | (DE-604)BV035878988 |
work_keys_str_mv | AT piunovskijalexeib examplesinmarkovdecisionprocesses |