Transfer in reinforcement learning domains
Saved in:
Author: | Taylor, Matthew E. |
---|---|
Format: | Book |
Language: | English |
Published: | Berlin ; Heidelberg : Springer, 2009 |
Series: | Studies in computational intelligence ; 216 |
Subjects: | Reinforcement learning ; Artificial intelligence |
Online access: | Table of contents |
Note: | Bibliography: p. 221-229 |
Description: | XII, 229 p. : graphs ; 24 cm |
ISBN: | 9783642018817 |
Internal format
MARC
LEADER | 00000nam a2200000 cb4500 | ||
---|---|---|---|
001 | BV035760176 | ||
003 | DE-604 | ||
005 | 20091126 | ||
007 | t | ||
008 | 091008s2009 gw d||| |||| 00||| eng d | ||
015 | |a 09,A34,0091 |2 dnb | ||
016 | 7 | |a 993961916 |2 DE-101 | |
020 | |a 9783642018817 |c Pp. : EUR 106.95 (freier Pr.), sfr 166.00 (freier Pr.) |9 978-3-642-01881-7 | ||
024 | 3 | |a 9783642018817 | |
028 | 5 | 2 | |a 12674947 |
035 | |a (OCoLC)394927771 | ||
035 | |a (DE-599)DNB993961916 | ||
040 | |a DE-604 |b ger |e rakddb | ||
041 | 0 | |a eng | |
044 | |a gw |c XA-DE-BE | ||
049 | |a DE-473 | ||
050 | 0 | |a Q325.6 | |
082 | 0 | |a 006.31 |2 22/ger | |
082 | 0 | |a 006.3/1 |2 22 | |
084 | |a ST 300 |0 (DE-625)143650: |2 rvk | ||
084 | |a 004 |2 sdnb | ||
100 | 1 | |a Taylor, Matthew E. |e Verfasser |4 aut | |
245 | 1 | 0 | |a Transfer in reinforcement learning domains |c Matthew E. Taylor |
264 | 1 | |a Berlin ; Heidelberg |b Springer |c 2009 | |
300 | |a XII, 229 S. |b graph. Darst. |c 24 cm | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
490 | 1 | |a Studies in computational intelligence |v 216 | |
500 | |a Literaturverz. S. 221 - 229 | ||
650 | 4 | |a Bestärkendes Lernen <Künstliche Intelligenz> | |
650 | 4 | |a Künstliche Intelligenz | |
650 | 4 | |a Artificial intelligence | |
650 | 4 | |a Reinforcement learning | |
650 | 0 | 7 | |a Bestärkendes Lernen |g Künstliche Intelligenz |0 (DE-588)4825546-4 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Bestärkendes Lernen |g Künstliche Intelligenz |0 (DE-588)4825546-4 |D s |
689 | 0 | |5 DE-604 | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |z 978-3-642-01882-4 |
830 | 0 | |a Studies in computational intelligence |v 216 |w (DE-604)BV020822171 |9 216 | |
856 | 4 | 2 | |m Digitalisierung UB Bamberg |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=018620060&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
999 | |a oai:aleph.bib-bvb.de:BVB01-018620060 |
Record in the search index
_version_ | 1804140683427053568 |
---|---|
adam_text | Contents

1 Introduction ................................................ 1
  1.1 Problem Definition ...................................... 4
  1.2 Evaluating Transfer Learning Methods .................... 5
    1.2.1 Empirical Transfer Comparisons ...................... 7
    1.2.2 Dimensions of Comparison ............................ 10
  1.3 Transfer for Reinforcement Learning ..................... 11
    1.3.1 Reinforcement Learning Notation ..................... 11
    1.3.2 What Follows ........................................ 12

2 Reinforcement Learning Background ........................... 15
  2.1 Framing the Reinforcement Learning Problem .............. 15
  2.2 Function Approximation .................................. 17
    2.2.1 Cerebellar Model Arithmetic Computers ............... 17
    2.2.2 Radial Basis Functions .............................. 18
    2.2.3 Artificial Neural Networks .......................... 19
    2.2.4 Instance-Based Approximation ........................ 21
  2.3 Learning Methods ........................................ 22
    2.3.1 Sarsa ............................................... 23
    2.3.2 NeuroEvolution of Augmenting Topologies (NEAT) ...... 25
    2.3.3 Fitted R-Max ........................................ 27

3 Related Work ................................................ 31
  3.1 Transfer Approaches ..................................... 31
    3.1.1 Transfer Methods Categorization ..................... 32
    3.1.2 Multi-Task Learning ................................. 34
    3.1.3 Inter-Task Mappings ................................. 36
    3.1.4 Related Paradigms ................................... 38
  3.2 Transfer Methods for Fixed State Variables and Actions .. 39
  3.3 Multi-Task Learning Methods ............................. 44
  3.4 Transferring Task-Invariant Knowledge between Tasks
      with Differing State Variables and Actions .............. 49
  3.5 Explicit Mappings to Transfer between Different Actions
      and State Representations ............................... 52
  3.6 Learning Task Mappings .................................. 57

4 Empirical Domains ........................................... 61
  4.1 Generalized Mountain Car ................................ 61
    4.1.1 Two Dimensional Mountain Car ........................ 62
    4.1.2 Three Dimensional Mountain Car ...................... 63
    4.1.3 Learning Mountain Car ............................... 66
  4.2 Server Job Scheduling ................................... 67
    4.2.1 Learning Server Job Scheduling ...................... 70
  4.3 Robot Soccer Keepaway ................................... 71
    4.3.1 More Complex Keepaway Tasks ......................... 73
    4.3.2 3 vs. 2 XOR Keepaway ................................ 75
    4.3.3 Additional Keepaway Variants ........................ 76
    4.3.4 Learning Keepaway ................................... 77
  4.4 Ringworld ............................................... 83
    4.4.1 Learning Ringworld .................................. 85
  4.5 Knight Joust ............................................ 86
    4.5.1 Learning Knight Joust ............................... 88
  4.6 Summary of Domains ...................................... 89

5 Value Function Transfer via Inter-Task Mappings ............. 91
  5.1 Inter-Task Mappings ..................................... 91
    5.1.1 Inter-Task Mappings for the Keepaway Domain ......... 93
  5.2 Value Function Transfer ................................. 95
    5.2.1 Constructing ρ_CMAC and ρ_RBF ....................... 96
    5.2.2 Constructing ρ_ANN .................................. 98
  5.3 Empirical Evaluation of Value Function Transfer ......... 99
    5.3.1 Transfer from 3 vs. 2 Keepaway to
          4 vs. 3 Keepaway .................................... 99
    5.3.2 Understanding ρ_CMAC's Benefit ...................... 105
    5.3.3 Transfer between Players with Differing Abilities ... 110
    5.3.4 Transfer from Knight Joust to 4 vs. 3 Keepaway ...... 112
    5.3.5 Variants on 3 vs. 2 Keepaway: Negative Transfer ..... 115
    5.3.6 Larger Keepaway Tasks and Multi-Step Transfer ....... 116
  5.4 On the Applicability of Value Function Transfer ......... 118

6 Extending Transfer via Inter-Task Mappings .................. 121
  6.1 Q-Value Reuse ........................................... 121
    6.1.1 Q-Value Reuse Results: Mountain Car ................. 123
    6.1.2 Q-Value Reuse Results: Keepaway ..................... 125
  6.2 Policy Transfer ......................................... 127
    6.2.1 Server Job Scheduling Results ....................... 129
    6.2.2 Keepaway Results .................................... 134
    6.2.3 Partial Mapping Results ............................. 136
  6.3 Chapter Summary ......................................... 138

7 Transfer between Different Reinforcement Learning
  Methods ..................................................... 139
  7.1 TIMBREL: Instance-Based Transfer ........................ 140
    7.1.1 Model Transfer ...................................... 141
    7.1.2 Implementing TIMBREL in Mountain Car ................ 142
    7.1.3 TIMBREL Transfer Experiments ........................ 145
    7.1.4 TIMBREL Summary ..................................... 148
  7.2 Transfer via Rules ...................................... 148
    7.2.1 Rule Utilization Schemes ............................ 150
    7.2.2 Cross-Domain Rule Transfer Results .................. 153
    7.2.3 Rule Transfer: 3 vs. 2 Keepaway to
          4 vs. 3 Keepaway .................................... 161
    7.2.4 Rule Transfer Summary ............................... 161
  7.3 Representation Transfer ................................. 163
    7.3.1 Complexification .................................... 165
    7.3.2 Offline Representation Transfer ..................... 166
    7.3.3 Overview of Representation Transfer Results ......... 168
    7.3.4 Complexification in XOR Keepaway .................... 169
    7.3.5 Offline Representation Transfer in
          3 vs. 2 Keepaway .................................... 171
    7.3.6 Offline Representation Transfer for Task Transfer ... 175
    7.3.7 Representation Transfer Summary ..................... 176
  7.4 Comparison of Presented Transfer Methods ................ 178

8 Learning Inter-Task Mappings ................................ 181
  8.1 Learning Inter-Task Mappings via Classification ......... 181
    8.1.1 Learning Keepaway Inter-Task Mappings ............... 184
    8.1.2 Learning Server Job Scheduling Inter-Task
          Mappings ............................................ 186
    8.1.3 Learned Inter-Task Mapping Results .................. 187
    8.1.4 Discussion .......................................... 189
  8.2 MASTER: Learning Inter-Task Mappings Offline ............ 190
    8.2.1 MASTER: Empirical Validation ........................ 193
    8.2.2 Transfer from 2D to 3D Mountain Car ................. 196
    8.2.3 Reducing the Total Sample Complexity ................ 197
    8.2.4 Comparison to Previous Work ......................... 199
    8.2.5 Transfer in Hand Brake Mountain Car ................. 201
  8.3 Chapter Summary ......................................... 203

9 Conclusion and Future Work .................................. 205
  9.1 Summary of Monograph Methods ............................ 205
  9.2 Possible Enhancements to Monograph Methods .............. 207
    9.2.1 Inter-Task Mappings ................................. 207
    9.2.2 Transfer Algorithms ................................. 208
  9.3 Determining the Efficacy of Transfer .................... 211
    9.3.1 Avoiding Negative Transfer .......................... 211
  9.4 Future Transfer Work .................................... 213
    9.4.1 Increasing Transfer's Applicability ................. 213
    9.4.2 Improving Transfer's Efficacy ....................... 214
    9.4.3 Transfer in More Difficult Tasks .................... 217
  9.5 Conclusion .............................................. 217

A On-Line Appendix ............................................ 219

References .................................................... 221
 |
any_adam_object | 1 |
author | Taylor, Matthew E. |
author_facet | Taylor, Matthew E. |
author_role | aut |
author_sort | Taylor, Matthew E. |
author_variant | m e t me met |
building | Verbundindex |
bvnumber | BV035760176 |
callnumber-first | Q - Science |
callnumber-label | Q325 |
callnumber-raw | Q325.6 |
callnumber-search | Q325.6 |
callnumber-sort | Q 3325.6 |
callnumber-subject | Q - General Science |
classification_rvk | ST 300 |
ctrlnum | (OCoLC)394927771 (DE-599)DNB993961916 |
dewey-full | 006.31 006.3/1 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.31 006.3/1 |
dewey-search | 006.31 006.3/1 |
dewey-sort | 16.31 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01994nam a2200517 cb4500</leader><controlfield tag="001">BV035760176</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20091126 </controlfield><controlfield tag="007">t</controlfield><controlfield tag="008">091008s2009 gw d||| |||| 00||| eng d</controlfield><datafield tag="015" ind1=" " ind2=" "><subfield code="a">09,A34,0091</subfield><subfield code="2">dnb</subfield></datafield><datafield tag="016" ind1="7" ind2=" "><subfield code="a">993961916</subfield><subfield code="2">DE-101</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9783642018817</subfield><subfield code="c">Pp. : EUR 106.95 (freier Pr.), sfr 166.00 (freier Pr.)</subfield><subfield code="9">978-3-642-01881-7</subfield></datafield><datafield tag="024" ind1="3" ind2=" "><subfield code="a">9783642018817</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">12674947</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)394927771</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)DNB993961916</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rakddb</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="044" ind1=" " ind2=" "><subfield code="a">gw</subfield><subfield code="c">XA-DE-BE</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-473</subfield></datafield><datafield tag="050" ind1=" " ind2="0"><subfield code="a">Q325.6</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">006.31</subfield><subfield code="2">22/ger</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield 
code="a">006.3/1</subfield><subfield code="2">22</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 300</subfield><subfield code="0">(DE-625)143650:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">004</subfield><subfield code="2">sdnb</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Taylor, Matthew E.</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Transfer in reinforcement learning domains</subfield><subfield code="c">Matthew E. Taylor</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Berlin ; Heidelberg</subfield><subfield code="b">Springer</subfield><subfield code="c">2009</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">XII, 229 S.</subfield><subfield code="b">graph. Darst.</subfield><subfield code="c">24 cm</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="1" ind2=" "><subfield code="a">Studies in computational intelligence</subfield><subfield code="v">216</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Literaturverz. S. 
221 - 229</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Bestärkendes Lernen <Künstliche Intelligenz></subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Künstliche Intelligenz</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Artificial intelligence</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Reinforcement learning</subfield></datafield><datafield tag="650" ind1="0" ind2="7"><subfield code="a">Bestärkendes Lernen</subfield><subfield code="g">Künstliche Intelligenz</subfield><subfield code="0">(DE-588)4825546-4</subfield><subfield code="2">gnd</subfield><subfield code="9">rswk-swf</subfield></datafield><datafield tag="689" ind1="0" ind2="0"><subfield code="a">Bestärkendes Lernen</subfield><subfield code="g">Künstliche Intelligenz</subfield><subfield code="0">(DE-588)4825546-4</subfield><subfield code="D">s</subfield></datafield><datafield tag="689" ind1="0" ind2=" "><subfield code="5">DE-604</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="z">978-3-642-01882-4</subfield></datafield><datafield tag="830" ind1=" " ind2="0"><subfield code="a">Studies in computational intelligence</subfield><subfield code="v">216</subfield><subfield code="w">(DE-604)BV020822171</subfield><subfield code="9">216</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Bamberg</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=018620060&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield 
code="a">oai:aleph.bib-bvb.de:BVB01-018620060</subfield></datafield></record></collection> |
id | DE-604.BV035760176 |
illustrated | Illustrated |
indexdate | 2024-07-09T22:03:52Z |
institution | BVB |
isbn | 9783642018817 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-018620060 |
oclc_num | 394927771 |
open_access_boolean | |
owner | DE-473 DE-BY-UBG |
owner_facet | DE-473 DE-BY-UBG |
physical | XII, 229 S. graph. Darst. 24 cm |
publishDate | 2009 |
publishDateSearch | 2009 |
publishDateSort | 2009 |
publisher | Springer |
record_format | marc |
series | Studies in computational intelligence |
series2 | Studies in computational intelligence |
spelling | Taylor, Matthew E. Verfasser aut Transfer in reinforcement learning domains Matthew E. Taylor Berlin ; Heidelberg Springer 2009 XII, 229 S. graph. Darst. 24 cm txt rdacontent n rdamedia nc rdacarrier Studies in computational intelligence 216 Literaturverz. S. 221 - 229 Bestärkendes Lernen <Künstliche Intelligenz> Künstliche Intelligenz Artificial intelligence Reinforcement learning Bestärkendes Lernen Künstliche Intelligenz (DE-588)4825546-4 gnd rswk-swf Bestärkendes Lernen Künstliche Intelligenz (DE-588)4825546-4 s DE-604 Erscheint auch als Online-Ausgabe 978-3-642-01882-4 Studies in computational intelligence 216 (DE-604)BV020822171 216 Digitalisierung UB Bamberg application/pdf http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=018620060&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA Inhaltsverzeichnis |
spellingShingle | Taylor, Matthew E. Transfer in reinforcement learning domains Studies in computational intelligence Bestärkendes Lernen <Künstliche Intelligenz> Künstliche Intelligenz Artificial intelligence Reinforcement learning Bestärkendes Lernen Künstliche Intelligenz (DE-588)4825546-4 gnd |
subject_GND | (DE-588)4825546-4 |
title | Transfer in reinforcement learning domains |
title_auth | Transfer in reinforcement learning domains |
title_exact_search | Transfer in reinforcement learning domains |
title_full | Transfer in reinforcement learning domains Matthew E. Taylor |
title_fullStr | Transfer in reinforcement learning domains Matthew E. Taylor |
title_full_unstemmed | Transfer in reinforcement learning domains Matthew E. Taylor |
title_short | Transfer in reinforcement learning domains |
title_sort | transfer in reinforcement learning domains |
topic | Bestärkendes Lernen <Künstliche Intelligenz> Künstliche Intelligenz Artificial intelligence Reinforcement learning Bestärkendes Lernen Künstliche Intelligenz (DE-588)4825546-4 gnd |
topic_facet | Bestärkendes Lernen <Künstliche Intelligenz> Künstliche Intelligenz Artificial intelligence Reinforcement learning Bestärkendes Lernen Künstliche Intelligenz |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=018620060&sequence=000002&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
volume_link | (DE-604)BV020822171 |
work_keys_str_mv | AT taylormatthewe transferinreinforcementlearningdomains |