Reinforcement Learning
Reinforcement learning is the learning of a mapping from situations to actions so as to maximize a scalar reward or reinforcement signal. The learner is not told which action to take, as in most forms of machine learning, but instead must discover which actions yield the highest reward by trying the...
Other Authors: | |
---|---|
Format: | Electronic E-Book |
Language: | English |
Published: | Boston, MA: Springer US, 1992 |
Series: | The Springer International Series in Engineering and Computer Science, Knowledge Representation, Learning and Expert Systems; 173 |
Subjects: | |
Online Access: | BTU01 URL of the original publisher |
Summary: | Reinforcement learning is the learning of a mapping from situations to actions so as to maximize a scalar reward or reinforcement signal. The learner is not told which action to take, as in most forms of machine learning, but instead must discover which actions yield the highest reward by trying them. In the most interesting and challenging cases, actions may affect not only the immediate reward, but also the next situation, and through that all subsequent rewards. These two characteristics -- trial-and-error search and delayed reward -- are the most important distinguishing features of reinforcement learning. Reinforcement learning is both a new and a very old topic in AI. The term appears to have been coined by Minsky (1961), and independently in control theory by Waltz and Fu (1965). The earliest machine learning research now viewed as directly relevant was Samuel's (1959) checker player, which used temporal-difference learning to manage delayed reward much as it is used today. Of course, learning and reinforcement have been studied in psychology for almost a century, and that work has had a very strong impact on the AI/engineering work. One could in fact consider all of reinforcement learning to be simply the reverse engineering of certain psychological learning processes (e.g. operant conditioning and secondary reinforcement). Reinforcement Learning is an edited volume of original research, comprising seven invited contributions by leading researchers |
Physical Description: | 1 online resource (172 p.) |
ISBN: | 9781461536185 |
DOI: | 10.1007/978-1-4615-3618-5 |
Internal format
MARC
LEADER | 00000nmm a2200000zcb4500 | ||
---|---|---|---|
001 | BV045185856 | ||
003 | DE-604 | ||
005 | 00000000000000.0 | ||
007 | cr|uuu---uuuuu | ||
008 | 180912s1992 |||| o||u| ||||||eng d | ||
020 | |a 9781461536185 |9 978-1-4615-3618-5 | ||
024 | 7 | |a 10.1007/978-1-4615-3618-5 |2 doi | |
035 | |a (ZDB-2-ENG)978-1-4615-3618-5 | ||
035 | |a (OCoLC)1053793463 | ||
035 | |a (DE-599)BVBBV045185856 | ||
040 | |a DE-604 |b ger |e aacr | ||
041 | 0 | |a eng | |
049 | |a DE-634 | ||
082 | 0 | |a 006.3 |2 23 | |
245 | 1 | 0 | |a Reinforcement Learning |c edited by Richard S. Sutton |
264 | 1 | |a Boston, MA |b Springer US |c 1992 | |
300 | |a 1 Online-Ressource (172 p) | ||
336 | |b txt |2 rdacontent | ||
337 | |b c |2 rdamedia | ||
338 | |b cr |2 rdacarrier | ||
490 | 0 | |a The Springer International Series in Engineering and Computer Science, Knowledge Representation, Learning and Expert Systems |v 173 | |
520 | |a Reinforcement learning is the learning of a mapping from situations to actions so as to maximize a scalar reward or reinforcement signal. The learner is not told which action to take, as in most forms of machine learning, but instead must discover which actions yield the highest reward by trying them. In the most interesting and challenging cases, actions may affect not only the immediate reward, but also the next situation, and through that all subsequent rewards. These two characteristics -- trial-and-error search and delayed reward -- are the most important distinguishing features of reinforcement learning. Reinforcement learning is both a new and a very old topic in AI. The term appears to have been coined by Minsky (1961), and independently in control theory by Waltz and Fu (1965). The earliest machine learning research now viewed as directly relevant was Samuel's (1959) checker player, which used temporal-difference learning to manage delayed reward much as it is used today. Of course learning and reinforcement have been studied in psychology for almost a century, and that work has had a very strong impact on the AI/engineering work. One could in fact consider all of reinforcement learning to be simply the reverse engineering of certain psychological learning processes (e.g. operant conditioning and secondary reinforcement). Reinforcement Learning is an edited volume of original research, comprising seven invited contributions by leading researchers | ||
650 | 4 | |a Computer Science | |
650 | 4 | |a Artificial Intelligence (incl. Robotics) | |
650 | 4 | |a Statistical Physics, Dynamical Systems and Complexity | |
650 | 4 | |a Computer science | |
650 | 4 | |a Artificial intelligence | |
650 | 4 | |a Statistical physics | |
650 | 4 | |a Dynamical systems | |
700 | 1 | |a Sutton, Richard S. |4 edt | |
776 | 0 | 8 | |i Erscheint auch als |n Druck-Ausgabe |z 9781461366089 |
856 | 4 | 0 | |u https://doi.org/10.1007/978-1-4615-3618-5 |x Verlag |z URL des Erstveröffentlichers |3 Volltext |
912 | |a ZDB-2-ENG | ||
940 | 1 | |q ZDB-2-ENG_Archiv | |
999 | |a oai:aleph.bib-bvb.de:BVB01-030575033 | ||
966 | e | |u https://doi.org/10.1007/978-1-4615-3618-5 |l BTU01 |p ZDB-2-ENG |q ZDB-2-ENG_Archiv |x Verlag |3 Volltext |
Record in the search index
_version_ | 1804178876214018048 |
---|---|
any_adam_object | |
author2 | Sutton, Richard S. |
author2_role | edt |
author2_variant | r s s rs rss |
author_facet | Sutton, Richard S. |
building | Verbundindex |
bvnumber | BV045185856 |
collection | ZDB-2-ENG |
ctrlnum | (ZDB-2-ENG)978-1-4615-3618-5 (OCoLC)1053793463 (DE-599)BVBBV045185856 |
dewey-full | 006.3 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.3 |
dewey-search | 006.3 |
dewey-sort | 16.3 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
doi_str_mv | 10.1007/978-1-4615-3618-5 |
format | Electronic eBook |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>03122nmm a2200457zcb4500</leader><controlfield tag="001">BV045185856</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">00000000000000.0</controlfield><controlfield tag="007">cr|uuu---uuuuu</controlfield><controlfield tag="008">180912s1992 |||| o||u| ||||||eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781461536185</subfield><subfield code="9">978-1-4615-3618-5</subfield></datafield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/978-1-4615-3618-5</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ZDB-2-ENG)978-1-4615-3618-5</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1053793463</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV045185856</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">aacr</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-634</subfield></datafield><datafield tag="082" ind1="0" ind2=" "><subfield code="a">006.3</subfield><subfield code="2">23</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Reinforcement Learning</subfield><subfield code="c">edited by Richard S. 
Sutton</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Boston, MA</subfield><subfield code="b">Springer US</subfield><subfield code="c">1992</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">1 Online-Ressource (172 p)</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="490" ind1="0" ind2=" "><subfield code="a">The Springer International Series in Engineering and Computer Science, Knowledge Representation, Learning and Expert Systems</subfield><subfield code="v">173</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Reinforcement learning is the learning of a mapping from situations to actions so as to maximize a scalar reward or reinforcement signal. The learner is not told which action to take, as in most forms of machine learning, but instead must discover which actions yield the highest reward by trying them. In the most interesting and challenging cases, actions may affect not only the immediate reward, but also the next situation, and through that all subsequent rewards. These two characteristics -- trial-and-error search and delayed reward -- are the most important distinguishing features of reinforcement learning. Reinforcement learning is both a new and a very old topic in AI. The term appears to have been coined by Minsk (1961), and independently in control theory by Walz and Fu (1965). The earliest machine learning research now viewed as directly relevant was Samuel's (1959) checker player, which used temporal-difference learning to manage delayed reward much as it is used today. 
Of course learning and reinforcement have been studied in psychology for almost a century, and that work has had a very strong impact on the AI/engineering work. One could in fact consider all of reinforcement learning to be simply the reverse engineering of certain psychological learning processes (e.g. operant conditioning and secondary reinforcement). Reinforcement Learning is an edited volume of original research, comprising seven invited contributions by leading researchers</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Computer Science</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Artificial Intelligence (incl. Robotics)</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Statistical Physics, Dynamical Systems and Complexity</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Computer science</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Artificial intelligence</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Statistical physics</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Dynamical systems</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Sutton, Richard S.</subfield><subfield code="4">edt</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Druck-Ausgabe</subfield><subfield code="z">9781461366089</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://doi.org/10.1007/978-1-4615-3618-5</subfield><subfield code="x">Verlag</subfield><subfield code="z">URL des Erstveröffentlichers</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">ZDB-2-ENG</subfield></datafield><datafield tag="940" ind1="1" ind2=" "><subfield 
code="q">ZDB-2-ENG_Archiv</subfield></datafield><datafield tag="999" ind1=" " ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-030575033</subfield></datafield><datafield tag="966" ind1="e" ind2=" "><subfield code="u">https://doi.org/10.1007/978-1-4615-3618-5</subfield><subfield code="l">BTU01</subfield><subfield code="p">ZDB-2-ENG</subfield><subfield code="q">ZDB-2-ENG_Archiv</subfield><subfield code="x">Verlag</subfield><subfield code="3">Volltext</subfield></datafield></record></collection> |
id | DE-604.BV045185856 |
illustrated | Not Illustrated |
indexdate | 2024-07-10T08:10:56Z |
institution | BVB |
isbn | 9781461536185 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-030575033 |
oclc_num | 1053793463 |
open_access_boolean | |
owner | DE-634 |
owner_facet | DE-634 |
physical | 1 Online-Ressource (172 p) |
psigel | ZDB-2-ENG ZDB-2-ENG_Archiv ZDB-2-ENG ZDB-2-ENG_Archiv |
publishDate | 1992 |
publishDateSearch | 1992 |
publishDateSort | 1992 |
publisher | Springer US |
record_format | marc |
series2 | The Springer International Series in Engineering and Computer Science, Knowledge Representation, Learning and Expert Systems |
spelling | Reinforcement Learning edited by Richard S. Sutton Boston, MA Springer US 1992 1 Online-Ressource (172 p) txt rdacontent c rdamedia cr rdacarrier The Springer International Series in Engineering and Computer Science, Knowledge Representation, Learning and Expert Systems 173 Reinforcement learning is the learning of a mapping from situations to actions so as to maximize a scalar reward or reinforcement signal. The learner is not told which action to take, as in most forms of machine learning, but instead must discover which actions yield the highest reward by trying them. In the most interesting and challenging cases, actions may affect not only the immediate reward, but also the next situation, and through that all subsequent rewards. These two characteristics -- trial-and-error search and delayed reward -- are the most important distinguishing features of reinforcement learning. Reinforcement learning is both a new and a very old topic in AI. The term appears to have been coined by Minsk (1961), and independently in control theory by Walz and Fu (1965). The earliest machine learning research now viewed as directly relevant was Samuel's (1959) checker player, which used temporal-difference learning to manage delayed reward much as it is used today. Of course learning and reinforcement have been studied in psychology for almost a century, and that work has had a very strong impact on the AI/engineering work. One could in fact consider all of reinforcement learning to be simply the reverse engineering of certain psychological learning processes (e.g. operant conditioning and secondary reinforcement). Reinforcement Learning is an edited volume of original research, comprising seven invited contributions by leading researchers Computer Science Artificial Intelligence (incl. Robotics) Statistical Physics, Dynamical Systems and Complexity Computer science Artificial intelligence Statistical physics Dynamical systems Sutton, Richard S. 
edt Erscheint auch als Druck-Ausgabe 9781461366089 https://doi.org/10.1007/978-1-4615-3618-5 Verlag URL des Erstveröffentlichers Volltext |
spellingShingle | Reinforcement Learning Computer Science Artificial Intelligence (incl. Robotics) Statistical Physics, Dynamical Systems and Complexity Computer science Artificial intelligence Statistical physics Dynamical systems |
title | Reinforcement Learning |
title_auth | Reinforcement Learning |
title_exact_search | Reinforcement Learning |
title_full | Reinforcement Learning edited by Richard S. Sutton |
title_fullStr | Reinforcement Learning edited by Richard S. Sutton |
title_full_unstemmed | Reinforcement Learning edited by Richard S. Sutton |
title_short | Reinforcement Learning |
title_sort | reinforcement learning |
topic | Computer Science Artificial Intelligence (incl. Robotics) Statistical Physics, Dynamical Systems and Complexity Computer science Artificial intelligence Statistical physics Dynamical systems |
topic_facet | Computer Science Artificial Intelligence (incl. Robotics) Statistical Physics, Dynamical Systems and Complexity Computer science Artificial intelligence Statistical physics Dynamical systems |
url | https://doi.org/10.1007/978-1-4615-3618-5 |
work_keys_str_mv | AT suttonrichards reinforcementlearning |