Availability: Reinforcement Learning

Reinforcement Learning:

Bibliographic Details
Other Authors:	Sutton, Richard S. (Editor)
Format:	Electronic eBook
Language:	English
Published:	Boston, MA Springer US 1992
Series:	The Springer International Series in Engineering and Computer Science, Knowledge Representation, Learning and Expert Systems 173
Subjects:	Computer Science; Artificial Intelligence (incl. Robotics); Statistical Physics, Dynamical Systems and Complexity; Computer science; Artificial intelligence; Statistical physics; Dynamical systems
Online Access:	BTU01 Full text
Summary:	Reinforcement learning is the learning of a mapping from situations to actions so as to maximize a scalar reward or reinforcement signal. The learner is not told which action to take, as in most forms of machine learning, but instead must discover which actions yield the highest reward by trying them. In the most interesting and challenging cases, actions may affect not only the immediate reward, but also the next situation, and through that all subsequent rewards. These two characteristics -- trial-and-error search and delayed reward -- are the most important distinguishing features of reinforcement learning. Reinforcement learning is both a new and a very old topic in AI. The term appears to have been coined by Minsky (1961), and independently in control theory by Waltz and Fu (1965). The earliest machine learning research now viewed as directly relevant was Samuel's (1959) checker player, which used temporal-difference learning to manage delayed reward much as it is used today. Of course, learning and reinforcement have been studied in psychology for almost a century, and that work has had a very strong impact on the AI/engineering work. One could in fact consider all of reinforcement learning to be simply the reverse engineering of certain psychological learning processes (e.g. operant conditioning and secondary reinforcement). Reinforcement Learning is an edited volume of original research, comprising seven invited contributions by leading researchers.
Physical Description:	1 online resource (172 p)
ISBN:	9781461536185
DOI:	10.1007/978-1-4615-3618-5

Internal Format

MARC


LEADER	00000nmm a2200000zcb4500
001	BV045185856
003	DE-604
005	00000000000000.0
007	cr|uuu---uuuuu
008	180912s1992 |||| o||u| ||||||eng d
020			|a 9781461536185 |9 978-1-4615-3618-5
024	7		|a 10.1007/978-1-4615-3618-5 |2 doi
035			|a (ZDB-2-ENG)978-1-4615-3618-5
035			|a (OCoLC)1053793463
035			|a (DE-599)BVBBV045185856
040			|a DE-604 |b ger |e aacr
041	0		|a eng
049			|a DE-634
082	0		|a 006.3 |2 23
245	1	0	|a Reinforcement Learning |c edited by Richard S. Sutton
264		1	|a Boston, MA |b Springer US |c 1992
300			|a 1 Online-Ressource (172 p)
336			|b txt |2 rdacontent
337			|b c |2 rdamedia
338			|b cr |2 rdacarrier
490	0		|a The Springer International Series in Engineering and Computer Science, Knowledge Representation, Learning and Expert Systems |v 173
520			|a Reinforcement learning is the learning of a mapping from situations to actions so as to maximize a scalar reward or reinforcement signal. The learner is not told which action to take, as in most forms of machine learning, but instead must discover which actions yield the highest reward by trying them. In the most interesting and challenging cases, actions may affect not only the immediate reward, but also the next situation, and through that all subsequent rewards. These two characteristics -- trial-and-error search and delayed reward -- are the most important distinguishing features of reinforcement learning. Reinforcement learning is both a new and a very old topic in AI. The term appears to have been coined by Minsky (1961), and independently in control theory by Waltz and Fu (1965). The earliest machine learning research now viewed as directly relevant was Samuel's (1959) checker player, which used temporal-difference learning to manage delayed reward much as it is used today. Of course learning and reinforcement have been studied in psychology for almost a century, and that work has had a very strong impact on the AI/engineering work. One could in fact consider all of reinforcement learning to be simply the reverse engineering of certain psychological learning processes (e.g. operant conditioning and secondary reinforcement). Reinforcement Learning is an edited volume of original research, comprising seven invited contributions by leading researchers
650		4	|a Computer Science
650		4	|a Artificial Intelligence (incl. Robotics)
650		4	|a Statistical Physics, Dynamical Systems and Complexity
650		4	|a Computer science
650		4	|a Artificial intelligence
650		4	|a Statistical physics
650		4	|a Dynamical systems
700	1		|a Sutton, Richard S. |4 edt
776	0	8	|i Erscheint auch als |n Druck-Ausgabe |z 9781461366089
856	4	0	|u https://doi.org/10.1007/978-1-4615-3618-5 |x Verlag |z URL des Erstveröffentlichers |3 Volltext
912			|a ZDB-2-ENG
940	1		|q ZDB-2-ENG_Archiv
999			|a oai:aleph.bib-bvb.de:BVB01-030575033
966	e		|u https://doi.org/10.1007/978-1-4615-3618-5 |l BTU01 |p ZDB-2-ENG |q ZDB-2-ENG_Archiv |x Verlag |3 Volltext

Record in the Search Index

_version_	1804178876214018048
any_adam_object
author2	Sutton, Richard S.
author2_role	edt
author2_variant	r s s rs rss
author_facet	Sutton, Richard S.
building	Verbundindex
bvnumber	BV045185856
collection	ZDB-2-ENG
ctrlnum	(ZDB-2-ENG)978-1-4615-3618-5 (OCoLC)1053793463 (DE-599)BVBBV045185856
dewey-full	006.3
dewey-hundreds	000 - Computer science, information, general works
dewey-ones	006 - Special computer methods
dewey-raw	006.3
dewey-search	006.3
dewey-sort	16.3
dewey-tens	000 - Computer science, information, general works
discipline	Informatik
doi_str_mv	10.1007/978-1-4615-3618-5
format	Electronic eBook
id	DE-604.BV045185856
illustrated	Not Illustrated
indexdate	2024-07-10T08:10:56Z
institution	BVB
isbn	9781461536185
language	English
oai_aleph_id	oai:aleph.bib-bvb.de:BVB01-030575033
oclc_num	1053793463
open_access_boolean
owner	DE-634
owner_facet	DE-634
physical	1 Online-Ressource (172 p)
psigel	ZDB-2-ENG ZDB-2-ENG_Archiv ZDB-2-ENG ZDB-2-ENG_Archiv
publishDate	1992
publishDateSearch	1992
publishDateSort	1992
publisher	Springer US
record_format	marc
series2	The Springer International Series in Engineering and Computer Science, Knowledge Representation, Learning and Expert Systems
spellingShingle	Reinforcement Learning Computer Science Artificial Intelligence (incl. Robotics) Statistical Physics, Dynamical Systems and Complexity Computer science Artificial intelligence Statistical physics Dynamical systems
title	Reinforcement Learning
title_auth	Reinforcement Learning
title_exact_search	Reinforcement Learning
title_full	Reinforcement Learning edited by Richard S. Sutton
title_fullStr	Reinforcement Learning edited by Richard S. Sutton
title_full_unstemmed	Reinforcement Learning edited by Richard S. Sutton
title_short	Reinforcement Learning
title_sort	reinforcement learning
topic	Computer Science Artificial Intelligence (incl. Robotics) Statistical Physics, Dynamical Systems and Complexity Computer science Artificial intelligence Statistical physics Dynamical systems
topic_facet	Computer Science Artificial Intelligence (incl. Robotics) Statistical Physics, Dynamical Systems and Complexity Computer science Artificial intelligence Statistical physics Dynamical systems
url	https://doi.org/10.1007/978-1-4615-3618-5
work_keys_str_mv	AT suttonrichards reinforcementlearning

Availability

No print copy is available.

Note: this title is not in the THWS holdings.
