Availability: Grokking deep reinforcement learning

Grokking deep reinforcement learning:

Bibliographic details
Main author:	Morales, Miguel (author)
Format:	Book
Language:	English
Published:	Manning Shelter Island [2020]
Online access:	Table of contents
Description:	xxi, 447 pages, illustrations, diagrams
ISBN:	9781617295454

Internal format

MARC


LEADER	00000nam a2200000 c 4500
001	BV047001780
003	DE-604
005	20241203
007	t|
008	201117s2020 xx a||| |||| 00||| eng d
020			|a 9781617295454 |c pbk |9 978-1-61729-545-4
035			|a (OCoLC)1241670259
035			|a (DE-599)BVBBV047001780
040			|a DE-604 |b ger |e rda
041	0		|a eng
049			|a DE-862 |a DE-473
084			|a ST 302 |0 (DE-625)143652: |2 rvk
100	1		|a Morales, Miguel |e Verfasser |0 (DE-588)1318329892 |4 aut
245	1	0	|a Grokking deep reinforcement learning |c Miguel Morales ; foreword by Charles Isbell, Jr.
246	1	3	|a Deep reinforcement learning
264		1	|a Manning |b Shelter Island |c [2020]
264		4	|c © 2020
300			|a xxi, 447 Seiten |b Illustrationen, Diagramme
336			|b txt |2 rdacontent
337			|b n |2 rdamedia
338			|b nc |2 rdacarrier
700	1		|a Isbell, Charles |d ca. 20. Jh. |0 (DE-588)1242481133 |4 wpr
856	4	2	|m Digitalisierung UB Bamberg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032409393&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis
943	1		|a oai:aleph.bib-bvb.de:BVB01-032409393

Record in the search index

DE-BY-862_location	2000
DE-BY-FWS_call_number	2000/ST 302 M828
DE-BY-FWS_katkey	857838
DE-BY-FWS_media_number	083000518519
_version_	1819742269733863424
adam_text	contents
	foreword xi
	preface xiii
	acknowledgments xv
	about this book xvii
	about the author xxi
	1 Introduction to deep reinforcement learning 1
	    What is deep reinforcement learning? 2
	    The past, present, and future of deep reinforcement learning 15
	    The suitability of deep reinforcement learning 22
	    Setting clear two-way expectations 25
	2 Mathematical foundations of reinforcement learning 31
	    Components of reinforcement learning 33
	    MDPs: The engine of the environment 45
	3 Balancing immediate and long-term goals 65
	    The objective of a decision-making agent 66
	    Planning optimal sequences of actions 77
	4 Balancing the gathering and use of information 97
	    The challenge of interpreting evaluative feedback 99
	    Strategic exploration 117
	5 Evaluating agents' behaviors 131
	    Learning to estimate the value of policies 133
	    Learning to estimate from multiple steps 150
	6 Improving agents' behaviors 167
	    The anatomy of reinforcement learning agents 168
	    Learning to improve policies of behavior 176
	    Decoupling behavior from learning 187
	7 Achieving goals more effectively and efficiently 203
	    Learning to improve policies using robust targets 205
	    Agents that interact, learn, and plan 217
	8 Introduction to value-based deep reinforcement learning 239
	    The kind of feedback deep reinforcement learning agents use 241
	    Introduction to function approximation for reinforcement learning 248
	    NFQ: The first attempt at value-based deep reinforcement learning 253
	9 More stable value-based methods 275
	    DQN: Making reinforcement learning more like supervised learning 276
	    Double DQN: Mitigating the overestimation of action-value functions 293
	10 Sample-efficient value-based methods 309
	    Dueling DDQN: A reinforcement-learning-aware neural network architecture 310
	    PER: Prioritizing the replay of meaningful experiences 323
	11 Policy-gradient and actor-critic methods 339
	    REINFORCE: Outcome-based policy learning 340
	    VPG: Learning a value function 350
	    A3C: Parallel policy updates 355
	    GAE: Robust advantage estimation 362
	    A2C: Synchronous policy updates 365
	12 Advanced actor-critic methods 375
	    DDPG: Approximating a deterministic policy 377
	    TD3: State-of-the-art improvements over DDPG 384
	    SAC: Maximizing the expected return and entropy 391
	    PPO: Restricting optimization steps 398
	13 Toward artificial general intelligence 411
	    What was covered and what notably wasn't? 413
	    More advanced concepts toward AGI 424
	    What happens next? 432
	index 437
adam_txt
any_adam_object	1
any_adam_object_boolean
author	Morales, Miguel
author_GND	(DE-588)1318329892 (DE-588)1242481133
author_facet	Morales, Miguel
author_role	aut
author_sort	Morales, Miguel
author_variant	m m mm
building	Verbundindex
bvnumber	BV047001780
classification_rvk	ST 302
ctrlnum	(OCoLC)1241670259 (DE-599)BVBBV047001780
discipline	Informatik
discipline_str_mv	Informatik
format	Book
id	DE-604.BV047001780
illustrated	Illustrated
index_date	2024-07-03T15:57:12Z
indexdate	2024-12-29T04:04:24Z
institution	BVB
isbn	9781617295454
language	English
oai_aleph_id	oai:aleph.bib-bvb.de:BVB01-032409393
oclc_num	1241670259
open_access_boolean
owner	DE-862 DE-BY-FWS DE-473 DE-BY-UBG
owner_facet	DE-862 DE-BY-FWS DE-473 DE-BY-UBG
physical	xxi, 447 Seiten Illustrationen, Diagramme
publishDate	2020
publishDateSearch	2020
publishDateSort	2020
publisher	Shelter Island
record_format	marc
spellingShingle	Morales, Miguel Grokking deep reinforcement learning
title	Grokking deep reinforcement learning
title_alt	Deep reinforcement learning
title_auth	Grokking deep reinforcement learning
title_exact_search	Grokking deep reinforcement learning
title_exact_search_txtP	Grokking deep reinforcement learning
title_full	Grokking deep reinforcement learning Miguel Morales ; foreword by Charles Isbell, Jr.
title_fullStr	Grokking deep reinforcement learning Miguel Morales ; foreword by Charles Isbell, Jr.
title_full_unstemmed	Grokking deep reinforcement learning Miguel Morales ; foreword by Charles Isbell, Jr.
title_short	Grokking deep reinforcement learning
title_sort	grokking deep reinforcement learning
url	http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032409393&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA
work_keys_str_mv	AT moralesmiguel grokkingdeepreinforcementlearning AT isbellcharles grokkingdeepreinforcementlearning AT moralesmiguel deepreinforcementlearning AT isbellcharles deepreinforcementlearning

Availability

THWS Schweinfurt Zentralbibliothek Lesesaal

Holdings of THWS Schweinfurt Zentralbibliothek Lesesaal
Call number:	2000 ST 302 M828
Copy 1	Loanable; checked out, due back by 10.02.2025
