Grokking deep reinforcement learning:
Main author: | Morales, Miguel |
---|---|
Format: | Book |
Language: | English |
Published: | Manning, Shelter Island, [2020] |
Online access: | Table of contents |
Description: | xxi, 447 pages, illustrations, diagrams |
ISBN: | 9781617295454 |
Internal format
MARC
LEADER | 00000nam a2200000 c 4500 | ||
---|---|---|---|
001 | BV047001780 | ||
003 | DE-604 | ||
005 | 20241203 | ||
007 | t| | ||
008 | 201117s2020 xx a||| |||| 00||| eng d | ||
020 | |a 9781617295454 |c pbk |9 978-1-61729-545-4 | ||
035 | |a (OCoLC)1241670259 | ||
035 | |a (DE-599)BVBBV047001780 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-862 |a DE-473 | ||
084 | |a ST 302 |0 (DE-625)143652: |2 rvk | ||
100 | 1 | |a Morales, Miguel |e Verfasser |0 (DE-588)1318329892 |4 aut | |
245 | 1 | 0 | |a Grokking deep reinforcement learning |c Miguel Morales ; foreword by Charles Isbell, Jr. |
246 | 1 | 3 | |a Deep reinforcement learning |
264 | 1 | |a Manning |b Shelter Island |c [2020] | |
264 | 4 | |c © 2020 | |
300 | |a xxi, 447 Seiten |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
700 | 1 | |a Isbell, Charles |d ca. 20. Jh. |0 (DE-588)1242481133 |4 wpr | |
856 | 4 | 2 | |m Digitalisierung UB Bamberg - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032409393&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
943 | 1 | |a oai:aleph.bib-bvb.de:BVB01-032409393 |
Record in the search index
DE-BY-862_location | 2000 |
---|---|
DE-BY-FWS_call_number | 2000/ST 302 M828 |
DE-BY-FWS_katkey | 857838 |
DE-BY-FWS_media_number | 083000518519 |
_version_ | 1819742269733863424 |
adam_text |
contents
foreword xi
preface xiii
acknowledgments xv
about this book xvii
about the author xxi
1 Introduction to deep reinforcement learning 1
  What is deep reinforcement learning? 2
  The past, present, and future of deep reinforcement learning 15
  The suitability of deep reinforcement learning 22
  Setting clear two-way expectations 25
2 Mathematical foundations of reinforcement learning 31
  Components of reinforcement learning 33
  MDPs: The engine of the environment 45
3 Balancing immediate and long-term goals 65
  The objective of a decision-making agent 66
  Planning optimal sequences of actions 77
4 Balancing the gathering and use of information 97
  The challenge of interpreting evaluative feedback 99
  Strategic exploration 117
5 Evaluating agents' behaviors 131
  Learning to estimate the value of policies 133
  Learning to estimate from multiple steps 150
6 Improving agents' behaviors 167
  The anatomy of reinforcement learning agents 168
  Learning to improve policies of behavior 176
  Decoupling behavior from learning 187
7 Achieving goals more effectively and efficiently 203
  Learning to improve policies using robust targets 205
  Agents that interact, learn, and plan 217
8 Introduction to value-based deep reinforcement learning 239
  The kind of feedback deep reinforcement learning agents use 241
  Introduction to function approximation for reinforcement learning 248
  NFQ: The first attempt at value-based deep reinforcement learning 253
9 More stable value-based methods 275
  DQN: Making reinforcement learning more like supervised learning 276
  Double DQN: Mitigating the overestimation of action-value functions 293
10 Sample-efficient value-based methods 309
  Dueling DDQN: A reinforcement-learning-aware neural network architecture 310
  PER: Prioritizing the replay of meaningful experiences 323
11 Policy-gradient and actor-critic methods 339
  REINFORCE: Outcome-based policy learning 340
  VPG: Learning a value function 350
  A3C: Parallel policy updates 355
  GAE: Robust advantage estimation 362
  A2C: Synchronous policy updates 365
12 Advanced actor-critic methods 375
  DDPG: Approximating a deterministic policy 377
  TD3: State-of-the-art improvements over DDPG 384
  SAC: Maximizing the expected return and entropy 391
  PPO: Restricting optimization steps 398
13 Toward artificial general intelligence 411
  What was covered and what notably wasn't? 413
  More advanced concepts toward AGI 424
  What happens next? 432
index 437 |
adam_txt | |
any_adam_object | 1 |
any_adam_object_boolean | |
author | Morales, Miguel |
author_GND | (DE-588)1318329892 (DE-588)1242481133 |
author_facet | Morales, Miguel |
author_role | aut |
author_sort | Morales, Miguel |
author_variant | m m mm |
building | Verbundindex |
bvnumber | BV047001780 |
classification_rvk | ST 302 |
ctrlnum | (OCoLC)1241670259 (DE-599)BVBBV047001780 |
discipline | Informatik |
discipline_str_mv | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>00000nam a2200000 c 4500</leader><controlfield tag="001">BV047001780</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20241203</controlfield><controlfield tag="007">t|</controlfield><controlfield tag="008">201117s2020 xx a||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781617295454</subfield><subfield code="c">pbk</subfield><subfield code="9">978-1-61729-545-4</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1241670259</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV047001780</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">rda</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-862</subfield><subfield code="a">DE-473</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 302</subfield><subfield code="0">(DE-625)143652:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Morales, Miguel</subfield><subfield code="e">Verfasser</subfield><subfield code="0">(DE-588)1318329892</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Grokking deep reinforcement learning</subfield><subfield code="c">Miguel Morales ; foreword by Charles Isbell, Jr.</subfield></datafield><datafield tag="246" ind1="1" ind2="3"><subfield code="a">Deep reinforcement learning</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Manning</subfield><subfield code="b">Shelter Island</subfield><subfield 
code="c">[2020]</subfield></datafield><datafield tag="264" ind1=" " ind2="4"><subfield code="c">© 2020</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">xxi, 447 Seiten</subfield><subfield code="b">Illustrationen, Diagramme</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Isbell, Charles</subfield><subfield code="d">ca. 20. Jh.</subfield><subfield code="0">(DE-588)1242481133</subfield><subfield code="4">wpr</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Bamberg - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032409393&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-032409393</subfield></datafield></record></collection> |
id | DE-604.BV047001780 |
illustrated | Illustrated |
index_date | 2024-07-03T15:57:12Z |
indexdate | 2024-12-29T04:04:24Z |
institution | BVB |
isbn | 9781617295454 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-032409393 |
oclc_num | 1241670259 |
open_access_boolean | |
owner | DE-862 DE-BY-FWS DE-473 DE-BY-UBG |
owner_facet | DE-862 DE-BY-FWS DE-473 DE-BY-UBG |
physical | xxi, 447 Seiten Illustrationen, Diagramme |
publishDate | 2020 |
publishDateSearch | 2020 |
publishDateSort | 2020 |
publisher | Shelter Island |
record_format | marc |
spellingShingle | Morales, Miguel Grokking deep reinforcement learning |
title | Grokking deep reinforcement learning |
title_alt | Deep reinforcement learning |
title_auth | Grokking deep reinforcement learning |
title_exact_search | Grokking deep reinforcement learning |
title_exact_search_txtP | Grokking deep reinforcement learning |
title_full | Grokking deep reinforcement learning Miguel Morales ; foreword by Charles Isbell, Jr. |
title_fullStr | Grokking deep reinforcement learning Miguel Morales ; foreword by Charles Isbell, Jr. |
title_full_unstemmed | Grokking deep reinforcement learning Miguel Morales ; foreword by Charles Isbell, Jr. |
title_short | Grokking deep reinforcement learning |
title_sort | grokking deep reinforcement learning |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=032409393&sequence=000001&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT moralesmiguel grokkingdeepreinforcementlearning AT isbellcharles grokkingdeepreinforcementlearning AT moralesmiguel deepreinforcementlearning AT isbellcharles deepreinforcementlearning |
Table of contents
THWS Schweinfurt Zentralbibliothek, reading room
Call number: |
2000 ST 302 M828 |
---|---|
Copy 1 | Available for loan; checked out, due back: 10.02.2025 |