Deep reinforcement learning with Python : master classic RL, deep RL, distributional RL, inverse RL, and more with OpenAI Gym and TensorFlow /
Saved in:
Main Author: | Ravichandiran, Sudharsan |
---|---|
Format: | Electronic eBook |
Language: | English |
Published: | Birmingham, UK : Packt Publishing Ltd., 2020. |
Edition: | Second edition. |
Series: | Expert insight. |
Subjects: | Reinforcement learning ; Python (Computer program language) |
Online Access: | Full text |
Summary: | Deep Reinforcement Learning with Python - Second Edition will help you learn reinforcement learning algorithms, techniques and architectures - including deep reinforcement learning - from scratch. This new edition is an extensive update of the original, reflecting the state-of-the-art latest thinking in reinforcement learning. |
Description: | Previous edition published in 2018. The blackjack environment in the Gym library. |
Description: | 1 online resource (xxi, 730 pages) : illustrations |
ISBN: | 9781839215599 1839215593 |
Internal format
MARC
LEADER | 00000cam a2200000Mu 4500 | ||
---|---|---|---|
001 | ZDB-4-EBA-on1202475325 | ||
003 | OCoLC | ||
005 | 20241004212047.0 | ||
006 | m o d | ||
007 | cr unu---uuuuu | ||
008 | 201031s2020 enka o 001 0 eng d | ||
040 | |a EBLCP |b eng |c EBLCP |d YDX |d N$T |d OCLCF |d OCLCO |d GPM |d OCLCQ |d OCLCO |d OCLCL |d TMA |d OCLCQ | ||
019 | |a 1198892819 | ||
020 | |a 9781839215599 | ||
020 | |a 1839215593 | ||
035 | |a (OCoLC)1202475325 |z (OCoLC)1198892819 | ||
050 | 4 | |a Q325.6 |b .R38 2020 | |
082 | 7 | |a 006.31 |2 23 | |
049 | |a MAIN | ||
100 | 1 | |a Ravichandiran, Sudharsan, |e author. | |
245 | 1 | 0 | |a Deep reinforcement learning with Python : |b master classic RL, deep RL, distributional RL, inverse RL, and more with OpenAI Gym and TensorFlow / |c Sudharsan Ravichandiran. |
250 | |a Second edition. | ||
264 | 1 | |a Birmingham, UK : |b Packt Publishing Ltd., |c 2020. | |
300 | |a 1 online resource (xxi, 730 pages) : |b illustrations | ||
336 | |a text |b txt |2 rdacontent | ||
337 | |a computer |b c |2 rdamedia | ||
338 | |a online resource |b cr |2 rdacarrier | ||
490 | 1 | |a Expert insight | |
500 | |a Previous edition published in 2018. | ||
505 | 0 | |a Cover -- Copyright -- Packt Page -- Contributors -- Table of Contents -- Preface -- Chapter 1: Fundamentals of Reinforcement Learning -- Key elements of RL -- Agent -- Environment -- State and action -- Reward -- The basic idea of RL -- The RL algorithm -- RL agent in the grid world -- How RL differs from other ML paradigms -- Markov Decision Processes -- The Markov property and Markov chain -- The Markov Reward Process -- The Markov Decision Process -- Fundamental concepts of RL -- Math essentials -- Expectation -- Action space -- Policy -- Deterministic policy -- Stochastic policy | |
505 | 8 | |a Episode -- Episodic and continuous tasks -- Horizon -- Return and discount factor -- Small discount factor -- Large discount factor -- What happens when we set the discount factor to 0? -- What happens when we set the discount factor to 1? -- The value function -- Q function -- Model-based and model-free learning -- Different types of environments -- Deterministic and stochastic environments -- Discrete and continuous environments -- Episodic and non-episodic environments -- Single and multi-agent environments -- Applications of RL -- RL glossary -- Summary -- Questions -- Further reading | |
505 | 8 | |a Chapter 2: A Guide to the Gym Toolkit -- Setting up our machine -- Installing Anaconda -- Installing the Gym toolkit -- Common error fixes -- Creating our first Gym environment -- Exploring the environment -- States -- Actions -- Transition probability and reward function -- Generating an episode in the Gym environment -- Action selection -- Generating an episode -- More Gym environments -- Classic control environments -- State space -- Action space -- Cart-Pole balancing with random policy -- Atari game environments -- General environment -- Deterministic environment -- No frame skipping | |
505 | 8 | |a State and action space -- An agent playing the Tennis game -- Recording the game -- Other environments -- Box2D -- MuJoCo -- Robotics -- Toy text -- Algorithms -- Environment synopsis -- Summary -- Questions -- Further reading -- Chapter 3: The Bellman Equation and Dynamic Programming -- The Bellman equation -- The Bellman equation of the value function -- The Bellman equation of the Q function -- The Bellman optimality equation -- The relationship between the value and Q functions -- Dynamic programming -- Value iteration -- The value iteration algorithm | |
505 | 8 | |a Solving the Frozen Lake problem with value iteration -- Policy iteration -- Algorithm -- policy iteration -- Solving the Frozen Lake problem with policy iteration -- Is DP applicable to all environments? -- Summary -- Questions -- Chapter 4: Monte Carlo Methods -- Understanding the Monte Carlo method -- Prediction and control tasks -- Prediction task -- Control task -- Monte Carlo prediction -- MC prediction algorithm -- Types of MC prediction -- First-visit Monte Carlo -- Every-visit Monte Carlo -- Implementing the Monte Carlo prediction method -- Understanding the blackjack game | |
500 | |a The blackjack environment in the Gym library. | ||
520 | |a Deep Reinforcement Learning with Python - Second Edition will help you learn reinforcement learning algorithms, techniques and architectures - including deep reinforcement learning - from scratch. This new edition is an extensive update of the original, reflecting the state-of-the-art latest thinking in reinforcement learning. | ||
650 | 0 | |a Reinforcement learning. |0 http://id.loc.gov/authorities/subjects/sh92000704 | |
650 | 0 | |a Python (Computer program language) |0 http://id.loc.gov/authorities/subjects/sh96008834 | |
650 | 6 | |a Apprentissage par renforcement (Intelligence artificielle) | |
650 | 6 | |a Python (Langage de programmation) | |
650 | 7 | |a Python (Computer program language) |2 fast | |
650 | 7 | |a Reinforcement learning |2 fast | |
758 | |i has work: |a Deep reinforcement learning with Python (Text) |1 https://id.oclc.org/worldcat/entity/E39PCG9CRGRdtwWk3YpKfxQT9C |4 https://id.oclc.org/worldcat/ontology/hasWork | ||
776 | 0 | 8 | |i Print version: |a Ravichandiran, Sudharsan |t Deep Reinforcement Learning with Python : Master Classic RL, Deep RL, Distributional RL, Inverse RL, and More with OpenAI Gym and TensorFlow, 2nd Edition |d Birmingham : Packt Publishing, Limited,c2020 |z 9781839210686 |
830 | 0 | |a Expert insight. |0 http://id.loc.gov/authorities/names/no2019019794 | |
856 | 4 | 0 | |l FWS01 |p ZDB-4-EBA |q FWS_PDA_EBA |u https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&AN=2640444 |3 Volltext |
938 | |a ProQuest Ebook Central |b EBLB |n EBL6362643 | ||
938 | |a YBP Library Services |b YANK |n 301586817 | ||
938 | |a EBSCOhost |b EBSC |n 2640444 | ||
994 | |a 92 |b GEBAY | ||
912 | |a ZDB-4-EBA | ||
049 | |a DE-863 |
Record in the search index
DE-BY-FWS_katkey | ZDB-4-EBA-on1202475325 |
---|---|
_version_ | 1816882531822206976 |
adam_text | |
any_adam_object | |
author | Ravichandiran, Sudharsan |
author_facet | Ravichandiran, Sudharsan |
author_role | aut |
author_sort | Ravichandiran, Sudharsan |
author_variant | s r sr |
building | Verbundindex |
bvnumber | localFWS |
callnumber-first | Q - Science |
callnumber-label | Q325 |
callnumber-raw | Q325.6 .R38 2020 |
callnumber-search | Q325.6 .R38 2020 |
callnumber-sort | Q 3325.6 R38 42020 |
callnumber-subject | Q - General Science |
collection | ZDB-4-EBA |
ctrlnum | (OCoLC)1202475325 |
dewey-full | 006.31 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.31 |
dewey-search | 006.31 |
dewey-sort | 16.31 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
edition | Second edition. |
format | Electronic eBook |
id | ZDB-4-EBA-on1202475325 |
illustrated | Illustrated |
indexdate | 2024-11-27T13:30:06Z |
institution | BVB |
isbn | 9781839215599 1839215593 |
language | English |
oclc_num | 1202475325 |
open_access_boolean | |
owner | MAIN DE-863 DE-BY-FWS |
owner_facet | MAIN DE-863 DE-BY-FWS |
physical | 1 online resource (xxi, 730 pages) : illustrations |
psigel | ZDB-4-EBA |
publishDate | 2020 |
publishDateSearch | 2020 |
publishDateSort | 2020 |
publisher | Packt Publishing Ltd., |
record_format | marc |
series | Expert insight. |
series2 | Expert insight |
subject_GND | http://id.loc.gov/authorities/subjects/sh92000704 http://id.loc.gov/authorities/subjects/sh96008834 |
title | Deep reinforcement learning with Python : master classic RL, deep RL, distributional RL, inverse RL, and more with OpenAI Gym and TensorFlow / |
title_auth | Deep reinforcement learning with Python : master classic RL, deep RL, distributional RL, inverse RL, and more with OpenAI Gym and TensorFlow / |
title_exact_search | Deep reinforcement learning with Python : master classic RL, deep RL, distributional RL, inverse RL, and more with OpenAI Gym and TensorFlow / |
title_full | Deep reinforcement learning with Python : master classic RL, deep RL, distributional RL, inverse RL, and more with OpenAI Gym and TensorFlow / Sudharsan Ravichandiran. |
title_fullStr | Deep reinforcement learning with Python : master classic RL, deep RL, distributional RL, inverse RL, and more with OpenAI Gym and TensorFlow / Sudharsan Ravichandiran. |
title_full_unstemmed | Deep reinforcement learning with Python : master classic RL, deep RL, distributional RL, inverse RL, and more with OpenAI Gym and TensorFlow / Sudharsan Ravichandiran. |
title_short | Deep reinforcement learning with Python : |
title_sort | deep reinforcement learning with python master classic rl deep rl distributional rl inverse rl and more with openai gym and tensorflow |
title_sub | master classic RL, deep RL, distributional RL, inverse RL, and more with OpenAI Gym and TensorFlow / |
topic | Reinforcement learning. http://id.loc.gov/authorities/subjects/sh92000704 Python (Computer program language) http://id.loc.gov/authorities/subjects/sh96008834 Apprentissage par renforcement (Intelligence artificielle) Python (Langage de programmation) Python (Computer program language) fast Reinforcement learning fast |
topic_facet | Reinforcement learning. Python (Computer program language) Apprentissage par renforcement (Intelligence artificielle) Python (Langage de programmation) Reinforcement learning |
url | https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&AN=2640444 |
work_keys_str_mv | AT ravichandiransudharsan deepreinforcementlearningwithpythonmasterclassicrldeeprldistributionalrlinverserlandmorewithopenaigymandtensorflow |