Reinforcement learning algorithms with Python: learn, understand, and develop smart algorithms for addressing AI challenges
Saved in:

Main author: Lonza, Andrea
Format: Book
Language: English
Published: Birmingham ; Mumbai : Packt Publishing, October 2019
Subjects: Python (programming language); Operant conditioning
Summary: With this book, you will understand the core concepts and techniques of reinforcement learning. You will look into each RL algorithm and develop your own self-learning algorithms and models. You will optimize the algorithms for better precision, use high-speed actions, and lower the risk of anomalies in your applications.
Note: Implementing REINFORCE with baseline
Physical description: vii, 351 pages : illustrations, diagrams
ISBN: 9781789131116
Internal format
MARC
LEADER 00000nam a2200000 c 4500
001    BV046768925
003    DE-604
007    t
008    200617s2019 a||| |||| 00||| eng d
020 __ |a 9781789131116 |9 978-1-78913-111-6
035 __ |a (OCoLC)1164646408
035 __ |a (DE-599)BVBBV046768925
040 __ |a DE-604 |b ger |e rda
041 0_ |a eng
049 __ |a DE-573 |a DE-858 |a DE-898
084 __ |a ST 250 |0 (DE-625)143626: |2 rvk
084 __ |a ST 300 |0 (DE-625)143650: |2 rvk
100 1_ |a Lonza, Andrea |e Verfasser |4 aut
245 10 |a Reinforcement learning algorithms with Python |b learn, understand, and develop smart algorithms for addressing AI challenges |c Andrea Lonza
264 _1 |a Birmingham ; Mumbai |b Packt Publishing |c October 2019
300 __ |a vii, 351 Seiten |b Illustrationen, Diagramme
336 __ |b txt |2 rdacontent
337 __ |b n |2 rdamedia
338 __ |b nc |2 rdacarrier
500 __ |a Implementing REINFORCE with baseline
505 8_ |a Cover; Title Page; Copyright and Credits; Dedication; About Packt; Contributors; Table of Contents; Preface; Section 1: Algorithms and Environments; Chapter 1: The Landscape of Reinforcement Learning; An introduction to RL; Comparing RL and supervised learning; History of RL; Deep RL; Elements of RL; Policy; The value function; Reward; Model; Applications of RL; Games; Robotics and Industry 4.0; Machine learning; Economics and finance; Healthcare; Intelligent transportation systems; Energy optimization and smart grid; Summary; Questions; Further reading
505 8_ |a Chapter 2: Implementing RL Cycle and OpenAI Gym; Setting up the environment; Installing OpenAI Gym; Installing Roboschool; OpenAI Gym and RL cycles; Developing an RL cycle; Getting used to spaces; Development of ML models using TensorFlow; Tensor; Constant; Placeholder; Variable; Creating a graph; Simple linear regression example; Introducing TensorBoard; Types of RL environments; Why different environments?; Open source environments; Summary; Questions; Further reading; Chapter 3: Solving Problems with Dynamic Programming; MDP; Policy; Return; Value functions; Bellman equation
505 8_ |a Categorizing RL algorithms; Model-free algorithms; Value-based algorithms; Policy gradient algorithms; Actor-Critic algorithms; Hybrid algorithms; Model-based RL; Algorithm diversity; Dynamic programming; Policy evaluation and policy improvement; Policy iteration; Policy iteration applied to FrozenLake; Value iteration; Value iteration applied to FrozenLake; Summary; Questions; Further reading; Section 2: Model-Free RL Algorithms; Chapter 4: Q-Learning and SARSA Applications; Learning without a model; User experience; Policy evaluation; The exploration problem; Why explore?; How to explore
505 8_ |a TD learning; TD update; Policy improvement; Comparing Monte Carlo and TD; SARSA; The algorithm; Applying SARSA to Taxi-v2; Q-learning; Theory; The algorithm; Applying Q-learning to Taxi-v2; Comparing SARSA and Q-learning; Summary; Questions; Chapter 5: Deep Q-Network; Deep neural networks and Q-learning; Function approximation; Q-learning with neural networks; Deep Q-learning instabilities; DQN; The solution; Replay memory; The target network; The DQN algorithm; The loss function; Pseudocode; Model architecture; DQN applied to Pong; Atari games; Preprocessing; DQN implementation; DNNs
505 8_ |a The experienced buffer; The computational graph and training loop; Results; DQN variations; Double DQN; DDQN implementation; Results; Dueling DQN; Dueling DQN implementation; Results; N-step DQN; Implementation; Results; Summary; Questions; Further reading; Chapter 6: Learning Stochastic and PG Optimization; Policy gradient methods; The gradient of the policy; Policy gradient theorem; Computing the gradient; The policy; On-policy PG; Understanding the REINFORCE algorithm; Implementing REINFORCE; Landing a spacecraft using REINFORCE; Analyzing the results; REINFORCE with baseline
520 3_ |a With this book, you will understand the core concepts and techniques of reinforcement learning. You will take a look into each RL algorithm and will develop your own self-learning algorithms and models. You will optimize the algorithms for better precision, use high-speed actions and lower the risk of anomalies in your applications
650 07 |a Python |g Programmiersprache |0 (DE-588)4434275-5 |2 gnd |9 rswk-swf
650 07 |a Operante Konditionierung |0 (DE-588)4172613-3 |2 gnd |9 rswk-swf
653 _0 |a Computer algorithms
653 _0 |a Python (Computer program language)
653 _6 |a Electronic books
689 00 |a Python |g Programmiersprache |0 (DE-588)4434275-5 |D s
689 01 |a Operante Konditionierung |0 (DE-588)4172613-3 |D s
689 0_ |5 DE-604
776 08 |i Erscheint auch als |n Online-Ausgabe |z 978-1-78913-970-9