Multi-agent coordination: a reinforcement learning approach
Main authors: | Sadhu, Arup Kumar; Konar, Amit |
Format: | Electronic eBook |
Language: | English |
Published: | Piscataway, NJ : IEEE Press, 2021; Hoboken, NJ : Wiley |
Subjects: | Mehragentensystem; Bestärkendes Lernen (Künstliche Intelligenz) |
Online access: | FCO01 FHI01 TUM01 |
Description: | Description based on publisher supplied metadata and other sources |
Physical description: | 1 online resource (xxii, 296 pages), illustrations, diagrams |
ISBN: | 9781119698999 9781119699026 9781119699057 |
MARC record (internal format)
LEADER | 00000nmm a2200000zc 4500 | ||
001 | BV047442412 | ||
003 | DE-604 | ||
005 | 20240219 | ||
007 | cr|uuu---uuuuu | ||
008 | 210827s2021 |||| o||u| ||||||eng d | ||
020 | |a 9781119698999 |q PDF |9 978-1-119-69899-9 | ||
020 | |a 9781119699026 |q EPUB |9 978-1-119-69902-6 | ||
020 | |a 9781119699057 |c online |9 978-1-119-69905-7 | ||
035 | |a (ZDB-30-PQE)EBC6413914 | ||
035 | |a (ZDB-30-PAD)EBC6413914 | ||
035 | |a (ZDB-89-EBL)EBL6413914 | ||
035 | |a (OCoLC)1225551708 | ||
035 | |a (DE-599)BVBBV047442412 | ||
040 | |a DE-604 |b ger |e rda | ||
041 | 0 | |a eng | |
049 | |a DE-91 |a DE-858 |a DE-573 | ||
082 | 0 | |a 006.31 | |
084 | |a ST 300 |0 (DE-625)143650: |2 rvk | ||
084 | |a DAT 708 |2 stub | ||
084 | |a DAT 815 |2 stub | ||
100 | 1 | |a Sadhu, Arup Kumar |d ca. 20./21. Jh. |e Verfasser |0 (DE-588)1249719453 |4 aut | |
245 | 1 | 0 | |a Multi-agent coordination |b a reinforcement learning approach |c Arup Kumar Sadhu, Amit Konar |
264 | 1 | |a Piscataway, NJ |b IEEE Press |c 2021 | |
264 | 1 | |a Hoboken, NJ |b Wiley | |
264 | 4 | |c © 2021 | |
300 | |a 1 Online-Ressource (xxii, 296 Seiten) |b Illustrationen, Diagramme | ||
336 | |b txt |2 rdacontent | ||
337 | |b c |2 rdamedia | ||
338 | |b cr |2 rdacarrier | ||
500 | |a Description based on publisher supplied metadata and other sources | ||
505 | 8 | |a Cover -- Title Page -- Copyright Page -- Contents -- Preface -- Acknowledgments -- Chapter 1 Introduction: Multi-agent Coordination by Reinforcement Learning and Evolutionary Algorithms -- 1.1 Introduction -- 1.2 Single Agent Planning -- 1.2.1 Terminologies Used in Single Agent Planning -- 1.2.2 Single Agent Search-Based Planning Algorithms -- 1.2.2.1 Dijkstra's Algorithm -- 1.2.2.2 A* (A-star) Algorithm -- 1.2.2.3 D* (D-star) Algorithm -- 1.2.2.4 Planning by STRIPS-Like Language -- 1.2.3 Single Agent RL -- 1.2.3.1 Multiarmed Bandit Problem -- 1.2.3.2 DP and Bellman Equation -- 1.2.3.3 Correlation Between RL and DP -- 1.2.3.4 Single Agent Q-Learning -- 1.2.3.5 Single Agent Planning Using Q-Learning -- 1.3 Multi-agent Planning and Coordination -- 1.3.1 Terminologies Related to Multi-agent Coordination -- 1.3.2 Classification of MAS -- 1.3.3 Game Theory for Multi-agent Coordination -- 1.3.3.1 Nash Equilibrium -- 1.3.3.2 Correlated Equilibrium -- 1.3.3.3 Static Game Examples -- 1.3.4 Correlation Among RL, DP, and GT -- 1.3.5 Classification of MARL -- 1.3.5.1 Cooperative MARL -- 1.3.5.2 Competitive MARL -- 1.3.5.3 Mixed MARL -- 1.3.6 Coordination and Planning by MAQL -- 1.3.7 Performance Analysis of MAQL and MAQL-Based Coordination -- 1.4 Coordination by Optimization Algorithm -- 1.4.1 PSO Algorithm -- 1.4.2 Firefly Algorithm -- 1.4.2.1 Initialization -- 1.4.2.2 Attraction to Brighter Fireflies -- 1.4.2.3 Movement of Fireflies -- 1.4.3 Imperialist Competitive Algorithm -- 1.4.3.1 Initialization -- 1.4.3.2 Selection of Imperialists and Colonies -- 1.4.3.3 Formation of Empires -- 1.4.3.4 Assimilation of Colonies -- 1.4.3.5 Revolution -- 1.4.3.6 Imperialistic Competition -- 1.4.4 Differential Evolution Algorithm -- 1.4.4.1 Initialization -- 1.4.4.2 Mutation -- 1.4.4.3 Recombination -- 1.4.4.4 Selection -- 1.4.5 Off-line Optimization | |
505 | 8 | |a 1.4.6 Performance Analysis of Optimization Algorithms -- 1.4.6.1 Friedman Test -- 1.4.6.2 Iman-Davenport Test -- 1.5 Summary -- References -- Chapter 2 Improve Convergence Speed of Multi-Agent Q-Learning for Cooperative Task Planning -- 2.1 Introduction -- 2.2 Literature Review -- 2.3 Preliminaries -- 2.3.1 Single Agent Q-learning -- 2.3.2 Multi-agent Q-learning -- 2.4 Proposed MAQL -- 2.4.1 Two Useful Properties -- 2.5 Proposed FCMQL Algorithms and Their Convergence Analysis -- 2.5.1 Proposed FCMQL Algorithms -- 2.5.2 Convergence Analysis of the Proposed FCMQL Algorithms -- 2.6 FCMQL-Based Cooperative Multi-agent Planning -- 2.7 Experiments and Results -- 2.8 Conclusions -- 2.9 Summary -- 2.A More Details on Experimental Results -- 2.A.1 Additional Details of Experiment 2.1 -- 2.A.2 Additional Details of Experiment 2.2 -- 2.A.3 Additional Details of Experiment 2.4 -- References -- Chapter 3 Consensus Q-Learning for Multi-agent Cooperative Planning -- 3.1 Introduction -- 3.2 Preliminaries -- 3.2.1 Single Agent Q-Learning -- 3.2.2 Equilibrium-Based Multi-agent Q-Learning -- 3.3 Consensus -- 3.4 Proposed CoQL and Planning -- 3.4.1 Consensus Q-Learning -- 3.4.2 Consensus-Based Multi-robot Planning -- 3.5 Experiments and Results -- 3.5.1 Experimental Setup -- 3.5.2 Experiments for CoQL -- 3.5.3 Experiments for Consensus-Based Planning -- 3.6 Conclusions -- 3.7 Summary -- References -- Chapter 4 An Efficient Computing of Correlated Equilibrium for Cooperative Q-Learning-Based Multi-Robot Planning -- 4.1 Introduction -- 4.2 Single-Agent Q-Learning and Equilibrium-Based MAQL -- 4.2.1 Single Agent Q-Learning -- 4.2.2 Equilibrium-Based MAQL -- 4.3 Proposed Cooperative MAQL and Planning -- 4.3.1 Proposed Schemes with Their Applicability -- 4.3.2 Immediate Rewards in Scheme-I and -II -- 4.3.3 Scheme-I-Induced MAQL -- 4.3.4 Scheme-II-Induced MAQL. | |
505 | 8 | |a 4.3.5 Algorithms for Scheme-I and II -- 4.3.6 Constraint OmegaQL-I/OmegaQL-II(COmegaQL-I/COmegaQL-II) -- 4.3.7 Convergence -- 4.3.8 Multi-agent Planning -- 4.4 Complexity Analysis -- 4.4.1 Complexity of CQL -- 4.4.1.1 Space Complexity -- 4.4.1.2 Time Complexity -- 4.4.2 Complexity of the Proposed Algorithms -- 4.4.2.1 Space Complexity -- 4.4.2.2 Time Complexity -- 4.4.3 Complexity Comparison -- 4.4.3.1 Space Complexity -- 4.4.3.2 Time Complexity -- 4.5 Simulation and Experimental Results -- 4.5.1 Experimental Platform -- 4.5.1.1 Simulation -- 4.5.1.2 Hardware -- 4.5.2 Experimental Approach -- 4.5.2.1 Learning Phase -- 4.5.2.2 Planning Phase -- 4.5.3 Experimental Results -- 4.6 Conclusion -- 4.7 Summary -- References -- Chapter 5 A Modified Imperialist Competitive Algorithm for Multi-Robot Stick-Carrying Application -- 5.1 Introduction -- 5.2 Problem Formulation for Multi-Robot Stick-Carrying -- 5.3 Proposed Hybrid Algorithm -- 5.3.1 An Overview of ICA -- 5.3.1.1 Initialization -- 5.3.1.2 Selection of Imperialists and Colonies -- 5.3.1.3 Formation of Empires -- 5.3.1.4 Assimilation of Colonies -- 5.3.1.5 Revolution -- 5.3.1.6 Imperialistic Competition -- 5.3.1.6.1 Total Empire Power Evaluation -- 5.3.1.6.2 Reassignment of Colonies and Removal of Empire -- 5.3.1.6.3 Union of Empires -- 5.4 An Overview of FA -- 5.4.1 Initialization -- 5.4.2 Attraction to Brighter Fireflies -- 5.4.3 Movement of Fireflies -- 5.5 Proposed ICFA -- 5.5.1 Assimilation of Colonies -- 5.5.1.1 Attraction to Powerful Colonies -- 5.5.1.2 Modification of Empire Behavior -- 5.5.1.3 Union of Empires -- 5.6 Simulation Results -- 5.6.1 Comparative Framework -- 5.6.2 Parameter Settings -- 5.6.3 Analysis on Explorative Power of ICFA -- 5.6.4 Comparison of Quality of the Final Solution -- 5.6.5 Performance Analysis -- 5.7 Computer Simulation and Experiment | |
505 | 8 | |a 5.7.1 Average Total Path Deviation (ATPD) -- 5.7.2 Average Uncovered Target Distance (AUTD) -- 5.7.3 Experimental Setup in Simulation Environment -- 5.7.4 Experimental Results in Simulation Environment -- 5.7.5 Experimental Setup with Khepera Robots -- 5.7.6 Experimental Results with Khepera Robots -- 5.8 Conclusion -- 5.9 Summary -- References -- Chapter 6 Conclusions and Future Directions -- 6.1 Conclusions -- 6.2 Future Directions -- Index -- EULA. | |
650 | 0 | 7 | |a Mehragentensystem |0 (DE-588)4389058-1 |2 gnd |9 rswk-swf |
650 | 0 | 7 | |a Bestärkendes Lernen |g Künstliche Intelligenz |0 (DE-588)4825546-4 |2 gnd |9 rswk-swf |
689 | 0 | 0 | |a Mehragentensystem |0 (DE-588)4389058-1 |D s |
689 | 0 | 1 | |a Bestärkendes Lernen |g Künstliche Intelligenz |0 (DE-588)4825546-4 |D s |
689 | 0 | |5 DE-604 | |
700 | 1 | |a Konar, Amit |d 1963- |e Verfasser |0 (DE-588)1063311365 |4 aut | |
776 | 0 | 8 | |i Erscheint auch als |a Sadhu, Arup Kumar |t Multi-Agent Coordination |d Newark : John Wiley & Sons, Incorporated,c2020 |n Druck-Ausgabe |z 978-1-119-69903-3 |
912 | |a ZDB-30-PQE |a ZDB-35-WIC |a ZDB-35-WEL | ||
999 | |a oai:aleph.bib-bvb.de:BVB01-032844564 | ||
966 | e | |u https://onlinelibrary.wiley.com/doi/book/10.1002/9781119699057 |l FCO01 |p ZDB-35-WIC |q FCO_PDA_WIC_Kauf |x Verlag |3 Volltext | |
966 | e | |u https://ieeexplore.ieee.org/servlet/opac?bknumber=9292527 |l FHI01 |p ZDB-35-WEL |x Verlag |3 Volltext | |
966 | e | |u https://ebookcentral.proquest.com/lib/munchentech/detail.action?docID=6413914 |l TUM01 |p ZDB-30-PQE |q TUM_PDA_PQE_Kauf |x Aggregator |3 Volltext |