Algorithms for Games AI

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Algorithms for Multidisciplinary Applications".

Deadline for manuscript submissions: 20 June 2024 | Viewed by 28856

Special Issue Editors


Guest Editor
Institute for Artificial Intelligence, Peking University, Beijing 100871, China
Interests: game AI; reinforcement learning applications; game design

Assistant Guest Editor
Institute of Automation, University of Chinese Academy of Sciences, Beijing 100190, China
Interests: multi-agent reinforcement learning; computational advertising; game agent; agent confrontation platform

Special Issue Information

Dear Colleagues,

We invite you to submit your latest research in the area of game AI algorithms to this Special Issue, Algorithms for Games AI. We are looking for new and innovative approaches to solving game AI problems, whether theoretical or empirical. Submissions are welcome both for traditional game AI algorithms (planning, tree search, etc.) and for newer algorithms (deep reinforcement learning, etc.). Potential topics include, but are not limited to: the history of game AI; the development of Monte Carlo tree search and other tree search algorithms; and the theoretical analysis of reinforcement learning algorithms or their application to specific games.

Prof. Dr. Wenxin Li
Dr. Haifeng Zhang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (9 papers)


Research


15 pages, 2977 KiB  
Article
Learning State-Specific Action Masks for Reinforcement Learning
by Ziyi Wang, Xinran Li, Luoyang Sun, Haifeng Zhang, Hualin Liu and Jun Wang
Algorithms 2024, 17(2), 60; https://doi.org/10.3390/a17020060 - 30 Jan 2024
Viewed by 1277
Abstract
Efficient yet sufficient exploration remains a critical challenge in reinforcement learning (RL), especially for Markov Decision Processes (MDPs) with vast action spaces. Previous approaches have commonly projected the original action space into a latent space or employed environmental action masks to reduce the number of action possibilities. These methods, however, often lack interpretability or rely on expert knowledge. In this study, we introduce a novel method for automatically reducing the action space in environments with discrete action spaces while preserving interpretability. The proposed approach learns state-specific masks with a dual purpose: (1) eliminating actions with minimal influence on the MDP and (2) aggregating actions with identical behavioral consequences within the MDP. Specifically, we introduce a novel concept, Bisimulation Metrics on Actions by States (BMAS), to quantify the behavioral consequences of actions within the MDP, and we design a dedicated mask model to ensure the binary nature of the masks. Crucially, we present a practical procedure for training the mask model from transition data collected by any RL policy. The method is plug-and-play and adaptable to all RL policies; to validate its effectiveness, we integrate it into two prominent RL algorithms, DQN and PPO. Experimental results obtained from Maze, Atari, and μRTS2 reveal that the introduced approach substantially accelerates RL training and yields noteworthy performance improvements.
(This article belongs to the Special Issue Algorithms for Games AI)
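As a concrete illustration of how a learned state-specific mask plugs into value-based action selection, the sketch below masks pruned actions out of greedy selection by setting their Q-values to negative infinity. This is generic action masking under assumed array shapes, not the paper's BMAS metric or mask model:

```python
import numpy as np

def masked_greedy_action(q_values, mask):
    """Greedy action restricted to the actions a state-specific mask keeps.

    q_values: (n_actions,) array of Q-value estimates for the current state
    mask:     (n_actions,) binary array; 1 keeps an action, 0 prunes it
    """
    # Pruned actions get -inf so they can never win the argmax.
    masked_q = np.where(mask.astype(bool), q_values, -np.inf)
    return int(np.argmax(masked_q))

q = np.array([0.2, 1.5, -0.3, 0.9])
m = np.array([1, 0, 1, 1])  # suppose the learned mask prunes action 1
print(masked_greedy_action(q, m))  # action 3: best among the unmasked actions
```

Aggregating behaviorally identical actions would additionally map several action indices onto one representative before the argmax.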

22 pages, 3492 KiB  
Article
Optimizing Reinforcement Learning Using a Generative Action-Translator Transformer
by Jiaming Li, Ning Xie and Tingting Zhao
Algorithms 2024, 17(1), 37; https://doi.org/10.3390/a17010037 - 16 Jan 2024
Viewed by 1731
Abstract
In recent years, with the rapid advancement of Natural Language Processing (NLP) technologies, large models have become widespread. Traditional reinforcement learning algorithms have also begun experimenting with language models to optimize training. However, they still fundamentally rely on the Markov Decision Process (MDP) and do not fully exploit the advantages of language models in handling long sequences. The Decision Transformer (DT), introduced in 2021, was the first attempt to recast the reinforcement learning problem entirely as a challenge within the NLP domain, using text generation techniques to produce reinforcement learning trajectories and thereby address the search for optimal trajectories. However, DT feeds the reinforcement learning training trajectories directly into a basic language model and aims to predict the entire trajectory, including state and reward information. This deviates from the reinforcement learning objective of finding the optimal action, and the redundant information in the output impairs the agent's final training effectiveness. This paper proposes a more suitable network structure, the Action-Translator Transformer (ATT), which predicts only the agent's next action, making the language model more interpretable for the reinforcement learning problem. We test our model in simulated gaming scenarios and compare it with current mainstream methods in offline reinforcement learning. The experimental results show that our model achieves superior performance. We hope that this model will inspire new ideas for combining language models and reinforcement learning and provide fresh perspectives for offline reinforcement learning research.
(This article belongs to the Special Issue Algorithms for Games AI)
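The contrast the abstract draws, predicting the whole trajectory versus only the next action, comes down to where the training loss is applied. A schematic sketch (the interleaved token layout follows the DT convention; the per-token losses are stand-in numbers, and this is not the ATT architecture itself):

```python
import numpy as np

# A DT-style input interleaves return-to-go, state, and action tokens:
# (R_0, s_0, a_0, R_1, s_1, a_1, ...).  Training the model to predict
# every position makes it reproduce states and rewards as well; an
# ATT-style objective keeps the loss only at the action positions.
seq_len = 9                                        # three (R, s, a) triples
per_token_loss = np.linspace(1.0, 2.0, seq_len)    # stand-in per-token losses
is_action = np.arange(seq_len) % 3 == 2            # positions 2, 5, 8

dt_loss = per_token_loss.mean()                    # supervise the whole trajectory
att_loss = per_token_loss[is_action].mean()        # supervise next actions only
```

Supervising only action positions keeps the training target aligned with what the agent actually needs to output at decision time.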

18 pages, 967 KiB  
Article
Reducing Q-Value Estimation Bias via Mutual Estimation and Softmax Operation in MADRL
by Zheng Li, Xinkai Chen, Jiaqing Fu, Ning Xie and Tingting Zhao
Algorithms 2024, 17(1), 36; https://doi.org/10.3390/a17010036 - 16 Jan 2024
Viewed by 1187
Abstract
As electronic game technology has developed, game content has come to feature larger numbers of units, richer unit attributes, more complex game mechanisms, and more diverse team strategies. Multi-agent deep reinforcement learning shines in this type of team-based electronic game, achieving results that surpass professional human players. However, reinforcement learning algorithms based on Q-value estimation often suffer from Q-value overestimation, which can seriously degrade AI performance in multi-agent scenarios. We propose a multi-agent mutual evaluation method and a multi-agent softmax method to reduce the estimation bias of Q-values in multi-agent scenarios, and we test them both in the particle multi-agent environment and in a multi-agent tank environment that we constructed. Our tank environment strikes a good balance between experimental verification efficiency and the fidelity of multi-agent game tasks, and it can easily be extended to different multi-agent cooperation or competition tasks. We hope it will be adopted in multi-agent deep reinforcement learning research.
(This article belongs to the Special Issue Algorithms for Games AI)
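The softmax side of the proposal can be pictured in a single-agent setting: replacing the max operator with a softmax-weighted average produces a value target no larger than max(Q), which damps the overestimation that the max operator amplifies when Q estimates are noisy. A minimal sketch (the temperature and the operator placement are assumptions, not the paper's exact formulation):

```python
import numpy as np

def softmax_value(q_values, tau=1.0):
    """Softmax-weighted value estimate: a softer alternative to max(Q).

    tau is the temperature: as tau -> 0 this recovers the max operator,
    while larger tau pulls the estimate toward the mean of the Q-values.
    """
    z = q_values / tau
    z = z - z.max()                        # shift for numerical stability
    w = np.exp(z) / np.exp(z).sum()        # softmax weights over actions
    return float((w * q_values).sum())

q = np.array([1.0, 2.0, 3.0])
assert q.mean() <= softmax_value(q) <= q.max()  # always between mean and max
```

In a target-network update, this estimate would replace max(Q) when bootstrapping the next-state value.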

20 pages, 1201 KiB  
Article
Hierarchical Reinforcement Learning for Crude Oil Supply Chain Scheduling
by Nan Ma, Ziyi Wang, Zeyu Ba, Xinran Li, Ning Yang, Xinyi Yang and Haifeng Zhang
Algorithms 2023, 16(7), 354; https://doi.org/10.3390/a16070354 - 24 Jul 2023
Cited by 1 | Viewed by 1294
Abstract
Crude oil resource scheduling is one of the critical issues upstream in the crude oil industry chain. It aims to reduce transportation and inventory costs, and to avoid alerts of inventory-limit violations, by formulating reasonable crude oil transportation and inventory strategies. Two main difficulties coexist in this problem: the large problem scale and uncertain supply and demand. Traditional operations research (OR) methods, which rely on forecasting supply and demand, face significant challenges when applied to the complicated and uncertain short-term operation of the crude oil supply chain. To address these challenges, this paper presents a novel hierarchical optimization framework and a well-designed hierarchical reinforcement learning (HRL) algorithm. Specifically, reinforcement learning (RL) acts as an upper-level agent that selects operational operators, each combining various sub-goals and solving orders, while the lower-level agent finds a feasible solution for the chosen operator and provides penalty feedback to the upper level. Additionally, we deploy a simulator based on real-world data and run comprehensive experiments. In terms of the number of alerts, the maximum alert penalty, and the overall transportation cost, our HRL method outperforms existing OR and two RL algorithms in the majority of time steps.
(This article belongs to the Special Issue Algorithms for Games AI)
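The division of labor between the two levels can be sketched as a toy loop: the upper level chooses an operator, the lower level solves and reports a penalty, and the upper level updates its preference. Everything below (the operator names, the penalty model, the bandit-style update) is hypothetical scaffolding, not the paper's algorithm:

```python
import random

random.seed(0)
operators = ["inventory_first", "transport_first", "balanced"]
value = {op: 0.0 for op in operators}   # running estimate of each operator's penalty
counts = {op: 0 for op in operators}

def lower_level_solve(op):
    """Stand-in lower-level solver: returns a noisy penalty (lower is better)."""
    base = {"inventory_first": 6.0, "transport_first": 3.0, "balanced": 5.0}
    return base[op] + random.gauss(0, 0.5)

for t in range(300):
    if random.random() < 0.1:            # epsilon-greedy exploration
        op = random.choice(operators)
    else:                                # exploit the best-looking operator
        op = min(value, key=value.get)
    penalty = lower_level_solve(op)
    counts[op] += 1
    value[op] += (penalty - value[op]) / counts[op]   # running-average update

best = min(value, key=value.get)
```

The upper level settles on the operator whose lower-level solutions incur the smallest penalty, which is the feedback structure the framework describes.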

27 pages, 893 KiB  
Article
Official International Mahjong: A New Playground for AI Research
by Yunlong Lu, Wenxin Li and Wenlong Li
Algorithms 2023, 16(5), 235; https://doi.org/10.3390/a16050235 - 28 Apr 2023
Cited by 3 | Viewed by 2207
Abstract
Games have long been benchmarks and testbeds for AI research. In recent years, with the development of new algorithms and the boost in computational power, many popular games played by humans have been solved by AI systems. Mahjong is one of the most popular games in China and has spread worldwide. It presents challenges for AI research due to its multi-agent nature, rich hidden information, and complex scoring rules, yet it has been somewhat overlooked by the game AI research community. In 2020 and 2022, we held two AI competitions of Official International Mahjong, the standard variant of Mahjong rules, in conjunction with IJCAI, a top-tier AI conference. We were the first to adopt the duplicate format for evaluating Mahjong AI agents, mitigating the high variance of this game. By comparing the algorithms and performance of the AI agents in the competitions, we conclude that supervised learning and reinforcement learning are the current state-of-the-art methods for this game and perform much better than heuristic methods based on human knowledge. We also held a human-versus-AI competition and found that the top AI agent still could not beat professional human players. We argue that this game can serve as a new benchmark for AI research due to its complexity and popularity.
(This article belongs to the Special Issue Algorithms for Games AI)
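Why the duplicate format mitigates variance can be shown with a toy model: when both agents play the identical wall, the deal-luck component cancels out of their score difference. The `play` function below is a hypothetical stand-in, not a Mahjong simulator:

```python
import random
import statistics

def play(seed, skill):
    """Stand-in for one deal: the score mixes deal luck (driven only by
    the wall seed) with the player's skill."""
    rng = random.Random(seed)
    luck = rng.gauss(0, 10)    # high-variance component shared per wall
    return luck + skill

seeds = range(50)
a_skill, b_skill = 1.0, 0.0

# Independent deals: luck dominates the score difference.
indep = [play(s, a_skill) - play(s + 1000, b_skill) for s in seeds]
# Duplicate format: both agents face the identical wall, so luck cancels.
dup = [play(s, a_skill) - play(s, b_skill) for s in seeds]

assert statistics.stdev(dup) < statistics.stdev(indep)
```

With luck cancelled, far fewer deals are needed to detect the same skill gap, which is exactly the point of the duplicate format.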

22 pages, 1927 KiB  
Article
Measuring the Non-Transitivity in Chess
by Ricky Sanjaya, Jun Wang and Yaodong Yang
Algorithms 2022, 15(5), 152; https://doi.org/10.3390/a15050152 - 28 Apr 2022
Cited by 7 | Viewed by 2817
Abstract
In this paper, we quantify the non-transitivity in chess using human game data. Specifically, we perform non-transitivity quantification in two ways, Nash clustering and counting the number of rock–paper–scissors cycles, on over one billion matches from the Lichess and FICS databases. Our findings indicate that the strategy space of real-world chess strategies has a spinning-top geometry and that there exists a strong connection between the degree of non-transitivity and the progression of a chess player's rating. In particular, high degrees of non-transitivity tend to prevent human players from making progress in their Elo ratings. We also investigate the implications of non-transitivity for population-based training methods. By considering fixed-memory fictitious play as a proxy, we conclude that maintaining large and diverse populations of strategies is imperative to training effective AI agents for solving chess.
(This article belongs to the Special Issue Algorithms for Games AI)
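The cycle-counting side of the quantification is easy to sketch: given a pairwise win relation over strategies, count the triples that beat one another in a ring. This is a minimal illustration on a 0/1 win matrix; the paper's analysis over the Lichess and FICS data is, of course, far larger:

```python
from itertools import combinations

def count_rps_cycles(beats):
    """Count rock-paper-scissors 3-cycles in a pairwise win matrix.

    beats[i][j] == 1 means strategy i beats strategy j.  A triple
    (i, j, k) is a cycle when the three strategies beat one another
    in a ring, the elementary unit of non-transitivity.
    """
    n = len(beats)
    cycles = 0
    for i, j, k in combinations(range(n), 3):
        forward = beats[i][j] and beats[j][k] and beats[k][i]
        backward = beats[j][i] and beats[k][j] and beats[i][k]
        if forward or backward:
            cycles += 1
    return cycles

# rock, paper, scissors: paper beats rock, scissors beats paper, rock beats scissors
beats = [[0, 0, 1],
         [1, 0, 0],
         [0, 1, 0]]
print(count_rps_cycles(beats))  # 1
```

A fully transitive relation (a strict ranking) contains zero such cycles, so the count directly measures how far the strategy space departs from a total order.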

23 pages, 591 KiB  
Article
Research and Challenges of Reinforcement Learning in Cyber Defense Decision-Making for Intranet Security
by Wenhao Wang, Dingyuanhao Sun, Feng Jiang, Xingguo Chen and Cheng Zhu
Algorithms 2022, 15(4), 134; https://doi.org/10.3390/a15040134 - 18 Apr 2022
Cited by 5 | Viewed by 4525
Abstract
In recent years, cyber attacks have shown diversified, purposeful, and organized characteristics, which pose significant challenges to cyber defense decision-making on internal networks. Because attackers and defenders are in continuous confrontation, purely data-driven statistical or supervised learning methods cannot cope with increasingly severe security threats. It is urgent to rethink network defense from the perspective of decision-making and to prepare for every possible situation. Reinforcement learning has made great breakthroughs in addressing complicated decision-making problems. We propose a framework that defines four modules based on the life cycle of threats: pentest, design, response, and recovery. Our aims are to clarify the boundary of network defense decision-making problems, to study the problem characteristics in different contexts, to compare the strengths and weaknesses of existing research, and to identify promising challenges for future work. Our work provides a systematic view for understanding and solving decision-making problems in the application of reinforcement learning to cyber defense.
(This article belongs to the Special Issue Algorithms for Games AI)

Review


27 pages, 363 KiB  
Review
Techniques and Paradigms in Modern Game AI Systems
by Yunlong Lu and Wenxin Li
Algorithms 2022, 15(8), 282; https://doi.org/10.3390/a15080282 - 12 Aug 2022
Cited by 5 | Viewed by 5651
Abstract
Games have long been benchmarks and test-beds for AI algorithms. With the development of AI techniques and the boost in computational power, modern game AI systems have achieved superhuman performance in many games played by humans. These games have various features and present different challenges to AI research, so the algorithms used in each of these AI systems vary. This survey aims to give a systematic review of the techniques and paradigms used in modern game AI systems. By decomposing each of the recent milestones into basic components and comparing them based on the features of the games, we summarize the common paradigms used to build game AI systems, together with their scope and limitations. We claim that deep reinforcement learning is the most general methodology and is poised to become the mainstream method for games of higher complexity. We hope this survey can both provide a review of game AI algorithms and bring inspiration to the game AI community regarding future directions.
(This article belongs to the Special Issue Algorithms for Games AI)

43 pages, 565 KiB  
Review
A Review: Machine Learning for Combinatorial Optimization Problems in Energy Areas
by Xinyi Yang, Ziyi Wang, Hengxi Zhang, Nan Ma, Ning Yang, Hualin Liu, Haifeng Zhang and Lei Yang
Algorithms 2022, 15(6), 205; https://doi.org/10.3390/a15060205 - 13 Jun 2022
Cited by 11 | Viewed by 5735
Abstract
Combinatorial optimization problems (COPs) are a class of NP-hard problems of great practical significance. Traditional approaches to COPs suffer from high computational time and reliance on expert knowledge, and machine learning (ML) methods, as powerful tools, have been used to overcome these problems. This review investigates COPs in energy areas addressed with a series of modern ML approaches, i.e., the intersection of COPs, ML, and energy areas. Recent works on solving COPs with ML are organized first by method, including supervised learning (SL), deep learning (DL), reinforcement learning (RL), and recently proposed game-theoretic methods, and then by problem, laying out the timeline of improvements for several fundamental COPs. Practical applications of ML methods in energy areas, including the petroleum supply chain, steel-making, electric power systems, and wind power, are summarized for the first time, and the challenges in this field are analyzed.
(This article belongs to the Special Issue Algorithms for Games AI)
