Point of interest coverage with distributed multi-unmanned aerial vehicles based on distributed reinforcement learning (Turkish title: Dağıtık pekiştirmeli öğrenme tabanlı çoklu insansız hava aracı ile ilgi çekici nokta kapsama)

Aydemir, Fatih; Çetin, Aydın

doi:10.17341/gazimmfd.1172120

Journal of the Faculty of Engineering and Architecture of Gazi University, vol. 39, no. 1, pp. 563-575, 2023 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 39, Issue: 1
Publication Date: 2023
DOI: 10.17341/gazimmfd.1172120
Journal Name: Journal of the Faculty of Engineering and Architecture of Gazi University
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Art Source, Compendex, TR DİZİN (ULAKBİM)
Pages: pp. 563-575
Keywords: dynamic area coverage, grid decomposition, multi-agent system, reinforcement learning, unmanned aerial vehicle
Affiliated with Gazi University: Yes

Abstract

Mobile vehicles are widely used in area coverage applications such as mapping, traffic monitoring, and search and rescue operations. An appropriate positioning model and an effective learning strategy are required to improve the coverage process. With a navigation mechanism that includes a motion model, mobile vehicles can adapt to dynamic environments and find optimal locations. In studies where positioning is managed by a multi-agent mobile system, the agents must complete tasks such as detection, data collection, and surveillance collaboratively. This learning-based process can be carried out by mobile agents that learn to optimize a task in real time. This study aims to cover points of interest (PoI) effectively in a dynamic environment by modeling a group of unmanned aerial vehicles (UAVs) as a learning multi-agent system. The target area is decomposed into grids to maximize PoI coverage and minimize energy consumption. The decomposition takes into account the location of the target area and the communication distance of the UAVs, which are modeled as mobile agents. In addition, mobile agents planning to move to grid cells learn to avoid collisions. The proposed method was tested in a simulation environment, and its results are compared with those of similar studies. The results show that the proposed method outperforms the similar studies it was compared against and is suitable for area coverage applications.
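
The grid decomposition mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the decomposition rule, so the following is an assumption rather than the authors' method: the target area is taken to be a rectangle, and each square cell has side `comm_range / sqrt(2)`, so that its diagonal equals the communication distance. The names `decompose_area` and `poi_counts_per_cell` are hypothetical.

```python
import math
from collections import Counter


def decompose_area(width, height, comm_range):
    """Split a width x height rectangle into square grid cells.

    Assumption (not from the paper): the cell side is comm_range / sqrt(2),
    so the cell diagonal equals the communication distance.
    Returns (cell_side, n_cols, n_rows).
    """
    if width <= 0 or height <= 0 or comm_range <= 0:
        raise ValueError("width, height and comm_range must be positive")
    cell_side = comm_range / math.sqrt(2)
    n_cols = math.ceil(width / cell_side)
    n_rows = math.ceil(height / cell_side)
    return cell_side, n_cols, n_rows


def poi_counts_per_cell(pois, cell_side, n_cols, n_rows):
    """Count points of interest per (row, col) cell.

    Each PoI is an (x, y) pair assumed to lie inside the target area,
    with the origin at one corner of the rectangle.
    """
    counts = Counter()
    for x, y in pois:
        # Clamp so that points on the far boundary fall in the last cell.
        col = min(int(x // cell_side), n_cols - 1)
        row = min(int(y // cell_side), n_rows - 1)
        counts[(row, col)] += 1
    return counts
```

Per-cell PoI counts of this kind could serve as one input when agents decide which cell to move to. How the paper's agents actually weigh coverage against energy consumption is not specified in the abstract.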