This paper addresses the safe path planning problem in urban environments under uncertainty in onboard sensor availability. In this context, an approach based on the Mixed-Observability Markov Decision Process (MOMDP) is presented. Such a model enables the planner to deal with a priori probabilistic sensor availability and with path execution error propagation, which depends on the navigation solution. Due to the modelling particularities of this safe path planning problem, such as bounded hidden and fully observable state variables, discrete actions, and a particular transition function form, the belief state update becomes a complex step that cannot be ignored during planning. Recent advances in Partially Observable Markov Decision Process (POMDP) solving have produced a planning algorithm called POMCP, based on the Monte-Carlo Tree Search method. It allows the planner to work on histories of action-observation pairs without computing belief state updates. This paper therefore proposes to apply a POMCP-like algorithm to solve the addressed MOMDP safe path planning problem. The results show the feasibility of the approach and the impact of considering different a priori probabilistic sensor availabilities on the resulting policy.
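As a concrete illustration of the history-based planning idea the abstract refers to, the sketch below shows a minimal POMCP-style Monte-Carlo tree search over action-observation histories in Python. This is a hypothetical sketch, not the paper's implementation: the black-box generative simulator G(s, a) -> (s', o, r), the class and method names, and all parameter values are assumptions, and the separate leaf-rollout stage of full POMCP is omitted for brevity.

```python
import math
import random
from collections import defaultdict

# Minimal POMCP-style planner sketch (illustrative, not the paper's code).
# Assumes a black-box generative simulator G(s, a) -> (s', o, r), so the
# tree is indexed by action-observation histories and no explicit belief
# state update is needed during planning.
class POMCPSketch:
    def __init__(self, actions, simulator, gamma=0.95, c=1.0, depth=20):
        self.actions = actions        # discrete action set
        self.G = simulator            # generative model G(s, a) -> (s', o, r)
        self.gamma = gamma            # discount factor
        self.c = c                    # UCB exploration constant
        self.depth = depth            # maximum simulation depth
        self.N = defaultdict(int)     # visit counts per history h
        self.Na = defaultdict(int)    # visit counts per (h, a)
        self.Q = defaultdict(float)   # value estimates per (h, a)

    def search(self, belief_particles, history, n_sims=1000):
        # Sample states from a particle belief and run simulations from the
        # current history node; return the greedy action at the root.
        for _ in range(n_sims):
            s = random.choice(belief_particles)
            self._simulate(s, history, 0)
        return max(self.actions, key=lambda a: self.Q[(history, a)])

    def _simulate(self, s, h, d):
        if d >= self.depth or self.gamma ** d < 1e-3:
            return 0.0

        # UCB1 action selection at history node h (unvisited actions first)
        def ucb(a):
            if self.Na[(h, a)] == 0:
                return float("inf")
            return self.Q[(h, a)] + self.c * math.sqrt(
                math.log(self.N[h] + 1) / self.Na[(h, a)])

        a = max(self.actions, key=ucb)
        s2, o, r = self.G(s, a)
        # Recurse on the extended history h + (a, o); a full POMCP would
        # switch to a rollout policy outside the tree.
        ret = r + self.gamma * self._simulate(s2, h + ((a, o),), d + 1)
        self.N[h] += 1
        self.Na[(h, a)] += 1
        self.Q[(h, a)] += (ret - self.Q[(h, a)]) / self.Na[(h, a)]
        return ret
```

Because nodes are keyed by histories rather than belief states, the costly MOMDP belief update discussed in the abstract never has to be evaluated inside the planning loop.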
A Vehicular Ad-Hoc Network (VANET) lets vehicles send and receive environmental and traffic information, making it a crucial component of fully autonomous roads. For VANETs to serve their purpose, there must be sufficient coverage, even in sparsely populated areas. Moreover, much of the safety information is time-sensitive; excessive delay in data transfer can increase the risk of fatal accidents. Unmanned Aerial Vehicles (UAVs) can be used as mobile base stations to fill coverage gaps. The placement of these UAVs is crucial to how well the system performs. We are particularly interested in placing mobile base stations along a rural highway with sparse traffic, as it represents the worst-case scenario for vehicular communication. Instead of heuristic or linear programming methods for optimal placement, we use multi-agent reinforcement learning (MARL). The main benefit of MARL is that it allows the agents to learn model-free through experience. We propose a variation of traditional deep Independent Q-Learning. The modifications include an observation function augmented with information shared directly between neighbouring agents, as well as a shared policy scheme. We also implement a custom sparse-highway simulator used for training and testing our algorithm. Our tests show that the proposed MARL algorithm learns placement policies that produce the maximum rewards for different scenarios while adapting to the dynamic road densities along the service area. Our experiments show that the model is scalable, allowing the number of agents to increase without any modifications to the code. Finally, we show that the model generalizes: the algorithm can be used directly, and performs equally well, on an industry-standard simulator. Future work can improve the realism and complexity of the highway models and adapt the algorithm to real-world scenarios.
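As a concrete illustration of the two modifications the abstract names, a shared policy and neighbour-augmented observations, the sketch below shows a single Q-network shared by all agents and a standard temporal-difference update in PyTorch. This is a hypothetical sketch, not the authors' code: the names SharedQNet and td_update, the layer sizes, and the batch layout are all assumptions.

```python
import torch
import torch.nn as nn

# Illustrative sketch of independent Q-learning with parameter sharing and
# neighbour-augmented observations (not the paper's implementation).
class SharedQNet(nn.Module):
    def __init__(self, obs_dim, neighbour_dim, n_actions):
        super().__init__()
        # One network shared by all agents (parameter sharing), so the
        # number of agents can grow without changing the model.
        self.net = nn.Sequential(
            nn.Linear(obs_dim + neighbour_dim, 128),
            nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, own_obs, neighbour_obs):
        # Augment each agent's own observation with information shared
        # directly by its neighbours.
        return self.net(torch.cat([own_obs, neighbour_obs], dim=-1))

def td_update(qnet, target_net, optimizer, batch, gamma=0.99):
    # batch: per-agent transitions (obs, neighbour info, action, reward,
    # next obs, next neighbour info, done flag), each a stacked tensor.
    obs, nbr, actions, rewards, next_obs, next_nbr, done = batch
    q = qnet(obs, nbr).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target_net(next_obs, next_nbr).max(dim=1).values
        target = rewards + gamma * (1.0 - done) * q_next
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because every agent trains the same shared parameters on its own transitions, adding agents only adds training data, which is one plausible reading of the scalability claim in the abstract.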