We study the operational problem of shared autonomous electric vehicles that cooperate in providing on-demand mobility services while maximizing fleet profit and service quality. To this end, we model the fleet operator and vehicles as interactive agents equipped with advanced decision-making aids. Our focus is on learning smart charging policies (when and where to charge vehicles) in anticipation of uncertain future demand, accounting for long charging times, limited charging infrastructure, and time-varying electricity prices. We propose a distributed approach and formulate the problem as a semi-Markov decision process to capture its stochastic and dynamic nature. We use cooperative multiagent reinforcement learning with reshaped reward functions. The effectiveness and scalability of the proposed model are enhanced through deep learning. A mean-field approximation handles instabilities in the environment, and hierarchical learning separates high-level and low-level decisions. We evaluate our model in various numerical examples based on real data from ShareNow in Berlin, Germany. We show that the policies learned with our decentralized and dynamic approach outperform central static charging strategies. Finally, we conduct a sensitivity analysis across different fleet characteristics to demonstrate the proposed model's robustness and provide managerial insights into the impacts of strategic decisions on fleet performance and the derived charging policies. Supplemental Material: The online appendix is available at https://doi.org/10.1287/trsc.2022.1187.
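The charge-or-serve trade-off described above can be sketched, in highly simplified form, as a small Markov decision process for a single vehicle, solved by tabular Q-learning. The battery levels, charging cost, trip revenue, and transition model below are illustrative assumptions, not the paper's formulation (which uses a semi-Markov process, deep networks, mean-field interaction, and hierarchical learning):

```python
import random

# Toy sketch: one vehicle chooses between charging and serving trips.
# States are battery levels 0-4; all numbers below are assumed for illustration.
ACTIONS = ("charge", "serve")

def step(level, action):
    """Assumed transition: charging raises the battery level at an electricity
    cost; serving a trip earns revenue but drains the battery."""
    if action == "charge":
        return min(level + 1, 4), -1.0          # pay the charging price
    if level == 0:
        return 0, -5.0                          # empty battery: large penalty
    return level - 1, 2.0                       # trip revenue

def q_learning(episodes=2000, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(5) for a in ACTIONS}
    for _ in range(episodes):
        s = rng.randrange(5)
        for _ in range(20):
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[(s, x)])
            s2, r = step(s, a)
            best_next = max(q[(s2, x)] for x in ACTIONS)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s2
    return q

if __name__ == "__main__":
    q = q_learning()
    policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(5)}
    print(policy)
```

Under these assumed parameters, the learned policy charges when the battery is empty and serves trips when it is full, mirroring the anticipatory charging behavior the abstract describes at fleet scale.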
<p style='text-indent:20px;'>Advertising has long been a key part of marketing strategy and plays a prominent role in the success or failure of products. This paper investigates a multi-product, multi-period advertising budget allocation problem, determining the advertising budget for each product over the planning horizon. Key factors, including the life cycle stage, <inline-formula><tex-math id="M1">\begin{document}$ BCG $\end{document}</tex-math></inline-formula> matrix class, competitors' reactions, and budget constraints, affect the joint sequence of decisions across all products that maximizes total profit. To this end, we define a stochastic sequential resource allocation problem and use an approximate dynamic programming (<inline-formula><tex-math id="M2">\begin{document}$ ADP $\end{document}</tex-math></inline-formula>) algorithm to cope with the problem's enormous state space and the multi-dimensional uncertainties of the environment. These uncertainties are the competitors' reactions, which depend on the current market status and our decisions, and the stochastic effectiveness (rewards) of the actions taken. We apply an approximate value iteration (<inline-formula><tex-math id="M3">\begin{document}$ AVI $\end{document}</tex-math></inline-formula>) algorithm to a numerical example and compare the results with four alternative policies to highlight our managerial contributions. Finally, we validate the proposed approach against a genetic algorithm; for this comparison, we simplify the environment by fixing the competitors' reaction and assuming a deterministic setting.</p>
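The approximate value iteration idea can be illustrated with a deliberately tiny single-product version, where the state is the remaining budget and the stochastic one-period payoff of a spend is estimated by Monte Carlo sampling inside the backward recursion. The concave profit curve, noise model, and horizon are assumptions for illustration only, not the paper's multi-product model with competitor reactions:

```python
import random

# Toy sketch of approximate value iteration for budget allocation:
# value[t][b] estimates the value of holding budget b at period t.
BUDGET, PERIODS, SAMPLES = 10, 4, 200

def sampled_profit(spend, rng):
    """Assumed concave expected payoff with multiplicative noise."""
    return (2.0 * spend ** 0.5) * rng.uniform(0.8, 1.2)

def approximate_value_iteration(seed=0):
    rng = random.Random(seed)
    value = [[0.0] * (BUDGET + 1) for _ in range(PERIODS + 1)]
    # Backward recursion over periods; expectations replaced by sample averages.
    for t in range(PERIODS - 1, -1, -1):
        for b in range(BUDGET + 1):
            best = 0.0
            for spend in range(b + 1):
                # Monte Carlo estimate of the expected one-period payoff
                est = sum(sampled_profit(spend, rng) for _ in range(SAMPLES)) / SAMPLES
                best = max(best, est + value[t + 1][b - spend])
            value[t][b] = best
    return value

if __name__ == "__main__":
    v = approximate_value_iteration()
    print(round(v[0][BUDGET], 2))
```

Because the assumed payoff is concave, the recursion favors spreading the budget across periods rather than spending it all at once, which is the kind of structural insight the paper's AVI policy comparison is after.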