Animals flexibly adapt to new environments by controlling different behavioral patterns. Identifying the behavioral strategy used for this control is important for understanding animals' decision-making, but methods for quantifying such behavioral strategies have not been fully established. In this study, we developed an inverse reinforcement-learning (IRL) framework to identify an animal's behavioral strategy from behavioral time-series data. As a particular target, we applied this framework to thermotactic behavior in C. elegans. After identifying the behavioral strategy dependent on thermosensory state, we found that it comprised a mixture of two strategies: directed migration (DM) and isothermal migration (IM). First, DM is a strategy by which the worms efficiently reach a specific temperature; it not only explains the observation that the worms migrate toward the cultivated temperature, but also clarifies how the worms control their thermosensory state during the migration. Second, IM is a strategy by which the worms track along a constant temperature, which reflects the isothermal tracking well documented in previous studies. By further applying our method to starved worms and thermosensory neuron-deficient worms, we identified the neural basis underlying the strategies. Consequently, we interpreted the behavioral strategies in terms of control theory. This study therefore validates and presents a novel approach that should propel the development of new, more effective experiments to identify behavioral strategies and decision-making in animals.
Author Summary
Understanding animal decision-making has been a fundamental problem in neuroscience and behavioral ecology. Many studies analyze actions that represent decision-making in behavioral tasks in which rewards are artificially designed with specific objectives. However, it is impossible to extend such artificially designed experiments to a natural environment, because there the rewards for freely behaving animals cannot be clearly defined. To address this, we must reverse the current paradigm so that rewards are identified from behavioral data. Here, we propose a new reverse-engineering approach (inverse reinforcement learning) that can estimate a behavioral strategy from time-series data of freely behaving animals. By applying this technique to thermotaxis in C. elegans, we successfully identified the reward-based behavioral strategy.