To explain predicted answers and evaluate the reasoning abilities of models, several studies have utilized underlying reasoning (UR) tasks in multi-hop question answering (QA) datasets. However, it remains an open question how effective UR tasks are for the QA task when models are trained on both tasks in an end-to-end manner. In this study, we address this question by analyzing the effectiveness of UR tasks (including both sentence-level and entity-level tasks) in three aspects: (1) QA performance, (2) reasoning shortcuts, and (3) robustness. Whereas previous models have not been explicitly trained on an entity-level reasoning prediction task, we build a multi-task model that performs three tasks together: sentence-level supporting facts prediction, entity-level reasoning prediction, and answer prediction. Experimental results on the 2WikiMultiHopQA and HotpotQA-small datasets reveal that (1) UR tasks can improve QA performance. Using four newly created debiased datasets, we demonstrate that (2) UR tasks are helpful in preventing reasoning shortcuts in the multi-hop QA task. However, we find that (3) UR tasks do not contribute to improving the robustness of the model on adversarial questions, such as sub-questions and inverted questions. We encourage future studies to investigate the effectiveness of entity-level reasoning in the form of natural language questions (e.g., sub-question forms).

* Equal contribution.
Our data and code are available at https://github.com/Alab-NII/multi-hop-analysis

Question: Who is the paternal grandfather of Joan of Valois, Countess of Beaumont?
Paragraph A: Joan of Valois, Countess of Beaumont
[1] Joan of Valois (1304-1363) was the daughter of Charles of Valois and his second wife ...
Paragraph B: Charles, Count of Valois
[2] Charles of Valois (12 March 1270 - 16 December 1325), the third son of Philip III of France and, ... .
[3] ...
Answer: Philip III of France

a) Standard QA task format

b) UR tasks and three aspects

Sentence-level supporting facts: 1, 2
Entity-level reasoning prediction (Evidence):
Step 1: ("Joan of Valois, Countess of Beaumont", "father", "Charles of Valois")
Step 2: ("Charles of Valois", "father", "Philip III of France")

Three aspects: QA Performance, Reasoning Shortcuts, Robustness

Paragraph A: Joan of Valois, Countess of Beaumont
[1] We can also establish the global weak solution ...
[2] Joan of Valois (1304-1363) was the daughter of Charles of Valois and his second wife ...
Paragraph B: Charles, Count of Valois
[3] This gives a clear impulse to develop ...
[4] Charles of Valois (12 March 1270 - 16 December 1325), the third son of Philip III of France and, ... .
[5] ...
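To make the relation between the two UR annotations and the answer concrete, the example above can be written as a small data structure. The following is only an illustrative sketch: the class name `MultiHopExample`, its field names, and the helper `follow_chain` are hypothetical and are not the schema of 2WikiMultiHopQA or of our released code. It shows how the entity-level evidence triples, followed in order from the question entity, arrive at the answer entity.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (subject, relation, object); a hypothetical encoding of one evidence step.
Triple = Tuple[str, str, str]


@dataclass
class MultiHopExample:
    """Hypothetical container for one annotated multi-hop QA example."""
    question: str
    answer: str
    supporting_facts: List[int]  # sentence indices, numbered as in the example
    evidence: List[Triple]       # entity-level reasoning steps, in order


def follow_chain(start: str, evidence: List[Triple]) -> str:
    """Follow the triples in order and return the final object entity.

    Raises ValueError if a step does not begin at the entity reached
    by the previous step, i.e. if the chain is not connected.
    """
    current = start
    for subject, _relation, obj in evidence:
        if subject != current:
            raise ValueError(f"chain broken: expected {current!r}, got {subject!r}")
        current = obj
    return current


example = MultiHopExample(
    question="Who is the paternal grandfather of Joan of Valois, Countess of Beaumont?",
    answer="Philip III of France",
    supporting_facts=[1, 2],
    evidence=[
        ("Joan of Valois, Countess of Beaumont", "father", "Charles of Valois"),
        ("Charles of Valois", "father", "Philip III of France"),
    ],
)

derived = follow_chain("Joan of Valois, Countess of Beaumont", example.evidence)
print(derived == example.answer)  # True
```

In this toy form, the sentence-level task corresponds to predicting `supporting_facts`, the entity-level task to predicting `evidence`, and the QA task to predicting `answer`; the check at the end only verifies that the annotated chain is consistent with the annotated answer.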