Reinforcement learning has proven successful at solving complex control problems, yet safety remains paramount in engineering applications such as energy management systems (EMS), particularly in hybrid electric vehicles (HEVs). An effective EMS must coordinate power flow while ensuring safety, for example by keeping the battery state of charge within safe limits, which is a challenging task. Traditional reinforcement learning struggles with safety constraints, and the penalty method, which folds constraint violations into the reward, often leads to suboptimal performance. This study introduces Lagrangian-based parameterized soft actor–critic (PASACLag), a novel safe hybrid-action reinforcement learning algorithm for HEV energy management. PASACLag uses a composite action representation to handle continuous actions (e.g., engine torque) and discrete actions (e.g., gear shift and clutch engagement) concurrently. It integrates a Lagrangian method that addresses control objectives and constraints separately, simplifying the reward function and enhancing safety. We evaluate PASACLag’s performance on the World Harmonized Vehicle Cycle (901 s) and analyze its generalization on four different cycles. In the generalization analysis, PASACLag’s fuel consumption is less than 10% higher than that of dynamic programming. Moreover, PASACLag surpasses PASAC, an unsafe counterpart that uses the penalty method, in both fuel economy and constraint satisfaction. These findings highlight PASACLag’s effectiveness in learning a complex EMS control policy within a hybrid action space while prioritizing safety.
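
As a rough illustration of how a Lagrangian method can keep the control objective and the constraint separate, the following minimal sketch shows a generic multiplier-weighted objective and a projected dual-ascent update. It is not the PASACLag implementation: the function names, the learning rate `lr`, and the `cost_limit` parameter are hypothetical and chosen only for illustration.

```python
def lagrangian_objective(reward, cost, lam):
    """Combine reward and constraint cost with a multiplier (generic form).

    The reward itself carries no hand-tuned penalty term; the constraint
    enters only through the multiplier `lam`.
    """
    return reward - lam * cost


def update_multiplier(lam, avg_cost, cost_limit, lr=0.01):
    """One projected dual-ascent step on the multiplier (hypothetical helper).

    The multiplier grows when the average cost exceeds the limit, shrinks
    otherwise, and is clipped so that it never becomes negative.
    """
    return max(0.0, lam + lr * (avg_cost - cost_limit))
```

In a scheme of this kind, the multiplier adapts during training rather than being fixed in advance, which is the usual contrast with a fixed penalty weight.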