Unfortunately, previous fuzzing approaches cannot discover this type of violation, for two reasons. First, they do not consider the entire input space of the RV's control software, which includes user commands, configuration parameters, and environmental factors. Second, they focus only on finding memory corruption bugs or RV control stability issues. Therefore, they cannot detect safety policy violations, e.g., a drone deploying its parachute at too low an altitude.

We develop PGFUZZ, a policy-based fuzzing framework designed to address these challenges. PGFUZZ includes three interconnected components: (1) Pre-Processing, (2) Policy-Guided Fuzzing, and (3) Bug Post-Processing.

In the Pre-Processing component, we express the correct operation of an RV through policies written as metric temporal logic (MTL) formulas. We then minimize the fuzzing space by identifying the inputs related to the tested policies, i.e., those that, when mutated, could potentially trigger policy violations. For example, given a policy in natural language stating that "the fail-safe mode must be triggered when the engine temperature is higher than 100°C", PGFUZZ expresses this policy with the MTL formula: {(temperature>100°C) → (failsafe=on)}. It then decomposes this formula into the temperature and fail-safe mode states, and identifies the fuzzing inputs that influence these policy states, such as user commands (e.g., increasing temperature) and configuration parameters (e.g., units of temperature).

Next, the Policy-Guided Fuzzing component mutates the inputs identified by the Pre-Processing component. It implements two kinds of distance metrics: propositional distances, which guide the mutation engine, and a global distance, which detects when a policy violation occurs. The distance metrics quantify how close the current system states are to a policy violation. Positive distances indicate that the policy holds, whereas negative distances indicate that the policy is violated.
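To make the sign convention concrete, the following minimal sketch computes distances for the fail-safe example policy. It assumes a standard robustness-style reading of the formula (the implication is rewritten as "temperature ≤ 100°C or failsafe = on", and the disjunction takes the maximum of the two propositional distances); it is not necessarily the normalization PGFUZZ itself uses. The names `State`, `dist_temp_within_limit`, `dist_failsafe_on`, and `global_distance` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

# Threshold from the example policy: (temperature > 100°C) -> (failsafe = on).
TEMP_LIMIT_C = 100.0


@dataclass
class State:
    """Hypothetical snapshot of the two policy-relevant system states."""
    temperature_c: float
    failsafe_on: bool


def dist_temp_within_limit(state: State) -> float:
    """Propositional distance for 'temperature <= limit'.

    Positive while the temperature is below the limit, negative once it
    exceeds the limit (assumed robustness-style definition).
    """
    return TEMP_LIMIT_C - state.temperature_c


def dist_failsafe_on(state: State) -> float:
    """Propositional distance for 'failsafe = on' (assumed encoding: +1 / -1)."""
    return 1.0 if state.failsafe_on else -1.0


def global_distance(state: State) -> float:
    """Global distance for the whole policy.

    The implication A -> B is treated as (not A) or B, and the disjunction
    takes the maximum of the two propositional distances. A negative result
    means both disjuncts fail, i.e. the policy is violated.
    """
    return max(dist_temp_within_limit(state), dist_failsafe_on(state))


if __name__ == "__main__":
    for s in (State(80.0, False), State(120.0, True), State(120.0, False)):
        d = global_distance(s)
        print(s, "distance:", d, "VIOLATION" if d < 0 else "ok")
```

Under these assumptions, a cool engine or an active fail-safe keeps the global distance non-negative, and only an overheated engine with the fail-safe off drives it below zero.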
Therefore, PGFUZZ mutates inputs to minimize the global distance. After each input is sent to the control software, which runs in an RV simulator, PGFUZZ collects the system states and computes the distance metrics. The input's impact on the distance metrics (whether it increases or decreases them) is used to select the next inputs. When the global distance becomes negative, a policy violation is detected. Returning to the fail-safe mode example, PGFUZZ mutates inputs to raise the temperature above 100°C and checks whether the fail-safe mode is activated at the same time.

The last component, Bug Post-Processing, minimizes the input sequence that triggers each bug by excluding inputs irrelevant to the policy violation. The minimized input sequence is then used to identify the root cause of each violated policy. To verify the correctness and effectiveness of PGFUZZ, we

Abstract-Robotic vehicles (RVs) are becoming essential tools of modern systems, including autonomous delivery services, public transportation, and environment monitoring. Despite their diverse deployment, safety and security issues ...