The early detection of anomalies associated with changes in the behavior of structures is important for ensuring their serviceability and safety. Identifying anomalies from monitoring data is prone to false and missed alarms because the dependency of infrastructure responses on external factors such as temperature and loading is uncertain. Existing anomaly detection strategies typically rely on univariate threshold values and disregard the planning horizon in the context of decision making. This paper proposes an anomaly detection framework that combines the interpretability of existing Bayesian dynamic linear models, a particular form of state-space model, with the long-term planning ability of reinforcement learning. The new framework provides (a) a reinforcement learning formalism for anomaly detection in Bayesian dynamic linear models, (b) a method for simulating anomalies with respect to their height, duration, and time of occurrence, and (c) a method for quantifying anomaly detectability. The potential of the new framework is demonstrated on monitoring data collected on a bridge in Canada. The results show that the framework is able to detect real anomalies that were known to have occurred, as well as synthetic anomalies.