Stormwater runoff is one of the most common non-point sources of water pollution affecting rivers, lakes, estuaries, and coastal beaches. Although most pollutants and nutrients in stormwater, including nitrate-nitrogen, are discharged into receiving waters during the first-flush period, no existing best management practices (BMPs) are specifically designed to capture and treat the first-flush portion of urban stormwater runoff. This paper presents a novel BMP device for highway and urban stormwater treatment, called the first-flush reactor (FFR), with emphasis on its numerical modeling. A new model, called the VART-DN model, was developed from the variable residence time (VART) model to simulate the denitrification process in the FFR. The VART-DN model is capable of simulating the various processes and mechanisms responsible for denitrification in the FFR. A sensitivity analysis of the model parameters shows that the denitrification process is sensitive to the temperature correction factor (b), maximum nitrate-nitrogen decay rate (K_max), actual varying residence time (T_v), constant decay rate of denitrifying bacteria (v_dec), temperature (T), biomass inhibition constant (K_b), maximum growth rate of denitrifying bacteria (v_max), denitrifying bacteria concentration (X), longitudinal dispersion coefficient (K_s), and half-saturation constant of dissolved carbon for biomass (K_Car-X); a 10% increase in these parameter values changes the model root mean square error (RMSE) by -28.02, -16.16, -12.35, 11.44, -9.68, 10.61, -16.30, -9.27, 6.58, and 3.89%, respectively. The VART-DN model was tested against data from laboratory experiments conducted with highway stormwater and secondary wastewater. Model results for the denitrification process of highway stormwater showed good agreement with observed data, with a simulation error of less than 9.0%.
For the denitrification process of wastewater, the RMSE and the coefficient of determination were 0.5167 and 0.6912, respectively, demonstrating the efficacy of the VART-DN model.
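As an illustration of the metrics and the one-at-a-time sensitivity test described above, the following minimal sketch computes RMSE, a coefficient of determination, and the percentage change in RMSE after a 10% parameter increase. It is not the VART-DN model: `toy_model` is a hypothetical first-order decay with a temperature correction, the parameter names `k_max`, `b`, and `temp` are stand-ins, the "observed" series is synthetic, and the coefficient of determination is assumed here to be 1 - SS_res/SS_tot, which may differ from the definition used in the paper.

```python
import math

def rmse(observed, simulated):
    """Root mean square error between paired observed and simulated values."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)

def r_squared(observed, simulated):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def toy_model(params, times, c0=10.0):
    """Hypothetical stand-in (NOT VART-DN): first-order nitrate decay
    with a temperature-corrected rate k = k_max * b**(temp - 20)."""
    k = params["k_max"] * params["b"] ** (params["temp"] - 20.0)
    return [c0 * math.exp(-k * t) for t in times]

def rmse_change_percent(model, params, name, times, observed, bump=0.10):
    """Percent change in RMSE when one parameter is increased by `bump`
    (10% by default) while all other parameters are held fixed."""
    base = rmse(observed, model(params, times))
    perturbed = dict(params, **{name: params[name] * (1.0 + bump)})
    return 100.0 * (rmse(observed, model(perturbed, times)) - base) / base

if __name__ == "__main__":
    times = [0.0, 1.0, 2.0, 4.0, 8.0]
    # Synthetic "observations" generated from the toy model itself.
    observed = toy_model({"k_max": 0.30, "b": 1.05, "temp": 25.0}, times)
    params = {"k_max": 0.20, "b": 1.05, "temp": 25.0}
    simulated = toy_model(params, times)
    print("RMSE:", rmse(observed, simulated))
    print("R^2:", r_squared(observed, simulated))
    for name in params:
        change = rmse_change_percent(toy_model, params, name, times, observed)
        print(name, "RMSE change (%):", change)
```

A negative change means the perturbed parameter moved the simulation closer to the observations, which matches the sign convention of the RMSE changes reported above.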