Green roofs are a form of green infrastructure aimed at retaining or slowing the movement of precipitation as stormwater runoff to sewer systems. To determine total runoff versus retention from green roofs, researchers and practitioners alike employ hydrologic models that are calibrated to one or more observed events. However, questions remain regarding how event size may impact parameter sensitivity, how best to constrain initial soil moisture (ISM), and whether limited observations (i.e., a single event) can be used within a calibration-validation framework. We explored these questions by applying the Storm Water Management Model (SWMM) to simulate a large green roof located in Syracuse, NY. We found that model performance was very high (e.g., Nash-Sutcliffe efficiency index > 0.8 and Kling-Gupta efficiency index > 0.8) for many events. We initially compared model performance across two parameterizations of ISM. For some events, we found similar performance when ISM was varied versus set to zero; for others, varying ISM yielded higher performance as well as greater water balance closure. Within a calibration-validation framework, we found that calibrating to larger events tended to produce moderate to high performance for other noncalibration events. However, very small storms were difficult to simulate, regardless of calibration event size, as these events are likely fully retained on the roof. Using regional sensitivity analysis, we confirmed that only a subset of model parameters was sensitive across 16 events. Interestingly, many parameters were sensitive regardless of event size, though some parameters were more sensitive when simulating smaller events.
This emphasizes that storm size likely influences parameter sensitivity. Overall, we show that while calibrating to a single event can achieve high performance, exploring simulations across multiple events can yield important insight regarding the hydrologic performance of green roofs that can be used to guide the gathering of in situ properties and observations for refining model frameworks.
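
For readers unfamiliar with the two performance metrics cited above, the following is a minimal sketch of how the Nash-Sutcliffe efficiency (NSE) and the Kling-Gupta efficiency (KGE) are commonly computed from paired observed and simulated runoff series. It uses the standard textbook definitions and is not taken from the study's own workflow; the function names and the example runoff values are hypothetical and serve only as illustration.

```python
import math


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared-error sum
    to the sum of squared deviations of the observations from their mean."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    obs_spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / obs_spread


def kge(observed, simulated):
    """Kling-Gupta efficiency (2009 form): combines linear correlation (r),
    variability ratio (alpha), and bias ratio (beta)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in observed) / n)
    sd_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in simulated) / n)
    cov = sum((o - mean_obs) * (s - mean_sim)
              for o, s in zip(observed, simulated)) / n
    r = cov / (sd_obs * sd_sim)   # Pearson correlation
    alpha = sd_sim / sd_obs       # ratio of standard deviations
    beta = mean_sim / mean_obs    # ratio of means
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical runoff series for one event (illustrative values only).
observed_runoff = [0.0, 1.0, 3.0, 2.0, 0.5]
simulated_runoff = [0.1, 0.9, 2.7, 2.2, 0.4]

print("NSE:", nse(observed_runoff, simulated_runoff))
print("KGE:", kge(observed_runoff, simulated_runoff))
```

Both metrics equal 1 for a perfect simulation. Note that both are undefined when the observed series has zero variance, and KGE is additionally undefined when the observed mean is zero, which is one practical reason fully retained small storms (with little or no observed runoff) are awkward to score.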