SUMMARY

Dynamic models of nanometer-scale phenomena often require an explicit consideration of interactions among a large number of atoms or molecules. The corresponding mathematical representation may thus be high dimensional, nonlinear, and stochastic, incompatible with tools in nonlinear control theory that are designed for low-dimensional deterministic equations. We consider here a general class of probabilistic systems that are linear in the state, but whose input enters as a function multiplying the state vector. Model reduction is accomplished by grouping probabilities that evolve together, and truncating states that are unlikely to be accessed. An error bound for this reduction is also derived. A system identification approach that exploits the inherent linearity is then developed, which generates all coefficients in either a full or reduced model. These concepts are then extended to extremely high-dimensional systems, in which kinetic Monte Carlo (KMC) simulations provide the input-output data. This work was motivated by our interest in thin film deposition. We demonstrate the approaches developed in the paper on a KMC simulation of surface evolution during film growth, and use the reduced model to compute optimal temperature profiles that minimize surface roughness.
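As a rough illustration of the class of systems described above, the following sketch simulates a small probabilistic system that is linear in the state x while the input (here a temperature T) enters only through scalar functions multiplying the state, written as dx/dt = (k1(T) N1 + k2(T) N2) x. The three-state chain, the matrices `N1` and `N2`, the Arrhenius form of `rates`, all numerical values, and the helper names `step`, `simulate`, and `truncate` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

# Hypothetical 3-state system: dx/dt = (k1(T) * N1 + k2(T) * N2) @ x.
# Each N_i has columns summing to zero, so total probability is conserved.
# All matrices and constants below are illustrative assumptions.
N1 = np.array([[-1.0,  0.0, 0.0],
               [ 1.0, -1.0, 0.0],
               [ 0.0,  1.0, 0.0]])   # forward transitions 0 -> 1 -> 2
N2 = np.array([[0.0,  1.0,  0.0],
               [0.0, -1.0,  1.0],
               [0.0,  0.0, -1.0]])   # backward transitions 2 -> 1 -> 0

K_B = 8.617e-5  # Boltzmann constant in eV/K


def rates(T):
    """Assumed Arrhenius rate functions of the input (temperature T in K)."""
    k1 = 1e3 * np.exp(-0.3 / (K_B * T))
    k2 = 1e5 * np.exp(-0.5 / (K_B * T))
    return k1, k2


def step(x, T, dt):
    """One forward-Euler step: linear in x, input enters through k_i(T)."""
    k1, k2 = rates(T)
    return x + dt * (k1 * N1 + k2 * N2) @ x


def simulate(x0, T_profile, dt):
    """Propagate the probability vector through a sequence of temperatures."""
    x = np.asarray(x0, dtype=float)
    for T in T_profile:
        x = step(x, T, dt)
    return x


def truncate(A, keep):
    """Crude reduction: keep only the rows and columns listed in `keep`."""
    idx = np.asarray(keep)
    return A[np.ix_(idx, idx)]


# Start with all probability in state 0 and hold T = 600 K for 1000 steps.
x_final = simulate([1.0, 0.0, 0.0], T_profile=np.full(1000, 600.0), dt=1e-3)
N1_reduced = truncate(N1, keep=[0, 1])
```

Because the dynamics are linear in x for any fixed input, the unknown matrices enter the model linearly, which is the property the identification approach relies on. Note that a truncated matrix such as `N1_reduced` no longer has columns summing to zero, so a reduced model of this naive kind does not conserve probability exactly.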