Binaural room auralization relies on Binaural Room Impulse Responses (BRIRs). Dynamic binaural synthesis (i.e., head-tracked presentation) requires BRIRs for multiple head poses. Artificial heads can be used to measure BRIRs, but modeling BRIRs from microphone array room impulse responses (RIRs) is becoming popular, since personalized BRIRs can then be obtained for any head pose with little additional effort. We present a novel framework for estimating a binaural signal from microphone array signals, using causal Wiener filtering and polynomial matrix formalism. The formulation places no explicit constraints on the geometry of the microphone array and enables directional weighting of the estimation error. A microphone noise model is used for regularization and to balance filter performance against noise gain. A complete procedure for BRIR modeling from microphone array RIRs, employing the proposed Wiener filtering framework, is also presented. An application example illustrates the modeling procedure using a 19-channel spherical microphone array, with direct and reflected sound segments modeled separately. The modeled BRIRs are compared to measured BRIRs and are shown to be waveform-accurate up to at least 1.5 kHz. At higher frequencies, the modeling instead aims to reproduce the correct statistical properties of the diffuse sound field components. A listening test indicates small perceptual differences from measured BRIRs. The presented method facilitates fast acquisition of BRIR data sets for use in dynamic binaural synthesis and is a viable alternative to Ambisonics-based binaural room auralization.
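
The roles of directional error weighting and noise-based regularization can be illustrated with a deliberately simplified sketch. It is not the causal Wiener filter or the polynomial matrix formulation proposed here: it solves a non-causal, weighted and regularized least-squares problem for a single frequency bin and a single ear. The function name, argument names, and the random test data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def weighted_regularized_filters(A, h, weights, noise_var):
    """Per-bin filter weights w (illustrative simplification, not the paper's method).

    Minimizes  sum_d weights[d] * |h[d] - A[d, :] @ w|^2  +  noise_var * ||w||^2

    A         : (D, M) complex array responses, D directions x M microphones
    h         : (D,)   complex target binaural (one ear) responses per direction
    weights   : (D,)   non-negative directional weights on the estimation error
    noise_var : float  assumed microphone noise variance, acts as regularizer
    """
    A = np.asarray(A, dtype=complex)
    h = np.asarray(h, dtype=complex)
    weights = np.asarray(weights, dtype=float)

    # A^H W, with W = diag(weights): scale each direction (column) by its weight.
    AhW = A.conj().T * weights
    # Normal equations: (A^H W A + noise_var * I) w = A^H W h
    R = AhW @ A + noise_var * np.eye(A.shape[1])
    p = AhW @ h
    return np.linalg.solve(R, p)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D, M = 64, 19  # hypothetical: 64 directions, 19 microphones
    A = rng.standard_normal((D, M)) + 1j * rng.standard_normal((D, M))
    h = rng.standard_normal(D) + 1j * rng.standard_normal(D)
    weights = np.ones(D)  # uniform directional weighting

    for noise_var in (1e-3, 1.0, 100.0):
        w = weighted_regularized_filters(A, h, weights, noise_var)
        err = np.sum(weights * np.abs(h - A @ w) ** 2)
        gain = np.linalg.norm(w) ** 2  # noise gain for uncorrelated sensor noise
        print(f"noise_var={noise_var:g}  weighted error={err:.3f}  noise gain={gain:.4f}")
```

In this simplified setting, increasing `noise_var` lowers the noise gain at the cost of a larger weighted estimation error, which is the trade-off the microphone noise model is meant to balance. Raising individual entries of `weights` shifts accuracy toward the corresponding directions.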