Deep learning has mitigated the high computational cost that traditional methods incur on complex aerodynamics problems, and it is now commonly used to simulate fluid behavior and optimize aircraft design. However, flow field prediction based on deep learning typically encodes the freestream conditions and the geometric information into the neural network model concurrently. This encoding scheme makes it difficult for the model to distinguish and handle the intrinsic differences between these two types of information. As a result, the model captures complex flow field features less well and becomes harder to fit, which in turn reduces its effectiveness. To address these problems, this paper proposes the Operator-Convolution MultiModal Fusion Network (OCMMFNet), a new neural network architecture for predicting the flow fields of airfoils with various geometries and freestream conditions. The proposed architecture uses a freestream generalization network to encode the input freestream conditions. The resulting approximate flow field information is then combined with the airfoil geometry information and fed into a shape feature compensation network to improve the prediction accuracy. We compare the performance of OCMMFNet with that of a deep operator network (DeepONet) and a vision transformer (ViT) model. When generalizing over both freestream conditions and airfoil shapes, OCMMFNet reduces the prediction error in the pressure field by 9.71% and 3.76% compared to DeepONet and ViT, respectively. In tests involving extrapolation of the Reynolds number, OCMMFNet reduces the prediction error in the pressure field by 13.73% and 11.84% compared to DeepONet and ViT, respectively. The results show that OCMMFNet achieves better prediction accuracy than both DeepONet and ViT and displays superior robustness and generalization ability.
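The two-stage data flow described above (freestream conditions first, geometry second) can be illustrated with a minimal sketch. This is not the OCMMFNet implementation. Everything here is an assumption made for illustration: the function names, the 8 x 8 grid, the three normalized freestream inputs, the binary mask used to represent the airfoil, and the choice to apply the second stage as an additive correction. The weights are random and untrained, so only the order of operations and the array shapes are meaningful.

```python
import math
import random

H, W = 8, 8  # hypothetical grid resolution, chosen only for the demo


def freestream_network(freestream, weights):
    """Stage 1 stand-in: map freestream conditions to a coarse H x W field.

    One linear unit with tanh per grid cell replaces the real
    freestream generalization network (assumption, not the paper's layer).
    """
    field = []
    for i in range(H):
        row = []
        for j in range(W):
            w = weights[i * W + j]
            row.append(math.tanh(sum(wk * xk for wk, xk in zip(w, freestream))))
        field.append(row)
    return field


def shape_compensation_network(coarse, mask, kernel):
    """Stage 2 stand-in: 3x3 convolution over [coarse field, geometry mask].

    The convolution output is added to the coarse field as a correction
    (assumed form of "compensation"). Cells outside the grid are skipped,
    which is equivalent to zero padding.
    """
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < H and 0 <= jj < W:
                        acc += kernel[0][di + 1][dj + 1] * coarse[ii][jj]
                        acc += kernel[1][di + 1][dj + 1] * mask[ii][jj]
            out[i][j] = coarse[i][j] + acc
    return out


rng = random.Random(0)
# Untrained placeholder parameters.
weights = [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(H * W)]
kernel = [[[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(3)]
          for _ in range(2)]

# Placeholder normalized freestream inputs (hypothetical values).
freestream = [0.3, 0.05, 0.6]
# Placeholder geometry: a rectangular block of 1s stands in for the airfoil.
mask = [[1.0 if (3 <= i <= 4 and 2 <= j <= 5) else 0.0 for j in range(W)]
        for i in range(H)]

coarse = freestream_network(freestream, weights)           # geometry-agnostic estimate
pred = shape_compensation_network(coarse, mask, kernel)    # geometry-aware correction
print(len(pred), len(pred[0]))
```

The point of the separation is that the first function never sees the geometry and the second never sees the raw freestream values, so each stage handles one type of information.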