Advancements in technology have revolutionized communication on social media platforms, where reviews and comments greatly influence product selection. Current opinion mining methods focus predominantly on textual content, overlooking the rich information in customer-posted images, termed here Multus-Medium. This research introduces an innovative deep learning approach, Multus Medium Opinion Mining (MMOM), which capitalizes on both text and image data for comprehensive product review analysis. MMOM integrates a Bidirectional Long Short-Term Memory (BiLSTM) network with embedded Convolutional Neural Networks (CNNs) based on the GoogleNet and VGGNet architectures, enabling efficient extraction and fusion of textual and visual features. The approach encompasses data collection, preprocessing, feature extraction, fusion strategy-based feature vector generation, and subsequent product recommendation. Performance is evaluated on two diverse real-world datasets, Flickr8k and T4SA, where MMOM shows substantial improvement over existing methods. It outperforms standard benchmark models, achieving accuracy, F1 score, and ROC of 90.38%, 88.75%, and 93.08% on Flickr8k, and 88.54%, 86.34%, and 92.26% on the Twitter (T4SA) dataset, respectively; its accuracy is 7.34% and 9.54% higher than that of the two benchmark techniques. These results highlight the robustness and applicability of MMOM across various domains and underscore its potential as a more holistic and precise approach to opinion mining in the era of social media product reviews. The resulting product recommendations help customers make informed purchasing decisions. Finally, the proposed scheme can be extended to other sentiment-driven tasks such as hospital recommendation, crop farming recommendation, and medical diagnosis systems.
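
To make the described text-image fusion concrete, the following is a minimal PyTorch sketch of a BiLSTM text branch combined with a CNN image branch (VGG16 here; GoogleNet could be swapped in) whose features are concatenated and classified. All layer sizes, the vocabulary size, and the fusion head are illustrative assumptions, not the authors' exact configuration.

```python
# A minimal sketch of the BiLSTM + CNN fusion idea; hyperparameters are assumed.
import torch
import torch.nn as nn
from torchvision import models

class MMOMSketch(nn.Module):
    def __init__(self, vocab_size=20000, embed_dim=300, lstm_hidden=128, num_classes=2):
        super().__init__()
        # Text branch: word embedding followed by a bidirectional LSTM
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, lstm_hidden, batch_first=True, bidirectional=True)
        # Image branch: VGG16 convolutional backbone (use weights="IMAGENET1K_V1" for pretrained)
        self.cnn = models.vgg16(weights=None).features
        self.pool = nn.AdaptiveAvgPool2d((1, 1))  # yields a 512-d image descriptor
        # Fusion head: concatenate text and image features, then classify sentiment
        self.classifier = nn.Sequential(
            nn.Linear(2 * lstm_hidden + 512, 256),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes),
        )

    def forward(self, token_ids, images):
        # Text features: final hidden states of both LSTM directions
        _, (h_n, _) = self.bilstm(self.embedding(token_ids))
        text_feat = torch.cat([h_n[0], h_n[1]], dim=1)      # (B, 2*lstm_hidden)
        # Image features: global-average-pooled CNN activations
        img_feat = self.pool(self.cnn(images)).flatten(1)   # (B, 512)
        # Early fusion by concatenation, followed by classification
        return self.classifier(torch.cat([text_feat, img_feat], dim=1))

# Example forward pass on dummy data
model = MMOMSketch()
tokens = torch.randint(1, 20000, (4, 50))   # batch of 4 reviews, 50 tokens each
images = torch.randn(4, 3, 224, 224)        # matching customer-posted images
print(model(tokens, images).shape)          # torch.Size([4, 2])
```

Concatenation is used here as the simplest fusion strategy; the feature vectors produced by the two branches could equally be combined by weighted or attention-based fusion before the recommendation step.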