Dysregulation of collagen levels in lung tissue is central to understanding how lung diseases progress. However, traditional scoring methods rely on manual histopathological examination, which introduces subjectivity and inconsistency into the assessment process. These methods are further hampered by inter-observer variability, a lack of quantification, and their time-consuming nature. To mitigate these drawbacks, we propose a machine learning-driven framework for automated scoring of lung collagen content. Our study begins with the collection of a lung slide image dataset from adult female mice using second harmonic generation (SHG) microscopy. In our proposed approach, we first manually extracted features based on 46 statistical parameters of fibrillar collagen. We then pre-processed the images and used a pre-trained VGG16 model to extract latent features from them. We combined the image and statistical features to train various machine learning and deep neural network models for classification tasks. We also employed unsupervised techniques, namely K-means, principal component analysis (PCA), t-distributed stochastic neighbour embedding (t-SNE), and uniform manifold approximation and projection (UMAP), to analyse the images for lung collagen content. The trained models were evaluated on the collagen data using both binary and multi-label classification to predict lung cancer in a urethane-induced mouse model. Experimental validation of our proposed approach demonstrates promising results. Using a support vector machine (SVM) model for the binary classification task, we obtained an average accuracy of 83% and an area under the receiver operating characteristic curve (ROC AUC) of 0.96.
For the multi-label classification task, which quantifies structural alterations of collagen, we attained an average accuracy of 73% and ROC AUC values of 1.0, 0.38, 0.95, and 0.86 for the control, baseline, treatment_1, and treatment_2 groups, respectively. These findings show significant potential for machine learning and deep learning models to enhance diagnostic accuracy, deepen the understanding of disease mechanisms, and improve clinical practice.
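To make the per-group ROC AUC values easier to interpret, the following minimal sketch shows one common way such values can be computed: a one-vs-rest scheme in which each group is scored against all others. This is an assumption for illustration; the text above does not state which averaging or decomposition scheme was used. The function names (`binary_auc`, `one_vs_rest_auc`) and the toy scores are hypothetical and are not data from this study.

```python
from typing import Dict, List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(
    y_true: Sequence[str], class_scores: Dict[str, List[float]]
) -> Dict[str, float]:
    """Compute one ROC AUC per group by treating that group as the positive
    class and every other group as negative (assumed scheme)."""
    return {
        cls: binary_auc([1 if y == cls else 0 for y in y_true], scores)
        for cls, scores in class_scores.items()
    }


# Toy example: hypothetical labels and per-class scores, NOT study data.
y_true = [
    "control", "control", "baseline", "baseline",
    "treatment_1", "treatment_1", "treatment_2", "treatment_2",
]
class_scores = {
    "control":     [0.9, 0.8, 0.1, 0.2, 0.05, 0.1, 0.0, 0.1],
    "baseline":    [0.3, 0.4, 0.2, 0.1, 0.5, 0.3, 0.4, 0.6],
    "treatment_1": [0.1, 0.2, 0.3, 0.1, 0.8, 0.7, 0.4, 0.2],
    "treatment_2": [0.1, 0.0, 0.2, 0.3, 0.1, 0.2, 0.9, 0.25],
}

if __name__ == "__main__":
    for group, auc in one_vs_rest_auc(y_true, class_scores).items():
        print(f"{group}: ROC AUC = {auc:.2f}")
```

Under this pairwise reading, an AUC of 0.5 corresponds to chance-level ranking, so a value below 0.5 (such as the 0.38 reported for the baseline group) indicates that samples from that group tend to receive lower scores for their own class than samples from the other groups do.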