The data-driven (DD) approach is a systematic way to improve both the data and the model by deriving and adding features that address problems identified during the iterative loop of forecasting model development. This article proposes a DD framework for forecasting short-term PV generation and load demand. The framework has three stages, each with a distinct contribution: generalised data pre-processing (stage-1), multivariate feature generation and selection (stage-2), and model hyperparameter tuning (stage-3) for further improvement in forecasting. It focuses on the data as well as the forecasting models. The whole process is analysed using measured time series data collected from a real-life demonstration project in Ireland. Data pre-processing is generalised for both generation and demand forecasting under the same framework. Relevant features are selected with the proposed random forest sequential forward feature selection (RF-SFS) algorithm. Hyperparameters are then tuned with the Tree-structured Parzen Estimator (TPE) algorithm. In addition, the performance of the classical ARIMA model is compared with that of the machine learning-based GRU, LSTM, RNN and CNN models. Results show that the data-driven framework systematically improves forecasting model performance. Seasonal variation also has a strong impact on model performance.
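
The feature selection step in stage-2 can be illustrated with a minimal sketch of a generic sequential forward selection loop. This is not the authors' RF-SFS implementation, whose details are not given here. It only assumes that each candidate feature subset can be scored by some error measure where lower is better; in an RF-SFS setting that score would presumably be the validation error of a random forest trained on the subset. The function name, the `min_gain` parameter, the feature names and the `toy_score` error values are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, List, Sequence


def sequential_forward_selection(
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> List[str]:
    """Greedy forward selection. `score` returns an error (lower is better)."""
    selected: List[str] = []
    remaining = list(candidates)
    best = float("inf")
    while remaining:
        # Score every remaining feature when added to the current subset.
        trial_scores = {f: score(selected + [f]) for f in remaining}
        f_best = min(trial_scores, key=trial_scores.get)
        # Stop when the best addition no longer reduces the error enough.
        if best - trial_scores[f_best] <= min_gain:
            break
        selected.append(f_best)
        remaining.remove(f_best)
        best = trial_scores[f_best]
    return selected


def toy_score(features: List[str]) -> float:
    # Hypothetical stand-in for a model's validation error:
    # useful features lower the error, every feature adds a small cost.
    useful = {"irradiance": 5.0, "temperature": 2.0, "hour_of_day": 1.0}
    return 10.0 - sum(useful.get(f, 0.0) for f in features) + 0.1 * len(features)


# Hypothetical candidate features for a PV generation forecast.
candidates = ["humidity", "irradiance", "temperature", "hour_of_day", "noise"]
chosen = sequential_forward_selection(candidates, toy_score)
print(chosen)
```

With the toy scorer, the loop adds features in order of their error reduction and stops once an extra feature would raise the error. Replacing `toy_score` with a function that trains a model on the given columns and returns its validation error would turn the sketch into a wrapper-style selector.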