Short term electric load forecasting plays a crucial role for utility companies, as it allows for the efficient operation and management of power grid networks, optimal balancing between production and demand, as well as reduced production costs. As the volume and variety of energy data provided by building automation systems, smart meters, and other sources are continuously increasing, long short-term memory (LSTM) deep learning models have become an attractive approach for energy load forecasting. These models are characterized by their capabilities of learning long-term dependencies in collected electric data, which lead to accurate prediction results that outperform several alternative statistical and machine learning approaches. Unfortunately, applying LSTM models may not produce acceptable forecasting results, not only because of the noisy electric data but also due to the naive selection of its hyperparameter values. Therefore, an optimal configuration of an LSTM model is necessary to describe the electric consumption patterns and discover the time-series dynamics in the energy domain. Finding such an optimal configuration is, on the one hand, a combinatorial problem where selection is done from a very large space of choices; on the other hand, it is a learning problem where the hyperparameters should reflect the energy consumption domain knowledge, such as the influential time lags, seasonality, periodicity, and other temporal attributes. To handle this problem, we use in this paper metaheuristic-search-based algorithms, known by their ability to alleviate search complexity as well as their capacity to learn from the domain where they are applied, to find optimal or near-optimal values for the set of tunable LSTM hyperparameters in the electrical energy consumption domain. We tailor both a genetic algorithm (GA) and particle swarm optimization (PSO) to learn hyperparameters for load forecasting in the context of energy consumption of big data. The statistical analysis of the obtained result shows that the multi-sequence deep learning model tuned by the metaheuristic search algorithms provides more accurate results than the benchmark machine learning models and the LSTM model whose inputs and hyperparameters were established through limited experience and a discounted number of experimentations.