A Deep CNN-LSTM Framework for Fast Video Coding

Bouaafia, Soulef; Khemiri, Randa; Sayadi, Fatma Ezahra; Atri, Mohamed; Liouane, Noureddine

doi:10.1007/978-3-030-51935-3_22

Cited by 5 publications

(4 citation statements)

References 14 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…These features would then be passed to the LSTM layers that produces an output that accounts for the entire sequence. Convolutions of 2-dimensional data are usually applied to contexts where the elements of the sequence can be represented as an image, such as traffic prediction [42] or video [43].…”

Section: ) Cnn-lstmmentioning

confidence: 99%

Neural Networks for Aircraft Trajectory Prediction: Answering Open Questions About Their Performance

Ayala

Vidal

et al. 2023

IEEE Access

View full text Add to dashboard Cite

The increase in air traffic in the recent years has motivated the development of technologies to monitor air space and warn about possible collisions by predicting the trajectories that will be followed by aircraft. In this field, neural networks have become prominent thanks to their potential to learn to predict maneuvers without providing aspects that are difficult to model such as atmospheric conditions, or detailed aircraft parameters. A variety of models have been proposed; however, these are often tested in very limited setups, leaving many unanswered questions about how they perform in certain conditions, or whether or not their accuracy can be improved by training models for specific trajectories, using additional features, predicting more distant points directly, etc. This may be problematic for researchers or developers of these systems, who have no way of knowing what strategies will yield the best results. We have identified ten open research questions that have not been answered through in-depth testing. This motivated us to carry out a novel experimental study that performs aircraft trajectory prediction with several dozens configuration variants to answer the aforementioned questions by means of a much more complete evaluation. Some of the conclusions of our study stand in contrast with some popular practices in the state of the art, which casts some doubts on the simplicity of their application; for example, differential features are crucial for proper performance but are not mentioned by most studies, while complex, more elaborate models may lead to worse results than simple ones. Other important insights include the benefit from specialized models in more challenging scenarios, the influence of the known trajectory length in said scenarios, the step degradation of predictions when predicting further into the future, or the detrimental effect of adding additional features. These insights should help guide future research about the application of neural networks when it comes to aircraft trajectory prediction and their eventual inclusion in final systems.

show abstract

Section: ) Cnn-lstmmentioning

confidence: 99%

Neural Networks for Aircraft Trajectory Prediction: Answering Open Questions About Their Performance

Ayala

Vidal

et al. 2023

IEEE Access

View full text Add to dashboard Cite

show abstract

“…On the other hand, the search of the optimal CU prediction mode can be modeled as classification problem. In this regard, researchers adopted learning-based methods in classifying CU mode decision in order to reduce the computational complexity [16][17][18][19][20][21][22]. Shen and Yu [16] proposed a CU early termination algorithm for each level of the quadtree CU partition based on weighted SVM.…”

Section: Related Work Overviewmentioning

confidence: 99%

CNN-LSTM Learning Approach-Based Complexity Reduction for High-Efficiency Video Coding Standard

Bouaafia

Khemiri

Amna

et al. 2021

Scientific Programming

Self Cite

View full text Add to dashboard Cite

High-Efficiency Video Coding provides a better compression ratio compared to earlier standard, H.264/Advanced Video Coding. In fact, HEVC saves 50% bit rate compared to H.264/AVC for the same subjective quality. This improvement is notably obtained through the hierarchical quadtree structured Coding Unit. However, the computational complexity significantly increases due to the full search Rate-Distortion Optimization, which allows reaching the optimal Coding Tree Unit partition. Despite the many speedup algorithms developed in the literature, the HEVC encoding complexity still remains a crucial problem in video coding field. Towards this goal, we propose in this paper a deep learning model-based fast mode decision algorithm for HEVC intermode. Firstly, we provide a deep insight overview of the proposed CNN-LSTM, which plays a kernel and pivotal role in this contribution, thus predicting the CU splitting and reducing the HEVC encoding complexity. Secondly, a large training and inference dataset for HEVC intercoding was investigated to train and test the proposed deep framework. Based on this framework, the temporal correlation of the CU partition for each video frame is solved by the LSTM network. Numerical results prove that the proposed CNN-LSTM scheme reduces the encoding complexity by 58.60% with an increase in the BD rate of 1.78% and a decrease in the BD-PSNR of -0.053 dB. Compared to the related works, the proposed scheme has achieved a best compromise between RD performance and complexity reduction, as proven by experimental results.

show abstract

“…Recently, deep learning (a branch of artificial intelligence) has seen great success in computer vision tasks [ 24 , 25 ], especially for video encoding [ 26 – 28 ]. Indeed, deep neural networks have been adopted to improve coding tools, including intra- and inter-prediction, transformation, quantization, and loop filtering for HEVC and VVC standards.…”

Section: Related Workmentioning

confidence: 99%

Deep learning-based video quality enhancement for the new versatile video coding

Bouaafia

Khemiri

Messaoud

et al. 2021

Neural Comput & Applic

Self Cite

View full text Add to dashboard Cite

Multimedia IoT (M-IoT) is an emerging type of Internet of things (IoT) relaying multimedia data (images, videos, audio and speech, etc.). The rapid growth of M-IoT devices enables the creation of a massive volume of multimedia data with different characteristics and requirements. With the development of artificial intelligence (AI), AI-based multimedia IoT systems have been recently designed and deployed for various video-based services for contemporary daily life, like video surveillance with high definition (HD) and ultra-high definition (UHD) and mobile multimedia streaming. These new services need higher video quality in order to meet the quality of experience (QoE) required by the users. Versatile video coding (VVC) is the new video coding standard that achieves significant coding efficiency over its predecessor high-efficiency video coding (HEVC). Moreover, VVC can achieve up to 30% BD rate savings compared to HEVC. Inspired by the rapid advancements in deep learning, we propose in this paper a wide-activated squeeze-and-excitation deep convolutional neural network (WSE-DCNN) technique-based video quality enhancement for VVC. Therefore, we replace the conventional in-loop filtering in VVC by the proposed WSE-DCNN model that eliminates the compression artifacts in order to improve visual quality and hence increase the end user QoE. The obtained results prove that the proposed in-loop filtering technique achieves %, %, and % BD rate reduction for luma and both chroma components under random access configuration. Compared to the traditional CNN-based filtering approaches, the proposed WSE-DCNN-based in-loop filtering framework achieves efficient performance in terms of RD cost.

show abstract

A Deep CNN-LSTM Framework for Fast Video Coding

Cited by 5 publications

References 14 publications

Neural Networks for Aircraft Trajectory Prediction: Answering Open Questions About Their Performance

Neural Networks for Aircraft Trajectory Prediction: Answering Open Questions About Their Performance

CNN-LSTM Learning Approach-Based Complexity Reduction for High-Efficiency Video Coding Standard

Deep learning-based video quality enhancement for the new versatile video coding

Contact Info

Product

Resources

About