Videoconferencing is a technology of the future in modern education. The combination of audio and video information helps viewers understand the content of lectures and presentations delivered by videoconference. Evaluating the quality of videoconferencing is difficult because both image and sound affect the final quality. In general, occasional image disturbances have less impact on perceived quality than disturbances in the audio track. In this research, we simulated a real packet network environment and tested video sequences presenting different teaching content. We artificially degraded the quality of the video sequences by introducing packet loss and jitter. Our test aimed to compare subjective methods of video quality evaluation with objective methods and to evaluate the impact of audio quality on the overall quality of a video sequence. This paper describes a novel process for evaluating the quality of audio and video signals. Time-consuming subjective measurements were supported by models and programs that simplified the preparation, testing, and processing of results. The contribution of this article is to present and evaluate the results of video sequence quality testing with an emphasis on semantics, which has a significant impact on viewers' sensitivity to video sequence quality.