2019 IEEE International Conference on Multimedia and Expo (ICME) 2019
DOI: 10.1109/icme.2019.00289

Tiyuntsong: A Self-Play Reinforcement Learning Approach for ABR Video Streaming

Abstract: Existing reinforcement learning (RL)-based adaptive bitrate (ABR) approaches outperform earlier methods based on fixed control rules by improving the Quality of Experience (QoE) score. However, the QoE metric can hardly provide clear guidance for optimization, which ultimately results in unexpected strategies. In this paper, we propose Tiyuntsong, a self-play reinforcement learning approach with a generative adversarial network (GAN)-based method for ABR video streaming. Tiyuntsong learns strategies automatically by tra…

Cited by 16 publications (9 citation statements)
References 17 publications

“…Note that the output is an n-dim vector indicating the probability of the bitrate being selected under the current ABR state S_k. In this work, we set n = 6, which is widely used in ABR papers [9,14].…”
Section: NN Architecture (mentioning)
confidence: 99%
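
The output described in the statement above can be sketched as a softmax over n = 6 scores, followed by sampling one bitrate index. This is a minimal illustration under stated assumptions, not the citing paper's network: the input logits and the names `bitrate_probabilities` and `select_bitrate` are hypothetical.

```python
import math
import random

N_BITRATES = 6  # n = 6, as stated in the quoted passage


def bitrate_probabilities(logits):
    """Softmax: map n raw scores to a probability vector over bitrate levels."""
    if len(logits) != N_BITRATES:
        raise ValueError(f"expected {N_BITRATES} logits, got {len(logits)}")
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_bitrate(probs, rng=random):
    """Sample a bitrate index according to the policy's probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits standing in for the network's final-layer output.
probs = bitrate_probabilities([0.1, 0.4, 1.2, 2.0, 0.3, -0.5])
choice = select_bitrate(probs, random.Random(0))
```

Sampling is typical during training so the agent explores; at evaluation time, taking the index with the largest probability is a common alternative.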
“…Taking a look from another perspective, we observe that the aforementioned problem can be naturally described as a deterministic goal or requirement [9]. For instance, in the most cases, the goal of the adaptive bitrate (ABR) streaming algorithm is to achieve lower rebuffering time first, and next, reaching higher bitrate [10,13].…”
Section: Introduction (mentioning)
confidence: 99%
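
The two-level goal in the statement above (lower rebuffering first, then higher bitrate) can be read as a lexicographic comparison. The sketch below assumes each session is summarized by two numbers; the `SessionResult` type, its field names, and the sample values are hypothetical and do not come from the cited work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionResult:
    rebuffer_seconds: float   # total stall time in the session
    mean_bitrate_kbps: float  # average selected bitrate


def goal_key(result):
    """Sort key: less rebuffering wins; ties go to the higher bitrate."""
    return (result.rebuffer_seconds, -result.mean_bitrate_kbps)


def better(a, b):
    """Return the session that better satisfies the two-level goal."""
    return a if goal_key(a) <= goal_key(b) else b


# Illustrative values only.
a = SessionResult(rebuffer_seconds=0.0, mean_bitrate_kbps=1200.0)
b = SessionResult(rebuffer_seconds=1.5, mean_bitrate_kbps=2800.0)
c = SessionResult(rebuffer_seconds=0.0, mean_bitrate_kbps=1850.0)
winner = better(better(a, b), c)
```

Here `a` beats `b` despite the lower bitrate because it never stalls, and `c` beats `a` because the rebuffering times tie and `c` has the higher bitrate. A practical comparison would probably treat nearly equal rebuffering times as ties rather than rely on exact float equality.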
“…Nevertheless, such methods are built in pre-assumptions that it is hard to keep its performance in all considered network scenarios [28]. To this end, learning-based ABR algorithms [15,18,28] are proposed to solve the problem from another perspective: it […] [Figure 1: We evaluate quality-aware ABR algorithm and bitrate-aware ABR algorithm with the same video on Norway network traces respectively. Results are plotted as the curves of selected bitrate, buffer occupancy and the selected chunk's VMAF (§5.1, [38]) for entire sessions.]…”
Section: Challenges for Learning-Based ABRs (mentioning)
confidence: 99%
“…Recall that the key principle of RL-based method is to maximize reward of each action taken by the agent in given states per step, since the agent doesn't really know the optimal strategy [45]. However, recent work [6,18,28,36,43,51] has demonstrated that the ABR process can be precisely emulated by an offline virtual player (§6.1) with complete future network information. What's more, by taking several steps ahead, we can further accurately estimate the near-optimal expert policy of any ABR state within an acceptable time (§4.2).…”
Section: Training ABRs via Imitation Learning (mentioning)
confidence: 99%
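
The idea in the statement above, estimating an expert action by looking several steps ahead with known future throughput, can be sketched with a small exhaustive search. Everything here is an assumption made for illustration rather than the citing paper's virtual player: the bitrate ladder, chunk duration, stall penalty, simplified buffer model, linear score, and the names `simulate` and `expert_action`. Throughput values are assumed to be positive.

```python
from itertools import product

BITRATES_KBPS = [300, 750, 1200, 1850, 2850, 4300]  # hypothetical 6-level ladder
CHUNK_SECONDS = 4.0      # assumed chunk duration
REBUFFER_PENALTY = 4.3   # assumed penalty per second of stall


def simulate(plan, buffer_s, future_kbps):
    """Play a plan of bitrate indices against known throughput; return its score."""
    score = 0.0
    for level, throughput in zip(plan, future_kbps):
        download_s = BITRATES_KBPS[level] * CHUNK_SECONDS / throughput
        stall_s = max(0.0, download_s - buffer_s)  # playback stalls once the buffer drains
        buffer_s = max(buffer_s - download_s, 0.0) + CHUNK_SECONDS
        score += BITRATES_KBPS[level] / 1000.0 - REBUFFER_PENALTY * stall_s
    return score


def expert_action(buffer_s, future_kbps, horizon=3):
    """Exhaustively search `horizon` steps ahead; return the best first action."""
    steps = min(horizon, len(future_kbps))
    if steps == 0:
        raise ValueError("need at least one future throughput sample")
    best_plan = max(
        product(range(len(BITRATES_KBPS)), repeat=steps),
        key=lambda plan: simulate(plan, buffer_s, future_kbps[:steps]),
    )
    return best_plan[0]


fast = expert_action(buffer_s=8.0, future_kbps=[10000.0, 10000.0, 10000.0])
slow = expert_action(buffer_s=0.0, future_kbps=[300.0, 300.0, 300.0])
```

The search cost grows as 6^horizon, which is why only a few steps of lookahead are practical in a sketch like this. The (state, expert action) pairs it produces could serve as supervised targets for imitation learning.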