2023
DOI: 10.1007/s11704-023-2689-5

Large sequence models for sequential decision-making: a survey

citations
Cited by 21 publications
(2 citation statements)
references
References 31 publications
“…By conditioning on target returns, the policy can generate actions that closely resemble the behaviors presented in the dataset. Decision Transformers (DT) and its variants (Siebenborn et al 2022; Zheng, Zhang, and Grover 2022; Hu et al 2023; Wen et al 2023) use returns-to-go, i.e. cumulative future returns, as the conditional inputs and model trajectories with causal transformers (Vaswani et al 2017).…”
Section: Return-conditioned Supervised Learning (mentioning)
confidence: 99%
“…In the meantime, the past few years have witnessed huge success in applying sequence modeling to natural language processing (Vaswani et al 2017; Brown et al 2020). In light of the similarity between language sequences and RL trajectories, a lot of works have explored the idea of modeling RL trajectories using sequence modeling approaches (Wen et al 2023). For example, Decision Transformer (DT) (Chen et al 2021) models offline trajectories extended with the sum of the future rewards along the trajectory, namely the return-to-go (RTG).…”
Section: Introduction (mentioning)
confidence: 99%
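
The return-to-go (RTG) mentioned in the statements above is the sum of the rewards from a given timestep to the end of the trajectory. A minimal sketch of that computation follows; the function name `returns_to_go` and the optional `gamma` discount parameter are illustrative assumptions and are not taken from the cited papers. With `gamma = 1.0` it reduces to the plain sum of future rewards described in the quotes.

```python
from typing import List


def returns_to_go(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Return, for each timestep t, the (optionally discounted) sum of rewards from t onward.

    `gamma` is a hypothetical convenience parameter; gamma = 1.0 gives the
    undiscounted sum of future rewards.
    """
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk the trajectory backwards so each entry reuses the suffix sum after it.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


if __name__ == "__main__":
    # Toy three-step trajectory with rewards 1, 2, 3.
    print(returns_to_go([1.0, 2.0, 3.0]))
```

In a return-conditioned setup, each value in the resulting list would be paired with the state and action at the same timestep before the sequence is fed to the model.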