2020
DOI: 10.1007/s12205-020-1489-9

Automatic Guidance System for Long-Distance Curved Pipe-Jacking

Cited by 6 publications (3 citation statements)
References 15 publications
“…One is that the objective function with the average KL divergence constraint makes the smaller update step size, which results in slower learning efficiency, and the other is that the optimization problem with constraints requires higher calculated cost, and the solving process is cumbersome. For these reasons, the PPO method [27] is proposed to replace the constraint with the penalty, which can solve the problem of determining the universal penalty factor and obtain a more concise optimization form, as shown in Formula (7). Applying r_t(θ) as a distance measure between the updated policy and the original policy, the application of clipping eliminates the driving force of r_t(θ) exceeding [1 − ε, 1 + ε], which limit the policy update within a certain range.…”
Section: PPO Algorithm (1) Policy-based Framework | Citation type: mentioning
confidence: 99%
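The clipping described in the statement above can be illustrated with a minimal sketch. Formula (7) of the citing paper is not reproduced on this page, so the code assumes the standard PPO clipped surrogate term, min(r_t(θ)·A_t, clip(r_t(θ), 1 − ε, 1 + ε)·A_t). The function name `clipped_surrogate`, the advantage argument, and the default ε = 0.2 are illustrative assumptions, not values taken from the citing paper.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Per-sample PPO clipped surrogate term (assumed standard form).

    ratio     -- r_t(theta), new-policy probability / old-policy probability
    advantage -- advantage estimate A_t for the same sample
    epsilon   -- clip range; 0.2 is an assumed default, not from the source
    """
    # Restrict the probability ratio to [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Taking the minimum removes any gain from pushing the ratio
    # outside the clip range in the direction the advantage favours.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # Ratio above the range with a positive advantage: the term is capped.
    print(clipped_surrogate(1.5, 1.0))
    # Ratio inside the range: the term is the unclipped product.
    print(clipped_surrogate(1.0, 2.0))
```

Once the ratio leaves the clip range in the favoured direction, the term stops growing, which is the "eliminates the driving force" behaviour the statement refers to.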
“…The laser-based guidance system [5,6] installed on TBM can measure and display TBM attitude and position deviation relative to the designed tunnel axis in real time. The TBM operators set the appropriate control parameters for TBM attitude based on the current attitude and position deviation and the rough geological information from a geological survey using their experience [7,8]. Because of the complex and ever-changing geological environment, the operators cannot obtain enough current geological information according to the geological survey and personal perception.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…Long-distance curved pipe-jacking has significant engineering and economic benefits in tunneling projects crossing densely built-up urban areas, heavily trafficked road sections, and oversized cross-sections [1, 2]. However, the traditional pipe-jacking guidance method relies on the visibility conditions of the environment and does not apply to long-distance curved pipe-jacking.…”
Section: Introduction | Citation type: mentioning
confidence: 99%