This system paper describes the BIT-Xiaomi simultaneous translation system for the AutoSimTrans 2022 simultaneous translation challenge. We participated in three tracks: the Zh-En text-to-text track, the Zh-En audio-to-text track, and the En-Es text-to-text track. In our system, we use the wait-k policy to train prefix-to-prefix translation models. We integrate streaming chunking to detect segmentation boundaries as the source stream is read in. We further improve our system with data selection, data augmentation, and R-Drop training. Results show that our wait-k implementation outperforms the organizer's baseline by up to 8 BLEU points, and that our proposed streaming chunking method yields a further improvement of about 2 BLEU points in the low-latency regime.
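To make the wait-k policy concrete, the following minimal sketch computes its read schedule under the standard definition: before emitting target token t (1-indexed), the model has read min(k + t - 1, |x|) source tokens. The function name `wait_k_schedule` and its parameters are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
def wait_k_schedule(k: int, src_len: int, tgt_len: int) -> list[int]:
    """Return, for each target position t = 1..tgt_len, how many source
    tokens have been read before target token t is emitted.

    Assumes the standard wait-k definition: read k tokens first, then
    alternate one write and one read until the source is exhausted.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # The count is capped at src_len: once the whole source has been read,
    # the remaining target tokens are generated with full context.
    return [min(k + t - 1, src_len) for t in range(1, tgt_len + 1)]


if __name__ == "__main__":
    # Illustrative values: wait-3 on a 5-token source and 6-token target.
    print(wait_k_schedule(3, 5, 6))
```

A smaller k lowers latency because translation starts after fewer source tokens, at the cost of less context per target token; this is the low-latency regime the abstract refers to.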