Physical model building is essential for realizing digital twins in the
manufacturing industry but requires considerable effort. We aim to develop an
automated physical model builder (AutoPMoB) that can automatically build
physical models from literature databases. AutoPMoB requires several
fundamental technologies, and domain-specific datasets play a vital role
in developing such technologies. Although datasets related to variables
have been created, none exists for the chemical engineering
domain. To create such a dataset, in this study we developed an
algorithm for extracting variable symbols from documents and a variable
annotation tool, VARAT, based on the algorithm. Using the tool, we
created a dataset containing 1,733 variable symbols from 45 papers
on physical models of five manufacturing processes. VARAT enables
quick and accurate extraction of variable symbols from documents and
reduces the annotation time per paper to less than half, which
streamlines the annotation process.