“…For BERT + refine (all), we sample uniformly over all valid logical forms. For BERT + refine (1,2,5), BERT + refine (2,4,5), and BERT + refine (4,7,9), logical forms are sampled uniformly from (#1, #2, #5), (#2, #4, #5), and (#4, #7, #9), respectively. All training runs use the same hyper-parameters described above.…”
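
The sampling scheme described in the excerpt can be illustrated with a minimal sketch. Nothing here is taken from the authors' code: the names `REFINE_SUBSETS` and `sample_logical_form` are hypothetical, logical forms are represented only by their integer indices, and the list of valid forms is left for the caller to supply because the excerpt does not state it.

```python
import random
from typing import Dict, Optional, Sequence, Tuple

# Index subsets named in the excerpt. The dictionary keys are hypothetical
# labels chosen for this sketch; None stands for "all valid logical forms".
REFINE_SUBSETS: Dict[str, Optional[Tuple[int, ...]]] = {
    "all": None,
    "1,2,5": (1, 2, 5),
    "2,4,5": (2, 4, 5),
    "4,7,9": (4, 7, 9),
}


def sample_logical_form(
    config: str, valid_forms: Sequence[int], rng: random.Random
) -> int:
    """Draw one logical-form index uniformly at random.

    Assumption: for the restricted configurations the draw is made directly
    from the listed subset; for "all" it is made from `valid_forms`.
    """
    subset = REFINE_SUBSETS[config]
    candidates = list(valid_forms) if subset is None else list(subset)
    if not candidates:
        raise ValueError("no logical forms to sample from")
    # random.Random.choice picks each element with equal probability.
    return rng.choice(candidates)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    # Hypothetical set of valid forms, used only to exercise the function.
    example_valid_forms = [1, 2, 4, 5, 7, 9]
    for name in REFINE_SUBSETS:
        print(name, sample_logical_form(name, example_valid_forms, rng))
```

Passing the random generator explicitly keeps each configuration's draws reproducible, which makes it easier to compare runs that differ only in the subset being sampled.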