“…For each test sentence of $N_W$ words, of which $N_{OOV}$ are out-of-vocabulary (OOV), we compute the following 5 features using each LM (taking inspiration for features from the works of [29,30,12,9,10]): a) $\log(P)/N_W$, the average log-probability of the sentence; b) $\log(P_{OOV})/N_{OOV}$, the average contribution of OOV words to the log-probability of the sentence; c) $(\log(P)-\log(P_{OOV}))/N_W$, the average log-difference between the two probabilities above; d) $N_W - N_{bo}$, where $N_{bo}$ is the number of back-offs applied by the LM to the input sentence (this difference is related to the frequency of n-grams in the sentence that were also observed in the training set); e) $N_{OOV}$, the number of OOVs in the sentence. Note that if word counts $N_W$ or $N_{OOV}$ are equal to zero (i.e.…”
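The five features above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the authors' code: the function name and its inputs (per-word log-probabilities, per-word OOV flags, and a back-off count, all presumed to come from some LM toolkit) are hypothetical, and the fallback of returning 0.0 when $N_W$ or $N_{OOV}$ is zero is an assumption, since the sentence describing that case is truncated in the excerpt.

```python
def sentence_features(word_logprobs, oov_flags, n_backoffs):
    """Compute the 5 LM-based sentence features (a)-(e) described in the text.

    word_logprobs: list of per-word log-probabilities assigned by the LM
    oov_flags:     list of booleans, True where the word is out-of-vocabulary
    n_backoffs:    number of back-offs the LM applied to the sentence (N_bo)
    """
    n_w = len(word_logprobs)                 # N_W: sentence length in words
    n_oov = sum(oov_flags)                   # N_OOV: number of OOV words
    log_p = sum(word_logprobs)               # log(P): total sentence log-prob
    # log(P_OOV): log-prob mass contributed by OOV words only
    log_p_oov = sum(lp for lp, is_oov in zip(word_logprobs, oov_flags) if is_oov)

    f_a = log_p / n_w if n_w else 0.0                 # (a) avg log-probability
    f_b = log_p_oov / n_oov if n_oov else 0.0         # (b) avg OOV contribution
    f_c = (log_p - log_p_oov) / n_w if n_w else 0.0   # (c) avg log-difference
    f_d = n_w - n_backoffs                            # (d) N_W - N_bo
    f_e = n_oov                                       # (e) OOV count
    return (f_a, f_b, f_c, f_d, f_e)
```

For example, a 4-word sentence with two OOVs and one back-off yields one 5-dimensional feature vector per LM; concatenating these vectors across LMs gives the input representation for the downstream classifier.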