“…After encoding the tokens in a sentence, we enumerate all m possible spans J = {j_1, …, j_i, …, j_m}, up to a maximum specified length (in number of tokens), for the sentence s = {w_1, …, w_T} and then re-assign a label y_i ∈ {I, O} to each span j_i. For example, for the sentence "NLP is um important", all possible spans (pairs of start and end indices) are {(1,1), (2,2), (3,3), (4,4), (1,2), (2,3), (3,4), (1,3), (2,4), (1,4)}, and all of these spans are labelled O except (3,3), which is labelled I. We denote b_i and s_i as the start and end indices of span j_i, respectively.…”
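
The span enumeration described in the excerpt can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `enumerate_spans` and `label_spans`, the `max_len` parameter, and the `inside_spans` argument are hypothetical names chosen for illustration. Indices are 1-based and inclusive, matching the excerpt. For the four-token example with a maximum length of 4, the enumeration yields ten spans, including (3,4).

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end), 1-indexed and inclusive


def enumerate_spans(num_tokens: int, max_len: int) -> List[Span]:
    """Return every span of at most max_len tokens, shortest spans first."""
    spans: List[Span] = []
    for length in range(1, max_len + 1):
        # The last valid start index for this length is num_tokens - length + 1.
        for start in range(1, num_tokens - length + 2):
            spans.append((start, start + length - 1))
    return spans


def label_spans(spans: List[Span], inside_spans: Set[Span]) -> List[str]:
    """Label a span "I" if it is in inside_spans (hypothetical gold set), else "O"."""
    return ["I" if span in inside_spans else "O" for span in spans]


tokens = "NLP is um important".split()
spans = enumerate_spans(len(tokens), max_len=4)
labels = label_spans(spans, inside_spans={(3, 3)})

for span, label in zip(spans, labels):
    start, end = span
    # Convert 1-indexed inclusive indices to a Python slice to show the span text.
    print(span, label, " ".join(tokens[start - 1:end]))
```

With T tokens and maximum length L ≤ T, the number of spans is m = T·L − L(L−1)/2, so capping L keeps m linear in T rather than quadratic.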