“…Even here there are limitations, since our lexical items are not easily aligned with those found in other collections. For this reason, we cannot leverage external corpus statistics from, for example, Google or Wikipedia (Bendersky et al., 2011; Bendersky et al., 2010; Bendersky and Croft, 2008; Lease, 2009), or phrases from search logs (Svore et al., 2010).…”