“…In our past research [11], we defined bof(g) as a weighted summation of surficial (TF-IDF) features, latent (LDA) features, and recursive (subgoal) features:

bof(g) = α Σ_{w∈W} tfidf(w, g) e_w + β Σ_{z∈Z} p(z|g) e_z + γ Σ_{g′∈sub(g)} bof(g′)  (3)

where g denotes a public goal, bof(g) denotes the bag-of-features vector of g, and sub(g) denotes the set of subgoals of g. Here, w ∈ W denotes a term, z ∈ Z denotes a latent topic derived by a latent topic model [15], e_w and e_z denote the unit vectors corresponding to w and z, and tfidf(w, g) denotes the TF-IDF, i.e., the product of term frequency and inverse document frequency, of w in the title and description of g. p(z|g) denotes the probability of z given g, 0 ≤ α, β, γ ≤ 1, and α + β + γ = 1. This definition incorporates a latent topic model so that short goal descriptions can be handled, because TF-IDF alone is insufficient for calculating similarities between short texts.…”
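The following is a minimal sketch of how a bag-of-features vector in the spirit of Eq. (3) could be computed, not the authors' implementation. It assumes scikit-learn's TfidfVectorizer and LatentDirichletAllocation as stand-ins for the TF-IDF features and the latent topic model [15]; the goal texts, subgoal links, weights α, β, γ, and the sum-based aggregation of subgoal vectors are illustrative assumptions.

```python
# Sketch of Eq. (3): bof(g) = α·(TF-IDF block) + β·(LDA block) + γ·Σ bof(subgoals).
# All data and parameters below are invented for illustration.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical public goals (title + description) and their subgoal links.
goals = {
    "g1": "Reduce city CO2 emissions by improving public transport",
    "g2": "Improve public transport coverage in suburban areas",
    "g3": "Introduce low-emission buses on major bus routes",
}
subgoals = {"g1": ["g2", "g3"], "g2": [], "g3": []}

ids = list(goals)
texts = [goals[i] for i in ids]
index = {g: i for i, g in enumerate(ids)}

# Surficial features: tfidf(w, g) for every term w in W.
tfidf_vectorizer = TfidfVectorizer()
tfidf = tfidf_vectorizer.fit_transform(texts).toarray()

# Latent features: p(z|g) for every topic z in Z, estimated from term counts.
counts = CountVectorizer(vocabulary=tfidf_vectorizer.vocabulary_).fit_transform(texts)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topics = lda.fit_transform(counts)  # each row is a topic distribution for one goal


def bof(g, alpha=0.4, beta=0.4, gamma=0.2):
    """Bag-of-features vector of goal g: a term block (TF-IDF) and a topic block
    (LDA), combined with the recursively aggregated vectors of its subgoals.
    Assumes the subgoal relation is acyclic."""
    own = np.concatenate([alpha * tfidf[index[g]], beta * topics[index[g]]])
    children = subgoals.get(g, [])
    if not children:
        return own
    # Recursive (subgoal) features: γ times the sum of the subgoals' vectors.
    return own + gamma * np.sum([bof(c, alpha, beta, gamma) for c in children], axis=0)


if __name__ == "__main__":
    v = bof("g1")
    print(v.shape, np.round(v[:5], 3))
```

Concatenating the term and topic blocks keeps the two feature spaces disjoint, which matches the unit-vector form of Eq. (3) above; cosine similarity between such vectors can then compare goals whose short descriptions share few surface terms but similar topics.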