Conversations about emotion between preschoolers and their mothers constitute an important form of shared meaning which, as groundwork for the child's developing theory of mind, could be related to aspects of social-emotional development. A sample of 46 preschoolers and their mothers discussed photographs of infants showing eight emotions; mothers also simulated sadness and anger. Transcribed conversations were coded for frequency and function of emotion language. During both tasks, mothers talked more than their children about emotions; however, the frequencies of emotion utterances which served as unelaborated comments, or to guide the other's behaviour, did not differ for mothers and children. Older children and mothers explained emotions more overall; 4-year-olds commented on the babies' emotions more than 3-year-olds. Mothers' and children's emotion language were related in interpretable ways. Aspects of emotion language emitted by both mothers and children were related to indices of positive social-emotional development, such as emotion knowledge and positivity of emotional displays observed in the preschool.
Sequential item recommendation models are currently compared by calculating metrics on a small item subset (the target set) to speed up computation. The target set contains the relevant item and a set of negative items sampled from the full item set. Two well-known strategies for sampling negative items are uniform random sampling and sampling by popularity, the latter intended to better approximate the item frequency distribution in the dataset. Most recently published papers on sequential item recommendation rely on sampling by popularity to compare the evaluated models. However, recent work has shown that an evaluation with uniform random sampling may not be consistent with the full ranking, that is, the model ranking obtained by evaluating a metric using the full item set as the target set. This raises the question of whether the ranking obtained by sampling by popularity is equal to the full ranking. In this work, we re-evaluate current state-of-the-art sequential recommender models to determine whether these sampling strategies affect the final ranking of the models. To this end, we train four recently proposed sequential recommendation models on five widely known datasets. For each dataset and model, we employ three evaluation strategies. First, we compute the full model ranking. Then we evaluate all models on a target set sampled by the two different sampling strategies, uniform random sampling and sampling by popularity, with the commonly used target set size of 100, compute the model ranking for each strategy, and compare them with each other. Additionally, we vary the size of the sampled target set. Overall, we find that both sampling strategies can produce rankings that are inconsistent with the full ranking of the models. Furthermore, neither sampling by popularity nor uniform random sampling consistently produces the same ranking across different sample sizes.
Our results suggest that, like rankings obtained by uniform random sampling, rankings obtained by sampling by popularity do not equal the full ranking of recommender models, and that both should therefore be avoided in favor of the full ranking when establishing the state of the art.
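The evaluation setup described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`sample_negatives`, `rank_in_target_set`, `hit_rate_at_k`), the use of Hit Rate as the metric, and the dictionary-based score and popularity inputs are all assumptions made for illustration. The sketch draws negative items either uniformly or in proportion to item popularity, and ranks the relevant item within the resulting target set; passing every other item as the negatives corresponds to the full ranking.

```python
import random


def sample_negatives(all_items, positive, popularity, n, strategy, rng):
    """Draw n distinct negative items, never including the relevant item.

    `popularity` maps item -> interaction count (assumed > 0 for every item).
    """
    candidates = [i for i in all_items if i != positive]
    if n > len(candidates):
        raise ValueError("not enough items to sample from")
    if strategy == "uniform":
        return rng.sample(candidates, n)
    if strategy == "popularity":
        # Weighted sampling without replacement: draw one item at a time
        # with probability proportional to its count, then remove it.
        pool = list(candidates)
        weights = [popularity[i] for i in pool]
        chosen = []
        while len(chosen) < n:
            idx = rng.choices(range(len(pool)), weights=weights, k=1)[0]
            chosen.append(pool.pop(idx))
            weights.pop(idx)
        return chosen
    raise ValueError(f"unknown strategy: {strategy}")


def rank_in_target_set(scores, positive, negatives):
    """1-based rank of the relevant item among itself and the negatives."""
    pos_score = scores[positive]
    return 1 + sum(1 for j in negatives if scores[j] > pos_score)


def hit_rate_at_k(ranks, k):
    """Fraction of test cases whose relevant item is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


if __name__ == "__main__":
    rng = random.Random(0)
    items = list(range(1000))
    # Hypothetical toy data: skewed popularity and random model scores.
    popularity = {i: 1 + (1000 // (i + 1)) for i in items}
    scores = {i: rng.random() for i in items}
    positive = 42

    full_negatives = [i for i in items if i != positive]
    print("full rank:", rank_in_target_set(scores, positive, full_negatives))
    for strategy in ("uniform", "popularity"):
        negs = sample_negatives(items, positive, popularity, 100, strategy, rng)
        print(strategy, "rank:", rank_in_target_set(scores, positive, negs))
```

Comparing models then amounts to computing the metric for each model under each strategy and checking whether the resulting model order matches the order obtained with the full item set.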
CCS CONCEPTS • Information systems → Recommender systems; Evaluation of retrieval results.