Purpose
Open innovation communities are a growing trend across diverse industries because they give firms opportunities to collaborate with customers and exploit their knowledge effectively. Although such communities can be strategic assets that help firms innovate, firms nonetheless face information overload that arises from the nature of these communities. The purpose of this paper is to mitigate the problem of information overload in an open innovation environment.
Design/methodology/approach
This study chose MyStarbucksIdea.com (MSI) as a target open innovation community in which customers share their ideas. Using text mining techniques, including TF-IDF and sentiment analysis, the authors analyzed a large data set collected from MSI, considering both term and non-term features of the data set. These features were used to develop classification models that calculate the adoption probability of each idea.
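As a rough illustration of the approach described above, the following sketch computes TF-IDF term weights and combines them with non-term features into a single adoption score. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not name the classifiers used, so the logistic scorer, the sample ideas, the feature names (`votes`, `comments`), and all weights are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors

def adoption_probability(term_vec, non_term, term_weights, non_term_weights, bias=0.0):
    """Logistic score over term (TF-IDF) and non-term features combined."""
    z = bias
    z += sum(term_weights.get(t, 0.0) * w for t, w in term_vec.items())
    z += sum(non_term_weights.get(k, 0.0) * v for k, v in non_term.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical ideas, features, and hand-picked weights, for illustration only.
ideas = [
    "free wifi in every store",
    "bring back pumpkin latte",
    "wifi is slow in store",
]
non_term = [
    {"votes": 0.9, "comments": 0.4},
    {"votes": 0.2, "comments": 0.1},
    {"votes": 0.5, "comments": 0.3},
]
term_weights = {"wifi": 2.0, "latte": -1.0}
non_term_weights = {"votes": 1.5, "comments": 0.5}

vectors = tfidf(ideas)
probs = [
    adoption_probability(v, f, term_weights, non_term_weights, bias=-1.0)
    for v, f in zip(vectors, non_term)
]
```

In practice the weights would be learned from labeled ideas rather than set by hand; the point here is only how term and non-term features can feed one probability.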
Findings
The results showed that term and non-term features play important roles in predicting the adoptability of ideas and that the hybrid classification models achieved the best classification accuracy. In most cases, the precision of the classification models decreased as the number of recommendations increased, while their recall and F1 scores increased.
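The trade-off described above can be reproduced on a toy ranking. The sketch below computes precision, recall, and F1 for the top k recommendations; the ranked list and the set of adopted ideas are made-up values chosen only to show the pattern, not data from the study.

```python
def precision_recall_f1_at_k(ranked_ids, adopted_ids, k):
    """Score the top-k recommendations against the set of adopted ideas."""
    top_k = ranked_ids[:k]
    hits = sum(1 for idea in top_k if idea in adopted_ids)
    precision = hits / k if k else 0.0
    recall = hits / len(adopted_ids) if adopted_ids else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical ranking (best first) and ground truth, for illustration only.
ranked = ["a", "b", "c", "d", "e", "f", "g", "h"]
adopted = {"a", "c", "e", "g"}

results = {k: precision_recall_f1_at_k(ranked, adopted, k) for k in (1, 3, 7)}
for k, (p, r, f1) in results.items():
    print(f"k={k}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

With this toy data, recommending more ideas lowers precision while raising recall and F1, which matches the direction reported in the findings.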
Originality/value
This research addressed the problem of information overload in an open innovation context. A large number of customer opinions from an innovation community were examined, and a recommendation system to mitigate the problem was proposed. Using the proposed system, the firm can obtain recommendations for ideas that could be valuable for its business innovation in the idea generation phase, thereby reducing information overload and enhancing the effectiveness of open innovation.
Clustering is a method for grouping objects with similar patterns and finding meaningful clusters in a data set. A large number of clustering algorithms exist in the literature, and even for a particular algorithm the results vary with its input parameters, such as the number of clusters, field weights, similarity measures, and the number of passes. Thus, it is important to effectively evaluate the clustering results a priori, so that the generated clusters are closer to the real partition. In this paper, an improved clustering validity assessment index is proposed based on a new density function for inter-cluster similarity and a new scatter function for intra-cluster similarity. Experimental results show the effectiveness of the proposed index on the data sets under consideration, regardless of the choice of clustering algorithm.
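To make the idea of a validity index concrete, here is a deliberately simple sketch that compares two candidate partitions by the ratio of intra-cluster scatter to inter-cluster separation. It is not the index proposed in the paper, whose density and scatter functions are not given in this abstract; the function names, the centroid-distance separation, and the sample points are all assumptions for illustration.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def validity(clusters):
    """Lower is better: mean intra-cluster scatter / min centroid distance.

    Assumes at least two non-empty clusters.
    """
    centers = [centroid(c) for c in clusters]
    # Intra-cluster scatter: average distance of points to their own centroid.
    scatter = sum(
        sum(math.dist(p, m) for p in c) / len(c) for c, m in zip(clusters, centers)
    ) / len(clusters)
    # Inter-cluster separation: distance between the two closest centroids.
    separation = min(
        math.dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:]
    )
    return scatter / separation

# Two hypothetical partitions of the same six points.
compact = [[(0, 0), (0, 1), (1, 0)], [(10, 10), (10, 11), (11, 10)]]
mixed = [[(0, 0), (10, 10), (1, 0)], [(0, 1), (10, 11), (11, 10)]]

compact_score = validity(compact)
mixed_score = validity(mixed)
```

An index of this kind can be evaluated for each candidate parameter setting (for example, each number of clusters), and the setting with the best score kept.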