Because forums are rich in information, researchers are increasingly interested in mining knowledge from them. Forum posts and replies are clustered and analyzed to improve users' knowledge of a field. To harvest knowledge from a forum, its contents must first be downloaded. A forum board or thread is usually divided into multiple pages connected by page-flipping links, and a forum site contains several page types, such as entry pages, thread pages, and page-flipping pages. Forum mining has three phases: preprocessing, mining the data with strategies such as clustering, and post-processing. In preprocessing, raw data is transformed into a usable format, mainly by parsing and cleaning. The pages are downloaded as HTML files, which are then parsed to assign attributes such as forum ID, forum title, thread count, and post count. Once parsing is complete, data cleaning is applied to the downloaded post sets to automatically remove noisy and irrelevant data. A clustering algorithm is then applied to the preprocessed data to group the forums into clusters, using all topics and subtopics of the forum. Clustering uses four dimensions per topic: the number of posts, the average sentiment value, the percentage of positive posts, and the percentage of negative posts. The number of posts for a topic is determined by the number of replies to a post. The sentiment value of a topic is derived from user replies and describes user opinion. The positive and negative percentages are also derived from user replies and describe how users perceive the posts; they are further used to identify user attitudes and the pros and cons of the topics discussed in a particular forum. In the post-processing stage, a number of clusters are obtained.
The final clusters group topics with similar sentiment values and user opinions. Based on the sentiment values, the positive and negative posts are clustered for each thread. Information seekers and decision makers can benefit from this clustering, which simplifies the decision-making process.
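The four clustering dimensions described above can be illustrated with a minimal sketch. It assumes each post has already been parsed, cleaned, and assigned a sentiment score in [-1, 1]; the abstract does not name the sentiment scorer or the clustering algorithm, so the plain k-means used here, the min-max scaling step, and all function names and sample data are illustrative assumptions rather than the authors' method.

```python
import random
from collections import defaultdict


def topic_features(posts):
    """Map each topic to (post count, mean sentiment, % positive, % negative).

    `posts` is an iterable of (topic, sentiment) pairs. The sentiment score
    is assumed to come from an upstream scorer that is not shown here.
    Scores of exactly 0 count as neither positive nor negative.
    """
    by_topic = defaultdict(list)
    for topic, sentiment in posts:
        by_topic[topic].append(sentiment)
    features = {}
    for topic, scores in by_topic.items():
        n = len(scores)
        positive = sum(1 for s in scores if s > 0)
        negative = sum(1 for s in scores if s < 0)
        features[topic] = (n, sum(scores) / n,
                           100.0 * positive / n, 100.0 * negative / n)
    return features


def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single dimension dominates."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [tuple((v - lo) / (hi - lo) if hi > lo else 0.0
                  for v, lo, hi in zip(row, lows, highs))
            for row in rows]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (an assumed choice); returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centers[c]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its members; keep the old
        # center if a cluster ends up empty.
        centers = [tuple(sum(col) / len(members) for col in zip(*members))
                   if members else centers[i]
                   for i, members in enumerate(clusters)]
    labels = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
              for p in points]
    return labels, centers


# Hypothetical sample data: (topic, sentiment score of one post or reply).
posts = [
    ("battery", 0.8), ("battery", 0.6), ("battery", -0.1),
    ("camera", 0.9), ("camera", 0.7), ("camera", 0.5),
    ("price", -0.7), ("price", 0.2),
    ("support", -0.9), ("support", -0.6),
]
features = topic_features(posts)
topics = list(features)
labels, centers = kmeans(min_max_scale([features[t] for t in topics]), k=2)
for topic, label in zip(topics, labels):
    print(topic, features[topic], "-> cluster", label)
```

Scaling matters here because the raw post count and the percentages live on very different ranges than the mean sentiment; without it, the distance computation would be driven almost entirely by the larger-valued columns.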
Convolutional neural networks (CNNs) are contemporary deep learning models employed in a wide range of applications. In general, the filter size, number of filters, number of convolutional layers, number of fully connected layers, activation function, and learning rate are among the hyperparameters that significantly determine how well a CNN performs. These hyperparameters are usually selected manually and vary from one CNN model to another depending on the application and dataset. During optimization, a CNN can get stuck in local minima. To overcome this, metaheuristic algorithms are used for optimization. In this work, the CNN structure is first constructed with randomly chosen hyperparameters, which are then optimized using a Genetic Algorithm (GA) and the Particle Swarm Optimization (PSO) algorithm. A CNN with the optimized hyperparameters is used for face recognition. The CNNs optimized with these algorithms are trained with the RMSprop optimizer instead of stochastic gradient descent, which helps the CNN reach the global minimum quickly. It has been observed that optimizing with GA and PSO improves the performance of the CNNs and reduces the time they take to reach the global minimum.
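As a rough illustration of the GA side of this approach, the sketch below evolves a population of hyperparameter settings over the six hyperparameters named above. The candidate values in `SEARCH_SPACE`, the GA settings (population size, elitism, mutation rate), and in particular `surrogate_fitness` are all assumptions made for this example: in the described work the fitness would come from training the CNN and measuring its face recognition performance, which is replaced here by a cheap, hypothetical stand-in so the example runs on its own.

```python
import random

# Hypothetical candidate values; the abstract names the hyperparameters
# but not the ranges that were searched.
SEARCH_SPACE = {
    "filter_size": [3, 5, 7],
    "num_filters": [16, 32, 64, 128],
    "conv_layers": [1, 2, 3, 4],
    "fc_layers": [1, 2, 3],
    "activation": ["relu", "tanh", "sigmoid"],
    "learning_rate": [1e-4, 1e-3, 1e-2],
}


def random_individual(rng):
    """Pick one random value per hyperparameter."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def surrogate_fitness(individual):
    """Stand-in for 'train the CNN and return validation accuracy'.

    This is NOT a real accuracy: it only scores how many settings match an
    arbitrary target configuration, so the search has something to climb.
    """
    target = {"filter_size": 3, "num_filters": 64, "conv_layers": 3,
              "fc_layers": 2, "activation": "relu", "learning_rate": 1e-3}
    return sum(individual[k] == v for k, v in target.items()) / len(target)


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return {name: rng.choice((parent_a[name], parent_b[name]))
            for name in SEARCH_SPACE}


def mutate(individual, rng, rate=0.1):
    """Resample each gene from its candidate list with probability `rate`."""
    return {name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
            for name, value in individual.items()}


def genetic_search(fitness, pop_size=20, generations=15, elite=2, seed=0):
    """Return (best individual, best fitness recorded at each generation)."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]  # truncation selection
        children = []
        while len(children) < pop_size - elite:
            parent_a, parent_b = rng.sample(parents, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        # Elitism: the top individuals survive unchanged into the next round.
        population = ranked[:elite] + children
    return max(population, key=fitness), history


best, history = genetic_search(surrogate_fitness)
print("best hyperparameters:", best)
print("best fitness per generation:", history)
```

Because the best individuals are carried over unchanged, the recorded best fitness can never decrease from one generation to the next. A PSO variant would keep the same search space and fitness function but replace crossover and mutation with velocity updates toward each particle's best position and the swarm's best position.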