ABSTRACT Large data collections, known as big data, can be analyzed with various techniques. One technique for processing big data is unsupervised learning. Various algorithms apply this technique, each with its own approach and characteristics. This study focuses on developing an algorithm that implements unsupervised learning, namely K-Means, using sample data from people who run creative, independent businesses and market them through both online and offline channels. The researchers conducted experiments and simulations of the algorithm, producing a software application together with tables and graphs that combine data obtained from social media with data from offline questionnaires. The results of the data analysis can be used by the community as a DSS (Decision Support System) when making decisions about the further marketing development of their products.
A large amount of Mustahiq data is now held by LAZ (Lembaga Amil Zakat) institutions spread across many locations. Each LAZ stores Mustahiq data in types that differ from those of other LAZ, so this large body of data cannot be used jointly even though its purpose is the same: identifying Mustahiq. Determining whether the Mustahiq data is still up to date is also very difficult because of the non-uniform data types, the long time span, and the large volume of data. Distributing zakat to Mustahiq requires timely information, so LAZ find it difficult to monitor the progress of each Mustahiq; a Mustahiq may, for example, change status and become a Muzaki. This motivated the researchers to take up this theme in order to help existing LAZ cluster Mustahiq data more easily. The clustered data can then be used by LAZ managers to develop the organization and can serve as a reference for determining which cluster of recipients is entitled to zakat. The research is titled "Modeling using K-Means Algorithm and Big Data analysis in determine Mustahiq data". Mustahiq data were obtained by random sampling through online and offline surveys: the online survey used Google Forms, and the offline survey data came from BAZNAS (the National Amil Zakat Agency) in Indonesia and from another zakat agency (LAZ) in Jakarta. The combined data were analyzed using Big Data analysis and the K-Means algorithm. K-Means is an algorithm that clusters n objects, based on their attributes, into k partitions according to criteria to be determined from the large and diverse Mustahiq data. This research focuses on modeling that applies the K-Means algorithm and Big Data analysis. First, we built tools for grouping the simulation test data. We then ran several experimental and simulation scenarios to find the best model for mapping and processing Mustahiq data. The results are presented in tabular and graphical form as the proposed Mustahiq data processing model for zakat agencies (LAZ). In the simulation, of a total of 1109 respondents, 300 were assigned to the Mustahiq cluster and 809 to the Non-Mustahiq cluster, with an accuracy rate of 83.40%, which indicates that the model is able to identify Mustahiq data. Filtering the results by attribute gives an accuracy of 83.93% for Gender "Male", 71.03% for Age "30-39", 83.39% for Job "PNS", and 83.79% for Education "S1". The research is expected to make it possible to determine quickly whether a person meets the criteria of a Mustahiq or a Muzaki for LAZ (Amil Zakat Agency). The resulting K-Means clustering application could also be used if UIN Syarif Hidayatullah Jakarta wants to develop its own LAZ.
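As a rough illustration of the clustering step this abstract describes, the following minimal Python sketch partitions n records into k = 2 clusters with plain K-Means (Lloyd's algorithm) and scores the result against known labels. The two features (scaled monthly income and number of dependents), the toy records, the labels, and the helper names `kmeans` and `accuracy` are all hypothetical; the abstract does not specify the study's attributes, initialization, or tooling.

```python
import math

def kmeans(points, k, init_centroids, max_iter=100):
    """Plain K-Means (Lloyd's algorithm) with caller-supplied initial centroids."""
    centroids = [tuple(c) for c in init_centroids]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignments, centroids

def accuracy(predicted, actual):
    """Share of records whose cluster index equals the known label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical, pre-scaled records: (monthly income, number of dependents).
records = [(0.10, 0.80), (0.15, 0.90), (0.20, 0.70),
           (0.80, 0.20), (0.90, 0.10), (0.85, 0.30)]
# Hypothetical ground truth: 0 = Mustahiq, 1 = Non-Mustahiq.
labels = [0, 0, 0, 1, 1, 1]

# Seeding one centroid per expected group keeps cluster index and label aligned.
clusters, centers = kmeans(records, k=2, init_centroids=[records[0], records[3]])
print(clusters, accuracy(clusters, labels))
```

Because K-Means is unsupervised, cluster indices carry no meaning on their own; this sketch sidesteps the problem by seeding the centroids from one record of each expected group, which is an assumption of the example rather than something the abstract describes.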
Traffic congestion in Indonesia's big cities is unavoidable, especially in Jakarta. The increasing number of vehicles and the lack of public transportation are the main causes of congestion in Jakarta, which disrupts people's activities. The government has made various efforts to resolve the congestion problem, but these involve high installation and maintenance costs and take time to implement. People often complain about traffic congestion in Jakarta by posting on Twitter; these posts are called tweets. Every tweet is available through the Twitter API and can be used for sentiment analysis, which analyzes the emotion of the user. Based on these problems, we investigate how to detect traffic congestion in Jakarta and build a Congestion Detection App. We designed the app using UML diagrams. The Congestion Detection App is connected to Hadoop, Flume, Hive, and Derby, and it streams and collects Twitter data by connecting to the Twitter API. It is a Java-based application that can create and view data tables. It searches tweet data by ID and analyzes the traffic condition in a given region of Jakarta. It also performs sentiment analysis on a given tweet and displays the result based on the data table. To evaluate the app, we compared its data with data from Google Maps. We defined three value categories, each with a color: green for light traffic congestion, with a value of 1; orange for medium traffic congestion, with a value of 2; and red for heavy traffic congestion, with a value of 3. Using these categories and values, we sampled four regions and compared the app's values with the values from Google Maps to obtain the accuracy. The average accuracy across the four samples was 81%. In this comparison of tweet sample data with Google Maps data, the Congestion Detection App detected most of the congestion.
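The comparison against Google Maps could be sketched as follows. The color-to-value mapping (green = 1, orange = 2, red = 3) comes from the abstract; everything else is an assumption made for illustration. In particular, the abstract does not say how accuracy was computed, so this sketch assumes a simple match rate per region averaged over regions, and the region names and observations are invented and do not reproduce the study's data or its 81% figure.

```python
# Color-to-value mapping taken from the abstract.
LEVEL = {"green": 1, "orange": 2, "red": 3}

def region_accuracy(app_colors, maps_colors):
    """Share of observations where the app's level equals the Google Maps level.

    Assumed metric: the abstract does not define how accuracy was computed.
    """
    matches = sum(LEVEL[a] == LEVEL[m] for a, m in zip(app_colors, maps_colors))
    return matches / len(maps_colors)

# Hypothetical observations per region: (app colors, Google Maps colors).
samples = {
    "Region A": (["red", "red", "orange", "green"], ["red", "red", "orange", "orange"]),
    "Region B": (["orange", "green", "green", "red"], ["orange", "green", "green", "red"]),
    "Region C": (["red", "orange", "orange", "green"], ["red", "red", "green", "green"]),
    "Region D": (["green", "green", "orange", "red"], ["green", "orange", "orange", "red"]),
}

per_region = {name: region_accuracy(app, maps) for name, (app, maps) in samples.items()}
average = sum(per_region.values()) / len(per_region)

for name, acc in per_region.items():
    print(f"{name}: {acc:.0%}")
print(f"Average accuracy: {average:.2%}")
```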
Text classification is the process of assigning a text to the correct label. In natural language processing it is a challenging task that requires accuracy to produce correct results. Manual text classification tends to be inefficient because it requires a lot of time as well as experts. Using machine learning for automatic text classification can solve this problem. KNN, Naive Bayes, and SVM are among the most widely used algorithms for classification problems, especially text classification. In this study, we compare the KNN, Naive Bayes, and SVM algorithms for text classification on the problem of classifying movie genres from a synopsis, using datasets obtained from Kaggle.com and the IMDB Dataset. The results indicate that, across the 12 experiments, Support Vector Machine (SVM) is the best-performing algorithm, with accuracies of 90%, 93%, 65%, and 63%. It is hoped that this research can help in choosing the best algorithm for text classification.
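To make the task concrete, here is a minimal pure-Python sketch of one of the three compared algorithms, multinomial Naive Bayes with add-one smoothing, applied to genre classification from a synopsis. It is not the study's pipeline: the four training synopses, the two genre labels, the whitespace tokenizer, and the function names are invented for illustration, and KNN and SVM are not shown.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()

def train_naive_bayes(docs):
    """docs: list of (synopsis, genre). Returns the counts needed for prediction."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, genre in docs:
        class_counts[genre] += 1
        for word in tokenize(text):
            word_counts[genre][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict(model, text):
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_genre, best_score = None, -math.inf
    for genre in sorted(class_counts):  # sorted for deterministic tie-breaking
        # Log prior plus log likelihoods with add-one (Laplace) smoothing.
        score = math.log(class_counts[genre] / total_docs)
        total_words = sum(word_counts[genre].values())
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log(
                    (word_counts[genre][word] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_genre, best_score = genre, score
    return best_genre

# Hypothetical training synopses, invented for illustration only.
train = [
    ("a detective hunts a killer in the city", "thriller"),
    ("a spy uncovers a deadly conspiracy", "thriller"),
    ("two friends fall in love one summer", "romance"),
    ("a wedding planner finds love again", "romance"),
]
model = train_naive_bayes(train)
print(predict(model, "a killer conspiracy in the city"))
print(predict(model, "friends find love at a wedding"))
```

A comparison like the one in the abstract would train each algorithm on the same training split and report accuracy on a held-out test split; a library such as scikit-learn would be the more usual choice for that than hand-written code.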