The purpose of compression is to reduce redundant data as much as possible and to speed up data transmission. To address the size problem in storing and transmitting data, we use the Run Length Encoding and Fibonacci Code algorithms. Both are lossless data compression algorithms, and in this research their performance is measured by four comparison parameters: Compression Ratio (CR), Redundancy (RD), Space Saving (SS), and compression time. Compression is applied only to image files in Bitmap format (*.bmp), which are encoded with either Run Length Encoding or Fibonacci Code. The result is a file with the extension *.rle or *.fib that contains the compressed information and can be decompressed again. The output of decompression is the original image file, stored with the *.bmp extension. Based on tests on two different types of images, each algorithm has its own advantage: the Fibonacci Code algorithm gives a smaller compressed size for color images, while Run Length Encoding gives a smaller compressed size for grayscale images.
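The two encodings named above can be illustrated with a minimal sketch. It is not the implementation used in this research: the (count, value) byte-pair layout for Run Length Encoding, the bit-string representation of Fibonacci code words, and all function names are assumptions made for illustration. The two metric helpers use common textbook definitions (ratio = original / compressed, space saving = 1 - compressed / original), which may differ from the definitions of CR and SS used in the study.

```python
def rle_encode(data: bytes) -> bytes:
    """Run Length Encoding as (count, value) byte pairs; runs are capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Invert rle_encode by expanding each (count, value) pair."""
    out = bytearray()
    for k in range(0, len(encoded), 2):
        out += bytes([encoded[k + 1]]) * encoded[k]
    return bytes(out)


def fib_encode(n: int) -> str:
    """Fibonacci code word for an integer n >= 1, returned as a bit string."""
    if n < 1:
        raise ValueError("Fibonacci coding is defined for n >= 1")
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs.pop()  # drop the first Fibonacci number greater than n
    bits = ["0"] * len(fibs)
    # Greedy Zeckendorf decomposition, largest Fibonacci number first.
    for idx in range(len(fibs) - 1, -1, -1):
        if fibs[idx] <= n:
            bits[idx] = "1"
            n -= fibs[idx]
    return "".join(bits) + "1"  # the extra 1 creates the "11" terminator


def fib_decode(bits: str) -> list:
    """Decode a concatenation of Fibonacci code words back into integers."""
    values, total, prev = [], 0, "0"
    a, b = 1, 2
    for bit in bits:
        if bit == "1" and prev == "1":  # "11" marks the end of a code word
            values.append(total)
            total, prev, a, b = 0, "0", 1, 2
            continue
        if bit == "1":
            total += a
        a, b = b, a + b
        prev = bit
    return values


def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Assumed definition: original size divided by compressed size."""
    return original_size / compressed_size


def space_saving(original_size: int, compressed_size: int) -> float:
    """Assumed definition: fraction of the original size that was removed."""
    return 1 - compressed_size / original_size
```

Run Length Encoding shrinks data only when neighbouring bytes repeat, whereas Fibonacci coding gives short code words to small integers, so its benefit depends on how symbols are mapped to integers (for example by frequency rank, which this sketch leaves out).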
Clustering is an unsupervised method that automatically groups objects based on their similarity. Clustering accuracy is determined by the number of similar objects placed in the correct cluster. Robust preprocessing and the choice of clustering algorithm can improve clustering efficiency. The objective of this study is to identify the most suitable method for clustering documents in Bahasa Indonesia. We tested several clustering algorithms, namely K-Means, K-Means++, and Agglomerative clustering, with various preprocessing stages and recorded the accuracy of each algorithm. The experiments were conducted on a corpus of 100 documents in Bahasa Indonesia with a commonly used preprocessing scenario. In addition, we applied our own preprocessing stages: LSA, TF-IDF, and LSA combined with TF-IDF. We tested LSA dimension reduction values from 10% to 90%, and the results show that the best reduction rates lie between 50% and 80%. The results also indicate that the K-Means++ algorithm produces better purity values than the other algorithms.
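Purity, the measure used above to compare the algorithms, can be computed as in the following minimal sketch. It assumes the standard definition (each cluster is credited with its most frequent true label, and the credited counts are divided by the total number of documents); the function name and the toy labels are hypothetical and are not taken from the study's corpus.

```python
from collections import Counter


def purity(cluster_ids, true_labels):
    """Fraction of documents that carry the majority label of their cluster."""
    if len(cluster_ids) != len(true_labels) or len(cluster_ids) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Count how often each true label occurs inside each cluster.
    by_cluster = {}
    for cluster, label in zip(cluster_ids, true_labels):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    # Credit every cluster with the size of its largest label group.
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(true_labels)


# Toy example: two clusters of three documents, each with one misplaced document.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["sport", "sport", "politics", "politics", "politics", "sport"]
score = purity(clusters, labels)  # 4 of 6 documents match their cluster's majority label
```

A purity of 1.0 means every cluster contains documents of a single class. Because purity rises as the number of clusters grows, comparisons between algorithms are meaningful only when the cluster count is held fixed.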