Rapidly gaining information superiority is vital when fighting an enemy, but current computer forensics tools, which require file headers or a working file system to function, do not enable us to quickly map out the contents of corrupted hard disks or other fragmented storage media found at crime scenes. The lack of proper tools slows down the hunt for information, which would otherwise help in gaining the upper hand against IT based perpetrators. To address this problem, this paper presents an algorithm which allows categorization of data fragments based solely on their structure, without the need for any meta data. The algorithm is based on measuring the rate of change of the byte contents of digital media and extends the byte frequency distribution based Oscar method presented in an earlier paper. The evaluation of the new method shows a detection rate of 99.2 %, without generating any false positives, when used to scan for JPEG data. The slowest implementation of the algorithm scans a 72.2 MB file in approximately 2.5 seconds and scales linearly.
This paper proposes a method, called Oscar, for determining the probable file type of binary data fragments. The Oscar method is based on building models, called centroids, of the mean and standard deviation of the byte frequency distribution of different file types. A weighted quadratic distance metric is then used to measure the distance between the centroid and sample data fragments. If the distance falls below a threshold, the sample is categorized as probably belonging to the modelled file type. Oscar is tested using P E G pictures and is shown to give a high categorization accuracy, i.e. high detection rate and low false positives rate. By using a practical example we demonstrate how to use the Oscar method to prove the existence of known pictures based on fragments of them found in RAM and the swap partition of a computer.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.