More than 300 individuals from 150 organizations across 26 countries and regions use the NIST-released Media Forensic Challenge (MFC) datasets for their research. The MFC datasets were created for use in the DARPA MediFor (Media Forensics) program. Since their release, multiple questions have been fielded regarding the dataset properties, including contents, metadata definitions, usage, and data repurposing. For example: What do the datasets contain? What are the definitions of the different kinds of metadata? How does one label the data with the reference information to build training data for machine learning algorithms? How would one modify or extract the data for one's own research purposes? This document serves as a user guide for the MFC datasets, including those used in the Nimble Challenge (NC). This guide includes: 1) a description of the MFC datasets, including background, evolution history, and a dataset summary by evaluation task; 2) user access and permissions for the MFC datasets; 3) an introduction to the MFC data through a simple example of a manipulation journal graph and its corresponding MFC dataset reference files; 4) an introduction to a flexible subset selection approach, "Selective Scoring," for sampling test probes from the entire test set for a particular task evaluation; 5) information to help users gain a deeper understanding of the metadata, presented through two commonly used approaches for illustrating the histogram distributions of manipulation operation statistics; and 6) a general template of the NIST MFC evaluation dataset to facilitate future dataset generation.
Interest in forensic techniques capable of detecting many different media manipulation types has been growing, and system development with machine learning technology has been evolving in recent years. There has been, however, a lack of diversity in the data collections and in the evaluation methodologies for advancing multimedia forensics technologies. For the forensics research community, a well-defined evaluation is necessary to rapidly measure the accuracy and robustness of systems over diverse datasets collected under various environments. In this paper, we propose an evaluation framework and associated performance metrics and apply them to the 2018 Multimedia Forensics Challenge (MFC18). The MFC18 evaluation consists of five tasks and two challenges. A large number of datasets were created to support each task and to conduct the experiments within a structured evaluation framework. A total of 25 teams participated in the MFC18 evaluation; we analyse their performance on the tasks and challenges, and provide performance rankings for each team’s best-performing system.
With the development of storage, transmission, editing, and sharing tools, forged digital images are propagating rapidly, and the need for image provenance analysis has never been greater. Typical applications are content tracking, copyright enforcement, and forensic reasoning. However, large-scale image provenance datasets that contain diverse manipulation history graphs with various manipulation operations and rich metadata are still needed to facilitate the research. Their absence is one of the major factors hindering the development of techniques for image provenance analysis. To address this issue, we introduce large-scale benchmark datasets for provenance analysis, namely the Media Forensics Challenge-Provenance (MFC-Prov) datasets. Two provenance tasks are designed along with evaluation metrics. Furthermore, system performance on our datasets is analysed extensively in terms of accuracy.