Purpose
The purpose of this study is to demonstrate how law enforcement agencies could convert existing volumes of big-city crime data into useful information using readily available data warehouse and OLAP technologies. In the post-9/11 era, criminal data collection by law enforcement agencies has received significant attention across the world. Rapid advances in technology have made it possible to collect and store these data in large volumes, but the data often go unanalyzed because of improper data formats, lack of technological knowledge and lack of time. Data warehousing (DW) and On-line Analytical Processing (OLAP) tools can be used to organize and present these data in a form that is strategically meaningful to the general public. In this study, the authors took a seven-month sample of crime data from the City of Houston Police Department’s website, then cleaned and organized it into a data warehouse with the aim of answering common questions about crime statistics in a big city in the USA.
Design/methodology/approach
The raw data for the seven-month period were collected from the website as one Microsoft Excel spreadsheet per month. The data were cleaned, described, renamed and formatted, then imported into a compiled Access database along with the definitions of facts and dimensions in a star schema. The data were then transferred to a Microsoft SQL Server data warehouse. SQL Server Analysis Services and the Visual Studio Business Intelligence tools were used to create a data cube for OLAP analysis of the summarized data.
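The star-schema step can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than the Access and SQL Server stack named above, and every table name, column name and data row is a hypothetical placeholder, not the authors' actual design or data.

```python
import sqlite3

# Hypothetical star schema: one fact table keyed to three dimension tables.
# All names are illustrative assumptions, not taken from the study.
SCHEMA = """
CREATE TABLE dim_time     (time_id INTEGER PRIMARY KEY, hour INTEGER);
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, street TEXT, premise TEXT);
CREATE TABLE dim_offense  (offense_id INTEGER PRIMARY KEY, offense_type TEXT);
CREATE TABLE fact_crime (
    time_id       INTEGER REFERENCES dim_time(time_id),
    location_id   INTEGER REFERENCES dim_location(location_id),
    offense_id    INTEGER REFERENCES dim_offense(offense_id),
    offense_count INTEGER NOT NULL
);
"""


def build_warehouse():
    """Create an in-memory database holding the empty star schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def load_sample(conn):
    """Insert a few made-up rows so the joins below return something."""
    conn.executemany("INSERT INTO dim_time VALUES (?, ?)", [(1, 8), (2, 22)])
    conn.executemany(
        "INSERT INTO dim_location VALUES (?, ?, ?)", [(1, "MAIN ST", "Residence")]
    )
    conn.executemany(
        "INSERT INTO dim_offense VALUES (?, ?)", [(1, "Theft"), (2, "Burglary")]
    )
    conn.executemany(
        "INSERT INTO fact_crime VALUES (?, ?, ?, ?)", [(1, 1, 1, 3), (2, 1, 2, 1)]
    )
    conn.commit()


conn = build_warehouse()
load_sample(conn)

# Join the fact table back to its dimensions to read descriptive labels.
rows = conn.execute(
    """
    SELECT o.offense_type, t.hour, f.offense_count
    FROM fact_crime AS f
    JOIN dim_offense AS o ON o.offense_id = f.offense_id
    JOIN dim_time    AS t ON t.time_id    = f.time_id
    ORDER BY t.hour
    """
).fetchall()
```

Keeping descriptive attributes in the dimension tables and only keys and measures in the fact table is what lets an OLAP cube aggregate the same counts along time, location or offense type.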
Findings
To demonstrate the usefulness of the DW and OLAP cube, the authors present a few sample queries showing the number and types of crimes as a function of time of day, location, premises, etc. For example, the authors found that 98 crimes occurred on a major street in the city during the early working hours (between 7 am and 12 pm), when virtually nobody was at home, and that roughly two-thirds of those crimes were thefts. Summarized information of this kind is useful to both the general public and law enforcement agencies.
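A query of the kind described above amounts to a slice on street and time window followed by a roll-up by offense type. The sketch below shows that pattern with Python's built-in sqlite3 module. The street name, the rows and the counts are invented for illustration and do not reproduce the study's data or its 98-crime result.

```python
import sqlite3

# Made-up rows for illustration only: (street, hour, offense_type, offense_count).
SAMPLE_ROWS = [
    ("MAIN ST", 8, "Theft", 2),
    ("MAIN ST", 10, "Theft", 2),
    ("MAIN ST", 11, "Burglary", 2),
    ("MAIN ST", 22, "Robbery", 1),
    ("OAK AVE", 9, "Theft", 3),
]


def offenses_in_window(rows, street, start_hour, end_hour):
    """Slice on street and hour range, then roll up counts by offense type."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE crime "
        "(street TEXT, hour INTEGER, offense_type TEXT, offense_count INTEGER)"
    )
    conn.executemany("INSERT INTO crime VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT offense_type, SUM(offense_count)
        FROM crime
        WHERE street = ? AND hour >= ? AND hour < ?
        GROUP BY offense_type
        ORDER BY offense_type
        """,
        (street, start_hour, end_hour),
    ).fetchall()
    conn.close()
    return dict(result)


# Crimes on the hypothetical street between 7 am (inclusive) and 12 pm (exclusive).
totals = offenses_in_window(SAMPLE_ROWS, "MAIN ST", 7, 12)
theft_share = totals["Theft"] / sum(totals.values())
```

In an OLAP cube the same question would typically be answered by selecting members of the location and time dimensions and reading the aggregated measure; the SQL form is used here only because it runs without a cube server.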
Research limitations/implications
The authors’ research is limited to one city’s crime data, and the structure of that data set may differ from those of other cities. In addition to the volume of data and the lack of descriptions, the major limitations encountered were the absence of major neighborhood names and of their relation to streets. Other government agencies provide data of this kind, and a standard data set would facilitate the process. The authors also examined data for a seven-month period only. Analyzing data over many years would reveal longer-term trends in crime statistics.
Practical implications
Many federal, state and local law enforcement agencies are rapidly embracing technology to publish crime data through their websites. However, more attention needs to be paid to the quality and utility of this information for the general public. At the time of writing, there is no compiled source of crime data or of its trends as a function of time, crime type, location and premises. A coherent system is needed that allows an average citizen to obtain this information in a more consumable package. DW and OLAP tools can provide this information package.
Social implications
Having a big city's crime data in a consumable form is immensely useful to all segments of the constituency that government agencies serve, and providing it is likely to become a service that these offices are expected to deliver on demand. The information could also help decision makers, from those seeking to start a business to those seeking a place to live, who may not know which neighborhoods or parts of the city are more prone to criminal activity than others.
Originality/value
While there have been a few reports on the possible use of DW and OLAP technologies to study criminal data, the authors found that few of them used actual crime data, that the data sets and formats differ from case to case, that results are not presented in most cases and that the vendor technologies implemented can differ as well. In this paper, the authors show how DW and OLAP tools readily available in most enterprises can be used to analyze publicly available criminal data sets and convert them into meaningful information, which can be valuable not only to law enforcement agencies but also to the public at large.