Map Sandbox Project

A sandbox for designing and implementing comparative thematic maps.

Resources

Over the past few months I’ve been collecting useful resources as I study crime mapping and learn how to build maps. This post contains an updated list of web links, books, and tools.

Crime mapping references

Crime mapping is an active area of research in the broader field of criminology. There are a couple of really nice review articles that have become standard references (as far as I can tell, as an outsider). Both of these were published by the National Institute of Justice (NIJ):

K.D. Harries, ‘Mapping crime: principles and practice’, U.S. Dept. of Justice, National Institute of Justice, Washington D.C. (1999). [pdf]

J.E. Eck, S. Chainey, J.G. Cameron, M. Leitner, R.E. Wilson, ‘Mapping crime: understanding hot spots’, U.S. Dept. of Justice, National Institute of Justice, Washington D.C. (2005). [pdf]

I also found the following article by J. Ratcliffe useful:

J. Ratcliffe, ‘Crime Mapping: Spatial and Temporal Challenges’, in Handbook of Quantitative Criminology (Eds. A.R. Piquero & D. Weisburd), Springer (2010). [pdf]

S. Chainey and J. Ratcliffe have co-authored one of the definitive books on crime mapping:

S. Chainey and J. Ratcliffe, ‘GIS and Crime Mapping’, Wiley (2007). [amazon]

The book I’ve found most useful, however, is an ArcGIS tutorial on crime mapping by W.L. Gorr and K.S. Kurland. One of the challenges with learning the core standard practices of this field as an outsider is finding reference data that one can use to reproduce the calculations being described. This tutorial is the only source I found that provides such data (unfortunately, the Esri license agreement prohibits me from redistributing the data).

W.L. Gorr and K.S. Kurland, ‘GIS Tutorial for Crime Analysis’, ESRI Press (2011). [amazon]

Finally, another useful reference is a book on infographics by I. Meirelles. It’s an excellent book, and has a great chapter on map visualizations:

I. Meirelles, ‘Design for Information’, Rockport Publishers (2013). [amazon]

Software / tools

One of the most widely used geographical information systems is ArcGIS. It has turnkey solutions for all of the common geographical mapping use cases, and also has the toolset for tackling advanced problems. I’ve only just begun to use it by walking through the ‘GIS Tutorial for Crime Analysis’ mentioned above, which provides a six-month license key. Esri offers a home use license for $100. They also have integrated Python scripting to help automate and scale up the overall GIS workflow (see their Python blog and GitHub repo for more info).

Although I intend to keep learning how to put ArcGIS to good use, for me at the moment its main role is as a reference against which to compare my own calculations and visualizations. Standard geographical data operations are available in ArcGIS, but these canned routines are essentially black boxes. I’m interested in understanding (at least partly) the underlying math and science behind the geoanalysis routines. As such, I will be ‘reinventing the wheel’ to some degree by implementing my own solutions to some of these standard problems (like binning and smoothing data points).
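To give a flavor of what ‘binning and smoothing’ means in practice, here is a minimal Python sketch using only the standard library. It is an illustration under simplifying assumptions, not how ArcGIS does it: points are assumed to be in planar coordinates, the grid is square, and the smoothing step is a plain 3x3 moving average standing in for the kernel density methods usually used in crime mapping. The function names and the sample points are made up for the example.

```python
from collections import Counter


def bin_points(points, x_min, y_min, cell_size, n_cols, n_rows):
    """Count points per square grid cell; returns a list of rows (row 0 starts at y_min)."""
    counts = Counter()
    for x, y in points:
        col = int((x - x_min) // cell_size)
        row = int((y - y_min) // cell_size)
        if 0 <= col < n_cols and 0 <= row < n_rows:  # drop points outside the grid
            counts[(row, col)] += 1
    return [[counts[(r, c)] for c in range(n_cols)] for r in range(n_rows)]


def smooth(grid):
    """Replace each cell with the mean of itself and its in-grid neighbors (3x3 box filter)."""
    n_rows, n_cols = len(grid), len(grid[0])
    smoothed = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(n_rows, r + 2))
                for cc in range(max(0, c - 1), min(n_cols, c + 2))
            ]
            out_row.append(sum(window) / len(window))
        smoothed.append(out_row)
    return smoothed


# Hypothetical incident locations on a 3x3 grid of unit cells.
points = [(0.5, 0.5), (0.6, 0.4), (1.5, 0.5), (2.5, 2.5)]
counts = bin_points(points, x_min=0.0, y_min=0.0, cell_size=1.0, n_cols=3, n_rows=3)
smoothed = smooth(counts)
```

Cells at the grid edge average over fewer neighbors, which is one of several possible edge treatments; a real analysis would also need to choose the cell size and smoothing bandwidth with care.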

Starting out, I will be using the scientific computing platform Mathematica, which is a much more general-purpose tool than ArcGIS but still has many built-in GIS features (such as importing shapefiles and transforming between geographic projections). I was a developer at Wolfram for about six years, and so know my way around Mathematica pretty well. Wolfram also offers a home use license for $295 (or an annual subscription for $149).

ArcGIS and Mathematica provide a springboard to get up and running doing calculations and making visualizations. However, I’m also interested in developing a suite of tools in Python, which is open source, if for no other reason than that I’ve been wanting to learn Python for a while, and a project like this provides the context and motivation to do so. There are several open source GIS tools and libraries I can tap into. QGIS is an open source framework that seems to be trying to offer many of the same features as ArcGIS. Like ArcGIS, it has been integrated with a number of related Python libraries (see the PyQGIS page and the PyQGIS cookbook). Another useful library with Python bindings is GDAL, which is best known for raster data manipulation, though it also handles vector data through its OGR component.

For rendering maps, there are some awesome open source tools available. Leaflet is a JavaScript library for building interactive maps on the web. It has been integrated into Mapbox, which provides additional features such as building styled base maps that can be hosted on their servers. Mapbox also integrates data from OpenStreetMap into its services. I will be making heavy use of Mapbox in my subsequent posts. They even have a mobile app SDK for iOS and Android, which I may tap into. You can also combine the power of D3.js with Leaflet, which was done to great effect in a widget tool called Crosslet.

The domain of GIS and mapping has its own Stack Exchange site, which is a valuable resource for working with any of the above tools.

Crime data

Data sources

The best unified source of crime incident data that I know of is the Socrata service. It is an open data platform that many city governments are using to host their data. It has a feature-rich web interface, where a user can query the data, generate and save visualizations (including maps), and export data. Socrata also offers a developer API. In the table below, I provide a list of URLs I curated several months ago (it may be slightly out of date).

City / County      Socrata Portal
Austin             http://data.austintexas.gov/
Baltimore          http://data.baltimorecity.gov/
Belleville, IL     https://data.illinois.gov/belleville
Boston             https://data.cityofboston.gov/
Champaign, IL      https://data.illinois.gov/champaign
Chicago            http://data.cityofchicago.org/
Honolulu           https://data.honolulu.gov/
Kansas City        https://data.kcmo.org/
Madison            https://data.cityofmadison.com/
New Orleans        http://data.nola.gov/
New York City      https://data.cityofnewyork.us
Oakland            https://data.oaklandnet.com
Raleigh            https://data.raleighnc.gov
Redmond, WA        https://data.redmond.gov
Rockford, IL       https://data.illinois.gov/rockford
Salt Lake City     https://data.slcgov.com
San Francisco      https://data.sfgov.org
Seattle            http://data.seattle.gov/
Somerville, MA     http://data.somervillema.gov
Wellington, FL     https://data.wellingtonfl.gov/
Cook County, IL    https://datacatalog.cookcountyil.gov


Having a Socrata data portal, however, doesn’t guarantee the city is making crime incident data available. For example, the city of Saint Louis does not (to my knowledge) have a Socrata portal, but the St. Louis Metropolitan Police Department does provide monthly crime reports containing tabulated incident data. In future posts, I’ll be focusing on crime data for Chicago, St. Louis, San Francisco, and Seattle:

City             Crime data link
Chicago          https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2
St. Louis        http://www.slmpd.org/Crimereports.shtml
San Francisco    https://data.sfgov.org/Public-Safety/SFPD-Reported-Incidents-2003-to-Present/dyj4-n68b
Seattle          https://data.seattle.gov/Public-Safety/Seattle-Police-Department-Police-Report-Incident/7ais-f98f
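For the Socrata-hosted datasets, the developer API (SODA) can be queried over plain HTTP. The sketch below only builds a request URL, so it runs without network access. It assumes the usual SODA conventions: a `/resource/<dataset id>.json` endpoint and `$`-prefixed query parameters such as `$where` and `$limit`. The helper name `build_soda_url` is made up for this example, the dataset ID is taken from the Chicago link above, and the column name `primary_type` is an assumption about that dataset's schema that should be checked against the portal.

```python
from urllib.parse import urlencode


def build_soda_url(domain, dataset_id, **query):
    """Build a SODA request URL; keyword names become '$'-prefixed query parameters."""
    base = f"https://{domain}/resource/{dataset_id}.json"
    params = {f"${name}": value for name, value in query.items()}
    # safe="$" keeps the '$' prefix readable instead of encoding it as %24
    return f"{base}?{urlencode(params, safe='$')}" if params else base


url = build_soda_url(
    "data.cityofchicago.org",
    "ijzp-q8t2",                      # dataset ID from the Chicago crime data link
    where="primary_type = 'THEFT'",   # assumed column name; verify on the portal
    limit=5,
)
print(url)
```

The resulting URL could then be passed to `urllib.request.urlopen` and the response parsed with `json.load`. Dataset IDs and schemas change over time, so treat the specifics here as placeholders.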


Data access

For a broad overview of current trends in how city governments are making crime data available to the public, check out the following two-part article the Sunlight Foundation published earlier this fall:

A. Green, ‘The Landscape of Municipal Crime Data’, The Sunlight Foundation blog, 9/10/13. [link]

A. Green, ‘The Impact of Opening Up Crime Data’, The Sunlight Foundation blog, 9/12/13. [link]

In the first article, the author summarizes the broad range of levels of effort and quality of open data practices in U.S. cities, while the second highlights some of the notable success stories of citizens putting available data to good use. Even more valuable is the raw set of research notes the author has made available via Google docs, which includes additional links to data sources.

An important issue to be aware of when looking for crime data is the question of data ownership and the legal pitfalls that can stem from it. A few years ago, in a federal court case that eventually ended in a settlement, the company Public Engines (which operates CrimeReports) sued ReportSee (which operates SpotCrime) for programmatically scraping its website. The details of the lawsuit can be found in this article:

M. Masnick, ‘Who Owns Public Crime Data?’, TechDirt blog, 6/14/10. [link]

and a summary of the settlement and its ramifications is given in these two articles:

J. Ellis, ‘How public is public data? With Public Engines v. ReportSee, new access standards could emerge’, Nieman Lab, 2/17/11. [link]

A. Hochberg, ‘Disputes over crime maps highlight challenge of outsourcing public data’, Poynter news, 5/11/13. [link]

To briefly summarize: Public Engines had entered into agreements with certain city police departments, acting as a third party taking in data from the agencies and providing a public interface to the data. ReportSee had scraped data from the CrimeReports web site operated by Public Engines.

Under the settlement, ReportSee is barred from using data from CrimeReports and also cannot ask for data from agencies that are contracted with Public Engines. That second restriction does not apply to everyone, however, since apparently the contracts do not forbid the participating police departments from making their data available to others. But in practice many departments are treating their contracts with third-party data providers as exclusive, which likely violates open data laws that many states have enacted. At first glance, the settlement also seems to contradict the landmark Feist v. Rural ruling, which established that raw facts cannot be copyrighted. However, according to the Hochberg article, nowhere in their site licensing or in the lawsuit does Public Engines make any copyright claims on the data.

Administrative divisions

There are two main types of administrative division needed for this project: city neighborhoods and census blocks. In the table below I provide links to neighborhood boundaries for several cities:

City             Neighborhood boundaries link
Chicago          https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Neighborhoods/9wp7-iasj
St. Louis        XXX
San Francisco    https://data.sfgov.org/Service-Requests-311-/Neighborhoods/ejmn-jyk6
Seattle          https://data.seattle.gov/dataset/Neighborhoods/2mbt-aqqx


The United States Census Bureau provides socioeconomic statistics for regions at varying levels of granularity: nation, state, county, etc. The finest level of granularity is the so-called census block. These are of interest for this project because the Census Bureau produces demographic data assigned to these blocks, such as population, which will allow me to calculate a ‘per capita’ crime rate. Census block boundary files and demographic data can be downloaded directly from the Census Bureau site. The links are provided below. Note: obtaining demographic data such as population is a multistep process outlined in this pdf.

Resource                  Link
TIGER/Line® shapefiles    http://www.census.gov/cgi-bin/geo/shapefiles2013/main
Demographic data          http://factfinder2.census.gov/
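Once incident counts and populations are both keyed by census block, the ‘per capita’ rate mentioned above is a simple ratio. The Python sketch below is a minimal illustration with made-up block IDs and numbers; expressing the rate per 1,000 residents is an assumed convention (per 100,000 is also common), and returning `None` for unpopulated blocks is just one way to avoid dividing by zero.

```python
def crime_rate_per_1000(incidents_by_block, population_by_block):
    """Return incidents per 1,000 residents for each block; None where population is zero."""
    rates = {}
    for block_id, population in population_by_block.items():
        if population <= 0:
            rates[block_id] = None  # rate is undefined for unpopulated blocks
            continue
        incidents = incidents_by_block.get(block_id, 0)  # no recorded incidents -> 0
        rates[block_id] = 1000 * incidents / population
    return rates


# Hypothetical block IDs, incident counts, and populations.
incidents = {"A": 12, "B": 3}
population = {"A": 400, "B": 150, "C": 0, "D": 250}
rates = crime_rate_per_1000(incidents, population)
```

Blocks with very small populations (parks, commercial districts) can produce misleadingly large rates, so some minimum population threshold may be worth considering.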