Visualization and Analysis of Criminality Data with QGIS

We continue exploring tools that generate graphs to help us analyze our data. This time we will use a QGIS plugin that produces D3-style graphics (D3 stands for Data-Driven Documents): D3 Data Visualization, also listed as QGIS D3 Date and Time Heatmap.

What does D3 Data Visualization do?

Using date, time and custom category fields in the data, the plugin creates a D3 circular heat map (a circular histogram) and outputs it as an interactive web page, which can optionally include a legend.

The plugin counts the number of events by date, time or category along two axes and displays the result as a circular heat map. The result lets us analyze how the events are distributed over time, based on two frequencies.
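To make the counting step concrete, here is a minimal Python sketch of the idea, not the plugin's actual code. It assumes the two axes are month and hour of day, and the timestamps are invented sample values rather than rows from the Chicago data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample timestamps (not taken from the Chicago dataset).
events = [
    "2006-01-15 23:10:00",
    "2006-01-20 23:45:00",
    "2006-02-03 08:05:00",
    "2006-02-03 23:30:00",
]


def count_by_month_and_hour(timestamps):
    """Count events per (month, hour) pair, one pair per heat map cell."""
    counts = Counter()
    for text in timestamps:
        moment = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
        counts[(moment.month, moment.hour)] += 1
    return counts


cell_counts = count_by_month_and_hour(events)
for (month, hour), total in sorted(cell_counts.items()):
    print(f"month={month:02d} hour={hour:02d} events={total}")
```

Each (month, hour) pair corresponds to one cell of the circular heat map, and the count decides how strongly that cell is colored.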

Data Source

First we will prepare our data. We will mainly follow what is shown in the plugin's code repository, whose demonstration uses a database of crime events that occurred in the city of Chicago in 2006, available from here.

We will treat it as our main layer. To make things more interesting, we will also use two additional layers. The first contains the neighborhood boundaries, available from the same source by searching for “Boundaries-Neighborhoods”.

The second is a layer of police districts, available from here. As Figure 1 shows, I also added a layer of police stations to QGIS, which may provide one more criterion for analysis once the results are obtained.

Figure 1: View of the study area

Figure 2: View of 2006 crime data reported in Chicago

To get our final layer ready, we need to perform a spatial join of these layers, which in QGIS is done with Vector > Data Management Tools > Join Attributes by Location. Keep in mind that the “target vector layer” is the point layer that represents the crime events.

Figure 3: Table of attributes integrating additional information
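For readers curious about what a join by location does conceptually, the following is a rough pure-Python sketch, not the QGIS implementation. It uses a simple ray-casting test, and the square "neighborhoods", the coordinates and the field names are all hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) falls inside the polygon ring."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def join_by_location(points, polygons):
    """Copy the name of the containing polygon onto each point record."""
    joined = []
    for point in points:
        match = None
        for name, ring in polygons.items():
            if point_in_polygon(point["x"], point["y"], ring):
                match = name
                break
        joined.append({**point, "neighborhood": match})
    return joined


# Hypothetical polygons and crime points in arbitrary planar coordinates.
neighborhoods = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
crimes = [
    {"type": "THEFT", "x": 1, "y": 1},
    {"type": "ROBBERY", "x": 3, "y": 1},
    {"type": "NARCOTICS", "x": 5, "y": 5},
]

joined = join_by_location(crimes, neighborhoods)
for row in joined:
    print(row["type"], row["neighborhood"])
```

The point layer stays the target: every crime keeps its own attributes and gains the name of the polygon it falls in, or no value when it lies outside all of them.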

Knowing the Environment

After installing the plugin, activating it opens a window with a group of tabs. The first, “QGIS layer”, is used to choose the vector layer from which the heat map will be generated (Chicago-2006-crime).

You must also select the date and time field (Date). You can then specify what the radial axis and the concentric bands represent; in both cases you choose the time unit to use. As we will see later, the concentric bands can also be set to show additional information.

The second tab, “Titles”, lets us customize the chart by adding a title to it and to the legend, if we decide to include one. It also offers the option of displaying the values (the number of incidents) when hovering with the mouse.

Figure 5: Configuring the titles and labels of the graph

The third tab, “Settings”, holds configuration such as the “Inner radius”, which is the radius in pixels at which the first band of the heat map begins, and the height in pixels of each band (“Band height”).

Together these settings let us choose dimensions that display the graph correctly. In the same tab we can choose whether the labels of the radial lines and of the concentric bands are shown, and we can also set the dimensions of the legend.

Figure 6: Adjustments of dimensions of our graph
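As a rough aid for choosing these values, the sketch below works out where each band would start and end, assuming the bands are simply stacked outward from the inner radius. The function name and the sample numbers are hypothetical, and the plugin may add margins for labels and the legend.

```python
def band_radii(inner_radius, band_height, band_count):
    """Return the (inner, outer) radius in pixels of each concentric band."""
    return [
        (inner_radius + i * band_height, inner_radius + (i + 1) * band_height)
        for i in range(band_count)
    ]


# Example values only: 12 bands (one per month), each 10 px high.
bands = band_radii(inner_radius=20, band_height=10, band_count=12)
chart_diameter = 2 * bands[-1][1]  # width needed for the rings alone

print(bands[0], bands[-1], chart_diameter)
```

With these example values the rings alone need 280 pixels, so the page area should be somewhat larger to leave room for labels.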

The last tab, “Colors”, gives precise control over the color ramp of the heat map and also sets the “No Data” color.
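To illustrate what a color ramp does with the counts, here is a small sketch that linearly blends two colors. It is only an assumption about the general technique: the color values are arbitrary, and the sketch marks missing cells with None because it is not stated here how the plugin decides that a cell has no data.

```python
def hex_to_rgb(color):
    """Convert '#rrggbb' into an (r, g, b) tuple of integers."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert an (r, g, b) tuple back into '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def ramp_color(count, max_count, low="#ffffcc", high="#800026", no_data="#cccccc"):
    """Pick a color between low and high in proportion to count / max_count."""
    if count is None:
        return no_data  # cell without data gets the dedicated color
    share = count / max_count if max_count > 0 else 0.0
    low_rgb, high_rgb = hex_to_rgb(low), hex_to_rgb(high)
    mixed = tuple(round(a + (b - a) * share) for a, b in zip(low_rgb, high_rgb))
    return rgb_to_hex(mixed)


print(ramp_color(0, 10), ramp_color(10, 10), ramp_color(None, 10))
```

Cells with few events stay close to the low color, busy cells approach the high color, and empty cells stand out in the separate "No Data" color.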

Plugin Application

Let’s work through an example: a graph showing the number of incidents distributed by month and time of day, so we can see at which times of day incidents are most frequent across all months of the year. For that we enter the following information in the first tab.

Figure 8: Definition of data to use in our first example

The result is shown in Figure 9. It indicates that January has the fewest incidents, and the same graph shows that incidents increase around midnight. As indicated earlier, when the mouse passes over the graph, the values appear in the lower part.

Figure 9: View of the result of our first graph

Keep in mind that the plugin asks for a folder in which to store the result, which, as you will see, consists of three files. You can also edit these files to improve the output.

Additional Options

We will now work with our data to create more customized graphs, in particular using the additional layers that were joined to the original data. To do this, we first apply a filter to our point layer.

With the filter in place we can visualize another type of information using only one time variable. This time we want to know the number of incidents, classified by type, that occurred in each month of the year, so we adjust the options as shown in Figure 11.

Figure 11: Adjusting to use additional data

Figure 12: Result of our example with filtered data

Our result shows that narcotics are the main type of incident, with values of up to 2,452 cases, and the values stay high in every month. Robberies follow in number. Finally, no homicides are reported except for one that occurred in September.
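The same filter-then-count logic can be sketched in a few lines of Python. This is only an illustration of the idea: the field names "date" and "type", the category names and the records themselves are hypothetical, not values from the real attribute table.

```python
from collections import Counter
from datetime import datetime

# Hypothetical records; "date" and "type" are assumed field names.
records = [
    {"date": "2006-01-05 10:00:00", "type": "NARCOTICS"},
    {"date": "2006-01-09 22:15:00", "type": "NARCOTICS"},
    {"date": "2006-01-12 02:40:00", "type": "ROBBERY"},
    {"date": "2006-09-21 01:30:00", "type": "HOMICIDE"},
    {"date": "2006-09-25 14:00:00", "type": "THEFT"},
]

# Categories kept by the filter; everything else is ignored.
KEEP = {"NARCOTICS", "ROBBERY", "HOMICIDE"}


def count_by_type_and_month(rows, keep):
    """Count the filtered incidents per (type, month) pair."""
    counts = Counter()
    for row in rows:
        if row["type"] not in keep:
            continue
        month = datetime.strptime(row["date"], "%Y-%m-%d %H:%M:%S").month
        counts[(row["type"], month)] += 1
    return counts


type_month_counts = count_by_type_and_month(records, KEEP)
for (kind, month), total in sorted(type_month_counts.items()):
    print(f"{kind:<10} month={month:02d} events={total}")
```

Here the category replaces one of the two time axes, so each band stands for an incident type instead of a time unit.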

Another example uses the neighborhood location data. We follow the same procedure as before, that is, applying filters. This time we randomly select a group of neighborhoods to analyze.

First we generate a graph showing whether some neighborhoods have more incidents in certain months of the year. Then we make another graph showing at which times of day incidents occur most frequently in each selected neighborhood.

Figure 13: Final result of examples incorporating neighborhood data

Finally, the graph on the left shows that the neighborhood “Harrinson” has the largest number of incidents in every month of the year.

By contrast, the neighborhood “Foster” reports the lowest number of incidents, which suggests it may be one of the safest. In the graph on the right, we can see that the greatest number of incidents occurs from 7:00 p.m. until midnight.

As you can see, many graphs are possible. This example analyzed crime data, but the plugin can serve for data such as the number of accidents or emergencies, or any layer with a column containing dates and times.

Although they are only graphs, I think they are a good complement to other data analysis processes, helping us, for example, to estimate the personnel required to attend emergencies or to assess whether to implement services or equipment. Anyway, I hope you can try it out.

Translated from: Carbajallosa
