Point data – point pattern
In this seminar, we will analyse the spatial distribution of points in an area and assess, at a chosen level of statistical significance, whether the points tend to cluster, disperse, or are randomly distributed. This is a crucial step in spatial analysis because it lets us understand how objects are distributed across the area. The first set of tools answers the question: what is the probability that a point pattern is randomly distributed in space? These methods do not produce anything that can be displayed on a map, but their numerical or graphical outputs are well suited to a more general description of the pattern. During this seminar, we will try various methods on practice data, and you will then apply these methods on your own and interpret the results.
We will work with data about IT companies in Prague.
We will start with examples of the density methods:
- quadrat counts method: the area is divided into cells of the same size and shape
- this method is not implemented in the software, but the index of dispersion (variance-mean ratio) is easy to calculate using the equation from the lecture
- create three grids with different cell sizes (Fishnet tool in ArcMap)
- calculate the variance-mean ratio for each of the three grids and compare the results
- Kernel density estimation (KDE)
- apply kernel density estimation to a part of Prague
- choose an appropriate cell size, threshold and type of kernel function
- based on the first results, modify your settings to get better results
- display your results over an orthophoto map, interpret them, and consider whether hot spot areas exist
- try dual kernel density estimation with the address points as the second layer, and compare the results with those of the single KDE
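Outside ArcMap, the two density methods above can be sketched in a few lines of Python. This is only an illustrative sketch, not the equation from the lecture or the ArcMap implementation: the function names `variance_mean_ratio` and `quartic_kde` are hypothetical, the study area is assumed to be a rectangle containing all points, the sample variance (n - 1 denominator) is assumed, and the quartic kernel is assumed as the kernel function, with `bandwidth` playing the role of the threshold.

```python
import math
from statistics import mean, variance


def variance_mean_ratio(points, bbox, cell_size):
    """Index of dispersion (variance / mean) of quadrat counts.

    bbox is (xmin, ymin, xmax, ymax); quadrats are squares of side cell_size.
    Assumes every point lies inside bbox and that there are at least 2 cells.
    """
    xmin, ymin, xmax, ymax = bbox
    ncols = max(1, math.ceil((xmax - xmin) / cell_size))
    nrows = max(1, math.ceil((ymax - ymin) / cell_size))
    counts = [0] * (nrows * ncols)
    for x, y in points:
        # Clamp so that points on the upper/right edge fall in the last cell.
        col = min(int((x - xmin) / cell_size), ncols - 1)
        row = min(int((y - ymin) / cell_size), nrows - 1)
        counts[row * ncols + col] += 1
    # Assumption: sample variance (n - 1 denominator).
    return variance(counts) / mean(counts)


def quartic_kde(points, location, bandwidth):
    """Kernel density at one location using the quartic (biweight) kernel."""
    sx, sy = location
    total = 0.0
    for x, y in points:
        u2 = ((x - sx) ** 2 + (y - sy) ** 2) / bandwidth ** 2
        if u2 < 1.0:  # only points closer than the bandwidth contribute
            total += (1.0 - u2) ** 2
    # The factor 3 / (pi * h^2) makes each kernel integrate to 1.
    return 3.0 * total / (math.pi * bandwidth ** 2)


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    pts = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (3.5, 3.5), (1.1, 1.3)]
    for size in (1.0, 2.0, 4.0):
        print(size, variance_mean_ratio(pts, (0, 0, 8, 8), size))
    print(quartic_kde(pts, (1.0, 1.0), bandwidth=1.0))
```

A ratio close to 1 is consistent with a random pattern, values above 1 suggest clustering and values below 1 suggest regular dispersion. Running the first function with several cell sizes shows how strongly the result depends on the grid.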
Video demonstrating KDE in ArcMap:
We will also use some of the distance methods:
- Average Nearest Neighbour
- we will test the null hypothesis: the observed and theoretical mean nearest neighbour distances are the same (the points are randomly distributed)
- note the observed and theoretical mean distances, the NNI and the z-score
- this method is very sensitive to the size of the study area, which must be the same when comparing point distributions; try repeating the calculation with a larger area
- NNI = ratio of observed and theoretical mean distances
- NNI = 1 … random distribution
- NNI < 1 … clustered distribution (NNI = 0 when all points coincide)
- NNI > 1 … dispersed distribution; the maximum value of about 2.15 corresponds to a perfectly regular pattern
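The quantities listed above can be reproduced with a short Python sketch. It assumes the standard formulas for the expected mean distance under randomness, 0.5 / sqrt(n / A), and for its standard error, 0.26136 / sqrt(n² / A); the function name `average_nearest_neighbour` is hypothetical, and the area A must be supplied by the user, which is exactly where the sensitivity to the study area comes from.

```python
import math


def average_nearest_neighbour(points, area):
    """Return (observed mean distance, expected mean distance, NNI, z-score).

    points is a list of (x, y) tuples; area is the size of the study area
    in squared coordinate units. Brute-force search, fine for small samples.
    """
    n = len(points)
    nearest = []
    for i, p in enumerate(points):
        # Distance from point i to its closest other point.
        nearest.append(min(math.dist(p, q)
                           for j, q in enumerate(points) if j != i))
    d_obs = sum(nearest) / n
    d_exp = 0.5 / math.sqrt(n / area)          # expected under randomness
    nni = d_obs / d_exp
    std_err = 0.26136 / math.sqrt(n * n / area)
    z = (d_obs - d_exp) / std_err
    return d_obs, d_exp, nni, z


if __name__ == "__main__":
    # Made-up coordinates: the same points evaluated with two study areas.
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    print(average_nearest_neighbour(pts, area=4.0))
    print(average_nearest_neighbour(pts, area=16.0))
```

Calling the function twice with the same points but different areas shows that the NNI and the z-score change even though the points do not.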
Video demonstrating NNA in ArcMap:
- Ripley's K-function: the standardised average number of events within a distance h of an event
- it works with all events, not only with the nearest one
- it also generates a random distribution with the same number of events in the same area
- it calculates the K-function for both distributions (observed and theoretical) and compares them; the distance at which the difference between them is largest is the distance with the strongest clustering
- the calculation is first carried out for the smallest distance and then repeated for each subsequent distance step
- using permutations, you calculate many random distributions and define the lower and upper confidence limits (the smallest and highest values of the K-function)
- the geographical shape of the area has a significant influence because we are working with the whole set of events. The theoretical distributions are generated in the same area as the original data, and thus the shape of the area is required as one of the input parameters.
- you can also use weights
- null hypothesis: events are randomly distributed
- if you do not clip your point data to the area of interest but only define this area by its boundaries, you do not have to specify a border effect correction
- describe the resulting L-function
- use an appropriate number of simulations for the confidence intervals
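The procedure described above (observed function, simulated random patterns, lower and upper envelope) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the ArcMap tool: the study area is assumed to be a rectangle, no border effect correction and no weights are applied, the estimator K(h) = A / (n(n - 1)) · (number of ordered pairs with distance ≤ h) is assumed, and the names `l_function` and `csr_envelope` are hypothetical. The result is reported as L(h) = sqrt(K(h) / π), which is approximately equal to h for a random pattern.

```python
import math
import random


def l_function(points, area, distances):
    """L(h) = sqrt(K(h) / pi) for each h in distances (no edge correction)."""
    n = len(points)
    pair_dist = [math.dist(points[i], points[j])
                 for i in range(n) for j in range(i + 1, n)]
    result = []
    for h in distances:
        # Each unordered pair counts twice (i -> j and j -> i).
        ordered_pairs = 2 * sum(1 for d in pair_dist if d <= h)
        k = area * ordered_pairs / (n * (n - 1))
        result.append(math.sqrt(k / math.pi))
    return result


def csr_envelope(n, bbox, distances, n_sim=99, seed=0):
    """Lowest and highest simulated L(h) over n_sim random patterns.

    Assumption: the study area is the rectangle bbox = (xmin, ymin, xmax, ymax).
    """
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bbox
    area = (xmax - xmin) * (ymax - ymin)
    low = [math.inf] * len(distances)
    high = [-math.inf] * len(distances)
    for _ in range(n_sim):
        sim = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
               for _ in range(n)]
        l_sim = l_function(sim, area, distances)
        low = [min(a, b) for a, b in zip(low, l_sim)]
        high = [max(a, b) for a, b in zip(high, l_sim)]
    return low, high


if __name__ == "__main__":
    # Made-up, tightly clustered points in a 100 x 100 study area.
    bbox = (0.0, 0.0, 100.0, 100.0)
    pts = [(50.0 + 0.5 * i, 50.0 + 0.3 * i) for i in range(15)]
    steps = [5.0, 10.0, 15.0]
    observed = l_function(pts, 100.0 * 100.0, steps)
    low, high = csr_envelope(len(pts), bbox, steps, n_sim=99)
    for h, lo, ob, hi in zip(steps, low, observed, high):
        print(h, lo, ob, hi)
```

An observed L(h) above the upper envelope at a given distance suggests clustering at that distance, and a value below the lower envelope suggests dispersion. More simulations give a more stable envelope at the cost of computation time.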
Video explaining the K-function
Video showing how to calculate the K-function
Individual task 3:
The third individual task focuses on point pattern analysis of crime incidents in Ostrava (a sample of data). Use these analytical tools: quadrat counts, kernel density estimation, nearest neighbour analysis and the K-function. Interpret the results of these methods. Work with only one city district that has at least 50 incidents. All the data you need are stored in the shared MS Teams folder (data_task3.zip).
- settings, calculation and interpretation of the variance-mean ratio
- settings, calculation and interpretation of the NNI
- settings, calculation and interpretation of the KDE: provide at least three different settings and decide which one is best. Create a map that includes a WMS layer with the orthophoto map to support your interpretation.
- settings, calculation and interpretation of the K-function
Deadline: 15. 12. 2022
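This exercise was created within the project Inovace bakalářských a magisterských studijních oborů na Hornicko-geologické fakultě VŠB-TUO (Innovation of Bachelor's and Master's Study Programmes at the Faculty of Mining and Geology, VŠB-TUO), number CZ.1.07/2.2.00/28.0308. The project is implemented with the participation of the EU.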