Spatial DNA: Measuring similarity of geolocation datasets with applications to forensics

Conference/Workshop:
American Statistical Association Joint Statistical Meetings
Published: 2019
Primary Author: Christopher Galbraith
Secondary Authors: Padhraic Smyth
Research Area: Digital

Datasets consisting of geolocated events provide rich spatial characterizations of human behavior. Individuals tend to be self-consistent over time while generating such events, visiting the same locations such as home, the office, or the gym. In this paper we develop an approach to quantify similarity between sets of spatial events, drawing inspiration from the forensic evaluation of DNA evidence. A randomization-based technique is applied in which locations are sampled from conditional distributions of spatial locations (constructed via mixtures of kernel density estimates with weights derived from discrete locations). Score functions based on the distance between groups of events are then computed and used to construct coincidental match probabilities. We illustrate the approach with a large geolocation data set collected from Twitter users. Results are compared to computing the log-likelihood of one set of spatial events under a mixture-KDE from another to assess similarity. Our experimental results indicate that the proposed method can accurately assess the similarity between sets of geolocations, with potential applications in forensic and cybersecurity settings.
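The randomization idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes planar (x, y) coordinates with Euclidean distance, uses a single Gaussian KDE over a reference population in place of the paper's mixture of KDEs with location-derived weights, and uses mean nearest-neighbor distance as one possible score function. All function names, the bandwidth, and the synthetic data are hypothetical.

```python
import numpy as np


def mean_nn_distance(a, b):
    """Score: mean distance from each event in `a` to its nearest event in `b`.

    Smaller values mean the two sets of events are spatially closer.
    """
    diffs = a[:, None, :] - b[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    return dists.min(axis=1).mean()


def sample_gaussian_kde(points, n, bandwidth, rng):
    """Draw n samples from a Gaussian KDE centred on `points`.

    Sampling from a KDE amounts to picking a data point at random
    and adding kernel noise to it.
    """
    centres = points[rng.integers(0, len(points), size=n)]
    return centres + rng.normal(scale=bandwidth, size=centres.shape)


def coincidental_match_probability(a, b, population, n_sim=500,
                                   bandwidth=0.01, seed=0):
    """Estimate how often a randomly sampled set is at least as close to `a` as `b` is.

    `population` is an assumed reference set of events from other individuals.
    """
    rng = np.random.default_rng(seed)
    observed = mean_nn_distance(a, b)
    simulated = np.array([
        mean_nn_distance(a, sample_gaussian_kde(population, len(b), bandwidth, rng))
        for _ in range(n_sim)
    ])
    # Add-one correction keeps the estimate strictly positive.
    return (1 + np.sum(simulated <= observed)) / (1 + n_sim)


# Synthetic data (purely illustrative, not from the paper).
data_rng = np.random.default_rng(42)
a = data_rng.normal(scale=0.05, size=(30, 2))            # events near (0, 0)
b_same = data_rng.normal(scale=0.05, size=(30, 2))       # same area as `a`
b_other = 5.0 + data_rng.normal(scale=0.05, size=(30, 2))  # events near (5, 5)
population = data_rng.uniform(-10, 10, size=(2000, 2))   # reference population

cmp_same = coincidental_match_probability(a, b_same, population)
cmp_other = coincidental_match_probability(a, b_other, population)
print(f"same area:      {cmp_same:.3f}")
print(f"different area: {cmp_other:.3f}")
```

In this sketch, a small value for the pair from the same area suggests that such closeness is unlikely to arise by coincidence, while the pair from different areas should yield a value near 1. Real latitude/longitude data would need a projection or a great-circle distance in place of the Euclidean one used here.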

Related Resources

Source Camera Identification with Multi-Camera Smartphones

An overview of source camera identification on multi-camera smartphones, an introduction to the new CSAFE multi-camera smartphone image database, and a summary of recent results on the iPhone 14 Pro.
An Anti-Fuzzing Approach for Android Apps

One of the significant problems in mobile app forensic analysis is extracting app evidence from the device. Given that mobile apps could generate more than 19K files in a…
Forensic Analysis of Android Cryptocurrency Wallet Applications

Crypto wallet apps that integrate with various blockchains allow users to make digital currency transactions with QR codes. According to reports from financesonline [3], there are over 68 million…
Variations and Extensions of Information Leakage Metrics with Applications to Privacy Problems with Imperfect Statistical Information

The conventional information leakage metrics assume that an adversary has complete knowledge of the distribution of the mechanism used to disclose information correlated with the sensitive attributes of a system.…