Spatial DNA: Measuring similarity of geolocation datasets with applications to forensics

Conference/Workshop: American Statistical Association Joint Statistical Meetings
Published: 2019
Primary Author: Christopher Galbraith
Secondary Authors: Padhraic Smyth
Research Area: Digital

Datasets consisting of geolocated events provide rich spatial characterizations of human behavior. Individuals tend to be self-consistent over time while generating such events, repeatedly visiting the same locations such as home, the office, or the gym. In this paper we develop an approach to quantify similarity between sets of spatial events, drawing inspiration from the forensic evaluation of DNA evidence. A randomization-based technique is applied in which locations are sampled from conditional distributions of spatial locations (constructed via mixtures of kernel density estimates with weights derived from discrete locations). Score functions based on the distance between groups of events are then computed and used to construct coincidental match probabilities. We illustrate the approach with a large geolocation dataset collected from Twitter users. As a baseline, we compare the results with assessing similarity by computing the log-likelihood of one set of spatial events under a mixture-KDE built from another. Our experimental results indicate that the proposed method can accurately assess the similarity between sets of geolocations, with potential applications in forensic and cybersecurity settings.
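
As a rough illustration of the randomization idea described in the abstract, the Python sketch below samples synthetic event sets from a mixture of Gaussian kernel density estimates, scores how close two sets are, and estimates a coincidental match probability as the fraction of simulated sets that score at least as close as the observed one. It is not the authors' implementation. The function names, the mean nearest-neighbor score, the Gaussian kernel, the fixed bandwidth, the hand-set mixture weights, and the synthetic data are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def sample_mixture_kde(rng, components, weights, bandwidth, n):
    """Draw n points from a mixture of Gaussian KDEs.

    components: list of (m_i, 2) arrays, one event set per mixture component.
    weights: mixture weights summing to 1 (set by hand here, an assumption).
    """
    which = rng.choice(len(components), size=n, p=weights)
    draws = np.empty((n, 2))
    for i, k in enumerate(which):
        pts = components[k]
        center = pts[rng.integers(len(pts))]  # pick one kernel center
        draws[i] = center + rng.normal(scale=bandwidth, size=2)
    return draws


def mean_nn_distance(a, b):
    """Score: mean distance from each event in a to its nearest event in b."""
    diffs = a[:, None, :] - b[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    return dists.min(axis=1).mean()


def coincidental_match_probability(rng, a, b, components, weights,
                                   bandwidth=0.01, n_sim=500):
    """Fraction of simulated sets that score at least as close to a as b does."""
    observed = mean_nn_distance(a, b)
    simulated = np.array([
        mean_nn_distance(
            a, sample_mixture_kde(rng, components, weights, bandwidth, len(b))
        )
        for _ in range(n_sim)
    ])
    return float((simulated <= observed).mean())


# Synthetic example (hypothetical data, coordinates in a unit square).
rng = np.random.default_rng(0)
population = rng.uniform(0.0, 1.0, size=(2000, 2))
home = np.array([0.3, 0.7])
a = home + rng.normal(scale=0.01, size=(20, 2))
b_same = home + rng.normal(scale=0.01, size=(20, 2))
b_other = np.array([0.8, 0.2]) + rng.normal(scale=0.01, size=(20, 2))

cmp_same = coincidental_match_probability(rng, a, b_same, [population], [1.0])
cmp_other = coincidental_match_probability(rng, a, b_other, [population], [1.0])
print(cmp_same, cmp_other)
```

In this toy setup a small probability suggests that the closeness of the two sets is unlikely to arise by chance under the reference distribution, while a value near 1 suggests the sets are no closer than random draws would be. Passing several event sets in `components` with non-trivial `weights` gives a genuine mixture; the single-component call above is only the simplest case.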

Related Resources

A Response to the Threat of Stegware

Stegware refers to software, programs or apps that allow insertion of malware into a digital file, such as an image or video, using steganography techniques. Although it has been in…
A Forensic Analysis of Joker-Enabled Android Malware Apps

This project aims to develop a set of automated Android malware vetting tools to discover all the malicious behaviors of Android malware in the form of files in the local…
LogExtractor: Extracting Digital Evidence from Android Log Messages via String and Taint Analysis

Mobile devices are increasingly involved in crimes, so digital evidence on mobile devices plays an ever more important role in crime investigations. Existing studies have designed tools to identify and/or…
Forensic Analysis on Joker Family Android Malware

Android is the most popular operating system among mobile devices, and malware explicitly targeting Android is growing rapidly and spreading across the mobile ecosystem. In this paper, we…