Statistical Methods for the Forensic Analysis of User-Event Data

Published: 2020
Primary Author: Chris Galbraith
Research Area: Digital

A common question in forensic analysis is whether two observed data sets originate from the same source or from different sources. Statistical approaches to addressing this question have been widely adopted within the forensics community, particularly for DNA evidence, providing forensic investigators with tools that allow them to make robust inferences from limited and noisy data. For other types of evidence, such as fingerprints, shoeprints, bullet casing impressions and glass fragments, the development of quantitative methodologies is more challenging. In particular, there are significant challenges in developing realistic statistical models, both for capturing the process by which the evidential data is produced and for modeling the inherent variability of such data from a relevant population.

In this context, the increased prevalence of digital evidence presents both opportunities and challenges from a statistical perspective. Digital evidence is typically defined as evidence obtained from a digital device, such as a mobile phone or computer. As the use of digital devices has increased, so too has the amount of user-generated event data collected by these devices. However, current research in digital forensics often focuses on addressing issues related to information extraction and reconstruction from devices and not on quantifying the strength of evidence as it relates to questions of source.

This dissertation begins with a survey of techniques for quantifying the strength of evidence (the likelihood ratio, score-based likelihood ratio and coincidental match probability) and evaluating their performance. The evidence evaluation techniques are then adapted to digital evidence. First, the application of statistical approaches to same-source forensic questions for spatial event data, such as determining the likelihood that two sets of observed GPS locations were generated by the same individual, is investigated. The methods are applied to two geolocated event data sets obtained from social networks. Next, techniques are developed for quantifying the degree of association between pairs of discrete event time series, including a novel resampling technique when population data is not available. The methods are applied to simulated data and two real-world data sets consisting of logs of computer activity and achieve accurate results across all data sets. The dissertation concludes with suggestions for future work.
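For readers unfamiliar with the quantities named above, the following is a minimal sketch of how a score-based likelihood ratio and a coincidental match probability might be computed from reference scores. It is not the dissertation's implementation. The function names, the Gaussian kernel density estimate, the bandwidth, and the toy score values are all assumptions made for illustration, and the score is assumed to be a distance (smaller means more similar).

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` with a Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density

def score_based_lr(observed, same_scores, diff_scores, bandwidth=0.1):
    """Ratio of the estimated same-source to different-source score densities at `observed`.

    Hypothetical helper: if `observed` lies very far from every different-source
    score, the denominator can underflow to zero and the division will fail.
    """
    same = gaussian_kde(same_scores, bandwidth)
    diff = gaussian_kde(diff_scores, bandwidth)
    return same(observed) / diff(observed)

def coincidental_match_probability(observed, diff_scores):
    """Fraction of different-source scores at least as small as the observed score."""
    return sum(1 for s in diff_scores if s <= observed) / len(diff_scores)

# Toy reference scores, invented purely for illustration.
same_scores = [0.10, 0.15, 0.20, 0.25, 0.30]
diff_scores = [0.60, 0.70, 0.80, 0.90, 1.00]
observed = 0.22

slr = score_based_lr(observed, same_scores, diff_scores)
cmp_value = coincidental_match_probability(observed, diff_scores)
print(f"score-based LR: {slr:.1f}, coincidental match probability: {cmp_value:.2f}")
```

In this sketch, a score-based likelihood ratio above 1 favors the same-source hypothesis, while a small coincidental match probability indicates that few different-source pairs look as similar as the observed pair.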

Related Resources

Hunting wild stego images, a domain adaptation problem in digital image forensics

Digital image forensics is a field encompassing camera identification, forgery detection and steganalysis. Statistical modeling and machine learning have been successfully applied in the academic community of this maturing field.…
Statistical Methods for the Forensic Analysis of Geolocated Event Data

A common question in forensic analysis is whether two observed data sets originated from the same source or from different sources. Statistical approaches to addressing this question have been widely…
CSAFE 2020 All Hands Meeting

The 2020 All Hands Meeting was held May 12 and 13, 2020 and served as the closing to the last 5 years of CSAFE research and focused on kicking off…
Android App Forensic Evidence Database (AndroidAED)

After attending this presentation, attendees will better understand how AndroidAED will be beneficial for academic researchers whose studies relate to mobile applications that grant them the ability to search through…