Open Forensic Science in R

Published: 2019
Primary Author: Samantha Tyner
Secondary Authors: Heike Hofmann, Soyung Park, Eric Hare, Xiao Hui Tai, Karen Kafadar, Karen Pan, Amanda Luby
Research Area: Footwear

This book is for anyone looking to do forensic science analysis in a data-driven and open way. Whether you are a student, teacher, or scientist, this book is for you. We take the latest research, primarily from the Center for Statistics and Applications in Forensic Evidence (CSAFE) and the National Institute of Standards and Technology (NIST), and show you how to solve forensic science problems in R. The book makes some assumptions about you:

  1. You have some experience with R (R Core Team 2019). We don’t assume you are an expert by any means, but we do assume you are comfortable enough with R to install and load packages, load data, identify different data structures, and follow along with the code we present in each chapter. If you need help getting started with R, there are lots of free resources online, and CSAFE has some resources available here. You can install R for Windows, Mac, and Linux here for free. We also recommend you install RStudio, the wonderful free IDE (Integrated Development Environment) for R. If you want a deeper dive into R, take a walk through R for Data Science. If you really want to explore the depths, Advanced R is an excellent resource.
  2. You are interested in forensic science. Hopefully that’s why you’re here! You may only be interested in DNA or firearms, so we’ve split the book up into chapters by forensic science subfield. You also don’t have to be an expert in the field. We will explain the basics of the field in the introduction of each chapter. You can also download this book by clicking here or by cloning it on GitHub and follow along, running the code on your own computer.
  3. You care about open source software. This doesn’t really affect your ability to read this book, but it’s a nice quality to have. The purpose of this book is to make forensic science more accessible. Right now, most databases, algorithms, and programs that get used every day in forensic science are proprietary, meaning that only the owners know how these systems work, how they were made, and what the source code looks like. This closed approach has led to miscarriages of justice. With this free online book that relies solely on open-source software for analysis, we hope to demonstrate the impact open-source software can have on forensic science, both in research and in practice. And in this spirit of openness, we ask that you contribute if you find an error or want to add a chapter on a topic we did not cover. You can open an issue here or fork the book’s GitHub repository and submit your changes via a pull request. If you’d like to contribute, we ask that you follow our contributor code of conduct and these recommended practices from Jenny Bryan and Jim Hester of RStudio.

Related Resources

Graph-Theoretic Techniques for Forensic Image Comparisons

This presentation is from the 76th Annual Conference of the American Academy of Forensic Sciences (AAFS), Denver, Colorado, February 19-24, 2024.
ShoeCase: A data set of mock crime scene footwear impressions

This project’s main objective is to create an open-source database containing a sizeable number of high-quality images of shoe impressions. The Center for Statistics and Applications in Forensic Evidence (CSAFE)…
A finely tuned deep transfer learning algorithm to compare outsole images

In forensic practice, evaluating shoeprint evidence is challenging because the differences between images of two different outsoles can be subtle. In this paper, we propose a deep transfer learning-based matching…
An automated alignment algorithm for identification of the source of footwear impressions with common class characteristics

We introduce an algorithmic approach designed to compare similar shoeprint images, with automated alignment. Our method employs the Iterative Closest Points (ICP) algorithm to attain optimal alignment, further enhancing precision…