Insights: Comparison of three similarity scores for bullet LEA matching

INSIGHTS

Comparison of three similarity scores for bullet LEA matching

OVERVIEW

As technology advances in the forensic sciences, it is important to evaluate the performance of recent innovations. Researchers funded by CSAFE judged the efficacy of different scoring methods for comparing land engraved areas (LEAs) found on bullets.

Lead Researchers

Susan Vanderplas
Melissa Nally
Tylor Klep
Christina Cadevall
Heike Hofmann

Journal

Forensic Science International

Publication Date

March 2020

THE GOALS

Evaluate the performance of scoring measures at a land-to-land level, using random forest scoring, cross correlation and consecutive matching striae (CMS).

Consider the efficacy of these scoring measures on a bullet-to-bullet level.

The Study

  • Data was taken from three separate studies, each using similar firearms from the same manufacturer, Ruger, to compare land engraved areas (LEAs), areas on a bullet marked by a gun barrel’s lands –– the sections in between the grooves on the barrel’s rifling.
  • Examiners processed the LEA data through a matching algorithm and scored it using these three methods:

1

Random Forest (RF):

A form of machine-learning that utilizes a series of decision trees to reach a single result.

2

Cross-Correlation (CC):

A measure of similarity between two series of data.

3

Consecutive Matching Striae (CMS):

Identifying the similarities between the peaks and valleys of LEAs.

Results

The Equal Error rate of each scoring method across multiple studies

  • On a bullet-to-bullet level, the Random Forest and Cross-Correlation scoring methods made no errors.
  • On a land-to-land level, the RF and CC methods outperformed the CMS method.
  • When comparing equal error rates, the CMS method had an error rate of over 20%, while both the RF and CC methods’ error rates were roughly 5%. The RF method performed slightly better.

FOCUS ON THE FUTURE

 

The random forest algorithm struggled to identify damage to bullets that obscured LEAs caused by deficiencies in the gun barrel such as pitting from gunpowder or “tank rash” from expended bullets.

  • In future studies, examiners could pair the RF algorithm with another algorithm to assess the quality of the data and determine which portions can be used for comparison.

All the studies used firearms from Ruger, a manufacturer picked because their firearms mark very well on bullets. Future studies can assess the performance of these scoring methods on firearms from different manufacturers with differing quality marks.

Insights: An Algorithm to Compare Two-Dimensional Footwear Outsole Images Using Maximum Cliques and Speeded Up Robust Features

INSIGHT

An Algorithm to Compare Two-Dimensional Footwear Outsole Images Using Maximum Cliques and Speeded Up Robust Features

OVERVIEW

Footwear impression researchers sought to increase the accuracy and reliability of impression image matching. They developed and tested a statistical algorithm to quantify and score the degree of similarity between a questioned outsole impression and a reference impression obtained from either a suspect or a known database. The resulting algorithm proved to work well, even with partial and partial-quality images.

Lead Researchers

Soyoung Park 
Alicia Carriquiry

Journal

Statistical Analysis and Data Mining

Publication Date

21 Feb. 2020

The Goals

1

Develop a semi-automated approach that:

  • Compares impression evidence imagery with putative suspect or database images.
  • Calculates a score to quantify the degree of similarity (or correspondence) between the images.
  • Lowers human error and bias in current practice.

2

Create a method to obtain a similarity score for a pair of impressions which can be used to assess the probative value of the evidence.

APPROACH AND METHODOLOGY

This algorithm focuses on the similarity between two outsole images and relies on the concept of maximum clique. Local maximum cliques can be used to find corresponding positions in the two images so that they can be aligned.

Rotation and translation don’t affect a maximum clique –– it depends on the pairwise distances between nodes on the graph.

So –– although outsole pattern images may be translated, rotated and subjected to noise and other loss of information –– the geometrical
relationships between the points that constitute the pattern will not change much.

In this study, researchers developed a publicly available and usable database of 2D outsole impressions. Then the researchers used data from a KNM™ (knowledge navigator model) database.

Key Definitions

Graph Theory

Study of graphs made up of vertices connected by edges

Clique

A subset of vertices with edges linking symmetrically, where every two disinct vertices are adjacent

Maximum Clique

Clique that includes the larges possible number of vertices

KEY TAKEAWAYS FOR PRACTITIONERS

1

With this new comparison learning algorithm, practitioners can align images using features chosen as areas of interest and calculate a similarity score more objectively.

2

The proposed pattern-matching algorithm can work with partial images or images of variable quality by partially aligning patterns to quantify degrees of similarity between two impressions.

3

While this study focuses on footwear evidence, this algorithm has potential
applications for other situations of pattern comparison, like:

  •  latent prints
  • surveillance photos
  • handwriting
  • tire treads and more.

4

The algorithm can distinguish impressions made by different shoes –– even when shoes share class characteristics including degree of wear.

SEE THE ALGORITHM IN ACTION

Researcher Dr. Soyoung Park demonstrates the team’s novel algorithm in a CSAFE webinar. The method is promising, because it appears to correctly determine, with high probability, whether two images have a common or a different source, at least for the shoes on which they have experimented.

ShoeprintR

Explore and try the algorithm by downloading it.

Insights: A Robust Approach to Automatically Locating Grooves in 3D Bullet Land Scans

INSIGHTS

A Robust Approach to Automatically Locating Grooves in 3D Bullet Land Scans

OVERVIEW

Land engraved areas (LEAs) can be important distinguishing factors when analyzing 3D scans from bullets. Creating a 3D image of an LEA requires examiners to also scan portions of the neighboring groove engraved areas (GEAs). Current modeling techniques often struggle to separate LEAs from GEAs. CSAFE researchers developed a new method to automatically remove GEA data and tested this method’s performance against previously proposed techniques.

Lead Researchers

Kiegan Rice
Ulrike Genschel
Heike Hofmann

Journal

Journal of Forensic Sciences

Publication Date

13 December 2019

GOAL

1

Present and discuss automated methods for identifying “shoulder locations” between LEAs and GEAs.

The Study

  • Rice et al. gathered 3D scans of 104 bullets from two available data sets (Hamby 44 and Houston), resulting in a total of 622 LEA scans.
  • They removed the curvature from these 3D scans to make 2D crosscuts of each LEA.
  • Using the 2D crosscuts, the team estimated the shoulder locations between LEAs and GEAs using three different models:

Rollapply:

A function (in this case, one available through the open-source “bulletxtrctr” package) which applies a rolling average to smooth out outliers in data.

Robust Linear Model:

A quadratic linear model that minimizes absolute deviations and is therefore less influenced by outliers.

Robust Locally Weighted Regression (LOESS):

A weighted average of many parametric models to fit subsets of data.

Results

Hamby set 44

Houston test set

areas of misidentification:

In this graphic, an Area of Misidentification less than 100 is considered a small deviation, between 100 and 1000 is medium, and greater than 1000 is a large deviation.

  • The Robust LOESS model significantly outperformed the Rollapply and Robust Linear models, resulting primarily in small deviations across all test sets.
  • Conversely, the Robust Linear model had the weakest performance of all three, with mostly large deviations across both Houston sets, and only outperforming the Rollapply model in the right shoulder section of the Hamby 44 set.
  • These results were expected, as the Robust LOESS model is intended to be flexible and handle areas that a quadratic linear model would fail to address.

FOCUS ON THE FUTURE

 

Both the Hamby 44 and Houston datasets used firearms from the same manufacturer. Future studies can expand on these findings by using a wider variety of barrel types, including different caliber sizes, manufacturers and nontraditional rifling techniques.

Insights: Quantifying the Association Between Discrete Event Time Series with Applications to Digital Forensics

INSIGHT

Quantifying the Association Between Discrete Event Time Series with Applications to Digital Forensics

Effects of Proficiency and Cross-examination

OVERVIEW

Digital devices provide a new opportunity to examiners because for every user event — like opening software, browsing online, or sending an email — an event time series is created, logging that data. Yet, using this type of user-generated event data can be difficult to correlate between
two devices for examiners. The research team set out to quantify the degree of association between two event time series both with and without population data.

Lead Researchers

Christopher Galbraith
Padhraic Smyth
Hal S. Stern

Journal

Journal of the Royal Statistical Society

Publication Date

January 2020

The Goals

1

Investigate suitable measures to quantify the association between two event series on digital devices.

2

Determine the likelihood that the series were generated by the same source or by different sources –– ultimately to assess the degree of association between the two event series.

APPROACH AND METHODOLOGY

Researchers explored a variety of measures for quantifying the association between two discrete event time series. They used multiple score functions to determine the similarity between the series. These score functions were discriminative for same- and different-source pairs of event series.

The following methods for assessing the strength of association for a given pair of event series proved most accurate:

1

Constructing score-based likelihood ratios (SLRs) that assess the relative likelihood of observing a given degree of association when the series came from the same or different sources. This uses a population-based approach.

2

Calculating coincidental match probabilities (CMPs) to simulate a different-source score distribution via what the research team refers to as sessionized resampling when working with a single pair of event series. When a sample from a relevant population is not available, this method still produces accurate results.

KEY TAKEAWAYS for Practitioners

1

The population-based approach of SLRs remains the preferred technique in terms of accuracy and interpretability.

2

The resampling technique using CMPs shows significant potential for quantifying the association between a pair of time event series, helping examiners determine the likelihood that two different time series were created by the same person, especially when no population sampling data is available.

3

With multiple-event series, combining these techniques could be valuable for pattern mining to determine which event series are associated with one another.

4

Developments in this area have the capacity to positively impact work in forensic and cybersecurity settings.

Next Steps

 

Both SLR and CMP techniques require more extensive study and testing before being used in practice by forensic examiners.

All techniques that are described are implemented in the open-source R package assocr.

Insights: What do Forensic Analysts Consider Relevant to their Decision Making?

INSIGHT

What do Forensic Analysts Consider Relevant to their Decision Making?

OVERVIEW

Forensic analysts make critical judgments that can play a crucial role in criminal investigations, so it is important that their decisions are as objective as possible. However, they often receive information that may not be relevant to their work and can subconsciously bias their analyses.
Researchers surveyed analysts from multiple forensic disciplines to see what information they consider relevant to their tasks.

Lead Researchers

Brett O. Gardner
Sharon Kelley
Daniel C. Murrie
Itiel E. Dror

Journal

Science and Justice

Publication Date

September 2019

Goals

1

Discover what information analysts consider relevant

2

Evaluate whether there is a general consensus across disciplines

3

Determine if these opinions match the National Commission of Forensic Science’s definition of task-relevance

The Study

The National Commission of Forensic Science (NCFS) defines task-relevant information as:

“Necessary for drawing conclusions: 1) about the propositions in question, 2) from the physical evidence that has been designated for examination, [and] 3) through the correct application of an accepted analytic method by a competent analyst.”

The team surveyed 183 forensic analysts among four primary forensic disciplines: Biology, Pattern Evidence, Chemistry, and Crime Scene Investigation. The survey contained 16 different types of information regarding either a case, suspect, or victim.

The analysts categorized the importance of each type of information to their specific tasks, labeling them as either:

Essential

Irrelevant

Would Review If Available

Results

1

Among four forensic science disciplines and 16 types of information (resulting in 64 total ratings for task-relevance), the analysts only reached 100% consensus three times. In fact, in 45 of 64 items, opinions between analysts directly contradicted each other.

2

However, in 36 ratings, the analysts reached a near-consensus where over 75% agreed. Pattern evidence analysts had the highest rate of consensus and crime scene investigators had the most disagreement.

3

Most analysts, apart from crime scene investigators, agreed that personal information regarding a suspect or victim was irrelevant to their tasks. This is consistent with the NCFS’s guidelines for task relevance.

4

The opinions of crime scene investigators were distinct from the other disciplines, as their task is to gather information rather than analyze it.

Focus on the future

 

While the survey contains which types of information the analysts consider relevant, it does not explain why they made these decisions.

It is important to remember that people do not always know the full reasoning behind their decision making.

Even within the same forensic disciplines, different laboratories may not have the same guidelines for what they consider relevant.

The forensic disciplines must reach a general consensus on what information is task-relevant.