WEBINAR Q&A: SHINING A LIGHT ON BLACK BOX STUDIES


On April 22, the Center for Statistics and Applications in Forensic Evidence (CSAFE) hosted the webinar, Shining a Light on Black Box Studies. It was presented by Dr. Kori Khan, an assistant professor in the Department of Statistics at Iowa State University, and Dr. Alicia Carriquiry, CSAFE director and Distinguished Professor and President’s Chair in Statistics at Iowa State.

In the webinar, Khan and Carriquiry used two case studies—the Ames II ballistics study and a palmar prints study by Heidi Eldridge, Marco De Donno, and Christophe Champod (referred to in the webinar as the EDC study)—to illustrate the common problems of examiner representation and high levels of non-response (also called missingness) in Black Box studies, as well as recommendations for addressing these issues in the future.

If you did not attend the webinar live, the recording is available at https://forensicstats.org/blog/portfolio/shining-a-light-on-black-box-studies/

What is Foundational Validity?

To understand Black Box studies, we must first understand foundational validity. The 2016 PCAST report brought Black Box studies into focus and defined them as the object of interest. The report detailed that for feature-comparison disciplines to be considered scientifically valid, foundational validity must be established, which means that empirical studies must show that, with known probability:

  • An examiner obtains correct results for true positives and true negatives.
  • An examiner obtains the same results when analyzing samples from the same types of sources.
  • Different examiners arrive at the same conclusions.

What is a Black Box Study?

The PCAST report proposed that the only way to establish foundational validity for feature-comparison methods that rely on some amount of subjective determination is through multiple, independent Black Box studies. In these studies, the examiner is treated as a “Black Box”: only the materials presented and the conclusions reached are recorded, not the reasoning used to get from one to the other.

Method: Examiners are given test sets and samples and asked to render opinions about what their conclusion would have been if this were actual casework. Examiners are not asked how they arrive at these conclusions. Data are collected and analyzed to establish accuracy. In a later phase, participants are given more data, and their responses are again collected and then measured for repeatability and reproducibility.

Goal: The goal with Black Box studies is to analyze how well the examiners perform in providing accurate results. Therefore, in these studies, it is essential that ground truth be known with certainty.

What are the common types of measures in Black Box studies?

The four common types of measures are False Positive Error Rate (FPR), False Negative Error Rate (FNR), Sensitivity, and Specificity. Inconclusive decisions are generally excluded from these measures because they are neither incorrect identifications nor incorrect exclusions; as a result, inconclusives are not treated as errors.
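As a rough illustration of how these four measures are computed from a study's decision counts, the sketch below uses made-up numbers (none of them come from an actual study); note how inconclusive decisions are tallied but left out of every rate:

```python
# Illustrative counts from a hypothetical Black Box study.
true_positives = 410      # same-source pairs called "identification"
false_negatives = 25      # same-source pairs called "exclusion"
true_negatives = 380      # different-source pairs called "exclusion"
false_positives = 8       # different-source pairs called "identification"
inconclusives = 120       # excluded: neither an incorrect ID nor an incorrect exclusion

# False Positive Rate: wrong identifications among conclusive different-source comparisons.
fpr = false_positives / (false_positives + true_negatives)

# False Negative Rate: wrong exclusions among conclusive same-source comparisons.
fnr = false_negatives / (false_negatives + true_positives)

# Sensitivity and specificity are the complements of FNR and FPR.
sensitivity = true_positives / (true_positives + false_negatives)
specificity = true_negatives / (true_negatives + false_positives)

print(f"FPR={fpr:.3f}, FNR={fnr:.3f}, sensitivity={sensitivity:.3f}, specificity={specificity:.3f}")
```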

What are some common problems in some existing Black Box studies?

Representative Samples of Examiners

In order for results to reflect real-world scenarios, we need to ensure that the Black Box volunteer participants are representative of the population of interest. In an ideal scenario, volunteers are pulled from a list of persons within the population of interest, though this is not always possible.

All Black Box studies rely on volunteer participation, which can lead to self-selection bias, meaning those who volunteer are different from those who don’t. For example, perhaps those who volunteer are less busy than those who don’t volunteer. Therefore, it’s important that Black Box studies have inclusion criteria to help make the volunteer set as representative of the population of interest as possible.

In the Ames II case study, volunteers were solicited through the FBI and the Association of Firearm and Toolmark Examiners (AFTE) contact list, and participation was limited to examiners working for accredited U.S. public crime laboratories who were current AFTE members.

Problems with this set:

  • Many examiners do not work for an accredited U.S. public crime laboratory.
  • Many examiners are not current members of AFTE.

Overall, there is strong evidence that the volunteer set in this study does not match or represent the population of interest, which can negatively influence the accuracy of Black Box study results.

Handling Missing Data

Statistical literature offers many rules of thumb stating that it is acceptable to carry out statistical analyses on the observed data if the missing data account for roughly 5% to 20% of the total and the missingness is “ignorable.” If missingness is non-ignorable, any amount of missingness can bias estimates. Across most Black Box studies, missing data is between 30% and 40%. We can adjust for some non-response, but first we must know whether it is ignorable or non-ignorable.

  • Adjusting for missing data depends on the missingness mechanism (potentially at two levels: unit and item).
  • Ignorable:
    • Missing Completely at Random (MCAR): the probability that any observation is missing does not depend on any other variable in the dataset (observed or unobserved).
    • Missing at Random (MAR): the probability that any observation is missing depends only on other observed variables.
  • Non-ignorable:
    • Not Missing at Random (NMAR): the probability that any observation is missing depends on unobserved values.
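To see why the distinction matters, a small simulation with entirely invented numbers shows how non-ignorable missingness biases the naive error rate computed from the observed responses, while comparable MCAR missingness does not:

```python
import numpy as np

rng = np.random.default_rng(0)
n_examiners, n_items = 200, 75

# Hypothetical per-examiner error probabilities (made-up values).
error_prob = rng.uniform(0.01, 0.15, size=n_examiners)
errors = rng.random((n_examiners, n_items)) < error_prob[:, None]

true_error_rate = errors.mean()

# Ignorable case (MCAR): every response has the same 30% chance of being missing.
mcar_mask = rng.random((n_examiners, n_items)) < 0.30

# Non-ignorable case (NMAR): error-prone examiners skip half their items,
# careful examiners skip only 10% of theirs.
high_error = error_prob > np.median(error_prob)
nmar_rate = np.where(high_error, 0.50, 0.10)
nmar_mask = rng.random((n_examiners, n_items)) < nmar_rate[:, None]

print(f"true error rate:             {true_error_rate:.3f}")
print(f"estimate ignoring MCAR gaps: {errors[~mcar_mask].mean():.3f}")  # close to the truth
print(f"estimate ignoring NMAR gaps: {errors[~nmar_mask].mean():.3f}")  # biased downward
```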

To make this determination, the following data at a minimum must be known:

  • The participants who enrolled and did not participate
  • The participants who enrolled and did participate
  • Demographics for each of these groups of examiners
  • The total number of test sets and types assigned to each examiner
  • For each examiner, a list of the items he/she did or did not answer
  • For each examiner, a list of the items he/she did or did not correctly answer

Most Black Box studies do not release this information or the raw data.

However, the EDC study made much of the necessary data known, allowing researchers to study missingness empirically. If there is a characteristic of examiners that is associated with higher error rates, and if that characteristic is also associated with higher levels of missingness, we have evidence that the missingness is non-ignorable and can come up with ways to address it.

In this example, of the 226 examiners who returned some test sets in the study, 197 also had demographic information. Of those 197, 53 failed to render a decision for over half of the 75 test items presented to them. The EDC study noted that examiners who worked for non-U.S. entities committed 50% of the false positives made in the study but accounted for less than 20% of the examiners. Researchers therefore wanted to know whether examiners who worked for non-U.S. entities also had higher rates of missingness. After analyzing the data, they found that 28% of the examiners with missingness of over half worked for non-U.S. entities, compared with the 19% that would be expected if missingness were unrelated to employer.

Researchers then conducted a hypothesis test to see if there was an association between working for a non-U.S. entity and missingness: draw a random sample of examiners the same size as the high-missingness group, calculate the proportion in the sample who work for non-U.S. entities, repeat many times, and compare the observed value of 28% to the simulated ones.

  • H0: Working for a non-US entity is statistically independent of missingness
  • HA: Working for a non-US entity is associated with a higher missingness

Using this method, researchers found that the observed result (28%) would occur only about 4% of the time if there were no relationship between missingness and working for a non-U.S. entity. This is strong evidence that working for a non-U.S. entity is associated with higher missingness.
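A resampling test of this kind takes only a few lines of code. The sketch below uses the counts mentioned in the webinar (197 examiners with demographics, 53 with missingness over half, roughly 19% working for non-U.S. entities); the implementation details are illustrative rather than the study's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(1)

n_examiners = 197          # examiners with demographic information
n_high_missing = 53        # examiners missing more than half of their items
n_foreign = round(0.19 * n_examiners)   # roughly 19% work for non-U.S. entities
observed_prop = 0.28       # observed foreign share among high-missingness examiners

# Under H0, which 53 examiners have high missingness is unrelated to employer,
# so repeatedly draw 53 examiners at random and record the foreign share.
labels = np.array([1] * n_foreign + [0] * (n_examiners - n_foreign))
sims = np.array([
    rng.choice(labels, size=n_high_missing, replace=False).mean()
    for _ in range(10_000)
])

# One-sided p-value: how often random draws are at least as extreme as observed.
p_value = (sims >= observed_prop).mean()
print(f"simulated p-value = {p_value:.3f}")  # small (a few percent), in line with the ~4% reported
```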

Researchers repeated the process to test whether missingness is higher among examiners who did not work for an accredited lab and had similar findings:

In this case, the hypothesis test showed that this result (47%) would be expected only about 0.29% of the time. Therefore, there is strong evidence that working for an unaccredited lab is associated with higher missingness.

What are the next steps for gaining insights from Black Box studies?

The two issues discussed in this webinar—lack of a representative sample of participants and non-ignorable non-response—can be addressed in the short term with minor funding and cooperation among researchers.

Representation

  • Draw a random sample of courts (state, federal, nationwide, etc.)
  • Enumerate experts in each
  • Stratify and sample experts
  • Even if the person refuses to participate, at least we know in which ways (education, gender, age, etc.) the participants are or are not representative of the population of interest.

Missingness

  • This is producing the biggest biases in the studies that have been published.
  • Adjusting for non-response is necessary for the future of Black Box studies.
  • Results can be adjusted if those who conduct the studies release more data and increase transparency to aid collaboration.

Longer term solutions include:

  • Limiting who qualifies as an “expert” when testifying in court (existing standards require little to no certification, education, or testing)
  • Institutionalized, regular discipline-wide testing with expectations of participation.
  • Requirements to share data from Black Box studies in more granular form.

Insights: A Practical Tool for Information Management in Forensic Decisions

INSIGHTS

A Practical Tool for Information Management in Forensic Decisions:

Using Linear Sequential Unmasking-Expanded (LSU-E) in Casework

OVERVIEW

While forensic analysts strive to make their findings as accurate and objective as possible, they are often subject to external and internal factors that might bias their decision making. Researchers funded by CSAFE created a practical tool that laboratories can use to implement Linear Sequential Unmasking-Expanded (LSU-E; Dror & Kukucka, 2021)—an information management framework that analysts can use to guide their evaluation of the information available to them. LSU-E can improve decision quality and reduce bias but, until now, laboratories and analysts have received little concrete guidance to aid implementation efforts.

Lead Researchers

Quigley-McBride, A.
Dror, I.E.
Roy, T.
Garrett, B.L.
Kukucka, J.

Journal

Forensic Science International: Synergy

Publication Date

17 January 2022

Goals

1. Identify factors that can bias decision-making.

2. Describe how LSU-E can improve forensic decision processes and conclusions.

3. Present a practical worksheet, as well as examples and training materials, to help laboratories incorporate LSU-E into their casework.

TYPES OF COGNITIVE BIAS

Cognitive biases can emerge from a variety of sources, including:

Figure 1. Eight sources of cognitive bias in forensic science (Dror, 2020)

COGNITIVE BIAS IN FORENSIC SCIENCE

As shown in Figure 1, there are many potential sources of information that can influence analysts’ decisions. Of particular concern is suggestive, task-irrelevant contextual information (such as a suspect’s race, sex, or prior criminal record) that can bias analysts’ decisions in inappropriate ways.

In one famous example, FBI latent print analysts concluded with “100 percent certainty” that a print linked to the 2003 Madrid train bombing belonged to a US lawyer, Brandon Mayfield. It transpired that these analysts were all wrong—that was not Mayfield’s print. Mayfield was Muslim, which might have biased the analysts given the strong, widespread attitudes towards Muslims post 9/11. Also, Mayfield was on the FBI’s “watch list” because he provided legal representation to someone accused of terrorist activities. Combined, these facts led to confirmation bias effects in the analysts’ evaluations and conclusions about Mayfield’s fingerprints.

LSU-E AND INFORMATION MANAGEMENT

LSU-E is an approach to information management that prioritizes case information based on three main criteria:

Biasing power:

How strongly the information might dispose an analyst to a particular conclusion.

Objectivity:

The extent to which the information might be interpreted to have different “meanings” from one analyst to another.

Relevance:

The degree to which the information is essential to the analytic task itself.

IMPLEMENTING LSU-E IN FORENSICS

Quigley-McBride et al. have created a practical worksheet for laboratories to use when assessing new information.

1. First, the user specifies the information in question and its source.

2. Second, they consider the three LSU-E criteria and rate the information on a scale of 1-5 for each criterion.

3. Finally, they describe strategies to minimize any adverse effects the information may have on the decision-making process.
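For laboratories that track this electronically, the worksheet's fields could be captured with a simple record such as the sketch below; the class name, example entry, and ratings are invented for illustration, while the three criteria and the 1-5 scales follow the description above:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LSUEWorksheetEntry:
    """One piece of case information, rated against the three LSU-E criteria."""
    information: str             # what the information is
    source: str                  # where it came from
    biasing_power: int           # 1-5: how strongly it could pull toward a conclusion
    objectivity: int             # 1-5: how consistently analysts would interpret it
    relevance: int               # 1-5: how essential it is to the analytic task
    mitigation_strategies: List[str] = field(default_factory=list)

# Hypothetical entry, for illustration only.
entry = LSUEWorksheetEntry(
    information="Suspect reportedly confessed to detectives",
    source="Submitting detective's notes",
    biasing_power=5,
    objectivity=2,
    relevance=1,
    mitigation_strategies=["Withhold from the analyst until the comparison is documented"],
)
print(entry)
```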

Focus on the future

 

Ideally, LSU-E procedures would be applied before the information reaches the analyst. That said, it is still effective when used at any point in the analyst’s workflow and can help analysts become aware of information that can inappropriately influence their work.

In addition to benefits for analysts, implementing LSU-E could help jurors evaluate the reliability of forensic expert testimony. This would not only encourage healthy skepticism among jurors, but could bolster an expert’s credibility by providing documentation of methods used to evaluate and mitigate potential biases in their decisions.

Insights: Handwriting Identification Using Random Forests and Score-Based Likelihood Ratios

INSIGHTS

Handwriting Identification Using Random Forests and Score-Based Likelihood Ratios

OVERVIEW

Handwriting analysis has long been a largely subjective field of study, relying on visual inspections from trained examiners to determine if questioned documents come from the same source. In recent years, however, efforts have been made to develop methods and software which quantify the similarity between writing samples more objectively. Researchers funded by CSAFE developed and tested a new statistical method for handwriting recognition, using a score-based likelihood ratio (SLR) system to determine the evidential value.

Lead Researchers

Madeline Quinn Johnson
Danica M. Ommen

Journal

Statistical Analysis and Data Mining

Publication Date

03 December 2021

The Goals

1. Apply the SLR system to various handwritten documents.

2. Evaluate the system’s performance with various approaches to the data.

The Study

CSAFE collected handwriting samples from 90 participants, using prompts of various lengths to get samples of different sizes. These writing samples were broken down into graphs, or writing segments with nodes and connecting edges, then grouped into clusters for comparison.

When comparing the gathered samples, Johnson and Ommen considered two possible scenarios:

Common Source Scenario:

two questioned documents with unknown writers are compared to determine whether they come from the same source.

Specific Source Scenario:

a questioned document is compared to a prepared sample from a known writer.

They then used Score-based Likelihood Ratios (SLRs) to approximate the weight of the evidence in both types of scenarios.

The researchers used three different approaches when generating the known non-matching comparisons for the specific source SLRs:

Trace-Anchored Approach:

only uses comparisons between the questioned document (the trace) and a collection of writers different from the specific source (the background population).

Source-Anchored Approach:

only uses comparisons between writing from the specific source and the background population.

General-Match Approach:

only uses comparisons between samples from different writers in the background population.

The comparison scores underlying the SLRs were produced by random forest algorithms, including a random forest pre-trained on all of the gathered data and one trained according to the relevant SLR approach.
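The core SLR computation can be sketched as follows: estimate the distribution of scores for known same-source and known different-source comparisons, then take the ratio of the two densities at the questioned comparison's score. The score values below are simulated placeholders rather than the paper's data, and the kernel density approach is one common choice rather than necessarily the authors' exact method:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(2)

# Placeholder comparison scores, standing in for random-forest similarity scores.
same_source_scores = rng.normal(loc=0.75, scale=0.10, size=500)
diff_source_scores = rng.normal(loc=0.40, scale=0.12, size=500)

# Estimate the score densities under each proposition with kernel density estimates.
f_same = gaussian_kde(same_source_scores)
f_diff = gaussian_kde(diff_source_scores)

def score_based_lr(score: float) -> float:
    """SLR = density of the score under same-source / density under different-source."""
    return float(f_same(score) / f_diff(score))

questioned_score = 0.68
print(f"SLR at score {questioned_score}: {score_based_lr(questioned_score):.1f}")
```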

Results

1. In common source scenarios, the trained random forest performed well with longer writing samples, but struggled with shorter ones.

2. The specific source SLRs performed better than the common source SLRs because they are tailored to the case at hand.

3. In all scenarios, it was more difficult for the SLR system to conclude that samples came from the same source than that they came from different sources.

FOCUS ON THE FUTURE

 

The SLRs do not perform well with short documents, possibly due to a mismatch between the number of clusters used and the length of the document. Future work could determine the optimal number of clusters based on the document’s length.

Because the SLRs provide data on the strength of forensic handwriting evidence for an open-set of sources, this approach is an improvement on the previous clustering method developed by CSAFE, which used a closed set of known sources.

Insights: Using the Likelihood Ratio in Bloodstain Pattern Analysis

INSIGHTS

Using the Likelihood Ratio in Bloodstain Pattern Analysis

OVERVIEW

Using likelihood ratios (LRs) when reporting forensic evidence in court has significant advantages, as it allows forensic practitioners to consider their findings from the perspective of both the defense and the prosecution. However, despite many organizations adopting or recommending this practice, most experts in the field of bloodstain pattern analysis (BPA) still use a more traditional, subjective approach, indicating whether their findings are “consistent with” stated allegations. Researchers funded by CSAFE explored the challenges that come with using LRs when reporting BPA evidence, and proposed possible solutions to meet these challenges, concluding that the LR framework is applicable to BPA, but that it is a complex task.

Lead Researchers

Daniel Attinger
Kris De Brabanter
Christophe Champod

Journal

Journal of Forensic Sciences

Publication Date

29 October 2021

Goals

1. Determine why many BPA experts do not use LRs in their reporting.

2. Present directions the community could take to facilitate the use of LRs.

3. Provide an example of how LRs are applied in a relevant field.

CHALLENGES OF USING LIKELIHOOD RATIOS

Likelihood ratios (LRs) compare two competing hypotheses to see which better fits the evidence. While this practice has several advantages for use in court, as it provides a more objective and transparent view of an expert’s findings, there are challenges when it comes to applying LRs to bloodstain pattern analysis.
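In its general form, the likelihood ratio compares the probability of the evidence under the prosecution's proposition with its probability under the defense's proposition:

```latex
\[
  \mathrm{LR} \;=\; \frac{\Pr(E \mid H_p)}{\Pr(E \mid H_d)}
\]
% E   : the observed evidence (for BPA, e.g., the bloodstain pattern or drying time)
% H_p : the proposition advanced by the prosecution
% H_d : the proposition advanced by the defense
% LR > 1 supports H_p; LR < 1 supports H_d.
```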

Graph displaying factors that can affect the complexity of BPA

Attinger et al. identified two key factors affecting a likelihood ratio’s complexity (illustrated in the graph above).

This is further complicated by the nature of bloodstain pattern analysis itself. BPA focuses on questions of activity (how far, how long ago, in what direction the blood traveled) or the type of activity (what caused the blood pattern), rather than questions of source as is normal for most forensic LR models. In addition, BPA as a field consists of a wide range of methods, and is a complex science that is still being built.

EXAMPLE OF LIKELIHOOD RATIOS IN ACTION

A recent study demonstrated how LRs could be used in BPA by applying them to the related field of fluid dynamics. In their test, the researchers compared the drying time of a blood pool in a laboratory setting with that of one observed in photographs.

From these observations, they created a physical model factoring in time, the scale and shape of the blood pool, and the surface on which the pool formed. This model could then be applied within a likelihood ratio, comparing propositions from the prosecution and defense.

In this instance, the evidence would be 2330 times more likely under the defense’s proposition than under the prosecution’s.

Focus on the future

Attinger et al. propose three directions to facilitate the use of LRs in the field of BPA:

 

  • Promote education and research to better understand the physics of fluid dynamics and how they relate to BPA.
  • Create public databases of BPA patterns, and promote a culture of data sharing and peer review.
  • Develop BPA training material that discusses LRs and their foundations.

Insights: Latent Print Quality in Blind Proficiency Testing

INSIGHTS

Latent Print Quality in Blind Proficiency Testing:

Using Quality Metrics to Examine Laboratory Performance

OVERVIEW

Organizations and leaders continuously call for blind proficiency testing in modern forensic labs because it more accurately simulates routine examiner casework. In response, researchers funded by CSAFE worked with the Houston Forensic Science Center to assess the results of their blind quality control program and then applied quality metrics to the test materials to see if the quality of the prints impacted their conclusions.

Lead Researchers

Brett O. Gardner
Maddisen Neuman
Sharon Kelley

Journal

Forensic Science International

Publication Date

May 7, 2021

THE GOALS

1. Examine the results of blind proficiency testing within a fingerprint examination unit of a crime laboratory.

2. Use available quality metrics software to measure the quality of the submitted prints.

3. See if there is an association between fingerprint quality and examiners’ conclusions.

The Studies

The Quality Division at the Houston Forensic Science Center inserted 376 prints into 144 blind test cases over a two-year period. In these cases, examiners determined if the prints were of sufficient quality to search in their Automated Fingerprint Identification System (AFIS). After searching AFIS for corresponding prints, they concluded whether or not the test prints were similar enough to make a Primary AFIS Association (PAA). Then, the Blind Quality Control (BQC) team judged the examiners’ accuracy.

 

Meanwhile, Gardner et al. entered the same test prints into LQMetrics –– a commonly used software tool for fingerprint examiners that rates the quality of a print image on a scale of 0–100. The team scored print images with a quality score greater than 65 as “Good,” 45–65 as “Bad,” and lower than 45 as “Ugly.”
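Expressed in code, the categorization rule is simply a pair of thresholds (the function name below is illustrative; the cutoffs are those given above):

```python
def categorize_print_quality(lqmetrics_score: float) -> str:
    """Map an LQMetrics quality score (0-100) to the study's three categories."""
    if lqmetrics_score > 65:
        return "Good"
    elif lqmetrics_score >= 45:      # 45-65 inclusive
        return "Bad"
    else:                            # below 45
        return "Ugly"

print([categorize_print_quality(s) for s in (80, 55, 30)])  # ['Good', 'Bad', 'Ugly']
```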

Results

Examiner Conclusions Among Good, Bad and Ugly Latent Prints

Print quality (as categorized by Good, Bad and Ugly) was significantly associated with examiner conclusions and ultimate accuracy. Note: N = 373 prints. There were 133 Good prints, 114 Bad prints and 126 Ugly prints

FOCUS ON THE FUTURE

 

The correct source for prints submitted to AFIS appeared in the top ten results only 41.7% of the time, lower than an estimated 53.4% of the time based on the quality of such prints. This highlights the potential for blind proficiency testing to gauge the accuracy of the entire system –– including AFIS.

This study only included prints that had the potential to be submitted to AFIS, dismissing images not labeled as latent prints. Future studies should include a full range of images to more closely reflect real-world casework.

Insights: Recognition of Overlapping Elliptical Objects in a Binary Image

INSIGHTS

Recognition of Overlapping Elliptical Objects in a Binary Image

OVERVIEW

A common objective in bloodstain pattern analysis is identifying the mechanism that produced the pattern, such as gunshots or blunt force impact. Existing image-based methods often ignore overlapping objects, which can limit the number of usable stains. Researchers funded by CSAFE established a novel technique for image analysis to provide more accurate data.

Lead Researchers

Tong Zou
Tianyu Pan
Michael Taylor
Hal Stern

Journal

Pattern Analysis and Applications

Publication Date

4 May 2021

Goals

1. Develop a method to classify shapes in complex images.

2. Apply this method to data of different types, including bloodstain patterns.

3. Compare the new method’s accuracy to existing methods.

Approach and Methodology

When analyzing bloodstain patterns, the individual stains may appear as clumps composed of overlapping objects (e.g., droplets). Zou et al. developed a new computational method that identifies the individual objects making up each clump. The method proceeds as follows (a simplified code sketch appears after the steps):

1. Generate a large number of elliptical shapes that match the overall contours of the clump.

2. Use an empirical measure of fit to reduce the set of candidate ellipses.

3. Identify concave points in the clump’s contour and set up an optimization to determine the best-fitting ellipses.
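A greatly simplified version of this idea can be sketched with OpenCV: split each clump's contour at concave points found via convexity defects, then fit an ellipse to every resulting segment. This is only an illustration of the general strategy, not the authors' DTECMA algorithm, and the depth threshold is arbitrary:

```python
import cv2
import numpy as np

def fit_ellipses_to_clump(binary_image: np.ndarray, min_defect_depth: float = 3.0):
    """Split each blob's contour at concave points and fit ellipses to the pieces.

    A rough illustration of decomposing overlapping elliptical objects; it does not
    perform the candidate-reduction or optimization steps described above.
    """
    contours, _ = cv2.findContours(binary_image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    ellipses = []
    for contour in contours:
        hull = cv2.convexHull(contour, returnPoints=False)
        defects = cv2.convexityDefects(contour, hull)

        # Indices of sufficiently deep concave points along the contour.
        split_idx = []
        if defects is not None:
            for start, end, far, depth in defects[:, 0]:
                if depth / 256.0 > min_defect_depth:   # depth is stored as fixed-point
                    split_idx.append(far)

        # Cut the contour at the concave points; fall back to the whole contour.
        segments = np.split(contour, sorted(split_idx)) if split_idx else [contour]
        for segment in segments:
            if len(segment) >= 5:                       # cv2.fitEllipse needs >= 5 points
                ellipses.append(cv2.fitEllipse(segment))
    return ellipses
```

A full implementation would then score and prune these candidate ellipses, as in step 2 above.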

Image Processing

Examples of ellipse fitting results for synthetic data. (a) Original binary image; (b) Ground truth; (c) DEFA model; (d) BB model; (e) DTECMA. The number of true ellipses increases from 2 (leftmost column) to 9 (rightmost column). Rows (c) and (d) are results from existing methods; row (e) gives results for Zou et al.’s DTECMA algorithm.

The researchers tested the method on a set of over 1,600 test images with overlapping shapes, emulating bloodstains (row a).

Study Results

  • Across four different metrics, the new approach outperformed existing approaches.
  • The current methods struggled to correctly recognize shapes as the number of ellipses per picture grew. Only the new method was able to maintain consistent accuracy.


Focus on the future

 

The new approach to identifying elliptical-shaped objects in complex images shows marked improvement over current methods. This is demonstrated using simulated data and biological data for which the underlying truth is known.

While these results are promising, there is currently no way to quantify the performance of these models for bloodstain pattern analysis. The paper shows that the new method seems to do well based on visual inspection.

The next stage of the research is to use the identified ellipses as summaries of the images that can be used to develop statistical methods for analyzing bloodstain patterns.

Insights: Mt. Everest— We Are Going to Lose Many

INSIGHTS

Mt. Everest—
We Are Going to Lose Many:

A Survey of Fingerprint Examiners’ Attitudes Towards Probabilistic Reporting

OVERVIEW

Traditionally, forensic examiners tend to use categorical language in their reports, presenting evidence in broad terms such as “identification” or “exclusion.” There have been efforts in recent years to promote the use of more probabilistic language, but many examiners have expressed concerns about the proposed change.

Researchers funded by CSAFE surveyed fingerprint examiners to better understand how examiners feel about probabilistic reporting and to identify obstacles impeding its adoption.

Lead Researchers

H. Swofford
S. Cole 
V. King

Journal

Law, Probability, and Risk

Publication Date

7 April 2021

Goals

1. Learn what kind of language forensic examiners currently use when reporting evidence.

2. Gauge attitudes toward probabilistic reporting and the reasoning behind those attitudes.

3. Explore examiners’ understanding of probabilistic reporting.

The Study

Results

  • Only 10% of participants reported using probabilistic language.
  • Only 2% actually used probabilistic language in the open-response question.
  • 58% felt that probabilistic language was not an appropriate direction for the field.
  • The most common concern was that “weaker,” more uncertain terms could be misunderstood by jurors or used by defense attorneys to “undersell” the strength of their findings.
  • Another concern was that a viable probabilistic model was not ready for use in a field as subjective as friction ridge analysis –– and may not even be possible.
  • While many felt that probabilistic language may be more accurate, they preferred categorical terms as “stronger” and more in line with over a century of institutional norms.

Focus on the future

 

The views of the participants were not a handful of outdated “myths” that need to be debunked, but a wide and varied array of strongly held beliefs. Many practitioners are concerned about “consumption” issues –– how lawyers, judges, and juries will understand the evidence –– that are arguably outside their role as forensic scientists.

While many participants expressed interest in probabilistic reporting, they also felt they were not properly trained to understand probabilities since it has never been a formal requirement. Additional education and resources could help examiners more confidently adopt the practice.

Insights: Judges and Forensic Science Education: A national survey

INSIGHTS

Judges & Forensic Science Education:

A national survey

OVERVIEW

Forensic evidence can play a crucial role in adjudicating a criminal trial. As scientific authorities scrutinize the reliability of many forensic methods, it is important for judges to be trained and educated to make more informed decisions. Since most judges lack a scientific background, additional training may play an important role. However, the effectiveness of additional training and how it affects judges’ perception of forensic evidence is unknown.

Lead Researchers

Brandon L. Garrett
Brett O. Gardner
Evan Murphy
Patrick Grimes

Journal

Forensic Science International

Publication Date

April 2021

Goals

In collaboration with the National Judicial College (NJC), researchers conducted a survey of 164 judges from 39 states who had participated in NJC programs in order to:

Learn judges’ backgrounds and training in forensic science.

Discover their views on the reliability of modern forensic disciplines.

Understand what additional materials and training judges need to better evaluate forensic science.

The Study

1. In part one, the judges described their past experience with forensic science and estimated a percentage of past cases that dealt with forensic evidence.

2. In part two, the judges reported the amount of training they had involving forensic science, described the availability of training materials, and identified the resources they want in the future.

3. In part three, the judges described their familiarity with statistical methods and estimated the error rates in common forensic science disciplines.

Results

  • 37.4% of past cases involved forensic evidence
  • 14.7% of past cases included hearings on the admissibility of evidence
  • 13.5% of past cases had forensic evidence ruled inadmissible
  • An overwhelming majority received training on forensic evidence through further education as a judge but suggested more of this training should occur in law school.
  • They believed that DNA evidence was the most reliable form of forensic evidence –– and that bitemarks and shoeprints were the least reliable.
  • Judges who reported more extensive training were more likely to view themselves as gatekeepers of valid forensic science testimony and reported a higher percentage of evidence they ruled inadmissible.
  • On average, judges seem to underestimate the error rate of most forensic methods, though to a much lesser extent than lay people, lawyers, or even some forensic practitioners.
[Graphic: percentage of judges who endorsed more than one week of training specific to forensic science evidence.]

Focus on the future

 

The surveyed judges typically relied on journal articles, expert testimony, case law, and further education, but noted that these resources were not readily accessible. Additional education would help judges in their role as gatekeepers, preventing “junk science” from being presented at trial.

Judges expressed a desire for additional training and online resources, especially in fields they rated as more reliable. Those include digital, DNA, and toxicology evidence –– these resources would allow judges to make more informed rulings on technical subjects.

Insights: Battling to a Draw

INSIGHTS

Battling to a Draw:

Defense Expert Rebuttal Can Neutralize Prosecution Fingerprint Evidence

OVERVIEW

While all forensic science disciplines pose some risk of error, the public typically believes that testimony from fingerprint experts is infallible. By employing rebuttal experts who can educate jurors about the risk of errors or provide opposing evidence, courts can counter this tendency. Researchers funded by CSAFE conducted a survey to study the effect of rebuttal experts on jurors’ perceptions.

Lead Researchers

Gregory Mitchell
Brandon L. Garrett

Journal

Applied Cognitive Psychology

Publication Date

4 April 2021

Goals

1. Determine if a rebuttal expert’s testimony can affect jurors’ beliefs in the reliability of fingerprint evidence.

2. Examine the responses of jurors with different levels of concern about false acquittals versus false convictions.

The Study

1,000 participants completed a survey which included questions regarding their concerns about false convictions or false acquittals.

The participants were then assigned at random to one of five mock trial conditions:

  • Control condition with no fingerprint evidence
  • Fingerprint expert testimony with no rebuttal
  • A methodological rebuttal: the expert focuses on the subjective nature of fingerprint analysis as a whole
  • An “inconclusive” rebuttal: the expert opines their own comparison was inconclusive due to the poor quality of the evidence
  • An “exclusion” rebuttal: the expert states that their own comparison shows the defendant could not have been the source of the fingerprints

Results

[Table: percentage of mock jurors voting for conviction and mean rated likelihood that the defendant committed the robbery, by trial condition and jurors’ trial error aversions.]

Focus on the future

 

While exclusion and inconclusive rebuttals provided the best results for the defense, the methodological rebuttal still significantly impacted the jurors’ views on fingerprint evidence.

Traditional cross-examination seems to have mixed results with forensic experts. This implies that a rebuttal testimony can be more effective and reliable, while producing long-term changes in jurors’ attitudes.

While a rebuttal expert’s testimony can be powerful, much of that power depends on the individual jurors’ personal aversions to trial errors. This could be an important consideration for jury selection in the future.

Check out these resources for additional research on forensic evidence and juries:

Insights: Forensic Science in Legal Education

INSIGHTS

Forensic Science in Legal Education

OVERVIEW

In recent years, new expert admissibility standards in most states call for judges to assess the reliability of forensic expert evidence. However, little has been reported on the education and training law schools offer to law students regarding forensic evidence. Researchers funded by CSAFE conducted a survey to find out how many schools offer forensic science courses and to examine the state of forensic science in legal education as a whole.

Lead Researchers

Brandon L. Garrett
Glinda S. Cooper
Quinn Beckham

Journal

Duke Law School Public Law & Legal Theory Series No. 2021-22

Publication Date

15 February 2021

Goals

1. Review the curricula of law schools across the United States.

2. Discover how many schools offer forensic science courses and what level of training they provide.

3. Discuss the survey results and their implications for the legal education system at large.

The Study

The 2009 National Academy of Sciences Report called for higher quality scientific education in law schools, citing the lack of scientific expertise among lawyers and judges as a longstanding gap. The American Bar Association then adopted a resolution calling for greater forensic sciences training among law students.

In late 2019 and spring 2020, Garrett et al. searched online course listings for 192 law schools included on the 2019 U.S. News & World Report ranking list. They then sent questionnaires to the faculties of these schools and requested syllabi to examine the coverage of the forensic science courses the schools offered.

With the data in hand, Garrett et al. could examine the type of forensic science-related coverage at law schools in the United States.

Results

  • Only 42 different forensic science courses were identified by the survey, and several schools did not offer any of these courses at all.
  • Across the board, the courses offered were all for upper-level students, and many courses were not offered every year, further limiting students’ access to forensic science training.
  • Only two of the reported courses mentioned teaching statistics or quantitative methods; the vast majority only covered legal standards for admissibility of expert evidence.
  • Compounding this lack of access was a low degree of demand. None of the responding faculty reported having large lecture courses; in fact, many reported class sizes of fewer than twenty students.

Focus on the future

 

The results of this survey suggest that the 2009 NAS Report’s call for higher standards in forensic science education remains highly relevant and that continuing legal education will be particularly useful in addressing these needs.

In addition to specialty courses in forensics, more general courses in quantitative methods, during and after law school, could provide a better understanding of statistics for future and current lawyers and judges.

There is still much work to be done in order to ensure greater scientific literacy in the legal profession. To quote Jim Dwyer, Barry Scheck, and Peter Neufeld, “A fear of science won’t cut it in an age when many pleas of guilty are predicated on the reports of scientific experts. Every public defender’s office should have at least one lawyer who is not afraid of a test tube.”