Insights: Mock Jurors’ Evaluation of Firearm Examiner Testimony




Traditionally, firearm and toolmark experts have testified that a weapon leaves “unique” marks on bullets and casings, permitting a “source identification” conclusion. While scientific organizations have called such categorical assertions into question, jurors still place a great deal of weight on a firearms expert’s testimony.

To examine the weight jurors place on this testimony, researchers conducted two studies: the first evaluated whether more cautious language influenced jurors’ opinions of expert testimony, and the second measured whether cross-examination altered those opinions.

Lead Researchers

Brandon L. Garrett
Nicholas Scurich
William E. Crozier


Journal

Law & Human Behavior

Publication Date



The team tested four hypotheses in these studies:


  • Jurors will accord significant weight to testimony declaring a categorical “match” between two casings.
  • More cautious language in a firearms expert’s testimony will not change jurors’ opinions.
  • Only the most cautious phrasing (i.e., “cannot exclude the gun”) will lower guilty verdicts.
  • Cross-examination will lower guilty verdicts, depending on the specific language used.

The Studies

Study 1:

1,420 participants read a synopsis of a criminal case that included the testimony of a firearms expert. The expert gave one of seven specifically worded conclusions, ranging from a simple “match” to a more cautious “reasonable degree of ballistic certainty” to “cannot be excluded.”

The participants then decided whether they would convict based on the testimony.

Study 2:

1,260 participants were given the same synopsis, with two important changes:

  • The expert’s testimony had three possible conclusions (inconclusive, a conclusive match, or a cautious
    “cannot be excluded”) rather than seven.
  • Some participants also heard cross-examination of the firearms expert.

The participants again decided whether they would convict the defendant and rated the testimony’s credibility.


Study 1:

Figure 1. Proportion of guilty verdicts with 95% confidence intervals.

  • Compared to an inconclusive result, a “match” conclusion tripled the rate of guilty verdicts. Variations in how the “match” was described did not affect verdicts.
  • The sole exception was when the match was described as the defendant’s gun “cannot be excluded” as the source; that phrasing only doubled, rather than tripled, the rate of guilty verdicts compared to an inconclusive result.

Study 2:




Figure 2. Proportion of guilty verdicts (with 95% confidence intervals) in each experimental condition.

  • Cross-examination did not help jurors consistently discount firearms conclusions. This is consistent with prior work showing mixed effects of cross-examination on jurors’ perceptions of the strength of evidence.
  • “Cannot exclude” and “identification” conclusions led to significantly more guilty verdicts than the “inconclusive” condition.

Focus on the future


Although more cautious language did not affect jurors’ decisions, there is no downside to adopting it, because it can prevent misleading or overstated conclusions.

Future studies should present video testimony and include deliberation to better mimic a real-world trial.

The methods that firearms experts use have not been adequately tested, so jurors cannot accurately judge the strength of the evidence or the expert’s proficiency. Further research into the validity and reliability of firearms comparison methods is needed.

Insights: Latent Print Comparison and Examiner Conclusions



A Field Analysis of Case Processing in One Crime Laboratory


While research exists on error rates and on sources of bias and influence in forensic examination, most of it has been conducted under controlled conditions. With this in mind, researchers set out to investigate real-world latent print comparison casework performed by the Houston Forensic Science Center (HFSC), assessing the results of its latent print analyses over an entire year.

Lead Researchers

Brett O. Gardner
Sharon Kelley 
Maddisen Neuman


Journal

Forensic Science International

Publication Date

December 2, 2020



The study had three goals:

  • Analyze the HFSC latent print unit’s 2018 casework and describe examiner conclusions.
  • Explore what factors might have affected the examiners’ decisions.
  • Establish the extent of differences between individual examiners’ conclusions.

The Study

Researchers gathered data from JusticeTrax, HFSC’s laboratory information management system, and used it to examine 20,494 latent print samples the HFSC team processed in 2018. In total, 17 examiners submitted reports that year; all were certified by the International Association for Identification and had anywhere from 5 to 36 years of experience in the field.

When provided a latent print for comparison, the examiners first checked whether the print had enough usable data to enter into an Automated Fingerprint Identification System (AFIS). If so, they made one of three conclusions based on the AFIS results:

  • No Association: The print does not potentially match any known print in the AFIS database.
  • Preliminary AFIS Association (PAA): The print potentially matches a known print in the AFIS database.
  • Reverse Hit: The print initially matches no known print in the AFIS database, but later matches a newly added record print.



  • 44.8% of the prints examined had enough usable data to enter into AFIS.
  • Out of the 11,812 prints entered into AFIS, only 20.7% (2,429 prints) resulted in a PAA.
  • Examiners were slightly more likely to conclude a print was sufficient to enter into AFIS in cases involving a person offense (a crime committed against a person).
  • The types of AFIS software used produced vastly different results: the county-level AFIS (MorphoTrak) and the federal-level AFIS (Next Generation Identification, or NGI) were each nearly five times more likely to return a PAA than the state-level AFIS (NEC).
  • Individual examiners applied drastically different standards, both as to whether a print had enough usable data to enter into AFIS and as to whether AFIS results qualified as a PAA. PAA rates could differ by roughly a factor of two: one examiner concluded that 13.3% of their AFIS results were PAAs, while another reached 27.1%.



Focus on the future

The major differences among the county-, state-, and federal-level AFIS software indicate that more research is needed on AFIS databases to improve their reliability across the board.

These results only reflect the work of one crime lab over the course of one year. Future research should be conducted with multiple labs in various locations.

HFSC made significant changes to its workflow in recent years, which may have contributed to the disparity in examiner conclusions.