Confronting Challenges in Forensic Databases: CSAFE Team Addresses Limitations and Advocates for Change

Determining if a piece of evidence found at a crime scene matches a suspect poses a unique challenge for forensic investigators. Just how reliable are their conclusions? The current tools used often rely on expert experience, and aren’t as systematic as they could be.

This is where forensic databases can help. Historical records of pattern evidence such as fingerprints, bullet trajectories or toolmarks can provide forensic examiners with a more concrete answer to the matching debate. CSAFE researcher and University of Virginia Associate Professor of Statistics Dan Spitzner says that, for example, a lot of work has been done using statistical methodology and databases to summarize what is seen in a fingerprint as a numerical score describing the likelihood that two fingerprints match.

“The question is, what does that number really mean? Can you report that with any kind of reliability and can you give it an interpretation that perhaps a jury or a judge can understand?” Spitzner said.

So how, then, can databases be improved to offer valuable, accurate information to investigators? This is the question Spitzner and his CSAFE research team are trying to answer.

“When we look at a database, a question is whether it’s really an informative database,” Spitzner said.

In considering the effectiveness of a database, Spitzner explains that we need to ask, “Is it the right kind of database to use, are there biases in the database, does it represent the population we are thinking of?”

Spitzner is using concepts from probability and statistics to help the numbers reported from databases make sense. “If there’s a statistical concept that has any chance of being interpreted, it’s a probability, and it is my job to make it easy to understand,” he said.

What does an ideal database look like? There are two main issues, according to Spitzner.

“One has to do with replication,” he said. “We wish more databases had multiple measurements of, for example, the same fingerprints, as a way to understand how fingerprints vary.”

“The other piece is we wish we could observe the same fingerprint being observed in multiple contexts such as different types of surfaces and by investigators with different levels of expertise,” Spitzner said.

Spitzner explains that this doesn’t necessarily exist in current databases, likely because investigators are approaching databases with a different objective than researchers.

“My impression is that people aren’t thinking about doing formal statistical inference when they are collecting fingerprints and compiling databases, they are thinking about collecting what information is at a scene,” he said.

Spitzner recognizes that researchers can dream about designing the ideal database, but the reality is that the data is probably never going to be collected.

“Now that we have identified the type of information that we need from databases, we need to take the databases that are already out there and understand the information they contain and think hard about how to get statistically meaningful information from them,” he said.

As Spitzner continues to find ways to improve the interpretation and performance of databases, the work will be of special value to forensic scientists working with pattern evidence such as fingerprints, giving them accurate, precise and interpretable tools that could be used in a courtroom.

Other members of the CSAFE team are applying these concepts and working to build new informative databases in areas such as digital forensics. Dr. Jennifer Newman at Iowa State University focuses on steganography, the hiding of content inside digital files such as photos, and on the forensic analysis of images that may contain hidden content related to criminal investigations. Current databases only include images from a still camera, but the more than 50,000 images in Newman’s database were captured using a variety of camera settings, which can improve the reliability of steganalysis detection tools.

“We hope other researchers can look at the data we gather and develop a quantitative statistical analysis that is useful in a court of law,” Newman said.

CSAFE researchers are also leading educational workshops to further investigate the role of forensic databases and how to address key issues. Together, our team is working to improve the certainty forensic science practitioners have that a piece of sample evidence matches a database element, leading to increased accuracy in suspect identification.