NIST Finalizes Report of Digital Forensic Methods

The entrance sign at NIST's Gaithersburg campus. Credit: J. Stoughton/NIST

The National Institute of Standards and Technology (NIST) has finalized the report Digital Investigation Techniques: A NIST Scientific Foundation Review.

The report reviews the scientific foundations of forensic methods for analyzing computers, mobile phones and other electronic devices.

A draft report was first published in May 2022 and was open for public comment through July 11. The report was updated based on the comments received and to improve clarity, flow and accessibility.

In the report, the authors point out some limitations of digital investigations that practitioners should be aware of:

  • As with any crime scene, not all evidence may be discovered.
  • When recovering deleted files, the results may include extraneous material.
  • Examiners need to understand the meaning and significance of the digital artifacts they retrieve, as those artifacts can change across different versions of operating systems or applications.

The report discusses many areas that need further research and improved processes, including better methods for sharing forensic knowledge among experts, more efficient and consistent approaches to testing forensic tools, and better sharing of forensic reference data.

More details about the report, including a link to download the final version, can be found at https://www.nist.gov/spo/forensic-science-program/digital-investigation-techniques-nist-scientific-foundation-review.

The Center for Statistics and Applications in Forensic Evidence (CSAFE), a NIST Center of Excellence, conducts research addressing the need for forensic tools and methods for digital evidence. Learn more about this research at forensicstats.org/digital-evidence.

NIST Updates Software Reference Library

Software files can be identified by a sort of electronic fingerprint called a hash. The National Software Reference Library (NSRL) dataset update makes it easy to separate hashes of run-of-the-mill files from hashes of files that might contain incriminating evidence, making investigative work easier. Credit: N. Hanacek/NIST

The National Institute of Standards and Technology (NIST) announced an update to the National Software Reference Library (NSRL). The expanded, more searchable database will make it easier to sift through seized computers, phones and other electronic equipment.

The database plays a frequent role in criminal investigations involving electronic files, which can be evidence of wrongdoing. According to the NIST news release, “In the first major update to the NSRL in two decades, NIST has increased the number and type of records in the database to reflect the widening variety of software files that law enforcement might encounter on a device. The agency has also changed the format of the records to make the NSRL more searchable.”

NIST said that criminal and civil investigations frequently involve digital evidence in the form of software and files from seized computers and phones. Investigators need a way to filter out the large quantities of data irrelevant to the investigation so they can focus on finding relevant evidence.

The news release stated, “The update comes at a time when investigators must contend with a rapidly expanding universe of software, most of which produces numerous files that are stored in memory. Each of these files can be identified by a sort of electronic fingerprint called a hash, which is the key to the sifting process. The sophistication of the sifting process can vary depending on the type of investigation being performed.”
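To make the sifting process concrete, here is a minimal Python sketch of hash-based triage, assuming a set of known-file hashes loaded from NSRL-style records (NSRL reference data includes SHA-1 hashes among other fields). The directory path and the hash entry below are hypothetical placeholders, not real NSRL records.

```python
import hashlib
from pathlib import Path

def sha1_of_file(path: Path) -> str:
    """Compute the SHA-1 digest of a file, reading in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest().upper()

# Hypothetical known-file hashes, e.g. loaded from NSRL-style records.
KNOWN_HASHES = {
    "DA39A3EE5E6B4B0D3255BFEF95601890AFD80709",  # placeholder entry
}

def triage(directory: str):
    """Split files into known (safe to skip) and unknown (worth review)."""
    known, unknown = [], []
    for path in Path(directory).rglob("*"):
        if path.is_file():
            (known if sha1_of_file(path) in KNOWN_HASHES else unknown).append(path)
    return known, unknown
```

Filtering out every file whose hash matches a known reference record leaves an examiner with only the unfamiliar files, which is the basic idea behind using the NSRL to reduce the volume of data under review.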

NIST reported that the NSRL’s reference dataset doubled from half a billion hash records in August 2019 to more than a billion in March 2022.

The news release notes why the dataset is important to digital forensic labs: “This growth makes the NSRL a vitally important tool for digital forensics labs, which specialize in this sort of file review. Such work has become a crucial part of investigations: There are about 11,000 digital forensics labs in the United States (compared with about 400 crime labs).”

The previous database version dates back 20 years, and while searching was possible, it was cumbersome. The new NSRL update will make it easier for users to create custom filters to sort through files and find what they need for a particular investigation.

The dataset and more information on the update are available at https://www.nist.gov/itl/ssd/software-quality-group/national-software-reference-library-nsrl.

The Center for Statistics and Applications in Forensic Evidence (CSAFE), a NIST Center of Excellence, conducts research addressing the need for forensic tools and methods for digital evidence. Learn more about this research at https://forensicstats.org/digital-evidence/.

NIST Seeks Public Comment on Draft Report of Digital Forensic Methods

Working on a Laptop

The National Institute of Standards and Technology (NIST) has published a draft of Digital Investigation Techniques: A NIST Scientific Foundation Review. The draft report will be open for public comments through July 11, 2022.

The report reviews the methods that digital forensic experts use to analyze evidence from computers, mobile phones and other electronic devices.

According to a news release from NIST, the authors of the report examined peer-reviewed literature, documentation from software developers, test results on forensic tools, standards and best practices documents, and other sources of information.

The news release also stated that the report discusses several challenges that digital forensic experts face, including the rapid pace of technological change, and recommends better methods for information-sharing among experts and a more structured approach to testing forensic tools.

NIST will host a webinar to discuss the draft report and its findings on June 1 from 1 to 3 p.m. EDT. For more information about the webinar and to register, visit www.nist.gov/news-events/events/2022/06/webinar-digital-investigation-techniques-nist-scientific-foundation.

Read the full news release on the report at www.nist.gov/news-events/news/2022/05/nist-publishes-review-digital-forensic-methods.

The Center for Statistics and Applications in Forensic Evidence (CSAFE), a NIST Center of Excellence, conducts research addressing the need for forensic tools and methods for digital evidence. Learn more about this research at forensicstats.org/digital-evidence.

NIST Releases Results from a Black Box Study for Digital Forensic Examiners

NIST Black Box Study for Digital Forensic Examiners

The National Institute of Standards and Technology (NIST) has published the results from a black box study for digital forensic examiners. The report, released in February 2022, describes the study’s methodology and summarizes the results.

The study was conducted online and was open to anyone in the public or private sectors working in the digital forensics field. Participants examined and reported on simulated digital evidence from casework-like scenarios. NIST said the study’s goal was to assess the performance of the digital forensic community as a whole.

Results from a Black-Box Study for Digital Forensic Examiners (NISTIR 8412) can be viewed at https://nvlpubs.nist.gov/nistpubs/ir/2022/NIST.IR.8412.pdf.

From Results from a Black-Box Study for Digital Forensic Examiners, page 33:

Summary Key Takeaways

Despite the limitations of the study, two key takeaways about the state of the digital evidence discipline emerged:

  • Digital forensics examiners showed that they can answer difficult questions related to the analysis of mobile phones and personal computers. Questions ranged from basic, such as identifying who the user of the phone had contacted, to advanced questions that related to the use of the TOR browser.
  • The response to the study underscored the size, variety, and complexity of the field. The study received responses from examiners working in international, federal, state, local government, and private labs whose major work included law enforcement, defense, intelligence, and incident response/computer security. There were also responses from people outside of these areas.


Algorithmic Evidence in Criminal Trials

Computer software source code on screen

Guest Blog

Kori Khan
Assistant Professor
Department of Statistics, Iowa State University

We are currently in an era where machine learning and algorithms offer novel approaches to solving problems both new and old. Algorithmic approaches are swiftly being adopted for a range of issues: from making hiring decisions for private companies to sentencing criminal defendants. At the same time, researchers and legislators are struggling with how to evaluate and regulate such approaches.

The regulation of algorithmic output becomes simultaneously more complex and pressing in the context of the American criminal justice system. U.S. courts are regularly admitting evidence generated from algorithms in criminal cases. This is perhaps unsurprising given the permissive standards for admission of evidence in American criminal trials. Once admitted, however, the algorithms used to generate the evidence—which are often proprietary or designed for litigation—present a unique challenge. Attorneys and judges face questions about how to evaluate algorithmic output when a person’s liberty hangs in the balance. Devising answers to these questions inevitably involves delving into an increasingly contentious issue—access to the source code.

In criminal courts across the country, it appears most criminal defendants have been denied access to the source code of algorithms used to produce evidence against them. I write, “it appears,” because here, as in most areas of the law, empirical research into legal trends is limited to case studies or observations about cases that have drawn media attention. For these cases, the reasons for denying a criminal defendant access to the source code have not been consistent. Some decisions have pointed out that the prosecution does not own the source code, and therefore is not required to produce it. Others implicitly acknowledge that the prosecution could be required to produce the source code and instead find that the defendant has not shown a need for access to it. It is worth emphasizing that these decisions have not found that the defendant does not need access to source code, but rather that the defendant has failed to sufficiently establish that need. The underlying message in many of these decisions, whether implicit or explicit, is that there will be cases, perhaps quite similar to the case being considered, where a defendant will require access to source code to mount an effective defense. The question of how to handle access to the code in such cases does not have a clear answer.

Legal scholars are scrambling to provide guidance. Loosely speaking, proposals can be categorized into two groups: those that rely on existing legal frameworks and those that suggest a new framework might be necessary. For the former category, the heart of the issue is the tension between the intellectual property rights of the algorithm’s producer and the defendant’s constitutional rights. On the one hand, the producers of algorithms often have a commercial interest in ensuring that competitors do not have access to the source code. On the other hand, criminal defendants have the right to question the weight of the evidence presented in court.

There is a range of opinions on how to balance these competing interests, running along a spectrum from always allowing defendants access to source code to rarely allowing them access. However, most fall somewhere in the middle. Some have suggested “front-end” measures in which lawmakers establish protocols to ensure the accuracy of algorithmic output before its use in criminal courts. These measures might include escrowing the source code, similar to how some states have handled voting technology. Within the courtroom, suggestions for protecting the producers of code include traditional measures, such as the protective orders commonly used in trade secret suits. Other scholars have proposed that a defendant might not always need access to source code. For example, some suggest that if the producer of the algorithm is willing to run tests constructed by the defense team, this may be sufficient in many cases. Most of these suggestions make two key assumptions: 1) either legislators or defense attorneys should be able to devise standards to identify the cases for which access to source code is necessary to evaluate an algorithm, and 2) legislators or defense attorneys can devise these standards without access to the source code themselves.

These assumptions require legislators and defense attorneys to answer questions that the scientific community itself cannot answer. Outside of the legal setting, researchers are faced with a similar problem: how can we evaluate scientific findings that rely on computational research? For the scientific community, the answer for the moment is that we are not sure. There is evidence that the traditional methods of peer review are inadequate. In response, academic journals and institutes have begun to require that researchers share their source code and any relevant data. This is increasingly viewed as a minimal standard to begin to evaluate computational research, including algorithmic approaches. However, just as within the legal community, the scientific community has no clear answers for how to handle privacy or proprietary interests in the evaluation process.

In the past, forensic science methods used in criminal trials have largely been developed and evaluated outside the purview of the larger scientific community, often on a case-by-case basis. As both the legal and scientific communities face the challenge of regulating algorithms, there is an opportunity to expand existing interdisciplinary forums and create new ones.

Learn about source code in criminal trials by attending the Source Code on Trial Symposium on March 12 from 2:30 to 4 p.m. Register at https://forensicstats.org/source-code-on-trial-symposium/.


Publications and Websites Used in This Blog:

How AI Can Remove Bias From The Hiring Process And Promote Diversity And Inclusion

Equivant, Northpoint Suite Risk Need Assessments

The Case for Open Computer Programs

Using AI to Make Hiring Decisions? Prepare for EEOC Scrutiny

Source Code, Wikipedia

The People of the State of New York Against Donsha Carter, Defendant

Commonwealth of Pennsylvania Versus Jake Knight, Appellant

The New Forensics: Criminal Justice, False Certainty, and the Second Generation of Scientific Evidence

Convicted by Code

Machine Testimony

Elections Code, California Legislative Information

Trade Secret Policy, United States Patent and Trademark Office

Computer Source Code: A Source of the Growing Controversy Over the Reliability of Automated Forensic Techniques

Artificial Intelligence Faces Reproducibility Crisis

Author Guidelines, Journal of the American Statistical Association

Reproducible Research in Computational Science

Closed Source Forensic Software: Confronting the Evidence?

There is a persistent underlying flaw in the criminal justice system, stemming from unvalidated forensic science cloaked in intellectual property. Not only does this pose a risk when forensic evidence is a key factor in criminal convictions, but it also reveals how confidential forensic technology could violate defendants’ constitutional rights. 

Forensic analysis software, used to generate evidence in criminal trial proceedings, frequently contains closed source code. Such proprietary software prevents the scientific community, the public, juries, attorneys, and defendants from accessing the fundamental methods — or potential errors therein — that can ultimately influence verdicts. This creates a pathway for individuals to be wrongly convicted as a result of jurors being swayed by flawed evidence disguised as good science. 

An excellent example is the case of United States v. Ellis, in which DNA was the key evidence against a defendant accused of illegal firearm possession. The police forensic lab found the DNA analysis inconclusive, prompting further analysis with third-party proprietary software. Although multiple hypotheses and test variations were run on the sample, the prosecution relied on the result of one particular analysis, which assumed the defendant was one of four possible contributors to the DNA sample.

When Mr. Ellis’ attorney requested access to the source code, “…the government refused to disclose it, arguing that the information is protected by trade secrets.” 

In response, the Electronic Frontier Foundation (EFF) and the American Civil Liberties Union of Pennsylvania filed an amicus brief with the United States District Court for the Western District of Pennsylvania, outlining the conflict between closed source code, the defendant’s Sixth Amendment rights, and the public’s right to oversee the criminal trial.

“Source code, and other aspects of forensic software programs used in a criminal prosecution, must be disclosed in order to ensure that innocent people do not end up behind bars,” said the EFF. “Or worse — on death row.”

While it is understandable that developers of forensic software wish to protect their intellectual property, that protection raises a fundamental question: should IP be protected at the expense of civil rights? To protect the innocent, maintain public oversight, and ensure the advancement of forensic science practices, the curtain must be pulled back on protected methodologies. Arguably, doing so would lead to fairer trials and greater trust in the scientific tools used within the criminal justice system.

Learn more about CSAFE’s commitment to open source tools on our website.

Insights: Statistical Methods for the Forensic Analysis of Geolocated Event Data

INSIGHT

Statistical Methods for the Forensic Analysis of Geolocated Event Data

OVERVIEW

Researchers investigated the application of statistical methods to forensic questions involving spatial event-based digital data. A motivating example involves assessing whether or not two sets of GPS locations corresponding to digital events were generated by the same source. The team established two approaches to quantify the strength of evidence concerning this question.

Lead Researchers

Christopher Galbraith
Padhraic Smyth
Hal S. Stern

Journal

Forensic Science International: Digital Investigation

Publication Date

July 2020

Publication Number

IN 108 DIG

The Goal

Develop quantitative techniques for the forensic analysis of geolocated event data.

APPROACH AND METHODOLOGY

Researchers collected geolocation data from Twitter messages over two spatial regions, Orange County, CA and the borough of Manhattan in New York City, from May 2015 to February 2016. Selecting only tweets from public accounts, they were able to gather GPS data regarding the frequency of geolocated events in each area.

Key Definitions

Likelihood Ratio (LR)

A comparison of the probability of observing a set of evidence measures under two different theories in order to assess relative support for the theories.

Score-Based Likelihood Ratio (SLR)

An approach that summarizes evidence measures by a score function before applying the likelihood ratio approach.
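In symbols, using a standard formulation of these two definitions (the notation below is ours, not quoted from the paper): with e denoting the observed evidence, H_s and H_d the same-source and different-source propositions, and S a score function,

```latex
% Likelihood ratio: compare the probability of the evidence e under
% the same-source and different-source hypotheses.
% Score-based variant: first reduce e to a scalar score s = S(e),
% then compare the densities of that score.
\[
\mathrm{LR}(e) = \frac{p(e \mid H_s)}{p(e \mid H_d)},
\qquad
\mathrm{SLR}(s) = \frac{g(s \mid H_s)}{g(s \mid H_d)},
\quad s = S(e).
\]
```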

The study considered a scenario in which the source of two sets of tweet locations is in question: the tweets could be from different devices or from the same device during two different time periods.

The team used kernel density estimation to construct a likelihood ratio comparing the probability of observing the tweet locations under two competing hypotheses: that the tweets come from the same source or from different sources.

A second approach summarizes the similarity of the two sets of locations with a score function before applying the likelihood ratio, yielding a score-based likelihood ratio that also assesses the strength of the evidence.

Decisions based on both LR and SLR approaches were compared to known ground truth to determine true and false-positive rates.
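The Python sketch below shows the general shape of both computations using SciPy’s gaussian_kde. It is a simplified illustration under our own assumptions: the centroid-distance score and the use of a background KDE as the different-source model are illustrative stand-ins, not the exact procedure from the paper.

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_log_likelihood(train: np.ndarray, test: np.ndarray) -> float:
    """Sum of log densities of `test` points under a KDE fit to `train`.
    Both arrays have shape (n_points, 2) for (longitude, latitude)."""
    kde = gaussian_kde(train.T)  # gaussian_kde expects shape (dims, n)
    return float(np.sum(np.log(kde(test.T))))

def likelihood_ratio(set_a, set_b, background):
    """LR: likelihood of set_b under a KDE fit to set_a (same source)
    versus a KDE fit to background locations (different source)."""
    log_same = kde_log_likelihood(set_a, set_b)
    log_diff = kde_log_likelihood(background, set_b)
    return np.exp(log_same - log_diff)

def similarity_score(set_a, set_b) -> float:
    """Illustrative score: negative distance between the two sets'
    centroids, so higher values mean more similar."""
    return -float(np.linalg.norm(set_a.mean(axis=0) - set_b.mean(axis=0)))

def score_based_lr(score, same_scores, diff_scores) -> float:
    """SLR: density of the observed score under KDEs fit to scores from
    known same-source and known different-source pairs."""
    g_same = gaussian_kde(np.asarray(same_scores))
    g_diff = gaussian_kde(np.asarray(diff_scores))
    return float(g_same(score)[0] / g_diff(score)[0])
```

Thresholding the resulting LR or SLR values and checking the decisions against known ground truth is what produces the true- and false-positive rates mentioned above.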

KEY TAKEAWAYS FOR PRACTITIONERS

1. Both methods show promise in being able to distinguish same-source pairs of spatial event data from different-source pairs.

2. The LR approach outperformed the SLR approach for all dataset sizes considered.

3. The behavior of both approaches can be impacted by the characteristics of the observed region and amount of evidential data available.

FOCUS ON THE FUTURE

In this study, the sets of locations gathered from Twitter were defined by time period. Other ways of defining the sets of locations, such as including multiple devices over the same time period, could yield different results.

The amount of available data (the number of tweets) affects the performance of the score-based approach.

NIST Seeks Digital Forensics Experts to Participate in Vital ‘Blackbox’ Study

Objectivity and accuracy are the pinnacle of forensic science. Yet everyone can agree that humans make errors. To what degree is that true of digital forensic evidence gathering and analysis?

The National Institute of Standards and Technology (NIST) is launching the first “blackbox” research study to quantify the accuracy of computer and mobile phone forensics and answer this question.

Digital evidence adds a layer of potential human error, especially given rapidly evolving technologies and situations in which key evidence must be identified and extracted from large volumes of digital data. It is for these reasons that CSAFE, too, is working on EVIHUNTER, a mobile app analysis tool.

On a broader scale, this NIST study answers a call from the 2009 National Academy of Sciences report Strengthening Forensic Science in the United States: A Path Forward, which calls for blackbox studies to measure the reliability of forensic methods that involve human judgment.

Digital evidence, though grounded in technology, certainly relies on the human element. By participating in the NIST study, digital forensic practitioners can help strengthen the future of forensic science by providing a foundation of quantitative probability that courts and jurors can use to weigh the validity of presented digital evidence and analysis, as well as informing future studies needed in this realm. Digital forensic experts can answer a question paramount to fulfilling their own goals and missions: Are our industry’s methods accurate and reliable?

The Study Details

Blackbox studies are unique in their anonymity. They assess the reliability and accuracy (right or wrong) of human judgment methods only, without concern for how experts reached their answers. The study therefore will not judge individuals on their performance but rather aims to measure the performance of the digital forensics community as a whole.

The study will be conducted online. Enrollment is now open, and the test will be available for approximately three months.

Digital forensic experts who volunteer for the study will be provided a download of simulated evidence from the NIST website, in the form of one virtual mobile phone and one virtual computer. In roughly a two-hour time commitment, participants will be asked to examine simulated digital evidence and answer a series of questions similar to those that would be expected in a real criminal investigation. Participants will use forensic software tools of their choosing to analyze the forensic images.

Who Can Participate

All public and private sector digital examiners who conduct hard drive or mobile phone examinations as part of their official duties are encouraged to volunteer and participate in this study.

No individual’s or laboratory’s performance will be calculated. Rather, NIST will publish anonymized, comprehensive results on the overall performance of the digital forensic expert community and of different sectors within that community.

To learn more or to enroll in this vital study advancing digital forensics, visit NIST Blackbox Study for Digital Examiners and follow the simple steps to get started.


Laboratories Learn About Accuracy of Forensic Software Tools through NIST Study

hand holding mobile phone showing multiple apps

In a new article, NIST researcher Jenise Reyes-Rodriguez shares an inside look at her work testing mobile forensic software tools. She and her team explore the validity of different methods for extracting data from mobile devices, even from damaged phones. The researchers subject a wide array of digital forensic tools to rigorous, systematic evaluation, determining how accurately each tool retrieves crucial information from a device.

She explains that, unlike what you might see on television, forensic labs often work with limited budgets and may not have access to multiple tools. They typically need to work with what they already have or can afford. Reyes-Rodriguez and her research team test these mobile tools on the most popular devices on the market and create reports for labs listing any anomalies, such as incomplete text messages or contact names. These reports help labs know whether the tool they have is appropriate for their case, and they provide a guide to alternative options or an ideal tool to buy.

This research was funded by NIST and the Department of Homeland Security’s Cyber Forensics Project. Read the full story on the NIST website, and learn more about the research group’s work on the CFTT website, where you can also access NIST’s testing methodology and forensic tool testing reports.

An Inside Look into the Role of a Digital Evidence Expert

Men looking at computers

In a new weekly series launched by TODAY, reporters investigate the future of work in today’s ever-changing society. Technological advancements are fueling the creation of new jobs that may not have even existed a few years ago, but are set to proliferate within the next decade.

The fourth installment in this series gives readers an inside look into the role of one type of digital forensic specialist. With digital crimes increasing year over year and the majority of information worldwide moving to digital formats, the article highlights an expected increase in demand for this specialty.

Digital forensics experts contribute to criminal investigations in multiple ways. The TODAY article highlights one example, experts who are responsible for catching criminals behind cyber-attacks and security breaches. Authorities rely on these specialists to identify and preserve digital evidence, and piece together the events surrounding a crime.

In the article, experienced digital forensic experts highlight real cases, emphasizing that no two crimes are alike. The experts stress the importance of approaching each case with unique and creative problem solving.

“The role calls for very specific skills, including understanding an attacker’s perspective, deep technical skills especially in how systems work with one another, and [a] sharp and analytical mind,” says one digital forensics specialist.

Dig deeper into the responsibilities of a digital forensic expert focused on cybersecurity in the TODAY article, and learn how CSAFE researchers are investigating other applications of digital forensics, such as steganography and user-generated event data, on our website.