The plethora of mobile apps introduces critical challenges for digital forensics practitioners because of the diversity and sheer number (millions) of apps available for download from Google Play, the Apple App Store, and hundreds of other online app stores. Law enforcement investigators often find that a seized mobile phone holds many popular and less popular apps with interfaces in different languages and with different functionalities. Investigators cannot have sufficient expert knowledge of every app, and sometimes lack even a basic understanding of what evidentiary data could be discovered on the devices under investigation. The existing digital forensics literature shows that most such investigations still rely on the investigator’s manual analysis using mobile forensic toolkits such as Cellebrite and EnCase. The problem with such manual approaches is that they offer no guarantee that evidence discovery is complete. Our goal is to develop an automated mobile app analysis tool that analyzes an app and discovers what types of forensic evidentiary data the app generates and where it stores them, whether locally on the mobile device or remotely on external third-party servers. With this tool, we will build a database of mobile apps and, for each app, create a list of app-generated evidence in terms of data types, locations (and/or sequences of locations), and data format/syntax. The outcome of this research will help digital forensic practitioners reduce the complexity of their case investigations and provide a better completeness guarantee for evidence discovery, thereby delivering timely and more complete investigative results and eventually reducing backlogs at crime labs. In this paper, we present the main technical approaches we used to implement a dynamic taint analysis tool for Android app forensics. With the tool, we have analyzed 2,100 real-world Android apps.
For each app, our tool produces a list of the evidentiary data (e.g., GPS locations, device ID, contacts, browsing history, and some user inputs) that the app could have collected and stored in the device’s local storage as files or SQLite databases. We evaluated our tool using both benchmark apps and real-world apps. Our results demonstrate the tool’s initial success in accurately discovering evidentiary data.
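To make the idea of dynamic taint analysis concrete, here is a minimal Python sketch of the general technique; it is not the authors' tool. It assumes a toy model in which sensitive values are tagged at a source, tags are propagated through string concatenation, and a record is logged whenever tagged data reaches a storage sink. All names (`Tainted`, `source`, `concat`, `EvidenceLog`) and the example paths are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tainted:
    """A value paired with the evidence types (taint tags) it derives from."""
    value: str
    tags: frozenset = frozenset()


def source(value, tag):
    """Hypothetical taint source: mark a value with one evidence type."""
    return Tainted(value, frozenset({tag}))


def concat(*parts):
    """Propagate taint: the result carries the union of all input tags."""
    value = "".join(p.value if isinstance(p, Tainted) else str(p) for p in parts)
    tags = frozenset().union(*(p.tags for p in parts if isinstance(p, Tainted)))
    return Tainted(value, tags)


@dataclass
class EvidenceLog:
    """Collects (evidence type, storage kind, location) records at sinks."""
    records: list = field(default_factory=list)

    def sink(self, kind, location, data):
        # Only data that still carries at least one tag counts as evidence.
        if isinstance(data, Tainted):
            for tag in sorted(data.tags):
                self.records.append((tag, kind, location))


log = EvidenceLog()
gps = source("42.03,-93.63", "GPS location")   # made-up sample values
device_id = source("a1b2c3", "device ID")

# Tagged values flow into one row that is written to a (hypothetical) database.
row = concat("id=", device_id, ";loc=", gps)
log.sink("sqlite", "databases/app.db:visits", row)

# Untagged data reaching a sink produces no evidence record.
log.sink("file", "files/settings.txt", Tainted("theme=dark"))

for record in log.records:
    print(record)
```

A per-app list of such records, giving the evidence type, storage kind, and location, is the sort of entry the app database described above could hold. A real tool would instrument the Android runtime rather than wrap values by hand.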
Dynamic Taint Analysis Tool for Android App Forensics

Journal: 40th IEEE Symposium on Security and Privacy
Published: 2018
Primary Author: Zhen Xu
Secondary Authors: Chen Shi, Chris Cheng, Neil Zhengqiang Gong, Yong Guan
Type: Publication
Research Area: Digital