StegoAppDB: A Forensics Image Database for Mobile Steganography
StegoAppDB, a steganography apps forensics image database, is a database of image data from mobile phones. It is the first database consisting of mobile phone photographs and stego images produced by mobile stego apps, together with a rich set of side information. StegoAppDB contains over 810,000 innocent and stego images from 24 distinct devices covering 10 different phone models, with detailed provenance data such as a wide range of ISO and exposure settings, EXIF data, stego app APKs, message information, embedding rate, and other information.
A search on the database falls into one of two main categories: searching on stego and related images, or searching on original images. Stego images are created using Android and iOS mobile stego apps. We provide cover-stego image pairs for each stego image so that the data may be used for machine learning applications of steganalysis. Original images are acquired using our own Cameraw camera app, and saved in both DNG and high-quality JPEG formats. Cameraw is available on GitHub for both iOS and Android. We retain the original devices and continue to add to the database. While the database was designed for steganography, it may also be suitable for other areas of digital image forensics.
This database is publicly available and has no copyright or privacy issues associated with it.
For more information about the data in the database, please see the FAQ.
Q: How can I download portions of the data?
A: Go to the “Search” link, and select either “Originals” or “Stegos.” Then choose from the options shown to search for images that match your criteria.
Q: What additional data is given with the images?
A: Your data file includes the images, a text file that records your search parameters, and a csv file that lists all the attributes stored in the database for each individual image in the download file. In addition, a link to a pdf file titled “SAD Instructions and Information” is included with each download and is also available on this webpage. It gives more details about the downloaded data, including the file folder structure, image types, and contents of the csv files. The csv file provides more parameters than are available to query on. EXIF data, the stego app used, and the hidden message are some of the attributes in the csv file.
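As a minimal sketch of how the csv file might be inspected after a download is unpacked, the Python below parses a small in-memory sample with the standard csv module. The column names (image_name, stego_app, iso) and the sample rows are hypothetical placeholders, not the actual headers; the real attribute list is in your csv file and in the instructions pdf.

```python
import csv
import io

# Hypothetical sample standing in for the csv file in a download.
# The real column names and values will differ; check your own file.
SAMPLE_CSV = """image_name,stego_app,iso
img_0001.png,AppA,100
img_0002.png,AppB,400
img_0003.png,AppA,800
"""


def load_rows(csv_text):
    """Parse csv text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def attribute_names(rows):
    """Return the attribute (column) names recorded for each image."""
    return list(rows[0].keys()) if rows else []


rows = load_rows(SAMPLE_CSV)
columns = attribute_names(rows)

# Filter on an attribute after download, here the (hypothetical) app name.
app_a_images = [r["image_name"] for r in rows if r["stego_app"] == "AppA"]
```

Because the csv file carries more attributes than the search form exposes, filtering the downloaded rows in this way is one option for narrowing a download further.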
Q: Which paper should I cite if I use StegoAppDB?
A: J. Newman, L. Lin, W. Chen, S. Reinders, Y. Wang, M. Wu, Y. Guan. “StegoAppDB: A steganography apps forensics image database,” IS&T Int’l. Symp. on Electronic Imaging, Media Watermarking, Security, and Forensics 2019, Burlingame, CA, 2019.
Q: What is an “original image”?
A: We define an original image to be an image acquired by the mobile phone camera. An original image has the default pixel dimensions as dictated by the camera app. An original image can be used in many ways. For example, a piece can be cropped out, and this smaller image can be used as the input/cover image for a machine learning algorithm. Alternatively, an original image can be selected by the user as input to a stego app on the phone.
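Cropping a piece out of an original image can be sketched as follows. This is an illustration only: the image is modeled as a list of rows of grayscale pixel values, and the function name and the toy 4x4 values are assumptions, not part of the database tooling.

```python
def crop(image, top, left, height, width):
    """Return a height x width sub-image whose top-left corner is (top, left).

    `image` is a list of rows, each row a list of pixel values.
    """
    if (
        top < 0
        or left < 0
        or height <= 0
        or width <= 0
        or top + height > len(image)
        or left + width > len(image[0])
    ):
        raise ValueError("crop window falls outside the image")
    return [row[left:left + width] for row in image[top:top + height]]


# Toy 4x4 "original" image with made-up pixel values.
original = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]

# A 2x2 piece taken from the middle, usable as a smaller cover image.
patch = crop(original, top=1, left=1, height=2, width=2)
```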
Q: What is a “cover image”?
A: A cover image is the image used directly for embedding; the corresponding stego image is created from it. A cover image can also be described as a zero-rate stego image, that is, a stego image in which no message has been embedded. A cover image and a stego image have the same pixel dimensions and are practically visually identical, differing only at those locations where message bits reside. Features are extracted from cover-stego image pairs and used in machine learning to train a classifier. In machine learning algorithms, a cover image is often a cropped (smaller) piece of an original image, as original-sized images can be too large for machine learning algorithms to compute with efficiently.
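The relationship between a cover image and its stego image can be illustrated with a short sketch that lists the pixel positions where the two differ. The images here are toy grayscale arrays with made-up values, and the function name is hypothetical; real images would first be decoded with an image library of your choice.

```python
def changed_positions(cover, stego):
    """Return the (row, col) positions where a cover and stego image differ.

    Both images are lists of rows of integer pixel values (grayscale).
    """
    if len(cover) != len(stego) or any(
        len(c_row) != len(s_row) for c_row, s_row in zip(cover, stego)
    ):
        raise ValueError("cover and stego must have the same pixel dimensions")
    return [
        (r, c)
        for r, (c_row, s_row) in enumerate(zip(cover, stego))
        for c, (cv, sv) in enumerate(zip(c_row, s_row))
        if cv != sv
    ]


# Toy 2x3 images; two pixels differ by one gray level.
cover = [[10, 11, 12], [13, 14, 15]]
stego = [[10, 10, 12], [13, 14, 14]]

diff = changed_positions(cover, stego)
# Fraction of pixels that were modified in this toy pair.
change_rate = len(diff) / (len(cover) * len(cover[0]))
```

The dimension check reflects the requirement that the two images in a cover-stego pair have the same pixel dimensions.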
Q: What is an “input image”?
A: We define an input image as an image whose values and pixel dimensions are known, and which is then fed to an algorithm that will hide a message in it or in a re-sized version of it. The input image in many academic steganalysis algorithms is often the cover image itself, which can be a cropped sub-image from an original image. However, in mobile stego apps, the input image is typically an original image (one taken by the phone’s camera) selected as input to the GUI app. Once selected by the user, the input (original) image is passed to the stego app’s internal code and is often downsized. Images created internally by the app cannot be accessed by the user, so the downsized image is not available for machine learning purposes, which require the two images of a cover-stego pair to be the same size.
Q: Where can I find the EXIF file for an image?
A: The EXIF file is included only with the “original image” files (see the definition of “original image” above). An original image is acquired with the digital camera of a mobile phone. EXIF files are not included with input images, cover images, or stego images.
Q: My search is not bringing up any images. Why?
A: At least one box must be checked in each section of the search form. If no boxes are checked in a section, the search will not find any images. Another possible reason is that no image satisfies all of the selected criteria.
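The behavior described in this answer can be modeled with a small sketch. It assumes that the checked boxes within one section act as alternatives and that an image must match every section; this is an interpretation of the answer above, not the site's actual code, and the section names and values are hypothetical.

```python
def matches(image, selections):
    """True if the image's value in every section is among the checked boxes.

    `image` maps section name -> value; `selections` maps section name ->
    set of checked values for that section.
    """
    return all(image[section] in checked for section, checked in selections.items())


def search(images, selections):
    """Return the images that satisfy every section of the search."""
    return [img for img in images if matches(img, selections)]


# Hypothetical records and section names, for illustration only.
images = [
    {"name": "a", "phone": "PhoneA", "iso": 100},
    {"name": "b", "phone": "PhoneB", "iso": 100},
    {"name": "c", "phone": "PhoneA", "iso": 800},
]

# One box checked for phone, two for ISO: only image "a" matches both.
found = search(images, {"phone": {"PhoneA"}, "iso": {100, 200}})

# No boxes checked in the ISO section: nothing can match, so no results.
none_found = search(images, {"phone": {"PhoneA"}, "iso": set()})
```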
Q: My search finds too many images and the download file size is too large. What do I do?
A: Narrow your selection. You may select fewer phones, fewer apps (for stegos), or a narrower range of ISO and/or exposure time settings.
Q: How many images are in the database?
A: Over 810,000!
Q: How are the images interrelated, such as which cover image is cropped from which original image, what stego images are created, etc.?
A: The entity relationship (ER) diagram for the database describes how the data is related. See here for the paper describing the database, and see here for the ER diagram itself. The ER diagram gives a general overview of the relationships between images in the database, and also gives the attributes associated with each image. For a specific stego image, the name of the cover image used to create it is given in the csv file downloaded with the stego images. The document titled “PDF file of Download Information and Instructions” also has information that may help you understand the relationships between images in your download.
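Since the csv file names the cover image for each stego image, the pairs can be rebuilt with a simple lookup. The sketch below assumes two hypothetical column names, stego_image and cover_image, and uses made-up file names; substitute the actual headers from your csv file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of a stego download's csv; real headers will differ.
SAMPLE_CSV = """stego_image,cover_image
s1.png,c1.png
s2.png,c1.png
s3.png,c2.png
"""


def cover_for_each_stego(csv_text):
    """Map each stego image name to the cover image it was created from."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["stego_image"]: row["cover_image"] for row in reader}


def stegos_by_cover(pairs):
    """Group stego image names under the cover image they share."""
    groups = defaultdict(list)
    for stego_name, cover_name in pairs.items():
        groups[cover_name].append(stego_name)
    return dict(groups)


pairs = cover_for_each_stego(SAMPLE_CSV)
groups = stegos_by_cover(pairs)
```

Grouping by cover image is convenient when several stego images were generated from the same cover.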
Download Information and Instructions
Information and Instructions are available here.
Please check back for updates.