Twin Convolutional Neural Networks to Classify Writers Using Handwriting Data

Conference/Workshop:
American Academy of Forensic Sciences (AAFS)
Published: 2023
Primary Author: Pilhyun (Andrew) Lim
Secondary Authors: Danica Ommen
Type: Poster

Identifying the source of handwriting is an important application in forensic science that addresses questioned document evidence found in criminal cases and civil litigation. Given the idiosyncrasies of a person’s handwriting, it is difficult to recognize the exact writer of a piece of handwriting based only on its physical properties. It is even harder to classify a writer when no prior database contains that writer's handwriting characteristics. Data sets containing handwriting samples from different sources are used to investigate how well a convolutional neural network can classify writers from unseen sources. Scenarios modeled after real-world situations are compared; their complexity varies according to whether, and from which source, samples from the suspects have been collected to train the model. These comparisons examine the extent to which twin convolutional neural networks can successfully distinguish the same writer from different writers. This presentation primarily aims to compare data processing and modeling methods to improve classification of whether two pieces of handwriting come from the same writer or from different writers when no potential writer has been seen before. The structure of a twin convolutional neural network allows such comparisons between two images by passing them through identical convolutional neural networks and defining a metric that merges their output feature vectors to obtain a similarity score. Because the models in this presentation are limited by memory and available data, various pre-processing and sampling methods are compared to maximize classification accuracy.
On this optimized data set, a custom model developed for this analysis is shown to outperform several architectures that are top performers on image classification problems, with a classification accuracy of 85.5 percent on a test set structured like the training set and 82.8 percent on a data set collected from a different database. Results show that, as long as a large enough number of samples is available to train the model, comparisons between the writers of questioned documents can be classified with over 80 percent accuracy.
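
The twin structure described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the custom model from the poster: the abstract does not specify the layers or the merging metric, so the sketch assumes a single convolution with ReLU and global average pooling per branch, an absolute-difference merge, and a sigmoid output. The names `conv2d_valid`, `embed`, and `similarity`, the two hand-picked kernels, and the weights are all hypothetical and untrained; a practical model would be built and trained in a deep learning framework.

```python
import math


def conv2d_valid(image, kernel):
    """Valid 2D cross-correlation of a 2D list `image` with a 2D list `kernel`."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def embed(image, kernels):
    """One branch of the twin network: conv -> ReLU -> global average pool.

    Returns one feature per kernel. Both images pass through this same
    function with the same kernels, which is what makes the branches identical.
    """
    features = []
    for kernel in kernels:
        fmap = conv2d_valid(image, kernel)
        activated = [max(0.0, v) for row in fmap for v in row]
        features.append(sum(activated) / len(activated))
    return features


def similarity(image_a, image_b, kernels, weights, bias):
    """Merge the two feature vectors into a similarity score in (0, 1).

    Assumed merge metric: element-wise absolute difference followed by a
    weighted sum and a sigmoid. Other metrics are possible.
    """
    f_a = embed(image_a, kernels)
    f_b = embed(image_b, kernels)
    distance = [abs(x - y) for x, y in zip(f_a, f_b)]
    logit = bias + sum(w * d for w, d in zip(weights, distance))
    return 1.0 / (1.0 + math.exp(-logit))


# Illustrative, untrained parameters (assumptions, not values from the poster).
kernels = [
    [[1.0, -1.0], [1.0, -1.0]],   # responds to vertical strokes
    [[1.0, 1.0], [-1.0, -1.0]],   # responds to horizontal strokes
]
weights = [-2.0, -2.0]  # negative: a larger feature distance lowers the score
bias = 2.0

# Two toy 4x4 "handwriting" patches: a vertical and a horizontal stroke.
vertical = [[0.0, 1.0, 0.0, 0.0] for _ in range(4)]
horizontal = [[0.0] * 4, [1.0] * 4, [0.0] * 4, [0.0] * 4]

same_score = similarity(vertical, vertical, kernels, weights, bias)
diff_score = similarity(vertical, horizontal, kernels, weights, bias)
print(f"same patch: {same_score:.3f}, different patches: {diff_score:.3f}")
```

Because both images go through the same `embed` function, the score is symmetric in its two inputs, and a pair of identical images always has zero feature distance. In a trained model the kernels, weights, and bias would be learned from labeled same-writer and different-writer pairs.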

Related Resources

Score-based Likelihood Ratios Using Stylometric Text Embeddings

We consider the problem setting in which we have two sets of texts in digital form and would like to quantify our beliefs that the two sets of texts were…
Statistics and its Applications in Forensic Science and the Criminal Justice System

This presentation is from the 2024 Joint Statistical Meetings (JSM), Portland, Oregon, August 3-8, 2024.
Algorithmic matching of striated tool marks

Automatic matching algorithms for assessing the similarity between striation marks have been investigated for bullet lands and some tool marks, such as screwdrivers. We are interested in the investigation of…
Silencing the Defense Expert

In the wake of the 2009 NRC and 2016 PCAST Reports, the Firearms and Toolmark (FATM) discipline has come under increasing scrutiny. Validation studies like AMES I, Keisler, AMES II,…