A clustering method for graphical handwriting components and statistical writership analysis

Journal: Statistical Analysis and Data Mining: The ASA Data Science Journal
Published: 2020
Primary Author: Amy M. Crawford
Secondary Authors: Nicholas S. Berry, Alicia L. Carriquiry
Research Area: Handwriting

Handwritten documents can be characterized by their content or by the shape of the written characters. We focus on the problem of comparing a person’s handwriting to a document of unknown provenance using the shape of the writing, as is done in forensic applications. To do so, we first propose a method for processing scanned handwritten documents to decompose the writing into small graphical structures, often corresponding to letters. We then introduce a measure of distance between two such structures that is inspired by the graph edit distance, and a measure of center for a collection of such graphs. These measurements are the basis for an outlier-tolerant K-means algorithm that clusters the graphs based on structural attributes, thus creating a template for sorting new documents. Finally, we present a Bayesian hierarchical model to capture the propensity of a writer for producing graphs that are assigned to certain clusters. We illustrate the methods using documents from the Computer Vision Lab dataset. We show results of the identification task under these cluster assignments and compare them with results from the same model paired with a less flexible grouping method that does not tolerate incidental strokes or outliers.
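
To make the clustering idea concrete, here is a minimal Python sketch of a K-means variant that sets outliers aside. It is not the authors' algorithm: it assumes plain numeric points, uses Euclidean distance and a coordinate-wise mean as stand-ins for the paper's graph distance and measure of center, and treats any point farther than a hypothetical `max_dist` threshold from every center as an outlier. The function and parameter names (`outlier_tolerant_kmeans`, `max_dist`, `init_centers`) are illustrative only.

```python
import math
import random

OUTLIER = -1  # hypothetical label for points assigned to no cluster


def distance(a, b):
    # Stand-in distance (Euclidean); the paper uses a graph-based distance.
    return math.dist(a, b)


def center(points):
    # Stand-in measure of center: the coordinate-wise mean.
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def outlier_tolerant_kmeans(points, k, max_dist, init_centers=None,
                            n_iter=50, seed=0):
    """Cluster points into k groups, leaving far-away points unassigned.

    Assumption: a point is an outlier when its nearest center is more
    than max_dist away; outliers do not influence the center updates.
    """
    rng = random.Random(seed)
    centers = list(init_centers) if init_centers else rng.sample(points, k)
    labels = [OUTLIER] * len(points)

    for _ in range(n_iter):
        # Assignment step: nearest center, or OUTLIER if it is too far.
        new_labels = []
        for p in points:
            dists = [distance(p, c) for c in centers]
            j = min(range(k), key=dists.__getitem__)
            new_labels.append(j if dists[j] <= max_dist else OUTLIER)

        # Update step: recompute each center from its non-outlier members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centers[j] = center(members)

        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels

    return labels, centers
```

Called on two tight groups of points plus one distant point, with one starting center placed in each group, the sketch assigns the two groups to clusters 0 and 1 and labels the distant point `OUTLIER`, so that point never pulls a center toward it.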

Related Resources

Comparing handwriter and FLASH ID®, Two Handwriting Analysis Programs

FLASH ID and handwriter are computer programs that compare questioned handwritten documents against handwritten samples from known writers. FLASH ID was developed by Sciometrics and is used by the FBI,…
Error Rate Methods for Forensic Handwriting Identification

This presentation is from the 106th International Association for Identification (IAI) Annual Educational Conference.
Twin Convolutional Neural Networks to Classify Writers Using Handwriting Data

Primary goals are to examine: 1. Write diversification versus representation. 2. Preservation of handwriting structure versus image density. 3. Input size versus training size. 4. Writer identification complexity assessment using…
Center for Statistics and Application in Forensic Evidence Update

The information below highlights a sample of current research initiatives led by the CSAFE team. Additional accomplishments in other forensic science disciplines will be discussed in subsequent issues of Forensic…