Identifying the source of handwriting is an important application of forensic science that addresses questioned document evidence in criminal cases and civil litigation. Given the idiosyncrasies of a person’s handwriting, it is difficult to identify the exact writer of a piece of handwriting from its physical properties alone. It is harder still to classify a writer when no prior database contains that writer’s handwriting characteristics. Data sets containing handwriting samples from different sources are used to investigate how well a convolutional neural network can classify writers from unseen sources. Scenarios modeled after real-world situations are compared to examine the extent to which twin convolutional neural networks can classify pairs of samples as coming from the same writer or from different writers; the scenarios vary in complexity according to whether, and from which source, samples from the suspects were collected to train the model. The primary aim of this presentation is to compare data processing and modeling methods that improve classification of whether two pieces of handwriting come from the same writer or from different writers, in a setting where none of the potential writers has been seen before. A twin convolutional neural network makes such comparisons by passing two images through identical convolutional neural networks and applying a metric that merges their output feature vectors into a similarity score. Because the models in this presentation are limited by memory and available data, various pre-processing and sampling methods are compared to maximize classification accuracy. 
On this optimized data set, a custom model developed for this analysis outperforms several top-performing image classification architectures, with a classification accuracy of 85.5 percent on a test set structured like the training set and 82.8 percent on a data set collected from a different database. The results show that, as long as a large enough number of samples is available to train the model, pairs of questioned documents can be classified as written by the same writer or by different writers with over 80 percent accuracy.
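
The twin structure described above can be illustrated with a minimal sketch. This is not the custom model from the presentation: the single convolutional layer, the absolute-difference merge, the random untrained weights, and all function names (`conv2d_valid`, `embed`, `similarity`) are assumptions chosen only to show how two images share one set of weights and how their feature vectors are merged into a score between 0 and 1.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation of a grayscale image with one kernel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def embed(image, kernels):
    """Shared branch: convolution, ReLU, then global average pooling per kernel."""
    return np.array([np.maximum(conv2d_valid(image, k), 0.0).mean() for k in kernels])

def similarity(image_a, image_b, kernels, weights, bias):
    """Pass both images through the SAME branch and merge the feature vectors."""
    feat_a = embed(image_a, kernels)
    feat_b = embed(image_b, kernels)
    distance = np.abs(feat_a - feat_b)      # element-wise merge (assumed metric)
    logit = distance @ weights + bias       # linear layer on the merged vector
    return 1.0 / (1.0 + np.exp(-logit))     # sigmoid maps the logit into (0, 1)

# Hypothetical, untrained parameters and random stand-ins for handwriting images.
rng = np.random.default_rng(0)
kernels = rng.normal(size=(4, 3, 3))        # four 3x3 filters shared by both branches
weights = rng.normal(size=4)
bias = 0.0
img_a = rng.random((16, 16))
img_b = rng.random((16, 16))

score_ab = similarity(img_a, img_b, kernels, weights, bias)
score_ba = similarity(img_b, img_a, kernels, weights, bias)
score_aa = similarity(img_a, img_a, kernels, weights, bias)
```

Because both images pass through the same kernels and the merge is an absolute difference, the score does not depend on the order of the two images. With untrained weights the scores carry no meaning; in practice the shared branch and the final layer would be trained on labeled same-writer and different-writer pairs.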