Adversarial Matching of Dark Net Market Vendor Accounts

Conference/Workshop:
25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
Published: 2019
Primary Author: Xiao Hui Tai
Secondary Authors: Kyle Soska, N. Christin
Research Area: Digital

Many datasets feature seemingly disparate entries that actually refer to the same entity. Reconciling these entries, or “matching,” is challenging, especially in situations where there are errors in the data. In certain contexts, the situation is even more complicated: an active adversary may have a vested interest in having the matching process fail. By leveraging eight years of data, we investigate one such adversarial context: matching different online anonymous marketplace vendor handles to unique sellers. Using a combination of random forest classifiers and hierarchical clustering on a set of features that would be hard for an adversary to forge or mimic, we manage to obtain reasonable performance (over 75% precision and recall on labels generated using heuristics), despite generally lacking any ground truth for training. Our algorithm performs particularly well for the top 30% of accounts by sales volume, and hints that 22,163 accounts with at least one confirmed sale map to 15,652 distinct sellers—of which 12,155 operate only one account, and the remainder between 2 and 11 different accounts. Case study analysis further confirms that our algorithm manages to identify non-trivial matches, as well as impersonation attempts.
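The abstract's second stage, grouping vendor handles into sellers once pairwise match scores exist, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the pairwise probabilities have already been produced by some classifier (the paper uses random forests), and the average-linkage rule, the 0.5 cut-off, the function name `cluster_accounts`, and the account handles are all hypothetical choices made for illustration.

```python
from itertools import combinations


def cluster_accounts(accounts, match_prob, threshold=0.5):
    """Greedy average-linkage agglomerative clustering (illustrative only).

    accounts:   list of account handles (hypothetical names)
    match_prob: dict mapping frozenset({a, b}) to the probability that the
                two handles belong to one seller; assumed to come from a
                separately trained pairwise classifier
    threshold:  assumed cut-off; merging stops once the best average
                score between any two clusters falls below it
    """
    clusters = [{a} for a in accounts]

    def linkage(c1, c2):
        # Average pairwise score between two clusters; unscored pairs count as 0.
        scores = [match_prob.get(frozenset((a, b)), 0.0) for a in c1 for b in c2]
        return sum(scores) / len(scores)

    while len(clusters) > 1:
        # Find the pair of clusters with the highest average score.
        (i, j), best = max(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[i] |= clusters[j]  # merge cluster j into cluster i (i < j)
        del clusters[j]
    return clusters


# Made-up scores for four made-up handles.
accounts = ["alpha", "alpha_2", "bravo", "charlie"]
match_prob = {
    frozenset(("alpha", "alpha_2")): 0.92,
    frozenset(("bravo", "charlie")): 0.10,
    frozenset(("alpha", "bravo")): 0.05,
}
sellers = cluster_accounts(accounts, match_prob)
```

With these made-up scores, only "alpha" and "alpha_2" are merged, so four handles map to three inferred sellers. The choice of cut-off trades precision against recall, which is the kind of trade-off the figures quoted in the abstract reflect.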

Related Resources

Statistical Methods for the Forensic Analysis of User-Event Data

A common question in forensic analysis is whether two observed data sets originate from the same source or from different sources. Statistical approaches to addressing this question have been widely…
Hunting wild stego images, a domain adaptation problem in digital image forensics

Digital image forensics is a field encompassing camera identification, forgery detection and steganalysis. Statistical modeling and machine learning have been successfully applied in the academic community of this maturing field.…
Statistical Methods for the Forensic Analysis of Geolocated Event Data

A common question in forensic analysis is whether two observed data sets originated from the same source or from different sources. Statistical approaches to addressing this question have been widely…
CSAFE 2020 All Hands Meeting

The 2020 All Hands Meeting was held May 12 and 13, 2020 and served as the closing to the last 5 years of CSAFE research and focused on kicking off…