Skip to content

Adversarial Matching of Dark Net Market Vendor Accounts

Conference/Workshop:
25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
Published: 2019
Primary Author: Xiao Hui Tai
Secondary Authors: Kyle Soska, N. Christian
Research Area: Digital

Many datasets feature seemingly disparate entries that actually refer to the same entity. Reconciling these entries, or “matching,” is challenging, especially in situations where there are errors in the data. In certain contexts, the situation is even more complicated: an active adversary may have a vested interest in having the matching process fail. By leveraging eight years of data, we investigate one such adversarial context: matching different online anonymous marketplace vendor handles to unique sellers. Using a combination of random forest classifiers and hierarchical clustering on a set of features that would be hard for an adversary to forge or mimic, we manage to obtain reasonable performance (over 75% precision and recall on labels generated using heuristics), despite generally lacking any ground truth for training. Our algorithm performs particularly well for the top 30% of accounts by sales volume, and hints that 22,163 accounts with at least one confirmed sale map to 15,652 distinct sellers—of which 12,155 operate only one account, and the remainder between 2 and 11 different accounts. Case study analysis further confirms that our algorithm manages to identify non-trivial matches, as well as impersonation attempts.

Related Resources

LibDroid: Summarizing information flow of Android Native Libraries via Static Analysis

LibDroid: Summarizing information flow of Android Native Libraries via Static Analysis

With advancements in technology, people are taking advantage of mobile devices to access e-mails, search the web, and video chat. Therefore, extracting evidence from mobile phones is an important component…
Evaluating Reference Sets for Score-Based Likelihood Ratios for Camera Device Identification

Evaluating Reference Sets for Score-Based Likelihood Ratios for Camera Device Identification

An investigator wants to know if an illicit image captured by an unknown camera was taken by a person of interest’s (POI’s) phone. Score-based likelihood ratios (SLRs) have been used…
Likelihood Ratios for Categorical Evidence with Applications to Digital Forensics

Likelihood Ratios for Categorical Evidence with Applications to Digital Forensics

In forensic investigations, the goal of evidence evaluation is often to address source-/identity-based questions in which the evidence consists of two sets of observations: one from an unknown source tied…
Automatic Detection of Android Steganography Apps via Symbolic Execution and Tree Matching

Automatic Detection of Android Steganography Apps via Symbolic Execution and Tree Matching

The recent focus of cyber security on automated detection of malware for Android apps has omitted the study of some apps used for “legitimate” purposes, such as steganography apps. Mobile…