Machine learning-based Score Likelihood Ratios (SLRs) have been proposed as an alternative to traditional Likelihood Ratios and Bayes Factors for quantifying the value of evidence when contrasting two opposing propositions. In the common source problem, the opposing propositions concern whether or not two items come from the same source. Machine learning techniques can be used to construct a (dis)similarity score for complex data when developing a traditional model is infeasible, and density estimation is then used to estimate the likelihood of the score under each proposition.
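As a brief illustration of the quantity described above, the score-based ratio can be written as follows. The notation is an assumption introduced here and does not come from the text: \(\Delta\) denotes the learned (dis)similarity score, \(H_p\) and \(H_d\) denote the same-source and different-source propositions, and \(\hat f\) denotes an estimated score density.

```latex
\[
  s = \Delta(x, y),
  \qquad
  \mathrm{SLR}(s) = \frac{\hat f(s \mid H_p)}{\hat f(s \mid H_d)}
\]
```

Under this notation, values above 1 support the same-source proposition and values below 1 support the different-source proposition.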
In practice, the metric and its distribution are developed using pairwise comparisons constructed from a sample of the background population. Generating these comparisons results in a complex dependence structure that violates the independence assumptions fundamental to most methods. To remedy this lack of independence, we introduce a sampling approach to construct training and estimation sets in which these assumptions are met. Using these newly created datasets, we construct multiple base SLR systems and aggregate their information into a final score that quantifies the value of evidence. Our experimental results show that this ensembled SLR can outperform a traditional SLR in terms of the rate of misleading evidence and discriminatory power, and that it is more reliable.
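The sampling-and-aggregation idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in this work. The absolute-difference score, the rule of using each source in at most one pair, the Gaussian kernel density estimate, and the averaging of log SLRs are all choices made here for the example. The names `make_toy_data`, `sample_independent_scores`, `fit_base_slr` and `fit_ensemble_slr` are hypothetical.

```python
import math
import random
import statistics


def make_toy_data(n_sources=200, n_items=4, seed=1):
    """Toy background sample: each source has a mean, items scatter around it."""
    rng = random.Random(seed)
    data = {}
    for source in range(n_sources):
        mu = rng.gauss(0, 3)
        data[source] = [rng.gauss(mu, 0.5) for _ in range(n_items)]
    return data


def score(a, b):
    """Stand-in dissimilarity score (assumption: a learned metric in practice)."""
    return abs(a - b)


def sample_independent_scores(data, rng):
    """Draw pairs so that no item or source is shared between two pairs."""
    sources = list(data)
    rng.shuffle(sources)
    half = len(sources) // 2
    # Same-source pairs: two distinct items from each source in the first half.
    same = [score(*rng.sample(data[s], 2)) for s in sources[:half]]
    # Different-source pairs: remaining sources paired off, one item from each.
    rest = sources[half:]
    diff = [score(rng.choice(data[s]), rng.choice(data[t]))
            for s, t in zip(rest[0::2], rest[1::2])]
    return same, diff


def kde(samples):
    """Gaussian kernel density estimate with a normal-reference bandwidth."""
    n = len(samples)
    h = max(1.06 * statistics.stdev(samples) * n ** (-0.2), 1e-6)
    norm = n * h * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / norm

    return density


def fit_base_slr(data, rng):
    """One base system: densities of independent scores under each proposition."""
    same, diff = sample_independent_scores(data, rng)
    f_same, f_diff = kde(same), kde(diff)
    eps = 1e-300  # guards against log(0) far in the tails

    def log_slr(s):
        return math.log(f_same(s) + eps) - math.log(f_diff(s) + eps)

    return log_slr


def fit_ensemble_slr(data, n_base=25, seed=0):
    """Aggregate base systems by averaging log SLRs (one possible aggregation)."""
    rng = random.Random(seed)
    base = [fit_base_slr(data, rng) for _ in range(n_base)]
    return lambda s: statistics.fmean([f(s) for f in base])


if __name__ == "__main__":
    ensemble = fit_ensemble_slr(make_toy_data())
    for s in (0.05, 1.0, 6.0):
        print(s, ensemble(s))
```

In this sketch a positive aggregated log SLR supports the same-source proposition and a negative one supports the different-source proposition. Each base system sees a different random set of independent pairs, which is what the resampling is meant to achieve. Other aggregation rules, such as voting or a fitted combiner, could be substituted in `fit_ensemble_slr`.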