TY - JOUR
T1 - Modeling verdict outcomes using social network measures
T2 - The Watergate and Caviar Network cases
AU - Masías, Víctor Hugo
AU - Valle, Mauricio
AU - Morselli, Carlo
AU - Crespo, Fernando
AU - Vargas, Augusto
AU - Laengle, Sigifredo
N1 - Publisher Copyright:
© 2016 Masías et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2016/1/1
Y1 - 2016/1/1
N2 - Modelling criminal trial verdict outcomes using social network measures is an emerging research area in quantitative criminology. Few studies have yet analyzed which of these measures are the most important for verdict modelling or which data classification techniques perform best for this application. To compare the performance of different techniques in classifying members of a criminal network, this article applies three different machine learning classifiers (Logistic Regression, Naïve Bayes and Random Forest) with a range of social network measures and the necessary databases to model the verdicts in two real-world cases: the U.S. Watergate Conspiracy of the 1970s and the now-defunct Canada-based international drug trafficking ring known as the Caviar Network. In both cases it was found that the Random Forest classifier did better than either Logistic Regression or Naïve Bayes, and its superior performance was statistically significant. This being so, Random Forest was used not only for classification but also to assess the importance of the measures. For the Watergate case, the most important one proved to be betweenness centrality, while for the Caviar Network it was the effective size of the network. These results are significant because they show that an approach combining machine learning with social network analysis not only can generate accurate classification models but also helps quantify the importance of social network variables in modelling verdict outcomes. We conclude our analysis with a discussion and some suggestions for future work in verdict modelling using social network measures.
AB - Modelling criminal trial verdict outcomes using social network measures is an emerging research area in quantitative criminology. Few studies have yet analyzed which of these measures are the most important for verdict modelling or which data classification techniques perform best for this application. To compare the performance of different techniques in classifying members of a criminal network, this article applies three different machine learning classifiers (Logistic Regression, Naïve Bayes and Random Forest) with a range of social network measures and the necessary databases to model the verdicts in two real-world cases: the U.S. Watergate Conspiracy of the 1970s and the now-defunct Canada-based international drug trafficking ring known as the Caviar Network. In both cases it was found that the Random Forest classifier did better than either Logistic Regression or Naïve Bayes, and its superior performance was statistically significant. This being so, Random Forest was used not only for classification but also to assess the importance of the measures. For the Watergate case, the most important one proved to be betweenness centrality, while for the Caviar Network it was the effective size of the network. These results are significant because they show that an approach combining machine learning with social network analysis not only can generate accurate classification models but also helps quantify the importance of social network variables in modelling verdict outcomes. We conclude our analysis with a discussion and some suggestions for future work in verdict modelling using social network measures.
UR - http://www.scopus.com/inward/record.url?scp=84959019004&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0147248
DO - 10.1371/journal.pone.0147248
M3 - Article
C2 - 26824351
AN - SCOPUS:84959019004
SN - 1932-6203
VL - 11
JO - PLoS ONE
JF - PLoS ONE
IS - 1
M1 - 0147248
ER -