Identifying the Gender Gap in Financial White Collar Crime: An Empirical Study on SEC Litigation Reports and the Impact on Current White Collar Crime Gender Theory
Abstract
Purpose: This thesis explores whether there is a gender gap in white-collar crime, using a data-driven textual analysis of the Securities and Exchange Commission’s (SEC) litigation releases. It aims to gather and process high-quality evidence from these releases to give the academic community better insight into that gap. We also analyse trends in the gender composition of the groups that commit white-collar crime and in the types of crime each gender group is likely to commit.
Methods: We built an R-based crawler to gather the text of over 10,000 litigation releases. Each release was cross-referenced against a name database of over 32,000 first names recorded from social insurance applications dating back to 1879. The most popular first names were used, each with its associated gender (male or female). The releases were also cross-referenced against a list of the most common terms related to white-collar crime.
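The matching step described above can be sketched as follows. This is a minimal illustration in Python, not the R code used in the study. The tiny `NAME_GENDER` and `CRIME_TERMS` tables, the `tally_release` helper, and the sample sentence are all hypothetical stand-ins for the full name database, the crime-term list, and a real litigation release.

```python
import re
from collections import Counter

# Hypothetical miniature stand-ins for the name database and crime-term list.
NAME_GENDER = {"john": "male", "michael": "male", "mary": "female", "linda": "female"}
CRIME_TERMS = {"fraud", "insider", "trading", "antifraud"}


def tally_release(text):
    """Count gendered first-name mentions and crime-term mentions in one release."""
    # Lowercase and split into alphabetic tokens (an assumed tokenisation rule).
    tokens = re.findall(r"[a-z]+", text.lower())
    gender_counts = Counter(NAME_GENDER[t] for t in tokens if t in NAME_GENDER)
    term_counts = Counter(t for t in tokens if t in CRIME_TERMS)
    return gender_counts, term_counts


# Invented sample text, not taken from an actual SEC release.
sample = (
    "The SEC charged John Doe and Mary Roe with insider trading "
    "and securities fraud. John Doe settled."
)
genders, terms = tally_release(sample)
print(dict(genders))  # {'male': 2, 'female': 1}
print(dict(terms))    # {'insider': 1, 'trading': 1, 'fraud': 1}
```

Because the tally is per mention, "John" is counted twice in the sample even though only one John is involved, so the counts depend on how often a release repeats a name.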
Findings: Male names appear far more often than female names in the litigation releases. Female names account for approximately 12% to 20% of all names tallied over the period 1995 to 2020. All-male groups accounted for approximately 38% of the crimes recorded and individual males for 32%. Mixed-gender groups made up the bulk of the remainder with 24.5% of crimes recorded, while female individuals and all-female groups accounted for a mere 5% and 0.8%, respectively. Crimes involving terms such as “fraud”, “trading”, “insider”, and “antifraud” were the most common. The number of litigation releases, and by implication the level of recorded criminal activity, peaked between 2000 and 2005 and has declined consistently over time. On a per-name basis, the percentage of male names occurring in litigation releases appears to decrease slowly from 1995 to 2020, while the percentage of female names appears to increase; if this trend were extrapolated into the future, the proportions of male and female names could potentially converge.
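The group categories reported above (individual male, individual female, all-male group, all-female group, mixed-gender group) can be illustrated with a short Python sketch. It assumes each release has already been reduced to one gender label per distinct defendant; the source does not describe how that reduction was done. The `classify_release` and `composition_shares` helpers and the example data are hypothetical.

```python
from collections import Counter


def classify_release(genders):
    """Map a list of per-defendant gender labels to a group category."""
    if not genders:
        return "unknown"
    if len(set(genders)) > 1:
        return "mixed-gender group"
    if len(genders) == 1:
        return f"individual {genders[0]}"
    return f"all-{genders[0]} group"


def composition_shares(releases):
    """Return each category's share of releases, as a percentage."""
    counts = Counter(classify_release(g) for g in releases)
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


# Invented example data: four releases with different defendant line-ups.
example = [["male"], ["male", "male"], ["male", "female"], ["female"]]
shares = composition_shares(example)
print(shares)
```

In this toy input each of the four categories present receives a 25.0% share; the percentages in the findings come from the full set of releases, not from this sketch.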
Research Limitations: The frequency of names tallied depends on how many times a defendant’s name is mentioned in a given litigation release, so the data vary with the writing style of each release. The names recorded do not always belong to a defendant; they sometimes belong to a prosecutor or investigator, who is then wrongly counted as having committed a crime. Many individuals have last names that are also common first names (e.g., Scarlett), which leads to double counting of the parties involved. The crime-term frequencies are subject to the same double counting, as they too depend on the writing style of the release.