How The Marshall Project Analyzed Cuyahoga County Court Records through the Testify Project

by Jamie Crawford and Joshua Clark


Only 30% of Cuyahoga County residents are Black, yet three-quarters of the people in state prisons who were convicted in Cuyahoga County are Black. Nearly 70% of the county’s criminal court cases from 2016 to 2021 involved a defendant with at least one prior charge. Most of these defendants come from the county’s poorest ZIP codes, and most are Black men. To understand this disparity, The Marshall Project examined who is electing judges in Cuyahoga County and the track records of those judges. They also looked at who was facing charges in county courts and who was repeatedly cycling through the courts on new charges. The work combined data scraped from justice system records with interviews with attorneys, academics, and people who have experienced the system firsthand.
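As a rough illustration of how a figure like the share of cases involving a defendant with a prior charge can be computed, here is a minimal Python sketch. The record layout, the identifiers, and the `share_with_prior_charge` function are hypothetical and are not taken from The Marshall Project's code. The sketch also assumes that a "prior charge" means an earlier case in the same data set, which may differ from the definition used in the actual analysis.

```python
from datetime import date

# Hypothetical case records: (case_id, defendant_id, filing_date).
cases = [
    ("CR-1", "A", date(2016, 3, 1)),
    ("CR-2", "B", date(2017, 5, 9)),
    ("CR-3", "A", date(2018, 1, 15)),
    ("CR-4", "A", date(2020, 7, 2)),
    ("CR-5", "C", date(2021, 2, 11)),
]


def share_with_prior_charge(cases):
    """Return the fraction of cases whose defendant appears in an earlier case."""
    seen_defendants = set()
    cases_with_prior = 0
    # Walk the cases in filing order so "earlier" is well defined.
    for _case_id, defendant_id, _filed in sorted(cases, key=lambda c: c[2]):
        if defendant_id in seen_defendants:
            cases_with_prior += 1
        seen_defendants.add(defendant_id)
    return cases_with_prior / len(cases) if cases else 0.0


print(share_with_prior_charge(cases))
```

In practice, the hard part is usually not this arithmetic but deciding when two court records refer to the same person, since names and dates of birth are often recorded inconsistently.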

Organization Background: The Marshall Project is a nonprofit, online journalism organization. They took on the effort of digging into the data because of their expertise and experience in criminal justice reporting. The Marshall Project also collaborated with Cleveland Documenters from Signal Cleveland, a nonprofit newsroom covering Greater Cleveland that fuses community building with local news reporting.


Project Goals: The project sought to uncover the inner workings of the historically opaque Cuyahoga County justice system. Bringing transparency to how judges do their jobs is essential to understanding flaws in the larger system.

Project Resources: The Marshall Project budgeted for the equivalent of at least one full-time staffer, and usually more, throughout the project to date.  The Marshall Project receives funding through grants and donations.

Tools & Technology

  • Custom scraper developed by freelance data visualization engineer Aaron Williams
  • Airtable (cloud collaboration service)
  • Hasura/GraphQL (API development product)
  • R & Python (programming languages)
  • Observable (online community of data practitioners)

Impact: The Marshall Project built a scraper to collect 70,000 court records and created the Testify FAQ tool, where community members and residents could ask questions and engage with journalists. By collaborating with Cleveland Documenters from Signal Cleveland, the project built trust in the Cleveland and Ohio communities by allowing them to express concerns and ask questions about the court system and judges across various modern and traditional platforms. The Marshall Project also provided the community with a first-of-its-kind analysis of voting patterns, which showed that mostly white, suburban voters were electing the judges presiding over the cases of mostly Black Cleveland residents. Additionally, researchers at Case Western Reserve University led by Dr. Ayesha Hardaway are using the data collected by The Marshall Project for a two-year research project investigating sentencing.

How it Happened

A lawyer called a Washington Post reporter and offered a data set of criminal case records from Cuyahoga County that the lawyer had scraped. The Marshall Project took on the effort of digging into the data and contracted with a former Washington Post reporter to do some more robust scraping. Along with data retrieval and analysis came a series of interviews with attorneys, academics, and people who have experienced the system firsthand.

What Worked

1. Automating the data capture

Freelancer Aaron Williams was hired to write a scraper. Many newsrooms would balk at paying thousands of dollars for development when there’s already a robust in-house team. In the end, the cost was minuscule compared to the ongoing benefit of obtaining this data. Paying an experienced engineer enabled The Marshall Project to dedicate more time to developing solutions to the aggressive anti-scraping mechanisms the court uses.

2. Committing time to utilize community engagement strategies

The Marshall Project’s commitment to the long haul and engagement with the community proved very beneficial for the project. They could not just parachute in and expect the project to work. Many of the most powerful applications of the data The Marshall Project obtained emerged after nearly two years of sustained effort. For example, a reporter from The Marshall Project’s local expansion in Cleveland needed to see how often a specific lien was being issued in court, and they were able to find the answer just a few hours later in The Marshall Project’s database.
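To make the lien example concrete, the following is a minimal Python sketch of the kind of lookup a reporter might run once court records are in structured form. It is an illustration only: the record fields, the sample data, and the `count_entries_by_year` function are hypothetical, and the real database was served through Hasura/GraphQL rather than queried from in-memory Python lists.

```python
from collections import Counter

# Hypothetical scraped records; field names are assumptions for illustration.
records = [
    {"case_id": "CR-1", "year": 2019, "docket_entries": ["Arraignment", "Lien filed"]},
    {"case_id": "CR-2", "year": 2020, "docket_entries": ["Lien filed", "Sentencing", "Lien released"]},
    {"case_id": "CR-3", "year": 2020, "docket_entries": ["Arraignment"]},
]


def count_entries_by_year(records, keyword):
    """Count docket entries containing `keyword` (case-insensitive), grouped by year."""
    counts = Counter()
    needle = keyword.lower()
    for record in records:
        hits = sum(needle in entry.lower() for entry in record["docket_entries"])
        if hits:
            counts[record["year"]] += hits
    return dict(counts)


print(count_entries_by_year(records, "lien"))
```

A plain keyword match like this would also count entries such as a lien release, so a real query would need to match the exact docket language for the lien in question.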

3. Utilizing a diverse spectrum of media

The Marshall Project formatted its findings to reach people across multiple mediums and platforms such as Instagram (reels), radio, print, and online podcasts. They were able to transform fundamental reporting into compelling products designed for specific audiences.

What Could Have Worked Better

1. Tempering expectations regarding time estimates

The time needed to do detailed data analysis of the court records was immense. When the Testify FAQ tool was released, the public posed many questions, some of which took months of work to answer clearly.

2. Dedicating more time and resources to tools for informing citizens

Some processes within the justice system are more complex than they need to be, and it would be worthwhile to put in the effort to create accessible foundational civic information and tools to help the public understand the inner workings of the justice system.

3. Monthly check-ins with the community

Monthly check-ins with the community could have provided a space to address questions and issues and to build trust, so that subjects would feel comfortable sharing their lived experiences outside of the interview process.

What Else You Should Know

The resources and engineering that went into the data extraction and analysis would be a limiting factor for other communities replicating this aspect of the project. In fact, courts are even less transparent in other communities, which would make this an even more extensive endeavor elsewhere. On the positive side, a comprehensive voter guide to local judges, with notes about their records and links to news articles, could be produced in any community to help voters make informed decisions about the judges they elect in their local criminal justice systems.

Learn More

To learn more, reach out to David Eads by email or on Twitter. You can also check out the following links:
