CASE STUDY

Canopy’s Auto Review Completes “Impossible” PHI Data Mining Project

The Challenge

A large hospital network experienced a protected health information (PHI) breach that compromised over 6,600 PDFs densely packed with PHI. Some of these files contained up to 180,000 rows of information covering more than 150,000 individuals. The same patient often appeared many times, with different PHI in each record. The hospital network needed to work quickly to comply with the HIPAA Breach Notification Rule.

The Solution

Canopy’s advanced detection algorithms first identified each PII/PHI element. Auto Review then linked patients’ names to their procedure dates, medical codes, insurance information, and other personal data. Our machine learning models consolidated the resulting entities into a list of unique patients, maintaining links to each person’s data elements. The only human review effort required was QA.
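To make the consolidation step concrete, here is a minimal sketch of the general idea: grouping duplicated rows into unique patients while keeping every linked data element. It is not Canopy's implementation, which uses machine learning models. The matching key (normalized name plus date of birth), the field names, the `normalize` and `consolidate` helpers, and the sample rows are all assumptions made up for illustration.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    # Lowercase and collapse whitespace so "Jane  DOE" and "jane doe" match.
    return " ".join(name.lower().split())


def consolidate(records):
    """Group extracted rows into unique patients, keeping all linked elements.

    Assumption: two rows describe the same patient when the normalized
    name and date of birth agree. A real system would need fuzzier matching.
    """
    patients = defaultdict(lambda: defaultdict(set))
    for rec in records:
        key = (normalize(rec["name"]), rec["dob"])
        entry = patients[key]  # create the patient even if no other fields exist
        for field, value in rec.items():
            if field not in ("name", "dob"):
                entry[field].add(value)
    # Convert sets to sorted lists for stable, readable output.
    return {
        key: {field: sorted(values) for field, values in fields.items()}
        for key, fields in patients.items()
    }


# Hypothetical sample rows, not data from the project.
records = [
    {"name": "Jane Doe", "dob": "1980-01-02", "procedure_date": "2021-03-04"},
    {"name": "jane  doe", "dob": "1980-01-02", "insurance_id": "INS-001"},
    {"name": "John Roe", "dob": "1975-05-06", "medical_code": "A00"},
]

unique = consolidate(records)
print(len(records), "rows ->", len(unique), "unique patients")
```

In this sketch the two "Jane Doe" rows collapse into one patient who still carries both the procedure date and the insurance ID, which mirrors the idea of shrinking the entity list without losing links to each person's data.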

"It was not humanly possible for our team to do this — it would have taken a couple hundred reviewers years to complete this project. We can’t even fathom the cost savings."

Project Lead

The Result

  • Automated review linked PII/PHI from tables in over 6,600 PDFs (some containing over 180,000 rows) to individual people
  • Deduplicated 4.28 billion entities to just 3 million unique patients, reducing the entity list by more than 99%
  • Enabled hospital network to comply with HIPAA Breach Notification Rule
  • Saved the team millions of man-hours and completed the "impossible" project in 15 days

By the Numbers

6,600

Crystal Reports (lengthy PDFs)

4.28 billion

entities, frequently duplicated

15 days

for Canopy to complete the entire project

Get These Results for Your Data Breach Response

Request your personalized demo to see how Canopy's Auto Review uses AI to transform your workflow.