Canopy Blog

Incident Response Data Mining & the Snowflake Attack

Written by Canopy Team | June 13, 2024

An as-yet-undetermined number of companies have had their data stolen after attackers used compromised customer credentials to access accounts on Snowflake, a popular cloud data platform. So far, several big-name companies have publicly confirmed data breaches as a result of this attack.

As more information becomes available about this attack, which has come to light almost exactly one year after the MOVEit hack, it’s shaping up to be one of the biggest data breaches ever. The criminal hacker group ShinyHunters has already posted large data sets for sale in online forums, including the personal data of 560 million customers from a single company, suggesting a massive scale of impact.

Databases Make Data Mining Tricky

A uniquely challenging aspect of this attack is that Snowflake hosts databases, which store and organize structured data. Databases are notoriously difficult to handle in incident response data mining because they can be massive, tedious to review, and unpredictable.

Data mining databases is often cumbersome because they:

  • Contain vast, complex quantities of PII, PHI, and other personal information.
  • Contain normalized data within multiple tables as well as unnormalized values in fields.
  • Suffer from data quality issues like missing values, inconsistent formatting, and duplicate records.
  • Store uniquely formatted PII/PHI types in addition to standard PII.
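
To make these challenges concrete, here is a minimal Python sketch of what even a simple structured data pass involves. It is an illustration only, not Canopy's method: the table names (people, identifiers), the column names, and the sample values are all hypothetical, and it uses Python's built-in sqlite3 module as a stand-in for a real compromised database. It joins two normalized tables, cleans up inconsistently formatted Social Security numbers, skips missing values, and removes duplicates so that each person maps to a clean list of identifiers.

```python
import re
import sqlite3


def normalize_ssn(raw):
    """Return an SSN as NNN-NN-NNNN, or None if it is missing or malformed."""
    if raw is None:
        return None
    digits = re.sub(r"\D", "", raw)  # drop dashes, spaces, and other separators
    if len(digits) != 9:
        return None
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


def affected_individuals(conn):
    """Join the (hypothetical) normalized tables and group unique SSNs by person."""
    rows = conn.execute(
        "SELECT p.id, p.full_name, i.ssn "
        "FROM people AS p LEFT JOIN identifiers AS i ON i.person_id = p.id "
        "ORDER BY p.id"
    )
    grouped = {}
    for person_id, name, raw_ssn in rows:
        entry = grouped.setdefault(
            person_id, {"name": name.strip().title(), "ssns": set()}
        )
        ssn = normalize_ssn(raw_ssn)
        if ssn is not None:
            entry["ssns"].add(ssn)  # the set removes duplicate values
    return {
        pid: {"name": e["name"], "ssns": sorted(e["ssns"])}
        for pid, e in grouped.items()
    }


def build_sample_db():
    """Create a tiny in-memory database with made-up, deliberately messy data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE people (id INTEGER PRIMARY KEY, full_name TEXT);
        CREATE TABLE identifiers (person_id INTEGER, ssn TEXT);
        INSERT INTO people VALUES
            (1, 'ada lovelace'), (2, 'Alan Turing '), (3, 'Grace Hopper');
        INSERT INTO identifiers VALUES
            (1, '123-45-6789'),
            (1, '123456789'),
            (2, '987 65 4321'),
            (3, NULL);
        """
    )
    return conn


if __name__ == "__main__":
    for person_id, info in affected_individuals(build_sample_db()).items():
        print(person_id, info["name"], info["ssns"])
```

Even this toy version needs a join, a format normalizer, null handling, and deduplication for a single PII type. Real databases multiply that effort across many tables, many PII/PHI types, and formats that are not known in advance.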

When faced with databases, incident response providers often struggle through custom-coded solutions, frequently without success, to gain insight into the data before engaging a professional services provider for full review. These custom solutions are expensive and can easily account for a large percentage of the total investigation cost.

But with the right technology partner, incident response data mining teams can confidently handle compromised databases and help their clients determine the impact much faster.

Overcome Structured Data Mining Challenges & Costs with Canopy

Canopy’s patented Data Breach Response software is the world’s leading data mining technology. It enables IR teams to locate PII/PHI, connect it to people, and generate a consolidated list of affected individuals, even when dealing with structured data.

Canopy is also the first and only data mining software to support structured data. With our Database Previews, IR teams and breach counsel can get an early look at compromised databases, typically on the same day they start uploading data. This lets providers assess risk upfront and focus their review strategy for peak efficiency and time savings.

Because databases are so complex, those that require a full review still call for professional services. In these cases, the upfront insight provided by Canopy’s Database Previews allows IR teams to target those efforts, saving clients (and their cyber insurers) significant amounts of money on the overall project cost.

Were you or your client potentially affected by the Snowflake attack? Canopy’s software and database expertise can deliver immense time and cost savings for your data mining project. Contact one of our Certified Partners for data mining support, or request a demo to see the application in action.