I wrote software to detect redaction rectangles—it turns out these are relatively easy to recognize based on their color, shape, and the specific commands used to draw them. Out of 1.8 million PACER documents, there were approximately 2000 documents with redaction rectangles. (There were also about 3500 documents that were redacted by replacing text by strings of Xes.)
Next, my software checked to see if these redaction rectangles overlapped with text. My software identified a few hundred documents that appeared to have text under redaction rectangles, and examining them by hand revealed 194 documents with failed redactions. The majority of the documents (about 130) appear be from commercial litigation, in which parties have unsuccessfully attempted to redact trade secrets such as sales figures and confidential product information. Other improperly redacted documents contain sensitive medical information, addresses, and dates of birth. Still others contain the names of witnesses, jurors, plaintiffs, and one minor.
Timothy B. Lee: Studying the Frequency of Redaction Failures in PACER