I am by no means a digital forensics expert, but I thought I would share some basic info about redactions. As examples I have used PDFs that were recently posted from the Epstein tranches. All relevant CWs... I made some effort to select examples that are less disturbing, but the source material is what it is.
Examples to use:
- DocA - All redactions are properly done (can't be reversed).
- DocB - Some redactions are reversible and others are not.
- DocC - Redactions are improperly done.
Simple text selection
DocB is the most useful to illustrate because it has a combination. I guess the redactions were made at different times, by people with varying levels of competence/motivation. If you "Select All" the text you can see which redactions are done properly and which are not. Here is the bottom of pg 1:

- Orange rectangles = properly redacted. You can see whitespace around the blacked-out area because the text selector jumps over it. The right-hand one is more subtle, but if you use the cursor to drag the selection over the text you can see it jump.
- Magenta circled = improperly redacted. The selection continues over the boxes.
- It is quite subtle with the light blue selection color, but the magenta circled boxes are also slightly tinted blue. A bolder selection color would make this easier to see.
Copying and pasting that section gives the result below. For the purpose of demonstration I have:
- added [--------] to show where the redactions still are, because they actually just don't create any text.
- highlighted the newly-visible text (in this case "III").
On Christmas Eve of 1999, or the day prix:,
IIIwas over visiting hisson, who he had with
[--------]years prior.IIIbegan talking to[--------], asshe seemed to be acting strange and
IIIif he needed a ride home.UNCLASSIFIED
All these scans are pretty simple, so this works for all 3, but if the document is more complicated (has columns or multiple text sizes) it might be harder to discern.
Opening file in LibreOffice
Another way is to open the PDF in LibreOffice Draw, which is freely available and runs on Windows, Mac and Linux. Note: I suggest experimenting on a smaller document. Don't start with a doc that is dozens or hundreds of pages long, as it is very hard work for your computer.
Here is pg 1 of DocA. I am using the "select" tool (regular pointer) to drag the top layer of the PDF off the actual text underneath:

I dragged the top layer to the right to reveal the text layer at the bottom:

The left shows how the OCR has interpreted things. You can see that there is not actually any text below the blacked-out boxes (just blank space).
On the other hand, when you open DocC in LibreOffice, there is no separate image layer, because the PDF was generated directly from the digital source (for example with Print to PDF).
When I use the "Select" tool I can drag the text from under the black boxes (page 20, #90):

This shows the inadequate method of redacting: a black box has been drawn on top of the text, hiding it from the eye but not removing it. If you ever have to obscure important information in a PDF, you must not do this!! The most straightforward method of properly redacting would be to export the blacked-out pages as jpg images, then re-make the PDF from those images and re-do the OCR. In a jpg you can't have hidden text.
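To see why a box on top hides nothing, here is a minimal sketch in Python (standard library only). It builds a tiny one-page PDF by hand in which text is drawn and a black rectangle is then painted over part of it, and afterwards reads the text straight back out of the file bytes. The placeholder name, the coordinates and the helper names are all made up for illustration. The content stream is deliberately left uncompressed, which real PDFs typically are not, so this simple regex is an assumption that would not work on real files as-is.

```python
import re


def build_overlay_pdf(secret: str) -> bytes:
    """Build a one-page PDF where a black box is painted over `secret`.

    Assumes `secret` has no parentheses or backslashes (no escaping is done).
    """
    content = (
        "BT /F1 12 Tf 72 700 Td (Name: " + secret + ") Tj ET\n"  # draw the text
        "0 0 0 rg 109 696 100 14 re f\n"  # then fill a black rectangle roughly over it
    ).encode("latin-1")
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"endstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object, needed for the xref table
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % (
        len(objects) + 1,
        xref_pos,
    )
    return bytes(out)


def text_shown_by_tj(pdf_bytes: bytes) -> list[str]:
    """Return literal strings passed to the Tj (show text) operator.

    Only works on uncompressed content streams like the one built above.
    """
    return [m.decode("latin-1") for m in re.findall(rb"\(([^()]*)\)\s*Tj", pdf_bytes)]


pdf = build_overlay_pdf("Jane Placeholder")
print(text_shown_by_tj(pdf))  # the "covered" text is still in the file
```

If you write `pdf` to a file and open it, a viewer should show "Name:" followed by a black box, while select-all/copy still picks up the full line. The rectangle is just one more drawing instruction added after the text; nothing about it deletes the text instruction that came before.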
It's simpler to use the selection method to just copy/paste the text from inside firefox (un-redacted text highlighted for demo):
JSC Interiors, LLCis a New York Limited Liability Company, the Articles of Organization of which were filed in November 2014. The Articles listJSC, who was forced and coerced to have sex with Epstein,as the company’s sole owner.JSCwas manipulated, exploited, and controlled by the Epstein Enterprise.
This is just some basic stuff. I'm sure there is a better tool to use than LibreOffice, but everyone should already have that installed anyway, so it's good to know about for casual snooping around PDFs.
The above might not work for every kind of PDF, depending on how it was generated. And there are other, more sophisticated tools for inspecting PDFs. Feel free to post any other tips, tools etc.
edited to add
other tools
- (suggested by @Enjoyer_of_Games@hexbear.net/@FumpyAer@hexbear.net)
- command line; requires Python/pip, which can generally be run on Windows/Mac/Linux; installation is sometimes smooth but not always. (If you are interested in this kind of thing you'll eventually need to get comfortable with Python.)
- it can either display the redacted text as white over black, or make a 2-up with redacted on the left and unredacted on the right
- commandline; linux (in many distro package managers) most easily but also mac and windows by various means (python, brew, etc), see installation
- if the files you are working with don't contain text that you can select/copy (e.g. are just images) this is a good tool to work with.
- it can also extract text to produce what it calls a "sidecar" file, which is a plain text file, as if you were to "select all"/copy/paste into a new file. But it can do this on multiple files at once, very fast, so it makes scanning through files a lot faster (depending on the quality of the original)
- command line; linux (in many distro package managers) and mac (requires brew).
- search for text in PDF files. Use after you have gotten them as fixed as you can with all the other tools.
- can do complex search queries, like searching for documents that contain a certain keyword near, or after, another keyword. Regex compatible, so you could do things like treat "don", "trump" and "donald" as synonymous, or extract all email addresses or phone numbers.
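The kind of regex searching described in the last bullet can be sketched in plain Python, assuming you already have the text out of the PDF (by copy/paste, or from a plain text file). Everything here is illustrative: the sample text, the names and the patterns are made up, the phone pattern only covers one US-style format, and this is not how any of the tools above work internally.

```python
import re

# Made-up sample standing in for text extracted from a PDF (hypothetical content).
SAMPLE = (
    "Robert Example called the office on Tuesday. "
    "Contact: r.example@example.com or 555-010-0199. "
    "Later Bob sent the invoice."
)

# Treat several spellings as the same person (regex alternation as "synonyms").
NAME = r"bob|robert"

# Deliberately simple patterns; real-world emails and phone numbers are messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b")


def near(text, first, second, max_words=5):
    """True if `second` follows `first` with at most `max_words` words in between."""
    pattern = rf"\b(?:{first})\b(?:\W+\w+){{0,{max_words}}}?\W+(?:{second})\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


print(EMAIL_RE.findall(SAMPLE))         # every email-looking string
print(PHONE_RE.findall(SAMPLE))         # every phone-looking string
print(near(SAMPLE, NAME, "invoice"))    # either name shortly before "invoice"?
print(near(SAMPLE, NAME, "office", 2))  # same idea with a tighter window
```

The `near` helper only looks for the second keyword after the first one; searching in both directions would need a second pattern with the keywords swapped.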
Holy shit have they really fucked up some of the redaction this badly?
@grok redact this PDF for me
Probably laid off the supervisor who knew how to use Adobe properly. Someone else asked ChatGPT and it told them this. Certainly some of the people who were tasked with implementing it would have known better, but they either kept their mouths shut and followed orders, or weren't listened to.
OR/and it is all an intentional psy op......
It'll take analysis of the data set as a whole to make a good guess.
I'm sure someone will be on it in a week or two.