• AernaLingus [any]@hexbear.net · 3 months ago

    When I extracted it, it was only 492 kB, so I’m not sure why this image ended up at 854 kB. They’re ultimately the same image: ImageMagick’s compare tool showed them as pixel-identical, and they optimized down to bit-identical files after I ran both through Efficient Compression Tool with the flags -9 -strip.
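    That “pixel-identical but different file sizes” situation is easy to reproduce with plain zlib/deflate (the same algorithm behind FlateDecode and PNG): compressing identical data at different effort levels produces different byte streams that decompress to identical data. A minimal sketch with made-up stand-in data, not anything from the actual PDF:

```python
import zlib

# Stand-in for raw image data: repetitive, but not trivially so.
data = b"scanline of fake pixel data " * 200

fast = zlib.compress(data, level=1)  # quick, typically larger stream
best = zlib.compress(data, level=9)  # slower, typically smaller stream

assert fast != best  # different bytes on disk...
assert zlib.decompress(fast) == zlib.decompress(best) == data  # ...same "pixels"
print(len(fast), len(best))
```

    Which is presumably what ECT is exploiting: it re-does the deflate step with more effort, and once two encoders land on the same stream the files become bit-identical.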

    What method did you use, out of curiosity? I used mutool extract on the PDF, which spit out the single PNG.

      • AernaLingus [any]@hexbear.net · 3 months ago

        I did some digging and it looks like lossless images aren’t really stored as PNGs, per se:

        https://en.wikipedia.org/wiki/PDF#Raster_images

        FlateDecode, a commonly used filter based on the deflate algorithm defined in RFC 1951 (deflate is also used in the gzip, PNG, and zip file formats among others); introduced in PDF 1.2; it can use one of two groups of predictor functions for more compact zlib/deflate compression: Predictor 2 from the TIFF 6.0 specification and predictors (filters) from the PNG specification (RFC 2083) …
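        The point of those predictors is to transform the data into something deflate handles better. A rough sketch of PNG’s filter type 1 (“Sub”), using a hypothetical gradient scanline, shows why:

```python
import zlib

# One "scanline": a smooth horizontal gradient, the kind of data
# predictors are designed for (a made-up stand-in for real image rows).
scanline = bytes(range(256)) * 32

# PNG filter type 1 ("Sub"): store each byte as the difference from the
# byte to its left (mod 256). A smooth gradient becomes nearly constant.
filtered = bytes((scanline[i] - (scanline[i - 1] if i else 0)) % 256
                 for i in range(len(scanline)))

raw_size = len(zlib.compress(scanline, 9))
filtered_size = len(zlib.compress(filtered, 9))
assert filtered_size < raw_size  # the predictor makes deflate's job easier

# Reversing the filter reconstructs the scanline exactly (lossless).
out = bytearray()
prev = 0
for b in filtered:
    prev = (prev + b) % 256
    out.append(prev)
assert bytes(out) == scanline
```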

        pdfimages hints at this: all the other image output options say things like “write JPEG images as JPEG files,” but the PNG option says “change the default output format to PNG” (with no arguments it spits out raw PPM files).
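        Raw PPM is about as bare-bones as image formats get: an ASCII header (P6 magic, width, height, max value) followed by raw RGB triples, no compression at all. A minimal sketch of the format (not pdfimages’ actual code, and ignoring the comments and looser whitespace the real spec allows):

```python
def write_ppm(width, height, pixels):
    """pixels: raw RGB triples as bytes, row-major."""
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    return header + pixels

def read_ppm(data):
    # Only handles headers in the exact shape write_ppm produces.
    magic, dims, maxval, pixels = data.split(b"\n", 3)
    assert magic == b"P6" and maxval == b"255"
    width, height = map(int, dims.split())
    return width, height, pixels

# A 2x1 image: one red pixel, one blue pixel.
ppm = write_ppm(2, 1, bytes([255, 0, 0, 0, 0, 255]))
w, h, px = read_ppm(ppm)
assert (w, h) == (2, 1) and len(px) == w * h * 3
```

        That uncompressed default is why pdfimages has to re-encode when you ask for PNG: the PNG on disk is its own creation, not bytes lifted from the PDF.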

        In fact, if you look at the size of the original PDF, it’s 385 kB, more in line with the optimized file size I ended up with. My guess is that mutool extract simply makes a bit more of an effort to recompress the image than pdfimages, but in both cases they fall short of the original compression (at least for this PDF).

        (completely unrelated, but I found it funny that the PDF uses the woke sans-serif font Helvetica)