I had to work with many scanned PDF documents that had been saved in a very suboptimal way, each one up to tens of MB in size. That made them unsuitable for sending via email, and the bloat was unnecessary. I saw an option on my girlfriend's Mac to optimize a PDF. Her document dropped in size from 3MB to 57kB without any visible drop in quality.
Finding out how to do that from the command line on Linux was easy. The first Stack Overflow result in my search showed the following use of Ghostscript:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook \
-dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
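The -dPDFSETTINGS value is the main knob in this command. Ghostscript ships the presets /screen, /ebook, /printer, /prepress and /default, which trade image quality for file size. As a rough sketch, the command can be wrapped in a small function so the preset is easy to swap. The names shrink_pdf and GS are my own assumptions for illustration and are not part of Ghostscript:

```shell
# Hypothetical helper: shrink_pdf INPUT OUTPUT [PRESET]
# GS is an assumed override hook; it defaults to the real gs binary.
shrink_pdf() {
    input=$1
    output=$2
    preset=${3:-/ebook}   # one of /screen /ebook /printer /prepress /default

    "${GS:-gs}" -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
        "-dPDFSETTINGS=$preset" \
        -dNOPAUSE -dQUIET -dBATCH \
        "-sOutputFile=$output" "$input"
}
```

Calling shrink_pdf input.pdf output.pdf reproduces the command above, while shrink_pdf input.pdf output.pdf /screen asks for the smallest, lowest-quality preset.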
It worked flawlessly. I needed to run this on many files in a folder, or
in other words, to run it in batch. I resorted to using xargs.
Note: Using xargs -I as explained below can be potentially dangerous. Read the links in one of my posts to learn more.
The gs command, adjusted and piped into xargs, looks like this:
find . -maxdepth 1 -name '*.pdf' | xargs -I % gs --ARGUMENTS %
Or it could use the fd utility with a null character instead of a
newline as the separator, via the -0 option or its long form, --print0.
This is the way it has historically been combined with xargs, as also
noted in the fd docs:
-0, --print0
Separate search results by the null character (instead of newlines). Useful for piping results to xargs.
The command then looks like this:
fd -0 -d1 "\.pdf" | xargs -0 -I % gs --ARGUMENTS %
In many environments the null character might not even be necessary,
but it is good to know about the connection. The -0 on both
sides of the pipe could thus be dropped:
fd -d1 "\.pdf" | xargs -I % gs --ARGUMENTS %
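To see where the null separator actually matters, here is a small sketch that uses printf to stand in for the file search. The file name is made up and contains a newline, which is legal on Linux:

```shell
# A hypothetical file name with an embedded newline.
name=$(printf 'scan\npage.pdf')

# Newline-delimited input: xargs treats "scan" and "page.pdf" as two items.
newline_items=$(printf '%s\n' "$name" | xargs -I % echo "item=%" | grep -c '^item=')

# Null-delimited input: xargs treats the whole name as a single item.
null_items=$(printf '%s\0' "$name" | xargs -0 -I % echo "item=%" | grep -c '^item=')

echo "newline-delimited: $newline_items, null-delimited: $null_items"
```

This prints "newline-delimited: 2, null-delimited: 1". With ordinary file names both variants behave the same, which is why dropping -0 often goes unnoticed.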
Again, whenever you use xargs -I, do a dry run first (run just
whatever find command you use, without piping it into anything) as a
minimal safety precaution, so that nothing nasty surprises you.
And possibly do your own research.
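One common way to take the dry run a step further, which I am adding as a suggestion rather than something from the original workflow, is to put echo in front of gs so that xargs prints each command instead of running it. The sketch below uses a throwaway directory with made-up file names; --ARGUMENTS is the same placeholder as above:

```shell
# Throwaway directory with sample names (one containing a space),
# so the preview can be tried without touching real documents.
demo_dir=$(mktemp -d)
touch "$demo_dir/report.pdf" "$demo_dir/scan 01.pdf" "$demo_dir/notes.txt"

# Dry run: echo makes xargs print each gs command instead of executing it.
preview=$(cd "$demo_dir" && find . -maxdepth 1 -name '*.pdf' -print0 |
    xargs -0 -I % echo gs --ARGUMENTS %)
printf '%s\n' "$preview"

# Clean up the throwaway directory.
rm -r "$demo_dir"
```

Once the printed commands look right, removing echo runs them for real.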
The full command I ended up using to obtain batch-processed, size-optimized PDFs was this one:
fd -d1 "\.pdf" | xargs -I % \
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook \
-dNOPAUSE -dQUIET -dBATCH -sOutputFile="/path/to/output/dir/%" %
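For anyone who would rather sidestep xargs -I altogether, a plain shell loop over a glob is one alternative. This is only a sketch and not the approach I used. The names shrink_all, out_dir and the GS override are hypothetical:

```shell
# Hypothetical batch helper: shrink_all OUTPUT_DIR
# Processes every *.pdf in the current directory (no recursion).
shrink_all() {
    out_dir=$1
    mkdir -p "$out_dir" || return 1

    for pdf in ./*.pdf; do
        [ -e "$pdf" ] || continue   # the glob matched nothing

        # GS is an assumed override hook; it defaults to the real gs binary.
        "${GS:-gs}" -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
            -dPDFSETTINGS=/ebook \
            -dNOPAUSE -dQUIET -dBATCH \
            "-sOutputFile=$out_dir/${pdf#./}" "$pdf"
    done
}
```

Run it from the folder with the scans, for example shrink_all /path/to/output/dir. Because the shell expands the glob itself, file names with spaces or newlines are passed to gs intact.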
Might come in handy. Enjoy!