Compress a PDF file with Ghostscript on Linux

How to reduce the size of a PDF that originated from a scanned document.

I have just scanned a bunch of physical pages into a PDF and the result is a pretty big file. Without OCR processing, the scanned pages are stored as plain images rather than text, which increases the overall size of the output.

Browsing the web, I came up with the following Ghostscript command, which compresses the original file and converts it to grayscale. The result is a printer-friendly PDF file, i.e. images are downsampled to 300 dpi, but you can change that through the -dPDFSETTINGS parameter.

gs \
-sDEVICE=pdfwrite \
-dCompatibilityLevel=1.4 \
-dPDFSETTINGS=/printer \
-dBATCH \
-dNOPAUSE \
-dQUIET \
-sProcessColorModel=DeviceGray \
-sColorConversionStrategy=Gray \
-dOverrideICC \
-sOutputFile=output.pdf \
input.pdf
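
To check how much the command actually saved, the two file sizes can be compared. This is only a minimal sketch for a POSIX shell: report_savings is a hypothetical helper name, not part of Ghostscript, and it only reads file sizes.

```shell
# report_savings ORIGINAL COMPRESSED
# Prints both sizes in bytes and the saving as an integer percentage.
# Hypothetical helper for illustration; it does not call Ghostscript.
report_savings() {
    orig=$(wc -c < "$1" | tr -d '[:space:]')
    new=$(wc -c < "$2" | tr -d '[:space:]')
    if [ "$orig" -eq 0 ]; then
        echo "original file is empty" >&2
        return 1
    fi
    saved=$(( (orig - new) * 100 / orig ))
    echo "$1: $orig bytes, $2: $new bytes, saved: ${saved}%"
}

# Usage, after running the gs command above:
#   report_savings input.pdf output.pdf
```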

The parameters in detail

-sDEVICE=pdfwrite selects which output device Ghostscript should use. I want to print to a PDF file, so I'm using pdfwrite.

-dCompatibilityLevel=1.4 sets the PDF version of the output file to 1.4. You may want to change this according to your needs. Here's a list of all PDF versions.

-dPDFSETTINGS=/printer sets the image quality for printers (i.e. 300 dpi). Choose /screen if you want to downsample images to 72 dpi: you will obtain additional compression (but the file will look ugly if printed on paper). /ebook (150 dpi) sits in between.
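
Since the preset is the parameter most likely to change, the command can be wrapped in a small shell function that takes it as an argument. This is a sketch, not a definitive script: compress_pdf and the GS override variable are names made up for this example, and every Ghostscript switch is copied from the command above.

```shell
# compress_pdf PRESET INPUT OUTPUT
# PRESET: screen (72 dpi), ebook (150 dpi), printer (300 dpi), prepress, or default.
# The GS variable is an assumption of this sketch: it lets you point at a
# differently named Ghostscript executable; it defaults to "gs".
compress_pdf() {
    preset=$1
    input=$2
    output=$3
    case "$preset" in
        screen|ebook|printer|prepress|default) ;;
        *) echo "unknown preset: $preset" >&2; return 2 ;;
    esac
    "${GS:-gs}" \
        -sDEVICE=pdfwrite \
        -dCompatibilityLevel=1.4 \
        -dPDFSETTINGS="/$preset" \
        -dBATCH \
        -dNOPAUSE \
        -dQUIET \
        -sProcessColorModel=DeviceGray \
        -sColorConversionStrategy=Gray \
        -dOverrideICC \
        -sOutputFile="$output" \
        "$input"
}

# Usage:
#   compress_pdf screen input.pdf output-screen.pdf
#   compress_pdf printer input.pdf output-printer.pdf
```

Writing each preset to its own output file makes it easy to compare size and quality side by side.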

-dBATCH -dNOPAUSE: Ghostscript will process the input file(s) without interaction. It will quit on completion.
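
Because these two switches make Ghostscript run unattended, the command can be used safely inside a loop. The following sketch compresses every PDF in a directory; compress_all and the GS override variable are hypothetical names introduced for illustration.

```shell
# compress_all SRC_DIR DST_DIR
# Runs the same Ghostscript command on every *.pdf in SRC_DIR and writes
# the results, under the same file names, to DST_DIR.
# compress_all and the GS override are assumptions of this sketch.
compress_all() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    for f in "$src"/*.pdf; do
        [ -e "$f" ] || continue   # no match: the glob is left unexpanded
        "${GS:-gs}" \
            -sDEVICE=pdfwrite \
            -dCompatibilityLevel=1.4 \
            -dPDFSETTINGS=/printer \
            -dBATCH -dNOPAUSE -dQUIET \
            -sProcessColorModel=DeviceGray \
            -sColorConversionStrategy=Gray \
            -dOverrideICC \
            -sOutputFile="$dst/$(basename "$f")" \
            "$f" || echo "failed: $f" >&2
    done
}

# Usage:
#   compress_all scans compressed
```

Use a destination directory different from the source one, so that no input file is overwritten while Ghostscript is still reading it.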

-dQUIET mutes routine information comments on standard output.

-sProcessColorModel=DeviceGray sets the color model used during the conversion to grayscale.

-sColorConversionStrategy=Gray instructs Ghostscript to produce grayscale output.

-dOverrideICC: since the color has changed, -dOverrideICC tells Ghostscript to ignore any ICC color profiles embedded in the original file and to use its default profiles instead.

-sOutputFile=output.pdf: where to save the output file.

input.pdf: the original file to process.

Additional notes

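The above command should work on Windows and OS X as well, as long as Ghostscript is installed. On Windows the console executable is called gswin64c (or gswin32c) instead of gs, and the trailing backslashes must be replaced with ^ in cmd.exe, or the command written on a single line.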
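
The executable name is the main difference between platforms. The sketch below picks whichever name is available on the PATH; it assumes a POSIX shell (on Windows, something like Git Bash or Cygwin), and find_gs is a hypothetical helper name.

```shell
# find_gs: print the name of the first Ghostscript executable found on PATH.
# gs is the usual name on Linux and OS X; gswin64c and gswin32c are the
# console executables installed on Windows.
find_gs() {
    for name in gs gswin64c gswin32c; do
        if command -v "$name" >/dev/null 2>&1; then
            echo "$name"
            return 0
        fi
    done
    echo "Ghostscript not found on PATH" >&2
    return 1
}

# Usage:
#   GS=$(find_gs) && "$GS" --version
```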

PDF version 1.5 seems to offer better image compression. I should look into that more closely.

Sources

Ghostscript - How to use Ghostscript (link)
GitHub gist - Compress PDF files with ghostscript (link)
Stackoverflow - How to convert a PDF to grayscale from command line avoiding to be rasterized? (link)

comments
Yago on March 30, 2019 at 01:47
Awesome
abdan on June 19, 2019 at 12:34
How do we do the opposite, i.e. how to uncompress a PDF file using Ghostscript?
Antonio Nicasio on March 12, 2020 at 17:08
Quick question: is it possible to increase the resolution of the images inside a PDF to 300 dpi with -dPDFSETTINGS=/printer, or do I need to do something else?
Triangles on March 21, 2020 at 10:37
@Antonio Nicasio I think the -dPDFSETTINGS=/printer flag should be OK for your needs. I've never tried upscaling/upsampling, though...