I finally decided to learn Qt and started reading “C++ GUI Programming with Qt 4” (first edition), which is available online. The book comes as a zip file that unpacks to a huge 51MB PDF. Even for a long book (556 pages), that is far larger than one would expect. The size made the PDF less convenient to read, since there is a noticeable delay when opening it (especially if it resides on portable storage), so I decided to play around a little and see what I could do about it.
I tried re-distilling the book’s PDF by printing it through the default PDF printer that comes with KDE, and voilà, the newly printed PDF weighed only 5.5MB, roughly 11% of the original file size. I’m not sure what bloat was removed, as I don’t see any loss of quality, except that the bookmarks and links inside the PDF were lost. I’ve also successfully tested the technique on 30MB and 26.2MB PDFs of other books I have, which came out as 6.7MB and 6.8MB re-distilled PDFs, respectively.
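To put the three results side by side, here is a small sketch that computes each re-distilled file as a percentage of its original. The sizes are the ones quoted above; the function name `percent_of_original` is just something I made up for illustration.

```python
def percent_of_original(original_mb: float, new_mb: float) -> float:
    """Return the new size as a percentage of the original, to one decimal."""
    return round(new_mb / original_mb * 100, 1)


# (original size, re-distilled size) in MB, as reported in the post
results = [(51.0, 5.5), (30.0, 6.7), (26.2, 6.8)]

for original_mb, new_mb in results:
    pct = percent_of_original(original_mb, new_mb)
    print(f"{original_mb}MB -> {new_mb}MB ({pct}% of the original)")
```

So the Qt book shrank the most, while the other two ended up at roughly a quarter of their original size.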
So it seems one can greatly reduce the size of large PDFs by re-distilling them with a simple PDF printer, at the cost of losing the bookmarks and links, which I believe may be worth it depending on the original file size. It would be interesting to figure out a way to apply this technique while keeping the bookmarks and links, if that’s possible at all.
Did you ever find out how to do this, as in distill but keep the links?
I did further testing after writing the post. If I remember correctly, the thing was that the original PDF wasn’t compressed. When I ran it through the distiller, it created a new, compressed PDF.
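A rough way to check whether a PDF is uncompressed is to count its stream objects and see how many of them mention the `/FlateDecode` filter. The sketch below is only a heuristic under that assumption: it scans the raw bytes rather than parsing the file, so it can miscount (for example when streams use other filters, or when the file uses compressed object streams). The function name is hypothetical.

```python
def stream_compression_stats(pdf_bytes: bytes) -> tuple[int, int]:
    """Return (number of streams, number of /FlateDecode mentions).

    Heuristic only: this counts byte patterns and does not parse the PDF.
    """
    total_streams = pdf_bytes.count(b"endstream")
    flate_mentions = pdf_bytes.count(b"/FlateDecode")
    return total_streams, flate_mentions


# Tiny hand-made fragment: one plain stream and one Flate-filtered stream.
sample = (
    b"1 0 obj\n<< /Length 5 >>\nstream\nhello\nendstream\nendobj\n"
    b"2 0 obj\n<< /Length 3 /Filter /FlateDecode >>\nstream\nabc\nendstream\nendobj\n"
)
print(stream_compression_stats(sample))
```

On a real file you would pass in `open("book.pdf", "rb").read()` (the file name is a placeholder). Many streams and very few `/FlateDecode` mentions would suggest the file is mostly uncompressed.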
Compressing the PDF can also be done with pdftk, which should be a superior way, as you keep the links and bookmarks.
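For reference, pdftk’s `compress` output option is used as `pdftk in.pdf output out.pdf compress`. Here is a minimal sketch that wraps it in Python, assuming pdftk is installed and on the PATH; the helper names and file names are my own placeholders, and I haven’t measured how close the result gets to the printed version.

```python
import shutil
import subprocess


def pdftk_compress_command(src: str, dst: str) -> list[str]:
    """Build the pdftk command line that rewrites src as a compressed dst."""
    return ["pdftk", src, "output", dst, "compress"]


def compress_pdf(src: str, dst: str) -> None:
    """Run pdftk to compress src into dst; raises if pdftk is missing or fails."""
    if shutil.which("pdftk") is None:
        raise FileNotFoundError("pdftk was not found on PATH")
    subprocess.run(pdftk_compress_command(src, dst), check=True)


# Show the command that would be run (placeholder file names).
print(" ".join(pdftk_compress_command("book.pdf", "book-compressed.pdf")))
```

Calling `compress_pdf("book.pdf", "book-compressed.pdf")` would then write the compressed copy next to the original, leaving the original untouched.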