PDF editing

qpdf

Extract pages:

qpdf [INPUT] --pages . [start]-[end] -- [OUTPUT]

Run with --replace-input and omit [OUTPUT] to overwrite the [INPUT] file in place.
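A minimal sketch of both variants wrapped as shell functions. The function names and the file names in the usage comments are made up for illustration; only the qpdf options come from the notes above.

```shell
# Hypothetical helper: copy pages START..END of INPUT into OUTPUT.
extract_pages() {
    local input=$1 start=$2 end=$3 output=$4
    # "." tells --pages to take the pages from the main input file.
    qpdf "$input" --pages . "$start-$end" -- "$output"
}

# Hypothetical helper: keep only pages START..END, overwriting INPUT.
trim_in_place() {
    local input=$1 start=$2 end=$3
    # With --replace-input no output file name is given.
    qpdf --replace-input "$input" --pages . "$start-$end" --
}

# Usage (file names are placeholders):
#   extract_pages book.pdf 5 12 chapter1.pdf
#   trim_in_place book.pdf 5 12
```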

Rotate pages:

qpdf [INPUT] [OUTPUT] --rotate=[+|-][ANGLE]:[PAGE_RANGE]
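For illustration, the same command as a small shell function. The function name and the example file names are assumptions. The angle has to be a multiple of 90; with a leading + or - it is applied relative to the page's current rotation, without a sign it replaces it.

```shell
# Hypothetical helper: rotate PAGE_RANGE of INPUT by ANGLE and write OUTPUT.
rotate_pages() {
    local input=$1 output=$2 angle=$3 range=$4
    qpdf "$input" "$output" --rotate="$angle:$range"
}

# Usage (file names are placeholders):
#   rotate_pages scan.pdf fixed.pdf +90 1-3   # turn pages 1-3 by a further 90 degrees
#   rotate_pages scan.pdf fixed.pdf 180 z     # set the last page ("z") to 180 degrees
```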

Compression + Monochrome

  1. Split pages with qpdf --split-pages in.pdf out.pdf
  2. Compress each page with convert -density 300 [INPUT] -monochrome -compress lzw [OUTPUT] (run the jobs with parallel, using seq -w 1 [LAST_PAGE], e.g. seq -w 1 801, to generate the zero-padded page numbers)
  3. Merge the outputs with qpdf --empty --pages *.pdf -- out.pdf (run it where *.pdf matches only the compressed pages)

Alternative to step 2: convert -density 300 out-017.pdf -threshold 90% -type bilevel -compress fax try.pdf
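The three steps can be sketched as a single function. This is only one possible arrangement: the function name, the temporary directory and the page-/mono- file prefixes are assumptions, and a plain loop stands in for parallel to keep the sketch dependency-free.

```shell
# Hypothetical helper: rewrite SRC as a black-and-white, LZW-compressed PDF named DST.
compress_monochrome() {
    local src=$1 dst=$2 tmp page
    tmp=$(mktemp -d) || return 1

    # 1. One file per page: page-1.pdf, page-2.pdf, ... (qpdf zero-pads the numbers).
    qpdf --split-pages "$src" "$tmp/page.pdf" || return 1

    # 2. Rasterize each page at 300 dpi and reduce it to black and white.
    for page in "$tmp"/page-*.pdf; do
        convert -density 300 "$page" -monochrome -compress lzw \
            "$tmp/mono-${page##*/}" || return 1
    done

    # 3. Merge the converted pages; the zero padding keeps the glob in page order.
    qpdf --empty --pages "$tmp"/mono-page-*.pdf -- "$dst" || return 1

    rm -rf "$tmp"
}

# Usage (file names are placeholders):
#   compress_monochrome scan.pdf scan-mono.pdf
```

Working in a temporary directory keeps the merge glob from picking up the original file or an earlier output. For large documents the loop in step 2 is the slow part and is the natural place to use parallel.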

OCR

ocrmypdf -l [LANG] --optimize [1|2|3] [INPUT] [OUTPUT]

Run tesseract --list-langs to list the installed languages.
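A small sketch that combines the two commands: it checks that the language is installed before starting the OCR run. The function name is made up, and the check assumes a single language code (not a combination such as eng+deu).

```shell
# Hypothetical helper: OCR SRC into DST with language LANG and optimization LEVEL.
ocr_pdf() {
    local lang=$1 level=$2 src=$3 dst=$4
    # Stop early if the language is not installed (2>&1 because older
    # tesseract versions print the list on stderr).
    if ! tesseract --list-langs 2>&1 | grep -qx "$lang"; then
        echo "language not installed: $lang" >&2
        return 1
    fi
    ocrmypdf -l "$lang" --optimize "$level" "$src" "$dst"
}

# Usage (file names are placeholders):
#   ocr_pdf eng 2 scan.pdf scan-ocr.pdf
```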