# pdf-slim
Losslessly shrink a PDF. The output renders pixel-for-pixel identically to the
input; any candidate that doesn't is discarded.
## How it works
`pdf-slim` builds several smaller candidates and keeps the smallest one that
passes verification. The input file is never modified.
1. **Structural**: `qpdf` object-stream generation and maximum-level flate
recompression. Always render-safe; a modest win.
2. **mutool-subset**: `mutool clean` garbage-collects, deduplicates, deflates,
and natively subsets fonts.
3. **font-resubset**: for any TrueType `FontFile2` stream still over a size
threshold (the "Word embedded the whole font" case), the font is re-subset
with `pyftsubset` against the document's actual character set. Glyph IDs are
retained (`--retain-gids`) so CID / Identity-H instances sharing the file
stay valid. The stream is spliced back, lengths are fixed with `fix-qdf`, and
the result is recompressed.
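
The three candidate builders above could be driven by command lines roughly like the following. This is a minimal sketch, not the tool's actual code: the function names are hypothetical, the exact flag choices are assumptions, `-S` needs a recent MuPDF, and passing the character set through `--unicodes` is only one plausible reading of "the document's actual character set".

```python
from pathlib import Path


def structural_cmd(src: Path, dst: Path) -> list[str]:
    """qpdf: generate object streams and recompress flate data at level 9."""
    return [
        "qpdf",
        "--object-streams=generate",
        "--recompress-flate",
        "--compression-level=9",
        str(src),
        str(dst),
    ]


def mutool_subset_cmd(src: Path, dst: Path) -> list[str]:
    """mutool clean: -gggg garbage-collects and deduplicates objects,
    -z deflates streams, -S subsets fonts (assumed flag set)."""
    return ["mutool", "clean", "-gggg", "-z", "-S", str(src), str(dst)]


def resubset_cmd(font: Path, out: Path, codepoints: set[int]) -> list[str]:
    """pyftsubset: keep only the given code points, retaining glyph IDs."""
    unicodes = ",".join(f"U+{cp:04X}" for cp in sorted(codepoints))
    return [
        "pyftsubset",
        str(font),
        f"--unicodes={unicodes}",
        "--retain-gids",
        f"--output-file={out}",
    ]
```

Each list can be passed to `subprocess.run(cmd, check=True)`; building the argument lists separately keeps them easy to inspect and log.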

Every candidate is verified by rendering each page of both the original and the
candidate at the same DPI and requiring byte-identical pixels. The smallest
verified candidate wins. If no verified candidate is more than 1% smaller than
the input, the input is copied through unchanged, so repeated runs are stable.
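
The verify-then-select rule can be sketched in a few lines. The names here (`Candidate`, `pages_match`, `pick_winner`) are hypothetical, and the rendered pages are assumed to arrive as one raw pixel buffer per page, however they were produced.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str
    size: int       # size of the candidate file in bytes
    verified: bool  # True only if every page rendered identically


def pages_match(original: list[bytes], candidate: list[bytes]) -> bool:
    """Require the same page count and byte-identical pixels on every page."""
    return len(original) == len(candidate) and all(
        a == b for a, b in zip(original, candidate)
    )


def pick_winner(
    input_size: int, candidates: list[Candidate], min_gain: float = 0.01
) -> Optional[Candidate]:
    """Return the smallest verified candidate, or None if none is verified
    or the best saves no more than min_gain; the caller then copies the
    input through unchanged."""
    verified = [c for c in candidates if c.verified]
    if not verified:
        return None
    best = min(verified, key=lambda c: c.size)
    gain = (input_size - best.size) / input_size
    return best if gain > min_gain else None
```

Because an already-slim file yields no candidate above the threshold, running the tool on its own output returns `None` here and the file passes through untouched.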
## Usage
```
pdf-slim input.pdf [output.pdf] [--dpi 200] [--min-font-bytes 150000]
```
- `input.pdf` — the PDF to shrink.
- `output.pdf` — optional; defaults to `<input>-slim.pdf`. Refuses to overwrite
the input.
- `--dpi` — verification render resolution (default `200`).
- `--min-font-bytes` — re-subset embedded TrueType fonts larger than this
(default `150000`).
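
For illustration, the command line above maps onto `argparse` roughly as follows. This is a sketch under assumptions: `parse_args` is a hypothetical name, and the default output is taken to mean the input's stem plus `-slim.pdf` in the same directory.

```python
import argparse
from pathlib import Path


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="pdf-slim")
    parser.add_argument("input", type=Path, help="the PDF to shrink")
    parser.add_argument("output", type=Path, nargs="?", help="output path")
    parser.add_argument("--dpi", type=int, default=200)
    parser.add_argument("--min-font-bytes", type=int, default=150000)
    args = parser.parse_args(argv)

    # Assumed default: report.pdf -> report-slim.pdf next to the input.
    if args.output is None:
        args.output = args.input.with_name(args.input.stem + "-slim.pdf")

    # Never write over the input file.
    if args.output.resolve() == args.input.resolve():
        parser.error("refusing to overwrite the input")
    return args
```

With this sketch, `parse_args(["report.pdf"])` yields an output path of `report-slim.pdf`, and passing the same path twice exits with an error.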
## Requirements
The following tools must be on `PATH`:
- `mutool` (MuPDF)
- `qpdf` (provides `qpdf` and `fix-qdf`)
- `pyftsubset` (fontTools)
- `python3`
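
A quick pre-flight check for these tools might look like the sketch below; `missing_tools` is a hypothetical helper, not part of `pdf-slim`.

```python
import shutil

REQUIRED = ["mutool", "qpdf", "fix-qdf", "pyftsubset", "python3"]


def missing_tools(names: list[str] = REQUIRED) -> list[str]:
    """Return the tools from names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]
```

An empty list from `missing_tools()` means every dependency was found.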
## Ownership
Owned by SILO GROUP — [www.silogroup.org](https://www.silogroup.org).