PRD: wkhtmltopdf - Chrome Headless Wrapper #2
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
PRD: wkhtmltopdf - Chrome Headless Wrapper
Problem Statement
Odoo and other systems depend on
wkhtmltopdffor HTML-to-PDF conversion. The original wkhtmltopdf is based on a deprecated Qt WebKit engine, producing inferior PDFs and receiving no active development. Users want a drop-in replacement binary calledwkhtmltopdfthat accepts the same CLI arguments but delegates to headless Google Chrome for superior rendering, all built with Python 3 stdlib only.Solution
Build
wkhtmltopdfas a Python 3 CLI binary (stdlib only, zero external dependencies) that:User Stories
--page-size A4to produce an A4 PDF so that my reports match expected dimensions.--page-size Letterto produce a Letter-sized PDF so that US reports work correctly.--orientation landscapeto rotate the page so that wide tables fit.--margin-top 20to set a 20mm top margin so that my content doesn't touch the edge.--margin-bottom,--margin-left,--margin-rightto work independently so I can control all margins.--page-widthand--page-heightto set custom page dimensions in mm so I can produce non-standard sizes.--dpi 300to produce a high-resolution PDF so that images look sharp.--disable-local-file-accessto block local file references so that untrusted HTML can't read my filesystem.--enable-local-file-accessto allow local file references so that embedded images work.--header-html file.htmlto inject a header on every page so that I get consistent document headers.--footer-html file.htmlto inject a footer on every page so that I get page numbers and footers.--header-spacingto control gap between header and content so layout looks correct.--header-lineto draw a line under the header so it visually separates from content.--cookie-jar cookies.txtto pass cookies to Chrome so that authenticated pages render correctly.--javascript-delay 5000to wait 5 seconds before printing so that JS-heavy pages finish rendering.--quietto suppress all non-error output so that logs stay clean.--zoom 1.5to scale the page content so that small text becomes readable.--disable-smart-shrinkingto prevent Chrome from auto-scaling so my CSS controls the layout exactly.WKHTMLTOPDF_CHROME_PATHenv var to override auto-detection so I can pin a specific Chrome binary.WKHTMLTOPDF_TIMEOUTenv var to change the default timeout so long-running renders don't get killed.~/.wkhtmltopdf.jsonfor persistent settings so I don't repeat env vars.--chrome-pathCLI flag to specify Chrome location so I can override per-invocation.--timeoutCLI flag to set max wait time so I can control rendering timeout per call.--print-media-typeto use print CSS media queries so that print-specific styles apply.--viewport-size WIDTHxHEIGHTto set the browser viewport so that responsive layouts render at the right width.--extra-chromium-argsto pass additional Chrome flags so I can access experimental features.--no-sandboxso that containerized deployments work.--versionto print the wrapper version so I can verify the installed version.--helpto print usage information so I can discover available options.Implementation Decisions
wkhtmltopdf(shebang#!/usr/bin/env python3).google-chrome,google-chrome-stable,chromium-browser,chromium.--headless=new --disable-gpu --no-sandbox --disable-dev-shm-usage.--print-to-pdfuses inches. Convert mm → inches (mm / 25.4).--print-to-pdfpaper size.--cookies-file, clean up temp file after.--header-templateand--footer-template(or--print-to-pdf-header-template/--print-to-pdf-footer-template) with HTML content.--virtual-time-budget(milliseconds) or--timeoutfor wait.transform: scale()wrapper or--device-scale-factor.--window-size=WIDTH,HEIGHT.--print-to-pdfimplies print media; use--run-all-compositor-stages-before-drawfor full rendering.argparse(stdlib) for CLI parsingsubprocess(stdlib) for Chrome invocationos,sys,tempfile,json,pathlibfor utilitiesTesting Decisions
unittest(stdlib only).Out of Scope
--web-spacing(CSS zoom via wkhtmltopdf-specific spacing).--toc).--cover).--grayscale).--outline,--outline-depth).Further Notes
wkhtmltopdfshould work without modification.--headless=newmode (Chrome 112+) provides the best rendering fidelity.--print-to-pdfflag in Chrome is the core mechanism; all wkhtmltopdf flags map to variations of this.