Inspiration
Home and property inspectors spend hours massaging findings into the Texas Real Estate Commission (TREC) report format: copy/pasting text, wrangling photos, linking videos, and triple-checking checkboxes and margins. We wanted a tool that relies on the data rather than manual formatting and generates a polished, compliant PDF directly from the inspection JSON.
What we built
InspectFlow consumes structured JSON (sections -> line items -> comments/media) and outputs a TREC-ready PDF that:
- Mirrors the structure exactly: I. Section -> A. Line Item -> Comments
- Renders I / NI / NP / D as checkboxes (auto-crossed from inspectionStatus; all four boxes are crossed if a line item has no comments)
- Embeds images (scaled to fit) and shows clickable video badges that open URLs
- Applies consistent typography, spacing, and 1″ margins; headers show the property address, footers include the official TREC line
- Prepends the official first two TREC pages, fills Page 1 (client, date, address, inspector), and fills Page 2 (Additional Info + “Page 2 of __” total) using the actual field rectangles for pixel-perfect placement
- Adds page numbers (“Page X of Y”) centered above the footer, starting on page 3
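The checkbox rule above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the `comments` key, the dict shape `{"I": True, ...}`, and the helper names `normalize_status` and `boxes_to_cross` are assumptions made for the example; only `inspectionStatus` and the four status codes come from the write-up.

```python
from __future__ import annotations

from typing import Any

# Checkbox order as printed on the report.
STATUS_CODES = ("I", "NI", "NP", "D")


def normalize_status(raw: Any) -> set[str]:
    """Reduce a raw inspectionStatus value to a set of known status codes."""
    if isinstance(raw, dict):
        # Assumed dict shape: {"I": True, "D": False, ...}
        return {
            str(key).upper()
            for key, flag in raw.items()
            if flag and str(key).upper() in STATUS_CODES
        }
    if isinstance(raw, str):
        # Accept "D", "ni, d", "I/D" and similar spellings.
        tokens = raw.replace(",", " ").replace("/", " ").upper().split()
        return {token for token in tokens if token in STATUS_CODES}
    return set()


def boxes_to_cross(line_item: dict) -> list[bool]:
    """Return one flag per checkbox, in STATUS_CODES order."""
    if not line_item.get("comments"):
        # Cross-all fallback: a line item without comments is flagged loudly.
        return [True] * len(STATUS_CODES)
    crossed = normalize_status(line_item.get("inspectionStatus"))
    return [code in crossed for code in STATUS_CODES]
```

Returning a plain list of booleans keeps the drawing code simple: a custom flowable only has to walk the list and cross the matching boxes.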
How we built it
- ReportLab (Platypus) drives layout: sections, line items, comments, and media are flowables inside a width-clamped frame to guarantee margins.
- Custom flowables draw the 4 status boxes and a video play badge with a hyperlink.
- The header and footer are drawn via an onPage callback; page numbers use a custom Canvas that post-processes pages to avoid double emission.
- TREC P1/P2 are prepended with pypdf. Page 1 is filled via overlaid text; Page 2 uses the widgets’ /Rect to place text precisely inside the real form fields, with no guesswork.
- Atomic writes ensure the final PDF overwrites cleanly (no accidental append).
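The atomic-write step can be illustrated with the standard library alone. The sketch below assumes the merged PDF is already available as bytes; the function name `write_atomically` is hypothetical, and the real pipeline may stream the pypdf output to the temporary file instead.

```python
import os
import tempfile
from pathlib import Path


def write_atomically(data: bytes, final_path: str) -> None:
    """Write data to a temp file, then swap it into place in one step."""
    final = Path(final_path)
    # The temp file lives in the target directory so os.replace stays on
    # one filesystem, which is what makes the swap atomic.
    fd, tmp_name = tempfile.mkstemp(dir=str(final.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Replaces any existing file; nothing is ever appended to it.
        os.replace(tmp_name, final)
    except BaseException:
        # Clean up the partial temp file and leave the old final untouched.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
```

Because the output is written to a separate path first, the previous final PDF is never opened for writing, so a crashed run leaves the last good report in place.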
Challenges we faced (and how we solved them)
- PDF field appearances: Some viewers ignore filled values unless /NeedAppearances is set or the appearance streams are regenerated. We sidestepped brittle DA fiddling by drawing text overlays inside the field rectangles for Page 2 totals.
- Duplicate pages after numbering: Our first NumberedCanvas called showPage() twice (during pagination and during save), doubling body pages (48 → 96). Fix: save state in showPage() and emit pages only once in save().
- Margins drifting with media: Images/videos could spill past the frame at deep indents. We wrap media in a 2-column table (indent_cell) that hard-caps content width at the frame.
- Status fidelity: inspectionStatus varies (strings, dicts). We wrote a normalizer and added a cross-all fallback when comments are missing to make gaps explicit.
- Date formats: JSON dates arrived as epoch ms or strings. We built a tolerant parser to render consistent MM/DD/YYYY.
- “Appending to itself” final PDF: We guarded the merge step with distinct input/output paths and an atomic os.replace to guarantee a fresh final PDF on every run.
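A tolerant date parser along the lines described above might look like the following. It is a sketch under stated assumptions: epoch values are treated as milliseconds in UTC, the list of fallback string formats is a guess at what the JSON might contain, and `format_report_date` is a hypothetical name.

```python
from datetime import datetime, timezone

OUTPUT_FORMAT = "%m/%d/%Y"
# Assumed string formats; extend to match the real inspection data.
FALLBACK_FORMATS = ("%m/%d/%Y", "%m-%d-%Y", "%B %d, %Y")


def _format_date_string(text: str) -> str:
    """Try ISO 8601 first, then the fallback formats."""
    try:
        # Older Python versions do not accept a trailing "Z".
        parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
        return parsed.strftime(OUTPUT_FORMAT)
    except ValueError:
        pass
    for fmt in FALLBACK_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime(OUTPUT_FORMAT)
        except ValueError:
            continue
    # Unrecognized input is passed through rather than silently dropped.
    return text


def format_report_date(value) -> str:
    """Render epoch milliseconds or a date string as MM/DD/YYYY."""
    if value is None:
        return ""
    if isinstance(value, str):
        text = value.strip()
        if not text:
            return ""
        if not text.isdigit():
            return _format_date_string(text)
        value = int(text)  # digits only: treat as epoch milliseconds
    if isinstance(value, (int, float)):
        # Assumption: epoch values are milliseconds, interpreted in UTC.
        moment = datetime.fromtimestamp(value / 1000, tz=timezone.utc)
        return moment.strftime(OUTPUT_FORMAT)
    return str(value)
```

Passing unrecognized strings through unchanged is a deliberate choice in this sketch: a strange date on the report is easier to spot and fix than a blank field.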
What we learned
The PDF stack (ReportLab + pypdf) is powerful but opinionated: appearance streams, AcroForm nuances, and canvas lifecycles all matter. Robust document generation is mostly about defensive layout: clamp widths, normalize inputs, and keep a single source of truth for spacing. For product polish, a few small touches (legends, fallback copy, and page numbers centered above the footer) dramatically improve readability.