Installation

  • Follow the instructions for installing PDFPlumber here
  • Install the other requirements with pip3 install -r requirements.txt

How it works under the hood

1. Detect all text in the document

2. Merge the detected text into lines

3. Detect key-value pairs based on the text detection results and document layout analysis

4. Detect more complicated structures such as multiple-choice checkboxes and tabular data
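
As a rough illustration of steps 2 and 3, the sketch below groups word boxes into lines by vertical position and then splits each line into a key and a value. It is not this project's actual implementation: the function names, the y_tolerance parameter, and the colon-separator heuristic are all assumptions made for the example. The input is assumed to be a list of dicts carrying "text", "x0", and "top" keys, which is the shape PDFPlumber's page.extract_words() returns.

```python
def merge_words_into_lines(words, y_tolerance=3.0):
    """Group word boxes into text lines (hypothetical helper).

    Words whose `top` coordinate lies within `y_tolerance` of the first
    word of the current line are treated as part of that line.
    """
    lines = []
    # Visit words top to bottom, then left to right.
    for word in sorted(words, key=lambda w: (w["top"], w["x0"])):
        if lines and abs(word["top"] - lines[-1][0]["top"]) <= y_tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])
    # Within each line, order words left to right and join their text.
    return [
        " ".join(w["text"] for w in sorted(line, key=lambda w: w["x0"]))
        for line in lines
    ]


def split_key_value(line, separator=":"):
    """Split 'Key: value' into a (key, value) tuple (assumed heuristic).

    Returns None when the line has no separator or an empty key.
    """
    key, sep, value = line.partition(separator)
    if not sep or not key.strip():
        return None
    return key.strip(), value.strip()


# Synthetic word boxes standing in for real detection output.
sample_words = [
    {"text": "Name:", "x0": 10, "top": 100.0},
    {"text": "Doe", "x0": 90, "top": 100.5},
    {"text": "John", "x0": 50, "top": 101.0},
    {"text": "Invoice", "x0": 10, "top": 50.0},
]

sample_lines = merge_words_into_lines(sample_words)
sample_pairs = [split_key_value(line) for line in sample_lines]
```

A real document would need more than a separator check, since the steps above also rely on layout analysis; the colon rule only shows where such logic would plug in.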


Updates