2. Rule-based Data Extraction
The rule-based data extraction approach requires one document template to be created for each PDF format.
Jiffy follows a two-phase approach for rule-based data extraction.
- Document Template creation: Refer to Repository -> Document Templates for detailed steps on how to prepare and upload a document template.
- Task Design: Create a task under Jiffy Core (Task Design – Task) and drag and drop a PDF node. Then set the configuration under the Properties tab as follows:
  - Select a PDF configuration from the drop-down. Refer to Task Design -> Configuration for setting up a Document configuration.
  - Change Template Type to “Fixed Template”.
  - Select a PDF Template tag (refer to Repository -> Document Templates for detailed steps on how to add a tag to a document template).
  - For multi-page PDFs, if the PDF needs to be split, turn on the “Split for template” option. The PDF split logic is as follows:
    - If no page number is available, each page is split into a separate PDF.
    - If the page number is available as “Page M of N”, the PDF is split based on the page numbers.
    - In all other scenarios, the split has no effect.
  - Select the PDF location from the mapping section.
  - Click on trial run, then click on the output section to view the PDF output.
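The “Split for template” logic described in the steps above can be illustrated with a short Python sketch. This is not Jiffy's implementation. It is one possible reading of the three rules, and it rests on these assumptions: the text of each page is already available as a string, a new document starts wherever a page reads “Page 1 of N”, and "all other scenarios" means that only some pages carry a “Page M of N” marker. The names `find_marker` and `plan_split` are hypothetical and exist only for this example.

```python
import re
from typing import List, Optional, Tuple

# Assumed marker format, taken from the "Page M of N" wording in the steps.
PAGE_MARKER = re.compile(r"Page\s+(\d+)\s+of\s+(\d+)", re.IGNORECASE)


def find_marker(text: str) -> Optional[Tuple[int, int]]:
    """Return (M, N) if the page text contains "Page M of N", else None."""
    match = PAGE_MARKER.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def plan_split(page_texts: List[str]) -> List[List[int]]:
    """Group zero-based page indices into the PDFs a split would produce."""
    indices = list(range(len(page_texts)))
    if not indices:
        return []
    markers = [find_marker(text) for text in page_texts]

    # Rule 1: no page numbers anywhere, so every page becomes its own PDF.
    if all(marker is None for marker in markers):
        return [[i] for i in indices]

    # Rule 2: every page has "Page M of N", so a new PDF starts at M == 1
    # (assumption about what "split based on the page numbers" means).
    if all(marker is not None for marker in markers):
        groups: List[List[int]] = []
        for i, (page_no, _total) in zip(indices, markers):
            if page_no == 1 or not groups:
                groups.append([i])
            else:
                groups[-1].append(i)
        return groups

    # Rule 3: anything else (here: mixed pages), the split has no effect.
    return [indices]
```

For example, a five-page file whose pages read “Page 1 of 2”, “Page 2 of 2”, “Page 1 of 3”, “Page 2 of 3”, “Page 3 of 3” would be planned as two PDFs, one with the first two pages and one with the last three.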
The video below demonstrates the Intelligent Document Processing PDF extraction process.