Your agent pipeline handles JSON beautifully. It routes events, processes queues, sends notifications. Then someone uploads a PDF invoice and the entire thing stops.
Real-world business processes do not start with clean API calls. They start with documents: signed contracts, insurance claims, tax forms, purchase orders, compliance questionnaires. The gap between "we can process structured data" and "we can process what customers actually send us" is where most agent pipelines stall.
The Document Gap
Most agent pipelines assume structured inputs. A webhook fires with JSON. An event arrives on a topic with a typed schema. A queue message contains the exact fields the next skill expects. This works when machines talk to machines.
But the moment a human enters the loop — uploading a form, attaching a document, forwarding an email with a PDF — the pipeline needs to convert unstructured content into structured data before anything else can happen. Today, most teams handle this with custom scripts, one-off regex parsers, or manual data entry. None of these scale.
Three Document Processing Primitives
The document processing layer for agent pipelines breaks down into three concerns:
Form field extraction. A scanned PDF or image contains fields — names, dates, amounts, checkboxes, addresses — but they are pixels, not data. Extraction means identifying field boundaries, reading their values with OCR, and mapping them to a typed schema. The output is structured JSON that downstream skills can consume directly. Multi-page support, table extraction, and handwriting recognition cover the range from clean digital forms to messy scanned originals.
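To make "mapping them to a typed schema" concrete, here is a minimal sketch of the step that turns raw OCR strings into typed values and then into JSON. The `InvoiceFields` schema, the field names, the date format, and the "checked"/"unchecked" checkbox convention are all assumptions for illustration. They are not the output format of any particular extractor.

```python
from dataclasses import dataclass, asdict
from datetime import date, datetime
from decimal import Decimal
import json


@dataclass
class InvoiceFields:
    # Hypothetical target schema; field names are illustrative only.
    vendor_name: str
    invoice_date: date
    total_amount: Decimal
    paid: bool


def parse_raw_fields(raw: dict[str, str]) -> InvoiceFields:
    """Convert raw OCR strings into typed values."""
    amount_text = raw["total_amount"].replace("$", "").replace(",", "").strip()
    return InvoiceFields(
        vendor_name=raw["vendor_name"].strip(),
        # Assumes ISO dates; real forms usually need several formats.
        invoice_date=datetime.strptime(raw["invoice_date"].strip(), "%Y-%m-%d").date(),
        total_amount=Decimal(amount_text),
        # Assumes the OCR step reports checkboxes as "checked"/"unchecked".
        paid=raw["paid"].strip().lower() == "checked",
    )


def to_json(fields: InvoiceFields) -> str:
    """Serialize the typed record to JSON for downstream consumers."""
    record = asdict(fields)
    record["invoice_date"] = fields.invoice_date.isoformat()
    record["total_amount"] = str(fields.total_amount)
    return json.dumps(record, sort_keys=True)


raw_ocr = {
    "vendor_name": " Acme Supplies ",
    "invoice_date": "2024-03-15",
    "total_amount": "$1,240.50",
    "paid": "Checked",
}
fields = parse_raw_fields(raw_ocr)
print(to_json(fields))
```

Parsing into `Decimal` and `date` rather than leaving everything as strings means a malformed amount or date fails at the intake boundary instead of deep inside a downstream skill.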
Document classification. Before you extract fields, you need to know what you are looking at. Is this an invoice, a contract, a tax form, or a receipt? Classification routes documents to the correct extraction template. Without it, you either build a single monolithic extractor that handles every document type (brittle) or require users to manually label what they upload (slow). A classifier examines layout, keywords, and structural patterns to assign a document type with a confidence score.
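The routing step can be sketched as a lookup guarded by a confidence threshold. The template names, the `Classification` shape, and the 0.85 cutoff below are hypothetical. A real threshold should be tuned against labeled documents.

```python
from dataclasses import dataclass

# Hypothetical mapping from document type to extraction template name.
TEMPLATES = {
    "invoice": "invoice_v1",
    "contract": "contract_v1",
    "tax_form": "tax_form_v1",
    "receipt": "receipt_v1",
}

# Assumed cutoff; tune it against your own labeled documents.
MIN_CONFIDENCE = 0.85


@dataclass
class Classification:
    doc_type: str
    confidence: float


def route(result: Classification) -> str:
    """Return an extraction template name, or 'manual_review' as a fallback."""
    template = TEMPLATES.get(result.doc_type)
    # Unknown types and low-confidence predictions both go to a human.
    if template is None or result.confidence < MIN_CONFIDENCE:
        return "manual_review"
    return template


print(route(Classification("invoice", 0.97)))
print(route(Classification("invoice", 0.40)))
```

Sending low-confidence results to manual review keeps a misclassified document from being run through the wrong extraction template.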
Signature verification. Documents with signatures carry legal weight, but verifying those signatures programmatically is hard. Is the signature present? Does it match a known reference? Was the document altered after signing? Signature verification compares ink signatures against reference samples, validates digital signatures, and flags documents where the signature region shows evidence of tampering.
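The three questions above (present, matching, unaltered) can be folded into a single verdict that downstream steps branch on. This is a sketch only: the `SignatureCheck` shape, the verdict names, and the 0.9 similarity cutoff are assumptions, not the response format of any specific verifier.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    MISSING = "missing"
    TAMPERED = "tampered"
    MISMATCH = "mismatch"
    UNVERIFIED = "unverified"  # signature present, but no reference on file
    VERIFIED = "verified"


@dataclass
class SignatureCheck:
    # Hypothetical result shape for illustration.
    present: bool
    tamper_detected: bool
    similarity: Optional[float]  # 0.0 to 1.0 against the reference, None if no reference


def decide(check: SignatureCheck, min_similarity: float = 0.9) -> Verdict:
    """Combine the three checks, most severe condition first."""
    if not check.present:
        return Verdict.MISSING
    if check.tamper_detected:
        return Verdict.TAMPERED
    if check.similarity is None:
        return Verdict.UNVERIFIED
    if check.similarity < min_similarity:
        return Verdict.MISMATCH
    return Verdict.VERIFIED


print(decide(SignatureCheck(present=True, tamper_detected=False, similarity=0.96)))
```

Checking for tampering before similarity is a deliberate ordering: a signature that matches its reference should still be rejected if the surrounding document was modified.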
Cost Analysis
| Scenario | In-House | FormCraft.io Skills |
|---|---|---|
| Form extraction (500 forms/day) | 3-4 weeks of engineering time plus OCR API costs | $1.50/day ($0.003/form) |
| Document classification (500 docs/day) | Custom ML model plus training data | $0.50/day ($0.001/doc) |
| Signature verification (200 docs/day) | Weeks of computer vision engineering | $0.80/day ($0.004/doc) |
| Full pipeline (500 docs/day, 200 with signature verification) | Months of engineering | $2.80/day |
At $2.80/day, or about $84 over a 30-day month, a complete document intake pipeline costs about as much per month as a single hour of engineering time.
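The daily figure is per-call price times daily volume, summed across the three skills. The short calculation below reproduces it from the numbers in the table. The 30-day month is an assumption used only for the monthly estimate.

```python
from decimal import Decimal

# Per-call prices and daily volumes taken from the cost table.
PRICE_PER_CALL = {
    "extraction": Decimal("0.003"),
    "classification": Decimal("0.001"),
    "signature_verification": Decimal("0.004"),
}
DAILY_VOLUME = {
    "extraction": 500,
    "classification": 500,
    "signature_verification": 200,
}

daily_cost = {
    skill: PRICE_PER_CALL[skill] * DAILY_VOLUME[skill] for skill in PRICE_PER_CALL
}
daily_total = sum(daily_cost.values(), Decimal("0"))
monthly_total = daily_total * 30  # assumes a 30-day month

print(f"${daily_total:.2f} per day")              # $2.80 per day
print(f"${monthly_total:.2f} per 30-day month")   # $84.00 per 30-day month
```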
Why Skills Beat Document Processing Libraries
Document processing libraries exist. Tesseract handles OCR. OpenCV supplies the image-processing primitives for locating signatures. spaCy extracts entities from text. But stitching these together into a production pipeline means:
- Dependency management. Tesseract alone requires system-level binary installation, language packs, and version pinning across environments.
- Model hosting. Classification models often need GPU inference in production, which adds a whole infrastructure layer.
- Version coupling. When your OCR library updates its output format, every downstream parser breaks simultaneously.
- Cost isolation. You cannot attribute document processing costs to specific pipelines when everything shares a monolithic processing service.
Skills isolate each concern behind a pay-per-call API. No dependencies to install. No models to host. No versions to pin. Each skill scales independently, costs are attributable per-call, and upgrades happen on the publisher side without touching your pipeline code.
Composability
Form processing skills compose naturally with the existing BluePages ecosystem:
- Form extraction → PII anonymization (DataLens.ai): Extract form fields, then automatically redact personal data before downstream processing.
- Document classification → policy gate (ComplianceKit): Route classified documents through compliance checks before human review.
- Signature verification → audit trail (ComplianceKit): Log verified signatures as immutable audit entries for SOC 2 evidence.
- Form extraction → notification (NotifyHub.dev): Extract a claim amount from an insurance form and notify the adjuster via Slack with the structured data.
Each composition is a single pipeline definition in the BluePages compose API. No glue code, no orchestration logic, no deployment.
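The compose API's definition format is not shown here, so the following only models the data flow of one such chain locally, using plain Python functions as stand-ins for skills. Every name in it (`compose`, `redact_pii`, `add_summary`, the `PII_FIELDS` set, the record fields) is hypothetical and is not part of BluePages.

```python
from functools import reduce
from typing import Callable

Record = dict[str, str]
Step = Callable[[Record], Record]

# Assumed list of fields treated as personal data.
PII_FIELDS = {"name", "ssn", "address"}


def compose(*steps: Step) -> Step:
    """Chain steps left to right so each receives the previous step's output."""
    def pipeline(record: Record) -> Record:
        return reduce(lambda acc, step: step(acc), steps, record)
    return pipeline


def redact_pii(record: Record) -> Record:
    """Stand-in for an anonymization skill: mask values of known PII fields."""
    return {k: ("[REDACTED]" if k in PII_FIELDS else v) for k, v in record.items()}


def add_summary(record: Record) -> Record:
    """Stand-in for a notification skill: build the message to send."""
    summary = f"Claim {record['claim_id']} for {record['claim_amount']}"
    return {**record, "summary": summary}


# Input stands in for the output of a form extraction step.
extracted = {"claim_id": "C-1001", "name": "Jane Doe", "claim_amount": "1240.50"}
intake = compose(redact_pii, add_summary)
result = intake(extracted)
print(result)
```

Each step takes and returns the same record shape, which is what lets steps be reordered or swapped without changing the others.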
Introducing FormCraft.io
FormCraft.io launches today on BluePages with three skills:
- Form Field Extractor ($0.003/call) — Multi-format form extraction with OCR, table detection, checkbox recognition, and handwriting support. Outputs typed JSON matching your target schema.
- Document Classifier ($0.001/call) — Layout-aware document classification across 30+ standard document types with custom type training support. Returns type, confidence, and suggested extraction template.
- Signature Verifier ($0.004/call) — Ink and digital signature detection, reference comparison with similarity scoring, tamper detection, and certificate chain validation for digital signatures.
All three skills are live now. Try them in the sandbox or compose them into a document intake pipeline.