PRODUCT UPDATE

Introducing Parse Jobs API: Process 1000+ Page Documents

By Product Team | November 5, 2024 | 4 min read

Today we're excited to announce the Parse Jobs API — our solution for processing massive documents asynchronously. Upload a 2,000-page mortgage packet or medical record bundle, and we'll process it in the background and notify you when complete.

🚀 What's New

  • Process documents up to 10,000 pages
  • Asynchronous processing with webhooks
  • Batch upload multiple files in one job
  • Cost-optimized for large documents (30% cheaper than sync API)

Why We Built This

Our original Parse API works great for individual documents (invoices, forms, reports), but customers kept asking: "What about our 500-page mortgage packets?" or "Can we process an entire box of medical records at once?"

The synchronous API isn't ideal for these use cases:

  • Long-running requests risk timeouts
  • The client has to keep a connection open for minutes or even hours
  • Retrying a failed 1000-page job means reprocessing everything

How It Works

Step 1: Create a Job

POST /v1/parse/jobs
Content-Type: application/json

{
  "files": [
    "https://your-bucket.s3.amazonaws.com/mortgage-packet.pdf"
  ],
  "mode": "parse",
  "webhook_url": "https://your-app.com/webhooks/retriv",
  "metadata": {
    "loan_id": "L-2024-00123",
    "customer": "John Doe"
  }
}

Returns a job_id immediately. Processing starts in the background.
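As a rough illustration, the request above could be assembled from Python with only the standard library. This is a sketch under stated assumptions, not official client code: the host `api.retriv.ai` is taken from the `result_url` in the webhook example, while the bearer-token `Authorization` header and the helper name `build_create_job_request` are assumptions made for illustration.

```python
import json
import urllib.request

# Host taken from the result_url shown in the webhook example.
API_BASE = "https://api.retriv.ai"


def build_create_job_request(files, webhook_url, metadata=None, api_key="YOUR_API_KEY"):
    """Build (but do not send) the POST /v1/parse/jobs request."""
    payload = {
        "files": list(files),
        "mode": "parse",
        "webhook_url": webhook_url,
        "metadata": metadata or {},
    }
    return urllib.request.Request(
        url=f"{API_BASE}/v1/parse/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth. Check your account's auth scheme.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_create_job_request(
    files=["https://your-bucket.s3.amazonaws.com/mortgage-packet.pdf"],
    webhook_url="https://your-app.com/webhooks/retriv",
    metadata={"loan_id": "L-2024-00123", "customer": "John Doe"},
)
# To send it: urllib.request.urlopen(request), then read job_id from the JSON response.
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.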

Step 2: We Process in the Background

Your document is broken into chunks and processed in parallel across our GPU cluster. Depending on size and complexity, this can take anywhere from 30 seconds to 2 hours.

Step 3: Get Notified via Webhook

POST https://your-app.com/webhooks/retriv
Content-Type: application/json

{
  "event": "job.completed",
  "job_id": "job_abc123",
  "status": "completed",
  "result_url": "https://api.retriv.ai/v1/parse/jobs/job_abc123/result",
  "metadata": {
    "loan_id": "L-2024-00123",
    "customer": "John Doe"
  },
  "stats": {
    "total_pages": 847,
    "chunks_created": 423,
    "processing_time_seconds": 142
  }
}
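On the receiving side, a webhook handler mostly needs to parse the payload and pick out the job ID and result URL. The following is a minimal sketch that assumes the payload shape shown above; the function name `handle_webhook` is hypothetical, and only the `job.completed` event is handled because it is the only event name shown in this post.

```python
import json


def handle_webhook(raw_body: bytes):
    """Return (job_id, result_url) for a completed job, or None for any other event."""
    event = json.loads(raw_body)
    if event.get("event") != "job.completed":
        return None  # other event names are not documented in this post
    return event["job_id"], event["result_url"]


# Example using the payload from the post, encoded the way a web framework
# would hand over the request body.
sample_body = json.dumps({
    "event": "job.completed",
    "job_id": "job_abc123",
    "status": "completed",
    "result_url": "https://api.retriv.ai/v1/parse/jobs/job_abc123/result",
    "metadata": {"loan_id": "L-2024-00123", "customer": "John Doe"},
    "stats": {"total_pages": 847, "chunks_created": 423, "processing_time_seconds": 142},
}).encode("utf-8")

completed = handle_webhook(sample_body)
```

In a real service this function would sit behind whatever web framework receives the POST, and the returned `result_url` would typically be queued for download rather than fetched inside the request handler.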

Step 4: Download Results

GET /v1/parse/jobs/job_abc123/result

Returns the same structure as the sync Parse API:
{
  "chunks": [...],
  "metadata": {...},
  "pages": [...]
}
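Downloading and inspecting the result might look like the sketch below. It assumes the top-level `chunks` and `pages` fields are lists, as the structure above suggests, and that the same bearer-token header is accepted (an assumption). The names `fetch_result` and `summarize` are hypothetical, and the `opener` parameter exists only so the example can run offline against a canned response.

```python
import io
import json
import urllib.request


def fetch_result(result_url, api_key="YOUR_API_KEY", opener=urllib.request.urlopen):
    """GET the result document and decode it from JSON."""
    request = urllib.request.Request(
        result_url,
        # Assumption: bearer-token auth, as in the job-creation sketch.
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with opener(request) as response:
        return json.load(response)


def summarize(result):
    """Count the chunks and pages in a result document."""
    return {"chunks": len(result["chunks"]), "pages": len(result["pages"])}


# Offline demo: a stand-in opener that returns a canned response body,
# so the sketch runs without network access.
def fake_opener(request):
    return io.BytesIO(b'{"chunks": [{}, {}], "metadata": {}, "pages": [{}]}')


result = fetch_result(
    "https://api.retriv.ai/v1/parse/jobs/job_abc123/result",
    opener=fake_opener,
)
summary = summarize(result)
```

Dropping the `opener` argument makes `fetch_result` use `urllib.request.urlopen` and hit the real endpoint.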

Use Cases

🏦 Mortgage Processing

Process entire loan packages: W-2s, pay stubs, bank statements, tax returns, appraisals, title reports — all in one job.

🏥 Medical Record Extraction

Parse years of patient records: lab results, clinical notes, imaging reports, medication lists, discharge summaries.

⚖️ Legal Discovery

Process thousands of pages of discovery documents, contracts, depositions, and correspondence for e-discovery.

📊 Financial Due Diligence

Analyze entire data rooms during M&A: financial statements, contracts, HR records, intellectual property filings.

Pricing

Parse Jobs are 30% cheaper than the synchronous Parse API:

  • Sync Parse API: 10 credits / page
  • Async Parse Jobs API: 7 credits / page

A 1,000-page document costs 7,000 credits with Parse Jobs, versus 10,000 with the sync API.
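The arithmetic can be checked with a few lines of Python. The per-page rates come from the pricing above; the function name `credits_for` is just for illustration.

```python
SYNC_CREDITS_PER_PAGE = 10   # synchronous Parse API
ASYNC_CREDITS_PER_PAGE = 7   # Parse Jobs API


def credits_for(pages, use_jobs_api=True):
    """Credits charged for a document with the given number of pages."""
    rate = ASYNC_CREDITS_PER_PAGE if use_jobs_api else SYNC_CREDITS_PER_PAGE
    return pages * rate


async_cost = credits_for(1000)                      # 7000
sync_cost = credits_for(1000, use_jobs_api=False)   # 10000
# Integer arithmetic avoids floating-point rounding in the percentage.
savings_percent = (sync_cost - async_cost) * 100 // sync_cost  # 30
```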

What's Next

This is just v1 of Parse Jobs. On our roadmap:

  • Priority queue: Pay more to jump the line for time-sensitive documents
  • Progress callbacks: Get updates every N pages processed
  • Scheduled jobs: Process documents at off-peak hours for lower cost
  • Batch extraction: Apply the same schema to multiple documents in one job

Try Parse Jobs Today

Available now for all Pro and Enterprise customers. Free tier gets 5 jobs/month.

Retriv.ai Product Team
Building the future of document AI