
What's New in StyleOCR 0.1.0

Preview build focused on on-device OCR, VLM layout understanding, and local automation hooks.

Highlights

StyleOCR 0.1.0 is an early desktop preview of the OCR toolkit. It batch-converts images and PDFs into Word, Markdown, plain text, or Excel, with optional vision-language models (VLMs) for tables, mixed layouts, and richer extraction.

Recognition stack

  • Fast / Pro / VLM modes — trade speed for accuracy or layout-aware understanding.
  • Dual model tracks — classic OCR weights for plain text plus VLM bundles for structure-heavy pages (capabilities such as tables, formulas, seals, or handwriting depend on the installed pack).
  • Style extraction, image retention, and PDF page-break options — tune exports per task.

Structured extraction & Agent

  • Information extraction presets (for example invoices, passports, ID cards) and custom modes with user-defined fields.
  • Agent flows for recognition, translation, extraction, and summarization, with streaming output for review before export.

Automation & integrations

  • MCP server — opt-in localhost endpoint for compatible AI assistants.
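As a rough illustration of what a compatible client sends first, the sketch below builds the Model Context Protocol `initialize` request (a JSON-RPC 2.0 message) and checks that a target URL is a loopback address. The endpoint URL, port, path, and client name are hypothetical placeholders, not StyleOCR's documented values, and the protocol version string is one published MCP revision that the server may or may not negotiate. Nothing is sent over the network.

```python
import ipaddress
import json
from urllib.parse import urlparse

# Hypothetical placeholder: the real port, path, and transport are not stated in these notes.
HYPOTHETICAL_ENDPOINT = "http://127.0.0.1:8765/mcp"


def is_loopback(url: str) -> bool:
    """Return True if the URL's host is localhost or a loopback IP address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Host is a non-local DNS name, not an IP literal.
        return False


def build_initialize_request(request_id: int, client_name: str, client_version: str) -> str:
    """Serialize an MCP `initialize` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            # Assumed protocol revision; the server may answer with a different one.
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": client_version},
        },
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Refuse to build a request for anything that is not a local endpoint.
    if is_loopback(HYPOTHETICAL_ENDPOINT):
        print(build_initialize_request(1, "demo-client", "0.0.1"))
```

The loopback check reflects the localhost-only design described above: a client written this way declines to talk to any non-local address.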

Productivity

  • Task history with search, background runs, and stop/resume patterns for long batches (exact behavior varies by platform build).
  • Notifications for task completion, updates, and MCP client connect events.

Coming next

Continued polish on layout fidelity, model packaging, and cross-platform parity. Watch this page and the in-app release notes for the next release.


StyleOCR keeps recognition on your device; remote calls are for models, updates, and storefront flows you explicitly trigger—not for uploading every page to a hosted OCR API.