What's New in StyleOCR 0.1.0

Preview build focused on on-device OCR, VLM layout understanding, and local automation hooks.

Highlights

StyleOCR 0.1.0 is an early desktop preview of the OCR toolkit. It batch-converts images and PDFs to Word, Markdown, plain text, or Excel, with optional vision-language models for tables, mixed layouts, and richer extraction.

Recognition stack

Fast / Pro / VLM modes — trade speed for accuracy or layout-aware understanding.
Dual model tracks — classic OCR weights for plain text plus VLM bundles for structure-heavy pages (capabilities such as tables, formulas, seals, or handwriting depend on the installed pack).
Style extraction, image retention, and PDF page-break options — tune exports per task.

Structured extraction & Agent

Information extraction presets (for example invoices, passports, ID cards) and custom modes with user-defined fields.
Agent flows for recognize, translate, extract, and summarize with streaming output for review before export.
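
As a rough illustration of what a custom mode with user-defined fields involves, the sketch below models a field list and checks an extracted record against it. It is not StyleOCR's configuration format or API: FieldSpec, PURCHASE_ORDER_FIELDS, missing_required, and all field names and values are hypothetical, chosen only to show the idea of reviewing required fields before export.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One user-defined field in a hypothetical custom extraction mode."""
    name: str
    required: bool = True


# Hypothetical custom mode: these field names are invented for illustration.
PURCHASE_ORDER_FIELDS = [
    FieldSpec("order_number"),
    FieldSpec("supplier"),
    FieldSpec("total_amount"),
    FieldSpec("delivery_note", required=False),
]


def missing_required(record: dict[str, str], fields: list[FieldSpec]) -> list[str]:
    """Return the names of required fields that are absent or blank in a record."""
    return [
        f.name
        for f in fields
        if f.required and not record.get(f.name, "").strip()
    ]


# Made-up extracted record, as it might look during review before export.
record = {"order_number": "PO-1042", "supplier": "", "total_amount": "318.50"}
print(missing_required(record, PURCHASE_ORDER_FIELDS))
```

Marking optional fields explicitly keeps the check from flagging values that a given document type may legitimately omit.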

Automation & integrations

MCP server — opt-in localhost endpoint for compatible AI assistants.

Productivity

Task history with search, background runs, and stop/resume patterns for long batches (exact behavior varies by platform build).
Notifications for task completion, updates, and MCP client connection events.

Coming next

Continued polish on layout fidelity, model packaging, and cross-platform parity. Watch this page and in-app release notes for the next drop.

StyleOCR keeps recognition on your device; remote calls are for models, updates, and storefront flows you explicitly trigger—not for uploading every page to a hosted OCR API.