What's New in StyleOCR 0.1.0
Preview build focused on on-device OCR, VLM layout understanding, and local automation hooks.
Highlights
StyleOCR 0.1.0 is an early desktop preview of the OCR toolkit. It batch-converts images and PDFs to Word, Markdown, plain text, or Excel, with optional vision-language models (VLMs) for tables, mixed layouts, and richer extraction.
Recognition stack
- Fast / Pro / VLM modes — trade speed for accuracy or layout-aware understanding.
- Dual model tracks — classic OCR weights for plain text plus VLM bundles for structure-heavy pages (capabilities such as tables, formulas, seals, or handwriting depend on the installed pack).
- Style extraction, image retention, and PDF page-break options — tune exports per task.
Structured extraction & Agent
- Information extraction presets (for example invoices, passports, ID cards) and custom modes with user-defined fields.
- Agent flows for recognition, translation, extraction, and summarization, with streaming output you can review before export.
Automation & integrations
- MCP server — opt-in localhost endpoint for compatible AI assistants.
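As a rough illustration of what a compatible client does first, the sketch below builds the JSON-RPC 2.0 `initialize` message defined by the Model Context Protocol and checks that a target URL is a loopback address before anything would be sent. The endpoint URL, port, path, and client name are assumptions made up for the example; these notes do not state the address or transport StyleOCR uses, so take the real values from the app's MCP settings.

```python
import ipaddress
import json
from urllib.parse import urlparse

# Hypothetical endpoint: the release notes do not state a port or path.
ASSUMED_ENDPOINT = "http://127.0.0.1:8765/mcp"


def is_loopback(url: str) -> bool:
    """Return True if the URL's host is localhost or a loopback IP address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name), so treat it as non-local.
        return False


def build_initialize_request(request_id: int, client_name: str, client_version: str) -> str:
    """Serialize an MCP `initialize` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            # A published MCP protocol revision; the server may negotiate another.
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": client_version},
        },
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Refuse to target anything that is not on the local machine.
    if is_loopback(ASSUMED_ENDPOINT):
        print(build_initialize_request(1, "demo-client", "0.0.1"))
```

The sketch stops short of sending the message because the transport is not documented here. The loopback check mirrors the opt-in, localhost-only design described above.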
Productivity
- Task history with search, background runs, and stop/resume patterns for long batches (exact behavior varies by platform build).
- Notifications for task completion, updates, and MCP client connect events.
Coming next
Continued polish on layout fidelity, model packaging, and cross-platform parity. Watch this page and in-app release notes for the next drop.
StyleOCR keeps recognition on your device; remote calls are for models, updates, and storefront flows you explicitly trigger—not for uploading every page to a hosted OCR API.