5 Commits

Author SHA1 Message Date
Kai Ton 3f8765a581 Migrate to Laravel app + unified services; add email conversion
- Move docling/markitdown services under services/ alongside new
  unlimited-ocr and vision services
- Add Laravel app for email-to-markdown conversion and OCR frontend
- Add email export tooling and example emails/output
- Update docker-compose, Caddyfile, and frontend assets

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-01 07:22:08 +00:00
Kai Ton 6ba704865f Update YouTube link 2026-06-26 07:57:43 +00:00
Kai Ton 22cc0d0857 Fix PP-OCRv6 error + MarkItDown LLM fallback
Docling: pass PdfPipelineOptions (TesseractCLI) to ImageFormatOption
to prevent RapidOCR/PP-OCRv6 from being loaded for image files

MarkItDown: auto-fallback to plain conversion when Ollama returns 500
(OOM/crash) instead of propagating the error to the user

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-25 07:53:22 +00:00
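
The MarkItDown fallback described in commit 22cc0d0857 can be sketched roughly as follows. This is an illustration only, not the code from the commit: `UpstreamError`, `convert_with_fallback`, and the two converter callables are hypothetical names, and it assumes the Ollama failure reaches the caller as an exception carrying the HTTP status code.

```python
from typing import Callable, Tuple


class UpstreamError(Exception):
    """Hypothetical error raised when the LLM backend answers with an HTTP error."""

    def __init__(self, status_code: int, message: str = "") -> None:
        super().__init__(message or f"upstream returned {status_code}")
        self.status_code = status_code


def convert_with_fallback(
    source: str,
    llm_convert: Callable[[str], str],
    plain_convert: Callable[[str], str],
) -> Tuple[str, bool]:
    """Return (markdown, used_llm).

    Try the LLM-assisted conversion first. If the backend reports HTTP 500
    (for example after an out-of-memory crash), fall back to the plain
    conversion instead of propagating the error to the user.
    """
    try:
        return llm_convert(source), True
    except UpstreamError as exc:
        if exc.status_code != 500:
            raise  # other failures are not masked by the fallback
        return plain_convert(source), False
```

Returning the `used_llm` flag alongside the Markdown is a design choice of this sketch: it lets a frontend tell the user that the plain conversion was used, rather than silently degrading.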
Kai Ton 94cbabe6d7 Fix Docling PP-OCRv6 error + add DB init scripts
- Force TesseractCliOcrOptions for image formats (JPG/PNG/TIFF/BMP)
  to prevent RapidOCR/PP-OCRv6 fallback on docling 2.107
- Add db/init.sql and db/init_docling.sql for database initialization

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-25 07:48:04 +00:00
Kai Ton 11de2d2175 Initial commit — MarkItDown vs Docling demo
- FastAPI microservices: MarkItDown + Docling with async SQLAlchemy
- Caddy reverse proxy same-origin (no CORS)
- Bootstrap 5 frontend with marked.js rendering
- LLM settings card: Ollama URL, model select populated from the API, cleanup model
- POST /cleanup endpoint that uses AI to tidy up the Markdown
- GET /models fetches the list of models from Ollama
- Runtime LLM re-init without restarting the container
- PYTHONDONTWRITEBYTECODE + .dockerignore

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-25 06:47:35 +00:00