- Force TesseractCliOcrOptions for image formats (JPG/PNG/TIFF/BMP)
  to prevent RapidOCR/PP-OCRv6 fallback on docling 2.107
- Add db/init.sql and db/init_docling.sql for database initialization
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- FastAPI microservices: MarkItDown + Docling with async SQLAlchemy
- Caddy reverse proxy serving everything same-origin (no CORS)
- Bootstrap 5 frontend with marked.js rendering
- LLM settings card: Ollama URL, model select populated from API, cleanup model
- POST /cleanup endpoint for AI-based Markdown beautification
- GET /models fetches the model list from Ollama
- Runtime LLM re-init without restarting the container
- PYTHONDONTWRITEBYTECODE + .dockerignore
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>