
Document Pipeline

Ongoing · Document Processing

A local OCR pipeline for digitizing physical documents. Scanned paper records are passed through a local LLM (OLMo 2) to produce structured spreadsheet data, and a human reviews the output for accuracy.

  • Local LLM inference (no cloud PII exposure)
  • TIFF/PDF to structured data conversion
  • Human-in-the-loop validation workflow
Python · OLMo 2 · Tesseract
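
The flow described above (scan to text, text to structured fields, human check, spreadsheet output) could be sketched roughly as follows. This is a minimal illustration and not the project's actual code: `Record`, `process_page`, `to_csv`, `fake_ocr`, and `parse_key_values` are all hypothetical names, the sample text is made up, and the OCR and extraction steps are passed in as plain callables so the sketch runs without Tesseract or OLMo 2 installed. In a real setup the `ocr` callable might wrap Tesseract and the `extract` callable might prompt a locally hosted OLMo 2 model; both are assumptions.

```python
import csv
import io
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Record:
    source: str               # path of the scanned file
    fields: Dict[str, str]    # structured data extracted from the page
    needs_review: bool        # True when a human should check this row


def process_page(
    source: str,
    ocr: Callable[[str], str],
    extract: Callable[[str], Dict[str, str]],
    required: List[str],
) -> Record:
    """Run one scan through OCR and extraction, flagging incomplete rows."""
    text = ocr(source)
    fields = extract(text)
    # Any required field that is missing or blank sends the row to review.
    missing = [k for k in required if not fields.get(k, "").strip()]
    return Record(source=source, fields=fields, needs_review=bool(missing))


def to_csv(records: List[Record], columns: List[str]) -> str:
    """Serialize records to CSV text with a needs_review column."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=["source", *columns, "needs_review"],
        lineterminator="\n",
    )
    writer.writeheader()
    for rec in records:
        row = {"source": rec.source,
               "needs_review": "yes" if rec.needs_review else "no"}
        row.update({c: rec.fields.get(c, "") for c in columns})
        writer.writerow(row)
    return buf.getvalue()


# Hypothetical stand-ins so the sketch runs on its own.
def fake_ocr(path: str) -> str:
    """Pretend OCR: returns canned text instead of reading an image."""
    if path == "a.tiff":
        return "name: Ada Lovelace\ndate: 1843-01-01"
    return "name: \ndate: 1850"


def parse_key_values(text: str) -> Dict[str, str]:
    """Pretend extraction: parses 'key: value' lines instead of calling an LLM."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip().lower()] = value.strip()
    return fields


if __name__ == "__main__":
    cols = ["name", "date"]
    rows = [process_page(p, fake_ocr, parse_key_values, cols)
            for p in ("a.tiff", "b.tiff")]
    print(to_csv(rows, cols))
```

Keeping the OCR and extraction steps as injected callables means the review flag and the CSV output can be tested without any model or image on hand, and everything still runs on the local machine.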
