European Patent Office
Document transcription
Complex mathematical formulas and chemical structures were a bottleneck in patent transcription. A fine-tuned OCR model now converts this content into structured WIPO ST.36 XML.
- Lead time cut from up to 5 days to a few minutes
- 400,000 patent pages transcribed daily via AI
- Under 1% character recognition error rate
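
For readers unfamiliar with the last metric: a character error rate is commonly computed as the edit distance between the OCR output and a reference transcription, divided by the length of the reference. The sketch below illustrates that common definition only. The source does not say how the figure above was measured, and the function names are hypothetical.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty string takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalised by reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference transcription must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    reference = "a" * 200          # stand-in for a ground-truth transcription
    ocr_output = "a" * 199 + "b"   # same text with one misread character
    print(f"CER: {character_error_rate(reference, ocr_output):.2%}")
```

Under this definition, one wrong character in a 200-character passage stays below the 1% threshold, while three wrong characters would exceed it.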