ScribeOCR – Web interface for recognizing text, OCR, & creating digitized docs

Summary

Scribe OCR

Scribe OCR is a free (libre) web application for recognizing text from images, proofreading OCR data, and creating fully digitized documents. The live site is at scribeocr.com.

There are three primary use cases for Scribe OCR:

1. Adding an accurate, searchable text layer to a PDF document. Scribe OCR can be used as an alternative to applications like Adobe Acrobat for recognizing text and creating searchable PDFs. Unlike other tools, Scribe OCR makes it easy to correct errors in the recognized text.

2. Proofreading existing OCR data. Scribe OCR can be used to edit and correct existing OCR data created with other applications, including Tesseract HOCR files. Because text is positioned accurately over the input image, OCR data can be proofread significantly faster than with other methods.

3. Creating fully digital versions of documents and books. Other OCR programs do not truly digitize documents, but rather add roughly positioned invisible text over the original image. Scribe OCR can be used to produce text-native, ebook-style PDFs that accurately replicate the original document.

Note: This repo only contains code for the user interface. Recognition is run using the Scribe.js library, which is in the Scribe.js repo. Discussion regarding recognition, from questions about quality to instructions on how to implement OCR within your own project, should happen in that repo.

Running

ScribeOCR can be run by using the public site at scribeocr.com. The entire program runs in your browser; no data is sent to a remote server. There is currently no standalone desktop application, so running locally requires serving the files over a local HTTP server.

To run a local copy, run the following commands (requires npm):

    git clone --recursive https://github.com/scribeocr/scribeocr.git
    cd scribeocr
    npm i
    npx http-server

The npx http-server command prints the address on your local network where ScribeOCR is running. You can use the site by visiting that address.
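The local-run steps above could be wrapped in a small shell script that checks for the required tools before cloning. This is only a sketch, not part of ScribeOCR: the function names missing_tools and run_local_copy are hypothetical, and checking for git and npx in addition to npm is an assumption (the instructions only state that npm is required).

```shell
#!/bin/sh
# Hypothetical helper: print the name of each given command
# that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Hypothetical wrapper around the commands from the instructions above.
# Nothing runs until this function is called explicitly.
run_local_copy() {
  missing=$(missing_tools git npm npx)
  if [ -n "$missing" ]; then
    # $missing is left unquoted so the names print on one line.
    echo "Missing required tools:" $missing >&2
    return 1
  fi
  git clone --recursive https://github.com/scribeocr/scribeocr.git || return 1
  # The subshell keeps the caller's working directory unchanged.
  (
    cd scribeocr || exit 1
    npm i || exit 1
    npx http-server
  )
}
```

Calling run_local_copy from the directory where the clone should live would then perform the same four steps, stopping early with a message if a tool is missing.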
Please "thumbs up" this Git Issue if...

First seen: 2025-10-10 03:22

Last seen: 2025-10-10 13:31