Web-based Clipboard OCR

Basic Interface
Sample Converted Text

Inspiration

I dislike images of text, and I often find snippets in pictures that I want to quickly screenshot and copy-paste. I thought: "There must be a better way than manually transcribing text from images and screenshots."

What it does

On the webpage, paste in an image from your clipboard. The image is quickly processed and turned into plain text, which can be copied, pasted, or saved. Pasting directly from the clipboard eliminates the steps of saving the image as a file and then uploading that file.

Used in conjunction with clipboard managers and screenshot utilities, this is a small but powerful tool that can quickly do its job.

How I built it

I built this for myself to quickly take advantage of the JavaScript implementation of Tesseract OCR. It relies on the Tesseract.js project.

Challenges I ran into

Passing a clipboard image into the OCR meant reading the array of clipboard items and converting the image entry into an image the OCR could accept. Clipboard access ended up being the most complex step, while calling Tesseract was easier than anticipated.
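
A minimal sketch of that flow, not the project's actual code. It assumes the Tesseract.js browser bundle (v2 or later) is loaded and exposes a global `Tesseract`; the helper names `firstImageFile` and `ocrPastedImage` and the `#output` element id are hypothetical.

```javascript
// Return the first clipboard entry that is an image file, or null.
// `items` is expected to look like event.clipboardData.items.
function firstImageFile(items) {
  for (const item of Array.from(items || [])) {
    if (item.kind === 'file' && item.type.startsWith('image/')) {
      return item.getAsFile();
    }
  }
  return null;
}

// Run OCR on a pasted image file and return the recognized text.
// Assumes a global `Tesseract` object provided by the Tesseract.js bundle.
async function ocrPastedImage(file, lang = 'eng') {
  const { data } = await Tesseract.recognize(file, lang);
  return data.text;
}

// Only wire up the handler when running in a browser.
if (typeof document !== 'undefined') {
  document.addEventListener('paste', async (event) => {
    const items = event.clipboardData && event.clipboardData.items;
    const file = firstImageFile(items);
    if (!file) return; // nothing image-like was pasted
    event.preventDefault();
    const text = await ocrPastedImage(file);
    // '#output' is a hypothetical element id for the result area.
    const output = document.querySelector('#output');
    if (output) output.textContent = text;
  });
}
```

The clipboard items are read synchronously inside the paste handler, before any `await`, because the browser only exposes them while the event is being dispatched.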

What I learned

OCR is fantastically useful when using screen sharing or VMs that do not share clipboards. JavaScript clipboard access and permissions don't work the way I expected and require a "paste" action.

What's next for Web-based Clipboard OCR

At the moment, not much more is planned. The tool is already lightweight and easy to use. It can quickly fit itself into a workflow.

Possible next features:

Interface modifications: Tune vertical/horizontal layout.
OCR tuning: modify parameters of the OCR. User-tunable and cached settings. Add timeout and retries.
Interface modifications: Buttons for "Upload image", "Save as text"
Interface improvements: Copy output to clipboard (maybe copy automatically when text is highlighted).
CSS
More languages
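
One way the "user-tunable and cached settings" item could work is sketched below. Nothing here exists in the project yet: the setting names, the defaults, and the storage key are all assumptions, and `storage` is anything with `getItem`/`setItem` (in a browser, `window.localStorage`).

```javascript
// Hypothetical defaults; the real tool may expose different parameters.
const DEFAULT_SETTINGS = { lang: 'eng', timeoutMs: 30000, retries: 2 };
const STORAGE_KEY = 'clipboard-ocr-settings'; // hypothetical key name

// Read cached settings, falling back to defaults for missing or corrupt data.
function loadSettings(storage) {
  try {
    const raw = storage.getItem(STORAGE_KEY);
    const saved = raw ? JSON.parse(raw) : {};
    return { ...DEFAULT_SETTINGS, ...saved };
  } catch (err) {
    return { ...DEFAULT_SETTINGS };
  }
}

// Merge the user's changes into the cached settings and persist them.
function saveSettings(storage, changes) {
  const merged = { ...loadSettings(storage), ...changes };
  storage.setItem(STORAGE_KEY, JSON.stringify(merged));
  return merged;
}
```

In the browser this would be called as `saveSettings(window.localStorage, { lang: 'deu' })`, and the chosen language would then survive a page reload.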

Stretch goals:

Overlay the text on the image using the coordinates returned by Tesseract OCR, then export as PDF.
Rewrite the entire project as a Chrome extension.
Queuing multiple images.
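
For the overlay idea, a rough sketch of the coordinate step follows. It assumes each recognized word arrives as `{ text, bbox: { x0, y0, x1, y1 } }` in image pixels, which is the shape Tesseract.js uses for word results; whether word-level boxes are returned by default varies between Tesseract.js versions. The function name `wordsToOverlay` is hypothetical.

```javascript
// Convert word bounding boxes into positioned rectangles for an overlay.
// `scale` is the displayed image width divided by its natural width.
function wordsToOverlay(words, scale = 1) {
  return words
    .filter((w) => w.text && w.text.trim() !== '') // skip empty detections
    .map((w) => ({
      text: w.text,
      left: w.bbox.x0 * scale,
      top: w.bbox.y0 * scale,
      width: (w.bbox.x1 - w.bbox.x0) * scale,
      height: (w.bbox.y1 - w.bbox.y0) * scale,
    }));
}
```

Each returned rectangle could back an absolutely positioned element layered over the image, so the text stays selectable in place.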

Updates

Joshua Loke started this project — Apr 29, 2020 03:18 PM EDT

Leave feedback in the comments!
