What is an OCR system?

Definition: An OCR (Optical Character Recognition) system is a scanning system that converts paper text documents into an electronic file, which you can then edit in a word processor on your computer.

Optical Character Recognition is the machine recognition of printed text characters.

An OCR system can recognize many kinds of printed characters and fonts, whether they were produced by a computer printer or a typewriter. Advanced OCR systems can even identify handwriting.

An OCR scanner turns a text document into a bitmap. The bitmap is then converted into ASCII text, which is saved on your hard drive and which you can edit in a word processor.

How does it work?

When you scan a text document, e.g. a page in a book or an invoice, it is turned into a bitmap, which is a picture of the text.
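To make the idea of a bitmap concrete, here is a minimal Python sketch. It assumes a tiny, made-up 5x5 grayscale scan of the letter "T" and a fixed brightness threshold of 128. The names GRAY_SCAN, to_bitmap and render are invented for illustration; real scanners work at far higher resolutions and usually choose the threshold more carefully.

```python
# Hypothetical 5x5 grayscale scan of the letter "T".
# Each number is a pixel brightness: 0 = black ink, 255 = white paper.
GRAY_SCAN = [
    [20, 35, 10, 30, 25],
    [240, 230, 15, 245, 250],
    [235, 250, 40, 240, 255],
    [250, 245, 30, 235, 240],
    [245, 255, 25, 250, 230],
]


def to_bitmap(gray, threshold=128):
    """Mark each pixel as dark (1) or light (0) using a fixed threshold."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def render(bitmap):
    """Draw the bitmap with '#' for dark pixels and '.' for light pixels."""
    return "\n".join("".join("#" if pixel else "." for pixel in row) for row in bitmap)


bitmap = to_bitmap(GRAY_SCAN)
print(render(bitmap))
```

Printed this way, the dark pixels form the shape of a "T". At this stage the computer still only holds a picture, not a character it can edit.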

An OCR system compares the dark and light areas of this bitmap in order to determine each alphanumeric character. Once the OCR system has recognized the characters, it turns them into ASCII text, which is raw text that you can edit in a word processor.
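One simple way to picture this comparison is template matching: the scanned character is laid over a stored picture of each known character, and the one with the most matching pixels wins. The Python sketch below is only an illustration under that assumption. The three 5x5 templates and the names TEMPLATES, count_matching_pixels and recognize are made up for this example, and real OCR systems generally use more sophisticated methods.

```python
# Hypothetical 5x5 templates: '#' is a dark pixel, '.' is a light pixel.
TEMPLATES = {
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "I": ["..#..", "..#..", "..#..", "..#..", "..#.."],
}


def count_matching_pixels(glyph, template):
    """Count the positions where the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the character whose template agrees with the glyph most often."""
    return max(TEMPLATES, key=lambda ch: count_matching_pixels(glyph, TEMPLATES[ch]))


# A scanned "T" with one stray dark pixel in the fourth row.
scanned = ["#####", "..#..", "..#..", ".##..", "..#.."]

character = recognize(scanned)
ascii_bytes = character.encode("ascii")  # the recognized character as ASCII text
print(character, ascii_bytes)
```

Despite the stray pixel, the scanned shape still agrees with the "T" template in 24 of 25 positions, more than with any other template, so it is recognized as "T" and stored as an ordinary ASCII character.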

The result is that you can, for example, search, edit or copy the text as easily and quickly as you would a document in a plain-text editor such as Notepad.

