Which type of software can translate scanned text into text that you can edit?

You can scan a document and convert the text into data that you can edit with a word processing program. This process is called OCR [Optical Character Recognition]. To scan and use OCR, you need an OCR program, such as the ABBYY FineReader program that came with your scanner.

OCR software cannot recognize or has difficulties recognizing the following types of documents or text.

Items that have been copied from other copies

Text with tightly spaced characters or line pitch

Text that is in tables or underlined

Cursive or italic fonts, and font sizes less than 8 points

See one of these sections to scan and convert text using ABBYY FineReader.

Converting into Editable Text in Office Mode

Do one of the following to start ABBYY FineReader.

Windows: Select the start button icon or Start > Programs or All Programs > ABBYY FineReader 6.0 Sprint > ABBYY FineReader 6.0 Sprint.

Mac OS X: Select Applications > ABBYY FineReader 5 Sprint Plus, and double-click the Launch FineReader 5 Sprint icon.

Click the Scan&Read icon at the top of the window. Epson Scan starts in the last mode you used.

Note for Mac OS X users:

If you do not see a Scan&Read icon, choose Select Scanner from the Scan&Read menu, select EPSON GT-20000, and click OK. Then select Scan&Read from the Scan&Read menu to start Epson Scan.

Select Color or Black&White as the Image Type setting.

Note:

If you select Black&White, you can also select an Image Option setting, as described below.

No Image Option setting is applied.

Drops out red from the scan.

Drops out green from the scan.

Drops out blue from the scan.

Enhances red in the scan.

Enhances green in the scan.

Enhances blue in the scan.

Text Enhancement Technology: Improves accuracy during OCR [Optical Character Recognition] scanning by eliminating the document’s background from the scan. This setting is available only when the Image Type is set to Black&White.

Makes grayscale images clearer and text recognition more accurate by separating the text from the graphics. This setting is available only when the Image Type is set to Black&White.

Select Document Table as the Document Source setting.

Select the size of your original document as the Size setting.

Select 300 as the Resolution setting.

Click Scan. Your document is scanned, processed into editable text, and opened in the ABBYY FineReader window.
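
The arithmetic behind the Resolution setting, and behind the warning about font sizes under 8 points, can be sketched as follows. This is an illustration only. It assumes the Resolution value is in dots per inch, a US Letter page, and the standard 72 points per inch. The function names are made up for this example and are not part of Epson Scan or ABBYY FineReader.

```python
POINTS_PER_INCH = 72  # standard typographic point


def page_pixels(width_in, height_in, dpi):
    """Pixel dimensions of a scanned page at the given resolution (assumed dpi)."""
    return round(width_in * dpi), round(height_in * dpi)


def font_height_pixels(point_size, dpi):
    """Approximate height in pixels of a font's nominal size."""
    return point_size / POINTS_PER_INCH * dpi


# US Letter page (8.5 x 11 inches) scanned at 300 dpi
print(page_pixels(8.5, 11, 300))             # (2550, 3300)
# An 8-point font at 300 dpi is only about 33 pixels tall
print(round(font_height_pixels(8, 300), 1))  # 33.3
```

Smaller type or a lower resolution leaves fewer pixels per character, which is one plausible reason very small fonts are recognized poorly.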

OCR [optical character recognition]

By TechTarget Contributor

What is OCR [optical character recognition]?

OCR [optical character recognition] is the use of technology to distinguish printed or handwritten text characters inside digital images of physical documents, such as a scanned paper document. The basic process of OCR involves examining the text of a document and translating the characters into code that can be used for data processing. OCR is sometimes also referred to as text recognition.

OCR systems are made up of a combination of hardware and software that is used to convert physical documents into machine-readable text. Hardware, such as an optical scanner or specialized circuit board, is used to copy or read text while software typically handles the advanced processing. Software can also take advantage of artificial intelligence [AI] to implement more advanced methods of intelligent character recognition [ICR], like identifying languages or styles of handwriting.

The process of OCR is most commonly used to turn hard copy legal or historic documents into PDFs. Once a document is in this soft copy form, users can edit, format and search it as if it were created with a word processor.

How optical character recognition works

The first step of OCR is using a scanner to process the physical form of a document. Once all pages are copied, OCR software converts the document into a two-color, or black and white, version. The scanned-in image or bitmap is analyzed for light and dark areas, where the dark areas are identified as characters that need to be recognized and light areas are identified as background.
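
The light/dark separation described above can be sketched in a few lines. This is a minimal illustration, assuming a grayscale image stored as rows of values from 0 (black) to 255 (white) and a fixed cutoff of 128. Both the `binarize` name and the fixed threshold are assumptions for this example; real OCR software may choose the threshold adaptively.

```python
def binarize(gray, threshold=128):
    """Map each grayscale pixel to 1 (dark, candidate character) or 0 (light, background)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


# A tiny hypothetical 5x5 grayscale patch containing one dark vertical stroke.
patch = [
    [250, 240,  30, 245, 250],
    [248, 235,  25, 240, 252],
    [251, 238,  20, 242, 249],
    [249, 241,  28, 239, 250],
    [250, 237,  35, 244, 251],
]

for row in binarize(patch):
    print("".join("#" if v else "." for v in row))
```

Each printed row shows the dark stroke as `#` and the background as `.`, which is the two-color form that later recognition steps work on.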

The dark areas are then processed further to find alphabetic letters or numeric digits. OCR programs can vary in their techniques, but typically involve targeting one character, word or block of text at a time. Characters are then identified using one of two algorithms:

  1. Pattern recognition. OCR programs are fed examples of text in various fonts and formats which are then used to compare, and recognize, characters in the scanned document.
  2. Feature detection. OCR programs apply rules regarding the features of a specific letter or number to recognize characters in the scanned document. Features could include the number of angled lines, crossed lines or curves in a character for comparison. For example, the capital letter "A" may be stored as two diagonal lines that meet with a horizontal line across the middle.
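
Both approaches can be illustrated with a toy example. The sketch below is not how any particular OCR product works. It assumes 5x5 binarized glyphs, a three-letter alphabet, and hand-written templates and rules, all invented for illustration. Pattern recognition is shown as nearest-template matching by counting differing pixels, and feature detection as a lookup of simple stroke features.

```python
# Stored example bitmaps for pattern recognition ('#' = dark pixel).
TEMPLATES = {
    "I": ["..#..", "..#..", "..#..", "..#..", "..#.."],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
}

# Hand-written rules for feature detection.
RULES = {
    "I": {"top_bar": False, "bottom_bar": False, "left_stem": False},
    "L": {"top_bar": False, "bottom_bar": True, "left_stem": True},
    "T": {"top_bar": True, "bottom_bar": False, "left_stem": False},
}


def pixel_distance(a, b):
    """Count the pixels that differ between two same-sized glyphs."""
    return sum(ca != cb for ra, rb in zip(a, b) for ca, cb in zip(ra, rb))


def recognize_by_template(glyph):
    """Pattern recognition: pick the stored template closest to the glyph."""
    return min(TEMPLATES, key=lambda ch: pixel_distance(glyph, TEMPLATES[ch]))


def features(glyph):
    """Extract a few simple stroke features from a glyph."""
    return {
        "top_bar": all(c == "#" for c in glyph[0]),
        "bottom_bar": all(c == "#" for c in glyph[-1]),
        "left_stem": all(row[0] == "#" for row in glyph),
    }


def recognize_by_features(glyph):
    """Feature detection: return the character whose rule matches, else None."""
    found = features(glyph)
    for ch, rule in RULES.items():
        if rule == found:
            return ch
    return None


# A "T" with one stray dark pixel, as might come from a noisy scan.
noisy_t = ["#####", "..#..", "..#..", ".##..", "..#.."]
print(recognize_by_template(noisy_t))  # T
print(recognize_by_features(noisy_t))  # T
```

Template matching tolerates the stray pixel because the glyph is still closest to the stored "T", while the rule-based version ignores it because none of the three features depends on that pixel.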

When a character is identified, it is converted into an ASCII code that can be used by computer systems to handle further manipulations. Users should correct basic errors, proofread and make sure complex layouts were handled properly before saving the document for future use.
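
As a small illustration of that last step, the sketch below turns a list of recognized characters into ASCII codes and back into a string. The helper names are hypothetical, and the input list stands in for whatever a recognizer would actually produce.

```python
def to_ascii_codes(chars):
    """Convert recognized characters to their ASCII code points."""
    return [ord(ch) for ch in chars]


def codes_to_text(codes):
    """Rebuild editable text from a list of ASCII codes."""
    return bytes(codes).decode("ascii")


recognized = ["O", "C", "R", " ", "1", "2", "3"]  # hypothetical recognizer output
codes = to_ascii_codes(recognized)
print(codes)                 # [79, 67, 82, 32, 49, 50, 51]
print(codes_to_text(codes))  # OCR 123
```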

Optical character recognition use cases

OCR can be used for a variety of applications, including the following:

  • Scanning printed documents into versions that can be edited with word processors, like Microsoft Word or Google Docs.
  • Indexing print material for search engines.
  • Automating data entry, extraction and processing.
  • Deciphering documents into text that can be read aloud to visually impaired or blind users.
  • Archiving historic information, such as newspapers, magazines or phonebooks, into searchable formats.
  • Electronically depositing checks without the need for a bank teller.
  • Placing important, signed legal documents into an electronic database.
  • Recognizing text, such as license plates, with a camera or software.
  • Sorting letters for mail delivery.
  • Translating words within an image into a specified language.

Benefits of optical character recognition

The main advantages of OCR technology are the following:

  • saves time;
  • decreases errors;
  • minimizes effort; and
  • enables actions that are not possible with physical copies, such as compressing into ZIP files, highlighting keywords, incorporating into a website and attaching to an email.

While taking images of documents enables them to be digitally archived, OCR provides the added functionality of being able to edit and search those documents.

How does OCR software translate scanned text into text that you can edit?

OCR software processes a digital image by locating and recognizing characters, such as letters, numbers, and symbols. Some OCR software will simply export the text, while other programs can convert the characters to editable text directly in the image.

What type of software is required to convert scanned text into text that can be modified by a word processor?

OCR software. One of its main uses is scanning printed documents into versions that can be edited with word processors, like Microsoft Word or Google Docs.

How can I edit text in a scan copy?

To edit text in a scanned document:
  1. Open the scanned PDF file in Acrobat.
  2. Choose Tools > Edit PDF. ...
  3. Click the text element you want to edit and start typing. ...
  4. Choose File > Save As and type a new name for your editable document.

What software converts a scanner image into a text file that can be edited with a word processing application?

Optical Character Recognition [OCR] software can convert scanned documents in an image format into editable documents. You can use this software to edit scanned documents using a PDF or word processing application.
