Edit Text After Scanning
Optical character recognition (OCR) is software that recognizes text characters in an image. OCR software typically lets you extract the text from an image, which is the first step to editing it. Most scanners come with their own OCR software, but each one works differently. By contrast, Microsoft OneNote is available on both Mac and Windows, has OCR and text extraction built in, and is free on modern PCs, tablets, and smartphones, which makes extracting text from images much simpler and more predictable. All desktop and mobile versions of OneNote, even the free ones, can recognize text in images, but you can only extract that text using the desktop version of OneNote.
Steps
Extracting Your Scanned Text
- Download OneNote to your desktop computer. You can download it from Office.com; the process varies slightly between Mac and Windows depending on your operating system version and preferences. Overall, OneNote for Mac is very similar to OneNote for Windows, and the OCR functionality works basically the same in both.
- Click the Pictures icon (labeled "Picture" on Mac). OneNote shows a large ribbon across the top by default, and the icon sits towards the left of the Insert tab. On Mac, you can also choose "Picture" from the "Insert" menu at the top of the screen. When you click the icon, the Insert Picture window appears (or the "Choose a picture" window on Mac).
- If you don't see the tabs or icons, click the Ribbon Display Options button, immediately to the left of the Minimize button in the top right of the application window, and choose "Show Tabs and Commands." On Mac you can just use the menus at the top of the screen, so the tabs aren't necessary.
- Hover your mouse cursor over the buttons to see what they're called.
- Navigate to and select the image you want. After you do, click Open ("Insert" on Mac). The image file appears in OneNote at the place where the cursor is.
- You can also choose File Printout rather than Pictures to extract text from the printout of a document.
- As an alternative, press the ⎙ PrtScr key on your keyboard to capture an image of your current screen, then paste it into OneNote using Ctrl+V (or ⌘ Cmd+V on a Mac).
- Text in the image needs to be typeset (printed rather than handwritten) for OCR to recognize it well.
- Right-click the image and choose “Copy Text from Picture.” The text in the image will be copied to your computer’s clipboard.
- On Windows, if you chose File Printout instead of a picture in the earlier step, right-clicking one page of the printout gives you two choices here instead: “Copy Text From This Page of the Printout” or “Copy Text From All Pages of the Printout.” Select the one you want.
- Paste the text back into OneNote using Ctrl+V (or ⌘ Cmd+V on a Mac) and edit it in the app if you like. You can also choose to paste the text into another program.
- You can select the text with your mouse cursor and then press Ctrl+C (or ⌘ Cmd+C on a Mac). Alternatively, you can right-click (or Ctrl+click on a Mac) the text and choose "Copy."
- If you have saved the extracted text and are accessing it from a non-desktop version of OneNote, the instructions for copying and pasting will vary significantly. On Android, for example, you need to press and hold on part of the text you want, use the resulting "handles" on either side to select all the text, and press the Copy or Cut button (the icons are two pages on top of one another and a pair of scissors, respectively).
- Paste the copied text into another application. Microsoft Word and Google Docs are popular choices; simply open a new or existing document in that app and press Ctrl+V (or ⌘ Cmd+V on a Mac). The text will likely look pretty ugly when you paste it in.
- You may want to save the document right away, before you start editing, so you can go back to the original, unedited text.
- Edit and format the text as normal. Your formatting options are limited only by the app you choose to paste into: the latest version of Microsoft Word, for example, has far more options and gives you much more control than Notepad or even Google Docs.
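If you saved the pasted text as a file on your computer, the tip about keeping the original, unedited text can also be handled with a small script. The following is only a minimal Python sketch under that assumption; the function name `keep_original` and the ".orig" naming scheme are made up for illustration and are not part of OneNote, Word, or Google Docs.

```python
from pathlib import Path
import shutil


def keep_original(path: str) -> Path:
    """Copy the file at `path` to a sibling such as 'scan.orig.txt'.

    The '.orig' naming is a hypothetical convention for this sketch.
    An existing backup is never overwritten, so the first, unedited
    version of the text is the one that survives.
    """
    source = Path(path)
    backup = source.with_name(source.stem + ".orig" + source.suffix)
    if not backup.exists():
        shutil.copy2(source, backup)  # copies contents and timestamps
    return backup
```

Calling `keep_original("scan.txt")` once before you start editing would leave a `scan.orig.txt` next to your working file; calling it again later leaves that first backup untouched.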
Using Other Extractors
- Open whatever extractor you are using. Whatever extractor you choose, the process involves opening the image in the extractor, extracting the text from it, and then copying and pasting the text into a document for editing. Different types of applications or services abound:
- Scanner-included software: If you have a scanner and still have the software that came with it, it probably includes OCR text extraction capabilities. Instructions should have come with the scanner or you should be able to look them up online for a relatively modern scanner.
- Free websites: These ad-driven but functional websites typically accept TIF, GIF, PDF, JPG, BMP, PNG, or some combination of these. They often limit the size of the files you can upload (to 5MB, for example). Some sites will email you a Word document or other file containing the text of your image for free; others will simply provide the text for you to copy. A few include:
- Free-ocr.com
- Onlineocr.net
- High-cost OCR software: Some OCR software costs up to $500 to purchase; consider it only if you need extremely accurate OCR results. Some of the more popular programs are listed at TopTenReviews.com or similar sites; several of the top-rated ones currently include:
- OmniPage Standard
- Adobe Acrobat
- ABBYY FineReader
- Free software: These solutions will likely not work with larger images, and many don’t work beyond the first page of a PDF:
- FreeOCR
- SimpleOCR
- Free OCR To Word
- Use your tool to extract the text. You can usually save your text as plain text, in Word .doc format, or in Rich Text Format (RTF). RTF is a long-standing Microsoft format that (like .doc) saves text formatting, margins, images, and so forth in a single, portable, shareable file. RTF files are often larger than .doc files, and since .doc is viewable by just about anyone (MS Word has a free viewer available), .doc is probably your best bet.
- Copy and paste the resulting text into your chosen editing tool. The formatting will likely look a mess when you paste it in, so you will have to remove a lot of spaces or break up words that have been crammed together. How messy it is depends largely on how clean the image you extracted the text from was.
- Edit and format the text as normal. Your formatting options are limited only by the app you choose to paste into: the latest version of Microsoft Word, for example, has far more options and gives you much more control than Notepad or even Google Docs.
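Removing the extra spaces by hand can be tedious, so part of the cleanup described above can be scripted before you paste. The following is a minimal Python sketch, not a feature of any of the extractors listed; the function name `tidy_ocr_text` is made up for illustration, and it assumes the mess is mostly runs of spaces, trailing spaces, and stacked blank lines. Words that have been crammed together still need to be split by hand.

```python
import re


def tidy_ocr_text(text: str) -> str:
    """Collapse runs of spaces and tabs, trim each line, squeeze blank lines."""
    # Replace any run of spaces/tabs inside a line with one space,
    # then drop leading and trailing spaces on that line.
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    cleaned = "\n".join(lines)
    # Three or more newlines in a row become a single blank line.
    cleaned = re.sub(r"\n{3,}", "\n\n", cleaned)
    return cleaned.strip()
```

You could run the extracted text through `tidy_ocr_text` and then paste the result into your editing tool, which leaves only the spelling and word-break fixes to do manually.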