Creating Searchable PDFs

Scanning to PDF

Scanned PDFs can be extremely problematic if not created correctly, because pages are often saved as images rather than text. Assistive technology such as a screen reader cannot read the text in an image, and the text cannot be searched or selected.

Fortunately, this is not a difficult problem to fix if you have software with Optical Character Recognition (OCR) capability.

Downloadable Guides

If you prefer a downloadable, printable version, guides for both Acrobat X and Acrobat XI are available below in Word and PDF format. The online instructions cover Acrobat X only.

  • Creating Searchable PDFs with Acrobat X (Word version, PDF version)
  • Creating Searchable PDFs with Acrobat XI (Word version, PDF version)

Optical Character Recognition (OCR)

OCR is software that recognizes and interprets text in an image and converts it to text that a computer can read.

A PDF that has been processed with OCR is sometimes called a Searchable PDF. Whichever term is used, the resulting document contains text that you can interact with.

Benefits of an OCR'd document include the ability to:

  • Search text
  • Select text
  • Highlight text
  • Create a Table of Contents
  • Listen to text
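
If you need to check many files, a rough test for whether a PDF already contains text can be scripted. The sketch below is only a heuristic and is not part of the Acrobat workflow in this guide: it assumes that an image-only scan contains no font resources, so it looks for the PDF name /Font in the file and in any streams that can be decompressed. The function names are hypothetical, and encrypted or unusually structured PDFs may give the wrong answer.

```python
import re
import zlib

# Matches the body of each PDF stream object (page content, images, object streams).
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def readable_chunks(data: bytes):
    """Yield the raw file bytes, then every stream that zlib can inflate."""
    yield data
    for match in STREAM_RE.finditer(data):
        try:
            # decompressobj tolerates the line break left before "endstream".
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate-compressed (for example, a JPEG page image)


def looks_searchable(data: bytes) -> bool:
    """Heuristic (assumption): a PDF that references fonts probably holds real text."""
    return any(b"/Font" in chunk for chunk in readable_chunks(data))
```

To try it, read a file in binary mode and pass the bytes to looks_searchable. A result of False suggests the file is an image-only scan that still needs OCR; confirm by trying to select text in the document.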

OCR at the Scanner or Copier

Some scanners and photocopiers can OCR documents as they scan, so you don’t have to fix the file afterwards. Many Xerox photocopiers can scan to PDF via email or network, and you can select the option to make that PDF searchable.

Look for the keywords OCR or Searchable PDF in the save options while scanning.

OCR with Adobe Acrobat Professional

Software with OCR capability comes at various levels of sophistication and cost. The cheapest and most common option is Adobe Acrobat Professional (about $99 per copy). It is different from the free Adobe Reader program, which can only read PDFs, not change them.

Some departments at CSU have site licenses for Acrobat Pro. If yours does not, Acrobat Pro is also installed on all computers in Morgan Library.

Use the Recognize Text Tool

  1. Open the PDF in Acrobat Pro.
  2. Click on Tools at the top right of the document.
  3. When the toolbar opens on the right side of the screen, click on Recognize Text to expand the menu.
  4. Select In This File to bring up the OCR dialog box.
    [Screenshot: View, Tools, Recognize Text, In This File]
  5. Choose a radio button for either All Pages or just the Current Page, then click OK. (All Pages could take a while on longer documents.)
    [Screenshot: Recognize Text dialog box]
  6. Once the process finishes, you should be able to highlight text in the document.
    [Screenshot: sample PDF with text highlighted]
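
For readers who need to process many files without Acrobat, the same choice between all pages and a single page can be sketched with a different tool. The example below is an assumption, not the method this guide describes: it builds a command line for the open-source OCRmyPDF program, which must be installed separately. The file names and the helper names build_ocr_command and run_ocr are hypothetical.

```python
import shutil
import subprocess
from typing import List, Optional


def build_ocr_command(source: str, target: str,
                      pages: Optional[str] = None,
                      language: str = "eng") -> List[str]:
    """Build an OCRmyPDF command; pages=None means OCR every page."""
    # --skip-text leaves pages that already contain text untouched.
    command = ["ocrmypdf", "--language", language, "--skip-text"]
    if pages is not None:
        # Limit OCR to specific pages, for example "3" or "1-5".
        command += ["--pages", pages]
    return command + [source, target]


def run_ocr(source: str, target: str, pages: Optional[str] = None) -> None:
    """Run the command, failing early if OCRmyPDF is not installed."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf was not found on PATH")
    subprocess.run(build_ocr_command(source, target, pages), check=True)
```

Calling run_ocr("scan.pdf", "scan-searchable.pdf") would process the whole document, while passing pages="3" would process only page 3. As with Acrobat, a full run on a long document can take a while.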

Add Tags

Tags make a document easier to read with a screen reader by specifying the text reading order and creating a Table of Contents.

To add tags to a document, use the Accessibility Tool.

  1. Open the Accessibility Tool from the View menu: click on View, then Tools, then Accessibility.
    [Screenshot: Acrobat Accessibility Tool]
  2. In the toolbar, click on Add Tags to Document.
    [Screenshot: Add Tags menu]
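
A quick scripted check can suggest whether a PDF already has tags before you open it in Acrobat. The sketch below is a heuristic under stated assumptions, not an Acrobat feature: a tagged PDF normally records its tag tree under the name /StructTreeRoot, so the code searches for that name in the file and in any streams it can decompress. The function names are hypothetical, and the check says nothing about whether the tags are correct.

```python
import re
import zlib

# Matches the body of each PDF stream object, compressed or not.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflated_chunks(data: bytes):
    """Yield the raw file bytes, then every stream that zlib can inflate."""
    yield data
    for match in STREAM_RE.finditer(data):
        try:
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # stream uses another filter, such as JPEG


def looks_tagged(data: bytes) -> bool:
    """Heuristic (assumption): tagged PDFs reference a /StructTreeRoot."""
    return any(b"/StructTreeRoot" in chunk for chunk in inflated_chunks(data))
```

A result of False suggests the Add Tags to Document step is still needed. A result of True only shows that tags exist, so the reading order should still be reviewed with the Accessibility Tool.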

Morgan Library Course Reserves

The Course Reserves system at Morgan Library is a great resource for faculty at CSU. If you commonly use scanned articles or chapters for classes, you can request that these titles be posted online in PDF format. They will be available to any student enrolled in your course.

The library staff ensures that these PDFs are OCR'd, and the turnaround time is typically within 24 hours.

Sign in with your eID on the Reserves tab of the Library home page to make your requests.

Further Resources

In-depth tutorials on creating accessible PDFs can be found here:

Adobe PDF, Universally Designed by CSU's ACCESS Project

PDF Tutorials by Adobe

PDF Tutorials by WebAIM

Video Tutorials by Atomic Learning