Scan (OCR) into MS Access (form)?

  • hayadooen
    New Member
    • Jul 2008
    • 2

    Scan (OCR) into MS Access (form)?

    Hi all,

    Very new to this forum. I searched for related postings but found none.

    Question:

    Does anyone know of an application that could convert/import scanned docs (PDFs) into an Access database (into specified fields)?

    I constantly deal with client paperwork such as purchase orders, and data entry into Access is very time-consuming and costly.

    What I'm looking for is a program that would take my scanned PDFs and OCR them into an Access form, importing the data into specific fields and creating a temp table whose data I could then manipulate.

    Not sure if I worded it correctly, but I'm hoping such software exists.

    Please post.

    Thx.

    H
  • Stewart Ross
    Recognized Expert Moderator Specialist
    • Feb 2008
    • 2545

    #2
    Although there are commercial systems that can scan unstructured documents for indexing and retrieval purposes, I am not aware of any that are aimed at scanning structured documents into database tables.

    Commercial OCR systems used for high-volume scanning tend to be bought for particular purposes (e.g. scanning opinion survey results, attendance sheets, voting papers etc.) and rely on bespoke forms that the scanner can read easily (hence the typical light red print on many such forms, which fades out when scanned leaving the responses clear).

    A quick Google search does not show anything like what you need available at present; sorry. Other respondents might know differently, but I think you will not find a solution at present.

    -Stewart

    PS: doing another search, I found a commercial PDF-to-Excel table converter. I haven't tried it myself, so I cannot vouch for it or give any recommendation about its strengths or weaknesses.

    Following it up might help in your quest, although I remain doubtful. Here is the web link to it: http://www.snapfiles.com/get/pdf2xl.html.
    Last edited by Stewart Ross; Jul 3 '08, 12:27 PM. Reason: added ps


    • hayadooen
      New Member
      • Jul 2008
      • 2

      #3
      Hi Stewart,

      Thanks for your reply.

      I've been googling as well, hoping to find such software, but have been equally unsuccessful.

      What about a simplified process?

      I.e., with all documents correctly scanned into Word docs, is there a way in Access to create a template on a form that indicates the location of each field, and then a process for converting the Word document into that Access template?

      Peers mentioned the use of OLE embedding in Access?

      Any ideas? My knowledge of Access is quite limited, so any input is much appreciated.

      regards,

      H


      • Stewart Ross
        Recognized Expert Moderator Specialist
        • Feb 2008
        • 2545

        #4
        Hi hayadooen. Can't think of any way to tackle even the simpler suggestion you've made.

        Problem is that you need to define the structure of the scanned document. To use an analogous situation, it is like importing a text file containing fixed-length strings of characters where each line of text is one complete record. If you don't have a field list to know where one field ends and the next begins, it is virtually impossible to import such text automatically.
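
        To make the fixed-length analogy concrete, here is a minimal Python sketch. The field names and character offsets are hypothetical, invented purely for illustration; the point is that the same line of text only becomes a record once a field list says where each field starts and ends.

```python
from typing import Dict, List, Tuple

# Hypothetical field list (an assumption for illustration):
# (field name, start offset, end offset), with the end offset exclusive.
FIELDS: List[Tuple[str, int, int]] = [
    ("po_number", 0, 8),
    ("customer", 8, 28),
    ("amount", 28, 38),
]


def parse_fixed_width(line: str, fields: List[Tuple[str, int, int]] = FIELDS) -> Dict[str, str]:
    """Slice one fixed-length line into named fields using the field list."""
    return {name: line[start:end].strip() for name, start, end in fields}


# One 38-character record: 8 + 20 + 10 characters.
sample = "PO001234" + "Acme Supplies".ljust(20) + "1250.00".rjust(10)
record = parse_fixed_width(sample)
print(record)
```

        Without FIELDS, the sample line is just 38 characters with no indication of where the customer name stops and the amount begins, which is the situation with an unstructured scan.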

        The Excel table converter I mentioned was the closest I could find to what you needed. Importing to Word using character recognition software would not help, as even if the OCR software was very consistent you would need to define bookmarks in Word to identify the field structure before you could get the Word document into Access.

        For OLE in Access all you would be doing is calling the scanner to embed an image of the document in a table as a binary file of some kind (a BLOB or Binary Large Object file). You would still need to OCR this into a regular Access table, and this in turn means being able to identify and impose a field structure on an unstructured scan - which by the lack of solutions available you can see is not a simple task.

        Sorry!

        -Stewart


        • youmike
          New Member
          • Mar 2008
          • 69

          #5
          I've done a bit of experimenting in this area and I think Stewart hit the nail on the head when he used "unstructured". The fact is that the volume of coding that would be needed to deal with all the possible alternatives simply is not worth it. The applications that I've developed all recognised the problem: they create documents which can later be processed using a Bar Code Id, which retrieves most of the processing data from appropriate tables and prompts the user to add only those elements not so available. Even so, there is a significant capturing overhead.

          When it comes to third party documents, this approach becomes unworkable. The only other thing that might work is a series of prompts to read parts of a document using a hand held scanner, but I'd say that the labour would be no less than more conventional means of capture.

          Sorry to be so negative.


          • Jhonny Ortiz

            #6
            You can use regular expressions to look for certain patterns in your document and return the information.

            Might be a little difficult to do it without field names, but definitely not impossible.
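
            As a rough sketch of the idea in Python: the labels ("PO Number", "Date", "Total"), the patterns, and the sample text below are all assumptions made up for illustration, and real OCR output would likely need looser patterns to cope with misread characters.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns for a purchase order; each captures the value
# that follows a label somewhere in the OCR'd text.
PATTERNS = {
    "po_number": re.compile(r"PO\s*(?:Number|No\.?|#)\s*:?\s*(\w[\w-]*)", re.IGNORECASE),
    "order_date": re.compile(r"Date\s*:?\s*(\d{1,2}/\d{1,2}/\d{4})", re.IGNORECASE),
    "total": re.compile(r"Total\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE),
}


def extract_fields(text: str) -> Dict[str, Optional[str]]:
    """Return the first match for each pattern, or None when nothing matches."""
    result: Dict[str, Optional[str]] = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


# Made-up sample standing in for the text an OCR tool might produce.
ocr_text = """Purchase Order
PO Number: 45-1029
Date: 07/03/2008
Widgets x 10
Total: $1,250.00
"""

fields = extract_fields(ocr_text)
print(fields)
```

            A field that comes back as None is a signal to fall back to manual entry for that document rather than import a guess.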


            • Jason Byrd
              New Member
              • Mar 2011
              • 1

              #7
              I had the same problem. You can try using a PDF converter that converts the PDF into XRBL (excel) format first, then import it into Access directly into the datasheet, bypassing the form. This works well as long as the PDFs don't contain any handwritten characters (the converter has a hard time recognizing them).
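
              A related variation, sketched in Python under some assumptions: instead of an Excel file, the extracted rows are staged as delimited text (CSV), which Access can also import into a table. The column names and rows below are hypothetical, and this is a swapped-in intermediate format, not the converter described above.

```python
import csv
import io
from typing import Dict, Iterable, List, TextIO


def write_rows_csv(rows: Iterable[Dict[str, str]], fieldnames: List[str], fh: TextIO) -> None:
    """Write a header row plus one line per record to an open text file."""
    writer = csv.DictWriter(fh, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Hypothetical rows, e.g. the output of an earlier extraction step.
rows = [
    {"po_number": "PO001234", "customer": "Acme Supplies", "amount": "1250.00"},
    {"po_number": "PO001235", "customer": "Smith, Jones & Co", "amount": "89.50"},
]

# An in-memory buffer keeps the sketch self-contained; in practice this
# would be open("staging.csv", "w", newline="") or similar.
buffer = io.StringIO()
write_rows_csv(rows, ["po_number", "customer", "amount"], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

              The csv module quotes values that contain commas, so a customer name like the second one survives the round trip as a single field.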
