May 3, 2010 by Christoff Truter C#
Over the past few weeks we've been looking for a suitable OCR solution to
integrate into our document management system.
One option we came across is MODI (Microsoft Office Document Imaging),
a tool that ships with Microsoft Office 2003 and 2007 but is not available in Microsoft Office 2010.
Simply add a reference to the MODI type library (COM interop) and convert images to text like this:
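    using MODI;
    using System;

    class Program
    {
        static void Main(string[] args)
        {
            // Load the image file into a MODI document.
            DocumentClass doc = new DocumentClass();
            doc.Create(@"some.tiff");

            // Run OCR in English; the two flags enable automatic
            // orientation and straightening of each page image.
            doc.OCR(MiLANGUAGES.miLANG_ENGLISH, true, true);

            // Each page image exposes its recognized text via Layout.Text.
            foreach (Image image in doc.Images)
            {
                Console.WriteLine(image.Layout.Text);
            }
        }
    }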
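The snippet above assumes the OCR call always succeeds. Below is a slightly more defensive sketch of the same idea. It assumes that failures (for example an unreadable file, or a page with no recognizable text) surface as a `COMException`, and that the document should be closed and the COM object released afterwards. The command-line path handling and the exit codes are illustrative additions, not part of the original example.

```csharp
using System;
using System.Runtime.InteropServices;
using MODI; // requires a COM reference to the MODI type library

class Program
{
    static int Main(string[] args)
    {
        // Hypothetical input: first command-line argument, else the sample file name.
        string path = args.Length > 0 ? args[0] : @"some.tiff";

        DocumentClass doc = new DocumentClass();
        try
        {
            doc.Create(path);
            doc.OCR(MiLANGUAGES.miLANG_ENGLISH, true, true);

            foreach (MODI.Image image in doc.Images)
            {
                Console.WriteLine(image.Layout.Text);
            }
            return 0;
        }
        catch (COMException ex)
        {
            // Assumption: MODI reports load and recognition errors through COM interop.
            Console.Error.WriteLine("OCR failed: " + ex.Message);
            return 1;
        }
        finally
        {
            try
            {
                // Close without saving changes back to the image file.
                doc.Close(false);
            }
            catch (COMException)
            {
                // Assumption: Close may fail if Create never succeeded; nothing to clean up then.
            }
            Marshal.ReleaseComObject(doc);
        }
    }
}
```

Closing the document and releasing the COM object explicitly is a design choice here: it avoids relying on the garbage collector to let go of the underlying file, which matters when many images are processed in a loop.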
Captiva September 30, 2017 by Josh
Is it possible to convert scanned images into PDF files using Captiva in order to pull data from them? Otherwise, it takes time to enter the data shown by hand, whereas a PDF version that is text-selectable can be copied into Excel, sorted, and, using Text to Columns, split so that certain data is pushed into another column to group the data that's needed.