C#.net - Read Text from Image in C#.net

In this article I will show you how to read text from an image by using OCR components in C#.net.


What is OCR?
OCR (Optical Character Recognition) is the recognition of printed or written text characters by a computer. This involves photoscanning of the text character-by-character, analysis of the scanned-in image, and then translation of the character image into character codes, such as ASCII, commonly used in data processing.


OCR translates images of text, such as scanned documents, into actual text characters. Also known as text recognition, OCR makes it possible to edit and reuse the text that is normally locked inside scanned images. OCR works using a form of artificial intelligence known as pattern recognition, to identify individual text characters on a page, including punctuation marks, spaces, and ends of lines.


First off, you need to have MS Office 2007 installed. MODI was removed from the Office suite starting with Office 2010, so later versions do not include it out of the box. This is a hard dependency if you deploy an application that uses the OCR capabilities in the field: it won’t work without Office installed. Furthermore, the OCR capability isn’t installed by default when you install Office; you need to add a component called ‘Microsoft Office Document Imaging’ (MODI).
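Because the application fails at run time when MODI is missing, it can help to check for the component before attempting OCR. The sketch below is one possible way to do that, not part of the original sample: it asks Windows whether a COM class is registered under a given ProgID. The ProgID "MODI.Document" and the helper name IsProgIdRegistered are assumptions made for illustration.

```csharp
using System;

static class ModiCheck
{
    // Returns true when a COM class is registered under the given ProgID.
    // Hypothetical helper, not part of MODI or the original article.
    public static bool IsProgIdRegistered(string progId)
    {
        // COM registration exists only on Windows.
        if (Environment.OSVersion.Platform != PlatformID.Win32NT)
        {
            return false;
        }

        // This overload returns null instead of throwing when the ProgID is unknown.
        return Type.GetTypeFromProgID(progId) != null;
    }

    static void Main()
    {
        // "MODI.Document" is the assumed ProgID of the MODI document class.
        bool available = IsProgIdRegistered("MODI.Document");

        Console.WriteLine(available
            ? "MODI is registered on this machine."
            : "MODI is not registered; add it through Office setup first.");
    }
}
```

If the check reports that MODI is not registered, follow the installation steps below before running the OCR code.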


Instructions on how to add the required MODI component.


Step 1
Click Start, click Run, type appwiz.cpl in the Open box, and then click OK.


Step 2
Click to select the Office 2007 version that you have installed.


Step 3
Click Change.


Step 4
Click Add or Remove Features, and then click Continue.


Step 5
Expand Office Tools.






Step 6
Click Microsoft Office Document Imaging, and then click Run all from My Computer.






Step 7
Click Continue.


The MODI components are now installed on your machine. Let's create the OCR application in Visual Studio.


Step 8
Create a Console Application and name the solution SolReadTextFromImage.


Step 9
Copy a sample image file into the application's base directory (./bin/debug/SampleImage.JPG).








Step 10
Add a MODI reference to the project so the application can use it to read text from an image. In Solution Explorer, right-click References under the project, choose Add Reference, select the COM tab, and then select Microsoft Office Document Imaging 12.0 Type Library.






Step 11
The code below reads the text from an image and either displays it in the console or stores it in a text file. It looks like this:
    #region Methods

    /// <summary>
    ///  Read text from an image and display it in the console
    /// </summary>
    /// <param name="ImagePath">Path of the image file</param>
    private static void ReadTextFromImage(String ImagePath)
    {
        try
        {
            // Load the image and run OCR on it.
            // The two flags ask MODI to auto-orient and straighten the image.
            MODI.Document ModiObj = new MODI.Document();
            ModiObj.Create(ImagePath);
            ModiObj.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

            // Retrieve the text recognized on the first page
            MODI.Image ModiImageObj = (MODI.Image)ModiObj.Images[0];

            System.Console.WriteLine(ModiImageObj.Layout.Text);

            // Close the document without saving changes
            ModiObj.Close(false);
        }
        catch (Exception)
        {
            // Rethrow, preserving the original exception type and stack trace
            throw;
        }
    }

    /// <summary>
    ///  Read text from an image and store it in a text file
    /// </summary>
    /// <param name="ImagePath">Path of the image file</param>
    /// <param name="StoreTextFilePath">Path of the text file to write</param>
    private static void ReadTextFromImage(String ImagePath, String StoreTextFilePath)
    {
        try
        {
            // Load the image and run OCR on it
            MODI.Document ModiObj = new MODI.Document();
            ModiObj.Create(ImagePath);
            ModiObj.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

            // Retrieve the text recognized on the first page
            MODI.Image ModiImageObj = (MODI.Image)ModiObj.Images[0];

            // Save the image text in the text file (created or overwritten);
            // the using block closes the file even if the write fails
            using (StreamWriter WriteFileObj = new StreamWriter(StoreTextFilePath))
            {
                WriteFileObj.Write(ModiImageObj.Layout.Text);
            }

            // Close the document without saving changes
            ModiObj.Close(false);
        }
        catch (Exception)
        {
            // Rethrow, preserving the original exception type and stack trace
            throw;
        }
    }

    #endregion


Step 12
Call both methods from the Main function. It looks like this:
    static void Main(string[] args)
    {
        // Set the sample image path
        String ImagePath = AppDomain.CurrentDomain.BaseDirectory + "SampleImage.jpg";

        ReadTextFromImage(ImagePath);

        // Set the path of the text file that will store the image content
        String StoreTextFilePath = AppDomain.CurrentDomain.BaseDirectory + "SampleText.txt";

        ReadTextFromImage(ImagePath, StoreTextFilePath);
    }
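The Main function builds both paths by string concatenation and assumes the sample image is present. As a hedged alternative, the sketch below builds the path with Path.Combine and checks that the image exists before any OCR call is made. The class name SamplePaths and the helper InDirectory are hypothetical names introduced only for this example.

```csharp
using System;
using System.IO;

static class SamplePaths
{
    // Joins a file name onto a directory.
    // Path.Combine adds the directory separator only when it is missing.
    public static string InDirectory(string directory, string fileName)
    {
        return Path.Combine(directory, fileName);
    }

    static void Main()
    {
        string baseDirectory = AppDomain.CurrentDomain.BaseDirectory;
        string imagePath = InDirectory(baseDirectory, "SampleImage.jpg");

        // Report a missing image up front instead of failing inside the OCR call.
        if (!File.Exists(imagePath))
        {
            Console.WriteLine("Image not found: " + imagePath);
            return;
        }

        Console.WriteLine("Image found: " + imagePath);
    }
}
```

Checking the file first produces a clearer message than the COM error that MODI raises for a missing file.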


Full Code
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.IO;


    namespace SolReadTextFromImage
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Set the sample image path
                String ImagePath = AppDomain.CurrentDomain.BaseDirectory + "SampleImage.jpg";

                ReadTextFromImage(ImagePath);

                // Set the path of the text file that will store the image content
                String StoreTextFilePath = AppDomain.CurrentDomain.BaseDirectory + "SampleText.txt";

                ReadTextFromImage(ImagePath, StoreTextFilePath);
            }

            #region Methods

            /// <summary>
            ///  Read text from an image and display it in the console
            /// </summary>
            /// <param name="ImagePath">Path of the image file</param>
            private static void ReadTextFromImage(String ImagePath)
            {
                try
                {
                    // Load the image and run OCR on it.
                    // The two flags ask MODI to auto-orient and straighten the image.
                    MODI.Document ModiObj = new MODI.Document();
                    ModiObj.Create(ImagePath);
                    ModiObj.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

                    // Retrieve the text recognized on the first page
                    MODI.Image ModiImageObj = (MODI.Image)ModiObj.Images[0];

                    System.Console.WriteLine(ModiImageObj.Layout.Text);

                    // Close the document without saving changes
                    ModiObj.Close(false);
                }
                catch (Exception)
                {
                    // Rethrow, preserving the original exception type and stack trace
                    throw;
                }
            }

            /// <summary>
            ///  Read text from an image and store it in a text file
            /// </summary>
            /// <param name="ImagePath">Path of the image file</param>
            /// <param name="StoreTextFilePath">Path of the text file to write</param>
            private static void ReadTextFromImage(String ImagePath, String StoreTextFilePath)
            {
                try
                {
                    // Load the image and run OCR on it
                    MODI.Document ModiObj = new MODI.Document();
                    ModiObj.Create(ImagePath);
                    ModiObj.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

                    // Retrieve the text recognized on the first page
                    MODI.Image ModiImageObj = (MODI.Image)ModiObj.Images[0];

                    // Save the image text in the text file (created or overwritten);
                    // the using block closes the file even if the write fails
                    using (StreamWriter WriteFileObj = new StreamWriter(StoreTextFilePath))
                    {
                        WriteFileObj.Write(ModiImageObj.Layout.Text);
                    }

                    // Close the document without saving changes
                    ModiObj.Close(false);
                }
                catch (Exception)
                {
                    // Rethrow, preserving the original exception type and stack trace
                    throw;
                }
            }

            #endregion
        }
    }
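Both methods above read only Images[0], which is the first page of the document. For a multi-page image such as a multi-page TIFF, the same calls can be repeated for every page. The sketch below is an illustration under assumptions rather than part of the original sample: the class name MultiPageOcr and the method ReadAllPages are hypothetical, the explicit Marshal.ReleaseComObject call is a common precaution rather than a documented requirement, and the code needs the same MODI COM reference added in Step 10.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class MultiPageOcr
{
    // Hypothetical helper: runs OCR once and returns the text of every page.
    public static string ReadAllPages(string imagePath)
    {
        MODI.Document doc = new MODI.Document();
        try
        {
            doc.Create(imagePath);
            doc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

            StringBuilder text = new StringBuilder();

            // Images is a zero-based collection with one entry per page.
            for (int i = 0; i < doc.Images.Count; i++)
            {
                MODI.Image page = (MODI.Image)doc.Images[i];
                text.AppendLine(page.Layout.Text);
            }

            return text.ToString();
        }
        finally
        {
            // Close without saving, then release the COM object
            // (assumption: this helps free the lock on the image file promptly).
            doc.Close(false);
            Marshal.ReleaseComObject(doc);
        }
    }

    static void Main(string[] args)
    {
        string imagePath = AppDomain.CurrentDomain.BaseDirectory + "SampleImage.jpg";
        Console.WriteLine(ReadAllPages(imagePath));
    }
}
```

For a single-page image such as the sample JPG, the loop runs once and the result matches the text read from Images[0].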




Refer by : http://kishor-naik-dotnet.blogspot.in/2012/05/cnet-read-text-from-image-in-cnet.html
