OCR for MultiPage TIFF Files

IronOCR's OcrInput automatically works with input TIFF files that conventional Tesseract cannot read.

Every frame of your TIFFs will be imported, creating a multipage IronOcr.OcrResult document.

Here is sample C# code that runs OCR on a TIFF file using IronTesseract:

using IronOcr;

class Program
{
    static void Main()
    {
        // Create an instance of IronTesseract. 
        // This object facilitates the OCR process.
        var Ocr = new IronTesseract();

        // Create an OcrInput object. This object is used to manage the documents or images to be processed.
        var inputs = new OcrInput();

        // Add a TIFF file to the OcrInput. The AddMultiFrameTiff method allows reading multi-page TIFF files. 
        inputs.AddMultiFrameTiff("example.tiff");

        // Use the Read method of IronTesseract to perform OCR on the input images.
        // This method returns an OcrResult object, which contains the recognized text.
        OcrResult result = Ocr.Read(inputs);

        // Output the result to the console. OcrResult.Text contains the recognized text.
        System.Console.WriteLine(result.Text);
    }
}

The equivalent VB.NET code:

Imports IronOcr

Friend Class Program
	Shared Sub Main()
		' Create an instance of IronTesseract. 
		' This object facilitates the OCR process.
		Dim Ocr = New IronTesseract()

		' Create an OcrInput object. This object is used to manage the documents or images to be processed.
		Dim inputs = New OcrInput()

		' Add a TIFF file to the OcrInput. The AddMultiFrameTiff method allows reading multi-page TIFF files. 
		inputs.AddMultiFrameTiff("example.tiff")

		' Use the Read method of IronTesseract to perform OCR on the input images.
		' This method returns an OcrResult object, which contains the recognized text.
		Dim result As OcrResult = Ocr.Read(inputs)

		' Output the result to the console. OcrResult.Text contains the recognized text.
		System.Console.WriteLine(result.Text)
	End Sub
End Class

Explanation of the Code:

  1. IronTesseract Instance: An instance of the IronTesseract class is created to facilitate OCR operations. This object handles the recognition process.

  2. OcrInput Object: This object is used to store the images or documents you want to perform OCR on. It can handle multiple formats and pages.

  3. AddMultiFrameTiff Method: This method is used to add a TIFF file to the OcrInput. It specifically supports multi-page TIFFs, allowing you to process each page as part of a single operation.

  4. OCR Operation: The Read method of the IronTesseract object is called with the OcrInput. This performs the OCR and returns the results as an OcrResult object.

  5. Display Results: The recognized text from the TIFF file is then written to the console using System.Console.WriteLine. The OcrResult.Text property contains the textual content extracted from the image.

By following these steps, you can perform OCR on TIFF files, including those with multiple pages, using the IronOCR library.
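
Because every frame of the TIFF becomes a page of the result, it can be useful to read the text one page at a time rather than as a single string. The following is a minimal sketch, not a definitive implementation. It reuses the calls from the sample above and additionally assumes that your IronOCR version exposes a Pages collection on OcrResult, with PageNumber and Text on each page; check the API reference for the release you have installed. The class name PerPageExample is hypothetical.

```csharp
using System;
using IronOcr;

class PerPageExample
{
    static void Main()
    {
        // Same setup as the sample above.
        var ocr = new IronTesseract();
        var inputs = new OcrInput();
        inputs.AddMultiFrameTiff("example.tiff");

        OcrResult result = ocr.Read(inputs);

        // Assumption: OcrResult.Pages holds one entry per TIFF frame,
        // and each page exposes PageNumber and Text.
        foreach (var page in result.Pages)
        {
            Console.WriteLine($"--- Page {page.PageNumber} ---");
            Console.WriteLine(page.Text);
        }
    }
}
```

Printing a separator per page makes it easier to check that every frame of the TIFF was imported and recognized.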