OCR for PDF Stream
IronOCR can also read a PDF supplied as a Stream.
In this example, IronPDF creates a PDF Stream, which is then passed to IronOCR for text recognition.
Note that IronOCR accepts a Stream only as input; it does not support exporting its output as a Stream.
// Import the necessary namespaces
using System; // Needed for Console
using System.IO;
using IronPdf;
using IronOcr;

class Program
{
    static void Main()
    {
        // Create a PDF document using IronPDF
        var pdfDocument = new HtmlToPdf().RenderHtmlAsPdf("<h1>Hello World</h1><p>This is a simple example of PDF generation.</p>");

        // Save the PDF as a stream
        using (MemoryStream pdfStream = new MemoryStream()) // Initialize a new memory stream
        {
            pdfDocument.Stream.CopyTo(pdfStream); // Copy the pdfDocument's stream to the memory stream
            pdfStream.Position = 0; // Reset the position of the memory stream to the beginning

            // Initialize IronOCR engine
            var Ocr = new IronTesseract();

            // Perform OCR on the PDF Stream
            using (var input = new OcrInput(pdfStream)) // Pass the PDF Stream to the OCR Input
            {
                var result = Ocr.Read(input); // Perform OCR and get the result
                Console.WriteLine(result.Text); // Output the recognized text to the console
            }
        }
    }
}
' Import the necessary namespaces
Imports System ' Needed for Console
Imports System.IO
Imports IronPdf
Imports IronOcr

Friend Class Program
    Shared Sub Main()
        ' Create a PDF document using IronPDF
        Dim pdfDocument = (New HtmlToPdf()).RenderHtmlAsPdf("<h1>Hello World</h1><p>This is a simple example of PDF generation.</p>")

        ' Save the PDF as a stream
        Using pdfStream As New MemoryStream() ' Initialize a new memory stream
            pdfDocument.Stream.CopyTo(pdfStream) ' Copy the pdfDocument's stream to the memory stream
            pdfStream.Position = 0 ' Reset the position of the memory stream to the beginning

            ' Initialize IronOCR engine
            Dim Ocr = New IronTesseract()

            ' Perform OCR on the PDF Stream
            Using input = New OcrInput(pdfStream) ' Pass the PDF Stream to the OCR Input
                Dim result = Ocr.Read(input) ' Perform OCR and get the result
                Console.WriteLine(result.Text) ' Output the recognized text to the console
            End Using
        End Using
    End Sub
End Class
Explanation:
- We use the `IronPdf` namespace to generate a PDF document from a simple HTML string.
- The PDF document is converted to a `Stream` by copying it into a `MemoryStream`.
- `IronOcr`'s `IronTesseract` engine is used to read and perform OCR on the stream.
- All memory streams are properly disposed of using `using` blocks to manage resources efficiently.
- The OCR output is printed to the console, showing the recognized text from the PDF.
This example effectively demonstrates how to use streams with IronOCR and IronPDF for text recognition tasks within a .NET application.
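The example resets pdfStream.Position to 0 after CopyTo. The sketch below illustrates why, using only System.IO and no IronPDF or IronOCR calls: CopyTo leaves the destination stream positioned at its end, so a consumer that reads from the current position finds no data until the stream is rewound. The class name StreamRewindDemo, the helper TryRead, and the placeholder payload are hypothetical names introduced for illustration only.

```csharp
using System;
using System.IO;
using System.Text;

class StreamRewindDemo
{
    // Hypothetical helper: tries to read up to `count` bytes from the
    // stream's current position and returns how many were actually read.
    public static int TryRead(Stream stream, int count)
    {
        var buffer = new byte[count];
        return stream.Read(buffer, 0, count);
    }

    static void Main()
    {
        // Placeholder bytes standing in for a rendered PDF (assumption).
        byte[] payload = Encoding.UTF8.GetBytes("stand-in for PDF bytes");

        using (var source = new MemoryStream(payload))
        using (var copy = new MemoryStream())
        {
            source.CopyTo(copy);

            // After CopyTo, the destination is positioned at its end.
            Console.WriteLine(copy.Position == copy.Length); // True

            // Reading now yields nothing, as a stream consumer would see.
            Console.WriteLine(TryRead(copy, 8)); // 0

            // Rewind so the next reader starts from the first byte.
            copy.Position = 0;
            Console.WriteLine(TryRead(copy, 8)); // 8
        }
    }
}
```

Resetting the position is only possible on seekable streams such as MemoryStream; for a non-seekable source, copying into a MemoryStream first, as the example above does, is one way to obtain a rewindable stream.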