Optical Character Recognition

Zia Optical Character Recognition (OCR) electronically detects textual characters in images or digital documents and converts them into machine-encoded text. Zia OCR can recognize text in 9 international languages and 10 Indian languages. You can find the list of supported languages and their language codes in the API documentation.

Note: Catalyst does not store any of the files you upload in its systems. The files you upload are used for one-time processing only. They are not used for ML model training purposes either. Catalyst components are fully compliant with all applicable data protection and privacy laws.

You must specify the path to the image or document file that needs to be processed for OCR, as shown in the sample code. You can also format the response you receive, as the sample code demonstrates. In addition to the recognized text, the response includes a confidence score, which indicates the accuracy of the processing.

Allowed file formats: .jpg, .jpeg, .png, .tiff, .bmp, .pdf

File size limit: 20 MB
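The format and size limits above can be checked locally before a file is sent for processing. The following is a minimal sketch, not part of the Catalyst SDK: the class name OcrInputCheck and its methods are hypothetical, and it assumes that 20 MB means 20 × 1024 × 1024 bytes and that extensions are matched case-insensitively.

```java
import java.io.File;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not an SDK class): pre-flight checks for OCR input files.
class OcrInputCheck {
    // Extensions listed under "Allowed file formats" in this document.
    static final Set<String> ALLOWED_EXTENSIONS =
            new HashSet<>(Arrays.asList("jpg", "jpeg", "png", "tiff", "bmp", "pdf"));

    // Assumption: 1 MB is treated as 1,048,576 bytes.
    static final long MAX_SIZE_BYTES = 20L * 1024 * 1024;

    // Returns true if the file name ends with one of the allowed extensions.
    static boolean hasAllowedExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension at all
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return ALLOWED_EXTENSIONS.contains(extension);
    }

    // Returns true if the size does not exceed the documented limit.
    static boolean isWithinSizeLimit(long sizeInBytes) {
        return sizeInBytes >= 0 && sizeInBytes <= MAX_SIZE_BYTES;
    }

    // Combines both checks for a file on disk.
    static boolean isAcceptable(File file) {
        return file.isFile()
                && hasAllowedExtension(file.getName())
                && isWithinSizeLimit(file.length());
    }
}
```

Calling OcrInputCheck.isAcceptable(file) before the OCR request lets you reject an unsupported or oversized file early with your own error message.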

You can specify the model type as OCR using setModelType(), and the language codes using setLanguageCode(). Both values are optional for the OCR model type: if they are not specified, the file is processed as the OCR model type by default, and the languages are detected automatically.
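If the languages come from a list rather than a hard-coded string, the value passed to setLanguageCode() can be assembled programmatically. The sketch below assumes the comma-separated format seen in the sample value "eng,tam"; the LanguageCodes class is hypothetical and not part of the SDK, and it does not check that the codes are ones Zia OCR supports.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not an SDK class): builds a comma-separated language code string.
class LanguageCodes {
    // Trims each code, drops blanks and duplicates, and joins the rest with commas.
    static String join(List<String> codes) {
        return codes.stream()
                .map(String::trim)
                .filter(code -> !code.isEmpty())
                .distinct()
                .collect(Collectors.joining(","));
    }
}
```

The result can then be passed to setLanguageCode() in place of a literal string. Take the valid codes from the API documentation.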

Ensure the following packages are imported:

    
    import com.zc.component.ml.ZCContent;
    import com.zc.component.ml.ZCLine;
    import com.zc.component.ml.ZCML;
    import com.zc.component.ml.ZCOCRModelType;
    import com.zc.component.ml.ZCOCROptions;
    import com.zc.component.ml.ZCParagraph;
    import java.io.File;
    import java.util.List;

    
    // Specify the file path (the file must be in one of the allowed formats)
    File file = new File("/Users/amelia-421/Desktop/MyImage.jpg");
    // Set the model type and languages
    ZCOCROptions options = ZCOCROptions.getInstance().setModelType(ZCOCRModelType.OCR).setLanguageCode("eng,tam");
    // Call getContent() with the file object to get the detected text in a ZCContent object
    ZCContent ocrContent = ZCML.getInstance().getContent(file, options);
    // To get individual paragraphs
    List<ZCParagraph> paragraphs = ocrContent.getParagraphs();
    for (ZCParagraph paragraph : paragraphs) {
        // To get individual lines in the paragraph
        List<ZCLine> paraLines = paragraph.lines;
        for (ZCLine line : paraLines) {
            // To get individual words in the line
            String words = line.words;
            // Raw line text
            String lineText = line.text;
        }
        // Raw paragraph text
        String paragraphText = paragraph.text;
    }
    // Raw image text
    String imageText = ocrContent.text;

Last Updated 2025-02-19 15:51:40 +0530
