# Optical Character Recognition

--------------------------------------------------------------------------------
title: "Introduction"
description: "Zia OCR is a ready-to-implement, AI-powered microservice that detects and recognizes textual content from images and documents."
last_updated: "2026-03-18T07:41:08.702Z"
source: "https://docs.catalyst.zoho.com/en/zia-services/help/optical-character-recognition/introduction/"
service: "Zia Services"
--------------------------------------------------------------------------------

# Optical Character Recognition

## Introduction

Optical Character Recognition (OCR) is a Catalyst Zia Services component that electronically detects handwritten or printed characters in images or digital documents, and converts the detected characters to machine-encoded text. Zia detects text in photos and scanned documents, breaks the text down into individual characters, and identifies the language it is in. The recognized text is presented as a JSON response, along with a confidence score that indicates its accuracy. You can code the Catalyst application to store the recognized data or process it further in any way you require.

Zia OCR can automatically detect and recognize text in {{%link href="/en/zia-services/help/optical-character-recognition/key-concepts/#supported-languages" %}}19 languages{{%/link%}}.

OCR is widely used in web and mobile applications that read content from scanned or photographed documents, flyers, menus, posters, signs, and other files containing text. The identified text can be stored digitally or used for further data processing.

Catalyst provides Zia OCR in the {{%bold%}}Java{{%/bold%}}, {{%bold%}}Node.js{{%/bold%}} and {{%bold%}}Python{{%/bold%}} SDK packages, and you can integrate it in your Catalyst web or Android application.
The Catalyst console provides easy access to code templates for these environments that you can implement in your application's code. You can also test Zia OCR by uploading sample images or documents that contain text in the console and obtain the recognized text, to get a better idea of Zia's accuracy and the OCR response format. You can refer to the {{%link href="/en/sdk/java/v1/zia-services/ocr" %}}Java SDK documentation{{%/link%}}, {{%link href="/en/sdk/nodejs/v2/zia-services/ocr/" %}}Node.js SDK documentation{{%/link%}} and {{%link href="/en/sdk/python/v1/zia-services/ocr" %}}Python SDK documentation{{%/link%}} for code samples of Zia OCR. Refer to the {{%link href="/en/api/code-reference/zia-services/ocr/#OCR" %}}API documentation{{%/link%}} to learn about the API available for OCR. You can learn more about the other components of Catalyst Zia Services from {{%link href="/en/zia-services" %}}this page{{%/link%}}. -------------------------------------------------------------------------------- title: "Key Concepts" description: "Zia OCR is a ready-to-implement, AI-powered microservice that detects and recognizes textual content from images and documents." last_updated: "2026-03-18T07:41:08.702Z" source: "https://docs.catalyst.zoho.com/en/zia-services/help/optical-character-recognition/key-concepts/" service: "Zia Services" -------------------------------------------------------------------------------- # Key Concepts Before you learn about the use cases and implementation of Optical Character Recognition, it's important to understand its fundamental concepts in detail. ### Text Recognition Process OCR systems in general follow a top-down approach to the text detection and identification process. When an image or a digital document is submitted to Zia OCR, the text detection and recognition process proceeds as follows: 1. Zia analyzes the structure of the image and divides it into blocks of contiguous sets of textual lines, like paragraphs. 
<br /> {{%note%}}{{%bold%}}Note:{{%/bold%}} A block could also contain pictorial content. However, Zia OCR does not identify any content that is not text, such as diagrams, symbols, or images. {{%/note%}}
2. Zia then breaks the blocks down further and identifies individual lines of text.
3. The lines of text are then divided into words, and each word is broken down into individual characters.
4. Zia compares the characters it has detected with its dataset and runs advanced algorithms and analysis to identify the characters and recognize words based on the character groupings.
5. Zia also identifies the language the content is in by processing it through volumes of probabilities and hypotheses using Intelligent Character Recognition (ICR) technology.
6. The processed and recognized text is finally returned to the user as either a JSON or a document response.

### Model Types

A model type is a key attribute that describes the type of OCR feature supported by Catalyst. All general image and document files that you process with the common optical character recognition feature fall under the OCR model type. Specify this as the model type whenever you process an image or a document through the Catalyst OCR API or SDK.

Catalyst also enables you to process ID proofs and official documents, and perform secure identity checks through an independent feature called {{%link href="/en/zia-services/help/identity-scanner/introduction" %}}Identity Scanner{{%/link%}}. These fall under their respective model types of AADHAAR, PAN, CHEQUE and PASSBOOK.

### Supported Languages

The {{%badge%}}OCR{{%/badge%}} models can detect and recognize textual content in 9 international languages and 10 Indian languages.

#### Indian Languages

1. English
2. Hindi
3. Bengali
4. Marathi
5. Telugu
6. Tamil
7. Gujarati
8. Urdu
9. Kannada
10. Malayalam
11. Sanskrit

#### Additional International Languages

1. Arabic
2. Chinese
3. French
4. Italian
5. Japanese
6. Portuguese
7. Romanian
8. Spanish

If the user doesn't specify the language, Zia detects it automatically.

Zia can recognize handwritten content as long as the text is legible, clear, and uses a standard font structure. However, it cannot recognize any non-textual content such as images or diagrams.

### Input Format

Zia OCR supports input files in the following formats for processing:

1. _.jpg/.jpeg_
2. _.png_
3. _.tiff_
4. _.bmp_
5. _.pdf_

You can provide an upload option that lets the user submit an image or document file from the device's memory to the Catalyst application. You can also code the Catalyst application to use the end user device's camera to capture a photo with textual content, and process the image as the input file.

The input provided using the API contains the source file, the {{%link href="#supported-languages" %}}language{{%/link%}} of the text to be recognized (optional), and the {{%link href="#model-types" %}}model type{{%/link%}} (optional). You can check the request format from the {{%link href="/en/api/code-reference/zia-services/ocr/#OCR" %}}API documentation{{%/link%}}.

For better results, the user must follow these guidelines while providing the input:

* Avoid providing blurred or unrecognizable text in images.
* Ensure that the text in an image file is clear, visible, and legible.
* If handwritten text is present in an image file, ensure that it uses a standard font structure.
* The image size must not be too small.
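
The input requirements above can be checked on the client side before a file is submitted for OCR. The following is a minimal sketch, not part of the Catalyst SDK: the function name `check_ocr_input` is hypothetical, the extension set is an assumption drawn from the console upload note in this documentation, and the 20 MB ceiling reuses the console upload limit as an assumed guard. Adjust both constants to match what your project accepts.

```python
from pathlib import Path
from typing import List

# Assumption: extension set taken from the console upload note in this documentation.
SUPPORTED_EXTENSIONS = frozenset({".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".pdf"})
# Assumption: the 20 MB console upload limit is reused here as a client-side guard.
MAX_FILE_SIZE_BYTES = 20 * 1024 * 1024


def check_ocr_input(filename: str, size_bytes: int) -> List[str]:
    """Return the problems found with an input file; an empty list means it passes."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file format: {extension or '(none)'}")
    if size_bytes <= 0:
        problems.append("file is empty")
    elif size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file exceeds 20 MB")
    return problems
```

A check like this only covers format and size. Legibility and image resolution still depend on the file's content and cannot be verified from its name or size alone.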
### Response Format Zia returns the response of OCR processing in the following ways: * {{%bold%}}In the Console{{%/bold%}}<br /> When you upload a sample image or a document file to be processed in the console, it will return the response in two formats:<br /> * {{%bold%}}Document response:{{%/bold%}} This returns a {{%link href="/en/zia-services/help/optical-character-recognition/implementation/#test-optical-character-recognition-in-the-catalyst-console" %}}formatted readable text{{%/link%}} that is visually segregated into lines and paragraphs based on the original content, along with a confidence score for the {{%badge%}}OCR{{%/badge%}} model type in a percentage value. * {{%bold%}}JSON response:{{%/bold%}} This returns the recognized text in JSON format along with the confidence score for the {{%badge%}}OCR{{%/badge%}} model type. * {{%bold%}}Using the SDKs{{%/bold%}}<br /> When you send an image or document file using an API request, you will receive a JSON response containing the recognized text in the same format mentioned above. You can customize the formatting of the JSON response in your code using SDKs. For example, you can return separate paragraphs or individual words from a line as the response. For more information, refer to the {{%link href="/en/sdk/java/v1/zia-services/ocr" %}}Java{{%/link%}}, {{%link href="/en/sdk/nodejs/v2/zia-services/ocr" %}}Node.js{{%/link%}} and {{%link href="/en/sdk/python/v1/zia-services/ocr" %}}Python{{%/link%}} SDK documentation. -------------------------------------------------------------------------------- title: "Benefits" description: "Zia OCR is a ready-to-implement, AI-powered microservice that detects and recognizes textual content from images and documents." 
last_updated: "2026-03-18T07:41:08.702Z"
source: "https://docs.catalyst.zoho.com/en/zia-services/help/optical-character-recognition/benefits/"
service: "Zia Services"
--------------------------------------------------------------------------------

# Benefits

1. {{%bold%}}Automatic Language Detection{{%/bold%}}<br /> Zia can operate in an automatic mode and run its algorithms to detect and identify the language of the text in an input file. However, to speed up the text recognition process, the user can specify the language of the text, if they know it, while submitting it for OCR processing. This enables Zia to restrict its processing and analysis to a limited dataset, leading to quicker and more accurate results.
2. {{%bold%}}Rapid Performance{{%/bold%}}<br /> Zia OCR processes files and generates the results in a fast and effective manner. Catalyst ensures a high throughput of data transmission and a minimal latency in serving requests. The quick response time enhances your application's performance, and provides a satisfying experience for the end user.
3. {{%bold%}}Highly Accurate and Reliable Results{{%/bold%}}<br /> Zia is an AI-driven assistant that undergoes repeated systematic training to generate results with higher accuracy and a lower error margin. The AI is trained using various machine learning techniques to perform complex computations and analysis. The training process is highly rigorous: it studies and analyzes large volumes of data, which helps ensure that the results generated are precise, accurate, and reliable.
4. {{%bold%}}Seamless Integration{{%/bold%}}<br /> You can easily implement OCR in your application without having to worry about the underlying logic or the backend set-up. You can implement the ready-made code templates provided for the Java, Node.js, and Python platforms in any of your Catalyst applications that require the use of OCR.
5. {{%bold%}}Testing in the Console{{%/bold%}}<br /> The testing feature in the console enables you to verify the efficiency of Zia OCR. You can upload sample images and documents with text, and view the results. This gives you an idea of the format and accuracy of the response that will be generated when you implement it in your application.

--------------------------------------------------------------------------------
title: "Use Cases"
description: "Zia OCR is a ready-to-implement, AI-powered microservice that detects and recognizes textual content from images and documents."
last_updated: "2026-03-18T07:41:08.702Z"
source: "https://docs.catalyst.zoho.com/en/zia-services/help/optical-character-recognition/use-cases/"
service: "Zia Services"
--------------------------------------------------------------------------------

# Use Cases

Text detection and recognition technologies are implemented in a wide range of applications and scenarios. The following are some use cases for Zia OCR:

* A text conversion Android application implements Zia OCR to convert handwritten and hard copy text content into digital documents. The application scans a picture of the source text and submits it for OCR processing, where the content is broken down and individual characters are recognized and grouped. Catalyst then produces the final intelligible result within seconds, and the application displays it to the end user. The user can then edit, format, or search the text file like any other digital document.
* An application linked to traffic cameras, which scans and reads the license plate registration numbers of traffic rule offenders, implements Zia OCR to read text from captured images of license plates. The images are uploaded automatically, and Zia performs OCR processing on them to decipher the registration numbers. The recognized registration numbers are processed further to obtain the identity of the vehicle owner.
The application also stores the data in {{%link href="/en/cloud-scale/help/data-store/introduction" %}}Catalyst Data Store{{%/link%}} tables.

Some other examples where Zia OCR can be implemented include:

* An application that recognizes text in image files and converts them to PDFs with the text, or vice-versa.
* An application that enables you to quickly digitize your Aadhaar cards and bank passbooks and store them as PDF files.
* A crime investigation application that scans crime scene photos for textual content such as from street signs, billboards, or graffiti.
* Applications that digitize magazines, journals, contracts, pamphlets or posters by scanning hard copies of them.

<br />

--------------------------------------------------------------------------------
title: "Implementation"
description: "Zia OCR is a ready-to-implement, AI-powered microservice that detects and recognizes textual content from images and documents."
last_updated: "2026-03-18T07:41:08.702Z"
source: "https://docs.catalyst.zoho.com/en/zia-services/help/optical-character-recognition/implementation/"
service: "Zia Services"
--------------------------------------------------------------------------------

# Implementation

This section only covers working with OCR in the Catalyst console. Refer to the {{%link href="/en/sdk/java/v1/zia-services/ocr/" %}}SDK{{%/link%}} and {{%link href="/en/api/code-reference/zia-services/ocr" %}}API{{%/link%}} documentation sections for implementing Zia OCR in your application's code.

As mentioned earlier, you can access the code templates that enable you to integrate OCR in your Catalyst application from the console. You can also test the feature there by uploading sample images and documents and obtaining the recognized text.

### Access Optical Character Recognition

To access Optical Character Recognition in your Catalyst console:

1. Navigate to {{%bold%}}Zia services{{%/bold%}} in the left pane of the Catalyst console and click {{%bold%}}OCR{{%/bold%}} to access the feature.<br />
2. Click {{%bold%}}Try a Demo{{%/bold%}} in the Optical Character Recognition feature page.<br />

### Test Optical Character Recognition in the Catalyst Console

You can test OCR by either selecting a sample image or PDF file from Catalyst or by uploading your own file.

To process a sample file and obtain the results:

1. Click {{%bold%}}Select a Sample Image{{%/bold%}} in the box.<br />
2. Select an image or a PDF file from the samples provided.<br />

OCR processes the file, then detects and identifies the textual content in it. Because it is a sample file, Catalyst provides the language of the text and the model type automatically.<br />

The recognized text is displayed in the console under the _Result_ section.<br />

You can view the {{%link href="/en/zia-services/help/optical-character-recognition/key-concepts/#response-format" %}}JSON response{{%/link%}} by clicking {{%bold%}}View Response{{%/bold%}}.<br />

To upload your own image or PDF file with text:

1. Click {{%bold%}}Upload{{%/bold%}} under the _Result_ section.<br /> If you're opening Optical Character Recognition after you have closed it, click {{%bold%}}Browse Files{{%/bold%}} in this box.<br />
2. Upload a file from your local system.<br /> {{%note%}}{{%bold%}}Note:{{%/bold%}} The file must be in ._jpg_/._jpeg_, ._png_, ._bmp_, ._tiff_, or ._pdf_ format. The file size must not exceed 20 MB.{{%/note%}}
3. Select the {{%link href="/en/zia-services/help/optical-character-recognition/key-concepts/#model-types" %}}model type{{%/link%}} and the {{%link href="/en/zia-services/help/optical-character-recognition/key-concepts/#supported-languages" %}}languages{{%/link%}} in the file's text, if you are aware of them. You can select {{%bold%}}General{{%/bold%}} for the {{%badge%}}OCR{{%/badge%}} model.
You can select multiple languages, if the file contains text in multiple languages. 4. Click {{%bold%}}Proceed{{%/bold%}}. The console will process the file and display the recognized textual content, and the confidence score if it is of the {{%badge%}}OCR{{%/badge%}} model type. You can copy the recognized text using the copy icon. You can check the JSON response in a similar way. ### Access Code Templates for Optical Character Recognition You can implement Optical Character Recognition in your Catalyst application using the code templates provided by Catalyst for {{%link href="/en/sdk/java/v1/zia-services/ocr/" %}}Java{{%/link%}}, {{%link href="/en/sdk/nodejs/v2/zia-services/ocr/" %}}Node.js{{%/link%}} and {{%link href="/en/sdk/python/v1/zia-services/ocr/" %}}Python{{%/link%}} platforms. You can access them from the section below the test window. Click either the {{%bold%}}Java SDK{{%/bold%}}, {{%bold%}}NodeJS SDK{{%/bold%}} or {{%bold%}}Python SDK{{%/bold%}} tab, and copy the code using the copy icon. You can paste this code in your web or Android application's code wherever you require. <br /> In Java, you can process the input file as a new {{%badge%}}File{{%/badge%}}, specify the model type using {{%badge%}}ZCOCRModelType{{%/badge%}} and the languages using {{%badge%}}setLanguageCode{{%/badge%}}. Refer to the {{%link href="/en/api/code-reference/zia-services/ocr/#OCR" %}}API documentation{{%/link%}} for the keys of the supported languages and model types. As mentioned earlier, you can format the JSON response that you receive. The Java code enables you to obtain specific paragraphs, individual lines in a paragraph, or individual words in a line. The Node.js code processes the input file as the object {{%badge%}}ocrPromise{{%/badge%}}. You can provide the input file name, set the model type using {{%badge%}}modelType{{%/badge%}}, and the languages using {{%badge%}}language{{%/badge%}}. 
For Python, pass the file path, model type, and languages as arguments to the {{%badge%}}extract_optical_characters(){{%/badge%}} method. The model type and language values are optional. If you omit them, the model type defaults to {{%badge%}}OCR{{%/badge%}} and the languages are detected automatically.
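
The Python flow described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the official template: `build_ocr_options`, `recognize_text`, and `accept_result` are hypothetical helpers, the `zia` argument stands for whatever Zia client object your Catalyst SDK setup provides, and the call shape (an open file plus an options dictionary with `modelType` and `language` keys), the comma-joined language keys, and the `text` and `confidence` response fields are all assumptions. Confirm each of them against the Python SDK and API documentation before relying on this.

```python
from typing import Any, Dict, List, Optional


def build_ocr_options(model_type: str = "OCR",
                      languages: Optional[List[str]] = None) -> Dict[str, str]:
    """Build the optional settings sent with the input file."""
    options = {"modelType": model_type}
    if languages:
        # Assumption: multiple language keys are joined with commas.
        options["language"] = ",".join(languages)
    # When no language is given, the key is omitted so Zia can auto-detect it.
    return options


def recognize_text(zia: Any, file_path: str, model_type: str = "OCR",
                   languages: Optional[List[str]] = None) -> Dict[str, Any]:
    """Send a file for OCR through a Zia client object and return its response."""
    with open(file_path, "rb") as input_file:
        # Assumption: the method accepts an open file and an options dictionary.
        return zia.extract_optical_characters(
            input_file, build_ocr_options(model_type, languages)
        )


def accept_result(response: Dict[str, Any],
                  min_confidence: float = 80.0) -> Optional[str]:
    """Return the recognized text only if the confidence score is high enough."""
    # Assumption: the response carries "text" and a percentage "confidence".
    confidence = float(response.get("confidence", 0))
    if confidence >= min_confidence:
        return response.get("text", "")
    return None
```

Passing the client in as a parameter keeps the helpers independent of how the SDK is initialized, and the confidence threshold of 80 is an arbitrary illustrative value that you would tune for your own documents.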