Optical Character Recognition
Zia Optical Character Recognition (OCR) detects textual characters in images or digital documents and converts them into machine-encoded text. Zia OCR can recognize text in 9 international languages and 10 Indian languages. You can find the list of supported languages and their language codes in the API documentation.
You must specify the path to the image or document file that needs to be processed for OCR. The response includes the recognized text along with a confidence score, which indicates how accurate the recognition is.
Allowed file formats: .jpg, .jpeg, .png, .tiff, .bmp, .pdf
File size limit: 20 MB
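The format and size limits above can be checked locally before a file is sent for OCR. The following is a minimal sketch, not part of the SDK: the helper names checkOcrInput and checkOcrFile are hypothetical, and the size check assumes that 1 MB means 1024 * 1024 bytes.

```javascript
const fs = require('fs');
const path = require('path');

// Limits taken from the lists above.
const ALLOWED_EXTENSIONS = ['.jpg', '.jpeg', '.png', '.tiff', '.bmp', '.pdf'];
// Assumption: 1 MB is treated as 1024 * 1024 bytes.
const MAX_FILE_SIZE_BYTES = 20 * 1024 * 1024;

// Hypothetical helper: returns a list of problems found,
// or an empty array if the file looks acceptable for OCR.
function checkOcrInput(fileName, sizeInBytes) {
  const problems = [];
  const ext = path.extname(fileName).toLowerCase();
  if (!ALLOWED_EXTENSIONS.includes(ext)) {
    problems.push(`Unsupported file format: ${ext || '(none)'}`);
  }
  if (sizeInBytes > MAX_FILE_SIZE_BYTES) {
    problems.push(`File exceeds 20 MB: ${sizeInBytes} bytes`);
  }
  return problems;
}

// Hypothetical convenience wrapper that reads the file size from disk.
function checkOcrFile(filePath) {
  return checkOcrInput(filePath, fs.statSync(filePath).size);
}
```

An empty array from checkOcrFile means the file passed both checks. The service may apply further validation of its own, so this only catches the two limits listed here.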
You pass the file, the model type, and the languages as arguments to the extractOpticalCharacters() method. The file is supplied as a read stream created from its path, as the example shows. The model type and language values are optional: the model type defaults to OCR, and the languages are detected automatically if they are not specified.
The zia reference used below is defined on the component instance page. The promise returned here resolves to a JSON object.
let fs = require('fs'); // Define the file stream for file attachments
let result = await zia.extractOpticalCharacters(
    fs.createReadStream('/Users/amelia-421/Desktop/MyDoc.jpg'),
    { language: 'eng', modelType: 'OCR' }
);
console.log(result);
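Because the model type and language are optional, the options object can be built from only the values the caller actually supplies. The sketch below uses a hypothetical helper, buildOcrOptions, that is not part of the SDK. It assumes that leaving a key out of the options object is how the defaults described above are triggered, which should be confirmed against the API documentation.

```javascript
// Hypothetical helper (not part of the SDK): builds the options object,
// leaving out any key the caller did not supply so that the documented
// defaults (OCR model type, automatic language detection) can apply.
function buildOcrOptions({ language, modelType } = {}) {
  const options = {};
  if (language) options.language = language;
  if (modelType) options.modelType = modelType;
  return options;
}

// Usage, assuming `zia` and `fs` are set up as in the example above:
// let result = await zia.extractOpticalCharacters(
//     fs.createReadStream(filePath),
//     buildOcrOptions({ language: 'eng' })
// );
```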
A sample response that you will receive is shown below. The response is the same for both versions of Node.js.
{
    "confidence": 95,
    "text": "This is a lot of 12 point text to test the\nocr code and see if it works on all types\nof file format\n\nThe quick brown dog jumped over the\nlazy fox. The quick brown dog jumped\nover the lazy fox. The quick brown dog\njumped over the lazy fox. The quick\nbrown dog jumped over the lazy fox"
}
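Once the response arrives, the confidence score can be used to decide whether the recognized text is trustworthy enough to use. The following is a minimal sketch: the helper summarizeOcrResult and the threshold of 80 are assumptions chosen for illustration, not values defined by Zia. The sample object reuses the first lines of the response shown above.

```javascript
// Assumed threshold for illustration only; tune it for your use case.
const MIN_CONFIDENCE = 80;

// Hypothetical helper: splits the recognized text into non-empty lines
// and flags whether the confidence score meets the threshold.
function summarizeOcrResult(result, minConfidence = MIN_CONFIDENCE) {
  const lines = String(result.text || '')
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return {
    // Number() also handles a confidence value delivered as a string.
    reliable: Number(result.confidence) >= minConfidence,
    lines,
  };
}

// Shortened version of the sample response shown above.
const sample = {
  confidence: 95,
  text: 'This is a lot of 12 point text to test the\nocr code and see if it works on all types\nof file format',
};

const summary = summarizeOcrResult(sample);
console.log(summary.reliable, summary.lines.length);
```

Results that fall below the threshold could be routed to manual review instead of being discarded.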
Last Updated 2023-09-03 01:06:41 +0530