Key Concepts

Before you learn about the use cases and implementation of Object Recognition, it’s important to understand its fundamental concepts in detail.

Object Recognition Process

Zia’s object recognition process can be broken down into various stages:

  1. Object detection: Object detection identifies instances of semantic objects in an image. Combined with object localization, it can detect and localize not only the most prominent object in the image, but multiple objects in the same image.
  2. Object localization: Object localization draws a bounding box around each detected object. A bounding box is the set of coordinates of a rectangular border that encloses the detected object. This helps Zia map the location of the object in the image.
  3. Image classification: Image classification uses deep learning, a subset of machine learning, to classify the detected objects in an image into categories. Zia can identify object classes such as animals, everyday items, food, sports equipment, and vehicles in an image. Once an object is assigned to a class based on its visual content, Zia runs further algorithms to accurately recognize the object and deliver the results.
  4. Object recognition: Zia breaks the object class down further to perform semantic recognition of specific entities. By combining object detection, object localization, and image classification, the specific object is recognized and returned in the output. Zia can recognize 80 specific object types, some of which include: person, car, dog, chair, traffic light, knife, umbrella, cellphone, book, cake, baseball bat, laptop, aeroplane, stop_sign, parking meter.

This entire process follows a divide-and-conquer approach: the detected objects are compared with the representations stored in the AI’s memory, and hypotheses are repeatedly generated and tested. Since an object can appear in different poses, contexts, positions, and orientations, Zia is constantly trained to generate accurate results.
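
Conceptually, each stage contributes one piece of the final result for every object: a class label from classification and recognition, a bounding box from localization, and a confidence value. The Python sketch below is only an illustration of that combined output; the field names are hypothetical and do not represent Zia's internal representation or response schema.

```python
from dataclasses import dataclass

@dataclass
class DetectedObject:
    """Illustrative record for one recognized object; field names are hypothetical."""
    object_type: str   # recognized class, e.g. "dog" or "traffic light"
    x1: float          # bounding-box coordinates produced by object localization
    y1: float
    x2: float
    y2: float
    confidence: float  # confidence score between 0 and 1

# Example: one recognition result for a single object in an image
result = DetectedObject(object_type="dog", x1=34.0, y1=120.0, x2=410.0, y2=560.0, confidence=0.92)
print(f"{result.object_type} at ({result.x1}, {result.y1})-({result.x2}, {result.y2}), "
      f"{result.confidence * 100:.0f}% confidence")
```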

Input Format

Zia Object Recognition detects and recognizes objects by analyzing image files. Object Recognition supports the following input file formats:

  1. .jpg/.jpeg
  2. .png

You can provide an option for the user to upload an image file from their device’s storage to the Catalyst application. You can also code the Catalyst application to capture a photo using the end user device’s camera and process that image as the input file.

You can check the request format of the API from the API documentation.
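
As a rough sketch of what such a request could look like, the Python snippet below uploads an image as a multipart form request. The endpoint URL, authorization header, and form field name are placeholders, not the documented contract; always follow the request format given in the API documentation.

```python
import requests

# Placeholder values: substitute the actual endpoint and authentication details
# from the Catalyst API documentation.
ENDPOINT = "https://api.example.com/object-recognition"  # hypothetical URL
HEADERS = {"Authorization": "Bearer <access_token>"}     # hypothetical auth header

def recognize_objects(image_path: str) -> dict:
    """Upload an image file and return the parsed JSON response."""
    with open(image_path, "rb") as image_file:
        files = {"image": image_file}  # hypothetical form field name
        response = requests.post(ENDPOINT, headers=HEADERS, files=files, timeout=30)
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    print(recognize_objects("sample.jpg"))
```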

The user must follow these guidelines while providing the input, for better results:

  • Avoid providing blurred or corrupted images.
  • Ensure the objects in the image are clear, visible, and distinct; Object Recognition identifies such objects more accurately.
  • The file size must not exceed 10 MB.
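
A simple client-side check of these guidelines before uploading can save a failed request. The Python sketch below validates the file extension against the supported formats and enforces the 10 MB limit; it is a generic pre-check, not part of any Catalyst SDK.

```python
import os

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}
MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024  # 10 MB limit from the input guidelines

def validate_input_image(path: str) -> None:
    """Raise ValueError if the file violates the supported formats or the size limit."""
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported format '{extension}'; use .jpg/.jpeg or .png")
    if os.path.getsize(path) > MAX_FILE_SIZE_BYTES:
        raise ValueError("File exceeds the 10 MB size limit")

if __name__ == "__main__":
    validate_input_image("sample.jpg")  # raises if the file is unsupported or too large
```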

Response Format

Zia Object Recognition returns the response in the following ways:

  • In the Console
    When you upload a sample image in the console, it will return the results in two response formats:
    • Document response: The textual response shows the object types of all the recognized objects in the image, along with the confidence level of each recognition as a percentage value.
    • JSON response: The JSON response contains the object type, the coordinates of the object in the image, and the confidence score, a value between 0 and 1, for each object recognized in the image. The confidence score between 0 and 1 can be equated to percentage values as follows:
      Confidence Level (in percentage)    Confidence Score (value between 0 and 1)
      0-9                                 0.0
      10-19                               0.0
      20-29                               0.0
      30-39                               0.0
      40-49                               0.01
      50-59                               0.12
      60-69                               0.23
      >70                                 0.63
  • Using the SDKs
    When you send an image file using an API request, you will receive a JSON response containing the results in the format specified above.

You can check the JSON response format from the API documentation.
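
As an illustration of consuming such a response, the Python sketch below walks through a parsed result and converts each 0 to 1 confidence score into a percentage. The shape of the sample dictionary and its keys are assumed for demonstration; the authoritative JSON structure is in the API documentation.

```python
# Assumed response shape, for illustration only; refer to the API documentation
# for the actual JSON returned by Object Recognition.
sample_response = {
    "objects": [
        {"object_type": "dog", "co_ordinates": [34, 120, 410, 560], "confidence": 0.92},
        {"object_type": "person", "co_ordinates": [450, 80, 700, 600], "confidence": 0.81},
    ]
}

for detected in sample_response["objects"]:
    percentage = detected["confidence"] * 100  # scale the 0-1 score to a percentage
    print(f"{detected['object_type']}: box {detected['co_ordinates']}, "
          f"confidence {percentage:.1f}%")
```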


Last Updated 2023-05-09 17:03:08 +0530