External Data Connectors

  1. Amazon AWS S3

    Amazon Simple Storage Service (Amazon S3) is a scalable, high-speed, web-based cloud storage service designed for online backup and archiving of data and applications on Amazon Web Services.

    The AWS S3 data connector can be used to access objects stored in Amazon S3 buckets. Objects in S3 can be public or private. QuickML does not need any authentication keys to fetch public objects. If the bucket has restricted access, QuickML requires an access key and a secret key.

    Details required to import data:

    • Bucket name : The bucket in which the data is stored.
    • Object name : Name of the file that needs to be imported.
    • Region : AWS region in which your bucket is located.
    • Access Key : The access key of the AWS IAM user (required only for restricted buckets).
    • Secret Key : The secret key of the AWS IAM user (required only for restricted buckets).
    • File Type : The format of the file to be imported.
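
    The details above can be sketched in Python as follows. This is a minimal illustration, not QuickML's implementation: the S3ImportDetails class and the fetch_object function are hypothetical names, and the sketch assumes the third-party boto3 package is installed. Public objects are requested without signing, while restricted buckets use the access key and secret key.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class S3ImportDetails:
    """Hypothetical container mirroring the fields listed above."""

    bucket_name: str
    object_name: str
    region: str
    file_type: str
    access_key: Optional[str] = None
    secret_key: Optional[str] = None

    def is_public(self) -> bool:
        """Return True when no keys are given (public object)."""
        # The two keys only make sense as a pair.
        if bool(self.access_key) != bool(self.secret_key):
            raise ValueError("access_key and secret_key must be provided together")
        return not self.access_key


def fetch_object(details: S3ImportDetails) -> bytes:
    """Download the object described by `details` and return its raw bytes."""
    # Imported here so the class above can be used without boto3 installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    if details.is_public():
        # Public objects: send unsigned (anonymous) requests.
        client = boto3.client(
            "s3",
            region_name=details.region,
            config=Config(signature_version=UNSIGNED),
        )
    else:
        client = boto3.client(
            "s3",
            region_name=details.region,
            aws_access_key_id=details.access_key,
            aws_secret_access_key=details.secret_key,
        )
    response = client.get_object(Bucket=details.bucket_name, Key=details.object_name)
    return response["Body"].read()
```

    Keeping the public/private decision in one place (is_public) means a half-filled key pair is rejected before any request is sent.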

    To import data from S3:

    Create an AWS IAM user as described in the AWS IAM documentation. While creating the user, make sure you grant the permissions needed to access the S3 service. At the end of the IAM user creation process, you will get an “Access Key” and a “Secret Key”. Store the “Secret Key” in a secure place, because it cannot be retrieved again.

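    If you already have an IAM user with permission to access S3 objects, you can create an access key for that user from the AWS console. The secret key of an existing access key cannot be viewed again, so create a new access key if the secret key has been lost.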
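
    As an illustration of the S3 permissions an IAM user needs, the following sketch builds a read-only IAM policy document for a single bucket. It is one possible minimal policy, assuming the connector only needs to read objects; the function name read_only_s3_policy and the bucket name are placeholders.

```python
import json


def read_only_s3_policy(bucket_name: str) -> dict:
    """Build an IAM policy that allows reading objects from one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # Object-level permission applies to the keys inside the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


if __name__ == "__main__":
    # "my-data-bucket" is a placeholder bucket name.
    print(json.dumps(read_only_s3_policy("my-data-bucket"), indent=2))
```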

  2. Microsoft Azure Object Storage

    Microsoft Azure offers multiple storage solutions, such as SQL databases, NoSQL databases, and file storage solutions. File storage solutions are mainly used to store and retrieve unstructured data (text, CSV, etc.) using streaming. These include Blob Storage, Queue Storage, and Disk Storage.

    Blob Storage:

    Blob storage is an object storage solution for the cloud. It stores data as blobs (binary large objects) of three types, chosen as per the requirement: block blobs, page blobs, and append blobs. Among them, block blobs are the most commonly used type for storing objects.

    Details required to import data:

    • Blob Name : Name of the file in the Azure storage account container that needs to be imported.
    • Container Name : Name of the container in which the file is stored.
    • Connection String : Authentication string passed to Microsoft Azure to validate the account and user details.
    • File Type : Type of the file that is being imported.
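
    An Azure storage connection string is a semicolon-separated list of Name=Value pairs. The sketch below checks one before it is used. The function name is hypothetical, the sample values are placeholders rather than real credentials, and the set of required settings is an assumption that fits the common account-key form of connection string.

```python
def parse_connection_string(connection_string: str) -> dict:
    """Split an Azure storage connection string into a dict of settings."""
    settings = {}
    for part in connection_string.split(";"):
        if not part.strip():
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only: base64 account keys end with "=" padding.
        name, sep, value = part.partition("=")
        if not sep:
            raise ValueError("malformed segment in connection string")
        settings[name.strip()] = value
    # Assumed minimum for an account-key connection string.
    missing = {"AccountName", "AccountKey"} - set(settings)
    if missing:
        raise ValueError(f"missing settings: {sorted(missing)}")
    return settings


if __name__ == "__main__":
    sample = (
        "DefaultEndpointsProtocol=https;AccountName=exampleaccount;"
        "AccountKey=ZmFrZS1rZXk=;EndpointSuffix=core.windows.net"
    )
    print(parse_connection_string(sample)["AccountName"])
```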

    To get Connection String from Azure Portal:

    1. Log in to the Azure portal.
    2. Search for and select the storage account from which the data needs to be imported.
    3. Open the access keys of the storage account and copy the connection string provided for key 1 or key 2.
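
    Once the connection string is available, reading a blob could look like the sketch below. It assumes the third-party azure-storage-blob package. The read_blob function and its service_factory parameter are hypothetical names, introduced here so that the function can be exercised without a real storage account.

```python
from typing import Callable, Optional


def read_blob(
    connection_string: str,
    container_name: str,
    blob_name: str,
    service_factory: Optional[Callable] = None,
) -> bytes:
    """Return the content of one blob as bytes."""
    if service_factory is None:
        # Imported lazily so the sketch loads without the SDK installed.
        from azure.storage.blob import BlobServiceClient

        service_factory = BlobServiceClient.from_connection_string
    service = service_factory(connection_string)
    blob_client = service.get_blob_client(container=container_name, blob=blob_name)
    return blob_client.download_blob().readall()
```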

  3. Google Cloud Storage

    Google Cloud Storage is a secure, high-performance object storage solution for archiving and storing files online. Using the connector provided, data stored in any of the supported file formats can be imported into QuickML.

    Details required to import data:

    • Blob Name : Name of the file in Google Cloud Storage that needs to be imported.
    • Bucket Name : The bucket in which the data is stored.
    • File Type : Type of the file that is being imported.
    • Authentication JSON file : JSON file containing the authentication details required to access the object.
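
    A minimal sketch of how these details fit together is shown below, assuming the third-party google-cloud-storage package. The read_gcs_object function and its client_factory parameter are hypothetical names; the factory hook exists only so the function can be tried without real credentials.

```python
from typing import Callable, Optional


def read_gcs_object(
    key_file_path: str,
    bucket_name: str,
    blob_name: str,
    client_factory: Optional[Callable] = None,
) -> bytes:
    """Return the content of one Cloud Storage object as bytes."""
    if client_factory is None:
        # Imported lazily so the sketch loads without the SDK installed.
        from google.cloud import storage

        client_factory = storage.Client.from_service_account_json
    # The client authenticates with the service account key file.
    client = client_factory(key_file_path)
    return client.bucket(bucket_name).blob(blob_name).download_as_bytes()
```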

    Google service account details are required to request data from the storage server. The steps to get the authentication details are given below.

    Steps to get the authentication service JSON file:

    1. In the IAM & Admin section of the Google Cloud console, go to Service Accounts.
    2. Create a service account if needed and give it access to the required projects.
    3. For the service account, go to the Actions tab and select Create Key.
    4. Save the key as a JSON file. This file is the authentication JSON file required to import data.
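
    Before using a downloaded key file, it can help to confirm that it really is a service account key. The sketch below is one way to do that with the standard library; the function name and the chosen list of required fields are assumptions made for illustration, not a QuickML requirement.

```python
import json

# Fields assumed to be present in every service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def load_service_account_key(path: str) -> dict:
    """Load a key file and check that it looks like a service account key."""
    with open(path, encoding="utf-8") as handle:
        key = json.load(handle)
    missing = [field for field in REQUIRED_FIELDS if field not in key]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    if key["type"] != "service_account":
        raise ValueError("key file is not a service account key")
    return key
```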

  4. OneDrive

    Microsoft’s OneDrive file storage service is also available as a data connector in QuickML. OneDrive allows users to store and manage files of various formats securely. With this connector, OneDrive data stored in any of the supported file formats can be imported into QuickML and used in machine learning workflows.

    Details required to import data:

    • Client Id: This is the unique identifier assigned to your application by Microsoft when registering it in Azure Active Directory.
    • Tenant Id: This is the unique identifier for your organization’s Azure Active Directory tenant.
    • Client Secret: A secret key associated with your registered application to ensure secure communication between the application and the service.
    • User Email: The email associated with your OneDrive account for authentication and access permissions.
    • File Name: The name of the file you wish to import from OneDrive.
    • Source File Type: The format of the file you’re importing.
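
    The fields above match what an application needs for the OAuth 2.0 client-credentials flow against Microsoft Graph. Whether QuickML uses exactly this flow is not stated here, so the sketch below is an assumption: it only builds the token request and the file URL with the standard library, and the function names are hypothetical.

```python
from urllib.parse import quote, urlencode

# Default scope for app-only access to Microsoft Graph.
GRAPH_SCOPE = "https://graph.microsoft.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return (url, form_body) for a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": GRAPH_SCOPE,
        }
    ).encode("ascii")
    return url, body


def file_content_url(user_email: str, file_name: str) -> str:
    """Return the Graph URL for the content of a file in the user's drive root."""
    return (
        "https://graph.microsoft.com/v1.0/users/"
        f"{quote(user_email, safe='@')}/drive/root:/{quote(file_name)}:/content"
    )
```

    The token request is sent as an HTTP POST with the form body, and the returned access token is then passed as a Bearer token when requesting the file URL.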

Last Updated 2023-10-09 18:18:15 +0530