Bulk Operations

Export Local

Exports data based on a given search operation to a local file in JSON or CSV format.

  • operation (required) - must always be export_local

  • format (required) - the format in which to export the data; options are json and csv

  • path (required) - the path on the server's local filesystem where the data will be exported

  • search_operation (required) - the search operation to export from; one of search_by_hash, search_by_value, search_by_conditions or sql

  • filename (optional) - the name of the file your export will be written to (do not include the extension in the filename). If one is not provided, it will be autogenerated based on the epoch.

Body

{
  "operation": "export_local",
  "format": "json",
  "path": "/data/",
  "search_operation": {
      "operation": "sql",
      "sql": "SELECT * FROM dev.breed"
  }
}

Response: 200

{
  "message": "Starting job with id 6fc18eaa-3504-4374-815c-44840a12e7e5"
}
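
Bulk operations run asynchronously, so the response carries a job id rather than the exported data. As a minimal sketch (not the only way to call the API), the TypeScript below posts the body above to the operations endpoint and pulls the job id out of the response message. The URL, port and credentials are assumptions; adjust them for your instance.

// Hypothetical endpoint and credentials; adjust for your instance.
const OPERATIONS_URL = "http://localhost:9925";
const AUTH = "Basic " + Buffer.from("admin:password").toString("base64");
// Reusable helper: POST an operation body and return the parsed JSON response.
async function postOperation(body: object): Promise<any> {
  const response = await fetch(OPERATIONS_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: AUTH },
    body: JSON.stringify(body),
  });
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return response.json();
}
// Submit the export and extract the job id from "Starting job with id <uuid>".
const { message } = await postOperation({
  operation: "export_local",
  format: "json",
  path: "/data/",
  search_operation: { operation: "sql", sql: "SELECT * FROM dev.breed" },
});
const jobId = message.split(" ").pop();

Later sketches on this page reuse this hypothetical postOperation helper.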

CSV Data Load

Ingests CSV data, provided directly in the operation, as an insert, update or upsert into the specified database table.

  • operation (required) - must always be csv_data_load

  • action (optional) - type of action you want to perform - insert, update or upsert. The default is insert

  • database (optional) - name of the database where you are loading your data. The default is data

  • table (required) - name of the table where you are loading your data

  • data (required) - CSV data to import into Harper

Body

{
  "operation": "csv_data_load",
  "database": "dev",
  "action": "insert",
  "table": "breed",
  "data": "id,name,section,country,image\n1,ENGLISH POINTER,British and Irish Pointers and Setters,GREAT BRITAIN,http://www.fci.be/Nomenclature/Illustrations/001g07.jpg\n2,ENGLISH SETTER,British and Irish Pointers and Setters,GREAT BRITAIN,http://www.fci.be/Nomenclature/Illustrations/002g07.jpg\n3,KERRY BLUE TERRIER,Large and medium sized Terriers,IRELAND,\n"
}

Response: 200

{
  "message": "Starting job with id 2fe25039-566e-4670-8bb3-2db3d4e07e69",
  "job_id": "2fe25039-566e-4670-8bb3-2db3d4e07e69"
}
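
Because data is a single string, rows must be joined with "\n", as in the body above. The TypeScript below sketches one way to assemble that string from in-memory records; the Breed type is illustrative, and fields containing commas, quotes or newlines are quoted per RFC 4180.

// Illustrative record shape matching the example body above.
type Breed = { id: number; name: string; section: string; country: string; image: string };
// Quote a field only when it contains a comma, quote or newline (RFC 4180).
const escapeField = (value: unknown): string => {
  const s = String(value ?? "");
  return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
};
// Build the header row plus one line per record, joined with "\n".
function toCsv(rows: Breed[]): string {
  const header = "id,name,section,country,image";
  const lines = rows.map((r) =>
    [r.id, r.name, r.section, r.country, r.image].map(escapeField).join(",")
  );
  return [header, ...lines].join("\n") + "\n";
}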

CSV File Load

Ingests CSV data, provided via a path on the local filesystem, as an insert, update or upsert into the specified database table.

Note: The CSV file must reside on the same machine on which Harper is running. For example, a path to a CSV on your local computer will produce an error if your Harper instance runs in the cloud.

  • operation (required) - must always be csv_file_load

  • action (optional) - type of action you want to perform - insert, update or upsert. The default is insert

  • database (optional) - name of the database where you are loading your data. The default is data

  • table (required) - name of the table where you are loading your data

  • file_path (required) - path to the CSV file on the host running Harper

Body

{
  "operation": "csv_file_load",
  "action": "insert",
  "database": "dev",
  "table": "breed",
  "file_path": "/home/user/imports/breeds.csv"
}

Response: 200

{
  "message": "Starting job with id 3994d8e2-ec6a-43c4-8563-11c1df81870e",
  "job_id": "3994d8e2-ec6a-43c4-8563-11c1df81870e"
}

CSV URL Load

Ingests CSV data, provided via URL, as an insert, update or upsert into the specified database table.

  • operation (required) - must always be csv_url_load

  • action (optional) - type of action you want to perform - insert, update or upsert. The default is insert

  • database (optional) - name of the database where you are loading your data. The default is data

  • table (required) - name of the table where you are loading your data

  • csv_url (required) - URL of the CSV file

Body

{
  "operation": "csv_url_load",
  "action": "insert",
  "database": "dev",
  "table": "breed",
  "csv_url": "https://s3.amazonaws.com/complimentarydata/breeds.csv"
}

Response: 200

{
  "message": "Starting job with id 332aa0a2-6833-46cd-88a6-ae375920436a",
  "job_id": "332aa0a2-6833-46cd-88a6-ae375920436a"
}
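
Whichever load variant you use, the returned job id can be polled to learn when the ingest finishes. The sketch below assumes a get_job operation that returns the job record with a status field (values such as COMPLETE and ERROR); verify the exact field names against the jobs documentation for your version. It reuses the hypothetical postOperation helper from the Export Local sketch.

// Poll the job until it reaches a terminal state.
async function waitForJob(jobId: string): Promise<void> {
  for (;;) {
    // Assumption: get_job returns an array containing the job record.
    const [job] = await postOperation({ operation: "get_job", id: jobId });
    if (job.status === "COMPLETE") return;
    if (job.status === "ERROR") throw new Error(job.message);
    // Wait one second between polls to avoid hammering the API.
    await new Promise((resolve) => setTimeout(resolve, 1000));
  }
}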

Export To S3

Exports data based on a given search operation from a table to AWS S3 in JSON or CSV format.

  • operation (required) - must always be export_to_s3

  • format (required) - the format in which to export the data; options are json and csv

  • s3 (required) - object containing your access keys, the bucket, the bucket region and the key under which the data will be saved in S3

  • search_operation (required) - the search operation to export from; one of search_by_hash, search_by_value, search_by_conditions or sql

Body

{
  "operation": "export_to_s3",
  "format": "json",
  "s3": {
    "aws_access_key_id": "YOUR_KEY",
    "aws_secret_access_key": "YOUR_SECRET_KEY",
    "bucket": "BUCKET_NAME",
    "key": "OBJECT_NAME",
    "region": "BUCKET_REGION"
  },
  "search_operation": {
    "operation": "sql",
    "sql": "SELECT * FROM dev.dog"
  }
}

Response: 200

{
  "message": "Starting job with id 9fa85968-4cb1-4008-976e-506c4b13fc4a",
  "job_id": "9fa85968-4cb1-4008-976e-506c4b13fc4a"
}

Import from S3

Imports CSV or JSON files from an AWS S3 bucket as an insert, update or upsert into the specified database table.

  • operation (required) - must always be import_from_s3

  • action (optional) - type of action you want to perform - insert, update or upsert. The default is insert

  • database (optional) - name of the database where you are loading your data. The default is data

  • table (required) - name of the table where you are loading your data

  • s3 (required) - object containing required AWS S3 bucket info for operation:

    • aws_access_key_id - AWS access key for authenticating into your S3 bucket

    • aws_secret_access_key - AWS secret for authenticating into your S3 bucket

    • bucket - AWS S3 bucket to import from

    • key - the name of the file to import - the file must include a valid file extension ('.csv' or '.json')

    • region - the region of the bucket

Body

{
  "operation": "import_from_s3",
  "action": "insert",
  "database": "dev",
  "table": "dog",
  "s3": {
    "aws_access_key_id": "YOUR_KEY",
    "aws_secret_access_key": "YOUR_SECRET_KEY",
    "bucket": "BUCKET_NAME",
    "key": "OBJECT_NAME",
    "region": "BUCKET_REGION"
  }
}

Response: 200

{
  "message": "Starting job with id 062a1892-6a0a-4282-9791-0f4c93b12e16",
  "job_id": "062a1892-6a0a-4282-9791-0f4c93b12e16"
}
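
Because the s3 object carries live AWS credentials, avoid hard-coding them in client code. A small sketch, assuming the standard AWS environment variable names and reusing the hypothetical postOperation helper from earlier:

// Read credentials from the environment rather than source code.
const s3 = {
  aws_access_key_id: process.env.AWS_ACCESS_KEY_ID!,
  aws_secret_access_key: process.env.AWS_SECRET_ACCESS_KEY!,
  bucket: "BUCKET_NAME",
  key: "OBJECT_NAME",
  region: "BUCKET_REGION",
};
const result = await postOperation({
  operation: "import_from_s3",
  action: "insert",
  database: "dev",
  table: "dog",
  s3,
});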

Delete Records Before

Deletes data older than the specified timestamp from the specified database table, exclusively on the node where the operation is executed. Any clustered nodes holding replicated copies of the data will retain that data.

This operation is restricted to super_user roles only.

  • operation (required) - must always be delete_records_before

  • date (required) - records older than this date will be deleted. Supported format: YYYY-MM-DDThh:mm:ss.sZ

  • schema (required) - name of the schema where you are deleting your data

  • table (required) - name of the table where you are deleting your data

Body

{
  "operation": "delete_records_before",
  "date": "2021-01-25T23:05:27.464",
  "schema": "dev",
  "table": "breed"
}

Response: 200

{
  "message": "Starting job with id d3aed926-e9fe-4ec1-aea7-0fb4451bd373",
  "job_id": "d3aed926-e9fe-4ec1-aea7-0fb4451bd373"
}
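
A common use is enforcing a retention window. The sketch below derives the date parameter from a hypothetical 30-day policy and reuses the hypothetical postOperation helper from earlier; Date.prototype.toISOString() yields a YYYY-MM-DDThh:mm:ss.sssZ timestamp compatible with the documented format.

// Hypothetical retention policy: delete records older than 30 days.
const RETENTION_DAYS = 30;
const cutoff = new Date(Date.now() - RETENTION_DAYS * 24 * 60 * 60 * 1000).toISOString();
const result = await postOperation({
  operation: "delete_records_before",
  date: cutoff,
  schema: "dev",
  table: "breed",
});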
