Data8 Administration & Batch Data Cleansing API

Manage your Data8 account, submit data to batch data cleansing jobs and retrieve the results.

Jobs are submitted to workflows that the Data8 Production Team builds to your specifications. The team also documents the data that each workflow expects as input and the data that it generates.

If you do not already have a workflow available to submit jobs to, please get in touch with your account manager to discuss your requirements.

All requests must be authenticated using an Authorization: Bearer header, with the bearer token being obtained from the Data8 OAuth token server at https://auth.data-8.co.uk/connect/token.
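As a rough sketch of the authentication step, the Python below prepares a token request and attaches a bearer token to an API request, using only the standard library. The client-credentials grant, the client_id and client_secret form fields, and the function names are assumptions for illustration; this page does not state which grant type the token server expects, so check your account details before relying on it. Nothing is sent over the network here.

```python
import urllib.parse
import urllib.request

# Token endpoint as given in the text above.
TOKEN_URL = "https://auth.data-8.co.uk/connect/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Prepare (but do not send) a token request.

    Assumption: the server accepts the OAuth 2.0 client-credentials grant
    with the credentials supplied as form fields.
    """
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(TOKEN_URL, data=form, method="POST")


def with_bearer(request: urllib.request.Request, token: str) -> urllib.request.Request:
    """Attach the Authorization: Bearer header required by every API request."""
    request.add_header("Authorization", f"Bearer {token}")
    return request
```

Passing a prepared request to urllib.request.urlopen would send it. If the token response follows the usual OAuth 2.0 shape, the token would be in its access_token field.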

/Dataset

Creates a new dataset

HTTP Method: POST

Operation Id: Dataset_Create

Each dataset must have a unique name. The name must start with a letter (a-z) and can contain only the characters a-z, 0-9 or _.
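The naming rule can be checked on the client before a request is sent. The following is a minimal sketch, assuming names are lowercase only, since the rule lists a-z and does not mention A-Z. The helper name is_valid_name is hypothetical and not part of the API.

```python
import re

# Assumption: uppercase letters are rejected because the rule lists only a-z.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_name(name: str) -> bool:
    """Return True if the name starts with a-z and contains only a-z, 0-9 or _."""
    return NAME_PATTERN.fullmatch(name) is not None
```

For example, is_valid_name("sample_dataset_1") is true, while "1st_dataset" and "my-data" are rejected. Uniqueness can only be checked by the server.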

Datasets are either input or output datasets. Only input datasets can be created using this endpoint. Output datasets are created automatically as required when jobs are started.

Each input dataset must have its columns specified when it is created. Each column has a name, following the same naming requirements as above for the dataset, and a type.

When the dataset is created it is marked as incomplete. Add data to the dataset using the PATCH /Dataset/{name}/data endpoint. Once all data has been added, use the PUT /Dataset/{name} endpoint to mark the dataset as complete before using it as input to a job.
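The order of calls described above can be summarised in a short sketch. The function below only lists the HTTP method and path for each step. The request bodies for the PATCH and PUT calls are not described in this section, so they are left out, and the function name is hypothetical.

```python
from urllib.parse import quote


def dataset_lifecycle(name: str) -> list[tuple[str, str]]:
    """Return the (HTTP method, path) pairs in the order described above."""
    safe = quote(name, safe="")  # valid names need no escaping; this is a precaution
    return [
        ("POST", "/Dataset"),                # create the dataset, marked incomplete
        ("PATCH", f"/Dataset/{safe}/data"),  # add data to the dataset
        ("PUT", f"/Dataset/{safe}"),         # mark the dataset as complete
    ]
```

Only after the final PUT step is the dataset ready to be used as input to a job.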

The full details of the dataset can be retrieved at any time using the GET endpoint.

Parameters

No parameters.

Request Body

The details of the dataset to create

Example Value

{
	"name": "sample_dataset_1",
	"columns": {
	  "firstname": 0,
	  "lastname": 0,
	  "priority": 1
	}
}
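A request body of this shape could be assembled as in the sketch below. The helper build_create_body is hypothetical. It applies the documented naming rule to the dataset and column names (assuming lowercase only) and passes the numeric column type codes through unchanged, because their meaning is not defined in this section.

```python
import json
import re

_NAME = re.compile(r"[a-z][a-z0-9_]*")


def build_create_body(name: str, columns: dict) -> str:
    """Serialise the request body for POST /Dataset.

    `columns` maps each column name to its numeric type code.
    """
    # The dataset name and every column name follow the same naming rule.
    for candidate in [name, *columns]:
        if not _NAME.fullmatch(candidate):
            raise ValueError(f"invalid name: {candidate!r}")
    return json.dumps({"name": name, "columns": columns})
```

Calling build_create_body("sample_dataset_1", {"firstname": 0, "lastname": 0, "priority": 1}) produces JSON equivalent to the example value above.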

Responses

The dataset has been created

Example Value

{
	"name": "sample_dataset_1",
	"columns": {
	  "firstname": 0,
	  "lastname": 0,
	  "priority": 1
	},
	"input": true,
	"recordCount": 0,
	"completed": false
}

Some validation error occurred in the dataset details

Example Value

{
	"type": "string",
	"title": "string",
	"status": 0,
	"detail": "string",
	"instance": "string",
	"errors": {}
}

Another dataset with the same name already exists

Example Value

{
	"type": "string",
	"title": "string",
	"status": 0,
	"detail": "string",
	"instance": "string"
}
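Both error responses share the type, title, status, detail and instance fields, so a client can handle them with one routine. The sketch below condenses such a body into a single line for logging. It assumes that the errors object, when present, maps a field name to a message or a list of messages. The example values show only an empty object, so that shape is a guess, and the function name is hypothetical.

```python
import json


def summarise_problem(body: str) -> str:
    """Condense an error response body into one human-readable line."""
    problem = json.loads(body)
    parts = [f"{problem.get('status', '?')} {problem.get('title', 'Error')}"]
    if problem.get("detail"):
        parts.append(problem["detail"])
    # Assumption: "errors" maps a field name to a message or list of messages.
    for field, messages in (problem.get("errors") or {}).items():
        if isinstance(messages, str):
            messages = [messages]
        parts.append(f"{field}: {'; '.join(messages)}")
    return " | ".join(parts)
```

The same function works for the validation error and the duplicate-name error, because the errors object is treated as optional.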