Deduplicate
Identify duplicate records within the data.
Description

This service checks the data provided for duplicates by matching records on name and address. When processing residential data, you can set how closely the names must match.

This service is charged on a per-record basis.

Required Data

This service requires the following data to be available before it can be used:

  • Address

This service can also make use of the following optional data to improve its effectiveness:

  • Company Name
  • Person Name
Upload Options

The following options can be configured for this service at the upload stage.

Match Level

Set this to indicate how closely names need to match for a record to be considered a match by this service.

  • Forename requires a fuzzy match on both the surname and forename, and single initials are not allowed to match.
  • Initial requires a fuzzy match on both the surname and forename, allowing single initials to match. For example, "R Smith" would be allowed to match "Robert Smith", but "Richard Smith" and "Robert Smith" would not be considered a match.
  • Surname allows two completely different forenames to match so long as they have a fuzzy match on the surname.
  • Personal intelligently picks the highest match level available based on your data. If a record does not contain any forename information then a Surname match will be allowed, but if forename data is present then at least an Initial level match will be required.
  • Premise allows all records at the same address to match. If a company name is present then it is treated as part of the address and must match as well.
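
The name rules above can be sketched roughly as follows. This is only an illustration, not Data8's implementation: the function names are hypothetical, a plain case-insensitive comparison stands in for the service's fuzzy matching (which is not documented here), and the address and company name comparison is assumed to happen elsewhere. How the service treats two bare initials at the Forename level, or a missing forename at the Initial level, is not stated, so the sketch assumes neither matches.

```python
def _same(a: str, b: str) -> bool:
    # Stand-in for fuzzy matching (assumption): case-insensitive equality.
    return a.strip().lower() == b.strip().lower()


def _is_initial(forename: str) -> bool:
    # "R" and "R." are both treated as a single initial.
    return len(forename.strip().rstrip(".")) == 1


def _forenames_match(a: str, b: str, allow_initials: bool) -> bool:
    a, b = a.strip(), b.strip()
    if not a or not b:
        return False  # assumption: a missing forename never matches here
    if _is_initial(a) or _is_initial(b):
        # An initial only matches when the level allows it and the
        # first letters agree.
        return allow_initials and a[0].lower() == b[0].lower()
    return _same(a, b)


def names_match(level: str, forename_a: str, surname_a: str,
                forename_b: str, surname_b: str) -> bool:
    """Hypothetical helper: do two names match at the given level?"""
    if level == "Premise":
        # Names are ignored; the address (and company name, if present)
        # decides, and that comparison is outside this sketch.
        return True
    if not _same(surname_a, surname_b):
        return False
    if level == "Surname":
        return True
    if level == "Forename":
        return _forenames_match(forename_a, forename_b, allow_initials=False)
    if level == "Initial":
        return _forenames_match(forename_a, forename_b, allow_initials=True)
    if level == "Personal":
        # No forename on either record: fall back to a Surname match.
        if not forename_a.strip() or not forename_b.strip():
            return True
        return _forenames_match(forename_a, forename_b, allow_initials=True)
    raise ValueError(f"Unknown match level: {level}")
```

With this sketch, `names_match("Initial", "R", "Smith", "Robert", "Smith")` is true, while the same call with "Richard" in place of "R" is false, mirroring the example given for the Initial level.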
Output Type

Set this to configure what should happen when a match is made by this service:

  • Flag adds an extra column to your output file which is populated when a match is made and blank otherwise.
  • Suppress removes any matching records from your output file.
Download Options

There are no options that can be configured for this service at the download stage.

Output Columns

This service adds the following columns to your output data.

Unique ID

The unique ID of this record assigned by Data8 for use in deduplication.

Duplicate Flag

Contains "Dup" if the record was identified as a duplicate at the match level selected, and is blank if the record is unique.

Duplicate IDs

A semicolon-delimited list of the Unique IDs that this record matches at the match level selected.
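
When post-processing the output, the Duplicate IDs value can be split on the semicolon. The helper below is a minimal sketch with a hypothetical name. It keeps the IDs as strings because the format of the Unique ID is not specified here, and it assumes stray whitespace and empty entries can be ignored.

```python
def parse_duplicate_ids(cell: str) -> list[str]:
    """Split a semicolon-delimited Duplicate IDs value into a list of IDs."""
    # Drop surrounding whitespace and skip empty entries, so a blank
    # cell (a unique record) yields an empty list.
    return [part.strip() for part in cell.split(";") if part.strip()]
```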

Keep Flag

A binary flag to indicate whether this record should be kept in the file or removed as a superfluous duplicate at the match level selected. "1" indicates a record that should be kept, "0" indicates a record to discard.
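
As an illustration of how the Keep Flag might be used after download, the sketch below keeps only the rows flagged "1". It assumes the output is a CSV file whose header row uses the column names exactly as listed above; the function name and the sample rows are invented for the example.

```python
import csv
import io


def keep_rows(csv_text: str) -> list[dict]:
    """Return only the rows whose Keep Flag column is "1"."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if (row.get("Keep Flag") or "").strip() == "1"]


# Invented sample: records 1 and 2 are duplicates of each other,
# record 3 is unique.
SAMPLE = (
    "Unique ID,Duplicate Flag,Duplicate IDs,Keep Flag\n"
    "1,Dup,2,1\n"
    "2,Dup,1,0\n"
    "3,,,1\n"
)

kept_ids = [row["Unique ID"] for row in keep_rows(SAMPLE)]
```

Here `kept_ids` holds the Unique IDs of the first and third sample rows, since the second row carries a Keep Flag of "0".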
