Interpret, correct, and recover rejected data during daily imports

During daily processes and manual data imports, each processed file is renamed: a .ok extension is appended, followed by the date and time at which your Probance project handled the file. Each processed file also produces several rejection files containing the rows that could not be imported into the database.

If a file on the SFTP has not been renamed after the daily processes, it was not processed.
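As a rough illustration of this check, the sketch below scans a directory listing for names that carry no .ok marker. The function name, the sample file names, and the timestamp layout after .ok are all hypothetical; only the presence of ".ok" in a processed file's name comes from the description above.

```python
def not_yet_processed(listing):
    """Return the file names that carry no .ok marker (assumed to mean 'not processed')."""
    return [name for name in listing if ".ok" not in name]


# Hypothetical SFTP listing; the timestamp format after ".ok" is invented for the example.
listing = [
    "clients_prospects_01012024.csv.ok20240102-0300",  # renamed, so processed
    "orders_01012024.csv",                              # not renamed
]

print(not_yet_processed(listing))
```

The listing itself would come from your SFTP client; fetching it is outside the scope of this sketch.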

Event flows generate 5 rejection files, while profile and catalog flows generate 4 rejection files. An additional file is generated for flows sent via the real-time API.

You can find these rejection files on the SFTP of your main account in the upload/rejects directory. They are retained there for 7 days.

This directory also includes rejection files for flows from your partners, even if they have individual deposit directories for confidentiality purposes. However, your partners do not have access to the rejection files for the flows they generate.

Reviewing these files will allow you to:

  • Understand the reason for the rejection.
  • Plan data recovery after corrections using a Probance naming convention.

1. Types of Rejections

loadingreject

Returns all rows that do not comply with the file format defined in the flow specifications (e.g., delimiter, number of columns, end-of-line character, encapsulation).
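To catch this kind of problem before uploading, a file can be pre-checked locally. The sketch below is not the Probance loader; it only counts columns per row with Python's csv module. The delimiter, the quote character, and the expected number of columns are assumptions that should be replaced with the values from your own flow specifications.

```python
import csv
import io


def find_malformed_rows(text, delimiter=";", expected_columns=3):
    """Return (line_number, row) pairs whose column count differs from the expected one."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar='"')
    bad = []
    for row in reader:
        if len(row) != expected_columns:
            bad.append((reader.line_num, row))
    return bad


# Hypothetical profile file: the last row is missing one column.
sample = "id_user;email;city\n1;a@example.com;Paris\n2;b@example.com\n"

print(find_malformed_rows(sample))
```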

checkschemareject

Returns all rows where a required value is missing or a string exceeds the maximum length defined in the flow specifications.
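The same two conditions can be checked locally before sending a file. In this minimal sketch, the field names, the list of required fields, and the maximum lengths are hypothetical; the real values are those of your flow specifications.

```python
def check_schema(row, required, max_lengths):
    """Return a list of problems found in one row (an empty list means the row passes)."""
    problems = []
    for field in required:
        if not row.get(field):
            problems.append(f"{field}: required value missing")
    for field, limit in max_lengths.items():
        value = row.get(field) or ""
        if len(value) > limit:
            problems.append(f"{field}: length {len(value)} exceeds {limit}")
    return problems


# Hypothetical specification and row.
required = ["id_user", "email"]
max_lengths = {"city": 5}
row = {"id_user": "42", "email": "", "city": "Marseille"}

print(check_schema(row, required, max_lengths))
```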

convertreject

Returns all rows containing values that do not match the data types defined in the flow specifications (e.g., integer, date pattern, decimal numbers).
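A comparable type check can be run locally. The sketch below tries to convert each value and reports the fields that fail. The field names, the date pattern, and the converters are assumptions chosen for the example; they do not describe how Probance itself performs the conversion.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Hypothetical mapping of field name to conversion function.
CONVERTERS = {
    "quantity": int,
    "order_date": lambda value: datetime.strptime(value, "%Y-%m-%d"),
    "amount": Decimal,
}


def conversion_errors(row, converters):
    """Return the names of the fields whose values cannot be converted."""
    errors = []
    for field, convert in converters.items():
        value = row.get(field, "")
        if value == "":
            continue  # missing values are a schema concern, not a conversion one
        try:
            convert(value)
        except (ValueError, InvalidOperation):
            errors.append(field)
    return errors


row = {"quantity": "3.5", "order_date": "2024-01-31", "amount": "12,50"}
print(conversion_errors(row, CONVERTERS))
```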

filterrowreject

If you have defined filters on your fields in Cockpit, this rejection type includes rows where a column does not comply with its filter.

lookupreject

This type of rejection applies to event flows. It includes rows corresponding to events for which the customer (client identifier) is unknown in the customer table.

Lookup rejections may have two causes:

  1. The row for this id_user was never sent in the profile flow.
  2. The row for this id_user was sent in the profile flow but was rejected during the import.
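To tell the two causes apart, the identifiers found in a lookupreject file can be compared with the identifiers you actually sent in the profile flow. The sketch below assumes both lists have already been extracted from the files; the function name and the sample identifiers are hypothetical.

```python
def classify_lookup_rejects(rejected_ids, sent_profile_ids):
    """Split rejected identifiers by probable cause."""
    rejected = set(rejected_ids)
    sent = set(sent_profile_ids)
    return {
        # Cause 1: the identifier never appeared in the profile flow.
        "never_sent": sorted(rejected - sent),
        # Cause 2: it was sent, so the profile row was probably rejected at import.
        "sent_but_rejected": sorted(rejected & sent),
    }


result = classify_lookup_rejects(
    rejected_ids=["u1", "u2", "u3"],
    sent_profile_ids=["u2", "u9"],
)
print(result)
```

Identifiers in the second group are worth looking up in the rejection files of the profile flow to find the exact reason.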

jsonreject

This rejection type indicates the nature of import errors encountered when processing lines received via JSON calls.

2. Recovering Rejected or Missing Data

PHM natively supports concatenating your CSV files when you add a numeric suffix to your file names.

For example, for the file clients_prospects_DDMMYYYY.csv, you can upload the following files:

  • clients_prospects_DDMMYYYY.csv
  • clients_prospects_DDMMYYYY-00.csv
  • clients_prospects_DDMMYYYY-01.csv
  • clients_prospects_DDMMYYYY-02.csv
  • …up to clients_prospects_DDMMYYYY-99.csv

These files will be processed in order, from -00 to -99, ending with the daily file that has no suffix.
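The processing order described above can be sketched as a sort on the numeric suffix, with the unsuffixed daily file placed last. The helper name and the regular expression are illustrative assumptions; the base name is the example used in this section.

```python
import re


def processing_order(filenames, base="clients_prospects_DDMMYYYY"):
    """Sort recovery files from -00 to -99, then the daily file without a suffix."""
    pattern = re.compile(re.escape(base) + r"(?:-(\d{2}))?\.csv")
    keyed = []
    for name in filenames:
        match = pattern.fullmatch(name)
        if match:  # names that do not follow the convention are ignored
            suffix = match.group(1)
            keyed.append((100 if suffix is None else int(suffix), name))
    return [name for _, name in sorted(keyed)]


files = [
    "clients_prospects_DDMMYYYY.csv",
    "clients_prospects_DDMMYYYY-01.csv",
    "clients_prospects_DDMMYYYY-00.csv",
    "unrelated.csv",
]
print(processing_order(files))
```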

If the same row appears in several of these recovery files (this applies to profile data, not to events), the values from the last file processed are retained during data import.
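This "last file wins" behavior can be pictured as a dictionary that is overwritten as each file is read in processing order. The sketch assumes that profile rows are identified by id_user and that each file has already been parsed into a list of dictionaries; both are assumptions made for the example.

```python
def merge_profiles(files_in_order, key="id_user"):
    """Keep, for each key, the row from the last file in which it appears."""
    merged = {}
    for rows in files_in_order:
        for row in rows:
            merged[row[key]] = row  # a later file overwrites an earlier one
    return merged


# Hypothetical content of a -00 file followed by the daily file.
file_00 = [{"id_user": "1", "city": "Paris"}, {"id_user": "2", "city": "Lyon"}]
daily = [{"id_user": "1", "city": "Nantes"}]

print(merge_profiles([file_00, daily]))
```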

You can view information about the expected file naming for each flow in your Probance One interface under Administration / Technical.