Parameters for importing data

The import options let you set rules for handling updates, duplicates, and questionable records.

Fields marked with an asterisk are mandatory. Settings that are key to transferring data between environments while maintaining data consistency are pointed out.

System name - identifies the type of data to which the settings apply. When importing to maintain data consistency between two environments, this name is irrelevant. For more advanced imports, it is useful to keep separate import options for each type of data being imported and to name them accordingly.

What to do with update records - available options are OVERWRITE, SKIP, ASK_USER.

Default way to update fields - available options are: OVERWRITE, SKIP, FILL_EMPTY, ADD_NEW_VALUES, OVERWRITE_BY_NOT_EMPTY.

What to do with duplicate records - available options are OVERWRITE, SKIP, ASK_USER.

Default way to update duplicate fields - available options are: OVERWRITE, SKIP, FILL_EMPTY, ADD_NEW_VALUES, OVERWRITE_BY_NOT_EMPTY.

What to do with questionable records - available options are ASK_USER, SKIP, SAVE_BEST_MATCH.
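The option values listed above can be modeled as a small set of enumerations. The following is a minimal sketch, not the system's actual data model: the class names, the attribute names, and the example system name "publications" are assumptions made for illustration, and it assumes that the same field actions apply to both updated and duplicate records.

```python
from dataclasses import dataclass
from enum import Enum


class RecordAction(Enum):
    """Actions for update records and duplicate records."""
    OVERWRITE = "OVERWRITE"
    SKIP = "SKIP"
    ASK_USER = "ASK_USER"


class FieldAction(Enum):
    """Default ways to update fields."""
    OVERWRITE = "OVERWRITE"
    SKIP = "SKIP"
    FILL_EMPTY = "FILL_EMPTY"
    ADD_NEW_VALUES = "ADD_NEW_VALUES"
    OVERWRITE_BY_NOT_EMPTY = "OVERWRITE_BY_NOT_EMPTY"


class QuestionableAction(Enum):
    """Actions for questionable records."""
    ASK_USER = "ASK_USER"
    SKIP = "SKIP"
    SAVE_BEST_MATCH = "SAVE_BEST_MATCH"


@dataclass
class ImportOptions:
    # Hypothetical container; the attribute names are illustrative only.
    system_name: str
    update_records: RecordAction
    update_fields: FieldAction
    duplicate_records: RecordAction
    duplicate_fields: FieldAction
    questionable_records: QuestionableAction


options = ImportOptions(
    system_name="publications",  # hypothetical name
    update_records=RecordAction.OVERWRITE,
    update_fields=FieldAction.FILL_EMPTY,
    duplicate_records=RecordAction.SKIP,
    duplicate_fields=FieldAction.SKIP,
    questionable_records=QuestionableAction.ASK_USER,
)
```

Using enumerations rather than free-form strings means that a misspelled option value fails immediately instead of being silently ignored.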

Option

Meaning or available options

What to do with update records

This option applies to imported records whose system IDs already exist in the database (duplicates by ID). Available options:

overwrite / skip / ask the user

What to do with duplicate records

This option applies to imported records that are considered duplicates but have no system ID. Available options:

overwrite / skip / ask the user

What to do with questionable records

This option applies to imported records with more than one match in the database. Available options:

ask the user / skip / save the best match without resolving doubts

Defer questions to the user at the end of the import

Determines when the user is asked about questionable records

Remember user answers

Whether to remember the user's previous answers

Default way to update fields

Specifies how the fields of an overwritten record are updated by default. The selected action is applied to all fields for which no custom update rules are specified.

Available options:

overwrite / skip / fill empty / overwrite with a non-empty value
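The effect of each field update action can be illustrated with a short sketch. The behavior below is inferred from the option names only. The function names, the definition of an "empty" value, and the treatment of ADD_NEW_VALUES as a merge of multi-valued fields are assumptions, not a description of the system's implementation.

```python
def is_empty(value):
    """Assumption: None and zero-length strings or collections count as empty."""
    if value is None:
        return True
    return isinstance(value, (str, list, tuple, set, dict)) and len(value) == 0


def update_field(existing, incoming, action):
    """Return the value a field would hold after applying the given action."""
    if action == "OVERWRITE":
        return incoming
    if action == "SKIP":
        return existing
    if action == "FILL_EMPTY":
        # Only fill the field when the database value is empty.
        return incoming if is_empty(existing) else existing
    if action == "OVERWRITE_BY_NOT_EMPTY":
        # Only overwrite when the imported value carries data.
        return existing if is_empty(incoming) else incoming
    if action == "ADD_NEW_VALUES":
        # Assumed semantics: append imported values that are not yet present.
        merged = list(existing or [])
        for item in incoming or []:
            if item not in merged:
                merged.append(item)
        return merged
    raise ValueError(f"Unknown field action: {action}")
```

For example, `update_field("", "new title", "FILL_EMPTY")` returns `"new title"`, while the same call with a non-empty existing value leaves that value unchanged.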

Rules for updating fields

Custom rules for updating selected fields. You can define any number of rules.

For each rule, specify:

  • Entity types - the types of entities to be affected by a rule

  • Fields - the fields to be affected by the rule

  • Action - how to update selected fields
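The relationship between custom rules and the default action can be sketched as a lookup with a fallback. This is a minimal illustration: the `UpdateRule` class, the `resolve_action` function, the example entity type and field names, and the "first matching rule wins" order are all assumptions, since the source does not say how overlapping rules are resolved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateRule:
    # Hypothetical representation of one custom rule.
    entity_types: frozenset  # entity types affected by the rule
    fields: frozenset        # fields affected by the rule
    action: str              # how to update the selected fields


def resolve_action(entity_type, field, rules, default_action):
    """Return the action of the first matching rule, else the default action."""
    for rule in rules:
        if entity_type in rule.entity_types and field in rule.fields:
            return rule.action
    return default_action


rules = [
    UpdateRule(frozenset({"Article"}), frozenset({"abstract"}), "FILL_EMPTY"),
]
```

With these rules, `resolve_action("Article", "abstract", rules, "OVERWRITE")` yields the rule's action, and any other field falls back to the default.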

Maximum number of authors

The maximum number of authors to be imported into the author field (for publications). If the imported publication has more authors than the specified number, the additional authors are aggregated into the 'collective author' field.
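The author limit can be pictured as a simple split of the author list. The sketch below assumes that only the authors beyond the limit are moved to the collective author field and that they are joined with "; ". Both points, and the function name, are illustrative assumptions rather than documented behavior.

```python
def split_authors(authors, max_authors):
    """Split authors into the author field and a collective author string."""
    kept = authors[:max_authors]
    overflow = authors[max_authors:]
    # Assumed aggregation format: names joined with "; ".
    collective_author = "; ".join(overflow)
    return kept, collective_author
```

For a publication with four authors and a limit of two, the first two names stay in the author field and the remaining two end up in the collective author string.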

Update external identifiers

Whether to update external identifiers in uniquely matched duplicates in the database during import, even if the duplicate is skipped.

Update indicators

Whether to update the publication indicators of uniquely matched duplicates in the database during import, even if the duplicate is skipped.

Matching by external ID - blacklist

This parameter lets you indicate which external identifiers are to be ignored when verifying whether an imported record is a duplicate. This excludes groups of local identifiers that may be duplicated across repositories (such as a personnel identifier or a directory identifier). Specify the system names of the identifiers from the local systems.

Matching by external ID - whitelist

By default, during import, the system first attempts to match records based on external IDs. The ID whitelist parameter lets you narrow the list of verified IDs to selected ones (e.g. ORCID, POL-on). Provide the system names of the IDs from the local systems.

Once the parameter is set, the system verifies only the indicated IDs during import and ignores the others. If the whitelist is used, the blacklist is unnecessary.
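The interplay of the two lists can be sketched as a filter applied to a record's external identifiers before matching. This is a sketch under stated assumptions: the function name and the identifier system names used in the example (such as "staff_id") are hypothetical, and the precedence of the whitelist follows the statement that the blacklist is unnecessary when a whitelist is used.

```python
def ids_used_for_matching(external_ids, whitelist=None, blacklist=None):
    """Return the external IDs that take part in duplicate matching.

    external_ids maps an identifier system name to its value.
    """
    if whitelist:
        # A whitelist narrows matching to the listed identifiers only.
        allowed = set(whitelist)
        return {k: v for k, v in external_ids.items() if k in allowed}
    # Otherwise every identifier is used except the blacklisted ones.
    blocked = set(blacklist or ())
    return {k: v for k, v in external_ids.items() if k not in blocked}


record_ids = {"ORCID": "0000-0000-0000-0000", "POL-on": "12345", "staff_id": "77"}
```

Calling `ids_used_for_matching(record_ids, blacklist=["staff_id"])` drops only the local identifier, while passing `whitelist=["ORCID"]` keeps ORCID alone.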

Push record status

Do not change / Complete / Incomplete

Detect nested dependencies

This parameter is especially important when importing hierarchical data. It guarantees that the master record is imported before the records in which it is nested (as long as it is in the list of records to be imported).
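The ordering guarantee described above corresponds to a topological sort of the records in the import batch. The sketch below uses the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later). The function name and the input format, a mapping from each record ID to the IDs of the records nested in it, are assumptions made for illustration and say nothing about how the system implements this internally.

```python
from graphlib import TopologicalSorter


def import_order(nested_refs):
    """Order record IDs so that nested records come before their containers.

    nested_refs maps a record ID to the IDs of the records nested in it.
    References to records outside the batch are ignored, mirroring the
    "as long as it is in the list of records to be imported" condition.
    """
    batch = set(nested_refs)
    sorter = TopologicalSorter()
    for record_id, refs in nested_refs.items():
        in_batch = [ref for ref in refs if ref in batch]
        # Each referenced record must be imported before record_id.
        sorter.add(record_id, *in_batch)
    return list(sorter.static_order())


order = import_order({
    "article": ["journal", "author"],
    "journal": ["publisher"],  # "publisher" is not part of this batch
    "author": [],
})
```

In this example the journal and the author always precede the article. Circular nesting would raise `graphlib.CycleError`, which a real import would need to handle separately.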

Save without matching

The option allows you to disable deduplication mechanisms, which speeds up the import process.

When configuring import options for copying data between environments, the following settings are recommended:

update records: OVERWRITE

update fields: OVERWRITE

duplicate records: SKIP

duplicate fields: SKIP

 

 

Duplicate record min confidence - the minimum match percentage at which records are considered duplicates. The higher the value, the more items must match.

Ambiguous record min confidence - the minimum match percentage at which records are considered ambiguous and are referred to the editor for a decision.
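The two confidence thresholds can be read as boundaries that split match scores into three bands. The sketch below is one plausible interpretation, not confirmed behavior: it assumes that the ambiguous threshold is not higher than the duplicate threshold, that a score below both thresholds means the record is treated as new, and that comparisons are inclusive. The function name and the example values 90 and 60 are hypothetical.

```python
def classify_match(confidence, duplicate_min, ambiguous_min):
    """Classify a match percentage against the two minimum confidences."""
    if ambiguous_min > duplicate_min:
        raise ValueError("ambiguous_min is assumed not to exceed duplicate_min")
    if confidence >= duplicate_min:
        return "duplicate"
    if confidence >= ambiguous_min:
        # Referred to the editor for a decision.
        return "ambiguous"
    return "new"
```

With thresholds of 90 and 60, a match of 75 percent would be classified as ambiguous and handed to the editor, while a match of 95 percent would be treated as a duplicate.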

Defer questions to the user at the end of the import - when this option is checked, the import does not finish on its own: the questionable records wait for the editor's decision.

When setting up an import that copies data between environments, it is recommended to check the "Save without matching" flag. The system then does not verify whether it is importing a duplicate; it only overwrites what it finds in the system, based on the record ID.

The import options also include the flag "Save log of the import process". If it is unchecked, no additional logs are saved after each import.