Data management

Data storage

The system is well equipped with tools for data management. Its main feature is that it records the history of changes, so that it is possible to go back to any state preceding the current one. With this feature one can find out who is responsible for a change and when the change was made. The system stores the preceding versions of each record, so it is also possible to see the differences between consecutive record versions and, optionally, to revert to a chosen version. Both soft and hard DELETE operations are available, so even a delete can be undone.
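The versioning and soft-delete behaviour described above can be sketched as follows. This is a minimal illustration with hypothetical names, not OMEGA-PSIR's actual storage API:

```python
import copy
from datetime import datetime, timezone

class VersionedStore:
    """Hypothetical record store keeping the full change history per record id."""

    def __init__(self):
        # id -> list of (timestamp, user, record payload or None for a delete)
        self._history = {}

    def save(self, rec_id, record, user):
        # Each save appends a new version; older versions stay readable,
        # which also answers "who changed what, and when".
        self._history.setdefault(rec_id, []).append(
            (datetime.now(timezone.utc), user, copy.deepcopy(record)))

    def soft_delete(self, rec_id, user):
        # A soft delete is just another version with an empty payload,
        # so it can be undone by restoring an earlier version.
        self._history[rec_id].append((datetime.now(timezone.utc), user, None))

    def current(self, rec_id):
        return self._history[rec_id][-1][2]

    def restore(self, rec_id, version, user):
        # Undo: re-save a chosen historical version as the newest one.
        self.save(rec_id, self._history[rec_id][version][2], user)

store = VersionedStore()
store.save("pub1", {"title": "Paper"}, user="alice")
store.soft_delete("pub1", user="bob")
store.restore("pub1", 0, user="alice")  # the delete is undone
```

Because every operation, including delete, only appends a version, the full audit trail (author and timestamp of each change) is preserved by construction.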

A very important feature of information storage is “embedding” parts of one record within another record. For example, adding an author to a publication record means that a selected, substantial part of the author record is copied into the publication. This allows the system to store in the bibliographic record the “historical” data concerning the author, valid at the moment the bibliographic record is created. So, if the author's affiliation at the time of building a publication record was faculty1, this value is stored in the embedded part of the author description, and a later change of the author's affiliation will not affect the historical information about the faculty. It also makes it possible for the local value of the author name, as embedded in the publication, to differ from the value in the main authority record. Yet another advantage is that the bibliographic record is automatically assigned to a given affiliation, research discipline, etc., so there is no need to define a “collection” for the publication record.
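The embedding mechanism can be illustrated with a short sketch. The record structures and names here are hypothetical; the point is that the publication keeps a frozen copy of the authority record as it was at creation time:

```python
import copy

# Hypothetical authority file: live author records, keyed by id.
authors = {"a1": {"name": "J. Kowalski", "affiliation": "faculty1"}}

def create_publication(title, author_id):
    # Embed a snapshot of the author record. Later changes to the
    # authority record will not alter this historical copy.
    return {"title": title, "author": copy.deepcopy(authors[author_id])}

pub = create_publication("Some Paper", "a1")
authors["a1"]["affiliation"] = "faculty2"  # the author moves later
```

After the move, `pub["author"]["affiliation"]` still reads `faculty1`: the publication records the affiliation valid when it was created, exactly as described above.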

Every record type, defined by means of an XSD, is used to generate all the necessary data structures, such as the data entry format, indexes, and some validation rules. With this approach one can define very complex structures.
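As a rough illustration of driving data structures from an XSD, the sketch below reads a tiny schema and derives a field list that could feed a data entry form, index definitions, and required-field validation. The schema and the derived structure are assumptions for illustration, not the system's actual definitions:

```python
import xml.etree.ElementTree as ET

XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="publication">
    <xs:complexType><xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="year" type="xs:int" minOccurs="0"/>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>"""

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}
root = ET.fromstring(XSD)

# One derived description per schema element: name, type, and whether
# the field is mandatory (minOccurs defaults to 1 in XSD).
fields = [
    {"name": e.get("name"), "type": e.get("type"),
     "required": e.get("minOccurs", "1") != "0"}
    for e in root.findall(".//xs:sequence/xs:element", NS)
]
```

From a single schema, the same `fields` structure can drive several artifacts at once, which is what makes the XSD-centric approach scale to complex record types.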

Data management functions

In OMEGA-PSIR the data management facilities are very powerful. The system provides means for:

  1. importing data from various sources, both bibliographic and auxiliary (which makes the data entry process much cheaper); for example, data can be imported from global scientific databases (e.g. Scopus, WoS, PubMed), patent databases (EPO), CrossRef, DataCite, ORCID, and also from other OMEGA-PSIR based knowledge bases (which makes it possible to build federated portals, see the Polish Medical Platform);

  2. enriching existing metadata with data from external resources (e.g. enriching bibliographic metadata with bibliometric data from Scopus or WoS, or integrating journals with Sherpa-Romeo to obtain information about the journals' openness strategy);

  3. acquiring multimedia data from YouTube;

  4. importing XML files from the web or from a desktop (special tools are available to translate XML into the requested formats);

  5. manual data entry by librarians and trained users;

  6. self-deposit of data by researchers;

  7. defining workflows for data entry processes;

  8. discovering duplicates with robust detection procedures.
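Duplicate discovery of the kind mentioned in the last point typically compares normalized metadata fields. A minimal sketch, assuming a simple title-plus-year heuristic (the actual procedures in the system are not specified here):

```python
from difflib import SequenceMatcher

def normalize(title):
    # Simple normalization before comparison: lowercase, strip punctuation.
    return "".join(ch.lower() for ch in title if ch.isalnum() or ch == " ")

def is_probable_duplicate(rec_a, rec_b, threshold=0.9):
    # Flag two bibliographic records as likely duplicates when their
    # normalized titles are very similar and the publication years match.
    same_year = rec_a.get("year") == rec_b.get("year")
    ratio = SequenceMatcher(None, normalize(rec_a["title"]),
                            normalize(rec_b["title"])).ratio()
    return same_year and ratio >= threshold

a = {"title": "Deep Learning for Text", "year": 2021}
b = {"title": "Deep learning for text.", "year": 2021}
c = {"title": "Completely Different Work", "year": 2021}
```

In practice such checks run at import time, so that records arriving from Scopus, WoS, or another OMEGA-PSIR instance are matched against what is already stored before being created anew.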

For manual data entry, strong validation control tools are provided. The validation procedures are definable by the system administrator.
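Administrator-definable validation can be pictured as a table of rules applied to every record before it is accepted. The rule set below is a hypothetical example, not the system's actual configuration format:

```python
# Each rule: (field name, predicate, error message). An administrator
# extends this list without touching the data entry code itself.
RULES = [
    ("title", lambda v: bool(v and v.strip()),
     "title must not be empty"),
    ("year", lambda v: isinstance(v, int) and 1000 <= v <= 2100,
     "year must be a plausible four-digit number"),
]

def validate(record):
    # Returns the list of rule violations; an empty list means valid.
    return [msg for field, check, msg in RULES
            if not check(record.get(field))]

errors = validate({"title": "", "year": 1899})
```

Keeping the rules as data rather than code is one straightforward way to make validation configurable by an administrator, as the text describes.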