Deposited Item Lifecycle



Submitted Item

After you deposit a submission, it is placed in a pool of submitted items. The item is not yet publicly available and waits for an editor to approve (or reject) it.

Validated Item

  • Automatic validation of file formats and integrity
  • Manual validation of non-standard formats

Once a data producer submits an item to the repository, it is first validated automatically. Specifically, when the dataset is created, the data files are stored on our cluster, their MD5 checksums are calculated, and corrupted files are immediately repaired using redundant data.
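The checksum step can be illustrated with a minimal Python sketch. It is not the repository's actual code: the function names md5_checksum and is_intact are hypothetical, and the sketch assumes that the checksum recorded at deposit time is stored as a hexadecimal string.

```python
import hashlib
from pathlib import Path


def md5_checksum(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Read in chunks so that large data files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected_md5):
    """True if the file's current MD5 matches the checksum recorded at deposit.

    `expected_md5` is assumed to be a hexadecimal string (hypothetical storage format).
    """
    return md5_checksum(path) == expected_md5.strip().lower()
```

A file whose current checksum no longer matches the recorded one would be a candidate for repair from redundant data; the repair itself is outside the scope of this sketch.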

Integrity and authenticity of the archived data

We verify the integrity and authenticity of the archived data at all stages of the deposited item lifecycle, both automatically and manually. The depositor’s identity is verified through either a local account or a Shibboleth account, which provides their name, email address and other attributes. The item is then linked to the Data Producer’s account, and only the Data Producer may edit, manage and publish changes to their item. This is possible only at the 'before the deposition' stage, that is, before the item has been reviewed and accepted. After review and acceptance by the data curator, the item is published, its version is fixed and its content is immutable. Data versioning is not supported: a Data Producer who wants to introduce a new version of their item has to create a new deposit.

The repository maintains provenance data and related audit trails. Logged-in users can see the changes made to a submitted item, which are recorded in the metadata field dc.description.provenance. This information is shown in the full metadata description mode of an item.

Curated Item

Next, the item is curated by a data steward. The task of the data curator is to verify whether the submission meets our requirements with respect to metadata quality and completeness, bitstream consistency and IPR (intellectual property rights). The curator can return the submission to the data depositor with a description of the changes needed. This step is repeated until the curator approves the item. The approved item becomes a published item.

Edited Item

The task of the editor is to verify whether the submission meets our requirements with respect to metadata quality and completeness, bitstream consistency and IPR. The editor can return the submission to the data depositor with a description of the changes needed. This step is repeated until the editor approves the item. The approved item becomes a published item.

Published Item

A published item obtains a PID (persistent identifier), which should be used for referencing and citing it, e.g., http://hdl.handle.net/11858/00-097C-0000-0022-F59C-8. The CLARIN-PL repository will ensure that the PID (more precisely, the HTTP handle proxy of the PID) resolves to a working web page describing your resource, even if the current server infrastructure changes or is moved.
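As an illustration of how a handle relates to its HTTP proxy form, here is a minimal Python sketch. The helper name handle_to_url is hypothetical, and the sketch assumes the hdl.handle.net proxy shown in the example PID above.

```python
HANDLE_PROXY = "http://hdl.handle.net/"  # proxy taken from the example PID above


def handle_to_url(handle):
    """Turn a bare handle (or its 'hdl:' form) into the HTTP proxy URL used for citing."""
    handle = handle.strip()
    if handle.startswith(HANDLE_PROXY):
        return handle  # already in proxy form
    if handle.lower().startswith("hdl:"):
        handle = handle[4:]
    # A handle consists of a prefix and a suffix separated by the first slash.
    prefix, separator, suffix = handle.partition("/")
    if not (prefix and separator and suffix):
        raise ValueError(f"not a valid handle: {handle!r}")
    return HANDLE_PROXY + handle
```

For example, handle_to_url("11858/00-097C-0000-0022-F59C-8") yields the citation URL given above.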

Published items are available through our search interface and in browsing mode. The metadata of all items is submitted to search engines and is available through the OAI-PMH protocol (several institutes harvest item metadata from our repository, e.g., http://catalog.clarin.eu/vlo/). Bitstreams of public submissions (see Restricted Submissions) are also available through the OAI-ORE protocol.
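A harvester addresses an OAI-PMH endpoint with plain HTTP requests that carry a verb and its arguments. The following Python sketch only builds such a request URL. The function name oai_request_url is hypothetical, and the base URL in the usage note is a placeholder, not the repository's real endpoint.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL for `verb` with optional protocol arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb!r}")
    query = {"verb": verb}
    # Skip arguments left as None so callers can pass optional values directly.
    query.update({key: value for key, value in arguments.items() if value is not None})
    return base_url + "?" + urlencode(query)
```

For instance, oai_request_url("https://repository.example.org/oai/request", "ListRecords", metadataPrefix="oai_dc") builds a request for all records in Dublin Core format, where the host name is an assumption made for the example.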

Deleting and Modifying Published Items

Anybody can request the deletion of a published item; however, such requests are evaluated on a case-by-case basis. Furthermore, we reserve the right to keep the metadata of published submissions available unless there is a specific reason to delete it, because removing it would go against the concept of PIDs (persistent identifiers). All PIDs remain available through the OAI-PMH interface, even if only to inform that the item has been deleted.
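In OAI-PMH 2.0, a deleted record is signalled by a status="deleted" attribute on the record header. The Python sketch below shows how a harvester might detect such records. The function name deleted_identifiers is hypothetical, and the sample response is invented for illustration; it is not real output of this repository.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Invented sample response fragment; the identifiers are placeholders.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:repository.example.org:0000/example-1</identifier>
    </header>
    <header status="deleted">
      <identifier>oai:repository.example.org:0000/example-2</identifier>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""


def deleted_identifiers(response_xml):
    """Return the identifiers of all records whose header is marked as deleted."""
    root = ET.fromstring(response_xml)
    return [
        header.findtext(OAI_NS + "identifier")
        for header in root.iter(OAI_NS + "header")
        if header.get("status") == "deleted"
    ]
```

Applied to the sample above, the function returns only the second identifier, since only that header carries the deleted status.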

We allow minor edits of a submission (e.g., fixing typos) through our Help Desk. These are also evaluated on a case-by-case basis. For major changes, the user is requested to submit a new version of the item, and we will indicate in the metadata of the original that it has been replaced by a newer version.

Restricted Submissions

All metadata is always publicly available. We support open access submissions; however, we also support restrictive licences for bitstreams, which require e-signing before the bitstreams can be downloaded. We keep track of these e-signatures in case of IPR infringements.

See the currently available licences or ask us to add a specific one.

We also support placing an embargo on bitstreams, which means that the bitstreams become publicly available only after a specified date.
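The embargo rule can be sketched as a simple date comparison. The Python sketch below uses the hypothetical function name is_publicly_available and assumes that a bitstream becomes available from its embargo end date onward; the exact boundary handling in the repository may differ.

```python
from datetime import date


def is_publicly_available(embargo_end, today=None):
    """Return True if a bitstream can be downloaded publicly on `today`.

    `embargo_end` is the date the embargo lifts, or None if there is no embargo.
    Assumption: the bitstream is available on the embargo end date itself.
    """
    if embargo_end is None:
        return True
    if today is None:
        today = date.today()
    return today >= embargo_end
```

A bitstream with embargo_end set to date(2030, 1, 1) would therefore be reported as unavailable on any earlier day and as available from that day onward.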