Deposited Item Lifecycle (ver. 1.01)

Reference documentation: Editor Manual



Submitted Item

After you deposit a submission, it is placed in a pool of submitted items. The item is not yet publicly available and waits for an editor to approve or reject it.

Validated Item

  • Automatic validation of file formats and integrity
  • Manual validation of non-standard formats

Once a data producer submits an item to the repository, it is first validated automatically. Specifically, when the dataset is created, the data files are stored on our cluster, their MD5 checksums are calculated, and corrupted files are immediately repaired using redundant data.
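The checksum step can be illustrated with a minimal Python sketch. It is not the repository's actual validation code: the function names md5_of_file and is_intact are hypothetical, and the sketch assumes that the checksum recorded at deposit time is available as a hex string.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, expected_md5: str) -> bool:
    """Compare a freshly computed checksum with the recorded one (hypothetical helper)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A file whose recomputed checksum no longer matches the recorded value would be a candidate for repair from redundant copies; how that repair is carried out is not covered by this sketch.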

Curated Item

Next, the item is curated by a data steward. The task of the data curator is to verify whether the submission meets our requirements with respect to metadata quality and completeness, bitstream consistency and IPR. The curator can return the submission to the data depositor describing the needed changes. This step is repeated until the curator approves the item. The approved item becomes a published item.

Edited Item

The task of the editor is to verify whether the submission meets our requirements with respect to metadata quality and completeness, bitstream consistency and IPR. The editor can return the submission to the data depositor describing the needed changes. This step is repeated until the editor approves the item. The approved item becomes a published item.

Published Item

A published item obtains a PID (persistent identifier), which should be used for referencing and citing, e.g., http://hdl.handle.net/11858/00-097C-0000-0022-F59C-8. The CLARIN-PL repository will ensure that the PID (more precisely, the HTTP handle proxy URL of the PID) resolves to a working web page describing your resource, even if the current server infrastructure changes or is moved.
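For readers who store bare handles (prefix/suffix) rather than full URLs, the relationship between a handle and its proxy URL can be sketched as follows. The helper handle_to_url is hypothetical and only builds the URL; it does not contact the proxy.

```python
from urllib.parse import quote

# Proxy address taken from the example PID above.
HANDLE_PROXY = "http://hdl.handle.net/"


def handle_to_url(handle: str, proxy: str = HANDLE_PROXY) -> str:
    """Turn a bare handle of the form prefix/suffix into a proxy URL (hypothetical helper)."""
    prefix, sep, suffix = handle.strip().partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a handle of the form prefix/suffix: {handle!r}")
    # Percent-encode each part so unusual characters do not break the URL.
    return f"{proxy}{quote(prefix)}/{quote(suffix)}"
```

Calling handle_to_url("11858/00-097C-0000-0022-F59C-8") reproduces the URL quoted above. Citing the proxy URL rather than the repository's own page address is what keeps references stable if the server moves.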

Published items are available through our search interface and in browsing mode. Metadata of all items are submitted to search engines and are available through the OAI-PMH protocol (several institutes harvest our repository for item metadata, e.g., http://catalog.clarin.eu/vlo/). Bitstreams of public submissions (see Restricted Submissions) are also available through the OAI-ORE protocol.
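As a rough illustration of how a harvester addresses an OAI-PMH endpoint, the sketch below builds request URLs from the standard protocol verbs. The base URL is a placeholder, since the repository's real endpoint is not given here, and oai_request_url is a hypothetical helper that performs no network access.

```python
from urllib.parse import urlencode

# Placeholder endpoint: substitute the repository's real OAI-PMH base URL.
OAI_BASE_URL = "https://repository.example.org/oai/request"


def oai_request_url(verb: str, base_url: str = OAI_BASE_URL, **params: str) -> str:
    """Build an OAI-PMH request URL for a protocol verb and its arguments."""
    query = {"verb": verb, **params}
    return f"{base_url}?{urlencode(query)}"


# "Identify" describes the repository; "ListRecords" harvests metadata records.
identify_url = oai_request_url("Identify")
list_url = oai_request_url("ListRecords", metadataPrefix="oai_dc")
```

The oai_dc metadata prefix (unqualified Dublin Core) is the one format every OAI-PMH repository is required to support; other prefixes offered by a given repository can be discovered with the ListMetadataFormats verb.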

Deleting a Published Item

Anybody can request deletion of published data (bitstreams) through our Help Desk. However, such requests are evaluated on a case-by-case basis. Furthermore, we reserve the right to keep the metadata of published submissions available unless there is a specific reason to delete them, because removing the metadata goes against the concept of PIDs (persistent identifiers). All PIDs remain available through the OAI-PMH interface, even if only to inform that the item has been deleted.
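In OAI-PMH, a deleted record is conventionally signalled by a status="deleted" attribute on the record header. Assuming the repository follows that convention, a harvester could detect deleted items roughly as sketched below. The sample response, its identifier, and the helper deleted_identifiers are illustrative and not taken from the repository.

```python
import xml.etree.ElementTree as ET
from typing import List

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Illustrative response fragment; the identifier is made up.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord>
    <record>
      <header status="deleted">
        <identifier>oai:repository.example.org:12345/678</identifier>
        <datestamp>2020-01-01</datestamp>
      </header>
    </record>
  </GetRecord>
</OAI-PMH>
"""


def deleted_identifiers(response_xml: str) -> List[str]:
    """Return identifiers of records whose header is marked as deleted."""
    root = ET.fromstring(response_xml.strip())
    found = []
    for header in root.iterfind(".//oai:header", OAI_NS):
        if header.get("status") == "deleted":
            identifier = header.find("oai:identifier", OAI_NS)
            if identifier is not None and identifier.text:
                found.append(identifier.text.strip())
    return found
```

A deleted record of this kind carries a header but no metadata payload, which matches the idea that the identifier stays resolvable purely to report the deletion.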

Modifying a Published Item

We allow minor edits of the submission (e.g., typos) through our Help Desk. We also allow certain additions to the metadata or data, e.g. adding keywords, adding information on a new publication about the submission, or adding new bitstreams which are derived from existing ones, i.e. making the same data available in additional formats. Suggestions for such edits are also evaluated on a case-by-case basis. In cases of more substantial changes, the user must make a new version of the entry, and the metadata will show that the entry has been superseded by a newer version. Note also that the repository editors might make small changes to the metadata in well-defined circumstances (i.e. correcting typos, removing URLs that no longer work, or unifying keywords) even after the item has been published.

Submitting a new version of an Item

If a new version of a resource has been created, you need to create a new submission. However, you should not create the new submission from scratch; instead, follow this procedure:

  1. Log in and go to the list of your existing Submissions. You can only create a new version from your own previous submission, not from somebody else's. To create a new version of somebody else's submission, ask them to perform steps 1 to 3 in this list and then Save and Share the entry with you, so that you can finish it.
  2. Scroll down to Archived Submissions, tick the select box of the record you want to create a new version of, and select "Add new version".
  3. This creates a cloned record for you, just as if you created a new submission. This new submission has all the metadata from the previous one already copied. Click "Resume". Now you are in the submission workflow.
  4. Modify the metadata you want changed. At a minimum, modify the title by deleting the automatically generated date and giving it a new version number, and briefly describe the differences from the previous version at the end of the description. If the size of the resource has changed, you should also modify the size information.
  5. Upload the data of the new version and finish the submission as usual.

Once the submission is finished, the old and new versions are automatically connected, so that users can see that newer or older versions of the resource exist.

A more detailed description, together with screenshots, is available from the GitHub CLARIN-DSPACE site.

Integrity and authenticity of the archived data

We verify the integrity and authenticity of the archived data at all stages of the item’s lifecycle, both automatically and manually. The depositor’s identity is verified either through a local account or via a Shibboleth account, which provides verified personal attributes such as name and email address. Each deposited item is linked to the Data Producer’s account, and only the Data Producer is authorized to edit, manage, and submit changes to their item prior to deposition.

After the review and acceptance by the data curator, the item is published, and its version becomes fixed — the content of the published bitstreams is immutable. Minor post-publication modifications (e.g., correction of typos, addition of keywords, or inclusion of derived data formats) may be made upon justified request via the Help Desk and are evaluated on a case-by-case basis. In cases of more substantial revisions, a new version of the item must be created, and the metadata will indicate that the earlier version has been superseded.

The repository also maintains provenance data and audit trails to ensure traceability and transparency. Logged-in users have access to detailed information about all changes related to a submitted item, available in the metadata field dc.description.provenance. This information can be viewed in the full metadata description mode (e.g., see here).

Restricted Submissions

All metadata are always publicly available. We support open access submissions; however, we also support restrictive licenses for bitstreams, which require e-signing before the bitstreams can be downloaded. We keep track of these e-signatures in case there are IPR infringements.

See currently available licenses or ask us to add a specific one.

We also support placing an embargo on bitstreams, which means that the bitstreams become publicly available only after a specified date.
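The embargo rule amounts to a simple date comparison, sketched below. The function is_publicly_available is hypothetical, and the sketch assumes that a bitstream becomes available on the embargo end date itself; the repository may apply a different boundary.

```python
from datetime import date
from typing import Optional


def is_publicly_available(embargo_until: Optional[date], today: date) -> bool:
    """Return True if a bitstream can be downloaded publicly on the given day."""
    if embargo_until is None:
        # No embargo was set for this bitstream.
        return True
    # Assumption: access opens on the embargo end date itself.
    return today >= embargo_until
```

Note that this check concerns bitstreams only; the metadata of an embargoed item stay publicly visible throughout.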