dc.contributor.author |
Walczak, Jerzy Piotr |
dc.contributor.author |
Sobótka, Paweł |
dc.contributor.author |
Marasek, Krzysztof |
dc.date.accessioned |
2016-06-06T13:43:41Z |
dc.date.available |
2016-06-06T13:43:41Z |
dc.date.issued |
2016-05-12 |
dc.identifier.uri |
http://hdl.handle.net/11321/297 |
dc.description |
This submission contains the operating system of the long-term archive built at the Polish-Japanese Academy of Information Technology for the Clarin-PL project. The basic elements of the archive are data nodes equipped with mass storage. The nodes are controlled by embedded low-power computers that are powered up independently, only when their storage is about to be accessed. This not only limits overall energy consumption but also lowers environmental requirements (no air conditioning is needed). The nodes are grouped in trays. The basic, recommended configuration allows for 30 nodes per tray, but this limit can be extended up to 253. Each tray contains several networks designed for data transport, device state control, and power supply. Communication with clients is conducted through buffers, which are the only parts visible from externally connected networks. Stored files are therefore completely isolated and cannot be accessed directly. Multiple trays located at a single physical site form a complete archive. The storage space can be split into virtual archives that are separated at the logical level. The operating system of the data network can store from 3 to 7 copies of a single digital file on different nodes. Moreover, additional copies of a resource may be stored automatically in remotely located archives. The trays are treated as local parts of a wider, dispersed data network structure. The archive software not only enables secure read and write operations on data but also takes care of the stored data automatically: it periodically regenerates the physical state of saved files. In case of device failure, clients are transparently redirected to local or remote redundant copies. A mechanism of "software bots" has been implemented: the archive can be supplied with external programs for processing files stored inside the data network. 
This allows for data analysis, indexing, post-data creation, statistical computations, or finding associations in unstructured Big Data sets. Only the output of a software bot can be accessed externally, which makes such operations very secure. Client programs communicate with the archive using a set of simple protocols based on key-value pair strings, making it convenient to build web interfaces for archive access and administration. By automating the supervision of resources, reducing storage requirements, and precisely controlling energy consumption, the proposed solution significantly lowers the cost of long-term data storage. |
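The replication limits stated in the description (3 to 7 copies of a file, each on a different node, with trays of 30 nodes by default and at most 253) can be illustrated with a minimal sketch. This is not the archive's actual code: the function name choose_replica_nodes, the constants, and the random placement policy are all assumptions made for illustration only.

```python
import random

# Limits taken from the description above.
MIN_COPIES = 3
MAX_COPIES = 7
MAX_TRAY_SIZE = 253  # the recommended tray size is 30 nodes


def choose_replica_nodes(node_ids, copies, rng=None):
    """Pick `copies` distinct nodes of one tray to hold replicas of a file.

    Hypothetical helper: the real placement policy of the archive is not
    documented here, so uniform random selection is only an assumption.
    """
    if not MIN_COPIES <= copies <= MAX_COPIES:
        raise ValueError(f"copies must be between {MIN_COPIES} and {MAX_COPIES}")
    unique_nodes = sorted(set(node_ids))
    if len(unique_nodes) > MAX_TRAY_SIZE:
        raise ValueError(f"a tray holds at most {MAX_TRAY_SIZE} nodes")
    if copies > len(unique_nodes):
        raise ValueError("not enough distinct nodes for the requested copies")
    rng = rng or random.Random()
    # sample() draws without replacement, so every copy lands on a different node.
    return rng.sample(unique_nodes, copies)


if __name__ == "__main__":
    tray = list(range(1, 31))  # a tray in the recommended 30-node configuration
    print(choose_replica_nodes(tray, copies=3))
```

Drawing without replacement is what guarantees that no two copies share a node; any other policy (for example, preferring nodes that are already powered up) would have to preserve that property.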
dc.language.iso |
N/A |
dc.publisher |
Polish-Japanese Academy of Information Technology |
dc.rights |
BSD 2 Clause |
dc.rights.uri |
https://opensource.org/licenses/BSD-2-Clause |
dc.rights.label |
PUB |
dc.source.uri |
http://www.clarin-pl.eu |
dc.subject |
long-term archive |
dc.title |
Long term archive operating system source code |
dc.type |
toolService |
metashare.ResourceInfo#ContentInfo.detailedType |
service |
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent |
false |
hasMetadata |
true |
has.files |
yes |
branding |
CLARIN-PL |
demo.uri |
http://www.clarin-pl.eu |
contact.person |
Krzysztof Marasek kmarasek@pja.edu.pl Polish-Japanese Academy of Information Technology |
sponsor |
Ministry of Science and Higher Education (Poland) x Clarin-PL nationalFunds |
files.size |
3710304 |
files.count |
1 |