gilbertchen edited this page Sep 27, 2020 · 1 revision

This feature has been available since CLI version 2.7.0.

Usage

To initialize a storage with erasure coding enabled, run this command (assuming 5 data shards and 2 parity shards):

duplicacy init -erasure-coding 5:2 repository_id storage_url

Then you can run backup, check, prune, etc. as usual.
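To make the 5:2 setting concrete, here is a minimal Python sketch of what the two numbers imply. The function names are hypothetical and not part of Duplicacy. The overhead figure ignores the banner, headers and shard hashes, and the sketch assumes, as is usual for erasure codes of this kind, that a chunk stays recoverable as long as no more shards are damaged than there are parity shards.

```python
from fractions import Fraction


def parse_erasure_coding(spec: str) -> tuple[int, int]:
    """Split a 'data:parity' string such as '5:2' into two positive integers."""
    data_text, parity_text = spec.split(":")
    data_shards, parity_shards = int(data_text), int(parity_text)
    if data_shards < 1 or parity_shards < 1:
        raise ValueError("both shard counts must be positive")
    return data_shards, parity_shards


def storage_overhead(data_shards: int, parity_shards: int) -> Fraction:
    """Ratio of stored shard bytes to original chunk bytes (headers and hashes ignored)."""
    return Fraction(data_shards + parity_shards, data_shards)


data_shards, parity_shards = parse_erasure_coding("5:2")
print(data_shards + parity_shards)                           # shards stored per chunk: 7
print(float(storage_overhead(data_shards, parity_shards)))   # 1.4
```

With 5:2, each chunk is stored as 7 shards and takes roughly 1.4 times the space of the unencoded chunk.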

When a bad chunk is detected, you'll see log messages like this:

  Restoring /private/tmp/duplicacy_test/repository to revision 1
  Recovering a 1824550 byte chunk from 364910 byte shards: ***--**
  Downloaded chunk 1 size 1817347, 1.73MB/s 00:00:11 9.0%
  Recovering a 6617382 byte chunk from 1323477 byte shards: **--***
  Downloaded chunk 2 size 6591322, 8.02MB/s 00:00:02 42.0%
  Recovering a 5136934 byte chunk from 1027387 byte shards: --*****
  Downloaded chunk 3 size 5116593, 12.90MB/s 00:00:01 67.6%
  Recovering a 2515494 byte chunk from 503099 byte shards: -*****-
  Downloaded chunk 4 size 2505558, 15.29MB/s 00:00:01 80.1%
  Recovering a 3984934 byte chunk from 796987 byte shards: --*****
  Downloaded chunk 5 size 3969180, 19.07MB/s 00:00:01 100.0%
  Downloaded file1 (20000000)
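The numbers in these messages can be cross-checked with a short Python sketch. It assumes, based only on the sample above, that the shard size is the chunk size divided by the number of data shards, rounded up. The regular expression and function names are hypothetical. The page does not spell out what `*` and `-` mean, so the sketch only counts them.

```python
import re

# Matches the "Recovering ..." lines shown in the sample log.
RECOVERY_LINE = re.compile(
    r"Recovering a (\d+) byte chunk from (\d+) byte shards: ([*-]+)"
)


def parse_recovery_line(line: str):
    """Return (chunk_size, shard_size, pattern), or None if the line does not match."""
    match = RECOVERY_LINE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)


def expected_shard_size(chunk_size: int, data_shards: int) -> int:
    # Assumption inferred from the sample log, not from a specification:
    # the chunk is split evenly across the data shards, rounding up.
    return (chunk_size + data_shards - 1) // data_shards


sample = [
    "Recovering a 1824550 byte chunk from 364910 byte shards: ***--**",
    "Recovering a 6617382 byte chunk from 1323477 byte shards: **--***",
]
for line in sample:
    chunk_size, shard_size, pattern = parse_recovery_line(line)
    assert shard_size == expected_shard_size(chunk_size, 5)
    print(len(pattern), pattern.count("-"))  # 7 markers, 2 of them "-"
```

Each pattern has one marker per shard, so with 5 data shards and 2 parity shards it is 7 characters long.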

To check whether a storage is configured with erasure coding, run duplicacy -d list, which should report the number of data and parity shards:

Data shards: 5, parity shards: 2
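If you want to do this check from a script, a small Python sketch like the following can pick the two numbers out of captured output. The function name is hypothetical, and the sketch assumes the line appears somewhere in the output in exactly the form shown above, possibly with a log prefix in front of it.

```python
import re

# Matches the report line shown above, wherever it occurs in the output.
SHARD_REPORT = re.compile(r"Data shards: (\d+), parity shards: (\d+)")


def find_shard_config(output: str):
    """Return (data_shards, parity_shards), or None if no report line is present."""
    match = SHARD_REPORT.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(find_shard_config("Data shards: 5, parity shards: 2"))  # (5, 2)
print(find_shard_config("no erasure coding information here"))  # None
```

The output could be captured with, for example, subprocess.run(["duplicacy", "-d", "list"], capture_output=True, text=True).stdout, run from inside the repository.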

Encoding Format

The encoded chunk file starts with a unique 10-byte banner, then a 14-byte header containing the chunk size and parity parameters, followed by the hash of each shard, then the contents of the shards, and finally the 14-byte header again for redundancy:

----------------------------
| duplicacy\0003 (10 bytes) |
-------------------------------------------------------------------------------------------------
| chunk size (8 bytes) | #data shards (2 bytes) | #parity shards (2 bytes) | checksum (2 bytes) |
-------------------------------------------------------------------------------------------------
| hash of data shard #1 (32 bytes) |
------------------------------------
...
| hash of parity shard #1 (32 bytes) |
------------------------------------
...
| data shard #1 |
-----------------
...
| parity shard #1 |
-----------------
...
-------------------------------------------------------------------------------------------------
| chunk size (8 bytes) | #data shards (2 bytes) | #parity shards (2 bytes) | checksum (2 bytes) |
-------------------------------------------------------------------------------------------------
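The layout above can be turned into byte offsets with a short Python sketch. The field sizes (10, 14 and 32 bytes) come from the diagram. The shard size formula is an assumption inferred from the sample log in the Usage section (chunk size divided by the number of data shards, rounded up), and the function and key names are hypothetical. The sketch does not decode the header fields, because this page does not state their byte order or how the checksum is computed.

```python
BANNER_SIZE = 10   # unique banner
HEADER_SIZE = 14   # 8 (chunk size) + 2 (#data) + 2 (#parity) + 2 (checksum)
HASH_SIZE = 32     # one hash per shard


def encoded_layout(chunk_size: int, data_shards: int, parity_shards: int) -> dict:
    """Compute the offset of each section of an encoded chunk file."""
    total_shards = data_shards + parity_shards
    # Assumption: shards are equal in size, chunk size / data shards rounded up.
    shard_size = (chunk_size + data_shards - 1) // data_shards
    hashes_offset = BANNER_SIZE + HEADER_SIZE
    shards_offset = hashes_offset + HASH_SIZE * total_shards
    trailing_header_offset = shards_offset + shard_size * total_shards
    return {
        "header_offset": BANNER_SIZE,
        "hashes_offset": hashes_offset,
        "shards_offset": shards_offset,
        "shard_size": shard_size,
        "trailing_header_offset": trailing_header_offset,
        "total_size": trailing_header_offset + HEADER_SIZE,
    }


layout = encoded_layout(1824550, 5, 2)
print(layout["shards_offset"])  # 248
print(layout["total_size"])     # 2554632
```

Under these assumptions, the 1824550 byte chunk from the sample log would occupy 2554632 bytes once encoded with 5 data shards and 2 parity shards.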