As a curator
I want the dataset files I'm publishing to Wasabi to also be backed up to S3 Glacier
So that I have peace of mind that any dataset files we publish have a backup available
Acceptance criteria
Given there are new GigaDB datasets that are not backed up and not published
When I push the dataset files to Wasabi from the bastion server
Then the dataset files are saved to the Wasabi bucket
And the dataset files are saved to the AWS S3 Glacier storage class
rija changed the title from "Access Tencent COS service to backup GigaDB datasets" to "Enable mechanism to upload new dataset from EFS to Wasabi and S3 Glacier at the same time" on Jul 3, 2024
Given there are new GigaDB datasets that are not backed up and not published
When I push the dataset files to Wasabi from the bastion server
Then the dataset files are saved to the Wasabi bucket
And the dataset files are saved to the AWS S3 Glacier storage class
Hi @kencho51, @pli888, @only1chunts mentioned this morning that the acceptance criteria that was the basis of our thinking during sprint planning is not correct.
Until a dataset is set to published, the curator may amend the files on Wasabi multiple times, so it's not worth backing up non-definitive files, multiple times, to cold storage.
So I suggested that Ken update the transfer wrapper script to take the flags --wasabi and --backup, which will transfer the files to Wasabi or to S3 Glacier depending on which flag is passed.
My only concern is that the backup is no longer automated, which is a big problem, as humans have to remember to run the transfer to the backup destination.
So, I'll be creating a new backlog ticket for automatically and asynchronously backing up dataset files when the upload status is set to "Published". (Update: created #1992 )
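The flag-driven wrapper suggested above could be sketched roughly as follows. This is a minimal illustration, not the actual script: the remote names (`wasabi`, `glacier-backup`), the main Wasabi bucket name, the EFS source path, and the log location are all assumptions (only `gigadb-datasetfiles-backup` comes from this ticket), and it echoes the `rclone` invocations rather than running them.

```shell
#!/usr/bin/env bash
# Sketch of a transfer wrapper taking --wasabi and/or --backup flags.
# Remote/bucket names and paths are assumptions for illustration only.

transfer_dataset() {
  local doi="$1"; shift
  local src="/share/dropbox/${doi}"     # assumed EFS mount point on the bastion
  local dests=()
  for flag in "$@"; do
    case "$flag" in
      --wasabi) dests+=("wasabi:gigadb-datasets/${doi}") ;;               # assumed main bucket
      --backup) dests+=("glacier-backup:gigadb-datasetfiles-backup/${doi}") ;;
      *) echo "unknown flag: $flag" >&2; return 1 ;;
    esac
  done
  if [ "${#dests[@]}" -eq 0 ]; then
    echo "usage: transfer_dataset <doi> --wasabi and/or --backup" >&2
    return 1
  fi
  for dest in "${dests[@]}"; do
    # Echo the command for illustration; drop 'echo' to actually transfer.
    echo rclone copy "$src" "$dest" --log-file="/var/log/transfer-${doi}.log"
  done
}
```

Passing both flags would copy to both destinations in one run, while `--wasabi` alone covers the pre-publication case where backup is not yet wanted.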
…Wasabi and backup to S3 by curators (Merge pull request #1977)
- Allow user to upload dataset files to Wasabi bucket and also S3 Glacier bucket for backup
- Automatically mount EFS access point to bastion and webapp servers
- Remove user suffix from wasabi profile and improve curators docs
- Fix acceptance tests failure in Gitlab pipeline
Refs: #1771, #1861, #1903, #2064
Additional Info
1) Create an S3 bucket named `gigadb-datasetfiles-backup` for the backup of dataset files, using the same subdirectory structure as on the Wasabi main storage of dataset files #1964
2) Create a `FilesBackupTool` user in IAM and create (and document) the appropriate IAM policy named `AllowReadWriteBucketGigadbDatasetsFilesBackup` for accessing the `gigadb-datasetfiles-backup` bucket #1965
3) Ensure the rclone config for curators on the bastion server has configuration for both destinations #1966
4) Create and deploy (using `bastion_playbook.yml`) a wrapper script that handles the transfer (`rclone copy`) and logging of selected dataset files to both destinations #1967
5) Add a curators manual for operating the tool #1968
Note: the actual transfer of already-published dataset files is not part of this ticket, but should be dealt with in ticket #1963
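Step 3 above calls for an rclone configuration covering both destinations. A minimal sketch of what the curators' `rclone.conf` on the bastion server might contain, assuming remote names `wasabi` and `glacier-backup`; the endpoints, region, and credential placeholders are illustrative, not taken from this ticket:

```ini
# ~/.config/rclone/rclone.conf (sketch; remote names and values are assumptions)
[wasabi]
type = s3
provider = Wasabi
endpoint = s3.wasabisys.com
access_key_id = <wasabi-access-key>
secret_access_key = <wasabi-secret-key>

[glacier-backup]
type = s3
provider = AWS
region = <aws-region>
access_key_id = <FilesBackupTool-access-key>
secret_access_key = <FilesBackupTool-secret-key>
# Upload objects directly into the Glacier storage class
storage_class = GLACIER
```

With both remotes defined, the same `rclone copy` wrapper can target either destination simply by switching the remote name.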
Product Backlog Item Ready Checklist
Product Backlog Item Done Checklist