After speaking with @mkoskinen, he made a valid point that there may be a use case where some kind of object storage (or more specifically an implementation of `BackupClientInterface`) is so basic/simple that it doesn't support functionality such as resuming. A way around this would be to use a `BackupClientInterface` that does support resuming (even flat file storage) and then, after each substream for that object/key/file is complete, upload it to S3/GCS/whatever as a whole.
This can also solve other problems. For example, we currently don't compress the `.json` file (using something like gzip) while streaming because we can't find a resume point in a compressed object/key/file. This approach would allow you to back up the stream to an initial storage and then, after it's finished, compress it and send it to S3/GCS/whatever.
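As a rough illustration of that second step, here is a minimal sketch assuming an Akka Streams based pipeline; the file names are hypothetical, and in practice the final sink would be an S3/GCS upload rather than another local file:

```scala
import java.nio.file.Paths

import akka.actor.ActorSystem
import akka.stream.scaladsl.{Compression, FileIO}

object CompressAfterBackup extends App {
  implicit val system: ActorSystem = ActorSystem("compress-after-backup")
  import system.dispatcher

  // Once a substream's flat file is complete, re-read it, gzip it and write
  // the compressed copy. Compressing is safe here because the object is
  // finished, so no resume point inside the compressed stream is needed.
  FileIO
    .fromPath(Paths.get("backup.json")) // hypothetical completed backup file
    .via(Compression.gzip)
    .runWith(FileIO.toPath(Paths.get("backup.json.gz")))
    .onComplete(_ => system.terminate())
}
```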
On first impressions, the implementation could be adding a single method to the `BackupClientInterface` that returns an `Option[Sink]`; if it's defined, this sink gets executed after a backup is complete. One consideration is whether the sink should be run asynchronously or synchronously (ideally as a parameter in the method itself), i.e.
```scala
def afterBackupSink(async: Boolean): Option[Sink]
```
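A minimal sketch of how the interface extension might look; the element and materialized types (`ByteString`, `Future[Done]`) are assumptions, since the issue only names `Option[Sink]`:

```scala
import akka.Done
import akka.stream.scaladsl.Sink
import akka.util.ByteString

import scala.concurrent.Future

trait BackupClientInterface {
  // Existing sink that the backup stream writes into (simplified).
  def backupSink: Sink[ByteString, Future[Done]]

  // Optional post-backup step: when defined, the completed object/key/file
  // is streamed into this sink once the backup finishes. `async` controls
  // whether the caller waits for it to complete (synchronous) or lets it
  // run in the background (asynchronous).
  def afterBackupSink(async: Boolean): Option[Sink[ByteString, Future[Done]]] = None
}
```

Defaulting to `None` would also keep existing `BackupClientInterface` implementations source-compatible, since only backends that need the post-processing step would override it.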