Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature]: Support for S3 with AccessKey and SecretKey #123

Open
jhnnsrs opened this issue Jul 14, 2022 · 4 comments
Open

[Feature]: Support for S3 with AccessKey and SecretKey #123

jhnnsrs opened this issue Jul 14, 2022 · 4 comments

Comments

@jhnnsrs
Copy link

jhnnsrs commented Jul 14, 2022

Hi!

First thanks for the great work, especially for the effort in types!
As a lot of the community uses S3 for storage, I was wondering if you would be
interested in supporting a dedicated S3 Store (building on the HTTP Store but with
support for accessKeys and secretKeys? I could spin up a PR, but here is already an
implementation using "aws4fetch".

import {
  HTTPStore,
  openGroup,
  openArray,
  ZarrArray,
  Group as ZarrGroup,
} from "zarr";
import { AwsClient } from "aws4fetch";

enum HTTPMethod {
  Get = "GET",
  Head = "HEAD",
  Put = "PUT",
}


export class S3HttpError extends Error {
  __zarr__: string;
  constructor(code: any) {
    super(code);
    this.__zarr__ = "HTTPError";
    Object.setPrototypeOf(this, HTTPError.prototype);
  }
}

export class S3KeyError extends Error {
  __zarr__: string;

  constructor(key: any) {
    super(`key ${key} not present`);
    this.__zarr__ = "KeyError";
    Object.setPrototypeOf(this, KeyError.prototype);
  }
}

export function joinUrlParts(...args: string[]) {
  return args
    .map((part, i) => {
      if (i === 0) {
        return part.trim().replace(/[\/]*$/g, "");
      } else {
        return part.trim().replace(/(^[\/]*|[\/]*$)/g, "");
      }
    })
    .filter((x) => x.length)
    .join("/");
}

class S3Store extends HTTPStore {
  aws: AwsClient;

  constructor(url: string, aws: AwsClient, options: any = {}) {
    super(url, options);
    this.aws = aws;
  }

  async getItem(item: any, opts: any) {
    const url = joinUrlParts(this.url, item);
    let value: any;
    try {
      value = await this.aws.fetch(url, { ...this.fetchOptions, ...opts });
    } catch (e) {
      throw new S3HTTPError("present");
    }
    if (value.status === 404) {
      // Item is not found
      throw new S3KeyError(item);
    } else if (value.status !== 200) {
      throw new S3HttpError(String(value.status));
    }

    return value.arrayBuffer(); // Browser
    // only decode if 200
  }
  async setItem(item: any, value: any) {
    const url = joinUrlParts(this.url, item);
    if (typeof value === "string") {
      value = new TextEncoder().encode(value).buffer;
    }
    const set = await this.aws.fetch(url, {
      ...this.fetchOptions,
      method: HTTPMethod.Put,
      body: value,
    });
    return set.status.toString()[0] === "2";
  }

  async containsItem(item: any) {
    const url = joinUrlParts(this.url, item);
    try {
      const value = await this.aws.fetch(url, {
        ...this.fetchOptions,
      });

      return value.status === 200;
    } catch (e) {
      return false;
    }
  }
}

Hope that might help some!

@gzuidhof
Copy link
Owner

Thank you for sharing :)

I would be hesitant to put it into the codebase as it would introduce a dependency on aws4fetch even for users that don't interact with s3(-compatible) stores.

But I would love to have this as an example in the docs! If you would be open to making a PR with that I'd happily merge that. I think it could deserve its own markdown file here, and then you would link in it in the sidebar here.

One improvement to your code: you could use a HEAD request for containsItem (docs), it saves downloading the entire chunk just to see if it exists :)

@jhnnsrs
Copy link
Author

jhnnsrs commented Jul 15, 2022

Agreed! Better no dependency hell.

Will write a few lines about it, and make a PR.
Originally I had the HEAD request on containsItem, but HEAD is not always an allowed method on S3 endpoints (some minio deployments).

@manzt
Copy link
Collaborator

manzt commented Jul 15, 2022

One thing we had also discussed previously would be to have the HTTPStore a more general fetch-based store where you can provide your own fetch implementation.

E.g.,

let store = new FetchStore(url, { fetch: globalThis.fetch }); // default, essentially current `HTTPStore`

let store = new FetchStore(url, {
  fetch: new AwsClient({ /* ... */ }).fetch,
  supportedMethods: ["GET"], // default ["GET", "PUT", "OPTIONS"],
});

Regardless, I think this example would be really great to have in our documentation!

@jhnnsrs
Copy link
Author

jhnnsrs commented Jul 16, 2022

Sounds great to me. Maybe I would still stick with the HTTPStore naming though. But I love the more compositional approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants