Implement "accounting" service that tracks user egress and updates Stripe #137

Open
4 of 8 tasks
Tracked by #135
travis opened this issue Sep 20, 2024 · 3 comments
@travis
Member

travis commented Sep 20, 2024

When a request comes into the gateway, its last step will be to send a UCAN invocation (open question: should it be a UCAN invocation?) to a new "egress accounting service" that records important metadata about the request and reports the usage to our Stripe account.

While this might make sense as an AWS Lambda service, I think it's also worth seriously considering writing it as a Cloudflare worker, since the rest of the read pipeline is in Cloudflare. If we go this route, we should consider alternatives to the AWS UCAN stream and will need to figure out how to connect to Stripe from Cloudflare apps (this should not be hard).

The accounting service may also be expected to update caches or datastores based on the invocations it receives.

This service will record usage in Stripe, but should also keep enough information about accounting to support our billing and product development processes.

Tasks for Egress Event Tracking and Billing Integration

@travis travis mentioned this issue Sep 20, 2024
3 tasks
@fforbeck fforbeck self-assigned this Sep 24, 2024
@fforbeck fforbeck assigned fforbeck and unassigned fforbeck Oct 3, 2024
@fforbeck
Member

fforbeck commented Oct 4, 2024

Hey @travis and @alanshaw,
Check out the plan for the egress traffic system. Let me know if you're good with it, and I'll start the implementation.

In order to implement the Accounting Service that tracks egress traffic, we will establish a system with the following components:

  1. DynamoDB Table for Egress Events:

    • Purpose: Store egress events, where each event represents a request served to a customer.
    • Schema: Include attributes such as customerId, resourceId, and timestamp to uniquely identify and track each request.
  2. SQS Queue for Egress Events:

    • Purpose: Serve as the entry point for egress events pushed by the Freeway Cloudflare worker.
    • Functionality: Acts as a buffer to handle incoming events, ensuring they are processed in a reliable and scalable manner.
  3. AWS Lambda Handler for Event Processing:

    • Purpose: Process events from the SQS queue and store them in the DynamoDB table.
    • Processing Strategy: Process events as soon as they arrive, so egress requests are tracked in near real time.
    • Implementation: Save one item in DynamoDB for each request, maintaining an accurate tally of requests per customer.
  4. Stripe Integration for Billing:

    • Purpose: Stripe provides a way to track usage-based billing through its Usage Records API. This allows us to bill customers based on the number of units they consume, such as API requests. We need to integrate Stripe to track egress traffic for billing purposes.
    • Method: For each processed event, increment the usage quantity by 1 in Stripe for the corresponding customer, ensuring accurate billing based on the number of requests served.
  5. Testing and Validation:

    • Unit Testing: Implement unit tests for the Lambda function to ensure correct processing logic and error handling.
    • Integration Testing: Deploy to a test environment and simulate egress events to verify end-to-end functionality, including DynamoDB updates and Stripe integration.
    • Mocking External Services: Use mocks for AWS services and Stripe API during testing to simulate interactions and validate behavior without incurring costs.
    • Monitoring and Logging: Set up logging and monitoring to track the processing of events and identify any issues during testing and production.
  6. Architecture Diagram:

    • Visual Representation: The following Mermaid diagram illustrates the architecture of the egress traffic accounting system:
graph TD;
    A[Freeway Cloudflare Worker] -->|Push Egress Events| B[SQS Queue];
    B -->|Trigger| C[AWS Lambda Function];
    C -->|Store Event| D[DynamoDB Table];
    C -->|Record Usage| E[Stripe API];
    subgraph AWS
        B
        C
        D
    end
    subgraph External
        A
        E
    end
    style A fill:#f9f,stroke:#333,stroke-width:2px;
    style B fill:#bbf,stroke:#333,stroke-width:2px;
    style C fill:#bbf,stroke:#333,stroke-width:2px;
    style D fill:#bbf,stroke:#333,stroke-width:2px;
    style E fill:#f9f,stroke:#333,stroke-width:2px;
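
As a rough illustration of components 3 and 4 above, here is a minimal TypeScript sketch of the Lambda-side processing loop. It is not the planned implementation. The `EgressEvent` fields follow the schema attributes named in the plan; the `EgressStore` and `UsageRecorder` interfaces, and the function names, are hypothetical stand-ins for the DynamoDB table and the Stripe usage call. The batch and response types are trimmed local copies of the SQS event shape Lambda delivers, and the failure list assumes partial batch responses are enabled on the event source mapping.

```typescript
// Hypothetical event shape; field names follow the schema described in the plan.
interface EgressEvent {
  customerId: string;
  resourceId: string;
  timestamp: string; // ISO 8601 string (an assumption; the plan does not fix a format)
}

// Trimmed local stand-ins for the SQS batch a Lambda receives and the
// partial-batch response it may return.
interface SqsRecord { messageId: string; body: string }
interface SqsBatch { Records: SqsRecord[] }
interface BatchResponse { batchItemFailures: { itemIdentifier: string }[] }

// Hypothetical ports for the two side effects, so tests can swap in fakes.
interface EgressStore { put(event: EgressEvent): Promise<void> }
interface UsageRecorder { increment(customerId: string, quantity: number): Promise<void> }

// Parse and validate one message body; throws on anything malformed.
function parseEgressEvent(body: string): EgressEvent {
  const raw: unknown = JSON.parse(body);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("event must be a JSON object");
  }
  const { customerId, resourceId, timestamp } = raw as Record<string, unknown>;
  if (typeof customerId !== "string" || customerId === "") {
    throw new Error("missing customerId");
  }
  if (typeof resourceId !== "string" || resourceId === "") {
    throw new Error("missing resourceId");
  }
  if (typeof timestamp !== "string" || Number.isNaN(Date.parse(timestamp))) {
    throw new Error("invalid timestamp");
  }
  return { customerId, resourceId, timestamp };
}

// Store each event, then record one unit of usage for its customer.
// Messages that fail are reported back so only those are redelivered.
async function handleBatch(
  batch: SqsBatch,
  store: EgressStore,
  usage: UsageRecorder,
): Promise<BatchResponse> {
  const batchItemFailures: { itemIdentifier: string }[] = [];
  for (const record of batch.Records) {
    try {
      const event = parseEgressEvent(record.body);
      await store.put(event);
      await usage.increment(event.customerId, 1);
    } catch {
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }
  return { batchItemFailures };
}
```

One design point the sketch leaves open: standard SQS queues deliver at least once, so a message that is stored but fails on the usage step would be stored and counted again on retry unless the real implementation deduplicates.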

@travis
Member Author

travis commented Oct 7, 2024

This looks great to me! The one thing it's missing is a discussion of operational monitoring - I think we need, at minimum, monitoring and alerting on the SQS queue to make sure we know when the reader (a lambda I assume?) is falling behind. We definitely also want alerting on any errors coming out of this system - this is probably as simple as making sure Sentry is configured for each of the components, but worth noting.
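
To make the "falling behind" condition concrete, here is a small TypeScript sketch of the check an alert could encode. It is only an illustration: the two snapshot fields mirror the SQS CloudWatch metrics `ApproximateAgeOfOldestMessage` and `ApproximateNumberOfMessagesVisible`, while the type names, the `checkBacklog` function, and the thresholds are hypothetical. In practice this would more likely be a CloudWatch alarm than application code.

```typescript
// Point-in-time view of the queue, taken from the two SQS metrics named above.
interface QueueSnapshot {
  ageOfOldestMessageSeconds: number;
  messagesVisible: number;
}

// Hypothetical alert thresholds; real values would come from observed traffic.
interface BacklogThresholds {
  maxAgeSeconds: number;
  maxVisible: number;
}

type BacklogStatus = { ok: true } | { ok: false; reasons: string[] };

// Report whether the consumer appears to be falling behind, and why.
function checkBacklog(snapshot: QueueSnapshot, limits: BacklogThresholds): BacklogStatus {
  const reasons: string[] = [];
  if (snapshot.ageOfOldestMessageSeconds > limits.maxAgeSeconds) {
    reasons.push(
      `oldest message is ${snapshot.ageOfOldestMessageSeconds}s old (limit ${limits.maxAgeSeconds}s)`,
    );
  }
  if (snapshot.messagesVisible > limits.maxVisible) {
    reasons.push(
      `${snapshot.messagesVisible} messages waiting (limit ${limits.maxVisible})`,
    );
  }
  return reasons.length === 0 ? { ok: true } : { ok: false, reasons };
}
```

Message age is usually the more direct signal that the reader is lagging, since queue depth alone also rises during a healthy traffic spike.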

Before you start implementation could you create issues (probably sub-issues of this one - should be as simple as creating a "todo" list with new issues - see https://dev.to/keracudmore/create-sub-issues-in-github-issues-409m for more detail) for each of these tasks? We'll make sure to get them into the sprint starting this week.

@fforbeck
Member

fforbeck commented Oct 7, 2024

Sounds great! Thanks for reviewing it. I just updated the ticket description with the tasks.
