Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch Airnode feed calls to Signed API by small buffer #212

Open
Siegrift opened this issue Feb 1, 2024 · 11 comments
Open

Batch Airnode feed calls to Signed API by small buffer #212

Siegrift opened this issue Feb 1, 2024 · 11 comments
Labels
on hold We do not plan to address this at the moment

Comments

@Siegrift
Copy link
Collaborator

Siegrift commented Feb 1, 2024

Currently, Nodary Airnode feed makes ~1800 per second Signed API calls (counting only those that reach our server). These numbers are similar for many other Airnode feeds. This is because the non-batched feeds make single API call per data feed and after the data is fetched it is pushed to Signed API. Put shortly, we have as many Signed API POST requests as we have data provider requests.

There are these issues:

  1. These requests are randomly distributed, but there can still be too many of them at once, possibly causing Fix load balancer returning 502 Bad Gateway error #211
  2. Causes too many logs in Signed APIs, relating to Investigate possible log reduction in Signed API and Airnode feed #208
  3. Causing unnecessary HTTP request overhead for both Airnode feed and Signed API.

What I propose is to batch the Airnode feed calls by a short period (e.g. 50ms or 100ms). We want to push the data as soon as possible, but such a short delay seems acceptable to me. This sets the upper bound on the number of Signed API calls requests made per second.

If we do this, we might not want to remove the "Request received" message in this issue, but instead log it with how many "signed datas" it received in the batch.

@bbenligiray let me know wdyt.

@bbenligiray
Copy link
Member

bbenligiray commented Feb 1, 2024

What are the expected batch sizes with a 50-100 ms period?

Currently, Nodary Airnode feed makes ~1800 per second Signed API calls

This surprises me, considering that Nodary Airnode feed batches all of its calls. Maybe you included the Nodary signed API logs?

@Siegrift
Copy link
Collaborator Author

Siegrift commented Feb 1, 2024

Nodary Airnode feed makes ~1800 per second Signed API calls

Sorry. I mistyped. It's 1800 per MINUTE not second. This means it makes 30 calls every second. But when you check the link, you can see that the distribution is not really uniform.

This surprises me, considering that Nodary Airnode feed batches all of its calls.

I think you are looking at the wrong file. I believe the actual deployment is this file. I spoke to @metobom about this and he mentioned he wants to switch it, but there is some stuff he needs to think about first. I also see 5s fetchInterval, not sure if that shouldn't be 1.

Maybe you included the Nodary signed API logs?

Yeah, I was looking at Signed API logs on Grafana, but I mistyped "minute" instead of "seconds". The idea still stands though.

What are the expected batch sizes with a 50-100 ms period?

Say we have 300 feeds, fetchInterval 1 second, none of the calls is batched (300 api calls) and our staggering distributes them equally. Normally we would make a call with 1 signed data request every ~3.3ms (call this frequency). With the change the batch size would be delay_period_ms / frequency_ms and we would send a request at most every 1000 / delay_period_ms. So for 50ms period, the batch size is ~15 and we would make at most 20 calls.

@Siegrift
Copy link
Collaborator Author

Siegrift commented Feb 1, 2024

This would also help with reducing the Signed API costs reported here.

@bbenligiray
Copy link
Member

I think 300 feeds with 1 second fetchInterval is a bit extreme (in that most API keys won't support that). If we think about 150 feeds with 5 second fetchInterval (which is much more realistic), that's a request every 33ms. I think this scheme starts becoming worth it starting at 200ms batching periods.

I assume this is a future thing to implement, right?

@Siegrift
Copy link
Collaborator Author

Siegrift commented Feb 2, 2024

I think this scheme starts becoming worth it starting at 200ms batching periods.

I am fine with 200ms as well (or any period we agree on). I quite like the idea of having the upper bound on the Signed API calls.

I assume this is a future thing to implement, right?

Yeah, but as I mention in

If we do this, we might not want to remove the "Request received" message in #208, but instead log it with how many "signed datas" it received in the batch.

Because both Airnode feed INFO logs for outgoing Signed API requests and Signed API logs for incoming requests would be decreased.

@bbenligiray
Copy link
Member

Logging every 200ms is still a lot though. Decreasing the logs is not meaningful if it doesn't decrease it to an acceptable level.

@Siegrift
Copy link
Collaborator Author

Siegrift commented Feb 2, 2024

Logging every 200ms is still a lot though.

Is it though? The message {"level":"info","message":"Received request \"POST /:airnodeAddress\".","ms":"+7ms","timestamp":"2024-02-02T20:49:28.481Z"} has 123 bytes. Doing 123 (bytes) * 10 (assuming 10 request per second = 100ms period) * 60 * 60 * 24 * 30 = 3188160000B = 3.18816 GB. If we have 10 APIs, thats ~31.9 GB monthly which is 15 $.

@bbenligiray
Copy link
Member

I'll actually propose logging more on the next call :^)

@Siegrift Siegrift self-assigned this Mar 11, 2024
@Siegrift
Copy link
Collaborator Author

Siegrift commented Mar 11, 2024

I'd suggest the following implementation:

  1. There is going to be a separate loop in Airnode feed for pushing the data, that runs every 100ms.
  2. When scheduling new data push here, we would instead save what to update in signedApiUpdates array in state.
  3. The loop gets all elements from signedApiUpdates and clears the array. It then posts the signed data to the Signed API.

Let's sync on this on the call.

@bbenligiray
Copy link
Member

bbenligiray commented Mar 13, 2024

I would keep this in the backlog even if there is no intention to implement it

@Siegrift Siegrift removed their assignment Mar 14, 2024
@Siegrift
Copy link
Collaborator Author

As part of investigating Signed API errors from Airnode feeds,
I looked at the Grafana logs for. For context, at the time of writing, in the last 12 hours, the following number of errors were produced:

This is not that high considering the number of running instances and the time range, but this feature could help reducing this number.

@Siegrift Siegrift added the on hold We do not plan to address this at the moment label Jul 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
on hold We do not plan to address this at the moment
Projects
None yet
Development

No branches or pull requests

2 participants