Bluesky: look into optimizing API calls #1597

Closed
snarfed opened this issue Nov 2, 2023 · 8 comments · Fixed by snarfed/granary#627

Comments

@snarfed
Owner

snarfed commented Nov 2, 2023

Wow! Bluesky only has 26 users, ~1% of all active users that we poll, but it's already the solid majority of all outbound HTTP requests! We should eventually look into optimizing those. First step would be to get the like/repost cache working, based on counts, and only fetch those when the count increases...but I'm not sure the Bluesky API gives us those counts? cc @JoelOtter
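To make the count-based idea concrete, here is a minimal sketch of gating a like/repost fetch on whether the count went up since the last poll. Everything in it is an assumption for illustration: `fetch_if_count_increased`, the `"<kind> <post_id>"` key format, and `fake_fetch_likes` are hypothetical names, not granary's or Bridgy's actual API.

```python
from typing import Callable, Dict, List


def fetch_if_count_increased(
    cache: Dict[str, int],
    post_id: str,
    kind: str,
    current_count: int,
    fetch: Callable[[str], List[dict]],
) -> List[dict]:
    """Call fetch(post_id) only when current_count exceeds the cached count."""
    key = f"{kind} {post_id}"  # hypothetical cache key format
    if current_count <= cache.get(key, 0):
        return []  # nothing new since the last poll, so skip the API call
    items = fetch(post_id)
    cache[key] = current_count  # remember the count we just fetched for
    return items


# Stand-in for a real API call; records each call so we can count them.
calls = []


def fake_fetch_likes(post_id: str) -> List[dict]:
    calls.append(post_id)
    return [{"liker": "alice"}, {"liker": "bob"}]


cache: Dict[str, int] = {}
first = fetch_if_count_increased(cache, "post1", "likes", 2, fake_fetch_likes)
second = fetch_if_count_increased(cache, "post1", "likes", 2, fake_fetch_likes)
third = fetch_if_count_increased(cache, "post1", "likes", 3, fake_fetch_likes)
```

In this sketch the second poll makes no call because the like count is unchanged. Only the first and third polls reach the fetch function.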

image
@JoelOtter
Contributor

No it does! I just was unaware of how the caching works to begin with. Happy to have a look :)

@JoelOtter
Contributor

Actually it looks like Bluesky is already using the cache for replies, reposts and likes?

@snarfed
Owner Author

snarfed commented Nov 4, 2023

You're right, granary reads and writes it, and it gets passed through from the Bluesky.last_activities_cache_json datastore property here:

bridgy/tasks.py

Lines 132 to 144 in 8c809d2

cache = util.CacheDict()
if source.last_activities_cache_json:
  cache.update(json_loads(source.last_activities_cache_json))
# search for links first so that the user's activities and responses
# override them if they overlap
links = source.search_for_links()
# this user's own activities (and user mentions)
resp = source.get_activities_response(
  fetch_replies=True, fetch_likes=True, fetch_shares=True,
  fetch_mentions=True, count=30, etag=source.last_activities_etag,
  min_id=source.last_activity_id, cache=cache)

...and yet it's not working. Hmm. 😐 Maybe first check if any of the Bluesky entities in the datastore actually have anything stored in that property? https://console.cloud.google.com/datastore/entities/query?project=brid-gy
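For reference, the round trip described above can be sketched in isolation: load the stored JSON into a dict, hand it to the fetch, then serialize it back for the next poll. This is only an illustration using the standard library. `load_cache`, `poll_once`, `fake_fetch`, and the `"count post1"` key are hypothetical stand-ins, not the real `util.CacheDict` or datastore code.

```python
import json
from typing import Callable, List, Optional, Tuple


def load_cache(cache_json: Optional[str]) -> dict:
    """Deserialize the stored cache, treating a missing or empty value as {}."""
    return json.loads(cache_json) if cache_json else {}


def poll_once(
    stored_json: Optional[str],
    fetch_with_cache: Callable[[dict], List[str]],
) -> Tuple[List[str], str]:
    """Run one poll and return (activities, JSON to store for the next poll)."""
    cache = load_cache(stored_json)
    activities = fetch_with_cache(cache)  # expected to update cache in place
    return activities, json.dumps(cache)


def fake_fetch(cache: dict) -> List[str]:
    # Stand-in for the real fetch: reports something new only when the
    # cached count is lower than the current count (hard-coded to 5 here).
    seen = cache.get("count post1", 0)
    cache["count post1"] = 5
    return ["new activity"] if seen < 5 else []


acts1, stored = poll_once(None, fake_fetch)    # first poll, empty cache
acts2, stored = poll_once(stored, fake_fetch)  # second poll reuses the cache
```

If the stored value were empty on every poll, each run would behave like the first one here, which is why checking what is actually stored in the property is a reasonable first diagnostic.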

@JoelOtter
Contributor

OH

@JoelOtter
Contributor

This might actually be (part of?) the cause of #1592

@snarfed
Owner Author

snarfed commented Nov 5, 2023

Huh, you're probably right! Great catch.

@snarfed
Owner Author

snarfed commented Nov 5, 2023

Looks like it worked! When there's nothing new, a Bluesky poll now only does a single API call, down from as many as 91 before. 😆

image
