
feat: Infinite Scroll for /summary page #72

Open
KartikSoneji opened this issue Aug 8, 2022 · 5 comments
Labels
enhancement New feature or request

Comments

@KartikSoneji
Member

Make the /summary page initially load only the last 3 summaries, and dynamically fetch more as the user scrolls.
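A minimal sketch of the client side, assuming a hypothetical `/api/summaries` endpoint with `before`/`count` parameters and a `{ sections, oldest }` response shape (none of which exist yet):

```ts
// Sketch only: endpoint name, query parameters, and response shape
// are assumptions for illustration, not an existing API.
const container = document.querySelector<HTMLElement>("#summaries")!;
const sentinel = document.querySelector<HTMLElement>("#load-more-sentinel")!;

let oldestLoaded: string | null = null; // e.g. date of the oldest summary shown
let loading = false;

async function loadMore(): Promise<void> {
  if (loading) return;
  loading = true;
  const url = oldestLoaded
    ? `/api/summaries?before=${encodeURIComponent(oldestLoaded)}&count=3`
    : "/api/summaries?count=3";
  const response = await fetch(url);
  const { sections, oldest } = await response.json(); // assumed response shape
  container.insertAdjacentHTML("beforeend", sections.join(""));
  oldestLoaded = oldest;
  loading = false;
}

// Fetch the next batch whenever the sentinel element scrolls into view.
new IntersectionObserver((entries) => {
  if (entries.some((e) => e.isIntersecting)) loadMore();
}).observe(sentinel);

loadMore(); // initial load of the last 3 summaries
```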

@KartikSoneji added the enhancement label on Aug 18, 2022
@sreekaransrinath
Contributor

Will make it a pain in the ass to scrape ;-;

@KartikSoneji
Member Author

a. Who is scraping the catchup page?
b. The idea is to expose an API that people can use instead of scraping.

@HarshKapadia2
Member

> a. Who is scraping the catchup page?

We've already had one project that does it (https://github.com/mihikagaonkar/OTC-Dashboard), so let us not make any assumptions and keep things open for the future.

> b. The idea is to expose an API that people can use instead of scraping.

That is an alternative, but it requires additional effort.
What is your plan for this API? How much detail will it include? Will it send over the entire file, or will it provide options to get dates, durations, and other specific parts of the content? (This API will also act as a blocker if we have to change any file formatting in the future, because we will have to handle parsing and returning every formatting variant.)
Also, more importantly, how would we let someone who wanted to scrape our pages know that we have such a feature available?

@KartikSoneji
Member Author

> > a. Who is scraping the catchup page?
>
> We've already had one project that does it (https://github.com/mihikagaonkar/OTC-Dashboard), so let us not make any assumptions and keep things open for the future.

There was no need to scrape the website; all the data was already available in the repo.

> > b. The idea is to expose an API that people can use instead of scraping.
>
> That is an alternative, but it requires additional effort. What is your plan for this API?

No, the API is a side effect of implementing infinite scroll.
The endpoint the page calls to fetch the next set of summaries is the same one that someone could use to scrape them.

> How much detail will it include? Will it send over the entire file, or will it provide options to get dates, durations, and other specific parts of the content?

Just the <section> tags that currently contain each summary in the combined summary page.
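For illustration, a hedged sketch of what that endpoint could look like on a Node/Express server; the route, query parameters, and the `listSummaryFiles()` helper are hypothetical, not the project's actual API:

```ts
import express from "express";
import { readFile } from "node:fs/promises";

const app = express();

// Stub: in the real repo this would enumerate the generated summary
// fragments, newest first. Name and shape are assumptions.
async function listSummaryFiles(): Promise<{ date: string; path: string }[]> {
  return [];
}

app.get("/api/summaries", async (req, res) => {
  const count = Number(req.query.count ?? 3);
  const before = req.query.before as string | undefined;

  const files = await listSummaryFiles();
  const page = files.filter((f) => !before || f.date < before).slice(0, count);

  // Each entry is the same <section> fragment that the combined summary
  // page already embeds, so the UI and any scraper share one source.
  const sections = await Promise.all(page.map((f) => readFile(f.path, "utf8")));
  res.json({ sections, oldest: page.at(-1)?.date ?? null });
});

app.listen(3000);
```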

> -e, --embedded
> Output an embeddable document, which excludes the header, the footer, and everything outside the body of the document. This option is useful for producing documents that can be inserted into an external template.

We shouldn't need a new parser, just the -e flag.
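Assuming the converter here is Asciidoctor (an assumption based on the quoted `--embedded` help text), producing one embeddable fragment could be as simple as this sketch; the input path is a placeholder:

```ts
import { execFileSync } from "node:child_process";

// Sketch of producing one embeddable fragment with the quoted
// -e/--embedded flag; `-o -` writes the output to stdout.
const fragment = execFileSync(
  "asciidoctor",
  ["-e", "-o", "-", "summaries/2022-08-04.adoc"], // placeholder path
  { encoding: "utf8" }
);
// `fragment` is just the converted body, ready to wrap in a <section>
// and return from the API, with no header or footer to strip.
```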

> Also, more importantly, how would we let someone who wanted to scrape our pages know that we have such a feature available?

Hmm, maybe we could add a page, but most likely someone who wants to scrape the site will analyze the network requests, or just ask us about it.

But in general, there are very few reasons to scrape the summaries from the website.
If someone wants to run static analysis, the individual files in the repo are better suited for that.
The only other reason might be to integrate with another website, but in that case an API would be easier.

@HarshKapadia2
Member

Makes sense. Thank you.

We should add a note somewhere for scrapers though, just to inform them about the API. (Maybe in the API response?)
We will also have to document the API.
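One possible way to surface that note, sketched under the same Express assumption as above; the docs path, header, and wording are placeholders:

```ts
import express from "express";

const app = express();

app.get("/api/summaries", (_req, res) => {
  // Machine-readable pointer for scrapers (RFC 8631 "service-doc" relation).
  res.setHeader("Link", '</api/docs>; rel="service-doc"');
  res.json({
    note: "Please use this API instead of scraping the HTML pages. Docs: /api/docs",
    sections: [], // summary fragments would go here
    oldest: null,
  });
});

app.listen(3000);
```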
