
Created CLI and Implemented Runtime profiling #5

Open · wants to merge 20 commits into master

Conversation


rly0nheart commented Nov 28, 2023

Hi @trislee, I created the CLI and implemented run-time profiling (which is optionally enabled)

Here are some key changes I made:

  • Implemented CLI
  • Implemented Profiling (optionally enabled with --runtime-prof)
  • Refactored the make_request function, as per the TODO note left in the code (I tried to make as few changes as possible; see the sketch after this list)
  • Added Dockerfile
  • Updated README
  • Added LICENSE (I noticed Bellingcat uses the MIT License mostly, so that's what I added)
  • Added a pyproject.toml file
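
A minimal sketch of the shape the refactor takes, assuming the session-passing design discussed later in this thread (the signature is illustrative, not the exact one in the PR):

```python
import aiohttp

# Hypothetical sketch: the caller creates one ClientSession and passes it in,
# so TCP connections are reused across api calls instead of a fresh session
# being opened for every request.
async def make_request(url: str, session: aiohttp.ClientSession) -> dict:
    async with session.get(url) as response:
        response.raise_for_status()
        return await response.json()
```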

Screenshots

[Four screenshots of the CLI in action, captured 2023-11-28]

CLI Examples

Channel Operations

Get channel profile

polyphemus channel profile --name channel_name

Get channel videos

polyphemus channel videos --id channel_id

Get channel subscribers

polyphemus channel subscribers --id channel_id

Video Operations

Get video views

polyphemus video views --id claim_id

Get video comments

polyphemus video comments --id claim_id

Get video reactions

polyphemus video reactions --id claim_id

Get video streaming url

polyphemus video streaming_url --canonical-url canonical_url

Get recommended videos

polyphemus video recommended_videos --title video_title --id claim_id

Miscellaneous Operations

Append comments reactions

polyphemus misc append_comments_reactions --comments {list},{of},{comma},{separated},{comment},{objects}

Convert normalized names to video info

polyphemus misc normalized_name2video_info --normalized-names list-of-dash-separated-normalized-names

trislee (Contributor) commented Nov 30, 2023

The CLI looks good. A few things though, mostly related to testing (which you can run from the package root directory using the pytest command):

  • in the tests/api.py module, removing the from polyphemus import api line makes all the tests in that module fail
  • in the tests/base.py module, the test_process_raw_video_info test seems to never terminate. Not sure exactly what the problem is, but it seems to be related to async (see the sketch after this list).
  • regarding the run-time profiling (as mentioned in the TODO section of the README), I meant that I wanted to see if there were ways of improving the performance of some of the more time-consuming functions in the package, and see if making async requests would improve that performance. I didn't mean to add run-time profiling to the CLI, which I think would be confusing for the end-users of the program. Sorry for the confusion.
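
For reference, and not necessarily the cause of the hang: the standard way to run an async test is through a plugin such as pytest-asyncio (assumed below; the repo's actual test setup may differ), with a stub standing in for the real call:

```python
import asyncio
import pytest

# Sketch of the pytest-asyncio pattern: the marker runs the coroutine on a
# plugin-managed event loop, so the test body is genuinely awaited.
@pytest.mark.asyncio
async def test_process_raw_video_info():
    # Stub in place of the real processing call.
    result = await asyncio.sleep(0, result={"claim_id": "stub"})
    assert result["claim_id"] == "stub"
```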

Before I accept this PR, I'd like to see

  • all tests pass
  • comparison of run-time performance between sync and async versions of, for example, the RecommendationEngine.generate method, to see if the performance gains are worth the additional complexity from async (a benchmarking sketch follows this list)
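
A self-contained sketch of how that comparison could be run (the sleeps stand in for real network requests; neither function is the actual RecommendationEngine.generate):

```python
import asyncio
import time

def generate_sync(n: int) -> None:
    # Stand-in for the sequential version: each request blocks the next.
    for _ in range(n):
        time.sleep(0.05)  # pretend network call

async def generate_async(n: int) -> None:
    async def one_call() -> None:
        await asyncio.sleep(0.05)  # pretend network call
    # Stand-in for the async version: all requests are in flight at once.
    await asyncio.gather(*(one_call() for _ in range(n)))

if __name__ == "__main__":
    start = time.perf_counter()
    generate_sync(20)
    mid = time.perf_counter()
    asyncio.run(generate_async(20))
    end = time.perf_counter()
    print(f"sync:  {mid - start:.2f} s")   # ~1.00 s
    print(f"async: {end - mid:.2f} s")     # ~0.05 s
```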

rly0nheart (Author) commented

Got it! Sending async requests definitely improves performance, and I'm starting to think I overdid it with the profiling 😅

rly0nheart (Author) commented


So, in order for the tests to run successfully, I will have to make major changes to the test files, like having a separate test function for each api function that is to be tested. What do you think about this?

trislee (Contributor) commented Dec 8, 2023

I'm open to making major changes to the test files as long as the functionality of the tests doesn't change (i.e. all functions/methods are sufficiently tested). I chose that specific implementation to reduce redundancy.

Why would you need to separate the api test functions? Would an approach like the one mentioned in this Stack Overflow answer work here?
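
The linked answer isn't reproduced here, but the usual pytest mechanism for collapsing near-identical tests is parametrization; a self-contained sketch (normalize is an illustrative stand-in, not a real polyphemus helper):

```python
import pytest

def normalize(name: str) -> str:
    # Stand-in for a helper under test.
    return name.strip().lower().replace(" ", "-")

# One parametrized test body replaces several copy-pasted test functions
# while still reporting each case as its own test.
@pytest.mark.parametrize(
    "raw, expected",
    [
        ("My Channel", "my-channel"),
        ("  Video Title ", "video-title"),
    ],
)
def test_normalize(raw: str, expected: str) -> None:
    assert normalize(raw) == expected
```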

rly0nheart (Author) commented Dec 8, 2023


The need for separate test functions comes from the fact that all api functions now take an aiohttp session, because I was trying to avoid creating a new session for each api function call. So, in the tests, a similar approach (like the one in the Stack Overflow answer) would be more resource-friendly: a fixture function creates a session, and all tests then use it. If any issues arise from that, I might fall back to a new session per test (for the sake of testing). In fact, I've already hit an issue where a session-creating fixture wasn't working because it yielded an AsyncGenerator rather than a ClientSession, so for now I create a new session for each test.
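
For what it's worth, the AsyncGenerator symptom is typically what a plain @pytest.fixture produces when it decorates an async generator; under pytest-asyncio (assumed here; names are illustrative) the fixture would look like this:

```python
import aiohttp
import pytest
import pytest_asyncio

# pytest-asyncio's fixture decorator drives the async generator on the event
# loop, so tests receive the ClientSession itself rather than the generator.
@pytest_asyncio.fixture
async def session():
    async with aiohttp.ClientSession() as s:
        yield s  # handed to each test that requests this fixture

@pytest.mark.asyncio
async def test_session_is_usable(session):
    assert isinstance(session, aiohttp.ClientSession)
    assert not session.closed
```

As written the fixture is function-scoped, so each test still gets a fresh session; widening the scope is possible but interacts with pytest-asyncio's event-loop scoping.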
