Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement equivalent numpy median and quantile / percentile #53

Open
ricardoV94 opened this issue Nov 29, 2022 · 7 comments · May be fixed by #942
Open

Implement equivalent numpy median and quantile / percentile #53

ricardoV94 opened this issue Nov 29, 2022 · 7 comments · May be fixed by #942

Comments

@ricardoV94
Copy link
Member

Please describe the purpose of filing this issue

Equivalent symbolic methods to those are missing.

The Numpy quantile and percentile methods have too many options for the interpolation argument, and these are planned to be deprecated for a while now (see numpy/numpy#10736). It should suffice to implement the default "linear" interpolation.

I am confident that these should not require any extra Ops.

@Dhruvanshu-Joshi
Copy link
Member

Here do we want to use the np.median itself or do we need to right the logic for median in pytensor?

@ricardoV94
Copy link
Member Author

Here do we want to use the np.median itself or do we need to right the logic for median in pytensor?

Good question, depends on how complicated the median code would look like if written in pure PyTensor. If it's messy we can just use np.median under the hood

@Dhruvanshu-Joshi
Copy link
Member

I think using numpy under the hood is better. They even handle nans and have also allow inplace median by overwriting the input whenever required.
So a new op makes sense for this or just a function which calls np.median under the hood will be enough?

@ricardoV94
Copy link
Member Author

For median we can simply (but a bit naively) sort out and pick the middle index (or the average of the two middle ones if it's even length). There are better approaches, but this may be a good enough solution for now: https://rcoh.me/posts/linear-time-median-finding/

@ricardoV94
Copy link
Member Author

or just a function which calls np.median under the hood will be enough?

If using numpy code, it must always be inside an Op. Normal numpy code won't work with PyTensor symbolic variables (which are just placeholders and don't contain values yet)

@Dhruvanshu-Joshi
Copy link
Member

Dhruvanshu-Joshi commented Jun 7, 2024

or just a function which calls np.median under the hood will be enough?

If using numpy code, it must always be inside an Op. Normal numpy code won't work with PyTensor symbolic variables (which are just placeholders and don't contain values yet)

Ohk. So I'll try to write the median in pytensor itself since we want to avoid creating new ops everytime right?

@ricardoV94
Copy link
Member Author

Ohk. So I'll try to write the median in pytensor itself since we want to avoid creating new ops everytime right?

Not anytime, but as much as possible!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants