Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optionally support LZO #51

Closed
wants to merge 5 commits into from
Closed

Optionally support LZO #51

wants to merge 5 commits into from

Conversation

milesgranger
Copy link
Owner

@milesgranger milesgranger commented Mar 31, 2021

I've recently got a decent pure rust implementation of minilzo done here: https://github.com/milesgranger/minilzo3

Toying with the idea of closing #41 while also keeping the MIT license by making it an optional feature.

From the rust build this is pretty straight forward, can build using the feature lzo and poof, you're in GPL land, but for PyPI wheels, I think we'd need to have a cramjam-lzo, cramjam-features-which-use-licenses-that-suck or something to that effect so a user can explicitly opt into a cramjam distribution which includes LZO.

Not sure how to accomplish this. Maybe add a CI matrix for features like this to build and upload into a different package namespace on PyPI on a per optional feature basis?

@martindurant
Copy link

I suspect that the users of fastparquet - aside from individual hobbyists - anyone would be happier to not even have to consider the inclusion or not of GPL.

You can, of course, make multiple package names and wheels for them all, and you can have conda-forge make multiple outputs from one feedstock. I don't see how you can depend on an OR of packages, with conda or pip, so that fastparquet can't say "I need cramjam, but cramjam-lzo will do". That is, unless cramjam-lzo depends on cramjam, which would mean duplicated binary code.

@milesgranger
Copy link
Owner Author

This was a concern and something I view as a nonstarter, a package per and duplicate code. I am not sure how to proceed really. Happy that LZO is just as fast as python-lzo, but unhappy dealing with restrictive licenses.

One alternative is having two versions of cramjam, one as it is now and another "full", so developers can do as they potentially are now, try and import the lzo submodule and deal the same way as trying to import python-lzo for that feature. Then dependent on the end user if they have the full cramjam installed (and thus, they deal with implications of GPL/etc on their own).

Or maybe it's just not worth it. 👀

@martindurant
Copy link

You can't do that with static libraries, though, right? Either it's there or not, so two different versions of the package or two packages. And then it's not obvious how to offer the options to the user.

@milesgranger
Copy link
Owner Author

We can control what ends up in the wheel using feature flags, #[cfg(feature = "lzo")] as it's in the PR now, but this could rather be #[cfg(feature = "full")] to conditionally include lzo or other libs which have none MIT or otherwise restrictive licenses and distribute cramjam and cramjam-full wheels. For lib devs like yourself, I suppose raising an informative error could direct a user to install the latter if they need LZO when attempting from cramjam import lzo fails?

My thinking with the initial post was installing a subpackage directly so no duplicate code would exist such as cramjam.lzo delta code, but I really don't think that's possible. 🤔

@martindurant
Copy link

For lib devs like yourself, I suppose raising an informative error could direct a user to install the latter if they need LZO when attempting from cramjam import lzo fails?

but fastparquet would need to depend on the minimal version, so you potentially end up with both :|. Having fastparquet depend on neither is not a good solution, the basic install needs to be functional.

@milesgranger
Copy link
Owner Author

Good point.. so the only viable option is if it can be distributed as a submodule only pip install cramjam.lzo and only include the delta code. Is that a fair assessment?

@martindurant
Copy link

It would be a fine thing to do, but this should be carefully documented, that whoever wants to depend on cramjam needs to make that decision when defining their requirements. In practice, I imagine that means the non-lzo will get used, exclusively.

@milesgranger
Copy link
Owner Author

In practice, I imagine that means the non-lzo will get used, exclusively.

Well that's an unfortunate idea.. you don't think some people would want to avoid using python-lzo, because it too depends on system packages when they could just use cramjam.lzo for the same effect? I hope so, as that's sorta the whole point of this project. 😅

@martindurant
Copy link

Only if they know that they specifically need LZO!

I have a feeling it may be possible to build as a single package but with different version-build strings, like cramjam-v1.0.0 Vs cramjam-v1.0.0-lzo, since you can have OR versions (not package names), at least in conda.

@milesgranger
Copy link
Owner Author

Closing in favor of #52

@milesgranger milesgranger deleted the maybe-add-lzo branch April 2, 2021 16:47
@milesgranger milesgranger mentioned this pull request Apr 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants