Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use models.multibind instead of models.selexDataset in mubind-pipeline and make tests in mubind #95

Open
2 tasks
ilibarra opened this issue Dec 13, 2022 · 6 comments
Assignees

Comments

@ilibarra
Copy link
Member

  • Define tests for Multibind (do not remove previous tests, but rather make new ones with the same style (also can group into a single script.
  • update fit_model.py to use multibind for SELEX/PBM instead of If else. Probably, a parameter of Multibind self.datatype is enough for this change.
@ilibarra ilibarra changed the title Incorporate Multibind as a class in mubind pipeline instead of SelexDataset Use models.multibind instead of models.selexDataset in mubind-pipeline and make tests in mubind Dec 13, 2022
@ege-erdogan ege-erdogan self-assigned this Dec 14, 2022
@ege-erdogan
Copy link
Collaborator

ege-erdogan commented Dec 14, 2022

Regarding the second point, currently:

  • train_iterative (in mubind prediction.py) called from fit_model initiates Multibind based on the passed dataset's class, also setting the datatype parameter.
  • Multibind then initiates the correct module etc. based on the datatype param.

@ilibarra
Copy link
Member Author

Solution can be

  1. renaming SelexDataset to PBMDataset in this line of fit_model.py. This assumes that PBMDataset is working properly and it s the default solution.
    https://github.com/theislab/mubind-pipeline/blob/main/scripts/fit_model.py#L88

@johschnee may you please yes/no confirm that PBMDataset is the one to be used internally in prediction.py? Please write and/or DM us if more considerations have to be taken. Thank yo,
https://github.com/theislab/mubind/blob/main/mubind/tl/prediction.py#L349

@ilibarra
Copy link
Member Author

  • Test for PBMDataset in pytests is required.

@johschnee
Copy link
Collaborator

I unfortunately don't have access to the mubind-pipeline repository. But going by your description alone, I'd be careful about renaming. SelexDataset and PBMDataset have different attributes and these attributes are also used by the multibind model.

If you think the code in mubind-pipeline is relevant for a more datailed answer, it would be great if you could give me access.

@ilibarra
Copy link
Member Author

@johschnee access to mubind-pipeline granted. My bad!

@johschnee
Copy link
Collaborator

Thank you, this indeed changes my answer. Yes, for PBM-Data the class PBMDataset is the one to be used internally. As Ege said, the datatype parameter is initialized based on the dataset's class.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants