FIX correct node splitting order & remove class weight #56
The error still persists at the tree level. I'll test it further. Not ready to merge yet.
Edit: I tested a previous build (right after merging the tree-level partial_fit) and the problem already existed back then.
@adam2392 When you have time, can you try running the code below several times? The error appears randomly on my machine, and I haven't found any clue as to why yet. Thanks.
# Requires this fork's build: stock scikit-learn's RandomForestClassifier has no partial_fit.
from sklearn.ensemble import RandomForestClassifier
from sklearn import datasets
from numpy.random import permutation
# Load iris and shuffle the rows (unseeded, so the order differs on every run).
iris = datasets.load_iris()
iris_X = iris.data
iris_y = iris.target
p = permutation(iris_X.shape[0])
iris_X = iris_X[p]
iris_y = iris_y[p]
# The first call declares the classes; the second call updates the fitted forest.
clf_1 = RandomForestClassifier()
clf_1.partial_fit(iris_X, iris_y, classes=[0,1,2])
clf_1.partial_fit(iris_X, iris_y)
Didn't find anything suspicious about |
Hmmm... I'm not sure, but this error now looks like some of the errors I was getting very randomly in scikit-tree, where I just skipped the specific test. It started coming up in scikit-tree after the
Some thoughts:
I can't reproduce the error on my machine, though. However, it seems to be coming up in the scikit-learn fork only after this PR introduced certain changes. So are these changes somehow forcing the tree into some edge case? If you're able to reproduce the error, perhaps you can try using a C++ debugger, or valgrind?
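Since the failure is intermittent, one way to pin it down before reaching for a native debugger is to sweep seeds in fresh interpreters and keep the ones that crash. The sketch below is only an illustration: `CHILD_TEMPLATE` and `sweep_seeds` are hypothetical names, and the child body is a placeholder for the reproduction snippet posted earlier in this thread. It relies on CPython's `-X faulthandler` option, which dumps the Python traceback to stderr when the process receives a fatal signal.

```python
import subprocess
import sys

# Hypothetical child script. Seeding the shuffle means a failing order can be replayed.
# The body is a placeholder: substitute the real reproduction code for the last comment.
CHILD_TEMPLATE = """
import random
random.seed({seed})
order = random.sample(range(150), 150)  # stand-in for the shuffled iris rows
# ... build the estimator and call partial_fit on the rows in `order` ...
"""


def sweep_seeds(template, seeds, timeout=120):
    """Run the template once per seed in a fresh interpreter; return the failing runs."""
    failures = []
    for seed in seeds:
        proc = subprocess.run(
            # -X faulthandler makes the child print its traceback on a fatal signal.
            [sys.executable, "-X", "faulthandler", "-c", template.format(seed=seed)],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        if proc.returncode != 0:
            # Keep the seed so the same run can be repeated under a debugger later.
            failures.append((seed, proc.returncode, proc.stderr))
    return failures
```

Calling `sweep_seeds(CHILD_TEMPLATE, range(20))` returns one `(seed, returncode, stderr)` tuple per failing run. A seed that fails repeatedly would be a natural candidate to rerun under valgrind or a C++ debugger.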
Hmmm, damn, that would be a finicky bug. Is there any unit test we can add that checks this at the tree level and that fails on main but succeeds on this PR branch? I'm worried we'll miss this in the future and the regression will occur again.
There appear to be other CI errors that I didn't resolve. Working on them now.
As for unit tests, the problem is that no test can handle a segmentation fault, unless there is a way I don't know of.
As in, a unit test that results in a segmentation fault reliably on main, but not on this branch. Is it true that the root cause is that the capacity changes upon
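On the question of testing for a segmentation fault: a crash in the test process itself cannot be caught, but a test can run the risky code in a child interpreter and assert on how the child exited. The following is a minimal sketch of that idea, not the project's actual test. `REPRO`, `run_isolated`, `killed_by_signal` and the test name are all hypothetical, and `REPRO` is a placeholder for the real reproduction script. On POSIX, `subprocess` reports a child killed by signal N as return code `-N`.

```python
import subprocess
import sys

# Hypothetical placeholder: replace with the script that triggers the crash on main.
REPRO = "pass"


def run_isolated(source):
    """Execute `source` in a fresh interpreter so a crash cannot take down the test run."""
    return subprocess.run(
        [sys.executable, "-c", source], capture_output=True, text=True
    )


def killed_by_signal(proc):
    """POSIX only: a negative return code means the child died from a signal."""
    return proc.returncode < 0


def test_partial_fit_does_not_crash():
    proc = run_isolated(REPRO)
    # Fails with the child's stderr if the interpreter was killed (e.g. by SIGSEGV).
    assert not killed_by_signal(proc), proc.stderr
    assert proc.returncode == 0, proc.stderr
```

The trade-off is that each such test pays the cost of starting an interpreter. With an intermittent crash, a single passing run is also weak evidence, so the child script may need to repeat the fit several times.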
Oh so this branch is fine now? That's good to hear after struggling since yesterday. I fear that the capacity change itself ( |
Never mind, I was on my phone. I still see the same CI errors. Are these the ones you were referring to that were fixed by the resizing of the tree?
I'll keep looking into the errors then. The resizing fix definitely resolved my local errors (code above).
Signed-off-by: Adam Li <[email protected]>
I cannot replicate this error locally, but I'm inclined to merge this for now. If you're able to investigate on your end, then that would be great. I believe the error points to some type of fault in the partial_fit implementation since the traceback consistently shows it errors out in the |
Reference Issues/PRs
neurodata/treeple#107