C assertions causing core dump without any useful message #96

Open
macks22 opened this issue May 1, 2017 · 1 comment

Comments

@macks22
Contributor

macks22 commented May 1, 2017

I'm getting the following with 0.2.9 (fresh pip install today):

python: ffm_als_mcmc.c:172: sparse_fit: Assertion `(sizeof (*w_0) == sizeof (float) ? __finitef (*w_0) : sizeof (*w_0) == sizeof (double) ? __finite (*w_0) : __finitel (*w_0)) && "w_0 not finite"' failed.
[1]    30860 abort (core dumped)  python classify.py --nsplits 1 -c fm --basic-descriptive --date  --subways ...

I'm using als.FMClassification via a wrapper class for one-vs-rest classification:

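        import numpy as np
        from fastFM import als
        from sklearn.multiclass import OneVsRestClassifier

        class FMClassifier(als.FMClassification):
            def fit(self, X, y, *args):
                # Remap label 0 to -1 on a copy so the targets are in {-1, 1}.
                y = y.copy()
                y[y == 0] = -1
                return super(FMClassifier, self).fit(X, y, *args)

            def predict_proba(self, X):
                # probs is treated as a 1-D array of positive-class probabilities.
                # Duplicate it into an (n_samples, 2) array so that column 1
                # holds the positive-class probability.
                probs = super(FMClassifier, self).predict_proba(X)
                return np.tile(probs, 2).reshape(2, probs.shape[0]).T

        return OneVsRestClassifier(
            FMClassifier(n_iter=500, random_state=42))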

The data I was using to fit the model is attached. I tried to get a smaller dataset, but I had trouble reproducing the failure once I cut things out.

X_train_sparse_fit_issue.npy.zip
y_train_sparse_fit_issue.npy.zip

@ibayer
Owner

ibayer commented May 2, 2017

This looks very much like an issue with your data. I would especially check for NaN values, columns that are always zero, and the target variable. Changing the C code to avoid a core dump in such situations is on my list. We need a small dataset to see whether we can automatically detect what causes the issue (especially since you change the targets with one-vs-rest).
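
The checks suggested above can be sketched in a few lines of NumPy and SciPy. This is only a rough illustration, not part of fastFM: the helper name `check_training_data` is hypothetical, and it assumes X is (or can be converted to) a SciPy sparse matrix and y is a numeric array.

```python
import numpy as np
import scipy.sparse as sp


def check_training_data(X, y):
    """Return a list of human-readable problems found in X and y.

    Hypothetical helper, not part of fastFM.
    """
    problems = []

    # Work on a column-oriented copy so the caller's matrix is not modified.
    X = sp.csc_matrix(X, copy=True)

    # NaN or inf among the explicitly stored entries.
    if not np.all(np.isfinite(X.data)):
        problems.append("X contains NaN or inf")

    # Drop explicitly stored zeros, then look for columns with no entries.
    X.eliminate_zeros()
    empty_cols = np.flatnonzero(X.getnnz(axis=0) == 0)
    if empty_cols.size:
        problems.append("all-zero columns: %s" % empty_cols.tolist())

    # Target checks: finite values and at least two distinct labels.
    y = np.asarray(y)
    if not np.all(np.isfinite(y)):
        problems.append("y contains NaN or inf")
    labels = np.unique(y)
    if labels.size < 2:
        problems.append("y has a single class: %s" % labels.tolist())

    return problems
```

With one-vs-rest, each class gets its own binary target, so it may be worth running the target check once per class (for example on `y == c` for every label `c`) rather than only on the original multiclass y.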
