Time and event column types #65

Closed
adam-haber opened this issue Oct 31, 2017 · 3 comments

@adam-haber

Following the PEM example with my own data, I got unreasonable results using survivalstan.utils.plot_observed_survival.

After casting my time column to float (was int before) and the event column to boolean (was 0/1 before), everything worked.
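For anyone hitting the same thing, the cast described above can be sketched in pandas roughly as follows. The column names `time` and `event` and the helper `coerce_survival_dtypes` are assumptions made for illustration; they are not part of survivalstan.

```python
import pandas as pd


def coerce_survival_dtypes(df, time_col="time", event_col="event"):
    """Return a copy of df with a float time column and a boolean event column.

    Hypothetical helper for illustration; not part of survivalstan.
    """
    out = df.copy()
    # Refuse to silently convert event codes other than 0/1 (e.g. 2 or NaN).
    bad = ~out[event_col].isin([0, 1])
    if bad.any():
        raise ValueError("event column must contain only 0/1 values")
    out[time_col] = out[time_col].astype(float)
    out[event_col] = out[event_col].astype(bool)
    return out


# Toy data: integer times and 0/1 event indicators (0 = censored).
d = pd.DataFrame({"time": [5, 12, 7, 3], "event": [1, 0, 1, 1]})
d_fixed = coerce_survival_dtypes(d)
print(d_fixed.dtypes)
```

Working on a copy keeps the original frame untouched, so the two versions can be compared side by side.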

Is this intentional?

@jburos
Member

jburos commented Oct 31, 2017

This is not intentional, but it is also not a scenario I am explicitly testing for. I will add a test for it. Thanks for the heads-up.
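As a rough idea of what such a test might look like, here is a sketch that checks a survival summary is identical for int/0-1 input and float/boolean input. The function `observed_survival` is a hypothetical stand-in (a simple Kaplan-Meier estimate written for this example); it is not the code behind `plot_observed_survival`, and the column names are assumed.

```python
import pandas as pd


def observed_survival(df, time_col="time", event_col="event"):
    """Kaplan-Meier estimate; a hypothetical stand-in for illustration only."""
    # Cast up front so int/float times and 0/1/boolean events behave alike.
    t = df[time_col].astype(float)
    e = df[event_col].astype(bool)
    surv = 1.0
    rows = []
    for time in sorted(t.unique()):
        at_risk = (t >= time).sum()          # subjects still under observation
        deaths = (e & (t == time)).sum()     # events occurring at this time
        surv *= 1.0 - deaths / at_risk
        rows.append((time, surv))
    return pd.DataFrame(rows, columns=["time", "survival"])


# Same data expressed with the two dtype combinations from the report.
base = pd.DataFrame({"time": [5, 12, 7, 3, 7], "event": [1, 0, 1, 1, 0]})
as_float_bool = base.astype({"time": float, "event": bool})

result_int = observed_survival(base)
result_cast = observed_survival(as_float_bool)

# The regression check: column dtypes must not change the result.
pd.testing.assert_frame_equal(result_int, result_cast)
```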

One question - did you transform your data using prep_data_long_surv or are your data already in start-stop / long / denormalized format? This will help me narrow down the possible locations of the problem.

@jburos jburos added the bug label Oct 31, 2017
@jburos jburos self-assigned this Oct 31, 2017
@adam-haber
Author

The data wasn't in a long format; I just had an "event" column (some of it censored) and a "time" column.

@jburos
Member

jburos commented Oct 31, 2017

Great, so that helps a lot. This is still an issue I'll want to catch & fix, but to start with you should first transform your data to "long" format in order to fit the PEM model.

This would be a two-step process, like so:

import survivalstan

# Step 1: expand the one-row-per-subject data frame `d` into long format.
dlong = survivalstan.prep_data_long_surv(df=d, event_col='event', time_col='t')
# Step 2: fit the PEM model on the long-format data.
fit = survivalstan.fit_stan_survival_model(
    model_code = survivalstan.models.pem_survival_model,
    df = dlong,
    sample_col = 'index',
    timepoint_end_col = 'end_time',
    event_col = 'end_failure',
    formula = '~ age_centered + sex'
)

You may very well still run into the int/float and boolean/int problems you noted above, but I wanted to mention the long-format step here since it came up.
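For intuition about what "long" format means here, the following is a minimal pandas sketch of the idea: each subject gets one row per observed timepoint up to their own time, and the event is flagged only on their last row. This is an illustration under assumptions, not survivalstan's actual implementation; the function name `to_long_format` is hypothetical, and the output column names (`index`, `end_time`, `end_failure`) are simply borrowed from the call above.

```python
import pandas as pd


def to_long_format(df, time_col="t", event_col="event"):
    """Expand one-row-per-subject data to one row per subject per timepoint.

    Illustrative sketch only; not survivalstan's implementation.
    """
    # With a default RangeIndex this adds an 'index' column identifying subjects.
    subjects = df.reset_index()
    timepoints = pd.DataFrame({"end_time": sorted(df[time_col].unique())})
    # Pair every subject with every observed timepoint (cross join via a dummy key)...
    long = (
        subjects.assign(_key=1)
        .merge(timepoints.assign(_key=1), on="_key")
        .drop(columns="_key")
    )
    # ...then keep only the timepoints at which the subject was still observed.
    long = long[long["end_time"] <= long[time_col]].copy()
    # The event is attributed to the subject's final timepoint only.
    long["end_failure"] = (long["end_time"] == long[time_col]) & long[event_col].astype(bool)
    return long.reset_index(drop=True)


# Toy data: subject 1 is censored at t=5; subjects 0 and 2 have events.
d = pd.DataFrame({"t": [2, 5, 3], "event": [1, 0, 1]})
dlong = to_long_format(d)
print(dlong)
```

In this toy example the three input rows become six long-format rows, and a censored subject never has `end_failure` set.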

Linking to the related issue / recommendation #64, since that would make the two-step process somewhat obsolete.

@jburos jburos added this to the v0.1.3 release milestone Oct 31, 2017
@jburos jburos closed this as completed Sep 11, 2018