Not able to replicate the study #1

AnoopRKulkarni · 2022-03-19T09:32:48Z

Hello,

I installed the MIMIC-IV 1.0 dataset on my local machine and followed the instructions in the mimic-code repository to create all the tables and load the data.

After that, I created the "aline" schema using the scripts in this repository. For the most part, the analysis follows the notebook in the repository closely, but at the end I get a p-value of 0.019, so I am nowhere close to replicating the study.

Is anyone aware of what the issue might be, and what I need to do to fix it? Am I missing something?

Thanks and regards
~anoop

More details

Cohort size:
23390 - exclusion_readmission
17564 - exclusion_shortstay
24127 - exclusion_vasopressors
26733 - exclusion_septic
13137 - exclusion_aline_before_admission
56645 - exclusion_not_ventilated_first24hr
20545 - exclusion_service_surgical
Will remove 74367 of 76540 patients.

Replicating the flow of the flowchart from Chest paper.
76540 - removing 35770 (46.73%) patients - short stay // readmission.
40770 - removing 27589 (67.67%) patients - not ventilated in first 24 hours.
13181
- removing 5140 (39.00%) patients - additional 5140 39.00% - exclusion_septic
- removing 7986 (60.59%) patients - additional 4727 35.86% - exclusion_vasopressors
- removing 4973 (37.73%) patients - additional 863 6.55% - exclusion_aline_before_admission
- removing 5421 (41.13%) patients - additional 278 2.11% - exclusion_service_surgical
2173 - final cohort.
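
The flowchart counts above can be cross-checked with a few lines of arithmetic. This is only a sketch that re-adds the numbers printed in this post; the variable names are made up for illustration, and it says nothing about whether the exclusion flags themselves are computed correctly.

```python
# Counts copied from the output above; all variable names are illustrative.
total = 76540
short_stay_or_readmission = 35770
not_ventilated_first24hr = 27589

# "additional" patients removed by each later exclusion, in flowchart order
additional = {
    "exclusion_septic": 5140,
    "exclusion_vasopressors": 4727,
    "exclusion_aline_before_admission": 863,
    "exclusion_service_surgical": 278,
}

after_step1 = total - short_stay_or_readmission
after_step2 = after_step1 - not_ventilated_first24hr
final_cohort = after_step2 - sum(additional.values())
removed_total = total - final_cohort

print(after_step1, after_step2, final_cohort)  # 40770 13181 2173
print(removed_total)                           # 74367

# Percentages are relative to the 13181 ventilated patients
for name, n in additional.items():
    print(f"{name}: {100 * n / after_step2:.2f}%")
```

The steps are internally consistent (the "additional" removals sum to 11008, leaving 2173), so any discrepancy with the paper would have to come from the flags or the data version rather than from the bookkeeping.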

Accuracy of 66.91 from PyMatch with unbalanced matching.

and then this!

Result of propensity score followed by matching:
p = 0.019.
Odds ratio: 0.72 [0.54 - 0.94].
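
As a sanity check, the p-value and the odds-ratio interval can be compared with each other. The sketch below assumes the interval is a 95% Wald interval on the log-odds scale, which is an assumption on my part and may not be how the notebook computes it; it uses only the rounded numbers printed above, so the result is approximate.

```python
import math

# Values copied from the output above (rounded to two decimals there).
odds_ratio = 0.72
ci_low, ci_high = 0.54, 0.94

# ASSUMPTION: 95% Wald interval on the log scale, i.e.
# log(OR) +/- 1.96 * SE, so SE can be backed out from the interval width.
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
z = math.log(odds_ratio) / se_log_or

# Two-sided p-value from the standard normal distribution
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"SE(log OR) = {se_log_or:.3f}, z = {z:.2f}, p = {p_two_sided:.3f}")
```

Under that assumption the implied p-value lands close to the reported 0.019, which suggests the interval and the p-value come from the same test rather than from two different analyses.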

AnoopRKulkarni · 2022-06-17T17:35:29Z

After a few excursions into the mimic-code directory, I decided to use aline_propensity_score.Rmd (from the mimic-iii directory) to compute the p-value with the original R Matching package.

What I did is the following:

1. Created the aline schema and its tables from the MIMIC-IV dataset using the SQL scripts in this repo.
2. Saved the dataframe as aline_data.csv.
3. Used glm and the R Matching package to compute the p-value with McNemar's chi-squared test with continuity correction; it came out to be 0.501!
4. BUT, if I use the mantelhaen.test() function from the R "vcd" package (the CMH test, really) on the R-matched data, I still get a p-value of 0.019.

So it seems there is an issue in the aline database. I would appreciate any suggestions from those who created the dataset, or know it well enough, on where the issue could be.
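
For reference, here is a small sketch of how the two tests relate on 1:1 matched data. The pair counts are hypothetical (they are NOT taken from the aline data) and the function names are my own. When every matched pair is its own stratum, the continuity-corrected CMH statistic reduces to McNemar's statistic with continuity correction, so a gap like 0.501 versus 0.019 would suggest that the two tests were run on different pairings or stratifications.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def mcnemar_cc(b, c):
    """McNemar's test with continuity correction on discordant pair counts b, c."""
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    return stat, chi2_sf_1df(stat)

def cmh_cc(tables):
    """Cochran-Mantel-Haenszel test with continuity correction.

    tables: one (a, b, c, d) tuple per stratum, rows = treated/control,
    columns = died/survived.
    """
    diff, var = 0.0, 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        diff += a - (a + b) * (a + c) / n
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    stat = (abs(diff) - 0.5) ** 2 / var
    return stat, chi2_sf_1df(stat)

# HYPOTHETICAL matched-pair counts, for illustration only
treated_died_only, control_died_only = 40, 60
both_died, both_survived = 20, 380

# One 2x2 table per matched pair (each pair is its own stratum)
pairs = (
    [(1, 0, 0, 1)] * treated_died_only
    + [(0, 1, 1, 0)] * control_died_only
    + [(1, 0, 1, 0)] * both_died
    + [(0, 1, 0, 1)] * both_survived
)

m_stat, m_p = mcnemar_cc(treated_died_only, control_died_only)
c_stat, c_p = cmh_cc(pairs)
print(f"McNemar: stat={m_stat:.3f}, p={m_p:.4f}")
print(f"CMH    : stat={c_stat:.3f}, p={c_p:.4f}")
```

Concordant pairs contribute nothing to either statistic, which is why only the two discordant counts matter here.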

thanks in advance

~anoop
