Commit

Update content/91.supp-info.md
Co-authored-by: Casey Greene <[email protected]>
jjc2718 and cgreene authored Sep 9, 2024
1 parent 068de88 commit f2ee03b
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions content/91.supp-info.md
@@ -45,6 +45,7 @@ For both the "best" and "smallest good" model selection approaches, this effect

Based on these results, given the observation that the mean difference in model performance is fairly small in both "frequent CNV" and "rare CNV" cases, and for both model selection approaches, we conclude that combining point mutation and CNV data and including the target gene in the feature set are reasonable general rules for our pan-cancer and pan-gene study.
In general, our focus is less on individual prediction performance and more on model complexity, which is another degree removed from the individual prediction performance.
In addition, including the target gene would seem most likely to increase the benefit of smaller models, as the single gene could be considered particularly information-rich.
However, the exceptions that we pointed out above emphasize the importance of considering the biological context in applications to specific driver genes or prediction problems.

![Bar plot showing difference in performance (AUPR) between models including and excluding the target gene, for genes where CNV changes are (top) and are not (bottom) frequently included in the label set, colored by model selection approach. Positive values represent better performance for the “control” model, and negative values better performance for the “drop target” model.](images/supp_figure_1.png){#fig:supp_note tag="S1" width="100%"}
