
Commit

fix max feature count
jjc2718 committed Nov 8, 2023
1 parent b4cb1c7 commit e6c6500
Showing 22 changed files with 6,624 additions and 3,249 deletions.
6 changes: 3 additions & 3 deletions content/02.main-text.md
@@ -84,7 +84,7 @@ The `optimal` learning rate schedule is used by default.

When we compared these four approaches, we used a constant learning rate of 0.0005, and an initial learning rate of 0.1 for the `adaptive` and `invscaling` schedules.
We also tested a fifth approach that we called "`constant_search`", in which we tested a range of constant learning rates in a grid search on a validation dataset, then evaluated the model on the test data using the best-performing constant learning rate by validation AUPR.
-For the grid search, we used the following range of constant learning rates: {0.000005, 0.00001, 0.00005, 0.0001, 0.0005, 0.001, 0.01}.
+For the grid search, we used the following range of constant learning rates: {0.00001, 0.0001, 0.001, 0.01}.
Unless otherwise specified, results for SGD in the main paper figures used the `constant_search` approach, which performed the best in our comparison between schedulers.
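
As a rough illustration of the `constant_search` approach, the sketch below grid-searches constant learning rates for a scikit-learn `SGDClassifier` and selects the best by validation AUPR (average precision). The synthetic dataset and the l1 penalty are assumptions for illustration (the grid values come from the updated text above); the paper's actual training code may differ.

```python
# Minimal sketch of "constant_search" (assumed scikit-learn setup, not
# necessarily the paper's pipeline): grid-search constant learning rates
# for SGD logistic regression and pick the best by validation AUPR.
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, n_features=100, random_state=42)

search = GridSearchCV(
    SGDClassifier(
        loss="log_loss",           # logistic regression objective ("log" in older sklearn)
        penalty="l1",              # sparsity-inducing penalty (an assumption here)
        learning_rate="constant",  # fixed learning rate, set via eta0
        random_state=42,
    ),
    param_grid={"eta0": [0.00001, 0.0001, 0.001, 0.01]},  # rates from the text above
    scoring="average_precision",   # AUPR on held-out validation folds
    cv=5,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```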

### DepMap gene essentiality prediction
@@ -116,11 +116,11 @@ Previous work has shown that pan-cancer classifiers of Ras mutation status are a
We first evaluated models for KRAS mutation prediction.
As model complexity increases (more nonzero coefficients) for the `liblinear` optimizer, we observed that performance increases then decreases, corresponding to overfitting for high model complexities/numbers of nonzero coefficients (Figure {@fig:optimizer_compare_mutations}A).
On the other hand, for the SGD optimizer, we observed consistent performance as model complexity increases, with models having no nonzero coefficients performing comparably to the best (Figure {@fig:optimizer_compare_mutations}B).
-In this case, top performance for SGD (a regularization parameter of 10^-1^) is slightly better than top performance for `liblinear` (a regularization parameter of 1 / 3.16 x 10^2^): we observed a mean test AUPR of 0.722 for SGD vs. mean AUPR of 0.692 for `liblinear`.
+In this case, top performance for SGD (a regularization parameter of 3.16 x 10^-3^) is slightly better than top performance for `liblinear` (a regularization parameter of 1 / 3.16 x 10^2^): we observed a mean test AUPR of 0.725 for SGD vs. mean AUPR of 0.685 for `liblinear`.
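
A hedged sketch of the kind of regularization sweep behind panels A and B: fit l1-regularized logistic regression with each optimizer across a range of regularization strengths, recording test AUPR and the number of nonzero coefficients. The synthetic data and parameter grids are illustrative placeholders, not the study's settings.

```python
# Illustrative regularization sweep (synthetic data; grids are placeholders).
# Note the parametrizations differ: liblinear's C is inverse regularization
# strength, while SGD's alpha is direct regularization strength.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=100, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for C in np.logspace(-3, 3, 7):
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(X_tr, y_tr)
    aupr = average_precision_score(y_te, clf.decision_function(X_te))
    print(f"liblinear C={C:.3g}: AUPR={aupr:.3f}, nonzero={np.count_nonzero(clf.coef_)}")

for alpha in np.logspace(-5, 1, 7):
    clf = SGDClassifier(loss="log_loss", penalty="l1", alpha=alpha,
                        random_state=0).fit(X_tr, y_tr)
    aupr = average_precision_score(y_te, clf.decision_function(X_te))
    print(f"SGD alpha={alpha:.3g}: AUPR={aupr:.3f}, nonzero={np.count_nonzero(clf.coef_)}")
```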

To determine how relative performance trends with `liblinear` tend to compare across the genes in the Vogelstein dataset at large, we looked at the difference in performance between optimizers for the best-performing models for each gene (Figure {@fig:optimizer_compare_mutations}C).
The distribution is centered around 0 and more or less symmetrical, suggesting that across the gene set, `liblinear` and SGD tend to perform comparably to one another.
-We saw that for 52/84 genes, performance for the best-performing model was better using SGD than `liblinear`, and for the other 32 genes performance was better using `liblinear`.
+We saw that for 58/84 genes, performance for the best-performing model was better using SGD than `liblinear`, and for the other 25 genes performance was better using `liblinear`.
In order to quantify whether the overfitting tendencies (or lack thereof) also hold across the gene set, we plotted the difference in performance between the best-performing model and the largest (least regularized) model; classifiers with a large difference in performance exhibit strong overfitting, and classifiers with a small difference in performance do not overfit (Figure {@fig:optimizer_compare_mutations}D).
For SGD, the least regularized models tend to perform comparably to the best-performing models, whereas for `liblinear` the distribution is wider suggesting that overfitting is more common.
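
The per-gene comparison in panel C reduces to a simple difference of best-model scores per gene. Below is a sketch under assumed, hypothetical names for a results table (`gene`, `optimizer`, `test_aupr`), with toy values (the KRAS pair borrows the means quoted in the text above):

```python
# Sketch of the panel C summary; the results table and its column names
# ("gene", "optimizer", "test_aupr") are hypothetical stand-ins.
import matplotlib.pyplot as plt
import pandas as pd

results = pd.DataFrame({
    "gene":      ["KRAS", "KRAS", "TP53", "TP53"],
    "optimizer": ["liblinear", "sgd", "liblinear", "sgd"],
    "test_aupr": [0.685, 0.725, 0.80, 0.78],  # toy values for illustration
})

best = results.pivot(index="gene", columns="optimizer", values="test_aupr")
diff = best["liblinear"] - best["sgd"]  # positive = liblinear better
print((diff > 0).sum(), "genes favor liblinear;", (diff < 0).sum(), "favor SGD")

diff.plot.hist(bins=20)  # distribution of per-gene differences, as in panel C
plt.show()
```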

2 changes: 1 addition & 1 deletion content/91.supp-info.md
@@ -1,6 +1,6 @@
## Supplementary Material {.page_break_before}

-![Number of nonzero coefficients (model sparsity) across varying regularization parameter settings for KRAS mutation prediction using SGD and `liblinear` optimizers.](images/supp_figure_1.png){#fig:compare_sparsity tag="S1" width="100%"}
+![Number of nonzero coefficients (model sparsity) across varying regularization parameter settings for KRAS mutation prediction using SGD and `liblinear` optimizers, and averaged across all genes for both optimizers. In the "all genes" plot, the black dotted line shows the median parameter selected for `liblinear`, and the grey dotted line shows the median parameter selected for SGD.](images/supp_figure_1.png){#fig:compare_sparsity tag="S1" width="100%"}

![Distribution of performance difference between best-performing model for `liblinear` and SGD optimizers, across all 84 genes in Vogelstein driver gene set, for varying SGD learning rate schedulers. Positive numbers on the x-axis indicate better performance using `liblinear`, and negative numbers indicate better performance using SGD.](images/supp_figure_2.png){#fig:compare_all_lr tag="S2" width="100%" .page_break_before}

Binary file modified content/images/figure_1.png
