Review: Ch 7 (classification_continued) #107

Closed
leem44 opened this issue Mar 28, 2021 · 5 comments · Fixed by #234

Comments

@leem44
Contributor

leem44 commented Mar 28, 2021

Reviewer E:

  • I love the cross validation diagram -- extremely helpful!
    • ML: no changes needed here
  • Consider defining “accuracy” more precisely and either defining “Kap” or removing it from the tidymodels output. It’s a bit distracting to have it reported but unexplained.
  • Consider spending slightly more time explaining the confusion matrix and what each cell means (see the sketch after this list).
  • Chapter 7 is extremely dense and hits on so many foundational modeling concepts. I think it could be helpful to pull some of this up before Chapter 6 and describe a holistic modeling workflow
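
For reference, the accuracy/kap output and the confusion matrix that Reviewer E mentions come from yardstick. A minimal sketch, assuming a fitted classifier knn_fit, a test set cancer_test, and a label column Class (placeholder names, not necessarily the ones used in the chapter):

```r
library(tidymodels)

# Collect test-set predictions alongside the true labels
cancer_test_predictions <- predict(knn_fit, cancer_test) |>
  bind_cols(cancer_test)

# metrics() reports accuracy and kap (Cohen's kappa) for class predictions;
# this is where the unexplained "kap" row in the output comes from
cancer_test_predictions |>
  metrics(truth = Class, estimate = .pred_class)

# conf_mat() prints the confusion matrix: rows are predicted classes,
# columns are true classes
cancer_test_predictions |>
  conf_mat(truth = Class, estimate = .pred_class)
```
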
@leem44
Contributor Author

leem44 commented Mar 28, 2021

Reviewer B:

  • Wow, cross-validation in an intro course! The future has arrived! Cross-validation-based error estimates are much easier to understand than summaries like R-squared or AIC.
    • ML: no changes needed here

@trevorcampbell
Contributor

trevorcampbell commented Mar 28, 2021

Reviewer D

  • it is unclear to me what the overall workflow advocated by the authors should be
  • Specifically:
    1. Should the readers/users split the data into a training set, a validation set, and a test set (such that the training set and the validation set combine to yield the overall training set)?
    2. Should they then tune the classifier by building it on the training set and evaluating it on the validation set?
    3. Once the classifier is tuned, should they assess its accuracy on the test set?
    4. Once the accuracy of the tuned classifier is established, should the classification model/technique be applied to the ENTIRE data (i.e., training set + validation set + test set) to perform classifications for new observations?
    • ML: the overview at the end of the chapter does this (a rough sketch of this workflow follows below)
  • If the 4 items described above capture the workflow intended by the authors, I find it confusing that this workflow is presented backwards in the manuscript – for example, the accuracy of the classifier is assessed BEFORE the classifier is actually tuned.
    • ML: I think it is difficult to explain tuning before explaining accuracy
  • Can the authors clarify their intended workflow and make sure the chapters in which they present the elements of this workflow follow a logical sequence?
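
To make the intended sequence concrete, here is a rough sketch of that workflow in tidymodels, in which cross-validation on the training set plays the role of the validation set (object and column names such as cancer and Class are placeholders; the chapter's own code may differ):

```r
library(tidymodels)

# 1. Split off a test set; the remainder is the overall training set
cancer_split <- initial_split(cancer, prop = 0.75, strata = Class)
cancer_train <- training(cancer_split)
cancer_test  <- testing(cancer_split)

# 2. Tune the classifier on the training set, using cross-validation folds
#    in place of a single validation set
knn_spec <- nearest_neighbor(weight_func = "rectangular", neighbors = tune()) |>
  set_engine("kknn") |>
  set_mode("classification")

cancer_recipe <- recipe(Class ~ ., data = cancer_train) |>
  step_center(all_predictors()) |>
  step_scale(all_predictors())

knn_workflow <- workflow() |>
  add_recipe(cancer_recipe) |>
  add_model(knn_spec)

knn_results <- tune_grid(knn_workflow,
                         resamples = vfold_cv(cancer_train, v = 5, strata = Class),
                         grid = tibble(neighbors = seq(1, 15, by = 2)))

best_k <- select_best(knn_results, metric = "accuracy")

# 3. Refit the tuned classifier on the full training set and assess it once
#    on the held-out test set
final_results <- knn_workflow |>
  finalize_workflow(best_k) |>
  last_fit(cancer_split)

collect_metrics(final_results)

# 4. For predicting genuinely new observations, the tuned model can then be
#    refit on all of the available data
```
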

@trevorcampbell
Contributor

trevorcampbell commented Mar 28, 2021

Reviewer A

  • "precision and recall, not discussed": Why not?
    • ML: I don't know if it makes sense to explain it here if we aren't going to do anything with it later. I will add an issue in case we want to add it in a later iteration: precision/recall #230
  • p163: the figures are hard to follow (the flow of them) because they get jumbled in the PDF version; in the HTML version it's fine. Actual comment: I expected to see the scatterplot as the first graphic. If you want the reader to see these in print, be sure to point to them specifically in the text.
  • p165 first code block: I think it's important to point out that you are not simply sampling the rows of the cancer data set, but are performing a stratified sample (see the sketch after this list).
  • p166: You could emphasize the stratification by summarizing the split between M and B in each data set.
  • p169: Is that good? I feel like some wrap-up of the performance would be good for the novice reader here.
  • p170: Does that mean you pool all the data and train your final classifier? Be very clear for the reader since this is an intro text.
  • p177 in the underfitting paragraph: So you want to balance these two issues: be clear about that.
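
On the p165/p166 points, a minimal sketch of how the stratified split and a class-balance summary might look (object and column names are placeholders):

```r
library(tidymodels)

# strata = Class makes this a stratified sample of the rows,
# not a simple random sample
cancer_split <- initial_split(cancer, prop = 0.75, strata = Class)
cancer_train <- training(cancer_split)
cancer_test  <- testing(cancer_split)

# Summarize the M / B balance in each set to show that
# stratification preserved the class proportions
cancer_train |>
  group_by(Class) |>
  summarize(n = n()) |>
  mutate(proportion = n / sum(n))

cancer_test |>
  group_by(Class) |>
  summarize(n = n()) |>
  mutate(proportion = n / sum(n))
```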

@trevorcampbell
Contributor

From #146 comment by @ttimbers : need more informative axis labels in figures. The variables we have are the mean values across cells in a tissue sample.

However, I worry a bit that changing the axis labels will make the examples more confusing (because the new axis labels should be something like Mean Concavity (for example)).

I will make this same comment in the chapter-specific edits thread for classification 1.

@leem44
Contributor Author

leem44 commented Aug 12, 2021

> From #146 comment by @ttimbers : need more informative axis labels in figures. The variables we have are the mean values across cells in a tissue sample.
>
> However, I worry a bit that changing the axis labels will make the examples more confusing (because the new axis labels should be something like Mean Concavity (for example)).
>
> I will make this same comment in the chapter-specific edits thread for classification 1.

I decided not to add "Mean" in front of the labels since I think it might make it more confusing, but I did specify when the values were standardized, e.g., Perimeter (standardized)
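
In ggplot2 terms, that labelling convention amounts to setting the axis labels via labs() (a sketch; the data frame and column names are placeholders):

```r
library(ggplot2)

ggplot(cancer_train, aes(x = Perimeter, y = Concavity, color = Class)) +
  geom_point(alpha = 0.6) +
  labs(x = "Perimeter (standardized)",
       y = "Concavity (standardized)",
       color = "Diagnosis")
```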

leem44 linked a pull request on Sep 21, 2021 that will close this issue
leem44 closed this as completed on Sep 23, 2021