Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrangling chapter review #235

Merged
merged 72 commits into from
Sep 27, 2021
Merged

Wrangling chapter review #235

merged 72 commits into from
Sep 27, 2021

Conversation

leem44
Copy link
Contributor

@leem44 leem44 commented Aug 17, 2021

@trevorcampbell @ttimbers Please have a look at this PR and let me know if I should make any changes!
This PR addresses: #103, #160

  • Note: I have not yet removed map from this chapter -- I figure its easier to get any feedback on this, make changes and then create a branch off of this to remove map

leem44 and others added 30 commits June 27, 2021 13:34
…s, piping in data frame as per reviewers comments
…s, piping in data frame as per reviewers comments
@leem44
Copy link
Contributor Author

leem44 commented Sep 20, 2021

@ttimbers Changes look good! Only thing is the one below "Montreal" is spelled "Montreak"

Also changed the vector figure to be of type character, as previously it was called an integer, but then we actually made a double vector in code... I tried to swap it to double, but then it was too confusing for a first example. Character vector is a simpler and thus better first example:
Screen Shot 2021-09-19 at 12 23 34 PM

@ttimbers
Copy link
Contributor

Good catch!

…ta from the canlang package as I don't think we want to add that complexity...
@ttimbers
Copy link
Contributor

I also removed loading data from the canlang package and instead get them to read the data from files. I don't think we really explain data packages and they are kind of weird, so best to leave out I think?

@leem44
Copy link
Contributor Author

leem44 commented Sep 21, 2021

Yes, I agree with that! Makes sense!

@leem44 leem44 linked an issue Sep 21, 2021 that may be closed by this pull request
@ttimbers
Copy link
Contributor

ttimbers commented Sep 22, 2021

In Fig 3.10 we illlustrate going from longer to wider, but each table is the same number of columns. Yes, this is technically something that happens, but for our first teaching case we really should have a wider dataframe that we make longer:
Screen Shot 2021-09-21 at 10 26 13 PM

Also, where did the values for commuters come from? I imagine population came from canlang, but I don't think that data set has commuters? Other columns we could add from canlang could be area dwellings or households (we need something that is a count to stick with the narrative)?

@ttimbers
Copy link
Contributor

ttimbers commented Sep 22, 2021

I think we should remove the & operator from here as we never use it elsewhere in the book or course. So it's just redundant and extra info that they do not "need" to know:

Screen Shot 2021-09-22 at 9 48 06 AM

@ttimbers
Copy link
Contributor

@leem44 - I am also removing the demonstration of filter not working on numeric data that is in the mutate section, as it is rendundant to what we wrote above in the section about convert and separate.

@ttimbers
Copy link
Contributor

ttimbers commented Sep 22, 2021

@leem44 - in the wrangling chapter, are you OK to move the pipe to before mutate and after select and filter? It uses select and filter so is timely after those two. Also, then I can use the |> in the setup for the as.numeric transformation for mutate so the creation of the data frame with the character columns that should be numbers is a little less magical?

If I do that, then I will use arrange in the pipe preamble since that was intro'ed in chapter 1, but mutate won't have been demo'd yet. Also, I think that works nicely since we use arrange in the code example we show that follows.

@ttimbers
Copy link
Contributor

ttimbers commented Sep 22, 2021

Maybe not actually - I am now fence sitting on this change, as it causes cascading changes...

Decided not to do this.

@ttimbers
Copy link
Contributor

ttimbers commented Sep 22, 2021

Simplified the example to just plot the proportion of people speaking English as the primary language at home.
Then we can skip case_when and vector recycling and keep the narrative pretty similar.

@ttimbers
Copy link
Contributor

Done all sections except for summarize + across and purrr::map* (I did do rowwise which comes after these).

@ttimbers
Copy link
Contributor

OK, I think we are ready to merge. I added a bunch of images into the wrangling chapter, and re-organized the section where we do aggregation. I am pretty excited about it. @leem44 - I made quite a few changes to the wrangling chapter to (hopefully) simplify it. I hope that is OK. If there is anything I removed that you really want to keep in, please let me know and we can talk and find a way to keep it.

@ttimbers
Copy link
Contributor

@leem44 gave me the go-ahead to merge, so merging!

@ttimbers ttimbers merged commit bc73e02 into dev Sep 27, 2021
@ttimbers ttimbers deleted the review_wrangling branch October 19, 2021 02:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Review: Ch 3 (wrangling)
2 participants