Improve looping over files lecture material
* Condense presentation
* Show shortcuts for empty vectors
* Store filename then read it to simplify setup for storing in data frames
* Add a realistic calculation within the file itself
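
The "shortcuts for empty vectors" item appears to refer to swapping `vector(mode = ..., length = ...)` for the type-specific constructors used in the diff. A minimal sketch of that equivalence in base R, where `n` is a hypothetical stand-in for `length(data_files)`:

```r
# Hypothetical length standing in for length(data_files)
n <- 3

# The long form and the shortcut build the same zero-filled integer vector
long_form <- vector(mode = "integer", length = n)
shortcut <- integer(n)
identical(long_form, shortcut)

# The same pattern covers the other column types used in the diff
empty_names <- character(n)  # n empty strings
empty_lats <- numeric(n)     # n zeros stored as doubles
```

Each constructor returns a vector of the requested length filled with that type's default value, so the loop can overwrite element `i` without growing the vector.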
ethanwhite committed Nov 4, 2022
1 parent b0ca28b commit 6009bbd
Showing 1 changed file with 10 additions and 16 deletions.
26 changes: 10 additions & 16 deletions materials/for-loops-R.md
@@ -207,14 +207,15 @@ data_files = list.files(pattern = "locations-")
 * First create an empty vector to store those counts

 ```r
-results <- vector(mode = "integer", length = length(data_files))
+results <- integer(length(data_files))
 ```

 * Then write our loop

 ```r
 for (i in 1:length(data_files){
-data <- read.csv(data_files[i])
+filename <- data_files[i]
+data <- read.csv(filename)
 count <- nrow(data)
 results[i] <- count
 }
@@ -228,38 +229,31 @@ for (i in 1:length(data_files){
 * We often want to calculate multiple pieces of information in a loop making it useful to store results in things other than vectors
 * We can store them in a data frame instead by creating an empty data frame and storing the results in the `i`th row of the appropriate column
 * Associate the file name with the count
+* Also store the minimum latitude
 * Start by creating an empty data frame
 * Use the `data.frame` function
 * Provide one argument for each column
 * "Column Name" = "an empty vector of the correct type"

 ```r
-results <- data.frame(file_name = vector(mode = "character", length = length(data_files)))
-count = vector(mode = "integer", length = length(data_files)))
+results <- data.frame(file_name = character(length(data_files)),
+count = integer(length(data_files)),
+min_lat = numeric(length(data_files)))
 ```

 * Now let's modify our loop from last time
 * Instead of storing `count` in `results[i]` we need to first specify the `count` column using the `$`: `results$count[i]`
 * We also want to store the filename, which is `data_files[i]`

 ```r
-for (i in 1:length(data_files){
-data <- read.csv(data_files[i])
-count <- nrow(data)
-results$file_name[i] <- data_files[i]
-results$count[i] <- count
-}
-```
-
-* We could also rewrite this a little to make it easier to understand by getting the file name at the beginning
-
-```r
-for (i in 1:length(data_files){
+for (i in 1:length(data_files)){
 filename <- data_files[i]
 data <- read.csv(filename)
 count <- nrow(data)
+min_lat = min(data$lat)
 results$file_name[i] <- filename
 results$count[i] <- count
+results$min_lat[i] <- min_lat
 }
 ```
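
For anyone who wants to try the data frame version of the loop without the course data, here is a self-contained sketch of the code as it reads after this commit. The two CSV files, their contents, the `long` column, and the temporary directory are hypothetical stand-ins; only the `lat` column name and the `locations-` pattern come from the diff. It assumes R 4.0 or later, where `data.frame()` keeps `character(n)` as a character column rather than a factor. Note that the `for` line needs the closing parenthesis that the second hunk adds; the unchanged loop in the first hunk still lacks it.

```r
# Write two small hypothetical files so the sketch runs anywhere
data_dir <- file.path(tempdir(), "locations-demo")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(lat = c(29.6, 30.1, 28.9), long = c(-82.3, -81.9, -82.0)),
          file.path(data_dir, "locations-a.csv"), row.names = FALSE)
write.csv(data.frame(lat = c(31.2, 27.5), long = c(-83.1, -80.4)),
          file.path(data_dir, "locations-b.csv"), row.names = FALSE)

# full.names = TRUE is an addition so the files can sit outside the working directory
data_files <- list.files(data_dir, pattern = "locations-", full.names = TRUE)

# One empty column per value we want to keep, each of the correct type
results <- data.frame(file_name = character(length(data_files)),
                      count = integer(length(data_files)),
                      min_lat = numeric(length(data_files)))

for (i in 1:length(data_files)){
  filename <- data_files[i]
  data <- read.csv(filename)
  # Fill row i of each column
  results$file_name[i] <- filename
  results$count[i] <- nrow(data)
  results$min_lat[i] <- min(data$lat)
}

results
```

Storing `filename` once at the top of the loop body means the same value is used for both reading the file and labelling its row, which is the simplification the commit message describes.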
