diff --git a/episodes/01-introduction.md b/episodes/01-introduction.md index ed81ffbf..71b82e90 100644 --- a/episodes/01-introduction.md +++ b/episodes/01-introduction.md @@ -21,13 +21,13 @@ exercises: 0 ## Introducing the Python programming language -Python is a general purpose programming language. It is an interpreted language, +Python is a general-purpose programming language. It is an interpreted language, which makes it suitable for rapid development and prototyping of programming segments or complete small programs. Python's main advantages: -- Open source software, supported by [Python Software +- Open-source software, supported by [Python Software Foundation](https://www.python.org/psf/) - Available on all major platforms (Windows, macOS, Linux) - It is a good language for new programmers to learn due to its straightforward, @@ -43,11 +43,10 @@ before running it. It is the machine code which is executed and produces results. In a language like C++ your code is translated into machine code and stored in a separate file, in a process referred to as **compiling** the code. You then execute the machine code from the file as a separate step. This is -efficient if you intend to run the same machine code many times as you only have -to compile it once and it is very fast to run the compiled machine code. +efficient if you intend to run the same machine code many times as you only compile it once and it is very fast to run the compiled machine code. On the other hand, if you are experimenting, then your -code will change often and you would have to compile it again every time before +code will change often, and you would have to compile it again every time before the machine can execute it. This is where **interpreted** languages have the advantage. You don't need a complete compiled program to "run" what has been written so far and see the results. This rapid prototyping is helped further by @@ -86,11 +85,11 @@ development more interactive. 
Since its inception, the scope of the project has expanded to include **Ju**lia, **Pyt**hon, and **R**, so the name was changed to "Jupyter" as a reference to these core languages. Today, Jupyter supports even more languages, but we will be using it only for Python code. Specifically, we will -be using **Jupyter notebooks**, which allows us to easily take notes about +be using **Jupyter notebooks**, which allow us to easily take notes about our analysis and view plots within the same document where we code. This facilitates sharing and reproducibility of analyses, and the notebook interface is easily accessible through any web browser. Jupyter notebooks are started -from the terminal using +from the terminal using the following command: ```bash $ jupyter notebook @@ -124,8 +123,8 @@ cell (`In [ ]:`) is created for you automatically. ![](fig/Python_jupyter_8.png){alt='Jupyter\_notebook\_cell'} When a cell is run, it is given a number along with the corresponding output -cell. If you have a notebook with many cells in it you can run the cells in any -order and also run the same cell many times. The number on the left hand side of +cell. If you have a notebook with many cells in it, you can run the cells in any +order and also run the same cell many times. The number on the left-hand side of the input cells increments, so you can always tell the order in which they were run. For example, a cell marked `In [5]:` was the fifth cell run in the sequence. diff --git a/episodes/02-basics.md b/episodes/02-basics.md index dc4e0bf0..815cbd2e 100644 --- a/episodes/02-basics.md +++ b/episodes/02-basics.md @@ -1,7 +1,7 @@ --- title: Python basics -teaching: 25 -exercises: 30 +teaching: 90 +exercises: 60 --- ::::::::::::::::::::::::::::::::::::::: objectives @@ -31,11 +31,11 @@ exercises: 30 ### New cells -From the insert menu item you can insert a new cell anywhere in the notebook either above or below the current cell. 
You can also use the `+` button on the toolbar to insert a new cell below. +From the insert menu item, you can insert a new cell anywhere in the notebook either above or below the current cell. You can also use the `+` button on the toolbar to insert a new cell below. ### Change cell type -By default new cells are created as code cells. From the cell menu item you can change the type of a cell from code to Markdown. Markdown is a markup language for formatting text, it has much of the power of HTML, but is specifically designed to be human-readable as well. You can use Markdown cells to insert formatted textual explanation and analysis into your notebook. For more information about Markdown, check out these resources: +By default, new cells are created as code cells. From the cell menu item, you can change the type of a cell from code to Markdown. Markdown is a markup language for formatting text; it has much of the power of HTML, but is specifically designed to be human-readable as well. You can use Markdown cells to insert formatted textual explanation and analysis into your notebook. For more information about Markdown, check out these resources: - [Jupyter Notebook Markdown Docs](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html) - [Markdown - a Visual Guide](https://beegit.com/markdown-cheat-sheet) @@ -44,9 +44,9 @@ By default new cells are created as code cells. From the cell menu item you can ### Hiding output -When you run cells of code the output is displayed immediately below the cell. In general this is convenient. The output is associated with the cell that produced it and remains a part of the notebook. So if you copy or move the notebook the output stays with the code. +When you run cells of code, the output is displayed immediately below the cell. In general, this is convenient, because the output is associated with the cell that produced it and remains a part of the notebook. 
If you copy or move the notebook the output will stay with the code. -However lots of output can make the notebook look cluttered and more difficult to move around. So there is an option available from the `cell` menu item to 'toggle' or 'clear' the output associated either with an individual cell or all cells in the notebook. +However, lots of output can make the notebook look cluttered and more difficult to move around. There is an option available from the `cell` menu item to 'toggle' or 'clear' the output associated either with an individual cell or all cells in the notebook. ## Creating variables and assigning values @@ -77,18 +77,19 @@ print(type(s)) ``` There are many more data types available, a full list is available in the [Python documentation](https://docs.python.org/3/library/datatypes.html). -We will be looking a few of them later on. +We will be looking at a few of them later. + +We need to use Python’s built-in `print()` function, which displays formatted text, because by default only the last output from a cell is displayed. ## Arithmetic operations -For now we will stick with the numeric types and do some arithmetic. +For now, we will stick with the numeric types and do some arithmetic. -All of the usual arithmetic operators are available. +All the usual arithmetic operators are available. -In the examples below we also introduce the Python comment symbol `#`. +In the examples below we introduce the Python comment symbol `#`. Anything to the right of the `#` symbol is treated as a comment. To a large extent using Markdown cells in a notebook reduces the need for comments in the code in a notebook, but occasionally they can be useful. -We also make use of the built-in `print()` function, which displays formatted text. ```python print("a =", a, "and b =" , b) @@ -112,15 +113,11 @@ a = 2 and b = 3.142 0.8580000000000001 ``` -We need to use the `print()` function because by default only the last output from a cell is displayed in the output cell. 
- -In our example above, we pass four different parameters to the first call of `print()`, each separated by a comma. A string `"a = "`, followed by the variable `a`, followed by the string `"b = "` and then the variable `b`. - -The output is what you would probably have guessed at. +In the first example above, we pass four different parameters to the first call of `print()`, each separated by a comma: the string `"a ="`, followed by the variable `a`, followed by the string `"and b ="`, and then the variable `b`. The output is what you would probably have guessed. -All of the other calls to `print()` are only passed a single parameter. Although it may look like 2 or 3, the expressions are evaluated first and it is only the single result which is seen as the parameter value and printed. +All of the other calls to `print()` are only passed a single parameter. Although it may look like two or three, the expressions are evaluated first, and it is only the single result which is seen as the parameter value and printed. -In the last expression `a` is multiplied by 2 and then the modulus of the result is taken. Had we wanted to calculate a % b and then multiply the result by two we could have done so by using brackets to make the order of calculation clear. +In the last expression, `a` is multiplied by two and then the modulus of the result is taken. Had we wanted to calculate `a % b` and then multiply the result by two, we could have done so by using parentheses to make the order of calculation clear. When we have more complex arithmetic expressions, we can use parentheses to be explicit about the order of evaluation: @@ -169,17 +166,17 @@ A complete set of Python operators can be found in the [official documentation]( Python has a reasonable number of built-in functions. You can find a complete list in the [official documentation](https://docs.python.org/3/library/functions.html). -Additional functions are provided by 3rd party packages which we will look at later on. 
+Additional functions are provided by third party packages which we will look at later in the lesson. For any function, a common question to ask is: What parameters does this function take? -In order to answer this from Jupyter, you can type the function name and then type `shift`\+`tab` and a pop-up window will provide you with various details about the function including the parameters. +To answer this from Jupyter, you can type the function name and then type `shift`+`tab` and a pop-up window will provide you with various details about the function, including the parameters. ::::::::::::::::::::::::::::::::::::::: challenge ## Exercise -For the `print()` function find out what parameters can be provided +Find out what parameters can be provided for the `print()` function. ::::::::::::::: solution @@ -195,7 +192,7 @@ Type 'print' into a code cell and then type `shift`\+`tab`. The following pop-up ## Getting Help for Python -You can get help on any Python function by using the help function. It takes a single parameter of the function name for which you want the help. +You can get help for any Python function by using the help function. It takes a single parameter, which is the name of the function you want the help for. ```python help(print) @@ -215,11 +212,11 @@ print(...) flush: whether to forcibly flush the stream. ``` -There is a great deal of Python help and information as well as code examples available from the internet. One popular site is [stackoverflow](https://stackoverflow.com/tags) which specialises in providing programming help. They have dedicated forums not only for Python but also for many of the popular 3rd party Python packages. They also always provide code examples to illustrate answers to questions. +There is a great deal of Python help and information as well as code examples available from the internet. One popular site is [stackoverflow](https://stackoverflow.com/tags) which specialises in providing programming help. 
They have dedicated forums not only for Python but also for many of the popular 3rd party Python packages. They also always provide code examples to illustrate answers to questions. +There is a great deal of Python help and information as well as code examples available from the internet. One popular site is [Stack Overflow](https://stackoverflow.com/tags), which specialises in providing programming help. They have dedicated forums not only for Python but also for many of the popular third-party Python packages. They also provide code examples to illustrate answers to questions. -You can also get answers to your queries by simply inputting your question (or selected keywords) into any search engine. +You can also get answers to your queries by entering your question (or selected keywords) into any search engine. -A couple of things you may need to be wary of: There are currently 2 versions of Python in use, in most cases code examples will run in either but there are some exceptions. Secondly, some replies may assume a knowledge of Python beyond your own, making the answers difficult to follow. But for any given question there will be a whole range of suggested solutions so you can always move on to the next. +A couple of things you may need to be wary of: first, there are currently two versions of Python in use (Python 2 and Python 3); in most cases code examples will run in either, but there are some exceptions. Secondly, some replies may assume a deeper knowledge of Python than you currently have, making the answers difficult to follow. But for any given question there will be a whole range of suggested solutions, so you can always move on to the next. ## Data types and how Python uses them @@ -227,7 +224,7 @@ A couple of things you may need to be wary of: There are currently 2 versions of The data type of a variable is assigned when you give a variable a value as we did above. If you re-assign the value of a variable, you can change the data type. -You can also explicitly change the type of a variable by `casting` it using an appropriate Python builtin function. In this example we have changed a `string` to a `float`. +You can also explicitly change the type of a variable by `casting` it using an appropriate Python built-in function. In this example we have changed a `string` to a `float`. 
```python a = "3.142" @@ -241,7 +238,7 @@ print(type(a)) ``` -Although you can always change an `integer` to a `float`, if you change a `float` to an `integer` then you can lose part of the value of the variable and you won't get an error message. +Although you can always change an `integer` to a `float`, if you change a `float` to an `integer` then you can lose part of the value of the variable, and you won't get an error message. ```python a = 3.142 @@ -267,7 +264,7 @@ print(a) 3 ``` -In some circumstances explicitly converting a data type makes no sense; you cannot change a string with alphabetic characters into a number. +In some circumstances explicitly converting a data type makes no sense; for example, you cannot change a string with alphabetic characters into a number. ```python b = "Hello World" @@ -296,25 +293,25 @@ A string is a simple data type which holds a sequence of characters. Strings are placed in quotes when they are being assigned, but the quotes don't count as part of the string value. -If you need to use quotes as part of your string you can arbitrarily use either single or double quotes to indicate the start and end of the string. +If you need to use quotes as part of your string you can choose to use either single or double quotes to indicate the start and end of the string. ```python mystring = "Hello World" print(mystring) name = "Peter" -mystring = 'Hello ' + name + ' How are you?' +mystring = 'Hello, ' + name + ', how are you?' print(mystring) name = "Peter" -mystring = 'Hello this is ' + name + "'s code" +mystring = 'Hello, this is ' + name + "'s code" print(mystring) ``` ```output Hello World -Hello Peter How are you? -Hello this is Peter's code +Hello, Peter, how are you? +Hello, this is Peter's code ``` ## String functions @@ -334,7 +331,7 @@ print(len(mystring)) 11 ``` -[The official documentation](https://docs.python.org/3/tutorial/classes.html) says, 'A method is a function that "belongs to" an object. 
In Python, the term method is not unique to class instances: other object types can have methods as well. For example, list objects have methods called append, insert, remove, sort, and so on.'. +[The official documentation](https://docs.python.org/3/tutorial/classes.html) says: _A method is a function that “belongs to” an object. (In Python, the term method is not unique to class instances: other object types can have methods as well. For example, list objects have methods called append, insert, remove, sort, and so on. [...])_ If you want to see a list of all of the available methods for a string (or any other object) you can use the `dir()` function. @@ -346,7 +343,7 @@ print(dir(mystring)) ['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill'] ``` -The methods starting with `__` are special or magic methods which are not normally used. +Methods starting with `__` are special or magic methods which are not normally used. Some examples of the methods are given below. We will use others when we start reading files. 
@@ -366,7 +363,7 @@ THE QUICK BROWN FOX The quick brown fox ``` -The methods starting with 'is...' return a boolean value of either True or False +Methods starting with _is_ return a Boolean value (True or False). ```python print(myString.isalpha()) @@ -376,10 +373,10 @@ print(myString.isalpha()) False ``` -the example above returns False because the space character is not considered to be an Alphanumeric value. +The example above returns `False` because the space character is not considered to be an alphabetic character. -In the example below, we can use the `replace()` method to remove the spaces and then check to see if the result `isalpha` -chaining method in this way is quite common. The actions take place in a left to right manner. You can always avoid using chaining by using intermediary variables. +In the example below, we can use the `replace` method to remove spaces and then check to see if the result is alphabetic by using the method `isalpha`. +Methods can be "chained", which means performing one method after another, from left to right, without the need to save all intermediate variables. If you would prefer not to chain methods, you would need to save each intermediate value to a new variable. ```python print(myString.replace(" ","").isalpha()) @@ -389,7 +386,7 @@ print(myString.replace(" ","").isalpha()) True ``` -For example, the following is equivalent to the above +For example, the following is equivalent to the above, but requires an extra line of code and an extra variable: ```python mystring_clean = myString.replace(" ","") @@ -401,9 +398,11 @@ True ``` If you need to refer to a specific element (character) in a string, -you can do so by specifying the index of the character in `[]` -you can also use indexing to select a substring of the string. In Python, -indexes begin with `0` (for a visual, please see [Strings and Character Data in Python: String Indexing](https://realpython.com/python-strings/#string-indexing) or [9\.4. 
Index Operator: Working with the Characters of a String](https://runestone.academy/runestone/books/published/thinkcspy/Strings/IndexOperatorWorkingwiththeCharactersofaString.html)). +you can do so by specifying the `index` of the character between square brackets `[]`. +In Python, indexes begin with `0` (for a visual, please see +[Strings and Character Data in Python: String Indexing](https://realpython.com/python-strings/#string-indexing) +or [9.4. Index Operator: Working with the Characters of a String](https://runestone.academy/runestone/books/published/thinkcspy/Strings/IndexOperatorWorkingwiththeCharactersofaString.html)). +You can also use indexing to select a substring of the whole string. ```python myString = "The quick brown fox" @@ -430,46 +429,30 @@ quick ## Basic Python data types -So far we have seen three basic Python data types; Integer, Float and String. There is another basic data type; Boolean. Boolean variables can only have the values of either `True` or `False`. (Remember, Python is case-sensitive, so be careful of your spelling.) -We can define variables to be of type boolean by setting their value accordingly. Boolean variables are a good way of coding anything that has a binary range (eg: yes/no), because it's a type that computers know how to work with as we will see soon. +So far, we have seen three basic Python data types: Integer, Float, and String. There is another basic data type: Boolean. Boolean variables take the values `True` or `False` (remember that Python is case-sensitive, so be careful with spelling!). Boolean variables are a good way of coding anything that has a binary range (e.g., yes/no), because it's a type that computers know how to work with, as we will see soon. 
```python -print(True) print(False) bool_val_t = True print(type(bool_val_t)) print(bool_val_t) -bool_val_f = False -print(type(bool_val_f)) -print(bool_val_f) +print(true) ``` ```output -True False True - -False -``` - -Following two lines of code will generate error because Python is case-sensitive. We need to use 'True' instead of 'true' and 'False' instead of 'false'. - -```python -print(true) -print(false) -``` - -```output NameError Traceback (most recent call last) in ----> 1 print(true) - 2 print(false) NameError: name 'true' is not defined ``` -We can also get values of Boolean type using comparison operators, basic ones in Python are `==` for "equal to", `!=` for "not equal to", and `>`, `<`, or `>=`, `<=`. +The last line of code generates an error because Python is case-sensitive. We need to use `True` instead of `true`. + +We can also get Boolean values by using comparison operators; the basic ones in Python are `==` for "equal to", `!=` for "not equal to", and `>`, `<`, or `>=`, `<=`. ```python print('hello' == 'HELLO') @@ -491,36 +474,27 @@ False ## Exercise -Imagine you are considering different ways of representing a boolean value in your data set and you need to see how python will behave based on the different choices. Fill in the blanks using the built in functions we've seen so far in following code excerpt to test how Python interprets text. Write some notes for your research team on how to code `True` and `False` as they record the variable. +Imagine you are considering different ways of representing a Boolean value in your data set and you need to see how Python will behave based on the different choices. Fill in the blanks using the built-in functions we've seen so far to test how Python interprets the different values. 
```python bool_val1 = 'TRUE' print('read as type ',___(bool_val1)) print('value when cast to bool',___(bool_val1)) -bool_val2 = 'FALSE' +bool_val2 = 1 print('read as type ',___(bool_val2)) print('value when cast to bool',___(bool_val2)) -bool_val3 = 1 +bool_val3 = 0 print('read as type ',___(bool_val3)) print('value when cast to bool',___(bool_val3)) - -bool_val4 = 0 -print('read as type ',___(bool_val4)) -print('value when cast to bool',___(bool_val4)) - -bool_val5 = -1 -print('read as type ',___(bool_val5)) -print('value when cast to bool',___(bool_val5)) -print(bool(bool_val5)) ``` ::::::::::::::: solution ## Solution -0 is represented as False and everything else, whether a number or string is counted as True +The number 0 is interpreted as `False`, and non-empty strings or non-zero numbers are interpreted as `True`. ::::::::::::::::::::::::: @@ -528,40 +502,36 @@ print(bool(bool_val5)) ## Structured data types -A structured data type is a data type which is made up of some combination of the base data types in a well defined but potentially arbitrarily complex way. +A _structured_ data type is made up of a combination of the base data types (e.g., Integer, String, Boolean) in a well defined but potentially (arbitrarily) complex way. ### The list -A list is a set of values, of any type separated by commas and delimited by '[' and ']' +A list is a set of values of any type, separated by commas and delimited by square brackets ('[' and ']'). 
```python -list1 = [6, 54, 89 ] +list1 = [6, 54, 89.23] print(list1) print(type(list1)) -list2 = [3.142, 2.71828, 9.8 ] +myname = "Peter" +list2 = ["Hello", myname, 'how are you today?'] +print(list2) +myname = "Fred" print(list2) print(type(list2)) -myname = "Peter" -list3 = ["Hello", 'to', myname ] -print(list3) -myname = "Fred" +list3 = [6, 5.4, "numbers", True] print(list3) print(type(list3)) - -list4 = [6, 5.4, "numbers", True ] -print(list4) -print(type(list4)) ``` ```output -[6, 54, 89] +[6, 54, 89.23] [3.142, 2.71828, 9.8] -['Hello', 'to', 'Peter'] -['Hello', 'to', 'Peter'] +['Hello', 'Peter', 'how are you today?'] +['Hello', 'Peter', 'how are you today?'] [6, 5.4, 'numbers', True] @@ -571,15 +541,15 @@ print(type(list4)) ## Exercise -We can index lists the same way we indexed strings before. Complete the code below and display the value of `last_num_in_list` which is 11 and values of `odd_from_list` which are 5 and 11 to check your work. +Use an _index_ to print the last number in the list below. Then print the odd numbers that are in the list. ```python -num_list = [4,5,6,11] +num_list = [4, 5, 6, 11] last_num_in_list = num_list[____] print(last_num_in_list) -odd_from_list = [num_list[_____], ______] +odd_from_list = print(odd_from_list) ``` @@ -588,31 +558,27 @@ print(odd_from_list) ## Solution ```python -# Solution 1: Basic ways of solving this exercise using the core Python language -num_list = [4,5,6,11] - -last_num_in_list = num_list[-1] +# One solution: Basic way of solving this exercise using core Python language +last_num_in_list = num_list[-1] # Why is this a "better" solution than num_list[3]? print(last_num_in_list) odd_from_list = [num_list[1], num_list[3]] print(odd_from_list) - -# Solutions 2 and 3: Usually there are multiple ways of doing the same work. 
Once we learn about more advanced Python, we would be able to write more varieties codes like the followings to print the odd numbers: +# Other solutions: There is usually more than one way of doing the same task. Some might be more _generalizable_ or robust than others. Once we learn more advanced Python, we will be able to write more advanced and error-proof code to print odd numbers, such as the following: +# Use the third party package called `numpy` import numpy as np -num_list = [4,5,6,11] +num_list = [4, 5, 6, 11] -# Converting `num_list` list to an advanced data structure: `numpy array` +# Convert `num_list` list to an advanced data structure: `numpy array` num_list_np_array = np.array(num_list) -# Filtering the elements which produces a remainder of `1`, after dividing by `2` -odd_from_list = num_list_np_array[num_list_np_array%2 == 1] -print(odd_from_list) +# Use the "modulo" operator we learned about above to find the indexes of the numbers in the list that have remainder = 1 after dividing by 2 +odd_ixs = num_list_np_array % 2 == 1 +print(odd_ixs) -# or, Using a concept called `masking` -# Create a boolean list `is_odd` of the same length of `num_list` with `True` at the position of the odd values. 
-is_odd = [False, True, False, True] # Mask array -odd_from_list = num_list_np_array[is_odd] # only the values at the position of `True` remain +# Now, use a concept called `masking` to index into the numerical array +odd_from_list = num_list_np_array[num_list_np_array % 2 == 1] print(odd_from_list) ``` @@ -622,7 +588,7 @@ print(odd_from_list) ### The range function -In addition to explicitly creating lists as we have above it is very common to create and populate them automatically using the `range()` function in combination with the `list()` function +In addition to explicitly creating lists as we have above, it is very common to create and populate them automatically using the `range` function in combination with the `list` function: ```python list5 = list(range(5)) @@ -633,17 +599,15 @@ print(list5) [0, 1, 2, 3, 4] ``` -Unless told not to `range()` returns a sequence which starts at 0, counts up by 1 and ends 1 before the value of the provided parameter. +Unless told otherwise, `range` returns a sequence which starts at 0, counts up by one, and ends one number _before_ the value of the provided parameter. This can be a cause of confusion. `range(5)` above does indeed have 5 values, but rather than being `1,2,3,4,5`, which you might have expected, they are `0,1,2,3,4`. -This can be a cause of confusion. `range(5)` above does indeed have 5 values, but rather than being 1,2,3,4,5 which you might naturally think, they are in fact 0,1,2,3,4. The range starts at 0 and stops one before the value of the single parameter we specified. - -If you want different sequences, then you can modify the behavior of the `range()` function by using additional parameters. +If you want different sequences, you can modify the behavior of the `range()` function by using additional parameters. 
```python -list6 = list(range(1, 9)) +list5 = list(range(1, 9)) +print(list5) +list6 = list(range(2, 12, 2)) print(list6) -list7 = list(range(2, 11, 2)) -print(list7) ``` ```output @@ -651,16 +615,14 @@ print(list7) [2, 4, 6, 8, 10] ``` -When you specify 3 parameters as we have for list(7); the first is start value, the second is one past the last value and the 3rd parameter is a step value by which to count. The step value can be negative - -`list7` produces the even numbers from 1 to 10. +When you specify three parameters, as we have for `list6`, the first is the start value (the first value in the list), the second is the stop value (the sequence ends before reaching it, so it is never included), and the third is the step or interval by which to count. So, for example, `list6` contains all even numbers from 2 up to, but not including, 12. ::::::::::::::::::::::::::::::::::::::: challenge ## Exercise -1. What is produced if you change the step value in `list7` to -2 ? Is this what you expected? -2. Create a list using the `range()` function which contains the even number between 1 and 10 in reverse order ([10,8,6,4,2]) +1. What happens if you change the step in `list6` to -2? +2. Create a list using the `range()` function which contains all even numbers between 1 and 10 (including 10) in reverse order (i.e., `[10, 8, 6, 4, 2]`). ::::::::::::::: solution @@ -674,15 +636,13 @@ list8 = list(range(10, 1, -2)) print(list8) ``` -list7 will print nothing because starting at 2 and incrementing by -2 is the wrong direction to 11. - - +The list from the first question will be empty because it isn't possible to start at 2 and reach a larger stop value by incrementing by -2 (which is the same as decreasing by 2). ::::::::::::::::::::::::: :::::::::::::::::::::::::::::::::::::::::::::::::: -The other main structured data type is the Dictionary. We will introduce this in a later episode when we look at JSON. +Another commonly used structured data type is a Dictionary. We will introduce this in a later lesson, when we look at JSON. 
:::::::::::::::::::::::::::::::::::::::: keypoints diff --git a/episodes/04-reusable.md b/episodes/04-reusable.md index 9b0ca0a5..5a23c0fe 100644 --- a/episodes/04-reusable.md +++ b/episodes/04-reusable.md @@ -21,20 +21,15 @@ exercises: 15 ## Defining a function -We have already made use of several Python builtin functions like `print`, `list` and `range`. +We have already made use of several Python built-in functions like `print`, `list`, and `range`. In addition to the functions provided by Python, you can write your own as well. Functions are used when a section of code needs to be repeated several times in a program; this saves you from rewriting it. In reality, you rarely need to repeat the _exact same_ code. Usually there will be some variation, for example in the variables the code needs to be run on. Because of this, when you create a function, you can specify a set of `parameters` or arguments to the function. -In addition to the functions provided by Python, you can write your own functions. - -Functions are used when a section of code needs to be repeated at various different points in a program. It saves you re-writing it all. In reality you rarely need to repeat the exact same code. Usually there will be some variation in variable values needed. Because of this, when you create a function you are allowed to specify a set of `parameters` which represent variables in the function. - -In our use of the `print` function, we have provided whatever we want to `print`, as a `parameter`. Typically whenever we use the `print` function, we pass a different `parameter` value. - -The ability to specify parameters make functions very flexible. +When we used the `print` function, we provided the text we wanted to `print` as a `parameter`. Typically, whenever we use the `print` function, we pass a different `parameter` value. The ability to specify parameters makes functions very flexible. 
```python def get_item_count(items_str,sep): ''' - This function takes a string with a list of items and the character that they're separated by and returns the number of items + This function takes a string with a list of items and the character that separates the items, and returns the number of items in the list + ''' items_list = items_str.split(sep) num_items = len(items_list) @@ -52,22 +47,21 @@ print(get_item_count(items_owned,';')) Points to note: -1. The definition of a function (or procedure) starts with the def keyword and is followed by the name of the function with any parameters used by the function in parentheses. -2. The definition clause is terminated with a `:` which causes indentation on the next and subsequent lines. All of these lines form the statements which make up the function. The function ends after the indentation is removed. -3. Within the function, the parameters behave as variables whose initial values will be those that they were given when the function was called. -4. functions have a return statement which specifies the value to be returned. This is the value assigned to the variable on the left-hand side of the call to the function. (power in the example above) -5. You call (run the code) of a function simply by providing its name and values for its parameters the same way you would for any builtin function. +1. The definition of a function (or procedure) starts with the keyword `def` and is followed by the name you wish to give to the function, with any parameters used by the function in between parentheses. +2. The definition clause ends in `:`, which causes indentation on the next and subsequent lines. All of these lines are the statements which make up the function. The function ends where the indentation ends. +3. Within the function, the parameters behave as variables whose initial values will be those that were given when the function was called. +4. 
Functions usually "return" something, which is the result of the procedure applied to the parameters, and is the value assigned to the variable on the left-hand side of the call to the function. This is specified using the `return` keyword. +5. You call (run the code of) a function by providing its name and values for its parameters, the same way you would for any built-in function. 6. Once the definition of the function has been executed, it becomes part of Python for the current session and can be used anywhere. -7. Like any other builtin function you can use `shift` + `tab` in Jupyter to see the parameters. -8. At the beginning of the function code we have a multiline `comment` denoted by the `'''` at the beginning and end. This kind of comment is known as a `docstring` and can be used anywhere in Python code as a documentation aid. It is particularly common, and indeed best practice, to use them to give a brief description of the function at the beginning of a function definition in this way. This is because this description will be displayed along with the parameters when you use the help() function or `shift` + `tab` in Jupyter. -9. The variable `x` defined within the function only exists within the function, it cannot be used outside in the main program. +7. At the beginning of the function code we have a multiline `comment` denoted by the `'''` at the beginning and end. This kind of comment is known as a `docstring` and can be used anywhere in Python code as a documentation aid. It is particularly common, and indeed best practice, to use a docstring to give a brief description of the function. This is because the description will be displayed along with the parameters when you use the `help()` function or `shift` + `tab` in Jupyter. +8. Variables that are defined within a function only exist within the function itself; they cannot be used outside in the main program.
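The points above about return values, docstrings, and variables that only exist inside a function can be sketched as follows. This is only an illustration: `count_words` and `n_words` are hypothetical names that do not appear elsewhere in the lesson.

```python
def count_words(sentence, sep):
    '''
    Hypothetical helper: returns the number of pieces obtained by
    splitting sentence on the separator sep.
    '''
    words = sentence.split(sep)  # 'words' only exists inside the function
    return len(words)

# The returned value is assigned to the variable on the left-hand side
n_words = count_words('the quick brown fox', ' ')
print(n_words)

# The docstring is stored with the function; this is the text help() displays
print(count_words.__doc__)

# 'words' was defined inside the function, so it is not available out here
try:
    print(words)
except NameError as err:
    print('NameError:', err)
```

The `try`/`except` is only there so that the cell keeps running; without it, the reference to `words` would stop execution with a `NameError`.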
-In our `get_item_count` function we have two parameters which must be provided every time the function is used. You need to provide the parameters in the right order or to explicitly name the parameter you are referring to and use the `=` sign to give it a value. +Our `get_item_count` function has two parameters which must be provided every time the function is called. You need to provide the parameters in the right order or to explicitly name the parameter you are referring to and use the `=` sign to give it a value. -In many cases of functions we want to provide default values for parameters so the user doesn't have to. We can do this in the following way +In many cases, there is a value for a certain parameter that is more likely than others. In that case the value can be set as "default", and that is the value that will be taken if the user does not specify a value. ```python -def get_item_count(items_str,sep=';'): +def get_item_count(items_str, sep=';'): ''' This function takes a string with a list of items and the character that they're separated by and returns the number of items ''' @@ -76,14 +70,14 @@ def get_item_count(items_str,sep=';'): return num_items -print(get_item_count(items_owned)) +print(get_item_count(items_owned)) # Note that the separator is not specified ``` ```output 4 ``` -The only change we have made is to provide a default value for the `sep` parameter. Now if the user does not provide a value, then the value of 2 will be used. Because `items_str` is the first parameter we can specify its value by position. We could however have explicitly named the parameters we were referring to. +The only change we made is to provide a default value for the `sep` parameter. Now if the user does not provide a value, then the value _;_ will be used. Because `items_str` is the first parameter, we can specify its value by position, without having to explicitly name it, but it could be clearer to explicitly name all parameters. 
```python print(get_item_count(items_owned, sep = ',')) @@ -97,91 +91,92 @@ print(get_item_count(items_str = items_owned, sep=';')) ::::::::::::::::::::::::::::::::::::::: challenge -## Volume of a cube +## Exercise -1. Write a function definition to calculate the volume of a cuboid. The function will use three parameters `h`, `w` - and `l` and return the volume. +1. Write a function definition to create an identifier for each survey participant. +The function requires three parameters: `first_name`, `surname`, and their six-digit staff ID number (`id`, given as a string); and returns an identifier formed by the last letter of the first name, the two middle numbers of the staff ID, and the length of the surname. -2. Supposing that in addition to the volume I also wanted to calculate the surface area and the sum of all of the edges. Would I (or should I) have three separate functions or could I write a single function to provide all three values together? +2. Suppose that in addition to the identifier you also wanted to generate a username that each participant could use to log into a platform where you will display their results, formed by their first name initial plus the whole surname, all in lowercase; and also return their full name as one string. +Would you (or should you) have three separate functions or could you write a single function to provide all three values together? ::::::::::::::: solution ## Solution -- A function to calculate the volume of a cuboid could be: +1. A function to calculate the unique identifier as described in the exercise could be: ```python -def calculate_vol_cuboid(h, w, len): +def generate_identifier(first_name, surname, id): """ - Calculates the volume of a cuboid. - Takes in h, w, len, that represent height, width, and length of the cube. - Returns the volume. + Generates an identifier formed by: the last letter of the first name, the two middle numbers of the staff ID, and the length of their surname.
+ Takes in first_name, surname, and id; and returns the identifier. """ - volume = h * w * len - return volume + identifier = first_name[-1] + id[2:4] + str(len(surname)) + return identifier ``` -- It depends. As a rule-of-thumb, we want our function to **do one thing and one thing only, and to do it well.** - If we always have to calculate these three pieces of information, the 'one thing' could be - 'calculate the volume, surface area, and sum of all edges of a cube'. Our function would look like this: +2. It depends. As a rule-of-thumb, functions should __do one thing and one thing only, and do it well.__ +If you always need these three pieces of information together, the 'one thing' could be +'for each participant, generate an identifier, username, and full name'. In that case, your function could look like this: ```python # Method 1 - single function -def calculate_cuboid(h, w, len): +def generate_user_attributes(first_name, surname, id): """ - Calculates information about a cuboid defined by the dimensions h(eight), w(idth), and len(gth). - - Returns the volume, surface area, and sum of edges of the cuboid. + Generates attributes needed for survey participant. + Takes in first_name, surname, and id; and returns an identifier, username, and full name. """ - volume = h * w * len - surface_area = 2 * (h * w + h * len + len * w) - edges = 4 * (h + w + len) - return volume, surface_area, edges + identifier = first_name[-1] + id[2:4] + str(len(surname)) + username = (first_name[0] + surname).lower() + full_name = first_name + " " + surname + return identifier, username, full_name ``` -It may be better, however, to break down our function into separate ones - one for each piece of information we are -calculating. Our functions would look like this: +It may be better, however, to break your function down: one for each piece of information you are +generating. 
 Your functions could look like this: ```python # Method 2 - separate functions -def calc_volume_of_cuboid(h, w, len): +def gen_identifier(first_name, surname, id): """ - Calculates the volume of a cuboid defined by the dimensions h(eight), w(idth), and len(gth). + Generates an identifier formed by: the last letter of the first name, the two middle numbers of the staff ID, and the length of their surname. + Takes in first_name, surname, and id; and returns the identifier. """ - volume = h * w * len - return volume + identifier = first_name[-1] + id[2:4] + str(len(surname)) + return identifier -def calc_surface_area_of_cuboid(h, w, len): +def gen_username(first_name, surname): """ - Calculates the surface area of a cuboid defined by the dimensions h(eight), w(idth), and len(gth). + Generates a username formed by: their first name initial plus the whole surname, all in lowercase. + Takes in first_name and surname, and returns the username. """ - surface_area = 2 * (h * w + h * len + len * w) - return surface_area + username = (first_name[0] + surname).lower() + return username -def calc_sum_of_edges_of_cuboid(h, w, len): +def display_full_name(first_name, surname): """ - Calculates the sum of edges of a cuboid defined by the dimensions h(eight), w(idth), and len(gth). + Builds the participant's full name. + Takes in first_name and surname, and returns the full name. """ - sum_of_edges = 4 * (h + w + len) - return sum_of_edges + full_name = first_name + " " + surname + return full_name ``` -We could then rewrite our first solution: +We could then rewrite our first function that returns all attributes needed: ```python -def calculate_cuboid(h, w, len): +def gen_attributes(first_name, surname, id): """ - Calculates information about a cuboid defined by the dimensions h(eight), w(idth), and len(gth). - - Returns the volume, surface area, and sum of edges of the cuboid. + Generates attributes needed for a survey participant.
+ Takes in first_name, surname, and id; and returns an identifier, username, and full name. """ - volume = calc_volume_of_cuboid(h, w, len) - surface_area = calc_surface_area_of_cuboid(h, w, len) - edges = calc_sum_of_edges_of_cuboid(h, w, len) + identifier = gen_identifier(first_name, surname, id) + username = gen_username(first_name, surname) + full_name = display_full_name(first_name, surname) - return volume, surface_area, edges + return identifier, username, full_name ``` ::::::::::::::::::::::::: @@ -190,31 +185,27 @@ def calculate_cuboid(h, w, len): ## Using libraries -The functions we have created above only exist for the duration of the session in which they have been defined. If you start a new Jupyter notebook you will have to run the code to define them again. - -If all of your code is in a single file or notebook this isn't really a problem. - -There are however many (thousands) of useful functions which other people have written and have made available to all Python users by creating libraries (also referred to as packages or modules) of functions. - -You can find out what all of these libraries are and their contents by visiting the main (python.org) site. +The functions we have created above only exist within the Jupyter notebook in which they have been defined, and only for the duration of the session. If you start a new Jupyter notebook you will have to copy and paste the functions in to define them again. If all of your code is in a single file or notebook this isn't really a problem. But if your project gets larger, it can be hard to keep track of where each function is saved. -We need to go through a 2-step process before we can use them in our own programs. +There are many (thousands) of useful functions which other people have written and have made available to all Python users by creating libraries (also referred to as packages or modules) of functions. 
You can find out more about existing Python packages by visiting [pypi.org/](https://pypi.org/). -Step 1. use the `pip` command from the commandline. `pip` is installed as part of the Python install and is used to fetch the package from the Internet and install it in your Python configuration. +There are several ways to install third party packages to be able to use them in your own code. If you have Python 3.4 or later, it includes by default a package installer called [pip](https://pypi.org/project/pip/), which can be used to install packages. From a Jupyter notebook, you would use the syntax: -```bash -$ pip install +```python +!pip install ``` -pip stands for Python install package and is a commandline function. Because we are using the Anaconda distribution of Python, all of the packages that we will be using in this lesson are already installed for us, so we can move straight on to step 2. - -Step 2. In your Python code include an `import package-name` statement. Once this is done, you can use all of the functions contained within the package. - -As all of these packages are produced by 3rd parties independently of each other, there is the strong possibility that there may be clashes in function names. To allow for this, when you are calling a function from a package that you have imported, you do so by prefixing the function name with the package name. This can make for long-winded function names so the `import` statement allows you to specify an `alias` for the package name which you must then use instead of the package name. +After installing the package, you still need to "import" the package into your notebook to be able to use the functions contained within the package. This is done by running: +```python +import +``` -In future episodes, we will be importing the `csv`, `json`, `pandas`, `numpy` and `matplotlib` modules. We will describe their use as we use them. 
+As all of these packages are produced by third parties independently of each other, there is a strong possibility of clashes in function names; that is, two different packages may contain functions with exactly the same name. Therefore, when you are calling a function from a package that you have imported, you prefix the function name with the package name, which makes it clear which function you are expecting to run. This can make for long-winded function names, though! The `import` statement also allows you to specify an "alias" for the package, which you must then use instead of the full package name. For example: +```python +import numpy as np +``` -The code that we will use is shown below +Many aliases (specified after the `as` keyword) are nearly universally adopted conventions used for very popular libraries, and you will almost certainly come across them when searching for example code. In future lessons, we will be importing the `csv`, `json`, `pandas`, `numpy`, and `matplotlib` modules, which we will describe as we use them. The code that we will use to import these packages is: ```python import csv @@ -224,9 +215,7 @@ import numpy as np import matplotlib.pyplot as plt ``` -The first two we don't alias as they have short names. The last three we do. Matplotlib is a very large library broken up into what can be thought of as sub-libraries. As we will only be using the functions contained in the `pyplot` sub-library we can specify that explicitly when we import. This saves time and space. It does not effect how we call the functions in our code. - -The `alias` we use (specified after the `as` keyword) is entirely up to us. However those shown here for `pandas`, `numpy` and `matplotlib` are nearly universally adopted conventions used for these popular libraries. If you are searching for code examples for these libraries on the Internet, using these aliases will appear most of the time.
+Matplotlib is a very large library broken up into what can be thought of as sub-libraries. As we will only be using the functions contained in the `pyplot` sub-library we can specify that explicitly when we import. This saves time and space, and does not affect how we call the functions in our code. :::::::::::::::::::::::::::::::::::::::: keypoints