Commit

updates - 3.3, 4.4, 9, 15
campbelle1 committed Aug 22, 2023
1 parent e93f293 commit 4e0dc89
Showing 25 changed files with 4,161 additions and 2,138 deletions.
175 changes: 147 additions & 28 deletions textbook/03/3/DataTypes.ipynb
Expand Up @@ -8,7 +8,7 @@
"# Data Types\n",
"*Evelyn Campbell, Ph.D.*\n",
"\n",
"Python offers a number of different data types that can be manipulated and used by various functions. Some important built-in Python data types include booleans, strings, integers, and floats. These data types can be used to build various data structures, such as lists, dictionaries, arrays, and dataframes, which will be covered in Chapters 4 and 6. Here we will explore each data type and corresponding functions that are useful when working with these data types."
"Python offers a number of different data types that can be manipulated and used by various functions. Some important built-in Python data types include **booleans**, **strings**, **integers**, and **floats**. These data types can be used to build various data structures, such as lists, dictionaries, arrays, and dataframes, which will be covered in Chapters [4](../4/DataStructures.ipynb) and [6](../6/DataFrames.ipynb). Here we will explore each data type and corresponding functions that are useful when working with these data types."
]
},
{
Expand All @@ -18,12 +18,12 @@
"source": [
"## Booleans\n",
"\n",
"Booleans are a data type that consists of two possible outcomes: `True` or `False`. Under the hood, these values take on a binary value, where `True` is equal to 1 and `False` is equal to 0. Booleans are very commonly used with comparison operators, and because they also can have a numeric meaning, they can be used in calculations as well. Let's start with a simple example of a Boolean."
"Booleans are a data type that consist of two possible outcomes: `True` or `False`. Under the hood, these values take on a binary value, where `True` is equal to 1 and `False` is equal to 0. Booleans are very commonly used with comparison operators ([discussed more in section 3.4](../3/4/Comparisons.ipynb)), and because they also can have a numeric meaning, they can be used in calculations as well. Let's start with a simple example of a Boolean."
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 5,
"id": "0856f3c4",
"metadata": {},
"outputs": [
Expand All @@ -33,19 +33,29 @@
"False"
]
},
"execution_count": 1,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"boolval = 2 + 5 < 3 + 1\n",
"boolval = 5 < 3\n",
"boolval"
]
},
{
"cell_type": "markdown",
"id": "061f80a6",
"metadata": {},
"source": [
"Above, the variable `boolval` is equated to the expression `5 < 3`, which reads \"5 is less than 3.\" Because 5 is not in fact less than 3, the entire statement is `False`, and this Boolean value is assigned to `boolval`.\n",
"\n",
"Below, we add 5 to the value of `boolval`. Recall that `False` has a numerical value of 0, so essentially, `boolval + 5` is the same as 0 + 5:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 6,
"id": "9474bdc6",
"metadata": {},
"outputs": [
Expand All @@ -55,7 +65,7 @@
"5"
]
},
"execution_count": 2,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
Expand All @@ -65,6 +75,14 @@
"boolval"
]
},
{
"cell_type": "markdown",
"id": "071ba5da",
"metadata": {},
"source": [
"Using the variable directly in a comparison expression, we can see that the value of `boolval` is less than 10, and thus returns another Boolean value of `True`:"
]
},
{
"cell_type": "code",
"execution_count": 3,
Expand All @@ -91,7 +109,9 @@
"id": "37df7040",
"metadata": {},
"source": [
"The `bool()` function converts an input (i.e. a numeric value, string, or even data structures) to a boolean value."
"Python has built-in **functions** that use values and variables as input to perform a task and produce an output. We have already used some basic functions, such as the `print()` function, and we will learn about a few more that are associated with datatypes. Built-in functions will be further discussed in section [section 3.4](../3/5/IntroFunctions.ipynb). \n",
"\n",
"For now, we will use a few basic functions associated with data types. The `bool()` function converts an input (i.e. a numeric value, string, or even data structures) to a boolean value."
]
},
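As a minimal sketch of how `bool()` treats different inputs, the snippet below converts a handful of illustrative values. The variable names (`examples`, `truthiness`, `count_true`) are hypothetical and not part of the notebook.

```python
# Illustrative inputs only; these names do not appear in the notebook.
examples = [0, 1, -3.5, "", "text", [], [0]]

# bool() gives False for zero and for empty strings or containers,
# and True for everything else in this list.
truthiness = [bool(value) for value in examples]
print(truthiness)

# Because True behaves like 1 and False like 0, summing Booleans
# counts how many values were True.
count_true = sum(truthiness)
print(count_true)
```

Note that a list containing a single zero, `[0]`, is still `True`, because the list itself is not empty.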
{
Expand Down Expand Up @@ -127,7 +147,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 2,
"id": "96368899",
"metadata": {},
"outputs": [
Expand All @@ -142,7 +162,7 @@
],
"source": [
"something = 6542\n",
"nothing = 0 # an empty list\n",
"nothing = 0\n",
"print(bool(something))\n",
"print(bool(nothing))"
]
Expand All @@ -154,12 +174,14 @@
"source": [
"## Strings\n",
"\n",
"Strings are a data type that can consist of concatenated alphanumeric and punctuation characters. Strings are recognized by Python through the use of single (' ') or double (\" \") quotation marks. "
"A **string** a data type that can consist of **concatenated** alphanumeric and punctuation characters. According to the Merriam-Webster dictionary, to concatenate means *to link together in a series or chain*.\n",
"\n",
"Strings are recognized by Python through the use of single (' '), double (\" \"), or triple (''' ''') quotation marks. "
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 7,
"id": "9868b82e",
"metadata": {},
"outputs": [
Expand All @@ -186,7 +208,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 8,
"id": "4e147421",
"metadata": {},
"outputs": [
Expand All @@ -204,7 +226,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 9,
"id": "5b4ec625",
"metadata": {
"tags": [
Expand All @@ -214,28 +236,86 @@
"outputs": [
{
"ename": "SyntaxError",
"evalue": "unterminated string literal (detected at line 1) (3546504085.py, line 1)",
"evalue": "invalid syntax (3546504085.py, line 1)",
"output_type": "error",
"traceback": [
"\u001b[0;36m Cell \u001b[0;32mIn [8], line 1\u001b[0;36m\u001b[0m\n\u001b[0;31m print('This isn't easy.')\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m unterminated string literal (detected at line 1)\n"
"\u001b[0;36m Input \u001b[0;32mIn [9]\u001b[0;36m\u001b[0m\n\u001b[0;31m print('This isn't easy.')\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n"
]
}
],
"source": [
"print('This isn't easy.')"
]
},
{
"cell_type": "markdown",
"id": "101ed527",
"metadata": {},
"source": [
"The above error can be fixed by an **escape sequence**. Escape sequences are string modifiers that allow for the use of certain characters that would otherwise be misinterpreted by Python. Because strings are created by the use of quotes, the escape sequences `\\'` and `\\\"` allow for the use of quotes as part of a string:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "18da1844",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"This isn't easy.\n"
]
}
],
"source": [
"print('This isn\\'t easy.')"
]
},
{
"cell_type": "markdown",
"id": "2dae3eff",
"metadata": {},
"source": [
"Other useful escape sequences include `\\n` and `\\t`. These allow for a new line and tab spacing to be added to a string, respectively."
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "241848e8",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"This is the first sentence \n",
"This is the second sentence! \tThis is the third sentence?\n"
]
}
],
"source": [
"sentences = '''This is the first sentence \\nThis is the second sentence! \\tThis is the third sentence?'''\n",
"print(sentences)"
]
},
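The escape sequences above are one way to handle an apostrophe. As a hedged sketch, the snippet below compares it with two alternatives that rely only on the choice of quotation marks. The variable names are hypothetical.

```python
# Three ways to include an apostrophe in a string.
escaped = 'This isn\'t easy.'           # escape sequence inside single quotes
double_quoted = "This isn't easy."      # double quotes outside, so no escape is needed
triple_quoted = '''This isn't easy.'''  # triple quotes also allow a lone apostrophe

# All three spellings produce the same string.
all_equal = escaped == double_quoted == triple_quoted
print(all_equal)

# \n and \t each count as a single character once the string is created.
newline_length = len('a\nb')
tab_length = len('a\tb')
print(newline_length, tab_length)
```

Choosing the outer quotation marks so that they differ from the quotes inside the text often avoids the need for escaping altogether.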
{
"cell_type": "markdown",
"id": "a46d5107",
"metadata": {},
"source": [
"Strings can be used in simple additive mathematical operations, like addition and multiplication."
"Strings can be used in simple additive mathematical operations, like addition and multiplication, resulting in concatenation of the strings:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 10,
"id": "3b40b622",
"metadata": {},
"outputs": [
Expand Down Expand Up @@ -265,7 +345,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 11,
"id": "c57fbd1b",
"metadata": {},
"outputs": [
Expand All @@ -283,12 +363,45 @@
"print(words, words)"
]
},
{
"cell_type": "markdown",
"id": "ca3e1bc1",
"metadata": {},
"source": [
"Escape sequences also can be used in the `print()` function as an argument or through concatenation:"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "74b515d1",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"This is a sentence. \t This isn't easy.\n",
"\n",
"\n",
"This isn't easy.\tThis is a sentence.\n"
]
}
],
"source": [
"print(words, '\\t', 'This isn\\'t easy.') # Escape sequence used as an argument in the print function\n",
"print('\\n') # Escape sequence used to print a blank line\n",
"print('This isn\\'t easy.' + '\\t' + words) # Escape sequence concatenated to strings in the print function"
]
},
{
"cell_type": "markdown",
"id": "396964f1",
"metadata": {},
"source": [
"When manipulating string variables, data scientist will often used what are called *methods*. A method is piece of code that is associated with a defined variable, as opposed to a *function* which uses defined variables as input parameters. Functions will be further discussed in the upcoming section.\n",
"When manipulating string variables, data scientists will often use what are called **methods**. A method is piece of code that is associated with a defined variable, as opposed to a **function** which uses defined variables as input arguments for parameters. Functions will be further discussed in the upcoming section.\n",
"\n",
"\n",
"Some methods can be used on strings to quickly and efficiently alter them. A few include the `.upper()`, `.lower()`, `.capitalize()`, `.title()`, and `.swapcase()` methods. There are many others, but these few are great to start exploring the different ways string variables can be manipulated:"
Expand Down Expand Up @@ -382,7 +495,7 @@
"id": "01d68864",
"metadata": {},
"source": [
"We can confirm that these are indeed strings by calling these variables into the `type()` function, which can be used on any variable to check its data type."
"We can confirm that these are indeed strings by calling the `type()` function on these variables, which can be used on any variable to check its data type."
]
},
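A short sketch of the string methods named in this section, followed by a `type()` check, might look like the following. The sample text and variable names are illustrative assumptions, not values taken from the notebook.

```python
# Hypothetical sample string for illustration.
phrase = "data Science"

upper_case = phrase.upper()         # every letter upper case
lower_case = phrase.lower()         # every letter lower case
capitalized = phrase.capitalize()   # first letter upper case, the rest lower case
title_case = phrase.title()         # first letter of each word upper case
swapped = phrase.swapcase()         # upper and lower case exchanged

print(upper_case, lower_case, capitalized, title_case, swapped, sep=" | ")

# str() converts a number to a string; type() confirms the result.
number_as_text = str(3.14)
print(type(phrase), type(number_as_text))
```

Each method returns a new string and leaves `phrase` unchanged, since strings cannot be modified in place.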
{
Expand Down Expand Up @@ -412,7 +525,7 @@
"id": "c8160dd5",
"metadata": {},
"source": [
"Keep in mind that when a numerical value is converted to a string, it can no longer be used to perform advanced mathematical calculations, such as division, subtraction, or exponentiation."
"Keep in mind that when a numerical value is converted to a string, it can no longer be used to perform certain mathematical calculations, such as division, subtraction, or exponentiation."
]
},
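To make the limitation concrete, here is a minimal sketch of what happens when arithmetic operators are applied to a number that has been converted to a string. The variable names are hypothetical.

```python
number_text = str(5)  # '5' is now a string, not a number

doubled = number_text + number_text  # concatenation, not addition
tripled = number_text * 3            # repetition, not multiplication
print(doubled, tripled)

# Division is not defined for strings, so Python raises a TypeError.
try:
    number_text / 2
    division_failed = False
except TypeError:
    division_failed = True
print(division_failed)

# Converting back to a number restores ordinary arithmetic.
restored = int(number_text) / 2
print(restored)
```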
{
Expand Down Expand Up @@ -485,7 +598,7 @@
"source": [
"## Integers & Floats\n",
"\n",
"Integers and floats are numerical data types that are often used to perform mathematical operations. Integers consists of whole numbers, while floats consists of whole numbers with floating decimal places. Floats can hold up to 15 significant figures following the decimal point and can be used to obtain more accurate calculations. However, it is easier and faster for a computer to do calculations using integers. Thus, one must weigh the pros and cons of using these data types when doing calculations and writing functions to obtain outcomes that are most aligned with their end goals. Let's take a look at these data types in use.\n"
"Integers and floats are numerical data types that are often used to perform mathematical operations. Integers consist of whole numbers, while floats consist of whole numbers with floating decimal places. Floats can hold up to 15 significant figures following the decimal point and can be used to obtain more accurate calculations. However, it is easier and faster for a computer to do calculations using integers. Thus, one must weigh the pros and cons of using these data types when doing calculations and writing functions to obtain outcomes that are most aligned with their end goals. Let's take a look at these data types in use.\n"
]
},
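The trade-off between the two numeric types can be sketched as follows. This is an illustrative example under the assumption of a standard Python build that uses double-precision floats. The variable names are hypothetical.

```python
import math
import sys

# Number of decimal digits a float is guaranteed to preserve on this platform.
print(sys.float_info.dig)

# Floats are approximations, so some simple sums are not exact.
total = 0.1 + 0.2
is_exact = total == 0.3
print(total, is_exact)

# Comparing floats with a tolerance is usually safer than using ==.
is_close = math.isclose(total, 0.3)
print(is_close)

# Integer arithmetic is exact, even for very large values.
big = 10**20 + 1
print(big)
```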
{
Expand Down Expand Up @@ -596,7 +709,7 @@
"id": "0b7c63c9",
"metadata": {},
"source": [
"We can see that the conversion of an integer to a float simply adds one significant figure after the decimal place. Moreover, converting a float to an integer rounds the number <u>down</u> to the nearest whole number. We can also convert numerical values in strings and boolean data types to integers and floats"
"We can see that the conversion of an integer to a float simply adds one significant figure after the decimal place. Moreover, converting a float to an integer rounds the number *down* to the nearest whole number. We can also convert numerical values in strings and boolean data types to integers and floats"
]
},
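The conversions described here can be sketched with a few illustrative values. The names are hypothetical, and `math.floor()` is included only for contrast with `int()`.

```python
import math

as_float = float(7)              # integer to float
truncated_pos = int(3.9)         # fractional part is discarded
truncated_neg = int(-3.9)        # truncation goes toward zero, not downward
floored_neg = math.floor(-3.9)   # floor always rounds downward
print(as_float, truncated_pos, truncated_neg, floored_neg)

# Numeric strings and Booleans can also be converted.
from_text_int = int("42")
from_text_float = float("3.5")
from_true = int(True)
from_false = float(False)
print(from_text_int, from_text_float, from_true, from_false)
```

For negative values, `int()` and `math.floor()` give different results, so the choice matters when rounding direction is important.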
{
Expand Down Expand Up @@ -662,10 +775,16 @@
"id": "e37e3887",
"metadata": {},
"source": [
"## Conclusions\n",
"\n",
"In this section, we learned about various different data types. These include the `boolean`, `string`, `int`, and `float` data types. As you become more acquainted with Python, you will see the ubiquity of these data types in many data structures, which we will discuss in upcoming chapters. For now, explore these data types and relevant functions to learn how and when these data types can be used. Happy coding!\n"
"By understanding data types, we can begin to use them in other analyses and functionalities in Python. Next, we will learn how to use data types in comparisons, which can help further down the line in functions ([Chapter 3.5](../3/5/IntroFunctions.ipynb)), for loops ([Chapter 5.3](../../05/3/Control_Statements_Iteration.ipynb)), and subsetting data from DataFrames ([Chapter 5.3](../../06/6/Select_Condition.ipynb))."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "23d5baaa",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
Expand All @@ -684,7 +803,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.9.12"
}
},
"nbformat": 4,
Expand Down