{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "TyD2gu6xTg3B"
},
"source": [
"# 1. Introduction to Python 🐍"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zbJ2fw29OHAX"
},
"source": [
"This notebook was originally prepared by [Mathieu Blondel](https://mblondel.org/), with a few modifications made by us."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JbVT11Y8CbAu"
},
"source": [
"## Goals of this exercise 🌟"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kw1B2R_WCU7X"
},
"source": [
"\n",
"\n",
"* We will learn about the programming language Python as well as [NumPy](https://numpy.org/) and [Matplotlib](https://matplotlib.org/), two fundamental tools for data science and machine learning in Python.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "YV4sYCn8EPwb"
},
"source": [
"## What are Jupyter Notebooks and Google Colab? 🤔"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2HuuwOcQEYQI"
},
"source": [
"Notebooks are a great way to mix executable code with rich content (HTML, images, equations written in LaTeX). Google Colab lets you run notebooks in the cloud for free, without any prior installation, while leveraging the power of [GPUs](https://en.wikipedia.org/wiki/Graphics_processing_unit)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "t0XmMkPuGGOs"
},
"source": [
"The document you are reading is not a static web page but an interactive environment called a notebook, which lets you write and execute code. Notebooks contain so-called code cells, blocks of one or more Python instructions. For example, here is a code cell that stores the result of a computation (the number of seconds in a day) in a variable and displays its value:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "qeGAvLq1ALJ4",
"outputId": "98125183-7dbb-434d-d37f-636f211e9908"
},
"outputs": [
{
"data": {
"text/plain": [
"86400"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"seconds_in_a_day = 24 * 60 * 60\n",
"seconds_in_a_day"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "njoPs3q1G75-"
},
"source": [
"Click on the \"play\" button to execute the cell. You should see the result appear below it. Alternatively, you can execute the cell by pressing `Ctrl + Enter` on Windows/Linux or `Command + Enter` on a Mac."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "NvdYaBgHHdbw"
},
"source": [
"Variables defined in one cell can later be used in other cells:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "s1kp5Zv0JBSx",
"outputId": "bb15ad08-0397-4662-f9b8-b0985b898976"
},
"outputs": [
{
"data": {
"text/plain": [
"604800"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"seconds_in_a_week = 7 * seconds_in_a_day\n",
"seconds_in_a_week"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uALfY4q1JFQ0"
},
"source": [
"Note that the order of execution is important. For instance, if we do not run the cell storing `seconds_in_a_day` beforehand, the above cell will raise an error, as it depends on this variable. To make sure that you run all the cells in the correct order, you can also click on `Runtime` in the top-level menu, then `Run all`."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HvXs_KaoRWQ9"
},
"source": [
"### Exercise - code cells ❗❗\n",
"\n",
"Add a code cell below this cell: \n",
"\n",
"1. Click on this cell\n",
"2. Click on `+ Code`\n",
"3. In the new cell, compute the number of seconds in a year by reusing the variable `seconds_in_a_day`. Run the new cell.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qyLSwlxnJqXX"
},
"source": [
"## Python 🐍"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3ltwm91eJyQM"
},
"source": [
"Python is one of the most popular programming languages for machine learning, both in academia and in industry. Learning it is therefore essential for anyone interested in machine learning. In this section, we will review the basics of Python."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3EKvP6jiMZ9H"
},
"source": [
"### Arithmetic operations"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DDjs0-7YQ80h"
},
"source": [
"Python supports the usual arithmetic operators: `+` (addition), `-` (subtraction), `*` (multiplication), `/` (division), `**` (power) and `//` (integer division, which rounds the result down)."
]
},
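{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a small illustrative sketch (the numbers and variable names below are our own choice for this example), the following cell shows how the two division operators differ: `/` always returns a float, while `//` rounds the result down to an integer."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative values, chosen only for this example.\n",
"true_division = 7 / 2    # float result\n",
"floor_division = 7 // 2  # rounded down to an integer\n",
"power = 2 ** 10          # 2 raised to the power 10\n",
"print(true_division, floor_division, power)"
]
},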
{
"cell_type": "markdown",
"metadata": {
"id": "UhcbBQUiStHG"
},
"source": [
"### Lists"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "RkPn1IjNTCxA"
},
"source": [
"Lists are a container type for ordered sequences of elements. Lists can be initialized empty"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"id": "OrnV1ySAPtHp"
},
"outputs": [],
"source": [
"my_list = []"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "OwRqyYI9XnPK"
},
"source": [
"or with some initial elements"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "Uq5YTJ1JXpOX"
},
"outputs": [],
"source": [
"my_list = [1, 2, 3]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Hk2WmojJXyyz"
},
"source": [
"Lists have a dynamic size and elements can be added (appended) to them"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "QFTNqiYiXxAh",
"outputId": "aeef5f18-37ba-43af-d79c-7aa769323268"
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, 3, 4]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list.append(4)\n",
"my_list"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IUnJuqQ2Yhzw"
},
"source": [
"We can access individual elements of a list (indexing starts from 0)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "pyFxyZPVYpG_",
"outputId": "e09e13f7-a3a3-4097-d89b-89801b15953b"
},
"outputs": [
{
"data": {
"text/plain": [
"3"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list[2]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hPMrIDYsdgMP"
},
"source": [
"We can access \"slices\" of a list using `my_list[i:j]` where `i` is the start of the slice (again, indexing starts from 0) and `j` is the end of the slice (the element at index `j` itself is excluded). For instance:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Ichf9p0gd7tJ",
"outputId": "68bec7a4-0bd6-455e-abb6-e281053468de"
},
"outputs": [
{
"data": {
"text/plain": [
"[2, 3]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list[1:3]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "KMbzH4tzQ9rI"
},
"source": [
"Omitting the second index means that the slice should run until the end of the list"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "O7wCthKnREKV",
"outputId": "29eacc7f-de0c-4bc5-cda6-ac675b7147a6"
},
"outputs": [
{
"data": {
"text/plain": [
"[2, 3, 4]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list[1:]"
]
},
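{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell below is a hedged sketch of a few more slicing patterns. It uses a separate list, `demo_list` (a hypothetical name introduced only here, so that `my_list` is left untouched), to show that the end index of a slice is excluded, that the first index can be omitted as well, and that negative indices count from the end of the list."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# demo_list is a hypothetical list used only for this illustration.\n",
"demo_list = [10, 20, 30, 40]\n",
"first_two = demo_list[0:2]      # indices 0 and 1; index 2 is excluded\n",
"also_first_two = demo_list[:2]  # omitting the first index starts at 0\n",
"last_element = demo_list[-1]    # negative indices count from the end\n",
"print(first_two, also_first_two, last_element)"
]
},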
{
"cell_type": "markdown",
"metadata": {
"id": "C5Aeu7PUebrK"
},
"source": [
"We can check if an element is in the list using `in`"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "a_P5NCi-efvb",
"outputId": "4198bcfe-839b-4a73-a755-1f15208aaddc"
},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"5 in my_list"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "LypIsP5gkl10"
},
"source": [
"The length of a list can be obtained using the `len` function"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "ac0FMsaKkrWc",
"outputId": "2bdbfe60-79c9-47f6-8678-f19c9100db91"
},
"outputs": [
{
"data": {
"text/plain": [
"4"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(my_list)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "1c3RLStf7G2I"
},
"source": [
"### Strings"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Jm6hZhgz7KhI"
},
"source": [
"Strings are used to store text. They can be defined using either single quotes or double quotes"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"id": "cCma6Oj_7T8n"
},
"outputs": [],
"source": [
"string1 = \"some text\"\n",
"string2 = 'some other text'"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Irr4xuWu7Znu"
},
"source": [
"Strings behave similarly to lists. As such, we can access individual characters in exactly the same way"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
},
"id": "26_POhLO7iM3",
"outputId": "564919bc-3368-4853-ad19-e0e435383e21"
},
"outputs": [
{
"data": {
"text/plain": [
"'e'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"string1[3]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "oA_UD0JV7oPw"
},
"source": [
"and similarly for slices"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
},
"id": "dcZFcLqQ7qCe",
"outputId": "c5ea492e-ed98-4b55-f0fe-e7970bfa36b0"
},
"outputs": [
{
"data": {
"text/plain": [
"'text'"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"string1[5:]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hOQ_CIiu76YG"
},
"source": [
"String concatenation is performed using the `+` operator"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
},
"id": "mxqNMKCY79_W",
"outputId": "5d66117b-1bab-4f90-a4c7-a895f50f3aaa"
},
"outputs": [
{
"data": {
"text/plain": [
"'some text some other text'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"string1 + \" \" + string2"
]
},
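{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because strings behave like lists, `len` and `in` work on them too. The cell below is a minimal sketch using a hypothetical variable `demo_string`; note that, for strings, `in` checks for a substring rather than a single element."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# demo_string is a hypothetical variable used only for this illustration.\n",
"demo_string = \"some text\"\n",
"string_length = len(demo_string)          # number of characters, spaces included\n",
"contains_text = \"text\" in demo_string     # substring test\n",
"contains_other = \"other\" in demo_string\n",
"print(string_length, contains_text, contains_other)"
]
},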
{
"cell_type": "markdown",
"metadata": {
"id": "7Lox2GZCMdIB"
},
"source": [
"### Conditionals"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-gXEAWFZfDTT"
},
"source": [
"As their name indicates, conditionals are a way to execute code depending on whether a condition is `True` or `False`. As in other languages, Python supports `if` and `else`, but `else if` is contracted into `elif`, as the example below demonstrates."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "xC_DMZjofoYZ",
"outputId": "0144d338-2824-41ea-a377-672a7fccc163"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"positive\n"
]
}
],
"source": [
"my_variable = 5\n",
"if my_variable < 0:\n",
"  print(\"negative\")\n",
"elif my_variable == 0:\n",
"  print(\"zero\")\n",
"else:  # my_variable > 0\n",
"  print(\"positive\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ag0SUokSf9jl"
},
"source": [
"Here `<` and `>` are the strict less-than and greater-than operators, while `==` is the equality operator (not to be confused with `=`, the variable assignment operator). The operators `<=` and `>=` can be used for less-than-or-equal and greater-than-or-equal comparisons, respectively."
]
},
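{
"cell_type": "markdown",
"metadata": {},
"source": [
"A comparison is itself an expression whose value is `True` or `False`, so it can be stored in a variable before being used in a conditional. The following sketch illustrates this with a hypothetical variable `threshold`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# threshold is a hypothetical value used only for this illustration.\n",
"threshold = 3\n",
"is_equal = threshold == 3    # equality test, evaluates to True\n",
"is_at_most = threshold <= 2  # evaluates to False\n",
"if is_equal and not is_at_most:\n",
"  print(\"threshold equals 3\")"
]
},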
{
"cell_type": "markdown",
"metadata": {
"id": "qTkQ2F_jy8wz"
},
"source": [
"Unlike many other languages, Python delimits blocks of code using indentation. Here, we use 2-space indentation, but many programmers use 4-space indentation, which is the convention recommended by PEP 8. Either is fine as long as you are consistent throughout your code."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "clWaFCzBMfkv"
},
"source": [
"### Loops"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_A5doqhTivWe"
},
"source": [
"Loops are a way to execute a block of code multiple times. There are two main types of loops: **while** loops and **for** loops."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "YN8lwTxQkGEa"
},
"source": [
"While loop"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "7-QXGqgOjsr_",
"outputId": "df68f7da-f690-4237-f79a-40e55e413719"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1\n",
"2\n",
"3\n",
"4\n"
]
}
],
"source": [
"i = 0\n",
"while i < len(my_list):\n",
"  print(my_list[i])\n",
"  i += 1  # equivalent to i = i + 1"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8mEI_ocfkSvZ"
},
"source": [
"For loop"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "2QObx5mckMcI",
"outputId": "8e898b21-c4cd-41d5-a5e7-a2b3b1c64bf5"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1\n",
"2\n",
"3\n",
"4\n"
]
}
],
"source": [
"for i in range(len(my_list)):\n",
"  print(my_list[i])"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XO6qqppikZvm"
},
"source": [
"If the goal is simply to iterate over a list, we can do so directly as follows"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "PjFKzN6zkeJ7",
"outputId": "4215706a-61a9-4d12-f4ff-1fa6724c077c"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1\n",
"2\n",
"3\n",
"4\n"
]
}
],
"source": [
"for element in my_list:\n",
"  print(element)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Cck4zwYrex02"
},
"source": [
"### Functions"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "n1PbIf_ohxFO"
},
"source": [
"To improve code readability, it is common to separate the code into different blocks, each responsible for performing a precise action: functions! 😃 A function takes some inputs and processes them to return some outputs."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "cImA09gOhRmx",
"outputId": "71b08a6b-6a00-4078-b4dd-f0b095297a30"
},
"outputs": [
{
"data": {
"text/plain": [
"36"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def square(x):\n",
"  return x ** 2\n",
"\n",
"def multiply(a, b):\n",
"  return a * b\n",
"\n",
"# Functions can be composed.\n",
"square(multiply(3, 2))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "75-5SOk9iYSt"
},
"source": [
"To improve code readability, it is sometimes useful to explicitly name the arguments"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "wkIUuZHhidI0",
"outputId": "48c2d1c2-bd45-4a86-d6aa-0b630561b3ba"
},
"outputs": [
{
"data": {
"text/plain": [
"36"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"square(multiply(a=3, b=2))"
]
},
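{
"cell_type": "markdown",
"metadata": {},
"source": [
"Naming the arguments also makes the call independent of their order. The sketch below uses a hypothetical function `subtract`, for which the order of the arguments matters, to illustrate the difference between positional and named arguments."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# subtract is a hypothetical function used only for this illustration.\n",
"def subtract(a, b):\n",
"  return a - b\n",
"\n",
"positional = subtract(3, 2)        # a=3, b=2\n",
"named = subtract(b=2, a=3)         # same call, arguments given in any order\n",
"swapped = subtract(2, 3)           # positional order matters: a=2, b=3\n",
"print(positional, named, swapped)"
]
},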
{
"cell_type": "markdown",
"metadata": {
"id": "LkpwbQEVMys2"
},
"source": [
"### Exercise - conditionals, loops and functions ❗❗"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ASpVhol9ZXI0"
},
"source": [
"**Exercise 1.** Using a conditional, write the [relu](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)) function defined as follows\n",
"\n",
"$\\text{relu}(x) = \\left\\{\n",
" \\begin{array}{rl}\n",
" x, & \\text{if } x \\ge 0 \\\\\n",
" 0, & \\text{otherwise }.\n",
" \\end{array}\\right.$"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"id": "jlgyu65SaUvr"
},
"outputs": [],
"source": [
"def relu(x):\n",
" # Write your function here\n",
" return\n",
"\n",
"relu(-3)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Y3so0ceoakIw"
},
"source": [
"**Exercise 2.** Using a for loop, write a function that computes the [Euclidean norm](https://en.wikipedia.org/wiki/Norm_(mathematics)#Euclidean_norm) of a vector, represented as a list."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"id": "-IH-BD41bI1u"
},
"outputs": [],
"source": [
"def euclidean_norm(vector):\n",
" # Write your function here\n",
" return\n",
"\n",
"my_vector = [0.5, -1.2, 3.3, 4.5]\n",
"# The result should be roughly 5.729746940310715\n",
"euclidean_norm(my_vector)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SEXIh_e9cW3S"
},
"source": [
"**Exercise 3.** Using a for loop and a conditional, write a function that returns the maximum value in a vector."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"id": "zd9ntMq0cb2e"
},
"outputs": [],
"source": [
"def vector_maximum(vector):\n",
" # Write your function here\n",
" return"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qPAZA4OMc6sT"
},
"source": [
"**Bonus exercise.** If time permits, write a function that sorts a list in ascending order (from smallest to largest) using the [bubble sort](https://en.wikipedia.org/wiki/Bubble_sort) algorithm."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"id": "sBokdJO4dGyf"
},
"outputs": [],
"source": [
"def bubble_sort(my_list):\n",
" # Write your function here\n",
" return\n",
"\n",
"my_list = [1, -3, 3, 2]\n",
"# Should return [-3, 1, 2, 3]\n",
"bubble_sort(my_list)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jDxjvtEEM1vg"
},
"source": [
"### Going further 💯"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wRkmvzf-PdEp"
},
"source": [
"Clearly, it is impossible to cover all the language features in this short introduction. To go further, we recommend the following resources:"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "n8nbPoWclRlH"
},
"source": [
"\n",
"\n",
"\n",
"* Python for computations in science and engineering [booklets](https://pointbreezepubs.gumroad.com/).\n",
"* List of Python [tutorials](https://wiki.python.org/moin/BeginnersGuide/Programmers)\n",
"* Four-hour [course](https://www.youtube.com/watch?v=rfscVS0vtbw) on YouTube\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6X4WJo3iM6m9"
},
"source": [
"## NumPy 💻"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_H3bNbLloXCY"
},
"source": [
"[NumPy](https://numpy.org/) is a popular library for storing arrays of numbers and performing computations on them. Not only does this often make the code more succinct, it also makes it faster, since most NumPy routines are implemented in C."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "M7tI3XLhqwSX"
},
"source": [
"To use NumPy in your program, you need to import it as follows"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"id": "phSPPyfyq2gX"
},
"outputs": [],
"source": [
"import numpy as np"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9secCfFLNHEE"
},
"source": [
"### Array creation\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "LSS2wEnkq97n"
},
"source": [
"NumPy arrays can be created from Python lists"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Hfeg286yrLvJ",
"outputId": "c42f5ee7-a0f8-4dfb-b534-a62708bbf156"
},
"outputs": [
{
"data": {
"text/plain": [
"array([1, 2, 3])"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_array = np.array([1, 2, 3])\n",
"my_array"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Sy2EvrxFriAG"
},
"source": [
"NumPy supports arrays of arbitrary dimension. For example, we can create two-dimensional arrays (e.g. to store a matrix) as follows"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "wM-GYVMsrzNs",
"outputId": "128f54ec-65df-45fb-ee5e-a825c25c6634"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[1, 2, 3],\n",
" [4, 5, 6]])"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_2d_array = np.array([[1, 2, 3], [4, 5, 6]])\n",
"my_2d_array"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-kZMzYsAsVAc"
},
"source": [
"We can access individual elements of a 2d-array using two indices"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "4q8X86BbscPd",
"outputId": "73a0b46d-3d90-4cfd-dbfe-307dd23b50f4"
},
"outputs": [
{
"data": {
"text/plain": [
"6"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_2d_array[1, 2]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "OfVIKyxkTh0p"
},
"source": [
"We can also access rows"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "CrKnDAtyTlYe",
"outputId": "6b364949-e1ad-4285-8821-d7990c0e8865"
},
"outputs": [
{
"data": {
"text/plain": [
"array([4, 5, 6])"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_2d_array[1]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hskLBCp9ToCG"
},
"source": [
"and columns"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "MOOFsLHhTozX",
"outputId": "23d6d993-5b91-4450-e492-6aa7de1986ed"
},
"outputs": [
{
"data": {
"text/plain": [
"array([3, 6])"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_2d_array[:, 2]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "keWK_5PHr9Q2"
},
"source": [
"Arrays have a `shape` attribute"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "5QIo7l1Yr8m7",
"outputId": "6613e4e3-351c-4e00-c22c-753c48ba842a"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(3,)\n",
"(2, 3)\n"
]
}
],
"source": [
"print(my_array.shape)\n",
"print(my_2d_array.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "LmX0EDWVsoDY"
},
"source": [
"Unlike Python lists, NumPy arrays have a type, and all elements of an array must have the same type."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "FZjOowkls57o",
"outputId": "60d9d76d-c601-4ad3-963f-663df00ce536"
},
"outputs": [
{
"data": {
"text/plain": [
"dtype('int32')"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_array.dtype"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "i5AvLdf7tGnZ"
},
"source": [
"The main types are `int32` (32-bit integers), `int64` (64-bit integers), `float32` (32-bit floating-point values) and `float64` (64-bit floating-point values)."
]
},
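{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since all elements must share one type, NumPy picks a common type when the input list mixes integers and floats. The sketch below (with hypothetical variable names) shows this, and also how an existing array can be converted to another type with `astype`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Hypothetical arrays used only for this illustration.\n",
"mixed = np.array([1, 2.5, 3])          # integers are converted to floats\n",
"print(mixed.dtype)\n",
"as_float32 = mixed.astype(np.float32)  # returns a converted copy\n",
"print(as_float32.dtype)"
]
},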
{
"cell_type": "markdown",
"metadata": {
"id": "w8ym2qZCt9Nm"
},
"source": [
"The `dtype` can be specified when creating the array"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "gXpM_FqruCVv",
"outputId": "88d2cc90-e73f-4554-b09b-fd19dd9b3f35"
},
"outputs": [
{
"data": {
"text/plain": [
"dtype('float64')"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_array = np.array([1, 2, 3], dtype=np.float64)\n",
"my_array.dtype"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "WueaRIONuTdS"
},
"source": [
"We can create arrays of all zeros using"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "jbD8N1UauK8r",
"outputId": "775fe73a-316f-4605-d747-2c1431f1b013"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0., 0., 0.],\n",
" [0., 0., 0.]])"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"zero_array = np.zeros((2, 3))\n",
"zero_array"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "vn5go6qoudo4"
},
"source": [
"and similarly for all ones using `ones` instead of `zeros`."
]
},
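{
"cell_type": "markdown",
"metadata": {},
"source": [
"For completeness, here is a minimal sketch of `np.ones` (the variable names are our own). Like `np.zeros`, it takes the desired shape as its first argument and accepts an optional `dtype`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Hypothetical arrays used only for this illustration.\n",
"ones_array = np.ones((2, 3))           # 2 rows, 3 columns, float64 by default\n",
"int_ones = np.ones(4, dtype=np.int64)  # 1d array of four integer ones\n",
"print(ones_array)\n",
"print(int_ones)"
]
},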
{
"cell_type": "markdown",
"metadata": {
"id": "1kCRlhLJuvZ6"
},
"source": [
"We can create a range of values using"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "EcQXDeEmuxpO",
"outputId": "9142690f-f66d-425a-9792-435f458d3742"
},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 1, 2, 3, 4])"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.arange(5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZvJECk6Iu3uF"
},
"source": [
"or specifying the starting point"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Pk3UzL3du_f8",
"outputId": "2a9a07f4-907b-4553-d339-a4834d543544"
},
"outputs": [
{
"data": {
"text/plain": [
"array([3, 4])"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.arange(3, 5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "f1JtqFSivJKG"
},
"source": [
"Another useful routine is `linspace` for creating linearly spaced values in an interval. For instance, to create 10 evenly spaced values in `[0, 1]` (both endpoints included), we can use"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "udHHjGAHvOQM",
"outputId": "4be5895d-975a-40e8-a54f-b269e454e94b"
},
"outputs": [
{
"data": {
"text/plain": [
"array([0. , 0.11111111, 0.22222222, 0.33333333, 0.44444444,\n",
" 0.55555556, 0.66666667, 0.77777778, 0.88888889, 1. ])"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.linspace(0, 1, 10)"
]
},
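{
"cell_type": "markdown",
"metadata": {},
"source": [
"The difference between `linspace` and `arange` is easy to miss: `linspace` takes the number of values and includes the end point, whereas `arange` takes a step and excludes the end point. The following sketch, with values chosen only for illustration, compares the two."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Illustrative values, chosen only for this example.\n",
"with_endpoint = np.linspace(0, 1, 5)      # 5 values, the last one is 1\n",
"without_endpoint = np.arange(0, 1, 0.25)  # step 0.25, stops before 1\n",
"print(with_endpoint)\n",
"print(without_endpoint)"
]
},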
{
"cell_type": "markdown",
"metadata": {
"id": "WbcxAKobvgUT"
},
"source": [
"Another important operation is `reshape`, for changing the shape of an array"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "4FPzTuDlvlLO",
"outputId": "9697290e-0b71-4891-f45a-27da056f05f2"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[1, 2],\n",
" [3, 4],\n",
" [5, 6]])"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_array = np.array([1, 2, 3, 4, 5, 6])\n",
"my_array.reshape(3, 2)"
]
},
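{
"cell_type": "markdown",
"metadata": {},
"source": [
"A reshape must preserve the total number of elements. As a hedged illustration (the variable names are hypothetical), the cell below reshapes six values in two ways; passing `-1` for one dimension asks NumPy to infer it from the others."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Hypothetical arrays used only for this illustration.\n",
"flat = np.arange(6)                # six values: 0, 1, ..., 5\n",
"two_by_three = flat.reshape(2, 3)  # rows are filled one after the other\n",
"inferred = flat.reshape(-1, 2)     # -1 lets NumPy infer the number of rows\n",
"print(two_by_three.shape, inferred.shape)"
]
},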
{
"cell_type": "markdown",
"metadata": {
"id": "G-QR80_g3N9Y"
},
"source": [
"Play with these operations and make sure you understand them well."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "f9B0iCBlmfeY"
},
"source": [
"### Basic operations"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "elQGgkqDxKLV"
},
"source": [
"In NumPy, we express computations directly over arrays. This makes the code much more succinct."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hkCU1T8ixghX"
},
"source": [
"Arithmetic operations can be performed directly over arrays. For instance, assuming two arrays have a compatible shape, we can add them as follows"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "4AoiRq42x5mI",
"outputId": "bc259f8a-b32b-47d2-81a7-bbacce4c12ae"
},
"outputs": [
{
"data": {
"text/plain": [
"array([5, 7, 9])"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"array_a = np.array([1, 2, 3])\n",
"array_b = np.array([4, 5, 6])\n",
"array_a + array_b"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SyPqME2EyD4x"
},
"source": [
"Compare this with the equivalent computation using a for loop"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "HxRFA_U2yfI-",
"outputId": "050bc7fd-2872-41aa-f141-5cb098534d11"
},
"outputs": [
{
"data": {
"text/plain": [
"array([5, 7, 9])"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"array_out = np.zeros_like(array_a)\n",
"for i in range(len(array_a)):\n",
"  array_out[i] = array_a[i] + array_b[i]\n",
"array_out"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "i2a-apX-zlPN"
},
"source": [
"Not only is this code more verbose, it also runs much more slowly."
]
},
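{
"cell_type": "markdown",
"metadata": {},
"source": [
"To get a feeling for the difference in speed, one can time both versions on larger arrays. The cell below is a rough sketch using the standard `timeit` module; the array size, the number of repetitions and the names `big_a`, `big_b` and `add_with_loop` are assumptions made for this illustration, and the exact timings will vary from one machine to another."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import timeit\n",
"import numpy as np\n",
"\n",
"# Hypothetical arrays, larger than above so that the difference is measurable.\n",
"big_a = np.arange(100_000)\n",
"big_b = np.arange(100_000)\n",
"\n",
"def add_with_loop(a, b):\n",
"  # Element-by-element addition using an explicit Python loop.\n",
"  out = np.zeros_like(a)\n",
"  for k in range(len(a)):\n",
"    out[k] = a[k] + b[k]\n",
"  return out\n",
"\n",
"# Total time for 3 runs of each version.\n",
"loop_time = timeit.timeit(lambda: add_with_loop(big_a, big_b), number=3)\n",
"vectorized_time = timeit.timeit(lambda: big_a + big_b, number=3)\n",
"print(f\"loop: {loop_time:.4f} s, vectorized: {vectorized_time:.4f} s\")"
]
},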
{
"cell_type": "markdown",
"metadata": {
"id": "Qdn8MwpR0wX_"
},
"source": [
"In NumPy, functions that operate on arrays in an element-wise fashion are called [universal functions](https://numpy.org/doc/stable/reference/ufuncs.html). `np.sin` is one such function"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "JoanjiMu1BH5",
"outputId": "9ffbb222-89fc-438d-9e2c-46794d3689b8"
},
"outputs": [
{
"data": {
"text/plain": [
"array([0.84147098, 0.90929743, 0.14112001])"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.sin(array_a)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jHljrPXg5h8W"
},
"source": [
"The inner product of two vectors can be computed using `np.dot`"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "TphR8oIx5ob9",
"outputId": "ed943559-ad23-4d3f-8b8c-9188763fb07f"
},
"outputs": [
{
"data": {
"text/plain": [
"32"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.dot(array_a, array_b)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "lHInOiSW50OR"
},
"source": [
"When the two arguments to `np.dot` are both 2d arrays, `np.dot` becomes matrix multiplication"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "QRbpbhPP6Up0",
"outputId": "08fa7a5b-8e69-41ae-8e2c-30d309ea8213"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.78372786, -0.849849 , -0.80062414, -0.17832529],\n",
" [-0.95976823, -1.81073509, -0.58237139, -1.02972195],\n",
" [-1.00018451, -1.01810406, -1.04100908, -0.2526472 ],\n",
" [-0.73005882, -1.58872211, -0.40764567, -0.54033096],\n",
" [-0.46429466, -1.14327928, -0.12796835, -0.87775792]])"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"array_A = np.random.rand(5, 3)\n",
"array_B = np.random.randn(3, 4)\n",
"np.dot(array_A, array_B)"
]
},
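{
"cell_type": "markdown",
"metadata": {},
"source": [
"For matrix multiplication, the number of columns of the first array must match the number of rows of the second. The following sketch (with hypothetical arrays of ones, so that the result is easy to predict) checks the shape of the product and shows that the `@` operator gives the same result as `np.dot` for 2d arrays."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Hypothetical matrices used only for this illustration.\n",
"mat_left = np.ones((5, 3))\n",
"mat_right = np.ones((3, 4))\n",
"product = np.dot(mat_left, mat_right)  # shape (5, 3) times (3, 4) gives (5, 4)\n",
"same_product = mat_left @ mat_right    # operator form of matrix multiplication\n",
"print(product.shape)"
]
},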
{
"cell_type": "markdown",
"metadata": {
"id": "odVawD9m6gwv"
},
"source": [
"Matrix transpose can be done using `.transpose()` or `.T` for short"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "gvPe_JAO6mvF",
"outputId": "87775836-ce5e-40f5-80a1-33c9613fe5c5"
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.53487215, 0.9167603 , 0.61587894, 0.98524764, 0.48891708],\n",
" [0.55500459, 0.73120862, 0.72928845, 0.44575498, 0.39782748],\n",
" [0.05074806, 0.68669673, 0.14052154, 0.05889541, 0.68256372]])"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"array_A.T"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JlWt3oFnE_E-"
},
"source": [
"### Slicing and masking"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "e4aKKe7bFA65"
},
"source": [
"Like Python lists, NumPy arrays support slicing"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "0kPhv2xcF1TP",
"outputId": "fc04e16e-62b1-4237-aa99-249525314029"
},
"outputs": [
{
"data": {
"text/plain": [
"array([5, 6, 7, 8, 9])"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.arange(10)[5:]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ITu2Wy4-GB2G"
},
"source": [
"We can also select only certain elements from the array using a boolean mask, i.e. an array of `True`/`False` values with the same shape"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "8tlZzTB6GEyw",
"outputId": "eee6c6cb-e7fb-479b-9ad4-26c45f1253e7"
},
"outputs": [
{
"data": {
"text/plain": [
"array([5, 6, 7, 8, 9])"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.arange(10)\n",
"mask = x >= 5\n",
"x[mask]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "NlGForCimjBL"
},
"source": [
"### Exercise - arrays and linear algebra ❗❗"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ur1UlSFPTu6O"
},
"source": [
"**Exercise 1.** Create a 3d array of shape (2, 2, 2), containing 8 values. Access individual elements and slices."
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"id": "v1ed4-vLUWXQ"
},
"outputs": [],
"source": [
"# Your code here"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "s_ksfCDJzyxI"
},
"source": [
"**Exercise 2.** Rewrite the relu function (see Python section) using [np.maximum](https://numpy.org/doc/stable/reference/generated/numpy.maximum.html). Check that it works on both a single value and on an array of values."
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"id": "QtSTxH5Dz6f8"
},
"outputs": [],
"source": [
"def relu_numpy(x):\n",
" return\n",
"\n",
"relu_numpy(np.array([1, -3, 2.5]))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wggUjpyRz7fb"
},
"source": [
"**Exercise 3.** Rewrite the Euclidean norm of a vector (1d array) using NumPy (without a for loop)."
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {
"id": "p5BLcHOD0Bhy"
},
"outputs": [],
"source": [
"def euclidean_norm_numpy(x):\n",
" return\n",
"\n",
"my_vector = np.array([0.5, -1.2, 3.3, 4.5])\n",
"euclidean_norm_numpy(my_vector)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "01IteVJ60Il2"
},
"source": [
"**Exercise 4.** Write a function that computes the Euclidean norm of each row of a matrix (2d array). Hint: use the `axis` argument of [np.sum](https://numpy.org/doc/stable/reference/generated/numpy.sum.html)."
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {
"id": "at5lWRNM0SVG"
},
"outputs": [],
"source": [
"def euclidean_norm_2d(X):\n",
" return\n",
"\n",
"my_matrix = np.array([[0.5, -1.2, 4.5],\n",
" [-3.2, 1.9, 2.7]])\n",
"# Should return an array of size 2.\n",
"euclidean_norm_2d(my_matrix)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yd1ZoByo436x"
},
"source": [
"**Exercise 5.** Compute the mean value of each feature in the [iris dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_iris.html). Hint: use the `axis` argument of [np.mean](https://numpy.org/doc/stable/reference/generated/numpy.mean.html)."
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"id": "fYFVobkP5JK6"
},
"outputs": [],
"source": [
"from sklearn.datasets import load_iris\n",
"X, y = load_iris(return_X_y=True)\n",
"\n",
"# Result should be an array of size 4."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "1FDs9zX6mpoX"
},
"source": [
"### Going further 💯"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hFP61Iztmr9Q"
},
"source": [
"* NumPy [reference](https://numpy.org/doc/stable/reference/)\n",
"* SciPy [lectures](https://scipy-lectures.org/)\n",
"* One-hour [tutorial](https://www.youtube.com/watch?v=QUT1VHiLmmI) on YouTube\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7Jt6T3kJ8I2T"
},
"source": [
"## Matplotlib 📈"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kQX8TiEOALkQ"
},
"source": [
"### Basic plots"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "REYwc9Va8UTg"
},
"source": [
"Matplotlib is a plotting library for Python."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Eom7t-m6-Uzb"
},
"source": [
"We start with a simple line plot of the sine function on the interval [-3, 3]."
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 295
},
"id": "g21e5Ncm927z",
"outputId": "2e95dff3-0bb0-4759-f675-d32c05eaaade"
},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from matplotlib import pyplot as plt\n",
"\n",
"x_values = np.linspace(-3, 3, 100)\n",
"\n",
"plt.figure()\n",
"plt.plot(x_values, np.sin(x_values), label=\"Sinusoid\")\n",
"plt.xlabel(\"x\")\n",
"plt.ylabel(\"sin(x)\")\n",
"plt.title(\"Matplotlib example\")\n",
"plt.legend(loc=\"upper left\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ltvlLwXF-eAH"
},
"source": [
"We continue with a rudimentary scatter plot example. This example displays samples from the [iris dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_iris.html) using the first two features. Colors indicate class membership (there are 3 classes)."
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 268
},
"id": "sEzcJAmy-hbK",
"outputId": "e71652fa-add5-40e9-ee69-46eaf36b7d5e"
},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
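],
"source": [
"from sklearn.datasets import load_iris\n",
"X, y = load_iris(return_X_y=True)\n",
"\n",
"# Split the samples by class label using boolean masks.\n",
"X_class0 = X[y == 0]\n",
"X_class1 = X[y == 1]\n",
"X_class2 = X[y == 2]\n",
"\n",
"plt.figure()\n",
"plt.scatter(X_class0[:, 0], X_class0[:, 1], label=\"Class 0\", color=\"C0\")\n",
"plt.scatter(X_class1[:, 0], X_class1[:, 1], label=\"Class 1\", color=\"C1\")\n",
"plt.scatter(X_class2[:, 0], X_class2[:, 1], label=\"Class 2\", color=\"C2\")\n",
"plt.xlabel(\"Sepal length (cm)\")  # first feature\n",
"plt.ylabel(\"Sepal width (cm)\")  # second feature\n",
"plt.legend()  # display the class labels passed to scatter\n",
"plt.show()"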
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "5vjln9qwAc3M"
},
"source": [
"We see that samples belonging to class 0 can be linearly separated from the rest using only the first two features."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uVWuIUs2AQ5a"
},
"source": [
"### Exercise - plots ❗❗\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "1X6-g6zgCwJd"
},
"source": [
"**Exercise 1.** Plot the relu and the [softplus](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Softplus) functions on the same graph."
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"id": "Ob6HZUX0DJ8y"
},
"outputs": [],
"source": [
"# Your code here"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "vpRGfz0aDW3l"
},
"source": [
"What is the main difference between the two functions?"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JjDeIufRAYVL"
},
"source": [
"**Exercise 2.** Repeat the scatter plot above (first two features, one color per class) using the [digits dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) instead."
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"id": "-JU3TXCBBB0c"
},
"outputs": [],
"source": [
"from sklearn.datasets import load_digits\n",
"X, y = load_digits(return_X_y=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "w7wPWdmXBQA2"
},
"source": [
"Are pixel values good features for classifying samples?"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dYM-oV1jD3RV"
},
"source": [
"### Going further 💯\n",
"\n",
"* Official [tutorial](https://matplotlib.org/tutorials/introductory/pyplot.html)\n",
"* [Tutorial](https://www.youtube.com/watch?v=qErBw-R2Ybk) on YouTube"
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.15"
}
},
"nbformat": 4,
"nbformat_minor": 1
}