Skip to content
Snippets Groups Projects
Commit 2686ac85 authored by Škoviera, Radoslav, Mgr., Ph.D.'s avatar Škoviera, Radoslav, Mgr., Ph.D.
Browse files

Added lesson 2

parent 393da30d
No related branches found
No related tags found
No related merge requests found
# 2D arrays (matrices), array processing (search, cumulative sum)
---
title: "Lecture 2"
format:
html:
code-fold: false
jupyter: python3
---
## Matrices
Matrices are 2D arrays. They have two dimensions: rows and columns. The number of rows is often called the height and the number of columns is often called the width.
This is similar to image, although, images have typically flipped dimensions: x-axis is width (columns) and y-axis is height (rows).
```{python}
m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(m)
also_m = [
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
]
print(also_m)
m_as_well = [[j + i * 3 for j in range(1, 4)] for i in range(3)]
print(m_as_well)
```
Creating a matrix in a loop (the following two functions are equivalent):
```{python}
def create_counting_matrix(height, width):
return [[j + i * width for j in range(1, width + 1)] for i in range(height)]
def create_counting_matrix_unpacked(height, width):
mat = []
for i in range(height):
row = []
for j in range(1, width + 1):
row.append(j + i * width)
mat.append(row)
return mat
print(create_counting_matrix(3, 4))
```
Matrix pre-allocation:
```{python}
M, N = 3, 4
mat = [[0] * N for _ in range(M)]
print(mat)
```
### Filling matrices with values
### Matrix operations
#### Addition
#### Multiplication
#### Transposition
### Miscellaneous operations on matrices
#### Comparison
#### All different
### "Graphical" matrices (images)
Raw images are basically matrices, where each element represents an intensity value of the corresponding pixel.
Color images are 3D matrices, where each element is a 3-tuple of intensity values of the color channels but we will not go there.
However, we can have a little fun with gray-scale images.
```{python}
img = [
[100, 100, 100, 100, 100, 100, 100, 100, 100, 100],
[100, 0 , 0 , 0, 0 , 0 , 0, 0, 0, 100],
[ 0, 0, 128, 128, 0, 0, 128, 128, 0, 0],
[ 0, 0, 255, 255, 0, 0, 255, 255, 0, 0],
[ 0, 0, 255, 255, 0, 0, 255, 255, 0, 0],
[ 0, 0, 128, 128, 0, 0, 128, 128, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 200, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 200, 0, 0, 0, 0],
[ 0, 0, 0, 0, 200, 200, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 255, 0, 0, 0, 0, 0, 0, 255, 0],
[ 0, 255, 255, 255, 255, 255, 255, 255, 255, 0],
[ 0, 0, 255, 128, 128, 128, 128, 255, 0, 0],
[ 0, 0, 0, 255, 255, 255, 255, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 100, 100, 0, 0, 0, 0],
]
print(img)
from matplotlib import pyplot as plt
plt.imshow(img, cmap='gray')
plt.axis('off')
plt.show()
```
## Array & matrix processing
### Visualizing (printing) arrays
For debugging purposes, it is useful to see what is in an array. Smaller arrays are easy to print "whole":
```{python}
import numpy as np
a = np.random.permutation(10)
print(a)
```
You can use the `join` method of strings to make the output nicer. More on that later, when we get to strings.
```{python}
def pretty_print_array(a):
print('[' + ', '.join([str(x) for x in a]) + ']')
pretty_print_array(a)
```
For larger arrays, you need to be creative - the solution will depend on what you need. For example, it might be enough to print the first or last few items.
Print the array as rows (Python kinda does that but "uncontrollably"):
```{python}
def print_as_rows(a, row_length=10):
for i in range(0, len(a), row_length):
print(a[i:i+row_length])
a = np.random.permutation(100)
print_as_rows(a)
```
You can even improve it to have nicer output. Again, more on that next time, when we get to strings.
```{python}
def pretty_print_as_rows(a, row_length=10):
print('[')
for i in range(0, len(a), row_length):
print(', '.join([str(x) for x in a[i:i+row_length]]), end=',\n')
print(']')
a = np.random.permutation(100).tolist()
pretty_print_as_rows(a)
```
#### Printing matrices:
(remember to define/run cell with the `create_counting_matrix` and `pretty_print` functions above)
```{python}
def print_matrix(m):
print('[', end='')
n_rows = len(m)
for ri, row in enumerate(m):
print(row, end=',\n' if ri < n_rows - 1 else '')
print(']')
def formatted_print_matrix(m):
print('[', end='')
n_rows = len(m)
max_digits = int(np.ceil(np.log10(max([max(row) for row in m]))))
for ri, row in enumerate(m):
row_prefix = ('' if ri == 0 else ' ') + '['
row_numbers = ', '.join([f"{x:{max_digits}d}" for x in row])
row_end=',\n' if ri < n_rows - 1 else ']'
print(row_prefix + row_numbers, end=row_end)
print(']')
mat = create_counting_matrix(11, 17)
print("Row-by-row print:")
print_matrix(mat)
print()
print("Formatted print:")
formatted_print_matrix(mat)
```
### Search
#### Linear search
Requires iterating over the array, asymptotic runtime is $O(n)$.
```{python}
import numpy as np
N = 100
a = np.random.permutation(N)
searched_item = 50
def linear_search(a, x):
for i in range(len(a)):
if a[i] == x:
return i
return None
print("Linear search run time:")
%timeit linear_search(a, searched_item)
```
##### Computational complexity side-quest
Even though the 'asymptotic' time of linear search is $O(n)$,
the 'actual' time depends on the 'properties' of the array - where the item is located in the array.
```{python}
N = 100
a = np.random.permutation(N)
searched_item = a[N // 2]
print("Searching for an item in the middle of the array:")
%timeit linear_search(a, searched_item)
searched_item = a[0]
print("Searching for an item at the beginning of the array:")
%timeit linear_search(a, searched_item)
searched_item = a[-1]
print("Searching for an item at the end of the array:")
%timeit linear_search(a, searched_item)
```
#### Binary search
Requires pre-sorted array but asymptotic runtime is $O(\log n)$ - we only look which "half" of the array might contain the item.
```{python}
def binary_search_loop(a, x):
low = 0 # lower-bound index for search
high = len(a) - 1 # upper-bound index for search
loop_count = 1
while low <= high: # while there are indices to search
mid = (low + high) // 2 # compute the middle index
if a[mid] < x: # if the middle point is less than the item
low = mid + 1 # "discard" the left half (search from the middle + 1)
elif a[mid] > x: # if the middle point is greater than the item
high = mid - 1 # "discard" the right half (search from the middle - 1)
else: # otherwise, the middle point is the item
return mid, loop_count
loop_count += 1
return None, loop_count # we did not find the item
def binary_search_recursion(a, x):
if len(a) == 0: # we got an empty array
return None # the item is for sure not in it
mid_idx = len(a) // 2 # compute the middle point index
if a[mid_idx] == x: # if the middle point is the item
return mid_idx # we found it, job done
elif a[mid_idx] < x: # if the middle point is less than the item
# we look at the right half of the array
return binary_search_recursion(a[mid_idx + 1:], x)
else:
# otherwise, we look at the left half of the array
return binary_search_recursion(a[:mid_idx], x)
# recursion is slower in this case, as we are copying the whole array, instead of just using the index.
sa = np.sort(a)
print("Binary search run time:")
%timeit binary_search_loop(sa, 5)
%timeit binary_search_recursion(sa, 5)
print("Buuuuuut...")
print("'Fair' binary search run time (includes sorting):")
%timeit binary_search_loop(np.sort(a), 5)
%timeit binary_search_recursion(np.sort(a), 5)
```
The advantage of binary search is relatively constant search time. As you can see from the following,
the search time is shorter than "average" search time for linear search: $O(\log n) < O(n)$.
This includes even sorting the array.
```{python}
print("Searching for an item in the middle of the array, using binary search:")
searched_item = sa[N // 2]
%timeit binary_search_loop(np.sort(a), searched_item)
print("Searching for an item at the beginning of the array, using binary search:")
searched_item = sa[0]
%timeit binary_search_loop(np.sort(a), searched_item)
print("Searching for an item at the end of the array, using binary search:")
searched_item = sa[-1]
%timeit binary_search_loop(np.sort(a), searched_item)
```
#### Interpolation search
Even more restrictive requirements than binary search - the array needs to be 'uniformly' sorted.
```{python}
def interpolation_search(a, x):
low = 0 # lower-bound index for search
high = len(a) - 1 # upper-bound index for search
loop_count = 1
while low <= high and a[low] <= x <= a[high]: # while there are indices to search
mid = low + int(((x - a[low]) * (high - low)) / (a[high] - a[low]))
if a[mid] < x: # if the middle point is less than the item
low = mid + 1 # "discard" the left half (search from the middle + 1)
elif a[mid] > x: # if the middle point is greater than the item
high = mid - 1 # "discard" the right half (search from the middle - 1)
else: # otherwise, the middle point is the item
return mid, loop_count
loop_count += 1
return None, loop_count # we did not find the item
print("Searching for an item in the middle of the array, using interpolation search:")
searched_item = sa[N // 2]
%timeit interpolation_search(np.sort(a), searched_item)
print("Searching for an item at the beginning of the array, using interpolation search:")
searched_item = sa[0]
%timeit interpolation_search(np.sort(a), searched_item)
print("Searching for an item at the end of the array, using interpolation search:")
searched_item = sa[-1]
%timeit interpolation_search(np.sort(a), searched_item)
```
The awesome speed of interpolation search:
```{python}
a = np.random.permutation(N)
sa = np.sort(a)
searched_item = sa[N // 2 + 1]
print("Interpolation search run time:")
%timeit interpolation_search(sa, searched_item)
v, i_loop_count = interpolation_search(sa, searched_item)
print(f"Interpolation search 'step' count: {i_loop_count}")
print("Binary search run time:")
%timeit binary_search_loop(sa, searched_item)
b, b_loop_count = binary_search_loop(sa, searched_item)
print(f"Binary search 'step' count: {b_loop_count}")
```
Unfortunately, the "single step" search only works if the array is completely uniform.
```{python}
a = np.random.choice(N**3, size=N, replace=False) / 100 # possibly non-uniform array
sa = np.sort(a)
searched_item = sa[N // 2 + 1]
print("Interpolation search run time:")
%timeit interpolation_search(sa, searched_item)
v, i_loop_count = interpolation_search(sa, searched_item)
print(f"Interpolation search 'step' count: {i_loop_count}")
print("Binary search run time:")
%timeit binary_search_loop(sa, searched_item)
v, b_loop_count = binary_search_loop(sa, searched_item)
print(f"Binary search 'step' count: {b_loop_count}")
```
#### Find all
The search finds the first occurrence of the searched item. Which one will depend on the search method. Sometimes, it is required to find all occurrences of the searched item in the array.
```{python}
import numpy as np
def find_all(a, x):
indices = []
for i in range(len(a)):
if a[i] == x:
indices.append(i)
return indices
N = 100
a = np.random.choice(N**3, size=N, replace=True)
searched_item = a[N // 2]
occurrences = find_all(a, searched_item)
print("The value", searched_item, "occurs", len(occurrences), "times in the array at indices", occurrences)
```
### 'Statistical' computations on arrays
#### Simple operations
##### Minimum, maximum
Python has built-in `min()` and `max()` functions. Here, we wil see simple implementation of these functions,
simply to practice working with arrays. In practice, you can use the built-in functions.
```{python}
import numpy as np
my_list = np.random.permutation(10).tolist()
print("Input array:", my_list)
def minimum(a):
min_value = a[0]
for i in range(1, len(a)):
if a[i] < min_value:
min_value = a[i]
# Alternatively:
# min_value = min(min_value, a[i])
return min_value
def maximum(a):
max_value = a[0]
for i in range(1, len(a)):
if a[i] > max_value:
max_value = a[i]
# Alternatively:
# max_value = max(max_value, a[i])
return max_value
print("Minimum: ", minimum(my_list))
print("Maximum: ", maximum(my_list))
```
##### Mean
First, we need to compute the sum:
```{python}
import numpy as np
my_list = np.random.permutation(10).tolist()
print("Input array:", my_list)
def sum(a):
s = 0
for i in range(len(a)):
s += a[i]
return s
print("Sum: ", sum(my_list))
```
Mean is then simply the sum divided by the number of items:
```{python}
def mean(a):
return sum(a) / len(a)
print("Mean: ", mean(my_list))
```
#### Cumulative sum (prefix sum)
Summing items between two indices is useful for many tasks.
```{python}
import numpy as np
temp_measurements = (np.random.rand(200) * 10).tolist()
print("Input array: " + ', '.join([f"{x:.2f}" for x in temp_measurements[:10]]) + ", ...")
def range_sum(a, start, end):
s = 0
for i in range(start, end):
s += a[i]
return s
print("Sum between indices 5 and 15:", range_sum(temp_measurements, 5, 15))
```
This is fine, if we need to to this once, but what if we need to do it many times?
```{python}
print("Summing once run time:")
%timeit range_sum(temp_measurements, 5, 15)
print("Summing 100 times run time:")
%timeit [range_sum(temp_measurements, 5, 15) for _ in range(100)]
```
Is there a better way? Why yes! We will use **cumulative sum** (also known as prefix sum).
It is done bone adding all the previous values to the current value in the array.
```{python}
def cumulative_sum(a):
for i in range(1, len(a)):
a[i] += a[i - 1]
return a
my_list = list(range(10))
print("Cumulative sum of ordered numbers from 0 to 9: ")
print("Input array: [" + ', '.join([f"{x:2d}" for x in my_list]) + "]")
print("Cumulative sum: " + str(cumulative_sum(my_list)))
print()
my_list = np.random.permutation(10).tolist()
print("Cumulative sum of randomly shuffled numbers: ")
print("Input array: [" + ', '.join([f"{x:2d}" for x in my_list]) + "]")
print("Cumulative sum: " + str(cumulative_sum(my_list)))
```
### Integral image (summed-area table, cumulative sum in 2D)
\ No newline at end of file
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment