Matrices are 2D arrays. They have two dimensions: rows and columns. The number of rows is often called the height and the number of columns the width.
This is similar to images, although image coordinates are typically given in the flipped order: the x-axis runs along the width (columns) and the y-axis along the height (rows), so the pixel at (x, y) corresponds to the matrix element in row y, column x.
```{python}
m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(m)
also_m = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]
print(also_m)
m_as_well = [[j + i * 3 for j in range(1, 4)] for i in range(3)]
print(m_as_well)
```
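The comprehension that builds `m_as_well` can also be unrolled into explicit nested loops. The following is a minimal sketch; the function name `counting_matrix_with_loops` is made up for this illustration. It also shows how the height and width can be read back from a list-of-lists matrix, and that elements are indexed as `m[row][column]`.

```python
def counting_matrix_with_loops(height, width):
    # Hypothetical helper: builds the same kind of matrix as the comprehension above.
    matrix = []
    for i in range(height):            # one pass per row
        row = []
        for j in range(1, width + 1):  # one pass per column
            row.append(j + i * width)
        matrix.append(row)
    return matrix

m_loops = counting_matrix_with_loops(3, 3)
print(m_loops)

height = len(m_loops)     # number of rows
width = len(m_loops[0])   # number of columns (assumes at least one row)
print("height:", height, "width:", width)

# Indexing is m[row][column], i.e. m[y][x] in image terms.
print(m_loops[1][2])
```

The loop form is longer, but it makes the order of construction explicit: the outer loop walks over rows, the inner loop over columns.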
Creating a matrix of a given size in a loop (here written as a nested list comprehension):
```{python}
def create_counting_matrix(height, width):
    # Row i holds the numbers i * width + 1 ... (i + 1) * width.
    return [[j + i * width for j in range(1, width + 1)] for i in range(height)]
```
For debugging purposes, it is useful to see what is in an array. Smaller arrays are easy to print "whole":
```{python}
import numpy as np
a = np.random.permutation(10)
print(a)
```
You can use the `join` method of strings to make the output nicer. More on that later, when we get to strings.
```{python}
def pretty_print_array(a):
    print('[' + ', '.join([str(x) for x in a]) + ']')
pretty_print_array(a)
```
For larger arrays, you need to be creative - the solution will depend on what you need. For example, it might be enough to print the first or last few items.
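One possible take on "the first or last few items" is sketched below. The function name `head_tail_string` and its parameter `n_items` are made up for this example, and the `...` separator is just one formatting choice.

```python
def head_tail_string(a, n_items=3):
    # Hypothetical helper: short arrays are shown whole, long ones as head ... tail.
    items = [str(x) for x in a]
    if len(items) <= 2 * n_items:
        return '[' + ', '.join(items) + ']'
    head = ', '.join(items[:n_items])
    tail = ', '.join(items[-n_items:])
    return '[' + head + ', ..., ' + tail + ']'

print(head_tail_string(list(range(100))))
print(head_tail_string([5, 3, 8]))
```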
Print the array as rows (printing a NumPy array already wraps long output, but you have little control over where the rows break):
```{python}
def print_as_rows(a, row_length=10):
    for i in range(0, len(a), row_length):
        print(a[i:i+row_length])
a = np.random.permutation(100)
print_as_rows(a)
```
You can even improve it to have nicer output. Again, more on that next time, when we get to strings.
```{python}
def pretty_print_as_rows(a, row_length=10):
    print('[')
    for i in range(0, len(a), row_length):
        print(', '.join([str(x) for x in a[i:i+row_length]]), end=',\n')
    print(']')
a = np.random.permutation(100).tolist()
pretty_print_as_rows(a)
```
#### Printing matrices
(remember to define/run the cells with the `create_counting_matrix` and `pretty_print` functions above)
```{python}
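def print_matrix(m):
    # Print each row on its own line, wrapped in one pair of outer brackets.
    print('[', end='')
    n_rows = len(m)
    for ri, row in enumerate(m):
        print(row, end=',\n' if ri < n_rows - 1 else '')
    print(']')

def formatted_print_matrix(m):
    # Right-align every number to the width of the longest one.
    print('[', end='')
    n_rows = len(m)
    # Counting characters also works for exact powers of ten (10, 100, ...).
    max_digits = max(len(str(x)) for row in m for x in row)
    for ri, row in enumerate(m):
        row_prefix = ('' if ri == 0 else ' ') + '['
        row_numbers = ', '.join([f"{x:{max_digits}d}" for x in row])
        # Close the bracket of every row, not only of the last one.
        row_end = '],\n' if ri < n_rows - 1 else ']'
        print(row_prefix + row_numbers, end=row_end)
    print(']')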
mat = create_counting_matrix(11, 17)
print("Row-by-row print:")
print_matrix(mat)
print()
print("Formatted print:")
formatted_print_matrix(mat)
```
### Search
#### Linear search
Linear search requires iterating over the array; its asymptotic runtime is $O(n)$.
```{python}
import numpy as np
N = 100
a = np.random.permutation(N)
searched_item = 50
def linear_search(a, x):
    for i in range(len(a)):
        if a[i] == x:
            return i
    return None
print("Linear search run time:")
%timeit linear_search(a, searched_item)
```
##### Computational complexity side-quest
Even though the asymptotic runtime of linear search is $O(n)$,
the actual runtime depends on where the item is located in the array.
```{python}
N = 100
a = np.random.permutation(N)
searched_item = a[N // 2]
print("Searching for an item in the middle of the array:")
%timeit linear_search(a, searched_item)
searched_item = a[0]
print("Searching for an item at the beginning of the array:")
%timeit linear_search(a, searched_item)
searched_item = a[-1]
print("Searching for an item at the end of the array:")
%timeit linear_search(a, searched_item)
```
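Timings can be noisy, so the same effect can also be shown by counting comparisons instead of measuring time. This is a minimal sketch; `linear_search_counting` is a hypothetical variant of the linear search above that additionally returns the number of items it compared.

```python
def linear_search_counting(a, x):
    # Same idea as linear search, but also reports how many items were compared.
    comparisons = 0
    for i in range(len(a)):
        comparisons += 1
        if a[i] == x:
            return i, comparisons
    return None, comparisons

data = [7, 3, 9, 1, 5]
print(linear_search_counting(data, 7))  # item at the beginning
print(linear_search_counting(data, 9))  # item in the middle
print(linear_search_counting(data, 5))  # item at the end
print(linear_search_counting(data, 4))  # item not present
```

The item at the beginning needs a single comparison, while the last item and a missing item both need a comparison with every element, which is the worst case that $O(n)$ describes.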
#### Binary search
Binary search requires a pre-sorted array, but its asymptotic runtime is $O(\log n)$: in each step, we only determine which "half" of the array might contain the item and discard the other one.
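Before the implementation, a rough sketch of why halving leads to a logarithmic number of steps. The helper `max_halving_steps` is made up for this illustration; it only counts how often a range of `n` items can be halved until nothing is left, and compares that count with $\lfloor \log_2 n \rfloor + 1$.

```python
import math

def max_halving_steps(n):
    # Hypothetical helper: how many times can n items be halved until none remain?
    steps = 0
    while n > 0:
        n //= 2
        steps += 1
    return steps

for n in (10, 100, 1_000, 1_000_000):
    print(n, max_halving_steps(n), math.floor(math.log2(n)) + 1)
```

Going from 1,000 to 1,000,000 items only doubles the number of steps, whereas a linear search would need up to a thousand times more comparisons.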
```{python}
def binary_search_loop(a, x):
    low = 0  # lower-bound index for search
    high = len(a) - 1  # upper-bound index for search
    loop_count = 0  # number of loop iterations performed so far
    while low <= high:  # while there are indices to search
        loop_count += 1
        mid = (low + high) // 2  # compute the middle index
        if a[mid] < x:  # if the middle point is less than the item
            low = mid + 1  # "discard" the left half (search from the middle + 1)
        elif a[mid] > x:  # if the middle point is greater than the item
            high = mid - 1  # "discard" the right half (search up to the middle - 1)
        else:  # otherwise, the middle point is the item
            return mid, loop_count
    return None, loop_count  # we did not find the item

def binary_search_recursion(a, x):
    if len(a) == 0:  # we got an empty array
        return None  # the item is for sure not in it
    mid_idx = len(a) // 2  # compute the middle point index
    if a[mid_idx] == x:  # if the middle point is the item
        return mid_idx  # we found it, job done
    elif a[mid_idx] < x:  # if the middle point is less than the item
        # NOTE: the source breaks off here; the remaining lines are a reconstruction (assumption).
        found = binary_search_recursion(a[mid_idx + 1:], x)  # search the right half
        # shift the index back, because the slice starts at mid_idx + 1
        return None if found is None else mid_idx + 1 + found
    else:  # the middle point is greater than the item
        return binary_search_recursion(a[:mid_idx], x)  # search the left half
```
A search returns a single occurrence of the searched item; which one depends on the search method. Sometimes, it is required to find all occurrences of the searched item in the array.
```{python}
import numpy as np
def find_all(a, x):
    indices = []
    for i in range(len(a)):
        if a[i] == x:
            indices.append(i)
    return indices
N = 100
a = np.random.choice(N**3, size=N, replace=True)
searched_item = a[N // 2]
occurrences = find_all(a, searched_item)
print("The value", searched_item, "occurs", len(occurrences), "times in the array at indices", occurrences)
```
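If the array happens to be sorted, all occurrences of a value sit next to each other, so it is enough to find both ends of that run. The sketch below assumes a sorted input and uses `bisect_left` and `bisect_right` from Python's standard-library `bisect` module; the function name `find_all_sorted` is made up for this example.

```python
from bisect import bisect_left, bisect_right

def find_all_sorted(a, x):
    # Assumes `a` is sorted in ascending order.
    first = bisect_left(a, x)   # index of the first item >= x
    last = bisect_right(a, x)   # index just past the last item <= x
    return list(range(first, last))

sorted_data = [1, 2, 2, 2, 5, 8, 8]
print(find_all_sorted(sorted_data, 2))
print(find_all_sorted(sorted_data, 8))
print(find_all_sorted(sorted_data, 3))
```

Both boundary lookups are binary searches, so finding the run takes $O(\log n)$ steps; only building the list of indices depends on the number of occurrences.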
### 'Statistical' computations on arrays
#### Simple operations
##### Minimum, maximum
Python has built-in `min()` and `max()` functions. Here, we will see a simple implementation of these functions,
simply to practice working with arrays. In practice, you can use the built-in functions.
```{python}
import numpy as np
my_list = np.random.permutation(10).tolist()
print("Input array:", my_list)
def minimum(a):
    min_value = a[0]
    for i in range(1, len(a)):
        if a[i] < min_value:
            min_value = a[i]
        # Alternatively:
        # min_value = min(min_value, a[i])
    return min_value

def maximum(a):
    max_value = a[0]
    for i in range(1, len(a)):
        if a[i] > max_value:
            max_value = a[i]
        # Alternatively:
        # max_value = max(max_value, a[i])
    return max_value
print("Minimum: ", minimum(my_list))
print("Maximum: ", maximum(my_list))
```
##### Mean
First, we need to compute the sum:
```{python}
import numpy as np
my_list = np.random.permutation(10).tolist()
print("Input array:", my_list)
def sum(a):  # note: this name shadows Python's built-in sum()
    s = 0
    for i in range(len(a)):
        s += a[i]
    return s
print("Sum: ", sum(my_list))
```
Mean is then simply the sum divided by the number of items:
```{python}
def mean(a):
    return sum(a) / len(a)
print("Mean: ", mean(my_list))
```
#### Cumulative sum (prefix sum)
Summing items between two indices is useful for many tasks.
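A minimal sketch of the idea, under the assumption that sums over many different index ranges of the same array are needed. The names `prefix_sums` and `range_sum` are made up for this illustration. The prefix sums are computed once; afterwards, the sum of any range is the difference of two stored values.

```python
def prefix_sums(a):
    # prefix[k] holds the sum of the first k items, so prefix[0] == 0.
    prefix = [0]
    for x in a:
        prefix.append(prefix[-1] + x)
    return prefix

def range_sum(prefix, start, stop):
    # Sum of a[start:stop] (stop is exclusive), computed in constant time.
    return prefix[stop] - prefix[start]

values = [3, 1, 4, 1, 5, 9]
p = prefix_sums(values)
print(p)
print(range_sum(p, 1, 4))  # 1 + 4 + 1
```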