Short Exercises #5

Due: Wednesday, November 9th at 4:30pm CT

The following short exercises are intended to help you practice some of the programming concepts introduced in Module #5. These exercises should not take more than 1-2 hours in total to complete. The goal of these exercises is to help you develop skills with the NumPy library and array concepts. To that end, you might find the NumPy documentation helpful.

NumPy is a rich library, and we do not have time to cover all of its functionality in the recorded lectures. We recommend that you check the documentation (particularly the NumPy Reference) for the following functions and array methods, which may help when completing these short exercises:

Functions:

  1. np.arange

  2. np.argmin, np.argmax

  3. np.sum

Methods (assuming an array x):

  1. x.copy()

  2. x.mean()

  3. x.min(), x.max()

  4. x.reshape()
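
As a quick orientation, here is a minimal sketch of how the functions and methods listed above behave on a small array. The array and the variable names are made up for illustration and do not come from the se5 files.

```python
import numpy as np

# Build a 2 x 3 array from a range of integers: [[0, 1, 2], [3, 4, 5]].
x = np.arange(6).reshape(2, 3)

total = np.sum(x)            # sum of every element
row_means = x.mean(axis=1)   # one mean per row
col_max = x.max(axis=0)      # one maximum per column
flat_argmax = np.argmax(x)   # position of the largest value in the flattened array

# copy() returns an independent array, so changing y leaves x untouched.
y = x.copy()
y[0, 0] = 100
```

Passing axis=0 reduces down the columns and axis=1 reduces across each row. Without an axis argument, the reduction covers the whole array.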

Fetching the instructor files

To get the files for this set of short exercises, first set the GITHUB_USERNAME environment variable by running the following command at the Linux command line (replacing replace_me with your GitHub username):

GITHUB_USERNAME=replace_me

(Remember that you can double-check whether the variable is properly set by running echo $GITHUB_USERNAME.)

Then navigate to your Short Exercises repository and pull the new material:

cd ~/capp30121
cd short-exercises-$GITHUB_USERNAME
git pull upstream main

You will find the files you need in the se5 directory.

IMPORTANT: If you are unable to obtain the instructor files by running the commands above, do not try to add the files in some other way. Doing so will likely prevent you from submitting your code. Instead, please seek assistance on Ed Discussion or at office hours.

Testing

As usual, you will want to test your solution manually before you try the automated tests. Remember to set up autoreload before you start testing.

$ ipython3

In [1]: %load_ext autoreload

In [2]: %autoreload 2

In [3]: import se5

Exercises

IMPORTANT: While many of the exercises below can be completed using for and while loops, the purpose of these exercises is to learn how to use the more efficient NumPy functions. Therefore, do not use for or while loops when completing these exercises. You also must not use list or dictionary comprehensions. For reference, we have included an estimate of the number of lines of code needed to implement each solution.

Arrays and scalars

  1. (5 lines) Complete the function reshape_array(x, new_dims), which takes a 2-dimensional array x and a 2-tuple of integers representing the dimensions of a 2-dimensional array. This function returns a new array with the values of x and dimensions new_dims, if possible. Otherwise, it returns a copy of x.

    For example, if x is np.array([[1, 2, 3, 4, 5, 6]]) and new_dims is (2, 3), reshape_array(x, new_dims) should return np.array([[1, 2, 3], [4, 5, 6]]).

    On the other hand, if x is np.array([[1, 2, 3, 4, 5]]) and new_dims is (2, 3), reshape_array(x, new_dims) should return a copy of x, that is, the new array np.array([[1, 2, 3, 4, 5]]). x cannot be reshaped into a 2 by 3 array because it has only 5 values, and a 2 by 3 array holds 6.

  2. (1 line) Complete the function harmonic_sequence(N) which returns the sum of the first N values in the harmonic sequence. That is, it returns the sum: \(1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{N}\).
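
Arithmetic between a scalar and an array is applied to every element, which is what makes one-line solutions possible. The sketch below illustrates the idea on a different series (powers of one half). The names are illustrative only.

```python
import numpy as np

# np.arange(stop) starts at 0 and excludes stop: [0, 1, 2, 3, 4].
k = np.arange(5)

# np.arange(start, stop) also excludes stop: [1, 2, 3].
one_to_three = np.arange(1, 4)

# A scalar combined with an array is applied to every element:
# [1.0, 0.5, 0.25, 0.125, 0.0625]
powers = 0.5 ** k

# Reduce the element-wise results to a single number.
partial_sum = np.sum(powers)
```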

Masking array values

  1. (2 lines) Complete the function clip_in_range(x, lb, ub), which takes an n-dimensional array x, a lower bound lb, and an upper bound ub, and modifies x so that its values are between lb and ub, inclusive.

    For example, let x be np.array([3, -2, 8, 4, 5, 9]). After the call clip_in_range(x, 0, 5), x should be np.array([3, 0, 5, 4, 5, 5]).

    This is the NumPy version of clip_in_range from SE #2.

  2. (1 line) Complete the function fill_missing_data which takes an n-dimensional array that contains “missing” data represented with the value -1. This function should modify the array so that the missing data is replaced with the mean of the values in the array, excluding the missing values.

    For example, let x be np.array([[-1, 2, 8], [6, 4, -1]]). There are two missing values in x (at locations (0, 0) and (1, 2)). After the call fill_missing_data(x), x should be np.array([[5, 2, 8], [6, 4, 5]]). The missing values are replaced with the average of the rest of the data (i.e., the average of the values 2, 8, 6, 4).

    You can assume that there is at least one non-missing value in x.
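
The general masking pattern can be sketched as follows, assuming a made-up array of readings in which negative values are unwanted. A comparison produces a boolean array of the same shape, which can then be used to count, select, or assign values in place.

```python
import numpy as np

readings = np.array([[4, -3, 7],
                     [-1, 5, 2]])

# A comparison yields a boolean mask with the same shape as the array.
negative = readings < 0

# True counts as 1, so summing a mask counts the matching elements.
num_negative = np.sum(negative)

# Indexing with a mask selects the matching elements as a 1-dimensional array;
# ~ inverts the mask.
positive_mean = readings[~negative].mean()

# Assigning through a mask modifies the original array in place.
readings[negative] = 0
```

Because the assignment goes through the mask, no new array is created: the caller sees the change in the array that was passed in.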

Indexing

  1. (4 lines) Complete the function smallest_span which returns the index of the row with the smallest “span” of values. The span of a row is its largest value minus its smallest value.

    For example, smallest_span(x) should return 2 if x is the array:

np.array([[7, 9, 1, 2],   # span: 8
          [-1, 3, 8, 0],  # span: 9
          [6, 7, 2, 1]])  # span: 6

  2. (8 lines) Complete the function select_row_col(x, row_idx, col_idx), which takes a 2-dimensional array x and returns the rows, columns, or sub-array specified by row_idx and col_idx. If row_idx is a list and col_idx is None, return the specified rows. If row_idx is None and col_idx is a list, return the specified columns. If both row_idx and col_idx are lists, return the sub-array specified by the given rows and columns. If both row_idx and col_idx are None, return the array itself. For example:

In [1]: x = np.array([[0, 1, 2],
                      [3, 4, 5],
                      [6, 7, 8]])

In [2]: se5.select_row_col(x, [1, 2], None)
Out[2]:
array([[3, 4, 5],
       [6, 7, 8]])

In [3]: se5.select_row_col(x, None, [1, 2])
Out[3]:
array([[1, 2],
       [4, 5],
       [7, 8]])

In [4]: se5.select_row_col(x, [1, 2], [0, 2])
Out[4]:
array([[3, 5],
       [6, 8]])
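
One point about indexing with lists is easy to get wrong: passing a row list and a column list together pairs them element by element instead of selecting a block. The sketch below, on a made-up array named grid, contrasts the two behaviors. np.ix_ is one way to select a block. It is shown here as an option, not as the required approach.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
# [[ 0,  1,  2,  3],
#  [ 4,  5,  6,  7],
#  [ 8,  9, 10, 11]]

rows = grid[[0, 2]]        # rows 0 and 2
cols = grid[:, [1, 3]]     # columns 1 and 3

# Two lists together are paired up: elements (0, 1) and (2, 3).
paired = grid[[0, 2], [1, 3]]

# np.ix_ builds an open mesh, selecting every combination of the
# given rows and columns as a 2 x 2 block.
block = grid[np.ix_([0, 2], [1, 3])]
```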

Submitting your work

Once you’ve completed the exercises, you must submit your work through Gradescope (linked from our Canvas site). Gradescope will fetch your files directly from your GitHub repository, so it is important that you remember to commit and push your work!

To submit your work, go to the “Gradescope” section on our Canvas site. Then, click on “Short Exercises #5”. Then, under “Repository”, make sure to select your uchicago-CAPP30121-aut-2022/short-exercises-$GITHUB_USERNAME.git repository. Under “Branch”, just select “main”.

Finally, click on “Upload”. An autograder will run, and will report back a score. Please note that this autograder runs the exact same tests (and the exact same grading script) described in Testing Your Code. If there is a discrepancy between the tests when you run them on your computer, and when you submit your code to Gradescope, please let us know.

Your ESNU score on this set of exercises will be determined solely on the basis of these automated tests:

Grade               Percent tests passed
Exemplary           at least 95%
Satisfactory        at least 75%
Needs Improvement   at least 50%
Ungradable          less than 50%

Please remember that you can submit as many times as you want before the deadline. We will only look at your last submission, and the number of submissions you make has no bearing on your score.