Project Part 2: Electronic Exchanges

Part 2 Due Date Friday, May 27th at 9pm. (Note the non-standard due date)

In this part of the project, you are going to replace the concrete data structure used to implement books, expand the functionality of exchange to handle cancel orders, and pull it all together in a program that reads orders from a file, and writes action reports to a file.

Getting Started

You will be using the project repository you created for Part 1 for this part as well. To pick up the files you need for this part, navigate to your cmsc15200/project-GITHUB_USERID directory (where GITHUB_USERID) is replaced with your GitHub account name and then run:

git pull upstream main

If this command succeeded, you will have a subdirectory named part2. You will do all of your work for this part in this directory. See the file README.md for a description of the files in the distribution.

We have given you clean copies of the files from Part 1, but you may want to start by copying book.c, book.h, exchange.c, and if you made any changes, order.c and order.h from your part1 subdirectory to your part2 subdirectory.

Deliverables

The deliverables for this project are broken into three sections. We strongly encourage you to work on them one at a time.

Deliverable #1

Your first task is to change the concrete implementation of your book data structure from a linked list to a heap. Please note that while we discussed the recursive version of the heap helper functions sift_up and sift_down in class, your implementations for these functions and the rest of the code you need to manage a heap must be iterative. To be clear: You may not use recursion in your heap implementation.

You will need to rewrite your struct book to hold the information needed to for the heap (the array, the number of heap slots, and the current number of the heap slots that are occupied) and you’ll need to rewrite your functions to update the heap as needed.

For the heap array, start out with a dynamically allocated array of some initial size (say, 5) and grow it as needed. You will need to use the C library function realloc to grow the array. You can learn about this function using man realloc. You can also look at the implementation of the action record data structure, which uses realloc to grow the array that holds the action records as needed. Unlike the action record array, which holds structures, you will likely want an array of pointers to structures for your heap.

We have provided a version of realloc named ck_realloc in util.c that checks the return value for you.

If the interface for your book data structure is well thought out, you should be able to replace the underlying implementation without needing to change your implementations of process_order, mk_exchange, free_exchange, and print_exchange.

Deliverable #2

Your second task is to update process_order to handle cancel orders.

A cancel limit order is a request to reduce the number of shares associated with a previously received order, which is identified by its order reference number. To process a cancel order, determine the appropriate book (buy or sell), find the relevant order, reduce the number of shares by the number specified in the cancel order, and add an action to the action report, if appropriate.

Two seemingly anomalous situations can arise with cancel orders: (1) a cancel order arrives with a request to cancel more shares than are available in the corresponding booked order and (2) a cancel arrives with an order reference number that does not appear in the relevant book. Both situations arise from the same cause: one or more trades execute between the arrival of the add limit order and the arrival of the cancel limit order. The corresponding order will appear in the book with a diminished number of shares, if the number of shares traded between the add and the cancel is smaller than the number of shares added originally. In the extreme case, when all the shares get traded between the add and the cancel, the order does not appear in the book at all.

We have updated the set of actions handled by the action report data structure to include CANCEL_BUY and CANCEL_SELL. The oref and price needed for the action can be taken from the cancel order or from the corresponding booked order, as they will be the same. The number of shares cancelled depends on the state of the original add limit order when the cancel order is received.

To make this more concrete, here are some sample orders:

I,UOCCS,A,S,100,550000,1000 @ time 10
I,UOCCS,A,B,70,558000,1070 @ time 20
I,UOCCS,A,S,20,540000,2000 @ time 30
I,UOCCS,A,S,10,550000,1010 @ time 40
I,UOCCS,C,S,20,540000,2000 @ time 50
I,UOCCS,C,S,100,550000,1000 @ time 60
I,UOCCS,A,B,10,550000,1012 @ time 70
I,UOCCS,C,B,10,550000,1010 @ time 80

And here is the state of the books after processing Order 1010 at time 40:

Ticker: UOCCS
Buy Book:
  Book is empty

Sell Book:
  I,UOCCS,A,S,20,540000,2000 @ 30
  I,UOCCS,A,S,30,550000,1000 @ 10
  I,UOCCS,A,S,10,550000,1010 @ 40

The action report for the order at time 50 is:

20 shares CANCELED (SELL) at price 540000 (2000)

and here is the state of the books after processing the cancel order:

Ticker: UOCCS
Buy Book:
  Book is empty

Sell Book:
  I,UOCCS,A,S,30,550000,1000 @ 10
  I,UOCCS,A,S,10,550000,1010 @ 40

The action report for the order at time 60 is:

30 shares CANCELED (SELL) at price 550000 (1000)

Notice that even though the order asked to cancel 100 shares from order 1000, only 30 were cancelled because some were traded before the cancel arrived. The state of the books after this order is:

Ticker: UOCCS
Buy Book:
  Book is empty

Sell Book:
  I,UOCCS,A,S,10,550000,1010 @ 40

The next order (at time 70) yields a trade:

10 shares TRADED at price 550000 (1010)

after which both books are empty.

Finally, the last order (at time 80) wants to cancel shares from order 1010, but it is already gone from the book. There is no action to record when the number of shares is zero. Your function should merely return an empty action report in this case.

Heaps are not designed for searching for a specific value. To find a specific order in the heap, you can treat the heap array as a regular array and do a linear search. Of course, you’ll need to think about how to re-heapify your heap when the number of shares in an order in the middle of the heap gets reduced to zero.

The amount of code needed to do this task is small. It is mostly an exercise in thinking carefully about how to manage the heap and whether you want to change or simply extend the book interface.

Deliverable #3

In this task, you will pull everything together into a program that reads orders from a file and writes action reports to a file.

The work for this task will be done in the file: simulate.c. The program must take two command-line parameters: the ticker symbol and a test number.

Your program should generate the error: usage: simulate <ticker symbol> <test number> and then exit, if provided with the wrong number of arguments.

You will construct the necessary file names from the test number. The order filenames will have the format: tests/test%d_orders.csv, where the %d will be replaced by the test number. So for test 0, the file name is tests/test0_orders.csv. The output will be written to a file named tests/test%d_actions.csv, where again, the %d will be replaced by the test number. For test 0, the output filename will be tests/test0_actions.csv.

You should read and process the orders one by one. You do not need to construct a data structure to hold all of the order strings. You may assume all of the orders are for the same ticker symbol and that the lines in the orders file are well-formatted. You are responsible for computing the time for each order. Start the clock at 0 and move forward by 1 time unit for each order. (Your books should still be able to handle delayed orders, even though all the order from simulate will have increasing times.)

We have provided a new function in the action report data structure that takes an action report, a file pointer, and an integer index and writes the actions in the report out to the file, one per line. The format is:

index,action,oref,price,number of shares

For this task, number the orders starting at zero, and then pass the order number as the index. For an example, see tests/test6_action_expected.csv.

You will find the following C library functions useful for this task: fgetc, fopen, fclose. The first function gets a single character from a file stream. The other two are used for opening and closing files. See the respective man pages for these functions for details. Please note that if a file does not exist, your code should print a suitable error message to stderr and exit.

(Do not use getline, which looks helpful in this situation. Unfortunately, it is not an official part of the standard and its type signature is subtlely different on our Linux servers and on gradescope.)

We have included a target for simuate in the Makefile. Here’s a sample run of this program:

./simulate UOCCS 6

Test 6 (file: tests/test6_orders.csv) contains seven orders:

I,UOCCS,A,S,100,550000,1000
I,UOCCS,A,B,70,558000,1070
I,UOCCS,A,S,20,540000,2000
I,UOCCS,A,S,10,550000,1010
I,UOCCS,A,B,20,555000,1020
I,UOCCS,A,S,100,570000,2010
I,UOCCS,A,B,100,560000,1040

Here’s the expected result (from tests/test6_actions_expected.csv):

0,BOOKED_SELL,1000,550000,100
1,EXECUTE,1000,550000,70
2,BOOKED_SELL,2000,540000,20
3,BOOKED_SELL,1010,550000,10
4,EXECUTE,2000,540000,20
5,BOOKED_SELL,2010,570000,100
6,EXECUTE,1000,550000,30
6,EXECUTE,1010,550000,10
6,BOOKED_BUY,1040,560000,60

Space management

As in Part 1, your grade will, in part, depend on whether you have successfully deallocated all the memory allocated by your code.

Testing

The testing set up for deliverables one and two is similar to Part 1. We have added a sample cancel test to student_test_exchange.c. You are welcome to add tests to this file.

We have also provided files that contain the tests that we ran on your code for Part 1. Each test has a file with a list of orders, a file with the times used, and a file with the expected action reports. See part2/tests/README.md for more information.

These files will be useful for the third deliverable.

The diff utility will be helpful in determining whether your output is correct. For example, you can run:

./simulate UOCCS 6
diff tests/test6_actions_expected.csv tests/test6_actions.csv

diff runs silently if the files are the same. You are welcome to add more test files to this directory. Make sure to end the file with a new line and do not include any blank lines.

When you upload your code to gradescope, the initial autograder will run three tests. The first two are available in the distribution in test_exchange.c. This arrangement will allow you to make sure that your code compiles and runs on at least two tests in Gradescope. The third runs the simulate example from the writeup and compares your results with the expected results.

If you see Exchange (1.0/1.0), Cancel (1.0/1.0), and Simulate (1.0/1.0)` under ``Passed Tests, then your code passed the three tests provided. Ignore the 3/70 autograder score. This score will change when we run the final autograder.

We will replace this autograder with actual tests that will be used to verify your code once the deadline passes. We do not promise that the tests provided in student_test_exchange.c and in test_exchange.c cover all the cases that will be covered in the automated tests.

Utility Functions

As in Part #1, we have provided some utility functions in util.h and util.c. You are welcome to use any of these functions.

Grading

For this project, the weights will be:

  • Completeness: 70%

  • Code Quality: 30%

The completeness score will include your performance on automated tests and on whether you successfully avoid memory leaks. Just a reminder, the automated tests will not be available to you as you work on your project.

The code quality score will depend on how well you decompose the problem into pieces, follow the course style guide, and on whether you successfully implement the required deliverables.

Cleaning up

Before your submit your work, we strongly encourage you to get your directory into a clean state. Run make clean to remove any executables that are laying around and then run git status . to make sure you did not forget to add/commit any of the required files.

Also, remove directives, such as:

// YOUR CODE HERE

// Replace 0.0 with an appropriate return value

from your code. Make sure to recompile and rerun the tests after you clean up. It is easy to introduce a syntax error at this stage.

Submission

Before you upload your submission, make sure you have added the required information to the header comment in simulate.c:

  • your name,

  • the names of anyone beyond the course staff you discussed the assignment with, and

  • links to any resources that you used other than course materials and the course textbook.

Note: write None in the relevant field, to indicate that you did not use any sources and/or consult anyone beyond the course staff.

To submit your work, you will need to add, commit, and push simulate.c, exchange.c, book.c, book.h, order.c, and order.h to GitHub (git -u adds known files that been modified) and then upload your repository to Gradescope under the Project Part #2 assignment.

You are welcome to upload your code to Gradescope multiple times before the deadline, but it is wildly inefficient to use Gradescope as a compute server. You should test your code on the CS Linux machines and only upload it when you think you are finished or when you are getting close to the deadline and want to do a “safety” submission.

You are responsible for making sure that the code you upload compiles and runs. You will get zero credit for code that does not compile.

Finally, a reminder that we will not accept late work.