Goals for this Part 1

  • Practice binary search tree operations
  • Learn how memory allocation works

For this lab, you are welcome to get technical help from another student on how to use or install any of the tools involved. You may also get syntax help on C. You may not, however, get help on the algorithmic portion of the exercise outside of office hours or piazza questions to TAs or instructors.

Set up

Remember to create a new directory in your repository for this lab.
$ cd CNET-cs152-win-19
$ mkdir hw8
$ svn add hw8
$ cd hw8

Problems:

In Part 1, we will build the appropriate functions that we will be using to create and operate on our data structures that together implement our memory allocation technique. The files we will be working with are:

Provided here are the header files. You will be re-using bst.c and memory.c from last week's work, adding extra functions as specified in the .h file. I have provided a working version of llist.c and llist.h for linked list functions necessary for the iterator. Feel free to add helper functions into appropriate files, as long as the function fits in the purpose of the file.

Memory Allocation

The job of the memory allocator is to provide chunks of memory when requested. This requires keeping track of all available memory. Memory begins as a single, large allocation from the operating system.

In our system, we are going to keep track of these items in a binary search tree, ordered by the size of the chunk of memory. Every time the program asks for memory, we want to give it a chunk that is close to its size. Therefore, it will search for the item that is closest (but larger than) its size. If what it finds is more than double its size, then it will split off that part from the rest.

To do this, we need some data structures. We are going to have avail_mem, which points to a bst. This is a global variable because we want it to be able to be accessed by both my_malloc and my_free. In C, the only way to make a variable accessible to two different functions (without be an input parameter) is to make it a global variable. You will notice the following declaration at the top of my_alloc.c:

	bst *avail_mem = NULL;
	
This means that our memory begins looking like this:

We need to create an empty bst. This is the purpose of the init_allocator function. After this function, the allocator memory should look like this:

Note that this is a bst with root node NULL. This means we don't actually have any memory to hand out. Once my_malloc is called, when the code discovers that there is no memory in the bst (or, any time when there isn't enough contiguous memory to satisfy the request), we need to get more memory.

To get more memory, you call malloc and ask for a full page - which we will consider 4096 bytes for this assignment.

	void *memory = malloc(4096);
	
You now need to place this into your tree of available memory. If you were to place it in there (without splitting off some to give it away), this is what it would look like:

Now the bst points to a node. This node's left and right children are NULL pointers. The data, however, points to a memory struct. The memory struct contains the address of the beginning of that 4096 bytes of memory, and it records the available size as 4088.

Why only 4088? Why not 4096? There is a difference between how much memory is there and how much is usable to the program. This is for a very important reason. When a chunk of memory is provided to the program, the memory allocator removes it from its data structure - thereby losing the size information. The allocator could have a second bst, sorted by memory address, that stores all allocated memory with the size of that memory. However, it is more efficient to store the size with the memory itself. Therefore, whenever the allocator hands out memory to a program, it places the size in the beginning of the memory and hands out a pointer to a spot after the size. This way, when the pointer is provided in a free call, the size is obtained in O(1) time rather than O(logn) time in a search structure.

To illustrate this, imagine this initial call to my_malloc was for 128 bytes. We would give the user the last 128 bytes of this memory allocation. The pointer ret_val points to the beginning of this 128 bytes of memory. In order to keep track of the size, we would place the number 128 into the integer slot prior to the allocation. In typical systems today, sizeof(int) is 8 for 8 bytes (64 bits). This leaves 3960B left. We split that into the 8 bytes for the size (should this be given out in a single allocation request) and 3952B for the program to use.

Problem 1:

You need to make two major changes to the memory functions. First, you need to modify the type of the addr to be void* rather than unsigned int. Second, you will implement the helper functions that operate on memory structs.

The new struct definition:
typedef struct {
  void *addr;
  unsigned int size;
} memory;
This also changes the following functions:
memory* memory_new(void *addr, unsigned int size);
int memory_addr_cmp(const void* x, const void* y);
int memory_size_cmp(const void* x, const void* y);
void memory_print(void* data);
Now you are ready to implement your allocation helper functions.
memory *allocate_memory_page();

This function allocates a single page (4096 bytes) and creates the memory structure that points to it. The following data structures will be created:

The pointer ret_val will be returned.

void *split_memory(memory* data, unsigned int size_desired);

This function takes a memory* that has more than size_desired memory in it and splits it off into the chunk that can be passed to the program and the portion that is still able to be passed out later. I have shown the before and after picture for the following call: split_memory(data, 128). It returns a pointer to the last 128 bytes of the original memory.

Be careful in adjusting the memory address. You can't perform arithmetic on a void* because the computer doesn't know how big the type is. For example, if you do the following:

char *pc = (char *)malloc(sizeof(char));
int *pi = (int *)malloc(sizeof(int));
pc = pc+4;
pi = pi+4;

pc advances by sizeof(char)*4 and pi advances by sizeof(int)*4. Luckily, sizeof(char) is 1, therefore if you want to perform any arithmetic on pointers in such a way that it is tied to the actual bytes, then use a char *. If you have a void pointer and you want to make a pointer that points 100 bytes later than it, this is what you would do:
void* advance_by_100(void *v)
{
	char *pc = (char*)v;
	pc += 100;
	return (void*)pc;
}

Problem 2: Iterate

Once this is completed, we will use these functions to implement our iterator defined in bst.h:

  • void* bst_iterate(bst* tree);
  • This function is an iterator that accesses items in the binary search tree in an in-order traversal. Recall that an inorder traversal requires that all left-subtrees are visited, followed by the root, followed by right subtrees, in that sequence. This is the location where we will leverage our choice of linked-list access mechanism.

    An example of an in-order traversal is shown below:

    Inorder Tree Traversal

    (source: https://en.wikipedia.org/wiki/Tree_traversal#In-order)

    The expected use of an iterator is in a for loop, like such:

    void *item;
    for(item = iterate(avail_mem); item != NULL; item = iterate(NULL))
    	do_something_with_item;
    

    You could print out the items, place them into a bst using a different comparison function, or whatever you want. Each iteration through the loop, the iterator will return a different item from the bst. Once it has completed the bst, it will return NULL, which will exit the loop.

    To implement this, you will need a linked list. I have provided the linked list implementation and an iterator for the linked list (in order to show you how iterators are implemented). This is the same code we went over in class. The key is the static variable that holds state between function calls.

    Inspect the code carefully and refer to your notes. A tree seems much more challenging than a list. The key is to traverse the bst during the initialization step of the iterator to create a linked list out of the tree. Then you use the linked list iterator for the rest.

Problem 3: Allocator

Now you are ready to create your basic allocator! Begin with the init_alloc function. It must be called at the beginning of the function to initialize the allocator.

void init_alloc();

At the beginning of the program, avail_mem is null. After this call, your structure should look like this:

The next step is the crux of this week's assignment - creating the basic working allocator.

void *my_malloc(int num_bytes);

The first step is to adjust the number of bytes requested, if necessary. The reason is that on many computers, all memory accesses must be aligned. That means that if someone asks for 3 bytes, you can't actually allocate only 3 bytes. Many types are more than 3 bytes - int, float, double. All variables' addresses must be a multiple of their size. That is, doubles' addresses must be a multiple of 8, and int and float's addresses must be a multiple of 4. Because the largest primitive type (built-in type) is double (8 bytes), all allocation must occur on sizes that are multiples of 8. Therefore, if the size requested is not a multiple of 8, the size needs to be rounded up to be one. Think of how to do this using the modulus (%) operator. Do not use a math library function.

Your next job is to keep track of available memory. You will use a bst. When a request comes in, you will search for an item that is equal to or larger than the requested size of memory - whatever items is closest to the size.

void* node_item_or_successor(node *n, void *item,
        int (*cmp)(const void* x, const void* y));
void* bst_item_or_successor(bst *b, void *item);

If the chunk found is at least twice the requested size, then you split the memory and give out the end part of the memory. You implemented the split function above in the memory problem. If the found chunk size is between the size requested and double the size requested, then you give out the entire chunk. If, at any time, there is no chunk larger than or equal to the requested size, then you obtain more memory, and take a chunk out of that, inserting into your tree any extra memory from that chunk. You implemented allocate_memory_page above.

Finally, you need to implement the free function that places a chunk back into the tree of available memory.

void my_free(void *address);

Testing

You are expected to have tests both for the helper functions and for the full malloc and free functions. It is very difficult to know whether malloc and free are working unless you know what is going to happen. Therefore, you need to work up a set of test cases with fairly known outcomes. While you don't know what the beginning pointer is that the system will give to the first alloc call of 4096 bytes, you should know what all the pointers will be relative to that first value.

Submit

At this point, you should have done the following:
  • Checked out your repository
  • Created a folder named hw8 in your repository and run svn add hw8
    $ svn add hw8
  • Created , added, and filled in four files (with their corresponding header files): bst.c, bst.h, llist.c, llist.h, memory.c, memory.h, my_alloc.c, my_alloc.h. and Makefile inside your hw8 directory.
  • Copied over your code from bst.c and memory.c from last week.
  • $ svn add *.h *.c Makefile
  • Compiled your executable manually and with the Makefile
  • Executed your code to make sure it runs properly and inspected the results.
  • $ svn commit -m "hw8 warmup complete"
Now you're ready to move on to Part 2!!