Homework #6

Due: Saturday, November 18th at 11:59pm (Extended Deadline)

In this assignment, we will continue implementing our shell program. This assignment focuses on spawning child processes (i.e., the command lines) and waiting for them to complete.

CS Linux Machine

You will need access to an Linux based machine when working on your homework assignments. You should not test your programs on macOS or Windows Linux because these operating systems do not provide all utility commands necessary for completing this and possibly future assignments. Additionally, if they do provide a command then it may not contain all options that a Unix-like system provides. We will use and grade all assignments on the CS Linux machines and all programming assignments must work correctly on these machines. However, you can work locally on a Unix or Unix-like machine but ensure that you test your final solutions on a CS Linux machine.

Please follow the instructions provided here

Creating Your Private Repository

There is no new repository to create for this assignment. You will continue to use the repository you created in homework #5.

Short Answer Questions

Each assignment may include a section dedicated to answering a few short answer questions. Please turn in a readable text file named saqs (e.g., saqs.pdf). Store this file inside the saqs/saqs.pdf directory of your msh repository. You will place your answers to the following questions in this document. The file is not required to be a PDF but any text document that is easy to open is fine.

Please do not ask questions on Ed or during office hours explaining your answer to SAQs to determine if it is correct or almost correct. Those types of questions will not be answered. This is what we are grading you on. However, you can ask questions related to the concepts you want to use to answer your question or to clarify the understanding of the question.

SAQ 1

#include <stdio.h>
#include <unistd.h>

int main()
{
  pid_t id;
  int i;
  int j = 2;
  while (j > 0) {
  id = fork();
  if (id > 0) {
      printf("<1>\n");
  }
  else if (id == 0) {
      printf("<2>\n");
  }
  for (i = 1; i <= 2; i++)
      printf("%d ", i);

    j--;
  }
  return 0;
}

Task: Submit a separate image file of the process graph for this code.

SAQ 2

Consider four processes with the following starting and ending times:

Process

Start Time

End Time

A

5

7

B

2

4

C

3

6

D

1

8

Question: Consider of all the possible ways to pair only two processes together (e.g., “A && B”, “A && C”, “A && D” etc.) from the chart above, which of the pairs are concurrent?

SAQ 3

Can the init process always terminate any process in the system? Explain.

Programming Assignment

For the programming portion of this assignment, you will continue building your msh shell. Specifically, you will focus on spawning new child processes representing jobs. The assignment will have the following tasks to complete:

  1. Create a job.c and job.h module - handles working with jobs.

  2. Update the shell typedef and its various functions to handle job creation, spawning and termination.

Task 1: Job Module (job.h and job.c)

The job module will contain information about the current jobs (i.e., foreground and/or background jobs) in existence. Unlike in the prior shell module, we want you to decide what should go inside this module and how to incorporate it into the shell module. The implementation must be defined in a file src/job.c and its header in include/job.h. The header must contain the following definitions where you can add new fields but cannot delete or rename these types:

#ifndef _JOB_H_
#define _JOB_H_

#include <sys/types.h>

typedef enum job_state{FOREGROUND, BACKGROUND, SUSPENDED, UNDEFINED} job_state_t;

// Represents a job in a shell.
typedef struct job {
    char *cmd_line;     // The command line for this specific job.
    job_state_t state;  // The current state for this job
    pid_t pid;          // The process id for this job
    int jid;            // The job number for this job
}job_t;

#endif

The enum job_state is an enumeration type that allows you to assign the job_state_t field to one of those defined constants. It provides better code readability. For example,

job_t job;
job_state_t state = FOREGROUND;
job.state = state;

To help you get started thinking about how to design and incorporate this module into the shell module, we recommend the following:

  1. Inside the msh_t typedef in the shell.h include a new field job_t *jobs to represent an array jobs that are running in the shell.

  2. When allocating the job_t *jobs, we recommend you allocate it to the size of max_jobs (i.e., the parameter from alloc_shell). You may not use every element in the array; however, it will make it easier to add,delete, and modify jobs in the array.

  3. Define a bool add_job(job_t *jobs, int max_jobs, pid_t pid, job_state_t state, const char *cmd_line) function that adds a new job to the array. Reminder if there are no more jobs left to allocate (i.e., max_jobs has been reached) then this function should return a false; otherwise, true to indicate the job was added.

  4. Define a bool delete_job(job_t *jobs, pid_t pid) function to remove a job from the array of jobs based on the pid_t provided. You should think about the return value for this function.

  5. Define a void free_jobs(job_t *jobs, int max_jobs) function to deallocate the jobs array.

The above suggestions are not required and you are free to design the module as you wish as long as it has at least the above skeleton code. You can make modifications to the headers defined above, add new functions, etc. However, for each function prototype in the job.h file, you must provide documentation for that function, for example here’s the documentation for alloc_shell from the shell.c file:

/*
* alloc_shell: allocates and initializes the state of the shell
*
* max_jobs: The maximum number of jobs that can be in existence at any point in time.
*
* max_line: The maximum number of characters that can be entered for any specific command line.
*
* max_history: The maximum number of saved history commands for the shell.
*
* Returns: a msh_t pointer that is allocated and initialized
*/
msh_t *alloc_shell(int max_jobs, int max_line, int max_history);

Task 2: Update the evaluate function and shell module

The current state of the evaluate function is to print the argv and argc from the command line. For example,

msh> /usr/bin/ls msh-lamonts
argv[0]=/usr/bin/ls
argv[1]=msh-lamonts
argc=2

Now, the evaluate function must be updated as follows:

  1. Create a new child process using fork() that will handle the execution of the current job.

  2. Have the parent add the newly created job to the jobs array.

  3. The child process will then call execve(...) to execute the job.

  4. The parent process will blocking using waitpid(...) or wait(...) (you choose) until the child process completes.

  5. Once the child process terminates, the parent process must delete the job from the jobs array.

Assuming the msh-lamonts directory is a copy of the msh repository, then running the /usr/bin/ls msh-lamonts now evaluates to the following

msh> /usr/bin/ls msh-lamonts
bin  data  include  README.md  saqs  scripts  src  tests

Notice how we had to provide the full path to the ls executable (i.e., /usr/bin/ls and not just the ls). This is required since our shell currently is not looking at the PATH environment variable to find executables. Thus, when you are testing your msh shell currently you must provide the full executable.

Multiple Jobs

What happens if you have multiple jobs on the same command line? For example,

msh> /usr/bin/ls msh-lamonts; /usr/bin/printf "%02d/%02d/%02d\n" 2 2 2023

then this must cause the evaluate function to run the jobs sequentially after each other. The shell goes through the five steps described above for the first job (i.e., msh> /usr/bin/ls msh-lamonts), and then does the steps again for the subsequent jobs that are running in the foreground (i.e., /usr/bin/printf "%02d/%02d/%02d\n" 2 2 2023). Thus, the following above command produces

msh> /usr/bin/ls msh-lamonts; /usr/bin/printf "%02d/%02d/%02d\n" 2 2 2023
bin  data  include  README.md  saqs  scripts  src  tests
02/02/2023

Background Jobs

Background jobs are asynchronous in nature and should be launched and executed in a child process and not waited on by the parent process. This means that the parent process should not wait on background job to terminate but rather continue on executing the next job. However, before the evaluate function returns it must check to see if any background jobs have completed and delete them from the jobs array. You can do this by calling waitpd in the following way

pid_t term_pid = waitpid (child_pid, &child_status, WNOHANG);

where child_pid is the pid of the child process assigned to a background job. The WNOHANG allows the parent to not block if the background child process has not completed yet. As a reminder, waitpid will return the pid of the terminated child process. Please note: this is a naive implementation and we will update this mechanism in the next assignment using signals.

Testing

You should do you own local testing of the msh program yourself.

Professor Samuels will provide additional test cases on Tuesday November 14th; however, we want you to first do your own testing to learn how to become better testers before providing you with more test cases.

Grading

Programming assignments will be graded according to a general rubric. Specifically, we will assign points for completeness, correctness, design, and style. (For more details on the categories, see our Assignment Rubric page.)

The exact weights for each category will vary from one assignment to another. For this assignment, the weights will be:

  • SAQs: 15%

  • Completeness: 55%

  • Correctness: 20%

  • Design/Style: 10%

Submission

Before submitting, make sure you’ve added, committed, and pushed all your code to GitHub. You must submit your final work through Gradescope (linked from our Canvas site) in the “Homework #6” assignment page via two ways,

  1. Uploading from Github directly (recommended way): You can link your Github account to your Gradescope account and upload the correct repository based on the homework assignment. When you submit your homework, a pop window will appear. Click on “Github” and then “Connect to Github” to connect your Github account to Gradescope. Once you connect (you will only need to do this once), then you can select the repository you wish to upload and the branch (which should always be “main” or “master”) for this course.

  2. Uploading via a Zip file: You can also upload a zip file of the homework directory. Please make sure you upload the entire directory and keep the initial structure the same as the starter code; otherwise, you run the risk of not passing the automated tests.

Note

For either option, you must upload the entire directory structure; otherwise, your automated test grade will not run correctly and you will be penalized if we have to manually run the tests. Going with the first option will do this automatically for you. You can always add additional directories and files (and even files/directories inside the stater directories) but the default directory/file structure must not change.

Depending on the assignment, once you submit your work, an “autograder” will run. This autograder should produce the same test results as when you run the code yourself; if it doesn’t, please let us know so we can look into it. A few other notes:

  • You are allowed to make as many submissions as you want before the deadline.

  • Please make sure you have read and understood our Late Submission Policy.

  • Your completeness score is determined solely based on the automated tests, but we may adjust your score if you attempt to pass tests by rote (e.g., by writing code that hard-codes the expected output for each possible test input).

  • Gradescope will report the test score it obtains when running your code. If there is a discrepancy between the score you get when running our grader script, and the score reported by Gradescope, please let us know so we can take a look at it.