Computer Science with Applications 1 & 2

More Simulation: Structural balance

Due: Thursday, Oct 25, 2012 at 6pm

The goal of this assignment is to give you more practice with loops, arrays, and functions and to reinforce the simulation concepts learned in the first programming assignment.

Introduction

Networks (aka graphs) can be used to model relationships among people, companies, countries, etc., and are of great interest to people in many fields. A network consists of a set of nodes that are used to model entities (people, countries, etc.) and edges that describe relationships among the entities. Edges may be directed or undirected, weighted, or of different types. In this assignment, we will look at a model of the evolution of friendships in a neighborhood. The nodes of our network will represent people who are either friends (connected by friend edges) or enemies (connected by enemy edges). To simplify the discussion, we are going to assume that everyone in the neighborhood knows everyone else and that any two people are either friends or enemies; that is, no one is ambivalent about anyone else.

We are interested in looking at the dynamics of triads (aka, trios) in the network and how changes in their relationships affect the structure of the network as a whole. Social network theory suggests that some combinations of friend/enemy relationships among three people are stable (or balanced) and others are not. There are four possible configurations or types of triads (note: this figure is in color in the on-line version of the assignment):

We identify the type of a triad based on the number of enemy edges. A Type 0 triad represents three mutual friends, which is a stable configuration. Type 2 triads are also stable under the theory that the enemy of my friend is my enemy. In contrast, Type 1 triads are not stable. If person A is friends with persons B and C, it is likely that persons B and C will eventually become friends or at least acquaintances, which yields a Type 0 triad (a property known as triadic closure). Alternatively, it might be the case that persons B and C cannot stand each other, in which case it is likely that person A will eventually change his opinion of either person B or person C, which yields a Type 2 triad. Finally, the theory that the enemy of my enemy is my friend suggests that Type 3 triads are also not stable. Eventually, two of the three persons in the triad will gang up on the third.

The field of social network theory uses the term balanced to describe a triad that is in a stable configuration. A network is said to be in structural balance if every possible triad is balanced. The question of interest to sociologists is: given a neighborhood where the relationships among the people in the neighborhood are chosen at random and a set of rules for how to update relationships to resolve instabilities, how likely is it that we will eventually reach a state of structural balance?

We will answer this question by simulating the evolution of the relationships in the network. To guarantee that our simulations will terminate, we are going to limit the updates that can be done during a single trial simulation. Instead of asking how likely is it that a neighborhood with certain characteristics will reach structural balance, we ask how likely is it that such neighborhoods will reach update-limited structural-balance, i.e., how likely is it that such neighborhoods will reach structural balance within a specified number of updates. Your task is to write a program to answer this question.

Evolution rules

Here are the rules for evolving a single triad:

Type 0 triad: Do nothing, the triad is balanced.

Type 1 triad: There are two possible ways to update a triad to make it balanced:
1. convert the enemy edge to a friend edge (i.e., convert it to a Type 0 triad), or
2. convert one of the friend edges, chosen at random, to an enemy edge (i.e., convert it to a Type 2 triad). You can use the function LocalRandom.randBoolean() to choose which friend edge to convert.
You will choose between these two options based on q, the evolution parameter. This parameter is the probability (between 0 and 1) that a Type 1 triad will evolve into a Type 0 triad (if it does not, it evolves into a Type 2 triad).

Type 2 triad: Do nothing, the triad is balanced.

Type 3 triad: To update the triad, convert one of the three enemy edges, chosen at random, to a friend edge. You can use the function LocalRandom.randDouble(), which will yield a random value in the range [0,1), to choose which enemy edge to convert.
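The random choices in these rules reduce to two small decisions: comparing a random value against q, and mapping a random value to one of several edges. The sketch below is only an illustration of that arithmetic. The class and method names are hypothetical, and the random value r is passed in as a plain double in [0,1) (such as one obtained from LocalRandom.randDouble()) so that the logic can be checked without the course library. Treating a value less than q as the "evolve to Type 0" outcome is one reasonable reading of the rule, not the only possible one.

```java
final class EvolutionChoiceSketch {
    // Hypothetical helper: true if a Type 1 triad should become Type 0,
    // given the evolution parameter q and a random value r in [0,1).
    static boolean becomesTypeZero(double q, double r) {
        return r < q;
    }

    // Hypothetical helper: maps r in [0,1) to an index in 0..n-1 by
    // splitting the interval into n equal pieces.
    static int pickIndex(double r, int n) {
        return (int) (r * n);
    }

    public static void main(String[] args) {
        System.out.println(becomesTypeZero(0.6, 0.4)); // true
        System.out.println(becomesTypeZero(0.6, 0.7)); // false
        System.out.println(pickIndex(0.5, 3));         // 1
    }
}
```

With n set to 3, pickIndex could select which of the three enemy edges in a Type 3 triad to convert.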

Data representation

We will represent neighborhoods with a two-dimensional boolean array, neighborhood, where neighborhood[i][j] is true if person i and person j are friends and false if they are enemies. Note that these matrices are symmetric; that is, neighborhood[i][j] equals neighborhood[j][i]. We will assume for completeness that the people in our neighborhood are not in need of years of therapy; that is, neighborhood[i][i] is true.
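As a small illustration of this representation, the sketch below counts the enemy edges in a triad and changes a relationship while keeping the matrix symmetric. The class name and both helper names are hypothetical; they are not part of the provided code, and your own decomposition may look quite different.

```java
final class NeighborhoodSketch {
    // Number of enemy edges (0 to 3) among persons a, b, and c,
    // which is the triad's type.
    static int triadType(boolean[][] neighborhood, int a, int b, int c) {
        int enemies = 0;
        if (!neighborhood[a][b]) enemies++;
        if (!neighborhood[a][c]) enemies++;
        if (!neighborhood[b][c]) enemies++;
        return enemies;
    }

    // Sets the i-j relationship in both directions so that the
    // matrix stays symmetric.
    static void setEdge(boolean[][] neighborhood, int i, int j, boolean friends) {
        neighborhood[i][j] = friends;
        neighborhood[j][i] = friends;
    }

    public static void main(String[] args) {
        // Person 0 is friends with 1 and 2; persons 1 and 2 are enemies.
        boolean[][] n = {
            {true, true,  true},
            {true, true,  false},
            {true, false, true}
        };
        System.out.println(triadType(n, 0, 1, 2)); // 1
        setEdge(n, 1, 2, true);
        System.out.println(triadType(n, 0, 1, 2)); // 0
    }
}
```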

We have provided a function genNeighborhood(int N, double p) that takes the number of people in the neighborhood, N, and the friendship probability, p, and returns a randomly generated neighborhood (where each pair of neighbors are friends with probability p).

For testing purposes, it can be useful to start from a known neighborhood. The distribution contains some simple neighborhoods for this purpose:

tests/sample-neighborhood-0.txt: a neighborhood that has type 0, type 2, and type 3 triads,
tests/sample-neighborhood-1.txt: a neighborhood with only enemy edges,
tests/sample-neighborhood-2.txt: a neighborhood with only friend edges, and
tests/sample-neighborhood-3.txt: a neighborhood that has type 0 and type 1 triads.

We have also provided functions that read the neighborhood files (boolean[][] TableUtil.loadBooleanTable(String filename)) and print the current state of a neighborhood (TableUtil.printBooleanTable(boolean[][] table)).

Simulation Parameters

There are five parameters for these simulations:

N: the number of people in the neighborhood,
p: the probability that two people will be friends,
q: the evolution parameter,
maxUpdates: the maximum number of updates allowed in a single trial, and
numTrials: the number of trials to run.

Your task

Your task is to write a program that estimates the likelihood that a neighborhood with certain characteristics (number of people, friend probability, and evolution parameter) will reach update-limited structural-balance. Specifically, you need to implement the function estimateBalanceLikelihood in Balance.java. To accomplish this task, you will need to do the following subtasks:

Simulate a single trial evolution: Given a neighborhood, an evolution parameter (q), and the maximum number of updates (maxUpdates), simulate a single trial evolution of the neighborhood. In this simulation, you will continue to update the edges until the neighborhood reaches structural balance or until the maximum number of updates to the edges has been performed. The result of this subtask should be a boolean that indicates whether the specified neighborhood reached update-limited structural-balance during the trial.
Run multiple trials: Given the full set of simulation parameters, execute the specified number of trials. Note: each individual trial should generate a new neighborhood by calling genNeighborhood with N and p. The result of this subtask is a double that corresponds to an estimate of the likelihood that neighborhoods with those parameters will reach update-limited structural-balance. Your implementation of this function should call the function that is the result of the first subtask, rather than repeat the code for a single trial.
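The estimate in the second subtask is simply the fraction of trials that succeeded. The sketch below shows only that counting step, under the assumption that a single trial can be treated as something that returns a boolean. The class name, the method name, and the use of java.util.function.BooleanSupplier as a stand-in for "generate a neighborhood and run one trial" are all hypothetical choices made for illustration.

```java
import java.util.function.BooleanSupplier;

final class EstimateSketch {
    // Runs numTrials trials and returns the fraction that reported success.
    // 'trial' is a hypothetical stand-in for one complete single-trial run.
    static double fractionSuccessful(int numTrials, BooleanSupplier trial) {
        int successes = 0;
        for (int t = 0; t < numTrials; t++) {
            if (trial.getAsBoolean()) {
                successes++;
            }
        }
        // Cast before dividing; int / int would discard the fraction.
        return (double) successes / numTrials;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // A fake trial that succeeds on every other call.
        double estimate = fractionSuccessful(4, () -> calls[0]++ % 2 == 0);
        System.out.println(estimate); // 0.5
    }
}
```

Passing the trial in as a parameter also makes this step easy to test with a fake trial whose outcomes are known in advance.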

We have provided the code for your program to accept all the necessary parameters as command-line arguments. Our code validates the parameters and uses them to invoke estimateBalanceLikelihood.

You should not implement your entire solution in that one function! This exercise requires decomposing the problem into multiple functions; instead of telling you what that decomposition is, it is up to you to decide what functions you will need to solve the problem cleanly. You should think carefully about this decomposition and how you will test your functions before you start writing code.

Getting started

We have seeded your PhoenixForge repositories with a directory named pa3, which contains the following:

Balance.java: you will modify this file.
README.txt: a description of the contents of the directory
TestestimateBalanceLikelihood.java: sample unit test
LocalRandom.java: our local random number generation library (API).
TableUtil.java: a library for reading/writing/printing tables (API)
tests: a directory with some sample neighborhoods and our test code

Balance.java includes code to parse the command-line arguments, call the function estimateBalanceLikelihood, and output the result.

Writing unit tests

Unlike the previous two assignments, we are providing you with only a few tests. Furthermore, these tests only test your complete solution in a few scenarios. So, passing the tests we have provided is no guarantee that your solution is correct.

In this assignment, it is up to you to write tests that will convince you (and us) that your code is correct. These tests are more properly called unit tests because they focus on a single function (a single "unit" of your program) to determine whether it will function properly when used in a larger context. A good set of unit tests covers all the different types of input the function might receive. For example, if you have a function for determining whether a network is in structural balance, you would want to test it on at least one balanced network and one unbalanced network.
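To make the last example concrete, here is a minimal sketch of a balance check together with one balanced and one unbalanced test case. It assumes the boolean-matrix representation described under Data representation. The class name, the function name isStructurallyBalanced, and the test helper are hypothetical, and the tiny three-person networks are hand-built rather than taken from the distribution.

```java
final class BalanceCheckSketch {
    // True if every triad i < j < k has an even number of enemy edges,
    // that is, if every triad is Type 0 or Type 2.
    static boolean isStructurallyBalanced(boolean[][] nb) {
        int n = nb.length;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                for (int k = j + 1; k < n; k++) {
                    int enemies = 0;
                    if (!nb[i][j]) enemies++;
                    if (!nb[i][k]) enemies++;
                    if (!nb[j][k]) enemies++;
                    if (enemies % 2 != 0) return false;
                }
            }
        }
        return true;
    }

    // Prints a description, the actual result, and the expected result.
    static void test(String description, boolean[][] nb, boolean expected) {
        boolean actual = isStructurallyBalanced(nb);
        System.out.println(description + ": actual=" + actual
                           + ", expected=" + expected);
    }

    public static void main(String[] args) {
        boolean[][] allFriends = {
            {true, true, true}, {true, true, true}, {true, true, true}
        };
        boolean[][] oneEnemyEdge = {
            {true, true, true}, {true, true, false}, {true, false, true}
        };
        test("three mutual friends (Type 0)", allFriends, true);
        test("one enemy edge (Type 1)", oneEnemyEdge, false);
    }
}
```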

In the previous assignment, we encouraged you to lump all such tests into a TestAuxiliary.java file. Java programmers traditionally write a separate test class, named TestX.java, for each function X. In this assignment, you must follow this convention and provide test code for all your major functions. To get you started, we have provided sample test code for the function estimateBalanceLikelihood (see TestestimateBalanceLikelihood.java) to give you an idea of how to structure your testing code.

When writing your own test classes, we suggest you do the following:

For each function, come up with at least 3-4 test cases. A test case is simply a specific input to the function for which you know the expected output. That way, you can run your function with that input and see if it returns that output. Keep in mind that you will have to work out these test cases by hand. If you look at the TestestimateBalanceLikelihood class, you will see that we have three test cases.
Make sure you choose test cases that cover different types of input. For example, imagine you were writing test cases for a function that determines whether an integer is prime or not. Let's say you choose some prime and composite numbers arbitrarily: 101, 402, 242, and 773. These are good test cases (you're testing two representative types of input: primes and composites), but they could be even better. This function has a very notable edge case: what about the number 1? The function implementor could have (incorrectly) assumed that the definition of prime numbers includes 1 because, under a certain light, 1 is divisible by 1 and itself (1). A test case that checks whether input 1 returns false would catch this bug. In general, you want to identify such edge cases in your functions and make sure you write test cases for them. A good rule of thumb is that, given a range of possible inputs, you should test the extremes of that range and a few test cases "in between".
For each test case, print a short description of the test, set up the arguments as needed, call the function, and then print the actual result of the call and the expected result. If you look at the TestestimateBalanceLikelihood class, you will see that we use a simple test function to eliminate repeated set-up and output code.
Professional programmers write tests that generate output only when an error is detected. This approach is very useful, because it allows you to focus on the errors, but it takes more effort than is necessary for this assignment. If you look at the TestestimateBalanceLikelihood class, you will see a sample test function, testAlt, that generates output only when it detects an error.
If you're still unclear on what exactly should be contained in a unit test, we encourage you to take a closer look at the TestX classes we provided in the previous assignments.
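The prime-number example above can be sketched as follows. This is an illustration of the suggested structure (description, call, actual result, expected result), not a copy of the provided test classes. The class name, the isPrime implementation, and the test helper are all hypothetical.

```java
final class PrimeTestSketch {
    // Function under test: true if n is prime. Note that 1 is not prime.
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Prints the description, actual result, and expected result,
    // and returns true if they match.
    static boolean test(String description, int input, boolean expected) {
        boolean actual = isPrime(input);
        System.out.println(description + ": actual=" + actual
                           + ", expected=" + expected);
        return actual == expected;
    }

    public static void main(String[] args) {
        test("prime 101", 101, true);
        test("composite 402", 402, false);
        test("composite 242", 242, false);
        test("prime 773", 773, true);
        test("edge case 1", 1, false);
        test("smallest prime 2", 2, true);
    }
}
```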

Your grade will depend, in part, on the quality of your test code, that is, whether your tests are comprehensive.

This part of the assignment may sound tedious, but testing each function as you write it can often significantly reduce the amount of time you spend debugging. One of my test functions turned up a bug---I divided by 30 in a place where I had intended to divide by 3.0---that I had overlooked and might never have found in the absence of careful unit testing.

Unit tests with random numbers

Because testing code that uses random numbers can be tricky, our LocalRandom class allows you to dictate the sequence of numbers to be generated by calling the function LocalRandom.initRandomFixed() with an array of doubles. Subsequent calls to the random number generator will hand out the values you specified in the order they occur in the array. Note: LocalRandom.randBoolean() generates a random value (using the specified values, if provided) and returns true if the generated value is strictly less than 0.5 and false otherwise.

For example, if we wish to test the first update rule for type 1 triads, we can set q to be .6 and use the (1,2,3) triad in sample-neighborhood-3.txt as a test case. Here's the setup code for the test:

    boolean[][] network = TableUtil.loadBooleanTable("tests/sample-neighborhood-3.txt");
    double[] vals = {.4};   // yield a value less than q.
    LocalRandom.initRandomFixed(vals);
    // do test on network w/ triad (1,2,3)

To test the second update rule for type 1 triads, you could use the following set-up calls:

    boolean[][] network1 = TableUtil.loadBooleanTable("tests/sample-neighborhood-3.txt");
    double[] vals1 = {.7, .3};    // yield a value greater than q, and then yield T.
    LocalRandom.initRandomFixed(vals1);
    // do test on network1 w/ triad (1,2,3)

    boolean[][] network2 = TableUtil.loadBooleanTable("tests/sample-neighborhood-3.txt");
    double[] vals2 = {.7, .6};   // yield a value greater than q, and then yield F.
    LocalRandom.initRandomFixed(vals2);
    // do test on network2 w/ triad (1,2,3)

depending on which of the two friend edges you wish to flip.
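To see how a fixed sequence of values drives the decisions in set-ups like these, consider the stand-in below. It is not the LocalRandom implementation: the class and its methods are hypothetical, written only to mimic the behavior described above (values handed out in array order, and a boolean that is true when the value is strictly less than 0.5). What LocalRandom does when the supplied values run out is not specified here; this stand-in simply fails with an ArrayIndexOutOfBoundsException.

```java
final class FixedSequenceSketch {
    private final double[] values;
    private int next = 0;

    FixedSequenceSketch(double[] values) {
        this.values = values.clone();
    }

    // Hands out the supplied values in order.
    double nextDouble() {
        return values[next++];
    }

    // True if the next value is strictly less than 0.5.
    boolean nextBoolean() {
        return nextDouble() < 0.5;
    }

    public static void main(String[] args) {
        double q = 0.6;
        FixedSequenceSketch seq =
            new FixedSequenceSketch(new double[] {0.7, 0.3});
        boolean toTypeZero = seq.nextDouble() < q; // 0.7 is not less than q
        boolean coin = seq.nextBoolean();          // 0.3 is less than 0.5
        System.out.println(toTypeZero + " " + coin); // false true
    }
}
```

Here the first value selects the second update rule, and the second value then determines which friend edge is chosen.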

Submission

You will need to add your test code to your PhoenixForge repository using the command svn add <filename> from within your pa3 directory, where <filename> is replaced by the name of the file you wish to add. For example, svn add TestestimateBalanceLikelihood.java will add the file TestestimateBalanceLikelihood.java to the repository.

To submit your assignment, use the command svn status to make sure you have added all your files to your repository and then check your code into PhoenixForge using the command:

 svn ci -m"final version ready for submission"

from within your pa3 directory. We will grade the last version you check in. We recommend checking in your code early and often.

Acknowledgements

This assignment was inspired by the discussion of structural balance in the book Networks, Crowds, and Markets by David Easley and Jon Kleinberg. The evolution rules are taken from the paper "Social balance on networks: The dynamics of friendship and enmity", T. Antal, P.L. Krapivsky, and S. Redner in Physica D (2006) pg. 130-136.