MPCS_51050-Lab

Lab 6 Details for MPCS 51050

Each lab will consist of a small problem and details of how to proceed. You need to submit labs to the TAs for grading--see submission instructions below. Generally, unless otherwise specified, you will have one week to complete each assigned lab.

See the syllabus for information on grading. Turning in lab assignments is required. Submit your assignments to the subversion repository according to the directions on the syllabus page.

You must write these solutions in Java leveraging both Camel and ActiveMQ.

Lab 6 Due: 5:00 pm, Friday, May 22, 2015

Problem (Point-To-Point Exercise: Reading data from files, consuming those files, putting them on a queue, and then consuming them from the queue):

In this lab, you will use Camel's DSL to create a Producer program that consumes data from files in an input directory and for each file writes that data onto a Point-To-Point Message Queue. You will also write a Consumer Program that reads those messages from that Point-To-Point Queue and writes those messages out to an output directory.

What you need to implement:

The data files are CSV files that each contain trade data: a Ticker, a Buy Price, a Buy Quantity, a Sell Price, and a Sell Quantity. Each line in the CSV files looks something like this:

MSFT,22.81,118,22.82,68

So in this example, the ticker would be "MSFT", the buy price would be 22.81, the Buy Quantity would be 118 (shares), the Sell Price would be 22.82, and the Sell Quantity would be 68 (shares).
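To make the field layout concrete, here is a minimal plain-Java sketch that parses one such line. The class name Trade and its field names are illustrative assumptions and are not part of the lab; your Camel route does not need this class.

```java
import java.math.BigDecimal;

// Hypothetical value class for one line of trade data. The class and field
// names are assumptions made for illustration only.
final class Trade {
    final String ticker;
    final BigDecimal buyPrice;
    final int buyQuantity;
    final BigDecimal sellPrice;
    final int sellQuantity;

    Trade(String ticker, BigDecimal buyPrice, int buyQuantity,
          BigDecimal sellPrice, int sellQuantity) {
        this.ticker = ticker;
        this.buyPrice = buyPrice;
        this.buyQuantity = buyQuantity;
        this.sellPrice = sellPrice;
        this.sellQuantity = sellQuantity;
    }

    // Parses a line in the order: ticker, buy price, buy quantity,
    // sell price, sell quantity (for example "MSFT,22.81,118,22.82,68").
    static Trade parse(String line) {
        String[] f = line.trim().split(",");
        if (f.length != 5) {
            throw new IllegalArgumentException(
                "Expected 5 fields but found " + f.length + ": " + line);
        }
        return new Trade(
            f[0].trim(),
            new BigDecimal(f[1].trim()),   // prices kept exact as decimals
            Integer.parseInt(f[2].trim()),
            new BigDecimal(f[3].trim()),
            Integer.parseInt(f[4].trim()));
    }
}
```

Prices are held as BigDecimal rather than double so that a value such as 22.81 is stored exactly as written in the file.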

You can create 100 CSV files of this type of data by running the broadcaster.sh script on the first.100.csv data input file. Something like this (you can put first.100.csv wherever you want, just remember where you put it):

broadcaster.sh /home/mark/mpcs.51050/first.100.csv

Our suggestion would be to put the broadcaster.sh shell script in a directory that is part of your $PATH, so you can execute it from any directory. One suggestion would be your local "bin" directory: ~/bin.

We are assuming that you have thoroughly read chapters 1-3 of Ibsen & Anstey, Camel in Action, prior to working on this lab. You will also find Appendix A on the Simple expression language helpful for this lab.

Copy your previous lab 5 project in Eclipse to a new project named "MPCS-Lab6-Consumer". You can do this by selecting your lab 5 project "chapter1-file-copy", pressing "Ctrl-C" (or right-clicking and choosing "Copy"), and then pasting a copy of that project by right-clicking in the Package Explorer and choosing "Paste". In the Copy Project dialog, rename the copy to "MPCS-Lab6-Consumer" as you paste.

Now do the same thing again, this time copying your MPCS-Lab6-Consumer project you just pasted and creating a new copy of that project and calling that "MPCS-Lab6-Producer". Now you have three projects in the Package Explorer: chapter1-file-copy, your new MPCS-Lab6-Consumer project, and your MPCS-Lab6-Producer project. (Of course they are all three identical copies for the moment). The advantage of copying chapter1-file-copy is that you bring with you all the maven jarballs that you will need, so you don't have to manually add them through the project properties/Java Build Path.

Now, in a terminal, navigate to your MPCS-Lab6-Producer directory under your Eclipse workspace, and change to the data/inbox subdirectory. Mine looks like this:

~/workspace/MPCS-Lab6-Producer/data/inbox

Remove the message1.xml file that's there, so the directory is empty. Now (we are assuming that broadcaster.sh is in a directory in your $PATH), execute the following line:

broadcaster.sh /home/[youruserid]/mpcs.51050/first.100.csv

(Note we are assuming you've put the first.100.csv file under the mpcs.51050 subdirectory under your home directory. You can put it anywhere you want, just make sure you remember where you put it and refer to it in the line above).

Once that command has executed, do an "ls" in your ~/workspace/MPCS-Lab6-Producer/data/inbox directory. You should see 100 files that look something like this:

$ ls
08-05-14_09-55-33.csv 08-05-14_09-56-02.csv 08-05-14_09-56-31.csv 08-05-14_09-57-01.csv 08-05-14_10-47-56.csv 08-05-14_10-54-02.csv
08-05-14_09-55-34.csv 08-05-14_09-56-03.csv 08-05-14_09-56-32.csv 08-05-14_09-57-02.csv 08-05-14_10-47-57.csv 08-05-14_10-54-03.csv
08-05-14_09-55-35.csv 08-05-14_09-56-04.csv 08-05-14_09-56-33.csv 08-05-14_09-57-03.csv 08-05-14_10-47-58.csv 08-05-14_10-54-04.csv
etc.

Your exact filenames will be slightly different as the date-time will have changed. Now, cat out one of the files, and you will see something like this:

MSFT,22.81,118,22.82,68

Each of the 100 files will contain a single line of data similar to the above. These 100 data files will constitute your Producer's "input". Your Producer will roll through these 100 files, and load the content of each file onto an ActiveMQ message queue.
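If you would rather confirm the file count from Java than from "ls", a small stand-alone check could look like the following sketch. The class name InboxCheck is a hypothetical helper and not part of the lab; it assumes the default directory "data/inbox" relative to where you run it, or a directory passed as the first argument.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not required by the lab): counts the regular files
// ending in .csv directly inside a directory.
class InboxCheck {
    static int countCsvFiles(Path dir) throws IOException {
        int count = 0;
        // The glob "*.csv" restricts the listing to CSV file names.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.csv")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass another directory as the first argument.
        Path inbox = Paths.get(args.length > 0 ? args[0] : "data/inbox");
        System.out.println(countCsvFiles(inbox) + " CSV files in " + inbox);
    }
}
```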

Now you have some reading to do. Make sure you understand what Camel is doing and basically how it works by reading the first three chapters of Ibsen & Anstey, Camel in Action, along with Appendix A. As we have moved past using Maven in favor of Eclipse, you can ignore the Maven-related instructions (but of course you're free to execute the examples all you want, and that would prove instructive). You can also ignore the Spring information. We will not be using Spring in the course, only POJOs (Plain Old Java Objects).

Your MPCS-Lab6-Producer will do the following:

You are to read through these files using Camel's DSL, establishing a Camel Route that [References to sections of Camel in Action in brackets]:

1. Consumes the input directory "data/inbox" [CIA 2.3 & 7.1-7.2]
2. Logs the string "RETRIEVED: ${file:name}" [CIA Appendix A]
3. Unmarshals the data read [CIA 3.4]
4. Runs the CSV translator on the data [CIA 3.4.2]
5. Splits the body (so that the individual lines go on the queue as individual messages) [CIA 3.4.2]
6. Sends the output to jms:queue:MPCS_51050_LAB6 [CIA 3.4.2]
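To picture what happens to the message body between the unmarshal and split steps, here is a plain-Java sketch of the data shapes involved. It does not use Camel at all; it only mimics the idea, described in CIA 3.4.2, that unmarshalling CSV yields a list of rows (each row a list of fields) and that splitting that list yields one message per row. The class and method names are assumptions for illustration, and real CSV parsing (quoting, escaping) is deliberately ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: mimics the shape of the data, not Camel's implementation.
class SplitSketch {
    // Turns file content into one list of fields per non-empty line.
    static List<List<String>> unmarshalCsv(String fileContent) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : fileContent.split("\\r?\\n")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines such as a trailing newline
            }
            rows.add(Arrays.asList(line.trim().split(",")));
        }
        return rows;
    }

    public static void main(String[] args) {
        String content = "MSFT,22.81,118,22.82,68\n";
        // "Splitting" here is just iterating: each row would become the
        // body of its own message on the queue.
        for (List<String> row : unmarshalCsv(content)) {
            System.out.println("message body: " + row);
        }
    }
}
```

Since each of your input files holds a single line, the split produces one message per file, which is why 100 files should result in 100 messages on the queue.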

Read the File Component sections on configuration options in CIA 7.2.1 closely, making sure you understand the "noop" option and the "fileName" option. You may find that setting "noop=true" will make your testing a bit easier, as Camel will not remove the input files as it processes them (which relieves you from having to recreate them with each test using broadcaster.sh).

Once you have developed your code, test it out (this will require several iterations to figure it all out) and you should see that you are publishing 100 messages to your MPCS_51050_LAB6 queue. Once you see that you are doing that, congratulate yourself, and take a break.

Next, you will begin work on your Consumer, in the project you created called MPCS-Lab6-Consumer. Your Consumer will consume the messages off your MPCS_51050_LAB6 queue. As it takes off each message, the message will be removed from the queue. Your route will be fairly simple. In your Consumer route you will need to do the following [References to sections of Camel in Action in brackets]:

1. Consume messages from the MPCS_51050_LAB6 queue [CIA 7.3]
2. Log the string "RECEIVED: jms queue: ${body} from file: ${header.CamelFileNameOnly}" [CIA Appendix A]
3. Convert the body of the message taken from the queue into a string using a String.class conversion [CIA 3.6.2]
4. Write the resulting set of messages out to "file:data/outbox" [CIA 7.2.1], building each output file name from the current thread name and the original Camel input file name with ".out" appended [CIA Table A.1], with the result that you have 100 files in the data/outbox directory that look something like this:

$ ls
Thread-Camel (camel-1) thread #0 - JmsConsumer[MPCS_51050_LAB6]-08-05-14_09-55-33.csv.out
Thread-Camel (camel-1) thread #0 - JmsConsumer[MPCS_51050_LAB6]-08-05-14_09-55-34.csv.out
Thread-Camel (camel-1) thread #0 - JmsConsumer[MPCS_51050_LAB6]-08-05-14_09-55-35.csv.out
Thread-Camel (camel-1) thread #0 - JmsConsumer[MPCS_51050_LAB6]-08-05-14_09-55-36.csv.out

You can use the fileName= option in the output line such as:

.to("file:data/outbox?noop=true&fileName=Thread-${threadName}-${header.CamelFileNameOnly}.out");
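To see how the fileName pattern above turns into the file names in the listing, the following plain-Java sketch performs the same substitution by hand. It does not use Camel; the class and method names are hypothetical, and the thread name is simply copied from the sample listing rather than read from a running route.

```java
// Illustration only: mimics the text substitution implied by
// fileName=Thread-${threadName}-${header.CamelFileNameOnly}.out
class OutboxNameSketch {
    static String outputFileName(String threadName, String fileNameOnly) {
        return "Thread-" + threadName + "-" + fileNameOnly + ".out";
    }

    public static void main(String[] args) {
        // Values taken from the sample listing in this lab.
        String threadName = "Camel (camel-1) thread #0 - JmsConsumer[MPCS_51050_LAB6]";
        String fileNameOnly = "08-05-14_09-55-33.csv";
        System.out.println(outputFileName(threadName, fileNameOnly));
    }
}
```

Because the input file name is part of the output name, each of the 100 messages lands in its own file instead of overwriting a single one.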

Notes and Hints:

In the referenced Camel in Action sections, you will find code examples. Use these as a guide as to the DSL syntax for accomplishing the requirements of this lab. You will likely find this lab impossible to accomplish without reading the assigned sections.

Submitting:
Submit your assignments to the subversion repository in the pre-existing folder named "labN" (where N is the lab number). Please include a README text file that contains any instructions for the TAs to assist with grading. Design notes are often the most useful thing you can provide. We do not usually need any info on how to compile your code unless your code layout is arcane.