Lab 7 Details for MPCS 51050

Each lab will consist of a small problem and details of how to proceed. You need to submit labs to the TAs for grading; see the submission instructions below.  Generally, unless otherwise specified, you will have one week to complete each assigned lab.

See the syllabus for information on grading.  Turning in lab assignments is required.  Submit your assignments as a tarball to the subversion repository according to the directions on the syllabus page.

You must write these solutions in Java leveraging both Camel and ActiveMQ.

Lab 7   Due: 5:00 pm, Monday, May 25, 2020

Problem (Producing Messages Derived from RSS Feeds to a Queue leveraging XPath and Regular Expression Parsing):

In this lab, you will use Camel's DSL to create a Producer program that contains a Content Based Router [cf. .choice()]. The Producer consumes data from an RSS feed, extracts certain information from each feed item, and posts summary messages to a Queue.

What you need to implement:

You are to read Appendix A (in Camel in Action) on the Simple expression language [CIA pp. 461ff.], the online descriptions of the RSS component (http://camel.apache.org/rss.html) and Camel XPath (http://camel.apache.org/xpath.html), and the material on Camel regular expression parsing and choice [CIA 2.5, "Routing and EIPs" in particular]. Use these to create the following Producer with a Content Based Router.  To take a working project you already have in Eclipse and transform it for use with RSS, simply add the following dependency directly to your project's pom.xml (which is still used by your Eclipse project):
<dependency>
	<groupId>org.apache.camel</groupId>
	<artifactId>camel-rss</artifactId>
	<version>${camel-version}</version>
</dependency>
The producer will access and receive feeds from Google News (URL:  String googleNewsURL = "https://news.google.com/?output=rss";).  You will use the RSS Component to obtain this feed, for example: 
from("rss:" + googleNewsURL)....
Once you are receiving the feed, you should use xpath() to set the body of the message to the actual text embedded inside the full XML.  For example, if you receive the following XML feed:

<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>Top Stories - Google News</title>
    <link>http://news.google.com/news?pz=1&amp;ned=us&amp;hl=en</link>
    <description>Google News</description>
    <language>en-US</language>
    <copyright>&amp;copy;2015 Google</copyright>
    <pubDate>Fri, 22 May 2015 21:35:56 GMT</pubDate>
    <dc:date>2015-05-22T21:35:56Z</dc:date>
    <dc:language>en-US</dc:language>
    <dc:rights>&amp;copy;2015 Google</dc:rights>
    <image>
      <title>Top Stories - Google News</title>
      <url>https://ssl.gstatic.com/news-static/img/logo/en_us/news.gif</url>
      <link>http://news.google.com/news?pz=1&amp;ned=us&amp;hl=en</link>
    </image>
    <item>
      <title>Irish voters decide on whether to allow gay marriage - Los Angeles Times</title>
      <link>http://news.google.com/news/url?sa=t&amp;fd=R&amp;ct2=us&amp;usg=AFQjCNGJFgQ9UcL2Wi4OHAjM1SdEnuhXgw&amp;clid=etc....
...
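
Before wiring an XPath expression into a route, it can help to check it against a feed shaped like the one above. The following is a minimal sketch using only the JDK's javax.xml.xpath package, not Camel's xpath(). The class name FeedTitleXPath, the SAMPLE constant (a cut-down, closed-off copy of the feed above) and the exact expression are assumptions made for illustration; the live feed may be structured differently.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper for trying out an XPath expression outside of Camel.
final class FeedTitleXPath {
    // Assumed expression: the text of the first <item> title,
    // as opposed to the channel's own <title>.
    static final String ITEM_TITLE = "/rss/channel/item[1]/title/text()";

    // Cut-down copy of the sample feed, closed off so that it is well-formed.
    static final String SAMPLE =
        "<rss xmlns:dc=\"http://purl.org/dc/elements/1.1/\" version=\"2.0\">"
      + "<channel>"
      + "<title>Top Stories - Google News</title>"
      + "<item>"
      + "<title>Irish voters decide on whether to allow gay marriage - Los Angeles Times</title>"
      + "</item>"
      + "</channel>"
      + "</rss>";

    // Evaluates ITEM_TITLE against the given XML text and returns the result as a String.
    static String firstItemTitle(String rssXml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(ITEM_TITLE, new InputSource(new StringReader(rssXml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate " + ITEM_TITLE, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstItemTitle(SAMPLE));
    }
}
```

An expression that works here is a reasonable candidate to pass to Camel's xpath(), but the route itself still needs to be tested against the real feed.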

You will focus on extracting the rss-channel-item-title-text, which in the above case would be the string:  "Irish voters decide on whether to allow gay marriage - Los Angeles Times".  You are then to send that in a new message to a new QUEUE called "jms:queue:RSS_GOOGLE_NEWS_UPDATES" as two sub-messages, one part being the title of the feed:  "Irish voters decide on whether to allow gay marriage" and the second part being the source:  "Los Angeles Times".
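
One way to separate the title from the source is a regular expression that splits at the last " - " in the string. The sketch below uses plain java.util.regex and is only an illustration: the class name TitleSplitter, the pattern, and the fallback of an empty source when no separator is found are all assumptions, and real feed titles may not always follow the "title - source" shape.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that splits "headline - source" into its two parts.
final class TitleSplitter {
    // The first group is greedy, so the split happens at the LAST " - ".
    // This keeps headlines that themselves contain " - " in one piece.
    private static final Pattern TITLE_AND_SOURCE = Pattern.compile("^(.*) - (.+)$");

    // Returns {headline, source}; source is "" when no " - " separator is present.
    static String[] split(String fullTitle) {
        String trimmed = fullTitle.trim();
        Matcher m = TITLE_AND_SOURCE.matcher(trimmed);
        if (!m.matches()) {
            return new String[] { trimmed, "" };
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = split(
            "Irish voters decide on whether to allow gay marriage - Los Angeles Times");
        System.out.println("title:  " + parts[0]);
        System.out.println("source: " + parts[1]);
    }
}
```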

You are also to log each message using .log() as well as routing output to a file (as you did in the first Camel lab). The file output is to be sent into your local project's data directory. (For more on the log() instruction, see http://camel.apache.org/log.html.) Discuss in your README what you saw during your collection of data and ensure that the data directory you specified with the file destination is included in your submission and contains the data in your discussion.

One final requirement:  you are to use Camel regular expressions to "filter" out incoming feeds to only those topics that you are interested in [CIA 2.5]. (You may decide for yourself which "topics" you are interested in, such as "USA" or "gay marriage" or "Obama", etc.) 
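
The filtering idea can be tried out on plain strings first. The sketch below uses java.util.regex rather than Camel's own regular expression support, and the class name TopicFilter and the chosen topics are assumptions for illustration. It uses find(), which looks for the pattern anywhere in the title.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper that keeps only titles mentioning a topic of interest.
final class TopicFilter {
    // Assumed topics; \b keeps "USA" from matching inside a longer word.
    static final Pattern TOPICS =
        Pattern.compile("\\b(USA|Obama)\\b", Pattern.CASE_INSENSITIVE);

    // True when the title mentions at least one topic anywhere in the text.
    static boolean isInteresting(String title) {
        return TOPICS.matcher(title).find();
    }

    // Returns the interesting titles in their original order.
    static List<String> keepInteresting(List<String> titles) {
        List<String> kept = new ArrayList<>();
        for (String title : titles) {
            if (isInteresting(title)) {
                kept.add(title);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
            "Obama signs bill - Example Daily",
            "Local bakery wins award - Example Weekly");
        System.out.println(keepInteresting(titles));
    }
}
```

When moving the pattern into a Camel route, check in the Camel documentation whether the regular expression is matched against the whole body or searched within it; if it is a whole-body match, the pattern would likely need a leading and trailing `.*`.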

You may decide how to format your final message on the queue.  The only requirement is that the title of the message (from the <title> in the feed) and the source of the news ("Los Angeles Times" in the above example) both be present in the out message.  It could be something as simple as:  <title="Irish voters decide on whether to allow gay marriage" source="Los Angeles Times"/>.  But this output format is entirely up to you.
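
If you choose an XML-like format, note that titles can contain quotes or ampersands, which need escaping inside attribute values. The following is one possible sketch; the class name SummaryFormatter and the element name "news" are hypothetical choices, not a required format.

```java
// Hypothetical formatter for the summary message placed on the queue.
final class SummaryFormatter {
    // Escapes the characters that are unsafe inside a double-quoted XML attribute.
    // "&" is replaced first so that the other entities are not escaped twice.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Builds a single self-closing element carrying both the title and the source.
    static String format(String title, String source) {
        return "<news title=\"" + escape(title) + "\" source=\"" + escape(source) + "\"/>";
    }

    public static void main(String[] args) {
        System.out.println(format(
            "Irish voters decide on whether to allow gay marriage",
            "Los Angeles Times"));
    }
}
```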

Submitting:
See above for specific items that need to be included in your submission, including specific items in the README and the project directory you submit. Submit your assignments to the subversion repository in the pre-existing folder named "labN" (where N is the homework number).