Lab 7
Details for MPCS 51050
Each lab will consist of a small problem and details of how to proceed. You need to submit labs to the TAs for grading; see the submission instructions below.
Unless otherwise specified, you will have one week to complete each assigned lab.
See the syllabus for information on grading. Turning in lab assignments is required. Submit your assignments as a tarball to the Subversion repository according to the directions on the syllabus page.
You must write these solutions in Java, leveraging both Camel and ActiveMQ.
Lab 7
Due: 5:00 pm, Friday, May 29, 2015
Problem
(Producing Messages Derived from RSS Feeds to a Queue, Leveraging XPath and Regular Expression Parsing):
In this lab, you will use Camel's DSL to create a Producer program that contains a Content Based Router [cf. .choice()]. The program consumes data from an RSS feed, extracts certain information from each feed, and posts summary messages to a Queue.
What you need to implement:
Read Appendix A of Camel in Action on the Simple expression language ([CIA pp. 461ff.]), the online descriptions of the RSS component (http://camel.apache.org/rss.html) and Camel XPath (http://camel.apache.org/xpath.html), and the material on Camel regular expression parsing and choice ([CIA 2.5, "Routing and EIPs" in particular]). Then create the following Producer with a Content Based Router.
To adapt a working project you already have in Eclipse for use with RSS, add the following dependency directly to your project's pom.xml (which is still used by your Eclipse project):
<dependency>
<groupId>org.apache.camel</groupId>
<artifactId>camel-rss</artifactId>
<version>${camel-version}</version>
</dependency>
The producer will access and receive feeds from Google News (URL: String googleNewsURL = "https://news.google.com/?output=rss";). You will use the RSS Component to obtain this feed, for example: from("rss:" + googleNewsURL)....
Once you are receiving the feed, use xpath() to set the body of the message to the actual text embedded inside the full XML. For example, suppose you receive the following XML feed:
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
<title>Top Stories - Google News</title>
<link>http://news.google.com/news?pz=1&ned=us&hl=en</link>
<description>Google News</description>
<language>en-US</language>
<copyright>&copy;2015 Google</copyright>
<pubDate>Fri, 22 May 2015 21:35:56 GMT</pubDate>
<dc:date>2015-05-22T21:35:56Z</dc:date>
<dc:language>en-US</dc:language>
<dc:rights>&copy;2015 Google</dc:rights>
<image>
<title>Top Stories - Google News</title>
<url>https://ssl.gstatic.com/news-static/img/logo/en_us/news.gif</url>
<link>http://news.google.com/news?pz=1&ned=us&hl=en</link>
</image>
<item>
<title>Irish voters decide on whether to allow gay marriage - Los Angeles Times</title>
<link>http://news.google.com/news/url?sa=t&fd=R&ct2=us&usg=AFQjCNGJFgQ9UcL2Wi4OHAjM1SdEnuhXgw&clid=etc....
...
You will focus on extracting the text of the title of each item in the channel (rss/channel/item/title), which in the above case would be the string "Irish voters decide on whether to allow gay marriage - Los Angeles Times". You are then to send that in a new message to a new queue called "jms:queue:RSS_GOOGLE_NEWS_UPDATES" as two sub-messages: one part being the title of the feed item, "Irish voters decide on whether to allow gay marriage", and the second part being the source, "Los Angeles Times".
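The title extraction described above can be tried outside Camel first. The following is a minimal sketch using only the JDK's javax.xml.xpath API, not a Camel route; the class name ItemTitleXPathSketch and the shortened sample feed are made up for illustration. It assumes the feed has the shape shown above (no default namespace on <rss>), so the plain path /rss/channel/item/title selects each item title and skips the channel-level <title>.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
final class ItemTitleXPathSketch {

    // Selects every <title> element that sits directly under an <item>.
    static final String ITEM_TITLES = "/rss/channel/item/title";

    // Returns the text of each item title found in the given RSS XML.
    static List<String> itemTitles(String rssXml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                    ITEM_TITLES,
                    new InputSource(new StringReader(rssXml)),
                    XPathConstants.NODESET);
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                titles.add(nodes.item(i).getTextContent().trim());
            }
            return titles;
        } catch (XPathExpressionException e) {
            // Covers both a bad expression and XML that cannot be parsed.
            throw new IllegalArgumentException("Could not evaluate XPath on feed", e);
        }
    }

    public static void main(String[] args) {
        // Shortened, well-formed stand-in for the feed shown above (assumption).
        String sample =
                "<rss version=\"2.0\"><channel>"
                + "<title>Top Stories - Google News</title>"
                + "<item><title>Irish voters decide on whether to allow gay marriage"
                + " - Los Angeles Times</title></item>"
                + "</channel></rss>";
        for (String title : itemTitles(sample)) {
            System.out.println(title);
        }
    }
}
```

In your route, the same path expression would be handed to Camel's xpath(). Note that the RSS component's documentation shows the feed being converted to XML with marshal().rss() before XPath is applied; check that page for the behavior in your Camel version.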
You are also to log each message using .log() and route the output to a file (as you did in the first Camel lab).
The file output is to be written to your local project's data directory.
(For more on the log() instruction, see http://camel.apache.org/log.html.)
Discuss in your README what you saw during your collection of
data and ensure that the data directory you specified with the file destination
is included in your submission and contains the data in your discussion.
One final requirement: you are to use Camel regular expressions to filter the incoming feed items down to only those topics that you are interested in [CIA 2.5].
(You may decide for yourself which "topics" you are interested in, such as "USA" or "gay marriage" or "Obama", etc.)
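Before wiring a pattern into your route, it may help to check it against a few titles with plain java.util.regex. The sketch below is one possible topic pattern, not a required one; the class name TopicFilterSketch and the choice of topics are assumptions taken from the examples above. The pattern is written to match the whole title (leading and trailing .*) because, as far as I know, Camel's regex predicate tests the entire value rather than searching within it; verify this against your Camel version.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, for illustration only.
final class TopicFilterSketch {

    // Whole-string pattern: any title containing one of the topics as a whole word or phrase.
    static final String TOPIC_REGEX = ".*\\b(USA|Obama|gay marriage)\\b.*";

    private static final Pattern TOPICS =
            Pattern.compile(TOPIC_REGEX, Pattern.CASE_INSENSITIVE);

    // True when the title mentions at least one topic of interest.
    static boolean isInteresting(String title) {
        return TOPICS.matcher(title).matches();
    }

    // Keeps only the titles that mention a topic of interest, preserving order.
    static List<String> keepInteresting(List<String> titles) {
        return titles.stream()
                .filter(TopicFilterSketch::isInteresting)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
                "Irish voters decide on whether to allow gay marriage - Los Angeles Times",
                "Stocks close higher - Reuters");
        for (String kept : keepInteresting(titles)) {
            System.out.println(kept);
        }
    }
}
```

The word boundaries (\b) keep "USA" from matching inside a longer word such as "Busan". Whatever pattern you settle on, mention it in your README so the TAs can relate it to the data you collected.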
You may decide how to format your final message on the queue. The only requirement is that both the title of the message (from the <title> in the feed) and the source of the news ("Los Angeles Times" in the above example) be present in the out message.
It could be something as simple as: <title="Irish voters decide on whether to allow gay marriage" source="Los Angeles Times"/>. But this output format is entirely up to you.
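One way to get from the raw item title to a message like the one above is a regular expression with two capture groups. This is a sketch under assumptions, not the required approach: the class name TitleSourceSketch is made up, the output copies the sample format shown above, and the pattern assumes the source follows the last " - " in the title (so a source name that itself contains " - " would be split incorrectly).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class TitleSourceSketch {

    // The first group is greedy, so the split happens at the LAST " - " in the title.
    private static final Pattern TITLE_AND_SOURCE = Pattern.compile("^(.*)\\s-\\s(.+)$");

    // Returns {headline, source}; source is "" when no " - " separator is found.
    static String[] split(String itemTitle) {
        String trimmed = itemTitle.trim();
        Matcher m = TITLE_AND_SOURCE.matcher(trimmed);
        if (!m.matches()) {
            return new String[] { trimmed, "" };
        }
        return new String[] { m.group(1).trim(), m.group(2).trim() };
    }

    // Builds an out message in the sample format from the lab text.
    static String format(String itemTitle) {
        String[] parts = split(itemTitle);
        return String.format("<title=\"%s\" source=\"%s\"/>", parts[0], parts[1]);
    }

    public static void main(String[] args) {
        System.out.println(format(
                "Irish voters decide on whether to allow gay marriage - Los Angeles Times"));
    }
}
```

Inside a route, logic like this could live in a Processor or a bean that rewrites the message body before it is sent to the queue; how you structure that is up to you.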
Submitting:
See above for specific items that need to be included in your submission, including specific items in the README and the project directory you submit. Submit your assignments to the Subversion repository in the pre-existing folder named "labN" (where N is the homework number).