DEFAULT PROJECT CONCEPT
Note: Within this description, patterns that are relevant to the
description will be in [brackets] and CAPITALIZED, e.g.: [MESSAGE
CHANNEL]. When multiple possible pattern suggestions exist, feel
free to choose the best one for your design or substitute another you
feel fits better.
If you're
having trouble coming up with your own project idea, you may use and
implement the following project idea as your own.
Note that this is a proposal you can choose to use. It's not a
"specification" for a lab, but a "suggestion" on how you might go about
designing a solution for the project. You may (and certainly will)
change this as you go through the coding of your project, substituting
other patterns for the ones suggested here. You may find that
Hohpe & Woolf, Chapter 13, "Integration Patterns in Practice" will
be helpful in this context and will offer further examples of pattern
application.
GENERAL PROBLEM:
Imagine you are writing a statistics generator for a set of trading
engines, each of which may be running on computers in New York, London,
and Tokyo.
Each trading engine, which will be a fairly simple consumer [SELECTIVE
CONSUMER, POLLING CONSUMER, etc.], will
periodically look at different message channels [POINT-TO-POINT CHANNEL
or PUBLISH-SUBSCRIBE CHANNEL] for the latest
statistics on stocks in a portfolio [COMPOSITE] (you can keep it basic
with stocks or you could
choose to incorporate options, futures, etc. if you want). Your
goal in this default course project is to load the channels (queues or
topics...your decision...see implications below) with various
statistics for bids and asks (min, max, average, standard deviation,
variance, regression, moving average, etc.). Your program would read
in tic data, run the statistics [STRATEGY, TEMPLATE METHOD], and load
the stat channels; your trading engines, each of which manages a
portfolio of stocks, would then update the stocks in their portfolios
with the appropriate stats. You should have a couple of different
trading engines, each of which is interested in different stocks and
different statistics from the channels. Each trading engine should
have a single instance of a "Reporting Engine" [SINGLETON] that will
periodically print out stat data for its portfolio [COMPOSITE,
ITERATOR].
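For example, a minimal sketch of such a Reporting Engine as a classic
[SINGLETON] might look like the following (the class name and the
simplified Map-based portfolio are illustrative assumptions, not
requirements):

import java.util.Map;

// A minimal SINGLETON sketch for the per-engine Reporting Engine.
// The class name and the Map-based portfolio are assumptions; a real
// portfolio would likely be a COMPOSITE of stocks and their stats.
public class ReportingEngine {
    private static final ReportingEngine INSTANCE = new ReportingEngine();

    private ReportingEngine() { }              // no outside construction

    public static ReportingEngine getInstance() {
        return INSTANCE;
    }

    // Walk the portfolio [COMPOSITE, ITERATOR] and print each stock's stats.
    public void report(Map<String, String> portfolioStats) {
        for (Map.Entry<String, String> e : portfolioStats.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}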
DESIGN IMPLICATIONS:
You can design this
so that you have one channel per stock, with multiple statistics inside
each stock's channel. Or, you could choose to have multiple
statistics channels, one for min, one for max, one for average, etc.,
and put the various stock values on the channel, so the "STD_DEV"
channel would hold messages [MESSAGE] that contain stdev values for
MSFT, IBM, etc. You might also decide to design this so that each
traded stock has its own dedicated set of statistics channels [DATATYPE
CHANNEL].
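To make those layout choices concrete, here is a hedged sketch of the
three options as Camel routes (all of the JMS topic names are
assumptions about your own naming scheme):

import org.apache.camel.builder.RouteBuilder;

// Sketch of the three channel layouts; the topic names are assumptions.
public class ChannelLayoutRoutes extends RouteBuilder {
    @Override
    public void configure() {
        // (a) one channel per stock, carrying all of that stock's statistics
        from("jms:topic:stats.MSFT").log("MSFT stat: ${body}");

        // (b) one channel per statistic, messages tagged with the symbol
        from("jms:topic:STD_DEV").log("stdev for ${header.SYMBOL}: ${body}");

        // (c) a DATATYPE CHANNEL per stock/statistic combination
        from("jms:topic:MSFT_STDEV").log("MSFT stdev: ${body}");
    }
}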
You would then have a few very simple "trading
engines" that would subscribe to the various channels and
consume the statistics off the channels (one trading engine might only
be interested in MSFT and ORCL data, another trading engine might be
interested
in only average and standard deviation for IBM, ORCL, and MSFT,
etc.). (Certainly, you can challenge yourself further and make
this more interesting by including concepts such as depth of book and
"Reversion to Mean" strategies, if these are meaningful and you are so
inclined...such enhancements would be totally up to you.)
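For instance, a [SELECTIVE CONSUMER] can often be expressed as a JMS
message selector right on the Camel endpoint; in the sketch below, the
statistics topic and the SYMBOL/STAT_TYPE headers are assumptions about
how your producer tags its messages:

import org.apache.camel.builder.RouteBuilder;

// SELECTIVE CONSUMER sketch: this engine only wants MSFT and ORCL data.
// Assumes the statistics producer sets SYMBOL and STAT_TYPE properties.
public class MsftOrclEngineRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("jms:topic:statistics?selector=SYMBOL IN ('MSFT','ORCL')")
            .log("got ${header.STAT_TYPE} for ${header.SYMBOL}: ${body}");
    }
}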
You will need a "data feed" that delivers tic information
periodically. You can use this file, which includes 333 discrete tic
information files for MSFT, IBM, and ORCL (or you can create your own
data feed if you're interested in expanding to options and futures).
and futures). You could define a Camel Endpoint [ENDPOINT] that
consumes those data files and puts the tic information on a channel.
You may need to use a translator [MESSAGE TRANSLATOR] to get the information into a format that you prefer.
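A minimal sketch of that feed route, assuming a local directory of tic
files and a jms:ticData channel (both names are assumptions), might be:

import org.apache.camel.Exchange;
import org.apache.camel.Processor;
import org.apache.camel.builder.RouteBuilder;

// Sketch: consume tic files [ENDPOINT], translate each body into your
// preferred format [MESSAGE TRANSLATOR], and publish to a tic channel.
public class TicFeedRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("file:data/tics?noop=true")
            .process(new Processor() {
                public void process(Exchange exchange) throws Exception {
                    String raw = exchange.getIn().getBody(String.class);
                    // replace this with whatever parsing/reshaping your
                    // statistics engine actually needs
                    exchange.getIn().setBody(raw.trim());
                }
            })
            .to("jms:ticData");
    }
}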
You might
decide to make your trading engines a little more "realistic" by having
them actually "trade" (buy, sell, or hold) given the incoming
statistical data (leveraging a moving average or something even more
simple). If you choose to implement this, your engine will definitely
have to "remember" what its position is in a given stock, whether it's
long or short that stock, etc. [STATE].
You should be concerned about "invalid" input data from the data
feed. You might want to incorporate an error channel [INVALID
MESSAGE CHANNEL] should invalid data input be discovered on publication.
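One hedged way to wire that up is a Camel content-based check that
shunts bad tics onto a dedicated channel; the PRICE header, the sanity
check, and the channel names below are all assumptions:

import org.apache.camel.builder.RouteBuilder;

// INVALID MESSAGE CHANNEL sketch: tics failing a basic sanity check go
// to jms:invalidTics. The PRICE header and channel names are assumed.
public class TicValidationRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("jms:ticData")
            .choice()
                .when(header("PRICE").isGreaterThan(0))
                    .to("jms:validTics")
                .otherwise()
                    .to("jms:invalidTics");   // the error channel
    }
}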
SUMMARY OF POSSIBLE DELIVERABLES:
You could arrange this in a variety of ways. One way would be to
have a single "Data Producer" that reads in tic data and publishes it
to a data channel. Then, you might have a single or dedicated
statistics engine that reads the tic data and publishes statistics out
to different channels (again organized either by stocks, statistics, or
some combination of the two). Then, you might have multiple
trading engines, each dedicated to a particular type of pricing
strategy, that consume the statistics they are interested in from the
various channels and print out their stats (or, if you choose to
enhance the trading engines a little, their positions).
Questions to ask yourself as you choose among the pattern options:
1. Queues or Topics? I.e., Point-to-Point or Publish-Subscribe? The key question here is: will you have multiple trading engines needing to access the same statistics?
2. How do you want to design the channels? Do you want to
have stock channels that hold multiple statistics for their stocks
(only)? Do you want to have different statistics channels ("AVG",
"STDEV", etc.) that hold specific statistics information for multiple
stocks? Do you want to have specific channels for a given stock
and statistic combination ("MSFT_STDEV")? One consideration in your
decision is which design would be most flexible when another trading
engine and set of stocks is added.
3. As for running the statistics, do any of them share parts of a
common algorithm? For example, if you run certain statistics such
as variance, standard deviation, regression, covariance, etc., might
those strategies leverage some of the same steps? And if so, is
there an opportunity for a TEMPLATE METHOD? (See the sketch following
this list.)
4. Do you need to enhance [CONTENT ENRICHER/Camel .enrich()] the
tic data as it goes onto the data feed channel with additional
information such as a timestamp, etc.?
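On question 3, variance and standard deviation do share the "sum of
squared deviations from the mean" steps, which is exactly the shape
TEMPLATE METHOD captures. A hedged sketch (the class and method names
are assumptions):

// TEMPLATE METHOD sketch for question 3: the shared skeleton lives in
// the abstract class; subclasses supply only the final varying step.
public abstract class DeviationStatistic {
    // the template method: a fixed algorithm skeleton
    public final double compute(double[] prices) {
        double mean = mean(prices);
        double sumSq = 0.0;
        for (double p : prices) {
            sumSq += (p - mean) * (p - mean);
        }
        return finish(sumSq / prices.length);   // the step that varies
    }

    private double mean(double[] prices) {
        double sum = 0.0;
        for (double p : prices) sum += p;
        return sum / prices.length;
    }

    protected abstract double finish(double variance);
}

class Variance extends DeviationStatistic {
    protected double finish(double variance) { return variance; }
}

class StdDev extends DeviationStatistic {
    protected double finish(double variance) { return Math.sqrt(variance); }
}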
Overall Suggestions:
Remember that the following few very simple lines of Camel code actually embed several different enterprise integration patterns:
ConnectionFactory connectionFactory =
        new ActiveMQConnectionFactory("tcp://localhost:61616");
context.addComponent("jms",
        JmsComponent.jmsComponentAutoAcknowledge(connectionFactory));
. . .
from("ftp://localhost/orders?username=secret&password=secret")
    .process(new Processor() {
        public void process(Exchange exchange) throws Exception {
            System.out.println("We just downloaded: " +
                    exchange.getIn().getHeader("CamelFileName"));
        }
    })
    .to("jms:incomingOrders");
This snippet already leverages five Enterprise Integration Patterns:
Message Broker (ActiveMQ), Endpoint (from/to), Message Translator (the
Processor), Message Channel (jms:incomingOrders), and Message (which
encapsulates the data being passed between endpoints, from FTP file to
JMS queue).
Project Expansions and Extrapolations:
You could challenge yourself further by extending this idea to
generate Monte Carlo simulations, etc.
You could also challenge yourself by enhancing the core concept of
stocks with options (derivatives), in which your stats leverage
Black-Scholes and become the Greeks: beta, delta, gamma, theta, vega,
kappa, rho, etc.
You
might decide to transform [MESSAGE TRANSLATOR] your messages into
a common model on the way to a channel so that the impact of different
desired formats among different trading engines is minimized [CANONICAL
DATA MODEL].
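As a sketch of that idea, you might translate every raw feed line into
one canonical tic class before it reaches the shared channel; the line
format, class names, and channel names below are assumptions:

import org.apache.camel.builder.RouteBuilder;

// MESSAGE TRANSLATOR into a CANONICAL DATA MODEL; names are assumptions.
public class CanonicalModelRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("jms:rawTics")
            .bean(TicTranslator.class, "toCanonical")
            .to("jms:canonicalTics");
    }
}

class TicTranslator {
    // Parses an assumed "SYMBOL,PRICE,SIZE" line into the canonical model.
    public CanonicalTic toCanonical(String rawLine) {
        String[] parts = rawLine.split(",");
        return new CanonicalTic(parts[0],
                                Double.parseDouble(parts[1]),
                                Integer.parseInt(parts[2]));
    }
}

// Serializable so the POJO can travel as a JMS ObjectMessage.
class CanonicalTic implements java.io.Serializable {
    final String symbol;
    final double price;
    final int size;

    CanonicalTic(String symbol, double price, int size) {
        this.symbol = symbol;
        this.price = price;
        this.size = size;
    }
}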
You might decide to enhance your statistics with qualitative
information (in addition to quantitative data for statistics) and keep
track of public "mentions" of a particular stock in the news.
Certainly, natural language processing is beyond our simple
example. But you could think of a stub [PROXY] that reads several
financial RSS feeds, counts the times "MSFT", "ORCL", etc. appear
in the news, and publishes those counts to a channel (perhaps
transformed into categorical codes). Your trading engines could then
keep that information in mind as well while they operate. Would you
want a proxy for each stock you're trading, or might trade? You
would have to be careful here, because you can't keep track of all
stocks, as big data tends to be BIG. The idea of course is that
at some point "in the future" you would replace the stubs with actual
strategies that do qualitative statistics on big data inputs.
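A stub along those lines could be as small as the following (the
interface and the canned counts are placeholder assumptions for the
eventual real qualitative strategy):

import java.util.HashMap;
import java.util.Map;

// PROXY stub sketch: stands in for a future RSS/NLP mention counter.
// The interface and the canned counts are placeholder assumptions.
interface MentionCounter {
    int mentionsToday(String symbol);
}

class StubMentionCounter implements MentionCounter {
    private final Map<String, Integer> canned = new HashMap<String, Integer>();

    StubMentionCounter() {
        canned.put("MSFT", 7);   // canned data standing in for real RSS scans
        canned.put("ORCL", 3);
        canned.put("IBM", 5);
    }

    public int mentionsToday(String symbol) {
        Integer n = canned.get(symbol);
        return n == null ? 0 : n;
    }
}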