Thoughts on StreamInsight

October 14, 2009

I recently had a chance to play with StreamInsight CTP 2. Overall, it’s a good offering from MSFT, but they have a way to go before they catch up with the competition. Here are some of my initial thoughts.

  1. CTIs (current time increments) are powerful but may prove to be a handful for developers to implement correctly.
    • In the financial sector, one place this matters is if you were to move end-of-day reconciliation or pricing to StreamInsight. Most of the CSVs from dealers contain out-of-order marks/prices/trades. Since we have full control over the input adapter that turns the static data into streaming data, we could simply issue a CTI once the CSV has finished uploading. Even so, moving this processing to StreamInsight is not something I’d recommend. Firstly, it means forcing static data through a real-time engine. Secondly, and more importantly, given how easily and how often EOD processing breaks on badly formatted files from the dealers, it makes sense to leave it to SSIS packages. You wouldn’t want a developer to have to crack open an IDE each time this processing breaks.
    • Since market data arrives in chronological order, the adapter developer has to issue a CTI after each tick is received so that the tick shows up in the input stream. This use case would be easier if MSFT added a setting that handles these CTIs automatically.
    • CTIs make unordered, time-based edge events pretty hard to implement. Normally, one would set up time-based patterns within a window to watch for either the edge start or the edge end condition. With StreamInsight, one would have to either move this logic into the adapter or issue the CTI immediately and then work with the event in the engine.
    • It would be useful if StreamInsight offered a way to either handle out-of-order events chronologically (i.e. manage a different timestamp) or simply drop them altogether, based on some setting.
  2. At present, there appears to be a 1:1 mapping between an adapter and a stream. Being able to hook up multiple adapters on both the input and output sides of a stream is a must-have.
    • On the input side, multiple adapters could be used to normalize data into a common schema. As my colleague Kishor pointed out, there’s a way to fold multiple streams into one with LINQ, but that would leave adapter code sitting side by side with engine code.
    • On the output side, we’d want, for example, to hook up adapters for a log writer, a db writer and a messaging bus before feeding the data into the next stream or query.
  3. Unit testing: this is often a sore point with CEP applications, and CTP 2 has no support for it at the moment. In my opinion, providing support for unit tests would be a big win for MSFT.
  4. Managing the adapter state machine is messy. Copy/pasting boilerplate-like code will create maintenance issues. I suppose we could abstract this into a base class, but given that each adapter is likely to have its own custom cleanup code, I’m not sure I’d gain much once I’ve finished adding the equivalent events/delegates to pass control to the subclasses.
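
To make the CTI points in item 1 a bit more concrete, here is a small toy model in Python. It is not the StreamInsight API: the class ToyCtiStream, the "drop"/"adjust" policy names and the auto_cti flag are all made up for illustration, and timestamps are plain integers. It assumes a simplified rule where a CTI at time T promises that no later event carries a timestamp before T, and where events older than T are released in order once that CTI arrives.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Event = Tuple[int, str]  # (timestamp, payload); integer timestamps for simplicity


@dataclass
class ToyCtiStream:
    """Toy model of a CTI-gated input stream (hypothetical, not StreamInsight)."""
    policy: str = "drop"     # hypothetical setting: "drop" or "adjust" late events
    auto_cti: bool = False   # hypothetical setting: issue a CTI after every event
    last_cti: int = 0        # highest CTI timestamp seen so far
    pending: List[Event] = field(default_factory=list)
    released: List[Event] = field(default_factory=list)

    def enqueue(self, timestamp: int, payload: str) -> bool:
        """Accept an event; return False if it was dropped as a CTI violation."""
        if timestamp < self.last_cti:
            if self.policy == "drop":
                return False
            timestamp = self.last_cti  # "adjust": move the event up to the last CTI
        self.pending.append((timestamp, payload))
        if self.auto_cti:
            self.cti(timestamp + 1)    # commit the event that was just accepted
        return True

    def cti(self, timestamp: int) -> None:
        """Promise that no later event will carry a timestamp before this one."""
        self.last_cti = max(self.last_cti, timestamp)
        ready = [e for e in self.pending if e[0] < self.last_cti]
        self.pending = [e for e in self.pending if e[0] >= self.last_cti]
        # Release committed events in timestamp order (stable for ties).
        self.released.extend(sorted(ready, key=lambda e: e[0]))


# End-of-day CSV case: rows arrive out of order, one CTI follows the upload.
eod = ToyCtiStream()
for ts, mark in [(3, "c"), (1, "a"), (2, "b")]:
    eod.enqueue(ts, mark)
eod.cti(4)
print(eod.released)            # [(1, 'a'), (2, 'b'), (3, 'c')]

# Market data case: a CTI after every tick, so a late tick is a violation.
ticks = ToyCtiStream(auto_cti=True)
ticks.enqueue(1, "x")
ticks.enqueue(2, "y")
print(ticks.enqueue(1, "z"))   # False: the late tick is dropped
```

The two usage snippets mirror the cases above: a batch upload tolerates disorder because the single CTI comes last, while a per-tick CTI leaves no room for a late event, which is where a drop or adjust setting would earn its keep.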

More on this later.

Categories: .net, CEP