comparison of message passing frameworks

Jun 1, 2008 at 3:32 PM
Hi, to the author of MPAPI, I think it's great that you developed and shared your message passing framework for dotnet.

I am a little surprised how little web traffic your project has attracted thus far.  High performance computing is on the rise, and distributed or clustering application frameworks are a key enabling technology.  Look at Google's MapReduce architecture and all the attention it has generated lately as an example.  Look at Hadoop, which is an open source implementation of MapReduce.  Look at Digipede.  Grids, fabrics, and message passing frameworks are all steps forward compared to the roll-your-own, one-off, nonreusable techniques that a person can build from the ground up using .NET Remoting and so on to meet the needs of a specific application.

I might want to try MPAPI.   I have looked at some other message passing implementations recently, with the aim of building my current C# application in a high performance distributed computing design. Maybe you can help me predict some differences between MPAPI and these other message passing frameworks.

Here are the message passing packages I've been looking at.  To be clear, I am not suggesting they are identical or even close to identical.  What I am saying is that every one of these packages or subsystems could potentially be used as the basis for a message passing type of high performance distributed application in C#.

As the author of MPAPI maybe you can help me and other potential users see more accurately where MPAPI fits in and compares versus some of these other systems, when C# high performance distributed applications are required.  The following systems all could provide some degree of messaging infrastructure for high performance applications which require distributed messaging services. 

As I said, there are other ways to achieve high performance distributed computing, such as Hadoop.  However, let's narrow our discussion to the message passing technique if that's OK, since MPAPI is a good candidate for this.

1 - Open MPI.   Comments:  Does NOT work on Windows at this time but support is being "discussed."  It seems to be an open source implementation of the MPI message passing standard, but not message queueing per se.

2 - AMQP.  Comments:  This is a specification for an open message queue protocol.  It is not exactly a "message passing" spec per se.  The AMQP spec is being promoted and developed mostly by a consortium of non-software companies, including JP Morgan (a financial firm), Cisco (a network device company), and some others.

3 - RabbitMQ.  Comments:  This is an open source package which is an implementation of AMQP.  It works on Windows and other OSes like Linux.  In my application, RabbitMQ was having timeout problems if the app waits too long between insertions of messages to the queue, which is a very undesirable behavior.  Erlang is the engine for RabbitMQ.   This is an interesting point, given that MPAPI was developed with Erlang's benefits as an inspiration.  Notice that RabbitMQ actually uses Erlang at runtime while MPAPI is merely using Erlang to some degree as an inspiration.  To be clear I am not suggesting either way which is "better."

4 - Apache ActiveMQ (and repackaged versions by Iona and others).   Comments:  This is an open source implementation of a message queue system.  It runs on Linux and Windows.  C# applications can use ActiveMQ by calling the REST interface or by calling the NMS libraries (similar to JMS, the Java Message Service), which Apache also provides as open source.  I personally found Apache's and Iona's packages too buggy to use.  There were shutdown errors which caused transactions to be undone, and it proved to be a fatal problem which I could not solve.  I had to abandon ActiveMQ for my application.  It is open source so I could fix it myself in theory, but I have only enough time and skill to work on my own applications, not to write code to fix this framework.

5 - Microsoft Message Queue Server.  Comments:  This is "the standard" for message queueing on Windows.  It does not work on Linux.  It is not cheap.   Its protocol is proprietary.  The .NET Framework has a lot of built-in support for MSMQ.  MSMQ depends on COM+ infrastructure, which is impossible on my current workstation because the registry is currently screwed up and I don't feel like reinstalling the entire OS and software packages at this time.  Registries get screwed up on Windows from time to time, which is a well known issue with Windows.  I would prefer to use something which does not depend on COM+ infrastructure.  In my opinion, a backbone like MSMQ which is built on COM+ is fundamentally unreliable, and is therefore a pretty poor choice for a mission critical application infrastructure.

6 - IBM MQSeries and Tibco.  Comments:  These are "the standard" for message queuing on non-Windows OSes.  I understand these are expensive and proprietary and pretty solid and fast.  I have no experience with them and frankly I don't care to use them personally.  Such "elite" queueing systems are not helpful for advancing the knowledge and adoption rate of high performance clustered applications on low cost computers.


Questions:

Is MPAPI capable of storing messages if they are not consumed immediately?  Is it using a persistent storage model?  How many messages can MPAPI store?  Can I configure which disk device to store the messages on?  Can I administratively choose to use a database rather than a filesystem for the message store?

Why did you choose to write a custom message queue or store for MPAPI's internal implementation, rather than using some message storage system already written like ActiveMQ or RabbitMQ or MSMQ?

Do you think MPAPI provides more or better application services than a package like RabbitMQ, such that the complexity of my distributed application code would be reduced if I use MPAPI compared to a general messaging package like RabbitMQ?  Of course I already understand that a "message queue" is very general purpose and is not specifically designed only for "high performance computing" or "distributed parallel applications".  

Can you perceive any benefits to MPAPI, if MPAPI were reengineered to use Erlang, or AMQP, or RabbitMQ, or MSMQ, versus the current implementation of MPAPI?

I am excited to try MPAPI and just thought I'd ask these questions while they were on my mind.  I think lots of choices exist today for high performance computing.   It's good to know in advance what the tradeoffs will be before making a big investment of work to build any distributed application on any of these message passing infrastructures.


Thanks for reading.

Geoffrey
Coordinator
Jun 5, 2008 at 10:20 AM
Edited Jun 5, 2008 at 10:28 AM
Hi Geoffrey

Thank you for your post. I think it is a great idea to highlight where MPAPI differs from the other established message passing frameworks. So I will try to answer your questions here to the best of my knowledge, although I am not very familiar with the other frameworks and have not tried them out.

But the reason I wrote MPAPI was because I wanted a light framework that enabled me to write distributed applications - as well as non-distributed - using message passing concurrency. I started looking into the MPI standard and found that it was way too complex for what I needed (and I guess what many people need). Besides, the leading MPI implementation is only intended for C/C++ and FORTRAN programmers, and I had no intention of writing a bridge (God knows there is enough bad code out there). Furthermore my preferred language at the moment is C# (and F#). I also wanted it to be able to run on Mono, so I had to develop a custom remoting framework for that.
When I investigated a bit further I found out that there were some people who could use such a light framework, so I decided to make it public. You are right - the interest hasn't been overwhelming although distributed computing is fast on the rise, but maybe that can be attributed to the fact that people usually tend to use "old" and proven technology with a lot of advertising and exposure on the net even though it doesn't necessarily meet their requirements 100%.

Now to your questions. I cannot say an awful lot about the other frameworks since I haven't got any experience in using them - I only read some specs. But I will be as specific and thorough as I can.

1)
No, MPAPI does not persist messages on non-volatile memory. But that was actually a great idea. I haven't thought about that since I work on the assumption that a sender usually receives a response from a request. If something went wrong it could be delegated to another node/worker. Besides, the Monitor-primitive is there to help you find out when something is wrong.
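The request/response assumption described above can be sketched roughly as follows. This is a generic Python illustration, not MPAPI's API and not its Monitor primitive: `start_worker`, `request`, the 0.2 second timeout and the convention that a failed worker simply never replies are all assumptions made for the example. The sender waits for a reply on a private queue and, if none arrives in time, delegates the same request to the next worker.

```python
import queue
import threading

def start_worker(handler):
    """Start a hypothetical worker thread and return its inbox queue."""
    inbox = queue.Queue()

    def loop():
        while True:
            item = inbox.get()
            if item is None:            # stop signal
                return
            payload, reply_to = item
            result = handler(payload)
            if result is not None:      # a "failed" worker never replies
                reply_to.put(result)

    threading.Thread(target=loop, daemon=True).start()
    return inbox

def request(workers, payload, timeout=0.2):
    """Send the payload to each worker in turn until one of them answers."""
    for inbox in workers:
        reply_to = queue.Queue()        # fresh reply channel per attempt
        inbox.put((payload, reply_to))
        try:
            return reply_to.get(timeout=timeout)
        except queue.Empty:
            continue                    # no answer in time: delegate to the next worker
    raise TimeoutError("no worker answered")

if __name__ == "__main__":
    broken = start_worker(lambda x: None)    # simulates a node that went wrong
    healthy = start_worker(lambda x: x * 2)
    print(request([broken, healthy], 21))
```

Note that nothing is persisted here: if every worker fails, the message is lost and the sender only learns about it through the timeout, which is the tradeoff the persistence question is getting at.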

2)
I wrote MPAPI because I wanted something that could be used in .NET without having to use a lot of third party tools, or having to write language bridges (that's just bad style). For example, RabbitMQ needs Erlang, and Erlang is really (!) slow compared to .NET. Besides, the existing frameworks are just too damn complex for my needs.
Erlang has an advantage over .NET languages though: in Erlang you can easily spawn thousands of threads since each thread (they call it a process) only takes up 300 bytes plus code and state. This makes thread context switching fast on Erlang systems. In .NET you use OS-threads, and context switching is way more expensive since a .NET thread takes up about 1MB of memory plus code and state.
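To put those figures in perspective, here is a back-of-the-envelope calculation in Python. It only multiplies out the per-thread numbers quoted above (about 300 bytes per Erlang process, about 1 MB per .NET thread), which are estimates from this post rather than measurements; the count of 10,000 workers and the helper name `footprint_mib` are assumptions chosen for illustration, and code and state are ignored on both sides.

```python
# Figures quoted in the post above; they are estimates, not measurements.
ERLANG_PROCESS_BYTES = 300            # "300 bytes plus code and state"
DOTNET_THREAD_BYTES = 1024 * 1024     # "about 1MB", taken here as 1 MiB

def footprint_mib(count, bytes_each):
    """Total memory in MiB for `count` workers of `bytes_each` bytes each."""
    return count * bytes_each / (1024 * 1024)

WORKERS = 10_000  # hypothetical number of concurrent workers

erlang_mib = footprint_mib(WORKERS, ERLANG_PROCESS_BYTES)
dotnet_mib = footprint_mib(WORKERS, DOTNET_THREAD_BYTES)

print(f"Erlang processes: {erlang_mib:.1f} MiB")   # about 2.9 MiB
print(f".NET threads:     {dotnet_mib:.0f} MiB")   # 10000 MiB, just under 10 GiB
```

Under these assumptions the difference is more than three orders of magnitude, which is why a design with one OS thread per worker stops scaling long before a design built on lightweight processes does.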

3)
Well, if you compare the codebase of MPAPI against any of the other frameworks (including their 3rd party references) you will see a big difference - MPAPI is smaller, and the distribution of an application written using MPAPI is less complex. Furthermore the number of primitives available to programmers, and the setup needed to get a distributed cluster running, is way smaller and thus less complicated in MPAPI. Maybe it is TOO simple, but I am willing to look into that whenever someone raises an issue.

4)
As far as I can tell RabbitMQ and MSMQ are alike. And MSMQ is only intended for sending and receiving messages. This means that they can only solve one of the things that MPAPI solves, namely sending messages back and forth between nodes/workers/entities. MPAPI also handles threading, and the synchronization between these threads. This is called Message Passing Concurrency, and is a simplified programming model that enables programmers to write multithreaded (and distributed) applications without having to worry about the synchronization between threads - a huge problem that will only grow larger as we move to multicore processors. This is actually an important observation! MSMQ and MPAPI are different. MPI and MPAPI are alike in the issues they address.
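The programming model can be sketched in a few lines. The following is a minimal Python illustration of message passing concurrency in general, not MPAPI's API: the names `worker` and `run` and the use of `None` as a stop signal are assumptions made for the example. The worker thread owns its state, and the only things shared between threads are two queues, so the application code contains no locks.

```python
import queue
import threading

def worker(inbox, outbox):
    """Hypothetical worker: owns its state and talks only through messages."""
    total = 0                      # private state, never touched by another thread
    while True:
        msg = inbox.get()          # block until the next message arrives
        if msg is None:            # stop signal: report the result and exit
            outbox.put(("done", total))
            return
        total += msg

def run(numbers):
    """Send each number to the worker as a message and collect its reply."""
    inbox, outbox = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker, args=(inbox, outbox))
    t.start()
    for n in numbers:
        inbox.put(n)
    inbox.put(None)
    reply = outbox.get()
    t.join()
    return reply

if __name__ == "__main__":
    print(run([1, 2, 3]))
```

A plain message queue product covers only the `put` and `get` part of this picture; starting the workers and keeping their state isolated is the part a message passing concurrency framework adds on top.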

MPAPI cannot really be reengineered to use Erlang, since what I did was take some of the great ideas from Erlang and implement them in MPAPI. Erlang is great for distributed applications, but there is a huge performance penalty there. Granted, 99.99% of applications are idle some (or most) of the time so performance is not a great issue, but some applications need performance at all cost. You can go to The Computer Language Benchmarks Game to see a comparison of languages' performance on a small subset of compute-intensive tasks.

If you need any help with implementation of a test scenario with MPAPI I might be able to help you out. I am quite interested in what you intend to use it for. Just keep in mind that the framework is not finished; as I, and other users, gain more experience with it I will alter it to reflect those needs. Furthermore, it is completely open source so you can alter the source code to your liking.

All I can say before you make any commitment to any framework is that you must investigate and test each one thoroughly, and weigh the benefits and drawbacks against each other. 3rd party references, or bindings to a complex (and some would say obsolete) language like C/C++, can be expensive in the long run, since they complicate development, deployment and maintenance.

Please feel free to contact me at mpapi@sector0.dk if you have anything else you want me to clarify, or you want me to elaborate on some of this.

Best regards
Frank