I came across MessagePack from multiple sources in recent weeks. MessagePack claims to be a "binary-based efficient object serialization library". The bar chart on MessagePack’s site claims that MessagePack is four times faster than JSON at serialization and deserialization.
Really? If true, it is able to optimize for both speed and space at the same time. I needed to find out. I did a quick test using @pgriess's node-msgpack implementation for Node (if you’re using recent versions of Node, use the fork https://github.com/jmars/node-msgpack instead). Here are the numbers for 18k of JSON over 10000 test runs on Node v0.4.8 on Ubuntu.
msgpack pack: 12705 ms
msgpack unpack: 1668 ms
json pack: 1575 ms
json unpack: 1271 ms
For this size, MessagePack’s serialization is roughly 8 times slower than v8’s JSON serialization. Here are the results for a very small piece of data (48 bytes – same as the one used by Peter’s bench.js) over 10000 runs.
msgpack pack: 108 ms
msgpack unpack: 69 ms
json pack: 31 ms
json unpack: 19 ms
In other words, I’m unable to reproduce Peter’s numbers. The differences are likely due to changes in Node and v8 since the time Peter did his tests. You can try it out yourself by doing the following:
git clone https://github.com/jmars/node-msgpack
cd node-msgpack
make
# before the next step, edit bench.js to require lib/msgpack
node bench.js
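For readers who want to see the shape of such a measurement without cloning the repository, here is a minimal sketch of a pack/unpack timing loop. It is not Peter's bench.js. The payload is a hypothetical stand-in for the 18k and 48-byte documents used above, the helper names bench and runAll are made up for illustration, and the code is written for a current Node rather than v0.4.8. It assumes that, if a module named msgpack is installed, it exposes pack() and unpack(); if it is not installed, only the JSON numbers are reported.

```javascript
// Run fn `runs` times and return the elapsed time in milliseconds.
function bench(fn, runs) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < runs; i++) fn();
  return Number(process.hrtime.bigint() - start) / 1e6;
}

// Hypothetical payload; the documents measured in the post are not reproduced here.
const payload = {
  id: 1,
  name: 'example',
  tags: ['a', 'b', 'c'],
  nested: { ok: true, n: 3.14 }
};

function runAll(runs) {
  const results = {};

  const json = JSON.stringify(payload);
  results['json pack'] = bench(() => JSON.stringify(payload), runs);
  results['json unpack'] = bench(() => JSON.parse(json), runs);

  // Assumption: an installed 'msgpack' module exposing pack()/unpack().
  // If it cannot be loaded, the msgpack rows are simply skipped.
  let msgpack = null;
  try {
    msgpack = require('msgpack');
  } catch (e) {
    msgpack = null;
  }
  if (msgpack) {
    const packed = msgpack.pack(payload);
    results['msgpack pack'] = bench(() => msgpack.pack(payload), runs);
    results['msgpack unpack'] = bench(() => msgpack.unpack(packed), runs);
  }
  return results;
}

// Print one line per measurement, in the same style as the numbers above.
const results = runAll(10000);
for (const [name, ms] of Object.entries(results)) {
  console.log(name + ': ' + ms.toFixed(0) + ' ms');
}
```

Timings from a loop like this depend heavily on the payload, the Node version and the machine, so treat the output as a rough comparison only.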
This is no good. There are other features that concern me as well. It is not clear why the developers of MessagePack chose to bundle together RPC (an application issue), pipelining (a protocol level issue), connection pooling (again an application issue) etc. with a format design.
MessagePack 9 times slower than v8’s JSON serialization? http://www.subbu.org/blog/2011/06/messagepack-anyone
I think performance may not even be the biggest challenge — although Java version appears to be about same speed as latest Thrift or fastest JSON codecs (and slightly below protobuf), its lack of documentation is nasty.
Trying to actually use data binding resulted in lots of cursing: unless you know to add all those @Optionals for data that may be null, or to register bean types (or annotate them), you will get seemingly random exceptions.
And I completely agree with concern over bundling unrelated concerns, starting with bundling of serialization/databinding with other aspects.
I like bundling as long as it’s modular. Give me an end-to-end system that works well for general purpose and don’t make me think too much so I can focus on the problem that I’m actually trying to solve, not which pipelining technique I’m going to use. If it matters, I’ll switch out that module later.
No, not those kinds of features. RPC pushes you to the dark ages of distributed systems. Its role is very limited. Moreover, pipelining belongs to HTTP and that gets you interoperability. Connection pooling is again related to persistent connections in HTTP. There is no need to reinvent those features.
It’s great that node json serialization is fast (it should be) but that doesn’t make it fast for any other systems that it has to interoperate with.
As far as the comparison of a binary protocol+connection pooling+event handling to http+pipelining that’s silly. Your own writing points out the problems with pipelining.
MsgPack is packaged separately from MsgPackRPC. The fact RPC adds value to the format is reason enough to bundle them in the same project, I really don’t see that as a legitimate complaint.
> It’s great that node json serialization is fast (it should be) but that doesn’t make it fast for any other systems that it has to interoperate with.
Agreed. However, the claim about speed is misleading. That claim is what caught many people’s attention.
> As far as the comparison of a binary protocol+connection pooling+event handling to http+pipelining that’s silly.
You’re taking that out of context. Reinventing the wheel is not necessary.
But JSON is rather efficient on many many platforms; not just on node.js.
On Java, for example, it is competitive with msgpack, as well as all other binary protocols, even regarding performance.
In fact, all measurements I have seen that are done by people other than msgpack authors cast doubt on msgpack’s performance claims — where are independent verifications of alleged performance superiority? Or even details on exactly how authors have done their tests? On what platform, comparing against which other libraries?
First – @Cowtowncoder, can you give a specific example of speed claims from the author that aren’t substantiated? Here Subbu questions a test that isn’t even from the MsgPack author, but rather from the author of node-msgpack. In fact this post tries to call into question the Message Pack author’s claim of 4x over Protocol Buffers and 12x over JSON by running a totally different benchmark on a totally different platform?
The primary claim from the MsgPack author is right on the MsgPack.org homepage:
“In this test (source code), it measured the elapsed time of serializing and deserializing 200,000 target objects. The target object consists of the three integers and 512 bytes string.”
The source code links here (not the node-msgpack benchmark):
http://msgpack.org/releases/benchmark/msgpack-protobuf-json-speed-test.tar.gz
Second – The comment on the RPC bundle is silly. Serialization formats are very commonly accompanied by RPC solutions. Avro, BERT, Thrift, Protocol Buffers all either explicitly provide RPC interfaces or suggest the use of protocol definitions as RPC interfaces. The author is promoting REST as an alternative but it’s been my experience that message serialization and RPC is more performant but less flexible.
“The author is promoting REST as an alternative but it’s been my experience that message serialization and RPC is more performant but less flexible.”
Do you have numbers to back this up? I’m curious because I’m aware of systems with efficient connection management in place but still using HTTP as a transfer protocol.
I can’t point to many HTTP implementations that are slower and do not implement efficient connection management. I’m not saying HTTP by nature is slow but I am saying binary serialization+RPC protocols such as Message Pack are generally more efficient than HTTP and therefore more performant. I will say though HTTP is far more open, enjoys ubiquitous adoption and certain implementations are very performant.
That should be “I can point to many HTTP implementations that are slower and do not implement efficient connection management.” sorry too early.
“The primary claim from the MsgPack author is right on the MsgPack.org homepage:”
Understood, but claims should not be made based on such a limited test.
Neither should claims to the contrary. There are actually a number of other tests besides the one quoted on the home page, which by the way your results do not invalidate.
The point is you can’t call the whole serialization protocol into question because you can’t verify a single third party implementation. You also note this is likely due to some improvement in node.js since the original test. To follow that acknowledgement up with “This is no good.” implying the whole project is bad doesn’t make sense.
The only thing you can say with your test is, someone implemented a node.js Message Pack binding some time ago and it was faster than JSON. According to my test that may no longer be true.
What’s your take on Apache Avro then?
Avro is meant for large datasets, but I would recommend running https://github.com/eishay/jvm-serializers/wiki/ which now includes MessagePack to see how it fares against Avro and others on the JVM.
An article saying MessagePack is slow, but I can't evaluate that myself / MessagePack Anyone? http://htn.to/4D59gZ
Subbu Allamaraju refutes MessagePack’s performance claim over plain old JSON http://j.mp/khuOkH
Has anyone here run the msgpack-protobuf-json-speed-test benchmark on the MessagePack site (src link under chart)? The version of yajl I had to install was 1.0.12 which seems a bit outdated. Maybe better results for JSON with 2.0.1 parser?