Testing, probing and profiling the code

This is part 3 of a series on improving performance. Click here for part 1, and here for part 2.

Now that the basics are done, it is time to start thinking more about testing. Some may say “but what happened to test first?” Well, that is a long discussion, but suffice it to say that there must be a balance between testing and productivity.

To those who insist on 100% test coverage: I applaud your efforts. Unfortunately, your code will still contain bugs. Testing is VERY important, but apply it in a broader context and you will be more productive. I built tests that verify this very naive system works.

So what do we need to test here? Firstly, we must exercise as much of the code as possible. Then we can profile to see where the bottlenecks are. After all, this project is about profiling.

The rules of the game are “throw as many transactions at this thing as rapidly as possible.” But we must also be concerned about the quality of the data. In a real-world matching engine, the majority of transactions happen very close to the best bid and offer. But not everything happens there. So we need to generate a large amount of test data that will be processed in order, so that repeated runs produce a somewhat repeatable time. That data should include:

  • Most (but not all) transactions close to the bid and ask
  • A varying amount of quantity and price
  • Price should not fall below zero (an upward bias perhaps?)
  • Orders that partially fill
  • Orders that take liquidity from several existing orders

There is functionality that does not exist. For example, this book does not provide for cancelling orders. It does not handle rounding (yet). So the test data will not include those things.

A very large CSV file was created with randomized data. Different tests will be created that limit the number of records processed. From that test, we can begin profiling to see which areas of code would benefit the most from further examination.
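
Limiting the number of records could look something like the sketch below. The column order (id, side, price, quantity) and the names CsvOrder and readOrders are assumptions; the CSV reader in the repository may be structured quite differently.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical row layout: id,side,price,quantity
struct CsvOrder {
    unsigned long long id;
    char side;        // e.g. 'B' or 'S'
    double price;
    int quantity;
};

// Reads at most `maxRecords` rows from the stream, in file order.
// Rows with missing fields are skipped; std::stoull/stod/stoi throw on
// non-numeric text, which this sketch does not try to handle.
std::vector<CsvOrder> readOrders(std::istream& in, std::size_t maxRecords) {
    std::vector<CsvOrder> orders;
    std::string line;
    while (orders.size() < maxRecords && std::getline(in, line)) {
        std::istringstream fields(line);
        std::string id, side, price, quantity;
        if (!std::getline(fields, id, ',') || !std::getline(fields, side, ',') ||
            !std::getline(fields, price, ',') || !std::getline(fields, quantity)) {
            continue;
        }
        if (side.empty()) continue;
        orders.push_back({std::stoull(id), side[0], std::stod(price), std::stoi(quantity)});
    }
    return orders;
}
```

Each test can then pass a different maxRecords against the same file, so the smaller runs are always a prefix of the larger ones.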


I will be using Valgrind’s Callgrind tool to measure performance. KCachegrind will be used to help read the Callgrind output.

Initial Results

Each time I profile a piece of code, I am somewhat surprised at the results. “No premature optimization” is alive and well. I had several assumptions before profiling. Some were valid; others were not.

Firstly, the method “placeOnBook” took a good chunk of time. On inspection, all it does is place an entry in the map. This area should be examined. I suspect there are copies or constructor calls that could be optimized or avoided altogether.
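
The profile does not yet confirm that copies are the cost, but checking is cheap. The sketch below uses a hypothetical CountedOrder type (not the real order class) and two hypothetical placement functions to show how the choice of map call changes the number of copies.

```cpp
#include <map>
#include <utility>

// Hypothetical order type that counts how often it is copied.
struct CountedOrder {
    static int copies;
    int quantity = 0;
    CountedOrder() = default;
    explicit CountedOrder(int q) : quantity(q) {}
    CountedOrder(const CountedOrder& o) : quantity(o.quantity) { ++copies; }
    CountedOrder& operator=(const CountedOrder& o) {
        quantity = o.quantity;
        ++copies;
        return *this;
    }
    CountedOrder(CountedOrder&&) noexcept = default;
    CountedOrder& operator=(CountedOrder&&) noexcept = default;
};
int CountedOrder::copies = 0;

using Book = std::map<long, CountedOrder>;   // key: assumed integer price

// operator[] default-constructs a value in the map, then copy-assigns into it.
void placeByAssignment(Book& book, long price, const CountedOrder& order) {
    book[price] = order;
}

// emplace builds the map entry from an rvalue, so the order is moved, not copied.
void placeByEmplace(Book& book, long price, CountedOrder&& order) {
    book.emplace(price, std::move(order));
}
```

For a small, trivially copyable order the difference may be negligible; the counter only shows where copies happen, not what they cost.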

Secondly, the CSV reader was high on the list. This is an example of how your test framework can show up in performance results. It doesn’t matter too much, as such results can largely be ignored, and optimizing there would be a waste of effort. But it does serve as an indication that the matching engine code is fairly robust.

Conclusion of Round 1

The std::map is where the effort should go. Keying it differently, or replacing std::map with something faster, should provide the biggest improvement. Here is a screenshot of the output, formatted by KCachegrind.

Order matching and container types

This is part two of my order book exercise. To start at the beginning, click here.

The initial commit of my order book uses a std::map as a collection of bids and asks. That works, but it has a problem. The key is the price, and a std::map holds only one entry per key. More than one order at the same price and oops! Bad things happen. In this case, the previous order disappears.
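
The problem is easy to reproduce in isolation. The sketch below uses a hypothetical PriceBook (price mapped to an order id), not the real order book types. Which order gets lost depends on the call used: assigning through operator[] replaces the earlier order, while insert() keeps the earlier order and drops the new one.

```cpp
#include <map>
#include <string>

// Hypothetical book: price -> order id.
using PriceBook = std::map<double, std::string>;

// Assigning through operator[] replaces whatever was stored at that price.
void placeWithOverwrite(PriceBook& book, double price, const std::string& orderId) {
    book[price] = orderId;
}

// insert() leaves an existing entry untouched and reports false.
bool placeWithInsert(PriceBook& book, double price, const std::string& orderId) {
    return book.insert({price, orderId}).second;
}
```

Either way, only one order per price survives, so the key has to carry more than the price.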

Well, is std::map the best choice? We could immediately say no, but that wouldn’t be any fun, now would it?

Our goal here is profiling, but I want a complete, working example. So something must be done to allow two orders to be on the book at the same price. In addition, the oldest order should be used first (FIFO). How is that to be accomplished?

One way is to create an object to be used as a key. The plan is to create an AssetOrderKey object that works with comparison operators. Let’s see that in action.

If we add an AssetKey object, and then expand it into two objects, we can have a different comparison operator for the two collections. Therefore, a begin() call on either collection will give us the best bid or ask.
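
A minimal sketch of that idea follows. The names BidKey, AskKey, BidBook and AskBook, and the choice of fields, are assumptions for illustration; the actual classes are in the linked commit. The sequential order id serves as the FIFO tie-breaker.

```cpp
#include <cstdint>
#include <map>

// Bids: highest price first; at equal price, the lowest (oldest) id first.
struct BidKey {
    double price;
    std::uint64_t id;
    bool operator<(const BidKey& other) const {
        if (price != other.price) return price > other.price;
        return id < other.id;
    }
};

// Asks: lowest price first; at equal price, the lowest (oldest) id first.
struct AskKey {
    double price;
    std::uint64_t id;
    bool operator<(const AskKey& other) const {
        if (price != other.price) return price < other.price;
        return id < other.id;
    }
};

// The mapped value is a placeholder for the remaining quantity.
using BidBook = std::map<BidKey, int>;
using AskBook = std::map<AskKey, int>;
```

With these keys, two orders at the same price no longer collide, and begin() on each book points at the oldest order at the best price.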

To see these changes, take a look at this GitHub commit.

Another way to do that would be to pass a custom comparison function object to the std::map as its third template argument. I did not do that here.
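
For comparison, that alternative might look like the sketch below. The shared key type (a price and id pair) and the names BestBidFirst, BidMap and AskMap are hypothetical.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical shared key: (price, sequential id).
using PriceTimeKey = std::pair<double, std::uint64_t>;

// Comparator for bids: price descending, then id ascending.
struct BestBidFirst {
    bool operator()(const PriceTimeKey& a, const PriceTimeKey& b) const {
        if (a.first != b.first) return a.first > b.first;
        return a.second < b.second;
    }
};

// Asks can rely on the default std::less, which orders a pair by
// price ascending and then id ascending.
using AskMap = std::map<PriceTimeKey, int>;
using BidMap = std::map<PriceTimeKey, int, BestBidFirst>;
```

This keeps a single key type and moves the ordering rule into the container's type, at the cost of the two books being different map types.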

Next step: testing and profiling. Click here for part 3.

Matching Engine Requirements

As an academic exercise, I wanted to take on building a matching engine in C++. The purpose here is to iterate through the process of measuring and improving performance.

I imagine the initial requirements as naive, with later iterations including removal of floating-point calculations, variable precision, 128-bit integers, and “dust” handling (which may be more of an implementation question than a performance one).

This engine will be strictly single-threaded and purposeful. It is hoped that the input will be clean and optimized (which could be offloaded) to improve throughput.

Tooling will be sparse, on purpose. The idea here is not a discussion of the intricacies of the tools, but how tweaks to code affect speed.

The Idea

The engine will receive limit orders that specify the asset held, the asset to be bought, and the desired price. It will include a sequential id, externally guaranteed to be unique (such a key could be analyzed later… we’ll see…).

Once received, the order is processed. If it is not filled immediately, whatever is left over is placed on the order book.
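
As a rough sketch of that flow, assuming a single asset pair, integer prices, and only the buy side (none of which is settled yet), matching an incoming order could look like this. The names RestingKey, RestingAsks and matchBuy are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical resting-order key: (price, sequential id).
using RestingKey = std::pair<long, std::uint64_t>;
// Value: remaining quantity. Default ordering puts the lowest price,
// and within a price the oldest id, at begin().
using RestingAsks = std::map<RestingKey, int>;

// Matches an incoming buy limit order against the resting asks.
// Returns the unfilled quantity, which the caller would place on the bid book.
int matchBuy(RestingAsks& asks, long limitPrice, int quantity) {
    assert(quantity > 0);
    while (quantity > 0 && !asks.empty()) {
        auto best = asks.begin();
        if (best->first.first > limitPrice) break;   // best ask is above the limit
        const int traded = std::min(quantity, best->second);
        quantity -= traded;
        best->second -= traded;
        if (best->second == 0) asks.erase(best);     // resting order fully filled
    }
    return quantity;
}
```

A return value of zero means the order was filled completely; anything else is the remainder that goes on the book.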

Simple, right? There are many details yet to be sorted out. So we will get started! I will edit this post with links to my future posts on the subject. Stay tuned!

For the first cut of the order book, see this GitHub commit.

To see what I tackled next, see part two.