Branching Strategy for Komodo Core

A bit of background…

The current branching strategy used in Komodo Core is somewhat informal. The goal here is to formalize it so that everyone agrees on the procedures to follow, and those who come after us know what to follow.

This is a living document. If the procedures below do not serve their intended purpose, they (or portions of them) should be replaced.

Why this and not that?

There are a number of branching strategies available. The Komodo Core branching strategy is heavily based on Git Flow, which was chosen because it matches the style of the code base (long-lived branches, multiple versions deployed and supported, etc.).

How it works

There are a number of long-lived branches within the Komodo Core repository. The primary ones are:

  • master – The main production branch. This should remain stable and is where code for official releases is built.
  • dev – The development branch. This is where work is done between releases. Non-hotfix bug fixes and new features should branch off this branch.
  • test – The branch for the testnet. As the codebase stabilizes before a release, the code is merged from dev to test.
  • hotfix – After a release, a hotfix branch may need to be created to fix a critical bug. Once merged into test and master, it should also be merged into dev. These branches should not be long-lived.

Tags

Each release for testnet and production should be tagged. The versioning strategy is not currently part of this document.

Reviewing / Merging

Before a feature or fix is merged into a parent branch, it must be reviewed. Anyone can review, but approvals from two repository maintainers are required.

Once approved, it is best (if possible) that the author of the feature or fix merges their branch into the parent branch. After the merge is completed locally and all unit and CI tests pass, the code is pushed to the parent branch.

Note: It is best to not use the merge feature of the GitHub web interface. Perform the merge locally and push to the server.

Crypto Currency Exchange Connector v2

I began a personal project a long time ago to provide connectivity to various cryptocurrency exchanges. I never got around to finishing it. It did, however, stay in my mind. And while working on other projects, I picked up better ways to implement different pieces.

Recently I began anew. The market has changed a bit, but the basics still hold true. There is a need for fast connectivity to multiple exchanges. Open-source “one size fits all” solutions that I researched did not work for me. They did not focus on speed and efficiency.

The Problems

Centralized exchanges are still the norm, although decentralized exchanges (e.g. Uniswap) are making headway. While I believe decentralized exchanges will be tough to beat in the long term, currently their centralized cousins provide the “on ramps” for the majority of casual users. Decentralized exchanges also suffer from a dependence on their underlying host (Ethereum, in the case of Uniswap). These are not impossible hurdles to overcome, but for now centralized exchanges play a large role in the industry.

Regardless of the centralized/decentralized issue, there are multiple exchanges. Those with connectivity to many of them can seek out the best deals, as well as take advantage of arbitrage opportunities. But this is a crowded field. Getting your order in before someone else does becomes more important as bigger players squeeze the once-juicy profits of the early adopters.

The Solution

The Crypto Currency Exchange Connector aims to solve this with a library that maintains connections to a variety of exchanges and provides a platform for developing latency-sensitive trading algorithms.

The library is being written in C++. It will be modular to support features such as persistence, risk management systems, and the FIX protocol. But the bare-bones system will get the information from the exchange and present it to the strategy as quickly as possible. It will also take orders generated by the strategy and pass them to the exchange as quickly as possible. A good amount of effort has been put forth to make those two paths performant.

Giving strategies the information needed to make quick decisions was the primary focus. Freeing strategies from “boilerplate” code was a secondary goal, and one I feel I accomplished.
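To make that concrete, here is a minimal sketch of the kind of strategy-facing interface I am describing. The names and signatures (OrderBookUpdate, Order, Strategy) are simplified illustrations for this post, not the exact code.

```cpp
// Minimal sketch: the strategy implements callbacks; the connector owns the
// sockets, parsing, and session management (the "boilerplate").
#include <cstdint>
#include <string>

struct OrderBookUpdate {
    std::string symbol;
    double best_bid = 0.0;
    double best_ask = 0.0;
    std::uint64_t exchange_ts_ns = 0;   // exchange timestamp, nanoseconds
};

struct Order {
    std::string symbol;
    double price = 0.0;
    double quantity = 0.0;
    bool is_buy = true;
};

class Strategy {
public:
    virtual ~Strategy() = default;
    // Called as quickly as possible after an exchange message is decoded.
    virtual void on_book_update(const OrderBookUpdate& update) = 0;
    // Called when an order generated by the strategy is filled.
    virtual void on_order_filled(const Order& order, double fill_price) = 0;
};
```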

Compiling in only the pieces necessary was another focus. I debated for a good while about making each exchange a network service you can subscribe to. That makes sense for a number of reasons, but it adds latency that I did not want to pay for. And having everything compiled into one binary does not mean it is impossible to break it out later. I wrote it with the idea that off-loading exchanges to other processes or servers would be possible, but having it all in one is the default.
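As a simplified illustration of the compile-time approach (the exchange and class names here are made up for the example), each exchange can be its own type, and a binary only pulls in the connectors it actually lists:

```cpp
// Sketch: exchanges selected at compile time via a type list, so unused
// connectors never end up in the binary. Illustrative names only.
#include <tuple>

struct CoinbaseConnector { void connect() { /* open socket, subscribe */ } };
struct BinanceConnector  { void connect() { /* open socket, subscribe */ } };

template <typename... Exchanges>
class Engine {
public:
    void connect_all() {
        // Call connect() on every compiled-in exchange.
        std::apply([](auto&... ex) { (ex.connect(), ...); }, exchanges_);
    }
private:
    std::tuple<Exchanges...> exchanges_;  // only these are compiled in
};

int main() {
    Engine<CoinbaseConnector, BinanceConnector> engine;  // chosen at compile time
    engine.connect_all();
}
```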

Persistence is another issue. There is a ton of data pouring out of the exchanges. Persisting it is possible. But getting it into a format for research often requires dumps, massaging, reformatting, and sucking it into your backtesting tool. To alleviate some of those steps, I chose TimescaleDB. It is based on Postgres, and seems to work very well. Dividing tick data into chunks of time can be done at export time. I have yet to connect it to R or anything like that, but I am excited to try. That will come later.
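As a rough sketch of what storing ticks in TimescaleDB can look like from C++ (here via libpqxx; the table layout and connection string are placeholders for the example, and in practice inserts would be batched):

```cpp
// Sketch: create a TimescaleDB hypertable for ticks and insert one row.
// Table schema and connection string are assumptions for illustration.
#include <pqxx/pqxx>

int main() {
    pqxx::connection conn{"dbname=marketdata"};
    pqxx::work txn{conn};

    txn.exec("CREATE TABLE IF NOT EXISTS ticks ("
             "  time   TIMESTAMPTZ NOT NULL,"
             "  symbol TEXT        NOT NULL,"
             "  price  DOUBLE PRECISION,"
             "  size   DOUBLE PRECISION)");
    // TimescaleDB's time-partitioning of the table.
    txn.exec("SELECT create_hypertable('ticks', 'time', if_not_exists => TRUE)");

    // A single tick; real code would batch or COPY these in.
    txn.exec_params("INSERT INTO ticks VALUES (now(), $1, $2, $3)",
                    "BTC-USD", 34250.5, 0.01);
    txn.commit();
}
```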

Project Status

I truly do not want this project to be set aside yet again. I have a strong desire to get this across the finish line. And it is very close to functional. There are only a few pieces left to write before my first strategy can go live.

I will not be making this project open source. It is too valuable to me, and gives me an edge over my competitors. At best I will take the knowledge I’ve gained and apply it to other systems that I am asked to build.

Arduino is great, but…

Arduino and the Arduino IDE are great products. They serve as an on-ramp (and a very good one) for many makers. You can do quite a bit of tinkering, and even build commercial products without leaving the Arduino ecosystem. I give the many developers, volunteers, makers, etc. big kudos for their work.

It is just not for me. I’ve never been one who likes having development steps done for me. I need a lower level. I need to see the gears (rusty or greasy as they may be) turn. An IDE that does everything for you is great, but at times it gets in the way. And that is when I start to look under the hood.

Let me be clear. I will probably never design my own chip. I am also not one to dig too far into the minutiae of manufacturers’ data sheets. I like a certain level of abstraction. But when something is not working as I expect, my instinct is to check my setup. And when the setup is all done for me, I quickly start grasping at straws, and often waste time.

A recent experience solidified my desire to leave the Arduino IDE behind and go it alone. Doing so was an investment of time, but it added significantly to my repertoire of skills. And I believe it is a worthwhile investment for many developers.

In my case, I was not working on an Arduino. I was working with a chip and associated hardware that had been twisted to work with the Arduino IDE. That was a great way to get started. Again, a big thank you to all who worked on making that possible. But when things get “twisted” to make them work, things often get lost, dropped, or forgotten.

Fortunately, in my case, the manufacturer of the microcontroller has a well-documented SDK and gcc toolchain. And while I have yet to remove all of the Arduino pieces from my code (and may never do so), I can say I am now much more confident in what will happen when I tweak my linker script. I know what goes where and when it will be called. When all of a sudden a reboot is triggered, I am usually certain which area of the code needs attention.

If you know how to recover from accidentally “bricking” your device, I believe you will love stepping outside of the Arduino IDE and into the world of toolchains, linker scripts, master boot records, and boot loaders. Yes, it is frustrating at first, but the feeling of success at the end often makes it worth it.

That is my $0.02. As always, feel free to chime in with your opinion.

IoT and Smart Watches

There are plenty of avenues to enjoy some good old-fashioned coding. Lately, I’ve been doing some embedded work, and found a reason to explore smart watches. It seems quite a community is building around watches and fitness bands.

I purchased an inexpensive watch model that seems to have some grass-roots support around it: the P8. It was so inexpensive that I also took a chance on its successor, the P9. They arrived today. The brain dump and camera dump will be built up here.

IoT – The marriage of software and hardware

I have always enjoyed electronics as a spectator. It continues to fascinate me. But the maturing trend of IoT captures my imagination because it blends electronics and software.

My career has mainly revolved around the major platforms: Windows, macOS, Linux. I wrote large applications that ran on large servers or powerful desktops. Performance was a concern, but the cost of development often meant worrying about hardware constraints only when something didn’t work.

Smartphones and other mobile devices drove developers to once again consider the hardware. Sure, there are development tools that hide some of the complexities of particular hardware. But when you get down to a board with a modest microprocessor, limited RAM, and little storage that must run on a watch battery, adding a tool to hide the complexities often means the tool is too big to fit on the device, let alone your code.

Robots have been used as a STEM tool in schools for a while now. There are people in the workplace now who have a much more solid base in electronics than I do. Imagine, kids: my school had one computer in the library that had to be shared by the entire student body. Of course, I was one of the few who knew how to turn it on. But I digress…

The idea that the latest generation (and even a few before this one) are being exposed to IoT makes me happy. Learning such things while your mind is open to explore (and you have the time to do it) means more capable people with imaginations beyond the basics.

But that doesn’t mean that people from my generation can’t contribute. Just as the generation before us, we can certainly understand that we are where we are because of the shoulders we stood on. The question I ask myself is “How can I help?”

I’m no writer, and I’m no teacher. But I do have a good amount of experience, much of which only exists as anecdotes in my head. So it is time I do more thinking, learning, and writing.

My hope is that by writing more, I will explore more. I am looking toward IoT to expand my electronics knowledge, increase my coding knowledge, and learn how to best explain concepts.

These are lofty goals, but I’m hoping for the best.

Side project: Marine Weather Radar

I am researching open-source projects for marine weather radar.

Marine weather radar hardware is proprietary. I understand why, and I’m not against it. But I would like to have my data in one place if I can. To do that, I need integration, and that will affect my choice when purchasing hardware.

To research:

Hardware manufacturers of marine radar for pleasure craft. Do they provide an API? What equipment is necessary?

Open-source projects used for integration of systems in the marine environment.

What I’ve found:

OpenCPN seems to be the open-source software of choice for chart plotter navigation. It has AIS (traffic) integrated, among other things. I have also seen integration with older radomes as part of the radar_pi project.

The industry standard for networking devices together is NMEA2000, a protocol far too slow for radar data, but it seems to work well for things like wind speed, GPS position, depth, etc.

Signal K is an open-source project for integrating NMEA2000 data with a PC. From what I’ve seen so far, they’ve put a lot of work into it.

The Ultimate:

A PC connecting to the boat’s WiFi network that connects to a server (perhaps running on a Raspberry Pi) that talks to boat systems.

Related links:

GnuRadar: Need to explore

Cruiser’s Forum: Post about someone wanting to do the same, but dated.

SI-TEX MDS 8R was (is?) a radome that plugged into a PC’s Ethernet port. Interesting, though I am not sure whether it is still in production.

After more research, it seems radar is going the Ethernet route. More units are going headless, and I am hopeful that standardization is coming soon.

Update 16-September-2020

radar_pi has done a lot of work on this already. It seems the tight grip that manufacturers have on their hardware limits (but does not eliminate) the usefulness of writing such code.

Manufacturers do sell the antenna separately from their head units. Whether replacing their head unit with a generic one would cause things like voided warranties remains to be seen.

To truly reverse engineer, build, and test such software would require the actual hardware. This post talks about someone working on such a thing with Furuno hardware.

An Algorithmic Trading Framework

I’m dreaming here. But if I were to build the ultimate framework for algorithmic trading, what would it look like?

Purpose

One purpose would be to concentrate more on the algorithm, and less on the mechanics that all strategies need. But this is tricky. All strategies need something different, with different parameters.

Another purpose would be to have a consistent way to write complete algorithms. Once the framework is learned, adding new algorithms and updating old ones becomes easier.

Hurdles

There are a myriad of options for implementing trading algorithms. A simple idea can quickly turn into a long list of questions.

Risk measures must be looked at. How is the optimum trade size calculated? Are other instruments involved in this calculation (e.g. would this trade push exposure to a particular index or industry above an allowed threshold)?
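As one simple, generic illustration of the trade-size question, here is fixed-fractional sizing, risking a set fraction of equity per trade. The numbers are arbitrary and this is not a recommendation.

```cpp
// Fixed-fractional position sizing: risk a fixed fraction of account equity
// on each trade, given the distance from entry to stop.
#include <cmath>
#include <cstdio>

double position_size(double equity, double risk_fraction,
                     double entry_price, double stop_price) {
    const double risk_per_unit = std::abs(entry_price - stop_price);
    if (risk_per_unit <= 0.0) return 0.0;
    return (equity * risk_fraction) / risk_per_unit;  // units of the instrument
}

int main() {
    // Risk 1% of $100,000 on an entry at 50.00 with a stop at 48.00:
    // $1,000 of risk / $2.00 per share = 500 shares.
    std::printf("%.0f\n", position_size(100000.0, 0.01, 50.0, 48.0));
}
```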

We must consider order entry. Will it be a market order? Do we attempt to make the spread? Which ECN will be used? What do we do if the order is not immediately filled? What if the order is partially filled?

Then there is order management. At what point is the trade closed at a loss? Is there a strategy for adjusting risk if the trade moves for/against the entry price? How are profits taken?

There are also brokerage and data feed questions. Are these decisions already made? Is there a possibility they will be changed in the future?

With such questions answered (or at least partially answered), we begin to look at frameworks that can help build the infrastructure.

If we’re sticking to a particular broker or software package, the framework decision becomes easy. If we’re talking FOREX, and the broker mainly works with Metatrader, there would need to be a strong reason to choose another platform.

A trading business must also look at in-house experience that is available. A hedge fund that has a staff of Python developers may not want to work with a C++ framework.

Conclusion

A “one size fits all” trading platform will never be created. A platform that works well for a particular situation is often available. I would like to build a platform that is somewhere in the middle of those two situations. I would like to hide the complexities of portfolio management, broker connectivity and data feed connectivity by providing a (somewhat) generic interface to these items.

Strategies that connect through an API to these resources would be somewhat more portable between brokerages, data providers, and changes to portfolio management rules. This would also hopefully allow for backtesting without rewriting.
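A rough sketch of the kind of generic interfaces I have in mind, so a strategy can be pointed at a live broker or a backtest without being rewritten (all of the names here are hypothetical, and this is not the code in the repository):

```cpp
// Sketch: a strategy depends only on abstract DataFeed and Broker interfaces,
// so swapping a backtest broker for a live one is a construction-time choice.
#include <functional>
#include <string>

struct Bar { std::string symbol; double open, high, low, close; };

class DataFeed {                       // live feed or historical replay
public:
    virtual ~DataFeed() = default;
    virtual void subscribe(const std::string& symbol,
                           std::function<void(const Bar&)> on_bar) = 0;
};

class Broker {                         // real brokerage or simulated fills
public:
    virtual ~Broker() = default;
    virtual void submit_market_order(const std::string& symbol, double qty) = 0;
    virtual double position(const std::string& symbol) const = 0;
};

class Strategy {
public:
    Strategy(DataFeed& feed, Broker& broker) : broker_(broker) {
        feed.subscribe("AAPL", [this](const Bar& b) { on_bar(b); });
    }
private:
    void on_bar(const Bar& b) {
        // Toy rule: buy 100 shares on an up bar if flat.
        if (b.close > b.open && broker_.position(b.symbol) == 0.0)
            broker_.submit_market_order(b.symbol, 100.0);
    }
    Broker& broker_;
};
```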

A while back, I started down the road of building such a framework. A rough implementation is on GitHub. And by rough I mean pre-pre-alpha. There are plenty of areas that need work, or even need to be redone. But it is a start.

Data Storage Formatting

I have been battling an internal war. The question: in which format should data be stored? Twenty years ago, the question was fairly simple. Much was moving from proprietary storage systems to relational database management systems. That was a great move, or so I thought at the time. Everything was sorted. Simple queries were simple. Hard queries were hard. Queries that needed to be performant could be tuned.

But data came in different formats, and had to be parsed. What could easily be parsed was placed in columns. The big stuff was placed in blobs. Specialized parsers could then be built to look at the blobs when needed.

There were plenty of issues to resolve. How do you keep multiple databases in sync? How do you handle changes in the incoming metadata? What do you do with old, rarely used, but perhaps important data? The questions kept coming.

Those questions and many more have answers that are not easy. And like most things in life, the answer is often “it depends.”

The current situation is the need to pore over mountains of different data, looking for specific things that happened over a specific period of time.

Time-Series Market Data

Sample frequency is an issue here. The typical OHLC data that comes from the financial markets is in a recognizable format: easily parsed, placed in rows and columns, and indexed. With management systems like kdb+, you can capture data at a fast sample rate, or even down to the tick, and summaries are quickly calculated.
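As a small illustration of the kind of summary that is quickly calculated from tick data, here is a sketch of rolling ticks up into OHLC bars (the field names are made up for the example):

```cpp
// Sketch: bucket ticks into fixed-length OHLC bars (e.g. 60 s = 1-minute bars).
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

struct Tick { std::int64_t ts_sec; double price; };
struct OHLC { double open, high, low, close; };

std::map<std::int64_t, OHLC> to_ohlc(const std::vector<Tick>& ticks,
                                     std::int64_t bar_sec) {
    std::map<std::int64_t, OHLC> bars;
    for (const Tick& t : ticks) {
        const std::int64_t bucket = t.ts_sec - (t.ts_sec % bar_sec);
        auto [it, inserted] = bars.try_emplace(
            bucket, OHLC{t.price, t.price, t.price, t.price});
        if (!inserted) {
            it->second.high  = std::max(it->second.high, t.price);
            it->second.low   = std::min(it->second.low, t.price);
            it->second.close = t.price;  // ticks assumed to arrive in time order
        }
    }
    return bars;
}
```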

Report Filings

Is a typical RDBMS a good fit for financial reports, such as what comes from EDGAR? Not so much. A “big data” system may be a better fit, or perhaps an API that knows how to parse directly from EDGAR itself.

Market News

Sentiment is very difficult to quantify. The “mood” of the market may be extrapolated after the fact by certain indicators, but how is the oil sector affected when good Tesla numbers come out the same day as a bad US Manufacturing report? How do you build that query?

Summary

There are many more categories and sub-categories. But the basic result is: there are many factors to consider when looking at data. Attempt to pick the correct way, and be prepared to change it if it doesn’t work. Don’t be afraid to use more than one solution at the same time. Avoid complexity, but remember that not all systems are small. Consider microservices. Keep an open mind, but avoid “analysis paralysis”.

Once again, “it depends.”

C++ vs Python in Algorithmic Trading

I am very much a C++ person. I warmed up to the language as it was warring with C for programmers and popularity. Although I used neither for my “day job” except on rare occasions, I jumped at opportunities to play with them.

After C and C++, I learned many other languages. I took deep dives into Java, C#, JavaScript, etc. But there were plenty that were (some still are) popular that I never had the opportunity to use much. Perl, awk/sed, regular expressions are some that come to mind.

Eventually I switched to using mainly C++, and then nearly 100% for a good while. After escaping the confines of my hard-core enterprise software experiences, I found the language world had twisted again. But this time was a bit different.

In areas of finance, vendors often had the upper hand. When some big name used some tool within their development area, the smaller ones followed suit, if they could afford it. If you were the lucky vendor, you sat back and waited for the orders to roll in.

But as the trading world evolved, the smaller, nimble organizations had unprecedented access to new tools at little to no cost. Metrics like performance and accuracy were comparable. And the young graduates were leaving their academic life with these skills already under their belts.

Language wars are far from over. But the ferocity is dying down. Most developers understand that you should pick the right tool for the job. Consider what you are building up front, and make sure to leave room for flexibility.

C++ is still my language of choice. I use it daily. But it has flaws. And sometimes it is simply not the best language for the job. At the moment I am prototyping, and building some algorithms that crunch data from multiple sources in multiple ways. The sources, elements, and methods that I write tomorrow will probably not be used in the final product. But these cycles are necessary to figure out what the final product should do. Should I write that in C++? Not in this case.

Much of what I’m looking at has been looked at by others. Many used Python to grind through it. There are scripts, designs, and visualizations written in Python that help me understand the data that is running through the system. Do I want to rewrite that in C++? No thank you. The final product may be written in C++. But for now, I’m happily tweaking other people’s code for my own purpose.

I must give props to the mighty data scientists and financial wizards who wrote some of these Python libraries. Of course, Python permits C libraries to be used from within the language, so some of them are simply using Python to access libraries written in other languages. But still, Python has carved out a good market share in the big-data areas of trading systems.

Will Python take over everything in the trading world? The odds are very slim. Entire trading systems have been written in it. Many of them in fact. But there are still good reasons to choose other tools that do a better job at some aspect of the cycle.

I have added Python to my language stack. I will probably never dive deep into the language. But after only a short while using it, I feel comfortable with it.

Should your system be written in Python? Perhaps. I wouldn’t hesitate if it fits. But I also wouldn’t recommend forcing it to fit when it doesn’t belong. Standardize across the board on it? Nope. Pick and choose. Choose wisely. Hybrid models are common, and often best.

That is my $0.02. YMMV

Algorithmic detection of chart patterns

I have been building out some algorithms that rely on detecting “consolidation”. Chartists know what that looks like, but the term by itself is extremely vague to a quant. What to do?

I have often thought about the method used in creating “The Encyclopedia of Chart Patterns” and wondered if such a method could be used as part of an automated strategy. I read somewhere that the source code for the application used to build the data in that book was written in VB, and was for the author’s personal use. Sad, but I would have probably done the same thing.

Today I began reading “Foundations of Technical Analysis: Computational Algorithms, Statistical Inference, and Empirical Implementation” (2000) by Lo, Mamaysky, and Wang. What a great paper. It details some of the technical challenges of building such algorithms.

What I like about the paper is that it picked one smoothing algorithm, and basically said “we picked it and went with it, right or wrong”. My interpretation of that is “you can take this further, but we wanted to give you enough information to get you started”. Kudos to the authors.
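For intuition, here is a bare-bones sketch of a Nadaraya-Watson smoother with a Gaussian kernel, the flavor of estimator the paper discusses. This is my own simplification, not the authors’ code, and bandwidth selection (the hard part) is left as a parameter.

```cpp
// Nadaraya-Watson kernel regression over the bar index, with a Gaussian kernel.
#include <cmath>
#include <vector>

std::vector<double> kernel_smooth(const std::vector<double>& prices,
                                  double bandwidth) {
    const std::size_t n = prices.size();
    std::vector<double> smoothed(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        double weight_sum = 0.0, value_sum = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            const double u = (static_cast<double>(i) - static_cast<double>(j))
                             / bandwidth;
            const double w = std::exp(-0.5 * u * u);   // Gaussian kernel weight
            weight_sum += w;
            value_sum  += w * prices[j];
        }
        smoothed[i] = value_sum / weight_sum;          // weighted average price
    }
    return smoothed;
}
```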

Is it worth the effort to implement? Time will tell. But it certainly helped me get my head around some of the intricacies of smoothing estimators and finding optimal values for them.