This page documents some of the really cool things that I wish I had had the time to pursue during my day-to-day work, but instead did in my spare time. Some of them went on to be implemented, others I still have hope for, and then there are the ideas that I’ve since come to consider sub-optimal. The overarching theme is a desire to replace code with data and algorithms on that data.
I’d read the introduction first and then feel free to jump around; most of the sections should stand alone. Personally I recommend the market DSL, my first and most academic idea.
If I were to describe the purpose of the algorithms we develop in a single word it would be “derivatives” (with “related contingencies” a distant second). Okay, let’s back up a bit and talk about sports betting in general. Here’s a crudely drawn representation of a very small offering for a soccer match:
A selection is an outcome that can be bet on, for example “the home team will win”. Each selection has a probability of occurring, which is overrounded to produce a price representing how much the bet could win. Markets are groups of related selections; if exactly one selection can win and the probabilities sum to one I call it a proper market. As developers we’re interested in probabilities and proper markets, because prices and markets are just transformations of them with fewer invariants.
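To make the probability-to-price transformation concrete, here’s a minimal sketch; the 5% overround and the probabilities are invented for illustration, not real figures:

```python
import numpy as np

# A hypothetical match-winner market: home win, draw, away win.
probabilities = np.array([0.50, 0.25, 0.25])  # sums to one: a proper market

# Overrounding inflates the implied probabilities, so the decimal prices
# pay out slightly less than fair odds would.
OVERROUND = 1.05
prices = 1 / (probabilities * OVERROUND)

print(prices)              # ~[1.90, 3.81, 3.81]
print((1 / prices).sum())  # 1.05: the bookmaker's margin
```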
Another way of grouping markets is into driving and derivative, where the driving markets are those that contain enough information to calculate the derivatives. The set of driving markets varies per sport and model, but it’s typically a supremacy market such as “who will score the most points?” and an expectancy market such as “how many points will be scored in total?”. The algorithms take the burden of ensuring that all prices are consistent off the traders (the domain experts who set the prices for an event) by reducing their workload to pricing just the driving markets, allowing them to trade more events. Even this work can be outsourced by receiving the driving markets from an odds feed.
We call the process of fitting the model to the prices of the driving markets parameter derivation:
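As a minimal sketch of the idea, assume an independent-Poisson soccer model (the model and the numbers are illustrative, not our production ones). Quotes for supremacy and expectancy pin down the two goal means directly:

```python
# Hypothetical driving-market quotes for a soccer match.
supremacy = 0.4    # E[home goals - away goals]
expectancy = 2.6   # E[home goals + away goals]

# Under an independent-Poisson model the parameters follow by elimination:
#   mu_home + mu_away = expectancy
#   mu_home - mu_away = supremacy
mu_home = (expectancy + supremacy) / 2  # 1.5
mu_away = (expectancy - supremacy) / 2  # 1.1
```

In practice the driving markets quote prices rather than moments, so the real derivation is a numerical fit, but the shape of the problem is the same.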
Once we have parameters we can use them to build the model. The model provides data structures from which we can extract probabilities of outcomes, a process we call calculation:
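Continuing the illustrative Poisson sketch, the structure is a grid of scoreline probabilities, and an extractor is any function that reads probabilities out of it:

```python
import numpy as np
from scipy.stats import poisson

def score_grid(mu_home, mu_away, size=21):
    """A structure: the joint probability of every scoreline up to size - 1."""
    home = poisson.pmf(np.arange(size), mu_home)
    away = poisson.pmf(np.arange(size), mu_away)
    return np.outer(home, away)  # grid[h, a] = P(home scores h, away scores a)

grid = score_grid(1.5, 1.1)

# An extractor: the probability of "over 2.5 goals".
h, a = np.indices(grid.shape)
over_2_5 = grid[h + a > 2.5].sum()
```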
It’s the relationship between structures and extractors that gives rise to proper markets: a structure contains all the possibilities at a moment in time, so an extractor that assigns each probability within it to exactly one selection is guaranteed to calculate a proper market.
Some say that Python is a terrible choice of language for high-performance numerical computation. You’ll get no argument from me (or anyone on my team), but we’re stuck with it for now, so I try to push the calculations into NumPy. Sean Parent’s advice of “no raw loops” is something to live by, and this time it manifests as a rare example of optimization:
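The raw-loop baseline might look something like this (a reconstruction, since the original snippet isn’t reproduced here):

```python
def who_will_win_loops(grid):
    """Extract "who will win?" by looping over every scoreline cell."""
    home = draw = away = 0.0
    for h in range(grid.shape[0]):
        for a in range(grid.shape[1]):
            if h > a:
                home += grid[h, a]
            elif h == a:
                draw += grid[h, a]
            else:
                away += grid[h, a]
    return away, draw, home
```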
“Who will win?” is a proper market, which means we can calculate it by assigning each cell a bin and summing the cells in each bin, a.k.a. numpy.bincount.
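A vectorised version, reusing the grid shape from the sketches above:

```python
import numpy as np

def who_will_win_bincount(grid):
    """Assign each cell a bin by the sign of the score difference, then sum."""
    h, a = np.indices(grid.shape)
    bins = np.sign(h - a) + 1  # 0 = away wins, 1 = draw, 2 = home wins
    return np.bincount(bins.ravel(), weights=grid.ravel(), minlength=3)
```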
Well, this is a little faster, but my experiments in C led me to expect a 100× speed-up. Let’s try with a more realistic 21×21 grid:
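A harness along these lines reproduces the comparison, using the two sketched extractors; your numbers will vary by machine and NumPy version:

```python
import timeit
import numpy as np

# A random proper grid: 441 cells that sum to one.
grid = np.random.dirichlet(np.ones(21 * 21)).reshape(21, 21)

for fn in (who_will_win_loops, who_will_win_bincount):
    print(fn.__name__, timeit.timeit(lambda: fn(grid), number=10_000))
```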
For a few minutes I was worried that the for-based one wasn’t going to terminate, but we got there in the end! Of course, if we had been able to write in C (or C++, Fortran, etc.) we could have gone even faster.
Calculating a million markets in a second is overkill, as even our biggest events only have two hundred or so. I wonder what we could do with all that extra CPU, maybe something involving an arbitrary number of markets and selections…
This was my first major breakthrough, and it forged my belief that (at least) my whole team could stop writing code if we could just find the right abstractions. Now that I am more experienced this idea is no longer what I would propose, but it’s close. The original prototype is on GitHub.
Resulting is the process that decides which selections have won, which have lost, and which are not yet decided. You could use the algorithms to do this, taking zero as a loss, one as a win, and everything else as undecided. Unfortunately we offer markets that the algorithms don’t model, and so instead we have two implementations that both understand what each market means. This is obviously wasteful, but worse, it also leads to subtle errors when the definitions are slightly different. This led me to design a language in which you could express the meaning of a market:
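The prototype itself is on GitHub; as a stand-in, here’s a hypothetical flavour of such a language, with each selection’s meaning as a small expression tree over the event state (the names and encoding are illustrative, not the prototype’s):

```python
# Hypothetical match-winner definitions: each selection is an expression
# over the state variables the event exposes (here, the two goal counts).
MATCH_WINNER = {
    "home": (">", "home-goals", "away-goals"),
    "draw": ("=", "home-goals", "away-goals"),
    "away": ("<", "home-goals", "away-goals"),
}
```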
In this language resulting is eval and “which selections are in this market?” is members·typeof:
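Sticking with the illustrative encoding above, both operations are short walks over the definitions (evaluate stands in for eval to avoid shadowing the Python builtin):

```python
OPS = {">": lambda a, b: a > b,
       "=": lambda a, b: a == b,
       "<": lambda a, b: a < b}

def evaluate(expr, state):
    """Resulting: walk an expression tree against a (final) state."""
    if isinstance(expr, tuple):
        op, *args = expr
        return OPS[op](*(evaluate(arg, state) for arg in args))
    return state.get(expr, expr)  # a variable lookup, or a literal

def members(market):
    """Which selections are in this market?"""
    return list(market)

state = {"home-goals": 2, "away-goals": 1}
winners = [s for s, expr in MATCH_WINNER.items() if evaluate(expr, state)]
# ["home"]
```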
Another particularly interesting walker estimates the relative magnitudes of the probabilities of selections in a given state, and that can be used to estimate the direction each probability will move given a state change:
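I don’t have the real walker to hand, so this standalone sketch hard-codes a crude magnitude for the match-winner market just to show the interface; the actual walker derives the magnitudes from the expression trees:

```python
def magnitude(selection, state):
    """A crude ordering proxy: how favourable the goal difference is."""
    diff = state["home-goals"] - state["away-goals"]
    return {"home": diff, "draw": -abs(diff), "away": -diff}[selection]

def directions(before, after):
    """Predicted sign of each probability's move on a state change."""
    return {s: (magnitude(s, after) > magnitude(s, before))
               - (magnitude(s, after) < magnitude(s, before))
            for s in ("home", "draw", "away")}

# A home goal at 1-1 should push "home" up and the others down:
directions({"home-goals": 1, "away-goals": 1},
           {"home-goals": 2, "away-goals": 1})
# {"home": 1, "draw": -1, "away": -1}
```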
We can use this as an unsophisticated way to verify that our algorithms behave sanely in all states. This function is also particularly applicable to a problem we face with our odds feeds. We receive odds from several sources and aggregate them before fitting the algorithm to them; the problem is that not all feeds update at the same time, so we need to discard prices that predate the latest change in state, but many of the feeds don’t provide the state the prices apply to. Comparing the directions of the price deltas against the predicted directions is a quick and dirty way to detect this situation. It is also somewhat possible to infer a change of state from the deltas, but because many factors can influence prices we can only suspend betting until we receive the accompanying state change; this is still useful to protect our clients from bad bets.
It is even possible to use this model to extract markets from the algorithms, but the difficulty I had in making this efficient convinced me that it couldn’t be the unifying theory I hoped it would be; it’s just too powerful. However, it did lead me most of the way to the current design: the market definitions fall into one of a small number of groups, so I have taken those as basis functions.
This section is based on a draft email that walks the reader through my thought process in developing a replacement for our event state management system. Open questions include how to model time; what to do about the explosion of states if we model several scores, for example goals, corners and cards in soccer; and what to do about cyclic machines, as in tennis.
We have thousands of lines of code that answer the questions “what can happen?” and “what has happened?” throughout the system; the algorithms obviously focus on the first, but we have to interact with plenty of code that answers the second. Of course the problem with code is that it’s easy to let it get out of sync. Nowhere is this more apparent than in tennis, where the different formats are supported to varying degrees: for the longest time the serving rules in tiebreaks were subtly wrong, but only if a trader was manually controlling the state.
We model the state of an event as an ordered list of incidents such as “the home team scored at minute X” or “the serving player scored”. Given a history, the two fundamental operations are get-state, which folds the incidents into the current state, and set-state, which finds a sequence of incidents that reaches a given state.
Incidents remind me of transitions in a finite-state machine. Consider a three-set tennis match; we can model it as an FSM as follows:
There are many ways to encode this into a program; I have chosen to write a function, because we’d need an infinitely large table to represent advantage sets once we start modeling games:
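A sketch of such a function for the sets-only machine (the incident names are placeholders):

```python
def transitions(state):
    """Map each incident possible from `state` to the state it leads to."""
    home, away = state  # sets won so far
    if home == 2 or away == 2:
        return {}       # best-of-three: the match is over
    return {"home wins set": (home + 1, away),
            "away wins set": (home, away + 1)}
```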
We can trivially build get-state and set-state from this function:
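For instance (a sketch against the transition function above; the real system also has to cope with much larger machines):

```python
def get_state(history, initial=(0, 0)):
    """Fold the incidents over the transition function."""
    state = initial
    for incident in history:
        state = transitions(state)[incident]
    return state

def set_state(target, state=(0, 0)):
    """Depth-first search for any incident sequence reaching `target`."""
    if state == target:
        return []
    for incident, successor in transitions(state).items():
        path = set_state(target, successor)
        if path is not None:
            return [incident] + path
    return None

get_state(["home wins set", "away wins set"])  # (1, 1)
set_state((0, 2))  # ["away wins set", "away wins set"]
```

Note how set_state wanders through (1, 0) and backtracks before finding the answer, which is exactly the problem described next.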
The weakness of using functions to define transitions is apparent: we should be able to prune the search tree, as there’s no need to search past (1, 0) if the target is (0, 2), but that knowledge isn’t reified anywhere.
Even worse, set-state has a problem: when there are multiple paths to a state it picks an arbitrary one. This is actually how the existing system behaves, but if one day the algorithms start caring about the order of events we’d need to solve it properly. One approach is to record an ordered list of states, and to store the transitions between each state only if known.
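In data, that record could look something like this sketch:

```python
# Successive observed states, with the incidents between them stored only
# when they are actually known (None marks a trader setting the state).
history = [((0, 0), None),
           ((1, 1), None),                # jumped here by a trader
           ((2, 1), ["home wins set"])]   # reached by an observed incident
```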
Another approach, proposed in my email, is to store all possible transitions, but this is complicated by cyclic graphs, such as tennis games with deuces, and the benefits are smaller: knowing that there are multiple possible transitions affects correctness, whereas knowing which transitions they are only affects completeness. The upshot is that resulting does a slightly better job because it can eliminate some cases that didn’t happen.
The migration to microservices required a rewrite of the part of our code responsible for interfacing with the ORM. This code was tied up with the calculation process as a whole, so I decided to take the chance to improve our error reporting. The previous system was implemented as for loops, with try blocks to isolate failures and log-and-continue to skip the markets that didn’t need calculating. This meant that to answer a simple question such as “why isn’t this market priced?” you’d have to consult the logs on the production boxes, even for user errors such as “the algorithm can’t price that market” or “the algorithm parameters are not sufficient”.
We could have addressed this by introducing a list of reasons why markets were skipped and making sure to append to it whenever we make a decision, but that could only be enforced with code reviews, and I much prefer having code police itself. When you think about it, the bodies of our loops are essentially functions on a single market that return either a processed market or a reason why that market could not be processed, and we want to apply those functions in series to all the non-skipped markets, a.k.a. the either and list monads.
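A minimal Python rendition of the idea, using an exception class as the Left constructor; the step functions are invented for illustration:

```python
class Skip(Exception):
    """A Left: the human-readable reason a market could not be processed."""

def process(market, steps):
    """Thread one market through the steps, short-circuiting on the first Skip."""
    try:
        for step in steps:
            market = step(market)
        return ("priced", market)
    except Skip as reason:
        return ("skipped", str(reason))

def check_supported(market):
    if market["type"] != "match-winner":
        raise Skip("the algorithm can't price that market")
    return market

def price(market):
    return {**market, "prices": [1.9, 3.8, 3.8]}

results = [process(m, [check_supported, price])
           for m in ({"type": "match-winner"}, {"type": "first-corner"})]
# [("priced", {...}), ("skipped", "the algorithm can't price that market")]
```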
The Left instances contain human-readable strings that are displayed to the traders as tooltips on the buttons that control whether a market is offered or not. Of course in Python we’re using exceptions to early-exit with reasons and implementing all the control flow ourselves, but apart from being wordy there’s no difference.
The algorithms codebase is particularly uniform: there are a handful of functions to test and a large dataset to test them on. Inspired by Jay Fields’ Expectations, we have a family of test-suite generating functions that provide the “when” when given data structures representing the “given” and “then”:
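In pytest terms such a generator might look like the following sketch; the case data and the extractor are carried over from the earlier examples, not taken from the real suite:

```python
import numpy as np
import pytest

def extractor_suite(extractor, cases):
    """Generate one test per (given, then) pair; the shared "when" is
    simply running the extractor."""
    @pytest.mark.parametrize("given, then", cases)
    def test(given, then):
        assert np.allclose(extractor(given), then)
    return test

# A uniform 2x2 grid makes the expected probabilities easy to hand-check.
uniform = np.full((2, 2), 0.25)
test_who_will_win = extractor_suite(who_will_win_bincount,
                                    [(uniform, [0.25, 0.5, 0.25])])
```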
The idea is to remove all the boilerplate so that there’s no excuse for not writing tests when a new market is added. The terseness also reduces the noise that could distract a code reviewer, which in conjunction with well-named “given”s makes it easier to detect erroneous tests.
The format of the expected values is designed for convenience; for example, the tests for extractors specify probabilities rather than indices, so the test runner generates a handful of grids with random probabilities and checks that the extracted probabilities match the sums of the cells.