Interactive Brokers Algo Optimizer App (using Thompson Sampling)

Why Thompson Sampling over Traditional A/B Testing, a t-test or ANOVA?

If you want to compare two or more execution algorithms, the standard institutional approach is traditional fixed-horizon A/B testing evaluated with a t-test, or with ANOVA if you have more than two algos. Have fun with ANOVA if you decide to go that route! 😱

To make matters worse, classic A/B testing has a serious flaw for live trading: it forces you to lose money. To reach statistical significance, a fixed-horizon test requires you to keep routing an equal share of trades to a strategy that may already look clearly inferior, for weeks or months, just to satisfy a sample size that was fixed in advance. In data science this is known as the exploration vs. exploitation dilemma, and traditional testing resolves it by over-exploring at your expense.
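
To give a feel for that "static sample-size calculation", here is a minimal sketch using the standard normal-approximation formula for comparing two success rates. The 50% vs. 55% rates, the 5% significance level, and the 80% power are illustrative assumptions, and the function name is made up for this example; none of it comes from the app.

```python
from math import ceil
from statistics import NormalDist


def trades_per_arm(p_a: float, p_b: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate trades needed per algorithm for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    variance = p_a * (1 - p_a) + p_b * (1 - p_b)   # sum of the two Bernoulli variances
    return ceil((z_alpha + z_power) ** 2 * variance / (p_a - p_b) ** 2)


# Hypothetical example: telling a 50% algo apart from a 55% algo.
print(trades_per_arm(0.50, 0.55))
```

Under those assumptions the answer is about 1,560 trades per algorithm, and half of all that traffic goes to the weaker one for the whole test.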

This app solves that problem by replacing hypothesis testing with Thompson Sampling (a core Multi-Armed Bandit algorithm). It also walks you through the process and leaves the math to the algorithm.

The Math: Asymptotically Optimal for Least Regret

Thompson Sampling has been mathematically proven to be asymptotically optimal for cumulative regret (the opportunity cost of using inferior execution paths) in the success/failure setting used here. Instead of dividing traffic evenly, the engine keeps a Beta distribution (the conjugate prior for success/failure outcomes) for each algorithm and updates it in real time as trades complete.
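
For reference, this is how cumulative regret and the per-algorithm posterior are usually written. The notation is mine rather than the app's: p_k is the unknown true success rate of algorithm k, a_t is the algorithm chosen for trade t, and s_k and f_k are the successes and failures recorded so far. The uniform Beta(1, 1) starting prior is an assumption for illustration.

```latex
R_T = \mathbb{E}\left[\sum_{t=1}^{T} \left(p^{*} - p_{a_t}\right)\right],
\qquad p^{*} = \max_k p_k

p_k \mid \text{data} \sim \mathrm{Beta}(\alpha_k, \beta_k),
\qquad \alpha_k = 1 + s_k, \quad \beta_k = 1 + f_k
```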

For the next trade, the app runs a randomized trial: it samples from each algorithm's current posterior distribution and routes the trade to whichever path returns the highest simulated success rate.
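
The routing step described above can be sketched in a few lines. This is an illustration under stated assumptions, not the app's actual code: the function name, the algo labels, and the (alpha, beta) counts are all made up for the example.

```python
import random


def choose_algo(posteriors: dict[str, tuple[float, float]], rng: random.Random) -> str:
    """Draw one sample from each Beta(alpha, beta) posterior and return the highest draw's name."""
    draws = {name: rng.betavariate(a, b) for name, (a, b) in posteriors.items()}
    return max(draws, key=draws.get)


# Hypothetical (alpha, beta) state for three execution algos.
posteriors = {"Adaptive": (12.0, 6.0), "VWAP": (8.0, 9.0), "TWAP": (3.0, 4.0)}
print(choose_algo(posteriors, random.Random()))
```

Because the pick is a random draw rather than "always take the current leader", an algo with little data and a wide posterior still wins the draw some of the time, which is where the exploration comes from.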

  • The Result: The engine dynamically shifts traffic toward the winning execution paths as evidence accumulates, sharply reducing exposure to inferior strategies while still allocating enough exploratory traffic to unproven algorithms that the system doesn't lock onto a suboptimal one too early.

Why it is conceptually simpler than a t-test:

Ironically, the math behind Thompson Sampling makes it far simpler to operate than a standard A/B framework:

  • No Arbitrary Sample Sizes: You don't need to calculate statistical power, effect sizes, or target sample sizes before you start.

  • No "P-Hacking" or Early Peeking: In a t-test, checking the data early and stopping as soon as it looks significant ruins the math (it inflates the false-positive rate). With Thompson Sampling, you can view the data, stop, or add new algorithms at any time. The system is inherently self-correcting.

  • Dead Simple Updates: The Beta-Binomial update requires no complex matrix math or lookup tables. When a trade finishes, the background loop literally just executes a basic arithmetic update:

    • If Success: $\alpha = \alpha + 1$

    • If Failure: $\beta = \beta + 1$

    • You can keep track of your trades and the result in the app

    • You define whether a trade is a success or failure by comparing your fill price to a benchmark such as VWAP or the opening price
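
As a sketch of the update described in the bullets above: the `Arm` class, the function names, and the Beta(1, 1) starting values are my own illustration. The success rule here (a buy beats the benchmark by filling lower, a sell by filling higher, ties count as failures) is one reasonable assumption; the app lets you define success yourself.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    alpha: float = 1.0  # assumed uniform Beta(1, 1) starting prior
    beta: float = 1.0


def is_success(side: str, fill_price: float, benchmark_price: float) -> bool:
    """Assumed rule: buys succeed below the benchmark, sells succeed above it."""
    if side == "BUY":
        return fill_price < benchmark_price
    if side == "SELL":
        return fill_price > benchmark_price
    raise ValueError(f"unknown side: {side!r}")


def record_trade(arm: Arm, success: bool) -> None:
    """The whole Beta-Binomial update: add one to alpha or to beta."""
    if success:
        arm.alpha += 1
    else:
        arm.beta += 1


arm = Arm()
record_trade(arm, is_success("BUY", fill_price=99.95, benchmark_price=100.00))
print(arm)
```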

The engine does the heavy lifting via local Monte Carlo simulations to map out the 95% Credible Intervals and the absolute "Probability of Being Best" for each execution arm.
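
Here is a minimal sketch of how a Monte Carlo pass like that could work. It is not the app's code: the function name, the number of draws, and the example counts are assumptions. The idea is to draw many joint samples from the Beta posteriors, read the 2.5% and 97.5% quantiles off each algo's samples, and count how often each algo has the highest draw.

```python
import random


def summarize(posteriors: dict[str, tuple[float, float]], n_draws: int = 10_000, seed: int = 0) -> dict:
    """Estimate a 95% credible interval and the probability of being best for each arm."""
    rng = random.Random(seed)
    names = list(posteriors)
    samples = {name: [] for name in names}
    wins = dict.fromkeys(names, 0)
    for _ in range(n_draws):
        draw = {name: rng.betavariate(*posteriors[name]) for name in names}
        for name in names:
            samples[name].append(draw[name])
        wins[max(draw, key=draw.get)] += 1  # the arm with the highest draw wins this round
    summary = {}
    for name in names:
        ordered = sorted(samples[name])
        low = ordered[int(0.025 * n_draws)]       # empirical 2.5% quantile
        high = ordered[int(0.975 * n_draws) - 1]  # empirical 97.5% quantile
        summary[name] = {"ci95": (low, high), "p_best": wins[name] / n_draws}
    return summary


# Hypothetical (alpha, beta) counts for two algos.
for name, stats in summarize({"Adaptive": (30.0, 10.0), "VWAP": (18.0, 14.0)}).items():
    print(name, stats)
```

The "probability of being best" values sum to 1 across the algos, so they read as a direct answer to "which one should I be using?" rather than as a p-value.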

  1. IB Algo Bandit Optimizer Live Application — Open this to launch the tool instantly in your browser, load the demo data, and see the Bayesian engine dynamically converge on the true probabilities. You can then erase the demo and use it for your own trading if you like.

  2. GitHub Source Code Repository — Go here to view the raw front-end code, read the "as-is" public domain license, or fork the project to build your own custom modifications.

I did improve the demo portion of the app. It now embeds the true underlying probabilities of success for each algorithm (which are randomized each time you reload). So now you can see if the engine has successfully found the best algorithm after just 60 trials.

Once it identifies the top performer, it will begin to exploit it more heavily and explore the inferior algorithms less. It still has to spend some initial energy exploring to identify the baseline truth (there is no way around that), but Thompson Sampling is proven to do this with asymptotically optimal regret. The true probabilities are now represented by the orange vertical hash marks on the bars.
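
If you want to play with the same idea outside the browser, here is a rough stand-alone sketch of a demo loop of this kind. It is not the app's code: the function name, the algo labels, the hidden success rates, and the Beta(1, 1) starting priors are all assumptions for illustration.

```python
import random


def run_demo(true_rates: dict[str, float], n_trials: int = 60, seed=None):
    """Simulate Thompson Sampling against hidden true success rates."""
    rng = random.Random(seed)
    posteriors = {name: [1.0, 1.0] for name in true_rates}  # [alpha, beta] per algo
    routed = dict.fromkeys(true_rates, 0)
    for _ in range(n_trials):
        # Sample each posterior and route the trade to the highest draw.
        draws = {name: rng.betavariate(a, b) for name, (a, b) in posteriors.items()}
        chosen = max(draws, key=draws.get)
        routed[chosen] += 1
        # The simulated trade succeeds with the chosen algo's hidden true rate.
        success = rng.random() < true_rates[chosen]
        posteriors[chosen][0 if success else 1] += 1
    return routed, posteriors


# Hypothetical hidden success rates for three algos.
routed, posteriors = run_demo({"Adaptive": 0.65, "VWAP": 0.50, "TWAP": 0.40})
print(routed)
print(posteriors)
```

Run it a few times: with only 60 trials and rates this close together, the split of routed trades varies from run to run, which is the initial exploration cost mentioned above.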

I think the app is fully usable for trading now. Future community improvements to the demo mode might include adding manual adjustments for the true success rates, changing the total number of simulated trials, or dynamically adding/removing custom algorithms to see how the convergence speed scales.