High-Performance Backtest Engines
How to build a backtest engine that runs hundreds of times faster without changing a single PnL number — data layout, caching, adaptive resolution, and architecture, from first speedups to production internals.
- 01
Jul 1, 2026 #algotrading The Backtest Speed Ladder: 298x on a Laptop CPU, Identical PnL to the Last Trade
Five implementations of the same 80-combo parameter sweep, all verified to produce identical PnL: pandas rolling.apply takes 69.9 s, numpy 3.1 s, numba 2.0 s, and parallel numba 0.23 s. That is a measured 298x speedup on an Apple M2 Max with zero hardware changes, and still ~13x over a competent vectorized baseline. What each rung buys, why a GPU is not the missing piece, and where the real bottleneck in mass parameter search lives.
- 02
Mar 16, 2026 #algotrading Aggregated Parquet Cache: How to Speed Up Multi-Timeframe Backtests by Hundreds of Times
How to precompute timeframes and indicators from minute candles, save them to parquet, and use them for mass strategy testing without redundant recalculations.
- 03
Mar 17, 2026 #algotrading Adaptive Drill-Down: Backtest with Variable Granularity from Minutes to Raw Trades
How adaptive data granularity speeds up backtests and saves storage: drill-down from 1m to 1s, 100ms, and raw trades only where price moved significantly or volume spiked, not across the entire historical series.
- 04
Jul 2, 2026 #algotrading The IPC Tax: Put the Backtest Engine Behind a Socket and Lose 13% — Almost None of It to the Socket
We ported a numba backtest kernel line-for-line to Rust and called it across a process boundary four ways, with an equivalence gate confirming identical PnL to the last trade. Shipping the entire 1.2 MB price series through a Unix socket costs ~2 ms — about 0.1% of the job. JSON-encoding the same payload costs 1348x more than raw bytes, chatty per-combo calls re-ship the data 80 times, and a per-bar call pattern would pay 2.1 s of pure IPC on a 2.0 s job. The boundary is cheap; the tax is in how you cross it.