Disclaimer: The information provided in this article is for educational and informational purposes only and does not constitute financial, investment, or trading advice. Trading cryptocurrencies involves significant risk of loss.
Welcome to Part 2 of our QuestDB series. In Part 1, we covered the three-tier storage architecture and schema design principles. Now we're getting into the truly differentiating feature set — the SQL extensions that make QuestDB feel like it was designed by traders, for traders.
Standard SQL was created in the 1970s for relational data. It knows nothing about time as a first-class concept. Every time-series operation in PostgreSQL or MySQL requires verbose workarounds — window functions, lateral joins, CTEs stacked three deep. QuestDB's extensions collapse these multi-paragraph queries into single, expressive statements.
Let's walk through each one with real trading examples.
SAMPLE BY: Native Time-Bucketed Aggregation
If there's one query that every trading system runs more than any other, it's OHLCV aggregation — turning raw trades into candlestick data. In standard SQL, you'd write something like:
-- Standard SQL: verbose and slow
SELECT
    date_trunc('minute', timestamp) AS bucket,
    symbol,
    (array_agg(price ORDER BY timestamp))[1] AS open,
    max(price) AS high,
    min(price) AS low,
    (array_agg(price ORDER BY timestamp DESC))[1] AS close,
    sum(quantity) AS volume
FROM trades
WHERE timestamp >= now() - interval '1 hour'
GROUP BY bucket, symbol
ORDER BY bucket;
In QuestDB, this becomes:
SELECT timestamp, symbol,
    first(price) AS open,
    max(price) AS high,
    min(price) AS low,
    last(price) AS close,
    sum(quantity) AS volume
FROM trades
WHERE timestamp IN today()
SAMPLE BY 1m;
That's it. SAMPLE BY 1m tells QuestDB to bucket data into 1-minute intervals, and first() / last() are native aggregation functions that respect the time ordering within each bucket. No date_trunc, no array_agg gymnastics, no explicit GROUP BY.
The available intervals are flexible: 1s, 5s, 15m, 1h, 1d, 7d — any combination of time units. For crypto markets that never sleep, you can align to calendar boundaries with timezone awareness:
SAMPLE BY 1d ALIGN TO CALENDAR TIME ZONE 'UTC';
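If you're used to pandas, the semantics of SAMPLE BY with first()/last() map directly onto a time resample. A minimal sketch with toy data — not QuestDB, just to make the bucket contents concrete:

```python
import pandas as pd

# Toy trade tape: three trades in the 12:00 bucket, one in the 12:01 bucket
trades = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2025-01-01 12:00:05", "2025-01-01 12:00:30",
        "2025-01-01 12:00:55", "2025-01-01 12:01:10",
    ]),
    "price": [100.0, 102.0, 101.0, 103.0],
    "quantity": [1.0, 2.0, 1.5, 0.5],
})

# Rough pandas equivalent of SAMPLE BY 1m with first()/last() aggregates
ohlcv = trades.set_index("timestamp").resample("1min").agg(
    {"price": ["first", "max", "min", "last"], "quantity": "sum"}
)
```

The first bucket opens at 100, prints a 102 high, and closes at 101 — exactly what `first(price)` / `last(price)` produce per SAMPLE BY interval.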
FILL: Handling Gaps in Data
Real markets have gaps — low-liquidity pairs might not trade for minutes or hours. SAMPLE BY supports several FILL strategies:
-- Fill with previous value (forward fill — common in finance)
SAMPLE BY 15m FILL(PREV);
-- Fill with linear interpolation
SAMPLE BY 15m FILL(LINEAR);
-- Fill with a constant
SAMPLE BY 15m FILL(0);
-- No fill — return NULL (default)
SAMPLE BY 15m FILL(NONE);
FILL(PREV) is the bread-and-butter for trading dashboards — if no trades occurred in a 15-minute bucket, carry forward the last known price. FILL(LINEAR) is more appropriate for continuous signals like funding rates or interest rates.
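The three fill strategies are easy to picture on a toy series (pandas here, purely for illustration):

```python
import pandas as pd

# One value per 15-minute bucket; the middle bucket had no trades
prices = pd.Series(
    [100.0, None, 104.0],
    index=pd.date_range("2025-01-01 12:00", periods=3, freq="15min"),
)

fill_prev = prices.ffill()          # FILL(PREV): carry the last known value
fill_linear = prices.interpolate()  # FILL(LINEAR): straight line between neighbours
fill_const = prices.fillna(0)       # FILL(0): constant
```

The empty bucket becomes 100 under PREV, 102 under LINEAR, and 0 under the constant fill.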
ASOF JOIN: The "Do What I Mean" of Market Data Alignment
This is QuestDB's crown jewel, and if you've worked with market data, you'll immediately understand why.
The fundamental problem: you have trades in one table and quotes (bid/ask) in another. You want to know what the prevailing quote was at the exact moment each trade was executed. In a regular database, timestamps almost never align perfectly — a trade at 12:00:00.123 needs to match a quote at 12:00:00.098, not 12:00:00.201.
ASOF JOIN solves this in one line:
SELECT trades.*, quotes.bid, quotes.ask
FROM trades
ASOF JOIN quotes ON (symbol);
For each row in trades, QuestDB finds the row in quotes with the latest timestamp that is less than or equal to the trade's timestamp, matching on the symbol column. No correlated subqueries. No window functions. No application-level logic.
This is the foundation of Transaction Cost Analysis (TCA) — comparing your execution price to the prevailing market at the time of execution. In PostgreSQL, the equivalent requires a LATERAL JOIN with ORDER BY and LIMIT 1 for each row, which is orders of magnitude slower on large datasets.
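pandas users will recognize this as merge_asof with its default backward direction. A toy reproduction of the 12:00:00.123 example above, just to pin down the semantics:

```python
import pandas as pd

quotes = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2025-01-01 12:00:00.098", "2025-01-01 12:00:00.201",
    ]),
    "symbol": ["BTC-USDT", "BTC-USDT"],
    "bid": [99.0, 99.5],
    "ask": [101.0, 101.5],
})
trades = pd.DataFrame({
    "timestamp": pd.to_datetime(["2025-01-01 12:00:00.123"]),
    "symbol": ["BTC-USDT"],
    "price": [100.0],
})

# For each trade, take the latest quote at or before the trade timestamp
tca = pd.merge_asof(trades, quotes, on="timestamp", by="symbol")
```

The trade at .123 picks up the .098 quote, not the .201 one — the "latest timestamp less than or equal to" rule in action.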
TOLERANCE: Preventing Stale Data Joins
Here's a subtlety that separates toy implementations from production systems. What if a quote for a thinly-traded asset is 5 minutes old at the time of a trade? In a volatile market, a 5-minute-old quote is essentially garbage. The default ASOF JOIN would still use it — it finds the latest match, no matter how stale.
QuestDB's TOLERANCE clause fixes this:
SELECT trades.*, quotes.bid, quotes.ask
FROM trades
ASOF JOIN quotes ON (symbol)
TOLERANCE 1s;
Now, if no matching quote exists within 1 second of the trade, the join returns NULL instead of stale data. This is critical for accurate TCA and for any analytics where data freshness matters.
As a bonus, TOLERANCE can significantly improve query performance. Without it, the engine might scan far back into the quotes table looking for a match. With TOLERANCE, it can terminate the backward scan early once records are too old to qualify.
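merge_asof exposes the same knob as a tolerance parameter; a five-minute-old quote falls outside a one-second tolerance and the join yields NaN instead of stale data:

```python
import pandas as pd

quotes = pd.DataFrame({
    "timestamp": pd.to_datetime(["2025-01-01 11:55:00"]),  # 5 minutes stale
    "symbol": ["ILLIQ-USDT"],
    "bid": [10.0],
})
trades = pd.DataFrame({
    "timestamp": pd.to_datetime(["2025-01-01 12:00:00"]),
    "symbol": ["ILLIQ-USDT"],
    "price": [10.5],
})

# Without tolerance the stale quote is used; with it, the match is rejected
stale = pd.merge_asof(trades, quotes, on="timestamp", by="symbol")
strict = pd.merge_asof(trades, quotes, on="timestamp", by="symbol",
                       tolerance=pd.Timedelta("1s"))
```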
LT JOIN and SPLICE JOIN
Two variations worth knowing: LT JOIN is like ASOF JOIN but matches strictly before the timestamp (never equal). This is useful when you need to avoid look-ahead bias in backtesting — you want the quote that existed before your trade, not one that arrived at the same microsecond.
SPLICE JOIN is a full ASOF in both directions: for each record in the left table it finds the prevailing right-table record, and for each record in the right table it finds the prevailing left-table record. The result is a merged, interleaved timeline of both data sources. This is particularly useful for creating unified event timelines from multiple data streams.
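The LT JOIN distinction maps onto merge_asof's allow_exact_matches=False: given a quote stamped at exactly the trade's timestamp, LT semantics skip it and take the strictly earlier one — which is precisely the look-ahead-bias guard described above.

```python
import pandas as pd

quotes = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2025-01-01 12:00:00.000", "2025-01-01 12:00:00.123",
    ]),
    "symbol": ["BTC-USDT", "BTC-USDT"],
    "bid": [99.0, 99.5],
})
trades = pd.DataFrame({
    "timestamp": pd.to_datetime(["2025-01-01 12:00:00.123"]),
    "symbol": ["BTC-USDT"],
    "price": [100.0],
})

asof = pd.merge_asof(trades, quotes, on="timestamp", by="symbol")  # <= match
lt = pd.merge_asof(trades, quotes, on="timestamp", by="symbol",
                   allow_exact_matches=False)                      # strictly < match
```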
HORIZON JOIN: Post-Trade Analysis in a Single Query
Introduced in QuestDB 9.3.3, HORIZON JOIN is purpose-built for markout analysis — the cornerstone of execution quality assessment and market microstructure research.
The question it answers: "After a trade was executed, how did the price evolve over the next N seconds?" Traditionally, this requires self-joins, UNION ALL across multiple ASOF queries, or pushing the logic to application code. HORIZON JOIN collapses it all:
SELECT h.offset / 1000000 AS offset_sec,
    avg(mid.price - fill.price) AS avg_markout
FROM fills
HORIZON JOIN mid_prices ON (symbol)
RANGE BETWEEN 0 AND 60s STEP 1s;
This computes the average price movement at 1-second intervals up to 60 seconds after each fill. The engine handles the time offset computation, the ASOF matching at each horizon point, and the aggregation — all in a single pass.
For non-uniform horizons — or to look before the event — use the LIST syntax:
HORIZON JOIN mid_prices ON (symbol)
LIST (-5s, -1s, 0, 1s, 5s, 30s, 60s);
This gives you both pre- and post-trade markout curves. Combined with filtering by venue, strategy, or order size, you can build a complete execution quality framework inside the database.
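To see what the engine is collapsing, here is a brute-force markout sketch in pandas — one ASOF lookup per horizon offset, then an average. Table and column names are made up for illustration:

```python
import pandas as pd

mids = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2025-01-01 12:00:00", "2025-01-01 12:00:01", "2025-01-01 12:00:05",
    ]),
    "symbol": ["BTC-USDT"] * 3,
    "mid": [100.0, 101.0, 103.0],
})
fills = pd.DataFrame({
    "timestamp": pd.to_datetime(["2025-01-01 12:00:00"]),
    "symbol": ["BTC-USDT"],
    "price": [100.0],
})

# One ASOF match per horizon offset: shift each fill forward, look up
# the prevailing mid at that moment, average the markout
markouts = {}
for off in ("0s", "1s", "5s"):
    shifted = fills.assign(timestamp=fills["timestamp"] + pd.Timedelta(off))
    m = pd.merge_asof(shifted.sort_values("timestamp"), mids,
                      on="timestamp", by="symbol")
    markouts[off] = (m["mid"] - m["price"]).mean()
```

HORIZON JOIN does all of this in a single pass instead of one scan per offset.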
QuestDB's cookbook includes five post-trade analysis patterns built on HORIZON JOIN: slippage analysis (execution vs. mid price), markout curves, implementation shortfall (the Perold decomposition), venue scoring for smart order routing, and flow toxicity detection (VPIN). Each one runs on their live demo with real data.
WINDOW JOIN: Correlating Events with Surrounding Data
WINDOW JOIN was introduced in QuestDB 9.3. It allows each row from a primary table to be joined with a time window of rows from another table, with aggregations computed over the matching rows.
Consider an FX trading scenario where you want to correlate each trade with the average bid and ask prices in the 10 seconds following the trade:
SELECT
trades.timestamp,
trades.symbol,
trades.price,
avg(quotes.bid) AS avg_bid,
avg(quotes.ask) AS avg_ask
FROM trades
WINDOW JOIN quotes ON (symbol)
RANGE BETWEEN 0 AND 10s FOLLOWING
INCLUDE PREVAILING;
The INCLUDE PREVAILING clause ensures you get the most recent price even if there's no exact match at the window boundary. This eliminates the pages of subqueries that short-horizon analytics typically require.
Use cases in trading: computing average market conditions around each execution, detecting unusual price behavior in a time window before large orders, correlating IoT/infrastructure events (network latency spikes) with execution quality.
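Conceptually, each trade row pulls in an aggregate over a time slice of the quotes table. A naive pure-Python sketch of the 0-to-10s-following window (hypothetical data, and without the prevailing-quote logic):

```python
from datetime import datetime, timedelta

quotes = [  # (timestamp, bid)
    (datetime(2025, 1, 1, 12, 0, 1), 99.0),
    (datetime(2025, 1, 1, 12, 0, 4), 99.4),
    (datetime(2025, 1, 1, 12, 0, 20), 98.0),  # outside the 10s window
]

def avg_bid_following(trade_ts, quotes, horizon=timedelta(seconds=10)):
    # Average bid over [trade_ts, trade_ts + horizon]
    window = [bid for ts, bid in quotes if trade_ts <= ts <= trade_ts + horizon]
    return sum(window) / len(window) if window else None

avg = avg_bid_following(datetime(2025, 1, 1, 12, 0, 0), quotes)
```

The database version does this per-row scan once, in bulk, without round-tripping data to the client.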
LATEST ON: Instant Current State
A deceptively simple but incredibly useful extension. LATEST ON returns the last row for each value of a partitioning column:
SELECT * FROM trades
LATEST ON timestamp PARTITION BY symbol, side
ORDER BY timestamp DESC;
This gives you the last trade for each (symbol, side) pair — essentially a real-time "current state" snapshot. In a traditional database, this would require a correlated subquery or a window function with ROW_NUMBER().
For trading dashboards showing the latest price across hundreds of symbols, LATEST ON is instantaneous. Combined with materialized views (which we'll cover in Part 3), it becomes the foundation for sub-millisecond portfolio snapshots.
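For intuition, the relational equivalent is "sort by time, take the last row per group" — here in pandas on toy data:

```python
import pandas as pd

trades = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2025-01-01 12:00:00", "2025-01-01 12:00:01",
        "2025-01-01 12:00:02", "2025-01-01 12:00:03",
    ]),
    "symbol": ["BTC-USDT", "BTC-USDT", "BTC-USDT", "ETH-USDT"],
    "side": ["buy", "sell", "buy", "buy"],
    "price": [100.0, 100.1, 100.2, 50.0],
})

# Last row per (symbol, side) pair — what LATEST ON returns
latest = (trades.sort_values("timestamp")
                .groupby(["symbol", "side"])
                .last())
```

Three (symbol, side) pairs exist, so three rows come back, each carrying the most recent trade for that pair.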
TWAP: Native Time-Weighted Average
The twap(price, timestamp) aggregate was added in QuestDB 9.3.3. Unlike VWAP which weights by volume, TWAP weights by duration — each price holds until the next observation, and the result is the area under the step function divided by total time.
SELECT symbol,
twap(price, timestamp) AS twap_value,
vwap(price, quantity) AS vwap_value
FROM trades
WHERE timestamp IN today()
SAMPLE BY 1h;
TWAP is the standard execution benchmark for algorithmic orders. Having it as a native aggregate that supports parallel GROUP BY and SAMPLE BY with all FILL modes means you don't need any client-side integration — the computation runs entirely inside the query engine.
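The definition is simple enough to verify by hand. A pure-Python sketch of the step-function TWAP described above, with each price weighted by how long it was in force:

```python
def twap(prices, times):
    """Time-weighted average: each price holds until the next observation,
    so the result is the area under the step function over total time.
    Note the final observation carries zero weight under this definition."""
    total = times[-1] - times[0]
    area = sum(p * (t2 - t1)
               for p, t1, t2 in zip(prices, times, times[1:]))
    return area / total

# Price sat at 100 for 10s, then at 110 for 10s: (100*10 + 110*10) / 20 = 105
value = twap([100.0, 110.0, 120.0], [0.0, 10.0, 20.0])
```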
Window Functions: The Foundation of Technical Analysis
QuestDB supports standard SQL window functions, which form the backbone of technical indicator computation:
-- Bollinger Bands using materialized OHLC view
WITH stats AS (
    SELECT timestamp, close,
        AVG(close) OVER (
            ORDER BY timestamp ROWS BETWEEN 19 PRECEDING AND CURRENT ROW
        ) AS sma20,
        AVG(close * close) OVER (
            ORDER BY timestamp ROWS BETWEEN 19 PRECEDING AND CURRENT ROW
        ) AS avg_close_sq
    FROM trades_OHLC_15m
    WHERE timestamp BETWEEN dateadd('h', -24, now()) AND now()
        AND symbol = 'BTC-USDT'
)
SELECT timestamp,
    sma20,
    sma20 + 2 * sqrt(avg_close_sq - sma20 * sma20) AS upper_band,
    sma20 - 2 * sqrt(avg_close_sq - sma20 * sma20) AS lower_band
FROM stats;
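The avg_close_sq trick relies on the identity Var(x) = E[x^2] - E[x]^2, which lets a second running average stand in for a windowed standard deviation. A quick pure-Python check:

```python
import statistics

closes = [100.0, 102.0, 101.0, 103.0]

sma = sum(closes) / len(closes)
avg_sq = sum(c * c for c in closes) / len(closes)

# Standard deviation recovered from the two running averages,
# exactly as the Bollinger query does with sqrt(avg_close_sq - sma*sma)
std_identity = (avg_sq - sma * sma) ** 0.5
```

It matches the population standard deviation, which is why the query needs only two window aggregates per band.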
QuestDB 9.3.3 also introduced the SQL-standard WINDOW clause — define a window specification once, reference it by name across multiple functions. No more repeating the same PARTITION BY and ORDER BY in every expression:
SELECT timestamp, symbol,
    avg(price) OVER w AS avg_price,
    stddev(price) OVER w AS std_price,
    min(price) OVER w AS min_price,
    max(price) OVER w AS max_price
FROM trades
WINDOW w AS (PARTITION BY symbol ORDER BY timestamp ROWS BETWEEN 99 PRECEDING AND CURRENT ROW);
Cleaner queries, less copy-paste, same execution performance.
Real-World Query Patterns
Let me show you some patterns that come up constantly in production trading systems:
Cross-Asset Correlation
-- Rolling hourly and daily correlation between ETH and another asset
WITH data AS (
    SELECT ETHUSD.timestamp,
        corr(ETHUSD.price, asset.price) AS corr
    FROM ETHUSD
    ASOF JOIN asset
    SAMPLE BY 1m
)
SELECT timestamp,
    avg(corr) OVER (ORDER BY timestamp RANGE BETWEEN 1 HOUR PRECEDING AND CURRENT ROW) AS hourly_corr,
    avg(corr) OVER (ORDER BY timestamp RANGE BETWEEN 24 HOUR PRECEDING AND CURRENT ROW) AS daily_corr
FROM data;
RSI Across All Symbols
-- 14-day RSI for all USDT pairs
WITH gains_losses AS (
    SELECT timestamp, symbol,
        CASE WHEN close > lag(close) OVER (PARTITION BY symbol ORDER BY timestamp)
             THEN close - lag(close) OVER (PARTITION BY symbol ORDER BY timestamp)
             ELSE 0 END AS gain,
        CASE WHEN close < lag(close) OVER (PARTITION BY symbol ORDER BY timestamp)
             THEN lag(close) OVER (PARTITION BY symbol ORDER BY timestamp) - close
             ELSE 0 END AS loss
    FROM trades_latest_1d
    WHERE symbol LIKE '%-USDT'
)
SELECT timestamp, symbol,
    100 - 100 / (1 + avg_gain / NULLIF(avg_loss, 0)) AS rsi
FROM (
    SELECT timestamp, symbol,
        AVG(gain) OVER (PARTITION BY symbol ORDER BY timestamp ROWS BETWEEN 13 PRECEDING AND CURRENT ROW) AS avg_gain,
        AVG(loss) OVER (PARTITION BY symbol ORDER BY timestamp ROWS BETWEEN 13 PRECEDING AND CURRENT ROW) AS avg_loss
    FROM gains_losses
);
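The same computation in a few lines of Python, using the simple moving averages the query uses (not Wilder's smoothing) — handy as a cross-check against the SQL:

```python
def rsi(closes, period=14):
    """RSI from simple moving averages of gains and losses (not Wilder-smoothed),
    matching the AVG(...) OVER (... 13 PRECEDING ...) windows in the query."""
    gains = [max(b - a, 0.0) for a, b in zip(closes, closes[1:])]
    losses = [max(a - b, 0.0) for a, b in zip(closes, closes[1:])]
    avg_gain = sum(gains[-period:]) / period
    avg_loss = sum(losses[-period:]) / period
    if avg_loss == 0:
        return 100.0  # no down moves in the window
    return 100 - 100 / (1 + avg_gain / avg_loss)
```

A monotonically rising series pins RSI at 100; equal gains and losses land it at 50, as expected.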
ATR (Average True Range)
-- 14-period ATR from 15-minute OHLC bars
WITH tr AS (
    SELECT timestamp, symbol,
        GREATEST(
            high - low,
            ABS(high - lag(close) OVER (PARTITION BY symbol ORDER BY timestamp)),
            ABS(low - lag(close) OVER (PARTITION BY symbol ORDER BY timestamp))
        ) AS true_range
    FROM trades_OHLC_15m
)
SELECT timestamp, symbol,
    AVG(true_range) OVER (PARTITION BY symbol ORDER BY timestamp ROWS BETWEEN 13 PRECEDING AND CURRENT ROW) AS atr_14
FROM tr;
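And the true-range step in plain Python, for intuition: the largest of the bar's own range and the two gap terms against the previous close.

```python
def true_range(high, low, prev_close):
    """Largest of: bar range, gap up from prev close, gap down from prev close."""
    return max(high - low, abs(high - prev_close), abs(low - prev_close))

def atr(bars, period=14):
    """bars: list of (high, low, close) tuples; simple moving average of TR,
    averaging however many bars are available, like the SQL window does."""
    trs = [true_range(h, l, prev[2])
           for prev, (h, l, _) in zip(bars, bars[1:])]
    window = trs[-period:]
    return sum(window) / len(window)

# A bar gapping up past the previous close: TR is the gap, not the bar range
sample_atr = atr([(10, 9, 9.5), (11, 10, 10.5)])
```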
The SQL Advantage Over Custom Code
You might wonder: why compute these indicators inside the database when you could pull raw data into Python and use ta-lib or pandas?
Three reasons. First, data locality — moving terabytes of tick data across the network to compute a rolling average is wasteful. The computation should live where the data lives. Second, parallelism — QuestDB's vectorized engine with JIT compilation can process these queries across multiple cores with SIMD instructions, often faster than single-threaded Python code. Third, consistency — when multiple dashboards, strategies, and monitoring systems all need the same indicators, having a single source of truth in the database eliminates synchronization bugs.
That said, QuestDB isn't trying to replace your entire analytics stack. The Parquet interop means you can still pull historical data directly into your ML pipeline without going through the database for batch workloads.
Coming Up in Part 3
In the final part, we'll cover materialized views for real-time OHLC (including cascaded views at multiple timeframes), 2D arrays for native order book analytics, and a reference architecture for a complete QuestDB-powered algorithmic trading platform.
Citation
@software{soloviov2025questdb_algotrading_p2,
author = {Soloviov, Eugen},
title = {QuestDB for Algorithmic Trading: SQL Extensions That Change the Game},
year = {2025},
url = {https://marketmaker.cc/en/blog/post/questdb-algotrading-sql},
version = {0.1.0},
description = {Deep dive into QuestDB's time-series SQL extensions: SAMPLE BY, ASOF JOIN, HORIZON JOIN, WINDOW JOIN, LATEST ON, and real-world trading query patterns.}
}