PnL by Active Time: The Metric That Changes Strategy Rankings

MarketMaker.cc Team
Quantitative Research and Strategy

You have two strategies. The first: PnL +300%, 418 trades, position open 45% of the time. The second: PnL +27%, 38 trades, position open 5% of the time. Which one is better?
If you chose the first one — you answered incorrectly. Here is why.
Raw PnL — the total return over the entire backtest period — does not account for what fraction of time the strategy was in a position. A strategy with +300% and 45% trading time uses your capital less than half the time. The remaining 55% of the time, capital sits idle.
A strategy with +27% and 5% trading time uses capital only 5% of the time — but the remaining 95% is available for other strategies.
If you run a portfolio of strategies through an orchestrator, one strategy's idle time is filled by others. The key metric then becomes not how much a strategy earned over a year, but how much it earns per unit of active time.

The effective return formula:

    PnL_per_day = total_PnL / (test_period_days × trading_time_pct)
    annualized_effective = PnL_per_day × 365 × fill_efficiency

where total_PnL is the raw backtest return in %, trading_time_pct is the fraction of time in position (0..1), and fill_efficiency is the fraction of a strategy's idle time the orchestrator can fill with other strategies. In code:
```python
def pnl_per_active_time(
    total_pnl: float,               # total PnL, %
    test_period_days: int,          # backtest length, days
    trading_time_pct: float,        # fraction of active time, 0..1
    fill_efficiency: float = 0.80,  # slot fill efficiency
) -> dict:
    """Calculate effective return per unit of active time."""
    active_days = test_period_days * trading_time_pct
    pnl_per_day = total_pnl / active_days
    annualized_raw = pnl_per_day * 365
    annualized_effective = annualized_raw * fill_efficiency
    return {
        "active_days": active_days,
        "pnl_per_day": pnl_per_day,
        "annualized_raw": annualized_raw,
        "annualized_effective": annualized_effective,
    }
```
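As a sanity check, the effective-return column of the table below can be reproduced directly from the formula (inputs are the table's own numbers):

```python
# Reproduce the effective annualized return for the three strategies:
# 750-day period, fill_efficiency = 0.80.
fill_eff, period_days = 0.80, 750
table = {"C": (300, 0.45), "B": (27, 0.05), "A": (58, 0.15)}  # (PnL %, trading time)

effective = {}
for name, (pnl, t_pct) in table.items():
    active_days = period_days * t_pct
    effective[name] = pnl / active_days * 365 * fill_eff
# effective is approximately {"C": 259.6, "B": 210.2, "A": 150.5}
```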
Period: 750 days (25 months), fill_efficiency = 0.80:
| Strategy | PnL | Trading time | Active days | PnL/day | Annualized (x0.8) |
|---|---|---|---|---|---|
| Strategy C | +300% | 45% | 337.5 | 0.89%/d | 259% |
| Strategy B | +27% | 5% | 37.5 | 0.72%/d | 210% |
| Strategy A | +58% | 15% | 112.5 | 0.51%/d | 150% |
By raw PnL: Strategy C (300%) >> Strategy A (58%) >> Strategy B (27%). By effective return: Strategy C (259%) > Strategy B (210%) > Strategy A (150%).
Strategy B with 27% PnL turns out to be comparable to Strategy C with 300% PnL — because it earns nearly the same amount per active day while spending one-ninth as much time in position. The remaining 95% of the time can be filled with other strategies.
The formula above is linear. It is simpler and more conservative. The compound variant accounts for profit reinvestment:
```python
def compound_annualized(total_pnl_pct, active_days, fill_efficiency=0.80):
    """Compound (reinvested) extrapolation of return per active day."""
    daily_return = (1 + total_pnl_pct / 100) ** (1 / active_days) - 1
    annualized = (1 + daily_return) ** (365 * fill_efficiency) - 1
    return annualized * 100

b_compound = compound_annualized(27, 37.5)    # Strategy B
c_compound = compound_annualized(300, 337.5)  # Strategy C
```
With compound extrapolation, Strategy B overtakes Strategy C: 540% vs 231%. The ranking is inverted.
Recommendation: use linear extrapolation for ranking. It is more conservative and less prone to rewarding overfitting on a small number of trades.
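For intuition on why the gap is so large, here is Strategy B under both extrapolations, using the numbers above:

```python
# Linear vs compound extrapolation for Strategy B
# (+27% over 37.5 active days, fill_efficiency = 0.80).
fill_eff = 0.80
total_pnl, active_days = 27.0, 37.5

linear = total_pnl / active_days * 365 * fill_eff        # ~210%
daily = (1 + total_pnl / 100) ** (1 / active_days) - 1   # ~0.64% per active day
compound = ((1 + daily) ** (365 * fill_eff) - 1) * 100   # ~540%
```

The compound figure explodes because a ~0.64% daily return is compounded over 292 effective days, so small per-trade noise gets amplified exponentially — which is exactly why the linear figure is the safer ranking basis.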
Strategy B with 38 trades and PnL/day = 0.72% looks attractive. But 38 trades is a statistically weak sample. A high PnL/day could be the result of a lucky coincidence.
We use the t-distribution to penalize small samples:
    confidence_factor = max(0, CI_lower / μ),  CI_lower = μ − t_{1−α/2, n−1} · σ/√n

where μ is the mean return per trade, σ is the standard deviation, n is the number of trades, and t_{1−α/2, n−1} is the t-distribution quantile.
```python
import numpy as np
import scipy.stats as st

def confidence_adjusted_score(
    trade_returns: list,      # PnL of each trade, %
    hold_times_hours: list,   # holding time of each trade, hours
    fill_efficiency: float = 0.80,
    min_trades: int = 30,
    confidence: float = 0.95,
) -> dict:
    """Strategy ranking score with a sample-size adjustment."""
    n = len(trade_returns)
    if n < min_trades:
        return {"score": 0, "reason": f"Too few trades ({n} < {min_trades})"}
    returns = np.array(trade_returns)
    mean_ret = np.mean(returns)
    se = np.std(returns, ddof=1) / np.sqrt(n)
    alpha = 1 - confidence
    t_crit = st.t.ppf(1 - alpha / 2, df=n - 1)
    ci_lower = mean_ret - t_crit * se
    if mean_ret <= 0:
        confidence_factor = 0.0
    else:
        confidence_factor = max(0.0, ci_lower / mean_ret)
    total_pnl = np.sum(returns)
    active_days = sum(hold_times_hours) / 24
    pnl_per_day = total_pnl / active_days if active_days > 0 else 0
    annualized = pnl_per_day * 365 * fill_efficiency
    score = annualized * confidence_factor
    return {
        "score": score,
        "annualized": annualized,
        "confidence_factor": confidence_factor,
        "ci_lower": ci_lower,
        "n_trades": n,
    }
```
| Strategy | Trades | Mean ret | SE | CI lower | Conf. factor | Adjusted score |
|---|---|---|---|---|---|---|
| Strategy B | 38 | 0.71% | 0.28% | 0.14% | 0.20 | 210% x 0.20 = 42% |
| Strategy C | 418 | 0.72% | 0.05% | 0.62% | 0.86 | 259% x 0.86 = 223% |
| Strategy A | 491 | 0.12% | 0.02% | 0.08% | 0.67 | 150% x 0.67 = 100% |
After confidence adjustment, Strategy C confidently leads: 418 trades give a narrow CI and high confidence factor. Strategy B with 38 trades is penalized — its "brilliant" performance may be the result of variance.
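The confidence factor for Strategy B can be reproduced directly from the summary statistics in the table (a quick check, not part of the pipeline):

```python
import scipy.stats as st

# Strategy B: mean 0.71%, SE 0.28%, n = 38 (numbers from the table above).
mean_ret, se, n = 0.71, 0.28, 38
t_crit = st.t.ppf(0.975, df=n - 1)           # two-sided 95% quantile, df = 37
ci_lower = mean_ret - t_crit * se            # ~0.14
conf_factor = max(0.0, ci_lower / mean_ret)  # ~0.20
```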

The fill_efficiency parameter answers the question: "What fraction of time can the orchestrator keep capital working?"
The simplest approach: fill_efficiency = 0.80 for all strategies. Assumes the orchestrator utilizes 80% of idle time with other strategies/pairs.
Pro: identical for all, easy to compare. Con: does not account for correlation between strategies.
If you have N pairs, each active a fraction p of the time, the probability that at least one is active:

    P(at least one active) = 1 − (1 − p)^N

But cryptocurrencies are highly correlated — BTC pulls ETH, SOL, and the rest along with it. The effective number of independent pairs:

    N_eff = N / correlation_factor,  P = 1 − (1 − p)^N_eff
```python
def estimate_fill_efficiency(
    trading_time_pct: float,
    n_pairs: int,
    correlation_factor: float = 3.0,  # crypto — high correlation
    max_slots: int = 10,
) -> float:
    """
    Analytical estimate of fill_efficiency.

    Args:
        trading_time_pct: fraction of active time for one strategy
        n_pairs: number of trading pairs
        correlation_factor: correlation penalty (1 = independent, 5 = strong)
        max_slots: maximum number of simultaneous positions
    """
    effective_n = n_pairs / correlation_factor
    p_at_least_one = 1 - (1 - trading_time_pct) ** effective_n
    # Scale down only when expected concurrent positions exceed the slot budget
    expected_active = effective_n * trading_time_pct
    if expected_active > max_slots:
        p_at_least_one *= max_slots / expected_active
    return p_at_least_one

eff_b = estimate_fill_efficiency(0.05, 10, 3.0)   # ~0.16
eff_c = estimate_fill_efficiency(0.45, 10, 3.0)   # ~0.86
```
For Strategy B with 5% activity and 10 correlated pairs, fill_efficiency is only ~16%. This dramatically reduces effective return.
The most accurate approach is to run all strategies on all pairs and calculate real slot utilization:
```python
import numpy as np

def simulate_fill_efficiency(
    all_signals: dict,  # {(strategy, pair): [(entry_min, exit_min), ...]}
    max_slots: int = 10,
    test_period_minutes: int = 750 * 24 * 60,
) -> float:
    """Simulate real orchestrator slot utilization minute by minute."""
    timeline = np.zeros(test_period_minutes)
    for signals in all_signals.values():
        for entry_min, exit_min in signals:
            timeline[entry_min:exit_min] += 1
    capped = np.minimum(timeline, max_slots)
    return np.mean(capped) / max_slots
```
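A toy run of the same logic — two hypothetical strategies on a 1000-minute timeline with a 2-slot budget — shows how overlapping signals get capped:

```python
import numpy as np

# Hypothetical intervals: entry/exit minutes for two strategy-pair combos.
period, max_slots = 1000, 2
signals = {
    ("strat1", "BTCUSDT"): [(0, 400), (600, 800)],
    ("strat2", "ETHUSDT"): [(100, 500)],
}
timeline = np.zeros(period)
for intervals in signals.values():
    for entry_min, exit_min in intervals:
        timeline[entry_min:exit_min] += 1
capped = np.minimum(timeline, max_slots)
fill_efficiency = capped.mean() / max_slots  # slots are busy half the time on average
```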
Combining all components:
```python
import numpy as np
import scipy.stats as st

def strategy_score(
    trades: list,                  # objects with .pnl_pct and .hold_hours
    fill_efficiency: float = 0.80,
    min_trades: int = 30,
    funding_rate: float = 0.0001,  # per 8-hour funding interval
) -> float:
    """
    Final score for strategy ranking.

    Accounts for:
    - PnL per active day (capital usage efficiency)
    - MaxLev (risk-adjusted scaling)
    - Confidence adjustment (penalty for small samples)
    - Funding costs (realistic costs at leverage)
    """
    n = len(trades)
    if n < min_trades:
        return 0.0
    returns = np.array([t.pnl_pct for t in trades])
    hold_hours = np.array([t.hold_hours for t in trades])
    total_pnl = np.sum(returns)
    active_days = np.sum(hold_hours) / 24
    pnl_per_day = total_pnl / active_days
    # Max leverage from the deepest drawdown (50% liquidation buffer)
    equity = np.cumprod(1 + returns / 100)
    peak = np.maximum.accumulate(equity)
    max_dd = ((equity - peak) / peak).min()
    max_lev = max(1, int(50 / abs(max_dd * 100)))
    # Funding: 3 intervals per day, scaled by leverage, in %
    funding_daily = funding_rate * 3 * max_lev * 100
    net_pnl_per_day = pnl_per_day - funding_daily
    annualized = net_pnl_per_day * 365 * fill_efficiency
    # Confidence adjustment from the lower t-interval bound
    mean_ret = np.mean(returns)
    if mean_ret <= 0:
        return 0.0
    se = np.std(returns, ddof=1) / np.sqrt(n)
    t_crit = st.t.ppf(0.975, df=n - 1)
    ci_lower = mean_ret - t_crit * se
    conf_factor = max(0.0, ci_lower / mean_ret)
    return annualized * max_lev * conf_factor
```
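To make the MaxLev and funding terms concrete, here is the drawdown-to-leverage step in isolation, on synthetic per-trade returns (the numbers are illustrative, not from the article):

```python
import numpy as np

# Synthetic per-trade returns, % (illustrative only).
returns = np.array([1.2, -0.8, 0.9, -1.5, 2.0, 0.5, -0.6, 1.1])
equity = np.cumprod(1 + returns / 100)
peak = np.maximum.accumulate(equity)
max_dd = ((equity - peak) / peak).min()        # deepest drawdown: -1.5%
max_lev = max(1, int(50 / abs(max_dd * 100)))  # 50% buffer / 1.5% DD -> 33x
funding_daily = 0.0001 * 3 * max_lev * 100     # 0.99% per day at 33x
```

With only a 1.5% worst drawdown, leverage can be scaled to 33x — but at that leverage, funding alone costs ~1% of equity per day, which is why it must be subtracted before annualizing.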
This metric does not replace but complements the tools from previous articles:
Loss-Profit Asymmetry: max drawdown determines MaxLev, which feeds into the score formula. The deeper the drawdown, the lower the score — nonlinearly, due to recovery asymmetry.
Monte Carlo bootstrap: confidence intervals from bootstrap provide a more accurate estimate of the confidence factor than the t-distribution. You can replace the CI from the t-distribution with the 5th percentile from bootstrap.
Funding rates: funding costs are subtracted from PnL per active day. With high leverage and low PnL/day, funding can make the net score negative — the strategy is unprofitable in reality despite a positive raw PnL.
PnL per active time is the primary metric for ranking strategies in an orchestrator. When multiple strategies compete for the same slot, the one with the highest score (accounting for confidence adjustment) wins.
In practice, this leads to surprising decisions: strategies with "modest" raw PnL but short time in position often get priority over "flashy" strategies with high PnL but long positions. The former use capital more efficiently in a portfolio of dozens of strategies.
The key insight: the only metric that scales is PnL per active day. Raw PnL does not scale: you cannot run the same strategy twice. But you can fill idle time with other strategies — and PnL per active day accurately predicts how much you will earn in a portfolio.
Raw annual PnL is a convenient but deceptive metric. It does not account for the trader's most important resource — the time during which capital is working.
Three takeaways:
Calculate PnL per active day. A strategy with +27% over 38 days in position = +0.72%/day. A strategy with +300% over 338 days = +0.89%/day. The difference is not 11x, but 1.2x.
Account for fill_efficiency. In a portfolio of correlated crypto pairs, fill_efficiency is lower than it seems. 10 pairs does not equal 10x diversification. With correlation_factor = 3, the effective number of pairs is only ~3.
Penalize small samples. 38 trades with a mean of +0.71% gives a CI from +0.14% to +1.28%. 418 trades with +0.72% gives a CI from +0.62% to +0.82%. The second strategy is more reliable, even though the means are nearly identical.
The PnL per active time metric does not replace PnL@MaxLev — it complements it by adding the dimension of capital usage efficiency. For a single strategy, PnL@MaxLev is sufficient. For a portfolio of strategies, PnL per active time is essential.
```bibtex
@article{soloviov2026pnlactivetime,
  author      = {Soloviov, Eugen},
  title       = {PnL by Active Time: The Metric That Changes Strategy Rankings},
  year        = {2026},
  url         = {https://marketmaker.cc/ru/blog/post/pnl-active-time-metric},
  version     = {0.1.0},
  description = {Why raw annual PnL is a poor metric for comparing strategies with different trading time. How to calculate effective return, why you need fill\_efficiency, and why a strategy with 27\% PnL can outperform one with 300\%.}
}
```