When absolute return lies — the baseline-pick trap

The same +40% feels like a win or a regret depending on the line next to it. Here's how the reference choice shapes the conclusion before the math starts.


[Figure: Reference line shapes the result]

You run a backtest and the result reads +40%. How do you feel about that? Honestly, it depends entirely on what's sitting next to it. The same +40% can read as a win, or read as a regret. The difference isn't the number itself — it's the reference line you put next to it.


Why the same +40% reads as a win one day and a regret the next

Picture this. You ran a 5-year backtest and the result came out at +40%: the asset grew 40% over 5 years. On its own, that number doesn't tell you whether the result is strong or weak.

Add one line next to it and the color shifts immediately.

  • Same period, S&P 500 returned +20% → you earned twice the market's return. A win.
  • Same period, S&P 500 returned +60% → you trailed the market by 20 percentage points. A regret.
  • Same period, QQQ returned +90% → you trailed by 50 percentage points. A bigger regret.

Same +40%, three different meanings. The number doesn't carry meaning on its own — the reference line next to it shapes the interpretation. What you pick as the comparison essentially pre-decides the conclusion.
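The arithmetic behind those three readings can be sketched in a few lines. This is a minimal illustration, not Kistack's implementation: the function name `gap_vs_baseline` is hypothetical, and the figures are the illustrative ones from the list above.

```python
def gap_vs_baseline(result_pct: float, baseline_pct: float) -> dict:
    """Compare a total return with a baseline total return (both in percent)."""
    gap_pp = result_pct - baseline_pct  # difference in percentage points
    # Multiple of the baseline's return; only meaningful for a positive baseline.
    ratio = result_pct / baseline_pct
    if gap_pp > 0:
        reading = "win"
    elif gap_pp < 0:
        reading = "regret"
    else:
        reading = "level"
    return {"gap_pp": gap_pp, "ratio": ratio, "reading": reading}


result = 40.0  # the backtest result from the example
for name, baseline in [("S&P 500", 20.0), ("S&P 500", 60.0), ("QQQ", 90.0)]:
    g = gap_vs_baseline(result, baseline)
    print(f"{name} {baseline:+.0f}%: gap {g['gap_pp']:+.0f} pp, "
          f"{g['ratio']:.2f}x the baseline return -> {g['reading']}")
```

The result stays at 40.0 in every iteration; only the baseline changes, and the reading changes with it.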


Why looking at absolute return alone shakes your decisions

Judging by absolute return (+40%) alone keeps your decisions shifting, because the number doesn't carry an answer inside it.

When making an investment decision, the mental flow usually runs: "this asset's result is good → keep going" or "the result isn't good → look for something else." The "good versus not good" benchmark gets built automatically in your head — often without your noticing.

  • Some hold "savings account 3%" as their mental baseline → +40% looks great no matter what
  • Some hold "the crypto +200% a friend mentioned" as their baseline → +40% looks underwhelming
  • Some hold "last year's market crash" as their baseline → +40% feels like relief

Same +40%, different conclusions depending on the mental baseline. And that baseline shifts with whatever you heard that day. No wonder decisions keep wavering.

That's why displaying a market reference line alongside the result matters. It keeps the mental baseline from shifting moment to moment: whether the market returned +20% or +60% over the period is a fixed fact that doesn't change with the day's news.


What was the S&P 500 doing — the weight of one reference line

The most common reference line is the S&P 500 index: an index of 500 large U.S. companies that stands in for the broader U.S. market in one line. When it sits next to a backtest result, the result carries a different weight in your decision.

A quick example.

My backtest result:    +40% (5 years)
Same-period S&P 500:   +62% (5 years)

With both lines visible, the mental framing shifts. "I thought +40% was good, but I trailed the market average by 22 percentage points." The follow-up questions sharpen. "Why was I slower than the market average?" "Which part of the composition held me back?"

Without the reference line, those questions don't surface. +40% feels like enough and you move on. One reference line creates the next question.
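The gap between +40% and +62% can be stated in more than one unit, and it helps to keep them apart. The sketch below is an illustration only: the helper names `annualized_pct` and `relative_shortfall_pct` are hypothetical, and it assumes both figures are cumulative returns over the same 5 years with steady compounding.

```python
def annualized_pct(total_return_pct: float, years: float) -> float:
    """Convert a cumulative return into a compound annual growth rate, in percent."""
    growth = 1 + total_return_pct / 100
    return (growth ** (1 / years) - 1) * 100


def relative_shortfall_pct(result_pct: float, baseline_pct: float) -> float:
    """Ending balance relative to the baseline's ending balance, in percent."""
    return ((1 + result_pct / 100) / (1 + baseline_pct / 100) - 1) * 100


mine, spx, years = 40.0, 62.0, 5
print(f"gap in total return: {mine - spx:+.0f} percentage points")
print(f"annualized: {annualized_pct(mine, years):.2f}% vs {annualized_pct(spx, years):.2f}%")
print(f"ending balance vs baseline: {relative_shortfall_pct(mine, spx):+.1f}%")
```

Under these assumptions the 22-point gap works out to roughly 7.0% versus 10.1% a year, and an ending balance about 13.6% smaller than the index's.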


SPY vs QQQ — what you pick as the baseline shapes the conclusion

This is where the trap actually lives. What you pick as the comparison essentially predetermines the conclusion.

  • SPY (tracks the S&P 500) → 500 large U.S. companies. A steady market-average baseline.
  • QQQ (tracks the Nasdaq-100) → 100 large, tech-leaning Nasdaq names. Typically more volatile than SPY, with stronger returns in tech-led periods.
  • None (no comparison) → absolute return only.

Place the same +40% next to SPY versus QQQ and the comparison reads differently.

  • Same +40% next to SPY +62% → "trailed the market average"
  • Same +40% next to QQQ +95% → "trailed the tech-heavy market by a lot"
  • Same +40% with no baseline → "+40% seems fine, I guess"

Three different conclusions from the same single result. Which baseline you place next to it shapes that day's conclusion. To put it differently, the comparison isn't just evaluating the result — the comparison is generating the conclusion. Picking the baseline deserves one conscious question: "which one is my asset actually closer to?"

  • U.S. large-cap ETF-heavy portfolio → SPY is the honest comparison
  • Tech and growth-stock-heavy portfolio → QQQ is the honest comparison
  • A mix of both → display both lines and read both sides

Picking a baseline that doesn't match your asset character isn't a comparison; it's self-justification. Comparing a tech portfolio to SPY only makes the result look better, because over the same period QQQ ran faster.

[Figure: SPY baseline and QQQ baseline make the same result read differently]


The cost of "no baseline" as a choice

The comparison option includes "none": absolute return only. Some people pick it often because the result screen looks cleaner.

This choice carries a hidden cost. A result without a reference line reads as good or bad depending on your mood that day. After yesterday's market-crash news, +40% feels like comfort. After hearing about a friend's +100%, the same +40% feels underwhelming.

A baseline cuts that wavering. The market average stays fixed at +62%. Whether that one line is there or not is what makes decisions consistent.

If you have a clear absolute-return target, "none" can be the right choice. "Reach +50% in 5 years" — that's an absolute goal. For that type of investor, what the market did doesn't matter. Your +50% target is your baseline.
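An absolute target can be turned into a line of its own. The sketch below is one way to do that, assuming the target is reached through a constant compounded annual rate. The function names `required_annual_pct` and `target_line` are hypothetical and not part of any Kistack feature.

```python
def required_annual_pct(target_total_pct: float, years: int) -> float:
    """Constant annual rate (percent) that compounds to the target total return."""
    return ((1 + target_total_pct / 100) ** (1 / years) - 1) * 100


def target_line(target_total_pct: float, years: int) -> list[float]:
    """Cumulative return (percent) the target implies at the end of each year."""
    rate = required_annual_pct(target_total_pct, years) / 100
    return [((1 + rate) ** year - 1) * 100 for year in range(years + 1)]


# "Reach +50% in 5 years" from the example above.
print(f"required pace: {required_annual_pct(50.0, 5):.2f}% per year")
for year, pct in enumerate(target_line(50.0, 5)):
    print(f"end of year {year}: {pct:+.1f}%")
```

Under that assumption, +50% in 5 years needs roughly 8.4% a year, and the yearly checkpoints give the result something fixed to be read against.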

Having no baseline isn't the problem in itself. The riskiest situation is a baseline you don't consciously notice, one that shifts on its own. Even with "none" selected, it's worth holding one explicit baseline in mind.


One question for picking the comparison

When picking the comparison option, ask yourself one thing: "Is my asset closer to SPY, QQQ, or both?"

Answering that question settles the baseline.

  • U.S. large-cap ETFs (VOO, VTI, etc.) → SPY is the most honest baseline
  • Tech and growth stocks (TSLA, NVDA, QQQ, etc.) → QQQ is the most honest baseline
  • Similar weight in both → display both baselines and compare both sides
  • Korean stocks, bonds, etc. → U.S. indices may not be honest baselines (an absolute-target line fits better)

Asking the question backward creates self-justification. "Which baseline makes my result look good?" The same +40% can look underwhelming next to SPY and great next to KOSPI. A comparison picked that way doesn't help decisions. It's just a comfortable picture.

An honest comparison puts the index closest to your asset's character next to the result. That sharpens the next question.
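The "closer to SPY, QQQ, or both?" question can be written down as a rule so the answer is fixed before the result is read. The sketch below is a hypothetical illustration: the category labels, the function name `pick_baselines`, and the 60% dominance threshold are all assumptions chosen for the example, not recommendations.

```python
def pick_baselines(weights: dict[str, float], dominance: float = 0.6) -> list[str]:
    """Suggest baselines from portfolio weights per category.

    weights: portfolio share per category. The labels "us_large_cap",
    "tech_growth", and "other" (e.g. Korean stocks, bonds) are hypothetical.
    dominance: share at which one category counts as dominant (assumed 0.6).
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    share = {name: value / total for name, value in weights.items()}

    if share.get("other", 0.0) >= dominance:
        return ["absolute target"]  # U.S. indices may not be an honest baseline
    if share.get("us_large_cap", 0.0) >= dominance:
        return ["SPY"]
    if share.get("tech_growth", 0.0) >= dominance:
        return ["QQQ"]
    return ["SPY", "QQQ"]  # no dominant side: show both lines


print(pick_baselines({"us_large_cap": 0.9, "tech_growth": 0.1}))
print(pick_baselines({"us_large_cap": 0.5, "tech_growth": 0.5}))
```

The rule takes only the portfolio's composition as input, never the result, so the baseline can't be chosen because it flatters the number.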


Looking at it directly

The fastest way to see this is to run the same backtest three times: with SPY, with QQQ, and with no baseline. Watch how the same +40% organizes differently in your head depending on which baseline sits next to it. Feel that difference once, and from then on the baseline goes up automatically with every result.
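The three runs can be imitated offline to see the effect side by side. This is a rough sketch, not a reproduction of the Kistack result screen: `describe` is a hypothetical helper, and the +62% and +95% figures are the illustrative ones used earlier in this post.

```python
from typing import Optional


def describe(result_pct: float, baseline_name: Optional[str],
             baseline_pct: Optional[float]) -> str:
    """Render one result line with or without a baseline next to it."""
    if baseline_name is None or baseline_pct is None:
        return f"result {result_pct:+.0f}% | no baseline | nothing to compare against"
    gap = result_pct - baseline_pct  # percentage points
    if gap == 0:
        verdict = "level with the baseline"
    else:
        side = "ahead of" if gap > 0 else "behind"
        verdict = f"{abs(gap):.0f} pp {side} the baseline"
    return f"result {result_pct:+.0f}% | {baseline_name} {baseline_pct:+.0f}% | {verdict}"


# Same result, three comparison settings.
for name, pct in [("SPY", 62.0), ("QQQ", 95.0), (None, None)]:
    print(describe(40.0, name, pct))
```

The "none" line has no verdict at all, which leaves the reading to whatever baseline happens to be in your head that day.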

Set the baseline first. Read the result second. The order changes the conclusion.


  • This information is not investment advice.
  • Past performance does not guarantee future results.
  • Backtest results are simulations and may differ from actual trading outcomes.

Kistack is an information service designed to help users review market data independently and form their own judgments. These backtests are historical simulations based on public market data and do not guarantee future investment returns. Past performance is not indicative of future results. Trading costs such as fees, taxes, and slippage are not reflected in simulations. Data is provided by Kistack; decisions are made by users.

This information is provided for educational and informational purposes only and does not constitute investment advice within the meaning of the Investment Advisers Act of 1940 (IAA) §206. Kistack is not a registered investment adviser and does not provide individualized buy or sell recommendations.

All performance figures shown are historical simulations. Disclosures regarding past performance and risk are presented in a manner intended to be fair, balanced, and not misleading, consistent with FINRA Communications Rule 2210. No statement on this site is intended to omit material facts or to mislead readers under SEC Rule 10b-5 of the Securities Exchange Act of 1934.