This page explains every statistical concept used in the study. Each section starts with a real-world analogy, then connects it to what we actually did. You don't need to read them in order — jump to whatever term confused you on another page.

1. What is a "regression" and why do we use it?

The problem we're solving

We want to know: do stock prices behave differently around paydays? But lots of things affect stock prices — the day of the week, the time of year, what happened yesterday. If we just look at "payday days vs. other days," we might accidentally pick up one of those other effects and think it's a payday effect.

Real-world analogy

Ice cream and drowning. Ice cream sales and drowning deaths both go up in summer. If you just compared the two, you'd conclude ice cream causes drowning. But both are caused by a third thing: hot weather. A regression is the tool that says "hold weather constant, and now check if ice cream still predicts drowning." (It doesn't.)

In our study: "Hold the day-of-week, month, and recent market moves constant, and now check if payday timing still predicts stock returns."

How it works, step by step

A regression takes each trading day and builds an equation:

How much did the market move today?
today's return = baseline (the average day)
  + ?? × is it a payday window? ← THIS is what we want to measure
  + ?? × is it Monday? (Mondays tend to be worse)
  + ?? × is it Friday? (Fridays tend to be better)
  + ?? × is it January? (January has its own pattern)
  + ?? × is it the turn of the month? (last day + first 3 days)
  + ?? × what did the market do yesterday? (momentum)
  + random noise (stuff we can't predict)

The regression fills in every ?? with the number that best fits 16,680 trading days of data. The number in front of "is it a payday window?" is the beta coefficient (β) — the headline number in our study.

Bottom line: β = +0.10% means "on days near a payday, the market goes up an extra 0.10% per day, even after accounting for everything else we know moves markets." If β = 0, there's no payday effect. If β is negative, the market actually goes down near paydays.
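
To make the "fill in every ??" idea concrete, here is a minimal sketch of an ordinary least squares fit in plain Python. Everything in it is an assumption for illustration: the data are synthetic (not the study's data), the payday window and Monday flags are made up, and only two controls are included instead of the full list above. The helper names `solve` and `ols` are hypothetical.

```python
import random

def solve(a, b):
    """Solve the linear system a·x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Ordinary least squares: solve (X'X) coef = X'y for the coefficients."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic data (NOT the study's data): a true payday effect of +0.10% per day
# and a Monday effect of -0.05%, buried in much larger daily noise.
rng = random.Random(0)
rows, y = [], []
for day in range(16680):
    payday = 1.0 if day % 10 < 3 else 0.0   # hypothetical window: 3 of every 10 days
    monday = 1.0 if day % 5 == 0 else 0.0   # hypothetical Monday flag
    rows.append([1.0, payday, monday])      # the leading 1.0 is the baseline
    y.append(0.03 + 0.10 * payday - 0.05 * monday + rng.gauss(0, 1.0))

baseline, beta_payday, beta_monday = ols(rows, y)
print(f"payday beta: {beta_payday:+.3f}%   monday beta: {beta_monday:+.3f}%")
```

The fitted payday number should land near the +0.10% that was built into the fake data, even though the Monday effect and the noise are mixed in with it. That is the "hold everything else constant" step at work.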

2. What is a "p-value"?

The question it answers

Suppose you found β = +0.10%. Cool, but is that real, or did it just happen by chance? Maybe the days that fell in the "payday window" just happened to be good days for unrelated reasons. The p-value answers: "If there were truly no payday effect, how often would random chance produce a number this big or bigger?"

Coin flip analogy

You suspect a coin is rigged. You flip it 100 times.

  • 52 heads: Meh. That's close enough to 50 that you wouldn't even blink. (p ≈ 0.38)
  • 57 heads: Hmm, a little suspicious, but you've seen that happen with a fair coin. (p ≈ 0.09)
  • 60 heads: Now you're paying attention. This is uncommon for a fair coin. (p ≈ 0.02)
  • 65 heads: Okay, this coin is almost certainly rigged. (p ≈ 0.002)
  • 75 heads: There is essentially zero chance this coin is fair. (p < 0.000001)

The p-value is the probability of seeing a result this extreme (or more) if the coin were perfectly fair. The smaller the p-value, the harder it is to explain away as luck.
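
If you want to check coin-flip numbers like these yourself, the sketch below counts the exact chance of getting "this many heads or more" from a fair coin. It assumes a one-sided question (only too many heads counts as suspicious). A two-sided test, which also counts equally lopsided tails-heavy results, would roughly double each value. The function name `p_at_least` is made up for this example.

```python
from math import comb

def p_at_least(heads, flips=100):
    """Probability that a fair coin gives `heads` or more heads in `flips` flips."""
    return sum(comb(flips, k) for k in range(heads, flips + 1)) / 2 ** flips

for heads in (52, 57, 60, 65, 75):
    print(heads, "heads or more:", f"{p_at_least(heads):.2g}")
```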

The significance threshold

Scientists have agreed on a convention: if p < 0.05 (a result this extreme would show up less than 5% of the time if there were no real effect), we call the result "statistically significant." This isn't a magic number; it's just a widely accepted standard. Here's what the different levels mean:

p-value | Label | Interpretation
p > 0.10 | Not significant | Probably noise
p = 0.05–0.10 | Suggestive | Interesting, not conclusive
p = 0.01–0.05 | Significant * | Likely real
p = 0.001–0.01 | Highly significant ** | Almost certainly real
p < 0.001 | Extremely significant *** | Beyond reasonable doubt

Throughout this site, we mark significance with stars: * = p < 0.05, ** = p < 0.01, *** = p < 0.001, and . = p < 0.10 (suggestive).
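
The star convention is simple enough to write down as a lookup. This is only a sketch of the rule as stated above; the function name `stars` is hypothetical.

```python
def stars(p):
    """Return the significance marker for a p-value, using the thresholds stated above."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    if p < 0.10:
        return "."
    return ""

for p in (0.0004, 0.004, 0.03, 0.07, 0.40):
    print(f"p = {p:<6} marker: '{stars(p)}'")
```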

3. What are "HAC standard errors" and why should I care?

The problem with basic statistics on stock data

Most statistics assume that each data point is independent — like each flip of a coin has nothing to do with the last one. But stock prices don't work like coin flips:

What basic stats assume
  • Each day is independent of the last
  • Market volatility is the same every day
  • A calm day and a crisis day have equal weight
How stocks actually work
  • Bad days tend to follow bad days (momentum)
  • Some weeks are wild, some are calm (volatility clustering)
  • A 2008 crisis day is very different from a normal Tuesday

If you use basic statistics on stock data, your p-values will be too optimistic — results will look more significant than they really are. It's like wearing rose-colored glasses that make everything seem more important.

HAC stands for "heteroskedasticity and autocorrelation consistent." HAC standard errors (the most widely used version is named after its authors, Newey and West) are the correction. They adjust the math to account for the fact that stock returns are messy, correlated, and unevenly volatile. Think of it as swapping the rose-colored glasses for prescription lenses. The results might look less dramatic, but they're honest.
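
Here is a small sketch of what the correction does, under simplifying assumptions. It computes a Newey-West style standard error for a plain average rather than for a full regression, on a synthetic series where each value depends on the previous one. The lag length of 10 is an arbitrary choice for illustration, and the function names are made up.

```python
import random
from math import sqrt

def naive_se(x):
    """Textbook standard error of the mean, assuming independent observations."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / (n - 1)
    return sqrt(var / n)

def newey_west_se(x, max_lag):
    """HAC standard error of the mean using Bartlett (Newey-West) weights."""
    n = len(x)
    mean = sum(x) / n
    e = [v - mean for v in x]
    long_run_var = sum(v * v for v in e) / n          # lag-0 autocovariance
    for k in range(1, max_lag + 1):
        gamma_k = sum(e[t] * e[t - k] for t in range(k, n)) / n
        weight = 1 - k / (max_lag + 1)                # Bartlett kernel weight
        long_run_var += 2 * weight * gamma_k
    return sqrt(long_run_var / n)

# Synthetic autocorrelated series (NOT market data): each value carries over
# half of the previous one, so neighbouring observations are not independent.
rng = random.Random(1)
x, prev = [], 0.0
for _ in range(5000):
    prev = 0.5 * prev + rng.gauss(0, 1)
    x.append(prev)

print(f"naive SE: {naive_se(x):.4f}   HAC SE: {newey_west_se(x, max_lag=10):.4f}")
```

On data like this, the HAC standard error should come out noticeably larger than the naive one. A larger standard error means wider error bars and less flattering p-values, which is the "prescription lenses" effect.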

4. What is "FDR correction" and the "multiple comparisons problem"?

The birthday party analogy

You're at a party with 100 people. You test whether each person's birthday predicts their salary. At the p < 0.05 level, you'd expect about 5 people to show a "significant" correlation just by pure luck — even though birthdays obviously don't affect salaries.

That's the multiple comparisons problem: test enough things, and some will look significant by accident.

In our study, we ran hundreds of tests. Some will look significant by chance alone.

How FDR correction fixes this

FDR stands for "False Discovery Rate." It's a method (invented by Benjamini and Hochberg in 1995) that adjusts all the p-values to account for the number of tests. It produces a q-value for each result:

What q-values mean

q = 0.05 means: "If you take all findings with q ≤ 0.05, at most 5% of them are expected to be false alarms."

A finding with p = 0.01 might have q = 0.15 after FDR correction — meaning that once you account for all the tests you ran, this result is no longer trustworthy on its own.

Bottom line: Raw p-values tell you how surprising ONE result is. FDR-adjusted q-values tell you how surprising it is given that you tested many things. Throughout this site, we report both — and our headline conclusions are based on q-values, not raw p-values.
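
The Benjamini-Hochberg adjustment is short enough to sketch in full. The p-values below are invented for illustration and the function name `bh_q_values` is hypothetical. The rule itself is the standard one: sort the p-values, multiply each by (number of tests / its rank), then make sure q never decreases as p increases.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q is the minimum of
    # p * m / rank over its own rank and every larger rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

p = [0.001, 0.01, 0.02, 0.03, 0.04, 0.20, 0.50, 0.80]   # made-up example
for p_i, q_i in zip(p, bh_q_values(p)):
    print(f"p = {p_i:.3f}  ->  q = {q_i:.3f}")
```

In this made-up list, the raw p = 0.04 would pass a 0.05 cutoff on its own, but its q-value comes out at 0.064, so it would not survive the correction.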

5. What is the "clearing lag"?

This is the single most important concept for understanding our findings. When your paycheck is deposited, the 401(k) contribution doesn't instantly buy stock. It goes through a pipeline:

The key insight: There's a 7-8 trading day gap between "your paycheck arrives" and "your money buys stocks." Our research shows that stock prices systematically rise during this exact gap. By the time your money buys in, prices are already elevated.

This is why the initial analysis (which looked at the paycheck date) found nothing: the action happens 7-8 trading days later. Once we shifted the analysis to account for the clearing lag, the pattern appeared.

6. What is a "Monte Carlo simulation"?

The skeptic's challenge

Imagine showing your results to a skeptic. They say: "Sure, you found a pattern around payday dates. But I bet you'd find a pattern around ANY set of dates if you looked hard enough."

The Monte Carlo test directly answers this challenge.

How it works

  1. Take the real payday dates (about 480 semi-monthly settlement dates over 20 years).
  2. Pick 480 random dates from the same calendar — completely ignoring paydays.
  3. Run the exact same analysis on these random dates. Compute β (the effect size).
  4. Write down the result and repeat steps 2-3 a total of 500 times.
  5. Compare: is the REAL payday β bigger than 95% of the random ones?

If real paydays are NOT special

The real β would be somewhere in the middle of the random β's. It wouldn't stand out. Verdict: the pattern is noise.

If real paydays ARE special

The real β would be larger than 95%+ of the random β's. Random dates almost never produce a pattern this strong. Verdict: the pattern is real.

Bottom line: Monte Carlo is the "prove it's not a coincidence" test. We shuffle the calendar 500 times and check if paydays really are special compared to random dates. They are.
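
The five steps above can be sketched in a few lines. To stay short, this version swaps in a simpler statistic than the study's regression β: the average return on event days minus the average on all other days. The returns are synthetic with a bump deliberately added on the fake "payday" dates, and all names here are hypothetical.

```python
import random

def effect(returns, event_days):
    """Mean return on event days minus mean return on all other days."""
    events = set(event_days)
    on = [r for d, r in enumerate(returns) if d in events]
    off = [r for d, r in enumerate(returns) if d not in events]
    return sum(on) / len(on) - sum(off) / len(off)

def placebo_p_value(returns, real_days, n_sims=500, seed=0):
    """Share of random date sets whose effect is at least as large as the real one."""
    rng = random.Random(seed)
    real = effect(returns, real_days)
    n_days = len(returns)
    hits = 0
    for _ in range(n_sims):
        fake_days = rng.sample(range(n_days), len(real_days))
        if effect(returns, fake_days) >= real:
            hits += 1
    return (hits + 1) / (n_sims + 1)   # +1 counts the real date set itself

# Synthetic returns (NOT the study's data) with a built-in bump on "payday" days.
rng = random.Random(42)
n_days = 5000
real_days = list(range(0, n_days, 10))[:480]
returns = [rng.gauss(0.03, 1.0) for _ in range(n_days)]
for d in real_days:
    returns[d] += 0.30

print("placebo p-value:", placebo_p_value(returns, real_days))
```

A small placebo p-value means random date sets almost never beat the real dates. If the bump were removed from the fake data, the real dates would land somewhere in the middle of the pack instead.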

7. What is "GARCH"?

GARCH is a type of statistical model built specifically for financial data. Remember how we said stock volatility clusters — calm days follow calm days, and stormy days follow stormy days?

Weather analogy

If you're predicting tomorrow's temperature, knowing today's temperature helps a lot (if it's 90° today, it's probably not 30° tomorrow). Similarly, if the stock market was wild today, it's more likely to be wild tomorrow.

A regular regression ignores this. GARCH builds it into the model: it predicts not just the direction of the market, but also how volatile it will be, and uses that to make more precise estimates.

We use GARCH as a cross-check. If our payday effect shows up in both the regular regression AND the GARCH model, we can be more confident it's real — because GARCH handles volatility clustering better than HAC standard errors alone.

Bottom line: GARCH is a fancier statistical model that's purpose-built for stock market data. We ran our test through it as a second opinion. It confirmed the findings.
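
To see what "volatility clustering" looks like inside the model, here is a sketch that simulates a GARCH(1,1) process. It does not fit anything and does not reproduce our estimates. The parameter values are arbitrary illustrations, and the names `shock_weight` and `persistence` are this example's own labels for the two standard GARCH weights.

```python
import random
from math import sqrt

def simulate_garch(n, omega=0.05, shock_weight=0.10, persistence=0.85, seed=0):
    """Simulate returns whose variance follows the GARCH(1,1) recursion:
    var_t = omega + shock_weight * return_{t-1}**2 + persistence * var_{t-1}."""
    rng = random.Random(seed)
    var = omega / (1 - shock_weight - persistence)   # start at the long-run variance
    returns, variances = [], []
    for _ in range(n):
        r = sqrt(var) * rng.gauss(0, 1)
        returns.append(r)
        variances.append(var)
        var = omega + shock_weight * r * r + persistence * var
    return returns, variances

def lag1_autocorr(x):
    """Correlation between each value and the one before it."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

returns, variances = simulate_garch(5000)
squared = [r * r for r in returns]
print("autocorrelation of returns:      ", round(lag1_autocorr(returns), 3))
print("autocorrelation of squared moves:", round(lag1_autocorr(squared), 3))
```

The direction of yesterday's move says little about today's, but the size of yesterday's move does: the squared returns should show clearly positive autocorrelation. That is the pattern a plain regression ignores and GARCH models directly.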

8. What is "block bootstrap"?

Bootstrap is a technique where you create "fake" versions of your dataset by resampling it, then check if your results hold up across all the fake versions.

The deck of cards analogy

Imagine your data is a deck of 5,000 cards (one per trading day). A regular bootstrap would shuffle the cards randomly and draw 5,000 with replacement. But this breaks the order — you might put a Monday after a Wednesday, which doesn't make sense for time-series data.

A block bootstrap instead picks up chunks of consecutive cards (say, 20 days at a time) and rearranges the chunks. Each chunk keeps its internal order intact, so the patterns within each stretch of days are preserved. Only the arrangement of chunks changes.

We do this 1,000 times and compute β each time. If at least 95% of the bootstrapped β's have the same sign (all positive or all negative), the result is robust. The range that holds the middle 95% of those values gives us a confidence interval: a range where the true effect most likely lives.

Bottom line: Block bootstrap is a way to test "if we had slightly different data, would we still get the same answer?" It's like asking 1,000 slightly different versions of reality and checking if they all agree. When they do, we can trust the finding.
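
Here is a minimal sketch of the chunk-and-rearrange idea. For brevity it bootstraps a simple average instead of re-running the regression to get β each time, and the data are synthetic. The block length of 20 and the 1,000 repetitions follow the description above; the function names are hypothetical.

```python
import random

def block_bootstrap_means(x, block_len=20, n_boot=1000, seed=0):
    """Moving-block bootstrap of the mean: glue together random runs of
    consecutive observations until the resample is as long as the original."""
    rng = random.Random(seed)
    n = len(x)
    means = []
    for _ in range(n_boot):
        sample = []
        while len(sample) < n:
            start = rng.randrange(n - block_len + 1)
            sample.extend(x[start:start + block_len])
        sample = sample[:n]                  # trim the last block to fit
        means.append(sum(sample) / n)
    return means

def percentile_interval(values, level=0.95):
    """Range holding the middle `level` share of the bootstrap values."""
    s = sorted(values)
    lo = s[int((1 - level) / 2 * len(s))]
    hi = s[int((1 + level) / 2 * len(s)) - 1]
    return lo, hi

# Synthetic daily excess returns (NOT the study's data), average +0.10%.
rng = random.Random(7)
x = [rng.gauss(0.10, 1.0) for _ in range(2000)]

means = block_bootstrap_means(x)
low, high = percentile_interval(means)
share_positive = sum(m > 0 for m in means) / len(means)
print(f"95% interval: [{low:+.3f}, {high:+.3f}]   share positive: {share_positive:.1%}")
```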

9. How to read the charts on this site

Most interactive charts on this site follow the same format. Here's a complete guide:

The x-axis (horizontal)

Usually shows the clearing lag — how many trading days after the paycheck date. Lag 0 = the paycheck day itself. Lag +8 = eight trading days later (approximately when the 401(k) money actually buys stocks).

The y-axis (vertical)

Usually shows the β coefficient (the effect size) in percentage points per day. A bar reaching up to +0.10% means "the market returns an extra 0.10% per day during this window." A bar going down to -0.10% means the market performs 0.10% worse per day.

The zero line

The horizontal dashed line at y = 0 represents "no effect." Bars above this line mean positive excess returns; bars below mean negative. If a bar's error range crosses this line, the result is not statistically significant.

Error bars (thin lines above and below each bar)

These show the 95% confidence interval. Think of it as: "we're 95% sure the true value is somewhere within this range."

  • Short error bars = we're quite sure about this number
  • Long error bars = there's a lot of uncertainty
  • Error bars that cross zero = we can't rule out that the effect is actually zero (not significant)
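
As a rough sketch of where an error bar comes from: under the usual normal approximation, a 95% interval is the estimate plus or minus about 1.96 standard errors. The numbers below are hypothetical, and the charts on this site may compute their intervals differently (for example, from the bootstrap).

```python
def confidence_interval(beta, std_error, z=1.96):
    """Approximate 95% interval: estimate plus or minus 1.96 standard errors."""
    return beta - z * std_error, beta + z * std_error

def crosses_zero(beta, std_error):
    """True if the interval includes zero, i.e. 'no effect' cannot be ruled out."""
    low, high = confidence_interval(beta, std_error)
    return low <= 0 <= high

# Hypothetical numbers for illustration only.
for beta, se in ((0.10, 0.03), (0.10, 0.06)):
    low, high = confidence_interval(beta, se)
    print(f"beta = {beta:+.2f}%, SE = {se:.2f}: [{low:+.3f}, {high:+.3f}], "
          f"crosses zero: {crosses_zero(beta, se)}")
```

The same +0.10% estimate is convincing with a small standard error and inconclusive with a large one, which is why the length of the error bar matters as much as the height of the bar.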

Colors and legend

Different colors represent different time periods or categories. Click a legend entry to show/hide that series. This lets you compare, for example, "full 1960-2026 sample" vs. "just the 2000-2019 era."

Significance markers

Stars next to values indicate statistical significance:

Marker | Meaning | How confident?
(no marker) | p ≥ 0.10 | Not significant: could easily be chance
. | p < 0.10 | Suggestive: worth noting but not conclusive
* | p < 0.05 | Significant: would happen less than 5% of the time by chance alone
** | p < 0.01 | Highly significant: less than 1% of the time
*** | p < 0.001 | Extremely significant: less than 0.1% of the time

Green/red shaded bands

Green bands highlight the "settlement window" (the days when 401(k) money is most likely buying). Red bands or markers highlight where investors are buying at elevated prices.

10. Putting it all together

Here's how all these methods connect in our study:

  1. We start with a question: do stock prices behave differently around paydays?
  2. We use regression to isolate the payday effect from other known market patterns (day-of-week, month, etc.).
  3. We use HAC standard errors to make sure our p-values are honest (not inflated by the messiness of stock data).
  4. We test multiple lags (0 to 20 days after paycheck) to find when the effect actually happens.
  5. We apply FDR correction because testing that many lags means some will look significant by chance.
  6. We run Monte Carlo to prove that real payday dates produce a stronger signal than random dates.
  7. We run GARCH as a second opinion from a model designed specifically for stock data.
  8. We run block bootstrap to check that results hold up when we reshuffle the data.
  9. Only findings that survive ALL of these filters make it into our conclusions.

Think of it like a courtroom: the regression is the evidence, HAC is the lie detector, FDR is the cross-examination, Monte Carlo is the independent witness, GARCH is the expert testimony, and bootstrap is the jury deliberation. A finding that survives all of them is a strong verdict.