NFL 2025 Season · Quantitative Research

Sportsbook Pricing vs. Prediction Markets

Six notebooks. 285 games. One question: who actually priced the 2025 NFL season correctly — ESPN, the sportsbooks, or a $1.14B prediction market? This report works through each model, tests them against outcomes, and follows the sharp money.

285 games analysed · 7 sportsbooks tracked · $1.14B Polymarket volume · 4 probability models · 32 teams profiled · Weeks 1–18 + all playoffs
01
ESPN Win Probability vs. Sportsbooks
Expected Value Analysis · 285 games · 7 books

Before each NFL game, ESPN publishes a win probability for both teams. Sportsbooks publish moneyline odds (the prices you bet at). By converting the sportsbook odds into implied probabilities and comparing them to ESPN's model, we can calculate the "Expected Value" (EV) of each bet. A positive EV means ESPN thinks the team is more likely to win than the sportsbook price implies. The question: does betting on high ESPN-suggested EV actually make money in the long run?
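To make the conversion concrete, here is a minimal Python sketch of the odds-to-probability and EV arithmetic. It assumes EV is expressed in dollars per $100 staked (the report's buckets, such as EV > +25, suggest this, but the exact definition used in the notebooks is not stated). The function names are illustrative and not taken from the project.

```python
def american_to_implied(odds: int) -> float:
    """Convert American moneyline odds to the implied win probability (vig included)."""
    if odds > 0:
        return 100 / (odds + 100)
    return -odds / (-odds + 100)


def profit_per_100(odds: int) -> float:
    """Profit on a winning $100 stake at the given American odds."""
    return float(odds) if odds > 0 else 100 * 100 / -odds


def expected_value(model_prob: float, odds: int, stake: float = 100.0) -> float:
    """EV = P(win) * profit if the bet wins - P(lose) * stake (assumed definition)."""
    win_profit = profit_per_100(odds) * stake / 100
    return model_prob * win_profit - (1 - model_prob) * stake


# An underdog priced at +250 that a model rates at 40% to win.
implied = american_to_implied(250)
ev = expected_value(0.40, 250)
print(f"implied={implied:.3f}, EV per $100={ev:+.2f}")
```

For a +250 underdog that a model rates at 40%, this gives an implied probability of about 28.6% and an EV of about +$40 per $100 staked, which would fall in the EV > +25 bucket.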

ESPN vs Sportsbook Correlation
−0.358
A negative correlation means bets with higher suggested value lost more often. The result is statistically significant (p < 0.0001), so it is very unlikely to be random noise.
Win Rate: High EV Bets (EV > +25)
22.2%
When ESPN's model suggested the biggest value, those teams won fewer than 1 game in 4.
Win Rate: Negative EV Bets (EV < −10)
69.6%
The "bad value" bets (backing strong favourites) won roughly 7 times in 10, as expected.
Total Bets Analysed
3,990
Every game × every sportsbook × both sides, across the full season.
ESPN Systematically Overrates Underdogs
The core finding of Notebook 1 is striking: ESPN's Expected Value signal is completely inverted. The higher the EV ESPN suggests for a bet, the more likely that team is to lose. This happens because ESPN's model assigns overly generous win probabilities to underdogs — teams the sportsbooks have correctly priced as heavy longshots. When a team is priced at +250 (28.6% implied) by the books but ESPN says they have a 40% chance, the "EV" looks excellent. In reality, the sportsbook was right. This means treating ESPN's win probability as a betting signal is actively counterproductive.
Win Rate by Expected Value Bucket
Each bar shows how often the team with that ESPN-suggested EV actually won. A well-calibrated model would show higher win rates for higher EV buckets. The opposite is true here.
What Each EV Bucket Means in Plain English
02
Model Calibration — How Accurate Are the Predictions?
Brier Score ranking · 17 models compared

The Brier Score measures how accurate a probability model is (lower is better). A score of 0.25 is what you'd get from always guessing 50/50 (completely useless). The Brier Skill Score (BSS) shows improvement over that 50/50 baseline (higher is better). We scored 17 different models: ESPN, 7 sportsbooks using two different mathematical methods each, Polymarket, and two consensus averages.
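The two metrics can be sketched in a few lines of Python. This is an illustration on made-up toy data, not the project's code: the probabilities and outcomes below are hypothetical, and the 0.25 reference score is the 50/50 baseline described above.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(score, baseline=0.25):
    """Fractional improvement over a reference forecast (0.25 = always 50/50)."""
    return 1 - score / baseline


# Hypothetical forecasts for four games; 1 = the forecast team won, 0 = it lost.
probs = [0.7, 0.6, 0.8, 0.3]
outcomes = [1, 0, 1, 0]

bs = brier_score(probs, outcomes)
bss = brier_skill_score(bs)
print(f"Brier={bs:.4f}, BSS={bss:.2%}")

# Always guessing 50/50 scores 0.25 regardless of the results.
coin_flip = brier_score([0.5] * 4, outcomes)
```

On this toy data the Brier Score is 0.145, and the 50/50 forecast scores exactly 0.25 whatever the outcomes are, which is why it serves as the baseline.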

Most Accurate Model
0.2101
7-book consensus (normalisation method). 15.7% more skilled than guessing 50/50.
ESPN Brier Score
0.2216
Ranks 16th out of 17 models. Only marginally better than a coin flip.
Polymarket Brier Score
0.2224
$1.14 billion in trading volume, yet performs nearly identically to ESPN.
Useless Baseline (50/50)
0.2500
If you predicted every game as exactly 50/50 all season, this is what you'd score.
The Wisdom of Crowds Wins — But the Crowd Is Bookmakers, Not Twitter
The most accurate pre-game probability model isn't a sophisticated algorithm — it's simply the average of what seven sportsbooks agree on. Professional bookmakers, competing against each other for market share, are collectively the best forecasters in the dataset. The sportsbook consensus is 1.57× more skill-efficient than ESPN. Polymarket — despite attracting over $1 billion in trading volume from thousands of participants — performs no better than ESPN's internal model. The implication: market depth alone doesn't guarantee accuracy. Bookmakers have decades of incentive to price correctly; Polymarket participants may be less informed on average.
The Three Key Models Side by Side
Lower Brier Score = more accurate. The gap between the sportsbook consensus and ESPN/Polymarket is small in absolute terms but meaningful when sustained across 285 games.
All 17 Models Ranked by Skill Score
The Brier Skill Score shows improvement over always guessing 50/50. Models ending in † used the Shin method (a more conservative probability adjustment). Models without † used simple normalisation.
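The difference between the two de-vigging methods can be sketched as follows. The report does not show its implementation, so this is an assumption-laden illustration: `normalise` divides each raw implied probability by their sum, and `shin` uses the standard Shin formula from the literature, solving for the insider-trading parameter z by bisection. The input prices are hypothetical.

```python
from math import sqrt


def normalise(implied):
    """Simple normalisation: scale raw implied probabilities so they sum to 1."""
    total = sum(implied)
    return [p / total for p in implied]


def shin(implied, tol=1e-12):
    """Shin method: find z in [0, 0.5] such that the adjusted probabilities sum to 1.

    Assumes the book has a positive overround (raw probabilities sum to more than 1).
    """
    booksum = sum(implied)

    def probs(z):
        return [
            (sqrt(z * z + 4 * (1 - z) * p * p / booksum) - z) / (2 * (1 - z))
            for p in implied
        ]

    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if sum(probs(mid)) > 1:
            lo = mid  # probabilities still too large: z must increase
        else:
            hi = mid
    return probs((lo + hi) / 2)


# Hypothetical two-way market: raw implied probabilities sum to 1.05 (5% overround).
raw = [0.60, 0.45]
norm_probs = normalise(raw)
shin_probs = shin(raw)
print([round(p, 4) for p in norm_probs], [round(p, 4) for p in shin_probs])
```

For this input, normalisation rates the favourite at about 57.1% while the Shin adjustment gives about 57.5%: Shin removes relatively more of the margin from the longshot than from the favourite.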
03
Which Sportsbook Offers the Best Odds?
Best line frequency · Book deviation · Season shopping value

Across 285 games, sportsbooks often offer slightly different odds on the same outcome. This section identifies which book most frequently offered the best available price, how far each book's odds deviated from the average market consensus, and how much money a bettor who always shopped for the best line could have captured over the full season.

Total Season Shopping Value
$32,785
The total extra profit from always choosing the best available odds vs. the worst, on every game, for $100 per bet.
Average Gap Per Game (Away Side)
$69.53
Average difference per $100 bet between the best and worst book on the away side. Median is $27.04.
Bet365 Best-Line Rate
35.6%
Bet365 offered the best combined odds most often — but also led the worst-line count with 110 games.
Most Reliably Above Market
DraftKings
+0.0018 percentage points above the consensus on average — small but the most consistent edge for bettors.
The Bet365 Paradox: Most Volatile, Not Most Generous
At first glance, Bet365 looks like the obvious choice — it offered the best available line 203 times out of 570 opportunities (35.6%). But it also offered the worst line 110 times on the away side alone. Bet365's pricing swings dramatically in both directions, with a standard deviation 2.4× greater than the next most volatile book. This isn't generosity — it's aggression. Bet365 takes strong positions that diverge from the market consensus, sometimes in the bettor's favour, sometimes not. DraftKings, by contrast, sits quietly above the consensus average in 153 of 285 games without the wild swings — making it the more reliable choice for a disciplined line-shopper.
$32,785 in Value Left on the Table All Season
If a bettor placed $100 on the away side of every game using the worst available book each time, and another bettor used the best available book, the second bettor would end the season $32,785 ahead — purely from line shopping. The largest single-game gap was $1,090 (Arizona Cardinals @ Seattle Seahawks, Week 10), driven entirely by Bet365's extreme pricing on a heavy underdog. The median gap of $27 per game is more realistic for most situations, but over a full season the cumulative value is substantial.
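A per-game shopping gap of this kind can be sketched as below. This assumes the gap is the difference in winning profit on a $100 stake between the best and worst quoted moneyline, which is one reading of the report's description; the book names and prices are hypothetical.

```python
def profit_per_100(odds: int) -> float:
    """Profit on a winning $100 stake at the given American odds."""
    return float(odds) if odds > 0 else 10000 / -odds


def shopping_gap(odds_by_book: dict) -> tuple:
    """Return (best book, worst book, dollar gap per $100) for one side of one game."""
    payouts = {book: profit_per_100(odds) for book, odds in odds_by_book.items()}
    best = max(payouts, key=payouts.get)
    worst = min(payouts, key=payouts.get)
    return best, worst, payouts[best] - payouts[worst]


# Hypothetical away-side moneylines for a single game.
quotes = {"Book A": 240, "Book B": 255, "Book C": 230}
best, worst, gap = shopping_gap(quotes)
print(best, worst, gap)
```

Summing this gap over every game gives a season total comparable in spirit to the figure above, although the notebooks may define the gap differently (for example, by including both sides of each game).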
How Often Each Book Offered the Best (Blue) or Worst (Red) Odds
The blue bars show how many times each book posted the highest moneyline for bettors across the season. The red bars show worst-line appearances. Notice Bet365 leads both categories.
Each Book's Average Deviation from the Market Consensus
Positive (blue) means this book's odds implied a lower probability than average (better for bettors). Negative (red) means the book priced the team as more likely to win than peers did, giving bettors worse value.
04
Polymarket: Can a Prediction Market Beat the Bookmakers?
$1.14 billion in trading volume · 285 games · 4-model comparison

Polymarket is a decentralised prediction market where anyone can bet real money on sports outcomes — with no bookmaker taking a cut. In theory, the collective wisdom of thousands of informed bettors should produce more accurate probabilities than any single model. We tracked $1.14 billion in Polymarket trading across all 285 NFL 2025 games and compared its accuracy to ESPN and the sportsbook consensus.

Total Season Trading Volume
$1.14B
74.9% on moneylines, 18.2% on spreads, 5.1% on totals. The Super Bowl alone attracted $55.3M.
Super Bowl Volume
$55.3M
20 times the season median. The market was open for 13 days and the probability barely moved — opening at 67.5% for Seattle and closing at 67.5%.
Average Divergence vs. ESPN
10.2pp
Polymarket and ESPN disagreed by more than 5 percentage points in over half (53%) of all games.
Best-Calibrated Volume Tier
Low Volume
Games with the least trading (bottom 25%, median $1.2M) were the most accurately priced — counterintuitively, more money flowing in didn't improve accuracy.
$1.1 Billion in Bets, Yet No Better Than ESPN
Despite attracting over a billion dollars in trading volume, Polymarket's accuracy (Brier Score 0.2224) is virtually identical to ESPN's (0.2216) and both trail the sportsbook consensus (0.2103) by a meaningful margin. This is one of the most surprising findings in the entire project. The "wisdom of crowds" failed to outperform professional bookmakers even with massive liquidity. The likely explanation: Polymarket's participants, though numerous, may be less systematically informed than bookmakers who have every financial incentive to price games correctly. Additionally, low-volume games — which tend to involve heavy favourites with predictable outcomes — are actually the most accurately priced on Polymarket, suggesting that crowd input is most useful when the outcome is already obvious.
Does More Trading Volume Mean More Accurate Prices?
Each bar shows Polymarket's prediction accuracy (Brier Score, lower = better) for games in each volume tier. Surprisingly, the lowest-volume games (likely heavy-favourite matchups where the outcome was obvious) were the most accurately priced.
Top 10 Most-Traded Games on Polymarket
05
Which Teams Beat the Spread — and Which Are Traps?
All 32 teams · Cover rates · Home vs. Away splits · Over/Under records

The point spread is the sportsbooks' way of levelling the playing field: they set a margin the favourite must win by. Covering the spread means performing better than that expectation. In a perfectly efficient market, every team would cover the spread about 50% of the time. When a team deviates significantly from that, it shows which teams the market consistently over- or underestimated all season.
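As a small illustration of what "covering" means, the check below adds the spread to a team's final margin. The convention that a negative spread marks the favourite is standard, but the function and the scores are hypothetical examples, not data from the report.

```python
def covers(team_score: int, opp_score: int, spread: float) -> str:
    """Grade a spread bet on a team.

    spread is quoted from the team's perspective: -6.5 means the team is
    favoured by 6.5 points, +6.5 means it is a 6.5-point underdog.
    """
    adjusted_margin = team_score - opp_score + spread
    if adjusted_margin > 0:
        return "cover"
    if adjusted_margin < 0:
        return "no cover"
    return "push"


# A 6.5-point favourite that wins 27-24 wins the game but fails to cover.
favourite_result = covers(27, 24, -6.5)
# The underdog in the same game loses outright but covers.
underdog_result = covers(24, 27, 6.5)
print(favourite_result, underdog_result)
```

The first case is the pattern behind a "trap" team: a win on the scoreboard that is still a loss against the spread.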

Seattle Seahawks Cover Rate ★
80.0%
16 covers in 20 games. The only statistically significant result in the entire dataset (p = 0.012). The market consistently underpriced Seattle all season.
Green Bay Packers Cover Rate
27.8%
Only 5 covers in 18 games, despite winning exactly half their games. The public overestimated them all year.
Buffalo Bills (Classic Trap Team)
42.1%
Won 68.4% of games outright but covered only 42.1% of the time. Public money inflated their spread beyond what even a good team could beat.
New England Patriots Away Cover
88.9%
8 of 9 road games covered. The market kept underestimating them as road underdogs throughout the season — right up to and including the Super Bowl.
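One way to check whether a cover rate differs from the 50% benchmark is an exact two-sided binomial test. The sketch below is an assumption about the kind of test involved, since the report does not state which one it used; applied to 16 covers in 20 games it gives a p-value that rounds to 0.012.

```python
from math import comb


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than the
    observed count k under the null hypothesis of success probability p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties.
    return sum(x for x in pmf if x <= observed * (1 + 1e-9))


# 16 covers in 20 games, tested against a 50% cover rate.
p_value = binom_two_sided_p(16, 20)
print(round(p_value, 3))
```

The same function can be applied to any other team's record, which makes it easy to see how few 18- to 20-game samples clear a conventional 5% threshold.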
The Trap Team Phenomenon: Winning Doesn't Mean Covering
The most practically useful insight for anyone betting on football: winning a game and covering the spread are completely different things. The Buffalo Bills won 68.4% of their games, one of the best records in the league, yet covered only 42.1% of the time. A likely reason is that their strong brand and recognisable roster attracted so much public money that sportsbooks consistently set their spread too high. Betting on Buffalo to win outright was reasonable; betting them against the spread was a losing proposition all season. The Green Bay Packers were a different kind of trap: the market priced them as roughly a .500 team (which they were), yet the spreads still consistently overrated them, resulting in the worst cover rate in the dataset at 27.8%.
Playoffs: Home Teams Dominated the Spread
In the regular season, away teams and home teams covered the spread at nearly identical rates (48.8% vs. 51.2%). In the playoffs the balance broke down: away teams covered 0 of 4 Divisional games and 0 of 2 Conference Championship games. That is six consecutive playoff games in which the spread underestimated the home team's advantage. Six games is a very small sample, but it suggests home field may be underpriced in high-stakes playoff matchups: teams that earned home-field advantage through winning records may receive less spread credit than they deserve.
Cover Rate for All 32 Teams — Full Season
Blue = teams covering above 50% (market underrated them). Red = teams covering below 50% (market overrated them). The dashed line at 50% is the theoretical break-even.
Win Rate vs. Cover Rate — Spotting Traps and Value
Each dot is a team. Red dots (bottom-right) are trap teams: they win often but their spread is too high. Blue dots (top-left) cover well despite losing records (value plays). Orange is Seattle, the standout in both dimensions.
Away Team Cover Rate by Week — Did the Market Get Better Over the Season?
Blue weeks: away teams beat the spread more than 50% of the time. Red weeks: home teams dominated. In the Divisional (DIV) and Conference Championship (CC) rounds, away teams covered 0% of the time, a clean sweep by home teams.
Over/Under Rate by Team — Who Plays in High-Scoring Games?
Red bars: the total consistently went over in these teams' games (market underestimated scoring). Blue bars: games consistently went under. The Kansas City Chiefs had the lowest over rate in the league at 23.5% — their games were consistently priced too high on expected scoring.
06
Sharp Money: Following the Professionals
Line movement signals · Reverse line movement · Dual-market analysis

"Sharp" bettors are professionals who bet large amounts based on sophisticated analysis. When they place significant wagers, sportsbooks adjust their spread to limit exposure. By tracking how much the spread moved from the opening line to the closing line, we can identify which games attracted sharp attention — and test whether the sharp side actually covered the spread more often.

Line Movement Predicts Outcomes — Especially Strong Movements
When the spread moved more than 1 point in a direction, the team that money moved toward covered the spread at above-50% rates. The signal gets stronger as the movement increases: spreads moving 1–3 points toward a team produced 65% cover rates (away) and 59.2% (home). Movements greater than 3 points toward the home team produced a 66.7% cover rate in the 21 games where it occurred. This confirms that professional bettors moving lines were generally right — but extreme away movements (>3pts) appear driven by injury news rather than pure analytical edge, and those produced only a 50% cover rate.
Houston @ Baltimore, Week 5 — The Sharpest Game of the Season
The most extreme sharp money event in 285 games: Houston opened as 9.5-point underdogs at Baltimore. Over the following days, massive sharp action moved the spread 12.7 points — the largest movement in the entire dataset — with Houston closing as 3.2-point favourites. Simultaneously, Polymarket moved Houston's win probability from 20.5% to 53.5% — a 33-point shift. Two completely independent markets, moving simultaneously in the same direction, signalling the same thing. Houston won the game outright by double digits. Both markets were right.
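The line-movement measure can be sketched as a simple open-to-close difference. The sign convention and bucket boundaries below are assumptions chosen to match the thresholds mentioned in this section (more than 1 point, 1–3 points, more than 3 points); the Houston figures are the ones quoted above.

```python
def movement_toward_away(open_spread_away: float, close_spread_away: float) -> float:
    """Points the spread moved toward the away team (negative = toward the home team).

    Spreads are quoted from the away team's perspective: +9.5 means the away
    team opened as a 9.5-point underdog, -3.2 means it closed as a favourite.
    """
    return open_spread_away - close_spread_away


def movement_bucket(move: float) -> str:
    """Group a movement by size and direction (assumed boundaries)."""
    size = abs(move)
    side = "away" if move > 0 else "home"
    if size <= 1:
        return "no signal"
    if size <= 3:
        return f"1-3 pts toward {side}"
    return f">3 pts toward {side}"


# Houston (away) opened at +9.5 and closed at -3.2.
move = movement_toward_away(9.5, -3.2)
print(round(move, 1), movement_bucket(move))
```

Run on the Houston numbers, the movement comes out at 12.7 points toward the away team, which places the game in the most extreme bucket.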
When Sportsbooks and Polymarket Disagreed — The Most Extreme Finding
In 8 games, the sportsbook spread and Polymarket moved in opposite directions — one market said sharp money was on the away team while the other suggested sharp money was on the home team. In 7 of those 8 games (87.5%), the away team covered the spread. While 8 games is a small sample, this 87.5% rate is the most extreme result in the entire dataset. When two independent sharp markets contradict each other, it may signal that sportsbooks were reacting to public money while Polymarket reflected more informed positioning — and the away team benefited from that confusion.
Cover Rate for the Sharp Side — By How Much the Line Moved
Each bar shows the cover rate for the team that money moved toward, grouped by movement size. Bigger movements = stronger signal, except for extreme away movements which appear driven by news rather than analysis.
Dual Signal: When Sportsbooks and Polymarket Agree (or Disagree)
The orange bar shows what happened when the two markets moved in opposite directions: a rare 87.5% cover rate for the away team in 8 games. The base rate (grey) is 48.8% for the whole season.
07
Game Explorer
All 285 games

Browse every game of the 2025 NFL season. Each entry shows the final score, how the pre-game probabilities compared across ESPN, the sportsbook consensus, and Polymarket, how much the spread moved before kickoff, and whether sharp money or unusual market signals were present.

08
Final Thoughts & Future Work
Key takeaways · What we'd do next

What We Found: Five Key Takeaways

1. The sportsbook consensus is the most accurate probability model in the dataset. The 7-book consensus outperformed ESPN, Polymarket, and every individual book. Professional bookmakers, who must price correctly or lose money, collectively produced the best pre-game forecasts.

2. ESPN's win probabilities should not be used as a betting signal. The model systematically overstates the chances of heavy underdogs, so the bets with the highest ESPN-suggested Expected Value were the ones most likely to lose. One possible explanation is that ESPN optimises for broadcast context (making games sound competitive) rather than betting accuracy.

3. Line shopping matters — a lot. The difference between the best and worst available odds was worth $32,785 over the full season across all games on a $100-per-bet model. In practice, a serious bettor placing larger wagers across multiple games every week would see this gap multiply substantially. Always checking multiple books before placing a bet is the single highest-return discipline available.

4. Sharp money is a genuine signal. Spread movements of 1–3 points correctly predicted cover direction roughly 60–65% of the time. Reverse line movement (where public money floods one side but the line moves the other way) is the strongest signal: underdogs showing RLM covered at 63.9%, versus a 47.3% base rate. The dual-signal approach — combining sportsbook line movement with Polymarket probability movement — adds further confirmation when both markets agree.

5. Seattle's 80% cover rate was the real anomaly. One team, across 20 games, covering at a statistically significant rate (p=0.012) is exceptional in a dataset this size. Whether that reflects genuine market mispricing or a one-season anomaly that reverts to normal in 2026 is the most interesting open question.

Limitations of This Analysis

All findings are based on a single NFL season (285 games). Statistical significance requires large samples: a 60% cover rate over 40 games could be noise. The sharp money signals identified here should be validated against multiple seasons before being treated as reliable edges. Polymarket's poor relative performance may partly reflect that its market opened weeks before game day, while sportsbook closing lines are set moments before kickoff — a comparison advantage for the sportsbooks. Future work should compare Polymarket's same-day closing prices against closing spreads.

Possible Future Work

  • Multi-season validation. Run the same analysis across the 2022–2024 seasons to test whether Seattle's cover rate, the sharp money signals, and the trap team patterns are consistent or one-year anomalies.
  • Player-level injury impact on Polymarket. Build a dataset linking injury reports to Polymarket price movements. The ATL @ NO reversal pattern (probability crashing on a quarterback injury then recovering) was one of the most interesting signals — a systematic study could identify how much each starting quarterback is worth in market probability terms.
  • Intra-game live betting. This project analysed only pre-game probabilities. Applying the same calibration framework to live (in-game) betting markets — where information updates in real time — could reveal whether the accuracy gap between ESPN and bookmakers widens or closes during a game.
  • Weather and travel data. Adding structured weather data (temperature, wind, precipitation) and travel distance as features to a predictive model could improve pre-game accuracy beyond what any single probability source currently provides.
  • Machine learning on the combined dataset. Combining ESPN probability, sportsbook consensus, Polymarket, spread movement, and team ATS history into a single regression or gradient boosting model could produce a meta-predictor that outperforms every individual source.
  • Other sports. The Polymarket API infrastructure built for this project can be directly applied to NBA, Premier League, or other sports with active Polymarket markets — enabling direct cross-sport comparisons of prediction market accuracy.