Sportsbook Pricing vs. Prediction Markets
Six notebooks. 285 games. One question: who actually priced the 2025 NFL season correctly — ESPN, the sportsbooks, or a $1.14B prediction market? This report works through each model, tests them against outcomes, and follows the sharp money.
Before each NFL game, ESPN publishes a win probability for both teams. Sportsbooks publish moneyline odds (the prices you bet at). By converting the sportsbook odds into implied probabilities and comparing them to ESPN's model, we can calculate the "Expected Value" (EV) of each bet. A positive EV means ESPN thinks the team is more likely to win than the sportsbook price implies. The question: does betting on high ESPN-suggested EV actually make money in the long run?
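The conversion described above can be sketched as follows. This is a minimal illustration, assuming American-style moneyline odds and an EV expressed per $1 staked; the function names are hypothetical and the report's notebooks may compute EV differently (for example, after removing the bookmaker's margin).

```python
def implied_probability(moneyline: int) -> float:
    """Convert American moneyline odds to an implied win probability.

    The result still includes the bookmaker's margin (the "vig").
    """
    if moneyline < 0:
        return -moneyline / (-moneyline + 100)
    return 100 / (moneyline + 100)


def profit_per_dollar(moneyline: int) -> float:
    """Profit on a winning $1 bet at the given American odds."""
    if moneyline < 0:
        return 100 / -moneyline
    return moneyline / 100


def expected_value(model_prob: float, moneyline: int) -> float:
    """EV per $1 staked, treating model_prob (e.g. ESPN's) as the true win chance."""
    return model_prob * profit_per_dollar(moneyline) - (1 - model_prob)


# Illustrative numbers only: a +150 underdog that the model gives a 45% chance.
print(implied_probability(150))    # 0.4
print(expected_value(0.45, 150))   # about 0.125, i.e. +12.5 cents per $1 staked
```

When the model probability equals the implied probability, the EV is zero, which is why a positive EV corresponds to the model rating the team above the sportsbook price.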
The Brier Score measures how accurate a probability model is (lower is better). A score of 0.25 is what you'd get from always guessing 50/50 (completely useless). The Brier Skill Score (BSS) shows improvement over that 50/50 baseline (higher is better). We scored 17 different models: ESPN, 7 sportsbooks using two different mathematical methods each, Polymarket, and two consensus averages.
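As a sketch of the two metrics, the snippet below uses the standard definitions: the Brier Score as the mean squared error between forecast probability and outcome, and the BSS as 1 minus the ratio of the model's score to the 0.25 baseline. That the report uses exactly this BSS formula is an assumption, and the sample forecasts are made up for illustration.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes, baseline=0.25):
    """Improvement over a reference score (0.25 = always forecasting 50/50)."""
    return 1 - brier_score(probs, outcomes) / baseline


# Illustrative forecasts for three games (1 = the team won, 0 = it lost).
probs = [0.8, 0.3, 0.6]
outcomes = [1, 0, 0]
print(brier_score(probs, outcomes))        # about 0.1633
print(brier_skill_score(probs, outcomes))  # about 0.3467
```

A model that always forecasts 50% scores exactly 0.25 and therefore has a BSS of 0; a perfect model scores 0 and has a BSS of 1.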
Across 285 games, sportsbooks often offer slightly different odds on the same outcome. This section identifies which book most frequently offered the best available price, how far each book's odds deviated from the average market consensus, and how much money a bettor who always shopped for the best line could have captured over the full season.
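A minimal sketch of the line-shopping comparison for a single bet is shown below. The book names and odds are placeholders, and measuring the benefit as the difference in winning profit between the best and worst quote on a $100 stake is an assumption about how such a gap could be defined, not the report's confirmed method.

```python
def profit_if_win(moneyline, stake=100.0):
    """Profit on a winning bet of `stake` dollars at American odds."""
    if moneyline < 0:
        return stake * 100 / -moneyline
    return stake * moneyline / 100


def best_line(quotes):
    """Return the (book, moneyline) pair with the highest payout for the bettor."""
    return max(quotes.items(), key=lambda item: profit_if_win(item[1]))


def shopping_gain(quotes, stake=100.0):
    """Extra profit from taking the best quote instead of the worst one."""
    profits = [profit_if_win(ml, stake) for ml in quotes.values()]
    return max(profits) - min(profits)


# Hypothetical quotes for the same team from three books.
quotes = {"BookA": -110, "BookB": -105, "BookC": -115}
print(best_line(quotes))       # ('BookB', -105)
print(shopping_gain(quotes))   # about 8.28 dollars on a $100 stake
```

Summing a per-game gap like this over every game is one way a season-long line-shopping figure could be built up.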
Polymarket is a decentralised prediction market where anyone can bet real money on sports outcomes, with no bookmaker taking a cut. In theory, the collective wisdom of thousands of informed bettors should produce more accurate probabilities than any single model. We tracked $1.14 billion in Polymarket trading across all 285 games of the 2025 NFL season and compared its accuracy to ESPN and the sportsbook consensus.
The point spread is the sportsbooks' way of levelling the playing field: they set a margin the favourite must win by. Covering the spread means performing better than that expectation. In a perfectly efficient market, every team would be expected to cover the spread about 50% of the time. When a team deviates significantly from that, it shows that the market consistently over- or underestimated that team all season.
"Sharp" bettors are professionals who bet large amounts based on sophisticated analysis. When they place significant wagers, sportsbooks adjust their spread to limit exposure. By tracking how much the spread moved from the opening line to the closing line, we can identify which games attracted sharp attention — and test whether the sharp side actually covered the spread more often.
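The test described above can be sketched as follows. Several details are assumptions made for illustration: spreads are quoted from the home team's perspective (negative means the home team is favoured), the "sharp side" is the side the line moved towards, a move must be at least one point to count, and covers are graded against the closing spread. The `Game` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Game:
    open_spread: float   # home-team spread at open (negative = home favoured)
    close_spread: float  # home-team spread at close
    home_margin: int     # home score minus away score


def sharp_side(game, min_move=1.0):
    """Side the line moved towards, or None if the move was too small."""
    move = game.close_spread - game.open_spread
    if abs(move) < min_move:
        return None
    return "home" if move < 0 else "away"


def covering_side(game):
    """Side that covered the closing spread, or None on a push."""
    result = game.home_margin + game.close_spread
    if result == 0:
        return None
    return "home" if result > 0 else "away"


def sharp_cover_rate(games, min_move=1.0):
    """Share of qualifying games in which the sharp side covered."""
    hits = total = 0
    for game in games:
        side, winner = sharp_side(game, min_move), covering_side(game)
        if side is None or winner is None:
            continue  # skip small moves and pushes
        total += 1
        hits += side == winner
    return hits / total if total else None


# Made-up games: two sharp-side covers, one miss, one move too small to count.
games = [Game(-3, -4.5, 7), Game(-3, -4.5, 3), Game(2.5, 4, -10), Game(-1, -1.5, 3)]
print(sharp_cover_rate(games))  # 2 of 3 qualifying games
```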
Browse every game of the 2025 NFL season. Each entry shows the final score, how the pre-game probabilities compared across ESPN, the sportsbook consensus, and Polymarket, how much the spread moved before kickoff, and whether sharp money or unusual market signals were present.
What We Found: Five Key Takeaways
1. Sportsbooks are the most accurate probability model available. The 7-book consensus outperformed ESPN, Polymarket, and every individual book. Professional bookmakers (incentivised to price correctly or lose money) collectively produced the best pre-game forecasts in the dataset.
2. ESPN's win probabilities should not be used as a betting signal. The model systematically overstates the chances of heavy underdogs. Betting on the sides with high ESPN Expected Value actually produced worse-than-random results. One likely explanation is that ESPN optimises for broadcast context (making games sound competitive) rather than betting accuracy.
3. Line shopping matters a lot. On a $100-per-bet model, the difference between the best and worst available odds was worth $32,785 across all games over the full season. A serious bettor placing larger wagers would see this gap scale with stake size. Checking multiple books before placing a bet is the single highest-return discipline identified in this analysis.
4. Sharp money is a genuine signal. Spread movements of 1–3 points correctly predicted cover direction roughly 60–65% of the time. Reverse line movement (RLM), where public money floods one side but the line moves the other way, is the strongest signal: underdogs showing RLM covered at 63.9%, versus a 47.3% base rate. The dual-signal approach, which combines sportsbook line movement with Polymarket probability movement, adds further confirmation when both markets agree.
5. Seattle's 80% cover rate was the real anomaly. One team, across 20 games, covering at a statistically significant rate (p=0.012) is exceptional in a dataset this size. Whether that reflects genuine market mispricing or a one-season anomaly that reverts to normal in 2026 is the most interesting open question.
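A significance figure of this kind can be checked with an exact two-sided binomial test against a 50% cover rate. The sketch below assumes that 80% of 20 games means 16 covers with no pushes; whether the report's p-value was computed this way is an assumption, although the result rounds to the same figure.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely
    than the observed one under the null hypothesis.
    """
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k) for k in range(trials + 1)]
    observed = pmf[successes]
    # Small relative tolerance guards against floating-point ties.
    return sum(x for x in pmf if x <= observed * (1 + 1e-9))


p_value = binomial_two_sided_p(16, 20)
print(round(p_value, 3))  # 0.012
```

This treats one team in isolation; it does not account for the fact that Seattle was singled out from the full set of NFL teams after the season.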
Limitations of This Analysis
All findings are based on a single NFL season (285 games). Small samples are noisy: a 60% cover rate over 40 games could easily be chance. The sharp money signals identified here should be validated against multiple seasons before being treated as reliable edges. Polymarket's relatively poor performance may partly reflect timing: its markets opened weeks before game day, while sportsbook closing lines are set moments before kickoff, which gives the sportsbooks an informational advantage in this comparison. Future work should compare Polymarket's same-day closing prices against sportsbook closing lines.
Possible Future Work
- Multi-season validation. Run the same analysis across 2022–2024 seasons to test whether Seattle's cover rate, the sharp money signals, and the trap team patterns are consistent or one-year anomalies.
- Player-level injury impact on Polymarket. Build a dataset linking injury reports to Polymarket price movements. The ATL @ NO reversal pattern (probability crashing on a quarterback injury then recovering) was one of the most interesting signals — a systematic study could identify how much each starting quarterback is worth in market probability terms.
- In-game live betting. This project analysed only pre-game probabilities. Applying the same calibration framework to live (in-game) betting markets, where information updates in real time, could reveal whether the accuracy gap between ESPN and bookmakers widens or narrows during a game.
- Weather and travel data. Adding structured weather data (temperature, wind, precipitation) and travel distance as features to a predictive model could improve pre-game accuracy beyond what any single probability source currently provides.
- Machine learning on the combined dataset. Combining ESPN probability, sportsbook consensus, Polymarket, spread movement, and team ATS history into a single regression or gradient boosting model could produce a meta-predictor that outperforms every individual source.
- Other sports. The Polymarket API infrastructure built for this project can be directly applied to NBA, Premier League, or other sports with active Polymarket markets — enabling direct cross-sport comparisons of prediction market accuracy.