Spend Your Time on Your Dataset, not Your Model

I’m writing this because discussions about ML in sports betting revolve too often around models. It seems most people spend their time obsessing over model architecture. They think fine-tuning neural networks (don’t use NNs for sports betting; they’re [mostly] a waste of time), optimizing XGBoost parameters, or experimenting endlessly with ensembles is the key to success. But here’s the truth: it’s not. That’s the easy part.

The hard work is engineering data:

  • Collecting it.
  • Building databases that work at scale.
  • Creating features that teach your model what really matters.
  • Making sure inference is running correctly.

    This 2-pager touches on the third component, creating features. Your model doesn’t magically know what’s important. It only learns what you teach it through your data and features. And creating metrics that tell your model, “This matters,” is where the edge lies.

    Why Features Matter More Than Models

    In sports betting, where markets are efficient, most basic model architectures will perform similarly when given the same inputs. The key differentiator isn't whether you're using LightGBM, XGBoost, or some ridiculous ensemble you over-engineered - it's how you've engineered your features.

    Think about it:

  • Basic stats are available to everyone
  • Standard ML implementations are well documented
  • The math behind common models is widely understood
  • But feature engineering? It's still complicated. Even with the help of GPT and Claude, it took me months to crack and code our ELO (strength) metric.
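
    To make the idea concrete, here is a minimal Elo-style strength update. This is a generic sketch of the standard Elo formula, not our actual metric; the K-factor and home-edge values are hypothetical placeholders you would tune on historical data.

```python
# Minimal Elo-style team strength update.
# K and HOME_EDGE are hypothetical; tune both on historical results.
K = 20.0         # update speed per game
HOME_EDGE = 65.0 # Elo points credited to the home team's expected score

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_elo(home: float, away: float, home_won: bool) -> tuple[float, float]:
    """Return updated (home, away) ratings after one game."""
    exp_home = expected_score(home + HOME_EDGE, away)
    actual = 1.0 if home_won else 0.0
    delta = K * (actual - exp_home)
    return home + delta, away - delta

new_home, new_away = update_elo(1500.0, 1500.0, home_won=True)
```

    The real work isn't these ten lines - it's validating the rating against closing lines, handling roster turnover, and deciding what counts as a "game" for update purposes.
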

    The Power of Complex Metrics

    If you’re serious about beating the market, you can’t stop at basic stats. You need to go deeper, building complex metrics that reflect the nuances of the game.

    Here are some examples (but the list can go on and on):

    Lineup Strength

    Understanding the impact of different player combinations is critical. Some lineups amplify team strengths, while others highlight weaknesses. Capturing this dynamically—based on performance, roles, and synergy—is far more powerful than relying on individual player metrics alone.
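
    One simple starting point is aggregating stint-level data into a per-lineup net rating. The data layout and player names below are purely illustrative:

```python
# Sketch: lineup net rating from stint-level data.
# The stint tuples and player names are illustrative, not a real data source.
from collections import defaultdict

# Each stint: (lineup, points_for, points_against, possessions)
stints = [
    (("A", "B", "C", "D", "E"), 12, 8, 10),
    (("A", "B", "C", "D", "E"), 15, 18, 12),
    (("A", "B", "C", "D", "F"), 6, 10, 8),
]

totals = defaultdict(lambda: [0, 0, 0])  # lineup -> [pf, pa, possessions]
for lineup, pf, pa, poss in stints:
    key = tuple(sorted(lineup))  # order on the floor doesn't matter
    totals[key][0] += pf
    totals[key][1] += pa
    totals[key][2] += poss

def net_rating(lineup) -> float:
    """Point differential per 100 possessions for a lineup."""
    pf, pa, poss = totals[tuple(sorted(lineup))]
    return 100.0 * (pf - pa) / poss

starters_net = net_rating(("A", "B", "C", "D", "E"))
```

    In practice you would also regress small-sample lineups toward a prior, since a bench unit with 30 possessions tells you very little on its own.
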

    Player Strength

    Metrics like DARKO or Bball Index’s LEBRON quantify individual player impact, but they require months of research, coding, and validation. At Sharps Research, creating our Player Strength Metric involved digging into every aspect of performance:

  • Efficiency metrics like points per possession.
  • Contextual adjustments for pace and opponent quality.
  • Tracking trends over time to account for streaks and slumps.
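
    As a toy illustration of the first two bullets, here is points per possession with a simple opponent-quality scaling. The adjustment formula and the league-average constant are hypothetical, not our production metric:

```python
# Sketch: pace- and opponent-adjusted scoring efficiency.
# LEAGUE_PPP and the scaling formula are illustrative assumptions.
LEAGUE_PPP = 1.10  # hypothetical league-average points per possession

def points_per_possession(points: float, possessions: float) -> float:
    """Rate stat: removes pace effects baked into per-game totals."""
    return points / possessions

def opponent_adjusted_ppp(ppp: float, opp_def_ppp: float) -> float:
    """Scale raw PPP by opponent defense vs. league average:
    scoring on a stingy defense earns more credit than on a sieve."""
    return ppp * (LEAGUE_PPP / opp_def_ppp)

raw = points_per_possession(112, 100)           # 1.12 PPP
vs_elite_d = opponent_adjusted_ppp(raw, 1.05)   # tough defense -> credit up
vs_weak_d = opponent_adjusted_ppp(raw, 1.18)    # soft defense -> credit down
```
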

    Depth Strength

    Bench contributions are often overlooked, but they’re critical for sustained success. Depth metrics evaluate how much value a team’s bench adds or subtracts, adjusting for matchups and roles.

    Game Context

    Beyond individual or team strength, understanding what really matters in a game—and creating metrics to reflect that—is key. Examples include:

  • Adjusting shooting metrics for defensive pressure.
  • Quantifying momentum using possession-by-possession volatility.
  • Evaluating clutch performance in high-pressure moments.
  • Etc.
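
    As one example of the momentum bullet, you could define "momentum" as rolling volatility of the possession-by-possession scoring margin. This is purely illustrative; real play-by-play parsing is far messier:

```python
# Sketch: momentum as rolling volatility of per-possession margin changes.
# The toy data and window size are illustrative assumptions.
import statistics

def possession_volatility(margin_changes: list[float], window: int = 10) -> list[float]:
    """Rolling (population) standard deviation of per-possession margin changes."""
    out = []
    for i in range(window, len(margin_changes) + 1):
        out.append(statistics.pstdev(margin_changes[i - window:i]))
    return out

calm = [0, 2, 0, -2, 0, 2, 0, -2, 0, 2, 0, -2]       # steady trading of buckets
swingy = [3, -3, 3, -3, 3, -3, 3, -3, 3, -3, 3, -3]  # big swings every trip

vol_calm = possession_volatility(calm)
vol_swingy = possession_volatility(swingy)
```
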

    The list of features that can be added to a sports model is long. The list above doesn't even scratch the surface of what can be done with temporal features, matchup tension, etc.

    Bias Hunting: Reducing Noise and Improving Fairness

    Every dataset has biases lurking beneath the surface. If you don’t address them, your model will make predictions based on those biases, not the underlying truth.

    Here’s a common example: last-10-games stats.

  • Team A played 10 very strong opponents.
  • Team B played 10 weak opponents.
  • Both teams show the same effective field goal percentage (eFG%).

    If your feature only tracks eFG% over the last 10 games, your model will treat both teams as equally strong shooters. But that’s wrong—the stat doesn’t reflect the context of the competition.

    Fixing Bias: Add Contextual Features

    To reduce bias, you could create a last_10_opponent_strength feature. This metric captures the quality of competition each team faced and adjusts other stats accordingly. For example:

  • Team A’s adjusted eFG% would be weighted against the strong defenses they faced.
  • Team B’s adjusted eFG% would reflect the weaker defenses.

    By incorporating opponent strength, you eliminate the bias and allow your model to make fairer comparisons.
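
    A minimal version of that adjustment might look like this. The multiplicative scaling and the league-average constant are hypothetical; the point is only that identical raw eFG% diverges once opponent defense enters the feature:

```python
# Sketch: opponent-adjusted last-10 eFG%.
# LEAGUE_EFG and the scaling rule are illustrative assumptions.
LEAGUE_EFG = 0.54  # hypothetical league-average eFG%

def adjusted_efg(team_efg: list[float], opp_def_efg: list[float]) -> float:
    """Mean eFG% where each game is scaled by opponent defense:
    55% against a defense allowing 50% counts for more than
    55% against one allowing 58%."""
    adj = [efg * (LEAGUE_EFG / opp) for efg, opp in zip(team_efg, opp_def_efg)]
    return sum(adj) / len(adj)

same_raw = [0.55] * 10                            # identical raw shooting
team_a = adjusted_efg(same_raw, [0.50] * 10)      # faced strong defenses
team_b = adjusted_efg(same_raw, [0.58] * 10)      # faced weak defenses
```

    Team A now grades out as the better shooting team, which matches the intuition the raw stat missed.
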

    Optimizing Features: An (Overly Simplistic) Team Shooting Example

    Let's walk through optimizing a team's shooting efficiency (eFG%). Most people just calculate a 10-game average and call it a day. Then they'll blame their model when they're not beating the market. Let's do better.


    Step 1: Testing Different Calculations

  • Basic average (what everyone does)
  • Median (better with outliers)
  • Exponential decay (recent games matter more)
  • Other weighting schemes (e.g., linear decay)

    Step 2: Building Test Models

  • Create models with each calculation method
  • Compare prediction accuracy
  • Check feature importance
  • Verify stability across seasons

    Step 3: Window Size Optimization

  • Test 9-game windows
  • Test 11-game windows
  • Test 12-game windows
  • Let the data tell us what works
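
    "Let the data tell us" can be as simple as scoring each window size by one-step-ahead prediction error on history. The season data here is synthetic and the evaluation is deliberately naive (a real test would use proper out-of-sample splits across seasons):

```python
# Sketch: pick a rolling-window size by one-step-ahead MAE.
# Synthetic season data; a real backtest would span multiple seasons.
import random

random.seed(0)
true_efg = 0.54
season = [true_efg + random.gauss(0, 0.04) for _ in range(82)]

def one_step_mae(series: list[float], window: int) -> float:
    """Predict each game's eFG% as the mean of the prior `window` games."""
    errs = []
    for i in range(window, len(series)):
        pred = sum(series[i - window:i]) / window
        errs.append(abs(series[i] - pred))
    return sum(errs) / len(errs)

scores = {w: one_step_mae(season, w) for w in (9, 10, 11, 12)}
best_window = min(scores, key=scores.get)
```
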

    Step 4: Further Refinements

  • Adjust for opponent strength
  • Split home/away performance
  • Consider rest days
  • Add season phase context

    Iteration

    Every feature can be improved through iteration. Take something basic like points scored. You could evolve it into points per possession. Then adjust for pace. Then opponent strength. Then recency weight. Each step making it slightly more valuable.
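
    That evolution can be written as a pipeline of small transforms. All constants below are hypothetical, and the per-game data is invented; the point is the layering, not the numbers:

```python
# Sketch: evolving "points scored" into a richer feature, one step at a time.
# League constants, decay rate, and game data are illustrative assumptions.
LEAGUE_PPP = 1.10  # hypothetical league-average points per possession

def per_possession(points: float, possessions: float) -> float:
    """Step 1: raw points -> points per possession (absorbs pace)."""
    return points / possessions

def opponent_adjusted(ppp: float, opp_def_ppp: float) -> float:
    """Step 2: scale by opponent defense relative to league average."""
    return ppp * (LEAGUE_PPP / opp_def_ppp)

def recency_weighted(values: list[float], alpha: float = 0.85) -> float:
    """Step 3: exponential-decay mean over games, oldest -> newest."""
    weights = [alpha ** (len(values) - 1 - i) for i in range(len(values))]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# (points, possessions, opponent defensive PPP) per game, oldest first
games = [(110, 98, 1.12), (121, 104, 1.05), (99, 95, 1.15)]
feature = recency_weighted(
    [opponent_adjusted(per_possession(p, n), d) for p, n, d in games]
)
```

    Each layer is cheap to add and cheap to test; the compounding of many small refinements like this is what separates a feature set from a box score.
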

    Conclusion

    While everyone else is arguing about model architecture or trying to tune the perfect neural network, focus on your features. That's where the money is: everyone and their grandma can use an LLM to code a model today.

    The market doesn't care how complex your model is. It only cares if you can predict better than everyone else. And that starts with better features.