At Sharps Research, we recently encountered a challenge that exposed a blind spot in how we approach transparency with our users. One of our models was live with a subtle error that wasn’t immediately apparent. It wasn’t until a user dug deep into its predictions that we realized something was wrong. Debugging the issue has been a complex process, but for anyone providing predictions it underscored a crucial point: transparency isn’t optional—it’s essential.

The Cost of Limited Transparency

The issue with the model wasn’t just the error itself; it was the time and opportunity lost in identifying and fixing it. As models grow more complex—with layers of features, intricate preprocessing pipelines, and advanced inference logic—detecting and resolving issues consumes significant resources. A subtle shift in feature distribution or a hidden preprocessing error can easily spiral into an exhaustive and frustrating search for the root cause.

But the cost of limited transparency doesn’t stop with us. It’s not just a barrier for model builders; it fundamentally robs users of insights. When users don’t have access to the data driving predictions, the metrics used in training, or the assumptions baked into the model, they’re left in the dark. Transparency doesn’t just benefit model builders in debugging—it’s a tool for users to understand, evaluate, and trust the models they rely on.

And here’s the uncomfortable truth: it’s far too easy to fake prediction models in this space. A lack of transparency allows anyone to claim high accuracy without ever showing their work. It’s a smokescreen—an easy way to build hype without accountability. Users are left with no way to verify if a model is robust or just a cleverly curated set of outputs.

This principle extends beyond sports betting into all areas of AI and Machine Learning. Whether it’s predicting NBA moneylines, diagnosing medical conditions, automating agriculture, or forecasting financial outcomes, a lack of transparency creates an environment ripe for deception. Without visibility, users are forced to blindly trust outcomes, unable to evaluate whether predictions are reliable, fair, or even grounded in reality.

At Sharps Research, we’ve reflected on this and recognize that extreme transparency isn’t optional—it’s a responsibility. If we had shared more real-time information about the data, metrics, and assumptions behind our model, the error could have been caught sooner. More importantly, users would have been empowered to explore and understand the model’s logic for themselves, fostering trust and collaboration.

Transparency isn’t just about fixing errors—it’s about ensuring no one is left guessing. Whether in sports betting or AI at large, models should never be black boxes. In a world where it’s easy to fake results, transparency is the only safeguard that separates credible models from hollow claims. Users deserve the tools to see, question, and truly understand the systems that impact their decisions, especially when they’re relying on those systems to help with a financial decision.

What Transparency Looks Like For Us Going Forward

Transparency isn’t just about sharing numbers; it’s about providing meaningful context. Going forward, every prediction on Sharps Research will come with:

  • Exact Data Inputs: Users will see the exact data that was fed into the model for any given prediction. No black boxes.
  • High-Level Metrics with Context: While we’ll still provide precision, recall, and ROC AUC for those who understand them, we’ll supplement this with plain-language explanations about what the metrics mean in practice.
  • Training Data Disclosure: We’ll explain what data was used to train the model, its scope, and its limitations. For example, if the model is trained on NBA data from 2009 to 2023 but struggles with edge cases like unusual lineups, users will know upfront.
  • Inference Details: We’ll detail how the model processes data during inference, giving users confidence that predictions are based on consistent, high-quality inputs.

Honestly, this should be the minimum for anyone in the space: anyone providing a prediction on a sporting event should openly share this information. Over the next couple of months we will have all of this in place, but we are coding it from scratch, so it will take a minute. A sketch of what this could look like for a single prediction is below.
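
The sketch is a minimal illustration only: the field names, metric values, and overall structure are assumptions made for this post, not our production schema.

```python
import json

# Hypothetical per-prediction transparency record.
# All field names and values are illustrative assumptions, not a production schema.
prediction_record = {
    "game": "Team A vs Team B",
    "prediction": {"pick": "Team A moneyline", "win_probability": 0.61},
    # Exact data inputs: the feature values fed into the model for this prediction.
    "inputs": {
        "team_a_recent_def_rating": 110.2,
        "team_a_recent_off_rating": 118.6,
        "team_a_back_to_back": False,
        "team_b_recent_def_rating": 114.1,
        "team_b_recent_off_rating": 112.9,
        "team_b_back_to_back": True,
    },
    # High-level metrics, paired with plain-language context.
    "model_metrics": {
        "precision": 0.60,
        "recall": 0.57,
        "roc_auc": 0.65,
        "in_plain_terms": "Meaningfully better than a coin flip on held-out games, not a crystal ball.",
    },
    # Training data disclosure: scope and known limitations.
    "training_data": {
        "scope": "NBA data, 2009-2023",
        "known_limitations": ["edge cases such as unusual lineups"],
    },
    # Inference details: how inputs are processed at prediction time.
    "inference": {
        "preprocessing": "same pipeline as training",
        "model_version": "example-version",
    },
}

print(json.dumps(prediction_record, indent=2))
```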

Balancing Transparency and Protecting Our Edge

People often ask us, "Aren't you giving away your secret sauce by being so open about your models?" It's a fair question. Here's the reality:

Let's take a simple hypothetical example. For tonight's Celtics vs Lakers game prediction, we'll show you exactly what data we're using:

  • Celtics recent defensive rating: 108.4
  • Celtics recent offensive rating: 121.2
  • Celtics back-to-back: No
  • Celtics SharpsELO: 92.1
  • Lakers recent defensive rating: 112.8
  • Lakers recent offensive rating: 117.3
  • Lakers back-to-back: Yes
  • Lakers SharpsELO: 85.4

We'll show you all of these exact numbers and how they factor into our prediction. You can see that the data loaded properly and that the numbers make sense: there isn't a ‘NaN’, and the Celtics' defensive rating isn't 0 or 1500. See that SharpsELO rating? It synthesizes years of data and complex patterns into a single number. You can see it, but good luck recreating how we calculate it. That's the point – we can be completely open about what goes into our models while our real edge stays protected. A sketch of that kind of sanity check is below.
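
This is a minimal Python illustration of a "do the numbers make sense" check: the function, the feature-name suffixes, and the valid ranges (including the 0-100 range assumed here for SharpsELO) are assumptions for this post, not our actual validation code.

```python
import math

# Illustrative valid ranges, keyed by feature-name suffix; not our actual thresholds.
VALID_RANGES = {
    "recent_def_rating": (90.0, 130.0),
    "recent_off_rating": (90.0, 130.0),
    "sharpselo": (0.0, 100.0),
}

def validate_inputs(features: dict) -> list[str]:
    """Return a list of problems found in one prediction's input features."""
    problems = []
    for name, value in features.items():
        # Catch missing or corrupted values before they reach the model.
        if isinstance(value, float) and math.isnan(value):
            problems.append(f"{name} is NaN")
            continue
        # Flag values that loaded but are outside a plausible range.
        for suffix, (low, high) in VALID_RANGES.items():
            if name.endswith(suffix) and not (low <= value <= high):
                problems.append(f"{name}={value} is outside the expected range [{low}, {high}]")
    return problems

# The Celtics/Lakers inputs above pass; a defensive rating of 0 or 1500 would not.
inputs = {
    "celtics_recent_def_rating": 108.4,
    "celtics_recent_off_rating": 121.2,
    "celtics_sharpselo": 92.1,
    "lakers_recent_def_rating": 112.8,
    "lakers_recent_off_rating": 117.3,
    "lakers_sharpselo": 85.4,
}
print(validate_inputs(inputs))  # [] -> no problems found
```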

The Real Risks and How We Handle Them

Yes, there are risks. People might try to game our models. Competitors might try to copy us. And sometimes, too much information confuses users rather than helps them.

We handle this through monitoring for suspicious patterns, sharing information in layers so users can dig as deep as they need without getting overwhelmed, and maintaining strong protection of our core technical advantages. The reality is that most attempts to game a model fail because of implementation complexity, not because of a lack of understanding.

Why (We Think) Being Open Works Better

Most betting models are black boxes that just say "trust us." That's not good enough. When users understand our models, they help us make them better - spotting problems faster, giving better feedback, and understanding both the strengths and limitations of our predictions.

We're convinced that being open creates better outcomes. The risks of being secretive – losing trust, improving too slowly, and creating confusion in the market – are worse than the manageable risks of being open about what we do.

In an industry where anyone can claim sky-high accuracy without showing their work, we choose to be different. Not because it's noble, but because it works better.

Finally, thanks to Felix for bringing the error to our attention.