
Build at scale or experiment first?

Introduction

 

A proof of concept (POC) is a miniature project aimed at demonstrating the value of a concept before fully committing to it. It is common practice to use this type of experiment to explore possible ways forward, discover pitfalls, and convince your organisation that you are on the right track. Unsurprisingly, software development and data science have many similar and related concepts, such as “AI experiments” and “minimum viable products”, but these terms will not be explored in depth in this blog post. Rather, we’ll focus on when to commit to a POC and when it is better to build at scale from the start.

 

When POCs are worth your while and when they are a waste of time

 

As with most things, POC projects are common for a reason. They are great tools for testing novel ideas and minimising the risk of spending lots of effort and money on something that turns out to be completely useless. If you come up with an idea that you are sure none of your competitors, or even companies in other sectors, have thought of, it is reasonable to test it out before committing to it.

 

A good POC or discovery project begins with an idea that addresses a real aspect of the business, paired with a technical solution that is sound and could be transformed into something permanent. In the right company, a failed POC can be a source of insight and saved time, and a successful one can be like turning on the light in a dark room.

 

If you are developing something that has never been done before, it might not be wise to commit to a full production setup for something that might not even work. But for an average company struggling to become data-driven, chances are that the issues it faces aren’t exactly cutting-edge problems. Someone has probably solved almost exactly the same problem before and uploaded a conference talk about it to YouTube several years ago. In that case, there is no need to reinvent the wheel; the best way forward is simply to decide which type of wheel you would like, and in what size.

 

…and when it’s somewhere in between

 

A few years ago, before machine learning hit the mainstream through giants such as Google, Spotify, Amazon, and Facebook, companies spent a lot of time and effort judging whether all these new machine learning tools were really worth looking into, and what they could actually be used for.

 

Nowadays, the situation is completely different. The value of advanced analytics is well established, and in industries such as retail and financial services it is a race between competitors to pick the low-hanging fruit of data science. In these industries, conducting a POC just to see whether you agree with the consensus is not exactly optimal.

 

If you, like many today, find yourself wanting to build something that a tech giant would consider standard, but which has not yet been done in your local market or industry, it can be hard to decide how to proceed. Whilst the technical solution might seem straightforward, there are always situation-specific details that need to be ironed out. For instance, a model that works really well for selling TVs in an online store might need tweaks to the modelling approach before it can be used for products with substantially different purchase patterns, such as groceries. These kinds of problems, along with setting up the flow of data, are the most common roadblocks in this kind of project.

 

The best approach in such a situation is often a combination of building for scale and running small experiments on the side to guide the system design. Knowing how to design your system and what to investigate can be tricky unless you have the right competence and experience on your team; with them, you will be in a much better position to weigh the risk of wasting time on small experiments against the risk of investing heavily in something that ultimately turns out to be a bad idea.

 

To summarise the discussion of whether or not to conduct POCs: it is always a good idea to run experiments, but not in a way that slows you down. There is almost always something to figure out, and if it isn’t a technical problem, it is more likely something concerning the business and the organisation around the modelling initiative than the modelling itself.

 


 

Bridges between teams

 

If you are conducting a POC in your company, and different parts of the company have different expectations and different ideas of what the purpose of the experiment is, it is very difficult for the teams involved to feel comfortable and to cooperate. For instance, if you have an offer recommendation model optimising for redemption rate, but you measure it on spend, you might see good results initially; eventually, though, the consequences of measuring something other than what the model is trained for will limit progress and add confusion.
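To make the mismatch concrete, here is a minimal sketch with invented numbers (the offer names, redemption rates, and spend figures are purely illustrative, not from any real system): a recommender that maximises redemption rate will pick the cheap, easy-to-redeem offer, even though the other offer generates far more expected spend.

```python
# Illustrative only: two hypothetical offers with made-up statistics.
offers = [
    {"name": "free_coffee", "redemption_rate": 0.60, "avg_spend": 3.0},
    {"name": "tv_discount", "redemption_rate": 0.05, "avg_spend": 400.0},
]

def pick_offer(offers):
    """Mimic a model trained to maximise redemption rate."""
    return max(offers, key=lambda o: o["redemption_rate"])

def expected_spend(offer):
    """The KPI the business actually measures: rate times basket value."""
    return offer["redemption_rate"] * offer["avg_spend"]

chosen = pick_offer(offers)
print(chosen["name"])             # free_coffee: wins on the training objective
print(expected_spend(chosen))     # 1.8 ...
print(expected_spend(offers[1]))  # ... versus 20.0 for the TV discount
```

Agreeing up front on a single yardstick, whichever of redemption rate or expected spend the initiative will actually be judged on, avoids exactly this kind of confusion.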

 

In such a situation, workshopping together and discussing things in an informal setting can be a good start. Reaching an agreement on what to measure should be the highest priority, and in my experience, regular workshops to discuss goals and KPIs are a great way to build the shared understanding that makes work more enjoyable and productive throughout the whole process, even after you have left the POC stage.

 

For a business person who doesn’t necessarily understand the full technical details of a model, it can be reassuring to know what a particular metric measures, and to see the progress being made by the data science team. The data scientists, in turn, will not need to explain the minutiae of their work to a stakeholder to get approval to go ahead, because their solutions can prove their worth in a practical test against agreed-upon metrics.

 

Final thoughts

 

Let us conclude by summarising when to conduct a POC and when not to:

 

  • Avoid POC: if you are exploring the way forward for your company’s technical capabilities, skip unnecessary POCs, build a minimal version in the right system with the right data from the beginning, and plan continued development based on the outcomes and learnings.
  • Embrace POC: if you are doing something novel that might not work, run it as an ad-hoc experiment and worry about productionising it later.

 

A POC can be an eye-opener if it incorporates the whole business case, and it can be very effective in demonstrating the value of committing to a certain project. If the setup is sound, the KPIs are well defined, and the path forward is clear, all is good. On the other hand, if the purpose of the experiment is unclear, don’t go ahead with it. Instead, focus on something where the purpose is clear, or spend more time concretising the experiment.

 

AUTHOR

Axel Sarlin

Data Scientist
axel.sarlin@avaus.com
