At White Star Capital we are openly interested in companies that see data as a competitive advantage. We want to back companies that help collect, collate and bring data to life through algorithms. We’ve spoken about this before in posts such as “Take X and add AI” and “Into the Age of Context”. We also co-founded London.AI, the leading meetup for London’s machine learning community. So hopefully, our keen interest in this space has come across.
And so, we’ve enjoyed reading two great posts this week from fellow VCs outlining similar areas of interest. Matt Turck of FirstMark penned “The Power of Data Network Effects”, in which he talked about the intrinsic and defensible value of a data moat at the core of a machine learning startup’s strategy:
“Data network effects occur when your product, generally powered by machine learning, becomes smarter as it gets more data from your users.”
Matt’s post was then echoed by Boris Wertz of VersionOne in his post “Data, not algorithms, is key to machine learning success”. He wrote:
“Algorithms have largely been commoditized … if startups want to succeed in machine learning, their top priority should be building proprietary data sets”
We agree with all of this and have been focusing on how and when best to back teams building interesting datasets to mine. Matt Wichrowski, our Summer Associate, did some great analysis on the investment landscape in the AI space, and one of his conclusions was that VC/Angel investments into research projects (“AI-for-AI’s sake”) were unlikely to yield outsized returns and more likely to lead to acqui-hires of smart and talented teams of algorithm builders.
Our thesis is that startups can use machine learning to fundamentally disrupt industries. This happens when algorithms either create new product possibilities or drive efficiency. To create this value, the algorithms need data: ideally proprietary data, and ideally lots of it.
When evaluating machine learning startups we focus on a number of questions:
- Is the overall market without algorithms big enough (or could new functionality enable a big new market)?
- Is there excessive, system-wide waste and/or growing pressure to reduce cost?
- Is there an abundance of digitized, unstructured data? Can it be easily gathered or scraped at an appropriate “cost of acquisition”?
- Are there routine processes? Are they burdened by human error?
- Is capacity constrained by human limitations and/or a shortage of qualified talent?
This thesis has led us to look at a number of verticals and companies trying to create business value through the combination of data and algorithms. We are interested in companies seeking to answer the questions above across a variety of industries, from healthcare and manufacturing to consumer applications and e-commerce enablement.
A number of our portfolio companies reflect this focus:
KeyMe users scan their keys via a mobile app. The company then uses advanced neural networks to recognize the unique properties of the keys, store them digitally and allow their duplication via KeyMe’s robotic kiosks. Eventually, having ingested hundreds of thousands of keys, the computer vision algorithms will evolve to the point where they know what your key should look like, rather than the worn-down version you have scanned, so you receive a better key than the one you started with. Locksmithing and lockouts are a very large market in the US, and KeyMe is using NASA-level brainpower to build a deep database of digitized keys.
The IoT wave will give rise to unprecedented data creation. By 2020, billions of connected sensors will collect data every second. Our portfolio company, mnubo, provides real-time, predictive analytics to IoT companies, from iControl, the largest home automation company in the world, to VanHawks, the makers of very cool smart bikes, to sensors in agricultural fields helping to increase production yield. mnubo partners with its clients to capture the data, then intelligently analyses it to drive actions and visualization. It’s effectively algorithm-as-a-service for IoT.
A number of other portfolio companies are focused on building defensible data/algorithm moats, from Gymtrack, bringing sensors and analytics to your old-school barbells at the gym, to Hole19, which is building one of the deepest and richest databases of golfers and golf courses, to Busbud, building an amazing database of bus routes around the world, to DICE, giving fans, acts and managers never-before-seen insights into concerts, venues and ticket buyers.
We continue to look for more companies that see data as a competitive advantage and are developing capture methods and algorithms to acquire, analyse and derive insight from it. If you’re building something like that, drop one of our team members a line!
This article originally appeared on Medium. Be sure to follow us there!