It used to happen all the time and it drove us nuts.
One of our data scientists would formulate a hypothesis on what data they needed to improve their predictive models. The problem was that we rarely had what they were looking for in-house.
Hypothesis dead. Models starved.
As I said, it drove us nuts.
So now let me tell you my story of how all of this led me to create a new Narrative for the industry.
I was SVP of Product / Strategy at Tapad, a sophisticated marketing technology company that was practically drowning in data. The whole company was built on the premise of painting as complete a picture as possible of consumers using multiple digital devices.
We literally had geniuses on staff. PhDs. Not to mention the petabytes of monthly information at our fingertips.
That sounds incredible, and it was. Yet we never had enough and always needed more. I would constantly hear from experts inside data-rich companies that for all the information they had about customers or products, there were always missing pieces. The same held true for us.
Theoretically, it should have been easy to fill in those blanks. Keep in mind, this wasn’t 1998 - it was 2016, and we were living in the big data era.
Except it wasn’t easy at all.
You might ask, ‘well, why didn’t you hook up your fancy software to some data exchanges and get more?’ Or ‘why didn’t you just cut deals with companies that had what you were looking for and buy it?’
We tried. It sounds easy. It wasn’t.
Here’s how it played out. Our business development team would spend some time trying to figure out which companies might have the signal that would help support our hypothesis or yield new insights that could improve our products.
Once we found the right companies, we’d have to help them lay the groundwork for extracting and selling it. You have to realize, we weren’t talking about marketing clouds here. These were companies who had never sold data before or had only considered it as an exhaust opportunity.
Even once we had convinced them to sell to us (not always easy), we’d have to help them price it out (even harder).
Then we’d have to hold their hands while they steered these sales plans through the proper legal and compliance channels. As you might imagine, this was not a smooth or quick process for companies not used to trading in data.
This was all before we’d even get our hands on the data to test it, let alone make sure it could be accessed via our systems and technology. That usually meant building out a custom integration, testing it, tweaking it, supporting it, and hoping nothing got hosed along the way.
We’d work through these tedious steps still not sure whether any of it was going to pay off. These were hypotheses, remember? What if we got what we wanted, and it didn’t pay off in the way we’d expected?
As with all hypotheses, ours sometimes didn’t pan out. Many times, the data we’d get out of this process simply didn’t deliver. Sometimes it duplicated our existing datasets. Sometimes it didn’t provide any signal to our models. And sometimes our hypothesis was just bad.
We understood the value of data when we could get our hands on it, but the cost of trying to get it often made the whole exercise untenable.
Clearly, data acquisition was not exactly a couple of clicks and a simple software download. And we’d have to do this again and again for every potential partner.
It was hand-to-hand combat. There were reverse network effects. It would never scale.
To be sure, we understood what our partners were up against. Tapad also licensed data in the form of the output of our models. We experienced all of the same headaches. We knew we had something that could create massive value, but figuring out how to operationalize the initiative proved difficult.
My initial reaction to this problem was that there must be existing solutions that could solve it. With all of the companies claiming to have “data clouds” and “data stores”, clearly my pain would go away with a well-placed phone call.
The market I found when I began to do the research was littered with companies that promised to solve all of my data problems. But upon closer inspection, I realized they did nothing of the sort.
Solutions inevitably consisted of someone making data “simple” for me, but in doing so they would simplify the data to the point that it was no longer useful. They would create a layer of opacity that obfuscated where the data came from and how it was collected. This would eliminate our ability to create a custom strategy as to what we wanted to buy or sell.
This was a model I had seen before, albeit unrelated to data. It was the ad network model – no visibility, no control, and uncertain quality.
Fundamentally, I don’t have a problem with a brokerage / middle-man model where appropriate. That being said, I think that as any market matures, middle-men start to make less sense. People need to start building sophisticated strategies, which requires direct ownership of those strategies.
At Tapad, our data science and data engineering teams wanted raw data that hadn’t gone through anyone else’s black box, that hadn’t been aggregated, and that had a rich set of attributes. They didn’t want someone else’s idea of what might be valuable.
Then it hit me. If we were experiencing this wildly ineffective set of market circumstances, this had to be happening all over our industry. And other industries. On every functional team. Everywhere.
If somebody could get this right - by building data acquisition software (and the accompanying data monetization software) that makes things easily discoverable, transparent and selective - we could unlock the full potential of the data-driven economy. If we could make acquisition and monetization operate with the exact precision and monster scale we’d seen in programmatic ad buying, our product could be revolutionary for buyers and sellers. We’d transform how organizations think about and use data.
So we got to work on a new Narrative.