“Google, Tear Down This Wall”

A quick overview of where we are. Next, where we are going.

Nov 18, 2024

Intro

Twenty-five years after the birth of the modern internet, we stand at another inflection point. The rise of general-purpose AI models has fundamentally changed what's possible in personalization. These models don't just predict what word comes next - they understand human context, preferences, and behaviors in ways previously monopolized by Big Tech. This understanding, combined with new technical capabilities, creates an opportunity to tear down the walls that have divided our digital identities for decades.

History

The Internet started out static: simple HTML web pages that you could read. Later, companies like eBay made it possible to interact, search, and take action within a site. During this period, cookies were invented to allow a site to “remember” a user, so it wasn’t Groundhog Day every time you visited.

Companies like Epsilon, Acxiom, and LiveRamp rose to prominence, creating identity resolution and Clean Room services, where you could understand a user’s activities across sites and see their full picture.

This created “personalization.” By knowing that a user bought Kleenex, Amazon can infer that they may be sick and recommend Theraflu. This is how Facebook could afford to make its site free. The users were actually the product. We were what was being bought and sold.

One-Sided

This is a raw deal. You were sold “personalization,” but do better ads really give you the same feeling as having a wallpaper of your family?

“Personalization” has become synonymous with predatory behavior. Now, personalization on platforms like Instagram is increasingly shaping your mind. If a platform understands your shopping preferences and only shows you the jeans you’ve bought in the past, the platform owns you. You have less and less say in what you are interested in and no longer have free will.

The Age of Predictive ML

These companies have dominated because of their scale. Network effects enabled them to build highly defensible businesses that understood their users more and more over time.

The more you interact with Facebook, the better they understand you within their walls. The techniques they use to do this, however, are on their way out. Recommending “similar products” to what you’ve bought in the past reflects a very narrow understanding of who you are.

This is now changing.

The Great Unlock

Every once in a while, there is a massive shift in how the world works, and you are lucky if you are one of the people who can see it coming. I am in that position now, and I see three major changes:

The floor of “understanding” and intelligence has been massively raised overnight
Software personalization is no longer one-sided; users now benefit too
Data interoperability has been solved by vector embeddings

1. Floor of Understanding

Overnight, the understanding that Big Tech used to monopolize and use to extract value from their users has been democratized. We’ve seen similar patterns of democratization before: cloud computing democratized infrastructure, and present-day LLMs and document search/RAG are democratizing intelligence.

Already, you have access to pre-trained models like GPT, Claude, etc. for running customer queries or document search on. You could spend $10 million+ and still not have a better internal model.

This will also happen for user understanding. Companies are already using LLM-based architectures for recommendation systems and for solving the cold start problem [1]. When you run your data through a model, it doesn’t just understand what narrow product you might like. It understands the world generally, so it can make reasonable inferences about your fitness goals based on the books you’ve recently read.

Historical: Big data, narrow models, proprietary systems.

Modern: Small data, deep understanding, open access.

With more information on you, it triangulates assumptions about you really well. Proof up next.

2. My Software

One of the first projects I built when I started engineering AI was in self-understanding. If you are someone who writes, as I do, you have a large corpus of textual data that represents your thoughts.

I plugged it into my engine Delta ∆. I was taken aback by how well this model understood me through my writing. It occurred to me at that moment that writing is a projection of the mind. After further experimentation, I found that this model could understand the shape of my beliefs, characteristics, and personality based on what I wrote on completely different topics. For instance, it was able to accurately predict which dog breeds I like most based on my personality, which it inferred from my writing.

More importantly, this was a personalization that I actually wanted. Not some one-sided raw deal. Much of software will become completely personalized like this, down to the colors on the page. One can also imagine a world where products aren’t matched to you but actually created just for you.

3. Interoperability

The last important point is that we no longer need a defined schema to translate information from one source to another. Turn this document into a .json or .csv file and run it through a model, and the model will still register it the same way as if I had pasted in the raw text.

Further, as I’ve explained before, this understanding is ultimately compiled in the penultimate layer of an LLM into a vector embedding. So that understanding can be logged as just an embedding, which always has the same shape and can be read the same way by the embedding models.
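To make the shape argument concrete, here is a minimal sketch in Python. It is not Delta ∆ and it is not an LLM: toy_embed is a hypothetical stand-in (a hashed bag of words) that I am assuming purely for illustration, and DIM, the sample profile, and the variable names are all made up. A real embedding model would capture meaning rather than literal tokens. The only point shown here is that the same information arriving as plain text, JSON, or CSV ends up as vectors of one fixed shape that can be compared directly, with no schema mapping in between.

```python
import hashlib
import json
import math
import re

DIM = 64  # fixed embedding size; an arbitrary choice for this sketch


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: hashed bag of words."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # A stable hash, so the same token always lands in the same slot.
        slot = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[slot] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # unit length, so dot product = cosine


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# The same (made-up) user context in three formats, plus an unrelated text.
profile = {"reads": "productivity books", "exercise": "evening runs", "tools": "calendar"}
as_text = "reads productivity books, exercise evening runs, tools calendar"
as_json = json.dumps(profile)
as_csv = "reads,exercise,tools\nproductivity books,evening runs,calendar"
unrelated = "quarterly invoice for industrial pipe fittings"

v_text, v_json, v_csv, v_other = (toy_embed(s) for s in (as_text, as_json, as_csv, unrelated))

sim_json = cosine(v_text, v_json)    # same content, different format
sim_csv = cosine(v_text, v_csv)      # same content, different format
sim_other = cosine(v_text, v_other)  # different content

print(f"text vs json:      {sim_json:.3f}")
print(f"text vs csv:       {sim_csv:.3f}")
print(f"text vs unrelated: {sim_other:.3f}")
```

In this sketch the vectors are comparable because they all come from the same embedder. Swapping toy_embed for a real embedding model keeps the rest of the code unchanged, which is the property the interoperability argument leans on.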

Breaking Down Silos Finally

This is a unique moment in time where we can break down silos and empower the little guys to take on Big Tech. A coordinated coalition can actualize a world where we own our data, rather than a perverse world where our data owns us.

The Last Frontier of Personalization

The change happening now is more profound than simply better recommendations. Where traditional ML needed massive amounts of data to make simple predictions ("people who bought X also bought Y"), modern AI understands the human context behind the behavior. Think of old ML as a calculator mapping patterns, and new AI as an interpreter of meaning. When you first sign up for a service, it doesn't have to collect months of your behavior to understand you - it can grasp who you are from the context you choose to share. If you read productivity books, run in the evenings, and heavily use your calendar, old ML would just try to sell you a planner because others like you bought one. Modern AI understands you value time optimization and health, allowing it to make novel connections you might appreciate across any domain. This isn't just better pattern matching - it's a fundamental shift from predicting what you'll do to understanding who you are.

First: Solve the Cold Start Problem

The key to breaking Big Tech's monopoly on personalization is solving the cold start problem. Today, companies spend millions trying to understand their users through lengthy onboarding flows, complex tracking systems, and expensive data purchases. Users are forced to rebuild their identity with every new service they try.

In the next essay, we will discuss the cold start problem and how we are working to solve it with context and embeddings.

If you are facing similar problems, reach out to jonathan@irreverent-capital.com

Here's Google's Secret to Hiring the Best People | WIRED

Engineering the Future
