Project Delta ∆

Powering Self-Understanding and The Marketplace of Ideas

Aug 24, 2024

I just moved to SF / Berkeley and would love to connect with anyone interested in similar ideas. If you find this interesting, feel free to reach out on X or email me at politzki18 [at] gmail dot com.

Also to the new subscribers: I feel like I have one of the smartest subscriber bases and it's awesome seeing your backgrounds.

“To know thyself is the beginning of wisdom.” - Socrates

Yeah, yeah, this dead horse has been beaten by every philosopher. I’ve still been thinking about technology and how we can use it to solve this problem at last.

Pre-History and Inspiration

In December 2023, I decided to take all of the essays that I had written to date (since 2021) and distill them into an essay called “△ - Everything I Know Now.” It was a lot, so I spent a few days manually reading and annotating the essays, essentially uploading them into my memory so that I could understand and map the trajectory of my life. The project was incredibly rewarding, as I was able to see how various people, events, ideas, and experiences have noticeably shaped the course of my life.

Wandering through the dilutive atmosphere that is daily life, you can lose a sense of individual identity. It is really difficult to know exactly who you are. But it felt like the compiled essay was truly a reflection of me, which is an important point when we visit the topic of embeddings.

Re-reading the essays, I realized I have always been interested in symbolism, abstraction, machine learning, individualism, and culture. Through this experience, I connected the dots looking backward into my overarching life philosophy. Now, I combine those dots into a new engine, which I am calling Delta.

Intro

I want to float three quotes that I think capture why this problem is very important to the human race.

“When I have one week to solve a seemingly impossible problem, I spend six days defining the problem. Then, the solution becomes obvious.”
(Albert Einstein)

“What is not defined cannot be measured. What is not measured, cannot be improved. What is not improved, is always degraded.”
(William Thomson, Lord Kelvin)

“Many social and legal conflicts hinge on semantic disagreements”
(Marti, Wu, Piantadosi, & Kidd, 2023).

One way to think about progress is simply as solving a string of problems. You can gather from these quotes that in order to solve a problem, you must first define it. In fact, according to Einstein, defining the problem is the most critical step.

Even after the problem has been defined, many people perceive it differently. So aligning on a common understanding is vital to be able to progress forward as a society.

Defining The Problem

Defining the problem is important both at the singular, individual level and at the scale of groups, cultures, and generations.

Singular

At the micro-level, you probably haven’t defined yourself. It is oddly difficult to understand yourself, so we pay shrinks, subscribe to astrology, and at best use personality tests to categorize ourselves within limited psychological dimensions. But your personality shouldn’t be bucketed in such ways; the human brain is a tangled, complex web that we will never be able to untangle and understand without the help of artificial intelligence.

Universal

On the macro-level, these problems arise in the way we choose to define what we can agree on and solve together. This is how culture and legal frameworks are born. This is also how many conflicts arise. In order for us to have real discussion on progress, we at the minimum need to define where we agree/disagree.

Have you ever had a conversation that turned into an argument, where it was clear you weren’t even speaking the same language? Every person’s conception of a word, such as “strawberry”, differs from everyone else’s. This is problematic, since humans view the world in concepts. Marti et al. also study how different people view the same politicians’ names (e.g. Warren, Trump, Biden), and for these concepts the delta in how people perceive the word is even higher. These differences are cited as a likely byproduct of a person’s experiences.

These experiences form our understanding of concepts, which live relative to each other in concept space.

“If conceptual variability is commonplace, that suggests the variability is a fundamental feature of our conceptual systems, perhaps an inevitable byproduct of the substantial experiential differences people accumulate throughout their lives”
(Marti, Wu, Piantadosi, & Kidd, 2023).

Concept Space

Imagine concept space as a gigantic room that holds every concept that has ever existed or ever could exist, each one positioned relative to the others so that it can be easily retrieved.

The goal of this project is not to build in Latent Space. The foundation models (Llama, OpenAI, etc.) are the shoulders we stand upon. Every iteration, their modeling of the universe improves.

In my last essay, Deep Learning’s Future, I discussed the manifold and echoed some of Stephen Wolfram’s commentary on interconcept space. During the present explosion of LLMs, this space feels like it is not discussed enough. You can imagine that all possible concepts exist in high-dimensional space.

“The islands normally seem to be roughly “spherical”, in the sense that they extend about the same nominal distance in every direction. But relative to the whole space, each island is absolutely tiny, something like perhaps a fraction 2^(-2000) ≈ 10^(-600) of the volume of the whole space. And between these islands there lie huge expanses of what we might call “interconcept space”.”
(Stephen Wolfram) [LINK]

Why Now?

Humans are remarkably skilled at generalizing and navigating concept space in our brains, because we are neural nets ourselves. However, we need to leverage the resources at our disposal to do better.

We have never before in history had the technological means to accomplish what is today possible. Which technology could have been the arbiter of semantic matters? Just a second ago, the Grammarlys of the world enabled computers to understand syntax. Now, LLMs are exceptional at going one step further and understanding relative concepts, through embeddings in latent space.

Writing is a Projection of the Mind

Remember when I said earlier that my writing felt like a reflection of myself, and that this was an important point? Well, that’s because it is. Writing is an intimate projection of the mind. I can log my own beliefs in this space and then use machinery to tell me exactly how they differ from others’ or how they have changed over time, providing me with the fingerprint of my personality. This can be represented by the delta from a chosen baseline, for instance the “average” writer, or by a comparison with another writer. I expect such techniques to drastically outperform personality tests, and especially astrologers.
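To make the idea of a delta from a baseline concrete, here is a minimal sketch in Python. It is not Delta's actual code. The four-dimensional vectors are made-up stand-ins for real embeddings (which have hundreds or thousands of unlabeled dimensions), and the function names are hypothetical.

```python
import math

def mean_vector(vectors):
    """Average equal-length vectors to form a baseline (the "average" writer)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def delta(author, baseline):
    """Per-dimension difference between an author and the baseline."""
    return [a - b for a, b in zip(author, baseline)]

def cosine_distance(u, v):
    """1 - cosine similarity: 0 means same direction, larger means further apart."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def largest_shifts(delta_vec, k=1):
    """Indices of the k dimensions where the author departs most from baseline."""
    order = sorted(range(len(delta_vec)), key=lambda i: abs(delta_vec[i]), reverse=True)
    return order[:k]

# Toy data (assumption): each list stands in for one writer's embedding.
other_writers = [[0.5, 0.1, 0.5, 0.4], [0.3, 0.3, 0.5, 0.2]]
author = [0.9, 0.1, 0.5, 0.2]

baseline = mean_vector(other_writers)
fingerprint = delta(author, baseline)

print("delta from baseline:", fingerprint)
print("distance from baseline:", cosine_distance(author, baseline))
print("most distinctive dimension:", largest_shifts(fingerprint, k=1))
```

The same functions work for tracking change over time: swap the baseline for an embedding of your own older writing.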

How Machines Can Help

In my first writeup on Project Delta ∆, I showed this image, an overly simplistic but helpful way for a reader to understand how the mechanics of these differences can be quantified and extracted.

The computer understands these concepts in a set of features and weights, which ultimately break down into 0s and 1s. It can understand the differences between Hitler, Italy, Germany, and Mussolini. And the Delta between these many features in latent space is the equivalent of us being able to intuitively understand the difference between an apple and an orange. Humans often reduce these into very simple dimensions so that we can understand the relative concepts easily (color, taste, substance, etc.).

Example image from this overview from 3Blue1Brown (the best I’ve seen yet).

Machines “Intuit” Too

It turns out that computers can do that too. Through dimensionality reduction techniques such as t-SNE and PCA, we are able to compress the differences into a more digestible format.

Take, for instance, these cat pictures, each made up of many pixels. There are many hidden differences between the pictures, but the model compresses them into two major axes: oil vs. pencil drawing and white vs. black.
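For a feel of how this compression works, here is a rough sketch of PCA in plain Python. It finds the two directions along which a set of points varies the most and projects each point onto them. The sample data is invented, and the power-iteration approach is a simplification chosen to avoid dependencies; in practice you would reach for a library implementation. One caveat: the fixed starting vector would fail if it happened to be exactly perpendicular to a principal direction.

```python
import math

def center(rows):
    """Subtract the per-feature mean so the data is centered on the origin."""
    count = len(rows)
    means = [sum(col) / count for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    count = len(centered)
    dim = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (count - 1) for j in range(dim)]
            for i in range(dim)]

def mat_vec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def leading_eigenvector(matrix, iterations=500):
    """Power iteration: repeatedly apply the matrix and renormalize."""
    dim = len(matrix)
    vec = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    return vec

def pca(rows, n_components=2):
    """Return (principal axes, rows projected onto those axes)."""
    centered = center(rows)
    cov = covariance(centered)
    dim = len(cov)
    components = []
    for _ in range(n_components):
        vec = leading_eigenvector(cov)
        eigenvalue = sum(v * w for v, w in zip(vec, mat_vec(cov, vec)))
        components.append(vec)
        # Deflate: remove the variance already explained by this axis.
        cov = [[cov[i][j] - eigenvalue * vec[i] * vec[j] for j in range(dim)]
               for i in range(dim)]
    projected = [[sum(x * c for x, c in zip(row, comp)) for comp in components]
                 for row in centered]
    return components, projected

# Toy data (assumption): four points in 3-D that vary mostly along the first
# feature, a little along the second, and not at all along the third.
samples = [[3.0, 1.0, 0.0], [-3.0, 1.0, 0.0], [3.0, -1.0, 0.0], [-3.0, -1.0, 0.0]]
components, projected = pca(samples)
print("axes:", components)
print("2-D coordinates:", projected)
```

The three original features collapse to two coordinates per point, with the unused third feature dropped, much like the cat pictures collapsing to two interpretable axes.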

Specific Current Wedge → Future Vision

Consumer Tool

We need to start somewhere. I think the easiest and most personal way to work on this is to conduct the experiment at the singular, individual level, starting with my own essays. In creating a tool that can extract the personality of the user relative to a baseline, we are laying the foundation for a tool that can unravel the differences between any set of concepts.

Legal Tool

After wedging in as a useful consumer product, Delta will refine the product and expand to legal teams. The legal use case is the clearest, as law is simply the practice of arguing over semantics. The lawyers state the facts, and the judge interprets the current legal scripture in light of those facts. The legal scripture, like religious scripture, is not black and white. A tool is needed to define the differences.

For example, here is a standard LSAT question. The reader must disambiguate the viewpoints.

Broader Societal Problems

After this has been refined, Delta will be used at the philosophical, political, and cultural level to properly define differences in beliefs, values, preferences, and laws of different nations and continents.

Generational Problems

Then, we are able to chart the zeitgeist of history, the Google Trends of Concepts, but on a much larger and more important scale. Humans do not have the capacity for analyzing *everything*. Delta will find differences in the past, present, and potential future and support the decision making process of our greatest leaders in dealing with conflict and bringing important change.

My ultimate vision for Delta is to be the engine for the marketplace of ideas and cultural change.

Whatever future we are entering, we are still in the driver's seat for the future of mankind, and it will take careful thought and consideration to make decisions at each step. Each one may make or break our destiny.

The Current Iteration

The initial write-up of Delta is logged in the Appendix.

At the beginning of August, I took off into a programming storm and built the first beta for Delta, found at Delta-Analysis.com.

The frontend is hosted on Vercel and the backend architecture runs on Heroku. The flow goes something like this:

Input: Users provide a Medium or Substack URL containing the author's writings.
Web Scraping: Our backend scrapes the content from the provided URL.
Text Processing: The scraped text is preprocessed to remove noise and prepare it for analysis.
Embedding Generation: We use OpenAI's API to generate high-dimensional vector representations (embeddings) of the processed text.
Analysis: Our custom analysis service processes these embeddings along with the original text to extract key themes, writing style characteristics, and overall insights.
LLM-Powered Insights: We utilize OpenAI's language models to generate human-readable insights about the author's writing.
Results Visualization: The frontend presents the analysis results in an intuitive, visually appealing format.
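The flow above can be sketched end to end in a few lines of Python. This is an illustration under loud assumptions, not the code running behind Delta: it takes an HTML string instead of scraping a URL, swaps the OpenAI embedding call for a toy hashing embedding, and swaps the LLM-generated insights for a simple word count. Every function name here is hypothetical.

```python
import hashlib
import math
import re
from collections import Counter
from html.parser import HTMLParser

STOPWORDS = {"and", "the", "that", "with", "for", "are", "this"}

class _TextExtractor(HTMLParser):
    """Collects visible text and skips <script> and <style> contents."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)

def extract_text(html):
    """Web scraping stand-in: pull readable text out of fetched HTML."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)

def preprocess(text):
    """Text processing: lowercase and keep only word-like tokens."""
    return re.findall(r"[a-z']+", text.lower())

def embed(tokens, dim=64):
    """Placeholder embedding (assumption): hash each token into a bucket.
    A real pipeline would call an embeddings API here instead."""
    vec = [0.0] * dim
    for token in tokens:
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def top_terms(tokens, k=5):
    """Analysis stand-in: most frequent non-trivial words as rough themes."""
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [term for term, _ in counts.most_common(k)]

def analyze(html):
    tokens = preprocess(extract_text(html))
    return {"embedding": embed(tokens), "themes": top_terms(tokens)}

page = ("<html><body><h1>Delta</h1><p>Concepts and concepts shape concepts.</p>"
        "<script>var x = 1;</script></body></html>")
result = analyze(page)
print(result["themes"])
```

Keeping each stage as its own function means any one of them, such as the placeholder embedding, can be replaced without touching the rest.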

Ultimately, Delta ∆ aims to detect and unpack an author’s personality. Since writing is a projection of the mind, and of the beliefs, values, interests, and experiences of the author, we can expect to learn a lot from a writer’s material.

Rather than break down the syntax and grammar of a writer, Delta is tuned to only focus on substance and ideas. Right now, I am making final edits on the actual output to make it more digestible.

Conclusion

Understanding ourselves and our society is a problem that humans have faced with no solace for all of our history, until this exact moment. Unpacking concepts is valuable for both individuals to understand themselves and entire societies to define themselves and their future. As we stand in the present and look upon the past, we can only do our best to ensure that our future holds what we ultimately decide. Each decision will take both individual agency and societal collaboration, both of which hinge upon true understanding.

Most of what we remember from societies before us is the ideas that they introduced and maintained. Culture ultimately determines fate. Delta aims to provide the tooling to wrestle with ideas in a much more digestible way, so we can shape our future.

Right now, Delta aims to be a tool to help you understand yourself. With a long road ahead, I am extremely receptive to feedback on how to make this a more useful tool for you. Until next time.

Appendix:

The First Write-Up of Project Delta ∆

Four months ago, I wrote about Project Delta ∆ in this sorry attempt at logging the experimentation. The general gist of the essay was the following:

Yes, LLMs are important, but innovation is more than just a chatbot. We had next-word prediction before the transformer-based architecture and Google’s introduction of Attention (2017).
Rather than building a gazillion text-generation apps, we should focus on the more important paradigm: AI can enable us to understand concepts.
The implications are significant: in order to solve any problem, you must first define and understand it. Often, we don’t even understand ourselves.
Writing data is probably the best angle for building this tool. Image data is also interesting.
    1. Writing builds upon the use of individual words (symbols), which in combination create abstract concepts.
    2. There is a massive corpus of written data available in the world. This is partly why we can train such amazing LLMs in the first place.
    3. Writing is a projection of the mind/concepts, and long-form writing dives into deep concepts.
I can think of several potential wedges into the market for tooling like this:
    1. The Google Trends of concepts, beliefs, values, and ideas.
    2. Personality tests based on your writing.
    3. A tool that can help you understand the trajectory of your life through your writing (as I did for mine).
    4. Political and legal use cases of finding similar and different clauses.

Some Literature

The literature on this dates back to 1957, with the publication of An Analysis of Concept-Clusters in Semantic Inter-Concept Space. In 2023, Latent Diversity in Human Concepts was published, analyzing semantic differences in how people understand individual words and calling out differences in cognition and perception. That work comes at the problem from a psychological angle, whereas Delta aims to build the end product, with the simple intent of enabling others to categorize, visualize, and differentiate concepts. The applications span academic, political, legal, editorial, psychological, consumer, and likely other uses.

Rate of Expansion

Below is the progression of LLM context windows over time [LINK]. In the far future, Delta will be bottlenecked by foundation models' ability to take in large quantities of text. At present, this doesn’t seem to be an issue.

GPT-2 context window size: 1,024 tokens

GPT-3 context window size: 2,048 tokens

GPT-4 context window size: 8,192 tokens

GPT-4o context window size: 128,000 tokens

Claude Sonnet context window size: 200,000 tokens

Estimated GPT-5 context window size: 1,000,000 tokens

Comparable Products

There are no products that focus entirely on the definition of ideas and concepts in writing. The closest comparables are:

Amazon internal writing tool
https://www.analyzemywriting.com/
Grammarly

Other Data Inputs

It is easy to imagine taking data from many sources to feed into a personality model. Text, social media, Twitter (X), etc. all contain valuable information about your personality.

West Coast Milblogger

Aug 27, 2024

Very interesting work. In my mind it correlates to my search for proof of Size Theory.

Size Theory as far as I know is something I've been pursuing that Delta could help refine. The main ideas come from Stephen King's Dark Tower series, Leonard Susskind's work in theoretical physics, and now Claudia de Rham's work to unify Quantum Mechanics with General Relativity.

I wonder if Delta could help in the search for the common denominator? What we see in a telescope we see in a microscope. The latter must also have time dilation. And while the structure of the Universe resembles that of the human brain, they are not directly related as far as we know. Yet what connects a solar system and an atom with electrons? Or is it the Universe itself acting more like a molecule and we are particles?

It feels as if Delta could help define these connections and how they work together from Quantum Mechanics to General Relativity.

Keep up the great work.
