Software 3.0: Andrej Karpathy on AI and Programming in Engli

📺 Today’s recommended deep-dive video: https://www.youtube.com/watch?v=LCEmiRjPEtQ


Software 3.0: Programming the New Digital Operating System

Software development is undergoing its most radical transformation in seventy years as English prompts become a way to program computers. We are moving from hand-written logic toward working with “people spirits”: stochastic simulators of people that require a new kind of stewardship.

Core Question: How does the emergence of LLMs as a new operating system redefine the way we build, use, and maintain software?

Highlights

  • The transition from Software 1.0 (manual code) and 2.0 (neural weights) to 3.0 (English prompts).
  • Understanding LLMs as “people spirits” with encyclopedic knowledge but distinct cognitive deficits.
  • The “Autonomy Slider” as the essential design pattern for modern, partially autonomous applications.
  • The necessity of rebuilding digital infrastructure, such as documentation and APIs, specifically for AI agents.

⏱️ Reading time: approx. 8 minutes · Saves you about 31 minutes vs. watching.


The Three Eras of Software

From Manual Code to English Prompts

Software is changing fundamentally for the first time in seven decades. For most of computing history, Software 1.0 consisted of explicit instructions written by humans for computers to execute. Then came Software 2.0, where, instead of writing the program directly, we curate data and run an optimizer to produce neural network weights. Now we have entered the era of Software 3.0, where the neural network itself, an LLM, is programmable through natural language.

In this new paradigm, your English prompts are the programs. This shift democratizes development, effectively turning every person who can speak a natural language into a programmer.
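To make the contrast concrete, here is a minimal sketch of one task, sentiment classification, written in the 1.0 and 3.0 styles. The word lists, the prompt wording, and the `complete` callable are illustrative assumptions rather than any particular library's API. A Software 2.0 version is left out because it would need a labeled dataset and a training loop.

```python
from typing import Callable

# Software 1.0: the behavior is spelled out by hand as explicit rules.
POSITIVE_WORDS = {"great", "love", "excellent"}
NEGATIVE_WORDS = {"bad", "hate", "terrible"}

def classify_v1(review: str) -> str:
    words = {w.strip(".,!?").lower() for w in review.split()}
    score = len(words & POSITIVE_WORDS) - len(words & NEGATIVE_WORDS)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Software 3.0: the "program" is an English prompt that an LLM executes.
PROMPT_TEMPLATE = (
    "Classify the sentiment of the review as positive, negative, or neutral.\n"
    "Answer with one word.\n\n"
    "Review: {review}\n"
    "Sentiment:"
)

def classify_v3(review: str, complete: Callable[[str], str]) -> str:
    # `complete` is a hypothetical stand-in for whatever LLM client you use:
    # it takes a prompt string and returns the model's text completion.
    prompt = PROMPT_TEMPLATE.format(review=review)
    return complete(prompt).strip().lower()

def stub_complete(prompt: str) -> str:
    # Canned reply so the sketch runs offline; replace with a real model call.
    return " Positive\n"

print(classify_v1("I love this, it is great!"))
print(classify_v3("Loved it", stub_complete))
```

Changing the behavior of `classify_v1` means editing code, while changing `classify_v3` means editing the English in `PROMPT_TEMPLATE`.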

Software 1.0 ate the world, and Software 2.0 then began eating Software 1.0 stacks, notably in complex domains like Tesla’s Autopilot. Now Software 3.0 is beginning to eat through everything else. Developers will need to be fluent in all three paradigms, moving between C++, neural network optimization, and English-language orchestration, and choosing whichever fits each part of a system.

A process map showing the evolution of software development: a timeline from 1950 to present. 1.0 shows a person writing lines of code; 2.0 shows a person feeding data into a neural network box labeled 'Optimizer'; 3.0 shows a person speaking or typing English into a large 'LLM' box that generates actions.

💡 Digging Deeper

Q: Is Software 1.0 going away?
A: No, but its footprint is shrinking as neural networks and LLMs handle tasks that were previously too complex to hard-code.

Q: Why call LLMs “Software 3.0”?
A: Because unlike 2.0, which produced fixed-function classifiers, 3.0 creates a general-purpose computer that can be re-programmed on the fly via prompting.

Q: What is the primary advantage of the 3.0 stack?
A: The speed of iteration; you can describe a complex logic change in a sentence rather than writing hundreds of lines of brittle code.


The Psychology of the “People Spirit”

Simulating Human Intelligence

Large language models are essentially “people spirits”: stochastic simulations of people. Because they are trained on a vast amount of human-written text, they have developed an emergent psychology that resembles our own. The result is a machine with encyclopedic memory, a “Rain Man” of the digital world, capable of recalling obscure hashes or phone numbers with a precision few humans could match.

However, this intelligence is remarkably jagged.

An LLM can solve complex graduate-level physics problems and still insist that “9.11 is greater than 9.9.” LLMs also suffer from what Karpathy calls “anterograde amnesia”: their weights remain fixed after training, so they only “learn” within the narrow confines of a temporary context window. This makes them powerful but profoundly fallible collaborators.
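The 9.11 versus 9.9 slip is easy to catch with ordinary deterministic code, which is one reason to pair a model with Software 1.0 checks. The sketch below compares the two strings as decimal numbers and as dotted version numbers. The version-number reading is only one commonly suggested guess at why the strings get confused, not something the talk summary claims, and both function names are made up for this example.

```python
from decimal import Decimal

def larger_decimal(a: str, b: str) -> str:
    """Return whichever string is the larger decimal number."""
    return a if Decimal(a) > Decimal(b) else b

def larger_version(a: str, b: str) -> str:
    """Return whichever string is the later dotted version number."""
    def key(s: str) -> tuple:
        return tuple(int(part) for part in s.split("."))
    return a if key(a) > key(b) else b

print(larger_decimal("9.11", "9.9"))  # 9.9
print(larger_version("9.11", "9.9"))  # 9.11
```

The same pair of strings has two defensible orderings depending on interpretation, so an application that cares about the answer should state the interpretation and verify it in code.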

A table comparing a Traditional Computer (CPU/RAM) with an LLM. Row 1: Memory (RAM vs Context Window). Row 2: Logic (Boolean vs Stochastic). Row 3: Learning (Hard drive storage vs Fixed weights/In-context learning).

💡 Digging Deeper

Q: Why do LLMs hallucinate?
A: They are auto-regressive simulators designed to predict the next token, not to verify truth; they lack a robust internal model of self-knowledge.

Q: Can LLMs learn over time like a human coworker?
A: Not natively. Their weights do not change during use, so anything they should “remember” has to be placed in the context window, which is why “context management” is a primary task for AI apps.

Q: Are LLMs more like utilities or operating systems?
A: While they are distributed like a utility (electricity), they behave like operating systems (Windows/Linux) by orchestrating compute and memory to solve problems.
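The context-management task mentioned in the answers above can be sketched as a rolling buffer that keeps only the most recent messages that fit a fixed budget. This is a simplified illustration: the `ContextWindow` class is hypothetical, tokens are approximated by whitespace-separated words, and real applications typically use the model's tokenizer and may summarize or retrieve older material instead of dropping it.

```python
from collections import deque

class ContextWindow:
    """Keeps the most recent messages that fit in a fixed token budget."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.messages = deque()

    @staticmethod
    def count_tokens(text: str) -> int:
        # Assumption: one whitespace-separated word is roughly one token.
        return len(text.split())

    def total(self) -> int:
        return sum(self.count_tokens(m) for m in self.messages)

    def add(self, message: str) -> None:
        self.messages.append(message)
        # Evict the oldest messages until the budget is respected,
        # always keeping at least the newest message.
        while self.total() > self.max_tokens and len(self.messages) > 1:
            self.messages.popleft()

    def render(self) -> str:
        """Text that would be sent to the model on the next call."""
        return "\n".join(self.messages)

window = ContextWindow(max_tokens=5)
for msg in ["one two three", "four five", "six"]:
    window.add(msg)
print(window.render())
```

Once the third message arrives, the first is evicted, and the model has no way to recall it unless the application feeds it back in.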


Designing for Partial Autonomy

The Autonomy Slider and the GUI

The most successful AI applications today, such as Cursor for coding or Perplexity for search, share a common design philosophy: partial autonomy. They do not attempt to replace the human entirely; instead, they focus on making the “generation-verification” loop as fast as possible. Humans are excellent at verification but slow at generation, whereas AI is the opposite.

A critical component of these apps is the “Autonomy Slider.”

This allows a user to choose the level of AI involvement, ranging from simple autocomplete to full-repo agentic changes. By using a GUI instead of a raw text terminal, these apps leverage human visual processing—our “internal GPU”—to audit the AI’s work much faster than reading text would allow.
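One way to picture the slider is as an ordered set of levels, each widening what the assistant may touch. In the sketch below only the two endpoints (autocomplete and full-repo agent) come from the text; the middle levels, the `Autonomy` enum, and the `allowed_files` helper are assumptions made for illustration.

```python
from enum import IntEnum

class Autonomy(IntEnum):
    # Endpoints follow the text; the two middle levels are illustrative.
    AUTOCOMPLETE = 0    # suggest the next few tokens at the cursor
    EDIT_SELECTION = 1  # rewrite a highlighted chunk
    EDIT_FILE = 2       # change one whole file
    AGENT_REPO = 3      # agentic changes across the repository

def allowed_files(level: Autonomy, current_file: str, repo_files: list[str]) -> list[str]:
    """Files the assistant may modify at a given slider position."""
    if level >= Autonomy.AGENT_REPO:
        return list(repo_files)
    return [current_file]

repo = ["app.py", "utils.py"]
print(allowed_files(Autonomy.AUTOCOMPLETE, "app.py", repo))
print(allowed_files(Autonomy.AGENT_REPO, "app.py", repo))
```

Because the levels are ordered, an application can compare them directly, for example to require a stricter review step whenever the level is above `EDIT_SELECTION`.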

A Gantt chart/process flow illustrating the 'Generation-Verification Loop'. It shows a short 'Human Prompt' phase, a long 'AI Generation' bar, and a quick 'Human Audit' phase using a GUI diff tool (red/green highlights), repeating in cycles.

💡 Digging Deeper

Q: Why is a GUI better than a chat interface?
A: GUIs allow for high-bandwidth auditing; seeing a “red/green” code diff is much faster for a human brain than reading a paragraph of text explaining a change.

Q: What is an “Iron Man Suit” product?
A: It is a product that serves as an augmentation of the human (a suit) but can also act as a semi-autonomous agent when directed.

Q: How do we keep AI “on a leash”?
A: By breaking large tasks into small, concrete chunks and requiring human approval at each step to prevent the agent from “getting lost in the woods.”
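The diff-based auditing and the leash described in the answers above can be combined in a small loop: each proposed edit is shown as a diff, and only approved edits are kept. This is a sketch under stated assumptions. The `steps` callables stand in for model-proposed edits, `approve` stands in for a human clicking accept or reject, and both function names are hypothetical. Only `difflib.unified_diff` is a real standard-library call.

```python
import difflib
from typing import Callable, Iterable

def render_diff(before: str, after: str, name: str = "file") -> str:
    """Unified diff that a GUI could color red/green for fast auditing."""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=f"{name} (before)",
        tofile=f"{name} (after)",
        lineterm="",
    )
    return "\n".join(lines)

def apply_on_a_leash(
    text: str,
    steps: Iterable[Callable[[str], str]],
    approve: Callable[[str], bool],
) -> str:
    """Apply small proposed edits one at a time, keeping only approved ones."""
    for step in steps:
        proposal = step(text)
        if proposal == text:
            continue  # nothing changed, so nothing to review
        if approve(render_diff(text, proposal)):
            text = proposal  # accepted: this becomes the new baseline
        # Rejected proposals are dropped and the loop moves on.
    return text

steps = [
    lambda s: s.replace("foo", "bar"),  # small, easy-to-audit rename
    lambda s: s + "\nDANGER",           # a change the reviewer rejects
]
result = apply_on_a_leash("foo = 1", steps, approve=lambda d: "DANGER" not in d)
print(result)
```

Keeping each step small keeps each diff short, which is what makes the human verification half of the loop fast.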


Vibe Coding and Agent-First Infrastructure

Meeting the Agents Halfway

We are entering the era of “Vibe Coding,” a meme that captures the reality of building functional software through high-level intent rather than syntax mastery. This is a “gateway drug” to software development, allowing individuals to build iOS apps or web tools in a single day without prior knowledge of specific languages like Swift. However, as the number of AI agents grows, we must rethink our digital infrastructure to support them.

Current software is designed for humans to click and read.

To empower agents, we need to provide “LLM-friendly” versions of everything. This includes documentation in Markdown, llms.txt files that describe a domain, and instructions that give curl commands rather than telling the reader to “click a button.” If we meet these “people spirits” halfway by making our data legible to them, agents should be able to use our services far more reliably than they can through interfaces built for human eyes.
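As an illustration of agent-legible documentation, the sketch below generates a Markdown site description of the kind discussed above. The layout (an H1 title, a blockquote summary, then H2 sections of annotated links) follows the llms.txt proposal as I understand it, so treat it as an assumption. The service name, URLs, and the `render_llms_txt` helper are hypothetical.

```python
def render_llms_txt(
    name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Build a Markdown site description for LLM consumption.

    `sections` maps a heading to (title, url, note) link entries.
    """
    out = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        out.append(f"## {heading}")
        out.append("")
        for title, url, note in links:
            out.append(f"- [{title}]({url}): {note}")
        out.append("")
    return "\n".join(out).rstrip() + "\n"

doc = render_llms_txt(
    "Example Service",
    "A hypothetical API for managing widgets.",
    {
        "Docs": [
            (
                "Quickstart",
                "https://example.com/docs/quickstart.md",
                "create a widget with a single curl command",
            ),
        ],
    },
)
print(doc)
```

The linked pages are Markdown rather than HTML, and the note points to a command an agent can run instead of a button it would have to find and click.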

A concept map showing 'Agent-First Infrastructure'. Central node: 'Digital Service'. Branches to: 'Human GUI' (Visuals, Buttons), 'Developer API' (JSON, REST), and 'Agent Interface' (Markdown Docs, llms.txt, Model Context Protocol).


Key Takeaways

Software 3.0 is still in the “1960s era” of AI operating systems: LLM compute is expensive and centralized, so we time-share models in the cloud, and the personal AI computing revolution has not happened yet. The way the technology has spread is nonetheless unusual. Earlier technological shifts often started with the military and large institutions, whereas AI has been “beamed down” to billions of consumers almost simultaneously.

Building for this era requires a mindset shift from total automation to human-centric augmentation. We must build “Iron Man suits” rather than standalone robots. By focusing on the “Autonomy Slider” and creating agent-legible documentation, we can navigate the transition from manual coding to a future where our primary job is to steward and verify the output of these powerful digital spirits.

The next decade will be defined by how far we can safely slide that autonomy slider. As software eats the world, AI agents will eat the software. Those who learn to vibe code, manage context, and build infrastructure for these new digital consumers will be the architects of the next technological age.


Q&A

Q1: What is “Vibe Coding”?
A1: It is the practice of building software by describing requirements in natural language and using AI to generate the implementation, focusing on the “vibe” or intent rather than the underlying syntax.

Q2: How does Karpathy describe the transition of the Tesla Autopilot stack?
A2: He observed that as the system improved, the amount of Software 1.0 (C++ code) decreased while the Software 2.0 (neural network weights) expanded, effectively “eating” the manual code.

Q3: What are the two main ways to speed up the AI-human collaboration loop?
A3: First, utilize GUIs to speed up human verification; second, keep the AI on a leash by using small, incremental prompts rather than asking for massive, un-auditable changes.

Q4: What is an llms.txt file?
A4: It is a proposed standard for a simple Markdown file served from a domain that tells an LLM what the website is about, in a format that is easy for the model to parse and understand.

Q5: Why is Karpathy skeptical of “2025 as the year of agents”?
A5: Drawing from his experience in self-driving, he notes that achieving 100% autonomy is extremely difficult and takes much longer than initial demos suggest; he views this as the “decade of agents.”

Q6: What is “anterograde amnesia” in the context of LLMs?
A6: It refers to the fact that LLMs have fixed weights and do not “learn” or remember new information from their interactions once the session ends, unless that information is explicitly managed in the context window.

Q7: How should documentation change for the AI era?
A7: Documentation should be provided in Markdown rather than just HTML, and instructions like “click here” should be replaced with direct commands (like curl) that an agent can execute.
