By the year 2029, less than four years from this writing, AI will be able to produce the equivalent of 40 hours of human work without guidance.
Timeline
The units of time (1 week/month/year) are interesting milestones, but it's important to keep in mind that AI progress will affect the labor economy before those milestones are reached. The rapid transition to an automated workforce has already started as of March 2025. I suspect that by 2029, nations will need to address the economic problem of automated labor substituting for human labor.
Year 2024: 1 week of work output in 1 hour with human + AI collaboration
Year 2026: 1 month of work output in 1 hour with human + AI collaboration
Year 2029: 1 week of work output by solo AI & 1 year of work output by human + AI collaboration
Year 2032: 1 month of work output by solo AI
Year 2037: 1 year of work output by solo AI
Chart
Assumptions
The current rate of improvement is stable, neither declining nor increasing; see below for the math.
1 week of work is 40 hours, 1 month of work is 160 hours, and 1 year is 2,000 hours.
The time axis increases by two orders of magnitude (10²) per step. The diagonal lines make the rate appear linear, but that is only because the scale jumps by 10² each time. We go from 1 hour of work to 100 hours, then from 100 to 10,000. In the first case the difference is about 99 hours; in the second it is about 9,900.
The metric used is task capability, like the length of software tasks an AI can reliably perform (e.g., from writing a 1-minute script to solving a 1-hour coding problem)
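Under the model developed later in this document (raw capability starting at ~1 minute in 2018 and doubling yearly, multiplied by a ~40x human + AI synergy factor), a quick sketch can check how the timeline milestones fall out. All constants here are those rough assumptions, not measurements:

```python
# Sketch: check the timeline milestones against a simple exponential model.
# Assumptions (rough, from later in this document): raw AI capability starts
# at ~1 minute (0.017 hours) in 2018 and doubles yearly; human + AI
# collaboration multiplies output by ~40x.

def raw_capability_hours(year: int) -> float:
    """Hours of work output per 1 hour of effort, AI alone."""
    return 0.017 * 2 ** (year - 2018)

def human_ai_hours(year: int) -> float:
    """Hours of work output per 1 hour of effort, human + AI."""
    return 40 * raw_capability_hours(year)

WEEK, MONTH, YEAR = 40, 160, 2_000  # hours of work

for year in (2024, 2026, 2029, 2032, 2037):
    print(year, round(raw_capability_hours(year), 1), round(human_ai_hours(year), 1))
```

By this sketch, human + AI clears ~40 hours (a week) around 2024 and solo AI clears it around 2029, roughly matching the milestones above; the 2032 and 2037 solo-AI milestones are conservative relative to a pure exponential.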
Alright, let’s tackle “How fast is AI improving?” with a Fermi-style approach. We’ll estimate historical improvement, establish a way to track it, and then forecast future trends based on reasonable assumptions. Since this is about practicing Fermi calculations, I’ll break it down step-by-step, using rough numbers and intuition, and you can refine it as we go.
Step 1: Define "Improvement" and Pick a Metric
To track AI improvement, we need a measurable proxy. AI is broad—spanning language models, image recognition, robotics, etc.—so let’s focus on something concrete: the performance of large language models (LLMs), since they’ve seen dramatic, well-documented progress. A good metric could be task capability, like the length of software tasks an AI can reliably perform (e.g., from writing a 1-minute script to solving a 1-hour coding problem). Alternatively, we could use compute efficiency (performance per unit of computational power) or benchmark scores (e.g., on standardized tests like MMLU). For simplicity, let’s go with task capability, measured in time, because it’s intuitive and ties to real-world utility.
Step 2: Gather Historical Data (Roughly)
We need a sense of how AI task capability has grown over time. Let’s anchor this with a few key milestones:
2018 (GPT-2 era): Early LLMs could handle basic tasks—say, generating coherent text for a 1-minute task (e.g., a short email). Call it ~1 minute of reliable task capability.
2022 (GPT-3.5/ChatGPT): These models could write simple code or essays, maybe 5-10 minutes of human-equivalent work.
2024 (GPT-4/Claude 3): Advanced models can tackle hour-long tasks, like debugging code or drafting reports, with decent reliability (~1 hour).
This is a rough sketch, but it suggests capability has gone from 1 minute to 60 minutes in about 6 years (2018–2024). That’s a 60x increase in task duration.
Step 3: Estimate the Growth Rate
To track improvement, we need a growth rate. If capability went from 1 minute to 60 minutes in 6 years, we can model it as exponential growth:
Growth factor: 60x over 6 years.
Annual growth rate: For exponential growth, use ( 60^{1/6} ).
( 60^{1/6} \approx 2 ) (since ( 2^6 = 64 \approx 60 )), so roughly 2x per year.
Doubling time: At ~2x annually, capability doubles about once per year (( \ln(2) / \ln(2) = 1 ) year, i.e., ~12 months).
Some reports run faster (e.g., posts on X suggest doubling every 7 months for task capacity), and our 2018 anchor is rough, so let's ballpark it at doubling every 7-12 months historically.
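The two steps of arithmetic above are easy to fumble, so here is a minimal sketch of them; the 60x-over-6-years figure is the rough anchor from Step 2:

```python
import math

# Rough anchors from Step 2: ~1 minute of task capability in 2018,
# ~60 minutes in 2024.
growth_factor = 60   # 60x total
years = 6

annual_rate = growth_factor ** (1 / years)            # ~1.98x per year
doubling_years = math.log(2) / math.log(annual_rate)  # ~1.0 years

print(f"annual growth: {annual_rate:.2f}x, doubling time: {doubling_years:.2f} years")
```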
Step 4: Cross-Check with Compute Trends
AI improvement often ties to computational power (training compute) and algorithmic efficiency. Historical trends show:
Compute scaling: Training compute for top models has grown rapidly, historically up to ~10x per year, more recently ~4-5x per year (e.g., from GPT-3's ~3×10²³ FLOPs in 2020 to GPT-4's ~2×10²⁵ FLOPs in 2023, per rough estimates).
Efficiency gains: Inference efficiency (performance per watt) might improve 2-3x annually as algorithms and hardware optimize.
If task capability doubles every 6-12 months while compute grows up to ~10x yearly (a doubling every ~4 months), capability lags raw compute; scale alone hits diminishing returns, and better architectures (e.g., transformers) and datasets determine how much of that compute becomes usable capability.
Step 5: Set Up a Tracking Method
To track AI improvement historically and ongoing:
Metric: Use “max task duration” (in minutes) an AI can handle at 50% reliability, based on public benchmarks or model releases.
Data Points: Plot known models (e.g., GPT-2: 1 min, GPT-3: 5 min, GPT-4: 60 min) on a timeline.
Growth Model: Fit an exponential curve, ( C(t) = C_0 \cdot 2^{t/T} ), where ( C_0 ) is initial capability (1 min in 2018), ( t ) is years since 2018, and ( T ) is doubling time (~0.5-0.6 years).
Validation: Compare against compute trends (e.g., FLOPs in training) or benchmark scores (e.g., MMLU) to ensure consistency.
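One way to implement this tracking as a sketch; the (year, minutes) anchors are the rough guesses above, not measured benchmark data:

```python
import math

# (year, max task duration in minutes) -- rough anchors, not measured data.
milestones = [(2018, 1), (2022, 5), (2024, 60)]

# Fit log2(capability) = a * t + b by least squares, with t = years since 2018.
ts = [year - 2018 for year, _ in milestones]
ys = [math.log2(minutes) for _, minutes in milestones]
n = len(ts)
t_bar, y_bar = sum(ts) / n, sum(ys) / n
a = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys)) / \
    sum((t - t_bar) ** 2 for t in ts)
b = y_bar - a * t_bar

doubling_time = 1 / a  # years per doubling
print(f"doubling time ≈ {doubling_time:.2f} years")
```

On these three anchors the fit gives a doubling time of roughly 1.1 years, consistent with the ( 60^{1/6} ) estimate; swap in real benchmark results to track it properly.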
Step 6: Forecast Future Trends
Assuming capability doubles roughly once a year (the conservative end of our historical estimate):
2025: From 60 min (2024), double to ~120 min (2 hours).
2026: Double to ~240 min (4 hours).
2029: After 3 more doublings (~3 years), ~32 hours—AI could handle day-long projects.
If the pace accelerates (e.g., to doubling every 6 months due to breakthroughs), we'd hit 32 hours around 2027. If it slows (e.g., to 18-24 months due to hardware/data limits), we'd reach 32 hours closer to 2032 or beyond. Let's assume the current pace holds for now, with a range for uncertainty.
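A sketch of the projection under three pacing scenarios; the doubling periods here are illustrative, not measured:

```python
# Project task capability forward from 1 hour in 2024 under three
# illustrative doubling-time scenarios (years per doubling), and find
# when each scenario crosses the 32-hour ("day-long project") mark.
START_YEAR, START_HOURS = 2024, 1.0
TARGET_HOURS = 32.0

for label, doubling_years in [("fast", 0.5), ("baseline", 1.0), ("slow", 1.5)]:
    capability, year = START_HOURS, float(START_YEAR)
    while capability < TARGET_HOURS:
        capability *= 2
        year += doubling_years
    print(f"{label}: ~32 h reached around {year:.1f}")
```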
Step 7: Sanity Check
Plausibility: 32 hours by 2029 means AI could replace many white-collar tasks. This feels ambitious but aligns with optimistic forecasts (e.g., “most human tasks” in 10 years).
Limits: Compute, data, or energy constraints might cap growth. Moore’s Law slowed; AI’s “law” could too. Still, efficiency gains could sustain it a while longer.
Final Answer
AI task capability grew ~60x from 2018 (1 min) to 2024 (60 min), consistent with doubling roughly every 7-12 months. Tracking it involves plotting max task duration over time and fitting an exponential with a ~1-year doubling. Forecasting out, AI might handle 2-hour tasks by 2025 and day-long tasks (~32 hours) by 2029, assuming no major slowdowns. For a range: ~2027 (if faster) to 2032 (if slower).
That’s a fantastic real-world data point to refine our Fermi estimate! Let’s use your experience—“AI in 2025 helped write software in 1 hour that would’ve taken a developer 2 weeks”—to anchor our calculation and adjust our forecast for “How fast is AI improving?” We’ll integrate this into the framework we built, focusing on task capability, and see how it aligns or shifts our assumptions.
Step 1: Quantify Your Example
First, let’s convert your anecdote into comparable units:
Human developer time: 2 weeks. Assuming a 40-hour workweek, that’s ( 2 \times 40 = 80 ) hours.
AI-assisted time (2025): 1 hour.
Improvement factor: ( 80 , \text{hours} \div 1 , \text{hour} = 80\times ) speedup. This suggests that, with AI (e.g., Claude Code), you achieved in 1 hour what would’ve taken 80 hours solo.
This is a productivity boost, not raw task capability (since you guided the AI), but it’s a strong signal of AI’s practical impact. Let’s assume “task capability” correlates with this speedup—i.e., the AI effectively “compressed” 80 hours of coding logic into 1 hour of interaction.
Step 2: Adjust Historical Baseline
Our earlier estimate pegged 2024 AI (e.g., GPT-4/Claude 3) at ~1 hour of reliable task capability (e.g., debugging or drafting). Your 2025 example jumps to an effective 80-hour task in 1 hour with human oversight. This implies:
2024 baseline: AI alone handles ~1-hour tasks.
2025 with human+AI: AI tackles ~80-hour tasks in 1 hour.
Step 3: Interpret the Leap
This leap suggests either:
Human-AI synergy amplifies output: The AI’s raw capability might still be ~2 hours (doubling from 2024 per our 6-month trend), but your input turbocharges it to 80x.
AI capability jumped more: The AI alone could now handle much bigger tasks (closer to 80 hours), and your 1-hour use reflects that.
For Fermi simplicity, let’s test the first assumption: AI capability doubles every 6-7 months, and human collaboration multiplies the effect.
2024: ( t = 6 , \text{years} ), ( C = 60 , \text{minutes} ), so ( 60 = 1 \cdot 2^{6/T} ).
( 2^{6/T} = 60 ), ( 6/T = \log_2(60) \approx 5.9 ), ( T \approx 1 , \text{year} ) (slower than 6 months, but let’s adjust with new data).
2025: ( t = 7 , \text{years} ). If ( T = 0.5 , \text{years} ) (6 months), ( 2^{7/0.5} = 2^{14} \approx 16,000 , \text{minutes} \approx 267 , \text{hours} ). Too high for raw capability alone.
Your 80-hour task in 1 hour suggests the AI’s raw capability might be lower (e.g., 2-4 hours), but collaboration amplifies it. Let’s assume:
2025 raw capability: Doubled from 60 minutes (2024) to 120 minutes (~2 hours) over the year.
Effective capability: With you guiding Claude Code, it scales to 80 hours’ worth.
This implies a 40x synergy factor (80 hours ÷ 2 hours), which is plausible—AI excels with human prompts.
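In code, the decomposition is just the following; the 2-hour raw-capability figure is the assumption being tested, not a measurement:

```python
# Decompose the observed speedup into raw AI capability × human synergy.
human_solo_hours = 80    # the 2-week developer estimate (2 × 40 h)
ai_assisted_hours = 1    # observed time with AI assistance

speedup = human_solo_hours / ai_assisted_hours   # 80x
effective_hours = speedup * ai_assisted_hours    # 80 h of output per hour

raw_capability_2025 = 2  # hours, assumed (one doubling from 2024's 1 hour)
synergy_factor = effective_hours / raw_capability_2025  # 40x

print(f"speedup: {speedup:.0f}x, synergy factor: {synergy_factor:.0f}x")
```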
Step 4: Update Tracking Method
To track AI improvement with your data:
Metric: Shift to “effective task hours with human oversight” (e.g., 80 hours in 2025).
Historical points:
2018: ~1 minute → 0.017 hours.
2024: ~1 hour.
2025: ~80 hours (human+AI).
Growth rate: From 2018-2024, 1 minute to 1 hour is 60x in 6 years (( 60^{1/6} \approx 2x/year )). From 2024-2025, 1 hour to 80 hours is 80x in 1 year—far faster. This suggests acceleration or a synergy spike.
Let's compromise: raw AI capability doubles roughly every year, but human+AI effective capability grows faster due to better tools/interfaces.
Step 5: Forecast Forward
Assuming raw capability doubles annually from 2025's ~2 hours:
2027: 8 hours × 40 = 320 hours (~8 weeks of dev work in 1 hour).
2029: 32 hours × 40 = 1,280 hours (~32 weeks, or 8 months, in 1 hour).
Step 6: Sanity Check
2025: Your 80-hour task fits perfectly.
2029: 8 months of dev work in 1 hour feels extreme but aligns with aggressive AI forecasts (e.g., “replacing most coding jobs”).
Limits: Data quality, hardware, or task complexity might cap this. Still, even half that (4 months in 1 hour) is wild.
Revised Answer
AI's raw task capability seems to double roughly every year (1 hour in 2024 → 2 hours in 2025), but human+AI synergy (e.g., with Claude Code) boosts effective output by ~40x, hitting 80 hours in 2025 per your experience. Tracking this means measuring "effective task hours" over time, fitting an exponential (doubling annually for raw AI, amplified by collaboration). By 2029, AI alone might handle 32-hour tasks, and with humans, ~1,280 hours (8 months) in 1 hour—though limits might slow it.
You’re absolutely right—considering the nature of the thing in question is a brilliant way to ground a Fermi calculation and make it more robust. For “How fast is AI improving?” breaking it down into its core components—compute, algorithms, and data—lets us cross-reference our doubling estimate (every 6 months) against the factors driving that growth. This not only validates the rate but also reveals where the improvement comes from and where it might hit limits. Let’s dive in and see how these three elements contribute.
Step 1: Define AI’s Nature
AI’s improvement stems from:
Compute: The raw processing power (e.g., FLOPs in training, hardware efficiency).
Algorithms: The efficiency and cleverness of the models (e.g., transformers, optimization techniques).
Data: The quantity and quality of training data (e.g., text corpora, labeled datasets).
Our earlier estimate—AI task capability doubling every 6-12 months—can be dissected by asking how each factor contributes to that pace.
Step 2: Estimate Historical Growth of Each Component
Let's assign rough growth rates based on trends and intuition, then tie them to the overall capability doubling.
Compute
Historical trend: Moore’s Law (doubling chip transistor counts every ~2 years) slowed, but AI compute has outpaced it due to specialized hardware (e.g., GPUs, TPUs). Reports suggest training compute for top models grows ~10x per year.
Doubling time: ( 10x = 2^{3.32} ), so ( 3.32 , \text{doublings/year} ), or ~4 months per doubling.
2024-2025: Your example (an 80-hour effective task in 1 hour) aligns with compute scaling. If the training compute behind Claude Code's model doubled from 2024 (~10²⁵ FLOPs) to 2025 (~2 × 10²⁵ FLOPs), that one doubling over the year would plausibly boost capability.
Algorithms
Efficiency gains: Algorithmic improvements often outstrip hardware. For example, transformers (2017) vastly improved NLP over prior models. Studies suggest algorithmic efficiency in AI doubles every ~16 months (slower than compute), but breakthroughs can spike this.
Doubling time: Call it ~1 year for steady gains, with occasional jumps (e.g., 2-3x overnight from a new architecture).
Contribution: From 2018-2024, capability went from 1 minute to 60 minutes (60x). If compute grew ~10x/year (up to ~10⁶x in 6 years at that rate, likely less in practice), capability captured only a fraction of that; algorithmic gains (a few doublings over 6 years) helped refine raw power into usable output.
Data
Growth rate: Internet data doubles every ~2 years (e.g., web pages, user-generated content), but AI needs curated, high-quality data. Synthetic data (AI-generated) and better curation might push effective data growth to ~2x/year.
Doubling time: ~12 months.
Impact: More data improves model generalization. From 2024 (1 hour) to 2025 (2 hours raw, 80 hours effective), data might’ve doubled once, enhancing Claude Code’s ability to handle complex tasks like your 2-week coding project.
Step 3: Cross-Reference with the Doubling Rate
Our overall doubling of task capability (~2x per cycle) should reflect a combined effect of these factors. Let's model it:
If compute contributes ~1.7x per cycle and algorithms and data each add ~1.2x, the product is ( 1.7 \times 1.2 \times 1.2 \approx 2.45x ), in the neighborhood of our 2x target.
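A sketch of the per-cycle combination; the 1.7x/1.2x/1.2x splits are guesses, not measurements:

```python
# Combine per-cycle contributions of compute, algorithms, and data
# into one overall capability gain per doubling cycle.
compute_factor = 1.7    # guess: compute dominates each capability doubling
algorithm_factor = 1.2  # guess
data_factor = 1.2       # guess

combined = compute_factor * algorithm_factor * data_factor
print(f"combined per-cycle gain: {combined:.2f}x")  # near the ~2x target
```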
Step 4: Tie to Your 2025 Example
Your 80x speedup (80 hours → 1 hour):
Raw capability: 2024 (1 hour) to 2025 (2 hours) = 2x from doubling.
Synergy: Compute (1.7x) × Algorithms (1.2x) × Data (1.2x) ≈ 2.45x, amplified by human input (~40x, as before).
Nature’s role: Compute likely led (new hardware for Claude Code), with algorithms (better code generation) and data (more coding examples) boosting the leap.
Step 5: Forecast the Components
Compute: 4-month doubling sustains if hardware innovation (e.g., quantum, ASICs) continues; could slow to 6-8 months if energy/cost limits hit.
Algorithms: 12-month doubling, with sporadic breakthroughs (e.g., post-transformer leap). If a big one lands, capability could jump 4x suddenly.
Data: 12-month doubling, but synthetic data could accelerate to 6 months, or stall if quality plateaus.
By 2029:
Raw AI: 32 hours (4 doublings from 2025's 2 hours).
Human+AI: ~1,280 hours (32 × 40), if synergy holds.
Limits: Compute might cap at energy grids (~10⁵⁰ FLOPs max); algorithms/data could compensate.
Step 6: Sanity Check
Nature aligns: Compute’s fast growth drives the 6-month pace, tempered by slower data/algorithm gains.
Your case: 80x in 2025 fits a compute-led spike (e.g., Anthropic scaled training) plus synergy.
Refined Answer
AI's task capability doubles roughly every 6-12 months, driven by compute (~4-month doubling), algorithms (~12-month doubling), and data (~12-month doubling); these gains overlap rather than multiply cleanly, combining to roughly 2x per cycle. Your 2025 example (80 hours in 1 hour) reflects this, with compute leading and human synergy amplifying it. By 2029, expect ~32 hours raw, ~1,280 hours effective, unless compute slows or data/algorithms bottleneck.
I’ll also clarify how to represent the “raw AI doubling” and “human synergy” on the axes, based on our Fermi calculations.
Graph Concept
X-axis: Time (years), from 2018 to 2054.
Y-axis: Task capability (in hours), on a logarithmic scale to capture exponential growth.
Two lines:
Raw AI capability: Doubling every year, starting at 1 minute (0.017 hours) in 2018.
Human+AI synergy: Raw capability multiplied by a synergy factor (we estimated ~40x from your 2025 example), assumed constant for simplicity.
Calculate Data Points
Let’s compute key points based on our model:
Raw AI: Doubles every year from 2018's 0.017 hours.
Human+AI: Raw AI × 40 (synergy factor).
Formula
Raw AI: ( C(t) = 0.017 \cdot 2^{t} ), where ( t ) is years since 2018.
Human+AI: ( 40 \cdot C(t) ).
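The data points can be generated directly from these formulas. A sketch; annual doubling is used because that is what the tabulated 2024/2025 anchors (1.088 and 2.176 hours) imply, and plotting itself is left out:

```python
# Generate the data points for both curves: raw AI capability (hours of
# work per hour of effort) and human+AI (40x synergy), doubling yearly
# from 0.017 hours (≈1 minute) in 2018.
SYNERGY = 40

def raw_ai(t: float) -> float:
    """Raw AI capability in hours, t = years since 2018."""
    return 0.017 * 2 ** t

for year in (2018, 2024, 2025, 2029, 2034, 2044, 2054):
    t = year - 2018
    print(f"{year}: raw={raw_ai(t):,.3f} h, human+AI={SYNERGY * raw_ai(t):,.2f} h")
```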
Data Points (Selected Years)

| Year | ( t ) (years since 2018) | Raw AI (hours) | Human+AI (hours) |
|------|--------------------------|----------------|------------------|
| 2018 | 0  | 0.017 | 0.68 |
| 2024 | 6  | 1.088 | 43.52 |
| 2025 | 7  | 2.176 | 87.04 |
| 2029 | 11 | 34.816 | 1,392.64 |
| 2034 | 16 | 1,114.11 | 44,564.48 |
| 2044 | 26 | 1,140,850.69 | 45,634,027.52 |
| 2054 | 36 | 1,168,231,104.51 | 46,729,244,180.48 |

Notes:
2025's Human+AI (~87 hours) matches your 80-hour task, validating the synergy factor.
By 2054, numbers get astronomical—about 1.17 billion hours raw (~133,000 years), 46.7 billion hours with synergy (~5.3 million years)—highlighting the need for limits.
Graph Structure
X-axis (Time): Linear scale, 2018 to 2054 (36 years total). Mark every 5 years: 2018, 2023, 2028, 2033, 2038, 2043, 2048, 2053, 2054.
Y-axis (Capability, log scale): Logarithmic, from ( 10^{-2} ) (0.01 hours) to ( 10^{11} ) (100 billion hours), to fit both curves.
Raw AI line: Starts at 0.017 (just above ( 10^{-2} )), rises steeply, hitting ~1 hour by 2024, ~35 hours by 2029, and ~1.2 billion hours by 2054.
Human+AI line: Starts at 0.68 (below 1), parallels Raw AI but 40x higher, hitting ~87 hours in 2025, ~1,400 hours in 2029, and ~47 billion hours by 2054.
Reality Check with AI’s Nature
Compute: 4-month doubling might slow (e.g., to 12 months) by 2030s due to energy/cost limits (~10⁵⁰ FLOPs max), flattening the curve.
Algorithms: Breakthroughs could add step-jumps (e.g., 4x in a year), but steady doubling might taper to 18-24 months.
Data: Quality plateaus could cap growth unless synthetic data scales infinitely.
Adjusted 2054: If doubling slows to 24 months after 2030, Raw AI might hit ~3×10⁵ hours (~30 years), Human+AI ~10⁷ hours (~1,300 years)—still huge but more plausible.
Fermi calculations are a fantastic way to tackle big, qualitative questions with rough quantitative estimates. Named after physicist Enrico Fermi, these back-of-the-envelope calculations rely on reasonable assumptions, approximations, and a structured approach to break down complex problems. To get good at them, you’ll need a mix of data-gathering skills and math skills, along with a mindset that’s comfortable with uncertainty. Let’s break it down.
Data-Gathering Skills
Fermi problems often start with little to no hard data provided, so you need to develop the ability to pull useful information from your own knowledge, intuition, or quick lookups when allowed. Here’s what you’ll want to work on:
General Knowledge and Memory
Build a mental library of useful facts and figures—like population sizes (e.g., Earth: ~8 billion, U.S.: ~330 million), distances (e.g., Earth’s circumference: ~40,000 km), time scales (e.g., seconds in a year: ~31 million), or everyday metrics (e.g., average car length: ~5 meters). These act as anchor points for estimates.
Practice recalling rough orders of magnitude for common things: energy use, economic scales, or physical constants.
Assumption-Making
Learn to make educated guesses based on what you know. For example, if estimating how many piano tuners are in a city, you might guess how many households have pianos and how often they need tuning.
Get comfortable reasoning from analogies—like comparing an unknown quantity to something familiar (e.g., “a football field is about 100 meters long, so…”).
Breaking Problems into Parts
Develop the ability to decompose a big question into smaller, manageable pieces. For instance, “How many jellybeans fit in a car?” becomes: volume of a car ÷ volume of a jellybean.
Practice spotting the key variables that drive the answer, ignoring irrelevant details.
Sanity Checking
Cultivate a sense of whether a number feels “right.” If you estimate 10 million piano tuners in New York City (pop. ~8 million), something’s off—cross-check with intuition or related data (e.g., how many people could feasibly be employed in such a niche job?).
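The classic piano-tuner estimate makes a good sanity-check template. Every number below is an assumption, which is the point of the exercise:

```python
# Classic Fermi estimate: piano tuners in a city of ~8 million people.
population = 8_000_000
people_per_household = 2          # assumption
pianos_per_household = 1 / 20     # assumption: 1 in 20 households
tunings_per_piano_per_year = 1    # assumption

households = population / people_per_household
pianos = households * pianos_per_household
tunings_needed = pianos * tunings_per_piano_per_year   # per year

tunings_per_tuner_per_day = 4     # assumption: ~2 h per tuning, plus travel
workdays_per_year = 250
tunings_per_tuner = tunings_per_tuner_per_day * workdays_per_year

tuners = tunings_needed / tunings_per_tuner
print(f"~{tuners:.0f} piano tuners")  # ~200 is plausible; 10 million is not
```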
Resourcefulness (When Allowed)
If you’re practicing with access to tools, get good at quickly finding rough stats. For example, a web search or X posts might give you population densities or industry trends to refine your guesses. The skill here is knowing what to look for and not getting bogged down in precision.
Math Skills
Fermi calculations don’t require advanced math—just a solid grasp of basics and the ability to manipulate numbers efficiently. Here’s what you need:
Arithmetic and Estimation
Master quick mental math: addition, subtraction, multiplication, division. For example, 8 billion ÷ 300 million ≈ 26.67 (round to 25 or 30).
Get fluent with powers of 10 (e.g., 10³ = 1,000, 10⁶ = 1 million) since Fermi answers are often in orders of magnitude.
Exponents and Logarithms (Light Use)
Understand how to work with exponential growth or decay, like doubling times (e.g., Rule of 70: doubling time ≈ 70 ÷ growth rate %).
A basic sense of logarithms helps for scaling—e.g., if something spans 3 orders of magnitude, it’s 10³ = 1,000 times bigger.
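The Rule of 70 is a one-liner, and comparing it against the exact formula shows how good the approximation is:

```python
import math

def rule_of_70(growth_rate_percent: float) -> float:
    """Rule of 70: doubling time ≈ 70 / growth rate (%)."""
    return 70 / growth_rate_percent

def exact_doubling(growth_rate_percent: float) -> float:
    """Exact doubling time: ln(2) / ln(1 + r)."""
    r = growth_rate_percent / 100
    return math.log(2) / math.log(1 + r)

# At 7% annual growth: rule of 70 gives 10.0 years, exact is ~10.2.
print(rule_of_70(7), round(exact_doubling(7), 1))
```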
Fractions and Ratios
Be able to estimate proportions (e.g., “1 in 100 people might own a piano” → 1% of the population) and scale them up or down.
Practice converting between fractions, decimals, and percentages quickly (e.g., 1/3 ≈ 0.33 ≈ 33%).
Dimensional Analysis
Ensure units make sense—e.g., if you’re calculating volume (m³) divided by area (m²), the result is a length (m). This catches errors.
Practice chaining rates, like “pianos tuned per tuner per day” × “days per year” = “pianos tuned per tuner per year.”
Rounding and Simplification
Get good at rounding to the nearest power of 10 or simple number (e.g., 7,842 ≈ 8,000) to keep calculations manageable.
Simplify complex expressions by approximating—e.g., 365 days ≈ 400 or 300, depending on what’s easier.
How to Tie It Together
To turn qualitative questions into quantitative answers, practice this process:
Define the Question Clearly: What are you solving for? (e.g., “How many trees are in a forest?”)
Break It Down: Identify key components (e.g., area of the forest × trees per unit area).
Estimate Each Part: Use your knowledge or assumptions (e.g., forest is 10 km², maybe 100 trees per km²).
Calculate: Multiply, divide, etc. (10 km² × 100 trees/km² = 1,000 trees).
Check Reasonableness: Does 1,000 trees sound plausible? Adjust if needed (e.g., sparse forest vs. dense jungle).
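The five steps above, applied to the forest example; all inputs are the illustrative assumptions from the text:

```python
# Step-by-step Fermi estimate: trees in a forest.
# 1. Define: how many trees?   2. Break down: area × density.
forest_area_km2 = 10    # assumption
trees_per_km2 = 100     # assumption (sparse; a dense jungle could be far higher)

# 3-4. Estimate each part, then calculate.
trees = forest_area_km2 * trees_per_km2
print(trees)  # 1000

# 5. Check reasonableness: adjust the density estimate for
# sparse woodland vs. dense jungle if 1,000 feels off.
```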
Practice Tips
Start with classic Fermi problems: “How many hairs are on a human head?” or “How much does all the rain in a storm weigh?”
Time yourself to build speed and confidence.
Compare your estimates to real data afterward (when possible) to refine your intuition.
With these skills, you’ll be able to take a vague, sprawling question and wrestle it into a number that’s usually within a factor of 10 of the truth—good enough for Fermi!