This weekend I finally finished reading Why Machines Learn: The Elegant Math Behind AI (by Anil Ananthaswamy). It took me seven months—an unusually long time for a 500-page book. But the detour was worth it: the book kept sending me down side paths, like brushing up on linear algebra and derivatives—topics I hadn’t revisited in nearly three decades. Now that I’m done, the book feels like a perfect capstone to a journey that began in August 2022 with another book: The Alignment Problem (by Brian Christian).
Like many others, I stumbled upon ChatGPT around July 2023 and spent the next few months moving from amusement to bewilderment to full-blown fascination. Over the past three years, I’ve been going down the rabbit hole—trying to understand how large language models work, and why they work so well.
Influential Reads and Resources
Below are seven resources that deeply shaped my understanding of large language models. I’ve included the month and year I came across each because, in hindsight, the order seemed to matter.
- Aug 2022: The Alignment Problem (by Brian Christian). I was immediately drawn in by its account of machine learning concepts, their history, and their complexity. I remember googling “Project Pigeon” and “Animal Behavior Enterprises” because they just sounded too fantastical to be real. Over the next few months I read Brian Christian’s other books. His writing is the perfect combination of philosophy and technology. Not much happened in my AI journey after that, until things reignited when I started playing with ChatGPT in mid-2023.
- Aug 2023: MIT’s Introduction to Deep Learning (2023) | 6.S191. An absolute gem of a starting point for deep learning concepts. I found myself pausing often to look up the same terms in Brian Christian’s book from a year earlier. The full playlist of videos covers multiple years of the course, taught by Alexander and Ava Amini, who are excellent instructors. I started with the 2023 lectures, then went through them again when they refreshed the course in 2024. The 2025 version seems nearly identical.
- Nov 2023: I read The Coming Wave (by Mustafa Suleyman). A zoomed-out view of AI’s broad, transformative potential. It didn’t dive into deep learning mechanics, but it was still a worthwhile read for its perspective on societal and national-level impacts.
- Dec 2023: I watched several Harvard AI videos, but nothing came close to the clarity of Lecture 6 of CS50’s Introduction to Artificial Intelligence with Python 2023. CS50 legend David Malan is behind the course, and Brian Yu delivers the lecture brilliantly. The second half of the lecture offers an intuitive explanation of how words (in language) get processed through the transformer architecture. I also came across a CS50 Tech Talk: GPT-4 – How does it work, and how do I build apps with it?, which was super useful from a startup and product-builder perspective.
- April 2024: I read Artificial Intelligence: A Guide for Thinking Humans (by Melanie Mitchell), which surfaced questions around the definition of intelligence while explaining the roles of abstraction and analogies when thinking about these systems. Abstraction arising from the vectorization of words in a vocabulary: that’s an important concept if you want to think about the potential of LLMs. Prof. Mitchell also has several engaging talks on YouTube and a podcast worth exploring.
- November 2024: Thanks to YouTube’s recommendation engine I stumbled on an 8-minute video titled ‘Large Language Models explained briefly’ on the 3Blue1Brown channel. Grant Sanderson, the mind behind the channel, is unmatched in making math and science click visually. His open-source animation engine for explaining math (manim) is a marvel. His explainer videos on transformers, attention, and even foundational courses on linear algebra and calculus changed the way I saw these topics. I had never learned basic math in such a visual, intuitive way before.
- December 2024: I started the book Why Machines Learn: The Elegant Math Behind AI (by Anil Ananthaswamy). It took the longest time to absorb but brought things full circle—a technical primer that returned me to the foundations of linear algebra and the historical arc of AI pioneers. It’s a great reminder: even with strong mathematical underpinnings, these systems often behave in ways that math alone can’t explain. That’s why so many breakthroughs remain empirical. I’m grateful Anil wrote this book—it offered an end-to-end view of the field, especially timely for where I was in my intuition-building journey.
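The vectorization idea from Melanie Mitchell’s book (and from 3Blue1Brown’s embedding videos) can be sketched in a few lines of code. This is a toy illustration, not how any real model stores its embeddings: the three-dimensional vectors below are made-up numbers, whereas real models learn vectors with hundreds or thousands of dimensions. The point is only that once words become vectors, “relatedness” becomes a measurable angle between them.

```python
import math

# Toy "embeddings" with invented values; real models learn these
# from data, in far higher dimensions.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.2, 0.1],
    "apple": [0.1, 0.1, 0.9],
}

def cosine_similarity(a, b):
    """Angle-based similarity: near 1.0 means pointing the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Related words sit closer together in vector space than unrelated ones.
print(cosine_similarity(vectors["king"], vectors["queen"]))
print(cosine_similarity(vectors["king"], vectors["apple"]))
```

That geometric closeness is the “abstraction” the book points at: the model never defines what a king or a queen is, yet the vector space encodes that they have more in common with each other than either does with an apple.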
Less Memorable, But Still Part of the Journey
As I looked through my purchase history, I found a few books that didn’t leave a lasting impression. Maybe this list will help someone else invest their time more wisely:
- Jan 2024: Superintelligence: Paths, Dangers, Strategies (by Nick Bostrom). A philosophical and technical exploration of AI surpassing human intelligence. It felt a bit too high-level—and somewhat dated.
- Feb 2024: Human Compatible (by Stuart Russell). Focused on the risks of superintelligent AI, but didn’t offer much in terms of clear explanations. It also felt a bit dated.
- April 2024: A Human’s Guide to Machine Intelligence (by Kartik Hosanagar). Covers how algorithms influence daily life and touches on fairness and transparency—but didn’t leave a strong impression.
- May 2024: Mind Wide Open: Your Brain and the Neuroscience of Everyday Life (by Steven Johnson). Not the strongest on explaining core neuroscience concepts, but Steven Johnson’s storytelling is compelling. It rekindled my interest in how the human brain works—read it if you’re looking to be fascinated more than informed.
- May 2024: The Deep Learning Revolution (by Terrence J. Sejnowski). Offers an AI pioneer’s perspective on the field’s historical arc and major breakthroughs—but doesn’t go deep into explaining key concepts.
- June 2024: The World I See (by Dr. Fei-Fei Li). Fei-Fei Li is a legend, and this book shares her personal journey—especially around ImageNet—and her reflections on ethics and AI’s future. But it leaned more memoir than technical insight.
- July 2024: Co-Intelligence (by Ethan Mollick). I was on the fence about whether to include this here or in the list above. It’s one of the few practical guides on adapting your work style in the age of AI. Ethan Mollick is worth following if you want to understand AI’s strengths and limits in various professional and creative contexts. I’ve continued to read his newsletter regularly.
- August 2024: A Brief History of Intelligence (by Max S. Bennett) is a sweeping look at the biological evolution of intelligence, but the framework felt too high-level for what I was seeking. I also read Artificial Unintelligence (by Meredith Broussard), which raises valid critiques, but many examples felt cherry-picked—it didn’t leave a lasting impression.
Looking back, this reading path has been serendipitous and winding. I ended up building and then rebuilding my intuition several times. I’ve made plenty of notes along the way, and my next step is to start organizing them. The space is evolving fast, and staying oriented is no small task. For anyone just starting out, I hope this list offers a map…or at least a few useful signposts.