Building a second brain with Karpathy's LLM Wiki

Published on 11 April, 2026

Quick Summary

Andrej Karpathy, co-founder of OpenAI and former Director of AI at Tesla, has built a second brain system he calls LLM Wiki. Instead of using AI just to answer questions or write code faster, Karpathy lets AI build, maintain, and link a personal research wiki. His wiki has reached around 100 articles and 400,000 words, all written and updated by AI. Unlike traditional RAG, which retrieves passages afresh for each query and keeps nothing afterward, LLM Wiki operates on a "compilation" principle: AI compiles raw documents into structured knowledge, creates backlinks, flags contradictions, and keeps the pages up to date. The system requires only a raw folder for source documents and a wiki folder for Markdown files, and it runs locally with Obsidian, with no complex database or vendor lock-in. LLM Wiki marks a shift from using AI to "ask and answer" toward using AI to "build and manage knowledge long-term," and it is among the most practical second brain approaches available today.

Andrej Karpathy, co-founder of OpenAI, former Director of AI at Tesla, and the person who coined the term "vibe coding," shared on X how he uses AI, and the answer isn't writing code faster. It's building a self-maintaining, self-linking, self-updating knowledge system for a second brain, which he calls LLM Wiki. His research wiki on a single topic has reached 100 articles and 400,000 words, and notably, every word was written by AI without him typing a single character.

The problem with how we currently use AI to organize knowledge

Does RAG accumulate knowledge over time the way our brains do?

Most current AI tools process documents using a RAG model: you upload a document, ask a question, the system finds relevant passages, and the AI synthesizes an answer. Google's NotebookLM, ChatGPT with file uploads, and most AI workflows use this approach because it's simple and easy to deploy.

But Karpathy points to a core problem that few people notice: RAG does not accumulate knowledge. Every time you ask a question, the system starts from scratch, reading the documents again, finding relevant passages, assembling an answer. Ask the same question the next day and it repeats the entire process as if nothing happened before. A document from March and a document from October don't connect to each other on their own. Nothing accumulates and nothing is learned from the previous session, which is nothing like how our brains actually work.

Karpathy describes the shift in his own thinking with one short sentence that says a lot: most of the tokens he now consumes are no longer going into manipulating code but into manipulating knowledge.

How does LLM Wiki work?

LLM Wiki is not software, it's an Obsidian thinking architecture

Karpathy's idea is not a new piece of software or a library. He published it as an "idea file": a GitHub Gist designed to be copy-pasted directly into an AI agent such as Claude Code or OpenAI Codex, which then builds the system together with the user according to that architecture. There is no LLM Wiki package to install. Instead, you describe the architecture to the AI and the AI implements it for you, with Obsidian serving as the viewer.

You can build your own Wiki with Obsidian

Three core architectural layers of the Wiki

The system is organized into three distinct layers, each playing an irreplaceable role:

Figure: Diagram of Karpathy's LLM Wiki architecture layers

Raw source folder (raw/): Where you drop any document, whether PDF, article, transcript, note, or tweet. The AI reads from this folder but never modifies it. The design principle here is important: collect first, organize later. You don't need to sort or prepare documents before adding them.
Wiki folder (wiki/): This holds all the Markdown files that AI creates and maintains. It's where knowledge is compiled, linked, and synthesized. Every document in raw/ gets read by AI and integrated into the wiki, updating existing pages, noting contradictions, and creating backlinks to related concepts.
Configuration file (CLAUDE.md or equivalent): A ruleset that tells the AI how to organize the wiki, format articles, handle contradictions, and maintain consistency across the entire system.
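
The rule that raw/ is read-only can be checked mechanically. The sketch below is an illustration rather than part of Karpathy's Gist: the helper names snapshot and changed_files are hypothetical. It records a SHA-256 hash of every file in raw/ before an agent run and compares it with a second snapshot afterward.

```python
import hashlib
from pathlib import Path


def snapshot(folder: Path) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 hash of its contents."""
    return {
        p.relative_to(folder).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return files that were edited or deleted between two snapshots.

    Files that only appear in `after` are new source documents, which is
    allowed, so they are not reported.
    """
    return sorted(path for path, digest in before.items() if after.get(path) != digest)
```

Taking one snapshot before the agent runs and one after, an empty result from changed_files means the source documents were left untouched.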

Karpathy describes the relationship between components with one vivid sentence: "Obsidian is the IDE. The LLM is the programmer. The wiki is the codebase." You don't write the wiki yourself. Instead, you ask questions and explore while AI handles the tedious work of maintaining and updating the knowledge base.

The self-maintaining loop is the real differentiator

Three operations running continuously without intervention

What makes LLM Wiki different from ordinary AI note-taking tools is the active loop that runs after the wiki is built. AI doesn't just summarize documents once and stop. It runs three continuous operations:

Ingest: When you drop a new document into the source folder, AI reads it, extracts key information, and integrates it into the wiki by updating existing pages, creating new ones where needed, and flagging where new information contradicts old information rather than arbitrarily deleting either.
Query: You ask in natural language, and because the wiki has already been compiled and structured, AI answers with high accuracy and can cite specific pages rather than assembling an answer from scattered passages the way standard RAG does.
Lint: AI periodically scans the entire wiki to detect broken links, isolated pages with no connections to the rest, contradictions between pages, and knowledge gaps not yet covered. Karpathy calls this "CI/CD for the knowledge base," meaning the system audits its own quality continuously.
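
Part of the Lint operation needs no model at all. As a minimal sketch, assuming the wiki uses Obsidian-style [[wikilinks]] and that a page's name is its file name without the .md extension, the function below finds broken links and pages that nothing links to. The name lint_wiki is hypothetical, and detecting contradictions or knowledge gaps would still require the LLM.

```python
import re
from pathlib import Path

# Captures the target of [[Page]], [[Page|alias]] and [[Page#Heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def lint_wiki(wiki: Path) -> dict[str, list]:
    """Report broken wikilinks and orphan pages (pages with no inbound links)."""
    pages = {p.stem: p.read_text(encoding="utf-8") for p in wiki.rglob("*.md")}
    broken: list[tuple[str, str]] = []
    linked: set[str] = set()
    for name, text in pages.items():
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target not in pages:
                broken.append((name, target))
            elif target != name:  # a self-link does not count as a connection
                linked.add(target)
    orphans = sorted(name for name in pages if name not in linked)
    return {"broken_links": sorted(broken), "orphans": orphans}
```

A top-level index page will usually show up as an orphan under this definition, since other pages rarely link back to it, so in practice it would be excluded from the report.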

Karpathy explains why this system is more sustainable than human-maintained wikis with one simple but precise observation: "People give up on wikis because the maintenance burden grows faster than the value they deliver. LLMs don't get tired, don't forget to update cross-references, and can edit 15 files in a single run."

Why RAG isn't needed at personal scale

Context windows are now large enough to replace vector databases

The most debated argument in Karpathy's proposal is that RAG is unnecessary at personal scale. His logic is this: a comprehensive second brain covering an entire research domain typically compiles to somewhere between 500,000 and 2 million tokens in Markdown. With the long context windows available in current models, that entire wiki can fit into a single query context without needing any complex vector search infrastructure.

Karpathy reports that at around 100 articles and 400,000 words, the system handles complex questions well without any vector database or RAG infrastructure, because AI builds and maintains its own index and summary files and navigates the full text collection efficiently through that self-built structure.
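
To make the idea of a self-built index concrete, here is a deterministic stand-in. In the real system the agent writes the index and the summaries itself; this sketch simply takes the first non-heading line of each page as its summary. The file name index.md and the function names are assumptions for illustration.

```python
from pathlib import Path


def first_line_summary(text: str) -> str:
    """Return the first non-empty line that is not a Markdown heading."""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            return stripped
    return ""


def build_index(wiki: Path, index_name: str = "index.md") -> str:
    """Write a one-line-per-page index into the wiki folder and return its text."""
    lines = ["# Index", ""]
    for page in sorted(wiki.rglob("*.md")):
        if page.name == index_name:
            continue  # do not list the index inside itself
        summary = first_line_summary(page.read_text(encoding="utf-8"))
        lines.append(f"- [[{page.stem}]]: {summary}")
    content = "\n".join(lines) + "\n"
    (wiki / index_name).write_text(content, encoding="utf-8")
    return content
```

An index like this is small enough to read in full on every query, which lets the agent decide which pages to open instead of searching the whole collection.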

One important caveat: the context window is still a real limit. When a wiki grows past a certain threshold, perhaps a few million tokens, the context window becomes a genuine bottleneck, and at that point search tools like qmd (a hybrid BM25/vector search tool for Markdown) will need to be integrated to maintain performance.
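
A rough way to see where a wiki sits relative to that limit is to convert its word count to tokens. The sketch below assumes about 1.3 tokens per English word, which is a common rule of thumb and not a figure from Karpathy. Real counts depend on the tokenizer, and the reserve parameter is a hypothetical allowance for the question and the answer.

```python
from pathlib import Path

TOKENS_PER_WORD = 1.3  # assumption: rough average for English prose


def count_words(wiki: Path) -> int:
    """Count whitespace-separated words across all Markdown files."""
    return sum(len(p.read_text(encoding="utf-8").split()) for p in wiki.rglob("*.md"))


def estimate_tokens(words: int, tokens_per_word: float = TOKENS_PER_WORD) -> int:
    """Convert a word count into an approximate token count."""
    return round(words * tokens_per_word)


def fits_in_context(words: int, context_tokens: int, reserve: int = 50_000) -> bool:
    """Check whether the wiki plus a reserve for the conversation fits the window."""
    return estimate_tokens(words) + reserve <= context_tokens


print(estimate_tokens(400_000))  # 520000
```

Under this assumption a 400,000-word wiki comes to roughly 520,000 tokens, the low end of the range quoted above. That would fit a one-million-token window but not a 200,000-token one, where the agent has to rely on its index and open pages selectively.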

How to get started in 15 minutes

The first steps to building your first wiki

Karpathy designed this system so that anyone with Claude Code or an equivalent AI agent tool can deploy it immediately without deep technical knowledge. The basic process has four steps:

Create a new Obsidian vault. This is simply a folder on your computer where all Markdown files will be stored. Obsidian is just the interface you use to read and navigate.
Create two subfolders: raw/ for source documents and wiki/ for AI to write and maintain. These two folders are all you need to set up manually.
Copy Karpathy's Gist from GitHub and paste it into Claude Code or whichever AI agent you're using. The Gist is written as a set of instructions for the agent, letting the agent build the detailed implementation together with you rather than you doing everything yourself.
Drop a few initial documents into raw/ and let the agent begin compiling the wiki. From here everything runs on its own.
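
The manual part of these steps (the vault folder, the two subfolders, and a rules file for the agent) can be sketched as a short script. The placeholder text written into CLAUDE.md is an assumption for illustration; in practice the contents would come from Karpathy's Gist and your own conventions.

```python
from pathlib import Path

# Hypothetical starter rules, to be replaced by the real instructions.
PLACEHOLDER_RULES = (
    "# Wiki rules\n\n"
    "- Read sources from raw/ and never modify them.\n"
    "- Write and maintain all pages in wiki/ as Markdown.\n"
)


def scaffold_vault(root: Path, config_name: str = "CLAUDE.md") -> dict[str, Path]:
    """Create raw/, wiki/ and a rules file under root, keeping anything that exists."""
    raw = root / "raw"
    wiki = root / "wiki"
    raw.mkdir(parents=True, exist_ok=True)
    wiki.mkdir(parents=True, exist_ok=True)
    config = root / config_name
    if not config.exists():  # never overwrite rules the user already wrote
        config.write_text(PLACEHOLDER_RULES, encoding="utf-8")
    return {"raw": raw, "wiki": wiki, "config": config}
```

Calling scaffold_vault(Path("my-vault")) and then opening my-vault in Obsidian gives the agent a place to start. The function is safe to run again because it never replaces existing files.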

The wiki files stay entirely on your machine, and the system has just two dependencies: Obsidian for viewing and navigation, and an AI agent for writing and maintenance. This means no vendor lock-in, no data sent to the cloud if you use a local model, and no subscription fees beyond the API costs of whichever hosted model you choose.

LLM Wiki compared to MemPalace, Mem0, and Zep

Four different philosophies for the same problem

Around the same time Karpathy's LLM Wiki gained attention, the AI community was also discussing MemPalace, an open-source memory system built by actress Milla Jovovich and engineer Ben Sigman that scored 96.6% on the LongMemEval benchmark. All four systems, LLM Wiki, MemPalace, Mem0, and Zep, address the problem of AI not remembering context between sessions, but they do so through four very different philosophies suited to four different needs.

The easiest way to understand the differences is through a concrete scenario: you have six months of AI conversations about a research project, covering every decision, every argument, every discarded option. You open a new session and ask: "Why did we choose direction A over B back then?" Each system answers in a completely different way.

Mem0 works like a secretary who takes meeting notes. It uses AI to read conversations, extract important facts such as preferences and decisions made, and stores them in a vector database. When you ask again, it finds the fact closest to your question and returns it. Fast, easy to integrate, and well suited to commercial chatbots, but the reasoning behind a decision and the chain of logic that led there is usually gone because the AI already decided that part wasn't important.
Zep goes one step further with a time-aware knowledge graph. It doesn't just remember "you preferred X" but "in January you thought X, in March you switched to Y because of Z." Its strength is understanding change over time and it suits applications that need to track user progress, but Zep still uses AI to decide what enters the graph, so there's still a risk of losing important context, especially complex reasoning the AI judged as unnecessary.
MemPalace takes the opposite philosophy entirely: store everything, then make it findable. Instead of letting AI decide what's worth remembering, MemPalace stores the full verbatim text of every conversation in ChromaDB and organizes it in a hierarchy inspired by the ancient Greek memory palace technique, with levels named Wing, Hall, Room, Closet, and Drawer. Nothing is filtered out but everything has a clear address for retrieval, and the system runs entirely locally without sending data anywhere.
Karpathy's LLM Wiki solves a fundamentally different problem from the other three. Instead of remembering conversations, it compiles documents into structured knowledge. You don't feed it chat history but rather articles, transcripts, and research notes, and AI builds a linked, summarized, queryable Markdown wiki. Each new document isn't just stored but integrated into existing knowledge, creating new connections between concepts and enriching what is already known.

Comparison table to choose the right tool for the right need

Criteria	LLM Wiki	MemPalace	Mem0	Zep
Data source	Research documents, articles, transcripts	AI conversation history	Conversation history	Conversation history
Storage method	Structured Markdown, AI compiled	Full verbatim text, spatial hierarchy	Facts extracted by AI	Time-aware knowledge graph
Does AI filter information?	Yes, AI decides how to organize	No, everything is stored	Yes, AI selects important facts	Yes, AI selects entities and relations
Runs locally?	Yes, only Obsidian and a model needed	Yes, ChromaDB and SQLite on device	No, cloud service	No, cloud service
Best suited for	Research, learning, document synthesis	Long-term AI context memory	Chatbots, commercial applications	Apps tracking user progress over time
Weaknesses	Doesn't remember conversations, requires initial setup	Storage-heavy, no visual UI yet	Loses complex reasoning	Cloud dependency, still risks losing context

The most important thing to remember when choosing: LLM Wiki and MemPalace solve two different problems and can be used together rather than choosing one over the other. MemPalace remembers the history of your conversations with AI, meaning it knows what you said, what you decided, and how your thinking changed. LLM Wiki organizes knowledge from the outside world, the articles you read, the videos you watched, the documents you collected. Combining both lets AI understand both who you are and what field you're researching, and together they form a more complete second brain.

The most thought-provoking insight from LLM Wiki

Most of us use AI as a tool for generating temporary answers. Each session starts from scratch and nothing accumulates. Karpathy's LLM Wiki suggests a different direction: using AI as a knowledge compiler, where each new document isn't just stored but integrated into an existing structure, creating new connections and enriching what is already known.

If you're researching a specific domain, whether AI, technology, finance, or anything else, this is worth trying today. Create a folder, drop in five articles you've read recently, and let Claude Code begin building the first wiki. After one week of adding documents consistently, you'll see the difference between an archive and an actual knowledge base.


Claude Code, NotebookLM, and Obsidian for Smarter Research

Many people still do research manually: opening a dozen tabs, watching videos, reading articles, taking notes in scattered places, and then spending even more time trying to synthesize the result. A long-form post by monokern on X suggests a different pattern: use Claude Code to orchestrate the workflow, NotebookLM to analyze sources, and Obsidian to store long-term memory. Done correctly, this is not just a search session. It becomes an AI workflow that compounds over time.

The core idea is practical: Claude Code does not need to do everything inside an expensive context window. It can call tools, run skills, create files, and offload heavy source processing to NotebookLM. The output is then saved back into Obsidian as markdown, giving the next research session better context. According to the original post, the initial setup can be completed in under 30 minutes if the required tools are already available.

Why does this stack work?

The strength of the workflow is that each tool owns a clear layer. Claude Code acts as the execution engine: it receives plain-language instructions, calls skills, runs commands, manages files, and coordinates the pipeline. Instead of forcing the user to operate each step manually, Claude Code becomes the system operator.

NotebookLM is the analysis layer. Google's research tool can read sources, summarize them, generate analysis, flashcards, mindmaps, infographics, or audio overviews. When Claude Code sends source processing to NotebookLM, the user benefits from Google's processing layer rather than spending Claude tokens on every piece of long-form digestion.

Obsidian is the memory layer. Every analysis result is saved as markdown in a personal vault. Over time, that vault becomes a structured knowledge base of topics, sources, observations, patterns, and conclusions. Claude Code can read those files later to understand what the user cares about, what formats they prefer, and how they tend to evaluate a topic.

Skill Creator turns the workflow into a reusable tool

The first major step in the guide is installing Skill Creator inside Claude Code. This layer lets users describe a new capability in natural language, after which Claude Code creates the skill structure, installs it, and makes it available as a reusable command. In other words, instead of rebuilding the research prompt every time, the user packages the workflow as a dedicated skill.

The first example is a YouTube search skill. It uses yt-dlp to search videos by query and return metadata such as title, channel, views, duration, upload date, URL, and a views-to-subscribers ratio. For content or market research, this is more useful than a plain list of links because it shows which sources are actually attracting attention.

NotebookLM handles the heavy analysis

The post proposes connecting Claude Code to NotebookLM through notebooklm-py because NotebookLM does not currently provide an official public API. After installation and Google account authentication, Claude Code can use a custom skill to create a new notebook, add sources such as YouTube URLs, text, or files, and then ask NotebookLM to generate analysis or deliverables.

The key point is that NotebookLM is not only a summarizer. In a real research pipeline, it can receive 10 videos on a topic, analyze which frameworks are gaining traction, which ones are overhyped, where the community disagrees, and what content gaps remain uncovered. That processing takes time, but most of the work happens on the NotebookLM side.

The full pipeline: one command for a complete research task

Once the YouTube search skill and NotebookLM skill exist, the next step is to create a pipeline skill that combines both. The user gives a topic, such as researching AI agent frameworks in 2026, and the pipeline searches for relevant sources, creates a notebook, adds those sources, runs the analysis, and returns the result as markdown.
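
The views-to-subscribers ratio that the YouTube search skill reports is simple to compute once the metadata is available. The sketch below works on plain dictionaries and does not call yt-dlp itself. It assumes each entry carries view_count and channel_follower_count keys, which is how yt-dlp usually names these fields, though not every result includes a subscriber count. The function names are hypothetical and are not taken from monokern's skill.

```python
from typing import Optional


def views_to_subscribers(video: dict) -> Optional[float]:
    """Return views divided by channel subscribers, or None if either is missing."""
    views = video.get("view_count")
    subscribers = video.get("channel_follower_count")
    if not views or not subscribers:
        return None
    return views / subscribers


def rank_by_ratio(videos: list[dict]) -> list[dict]:
    """Sort videos by ratio, highest first, skipping entries without a ratio."""
    scored = [(views_to_subscribers(v), v) for v in videos]
    usable = [(ratio, v) for ratio, v in scored if ratio is not None]
    usable.sort(key=lambda pair: pair[0], reverse=True)
    return [dict(v, ratio=round(ratio, 2)) for ratio, v in usable]
```

A small channel whose video draws ten times its subscriber count ranks above a large channel with more total views, which is the signal the ratio is meant to surface.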

In monokern's example, the pipeline finds 10 video sources, sends them into NotebookLM, generates analysis, creates an infographic, and saves the result into Obsidian. The total processing time is described as around 6 minutes, most of which is NotebookLM processing. The practical value is that the user does not need to open every tab, copy every link, or manually combine the metadata.

The final output is more than a chat answer. It includes full analysis, source lists, engagement metrics, trend observations, a visual deliverable, and a markdown file saved into the vault. That is what separates this workflow from a normal chatbot interaction.

Obsidian makes the system smarter over time

Obsidian is the most interesting part. If the workflow runs only once, it already saves time. But if it runs regularly, every new markdown file makes the personal knowledge base richer. After a month, Claude Code can see recurring topics, the types of insights the user values, and the preferred format for results.

The post also highlights the role of the claude.md file inside the vault. This can become a configuration file describing working conventions, analysis style, and output preferences. After several research sessions, the user can ask Claude Code to read recent work and update that file so it better reflects the user's current process.

The real value is the structure, not YouTube

YouTube is only the data source in the example. The pipeline structure is the valuable part. Users can replace YouTube with academic PDFs, industry reports, public documentation, web pages, local files, transcripts, or Google Drive documents. As long as Claude Code can access the source and pass it into the analysis layer, the operational template stays the same.

This opens many practical uses: researching a crypto ecosystem through whitepapers and public documentation, analyzing an emerging technology through conference talks, mapping content gaps in a niche, or tracking market dynamics from public reports. In every case, the same three layers remain: collect sources, analyze them, and store knowledge.

What should you watch out for?

This workflow is powerful, but it is not for everyone. It assumes the user is comfortable with Claude Code, has an Obsidian vault, can install CLI tools such as yt-dlp, and is willing to use an unofficial library to connect to NotebookLM. Also, because NotebookLM and YouTube can change access patterns, these skills should be treated as maintained tools rather than install-and-forget automation.

Still, the underlying idea is important: instead of using AI as a disconnected chat box, turn it into a research system with memory, a pipeline, and the ability to learn from your own work history. For people who regularly analyze markets, technology, or content, this is far more practical than opening 10 tabs and manually stitching everything together.

Nam

2 Jun, 2026

Gemini powers Argentina and Messi at World Cup 2026

Gemini has won big in the most literal sense, right as Messi scored his first hat-trick at the 2026 World Cup, leading Argentina to a crushing 3-0 victory over Algeria and equaling Miroslav Klose's record of 16 World Cup goals. That historic moment became the perfect launchpad for Gemini. Back in March 2026, Google and the Argentine Football Association (AFA) made a bold decision: rather than simply printing a logo on training kits, they signed a deal for the AI to actively support tactical preparation and professional decision-making. That bet has now proven to be the right call.

From training kit to the tactical meeting room

The agreement between AFA and Google was unveiled at Times Square, New York, a venue deliberately chosen to capture global media attention. The Gemini logo appears across all training apparel for Argentina's men's, women's and youth squads, sitting alongside Adidas and American Express in AFA's top sponsorship tier.

But the interesting part isn't the jersey. According to Inside World Football, Argentina's coaching staff will use Gemini for three specific purposes: tactical analysis, injury prevention and decision support. In other words, Gemini now has a seat in meetings that previously belonged only to Scaloni and his assistants. Google has not publicly disclosed which specific Gemini tools have been integrated into AFA's workflow. What is clear is that they are using the World Cup to bring Gemini into the reality of professional football, and the results will be graded in public.

What is Gemini actually doing in the dressing room?

Argentina arrives at the 2026 World Cup as the reigning champion. Every decision Scaloni makes, from the squad list to the starting eleven, is scrutinized more closely than those of any other team, and that is precisely why Argentina has become the best testing ground Google has ever had for Gemini in professional football, especially at a major tournament.

Tactical analysis

Gemini is used to process match data for both Argentina and their opponents, covering movement statistics, attacking patterns and defensive vulnerabilities. Instead of the coaching staff spending hours reviewing footage, AI synthesizes the data and generates tactical diagrams automatically, saving significant preparation time before each match.

Injury prevention

This is a problem every major team wants to solve, especially when Messi and several key players are at an age that requires careful management of training loads. Gemini analyzes biometric data and injury history to issue early warnings, helping the coaching staff adjust intensity before problems actually occur. That is part of the reason why, immediately after Messi completed his hat-trick, Scaloni chose to substitute him, prioritizing fitness and safety for the matches ahead.

AI in injury prevention is nothing new. Premier League clubs have had Microsoft as a partner for similar purposes. What is different this time is that Gemini is integrated directly into the workflow of a national team competing at a major tournament, not just at club level.

For fans: create Messi content, follow scores without unlocking your screen

Alongside supporting the coaching staff, Gemini has also rolled out a range of features aimed at fans, and this is the side that hundreds of millions of people will actually experience.

Gemini lets you create content about players directly

Users can generate images, songs and digital content featuring Argentina players like Messi directly inside the Gemini app. The feature is designed to bring the World Cup experience closer to those who cannot attend matches in person.

Real-time scores and automated daily briefings

On Google Search, live match scores can be pinned to the lock screen and update in real time, with dedicated animations for goals and red cards, all without needing to unlock the phone.

For paid Gemini users, the Scheduled Actions feature allows an automated daily football briefing to be set up, covering scores, news and fixtures, delivered at a chosen time without needing to prompt it each day.

Match-day infrastructure

Google has updated Street View at all 16 host stadiums and optimized routing on Waze for match days. Waze also surfaces live scores when the car is stopped at red lights, so drivers do not need to pick up their phones while on the move.

The 2026 World Cup is the real test for AI in sport

Google is not sponsoring Argentina alone. Gemini also appears on the kits of France, Morocco, Iraq, Turkey and the United States, while Pixel is the official phone of the French squad, which is also using Gemini for internal communications. This is clearly a comprehensive strategy from Google, not a one-off deal.

What makes the 2026 World Cup particularly significant is that it will answer a question no lab environment can: what do users actually do with AI when a World Cup runs for six weeks across 104 matches? Features that run on initial novelty will fade after the group stage. Whatever users keep coming back to all the way through the final is the honest answer to where AI actually fits in everyday life, and Google knows it.

Google's communications director for Latin America, Flor Sabatini, stated that the 2026 World Cup will mark a before and after in the history of football because of AI. It sounds like marketing, but the reality is that this is the first time a major AI model has been integrated into the preparation of the reigning world champions, right in the middle of the most-watched sporting event on the planet.

The 2026 World Cup is Gemini's real test

The most significant part of this entire story is not the Gemini logo on Messi's jersey. It is the fact that Argentina, still the favorite and the most scrutinized team, carrying the pressure of defending the title, has committed part of its preparation process to AI.
If Argentina succeeds, Gemini will have a case study that no advertising budget can buy. If Argentina falls short and the coaching staff attributes any part of it to AI, the narrative will flip entirely. Either way, this is the first time AI has been held accountable on a stage that genuinely matters, not a benchmark, not a demo, but the World Cup. For AI users, what is worth watching is not just whether Argentina wins, but whether Gemini actually changes how a football team operates, or whether it turns out to be nothing more than a logo on a training kit that looks better than previous years.

Nam

17 Jun, 2026

AI Technology at World Cup 2026: A Complete Overview

The Adidas Trionda match ball, three-dimensional player models accurate to the millimeter, robot dogs patrolling stadiums, and Google Gemini sitting on the touchline with the Argentina national team. World Cup 2026 is not only the largest tournament in history with 104 matches across 16 cities in the United States, Canada, and Mexico, but also the most extensive deployment of AI ever seen in sports.

How the Adidas Trionda smart ball works

The official match ball, the Adidas Trionda, is equipped with an Inertial Measurement Unit (IMU) sensor operating at 500Hz, which means it takes 500 readings every second on movement, spin, and the exact moment the ball makes contact with a player's foot. This is particularly important for offside situations, as the sensor will determine the precise moment the ball leaves the passer's foot down to the millisecond. The timestamp from the sensor is synchronized immediately with the player tracking system, helping to lock the position of every player on the pitch at that exact moment instead of relying on the naked eye, which can be off by up to half a second. As a result, offside decisions are made faster and more accurately than ever before.

This technology rescued the Swedish team by identifying the precise moment of contact from striker Alexander Isak. Before that, the joy of scorer Svanberg was temporarily dampened when the VAR team stepped in to review. In a play that unfolded at breakneck speed, he appeared to be standing behind the Tunisian defense when the ball was delivered into the penalty area, leading many to believe the goal would be disallowed. However, the data from the motion sensor mounted inside the Adidas Trionda ball proved that Svanberg moved back to a valid position in time, bringing a legitimate goal for Sweden to the delight of the fans.
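
The arithmetic behind these figures is worth spelling out. At 500Hz, consecutive sensor readings are 2 milliseconds apart, so the raw timing resolution is on that order unless the system interpolates between samples. The sketch below compares how far a player moves during a 2 ms uncertainty with how far they move during the half-second error attributed to the naked eye. The sprint speed of 8 meters per second is an illustrative assumption, not a figure from FIFA or Adidas.

```python
SAMPLE_RATE_HZ = 500        # from the article
SPRINT_SPEED_M_PER_S = 8.0  # assumption: a fast sprint, for illustration only


def sample_interval_ms(rate_hz: float) -> float:
    """Time between two consecutive sensor readings, in milliseconds."""
    return 1000.0 / rate_hz


def drift_cm(speed_m_per_s: float, timing_error_s: float) -> float:
    """Distance a player covers during a timing error, in centimeters."""
    return speed_m_per_s * timing_error_s * 100.0


interval_s = sample_interval_ms(SAMPLE_RATE_HZ) / 1000.0
print(sample_interval_ms(SAMPLE_RATE_HZ))                    # 2.0
print(round(drift_cm(SPRINT_SPEED_M_PER_S, interval_s), 1))  # 1.6
print(drift_cm(SPRINT_SPEED_M_PER_S, 0.5))                   # 400.0
```

Under these assumptions, the positional uncertainty drops from about 4 meters to under 2 centimeters, which is why the contact timestamp matters more for offside calls than the camera resolution does.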

Semi-automated offside technology with 3D player avatars

Semi-automated offside technology (SAOT) has been upgraded significantly for World Cup 2026, highlighted by the 3D avatar of each player. Every player participating in the tournament is digitally scanned across the entire body in about one second, creating a 3D model with detailed body dimensions for every part. When a situation requires VAR review, the system overlays these 3D models onto real-time tracking data from more than 12 specialized cameras at each stadium.

This approach resolves the long-standing issue of two-dimensional offside lines, where a player's arm, shoulder, or foot might be obscured from a certain camera angle. The 3D model fills those gaps using realistic anatomical data, and the result is displayed as a complete 3D animation on the pitch and on television, entirely replacing the flat red and green lines that once confused spectators.

Football AI Pro: analytics platform for all 48 teams

FIFA collaborated with Lenovo to build Football AI Pro, an analytics platform developed on the FIFA Football Language foundation model, which has been trained on hundreds of millions of football data points over decades of competition. This is the first time in World Cup history that all 48 participating teams have access to the same analytics platform, rather than wealthier federations holding an advantage due to better data tools.

This platform outputs results in multiple formats, including text summaries, video clips, interactive charts, and 3D tactical visualizations. Teams can use it before and after matches to analyze opponent tactics, detect set piece patterns, track player workload intensity, and analyze head-to-head history. However, FIFA bans its use while play is under way: on match day, coaching staff can only access it during halftime and after the match.

Referee chest cameras with AI image stabilization

For the first time in history, referees in all 104 World Cup matches wear chest cameras.

The raw footage from the camera is shaky when the referee runs at high speed and cannot be used for broadcasting, but FIFA runs an AI image stabilization model in real time on every frame, creating broadcast-quality video. The result is the Referee View perspective, which offers a first-person view from the pitch and has quickly become one of the most popular broadcasting innovations. This viewpoint not only serves entertainment but also provides analysts with a new data source: the exact view that the referee had when making decisions.

Google Gemini on the touchline and fan experience

In March 2026, the Argentine Football Association announced Google as an official global sponsor, with the Gemini logo appearing on training jerseys for the men's, women's, and youth teams. However, this partnership goes far beyond brand advertising, because the Argentina technical staff uses Gemini directly for tactical analysis from match videos, tracking player workload and injury recovery, querying historical data on specific matchup scenarios, and creating individual opponent briefings for each player. Notably, Argentina players and coaches use Gemini through the standard application rather than any customized interface, reflecting the maturity of general-purpose AI tools in professional sports applications.

Additionally, Google deployed a series of features for fans, including live scores pinned to the Android lock screen, AI match summaries on the Gemini app, on-demand tactical diagrams, jersey templates on Google Photos, stadium navigation via Google Maps, and match statistics on Google Search.

Robot dogs, facial recognition, and AI security

At the host venues, FIFA deployed Boston Dynamics Spot robot dogs for outer perimeter security patrols and facility inspections.
These robots perform automated patrols in restricted areas, with onboard cameras connected to the stadium security AI system, which is particularly effective in spaces that are difficult to monitor continuously, such as tunnels, underground technical corridors, and stadium perimeters at night. The biometric layer is equally notable, as some stadiums use facial recognition for entry, where your face is your ticket, processed against the database in less than one second. However, the widespread presence of AI surveillance also raises questions about privacy in large scale sporting events. AI predictions for the champion: every model has a different answer Before the tournament kicked off, many AI systems simulated all 104 matches to predict the champion, and the results were completely inconsistent. ChatGPT predicted Spain, the FanDuel research model chose France to defeat Argentina 3 to 2 in the final, while Yahoo Sports and DataCamp both bet on Brazil. This disagreement is worth reflecting on, as every model was provided with the same public data sources including FIFA rankings, ELO scores, qualifying form, and injury reports, but different weighting methods created entirely different results. And of course, no model can calculate Messi left foot shot in the 89th minute of a knockout match. That is still football. AI is no longer an experiment but infrastructure What makes World Cup 2026 different from previous tournaments does not lie in any single technology, but in the fact that AI has transitioned from the experimental phase to operational infrastructure. The smart ball, the 3D offside system, the referee cameras, and the analytics platform are not pilot projects. They are the basic operational foundation for every match. The 500Hz sensor inside the ball does not understand football, as it only measures spin. 
However, the decision it enables, accurate to the millimeter, displayed in 3D, and returning results in seconds, with the Swedish team situation being a prime example, will change how football is operated. That is the true shape of AI when running at a large scale.

Nam•

16 Jun, 2026
