The Educational Revolution of the Personal AI Instructor

October 19, 2024

Because disagreement is useful, read the best AI-generated counterargument to this article.

One of my strongest convictions is that artificial intelligence is going to profoundly transform educational paradigms and systems. It was while reading one of the most fascinating articles on the subject, “Why we stopped making Einsteins”, that I understood the immense potential of AI to turn our children into geniuses of the kind history produced from the time of Aristotle up to the digitalization of the world and the global lobotomization of minds.

We can of course debate how much technology has enriched our economies, but this transformation came at a huge societal cost for our primate brains. Overstimulated, unable to focus, unable to think for more than a few minutes, haunted by the fear of missing something important in the digital world (the famous FOMO), and attacked from every side by algorithmic distractions trying to siphon off every last second of our attention, we are regressing inexorably.

When adults cannot manage it, how could we expect children, whose cognitive development is still ongoing, to manage it? To address this problem, AI opens up genuinely exciting prospects that can point algorithmic power in the right direction: the cognitive development of the youngest, but also intellectual stimulation, the preservation of curiosity, and interest in less seductive subjects such as art, history, and philosophy.

While Large Language Models (LLMs) have been a revolution in the field of content production, the arrival of natural, fluid, interactive text-to-speech opens the door to another revolution: personal instructors that are available, affordable, and relevant for every child and every adult curious enough to learn.

The Revolution of Audio Assistants

One of the aha moments that stunned me was OpenAI’s first demonstration of its Advanced Audio Mode in September 2023.

OpenAI Audio Advanced Mode

The fluidity and naturalness of the voice, and the extremely low latency, were disconcerting. We finally had within reach a tool that could be programmed to speak/discuss/shout/cry/exchange/laugh/take offense/marvel, making the voice modality as simple and pleasant to use as chatting with a chatbot. If reading is a channel with a low information bandwidth, estimated at around 12 bits per second, voice can roughly triple that value and reach about 39 bits per second depending on the spoken language. It is quite simply a more efficient and less constraining way to exchange information, and above all it frees the user’s eyes, which is fundamental when on the move and lets images naturally complement the conversation (this is why subtitles are unpleasant and limiting).
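As a quick sanity check on the comparison above, here is a minimal Python sketch that works out the ratio between the two quoted rates. The two numbers are simply the estimates cited in this article; the function names and the 10,000-unit example message are illustrative assumptions, not measurements.

```python
# Rates quoted in the article (its own estimates, same unit per second).
READING_RATE = 12.0
VOICE_RATE = 39.0


def rate_ratio(fast: float, slow: float) -> float:
    """Return how many times larger the fast rate is than the slow one."""
    if slow <= 0:
        raise ValueError("slow rate must be positive")
    return fast / slow


def seconds_needed(amount: float, rate: float) -> float:
    """Return the time needed to convey `amount` units at `rate` units per second."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return amount / rate


if __name__ == "__main__":
    print(f"voice / reading ratio: {rate_ratio(VOICE_RATE, READING_RATE):.2f}")  # 3.25
    message = 10_000.0  # hypothetical message size, for illustration only
    print(f"reading: {seconds_needed(message, READING_RATE):.0f} s")
    print(f"voice:   {seconds_needed(message, VOICE_RATE):.0f} s")
```

With these figures the ratio comes out at 3.25, which fits "triple" as a rough order of magnitude.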

A few weeks ago, programmatic access to OpenAI’s advanced audio mode was opened up to everyone, but at a price comparable to that of a call-center operator (roughly 1000 MAD per million tokens, or about 100 to 150 MAD per hour of conversation). This is just enough to start shaking human hegemony in interactions with customers and users, but not cheap enough to diversify use cases, especially educational ones in countries like ours.
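To make the pricing above concrete, the sketch below back-calculates how many tokens an hour of conversation would have to consume for the two quoted figures to agree. It assumes a single flat rate of 1000 MAD per million tokens, as stated here; actual billing may distinguish between token types, and the function names are illustrative.

```python
# Figures quoted in the article; a single flat rate is an assumption.
PRICE_PER_MILLION_TOKENS_MAD = 1000.0
HOURLY_COST_RANGE_MAD = (100.0, 150.0)


def tokens_per_hour(hourly_cost_mad: float, price_per_million_mad: float) -> float:
    """Return the token volume that a given hourly cost buys at a flat rate."""
    if price_per_million_mad <= 0:
        raise ValueError("price must be positive")
    return hourly_cost_mad * 1_000_000 / price_per_million_mad


def hourly_cost(tokens: float, price_per_million_mad: float) -> float:
    """Return the cost in MAD of consuming `tokens` tokens at a flat rate."""
    return tokens * price_per_million_mad / 1_000_000


if __name__ == "__main__":
    for cost in HOURLY_COST_RANGE_MAD:
        volume = tokens_per_hour(cost, PRICE_PER_MILLION_TOKENS_MAD)
        print(f"{cost:.0f} MAD/hour -> {volume:,.0f} tokens/hour "
              f"({volume / 3600:.1f} tokens/second)")
```

Under that flat-rate assumption, 100 to 150 MAD per hour corresponds to roughly 100,000 to 150,000 tokens per hour, or about 28 to 42 tokens per second of conversation.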

In parallel, Google has been working since 2023 on its NotebookLM concept, a research assistant capable of ingesting very large quantities of data and answering complex questions in a concise, synthesized way, thanks to a distinctive feature of Google’s LLMs: a context size 3 to 4 times larger than that of competitors. But the revolution is audio, and for the past few months NotebookLM has also been able to summarize documents in the form of an incredibly natural podcast between two professionals. The service is free, and the podcast quality is breathtaking if you enjoy American-style shows with an offbeat, humorous, and very dynamic tone.

Google NotebookLM

To listen to an example, here is the one I generated from the article Why we stopped making Einsteins.

And today, open source is slowly catching up with the state of the art in voice assistants, for example with F5-TTS, an impressive initiative capable of cloning voices and generating emotions (it currently works only in English and Chinese). In short, things are moving, and audio will soon have its ChatGPT moment.

So, Why Did We Stop Making Einsteins?

One of the reasons cited in the article is the disappearance of high-quality private tutors who could dose out knowledge with surgical precision, adjust the pace of learning perfectly, and offer personalized attention to each child. Today, we are almost there, because the state of the art in AI has solved the voice interface (after solving the text interface with ChatGPT-like systems). Perhaps all that is missing now is a human face to anthropomorphize the machine and cross the threshold to a robot instructor.

China has invested massively in personal AI tutors because it has understood the educational, societal, and civilizational importance of helping its population leapfrog in terms of intelligence, according to this World Economic Forum article on AI tutors, which also laments the lack of competent teachers, mainly in Sub-Saharan Africa and Southeast Asia.

Global Teacher Shortage

For my part, I will for now take advantage of this revolution and of the maturation of Google NotebookLM (only in English at the moment, but other languages will probably follow) to build my own instructor on all sorts of topics of interest (blockchain, AI, geopolitics, society, etc.). And in the pipeline, there is probably a personal instructor for my children :) so stay tuned.

I am following with particular interest the start-up Synthesis, which offers a personal instructor for children at a very affordable cost.

Synthesis

At the following Spotify link, you will find the topics in podcast format (made with NotebookLM, though I will explore other technologies):

SPOTIFY LINK - Red Frog Podcast

The podcast is called Red Frog Podcast. It is an AI-generated popular science show that discusses current affairs and science in an accessible, offbeat tone. It is not monetized and will be fed with interesting topics (and carefully selected links) that I do not have time to digest by reading.

On the menu at the time of writing this article:

Each of the 7 deadly sins of Bitcoin taken from my eponymous book
A deep dive into declassified CIA sources around the Moroccan Sahara, cited in an article by Karim Serraj on le360
A compilation of quotes from the article Why we stopped making Einsteins, to go beyond the initial article

The article stops here and continues in the “Why we stopped making Einsteins” episode of the podcast. Enjoy listening!