LeCun, chief scientist at Meta’s AI lab and a professor at New York University, is one of the world’s most influential AI researchers. He had been trying to give machines a basic grasp of how the world works by training neural networks to predict what would happen next in videos of everyday events. But predicting the next video frame pixel by pixel proved far too complex. He hit a wall.
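To make concrete why that approach struggles, here is a minimal sketch, purely illustrative and not LeCun’s actual model, of pixel-level next-frame prediction in PyTorch: the network is penalized for every pixel it gets wrong, so it is forced to model visually rich but task-irrelevant detail.

```python
# Illustrative sketch (not LeCun's model): predict the next frame pixel by pixel.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NextFramePredictor(nn.Module):
    def __init__(self):
        super().__init__()
        # Simple convolutional encoder-decoder over RGB frames.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )

    def forward(self, frame):
        return self.decoder(self.encoder(frame))

model = NextFramePredictor()
frame_t = torch.rand(8, 3, 64, 64)    # batch of current frames
frame_t1 = torch.rand(8, 3, 64, 64)   # the frames that actually follow
# The loss penalizes every pixel equally, including irrelevant texture and noise.
loss = F.mse_loss(model(frame_t), frame_t1)
```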
After months of working out what was missing, he arrived at a bold new vision for the next generation of AI. In a draft document shared with MIT Technology Review, LeCun describes an approach he hopes will give machines the intelligence they need to navigate the world.
For LeCun, the proposal could be a first step toward building what many call artificial general intelligence, or AGI: machines with the ability to reason and plan like humans. It also steers clear of today’s hottest trends in machine learning, reviving some old ideas that have gone out of fashion. But his vision is far from comprehensive; indeed, it may raise more questions than it answers. The biggest question mark, as LeCun himself points out, is that he does not know how to build what he describes.
Central to the new approach is a neural network that can learn to view the world at different levels of detail. Freed from the need for pixel-perfect predictions, such a network focuses only on the features of a scene that are relevant to the task at hand. LeCun proposes pairing this core network with another, called the configurator, which decides what level of detail is needed and tunes the overall system accordingly.
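The contrast with pixel prediction can be sketched as follows. This is a minimal, hedged illustration of predicting in an abstract representation space rather than pixel space; the encoder, predictor, dimensions, and loss are assumptions made for the example, not LeCun’s published architecture, and the configurator is not modeled here.

```python
# Illustrative sketch: predict abstract features of the next frame, not its pixels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        # Maps a frame to a compact feature vector (the "abstract" description).
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )

    def forward(self, x):
        return self.net(x)

encoder = Encoder()
predictor = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 128))

frame_t = torch.rand(8, 3, 64, 64)
frame_t1 = torch.rand(8, 3, 64, 64)

z_t = encoder(frame_t)            # abstract state of the current frame
with torch.no_grad():
    z_t1 = encoder(frame_t1)      # target representation of the next frame
# Only the compact features must be predicted correctly, not every pixel.
loss = F.mse_loss(predictor(z_t), z_t1)
```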
For LeCun, AGI will be part of how we interact with the technology of the future. His vision is colored by that of his employer, Meta, which is pushing a virtual-reality metaverse. In 10 or 15 years, he says, people will not be carrying smartphones in their pockets, but augmented-reality glasses with built-in virtual assistants. “For those to be most useful to us, they will basically need human-level intelligence,” he says.
“Yann has been talking about many of these ideas for some time,” says Yoshua Bengio, an AI researcher at the University of Montreal and scientific director of the Mila-Quebec AI Institute. “But it’s nice to see it all in one big picture.” Bengio thinks LeCun is asking the right questions. He also admires LeCun’s willingness to put out a document with so few answers; it is a research proposal rather than a polished set of results, he says.
“People talk about these things in private, but they don’t usually share them publicly,” Bengio says. “It’s risky.”
In 2018, LeCun shared the Turing Award, computing’s top prize, with Bengio and Geoffrey Hinton for pioneering work on deep learning. “Getting machines to behave like humans and animals is my life’s work,” he says.
LeCun thinks that animal brains run a kind of simulation of the world, which he calls a world model. Learned in infancy, this model is what lets animals (including humans) make good guesses about what is happening around them. Babies pick up the basics in the first few months of life just by observing the world, LeCun says. Seeing a ball thrown and fall a handful of times is enough to give a child a feel for how gravity works.
Common sense is the name for this kind of intuitive reasoning. It includes a grasp of simple physics: knowing, for example, that the world is three-dimensional and that objects do not actually disappear when they move out of view. It lets us predict where a bouncing ball or a speeding bicycle will be a few seconds from now. And it helps us connect the dots between incomplete pieces of information: if we hear a metallic clatter from the kitchen, we can guess that someone has dropped a pan, because we know which objects make that kind of sound and under what circumstances.
In short, it tells us which events are possible and which are not, and which events are more likely than others. This allows us to foresee the consequences of our actions and plan ahead, ignoring irrelevant details.
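The same ability is what model-based planning would give a machine: roll candidate actions forward through a learned world model and keep the sequence whose predicted outcome looks best. The sketch below is a generic illustration of that loop, with assumed state and action dimensions and a placeholder objective; it is not a description of LeCun’s system.

```python
# Illustrative sketch: plan by simulating action sequences in a learned world model.
import torch
import torch.nn as nn

state_dim, action_dim, horizon, n_candidates = 16, 4, 5, 64

# Learned dynamics model: predicts the next abstract state from (state, action).
world_model = nn.Sequential(
    nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
    nn.Linear(64, state_dim),
)

def score(state):
    # Placeholder objective, e.g. negative distance to a goal state.
    return -state.pow(2).sum(dim=-1)

def plan(state):
    # Sample random action sequences and simulate their consequences.
    actions = torch.randn(n_candidates, horizon, action_dim)
    states = state.expand(n_candidates, state_dim)
    for t in range(horizon):
        states = world_model(torch.cat([states, actions[:, t]], dim=-1))
    best = score(states).argmax()
    # Execute only the first action of the best-scoring plan, then replan.
    return actions[best, 0]

first_action = plan(torch.zeros(1, state_dim))
```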
But teaching machines common sense is hard. Today’s neural networks need to be shown thousands of examples before they begin to spot such patterns.