We have seen rapid advancement in generative AI - especially images, text, and video. These have excited a lot of people, but it is difficult not to see them as novelties compared to the true underlying implications of these technologies. In order to create a representation of the world, one must understand the world. Generative AI is, in a way, a visual verification of the accuracy of an AI's understanding.
I am not suggesting that we are close to general AI, but I anticipate that we may soon start seeing consumer robotics that can successfully function within society. Image generation is, effectively, computer vision - it has to know what things look like. Now with Sora, we see a large leap in video generation, which is like computer vision with the ability to make predictions about potential future visual input. Video generation also requires an understanding of physics, human behavior, and other physical processes in order to produce convincing motion.
I would not be surprised if within five years (maybe even three years) some home consumer robot is on the market with the ability to carry on conversations, identify objects in the environment, and perform basic tasks. Maybe this will just be an expensive toy, though I think it is possible to achieve these capabilities for as little as $300 if many tasks are offloaded to a server.
I think people are being distracted by the rapid advancement in pretty pictures and are not seeing the larger implications. I would not be surprised if by the mid-2030s robots and AI devices are integrated parts of everyday society. Eventually there may wind up being more robots than humans walking down the sidewalk - making deliveries and carrying out other tasks.