I think the future of generative art/video will look something like this: imagine the UI is a white canvas (you choose its aspect ratio and size). You can then create "boxes", resize them as you wish, and type a prompt into each one. The AI focuses on those boxes first, then fills in the rest of the canvas and tries to make everything match.
So imagine you create a small box in the top left corner and type in "Radiant sun". Then you create a large rectangular box and type in "snowy mountains". Then a large box in the middle of the canvas where you type in "a bear and a flamingo drinking from a fountain ⛲".
Well you get the idea.
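Here's a rough sketch of how that box layout could be represented in code, in case it helps. Everything in it is made up for illustration: `PromptBox`, `Canvas`, and the choice to store box positions as fractions of the canvas are my own assumptions, not any existing tool's API. It only describes the layout; the actual generation step (the model focusing on each box and filling in the rest) is left out.

```python
from dataclasses import dataclass, field


@dataclass
class PromptBox:
    """A hypothetical prompt region; x, y, w, h are fractions of the canvas (0.0 to 1.0)."""
    x: float
    y: float
    w: float
    h: float
    prompt: str

    def to_pixels(self, canvas_w: int, canvas_h: int) -> tuple[int, int, int, int]:
        """Return (left, top, right, bottom) in pixels for a given canvas size."""
        return (
            round(self.x * canvas_w),
            round(self.y * canvas_h),
            round((self.x + self.w) * canvas_w),
            round((self.y + self.h) * canvas_h),
        )


@dataclass
class Canvas:
    """A hypothetical canvas holding the user's prompt boxes."""
    width: int
    height: int
    boxes: list[PromptBox] = field(default_factory=list)

    def add_box(self, box: PromptBox) -> None:
        # Reject boxes that stick out past the canvas edges.
        inside = (
            box.x >= 0 and box.y >= 0
            and box.x + box.w <= 1 and box.y + box.h <= 1
        )
        if not inside:
            raise ValueError(f"box for {box.prompt!r} does not fit on the canvas")
        self.boxes.append(box)

    def layout(self) -> list[dict]:
        """Describe each box as a prompt plus a pixel rectangle."""
        return [
            {"prompt": b.prompt, "rect": b.to_pixels(self.width, self.height)}
            for b in self.boxes
        ]


# The example from the post: sun top left, mountains as a wide band, animals in the middle.
canvas = Canvas(width=1024, height=768)
canvas.add_box(PromptBox(0.0, 0.0, 0.15, 0.2, "Radiant sun"))
canvas.add_box(PromptBox(0.0, 0.25, 1.0, 0.35, "snowy mountains"))
canvas.add_box(PromptBox(0.25, 0.5, 0.5, 0.5, "a bear and a flamingo drinking from a fountain"))

for entry in canvas.layout():
    print(entry)
```

Storing fractions instead of pixels means the same layout would still work if you change the canvas size or ratio afterwards. Anything not covered by a box is what the AI would have to fill in on its own.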
Idk if that already exists, but if it does, it's not in ChatGPT yet.