/u/AvvYaa

How Multimodal LLMs (like Google’s Gemini) learn to generate images!

Hello all! Sharing my new YT video about Multimodal LLMs and how they generate images. I go over concepts like VQ-VAE and image tokens, and how these neural networks convert the image generation problem into a language generation problem. Link ab…

Latent Space: Visualizing and Manipulating what Neural Nets learn

Sharing a YT video from my channel discussing the concept of latent space in image-based neural nets like Variational Autoencoders, and how it can be manipulated for interesting effects. Leaving a link here for those interested! https://youtu.be/FslFZ…

Latent Space: Visualizing, interpreting, and manipulating neural networks

Sharing a video from my channel about manipulating generative models (like VAEs) in the latent space… the model was trained to generate celebrity faces, and exploring the latent space allows us to do all sorts of crazy stuff – like finding similar…