Benchmarks: https://imgur.com/DWNQcaY (Table 2 on Page 7) - Gemini Pro (the launched model) is worse than GPT-4, but a bit better than GPT-3.5. All the examples are for Ultra (the actual state-of-the-art model, which outperforms GPT-4), which won't be available until 2024.
Promo video: https://www.youtube.com/watch?v=UIZAiXYceBI (& see other videos on that channel for more)
Technical paper: https://goo.gle/GeminiPaper
Some details (source):
32k context length
efficient attention mechanisms (e.g. multi-query attention (Shazeer, 2019))
audio input via Universal Speech Model (USM) (Zhang et al., 2023) features
no audio output? (Figure 2)
visual encoding of Gemini models is inspired by Google's own foundational work on Flamingo (Alayrac et al., 2022), CoCa (Yu et al., 2022a), and PaLI (Chen et al., 2022)
output images using discrete image tokens (Ramesh et al., 2021; Yu et al., 2022b)
supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF)
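
For anyone wondering what the multi-query attention item above means in practice: the idea from Shazeer (2019) is that every query head shares a single key/value projection, which shrinks the key/value cache during decoding. Below is a rough NumPy sketch of that general technique. The paper doesn't give implementation details, so this is not Gemini's actual code; the function name, weight names, and shapes are all my own assumptions for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_query_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Causal multi-query self-attention for a single sequence.

    Hypothetical shapes (illustrative, not taken from the Gemini paper):
      x:   (seq_len, d_model)
      w_q: (d_model, num_heads * d_head)  one query projection per head
      w_k: (d_model, d_head)              a single key projection shared by all heads
      w_v: (d_model, d_head)              a single value projection shared by all heads
      w_o: (num_heads * d_head, d_model)
    """
    seq_len = x.shape[0]
    d_head = w_k.shape[1]

    # Per-head queries: (num_heads, seq_len, d_head).
    q = (x @ w_q).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    # Shared keys and values: (seq_len, d_head), with no head dimension.
    k = x @ w_k
    v = x @ w_v

    # Every head attends over the same keys: (num_heads, seq_len, seq_len).
    scores = q @ k.T / np.sqrt(d_head)

    # Causal mask: position i may not attend to positions j > i.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    weights = softmax(scores, axis=-1)
    out = weights @ v  # (num_heads, seq_len, d_head)

    # Concatenate the heads and project back to d_model.
    out = out.transpose(1, 0, 2).reshape(seq_len, num_heads * d_head)
    return out @ w_o
```

The only difference from standard multi-head attention is that `k` and `v` have no head dimension, so the cache stored per token is `num_heads` times smaller. That matters more as the context gets longer.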