This week in AI, in partnership with aibrews.com. Feel free to follow their newsletter.
News & Insights
- Stability AI has announced SDXL 0.9, a significant upgrade to their text-to-image model suite that can generate hyper-realistic images. SDXL 0.9 has one of the largest parameter counts in open-source image models (3.5B) and is available on the Clipdrop by Stability AI platform [Details].
- Google presents AudioPaLM, a Large Language Model that can speak and listen. AudioPaLM fuses text-based PaLM-2 and speech-based AudioLM models into a unified multimodal architecture that can process and generate text and speech [Examples | Paper].
- Google researchers present DreamHuman, a method to generate realistic animatable 3D human avatar models solely from textual descriptions [Details].
- Meta introduced Voicebox - the first generative AI model for speech that can accomplish tasks it wasn't specifically trained for. Like generative systems for images and text, Voicebox creates outputs in a vast variety of styles, and it can create outputs from scratch as well as modify a sample it’s given. But instead of creating a picture or a passage of text, Voicebox produces high-quality audio clips [Details | Samples | Paper].
- Microsoft launched Azure OpenAI Service on your data in public preview, which enables companies to run supported chat models (ChatGPT and GPT-4) on their connected data without needing to train or fine-tune models [Details].
- Google DeepMind introduced RoboCat, a new AI model designed to operate multiple robots. It learns to solve new tasks on different robotic arms, like building structures, inserting gears, picking up objects etc., with as few as 100 demonstrations. It can improve skills from self-generated training data [Details].
- Wimbledon will use IBM Watsonx to produce AI-generated spoken commentary for video highlights packages for this year's Championships. Another new feature for 2023 is the AI Draw Analysis, which utilises the IBM Power Index and Likelihood to Win predictions to assess each player’s potential path to the final [Details].
- Dropbox announced Dropbox Dash and Dropbox AI. Dropbox Dash is an AI-powered universal search that connects all of your tools, content and apps in a single search bar. Dropbox AI can generate summaries and provide answers from documents as well as from videos [Details].
- Wayve presents GAIA-1 - a new generative AI model that creates realistic driving videos using video, text and action inputs, offering fine control over vehicle behavior and scene features [Details].
- Opera launched a new 'One' browser with an integrated AI chatbot, ‘Aria’. Aria provides deeper content exploration by being accessible through text highlights or right-clicks, in addition to being available from the sidebar [Details].
- ElevenLabs announced ‘Projects’, available for early access, for long-form speech synthesis. This will enable anyone to create an entire audiobook without leaving the platform. ElevenLabs has reached over 1 million registered users [Details].
- Vimeo is introducing new AI-powered video tools: a text-based video editor for removing filler words and pauses, a script generator, and an on-screen teleprompter for script display [Details].
- Midjourney launched V5.2, which includes zoom-out outpainting, improved aesthetics, coherence, text understanding, sharper images, higher variation modes and a new /shorten command for analyzing your prompt tokens [Details].
- Parallel Domain launched a new API, called Data Lab, that lets users use generative AI to build synthetic datasets [Details].
- OpenAI is considering creating an app store in which customers could sell AI models they customize for their own needs to other businesses [Details].
- OpenLM Research released its 1T token version of OpenLLaMA 13B - the permissively licensed open source reproduction of Meta AI's LLaMA large language model [Details].
- ByteDance, the TikTok creator, has already ordered around $1 billion worth of Nvidia GPUs in 2023 so far, which amounts to around 100,000 units [Details].
- GPT-Engineer: Specify what you want it to build, and the AI asks for clarification, generates a technical spec and writes all the necessary code [GitHub Link].
--------
Welcome to the r/artificial weekly megathread. This is where you can discuss Artificial Intelligence - talk about new models, recent news, ask questions, make predictions, and chat about other related topics.
Click here for discussion starters for this thread or for a separate post.
Self-promo is allowed in these weekly discussions. If you want to make a separate post, please read and go by the rules or you will be banned.