This week in AI – all the Major AI developments in a nutshell

  • Alibaba Research released Qwen1.5-110B, the largest model in the Qwen1.5 series, with over 100 billion parameters. It demonstrates competitive performance against Llama-3-70B. The model is multilingual and supports a context length of 32K tokens [Details].
  • Gradient released a model, Llama-3 8B Gradient Instruct 1048k, that extends Llama-3 8B's context length from 8k to 1M+ [Details].
  • Abacus.AI released the Llama-3-Giraffe-70B model, which extends the context length of Llama 3 70B to approximately 128k [Details].
  • ByteDance released Hyper-SD, offering hyper-fast and hyper-quality text-to-image generation. The model achieves single-step inference on both the SD1.5 and SDXL architectures without evident loss of aesthetics, style, or structure [Details | Scribble Demo | T2I Demo].
  • BigCode released StarCoder2-15B-Instruct-v0.1, the very first entirely self-aligned code LLM trained with a fully permissive and transparent pipeline. The open-source pipeline uses StarCoder2-15B to generate thousands of instruction-response pairs, which are then used to fine-tune StarCoder2-15B itself without any human annotations or distilled data from huge and proprietary LLMs [Details].
  • Stardust introduced an AI robot, the Astribot S1, that can perform complex tasks such as folding clothes, sorting items, flipping pots for cooking, vacuuming, and competitive cup stacking. The S1 robot is expected to be commercialized within 2024 [Details].
  • Amazon Q, a generative AI-powered assistant for businesses and developers by AWS, is now generally available. Amazon Q includes Amazon Q Developer (a generative AI-powered conversational assistant to build and operate AWS applications), Amazon Q Business (an AI assistant that can generate content and securely complete tasks based on data and information in enterprise systems), and Amazon Q Apps (to build generative AI-powered apps from a company’s data, without any prior coding experience) [Details].
  • Atlassian introduced Rovo, an AI tool that accelerates finding, learning, and acting on information dispersed across a range of internal tools and third-party apps. It also lets you add specialized agents to workflows [Details].
  • China's Shengshu Technology and Tsinghua University have unveiled Vidu AI, a text-to-video model capable of generating 16-second clips at 1080p resolution [Video].
  • PyTorch released ExecuTorch alpha, a framework focused on deploying large language models across mobile and edge devices including wearables, embedded devices and microcontrollers [Details].
  • Memory is now available to all ChatGPT Plus users. Tell ChatGPT anything you’d like it to remember and it can use this information as context when generating a future related answer. Memory can be turned on or off in settings and is not currently available in Europe or Korea. You can also start a Temporary Chat for one-off conversations, which won’t appear in your history or in memory [Details].
  • GitHub announced GitHub Copilot Workspace: the Copilot-native developer environment. Within Copilot Workspace, developers can brainstorm, plan, build, test, and run code in natural language. This new task-centric experience leverages different Copilot-powered agents from start to finish, while giving developers full control over every step of the process [Details].
  • Researchers from KAIST AI and others released Prometheus 2, an open-source language model specialized in evaluating other language models. Compared to the Prometheus 1 models, the Prometheus 2 models support both absolute grading and relative grading [Details].
  • Google DeepMind introduced Med-Gemini, a family of multimodal medical models built upon Gemini that establish new state-of-the-art (SoTA) performance on 10 out of 14 medical benchmarks [Paper].
  • Nous Research released Hermes 2 Pro - Llama-3 8B. Hermes Pro comes with Function Calling and Structured Output capabilities, and the Llama-3 version now uses dedicated tokens for tool call parsing tags, to make handling streaming function calls easier [Details].
  • Anthropic introduced the Claude Team plan and iOS app [Details].
  • The Empathic Voice Interface (EVI) API, announced last month by Hume AI, is now publicly available. Powered by an empathic LLM (eLLM) that processes your tone of voice, EVI unlocks new capabilities like knowing when to speak, generating more empathic language, and intelligently modulating its own tune, rhythm, and timbre [Details].
  • Google’s new ‘Speaking practice’ feature uses AI to help users improve their English skills [Details].
  • Meta’s ‘set it and forget it’ AI ad tools are misfiring and blowing through cash [Details].
  • Google introduced a new shortcut in the Chrome desktop address bar for quick access to the Gemini chatbot [Link].

Source: AI Brews. Links were removed from this post due to auto-delete, but they are present in the newsletter. It's free to join and is sent only once a week with bite-sized news, learning resources, and selected tools. Thanks!

submitted by /u/wyem