AI News: Forward-Forward algorithm (FF); DeepMind's MuJoCo MPC (MJPC); Microsoft’s new AI model, VALL-E

Hi there! Today we're sharing research updates from DeepMind, Meta, Microsoft, OpenAI, the University of Toronto, and Toyota research, plus some bonus cool AI tools. So, let's start.

Meta: In this research paper, Meta researchers explore the scaling properties of mixed-modal generative models, discovering new scaling laws that unify the contributions of individual modalities and the interactions between them.

University of Toronto: Geoffrey Hinton, professor at the University of Toronto and engineering fellow at Google Brain, recently published a paper on the Forward-Forward algorithm (FF), a technique for training neural networks that replaces the forward and backward passes of backpropagation with two forward passes, one on positive (real) data and one on negative data, to update the model weights.
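To make the two-forward-pass idea concrete, here is a minimal NumPy sketch of a single layer trained in the Forward-Forward style. It is an illustration under stated assumptions, not the paper's implementation: the class name FFLayer, the toy data, the threshold, and the learning rate are all made up for this example. The layer computes a "goodness" (sum of squared activations) and is nudged to push goodness above a threshold on positive data and below it on negative data, using only quantities local to the layer.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class FFLayer:
    """One ReLU layer with a local Forward-Forward-style objective (hypothetical helper)."""

    def __init__(self, n_in, n_out, threshold=1.0, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
        self.threshold = threshold  # assumed value, chosen for this toy example
        self.lr = lr                # assumed value, chosen for this toy example

    def forward(self, x):
        # Normalize each input vector so only its direction is used.
        x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
        return x, np.maximum(0.0, x @ self.W)

    def goodness(self, x):
        # Goodness = sum of squared activations per example.
        _, h = self.forward(x)
        return (h ** 2).sum(axis=1)

    def train_step(self, x_pos, x_neg):
        # Two forward passes: one on positive data, one on negative data.
        for x, is_positive in ((x_pos, True), (x_neg, False)):
            xn, h = self.forward(x)
            p = sigmoid((h ** 2).sum(axis=1) - self.threshold)
            # Positive loss: -log(p); negative loss: -log(1 - p).
            dloss_dg = (p - 1.0) if is_positive else p
            # d(goodness)/d(pre-activation) is 2h (zero for inactive ReLU units).
            dloss_dz = dloss_dg[:, None] * 2.0 * h
            # Local weight update: no gradient is passed to any other layer.
            self.W -= self.lr * (xn.T @ dloss_dz) / len(x)


def run_demo(steps=200, seed=0):
    """Train on two toy clusters and return mean goodness for each."""
    rng = np.random.default_rng(seed)
    base_pos = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=float)  # toy "real" data
    base_neg = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)  # toy negative data
    layer = FFLayer(n_in=8, n_out=16, seed=seed)
    for _ in range(steps):
        x_pos = base_pos + 0.3 * rng.normal(size=(32, 8))
        x_neg = base_neg + 0.3 * rng.normal(size=(32, 8))
        layer.train_step(x_pos, x_neg)
    x_pos = base_pos + 0.3 * rng.normal(size=(256, 8))
    x_neg = base_neg + 0.3 * rng.normal(size=(256, 8))
    return layer.goodness(x_pos).mean(), layer.goodness(x_neg).mean()


if __name__ == "__main__":
    g_pos, g_neg = run_demo()
    print("mean goodness, positive data:", g_pos)
    print("mean goodness, negative data:", g_neg)
```

In a deeper network, each layer would be trained the same way on the normalized output of the layer before it, so no error signal has to travel backward through the stack.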

DeepMind: MuJoCo MPC (MJPC) is an interactive tool for real-time behavior synthesis with predictive control algorithms. MJPC includes a number of planners written in multi-threaded C++, such as iLQG and Gradient Descent. One of the awesome MJPC features is asynchronous simulation: you can slow down or speed up the environment time (press -/+), and slowing it down effectively gives the planner additional planning time.
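A quick back-of-the-envelope sketch shows why slowing the environment buys planning time. This is not MJPC's API: the function name and all the numbers (a 10 ms simulation step, a 4 ms planner iteration) are hypothetical, and the model is deliberately simplified to integer microseconds.

```python
def planner_iterations_per_step(sim_step_us, planner_iter_us, speed_percent):
    """Whole planner iterations that fit in the wall-clock time of one simulation step.

    All arguments are hypothetical illustration values, not MJPC settings.
    """
    # At 50% speed, one simulated step takes twice as long in wall-clock time.
    wall_us_per_step = sim_step_us * 100 // speed_percent
    return wall_us_per_step // planner_iter_us


if __name__ == "__main__":
    for speed in (100, 50, 25):
        n = planner_iterations_per_step(10_000, 4_000, speed)
        print(f"{speed}% speed: {n} planner iterations per simulation step")
```

With these made-up numbers, halving the simulation speed more than doubles the number of planner iterations that fit into each simulated step, because the leftover wall-clock time that was previously too short for another iteration now becomes usable.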

Microsoft: Microsoft’s new AI model, VALL-E, can generate speech from text using only a three-second audio sample. VALL-E shows emergent in-context learning capabilities and can synthesize high-quality personalized speech with only a 3-second enrolled recording of an unseen speaker as an acoustic prompt. Experimental results show that VALL-E significantly outperforms the state-of-the-art zero-shot TTS system in terms of speech naturalness and speaker similarity.

Toyota/Meta/The Hebrew University of Jerusalem: ReVISE is described as the first universal audio-visual speech enhancement model powered by self-supervised learning (SSL). This single model can perform video-to-speech synthesis, speech inpainting, denoising, and source separation.

OpenAI: OpenAI researchers collaborated with Georgetown University’s Center for Security and Emerging Technology and the Stanford Internet Observatory to investigate how large language models might be misused for disinformation purposes. The collaboration included an October 2021 workshop that brought together 30 disinformation researchers, machine learning experts, and policy analysts, and it culminated in a co-authored report building on more than a year of research. The report outlines the threats that language models pose to the information environment if used to augment disinformation campaigns and introduces a framework for analyzing potential mitigations.

OpenDR: The Open Deep Learning Toolkit for Robotics version 2.0 was just released! This new version of the toolkit includes several improvements, such as new tools for object detection, efficient continual inference, tracking, emotion estimation and high-resolution pose estimation. Furthermore, this version includes a refined ROS interface, along with support for ROS2.

Cool AI Tool: Check out Plask, a web-based, AI-powered 3D animation editor and motion capture tool that you can use for free.

Cool AI Tool: Sumly.AI is a free web tool that uses AI to summarize podcasts.

