AI Dev and Research News
AI News: MIT MIMIC-IV; Japanese DeBERTa V2 Large model; One-Shot-Talking-Face; Text-to-Audio Diffusion....

ASIF RAZZAQ
January 18, 2023

Hi there, today we will share research updates on MIT's MIMIC-IV, the Japanese DeBERTa V2 Large model, an image-to-image translation framework, One-Shot-Talking-Face, Text-to-Audio Diffusion, an AI-powered Distracted Driving Monitor, and some bonus AI tools. So, let's start...

MIT: MIMIC-IV was recently published. The core dataset has been out for a while, but the deidentified free-text clinical notes have only just been released: 300,000+ discharge summaries and 2.5 million radiology reports!

Japanese DeBERTa V2 Large model has been released: This is a Japanese DeBERTa V2 Large model pre-trained on Japanese Wikipedia, the Japanese portion of CC-100, and the Japanese portion of OSCAR. Its masked word prediction accuracy is 2 points higher than that of the Base model.

Bilkent University: An image-to-image translation framework for facial attribute editing with disentangled interpretable latent directions.

Report: Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations (Read The Report or Read Summary Article)

One-Shot-Talking-Face: A demo of "one-shot-talking-face" has been released. It generates a video of a talking face from an input voice clip and an image.

Detextify: A Python library to remove unwanted pseudo-text from images generated by your favorite generative AI models (Stable Diffusion, Midjourney, DALL·E).

Text-to-Audio Diffusion: Researchers from ETH Zurich propose a set of models that tackle multiple aspects of audio generation, including a new method for text-conditional latent audio diffusion with stacked 1D U-Nets that can generate multiple minutes of music from a textual description.

ByteDance: ByteDance AI Research Proposes a Novel Self-Supervised Learning Framework to Create High-Quality Stylized 3D Avatars with a Mix of Continuous and Discrete Parameters

AI to Character: CharacterGPT generates characters from text, each with its own unique personality, identity, trends, voice, and body.

AI-powered Distracted Driving Monitor: Tracks arm placement, eye movement, gaze direction, and head turns, and detects cell phone and seatbelt use. A cool application could be for insurance companies.
