AI News: MIT MIMIC-IV; Japanese DeBERTa V2 Large model; One-Shot-Talking-Face; Text-to-Audio Diffusion...
Hi there, today we will share research updates on MIT MIMIC-IV, the Japanese DeBERTa V2 Large model, an image-to-image translation framework, One-Shot-Talking-Face, Text-to-Audio Diffusion, an AI-powered Distracted Driving Monitor, and some cool bonus AI tools. So, let's start...
MIT: The MIMIC-IV clinical notes were recently published. The core dataset has been out for a while, but the team has now released the deidentified free-text clinical notes: 300,000+ discharge summaries and 2.5 million radiology reports!
Japanese DeBERTa V2 Large model has been released: This is a Japanese DeBERTa V2 Large model pre-trained on Japanese Wikipedia, the Japanese portion of CC-100, and the Japanese portion of OSCAR. Its masked word prediction accuracy is 2 points higher than that of the Base model.
Bilkent University: An image-to-image translation framework for facial attribute editing with disentangled interpretable latent directions.
Report: Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations (Read The Report or Read Summary Article)
One-Shot-Talking-Face: A demo of "one-shot-talking-face" has been released. It generates a video of a talking face from an input image and a voice recording.
Detextify: A Python library to remove unwanted pseudo-text from images generated by your favorite generative AI models (Stable Diffusion, Midjourney, DALL·E).
Text-to-Audio Diffusion: Researchers from ETH Zurich propose a set of models that tackle multiple aspects of audio generation, including a new method for text-conditional latent audio diffusion with stacked 1D U-Nets that can generate multiple minutes of music from a textual description.
ByteDance: ByteDance AI Research Proposes a Novel Self-Supervised Learning Framework to Create High-Quality Stylized 3D Avatars with a Mix of Continuous and Discrete Parameters
AI to Character: CharacterGPT generates characters from text, each with its own unique personality, identity, traits, voice, and body.
AI-powered Distracted Driving Monitor: Tracks arm placement, eye movement, gaze direction, and head turns, and detects cell phones and seatbelts. A cool application could be for insurance companies.