
AI News: 🚀 Can LLMs Run Natively on Your iPhone? | Do you really need a studio to finetune LLMs? | NVIDIA Researchers create realistic hair for 3D Characters in real-time......

This newsletter brings you AI research news that is more technical than most resources, yet still digestible and applicable.

NVIDIA Researchers create realistic hair for 3D Characters in real-time: NVIDIA researchers have made a breakthrough in creating realistic hair for 3D characters in real time. They've developed a local-global solver dedicated to simulating Discrete Elastic Rods (DER) with Coulomb friction, designed specifically to take full advantage of the parallel processing power of modern GPUs. The research team demonstrates that their simulator accurately reproduces analytical results from recent cantilever, bend-twist, and stick-slip experiments, while significantly reducing iteration times for high-resolution hair simulations. This means we can expect far more realistic hair in video games, movies, and other digital media!

Can LLMs Run Natively on Your iPhone? Meet MLC-LLM: An Open Framework that Brings Language Models (LLMs) Directly into a Broad Class of Platforms with GPU Acceleration. MLC LLM enables language models to be deployed natively on a wide range of hardware backends, including CPUs and GPUs, as well as inside native applications. This means that language models can run on local devices without any server or cloud-based infrastructure. MLC LLM provides a productive framework that lets developers optimize model performance for their own use cases, such as Natural Language Processing (NLP) or Computer Vision. Inference can even be accelerated by local GPUs, making it possible to run complex models with high accuracy and speed on personal devices.
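A key ingredient in fitting an LLM onto a phone is weight quantization, i.e. storing weights in a few bits instead of 32-bit floats. The sketch below shows symmetric 4-bit quantization in pure Python; it is purely illustrative and is not MLC LLM's actual quantization scheme.

```python
# Toy symmetric 4-bit weight quantization -- the kind of compression that
# helps make on-device LLM inference feasible. Illustrative only; not
# MLC LLM's actual implementation.

def quantize_int4(weights):
    """Map float weights to integers in [-7, 7] plus a per-tensor scale."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 7.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int4(q, scale):
    """Recover approximate float weights from the quantized form."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.07, -0.21]
q, scale = quantize_int4(weights)
approx = dequantize_int4(q, scale)
# Each recovered weight lies within one quantization step of the original.
assert all(abs(a - w) <= scale for a, w in zip(approx, weights))
```

Storing 4-bit integers plus one scale per tensor cuts memory roughly 8x versus float32, at the cost of the small rounding error bounded above.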

Do you want your characters to interact physically and naturally with the scene? Tired of manual annotation? Want to generalize far beyond the training objects and scenarios? InterPhys is an innovative research project that focuses on modeling the interaction between physical scenes and characters using learned constraints. The goal is to improve the quality and realism of simulated interactions in virtual environments, such as video games and virtual reality simulations. The researchers use a combination of supervised and reinforcement learning to teach the model to generate plausible physical responses in a wide range of scenarios. InterPhys learns the constraints of objects and characters in a scene by observing and imitating their behavior in a given environment. This allows the model to predict the outcomes of various interactions and generate realistic animations.

CMU Researchers introduce DocPrompting: a natural-language-to-code generation approach that explicitly leverages code documentation. DocPrompting can generate code that calls unseen functions by simply reading the docs. The research team demonstrated the general DocPrompting approach across two retrievers, five generators, and two new retrieval-based NL->code datasets.
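The core recipe is simple: retrieve documentation relevant to the natural-language query, then splice it into the code-generation prompt. The toy bag-of-words retriever below illustrates the idea only; the paper evaluates several real sparse and dense retrievers, and none of this is the authors' code.

```python
# Toy sketch of the DocPrompting idea: retrieve relevant documentation
# snippets for an NL query, then prepend them to the code-generation prompt.

def score(query, doc):
    """Count overlapping words between query and doc (toy sparse retrieval)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def build_prompt(query, doc_pool, k=2):
    """Retrieve the top-k docs and splice them into a generation prompt."""
    ranked = sorted(doc_pool, key=lambda d: score(query, d), reverse=True)
    docs = "\n".join(ranked[:k])
    return f"Documentation:\n{docs}\n\nTask: {query}\nCode:"

doc_pool = [
    "os.makedirs(path, exist_ok=True) creates a directory tree",
    "shutil.rmtree(path) deletes path recursively",
    "json.dumps(obj) serializes obj to a JSON string",
]
prompt = build_prompt("create a directory tree if it does not exist", doc_pool, k=1)
assert "os.makedirs" in prompt   # the relevant doc was retrieved
```

Because the retrieved docs enter the prompt at inference time, the generator can call functions it never saw during training, which is the point of the approach.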

The 65B LLaMA-Adapter-V2 code and checkpoint are now available on GitHub. LLaMA-Adapter V2 surpasses ChatGPT in response quality (102%:100%) and beats Vicuna in win-tie-lose comparisons. Using GPT-4 to evaluate response quality, LLaMA-Adapter V2 outperforms ChatGPT on 50 out of 80 questions; by comparison, Vicuna comes out ahead in only 14 instances.
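Numbers like "50 out of 80" come from tallying per-question judge verdicts. The snippet below sketches that tally; the verdict split beyond the reported 50 wins is hypothetical, and this is not the authors' evaluation script.

```python
# Minimal sketch of the win-tie-lose tally behind GPT-4-as-judge
# comparisons (illustrative; not the authors' evaluation code).
from collections import Counter

def tally(verdicts):
    """verdicts: list of 'win' / 'tie' / 'lose' judgements for model A vs model B."""
    counts = Counter(verdicts)
    return counts["win"], counts["tie"], counts["lose"]

# 50 reported wins out of 80 questions; the tie/lose split here is made up.
verdicts = ["win"] * 50 + ["tie"] * 16 + ["lose"] * 14
wins, ties, losses = tally(verdicts)
assert wins == 50 and wins + ties + losses == 80
```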

Prompt Diffusion: UT Austin and Microsoft Researchers present Prompt Diffusion, a framework for enabling in-context learning in diffusion-based generative models. Given a pair of task-specific example images, such as depth from/to image and scribble from/to image, along with text guidance, the model automatically understands the underlying task and performs the same task on a new query image following the text guidance. To achieve this, the research team proposes a vision-language prompt that can model a wide range of vision-language tasks, and a diffusion model that takes it as input. The diffusion model is trained jointly on six different tasks using these prompts. The resulting Prompt Diffusion model becomes the first diffusion-based vision-language foundation model capable of in-context learning.

Do you really need a studio to finetune LLMs? Studios are meant for photography. All you need is AutoTrain. Finetune models like pythia, dolly, llama, and more with just a few clicks: no code and no infrastructure. 🤗 AutoTrain is a no-code tool for training state-of-the-art models for Natural Language Processing (NLP), Computer Vision (CV), Speech, and even Tabular tasks. It is built on top of the awesome tools developed by the Hugging Face team, and it is designed to be easy to use.

Meet Jsonformer: A Bulletproof Way to Generate Structured JSON from Language Models. Generating structured JSON from language models is challenging: current approaches like prompt engineering, fine-tuning, and post-processing often fail to produce syntactically correct JSON. Jsonformer is a wrapper around Hugging Face models that has the model generate only the content tokens, while the wrapper fills in the fixed tokens during decoding. This makes it more efficient and bulletproof than existing methods. Jsonformer supports a subset of JSON Schema, including number, boolean, string, array, and object types. It's built on top of the Hugging Face transformers library, making it compatible with any model that supports the Hugging Face interface.
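The trick is that the wrapper, not the model, emits the structural tokens (braces, keys, quotes), so the output is syntactically valid by construction. The toy sketch below uses a stub in place of a real Hugging Face model and handles only flat objects; it is an illustration of the idea, not the Jsonformer library itself.

```python
# Toy version of the Jsonformer idea: fixed JSON tokens come from the
# wrapper, and the "model" is only asked for each field's value.
import json

def generate_json(schema, model):
    """Fill a flat JSON object by querying the model only for field values."""
    parts = []
    for key, typ in schema["properties"].items():
        value = model(key, typ)          # model generates content tokens only
        parts.append(f'"{key}": {json.dumps(value)}')
    return "{" + ", ".join(parts) + "}"  # fixed tokens supplied by the wrapper

def stub_model(key, typ):
    """Stand-in for an LLM constrained to the field's declared type."""
    return {"number": 42, "boolean": True, "string": f"generated {key}"}[typ]

schema = {"properties": {"name": "string", "age": "number", "active": "boolean"}}
out = generate_json(schema, stub_model)
parsed = json.loads(out)                 # always parses: structure is fixed
assert parsed["age"] == 42 and parsed["active"] is True
```

Because malformed output is impossible at the structural level, the only remaining failure mode is a bad value, which type-constrained decoding further limits.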

Featured AI Tools For This Newsletter Issue:

• Shutterstock AI Image Generator

Find 100s of cool artificial intelligence (AI) tools. Our expert team reviews and provides insights into some of the most cutting-edge AI tools available. Check out AI Tools Club.