
🚀 What is Trending in AI Research? Open Interpreter + AVIS + TinyLlama + LLaSM + Qwen-VL and Qwen-VL-Chat ... What is Trending in AI Tools?

This newsletter brings you AI research news that is more technical than most resources but still digestible and applicable.

Open Interpreter lets LLMs run code (Python, JavaScript, Shell, and more) locally. After installing it, you can chat with Open Interpreter through a ChatGPT-like interface in your terminal by running $ interpreter. It gives developers a broad set of capabilities: creating and editing content in various formats, such as photos, videos, and PDFs; controlling a Chrome browser for research and automation; and plotting, cleaning, and analyzing large datasets.
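To make the idea of "an LLM running code locally" concrete, here is a minimal sketch of the general pattern: pull a code block out of a model reply and execute it in a local subprocess. This is an illustration only, not Open Interpreter's implementation; the function names `extract_python` and `run_locally` and the hard-coded `reply` are hypothetical stand-ins for a real model response.

```python
import re
import subprocess
import sys

# Build the fence marker programmatically so this example nests cleanly.
FENCE = "`" * 3


def extract_python(reply: str) -> str:
    """Return the first Python code block found in a model reply, or ''."""
    pattern = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(reply)
    return match.group(1) if match else ""


def run_locally(code: str, timeout: float = 10.0) -> str:
    """Run the code in a separate local Python process and capture output."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout + result.stderr


# Hypothetical model reply; a real tool would receive this from an LLM.
reply = f"Here is the sum:\n{FENCE}python\nprint(2 + 3)\n{FENCE}"
print(run_locally(extract_python(reply)))
```

Any real tool built on this pattern would want a confirmation step before executing model-written code, since the code runs with the user's own permissions.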

AI Minds Newsletter: Newsletter at the Intersection of Human Minds and AI

Researchers from UCLA and Google propose AVIS, an Autonomous Information Seeking Visual Question Answering framework designed for visual questions whose answers require external knowledge. The method uses a Large Language Model (LLM) to decide dynamically which external tools, such as APIs, to call in order to gather the necessary information. The system comprises three main components: a planner that decides which tool to use next, a reasoner that analyzes the tool's output, and a working memory that retains the information gathered so far. The researchers ran user studies to understand how humans make decisions in similar tasks and used the data in two ways: to build a transition graph that limits the actions available at each state, and to provide in-context examples that improve the LLM’s decision-making. The approach achieves state-of-the-art performance on benchmarks such as Infoseek and OK-VQA.
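The planner, reasoner, working memory, and transition graph described above can be sketched as a small control loop. Everything here is hypothetical: the states, the tool names, the stub tool outputs, and the rule-based `planner` and `reasoner` stand in for the LLM calls and real APIs that the actual system would use. It is a sketch of the control flow, not the authors' implementation.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical transition graph: for each state, the actions allowed next.
TRANSITIONS: Dict[str, List[str]] = {
    "start": ["object_detection", "image_search"],
    "object_detection": ["image_search", "web_search"],
    "image_search": ["web_search", "answer"],
    "web_search": ["answer"],
}

# Stub tools standing in for real APIs; each returns a text observation.
TOOLS: Dict[str, Callable[[str], str]] = {
    "object_detection": lambda q: "a suspension bridge",
    "image_search": lambda q: "similar images are captioned 'Golden Gate Bridge'",
    "web_search": lambda q: "the Golden Gate Bridge opened in 1937",
}

Memory = List[Tuple[str, str]]  # (action, observation) pairs


def planner(state: str, memory: Memory) -> str:
    """Choose the next action from those the transition graph allows.
    A real planner would prompt an LLM; this stub picks the first
    allowed tool that has not been used yet, else answers."""
    used = {action for action, _ in memory}
    for action in TRANSITIONS[state]:
        if action != "answer" and action not in used:
            return action
    return "answer"


def reasoner(observation: str) -> Optional[str]:
    """Keep an observation only if it is informative (stub: non-empty)."""
    return observation if observation.strip() else None


def run(question: str, max_steps: int = 5) -> Memory:
    state, memory = "start", []
    for _ in range(max_steps):
        action = planner(state, memory)
        if action == "answer":
            break
        kept = reasoner(TOOLS[action](question))
        if kept is not None:
            memory.append((action, kept))  # working memory retains the result
        state = action
    return memory


for action, observation in run("When did this bridge open?"):
    print(f"{action}: {observation}")
```

The transition graph is what keeps the planner from considering every tool at every step: at each state, only the listed actions are candidates.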

This paper compares the performance of ChatGPT with that of students in 32 university-level courses. Employing two specially designed classifiers, the study also explores how detectable ChatGPT's text is. Additionally, it surveys students and educators in five countries to gauge their perspectives on the use of such tools for school work. The results indicate that ChatGPT performs comparably to, or even better than, students across multiple courses. Importantly, existing AI-text classifiers struggle to reliably flag AI-generated text, mainly due to false positives and the ease with which AI text can be edited. Both students and educators seem to converge on the opinion that using ChatGPT for academic work amounts to plagiarism. The findings could inform policies regarding AI's role in educational settings.

Decode AI: Turn Artificial Intelligence into Real Results with carefully curated AI developments and insights in less than 5 minutes a day alongside 50k+ readers from companies like Adobe, Amazon, Microsoft, ...

TinyLlama is a project in compact language model research that puts efficiency and scalability first. Led by a research assistant at the Singapore University of Technology and Design, it aims to pre-train a 1.1 billion parameter model on 3 trillion tokens within 90 days, using a modest setup of 16 A100-40G GPUs. While existing models like Meta’s LLaMA and Llama 2 have already demonstrated impressive capabilities at reduced sizes, TinyLlama takes the concept a step further. With 4-bit quantization, the weights of the 1.1 billion parameter model occupy only about 550MB of RAM, making it a potential game-changer for applications with limited computational resources.
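A quick back-of-the-envelope check of these numbers, as a sketch. The parameter count, token budget, duration, and GPU count come from the text above; the 4-bit weight width is an assumption under which the 550MB figure works out, and the helper names are made up for illustration. The memory estimate covers weights only, not activations or any runtime overhead.

```python
PARAMS = 1.1e9   # model parameters
TOKENS = 3e12    # training tokens
DAYS = 90        # training budget
GPUS = 16        # A100-40G GPUs


def weight_memory_mb(params: float, bits_per_weight: int) -> float:
    """Memory for the weights alone, in megabytes (1 MB = 1e6 bytes)."""
    return params * bits_per_weight / 8 / 1e6


def tokens_per_sec_per_gpu(tokens: float, days: float, gpus: int) -> float:
    """Average per-GPU throughput needed to finish within the budget."""
    return tokens / (days * 24 * 3600) / gpus


print(weight_memory_mb(PARAMS, 4))    # 4-bit weights:  550.0 MB
print(weight_memory_mb(PARAMS, 16))   # 16-bit weights: 2200.0 MB
print(round(tokens_per_sec_per_gpu(TOKENS, DAYS, GPUS)))  # 24113
```

In other words, the schedule requires each GPU to sustain roughly 24,000 tokens per second for the full 90 days, with no allowance for downtime.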

Alibaba introduces two open-source large vision language models (LVLMs): Qwen-VL and Qwen-VL-Chat. Qwen-VL is built on Alibaba’s 7-billion-parameter language model, Tongyi Qianwen, and accepts both images and text prompts as input. Qwen-VL-Chat goes further by handling more intricate interactions. Tuned with alignment techniques, it covers a wide range of tasks, from composing poetry and narratives based on input images to solving complex mathematical questions embedded within images.

This paper introduces a framework called Large Language and Speech Model (LLaSM). Most multi-modal models focus on vision and language, leaving speech largely unaddressed; LLaSM aims to fill that gap. It is an end-to-end trained multi-modal model that combines speech and language understanding, enabling cross-modal conversation. The model is designed to follow instructions given through both speech and text, offering a more natural and convenient form of human-AI interaction. The authors also provide an initial dataset, LLaSM-Audio-Instructions, to enable further research and evaluation on multi-modal speech-and-language instruction following.

Creative AI Digest: This is your favorite weekly newsletter about the intersection of AI and creativity. Only Creative AI Digest delivers a humorous and wise perspective for experienced creative professionals.

What is Trending in AI Tools?

  • Mubert: As an AI-driven platform, Mubert empowers you to craft personalized soundtracks and tunes that match your vibe. [Music]

  • Shutterstock AI Image Generator: By harnessing the power of AI, users can create truly breathtaking and unique designs with ease. [Image Generation]

  • PFPMaker: PFPMaker lets individuals generate captivating profile pictures at no cost. [Free Photo Editing]

  • Hostinger AI Website Builder: The Hostinger AI Website Builder offers an intuitive interface combined with advanced AI capabilities designed for crafting websites for any purpose. [Startup and Web Development]

  • AdCreative AI: Boost your advertising and social media game with AdCreative.ai, an AI solution for generating ad creatives. [Marketing and Sales]

  • Aragon AI: Get stunning professional headshots effortlessly with Aragon. [Photo and LinkedIn]

  • SaneBox: SaneBox's powerful AI automatically organizes your email for you. [Email]

  • Rask AI: A one-stop-shop localization tool that lets content creators and companies translate their videos into 130+ languages quickly and efficiently. [Speech and Translation]

Editor’s Recommended AI Tool

Aragon AI: Get stunning professional headshots effortlessly with Aragon. It uses the latest AI technology to create high-quality headshots of you in a snap. Skip the hassle of booking a photography studio or dressing up. Get your photos edited and retouched quickly, not days later. Receive 40 HD photos that will give you an edge in landing your next job. [Photo and LinkedIn]
