🚀 AI News: LLM-Blender for Superior Results, EU's Groundbreaking AI Regulations, LU-NeRF's Accurate Pose Estimation, ChatDB's Symbolic Memory, Tülu's Advancements, and WebGLM's Web-Enhanced QA System

This newsletter covers AI research news that is more technical than most resources yet still digestible and applicable.

➡️ Although some LLMs may show better overall performance, the optimal LLM for a given example can vary significantly. How can we ensemble multiple LLMs so that their diverse strengths produce better results? Meet LLM-Blender: an ensembling framework designed to attain consistently superior performance by leveraging the diverse strengths of multiple open-source large language models (LLMs). The framework consists of two modules, PairRanker and GenFuser, which address the observation that the optimal LLM can differ from example to example. PairRanker employs a specialized pairwise comparison method to distinguish subtle differences between candidate outputs. GenFuser then merges the top-ranked candidates, generating an improved output that capitalizes on their strengths and mitigates their weaknesses.
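The two-stage design is easy to picture in code. Below is a minimal Python sketch of that rank-then-fuse flow; `pair_ranker` and `gen_fuser` are hypothetical stand-ins for the trained PairRanker and GenFuser models, not the actual LLM-Blender API.

```python
from itertools import combinations

def blend(question: str, candidates: list[str],
          pair_ranker, gen_fuser, top_k: int = 3) -> str:
    # PairRanker stage: compare every pair of candidates and tally wins.
    wins = {i: 0 for i in range(len(candidates))}
    for i, j in combinations(range(len(candidates)), 2):
        # pair_ranker (hypothetical) returns 0 or 1 for the better candidate.
        better = pair_ranker(question, candidates[i], candidates[j])
        wins[i if better == 0 else j] += 1

    # Keep the top-k candidates by pairwise win count.
    ranked = sorted(wins, key=wins.get, reverse=True)[:top_k]
    top_candidates = [candidates[i] for i in ranked]

    # GenFuser stage: generate one fused answer from the top candidates.
    return gen_fuser(question, top_candidates)
```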

➡️ Big AI news from the EU: The European Union (EU) has made significant strides in regulating artificial intelligence (AI) technology. In a groundbreaking move, the EU has banned the use of AI for biometric surveillance, emotion recognition, and predictive policing. Furthermore, the EU requires the registration of AI models, including detailed information about the training data used, which ensures transparency and accountability in AI systems. There is also a specific requirement to identify and mitigate the risks associated with deepfakes, further safeguarding individuals against manipulated content. These developments reflect the EU's commitment to responsible AI deployment and the protection of individuals' rights and privacy.

➡️ A team of researchers from the University of Massachusetts Amherst and Google has introduced LU-NeRF, an approach that jointly estimates camera poses and neural radiance fields under more flexible assumptions about the pose configuration. The method follows a local-to-global strategy: it first optimizes over localized subsets of the data, referred to as "mini-scenes," where LU-NeRF accurately estimates local pose and geometry even in challenging few-shot scenarios. A robust pose-synchronization step then aligns the mini-scene poses in a global reference frame, enabling a final joint optimization of pose and scene. The researchers show that the LU-NeRF pipeline surpasses prior attempts at unposed NeRF while avoiding restrictive assumptions on the pose prior.
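The local-to-global pipeline can be summarized schematically. The sketch below only outlines the stages described above; all four callables are hypothetical placeholders for the paper's components, not the authors' code.

```python
def lu_nerf_pipeline(images, build_mini_scenes, fit_mini_scene,
                     synchronize_poses, refine_globally):
    # 1. Group the unposed images into small overlapping "mini-scenes".
    mini_scenes = build_mini_scenes(images)

    # 2. Per mini-scene, jointly optimize local camera poses and geometry;
    #    this is the stage that handles challenging few-shot cases.
    local_poses = [fit_mini_scene(scene) for scene in mini_scenes]

    # 3. Robust pose synchronization: align the local pose estimates into
    #    a single global reference frame using mini-scene overlaps.
    global_poses = synchronize_poses(mini_scenes, local_poses)

    # 4. Final joint refinement of poses and the full-scene radiance field.
    return refine_globally(images, global_poses)
```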

➡️ Meet ChatDB, a framework that augments LLMs with symbolic memory in the form of databases. Researchers from Tsinghua University, the Beijing Academy of Artificial Intelligence, and Zhejiang University advocate using databases as an innovative symbolic memory for LLMs. The team demonstrates the advantages of symbolic memory and the chain-of-memory approach in enhancing complex reasoning and preventing error accumulation. By providing a precise storage mechanism for intermediate results, symbolic memory enables accurate and reliable operations. Moreover, using a symbolic language such as SQL allows symbolic computation and manipulation of the stored information.
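A small sketch can make the symbolic-memory loop concrete. Here a real SQLite database stores intermediate results exactly, while `llm_to_sql` is a hypothetical stand-in for the LLM translating each reasoning step into SQL; none of this is ChatDB's actual interface.

```python
import sqlite3

def chat_with_db(steps, llm_to_sql):
    db = sqlite3.connect(":memory:")
    history = []                            # chain-of-memory: prior operations
    for step in steps:
        sql = llm_to_sql(step, history)     # LLM (hypothetical) emits SQL
        result = db.execute(sql).fetchall() # the database executes it exactly
        history.append((sql, result))       # intermediate result is preserved
    return history

# Toy stand-in "LLM": pretend each step already is SQL.
def toy_llm(step, history):
    return step

log = chat_with_db([
    "CREATE TABLE orders (item TEXT, qty INT)",
    "INSERT INTO orders VALUES ('widget', 3)",
    "SELECT item, qty FROM orders",
], toy_llm)
print(log[-1][1])   # [('widget', 3)]
```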

➡️ So many instruction-tuning datasets have come out recently! How valuable are they, and how far are open models really from proprietary ones like ChatGPT? Researchers from AI2 and the University of Washington propose Tülu, a suite of LLaMa-tuned models ranging from 7B to 65B parameters, full-parameter finetuned from LLaMa on a combination of seven instruction datasets. Notably, Tülu 65B is a very strong model that outperforms smaller models and models trained on individual datasets, though it still shows a notable gap to ChatGPT.
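For illustration, here is a sketch of the kind of data-mixing step such a recipe implies: several instruction datasets are normalized into one chat-style format before full-parameter finetuning. The dataset names, fields, and template are invented for the example, not the paper's exact mixture or format.

```python
import random

def to_unified_format(example: dict) -> str:
    # Normalize heterogeneous instruction data into one chat-style template.
    return f"<|user|>\n{example['instruction']}\n<|assistant|>\n{example['response']}"

def mix_datasets(datasets: dict[str, list[dict]], seed: int = 42) -> list[str]:
    # Pool all examples across sources, then shuffle them together.
    mixture = [to_unified_format(ex) for rows in datasets.values() for ex in rows]
    random.Random(seed).shuffle(mixture)
    return mixture

# Illustrative only: these dataset names and rows are invented.
train_texts = mix_datasets({
    "dataset_a": [{"instruction": "Summarize this paragraph: ...", "response": "..."}],
    "dataset_b": [{"instruction": "Translate to French: ...", "response": "..."}],
})
```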

➡️ Meet WebGLM: the newest member of the GLM family (GLM-130B, ChatGLM-130B, ChatGLM-6B, VisualGLM-6B). It is a web-enhanced QA system based on the General Language Model (GLM) that only needs a search engine on top. The researchers leverage GPT-3's in-context learning ability to build an LLM-bootstrapped dataset of quoted, long-form QA, which is used to train the model. They further train a human preference-aware scorer to rate the model's generated responses; for each question, the scorer selects the highest-scored response from the candidates, yielding the answer humans prefer most. Extensive experiments, including both a human evaluation and a Turing test, demonstrate that WebGLM is competitive with pioneering web-enhanced question-answering systems.
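The scorer-based selection step amounts to best-of-n sampling. Below is a minimal sketch under that reading; `search`, `generate`, and `scorer` are hypothetical stand-ins for WebGLM's retriever, generator, and preference-aware reward model.

```python
def answer(question: str, search, generate, scorer, n_candidates: int = 4) -> str:
    references = search(question)                  # retrieve quoted web passages
    candidates = [generate(question, references)   # sample several long-form answers
                  for _ in range(n_candidates)]
    # Return the response the human-preference scorer rates highest.
    return max(candidates, key=lambda c: scorer(question, c))
```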

Featured Tools:

and many more in our AI Tools Club.