AI News: MLANG for the French Tax Code; New GPTZero Model; What if you could fit an entire codebase in an LLM?; Stanford’s Parsel, and more...
Hi there, today we will share some research updates on MLANG, a modern compiler for the French tax code; the new GPTZero model; fitting an entire codebase in an LLM; Stanford’s Parsel; SparseGPT; MPCFormer; and many other cool updates. So, let's start...
MLANG, A Modern Compiler for the French Tax Code: An open-source compiler toolchain that aims to replace the existing infrastructure. French researchers transformed their tax code into computer code that compiles to Python, providing valuable insight into how France's income tax computations actually work. The government is reviewing it for possible adoption in official production use.
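To make the idea concrete, here is a purely illustrative sketch of what tax rules look like once expressed as ordinary Python; the function, its inputs, and the bracket numbers are all hypothetical and do not reflect Mlang's actual generated code or the real French schedule.

```python
# Hypothetical sketch: tax rules compiled down to a plain Python function.
# The bracket structure and rates below are invented for illustration and
# are NOT the real French tax schedule or Mlang's generated interface.

def compute_income_tax(net_taxable_income: float, household_shares: float) -> float:
    """Apply a simplified progressive bracket schedule per household share."""
    income_per_share = net_taxable_income / household_shares
    brackets = [            # (upper bound, marginal rate) -- illustrative only
        (10_000, 0.00),
        (26_000, 0.11),
        (74_000, 0.30),
        (float("inf"), 0.41),
    ]
    tax_per_share, lower = 0.0, 0.0
    for upper, rate in brackets:
        taxed = max(min(income_per_share, upper) - lower, 0.0)
        tax_per_share += taxed * rate
        lower = upper
    return tax_per_share * household_shares

print(compute_income_tax(45_000, 2.0))  # tax for a two-share household
```

Once the rules live in ordinary code like this, they can be tested, audited, and reused outside the tax administration's legacy systems, which is the point of the Mlang effort.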
The New GPTZero Model: It now handles mixed AI and human text and highlights the portions of text that are most likely to be AI-generated, a key feature that educators in our community have been requesting. They also built a pipeline for batch file uploads in PDF, Word, and .txt formats, so you can run multiple files through GPTZero with ease.
What if you could fit an entire codebase in an LLM?😀 Researchers from Google developed a simple analytical model of inference efficiency to select the best multi-dimensional partitioning techniques, optimized for TPU v4 slices, based on application requirements. Combining these with a suite of low-level optimizations, they achieve a new Pareto frontier on the tradeoff between latency and model FLOPS utilization (MFU) for 500B+ parameter models, outperforming the FasterTransformer suite of benchmarks.
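The latency/MFU tradeoff is easier to grasp with some back-of-the-envelope arithmetic. The sketch below is our own simplification, not the authors' analytical model, and the chip count and peak-FLOPS figures are placeholders rather than official TPU v4 numbers.

```python
# Rough sketch of model FLOPS utilization (MFU) during decoding.
# Uses the common ~2 * n_params FLOPs-per-generated-token approximation;
# the chip count and peak throughput below are illustrative placeholders.

def mfu(n_params: float, tokens_per_second: float,
        n_chips: int, peak_flops_per_chip: float) -> float:
    achieved_flops = 2 * n_params * tokens_per_second
    peak_flops = n_chips * peak_flops_per_chip
    return achieved_flops / peak_flops

# Example: a 500B-parameter model decoding 1,000 tokens/s across 64 chips,
# each assumed to deliver 275 TFLOP/s of peak throughput.
print(f"MFU ≈ {mfu(500e9, 1_000, 64, 275e12):.1%}")
```

Sweeping partitioning choices in a model like this (more chips tends to lower latency but also lower MFU) is what traces out the Pareto frontier the paper optimizes.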
Researchers at Stanford Introduce Parsel: Parsel is an AI framework that enables automatic implementation and validation of complex algorithms with code large language models (LLMs). For code language models, every token is a new chance to break a program. What if LLMs wrote code like people do, decomposing programs into solvable parts? By writing natural-language programs in Parsel, they can solve competition-level coding problems, beating the prior SoTA by >75%!
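The decomposition idea can be sketched without the actual Parsel toolchain: describe each subtask in natural language, let a code LLM draft each piece independently, and accept a draft only if it passes that subtask's tests. In the sketch below, `draft_with_llm` is a hypothetical stand-in for an LLM call, not Parsel's API.

```python
# Minimal sketch of Parsel-style decomposition: natural-language subtasks,
# each implemented independently and validated against its own tests.
# `draft_with_llm` is a hypothetical placeholder for a code-LLM call.

subtasks = {
    "is_prime": ("Return True iff n is a prime number.",
                 [((7,), True), ((8,), False), ((2,), True)]),
    "next_prime": ("Return the smallest prime strictly greater than n.",
                   [((7,), 11), ((13,), 17)]),
}

def draft_with_llm(name: str, description: str) -> str:
    raise NotImplementedError("stand-in for a code LLM emitting a candidate")

def implement(name: str, description: str, tests, n_candidates: int = 5) -> str:
    for _ in range(n_candidates):          # sample several candidate programs
        source = draft_with_llm(name, description)
        scope: dict = {}
        exec(source, scope)                # load the candidate definition
        fn = scope[name]
        if all(fn(*args) == expected for args, expected in tests):
            return source                  # keep the first draft that passes
    raise RuntimeError(f"no candidate passed the tests for {name!r}")
```

Because each part is small and checked in isolation, a single bad token no longer sinks the whole program, which is the failure mode the Parsel authors highlight.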
TextReducer: TextReducer is a tool for summarization and information extraction powered by the SentenceTransformer library. Unlike many extractive-summarization techniques, TextReducer accepts an optional "target" around which the summary is focused. This target can be any text prompt, so a user can specify the kind of information they want to find or summarize and ignore everything else. Another key benefit is that rather than extracting the sentences for the summary, TextReducer carves away at the original text, removing unnecessary sentences. This leads to more fluent summaries and preserves grammatical features like coreference that are often lost in traditional extractive summarization.
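A minimal version of the target-focused trick can be built directly on the SentenceTransformer library, as sketched below. This is not TextReducer's own code, just the underlying similarity idea; the model name is a common default and the keep-count is arbitrary.

```python
# Sketch of target-focused reduction with sentence-transformers: score every
# sentence against a target prompt and carve away the lowest-scoring ones,
# keeping the survivors in their original order.  Not TextReducer's code.
from sentence_transformers import SentenceTransformer, util

def reduce_text(sentences: list[str], target: str, keep: int = 5) -> str:
    model = SentenceTransformer("all-MiniLM-L6-v2")
    sentence_emb = model.encode(sentences, convert_to_tensor=True)
    target_emb = model.encode(target, convert_to_tensor=True)
    scores = util.cos_sim(target_emb, sentence_emb)[0]
    top = set(scores.topk(min(keep, len(sentences))).indices.tolist())
    return " ".join(s for i, s in enumerate(sentences) if i in top)
```

Keeping the surviving sentences in document order, rather than reordering them by score, is what preserves the fluency and coreference the tool advertises.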
MPCFormer: An AI framework that enables private inference with secure multiparty computation (MPC) for Transformers (Copilot, ChatGPT, OPT). Love using Copilot but don’t want to send your code to the cloud? MPC provides a strong privacy guarantee by keeping the original inputs local.
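To give a flavor of the MPC building block involved, here is a toy additive secret-sharing sketch: an input is split into random shares so that no single party ever sees the plaintext. This illustrates the primitive only and is not MPCFormer's actual protocol.

```python
# Toy additive secret sharing over a prime field: the client splits its input
# into random shares, each server holds only its own share, and the value is
# recoverable only by summing all shares.  Illustrative of the MPC primitive;
# not MPCFormer's protocol.
import secrets

PRIME = 2**61 - 1   # arbitrary field modulus for the toy example

def share(secret: int, n_parties: int = 2) -> list[int]:
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares: list[int]) -> int:
    return sum(shares) % PRIME

token_id = 42_017                        # e.g. one token of a private prompt
parts = share(token_id, n_parties=3)
assert reconstruct(parts) == token_id    # only the combined shares reveal it
```

MPCFormer's contribution is making protocols built on primitives like this fast enough to run Transformer inference in practice.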
Cool Project: An ML search engine that can find exact timestamps for anything on YouTube, using OpenAI Whisper and UKPLab’s SBERT Sentence Transformers.
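The pipeline behind such an engine is easy to sketch: Whisper produces timestamped transcript segments, SBERT embeds them, and a query is matched by cosine similarity. The sketch below assumes the openai-whisper and sentence-transformers packages and a single local audio file; it is a simplification of the project's idea, not its code.

```python
# Sketch: timestamped semantic search over one video's audio track.
# Assumes `pip install openai-whisper sentence-transformers`; a real engine
# would index many videos and persist the embeddings, not recompute them.
import whisper
from sentence_transformers import SentenceTransformer, util

def find_timestamps(audio_path: str, query: str, top_k: int = 3):
    segments = whisper.load_model("base").transcribe(audio_path)["segments"]
    texts = [seg["text"] for seg in segments]

    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    segment_emb = embedder.encode(texts, convert_to_tensor=True)
    query_emb = embedder.encode(query, convert_to_tensor=True)

    hits = util.semantic_search(query_emb, segment_emb, top_k=top_k)[0]
    return [(segments[h["corpus_id"]]["start"], texts[h["corpus_id"]])
            for h in hits]

for start, text in find_timestamps("talk.mp3", "how attention works"):
    print(f"{start:7.1f}s  {text.strip()}")
```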
SparseGPT: Researchers from IST Austria show for the first time that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one shot, without any retraining, at minimal loss of accuracy. This is achieved via a new pruning method called SparseGPT, specifically designed to work efficiently and accurately on massive GPT-family models.
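SparseGPT's pruning procedure itself is more involved (it decides which weights to drop and updates the remaining ones to compensate), but the effect of one-shot 50% unstructured sparsity is easy to picture with a naive magnitude-pruning baseline; the sketch below is that baseline, not SparseGPT.

```python
# Naive baseline for one-shot unstructured pruning: zero out the 50% of
# weights with the smallest magnitude in every linear layer, no retraining.
# SparseGPT selects and compensates weights far more carefully; this is only
# the simple point of comparison.
import torch
from torch import nn

@torch.no_grad()
def magnitude_prune_(model: nn.Module, sparsity: float = 0.5) -> None:
    for module in model.modules():
        if isinstance(module, nn.Linear):
            w = module.weight
            k = int(w.numel() * sparsity)
            if k == 0:
                continue
            threshold = w.abs().flatten().kthvalue(k).values
            w.mul_((w.abs() > threshold).to(w.dtype))  # drop the smallest weights

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))
magnitude_prune_(model)
print(f"layer sparsity: {(model[0].weight == 0).float().mean():.0%}")
```

This simple criterion tends to cost accuracy at GPT scale, which is why a one-shot method that stays near-lossless is notable.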
A simple algorithm to decide whether to use ChatGPT