Marktechpost Newsletter: Fully Quantized FP8 Version of Meta’s Llama 3.1 405B Model, LAMBDA, Agent-E,.....


Featured Research

Neural Magic has recently announced a significant breakthrough in AI model compression, introducing a fully quantized FP8 version of Meta’s Llama 3.1 405B model. This milestone allows the 405-billion-parameter model to fit on any 8xH100 or 8xA100 system without the out-of-memory (OOM) errors typically encountered with the original FP8 and FP16 versions. By leveraging faster memory and compute, the new model resolves these memory constraints and improves inference speed by more than 2X, eliminating the need for CPU offloading or distribution across multiple nodes.

Neural Magic provides two key versions of the model:

The fully quantized FP8 version, Meta-Llama-3.1-405B-Instruct-FP8-dynamic, retains the Meta-Llama-3.1 architecture and is designed for assistant-like chat in multiple languages, though its intended use is limited to English and lawful applications. Released as version 1.0, the model was developed by Neural Magic and is distributed under the llama3.1 license.....
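For readers who want to experiment, here is a minimal serving sketch, assuming the checkpoint is published on Hugging Face under the repo id shown below and that a vLLM installation with FP8 support is available; the repo id, context length, and sampling settings are illustrative assumptions rather than details from the announcement.

```python
# Hypothetical sketch: serving the FP8-quantized 405B checkpoint on a single
# 8-GPU (H100/A100) node with vLLM tensor parallelism.
# The Hugging Face repo id below is an assumption for illustration.
from vllm import LLM, SamplingParams

llm = LLM(
    model="neuralmagic/Meta-Llama-3.1-405B-Instruct-FP8-dynamic",  # assumed repo id
    tensor_parallel_size=8,   # shard weights across all 8 GPUs in the node
    max_model_len=4096,       # modest context to keep the KV cache within memory
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Explain FP8 quantization in one sentence."], params)
print(outputs[0].outputs[0].text)
```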

 Editor’s Picks…

LAMBDA: A New Open-Source, Code-Free Multi-Agent Data Analysis System to Bridge the Gap Between Domain Experts and Advanced AI Models

A team of researchers from Hong Kong Polytechnic University has introduced LAMBDA, a new open-source, code-free multi-agent data analysis system developed to bridge the communication gap between domain experts and advanced AI models. LAMBDA provides a medium for smooth interaction between domain knowledge and AI capabilities in data science. The approach addresses several challenges: it removes coding barriers, integrates human intelligence with AI, and could reshape data science education, while promising reliability and portability. Reliability means LAMBDA can handle data analysis tasks stably and correctly; portability means it is compatible with various LLMs, allowing it to be enhanced by the latest state-of-the-art models.

LAMBDA is a multi-agent data analysis system in which two agents cooperate to solve data analysis tasks expressed in natural language. The process starts with writing code based on user instructions and then executing that code. The two roles are the “programmer” and the “inspector.” The programmer writes code according to the user’s instructions and dataset, and this code is run on the host system. If the code raises errors during execution, the inspector suggests improvements; the programmer uses these suggestions to fix the code and resubmits it for re-evaluation.
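As a rough sketch of this programmer/inspector loop (not LAMBDA’s actual implementation), the snippet below wires a hypothetical chat() helper into a write, run, critique, revise cycle with a small retry budget; the prompts, the sandboxing, and the helper itself are assumptions for illustration.

```python
# Rough sketch of a programmer/inspector loop in the spirit of LAMBDA.
# chat(), the prompts, and the retry budget are illustrative assumptions,
# not the actual LAMBDA implementation.
import subprocess
import sys
import tempfile

def chat(system: str, user: str) -> str:
    """Placeholder for an LLM call (e.g., any OpenAI-compatible chat endpoint)."""
    raise NotImplementedError

def run_code(code: str) -> tuple[bool, str]:
    """Execute generated Python in a subprocess and capture its output."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    proc = subprocess.run([sys.executable, path], capture_output=True, text=True, timeout=120)
    return proc.returncode == 0, proc.stdout + proc.stderr

def analyze(instruction: str, dataset_path: str, max_rounds: int = 3) -> str:
    # Programmer: draft code from the user's instruction and dataset.
    code = chat("You are the programmer agent. Reply with runnable Python only.",
                f"Task: {instruction}\nDataset: {dataset_path}")
    log = ""
    for _ in range(max_rounds):
        ok, log = run_code(code)
        if ok:
            return log  # analysis succeeded; return its output
        # Inspector: read the error log and suggest a concrete fix.
        advice = chat("You are the inspector agent. Suggest a concrete fix.",
                      f"Code:\n{code}\n\nError log:\n{log}")
        # Programmer: revise the code per the inspector's suggestion and retry.
        code = chat("You are the programmer agent. Revise the code per the advice.",
                    f"Previous code:\n{code}\n\nInspector advice:\n{advice}")
    return log  # give up after exhausting the retry budget
```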

ADVERTISEMENT

Time: August 13, 2024 | 10:00 am PT / 1:00 pm ET

Hear from teams behind the AI developer cloud Lambda and the synthetic data platform Gretel about how their combined stack drives faster AI experimentation and innovation.

In this webinar, learn how Gretel and Lambda together unlock faster experimentation, so teams can easily vet approaches, fail fast, and be far more agile in delivering an LLM solution that works. We will use Gretel Navigator, the first compound AI system for synthetic data generation, to design and iterate on a task-specific dataset. Designing data from scratch and iterating on it are built into Navigator and into how users interact with it, creating a new paradigm for how AI/ML teams approach model development. Teams are no longer limited to experimenting with architectures, model configurations, and training parameters; they can quickly experiment with the data itself, and increasingly it is data experimentation that drives most innovation.

Emergence AI Proposes Agent-E: A Web Agent Achieving 73.2% Success Rate with a 20% Improvement in Autonomous Web Navigation

Researchers at Emergence AI introduced Agent-E, a novel web agent designed to overcome the shortcomings of existing systems. Agent-E’s hierarchical architecture divides task planning and execution into two distinct components: the planner agent and the browser navigation agent. This separation lets each component focus on its specific role, improving efficiency and performance. The planner agent decomposes tasks into sub-tasks, which are then executed by the browser navigation agent using advanced DOM distillation techniques.

The methodology of Agent-E involves several innovative steps to manage noisy and expansive web content effectively. The planner agent breaks down user tasks into smaller sub-tasks and assigns them to the browser navigation agent. This agent uses flexible DOM distillation techniques to select the most relevant DOM representation for each task, reducing noise and focusing on task-specific information. Agent-E employs change observation to monitor state changes during task execution, providing feedback that enhances the agent’s performance and accuracy.
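The structural sketch below mirrors that planner / browser-navigation split under stated assumptions: the plan(), distill_dom(), and act() stubs, the DOM view names, and the change-observation feedback are illustrative, not Agent-E’s actual interfaces.

```python
# Structural sketch of a hierarchical web agent in the spirit of Agent-E.
# The stubs, DOM view names, and feedback format are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class SubTask:
    description: str   # e.g. "type 'wireless mouse' into the search box"
    dom_view: str      # distilled DOM representation to use: "text_only", "input_fields", "all_fields"

def plan(user_task: str) -> list[SubTask]:
    """Planner agent: decompose the user task into ordered sub-tasks (LLM call, stubbed)."""
    raise NotImplementedError

def distill_dom(raw_dom: str, view: str) -> str:
    """Reduce the raw, noisy DOM to a task-relevant view (e.g. only interactive fields)."""
    raise NotImplementedError

def act(subtask: SubTask, distilled_dom: str) -> str:
    """Browser navigation agent: execute one sub-task and return the observed page change."""
    raise NotImplementedError

def run(user_task: str, get_dom) -> list[tuple[str, str]]:
    history = []
    for sub in plan(user_task):
        view = distill_dom(get_dom(), sub.dom_view)  # DOM distillation cuts noise per sub-task
        change = act(sub, view)                      # change observation: what the action altered
        history.append((sub.description, change))    # feedback available for later steps
    return history
```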

What if the Next Medical Breakthrough is Hidden in Plain Text? Meet NATURAL: A Pipeline for Causal Estimation from Unstructured Text Data in Hours, Not Years

Researchers from the University of Toronto, Vector Institute, and Meta AI introduced NATURAL, a novel family of causal effect estimators leveraging large language models (LLMs) to analyze unstructured text data. This method allows for extracting causal information from diverse sources such as social media posts, clinical reports, and patient forums. By automating data curation and leveraging the capabilities of LLMs, NATURAL provides a scalable solution for various applications.

NATURAL utilizes LLMs to process natural language text and estimate the conditional distributions of variables of interest. The process involves filtering relevant reports, extracting covariates and treatments, and using these to compute average treatment effects (ATEs). The method mimics traditional causal inference techniques but operates on unstructured data, making it a versatile and scalable solution.....
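As a toy illustration of only the final aggregation step, the snippet below computes an average treatment effect from records an LLM might have extracted from free-text reports; the record schema and the plain difference-in-means estimator are simplifying assumptions, since NATURAL’s estimators condition on the extracted covariates rather than ignoring them.

```python
# Toy illustration of the final aggregation step: computing an ATE from records
# that an LLM has extracted from unstructured reports. The record schema and the
# plain difference-in-means estimator are simplifying assumptions.
from dataclasses import dataclass

@dataclass
class ExtractedRecord:
    treated: bool      # did the report describe receiving the treatment?
    outcome: float     # extracted outcome (e.g., symptom improvement score)
    covariates: dict   # extracted covariates (age, condition, ...); unused by this toy estimator

def average_treatment_effect(records: list[ExtractedRecord]) -> float:
    treated = [r.outcome for r in records if r.treated]
    control = [r.outcome for r in records if not r.treated]
    return sum(treated) / len(treated) - sum(control) / len(control)

records = [
    ExtractedRecord(True, 0.8, {"age": 34}),
    ExtractedRecord(True, 0.6, {"age": 51}),
    ExtractedRecord(False, 0.4, {"age": 29}),
    ExtractedRecord(False, 0.5, {"age": 47}),
]
print(average_treatment_effect(records))  # prints roughly 0.25 for this toy data
```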

Upcoming AI Webinars

Here is a list of Upcoming AI Webinars from various AI and Data Companies

• July 31, 2024 (2 PM EST)
• Aug 8, 2024 (10:00 am PST)
• Aug 01, 2024 (9:00 AM PST)
• Aug 08, 2024 (9:00 AM PST)
• August 13, 2024 (10:00 am PT)