- AI Research Insights
- Posts
- What's included in this newsletter: AgileCoder, Qwen2-Math, LiveBench AI, KGLens and EXAONE 3.0
Hello, You!
It was another busy week with plenty of news and updates about artificial intelligence (AI) research and development. We have curated the top industry research updates especially for you. I hope you enjoy them, and make sure to share your opinions with us on social media.
In today's edition of AI Research/Dev News & Updates:
Software AI Engineering
Researchers at FPT Software AI Center Introduce AgileCoder: A Multi-Agent System for Generating Complex Software, Surpassing MetaGPT and ChatDev
In this work, a team of researchers from the FPT Software AI Center proposes AgileCoder, a novel framework that mimics the intricate software development process in the real world by drawing inspiration from Agile Methodology, the approach employed by roughly 70% of professional software development teams and one well suited to real-world projects. AgileCoder is built upon a key concept of Agile: software continually evolves over time, and thus development should be structured in the form of sprints (i.e., phases).
AgileCoder consists of multiple agents playing distinct roles: a Project Manager, a Scrum Master, a Developer, a Senior Developer, and a Tester. These agents work collaboratively across sprints to achieve user tasks in accordance with the Agile methodology. By adapting Agile workflows to a multi-agent framework, AgileCoder emphasizes dynamic adaptability and iterative development. Outputs and problems from previous sprints are inherited and refined in subsequent sprints, increasing the likelihood of success for final products.
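The sprint mechanics can be sketched in a few lines of Python. This is a hypothetical toy, not the AgileCoder implementation: the `run_sprint`/`agile_loop` helpers and the convention that tasks ending in `!` represent failed tests are our own illustration. What it shows is the core idea from the paper: each sprint's unresolved issues become the next sprint's backlog.

```python
def run_sprint(tasks):
    """One sprint: Developers implement every task; the Tester flags failures.
    Tasks ending in '!' stand in for work that fails testing this sprint."""
    artifacts = [f"code for {t.rstrip('!')}" for t in tasks]
    issues = [t.rstrip("!") for t in tasks if t.endswith("!")]
    return artifacts, issues

def agile_loop(backlog, max_sprints=5):
    """Run sprints until the Tester reports no issues; problems from each
    sprint are inherited and refined in the next one."""
    history, tasks = [], list(backlog)
    while tasks and len(history) < max_sprints:
        artifacts, issues = run_sprint(tasks)
        history.append((artifacts, issues))
        tasks = issues  # previous sprint's problems become the next backlog
    return history
```

For example, a backlog of `["login", "search!"]` needs two sprints: the first implements both tasks but carries `search` forward as an issue, and the second resolves it.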
➡️ Continue reading here!
Maths and AI
Qwen2-Math Released: A Comprehensive AI Suite Featuring Models Ranging from 1.5B to 72B Parameters, Transforming Mathematical Computation
The Qwen Team has recently released the Qwen2-Math series. This release, encompassing several model variants tailored for distinct applications, demonstrates the team's commitment to enhancing AI's proficiency in handling complex mathematical tasks. The Qwen2-Math series is a comprehensive set of models, each designed to cater to different computational needs.
✅ Qwen2-Math-72B
✅ Qwen2-Math-72B-Instruct
✅ Qwen2-Math-7B
✅ Qwen2-Math-7B-Instruct
✅ Qwen2-Math-1.5B
✅ Qwen2-Math-1.5B-Instruct
These models vary in complexity and instruction-following capabilities, providing users with options suited to their specific requirements. At the top of the range is Qwen2-Math-72B, a model that boasts an impressive 72 billion parameters. This variant is designed for highly complex mathematical computations and is suitable for tasks requiring deep learning and extensive data processing. The "Instruct" version of this model, Qwen2-Math-72B-Instruct, offers additional enhancements that allow it to follow user instructions more precisely.
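As a rough illustration of how a user might choose among the six variants, the helper below picks the largest model that fits a given parameter budget. The `pick_variant` function and the mapping are our own sketch; only the variant names come from the release announcement, and the exact model-hub identifiers may differ.

```python
# Variant names as listed in the release; (size in billions, instruct?) -> name.
QWEN2_MATH_VARIANTS = {
    (72, False): "Qwen2-Math-72B",
    (72, True): "Qwen2-Math-72B-Instruct",
    (7, False): "Qwen2-Math-7B",
    (7, True): "Qwen2-Math-7B-Instruct",
    (1.5, False): "Qwen2-Math-1.5B",
    (1.5, True): "Qwen2-Math-1.5B-Instruct",
}

def pick_variant(max_params_b, follow_instructions=True):
    """Return the largest variant whose parameter count (in billions)
    fits the given budget."""
    sizes = sorted({size for size, _ in QWEN2_MATH_VARIANTS}, reverse=True)
    for size in sizes:
        if size <= max_params_b:
            return QWEN2_MATH_VARIANTS[(size, follow_instructions)]
    raise ValueError("no Qwen2-Math variant fits the parameter budget")
```

So a budget of 10B parameters with instruction following would select Qwen2-Math-7B-Instruct, while a 2B budget falls back to the 1.5B models.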
➡️ Continue reading here!
Leaderboard
Abacus AI Introduces LiveBench AI: A Super Strong LLM Benchmark that Tests all the LLMs on Reasoning, Math, Coding and more
Abacus.AI, a prominent player in AI, has recently unveiled its latest innovation: LiveBench AI. This new tool is designed to enhance the development and deployment of AI models by providing real-time feedback and performance metrics, bridging the gap between AI model development and practical, real-world application. LiveBench AI meets the growing demand for efficient and effective AI model testing by offering developers and data scientists a platform with instant feedback on their models' performance. This is particularly valuable for teams working on large-scale AI projects, where iterative testing and improvement are essential for success.
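To make the per-category evaluation concrete, here is a generic aggregation sketch: given records of which questions each model answered correctly in each category (reasoning, math, coding, and so on), it computes per-model, per-category accuracy. This is our own illustrative code, not LiveBench's actual scoring pipeline.

```python
from collections import defaultdict

def category_scores(results):
    """results: iterable of (model, category, correct) records.
    Returns {(model, category): accuracy} -- an illustrative leaderboard cell."""
    tally = defaultdict(lambda: [0, 0])  # (model, category) -> [correct, total]
    for model, category, correct in results:
        tally[(model, category)][0] += int(correct)
        tally[(model, category)][1] += 1
    return {key: correct / total for key, (correct, total) in tally.items()}
```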
➡️ Continue reading here!
Recommended AI WEBINAR from Our Partner
Knowledge Graphs and AI
Apple Researchers Present KGLens: A Novel AI Method Tailored for Visualizing and Evaluating the Factual Knowledge Embedded in LLMs
Researchers from Apple introduced KGLens, an innovative knowledge probing framework developed to measure knowledge alignment between knowledge graphs (KGs) and LLMs and to identify LLMs' knowledge blind spots. The framework employs a Thompson sampling-inspired method with a parameterized knowledge graph (PKG) to probe LLMs efficiently. KGLens features a graph-guided question generator that converts KGs into natural language using GPT-4, producing two types of questions (fact-checking and fact-QA) to reduce answer ambiguity. Human evaluation shows that 97.7% of generated questions are sensible to annotators.
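A minimal sketch of the Thompson sampling idea behind such probing: keep a Beta distribution over each KG edge's error rate, sample from all of them, and spend the next question on the edge that currently looks most error-prone. The `thompson_probe` function and its interface are our own assumption, not Apple's implementation; it only illustrates why the method concentrates probes on an LLM's blind spots.

```python
import random

def thompson_probe(edges, probe, rounds=50, seed=0):
    """Toy Thompson sampling over KG edges. `probe(edge)` returns True if
    the LLM answers that edge's question correctly. Each edge keeps a
    Beta(alpha, beta) belief over its ERROR rate; alpha counts wrong answers."""
    rng = random.Random(seed)
    stats = {edge: [1.0, 1.0] for edge in edges}
    for _ in range(rounds):
        # Sample an error probability per edge; probe the most suspicious one.
        sampled = {e: rng.betavariate(a, b) for e, (a, b) in stats.items()}
        edge = max(sampled, key=sampled.get)
        if probe(edge):
            stats[edge][1] += 1.0  # correct -> error estimate drops
        else:
            stats[edge][0] += 1.0  # wrong -> evidence of a blind spot
    return stats
```

Running this with one edge the model always gets wrong quickly concentrates nearly all probes on that edge, which is the efficiency the framework is after.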
➡️ Continue reading here!
Small Language Model
EXAONE 3.0 Released: A 7.8B Open-Sourced State of the Art Language Model from LG AI Research
LG AI Research has recently announced the release of EXAONE 3.0. This third version in the series upgrades EXAONE's already impressive capabilities and, unlike its predecessors, is released as an open-source large language model, delivering strong results with 7.8B parameters. With EXAONE 3.0, LG AI Research is charting a new development direction that keeps it competitive with the latest technology trends. EXAONE 3.0 brings many new features and enhancements that set it apart from earlier versions. One of the most notable improvements is the increased processing power, allowing faster and more efficient data analysis. This enhancement is crucial for handling the massive datasets that modern AI systems must process to deliver accurate and reliable results. The increased computational capacity also enables EXAONE 3.0 to perform complex tasks more precisely, making it a valuable tool for various industries.
➡️ Continue reading here!