
Marktechpost Newsletter: Q-GaLore Released + European LLM Leaderboard + Patronus AI Introduces Lynx.....


Good morning, AI aficionados! Today, we delve into the latest innovations and breakthroughs shaping the future of artificial intelligence. The AI landscape is advancing rapidly, from pioneering research in machine learning to the transformative potential of large language models (LLMs). In this edition, we'll explore cutting-edge AI models and showcase research articles from the AI research community.

Stay curious and inspired!

— Marktechpost.com Team

Featured

Q-GaLore Released: A Memory-Efficient Training Approach for Pre-Training and Fine-Tuning Machine Learning Models

Researchers from the University of Texas at Austin, the University of Surrey, the University of Oxford, the California Institute of Technology, and Meta AI have introduced Q-GaLore to reduce memory consumption further and make LLM training more accessible. Q-GaLore combines quantization and low-rank projection to significantly enhance memory efficiency. This method builds on two key observations: first, the gradient subspace exhibits diverse properties, with some layers stabilizing early in training while others change frequently; second, the projection matrices are highly resilient to low-bit quantization. By leveraging these insights, Q-GaLore adaptively updates the gradient subspace based on convergence statistics, maintaining performance while reducing the number of SVD operations. The model weights are kept in INT8 format and the projection matrices in INT4 format, which conserves memory aggressively.
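The adaptive subspace update can be illustrated with a short sketch. This is not the paper's code; the function names and the drift criterion (subspace overlap against a tolerance) are illustrative assumptions, but the idea matches the description above: recompute the rank-r projection via SVD only when the dominant gradient subspace has drifted, otherwise reuse the cached projection.

```python
import numpy as np

def low_rank_projection(grad, rank):
    """Top-r left singular vectors of the gradient matrix (one SVD)."""
    u, _, _ = np.linalg.svd(grad, full_matrices=False)
    return u[:, :rank]

def maybe_update(proj, grad, rank, tol=0.9):
    """Lazily refresh the projection: keep the cached one while the
    dominant gradient subspace is essentially unchanged."""
    new = low_rank_projection(grad, rank)
    # Frobenius norm of proj.T @ new equals sqrt(rank) iff the subspaces coincide,
    # so this overlap score is 1.0 for an unchanged subspace.
    overlap = np.linalg.norm(proj.T @ new) / np.sqrt(rank)
    return proj if overlap > tol else new
```

Because stable layers keep returning the cached projection, the expensive SVD is skipped for them on most steps, which is where the reduction in SVD operations comes from.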

Q-GaLore employs two main modules: low-precision training with low-rank gradients, and lazy layer-wise subspace exploration. The model weights and the Adam optimizer states are kept in 8-bit precision, and the projection matrices are quantized to 4 bits. This approach reduces the memory required for gradient low-rank training by approximately 28.57%. Stochastic rounding maintains training stability and approximates the high-precision training trajectory: it preserves small gradient contributions in expectation, allowing the model to follow a high-precision training path using only low-precision weights, without maintaining high-precision parameters.
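The role of stochastic rounding can be seen in a minimal sketch (not Q-GaLore's implementation). A gradient update far smaller than one quantization step would vanish under round-to-nearest, but rounding up with probability proportional to the fractional part preserves it in expectation:

```python
import numpy as np

def stochastic_round(x, rng):
    """Round each element up or down at random so that E[round(x)] == x.
    Small updates below one quantization step survive in expectation."""
    floor = np.floor(x)
    frac = x - floor
    return floor + (rng.random(x.shape) < frac)

rng = np.random.default_rng(0)
x = np.full(100_000, 0.1)          # an update far below one integer step
rounded = stochastic_round(x, rng)  # each entry becomes 0.0 or 1.0
print(rounded.mean())               # close to 0.1: the update is kept on average
```

Round-to-nearest would map every entry to 0.0 and silently discard the update; the stochastic variant keeps its mean, which is why the low-precision trajectory can track the high-precision one.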

Advertisement

Date: July 18, 2024

Selecting the optimal embedding model for your specific use case is a challenge for many ML teams, but critical in ensuring a meaningful and accurate representation of your data. While popular models like CLIP perform well on standard benchmark datasets, their effectiveness on any unique datasets remains uncertain.

However, manually evaluating each model is time-consuming and error-prone. So how do you find the correct embedding model for YOUR data?

Enter TTI-Eval: Your go-to tool for benchmarking text-to-image embedding models on custom datasets.

✅ Compare multiple embedding models effortlessly

✅ Gain insights tailored to your specific use cases

✅ Join the repo's authors for a webinar this Thursday to dive into the art of embedding evaluation!

 Leaderboard

OpenGPT-X Team Publishes European LLM Leaderboard: Paving the Way for Advanced Multilingual Language Model Development and Evaluation

The release of the European LLM Leaderboard by the OpenGPT-X team marks a significant milestone in the development and evaluation of multilingual language models. The project, supported by TU Dresden and a consortium of ten partners from various sectors, aims to advance language models’ capabilities in handling multiple languages, thereby reducing digital language barriers and enhancing the versatility of AI applications across Europe.

Several benchmarks have been translated and employed in the project to assess the performance of multilingual LLMs:

ARC and GSM8K: Focus on general education and mathematics.

HellaSwag and TruthfulQA: Test the ability of models to provide plausible continuations and truthful answers.

MMLU: Provides a wide range of tasks to assess the models’ capabilities across different domains.

FLORES-200: Aimed at assessing machine translation skills.

Belebele: Focuses on understanding and answering questions in multiple languages.
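Most of the multiple-choice benchmarks above (ARC, HellaSwag, Belebele) are scored the same way: the model assigns a score, typically the log-likelihood of each candidate answer, and accuracy is the fraction of items where the highest-scoring candidate is the gold answer. The sketch below illustrates that loop with an invented toy scorer standing in for a real LLM:

```python
def accuracy(items, score):
    """items: (question, choices, gold_index) triples.
    score(question, choice) stands in for a model's log-likelihood."""
    correct = 0
    for question, choices, gold in items:
        pred = max(range(len(choices)), key=lambda i: score(question, choices[i]))
        correct += int(pred == gold)
    return correct / len(items)

# Toy stand-in scorer: prefers the longest choice. A real evaluation
# harness would query the language model here instead.
toy_score = lambda q, c: len(c)

items = [
    ("Q1", ["a", "the longest choice"], 1),
    ("Q2", ["another long answer", "b"], 0),
]
print(accuracy(items, toy_score))  # 1.0 on this toy data
```

Translated benchmarks such as those in the European LLM Leaderboard reuse exactly this harness logic; only the items change language.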

 Patronus AI

Patronus AI Introduces Lynx: A SOTA Hallucination Detection LLM that Outperforms GPT-4o and All State-of-the-Art LLMs on RAG Hallucination Tasks

Patronus AI has announced the release of Lynx, a cutting-edge hallucination detection model that promises to outperform existing solutions such as GPT-4, Claude-3-Sonnet, and other models used as judges in both closed- and open-source settings. This model, which marks a significant advancement in artificial intelligence, was introduced with the support of key integration partners, including Nvidia, MongoDB, and Nomic.

Hallucination in large language models (LLMs) refers to generating information that is unsupported by, or contradicts, the provided context. This poses serious risks in applications where accuracy is paramount, such as medical diagnosis or financial advising. Techniques like Retrieval Augmented Generation (RAG) aim to ground generations in retrieved context, but hallucinations still slip through. Lynx addresses these shortcomings with unprecedented accuracy.

One of Lynx’s key differentiators is its performance on the HaluBench, a comprehensive hallucination evaluation benchmark consisting of 15,000 samples from various real-world domains. Lynx has superior performance in detecting hallucinations across diverse fields, including medicine and finance. For instance, in the PubMedQA dataset, Lynx’s 70 billion parameter version was 8.3% more accurate than GPT-4 at identifying medical inaccuracies. This level of precision is critical in ensuring the reliability of AI-driven solutions in sensitive areas.
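To make the task concrete, the sketch below shows a crude lexical baseline for the same RAG faithfulness question Lynx answers with a fine-tuned LLM judge. This is emphatically not Lynx: it just flags an answer when too few of its content words appear in the retrieved context, the kind of weak baseline that model-based judges are built to beat.

```python
import re

def naive_hallucination_flag(context, answer, threshold=0.5):
    """Crude RAG faithfulness baseline (NOT the Lynx model): flag the
    answer if too few of its content words appear in the context."""
    tokenize = lambda s: set(re.findall(r"[a-z]{4,}", s.lower()))
    ctx, ans = tokenize(context), tokenize(answer)
    if not ans:
        return False
    support = len(ans & ctx) / len(ans)
    return support < threshold

ctx = "Aspirin reduces fever and mild pain in adults."
print(naive_hallucination_flag(ctx, "Aspirin reduces fever."))          # False
print(naive_hallucination_flag(ctx, "Aspirin completely cures cancer."))  # True
```

Lexical overlap fails on paraphrase and subtle contradiction, which is precisely why benchmarks like HaluBench evaluate judges on real-world domains such as medicine and finance.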

 ETH Zurich

ETH Zurich Researchers Introduced EventChat: A CRS Using ChatGPT as Its Core Language Model Enhancing Small and Medium Enterprises with Advanced Conversational Recommender Systems

Researchers from ETH Zurich have introduced EventChat, a CRS tailored for SMEs in the leisure industry. The system aims to balance cost-effectiveness with high-quality user interactions. EventChat utilizes ChatGPT as its core language model, integrating prompt-based learning techniques to minimize the need for extensive training data. This approach makes it accessible for smaller businesses by reducing implementation complexity and associated costs. EventChat’s key features include handling complex queries, providing tailored event recommendations, and addressing SMEs’ specific needs in delivering enhanced user experiences.

EventChat operates through a turn-based dialogue system where user inputs trigger specific actions such as search, recommendation, or targeted inquiries. The backend architecture combines relational and vector databases to curate relevant event information. Combining button-based interactions with conversational prompts, this hybrid approach ensures efficient resource use while maintaining high recommendation accuracy. Developed using the Flutter framework, EventChat’s frontend allows for customizable time intervals and user preferences, enhancing overall user experience and control. By including user-specific parameters directly in the chat, EventChat optimizes interaction efficiency and satisfaction.
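The turn-based routing described above can be sketched as follows. This is an illustrative toy, not EventChat's code: keyword rules stand in for the ChatGPT-based classification the system actually performs, and the action names are assumptions drawn from the article.

```python
def route_turn(utterance):
    """Map a user utterance to one of the actions described in the
    article: search, recommendation, or a targeted inquiry."""
    text = utterance.lower()
    if any(k in text for k in ("recommend", "suggest", "what should")):
        return "recommendation"
    if any(k in text for k in ("find", "search", "show me")):
        return "search"
    return "targeted_inquiry"  # fallback: ask the user a clarifying question

print(route_turn("Can you recommend a jazz concert?"))  # recommendation
print(route_turn("Find events this weekend"))           # search
print(route_turn("What time does it start?"))           # targeted_inquiry
```

In the real system each action would then hit the backend, for example querying the vector database for events before the LLM composes the reply; the router simply decides which of those paths a turn takes.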

Also, don’t forget to follow us on Twitter and join our 46k+ ML SubReddit, 26k+ AI Newsletter, Telegram Channel, and LinkedIn Group.

If you are interested in a promotional partnership (content/ad/newsletter), please fill out this form.