
⏰ AI Insights: NVIDIA Introduces MM-Embed & MBZUAI Releases Atlas-Chat (2B, 9B, and 27B)...

Newsletter Series by Marktechpost.com

Hi There,

Dive into the hottest AI breakthroughs of the week—handpicked just for you!

Super Important AI News 🔥 🔥 🔥

🎙️ Marktechpost just released its latest magazine report on one of the hottest topics of 2024: 'Small Language Models' (our next edition covers ‘Open Source AI’)

🎃 NVIDIA AI Introduces MM-Embed: The First Multimodal Retriever Achieving SOTA Results on the Multimodal M-BEIR Benchmark

Microsoft Researchers Introduce Magentic-One: A Modular Multi-Agent System Focused on Enhancing AI Adaptability and Task Completion Across Benchmark Tests

📢 MBZUAI Researchers Release Atlas-Chat (2B, 9B, and 27B): A Family of Open Models Instruction-Tuned for Darija (Moroccan Arabic)

🚨 Arcee AI Releases Arcee-VyLinh: A Powerful 3B Vietnamese Small Language Model

💡 Salesforce AI Research Introduces Moirai-MoE: A MoE Time Series Foundation Model that Achieves Token-Level Model Specialization Autonomously

Featured AI Research 🛡️🛡️🛡️

NVIDIA AI Introduces MM-Embed: The First Multimodal Retriever Achieving SOTA Results on the Multimodal M-BEIR Benchmark

Summary

NVIDIA researchers have introduced MM-Embed, the first multimodal retriever to achieve state-of-the-art (SOTA) results on the multimodal M-BEIR benchmark while also ranking among the top five retrievers on the text-only MTEB retrieval benchmark. MM-Embed aims to bridge the gap between multiple retrieval formats, allowing for a more fluid search experience that spans both text- and image-based content. The researchers fine-tuned MM-Embed, using a multimodal large language model (MLLM) as a bi-encoder retriever, across 16 retrieval tasks and ten datasets, demonstrating its versatility. Unlike existing retrievers, MM-Embed is not restricted to a single type of data; it supports complex user queries composed of both text and images. Furthermore, modality-aware hard negative mining plays a crucial role in improving MM-Embed’s retrieval quality by reducing the biases commonly seen in MLLMs…
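To make the bi-encoder setup described above more concrete, here is a minimal sketch (not NVIDIA's actual MM-Embed code; the module names, feature dimensions, and projection layers are hypothetical stand-ins for a fine-tuned MLLM encoder). It embeds mixed text/image inputs into a shared space, scores candidates by cosine similarity, and trains with an InfoNCE-style contrastive loss over mined hard negatives.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMultimodalBiEncoder(nn.Module):
    """Hypothetical stand-in for an MLLM bi-encoder: maps text and/or image
    features into one shared embedding space. MM-Embed fine-tunes a full
    multimodal LLM; this sketch only mirrors the structure."""
    def __init__(self, text_dim=768, image_dim=1024, embed_dim=512):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, embed_dim)
        self.image_proj = nn.Linear(image_dim, embed_dim)

    def forward(self, text_feats=None, image_feats=None):
        parts = []
        if text_feats is not None:
            parts.append(self.text_proj(text_feats))
        if image_feats is not None:
            parts.append(self.image_proj(image_feats))
        emb = torch.stack(parts, dim=0).mean(dim=0)  # fuse whichever modalities are present
        return F.normalize(emb, dim=-1)              # unit-norm so dot product = cosine similarity

def contrastive_loss(q_emb, pos_emb, hard_neg_emb, temperature=0.05):
    """InfoNCE-style loss: each query is scored against its positive and K mined
    hard negatives (modality-aware mining would pick negatives that exploit the
    model's bias toward one modality)."""
    candidates = torch.cat([pos_emb.unsqueeze(1), hard_neg_emb], dim=1)  # [B, 1+K, D]
    logits = torch.einsum("bd,bkd->bk", q_emb, candidates) / temperature
    targets = torch.zeros(q_emb.size(0), dtype=torch.long)               # positive sits at index 0
    return F.cross_entropy(logits, targets)

# Toy usage with random features standing in for real text/image encodings.
model = ToyMultimodalBiEncoder()
queries = model(text_feats=torch.randn(4, 768), image_feats=torch.randn(4, 1024))
positives = model(text_feats=torch.randn(4, 768))
hard_negs = model(text_feats=torch.randn(4 * 3, 768)).view(4, 3, -1)
print(contrastive_loss(queries, positives, hard_negs).item())
```

At query time the same encoder embeds both the (possibly mixed text+image) query and every candidate document, so retrieval reduces to a nearest-neighbor search in the shared embedding space.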

Other AI News 🎖️🎖️🎖️

🎙️ Researchers from Stanford and Cornell Introduce APRICOT: A Novel AI Approach that Merges LLM-based Bayesian Active Preference Learning with Constraint-Aware Task Planning

♦️ Is Your LLM Agent Enterprise-Ready? Salesforce AI Research Introduces CRMArena: A Novel AI Benchmark Designed to Evaluate AI Agents on Realistic Tasks Grounded in Professional Work Environments

🧩 Researchers from Bloomberg and UNC Chapel Hill Introduce M3DocRAG: A Novel Multi-Modal RAG Framework that Flexibly Accommodates Various Document Contexts

📢 RAGCache: Optimizing Retrieval-Augmented Generation with Dynamic Caching

🥁 📚 Cerebras Systems Revolutionizes AI Inference: 3x Faster with Llama 3.1-70B at 2,100 Tokens per Second