AI Dev and Research News
⏰ Featured AI: AI2 Releases Tülu 3 405B and Mistral Releases the Mistral-Small-24B-Instruct...

February 02, 2025

Sponsored by Plurai

Hi There,

Dive into the hottest AI breakthroughs of the week—handpicked just for you!

Super Important AI News 🔥 🔥 🔥

🧵🧵 Meet IntellAgent: An Open-Source Multi-Agent Framework to Evaluate Complex Conversational AI Systems (Promoted)

⭐ The Allen Institute for AI (AI2) Releases Tülu 3 405B: Scaling Open-Weight Post-Training with Reinforcement Learning from Verifiable Rewards (RLVR) to Surpass DeepSeek V3 and GPT-4o in Key Benchmarks

📢 Mistral AI Releases the Mistral-Small-24B-Instruct-2501: A Latency-Optimized 24B-Parameter Model Released Under the Apache 2.0 License

🚨 Beyond Open Source AI: How Bagel’s Cryptographic Architecture, Bakery Platform, and ZKLoRA Drive Sustainable AI Monetization (Promoted)

💡💡 Researchers from Stanford, UC Berkeley and ETH Zurich Introduce WARP: An Efficient Multi-Vector Retrieval Engine for Faster and Scalable Search

🧲 🧲 Memorization vs. Generalization: How Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) Shape Foundation Model Learning

Featured AI Update 🛡️🛡️🛡️

🔥 The Allen Institute for AI (AI2) Releases Tülu 3 405B: Scaling Open-Weight Post-Training with Reinforcement Learning from Verifiable Rewards (RLVR) to Surpass DeepSeek V3 and GPT-4o in Key Benchmarks

The AI2 team has released Tülu 3 405B, the first open-weight model to successfully apply a fully open post-training recipe at 405-billion-parameter scale. The model uses a novel reinforcement learning approach, Reinforcement Learning with Verifiable Rewards (RLVR), which significantly improves performance on specialized tasks by basing rewards on verifiable outcomes rather than subjective feedback. The research team deployed Tülu 3 405B using vLLM with 16-way tensor parallelism, optimizing computational efficiency across 256 GPUs running in parallel.
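To make the "verifiable outcomes" idea concrete, here is a minimal sketch of what a verifiable reward for a math prompt could look like. It is an illustration only, not AI2's implementation: the function names, the "last number in the completion" extraction rule, and the reward value of 1.0 are all assumptions made for this example.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the last number that appears in the completion, or None.

    Assumption for this sketch: the model's final answer is the last
    number it writes. Real verifiers use task-specific parsing.
    """
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return numbers[-1] if numbers else None


def verifiable_reward(completion: str, ground_truth: str,
                      reward_value: float = 1.0) -> float:
    """Give reward_value only if the extracted answer matches the known answer.

    No learned reward model or human rating is involved: the check is a
    deterministic comparison against a reference answer.
    """
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    try:
        return reward_value if float(answer) == float(ground_truth) else 0.0
    except ValueError:
        return 0.0


print(verifiable_reward("6 * 7 = 42, so the answer is 42.", "42"))
print(verifiable_reward("I believe the answer is 41.", "42"))
```

The point of the design is that the reward cannot be gamed by style: a fluent but wrong completion scores the same as no answer at all.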

The Tülu 3 post-training recipe follows a four-stage approach. It begins with data curation and synthesis, ensuring that core skills such as reasoning, mathematics, coding, and safety are well represented. The next stage is supervised fine-tuning (SFT), where the model is trained on carefully selected prompts and their completions. Direct Preference Optimization (DPO) is applied in the third stage, leveraging off-policy and on-policy preference data to refine responses. Finally, RLVR is introduced to enhance specialized skills, particularly in verifiable tasks such as mathematical problem-solving. One of the key differentiators of Tülu 3’s approach is its ability to scale effectively. The team found that using MATH data exclusively, rather than combining GSM8K and IFEval, yielded better results for larger models...
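For readers who want a feel for the third stage, the standard DPO objective for a single preference pair can be sketched in a few lines. This is the textbook loss, not AI2's training code; the function name, the scalar log-probability inputs, and the default `beta=0.1` are assumptions chosen for illustration.

```python
import math


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one (chosen, rejected) pair of completions.

    Each argument is the summed log-probability of a completion under
    either the policy being trained or the frozen reference model.
    """
    # How much more the policy likes each completion than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # Loss is -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: loss equals log(2).
print(dpo_loss(-5.0, -5.0, -5.0, -5.0))
# Policy has shifted toward the chosen completion: loss drops below log(2).
print(dpo_loss(-4.0, -6.0, -5.0, -5.0))
```

The loss falls as the policy raises the chosen completion's likelihood relative to the rejected one, measured against the reference model.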

Other AI News 🎖️🎖️🎖️

🚨 Meet IntellAgent: An Open-Source Multi-Agent Framework to Evaluate Complex Conversational AI Systems (Promoted)

🧿 Curiosity-Driven Reinforcement Learning from Human Feedback (CD-RLHF): An AI Framework that Mitigates the Diversity-Alignment Trade-off in Language Models

🧩 Intel Labs Explores Low-Rank Adapters and Neural Architecture Search for LLM Compression

📢 Google AI Introduces Parfait: A Privacy-First AI System for Secure Data Aggregation and Analytics

🚨 Marktechpost is inviting AI Companies/Startups/Groups to partner for its upcoming AI Magazines on ‘Open Source AI in Production’ and ‘Agentic AI’. (Editor’s Message)

Coding Tutorial 🎖️🎖️🎖️

🖥️ Creating an AI-Powered Tutor Using Vector Database and Groq for Retrieval-Augmented Generation (RAG): Step by Step Guide (Colab Notebook Included)

In this tutorial, we will create an AI-powered English tutor using RAG. The system combines a vector database (ChromaDB), which stores and retrieves relevant English language learning materials, with AI-powered text generation (Groq API), which produces structured and engaging lessons. The workflow includes extracting text from PDFs, storing knowledge in a vector database, retrieving relevant content, and generating detailed AI-powered lessons. The goal is to build an interactive English tutor that dynamically generates topic-based lessons while leveraging previously stored knowledge for improved accuracy and contextual relevance...
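The retrieve-then-generate flow can be sketched without any external services. The toy example below is not the tutorial's code: a bag-of-words similarity stands in for ChromaDB's embedding search, the final call to the Groq API is left out, and every function name and sample sentence is hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real system would use a vector model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k stored chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(topic: str, context_chunks: List[str]) -> str:
    """Assemble the prompt that would be sent to the text-generation model."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (f"Use the reference material below to write an English lesson "
            f"on '{topic}'.\n\nReference material:\n{context}\n")


# Hypothetical chunks standing in for text extracted from PDFs.
chunks = [
    "The present perfect tense links a past action to the present.",
    "Adjectives describe nouns and usually come before them.",
    "Use the past simple tense for finished actions at a stated time.",
]
top = retrieve("present perfect tense", chunks, k=1)
print(build_prompt("present perfect tense", top))
```

In the full pipeline, the prompt built in the last step would go to the language model, so the generated lesson is grounded in the retrieved material rather than in the model's memory alone.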