
⏰ AI Insights: NVIDIA AI Unveils Fugatto & Hymba 1.5B and Intel AI Research Releases FastDraft...

Hi there,

Dive into the hottest AI breakthroughs of the week—handpicked just for you!

Super Important AI News 🔥 🔥 🔥

NVIDIA Introduces Hymba 1.5B: A Hybrid Small Language Model Outperforming Llama 3.2 and SmolLM v2

📢 Neural Magic Releases 2:4 Sparse Llama 3.1 8B: Smaller Models for Efficient GPU Inference

🎃 OpenAI Researchers Propose a Multi-Step Reinforcement Learning Approach to Improve LLM Red Teaming

🚨 [Download] 2024 Gartner® Cool Vendors™ in AI Engineering Report (Promoted)

💡 CMU Researchers Propose XGrammar: An Open-Source Library for Efficient, Flexible, and Portable Structured Generation

🎃 NVIDIA AI Unveils Fugatto: A 2.5 Billion Parameter Audio Model that Generates Music, Voice, and Sound from Text and Audio Input

🔥 Intel AI Research Releases FastDraft: A Cost-Effective Method for Pre-Training and Aligning Draft Models with Any LLM for Speculative Decoding

Featured AI Research 🛡️🛡️🛡️

🔥 Intel AI Research Releases FastDraft: A Cost-Effective Method for Pre-Training and Aligning Draft Models with Any LLM for Speculative Decoding

Summary

Researchers at Intel Labs introduced FastDraft, an efficient framework for training and aligning draft models compatible with various target LLMs, including Phi-3-mini and Llama-3.1-8B. FastDraft stands out for its structured approach to pre-training and fine-tuning: pre-training processes datasets containing up to 10 billion tokens of natural language and code, while fine-tuning uses sequence-level knowledge distillation to improve draft-target alignment. Together, these stages help the draft models perform strongly across diverse tasks.
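To make the fine-tuning stage concrete, here is a minimal sketch of sequence-level knowledge distillation using PyTorch and Hugging Face transformers. The checkpoint names, prompt, and hyperparameters are assumptions for illustration only, not FastDraft's published training setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Checkpoint names, prompts, and hyperparameters are illustrative
# placeholders, not FastDraft's published recipe.
target_name = "microsoft/Phi-3-mini-4k-instruct"
draft_name = "path/to/small-draft-checkpoint"  # hypothetical draft model

# Draft and target must share a vocabulary, so one tokenizer serves both.
tokenizer = AutoTokenizer.from_pretrained(target_name)
target = AutoModelForCausalLM.from_pretrained(target_name, torch_dtype=torch.bfloat16)
draft = AutoModelForCausalLM.from_pretrained(draft_name)

# Step 1: the target generates continuations, producing a synthetic dataset.
prompts = ["Explain speculative decoding in one paragraph."]
inputs = tokenizer(prompts, return_tensors="pt")
with torch.no_grad():
    synthetic = target.generate(**inputs, max_new_tokens=128, do_sample=True)

# Step 2: sequence-level distillation reduces to ordinary causal-LM training
# on the target-generated sequences; labels are the input ids themselves.
optimizer = torch.optim.AdamW(draft.parameters(), lr=1e-5)
optimizer.zero_grad()
out = draft(input_ids=synthetic, labels=synthetic.clone())
out.loss.backward()
optimizer.step()
```

In practice, such a pipeline would mask prompt tokens in the labels and iterate over many batches; the single step above only shows the shape of the loop.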

FastDraft imposes minimal architectural requirements on the draft model, allowing flexibility in model design as long as the draft shares the target LLM’s vocabulary. During pre-training, the draft model predicts the next token in a sequence, using datasets like FineWeb for natural language and The Stack v2 for code. The alignment phase employs synthetic datasets generated by the target model, refining the draft model’s ability to mimic the target’s behavior. These techniques ensure that the draft model maintains high efficiency and accuracy…
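On the inference side, a draft aligned this way can plug directly into speculative decoding. The sketch below uses the assisted-generation path in Hugging Face transformers; the model pairing and draft path are assumptions, and any draft sharing the target's vocabulary would do.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative pairing: any draft that shares the target's vocabulary works.
target_name = "meta-llama/Llama-3.1-8B"
tokenizer = AutoTokenizer.from_pretrained(target_name)
target = AutoModelForCausalLM.from_pretrained(target_name)
draft = AutoModelForCausalLM.from_pretrained("path/to/aligned-draft")  # hypothetical

inputs = tokenizer("Write a haiku about GPUs.", return_tensors="pt")

# The draft proposes a block of tokens; the target verifies them in a single
# forward pass and keeps only the accepted prefix, so the output matches what
# the target alone would produce, just faster.
outputs = target.generate(**inputs, assistant_model=draft, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the target verifies every proposed token, the speedup depends on how often the draft's proposals are accepted, which is exactly what the alignment phase improves.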

Other AI News 🎖️🎖️🎖️

🧵🧵 LTX-Video: A Groundbreaking Open-Source Model for Real-Time Video Generation with Day-One Native Support in ComfyUI, Empowering Innovators to Transform Content Creation

♦️ Researchers from the University of Maryland and Adobe Introduce DynaSaur: The LLM Agent that Grows Smarter by Writing its Own Functions

🥁 📚 Researchers from MBZUAI and CMU Introduce Bi-Mamba: A Scalable and Efficient 1-bit Mamba Architecture Designed for Large Language Models in Multiple Sizes (780M, 1.3B, and 2.7B Parameters)

🎃 Hugging Face Releases Observers: An Open-Source Python Library that Provides Comprehensive Observability for Generative AI APIs
