
⏰ Featured AI: Google DeepMind Introduces AlphaGeometry2 and Kyutai Releases Hibiki…

Hi There,

Dive into the hottest AI breakthroughs of the week—handpicked just for you!

Super Important AI News 🔥 🔥 🔥

📢 Meta AI Introduces ParetoQ: A Unified Machine Learning Framework for Sub-4-Bit Quantization in Large Language Models

🧵🧵 Meet IntellAgent: An Open-Source Multi-Agent Framework to Evaluate Complex Conversational AI Systems (Promoted)

💡💡 Google DeepMind Introduces AlphaGeometry2: A Significant Upgrade to AlphaGeometry, Surpassing the Average Gold Medalist in Solving Olympiad Geometry Problems

🧲🧲 Microsoft AI Researchers Release LLaVA-Rad: A Lightweight Open-Source Foundation Model for Advanced Clinical Radiology Report Generation

Featured AI Update 🛡️🛡️🛡️

🔥 Kyutai Releases Hibiki: A 2.7B-Parameter Real-Time Speech-to-Speech and Speech-to-Text Translation Model with Near-Human Quality and Voice Transfer

Kyutai has developed Hibiki, a 2.7-billion-parameter decoder-only model for real-time speech-to-speech (S2ST) and speech-to-text (S2TT) translation. Operating at a 12.5 Hz frame rate and a 2.2 kbps bitrate, Hibiki currently supports French-to-English translation and is designed to preserve the speaker's voice characteristics in the translated output. A distilled variant, Hibiki-M (1.7B parameters), is optimized for real-time performance on smartphones, making on-device translation more accessible.

Hibiki demonstrates strong translation quality and speaker fidelity. It achieves an ASR-BLEU score of 30.5, surpassing existing baselines, including offline models. Human evaluators rate its naturalness at 3.73/5, approaching the 4.12/5 scored by professional human interpreters. It also performs well on speaker similarity, scoring 0.52 versus 0.43 for Seamless. Against both Seamless and StreamSpeech, Hibiki consistently delivers higher translation quality and better voice transfer while maintaining competitive latency. The distilled Hibiki-M variant, though slightly weaker in speaker similarity, remains effective for real-time on-device use.

Other AI News 🎖️🎖️🎖️

🚨 Meet IntellAgent: An Open-Source Multi-Agent Framework to Evaluate Complex Conversational AI Systems (Promoted)

🧿 Meta AI Introduces Brain2Qwerty: A New Deep Learning Model for Decoding Sentences from Brain Activity with EEG or MEG while Participants Typed Briefly Memorized Sentences on a QWERTY Keyboard

🧩 This AI Paper Introduces MaAS (Multi-Agent Architecture Search): A New Machine Learning Framework that Optimizes Multi-Agent Systems

📢 ChunkKV: Optimizing KV Cache Compression for Efficient Long-Context Inference in LLMs

Coding Tutorial 👩🏼‍💻👩🏼‍💻

In this tutorial, we demonstrate the workflow for fine-tuning Mistral 7B using QLoRA with Axolotl, showing how to manage limited GPU resources while customizing the model for new tasks. We’ll install Axolotl, create a small example dataset, configure the LoRA-specific hyperparameters, run the fine-tuning process, and test the resulting model’s performance.

Step 1: Prepare the Environment and Install Axolotl

# 1. Check GPU availability
!nvidia-smi


# 2. Install git-lfs (for handling large model files)
!sudo apt-get -y install git-lfs
!git lfs install


# 3. Clone Axolotl and install from source
!git clone https://github.com/OpenAccess-AI-Collective/axolotl.git
%cd axolotl
!pip install -e .


# (Optional) If you need a specific PyTorch version, install it BEFORE Axolotl:
# !pip install torch==2.0.1+cu118 --extra-index-url https://download.pytorch.org/whl/cu118


# Return to /content directory
%cd /content
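
Step 2: Create a Small Example Dataset

Axolotl can read instruction-tuning data in the Alpaca-style format (instruction/input/output fields). The sketch below writes a tiny toy dataset to /content/data/train.jsonl; the rows are placeholders you would replace with real task data, and the path is an assumption carried through the later steps.

# Create a tiny Alpaca-style dataset (placeholder rows, replace with your own)
import json, os

os.makedirs("/content/data", exist_ok=True)
samples = [
    {"instruction": "Summarize the sentence.",
     "input": "Axolotl streamlines fine-tuning of open LLMs.",
     "output": "Axolotl makes LLM fine-tuning easier."},
    {"instruction": "Translate to French.",
     "input": "Good morning.",
     "output": "Bonjour."},
]
with open("/content/data/train.jsonl", "w") as f:
    for row in samples:
        f.write(json.dumps(row) + "\n")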
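
Step 3: Write the QLoRA Configuration

Axolotl is driven by a single YAML config. The values below are illustrative assumptions rather than tuned settings: 4-bit loading plus a LoRA adapter (i.e., QLoRA), rank 16, and a short sequence length to stay within a modest GPU budget. Note that downloading mistralai/Mistral-7B-v0.1 may require a Hugging Face token if the weights are gated for your account.

%%writefile /content/qlora-mistral.yml
# Minimal QLoRA config for Axolotl (illustrative hyperparameters)
base_model: mistralai/Mistral-7B-v0.1
load_in_4bit: true
adapter: qlora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
datasets:
  - path: /content/data/train.jsonl
    type: alpaca
val_set_size: 0
sequence_len: 1024
micro_batch_size: 1
gradient_accumulation_steps: 4
num_epochs: 3
learning_rate: 0.0002
optimizer: paged_adamw_8bit
lr_scheduler: cosine
bf16: auto
gradient_checkpointing: true
output_dir: /content/qlora-out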
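
Step 4: Run the Fine-Tuning

Axolotl's trainer is launched through accelerate. With the settings above the run should fit on a single consumer GPU, though actual memory use depends on sequence length and batch size.

# Launch QLoRA fine-tuning with the config written in Step 3
!accelerate launch -m axolotl.cli.train /content/qlora-mistral.yml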
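
Step 5: Test the Resulting Model

QLoRA saves only a small LoRA adapter, so to try the model we reload the 4-bit base weights and attach the adapter with peft. This is a minimal sketch assuming the adapter was written to /content/qlora-out as configured above; for best results the prompt should follow the same Alpaca template used during training.

# Reload the 4-bit base model and attach the trained LoRA adapter
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_compute_dtype=torch.float16)
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
model = PeftModel.from_pretrained(base, "/content/qlora-out")

# Generate from a prompt resembling the training data
prompt = "Summarize the sentence.\nAxolotl streamlines fine-tuning of open LLMs.\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))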