AI Research/Dev Super Interesting News: Jina-ColBERT-v2 Released, NVEagle Released by NVIDIA, and many more....
Upcoming Live Session: "Building Performant AI Applications with NVIDIA NIMs and Haystack"
Newsletter Series by Marktechpost.com
Hi There…
It was another busy week with plenty of news and updates about artificial intelligence (AI) research and development. We have curated the top industry research updates especially for you. We hope you enjoy them, and be sure to share your opinions with us on social media.
Researchers from the University of Texas at Austin and Jina AI GmbH have introduced Jina-ColBERT-v2, an advanced version of the ColBERT model designed to address the shortcomings of current retrieval methods. The new model incorporates several significant improvements, particularly in handling multilingual data effectively. The research team focused on enhancing the architecture and training pipeline of the ColBERT model. To improve inference efficiency, their approach uses a modified XLM-RoBERTa backbone optimized with flash attention and rotary positional embeddings. The training process is divided into two stages: an initial large-scale contrastive tuning phase and a more targeted fine-tuning phase with supervised distillation. These improvements allow Jina-ColBERT-v2 to reduce storage requirements by up to 50% compared to its predecessors while still delivering strong performance across various English and multilingual retrieval tasks.
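For a concrete picture of the two training stages, here is a minimal sketch (not the authors' code): stage one as an InfoNCE-style contrastive objective over one positive and several negative passages, and stage two as a KL-based distillation of a stronger teacher ranker's scores. The score shapes and temperature values are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(pos_scores, neg_scores, temperature=0.05):
    """Stage 1 (illustrative): push the positive passage's relevance score above the negatives'.
    pos_scores: (batch,), neg_scores: (batch, num_negatives) -- any relevance score,
    e.g. late-interaction MaxSim scores."""
    logits = torch.cat([pos_scores.unsqueeze(-1), neg_scores], dim=-1) / temperature
    labels = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)  # positive at index 0
    return F.cross_entropy(logits, labels)

def distillation_loss(student_scores, teacher_scores, temperature=1.0):
    """Stage 2 (illustrative): supervised distillation -- align the student's score
    distribution over candidate passages with a teacher's distribution."""
    return F.kl_div(
        F.log_softmax(student_scores / temperature, dim=-1),
        F.softmax(teacher_scores / temperature, dim=-1),
        reduction="batchmean",
    )
```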
The technology behind Jina-ColBERT-v2 blends several cutting-edge techniques to enhance efficiency and effectiveness in information retrieval. One key innovation is the use of multiple linear projection heads during training, allowing the model to choose different token embedding sizes at inference time with minimal performance loss. This flexibility is achieved through Matryoshka Representation Loss, which enables the model to maintain performance even when the dimensionality of the token embeddings is reduced. The model's backbone, Jina-XLM-RoBERTa, incorporates flash attention mechanisms and rotary positional embeddings, enhancing its performance during inference. These technological advancements improve the model's ability to handle multilingual data and make it more efficient in storage and computation.
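The inference-time flexibility is easier to see in code. The sketch below uses placeholder widths (128 and 64 are assumptions, not the model's configuration) to show ColBERT-style late interaction (MaxSim) scoring and how Matryoshka-trained token embeddings can be truncated to a smaller dimension and re-normalized before scoring.

```python
import torch
import torch.nn.functional as F

def truncate_and_normalize(token_embs, dim):
    """Keep only the first `dim` components of each token embedding, then re-normalize."""
    return F.normalize(token_embs[..., :dim], dim=-1)

def maxsim_score(query_tokens, doc_tokens):
    """ColBERT late interaction: for each query token, take its maximum similarity
    over all document tokens, then sum across query tokens."""
    # query_tokens: (num_q, dim), doc_tokens: (num_d, dim), both L2-normalized
    sim = query_tokens @ doc_tokens.T            # (num_q, num_d) cosine similarities
    return sim.max(dim=-1).values.sum()

# Usage with random stand-in embeddings at full (128) and reduced (64) width:
q = F.normalize(torch.randn(32, 128), dim=-1)
d = F.normalize(torch.randn(180, 128), dim=-1)
full_score = maxsim_score(q, d)
small_score = maxsim_score(truncate_and_normalize(q, 64), truncate_and_normalize(d, 64))
```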
⚡️ Continue reading here!
Upcoming Webinar
September 04, 2024, 8 am PST
Researchers from NVIDIA, Georgia Tech, UMD, and HKPU have developed the Eagle family of MLLMs. This new approach systematically explores the design space of MLLMs by benchmarking various vision encoders, experimenting with different fusion strategies, and progressively identifying optimal combinations of vision experts. The researchers found that simply concatenating visual tokens from complementary vision encoders was as effective as more complex mixing architectures, which simplifies the design process while maintaining high performance. They also introduced a Pre-Alignment stage that aligns non-text-aligned vision experts with the language model before integrating them, improving model coherence and performance.
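As a rough sketch of this fusion idea (the expert list, token counts, and dimensions below are placeholders, not the Eagle models' actual settings), the snippet concatenates per-token features from several vision experts along the channel dimension and projects them into the language model's embedding space with a small MLP.

```python
import torch
import torch.nn as nn

class ConcatVisionFusion(nn.Module):
    def __init__(self, expert_dims, llm_dim):
        super().__init__()
        # expert_dims: output channel width of each vision expert (assumed values)
        self.proj = nn.Sequential(
            nn.Linear(sum(expert_dims), llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, expert_tokens):
        # expert_tokens: list of (batch, num_tokens, dim_i) tensors, one per expert,
        # assumed to be resampled to the same number of tokens per image.
        fused = torch.cat(expert_tokens, dim=-1)   # (batch, num_tokens, sum(expert_dims))
        return self.proj(fused)                    # (batch, num_tokens, llm_dim)

# Example: three hypothetical vision experts with different feature widths.
fusion = ConcatVisionFusion(expert_dims=[1024, 768, 512], llm_dim=4096)
tokens = [torch.randn(2, 256, d) for d in (1024, 768, 512)]
llm_inputs = fusion(tokens)   # ready to be prepended to the text token embeddings
```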
The Eagle family of models, also known as NVEagle, includes several variants tailored to different tasks and requirements. The models come in three main versions: Eagle-X5-7B, Eagle-X5-13B, and Eagle-X5-13B-Chat. The 7B and 13B models are designed for general-purpose vision-language tasks, with the 13B variant offering enhanced capabilities thanks to its larger parameter count. The 13B-Chat model is fine-tuned specifically for conversational AI, making it well suited to applications that require nuanced understanding and interaction based on visual inputs....
⚡️ Continue reading here!
Researchers from Peking University, the National Key Laboratory of General Artificial Intelligence (BIGAI), and Meituan introduced a new architecture called ReMamba, designed to enhance the long-context processing capabilities of the existing Mamba architecture. While efficient for short-context tasks, Mamba shows a significant performance drop when dealing with longer sequences. The researchers aimed to overcome this limitation by implementing a selective compression technique within a two-stage re-forward process. This approach allows ReMamba to retain critical information from long sequences without significantly increasing computational overhead, thereby improving the model's overall performance.
ReMamba operates through a carefully designed two-stage process. In the first stage, the model employs three feed-forward networks to assess the significance of hidden states from the final layer of the Mamba model. These hidden states are then selectively compressed based on their importance scores, which are calculated using a cosine similarity measure. The compression reduces the required state updates, effectively condensing the information while minimizing degradation. In the second stage, ReMamba integrates these compressed hidden states into the input context, using a selective adaptation mechanism that allows the model to maintain a more coherent understanding of the entire text sequence. This method incurs only a minimal additional computational cost, making it a practical solution for enhancing long-context performance....
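Below is a minimal sketch of the first-stage selection step, assuming placeholder module names, dimensions, and keep ratio (this is not the ReMamba implementation): small feed-forward projections produce a summary query and keys, cosine similarity scores each hidden state's importance, and only the top-k states are kept and transformed before the second forward pass.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelectiveCompressor(nn.Module):
    def __init__(self, dim, keep_ratio=0.25):
        super().__init__()
        self.keep_ratio = keep_ratio
        # Three small feed-forward projections: summary query, keys, and a transform
        # for the hidden states that survive compression (names are assumptions).
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)

    def forward(self, hidden_states):
        # hidden_states: (seq_len, dim) final-layer states from the first Mamba pass
        summary = self.q_proj(hidden_states[-1])                           # use last state as query
        keys = self.k_proj(hidden_states)                                  # (seq_len, dim)
        scores = F.cosine_similarity(keys, summary.unsqueeze(0), dim=-1)   # (seq_len,) importance
        k = max(1, int(self.keep_ratio * hidden_states.size(0)))
        top_idx = scores.topk(k).indices.sort().values                     # keep original order
        return self.v_proj(hidden_states[top_idx])                         # compressed states

compressor = SelectiveCompressor(dim=2048, keep_ratio=0.25)
compressed = compressor(torch.randn(4096, 2048))   # 4096 states -> 1024 kept
```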
⚡️ Continue reading here!
Trending Feeds…
⚡️ Re-LAION 5B Dataset Released: Improving Safety and Transparency in Web-Scale Datasets for Foundation Model Research Through Rigorous Content Filtering [Tweet]
⚡️ Google DeepMind Researchers Propose GenRM: Training Verifiers with Next-Token Prediction to Leverage the Text Generation Capabilities of LLMs [Tweet]
⚡️ AnyGraph: An Effective and Efficient Graph Foundation Model Designed to Address the Multifaceted Challenges of Structure and Feature Heterogeneity Across Diverse Graph Datasets [Tweet]
⚡️ SynDL: A Synthetic Test Collection Utilizing Large Language Models to Revolutionize Large-Scale Information Retrieval Evaluation and Relevance Assessment [Tweet]
⚡️ 👨‍⚕️👩‍⚕️🤖 How Close Are We to LLMs as Clinical Assistants? [Tweet]
Wanna get in front of 1 Million+ Data Scientists, developers, AI engineers, CTOs???
Sponsor a newsletter or social post