AI Research/Dev Insights: Writer Releases Palmyra-Med and Palmyra-Fin Models, BRAG and Whisper-Medusa Released, …
Featured Research
The Writer team has developed two new domain-specific models: Palmyra-Med and Palmyra-Fin. Palmyra-Med is designed for medical applications, while Palmyra-Fin targets financial tasks. Both belong to Writer’s suite of language models and are engineered for exceptional performance in their respective domains. Palmyra-Med-70B is distinguished by its high accuracy on medical benchmarks, achieving an average score of 85.9%. This surpasses competitors such as Med-PaLM-2, with particularly strong results in clinical knowledge, genetics, and biomedical research. It is also notably cost-efficient: at $10 per million output tokens, it is substantially cheaper than the $60 charged for models like GPT-4.
Palmyra-Fin-70B, designed for financial applications, has demonstrated outstanding results. It passed the CFA Level III exam with a score of 73%, outperforming general-purpose models like GPT-4, which scored only 33%. Furthermore, in the long-fin-eval benchmark, Palmyra-Fin-70B outperformed other models, including Claude 3.5 Sonnet and Mixtral-8x7b. The model excels in financial trend analysis, investment evaluations, and risk assessments, showcasing its ability to handle complex financial data with precision.
Palmyra-Med-70B uses advanced techniques to achieve its high benchmark scores. It integrates a specialized dataset and fine-tuning methodologies, including Direct Preference Optimization (DPO), to enhance its performance in medical tasks. The model’s accuracy in various benchmarks—such as 90.9% in MMLU Clinical Knowledge and 83.7% in MMLU Anatomy—demonstrates its deep understanding of clinical procedures and human anatomy. It scores 94.0% and 80% in genetics and biomedical research, respectively, underscoring its ability to interpret complex medical data and assist in research.
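The Direct Preference Optimization step mentioned above can be sketched on toy numbers. The snippet below is a minimal, illustrative implementation of the per-pair DPO loss; the log-probability values are made up and are not from Palmyra-Med’s training.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the summed log-probability of a response under the
    trainable policy or the frozen reference model. beta controls how far
    the policy may drift from the reference.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log sigmoid(margin): small when the policy prefers the chosen answer
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Toy log-probabilities (hypothetical values):
loss_good = dpo_loss(-12.0, -20.0, -15.0, -15.0)  # policy prefers chosen
loss_bad = dpo_loss(-20.0, -12.0, -15.0, -15.0)   # policy prefers rejected
```

Minimizing this loss pushes the policy to assign relatively higher probability to the preferred (e.g., clinically correct) response than the reference model does, which is how preference data sharpens a medical model without a separate reward model.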
Editor’s Picks…
BRAG is a series of high-performance Retrieval Augmented Generation (RAG) models developed by Maximalists AI Researcher. The BRAG models are a family of small language models (SLMs) designed to offer cost-effective, high-performance alternatives in AI-driven language processing. These models have been trained at an impressively low cost of under $25 each, positioning them as efficient and economical solutions in artificial intelligence.
The BRAG models were created in response to the need for efficient and high-performing language models that do not require the extensive computational resources typically associated with large-scale models like those from Nvidia and OpenAI. The primary motivation behind BRAG was to develop a series of models that could match or exceed the performance of leading models such as Cohere’s Command R+, Qwen2, Llama3.1, and Llama3 Instruct while keeping the training costs minimal.
The BRAG series includes four models:
✅ BRAG-Qwen2-7b-v0.1
✅ BRAG-Llama-3.1-8b-v0.1
✅ BRAG-Llama-3-8b-v0.1
✅ BRAG-Qwen2-1.5b-v0.1
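The retrieve-then-generate pattern these RAG-tuned models serve can be sketched in a few lines. This is a toy bag-of-words retriever for illustration only, not BRAG’s actual pipeline or training setup.

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> Counter:
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the query; real systems use dense embeddings.
    q = tokenize(query)
    return sorted(docs, key=lambda d: cosine(q, tokenize(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # The retrieved passages are stuffed into the model's context window.
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "BRAG models are small language models tuned for retrieval augmented generation.",
    "Whisper-Medusa speeds up automatic speech recognition.",
    "Palmyra-Fin passed the CFA Level III exam.",
]
prompt = build_prompt("What are BRAG models?", docs)
```

The generation step is where a small RAG-tuned model earns its keep: it only has to ground answers in the retrieved context, not memorize the corpus, which is why SLMs can compete here at a fraction of the training cost.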
Israeli AI startup aiOla has unveiled a groundbreaking innovation in speech recognition with the launch of Whisper-Medusa. This new model, which builds upon OpenAI’s Whisper, has achieved a remarkable 50% increase in processing speed, significantly advancing automatic speech recognition (ASR). aiOla’s Whisper-Medusa incorporates a novel “multi-head attention” architecture that allows for the simultaneous prediction of multiple tokens. This development promises to revolutionize how AI systems translate and understand speech.
The introduction of Whisper-Medusa represents a significant leap forward from the widely used Whisper model developed by OpenAI. While Whisper has set the standard in the industry with its ability to process complex speech, including various languages and accents, in near real-time, Whisper-Medusa takes this capability a step further. The key to this enhancement lies in its multi-head attention mechanism; this enables the model to predict ten tokens at each pass instead of the standard one. This architectural change results in a 50% increase in speech prediction speed and generation runtime without compromising accuracy.
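The multi-token-per-pass idea can be illustrated with a toy draft-and-verify loop. This is my own sketch of Medusa-style decoding, not aiOla’s implementation: extra prediction heads draft several future tokens, and the base model’s agreement decides how many are kept.

```python
def base_next_token(prefix: tuple) -> str:
    # Stand-in for the full model's next-token prediction (deterministic toy).
    vocab = ["the", "cat", "sat", "on", "mat", "<eos>"]
    return vocab[len(prefix) % len(vocab)]

def draft_k_tokens(prefix: tuple, k: int) -> list:
    # Stand-in for K extra prediction heads: propose K future tokens in one pass.
    out, cur = [], tuple(prefix)
    for _ in range(k):
        tok = base_next_token(cur)
        out.append(tok)
        cur = cur + (tok,)
    return out

def medusa_decode(max_len=6, k=3):
    seq, passes = (), 0
    while len(seq) < max_len:
        passes += 1
        draft = draft_k_tokens(seq, k)
        # Verify each drafted token; keep the agreeing prefix, discard the rest.
        for tok in draft:
            if tok == base_next_token(seq) and len(seq) < max_len:
                seq = seq + (tok,)
            else:
                break
    return seq, passes
```

In this toy the drafts always verify, so six tokens take two passes instead of six. In a real system the heads are cheaper approximations and some drafts are rejected, but each pass still commits more than one token on average, which is the source of the speedup.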
UPCOMING WEBINAR
Sponsored
Free AI Webinar: ‘Rapid LLM Experimentation with Gretel and Lambda’
Time: August 13, 2024 | 10:00 am PT / 1:00 pm ET
Hear from teams behind the AI developer cloud Lambda and the synthetic data platform Gretel about how their combined stack drives faster AI experimentation and innovation.
In this webinar, learn how Gretel and Lambda together unlock faster experimentation so teams can easily vet approaches, fail fast, and be much more agile in delivering an LLM solution that works. We will use Gretel Navigator, the first compound AI system for synthetic data generation, to design and iterate on a task-specific dataset. Designing data from scratch and iterating on it is built into Navigator and into how users interact with it, creating a new paradigm for how AI/ML teams approach overall model development. Teams are no longer limited to experimenting with architectures, model configurations, and training parameters. They can quickly experiment with the data itself, and increasingly it is data experimentation that drives most innovation.
Intel Labs Introduces RAG Foundry: An Open-Source Python Framework for Augmenting Large Language Models (LLMs) for RAG Use Cases 👏 👏
Intel Labs introduces RAG Foundry, a flexible, extensible open-source framework for comprehensive RAG system development and experimentation. RAG Foundry addresses the challenges inherent in Retrieval-Augmented Generation (RAG) systems by integrating data creation, training, inference, and evaluation into a unified workflow. It enables rapid prototyping, dataset generation, and model training using specialized knowledge sources. Its modular structure, controlled by configuration files, ensures inter-module compatibility and supports isolated experimentation. RAG Foundry’s customizable design facilitates thorough experimentation across various RAG aspects, including data selection, retrieval, and prompt design.
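The configuration-driven, modular design described above can be sketched as a small dispatcher. The module names and config keys below are hypothetical illustrations of the pattern, not RAG Foundry’s actual API.

```python
# Toy config-driven pipeline in the spirit of a modular RAG framework.
# Module names and config keys are assumptions, not RAG Foundry's real schema.

def retrieve_step(state, cfg):
    # Placeholder retrieval: a real module would query a vector store.
    state["docs"] = [f"doc-{i}" for i in range(cfg.get("top_k", 3))]
    return state

def prompt_step(state, cfg):
    state["prompt"] = cfg["template"].format(
        context=" ".join(state["docs"]), question=state["question"])
    return state

MODULES = {"retrieval": retrieve_step, "prompting": prompt_step}

def run_pipeline(config, question):
    state = {"question": question}
    for step in config["steps"]:   # order and settings come from the config file
        state = MODULES[step["module"]](state, step)
    return state

config = {
    "steps": [
        {"module": "retrieval", "top_k": 2},
        {"module": "prompting",
         "template": "Context: {context}\nQ: {question}\nA:"},
    ]
}
result = run_pipeline(config, "What does RAG Foundry do?")
```

Because each stage only reads its own config entry and the shared state, modules can be swapped or tested in isolation, which is the point of a config-controlled modular workflow.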
Researchers at Meta FAIR have introduced a novel approach called the “Self-Taught Evaluator.” This method eliminates the need for human annotations by using synthetically generated data for training. The process begins with a seed model, which produces contrasting synthetic preference pairs. The model then evaluates these pairs and improves iteratively, using its judgments to enhance its performance in subsequent iterations. This approach leverages the model’s capability to generate and evaluate data, significantly reducing dependency on human-generated annotations.
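The generate-judge-retrain loop can be sketched with toy stand-ins. Everything below (the seed model, the length-based judge) is an illustration of the data flow, not Meta FAIR’s actual method or models.

```python
def seed_model(instruction: str) -> tuple[str, str]:
    # Produce a contrasting synthetic pair: a full answer and a degraded variant.
    good = f"Step-by-step answer to '{instruction}' with justification."
    bad = good.split(" with ")[0]          # truncated, weaker variant
    return good, bad

def evaluate(judge, response_a: str, response_b: str) -> str:
    # The current model acts as judge; this toy judge scores responses by length.
    return "a" if judge(response_a) >= judge(response_b) else "b"

def self_taught_loop(instructions, iterations=3):
    judge = len          # iteration 0: a crude initial judge
    training_set = []
    for _ in range(iterations):
        for ins in instructions:
            good, bad = seed_model(ins)
            verdict = evaluate(judge, good, bad)
            if verdict == "a":   # judgment agrees with the pair's construction
                training_set.append((ins, good, bad, verdict))
        # A real system would fine-tune the judge on training_set here and use
        # the improved judge in the next iteration; the toy judge stays fixed,
        # so this only illustrates how judgments become training data.
    return training_set
```

The key property is that no human labels appear anywhere: the pairs are constructed so the preferred answer is known, and the model’s own judgments on them become the next round’s supervision.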
Magpie-ultra, a new dataset by the Argilla team for supervised fine-tuning, has been released, featuring 50,000 instruction-response pairs. This synthetically generated dataset utilizes the advanced Llama 3.1 405B-Instruct model and other Llama models like Llama-Guard-3-8B and Meta-Llama-3.1-8B-Instruct. The dataset covers various tasks, including coding, mathematics, data analysis, creative writing, advice-seeking, and brainstorming, offering challenging instructions and responses to enhance AI model training.
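Records in an instruction-tuning dataset of this kind pair an instruction with a response and a task category, which makes it easy to carve out task-specific training subsets. The field names and example rows below are illustrative assumptions, not magpie-ultra’s actual schema.

```python
# Illustrative records in the shape of an instruction-tuning dataset;
# field names and contents are assumptions, not magpie-ultra's real schema.
records = [
    {"category": "coding",
     "instruction": "Write a function to reverse a list.",
     "response": "def rev(xs): return xs[::-1]"},
    {"category": "math",
     "instruction": "What is 12 * 13?",
     "response": "156"},
    {"category": "creative-writing",
     "instruction": "Write a haiku about rain.",
     "response": "Soft rain on the roof..."},
]

def filter_by_category(rows, category):
    # Select only the pairs for one task, e.g. to build a coding-focused SFT mix.
    return [r for r in rows if r["category"] == category]

coding = filter_by_category(records, "coding")
```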
Upcoming AI Webinars
✅ Learn how to fine-tune SAM 2 with your own data [Encord]
Aug 8, 2024 (10:00 am PST)
✅ Rapid LLM Experimentation with Gretel and Lambda [Gretel AI]
Aug 13, 2024 (10:00 am PT)
✅ Data Intelligence with Azure Databricks [Databricks]
Aug 14, 2024 (9:00 am PST)
✅ Conquering the Risk of LLM Hallucinations [DataRobot]
Aug 20, 2024 (9:00 am PDT)
✅ Combining structured and unstructured data to enhance RAG capabilities [Snorkel]
Aug 22, 2024 (10:00 am PDT)