🚀 AI News: How to Instruction Tune Code LLMs without GPT4 Data? + Meet 3D-VisTA + Meet Cheetor...(Aug 17, 2023 Edition)
This newsletter brings AI research news that is much more technical than most resources but still digestible and applicable.
🔥 Trending AI Research: Let’s learn something new from the trending papers.
🛎️ Trending Tools: Check out some cool AI tools picked up by our editorial team.
Read Time: 3 Minutes
🔥 Trending AI Research
1️⃣ How to Instruction Tune Code LLMs without GPT4 Data? Meet OctoPack: A Set of AI Models for Instruction Tuning Code Large Language Models [Paper] [Blog]
The study examines the benefits of instruction tuning large language models on code. Exploiting the natural structure of Git commits, which pair a code change with a human-written commit message, the researchers built CommitPack, a 4-terabyte compilation of Git commits across 350 programming languages. Benchmarked against other code instruction datasets on the 16B-parameter StarCoder model, CommitPack achieved the best HumanEval Python performance among models not trained on OpenAI outputs. The paper also introduces HumanEvalPack, which expands the HumanEval benchmark to three coding tasks across six languages. The resulting models, OctoCoder and OctoGeeX, perform best on HumanEvalPack among permissively licensed models, showing that CommitPack generalizes to a wider set of languages and coding tasks.
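To make the commit-as-instruction idea concrete, here is a minimal sketch of how a single commit could be turned into an instruction-tuning example. The `Commit` record, the field names, and the prompt template are assumptions made for illustration; they are not the actual CommitPack schema or the authors' preprocessing code.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    """A simplified, hypothetical single-file Git commit record."""
    message: str   # human-written commit message
    old_code: str  # file contents before the commit
    new_code: str  # file contents after the commit


def commit_to_example(commit: Commit) -> dict:
    """Map a commit onto an (instruction, input, output) triple."""
    return {
        "instruction": commit.message.strip(),  # the message acts as the instruction
        "input": commit.old_code,               # code the instruction applies to
        "output": commit.new_code,              # code after following the instruction
    }


def format_prompt(example: dict) -> str:
    """Render the example as plain text; this template is an assumption."""
    return (
        f"Instruction: {example['instruction']}\n"
        f"Code before:\n{example['input']}\n"
        f"Code after:\n"
    )


commit = Commit(
    message="Fix off-by-one error in range  ",
    old_code="for i in range(1, n):\n    print(i)\n",
    new_code="for i in range(1, n + 1):\n    print(i)\n",
)
example = commit_to_example(commit)
prompt = format_prompt(example)
```

During fine-tuning, a model would see `prompt` and be trained to produce `example["output"]`, so the commit message plays the role a hand-written instruction plays in other datasets.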
2️⃣ Meet 3D-VisTA: A Pre-Trained Transformer for 3D Vision and Text Alignment that can be Easily Adapted to Various Downstream Tasks [Blog] [Paper]
The paper addresses the growing field of 3D vision-language grounding (3D-VL), which connects the 3D physical world with natural language for embodied intelligence. Noting the complexity of current 3D-VL models, the study introduces 3D-VisTA, a pre-trained Transformer that simplifies the alignment of 3D vision and text. Unlike previous designs, 3D-VisTA relies solely on self-attention layers for both single-modal modeling and multi-modal fusion. The research also presents ScanScribe, the first large-scale dataset of 3D scene-text pairs, built from ScanNet, 3RScan, and GPT-3, for pre-training 3D-VisTA. After pre-training on ScanScribe, 3D-VisTA delivers state-of-the-art results on multiple 3D-VL tasks and is notably data-efficient, performing well even when downstream annotations are sparse.
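The "self-attention only" design can be illustrated with a toy sketch in which the same operation encodes each modality separately and then fuses them after concatenation. This is a simplified stand-in, not the 3D-VisTA implementation: it uses a single head with identity projections, no learned weights, and made-up two-dimensional embeddings.

```python
import math


def self_attention(tokens: list[list[float]]) -> list[list[float]]:
    """Single-head self-attention with identity projections (a simplification)."""
    dim = len(tokens[0])
    outputs = []
    for query in tokens:
        # Scaled dot-product score between this token and every token.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        # Numerically stable softmax over the scores.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of values (the values are the tokens themselves here).
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(dim)]
        )
    return outputs


# Hypothetical toy embeddings: two text tokens and three 3D-object tokens.
text_tokens = [[1.0, 0.0], [0.0, 1.0]]
object_tokens = [[0.5, 0.5], [1.0, 1.0], [0.0, 0.2]]

# Single-modal encoding: each modality attends only to itself.
text_encoded = self_attention(text_tokens)
objects_encoded = self_attention(object_tokens)

# Multi-modal fusion: concatenate both sequences and reuse the same operation.
fused = self_attention(text_encoded + objects_encoded)
```

The point of the sketch is structural: no modality-specific fusion module appears anywhere, only one attention routine applied to different token sequences.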
3️⃣ This AI Research from UCLA Indicates Large Language Models (such as GPT-3) have Acquired an Emergent Ability to Find Zero-Shot Solutions to a Broad Range of Analogy Problems [Paper] [Blog]
The study examines how well large language models, specifically the text-davinci-003 variant of GPT-3, perform on analogical reasoning tasks, and compares them with human reasoners. The focus is on zero-shot reasoning, in which novel problems are solved without any direct training on them, something humans often achieve through analogy. Using tests modeled on Raven's Standard Progressive Matrices, the researchers found that GPT-3 showed a strong capacity for abstract pattern induction, often matching or surpassing human performance. Preliminary tests with GPT-4 show even better results, suggesting that such models have acquired an emergent ability to solve a broad range of analogy problems zero-shot.
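As a rough illustration of what a zero-shot, text-based matrix problem can look like, the sketch below builds a 3x3 grid that follows a simple progression rule and hides the final cell. The rule, the bracket layout, and the question wording are all assumptions chosen for illustration; they are not the exact stimuli used in the study.

```python
def make_progression_matrix(start: int, step: int) -> list[list[int]]:
    """Build a 3x3 grid whose values grow by `step` along every row and column."""
    return [[start + step * (r + c) for c in range(3)] for r in range(3)]


def to_zero_shot_prompt(matrix: list[list[int]]) -> str:
    """Render the grid as text with the bottom-right cell hidden.

    No solved examples are included, which is what makes the prompt zero-shot.
    """
    lines = []
    for r, row in enumerate(matrix):
        cells = [str(value) for value in row]
        if r == len(matrix) - 1:
            cells[-1] = "?"  # hide the answer cell
        lines.append(" ".join(f"[{cell}]" for cell in cells))
    return "\n".join(lines) + "\nWhich number replaces the question mark?"


matrix = make_progression_matrix(start=1, step=2)
answer = matrix[2][2]
prompt = to_zero_shot_prompt(matrix)
```

A model (or a person) has to infer the rule from the visible cells alone and apply it to the hidden one.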
4️⃣ Meet Cheetor: A Transformer-based Multimodal Large Language Model (MLLM) that can Effectively Handle a Wide Variety of Interleaved Vision-Language Instructions and Achieves State-of-the-Art Zero-Shot Performance [Paper] [Blog]
Recent Multimodal Large Language Models (MLLMs) have shown promising capabilities across many vision-language tasks. Even so, current techniques are mostly restricted to a single visual context and a limited range of instruction types. To better assess these models, this study introduces the I4 benchmark, which evaluates how well MLLMs handle complex interleaved vision-language instructions, such as those found in visually rich content. One challenge the authors identify is that the Visual Prompt Generator (VPG) struggles to extract task-specific visual details. To overcome this, they propose a knowledge re-injection module and a cross-attention guided training strategy. Combining the two, they present Cheetor, a Transformer-based MLLM that achieves state-of-the-art zero-shot performance on the I4 benchmark without requiring high-quality instruction tuning data. Cheetor is also competitive with other top-performing models on the MME benchmark.
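An "interleaved" instruction mixes text and several images in one sequence, rather than pairing a single image with a single question. The sketch below shows one way such an instruction could be represented and flattened into a text prompt with image placeholders. The `Text` and `Image` types, the `<image>` placeholder, and the file names are hypothetical; this is not the I4 data format or Cheetor's input pipeline.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Text:
    content: str


@dataclass
class Image:
    path: str  # hypothetical image file path


Segment = Union[Text, Image]


def flatten(segments: list[Segment], placeholder: str = "<image>") -> tuple[str, list[str]]:
    """Turn interleaved segments into a prompt string plus an ordered image list."""
    parts = []
    images = []
    for segment in segments:
        if isinstance(segment, Image):
            parts.append(placeholder)    # mark where the image sits in the text
            images.append(segment.path)  # keep images in their original order
        else:
            parts.append(segment.content)
    return " ".join(parts), images


instruction = [
    Text("Compare"),
    Image("before.png"),
    Text("with"),
    Image("after.png"),
    Text("and describe what changed."),
]
prompt, images = flatten(instruction)
```

Keeping the placeholders in order lets a model align each image's visual features with the position where the instruction refers to it.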
🛎️ Trending Tools
AdCreative AI: This AI tool can help you boost your advertising and social media game.
Sanebox: AI-powered email optimization tool.
ChatPDF: Your PDF AI - like ChatGPT but for PDFs.
Recraft: An innovative Art AI tool that transforms text prompts into captivating vector art.
Pecan AI: Pecan AI automates predictive analytics to solve today’s business challenges: shrinking budgets, rising costs, and limited data science and AI resources.
Claude 2: Claude 2 is an AI chatbot that rivals ChatGPT in terms of functionality. While both tools are comparable, Claude 2 offers certain benefits over ChatGPT.
Taplio: Transform your LinkedIn presence with Taplio's AI-powered platform.
Equals: Equals is a spreadsheet built for startups to analyze data quickly, and it stands out as the only spreadsheet with seamless integration to any database.
Robin: Robin stands out in project management, offering collaborative features, Gantt charts, and task management.
Cresta AI: Cresta AI harnesses the power of AI to empower sales teams.
Notion: Notion AI is a robust generative AI tool that assists users with tasks like note summarization, identifying action items in meetings, and creating and modifying text.
Quickchat: Quickchat offers chatbot solutions that engage customers on websites and social media platforms.
TinyEinstein: tinyEinstein is an AI Marketing manager that helps you grow your Shopify store.
Ferret AI: Ferret provides deep insights into customer behavior through web analytics.
Xembly: Xembly acts as an AI Chief of Staff, simplifying enterprise tasks by managing schedules, meeting summaries, and to-do lists.
ChatSimple: With Chatsimple's website chatbots, you can capture all leads and handle customer inquiries without any hassle.