AI News: Microsoft’s FLAME for spreadsheets; Dreamix creates and edits video from image and text prompts; Salesforce AI CausalAI library; Multimodal-CoT from Amazon AI; and more
Hi there, today we share research updates on Microsoft’s FLAME for spreadsheets; Dreamix, which creates and edits video from image and text prompts; the Salesforce AI CausalAI library; UniPi; Multimodal-CoT from Amazon AI; and many other cool updates. So, let's start...
Microsoft: Microsoft has built a language model specifically for spreadsheet formulas. Called FLAME, it is a T5-based model trained on Excel formulas that leverages domain insights to achieve competitive performance with a substantially smaller model (60M parameters) and two orders of magnitude less training data. FLAME (60M) outperforms much larger models, such as Codex-Davinci (175B), Codex-Cushman (12B), and CodeT5 (220M), in 6 out of 10 settings.
Google Research/The Hebrew University of Jerusalem: Researchers propose Dreamix for general text-based appearance and motion editing of real-world videos. It can modify a video using just text, create a video from a single image and a description, or create a video from a series of images and a description.
Salesforce AI: Salesforce AI has open-sourced the CausalAI library for causal analysis of time series and tabular data. It supports causal discovery and causal inference for both kinds of data, of both discrete and continuous types. The library includes algorithms that handle linear and non-linear causal relationships between variables, and it uses multi-processing for speed-up.
MIT/Google Brain/UC Berkeley/University of Alberta: Can text-to-video generation help decision-making? Researchers introduce UniPi, which acts by synthesizing a video of what it will do. UniPi can generate diverse videos/actions across many environments (and combinatorially generalize!).
Amazon AI: Amazon researchers propose Multimodal-CoT, which incorporates vision features in a decoupled training framework. The framework separates rationale generation and answer inference into two stages. By incorporating the vision features in both stages, the model is able to generate effective rationales that contribute to answer inference. Multimodal-CoT outperforms GPT-3.5 by 16 percentage points (75.17% -> 91.68% accuracy) on ScienceQA and even surpasses human performance.
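To make the two-stage idea concrete, here is a minimal sketch of how rationale generation and answer inference could be chained, with the same vision features passed to both stages. This is not the Amazon implementation: `TwoStagePipeline`, `toy_rationale`, and `toy_answer` are hypothetical names, and the two stages are toy stand-in functions rather than trained models.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# Assumption: vision features are represented as a plain list of floats.
VisionFeatures = Sequence[float]


@dataclass
class TwoStagePipeline:
    # Stage 1: (question text, vision features) -> rationale text
    generate_rationale: Callable[[str, VisionFeatures], str]
    # Stage 2: (question text + rationale, vision features) -> answer text
    infer_answer: Callable[[str, VisionFeatures], str]

    def run(self, question: str, vision: VisionFeatures) -> Tuple[str, str]:
        # Stage 1 sees the question and the vision features.
        rationale = self.generate_rationale(question, vision)
        # Stage 2 sees the question, the generated rationale, and the
        # same vision features again.
        augmented = f"{question}\nRationale: {rationale}"
        answer = self.infer_answer(augmented, vision)
        return rationale, answer


# Hypothetical toy stages, standing in for two separately trained models.
def toy_rationale(question: str, vision: VisionFeatures) -> str:
    strongest = max(range(len(vision)), key=lambda i: vision[i])
    return f"feature {strongest} is strongest"


def toy_answer(text: str, vision: VisionFeatures) -> str:
    return "B" if "feature 1" in text else "A"


pipeline = TwoStagePipeline(toy_rationale, toy_answer)
rationale, answer = pipeline.run("Which option matches the image?", [0.1, 0.9, 0.3])
```

Keeping the two stages as separate callables mirrors the decoupling described above: each stage can be trained or swapped independently, while the answer stage still conditions on the rationale produced by the first.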
Meta AI: GenBench is a new effort, led by Meta AI researchers, that aims to make state-of-the-art generalization testing the new status quo for NLP work. As a first step, they present a generalization taxonomy describing the underlying building blocks of generalization in NLP. They use the taxonomy to conduct an extensive review of over 400 generalization papers and make recommendations for promising areas of future work.
Fastest companies to hit 100 million users: 🤯🤯
Facebook: 4 years.
Snapchat: 3 years.
MySpace: 3 years.
Instagram: 2 years.
Google: 1 year.
OpenAI (via ChatGPT): 2 months.