AI News: 1 Million ChatGPT Users in 5 Days; RAVEn 𓄿; Stanford releases MegaBlocks; Microsoft's eXtensible Prompt (X-Prompt) beyond NL....

Hi there, today we will share some research updates: ChatGPT reaching 1 million users in 5 days, Microsoft's eXtensible Prompt (X-Prompt) beyond NL, Make-A-Video3D, Stanford's MegaBlocks release, RAVEn 𓄿, and many other cool updates. So, let's start...

ChatGPT: ChatGPT attracted over 1 million users in under five days. This again suggests that a customer-centric approach is key to success in product development, and that a strong product can outshine any marketing strategy. Its user acquisition timeline is comparable to that of other revolutionary technical products, and it's no surprise that it has quickly become a favorite among consumers.

Microsoft AI Research: Proposes eXtensible Prompt (X-Prompt) for prompting a large language model (LLM) beyond natural language (NL). Some concepts, such as a writer's style, are hard to capture in an LLM prompt with natural language alone. Microsoft researchers developed a method that uses imaginary words to represent such ideas, which allows prompts to be more descriptive and fine-grained.
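
To make the idea of imaginary words more concrete, here is a toy sketch in plain Python. It assumes (this is not stated in the summary above) that an imaginary word is a new token with its own trainable embedding vector, while the embeddings of ordinary natural-language words stay frozen. The token name `<style:author-x>`, the tiny vocabulary, and the helper functions are all hypothetical and are not taken from Microsoft's implementation.

```python
import random

EMBED_DIM = 4
IMAGINARY = "<style:author-x>"  # hypothetical imaginary word standing for a writer's style

random.seed(0)

# Frozen embeddings for ordinary natural-language tokens
# (a stand-in for the LLM's existing vocabulary).
nl_embeddings = {
    word: [random.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
    for word in ["write", "a", "poem", "in", "the", "style", "of"]
}

# Imaginary words live in a separate table; only these vectors are updated.
imaginary_embeddings = {IMAGINARY: [0.0] * EMBED_DIM}


def embed_prompt(tokens):
    """Look up each token, mixing frozen NL words and imaginary words."""
    return [
        imaginary_embeddings[t] if t in imaginary_embeddings else nl_embeddings[t]
        for t in tokens
    ]


def update_imaginary(token, grad, lr=0.1):
    """One gradient-descent step on an imaginary word's vector only."""
    current = imaginary_embeddings[token]
    imaginary_embeddings[token] = [v - lr * g for v, g in zip(current, grad)]


# A prompt that mixes natural language with one imaginary word.
prompt = ["write", "a", "poem", "in", "the", "style", "of", IMAGINARY]
vectors = embed_prompt(prompt)

# Pretend a training step produced this gradient for the imaginary word.
update_imaginary(IMAGINARY, grad=[1.0] * EMBED_DIM)
```

In this sketch the natural-language table is never written to, so whatever the imaginary word learns is carried entirely by its own vector.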

Meta AI/Make-A-Video3D: Generating 3D dynamic (mini) scenes from input text. Researchers from Meta AI introduce MAV3D (Make-A-Video3D), a method for generating three-dimensional dynamic scenes from text descriptions. Their approach uses a 4D dynamic Neural Radiance Field (NeRF), which is optimized for scene appearance, density, and motion consistency by querying a Text-to-Video (T2V) diffusion-based model.

Stanford releases MegaBlocks: A system for efficient Mixture-of-Experts (MoE) training on GPUs. MegaBlocks outperforms Tutel by up to 40% in end-to-end training speed by reformulating MoEs as block-sparse operations, which allows it to avoid token dropping without sacrificing hardware efficiency.
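
As a rough illustration of what token dropping means, the sketch below contrasts capacity-limited routing with a dropless grouping whose per-expert row counts are padded up to a block size. It is plain Python, not MegaBlocks code: the function names, the capacity of 3, and the block size of 2 are hypothetical, and the padding scheme is an assumption about how variable-sized expert groups can be mapped onto fixed-size blocks. The real system does this with block-sparse GPU kernels.

```python
def route_with_capacity(assignments, num_experts, capacity):
    """Capacity-limited routing: each expert accepts at most `capacity`
    tokens, and any overflow tokens are dropped."""
    kept = {e: [] for e in range(num_experts)}
    dropped = []
    for token_id, expert in enumerate(assignments):
        if len(kept[expert]) < capacity:
            kept[expert].append(token_id)
        else:
            dropped.append(token_id)
    return kept, dropped


def route_dropless(assignments, num_experts, block_size):
    """Dropless routing: every token is kept. Each expert's group may have a
    different size, so its row count is rounded up to a multiple of
    `block_size` to fit a block-structured layout."""
    groups = {e: [] for e in range(num_experts)}
    for token_id, expert in enumerate(assignments):
        groups[expert].append(token_id)
    padded_rows = {
        e: -(-len(ids) // block_size) * block_size  # ceiling to a block multiple
        for e, ids in groups.items()
    }
    return groups, padded_rows


# Eight tokens, routed unevenly: expert 0 is popular, expert 1 is not.
assignments = [0, 0, 0, 0, 0, 1, 2, 2]

kept, dropped = route_with_capacity(assignments, num_experts=3, capacity=3)
groups, padded_rows = route_dropless(assignments, num_experts=3, block_size=2)

print("dropped with capacity 3:", dropped)
print("dropless group sizes:", {e: len(ids) for e, ids in groups.items()})
print("padded rows per expert:", padded_rows)
```

With a fixed capacity, the popular expert loses its overflow tokens. The dropless variant keeps all eight tokens and pays only a small padding cost per expert.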

VoiceGPT Project: A virtual assistant that leverages the powerful ChatGPT chatbot to answer spoken requests. You speak your question, and VoiceGPT responds with realistic, synthesized speech.

RAVEn 𓄿: RAVEn is a self-supervised model that jointly learns powerful visual and auditory speech representations entirely from raw data. The visual and auditory encoders are trained jointly in a single pre-training stage, and researchers observed strong results in both low- and high-resource labeled-data settings when fine-tuning them. Notably, RAVEn surpasses all self-supervised methods on visual speech recognition (VSR) on LRS3, and combining RAVEn with self-training using only 30 hours of labeled data even outperforms a recent semi-supervised method trained on 90,000 hours of non-public data.

Heard about the Massive Text Embedding Benchmark (MTEB)? MTEB is a great way to see which text embedding models give the best results for specific NLP tasks like classification, clustering, or retrieval.