
AI News: Meta AI’s GenAug; Do you know that DeepMind has actually open-sourced the heart of AlphaGo & AlphaZero?; The v0.14 release of BERTopic is here....

This newsletter brings AI research news that is much more technical than most resources but still digestible and applicable

Hi there, today we will share some research updates: Meta AI’s GenAug, DeepMind’s open-sourced library at the heart of AlphaGo and AlphaZero, the v0.14 release of BERTopic, why deep learning is almost always done on array data, and many other cool updates. So, let's start...

🤔 Did you know that DeepMind has actually open-sourced the heart of AlphaGo and AlphaZero? It’s hidden in an unassuming GitHub repo called “mctx”. The library provides JAX-native Monte Carlo Tree Search (MCTS) that runs on batches of inputs, in parallel, and is blazing fast. MCTS is a search algorithm that finds the best move in turn-based games by selecting → expanding → simulating → updating the nodes of a search tree. This style of search powers not just AlphaGo, but also AlphaZero (which learns Go, Chess, and Shogi from scratch) and MuZero (which extends AlphaZero to Atari games).
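Here is a minimal sketch of what batched search with mctx looks like, using its `muzero_policy` entry point; the toy `recurrent_fn`, root values, and hyperparameters are placeholders we made up for illustration, not anything from DeepMind’s agents.

```python
import jax
import jax.numpy as jnp
import mctx

batch_size, num_actions = 4, 8

def recurrent_fn(params, rng_key, action, embedding):
    # Toy dynamics/prediction step: in a real agent this would call your learned model.
    # All values below are placeholders for illustration.
    del params, rng_key, action
    new_embedding = embedding + 1  # pretend state transition
    output = mctx.RecurrentFnOutput(
        reward=jnp.zeros(batch_size),
        discount=jnp.ones(batch_size),
        prior_logits=jnp.zeros((batch_size, num_actions)),
        value=jnp.zeros(batch_size),
    )
    return output, new_embedding

# One search root per batch element.
root = mctx.RootFnOutput(
    prior_logits=jnp.zeros((batch_size, num_actions)),
    value=jnp.zeros(batch_size),
    embedding=jnp.zeros(batch_size),  # any pytree describing the root states
)

policy_output = mctx.muzero_policy(
    params=None,
    rng_key=jax.random.PRNGKey(0),
    root=root,
    recurrent_fn=recurrent_fn,
    num_simulations=32,
)
print(policy_output.action)          # one chosen action per batch element
print(policy_output.action_weights)  # improved policy produced by the search
```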

GenAug: A new system by Meta AI researchers that uses text-to-image generative models to let robots transfer behaviors zero-shot from a single demonstrated scene to unseen scenes of varying complexity, at no extra robot or human cost.

Ever wondered why deep learning is almost always done on array data? 🤔 In earlier research, DeepMind and University of Haifa researchers introduced functa, a framework for representing data as neural functions (a.k.a. neural fields or implicit neural representations, INRs) and doing deep learning directly on those functions. Now they introduce spatial functa and show how to scale the approach up to ImageNet-1k at 256×256 resolution.
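To build intuition for “data as neural functions”, here is a generic toy implicit neural representation in JAX (our own illustration, not the functa codebase): a tiny MLP is fit to map pixel coordinates to pixel values, and the fitted weights then stand in for the image.

```python
import jax
import jax.numpy as jnp

def init_params(key, hidden=64):
    k1, k2, k3 = jax.random.split(key, 3)
    return {
        "w1": jax.random.normal(k1, (2, hidden)) * 0.5, "b1": jnp.zeros(hidden),
        "w2": jax.random.normal(k2, (hidden, hidden)) * 0.5, "b2": jnp.zeros(hidden),
        "w3": jax.random.normal(k3, (hidden, 1)) * 0.5, "b3": jnp.zeros(1),
    }

def inr(params, coords):
    # f_theta(x, y) -> grayscale value, with sinusoidal activations (SIREN-style).
    h = jnp.sin(coords @ params["w1"] + params["b1"])
    h = jnp.sin(h @ params["w2"] + params["b2"])
    return h @ params["w3"] + params["b3"]

def loss(params, coords, pixels):
    return jnp.mean((inr(params, coords) - pixels) ** 2)

key = jax.random.PRNGKey(0)
params = init_params(key)
H = W = 32
ys, xs = jnp.meshgrid(jnp.linspace(-1, 1, H), jnp.linspace(-1, 1, W), indexing="ij")
coords = jnp.stack([ys.ravel(), xs.ravel()], axis=-1)   # (H*W, 2) pixel coordinates
pixels = jax.random.uniform(key, (H * W, 1))            # stand-in for a real image

# Fit the network to this one image; afterwards, the *parameters* are the representation.
grad_fn = jax.jit(jax.grad(loss))
for _ in range(200):
    grads = grad_fn(params, coords, pixels)
    params = jax.tree_util.tree_map(lambda p, g: p - 1e-2 * g, params, grads)
```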

Big Little Transformer Decoder: UC Berkeley researchers propose the Big Little Decoder (BiLD), a framework that can improve inference efficiency and cut latency by up to 2x for a wide range of text generation applications without degrading performance. BiLD pairs two models of different sizes that collaboratively generate text: the small model runs autoregressively to generate text at low inference cost, and the large model is invoked only occasionally, in a non-autoregressive manner, to refine the small model's inaccurate predictions. To coordinate the two models, BiLD introduces two simple yet effective policies: (1) a fallback policy that determines when to hand control over to the large model, and (2) a rollback policy that determines when the large model needs to review and correct the small model's inaccurate predictions.
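The paper defines its fallback and rollback policies precisely; the sketch below is only our schematic of that control flow, with made-up stand-in models and thresholds, to show how the small model keeps generating until the large model is asked to step in.

```python
import random

VOCAB = list("abcdefgh")

def small_step(prefix):
    """Cheap autoregressive model stand-in: returns (token, confidence)."""
    random.seed(len(prefix))
    return random.choice(VOCAB), random.uniform(0.3, 1.0)

def large_score(prefix, token):
    """Expensive model stand-in scoring a proposed token (non-autoregressive in BiLD)."""
    random.seed(hash((prefix, token)) % 10_000)
    return random.uniform(0.0, 1.0)

def bild_generate(max_len=20, fallback_threshold=0.6, rollback_threshold=0.4):
    text = ""
    while len(text) < max_len:
        token, confidence = small_step(text)
        if confidence < fallback_threshold:
            # Fallback policy: the small model is unsure, so hand control to the large model.
            if large_score(text, token) < rollback_threshold:
                # Rollback policy: the large model disagrees and corrects the prediction.
                token = VOCAB[0]  # stand-in for the large model's own prediction
        text += token
    return text

print(bild_generate())
```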

The v0.14 release of BERTopic is here: You can now fine-tune your topic keywords and labels with models from OpenAI, Hugging Face, Cohere, and LangChain. Beyond keywords and labels, you can also plug in models for part-of-speech tagging, text generation, zero-shot classification, and more.
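As a minimal sketch (assuming BERTopic ≥ 0.14, where the `bertopic.representation` module was introduced), fine-tuning topic keywords looks roughly like this:

```python
from sklearn.datasets import fetch_20newsgroups
from bertopic import BERTopic
from bertopic.representation import KeyBERTInspired

docs = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes"))["data"]

# Fine-tune the default c-TF-IDF topic keywords with a KeyBERT-inspired representation model.
representation_model = KeyBERTInspired()
topic_model = BERTopic(representation_model=representation_model)
topics, probs = topic_model.fit_transform(docs)

print(topic_model.get_topic_info().head())
```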

Discovering optimization algorithms via program search: The Adam optimizer is at the heart of modern AI, and researchers have been trying to dethrone it for years. How about asking a machine to do a better job? Google AI and UCLA researchers use evolutionary search to discover a simpler, more efficient algorithm with remarkable properties; it’s just 8 lines of code. The discovered “Lion” optimizer boosts the accuracy of Vision Transformers (ViT) by up to 2% on ImageNet, reduces training compute by up to 2.3x for diffusion models, and achieves comparable performance on LLMs. It is also more memory-efficient than human-designed optimizers. The paper is a great demonstration of a scalable symbolic AI system: there is prior work on neural-network-based, learned meta-optimizers, but Lion has a much simpler symbolic form that is interpretable and lightweight to incorporate.
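For reference, here is a minimal NumPy sketch of the Lion update rule as described in the paper (function and variable names are ours): take the sign of an interpolated momentum, apply decoupled weight decay, and keep a single momentum buffer, which is where the memory savings over Adam come from.

```python
import numpy as np

def lion_update(param, grad, momentum, lr=1e-4, beta1=0.9, beta2=0.99, weight_decay=0.0):
    """One Lion step on a parameter tensor."""
    update = np.sign(beta1 * momentum + (1 - beta1) * grad)   # direction only, via sign
    param = param - lr * (update + weight_decay * param)      # decoupled weight decay
    momentum = beta2 * momentum + (1 - beta2) * grad          # momentum updated after the step
    return param, momentum

# Toy usage: minimize f(w) = ||w||^2
w = np.ones(3)
m = np.zeros_like(w)
for _ in range(100):
    g = 2 * w
    w, m = lion_update(w, g, m, lr=1e-2)
print(w)
```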

SwitchPrompt: Pre-trained language models have shown great potential for improving performance across various natural language processing tasks, but their effectiveness in low-resource domains is limited by the mismatch between the pre-training data and the target task. A recent research collaboration between Bosch and Adobe proposes a new prompting approach called "SwitchPrompt" to bridge this gap: a lightweight method that adapts language models trained on general-domain datasets to the diverse demands of low-resource domains. It even outperforms some domain-specific language models by 10.7% in accuracy.

Energy Transformer (ET): A transformer architecture that replaces the sequence of feedforward transformer blocks with a single large Associative Memory model. ET keeps many of the familiar architectural primitives used in the current generation of transformers, but it is not identical to existing architectures: its sequence of layers is purposely designed to minimize a specifically engineered energy function that represents the relationships between the tokens. As a consequence of this computational principle, attention in ET differs from the conventional attention mechanism.
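As a toy illustration of the general principle only (token updates as gradient descent on an energy over the tokens), and emphatically not the actual ET energy function:

```python
import jax
import jax.numpy as jnp

def toy_energy(tokens, coupling):
    # Toy quadratic energy: a token-coupling (alignment) term plus a norm penalty.
    # This is a stand-in, NOT the Energy Transformer's attention/Hopfield energy.
    return -0.5 * jnp.sum(tokens @ coupling @ tokens.T) + 0.5 * jnp.sum(tokens ** 2)

def energy_descent(tokens, coupling, step=0.1, iters=20):
    grad_fn = jax.grad(toy_energy)
    for _ in range(iters):
        tokens = tokens - step * grad_fn(tokens, coupling)  # "layer" = one descent step on E
    return tokens

key = jax.random.PRNGKey(0)
tokens = jax.random.normal(key, (5, 8))   # 5 tokens, 8-dim embeddings
coupling = 0.1 * jnp.eye(8)               # symmetric toy coupling matrix
print(toy_energy(tokens, coupling), toy_energy(energy_descent(tokens, coupling), coupling))
```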

Did you know Marktechpost has 1.5 million+ page views per month and 500,000 AI community members?

Want to support us?