Latest Posts

Stanford U & OpenAI’s Meta-Prompting Elevates Language Model Performance, Surpassing Standard Prompting by 17%

In a new paper Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding, a Stanford University and OpenAI research team introduces meta-prompting. This innovative scaffolding approach proves to be highly effective, surpassing standard prompting by 17.1%, expert (dynamic) prompting by 17.3%, and multi-persona prompting by 15.2%.
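To make the scaffolding concrete, here is a minimal sketch of a meta-prompting loop in the spirit of the paper: a single "conductor" instance of the model decomposes the task, calls fresh instances of the same model prompted as domain experts, and aggregates their replies. The `call_llm` helper, the prompts, and the control flow are simplified stand-ins, not the authors' code.

```python
# Minimal sketch of a meta-prompting (conductor + experts) loop.
# `call_llm` is a hypothetical wrapper around any chat-completion API.

def call_llm(system_prompt: str, user_prompt: str) -> str:
    """Hypothetical chat-completion wrapper; plug in your preferred client."""
    raise NotImplementedError

CONDUCTOR_SYSTEM = (
    "You are the Meta Expert. Break the task into subtasks. To consult an "
    "expert, reply with 'Expert <name>: <instructions>'. When confident in "
    "the answer, reply with 'FINAL ANSWER: <answer>'."
)

def meta_prompt(task: str, max_rounds: int = 10) -> str:
    history = f"Task: {task}"
    for _ in range(max_rounds):
        decision = call_llm(CONDUCTOR_SYSTEM, history)
        if decision.startswith("FINAL ANSWER:"):
            return decision.removeprefix("FINAL ANSWER:").strip()
        # Spawn a fresh instance of the same model, persona'd as the named
        # expert; it sees only the conductor's instructions, not the history.
        expert_reply = call_llm(
            "You are the expert described below. Follow the instructions.",
            decision,
        )
        history += f"\n\n{decision}\n\nExpert reply: {expert_reply}"
    return history  # fall back to the transcript if no final answer emerged
```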

A Robot Chemist Driven by GPT-4 Made Its Debut in Nature: Autonomously Designs Reactions and Performs Complex Experiments

In a new paper Autonomous chemical research with large language models, a research team from Carnegie Mellon University and Emerald Cloud Lab introduces an innovative LLM-powered system named Coscientist, which autonomously designs, plans, and executes complex scientific experiments, marking a significant leap forward in the integration of laboratory automation technologies with powerful language models.
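The design-plan-execute pattern the blurb describes can be pictured as a simple agent loop. The sketch below is a generic, heavily simplified illustration of that pattern, not Coscientist's actual architecture; `call_llm` and `run_in_lab_sandbox` are hypothetical placeholders.

```python
# Generic plan-then-execute agent loop (illustrative only).

def call_llm(prompt: str) -> str:
    """Hypothetical chat-completion wrapper (e.g. a GPT-4-class model)."""
    raise NotImplementedError

def run_in_lab_sandbox(code: str) -> str:
    """Hypothetical executor that runs generated code against lab hardware
    or a simulator and returns logs/observations."""
    raise NotImplementedError

def run_experiment(goal: str, max_steps: int = 5) -> str:
    notes = f"Experimental goal: {goal}"
    for _ in range(max_steps):
        # The model proposes the next step as executable code ...
        plan = call_llm(f"{notes}\n\nPropose the next step as executable code.")
        # ... the step is executed, and the observation is fed back.
        observation = run_in_lab_sandbox(plan)
        notes += f"\n\nProposed step:\n{plan}\n\nObservation:\n{observation}"
        if "EXPERIMENT COMPLETE" in observation:
            break
    return call_llm(f"{notes}\n\nSummarize the results.")
```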

DeepMind’s DiLoCo Revolutionizes Language Model Training with 500× Less Communication

In a new paper DiLoCo: Distributed Low-Communication Training of Language Models, a Google DeepMind research team presents Distributed Low-Communication (DiLoCo). DiLoCo employs a distributed optimization algorithm that facilitates the training of language models on islands of poorly connected devices, surpassing the performance of fully synchronous models while reducing communication by 500 times.
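The communication savings come from how rarely workers synchronize. Below is a minimal single-process sketch of the outer/inner optimization pattern DiLoCo is built on: each worker takes many local AdamW steps, and only then is a single outer Nesterov-momentum update applied to the global weights using the averaged "pseudo-gradient". It assumes PyTorch; the model, data loaders, and hyperparameters are stand-ins, and this is not DeepMind's implementation.

```python
import copy
import torch
import torch.nn.functional as F

def diloco_round(global_model, worker_loaders, inner_steps=500, outer_lr=0.7):
    """One communication round: every worker runs `inner_steps` local AdamW
    updates on its own shard, then one outer Nesterov-momentum step is applied
    to the global parameters using the averaged pseudo-gradient. Parameters
    cross the slow network once per round instead of once per step."""
    outer_opt = torch.optim.SGD(global_model.parameters(),
                                lr=outer_lr, momentum=0.9, nesterov=True)
    deltas = [torch.zeros_like(p) for p in global_model.parameters()]

    for loader in worker_loaders:                      # each worker = one "island"
        local_model = copy.deepcopy(global_model)      # start from global weights
        inner_opt = torch.optim.AdamW(local_model.parameters(), lr=1e-4)
        batches = iter(loader)
        for _ in range(inner_steps):                   # no cross-island traffic here
            inputs, targets = next(batches)
            loss = F.cross_entropy(local_model(inputs), targets)
            inner_opt.zero_grad()
            loss.backward()
            inner_opt.step()
        # Pseudo-gradient: how far this worker drifted from the global weights.
        for d, g, l in zip(deltas, global_model.parameters(),
                           local_model.parameters()):
            d += (g.data - l.data) / len(worker_loaders)

    outer_opt.zero_grad()
    for p, d in zip(global_model.parameters(), deltas):
        p.grad = d                                     # feed pseudo-gradient to SGD
    outer_opt.step()
```

With hundreds of inner steps per round, parameter exchange happens hundreds of times less often than in fully synchronous data-parallel training, which is where the reported 500x reduction in communication comes from.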

ETH Zurich’s UltraFastBERT Realizes 78x Speedup for Language Models

In a new paper Exponentially Faster Language Modelling, an ETH Zurich research team introduces UltraFastBERT, a variant of the BERT architecture. UltraFastBERT takes a revolutionary approach by replacing feedforward layers with fast feedforward networks, resulting in an impressive 78x speedup over the optimized baseline feedforward implementation.
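For intuition, here is a toy sketch of the conditional routing that fast feedforward networks rely on: the hidden neurons are arranged as a binary tree and each token evaluates only one root-to-leaf path, so a layer with 2^depth - 1 routing neurons touches just `depth` of them per input. This illustrates the idea under simplified inference-time assumptions and is not ETH Zurich's implementation.

```python
import torch

class ToyFastFeedforward(torch.nn.Module):
    """Toy fast-feedforward layer: hard binary-tree routing at inference."""

    def __init__(self, dim: int, depth: int):
        super().__init__()
        self.depth = depth
        n_nodes = 2 ** depth - 1                       # internal routing neurons
        n_leaves = 2 ** depth                          # small output blocks
        self.node_w = torch.nn.Parameter(torch.randn(n_nodes, dim) / dim ** 0.5)
        self.leaf = torch.nn.Parameter(torch.randn(n_leaves, dim, dim) / dim ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (dim,)
        node = 0
        for _ in range(self.depth):                    # evaluate only `depth` neurons
            go_right = (self.node_w[node] @ x) > 0     # hard branch at inference
            node = 2 * node + (2 if go_right else 1)   # heap-style child index
        leaf_idx = node - (2 ** self.depth - 1)        # map node id to leaf id
        return self.leaf[leaf_idx] @ x                 # one small block per token
```

The paper reports that UltraFastBERT engages only 12 of 4,095 neurons per layer for each inference, which is where the speedup over dense feedforward layers comes from.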

Microsoft Orca 2’s Triumph: Comparable or Superior Performance to Models 5-10x Its Size in Mastering Reasoning Tasks

Microsoft has recently unveiled Orca 2 in a new paper, Orca 2: Teaching Small Language Models How to Reason, which explores how enhanced training signals can augment the reasoning abilities of smaller language models. Notably, Orca 2 surpasses models of similar size, achieving performance levels comparable to or better than models 5-10 times larger.

Adobe & ANU’s LRM Reconstructs a 3D Model From a Single Image in 5s

In a new paper LRM: Large Reconstruction Model for Single Image to 3D, a research team from Adobe Research and Australian National University introduces an innovative Large Reconstruction Model (LRM). This groundbreaking model has the remarkable ability to predict a 3D model of an object from a single input image in a mere 5 seconds.

DeepMind Verifies ConvNets Can Match Vision Transformers at Scale

In a new paper ConvNets Match Vision Transformers at Scale, a Google DeepMind research team challenges the prevailing belief that Vision Transformers possess superior scaling capabilities compared to ConvNets and provides empirical results revealing that ConvNets can indeed hold their own against Vision Transformers at scale.