Latest Posts

Unveiling the Secret Linearity of Transformers: Further Advancing Model Efficiency and Performance

In a new paper Your Transformer is Secretly Linear, a research team uncovers a near-perfect linear relationship in transformations between sequential layers and introduces a novel distillation technique that approximates certain layers linearly while preserving model performance.
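To make the core idea concrete, the following is a minimal illustrative sketch (not the authors' code) of what "near-perfect linearity between sequential layers" means: fit a single linear map from a layer's input activations to its output activations and check how much of the transformation it explains. The array shapes and variable names here are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_dim, n_tokens = 64, 2048  # illustrative sizes, not from the paper

# Stand-ins for hidden states entering and leaving one transformer layer.
X = rng.normal(size=(n_tokens, hidden_dim))               # layer input
Y = X + 0.05 * rng.normal(size=(n_tokens, hidden_dim))    # layer output (nearly linear in X)

# Least-squares fit of a single linear map W such that X @ W ≈ Y.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# R^2-style score: how much of the layer's transformation the linear map captures.
residual = Y - X @ W
r2 = 1.0 - (residual ** 2).sum() / ((Y - Y.mean(0)) ** 2).sum()
print(f"linear fit R^2: {r2:.4f}")  # close to 1.0 when the layer behaves near-linearly
```

A score close to 1.0 is the kind of evidence that motivates replacing such layers with cheap linear approximations during distillation.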

Embracing the Era of 1-Bit LLMs: Microsoft & UCAS’s BitNet b1.58 Redefines Efficiency

In a new paper The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits, a research team introduces BitNet b1.58, a new variant of 1-bit LLMs that preserves the advantages of the original 1-bit BitNet while ushering in a novel computational paradigm, delivering significant gains in latency, memory usage, throughput, and energy consumption.
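The "1.58 bits" refers to weights taking one of three values, {-1, 0, +1} (log2(3) ≈ 1.58 bits per weight). Below is a minimal sketch, under my own assumptions rather than the authors' released code, of ternary weight quantization with a per-tensor scale in the spirit of BitNet b1.58.

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-8):
    """Quantize a weight matrix to {-1, 0, +1} with a per-tensor scale (illustrative)."""
    scale = np.abs(w).mean() + eps            # per-tensor mean-absolute-value scale
    q = np.clip(np.round(w / scale), -1, 1)   # round, then clip to the ternary set
    return q.astype(np.int8), scale           # dequantize later as q * scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(4, 8))       # toy weight matrix
q, scale = ternary_quantize(w)
print(q)                                      # entries are only -1, 0, or +1
print(np.abs(w - q * scale).mean())           # reconstruction error of the ternary weights
```

Because the quantized weights are ternary, matrix multiplications reduce largely to additions and subtractions, which is where the reported savings in latency, memory, and energy come from.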