Transformers are Deep Optimizers: Provable In-Context Learning for Deep Model Training

This paper investigates the capability of transformers to simulate the training of deep models via in-context learning (ICL), providing a provable, explicit construction.

May 2025 · 1 min · Weimin Wu, Maojiang Su, Jerry Yao-Chieh Hu, Zhao Song, Han Liu
Computational Limits of Low-Rank Adaptation (LoRA) for Transformer-Based Models

We study the computational limits of Low-Rank Adaptation (LoRA) updates for fine-tuning transformer-based models through the lens of fine-grained complexity theory.

April 2025 · 1 min · Jerry Yao-Chieh Hu, Maojiang Su, En-Jui Kuo, Zhao Song, Han Liu