MLPs Learn In-Context on Regression and Classification Tasks
Best AI papers explained - A podcast by Enoch H. Kang - Saturdays

This research paper demonstrates that Multi-Layer Perceptrons (MLPs) can perform In-Context Learning (ICL), an ability often attributed exclusively to Transformer models. The researchers show that MLPs, along with the closely related MLP-Mixer models, achieve performance comparable to Transformers on synthetic ICL regression and classification tasks. Moreover, in experiments on relational reasoning, a setting closely related to ICL classification, MLPs surprisingly outperform Transformers in both compute efficiency and generalization. These findings indicate that ICL does not depend solely on attention-based architectures, and they challenge prior assumptions about the limitations of simple neural networks such as MLPs on relational tasks. The study encourages further exploration of non-Transformer architectures to better understand the mechanisms underlying ICL.
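
To make the idea of a synthetic ICL regression task concrete, here is a minimal sketch of how such a prompt can be posed to a plain MLP: several exemplar (x, y) pairs from a randomly drawn linear function plus a query x are flattened into one input vector, so the network can only predict the query target by inferring the task from the in-prompt context. This is an illustrative assumption about the general setup, not the paper's exact protocol; the function name sample_prompt and the dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 4, 8  # input dimension, number of in-context exemplar pairs (assumed values)

def sample_prompt():
    """Build one in-context regression prompt as a flat vector plus its target."""
    w = rng.normal(size=d)                # task-specific regression weights
    xs = rng.normal(size=(k + 1, d))      # k context inputs + 1 query input
    ys = xs @ w                           # noiseless linear targets
    context = np.concatenate([xs[:k].ravel(), ys[:k]])  # flattened exemplars
    prompt = np.concatenate([context, xs[k]])            # append the query x
    return prompt, ys[k]                                 # predict the query y

X, y = zip(*(sample_prompt() for _ in range(10_000)))
X, y = np.stack(X), np.array(y)
# X has shape (10000, k*d + k + d). Fitting any standard MLP regressor on
# (X, y) forces it to use the in-prompt exemplars: each prompt comes from a
# different weight vector w, so memorizing a single mapping cannot work.
```

Under this framing, "in-context learning" for the MLP simply means generalizing across prompts whose underlying functions differ, which is the sense in which the paper compares MLPs and Transformers on synthetic tasks.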