Is bigger better?

Recent work in Natural Language Processing (NLP) indicates that bigger models deliver better performance. Kaplan et al. show that test loss scales as a power law with model size, dataset size, and the amount of compute used for training.
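For concreteness, here is a sketch of the functional form when each factor is varied in isolation, using the approximate exponents reported in the paper (N is the number of non-embedding parameters, D the dataset size in tokens, and C_min the compute used when training compute-optimally):

```latex
% Power-law scaling of test loss, varying one factor at a time
% (approximate exponents from Kaplan et al., 2020)
L(N) = \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad \alpha_N \approx 0.076
L(D) = \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad \alpha_D \approx 0.095
L(C_{\min}) = \left(\frac{C_c^{\min}}{C_{\min}}\right)^{\alpha_C^{\min}}, \qquad \alpha_C^{\min} \approx 0.050
```

The exponents are small, so absolute gains shrink as you scale, but the paper finds these trends hold over several orders of magnitude, which is why simply scaling up keeps paying off.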

Kaplan, Jared, et al. “Scaling laws for neural language models.” arXiv preprint arXiv:2001.08361 (2020).