RMSNorm regularises the summed inputs to the neurons in a layer according to their root mean square (RMS), which gives the model a re-scaling invariance property and an implicit learning rate adaptation ability. Because it omits the mean-centring step of layer normalisation, RMSNorm is computationally simpler and thus more efficient than LayerNorm.
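
As a rough illustration of the description above, the following is a minimal NumPy sketch rather than a reference implementation. The function name `rms_norm`, the learnable per-feature `gain` vector, and the small stabilising constant `eps` are assumptions introduced here for the example.

```python
import numpy as np

def rms_norm(a, gain, eps=1e-8):
    """Divide the last axis of `a` by its root mean square, then apply `gain`.

    `gain` and `eps` are illustrative assumptions: a per-feature scale and a
    small constant that guards against division by zero.
    """
    # Root mean square over the feature axis; no mean is subtracted.
    rms = np.sqrt(np.mean(np.square(a), axis=-1, keepdims=True) + eps)
    return a / rms * gain

# One example with four summed inputs, and unit gain.
a = np.array([[1.0, 2.0, 3.0, 4.0]])
gain = np.ones(4)

out = rms_norm(a, gain)
# Re-scaling the inputs leaves the output (almost exactly) unchanged.
scaled = rms_norm(10.0 * a, gain)
```

With unit gain, `out` has a root mean square of approximately one, and `scaled` matches `out` up to the effect of `eps`, which is the re-scaling invariance mentioned above.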
