Understanding Add & Norm in Transformer Models