From a design perspective, should we consider adding the original `x` to the hidden states produced by `down_proj` [here](https://github.com/Leeroo-AI/mergoo/blob/main/mergoo/models/modeling_llama.py#L242)? <img width="194" alt="image" src="https://github.com/Leeroo-AI/mergoo/assets/54089835/65fac43e-fbd6-4273-a887-e873949808eb">
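
To make the question concrete, here is a rough sketch of what I mean. It is a toy, dependency-free version of a Llama-style gated MLP, not the mergoo code itself; the function names, the `add_input` flag, and the tiny weight matrices are all made up for illustration.

```python
import math

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def silu(v):
    """Elementwise SiLU activation: a * sigmoid(a)."""
    return [a / (1.0 + math.exp(-a)) for a in v]

def gated_mlp(x, W_gate, W_up, W_down, add_input=False):
    """Toy gated MLP: down_proj(silu(gate_proj(x)) * up_proj(x)).

    add_input=True is the hypothetical variant asked about above:
    the original x is added to the down_proj output.
    """
    gate = silu(matvec(W_gate, x))
    up = matvec(W_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    out = matvec(W_down, hidden)
    if add_input:
        # Proposed change (assumption, not existing behavior): skip connection from x.
        out = [o + a for o, a in zip(out, x)]
    return out

# Made-up weights: model dim 2, intermediate dim 3.
x = [1.0, -2.0]
W_gate = [[0.5, -0.1], [0.2, 0.3], [-0.4, 0.6]]
W_up = [[0.1, 0.2], [-0.3, 0.4], [0.5, -0.5]]
W_down = [[0.3, -0.2, 0.1], [0.0, 0.4, -0.6]]

plain = gated_mlp(x, W_gate, W_up, W_down)
with_skip = gated_mlp(x, W_gate, W_up, W_down, add_input=True)
print(plain)
print(with_skip)
```

The two outputs differ by exactly `x`. One thing to keep in mind: the Hugging Face Llama decoder layer already wraps the MLP block in a residual connection, so if mergoo keeps that layout (I have not checked), this would be a second skip inside the MLP rather than the only one.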