Keep kv cache as list of tensors maybe better than one tensor #1562

Open
@ghost

Description

Describe the bug
If we keep the KV cache as a list of tensors, there is no need to concatenate the KV caches of the decoder blocks into one tensor (https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/gemma/gemma_causal_lm.py#L225). Avoiding that copy would help model performance.
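A minimal numpy sketch of the difference, assuming an illustrative cache layout of `(batch, 2, seq_len, num_heads, head_dim)` per layer (the exact shapes and the `axis=1` stack mirror what the linked Gemma code does, but the numbers here are made up):

```python
import numpy as np

num_layers, batch, heads, seq_len, head_dim = 2, 1, 4, 8, 16

# One KV cache tensor per decoder block; key and value are stacked on
# axis 1 of each per-layer cache (illustrative layout, not the exact one).
layer_caches = [
    np.zeros((batch, 2, seq_len, heads, head_dim)) for _ in range(num_layers)
]

# Current approach: stack the per-layer caches into one big tensor.
# This copies every layer's cache into new memory.
stacked = np.stack(layer_caches, axis=1)
# -> shape (batch, num_layers, 2, seq_len, heads, head_dim)

# Proposed approach: keep the list as-is. Updating one layer's cache
# touches only that layer's tensor; no stack/concatenate is needed.
layer_caches[0][:, 0, 3] = 1.0  # write layer 0's key cache at position 3

# The stacked copy is stale: it does not see the in-place update,
# so the stacked layout forces a fresh copy on every decode step.
print(stacked[0, 0, 0, 3].sum())  # still zeros in the copied tensor
```

This is only a sketch of the memory-traffic argument; whether the list form fits keras_nlp's cache-passing API (e.g. through `call_with_cache`) is a separate design question.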

Expected behavior
Remove the unnecessary concatenation to improve performance.
