I am fine-tuning mistralai/Mixtral-8x7B-Instruct-v0.1 on 8×A100 GPUs. Between two runs with an otherwise identical configuration, I changed only the learning rate. With a learning rate on the order of 1e-5, the model trains normally, but with one on the order of 1e-4, training gets stuck. What could be the reason for this?