Your concern about catastrophic forgetting is mostly unfounded in the regime of fine-tuning large diffusion models. Fine-tuning may cost some accuracy on certain downstream tasks, but the damage is generally not “catastrophic”. I believe this robustness comes from the attention mechanism, though I’m happy to be corrected.
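
If you want to put a number on it rather than eyeball samples, comparing the base and fine-tuned checkpoints on the same held-out batches is cheap. A minimal sketch, assuming a PyTorch model and using MSE as a stand-in metric (substitute whatever your downstream task actually cares about):

    import torch

    @torch.no_grad()
    def heldout_loss(model, batches):
        # average loss over held-out (input, target) pairs; MSE is a placeholder metric
        model.eval()
        losses = [torch.nn.functional.mse_loss(model(x), y) for x, y in batches]
        return torch.stack(losses).mean().item()

    # drift = heldout_loss(finetuned, val_batches) - heldout_loss(base, val_batches)
    # a small positive drift is "some damage"; a large jump is the catastrophic case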


I see; it was probably my high learning rate that caused the problems. To be honest, I was too lazy to retry full fine-tuning since LoRA worked so well, but I may revisit it in the future, perhaps with Qwen Image.
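
For anyone curious, the LoRA route is only a handful of lines with Hugging Face peft. A sketch, where the checkpoint id, rank, and target module names are illustrative for an SD-1.5-style UNet, not a recommendation:

    import torch
    from diffusers import UNet2DConditionModel
    from peft import LoraConfig, get_peft_model

    # load just the UNet; the checkpoint id here is illustrative
    unet = UNet2DConditionModel.from_pretrained(
        "stable-diffusion-v1-5/stable-diffusion-v1-5", subfolder="unet"
    )

    lora_config = LoraConfig(
        r=16,                 # adapter rank, illustrative
        lora_alpha=16,
        target_modules=["to_q", "to_k", "to_v", "to_out.0"],  # attention projections
    )
    unet = get_peft_model(unet, lora_config)  # base weights stay frozen
    unet.print_trainable_parameters()

    # only the adapters train, so a higher LR than full fine-tuning is usually safe
    optimizer = torch.optim.AdamW(
        (p for p in unet.parameters() if p.requires_grad), lr=1e-4
    )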


Perhaps what you were dealing with was actually exploding gradients from fp16 training, which _are_ prone to corrupting a model and are sensitive to the learning rate.
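
The usual guard rails for that are loss scaling plus gradient clipping. A minimal sketch of one fp16 training step, assuming a recent PyTorch with torch.amp; the model, data, and max_norm are placeholders:

    import torch

    model = torch.nn.Linear(512, 512).cuda()  # stand-in for the real model
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
    scaler = torch.amp.GradScaler("cuda")     # scales the loss so fp16 grads don't underflow

    def train_step(x, y):
        optimizer.zero_grad(set_to_none=True)
        with torch.autocast("cuda", dtype=torch.float16):
            loss = torch.nn.functional.mse_loss(model(x), y)
        scaler.scale(loss).backward()
        scaler.unscale_(optimizer)            # so clipping sees true-scale gradients
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        scaler.step(optimizer)                # skips the update if grads went inf/nan
        scaler.update()
        return loss.item()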



