Yes, because of a key word in a lot of copyright laws... "distribution". Using that copyrighted material themselves to train the model still gives them plausible deniability. Handing the copyrighted material to another group starts to run afoul of other laws and also removes the plausible deniability that the original group can claim regarding their training data.