I ended up cleaning it up to share it and got sucked back into the project. >:'(((( Spent 2 hours today in RawTherapee trying to improve scan contrasts :'((((
I hope to post a more cleaned-up version of at least the dataset code. The alignment code is extremely messy and is in the Discord server (licensed under 'free to steal as many ideas as you want from', basically).
Unless you've, like, named all the variables with ethnic slurs, you're not going to get in trouble because you've just copied and pasted the same subroutine twelve times instead of refactoring it into something clean.