I feel like the most valuable silos for that are LinkedIn, ClearanceJobs, and GovConNet. The conversations in those datasets are more often between humans and have money and/or trust on the line.
Among the sites which facilitate connections between strangers, those have the strongest incentives to ensure both parties are human. In terms of value for training, you’d want to avoid any conversations which are part of the "dead Internet" bot-to-bot problem.