I had many thousands of machines. I needed to collect disk size, ram, software i...

I had many thousands of machines. I needed to collect disk size, ram, software inventory, some custom config, if present. Some machines are Linux, some windows.

With prefect I created a task 'collect machine details for windows', another 'collect machine details for Linux', another 'collect software inventory'.

I have a list of machines in a database so I create a task to get them. That task is an sqlalchemy query so I can pass the task a filter.

I get a list of linux machines and pass that to a task to run. I get a list of windows machines and pass that to a task.

Note that the above don't depend on each other.

I have a task that filters good results from bad. I have another task that writes a list to a database.

Other tasks have credentials.

Another task puts errors to an error table, the machines that failed get filtered from the results and run into this task.

I plumb the above up with a Prefect flow and it builds a DAG that runs the flow. Everything that can be run in parallel does so, everything that has some other input waits for the input.

Tasks that fail can be retried by Prefect automatically. Intermediate results cached. And, I get a nice gui for everything. I can even schedule it in the gui.