Thanks, and I agree, make can work well for data pipelines.
When you're integrating many different data sources with a complicated set of scripts, it's important to automate what you can. The easy but impractical thing to do is rerun everything on every change. Make, properly used (meaning every step declares its real inputs and outputs), will rerun everything that needs to be run, in the correct order, and nothing else. GNU make is also great for running independent steps in parallel with the -j option.
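
To make that concrete, here is a rough sketch of what such a Makefile might look like for a two-source pipeline. All the file and script names (raw_a.csv, raw_b.csv, clean.py, merge.py, report.csv) are made up for illustration, and it assumes GNU make and that both scripts write their result to stdout; adapt it to however your scripts actually work.

```make
# Hypothetical pipeline: every file and script name here is a placeholder.

# If a recipe fails, delete its half-written target so it is not
# mistaken for an up-to-date file on the next run.
.DELETE_ON_ERROR:

.PHONY: all clean
all: report.csv

# One pattern rule covers every source: clean_a.csv is built from
# raw_a.csv, clean_b.csv from raw_b.csv, and so on. Listing clean.py
# as a prerequisite means editing the script also triggers a rerun.
clean_%.csv: raw_%.csv clean.py
	python clean.py $< > $@

# The merge step reruns only when a cleaned file or merge.py is newer
# than report.csv.
report.csv: clean_a.csv clean_b.csv merge.py
	python merge.py clean_a.csv clean_b.csv > $@

clean:
	rm -f clean_a.csv clean_b.csv report.csv
```

With something like this, touching raw_a.csv and running make should redo only clean_a.csv and report.csv, leaving clean_b.csv alone. Running make -j2 lets the two cleaning steps run at the same time, since neither depends on the other.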