An Intro to Workflow Management with Prefect
Kevin Kho
Open Source Community Engineer, Prefect
Prefect is a modern, open-source workflow management system with native Dask support. It can handle large-scale data pipelines made up of many small tasks, because users can run them on the Dask Executor and take advantage of Dask’s millisecond-latency task scheduler. Running on Dask also parallelizes Task execution and uses distributed compute with minimal overhead.
In an interactive demo, we’ll go over Prefect’s basic concepts, such as Flows, Tasks, and Parameters. We’ll then move on to more advanced topics such as mapping and conditional logic, which let us dynamically create Tasks inside a Flow. During the demo, we will deploy a Flow locally and then show how seamless it is to port the Flow to a Dask cluster in the cloud.
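To make the idea of mapping concrete, here is a minimal sketch that uses only the Python standard library. It is not Prefect's API: `double`, `total`, and `run_flow` are hypothetical names chosen for illustration, and a `concurrent.futures` executor stands in for the Dask Executor. It shows how the number of task runs can be decided at run time by a parameter value, and how the executor can be swapped without touching the flow logic.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Iterable, List


def double(x: int) -> int:
    """A small unit of work, standing in for a Task (hypothetical)."""
    return 2 * x


def total(values: Iterable[int]) -> int:
    """A downstream step that reduces the mapped results (hypothetical)."""
    return sum(values)


def run_flow(numbers: List[int], executor: Executor) -> int:
    """Stand-in for a Flow; `numbers` plays the role of a Parameter."""
    # Mapping: one unit of work is created per input element, so the
    # number of task runs depends on the parameter value at run time.
    doubled = list(executor.map(double, numbers))
    return total(doubled)


if __name__ == "__main__":
    # A local thread pool here; any Executor-compatible object could be
    # passed instead without changing run_flow.
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(run_flow([1, 2, 3], pool))
```

Because `run_flow` only depends on the executor interface, moving from local to distributed execution in this sketch means passing a different executor, which mirrors the local-to-cluster port described above.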
About the speaker
Kevin is an Open Source Community Engineer at Prefect, an open-source workflow orchestration system. Previously, he was a data scientist at Paylocity, where he worked on adding machine learning features to their Human Capital Management (HCM) Suite. Outside of work, he is a contributor to Fugue, an abstraction layer for distributed compute. He also organizes the Orlando Machine Learning and Data Science Meetup.
Tracks: Enterprise Data · Startup Data · Data Platform