Generalized, Modular, Real-time Task Creation and Execution System (v1.0) for Data Processing and Model Automation

Wednesday, September 2, 2015
Task Automation, Queue, and Operational System (TAQOS)

I am successfully testing a generalized, modular, real-time Python task creation and workflow execution system (about 2 months of design and implementation) for processing GOES satellite data. It uses aspects of the strategy, command, and singleton design patterns, along with a framework built on configuration and task completion JSON files, validation schemas, process status checks (psutil), dynamic plug-ins, and previously developed module classes for regular-expression searches of data directories and files.

I checked the alpha/beta software in to our CIRA internal Git repository, and a colleague, Chris Slocum, was able to check it out and then write and test a plug-in for his real-time post-processing of NWP data in a couple of days. Others and I will be able to use this system for *all* real-time data processing, modelling, and similar tasks just by writing plug-in task creation and task execution methods. This should reduce future development time for automation tasks by ~50-75%.
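To make the plug-in idea concrete, here is a minimal sketch of what a plug-in with a task creation method and a task execution method could look like, driven by a loop that skips tasks already recorded in task completion JSON files. This is not the TAQOS code: the class name, method names, dictionary keys, and the `run_driver` helper are all hypothetical and chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path


class ExamplePlugin:
    """Hypothetical plug-in; names and keys are illustrative, not the TAQOS API."""

    def create_tasks(self, config):
        # Build one task dictionary per input listed in the configuration.
        return [
            {"task_id": f"{config['name']}-{i}", "input": path}
            for i, path in enumerate(config["inputs"])
        ]

    def execute_task(self, task):
        # Real work (e.g. processing one data file) would happen here.
        return {"task_id": task["task_id"], "status": "done", "input": task["input"]}


# Hypothetical registry standing in for dynamic plug-in discovery.
PLUGINS = {"example": ExamplePlugin}


def run_driver(config, completed_dir):
    """Create tasks, run the ones not yet completed, and record completions."""
    plugin = PLUGINS[config["plugin"]]()
    results = []
    for task in plugin.create_tasks(config):
        marker = Path(completed_dir) / f"{task['task_id']}.json"
        if marker.exists():
            continue  # a completion file already exists, so skip this task
        result = plugin.execute_task(task)
        marker.write_text(json.dumps(result))
        results.append(result)
    return results


# Usage: the second run finds the completion files and does nothing.
workdir = tempfile.mkdtemp()
config = {"plugin": "example", "name": "goes", "inputs": ["a.nc", "b.nc"]}
first_run = run_driver(config, workdir)
second_run = run_driver(config, workdir)
```

In a design along these lines, the driver never needs to know what a task does. A new processing job only requires a new class with the same two methods.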

Next iterations: daemonizing the driver, refactoring the task library into a task class (version 2), and eventually incorporating the Celery distributed task queue system (version 3). Version upgrade compatibility issues should be minimal, since the interface between the driver/daemon and a plug-in uses only dictionaries that are validated against JSON schemas in each method call.
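The dictionary-only interface described above can be sketched roughly as follows. This is an assumption-laden illustration rather than the actual implementation: the schema, the key names, and the `validate` helper are hypothetical, and the hand-written check is a simplified stand-in for a real JSON schema validator, using only the standard library.

```python
import json

# Hypothetical, simplified schema: required keys mapped to expected Python types.
TASK_SCHEMA = {"task_id": str, "input": str, "priority": int}


def validate(message, schema):
    """Raise ValueError if a required key is missing or has the wrong type."""
    for key, expected_type in schema.items():
        if key not in message:
            raise ValueError(f"missing key: {key}")
        if not isinstance(message[key], expected_type):
            raise ValueError(f"wrong type for key: {key}")
    # Round-trip through JSON to confirm the dictionary is serializable.
    return json.loads(json.dumps(message))


def execute_task(task):
    """Hypothetical plug-in entry point that validates its input on every call."""
    task = validate(task, TASK_SCHEMA)
    return {"task_id": task["task_id"], "status": "done"}


good = execute_task({"task_id": "goes-0", "input": "a.nc", "priority": 1})

try:
    execute_task({"task_id": "goes-1", "input": "b.nc"})  # no "priority" key
    rejected = False
except ValueError:
    rejected = True
```

Because only plain, validated dictionaries cross the boundary, the driver could later become a daemon or a distributed queue worker without the plug-in code having to change, as long as the schemas stay the same.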