Scientific workflows are an approach to implementing automated, scalable, portable, and reproducible data analyses and in-silico experiments at low development cost.
What sets this approach apart from other distributed computing paradigms is its focus on the composition of programs. As a bioinformatics example, the output of a program that aligns reads to a reference genome is often processed by another program that analyzes genomic variants. Each of the programs is assumed to be readily available and is treated as a black box.
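The composition idea can be sketched in a few lines. Below is a minimal illustration, assuming two hypothetical black-box steps ("align" and "call_variants") modeled as Python functions; a real workflow system would invoke external programs and handle data movement, but the dependency structure is the same.

```python
# Minimal sketch: a workflow as a DAG of black-box steps.
# "align" and "call_variants" are stand-ins for external programs.
from graphlib import TopologicalSorter

def align(reads, reference):
    # Stand-in for a read aligner: produces an "alignment" artifact.
    return f"alignment({reads},{reference})"

def call_variants(alignment):
    # Stand-in for a variant caller consuming the aligner's output.
    return f"variants({alignment})"

# Each task maps to the set of tasks it depends on.
dag = {"align": set(), "call_variants": {"align"}}

results = {}
for task in TopologicalSorter(dag).static_order():
    if task == "align":
        results[task] = align("reads.fq", "ref.fa")
    else:
        results[task] = call_variants(results["align"])

print(results["call_variants"])  # → variants(alignment(reads.fq,ref.fa))
```

The workflow system's job is exactly this orchestration: run each black box once its inputs are ready, without knowing anything about its internals.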
Continue reading “A Gentle Introduction to Scientific Workflow Languages and Systems”
My literature review has been accepted in Information Systems. It covers methods to predict the resource usage of batch computing jobs and is available here. Continue reading “Predictive Batch Scheduling Survey Accepted in Information Systems”
Last week, my paper on randomized task graph scheduling was accepted at the Workshop on Workflows in Support of Large-Scale Science (WORKS), co-located with SC18 in Dallas.
My idea behind this paper was to improve on the extremely good performance of the HEFT scheduler I had observed in various experiments. My approach was to allocate a larger time budget for exploring variations of HEFT’s usually already good schedules.
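A toy sketch of the time-budget idea (not the paper's algorithm, and with HEFT simplified to greedy list scheduling of independent tasks): start from a deterministic priority order, then spend the remaining budget on randomly perturbed orders, keeping the best makespan found.

```python
# Toy sketch: budgeted randomization around a deterministic list schedule.
# Tasks are assumed independent here for brevity; HEFT proper handles
# task graphs with communication costs.
import random

def list_schedule(order, runtimes, n_procs):
    # Greedy list scheduling: assign each task to the processor
    # that becomes free earliest; return the makespan.
    free_at = [0.0] * n_procs
    for task in order:
        p = min(range(n_procs), key=lambda i: free_at[i])
        free_at[p] += runtimes[task]
    return max(free_at)

runtimes = {"a": 4, "b": 3, "c": 2, "d": 2, "e": 1}
# Deterministic baseline: longest tasks first (a HEFT-like priority).
base_order = sorted(runtimes, key=runtimes.get, reverse=True)
best = list_schedule(base_order, runtimes, n_procs=2)

rng = random.Random(0)
for _ in range(100):  # the extra "time budget"
    order = base_order[:]
    i, j = rng.sample(range(len(order)), 2)  # small random perturbation
    order[i], order[j] = order[j], order[i]
    best = min(best, list_schedule(order, runtimes, 2))

print(best)
```

Since each perturbed schedule is cheap to evaluate, the budget buys many tries, and the result can only match or improve the deterministic baseline.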
Continue reading “Randomized Scheduling Paper Accepted at WORKS”
At the beginning of my computer science PhD, I wrote a literature review and found it to be a challenging project. I’d like to share some insights on planning the review, sifting and organizing the material, and the challenges of the process.
Continue reading “Writing a Literature Review”
I’ve reached the end of my first year as a PhD student. This is a brief summary of my topic, progress, and future directions.
Continue reading “My PhD Topic: Resource Consumption Prediction for Distributed Scientific Workflows”