Suggestions for experiments for the HPDC paper #1

@mturilli

Experiments

Here are some ideas and notes on the experiments we may want to design, set up, and run for the HPDC paper. Happy to discuss each experiment further if you find this interesting or useful.

Experiment 1

Done, results at https://github.com/radical-experiments/hyperspace_experiments/blob/master/analysis/nonuniform_tasks/nonuniform_tasks.ipynb

Experiment 2

Building on Experiment 1, shows better resource utilization by keeping TTX constant while reducing the total amount of resources required and controlling the number of execution generations for each task duration.

Design

  • 120 1-core tasks
  • task durations: 10s, 100s, 1000s
  • Ideal TTX: 1000s
  • Actual TTX: 1000s + RP/resource overheads.
  • variable ratio among task durations (as in the first experiments)
  • Decreasing amount of resources for each given ratio
  • Increasing number of generations for 10s and 100s tasks, with Sum Tx(generation) <= 1000s.

Setup

  • task duration ratio: 40-40-40
  • 40 1000s tasks: 1 execution generation
  • 40 100s tasks: 2, 4, 8, 10 execution generations
  • 40 10s tasks: 2, 4, 8, 16, 32, 40 generations
  • Number of cores requested: 120, 80, 60, 50, 47, 46, 45.
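
| Run ID | #T_1000s | #T_100s | #T_10s | #G(T_1000s) | #G(T_100s) | #G(T_10s) | #Cores | TTX ideal | RU |
|--------|----------|---------|--------|-------------|------------|-----------|--------|-----------|----|
| 1      | 40       | 40      | 40     | 1           | 2          | 2         | 120    | 1000s     | ?  |
| 2      | 40       | 40      | 40     | 1           | 4          | 4         | 80     | 1000s     | ?  |
| 3      | 40       | 40      | 40     | 1           | 8          | 8         | 60     | 1000s     | ?  |
| 4      | 40       | 40      | 40     | 1           | 10         | 16        | 47     | 1000s     | ?  |
| 5      | 40       | 40      | 40     | 1           | 10         | 32        | 46     | 1000s     | ?  |
| 6      | 40       | 40      | 40     | 1           | 10         | 40        | 45     | 1000s     | ?  |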

Legend

  • Run ID: Number of the experiment run
  • #T_1000s: Number of tasks with 1000s duration
  • #T_100s: Number of tasks with 100s duration
  • #T_10s: Number of tasks with 10s duration
  • #G(T_1000s): Number of generations for executing tasks with 1000s duration
  • #G(T_100s): Number of generations for executing tasks with 100s duration
  • #G(T_10s): Number of generations for executing tasks with 10s duration
  • #Cores: Number of cores used to execute all the given tasks
  • TTX ideal: Ideal total execution time of all the given tasks
  • RU: Resource utilization
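
As a rough guide to what the RU column could look like, here is a minimal sketch that computes an ideal RU for each core count in the table. It assumes RU is defined as busy core-seconds divided by allocated core-seconds over the ideal TTX, and it ignores RP/resource overheads. The name `ideal_ru` and that definition are assumptions for illustration, not something fixed in this issue.

```python
# Hypothetical helper: ideal resource utilization for the 40-40-40 setup.
# Assumption: RU = (sum of task runtimes) / (cores * ideal TTX), no overheads.

TASKS = {1000: 40, 100: 40, 10: 40}  # task duration in seconds -> number of 1-core tasks
TTX_IDEAL = 1000                     # ideal total execution time in seconds


def ideal_ru(cores, tasks=TASKS, ttx=TTX_IDEAL):
    """Return busy core-seconds divided by allocated core-seconds."""
    busy = sum(duration * count for duration, count in tasks.items())
    return busy / (cores * ttx)


if __name__ == "__main__":
    # Core counts taken from the #Cores column of the table.
    for run_id, cores in enumerate([120, 80, 60, 47, 46, 45], start=1):
        print(f"run {run_id}: {cores} cores, ideal RU = {ideal_ru(cores):.3f}")
```

Under these assumptions the ideal RU rises from 0.37 at 120 cores to about 0.99 at 45 cores. Measured values would be lower once overheads are included.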

Notes:

  • We can write an equation to calculate the minimal number of cores required to reach the maximal resource utilization with the minimal total execution time, given a set of tasks with known heterogeneous execution times.
  • You want to implement this experiment using EnTK and 3 concurrent pipelines. In this way, you guarantee that the tasks with the longest runtime always start at the same time (1st stage of their pipeline). Generalizing, you want a workflow with N pipelines, where N is the number of distinct task execution times or, more formally, the number of blocks in the partition of the tasks by runtime.
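
One candidate form of the equation mentioned in the first note, given here as a sketch rather than a final result: for each duration class i with n_i tasks of runtime t_i, run at most floor(TTX / t_i) generations, which needs ceil(n_i / floor(TTX / t_i)) cores, and sum over the classes. This assumes 1-core tasks, no overheads, and that cores are not shared across duration classes (consistent with one pipeline per class). The function name `min_cores` is hypothetical.

```python
import math


def min_cores(tasks, ttx):
    """Estimate the cores needed to finish all tasks within ttx seconds.

    tasks: dict mapping task duration (s) -> number of 1-core tasks.
    Assumption: each duration class runs on its own cores, back to back,
    in as many generations as fit within ttx.
    """
    total = 0
    for duration, count in tasks.items():
        if duration > ttx:
            raise ValueError(f"a {duration}s task cannot finish within {ttx}s")
        max_generations = ttx // duration          # generations that fit in ttx
        total += math.ceil(count / max_generations)
    return total


if __name__ == "__main__":
    # The 40-40-40 setup with an ideal TTX of 1000s.
    print(min_cores({1000: 40, 100: 40, 10: 40}, 1000))
```

For the 40-40-40 setup this gives 40 + 4 + 1 = 45 cores, which matches the smallest core count listed in the Setup.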

Experiment 3

Shows that the results observed in Experiment 1 apply to real-life workflows whose tasks have an actual distribution of execution times, and thus shows the insight we can get about resource utilization for an actual workflow. As Experiment 1, but with the distribution of task execution times measured by executing one of the scientific workflows of the paper (choose the most interesting one from a scientific point of view).

Experiment 4

Shows that we can maximize resource utilization while keeping the workflow execution time as close as feasible to its ideal total execution time. As Experiment 2, but with the same distribution of task execution times as in Experiment 3, and only for the run with maximal resource utilization, i.e., the run with the minimal number of cores.

Experiment 5

Applies what was learned in the previous experiments to an actual workflow, maximizing its resource utilization while minimizing its execution time for a given resource. Analyze the execution of the scientific workflow used for Experiment 3 and define all the unique ratios between heterogeneous tasks. For example, imagine that across the execution of the whole workflow we have 4 distinct ratios of 3 types of tasks. We would then have 4 cases of Experiment 1. We would apply the equation derived for Experiment 2 to each case and calculate the optimal resource utilization as done in Experiment 4.
