
Running an Experiment

Now that you have downloaded OpenDC, we can create a simple experiment. In this experiment we will compare the performance of a small and a big data center on the same workload.

info

In this tutorial, we will learn how to create and execute a simple experiment in OpenDC.

This tutorial is based on the "Simple demo" which can be found on the OpenDC GitHub.

Download the demo [here] to run it in an interactive notebook.

Running this demo requires OpenDC. Download the latest release here and put it in this folder.

1. Designing a Data Center

The first requirement to run an experiment in OpenDC is a topology. A topology defines the hardware on which a workload is executed. Larger topologies are capable of running more workloads, and will often complete them faster.

A topology is defined using a JSON file. A topology contains one or more clusters: groups of hosts at a specific location. Each cluster consists of one or more hosts. A host is a machine on which one or more tasks can be executed. Hosts are composed of a CPU and a memory unit.

Small Data Center

In this experiment, we compare two data centers. Below is an example of the small topology file:

{
    "clusters": [
        {
            "name": "C01",
            "hosts": [
                {
                    "name": "H01",
                    "cpu": {
                        "coreCount": 12,
                        "coreSpeed": 3300
                    },
                    "memory": {
                        "memorySize": 140457600000
                    }
                }
            ]
        }
    ]
}

This topology consists of a single cluster with a single host.

The topology file can be found here
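As a quick sanity check, a topology file can be inspected with a few lines of Python. The sketch below embeds the small topology as a string for self-containedness; in practice you would read the JSON file from disk instead.

```python
import json

# The small topology, embedded as a string for illustration.
small_topology = json.loads("""
{
    "clusters": [
        {
            "name": "C01",
            "hosts": [
                {
                    "name": "H01",
                    "cpu": {"coreCount": 12, "coreSpeed": 3300},
                    "memory": {"memorySize": 140457600000}
                }
            ]
        }
    ]
}
""")

# Count clusters, hosts, and the total number of cores.
clusters = small_topology["clusters"]
hosts = [h for c in clusters for h in c["hosts"]]
total_cores = sum(h["cpu"]["coreCount"] for h in hosts)

print(len(clusters), len(hosts), total_cores)  # 1 cluster, 1 host, 12 cores
```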

Large Data Center

We compare the previous data center with a larger data center, defined by the following topology file:

{
    "clusters": [
        {
            "name": "C01",
            "hosts": [
                {
                    "name": "H01",
                    "cpu": {
                        "coreCount": 32,
                        "coreSpeed": 3200
                    },
                    "memory": {
                        "memorySize": 256000
                    }
                }
            ]
        },
        {
            "name": "C02",
            "hosts": [
                {
                    "name": "H02",
                    "count": 6,
                    "cpu": {
                        "coreCount": 8,
                        "coreSpeed": 2930
                    },
                    "memory": {
                        "memorySize": 64000
                    }
                }
            ]
        },
        {
            "name": "C03",
            "hosts": [
                {
                    "name": "H03",
                    "count": 2,
                    "cpu": {
                        "coreCount": 16,
                        "coreSpeed": 3200
                    },
                    "memory": {
                        "memorySize": 128000
                    }
                }
            ]
        }
    ]
}

Compared to the small topology, the big topology consists of three clusters. Each cluster defines a single host type; the count field replicates that host, so cluster C02 contains six hosts and cluster C03 contains two.

The topology file can be found here
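To see how much hardware this adds up to, the sketch below parses the large topology and tallies hosts and cores, assuming a missing count field means a single host (the topology is embedded as a string for self-containedness).

```python
import json

# The large topology, embedded for illustration.
large_topology = json.loads("""
{
    "clusters": [
        {"name": "C01", "hosts": [{"name": "H01",
            "cpu": {"coreCount": 32, "coreSpeed": 3200},
            "memory": {"memorySize": 256000}}]},
        {"name": "C02", "hosts": [{"name": "H02", "count": 6,
            "cpu": {"coreCount": 8, "coreSpeed": 2930},
            "memory": {"memorySize": 64000}}]},
        {"name": "C03", "hosts": [{"name": "H03", "count": 2,
            "cpu": {"coreCount": 16, "coreSpeed": 3200},
            "memory": {"memorySize": 128000}}]}
    ]
}
""")

total_hosts = 0
total_cores = 0
for cluster in large_topology["clusters"]:
    for host in cluster["hosts"]:
        count = host.get("count", 1)  # "count" replicates a host definition
        total_hosts += count
        total_cores += count * host["cpu"]["coreCount"]

print(total_hosts, total_cores)  # 9 hosts, 112 cores in total
```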

tip

For more in-depth information about topologies, see Topology

2. Workloads

Next to the topology, we need a workload to simulate on the data center. In OpenDC, workloads are defined as a bag of tasks. Each task is accompanied by one or more fragments. These fragments define the computational requirements of the task over time. For this experiment, we will use the bitbrains-small workload. This is a small workload of 50 tasks, spanning a bit more than a month.

Workload traces define when tasks are submitted and their computational requirements. A workload consists of two trace files in Parquet format:

  • tasks.parquet provides a general overview of the tasks executed during the workload. It defines when tasks are scheduled and the hardware they require.
  • fragments.parquet provides detailed information about each task during its runtime.
Input
import pandas as pd

df_tasks = pd.read_parquet("workload_traces/bitbrains-small/tasks.parquet")
df_fragments = pd.read_parquet("workload_traces/bitbrains-small/fragments.parquet")

df_tasks.head()

Output
   id      submission_time    duration  cpu_count  cpu_capacity  mem_capacity
0  1019  2013-08-12 13:35:46  2592252000          1          2926        181352
1  1023  2013-08-12 13:35:46  2592252000          1          2926        260096
2  1026  2013-08-12 13:35:46  2592252000          1          2926        249972
3  1052  2013-08-29 14:38:12   577855000          1          2926        131245
4  1073  2013-08-21 11:07:12  1823566000          1          2600        179306
Input
df_fragments.head()

Output
   id  duration  cpu_count  cpu_usage
0  1019    300000          1      0.000
1  1019    300000          1     11.704
2  1019    600000          1      0.000
3  1019    300000          1     11.704
4  1019    900000          1      0.000
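Since each task is split into fragments, per-task statistics can be recovered by grouping on the task id. The sketch below rebuilds a few fragment rows following the column layout shown above (durations assumed to be in milliseconds) and sums the traced runtime of task 1019.

```python
import pandas as pd

# A handful of fragment rows, following the schema of fragments.parquet.
df = pd.DataFrame({
    "id":        [1019, 1019, 1019, 1019, 1019],
    "duration":  [300000, 300000, 600000, 300000, 900000],
    "cpu_count": [1, 1, 1, 1, 1],
    "cpu_usage": [0.0, 11.704, 0.0, 11.704, 0.0],
})

# Total traced runtime per task, in milliseconds.
runtime_ms = df.groupby("id")["duration"].sum()
print(runtime_ms[1019])  # 2400000 ms = 40 minutes of fragments
```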

3. Executing an experiment

To run an experiment, we need to create an experiment file. This is a JSON file, that defines what should be executed by OpenDC, and how. Below is an example of a simple experiment file:

{
"name": "simple",
"topologies": [
{
"pathToFile": "topologies/small_datacenter.json"
},
{
"pathToFile": "topologies/big_datacenter.json"
}
],
"workloads": [
{
"pathToFile": "workload_traces/bitbrains-small",
"type": "ComputeWorkload"
}
],
"exportModels": [
{
"exportInterval": 3600,
"printFrequency": 168,
"filesToExport": [
"host",
"powerSource",
"service",
"task"
]
}
]
}

The experiment file defines four parameters. First is the name, which determines what the experiment is called in the output folder. Second is topologies, which tells OpenDC where to find the topology files. Third is workloads, which defines which workload OpenDC should run. Finally, exportModels defines how OpenDC should export its results. In this case we set exportInterval, printFrequency, and filesToExport. exportInterval and printFrequency determine how often OpenDC samples output and prints progress to the terminal, while filesToExport specifies which output files should be produced.
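The weekly progress prints in the run below follow from these two settings. Assuming exportInterval is in seconds and printFrequency counts export intervals between prints, a quick calculation shows one print per simulated week:

```python
# exportInterval is in seconds: one output sample per simulated hour.
export_interval_s = 3600

# printFrequency counts export intervals between progress prints:
# 168 intervals of one hour = one simulated week per terminal update.
print_frequency = 168

seconds_per_print = export_interval_s * print_frequency
print(seconds_per_print, seconds_per_print / 86400)  # 604800 s = 7.0 days
```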

As you can see, both topologies and workloads are defined as lists. This allows the user to define multiple values. OpenDC will run a simulation for each separate combination of parameter values. In this case, two simulations will be run: one with the small topology, and one with the big topology.
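Conceptually, the set of simulations is the Cartesian product of the parameter lists. A minimal sketch of this expansion, using the paths from the experiment file above:

```python
from itertools import product

topologies = ["topologies/small_datacenter.json",
              "topologies/big_datacenter.json"]
workloads = ["workload_traces/bitbrains-small"]

# One simulation per (topology, workload) combination:
# 2 topologies x 1 workload = 2 scenarios.
scenarios = list(product(topologies, workloads))
print(len(scenarios))  # 2
```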

tip

For more in-depth information about export models, see ExportModel.

For more in-depth information about experiments, see Experiment

4. Running OpenDC

An experiment in OpenDC can be executed directly from the terminal. The only parameter that needs to be provided is --experiment-path, the path to the experiment file we defined in step 3. While running the experiment, OpenDC periodically prints information about the status of the simulation. In this experiment, OpenDC prints every simulated week, but this can be changed using the exportModel.

Input
import subprocess

pathToScenario = "experiments/simple_experiment.json"
subprocess.run(["OpenDCExperimentRunner/bin/OpenDCExperimentRunner", "--experiment-path", pathToScenario])

Output

================================================================================
Running scenario: 0
================================================================================
Starting seed: 0

Simulating... 0% [ ] 0/2 (0:00:00 / ?)
12:54:19.045 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 1680 hours: Tasks Total: 50 Tasks Active: 1 Tasks Pending: 39 Tasks Completed: 10 Tasks Terminated: 0

12:54:19.269 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 3360 hours: Tasks Total: 50 Tasks Active: 1 Tasks Pending: 37 Tasks Completed: 12 Tasks Terminated: 0

12:54:19.471 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 5040 hours: Tasks Total: 50 Tasks Active: 3 Tasks Pending: 32 Tasks Completed: 15 Tasks Terminated: 0

12:54:19.724 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 6720 hours: Tasks Total: 50 Tasks Active: 3 Tasks Pending: 26 Tasks Completed: 21 Tasks Terminated: 0

12:54:19.883 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 8400 hours: Tasks Total: 50 Tasks Active: 2 Tasks Pending: 18 Tasks Completed: 30 Tasks Terminated: 0

12:54:19.913 [WARN ] org.opendc.compute.simulator.service.ComputeService - Failed to spawn Task[uid=00000000-0000-0000-8c8a-1f148a8bb259,name=740,state=PROVISIONING]: does not fit
12:54:19.979 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 10080 hours: Tasks Total: 50 Tasks Active: 5 Tasks Pending: 8 Tasks Completed: 36 Tasks Terminated: 1

12:54:20.043 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 11760 hours: Tasks Total: 50 Tasks Active: 3 Tasks Pending: 0 Tasks Completed: 46 Tasks Terminated: 1

================================================================================
Running scenario: 1
================================================================================
Starting seed: 0

Simulating... 100% [=================================] 2/2 (0:00:03 / 0:00:00)

CompletedProcess(args=['OpenDCExperimentRunner/bin/OpenDCExperimentRunner', '--experiment-path', 'experiments/simple_experiment.json'], returncode=0)

Running the simulation creates an output folder containing information about the experiment. In the next tutorial we will use these files for analysis and visualization.