Running an Experiment
Now that you have downloaded OpenDC, we will create a simple experiment. In this experiment, we will compare the performance of a small and a big data center running the same workload.
In this tutorial, you will learn how to create and execute a simple experiment in OpenDC.
This tutorial is based on the "Simple demo", which can be found on the OpenDC GitHub.
Download the demo [here] to run it in an interactive notebook.
Running this demo requires OpenDC. Download the latest release here and put it in this folder.
1. Designing a Data Center
The first requirement to run an experiment in OpenDC is a topology. A topology defines the hardware on which a workload is executed. Larger topologies can run more workloads and will often complete them more quickly.
A topology is defined using a JSON file. A topology contains one or more clusters. Clusters are groups of hosts at a specific location. Each cluster consists of one or more hosts. A host is a machine on which one or more tasks can be executed. Hosts are composed of a CPU and a memory unit.
Small Data Center
In this experiment, we compare two data centers. Below is an example of the small topology file:
{
    "clusters": [
        {
            "name": "C01",
            "hosts": [
                {
                    "name": "H01",
                    "cpu": {
                        "coreCount": 12,
                        "coreSpeed": 3300
                    },
                    "memory": {
                        "memorySize": 140457600000
                    }
                }
            ]
        }
    ]
}
This topology consists of a single cluster with a single host.
The topology file can be found here.
Large Data Center
We compare the previous data center with a larger data center defined by the following topology file:
{
    "clusters": [
        {
            "name": "C01",
            "hosts": [
                {
                    "name": "H01",
                    "cpu": {
                        "coreCount": 32,
                        "coreSpeed": 3200
                    },
                    "memory": {
                        "memorySize": 256000
                    }
                }
            ]
        },
        {
            "name": "C02",
            "hosts": [
                {
                    "name": "H02",
                    "count": 6,
                    "cpu": {
                        "coreCount": 8,
                        "coreSpeed": 2930
                    },
                    "memory": {
                        "memorySize": 64000
                    }
                }
            ]
        },
        {
            "name": "C03",
            "hosts": [
                {
                    "name": "H03",
                    "count": 2,
                    "cpu": {
                        "coreCount": 16,
                        "coreSpeed": 3200
                    },
                    "memory": {
                        "memorySize": 128000
                    }
                }
            ]
        }
    ]
}
Compared to the small topology, the big topology consists of three clusters. Each cluster defines a single host type; clusters C02 and C03 use the count field to include multiple identical hosts (six and two, respectively).
The topology file can be found here.
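To get a feel for how much larger the second topology is, you can total the cores and memory defined in each JSON file. The snippet below is a minimal sketch: the file names match the paths used in the experiment file later in this tutorial, and the optional count field is assumed to default to 1 when absent.
import json

def summarize(path):
    """Sum the cores and memory over all hosts in a topology file."""
    with open(path) as f:
        topology = json.load(f)
    cores = memory = 0
    for cluster in topology["clusters"]:
        for host in cluster["hosts"]:
            count = host.get("count", 1)  # assumed to default to 1 when absent
            cores += count * host["cpu"]["coreCount"]
            memory += count * host["memory"]["memorySize"]
    return cores, memory

for path in ["topologies/small_datacenter.json", "topologies/big_datacenter.json"]:
    cores, memory = summarize(path)
    print(f"{path}: {cores} cores, {memory} memory (units as given in the file)")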
For more in-depth information about topologies, see Topology.
2. Workloads
Besides the topology, we need a workload to simulate on the data center. In OpenDC, workloads are defined as a bag of tasks. Each task is accompanied by one or more fragments. These fragments define the computational requirements of the task over time. For this experiment, we will use the bitbrains-small workload. This is a small workload of 50 tasks, spanning a bit more than a month.
Workload traces define when tasks are submitted and their computational requirements. A workload consists of two Parquet trace files:
- tasks.parquet provides a general overview of the tasks executed during the workload. It defines when tasks are scheduled and the hardware they require.
- fragments.parquet provides detailed information about each task during its runtime.
Input
import pandas as pd
df_tasks = pd.read_parquet("workload_traces/bitbrains-small/tasks.parquet")
df_fragments = pd.read_parquet("workload_traces/bitbrains-small/fragments.parquet")
df_tasks.head()
Output
|   | id | submission_time | duration | cpu_count | cpu_capacity | mem_capacity |
|---|---|---|---|---|---|---|
| 0 | 1019 | 2013-08-12 13:35:46 | 2592252000 | 1 | 2926 | 181352 |
| 1 | 1023 | 2013-08-12 13:35:46 | 2592252000 | 1 | 2926 | 260096 |
| 2 | 1026 | 2013-08-12 13:35:46 | 2592252000 | 1 | 2926 | 249972 |
| 3 | 1052 | 2013-08-29 14:38:12 | 577855000 | 1 | 2926 | 131245 |
| 4 | 1073 | 2013-08-21 11:07:12 | 1823566000 | 1 | 2600 | 179306 |
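Using the df_tasks dataframe loaded above, we can check the claim that the workload contains 50 tasks spanning a bit more than a month. This is a small sketch that assumes the duration column is given in milliseconds.
# Sanity check on the workload size and time span
# (the duration column is assumed to be in milliseconds).
print("Number of tasks:", len(df_tasks))

start = df_tasks["submission_time"].min()
end = (df_tasks["submission_time"] + pd.to_timedelta(df_tasks["duration"], unit="ms")).max()
print("Workload spans from", start, "to", end)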
Input
df_fragments.head()
Output
|   | id | duration | cpu_count | cpu_usage |
|---|---|---|---|---|
| 0 | 1019 | 300000 | 1 | 0 |
| 1 | 1019 | 300000 | 1 | 11.704 |
| 2 | 1019 | 600000 | 1 | 0 |
| 3 | 1019 | 300000 | 1 | 11.704 |
| 4 | 1019 | 900000 | 1 | 0 |
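The fragments of a task describe its resource usage over consecutive intervals of its runtime, so the fragment durations of a task should (roughly) add up to the duration listed in tasks.parquet. The sketch below checks this using the dataframes loaded above, again assuming both duration columns are in milliseconds.
# Compare the summed fragment durations of each task with the task's
# duration from tasks.parquet (both assumed to be in milliseconds).
fragment_totals = df_fragments.groupby("id")["duration"].sum()
comparison = df_tasks.set_index("id")["duration"].to_frame("task_duration")
comparison["fragment_total"] = fragment_totals
print(comparison.head())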
3. Executing an Experiment
To run an experiment, we need to create an experiment file. This is a JSON file that defines what should be executed by OpenDC, and how. Below is an example of a simple experiment file:
{
    "name": "simple",
    "topologies": [
        {
            "pathToFile": "topologies/small_datacenter.json"
        },
        {
            "pathToFile": "topologies/big_datacenter.json"
        }
    ],
    "workloads": [
        {
            "pathToFile": "workload_traces/bitbrains-small",
            "type": "ComputeWorkload"
        }
    ],
    "exportModels": [
        {
            "exportInterval": 3600,
            "printFrequency": 168,
            "filesToExport": [
                "host",
                "powerSource",
                "service",
                "task"
            ]
        }
    ]
}
The experiment file defines four parameters. First is the name, which defines how the experiment is called in the output folder. Second is topologies, which defines where OpenDC can find the topology files. Third is workloads, which defines which workload OpenDC should run. Finally, exportModels defines how OpenDC should export its results.
In this case we set the exportInterval, the printFrequency, and the filesToExport. The exportInterval and the printFrequency determine how often OpenDC samples its output and how often it prints to the terminal: with an exportInterval of 3600 (one sample per simulated hour) and a printFrequency of 168 samples, OpenDC prints a status update once per simulated week. Using filesToExport we specify that we only want to output specific files.
As you can see, both topologies and workloads are defined as lists. This allows the user to define multiple values. OpenDC will run a simulation for each separate combination of parameter values. In this case two simulations will be run: one with the small topology and one with the big topology.
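Because topologies and workloads are plain JSON lists, an experiment file can also be generated from a script instead of being written by hand. The sketch below rebuilds the experiment shown above; it assumes you want to write it to experiments/simple_experiment.json, the path used in the next step.
import json

# Build the experiment definition programmatically and write it to disk.
experiment = {
    "name": "simple",
    "topologies": [
        {"pathToFile": "topologies/small_datacenter.json"},
        {"pathToFile": "topologies/big_datacenter.json"},
    ],
    "workloads": [
        {"pathToFile": "workload_traces/bitbrains-small", "type": "ComputeWorkload"}
    ],
    "exportModels": [
        {
            "exportInterval": 3600,
            "printFrequency": 168,
            "filesToExport": ["host", "powerSource", "service", "task"],
        }
    ],
}

with open("experiments/simple_experiment.json", "w") as f:
    json.dump(experiment, f, indent=4)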
For more in-depth information about ExportModels, see ExportModel.
For more in-depth information about Experiments, see Experiment.
4. Running OpenDC
An experiment in OpenDC can be executed directly from the terminal. The only parameter that needs to be provided is --experiment-path, which is the path to the experiment file we defined in step 3. While running the experiment, OpenDC periodically prints information about the status of the simulation. In this experiment, OpenDC prints every week, but this can be changed using the exportModel.
Input
import subprocess
pathToScenario = "experiments/simple_experiment.json"
subprocess.run(["OpenDCExperimentRunner/bin/OpenDCExperimentRunner", "--experiment-path", pathToScenario])
Output
================================================================================ Running scenario: 0 ================================================================================ Starting seed: 0
Simulating... 0% [ ] 0/2 (0:00:00 / ?) 12:54:19.045 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 1680 hours: Tasks Total: 50 Tasks Active: 1 Tasks Pending: 39 Tasks Completed: 10 Tasks Terminated: 0
12:54:19.269 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 3360 hours: Tasks Total: 50 Tasks Active: 1 Tasks Pending: 37 Tasks Completed: 12 Tasks Terminated: 0
12:54:19.471 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 5040 hours: Tasks Total: 50 Tasks Active: 3 Tasks Pending: 32 Tasks Completed: 15 Tasks Terminated: 0
12:54:19.724 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 6720 hours: Tasks Total: 50 Tasks Active: 3 Tasks Pending: 26 Tasks Completed: 21 Tasks Terminated: 0
12:54:19.883 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 8400 hours: Tasks Total: 50 Tasks Active: 2 Tasks Pending: 18 Tasks Completed: 30 Tasks Terminated: 0
12:54:19.913 [WARN ] org.opendc.compute.simulator.service.ComputeService - Failed to spawn Task[uid=00000000-0000-0000-8c8a-1f148a8bb259,name=740,state=PROVISIONING]: does not fit 12:54:19.979 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 10080 hours: Tasks Total: 50 Tasks Active: 5 Tasks Pending: 8 Tasks Completed: 36 Tasks Terminated: 1
12:54:20.043 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 11760 hours: Tasks Total: 50 Tasks Active: 3 Tasks Pending: 0 Tasks Completed: 46 Tasks Terminated: 1
================================================================================ Running scenario: 1 ================================================================================ Starting seed: 0
Simulating... 100% [=================================] 2/2 (0:00:03 / 0:00:00)
CompletedProcess(args=['OpenDCExperimentRunner/bin/OpenDCExperimentRunner', '--experiment-path', 'experiments/simple_experiment.json'], returncode=0)
Running the simulation has created the output folder containing information about the experiment.
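To see what was produced, you can list the contents of the output folder. This is a minimal sketch; the exact directory layout may differ between OpenDC versions, but the results are placed in a directory named after the name field of the experiment ("simple" in this case).
from pathlib import Path

# List everything OpenDC wrote to the output folder.
for path in sorted(Path("output").rglob("*")):
    print(path)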
In the next tutorial we will use these files for analysis and visualization.