Running an Experiment
Now that you have downloaded OpenDC, we will create a simple experiment. In this experiment, we will compare the performance of a small and a big data center running the same workload.
In this tutorial, you will learn how to create and execute a simple experiment in OpenDC.
This tutorial is based on the "Simple demo", which can be found on the OpenDC GitHub.
Download the demo [here] to run it in an interactive notebook.
Running this demo requires OpenDC. Download the latest release here and put it in this folder.
1. Designing a Data Center
The first requirement to run an experiment in OpenDC is a topology. A topology defines the hardware on which a workload is executed. Larger topologies can run more workloads and will often complete them more quickly.
A topology is defined using a JSON file. A topology contains one or more clusters. Clusters are groups of hosts at a specific location. Each cluster consists of one or more hosts. A host is a machine on which one or more tasks can be executed. Hosts are composed of a CPU and a memory unit.
Small Data Center
In this experiment, we compare two data centers. Below is an example of the small topology file:
{
    "clusters": [
        {
            "name": "C01",
            "hosts": [
                {
                    "name": "H01",
                    "cpu": {
                        "coreCount": 12,
                        "coreSpeed": 3300
                    },
                    "memory": {
                        "memorySize": 140457600000
                    }
                }
            ]
        }
    ]
}
This topology consists of a single cluster with a single host.
The topology file can be found here.
Large Data Center
We compare the previous data center with a larger data center defined by the following topology file:
{
    "clusters": [
        {
            "name": "C01",
            "hosts": [
                {
                    "name": "H01",
                    "cpu": {
                        "coreCount": 32,
                        "coreSpeed": 3200
                    },
                    "memory": {
                        "memorySize": 256000
                    }
                }
            ]
        },
        {
            "name": "C02",
            "hosts": [
                {
                    "name": "H02",
                    "count": 6,
                    "cpu": {
                        "coreCount": 8,
                        "coreSpeed": 2930
                    },
                    "memory": {
                        "memorySize": 64000
                    }
                }
            ]
        },
        {
            "name": "C03",
            "hosts": [
                {
                    "name": "H03",
                    "count": 2,
                    "cpu": {
                        "coreCount": 16,
                        "coreSpeed": 3200
                    },
                    "memory": {
                        "memorySize": 128000
                    }
                }
            ]
        }
    ]
}
Compared to the small topology, the big topology consists of three clusters. Each cluster defines a single host type; clusters C02 and C03 use the count field to include multiple identical hosts (six and two, respectively).
The topology file can be found here.
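To get a feel for how much larger the second topology is, you can total the cores and memory defined in each JSON file. The snippet below is a minimal sketch: the file names match the paths used in the experiment file later in this tutorial, and the optional count field is assumed to default to 1 when absent.
import json

def summarize(path):
    """Sum the cores and memory over all hosts in a topology file."""
    with open(path) as f:
        topology = json.load(f)
    cores = memory = 0
    for cluster in topology["clusters"]:
        for host in cluster["hosts"]:
            count = host.get("count", 1)  # assumed to default to 1 when absent
            cores += count * host["cpu"]["coreCount"]
            memory += count * host["memory"]["memorySize"]
    return cores, memory

for path in ["topologies/small_datacenter.json", "topologies/big_datacenter.json"]:
    cores, memory = summarize(path)
    print(f"{path}: {cores} cores, {memory} memory (units as given in the file)")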
For more in-depth information about topologies, see Topology.
2. Workloads
Besides the topology, we need a workload to simulate on the data center. In OpenDC, workloads are defined as a bag of tasks. Each task is accompanied by one or more fragments. These fragments define the computational requirements of the task over time. For this experiment, we will use the bitbrains-small workload. This is a small workload of 50 tasks, spanning a bit more than a month.
Workload traces define when tasks are submitted and their computational requirements. A workload consists of two Parquet trace files:
- tasks.parquet provides a general overview of the tasks executed during the workload. It defines when tasks are scheduled and the hardware they require.
- fragments.parquet provides detailed information about each task during its runtime.
Input
import pandas as pd
df_tasks = pd.read_parquet("workload_traces/bitbrains-small/tasks.parquet")
df_fragments = pd.read_parquet("workload_traces/bitbrains-small/fragments.parquet")
df_tasks.head()
Output
|   | id | submission_time | duration | cpu_count | cpu_capacity | mem_capacity |
|---|---|---|---|---|---|---|
| 0 | 1019 | 2013-08-12 13:35:46 | 2592252000 | 1 | 2926 | 181352 |
| 1 | 1023 | 2013-08-12 13:35:46 | 2592252000 | 1 | 2926 | 260096 |
| 2 | 1026 | 2013-08-12 13:35:46 | 2592252000 | 1 | 2926 | 249972 |
| 3 | 1052 | 2013-08-29 14:38:12 | 577855000 | 1 | 2926 | 131245 |
| 4 | 1073 | 2013-08-21 11:07:12 | 1823566000 | 1 | 2600 | 179306 |
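Using the df_tasks dataframe loaded above, we can check the claim that the workload contains 50 tasks spanning a bit more than a month. This is a small sketch that assumes the duration column is given in milliseconds.
# Sanity check on the workload size and time span
# (the duration column is assumed to be in milliseconds).
print("Number of tasks:", len(df_tasks))

start = df_tasks["submission_time"].min()
end = (df_tasks["submission_time"] + pd.to_timedelta(df_tasks["duration"], unit="ms")).max()
print("Workload spans from", start, "to", end)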
Input
df_fragments.head()
Output
|   | id | duration | cpu_count | cpu_usage |
|---|---|---|---|---|
| 0 | 1019 | 300000 | 1 | 0 |
| 1 | 1019 | 300000 | 1 | 11.704 |
| 2 | 1019 | 600000 | 1 | 0 |
| 3 | 1019 | 300000 | 1 | 11.704 |
| 4 | 1019 | 900000 | 1 | 0 |
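The fragments of a task describe its resource usage over consecutive intervals of its runtime, so the fragment durations of a task should (roughly) add up to the duration listed in tasks.parquet. The sketch below checks this using the dataframes loaded above, again assuming both duration columns are in milliseconds.
# Compare the summed fragment durations of each task with the task's
# duration from tasks.parquet (both assumed to be in milliseconds).
fragment_totals = df_fragments.groupby("id")["duration"].sum()
comparison = df_tasks.set_index("id")["duration"].to_frame("task_duration")
comparison["fragment_total"] = fragment_totals
print(comparison.head())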
3. Executing an Experiment
To run an experiment, we need to create an experiment file. This is a JSON file that defines what should be executed by OpenDC, and how. Below is an example of a simple experiment file:
{
    "name": "simple",
    "topologies": [
        {
            "pathToFile": "topologies/small_datacenter.json"
        },
        {
            "pathToFile": "topologies/big_datacenter.json"
        }
    ],
    "workloads": [
        {
            "pathToFile": "workload_traces/bitbrains-small",
            "type": "ComputeWorkload"
        }
    ],
    "exportModels": [
        {
            "exportInterval": 3600,
            "printFrequency": 168,
            "filesToExport": [
                "host",
                "powerSource",
                "service",
                "task"
            ]
        }
    ]
}
The experiment file defines four parameters. First is the name, which defines how the experiment is called in the output folder. Second is topologies, which defines where OpenDC can find the topology files. Third is workloads, which defines which workload OpenDC should run. Finally, exportModels defines how OpenDC should export its results.
In this case we set the exportInterval, the printFrequency, and the filesToExport. The exportInterval and the printFrequency determine how often OpenDC samples its output and how often it prints to the terminal: with an exportInterval of 3600 (one sample per simulated hour) and a printFrequency of 168 samples, OpenDC prints a status update once per simulated week. Using filesToExport we specify that we only want to output specific files.
As you can see, both topologies and workloads are defined as lists. This allows the user to define multiple values. OpenDC will run a simulation for each separate combination of parameter values. In this case two simulations will be run: one with the small topology and one with the big topology.
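Because topologies and workloads are plain JSON lists, an experiment file can also be generated from a script instead of being written by hand. The sketch below rebuilds the experiment shown above; it assumes you want to write it to experiments/simple_experiment.json, the path used in the next step.
import json

# Build the experiment definition programmatically and write it to disk.
experiment = {
    "name": "simple",
    "topologies": [
        {"pathToFile": "topologies/small_datacenter.json"},
        {"pathToFile": "topologies/big_datacenter.json"},
    ],
    "workloads": [
        {"pathToFile": "workload_traces/bitbrains-small", "type": "ComputeWorkload"}
    ],
    "exportModels": [
        {
            "exportInterval": 3600,
            "printFrequency": 168,
            "filesToExport": ["host", "powerSource", "service", "task"],
        }
    ],
}

with open("experiments/simple_experiment.json", "w") as f:
    json.dump(experiment, f, indent=4)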
For more in-depth information about ExportModels, see ExportModel.
For more in-depth information about Experiments, see Experiment.
4. Running OpenDC
An experiment in OpenDC can be executed directly from the terminal. The only parameter that needs to be provided is --experiment-path, which is the path to the experiment file we defined in step 3. While running the experiment, OpenDC periodically prints information about the status of the simulation. In this experiment, OpenDC prints every week, but this can be changed using the exportModel.
Input
import subprocess
pathToScenario = "experiments/simple_experiment.json"
subprocess.run(["OpenDCExperimentRunner/bin/OpenDCExperimentRunner", "--experiment-path", pathToScenario])
Output
================================================================================ Running scenario: 0 ================================================================================ Starting seed: 0
Simulating... 0% [ ] 0/2 (0:00:00 / ?) 12:54:19.045 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 1680 hours: Tasks Total: 50 Tasks Active: 1 Tasks Pending: 39 Tasks Completed: 10 Tasks Terminated: 0
12:54:19.269 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 3360 hours: Tasks Total: 50 Tasks Active: 1 Tasks Pending: 37 Tasks Completed: 12 Tasks Terminated: 0
12:54:19.471 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 5040 hours: Tasks Total: 50 Tasks Active: 3 Tasks Pending: 32 Tasks Completed: 15 Tasks Terminated: 0
12:54:19.724 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 6720 hours: Tasks Total: 50 Tasks Active: 3 Tasks Pending: 26 Tasks Completed: 21 Tasks Terminated: 0
12:54:19.883 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 8400 hours: Tasks Total: 50 Tasks Active: 2 Tasks Pending: 18 Tasks Completed: 30 Tasks Terminated: 0
12:54:19.913 [WARN ] org.opendc.compute.simulator.service.ComputeService - Failed to spawn Task[uid=00000000-0000-0000-8c8a-1f148a8bb259,name=740,state=PROVISIONING]: does not fit 12:54:19.979 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 10080 hours: Tasks Total: 50 Tasks Active: 5 Tasks Pending: 8 Tasks Completed: 36 Tasks Terminated: 1
12:54:20.043 [WARN ] org.opendc.compute.simulator.telemetry.ComputeMetricReader - Metrics after 11760 hours: Tasks Total: 50 Tasks Active: 3 Tasks Pending: 0 Tasks Completed: 46 Tasks Terminated: 1
================================================================================ Running scenario: 1 ================================================================================ Starting seed: 0
Simulating... 100% [=================================] 2/2 (0:00:03 / 0:00:00)
CompletedProcess(args=['OpenDCExperimentRunner/bin/OpenDCExperimentRunner', '--experiment-path', 'experiments/simple_experiment.json'], returncode=0)
Running the simulation has created the output folder containing information about the experiment.
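To see what was produced, you can list the contents of the output folder. This is a minimal sketch; the exact directory layout may differ between OpenDC versions, but the results are placed in a directory named after the name field of the experiment ("simple" in this case).
from pathlib import Path

# List everything OpenDC wrote to the output folder.
for path in sorted(Path("output").rglob("*")):
    print(path)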
In the next tutorial we will use these files for analysis and visualization.