Batch Configuration Files

This section discusses the format of configuration files used to submit batches to FRED Cloud. Batches are submitted using the command fredcli batch submit from within the model directory. Exactly how the batch is executed is controlled by a configuration file as described here.

Each batch consists of one or more tasks, with each task corresponding to a cloud container environment managed by FRED Cloud. This environment runs on an Amazon Web Services (AWS) Elastic Compute Cloud (EC2) instance using a curated Docker image suitable for executing FRED simulations. At the moment, only one task can be executed for each batch.

The configuration file uses a JSON format, and by default is named fredconfig.json. The name of the configuration file can be specified using the -f / --file options on the fredcli batch submit command.

The configuration file has the following top-level sections, each of which is discussed in its own section below. A valid configuration file contains at least one entry in workItems. Additional work items and all other sections are optional. Additional section names are also permitted; FRED Cloud ignores them, so they can be used to record comments or other information.

{
  "labels": [  ],
  "population": {  },
  "dependencies": {  },
  "environment": {  },
  "workItems": [  ]
}

See the appropriate section for more details on each portion of the file.
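
As a rough illustration of the rules above, the following Python sketch loads a configuration file and checks that workItems has at least one entry, while reporting any section names that FRED Cloud would ignore. The names check_top_level and KNOWN_SECTIONS are hypothetical and are not part of fredcli; the validation actually performed by FRED Cloud may differ.

```python
import json

# Section names listed in this document; other names are ignored by FRED Cloud.
KNOWN_SECTIONS = {"labels", "population", "dependencies", "environment", "workItems"}


def check_top_level(config_text):
    """Return (errors, ignored) for a configuration file given as a JSON string."""
    config = json.loads(config_text)
    if not isinstance(config, dict):
        return ["top level must be a JSON object"], []
    errors = []
    work_items = config.get("workItems")
    if not isinstance(work_items, list) or len(work_items) == 0:
        errors.append("workItems must contain at least one entry")
    # Extra sections are allowed; they are only reported, not rejected.
    ignored = sorted(set(config) - KNOWN_SECTIONS)
    return errors, ignored


sample = '{"comment": "nightly run", "workItems": [{"name": "x", "type": "script", "file": "run"}]}'
errors, ignored = check_top_level(sample)
print("errors:", errors)
print("ignored sections:", ignored)
```

Running a check like this locally before fredcli batch submit can catch a missing workItems section early.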

Batch Labels

The optional labels list specifies the labels to apply to the batch. When using the fredcli batch submit command, this entry is replaced if one or more --label options are specified on the command line.

{
  "labels": [ "label-one", "label-two" ]
}

Synthetic Population Data

The population section of the configuration file defines one or more locations to be used by the FRED model. This section supports two values:

version

A valid U.S. synthetic population name. Currently, the two values US_2010.v3 and US_2010.v4 are supported.

locations

An array of strings, with each string holding a valid FRED location to load before executing the model. A location can be a valid FRED text location or a FIPS code. See the FRED documentation for the list of valid text entries, or use the U.S. Census FIPS code for a county.

For example, the following section uses the v4 synthetic population files and loads the synthetic population information for Jefferson County, PA.

{
  "population": {
    "version": "US_2010.v4",
    "locations": ["Jefferson_County_PA"]
  }
}

In this second example, the v4 files are once again used, and the files for the indicated counties are loaded into the simulation node. These are the seven counties of the Pittsburgh, PA metropolitan area, which can also be loaded using the textual location “Pittsburgh_PA_MSA”.

{
  "population": {
    "version": "US_2010.v4",
    "locations": [ "42003", "42005", "42007", "42019", "42051", "42125", "42129" ]
  }
}
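
The constraints on the population section can be sketched as a small Python check. This is an illustration only: check_population is a hypothetical helper, and it assumes that a purely numeric location is meant to be a five-digit county FIPS code (two state digits plus three county digits). Text locations are not checked against the FRED location list.

```python
import re

# Versions this document lists as currently supported.
SUPPORTED_VERSIONS = {"US_2010.v3", "US_2010.v4"}
# Assumption: a county FIPS code is exactly five digits.
FIPS_COUNTY = re.compile(r"\d{5}")


def check_population(population):
    """Return a list of problems found in a population section (a dict)."""
    errors = []
    version = population.get("version")
    if version not in SUPPORTED_VERSIONS:
        errors.append(f"unsupported version: {version!r}")
    locations = population.get("locations")
    if not isinstance(locations, list) or not locations:
        errors.append("locations must be a non-empty array")
        return errors
    for loc in locations:
        if not isinstance(loc, str):
            errors.append(f"location is not a string: {loc!r}")
        elif loc.isdigit() and not FIPS_COUNTY.fullmatch(loc):
            errors.append(f"numeric location is not a 5-digit FIPS code: {loc!r}")
    return errors


print(check_population({"version": "US_2010.v4", "locations": ["42003", "42005"]}))
```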

Model Dependencies

The dependencies section lists the dependencies required by the work items defined in the configuration file. The target location for these dependencies is specified by the FRED_PROJECT environment variable, which is set to the directory /fred/models in the cloud container environment.

The following types of dependencies are supported.

gitHub

Clone a repository from GitHub. Currently, only Epistemix repositories are supported.

s3

This is not yet supported, but will soon allow data from AWS Simple Storage Service (S3) to be downloaded into the location.

GitHub Repository Dependencies

A gitHub dependency section is used to clone one or more repositories into the container. A GitHub Access Token suitable for accessing Epistemix repositories is provided within the environment. At the moment, a user-specified token cannot be supplied.

Within this section an array of repositories is required. Each repository is specified by the following values:

url

The SSH URL of the repository.

ref

The reference name for the branch or tag to download from the given repository.

For example, the following block would download the main branch of the public FRED-tutorials repository into the local environment.

{
  "dependencies": {
    "gitHub": [
      {
        "url": "git@github.com:Epistemix-com/FRED-tutorials.git",
        "ref": "main"
      }
    ]
  }
}

As another example, the following block would download two repositories Measles and School-Vaccination into the environment, using the references as indicated.

{
  "dependencies": {
    "gitHub": [
      {
        "url": "git@github.com:Epistemix-com/Measles.git",
        "ref": "v1.0.0"
      },
      {
        "url": "git@github.com:Epistemix-com/School-Vaccination.git",
        "ref": "main"
      }
    ]
  }
}
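
The shape of a gitHub dependency list can be checked with a short Python sketch such as the one below. It is illustrative only: check_github_dependencies is a hypothetical helper, and it assumes that "Epistemix repositories" means repositories owned by the Epistemix-com organization, as in the examples above.

```python
import re

# SSH-style GitHub URL: git@github.com:<owner>/<repo>.git
SSH_URL = re.compile(r"git@github\.com:(?P<owner>[^/]+)/(?P<repo>[^/]+?)(?:\.git)?")


def check_github_dependencies(dependencies):
    """Return a list of problems found in the gitHub entries of a dependencies section."""
    errors = []
    for index, repo in enumerate(dependencies.get("gitHub", [])):
        url = repo.get("url", "")
        match = SSH_URL.fullmatch(url)
        if match is None:
            errors.append(f"gitHub[{index}]: not an SSH URL: {url!r}")
        elif match.group("owner") != "Epistemix-com":
            # Assumption: only this organization counts as an Epistemix repository.
            errors.append(f"gitHub[{index}]: owner {match.group('owner')!r} is not supported")
        if not repo.get("ref"):
            errors.append(f"gitHub[{index}]: missing ref")
    return errors


deps = {"gitHub": [{"url": "git@github.com:Epistemix-com/Measles.git", "ref": "v1.0.0"}]}
print(check_github_dependencies(deps))
```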

Data Dependencies

This format is not yet defined, but will eventually allow data objects to be downloaded into the environment.

Environment Settings

The optional environment section defines the environment for the simulation. The following settings are supported, and correspond to the indicated command-line options in fredcli batch submit. The command-line option will always replace a setting specified in a configuration file.

For details on these options, the corresponding command-line option, and the possible values for these settings, see the fredcli batch submit reference page.

computeSize

The container size to use for the batch. The setting is overridden by the --size command-line option.

fredVersion

The version of FRED to use for the batch, as either the version number (as in 7.9.1) or the release name (as in latest). This is overridden by the --fredVersion command-line option.

kill

The kill time to apply to the batch, overridden by the --kill command-line option.

logLevel

The container log level to apply within the container, overridden by the --container-log-level command-line option.

For example, the following block specifies each of these settings.

{
  "environment" : {
    "computeSize" : "tiny",
    "fredVersion": "7.9.0",
    "kill": "4h",
    "logLevel": "warning"
  }
}
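
The precedence rule for these settings, where a command-line option always replaces the value in the configuration file, can be pictured with the following Python sketch. The function effective_environment is hypothetical and only models the rule described above; it is not how fredcli is implemented.

```python
def effective_environment(config_env, cli_options):
    """Merge settings so that any command-line value replaces the file value.

    Both arguments are dicts keyed by the configuration names
    (computeSize, fredVersion, kill, logLevel). A value of None in
    cli_options means the option was not given on the command line.
    """
    merged = dict(config_env)  # start from the configuration file
    for key, value in cli_options.items():
        if value is not None:
            merged[key] = value  # the command line wins
    return merged


file_env = {"computeSize": "tiny", "kill": "4h"}
cli = {"computeSize": "large2", "kill": None}
print(effective_environment(file_env, cli))
```

Here the file's computeSize is replaced by the value given on the command line, while kill keeps the value from the file because no override was given.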

Work Items

The workItems section specifies the simulation or other work to perform in the cloud environment. Work items are either Linux scripts to run or FRED jobs to execute in the environment.

The block contains an array of items to execute. Each item must specify a name and a type. The name is a string containing a friendly name for the item, while the type must be either script or job. For example, the following partial block specifies a script, a job, and another script.

{
  "workItems": [
    {
      "name": "A pre-script",
      "type": "script"
    },
    {
      "name": "A simulation",
      "type": "job"
    },
    {
      "name": "A post-script",
      "type": "script"
    }
  ]
}

The workItems section is required and must contain at least one work item.

Script Work Items

For script work item types, the following value is required:

file

An executable file in the model directory to execute in the cloud environment.

For example, the following complete configuration file would execute a single script called fred-cloud-execute in the cloud environment.

{
  "workItems": [
    {
      "name": "Script to execute",
      "type": "script",
      "file": "fred-cloud-execute"
    }
  ]
}
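
The per-type requirements for work items can be expressed as a simple Python check. This is a sketch under stated assumptions: check_work_items and REQUIRED are hypothetical names, and the required values are those listed in this section and in the Job Work Items section.

```python
# Required values per work item type, as listed in this document.
REQUIRED = {
    "script": ["file"],
    "job": ["entryPoint", "key", "startRun", "numberOfRuns", "parallel"],
}


def check_work_items(work_items):
    """Return a list of problems found in a workItems array."""
    errors = []
    if not work_items:
        errors.append("at least one work item is required")
    for index, item in enumerate(work_items):
        if "name" not in item:
            errors.append(f"workItems[{index}]: missing name")
        item_type = item.get("type")
        if item_type not in REQUIRED:
            errors.append(f"workItems[{index}]: type must be 'script' or 'job'")
            continue
        for field in REQUIRED[item_type]:
            if field not in item:
                errors.append(f"workItems[{index}]: {item_type} item missing {field}")
    return errors


items = [{"name": "Script to execute", "type": "script", "file": "fred-cloud-execute"}]
print(check_work_items(items))
```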

Job Work Items

For job work item types, the following values are required:

entryPoint

The initial, or main, simulation file for the FRED simulation.

key

The job key to assign to the FRED simulation.

startRun

The starting run number to use for the job.

numberOfRuns

The number of runs to perform in the job.

parallel

How many runs to perform simultaneously (in parallel) during the job.

For example, the following complete configuration file would execute the FRED simulation in the current directory using the latest version of FRED and make the synthetic population files for Jefferson County, PA available in the environment. Note that the optional labels and environment sections are not included in this example.

{
  "fredVersion": "latest",
  "population": {
    "version": "US_2010.v4",
    "locations": ["Jefferson_County_PA"]
  },
  "workItems": [
    {
        "name": "Simple Flu",
        "type": "job",
        "entryPoint": "main.fred",
        "key": "simpleflu",
        "startRun": 1,
        "numberOfRuns": 4,
        "parallel": 2
    }
  ]
}

This configuration file is equivalent to the following fred_job command, when used within the simpleflu directory of the FRED-tutorials repository.

$ fred_job -k simpleflu -p main.fred -n 4 -m 2
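
The correspondence between a job work item and the fred_job command can be sketched in Python as follows. The flag mapping (-k for key, -p for entryPoint, -n for numberOfRuns, -m for parallel) is read off the example command above; the helper fred_job_command is hypothetical, and because startRun has no counterpart in that example, it is not passed here.

```python
import shlex


def fred_job_command(item):
    """Build a fred_job command line string for one job work item (a dict)."""
    # Mapping taken from the example command in this document.
    args = [
        "fred_job",
        "-k", item["key"],
        "-p", item["entryPoint"],
        "-n", str(item["numberOfRuns"]),
        "-m", str(item["parallel"]),
    ]
    # shlex.join quotes any argument that needs it for a POSIX shell.
    return shlex.join(args)


job_item = {
    "name": "Simple Flu",
    "type": "job",
    "entryPoint": "main.fred",
    "key": "simpleflu",
    "startRun": 1,
    "numberOfRuns": 4,
    "parallel": 2,
}
print(fred_job_command(job_item))  # fred_job -k simpleflu -p main.fred -n 4 -m 2
```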

Sample Configuration File

As a complete example, the following JSON represents a configuration file to run a FRED simulation.

The following configuration file is built to run a COVID-19 simulation for a hypothetical Rhode Island Home Show event in April 2022. In our fictitious example, the model was built using FRED version 7.8.0, and the five FIPS codes in locations are the five counties of Rhode Island, which the state location code “RI” also represents. There is also a dependency on the COVID19 and COVID19-calibration repositories from GitHub. A script prepare-event is run prior to the simulation, and another script process-event-results is run after the simulation. For the simulation itself, the FRED program is defined by the main.fred file and a total of 30 runs are simulated, with 15 runs performed in parallel.

The labels “event”, “home-show”, and “RI” are applied to the batch, and a container size of “large2” is requested.

{
  "labels": [ "event", "home-show", "RI" ],
  "population": {
    "version": "US_2010.v4",
    "locations": [ "44001", "44003", "44005", "44007", "44009" ]
  },
  "dependencies": {
    "gitHub": [
      {
        "url": "git@github.com:Epistemix-com/COVID19.git",
        "ref": "v1.1.0"
      },
      {
        "url": "git@github.com:Epistemix-com/COVID19-calibration.git",
        "ref": "main"
      }
    ]
  },
  "environment": {
    "fredVersion": "7.8.0",
    "computeSize": "large2",
    "spot": false
  },
  "workItems": [
    {
        "name": "Configure Event",
        "type": "script",
        "file": "prepare-event"
    },
    {
        "name": "Rhode Island Home Show",
        "type": "job",
        "entryPoint": "main.fred",
        "key": "rihs-2022-04-07",
        "startRun": 1,
        "numberOfRuns": 30,
        "parallel": 15
    },
    {
        "name": "Process Event Results",
        "type": "script",
        "file": "process-event-results"
    }
  ]
}