Quickstart Guide Part 10: Data Input and Output

Congratulations on making it to the final lesson of this Quickstart Guide!

Along the way, you've learned how to run simulations and access outputs using the Epistemix Platform Python tools. You have explored the concepts of conditions and states, which are used to describe the behaviors that agents will carry out during simulations, and you spent time exploring the properties of the synthetic population of agents (and associated places) that lies at the heart of the Epistemix Platform simulation engine.

In Parts 6, 7, 8, and 9, you examined more complex models that demonstrated how agents can interact with each other and with their environment. That set of models also presented many of the action rules available within FRED to dictate agent behavior, including control structures that enable richer interaction with simulation variables and more finely tuned branching behavior within states.

In the final part of this introductory sequence, you will run a model that reads in external data to define an age-specific probability with which female agents will become pregnant. You will encounter the set of functions that are included in the FRED modeling language to read in data from external files. You will also see a summary of the various methods used throughout this guide to export data from your simulations.

By the end of this lesson, you will know how to:

1. Create shared list, table, and list_table variables from data stored in external files.
2. Use the various output methods available in FRED to write out files containing data from your simulation.
3. Transfer data between the Epistemix IDE and your own computer.

Let's load the packages we need to run this model and interact with the outputs. We will be making use of the pandas package to create data output files:

import epxexec
import epxexec.epxresults
from epxexec import fred_job
import pandas as pd

10.1 Functions to Read in Data

Open the read_table_example.fred file in the file browser to the left to follow along.

FRED includes several functions to read data in from a file and store it as a variable. In the example model here, we read the data in the preg_probs.csv file into a shared table variable named preg_prob_table. The file has two columns: ages from 15 to 44, and the estimated rate of pregnancy at each age. These data were drawn from a report released by the National Center for Health Statistics that estimated pregnancy rates for women in the United States. We can tentatively use these rates as estimates of the probability that a female agent will become pregnant at her current age. (In a more comprehensive model, it may be prudent to account for finer-grained data on how pregnancy rates vary between subpopulations.)

The steps to create a table from a file are:

1. Instantiate a shared table variable in the variables block to hold the data.
2. Use the read_table function to open the external data file and specify which two columns from the file should be stored in the table variable.

You can see both of these steps in the read_table_example.fred model. The variable is created using this line in the variables block:

    shared table preg_prob_table # create table variable that will be populated from a file

The data is then read in and assigned to the preg_prob_table variable in the startup block:

    read_table(preg_prob_table, preg_probs.csv, 0, 1)

The first parameter passed to the read_table function is the name of the target table variable in your model. The second parameter is the name of the external file. (Note that if this file is not in the directory where you are running the model, you will need to specify the complete path to the file.)

The third and fourth parameters specify which columns in the external file should be used as the keys and values, respectively, for the table. Each column in the external file is referenced using its integer position (starting with 0) in the list of columns. In this case, the preg_probs.csv file only has two columns, so these values are 0 and 1. We specify them in that order, so that the ages are the keys in our table.

The read_table function will ignore any lines that start with a non-numeric character, so it will skip any text-header rows present in your file. Note that the FRED modeling language only allows for numeric variables at this time. Any key entries that are not numeric will cause the simulation to fail and return an "Undefined value for key" error. Any value entries that are text will be replaced by the value 0.
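To get a feel for these rules before running a model, here is a rough Python sketch that mimics them on a small string of CSV text. It is only an approximation of the behavior described above, not FRED's own implementation: the function name preview_table is hypothetical, the sample rows are made up, and the sketch assumes a "numeric" line is one whose first character is a digit.

```python
import csv
import io


def preview_table(csv_text, key_col, value_col):
    """Approximate the read_table rules described above (not FRED's own code)."""
    table = {}
    for line in csv_text.splitlines():
        # Assumption: a data line is one whose first character is a digit.
        if not line or not line[0].isdigit():
            continue  # header or other text row: skipped
        fields = next(csv.reader(io.StringIO(line)))
        try:
            key = float(fields[key_col])
        except ValueError:
            # Mirrors the documented failure for non-numeric keys.
            raise ValueError(f"Undefined value for key: {fields[key_col]!r}") from None
        try:
            value = float(fields[value_col])
        except ValueError:
            value = 0.0  # text values are replaced by 0
        table[key] = value
    return table


# Made-up rows for illustration only; these are not the values in preg_probs.csv.
sample = "age,prob\n15,0.25\n16,n/a\n17,0.5\n"
print(preview_table(sample, 0, 1))
```

Running a check like this on your own file can flag text entries that would otherwise be silently stored as 0.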

10.2 Run the Model and Examine the Output

Execute the cell below to run the model:

preg_job = fred_job("read_table_example.fred")

You should see a text representation of the data in the preg_prob_table displayed above this cell. This text output was produced by the for loop structure in the startup block.

Tables consist of keys and values. The print function in FRED cannot print a key-value pair directly. Instead, we use a for loop to iterate through the table's keys and use each key to access the corresponding value:

    for (age_key, get_keys(preg_prob_table)) do { 
        print("Age: ", age_key, " Prob: ", preg_prob_table[age_key]) 
    }

Here, we make use of the function get_keys(). This function returns a list containing all of the keys from the table. (Note that there is a corresponding get_values() function that can be used to obtain a list of the table's values.) We can then use each key to access its value using the syntax table[key] (or, equivalently, lookup(table, key)).
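If you are more familiar with Python, a dictionary offers a loose analogy for the table operations used here. The snippet below is only an illustration of the idea: the values are placeholders rather than the data in preg_probs.csv, and the order in which FRED returns keys may differ from a dictionary's insertion order.

```python
# Placeholder values for illustration; not the data in preg_probs.csv.
preg_prob_table = {15: 0.25, 16: 0.5}

keys = list(preg_prob_table.keys())      # analogous to get_keys(table)
values = list(preg_prob_table.values())  # analogous to get_values(table)

lines = []
for age_key in keys:
    # preg_prob_table[age_key] plays the role of table[key] or lookup(table, key)
    lines.append(f"Age: {age_key} Prob: {preg_prob_table[age_key]}")

print("\n".join(lines))
```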

By the way, you can open any text file inside the Epistemix IDE. Click on the preg_probs.csv file to the left to open it, and then compare its contents to the text output of the simulation above. As you can see, the data matches row for row.

10.3 Additional File Input Functions

The FRED modeling language also specifies functions for creating lists and list_tables from files. These are called read and read_list_table. You can read more about using these two functions in the FRED Modeling Language Reference.

The read_list_table function can be very useful. You can see examples of it being used in the Ground Logistics model (Ground-Shipping-Logistics) included in the Epistemix Platform Community Library.

10.4 A Recap on Data Output

You have seen several methods for interacting with the output of FRED simulations in this set of introductory tutorials. Here, we'll summarize those methods and then show you how to create output files that you can store within the Epistemix Platform or download to your own computer to work with offline.

Output keyword for list, table, and list_table variables

All of these variable types include a built-in keyword called output, which indicates that the values contained in the corresponding variable should be recorded in a CSV file at the end of the simulation. The output keyword for the preg_prob_table variable is turned on in this model, so that you can compare it to the original file and the text output from the simulation:

    shared table preg_prob_table # create table variable that will be populated from a file
    preg_prob_table.output = 1 # turn on csv output for table

To retrieve the resulting CSV file and load it into a pandas Series object, we call the get_table_variable method (from the epx-results package) on the FREDRun object produced by the simulation. Run the cell below to see this in action:

preg_job.runs[1].get_table_variable("preg_prob_table")

We can save this table as a CSV file using the built-in to_csv method for pandas Series objects:

preg_job.runs[1].get_table_variable("preg_prob_table").to_csv("preg_probs_output.csv")

After running the cell above, you should see a new CSV file appear in the browser to the left. Click on it to open and examine the file in a new tab. You'll see the same data as the original file.
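If you would rather check the saved file programmatically than by eye, one option is to read it back with pandas and compare it to the Series you wrote. The sketch below is self-contained, so it uses a small stand-in Series with placeholder values instead of the real output of get_table_variable; the helper name read_back_series is hypothetical.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def read_back_series(path):
    """Load a two-column CSV written by Series.to_csv back into a Series."""
    frame = pd.read_csv(path, index_col=0)
    return frame.iloc[:, 0]


# Hypothetical stand-in for the Series returned by get_table_variable;
# the values are placeholders, not the real pregnancy rates.
original = pd.Series({15: 0.1, 16: 0.2, 17: 0.3}, name="preg_prob_table")

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "preg_probs_output.csv")
    original.to_csv(out_path)
    restored = read_back_series(out_path)

# The round trip succeeds if the keys and values both survive unchanged.
round_trip_ok = (
    list(restored.index) == list(original.index)
    and np.allclose(restored.to_numpy(), original.to_numpy())
)
print(round_trip_ok)
```

In the notebook, you could apply the same comparison to the Series from preg_job.runs[1].get_table_variable("preg_prob_table") and the preg_probs_output.csv file.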

Output of epx-results functions

You can save the output of any epx-results method that returns a pandas Series or DataFrame in this way. A similar to_csv method is also built into pandas DataFrame objects (see the pandas documentation for DataFrame.to_csv). For example, to save a CSV file of the agent counts in a given state each day, you can first call the get_job_state_table method from a FREDJob object or the get_state method from a FREDRun object to produce a DataFrame. (Head back to Parts 1 and 2 of this guide for a refresher on how these methods work!) Then you can save the data in a file using DataFrame.to_csv():

# save agent state counts for the Start state in the ASSIGN_PREGNANCY_PROB condition
# from the FREDJob object
preg_job.get_job_state_table(
    condition="ASSIGN_PREGNANCY_PROB",
    state='Start',
    count_type='new'
).to_csv('start_state_counts.csv')

# save agent state counts for the AssignProbs state in the ASSIGN_PREGNANCY_PROB
# condition from a single FREDRun object
preg_job.runs[1].get_state(
    condition="ASSIGN_PREGNANCY_PROB",
    state='AssignProbs',
    count_type='new'
).to_csv('assignprobs_state_counts.csv')

After running the above cell, you should see two new files to the left named start_state_counts.csv and assignprobs_state_counts.csv. These contain the number of new agents in each state.

Using the `print_csv` or `print_file` commands

The FRED modeling language also includes two functions that allow agents to print information directly to a file as part of the action rules they carry out in a given state. You have already seen the print_csv function used in several previous lessons. For example, in Part 7, we used the function to have the agents report the location where they were infected with influenza.

Recall that any print_csv statement must refer to a file that was previously opened in an open_csv statement, so that the file is available for the agent to write to. This open_csv call is generally made in the startup block, so that the meta agent opens the file before the simulation starts.

The data in these CSV files can also be retrieved and loaded into a pandas DataFrame using the get_csv_output function of a FREDRun object. You can then save the file to your working directory using the to_csv method of the DataFrame, as above.

The print_file function works analogously to the print function, except that it directs the output to a file (which must have previously been opened using the open_file command) rather than to the standard output that is automatically displayed in the notebook after the simulation finishes.

10.5 Uploading to and Downloading from the Epistemix Platform IDE

It is easy to transfer data between the Epistemix Platform and your local machine.

Data Upload (Your Machine → Epistemix Platform)

Getting a file into the Platform is as simple as dragging it into the file browser area to the left. When you drag one or more individual files over the file browser, a dashed grey border will appear around it. Simply drop the files and the upload will begin.

Note that you cannot drag a folder into the platform in this way. You will need to create a folder within the platform using the small folder icon with a plus sign above the file browser. Then, you can drag and drop the individual files you wish to store inside that new folder.

Data Download (Epistemix Platform → Your Machine)

To download any individual file in the browser to the left, right-click on the file name (Ctrl-click on a Mac) and select "Download" from the menu that pops up. This will open your computer's download dialogue, so you can specify where to store the file on your computer.

Again, note that you cannot download a folder in this way. You will need to right-click on the individual files inside the folder and download them one by one.

If you have many files to download, this may be burdensome. You can work around this by using the terminal built into the Epistemix IDE to create a single archive file (e.g., a tarball), and then right-click to download that.

To open a terminal, click on the large button with the plus sign above the file browser, and select "Terminal" from the bottom row of icons. You can then use your favorite method for creating a single archive file for the folder of interest.
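If you prefer to stay in a notebook rather than open a terminal, Python's standard library can build the archive as well. The sketch below shows one possible approach using shutil.make_archive. The helper name archive_folder, the "results" folder, and the placeholder file contents are all hypothetical; the demonstration works in a temporary directory so that it does not touch your own files.

```python
import os
import shutil
import tarfile
import tempfile


def archive_folder(folder, archive_base):
    """Bundle `folder` into `<archive_base>.tar.gz` and return the archive path."""
    parent, name = os.path.split(os.path.abspath(folder))
    return shutil.make_archive(archive_base, "gztar", root_dir=parent, base_dir=name)


# Demonstration on a throwaway folder; "results" and its files are hypothetical.
work_dir = tempfile.mkdtemp()
results_dir = os.path.join(work_dir, "results")
os.makedirs(results_dir)
for file_name in ("start_state_counts.csv", "assignprobs_state_counts.csv"):
    with open(os.path.join(results_dir, file_name), "w") as handle:
        handle.write("placeholder\n")

archive_path = archive_folder(results_dir, os.path.join(work_dir, "results_archive"))

# List the archive's contents to confirm that both files were included.
with tarfile.open(archive_path) as archive:
    member_names = sorted(archive.getnames())
print(member_names)
```

The resulting .tar.gz file can then be downloaded from the file browser with a single right-click, like any other individual file.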

10.6 Lesson Recap

In this lesson, you explored a short model that loads data from an external file into a table variable in a FRED simulation. Female agents between the ages of 15 and 44 then query that table to update their agent variable my_preg_prob, which stores the probability with which they will become pregnant.

We also discussed several methods available within FRED simulations and within the Epistemix Platform for creating data output files. After exploring this model, you should be able to:

- Use the read, read_table, and read_list_table functions to populate list, table, and list_table variables (respectively) with data from an external file.
- Make use of the output keyword to record the final values contained within certain variables in a file at the end of the simulation.
- Retrieve state tables from FREDJob and FREDRun objects and create pandas Series and DataFrame objects that can then be saved as CSV files.
- Retrieve CSV files that agents used to record information during the simulation with the get_csv_output method of FREDRun objects, and save them using the built-in to_csv method for pandas DataFrames.
- Drag and drop files into the Epistemix IDE, and download them from the file browser.

You've made it to the end of this introduction to the Epistemix Platform and the FRED Modeling Language! Hopefully you've enjoyed learning about the many features of the FRED language and seeing some of the powerful things that can be included in simulations.

Next Steps

Here are some next steps for continuing your learning journey:

- Try modifying any of the models in this guide to run in a different place. Or, add a state or condition to an existing model to introduce new agent behaviors.
- Take a look around our Community Library to explore models built by the Epistemix team and by other FRED modelers. The knowledge you have gained by working through this guide should make it possible for you to read those more complex models and to start thinking about how you might develop your own models for whatever interesting problems you want to tackle.
- Check out the Epistemix Community Forum. This is a place where you can ask questions about FRED, get help with any problems you are encountering in developing models or using the platform itself, and see what other users are getting up to. Epistemix Team members monitor the community and are always ready to answer your questions and help solve your problems.

Special Topics

There are also additional tutorials that supplement this Quickstart Guide. They cover the following special topics that may be relevant to your modeling projects:

- external-data: This lesson offers more examples of how external data can be incorporated into your simulation, in the form of (1) augmenting the synthetic population with additional agent attributes and (2) building custom places in FRED from real location data.

These lessons can be found in the special-topics directory.