Quickstart Guide Part 5: Places¶
In Part 4, you explored some of the properties that are defined for the agents in the Epistemix synthetic population. The synthetic population also includes a set of realistic geographic places that define the environment in which agents interact. In this lesson, you will explore a simple model that has agents report where they live and work. You will then visualize these places on a map using the folium Python package.
By the end of this lesson, you will be able to:
1. Describe some of the different types of places that are included in the Epistemix synthetic population.
2. Make use of agents' membership in a particular place to access properties of that place.
3. Use the latitude and longitude functions in the FRED modeling language to extract the geolocations of households, workplaces, and other places.
4. Use the folium Python package to make an interactive map of agent households and workplaces.
Let's begin by loading the Python packages you will need to run this simulation and examine the outputs:
import folium
import pandas as pd
import time
# Epistemix package for managing simulations and simulation results in Python
from epx import Job, ModelConfig, SynthPop
5.1 Places in FRED¶
In the simulations you have run so far in this Quickstart Guide, you have already encountered one set of geographic places that define the simulation environment - the county or counties that are specified when configuring a model. Loving County, TX is one such place; the full Epistemix synthetic population includes all counties within the United States by default. Simulations can also be run for specific Metropolitan Statistical Areas (MSAs) or for entire states.
A second set of places, corresponding to households, workplaces, schools, and more, is also included in the synthetic population. These places are a statistical representation of the real-world sites where humans live and work, and they are defined by a geographic position.
Each place is identified by a unique place ID and has associated properties, including a latitude and a longitude.
If needed, you can define additional places that are not included in the synthetic population to represent locations relevant to your problem, like pharmacies, stores, dentist offices, airports, and more.
Places as "mixing groups"¶
Places aren't just important in the FRED modeling language because they help define the world of our simulation. They are important sites of agent interaction - a key concept in agent-based models that we'll develop in subsequent lessons.
Places are sites within FRED simulations where agents come into close physical proximity with one another. Which places a given agent can be found in during a simulation (and when they can be found there) depends on which places the agent is a member of.
Agents only visit places that they are designated to be a member of. This is not membership in the real-life sense: any agent can be instructed to join or quit any place at any time during a simulation. It is better to think of membership as defining the possibility that an agent will be at a place during a simulation. An agent will not go to places that they are not a member of.
By default, the synthetic population defines an out-of-the-box set of memberships for every agent: their household, their workplace if they work, their school if they are a child or young adult, and places related to the spatial structures of the census (their block group, census tract, county, etc.).
The membership of a place defines the set of agents that can interact while present at that place. When they interact is defined by the schedule of the place - the hours during which members attend. Each default place has a default schedule used by the simulation engine, but that schedule can be customized as desired.
5.2 Controlling Agent behavior with Predicates¶
Agents are able to access information about the places within the synthetic population that they are members of. However, if they are not a member of a specific place type, an attempt to access information about that place will fail and cause the simulation to stop running. For example, an older retired person who is not working is not a member of a Workplace, and an unhoused agent will not have a household defined for them. If such an agent tries to access information about that place, the simulation will fail.
To work around this problem, the model you are exploring here defines multiple conditions that instruct agents to report information about each distinct location type of interest separately. In each condition, a check is carried out to ensure that the agent is a member of the place type of interest before attempting to access the relevant information.
Click on the place_info.fred file in the file browser to the left to take a look. Three individual conditions are defined in the model. In each, agents record a piece of information in the tables defined in the variables block. The first condition reports the agent's age and race, as in the model from Part 3. The remaining conditions report the geographic information that describes the agent's household and workplace locations.
Inside the REPORT_WORKPLACE and REPORT_HOUSEHOLD conditions, the model makes use of a predicate - if - in the start state (Start) to determine whether agents are a member of that place type. The predicate follows the same pattern as the REPORT_SCHOOL solution at the end of this lesson: an is_member check on the place type, followed by a transition rule. If the agent is a member of the place type, they are sent on to the corresponding reporting state (the ReportHousehold state, in the case of the REPORT_HOUSEHOLD condition). Otherwise, the agent is sent to the Excluded state, which we encountered in the last few lessons. In this case, Excluded essentially functions as a "does not apply" state.
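As a rough Python analogy (this is not FRED code, and the agent records below are made up for illustration), the membership guard behaves like checking that a key exists before reading it. The `is_member` helper here is a hypothetical stand-in that only mimics the idea of the check:

```python
# Toy analogy only: hypothetical agent records, not the FRED engine.
agents = {
    1: {"Household": 101, "Workplace": 201},
    2: {"Household": 102},  # retired agent: no workplace membership
}


def is_member(agent, place_type):
    """Return True if the agent record has a place of the given type."""
    return place_type in agent


def report_workplaces(agents):
    """Collect workplace IDs only for agents that pass the membership check."""
    reported = {}
    for agent_id, agent in agents.items():
        if is_member(agent, "Workplace"):
            reported[agent_id] = agent["Workplace"]
        # Agents without a workplace are skipped, like being sent to Excluded.
    return reported


workplaces = report_workplaces(agents)
print(workplaces)
```

Without the check, reading `agent["Workplace"]` for agent 2 would raise a `KeyError`, which loosely mirrors the simulation failure described above.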
5.3 Gathering Agents' Households and Workplaces¶
This model instructs agents to use two built-in FRED functions - latitude and longitude - to determine the physical locations of their households and workplaces. Here are the lines of code where these functions are called in the REPORT_HOUSEHOLD condition:
# Action rules
household_lat[id()] = latitude(Household)
household_long[id()] = longitude(Household)
Household is set up for every agent as part of the synthetic population and, when used in a function as it is here, evaluates to the place ID of the agent's household. When Household is passed to the latitude and longitude functions, the simulation engine looks up that place by its ID and returns the associated latitude or longitude. Each agent then pairs these location values with their own agent ID and records the information in the appropriate table variable.
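The lookup-then-record flow can be pictured with a small Python sketch. This is only an analogy under stated assumptions: the place IDs, coordinates, and dictionary layout are invented, and the real lookup happens inside the simulation engine.

```python
# Toy analogy only: a hypothetical place table keyed by place ID.
places = {
    101: {"lat": 43.07, "lon": -89.40},
    102: {"lat": 43.10, "lon": -89.35},
}

# Hypothetical mapping of agent ID -> household place ID.
agent_household = {1: 101, 2: 102, 3: 101}

# Tables keyed by agent ID, mirroring household_lat[id()] and household_long[id()].
household_lat = {}
household_long = {}
for agent_id, place_id in agent_household.items():
    place = places[place_id]  # look the place up by its ID
    household_lat[agent_id] = place["lat"]
    household_long[agent_id] = place["lon"]

print(household_lat)
print(household_long)
```

Agents 1 and 3 share a household in this sketch, so they record the same coordinates under different agent IDs.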
You can see what this output looks like by running the model in the cell below. This model is being run in a different location from the previous lesson - Dane County, WI, the home of Madison and the Wisconsin state capitol. Madison is a city built between and around lakes, as you'll see by the end of this notebook. It is also the home of the University of Wisconsin-Madison. (Go Badgers!)
Execute the cell below to run the model and create a Job object called place_job:
# create a ModelConfig object
place_config = ModelConfig(
    synth_pop=SynthPop("US_2010.v5", ["Dane_County_WI"]),
    start_date="2022-05-10",
    end_date="2022-05-10",
)
# create a Job object using the ModelConfig
place_job = Job(
    "place_info.fred",
    config=[place_config],
    key="place_job",
    fred_version="11.0.1",
    results_dir="/home/epx/qsg-results"
)
# call the `Job.execute()` method
place_job.execute()
# the following loop idles while we wait for the simulation job to finish
start = time.time()
timeout = 300  # timeout in seconds
idle_time = 3  # time to wait (in seconds) before checking status again
while str(place_job.status) != 'DONE':
    if time.time() > start + timeout:
        msg = f"Job did not finish within {timeout / 60} minutes."
        raise RuntimeError(msg)
    time.sleep(idle_time)
str(place_job.status)
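The waiting loop above can also be wrapped in a small reusable helper. The sketch below is one possible way to do that, not part of the epx package: `wait_for_status` is a hypothetical function, and the status names `"RUNNING"`, `"ERROR"`, and `"FAILED"` are assumptions used only for illustration (the only status confirmed in this lesson is `'DONE'`).

```python
import time


def wait_for_status(get_status, done="DONE", failed=("ERROR", "FAILED"),
                    timeout=300, idle_time=3, sleep=time.sleep):
    """Poll get_status() until it returns `done`.

    Raises RuntimeError on timeout or if a status listed in `failed` is seen.
    The names in `failed` are assumptions, not documented epx statuses.
    """
    start = time.monotonic()
    while True:
        status = str(get_status())
        if status == done:
            return status
        if status in failed:
            raise RuntimeError(f"Job ended with status {status}.")
        if time.monotonic() - start > timeout:
            raise RuntimeError(
                f"Job did not finish within {timeout / 60} minutes."
            )
        sleep(idle_time)


# Stand-in for a job whose status changes over time (hypothetical values).
fake_statuses = iter(["RUNNING", "RUNNING", "DONE"])
final_status = wait_for_status(lambda: next(fake_statuses), idle_time=0)
print(final_status)
```

With a real job, the call would look something like `wait_for_status(lambda: place_job.status)`. Stopping early on a failure status avoids waiting out the full timeout when a job has already ended unsuccessfully.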
As in Part 3, use the JobResults.table_var() method to create pandas DataFrame objects that contain the agent and place information recorded during the simulation. Then, you can use pandas DataFrame methods to process that information so that it can be easily combined into a single DataFrame (where agent IDs are used as index values) using the pandas concat function:
table_vars = [
    "household_lat",
    "household_long",
    "workplace_lat",
    "workplace_long",
    "agent_race",
    "agent_age"
]
series_objects = [
    place_job.results.table_var(var_name)
    .rename(columns={"key": "id", "value": var_name})
    .set_index("id")[var_name]
    for var_name in table_vars
]
place_data = pd.concat(series_objects, axis=1)
place_data.head(10)  # just show the first 10 lines of the DataFrame
You can now see that the code above produced a single DataFrame, with one row per agent, that lists all the variables and properties we had agents report during the simulation.
Note that some agents who report a household will not report a workplace. In that case, the DataFrame will fill the workplace latitude and longitude entries with the value NaN (Not a Number).
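To see why missing workplaces show up as NaN, here is a minimal sketch with made-up values standing in for two of the table variables. When pandas `concat` aligns Series on their index with `axis=1`, any agent ID missing from one Series gets NaN in that column:

```python
import pandas as pd

# Made-up values standing in for two table variables, indexed by agent ID.
household_lat = pd.Series({1: 43.07, 2: 43.10, 3: 43.02}, name="household_lat")
# Agent 2 has no workplace, so it is absent from this Series.
workplace_lat = pd.Series({1: 43.08, 3: 42.99}, name="workplace_lat")

toy = pd.concat([household_lat, workplace_lat], axis=1)
print(toy)

# Count agents that reported a household but no workplace.
n_without_workplace = int(toy["workplace_lat"].isna().sum())
print(n_without_workplace)
```

The same `isna()` count could be applied to the `workplace_lat` column of `place_data` to find out how many agents in the simulation did not report a workplace.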
Now that you have gathered all the required agent information, you can make a map visualization to show the locations where agents live and work.
5.4 Using folium to Map FRED Position Data¶
Next, you are going to use the folium Python package to make an interactive map of where people live and work in Dane County. A full discussion of folium is beyond the scope of this lesson, but feel free to refer to this short tutorial to learn how to instantiate maps and plot places in folium.
The folium visualizations can cause memory issues for your browser if you try to map too many data points at once. This can cause the Epistemix platform IDE and your browser to hang and force you to restart. To avoid this, we are going to downsample the location data and map no more than 4000 locations at once. The following cell checks the number of distinct household latitudes in the data and, if there are more than 4000, takes a random sample of 4000 agents:
# Grab a random sample of 4000 rows to limit memory use and keep the map responsive
if len(place_data['household_lat'].unique()) > 4000:
    place_data = place_data.sample(n=4000, random_state=1)
    print("Downsampled to 4000 places.")
else:
    print("No downsampling required!")
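The same idea can be packaged as a small helper. This is a sketch under one stated difference from the cell above: the hypothetical `downsample` function compares the number of rows, rather than the number of distinct household latitudes, against the limit. The toy data is invented.

```python
import pandas as pd


def downsample(df, max_rows=4000, seed=1):
    """Return df unchanged if it is small enough, else a reproducible random sample."""
    if len(df) > max_rows:
        return df.sample(n=max_rows, random_state=seed)
    return df


# Toy data: ten made-up household latitudes.
toy = pd.DataFrame({"household_lat": [43.0 + i * 0.001 for i in range(10)]})

small = downsample(toy, max_rows=4)       # sampled down to 4 rows
unchanged = downsample(toy, max_rows=20)  # already under the limit

print(len(small), len(unchanged))
```

Fixing `random_state` means the same rows are chosen on every run, which keeps the resulting map reproducible.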
Execute the following cell to display the map (note this may take a few seconds to run as Python downloads the map images.)
# determine the center of the map based on the medians of the household latitude and longitude
lat_cen = place_data.household_lat.astype(float).median()
long_cen = place_data.household_long.astype(float).median()
# Create interactive map with a custom Mapbox basemap
map_osm = folium.Map(
    location=[lat_cen, long_cen],
    tiles='https://api.mapbox.com/styles/v1/epxadmin/cm0ve9m13000501nq8q1zdf5p/tiles/{z}/{x}/{y}?access_token=pk.eyJ1IjoiZXB4YWRtaW4iLCJhIjoiY20wcmV1azZ6MDhvcTJwcTY2YXpscWsxMSJ9._ROunfMS6hgVh1LPQZ4NGg',
    attr='Mapbox',
    zoom_start=10,
)
# place a yellow circle marker for each household
for i, place_info in place_data.iterrows():
    # skip rows with NaN values
    if pd.notna(place_info['household_lat']):
        folium.Circle(
            radius=1,
            location=[place_info['household_lat'], place_info['household_long']],
            fill=True,
            color='#F9B72D'
        ).add_to(map_osm)
# place a pink circle marker for each workplace
for i, place_info in place_data.iterrows():
    # skip rows with NaN values
    if pd.notna(place_info['workplace_lat']):
        folium.Circle(
            radius=1,
            location=[place_info['workplace_lat'], place_info['workplace_long']],
            fill=True,
            color='#F0438D'
        ).add_to(map_osm)
# display the map
map_osm
This map displays the households of the Dane County residents in yellow and their workplaces in pink. You can clearly see Madison's famous system of lakes on the map, shown in light blue. And, if you zoom out, you can see that many workplaces are located outside of Dane County. In fact, some residents commute large distances to work - all the way to Milwaukee, northern Illinois, or even the Chicago area.
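To put a number on those commutes, one option is the haversine formula, which gives the great-circle distance between two latitude/longitude points. The sketch below is not part of the lesson's model; the function name is hypothetical and the city coordinates are approximate values chosen for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Approximate coordinates for downtown Madison and downtown Milwaukee.
madison = (43.07, -89.40)
milwaukee = (43.04, -87.91)

distance = haversine_km(*madison, *milwaukee)
print(round(distance))
```

The same function could be applied to each agent's household and workplace coordinates in `place_data` to estimate straight-line commute distances, skipping rows where the workplace values are NaN.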
5.5 Lesson Recap¶
In this lesson, you examined and ran a model with multiple conditions that instructed agents to report information about themselves and where they work and live. You learned that:
- Geography is represented in the Epistemix synthetic population using places such as households, workplaces, and schools.
- Agents are assigned to be members of places by default in the synthetic population. Agents can quit or join places if instructed to do so, and they will not visit any places that they are not members of.
- Places have latitude and longitude properties that agents can query during a simulation.
- Having an agent query a place that they are not a member of will create an error and cause the simulation to fail. To avoid this, you can use predicates to have agents determine whether they are a member of a place before making any queries.
- Python has many packages for mapping locations - including the folium package, which you used to make an interactive map of where agents work and live.
In the lessons you have explored so far, agents have acted as individuals with no awareness of each other. In the next lesson, you will explore how agents can be instructed to interact with each other. This realm is where agent-based modeling excels. Moreover, it is a realm where very interesting behavior can be observed in the output of simulations. Let's keep going!
Additional Exercises¶
- What would happen if the reporting conditions of this model were combined into a single condition?
- Add a REPORT_SCHOOL condition to the place_info model.
Exercise solutions¶
- Reporting conditions
Although it is often possible to combine conditions (as it is in the case described in the question, above), it is also often inadvisable.
Splitting up each type of report (household vs. workplace) into separate conditions makes it easy to move agents who are not associated with one or more of the location types in question out of the condition entirely.
Ensuring that only the relevant agents move through a condition makes models easier to parse and reduces compute times, especially for larger populations and more complex conditions.
Thus, the result of combining the reporting conditions is likely to be code that is (1) more difficult to read and (2) more likely to contain a mistake.
- Reporting on schools
The code block below contains a REPORT_SCHOOL condition that is analogous to the reporting conditions for the other places. Recall that, in the FRED modeling language, there can be more than one variables block, so you could add the condition just by copying the code below and pasting it at the end of the place_info.fred file. However, to make your model code more readable, it may be better to copy the code from this variables block into the existing one instead.
variables {
    shared table school_lat
    shared table school_long
    school_lat.output_interval = 1
    school_long.output_interval = 1
}
condition REPORT_SCHOOL {
    start_state = Start
    state Start {
        # Action rules
        # Wait rules
        wait(0)
        # Transition rules
        if (is_member(School)) then next(ReportSchool)
        default(Excluded)
    }
    state ReportSchool {
        # Action rules
        school_lat[id()] = latitude(School)
        school_long[id()] = longitude(School)
        # Wait rules
        wait(0)
        # Transition rules
        default(Excluded)
    }
}