Customizing the Synthetic Population

Methodology

Adding New Agent Attributes

The default set of agent attributes in the Epistemix synthetic population (age, sex, race, and household income) provides a foundation for building simulation models in which individual behavior is influenced by demographic and economic differences. However, models often benefit from differentiating agents by personal attributes that are not included by default. We are continually adding new agent attributes to the synthetic population, but the attributes that could be useful in models are as diverse as the problems that can be modeled on the Epistemix Platform, so we want to give users the tools to add their own attributes as the need arises. When adding new attributes, we can treat the default set of agent attributes as scaffolding attributes: attributes already present in the synthetic population that can be correlated with a new attribute of interest and used as the basis for assigning it. We provide two strategies built around the idea of scaffolding attributes, as detailed below.

Direct assignment

Under direct assignment, we accept a concrete value for the new attribute for each combination of scaffolding attributes. This can be used to facilitate the "cutting" of a continuous attribute into discrete categories. For example, you might use a continuous BMI scaffolding attribute to create a new "BMI category" attribute that groups the population into the categories "Underweight", "Normal Weight", "Overweight", and "Obese" based on clinical guidelines. Direct assignment is also a useful option for quickly moving forward with a new attribute in the face of uncertainty about which distribution family best models the new attribute (or how you might fit the parameters to specify the precise form of the distribution for each combination of scaffolding attributes).
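The BMI example above can be sketched in a few lines of Python. This is only an illustration of the "cutting" idea, not Epistemix Platform code: the agent records, the `bmi` field, and the `bmi_category` function are hypothetical names, and the cut points are the commonly used adult cutoffs of 18.5, 25, and 30 kg/m².

```python
from bisect import bisect_right

# Assumed cut points (kg/m^2): commonly used adult BMI cutoffs.
BMI_CUTS = [18.5, 25.0, 30.0]
BMI_LABELS = ["Underweight", "Normal Weight", "Overweight", "Obese"]


def bmi_category(bmi: float) -> str:
    """Map a continuous BMI value to exactly one discrete category."""
    # bisect_right counts how many cut points are <= bmi,
    # which is the index of the matching label.
    return BMI_LABELS[bisect_right(BMI_CUTS, bmi)]


# Hypothetical agent records; real records would come from the
# synthetic population.
agents = [
    {"id": 1, "bmi": 17.9},
    {"id": 2, "bmi": 22.4},
    {"id": 3, "bmi": 27.0},
    {"id": 4, "bmi": 30.0},
]

# Direct assignment: every agent with the same scaffolding value
# receives the same concrete value for the new attribute.
for agent in agents:
    agent["bmi_category"] = bmi_category(agent["bmi"])
```

Because the mapping is deterministic, two agents with the same BMI always end up in the same category, which is the defining property of direct assignment.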

Distributional assignment

A more flexible approach is distributional assignment. Under distributional assignment, rather than accepting a concrete value for the new attribute for each combination of scaffolding attributes, we instead accept a probability distribution from which individuals in the synthetic population with a given combination of scaffolding attributes draw their attribute value. This prevents all individuals with the same combination of scaffolding attribute values from being assigned exactly the same value for the new attribute (as is the case when using direct assignment).

Distributional assignment can involve different levels of complexity. For example, a relatively straightforward way to add an attribute that encodes whether an agent has a given chronic disease is to find a high-quality data set that reports the prevalence of the disease in subsets of the real population defined by demographic characteristics that are also present in the default set of agent attributes. You can then segment the synthetic population into analogous subsets and assign each agent the disease attribute with probability equal to the reported prevalence in the corresponding subset of the real population.
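As a minimal sketch of the prevalence-based approach described above, the following Python snippet assigns a binary disease attribute by drawing once per agent. Everything here is an assumption made for illustration: the prevalence numbers are placeholders rather than real data, and the age groups, field names, and the `assign_disease` function are hypothetical.

```python
import random

# Placeholder prevalence by (age group, sex). These numbers are
# invented for illustration and are NOT taken from any data set.
PREVALENCE = {
    ("0-17", "female"): 0.02,
    ("0-17", "male"): 0.03,
    ("18-64", "female"): 0.10,
    ("18-64", "male"): 0.12,
    ("65+", "female"): 0.25,
    ("65+", "male"): 0.30,
}


def age_group(age: int) -> str:
    """Bin a scaffolding attribute so it matches the subsets in the data."""
    if age < 18:
        return "0-17"
    if age < 65:
        return "18-64"
    return "65+"


def assign_disease(agents, prevalence, seed=0):
    """Give each agent the disease with its subset's reported prevalence."""
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    for agent in agents:
        p = prevalence[(age_group(agent["age"]), agent["sex"])]
        # One Bernoulli draw per agent: True with probability p.
        agent["has_disease"] = rng.random() < p
    return agents


# Hypothetical agents with the scaffolding attributes age and sex.
agents = [
    {"id": 1, "age": 9, "sex": "female"},
    {"id": 2, "age": 41, "sex": "male"},
    {"id": 3, "age": 70, "sex": "female"},
]
assign_disease(agents, PREVALENCE)
```

Unlike direct assignment, two agents in the same subset can receive different values, while the share of agents with the disease in a large subset approaches the reported prevalence.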

A more complex approach to distributional assignment might involve using real-world data to fit a predictive model that outputs a probability distribution over the possible values of the new attribute. You would then enumerate every possible combination of feature values (that is, every possible agent type), provide each combination as input to the fitted model, and obtain for each agent type a distribution from which to draw values of the new attribute.
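The fit-then-enumerate workflow can be sketched as follows. A small naive Bayes classifier with Laplace smoothing, written with only the Python standard library, stands in for whatever predictive model you would actually fit; it is a swapped-in example, not a recommended model. The survey records, feature levels, attribute values, and function names are all hypothetical.

```python
import random
from collections import Counter
from itertools import product

# Hypothetical survey records: (age group, sex, attribute value).
RECORDS = [
    ("18-64", "female", "never"), ("18-64", "female", "current"),
    ("18-64", "male", "current"), ("18-64", "male", "former"),
    ("65+", "female", "former"), ("65+", "male", "former"),
    ("65+", "male", "never"), ("0-17", "female", "never"),
    ("0-17", "male", "never"),
]
# All levels each feature can take in the synthetic population.
FEATURE_LEVELS = [("0-17", "18-64", "65+"), ("female", "male")]


def fit_naive_bayes(records, feature_levels, alpha=1.0):
    """Fit a Laplace-smoothed naive Bayes model; return a predict function."""
    classes = sorted({r[-1] for r in records})
    class_counts = Counter(r[-1] for r in records)
    # feature_counts[i][(level, cls)]: records with that level and class.
    feature_counts = [
        Counter((r[i], r[-1]) for r in records)
        for i in range(len(feature_levels))
    ]
    n = len(records)

    def predict(features):
        scores = {}
        for cls in classes:
            # Smoothed prior times smoothed per-feature likelihoods.
            score = (class_counts[cls] + alpha) / (n + alpha * len(classes))
            for i, level in enumerate(features):
                score *= (feature_counts[i][(level, cls)] + alpha) / (
                    class_counts[cls] + alpha * len(feature_levels[i])
                )
            scores[cls] = score
        total = sum(scores.values())
        return {cls: s / total for cls, s in scores.items()}

    return predict


predict = fit_naive_bayes(RECORDS, FEATURE_LEVELS)

# Enumerate every agent type and store one distribution per type.
distributions = {combo: predict(combo) for combo in product(*FEATURE_LEVELS)}


def draw_value(agent_type, rng):
    """Draw one attribute value from the distribution for this agent type."""
    dist = distributions[agent_type]
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]


rng = random.Random(0)
value = draw_value(("65+", "female"), rng)
```

Smoothing gives every agent type a usable distribution even when a feature combination never appears in the training data, which matters because the enumeration covers all combinations, observed or not.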

Adding New Locations

It is sometimes useful to augment the synthetic population with additional places, such as custom Workplace locations or locations corresponding to an entirely new place type that is not included in the default model (e.g., Hospital). This can be achieved by specifying the locations for these new places in a data file and loading them into the simulation at runtime. See the read_place_file page in the FRED language reference for additional information.

Adding New Processes

Agent attributes can be made to evolve over time during a simulation by encoding the underlying process in the simulation model. Such a general process model can then serve as a component of many specific simulation models that benefit from accounting for how certain attributes change over time.

The simplest example is the age attribute: with each year that passes in a simulation, every agent becomes one year older.
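To make the idea of an attribute-updating process concrete, here is a minimal Python sketch of the aging example. It is a stand-in for illustration only and is not FRED code; the `Agent` class and the `advance_one_year` and `run` functions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    agent_id: int
    age: int  # age in whole years


def advance_one_year(agents):
    """The aging process: every agent becomes one year older."""
    for agent in agents:
        agent.age += 1


def run(agents, years):
    """Apply the aging process once per simulated year."""
    for _ in range(years):
        advance_one_year(agents)
    return agents


# Hypothetical starting population.
population = [Agent(1, 0), Agent(2, 34), Agent(3, 71)]
run(population, years=5)
```

Keeping the process in its own function is what lets the same component be reused across many models: any simulation that steps through years can call it without knowing anything else about the model.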