Lesson 9 Developing a model

Model development is an iterative activity. Once we have a good working definition of our hypothesis, we can start to sketch the conceptual model. At this stage we also characterise the nature of the outcome that our model should produce, i.e. the expected ‘patterns’: do we want to simulate the amount and direction of habitat expansion, or is it sufficient to know whether a population survives or not? Especially if we want to predict specific, quantifiable patterns, we now have to assess how adequate the available validation data is. For example, let’s assume the purpose of the model is to predict forest recovery after a storm event in a certain study area. For validation we have aerial images. Such images do not allow us to identify the regrowth of seedlings and saplings under the crown cover of larger trees, which is the main driver of forest recovery. In the worst case we might conclude that it is not worth investing time and money in the development of a model that cannot be adequately validated. This decision is hard to take, but at an early stage of model development it is still feasible. If the available validation data is good enough for the purpose, we can start with the implementation.

It has proven successful to take a rapid prototyping approach, i.e. to quickly produce a simple prototype by reusing existing building blocks, to parameterise it with plausible values and to run a first simulation. After analysing and validating the first results, we will already have a good grasp of what our model does and what is still missing. We will probably need to re-conceptualise parts of the model, add code, refine parameters, etc. We iterate through the ‘modelling cycle’ many times until the model is good enough for the purpose it is designed for.

Figure 9.1: The five steps of the modelling cycle to develop a simulation model.

9.1 Conceptualisation

The conceptual model is the most fundamental level at which to describe and discuss a model. It specifies the system components, their interrelationships, the temporal and spatial scale, the boundaries of the system, the input parameters, the state variables and the expected outputs. Not surprisingly, the most fundamental kinds of error and uncertainty are also related to the conceptual model of the system. We have already discussed many of these aspects in previous lessons. When reporting a model, it is important to explicitly declare and justify all the decisions and assumptions behind the conceptual model.

To specify the conceptual model in a rigorous way, we can make use of “diagrammatic modelling”. In this established approach from software engineering, conceptual models are sketched graphically in diagrams based on certain notation standards. Several alternative graphical notation standards have been suggested for the specific purpose of agent-based modelling. Among the proposed notations are the Business Process Model and Notation (BPMN) (Onggo and Karpat 2011), the Unified Modelling Language (UML) (Bersini 2012), and adaptations of these standards such as Agent UML (Bauer, Müller, and Odell 2001) or the Agent Modelling Language (Červenka et al. 2004). The true power of all of these methods is that model diagrams can be automatically translated into executable code. Unfortunately, the ABM community has not yet agreed on which standard to use for declaring conceptual models. Nevertheless, software developers have started to integrate graphical editors into ABM software environments, e.g. statecharts for Repast Simphony (Ozik et al. 2015) or the Graphical Editor for the GAMA modelling platform (Taillandier 2014). The latter is built on UML class diagrams, which are excellent for capturing the structure of a model (its agents, attributes and the environment), but are not well suited to representing the process dynamics. Additional integration of UML activity diagrams is therefore planned (Taillandier 2014).

9.1.1 UML Activity Diagram

The UML Activity Diagram is well suited to capturing the logical flow of agent behaviour and interaction. Activity Diagrams are the UML standard notation for what is commonly known as a flowchart. The notation is quite straightforward: to capture an agent-based model we need only eight symbols.

  1. initial state: This is the one node at which the modelled process starts. A UML Activity Diagram can have only one starting node, unless there are several nested or encapsulated program parts. For the notation of an agent-based model, it makes sense to separate the initialisation phase from the iterative simulation part. Each of these parts has its own initial state node.

  2. action: The action (or activity) is the central element of a UML Activity Diagram. It represents the execution of an agent’s action or interaction, or an event in the environment.

  3. control flow: Control flows define how the actions are scheduled. They guide the reader through the diagram and capture the logical flow of processes.

  4. decision node: Adaptive behaviour that follows rules with respect to the agent’s own state or the local state of the surrounding system is captured in decision nodes. Each single agent or cell decides individually at each time step. These individual, but potentially interdependent, decisions are responsible for the “smart”, adaptive behaviour, for unpredictable, non-linear processes and ultimately for emergent patterns at system level.

  5. guards: The outgoing control flow arrows from a decision node can be labelled with the respective conditions. These conditions are the “guards”, which make sure that only agents that fulfil the condition pass along this path.

  6. swimlanes: Swimlanes group related activities into one column (or one row). In ABM Activity Diagrams, a swimlane usually represents the pathway for one agent type or cellular automaton. This adds modularity to the Activity Diagram: it makes it easy to add or remove agent types. Although swimlanes are optional elements, they greatly help to keep the structure of the diagram clear.

  7. final state: The state that the system reaches at the end of the initialisation, or at the end of one simulation step, is known as the final state. A model can have multiple final states. Once the final state of a simulation step is reached, the model is reiterated from the updated initial state.

  8. frame: A frame element in a UML diagram encapsulates sections of collaborating elements. For the purpose of representing an agent-based model, we can make use of frames to separate the initialisation phase from the iterative simulation part.

Figure 9.2 shows the symbols of the UML Activity Diagram notation (left) and an exemplary application to a predator-prey model (right).

Figure 9.2: The UML notation for activity diagrams (left) and the implementation for a simple predator-prey model (right).
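To make the link between diagram and code more tangible, the following minimal Python sketch shows how one pass through a swimlane could look: an action, followed by a decision node whose outgoing flows are protected by guards. It is not a transcription of Figure 9.2. The Agent class, the step_agent function, the energy attribute and the threshold of 10 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold, chosen only for illustration.
REPRODUCTION_THRESHOLD = 10


@dataclass
class Agent:
    kind: str    # the swimlane the agent belongs to, e.g. "prey" or "predator"
    energy: int  # state variable that the guards refer to


def step_agent(agent):
    """One pass through the agent's swimlane for a single time step.

    Returns the list of agents that continue to the next step.
    """
    # action: moving costs one unit of energy
    agent.energy -= 1

    # decision node: exactly one outgoing control flow is taken
    if agent.energy <= 0:
        # guard [energy <= 0]: the agent dies and leaves the simulation
        return []
    if agent.energy >= REPRODUCTION_THRESHOLD:
        # guard [energy >= threshold]: the agent splits its energy with an offspring
        agent.energy //= 2
        return [agent, Agent(agent.kind, agent.energy)]
    # guard [else]: the agent simply continues
    return [agent]
```

Each `if` condition plays the role of a guard on one outgoing arrow of the decision node, and each `return` corresponds to reaching the final state of the step for this agent.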

Usually, a modeller constantly refines the conceptual model during model development. In principle there is no limit to the level of detail in which we choose to specify a conceptual model. However, there is a level of medium complexity beyond which it does not pay off to add further details. Models that are too complex lose their explanatory power, because the causal relationships between the model structure and its emerging behaviour get blurred. The following quote, often attributed to Albert Einstein, says it all:

Models should be as simple as possible, but not simpler.

A strategy for finding the zone of medium complexity of a model is to follow the principle known as Occam’s razor: among competing hypotheses, the one with the fewest assumptions should be selected. Other, more complicated models may be correct, but the fewer assumptions made, the better. In other words: simple is beautiful. Modellers will proudly describe their model as elegant if it is very simple and still explains complex behaviour, as for example the flocking model does.

9.1.2 Exercise conceptual model with UML

In this exercise, you will extend the above conceptual model of a predator-prey system to also include grass. This grass is the fodder of zebras.

Download the open-source software WhiteStarUML, install it on your computer and then follow the steps below:

  • Download the pred-prey.uml file.
  • Open the WhiteStarUML software.
  • In the Model Explorer (the window on the right), open Logical View with a double-click.
  • Open Activity Graph1 with a double-click.
  • Open the Activity Diagram, again with a double-click. The diagram appears in the canvas (the central window).

Now you can select notation elements of a UML Activity Diagram from the left menu and add them to the canvas. How are you going to implement grass? Hint: it is a cellular automaton that is very similar to the two agents. Just think of a third ‘swimlane’ from the perspective of a single grass cell!

You may sketch the design first with pen and paper, if you want. Conceptual diagrams are an abstraction of the real system, so there are potentially many different valid designs for your UML diagram.

Export the diagram as an image (File -> export diagram) and post it to the forum.

9.2 Formalisation

Formalisation refers to the process of translating a conceptual model into a computer-readable ‘language’. Depending on the approach, a model can either be formalised with code (e.g. cellular automata and agent-based models) or by defining mathematical equations (e.g. system dynamics models).

Figure 9.3: Formalisation transfers a conceptual model into a computer-readable format. Here, the turtle draws a rectangle with a few forward (fd) and right-turn (rt) commands.
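The idea behind Figure 9.3 can be reproduced in a few lines of Python. The sketch below is a simplified stand-in for a Logo turtle, not NetLogo or GAMA code: the Turtle class, its path attribute and the draw_rectangle helper are hypothetical names introduced here. It assumes the Logo convention that a heading of 0 degrees points up and that rt turns clockwise.

```python
import math


class Turtle:
    """A minimal Logo-style turtle that records the points it visits."""

    def __init__(self):
        self.x, self.y = 0.0, 0.0
        self.heading = 0.0  # degrees; 0 points up, positive angles turn clockwise
        self.path = [(0.0, 0.0)]

    def fd(self, distance):
        """Move forward by `distance` in the current heading."""
        rad = math.radians(self.heading)
        self.x += distance * math.sin(rad)
        self.y += distance * math.cos(rad)
        # round away floating-point noise so that the path is easy to read
        self.path.append((round(self.x, 6), round(self.y, 6)))

    def rt(self, angle):
        """Turn right (clockwise) by `angle` degrees."""
        self.heading = (self.heading + angle) % 360


def draw_rectangle(turtle, width, height):
    """Conceptual model 'walk around a rectangle', formalised as fd/rt commands."""
    for side in (height, width, height, width):
        turtle.fd(side)
        turtle.rt(90)


turtle = Turtle()
draw_rectangle(turtle, width=4, height=3)
```

After the call, turtle.path holds the start point, the corner points visited and the return to the start, and the turtle faces its original direction again.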

At first glance, coding might be seen as the central part of modelling. In fact it is not. Of course we need programming skills to write the code, but if the conceptual model is well described, then coding is not the main creative part of the research. In most cases modellers are not primarily computer scientists. Fortunately, it has recently become easier to program a model.

Modelling frameworks have been designed specifically for the purpose of modelling. These frameworks support the modeller with a GUI, model analysis algorithms and visualisation tools. Some frameworks offer domain-specific programming languages that are relatively easy to learn, but at the same time provide powerful functionality for the specific purpose of building simulation models. The most prominent examples in the open-source domain are NetLogo, Repast and GAMA. NetLogo was developed in the spirit of the educational Logo programming language family according to the motto ‘low threshold and no ceiling’. It is probably the most widely used framework. Repast is a comprehensive framework that is generally harder to use. Finally, GAMA (GIS and Agent-based Modelling Architecture) focuses on spatially explicit simulation and provides the richest GIS functionality. It is maintained and further developed by a highly active developer group and has a fast-growing user community. We use GAMA throughout this module.

Many modelling frameworks and model code-sharing platforms provide an increasing wealth of model libraries that serve as building blocks for more complex models. This takes away the hurdle of coding from scratch for each new model. Further, such code fragments and sample models have the advantage of being intensively tested and verified and thus are the basis of a collaborative effort of the modelling community to advance the field as a whole.

The overall structure of an agent-based model is always composed of the same building blocks:

  1. A setup procedure initialises the system. Here, we define the system boundaries and the scale and extent of our ‘world’, load external data, and create and parameterise the initial set of agents. This procedure is executed only once, at the beginning of the simulation.

  2. Iterative procedures govern the behaviour of the model during one simulation step. They are repeated at each time step.

  3. Output procedures report the current state of the system in a visual and quantitative manner, e.g. by reporting current values of state variables, drawing graphs or saving state variable changes to a log file. Output procedures may be called at each time step, or only at the end of a simulation run.

9.3 Parameterisation

Parameterising a model is like sitting in front of a dashboard and turning the knobs before starting the show.

A parameter is an invariant input variable that represents a property of the modelled system.

During parameterisation, we assign a value to each parameter. For example, the parameter ‘tree growth per year’ may have the value of 20 cm per annual time step. Unlike variables, which change during the simulation depending on the current state of the system, parameters are either completely invariant or their variation is specified externally beforehand, e.g. to model an environment with constantly increasing temperature.
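The difference between a parameter and a state variable can be illustrated with the tree growth example. In the following minimal sketch, the growth rate of 20 cm per year is the parameter, while the tree height is a state variable that changes at every time step. The function name simulate_height and the initial height of 100 cm are assumptions made for illustration.

```python
# Parameter: fixed before the simulation starts and not changed by the model itself.
GROWTH_CM_PER_YEAR = 20


def simulate_height(initial_height_cm, years, growth_cm_per_year=GROWTH_CM_PER_YEAR):
    """Returns the tree height for every annual time step, including the start."""
    height_cm = initial_height_cm  # state variable: updated at each time step
    heights = [height_cm]
    for _ in range(years):
        height_cm += growth_cm_per_year
        heights.append(height_cm)
    return heights


heights = simulate_height(initial_height_cm=100, years=3)
```

Passing a different growth_cm_per_year corresponds to turning one knob on the dashboard before the run starts.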

Figure 9.4: Parameterising a model is turning the knobs of the model’s dashboard.

9.3.1 Calibration

One important function of model parameters is to enable a formalised model that describes the system in a generic way to be made specific to a particular system of interest. So, we can have a generic ‘forest model’ that can represent both a specific lowland beech forest in Germany and a high-mountain pine forest in the Himalayas, if we parameterise it accordingly. This can be done by calibrating the model parameters to fit a specific study area, so that the input-output behaviour of the model resembles the observed input-output behaviour of the system.
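One simple way to calibrate a parameter is to try a range of candidate values and keep the one whose simulated output is closest to the observations. The sketch below assumes a linear growth model and uses the sum of squared errors as the measure of fit. Both are choices made for illustration, and the observed heights are invented numbers, not real measurements.

```python
def simulate(growth, years, initial):
    """Generic model: linear growth from an initial height."""
    return [initial + growth * t for t in range(years + 1)]


def sum_of_squared_errors(simulated, observed):
    """Measure of fit between simulated and observed values (smaller is better)."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))


def calibrate(observed, candidates):
    """Returns the candidate growth value that fits the observations best."""
    years = len(observed) - 1
    return min(
        candidates,
        key=lambda growth: sum_of_squared_errors(
            simulate(growth, years, observed[0]), observed
        ),
    )


# Hypothetical yearly height measurements in cm from one study area.
observed_heights = [100.0, 118.0, 141.0, 159.0]
best_growth = calibrate(observed_heights, candidates=range(10, 31))
```

With these invented observations the search selects a growth of 20 cm per year. Observations from another study area would make the same generic model specific to that area.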

For some parameters, we may want to present a generic model and leave the actual parameterisation to the user or decision maker. Such parameters can be considered control variables. Once we have parameterised a model, we can push the ‘start’ button and let the first simulation run.

9.4 Model analysis

The output of a simulation run usually consists of a large quantity of numbers, i.e. data. These data describe the state of the system over time and the emergence of patterns at system level. However, to retrieve this information from the simulation data, we need to analyse it. Or as Grimm and Railsback (2005) put it: we need to decode the data to extract the information, i.e. the expected patterns. Thus, the adequate portrayal of emergent patterns is the core task in this modelling step.

A state variable is a variable that characterises the current state of a system or of an entity in the system.

Plotted over time, state variables visualise the dynamics of a system. A set of state variables describes a pattern. Taken together, they can be used to define the “state” of the system or of the model.

A pattern is a spatio-temporal phenomenon “above random” that describes emergent properties of the system.

9.5 Validation

Validation is the last step of the modelling cycle: here, we check the validity of the model outcome. There are many strategies for model validation and we will dedicate an entire lesson to them. For the time being, however, we assume that we validate the model by comparing it against reality. This is the most rigorous form of validation and can be seen as the default strategy for applied models. Validation is thus based on the comparison of state variables and spatio-temporal patterns between the modelled and the real system.

Figure 9.5: In validation, we compare the model outcome with reality.

There are two possible results of the validation step:

  1. If the model does not yet produce valid results, we will try to address the modelling step that contributes most to the deviation of the model from the real system. This is part of an overall validation strategy and will be discussed in more detail in a later lesson.

  2. If we conclude from the comparison between the model and the real-world system that the model is valid with respect to its purpose, we can stop the modelling cycle and proceed to use the model for our research.
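The comparison between model and reality, and the two possible results, can be sketched as follows. The sketch assumes that a state variable has been both modelled and observed at the same points in time and uses the root mean square error (RMSE) as one possible measure of deviation. The function names and the tolerance are hypothetical. An acceptable tolerance has to be derived from the purpose of the model.

```python
import math


def rmse(modelled, observed):
    """Root mean square error between modelled and observed values."""
    if len(modelled) != len(observed):
        raise ValueError("modelled and observed series must have the same length")
    squared_errors = [(m - o) ** 2 for m, o in zip(modelled, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def is_valid(modelled, observed, tolerance):
    """True if the deviation is within the tolerance set by the model's purpose."""
    return rmse(modelled, observed) <= tolerance


# Hypothetical population counts: model output versus field observations.
modelled_population = [50, 55, 61, 66]
observed_population = [50, 54, 63, 65]

if is_valid(modelled_population, observed_population, tolerance=2.0):
    outcome = "stop the modelling cycle and use the model"
else:
    outcome = "revise the modelling step that contributes most to the deviation"
```

A single number such as the RMSE covers only one state variable. Comparing spatio-temporal patterns usually requires additional, pattern-specific measures.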

9.6 Summary

The development of a model is an iterative process that involves much more than ‘just’ programming (or writing down mathematical equations if it is an aggregate model). Once you start implementing a model, you will often get drawn into syntax errors and programming manuals. However, don’t lose sight of the big picture: take a step back every once in a while and check where you are with your model. Check whether you have adequate validation data before you start programming, and allow as much time for analysis and validation as you invest in coding.

References

Bauer, Bernhard, Jörg P Müller, and James Odell. 2001. “Agent UML: A Formalism for Specifying Multiagent Software Systems.” International Journal of Software Engineering and Knowledge Engineering 11 (03): 207–30.
Bersini, Hugues. 2012. “UML for ABM.” Journal of Artificial Societies and Social Simulation 15 (1): 9.
Červenka, Radovan, Ivan Trenčanský, Monique Calisti, and Dominic Greenwood. 2004. “AML: Agent Modeling Language Toward Industry-Grade Agent-Based Modeling.” In International Workshop on Agent-Oriented Software Engineering, 31–46. Springer.
Grimm, Volker, Eloy Revilla, Uta Berger, Florian Jeltsch, Wolf M Mooij, Steven F Railsback, Hans-Hermann Thulke, Jacob Weiner, Thorsten Wiegand, and Donald L DeAngelis. 2005. “Pattern-Oriented Modeling of Agent-Based Complex Systems: Lessons from Ecology.” Science 310 (5750): 987–91. https://doi.org/10.1126/science.1116681.
Onggo, Bhakti SS, and Onder Karpat. 2011. “Agent-Based Conceptual Model Representation Using BPMN.” In Proceedings of the Winter Simulation Conference, 671–82. Winter Simulation Conference.
Ozik, Jonathan, Nicholson Collier, Todd Combs, Charles M Macal, and Michael North. 2015. “Repast Simphony Statecharts.” Journal of Artificial Societies and Social Simulation 18 (3): 11.
Taillandier, Patrick. 2014. “GAMAGraM: Graphical Modeling with the GAMA Platform.” In The 4th International Conference on Complex Systems and Applications, 6–p.