Lesson 1 Spatial Simulation: an overview

Preface

This web-book is a textbook with exercises that together form the learning materials for “Spatial Simulation”, an elective module of the UNIGIS distance learning program in Geoinformatics at the University of Salzburg.

UNIGIS Distance Learning program

The web-book is published under an open licence. I welcome everybody to explore the contents and to work through the simulation modelling exercises. Only the related discussion forum and the assignments are restricted to actively enrolled UNIGIS students who have signed up for this module.

About this web-book

Spatial simulation integrates the dimensions of space and time and thus explicitly represents process dynamics in their spatial context. It covers a wide range of application areas, such as transportation, hydrology, population ecology, healthcare, land-use change and wildfire prevention. Depending on the nature of a specific spatio-temporal phenomenon, a multitude of methodological approaches has been developed. The wide range of model categories is confusing at first sight: there are mathematical, numerical, analytical, probabilistic, stochastic, deterministic and individual-based models.

In the first part, this module provides a theory-based categorisation of models and the respective type of problems they can solve. You will gain the competence to identify the most adequate modelling approach for your research interest or problem domain. In the second part, there will be a specific focus on models that explicitly incorporate the spatial perspective in their structure: cellular automata and agent-based models.

For exercises and assignments, the open-source modelling framework ‘GAMA’ will be used. GAMA’s modelling language was explicitly designed for coding agent-based models, and the platform is particularly well suited to working with GIS data. It comes with a great wealth of model examples that are openly accessible and usable through GAMA’s model library. Starting from these examples, you will learn to use, modify and design your own models, and to work with your own GIS data.

For me, it was a great experience to design and write this module. My motivation to make this effort was rooted in my own experience as a fresh PhD student, desperately looking for a textbook that would help me get an overview of a confusing abundance of spatially explicit modelling methods. I hope this module helps to fill the gap.

One last request: This is a living document. I try to provide and maintain high-quality and up-to-date materials. Please report any issue that you may find to the UNIGIS issue tracker on GitHub (account needed) or drop me an email.

Kind regards, Gudrun Wallentin


This web-book is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

1.1 Models

Everything we think we know about the world is a model (Meadows and Wright 2008)

This is a rather philosophical way to think about models. In Geoinformatics, models are part of our daily work. The same real-world phenomenon can be represented in different ways, e.g. as a raster or as a vector model. An architect may work with physical models, e.g. miniature buildings made of paper, to present ideas to the public in a tangible way. Although simplified, such a paper house needs to be sufficiently detailed to enable neighbours to envision the future building.

The commonly accepted definition for models is the following:

Models are abstract and simplified representations of reality that are designed for a particular purpose

Following up this definition, simulation models are computer models that iteratively recalculate the state of a system as it changes over time, where the recalculation is based on mathematical, empirical and / or logical relationships that describe the system (O’Sullivan and Perry 2013).

In this module we talk about spatial simulation models, which represent both space and time in an integrated way. The terms ‘spatio-temporal models’ or ‘dynamic models’ are therefore often used synonymously for spatial simulation models.

1.2 Areas of Application

Simulation modelling can be used in a wide range of application domains, including

  • Environmental Planning
  • Climate Change Impact and Mitigation Planning
  • Manufacturing Applications
  • Construction Engineering
  • Military Applications
  • Logistics, Transportation, and Distribution Applications
  • Business Process Simulation
  • Healthcare

The common denominator of all application domains listed above is that we can model, analyse and visualise complex systems. In the early days of simulation modelling, when computing power was limited, modellers were restricted to abstract and generic models. These provided new and insightful views on the general functioning of the studied systems, but could hardly be applied to specific study areas. Only since the advent of powerful and fast computers has it become possible to model real-world systems and, accordingly, to explore real-world problems across space and time.

This YouTube video is provided by the developer team of the GAMA modelling software, which we will use in this module. It shows how GAMA earned its reputation as the best agent-based simulation modelling software for dealing with large-scale GIS data, and it presents some quite impressive real-world application examples. The video contains no spoken information, so you can simply switch off the rather annoying background sound if you want.

Simulation modelling is not a trivial task, and it clearly is on the advanced side of the spatial analysis toolbox. Despite the much increased usability of simulation modelling frameworks, it takes quite some expertise to design, implement and validate a model in such a way that it is useful for solving a problem. Before we start to think about HOW to solve problems with simulation models, we take our time to think about WHY we want to use simulation models. Or to phrase it differently: for which problems should we use simulation models, and for which problems can (and should) we use simpler approaches?

The short answer to this question: simulation models are appropriate for problems that cannot be studied in the real system or in a physical lab experiment, or only at great expense. This might be the case if the studied system

  • is dangerous, e.g. explosions of nuclear power plants,
  • is large, e.g. questions related to global climate change,
  • has a time scale that exceeds reasonable time frames of physical experiments, e.g. geological plate tectonics,
  • is a historical system in the past, e.g. expansion of the Roman Empire,
  • would be impacted in unacceptable ways by the manipulations of interest, e.g. forest fire distribution under different management regimes,
  • involves a significant portion of randomness and we want to learn about the range of potential system behaviour (process variability).

Simulation models are sophisticated tools. Although they can be very helpful, it is not appropriate to use them for simple tasks.

If a problem can be solved by common sense or analytically, the use of simulation is unnecessary. Additionally, using algorithms and mathematical equations may be faster and less expensive than simulation modelling. Also, if the problem can be solved by performing direct experiments on the system to be evaluated, then conducting direct experiments may be more desirable than simulating.

In summary, simulation models are appropriate to solve complex problems. Specifically, if:

  1. the problem cannot be solved using common sense,
  2. the problem cannot be solved analytically,
  3. real-world experiments cannot be performed, because they would be too big, dangerous or costly,
  4. the time and resources to build a model are available,
  5. there is a basic understanding of the system of interest,
  6. it is possible to validate the model.

1.3 Complex systems

Warren Weaver (1948) offered an interesting view on complexity. He argued that classical science has traditionally focused on systems with either very few or very many elements. In the first case, problems can be solved analytically (i.e. with mathematical equations); in the latter case, they can be treated with statistical methods. Unfortunately, in the real world we often find ‘middle-numbered’ systems between these two extremes, with too many unknown variables for an analytical solution and at the same time not enough elements to average out. Single events in such complex systems can have a large impact on the results, and outcomes are hard to predict. Along these lines, decision-making situations can be categorised into four problem domains that call for different solution strategies, known as the ‘Cynefin (kʌnɨvɪn) framework’ (Snowden 2000):

  • Simple problems can be solved by straightforward categorisation; there is a clear ‘best practice’ solution.
  • Complicated problems need to be analysed to find an adequate solution. The solution represents ‘good practice’ (there might be other good solutions too).
  • Complex systems follow some guiding principles, but the behaviour of the system cannot be intuitively predicted: it ‘emerges’ from the behaviour of the system components. This problem domain lends itself to simulation modelling as a tool that helps to solve problems.
  • Chaotic systems have no inherent causal relationships; every situation is novel, and the first solution strategy is immediate action to stabilise the system.

Cynefin framework

Before we dive deeper into modelling, let me provide a list of important terms:

  • Conceptual models break down a real-world phenomenon into parts that are relevant for the model’s purpose.
  • Physical models are ‘hardware’ models, such as wind tunnels used to investigate the airflow around a downscaled replica of an aircraft wing.
  • Simulation models are computer models used to conduct virtual experiments and to find out how the system behaves across space and time.
  • Theoretical models describe relations between their components that are rooted in theoretical considerations from the literature.
  • Mathematical models describe the relations between system components with the help of mathematical equations.
  • Analytical models are models that can be solved exactly with pencil and paper by using mathematical reasoning.
  • Numerical models are mathematical models that cannot be solved exactly. They slice continuous space / time into discrete pieces for a stepwise approximation.
  • Empirical models describe the relations between system components based on observational evidence.
  • Deterministic models are fully reproducible: the same parameter settings lead to the same outcomes.
  • Stochastic models have random elements encoded. Outcomes thus vary between model runs.
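
To make the difference between analytical and numerical models more tangible, here is a minimal Python sketch. It assumes a simple exponential growth model, dN/dt = r · N, which is not part of this module's materials; all function names and parameter values are illustrative assumptions. This model happens to have an exact solution, so we can compare it with a stepwise approximation (the explicit Euler method) that slices time into discrete pieces.

```python
import math

def analytical_solution(n0, r, t):
    """Exact solution of dN/dt = r * N, i.e. N(t) = N0 * exp(r * t)."""
    return n0 * math.exp(r * t)

def numerical_solution(n0, r, t_end, dt):
    """Stepwise (explicit Euler) approximation of dN/dt = r * N."""
    steps = round(t_end / dt)
    n = n0
    for _ in range(steps):
        n += r * n * dt  # advance the state by one discrete time step
    return n

# Illustrative values only: initial size 100, growth rate 0.1, 10 time units.
exact = analytical_solution(100.0, 0.1, 10.0)
coarse = numerical_solution(100.0, 0.1, 10.0, dt=1.0)
fine = numerical_solution(100.0, 0.1, 10.0, dt=0.01)

print(f"exact={exact:.2f}  coarse steps={coarse:.2f}  fine steps={fine:.2f}")
```

With smaller time steps the approximation moves closer to the exact value, at the cost of more computation. For models without an exact solution, only the stepwise route is available.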

1.4 Why do we need simulation models?

1.4.1 Prediction models

What will the future look like? Arguably, this is one of the most tempting questions to ask when somebody presents a simulation model to us. This is natural human curiosity and, after all, it has been accurate predictions that pushed forward scientific knowledge and our understanding of the world. The position of stars can be exactly predicted based on the understanding of planetary orbits, and chemical reactions are explained by our understanding of atoms and molecules.

The purpose of simulation models traditionally was to predict the future state of a system. Systems that we understand very well and for which we have abundant, reliable and detailed data are the ones that are most suitable for prediction modelling. Typical application domains for predictive models are traffic, hydrology, or evacuation management.

Unfortunately, simulation models do not lend themselves to accurate predictions. Unlike problems that can be solved analytically (as for engineered systems) or probabilistically (as for weather), simulation models aim at the study of the very nature of a complex, living system. Thus, simulations always have a range of uncertainty associated with them. Weather forecasts are prominent examples of the prediction of stochastic (random) processes: meteorologists can predict tomorrow’s weather, but we can never be completely sure that the forecast will be correct. Meteorologists therefore often use phrases like “the probability of rain showers is highest in the west” or “in the afternoon or early evening, first snowfall is expected”. These statements express spatial, temporal and attribute uncertainties.
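
The range of uncertainty mentioned above can be illustrated with a small Python sketch. It assumes a toy stochastic growth model in which the growth rate of each time step is drawn at random; the model, its parameter values and the function name `run_model` are hypothetical and serve only as an illustration. Instead of a single prediction, we run the model many times and report the spread of outcomes.

```python
import random
import statistics

def run_model(seed, steps=50, n0=100.0, mean_growth=0.02, sd_growth=0.05):
    """One run of a toy stochastic growth model (all parameters are assumptions)."""
    rng = random.Random(seed)  # a fixed seed makes a single run reproducible
    n = n0
    for _ in range(steps):
        # The growth rate of each time step is a random draw.
        n *= 1.0 + rng.gauss(mean_growth, sd_growth)
    return n

# Repeat the virtual experiment 200 times with different random seeds.
outcomes = sorted(run_model(seed) for seed in range(200))
low = outcomes[int(0.05 * len(outcomes))]    # approximate 5th percentile
high = outcomes[int(0.95 * len(outcomes))]   # approximate 95th percentile
median = statistics.median(outcomes)

print(f"median={median:.1f}, 90% of runs fall within [{low:.1f}, {high:.1f}]")
```

The result of such an experiment is a distribution of outcomes rather than one number, which is the more honest way to communicate what a stochastic simulation can tell us about the future.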

For some problems, we might have tons of data, but still do not thoroughly understand the underlying processes. Global financial markets are an example of such systems. Modelling in this case can help in the development of theories.

Other problem areas revolve around questions that are well rooted in theoretical understanding, but lack detailed data. The population dynamics of most animal groups are a good example here.

Last but not least, we have these nasty questions where we can draw neither on much data nor on a good process understanding of the studied phenomenon. The most adequate use of simulation models in such cases is as ‘tools to think with’, i.e. learning devices that help the researcher to gain a better understanding of the system and that also point towards the areas with the most pressing need for data.

1.4.2 Models to guide data collection

For simulation models that are explicitly defined to solve a real world problem, it is essential to have reliable real-world data. The primary role of a simulation model in this case is to assess the sensitivity of model predictions to changes in the input parameters. This way, it is possible to determine the most critical empirical data that are needed to improve our understanding of the system. This kind of analysis is called ‘sensitivity analysis’ and it involves a systematic change of input parameters to find out which parameters have the strongest effects on the system behaviour.
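
A sensitivity analysis of this kind can be sketched in a few lines of Python. The sketch assumes a toy logistic growth model and a simple ‘one-at-a-time’ design, in which each input parameter is varied by ±10% while all others stay at their baseline value. The model, the parameter values and the function names are illustrative assumptions, not part of the module's exercises.

```python
def logistic_model(r, k, n0, steps=20):
    """Toy deterministic model: discrete logistic growth towards capacity k."""
    n = n0
    for _ in range(steps):
        n += r * n * (1.0 - n / k)
    return n

# Hypothetical baseline parameters: growth rate, carrying capacity, initial size.
baseline = {"r": 0.3, "k": 1000.0, "n0": 10.0}

def sensitivity(params, change=0.1):
    """Relative change in model output when one parameter varies by +/- change."""
    base_out = logistic_model(**params)
    effects = {}
    for name in params:
        outputs = []
        for factor in (1.0 - change, 1.0 + change):
            varied = dict(params)            # copy, so only one parameter changes
            varied[name] = params[name] * factor
            outputs.append(logistic_model(**varied))
        effects[name] = (outputs[1] - outputs[0]) / base_out
    return effects

effects = sensitivity(baseline)
# List parameters from the strongest to the weakest effect on the outcome.
for name, effect in sorted(effects.items(), key=lambda item: -abs(item[1])):
    print(f"{name}: {effect:+.3f}")
```

The parameter at the top of the list is the one for which better empirical data would improve the model most. Note that a one-at-a-time design ignores interactions between parameters; it is the simplest possible variant, chosen here for readability.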

An example of an application domain that frequently takes advantage of simulation models to guide data collection is the conservation of endangered species. On the other hand, rather depressing evidence of the misuse of simulation models is the case of fisheries in the Adriatic Sea: not least because uncertainties in input parameters were ignored, the stochastic behaviour of the system led to policies that resulted in overfished ecosystems.

Data shortage was a major issue until quite recently. Today, the situation has changed dramatically and most domains are confronted with an oversupply of data: ‘information explosion’, ‘big data’, and ‘data big bang’ are related buzzwords. Consequently, we are in the rather new and unfamiliar situation of having huge amounts of empirical data, while still lacking a clear understanding of the causal relationships that govern the dynamics of the underlying system.

With the help of simulation models we can test various hypotheses against data and thus gain an increasing understanding of the system. The major difficulty of this approach is the equifinality problem: several different models may describe the state of a system equally well. We might thus be misled if we assume that a simulation outcome that reproduces the state of a real system must result from a correct model. Later in this module we will discuss a way to address the problem of equifinality with the approach of ‘pattern-oriented modelling’ (Grimm et al. 2005).
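
The equifinality problem can be illustrated with a deliberately simple Python sketch. It assumes a hypothetical population that was observed to grow from 100 to 200 individuals within 10 time steps; the numbers and both candidate models are made up for illustration. A linear and an exponential growth model can each be calibrated to reproduce the observed final state exactly, although they describe different processes.

```python
import math

# Hypothetical observation: the population doubled from 100 to 200 in 10 steps.
N0, N_OBSERVED, T_END = 100.0, 200.0, 10

def linear_model(t, slope):
    """Candidate model 1: a constant number of individuals is added per step."""
    return N0 + slope * t

def exponential_model(t, rate):
    """Candidate model 2: growth is proportional to the current population."""
    return N0 * math.exp(rate * t)

# Calibrate each model so that it reproduces the observed final state.
slope = (N_OBSERVED - N0) / T_END
rate = math.log(N_OBSERVED / N0) / T_END

final_linear = linear_model(T_END, slope)
final_exponential = exponential_model(T_END, rate)
print(f"final state: linear={final_linear:.1f}, exponential={final_exponential:.1f}")

# A second pattern, here the state halfway through, tells the models apart.
mid_linear = linear_model(T_END / 2, slope)
mid_exponential = exponential_model(T_END / 2, rate)
print(f"mid-point:   linear={mid_linear:.1f}, exponential={mid_exponential:.1f}")
```

Judged by the final state alone, both models look equally ‘correct’. Only when a second observation is taken into account do they become distinguishable. This is a strongly simplified illustration of why comparing a model against several patterns at once helps, not an implementation of the method by Grimm et al.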

1.4.3 Simulation models in science

In scientific research, simulation models are increasingly used as heuristic learning tools to gain a better understanding of a particular system. As such, spatial simulation models are a relatively recent addition to the scientific toolbox that can be thought of as ‘virtual laboratories’ for hypothesis testing. This is an essential part of research in the hypothetico-deductive approach, the standard approach to acquiring new knowledge in the (natural) sciences.

The hypothetico-deductive approach in science

  1. Formulate a hypothesis
  2. Deduce predictions from the hypothesis for a specific problem in a specific study area
  3. Test the prediction in an experiment
  4. Analyse the results: does the hypothesis hold or was it falsified?

Given that simulation models in this context can be conceptualised as laboratories, simulations correspond to experiments that test a hypothesis. Just as in ‘real’ experiments, the state of the system is recorded and tracked over time for subsequent analysis.

1.5 How to solve problems with modelling

1.5.1 Definition of the purpose

The nature of the problem that needs to be solved determines the purpose of the model. In scientific models the hypothesis equals the problem. No model exists for its own sake: every model is developed for a particular purpose. Or to put it the other way round: every problem needs its own model. Clearly, some parts of one model can be reused in another model with a similar purpose. However, there is nothing like a ‘generic’ model, nor a model that is generally ‘valid’. The purpose guides the modeller on how to break down the real-world system under investigation into a conceptual model. Further, it is the model’s purpose that helps to select an adequate modelling approach.

Therefore, the ‘problem definition’ is the first step in the development of any model. It needs to be an explicit, specific and operational definition of the problem at hand. It is usually stated as a hypothesis or as a product specification.

1.5.2 Conceptual model

Let’s consider a real-world phenomenon that we want to study to solve a related problem. For example, think of a delegate of the European Union who wants to know about the impact of alternative agricultural subsidy regimes under discussion on land-use change across Europe. Or think of a bank director who wonders whether to invest in a ski resort in the Swiss Alps, despite climate change. The first step in model development is to design a conceptual model. To do so, we need to explicitly define the purpose, scope, grain and extent of our model. Further, the phenomenon under study is broken down into general elements:

  • System components are discrete entities that represent the elements of the system under study which are relevant to the model’s purpose, e.g. tourists from different destinations across Europe.
  • State variables are attributes at system level or at the level of particular components that allow for quantitative measurement of the state of a system, e.g. annual revenue in Swiss ski resorts.
  • Interactions describe which system components are interrelated and how these relationships are organised, e.g. the word-of-mouth effect between tourists from the same destination about recent snow cover.
  • Processes describe how a system and / or its components evolve over time. Depending on the modelling approach, processes might be described theoretically (rule-based models), statistically (empirical models), or mathematically (System Dynamics models), e.g. the linear increase of temperature in the Swiss Alps over the next 50 years.

1.5.3 Choosing a modelling approach

Simulation models can conceptualise the system from a top-down (system-level) or a bottom-up (individual-level) perspective. In this module we will focus on bottom-up modelling approaches, as they are more relevant for solving spatial problems. For now, here is a brief description of the core characteristics of the four main approaches to simulation modelling:

  • System Dynamics models describe relations between sub-systems (components) with the help of differential equations. Components are aggregated and quantifiable entities, e.g. populations, biomass, market stocks or the volume of water in a lake. System Dynamics models are deterministic, continuous over time, and inherently non-spatial.
  • Discrete Event models describe relations between subsequent events with the help of flow graphs. Individual elements (e.g. customers in a shop or goods in an automatic production process) move along the graph, get delayed by operating processes, and wait in queues until it is their turn. Each event triggers new events. The processing schedule emerges during the simulation. Time is handled in a purely event-driven way.
  • Cellular Automata models describe relations between neighbouring cells in a grid for each time step with the help of if-then rules. These models can be deterministic or stochastic, depending on whether randomness is incorporated in the rules; they are discrete in time and inherently spatial.
  • Agent-based models describe local relations between individual objects (= agents) and their local environment based on a set of rules. Agent-based models are stochastic, discrete in time and inherently spatial.
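
To give a first impression of how if-then rules between neighbouring cells translate into code, here is a minimal Python sketch of a cellular automaton. As a familiar example it uses the rules of Conway's ‘Game of Life’, which are an assumption chosen for illustration and not taken from this module's exercises (these will be coded in GAMA). The grid wraps around at its edges, and the rules are deterministic.

```python
def step(grid):
    """Compute the next time step of a Game of Life grid (1 = alive, 0 = dead)."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count living cells among the eight neighbours (edges wrap around).
            neighbours = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            # If-then rules: survival with 2 or 3 neighbours, birth with exactly 3.
            if grid[r][c] == 1 and neighbours in (2, 3):
                new_grid[r][c] = 1
            elif grid[r][c] == 0 and neighbours == 3:
                new_grid[r][c] = 1
    return new_grid

# A vertical bar of three living cells (the so-called 'blinker').
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]

after_one = step(blinker)
after_two = step(after_one)
for row in after_one:
    print("".join("#" if cell else "." for cell in row))
```

Each cell only looks at its direct neighbours, and all cells are updated at the same time based on the previous state. The overall pattern, in this case a bar that flips between vertical and horizontal, is not written into any rule but results from the local interactions.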

1.6 Summary

Everybody views the world differently, and thus an unlimited number of models of the world exists. Or to put it the other way round: different problems in the world call for different models to address them. Many problems can (and should!) be solved without the help of simulation models. However, complex systems in which intelligent entities constantly adapt to their environment are not fully predictable. Such systems are best understood by means of simulation modelling.

1.7 Exercise - Start off with GAMA

1.7.1 Install the software

In this module, we will use the latest version of GAMA: GAMA 1.8 Release Candidate 1. Please follow GAMA’s installation guidelines to download and install the software on your machine.

1.7.2 Code your first model

Watch the video and implement your first model alongside the video instructions.

GAMA in 10 min

The above YouTube video is the official “Hello World” introduction to GAMA. This 10-minute video gives you a nice overview of the modelling platform that we will use for the assignments in this course.

1.7.3 GAMA resources

I have collected a few helpful resources for coding in GAMA. You will find these in the third section of the chapter Coding 101: Part III: GAMA resources. Among other things, you will also find some tutorials there. The exercises in this module should equip you with all the skills necessary to manage the assignments, but of course you are always welcome to explore further!

There is no need to rush, though: your first GAMA assignment will not start before Lesson 5.

References

Grimm, Volker, Eloy Revilla, Uta Berger, Florian Jeltsch, Wolf M Mooij, Steven F Railsback, Hans-Hermann Thulke, Jacob Weiner, Thorsten Wiegand, and Donald L DeAngelis. 2005. “Pattern-Oriented Modeling of Agent-Based Complex Systems: Lessons from Ecology.” Science 310 (5750): 987–91. doi:10.1126/science.1116681.

Meadows, Donella H, and Diana Wright. 2008. Thinking in Systems: A Primer. Vermont: Chelsea Green Publishing.

O’Sullivan, David, and George LW Perry. 2013. Spatial Simulation: Exploring Pattern and Process. John Wiley & Sons.

Snowden, David. 2000. “Cynefin: A Sense of Time and Space, the Social Ecology of Knowledge Management.” In Knowledge Horizons: The Present and the Promise of Knowledge Management, edited by C. Despres and D. Chauvel. Oxford: Butterworth Heinemann.

Weaver, Warren. 1948. “Science and Complexity.” American Scientist 36: 536–44.