Limitations
If you learn only one thing from our book, let it be this: our large world is not formalisable. More precisely, it is not formalisable in its entirety. If you take this on board, then it is evident that the aim of science cannot be to generate some kind of ultimate theory of everything. Not only is it unlikely that this is possible, but it is also not what we want. A completely formalised world is small and thus hostile to life and freedom. As an alternative, our kind of science is concerned with making maps of the territory: more maps, more trustworthy maps, and also more useful maps. The ultimate aim of science is not to unify all of these maps into one big super-map. That would not be useful. Instead, it is to provide us limited beings with robust tools for an actionable understanding of our large world: allowing us to find our way around, to act in coherent and meaningful ways, to recognise our situation, to be truly at home in the universe.
But there is an obvious tension here: science, as skillful modelling practice, is also a thoroughly analytic undertaking. As we have seen, modelling — and that (as we have said) includes the design of laboratory experiments, and other experimental interventions — requires abstraction, simplification, idealisation. We need to take the world apart, at least to some degree, if simply hugging a tree is not enough for us. And we don’t want to end up with Borges’ map — as large and fine-grained as the territory itself. As scientists, we always need to find a compromise between the complexity of the world and the limitations of our own comprehension, between accuracy and tractability. Scientific knowledge must explain, it must be understandable to us, it must inform the actions of us limited human beings.
As we’ve seen in the last chapter (and this appendix), we can use the mathematical tools of set theory to put our abstractions, our logic, and our thinking about how we model the world on a firm formal basis. Formalisation is helpful and, in fact, necessary both for doing science and for understanding how we do it! It’s just that we cannot hope to formalise everything. Not even the activity of skillful modelling itself. Every formal model is limited in its explanatory power and domain of application.
But the large world we are modelling is not limited like that. Philosopher Nicholas Rescher has a name for this simple fact: he calls it ontological excess. There are always more facts that can be known about the world than we can squeeze into our formal models, or express as symbolic propositions.
Surprisingly, the same applies to the abstract universe of set theory, and the universe of mathematics and logic more generally. Formalisation is limited, even in this purely symbolic domain. In the case of set theory, this is because there is no set that contains all sets (see also the appendix).
Let’s examine this counterintuitive claim, step by step. In chapters 10 and 11, we introduced two epistemic cuts that are fundamental prerequisites for doing science — Rosen’s two “dualisms.” The first separates the self from the rest of the world; the second distinguishes a natural system from the ambience. This is what enables formal analysis in the first place. While we can model both self and system as a set, this is not the case for the ambience. The epistemic reason is that the large world contains unknown unknowns — things we do not know we don’t know — which therefore cannot be captured as distinguishable elements of a set.
But there is a more abstract reason as well. Think of a formal system modelling a natural one: in the previous chapter, we defined such a model by the set of all its possible states, which we called S. Ideally, S should be a well-defined and unchanging subset of the ambience. This seems to imply that the ambience itself can be represented by a set as well. After all, there has to be something S is a subset of. And yet, there are two problems. First, as we shall see shortly, not all natural systems can be formalised as a well-defined set. And second, the problem of relevance suggests that we may define an indefinite number of systems as subsets of the ambience, because we cannot decide in advance what will be relevant in any given situation. Even if each such formally defined system were a well-defined subset of the ambience, their number would still be indefinite, and thus the totality of these subsets (their set-theoretic union) would contain an indefinite number of elements: their totality is not itself a well-defined set.
This problem is known to mathematicians as Cantor’s paradox. For those of you who are interested, we provide mathematical background in this appendix. The gist of it is: sets may be finite or infinite in size, but they must have a definite number of elements, which mathematicians call the cardinality of a set. Each set is therefore characterised by a cardinal number, which is just a natural number in the case of a finite set, and a specific kind of infinity for an infinite set. If a set contains five elements, its cardinality is “five.” If it contains countably infinitely many elements, its cardinality is “countable infinity.” And so on.
Now consider the set of all sets. What size would it have? Well, here’s the paradox: Cantor proved that the set of all subsets of any given set — its power set — always has a strictly larger cardinality than the set itself. But every subset of the set of all sets is itself a set, and hence already one of its members. Its power set would thus be contained within it, while at the same time being strictly larger than it: the set of all sets would have to be larger than itself! Ouch. This makes our heads hurt. And yet, we can draw a simple logical conclusion from it: the universe of set theory, which contains all possible sets, cannot itself be a set. It is not well-defined, has an indefinite number of elements, and thus its size is indeterminate. In fact, it must be larger than any infinity mathematicians can possibly think of. Just how weird is that?
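For readers who want to see the engine of the paradox turn over, here is a minimal sketch in Python (the set, the pairing, and all function names are our own invention, purely for illustration). It demonstrates Cantor’s diagonal trick on a small finite set: however we assign a subset to each element, the diagonal set slips through, so no set can be paired up with all of its own subsets.

```python
from itertools import chain, combinations

def powerset(a):
    """All subsets of a finite set, as frozensets."""
    items = list(a)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def diagonal_witness(a, f):
    """Cantor's diagonal set D = {x in a : x not in f(x)}.
    D differs from f(x) for every x, so f misses at least one subset."""
    return frozenset(x for x in a if x not in f(x))

a = {0, 1, 2}
# One attempt at pairing elements with subsets (any attempt fails).
f = {0: frozenset({0, 1}), 1: frozenset(), 2: frozenset({2})}
d = diagonal_witness(a, lambda x: f[x])
assert d not in f.values()               # the diagonal set is always missed
assert len(powerset(a)) == 2 ** len(a)   # |P(A)| = 2^|A|, strictly bigger than |A|
print(d)                                 # frozenset({1})
```

Applied to the supposed set of all sets, the same construction produces a subset that no pairing can reach, which is exactly the engine behind the paradox.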
Think of it: set theory is a purely formal mathematical theory, but the abstract universe it exists in is not formalisable in its entirety! You can imagine what controversy this caused in the late 19th century, at the time of its discovery. And it led to an even worse paradox that threatened the consistency of the very foundation of mathematics (see the appendix). We’ll come back to that when we talk about modelling organisms. What’s important here is that Cantor’s paradox gives us a formal reason why the ambience cannot be a set: to model it formally, we’d need that set of all sets which, unfortunately, does not exist.
As it turns out, set theory is far from the only mathematical construction that requires us to compromise between consistency and completeness. In 1931, some three decades after Cantor formulated his paradox, Kurt Gödel published his two famous incompleteness theorems, which rocked the mathematical world yet again. Gödel, in fact, used a method developed by Cantor to show that any sufficiently sophisticated formal system that is consistent will be limited in the sense that there will always be truths that fall into its domain but cannot be proven using only its existing set of fundamental postulates, or axioms. In particular, such systems cannot be used to prove their own consistency. This put an end to the formalisation program of David Hilbert, who had set out to ground all of mathematics on a well-defined logical foundation. Hilbert defined “formalisation” just the way we do: to turn ill-defined problems into well-defined ones. Surprisingly, it is not possible to do this for all of mathematics in a globally consistent way. And Gödel proved that, beyond any doubt.
So, now we know: not only is our physical world large, but so is the abstract world of mathematics and logic! Which is truly remarkable: even formal systems (like set theory) cannot be entirely and conclusively formalised. This is a bit confusing, because of the two different uses of “formal” in the previous sentence. The first simply emphasises the fact that mathematical theories are symbolic and rule-based; the second, more specifically, means that they can be completely and precisely defined using a finite number of axioms and syntactic rules (Hilbert’s narrower sense of “formalisation”).
All of this, by the way, is intimately related to the decision problem, which Turing proved to be undecidable: even well-defined problems (like the decision problem itself) are not, in general, computable. Both our inability to completely formalise mathematics and our inability to compute some well-defined problems reveal general limitations when it comes to handling symbolic procedures. We’ll say it again and again: the map is not the territory! And it will never be, no matter what.
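Since the diagonal argument carries over directly, here is a minimal sketch of Turing’s version in Python dress. The oracle halts below is a stub of our own invention; the construction shows that no body we could ever write for it survives being fed its own diagonal.

```python
def halts(f, x):
    """Hypothetical halting oracle (a stub we made up for illustration).
    Put any total, always-terminating decision rule here you like:
    the diagonal construction below defeats every such rule."""
    return True  # naive guess: "everything halts"

def paradox(f):
    """Turing's diagonal program: do the opposite of the oracle's
    verdict on f applied to itself."""
    if halts(f, f):
        return "loops forever"  # stands in for: while True: pass
    else:
        return "halts"

# The oracle predicts that paradox(paradox) halts; by construction it
# would then loop forever. Flip the stub to return False and it is wrong
# the other way round. No implementation of halts can always be correct.
print(halts(paradox, paradox), "vs actual behaviour:", paradox(paradox))
```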
But formalisation is much broader than computation. It not only constrains our capacity to solve certain problems, but also reveals the essential role of relevance realisation in the process of formalising a problem. This is why we should not ask whether the physical world is computable, but whether it is formalisable in the first place. The answer, as we have seen, is a resounding “no!” In fact, the world is neither formalisable nor computable. And worse: we cannot even compute or formalise our own symbolic constructions — at least not all of them, at once. What an extraordinarily strange insight!
We have now covered one way in which neither the symbolic nor the physical world is a set. But there is another fundamental shortcoming of set theory, which will take us back to the problem (mentioned above) that not all natural systems can be encoded as a well-defined set of states.
This limitation is as follows: set theory relies on the basic assumption that the world can be subdivided into distinguishable, stable entities — elements that make up a set — whose characteristics are somehow determined intrinsically. Remember Locke’s primary and secondary qualities? Primary qualities are those that are entirely internal to the object they define. Like the atomic number of gold. The problem is: there are no such purely internal qualities in the natural world! We’ve discussed this before: a feature of the world must interact with us (or our measuring device) in some way to be observable at all. In this sense, all features we can extract from the world are, strictly speaking, secondary qualities in Locke’s sense.
Not only are the properties of objects relational, rather than internal, but the objects themselves are not truly stable either. Again, we have discussed this already: even protons are thought to decay eventually (albeit over ridiculously long time spans). More importantly: what we pick out of our stream of experience are not actually objects, but more or less stable patterns that we recognise as distinct from the rest of our perception of the ambience. There is another basic tension here: the world we experience is fundamentally processual, but our most fundamental mathematical theory (and our logic) is based on unchanging things. Is it sane to base our entire formal universe on such an obviously unrealistic assumption? We have our doubts. But before we dismiss set theory outright (it is extremely robust and useful, after all), let us have a closer look at the specific problems this mismatch is causing.
Set theory represents processes as connections between sets. As we have seen in the last chapter, this can be expressed as a mathematical relation, which connects any number of elements of its source (or domain) to any number of elements of its target (or range). Let’s consider again the example of sorting people into age groups: if the age ranges we define overlap, each person can be part of multiple groups, while each group contains people with a range of different ages. The process this relation describes is the activity of sorting people into groups. Alternatively, we can express a process as a mapping (a special kind of relation), which associates each element of the domain with exactly one element of the range. Recall the example of the mercury thermometer: it implements the process of measuring temperature by letting the volume expansion of the mercury equilibrate to a particular value on the thermometer’s scale.
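As a minimal illustration (all data and numbers below are invented), here is how the two encodings look side by side: the sorting process as a many-to-many relation, and the thermometer as a mapping that assigns exactly one reading to each state of the mercury column.

```python
# The sorting process as a *relation*: a set of (person, group) pairs.
people = {"Ada": 36, "Ben": 64, "Cleo": 12}
groups = {"child": range(0, 18), "adult": range(16, 120), "senior": range(60, 120)}
relation = {(name, grp) for name, age in people.items()
            for grp, span in groups.items() if age in span}
# Overlapping ranges: Ben lands in both "adult" and "senior".

# The thermometer as a *mapping*: one column height, one reading.
def reading(height_mm: float) -> float:
    """Toy calibration curve: column height (mm) to temperature (C)."""
    return 0.5 * height_mm - 10.0

print(sorted(relation))   # only the outcome of sorting, not the activity
print(reading(70.0))      # only the end point, nothing about the dynamics
```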
What is striking here is that the focus is almost exclusively on the outcome, and not on the dynamics that get us there. We group people into age classes for some specific purpose. It does not matter how. And all that really interests us about the thermometer is reading the temperature off its scale, not how the mercury column gets there. Thus, while set theory can easily represent relations of various kinds, it focuses on the elements, i.e., objects, that form the starting and (more importantly) end points of a given process. This has severe drawbacks when studying phenomena that depend on the trajectories between those points, or when studying processes that have no clear beginning and/or ending. Many patterns we pick out from our experience are history- and context-dependent like this, fading in and out gradually against a messy and constantly changing background of other processes.
Think of the ontogenesis of an organism. In its broadest sense, this is the process by which a living being acquires the capacity to reproduce during its life cycle. It is how it “grows up” to become an adult. This includes the development of multicellular organisms, but is broader than that. Other processes — metabolic, physiological, even reproductive behaviour — are also required to acquire the capacity to reproduce, and are therefore an essential part of ontogenesis. Moreover, single-celled organisms “grow up” as well: they too must mature — replicating their genome and their organelles, growing to a larger size — and they must find mates if they reproduce sexually.
But how do we define the beginning and end of ontogenesis? Any decision in this regard is arbitrary to some degree, since we are dealing with a life cycle after all. We could take fertilisation as the beginning of a new life, of course, but not all organisms reproduce sexually. Alternatively, we could take birth (in animal reproduction), or other key events during the life cycle, and argue just as convincingly that life begins there. Even trickier is the question of when the offspring becomes an individual. Shouldn’t this be the beginning of life? But individuation is an ill-defined and gradual process. Pinpointing the end seems easier: life ends at death. Yet, death is not instantaneous either. Moreover, asexually reproducing lineages (or the germlines of sexually reproducing ones) are immortal in some sense, since the dynamic organisation of a parent continues to live on in its offspring. Anyway, doesn’t the reproductive cycle end when fertility ceases, which is often long before death, especially in modern humans?
At a more fine-grained level, biologists tend to subdivide life cycles into discrete stages. Again, we have the problem of determining the beginning and end points of each stage. Often, they are designed to be easy for the biologist to identify, rather than reflecting biological function or capturing any actual discreteness. After all, ontogenesis is a continuous process. And worse: it consists of many sub-processes that interact, but do not necessarily coordinate their dynamics very precisely. There is a lot of leeway for “sloppiness” in these interactions, which is what makes ontogenesis such a robust yet versatile process. Plus: organisms need to be adaptable, and one of the biological functions of ontogenesis is to generate the right kind of variability for evolution. Here, too much precision can actually be a bad thing.
On top of all this, ontogenesis crucially depends on the specific trajectories and rates of change that guide and constrain the processes that constitute it. Timing is generally of the essence. It influences the order in which ontogenetic events occur. But different cells can arrive at their differentiated states by very different paths. Set theory, with its rigidity and focus on objects and outcomes, makes it difficult to capture these important processual aspects of ontogenesis. A mathematical relation or mapping, defined set-theoretically, cannot easily tell us how or how fast we get from one point to another.
One last problem is that the dynamics of ontogenetic processes are constantly modified by an ever-changing organismic and environmental context. Again, set theory, with its rigidly fixed and intrinsically defined elements, seems ill-suited to capture such context-dependent flexibility.
Yet, it is possible (and sometimes even convenient) to build mathematical structures with set theory that do allow us to study such process- and context-dependent phenomena and events. We’ll show you a few examples in the next two sections (and in this appendix). Still, the mathematical formalism does have its limits. Wouldn’t it be great, then, to have an equally powerful and fundamental mathematical framework that is explicitly focussed on processes and their relations? The good news is: we do have something like that. It is called category theory. We will motivate and outline it in the last two sections of this chapter. But first, let us talk some more about dynamics (or change) and how it can be formalised using set theory.
Dynamics
The universe of set theory mirrors the human doctrine of containment, a substantivist view of the world: everything consists of objects made of smaller objects, until you hit a level of fundamental particles. In set theory, of course, there is no such fundamental level, but there is the axiom of regularity, which forbids infinitely descending chains of sets within sets. This amounts to a limited number of hierarchical levels in both cases. Both in the physical and the symbolic world, the buck has to stop somewhere.
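For the mathematically inclined, the axiom in question has a compact first-order statement (this is its standard textbook form): every non-empty set has a member that shares no element with it, which rules out infinite descending membership chains, and in particular any set containing itself.

```latex
% Axiom of regularity (foundation):
\forall x \, \bigl( x \neq \varnothing \;\rightarrow\;
  \exists y \, ( y \in x \,\wedge\, y \cap x = \varnothing ) \bigr)
```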
We’ve shown earlier why this kind of view is so widespread and has so much intuitive appeal, despite the fact that it does not at all fit with the way human beings actually interact with the world. The fundamental units of our experience are perceived patterns, and only when these patterns are clearly delimited from their surroundings and remain stable at the time scale we are observing them does it make sense to treat them as fixed objects. As we have seen, many important patterns in our experience do not conform to these restrictive conditions. Processes are the rule, objects the exception. In our human world of experience, Heraclitus’ πάντα ῥεῖ (“everything flows”) undoubtedly applies.
For this reason, we have argued in the previous section that it would be useful to have a formalism that explicitly reflects these priorities: a formalism that takes processes and relations as primary, and objects as secondary stable manifestations of those. But before we can introduce such a formalism, we have some homework to do. Our very first task must be to define what it means for something to be a process. And also: what exactly is change? It is surprisingly difficult to specify this pair of related concepts precisely.
Here is an example definition, from Rescher’s book “Process Metaphysics:” “a process is a coordinated group of changes in the complexion of reality, an organized family of occurrences that are systematically linked to one another either causally or functionally…”
Quite the mouthful! And not at all trivial to parse. The core of the definition is to define process as “a coordinated group of changes.” Therefore, we need to think more about what we mean by “change.” We’ll do that shortly. For now, let’s focus on the rest of the definition. We provided an interpretation of the “complexion of reality” in chapter 9: reality is that part of the world that affects us, but lies beyond our control. So far, so good. But what is “an organized family of occurrences?” Well, “occurrences” are what we called “events” in the previous chapter, which can be interpreted as interactions between natural systems that have some kind of impact on our selves, i.e., they are observable or measurable. These events must be linked functionally or causally in a process. This is what it means for them to be organized or coordinated, which is precisely how we defined the patterns captured by natural systems in the first place. And don’t worry: we’ll go much deeper into the nature of these linkages later on.
But what, then, do we mean by change being fundamental to the concept of a process? There seems to be an utter lack of tangibility here. In set theory, change is defined as the difference between an element in the domain and the corresponding element(s) in the range of a relation or mapping. But this is a secondary property, which presumes the existence of these elements (i.e., the start and end objects of a process) to begin with. What could it possibly mean, then, that change is fundamental?
Well, we’ve encountered this problem before, when we talked about continuous physical theories vs. the discrete nature of computation. In particular, we introduced the idea of an infinitesimal, an infinitely small interval that is still not quite zero. It defines an instantaneous rate of change. Could this be it? Change is defined by the rate at which something becomes different. This sounds sensible — intuitively clear. But there’s a problem: as explained in chapter 7, instantaneous rates are also derived, abstracted from the process of shrinking the interval between two time points towards zero. These rates only exist as mathematical limits in the idealised realm of the infinitesimally small! Since they are not directly accessible to our experience, we cannot really use them as the basis for a naturalistic definition of a process either.
It seems like we really hit a wall, trying to define what change is via differences or rates. One possible conclusion to draw from our failed attempts would be to go along with Aristotle (once more) and simply declare change as fundamental. Time, he averred, is nothing but the irreducible experience or measure of change. The more time passes, the more things appear to become different. This, and the fact that we have to derive rates of change from extended time intervals, indicates that the essence of time is duration. This idea lies at the heart of a famous debate that Albert Einstein had with philosopher Henri Bergson in 1922. Einstein is widely believed to have won, but the matter was never really settled.
The debate was, roughly, about whether the lived experience of duration (Bergson) or measured clock time (Einstein) is more fundamental for our understanding of the world. While we probably all agree that measuring time with a reliable clock is more precise than relying on our lived experience of the flow of time, this is not really the point that Bergson was trying to make. What he was claiming, instead, was that experienced duration is a prerequisite for the existence of the clock as a measuring device. What we now call “physical time” is derived (through abstraction) from the duration of change we directly experience. The basic insight here is: change, whether continuous or discrete, takes time.
The experience of duration comes naturally to us but seems difficult to grasp precisely. It is definitely not easy to put into words. This is reflected by the fact that most of our vocabulary concerning change relies on spatial metaphors: time flows downward (like a river), and we move or travel forward through time. This can lead to confusion, because one important aspect that sets time apart from space is that it is irreversible: as far as we know, we can only traverse it in one direction. Moreover, we cannot stop on our journey through time either: change is relentless. And everything changes. Even a proton. Change, whatever it may be, seems to be truly universal. It is all around us. All of the time.
Before you get frustrated with us, and we start going around in definitional circles, let us switch gears: maybe we can gain further insights into what a process is, not by endlessly ruminating about the fundamental nature of change, but by asking how we can identify and characterise different features of the ambience in practice. Evidently, this is particularly easy to achieve for solid objects, which is why the doctrine of containment is so ingrained in us: an object — say, a rock, a toy, a hammer, or an apple — is something that is discrete and bounded, and relatively unchanging. It clearly stands out from the rest of the ambience. This is what makes it possible for us to formally describe an object as a finite set.
In contrast, processes are defined primarily by their interactions and relations: they are continuous with the ambience, being in constant exchange with it. This means they are not so easily delimited. And, lacking any clear boundaries, they may even be considered infinite entities in some peculiar formal sense. This will become central to our argument below.
Since it is generally impossible to demarcate exact spatial and temporal boundaries for a process, we need to consider its time scale, continuity, and cohesion to characterise it instead. Processes of the same kind share a similar structure, constituted by the rules that govern how one occurrence or event follows from another, and how this leads to organised change over time. We’ve encountered this type of regularity as the mathematical concept of naturality in the last chapter (see this appendix for a mathematical definition). And it is not called natural without reason: it is the most common and salient feature of our flow of experience. As Rosen and Wimsatt both point out: no living being could exist (or evolve) in a world lacking this kind of fluid and transformational order.
It may be harder to pick out a fast-changing process from the ambience, compared to a stable object, but it is still eminently feasible. In fact, as we explained before, we do this all the time. And we are able to do so because the structural criteria listed above render individual processes identifiable, reproducible, and classifiable, even if our characterisations of processes may be less general, and less exactly delimited, than those concerned with objects. If we know a thing or two about the weather, we can recognise a cold front when we see one, and we can track it along its trajectory. We can do this because of its characteristic form — the patterns and relations it exhibits — even though no two fronts look or behave the same, even though each front interacts differently with the atmosphere around it and the landscape below, and even though we cannot tell precisely where one of those fronts begins and where it ends.
We’ll get to the wider philosophical implications of all this in the next chapter. For now, we will focus instead on describing the kind of regular change we want to capture with our process-based formalism.
For starters, it is important to acknowledge that change happens simultaneously across many different time scales. This is reflected by the separation of variables and parameters in a formal model. Typically, variables change their values within the time interval we are focussing on, while parameters stay fixed, changing, if at all, at larger time scales than the one our model is considering.
Take, for example, a model of a gene regulatory network in a living cell. In this case, the variables represent the concentrations of gene products, which may be transcription factor proteins that bind to the regulatory sequences of other genes in the genome of the cell to activate or repress their expression. These concentrations change over time, typically on a relatively short time scale, ranging from minutes to days. In contrast, model parameters characterise the structure of the network: how strongly two genes interact, and whether each interaction is activating or repressing. Their values change only at a much slower rate, by heritable mutation, across generations, during evolution. This may take millions of years.
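A minimal sketch (with invented genes, weights, and rates) makes the separation tangible: the concentrations are variables, updated step by step inside the time window, while the interaction weights sit outside the update rule as fixed parameters.

```python
import math

W = [[0.0, -2.5],     # parameters: gene 1 represses gene 0
     [3.0,  0.0]]     #             gene 0 activates gene 1
DECAY, DT = 0.1, 1.0  # parameters: degradation rate; minutes per step

def step(conc):
    """One time step per gene: sigmoidal regulatory drive minus decay."""
    new = []
    for i, c in enumerate(conc):
        drive = sum(W[i][j] * conc[j] for j in range(len(conc)))
        new.append(c + DT * (1.0 / (1.0 + math.exp(-drive)) - DECAY * c))
    return new

state = [0.2, 0.1]    # variables: concentrations of the two gene products
for _ in range(100):  # the fast time scale: minutes to days
    state = step(state)
print(state)          # W never changed: rewiring it would be evolution
```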
In the terminology we’ve introduced before (see also this appendix), we would say that the state of the system (defined by its variables) changes at a fast time scale (within the time window we are considering) while the structure of the system (affected by its parameters) changes much more slowly (beyond our current time window). And yet, time scale is not the only difference here: both kinds of change are qualitatively different from each other. While changes in state happen according to the rules that are determined by the structure of the system, when the structure itself changes, so do the very rules by which the system’s state changes. One kind of change happens within the rules of the game, the other plays with the rules: Carse’s finite games versus infinite play, once again. Clearly, there is some kind of hierarchy of changes here. But, as we shall see, this hierarchy does not have to be rigid or static.
We will talk about structural change, how it relates to dynamic organisation, and how it differs from state change, in the next section. But before we go there, let us look at one additional conceptual distinction that is important for a deeper understanding of change and process.
We said above that there is continuity between a process and its environment. This is, yet again, one of those tricky and overloaded concepts. What we meant, initially, is that the process is open: it is constantly exchanging material, energy, and/or information across its boundaries. But then, a process must also exhibit a different kind of continuity to be identifiable. There must be some recognisable material or causal connection among the patterns that make up a process at different points in time. This will be important later. Finally, change itself may be continuous (as opposed to discrete), which constitutes yet another meaning of the term. Discrete systems progress by jumps: an extended time interval lies between any two consecutive states. In contrast, continuous systems change smoothly, without any such discrete jumps, i.e., without any gaps or discontinuities between successive states. We’ve encountered this kind of change before, with Zeno’s arrow paradox, when comparing physics and computation.
Both discrete and continuous change come with their own difficulties. The question we should ask about discrete change is “what happens in between consecutive system states?” Surely, time doesn’t stop or cease to exist in those intervals? Therefore, it seems like discrete time is just another abstraction based on the fundamental concept of duration. Bergson may have been right, after all! If we need causal or material continuity for a process to be identifiable, there can be no gaps between states.
But, as Zeno nicely illustrated, continuous change creates other complications. In particular, modelling continuous time exactly would require an infinite amount of computation, because there are infinitely many mappings to evaluate between the states associated with the time points of any extended interval. In addition, we already established that infinitesimal intervals (intervals that lack any extension) are just another kind of abstraction. How infuriating: no matter how we approach the nitty-gritty of change, it escapes our grasp. We approximate it in various abstracted ways, but never quite manage to put our finger on it. But then, it is hardly surprising that change, by its nature, should be such a slippery concept.
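In modelling practice, the compromise looks like this (a sketch, with a toy equation of our own choosing): we replace the continuum by finitely many discrete steps and shrink them, approximating continuous change ever more closely without ever reaching it.

```python
import math

def euler(x0, t_end, n_steps):
    """Propagate dx/dt = -x from time 0 to t_end in n_steps discrete jumps."""
    x, dt = x0, t_end / n_steps
    for _ in range(n_steps):
        x += dt * (-x)    # a finite difference standing in for the infinitesimal
    return x

exact = math.exp(-1.0)    # the idealised continuous solution at t = 1
for n in (10, 100, 1000, 10000):
    print(n, abs(euler(1.0, 1.0, n) - exact))  # the error shrinks, but never vanishes
```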
Despite this confusion, it is relatively straightforward to codify this entire bundle of ideas into the formalism of sets. For this type of formalisation, it doesn’t really matter if change is discrete or continuous. Remember that a (“timeless”) general system can be defined by its system-response mapping X × S → Y, which describes the process that happens between input (X) and output (Y), given the system’s state S at a specific moment in time (see also the appendix). In order to define a general time system, we now need to add a second mapping, called the state transition, which describes the process of propagating the system’s internal state S from some time t₁ to a later time t₂. We write: S(t₁) → S(t₂).
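To make this pair of mappings concrete, here is a minimal toy encoding (all names, rules, and numbers are our own; we also let the transition consult the current input, which the general definition allows but does not require):

```python
def response(x, s):
    """System-response mapping X x S -> Y: output from input and state."""
    return x + s

def transition(s, t1, t2, x):
    """State transition S(t1) -> S(t2): propagate the internal state
    across the interval (a toy linear rule, invented for illustration)."""
    return s + (t2 - t1) * (0.5 * x - 0.1 * s)

x, s = 2.0, 1.0                     # constant input, initial state
for t in range(5):                  # discrete clock ticks
    y = response(x, s)              # observe the output at time t
    s = transition(s, t, t + 1, x)  # then advance the state to t + 1
print(y, s)
```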
The universe of set theory mirrors that of the human doctrine of containment, a substantivist view of the world: everything consists of objects made of smaller objects until you hit a level of fundamental particles. In set theory, of course, there is no such fundamental level, but there is the axiom of regularity, which forbids infinite chains of sets within sets. This amounts to a limited number of hierarchical levels in both cases. Both in the physical and the symbolic world, the buck has to stop somewhere.
We’ve shown earlier why this kind of view is so widespread and has so much intuitive appeal, despite the fact that it does not at all fit with the way human beings actually interact with the world. The fundamental units of our experience are perceived patterns, and only when these patterns are clearly delimited from their surroundings and remain stable at the time scale we are observing them does it make sense to treat them as fixed objects. As we have seen, many important patterns in our experience do not conform to these restrictive conditions. Processes are the rule, objects the exception. In our human world of experience, Heraclitus’ πάντα ῥεῖ (“everything flows”) undoubtedly applies.
For this reason, we have argued in the previous section that it would be useful to have a formalism that explicitly reflects these priorities: a formalism that takes processes and relations as primary, and objects as secondary stable manifestations of those. But before we can introduce such a formalism, we have some homework to do. Our very first task must be to define what it means for something to be a process. And also: what exactly is change? It is surprisingly difficult to specify this pair of related concepts precisely.
Here is an example definition, from Rescher’s book “Process Metaphysics:” “a process is a coordinated group of changes in the complexion of reality, an organized family of occurrences that are systematically linked to one another either causally or functionally…”
Quite the mouthful! And not at all trivial to parse. The core of the definition is to define process as “a coordinated group of changes.” Therefore, we need to think more about what we mean by “change.” We’ll do that shortly. For now, let’s focus on the rest of the definition. We provide an interpretation of the “complexion of reality” in chapter 9: reality is that part of the world that affects us, but lies beyond our control. So far, so good. But what is “an organized family of occurrences?” Well, “occurrences” are what we called “events” in the previous chapter, which can be interpreted as interactions between natural systems that have some kind of impact on our selves, i.e., they are observable or measurable. These events must be linked functionally or causally in a process. This is what it means for them to be organized or coordinated, which is precisely how we defined the patterns captured by natural systems in the first place. And don’t worry: we’ll go a lot more into the nature of these linkages later on.
But what then, do we mean by change being fundamental for the concept of a process? There seems to be an utter lack of tangibility here. In set theory, change is defined as the difference between an element in the domain, and the corresponding element(s) in the range of a relation or mapping. But this is a secondary property, which presumes the existence of these elements (i.e., start and end objects of a process) to begin with. What could it possibly mean, then, that change is fundamental?
Well, we’ve encountered this problem before, when we talked about continuous physical theories vs. the discrete nature of computation. In particular, we introduced the idea of an infinitesimal, an infinitely small interval that is still not quite zero. It defines an instantaneous rate of change. Could this be it? Change is defined by the rate at which something becomes different. This sounds sensible — intuitively clear. But there’s a problem: as explained in chapter 7, instantaneous rates are also derived, abstracted from the process of shrinking the interval between two time points to infinity. These rates only exist as mathematical limits in the idealised realm of the infinitesimally small! Since they are not directly accessible to our experience, we cannot really use them as the basis for a naturalistic definition of a process either.
It seems like we really hit a wall, trying to define what change is via differences or rates. One possible conclusion to draw from our failed attempts would be to go along with Aristotle (once more) and simply declare change as fundamental. Time, he averted, is nothing but the irreducible experience or measure of change. The more time passes, the more things appear to become different. This, and the fact that we have to derive rates of change from extended time intervals, indicates that the essence of time is duration. This idea lies at the heart of a famous debate that Albert Einstein had with philosopher Henri Bergson in 1922. Einstein is widely believed to have won, but the matter was never really settled.
The debate was, roughly, about whether the lived experience of duration (Bergson) or measured clock time (Einstein) is more fundamental for our understanding of the world. While we probably all agree that measuring time with a reliable clock is more precise to our lived experience of the flow of time, this is not really the point that Bergson was trying to make. What he was claiming, instead, was that experienced duration is a prerequisite for the existence of the clock as a measuring device. What we now call “physical time” is derived (through abstraction) from the duration of change we directly experience. The basic insight here is: change, whether continuous or discrete, takes time.
The experience of duration comes natural to us but seems difficult to grasp precisely. It is definitely not easy to put into words. This is reflected by the fact that most of our vocabulary concerning change relies on spatial metaphors: time flows downward (like a river), and we move or travel forward through time. This can lead to confusion, because one important aspect that sets time apart from space is that it is irreversible: as far as we know, we can only traverse it in one direction. Moreover, we cannot stop on our journey through time either: change is relentless. And everything changes. Even a proton. Change, whatever it may be, seems to be truly universal. It is all around us. All of the time.
Before you get frustrated with us, and we start going around in definitional circles, let us switch gears: maybe we can gain further insights into what a process is, not by endlessly ruminating about the fundamental nature of change, but by asking how we can identify and characterise different features of the ambience in practice. Evidently, this is particularly easy to achieve for solid objects, which is why the doctrine of containment is so ingrained in us: an object — say, a rock, a toy, a hammer, or an apple — is something that is discrete and bounded, and relatively unchanging. It clearly stands out from the rest of the ambience. This is what makes it possible for us to formally describe an object as a finite set.
In contrast, processes are defined primarily by their interactions and relations: they are continuous with the ambience, being in constant exchange with it. This means they are not so easily delimited. And, lacking any clear boundaries, they may even be considered infinite entities in some peculiar formal sense. This will become central to our argument below.
Since it is generally impossible to demarcate exact spatial and temporal boundaries for a process, we need to consider its time scale, continuity, and cohesion to characterise it instead. Processes of the same kind share a similar structure, constituted by the rules that govern how one occurrence or event follows from another, and how this leads to organised change over time. We’ve encountered this type of regularity as the mathematical concept of naturality in the last chapter (see this appendix for a mathematical definition). And it is not called natural without reason: it is the most common and salient feature of our flow of experience. As Rosen and Wimsatt both point out: no living being could exist (or evolve) in a world lacking this kind of fluid and transformational order.
It may be harder to pick out a fast-changing process from the ambience, compared to a stable object, but it is still eminently feasible. In fact, as we explained before, we do this all the time. And we are able to do so because the structural criteria listed above render individual processes identifiable, reproducible, and classifiable, even if our characterisations of processes may be less general, and less exactly delimited, than those concerned with objects. If we know a thing or two about the weather, we can recognise a cold front when we see one, and we can track it along its trajectory. We can do this because of its characteristic form — the patterns and relations it exhibits — even though no two fronts look or behave the same, even though each front interacts differently with the atmosphere around it and the landscape below, and even though we cannot tell precisely where one of those fronts begins and where it ends.
We’ll get to the wider philosophical implications of all this in the next chapter. For now, we will focus instead on describing the kind of regular change we want to capture with our process-based formalism.
For starters, it is important to acknowledge that change happens simultaneously across many different time scales. This is reflected by the separation of variables and parameters in a formal model. Typically, variables change their values within the time interval we are focussing on, while parameters stay fixed, changing, if at all, at larger time scales than the one our model is considering.
Take, for example, a model of a gene regulatory network in a living cell. In this case, the variables represent the concentration of gene products, which may be transcription factor proteins that bind to the regulatory sequences of other genes in the genome of the cell to activate or repress their expression. These concentrations change over time, typically on a relatively short time scale, between minutes and days. In contrast, model parameters characterise the structure of the network: how strongly two genes interact, and whether the interaction they describe is activating or repressing. Their values only change at a much slower rate, by heritable mutation, across generations, during evolution. This may take millions of years.
In the terminology we’ve introduced before (see also this appendix), we would say that the state of the system (defined by its variables) changes at a fast time scale (within the time window we are considering) while the structure of the system (affected by its parameters) changes much more slowly (beyond our current time window). And yet, time scale is not the only difference here: both kinds of change are qualitatively different from each other. While changes in state happen according to the rules that are determined by the structure of the system, when the structure itself changes, so do the very rules by which the system’s state changes. One kind of change happens within the rules of the game, the other plays with the rules: Carse’s finite games versus infinite play, once again. Clearly, there is some kind of hierarchy of changes here. But, as we shall see, this hierarchy does not have to be rigid or static.
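To make this separation of time scales concrete, here is a minimal sketch of such a gene network model, written in Python for illustration. Everything in it is a made-up toy (two genes, invented interaction weights and decay rates), not a model of any real regulatory system: the point is only that the state variables x change at every step, while the structural parameters w and d stay fixed within the model’s time window.

```python
import numpy as np

# Toy two-gene network (hypothetical): gene 1 activates gene 2,
# gene 2 represses gene 1. The concentrations x are the fast-changing
# variables (the state); the weights w and decay rates d are the
# parameters (the structure), held fixed on this time scale.
w = np.array([[0.0, -2.0],    # row 1: inputs to gene 1 (repressed by gene 2)
              [3.0,  0.0]])   # row 2: inputs to gene 2 (activated by gene 1)
d = np.array([1.0, 1.0])      # first-order decay rates

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def step(x, dt=0.01):
    """One Euler step: the state changes, the structure (w, d) does not."""
    return x + dt * (sigmoid(w @ x) - d * x)

x = np.array([0.1, 0.1])      # initial concentrations
for _ in range(1000):         # the fast time scale, within our time window
    x = step(x)
print(x)                      # the state has moved; w and d have not
```

A change in structure, by contrast, would mean editing w itself, which in the natural system only happens by mutation, over evolutionary time.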
We will talk about structural change, how it relates to dynamic organisation, and how it differs from state change, in the next section. But before we go there, let us look at one additional conceptual distinction that is important for a deeper understanding of change and process.
We said above that there is continuity between a process and its environment. This is, yet again, one of those tricky and overloaded concepts. What we meant, initially, is that the process is open: it is constantly exchanging material, energy, and/or information across its boundaries. But then, a process must also exhibit a different kind of continuity to be identifiable. There must be some recognisable material or causal connection among the patterns that make up a process at different points in time. This will be important later. Finally, change itself may be continuous (as opposed to discrete), which constitutes yet another meaning of the term. Discrete systems progress by jumps: an extended time interval lies between any two consecutive states. In contrast, continuous systems change smoothly, without any such discrete jumps, i.e., without any gaps or discontinuities between successive states. We’ve encountered this kind of change before, with Zeno’s arrow paradox, when comparing physics and computation.
Both discrete and continuous change come with their own difficulties. The question we should ask about discrete change is “what happens in between consecutive system states?” Clearly, time doesn’t stop or cease to exist in those intervals. Therefore, it seems like discrete time is just another abstraction based on the fundamental concept of duration. Bergson may have been right, after all! If we need causal or material continuity for a process to be identifiable, there can be no gaps between states.
But, as Zeno nicely illustrated, continuous change creates other complications. In particular, modelling continuous time requires an infinite amount of computation, because there are an infinite number of mappings to be evaluated between the states that are the elements of any extended time interval. In addition, we already established that infinitesimal-sized intervals (intervals that lack any extension) are just another kind of abstraction. How infuriating: no matter how we approach the nitty-gritty of change, it escapes our grasp. We approximate it in various abstracted ways, but never quite manage to put our finger on it. But then, it is hardly surprising that change, by its nature, should be such a slippery concept.
Despite this confusion, it is relatively straightforward to codify this entire bundle of ideas into the formalism of sets. For this type of formalisation, it doesn’t really matter if change is discrete or continuous. Remember that a (“timeless”) general system can be defined by its system-response mapping X × S → Y, which describes the process that happens between input (X) and output (Y), given the system’s state S at a specific moment in time (see also the appendix). In order to define a general time system, we now need to add a second mapping, called the state transition, which describes the process of propagating the system’s internal state S from some time t₁ to a later time t₂. We write:

S(t₁) → S(t₂)
It should be immediately obvious that this map is in some sense orthogonal to the system response:

X(t) × S(t) → Y(t)
Note that not only the state, but also the input and output of the system are now dependent on time.
We can invert the downward flow of time in this formal scheme and illustrate such a simple time system with the metaphor of a tower of bricks being built:
Here, the inputs are taken from a heap of individual bricks X that are stacked to produce a particular sequence of states s in state space S. Over each time interval (no matter how short it is), another brick is added to the tower, resulting in a linear structure (the output of the system) as shown in the figure.
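As a toy illustration (our own, in Python), the brick tower can be spelled out in a few lines: the state is the tower built so far, and the state transition adds one brick per time interval. Here the output simply mirrors the state, so the system response is left implicit.

```python
def transition(state, brick):
    """State transition: propagates the state from t₁ to t₂ by adding one brick."""
    return state + (brick,)

state = ()                                   # the empty tower at the start
heap = ("red", "grey", "red", "grey")        # inputs X drawn from the heap of bricks
for t, brick in enumerate(heap):
    state = transition(state, brick)         # one brick per time interval
    print(t, state)                          # the output: a growing linear structure
```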
All of this is an explicit acknowledgment that change in a natural system necessarily occurs over some duration, no matter whether we treat it as continuous or discrete. It happens across an interval between the times t₁ and t₂. Again: set theory fundamentally represents change as difference between an initial state and an outcome (another brick in the tower), no matter if the two occur far apart or infinitesimally close to each other in time. There is no representation of what happens between states.
In addition, this kind of formalisation tells us that a time system can only be realised — i.e., decoded to be congruent with a natural system that can actually exist in the ambience — if the state-transition renders the system responses at distinct time points consistent with each other. One has to somehow lead to the next: each brick put on top of the existing tower of bricks. Once more, there must be a natural regularity to a process for it to be realisable, and we can make this idea mathematically rigorous, as we explain in these two appendices. So there, at last, is our grip on change! And another example of how formalisation helps us properly understand a difficult concept.
In this case, a purely intuitive understanding is so hard to come by because the consistency required of a time system is extremely fluid and flexible. It can be subtle, or complex, and can come in many shapes and forms. In principle, it need not even involve any causal connection between the states, but can be completely random (what mathematicians call stochastic), as long as there is some material continuity to it. But this is not our focus here. We’ll come back to the role of stochasticity in dynamics later (see also this appendix).
More to the point: time systems with either fixed or time-variable structures can be consistent. In the latter case, the set of system states S may differ (even quite radically) between any two time points. Like the structure of the system itself, this set is now time-dependent, which is what we acknowledge by writing S(t). Not only which state it is actually in, but also which states are accessible to such a time-variable system, depends on when (and under what circumstances) we examine it. And the resulting totality of all sets of states, at any given time, need not be a set itself. Just like the impossible set of all sets, it may contain an indefinite number of elements. It may be undefinable. We cannot delimit such a system cleanly as a specific subset of the ambience. We’ll return to this important problem in the next section.
Mathematically, it is much easier to deal with systems whose structure remains fixed (or varies only within well-defined and predictable limits). This, of course, was the whole point of Newton’s trick, his universal mathematical oracle: by separating the rules of change (or laws of motion) from the state of the system, Newton rendered both of them not only independent of each other, but also independent of time. The system can now be described by the same set of states at every point of its history (even though the states it actually visits still change). It no longer matters when you look at it. This kind of system is well-defined because we can delimit a specific set of possible states across all time points. This set may be infinite, but it is always a set with a definite number of elements. Practically, this means that if you measure the system’s state at any time, you can predict its future (and reconstruct its past) using only the laws of motion. Newton’s trick allows you to become the Laplacian demon! At least a little bit … locally.
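Here is a small sketch of Newton’s trick in action (our own illustration): a harmonic oscillator whose law of motion is an exact rotation of a fixed phase space. Because the rule is state-independent and invertible, a single measured state lets us run the system forwards into the future and backwards into the past.

```python
import math

def law_of_motion(q, p, dt=0.1, omega=1.0):
    """Exact time evolution of a harmonic oscillator: a rotation in phase space.
    The rule never changes, and the set of possible states (q, p) never changes."""
    c, s = math.cos(omega * dt), math.sin(omega * dt)
    return c * q + s * p, -s * q + c * p

q, p = 1.0, 0.0                        # the state we measured at some time t
for _ in range(100):                   # predict the future...
    q, p = law_of_motion(q, p)
for _ in range(100):                   # ...then reconstruct the past
    q, p = law_of_motion(q, p, dt=-0.1)
print(round(q, 6), round(p, 6))        # recovers (1.0, 0.0), up to rounding:
                                       # a little Laplacian demon, locally
```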
In terms of general systems theory, all of this means that the state-transition mapping of such a Newtonian system maps its unique set of states to itself:

S → S
(note the deliberate absence of the times t here, because S remains fixed across all time points). Or, keeping track of the elements (states) in S that are mapped to each other over time, we write

s(t₁) → s(t₂)
(where an element of state space S at time t₁ is mapped to an element of the same S at t₂). The brick tower grows and grows but it always remains a representation of a linear sequence of states within S.
The fact that the set of states remains fixed across time immediately implies two additional constraints: first, the state-transition must map each state in S at t₁ to exactly one state (which may, in fact, be itself) at t₂. And, second, it must do this for all the states in S. If either of these two conditions is violated, we “lose” states over time and the state space no longer remains fixed. In other words, the state transition must be a permutation of system states, and an isomorphism (a one-to-one correspondence, see chapter 11). Mathematicians call such an isomorphism that maps a set onto itself an automorphism. The existence of this kind of mapping imposes a structure on state space that is very convenient for the modeller. For instance, it implies that the trajectories of the system (if there are multiple possible ones) do not cross. We will come back to this property (and others) of state space later (see also the appendix).
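For a finite state set, the permutation condition is easy to check in code. This is our own toy example with four hypothetical states; the second transition below loses a state, so its state space would not remain fixed, and two trajectories merge.

```python
S = {"a", "b", "c", "d"}

good = {"a": "b", "b": "c", "c": "d", "d": "a"}  # a permutation (automorphism) of S
bad  = {"a": "b", "b": "c", "c": "b", "d": "a"}  # "d" is never an image: it gets lost,
                                                 # and "a" and "c" both land on "b"

def is_automorphism(f, S):
    """Each state has exactly one image, and every state occurs as an image."""
    return set(f) == S and set(f.values()) == S

print(is_automorphism(good, S))  # True: the state space stays fixed over time
print(is_automorphism(bad, S))   # False: states are lost, trajectories merge
```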
What is important right now is the following realisation: in a Newtonian system with its fixed state space, the system-response and the state-transition mapping are indeed completely independent of each other, while this is generally not the case in systems with a time-variable structure. This allows for an unusually compact and powerful mathematical formulation. We call such a formal system, whose behaviour (at all times) is completely captured by these two independent mappings, a dynamical system.
There can be discrete or continuous dynamical systems, depending on the nature of the underlying change. Also keep in mind that we are talking about a specific kind of formal system here, not just any natural system with dynamics. We discuss different formalisms to implement dynamical systems in the appendix. The details don’t matter much for our current line of argument, but it may be worth mentioning that what is called dynamical systems theory in a strict sense (a widely used and well-developed modelling framework based on differential equations) is narrower still, requiring even more mathematical structure. In particular, the state space of the system (traditionally called phase space in physics) must be a topological space that has well-defined neighbourhoods and is smooth, i.e., both continuous and differentiable.
What is important here is that very few natural systems actually behave in a way that conforms to the paradigm of a dynamical system we have just described. We’ve said it before: almost all physical systems are in constant exchange with their surroundings, and have their structures altered by these interactions in rather unpredictable ways. Newton-style dynamical systems, clearly, are based on one of the starkest idealisations ever made in the history of science. They are a very powerful, but also very abstract and specialised map of the territory. We still employ their strict separation of state and structure today, across disciplines, in quantum mechanics and the theory of relativity, but also when modelling biological and social systems using a dynamical systems approach or related formalisms imported from physics. This is Smolin’s “physics in a box.”
The only problem is: for many natural systems, the box matters much more than its contents! Yet, we almost always put it in the background. This is exactly why we are so often confused about map and territory. It is high time we start thinking about the box.
Constraints
Let us recap: in the last section, we defined a general class of formal time systems, characterised set-theoretically by two mappings: system response (instantaneous, input to output) and state transition (propagating the system through time). These mappings must be consistent with each other to be realised as (or to decode into) a natural system. As long as this condition is met, we should be able to model any natural system by a congruent time system. At least in principle. And because all natural systems must persist through time (long enough for us to discern them), we can say that time systems with their two mappings (and not the more general “timeless” systems introduced in the last chapter) are the minimal model of a natural system in the formal (symbolic) domain.
However, not all time systems can be defined precisely: some cannot be completely formalised (in the sense of Hilbert). Only dynamical systems — the special class of time systems where the system-response is not only consistent with but also independent of the state and its transitions — are formally representable by a subset of the ambience, called the state or phase space, which has a well-defined (though possibly infinite) number of states as its elements.
The class of dynamical systems includes all time systems with a fixed structure (see above). But (and this is a subtle but important nuance), it also contains some time systems whose structure does change over time, albeit in a peculiar manner. In such a forced dynamical system, how the structure changes is also independent of the state of the system, and happens in a rule-based manner. Therefore, structure (even though dynamic) and state are still perfectly separate.
A good example of such a forced dynamical system is an ecosystem that is exposed to the annual change of seasons on earth. Let us assume this ecosystem is situated outside the tropics. Its structure — consisting of food webs, and energetic exchanges, among other relations — will vary significantly depending on the time of the year. But the single parameter that forces these changes, which is solar irradiance (the influx of energy from the sun), changes in a predictably repetitive manner that is completely independent of the intrinsic dynamics of the ecosystem itself. Instead, as we all know, it depends on the tilted orbit of the earth around the sun, and the latitude at which the system finds itself.
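As a sketch of what this separation looks like in a model (our own toy construction, with invented numbers), consider logistic growth whose carrying capacity tracks a seasonal irradiance cycle. The forcing function depends only on time, standing in for orbital geometry, and never on the state of the ecosystem.

```python
import math

def irradiance(t, mean=1.0, amplitude=0.5, period=365.0):
    """Seasonal forcing: depends only on time, never on the ecosystem's state."""
    return mean + amplitude * math.sin(2 * math.pi * t / period)

def step(x, t, r=0.1, dt=1.0):
    """Logistic growth with a time-varying carrying capacity K(t)."""
    K = 10.0 * irradiance(t)               # the structure changes, but lawfully
    return x + dt * r * x * (1.0 - x / K)

x = 1.0
for t in range(730):                       # two "years" of forced dynamics
    x = step(x, float(t))
print(round(x, 3))                         # the state waxes and wanes with the seasons
```

State and structure remain perfectly separate here: we could swap out the ecosystem model without touching the forcing, and vice versa.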
Seasonal change in irradiance imposes a time-variable constraint on the ecosystem, affecting its dynamics in a global and lawlike manner that is not at all reliant on the details of its internal structure, or its interactions with other systems. This stands in contrast to the context-dependent, nonintegrable, or nonholonomic constraints we’ve encountered in chapter 10. Accordingly, we call this new kind of constraint context-independent, integrable, or holonomic. As mentioned before, “holonomic” just means lawlike.
Holonomic constraints can arise in many ways. Seasonal changes in ecology are just one example. Others include the geometry of the space in which a system’s dynamics play out. Take, for example, the earth’s spherical shape, which restricts our horizontal movements on its surface. Yet more holonomic constraints arise from gravitational or electromagnetic potentials that guide the movement of objects exposed to them without being substantially altered by these motions themselves. As an example, consider the gravity well of our solar system, which confines the planets to their regular elliptical orbits. Here, the subtle gravitational effects of the planets themselves play no notable role in the model, because the sun contains an astonishing 99.86% of the total mass of the solar system.
Whether we consider the behaviour of a process to be governed by internal rules of change or holonomic constraints imposed from outside is somewhat arbitrary. It depends on the temporal and spatial scale on which we focus and how we draw the box around the system to be modelled. We’ve explained in the last chapter why this is more of an art (in the sense of being a practiced skill) than an exact science. Your constraint can easily become my rule if the focus changes. In this sense, internal rules and holonomic constraints reflect a type of strict and rigid hierarchy: both are easily dissociable from each other (through a simple separation of scales) and can be modelled as their own dynamical system. Alternatively, when we combine the two scales into one model, we codify these constraints as a regular (lawlike) change in the system’s equations and their parameters, including the initial and boundary conditions which delimit the spatiotemporal extent of a dynamical system (see chapter 6).
All in all, holonomic constraints seem quite innocuous. They can be captured in a classical manner, Newton-style, by a separation of scales, or by including them in the model as we draw a larger box around our system. We may have to incorporate multiple spatial and temporal scales, but they remain cleanly separated. In our ecosystem example, seasonal changes occur across the year, while interactions within the system generally happen at a much faster pace. It is all still pretty neat and tidy.
Where things get messy, however, is with time systems in which the input-output response explicitly depends on the state transition (and vice versa), or on interactions with other systems that are not lawlike. And, once again: let us remind you that many natural systems in the ambience are of this kind — especially (but not only) in the biological and social realms. After all, our reality is fundamentally processual and heavily intertwined. We’ve said it before: in such a messy world, the Newtonian paradigm — “physics in a box” — is an extreme idealisation. It is a map that can be utterly deceiving.
Now, if we are to build useful and robust maps outside the box, we need to use the right kind of time system, a choice of model which depends on the dynamical behaviour of the processes to be modelled. We cannot simply presuppose that dynamical systems models are suitable without proper justification. Unfortunately, this is exactly what we are doing when manoeuvring within a mechanistic view of the world. We uncritically assume that every natural process can be usefully approximated by a dynamical system. It is a foregone conclusion that all natural processes must behave like mechanisms. But the world is not a machine, and physics is not computation!
There are two distinctions that are crucial here: the first concerns whether a natural system behaves in a predictable manner or not. The second concerns the kind of unpredictability that can occur in dynamical systems compared to more general time systems in which the input-output relation does depend on the state. These two kinds of unpredictability are qualitatively different from each other. One is of a computational nature, the other (more radical) one is fundamentally not. Very few people realise this, so let us explain.
At first glance, it may sound a little contradictory, but the fluid natural regularity that characterises all recognisable natural processes (and ensures their temporal continuity) does not imply that a system must behave in a predictable manner. Once more, let us take the weather as an example. There are regular patterns in weather systems, like our familiar cold fronts, or warm fronts, hurricanes, and high-pressure areas, that we can reliably identify and classify based on their intrinsic behaviour and their interactions with other meteorological features. But how exactly they play out, and how they interact with each other, remains hard to forecast beyond a few days’ time, even with our most powerful prediction models. The weather shows regularities, no doubt, but it does not behave in a simple, lawlike manner, like the orbits of the planets. Predicting a solar eclipse is easy; forecasting next week’s weather is not.
We have encountered several possible reasons for this lack of predictability already. It could stem from fundamental (quantum) indeterminacy, but it remains unclear if this actually matters at a macroscopic scale. Another reason is deterministic chaos (combined with our limited ability to measure things precisely). This is, in fact, what makes the weather hard to predict. We’ve outlined both of these reasons in chapter 4. Alternatively, the lack of predictability could result from computational irreducibility, as we explained in chapter 7. This kind of unpredictable behaviour is computationally complex: we have to simulate a system step by step to evaluate its outcome, because there are no shortcuts for prediction. The important thing to notice is that, in all these cases, we may still use Newton-style dynamical systems models to explain how a process works, to reveal its underlying rules or causes, but such models will not be able to predict much of its future behaviour, or will only do so in a roughly approximate or probabilistic manner.
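The chaotic case is easy to demonstrate with the logistic map, a textbook example (not tied to any weather model): the rule is simple, fixed, and perfectly known, yet a measurement error of one part in a billion ruins the forecast within a few dozen steps.

```python
def logistic(x, r=4.0):
    """A fixed, fully known, deterministic rule of change (chaotic for r = 4)."""
    return r * x * (1.0 - x)

x, y = 0.2, 0.2 + 1e-9          # the "true" state vs. an imperfectly measured one
for n in range(1, 61):
    x, y = logistic(x), logistic(y)
    if n % 10 == 0:
        print(n, round(abs(x - y), 6))  # the gap grows roughly exponentially
```

The model still explains how the process works; it just cannot predict it for long.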
All of the above limits of predictability have one thing in common: they arise from the internal dynamics of a process, or the dynamics of time-variable holonomic constraints imposed on it. As we’ve just seen, these still behave according to given (state-independent) rules that can be formally modelled with dynamical systems. In the worst case, we have to introduce a nondeterministic (stochastic) component, or include multiple hierarchically nested subsystems in the model. But, in principle, if we had unlimited measurement precision and computational power, our models could capture the behaviour of such a system up to an arbitrary degree of precision (see chapter 7). This is why dynamical and computational models are such powerful tools for studying physical processes, still predominant across science today.
In contrast, a completely different, much more radical kind of unpredictability may arise in cases where the input-output response explicitly depends on the state of a time system. It causes the very rules that govern system behaviour to change unpredictably from one moment to another. Let us call this type of process an evolutionary process. Its structure depends explicitly on the history of the process, either via its interactions with other systems or an internal dynamic that builds up over time, constructing or constricting the state space of the system at every step of its evolution over time.
We can depict this using our brick tower metaphor by introducing branching points into the process that depend on both the current state of the system and its interactions with the large world around it:
As a consequence of such a constructive, branching evolutionary dynamic, entirely new states become accessible at each step, while some that were reachable up to just a moment ago may no longer be. The model constantly transcends its state space to enter the adjacent possible.
This requires us to reformulate the model as we go along — basically, to constantly rewrite its equations — in order to keep it congruent with the dynamics of the evolutionary process it models. And what’s worse: this transformation of the rules (or equations) of the system itself occurs in a way that is not just unpredictable, but fundamentally not prestatable in terms of any well-defined set of states. As theoretical biologist Stuart Kauffman and his colleagues point out: we cannot even write down the appropriate formal rules for the system’s future evolution in advance of that evolution actually happening. Or as they say, quite pithily: “no law entails evolution.”
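No program can exhibit this radical kind of unpredictability, of course; the point of the argument is precisely that such processes outrun any formalism fixed in advance. But a toy sketch (entirely our own invention) can at least show what “rewriting the rules as we go” looks like: here the rule set grows as a side effect of the system’s history, so later states are produced by rules that did not exist when the run began.

```python
import random

random.seed(1)

rules = [lambda s: s + ("brick",)]           # the single rule we could prestate
state = ()

for t in range(12):
    state = random.choice(rules)(state)      # apply one of the currently existing rules
    if len(state) % 3 == 0:                  # history-dependent rule rewriting:
        n = len(state)                       # a rule not statable at the outset
        rules.append(lambda s, n=n: s + (f"arch-{n}",))

print(state)   # built partly by rules that were created along the way
```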
Kauffman calls this second type of unpredictability radical emergence (and we discuss it as organisational emergence in chapter 14). It differs from the first kind — the one we get from computational irreducibility (which is sometimes called weak emergence) — in that it involves the evolution of nonholonomic constraints. In contrast to the holonomic kind described above, these constraints radically depend on their context, i.e., the presence of other nonholonomic constraints that participate in the inner dynamics and external interactions of the system in which they occur (see the appendix). As we have already mentioned, they restrict the behavioural repertoire of a process compared to a situation where they are not present. Nonholonomic constraints channel and divert dynamics in specific directions, which are not predictable from the underlying rules of change alone. The classic example for this type of dynamic is Darwinian evolution by natural selection, which leads to adaptation by restricting the range of organismic traits that make it to the next generation through differential reproduction (see chapter 10).
In sum, we have two qualitatively different kinds of unpredictability here. First, there is a computational kind that emerges from the internal rule-based dynamics of a process modelled by a dynamical system (a mechanism, as defined in this appendix). And, second, a radical or organisational kind of unpredictability that emerges from changes in the structure, i.e., in the very rules that govern a time system, due to its open-ended interactions with other processes in its surroundings (as in the example of biological evolution). In the former case, our limited ability to predict is rooted in the fundamental indeterminacy of nature, and in our necessarily finite power of measurement and computation. In the latter case, it is rooted in our inability to identify all relevant factors for modelling a natural system across all possible kinds of contexts. The problem is not that we cannot compute this kind of system precisely, it is that we cannot properly formalise it to begin with. It is in this particular sense that evolution is not a mechanism!
But what does this mean concretely? Well, it breaks the box around our dynamical system wide open. Or, to be more precise, it makes it impossible for us to justify drawing a box in the first place. And yet, without the box, we can have no predetermined set of states S that defines the system: there may always be interactions that alter the rules of change in a way that we cannot anticipate, even in principle. We can never be sure. And this is exactly why we are so bad at predicting evolutionary processes (in the biological, cultural, or technological realm). They are not formally prestatable as subsets of the ambience. The future of evolution is radically open! This is yet another way in which the world is not a set.
And it is yet another reason why it is impossible for a limited being to model the entire universe, completely, all at once. Radical uncertainty (of the second kind described above) is a fundamental feature of existence for a limited being in a large evolving world. It arises transjectively, from the way we relate to our large world, and the radically open evolution of our own experience within it (another prime example of the kind of evolutionary process we are talking about). As Rescher points out: “[i]f we could predict discoveries in detail in advance, then we could make them in advance.” This is logically impossible. We see no way by which this limitation could be overcome, not just by us human beings, but by any kind of limited intelligence — terrestrial or alien, natural or artificial — no matter how highly evolved and sophisticated. Life is always full of surprises! Literally. For everyone. Everywhere.
At this point, you may object: how can we be so sure about this? Didn’t we insist that nothing can be known with absolute certainty? Well, we are pretty confident about our argument but, of course, never 100% convinced. We freely admit that we may be wrong. Even if we cannot see it right now, some day, someone smarter than us may find a way out of the logical conundrum we have just posed. But it seems unlikely. Our logic seems solid and sound. And, anyway: it is not our job to come up with a way out. As perspectivists, we are happy with our kind of imperfect philosophy for limited beings. So, if you disagree, be our guest and come up with a counterargument that defeats our perspectival logic. Honestly, we have not seen any convincing arguments like that anywhere so far. But this does not mean they cannot exist.
In the meantime, however, radical unpredictability is here to stay — an essential ingredient for any naturalistic theory of knowledge worthy of the name. Our ability to formalise evolutionary processes is fundamentally limited. But that is not all. There is another, quite subtle (but immensely powerful), reason that we cannot prestate the future states of certain systems. And this reason is the very key to any proper understanding of what life is, and how it differs from non-living matter.
We've shown in chapter 10 already that there are two very different ways in which two processes can reciprocally interact with each other. Both can be captured formally by describing them as some kind of evolution of nonholonomic constraints. But the two differ qualitatively and in essential ways. This distinction is so central to our argument that we will repeat it here.
The first kind of interaction is called cybernetic feedback, where two pre-existing processes constrain each other’s dynamics. As an example, we mentioned the feedback regulation of a metabolic pathway, where the product of the pathway typically inhibits its own production. Or, as Denis Walsh writes in his book “Organisms, Agency, and Evolution,” we could think about Winston Churchill’s famous saying that “we shape our buildings; thereafter they shape us.” Universities, for instance, used to look like monasteries with their cloisters, where scholars could perambulate and philosophise in a protected and inspiring environment. Nowadays, in contrast, they look more like factories, where researchers toil away, under increasingly high pressure, at their various tasks. It is hard to deny that this shapes how we do research which, in turn, affects how we will build universities in the future. Who knows, maybe they will look like prisons someday, if we continue on our current path? Be that as it may, the buildings and the people that work in them evidently exist independently of each other. In particular, how a particular human being grows and is trained to be a scientist, and how a particular university building is designed and constructed, have nothing to do with each other. We could call the constraints that guide such a feedback interaction, channeling a pre-existing dynamic into specific directions, restrictive constraints.
The second kind of interaction is very different. It is involved in the organisational closure that underlies the self-manufacture (or autopoiesis) of living systems (as we have explained in chapter 10). Like feedback, it involves a reciprocal relation between two processes. But unlike feedback, the two processes not only influence but co-construct each other. Following Walsh again, we take Karl Marx’s word for it: “The animal is immediately one with its life activity. It does not distinguish itself from it.” This is important: neither process would even exist without the other’s activities. The two processes not only affect each other’s nonholonomic constraints, but the set of constraints that defines their interaction at one instant in time generates the set of constraints which defines them and their interaction at the next. Even if this process is subject to external influences from the environment, at least some of the dynamics are not just causally altered but constructed by this self-referential interaction, which renders them (at least partly) autonomous of their surroundings. What happens next is no longer a passive response to environmental factors, but depends actively on the internally controlled self-production of new constraints. Its constructive aspect (not its self-referentiality) is what distinguishes this kind of interaction from mere feedback. This is why we call the nonholonomic constraints involved here constructive constraints.
This has important consequences for our ability to model living systems. Basically, the entire rest of the book revolves around these. For now, let us first try to understand what mutual co-construction means in formal terms, by looking at a time system with two subsystems that reciprocally generate each other. As a motivating example, let us consider once more how we explore our large world and how our explorations both generate and, in turn, are generated by what we already know about the world. This seeming paradox lies at the very heart of our theory of knowledge. Philosopher Thomas Kuhn called it the theory-ladenness of our observations: what we perceive (and how we perceive it) fundamentally depends on our existing conceptual framework. In turn, this framework is built upon our experiences (and that of our peers and ancestors). The two are inextricably intermingled: one cannot exist without the other.
This raises the danger that their reciprocal co-construction may be radically ungrounded. And indeed it is, unless outside influences impinge on this process: if the world within our horizon of experience is all there is, there would be no escape, no way to connect to any “outside” reality whatsoever. This is what some philosophical idealists or radical relativists believe. Luckily, as we have seen, there is an external reference. It is that part of our experience over which we have no control. It molds our actions and experiences in a way that can adapt our behaviour and our knowledge to our physical ambience. The apparent paradox, as we have said before, is really an evolving strange loop between agent and arena. And our co-construction of sensation and action becomes adaptive, robustly embedded in the world.
But, we may ask: how does such a co-constructing system ever get started? Evidently, neither action nor knowledge is possible without the other. Both have to get going together, or they won’t budge at all. This is a serious deadlock. Even worse, a living cell consists of hundreds of processes that mutually generate each other, not just two. They all have to come together, in a coordinated manner. How can life ever get started like this? This is what makes the origin of life such a tremendous mystery. It makes no sense to isolate any of the reciprocal relations, or to prioritise some specific process over all the others. What came first: action or knowledge? The two must co-emerge and co-evolve. Similarly, biological organisation is irreducible in the sense that closure is a property of the entire set of processes that constitute an organism at any specific moment in time. It is the emergence of this overarching coherence that we need to explain. But, again: where do we even begin?
All of this suggests that co-constructing systems aren’t mechanisms in the typical Newtonian sense. Just like any other evolutionary process, they are incompletely formalisable, but for quite distinct reasons. Apart from the fact that there is no well-defined starting or end point (which is something that affects all evolutionary processes), there is no longer a clearly defined linear sequence of operations that takes us through cycles of reciprocal construction. The system’s trajectories are ill-defined, as illustrated by the fact that our actions and knowledge remain unhinged unless they are constrained by reality. We’ll get a deeper insight into why this is in a minute. For now, let us just say that Aristotle and Newton came up with their original prohibition of circular causality to avoid exactly this kind of ill-defined situation.
But here we are. Today, the strange loop described above is our best shot at a theory of knowledge for limited human beings, and the closure of constraints is our only viable and plausible hypothesis concerning self-manufacturing (autopoietic) living systems and how they are organised. And even if this were not so, it seems reasonable to assume that the kind of co-constructing interactions we’ve just described can and do exist in the natural world. So we should have a theory of how they behave. This requires us to overcome the mechanistic ban on circularity, to step out of the machine map of the world, in a way which keeps our modelling practice grounded and our formalisms free of contradictions.
For the sake of developing such a theory, let us formalise a simple system of two co-constructing processes in terms of set-theoretical mappings. The core formal structure of the resulting time system consists of two system-response maps that are coupled in a way such that the input X of one is the output Y of the other, and vice versa. This is illustrated in the following diagram
This requires us to reformulate the model as we go along — basically, to constantly rewrite its equations — in order to keep it congruent with the dynamics of the evolutionary process it models. And what’s worse: this transformation of the rules (or equations) of the system itself occurs in a way that is not just unpredictable, but fundamentally not prestatable in terms of any well-defined set of states. As theoretical biologist Stuart Kauffman and his colleagues point out: we cannot even write down the appropriate formal rules for the system’s future evolution in advance of that evolution actually happening. Or as they say, quite pithily: “no law entails evolution.”
Kauffman calls this second type of unpredictability radical emergence (and we discuss it as organisational emergence in chapter 14). It differs from the first kind — the one we get from computational irreducibility (which is sometimes called weak emergence) — in that it involves the evolution of nonholonomic constraints. In contrast to the holonomic kind described above, these constraints radically depend on their context, i.e., the presence of other nonholonomic constraints that participate in the inner dynamics and external interactions of the system in which they occur (see the appendix). As we have already mentioned, they restrict the behavioural repertoire of a process compared to a situation where they are not present. Nonholonomic constraints channel and divert dynamics in specific directions, which are not predictable from the underlying rules of change alone. The classic example of this type of dynamic is Darwinian evolution by natural selection, which leads to adaptation by restricting the range of organismic traits that make it to the next generation through differential reproduction (see chapter 10).
In sum, we have two qualitatively different kinds of unpredictability here. First, there is a computational kind that emerges from the internal rule-based dynamics of a process modelled by a dynamical system (a mechanism, as defined in this appendix). And, second, a radical or organisational kind of unpredictability that emerges from changes in the structure, i.e., in the very rules that govern a time system, due to its open-ended interactions with other processes in its surroundings (as in the example of biological evolution). In the former case, our limited ability to predict is rooted in the fundamental indeterminacy of nature, and in our necessarily finite power of measurement and computation. In the latter case, it is rooted in our inability to identify all relevant factors for modelling a natural system across all possible kinds of contexts. The problem is not that we cannot compute this kind of system precisely; it is that we cannot properly formalise it to begin with. It is in this particular sense that evolution is not a mechanism!
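To see the difference, it helps to make the first, computational kind of unpredictability tangible. Here is a minimal sketch (in Python; the example, Wolfram's elementary cellular automaton rule 30, is our own illustration, not one of the book's case studies). The rule is fully deterministic and trivially formalisable, yet, as far as anyone knows, there is no shortcut to its state at step n other than actually computing all n steps:

```python
# Rule 30: a fully deterministic, fully formalised rule whose long-run
# behaviour is computationally irreducible as far as anyone knows:
# no shortcut to step n exists other than computing all n steps.

def rule30_step(cells: list[int]) -> list[int]:
    """One synchronous update of the whole row; boundaries wrap around."""
    n = len(cells)
    return [
        cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n])
        for i in range(n)
    ]

cells = [0] * 31
cells[15] = 1  # a single live cell in the middle

for _ in range(16):
    print("".join("#" if c else "." for c in cells))
    cells = rule30_step(cells)
```

Here, the unpredictability lives entirely in the running, not in the stating. The radical kind is different in exactly this respect: it is the stating itself that fails.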
But what does this mean concretely? Well, it breaks the box around our dynamical system wide open. Or, to be more precise, it makes it impossible for us to justify drawing a box in the first place. And without the box, we can have no predetermined set of states S that defines the system: there may always be interactions that alter the rules of change in a way that we cannot anticipate, even in principle. We can never be sure. And this is exactly why we are so bad at predicting evolutionary processes (in the biological, cultural, or technological realm). They are not formally prestatable as subsets of the ambience. The future of evolution is radically open! This is yet another way in which the world is not a set.
And it is yet another reason why it is impossible for a limited being to model the entire universe, completely, all at once. Radical uncertainty (of the second kind described above) is a fundamental feature of existence for a limited being in a large evolving world. It arises transjectively, from the way we relate to our large world, and the radically open evolution of our own experience within it (another prime example of the kind of evolutionary process we are talking about). As Rescher points out: “[i]f we could predict discoveries in detail in advance, then we could make them in advance.” This is logically impossible. We see no way by which this limitation could be overcome, not just by us human beings, but by any kind of limited intelligence — terrestrial or alien, natural or artificial — no matter how highly evolved and sophisticated. Life is always full of surprises! Literally. For everyone. Everywhere.
At this point, you may object: how can we be so sure about this? Didn’t we insist that nothing can be known with absolute certainty? Well, we are pretty confident about our argument but, of course, never 100% convinced. We freely admit that we may be wrong. Even if we cannot see it right now, some day, someone smarter than us may find a way out of the logical conundrum we have just posed. But it seems unlikely. Our logic seems solid and sound. And, anyway: it is not our job to come up with a way out. As perspectivists, we are happy with our kind of imperfect philosophy for limited beings. So, if you disagree, be our guest and come up with a counterargument that defeats our perspectival logic. Honestly, we have not seen any convincing arguments like that anywhere so far. But this does not mean they cannot exist.
In the meantime, however, radical unpredictability is here to stay — an essential ingredient for any naturalistic theory of knowledge worthy of the name. Our ability to formalise evolutionary processes is fundamentally limited. But that is not all. There is another, quite subtle (but immensely powerful) reason why we cannot prestate the future states of certain systems. And this reason is the very key to any proper understanding of what life is, and how it differs from non-living matter.
We showed in chapter 10 that there are two very different ways in which two processes can reciprocally interact with each other. Both can be captured formally by describing them as some kind of evolution of nonholonomic constraints. But the two differ qualitatively and in essential ways. This distinction is so central to our argument that we will repeat it here.
The first kind of interaction is called cybernetic feedback, where two pre-existing processes constrain each other’s dynamics. As an example, we mentioned the feedback regulation of a metabolic pathway, where the product of the pathway typically inhibits its own production. Or, as Denis Walsh writes in his book “Organisms, Agency, and Evolution,” we could think about Winston Churchill’s famous saying that “we shape our buildings; thereafter they shape us.” Universities, for instance, used to look like monasteries with their cloisters, where scholars could perambulate and philosophise in a protected and inspiring environment. Nowadays, in contrast, they look more like factories, where researchers toil away, under increasingly high pressure, at their various tasks. It is hard to deny that this shapes how we do research, which, in turn, affects how we will build universities in the future. Who knows, maybe they will look like prisons someday, if we continue on our current path? Be that as it may, the buildings and the people that work in them evidently exist independently of each other. In particular, how a particular human being grows and is trained to be a scientist, and how a particular university building is designed and constructed, have nothing to do with each other. We could call the constraints that guide such a feedback interaction, channelling a pre-existing dynamic into specific directions, restrictive constraints.
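The metabolic example can be made tangible with a toy simulation (a minimal sketch: the rate law and all parameters are invented for illustration, not taken from any real pathway). The end product P inhibits the first step of its own synthesis, restricting a flux that would run perfectly well without it:

```python
# End-product inhibition as a restrictive constraint: the product P
# slows down the first step of its own synthesis. Both reactions exist
# independently of the feedback; the constraint merely channels their
# pre-existing dynamics. Rate law and parameters are invented.

def simulate(inhibition: bool, steps: int = 20000, dt: float = 0.01) -> float:
    S, P = 0.0, 0.0                      # intermediate and end product
    k_in, k1, k2, Ki = 1.0, 1.0, 0.5, 0.2
    for _ in range(steps):
        v_in = k_in / (1.0 + P / Ki) if inhibition else k_in
        S += (v_in - k1 * S) * dt        # intermediate produced and consumed
        P += (k1 * S - k2 * P) * dt      # product made and degraded
    return P

print(f"steady-state P with feedback:    {simulate(True):.2f}")
print(f"steady-state P without feedback: {simulate(False):.2f}")
# The feedback restricts the flux (P settles lower), but it does not
# construct either process: remove it, and both still run.
```

This is the signature of a restrictive constraint: take the feedback away, and both processes carry on regardless, just less well regulated.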
The second kind of interaction is very different. It is involved in the organisational closure that underlies the self-manufacture (or autopoiesis) of living systems (as we have explained in chapter 10). Like feedback, it involves a reciprocal relation between two processes. But unlike feedback, the two processes not only influence but co-construct each other. Following Walsh again, we take Karl Marx’s word for it: “The animal is immediately one with its life activity. It does not distinguish itself from it.” This is important: neither process would even exist without the other’s activities. The two processes not only affect each other’s nonholonomic constraints, but the set of constraints that defines their interaction at one instant in time generates the set of constraints which defines them and their interaction at the next. Even if this process is subject to external influences from the environment, at least some of the dynamics are not just causally altered but constructed by this self-referential interaction, which renders them (at least partly) autonomous of their surroundings. What happens next is no longer a passive response to environmental factors, but depends actively on the internally controlled self-production of new constraints. Its constructive aspect (not its self-referentiality) is what distinguishes this kind of interaction from mere feedback. This is why we call the nonholonomic constraints involved here constructive constraints.
This has important consequences for our ability to model living systems. Basically, the entire rest of the book revolves around these. For now, let us first try to understand what mutual co-construction means in formal terms, by looking at a time system with two subsystems that reciprocally generate each other. As a motivating example, let us consider once more how we explore our large world and how our explorations both generate and, in turn, are generated by what we already know about the world. This seeming paradox lies at the very heart of our theory of knowledge. Philosopher Thomas Kuhn called it the theory-ladenness of our observations: what we perceive (and how we perceive it) fundamentally depends on our existing conceptual framework. In turn, this framework is built upon our experiences (and that of our peers and ancestors). The two are inextricably intermingled: one cannot exist without the other.
This raises the danger that their reciprocal co-construction may be radically ungrounded. And indeed it is, unless outside influences impinge on this process: if the world within our horizon of experience is all there is, there would be no escape, no way to connect to any “outside” reality whatsoever. This is what some philosophical idealists or radical relativists believe. Luckily, as we have seen, there is an external reference. It is that part of our experience over which we have no control. It moulds our actions and experiences in a way that can adapt our behaviour and our knowledge to our physical ambience. The apparent paradox, as we have said before, is really an evolving strange loop between agent and arena. And our co-construction of sensation and action becomes adaptive, robustly embedded in the world.
But, we may ask: how does such a co-constructing system ever get started? Evidently, neither action nor knowledge is possible without the other. Both have to get going together, or they won’t budge at all. This is a serious deadlock. Even worse, a living cell consists of hundreds of processes that mutually generate each other, not just two. They all have to come together, in a coordinated manner. How can life ever get started like this? This is what makes the origin of life such a tremendous mystery. It makes no sense to isolate any of the reciprocal relations, or to prioritise some specific process over all the others. What came first: action or knowledge? The two must co-emerge and co-evolve. Similarly, biological organisation is irreducible in the sense that closure is a property of the entire set of processes that constitute an organism at any specific moment in time. It is the emergence of this overarching coherence that we need to explain. But, again: where do we even begin?
All of this suggests that co-constructing systems aren’t mechanisms in the typical Newtonian sense. Just like any other evolutionary process, they are incompletely formalisable, but for quite distinct reasons. Apart from the fact that there is no well-defined starting or end point (which is something that affects all evolutionary processes), there is no longer a clearly defined linear sequence of operations that takes us through cycles of reciprocal construction. The system’s trajectories are ill-defined, as illustrated by the fact that our actions and knowledge remain unhinged unless they are constrained by reality. We’ll get a deeper insight into why this is in a minute. For now, let us just say that Aristotle and Newton came up with their original prohibition of circular causality to avoid exactly this kind of ill-defined situation.
But here we are. Today, the strange loop described above is our best shot at a theory of knowledge for limited human beings, and the closure of constraints is our only viable and plausible hypothesis concerning self-manufacturing (autopoietic) living systems and how they are organised. And even if this were not so, it seems reasonable to assume that the kind of co-constructing interactions we’ve just described can and do exist in the natural world. So we should have a theory of how they behave. This requires us to overcome the mechanistic ban on circularity, to step out of the machine map of the world, in a way which keeps our modelling practice grounded and our formalisms free of contradictions.
For the sake of developing such a theory, let us formalise a simple system of two co-constructing processes in terms of set-theoretical mappings. The core formal structure of the resulting time system consists of two system-response maps that are coupled such that the input X of one is the output Y of the other, and vice versa. This is illustrated in the following diagram:
where the sets of states S₁ and S₂ for each process evolve over time in a way that is consistent with the state-transition maps of each respective process. These transition maps, at each time point, are defined by the outcome of the corresponding reciprocal system-responses. If you follow the arrows in the diagram closely, you recognise that each of the state sets is definable exclusively in terms of the other. Mathematicians call this kind of codependent definition impredicative (see appendix). The state space of the entire system (both sets of states taken together) is therefore defined in a collectively impredicative way.
This formal property, the impredicativity of the co-constructing time system, is what explains why the trajectories of the system are not well-defined. Any attempt to figure out how this system works (in the absence of information about its context) leads to an infinite regress: we keep going around the arrow diagram in circles, forever. There is no definite way by which we can capture the dynamical potential of such a system. This is yet another way in which the world is not a set: knowledge and life defy formalisation not only at the level of their evolution, but at the very root of their organisation!
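We can even watch this regress happen. In the deliberately silly sketch below (all names and the placeholder response map are our own), each subsystem's state set is "defined" only by asking the other for its states, so any attempt to resolve either one in isolation recurses forever, with Python's RecursionError standing in for the mathematician's infinite regress:

```python
# Two co-constructing subsystems: each state set is "defined" purely
# in terms of the other. Trying to resolve either one in isolation
# never bottoms out.

def response(state: int) -> int:
    return state + 1            # placeholder system-response map

def states_of_s1() -> set[int]:
    # S1 is whatever S2's responses generate ...
    return {response(s) for s in states_of_s2()}

def states_of_s2() -> set[int]:
    # ... and S2 is whatever S1's responses generate.
    return {response(s) for s in states_of_s1()}

try:
    states_of_s1()
except RecursionError:
    print("infinite regress: neither state set can be resolved alone")
```

Real co-constructing systems do not crash, of course: they exist. The regress is a problem for our formalisation, not for the world.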
To summarise: we can distinguish between three different types of general time systems based on how they behave:
Of these three, only dynamical systems (leftmost panel) can be precisely and completely characterised as a well-defined subset of the ambience (the system’s state space S). However, this type of formal model only applies to systems whose rules remain fixed or change themselves in a very regular manner that is independent of the system’s own state. In most cases, and especially in the biological and social domain, this is an unwarranted idealisation. If it applies in a specific case, it needs to be properly justified. In contrast, the other classes of time systems — evolutionary (middle) and co-constructing processes (right-hand panel) — cannot be formalised completely as specific subsets of the ambience. This is because their set of states changes in a manner that is not formally prestatable (in the case of evolutionary processes), or is indeterminate because it can only be defined in a collectively impredicative manner (for co-constructing processes including, as we shall see, all living organisms).
Not only is the world not a set, but set-theoretic formalisation hits its limits pretty quickly when trying to capture open-ended living and evolutionary dynamics.
On top of these limitations, there is only so much we can learn about the classification of dynamics and the organisation of natural processes from formalisations based on set theory. To gain a deeper understanding of why these types of systems differ from each other, and especially how they behave in different ways, we need to focus, not on their material parts, but on the processual relations that constitute their dynamic organisation. In fact, we’ve tried to do this in the cartoonish diagrams above, although our attempt remains clumsy and vague. We’ll fix this in later chapters.
For this purpose, we need a better mathematical formalism, one that explicitly focuses on the relations between the processes that constitute a system. So let us introduce an alternative to set theory that meets these requirements. Buckle up, it’ll get a little bit abstract before we return to less technical arguments in the next chapter.
Categories
Category theory originated in the 1940s as a mathematical tool to understand natural transformations that give rise to all kinds of mathematical structures. From the start, it was deliberately designed as a theory about mathematics — a kind of metamathematics. Its original authors were Samuel Eilenberg and Saunders Mac Lane, who built on Emmy Noether’s pioneering work on structure-preserving processes. Think of it this way: if set theory is the (meta)mathematics of containment and substance thinking, then category theory is the (meta)mathematics of relations and process philosophy. Both provide alternative foundations for the rest of mathematics (see this appendix), as both allow us to relate various abstract structures to each other. But they do so in complementary ways.
In our context, it is useful to consider category theory more specifically as a general theory of modelling relations: it not only describes how we encode and decode models of natural systems to make them congruent (chapter 11), but it also relates the resulting formal systems to each other based on the kinds of (natural) transformations that exist between them. In this way, category theory constitutes a formal model of the perspectival and processual theory of knowledge and the general view of science we introduced in the last three chapters (and review in the next). It allows us to generate formal templates for (often counterintuitive) relational and process-based concepts, such as naturality, or the notion of an open-ended evolutionary process. These templates, in turn, help us understand such concepts more rigorously, and to relate them to each other in a systematic manner.
But what exactly is the theory of categories? And what makes it so exceptionally useful for our purpose?
The basics are quite simple: a mathematical category consists of a collection of abstract objects with arrows (called morphisms) that connect them. Here is an example:
This category has three objects (A, B, and C), with two arrows between them (which we denote by f: A 🠂 B and g: B 🠂 C). Note that the source (or domain) and the target (or range) of each arrow are intrinsic parts of its definition: even if f and g were representing the same type of mathematical relation (a linear relationship, let’s say), they would still not be the same morphism. For two arrows to be equal, domain and range must also coincide. Conversely, there can be multiple arrows (each representing a different relationship) that connect the same two objects. All these arrows are elements of what is called a hom-set. The set of all distinct arrows from A to B, for instance, is called hom(A,B) (see the appendix).
But there is more. What we’ve described so far is not yet a category, but what mathematicians call a directed graph: a collection of nodes (or vertices) connected by directional edges (we've encountered graphs before: see this appendix, and also this one). To become a category, this abstract structure must meet two additional conditions.
For one, in a category, we can concatenate or compose any two arrows that connect the way f and g do in the following diagram:
In words: two arrows are composable if the range of f coincides with the domain of g — i.e., the arrows meet at object B. In mathematical notation, we write g ○ f (pronounced “g after f”). Despite its composite character, what results is just another standard arrow or morphism straight from domain A to range C. A mathematician would say that the above diagram commutes (or is commutative): no matter whether you follow arrows f and then g, or go through g ○ f directly, you end up at the exact same spot.
Moreover, for reasons explained in the appendix, the operation of composing arrows must also be associative: if we concatenate more than two morphisms — say f, g, and h — the order in which we do so does not matter. Formally: (h ○ g) ○ f = h ○ (g ○ f). Or, graphically, the following diagram commutes (i.e., once again, following any combination of arrows yields the same outcome):
For the sake of simplicity, we will usually not draw composite arrows. Just remember that they are perfectly normal arrows or morphisms, and they are always there.
The second condition is as follows: every object in a category must have an arrow that points to itself:
We write these circular morphisms as 1A, 1B, and 1C and call them identities, because they all conform to the following two unit laws: 1B ○ f = f and g ○ 1B = g. In words: if you take the route around 1B from B back to itself, nothing changes: you end up at exactly the same B you had before. Note that an arrow can point from an object back to itself without being an identity (see the concept of automorphism we have introduced when talking about dynamics above; also this appendix). As in the case of composite arrows, we usually do not show the identities when drawing a category diagram, but they are always there.
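To make the whole definition tangible, here is a small sketch of the three-object category from our diagrams, with composition and identities spelled out explicitly (the representation as dictionaries and a lookup function is our own choice, not standard practice), and the unit laws checked by brute force:

```python
# The three-object category from the diagrams above, made explicit.
# An arrow is a name with a domain and a codomain; composition is a
# lookup that respects the unit laws.

arrows = {                     # name -> (domain, codomain)
    "1A": ("A", "A"), "1B": ("B", "B"), "1C": ("C", "C"),
    "f":  ("A", "B"), "g":  ("B", "C"), "g∘f": ("A", "C"),
}

def compose(g_name: str, f_name: str) -> str:
    """Return the name of g ∘ f ("g after f")."""
    assert arrows[f_name][1] == arrows[g_name][0], "arrows must meet"
    if f_name.startswith("1"):         # identities are right units
        return g_name
    if g_name.startswith("1"):         # ... and left units
        return f_name
    return "g∘f"                       # the only composite arrow here

# unit laws: 1B ∘ f = f and g ∘ 1B = g
assert compose("1B", "f") == "f"
assert compose("g", "1B") == "g"
# the composite is an ordinary arrow, straight from A to C
assert arrows[compose("g", "f")] == ("A", "C")
print("unit laws hold; g∘f goes from A to C")
```

Associativity holds trivially in a category this small; with longer chains of composable arrows, it would be a genuine condition to verify.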
Identities may seem a little superfluous at first. But they allow us to make an important move: we can eliminate all the objects from a category, and still get the same — perfectly well-defined — mathematical structure! All we have to do is to equate each object with its identity relation: B = 1B. In fact, this little trick is extremely important for our argument. A set is defined by the elements it contains (its intrinsic properties), but a categorical object is defined entirely by the relations it has with itself and other objects (which are external properties, dependent on the context of the object). In set theory, elements are primary objects, but in category theory, the objects are derived secondarily from their relations. This is one important sense in which a category is not just a set with additional structure.
As an example, consider a category where the objects are people and the morphisms are their personal relations. This is quite apt: you, as a person are defined as much by your connections to others as you are by your intrinsic properties. Your partners, your parents, your friends, your teachers, and the wider societal context you find yourself in, have a tremendous influence on who you are.
And even your intrinsic properties can be usefully interpreted as some kind of relation with yourself. This not only applies at the level of your personality. Think of your genome, which many people consider the essential intrinsic property of a human being. They talk about your genes somehow “determining” or “programming” the way you look and behave. But this is not at all accurate. Genes don’t do anything on their own. They are just sequences on an inert strand of DNA. To be active, they need to be expressed by a living cell. Biologist C. H. Waddington famously quipped that it is impossible to decide whether the genome is instructing the cell, or whether the cell is interpreting the genome. This is, fundamentally, a two-way relation. Which genes get expressed — where and when — depends on their relations with other genes, the context of the living cell they are a part of, and this cell’s context among other cells within your body and the outside environment. It’s relations all the way up, and all the way down.
Okay, you may say, I get it: categories are the relational and processual complement to substantivist sets. But what, then, is the connection between category and set? Well, just as objects are special (relatively stable) processes, a set is a special case of a category. It is a category with identities as its only arrows:
This is called a discrete category. If the collection of objects of such a category is well-defined, then it is exactly equivalent to a set with the categorical objects as its elements, where the unique properties of each object or element are guaranteed by the uniqueness of its identity arrow. Consider the example of sorting cars by colour (see this appendix, and this one): a car’s being yellow or red is described equally well as an intrinsic property of the car, or as a relation (being yellow) that the car has with itself.
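In code, the collapse from discrete category to set is almost a one-liner (a toy sketch with invented car-colour objects):

```python
# A discrete category: identities are its only arrows, so its data
# collapses to exactly the data of a set of objects.

cars = {"yellow_car", "red_car"}                  # a plain set ...
identities = {f"1_{c}": (c, c) for c in cars}     # ... as a category

# recover the set: one object per identity arrow, nothing more
recovered = {dom for (dom, _cod) in identities.values()}
assert recovered == cars
print("discrete category and set carry the same information")
```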
There are other points of connection. We can consider any category to be made up of a collection of objects, plus a collection of arrows. If the collection of objects is a well-defined set, we call the category concrete (or small). A concrete category need not be discrete, but it can be straightforwardly expressed as a set with a collection of mappings between its elements. This collection of mappings will always be a set as well.
But not every category is concrete or small: think of that strange beast, the category of all (well-defined) sets, with mappings between them as its morphisms. This category is so important in mathematics that it has its own name: it’s called (to nobody’s surprise) Set (the bold font indicating that it is a category). But since there is no set of all sets, Set’s collection of objects (i.e., all sets) cannot be a set itself (see the first section above and this appendix). Neither is its collection of arrows: there is no set of all possible mappings between sets. We call a category that is not concrete (or small) a large category. For those of you who are interested in such things: this is the abstract conceptual basis of the distinction between small and large worlds.
There are lots of other interesting categories. We’ll encounter quite a few of them in what follows. And there are a lot of subtleties to the kinds of morphisms we can have in a category, how we can subdivide categories, how we can form such things as the dual or opposite of a category, or the product of two categories. We talk about a few of these in the appendix.
For the rest of this section, instead, we want to focus on how you get from individual categories to those famous natural transformations that category theory was designed to characterise. For us, the most essential and general function of category theory is this ability to rigorously define the concept of naturality, introduced in the last chapter, which underlies much of our capability to pick out patterns from the ambience, and to perform the skillful art of modelling them. Basically, a natural pattern (or transformation) is a dynamic regularity we’ve evolved to recognise. We already mentioned how difficult this concept is to pin down precisely. Maybe having an exact mathematical formulation will help?
To understand the idea of a natural transformation, we must think beyond relations inside categories, to relations between them. We can illustrate this with the familiar example of Rosen’s modelling relation:
In this scheme, there are two distinct categories. On the left is a natural system, whose objects are perceived phenomena and events to which we impute causal relations (the morphisms of the category, which Rosen calls “causal entailment”). On the right, there is a formal system, whose objects are the variables of the model, and the morphisms are formal relations between these (what Rosen calls “inferential entailment”). The latter correspond to the rules (both internal and constraint-based) that govern the dynamics of a time system (as described in the previous sections). But there is no guarantee that the objects and morphisms of one category map (in any simple way) onto the other. There need be no one-to-one correspondence — as we’ve explained in the last chapter. Instead, the main criterion for a good model is that it is encoded in a way that allows us to decode useful inferences about the behaviour of the natural system in return.
It should be evident that the relations underlying the encoding and decoding arrows in the above diagram are far from trivial. Therefore, to understand what congruence between natural and formal systems actually means, we need to unravel the deeper structure of the modelling relation they constitute. But what do these arrows between categories actually mean? The good news is: they are just another kind of morphism. The bad news is (as we’ve come to expect by now): it’s a bit more complicated than that.
Morphisms between categories are called functors. Each functor is a composite that consists of two distinct maps: (1) an object mapping, and (2) an arrow mapping. Let’s take the encoding functor ε above as an example. Its object mapping assigns to each perceived phenomenon P that is part of a natural process a variable V of the corresponding formal system: P 🠂 ε(P) = V. When studying a gene regulatory network, for instance, we measure the changing concentration of a transcription factor protein in some tissue. We then assign our time series of measurements (an observable) to one of the state variables of the model. Analogously, the arrow mapping connects every inferred causal relation c among perceived phenomena to a logical relation i in the model: c 🠂 ε(c) = i. Let’s say that an increase in concentration of one transcription factor leads to the disappearance or downregulation of another: then (roughly) we can infer that the former represses the expression of the latter, and we parametrise the equations of our model accordingly, implementing the interaction of the two corresponding variables as a negative or inhibitory one.
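A bare-bones sketch of ε as a pair of lookup tables may help here (the gene names, and the reduction of a causal relation to a signed triple, are invented purely for illustration):

```python
# The encoding functor ε as two maps: one sending perceived phenomena
# to model variables, one sending imputed causal relations to signed
# interactions between those variables. All names are invented.

object_map = {
    "concentration of TF alpha": "x1",
    "concentration of TF beta":  "x2",
}

def arrow_map(cause: tuple[str, str, str]) -> tuple[str, str, str]:
    """Encode a causal relation (source, target, effect) as a formal one."""
    src, tgt, effect = cause
    sign = "-" if effect == "downregulates" else "+"
    return (object_map[src], object_map[tgt], sign)

observed = ("concentration of TF alpha",
            "concentration of TF beta", "downregulates")
print(arrow_map(observed))   # ('x1', 'x2', '-'): an inhibitory coupling
```

This shows only the two components of the functor. What makes ε a functor proper is that it also respects identities and composition, which a fuller sketch would have to check.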
This is how category theory enables us to model our modelling practice. Somewhat crudely still, in the account we’ve just given. But it’s a start. We could call this metamodelling.
Now, we mentioned many times before that the connections between a natural system and its formal model need not be direct or one-to-one. But still, there must be some kind of definite regularity. Otherwise, there would be no way for us to decode the model — to draw useful inferences, or (even better) to produce accurate predictions from it. The idea here is not that the model is necessarily correct. And we don’t mean to imply that it represents exactly what is going on in the ambience. In order to be useful, the model simply needs to be in a relation with the phenomena that is well-defined and regular in a way that allows us to robustly decode it. And if we want to understand this kind of definiteness and regularity, we need to focus on the relation between encoding and decoding. This is a relation between relations between categories.
We hope you’re still with us: because this is exactly what a natural transformation is. And if you remember only one thing from this section, let it be this: a natural transformation tells us how to turn one functor into another with the kind of natural regularity we are after.
An example may be helpful here: remember, in the last chapter, we introduced three different kinds of relations between the encoding and decoding functors of the modelling relation. In the simplest (but most unrealistic) case, the natural system and its formal model are related by an isomorphism, which simply means that the encoding and decoding functors are inverses of each other, and there is an exact one-to-one correspondence between the objects and arrows of either category. This is Borges’ map again: it contains just as much detail as the territory it represents, which (as we already know) is not very useful.
In a bit more realistic scenario, the natural system and its formal model are connected by an equivalence, which means that the objects and arrows of the model are isomorphic only to selected (relevant) aspects of the natural system. In this case, encoding and decoding are still invertible, but only when it comes to these relevant aspects, and only insofar as the objects and arrows of the model are approximations, abstractions, and idealisations of the corresponding components of the natural system.
In the most general case, the natural system and its formal model are related by an adjunction. This means that the encoding and decoding functors are what mathematicians call (right and left) adjoints of each other (see appendix). Here, there is no longer any direct invertibility. Instead, the path to decoding inferences from an encoded model can be asymmetric, convoluted, and indirect. But: we can still draw valid inferences as long as the relation between the two functors shows the kind of regularity we are trying to define here. This is one of the main purposes of the category-theoretical concept of a natural transformation: it allows us to define precisely, in mathematical terms, what an adjunction is. The details of this definition depend on the context of the particular adjunction we are looking at. But it is always a specific relation between functors. And also: we can show that the first two cases above — isomorphism and equivalence — are just special cases of more general adjunctions.
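Outside our modelling relation, the smallest standard example of an adjunction is worth seeing once. In the sketch below (this is the textbook floor/ceiling adjunction, not anything specific to encoding and decoding), the inclusion of the integers into the reals has the ceiling function as its left adjoint and the floor function as its right adjoint. Neither direction is invertible, yet each obeys an exact two-way law: precisely the kind of regularity we are after.

```python
# The floor/ceiling adjunction: R and Z are posets, i.e. categories
# with an arrow x -> y exactly when x <= y. Including Z into R has
# ceil as its left adjoint and floor as its right adjoint. Neither
# rounds back invertibly, but each obeys an exact two-way law.

import math
import random

for _ in range(1000):
    x = random.uniform(-100, 100)    # an object of R
    n = random.randint(-100, 100)    # an object of Z
    # left adjoint:  ceil(x) <= n  iff  x <= n
    assert (math.ceil(x) <= n) == (x <= n)
    # right adjoint: n <= floor(x)  iff  n <= x
    assert (n <= math.floor(x)) == (n <= x)

print("adjunction laws hold on 1000 random samples")
```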
This is where the maths get a little hairy. So we won’t go into any details in terms of the formalism here (but see the appendix, if you are interested). Instead, let us try to extract the gist of the concept of a natural transformation from what we just said about all three possible relations between encoding and decoding. For a mathematical transformation to be natural, it simply needs to be applicable anywhere, in the same way, no matter how asymmetric and complicated its application rule may be, and whether or not this rule is directly invertible.
In the case of the modelling relation, this means that we can apply the same sequence of mathematical transformations to any object or arrow of either one of the categories, and end up at the same spot again after we’re done. This transformation is trivial when it comes to an isomorphism or equivalence between categories: encoding is just the inverse of decoding. In the case of the isomorphism, it does not matter which object or arrow we’re dealing with. The functors can be inverted anywhere, in exactly the same way. In the case of the equivalence, we need to restrict ourselves to those aspects of the natural system captured by the model, but invertibility still applies (at least roughly speaking). The adjunction is more complicated: how we encode a model can be very different from how we decode inferences from it. But even such an asymmetric relationship can be regular in the sense that it allows our inferences to reconnect to the phenomena they concern, in the same way, for all relevant phenomena, even if encoding and decoding have nothing to do with each other, at first glance, and are not simply invertible.
Natural transformations are so powerful, because they not only allow us to relate the functors describing how we encode and decode a model, but also apply to cases where a natural system has more than one model, or where a formal model describes more than one natural system.
In the first case (left panel), where we have a natural system that allows multiple valid models, we get distinct encodings, and we can investigate how (or if) they relate to each other using natural transformations. This will be important in the last two parts of our book, when we talk about living organisms and systems (ecological or social) that contain them as components.
In the second case (right panel), there are multiple interpretations (i.e., decodings) of the same formal model that allow us to apply it to different natural systems that are often quite different in scale and physical composition. Category-theorists and physicists call this kind of situation universality (see the appendix). Frequently invoked examples of universal phenomena are physical phase transitions (as in ice melting into water), or spin glasses (originally used to understand the behaviour of ferromagnetic solids), which can be analogised to critical transitions in various complex systems and networks, such as communities, ecosystems, or gene regulatory networks. Again, we’ll have more to say about this later.
But before we go on, let’s take stock: we now have powerful formal tools — category, functor, natural transformation (and adjunction, in particular), as well as concepts such as naturality and universality — that allow us to have a rigorous and systematic look at models of natural systems, and their modelling relations. Our specific aim is to use these tools to compare living and non-living systems to see whether and how they differ from each other.
We hope you are not too exhausted yet, your brain not completely saturated. It’s been quite a journey! In the last four chapters, we learnt about science as a skillful modelling activity, about epistemic cuts (setting apart both the knower and the system to be known), about formalisation (and its limitations), about the relation between formal and natural systems, and about the mathematical theories and tools we use to study not only those systems, but also the very activity of modelling itself.
This brings us to the conclusion of the second part of the book. If you remember only one thing from this discussion, let it be this: models are human artefacts, abstract tools which we build and employ to draw inferences about natural phenomena and events in order to better understand and predict them. They are inferential blueprints (related by adjunction) first, and representations of reality (related by isomorphism or equivalence) only in special cases.
In the next chapter, we’ll wrap up this part by returning to a less formal and more philosophical way of looking at the world. Our intention is to bring together all the strands of thinking we have touched upon in the last four chapters, to understand how we (as limited living beings) can obtain useful and robust scientific knowledge about our multi-levelled and dynamic world — despite (or maybe exactly because of) all its ever-changing and entangled complexity.
The authors acknowledge funding from the John Templeton Foundation (Project ID: 62581), and would like to thank the co-leader of the project, Prof. Tarja Knuuttila, and the Department of Philosophy at the University of Vienna for hosting the project of which this book is a central part.
Disclaimer: everything we write and present here is our own responsibility. All mistakes are ours, and not the funders’ or our hosts’ and collaborators'.