Category
This appendix serves the same purpose as the one on set theory. Consult it if you want a better understanding of category-theoretical formalism and the concepts used in the main text, or if you are interested in what category theory does and how it provides a foundation for the rest of mathematics.
For further reading, we also recommend Eugenia Cheng’s book “The Joy of Abstraction,” an accessible introduction from the point of view of an active researcher in the field of category theory. If you like a more formal approach, check out Saunders Mac Lane’s “Categories for the Working Mathematician.” If you prefer something equally formal but shorter, then consult the appendix of Aloisius Louie’s “More Than Life Itself.” “Intangible Life” (Louie’s most recent work) has a more in-depth introduction to category theory, and directly relates it to Rosen’s modelling relation (in Part III of the book). And if you feel adventurous, read Robert Goldblatt’s surprisingly accessible “Topoi,” which explains how category theory can provide a foundation for mathematical logic.
Just to recap from chapter 12, a mathematical category (usually denoted by bold face: e.g. C) is defined by:
a collection of abstract objects (A, B, C …) which are defined by their relations, represented by
a collection of arrows/morphisms (f, g, h …), connecting objects to themselves and other objects.
Arrows are composable if the target (or range) of one coincides with the source (or domain) of the other, i.e., if one arrow ends where the other one begins. In addition, we have the following category axioms:
1. Uniqueness: arrows can be sorted into disjoint sets, called hom-sets, that contain all morphisms which start/end at specific objects. We write C(A,B) for the set of all arrows from domain A to range B in category C. (In set theory, the equivalent set of all mappings from A to B is written as Bᴬ, for reasons we will revisit in the second and third sections of this appendix.) All of this means that f₁: A 🠂 B and f₂: B 🠂 C are never the same morphism, even if they represent the same kind of relation f (e.g., a linear relationship between two objects).
2. Associativity: the order in which we compose more than two arrows does not matter or, in mathematical notation: (h ○ g) ○ f = h ○ (g ○ f), where ○ is the symbol for composition, and is read as “after.” In other words: executing the composite of h and g after f equals executing h after the composite of g and f. This renders composition path- and history-independent.
3. Identity (or the unit laws of a category): for any object B, there exists an identity morphism 1B mapping the object to itself. Given two mappings f: A 🠂 B and g: B 🠂 C, coming into and out of B, we have 1B ○ f = f and g ○ 1B = g. In words: if we trace the identity arrow from an object B back to itself, absolutely nothing changes, and we end up with the exact same B we started with.
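The three axioms can be made tangible with ordinary functions, where composition means “apply one after the other.” The following sketch is an illustrative example (not taken from the text; the sample arrows are made up) that verifies associativity and the unit laws on a handful of inputs:

```python
# Illustrative sketch (hypothetical arrows): ordinary functions between
# number "objects" obey the category axioms under composition.

def compose(g, f):
    """Return the composite 'g after f' (g ○ f)."""
    return lambda x: g(f(x))

f = lambda x: x + 1       # f: A 🠂 B
g = lambda x: 2 * x       # g: B 🠂 C
h = lambda x: x - 3       # h: C 🠂 D
identity = lambda x: x    # identity arrow on any of these objects

# Associativity: (h ○ g) ○ f = h ○ (g ○ f)
left = compose(compose(h, g), f)
right = compose(h, compose(g, f))
assert all(left(x) == right(x) for x in range(10))

# Unit laws: 1B ○ f = f and g ○ 1B = g
assert all(compose(identity, f)(x) == f(x) for x in range(10))
assert all(compose(g, identity)(x) == g(x) for x in range(10))
```

Note that associativity holds automatically for function application; the assertions merely make the axiom concrete.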
We explain in the main text that a discrete category (with identities as its only arrows) amounts to a relational redescription of a set, where the identity arrows distinguish the different set elements. For instance, the trivial category 1 contains exactly one object, with its identity morphism the only arrow. And just like there is an empty set, there is also an empty category, which contains no objects and no arrows at all.
If the collection of objects of a category is a set, the category is small; if it is not, the category is large. An example of a large category is the category of all well-defined sets, with arrows the mappings between them. This category is called Set. Its collection of objects is not a set, since there is no set of all sets (see chapter 12, and this appendix). A closely related (but more general) category is Rel, with objects all well-defined sets, and arrows relations between them. Remember: mappings are special kinds of (mathematical) relations, associating only one specific value with each element of their domain.
Both Set and Rel are locally small: collections of maps or relations between any two sets are always sets themselves (the local hom-sets of these two large categories). Each hom-set in Set is thus a subset of the equivalent hom-set in Rel, since not all relations are mappings, but all mappings are relations.
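For finite sets, this relationship between hom-sets can be checked by brute-force enumeration. The sketch below (an assumed example with tiny, made-up sets) counts all relations and all mappings between a two-element set A and a three-element set B, confirming that Set(A,B) has |B|^|A| elements (hence the notation Bᴬ) sitting inside the 2^(|A|·|B|) relations of Rel(A,B):

```python
from itertools import combinations

A = {0, 1}            # hypothetical two-element set
B = {'x', 'y', 'z'}   # hypothetical three-element set

# A relation from A to B is any set of ordered pairs (a, b).
pairs = [(a, b) for a in A for b in B]
relations = [frozenset(s) for r in range(len(pairs) + 1)
             for s in combinations(pairs, r)]

def is_mapping(rel):
    """A mapping pairs each element of A with exactly one element of B."""
    return all(sum(1 for (a, _) in rel if a == x) == 1 for x in A)

mappings = [rel for rel in relations if is_mapping(rel)]

assert len(relations) == 2 ** (len(A) * len(B))  # 64 relations in Rel(A,B)
assert len(mappings) == len(B) ** len(A)         # 9 mappings in Set(A,B)
```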
Categories can be used in ways that are very particular, or they can be used in ways that are very abstract and general. Their flexibility is their strength. As we have just seen, a discrete category can represent one specific set: it is a small category, its objects the elements of the set, with morphisms determining the identity of each element. In contrast, Set and Rel are large categories with a higher degree of abstraction that contain whole sets as objects, with arrows representing mappings or relations between them.
Similarly, in what follows, we distinguish small categories that represent one specific natural or formal system with its internal relations, and large categories that represent whole classes of natural or formal systems (or, indeed, the class of all natural or formal systems) and how these systems relate to one another. Wherever confusion may arise, we will make sure to highlight the difference.
Many other interesting and useful categories are derived from Set. We can interpret their objects as sets and their arrows as mappings with some added mathematical structure. Take, for example, Mon, the category of small monoids, or Grp, the category of small groups, with arrows representing homomorphisms, i.e., mappings that preserve the mathematical structure of the objects they connect. We have outlined, in this appendix, how monoids and groups model or represent arithmetic operations on numbers (addition, multiplication, etc.). Within these large categories, individual monoids and groups can themselves be treated as small categories, each containing just one object (e.g., M and G, respectively) which represents the underlying set. To this set, we add a binary operation, which is associative and has an identity element (in the case of a monoid M), and is also invertible (in the case of a group G). In this one-object picture, each element of the underlying set is nothing but a morphism from the single object to itself, and the binary operation is the composition of these morphisms (e.g., composing the arrows for 1 and 2 yields the arrow for 3, if the operation is addition). Only the arrow for the identity element is the identity arrow of the object.
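In the one-object view, the monoid’s elements act as arrows and the monoid operation acts as composition. A minimal sketch (illustrative; the class and names are made up) for the natural numbers under addition:

```python
# Hypothetical sketch: the monoid (ℕ, +, 0) viewed as a one-object category.
# Arrows = monoid elements; composition of arrows = the monoid operation.

class OneObjectCategory:
    def __init__(self, operation, unit):
        self.operation = operation  # composition of arrows
        self.unit = unit            # identity arrow on the single object

    def compose(self, g, f):
        """Compose arrow g after arrow f."""
        return self.operation(g, f)

N_plus = OneObjectCategory(lambda a, b: a + b, 0)

# Associativity of composition = associativity of addition:
assert N_plus.compose(N_plus.compose(1, 2), 3) == N_plus.compose(1, N_plus.compose(2, 3))
# Unit laws: composing with the identity arrow (the element 0) changes nothing:
assert N_plus.compose(N_plus.unit, 5) == 5
assert N_plus.compose(5, N_plus.unit) == 5
```

A group would additionally give every arrow an inverse (e.g., the integers under addition, where each n is undone by −n).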
This should make a few things clear. First, the name “monoid” refers to the fact that this structure can be described by a category that has only one single object. Second, a group is nothing but an invertible monoid. This shows how category theory relates different branches of mathematics to each other. And, last but not least, we now understand why categories have a composition operation that must be associative and have an identity element. It makes composition non-arbitrary (in the sense of “independent of its own history”), which not only makes it easier to deal with, but also allows us to map categorical concepts to standard mathematical operations, such as those modelled by monoids and groups.
There are many other large categories that are built on Set. Partially and totally ordered sets (introduced previously in this appendix) are the objects of the categories Pos and Tos, with arrows the order-preserving (monotone) mappings between them; each ordered set can, in turn, be treated as a small category whose arrows represent its ordering relation ≤ (or ≥). Vct is the category of vector spaces (not the same as vector fields) with linear transformations between them. Top is the category of topological spaces and continuous maps. And so on.
But, and this is important, there are also categories whose objects are not sets, or whose arrows are not mappings. Interesting examples of the former are Cat, the category of all small categories, and functor categories, both of which we’ll revisit in the next section. As an example of a category whose morphisms are not mappings, we’ve already encountered Rel. Another example is Toph, where the objects are topological spaces (i.e., sets with structure), but the arrows represent homotopy classes which classify, roughly speaking, how the maps they contain as elements connect two points in space, and whether they can be smoothly transformed into each other. Don’t worry if you don’t understand exactly what this means. Our main point here is that, even though most categories can be interpreted in set-theoretical terms, not all of them can. Just like objects are specialised processes that stay constant on the time scale we observe them, sets are specialised categories. This implies that category theory is more general than set theory.
Let us now turn to the question of how categories relate to each other. Usually, this involves functors (see the next section). But if we look at Set and Rel a bit more closely, we notice that they are related in a very special way: both are based on the same collection of objects (the totality of all well-defined sets), and yet, mappings are a proper subclass of relations (the tiny fraction of relations where each element of the domain relates to exactly one element of the range). In addition, we notice that if we compose two mappings, we always get another mapping (but never a relation that is not a mapping). A mathematician would say that the operation of composition exhibits closure in Set. This is, in fact, a basic property of categories that we neglected to mention above: composition of morphisms is never supposed to take you outside the category. We’ve encountered this kind of operational closure before, and we’ve also already mentioned that it is related to but not the same as semiotic or organisational closure (see chapter 10).
All of this means that Set is a subcategory of Rel: every object and arrow in Set is also an object and arrow of Rel (but, in this case, not vice versa). In this sense, a subcategory is defined like a (proper) subset. As an additional condition, however, we also need to make sure that composition of arrows is closed within the subcategory, and that object identities are faithfully inherited from the parent. This is obviously the case for Set and Rel. In contrast, a full subcategory contains only a subset of objects of its parent, but the full complement of arrows between these objects (i.e., the hom-sets are identical between any two objects of the subcategory and its parent). Thus, Set is a subcategory, but not a full subcategory of Rel.
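The closure of composition in Set can be demonstrated directly by representing mappings as sets of ordered pairs and composing them relationally. In the following sketch (assumed, toy data), the composite of two mappings is itself a mapping:

```python
# Illustrative sketch: mappings as sets of ordered pairs, composed relationally.

def compose_relations(S, R):
    """Relational composition 'S after R': all (a, c) with (a, b) in R and (b, c) in S."""
    return {(a, c) for (a, b1) in R for (b2, c) in S if b1 == b2}

def is_mapping(rel, domain):
    """Each element of the domain must be paired with exactly one value."""
    return all(sum(1 for (a, _) in rel if a == x) == 1 for x in domain)

A = {1, 2, 3}
f = {(1, 'a'), (2, 'b'), (3, 'a')}   # f: A 🠂 B, a mapping
g = {('a', 10), ('b', 20)}           # g: B 🠂 C, a mapping

gf = compose_relations(g, f)         # g ○ f: A 🠂 C
assert is_mapping(f, A)
assert is_mapping(gf, A)             # the composite is again a mapping
assert gf == {(1, 10), (2, 20), (3, 10)}
```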
Finally, a few words on operations that we can perform on categories. While unions and intersections of categories are much more complicated than those of sets, we can easily form products of categories. This works in a way that is analogous to the Cartesian product of sets. Let’s say we have two (small) categories A and B. Their product A × B has ordered pairs ⟨A,B⟩ of objects as its objects, and ordered pairs of morphisms ⟨f,g⟩: ⟨A,B⟩ 🠂 ⟨A’,B’⟩ (where f: A 🠂 A’, and g: B 🠂 B’) as its arrows. This is pretty straightforward, but will be extremely useful when we talk about universality in the third section below.
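Composition in a product category works componentwise, as the following sketch illustrates (an assumed example; the arrows in the two “number” categories are made up):

```python
# Illustrative sketch: arrows in a product category are pairs of arrows,
# and they compose component by component.

def compose(g, f):
    return lambda x: g(f(x))

def compose_pair(g_pair, f_pair):
    """Compose <g1,g2> after <f1,f2> componentwise."""
    (g1, g2), (f1, f2) = g_pair, f_pair
    return (compose(g1, f1), compose(g2, f2))

f1, g1 = (lambda x: x + 1), (lambda x: x * 2)   # arrows in category A
f2, g2 = (lambda x: x - 1), (lambda x: x ** 2)  # arrows in category B

h1, h2 = compose_pair((g1, g2), (f1, f2))
assert h1(3) == 8   # g1(f1(3)) = (3 + 1) * 2
assert h2(3) == 4   # g2(f2(3)) = (3 - 1) ** 2
```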
Arrows
We’ve now established enough of the basics about categories to focus our attention on what matters most in the context of our argument: morphisms, functors, and natural transformations — arrows that represent different kinds of relations that occur within and between categories.
First, let’s have another look at the morphisms that relate objects within a category. We already mentioned in chapter 12 that a category can be defined in terms of arrows alone: no objects are required, since an object is characterised completely by its identity morphism, the relation it has with itself. Formally, this means that object B is taken to be equivalent to arrow 1B. Or, in more philosophical terms: being defined by a certain intrinsic property is the same as existing in a relation to yourself that corresponds to that property. A car, for example, has the property of being yellow (see also this appendix). This seems like a trivial distinction, but it is not: it matters greatly whether we treat relations as primary (emphasising context and change), or as derived from unchanging intrinsic essences (an essentialist and substantivist take).
A fundamental property of the general relations represented by categorical arrows (which are not restricted to mathematical relations) is that they have a polarity, or direction: they point from one object (their source or domain) to another (their target or range). Graphically, this is represented by the arrowhead. Such polarity is an essential part of the definition of a morphism, and applies even if the arrow is an identity or an automorphism (see chapter 12), where domain and range are the same object. We “traverse” an arrow in a certain direction — from tail to (arrow)head. This is especially evident in the case of an automorphism, which — if the object it maps to itself is a set — permutes the elements of that set in a given way, and this permutation is usually not the same as its own inverse. In chapter 12, for example, we show how we can use automorphisms to represent the temporal dynamics of an object — always proceeding from “before” to “after.” Time only flows in one direction after all. This is the kind of polarity we are talking about: watching a movie forward looks very different than watching it backward.
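A small sketch makes this polarity tangible (an assumed example; the three-element set and its cyclic permutation are made up). The automorphism below shuffles the set in one direction, its inverse shuffles it the other way, and iterating it models discrete time steps:

```python
# Illustrative sketch: an automorphism of a three-element set as a permutation.

cycle = {0: 1, 1: 2, 2: 0}                  # the 3-cycle 0 🠂 1 🠂 2 🠂 0
inverse = {v: k for k, v in cycle.items()}  # runs the cycle backwards

assert inverse != cycle                            # direction matters
assert all(inverse[cycle[x]] == x for x in cycle)  # yet it undoes the cycle

# Iterating the automorphism as discrete "time steps" (the forward movie):
state, history = 0, []
for _ in range(4):
    history.append(state)
    state = cycle[state]
assert history == [0, 1, 2, 0]
```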
Having established this polarity, we can now enter a strange mirror world in category theory: each category C has an opposite (or dual) Cᵒᵖ, which has exactly the same collection of objects, but the direction of all its arrows is reversed. This means that every morphism f: A 🠂 B of C is replaced by fᵒᵖ: B 🠂 A in Cᵒᵖ, which (and this is important) is not necessarily an inverse, i.e., applying fᵒᵖ after f does not necessarily leave A unchanged (or, formally: fᵒᵖ ○ f ≠ 1A). All that we require is that fᵒᵖ ○ gᵒᵖ exists if g ○ f does. In other words, if two arrows connect in C then their opposites must connect in the reverse order or direction in Cᵒᵖ (formally: (g ○ f)ᵒᵖ = fᵒᵖ ○ gᵒᵖ). Note that automorphisms (which are invertible, by definition) are also reversed (which means time would flow backwards in the movie above), and so are identities, although it really doesn’t matter in their case, as the reverse of an identity is just itself.
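The rule (g ○ f)ᵒᵖ = fᵒᵖ ○ gᵒᵖ can be checked mechanically if we represent arrows as labelled (name, source, target) triples. This is purely a bookkeeping sketch (all names here are made up):

```python
# Illustrative sketch: reversing arrows and checking that composites
# connect in the reverse order in the opposite category.

def op(arrow):
    """Reverse an arrow: swap its source and target."""
    name, src, tgt = arrow
    return (name + "_op", tgt, src)

def compose(g, f):
    """g after f; only defined when the target of f is the source of g."""
    g_name, g_src, g_tgt = g
    f_name, f_src, f_tgt = f
    assert f_tgt == g_src, "arrows do not connect"
    return (g_name + "." + f_name, f_src, g_tgt)

f = ("f", "A", "B")
g = ("g", "B", "C")

gf = compose(g, f)               # g ○ f : A 🠂 C in C
fop_gop = compose(op(f), op(g))  # fᵒᵖ ○ gᵒᵖ : C 🠂 A in Cᵒᵖ
assert op(gf)[1:] == fop_gop[1:] # both run from C to A
```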
For an intuitive illustration of dual categories, let’s take a (partially or totally) ordered set such as the natural numbers ℕ (see this appendix) with an ordering relation ≤ and treat it as a (small) category (i.e., an object in either Pos or Tos; see above). The opposite of such a set is the same set but ordered by ≥ (an object in either Posᵒᵖ or Tosᵒᵖ). Numbers now increase to the left instead of to the right.
It should be easy to see that any true statement that applies to the original category also applies to its dual, just in mirror-reversed order. A trivial example: if 1 is larger than 0, then 0 must be smaller than 1.
This leads to a general principle of categorical duality, which is based on the evident fact that the opposite of an opposite category is the category itself: (Cᵒᵖ)ᵒᵖ = C. Categorical duality implies that for each concept in C, there is a corresponding co-concept in Cᵒᵖ, such that any true statement about this concept in C is also true for its co-concept in Cᵒᵖ. This is tremendously convenient for the category theoretician: any insight about category C immediately yields an equivalent insight about its opposite Cᵒᵖ, which basically cuts the mathematician’s work in half. You derive one theorem, and you get another one for free!
As another example, let us look at two special objects that are defined in terms of their relations with other objects: the initial and terminal objects of a category. Their definitions are quite intuitive. An initial object is special because it has exactly one arrow going out to each other object in the category, while a terminal object has exactly one arrow coming in from each of the other objects.
Not all categories possess initial and terminal objects. Set is an example of a category that has them. In fact, Set has infinitely many distinct terminal objects, because every singleton set (every set with only one element) is terminal: any mapping into a singleton must send everything to the only element there is, so from any given set there is exactly one mapping into the singleton (a constant mapping, yielding one single value as output).
In contrast, the initial object of Set is a bit weird: it is the empty set. To understand how mappings connect to a set with no elements, remember that a mapping is itself defined as a set, its elements ordered pairs associating each member of the domain with one specific member of the range (the value of the mapping; chapter 11 and this appendix). On the one hand, this means you cannot have a mapping into the empty set, as there is no element that could serve as the value of the mapping. On the other hand, and this is where things get strange, we do have a unique mapping from the empty set to any other (empty or non-empty) set. Not surprisingly, it is called the empty mapping: a set that contains no ordered pairs. It is a perfectly valid mathematical construction, since nothing about it contradicts the definition of a mapping: because its domain has no elements, there is no element that fails to be associated with exactly one value. Mathematicians call a condition like this vacuously true. Curious, but consistent.
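Both facts can be verified by enumeration for small finite sets. The sketch below (assumed, illustrative sets) counts mappings into a singleton and out of the empty set:

```python
from itertools import product

# Illustrative sketch: terminal and initial objects of Set, by counting.

def mappings(A, B):
    """All mappings A 🠂 B, each as a tuple of (element, value) pairs."""
    A, B = sorted(A), sorted(B)
    return [tuple(zip(A, values)) for values in product(B, repeat=len(A))]

singleton = {'*'}
some_set = {1, 2, 3}
empty = set()

assert len(mappings(some_set, singleton)) == 1  # singleton is terminal
assert len(mappings(empty, some_set)) == 1      # empty set is initial ...
assert mappings(empty, some_set) == [()]        # ... via the empty mapping
assert len(mappings(some_set, empty)) == 0      # no mapping INTO the empty set
```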
In some categories, initial and terminal objects are the same object. Take Grp, for example: the underlying set of a group cannot be empty, since the group operation has to operate on something after all. The simplest object in this category, therefore, is the trivial group, whose underlying set is a singleton with only one element: the identity of its operation. It is both the initial and the terminal object of Grp. Mathematicians call this (somewhat confusingly) a zero object (in this case, the zero group).
Initial and terminal objects nicely illustrate the notion of a co-concept in opposing categories: in Setᵒᵖ, the empty set is terminal, while any singleton is initial. In contrast, a zero object like the trivial group is not affected by turning Grp into its dual Grpᵒᵖ. So the trick of exploiting duality as a mathematician is to first define precisely which concepts correspond to co-concepts and how. Once you’ve established this, you can apply any insights into a category straightforwardly to its dual or opposite.
This brings us to the rather complicated relation of a morphism to its co-morphism. We’ve already seen that some morphisms are invertible, but (probably not surprisingly) there is more to this concept than first meets the eye. Invertibility, basically, means that given some morphism f there is another morphism (which mathematicians usually denote by f-1 but we can more generally just call g) such that applying g after f takes you back to exactly where you started from (formally: g ○ f = 1A, if A is the domain of f). If this is the case, we call f a monomorphism (or mono, for short). Sometimes, it is also called a right inverse (because f is on the right of g in the formula above, and its arrow is pointing to the right, from A to B). If f is a mapping (an arrow in Set), this corresponds to an injection, which maps each element of its domain to a different unique value (it is one-to-one). This is why it is possible to move back (left) to the original domain via g.
Now, seasoned category theorists that we are, we will ask: what is the dual of a monomorphism? Concretely, what happens if we reverse the order of f and g, applying f after g (f ○ g) instead of the other way around? And here lies danger, because this reverse operation is not necessarily invertible! Take, for example, an injective mapping such as the exponential function (f) defined on the real numbers:
This leads to a general principle of categorical duality, which is based on the evident fact that the opposite of an opposite category is the category itself: (Cᵒᵖ)ᵒᵖ = C. Categorical duality implies that for each concept in C, there is a corresponding co-concept in Cᵒᵖ, such that any true statement about this concept in C is also true for its co-concept in Cᵒᵖ. This is tremendously convenient for the category theoretician: any insight about category C immediately yields an equivalent insight about its opposite Cᵒᵖ, which basically cuts the mathematician’s work in half. You derive one theorem, and you get another one for free!
As another example, let us look at two special objects that are defined purely in terms of their relations with other objects: the initial and terminal objects of a category. Their definitions are quite intuitive. An initial object is special because it has exactly one arrow going out to every object in the category, while a terminal object has exactly one arrow coming in from every object.
Not all categories possess initial and terminal objects. Set is an example of a category that has them. In fact, Set has infinitely many distinct terminal objects, because every singleton set (every set with exactly one element) is terminal. Any mapping into such a singleton must send every input to the same target: the only element there is. Therefore, there is exactly one mapping into a singleton from any other set, and we can always construct it: the constant mapping (yielding one single value as output).
In contrast, the initial object of Set is a bit weird: it is the empty set. To understand how mappings connect to a set with no elements, remember that a mapping is itself defined as a set, its elements ordered pairs associating each member of the domain with one specific member of the range (the value of the mapping; chapter 11 and this appendix). On the one hand, this means you cannot have a mapping from a non-empty set into the empty set, as there is no element that could serve as the value of the mapping. On the other hand, and this is where things get strange, we do have a unique mapping from the empty set to any other (empty or non-empty) set. Not surprisingly, it is called the empty mapping: a set that contains no ordered pairs. It is a perfectly valid mathematical construction, because nothing about it contradicts the definition of a mapping: since there are no elements in its domain, there is no element that maps to multiple values (or fails to map to one). Mathematicians call a condition like this vacuously true. Curious, but consistent.
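Both facts can be checked directly if we represent a mapping as a set of ordered pairs — a Python dictionary, say. This is a minimal sketch under that assumption; the helper mappings is our own construction, not standard notation:

```python
from itertools import product

# Represent a mapping A -> B as a dictionary of ordered pairs.
def mappings(domain, codomain):
    """Enumerate every mapping from domain to codomain."""
    dom = sorted(domain)
    return [dict(zip(dom, values))
            for values in product(sorted(codomain), repeat=len(dom))]

# Terminal: exactly one mapping into a singleton from any set --
# the constant mapping onto its only element.
assert mappings({1, 2, 3}, {"*"}) == [{1: "*", 2: "*", 3: "*"}]

# Initial: exactly one mapping out of the empty set to any set --
# the empty mapping, with no ordered pairs at all (vacuously valid).
assert mappings(set(), {1, 2, 3}) == [{}]

# ...but no mapping at all from a non-empty set into the empty set.
assert mappings({1, 2}, set()) == []
```

Note how the empty mapping falls out of the enumeration automatically: there is exactly one way to assign values to zero domain elements.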
In some categories, initial and terminal objects are the same object. Take Grp, for example: the underlying set of a group cannot be empty, since the group operation has to operate on something after all. The simplest object in this category, therefore, is the trivial group, whose underlying set is a singleton with only one element: the identity of its operation. It is both the initial and the terminal object of Grp. Mathematicians call this (somewhat confusingly) a zero object (in this case, the zero group).
Initial and terminal objects nicely illustrate the notion of a co-concept in opposing categories: in Setᵒᵖ, the empty set is terminal, while any singleton is initial. In contrast, a zero object like the trivial group is not affected by turning Grp into its dual Grpᵒᵖ. So the trick of exploiting duality as a mathematician is to first define precisely which concepts correspond to co-concepts and how. Once you’ve established this, you can apply any insights into a category straightforwardly to its dual or opposite.
This brings us to the rather complicated relation of a morphism to its co-morphism. We’ve already seen that some morphisms are invertible, but (probably not surprisingly) there is more to this concept than first meets the eye. Invertibility basically means that, given some morphism f, there is another morphism (which mathematicians usually denote by f⁻¹, but which we can more generally just call g) such that applying g after f takes you back exactly to where you started (formally: g ○ f = 1_A, if A is the domain of f). If this is the case, we call f a monomorphism (or mono, for short). Sometimes, f is also called a right inverse (because f is on the right of g in the formula above, and its arrow points to the right, from A to B). If f is a mapping (an arrow in Set), this corresponds to an injection, which maps each element of its domain to a different unique value (it is one-to-one). This is why it is possible to move back (left) to the original domain via g.
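In Set, this construction can be carried out by hand. Here is a minimal sketch with a hypothetical finite injection of our own choosing:

```python
# A hypothetical injection f from A = {0, 1, 2} into B = {"a", "b", "c", "d"}.
f = {0: "a", 1: "c", 2: "d"}
assert len(set(f.values())) == len(f)     # one-to-one: no value repeats

# Invert the ordered pairs to get g; the leftover element "b" of B
# can be sent anywhere (here: to 0).
g = {value: key for key, value in f.items()}
g["b"] = 0

# g after f is the identity on A: g is a left inverse of f,
# which makes f a (split) monomorphism.
assert all(g[f[a]] == a for a in f)
```

The arbitrariness in where g sends the leftover element "b" already hints at why the reverse composition f ○ g need not be an identity.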
Now, seasoned category theorists that we are, we will ask: what is the dual of a monomorphism? Concretely, what happens if we reverse the order of f and g, applying f after g (f ○ g) instead of the other way around? And here lies danger, because this reverse operation is not necessarily invertible! Take, for example, an injective mapping such as the exponential function f(x) = eˣ, defined on all the real numbers.
It is one-to-one: every real number is mapped to a single unique value, because the values are monotonically increasing. But all these values are positive! This is the reason why the inverse mapping, the natural logarithm (g), is only defined on positive real numbers. Thus, if we start with the logarithm g and the entire range of real numbers, we will not get an inversion, because the logarithm does not yield any values for the negative real numbers.
In this case, f is called the left inverse, because it is on the left of the formula above, and we are starting from the original range B leftwards back to the domain A of f. And: inversion in this reverse order only works if f (considered as a mapping in Set) is a surjection, meaning that it maps onto its entire range (which the exponential above so obviously failed to do). The categorical generalisation of such a mapping is an epimorphism (or epi, for short). It is the dual co-concept of a monomorphism. In other words, any monomorphism in category C turns into an epimorphism in its opposite Cᵒᵖ.
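We can check both directions of this example numerically (a sketch using Python's standard math library):

```python
import math

# g after f: the logarithm undoes the exponential on all of R,
# because exp is one-to-one (g o f is the identity on the domain).
for x in [-2.0, 0.0, 3.5]:
    assert abs(math.log(math.exp(x)) - x) < 1e-9

# f after g: this only works on the positive reals, because exp
# never reaches the negative numbers -- it is not onto all of R.
assert abs(math.exp(math.log(3.5)) - 3.5) < 1e-9
try:
    math.log(-2.0)                 # undefined: no real value exists
    reached_negative = True
except ValueError:
    reached_negative = False
assert not reached_negative
```

The ValueError is exactly the failure of surjectivity described above: there is simply no value of the logarithm at the negative reals.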
In those special cases when f is both a left and a right inverse, both mono and epi, we call it an isomorphism (see chapter 11 and this appendix). It can be inverted in any direction. This corresponds to a bijective mapping in Set, which is both one-to-one (injective) and onto (surjective). When two objects A and B are connected by such an arrow, they are isomorphic to each other. Isomorphisms are central to category theory, because they allow us to develop notions of equality and equivalence which are relational and context-dependent. We say that A and B are equal up to isomorphism (see also chapter 11), meaning they are equivalent in specified aspects (but, perhaps, not in others).
Note the basic difference between set and category theory here: according to the axiom of extension, two sets are equal if and only if they contain the exact same elements within them, i.e., if they have the same intrinsic content and properties. In contrast, two categorical objects are considered the same if they are isomorphic to each other, i.e., if they stand in a certain relation to each other. Here, equivalence is relative to an extrinsic context: isomorphic objects have identical connections to other objects in a category (which specifies the context), and their duals are isomorphic to each other too. Under the given circumstances and for given aspects, they are indistinguishable, but in another context, they may not be. Their identity is intimately tied to how they relate to themselves and to the rest of the world.
The same kind of logic applies to relations between categories: as explained in chapter 12, functors represent mappings from one category to another that preserve the mathematical structure of the connected categories. They are the arrows of Cat, the category of all small categories (see above).
Specifically, a covariant functor F: C 🠂 D consists of two distinct maps: an object mapping, which assigns to object A in category C an object FA in category D, and
an arrow mapping, which assigns to arrow f: A 🠂 B in category C an arrow Ff: FA 🠂 FB in category D.
The dual contravariant functor F- has the same object mapping, but its arrow mapping assigns to arrow f: A 🠂 B in category C an arrow F-f: FB 🠂 FA in category D, i.e., it reverses the direction of each arrow it is mapping.
This should sound familiar by now: evidently, a contravariant functor F- is the same as a covariant functor from Cᵒᵖ to D, or (equivalently) from C to Dᵒᵖ. For this reason, mathematicians prefer to deal with the simpler covariant functors, unless contravariance is indispensable for their argument.
Importantly, functors not only preserve objects and arrows (including identities) but also the associative composition of arrows. Also, they map isomorphisms in one category to isomorphisms in the other.
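These preservation laws are easy to verify for a concrete example, such as the familiar list functor, which sends an object X to "lists of X" and an arrow f to "apply f elementwise". This is our own illustration; the name fmap is borrowed from functional programming, not from the text:

```python
# The list functor: an object X goes to "lists of X", and an
# arrow f goes to "apply f to every element".
def fmap(f):
    return lambda xs: [f(x) for x in xs]

f = lambda x: x + 1                 # two composable arrows in Set
g = lambda x: x * 2
g_after_f = lambda x: g(f(x))

xs = [1, 2, 3]
# Composition is preserved: F(g o f) = Fg o Ff ...
assert fmap(g_after_f)(xs) == fmap(g)(fmap(f)(xs)) == [4, 6, 8]
# ...and so are identities.
assert fmap(lambda x: x)(xs) == xs
```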
Every category has an identity functor (just like its objects have identity morphisms), I: C 🠂 C, which maps every object and arrow of the category to itself. And just like morphisms, functors can be composed in an associative manner, which simply involves composing the object and arrow maps separately.
The identity functor is a trivial example of a functor that is itself an isomorphism, which means that both its object and arrow maps are bijective. But (yet again) there are more subtle distinctions to be made: if we look at each hom-set C(A,B) separately, we can define a collection of mappings F{A,B} : C(A,B) 🠂 D(FA,FB). When each of the mappings in this collection is surjective, then F is a full functor (arrows from A to B in C cover the full complement of arrows between FA and FB in D). If each mapping is injective, F is faithful (it maps arrows in C one-to-one to arrows in D). If a functor is both full and faithful, then all its F{A,B} are isomorphisms, but this does not mean that the functor overall is an isomorphism, as there may be objects in D that have no equivalent in C (i.e., the object mapping itself is not surjective, and neither is the arrow mapping as a whole).
As an example, consider the inclusion functor I: C 🠂 D of a subcategory C in a (larger) category D: this functor is very similar to the identity functor (see above), except that it maps C into D instead of onto itself (I: C 🠂 C). It is always faithful, and it is full if and only if C is a full subcategory of D, i.e., if C contains all the arrows of D between objects that also belong to C (see also above).
Other commonly used examples of functors include the dual functor Iᵒᵖ: C 🠂 Cᵒᵖ, which maps a category C to its opposite Cᵒᵖ, the power set functor P: Set 🠂 Set, which assigns to each set its set of subsets (and to each mapping from a set, a collection of corresponding mappings from its subsets), and hom-functors, which are used to assign hom-sets to (pairs of) objects in categories.
Remember that categories such as Mon and Grp (and Vct for that matter), consist of objects that can be interpreted in terms of underlying sets, plus mappings with additional mathematical structure (e.g., the arithmetic operations modelled by monoids and groups, or linear transformations between vector spaces; see above). Such categories can be connected to Set simply by discarding the additional structure. Functors that do this are called forgetful functors. An interesting example is the forgetful functor for Cat: it turns a category into its underlying directed graph, which category theorists call a quiver (and the category of all small quivers Quiv), by eliminating identities, the composition operation, and the closure and uniqueness conditions we used to define a category in the first place.
The opposite of a forgetful functor is free: we’ve encountered this powerful idea before, in chapter 10, when we discussed how we can generate words from the symbols of an alphabet by concatenating them freely. We can now recognise that the resulting object (the collection of all possible words generated by an alphabet) is nothing other than a free monoid — with the alphabet as the underlying set, and concatenation as its operation. Mathematicians call this object “free” because it requires nothing other than the basic defining conditions of its category to be generated. In a similar manner, we can construct free groups from underlying sets, and free vector spaces from underlying bases (which are also kinds of sets), and even free categories from underlying quivers. Think about it: forgetful and free functors not only connect categories of very general algebraic structures to each other, but allow us to construct such categories with very little effort. Such is the power of category theory!
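The free monoid described above can be sketched directly: its elements are all words (here, Python strings over the alphabet), its operation is concatenation, and its identity is the empty word. A minimal illustration of our own:

```python
# The free monoid on the alphabet {"a", "b"}: elements are words,
# the operation is concatenation, the identity is the empty word.
identity = ""
op = lambda u, v: u + v

u, v, w = "ab", "ba", "abb"
assert op(op(u, v), w) == op(u, op(v, w)) == "abbaabb"   # associative
assert op(identity, u) == u == op(u, identity)           # identity laws
```

Nothing beyond the monoid axioms themselves was needed to build it — which is exactly what "free" means.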
Just like there can be many distinct morphisms between the same objects A and B, there can be many distinct functors between the same categories C and D. Remember: these morphisms can be treated as the elements of a (hom-)set, which set-theorists denote by Bᴬ (see the previous and next sections). Analogously, we can define a category of all functors between two categories and call it Dᶜ. This is a functor category, which has functors as its objects. But what are its morphisms then? They, in fact, are maps between functors that are called natural transformations. These are the last topic we cover in this section. And it’s an important one: the concept of naturality is central to category theory and this book.
As we have just seen, a natural transformation α maps one functor to another. But what does this mean, exactly? Let’s take two functors F, G: C 🠂 D in category Dᶜ. Then, α: F 🠂 G is defined as follows:
for each object A in category C, there is a morphism in D denoted by αA that maps object FA to object GA, i.e., it transforms the value of the object mapping of F into that of G, and
for each morphism f: A 🠂 B in category C, the following diagram commutes (i.e., we end up at the same spot, no matter which arrows we choose to follow): αB ○ Ff = Gf ○ αA,
or, in words: each morphism mapped by F (Ff) is transformed into a morphism mapped by G (Gf), because a natural transformation acts at object A in a way that is consistent with how it acts at any other object B (αA and αB are the components of the transformation at A and B).
This is the kind of regularity that the concept of naturality captures, and we have now defined it in a mathematically precise manner: there must be consistency in the way one relation is transformed into another, across its domain of application (i.e., across all the components of the transformation).
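A classic concrete instance (our own choice of example, not from the text): reversing a list is a natural transformation from the list functor to itself, because the very same reversal recipe works consistently at every object:

```python
# reverse is a natural transformation from the list functor to
# itself: the same recipe (its "component") applies at every object,
# and the naturality square commutes for any arrow f.
def fmap(f):
    return lambda xs: [f(x) for x in xs]

def reverse(xs):          # the component alpha_X at every object X
    return xs[::-1]

f = lambda x: x * 10      # an arbitrary arrow (our own choice)
xs = [1, 2, 3]

# Both paths around the square give the same result:
assert reverse(fmap(f)(xs)) == fmap(f)(reverse(xs)) == [30, 20, 10]
```

Mapping first and then reversing, or reversing first and then mapping: the outcome is identical, whatever f we pick. That consistency is naturality.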
A natural transformation whose components are all isomorphisms is called a natural isomorphism. It is an isomorphism in the functor category Dᶜ, and a powerful tool to show that different mathematical operations or constructs are equivalent. We’ll use a quite abstract (and admittedly somewhat complicated) example to illustrate this. But bear with us, as it will come in handy below and in the last two parts of the book. It has to do with how we apply mappings (and in which order), especially in the case of mappings with more than one variable, such as h: X × Y 🠂 Z, which maps X and Y into Z. We can also write h(x,y) = z.
In order to understand how to calculate such a mapping, we must know the following: strictly speaking, when we write down any mathematical function like f(x) = y, we are lumping together two distinct operations (which are more clearly separated in computer programming than in mathematical formalism): (1) the definition or declaration of a specific relationship between variables x and y (is it linear, is it exponential, and so on), and (2) the actual evaluation of the resulting map, i.e., its application to a specific x (or to a whole domain X). We write e(f,x) = f(x) for the second step — with the e standing for evaluation mapping — when we want to make this distinction explicit. In the notation we’ve introduced in the previous section, we can write e: Yˣ × X 🠂 Y, meaning that we pick a particular map f from the whole set of possible mappings Yˣ and apply it to the domain X to yield a range of values in Y.
This all sounds like we are just unnecessarily complicating a simple situation. But we can now apply this kind of thinking to our original multivariate mapping h(x,y), where it is actually really helpful. Evidently, we can consider h (with two variables in X and Y) equally well as a mapping f: X 🠂 Zʸ (with only one variable in X). But this f is a bit of a weird map: it doesn’t yield a numerical output, but a mapping from Y to Z, as its value. We’ll call this output function g(y) = z. All we have to do now to get back to the original h(x,y) is to evaluate g, i.e., e(g,y) = g(y). But why is this helpful? Well, first of all, it is practical when we actually have to calculate the output of a multivariate mapping. Dissecting a multivariate map into a sequence of maps with only one variable each, which is easier to compute, is called currying in functional programming. And, second, it allows us to define the evaluation of any multivariate map as a natural transformation, which means that to the category theorist the following hom-sets are isomorphic: Set(X × Y, Z) ≅ Set(X, Zʸ)
(where ≅ is the symbol for “is isomorphic to”). Expressed in words, this simply means what we just said: that we can turn any multivariate map into a sequence of maps with one variable each, in a systematic and regular manner, no matter what kind of mapping it is that we are evaluating. Pretty neat.
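Currying can be sketched in a few lines (a minimal illustration; the particular h is our own arbitrary choice):

```python
# A two-variable mapping h: X x Y -> Z (an arbitrary choice)...
h = lambda x, y: x ** 2 + y

# ...curried into f: X -> Z^Y, whose value at x is itself a mapping g.
def curry(h):
    return lambda x: (lambda y: h(x, y))

# The evaluation mapping e(g, y) = g(y) applies the intermediate map.
e = lambda g, y: g(y)

g = curry(h)(3)           # g(y) = h(3, y), a one-variable map
assert e(g, 4) == h(3, 4) == 13
```

Whatever h we start from, the curried form evaluates to the same values — the systematic, h-independent regularity that makes evaluation a natural transformation.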
To wrap up this section, and to summarise it, let us return to our more concrete and familiar example of the modelling relation, and the general principle with which we started: while set theory is based on objects that are packed into (immaterial) bags, category theory deals primarily with the relations that give us the (contextual and transient) existence of objects in the first place. And this makes it an ideal mathematical theory for the modelling process itself (see chapter 11): we can think of morphisms as the relations within a category that structurally and dynamically define a specific system, of functors as relations between systems (especially between natural systems and their formal models), and of natural transformations as systematically and consistently relating one model of a natural system to another. Quite a powerful toolkit for our formal model of science as a skillful modelling practice!
Universality
While naturality is all about relating different models of the same natural system to each other, there is another central concept in category theory that helps us understand why we can often apply the same model to quite different natural systems — across scales and contexts. This is universality. The term is a bit misleading, really, because philosophically we tend to interpret “universality” as “something that applies everywhere, all the time.” Being universal is simply the opposite of being relative (to context).
In contrast, a universal property in the category-theoretical sense is captured by a mathematical construction that can only be achieved in some particular manner. It cannot be done any other way. Or, in more technical terms: the construction is unique up to isomorphism across various specified contexts. And category theory enables you to prove this. If this sounds a bit abstract, don’t worry: we will show how it works with a number of concrete examples.
Incidentally, physicists use “universality” in an analogous way to describe similar or identical behaviours in systems that are based on (sometimes very) different physical substrates. This is important, for example, when we think about the physical implementation of computation (in laptops vs. waterfalls, let’s say) or, later in the book, of agency (in organisms, but not in machines). The question here is: why do certain (but not all) substrates enable a specific type of behavioural capacity?
How do we define such universal properties or behaviours mathematically? We have to admit: the definition is not easy to understand. That’s why we will start with an easy example: the construction that mathematicians call a product. We have encountered it before, as the Cartesian product of sets X × Y (see this appendix), or the product category A × B (in the first section of this appendix above). Other kinds of products show up all over the place, such as good old algebraic multiplication, the various products of linear algebra, direct products of groups, topological product spaces, or tensor products of vector spaces, just to name a few examples that are frequently used by mathematicians. Each of these mathematical constructions differs in the details of their definitions, and each has a distinct use and domain of application.
The trick is to identify what is common to all of them, and then to show that the commonality is due to the fact that a product cannot be formed in any other way. What is the reason that makes each of these constructions a kind of product? In this case, it is pretty straightforward: we recognise intuitively that a product is a mathematical structure which can be factorised. In other words, it can be cleanly partitioned into components. And this, in essence, is the universal property of a (mathematical) product.
For a more formal definition, let’s have a look at the following diagram
where the object X × Y stands for any kind of mathematical product. It is characterised by two canonical mappings p: X × Y 🠂 X and q: X × Y 🠂 Y, which are called the natural projections of the product onto its components X and Y (think of the actual visual projection of a Cartesian coordinate system onto its x- and y-axes). And, yes, they are natural exactly in the sense we’ve discussed in the previous section.
These projections p and q are universal morphisms. They define the universal property of the product. And why they are universal is evident from the diagram or, more precisely, from the conditions that render it commutative. It represents the operation of factorisation by a pair of arrows f and g from some object W to its components X and Y. The diagram only commutes if this pair ⟨f, g⟩ factors through the product: there is a unique arrow h going from W to X × Y such that following h with the canonical projections p and q recovers f and g. In other words: W can be factorised only if it can be expressed in terms of the product! It has to go through the actual factorisation step h and projections p, q. Or, formally: ⟨f, g⟩ = ⟨p∘h, q∘h⟩. There is no other way. And this is what makes a mathematical product equal (up to isomorphism) to any other product.
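To make this factorisation tangible, here is a small sketch in Python (our own illustration, not notation from the text), modelling arrows as dictionaries (finite maps) and the product X × Y as the set of ordered pairs; the helper names pairing, p, and q are assumptions of this sketch:

```python
# A minimal sketch of the universal property of the binary product in Set.
# Arrows are modelled as Python dicts (finite maps), and X × Y as pairs.

def pairing(f, g):
    """Given f: W -> X and g: W -> Y, return the unique mediating arrow
    h: W -> X × Y with f = p ∘ h and g = q ∘ h (the factorisation step)."""
    return {w: (f[w], g[w]) for w in f}

def p(pair):   # canonical projection onto the first component, X
    return pair[0]

def q(pair):   # canonical projection onto the second component, Y
    return pair[1]

# An arbitrary "cone": an object W with arrows f into X and g into Y.
W = ["a", "b"]
f = {"a": 1, "b": 2}        # f: W -> X
g = {"a": "u", "b": "v"}    # g: W -> Y

h = pairing(f, g)           # the unique mediating arrow h: W -> X × Y

assert all(p(h[w]) == f[w] for w in W)   # the diagram commutes: p ∘ h = f
assert all(q(h[w]) == g[w] for w in W)   # ... and q ∘ h = g
```

Any cone ⟨f, g⟩ whatsoever is reproduced this way, and pairing is the only function that does it, which is exactly the uniqueness the diagram asserts.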
Alternatively, we can visually interpret the diagram as a combination of cones. Can you see them? The first one (on top) has W as its vertex and ⟨f, g⟩ as its sides; the other one (flattened, at the bottom of the diagram) has X × Y as its vertex and ⟨p, q⟩ as its sides. We can now form a new category (called a comma category, for those who care about such things), which has cones as objects, and connecting arrows (such as h) as morphisms. What is important to realise is that the universal cone defined by X × Y and its projections ⟨p, q⟩ is a terminal object in this category: any other possible cone of arbitrary W and ⟨f, g⟩ has a unique arrow h that points to it. From this, we can immediately conclude that terminal objects (and initial objects, as we shall see) are also a kind of universal construction.
Now, we can go a step further on our path to abstraction and think about what all universal properties have in common. This will lead us to their most abstract and general definition. For this purpose, let us next consider products with not only two, but any arbitrary (and even infinite) number of factors. Mathematicians write this as A = ∏Aᵢ (“A is the product of factors Aᵢ,” with i the elements of some index set I). This looks complicated, but it is just the same as writing A = A₁ × A₂ × ⋯ × Aₙ for any finite n. Such a product is defined by its universal morphisms — the canonical projections πᵢ (where, for each j ∊ I, πⱼ: A 🠂 Aⱼ). We can now draw a generalised diagram
which defines a product of any number of factors as a universal property. Or almost so: one ingredient is still missing, or rather is implicitly hidden in the diagram above. It is the diagonal functor Δ, which allows us to construct, out of a discrete category I of indices i (representing index set I) and a given object A in category C, an indexed family of factors Aᵢ, which is a member of functor category Cᴵ. Again, this looks complicated, but really only describes (in formal terms) how we construct the families of components that the product gets factorised into — the Aⱼ in the diagram above.
If we make this functorial relation explicit, we get the following general definition of a universal property
(yes, we warned you: this is not easy to understand). Here, we have some functor F from category C to category D. A universal morphism from this functor F to some object V in D is then defined as the ordered pair 〈X, φ〉, where X is an object in C and φ a morphism in D. In the example of the product, F is the diagonal functor Δ: C 🠂 Cᴵ, V corresponds to the set underlying the indexed family of components Aᵢ, X is equal to object ∏Aᵢ, and φ corresponds to the collection of projections πᵢ.
In effect, all this complicated diagram is saying is that, if some universal property exists, then any morphism f from an arbitrary object FY to V can be partitioned into Fh (with corresponding h in C) followed by the application of φ in D. Put simply: if you want to get to object V from Y, you have to go via X and φ, even if that may not be immediately obvious. This is what makes 〈X, φ〉 universal: it makes the construction of V both unique and general at the same time. You can only get there in one specific way, but it works the same way across many different contexts. When it comes to components and projections, all products are the same, even if they differ in other (non-universal) aspects.
Now we’re sure you’ll be thrilled to learn that, like everything else in category theory, there is a dual to the universal property we just defined. But don’t worry: from here on out, it’s smooth sailing. We only have to reverse the direction of all the arrows in the above diagram
which means we are now going from object V in D to some arbitrary object Y in C, and we have to go via universal morphism 〈X, φ〉 to do this. In other words: morphism f can be partitioned into φ, and the subsequent application of Fh (with corresponding h in C), just like above but in reverse.
If we apply this to the example of the product, we get the general definition of a co-product
where the Aⱼ are an indexed family of components or summands, and the ɩⱼ are the canonical inclusion mappings we call natural injections. Or, for the special case where we have only two summands, we get
In Set and Top, the co-product corresponds to the disjoint union of sets and spaces, respectively. In Grp, it is (somewhat confusingly) named the free product of groups. In Vct, it is the direct sum of vector spaces.
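In Python terms, the disjoint union and its universal property can be sketched by tagging each element with its origin; the helper names inl, inr, and copair below are our own assumptions, not standard notation from the text:

```python
# A sketch of the co-product in Set as a tagged disjoint union, with the
# natural injections and the unique "co-pairing" arrow out of X ⨿ Y.

def inl(x):   # natural injection of X into X ⨿ Y
    return ("left", x)

def inr(y):   # natural injection of Y into X ⨿ Y
    return ("right", y)

def copair(f, g):
    """Given f: X -> W and g: Y -> W, return the unique h: X ⨿ Y -> W
    with h ∘ inl = f and h ∘ inr = g (case analysis on the tag)."""
    def h(tagged):
        tag, value = tagged
        return f(value) if tag == "left" else g(value)
    return h

f = lambda x: x + 1            # f: X -> W  (X a set of integers)
g = lambda s: len(s)           # g: Y -> W  (Y a set of strings)
h = copair(f, g)               # the unique mediating arrow

assert h(inl(41)) == f(41) == 42        # h ∘ inl = f
assert h(inr("abc")) == g("abc") == 3   # h ∘ inr = g
```

Note how the arrows run dually to the product sketch: the injections go into the co-product, and the unique arrow h comes out of it.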
It will come as no surprise that co-products are initial objects in the comma category that has (upside-down) cones as objects, and connecting arrows (like h) as morphisms, since there is a unique arrow coming out of the cone with X ⨿ Y as vertex (and injections i and j as sides) to any other cone. This, of course, implies that initial objects are universal constructions as well.
Universal properties are so common in mathematics that it is almost not worth listing further examples here. But a few more illustrations may be helpful. For instance, the number 0 stands for the universal identity element of any additive operation, while 1 is the universal identity of any multiplication. Similarly, -1 is the universal inverse for addition. In contrast, however, the i of complex numbers is not universal. It is defined as the square root of -1 (formally: i² = -1), which is not unique, because it can be defined in two different ways. This is because -i also fits the bill. Therefore, it is arbitrary whether we choose i or -i as the unit for the imaginary part of a complex number. It is a matter of mere convention.
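The point about i can be checked with a line or two of Python arithmetic:

```python
# Both i and -i satisfy the defining equation z² = -1, so the choice
# between them is a matter of convention, as described in the text.
i = complex(0, 1)

assert i ** 2 == -1
assert (-i) ** 2 == -1       # -i fits the bill just as well

# By contrast, the additive and multiplicative identities are unique:
assert all(0 + z == z and 1 * z == z for z in [2, -3.5, i])
```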
A powerful example of universal constructs are the free monoids, groups, vector spaces, and categories that we introduced in the previous section. Take the construction of a free category from its underlying graph or quiver: it results from concatenating the arrows from the quiver that connect to each other in a unique manner, which category theorists call “taking the product over the collection of objects.” We can easily show that this corresponds to a universal construction called a pullback, which is closely related to a product, but based on slightly more complicated cones. The dual of a pullback is called a pushout.
We hope you are still with us! If you are, you have reached an important milestone: we now have the tools to understand one of the most powerful and sophisticated universal constructions for our mathematical understanding of modelling practice — adjunction. We have used it in chapter 11 to formalise and precisely define the general idea of a model as an inferential blueprint. Basically, models are not necessarily direct and faithful representations of reality, but conceptual tools that allow us to draw relevant and robust inferences about their congruent target — even if there is no straightforward one-to-one correspondence between the natural and the formal system.
Like a product or a co-product, an adjunction is a specific kind of universal mathematical construction that applies across many contexts, but must be generated in a unique and specific manner. Actually, we’ve already encountered an adjunction in the previous section, when we discussed the evaluation of a multivariate mapping. Remember this isomorphism between hom-sets?
It says that evaluating a multivariate mapping h(x,y) = z (an element of the hom-set on the left-hand side of the equation) produces the same result as a two-step evaluation of each variable separately (an element of the hom-set on the right). The latter yields a univariate mapping (not a number) as output of its first step, e(x, f) = g, which, upon further evaluation, yields the final result of the original mapping: e(g, y) = z. Either way, we get the same result in the end.
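For the programmers among our readers: this isomorphism is exactly what functional programmers call currying. Here is a minimal Python sketch (curry, uncurry, and ev are our own names for the two directions of the isomorphism and the evaluation step):

```python
# The hom-set isomorphism Set(X × Y, Z) ≅ Set(X, Zʸ), a.k.a. currying.

def curry(h):
    """Turn h: X × Y -> Z into f: X -> Zʸ (a map into a set of maps)."""
    return lambda x: (lambda y: h(x, y))

def uncurry(f):
    """The inverse direction of the isomorphism."""
    return lambda x, y: f(x)(y)

def ev(g, y):
    """Evaluation: apply the univariate map g to y, i.e. e(g, y) = z."""
    return g(y)

h = lambda x, y: 10 * x + y    # a multivariate map h(x, y) = z
f = curry(h)                   # f: X -> Zʸ
g = f(3)                       # first step: fix x = 3, obtain g: Y -> Z

assert ev(g, 4) == h(3, 4) == 34            # both routes yield the same z
assert uncurry(curry(h))(3, 4) == h(3, 4)   # the round trip is the identity
```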
As we have also learnt in the previous section, an isomorphism like the one shown above connects two hom-sets through a pair of functors, in this case two functors from category Set to Set: FX = X × Y, and GZ = Zʸ. Substituting these functors into the equation gives us the general definition of an adjunction
where F is the left adjoint (mnemonic: it points to the left) and G the right adjoint (pointing to the right). This relation constitutes a natural isomorphism. But note that F and G are not (generally) invertible! Their connection is an asymmetric one, which we denote by F ⊣ G. Still, they uniquely determine each other through the isomorphism, and we can specify any adjunction as an ordered triple 〈F,G,φ〉. Thus, for instance, we can define Rosen’s modelling relation (from chapter 11) as an adjunction with F the decoding functor δ (left adjoint), G the encoding functor ε (right adjoint), and φ the isomorphic natural transformation from one to the other that provides the criterion for model congruence, i.e., whether we can draw useful inferences from a formal model, or not. The asymmetry here is that we encode a model with a specific purpose in mind. This is reflected in the notation δ ⊣ ε. While the encoding prescribes how we decode the model, the way we choose to encode it is initially motivated by the kind of decoding (inferences and predictions) we expect to obtain from it.
By now you are probably wondering what all of this means. As we have mentioned, the underlying abstract pattern is universal (there is no other way to construct an adjunction), but it is not at all obvious. Let’s take an example from mathematics to bring it out: vector spaces V (also called linear spaces) are collections of vectors (mathematical entities with a certain length and direction) that can be added together and multiplied (basically, stretched and squeezed to change their length) using scalars that are drawn from a field K (usually the real or complex numbers, with their own addition and multiplication operations). Each vector space is constructed from a basis X, which is a set of independent vectors from which the entire space can be constructed through linear combination (i.e., adding together variously scaled basis vectors). The number of vectors in its basis determines the dimension of the vector space. Think, for example, about physical forces that operate in a two- or three-dimensional space.
Vector spaces are related to each other through linear transformations (also called linear maps) which preserve the operations of vector addition and scalar multiplication. Visually, you can imagine such an operation as stretching, squeezing, shearing, and/or rotating the entire vector space in a uniform manner. Sometimes, linear transformations also change the dimensionality of the space, for example, when we project a three-dimensional force field on a (flat) two-dimensional surface.
Vector spaces V over a field K form the objects, and linear transformations between them are the arrows of a category called VctK. In contrast, the basis of a vector space and the linear combinations of its basis vectors exist in the category Set (as they are simple sets of vectors). VctK and Set are connected through a pair of functors, one of them forgetting the addition and scalar multiplication operations to reduce a vector space to its underlying set of (linearly combined) vectors, and a free functor (which we’ve already encountered) constructing a vector space automatically, or “for free,” from its set of basis vectors. Let us call the free functor F and the forgetful one G. These two are connected through the familiar natural isomorphism of an adjunction:
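A toy Python sketch may help here. If we model elements of the free vector space F(X) as dictionaries of coefficients over the basis X (an assumption of this illustration, not the text’s notation), the hom-set bijection of the adjunction becomes the familiar fact that a plain function on basis vectors extends uniquely to a linear map:

```python
# A sketch of the free ⊣ forgetful adjunction for vector spaces.
# F(X) is modelled as formal linear combinations over a basis set X
# (dicts from basis elements to float coefficients); for simplicity,
# the target space V is taken to be the one-dimensional space of floats.

def free(x):
    """Embed a basis element x into F(X) as the combination 1·x."""
    return {x: 1.0}

def linear_extension(phi):
    """Given a plain function phi: X -> V, return its unique linear
    extension F(X) -> V (the other side of the hom-set bijection)."""
    return lambda combo: sum(coeff * phi(x) for x, coeff in combo.items())

phi = lambda x: {"e1": 2.0, "e2": -1.0}[x]   # an arbitrary map X -> G(V)
lin = linear_extension(phi)                  # the matching map F(X) -> V

combo = {"e1": 3.0, "e2": 4.0}               # the combination 3·e1 + 4·e2
assert lin(combo) == 3.0 * 2.0 + 4.0 * (-1.0) == 2.0
assert lin(free("e1")) == phi("e1")          # the extension agrees on X
```

The asymmetry of the adjunction is visible here: linear_extension manufactures structure by a fixed recipe, while the forgetful direction simply ignores it.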
This tells us that the operation of linearly transforming a vector space constructed from a basis (through the free functor F) into another vector space in VctK (formally: F(X) 🠂 V) is isomorphic to a corresponding linear-combination operation on the basis in Set (formally: X 🠂 G(V)). Somewhat confusingly, mathematicians call this a representation of vector-based linear transformations in the category Set. But there isn’t a simple mirroring between the two, as the term “representation” would suggest! The two categories are not isomorphic or equivalent. Nor are forgetful and free functors symmetric: one loses information, while the other regains it in a specific, formulaic way. The relation is skewed, like any modelling relation is, with the way we perform linear combinations of vectors constraining (in fact, entailing) the free construction of a vector space.
With a bit more maths (which we won’t go into here) we can now show that an isomorphism or equivalence between categories (as described in chapter 11) is really just an exceptional case of an adjunction, where the relation between the left and right adjoint is symmetrical and, in fact, directly invertible. Therefore, the concept of adjunction generalises the idea of a model “representing” its target far beyond the mirroring or faithfully depicting of specific features. This is why it is such a powerful tool for studying the practice of modelling, both within mathematics and in the natural sciences. Mathematicians can use it to model structures that would otherwise be hard to handle in the simple and tractable category Set. Natural scientists, in turn, can get away with oversimplifying and idealising their models and still be able to draw useful inferences or make accurate predictions — under the appropriate conditions. Adjunction tells us, through its naturality and universality, why (and when) this is the case, and when it breaks down. We’ll discuss several concrete examples of this in the latter two parts of the book.
We’ll cover only one more use of adjunction here, which leads to the definition of a very special class of categories. Not only are products and co-products universal constructions, but we can express them in terms of adjunctions as well. Remember the diagonal functor Δ from above? It sends each object of C to a pair of identical copies of itself in the product category (or, formally: Δ: C 🠂 C × C). This functor has a right adjoint if and only if C has binary products, and we get the natural isomorphism
which basically is just another mathematical way to express the first diagram of product cones above. Dually, the diagonal functor has a left adjoint, if and only if C has binary co-products
which, again, is just another way to express the diagram with the inverted cones that we used to define the binary co-product earlier.
The question is, of course: why should we care about expressing universal constructions like products and co-products as adjunctions? Well, there is a certain elegance in how the diagonal functor connects product and co-product in an asymmetric way. We shall have more to say about the relation between these constructions later on. But even more importantly: if the diagonal functor is the left adjoint of the product, we can ask whether it also has a right adjoint. This is where things get interesting, because we’ve encountered this right adjoint of the product before, in the specific case of Set. It is implicit in
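The relevant natural isomorphism is the following (a standard formulation; in Set it is the familiar bijection between functions of two arguments and function-valued functions of one argument):

```latex
% The exponential Z^Y as right adjoint of the product functor (- x Y):
% an arrow from X x Y into Z corresponds to exactly one arrow
% from X into the exponential object Z^Y.
\mathbf{C}\,\bigl(X \times Y,\ Z\bigr)
\;\cong\;
\mathbf{C}\,\bigl(X,\ Z^{Y}\bigr)
```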
which allows us to interpret Zʸ in a powerful new way: if a category has binary products, then this object corresponds to the exponential object of the category. Therefore, if C = Set, the exponential object Zʸ is simply the hom-set Set(Y,Z), i.e., the set of all mappings from Y to Z. Similar relations exist in other categories. In Cat, for example, the exponential object is a functor category (e.g., the category Dᶜ we’ve already come across in the previous section). But this need not be the case in general: exponential objects can also exist in categories where they do not correspond to such specific mapping relations.
Categories in which such special relations for the exponential exist have very convenient properties for the mathematician. They are called Cartesian closed categories. To state the conditions more explicitly and in full, such a category must have (1) a terminal object, (2) binary products for all its objects, and (3) exponential objects with the property of expressing an evaluative mapping as described for Set and Cat above. Cartesian closed categories are important to us here, in particular, because they help us understand formalisms that operate on their own operators (such as the rewrite systems we describe in this appendix). We will use this concept later on to explore the limitations of such formalisms when it comes to capturing the strangely circular and hierarchical self-manufacturing dynamics of living systems.
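In Set, the bijection behind the exponential adjunction is what programmers know as currying. A minimal sketch in Python (the names `curry`, `uncurry`, and `ev` are our own illustrative choices, not library functions), with functions standing in for morphisms:

```python
def curry(f):
    """Transpose f: X x Y -> Z into a map X -> (Y -> Z), i.e. X -> Z^Y."""
    return lambda x: lambda y: f(x, y)

def uncurry(g):
    """The inverse direction: turn g: X -> (Y -> Z) back into X x Y -> Z."""
    return lambda x, y: g(x)(y)

def ev(g_y, y):
    """The evaluation mapping ev: Z^Y x Y -> Z (the counit of the adjunction)."""
    return g_y(y)

# The two directions are mutually inverse, as the adjunction demands:
add = lambda x, y: x + y
add_curried = curry(add)
assert add_curried(2)(3) == add(2, 3) == 5
assert uncurry(curry(add))(2, 3) == add(2, 3)
assert ev(add_curried(2), 3) == 5
```

The point of the sketch is only that a two-argument function and its curried form carry exactly the same information, which is what the natural isomorphism for the exponential asserts.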
If you made it this far: congratulations! We are impressed. You now have all the category theory concepts (and notations) you need to gain a really deep and precise understanding of the argument in the rest of this book. If you have had your share of formalism, then don’t hesitate to return to the main text at this point. If, however, you’d like to know how we can use category theory (instead of set theory) to provide a foundation for the rest of mathematics, then read on and dive into the last section of the appendix, where we’ll reveal the true abstract power and range of the category! Believe us, it’s awesome.
Topos
[coming soon!]
The authors acknowledge funding from the John Templeton Foundation (Project ID: 62581), and would like to thank the co-leader of the project, Prof. Tarja Knuuttila, and the Department of Philosophy at the University of Vienna for hosting the project of which this book is a central part.
Disclaimer: everything we write and present here is our own responsibility. All mistakes are ours, and not the funders’ or our hosts’ and collaborators'.