A Pattern Language Graph
Published 3/13/21
A Pattern Language by Christopher Alexander et al. is a very unconventional book, not just because of the architectural philosophy it advocates, but also because of the way it is structured. Instead of presenting ideas in a traditional linear sequence, it links each idea (called a pattern) to the other patterns that it helps form, and that help form it. This idea is very cool, but sadly, the book's book-ness doesn't really lend itself to the information's structure. All of the patterns are interconnected, but on the page those connections aren't very visible.
My goal was to do something about that.
I wanted to create a visualization of the book that would allow you to see how the patterns relate to each other. I could have created this graph manually, but with 253 patterns, each of which connects to 2-10 other patterns, it would have taken ages to do by hand. Instead, I decided to get my computer to do the work for me :D
Creating the graph
Every pattern in A Pattern Language has the same basic elements, which makes it possible for me to teach my computer how to read them. These elements include:
- A number, identifying the pattern.
- A name, so the pattern can be conversed about.
- A section and subsection, which helps to organize the language.
- A number of stars (0-2), indicating how close the authors believe they have come to defining an "invariant" pattern.
- Other patterns the pattern helps to form, which come at the beginning of the pattern.
- Other patterns that help to form the pattern, which come at the end of the pattern.
Some of these values can be gathered manually without too much trouble (e.g., the numbers, names, and sections). But to create the graph in a reasonable time span, the others need to be processed automatically. Automatic processing means that our final graph probably won't be perfect, but it will be much better than the nothing that currently exists.
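To make those elements concrete, here is a minimal sketch of how one pattern could be represented in Python. The `Pattern` class, its field names, and the `parse_header` helper are all hypothetical names chosen for illustration, not necessarily what process.py uses. The one bit of parsing shown is the star count, which can be read off the trailing asterisks in a header.

```python
from dataclasses import dataclass, field


@dataclass
class Pattern:
    """One pattern from the book (hypothetical structure, for illustration)."""
    number: int          # identifies the pattern
    name: str            # so the pattern can be talked about
    section: str = ""    # gathered by hand
    subsection: str = ""
    stars: int = 0       # 0-2, how "invariant" the authors think it is
    helps_form: list = field(default_factory=list)  # IDs cited at the start
    formed_by: list = field(default_factory=list)   # IDs cited at the end


def parse_header(number: int, header: str) -> Pattern:
    """Split a header like 'CITY COUNTRY FINGERS**' into name and star count."""
    name = header.rstrip("*")
    return Pattern(number=number, name=name.strip(), stars=len(header) - len(name))


fingers = parse_header(3, "CITY COUNTRY FINGERS**")
print(fingers.name, fingers.stars)
```

The two lists start out empty because the linked pattern IDs are the part that has to be pulled out of the text automatically.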
To start creating the graph I needed something that my computer could read, and my physical copy probably wasn't going to cut it. So I went to a totally legal website and downloaded a PDF version of the book. Sadly, this PDF wasn't workable either, because it was just images and no text. I had to run it through my favorite PDF converter before I had something my Python program could actually read.
But this gave me text that was so messy that I (a human) practically couldn't read it:
2 THE DISTRIBUTION
OF TOWNS
... consider now the character of settlements within the region:
what balance of villages, towns, and cities is in keeping with the
independence of the region—INDEPENDENT REGIONS (1)?
If the population of a region is weighted too far toward
small villages, etc
17
^LTOWNS
towns with many small towns and few large ones; and indeed, the
nature of this distribution does correspond, etc
This can only be halted by policies which guarantee an equal
sharing of resources, and economic development, across the entire
region, etc
18
^L2 THE DISTRIBUTION OF TOWNS
As the distribution evolves, protect the prime agricultural land
for farming—aGRICULTURAL VALLEYs (4); protect the smaller
outlying towns, by establishing belts of countryside around them
and by decentralizing industry, so that the towns are economically
stable—counTry Towns (6). In the larger more central urban
areas work toward land policies which maintain open belts of
countryside between the belts of city—ciry COUNTRY FINGERS
(3)...
20
^L3 CITY COUNTRY FINGERS**
The first thing I noticed was all of the extra junk. Page numbers, duplicate headers, form feeds, etc. This stuff was just cluttering up the data, not providing useful information. After applying some regex to get rid of it, the file was a lot more readable and workable:
[[THE DISTRIBUTION OF TOWNS]]
... consider now the character of settlements within the region:
what balance of villages, towns, and cities is in keeping with the
independence of the region—INDEPENDENT REGIONS (1)?
If the population of a region is weighted too far toward
small villages, etc
towns with many small towns and few large ones; and indeed, the
nature of this distribution does correspond, etc
As the distribution evolves, protect the prime agricultural land
for farming—aGRICULTURAL VALLEYs (4); protect the smaller
outlying towns, by establishing belts of countryside around them
and by decentralizing industry, so that the towns are economically
stable—counTry Towns (6). In the larger more central urban
areas work toward land policies which maintain open belts of
countryside between the belts of city—ciry COUNTRY FINGERS
(3)...
[[CITY COUNTRY FINGERS**]]
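The cleanup that produces output like the above could be sketched roughly as follows. This is not the actual regex from process.py. It assumes that page numbers sit alone on a line, that running headers always follow a form feed, and that a form-feed line of the shape "number TITLE" is a real pattern header the first time that number appears. The `clean` function and `HEADER_RE` are hypothetical names.

```python
import re

# "3 CITY COUNTRY FINGERS**" -> number, then title with optional stars
HEADER_RE = re.compile(r"(\d+) ([A-Z ]+\**)")


def clean(raw: str) -> str:
    out = []
    seen = set()  # pattern numbers whose header has already been emitted
    # split("\n") rather than splitlines(), which also breaks on form feeds
    for line in raw.split("\n"):
        if line.startswith("\f"):
            match = HEADER_RE.fullmatch(line[1:].strip())
            if match and match.group(1) not in seen:
                seen.add(match.group(1))
                out.append(f"[[{match.group(2)}]]")
            continue  # running header, or a repeat of a header we already have
        if re.fullmatch(r"\d+", line.strip()):
            continue  # bare page number
        out.append(line)
    return "\n".join(out)


raw = "\n".join([
    "\f2 THE DISTRIBUTION OF TOWNS",
    "If the population of a region is weighted too far toward",
    "17",
    "\fTOWNS",
    "towns with many small towns and few large ones",
    "18",
    "\f2 THE DISTRIBUTION OF TOWNS",
    "As the distribution evolves, protect the prime agricultural land",
    "20",
    "\f3 CITY COUNTRY FINGERS**",
])
print(clean(raw))
```

A real pass would also need to handle headers that the OCR split across two lines, which this sketch ignores.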
Now I had to figure out how to separate the important pattern-linking lines from the unimportant ones. For a while I played with using the ellipses (...) as markers, but the OCR had trouble with them, and they also showed up in other places in the text. Instead (since I was going to use regex to match the pattern tags anyway) I decided to remove all of the whitespace and squash each paragraph into a single line:
[[THE DISTRIBUTION OF TOWNS]]
...considernowthecharacterofsettlementswithintheregion:whatbalanceofvillages,towns,andcitiesisinkeepingwiththeindependenceoftheregion—INDEPENDENTREGIONS(1)?
Ifthepopulationofaregionixweightedtoofartowardsmallvillages,etc
townswithmanysmalltownsandfewlargeones;andindeed,thenatureofthisdistributiondoescorrespond,etc
Asthedistributionevolves,protecttheprimeagriculturallandforfarmingaGRICULTURALVALLEYs(4);protectthesmalleroutlyingtowns,byestablishingbeltsofcountrysidearoundthemandbydecentralizingindustry,sothatthetownsareeconomicallystablecounTryTowns(6).Inthelargermorecentralurbanareasworktowardlandpolicieswhichmaintainopenbeltsofcountrysidebetweenthebeltsofcity—ciryCOUNTRYFINGERS
(3)...
[[CITY COUNTRY FINGERS**]]
This way I wouldn't have to worry about how many lines I was processing to find linked patterns. I just had to scan a single line.
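As a rough sketch of that squashing step: the version below assumes paragraphs are separated by blank lines in the cleaned text and that headers are the lines starting with `[[`, which are left untouched. Both are assumptions about the intermediate file, and `squash_paragraphs` is a hypothetical name.

```python
import re


def squash_paragraphs(text: str) -> list:
    out, buf = [], []

    def flush():
        # Join the buffered lines and strip every whitespace character
        if buf:
            out.append(re.sub(r"\s+", "", "".join(buf)))
            buf.clear()

    for line in text.split("\n"):
        if line.startswith("[["):   # header: close the paragraph, keep as-is
            flush()
            out.append(line)
        elif not line.strip():      # blank line: paragraph boundary
            flush()
        else:
            buf.append(line)
    flush()
    return out


text = "\n".join([
    "[[THE DISTRIBUTION OF TOWNS]]",
    "what balance of villages, towns, and cities is in keeping with the",
    "independence of the region INDEPENDENT REGIONS (1)?",
    "",
    "small villages, etc",
    "[[CITY COUNTRY FINGERS**]]",
])
for line in squash_paragraphs(text):
    print(line)
```

Removing the whitespace also means a tag like "INDEPENDENT REGIONS (1)" can't be broken by a line wrap in the middle of it.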
There was just one final issue I had to fix before creating the graph. Because of how the OCR ran, sometimes there were pattern IDs that were left dangling on their own line. So I squished short lines up into the previous paragraph to make sure those IDs got processed:
[[THE DISTRIBUTION OF TOWNS]]
...considernowthecharacterofsettlementswithintheregion:whatbalanceofvillages,towns,andcitiesisinkeepingwiththeindependenceoftheregion—INDEPENDENTREGIONS(1)?
Ifthepopulationofaregionixweightedtoofartowardsmallvillages,etc.
townswithmanysmalltownsandfewlargeones;andindeed,thenatureofthisdistributiondoescorrespond,etc
Asthedistributionevolves,protecttheprimeagriculturallandforfarmingaGRICULTURALVALLEYs(4);protectthesmalleroutlyingtowns,byestablishingbeltsofcountrysidearoundthemandbydecentralizingindustry,sothatthetownsareeconomicallystablecounTryTowns(6).Inthelargermorecentralurbanareasworktowardlandpolicieswhichmaintainopenbeltsofcountrysidebetweenthebeltsofcity—ciryCOUNTRYFINGERS(3)...
[[CITY COUNTRY FINGERS**]]
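One way to do that merge is sketched below. The cutoff of 8 characters for a "short" line is a guess made for this example, not a value taken from the original script, and `merge_short_lines` is a hypothetical name. Headers are never merged, and nothing is merged into a header.

```python
def merge_short_lines(lines, max_len=8):
    merged = []
    for line in lines:
        is_header = line.startswith("[[")
        can_attach = bool(merged) and not merged[-1].startswith("[[")
        if not is_header and can_attach and len(line) <= max_len:
            merged[-1] += line  # glue the dangling fragment onto the paragraph
        else:
            merged.append(line)
    return merged


lines = [
    "[[THE DISTRIBUTION OF TOWNS]]",
    "countrysidebetweenthebeltsofcityciryCOUNTRYFINGERS",
    "(3)...",
    "[[CITY COUNTRY FINGERS**]]",
]
for line in merge_short_lines(lines):
    print(line)
```

A length cutoff is crude, since a genuinely short paragraph would get glued on too, but for finding IDs that only costs a slightly longer line to scan.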
After all of that cleaning I was finally able to process the file and export the graph using PyGraphML.
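For a sense of what that processing involves, here is a hedged sketch. It treats the first paragraph under a header as the place where a pattern cites the patterns it helps form, and the last paragraph as the place where it cites the patterns that help form it, then pulls the IDs out with a regex. To stay self-contained it writes bare-bones GraphML with the standard library instead of PyGraphML. The `numbers` lookup (header name to pattern number) stands in for the hand-gathered data, and the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

ID_RE = re.compile(r"\((\d+)\)")  # a pattern ID such as "(4)"


def extract_edges(lines, numbers):
    """Return (source, target) pairs meaning 'source helps form target'."""
    blocks = []  # (pattern number, [paragraph lines])
    for line in lines:
        if line.startswith("[[") and line.endswith("]]"):
            blocks.append((numbers[line[2:-2].rstrip("*")], []))
        elif blocks:
            blocks[-1][1].append(line)
    edges = []
    for number, paras in blocks:
        if not paras:
            continue
        for larger in ID_RE.findall(paras[0]):        # cited at the beginning
            edges.append((number, int(larger)))
        if len(paras) > 1:
            for smaller in ID_RE.findall(paras[-1]):  # cited at the end
                edges.append((int(smaller), number))
    return edges


def to_graphml(edges):
    root = ET.Element("graphml", xmlns="http://graphml.graphdrawing.org/xmlns")
    graph = ET.SubElement(root, "graph", edgedefault="directed")
    for node in sorted({n for edge in edges for n in edge}):
        ET.SubElement(graph, "node", id=str(node))
    for source, target in edges:
        ET.SubElement(graph, "edge", source=str(source), target=str(target))
    return ET.tostring(root, encoding="unicode")


lines = [
    "[[THE DISTRIBUTION OF TOWNS]]",
    "...independenceoftheregionINDEPENDENTREGIONS(1)?",
    "smallvillages,etc.",
    "aGRICULTURALVALLEYs(4);counTryTowns(6).ciryCOUNTRYFINGERS(3)...",
    "[[CITY COUNTRY FINGERS**]]",
]
numbers = {"THE DISTRIBUTION OF TOWNS": 2, "CITY COUNTRY FINGERS": 3}
edges = extract_edges(lines, numbers)
print(edges)
xml_text = to_graphml(edges)
```

Matching only the number in parentheses sidesteps the mangled names the OCR produced (like "ciry COUNTRY FINGERS"), at the cost of picking up any other parenthesized number that happens to sit in those paragraphs.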
Pretty pictures
Now that I had the data in a workable format, I could load it up in Gephi to see what questions I could answer (and what pretty pictures I could make).
For example, the book is broken down into large patterns (towns), medium patterns (buildings), and small patterns (construction). Do the connections between patterns show this separation as well? It seems so, yes.
I also tried to see if the "invariant-ness" (number of stars) a pattern has correlates with its connected-ness. But I don't think this form of visualization is very insightful.
We can also look at subsections of the graph. For example, just the patterns connected to "House for One Person" (78):
Or just the "Roads and Paths" patterns:
Or just the "Roads and Paths" patterns, plus their neighbors:
Resources
I'm sure there are lots of other visualizations someone might want to create with this data, so I've uploaded the patterns.graphml file and the process.py file to a GitHub repo so anyone can download them and play with them. If anyone finds errors in the graph (which will exist since it's automated), feel free to send in issues and pull requests :D
Happy visualizing!