Following on from my experiments using AI to identify and classify small site development opportunities in London’s suburbs, I’ve been deploying the same pattern recognition software on aerial photography to see if it’s possible to categorise, and quantify, how much of London’s green belt has the potential for new homes.
The new government has made much of the potential for so-called “grey belt” to meet the country’s housing need, particularly where this is located close to public transport. The Labour manifesto stated that “the release of lower quality ‘grey belt’ land will be prioritised and we will introduce ‘golden rules’ to ensure development benefits communities and nature.”
This was followed by a consultation version of the National Planning Policy Framework (NPPF) in July 2024, which established a definition for what grey belt is:
“For the purposes of plan-making and decision-making, ‘grey belt’ is defined as land in the green belt comprising Previously Developed Land and any other parcels and/or areas of Green Belt land that make a limited contribution to the five Green Belt purposes (as defined in para 140 of this Framework), but excluding those areas or assets of particular importance listed in footnote 7 of this Framework (other than land designated as Green Belt).”
In her statement to the House which accompanied the launch of the draft NPPF, Angela Rayner said the following:
The Green Belt today accounts for more land in England than land that is developed – around 13 per cent compared to 10 per cent. Yet as many assessments show, large areas of the Green Belt have little ecological value and are inaccessible to the public. Much of this area is better described as ‘grey belt’: land on the edge of existing settlements or roads, and with little aesthetic or environmental value. It is also true that development already happens on the Green Belt, but in a haphazard and non-strategic way, leading to unaffordable houses being built without the amenities that local people need.
Estimates of how much of the green belt could be considered “grey” vary wildly. Property technology outfit LandTech suggested it could be around 0.4%, but CBRE estimated it at 7%. England’s metropolitan green belt covers some 16,384 square kilometres (London’s accounts for 5,085sqkm of this), so even at 2% this would be enough for more than 1.6m homes at a notional density of 50 dwellings per hectare. It’s a lot.
Setting aside for one moment the varying definitions of grey belt, I’ve been trying to calculate how much of the green belt is covered by stuff that’s clearly not green. This isn’t necessarily an attempt to find locations to build new homes, but simply an exercise to find the car parks, breaker’s yards, quarries, waste transfer depots and landfill sites that are currently protected from development by virtue of green belt protection.
Counting the Dots
Using GIS mapping I created a grid with intersections every 25m and applied this to the entirety of London’s green belt. In total that’s approximately 8.1 million data points. This is too much to handle in one go, so I divided it up by planning authority; there are 72 district councils, unitary authorities and London boroughs that have at least some of the capital’s green belt within them.
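For readers who want to reproduce the grid, here is a minimal sketch of the idea in Python. It is not the GIS workflow used for this article: the function names are hypothetical, the coordinates are assumed to be in a metric projection such as British National Grid, and it fills a simple bounding box rather than clipping to the green belt boundary, which real GIS software would do.

```python
from typing import Iterator, Tuple

TILE_M = 25  # grid spacing in metres, as described above


def grid_points(min_x: float, min_y: float, max_x: float, max_y: float,
                spacing: float = TILE_M) -> Iterator[Tuple[float, float]]:
    """Yield (x, y) grid intersections inside a bounding box.

    Assumes metric projected coordinates; a real workflow would also
    discard points that fall outside the green belt polygon.
    """
    nx = int((max_x - min_x) // spacing) + 1
    ny = int((max_y - min_y) // spacing) + 1
    for i in range(nx):
        for j in range(ny):
            yield (min_x + i * spacing, min_y + j * spacing)


def expected_points(area_sqkm: float, spacing: float = TILE_M) -> int:
    """Approximate number of grid points needed to cover a given area."""
    return round(area_sqkm * 1_000_000 / (spacing * spacing))


# A 100m x 100m box has 5 x 5 intersections (both edges included)
sample = list(grid_points(0, 0, 100, 100))

# London's 5,085sqkm of green belt at one point per 625sqm
london_estimate = expected_points(5085)
```

The second function is just the area divided by 625sqm, which is where the figure of roughly 8.1 million points comes from.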
The unitary authority of Buckinghamshire has the greatest area of green belt, with 15,630 hectares. This map is taken from an article I wrote about the impact of new housing targets on each of England’s planning authorities, and provides a quick overview of the green belt in each:
The smallest is the Royal Borough of Greenwich, which has a tiny area of green belt in its south east corner (so small it can’t be seen in the map below).
At every data point within an authority area I exported a Google satellite image covering an area of 625sqm (i.e. contained within a 25m x 25m tile). I then manually grouped a number of these into categories. Here’s an example of each classification showing the types of images generated:
Armed with a large set of images sorted into folders according to what I could see in each tile, I then trained an AI learning model on the resulting data. Applying this learning model to the remaining images and returning this data to the GIS software, I was able to map the location of every data point and show the distribution of different land types across each authority area.
I started with St Albans City and District Council (SACDC). Below you can see an area in the south of the district, close to the border with Hertsmere. Each coloured circle corresponds to a different designation. The outer colour shows the AI’s best estimate of what’s on the ground; the inner colour marks those categories I have determined could be classified as “grey belt”. Tarmac’s Harper Lane asphalt plant appears as a concentration of red and orange blobs; woodland as dark green rings, fields as lighter green, and agriculture in a lighter shade. The grey rings represent buildings, which you can see to the right of the image, with car parks among these in red.
Although it’s possible to spot some inaccuracies in the classifications, overall this appears to be pretty convincing. One of the issues is that Google’s satellite imagery varies across the country, both in resolution and in the time of year at which it was taken. This can lead to agricultural land being misidentified as earthworks, for example, when the picture was taken in winter after the crops had been harvested. In some remote places the ground is obscured by cloud or vapour trails from passing planes. The use of better aerial photography will help iron out these inconsistencies.
Zooming further into the image you can see numbers appearing within the centre of each dot. This is the “confidence level” that the AI has ascribed to each identification. A score of, say, 60%, means that the AI thinks the tile meets the characteristics of a classification with this degree of confidence. The higher the figure, the greater the chance it’s correct.
This is useful, because the AI also then includes a secondary and tertiary prediction with reducing levels of confidence. Because the tiles cover an area of 625sqm it is likely that they might include different types of ground cover: part car park, part woodland, for example. By multiplying the percentage confidence level by 625 we can get an approximation of how much of the tile is covered by each category. So, a tile that the AI has identified as having a primary classification of “tree cover” with a confidence level of 80% and a secondary classification as “car park”, with 20% confidence, might contain 500sqm of the former and 125sqm of the latter.
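The apportioning step can be sketched in a few lines of Python. This is an illustration rather than the code behind the maps: the function name is hypothetical, and the rescaling applied when the confidences don’t add up to 100% (for instance when only the top three predictions are kept) is an assumption of mine for the sketch.

```python
from typing import Dict

TILE_AREA_SQM = 625  # 25m x 25m tile


def apportion_tile(predictions: Dict[str, float],
                   tile_area: float = TILE_AREA_SQM) -> Dict[str, float]:
    """Split a tile's area across categories in proportion to confidence.

    `predictions` maps category name -> confidence (0 to 1). If the
    confidences do not sum to 1 they are rescaled so that the whole
    tile is accounted for (an assumption made for this sketch).
    """
    total = sum(predictions.values())
    if total <= 0:
        raise ValueError("confidences must sum to a positive value")
    return {cat: tile_area * conf / total
            for cat, conf in predictions.items()}


# The example from the text: 80% tree cover, 20% car park
example = apportion_tile({"Tree Cover": 0.8, "Car Parks": 0.2})
```

It is worth remembering that a confidence score is not really a measure of ground coverage, so this is only an approximation, albeit a convenient one.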
While individual data points can display odd results, there are ways to even out the anomalies. Grey belt land uses tend to cover a wider area than just 625sqm, and as can be seen above, the significant opportunities tend to be where a number of these are grouped together. Using a clustering algorithm in GIS we can find those parts of the study area in which many of the positive data points appear. Although these are not particularly useful for analysis, they can help signpost parts of the green belt which are worthy of further investigation.
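To give a flavour of how clustering works, here is a deliberately simple stand-in written in plain Python. It is not the algorithm built into the GIS software (which I haven’t named here); it just groups grey belt tiles that touch one another on the 25m grid and throws away groups below a minimum size. The function name and the `min_size` threshold are illustrative assumptions.

```python
from collections import deque
from typing import List, Set, Tuple

Cell = Tuple[int, int]  # (column, row) index on the 25m grid


def cluster_grey_cells(grey: Set[Cell], min_size: int = 3) -> List[Set[Cell]]:
    """Group grey-classified cells that touch (8-neighbour adjacency).

    Groups smaller than `min_size` are dropped as likely noise.
    `min_size` is an illustrative threshold, not a value from the study.
    """
    unvisited = set(grey)
    clusters: List[Set[Cell]] = []
    while unvisited:
        start = unvisited.pop()
        cluster = {start}
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            # Visit all eight neighbouring grid cells
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in unvisited:
                        unvisited.remove(nb)
                        cluster.add(nb)
                        queue.append(nb)
        if len(cluster) >= min_size:
            clusters.append(cluster)
    # Largest clusters first
    return sorted(clusters, key=len, reverse=True)


# Hypothetical input: one block of four grey tiles plus one isolated tile
cells = {(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)}
found = cluster_grey_cells(cells)
```

In this made-up example the isolated tile is discarded and the block of four survives as a single cluster covering 2,500sqm, which is the behaviour we want: lone anomalies drop out and grouped sites stand out.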
The map below shows the southeastern corner of Buckinghamshire, a unitary authority. Here I’ve used a clustering algorithm to create a series of circles where groups of grey belt points occur. The grey areas in this image are not within the green belt – everything else is.
Zooming into this area further shows that these clusters are broadly correct. There are some large solar farms in this area, something not found in St Albans (and so absent from the dataset on which the learning model was trained), so the AI is incorrectly identifying these as water. A further refinement will be to include a new category for photovoltaic panels. I’ve masked the areas not covered by green belt protection, which are shown in grey.
In the image comparison above you can see the large earthworks (Cemex’s Langley quarry) in the centre of the frame, and the eastern part of Thorney business park to the north. The two areas to the west are Traveller sites, with a solar farm misidentified as water in the bottom right. What this shows is that the use of clustering can help in identifying the areas of potential grey belt. The M25 can be seen running north to south on the right-hand side, and the Elizabeth Line across the centre, broadly parallel to the Grand Union Canal above it. Importantly, Iver Station is just about visible to the left of the intersection between the railway and motorway. Trains stop here every few minutes, so one wonders whether this would not make a good location for new homes: the area to the north of the station is clearly a candidate for new development were it not protected by virtue of its green belt designation.
We know that each data point represents an area of 625sqm, so by counting the number of these within each category as determined by the AI, we can quantify the total amount of land occupied by each.
The following table shows the classification of each of the 210,283 points within St Albans District. I’ve highlighted in green those categories of land that I think could be included within a “grey belt” designation.
Category | Data points | Area (ha) | % of Total | Grey Belt? (ha) |
---|---|---|---|---|
Allotments & Garden Centres | 2,102 | 131 | 1.00% | 0 |
Buildings – Commercial | 2,035 | 127 | 0.97% | 0 |
Buildings – Domestic | 5,208 | 326 | 2.48% | 0 |
Car Parks | 1,963 | 123 | 0.93% | 123 |
Cemeteries | 202 | 13 | 0.10% | 0 |
Earthworks | 1,066 | 67 | 0.51% | 67 |
Golf Courses | 8,217 | 514 | 3.91% | 0 |
Highway | 11,612 | 726 | 5.52% | 0 |
Open Space – Agriculture | 65,846 | 4,115 | 31.31% | 0 |
Open Space – Fallow Fields | 11,349 | 709 | 5.40% | 0 |
Open Space – General | 38,778 | 2,424 | 18.44% | 0 |
Open Space – Wasteland | 171 | 11 | 0.08% | 11 |
Polytunnels & Greenhouses | 1,325 | 83 | 0.63% | 0 |
Railways | 1,205 | 75 | 0.57% | 0 |
Sewage Farms | 4 | 0 | 0.00% | 0 |
Sports – Hard Courts | 4,493 | 281 | 2.14% | 0 |
Sports – Pitches & Fields | 2,868 | 179 | 1.36% | 0 |
Tree Cover | 40,772 | 2,548 | 19.39% | 0 |
Water | 9,680 | 605 | 4.60% | 0 |
Yards | 1,387 | 87 | 0.66% | 87 |
Total | 210,283 | 13,143 | 100.00% | 287 (2.18%) |
Based on these figures we can see that 287 hectares (or 2.18%) of St Albans’ green belt is being classified by the AI as “grey”. This includes earthworks, some of which are likely to have been misidentified due to the similarity between these and dry or fallow fields.
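The arithmetic behind the table is easy to check. The short Python sketch below uses the point counts from the table; the variable and function names are my own and purely illustrative.

```python
TILE_AREA_HA = 625 / 10_000  # each data point represents 0.0625 hectares

# Point counts for the four categories flagged as grey belt in the table
grey_counts = {
    "Car Parks": 1963,
    "Earthworks": 1066,
    "Open Space – Wasteland": 171,
    "Yards": 1387,
}
total_points = 210_283


def hectares(points: int) -> float:
    """Convert a count of 25m x 25m data points into hectares."""
    return points * TILE_AREA_HA


grey_ha = sum(hectares(n) for n in grey_counts.values())
total_ha = hectares(total_points)
grey_share = grey_ha / total_ha

print(f"{grey_ha:.0f} ha grey of {total_ha:.0f} ha ({grey_share:.2%})")
# prints: 287 ha grey of 13143 ha (2.18%)
```

Note that adding up the rounded row values (123 + 67 + 11 + 87) gives 288 rather than 287; the total is calculated from the unrounded areas, which come to 286.6875 hectares.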
Looking generally at the distribution of these “earthworks” classifications, it appears that around half are likely identified correctly. Discounting the other half (roughly 33 of the 67 hectares) leaves around 250 hectares of land where there’s a high level of certainty over the nature of the surface. That’s just shy of 2% of St Albans’ green belt: or 12,500 homes, if this were built out to a notional capacity of 50 dwellings per hectare (dph).
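The conversion from hectares to homes is a single multiplication, sketched below in Python. The function name is hypothetical, and 50dph is the notional density used throughout this article rather than a figure for any particular site.

```python
def homes_capacity(area_ha: float, density_dph: float = 50) -> int:
    """Homes that fit on `area_ha` hectares at `density_dph` dwellings per hectare."""
    return round(area_ha * density_dph)


# Around 250 hectares of higher-certainty grey belt in St Albans
st_albans = homes_capacity(250)

# One percent of England's 1.64m hectares of green belt
national_1pct = homes_capacity(1_640_000 * 0.01)
```

At 50dph the St Albans figure comes to 12,500 homes, and one percent of the national green belt comes to 820,000.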
Of course not all of this land will be suitable for development: some is far from public transport or the road network, or otherwise occupied for productive purposes (some of those concrete batching plants will actually be needed to build these homes!). But even if we generously discount half of the space identified, that still leaves a huge area of grey belt that can be developed for new homes. St Albans’ new five-year housing target, under the new Standard Method, is 7,720 homes. Most will be delivered, one assumes, through urban and suburban intensification. But let it not be argued that the district doesn’t have enough land to deliver these homes.
England’s metropolitan green belt covers an area of 1.64m hectares. Even if just one percent of this falls under the commonly-accepted definition of “grey belt” (car parks, breaker’s yards and landfill) then at over 800,000 homes that’s more than half of the new government’s five-year housing target. This is clearly an opportunity not to be missed.