How Many Homes? Using AI to locate Small Sites in London

When Sadiq Khan first published his draft London Plan in 2017, in addition to the general housing targets that had been a feature of London Plans since 2000, the mayor set out a requirement for each of the planning authorities in London (including each of the 32 boroughs, the City of London, and the two Development Corporations) to achieve a proportion of these new homes on small sites.

In total, the mayor expected nearly 250,000 new homes, around a third of the total, to come from small sites, defined as those no larger than 0.25 hectares.

During the Examination in Public, a formal process through which interested parties can make representations to a Planning Inspector tasked with assessing the “soundness” of the Plan, many of the boroughs—and in particular, the outer London ones—complained that they didn’t believe they had sufficient capacity to achieve the numbers that the mayor expected of them.

Because the mayor was unable to provide sufficient evidence that the figures were robust (they were largely based on an extrapolation of historic approvals on “windfall” sites), the Inspector required the targets to be slashed. The twenty suburban boroughs saw reductions in their small sites housing targets of between 57% (Croydon) and 65% (Havering). The inner-London boroughs’ figures were reduced by far less, with Islington and Hackney maintaining the original figures and Camden, Hammersmith & Fulham, Kensington & Chelsea and Southwark all subject to a reduction of less than 25%.

The London Plan was eventually adopted in 2021.

Now that Sadiq Khan is almost certain to secure a third and final term, his team is beginning to think about how the next iteration of his Plan might look. I have written previously about the potential for green belt reform, and how I believe the mayor should adopt a bolder approach to London’s expansion. Yet, to meet London’s increasingly desperate housing need, no one solution will do. Everything should be on the table: building on suburban station car parks; reform of the green belt; the release of some of London’s golf courses for development—and a dramatic increase in suburban intensification through the development of small sites.

How Many Homes?

During the 2019 Examination in Public, it became clear that nobody could agree on how many homes could be delivered on small sites, largely because data were not available, and figures based on historic approvals were not a good indication of future capacity.

In 2020, architects Ash Sakula, and my practice RCKa, were appointed by Lewisham Council to produce a set of guidance for the delivery of new homes on small sites in the borough. This guide was adopted as a Supplementary Planning Document in October 2021, and we were subsequently appointed by Bexley Council to undertake a similar exercise in support of its own Local Plan (this is yet to be published).

In undertaking this work we needed to understand the types of small sites that were coming forward for development in each borough, what was receiving approval, and what was being refused. This enabled us to set out specific guidance based on a standard set of “typologies”, including street infill, backland sites and so on.

I started by producing a set of mapping tiles for every small site freehold in Lewisham, of which there are around 56,000. I set a threshold of between 100sqm and 0.25ha, assuming that any site with an area of less than 100sqm was unlikely to be developable.
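A minimal sketch of that area filter, assuming each freehold boundary is available as a list of (x, y) vertices in metres. The helper names and the three sample freeholds are hypothetical, and this is not the QGIS workflow actually used.

```python
# Hypothetical helpers: keep freeholds whose area falls inside the small-site band.
MIN_AREA_SQM = 100.0      # lower threshold from the text
MAX_AREA_SQM = 2_500.0    # 0.25 hectares expressed in square metres


def polygon_area(ring):
    """Shoelace formula; ring is a list of (x, y) vertices in metres."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def is_small_site(ring):
    """True when the boundary area lies between the two thresholds."""
    return MIN_AREA_SQM <= polygon_area(ring) <= MAX_AREA_SQM


# Invented rectangles for illustration only.
freeholds = {
    "A": [(0, 0), (5, 0), (5, 10), (0, 10)],      # 50 sqm: too small
    "B": [(0, 0), (10, 0), (10, 30), (0, 30)],    # 300 sqm: in range
    "C": [(0, 0), (60, 0), (60, 50), (0, 50)],    # 3,000 sqm: too large
}
small_sites = [fid for fid, ring in freeholds.items() if is_small_site(ring)]
print(small_sites)
```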

Using my knowledge of small sites in Lewisham, and the site types described in the Mayor of London’s Small Sites Design Code, I manually categorised around 10,000 of the tiles I had produced using QGIS. Yes, it took a long time.

Each site was determined as falling into one of several types:

| Site Type | Description |
| --- | --- |
| Amenity | An empty site with a street frontage |
| Backland | An empty or under-developed site without a street frontage, usually at the centre of a perimeter block, sometimes with a street frontage |
| Corner | An empty site with an equal frontage to two streets |
| Estate Infill | A site with existing buildings where there’s capacity for small-scale infill |
| Mews | A long, narrow site with a property at one end but access to the public highway at the other |
| Side Street Infill | The street-facing rear garden of a property facing the highway where there’s capacity for another home to the rear |
| Street Extension | A site which can be divided in two to create a new home to one side of the existing property |
| Street Infill | A gap between two properties facing the public highway where there’s sufficient width to create a new dwelling |

In addition to these, I categorised a whole bunch of sites which I considered not to be developable. These included freeholds occupied by existing terraced houses, semi-detached and detached homes, and where existing structures occupy the entirety of the plot or where there are narrow strips of land which are undevelopable.

With this large dataset in place, it’s possible to make some predictions based on an extrapolation of the proportions of different site types to the 56,132 small sites in the borough, shown in the table below. The site types listed below the developable subtotal are considered to be undevelopable.

| Site Type | Manually Categorised | Percentage of Total | Extrapolated Total |
| --- | --- | --- | --- |
| Estate Infill | 240 | 2.39% | 1,341 |
| Street Infill | 113 | 1.13% | 632 |
| Side Street Infill | 367 | 3.65% | 2,051 |
| Street Extension | 516 | 5.14% | 2,884 |
| Sub Total Developable Sites | 2,169 | 21.60% | 12,123 |
| Fully Filled Site | 341 | 3.40% | 1,906 |
| Narrow Strip | 57 | 0.57% | 319 |
| Sub Total Undevelopable Sites | 7,874 | 78.40% | 44,009 |
| Total | 10,043 | 100% | 56,132 |
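The extrapolation in the table is simple proportional scaling: each type's share of the 10,043 hand-categorised tiles is applied to all 56,132 small sites. A short sketch of the arithmetic, using counts from the table (the function name is hypothetical):

```python
TOTAL_SMALL_SITES = 56_132   # small-site freeholds in the borough
SAMPLE_SIZE = 10_043         # manually categorised tiles

# Manually categorised counts taken from the table above.
sample_counts = {
    "Estate Infill": 240,
    "Street Infill": 113,
    "Side Street Infill": 367,
    "Street Extension": 516,
    "Sub Total Developable Sites": 2_169,
}


def extrapolate(count, sample_size=SAMPLE_SIZE, population=TOTAL_SMALL_SITES):
    """Scale a sample count up to the whole borough; returns (share, estimate)."""
    share = count / sample_size
    return share, round(share * population)


for site_type, count in sample_counts.items():
    share, estimate = extrapolate(count)
    print(f"{site_type}: {share:.2%} of sample -> ~{estimate:,} sites")
```

This assumes the hand-categorised tiles are representative of the borough as a whole; if they were drawn mainly from one area, the scaled-up totals would carry that bias.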

Based on this, we can estimate that around a quarter of Lewisham’s small sites have the capacity to accommodate one or more new homes; 12,700 in total. Most of these sites will yield a single dwelling, but some—particularly backland sites and amenity spaces—have the capacity for more.

Bearing in mind that Lewisham’s London Plan small sites target is just 3,790 homes over 10 years, it seems likely that the council could deliver three quarters of its entire housing target of 16,670 homes on small sites alone.

So, we’ve established a theoretical capacity…but where are the sites themselves?

Release the Robots!

Armed with a big folder full of files separated into the different categories, I used a pattern-recognition algorithm to examine each one and build an AI model which could be deployed on the remaining 46,000 tiles. I’m (clearly) not an AI expert, so I used the ImageAI Python library to achieve this, testing each of the four in-built algorithms to understand which returned the most accurate predictions based on the images I supplied to it.

Of those available, InceptionV3 appeared to yield the best results across the various site types, although other algorithms identified certain types more accurately. Each of these algorithms has been developed using large sets of photographic images and is intended to identify, for example, types of animals, fruits, vegetables and inanimate objects, rather than to sort abstract geometric data. Nevertheless, they seem to work reasonably well for this purpose, although there may well be other algorithms better suited to the task.
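One way to make this kind of comparison concrete is to score each candidate model against a held-back set of hand-labelled tiles, both overall and per site type. The sketch below is library-agnostic; the model names and labels are invented for illustration and are not results from this study.

```python
from collections import defaultdict


def accuracy_by_type(true_labels, predicted_labels):
    """Return (overall accuracy, {site type: accuracy}) for one model."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        totals[truth] += 1
        hits[truth] += int(truth == guess)
    overall = sum(hits.values()) / len(true_labels)
    per_type = {t: hits[t] / totals[t] for t in totals}
    return overall, per_type


# Toy, invented labels for six held-back tiles; not the author's results.
truth = ["mews", "mews", "corner", "corner", "amenity", "amenity"]
predictions = {
    "model_a": ["mews", "mews", "corner", "amenity", "amenity", "amenity"],
    "model_b": ["mews", "corner", "corner", "corner", "amenity", "mews"],
}

for name, guesses in predictions.items():
    overall, per_type = accuracy_by_type(truth, guesses)
    print(name, round(overall, 2), per_type)
```

In this toy data, model_a has the higher overall score while model_b is better on corner sites, which is the same pattern described above: the best model overall is not necessarily the best for every type.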

I then set the learning model to task on the full dataset, taking around eight hours to process all 51,000 files. This was relatively quick compared with generating the library of images; QGIS took several days to create individual image files for each freehold.

The output of this process is a simple text file, with each line including the Land Registry INSPIRE ID code (a unique identifier for every freehold boundary in the country), and a list of the top five likely site types, ranked by the level of confidence.
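A file of this shape is easy to read back in. The sketch below assumes comma-separated columns in the order INSPIRE ID, model name, then alternating site type and confidence values; the exact file layout is an assumption, and the sample row (including its ID) is invented.

```python
import csv
import io


def parse_prediction_row(row):
    """Split one row into its ID, model name and ranked (type, confidence) pairs."""
    inspire_id, model = row[0], row[1]
    pairs = zip(row[2::2], row[3::2])
    ranked = [(site_type, float(conf)) for site_type, conf in pairs]
    return {"inspire_id": inspire_id, "model": model, "ranked": ranked}


# Invented sample row for illustration only; a real file would be opened from disk.
sample = io.StringIO(
    "12345678,InceptionV3,terraced,99.9,street_extension,0.05,"
    "side_street_infill,0.03,mews,0.01,corner,0.01\n"
)
records = [parse_prediction_row(row) for row in csv.reader(sample)]
top_type, top_conf = records[0]["ranked"][0]
print(records[0]["inspire_id"], top_type, top_conf)
```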

When the AI model assesses each tile, it attempts to categorise the image and assigns a confidence level to its prediction, with its second preference receiving a lower confidence level, and so on. Here’s a typical line:


The first column is the INSPIRE ID and the second is the learning model used to identify the site type. The third and fourth columns show the top-ranked prediction and the level of confidence assigned to that prediction. In this case, it’s predicting that this is a terraced house with a 99.9% level of confidence, and if we check on the map, we can see that this appears to be correct:

Dropping down to street level using Google Streetview confirms this too:

Typically, a terraced house doesn’t make for a compelling development opportunity. Whilst the London Plan includes conversions within its target figures for small site developments (i.e. turning one house into several flats), many planning authorities are rightly concerned about the loss of family homes that this entails, and in many cases have brought in measures to resist this. So, what we’re looking for are those small sites where there’s an opportunity to add new homes without losing existing ones (or, at least, where there’s no net loss of homes).

Scrolling through the spreadsheet we can immediately identify a few lines which look like they might yield some interesting results.

As an example, the AI has identified INSPIRE ID number 58163886 as an “amenity” site. This is one which has direct access to the public highway but is not currently occupied by development (although it may have garages on it, which are not always picked up by the Ordnance Survey mapping data used to create the image tiles).

Lo and behold, if we find this site on Google Maps we can see that it is occupied solely by a couple of garages. A perfect development opportunity that looks like it could accommodate a couple of houses or a small block of flats.

Exciting stuff!

Here’s what they all look like on a map of Lewisham. We can see from this that there are clusters of mews sites in the north-west of the borough. As you’d expect, these are in Brockley, which is famous for this particular pattern of development, including Ashby, Breakspears and Wickham Mews.

Fewer sites seem to appear around Catford, which again makes sense given that terraced houses predominate, albeit there are some opportunities here for side-facing street infill.

So, armed with a huge database containing a list of every small site in Lewisham, it’s time to stick them all on a map to see where the small sites opportunities are.

Accuracy of the Findings

As mentioned above, the AI algorithm appears better at finding some site types than others. This might be a result of the learning data: there are many, many more “side-street infill” sites in Lewisham than there are “street infill”. The more examples there are, the more accurate the predictions will be.

Because the number of side-facing street infill and street extension sites is high, the AI is pretty good at distinguishing between them, even though there’s not much of a difference (really, just the proximity between the end of the terrace and the street). If the terrace extends to the pavement line but has a deep garden, then it’s a side-facing street infill site; if there’s space to pop a new home on the end of the terrace, then it’s a street extension site.

Take this area of Hither Green, SE6:

Within this small area you can see a range of coloured polygons, each one representing a potential development site, with the colour corresponding to a different site type.

This freehold boundary on the corner of Sandhurst Road and Wellmeadow Road has the potential for an additional home by extending the existing terrace to the south. There’s also space for a further side-facing street infill development to the rear – at least two homes! This is a street extension site.

Directly opposite this is a large house with a deep garden which has one side along the street edge. Because the existing property extends to the pavement there’s no opportunity here to extend the terrace further, but a new home could be constructed at the end of the garden. This is a side-facing street infill site.

The AI is smart enough to recognise that because the adjoining property to the south doesn’t have a garden with access to the street, there’s no development possible here.

Up on Minard Road the AI has located a triangle of land, previously developed (or, at least, with no buildings currently on it) which has a street frontage. Someone should develop this! It’s defined as an amenity site – and I reckon there’s space for at least three houses on it.

The geometry of the above site types is pretty consistent from freehold to freehold: the dimensions might vary, but the general arrangement is broadly the same. It gets more complicated for sites which exhibit the same broad principles, but vary significantly in terms of their geometry.

Where the AI doesn’t perform so well is in those site conditions where the geometry is more varied: backland sites, for example, or estate intensification. Because the geometry doesn’t distinguish between residential and non-residential buildings, it’s not possible to determine whether these sites are suitable for intensification. For example, a large block in the centre of a site could be a block of flats surrounded by car parking (and therefore suitable for some development), or it could be a sports centre or retail unit. A future version of the AI might be able to distinguish between the two.

So, some of the designations need to be treated with caution…but with that caveat in mind, let’s have a look at where all the sites are located.

Counting the Sites

We’ve seen that the AI identifies some site types more successfully than others. Partly this is because some of the site types are very similar to one another (for example, terraced houses and end-of-terrace side-facing street infill), and in this case the AI assigns a similar level of confidence to the first and second-ranked types, thereby reducing the confidence of the first-placed prediction. Other types, such as detached houses, tend to be more easily distinguishable from others, and so gain a higher first-ranked confidence.
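The gap between the first and second-ranked confidence is a handy way to flag predictions that deserve a manual check. A minimal sketch, assuming each prediction is a list of (site type, confidence) pairs in percentage terms; the 20-point cut-off and the sample values are arbitrary illustrations rather than figures from this work.

```python
def confidence_margin(ranked):
    """Gap between the first- and second-ranked confidence, in percentage points."""
    ordered = sorted((conf for _, conf in ranked), reverse=True)
    return ordered[0] - ordered[1] if len(ordered) > 1 else ordered[0]


def is_ambiguous(ranked, min_margin=20.0):
    # min_margin is an arbitrary illustrative cut-off, not a value from the article.
    return confidence_margin(ranked) < min_margin


# Invented predictions: one close call between similar types, one clear-cut case.
close_call = [("terraced", 52.0), ("side_street_infill", 45.0), ("mews", 3.0)]
clear_call = [("detached", 97.0), ("semi_detached", 2.0), ("corner", 1.0)]
print(is_ambiguous(close_call), is_ambiguous(clear_call))
```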

| Site Type | Manually Extrapolated | AI Categorised | Difference |
| --- | --- | --- | --- |
| Estate Infill | 1,341 | 1,028 | -313 |
| Street Infill | 632 | 1,064 | 432 |
| Side Street Infill | 2,051 | 1,687 | -364 |
| Street Extension | 2,884 | 1,721 | -1,163 |
| Sub Total Developable Sites | 12,123 | 8,630 | -3,493 |
| Fully Filled Site | 1,906 | 2,068 | 162 |
| Narrow Strip | 319 | 115 | -204 |
| Sub Total Undevelopable Sites | 44,009 | 47,502 | 3,493 |
| Total | 56,132 | 56,132 | 0 |
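The Difference column is the AI count minus the manually extrapolated count. Expressing it as a share of the extrapolated figure as well makes the larger discrepancies easier to spot; a short sketch using four rows from the table:

```python
# Figures copied from the table above.
extrapolated = {
    "Estate Infill": 1_341,
    "Street Infill": 632,
    "Side Street Infill": 2_051,
    "Street Extension": 2_884,
}
ai_counts = {
    "Estate Infill": 1_028,
    "Street Infill": 1_064,
    "Side Street Infill": 1_687,
    "Street Extension": 1_721,
}

differences = {t: ai_counts[t] - extrapolated[t] for t in extrapolated}
for site_type, diff in differences.items():
    relative = diff / extrapolated[site_type]
    print(f"{site_type}: {diff:+,} ({relative:+.0%})")
```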

Urban Morphology

Because the AI is able to categorise every type of freehold boundary in a borough (even those which are unsuitable for development), a potentially useful byproduct of this work is a decent picture of urban morphology. We can use the mapping to identify areas which are largely terraced homes (those around Catford, for example) and the suburban semis of Sydenham.

When we were working on the Lewisham Small Sites SPD, we used Nolli maps to understand the pattern of development around the borough. With the help of the AI model, we can quickly see the different character areas of Lewisham as defined by the types of property found in each.

This clearly shows the preponderance of terraced houses in the centre of the borough, with more varied types around Forest Hill and Sydenham in the west. The south-east seems less dense by comparison, but this is an anomaly which occurs because the AI is only working to categorise small sites. Many of the homes here are owned by social housing providers, and so fall under large freeholds; the speckle of terraced houses here is likely due to some of these homes being acquired under Right to Buy.


Something that became apparent during this research was the inaccuracy of the mapping data I was using to train the AI. In some cases the geometry shows surprising levels of granularity (a side extension, or garden room); but in others, whole houses are missing or backland garages omitted. Of course there’s always going to be a lag between things being built and the Ordnance Survey picking these up in the polygon data, but as some of the properties missing polygons from the Zoomstack information are decades old, their omission seems odd. What if I were able to apply the same methodology to aerial photography from Google Maps? Would this deliver similar (or more accurate) results?

To find out I produced another large set of image tiles, including freehold boundaries as before, but with the underlying geometry replaced with aerial imaging. A typical tile might look like this:


Because each line of the spreadsheet contains a unique identifier for every freehold boundary, we can pair this with the Land Registry’s INSPIRE polygon dataset, thereby providing us with the geographic mapping data. Using QGIS, it’s then possible to locate every site identified by the AI back onto a map and then, using the software’s XYZ image-generation capabilities, produce a big interactive map showing every single one.
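The pairing step is essentially a join on the INSPIRE ID. A minimal sketch, assuming the predictions and the boundary rings have already been loaded into dictionaries keyed by ID; the IDs and coordinates are invented, and the GeoJSON output is just one convenient format that QGIS can open as a vector layer.

```python
import json


def build_feature_collection(predictions, polygons):
    """Join {inspire_id: site_type} to {inspire_id: closed ring} as GeoJSON."""
    features = []
    for inspire_id, site_type in predictions.items():
        ring = polygons.get(inspire_id)
        if ring is None:
            continue  # no matching boundary for this ID, so skip it
        features.append({
            "type": "Feature",
            "properties": {"inspire_id": inspire_id, "site_type": site_type},
            "geometry": {"type": "Polygon", "coordinates": [ring]},
        })
    return {"type": "FeatureCollection", "features": features}


# Invented data: two predictions, only one of which has a matching polygon.
predictions = {"11111111": "amenity", "22222222": "mews"}
polygons = {
    "11111111": [
        [-0.0200, 51.4500], [-0.0198, 51.4500],
        [-0.0198, 51.4502], [-0.0200, 51.4502],
        [-0.0200, 51.4500],  # ring closed by repeating the first vertex
    ],
}
collection = build_feature_collection(predictions, polygons)
print(json.dumps(collection)[:80])
```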

Click the button below to explore the entire map of Lewisham. Bear in mind that the map takes a few seconds to load. It’s processing 57,000 lines of code to generate the markers. At some point I’ll find a more efficient way of achieving this.

Zoom right in and you’ll see little dots representing each potential site that the AI has identified. The colours correspond to the type of site, and clicking on each one reveals a pop-up listing the site area, the site type, a rough idea of the potential number of homes that the site can accommodate, and a link to the property on Streetview.
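A pop-up like this can be assembled from a few fields per site. The sketch below is an assumption about how that might be done, not the code behind the map: the field names and sample values are hypothetical, the homes estimate is simply passed in, and the Street View link uses the Google Maps URLs format with `map_action=pano`.

```python
from urllib.parse import urlencode


def streetview_url(lat, lon):
    """Build a Street View link for a latitude/longitude pair."""
    query = urlencode({"api": 1, "map_action": "pano", "viewpoint": f"{lat},{lon}"})
    return f"https://www.google.com/maps/@?{query}"


def popup_text(site):
    """Format the pop-up lines for one site dictionary (hypothetical field names)."""
    return "\n".join([
        f"Site type: {site['site_type']}",
        f"Site area: {site['area_sqm']:,.0f} sqm",
        f"Potential homes: {site['est_homes']}",
        f"Streetview: {streetview_url(site['lat'], site['lon'])}",
    ])


# Invented example site for illustration only.
site = {"site_type": "Amenity", "area_sqm": 420.0, "est_homes": 2,
        "lat": 51.45, "lon": -0.02}
print(popup_text(site))
```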