Property Intelligence User Guide

This user guide provides information on the Property Intelligence database, prepared by GBG for customers looking to use the data. Property Intelligence is designed to assist in the process of providing insurance quotes for customers in England, Wales and Scotland by supplying data on residential properties.

Property coverage:

  • Houses, bungalows and flats are covered, although the majority of testing has been on freehold properties, i.e. typically houses and bungalows;

Geographic coverage:

  • England and Wales full coverage;
  • Scotland full coverage but reduced accuracy because of different policies for land and property registration and the publication of data;
  • Isle of Man, Channel Islands, Northern Ireland – not covered;

Property Intelligence is currently built on a quarterly basis, although this is subject to review. The underlying datasets have a range of update frequencies from monthly upwards. Updates are included in the build as they become available.

Database Fields

Utility Fields

The database contains a set of utility fields and a set of feature fields. The utility fields are as follows:

  • UPRN - the Unique Property Reference Number originating from the Ordnance Survey
  • UDPRN - the Unique Delivery Point Reference Number originating from the Royal Mail
  • UMRRN - the Unique Multiple Residence Reference Number originating from the Royal Mail
  • Address1 - a standardised first line of address containing house name or number and street
  • Postcode - a full postcode
  • Easting - Ordnance Survey National Grid Easting
  • Northing - Ordnance Survey National Grid Northing
  • Latitude - latitude in ETRS89, converted from the easting and northing using OSTN02
  • Longitude - longitude in ETRS89, converted from the easting and northing using OSTN02
  • Output area code - Output Area (OA) code from the ONS Postcode Directory
  • Lower super output area code - Lower Super Output Area (LSOA) code from the ONS Postcode Directory
  • Country - one of England, Northern Ireland, Scotland, or Wales from the ONS Postcode Directory

 

Data Items

The feature fields in the database are arranged in sets of three:

  • X – this is the data item of interest, for example a number of bedrooms for which X = “bedrooms”;
  • source_X – this is the source of the information, for example the number 2 indicates that this data item is sourced from the Land Registry;
  • p_X – this is a confidence score for the data ranging between 0 and 1. Confidence scores are calculated, where possible, as a “fraction correct” measure against a groundtruth dataset of 36,000 properties supplied by Simple and Open;
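The three-field arrangement above can be read with a small helper. This is a minimal sketch rather than part of the product: the record is a plain dictionary with made-up sample values, and the function names `feature_triple` and `value_if_confident` are hypothetical. Only the `X`, `source_X` and `p_X` naming pattern comes from this guide.

```python
def feature_triple(record, name):
    """Return (value, source code, confidence) for one feature field."""
    return record[name], record[f"source_{name}"], record[f"p_{name}"]


def value_if_confident(record, name, min_confidence):
    """Return the value only if its p_X score meets the chosen threshold."""
    value, _source, confidence = feature_triple(record, name)
    return value if confidence >= min_confidence else None


# Made-up sample record for illustration only.
record = {"bedrooms": 3, "source_bedrooms": 4, "p_bedrooms": 0.85}

value, source, confidence = feature_triple(record, "bedrooms")
print(value, source, confidence)
print(value_if_confident(record, "bedrooms", 0.9))
```

The threshold is a choice for the consumer of the data; this guide does not recommend a particular cut-off.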

The unique key to the database is the UDPRN / UMRRN pair supplied by Royal Mail; the UPRN is also supplied. The list of data items is as follows:

  • Property type - whether the property is semi-detached, detached, terraced or a flat
  • Number of floors - the estimated number of floors in the property based on the height of the building.
  • Number of bedrooms - the number of bedrooms in a property
  • Number of bathrooms - the number of bathrooms in a property
  • Number of rooms in total - the number of rooms excluding bathrooms and kitchens
  • Building construction period - the construction date of a building in one of the following periods: (before 1719 (old), 1720-1839 (Georgian), 1840-1919 (Victorian/Edwardian), 1920-1945 (Inter-war), 1946-1979 (Post-war) and 1980 to date (Modern))
  • Year built - the year built, only available for those buildings in the Land Registry Price Paid data, built after 1995
  • Listed building - The grade of listing of a building, if it is listed, using data supplied by English Heritage, Cadw or Historic Scotland
  • Cadastral polygon area - the area of the cadastral parcel in which the building sits expressed in square metres using data from Land Registry (not currently available)
  • Height - the building height in metres
  • Building footprint (square metres) - the approximate footprint of the building expressed in square metres
  • Building volume (cubic metres) - the approximate volume of the building expressed in cubic metres
  • Average roof slope - the average slope of the property roof; this can be used to identify properties with flat roofs
  • Flat roof fraction - the estimated fraction of a building which has a flat roof
  • Distance to tree - distance from the nearest tree over 10 metres tall to the property geocode
  • Geocode multiplicity - the number of property geocodes falling within the footprint of the building at 1.8 metres above ground level
  • Floor area (square metres) - the liveable floor area in square metres (data available for England and Wales only)
  • Last transaction price - the price paid at the last transaction recorded by the Land Registry (England and Wales only, back to 1995)
  • Last transaction date - the date of the last transaction recorded by the Land Registry (England and Wales only, back to 1995)
  • Estimated current value - estimated current value based on data from Land Registry (England and Wales only, back to 1995)
  • Number of transactions - the number of transactions recorded by the Land Registry (England and Wales only, back to 1995)
  • Estimated council tax band - estimated council tax from price at reference years using Land Registry data (England and Wales only, back to 1995)
  • Within 200 metres of watercourse - flag indicating whether there is a watercourse within 200 metres
  • Distance to watercourse (within 200 metres) - distance (in metres) to a watercourse, if it is within 200 metres
  • Distance to road - the distance to the centre line of the nearest road from the property geocode, not necessarily accessible
  • Road class - road class, as provided by Ordnance Survey
  • Business usage - a flag indicating potential business usage
  • Planning classification - planning classification as per Town and Country Planning (Use Classes) Order 1987
  • Congestion zone - a flag indicating if a property is in the London Congestion Zone (not currently available)
  • Burglary rate - the number of burglaries per property per year averaged over a LSOA
  • Storey on which flat sits - the storey on which a flat sits; possible values include ground, 1st, 2nd and so forth, but also mid- and top-floor
  • Is top floor flat? - a flag indicating whether a flat is on the top floor of the building
  • Number of extensions - the number of extensions to a property, typically 1 but up to 4
  • Wall type - the type of wall used in construction; possible values are cavity wall, solid brick, sandstone, granite, timber frame, system built and SAP05
  • Main central heating fuel - the main central heating fuel; possible values include gas, electricity, oil, coal, LPG, wood, B30K (a biofuel mix) and also 'not known' and 'none'

 

Technical Details

Technical details for each of these fields are shown in the table below:

| Title | Field name | Data type |
|---|---|---|
| UPRN | UPRN | Number |
| UDPRN | UDPRN | Number |
| UMRRN | UMRRN | Number |
| Address1 | address1 | Text |
| Postcode | postcode | Text |
| Easting | easting | Number |
| Northing | northing | Number |
| Latitude | latitude | Number |
| Longitude | longitude | Number |
| Output area code | OA11CD | Text |
| Lower super output area code | LSOA11CD | Text |
| Country | country | Text |
| Property type | property_type | Lookup |
| Number of floors | floors | Number |
| Number of bedrooms | bedrooms | Number |
| Number of bathrooms | bathrooms | Number |
| Number of rooms in total | total_rooms | Number |
| Building construction period | age | Lookup |
| Year built | year_built | Number |
| Listed building | listed | Number |
| Cadastral polygon area | cadastral | Number |
| Height | height | Number |
| Building footprint (square metres) | footprint | Number |
| Building volume (cubic metres) | volume | Number |
| Average roof slope | avg_roof_slope | Number |
| Flat roof fraction | flat_roof_fraction | Number |
| Distance to tree | distance_to_tree | Number |
| Geocode multiplicity | geocode_multiplicity | Number |
| Floor area (square metres) | floor_area | Number |
| Last transaction price | last_transaction_price | Number |
| Last transaction date | last_transaction_date | Text |
| Estimated current value | est_current_value | Number |
| Number of transactions | n_transactions | Number |
| Estimated council tax band | est_council_tax | Lookup |
| Within 200 metres of watercourse | watercourse_200M | Number |
| Distance to watercourse (within 200 metres) | distance_to_water | Number |
| Distance to road | distance_to_road | Number |
| Road class | road_class | Number |
| Business usage | business_usage | Number |
| Planning classification | planning_classification | Number |
| Congestion zone | congestion_zone | Number |
| Burglary rate | burglary_rate | Number |
| Storey on which flat sits | flat_floor | Text |
| Is top floor flat? | top_floor_flat | Number |
| Number of extensions | extensions | Number |
| Wall type | wall_type | Number |
| Main central heating fuel | main_fuel | Number |

Table 1: Technical details for each utility and data field. source_X fields are lookup fields, p_X fields are number fields. 
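As an illustration of how the field names in Table 1 might be used, the sketch below looks up one property by its UDPRN / UMRRN key using Python's standard `sqlite3` module. The table name `properties`, the in-memory database and the sample row are assumptions made for this example; they do not describe the delivered file layout.

```python
import sqlite3

# Assumption: a single table called "properties"; only the column names
# are taken from Table 1. The row inserted below is made-up sample data.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be read by column name
conn.execute(
    "CREATE TABLE properties ("
    "UDPRN INTEGER, UMRRN INTEGER, postcode TEXT, "
    "bedrooms INTEGER, source_bedrooms INTEGER, p_bedrooms REAL, "
    "PRIMARY KEY (UDPRN, UMRRN))"
)
conn.execute(
    "INSERT INTO properties VALUES (?, ?, ?, ?, ?, ?)",
    (12345678, 0, "AB1 2CD", 3, 4, 0.85),
)


def fetch_property(conn, udprn, umrrn):
    """Return the row for a UDPRN / UMRRN pair as a dict, or None."""
    row = conn.execute(
        "SELECT * FROM properties WHERE UDPRN = ? AND UMRRN = ?",
        (udprn, umrrn),
    ).fetchone()
    return dict(row) if row is not None else None


print(fetch_property(conn, 12345678, 0))
```

Parameterised queries (the `?` placeholders) keep the key values separate from the SQL text.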

 

Lookup Tables 

Tables 2-12 are the lookup tables relating the numbers found in the database fields to descriptions, for example for the property type, property age, Council Tax band and data source. The Yes/No lookup is used for the 'watercourse 200M', 'congestion zone' and 'top floor flat' fields.

Yes/No Lookup

| Description | Value |
|---|---|
| No | 0 |
| Yes | 1 |

 

Table 2: Yes/No Lookup

 

Property Type Lookup

| Description | Value |
|---|---|
| Detached | 0 |
| Semi-detached | 1 |
| Terraced | 2 |
| Flat | 3 |
| Unknown | 4 |

Table 3: Property type lookup

 

Property Age Lookup

| Description | Value |
|---|---|
| Before 1719 (old) | 0 |
| 1720-1839 (Georgian) | 1 |
| 1840-1919 (Victorian/Edwardian) | 2 |
| 1920-1945 (Inter-war) | 3 |
| 1946-1979 (Post-war) | 4 |
| 1980 to date (Modern) | 5 |
| Not known | 6 |

Table 4: Property age lookup

 

Council Tax Lookup

| Description | Value |
|---|---|
| A | 0 |
| B | 1 |
| C | 2 |
| D | 3 |
| E | 4 |
| F | 5 |
| G | 6 |
| H | 7 |
| I | 8 |
| N/A | 100 |

Table 5: Council tax band lookup

 

Data Source Lookup

| Description | Value |
|---|---|
| Default | 0 |
| Land Registry | 2 |
| Historic England | 3 |
| Estate agent | 4 |
| LIDAR | 7 |
| NROSH multipart | 8 |
| NROSH snapshot | 9 |
| VOA | 12 |
| Heuristic | 14 |
| ML (age) | 15 |
| Naive Bayes (age) | 17 |
| Banded VOA | 18 |
| ML (bedrooms) | 19 |
| VOA (Council Tax) | 20 |
| OS Open Rivers | 21 |
| NB (bedrooms) | 24 |
| OS Open Map | 25 |
| Financial Services | 26 |
| Transport for London | 28 |
| police.uk | 29 |
| Flats modeller | 30 |
| Cadw | 33 |
| Historic Environment Scotland | 34 |
| OS Open Roads | 35 |
| Royal Mail | 36 |
| DCLG | 37 |
| DCLG non-domestic | 38 |
| OS AddressBase Premium | 41 |

Table 6: Data source lookup

 

Business Usage Lookup

| Description | Value |
|---|---|
| Domestic | 0 |
| Business | 1 |

Table 7: Business usage lookup

 

Main Fuel Lookup

| Description | Value |
|---|---|
| Gas | 0 |
| Electricity | 1 |
| Oil | 2 |
| Not known | 3 |
| Coal | 4 |
| LPG | 5 |
| Wood | 6 |
| None | 7 |
| B30K | 8 |
| Other | 9 |
| Biomass/Biogas | 10 |
| District heating | 11 |
| Waste heat | 12 |

Table 8: Main fuel lookup

 

Wall Type Lookup

| Description | Value |
|---|---|
| Cavity wall | 0 |
| Solid brick | 1 |
| Sandstone | 2 |
| Timber frame | 3 |
| Granite | 4 |
| System built | 5 |
| SAP05 | 6 |
| Not known | 7 |

Table 9: Wall type lookup

 

Planning Classification Lookup

| Description | Value |
|---|---|
| Not known | 0 |
| A1/A2 Retail and Financial/Professional services | 1 |
| A3/A4/A5 Restaurant and Cafes/Drinking Establishments and Hot Food takeaways | 2 |
| B1 Offices and Workshop businesses | 3 |
| B2 to B7 General Industrial and Special Industrial Groups | 4 |
| B8 Storage or Distribution | 5 |
| C1 Hotels | 6 |
| C2 Residential Institutions - Hospitals and Care Homes | 7 |
| C2 Residential Institutions - Residential schools | 8 |
| C2 Residential Institutions - Universities and colleges | 9 |
| C2A Secure Residential Institutions | 10 |
| C3 - Dwelling houses | 11 |
| D1 Non-residential Institutions - Community/Day Centre | 12 |
| D1 Non-residential Institutions - Crown and County Courts | 13 |
| D1 Non-residential Institutions - Education | 14 |
| D1 Non-residential Institutions - Libraries Museums and Galleries | 15 |
| D1 Non-residential Institutions - Primary Health Care Building | 16 |
| D2 General Assembly and Leisure plus Night Clubs and Theatres | 17 |
| Others - Passenger terminals | 18 |
| Others - Emergency services | 19 |
| Others - Miscellaneous 24hr activities | 21 |
| Others - Car Parks 24 hrs | 22 |
| Others - Stand alone utility block | 23 |
| Others - Telephone exchanges | 24 |
| Sui generis | 25 |

Table 10: Planning classification lookup

 

Road Class Lookup

| Description | Value |
|---|---|
| Unclassified | 0 |
| Not classified | 1 |
| Classified unnumbered | 2 |
| B Road | 3 |
| A Road | 4 |
| Motorway | 5 |
| Unknown | 6 |

Table 11: Road class lookup

 

Listed Building Grade Lookup

| Description | Value |
|---|---|
| Not listed | 0 |
| I or A | 1 |
| II* or B | 2 |
| II or C | 3 |

Table 12: Listed building grade lookup
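The lookup tables above can be applied in code as simple dictionaries. The sketch below covers three of them (property type, Council Tax band and listed building grade) with values copied from the tables; the `decode` helper and the `LOOKUPS` mapping are hypothetical names used only for illustration, and the field names are those given in Table 1.

```python
PROPERTY_TYPE = {0: "Detached", 1: "Semi-detached", 2: "Terraced", 3: "Flat", 4: "Unknown"}

# Bands A to I map to 0 to 8; 100 means not applicable.
COUNCIL_TAX_BAND = {i: band for i, band in enumerate("ABCDEFGHI")}
COUNCIL_TAX_BAND[100] = "N/A"

LISTED_GRADE = {0: "Not listed", 1: "I or A", 2: "II* or B", 3: "II or C"}

# Field name -> lookup dictionary.
LOOKUPS = {
    "property_type": PROPERTY_TYPE,
    "est_council_tax": COUNCIL_TAX_BAND,
    "listed": LISTED_GRADE,
}


def decode(field, code):
    """Translate a numeric code into its description for a lookup field."""
    return LOOKUPS[field].get(code, f"unrecognised code {code}")


print(decode("property_type", 3))
print(decode("est_council_tax", 100))
print(decode("listed", 2))
```

Returning a marker string for unrecognised codes, rather than raising an error, is a design choice for this sketch; stricter handling may suit production use better.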

Accuracy

Accuracy for the tested fields calculated using 2020-06-17-rc4.14-consensus.sqlite on 2020-07-20 11:15:54 against 33106 properties is shown in the table below.

| Field | Accuracy (%) |
|---|---|
| Number of bedrooms | 71.4 |
| Number of bathrooms | 77.3 |
| Building construction period | 63.9 |
| Property type | 83.3 |
| Number of floors | 90.8 |

Table 13: Summary accuracy for fields, measured against 'groundtruth' properties in England and Wales, excluding flats

Coverage

The following tables show, for each data source, the dataset coverage and accuracy for bedrooms, bathrooms, age, property type and number of floors, along with the confidence for these attributes, based on measurements against the 33,000 property groundtruth dataset covering England and Wales.

 

| Source | Coverage | Accuracy | Confidence |
|---|---|---|---|
| DCLG | 0.156 | 0.705 | 0.700 |
| Default | 0.026 | 0.385 | 0.500 |
| Estate agent | 0.325 | 0.856 | 0.850 |
| Flats modeller | 0.002 | 0.235 | 0.600 |
| NB (bedrooms) | 0.482 | 0.639 | 0.645 |
| NROSH multipart | 0.009 | 0.805 | 0.800 |
| NROSH snapshot | 0.000 | 1.000 | 0.800 |
| Overall | 1.000 | 0.714 | 0.718 |

Table 14: Accuracy and coverage for bedrooms
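The Overall row in the bedrooms table appears consistent with a coverage-weighted mean of the per-source accuracies. This guide does not state how the Overall figures are calculated, so the sketch below is an observation about the published numbers, not a description of the build process. The function name `coverage_weighted` is hypothetical.

```python
# (coverage, accuracy) per source, copied from the bedrooms table.
bedrooms = {
    "DCLG": (0.156, 0.705),
    "Default": (0.026, 0.385),
    "Estate agent": (0.325, 0.856),
    "Flats modeller": (0.002, 0.235),
    "NB (bedrooms)": (0.482, 0.639),
    "NROSH multipart": (0.009, 0.805),
    "NROSH snapshot": (0.000, 1.000),
}


def coverage_weighted(rows):
    """Mean of the second value in each pair, weighted by coverage."""
    total_coverage = sum(coverage for coverage, _ in rows.values())
    weighted_sum = sum(coverage * score for coverage, score in rows.values())
    return weighted_sum / total_coverage


overall = coverage_weighted(bedrooms)
print(round(overall, 3))  # compare with the Overall accuracy of 0.714
```

Because the published values are rounded to three decimal places, small differences from the Overall row are to be expected when the same check is applied to the other tables.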

 

| Source | Coverage | Accuracy | Confidence |
|---|---|---|---|
| Default | 0.707 | 0.800 | 0.730 |
| Estate agent | 0.293 | 0.707 | 0.760 |
| Overall | 1.000 | 0.774 | 0.739 |

Table 15: Accuracy and coverage for bathrooms

 

| Source | Coverage | Accuracy | Confidence |
|---|---|---|---|
| Cadw | 0.000 | 0.636 | 0.630 |
| DCLG non-domestic | 0.000 | 0.000 | 0.500 |
| Historic England | 0.004 | 0.562 | 0.540 |
| Land Registry | 0.043 | 0.939 | 0.950 |
| Naive Bayes (age) | 0.589 | 0.722 | 0.728 |
| Overall | 1.000 | 0.639 | 0.646 |
| VOA | 0.365 | 0.469 | 0.479 |

Table 16: Accuracy and coverage for age

 

| Source | Coverage | Accuracy | Confidence |
|---|---|---|---|
| Banded VOA | 0.066 | 0.622 | 0.601 |
| DCLG | 0.091 | 0.934 | 0.900 |
| Default | 0.002 | 0.400 | 0.540 |
| Estate agent | 0.610 | 0.878 | 0.880 |
| LIDAR | 0.227 | 0.749 | 0.800 |
| Land Registry | 0.000 | 0.000 | 0.920 |
| NROSH multipart | 0.004 | 0.169 | 0.800 |
| Overall | 1.000 | 0.833 | 0.844 |

Table 17: Accuracy and coverage for property_type

 

| Source | Coverage | Accuracy | Confidence |
|---|---|---|---|
| Banded VOA | 0.149 | 0.823 | 0.811 |
| DCLG | 0.080 | 0.955 | 0.940 |
| LIDAR | 0.771 | 0.903 | 0.900 |
| Overall | 1.000 | 0.895 | 0.890 |

Table 18: Accuracy and coverage for floors

Attribute Distribution Charts

The following charts show the distribution of values for selected fields, for domestic properties, not arising from the default model.

 

Figure 1: Distribution of property type

 

Figure 2: Distribution of number of bedrooms

 

Figure 3: Distribution of number of bathrooms

 

Figure 4: Distribution of building construction period

 

Figure 5: Distribution of number of floors

Direct Data Content

The following table shows the coverage with direct data for the five fields tested against groundtruth.

| Attribute | Percentage direct |
|---|---|
| Property type | 86.9 |
| Floors | 73.2 |
| Bedrooms | 53.7 |
| Bathrooms | 26.5 |
| Age | 9.3 |

Table 19: Percentage of data supplied from direct sources rather than modelled

Data Recency

Data recency for the Property Intelligence dataset is determined by a number of factors, listed below:

  • The build process for Property Intelligence takes approximately 2 months from start to delivery to customer with quarterly scheduled releases;
  • Individual datasets have a range of update frequencies: some are static and will never be updated, others are updated yearly, quarterly or monthly;
  • Two datasets, DCLG and Estate agent data, have property-level fields which indicate when an inspection was carried out so potentially day-level data on recency could be provided;
  • The LIDAR data is a composite dataset, 80% of which has been collected in the last 10 years;

The table below shows the dates of the datasets used in this version of Property Intelligence along with an indication of the expected update frequency.

| Dataset | Frequency | Date |
|---|---|---|
| DCLG | Quarterly | 2019-08-22 |
| Business services | Monthly | 2020-06-11 |
| Land Registry House Price Index | Monthly | 2020-06-25 |
| Land Registry Price Paid | Monthly | 2020-06-26 |
| English Heritage | Yearly | 2020-05-18 |
| Historic Environment Scotland | Yearly | 2020-05-18 |
| Cadw | Yearly | 2020-05-18 |
| NROSH | Once | 2016-12-12 |
| ONSPD | Quarterly | 2020-03-17 |
| ONS rural-urban classification | Once | 2016-12-12 |
| OS Open Rivers | Quarterly | 2020-03-20 |
| OS Open Roads | Quarterly | 2020-03-20 |
| Police.uk | Monthly | 2020-06-26 |
| Royal Mail | Monthly | 2020-06-26 |
| VOA | Yearly | 2019-03-20 |
| Estate Agent | Monthly | 2015-12-01 |

Table 20: Data recency and frequency by dataset

The Environment Agency started to systematically cover England for LIDAR measurement in about 2005 and has added very approximately 5% coverage in each year since then.
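To put the dataset dates into perspective, the sketch below computes the age in days of a few of the datasets listed in the recency table. The reference date of 2020-07-20 is the date of the accuracy run quoted in the Accuracy section and is used here purely for illustration; the names `DATASET_DATES` and `age_in_days` are hypothetical.

```python
from datetime import date

# Dataset dates copied from the data recency table (a subset).
DATASET_DATES = {
    "DCLG": date(2019, 8, 22),
    "Royal Mail": date(2020, 6, 26),
    "VOA": date(2019, 3, 20),
    "Estate Agent": date(2015, 12, 1),
}


def age_in_days(dataset_dates, as_of):
    """Return the age of each dataset in days, measured at the as_of date."""
    return {name: (as_of - d).days for name, d in dataset_dates.items()}


ages = age_in_days(DATASET_DATES, date(2020, 7, 20))
oldest = max(ages, key=ages.get)
print(ages)
print("Oldest dataset:", oldest)
```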

Attributions

This dataset contains Open Data, typically provided under the UK government's OGL3 licence; a requirement of this licence is that an attribution is provided for the data. The attributions are as follows:

  • DCLG: Contains public sector information licensed under the Open Government Licence v3.0.
  • LIDAR - Environment Agency: (c) Environment Agency copyright and/or database right (2019). All rights reserved.
  • LIDAR - Scottish Government: Crown copyright Scottish Government, SEPA and Scottish Water (2012)
  • LIDAR - Lle: Contains Natural Resources Wales data © Crown copyright and database right 2018. This data is licensed under the Open Government Licence v3.0.
  • Land registry Price Paid: Contains HM Land Registry data © Crown copyright and database right 2018. This data is licensed under the Open Government Licence v3.0.
  • Listed buildings England: Historic England (2020). Contains Ordnance Survey data © Crown copyright and database right (2020). The Historic England GIS Data contained in this material was obtained on 2019-11-05
  • Listed buildings Wales: Designated Historic Asset Descriptive Information, The Welsh Historic Environment Service (Cadw), 2019-11-05, licensed under the Open Government Licence http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
  • Listed buildings Scotland: Contains Historic Environment Scotland and Ordnance Survey data © Historic Environment Scotland - Scottish Charity No. SC045925 © Crown copyright and database right [2018]
  • Office for National Statistics: Contains National Statistics data © Crown copyright and database right [2018]
  • Office for National Statistics: Contains Royal Mail data © Royal Mail copyright and database right [2018]
  • Ordnance Survey: Contains OS data © Crown copyright and database right (2018)

Release Notes

July 2020

As a result of recent supplier changes we are withdrawing a number of fields including the geocode accuracy and red route fields. The Congestion Zone field will remain but not be populated.

We have withdrawn the Tenancy field as a result of other supplier licensing changes.

There are a number of fields which have not been populated for some time including multiplicity, outdoor area, building count and number of adult occupants.
As a result we have removed the following fields from this release:

  • geocode accuracy
  • red route
  • tenancy
  • multiplicity
  • outdoor area
  • building count
  • adult occupants

The Land Registry UK House Price Index has been suspended as of the April 2020 release, which was due to be published in June, because the impact of COVID-19 means that only limited transactions are occurring on which to base the Index. A Land Registry Bulletin describes this change. As a result, the estimated current value field will contain the estimated current value as at the last release of the House Price Index, 1st March 2020.
