Unzipping rasters in QGIS

Recently I walked the first half of the Cotswold Way (from North to South). While planning the second half, working out how many miles to attempt each day, I wanted to compare the terrain for the two halves of the walk.

The Ordnance Survey provide, as Open Data, a Digital Terrain Model (DTM) based on a 50m grid size, which is a good enough resolution for my purposes. Unfortunately the 158Mb download is for the whole of Great Britain. Another added complication is that the zip file contains a number of other zip files, one for each 10km tile. I wanted to automate the process of extracting only the necessary tiles and adding them as layers in QGIS.

The first stage was to get the 10km grid as a QGIS layer, to determine which tiles I needed. It is possible to download all of the OS grids as a geopackage. Here the 10km grid is shown along with the route of the Cotswold Way (downloaded as a gpx file from the National trails website). I have added a 1km buffer to the route since it will be useful to also get tiles very close, but not intersecting the route (e.g. tile ST89):

So, the aim is to get the names of those tiles that intersect the buffer and then extract those tiles from the zip file.

The first step is to find the intersecting tiles. In QGIS Python:

import processing, os
params = {'INPUT':'os_bng_grids 10km_grid','PREDICATE':'0','INTERSECT':'Buffered','METHOD':'0'}
processing.run("qgis:selectbylocation", params)
...

The buffer could have been created in Python but here I used the QGIS user interface. The parameters for the select by location tool are first defined in a dictionary (params). The INPUT is the name of the grid layer. The PREDICATE is chosen from:

0: intersect
1: contain
2: disjoint
3: equal
4: touch
5: overlap
6: are within
7: cross

The INTERSECT parameter is the name of the second layer in the relationship (the buffer) and the METHOD is selected from:

0: creating new selection
1: adding to current selection
2: selecting within current selection
3: removing from current selection

The result is:

Selected grid cells

Now we need to get, programatically, the names of the selected tiles:

...
# get the layer (NB this returns a list so get first (and only) item...
my_layer = QgsProject.instance().mapLayersByName('os_bng_grids 10km_grid')[0]
# get the selected features...
selectedGridCells = my_layer.selectedFeatures()
# Add the selected grid cell names to a list...
for gridCell in selectedGridCells:    
    tileList.append(gridCell.attributes()[1])
print(str(len(tileList)) + "tiles required")
...

Here, we simply get the selected cells in the specified layer and add their names, SO70, SP03 etc. to a list (tileList). The next step is to examine the zip file’s structure.

The zip file contains a subfolder, data, which itself contains a number of folders – one for each two letter OS grid code e.g nk, st etc. Within each of these folders there are further zip files – one for each 10km square (see above). Within these zip files are an .asc raster file and its associated files such as the .prj (projection) file. For example:

terr50_gagg_gb.zip > data > st > st76_OST50GRID_20230518.zip > ST76.asc, ST76.prj
terr50_gagg_gb.zip > data > st > st74_OST50GRID_20230518.zip > ST74.asc, ST74.prj

The method chosen is to loop through all of the file names in the original zip file and when a zip file containing the name of any of the selected tiles, e.g. ST76, is found then extract that file. It would be quicker to go directly to the zip file for the specific tiles but that would mean hardcoding the format of the file names e.g. st76_OST50GRID_20230518.zip, but it looks like the name contains a date; so our solution is slower but more reusable. We use the ZipFile module to do the unpacking:

...
from zipfile import ZipFile
outputFolder = "C:\\temp\\tiles"
rasterFileType = "asc" # this is the extension of the raster file
tileFiles = []
with ZipFile("C:\\temp\\terr50_gagg_gb.zip", 'r') as zObject:
    # for every file in the zip file, see if it has a name similar to any required tile names...
    for fileName in zObject.namelist(): 
        #check if the zipped file is in the list of required tiles...
        if any(tileName.lower() in fileName for tileName in tileList):                        
            # we don't want to preserve the folder structure so extract the filenames only
            fileInfo = zObject.getinfo(fileName)            
            fileInfo.filename = os.path.basename(fileInfo.filename)            
            zObject.extract(fileInfo, outputFolder)
            # now we have extracted the required zip file, extract everything from that...
            innerFileName = outputFolder + '\\' + fileInfo.filename            
            with ZipFile(innerFileName, 'r') as zObjectInner:                
                zObjectInner.extractall(outputFolder)
                # get the name of the .asc file for the tile...
            for innerFileFileName in zObjectInner.namelist():
                if innerFileFileName[-3:] == rasterFileType:
                    print (innerFileFileName)
                    tileFiles.append(outputFolder + '\\' + innerFileFileName)
            zObjectInner.close()
            # now delete the zipped zip file for the current tile...
            os.remove(innerFileName)
zObject.close()
...

Here we use the namelist method of Zipfile to extract the names of the files in the main zip (these will include other zip files, one per tile). Where there is a match, we extract the base filename only – using the getinfo method (since we don’t want to preserve the original folder structure but put all the extracted files into the same folder – here c:\temp\tiles).

The extract method is then use to extract the individual zip files to the destination folder – we specify the name of the file to extract and its destination. The next stage is to extract everything from the extracted zip file, this time using the extractall method. We also fetch the full name of the extracted .asc file, including the path (to a list, tileFiles), since we will need these to open the rasters in QGIS. Once we have extracted the files for the tile we can then delete its individual zip file from the destination folder (using the remove method of the os module).

We should then end up with a single folder containing all of the extracted files for all of the selected tiles. The next step is to add them to QGIS using the tileFiles list created early. Firstly we create a layer group, terrain50:

root = QgsProject.instance().layerTreeRoot()
tileGroup = root.addGroup("terrain50")

We can then add the rasters to that group, one at a time:

for tileFile in tileFiles:    
    newTileLayer = QgsRasterLayer(tileFile)
    QgsProject.instance().addMapLayer(newTileLayer, False)
    tileGroup.addLayer(newTileLayer) # add the layer to the group

This is slightly complicated by the fact that we want to load the raster straight into a group. Here we create the layer then add it to the map (with addMapLayer) but set the addToLegend parameter to False to stop it being displayed. We then add it to the group, displaying it in the process. The result will be something like this:

QGIS with rasters added

We could develop the script to create the buffer and merge the individual rasters, but these tasks are easy enough with the QGIS user interface. Once a single raster has been created, the QGIS Profile Tool plugin can be used to follow the elevation along the route.

The full script can be found on GitHub.

Contains OS data © Crown Copyright [and database right] (2023).