In With The New - A Geopackage Poke

SCT Part 1

Scrapping For Data

The City of Cape Town has a great, searchable place where data on and about the city is shared openly. The data is currently being migrated to a Portal for ArcGIS site, which will bring with it a wealth of features and functionality. For now (June 2018), I am happy with the ‘simple’ site. This post is about collating spatially formatted data on the City of Cape Town. (Well, as provided on the open-data portal.)

So to start out, I got city-wide base data and any apparently interesting themes from the Open Data Portal site, storing it locally for easy reach. The data was in compressed zip files, Excel spreadsheets, PDFs and ODS files. I just dumped these, after some processing, in one folder. Fortunately there was sanity to the file naming from source.

Creating a Geodatabase

I went with GeoPackage as the data storage/file format, so I could learn more about it and also because I have been hearing about it more often, most recently from FOSS4GNA 2018. I primarily used QGIS (Q from here on) and other tools I will mention. Here’s the step by step:

1. From the Open Data Portal, search and download the suburbs spatial data. This is available in zipped shapefile format.

2. Load the official planning suburbs (zipped) shapefile in Q, with the CRS (Coordinate Reference System) in Q set to EPSG:4326.

Suburbs in Q

3. Right click on the suburbs layer, then Save As… This leads to a dialog box with a format choice and other options. Choose GeoPackage as the format.

Save As Geopackage

On “Geometry”, Q has it as Automatic. (Q will correctly interpret this as polygon which it truly is.)

On “Extent”, choose layer, which defines the bounds of the City of Cape Town. (Well, I have prior knowledge of this: local context data.)

Layer Options:

Q has a tooltip for these fields, so hovering the mouse over one gives a hint of what the field is for.

Custom Options:

Conveniently, the data from the Open Data Portal comes with some metadata, so we use that to populate these fields. Better to build a comprehensive database from the ground up for future usage’s sake.

Layer Options

On OK, Q loads the GeoPackage and subsequently the layer. Q has made the export MultiPolygon and not just Polygon. (We will investigate this later.)

Our Geodata, The GeoPackage

To see what we just did, in Q, Load Vector data and point to the GeoPackage (cct_opendata.gpkg). Voila! Our Suburbs data is ‘geopackaged’, with only two fields: “fid” and “NAME”.

Inside The Geopackage

(From previous database fiddle experience, something about the “NAME” field in CAPS bothers me. We’ll deal with it later if need be.)

To take a peek into our GeoPackage, launch DB Manager from Q.

The GeoPackage in DB Manager

From here there is a plethora (I could be exaggerating) of GeoPackage versions. So let’s investigate which version we just created, so we know in case of eventualities as we build our datastore. We get a tip from here on how we can check a GeoPackage version.

With the GeoPackage opened in DB Manager, we query PRAGMA user_version in the SQL window.

Pragma User Version

A user_version of ‘0’ is not very indicative.

Querying PRAGMA application_id instead, we get a better result. Let’s interpret ‘1196437808’. From this guide we learn: “1196437808 (the 32-bit integer value of 0x47503130 or GP10 in ASCII) for GPKG 1.0 or 1.1”.

So our GeoPackage version is at least 1.0.
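The same check can be scripted outside DB Manager. Here is a minimal sketch using Python's built-in sqlite3 module. The function name `gpkg_version_info` is made up for illustration, and it assumes the file is a plain SQLite database, as the GeoPackage definition says.

```python
import sqlite3
from contextlib import closing


def gpkg_version_info(path):
    """Return (application_id, ASCII tag, user_version) for a GeoPackage file."""
    with closing(sqlite3.connect(path)) as conn:
        app_id = conn.execute("PRAGMA application_id").fetchone()[0]
        user_version = conn.execute("PRAGMA user_version").fetchone()[0]
    # application_id is a signed 32-bit integer, so mask it before
    # reading its four bytes as text.
    raw = (app_id & 0xFFFFFFFF).to_bytes(4, "big")
    tag = raw.decode("ascii", errors="replace")
    return app_id, tag, user_version


# The number reported by DB Manager spells "GP10" as four ASCII bytes.
assert (1196437808).to_bytes(4, "big") == b"GP10"
```

Calling `gpkg_version_info("cct_opendata.gpkg")` on the file built in this post should hand back the same 1196437808 with the tag GP10, plus the user_version of 0 seen earlier.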

While we are still at the SQL interface, let’s interrogate our ‘suburbs’ data.

How many suburbs are in our table?

SELECT COUNT(fid) FROM suburbs;

792 suburbs!
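For the record, the count can also be had without opening Q at all. This is a rough sketch with Python's sqlite3 module; `count_features` is a hypothetical helper name, and it assumes the layer table is called suburbs as above.

```python
import sqlite3
from contextlib import closing


def count_features(path, table):
    """Count the rows in one layer table of a GeoPackage."""
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    with closing(sqlite3.connect(path)) as conn:
        return conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
```

With the GeoPackage from this post, `count_features("cct_opendata.gpkg", "suburbs")` should agree with the number DB Manager gave.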

Fiddle

“A GeoPackage is a platform-independent SQLite [5] database file that contains GeoPackage data and metadata tables.” Source. From this we infer that we should be able to explore the GeoPackage some more with DB Manager as a SpatiaLite database.

Geopackage in Spatialite

More about those fields is explained here. Not for the faint-hearted. Gladly, we can just point Q at the GeoPackage and get our layer(s) to display!
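Since those metadata tables are ordinary SQLite tables, they can be read directly. The sketch below lists each layer with its geometry type and SRS, which is also one way to confirm the MultiPolygon export noticed earlier. The tables `gpkg_contents` and `gpkg_geometry_columns` come from the GeoPackage specification; the function name `list_layers` is invented for this example.

```python
import sqlite3
from contextlib import closing


def list_layers(path):
    """Return (table_name, data_type, geometry_type_name, srs_id) per layer."""
    sql = """
        SELECT c.table_name, c.data_type, g.geometry_type_name, c.srs_id
        FROM gpkg_contents AS c
        LEFT JOIN gpkg_geometry_columns AS g
               ON g.table_name = c.table_name
        ORDER BY c.table_name
    """
    with closing(sqlite3.connect(path)) as conn:
        return conn.execute(sql).fetchall()
```

The LEFT JOIN keeps non-spatial entries (attribute tables, for instance) in the listing, with an empty geometry type.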

Stacking The Store

To add more data to the GeoPackage, repeat the procedure described in Creating a Geodatabase above.

So I went on to load several layers from the Open Data Portal. After several clicks, some typing and 1.5 GB later: 22 spatial data layers! (These were pseudo-randomly selected. No particular preference, just a hunch that the data may help answer a question yet unknown.)

Data Together

So while scrolling through the data (in Q via DB Manager), I noticed a warning triangle: one of the layers did not have a spatial index. I opted to create one as prompted by Q. Q did it behind the scenes; I simply had to click a hyperlink.
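With 22 layers, hunting for that warning by eye gets tedious. A possible shortcut, sketched below, relies on the GeoPackage R-tree extension naming its index table `rtree_<table>_<geometry column>`. I am assuming Q's warning corresponds to that table being absent; `layers_without_spatial_index` is a made-up helper name.

```python
import sqlite3
from contextlib import closing


def layers_without_spatial_index(path):
    """List feature tables that have no rtree_<table>_<column> index table."""
    with closing(sqlite3.connect(path)) as conn:
        layers = conn.execute(
            "SELECT table_name, column_name FROM gpkg_geometry_columns "
            "ORDER BY table_name"
        ).fetchall()
        # Virtual tables such as the R-tree index are listed with type 'table'.
        tables = {
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        }
    return [t for t, col in layers if f"rtree_{t}_{col}" not in tables]
```

This only reports the gap; creating the index is still easiest through the hyperlink Q offers.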

Using the SQLite Database Browser, we can get more insight into the GeoPackage.

PostScript

While browsing to the GeoPackage location, I couldn’t help noticing two extra files.

Extra Files

I found out these are temporary files used by SQLite. For a moment I thought the multi-file legacy of the Shapefile was back.
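To see this behaviour in isolation, here is a small experiment in Python. It assumes the two extra files were SQLite's write-ahead-log companions (names ending in -wal and -shm), which is a guess on my part since only a screenshot recorded them. The throwaway file `demo.gpkg` and the function name are hypothetical.

```python
import os
import sqlite3
import tempfile


def sidecar_files_while_open(directory):
    """Create a throwaway WAL-mode database and list the files beside it."""
    db_path = os.path.join(directory, "demo.gpkg")
    conn = sqlite3.connect(db_path)
    try:
        # Write-ahead logging makes SQLite keep -wal and -shm companion files.
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute("CREATE TABLE t (fid INTEGER PRIMARY KEY)")
        conn.execute("INSERT INTO t DEFAULT VALUES")
        conn.commit()
        # List the directory while the connection is still open.
        return sorted(os.listdir(directory))
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    print(sidecar_files_while_open(tmp))
```

While the connection is open, the listing should include the database plus its -wal and -shm companions. They normally disappear once the last connection closes cleanly, so the single-file promise still holds.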

I used:

Refs:

1. FOSS4GNA 2018 Geopackage Presentation

2. Geopackage Website

3. Fulcrum’s Working With GeoData

4. More On GeoPackage

5. SQLite Browser Site