Cape Flood Safe

SCT Part 3

(18.42281, -33.95947)

So, those are the numbers (coordinates) to keep and punch into the satellite navigator in the unlikely event of the Atlantic Ocean bursting its Cape Town shores. If you stay in Cape Town, the first place that comes to mind when you hear ‘Flood Survival’ is likely Table Mountain and rightly so. But, which spot exactly can one stand longest on solid earth in the event of a massive flood?

Cut To The Chase: The Safe Strip

How To DEM

Get Data

Relief maps always get my attention. The mystery of attempting to model reality on the computer enthuses me. There are plenty tutorials on the net on how to do this so here’s my version.

I got some vector base data (Metro Boundary, Suburbs, Roads, Railway Line, Building Footprints and a DEM - Digital Elevation Model ) off City of Cape Town’s Open Data Portal or here. The vector data, I loaded into a geopackage. How? Of special interest was the DEM, described as “Digital model (10m Grid) depicting the elevation of the geographical surface (Bare Earth Model) of the Cape Town municipal area.” The Open Data Portal has the data stored as 10m_Grid_GeoTiff.zip.

Style Terrain

After extracting the compressed elevation data, I loaded it in QGIS.

  1. Load 10m_BA,
  2. Style the ‘relief map’ in Properties –> Style

Pseudo Coloured DEM Settings

The raster would then look as shown below

Pseudo Coloured DEM

  1. Now to create a hillshade. Do Raster –> Analysis –>DEM (Terrain Models)…

    With the settings shown below create a Hillshade DEM

Hillshade DEM Creation Settings

This gives us a hillshade …

Hillshade DEM

  1. To get us a Terrain Map. Set transparency in the Relief Map (10M_BA). RightClick –> Properties –> Transparency

Pseudo Coloured DEM Settings Transparency

  1. Now a combined view of The Relief and Hillshade …

Relief Map - Combined Relief and Hillshade

  1. Now Load the other support vector layers
    • Metro Boundary (Flood Plane)
    • Suburbs
    • Roads
    • Railway Line
    • Building Footprints

All Layers Loaded

Getting 3D

To get started with 3D Terrain. Load and activate the Qgis2threejs Plugin.(Read more). (This is one approach to getting 3D in QGIS. At the time of writing this post, 3D comes native with QGIS. Read (more.). But, I was using QGIS 2.18.7, Portable gotten from here.

The Qgis2threejs Plugin is unique in that it bootstraps the process to have a web ready 3D model to play with.

  1. Launching the Plugin for the first time should give something like this …

Qgis2threejs First Launch

  1. Zoom to Cape Town CBD and include all of the table of Table Mountain.

DEM focus area

Now, to prepare for exporting the subject area, emulate the following series of settings paying attention to the parts World, DEM, Roads,

Export Settings

Export Settings

Export Settings

Export Settings

Export Settings

Export Settings

Export Settings

The Surburbs Layer is used mainly for labelling purposes in the final export. We are not interested in showing the Suburbs boundaries for now.

Through Trial and Error and guided by the absolute height of the DEM, we get the optimum height value, 1 060m, to use for our Flood Plane (Which in essence is just a polygon covering all of our area of interest - Metro Boundary in this case).

Export To Web

In order to have an interactive 3D Map simulating a flood.

  1. Ensure 3DViewer(dat-gui).html is chosen under the Template file in the Qgis2threejs Plugin.

  2. Choose the appropriate path and name for the index html for the visualisation. In this case CCTerrain.html is chosen.

  3. Export and Qgis2threejs will generate the necessary style sheets and other files for the export.

  4. Now using a text editor, edit parts of the exported html file (CTTerrain.html) to reflect what the project is about and put in some usage information.

A Preview below - full screen here

#PostScript - Find The Safe Spot

Click around the map to turn on/off the layers.

Relations, I Mean, Tables Galore

SCT Part 2

Embracing Text

In the previous blog, we explored how we can load (explicitly) spatial data, initially in shapefile format, to a geopackage. In this post we explore the loading of relations a.k.a table to the geopackage. Again, data from the Open Data Portal is used.

Data Is Rarely Clean!

We will explore “Data showing City land that has been disposed of.” The data is spread over several spreadsheets covering various time periods. In some instance yearly quarters. The data custodian is given as “Manager: Property Development & Acquisitions city of capetown.

In the data wrangling process using LibreOfficeCalc, the open source equivalent of MS Excel. Several attributes will be learnt about the data. Especially that which hinders a refined data structure. Issue will be varied, among them:

  • Not all records being filled in.
  • No consistency to period of registrations.
  • Inconsistent formatting of currency.
  • Sale price included or excluded VAT inconsistently.
  • Typos in area names.
  • A regions field which definition could not be got from the Open Data Portal.

To improve the data we may end up

  • Adding some fields to improve schema
  • Concluding area field = Suburb Name (per suburb_layer from previous blog post)
  • Matching the area field names to match up with Suburb Name in the Base Suburbs reference data we adopted.This procedure inadvertently modifies data from it’s original structure. However, knowledge of local suburb boundaries and place locations means the impact will be minimised.

Casual inspection reveals

  • A lot of sales to churches and trusts. Possible follow-up questions would be. Was this influenced by any legislation in a particular year? Which year saw the largest number of sales to churches?

Loading Tabular Data To Geopackage

With the Land Disposals data, covering several years, curated and consolidated into one spreadsheet. It is time to import this data into our geopackage. Here are the steps to do it:

1. From the Spreadsheet application (OpenOfficeCalc), Save file as and choose Text CSV (Comma Separated Values).

To load the Text File to the geopackage we use any one of the two approaches.

Method 1: Using QGIS

1. Start Q if not already running.

2. Select Add Delimited Text. This is a large comma icon with a (+)plus sign. When you load the file, it will likely look like the screenshot below. (Note that the image shown here is of a file that would have undergone considerable pre-processing. Especially on field names and data types in cells.)

Delimited Text Import in Q

Make the selections as shown above. In particular ‘No Geometry’.

Opening the table in Q we see the data we just imported.

CSV in QGIS

Now let’s save this to the GeoPackage.

3. Rick Click the Layer name, –> Save As. Then specify the location of the geopackage. In the dialog box that comes up, under Geometry, choose No Geometry. (We are just importing a non-spatial table.).

Save to GeoPackage

That’s it!

Let’s explore an alternative way of getting the text file into the GeoPackage.

Method 2: Using SQLite Browser

1. Open the GeoPackage in the SQLite Database Browser.

2. Do File –> Import –> Table from CSV. Check on; Column Names in first line and Trim fields?. Separator is comma.

Import to SQLite

Done!

3 Switch to the Browse Data tab to see what we have just loaded.

SQLite Data Browser

A Review: What To Choose

On inspecting the table loaded with QGIS and SQLite Browser. Somehow Q correctly and auto-magically distinguish text and numeric field data types. SQLite Browser made everything type “TEXT” in this particular case.

Let’s experiments some more with our data.

Close SQlite Browser and switch to QGIS. (It is good practice to close one application when accessing data from another. Particularly if you are to do editing.)

Launch DB Manager and Browse to the geopackage. Let’s see how many distinct suburbs do we have in this table. Start the SQL Windows and run…

SELECT DISTINCT  area FROM  land_disposals_2009_2018
ORDER by area ASC;

we get 102 records . We see there is NULL in first entry so we really have 101 records. On close inspection we notice there are a lot of typos in some names.

typos in data

We will need to clean this data if we are to have meaningful analysis later on.

Let’s check the suburbs layer and see how the data compares

SELECT DISTINCT  NAME FROM  suburbs
ORDER by NAME ASC;

We get 773! This is gonna be interesting, especially getting exact matches.

Before cleaning the data further, lets see if we can ‘connect’ the suburbs and land_disposals datasets.

SELECT  * FROM suburbs
JOIN land_disposals_2009_2018
ON suburbs.NAME = land_disposals_2009_2018.area;

Join operation

Yep! We have some hits.

We are going to be basing our analysis on the suburbs layer so we use that layer for a base and format subsequent datasets after that it as much as possible. Cleaning can be done in QGIS user interface editing or better still in the sqlite database (geopackage) using SQL.

#PostScript

The take away is that GeoPackage can store tables, no sweat.

The join operation used above is simplistic. One would need to appropriately use LEFT, RIGHT, INNER JOIN.

I used:

  • QGIS 2.18 (DB Manager)
  • DB Browser for SQLite 3.10.1
  • SublimeText 3.1.1

Saving A One to Many 'Join' in QGIS

When A Relate Won’t Do

While working on the next part of my series of blogs I ran into an interesting ‘discovery’ which warrants an interjection. Testing the data loads into a geopackage, I decided to join a spatial and attribute data AND retain the results as a new ‘layer’. Tools in the QGIS Processing didn’t help much. So I Googled for a solution. To my surprise I got a few first time hits. I learnt MapInfo can do it so can ArcMap. There we also suggestions of having a relate in QGIS. But, that’s not what I sought.

Buried in a Google Forum was the solution by Thomas McAdam. So here’s how I ended up doing it … so we have one more source for the solution on the web.

Love The Database

A pre-requisite to the procedure is to have the data stored in some database and that’s not scary at all. I explain one way to doing that here creating a Geopackage. You can also create a SQLite Database as explained here. You don’t even have to leave the comfort of Q while doing it.

A One To May (1-M) Join

With the objective of joining two datasets and keeping the results. The assumption is that your data is clean and you have common fields between your two datasets.

  • Table 1: suburbs - Has polygons of suburbs I wish to join (One Record).
  • Table 2: land_disposals_2009_2018 - which has sales records per suburb (Many Records).

Two tables

Now, with the target datasets loaded in a ‘spatial database’ as suggested above.

  • From within Q, Launch DB Manager to access the geopackage (in my case, explained here.)

  • Select the appropriate database/ geopackage then
  • Launch the SQL Window.

SQL Window

  • Within the window write SQL Statement to JOIN our suburbs with the land_disposals_2009_2018 table.

Which translates to

SELECT  *
FROM suburbs
JOIN land_disposals_2009_2018
ON suburbs.alt_name = land_disposals_2009_2018.area;

Simply put this says, “Basing on the common valued fields alt_name and area of the suburbs and land_disposals_2009_2018 layers respectively, join these two.”

  • Click on Execute to see what results this gives. (An inspection will help verify the legitimacy of the operation we just ran.)

  • Tick the Load as new layer Check Box near the bottom of the SQL Window.
  • Tick the Geometry column Check Box. (Selecting a value that looks/ read something like geom).

  • Now click on Load now! to have the results display in Q.

Load Results to Q

Results, If You Please.

On clicking Load Now! The Q canvas will fill with spatial, familiar ground!

To Save the results, Right Click the layer name in Q, Save As and we can save to any format supported by Q to our heart’s content.

#PostScript

The JOIN used here is an overly simplified approach to tip-top SQL queries. I would recommend intensive reading up on the JOIN operation as things can really go wrong surreptitiously.

In With The New - A Geopackage Poke

SCT Part 1

Scrapping For Data

The City of Capetown has a great and searchable place where data on and about the city is shared openly. The data is currently being migrated to a Portal for ArcGIS site which will bring with it a wealth of features and functionality. For now (June 2018), I am happy with the ‘simple’ site. This post is about collating spatially formatted data on the City of CapeTown. (Well, as provided on the open-data portal.)

So to start out I got city-wide base data from the Open Data Portal site and any apparently interesting themes, storing it locally for easy reach. The data was in compressed zip files, excel spreadsheets, pdfs and ODS files. I just dumped these, after some processing, in one folder. Fortunately there was sanity to the file naming from source.

Creating a Geodatabase

I went with GeoPackage for data storage/file format. So I could learn more about it and also because I have been hearing about it more often. Recently from FOSS4GNA 2018. I used primarily Q, QGIS and other tools I will mention. Here’s the step by step:

1. From the Open Data Portal, search and download the suburbs spatial data. This is available in zipped shapefile format.

2. Load the official planning suburbs data (zipped) shapefile in Q. The CRS(Coordinate Reference System) in Q set to ESPG:4326,

Suburbs in Q

3. Right click on the suburbs layer, then Save As…leads to a dialog box of format choice and other options. Choose GeoPackage for a format.

Save As Geopackage

  • Select FileName and browse to the appropriate location. Enter the Geopackage Name (cct_opendata.gpkg).
  • Edit Layer name (suburbs). (Taking out the original Planning and Official to reduce characters in the layer name). CRS Auto populates to EPSG:4326, WGS84.

  • (Encoding is Grayed out at UTF-8 which is okay. we chose this when we loaded the suburbs shapefile).

  • Expanding the “Select fields to export and their export options”. Tick off SHAPE_leng and SHAPE_area. (Area and Length are not really useful especially when using a Geographic Coordinate System. Additionally, these where values calculated in the original gis software the data was exported from so we drop them to avoid confusing ourselves when using this data in our new geodatabase we are building.)

  • Keep “Add saved file to map”. (So we will be able to see if our export went well).

  • There is no Symbology to export so skip that part. (We have not done any styling to our data yet which we would want to keep.)

On “Geometry”, Q has it as Automatic. (Q will correctly interpret this as polygon which it truly is.)

On “Extent”, Choose layer. Which defines the bounds of City of Capetown .(Well, I have prior knowledge of this. Local context data)

Layer Options:

Q has a tooltip for these fields. So hovering a mouse gives a hint what a field is for.)

  • DESCRIPTION: set to “Official Planning Suburbs”
  • FID:fid (Kept as is. Means Feature IDentity we infer.)
  • GEOMETRY NAME: geom (Shortened from geometry. A matter of preference from previous experience. Conveniently short when writing out SQL statements.)
  • IDENTIFIER: Suburbs
  • SPATIAL _INDEX: YES (This helps in speeding things up when doing queries and spatial stuff with this layer.)

Custom Options:

Conveniently, the data from the Open Data Portal comes with some metadata. So we use that to populate these fields. Better build a comprehensive database from the ground up for future usage’s sake.

Layer Options

On OK.Q loads the GeoPackage and subsequently the layer. Q has made the export MultiPolygon and not just Polygon. (We will investigate this later.)

Our Geodata, The GeoPackage

To see what we just did. In Q, Load Vector data. Point to the GeoPackage (cct_opendata.gpkg ) and Voila! Our Suburbs data ‘geopackaged’ and with only two fields “fid” and “NAME”.

Inside The Geopackage

(From previous database fiddle experience, something about the “NAME” field in CAPS bothers me. We’ll deal with it late if need be.)

To take a peek into our GeoPackage Launch DB Manager from Q.

  • From Menu –> Database –>DB Manager –>DB Manager

  • Right Click on GeoPackage.–>New Connection. Point to the GeoPackage (cct_opendata). A connection is added and therein is the suburbs layer we just added.

The GeoPacahe in DB Manager

From here there is a plethora (I could be exaggerating) of GeoPackage versions. So lets’s investigate which version we just created so we know in case of eventualities as we build our datastore. We get a tip from here on how we can check a geopackage version.

With the GeoPackage opened in DB Manager.

  • Launch the SQL Window (Database – >SQL Window).
  • Then type
    PRAGMA user_version;
    

    then Run (F5) the query.

Pragma User Version

User_version ‘0’ is not very indicative.

  • Run the following query in stead
    PRAGMA application_id;
    

    then Run (F5) the query.

We get a better results. Let’s interpret ‘1196437808’. from this guide we learn “1196437808 (the 32-bit integer value of 0x47503130 or GP10 in ASCII) for GPKG 1.0 or 1.1”

So our geopackage version is atleast 1.0.

While we are still at the SQL Interface. Let’s interrogate our ‘suburbs’ data.

How many suburbs are in our table?

SELECT Count(fid) FROM suburbs, 

792 suburbs!

Fiddle

“A GeoPackage is a platform-independent SQLite [5] database file that contains GeoPackage data and metadata tables.” Source. From this we infer that we should be able to explore the GeoPackage somemore with DB Manager as a SpatiaLite database.

  • Right Click SpatialLite icon —>New Connection, then point to cct_opendata.gpkg. Expanding it reveals the contents.

Geopackage in Spatialite

More about those fields is explained here. Not for the faint hearted. Gladly we can just point Q at the geopackage and get our layer(s) to display!

Stacking The Store

To add more data to the GeoPackage. Repeat the procedure described in Creating a GeoDatabase above.

So I went on to load several layers form the Open Data Portal. After several clicks, some typing and 1.5Gigs later, 22 spatial data layers! (These were pseudo-randomly selected. No particular preference just a hunch that the data may help answer a question yet unknown.)

Data Together

So while scrolling through the data (in Q via DB Manager),I noticed a triangle warning one of the layers did not have a spatial index. I opted to create one as prompted by Q. Q did it behind the scenes I simply had to click a hyperlink.

Using the SQLite Database Browser, we can get more insight into the GeoPackage.

#PostScript

While browsing to the geopackage location. I couldn’t help noticing two extra files

Extra Files

I found out these are temprary files used by SQLite. For a moment I thought the multi-file legacy of the Shapefile was back.

  • wal - Write-ahead Log File –*“the purpose of the WAL file is to implement atomic commit and rollback”**
  • shm - Shared memory file – *“The only purpose of the shared-memory file is to provide a block of shared memory for use by multiple processes all accessing the same database in WAL mode.”**

I used:

  • QGIS 2.18 (DB Manager)
  • DB Browser for SQLite 3.10.1
  • SublimeText 3.1.1

Refs:

1. FOSS4GNA 2018 Geopackage Presentation

2. Geopackage Website

3. Fulcrum’s Working With GeoData

4. More On GeoPackage

5. SQLite Browser Site

A City With Data

SCT Part 0

Driven To It

At one geo-software hands-on training I attended. A colleague alluded to the instructor, during feedback time, that it would be great if the training providers used data with a local context. In defence the instructor indicated the concept(s) being taught remained the same. I couldn’t help thinking though how I struggled remembering place names of the data we wrangled. Yes, it had a location but it would have been even better if it were from a place I knew. That somehow takes away a layer of learning hindrance.

I also recently (May 2018) bumped into a job advert for a GIS/Location Intelligence Market Analyst.It was (is) a fascinating job, requiring an intriguing skill set. I struck on the idea of evaluating how well I would fare if I was hired for such a post.[Personal box ticking]. Additionally, constantly in my tweeter feeds. There is much chatter on Data Science, Data Analyst, R, TidyVerse, SQL, JavaScript, Visualisations …[well, must be who I chose to follow, but the fact that I hadn’t unfollowed them was an indication of interest in the topic(s) . ]

So, I will start on a Blog Series - Spatial CapeTown (SCT) with the objective of, well, having a more focused approach to my blogging:

  • Improve my writing skills. (Writing tutorials/ reports that are easy to follow).
  • Develop and improve work flows while using exclusively FOSS. (So that the workflows are reproducible, software availabiity angle)
  • Data massage and honing. (Local context data - the City of Cape Town in particular).
  • Improve data analysis and Interpretation skills (Draw insights about the City from the data).

As a spin-off I hope to have fun and quench my insatiable desire for cool visualisations and code in software packages.

CCT contours in Q

The City of Cape Town has a magnificent spatial data viewer from where one can find a ton of information about the city. If you are a tinkeror though, you also want access to the raw data. Frame your own questions and answer them yourself or answer unasked questions while playing with the data . So this series of blog posts is an alternative window to the wealth of spatio-temporal information about the City of Capetown. It will border on data manipulation to statistical inference about matters and phenomena.

On twitter ~ #spatialcct

#PostScript - Disclaimer

The proceeding work is no way authoritative and is purely my work in my personal capacity and not representative of City of Cape Town authority. It is neither official communication nor authoritative data dissemination. Consider it a weekend hack by a fanatic in some garage with data gotten off the internet.