A Little Less Water

Water Waste to Water, Less?

This is a pseudo-follow-up post to A Point About Points. Cape Town is still in the midst of a drought, howbeit with winter rains bringing some relief. As further mitigation, Level 4b water restrictions were introduced from 1 July 2017, whereby each individual should use no more than 87 litres per day. With this information all about, I could not resist the itch to know where the water sources in Cape Town were, well, geographically that is. The Centre for Geographical Analysis at Stellenbosch University started a satellite image time-series of the main dams supplying Cape Town with water. What a brilliant idea! But where were these 6 main (and 8 minor) water sources in the Western Cape? That is what I sought to answer.

You can - Cut To The Chase!

Think Water, Find Dams

Think Water, Think Less Than 87L per day

To get me started I got the dam names off the City of Cape Town’s Open Data Portal. The search term “dams” yields a CSV file of dam levels (Dam levels update 2012-2017.csv). This listed all 14 water supply sources, with their storage capacities. I Googled the locations for these and ended up with X, Y coordinates for each.
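Just to illustrate that step, here is a minimal Python sketch that would turn such a CSV of dam names plus googled coordinates into GeoJSON points. The file name dams.csv and the columns dam_name, capacity_ml, lon and lat are my own placeholders, not the portal file’s actual field names.

```python
import csv
import json

# Build a GeoJSON FeatureCollection of dam locations from a CSV.
# Assumed columns: dam_name, capacity_ml, lon, lat (hypothetical names).
features = []
with open("dams.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [float(row["lon"]), float(row["lat"])],
            },
            "properties": {
                "name": row["dam_name"],
                "capacity_ml": row["capacity_ml"],
            },
        })

with open("dams.geojson", "w", encoding="utf-8") as f:
    json.dump({"type": "FeatureCollection", "features": features}, f, indent=2)
```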

I planned on having a story map of sorts to help the ‘gazer’ have an idea where the water sources were. I settled for JackDougherty’s template on GitHub. This was particularly attractive because it had LeafletJS in the mix and some GeoJSON. My mind was set on the map portion of this excellent visualisation, which proved to be too complex to emulate. I also considered Odyssey.js, which turned out to be unpolished and seemingly abandoned as a project.

It didn’t take long to have the scrollable map-story for the dams, after following the instructions on the GitHub repo and editing a few lines in Sublime Text.

See Full Version

Credits

The following sources were used for the Dams Narratives.

  1. Cape Water Harvest

  2. Hike Table Mountain

  3. The Table Mountain Fund

  4. Cape Trekking

and the City of Cape Town website for dam capacities.

Sketch Pad or Formatter

Love to Doodle?

I tweeted a workflow doodle of what I was doing geo-wise in my day job. The workflow, aka algorithm, did produce the desired results when I executed it. In retrospect, however, I asked myself whether I really had to, and whether I actually enjoyed doodling. What was wrong with working with models (ESRI’s Model Builder) on the computer? Was doodling not ‘wasting’ precious time, since I had to recreate the thing in the geo-software later on anyway? Add to that, I was the sole workman on this project, so I had the liberty to transfer thought to implementation without the need to bounce the idea off a co-worker. Well, whiteboards are a great idea, especially when there’s a need for team collaboration. Not to mention the departure from the boxing-in effect a computer system or software tool may have on the thinking process.

The Handicap

Mine wasn’t (isn’t) just a matter of preferring to doodle on papyrus first. I then realised the doodling was born in the past, dating back to the time I was first introduced to the word processor, Corel WordPerfect on a Windows 95 laptop, circa 2000. A computer, I was made to believe (and also observed many treating it so), was a formatter of work. It was meant to make one’s work look neater and cleverer; add fancy text formatting to that and you had a document you could look at and admire the ‘look’ of things, never mind the worth of what was being presented. So I would spend hours writing out stuff on paper first, for later transfer onto ‘The Computer’ (well, call it an electronic typewriter).

The idea that a personal computer was a tool crept in two years later, when I was introduced to programming, in Pascal. Only then did it start to dawn on me that you could have additional text on the screen as you interacted with the PC. But then, PCs at the college were communal, so back to doodling again, for when I got a chance at the keyboard.

So that’s how I fell in love with doodling. One shouldn’t take their mistakes with them to the computer, you see. So, in working the habit off, it’s now a case of muddling along on the machine. It is a tool, and it’s not like it will explode or anything.

A Point About Points

Point or Line?

On 27 February 2017 the City of Cape Town released a list of the top 100 water consumers in the city. Cape Town lies in a water-scarce region and is (Q1, 2017) in the grip of a severe drought. Among other measures like water usage restrictions, part of mitigating the water crisis was the publication of the list of ‘heavy users’ above. Several media houses ran with the story (and list), with some including a points map of these water users’ streets.

The top water wasters were named by street name and not street address. The ‘offenders’ points maps (that I’ve come across) serve an excellent purpose in pinpointing the location of the subject street. A point, however, gives the impression of ‘on-this-spot’. There is also, as an attribute of the point, the subject street name to aid the map user. Still, a point map exposes our species of gazers to misinterpretation. The camouflaged water users cannot be quickly and easily identified via some buffer of the points plotted, but rather by buffering the entire ‘offenders’ street. The offending household lives ‘along-this-street’ and not ‘around-this-point’.

Well, enough read? You can - Cut To The Chase

For The Love Of Eye-Candy

Well, I’ve learnt (and come to realise too) that the value of spatial data decreases with the passage of time, but does good cartography follow the same trend? (A discussion for another day.) For the love of better cartography, I decided to come up with an improved version of the Top 100 Water Users Map.

Get The List

I got the list from the News24 website. From the way it was formatted, I quickly concluded this would be a good candidate for a CSV. 100 lines in Notepad++ is not a lot to manipulate and format.
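The Notepad++ clean-up could just as well be scripted. A rough Python sketch, assuming each pasted line looks like the entries quoted further down (rank, street, suburb, then the usage in litres); the file names are hypothetical.

```python
import csv
import re

# Parse lines like "89. Hofmeyr Street, Welgemoed - 143 000 litres"
# into rank, street, suburb and litres (assumed line format).
pattern = re.compile(r"^(\d+)\.\s*(.+?),\s*(.+?)\s*-\s*([\d\s]+)\s*litres$")

rows = []
with open("top100_raw.txt", encoding="utf-8") as f:
    for line in f:
        match = pattern.match(line.strip())
        if match:
            rank, street, suburb, litres = match.groups()
            rows.append([int(rank), street, suburb, int(litres.replace(" ", ""))])

with open("top100.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["rank", "street", "suburb", "litres"])
    writer.writerows(rows)
```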

Make It Geo

Address geocoding used to be a costly and rigorous process. Nowadays there are several options for turning address descriptions into vector data. QGIS has a geocode option via the MMQGIS plugin. Since I had determined to have the data visualised in CARTO, I moved the streets data there for address geocoding. In seconds I had points!
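For anyone without a CARTO account, the same idea can be sketched with a free geocoder. This is not CARTO’s geocoder, just an illustrative stand-in using geopy’s Nominatim, with made-up file names.

```python
import csv
import time

from geopy.geocoders import Nominatim  # stand-in geocoder, not CARTO's

geolocator = Nominatim(user_agent="cape-town-water-map")

with open("top100.csv", encoding="utf-8") as src, \
        open("top100_geocoded.csv", "w", newline="", encoding="utf-8") as dst:
    reader = csv.DictReader(src)
    writer = csv.writer(dst)
    writer.writerow(["street", "suburb", "litres", "lon", "lat"])
    for row in reader:
        query = f"{row['street']}, {row['suburb']}, Cape Town, South Africa"
        location = geolocator.geocode(query)
        if location:
            writer.writerow([row["street"], row["suburb"], row["litres"],
                             location.longitude, location.latitude])
        time.sleep(1)  # be polite to the free service
```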

Georeference In CARTO

So in minutes I had the same Points Map as the media houses!

Street Points In CARTO

On inspecting the point data - look, spatial duplicates! These would slip past the untrained eye when looking at the text data, especially when sorted/ranked by consumption. This is where spatial representation shines. The following duplicates (street-wise) were found:


89. Hofmeyr Street, Welgemoed - 143 000 litres
94. Hofmeyr Street, Welgemoed - 123 000 litres

4. Upper Hillwood Road, Bishop’s Court - 554 000 litres
71. Upper Hillwood Road, Bishop’s Court - 201 000 litres

23. Deauville Avenue, Fresnaye - 334 000 litres
29. Deauville Avenue, Fresnaye - 310 000 litres
100. Deauville Avenue, Fresnaye - 116 000 litres

69. Sunset Avenue, Llandudno - 204 000 litres
95. Sunset Avenue, Llandudno - 122 000 litres

46. Bishop’s Court Drive, Bishop’s Court - 236 000 litres
97. Bishop’s Court Drive, Bishop’s Court - 119 000 litres


This ‘repeat’ of a street name changes the way the data would be represented, since consumption is presented by street name. For Hofmeyr Street, for instance, the combined consumption would be 266 000 litres. (At the back of our minds we stay cognisant of the fact that the data was presented by street to mask individual street addresses.)
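Summing those duplicates is trivial once the list is in a table. A quick pandas sketch, using my assumed column names from earlier:

```python
import pandas as pd

# Aggregate consumption per street + suburb so spatial duplicates collapse
# into one record (e.g. Hofmeyr Street, Welgemoed: 143 000 + 123 000 = 266 000).
top100 = pd.read_csv("top100.csv")
per_street = (top100.groupby(["street", "suburb"], as_index=False)["litres"]
              .sum()
              .sort_values("litres", ascending=False))
per_street.to_csv("consumption_per_street.csv", index=False)
```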

The geocoding in CARTO was also not 100% accurate (I would say 95%). I had to move a few points to sit exactly over the subject street in the subject suburb. Not CARTO’s geocoder’s fault at all; my data was too roughly formatted for a really good geocoding hit rate.

Street = Line

Now, to get my streets (spatial) data, I headed over to the City of Cape Town Open Data Portal. Road_centrelines.zip holds the street vector data.

In CARTO I exported the points data, to be used to identify the street ‘line’ data.

Within QGIS, the next task was to select the subject streets, guided by the points data. The (un)common spatial question:

Select all the streets (lines) near this point (having the same street name).

To answer the above question, a ‘Select by Attribute’ was done on the street geo-data. Formatting the expression with the 100 street names was done in a text editor (Notepad++), but the expression was executed in QGIS. Some subject streets (e.g. GOVAN MBEKI ROAD) spanned more than one suburb and had to be clipped appropriately. Again I headed over to the City of Cape Town Open Data Portal to get the suburbs spatial data (Official planning suburbs 2016.zip).
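Building that expression by hand in Notepad++ works, but it can also be generated. A small sketch that writes out the QGIS ‘Select by Attribute’ expression, assuming the street name field in the centreline layer is called STR_NAME (a placeholder) and doubling apostrophes so names like FISHERMAN’S BEND don’t break the quoting:

```python
import csv

# Build a QGIS "Select by Attribute" expression from the list of street names.
# STR_NAME is an assumed field name in the road centreline layer.
with open("consumption_per_street.csv", encoding="utf-8") as f:
    names = sorted({row["street"].upper() for row in csv.DictReader(f)})

quoted = ", ".join("'{}'".format(name.replace("'", "''")) for name in names)
expression = '"STR_NAME" IN ({})'.format(quoted)

with open("select_streets.txt", "w", encoding="utf-8") as f:
    f.write(expression)
```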

I used these to clip the road segments to the corresponding suburbs. I further merged small road sections to ease the process of assigning attributes from the points to the streets. A field to store the combined consumption for each street was also created.

A Buffer operation on the points data was done, followed by a Join Attributes by Location operation to transfer attributes from the points to the lines. An edit was made for the spatial duplicates to sum the consumption totals.
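The same buffer-and-join step could be done outside the QGIS dialogs too. A rough geopandas equivalent of what I did, assuming a recent geopandas, placeholder file names, street and suburb fields on the streets layer, a metric CRS and a 50 m buffer distance (all my own assumptions):

```python
import geopandas as gpd

# Load the selected street centrelines and the geocoded consumption points.
streets = gpd.read_file("subject_streets.gpkg").to_crs(epsg=32734)  # UTM 34S, metric
points = gpd.read_file("top100_points.gpkg").to_crs(epsg=32734)

# Buffer the points (assumed 50 m) so they overlap their street segments.
buffered = points.copy()
buffered["geometry"] = buffered.geometry.buffer(50)

# Join attributes by location: streets pick up litres from intersecting buffers,
# then duplicates on the same street are summed into a combined total.
joined = gpd.sjoin(streets, buffered[["litres", "geometry"]],
                   how="left", predicate="intersects")
totals = joined.groupby(["street", "suburb"], as_index=False)["litres"].sum()
streets_with_totals = streets.merge(totals, on=["street", "suburb"], how="left")
streets_with_totals.to_file("streets_consumption.gpkg", driver="GPKG")
```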

The streets vector data (lines) will not have a usage position assigned to them; the data has been aggregated and totals per street are now in use. This reduces the data count from 100 to 93 (6 repeats, 1 street not found).

Show It

Back in CARTO, I imported the edited streets data for mapping and styling. Wizards in CARTO make mapping trivial for the user. The resultant map is below.

I chose the Dark Matter basemap in order to highlight the subject streets. An on-hover info window option was chosen, in keeping with the ‘gazing species’.

Perhaps Areas (Polygons) ?

I am somewhat satisfied with the way the data looks when represented as lines (read: streets). The strong point being that a line forces one to scan its entire length, while a point only draws the eye to its immediate vicinity. There’s still some discontentment with the way the lines look, ‘dirtyish’, but then that is the reality of the data. There are several flaws with such a visualisation:

  • The length of a subject street gives the impression of higher water usage. (Colour coding of the legend attempts to address that.)
  • Chances are high a would-be user associates the ‘lines’ with water pipes.
  • At first sight, the map doesn’t intuitively ‘say’ what’s going on.

This leads me to Part B - “Maybe it’s better to have property boundaries along the street mapped for the water usage”.

#postscript

  • Good luck finding Carbenet Way, Tokai! (Looks like a data hole to me.)
  • In the data wrangling, street names such as FISHERMAN’S BEND were problematic because of the ’s.
  • The CCT open data portal could improve on the way data is downloaded, preserving the downloaded filename for instance, among other things.

But My GIS Doesn't Like That

Make Me A Map

Often in my day job I get requests to map data contained in spreadsheets. I must say I prefer ‘seeing’ my (now not) spreadsheets in a GIS software environment: TableView. So I quickly want to move any data I get to that place. The path to that, and eventually to the cartographic output, is rarely straightforward, springing from the fact that (see the sketch after this list):

  • Most data cells are merged to make the spreadsheet look good.
  • Column (field) names are rarely database friendly. (Remember the Shapefile/DBF 10-character field name limit?)
  • Something is bound to happen when exporting the data to an intermediate format, CSV, for manipulation. (Data cell formatting.)
  • Not forgetting the unnecessary decimal places at times.
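A hedged pandas sketch of the kind of pre-processing I end up doing before the data goes anywhere near the GIS; the file and column handling is illustrative only:

```python
import pandas as pd

# Typical pre-processing before a spreadsheet goes anywhere near the GIS:
# un-merge cells, shorten field names for DBF, and trim stray decimals.
df = pd.read_excel("client_data.xlsx")

# Merged cells arrive as blanks below the first row: forward-fill them.
df = df.ffill()

# DBF-friendly field names: alphanumerics/underscores, max 10 characters.
df.columns = [
    "".join(ch if ch.isalnum() else "_" for ch in str(col))[:10]
    for col in df.columns
]

# Round numeric columns to a sensible precision before export.
df = df.round(2)

df.to_csv("client_data_clean.csv", index=False)
```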

The map requester often wonders why I’m taking so long to have their map ready. In the meantime I have to deal with a plethora of issues as I wrangle the data:

  • Discovering that some field names chosen by the client are reserved words in “My GIS” - [ size, ESRI File Geodatabase ]
  • The tabular data can’t be joined to the spatial component straight-on because:
    • A one-to-one relationship doesn’t exist cleanly: the supplied data contains spatial duplicates with unique attribute data. (What to retain?)
    • The supposed common (join) field of the source data doesn’t conform to the format of my reference spatial dataset.
    • Some entries of the common field in the master data contain values that do not fall within the domain of the spatial data - e.g. a street address such as 101 Crane Street, where evidently street numbers only run 1 - 60. (And 01, 10 and 11 exist in the database, in case you’re thinking of a transcription error.)

Frustration from the client is understandable. The integrity of their data has come under scrutiny by the let-the-data-speak dictate of data processing and cleaning. Unfortunately they have to wait a bit longer for their map; the decision on how to deal with the questionable data lies with them. I simply transform it, as best I can, into a work of map art.

Database Again Please

… entry level Data Scientists earn $80-$100k per year. The average US Data Scientist makes $118K. Some Senior Data Scientists make between $200,000 to $300,000 per year…

~ datascienceweekly.org, Jan 2017

Working with and in geodatabases makes me feel somewhat like I’m getting closer to being a data scientist. So, as I resolved for 2017: more SQL (…and Spatial SQL) and databases.

Importing spreadsheets into an ESRI geodatabase has major advantages but isn’t straightforward either. The huge plus is being able to preserve field names from the source and run data checks. Good luck importing a spreadsheet without first tweaking a field or two to get ‘Failed To Import’ errors out of the way.

Import To Geodatabase

Comma Separated Values (CSV) always wins, but beware of formatting within the spreadsheet. If wrongly formatted: garbage in! Once the import is neatly done, though, I can comfortably clean the data for errors and check its integrity in My GIS environment. As a bonus I get to learn SQL too. Can’t resist that pretty window!
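As a plain, non-spatial illustration of the same idea, here is a sketch that loads the cleaned CSV into SQLite and runs one integrity check in SQL. STAND_NO is an assumed join field; SpatiaLite would be the next step, but the pattern is the same.

```python
import sqlite3

import pandas as pd

# Load the cleaned CSV into SQLite so the checks can be done in SQL.
df = pd.read_csv("client_data_clean.csv")
conn = sqlite3.connect("client_data.sqlite")
df.to_sql("client_data", conn, if_exists="replace", index=False)

# Example integrity check: find duplicate values in the supposed join field
# (STAND_NO is an assumed column name).
duplicates = pd.read_sql_query(
    """
    SELECT STAND_NO, COUNT(*) AS n
    FROM client_data
    GROUP BY STAND_NO
    HAVING COUNT(*) > 1
    """,
    conn,
)
print(duplicates)
conn.close()
```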

Import To Spatialite

#postscript

The above are just a few matters highlighting what goes on behind the scenes in transforming a spreadsheet into a map. The object is to show the non-GIS user that it takes more than a few clicks to come up with a good map.

Compassing It

Still Looking?

GIS Specialist

I am in the habit of Googling “GIS” job openings, first in my province of residence and then nationally. CareerJet and Gumtree are my go-to places. CareerJet is particularly good in that it aggregates posts from several job sites. Job insecurity? Far from it! I like to know how my industry is doing job-wise, and also just to check how employable and relevant I still am: a check against skills going stale and a way of identifying areas for personal improvement. During this insightful pastime I couldn’t help noticing one particular job that persistently appeared in my searches, albeit evidently being re-posted, for more than a year!

Well, as usual you may ~ Cut To The Chase.

Yardsticking

For the particular job above, I felt comfortable enough with the requisite skills to dive in. In some areas I wasn’t 100% confident, but I was sure I could learn on the job in no time (if the prospective employer had such time; but looking at a year of advertising, the solution they had in place was probably good enough for now and they could afford to wait before migrating it. Anyway…). I hit upon the idea of preparing myself as if I were to take such a job. As a spin-off, I decided to aggregate ten “GIS” jobs within South Africa that I thought were cool and get a sense of what was commonly being sought.

Add to that, at the beginning of the year I had come across a great post on what an entrepreneur was and that an individual should be one at their workplace. This was going to be my aim for 2017. I was off to a good start with How Do I Get Started In Data Science, from which I had also taken the hint of doing a ‘job analysis’. So this year I would list skills I wanted to develop and improve on, and work on these through the year. My quick list:

  • Improve on my SQL chops … Spatial SQL.
  • Improve on JavaScript … for web mapping.
  • Learn Python … for scripting.
  • Dig a little deeper into PostgreSQL (and PostGIS) admin.
  • Play more with an online GeoServer on DigitalOcean or OpenShift.

I would then see how that compared with what the market required.

Compassing It

GIS jobs are variedly titled! For the same tasks and job description you find different banners in use. During this ‘pseudo’ job hunt I quickly discovered that GIS Developers were in demand. I ignored the curiosity to find out more and focused on what I was sure I was comfortable with, just avoiding the small voice in the head saying, ‘You must learn software engineering, have some software development experience, learn Java, master algorithm design…’. STOP! The search string for all jobs was simply “GIS”.

I chose the following job postings for the exercise.

  1. GIS Specialist
  2. GIS Engineer
  3. GIS Facilitator/Manager
  4. Head GIS and Survey
  5. Senior GIS Specialist
  6. Senior GIS Analyst
  7. GIS Business Analyst
  8. GIS Specialist
  9. Economic Development and GIS Data Quality Researcher
  10. Business Analyst (GIS)

A Play On Words

To complete the exercise, I aggregated all the text used in the above job posts (skipping stuff like “email your CV to…” blah blah). The aim was to get a sense of the responsibilities given and skills sought.

I plugged the combined text from the ten job posts into WordItOut for a word cloud. Why? The thing just looks nice, doesn’t it? Seriously though, the larger a word appears, the more often it occurs, viz. what potential employers are looking for in candidates. After the first run I found some not-so-useful words appearing often, so I pre-processed the combined text, removing words like South Africa, Salary, City… I finally ended up with:

My text from the job posts had 1051 words. With a minimum frequency of 5, 95 words were displayed on the word cloud.
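The same frequency counting can be reproduced offline in a few lines of Python. A sketch, where the stop-word list echoes the words I stripped out and the threshold of 5 matches the WordItOut setting; the input file name is a placeholder.

```python
import re
from collections import Counter

# Count word frequencies in the combined job-post text, dropping
# a few uninformative words, and keep those appearing at least 5 times.
STOPWORDS = {"south", "africa", "salary", "city", "the", "and", "to", "of", "a", "in"}

with open("job_posts.txt", encoding="utf-8") as f:
    words = re.findall(r"[A-Za-z']+", f.read().lower())

counts = Counter(w for w in words if w not in STOPWORDS)
frequent = {word: n for word, n in counts.items() if n >= 5}

for word, n in sorted(frequent.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word}: {n}")
```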

All Words Count

GIS Job Word Cloud

For overkill I also created a 10-word ‘summary’.

10 Word Cloud

GIS Job Word Cloud 10

Last Word

Interestingly, the technical terms are overshadowed by words like team, development, support, etc. SQL doesn’t feature as I guessed it would. Development and experience are dominant. Employers must be looking for persons with systems/procedure setup skills (skills is mentioned significantly too!). These positions are likely not the entry-level kind, requiring experience. For now I’ll stick to my quick list above for skills development and give less weight to the results from this exercise.

#postscript

  • This surely is not the best way to do a trend analysis of job requirements. I figure something that crawls the web and scrapes job sites by South African domain names is the way to go. The method used here is rudimentary, though it creates a cool word cloud graphic to paste on some wall.
  • Results from this exercise didn’t produce the Wow! effect one would get from a visualisation. It wasn’t what I expected at all; the black box spewed out something different from what I anticipated. To think that SQL didn’t make it onto the word cloud? Where is database?