Archive | April, 2015

Excel Geocoder

17 Apr

ESRI makes maps for Office. I thought this could be interesting and went to work on trying to insert a map into a spreadsheet. I did not have much luck. Instead, I decided to throw together a quick macro that would geocode addresses in a spreadsheet and give back coordinates. I do not think VBA can parse JSON on its own, so the result is ugly, but you get the idea.

Start with a spreadsheet of addresses



Then create a Macro – I copied the code for reading the URL from Ryan Farley.

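Sub Macro1()
    ' Geocode each address in column A, starting at A2,
    ' and write the raw response into column B.
    Dim objHttp As Object
    Dim URL As String
    Dim i As Long

    i = 2
    Range("A2").Select

    Do Until IsEmpty(ActiveCell)
        ' The geocoding service URL goes between the quotes, followed by the address
        URL = "" & ActiveCell.Value
        Set objHttp = CreateObject("Msxml2.ServerXMLHTTP")
        objHttp.Open "GET", URL, False
        ' The request is not issued until Send is called
        objHttp.send
        Cells(i, 2).Value = objHttp.responseText
        Set objHttp = Nothing
        i = i + 1
        ActiveCell.Offset(1, 0).Select
    Loop
End Sub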

The code starts at cell A2 and reads addresses until it reaches an empty cell. It takes each value and sends it to the ESRI REST endpoint for the City of Albuquerque Geocoding Service. It writes the raw response into the cell next to the address. The results should really be parsed, but I am too lazy and was just curious if it could be done. It can. The result is below.
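For anyone who does want the parsing step, here is a minimal sketch in Python rather than VBA. It assumes the response follows the usual ArcGIS REST geocoding layout: a `candidates` list whose entries carry a `location` with `x` and `y`, plus a `score`. The post does not show the actual response, so that layout, the function name `best_candidate`, and the sample string are all assumptions for illustration.

```python
import json


def best_candidate(response_text):
    """Return (x, y, score) for the highest-scoring candidate, or None."""
    reply = json.loads(response_text)
    candidates = reply.get("candidates", [])
    if not candidates:
        return None
    # Pick the candidate the service is most confident about
    best = max(candidates, key=lambda c: c.get("score", 0))
    location = best["location"]
    return location["x"], location["y"], best.get("score")


# Hand-written sample in the assumed layout, not real output from the service
sample = (
    '{"candidates": ['
    '{"address": "Sample A", "location": {"x": -106.65, "y": 35.08}, "score": 90},'
    '{"address": "Sample B", "location": {"x": -106.66, "y": 35.09}, "score": 75}'
    ']}'
)

print(best_candidate(sample))
```

The same idea would let the macro write two clean coordinate columns instead of one cell of raw JSON.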


I am still thinking of how to embed a web page in the sheet.

Graph Database and Albuquerque Bus Stops: Neo4j with py2neo

15 Apr

I have been slightly obsessed with the question: “How do you define network service areas client-side on a map?” I know it needs a networked data set and has something to do with Dijkstra’s algorithm (yes, we could just use an ESRI REST service, but there is not one available yet; I will ask the City). After looking at JavaScript implementations of NetworkX, I stumbled upon graph databases, most notably Neo4j. A networked data set is a graph. Guess what: Neo4j has Dijkstra built in, so I must be on the right path. I installed it and added a fake social graph using py2neo. That allowed me to make sure I could do a few things:

  • Add a node
  • Add a relationship
  • Add attributes
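To make the service-area question concrete, here is a small pure-Python sketch of Dijkstra’s algorithm with a cost cutoff. It is not Neo4j’s built-in routine, just an illustration of the idea: every node reachable from a start node within a budget is part of the service area. The toy network, its weights, and the name `service_area` are made up, and edge weights are assumed to be non-negative.

```python
import heapq


def service_area(graph, start, max_cost):
    """Return {node: cost} for every node reachable from start within max_cost."""
    best = {start: 0}
    queue = [(0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            # Keep the neighbor only if it is within budget and an improvement
            if new_cost <= max_cost and new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return best


# Made-up directed network: node -> {neighbor: travel cost}
toy = {
    "A": {"B": 2, "C": 5},
    "B": {"C": 1, "D": 4},
    "C": {"D": 1},
    "D": {},
}

print(service_area(toy, "A", 3))
```

With a budget of 3, node D falls outside the area because the cheapest path to it costs 4.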

Now it was time to start with some real data.

My first test was to load Albuquerque Bus Stops for a single route. Here is what I have in my database.

Bus Stops for Route 766. No Relations added yet.


The image above was generated by calling the City of Albuquerque REST endpoint for bus stops, parsing the response, and putting it into Neo4j. The image is a view from the DB Manager. The code to do this is below.

from py2neo import Graph
from py2neo import Node, Relationship
from py2neo import authenticate
import urllib2
import json




for x in reply["features"]:
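
A minimal sketch of the parsing step described above, written so it runs without a database. It assumes the reply uses the usual ArcGIS REST query layout, where each feature has an `attributes` dict and a point `geometry` with `x` and `y`. The `STOPNAME` field, the coordinates, and the helper name `stop_properties` are hypothetical; the real layer’s field names may differ.

```python
def stop_properties(reply, route):
    """Turn an ArcGIS-style query reply into one property dict per stop."""
    stops = []
    for feature in reply.get("features", []):
        attributes = feature.get("attributes", {})
        geometry = feature.get("geometry", {})
        props = dict(attributes)          # copy every attribute as a property
        props["x"] = geometry.get("x")    # point coordinates, if present
        props["y"] = geometry.get("y")
        props["route"] = route
        stops.append(props)
    return stops


# Hand-written sample in the assumed layout, not real data from the City
sample_reply = {
    "features": [
        {"attributes": {"STOPNAME": "San Mateo"}, "geometry": {"x": -106.58, "y": 35.07}},
        {"attributes": {"STOPNAME": "Louisiana"}, "geometry": {"x": -106.56, "y": 35.07}},
    ]
}

for props in stop_properties(sample_reply, "766"):
    print(props)
```

Each dict could then be passed as keyword properties to a py2neo `Node` and saved with the graph’s `create` method; the label to use is up to you.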

Notice there are no Relationships! These are crucial if we are ever going to walk the network. I have manually added one, seen in the image below.

San Mateo links to Louisiana.

The code for this is:



I need to think about how to automate the relationship creation based on stop order and direction (there are stops on both sides of the street). Then, I will need to figure out how to make a node have relationships to other routes. For example, many stops are connected to the 777 route and I do not want a separate node for each. I want one with a property showing routes.
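One way this could be approached, sketched in plain Python before touching the database: group stops by route and direction, sort each group by stop order, and link neighbors; separately, collapse repeated stops into one entry with a list of routes. The field names (`stop_id`, `direction`, `sequence`) and the sample records are assumptions, since I have not checked what the bus stop layer actually provides.

```python
from collections import defaultdict


def consecutive_links(stops):
    """Pair each stop with the next one on the same route and direction."""
    groups = defaultdict(list)
    for stop in stops:
        groups[(stop["route"], stop["direction"])].append(stop)
    links = []
    for (route, direction), members in sorted(groups.items()):
        ordered = sorted(members, key=lambda s: s["sequence"])
        for a, b in zip(ordered, ordered[1:]):
            links.append((a["stop_id"], b["stop_id"], route, direction))
    return links


def routes_by_stop(stops):
    """Collapse repeated stops into one entry listing every route served."""
    routes = defaultdict(set)
    for stop in stops:
        routes[stop["stop_id"]].add(stop["route"])
    return {stop_id: sorted(names) for stop_id, names in routes.items()}


# Made-up records: S2 and S3 are shared by routes 766 and 777
stops = [
    {"stop_id": "S1", "route": "766", "direction": "EB", "sequence": 1},
    {"stop_id": "S2", "route": "766", "direction": "EB", "sequence": 2},
    {"stop_id": "S3", "route": "766", "direction": "EB", "sequence": 3},
    {"stop_id": "S2", "route": "777", "direction": "EB", "sequence": 1},
    {"stop_id": "S3", "route": "777", "direction": "EB", "sequence": 2},
]

print(consecutive_links(stops))
print(routes_by_stop(stops))
```

Each tuple from `consecutive_links` would become one relationship, and each entry from `routes_by_stop` one node with a routes property.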

Well, a start to say the least. It has been fun learning about graph databases and if GIS doesn’t interest you, you could map your social network and walk it.

Open Data Disclaimers and Terms of Service

7 Apr

I saw a tweet by @waldojaquith that commented on the State of NY Open Data Portal’s TOS. The TOS states that you cannot access the page

“…by using an automated device, script, bot, spider, crawler or scraper”

I can guess their intent is to prevent people from hitting the site with bots and overloading the server. But a device and script? This seems, at best, too broad and, at worst, ignorant of how people will want to use the data. Furthermore, will I be prosecuted for doing so, or will my IP address be blocked? Is this even enforceable? Can a government block a private citizen from a civic website?

This tweet made me curious about the city I live in – Albuquerque. I was dumbfounded by the content of their disclaimer/TOS. It read

“The City may require a user of this data to terminate any and all display, distribution or other use of any or all of the data provided at this website for any reason…”

Really? The City can call me and tell me to remove the bar chart of car thefts by month from my website because it is based on their data? Or, that I can no longer send the PublicArt.kmz file to someone?

I love Albuquerque Open Data. I find it quite useful and very good. So when the page says that

“The City makes no warranty, representation, or guaranty as to the content, accuracy, timeliness, or completeness of any of the data provided at this website.”

it makes me question the quality of the data. I get it, the data might be wrong, don’t hold it as gospel. Seems to be common sense to me. But this statement creates a distrust in the data. Is it authoritative? After reading the TOS, I would not take it to be.

If I were a gambling man, I would bet this is the work of the Legal Department. The same people that probably have email signatures saying if they send you confidential information by mistake you are in trouble for not deleting it. Yeah, I don’t think so Matlock. Saying it doesn’t make it true.

Government is made up of many departments. Those departments create a bunch of data. That data finds its way to an open data website probably run by the IT Department. The IT Department just puts it out there and doesn’t really know anything about the data. They don’t make it, so why would they stand by it? They point you to the department that made it. But that department probably didn’t care to release it in the first place and doesn’t want to be bothered exporting it or updating it. Or worse yet, fielding calls from citizens about it. But if someone is going to put out some data, there needs to be a level of trust or quality in the data.

Is it better to put out some data of quality by choice than to be legally obligated to provide it as the result of a FOIA request? I would imagine open data cuts down on a lot of those requests, saving a lot of time and money.

Once you put it out, please don’t think you have any control over what I do with it or who I send it to.

I do not know the answer so I am putting the question out there, and I did so on Twitter as well:

Has a City ever been sued for the quality of their open data?