Why Your Org Should Develop Software

11 Jun

If only there was a way to move this to here and make this load automatically in there and all I have to do is press a button, then I could get straight to the part of my job that requires personal attention and knowledge and not have to keep doing the same ten steps over and over.

I hear this a lot. When someone has been in a position long enough, they discover routine tasks that could be automated or made more efficient. What started as a desire to simplify my job and not waste time performing rote tasks turned into a career of writing simple applications to make everyone’s routine tasks easier.

This worked well for me in small companies that lacked resources – and were forward thinking. They didn’t know Python from Perl and didn’t care. They knew a task that took days now took hours – with no money being spent – and that was enough.

Enter the large organization.

The large organization buys its applications. They have accounting software, web servers, databases, Cognos, SharePoint, and GIS applications. When you have a task that needs to be accomplished, you use one of these applications. If you can’t, go buy something that will.

This line of thinking results in inefficient workplaces.

Joe needs to grab a field from a database and put it on a website. Great, our organization has software for that – Cognos. Does Joe really need a massive Cognos report to display data that really only takes 3 lines of code?  No. So what are his options? Build it or buy it.

These daily tasks that are needed by individuals are often too specific to be solved by off the shelf applications and also simple enough that they could be built in house – the sad part is they aren’t.

The Case Against In House Development

In house development is prevented for a variety of reasons. My favorites are:

Who will maintain it?

  • We can’t have an employee wasting time fixing applications, that costs money.
  • When the employee leaves, who will maintain it then?
  • When we need updates, who will do it?

What about security?

  • In House applications are not secure.

Who is accountable?

  • If we buy it, we have someone to blame when it all goes wrong.
  • We have someone to sue if something goes wrong.

The Case for In House Development

These reasons for not pursuing in house development seem reasonable enough, but we need to examine them and the alternatives.

Who will maintain it?

You need someone to maintain it and it should be your in house development team. It is not a waste of money. If the application results in efficiency gains, they need to be weighed against the costs of building and maintaining the application. Who maintains vendor applications? The vendor. But do they maintain it for free? Not always. Need something fixed in your application because you got rid of your image server for ESRI REST and now your vendor applications don’t work? Too bad. You broke it, not them. Pay, and maybe they will fix the application for you. Did you upgrade to IE 9 because of another application and now your primary doesn’t work? Oh well. Your vendor doesn’t have an IE 9 version yet. If you need a new feature, will your vendor add it? Will they charge you for it? What if your vendor discontinues the product? No support for you – unless you buy the new version or application.

What about security?

I hate this argument. I understand that you have no confidence in your in house developers, but to think that working for a vendor somehow makes a developer’s applications more secure is absurd. Let’s just assume that they are more secure for now. Security is measured by what we know today. In 1995, developers were not thinking about SQL injection. Their applications were secure – as far as they knew. Time proved them wrong. While many rewrote them, we still find these vulnerabilities today. The point is, as technology changes, we see new security holes. While you can protect yourself with what you know today, protecting against the future is difficult. And if you need a fix, will your vendor provide it quickly and cheaply?

Security is not a function of company size and reputation. Microsoft produces software that has numerous vulnerabilities. Of course their software is huge, but so is the company. ESRI ArcServer 10 is subject to cross site scripting vulnerabilities. ESRI has no updates for 10 so you need to buy a newer version. But I haven’t been using 10 for very long and my budgets are tight. Can’t I just get a patch? Nope.

Software is hard and hackers are looking for ways to exploit it. You can code with what you know to be best practices but the key to security is fixing known issues as soon as they arise – not waiting on purchasing to approve a contract for a fix – if your vendor has got around to writing one.

Who is accountable?

Organizations like to be able to point the finger at someone else, but this doesn’t make you function better. If you are a government and are providing a service that doesn’t work, the public couldn’t care less who built it; they just want it to work. If they find out you paid a large sum of money for the complicated, buggy, plugin-dependent application they are using, they are going to be even more upset that their tax dollars went toward it.

Go ahead, pass the buck, but your users don’t care. They want to get their jobs done, or interact with your organization. And if it has your logo on it, it’s you even if you didn’t build it. So if you are going to let a vendor control your image and reputation, good luck. At least you can spend even more money to sue them when it fails or when they lose all your customers’ Social Security numbers.

One word for you: Healthcare.gov.

The feds paid millions for a website that was a complete failure. Nobody cared that the government didn’t build it. The government got the blame for hiring an inept vendor and for spending millions and getting nothing. Can you name the vendor? No, because even you don’t care – it was the government’s fault. In response to this fiasco, the feds started 18F and the US Digital Service – in house developers. But your organization doesn’t need developers because you’re special.

 

The best software is software you use.

The best software is software you use. Basecamp is a popular project management application that started out as the developers’ internal project management app. They wrote it for themselves. Now, over 15,000,000 people have used it. Git is another example. Git is a revision control system that was developed by Linus Torvalds for his work on maintaining the Linux kernel. It is the most widely used version control system.

Your organization knows what it needs. If it can build it, you will be better off. I am also a realist. I do not think you should build everything. But for the small tasks that make everyone more efficient, why not?

 


Data and Design

8 May

If you have read any of the posts on this blog, you should know I love data. But what you may not know is that I love architecture, design and a good sketch. I spent five hours getting Study For The War Coffer by Eugene Delacroix tattooed on my chest.

coffer

Often these two worlds collide. I came across a tweet today:

Mindlessly drawing with data? How dare she. I once thought it a good idea to write computer code that could read an architectural program and develop the floor plan automatically. While I still favor some of this thinking, I have had to think it through. And slowly I have come out against it, and I have sided with Tara on this issue.

There is something to be said for hand drawing. The lines made by a pen, with their varying weights, show movement in a still image. There is something beautiful about them. About the process of sketching. Freely moving your hand across the canvas. The AIA had a podcast on Didactic Drawing that really brought it all home for me. On a computer, scale can change. You can draw a hundred foot line and based on your zoom level (scale) it could be a millimeter long. On paper, your scale is fixed. The movement of your hand across the page lets you know how long the line is.

I am not against BIM. But without pre-sketching designs, these programs make it easy to create boxes, squares and overall bland buildings, and to draw without a set scale, without ever fully understanding and feeling the building you are creating. To design with data is an idea I am still deeply attached to. But I think we walk a fine line between letting data inform design – on how people use buildings, for example – and letting it create the design for us – as in my program example earlier.

Applications like Revit or Grasshopper make it easy to start with a simple form – a box – and twist, pull, rotate and skew it to come up with a whole host of possible forms. The results are soulless – though some look really cool. I do not see the art in it. If we are just going to feed some data into a model to generate a form and say “look at this cool form I created from using the coordinates of all tweets that had the word Gehry in them” then we might as well give up – though I find these kinds of experiments interesting.

Data is, of course, valuable for facility maintenance. I also find value in data on movements of individuals within buildings and with modeling designs for things like airflow, heat, sunlight, etc. These are the kinds of data that can inform design – or confirm that a specific design is a functional design.

I do not want to live in a city full of bland buildings, just as much as I do not want to live in a world full of monuments to the architect that are outrageously out of context. There needs to exist a balance of the art and the science, of architecture and data. And each needs to complement the other.

 

 

Excel Geocoder

17 Apr

ESRI makes maps for Office. I thought this could be interesting and went to work on trying to insert a map into a spreadsheet. I did not have much luck. Instead, I decided to throw together a quick macro that would geocode addresses in a spreadsheet and give back coordinates. I do not think VB Script can parse JSON, so the result is ugly, but you get the idea.

Start with a spreadsheet of addresses

sheet

 

Then create a Macro – I copied the code for reading the URL from Ryan Farley.

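Sub Macro1()
    ' Geocode each address in column A, starting at A2, until an empty cell.
    Dim objHttp As Object
    Dim URL As String
    Dim i As Long

    Range("A2").Select
    i = 2

    Do Until IsEmpty(ActiveCell)
        ' Build the request for the City of Albuquerque geocoding service.
        URL = "http://coagisweb.cabq.gov/arcgis/rest/services/locators/CABQ_NetCurr/GeocodeServer/findAddressCandidates?f=json&outSR=4326&street=" & ActiveCell.Value
        Set objHttp = CreateObject("Msxml2.ServerXMLHTTP")
        objHttp.Open "GET", URL, False
        objHttp.Send
        ' Write the raw JSON response into column B of the same row.
        Cells(i, 2).Value = objHttp.ResponseText
        i = i + 1
        ActiveCell.Offset(1, 0).Select
        Set objHttp = Nothing
    Loop
End Sub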

The code starts at cell A2 and reads addresses until it reaches an empty cell. It takes the value and sends it to the ESRI REST endpoint for the City of Albuquerque Geocoding Service. It sets the cell next to it with the results. They should really be parsed, but I am too lazy and was just curious if it could be done. It can. The result is below.

results
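The parsing step the macro skips could be sketched outside Excel, for example in Python 3. This is only a sketch: it assumes the response follows the usual ArcGIS REST findAddressCandidates shape (a "candidates" list whose items carry "address", "location" and "score"), and the sample text is made up for illustration, not real output from the City's service. The function name best_candidate is hypothetical.

```python
import json


def best_candidate(response_text):
    """Return (address, longitude, latitude) of the top-scoring candidate, or None."""
    reply = json.loads(response_text)
    candidates = reply.get("candidates", [])
    if not candidates:
        return None
    # Keep the candidate the geocoder scored highest.
    best = max(candidates, key=lambda c: c.get("score", 0))
    return best["address"], best["location"]["x"], best["location"]["y"]


# Made-up sample in the assumed response shape; not real service output.
sample = (
    '{"candidates": ['
    '{"address": "1 CIVIC PLAZA NW", "location": {"x": -106.65, "y": 35.09}, "score": 100},'
    '{"address": "1 CIVIC PLZ", "location": {"x": -106.66, "y": 35.08}, "score": 80}'
    ']}'
)
print(best_candidate(sample))
```

With something like this, column B could hold a longitude and column C a latitude instead of the raw JSON.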

I am still thinking of how to embed a web page in the sheet.

Graph Database and Albuquerque Bus Stops: Neo4j with py2neo

15 Apr

I have been slightly obsessed with the question: “How do you define network service areas client-side on a map?” I know it needs a networked data set and something to do with the Dijkstra algorithm (Yes, we could just use an ESRI REST service but there is not one available yet – I will ask the City). After looking at JavaScript implementations of NetworkX, I stumbled upon graph databases, most notably Neo4j. A networked data set is a graph. Guess what, it has Dijkstra built in, so I must be on the right path. I installed it and added a fake social graph using py2neo. That allowed me to make sure I could do a few things:

  • Add a node
  • Add a relationship
  • Add attributes

Now it was time to start with some real data.

My first test was to load Albuquerque Bus Stops for a single route. Here is what I have in my database.

Bus Stops for Route 766. No Relations added yet.


The image above was generated by calling the City of Albuquerque REST Endpoint for bus stops, parsing the response, and putting it into Neo4j. The image is a view from the DB Manager. The code to do this is below.

from py2neo import Graph
from py2neo import Node, Relationship
from py2neo import authenticate
import urllib2
import json

# Connect to the local Neo4j server and start from an empty graph.
authenticate("localhost:7474", "myUserName", "myPassword")
graph = Graph()
graph.delete_all()

# Ask the City of Albuquerque REST endpoint for every stop on route 766.
url = "http://coagisweb.cabq.gov/arcgis/rest/services/public/fullviewer/mapserver/22/query?where=ROUTE='766'&f=json&outFields=*&outSR=4326"
rawreply = urllib2.urlopen(url).read()

reply = json.loads(rawreply)

# Create one "stop" node per feature in the response.
for x in reply["features"]:
    graph.create(Node("stop", route=x["attributes"]["ROUTE"], direction=x["attributes"]["DIRECTION"], street=x["attributes"]["STREET"], intersection=x["attributes"]["NEAR_INTER"], lat=x["geometry"]["y"], long=x["geometry"]["x"]))

Notice there are no Relationships! These are crucial if we are ever going to walk the network. I have manually added one, seen in the image below.

San Mateo links to Louisiana.


The code for this is:

rel = Relationship(graph.node(42), "Next", graph.node(41))

graph.create(rel)

I need to think about how to automate the relationship creation based on stop order and direction (there are stops on both sides of the street). Then, I will need to figure out how to make a node have relationships to other routes. For example, many stops are connected to the 777 route and I do not want a separate node for each. I want one with a property showing routes.
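One way the automation might look, as a rough Python 3 sketch: group the stops by direction, sort each group by stop order, and pair each stop with the one that follows it. The "sequence" field, the "id" field and the direction values are assumptions made for illustration; the stop attributes loaded above (ROUTE, DIRECTION, STREET, NEAR_INTER) do not include a stop order, so that would have to come from somewhere else.

```python
from collections import defaultdict


def next_stop_pairs(stops):
    """Return (from_id, to_id) pairs for consecutive stops in the same direction."""
    by_direction = defaultdict(list)
    for stop in stops:
        by_direction[stop["direction"]].append(stop)

    pairs = []
    for direction in sorted(by_direction):
        # "sequence" is a hypothetical stop-order field, not one from the City's data.
        ordered = sorted(by_direction[direction], key=lambda s: s["sequence"])
        pairs.extend((a["id"], b["id"]) for a, b in zip(ordered, ordered[1:]))
    return pairs


# Hypothetical stops; the ids stand in for Neo4j node ids.
stops = [
    {"id": 41, "direction": "EAST", "sequence": 2},
    {"id": 42, "direction": "EAST", "sequence": 1},
    {"id": 43, "direction": "EAST", "sequence": 3},
    {"id": 50, "direction": "WEST", "sequence": 1},
    {"id": 51, "direction": "WEST", "sequence": 2},
]
print(next_stop_pairs(stops))
```

Each pair could then be turned into a "Next" relationship the same way the manual one was created.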

Well, a start to say the least. It has been fun learning about graph databases and if GIS doesn’t interest you, you could map your social network and walk it.
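For reference, the service-area question itself can be sketched without a database. The Python 3 snippet below is a plain Dijkstra search that stops expanding once a cost budget is exceeded, so what comes back is every node reachable within the budget. It is not Neo4j's implementation, and the toy network and its travel times are invented for illustration.

```python
import heapq


def service_area(graph, start, budget):
    """Return {node: cost} for every node reachable from start within budget."""
    best = {start: 0}
    queue = [(0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # Stale queue entry; a cheaper path was already found.
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            # Only keep paths that stay inside the budget and improve on the best known cost.
            if new_cost <= budget and new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return best


# Hypothetical network: node -> [(neighbor, minutes)]
toy = {
    "A": [("B", 4), ("C", 10)],
    "B": [("C", 3), ("D", 9)],
    "C": [("D", 2)],
    "D": [],
}
print(service_area(toy, "A", 8))
```

On a map, the returned nodes would be the stops to highlight for a given travel budget.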

Open Data Disclaimers and Terms of Service

7 Apr

I saw a tweet by @waldojaquith that commented on the State of NY Open Data Portal’s TOS. The TOS states that you cannot access the page

“…by using an automated device, script, bot, spider, crawler or scraper”

I can guess their intent is to prevent people from hitting the site with bots and overloading the server. But a device and script? This seems, at best, too broad and, at worst, ignorant of how people will want to use the data. Furthermore, will I be prosecuted for doing so or will my IP address be blocked? Is this even enforceable? Can a government block a private citizen from a civic website?

This tweet made me curious about the city I live in – Albuquerque. I was dumbfounded by the content of their disclaimer/TOS. It read

“The City may require a user of this data to terminate any and all display, distribution or other use of any or all of the data provided at this website for any reason…”

Really? The City can call me and tell me to remove the bar chart of car thefts by month from my website because it is based on their data? Or, that I can no longer send the PublicArt.kmz file to someone?

I love Albuquerque Open Data. I find it quite useful and very good. So when the page says that

“The City makes no warranty, representation, or guaranty as to the content, accuracy, timeliness, or completeness of any of the data provided at this website.”

it makes me question the quality of the data. I get it, the data might be wrong, don’t hold it as gospel. Seems to be common sense to me. But this statement creates a distrust in the data. Is it authoritative? After reading the TOS, I would not take it to be.

If I were a gambling man, I would bet this is the work of the Legal Department. The same people that probably have email signatures saying if they send you confidential information by mistake you are in trouble for not deleting it. Yeah, I don’t think so Matlock. Saying it doesn’t make it true.

Government is made up of many departments. Those departments create a bunch of data. That data finds its way to an open data website probably run by the IT Department. The IT Department just puts it out there and doesn’t really know anything about the data. They don’t make it so why would they stand by it? They point you to the department that made it. But that department probably didn’t care to release it in the first place and doesn’t want to be bothered exporting it or updating it. Or worse yet, fielding calls from citizens about it. But if someone is going to put out some data, there needs to be a level of trust or quality in the data.

Is it better to put out some data of quality by choice than to be legally obligated to provide it as the result of a FOIA request? I would imagine open data cuts down on a lot of those requests, saving a lot of time and money.

Once you put it out, please don’t think you have any control over what I do with it or who I send it to.

I do not know the answer so I am putting the question out there, and I did so on Twitter as well:

Has a City ever been sued for the quality of their open data?

 

Historic Bus Location Data

27 Mar

Last night I was trying to think of interesting uses for a Raspberry Pi. One thing I came up with was a data logger. But what to log? Then I thought about a previous post on the Albuquerque Realtime Bus Data. Hmmm. What if I wanted to show the bus locations using a time slider? What if I wanted to see if they ever deviated from their routes, or if they deviated from their schedules? I can’t really do any analysis without the historic data, and the City does not give that out currently. So I think I found a use – logging Albuquerque Bus Data.

I don’t have a Raspberry Pi, yet, so I wrote a Python script on my desktop to test the logger.

I am not going to post the code because I don’t know the impact on the Albuquerque server. I will give a brief explanation. The City has KML files for each bus route. Each route has multiple buses. I grabbed a single route – 766 – and parsed the results. I initially sent the results to a CSV – as you can see in the data below this post. Writing to CSV is not too helpful when the data gets large (I am not going to say BIG). Once I knew it worked, I sent the data to a MongoDB collection with a spatial index. In the database, I can now:

Get the total records

    collection.count()

Get all the records

    for x in collection.find():
        print x

Get a specific bus number

    for x in collection.find({'number': '6903'}):
        print x

Or find near a lat,lng

    for x in collection.find({"loc": {"$near": [35.10341, -106.56711]}}).limit(3):
        print repr(x)

With a database, multiple people can query it and perform operations on it. Lastly, if the data gets larger, Mongo can be split (sharding) across multiple machines to hold it all.

My MongoDB records look like:

{u'loc': [35.08156, -106.6287], u'nextstop': u'Central @ Cornell scheduled at 3:45 PM', u'number': u'6903', u'time': u'3:46:52 PM',
u'_id': ObjectId('5515cfd814cd2829e4c1b718'), u'speed': u'20.5 MPH'}
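The step from a logged CSV line to a record like the one above could be sketched as follows, in Python 3. The column order is read off the CSV rows in this post (number, speed, time, next stop, longitude, latitude); the function name row_to_document and the FIELDS list are hypothetical, and this is not the logger script itself.

```python
import csv
import io

# Column order as it appears in the logged CSV rows.
FIELDS = ["number", "speed", "time", "nextstop", "lng", "lat"]


def row_to_document(line):
    """Turn one logged CSV line into a dict shaped like the MongoDB record."""
    values = next(csv.reader(io.StringIO(line)))
    row = dict(zip(FIELDS, values))
    return {
        "number": row["number"],
        "speed": row["speed"],
        "time": row["time"],
        "nextstop": row["nextstop"],
        # Stored as [lat, lng] to match the record shown above.
        "loc": [float(row["lat"]), float(row["lng"])],
    }


line = "6409,0.0 MPH,1:34:02 PM,Central @ San Mateo (Rapid) scheduled at 1:31 PM,-106.58642,35.07778"
doc = row_to_document(line)
print(doc)
```

A dict like this is what would be handed to the collection's insert call.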

Here are the results of my original run. I ran the script for 7 minutes and got the following results for Route 766.

Bus

Hard to see, but displays bus locations along route 766 over a 7 minute period.

 

6409,0.0 MPH,1:34:02 PM,Central @ San Mateo (Rapid) scheduled at 1:31 PM,-106.58642,35.07778
6904,1.9 MPH,1:34:03 PM,Central @ Edith scheduled at 1:31 PM,-106.64776,35.08401
6411,0.0 MPH,1:33:56 PM,CUTC Bay B scheduled at 1:39 PM,-106.72266,35.07726
6407,21.7 MPH,1:34:00 PM,Central @ 1st (across from A.T.C.) scheduled at 1:34 PM,-106.64317,35.08359
6410,35.4 MPH,1:33:53 PM,Central @ San Mateo (Rapid) scheduled at 1:36 PM,-106.58247,35.07749
6403,0.0 MPH,1:34:02 PM,Indian School @ Louisiana scheduled at 1:40 PM,-106.57089,35.10305
6903,29.8 MPH,1:33:56 PM,Central @ Atrisco scheduled at 1:36 PM,-106.69049,35.08443
6409,0.0 MPH,1:35:14 PM,Louisiana @ Central (Rapid) scheduled at 1:36 PM,-106.5852,35.07764
6904,1.9 MPH,1:35:14 PM,Central @ Edith scheduled at 1:31 PM,-106.6478,35.08413
6411,0.0 MPH,1:35:07 PM,Next stop is CUTC Bay B scheduled at 1:44 PM,-106.72525,35.07886
6407,0.0 MPH,1:35:11 PM,Copper @ 2nd scheduled at 1:34 PM,-106.64784,35.08417
6410,19.3 MPH,1:35:17 PM,Central @ Carlisle (Rapid) scheduled at 1:40 PM,-106.58734,35.07802
6403,0.0 MPH,1:35:14 PM,Indian School @ Louisiana scheduled at 1:40 PM,-106.57087,35.10306
6903,14.9 MPH,1:35:08 PM,Central @ Tingley (Rapid) scheduled at 1:38 PM,-106.68485,35.08576
6409,23.6 MPH,1:36:38 PM,Louisiana @ Central (Rapid) scheduled at 1:36 PM,-106.58084,35.07723
6904,18.0 MPH,1:35:53 PM,Central @ Edith scheduled at 1:31 PM,-106.647,35.08373
6411,0.0 MPH,1:36:31 PM,Next stop is CUTC Bay B scheduled at 1:44 PM,-106.72525,35.07885
6407,0.6 MPH,1:36:35 PM,Copper @ 5th scheduled at 1:35 PM,-106.64944,35.08541
6410,0.0 MPH,1:36:40 PM,Central @ Carlisle (Rapid) scheduled at 1:40 PM,-106.59512,35.07883
6403,0.0 MPH,1:36:42 PM,Indian School @ Louisiana scheduled at 1:40 PM,-106.57086,35.10306
6903,31.7 MPH,1:36:33 PM,Central @ Rio Grande (Rapid) scheduled at 1:40 PM,-106.67824,35.09252
6409,37.9 MPH,1:37:49 PM,Louisiana @ Central (Rapid) scheduled at 1:36 PM,-106.57203,35.07627
6904,0.6 MPH,1:37:55 PM,Central @ Cedar (Rapid) scheduled at 1:33 PM,-106.63771,35.08276
6411,0.0 MPH,1:37:55 PM,Central @ Coors scheduled at 1:47 PM,-106.72526,35.07885
6407,1.9 MPH,1:37:47 PM,Copper @ 5th scheduled at 1:35 PM,-106.6496,35.08487
6410,0.0 MPH,1:37:52 PM,Central @ Yale (UNM) scheduled at 1:44 PM,-106.60369,35.07979
6403,0.0 MPH,1:37:57 PM,Indian School @ Louisiana scheduled at 1:40 PM,-106.57087,35.10306
6903,0.0 MPH,1:37:56 PM,Central @ Rio Grande (Rapid) scheduled at 1:40 PM,-106.67159,35.09515
6409,0.0 MPH,1:39:14 PM,Louisiana @ Central (Rapid) scheduled at 1:36 PM,-106.56872,35.07595
6904,18.0 MPH,1:38:19 PM,Central @ Cedar (Rapid) scheduled at 1:33 PM,-106.63713,35.08265
6411,0.0 MPH,1:39:19 PM,Central @ Coors scheduled at 1:47 PM,-106.72527,35.07884
6407,3.1 MPH,1:39:10 PM,Central @ Rio Grande (Rapid) scheduled at 1:40 PM,-106.65295,35.086
6410,31.1 MPH,1:39:16 PM,Central @ Yale (UNM) scheduled at 1:44 PM,-106.61167,35.08084
6403,5.6 MPH,1:39:23 PM,Indian School @ Louisiana scheduled at 1:40 PM,-106.5707,35.10368
6903,23.0 MPH,1:39:20 PM,Gold @ 5th (Rapid) scheduled at 1:44 PM,-106.67011,35.09438
6409,26.1 MPH,1:40:37 PM,Louisiana @ Lomas scheduled at 1:38 PM,-106.56848,35.07723
6904,24.9 MPH,1:40:10 PM,Central @ Cedar (Rapid) scheduled at 1:33 PM,-106.63585,35.08255
6411,0.0 MPH,1:40:31 PM,Central @ Coors scheduled at 1:47 PM,-106.72526,35.07884
6407,23.0 MPH,1:40:36 PM,Central @ Rio Grande (Rapid) scheduled at 1:40 PM,-106.65675,35.0863
6410,34.2 MPH,1:40:40 PM,Central @ Yale (UNM) scheduled at 1:44 PM,-106.61567,35.0811
6403,0.0 MPH,1:40:42 PM,Indian School @ Louisiana scheduled at 1:40 PM,-106.56875,35.10221
6903,23.6 MPH,1:40:32 PM,Gold @ 5th (Rapid) scheduled at 1:44 PM,-106.66327,35.08892
6409,0.6 MPH,1:41:49 PM,Indian School @ Uptown Loop Road scheduled at 1:42 PM,-106.56849,35.08691
6904,24.9 MPH,1:41:55 PM,Central @ Cedar (Rapid) scheduled at 1:33 PM,-106.63585,35.08255
6411,0.0 MPH,1:41:55 PM,Central @ Coors scheduled at 1:47 PM,-106.72526,35.07884
6407,0.0 MPH,1:41:58 PM,Central @ Rio Grande (Rapid) scheduled at 1:40 PM,-106.65822,35.08648
6410,28.6 MPH,1:41:52 PM,Central @ Yale (UNM) scheduled at 1:44 PM,-106.6216,35.08113
6403,0.0 MPH,1:42:01 PM,Louisiana @ Lomas scheduled at 1:45 PM,-106.56779,35.10184
6903,23.0 MPH,1:41:55 PM,Gold @ 5th (Rapid) scheduled at 1:44 PM,-106.65819,35.08629
6409,41.0 MPH,1:43:13 PM,Indian School @ Uptown Loop Road scheduled at 1:42 PM,-106.56863,35.09282
6904,24.9 MPH,1:42:15 PM,Central @ Cedar (Rapid) scheduled at 1:33 PM,-106.6217,35.08093
6411,0.0 MPH,1:43:19 PM,Central @ Coors scheduled at 1:47 PM,-106.72528,35.07883
6407,9.9 MPH,1:43:10 PM,Central @ Rio Grande (Rapid) scheduled at 1:40 PM,-106.661,35.08784
6410,34.2 MPH,1:43:16 PM,Central @ Mulberry (Rapid) scheduled at 1:47 PM,-106.62524,35.08125
6403,16.8 MPH,1:43:22 PM,Louisiana @ Lomas scheduled at 1:45 PM,-106.56635,35.10156
6903,19.3 MPH,1:43:19 PM,Gold @ 5th (Rapid) scheduled at 1:44 PM,-106.6549,35.084
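The rows above carry enough to take a first look at schedule deviation. The Python 3 sketch below compares the report time with the "scheduled at" time embedded in the next-stop text. It is a rough illustration under a few assumptions: the text always ends with "scheduled at H:MM AM/PM", both times fall on the same day, and a positive number only means the bus is still heading to a stop it was scheduled to have reached already. The function name seconds_behind is hypothetical.

```python
import re
from datetime import datetime


def seconds_behind(report_time, nextstop):
    """Seconds between the report time and the next stop's scheduled time.

    Positive means the scheduled time has already passed; None means no
    scheduled time could be found in the text.
    """
    match = re.search(r"scheduled at (\d{1,2}:\d{2} [AP]M)$", nextstop)
    if match is None:
        return None
    reported = datetime.strptime(report_time, "%I:%M:%S %p")
    scheduled = datetime.strptime(match.group(1), "%I:%M %p")
    return int((reported - scheduled).total_seconds())


# Two rows taken from the log above.
print(seconds_behind("1:34:02 PM", "Central @ San Mateo (Rapid) scheduled at 1:31 PM"))
print(seconds_behind("1:33:56 PM", "CUTC Bay B scheduled at 1:39 PM"))
```

Run over a full day of logged rows, the same function would show which buses drift furthest from their schedules.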

Python Wrapper for Leaflet

20 Mar

I recently stumbled upon Folium – a Python wrapper for Leaflet. I was excited and it seemed to work well. Gradually I ran into problems and the pages loaded slowly. I probably did something wrong on my end, but I decided to write a simple wrapper of my own.

My wrapper is a Python class with a method for each Leaflet feature, such as map and marker. When you call a method, it writes a string to a file to generate the HTML. Below is my Python code (pyLeaflet.py).

class l(object):
    """Write a Leaflet map to an HTML file, one JavaScript statement per call."""

    def __init__(self, path):
        self.path = path
        self.f = open(self.path, "w+")
        # Page header: Leaflet CSS and JS, the map container, and the opening script tag.
        self.f.write('<html><head><title>Map From Python</title><link rel="stylesheet" href="http://cdn.leafletjs.com/leaflet-0.7.2/leaflet.css" /></head><body><script src="http://cdn.leafletjs.com/leaflet-0.7.2/leaflet.js"></script><div style="height:900px; width:900px" id="map"></div><script>\n')

    def map(self, lat, long, zoom):
        # Create the map centered on lat,long and add an OpenStreetMap tile layer.
        self.lat = lat
        self.long = long
        self.zoom = zoom
        self.f.write("var map = L.map('map', {center: [" + str(self.lat) + "," + str(self.long) + "], zoom:" + str(self.zoom) + "});\n")
        self.f.write("L.tileLayer('http://{s}.tile.osm.org/{z}/{x}/{y}.png').addTo(map);\n")

    def marker(self, lat, long, popup=""):
        # Add a marker with an optional popup.
        self.x = lat
        self.y = long
        self.popup = popup
        self.f.write('L.marker([' + str(self.x) + ',' + str(self.y) + ']).bindPopup("' + str(self.popup) + '").addTo(map);\n')

    def onclick(self):
        # Placeholder: click handling is not implemented yet.
        pass

    def makeMap(self):
        # Close the script and the page, then close the file.
        self.f.write('</script></body></html>')
        self.f.close()

To use the code, follow the example below.

>>> from pyLeaflet import l
>>> L = l("Paul.html")
>>> L.map(35, -106, 8)
>>> L.marker(35, -106)
>>> L.marker(34, -106, "Hello from Python")
>>> L.makeMap()

The output will be an HTML file called Paul.html that displays a map with two markers.
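One thing the wrapper does not handle is popup text that contains a double quote, which would break the generated JavaScript. A possible fix, sketched here in Python 3 as a standalone function rather than a change to the class, is to let json.dumps produce the string literal. The name marker_js is hypothetical, and the sketch does not guard against a popup that contains a closing script tag.

```python
import json


def marker_js(lat, lng, popup=""):
    """Return the Leaflet JavaScript for one marker, with the popup text escaped."""
    # json.dumps emits a double-quoted literal with quotes and backslashes escaped.
    return "L.marker([%s,%s]).bindPopup(%s).addTo(map);\n" % (lat, lng, json.dumps(str(popup)))


print(marker_js(34, -106, 'He said "hello"'))
```

The marker method could write the returned string to the file in place of its own concatenation.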