Posts tagged ‘Code’

New Learnings

During the interviews I have gone through in recent weeks, a significant portion of my time was spent refreshing my knowledge of basic theoretical computer science. The reason was fairly simple: I wanted to be able to answer any of the basic knowledge questions more or less instantly. I could have independently come up with various graph traversal algorithms given a bit of time, as I did have them stored deep in my grey matter. That said, it seemed better to freshen up on the details so those particular nodes in my brain had a bit more weight to them.

So there was lots of refreshing of information, which took up a notable amount of my brain time and meant less time learning wholly new things. This is okay, but one of the best things about software development is the constant learning. So when a company mentioned that they were using Hadoop and HBase, this was instantly exciting to me as it gave me a perfectly good reason to go and research some new technologies.

If you haven’t heard of these two projects, and unless you are specifically working with them, you probably haven’t, Hadoop and HBase are free software implementations of two systems designed by Google, MapReduce and BigTable respectively. MapReduce is a framework developed by Google to facilitate processing and working with large datasets across a distributed network. BigTable is essentially a way to store structured data across a distributed network, though it is important to note that this data is structured in terms of nested hashes, not in a traditional relational manner.
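To make "nested hashes" a little more concrete, here is a minimal sketch of how a BigTable-style cell lookup can be modelled with plain Python dictionaries. The row key, column names and the `latest` helper are all made up for illustration; this is not the HBase API, just the shape of the data: row key, then column family, then column, then timestamped versions.

```python
# Hypothetical data: a BigTable-style table modelled as nested dicts.
# row key -> column family -> column qualifier -> {timestamp: value}
table = {
    "com.example/index.html": {
        "contents": {
            "html": {1: "<html>v1</html>", 2: "<html>v2</html>"},
        },
        "anchor": {
            "com.other/links.html": {1: "Example"},
        },
    },
}


def latest(table, row, family, qualifier):
    """Return the most recent value stored in a cell, or None if absent."""
    versions = table.get(row, {}).get(family, {}).get(qualifier, {})
    if not versions:
        return None
    # The highest timestamp is treated as the newest version.
    return versions[max(versions)]


print(latest(table, "com.example/index.html", "contents", "html"))
```

Note that there is no schema for the columns and no joins; a lookup is just a walk down the nesting, which is what makes it so different from a relational table.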

Google had a few motivations for building something like this. They regularly worked with gigantic datasets: their search index itself, search logs and Google Maps tilesets, to name a few examples. Analyzing these datasets took massive CPU resources, and a distributed approach was more or less deemed the only practical way to actually compute solutions. MapReduce takes the classic divide and conquer approach to solving a problem. The problem space is split up and doled out to dozens, hundreds or thousands of computers. This is the map phase. The intermediate results from each piece of the calculation are then grouped by key and reduced down into some sort of final output, with a master computer coordinating which machine does what. This is the reduce phase.
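The two phases can be sketched in a few lines of Python. This is only a single-process toy, assuming a word count as the problem; the names `map_reduce`, `count_words` and `total` are mine, not anything from Hadoop or Google's framework, and a real system would hand the chunks and the keys to different machines.

```python
from collections import defaultdict


def map_reduce(chunks, mapper, reducer):
    """Run mapper over every chunk, group the emitted pairs by key,
    then run reducer once per key. Everything happens in one process."""
    grouped = defaultdict(list)
    for chunk in chunks:                  # map phase: one call per chunk
        for key, value in mapper(chunk):
            grouped[key].append(value)    # group intermediate values by key
    # reduce phase: collapse each key's values into one result
    return {key: reducer(key, values) for key, values in grouped.items()}


def count_words(chunk):
    """Mapper: emit (word, 1) for every word in the chunk."""
    for word in chunk.lower().split():
        yield word, 1


def total(key, values):
    """Reducer: add up the counts emitted for one word."""
    return sum(values)


chunks = ["the cat sat", "the dog sat", "the end"]
counts = map_reduce(chunks, count_words, total)
print(counts["the"], counts["sat"], counts["end"])  # 3 2 1
```

The point of the split is that `count_words` and `total` know nothing about how many chunks there are or where they run, which is exactly the part a framework can take over.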

This approach has been used for solving all sorts of distributed problems, but MapReduce was unique in that it was a framework, usable for a variety of tasks. The hard parts of managing a distributed network for a single calculation are often just that, the management tasks. Deciding which servers get which chunks of data depending on where they are in the network. Deciding if a server has crashed, or if it’s just slow. How and what, if any, data do you duplicate across multiple nodes to guarantee that you get an answer. These are all concerns that must be addressed when creating a distributed application.

What Google did with MapReduce was abstract away the management tasks so their developers could focus on actually writing the algorithms to solve the problems. Hadoop is a Java version of MapReduce. But I digress.

What was really nice was that in the midst of this review and stress, as looking for meaningful employment is not without some concerns, there was an opportunity to read a couple of new papers and learn some new things. So I promptly went and wrote a distributed program to calculate how many lines each character in the collected works of Shakespeare spoke. This was not something that required much computer horsepower, but it was pretty cool to run the thing on two computers and get chunks distributed to each of them.
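I no longer have the Hadoop job handy, so here is a rough Python sketch of the same idea rather than the actual program. It assumes a made-up input format where every spoken line looks like "SPEAKER<tab>text", and it runs the chunks one after another instead of on separate machines. The function names are hypothetical.

```python
from collections import Counter
from functools import reduce


def count_chunk(lines):
    """Map step: count spoken lines per speaker within one chunk."""
    counts = Counter()
    for line in lines:
        speaker, sep, text = line.partition("\t")
        # Lines without a tab (stage directions, blanks) are skipped.
        if sep and text.strip():
            counts[speaker.strip().upper()] += 1
    return counts


def merge(left, right):
    """Reduce step: combine two partial tallies into one."""
    return left + right


# Sample data in the assumed format, split into two chunks.
chunks = [
    ["HAMLET\tTo be, or not to be, that is the question:",
     "OPHELIA\tGood my lord,"],
    ["HAMLET\tI humbly thank you; well, well, well.",
     "stage direction without a speaker"],
]

totals = reduce(merge, map(count_chunk, chunks), Counter())
print(totals.most_common())  # [('HAMLET', 2), ('OPHELIA', 1)]
```

Each chunk produces its own small tally, so the only thing that has to travel between machines is a handful of counts rather than the text itself.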

So that went over well. I am going to post this now as I have been sitting on this for a while and not finishing it up. I need to lower my standards somewhat as, to paraphrase Joel and Jeff, I only have one reader anyway.

Resolutions – 2009

I read quite a bit. I read fiction, fantasy and sci-fi mostly, but I am not opposed to a good mind-numbing blockbuster either. I read non-fiction; recently the works of Pierre Berton have caught my interest, though I tend to read a fair number of political philosophy books as well. I read computer books; my desk and surrounding area are littered with things I have read and occasionally need to reference, or just read a chapter of to refresh my memory. Finally, like most regular Internet users, I read blogs.

It was directly due to the words of a couple of well respected bloggers that I started this one. There are multiple reasons. By writing, you exercise the communication part of the brain, which is always good. By putting up a public blog, it becomes easier to control your own Internet presence. As long as your writing is even passable, it is good marketing for yourself. These were just a few of the points that were made. I bought into it and am now going on a couple of months of roughly weekly posts. I believe I can keep this up rather easily.

So, one mission accomplished.

On more than one occasion, one of these people who are well respected in the software development field will mention something about a compiler. Specifically, that every serious programmer should write one at some point in time. This makes sense as a compiler tends to be one of those bits of software that does a good job of covering pretty much everything that a programmer needs to do somewhat regularly. Lots of string manipulation, lots of recursion, lots of knowledge about how a computer actually works, memory management, the list goes on.

I have not yet written a compiler. It is something that I have wanted to do more than once, but I have never actually sat down and started working on it. I’ve seen enough mentions in passing recently that I feel it is time for me to complete this particular rite of passage and actually write a compiler. I am capable of it, but I needed the push down the right path to do it.

So it is not going to be a huge project and I’m not going to sink huge amounts of time into it, but I do have goals for it that should be completed by the end of 2009. I need to define my own language. It should have ifs and whiles, and it should evaluate mathematical expressions correctly. I think I will compile to JVM compatible byte code, though I have not decided on that yet either. That would theoretically allow me to use Java libraries as well, which is appealing. It’s not going to be object oriented, mainly as this is a learning experience and I don’t want to bite off more than I can chew, so to speak.

I’m leaning towards Ruby for the project, but am stronger with Java, which will make some things significantly easier. I will likely make the final decision once I’ve read some of the Dragon book and know more about what I am getting into. At this point, my intention is to write all of it by hand. I don’t think I will use a parser generator or anything like that, simply as the point of this is to learn how to write a parser/lexer/etc.
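To give a sense of what hand-writing the lexer and parser involves, here is a small sketch of a hand-rolled lexer plus a recursive descent parser that evaluates arithmetic with the usual precedence. It is in Python purely for brevity, it evaluates the expression directly rather than compiling anything, and the grammar and names are my own assumptions, not a design for the eventual language.

```python
def tokenize(source):
    """Lexer: turn text into ('num', int) and ('op', char) tokens."""
    tokens, i = [], 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
        elif "0" <= ch <= "9":
            start = i
            while i < len(source) and "0" <= source[i] <= "9":
                i += 1
            tokens.append(("num", int(source[start:i])))
        elif ch in "+-*/()":
            tokens.append(("op", ch))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r} at {i}")
    return tokens


class Parser:
    """Recursive descent: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", None)

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expression(self):  # expression := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            _, op = self.take()
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):  # term := factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            _, op = self.take()
            right = self.factor()
            value = value * right if op == "*" else value / right
        return value

    def factor(self):  # factor := NUMBER | '-' factor | '(' expression ')'
        kind, text = self.take()
        if kind == "num":
            return text
        if (kind, text) == ("op", "-"):
            return -self.factor()
        if (kind, text) == ("op", "("):
            value = self.expression()
            if self.take() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {text!r}")


def evaluate(source):
    parser = Parser(tokenize(source))
    value = parser.expression()
    if parser.peek()[0] != "eof":
        raise SyntaxError("trailing input")
    return value


print(evaluate("2 + 3 * 4"), evaluate("(2 + 3) * 4"))  # 14 20
```

Precedence falls out of the structure: `expression` calls `term`, which calls `factor`, so multiplication binds tighter than addition without any special casing. A compiler would emit instructions at the points where this sketch does the arithmetic.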

So that’s the plan for 2009: dedicate a couple of hours a week to this project and hopefully be able to run some of my own code in my own language by the end of the year. Who knows if that’s reasonable, but it should be… illuminating.

The customer is always right

I’m back. Took a bit of a break over the Christmas season and enjoyed not working on software, programming and business for a few days. I did have a few interviews, which is always stressful, so I tried a bit harder to actually enjoy the downtime that I did have.

During one of the interviews a comment was made in passing about programmers not wanting to talk to clients. This is a popular topic for developers to joke about, but I had always treated that as a joke. When it came down to the business of doing business, well, talking to your clients is just to be expected. Ultimately, they are the reason you have a job, so there is something very serious to be said for treating your clients with respect.

This particular ramble is going to go down a couple of paths, I’m going for breadth more than depth here.

Generally a software developer is writing code for one of two different reasons. The first is that you are writing software for a new product that may or may not have clients yet; you may be trying to carve a niche into an existing market or creating a new one. The second, and by far the most common, reason is that you are contracted to write software that accomplishes some task for another person/organization/demographic/etc.

So what does that make our job as developers? Our clients or employers have problems that they have decided, for whatever reason, are best solved by computers. We are to analyze that problem and develop a solution. Hopefully that solution makes our clients more money or saves them money by making existing processes more efficient. If the post-mortem shows that the benefit/cost ratio is less than 1, meaning the solution cost more than it returned, then we have failed. It really is that simple.

Solving a client’s problem is our job. One of the aspects that makes software development so much fun is that many times our client does not know exactly what they want. This may be a source of entertainment for us, but it is a very important thing to keep in mind. The world of programming is one of absolutes: the computer does exactly what we tell it to in a nice and ordered fashion. (Defects are the times when we tell the computer to do the wrong thing.) The real world is much less black and white. What this means is that we have to listen to our clients and understand what they are trying to accomplish. If the request is vague, we have to figure this out by asking the proper questions to build a better picture of their business.

This is challenging.

Worse, they may know exactly what they want, but be wrong. I have personally been asked to implement things with very little more than a vague bit of hand waving and comments to the effect of “We think having something that does x would be nice.” In this particular case, x was fairly obviously, to a developer of the system who had detailed usage statistics, not what our users needed. They actually needed y. Through some discussion with our clients, our team was able to make the case that x would help fewer than 1% of our users, and that those users were a particular subset who did not and would not spend any money. That probably helped our case somewhat. We ended up implementing a modified y and our customer service requests for a certain class of problem dropped significantly.

What was interesting to me from a personal standpoint was that the people who were arguing to just implement x were the same ones who did not want to talk to the client. Now, I should clarify: I don’t mean the grudging, “Oh no, small talk and meetings with people I don’t know”, the typical introvert response where you know that you will come out of the meeting drained, though hopefully with the problem solved. In this case I mean attempting to make the case that it was not our job to talk to the client to figure out what they needed.

This did not sit well with me. There is a reason that several of the principles behind the Agile Manifesto specifically address dealing with the customer. These points all touch on dealing with the customer with respect and understanding, as they are the experts in their own business. We are experts in software development.

As software developers, our job is to learn from our clients to build systems that improve their business. To do that, we will always need to interact with our clients to determine their needs and meet them.