Software Engineering vs Programming

I am often at a loss for words when it comes to describing what is right or wrong about a project. Throughout my career I have come into projects at many different levels, from engineer to CTO, and I have seen projects I think are good and projects I think are bad. In my opinion, the difference comes down to the difference between architecture and programming. A brilliant programmer can create a great and wonderful program. Unfortunately, what starts out as a working and successful program quickly becomes a slow-moving disaster as the original code is, as always happens, extended, fixed, and modified.

What separates a good programmer from a good software engineer is "architecture." Any reasonably talented programmer can create a program that works within a well-defined and limited context. It takes a competent software engineer to do it in a way that promotes improvement and "anticipates" the future. All too often, management looks only at the output of programmers and not at the engineering work that matters. How much time have we all spent working around the deficiencies of a design?

Designing good software is more than just algorithms. Yes, algorithms are the cornerstone of good software, but a proper and flexible implementation is a must to create value from them. As an illustration, let's look at the file stream API in the C programming language.

fopen, fclose, fread, fwrite, fseek, and ftell are the core file I/O functions; together they represent a file stream. They have been around longer than a good portion of working software developers have been alive. Virtually unchanged since the early 1980s, these APIs represent what could be considered, by any definition, excellence in design: efficient, easy to use, virtually self-documenting, with very few surprises in their use. Before "object oriented" was a buzzword, the FILE object was an abstraction that represented more than just disk files. It was a good design then, and it is a good design now. It has truly stood the test of time.
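
To make the point concrete, here is a minimal sketch of the API in use (the filename is a placeholder for illustration):

    #include <stdio.h>

    int main(void)
    {
        /* "example.txt" is a hypothetical filename for illustration. */
        FILE *fp = fopen("example.txt", "rb");
        if (fp == NULL)
            return 1;

        /* The same abstract stream interface handles positioning... */
        fseek(fp, 0L, SEEK_SET);

        /* ...and reading; stdout is itself just another FILE stream. */
        char buf[256];
        size_t n = fread(buf, 1, sizeof buf, fp);
        fwrite(buf, 1, n, stdout);

        fclose(fp);
        return 0;
    }

A handful of calls, no surprises: open a stream, position it, move bytes, close it.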

Now, let's compare and contrast a "good" design against a "bad" design. UTF-16 is a bad design. It started as the "Universal Character Set" (UCS-2) and was intended to replace 8-bit ASCII. It took a brute-force approach of simply widening every single character by a byte. Not only did it intend to force a wholesale change of every line of software code that manages strings, it was not much better than the ASCII it was intended to replace; it just had 256 times as many characters. Clearly not enough for a comprehensive internationalization strategy, it had to be augmented by surrogate pairs to reach characters beyond its 16-bit range, much as 8-bit ASCII had relied on "code pages" to indicate which character was actually being represented. That trick was clever in ASCII's day, but by the time of UTF-16 it was a kludge. UTF-16 is a terrible design, and we were almost stuck with it.
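
A small sketch of the compatibility problem: the moment every character is widened to 16 bits, even plain English text contains zero bytes, and every byte-oriented string routine in C misbehaves.

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* "AB" widened to 16-bit code units (little-endian):
           each ASCII character picks up a zero byte. */
        const char wide[] = { 'A', 0x00, 'B', 0x00 };

        /* Byte-oriented code treats the first zero byte as a
           terminator, so this prints 1, not 2. */
        printf("%zu\n", strlen(wide));
        return 0;
    }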

A good design was the eventual UTF-8 encoding. It could behave like ASCII. It was self-synchronizing, i.e. you did not have to scan a string starting at position 0, and you could scan a string backwards. It preserved existing code, and it does not need a "code page."
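
Self-synchronization falls directly out of the byte layout: every continuation byte has the form 10xxxxxx, so from any byte position you can step backward to the start of the current character without rescanning from the front. A minimal sketch:

    #include <stddef.h>

    /* Step backward from index i to the first byte of the UTF-8
       character containing it. Continuation bytes match 10xxxxxx,
       i.e. (byte & 0xC0) == 0x80. */
    static size_t utf8_char_start(const unsigned char *s, size_t i)
    {
        while (i > 0 && (s[i] & 0xC0) == 0x80)
            i--;
        return i;
    }

The helper name is mine, not part of any standard library; the two-bit test is the whole trick.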

Both UTF-16 and UTF-8 represent two approaches to the same problem. UTF-16 was one way, UTF-8 was another. Both are impressive efforts, and both solve the problem they set out to solve. Ironically, UTF-16 was so much like ASCII that it did not address the fundamental problems with ASCII: (1) a limited number of distinct characters and (2) a disconnect between the character "values" and the eventual character. By being too similar in design to ASCII, it failed to be compatible with it. UTF-8 is nothing like ASCII internally, but it preserves most of the ASCII practices and, in fact, the 7-bit ASCII character set itself.
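
The byte layout shows how that compatibility is achieved: code points below 0x80 encode as themselves, so any valid 7-bit ASCII file is already a valid UTF-8 file. Here is a sketch of an encoder covering just the first two ranges (the function name is mine; real code points run up to U+10FFFF):

    /* Encode a Unicode code point into UTF-8, handling only
       code points up to U+07FF for brevity. Returns the number
       of bytes written, or 0 for anything larger. */
    static int utf8_encode(unsigned int cp, unsigned char *out)
    {
        if (cp < 0x80) {            /* 7-bit ASCII: one byte, unchanged */
            out[0] = (unsigned char)cp;
            return 1;
        }
        if (cp < 0x800) {           /* two bytes: 110xxxxx 10xxxxxx */
            out[0] = (unsigned char)(0xC0 | (cp >> 6));
            out[1] = (unsigned char)(0x80 | (cp & 0x3F));
            return 2;
        }
        return 0;
    }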

So what is the spark that made the difference? I'm not sure anyone can say definitively; hindsight is always 20/20. My guess is that it is the design process. I don't believe you can design software in a committee or as a specification.

Designing software is like mountain climbing or an expedition. You can have a clear idea of where you want to go and a clear idea of how you want to get there, but there are setbacks and opportunities along the way. You will need to change direction at various points. Sometimes it will be because of an impassable obstruction, and sometimes it will be because a new path has opened. You can't be inflexible.

Software development is the same way. Many times you will encounter technical difficulties that require a change in strategy. Other times, the process of implementation inspires new thinking about the problem at hand and results in a new way of doing things. In designing software, it is important to question your design as it is being developed. It is also important to start coding sooner rather than later, because "coding" is a different kind of thinking than "specifying." UTF-8 definitely shows the sort of problem-domain thinking that UTF-16 seems to lack.

In short, I don't believe you can architect down from the top to the middle level or program from the bottom up to the top. You should have a very broad, high-level specification that outlines what you want to accomplish. Understand that you have only an estimate and a rough design of what is probably required to get there; the real architecture comes from the techniques and the low-level building blocks used along the way.

I don't believe UTF-16 ever made sense at this level. Sure, it made sense at the systemic high level of "How do we represent ASCII and kanji in our system?" but it did not make sense from the block-level perspective of "How do we represent and use international character sets in a practical way?" UTF-8 shows a real amount of "engineering": problem solving with an eye toward change. UTF-16 feels like a "quick" way to solve a very specific problem, not a strategy for handling a class of problems.