Thursday, December 23, 2010

Mature Coding Environment

It’s a day for bitching about programming tools again. I have just finished reading a couple of fanboy posts and articles about various programming languages, IDEs and so on that are new and hot and supposedly the secret sauce, and I am tired of it.

These systems are IMMATURE. They are not only unworthy of production code, they are not even worthy of hobby coding. They are, to use a technical term... toys. They are a good idea that is under development and will not be ready for productive use until it has matured.

Why do I bother making what should be a bleedingly obvious comment? Firstly, because I am sick of the endless spew of new coding stuff, and secondly because it's clearly not obvious to others. Oh, and I'm forcing myself to articulate it because it helps crystallize my thoughts.

So what do I mean by mature? OK. I'll state it up front and then support it, rather than trying to build it via subtle and clever arguments... (Never really works anyway.)

*** A MATURE PROGRAMMING ENVIRONMENT HELPS THE PROGRAMMER BE EFFECTIVE ***

Yeah? And.... like how?

1. A stable language feature set

2. A state-of-the-art IDE, or language support in your favorite text editor

3. Comprehensive RICH support for the major aspects of development (develop, test, deploy, maintain)

4. Has tools to extend the programmer via automation (debuggers, static analysis, formatters, documentation extraction, profiling, code generators, macros, refactoring, GUI designers, test frameworks and runners)

5. Has high-level tools for managing complexity (various file/text/logic/structure views, flowcharts, models, etc.)

6. Integration with a STABLE ecosystem of components (databases, media libraries, GUI systems)

7. Has a rich knowledge base with breadth and depth, covering both core competencies and application to more exotic projects.

8. Has systems of support. (No man is an island... shoulders of giants, etc.) Forums, discussion groups and communities where solutions can be found in a timely fashion.

9. Comprehensive documentation for the tools.

10. Is integrated (or can be integrated) into a robust workflow that can be packed up for storage and reopened when needed.

Without getting into stupid methodology arguments, I think these aspects make for an environment that gives the working programmer the best chance of getting from A to B with a project in the most effective way. I'm not talking about the self-enrichment that comes from tackling something unknown or hard, or from only working in assembler, or any other side effect of programming. I'm talking about getting a job done effectively so you can get on to the next thing/problem/job/stage, whatever. I get that many people enjoy programming and fiddling and wish it would never end. I have those moments, and it's easy to waste time for little real progress. (Refactoring should be stored in a locked cabinet with the other dangerous drugs...) I'm talking about getting a specific, bounded job "done" efficiently. (I may be strange in that I am rarely working on a single project at a time; usually there are half a dozen or more in the air at any time, in a couple of languages, each with its own domain issues, so I see patterns develop.)

This then becomes a "how long is a piece of string" discussion about what "efficiently" means. Well, it means the system works, satisfies the criteria I have agreed to, will not embarrass me when it's reviewed, and will not be a pain in the ass when I inevitably have to come back to it and make some changes.

So what’s with the MATURE shit? Well, it goes like this... a software system is an investment... it has costs to create and will hopefully return value to someone in its use, not just its creation. It has a life cycle and some sort of diminishing returns curve. It also potentially has ways to extend its value (maintenance). If you can conceptualise software this way, then the economics of the tool set used to create it are an important variable in the value model. Not that hard if you're used to building financial models or doing basic cost-benefit analysis.

So the initial outlay on the tools is a fixed cost, but the cost of using the tools is a variable cost, dependent on both the task they're being used for (difficulty) and the time the task takes. These two probably have an exponential relationship: simply meaning that the harder the job, the disproportionately longer it takes. Time itself, though, only accumulates at a constant rate, so that side will not vary much unless you have an unlimited number of people to throw at the job... (Mythical Man-Month, anyone?)
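To make that concrete, here's a toy version of the model in Python. Every number in it is made up for illustration; the point is just that tool cost has a fixed part and a variable part, and the variable part is the one that compounds with difficulty.

    # A toy cost model -- purely illustrative, all numbers are made up.
    # total cost = fixed tool cost + (hours burned) * (hourly rate),
    # where hours grow super-linearly with task difficulty.

    def hours_for(difficulty, tool_drag):
        # tool_drag > 1 models an immature tool chain multiplying effort
        return (difficulty ** 1.5) * tool_drag

    def total_cost(fixed_tool_cost, difficulty, tool_drag, hourly_rate=100):
        return fixed_tool_cost + hours_for(difficulty, tool_drag) * hourly_rate

    # A mature tool set: higher up-front outlay, less drag per task.
    print(total_cost(fixed_tool_cost=5000, difficulty=40, tool_drag=1.0))
    # An immature tool set: free to acquire, but everything takes longer.
    print(total_cost(fixed_tool_cost=0, difficulty=40, tool_drag=1.6))

On a hard enough job, the drag factor swamps the up-front saving; that's the whole argument in two function calls.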

These two variable costs are the ones that make this model suck hardest; but they also identify the issues to attack for the maximum gains. (Has anyone profiled the actual activity of programming? I've certainly tried.)

The time variable can be fiddled with a little but has a ceiling. There is only so much work you can get out of someone in a given time period, and adding more people has diminishing returns... so it has some pretty hard limits from the people side of the equation. However... if the tools have an effect on time... then... ah... you see the point... if your tools are throttling the activity of the people, then you are tool-bound... better tools could potentially allow them to reach their maximum productivity.

The other variable is difficulty, which can manifest in all sorts of ways. Difficulty from the platform, from weird APIs, from crappy third-party support, from ugly documentation, from missing language features, from having to write and test the same boilerplate code over and over again... The point, however, is that within the activity of programming itself there are only two states: one where you are making forward progress toward the project goals (state A) and one where you are stalled or going backward (state B). In state B, you are burning time and resources for no gain. These situations can't always be avoided... or can they?
There can be a bit of grey between these two states, but forcing myself to make a call between state A and state B can clarify the situation in my own head.

Anyway, after that brief look at my personal economics philosophy, I'll get back to mature programming environments and how it all relates.

The key point is that a mature programming environment has been optimised to reduce the cost in time and the multiplying effect of complexity/difficulty. This optimising never quite ends, but there is a clear distinction between a well-developed, mature environment and the solutions available for a "new" language/tool set.

The thought exercise I always use is to conceptualise a few projects and mentally apply a prospective tool set/language to each, to see if I can imagine using that tool set productively on them:

1. A one-off, throw-away data solution for a single user, maybe 1kloc.

2. A simple experiment package for a couple of researchers, say about 10kloc.

3. A more developed, multi-part tool set for working with motion capture data, about 70kloc, for both internal and external users.

4. A larger package with more history: a huge library of resources, multiple generations in service at once and a large body of users, about 750kloc.

Honestly, most of the languages and tool sets I see talked up fail before the first hurdle. They're not even worth considering for a tiny throw-away project. Why not? Because the initial setup and investment in the tools is massive in comparison to the time spent on the productive project work! It takes time to find and assemble all the bits, update to the latest builds, scrounge enough information to build a GUI, hand-code everything without any useful samples... etc. Why bother? For larger projects the initial cost is much less significant, but the other issues start to come into play. How well integrated is the tool set? Will it build with a single button click? Will it deploy iteratively? Can I build various types of tests (unit, integration, GUI)? Etc. How does the tool chain scale? How does it manage complexity and extend the limits of the human brain? Can it graphically express code structures? Does it support static analysis tools? Are there profilers, debuggers, code formatters, documentation systems, code generators, and an API for building your own tools against the existing tool set? Is there a macro system for the tools?

These are basic features that a working programmer should expect. But they are so often lacking.

As such, there are very few systems that can conceivably be described as mature. There are a lot that are moving in that direction, and there is an endless slew that are much closer to toys...
I'm just tired of the illiterate fanboys who get lost in the excitement of a shiny new toy without realizing that it's got such a long, hard way to go before it's grown up enough to be a serious contender for anything. That probably makes me seem quite dated...

I must build a table of all the contenders one day... Wikipedia maybe....

Thursday, December 16, 2010

Sloppy Code Article

http://journal.stuffwithstuff.com/2010/11/26/the-biology-of-sloppy-code/

I just read an article about Sloppy Code. The seed idea is not the gem here; it's the explanation of how to fit it into the mindset of "programmers" and all the issues surrounding the evolution of both the craft and the environment in which we are all working. The article deeply resonated with a bunch of half-formed ideas that have been slowly orbiting my conscious and unconscious mind for some time now. It not only articulated them, but did so with grace and clarity. The linkage with the abstraction levels among the sciences was wonderfully illustrative.

The best aspect was the optimistic spin. I have been reading articles about change in various industries and environments recently, and the common thread has been the fear and uncertainty communicated by the authors, which ends up attaching a negative taint to change in any form. (I understand that change generally means loss and dispossession for many... so it's fair... but still, there have to be some who see the upside.)
Anyway, I found this article strangely uplifting. It has a lot of parallels with what I find myself doing more and more. While I still occasionally get a job where I can break out C++ and hack against some low-level library from Apache or Boost, more often I am writing loose VBA or scripts to drive high-level objects or automate whole executables through some high-level API. This gets stuff done, and it is often quite satisfying to get it done quickly, but it lacks some of the fundamental satisfaction of having constructed it from elemental primitives.

I guess that's why I still get so much satisfaction from going back to raw materials in the shed. I would rather build a lathe out of fundamental components, weld them together, cut and shape and slowly assemble them, than buy one and use it. But on the other hand, I have enough experience with turn-key packages that I also enjoy getting something that "just works" and getting stuff done with it. Different levels of abstraction.

The next major building block I am wrestling with is AI. There are enough low-level libraries around to build various simple constructs, but you can still see the bare metal through the library. The question becomes whether to use a library that has rough edges or to build it yourself. There is not a strong enough value proposition to use them as higher-level black boxes and glue something on top, because they are not really high level. They are still just first-generation collections of tools and routines.

I want a library from which I can instantiate a functional AI in a single line of code, be it a neural net or an agent game actor or some other variant that is already done. I could then just build the rest of the experiment, rather than having to go back to almost bare metal, make all the decisions and construct it slowly.
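Something in this shape, say. The library here is entirely hypothetical; I'm stubbing the class in Python purely to make the interface I want concrete.

    # Sketch of the interface I want. "NeuralNet" is a stub for a
    # hypothetical library -- nothing like this is assumed to exist.

    class NeuralNet:
        """Stand-in for the one-liner I wish existed: sane defaults,
        no plumbing decisions forced on the caller."""
        def __init__(self, inputs, outputs, hidden=None):
            # Pick a reasonable hidden layer size if the caller doesn't care.
            self.shape = (inputs, hidden or (inputs + outputs) // 2, outputs)

        def train(self, samples, labels):
            ...  # the library's problem, not mine

        def predict(self, sample):
            ...  # likewise

    # The whole point: one line to a functional construct,
    # then straight on to the actual experiment.
    net = NeuralNet(inputs=64, outputs=10)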

Now I think about it... I guess I am moving further away from the metal in a number of threads. The attraction of building another 300kloc program just to get something high enough to run a couple of stepper motors as an abstract unit within its own work envelope is just depressing. Maybe it's just fatigue. Having re-invented the wheel a few times and worked with so many packages that have done the same thing, over and over again, I am just tired. There is a certain point at which the idea of inventing the same wheel in yet another immature language becomes downright depressing. Trying to map the concepts that I have spent countless hours of bloody-minded effort learning onto a simpler, faster way of doing it... almost seems a step backward.

The time spent learning Basic, Pascal, VB, Assembly, then C and C++, OOP, managed code, VBA, Perl, Python, Lua, regex, various libraries and windowing toolkits, generic programming, functional programming, logic programming, scripting languages, embedded languages, all the IDEs, debuggers, profilers, static analysis tools, patterns, refactoring, testing frameworks, graphics libraries, game engines, encryption libraries, AI engines, physics engines, and now GPU languages, network stacks, databases, servers, OSes, parallel programming, threads, memory models... the hours and hours spent looking for solutions that you know must be there but will not turn up in a search no matter how you rack your brain to describe them... all fading into irrelevance.

Now I can barely perceive the metal through the layers of code. Working in VBA over the objects in Office is a totally different model. So much of what you know is useless. You can do it the easy "Office" way, or you can try to torture the system by mapping your own ideas over it and quickly find the limits of what is possible. It's not really OO, it's not really any of the techniques you may have known; it's impure, it's unpleasant; it's still possible to have control over some things, but not an even level of control; you can still manage lifetimes, but not easily... In the back of your mind the darkness grows and you start doing things the "Office" way... and then the .NET way... and before you know it you are on the slippery slope to being a competent Access developer. No longer battling against the restrictions but hacking fast and loose and getting stuff done... not worrying about knocking up a quick-and-dirty object to encapsulate some code, and putting in dirty switch logic that you would be embarrassed to write in C++. It just works. It's not something that will come back to haunt you, because it will get replaced in the next round of refactoring and massaging. In fact, adding the overheads of "Quality" just makes the code that little bit more rigid and expensive to change when it needs to.

My feeling is that the value proposition has been reconfigured by the lighter, more dynamic systems that combine high-order abstractions with loose glue languages. There is much less value in building comprehensive code that is robust and complete, because the very nature of these systems is fluid. The quality has been pushed from the code you write into the objects you trust. We are delegating at a much higher level. Not only are you delegating responsibility for functionality, but also high-level error handling and self-management.
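A minimal sketch of what that delegation looks like in practice: driving Excel as a high-level COM object from Python glue code. It assumes Windows, an installed copy of Excel and the pywin32 package; the output path is just an example.

    # Glue code "doing it the Office way": drive Excel through COM and
    # let it own all the objects. Assumes Windows + Excel + pywin32.
    import win32com.client

    excel = win32com.client.Dispatch("Excel.Application")
    excel.Visible = False

    wb = excel.Workbooks.Add()
    ws = wb.Worksheets(1)

    # No object model of my own, no lifetimes to manage carefully --
    # just poke values into cells that Excel owns.
    for row, value in enumerate(("alpha", "beta", "gamma"), start=1):
        ws.Cells(row, 1).Value = value

    wb.SaveAs(r"C:\temp\glue_demo.xlsx")  # example path
    wb.Close()
    excel.Quit()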

Suddenly COM has come into its own. Web APIs are next. The objects are not only a black box, they are a black wall. With a tiny window in it. You can talk through that window, and the rest totally, and I mean totally, takes care of itself. This promotes the idea of loose coupling in a way that I had not fully appreciated before. Having some sort of abstract communication mechanism between your glue code (or Sloppy Code) and the API of a service provider forces you to keep it at such a long arm's length that many of your assumptions are obviously violated in ways that force you to stop making them. Communicating via JSON or XML or some non-native intermediate language stops you depending on things that are too easy to depend on when you are passing longs across to the API of a dll that you have loaded by hand into memory. There is both less and more to trust in the relationship. The illusion of control has been stripped away that little bit more, and you need to be a little more accepting of how little you know or can depend on in the exchange.
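The web version of the window is narrower still. A minimal sketch, assuming a hypothetical JSON endpoint: all you can do is ask, parse what comes back, and plan for the days when nothing does.

    # Talking through the tiny window: a stateless JSON request to a
    # service you don't control. The URL is hypothetical.
    import json
    import urllib.error
    import urllib.request

    URL = "https://example.com/api/motion/v1/clips"  # hypothetical endpoint

    try:
        with urllib.request.urlopen(URL, timeout=10) as resp:
            payload = json.load(resp)  # the data is all you get -- no internals
    except (urllib.error.URLError, ValueError) as err:
        # No stepping into the server's code, no recompiling from source.
        # The service either works or it doesn't, and you plan for both.
        payload = None
        print("service unavailable or sent junk:", err)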

I think pulling data from a web service in a stateless transaction is a cleansing experience. Much as I once heard John Carmack quoted as saying that working on driver code is good for a programmer's soul, so too is working through a flimsy API, via a crappy intermediate language, with a service hosted on a computer who-knows-where, running an unknown OS, maintained by someone across a public network of wildly fluctuating service availability, and with no illusion of control. It's humbling, frustrating and simple, while being ugly and primal at the same time.

The idea that you can profile the service and get into the guts of the code and fix stuff... just goes away. You need to be more accepting of your limits and realistic about the choices you can make and the cost of making them. Because the choices are real simple. You can't wait for an updated version of the library or compile one from source. You can't look for alternative options or roll your own... the whole system has become about the data in the system, rather than the framework of code through which it flows. No matter how the abstractions within the code facilitate reuse or change... it's just crap that will either work or get out of the way.

Moving on... I want high-level APIs (essentially an AI that I can talk to in natural language that will then go and do stuff in an organic way), and I want it now... end of post.