Tuesday, May 17, 2011

Rant on Exception Handling Strategy

I have finally made some peace with exception handling... again.

After refactoring the way I thought about exceptions so many times, I have come to a couple of (probably not new) realizations.

1. The program should never crash. It should exit gracefully even if it's going out backward in a cloud of smoke. A programmer who lets an exception smash his software is just not a contender.

2. All exceptions should be caught and reported in a controlled way as close to the root of the program as possible. This results in main() being a very thin wrapper around the top-level exception catch. (Since it's always a fairly thin wrapper around the root object creation point anyway, this is no bad thing.)
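To make the "thin main()" idea concrete, here is a minimal sketch in Python. The names run_application and the "--fail" flag are made up for illustration; the only point is that main() does nothing except call the real program and own the top-level catch.

```python
import sys
import traceback


def run_application(argv):
    """Hypothetical stand-in for creating the root object and running it."""
    if len(argv) > 1 and argv[1] == "--fail":
        raise RuntimeError("simulated failure deep inside the program")
    return "done"


def main(argv):
    """Thin wrapper: the only job here is the top-level catch."""
    try:
        run_application(argv)
        return 0
    except Exception as exc:
        # Report in a controlled way, then exit gracefully with a non-zero code.
        print(f"Sorry, something went wrong: {exc}", file=sys.stderr)
        traceback.print_exc(file=sys.stderr)
        return 1


# A typical entry point would be: sys.exit(main(sys.argv))
```

Returning an exit code instead of calling sys.exit() inside main() is just one way to keep the wrapper easy to test.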

3. Yes, exceptions come in three flavors: user errors, programmer errors and system (environment) errors.
 - User errors are violations of the user's assumptions and need to be explained clearly and in detail, usually with some sort of reference to "More Information".
 - Programmer errors (bugs) are violations of the programmer's assumptions. These are things that should not exist in the finished product but usually do. They need to be reported in enough detail to be reproduced, identified and fixed. This may be a detailed log, a stack dump, whatever.
 - System errors are violations of the programmer's or user's assumptions about the state of the environment in which the software is running. As this is usually a dynamic system... to a certain extent... shit happens. These need to be handled when they can be handled, reported to the programmer when they cannot be handled, and apologized for to the user.
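One possible way to encode the three flavors is a small class hierarchy plus a reporting function that picks the tone for the user. This is only a sketch; every class and function name below is invented for illustration.

```python
class AppError(Exception):
    """Hypothetical common base for errors the program raises itself."""


class UserError(AppError):
    """The user's assumptions were violated."""


class ProgrammerError(AppError):
    """The programmer's assumptions were violated (a bug)."""


class SystemEnvironmentError(AppError):
    """The environment is not in the state we assumed."""


def message_for_user(exc):
    """Choose what the user sees, based on the flavor of the error."""
    if isinstance(exc, UserError):
        # Explain clearly and point at more detail.
        return f"{exc} See 'More Information' for details."
    if isinstance(exc, SystemEnvironmentError):
        # Apologize; the user did nothing wrong.
        return f"Sorry, the environment let us down: {exc}"
    # ProgrammerError and anything unclassified is treated as a bug.
    return "Sorry, an internal error occurred. It has been logged."
```

Anything that arrives unclassified falls through to the "bug" branch, on the assumption that an exception nobody translated is itself a programmer error.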

4. Most exceptions are not explained in a useful fashion and need to be translated, even for programmers. They all have dynamic context that should be captured as close to where they are thrown as possible. There is no point letting a weird exception unwind the stack and erupt at the high-level handler with a cryptic message and no contextual information. It's just wasting everyone's time. So catch it at the site, decorate it with all the nice contextual information that will make it understandable, and then let it rip.
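Here is a rough sketch of "catch it at the site, decorate it, let it rip" in Python. ConfigError and load_settings are hypothetical names; the technique is ordinary exception chaining with "raise ... from", which keeps the original exception attached as the cause.

```python
import json


class ConfigError(Exception):
    """Hypothetical translated exception that carries local context."""


def load_settings(path):
    """Read a JSON settings file, translating low-level failures at the site."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError as exc:
        # We know which file and why it matters here; the handler at the top does not.
        raise ConfigError(f"settings file {path!r} does not exist") from exc
    except json.JSONDecodeError as exc:
        raise ConfigError(
            f"settings file {path!r} is not valid JSON "
            f"(line {exc.lineno}, column {exc.colno})"
        ) from exc
```

The top-level handler then sees a message that names the file and the problem, and the original exception is still reachable through __cause__ for the log.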

5. My custom exception class has two messages: one for the user and one for the programmer. Even when it's the user's assumptions being violated, there is useful information for the programmer. This needs to go into an error log. Even the fact that lots of users hit a similar error means you should probably spend some time improving something.
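The post does not show the actual class, so the following is only a guess at what a two-message exception could look like in Python. DualMessageError and parse_age are illustrative names, not the author's code.

```python
class DualMessageError(Exception):
    """Hypothetical exception carrying one message per audience."""

    def __init__(self, user_message, programmer_message):
        # str(exc) gives the programmer message, which is what ends up in logs.
        super().__init__(programmer_message)
        self.user_message = user_message
        self.programmer_message = programmer_message


def parse_age(text):
    """Example throw site: a user error that still tells the programmer something."""
    try:
        return int(text)
    except ValueError as exc:
        raise DualMessageError(
            user_message="Please enter your age as a whole number, e.g. 42.",
            programmer_message=f"parse_age: int() rejected input {text!r}",
        ) from exc
```

The top-level handler would show user_message on screen and write programmer_message to the error log.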

6. The error log is gold.  Log everything.  Put in keys so it can be imported into a database and harvested for patterns.  Make your life easier. 
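As one way of "putting in keys", each log entry could be written as a single JSON object per line, which most databases and scripts can import without custom parsing. The field names below (kind, operation, detail, context) are assumptions chosen for this sketch.

```python
import json
import time


def format_log_record(kind, operation, detail, **context):
    """Build one log line as JSON so it can be imported and queried later."""
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "kind": kind,            # e.g. "user", "programmer" or "system"
        "operation": operation,  # what the program was trying to do
        "detail": detail,        # the programmer-facing message
        "context": context,      # any extra key/value pairs from the throw site
    }
    return json.dumps(record, sort_keys=True)


def append_to_log(path, line):
    """Append one record to the log file, one JSON object per line."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(line + "\n")
```

For example, format_log_record("user", "open_file", "file not found", path="a.txt") produces a line whose "kind" and "operation" keys can later be grouped and counted to find patterns.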

7. Catch your exception at the throw site. Handle it or format it for reporting. Then either resume or throw it to the top-level reporting handler. This is easy for trivially sized programs that can exit gracefully. It's a pain in the ass for large multifunction apps that have very complex state. They should still be designed to have a thin shell (UI) that generates the state and serves as the catch layer for anything that is thrown from the depths. Where the state is trashed or becomes uncertain, the users should still be left with a UI that will allow them to retry the same operation. The program should not ghost out.

8. There should be layers of stability within the program.  The exception should unwind back to the previous layer of stability.  This layer should be in a known state and be immune to damage from the exception.  The User should then be able to move forward in the task again and test a different idea or variation without the whole program collapsing.
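A small sketch of one "layer of stability", under the assumption that the state is cheap enough to copy: each operation works on a copy, and the known-good state is replaced only if the operation finishes. Session and the two sample operations are hypothetical.

```python
import copy


class Session:
    """Hypothetical stable layer: owns the known-good state."""

    def __init__(self):
        self.items = []

    def execute(self, operation):
        """Run an operation on a copy; commit only if it succeeds."""
        working = copy.deepcopy(self.items)
        try:
            operation(working)
        except Exception as exc:
            # The copy is thrown away, so self.items is untouched.
            return f"Operation failed, nothing was changed: {exc}"
        self.items = working
        return "ok"


def add_three(items):
    items.extend([1, 2, 3])


def add_then_fail(items):
    items.append(99)
    raise ValueError("simulated failure halfway through")
```

After a failed add_then_fail the session still holds its previous items, so the user can try again or try something else. Copying the whole state is the crudest way to get this guarantee; larger programs would need something smarter, but the shape is the same.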

9. Wrap some protection around constructors and RAII-style activity. Be gentle when chaining constructors. Have layers so that only one or two steps unwind from a failed construction... not everything.
Give the user some levels of interaction (unless it's a script-driven or command-line app, obviously). Let them find their way through the code and around the problems.
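Python has no RAII as such, but the standard library's contextlib.ExitStack gives a similar effect for multi-step construction: if a later step fails, only the steps already completed in that small group are unwound. The Resource class and build_pair function are invented for this sketch.

```python
from contextlib import ExitStack


class Resource:
    """Hypothetical resource that records when it is opened and closed."""

    def __init__(self, name, journal, fail=False):
        if fail:
            raise RuntimeError(f"could not acquire {name}")
        self.name = name
        self.journal = journal
        journal.append(f"open {name}")

    def close(self):
        self.journal.append(f"close {self.name}")


def build_pair(journal, fail_second=False):
    """Acquire two resources; if the second fails, release only the first."""
    with ExitStack() as stack:
        first = Resource("first", journal)
        stack.callback(first.close)
        second = Resource("second", journal, fail=fail_second)
        stack.callback(second.close)
        # Both steps succeeded: cancel the pending cleanups and
        # hand ownership of the resources to the caller.
        stack.pop_all()
        return first, second
```

Keeping each such group to one or two steps means a failed construction unwinds a small, well-understood amount of work rather than everything.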
