Friday, March 25, 2011

Testing with Mocks, stubs and an ugly testing framework

This article provides a good argument about some of the subtle issues with unit testing. However, the points are hard to make out through the vague illustration and the uncommented code, which is polluted with yet another fuggly testing framework.

I am getting to the point where I really cannot look at another framework that introduces yet another sub-dialect of a language in an attempt to "simplify" some aspect of our lives.  It probably started off as a good idea but has then been taken to a point where it's inaccessible for new users and casual readers. At least add some fucking comments to the code to help clarify what the fuck you think is going on.  That's the point of comments.  Tell us what you want the code to be doing.... even when it's not actually doing it right.  We can read the code all we like and still never get any insight into what's going on in your head.  All we can do is draw our own conclusions about your head, the level of damage it may have suffered and the incompetence it's demonstrating and then talk to our friends and accomplices about how to track it down and give it a kicking.  Code is not comments.  Comments are not code.  They have the capacity to tell different stories.
It's good that machines cannot read comments and "help" us by fixing them up and keeping them in sync with the code. Can you imagine just how much of a mess it would be if your IDE "helped" by re-writing the comments to match the horrible code that some crappy coder turned in?  (You, perhaps.)  Imagine having to pull old copies of the code from the repository just to verify whether the comments have changed and to try to recapture the semantics of the original design notes that you wrote when your head was fresh.  What a pain in the arse. Long live stupid IDEs.

Dueling Algorithms Paper

This is interesting.  Must read the paper.

Monday, March 21, 2011

Willow Garage ROS


This is looking most interesting.... bit expensive though.

Tuesday, March 8, 2011

5 axis cnc design

This is a nice build. Good work envelope.  Need to check back later and see if he has updated the resources.

This is another project worth looking at. Seems to be struggling a bit and may die from lack of interest. It's not particularly unique either.

Monday, March 7, 2011

Friday, March 4, 2011

CoreWars and CoreLife

There is a thought niggling at the back of my brain about this but it's getting drowned out by other things. Read this and come back to it later.

Quote of the day

"Christianity is a spectrum disorder."

Think about it.  Equally applies to so many other things....

Thursday, March 3, 2011

Article on Windows Malware over the past 20 years

This is a very good read. Good summary of strategies and their effects.  It would be interesting to look at the variations in strategy against some metric of spread rates. This is mostly a factor of how fast the module can communicate -> infect -> not clash -> repeat.

The best point is again the final one... people are now the weakest link in the chain.  How do we mitigate this risk?

Comment Thread on Bad Software

There is an interesting thread of comments on this post. Philosophical insight into programmers and perfection.

Logisim Tool

Need to have a look at this in depth.

Unspoken Truth about managing geeks article

This is a good read. Lots to think about.

And from another point of view....

Another Hardware Hacking site

Seem interested in xy lasers. Good info.

Wednesday, March 2, 2011

Article on the Joys of Maintenance Programming

There is a lot of resonance in this article.  I find the concept of Joy in programming very fulfilling.  I like getting a project to the point where it's functioning and moving more into the maintenance phase.  At this point the reward for finding and fixing a bug is much bigger and more specific than for solving a problem during the initial development. During development, usually it's just you who experiences the pain of a bug and the rush of the solution; during production there is a much bigger payoff for solving a specific bounded bug that has direct and measurable impact on a whole group of people.  The payoff is multiplied.

Maintenance is the art of keeping a fine polish on a system, rather than hammering in the foundations that no one ever sees.

C++ Language Arguments

This looks like an interesting blog.  Some of the usual rants in the comments but a few pearls hidden around the edges...

Design of Experiment Systems

I want to outline my ideas on a general pattern for most of the experiment systems I work on.

1) Design Stage
- Capture the requirements
- Walk through some scenarios
- Collect, scrounge, borrow what is needed

2) Experiment Module
- Generate stimuli / Survey / Test / Assessment etc
- Sense response
- Log response

3) Library Module
- Collate raw data
- Preserve raw data
- Curate and add meta data

4) Data Transformation Pipeline 
- Clean bad records
- Cull unwanted elements
- Map from one form to another ( Coding, transforms, etc)
- Summarise ( Cook down )

5) Analysis Module
- Summary Stats
- Other Stats
- > To Publications and Presentations

6) Presentation Module
- Visualisations, Graphs, Screen Shots etc
- > To Publications and Presentations

This to me is a fairly graceful set of units that abstract the process and provide the maximum flexibility and reuse.

What do I use for the various Modules?

Design stage
Word, Email, Excel, Project, Mind Maps etc. This stage is primarily about communication. Getting the information and pieces in place to reduce the uncertainty in the project as early as possible.  This is the stage to test the researcher's resolve and see if they have really settled in their mind what they are doing and are committed to seeing it through.

Experiment Module
PointLightLab, E-Prime, ExperimentBuilder, Superlab, Biopac EEG, Eyelink, BrainAmp EEG, Matlab, C++, PERL, Python, Visual Basic, 3ds Max, Premiere, Mudbox, Audacity, Photoshop, SurveyMonkey, Qualtrics, Combustion, hardware hacking etc... an endless number of tools get used to build this stage.

Library Module
CSV files, Access DBs, spreadsheets, log files.  Anything that is easily accessed by other software and can be read and recovered in a couple of years. Binary files are bad.
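
A minimal sketch of the "preserve raw data, curate and add meta data" idea in Python. This is only an illustration under my own assumptions: the function names, the `.meta.json` sidecar convention and the metadata fields are all hypothetical, not part of any tool listed above. It copies a raw file into a library folder, records a checksum and some metadata in a plain-text sidecar, and marks the copy read-only.

```python
import hashlib
import json
import shutil
import stat
from pathlib import Path


def archive_raw_file(source: Path, library_dir: Path, metadata: dict) -> Path:
    """Copy a raw data file into the library and write a JSON sidecar next to it.

    The sidecar naming (<file>.meta.json) is a hypothetical convention.
    """
    library_dir.mkdir(parents=True, exist_ok=True)
    target = library_dir / source.name
    if target.exists():
        # Never overwrite raw data that has already been archived.
        raise FileExistsError(f"{target} is already in the library")
    shutil.copy2(source, target)

    # The checksum lets anyone check later that the raw file is unchanged.
    digest = hashlib.sha256(target.read_bytes()).hexdigest()
    sidecar = target.with_name(target.name + ".meta.json")
    record = {"file": target.name, "sha256": digest, **metadata}
    sidecar.write_text(json.dumps(record, indent=2))

    # Mark the archived copy read-only to discourage hand edits.
    target.chmod(stat.S_IREAD | stat.S_IRGRP | stat.S_IROTH)
    return target


def verify_raw_file(target: Path) -> bool:
    """Return True if the file still matches the checksum in its sidecar."""
    sidecar = target.with_name(target.name + ".meta.json")
    recorded = json.loads(sidecar.read_text())["sha256"]
    return hashlib.sha256(target.read_bytes()).hexdigest() == recorded
```

Both the data and the sidecar stay as plain text, so they should still be readable in a couple of years without the script that wrote them.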

Data Transformation Pipeline
Ideally this is completely automated using simple macros and scripts. The reason for the automation is that this module involves a great deal of repetitious labour and the chance of human error is huge.  This should generate a completely reproducible result given the same inputs. So human fiddling is naughty.
I use PERL, Python, VBA Macros,  MatLab  + Libs, SciLab, SQL, Visual Basic and C++ ( with various libs) when I need to do some heavy lifting.
Generally this is a specific toolchain for every project. The only reuse is where a particular researcher continues to use and evolve the same tool set over multiple experiments or projects.
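
As a rough illustration of the clean / cull / map / summarise steps, here is a small Python sketch. Everything in it is made up for the example: the column names, the sample log, the "practice" condition and the numeric condition codes are assumptions, not taken from a real project. The point is only that each stage is a small function and the whole run is a fixed sequence with no hand edits in between.

```python
import csv
import io
from statistics import mean

# Hypothetical raw log, as an experiment module might have written it.
RAW_LOG = """participant,trial,condition,rt_ms
p01,1,congruent,512
p01,2,incongruent,640
p01,3,practice,900
p02,1,congruent,
p02,2,incongruent,701
p02,3,congruent,480
"""


def clean(rows):
    """Drop bad records: here, rows with a missing or non-numeric reaction time."""
    return [r for r in rows if r["rt_ms"].strip().isdigit()]


def cull(rows):
    """Remove unwanted elements: here, practice trials."""
    return [r for r in rows if r["condition"] != "practice"]


def recode(rows):
    """Map from one form to another: condition names to codes, text to integers."""
    codes = {"congruent": 0, "incongruent": 1}
    return [
        {"participant": r["participant"],
         "condition": codes[r["condition"]],
         "rt_ms": int(r["rt_ms"])}
        for r in rows
    ]


def summarise(rows):
    """Cook down to the mean reaction time (ms) per condition code."""
    by_condition = {}
    for r in rows:
        by_condition.setdefault(r["condition"], []).append(r["rt_ms"])
    return {cond: mean(values) for cond, values in sorted(by_condition.items())}


def run_pipeline(raw_text):
    """Run every stage in a fixed order so the same input gives the same output."""
    data = list(csv.DictReader(io.StringIO(raw_text)))
    for stage in (clean, cull, recode, summarise):
        data = stage(data)
    return data
```

Calling `run_pipeline(RAW_LOG)` twice gives identical summaries, which is the property that matters here. Swapping a stage means replacing one small function rather than re-doing manual steps.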

Analysis Module
This module gets down to being the favorite stats tool of the researcher. Where the analysis is fairly straightforward (small data set, known stats models, well understood assumptions) it's usually just a job of pushing the data into a format suitable for the tool: SPSS/PASW, Excel, R.

Presentation Module
Generating Movies, Audio Tracks, Still Images, Animations, Graphs, Interactive Visualisations.  This can be anything from PowerPoint Gymnastics through Excel, Sigmaplot, Processing, VBA, 3ds Max, Premiere, Quicktime, Photoshop etc etc etc. Usually lots of multimedia work driven from stats outputs. Every so often I get to do something more fun like animating a fish or visualising an MRI plot.

Ok that's my brain dump for the day.....

Tuesday, March 1, 2011

Risk Management and Building Experiment systems

I'm busy building a test suite at the moment in E-Prime for a client and it got me thinking about some issues particular to building experimental systems.  The issue I'm looking at is that of embedding analysis code inside the experiment module itself. (By experiment module, I mean the software unit/system that runs the experiment and collects the raw data.)

My thought on this is that it's a "bad thing"(tm). My general attitude is that during the experimental run there should be as little code running as possible. Just enough to present the stimuli and log the result.  Analysis can be done afterward, i.e. not in real time.

This provides a couple of benefits.

1) There are fewer things that can go wrong during the experiment and crash/trash a run.
2) You can re-think the analysis later on. (Some good, some bad)
3) Researchers can't play the black box game and "trust" the software to always be right.
4) You can't fix bugs in real time. 
5) It provides separation of functionality.  The experimental software is focused on doing one thing right.
6) It spreads the cost of development over time.
7) It allows you to use different tools for different parts of the tool chain.
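
To make "just enough to present the stimuli and log the result" concrete, here is a minimal Python sketch. It is not how E-Prime does it; the `present` and `get_response` callables are hypothetical stand-ins for whatever really shows the stimulus and reads the response, and the log columns are my own choice for the example.

```python
import csv
import time
from typing import Callable, Iterable, TextIO


def run_experiment(
    stimuli: Iterable[str],
    present: Callable[[str], None],
    get_response: Callable[[], str],
    log_file: TextIO,
) -> int:
    """Present each stimulus, log the raw response, and do nothing else.

    Returns the number of trials that were run.
    """
    writer = csv.writer(log_file)
    writer.writerow(["trial", "stimulus", "response", "rt_s"])
    trials = 0
    for trial, stimulus in enumerate(stimuli, start=1):
        present(stimulus)
        started = time.perf_counter()
        response = get_response()
        rt = time.perf_counter() - started
        # Log raw values only: no scoring, no filtering, no statistics.
        writer.writerow([trial, stimulus, response, f"{rt:.6f}"])
        # Flush after every trial so a crash loses at most the current trial.
        log_file.flush()
        trials += 1
    return trials
```

There is no scoring or summarising in the loop, so there is very little that can fail mid-run, and the analysis can be rewritten later against the same log.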

This all works fine until you have a feedback system that depends on some sort of calculated property based on the results.  But that's ok... it just takes a little more testing to verify.
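
One way to keep that extra testing manageable is to isolate the calculated property in a small pure function that can be checked without running the experiment. The Python sketch below is only an example under my own assumptions: the difficulty-level rule (step down after an error, step up when the last two responses were both correct) and the name `next_level` are hypothetical, not from any particular experiment.

```python
def next_level(level: int, recent_correct: list[bool],
               lowest: int = 1, highest: int = 10) -> int:
    """Return the difficulty level for the next trial.

    Hypothetical rule: step down after an error, step up when the last
    two responses were both correct, otherwise stay at the same level.
    """
    if recent_correct and not recent_correct[-1]:
        return max(lowest, level - 1)
    if len(recent_correct) >= 2 and all(recent_correct[-2:]):
        return min(highest, level + 1)
    return level
```

Because the function has no side effects, every branch and both limits can be checked with a handful of plain assertions before it ever goes near a participant.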

None of this makes a system idiot proof. You can still introduce bugs into an analysis tool chain just as easily as into an experiment module. But when the modules are smaller and simpler it's easier to verify each one individually.

I've found a couple of researchers who are of different opinions to me and want the system to do everything in a single all-consuming step (generate, collect and analyze). I think this illustrates a general lack of understanding of process and methodology rather than any particular lack of insight into software or programming. My generalization is that they have used fairly high end systems that did a great deal of hand-holding rather than being used to creating their own toolchain out of lower level units.  Neither good nor bad but just a difference in expectations between myself and the client which needs to be managed early and often. It also speaks of a "delegation" mindset. This is fine when there are lots of RAs working on the project and you are delegating to a person that you trust and can fiddle with the process until they get it right. But delegating to a piece of software carries a degree of fragility and untested assumptions.

The other side of this is that having a researcher who is very clear on their analysis before the data collection begins is both interesting and somewhat risky.  I like that they are prepared and have clarity on what they want to do with the data, but on the other hand, I find that until you really get a look at the raw data, it's always a slight unknown. (Which is the point of experimentation.) So I find that crafting an analysis toolchain that will inhale, clean, process and summarise the data all without the researcher having to look at the raw data... worries me.  There are just too many untested assumptions hard wired into that process. Too much blind trust.

The other side is obviously just as dangerous. A researcher who has a vague or non-existent idea of what analysis they want to run on the data and wants to "see" what it looks like before they decide... is usually a bad sign.  Sometimes it's just that they're a visual learner and can't articulate what is still clear in their head... the other case is where they are just wasting everyone's time and really have no idea what they are doing and should get out of the lab until they figure it out or get a job with a private sector public opinion firm.  How they get their research proposals past Ethics is a mystery to me....

I guess it all comes down to the issues of the relationship between the researchers and their process. Mostly they are exploring a half grasped idea and improvising as they go. This demands a degree of flexibility from the tools and modularity.  Very clean and clear interfaces that don't leak assumptions.  Decomposition of the problem rather than excessive composition of functionality. All this with an eye toward mitigating the risks of:

1) Changing ideas and the associated cost to change
2) Change in research focus
3) Unexpected failures in other modules
4) Insufficient testing due to constant evolution
5) Poor time planning
6) Other parallel projects and tasks cause distraction

Unfinished idea