Stratasphere: February 2011

Monday, February 28, 2011

HackHut site

http://hackhut.com/

Yet another place to get ideas.... and waste more time than I have.... Bad.... bad site.... must delete bookmark.... finger ... not... working... (yet)

http://imranontech.com/2007/01/24/using-fizzbuzz-to-find-developers-who-grok-coding/#comment-10055

Article on trivial programming tasks to test basic programming skills.

This is both fun and intricate. The comments are an interesting look at code geeks at play. Some of the solutions posted are cryptic/wrong/ugly and I would be embarrassed to put them into production code, but I respect the urge to have fun with the problem and turn it into a competition. I think the ability to have fun and "play" in code requires both fluency and mastery. Getting it wrong just shows there are still things to learn, not that you should not be trying to play the game.

My "professional" solution would be:

//Solution 1

//Code to solve the FizzBuzz task

//This should print the numbers from 1 to 100 and replace any that are a multiple of 3 with the string "Fizz" and a multiple of 5 with the string "Buzz". Multiples of both 3 and 5 should be replaced with "FizzBuzz".

#include <iostream>

void FizzBuzz()
{
    for (int i = 1; i <= 100; ++i)
    {
        if (i % 3 == 0 || i % 5 == 0)
        {
            //A multiple of both prints "Fizz" then "Buzz" on the same line
            if (i % 3 == 0)
                std::cout << "Fizz";
            if (i % 5 == 0)
                std::cout << "Buzz";
        }
        else
        {
            std::cout << i;
        }
        std::cout << std::endl;
    }
}

int main()
{
    FizzBuzz();
}

Naturally I also wanted to play and see how clever I could get... but as there were some really ugly one liners already in the comments I took a different approach.

//Solution 2 or the fun with rules/code solution

#include <iostream>

int main()
{
    std::cout << "the numbers 1 to 100" << std::endl;
}

Endless entertainment....

Thursday, February 24, 2011

Dumped on by Data

http://chronicle.com/article/Dumped-On-by-Data-Scientists/126324/

Another article about the impact of IT and big data on researchers and their ability to function.
This implicitly identifies a lot of associated issues with researchers that are of interest from an IT perspective.

Wednesday, February 23, 2011

Maze Generation Algorithms

http://weblog.jamisbuck.org/2011/1/12/maze-generation-recursive-division-algorithm

Something to read later.

Some Machine Learning Articles

http://ibmresearchnews.blogspot.com/2011/02/watsons-wagering-strategies.html
Article on Watson's wagering strategies for winning Jeopardy.

http://blogs.technet.com/b/next/archive/2011/02/16/machine-learning-for-dummies-john-platt.aspx
Machine Learning for Dummies from Microsoft. Bit light.

http://scpro.streamuk.com/uk/player/Default.aspx?wid=7739
IET Lecture on Machine Learning by Christopher Bishop.

The Solver Manifesto

http://www.solversmanifesto.com/

There is much resonance here. I describe it as being a solution not a problem. Being a go-to guy. Being the expeditious solution. That sort of thing. The biggest difficulty comes when someone asks "what do you do?". I've tried to formulate the answers to that question on another of my pages... but it took a couple of pages to answer. It was more a list of all the things I've already done, not all the things I could potentially do.

Funny to think of yourself as a solution looking for a problem.

Once you get into that game, it's then a question of either taking whatever problems come or starting to select the problems based on some criteria. Bigger problems? Most problems solved in a period of time? New problems only? Better compensated problems? More impressive problems? Problems with the largest scope for improving/destroying society? Starting to sound like an academic... oh wait... where do I work again?

The final step is documenting your solutions so others can benefit from the solution you have presented. For that you need a simple, cheap, quick mechanism to publish/edit/reply-to the solutions.... ta-da... the web/blog and a cheap digital camera .... oh wait... that's what I'm doing here.

As I said, much resonance.

I would contest the end of the article about the respect issue. I have come to think that you will never get respect because most people can't really identify a "Solver" and place them in their hierarchy of respected people. There is too much cognitive dissonance, especially when the "solver" does not meet the criteria for all the other people they respect (rich, famous, important, well dressed, parent, role model, older, educated etc). You will get respect if you can display some or all of these other criteria as well as being a "solver", but it's too hard to respect the property of being a "solver" by itself. It's even harder to get respect if you display any properties that are in opposition to those on the respect list (young, badly dressed, less educated, modest income etc).
Why is this so? It gets back to the problem of people not being able to put a box around a solver... they are not predictable. You do not fit neatly into a single box... instead you fit loosely into lots of boxes. But you never fill any one box completely. So if you want a bit of video hacked ... fine, get a solver to do it. But if you want a professional job done... get a professional video editor to do it and pay accordingly. Solvers are unknown and thus untrusted to be "best" at anything. They are "good enough" when you need to minimize costs on a job ( time, capital, manpower etc).

No one gets respect for being the cheapest, fastest, Q&D solution. People get respect for excellence.

Monday, February 21, 2011

Another Scientist makes bad Software Engineer article

http://www.nature.com/news/2010/101013/full/467775a.html

This is interesting in that it repeats the same message that I and many others have been talking about (in some people's cases) since the '80s. Most scientists are very good at their domain but not good at other things, like software engineering... surprise.

Friday, February 18, 2011

Data Recovery on a WD MyBook 500 GB drive

The drive is partially functional but has a heat-based failure fairly soon after warming up. I have dismounted the drive IO board and it shows obvious discoloration around the major SMDs. I have tried using a fan to keep the IO board cool but without any luck. Apparently, because the firmware is stored on the actual drive rather than on the IO board, it's much harder to simply swap the IO board on modern drives. Hence this job is rapidly getting out of my reach. I have recommended that the client pass the job to an external professional. The quote is $495, which I thought was very reasonable to recover a few hundred gigs. Anyway, it's in the client's hands now.

There seems to be a bit of noise on the web about Western Digital drives failing in the MyBook series so I may have to re-think my attitude to Western Digital being a nice stable brand. Go figure.

Friday, February 11, 2011

Thingiverse

Drool...

http://www.thingiverse.com/

The Reading List

After a few weeks of conscientiously trying to manage my online reading I have cut down my reading list even more.

The only feeds I am now following are:

AAAI's AI topics
Hack a Day
CodeProject

These are manageable with a very high signal to noise ratio. I wish the AAAI stuff was better formatted but that's a minor bitch. I have recently killed off ScienceDaily because it was burying me under a landslide of marginally interesting shit. At last count it was sitting at 1251 unread items. That's insane. No one can keep up with that amount of crap. They have no system to specialize the feed or split it by topic so you get everything or nothing. I pick nothing.

There are a few other low volume blogs that are interesting, mainly math, stats and some robotics stuff... but I'm being more cautious about committing my time to testing them. They need to earn my trust first and display the high signal to noise ratio that they need to be worth my time.

Energy Accumulators Game

What's a primitive energy accumulator?

Ok, the rules of the game are:

1. Must be found/built by a single person.
2. Must not require tools/equipment that can't be carried/transported by a single person.
3. Must be maintainable by the same person or another similarly versed with the same tools.
4. The design must be able to be communicated verbally.
5. Uses common/found/grown/scavenged materials.

6. Energy must be able to be input into the system.
7. Energy must be extractable from the system on demand (Within an hour of that demand anyway..)
8. The system must be re-usable ( not consumed apart from general wear and tear or environmental damage)
9. The system must be reliably efficient (i.e. it can be low efficiency but it should be similarly efficient between uses. Not unpredictable is the point)

10. The system must be useful and reasonable for the intended purpose and the need cycle of that purpose. (i.e no possible but implausible systems)
11. The design must be implementable at more than one site. (i.e. not dependent upon some unique site specific feature/s)
12. The system must be implementable within a reasonable period of time (<1yr)

Libertarian Sci fi Fiction

http://en.wikipedia.org/wiki/Libertarian_science_fiction

Never realized this existed. Makes sense. Wonder what the parallels with dystopian and post-apocalyptic genres are?

Sci Fi to catch up on

http://en.wikipedia.org/wiki/Hugo_Awards

http://en.wikipedia.org/wiki/Nebula_Awards

To read.

Strandbeest and low cost primitive energy accumulators

http://www.strandbeest.com/beests_storage.php

I could enjoy playing with something like this. The next level is to add some primitive sensors and directionality to the beast so it can sense the wind and direct itself into the wind in such a way as to optimize its energy input.

Then give it a sense of energy conservation, so it can evaluate its own energy stores and decide to conserve energy when the wind ( energy source ) is low.

This is an interesting example of very low tech energy accumulators. It uses air pressure and polycarbonate bottles as the pressure vessels. All you need is a pump, pipes and some valves and you have a cheap, maintainable and accessible energy system. It's low power and low availability but simple enough to set up anywhere there is wind.

Once you have some sort of turbine or other converter you could hook all sorts of wind sources ( steam, vapor, waste air pressure etc) into the same system.

Building a low efficiency turbine using similar materials and some sort of magnets and induction coils should be fairly straightforward. The materials are getting a little more exotic, however (copper wire and some magnets), but still fairly easily scrounged from any stream of modern waste.
If you have access to old electric motors then you are already there, just run one backwards. So you have the ability to harvest wind energy, store it and deploy on demand.

Now we just need a big enough bottle farm to hold all the accumulated energy without leaking.

I remember seeing a low tech bio-gas pressure system made from two water tanks, one large open topped one ( which could be replaced by a simple dam or large puddle) and half filled with water to form the seal. The second tank was upside down and held the gas. The gas was pumped into the tank under low pressure and the tank rose. Then they simply added some bricks and rocks to the top of the tank to increase the pressure and thus had a high pressure source of bio-gas to run the stove.

You could do something similar with this kind of rig. Just use the wind power to pump air at low pressure into vessels then change the pressure by adding something easy ( say water ) which can be pumped around to add weight to the vessel. Then you have a high pressure air source to generate power on demand. The stored potential energy can act as a large, simple battery.
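
As a rough sanity check on the weighted-tank idea, here is a minimal C++ sketch of the arithmetic. It assumes an idealised floating tank (no friction, no leaks, buoyancy of the tank wall ignored), and the function names and example figures are made up for illustration. The gauge pressure is just the weight sitting on the trapped air divided by the tank's cross-section, and the stored energy is that weight times the height it has been lifted.

```cpp
#include <cassert>

// Standard gravity in m/s^2.
constexpr double kGravity = 9.80665;

// Gauge pressure (Pa) of the air trapped under a floating, weighted tank:
// total weight resting on the air divided by the tank's cross-sectional area.
// Assumption: idealised tank, buoyancy of the submerged wall is ignored.
double gauge_pressure_pa(double tank_mass_kg, double ballast_mass_kg, double area_m2) {
    assert(area_m2 > 0.0);
    return (tank_mass_kg + ballast_mass_kg) * kGravity / area_m2;
}

// Energy (J) stored by lifting the tank and its ballast through rise_m metres.
// For a constant cross-section this equals gauge pressure times the volume of
// air pumped in (area * rise).
double stored_energy_j(double tank_mass_kg, double ballast_mass_kg, double rise_m) {
    assert(rise_m >= 0.0);
    return (tank_mass_kg + ballast_mass_kg) * kGravity * rise_m;
}
```

For a hypothetical 50 kg tank carrying 450 kg of water ballast, with a 1 m^2 cross-section and 1 m of travel, this works out to roughly 4.9 kPa and 4.9 kJ (about 1.4 Wh), which gives a feel for how much mass and travel a useful store would need.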

Thursday, February 10, 2011

Using QueryDefs and parameter queries in VBA

Today I will be bitching about using parameterized queries from VBA code in a robust way. I will also be presenting a roundup of the solutions as far as I know them.

The Problem
You want to modify a query on the fly using some dynamic option or parameter in VBA and then use that as the RecordSource for a form. Bloody obvious need if one is building a db driven app.

Possible Solutions
The first is to use a parameter query with a field somewhere on a form/global variable that the query can just pick up and run with. This sucks because it's inflexible. I need to essentially plant a global somewhere that always exists and always contains the right value... can anyone say spaghetti code? Hello 1980.

The second ( variation) is to try to plug the parameter in on-the-fly by massaging the Querydef object and then pulling the Recordset by hand and somehow pushing it into the form or whatever you are trying to use it as a source. This is just fuggly.

The third is to compose the SQL as a string and put the parameter into the string using string functions. Nice, easy and completely unmaintainable using all the nice GUI tools provided in Access. Yes I can do it, but it means I end up with my code littered with SQL fragments that I have to then hand maintain every time something changes in the db schema. What a PITA.

The fourth possible solution I have seen floating around is to build a whole dialog that can compose the SQL on the fly using some combo boxes etc. This is just a variation on solutions 1 and 3. With the worst parts of both. Now you have even more trash to maintain and find when you change anything. The whole point is to reduce complexity not add to it. Blahhhhh.

Fifth possible solution is to roll my own object that somehow can be instantiated, wrap a Querydef object and gracefully update it as required. Since there are no useful hooks to build this around, it gets back to having a global object somewhere and messing with it on demand... back to option 1 but with the overhead of a whole class of code to now maintain.

Main Bitch
What I would much prefer is a nice wrapper object around a QueryDef that could be used to create a temporary QueryDef with parameter slots. I could then plug in the params, test that the QueryDef is stable, and pass it to the form as its RecordSource. This way I can maintain the QueryDef using the Querybuilder GUI and all the other test tools, use the QueryDef gracefully from code and not litter my code with all sorts of fragile text strings and maintenance headaches. Oh and I want MS to supply this wrapper so I don't need to roll my own. Bastards.

Usually when I find myself banging my head against what seems to be an obvious problem, I find that it's just that I don't know the "Right way" to do things in that language or framework and that with a little bit of reading I can find it and get my head right (usually refactor a truck load of code) but get back on the true path and start making progress again. However in this case it seems like it's a common problem that doesn't really have any particular "right solution"; it's just lots of ugly hacks. (There are some truly UGLY solutions floating around the forums to this problem. But I don't feel like I have a more pleasing solution so it's much of a muchness. Still, they offend my eye in ways that indicate they are fragile, hard to maintain and error prone.)

Ok head is a bit clearer... bitching complete. I need to go read up on the QueryDef object I think.

Edit:

The best solution I have come up with (tip of the hat to a forum post I read but can't remember... it's probably common knowledge anyway) is to maintain the query as a normal QueryDef, which gives me the ability to use the GUI and compile the SQL for testing. The query contains a parameter, which is essentially a unique key phrase, that I then swap for my desired parameter info using the "Replace" function.

For instance

    Dim qdef As QueryDef

    'contains a param "[ContID]"
    Set qdef = CurrentDb.QueryDefs("ContinuityRelationshipFormSpecificQuery")

    'swap the key phrase for the actual value in a copy of the saved SQL
    Dim theSql As String
    theSql = Replace(qdef.SQL, "[ContID]", Str(relationshipID))

    'the saved QueryDef itself is left untouched
    Me.RecordSource = theSql

This is a fairly simple and maintainable system. It takes about four lines to do what I would like to do in fewer, but it's still only a line per parameter, so not too horrible a trade-off.
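
The mechanics of that substitution are easy to sketch outside Access too. The following is a minimal C++ illustration (the same language as the FizzBuzz code elsewhere on this page) of swapping a unique key phrase in stored SQL text for a numeric value. The names replace_all and bind_numeric are hypothetical, not anything Access or DAO provides. It only covers numeric parameters; text values would also need quoting and escaping, which this sketch does not attempt.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Replace every occurrence of `token` in `text` with `value`.
std::string replace_all(std::string text, const std::string& token, const std::string& value) {
    assert(!token.empty());
    std::size_t pos = 0;
    while ((pos = text.find(token, pos)) != std::string::npos) {
        text.replace(pos, token.size(), value);
        pos += value.size();  // continue after the inserted value so it is never rescanned
    }
    return text;
}

// Fill a numeric placeholder (for example "[ContID]") in stored SQL text.
// Hypothetical helper: numeric values only, no quoting or escaping is done.
std::string bind_numeric(const std::string& sql, const std::string& token, long id) {
    return replace_all(sql, token, std::to_string(id));
}
```

Because every occurrence is replaced, the key phrase has to be unique within the query text, which is the same constraint the Replace approach relies on.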

Thursday, February 3, 2011

Identity Ecosystem

http://arstechnica.com/security/news/2011/01/identity-ecosystem-inside-uncle-sams-trusted-identity-proposal.ars

This is an interesting article on an identity system that the US is proposing. It's interesting for a broad range of reasons. The first is that it's clearly intended for use online, but the case study quoted is clearly an offline scenario.

The second obvious issue is that it's US centric... again. More on this later.

The third is that the US government is trying to farm it out to private organizations.... again. Lol.

It's both fascinating and saddening to see just how unable to change a large system like a government is. Not little changes... things like letterhead or the name on a door, but the way it does business. Big stuff. The approach that can be taken to make something happen becomes more and more familiar. In the US case, it's the involvement of the private sector in everything. There is still this fundamental concept that somehow it will be better if it's not centrally managed. I am all for having a light hand on the tiller, but as a fairly ignorant spectator, I feel like there is a rush to privatize past a rational point in the back of the US policy makers' heads.

The issues with a private trust and identity system should be obvious to everyone. Once you have a profit motive on top of what should be a universal service.... you have some conflict. It will always move toward a pay-to-play system. So what happens with people who can't/won't/shouldn't pay?

Do prisoners keep their identities?
Do illegal immigrants buy an identity?
What about "homeless people"? How can you use an identity that depends on a cell phone if you don't have a charger?
What about people who want to go off the grid? Runaway kids? Runaway spouses?
How does someone who has had their identity stolen (with their cellphone, perhaps?) recover their identity?
What about someone who leaves the country for some time and comes back... does their identity still exist? Can they pick up where they left off?

Who owns the transaction data on the use of the identity?

Basically, once the identity is a separate commodity and not directly tied to the physical being of a person, it's just a product. Watch films like Gattaca or The Island and see what can happen when two people accidentally or intentionally share one identity. Watch any number of films about identity theft to see how that plays out (The Net, Single White Female... etc)

Digital Cloning. (TM)
To borrow, steal, share or jack someone's digital identity. Obviously it's already happening and has been for some time. Once there is an even more divorced identity system where you are only one of three parties in the system who can assert your identity... you no longer even have a voting majority.

I also guarantee they will end up with an id number as the primary key in the identity record rather than a name or biometric data or even your face. ( Now that would be a search key that would be hard to index. (Not a photo of the face, but a real, living breathing, blood pumping through it, age spots, facial hair and scars face.)) So potentially you will have to argue with two computers about whether you are or are not a particular number. How much fun will that be?

I'm becoming more interested in systems in various states of failure. Finding edge cases and writing rules to handle them is an endless game. The better designs treat everything as an edge case and have a way to gracefully degrade the performance of the system from an easy case toward the other end of the spectrum where the hard cases do not occur in a neat row. It's more a sparse array mixed with the splatter pattern of a plate of peas upended by a two year old.
How do you design for situations where people fall out of the system, re-enter the system, find duplicates in the system, join a cult, digitally suicide, change their identity, leave the country and vow never to return, leave the planet (potentially), change governments, sell their kids' identities, lose ownership of their identities through legal action, criminal behavior or stupidity?
The edge cases in a system like this are probably much more significant than the "normal" uses. How will people purchase embarrassing products when it's all tracked by a private corporation? (Could this be worse than the current customer tracking systems?)

What happens when, inevitably, one of the identity databases gets cracked and uploaded to WikiLeaks or BitTorrent? How do you fix a few million identities that might or might not have been compromised?

Trust is the key to it all. I remember reading somewhere about the rules of a trust system. (Probably in a book about 2m to my right....) the one that stuck with me is "Trust is not transitive" (See this article on Trust Relationships for a background if you're bored. http://technet.microsoft.com/en-us/library/cc977993.aspx)

Computer systems can establish and maintain trust relationships like this but the reality is that within a system like the one being proposed it would not be a neat triangle with three players. It would be hundreds and hundreds of computers all playing nicely together to form the chains of trust required to implement such a system, even for a single person. Think of all the communication stacks in between, all the man in the middle players, in both the primary "trust establishment" transaction and even worse, the secondary "trusted" transactions.

Just think of the skimming possibilities at any of these points... how can we pretend this is trust?
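
The "trust is not transitive" rule can be shown with a toy model. This is a minimal C++ sketch, not how any real identity system works: the TrustStore class and the party names in the note below are invented for illustration. Trust is recorded only as explicit, one-way relationships, and the check never follows a chain.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// A toy record of explicitly established, one-way trust relationships.
class TrustStore {
public:
    // Record that `truster` has directly established trust in `trustee`.
    void establish(const std::string& truster, const std::string& trustee) {
        direct_.emplace(truster, trustee);
    }

    // Non-transitive check: only an explicit, direct relationship counts.
    // A chain such as A -> B -> C is deliberately NOT treated as A -> C.
    bool trusts(const std::string& truster, const std::string& trustee) const {
        return direct_.count(std::make_pair(truster, trustee)) > 0;
    }

private:
    std::set<std::pair<std::string, std::string>> direct_;
};
```

If alice trusts bank and bank trusts identity_provider, trusts("alice", "identity_provider") still returns false until that relationship is established directly. Every extra hop in a real chain is a relationship someone has to set up, and a place where it can go wrong.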

Wednesday, February 2, 2011

Folklore in software research and other Academic Misdemeanors

http://morendil.github.com/folklore.html

This is a good article to read on inquiry and critical review of publications. The fact that it's about software engineering is almost irrelevant as it provides a nice summary of problems with citing non-existent references to support an argument. Should be a useful read for new research students.

http://www.johndcook.com/blog/2009/09/18/make-up-your-own-rules-of-probability/

This is a vaguely interesting article about bad stats in published papers. It alludes to both poor statistics and poor research habits. Nothing new or particularly special here. One of the interesting points is a comment about the Journal not correcting the mistake... Not sure what the comment is referring to but it got me thinking.

We bang on about academic misconduct and plagiarism but what about misconduct at the publishing end of the food chain? What happens when a Journal or other forum edits something or fails to correct a mistake? Or can they even correct it? Once the Journal has been distributed on dead trees it's very difficult to announce a correction or recall the print run. Obviously with online distribution the cost to change is much less. However the mechanism is still convoluted. Do you change the text with a note about the edit? Do you print a retraction in the next edition? Do you quietly update the only webpage and pretend it didn't happen? Who peer reviews the peer reviewers? Is there a "Media Watch" for Journals?

The actual substance of the article on creating your own rules of probability is, I feel, all too common among researchers who spend more time on their subject matter and not enough time on understanding the tools they use for their research. This is only getting worse with the explosion in software solutions and black box tools that researchers are using or trying to use to stitch together a solution to their problem without fully understanding many of the side effects embedded in the process. This is a problem of complexity that has no easy solution except to reduce the complexity back to a level that the researcher can manage. Often that's paper and pencil. I don't say that to be mean, it's just that computers tend to add so many layers of complexity and hide so much of it that it's impossible for any one person to consider them all when trying to produce a result that they are confident about. It's so easy to embed an assumption into part of a system and forget about it. It's even easier to embed assumptions that you are not even aware you hold. They are very hard to detect, let alone document or validate. But that's the game we play....

My new favorite catch phrase is "No one researches alone." This is similar to the "Standing on the shoulders of Giants"(Newton) philosophy that has been around for years although with a darker meaning. The problem is that now we are not just standing on the shoulders of giants, we are holding hands with all sorts of crappy software, mysterious script files, graphing packages, hacked stats tools, time poor tech staff, half trained monkeys, reviewers with agendas, publishers going broke and researchers with attention split across teaching, administration, grant hunting, students, politics, ... etc etc.
Nothing you do is your work alone... for good and ill. It's only sheer bloody-minded persistence that can get a piece of research published now. The number of obstacles and distractions that derail the publication of any science are growing exponentially. There is also less capacity in the system to catch errors, reproduce studies and pinpoint where the errors came from. Again, sheer bloody-minded OCD will get you there, but it's hard to find people who have the right mix of personal attributes and topical interest to take on an increasingly hard process when the results and rewards seem to be diminishing every year.

We can train a student in the tools and techniques of research but I'm coming more and more to the opinion that research training is simply a culling process. Unless you have the personality and motivation before you get here.... there's no amount of knowledge that will shape you into a researcher.

I think it's probably possible to predict researchers and hackers in primary school. These people are born and nurtured rather than being made. There is probably a whole rant on the education system from this topic but I will avoid it. I'm done here.

Later....

Swords as sculpture

http://www.youtube.com/user/michaelcthulhu#p/u/9/23jVP0iRUso

Watch, listen, read the comments... laugh. It's all good.

Tuesday, February 1, 2011

Software Craftsmanship Manifesto

http://dannorth.net/2011/01/11/programming-is-not-a-craft/

Interesting article on the never ending theme of what is the profession of programming.

Couple of thoughts:

A professional body is, no matter how named, a guild. Its purpose is to prevent undue competition and promote the interests of its members over the interests of non-members and prevent said non-members from becoming members. Look at all the professional bodies that exist today.

Issues of product risk, liability and statutory rights have not been dealt with. I have always thought that the biggest value of a professional body is to prune-the-tree. Remove the poorly performing members as they are identified via professional misconduct hearings. The problem is that most professionals are uneasy about making a peer unemployed, so the system ceases to function if it's run as a self-governing body.

As a standards organisation, it becomes a tick the box game. You can create some software, in budget, that meets all the standards and still does not work. The trap is trying to write a blank cheque clause in the standards about the product being fit for purpose or something similar as its then up to the client to write what the fitness should be.... and that's an endless series of half described ideas that evolves every time you have the discussion.

Trying to find a definition (or box) to put programming in, such as a craft, art, trade, profession etc is just too abstract and silly. No one cares. The end result is the bit that people start to care about. Where the work you do meets the money they either spend or make. At that point they start thinking about risk, liability, cost, resources, profit etc.

One of the biggest issues is measurement, and specifically quality measurement. "Good enough" is a measure in the same vein as "Fit-for-purpose". Having someone certified as a professional and competent to practice does not place any standard of measurement upon their performance.

Another issue is that of professional ethics. This is something else that never seems to get a mention. Ethics is essential in many occupations but seems to be a non-starter in programming. It just doesn't get mentioned.

Must rant on this topic more later....
