Archive for December 2008

My girlfriend can clone DNA

Cool title, huh?

Aya just started working at a lab that does genetic engineering. She’s been there barely a week, and already she’s cloned DNA. Here is a HUGE simplification of how it works (as much as my programmer’s brain managed to understand, with tons of errors I’m sure):

  1. Get a sample of DNA, this will be your template
  2. Prepare a solution with:
    1. Nuclease-free water – you want to make sure the water doesn’t contain any other DNA, or it might be cloned instead of the template
    2. Buffer solution – helps create optimal conditions for the enzyme (more on this later)
  3. Add two specific primers, each complementary to a known segment of the template. Each primer is usually about 24 bases long.
  4. Add lots of dNTPs – these are the DNA monomers, the single building blocks of DNA
  5. Add a heat-stable enzyme such as Taq polymerase – this is the engine behind the entire reaction. It will latch on to the primers and run along the templates, adding dNTPs and building the DNA molecule.
  6. Put it all in a PCR machine
  7. Repeat about 30 iterations:
    1. Heat to 94-99 °C to make the DNA strands disconnect
    2. Cool down to 50-65 °C, so the primers can attach to the DNA strands
    3. Heat to 75-80 °C, which is the optimum temperature for Taq polymerase
    4. The Taq polymerase finds the primers and then runs along the separated strands, collecting nucleotides from the solution and building the complementary strands

This process theoretically clones a segment of a single DNA molecule into about 2^30 (roughly a billion) identical molecules. Because the Taq polymerase sometimes falls off while building the strands, after the cloning she measures the length of the DNA molecules using agarose gel electrophoresis. On that, another time perhaps.
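As a back-of-the-envelope check on that number, here is the arithmetic under the idealized assumption that every cycle doubles every copy (real reactions fall short of this). The symbols N_0 (starting molecules) and n (number of cycles) are mine, not the lab's:

```latex
N_n = N_0 \cdot 2^{n}, \qquad N_{30} = 1 \cdot 2^{30} = 1{,}073{,}741{,}824 \approx 10^{9}
```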

What to do about nondeterministic tests?

We’ve all had these. Annoying tests that work perfectly 85% of the time, but fail mysteriously when the moon is half full and someone is using decaf. We (myself most certainly included) usually tend to ignore it, rerun the test and pray the problem will just go away.

This is simply not the way! It’s an acceptable solution for tests that work 99% of the time, but as soon as a test starts failing sporadically, you’d better do one of these:

1. Analyze the test, understand the source of the indeterminacy, and make it deterministic! This is not always practical because it’s usually the most time-consuming solution.
2. If the test is really fast, you can consider rerunning it automatically on failure / introducing more sleep() to eliminate fuzziness. Not a real solution, but it will get the test green more often.
3. If all else fails, just comment out the problematic part of the test or even the entire test. This is not what you really want, but it’s better than just leaving the test failing sporadically.
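Option 2 (rerun automatically on failure) could look roughly like the sketch below. The names FlakyTestHelper and RunWithRetries are made up for illustration and are not part of any test framework; it only hides the indeterminacy, it does not fix it.

```csharp
using System;

// Hypothetical helper, not part of any test framework.
public static class FlakyTestHelper
{
    // Runs testBody up to maxAttempts times and returns the number of
    // attempts used. If every attempt throws, the last exception propagates.
    public static int RunWithRetries(Action testBody, int maxAttempts)
    {
        if (testBody == null) throw new ArgumentNullException("testBody");
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                testBody();
                return attempt;
            }
            catch (Exception ex)
            {
                // Out of attempts: rethrow so the test still fails loudly.
                if (attempt >= maxAttempts) throw;
                Console.WriteLine("Attempt {0} failed: {1}; retrying", attempt, ex.Message);
            }
        }
    }
}
```

Logging every failed attempt keeps the flakiness visible, so a test that needs three tries every night still shows up as a candidate for option 1.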

If you do nothing, you’ll quickly experience CI degradation – your test suite will become meaningless. People will no longer care about it, not even enough to fix tests that are easy – because “The CI is broken anyway”. An all-green CI is a wonderful productivity tool, and it is reachable and worth the investment.

Not completely smooth upgrade to Visual Studio 2008 + TFS

Major Reversal – I take this post back. I’m experiencing some difficulties with VS2008, especially when debugging. The IDE gets stuck sometimes, and the debugger jumps into the code of a heavy ToString() of one of our objects and when I try to resume it dies. I definitely didn’t experience this with 2005. When is SP2 due?

I know that when I’m thinking of upgrading a heavy software product, I need positive reviews from friends to assure me the risk is low.

We’ve just moved to both Team Foundation Server and Visual Studio 2008 from 2005, and the move went rather well. First, Oren and Tomer upgraded the TFS version some evening. As far as I remember we had zero issues, and immediately felt the impact of faster (and some say less painful) merges.

Then we had everyone install Visual Studio 2008, and Oren migrated all the solutions and projects (over all our branches) to 2008 (the big issue was that the file format of 2008 is not backward compatible).

Some of our team members have Resharper 4 which supports C# 3 syntax, while others still have Resharper 3. This was a potential danger, so we decided to disallow C# 3 features for now. I wrote a small utility to disable C# 3 syntax in Resharper 4 (run once).

It’s hard to enumerate the benefits of VS2008. What I see immediately is that it’s a lot more stable, and has a slightly friendlier UI (little things like being able to open folders directly from Source Control Explorer). When we do get to C# 3 and .NET 3.5, we’ll be able to write nicer code and use some additions to the BCL (like a new HashSet class to replace our existing C5.HashSet – this one actually implements System.Collections.Generic.ICollection<T>!).

To summarize – move to TFS/VS 2008 when you get the chance (if you haven’t done so already).

P.S. – A huge benefit of VS2005 is that it usually knows to compile only projects your main assembly depends on. This is a sweet time saver.

The shorter the better

I wrote in a previous post about a few refactorings meant to eliminate code duplication. I was reminded recently that this principle has a name – DRY, which stands for Don’t Repeat Yourself, and should be applied everywhere.

Eliminating duplication, while a noble task, is not the only refactoring one should practice and apply. Breaking up large pieces of code into progressively smaller pieces is also important. It makes your code more readable to yourself and to other developers, and also makes merges much easier to accomplish – nothing is more terrible than looking at a huge merge of a huge method or class and having to guess what combination of the versions is the correct one.

Here are a couple of important tips/techniques to make your code more manageable:

Tip I – Group expression into methods

It doesn’t matter if it’s code you wrote or stumbled on. Whenever you get the chance, select a few statements in a long method, use Resharper’s Extract Method refactoring, invent some name to describe what this group of statements does, and voila – you’ve shortened the original method. It’s easier to understand and maintain, and now the new method can be called and tested on its own. This technique can turn huge 200-line monster methods into responsible, comprehensible 15-liners.

Ideally you’d want to use this refactoring on a bunch of statements that actually have a logical cohesive meaning together. Even if that’s not the case, usually you’ll be better off with the extracted method. One notable exception – when the set of statements you’d refactor would lead to a method with multiple out parameters. Then you’d still have to declare these variables in the calling code, and your gain greatly diminishes.
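To make Tip I concrete, here is a small before/after sketch. InvoicePrinter and all its members are invented for this example; the point is only that the extracted Sum and AddTax get names and can be called and tested on their own.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example class, invented for illustration.
public static class InvoicePrinter
{
    // Before: one method validates, sums, adds tax and formats.
    public static string DescribeBefore(List<decimal> prices, decimal taxRate)
    {
        if (prices == null) throw new ArgumentNullException("prices");
        decimal subtotal = 0;
        foreach (decimal price in prices)
        {
            subtotal += price;
        }
        decimal total = subtotal + subtotal * taxRate;
        return string.Format("{0} items, total {1:0.00}", prices.Count, total);
    }

    // After: the summing and tax statements were extracted and named.
    public static string DescribeAfter(List<decimal> prices, decimal taxRate)
    {
        if (prices == null) throw new ArgumentNullException("prices");
        decimal total = AddTax(Sum(prices), taxRate);
        return string.Format("{0} items, total {1:0.00}", prices.Count, total);
    }

    public static decimal Sum(List<decimal> prices)
    {
        decimal subtotal = 0;
        foreach (decimal price in prices)
        {
            subtotal += price;
        }
        return subtotal;
    }

    public static decimal AddTax(decimal amount, decimal taxRate)
    {
        return amount + amount * taxRate;
    }
}
```

Each extracted method here returns a single value, so no out parameters are needed; that is the case where the refactoring pays off most.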

Tip II – Group methods into classes

You’ve all seen the huge classes with dozens of methods. Hell, Tip I above is all about creating even more methods! Well, once you feel a class has grown too much, you should try to spot groups of methods that have related functionality. Perhaps most methods in the group are used only by a single method, but nowhere else. In this case, if you move all these methods away to a new class and make them private members of that class, they will certainly not be missed in the original class – it only uses the one method anyway (be sure to keep it public, of course). Like in the previous tip, you want to try and find groups of methods that have a common logical function in order to use this refactoring. However, even if you can’t find a common function to these methods, they will still be better off in another class.

While Resharper does have a Move Method refactoring, in some cases it’s better to just cut and paste, especially when you move a bunch of methods together, with the data members they use and all.
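A minimal sketch of what Tip II ends up looking like, with made-up class names (NameNormalizer, UserRegistry): the helper methods moved out together and stayed private, and the original class only sees the one public method.

```csharp
using System;

// The extracted class (hypothetical): one public entry point,
// the helpers that only it uses stay private.
public class NameNormalizer
{
    public string Normalize(string rawName)
    {
        if (rawName == null) throw new ArgumentNullException("rawName");
        return CapitalizeFirst(CollapseSpaces(rawName.Trim()));
    }

    private static string CollapseSpaces(string text)
    {
        // Repeat until no run of two spaces is left.
        while (text.Contains("  "))
        {
            text = text.Replace("  ", " ");
        }
        return text;
    }

    private static string CapitalizeFirst(string text)
    {
        if (text.Length == 0) return text;
        return char.ToUpperInvariant(text[0]) + text.Substring(1);
    }
}

// The original class (hypothetical) shrinks: it calls the one public
// method and never sees the helpers.
public class UserRegistry
{
    private readonly NameNormalizer normalizer = new NameNormalizer();

    public string Register(string rawName)
    {
        string name = normalizer.Normalize(rawName);
        return "Registered " + name;
    }
}
```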

Tip III – Reduce nesting level

This will not shrink your methods by a lot, but it will make them more manageable. Nesting level implies context, which you have to maintain in your head when you are processing code. When is this code called? Only if 5 different if statements happen to succeed. It’s painful on the eye.

A useful refactoring for this is Resharper’s “Invert If”. It takes a convoluted piece of code such as

public void ProcessUserInput(string name)
{
   if (name != null)
   {
      int id = FindIdByName(name);
      if (id != 0)
      {
         if (TryToStoreName(name, id))
         {
            Console.WriteLine("Stored {0}, {1}", name, id);
         }
      }
   }
}

And beautifies it into

public void ProcessUserInput(string name)
{
   if (name == null) return;
 
   int id = FindIdByName(name);
   if (id == 0) return;
 
   if (TryToStoreName(name, id))
   {
      Console.WriteLine("Stored {0}, {1}", name, id);
   }
}

Much easier to follow (the difference is highly evident in methods with 5 nesting levels or more – these acutely need this refactoring).

Disable Resharper C# 3 syntax

I love Resharper, and I love C# 3.0, but sometimes they can’t play together.
At Delver we still haven’t purchased enough R# 4 licenses, so until we do, we won’t use C# 3 features such as lambdas. This makes working with R# 4 annoying, because every file you open is filled with suggestions and warnings about which you just can’t do anything, because they’re all in C# 3. Another dangerous thing is this – C# 3 has gotten better at deducing generic arguments, so R# 4 will tell you to remove the arguments when not needed, thus bringing compilation errors to people with earlier versions.

The solution: disabling C#3 for Resharper. This can be done for every project – select the project and hit F4, and change Resharper’s “Language Level” setting.

Here is a small piece of code that does this for all your projects, over all your branches (we have ~70 projects X ~5 active branches).

Warning: This code assumes for simplicity that the per-project Resharper options file doesn’t contain anything interesting (it’s overwritten from scratch). In the current version, this appears to be true. Also, the solution must be closed before running the exe – otherwise the “.resharper” files will be deleted when you close it.

using System;
using System.IO;
 
namespace Unsharper
{
    class Program
    {
        static void Main(string[] args)
        {
            if (args.Length != 1)
            {
                Usage();
                return;
            }
            string arg = args[0];
            int ver;
            if (!int.TryParse(arg, out ver))
            {
                Usage();
                return;
            }
            if (ver != 2 && ver != 3)
            {
                Usage();
                return;
            }
 
            string config = "<Configuration>\n<CSharpLanguageLevel>CSharp" +
                ver + "0</CSharpLanguageLevel>\n</Configuration>";
 
            Console.WriteLine("Finding resharper setting files");
            foreach (string csproj in Directory.GetFiles(".", "*.csproj", SearchOption.AllDirectories))
            {
                string resharperFile = csproj + ".resharper";
                Console.WriteLine("Writing " + resharperFile);
                File.WriteAllText(resharperFile, config);
            }
        }
 
        private static void Usage()
        {
            Console.WriteLine("Usage: unsharper {2/3} - make Resharper use C# 2 or C# 3 syntax");
            Console.WriteLine("This runs over all projects in your current folder");
        }
    }
}

Delving Blogs

Check out my post in Delver’s blog. The feature has only been in production for a few days and already I see cool search results from blogs of friends. Let me know if you search and find something useful with it.

Enough with ExpertExchange already!

I thought Google SearchWiki would help me stop seeing this crappy website in my search results, but for some weeks now I haven’t seen SearchWiki anymore. Does anyone know what happened to it?

Hello Worldpress 2.7

I was waiting for such a post to upgrade. Took 6 minutes, including this post, using the Wordpress Auto-Upgrade plugin. Still need to explore the new version though. And I still need to tweak the Simple Tags plugin.

Update – Things I like:

  1. I finally found where to manage spam comments. I accidentally marked a bunch of legitimate comments as spam the other week, and for the life of me, just couldn’t find how to undo it. In 2.7, you can access all comments from the dashboard.
  2. Finally, built-in Ajax

Pimp my blog

I just installed the Pimp My Wordpress plugin, go here if you want to see an updated list of installed plugins.

Multiplayer Magic meta-strategy

Last night I played multiplayer Magic with my usual group, and for the first time we had 5 players instead of the usual 4. Usually we play 2 headed giant (2 players act as one giant and duel against the other 2 players), or occasional chaos magic – basically Last Man Standing, where everyone fights against everyone.

Two headed giant is rather simple in its meta strategy, and Chaos multiplayer is usually about who can hide his intentions long enough, not be noticed, and then wipe out everyone else.

Last night was different. We played an unnamed variant (didn’t find it here) that has the following rules:

  • A player can only attack the players opposite of him (his opponents)
  • A player can target whatever he wants (opponents’ creatures, allies’ creatures, …)
  • A player wins if his two opponents are dead

We’ll use the back of a Magic card as a convenient representation for the players (suppose for simplicity that there are 5 mono-colored decks that sit at the appropriate positions).
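For the programmers in the audience, the rules above can be modeled in a few lines. This is only a sketch: FivePlayerTable and its methods are invented names, the seating order (White, Blue, Black, Red, Green) is my assumption based on the color wheel on the card back, and treating a dead player as unable to win is also an assumption the rules above don't spell out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of the five-player variant.
public static class FivePlayerTable
{
    // Assumed seating: the color order on the back of a Magic card.
    public static readonly string[] Seats = { "White", "Blue", "Black", "Red", "Green" };

    public static int SeatOf(string player)
    {
        int seat = Array.IndexOf(Seats, player);
        if (seat < 0) throw new ArgumentException("Unknown player: " + player);
        return seat;
    }

    // Allies are the two neighbors around the table.
    public static string[] AlliesOf(string player)
    {
        int seat = SeatOf(player);
        return new[] { Seats[(seat + 4) % 5], Seats[(seat + 1) % 5] };
    }

    // Opponents are the two players sitting opposite.
    public static string[] OpponentsOf(string player)
    {
        int seat = SeatOf(player);
        return new[] { Seats[(seat + 2) % 5], Seats[(seat + 3) % 5] };
    }

    // Assumption: only a living player can win, once both opponents are dead.
    public static bool HasWon(string player, ICollection<string> dead)
    {
        if (dead.Contains(player)) return false;
        foreach (string opponent in OpponentsOf(player))
        {
            if (!dead.Contains(opponent)) return false;
        }
        return true;
    }
}
```

With this seating, OpponentsOf("Green") gives Blue and Black, so Green wins as soon as both of them are dead, whoever did the killing.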

It appears this format has much meta-strategy: At first, you have your two “opponents” and two “allies”, and the game seems orderly. You attack your opponents and encourage your allies to attack your opponents instead of your other ally. Around mid-game, you start seeing one or two players with a bad board position and/or low life score. This immediately turns them into the underdogs.

The allies of these players do not want them to go out. If the weaker player dies, his allies are more vulnerable. So, his allies start defending the weak player more vigorously, throwing precious spells at their own allies in order to save their other, weaker ally.

When the first player dies or is very near death, things change again. Let’s say the Black player dies. If Blue dies, then Green wins. Therefore, it’s in everybody’s interest except Green to prevent Blue from dying. The same goes for Red – if he dies, White automatically wins. So Blue, White’s ally, will do everything possible to prevent Red from dying – even though Red is Blue’s traditional opponent.

What should Blue do? The answer is simple – quickly take out Green. Once Green is out, the game degenerates into 2 on 1 – Blue & White both take on Red and usually win together (if their one remaining opponent dies, both win at the same time). Symmetrically, Red must take out White as soon as possible. What might happen is that with Red protecting Blue and trying to kill White, it might be in Red & Green’s best interests to unite against White – the sooner White is out of the equation, the sooner both can win (in a joint victory). If Green doesn’t want to share victory and refuses to turn on White but tries to kill Blue directly, most likely all three other players will kill him off first.

We played two very interesting games last night, and I died first in both, so don’t take my post-analysis here too seriously. What I think is certain is that, as in chaos multiplayer, you should beware of appearing as the strongest player on the board. The strongest player will be taken care of before everyone else, and will eventually die (in my case, I had Avatar of Woe out, which would have dominated the game if I hadn’t been killed straight after I played her – great card btw for multiplayer magic).