How I debug messy code.

I’ve been asked a couple of times how I debug code, and I think the best way to learn is to spend a couple of years fixing someone else’s spaghetti code. In fact, I think colleges should do more projects that involve working with a very tangled codebase, so students can better understand the importance of writing clean, decoupled code. Short of that, this is what you can do.

Limit how far a mistake can compound without being caught. In a perfect world the code you are given will be nice tidy units: you give one an input and get an output, and if that output is incorrect you know the problem is somewhere in that unit. Spaghetti code is the opposite. Everything is connected in complicated, unplanned and messy ways. Usually you’d love to refactor the whole thing, but you have a release the next day and it just needs to be fixed. If you don’t have nice clean units you might have to trace the problem backwards from the final result, like untangling Christmas lights. Some of the code I’ve been given renamed variables several times along the way, even changing datatypes for no reason, like string TICKETNUMBER, int ticketNumber, char[] TICKET_NUMBER. In this situation you have to start with the variable you are sure of, which is the one shown on the screen at the end of the process. You can follow it back using breakpoints or debug text until you find the point where something went wrong.

Initially, don’t jump to conclusions. Just try to understand what is going on. Some people want to impose a conclusion on what is going on and act based on that, but when it comes to a messy codebase remember the NASA saying: “no situation is so bad that you can’t make it worse”. If you do choose to make a change, be very conservative.

If the code is very convoluted and it’s not clear what is happening, I like to add a lot of temporary debug logging, usually with an id for each line and a bit of detail, like:

100 TICKETNUMBER = unset ticketNumber = 100231
200 TICKETNUMBER = 100231 ticketNumber = 100231
300 TICKETNUMBER = 100231 ticketNumber = 100232

I will usually add these every ten lines or so in a section where I think there might be a problem, more if necessary. In the output above, the ticket number is not being updated properly because we have two variables for it: one is being updated and the other isn’t. The real solution is to get rid of one of the variables to avoid confusion.

If I want to narrow it down further I will add more debug code between 200 and 300, like DebugOut("250 TICKETNUMBER = " + TICKETNUMBER + " ticketNumber = " + ticketNumber);
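To make the numbered log lines less tedious to type, a small helper can do the formatting. This is only a sketch in Java: DebugOut is my own function and its real body isn't shown here, so the DebugTrace class and its checkpoint method below are hypothetical stand-ins, not the original code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, assumed for illustration only.
final class DebugTrace {
    // Every line is kept so the whole trace can be inspected after the run.
    static final List<String> lines = new ArrayList<>();

    // Builds "<id> name = value name = value", prints it to stderr and
    // returns it. Arguments after the id are name/value pairs; a null
    // value is printed as "unset".
    static String checkpoint(int id, Object... namesAndValues) {
        StringBuilder sb = new StringBuilder().append(id);
        for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
            Object value = namesAndValues[i + 1];
            sb.append(' ').append(namesAndValues[i]).append(" = ")
              .append(value == null ? "unset" : value);
        }
        String line = sb.toString();
        lines.add(line);
        System.err.println(line);
        return line;
    }
}
```

A call such as DebugTrace.checkpoint(250, "TICKETNUMBER", TICKETNUMBER, "ticketNumber", ticketNumber) then gives a line in the same shape as the output above. Because it is temporary, it should be easy to search for and delete once the bug is found.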

I will often do this divide and conquer style: I put a log output at the top of a function and one at the bottom, and if there is a problem in the function I will add one in the middle, then one a quarter of the way down, and so on until I find the line with the problem.
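The same halving idea can be written down as a search. The sketch below is in Java and is not how I actually do it by hand. It assumes the function can be modelled as a list of steps that can be replayed from a fresh start, that the state is an immutable value, and that once the value goes bad it stays bad. The names Bisect and firstBadStep are made up for the example.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical illustration of "log in the middle, then halve again".
final class Bisect {
    // Returns the index of the first step after which isGood fails, or -1
    // if the state is still good after every step. Only valid if a bad
    // state never becomes good again further down.
    static <S> int firstBadStep(S initial, List<UnaryOperator<S>> steps, Predicate<S> isGood) {
        int lo = 0;             // every step before lo is known to be fine
        int hi = steps.size();  // the first bad step, if any, is at or before hi
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            // Replay steps 0..mid from scratch, like rerunning the program
            // with a log line placed just after step mid.
            S state = initial;
            for (int i = 0; i <= mid; i++) {
                state = steps.get(i).apply(state);
            }
            if (isGood.test(state)) {
                lo = mid + 1;   // still fine here, so the problem is later
            } else {
                hi = mid;       // already bad, so the problem is here or earlier
            }
        }
        return lo == steps.size() ? -1 : lo;
    }
}
```

In practice the "steps" are just lines in a messy function and the "replay" is you running the program again, but the bookkeeping is the same: each run halves the range that can still contain the bug.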

Sometimes, though, the problem is so weird or so many variables are connected that it’s not that easy. Then I like to employ the scientific method. Narrow it down as much as you can. Here it’s not so much about finding where the problem is, but where the problem is not. Is the data already bad before it gets to the view? Then rule out the view. Eliminate every class and every method that isn’t the problem. You might still have a big snake’s nest of code to go through, but try to eliminate anything you can: 2000 lines of code is still better than 4000. Then again, don’t jump to conclusions. If you are not 100% sure, don’t eliminate it.

Try to understand it as much as you can. Think like a physicist: you can’t know everything, but learn what you can so you can make educated hypotheses. Then, when you have, step back from the code, even leave the office. On a notepad, write down every hypothesis you can think of that might cause the problem. Try to come up with a dozen or more, and really try to exhaust all your ideas. Then scribble down an experiment to test each one: test a value at this point, try this input, etc. Then try them out. Either you will solve the problem or you will have a more detailed understanding to make more hypotheses. You’ll hit a wall doing this a lot, and I find taking a walk really helps.

Of course the real solution for all of this is to write good code in the first place, and a measure of how good the code is is how well these things are contained. If the code you are working with is grabbing variables from all over the place without properly passing them, so much that it’s hard to keep track of, that’s a bad thing.

Code should be contained in a way where a unit does its job taking only what it needs and destroying variables when it’s finished with them. To take our real ticket number example, it made sense for there to be a ticketNumber accessible and stored for the whole run of the application, but a single int value would have made more sense than the 6 that were used. My guess is that the developer wanted a string type to do something like this “Current ticket number is 100132” but stored that string globally and used it inconsistently throughout the program. It would have made more sense for the int to be stored for the run and have the string passed temporarily. So instead of this:
//At the top
int TicketNumber;
string TICKETNUMBER; //the display text, kept around globally

//Some random spot
TICKETNUMBER = "Ticket number is " + TicketNumber;
//Updating screen
TicketLabel.text = TICKETNUMBER;

Something like this is cleaner:

//Builds the display text only when it is needed
public string getTicketNoString()
{
    return "Ticket number is " + TicketNumber;
}

//Updating screen
TicketLabel.text = getTicketNoString();

The difference here is that the string only exists for a moment, long enough to insert it into the label, and then it is destroyed. I eventually ended up moving the int TicketNumber into a container and using a getter and setter, but the point is to keep things crystal clear and to try to prevent ad hoc access to variables. If you’re just grabbing a value from another class, maybe it’s a bad idea. Maybe it should be passed in through the method header as a parameter, or there is a clearer way to do it. Will the next programmer know what you were trying to do?
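For what it’s worth, the container version might look roughly like the sketch below. It is written in Java, and the class name TicketStore is invented for this example. The real container had more in it, so treat this as the shape of the idea and not the actual code.

```java
// Hypothetical container: the one place where the ticket number lives.
final class TicketStore {
    // The single stored value for the whole run of the application.
    private int ticketNumber;

    TicketStore(int initialTicketNumber) {
        this.ticketNumber = initialTicketNumber;
    }

    int getTicketNumber() {
        return ticketNumber;
    }

    // Every update goes through here, so there is exactly one spot to put
    // a breakpoint or a debug line when the number looks wrong.
    void setTicketNumber(int ticketNumber) {
        this.ticketNumber = ticketNumber;
    }

    // The display text is built on demand and never stored, so it cannot
    // drift out of sync with the int.
    String getTicketNoString() {
        return "Ticket number is " + ticketNumber;
    }
}
```

With this in place the label update becomes a call to getTicketNoString() on the store, and there is no second copy of the ticket number left to go stale.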