Ask HN: How do you get better at debugging/finding a solution?
80 points by recvonline on Sept 6, 2022 | 81 comments
Software dev with more than 10 years of experience. I have always gotten praise from colleagues and the companies I've worked for. But there are always moments where, without help from a colleague, I wouldn't have found a solution to a problem.

When joining a new company, two months in, I faced a problem where the code wouldn't compile. I dug deep into the codebase but, for the life of me, couldn't find a solution.

I wonder: is that normal? Or how do I get to the point where I can say, "I don't care if I am alone, I can figure this out and solve it"?

Did anyone here evolve from this state of reliance on others and turn themselves into an "I can do it myself in a good amount of time" person?




* Start logging shit, even things you know to be true / can't possibly be the problem. Somewhere your model of the code is broken, and you need to test to find the breakage (either the model needs correction, or the program does).

* Split the problem space. Payment messages are not getting to the Slack channel. Well, is it a bug in receiving payments, or a bug in sending Slack messages? Check if the payments are hitting the database. If no, you know it's on the input side. If yes, you know it's on the output side. (A rough sketch of that midpoint check follows after this list.)

* Explain the problem to someone else; if you're alone, write it up like a question you're going to post on a forum, with as much detail as possible. This must engage a different part of the brain, because often I'll figure out the issue while writing it up, or I'll reveal some clue while gathering example output to add to the post.
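
Here's what that midpoint check can look like in Python, just as a sketch - the payments table, the id column, and the app.db path are all made up for illustration:

    import sqlite3

    def payment_reached_db(payment_id, db_path="app.db"):
        """Midpoint probe: did the payment make it into storage?

        True  -> receiving/recording worked; suspect the Slack-sending side.
        False -> the bug is upstream, on the payment-receiving side.
        """
        conn = sqlite3.connect(db_path)
        try:
            row = conn.execute(
                "SELECT 1 FROM payments WHERE id = ?", (payment_id,)
            ).fetchone()
            return row is not None
        finally:
            conn.close()

    print("payment stored:", payment_reached_db("pay_123"))

Whichever answer you get, you've cut the search space in half before reading a single line of business logic.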

Longer-term, teach a programming class! You'll get very good at debugging issues because you'll encounter a lot of other people's bugs, and you'll get a feel for what causes particular failures.

Also, don't feel any shame in asking others for help. Even senior devs get blocked, and they ask other seniors, or juniors, or if juniors are not available a rubber duck will suffice.


I forgot one - if you start thinking there's a bug in a library, reduce your code to the smallest example that reproduces the problem. If you get it down to 5-6 lines you're going to either (1) remove something along the way that fixes the problem and gives you a clue or (2) you will have something you can post in message boards when you ask for help.

You'll also have a shortlist of functions that you can search for on Stack Overflow, and compare your example to theirs.


> Somewhere your model of the code is broken, and you need to test to find the breakage (either the model needs correction, or the program does).

Most of the harder bugs I run into fall into this category. Great list of techniques to use. When I'm really stumped, I apply a more formal scientific method to the problem-space-splitting approach: formulate a hypothesis that, if answered, will reduce the search space, either by ruling things in or ruling them out. Write it down - this is the important part - test it, repeat.

It's slow but usually helps me keep forward momentum. Worst case, it at least creates a list of things that need better logging to figure out what's going on.


+1 to this. Make ZERO assumptions about the problem, and observe everything you reasonably can. In the increasingly (?) service-oriented architecture world, that includes fundamentals on your platform (e.g. I recall the first time I root-caused an issue that turned out to be an AWS bug) and taking the time to monitor external dependency changes (e.g. a service change, a code push, someone reverting stuff, etc.).

I also can't agree more with no shame in asking others. I would go as far as to say if you aren't asking others, you are going down a dangerous path of assuming you can solve the world's problems by yourself.


Something I read forever ago and can't find a link for:

Keep a debugging journal. Take notes on every step you take and its results. It's easy to go in circles because you forget what you tried or some detail of its outcome. Seeing a summary of what you already know helps you rule out possibilities and inspires new ones.

I often forget to do this or feel like "I can handle this bug without a crutch". Yet every time I actually journal the process it's helpful.


I find this useful also, especially with new technology where I may not already have a good mental model of how it works.

I generally use Notepad++ and it's often just a set of notes about the value of important variables in certain files @ a particular line of code or at a particular time. I find that seeing that big picture at a glance can often give me immediate insight into what's happening. Something you can't always see when you're focused on a few lines of code in a method.


A journal also makes it much faster to jump back into a productive state after that inevitable interruption. (That interruption could be a meeting, or an emergency that forces you to drop everything and work on a higher-priority issue for several days.)

You can also refer back to your journal to remind yourself how you fixed that similar problem months ago.

Of course, a journal doesn't have to be limited to debugging. It's useful for development too.


There has to be a way to make this more streamlined and run experiments automatically, right?


- Create a call stack map and write it down - This will give you an overview of which classes, functions, methods, paths are being used.

- People look down on console/logging, but it's useful specifically when there are race conditions or too many variables. It's way better to have several outputs and logs rather than a "standstill" picture you can only look at while using the debugger.

- "Learn to debug" - This advice is thrown too vaguely around, but I'll tell you that the 2 essential pieces that helped me debug are 1- using watches to keep an eye on variables, props that are relevant to the problem 2- Use conditional breakpoints - About the 90% of the people I've paired programmed with don't know about it or even if they know it exists they don't use it and when I put in place some conditional break points they look with awe at how it can make a change.

- A somewhat counterintuitive or controversial piece of advice: read the code involved, look for inconsistencies, dumb scenarios, flags, and awfully named variables, and clean them up! Sometimes my attention span and memory are entangled with garbage code that makes the problem harder to reason about. By throwing away the pieces I don't like and improving on them, I also make the code easier to maintain and reason about in the future.
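
Since conditional breakpoints came up above, here is the programmatic Python equivalent as a small sketch - apply_discount, order, and billing.py:42 are invented names:

    def apply_discount(order):
        # Conditional breakpoint, expressed in code: only drop into the
        # debugger when the suspicious condition actually holds.
        if order.total < 0:          # the condition you would otherwise "watch"
            breakpoint()             # Python 3.7+: opens pdb at exactly this state
        return order.total * 0.9

    # The same thing from pdb (or an IDE) without touching the code:
    #   (Pdb) break billing.py:42, order.total < 0

The advantage of the pdb/IDE form is that you can change the condition without editing and redeploying the code.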


> Create a call stack map and write it down

(personal anecdote) I strongly agree with this. Writing it down is very important. Whenever I trace through a call stack, I tend to construct a visualization of it in my mind (in the form of a DAG + each root starting with the caller, if you will). But I could never hold that image especially when it gets too complex. Forcing myself to write it down helped me a lot - it doesn't matter if it's messy writings or drawings, somehow the _act of writing_ just helped me so much in reconciling concepts and connecting the dots.


> - People look down on console/logging, but it's useful specifically when there are race conditions or too many variables. It's way better to have several outputs and logs rather than a "standstill" picture you can only look at while using the debugger.

It also works across every programming language and development stack, and it can be set up across network and application boundaries. Debugging skills built on logging are extremely portable.


I felt like this before. Now it feels more like a matter of time. I got significantly better by observing talented colleagues and picking up their strategies, such as:

Have a hypothesis and try to prove otherwise.

Start cordoning off parts of the codebase where the problem ISN'T.

Leave printf breadcrumbs to trace execution (sketched at the end of this comment).

I find my less capable colleagues make the mistake of false assumptions, like assuming some subroutine is executing, or assuming what state an object is in, without proving it to themselves. On the larger problems, that keeps your investigation from proceeding except by luck.
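
A tiny sketch of the breadcrumb idea - load and transform are hypothetical helpers standing in for whatever your real code does:

    def process(order_id):
        print(f"process: start, id={order_id}")            # breadcrumb 1
        data = load(order_id)                              # hypothetical helper
        print(f"process: loaded {len(data)} rows")         # breadcrumb 2
        result = transform(data)                           # hypothetical helper
        print(f"process: transformed -> {result!r}")       # breadcrumb 3
        return result

The last breadcrumb that actually prints tells you which step to look at next.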


This is what my Mathematics Lecturer told us years ago when we had trouble understanding why some approaches worked for some problems, but not others.

"You need a few more years of study before you can fully understand. But! You have a 600 page textbook. Do every exercise in that book. When you can do that without assistance, get another 600 page textbook and repeat."

He was trying to get us to build an intuition for a certain class of problem solving, while at the same time saying "Shut up and calculate".

I find that problem-solving in the programming space is the same. Just keep doing things. You'll develop an intuition for it eventually.


It's probably not so much your colleague that helps but the act of just talking through a problem; inanimate objects work too: https://en.wikipedia.org/wiki/Rubber_duck_debugging

What also helps is just shutting the computer off and going for a walk, generally the moment you step away you'll figure it out!


If rubber ducking fails, try to write a GitHub issue that starts with "it's probably my fault, but".

By the time you're done describing the problem and its symptoms, you'll have a solution.

I can also recommend going for a walk. I usually return from long motorcycle trips with the best ideas. Tea on the balcony is also effective.


In college, when we got stumped on a CS assignment, we'd play Mario Kart, and while focusing on the game, we'd complain about the assignment. Very often, some quick, lazy criticism at our attempts would render an astute observation we wouldn't have made if not distracted by the game.


> What also helps is just shutting the computer off and going for a walk, generally the moment you step away you'll figure it out!

Underrated. The diffuse mode of the brain.


Working for a bigger company I’m hesitant to go for a walk as that would mean I need to stay longer in the afternoon/evening. Even when going for a walk solves the problem, there is still little acceptance to count it as working hours. I think this mentality should shift somehow as it would benefit the company and the employee.


There's a couple of solutions to this:

1) Find an employer who doesn't pay by the hour (but e.g. per day or task/goal)

2) Find an employer who isn't strict on the hours.

3) Self employment.

4) Find or convince an employer who accepts this.

None are ideal, except for #4, but that one is a long game.

And I am thoroughly convinced it works. The same goes for working out and mindfulness. Such behavior should be rewarded. My employer supports working out in the sense that I get 1 hour of paid leave per week for it.


I have often found myself able to keep digging forward well after many of my peers (not all) are out of ideas. I saw a shift in my own ability to debug problems after I realized that as the "hardness" of a bug increases, you just need to increase how systematic and rigorous your method is.

The very first thing should probably be to carefully check which commit introduced the error, and then carefully study the code changes in that one for clues.

After that, and possibly based on that, most people will probably start their debugging process by running a debugger through some suspicious parts of the code, based on varying degrees of well-informed guesses.

For problems that escape those first tries though, what you often need to do is to start a systematic process of ruling out possible sources for the bug. In many ways this resembles scientific studies where you try to control as many variables as possible, and also include control samples with known states, trying to zoom in on only the particular variable you are studying without noise from other things.

That can mean feeding the system or code under study with carefully set up data for which you know what the effect should be, and then carefully trying to change each part of it and observing the outcomes. Things like that.
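
One way to mechanize that "known input, known expected effect" idea is a table-driven test. This is only a sketch: myapp.pipeline.run_pipeline and the sample rows are invented placeholders, not a real API.

    import pytest

    from myapp.pipeline import run_pipeline   # hypothetical system under test

    # Each case changes exactly one thing relative to the control sample,
    # so a failing case points at the variable that matters.
    CASES = [
        # (description,              payload,                            expected)
        ("control: known-good row",  {"amount": 10, "currency": "USD"},  "ok"),
        ("zero amount",              {"amount": 0,  "currency": "USD"},  "rejected"),
        ("unknown currency",         {"amount": 10, "currency": "XXX"},  "rejected"),
    ]

    @pytest.mark.parametrize("desc,payload,expected", CASES)
    def test_pipeline_behaviour(desc, payload, expected):
        assert run_pipeline(payload) == expected, desc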

In my experience, this type of effort will most often eventually lead to the solution. The main challenge, I think, is realizing how deep you might need to go with the systematization and automation in order to really rule out possible sources and start zooming in on the general area where the bug lives. You might need to take some really drastic measures, and this is where most people who fail don't go far enough. Here you might need to really get away from the screen to get your thoughts flowing more freely ... not in an undirected way, but rather trying to answer the question "How can I do this in an even more systematic way ... to rule out even more possible sources, or identify unexpected behavior?"

Not super easy to put into words, but this is in my experience the way to go.

Finally, one caveat: there are certain things you should probably check before even going down the systematic path - things that can totally screw things up so that, whatever you do, you never get any systematic pattern of behavior or behavior change. These are often related to caches of various kinds. Make sure to turn off any and all caches in the system; they are almost guaranteed to drive you insane otherwise.


> The very first thing should probably be to carefully check which commit introduced the error, and then carefully study the code changes in that one for clues.

Git bisect is a very useful tool for this, assuming you have an easily reproducible case for the problem. (So it is generally decidedly less useful when the bug is the result of a race condition.)
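
For reference, a sketch of how that looks when you let git drive the search - the good tag v1.4.0 and the `make test` command are placeholders for whatever reproduces your bug:

    #!/usr/bin/env python3
    """bisect_check.py - exit 0 if this revision is good, non-zero if it shows the bug.

    Usage:
        git bisect start
        git bisect bad HEAD
        git bisect good v1.4.0          # last known-good revision (placeholder)
        git bisect run python3 bisect_check.py
        git bisect reset
    """
    import subprocess
    import sys

    # Any reproducible check works: a compile, a unit test, a script that
    # triggers the bug. For `git bisect run`: exit 0 = good, 1-127 = bad,
    # except 125, which means "skip this revision" (e.g. it doesn't build
    # for unrelated reasons).
    result = subprocess.run(["make", "test"])   # placeholder reproduction command
    sys.exit(0 if result.returncode == 0 else 1)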


I find many problems made easier by applying cross-disciplinary knowledge.

Debugging has been one of those for me.

A long time ago I helped my mother with a murder mystery by researching how doctors diagnose things (I was a professional information broker then, with Dialog, Lexis/Nexis, Grateful Med, etc.), as much as I could without going to medical school. I learned a lot about differential diagnosis. That got me hooked on medical shows, where I learned a little more. At some point after that I wondered if I could apply differential diagnosis methods to debugging. Because there are often multiple possible causes for a bug, I found the differential diagnosis approach works amazingly well for me.

I call this process D3 (Differential Diagnosis Debugging).

Below is roughly how I apply it. I am working on a book including this as a couple chapters, but that won't be out for at least another year. The material in this post is in the book, so I am told I must copyright anything smacking of an excerpt.

First, capture all the relevant details of the expected behavior. Create a unit test (or tests) to confirm the expected behavior.

Next, capture all the differences between the observed behavior and the expected behavior (the 'symptoms').

Then, examine those differences to come up with possible hypotheses about the causes.

After that, use a concept similar to Karnaugh maps [1] to determine a sequence of small, discrete unit tests whose truth (if true, the hypothesis could be true) determines a T or F for each hypothesis. If you wind up with more than one T, then you need more tests (diagnostic testing).

Once you have a confirmed hypothesis, apply a fix and rerun all your tests. Rinse and repeat as needed (treatment), until all of your expected-behavior tests pass.

Unpublished Work © Copyright 2022 William A. Barnhill, Jr. Some rights reserved. You may apply the D3 process as described herein; you may not incorporate the D3 process into a written work, a web site, or an email; you may discuss the D3 process if full attribution to the author is given.

Please don't hate me for the above folks. Been told I need to include that if I want to get published.

[1] https://en.wikipedia.org/wiki/Karnaugh_map


Copyright applies to the specific words or notes that you use. You gain copyright (in the US) immediately upon fixing the words in tangible form (i.e. writing them down or recording them). It never applies to underlying ideas or methods.

Trademark applies to special identifiers of products and services.

Patents apply to inventions. They may apply to processes, given tangible form.

In short, you are taking bad advice. Get better lawyers.

In the alternative that you think your licensing terms mean anything:

1. By reading these words you agree, on behalf of yourself and your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that you believe I have entered into with you or your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges.

2. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


The most important thing I’ve learned as a dev is to be comfortable in a space I know nothing about, and to have the confidence to know “I know nothing now, but I will eventually reach the solution and fully understand it”. It’s a process. You start out groping blindly in the dark for a handhold. When you find the first one, suddenly others start falling into place. Then there’s a tiny pinhole of light that you start working towards. And all of a sudden you’re in full daylight, striding from ledge to ledge fearlessly. It’s the most beautiful thing about programming; to do this over and over again.


I use the scientific method for the tougher bugs. It helps in situations where there are multiple root causes or when the testing loop is over 15 minutes.

Write down a list of hypotheses. Then write an experiment to test it.

Sometimes there's this intuition to comment out a block of code, run, and then comment out another block of code. Or revert to something before a bug happens. That's all fine, just make sure it's linked to a hypothesis or multiple ones.

Or sometimes you don't have a hypothesis. This is where the scientific method is also useful - you know what you don't know!

Then you use the Monte Carlo method, or as I call it, throwing darts and seeing if they hit a bug. Basically, you slice out random blocks of possibly offending code, compile, slice out more or less, compile, until you narrow down an area.

From that area, you might formulate a hypothesis. Or you may need to throw more darts until you see a pattern.

The scientific method is bloody slow, but you'll get to the answer eventually. It's not for everything.


If anyone reading this thinks they don't need to use the scientific method and be explicit about hypotheses, you just haven't found a hard enough bug yet. No matter how good you are, there are some bugs that you'll only be able to solve this way.


For a new job this is a common problem. When I was younger I always thought I must be stupid, because the basic setup and builds are surely done well and I just couldn't figure them out. Now I realize that most teams have builds of random quality, and there is likely some peculiarity that you can't google for. You need a buddy you can keep asking; pair programming is also great - if you can sit with someone else and watch what they do, it really helps.


I also suspect their builds have no good reproducibility tests.


I'm fairly good at finding bugs and understanding code. Then recently I've had to mentor a few junior engineers and it seemed they struggled a lot.

It's a mix of search-space reduction + heuristics. You start looking in the most probable areas of fault. This includes new commits. Then you try to partition the places where the bug could be by probabilistic reasoning, logging as you go.

If these approaches don't work, then you start questioning your assumptions. For example, if there is an eatApple() method, what did you assume about this method?

Did you assume all apples are red, or is the functionality actually eatFruit()?


The code you're dealing with is probably deterministic. The answer is in the code. Read the code. Run a mental model. Use your creativity, imagine edge cases for the code you see, and investigate what would happen and whether that is what is happening. Rinse, repeat.


If you often find yourself requesting help, ask yourself "what would X do in this situation?" I often find I can anticipate the questions a colleague would ask me when I present them a problem I'm facing. Sometimes simply trying to answer those questions will reveal the solution. Good questions for any problem are:

* When did this start happening?

* What changed between when it was working and when it stopped working?

* Have you asked person Y who worked on that change?


100%. Perhaps this is due to my work being WFH, but my version of this is that I start typing on Slack to describe my problem, and further debugging steps I could take pop into my head.


Exactly the same thing happens to me :)


Learn to use a debugger.

This advice would be very obvious to many, but I'm constantly surprised by how many of my coworkers don't know how to do this and test code by pushing logging statements into a production environment.


Most of the time it is about being lazy - they just don't want to spend time recreating prod setup on their local.

To quickly fix something, a lot of the time it is easier to push debug statements to the environment than to set up the whole shebang on your local.

Most of the time you don't know which part of the system is wrong, so you can unit test all you want, but if the data in the database is incorrect you still have to reproduce that incorrect data on your local - and you first have to somehow find out that there is incorrect data somewhere.


Also, in some languages it is much more difficult to run a debugger in production, or running the debugger uses a whole lot more resources, which would hurt the stability of the production environment for a bug that might not be worth that much.

I use PHP/Laravel mostly, and while XDEBUG will work with a remote server, it requires an extra extension be installed and activated with a restart of PHP-FPM. That might mean downtime if you only have one app server. After the extension is activated, it requires extra resources to profile each request. Sometimes an order of magnitude more.

Luckily, because it is just PHP, running it locally isn't too hard, but I wanted to outline a scenario where a bunch of people cannot really use a debugger in prod. I've definitely been on teams that have pushed log statements to production to track down issues that were only present in one environment and not reproducible locally. Actually, one just came up about a month ago. Our app server is behind a firewall and a VPN (internal company app), and one of those services isn't forwarding the `X-Forwarded-Host` header correctly, which caused issues later down the line when the framework was attempting to generate URLs. This caused an issue in our frontend code where some JSON data included bad URLs to the UI. It wasn't reproducible at all locally. It took some logging to find out where in the process the URL was being wrongly generated so we could develop a workaround.


This talk, "Debugging with the Scientific Method", changed how I debug problems. I try to watch it once every year or two.

tl;dr: when debugging, people naturally form a hypothesis about what is going wrong and then set about gathering evidence to support or refute that hypothesis. When you do so unconsciously, you are very prone to biases of all sorts, most importantly confirmation bias.

When you do so mindfully, and Write It All Down, then you are much more likely to a) come up with an evidence-gathering exercise which will usefully falsify your hypothesis, and b) respect the gathered evidence, reject the now-false hypothesis, and move on to a new one.

https://www.youtube.com/watch?v=FihU5JxmnBg


The biases are probably the biggest roadblock. You make an assumption of what is wrong and start looking in the relevant code. There are thousands of if statements that all start somewhere at the top, creating a big fork. It's easy to get stuck looking on the right side when the issue is on the left. Always understand that your assumptions can be wrong.


One thing I have learned is to never say "This should work". If it doesn't work it shouldn't work.


For sure! My least favourite type of bug is the converse though. "How is this possibly working?"

When you're pulling on a thread and come across a function that looks completely broken and yet somehow is not.


I often tend to sprinkle conditional code with "in theory you shouldn't ever get here" printouts.

They get triggered more often than you'd think.
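
A minimal version of that pattern, with made-up event kinds and handlers:

    def handle(event):
        if event.kind == "create":
            return on_create(event)        # hypothetical handlers
        if event.kind == "delete":
            return on_delete(event)
        # "In theory you shouldn't ever get here."
        print(f"supposedly unreachable branch hit: event.kind={event.kind!r}")

When that line shows up in the output, you've learned something about your model of the program.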


I'm a young dev, but this has been a go-to of mine and I've found it very useful.


I think this is common enough personally.

The most important thing is understanding the problem itself. So in the case of compile issues, Rust has just blown me away. But I won't harp on.

There are times too when it's just not clicking, and you shouldn't be afraid to step away from it for a bit. I like taking a walk, but anything that gets you away from it works.

I do have crazy lows when I'm just stuck and feel so stupid at that moment. But solving it then gives that addictive high.

As for tools and those obscure errors that seem impossible, I'm always reminding myself of the XY problem and making sure I'm not getting trapped in it!

Power through, though; you're not alone, and you'll remember all the problems you've solved!


No joke, pen and paper, preferably a small notebook with no lines.

One time I solved a very complex circular dependency issue this way. I think just writing down thoughts and process in a very human way can help with technical issues.


Agreed. When I am really stuck, I print out code that I think has a problem, bring a pencil and nothing else and go somewhere else. Coffee shop, break room, park, it doesn’t matter. Just a change of scenery. It’s amazing how well it works for focused debugging. I just need to do it more often and sooner.


I've been described as being quite good at debugging, here is my approach:

Short term - while debugging a problem:

1. Gather whatever information you can. Look in the logs!

2. Debugging means tracking the flow of execution from beginning to end (and then back to the front in many cases). Where along that chain is the breakdown?

3. Often you can stop after 2, since when the point of failure is determined the issue becomes obvious. However, in some cases it is not obvious. In those cases, you have to look for 'something that does not make sense' or can't be true.

Case study: timeouts. We started having errors on a service we owned. Calls would be made to our service (call it service A) and then calls to an upstream service (service B) would time out; we didn't catch this exception, so the request to A would end with a 500 response. However, when we looked at the logs for B, we could see our requests from A and that they were not timing out at all - they were taking the usual amount of time. This is a huge contradiction! After a day of poking around we could reproduce the timeout with a script which called our service. Thinking it might be the load balancer, we started sending requests to the IPs of workers directly, which never timed out. On a hunch, we sent requests to the IP of the load balancer (this can't be done directly, since the LB uses the hostname as part of how it routes requests to servers, but we could do it by adding the LB's IP with the proper hostname to the /etc/hosts file). Lo and behold, the timeouts were gone even though we were doing 'the exact same thing' as before! Removing the entry from the hosts file, the timeouts immediately came back for the same small percentage of requests.

Long term: Being good at debugging really consists of two things, knowing how things work and spending the time required to debug things. Interestingly, one great way to learn how things work is to debug things. Whenever something does not make sense, or you can't explain it, dig in and get to the bottom of it. Every time you do this you'll learn things that you never would have guessed, and your ability to debug some random future issue will go way up when that future issue just happens to intersect something that you had to dig into previously.


I for one consider social methods part of the problem solving process.

Knowing who knows what, and who can help, and asking for help the right way is key to getting better at problem solving.

Perhaps it is the school system that discourages problem solving through social means, and people feel guilty about taking help.

However, consider this perspective. What if the ones who could put together people into functional groups could be considered great problem solvers? Consider any of the great American enterprises, there are people at the top who figured out how to put people together to get things done.


Basically you have a model of your program, but the model breaks down, so you need to check that what you hold true is actually true.

Another approach is to use tools (system stats, logging, debuggers like rr/pernosco, etc).

Learn as much as you can about the system and tools you're using.

Reduce the size of the problem (reduce code, isolate, etc).

Read the code, and try to break it. Build scenarios 'this thing I see in the log can only happen if...'

Finally, there's no shame in rubber ducking a problem with a colleague!


In the beginning of your career I think you just have to choose problems and persist on them until you've solved them no matter how long it takes (within reason). You learn from people and web research but also by doing and remembering. Persist on problems several times and you'll have started to build some intuitions on where problems hide. You'll have to persist less and less with time, but persistence almost always pays off (or you find out why a solution is impossible, which can also be interesting and useful). Eventually you become a guru. I think this is the traditional route to learning. By trying. Learning from others is another route, but trying is really the basis of learning.

Here's a trick you might find useful when you are really stuck and don't see any path forward: change some random related things in the code. If the code does not respond to those changes as you expected then that is a clue! There is something there you don't understand and the unexpected behavior is a clue to it. It's kind of like writing unit tests, checking for what you expect to be true.

Keeping a journal is essential also.


If you relied on others, did you take something out of it other than their answer? Did you pick up how they attacked the problem in a different way? That's how you evolve. Just observe :) Ask them how they found a way, when you couldn't.

If the problem is too obscure, I 'just' rebuild the system like I imagined it.

But I don't think everyone is as suited to this form of tackling problems. You might get more enjoyment out of creating new things?


If you can’t figure out a problem, just go back to the basics.

Comment out or delete code until the fundamentals of the system work. Then add stuff back until errors happen. Also add better logging to track down the problem. Tear down the system and build it back up. Delete nonstandard approaches you can't figure out and use standard ones from quickstart tutorials. It might be slow progress, but progress will be made.


The problem I see a lot in people who have a hard time debugging is continuing to assume all of their base assumptions in the face of a bug. You've hit the bug because of assumptions, so ask yourself: which one of those assumptions, when changed or challenged could produce the outcome you're observing? Sometimes those assumptions are as basic as "I typed everything correctly" but other times they're more nuanced like "I assumed this library only throws when getting a transport error not an HTTP error code". IMO, it's a muscle you exercise. You build understanding in your problem space and get more skilled at identifying likely culprits for bugs. Early on though? It's a lot of asking, seeking and brute forcing.

Also, print statements or use a debugger to confirm assumptions. (I'll be honest, I rarely bring out a debugger unless a print based workflow is time consuming enough to justify remembering everything about the debugger).


I am struggling with this now, too. We've built a data solution on top of some highly custom dynamic DAG-building stuff on Airflow, but the local run-it-yourself story is very tough, and I've yet to get a debug session working that I could attach to a running container of it all - which would be nice. Not adding much, I guess; just saying +1.


At my previous job I was debugging a problem that a client ran into. Looking back, these are some of the steps I took:

1. Recreate problem in prod - How was it triggered? I did a bit of digging and noticed that it happened when very specific conditions occurred.

2. Recreate it locally - Got an error message, e.g. `SomeVar is not defined`.

3. SomeVar - Why is it `undefined`? I started working backwards and realized that the `SomeVar` functionality was working as intended, but it's what was being fed into it that was the issue.

4. Working backwards - I had a hunch as to why the error was occurring when I started, and when I worked backwards I realized that I was partially correct.

5. More research - In regards to point 4, I learned what else was missing (i.e. why I was partially correct about my hunch).

6. Start coding - Since the issue was in prod, I put in a pretty shoddy hotfix. It worked, but it tacked on some logic into code that was already pretty confusing (due to a lot of conditional scenarios in the UI).

7. PR - I opened a PR; per my Sr Dev's feedback, the solution worked but the code wasn't clean.

8. Sr Dev chat - We had a quick chat, and even though the issue was in prod, only a few people were hitting it (the feature wasn't widely used). Also, unless we were actively losing money due to a prod issue, a hotfix wasn't needed and we could take the time to write it cleanly.

9. Coding - I realized that there was an even easier way to write the code without introducing confusing logic. I scrapped about 95% of what I had and put in something that not only worked, but was much cleaner. I also wrote some tests, as per the Sr's recommendation.

10. Updated PR - I followed-up with the Sr, who thanked me for revising the code and for writing the tests.

---

Although I reached the original solution and the new one on my own, having discussions with the Senior Dev was incredibly beneficial.


I've mostly worked alone as a developer. No-one is going to fix it but me.

Actually I think I enjoy problem solving more than the actual programming. Though frankly programming and problem solving are so closely related they are almost the same thing.

I can solve most problems I come up against day to day. I guess I can't solve them all because I do ask questions on Stack Overflow sometimes.

I'm not sure why, I guess it's a few things:

1: As I said, I enjoy problem solving - it feels like a game

2: I have more than 35 years of hard computer problem solving, so there's a lot of experience, which helps.

3: I understand, and make a deliberate effort to understand, as much as I possibly can about every aspect of computer systems from hardware to operating system to database, to cloud to front end to back end.

4: I use the tried and tested problem solving method of divide and conquer - when you have a problem, break the software in half and keep doing so until you find where the problem is.

5: If you are still stuck, try to create a minimal test case that demonstrates the problem - this often solves it, and if it doesn't, then you can post the minimal test case on Stack Overflow.

6: Only ever work on one thing at a time.

7: Work hard to learn how to use problem solving tools.... the debugger in the browser, make sure you know a little strace for doing things like tracking what file the operating system is trying to open, ngrep so you can watch data passing over the network.

8: Be relentless. Sometimes I think this is a personality "issue", but I will literally work for 10 hours at a time trying to solve a problem until I crack it. I really don't like finishing work without having all the day's problems solved. I just grind and grind and grind on problems until I work them out.

9: Have a really thorough knowledge of the technologies you are working with - don't be satisfied with learning as you go - actually take the time to read the language specification for the languages you program with, stuff like that.

10: Use a really good IDE and make sure you know how to use its ability to jump through the code, to jump to function definitions. You need to be good at following the logic of the program.

11: And finally, if no matter what you do the error remains the same, then you are probably not even working on the correct code.

In the case, for example, of a program that won't compile: just start cutting code out until something compiles, then add it back in little by little until you work out which line fails. Even better is to use appropriate debugging tools, but divide and conquer always works.

There's also no shame in scattering these everywhere you suspect the problem might be:

print('got here!')


Came here to say versions of 4 & 5.

As a more junior programmer, I would sometimes resist the perceived effort or inelegance of the divide-and-conquer approach and end up spending much more time than needed to get to a solution.


Exactly, some 4 & 5 will reveal why your program isn't compiling.


This is excellent advice.


I think sometimes not being able to get a software system with hundreds of thousands, or even millions of lines of code to compile is fine?

unless you didn't even do any changes, and it's a matter of the codebase requiring a specific environment or option to compile and you not having/knowing it

on getting better at debugging, I would try to isolate possible sources of the problem, form a hypothesis, test it experimentally

you'll need to know what you know

is it the environment you have? networking? OS? etc., whatever applies to your system

I think having a known working state is probably the most useful, though it might be difficult to get

something to consider, I do remember there being a paper on getting wrong results even when doing nothing wrong


I don't know if I'm "good", i.e. efficient at debugging. I'm just incredibly persistent. If that means stepping through 20 layers of framework code... so be it. I've found weird undocumented edge cases that way.

That said, you mention not getting your code to compile. I've never (at least not in recent memory) not been able to get my code to compile by myself after a while, but I also don't know what language you're working with. Are you struggling with the type system? That can usually be tackled by taking expressions apart, assigning them explicit types and then reasoning through the inputs and outputs (with pen and paper if need be).


Use a debugger. I've gone so far as to port my linux code to windows just to use visual studio's debugger, fix the problem, then port it back.

Use version control. If the order processing worked last week and johnny made a change to order processing in this release it's highly likely it's johnny's change.

You make an assumption of where the problem is and start looking. Always understand your assumption can be wrong and you are looking in the wrong place.

Avoid code with side effects. I've taken code with bugs in it and completely rewrote all of it without side effects just to avoid those types of bugs and magically all of the hard to find bugs disappeared.


Unfortunately, there's little curation of interesting debugging case studies or "debugmes". You can try to scavenge bug trackers, but they aren't optimized for this, just a bag of itches users want to scratch. Neither are war stories very fruitful, since rarely do you get to hunt down the corresponding bugfix commits, which are essential to not miss on the details that made those bugs tricky. But then, you already have the solution in front of you.

Instead, I've found much more deliberate practice and therefore value from reverse engineering: just like in debugging, you want to understand the underlying logic through program analysis.

You can pick security CTFs, crackmes, malware samples, proprietary software with bugs you want to fix, or functionalities you want to add, or even games where you want to find some hidden content, extract some resources, or modify some behaviors. In all that you will find something of your interest, and apply approaches that I think translate well to software development:

* Differential tracing: Want to know how some action reflects in the codebase? Take k instruction traces where you do everything except that action, then take a trace where you do that action, and find the unique differences in that last trace. Want to know what data gets written? Same approach with memory dumps and breakpoints. How do different inputs affect these changes? You will learn to be methodical and thorough in what you test and log. (A rough sketch of the tracing idea follows after this list.)

* Recognizing patterns: Sometimes you don't have symbols for your functions, are you able to identify printf at a glance, or will you waste time following the logic of the function? Do you see relative offsets being used and recognize accesses to an array? With source code all this happens on a more macro view, such as algorithms or design patterns hidden in all those coupled functions and classes. But the micro view also applies: figure out what constants relate to specific functionalities, and you can grep your way to relevant functions or documentation.

* Avoiding boilerplate: This follows from the previous point, since you want to recognize the flow of data through the codebase in order to have some call hierarchy to follow; otherwise it's easy to waste time on functionality that is irrelevant to you. Start with how data enters the application: stdin, files, database connections, HTTP endpoints... Tests, examples, or client apps will also help here.
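
Here is a rough sketch of the differential-tracing idea at the source level in Python - run_without_action and run_with_action are hypothetical drivers of the program under study:

    import sys

    def collect_called_functions(action):
        """Run `action` and return the set of (filename, function) frames entered."""
        seen = set()

        def tracer(frame, event, arg):
            if event == "call":
                code = frame.f_code
                seen.add((code.co_filename, code.co_name))
            return None                    # no line-level tracing needed

        sys.settrace(tracer)
        try:
            action()
        finally:
            sys.settrace(None)
        return seen

    baseline = collect_called_functions(run_without_action)   # everything except the action
    with_action = collect_called_functions(run_with_action)   # the same, plus the action

    for filename, func in sorted(with_action - baseline):
        print(f"{filename}:{func}")        # code unique to the action of interest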

Oh and don't worry about learning some assembly language, just make that investment, since that's the straightforward, well defined, predictable part.


Considering how you describe the problem, you just don't know the codebase?

It's hard to debug a problem if you have no idea what most of the code is supposed to do. You can go for speed - asking colleagues that do know - or you can just start following the code until you understand it. That will take more time of course, but in time you will start having enough knowledge of the code base that you can just start throwing guesses when there's a problem.

My 2 cents, in case I read your question right. If I haven't, there are a few answers here describing debugging techniques :)


Ask someone? If you have ever been part of a founding dev team you know there is gonna be stuff in the code that is unpleasant to untangle unless you are acquainted with how/why it was done. If you have some kinda searchable chat/FAQ thing I am sure someone has encountered this before or will in the future. I always thought slack was pretty good at storing years of dev questions threads in a single channel due to its informal question/answer nature vs something like stack overflow or Confluence


To improve your debugging, improve your mental model of the system you're tackling. The more you know about the state and flow of data at every point, the more you can reason about what's going on. I'm a particularly average programmer who is actually quite slow at banging out new code but I have fixed very difficult bugs in cloud deployments, libraries, firmware and realtime industrial systems just by laser focusing on what the system should be doing.


Talking to others about the problem helps, but more critically, take to heart and learn their debugging techniques, instead of just getting others to problem-solve for you. I learned to use a debugger, profiler, compiler output, rubber ducky, five whys, etc. by seeing others apply the methods. I already understood those methods existed, but they came alive and I understood them at a deeper level by seeing others apply them.


I guess doing it for inappropriate amounts of time during my youth just gave me that confidence. Also, I mostly debugged random problems in random stuff, so that increased the range of my confidence.

Of course, I've seen people who could pull off the same without having spent that much time. I respect and admire them.


the truth is you probably overcome many blockers a day that would stump others. It's ok to rely on others/the internet. I'm never ever mad when someone asks me for help, unless it's the same question they've asked 5 times already.


The biggest thing for me is just learning to accept that the problem exists and I will find the solution eventually.

Sometimes I allow myself to be annoyed that the code isn't working and that is the wrong mindset.

Everyone will experience problems but suffering is a choice.


Spend time installing a good low friction debugger and get good at using it! Screw log lines for local debugging, debugging is a skill in itself that needs to be practiced.


this book is a very good discussion about debugging, "Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems", https://embeddedartistry.com/blog/2017/09/06/debugging-9-ind...


You can't find a solution? Is there a revision of the source code that still compiles? Then find that revision first.

Divide and conquer.

And define things out of existence.


rest more, sleep on the problem. Usually I get the eureka moment when I'm away from the hard problem, not when I'm on it.


Bugs are like particles in this universe: if you feel like it's nothing, you will find it.


tl;dr rely even more on others in the most transparent way to constantly get unblocked as quickly as possible. Don’t let your ego stop you from learning something.

I'm grunching this thread. But I have 20-ish years of experience, consider myself a decent debugger, and am considered one by enough others that I've been asked this line of questioning while in a mentorship role on a number of occasions.

The number one thing is simply humility; asking the dumb question in the smartest way you can, as quickly as possible, to the person who can most likely give you the best answer.

As you’re learning and starting out, you may not have great resources for such, but within most orgs you should be able to find where the answers are, and first ask the dumb questions (caveated by what you do understand), then eventually ask for a brain dump.

It is through the accumulation of tons of disparate idiosyncratic knowledge that little clues and common patterns emerge that allow people to sniff out a root cause hypothesis before the facts are even in, and often be right.

But people who hide their ignorance, refuse to ask questions that may make them look silly, refuse to be the idiot in the room, they learn much less quickly because they don’t get unblocked as quickly.


Simply get good at gaslighting yourself to keep yourself going on until you solve it.


start with investing in devops for extremely tight feedback loops and exploratory instrumentation

get good at finding and reaching out to contractable experts for rabbit-holes, whether for the spot-work or to pair on solving with their expertise


"A problem well-stated is half-solved."


Tests


Take a walk.

Rubber duck it.

Never quit.


I guess the TL;DR is "Keep failing", but I struggle from heavy imposter syndrome myself. Well, definitely an imposter at the moment.





