Ever ponder how a physician troubleshoots medical issues in their patients? Neither did I until a consulting client pointed out I was following a medical methodology known as “differential diagnosis.”
I was intrigued by the comment, which led me to investigate further how doctors work to fix patients, just as we work to fix EMC problems.
But first, a little background…
Early in my EMC consulting career, a client asked me to explain each step as we worked to improve ESD immunity on an existing product. In addition to solving the problem, he wanted to better understand my thinking process. Fair enough, I thought.
At one point, I laid out a “fault tree” of possibilities, along with prescribing a short course of action. As it was getting complicated, I apologized for any confusion. The conversation went something like this:
“Not a problem,” my client said, “you are doing differential diagnosis.”
“Stop,” I said. “What did you just say? And where did you hear that?” Joking, I added, “I’m a consultant — we make our living with buzz words like that.”
Laughing, he responded, “It is a medical term. My brother-in-law is a physician, and we often discuss troubleshooting methods.”
This began my fascination with how doctors troubleshoot problems. In this article, I’ll share three concepts: differential diagnosis (DD), gross & microscopic diagnosis, and the ninety percent rule. All three came out of conversations with medical doctors over the years, and I present them here in reverse chronological order.
A few weeks after my initial introduction to this concept, I struck up a conversation with a seatmate on a cross-country flight. Upon learning he was a doctor with the Mayo Clinic, I asked about DD and was treated to a most interesting lecture. After all, he was a teaching doctor and I was a very willing student. Those of us who teach love these situations.
He began by explaining the father of DD was Arthur Conan Doyle (the creator of Sherlock Holmes). Doyle was an MD who also wrote short stories. He had an idea for a detective based on a favorite medical professor who taught clinical diagnosis. As we all know, the rest is history. It also explains the presence of Holmes’s medical sidekick, Dr. Watson.
The objective is “rule things in/rule things out” by creating two lists – high probability and low probability. The goal is to quickly narrow down a large list of potential causes to a smaller list, maybe even one likely root cause.
For example, if a patient presents with a red rash, there may be a hundred or more possibilities. Maybe it is the measles, or maybe it is bubonic plague. The first step is looking at vitals (temperature, blood pressure, etc.), which helps quickly eliminate possibilities.
The next step is the physical examination, along with detailed questions. Sometimes an immediate diagnosis can be made — other times additional tests may be necessary.
At that point, the prescription can follow — but not before. As the Mayo doctor on my flight emphasized to me, “prescription without diagnosis is malpractice.” As an aside, how many of us have performed EMI tests or thrown solutions at the problem without thinking it through? Think like a doctor instead.
On rare occasions, however, the cause may well be a “zebra,” medical slang for a rare diagnosis, from the adage “when you hear hoofbeats, think horses, not zebras.” He pointed out that the Mayo Clinic often deals with zebras. There may be 100 possibilities, of which 99 have already been ruled out by previous doctors, making it simple to identify the zebra. This is why it is important to ask what has already been done to address the problem.
For many non-EMI engineers, all EMI problems seem like zebras: rarely seen by them, yet common for those of us in the EMI trenches.
So what steps should we follow? Years ago, I learned a simple framework for attacking problems with the acronym ACT (aware, critique, try). The last step in the framework is very important to avoid “paralysis by analysis.” Eventually one must try something, but it is best to have a logical approach, or at least a plausible hypothesis.
- What are the symptoms? (Equipment issues.) Focus “inside the equipment.” Think like a doctor, and ask: Where does it hurt? When did you first notice the pain? What else is wrong?
- What are the likely causes? (Environmental issues.) Focus “outside the equipment.” Three likely suspects for upsets/failures are ESD (electrostatic discharge), RFI (radio frequency interference), and power disturbances.
- What are the constraints? (Systems issues.) Focus on the “cost of failure,” not the “cost of components.” Once you find a solution, you can then optimize for cost. Identify constraints up front, such as no modifications to circuit boards. And watch out for “wishful thinking.”
- How will you know when it is fixed? (Success issues.) Establish a goal and a method to validate it. For chronic problems, this might include no field failures for six months, etc. But do have a measurable objective.
This is where you apply differential diagnosis. The goals are to rule out the least likely scenarios and determine the most likely. It is all about probabilities and priorities. Don’t discard the low probabilities – you may need them later. Remember the “zebras” — occasionally you will find one. But don’t chase them first, no matter how interesting.
Start with the highest probability, as that has the best chance of success. If that does not work, move on to the next item on your list. Remember, “if at first you don’t succeed, try again…” Solving EMI problems is often a process of elimination.
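The rule in/rule out process described above can be sketched in a few lines of Python. This is only an illustration, not a formal method: the candidate causes, their probabilities, the 0.2 threshold, and the `is_fixed_after` check are all hypothetical stand-ins for your own engineering estimates and your own pass/fail test.

```python
from dataclasses import dataclass


@dataclass
class Cause:
    name: str
    probability: float  # rough engineering estimate, 0.0 to 1.0 (assumed values)


def triage(causes, threshold=0.2):
    """Split causes into high- and low-probability lists, most likely first."""
    ranked = sorted(causes, key=lambda c: c.probability, reverse=True)
    high = [c for c in ranked if c.probability >= threshold]
    low = [c for c in ranked if c.probability < threshold]
    return high, low


def troubleshoot(causes, is_fixed_after):
    """Work down the lists; return the first cause whose fix passes the check."""
    high, low = triage(causes)
    for cause in high + low:  # the low list is kept in reserve, not discarded
        if is_fixed_after(cause):
            return cause
    return None


# Hypothetical suspects for an upset problem
causes = [
    Cause("Power line transient", 0.15),
    Cause("ESD through unshielded front-panel seam", 0.5),
    Cause("Firmware race condition (the zebra)", 0.05),
    Cause("RFI pickup on sensor cable", 0.3),
]

# Stand-in for "apply the fix and rerun the test"
found = troubleshoot(causes, lambda c: c.name.startswith("RFI"))
print(found.name)
```

Note that the zebra stays on the list. It is simply tried last.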
However, if something is very simple, go ahead and try that first. I learned this the hard way chasing a problem for several days, only to discover moving a simple ground connection solved it. A bit embarrassing but my client was still happy to have the problem solved.
There is a bonus to the above. Even at a one percent probability of success, one time in a hundred you will succeed. When that happens, everyone will think you are a genius. So be sure to pick the low-hanging fruit first.
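One way to reconcile “highest probability first” with “simple things first” is to weigh the chance of success against the effort involved. The sketch below is my own illustrative heuristic under assumed numbers, not an established formula: the fixes, probabilities, and hours are hypothetical.

```python
# (name, estimated probability of success, estimated effort in hours)
# All values are assumptions for illustration.
fixes = [
    ("Redesign I/O filter", 0.60, 16.0),
    ("Add ferrite to cable", 0.30, 0.5),
    ("Move ground connection", 0.01, 0.1),
]


def payoff_per_hour(fix):
    """Chance of success per hour of effort spent trying the fix."""
    _, probability, hours = fix
    return probability / hours


ordered = sorted(fixes, key=payoff_per_hour, reverse=True)
for name, probability, hours in ordered:
    print(f"{name}: p={probability:.2f}, {hours} h")
```

With these numbers, even the one percent long shot ranks ahead of the board redesign, simply because it costs almost nothing to try.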
Two caveats as you try. First, start with an open mind; don’t fight last year’s battle. Second, don’t be too “scientific” by trying only one thing at a time. Rather, stack the fixes up. EMI problems are often like a leaky boat: if you have multiple holes in the boat and only patch one at a time, you will never succeed.
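The leaky boat effect is easy to see with a little arithmetic. The sketch below assumes three hypothetical coupling paths whose contributions are uncorrelated and therefore add as power. Real sources may combine differently, so treat the numbers as illustrative only.

```python
import math


def combined_level_db(levels_db):
    """Power-sum contributions given in dB (assumes uncorrelated sources)."""
    total_power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_power)


# Hypothetical contributions from three paths, in dB relative to the limit
leaks = {"cable": 6.0, "seam": 6.0, "display": 5.0}

before = combined_level_db(leaks.values())

# Patch one hole completely and leave the other two alone
one_fix = combined_level_db([v for k, v in leaks.items() if k != "cable"])

# Stack the fixes: knock 20 dB off every path at once
all_fixed = combined_level_db([v - 20 for v in leaks.values()])

print(f"before: {before:.1f} dB, one fix: {one_fix:.1f} dB, stacked: {all_fixed:.1f} dB")
```

Completely eliminating one leak improves the total by less than 3 dB here, and the unit still fails. Tried one at a time and then removed, each fix looks useless. Stacked together, they bring the unit under the limit, after which you can back them out one by one to see which are really needed.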
Finally, don’t be afraid to change directions. I’ve solved more than one problem by just starting over, asking “what if up was down.” These are often the most interesting problems when solved, as one is left musing “who would have thought…” And those cases make for great EMI war stories to share later.
I learned the second technique, gross and microscopic diagnosis, from a pathologist years prior to EMC consulting. Moonlighting at the time, I was engaged to help automate a hospital pathology lab. It was one of the more interesting consulting projects in my career. Not for the squeamish, though: my pathologist had buckets of preserved human hearts on a bookshelf. And we think EMC engineers are weird?
Most of us assume pathologists spend their time in ghoulish activities like autopsies, but they also serve as quality controls on hospital procedures like surgery. For example, if a surgeon removes an appendix (or anything else), the tissue is not just thrown away. Rather, it is sent to the pathology lab for a two-step procedure. Even a small hospital may run several thousand samples per month. Thus, the need to automate the process.
The first step is the gross diagnosis, that is, a quick visual inspection. “Yes,” the pathologist says, “this looks like a diseased appendix.” But then it is tagged and may be preserved and sliced and diced for further investigation.
The second step is the microscopic diagnosis, occurring sometime later. The tissue is typically examined under a microscope, and this may be done by the same pathologist or another; it doesn’t matter. If the microscopic diagnosis does not match the gross diagnosis, no harm, no foul. It just means we now have more detailed information.
I find this useful when dealing with EMI problems, particularly when people are in a panic and want a quick answer. I’m often able to give the gross diagnosis, but I remind them that this may change upon additional information or test data.
Explaining this helps manage expectations, and also gives me permission to change my own mind. No, it is not flip-flopping; it just means we now have a better handle on the problem. If pressed, I often share the pathologist story to explain my change in EMI diagnosis.
I learned the third technique, the ninety percent rule, as a young EMC engineer. We drank a lot of coffee in the EMC lab (too much really) and I ended up with an occasional irregular heartbeat. In my mid-20s, this was a bit scary and sent me to my doctor.
After a few quick questions, including my coffee consumption, I was advised to cut back on the caffeine. Pretty simple, right? Except it did not resolve the problem. So back to the doctor I went.
He next prescribed some kind of pill. It worked, and after a few months it was no longer needed. But being the curious engineer, I asked why the initial diagnosis did not work, and also what the next step would have been if the pill had not worked.
My doctor knew I was an engineer, so he asked with a grin, “Does everything you do work the first time?” Well, no.
He then shared his “Ninety Percent” rule. First, go with the diagnosis/treatment that works 90% of the time. If that fails, go with the next 9%, and so on. Or as we say in the engineering world, first try Plan A, then Plan B, etc. Always good to have those alternate plans in reserve.
So Plan B worked for me, but I asked about Plan C. My doctor replied, “Well, we could get out the scalpel.” At that point, I decided Plan C was not in my future.
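The arithmetic behind the ninety percent rule is worth a quick look. The sketch below assumes the “and so on” continues geometrically, with each plan resolving 90% of whatever the previous plans left unsolved. That is my reading of the rule, not something my doctor spelled out.

```python
def cumulative_coverage(hit_rate=0.9, plans=3):
    """Cumulative fraction of cases solved after Plan A, Plan B, Plan C, ...

    Assumes each plan resolves `hit_rate` of the cases still unsolved.
    """
    remaining = 1.0
    coverage = []
    for _ in range(plans):
        remaining *= 1.0 - hit_rate
        coverage.append(1.0 - remaining)
    return coverage


for plan, solved in zip("ABC", cumulative_coverage()):
    print(f"Plan {plan}: {solved:.1%} of cases solved so far")
```

Under that assumption, Plan A alone covers about 90% of cases, Plans A and B together about 99%, and adding Plan C about 99.9%. A short list of well-ordered alternatives goes a long way.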
One more medical story that goes back almost a century. A great uncle of mine was a doctor from around 1900 to 1950. I barely remember him, but his wife (also his nurse) once showed me the little black bag he used on house calls.
An engineering student at the time, I was intrigued by the simple tools of his trade: a stethoscope, a simple surgical kit, and some pretty basic drugs. Yet he was able to troubleshoot medical problems with these tools, along with the gray matter between his ears. As EMC engineers, we can do the same.
I hope these anecdotes and examples help clarify your thinking on troubleshooting, as they have for me. Troubleshoot like a doctor indeed!