To effectively solve a problem, you first have to accurately define the problem. Lots of business leaders don’t appreciate just how difficult it can be to correctly diagnose technical problems or security incidents in an enterprise-grade network.
I survived a heart attack last week. Please forgive me if this column is shorter than usual; the fatigue that comes with recovery has made it difficult to think, let alone write. I’m as surprised as you are that I somehow managed to type this and get it turned in to my editor on time …
That said, while convalescing in hospital I got to thinking that there’s an important take-away from my experience that translates surprisingly well to the business world: even experts can draw perfectly logical conclusions from known, validated data and make the wrong call when it comes to corrective action. This applies to tech support and cybersecurity as much as it applies to medicine, military strategy, sport, and every other field of human endeavour.
For context, I had my first cardiac scare back in 2015. Long story short, my A&E doctors announced that there was nothing wrong with my heart (no signs of blockages, low cholesterol, etc.), so the problem must have been something else. The docs said that I had suddenly manifested hypertension despite never having shown any evidence of high blood pressure before. That (they said) must be the cause of symptoms that looked just like a minor heart attack. The doctors prescribed more exercise, a heart-healthy diet, and medications to slow my heart down. Their remediation plan addressed the symptoms, so we got on with it.
I had my second cardiac scare a year later. Again, my A&E docs came back with the same analysis: nothing wrong with my heart, so the problem must be something else. They insisted that it was probably due to gastrointestinal issues. I was referred to a gastroenterologist who crafted his own plans for mitigating what he assumed was causing the symptoms.
To be fair to the doc, we eat a LOT of spicy and antagonistic food here in Texas. Nearly any good gas station taco can trigger heartburn that’ll bring a strong man to his knees.
In retrospect, it’s clear that the doctors were partially wrong both times. They did successfully address some actual secondary problems; they just failed to detect the causal issue: a blockage in a critical coronary artery. Had they caught that one critical clue, my entire treatment regimen would have been completely different and my eventual real heart attack might not have happened.
Making things worse, I had three minor chest pain episodes before my real attack happened. In each case, I treated my pain according to my doctors’ instructions. In all three manifestations the pain receded. Not because of the treatment method, but because my coronary artery wasn’t fully blocked … yet. Still, the do-this-then-that process my docs had taught me gave me the false impression that I was correctly treating the problem. That’s why my real heart attack almost took me out.
Based on this, it would be natural to assume that I’d be furious right now with my prior healthcare providers for the nearly-lethal misdiagnosis. That would make sense … but I’m not.
I remember enough Anatomy and Physiology from my Army practical nurse course to appreciate that the human body can be staggeringly complicated. Symptoms can mean a lot of different things and sometimes can indicate multiple overlapping problems at once. Making the right diagnosis requires highly-accurate readings of all the affected systems. It also requires the right mental framework to isolate the primary causal factors. It can be very easy to draw the wrong conclusions if you miss just one crucial piece of evidence along the way. It’s also easy to misinterpret your remediation efforts as being effective when symptoms seem to recede, because your treatment might never have addressed the primary problem. As in my case.
This same principle holds true in both technical support and security response operations. As much as we boffins claim to have ‘perfect knowledge’ of our systems and infrastructure, almost all corporate IT plants are staggeringly complex. They’re haphazardly grown more than they are consciously constructed; corporate networks are often large conglomerations of dissimilar technologies, bodged together with the intellectual equivalent of duct tape. Well-meaning techs get new systems just integrated enough to function and then leave things alone. Companies often can’t afford to make things ‘perfect’ when even ‘good enough’ efforts go over-budget.
That’s why diagnosis, root cause determination, and remedial action plans are often well-intentioned but also fundamentally wrong. Fully-qualified experts can correctly draw every logical conclusion based on the evidence available and miss the core problem entirely. It only takes one lost or misinterpreted clue (like my increasingly-blocked coronary artery) hiding just below the experts’ threshold of notice to completely change the equation. The techs can then do everything right according to both academic theory and the totality of symptoms and still fail to fix the problem. Or, more often, treat one or more legitimate secondary problems under the assumption that they’re solving the primary problem. Again, as in my case.
I’ve worked in many companies where InfoTech and InfoSec leaders and head boffins were reluctant to ever formally diagnose problems or to commit that a proposed solution would completely address a crisis. These experts understood – from hard-won experience – that their professional (and, often, political) credibility would be lost if they dared commit to a course of action that didn’t truly fix the problem-at-hand. A wise technologist never promises their boss an outcome that they can’t guarantee.
There are also political ramifications to admitting that your last huge purchase of equipment, staff, or services won’t resolve the current mess. To a non-tech-savvy businessperson, all technology is effectively magic; they don’t want to hear that a million-pound investment in Splunk won’t stop the lateral spread of ransomware on its own.
At the same time, I’ve seen far too many non-technical leaders, clients, and business owners get furious over a pragmatic and appropriate reluctance to guarantee success; after all, these high-priced technical experts that they hired to keep things running smoothly supposedly have all the tools and knowledge that they need, so why can’t they just find the problem and fix it? The company is losing money! We can’t afford to waste time testing possible cures! We need results, now! (etc.).
That divide in opinions can poison relations between operations and IT. It’s difficult for non-savvy people to appreciate just how much networks are like people. That is to say, we’re both complicated and difficult to understand. Skill and experience are crucial to remedial action; that said, it’s a correct diagnosis that wins the day. You have to identify the right problem before you can solve it.
POC is Keil Hubert, firstname.lastname@example.org
Follow him on Twitter at @keilhubert.
Keil Hubert is the head of Security Training and Awareness for OCC, the world’s largest equity derivatives clearing organization, headquartered in Chicago, Illinois. Prior to joining OCC, Keil was a U.S. Army medical IT officer, a U.S.A.F. Cyberspace Operations officer, a small businessman, an author, and several different variations of commercial sector IT consultant.
Keil deconstructed a cybersecurity breach in his presentation at TEISS 2014, and has served as Business Reporter’s resident U.S. ‘blogger since 2012. His books on applied leadership, business culture, and talent management are available on Amazon.com. Keil is based out of Dallas, Texas.