Security engineering: broken promises

The following draft excerpt comes from my upcoming book. Republished with permission of No Starch Press.


On the face of it, the field of information security appears to be a mature, well-defined, and accomplished branch of computer science. Resident experts eagerly assert the importance of their area of expertise by pointing to large sets of neatly cataloged security flaws, invariably attributed to security-illiterate developers, while their fellow theoreticians note how all these problems would have been prevented by adhering to this year's hottest security methodology. A commercial industry thrives in the vicinity, offering various non-binding security assurances to everyone, from casual computer users to giant international corporations.


Yet, for several decades, we have in essence completely failed to come up with even the most rudimentary, usable frameworks for understanding and assessing the security of modern software; and save for several brilliant treatises and limited-scale experiments, we do not even have any real-world success stories to share. The focus is almost exclusively on reactive, secondary security measures: vulnerability management, malware and attack detection, sandboxing, and so forth; and perhaps on selectively pointing out flaws in somebody else's code. The frustrating, jealously guarded secret is that when it comes to actually enabling others to develop secure systems, we deliver far less value than could be expected.


So, let's have a look at some of the most alluring approaches to assuring information security - and try to figure out why they fail to make a difference to regular users and businesses alike.

Flirting with formal solutions



Perhaps the most obvious and clever tool for building secure programs would be simply to algorithmically prove they behave just the right way. This is a simple premise that, intuitively, should be within the realm of possibility - so why hasn't this approach netted us much?


Well, let's start with the adjective “secure” itself: what is it supposed to convey, precisely? Security seems like a simple and intuitive concept, but in the world of computing, it escapes all attempts to usefully specify it. Sure, we can restate the problem in catchy, yet largely unhelpful ways – but you know we have a problem when one of the definitions most frequently cited by practitioners is:


“A system is secure if it behaves precisely in the manner intended – and does nothing more.”


This definition (originally attributed to Ivan Arce) is neat, and vaguely outlines an abstract goal - but it says very little about how to achieve it. It could be computer science - but in terms of specificity, it could just as easily be a passage in one of Victor Hugo's works:


“Love is a portion of the soul itself, and it is of the same nature as the celestial breathing of the atmosphere of paradise.”


Now, one could argue that practitioners are not the ones to be asked for nuanced definitions - but ask the same question to a group of academics, and they will deliver roughly the same answer. The following common academic definition traces back to the Bell-La Padula security model, published in the early seventies (one of about a dozen attempts to formalize the requirements for secure systems - in this particular case, in terms of a finite state machine - and one of the most notable ones):


“A system is secure if and only if it starts in a secure state and cannot enter an insecure state.”


Definitions along these lines are fundamentally true, of course, and may serve as a basis for dissertations, perhaps a couple of government grants; but in practice, any models built on top of them are bound to be nearly useless for generalized, real-world software engineering. There are at least three reasons for this:


  • There is no way to define desirable behavior of a sufficiently complex computer system: no single authority can spell out what the “intended manner” or “secure states” are supposed to be for an operating system or a web browser. The interests of users, system owners, data providers, business process owners, and software and hardware vendors tend to differ quite significantly and shift rapidly - if all the stakeholders are capable of and willing to clearly and honestly disclose them to begin with. To add insult to injury, sociology and game theory suggest that computing a simple sum of these particular interests may not actually result in a satisfactory outcome; the dilemma, known as “the tragedy of the commons”, is central to many disputes over the future of the Internet.



  • Wishful thinking does not automatically map to formal constraints: even if a perfect high-level agreement of how the system should behave can be reached in a subset of cases, it is nearly impossible to formalize many expectations as a set of permissible inputs, program states, and state transitions - a prerequisite for almost every type of formal analysis. Quite simply, intuitive concepts such as “I do not want my mail to be read by others” do not translate to mathematical models particularly well - and vice versa. Several exotic approaches exist that let such vague requirements be at least partly formalized, but they put heavy constraints on software engineering processes, and often result in rulesets and models far more complicated than the validated algorithms themselves - in turn, likely needing their own correctness to be proven... yup, recursively.



  • Software behavior is very hard to conclusively analyze: static analysis of computer programs to prove they would always behave in accordance with a detailed specification is a task that nobody has managed to convincingly demonstrate in complex real-world scenarios (although, as usual, limited success in highly constrained settings or with very narrow goals is possible). Many cases are likely to be impossible to solve in practice (due to computational complexity) - or may even turn out to be completely undecidable due to the halting problem.
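To make the earlier Bell-La Padula framing concrete, here is a toy sketch of its two core rules - the simple security property (“no read up”) and the *-property (“no write down”). The labels and levels are made up for illustration, and the sketch deliberately omits the categories and trusted subjects of the real model:

```python
# Toy Bell-La Padula-style checks. Illustrative only: no categories,
# no trusted subjects, no discretionary access control.
LEVELS = {"public": 0, "confidential": 1, "secret": 2}

def may_read(subject_level, object_level):
    """Simple security property ("no read up"): a subject may read
    only objects at or below its own clearance level."""
    return LEVELS[subject_level] >= LEVELS[object_level]

def may_write(subject_level, object_level):
    """*-property ("no write down"): a subject may write only to
    objects at or above its own level, so data cannot leak downward."""
    return LEVELS[subject_level] <= LEVELS[object_level]
```

Note how even this toy version immediately exhibits the restrictiveness criticized later in this chapter: a “secret” process cannot append a line to a “public” log file, something real applications routinely need to do.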



Perhaps more frustrating than the vagueness and uselessness of these early definitions is that as decades fly by, little or no progress has been made on coming up with something better; in fact, a fairly recent academic paper released in 2001 by the Naval Research Laboratory backtracks some of the earlier work, and arrives at a much more casual, enumerative definition of software security - one that explicitly concedes it is imperfect and incomplete:


“A system is secure if it adequately protects information that it processes against unauthorized disclosure, unauthorized modification, and unauthorized withholding (also called denial of service). We say 'adequately' because no practical system can achieve these goals without qualification; security is inherently relative.”


The paper also provides a retrospective assessment of earlier efforts, and the unacceptable sacrifices made to preserve the theoretical purity of said models:


“Experience has shown that, on one hand, the axioms of the Bell-La Padula model are overly restrictive: they disallow operations that users require in practical applications. On the other hand, trusted subjects, which are the mechanism provided to overcome some of these restrictions, are not restricted enough. [...] Consequently, developers have had to develop ad hoc specifications for the desired behavior of trusted processes in each individual system.”


In the end, regardless of the number of elegant, competing models introduced, all attempts to understand and evaluate the security of real-world software using algorithmic foundations seem to be bound to fail. This leaves developers and security experts with no method to make authoritative statements about the quality of produced code. So, what are we left with?

Risk management



In the absence of formal assurances and provable metrics, and given the frightening prevalence of security flaws in key software relied upon by modern societies, businesses flock to another catchy concept: risk management. The idea, applied successfully to the insurance business (as of this writing, with perhaps a bit less to show for it in the financial world), simply states that system owners should learn to live with vulnerabilities that would not be cost-effective to address, and divert resources to cases where the odds are less acceptable, as indicated by the following formula:


risk = probability of an event * maximum loss


The doctrine says that if having some unimportant workstation compromised every year is not going to cost the company more than $1,000 in lost productivity, maybe they should just budget this much and move on – rather than spending $10,000 on additional security measures or contingency and monitoring plans. The money would be better allocated to isolating, securing, and monitoring that mission-critical mainframe that churns billing records for all customers instead.
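The workstation-versus-mainframe arithmetic above can be sketched in a few lines; the probability and dollar figures are the hypothetical ones from the example, not real data:

```python
def annualized_risk(probability_per_year, maximum_loss):
    # The doctrine's formula: risk = probability of an event * maximum loss.
    return probability_per_year * maximum_loss

# Hypothetical figures: an unimportant workstation compromised roughly
# once a year, each incident costing $1,000 in lost productivity.
workstation_risk = annualized_risk(1.0, 1_000)

# By-the-numbers reasoning: a $10,000 security program for that
# workstation is deemed "not cost-effective", since it exceeds the
# expected annual loss - so the doctrine says to just budget the loss.
countermeasure_cost = 10_000
worth_mitigating = countermeasure_cost < workstation_risk
```

The bullets that follow explain why this tidy calculation breaks down in interconnected systems.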


Prioritization of security efforts is a prudent step, naturally. The problem is that when risk management is done strictly by the numbers, it does deceptively little to actually understand, contain, and manage real-world problems. Instead, it introduces a dangerous fallacy: that structured inadequacy is almost as good as adequacy, and that underfunded security efforts plus risk management are about as good as properly funded security work.


Guess what? No dice:



  • In interconnected systems, losses are not capped, and not tied to an asset: strict risk management depends on the ability to estimate the typical and maximum cost associated with a compromise of a resource. Unfortunately, the only way to do so is to overlook the fact that many of the most spectacular security breaches in history started in relatively unimportant and neglected entry points, followed by complex access escalation paths, eventually resulting in near-complete compromise of critical infrastructure (regardless of any superficial compartmentalization in place). In by-the-numbers risk management, the initial entry point would realistically be assigned a low weight, having little value compared to other nodes; and the internal escalation path to more sensitive resources would likewise be downplayed as having a low probability of ever being abused.



  • Statistical forecasting does not tell you much about your individual risks: just because, on average, people in the city are more likely to be hit by lightning than mauled by a bear does not really mean you should bolt a lightning rod to your hat, but then bathe in honey. The likelihood of a compromise associated with a particular component is, on an individual scale, largely irrelevant: security incidents are nearly certain, but out of thousands of exposed non-trivial resources, any resource could be used as an attack vector - and none of them is likely to see a volume of events that would make statistical analysis meaningful within the scope of the enterprise.



  • Security is simply not a sound insurance scheme: an insurance company can use statistical data to offset capped claims that might need to be paid across a large, well-studied populace, using the premiums collected from every participant; and to estimate reserves needed to deal with random events, such as sudden, localized surges in the number of claims, up to a chosen level of event probability. In such a setting, formal risk management works pretty well. In contrast, in information security, there is no meaningful way to measure how dangerous your current practices may be; no way to detect and estimate the impact of breaches when they occur in order to build a baseline; and no way to cleanly offset the costs of a breach with the value contributed by healthy assets.


Enlightenment through taxonomy



The two schools of thought discussed previously have something in common – both assume that it is possible to define security as a set of computable goals, and that the resulting unified theory of a secure system or a model of acceptable risk would then elegantly trickle down, resulting in an optimal set of low-level actions needed to achieve perfection in application design.


There is also the opposite approach preached by some practitioners - owing less to philosophy, and more to the natural sciences: that much like Charles Darwin back in the day, by gathering sufficient amounts of low-level, experimental data, we would be able to observe, reconstruct, and document increasingly more sophisticated laws, until some sort of a unified model of secure computing is organically arrived at.


This latter world view brings us projects like the Department of Homeland Security-funded Common Weakness Enumeration (CWE). In the organization's own words, the goal of CWE is to develop a unified “Vulnerability Theory”; to “improve the research, modeling, and classification of software flaws”; and to “provide a common language of discourse for discussing, finding and dealing with the causes of software security vulnerabilities”. A typical, delightfully baroque example of the resulting taxonomy may be:


Improper Enforcement of Message or Data Structure → Failure to Sanitize Data into a Different Plane → Improper Control of Resource Identifiers → Insufficient Filtering of File and Other Resource Names for Executable Content.


Today, there are about 800 names in this dictionary; most of them as discourse-enabling as the one quoted here.


A slightly different school of naturalist thought is manifested in projects such as the Common Vulnerability Scoring System (CVSS), a business-backed collaboration aiming to strictly quantify known security problems in terms of a set of basic, machine-readable parameters. A real-world example of the resulting vulnerability descriptor may be:


AV:LN / AC:L / Au:M / C:C / I:N / A:P / E:F / RL:T / RC:UR / CDP:MH / TD:H / CR:M / IR:L / AR:M


Given this 14-dimensional vector, organizations and researchers are expected to transform it in a carefully chosen, use-specific manner - and arrive at some sort of objective, verifiable conclusion about the significance of the underlying bug (say, “42”), precluding the need to more subjectively judge the nature of security flaws.
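Mechanically, the descriptor quoted above is just a flat list of key-value pairs, and splitting it apart is trivial - which is, of course, the easy part. This sketch only tokenizes the vector; the actual CVSS scoring then feeds these values into the standard's weighting formulas, which are not reproduced here:

```python
def parse_cvss_vector(vector):
    """Split a CVSS-style vector such as "AV:LN / AC:L / ..." into a
    dict mapping each metric abbreviation to its letter value."""
    metrics = {}
    for component in vector.split("/"):
        # Each component looks like " AC:L "; strip padding, then split
        # on the first colon into metric name and value.
        key, _, value = component.strip().partition(":")
        metrics[key] = value
    return metrics

vector = ("AV:LN / AC:L / Au:M / C:C / I:N / A:P / E:F / RL:T / "
          "RC:UR / CDP:MH / TD:H / CR:M / IR:L / AR:M")
parsed = parse_cvss_vector(vector)
```

The 14 resulting dimensions - access vector, access complexity, authentication, the impact triad, and so on - are exactly the “basic, machine-readable parameters” the text describes.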


I may be poking gentle fun at their expense - but rest assured, I do not mean to belittle CWE or CVSS: both projects serve noble goals, most notably giving a more formal dimension to risk management strategies implemented by large organizations (any general criticisms of certain approaches to risk management aside). Having said that, neither of them has yielded a grand theory of secure software yet - and I doubt such a framework is within sight.


[...end of excerpt...]

Wrong again from Slyngstad

The head of the oil fund's management, Yngve Slyngstad, continues in today's DN with misinformation about so-called passive management. According to Slyngstad, a passive fund would have to sell out of Greek government bonds the second they are downgraded and removed from the index.

As I have mentioned before, thinking does not become illegal if passive management is introduced. A passive manager must naturally try to be as efficient as possible. Trying to sell out of bonds when there is no market is normally not particularly sensible. Hence, a passive fund should not do it either.

If NBIM believes this is really an active strategy, that is perfectly fine. Active in the sense of looking after the investments is good. Active in the sense of trying to pick winning bonds or winning stocks is not particularly smart, as the enormous losses on the bond portfolio during the financial crisis showed. Before 2008, the fund believed that leveraging the bond portfolio almost two times was a great idea, because it allowed gambling on interest rate differentials. Little did the Storting know that what was supposed to be the safe part of the portfolio was used for one gigantic bet on a single risk factor. Luck has since prevented all too large losses. Next time one may not be so fortunate. Active management opens the door to operational risk.

So that there is no doubt whatsoever about what is sensible, let me be very concrete: selling Greek government bonds when the spread (the difference between the bid and ask prices) is so large that the market in practice does not function is not sensible. If the spread is acceptable, however, there is no reason to hold on to the bonds just because they have fallen in value. The reason securities fall in value is that the market considers them less valuable. The reason debt securities were so “cheap” during the financial crisis was a genuine fear of economic collapse. Greece's creditors may get 100% back on their bonds, but it is also quite likely that the lenders will have to accept debt restructuring. The oil fund should therefore get rid of its Greek government bonds if it can, but not at any spread.

The risk premium for the Oslo Stock Exchange, 1915-2009

I have removed this post, as it would require a fair amount of work to sufficiently quality-assure the calculations. In any case, others have done this better, so if you are looking for estimates of the risk premium in Norway, I recommend:

Elroy Dimson, Paul Marsh and Mike Staunton: "Credit Suisse Global Investment Returns Yearbook 2014"

Vulnerability databases and pie charts don't mix

There are quite a few extensive vulnerability databases in existence today. While their value in the field of vulnerability management is clear and uncontroversial, a relatively new usage pattern can also be seen: the data is being incorporated into high-level analyses addressed predominantly to executive audiences and the media to provide insight into the state of the security industry; threat reports from IBM and Symantec are good examples of this. Which vendor is the most responsive? Who has the highest number of high-risk vulnerabilities? These and many other questions are just begging to be objectively answered with a clean-looking pie chart.


Vulnerability researchers - the people behind the data points used - are usually fairly skeptical of such efforts; but their criticisms revolve primarily around the need to factor in bug severity, or the potential for cherry-picking the data to support a particular claim. These flaws are avoidable in a well-designed study. Are we good, then?


Well, not necessarily so. The most important problem is that today, for quite a few software projects, the majority of vulnerabilities are discovered through in-house testing - and the attitudes of vendors when it comes to discussing these findings publicly tend to vary. This has a completely devastating impact on the value of the analyzed data: vulnerability counting severely penalizes forthcoming players, benefits the more secretive ones, and places the ones who do not do any proactive work somewhere in between.


Consider this example from the browser world: in recent years, the folks over at Microsoft started doing a lot of in-house fuzzing, and have undoubtedly uncovered hundreds of security flaws in Internet Explorer and elsewhere. It appears to be their preference not to routinely discuss these problems, however - often silently targeting fixes for service packs or other cumulative updates instead. In fact, here's an anecdote: I reported a bunch of exploitable crashes to them in September 2009, only to see them fixed without attribution in December that year. The underlying flaws were apparently discovered independently during internal cleanups. So be it: as long as bugs get fixed, we all benefit, and Microsoft is definitely working hard in this area.


Contrast this approach with Mozilla, another vendor doing a lot of successful in-house security testing (in part thanks to the amazing work of Jesse Ruderman). They are pretty forthcoming about their results, and announce internal, fuzzing-related fixes almost every month. Probably to avoid shooting themselves in the foot in vulnerability count tallies, however, they tend to report these cumulatively as crashes with evidence of memory corruption - and usually assign a single CVE number to the entire batch every month. Again, sounds good.


Lastly, have a look at Chromium; several folks are fuzzing the hell out of this browser, too - but the project opts to track these issues individually, partly because of the need to coordinate with WebKit developers - and each one of them ends up with a separate CVE entry. The result? Release notes often look like this.


All these approaches have their merits - but how do you reconcile them for the purpose of vulnerability counting? And, is it fair to compare any of the above players with vendors who do not seem to be doing any proactive security work at all?
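The distortion is easy to demonstrate with made-up numbers: suppose three vendors each fix the same quantity of internally discovered flaws in a year, but follow the three disclosure styles just described (the figures below are hypothetical, chosen only to illustrate the skew):

```python
# Hypothetical: each of three vendors fixes 120 fuzzing-related
# flaws in a year; only their disclosure style differs.
internally_found_flaws = 120

# Style 1 (silent fixes): none of the flaws ever enters a database.
silent_vendor_cve_count = 0

# Style 2 (batched disclosure): one cumulative advisory per month.
batching_vendor_cve_count = 12

# Style 3 (per-bug tracking): every flaw gets its own CVE entry.
per_bug_vendor_cve_count = internally_found_flaws

# Identical security work, wildly different "vulnerability counts":
# a naive tally rewards secrecy and penalizes transparency.
skew_holds = (silent_vendor_cve_count
              < batching_vendor_cve_count
              < per_bug_vendor_cve_count)
```

Any pie chart built from these counts would rank the most secretive vendor as the “most secure” of the three.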


Well, perhaps the browser world is special; one could argue that at least some products with matching security practices must exist - and these cases should be directly comparable. Maybe, but the other problem is the quality of the databases themselves: recent changes to the vulnerability handling process, including the emergence of partial- or non-disclosure, the popularity of vulnerability trading, and the demise of centralized vulnerability discussion channels, all make it prohibitively difficult for database maintainers to reliably track issues through their lifetime. Common problems include:


  • The inability to fully understand what the problem actually is, and what severity it needs to be given. Database maintainers cannot be expected to be intimately familiar with every product, and need to process thousands of entries every year - but this often leads to vulnerability notes that may at first sight appear inaccurate, hard to verify, or very likely not worth classifying as security flaws at all.


  • The difficulty of discovering what the disclosure process looked like, and how long the vendor needed to develop a patch. This is perhaps the most important metric to examine when trying to understand the performance of a vendor - yet one that is not captured, or captured very selectively and inconsistently, in most of the databases I am aware of.


  • The difficulty of detecting the moment when a particular flaw is addressed - all the databases contain a considerable number of entries that were not updated to reflect patch status (apologies for Chrome-specific examples). There seems to be a correlation between the prevalence of this problem and the mode in which vendor responses are made available to the general public. Furthermore, when a problem is not fixed in a timely manner, the maintainers of the database generally do not reach out to the vendor to investigate why: is the researcher's claim contested, or is the vendor simply sloppy? This very important distinction is lost.


Comparable problems apply to most other security-themed studies that draw far-reaching conclusions from simple numerical analysis of proprietary data. Pie charts don't immediately invalidate a whitepaper, but blind reliance on these figures warrants a closer investigation of the claims.

Address bar and the sea of darkness

The current contents of the address bar are our only god.


Really. There is nothing else: browsers do not have any other universal, reliable content origin indicator, and no way to predict where you will be taken next. People who do not understand this, or who do not understand the URL syntax, will suffer. Over and over again.


It is fair to note that way too many users fall into this category; in fact, even the experts can't always be sure. Guess where the following URLs will take you in MSIE, Firefox, and Chrome:


  • http://example.com\@coredump.cx/
  • http://example.com;.coredump.cx/
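As a quick experiment, one can check how a single non-browser parser reads these strings. Python's urllib, for instance, treats everything after the last “@” in the authority section as the host, so it sends the first URL to coredump.cx; browsers disagree with it, and with each other, which is precisely the problem:

```python
from urllib.parse import urlparse

# To urllib, the backslash is not an authority delimiter, so everything
# before the "@" is treated as userinfo - the host becomes coredump.cx.
# MSIE, by contrast, historically treated "\" like "/".
first = urlparse("http://example.com\\@coredump.cx/")

# The semicolon is just another hostname character to urllib, so the
# whole string is kept as the host; some browsers instead stop earlier.
second = urlparse("http://example.com;.coredump.cx/")

print(first.hostname)   # one parser's opinion; browsers may differ
print(second.hostname)
```

The point is not which answer is “right” - it is that nothing in the platform guarantees that two URL consumers will agree.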


Chances are, you got the answers wrong. The problem is easy to pin squarely on the users, but it's the geeks who created a huge gap between the skill level needed to proficiently operate a browser, and the skill level required to do so safely. The health of the entire networked ecosystem suffers as a result.


This gap is one of the great unsolved problems in information security - and it calls for fundamental changes to how web browsers interact with users and identify sites. Alas, not every quick kludge is necessarily a good one: careless users will be just as doomed if we outlaw HTTP authentication, change onclick behavior, rework tooltips, or close all the open redirectors in the world. The few hundred remaining pages in the relevant RFCs make the world interesting. Please, pick your battles wisely.

Vulnerability trading markets and you

There is something interesting going on in the security industry: we are witnessing the rapid emergence of vulnerability trading markets. Perhaps hundreds of security researchers now routinely sell exploits to intermediaries for an easy profit (anywhere from $1,000 to $50,000), instead of the more usual practice of talking to the vendors or announcing their findings publicly. The buyers in turn resell the knowledge to unspecified end users, most likely at several times the original price tag. Some of the intermediaries may eventually release the information to the public; others withhold it indefinitely. The latter bunch is willing to pay you a lot more.


Curiously, both classes of intermediaries often ask for weaponized, multi-platform exploits, and not just a nice write-up on the nature of the glitch. Why? A case for some uses in the IDS industry could be strenuously made, but I do not find it all that believable. More likely, at the end of the chain, you can find buyers with questionable intentions and a clear business reason to justify the significant expense, yet maintain anonymity. When asked about their clients, the intermediaries usually allude to unspecified government agencies - but even if this somewhat uncomfortable claim is true, the researcher does not get to choose which government he may be aiding with his work.


Many people find it difficult to sympathize with Jethro's legal troubles: he did not hesitate to take cash for an exploit that he had every reason to suspect would be used for illegal purposes. Are the proxy arrangements practiced in the institutionalized exploit trade really that different? I'm not sure: can the sellers honestly claim they understand who wants these exploits, and why these tools happen to be so unusually valuable? And if not, should they be selling them to the highest bidder, no questions asked?


Of course, there is an argument made by Charlie Miller and several other researchers that the vendors should not be entitled to free vulnerability research services from the security community. Maybe so - although it's worth noting that researchers profit from that bona fide work by gaining recognition and respect, and landing cool jobs later on; vendors gain much less from the extra public scrutiny, and some of them would probably prefer for this "free" arrangement to go away completely. But in any case, I do not think this argument genuinely supports the idea of selling the information to third parties with no regard for how it may be used: it may be legal, and it may be profitable, but it certainly does not feel right.

Responsibilities in vulnerability disclosure

The debate around responsible disclosure is as old as the security industry itself, and unlikely to be settled any time soon. Tellingly, both sides of the debate claim to be driven by the same motive - to keep users safe. Yet, both accuse the opponent of saying so under false pretenses: vendors and businesses see full disclosure proponents as attention whores, while researchers think vendors care only about PR and legal liability damage control.


The controversy will continue, and it would be pointless to recapture it here. Having said that, I have an issue with one of the common assumptions made in this debate: the belief that vulnerabilities are unlikely to be discovered by multiple parties at once, and therefore, the original finder of a flaw is in a unique position to control the information. Intuitively, it sounds pretty reasonable: security research is hard, and the necessary skills are nearly impossible to formalize or imitate. The press, in particular, likes to think of vulnerability finding as an arcane form of art. But is it so?


Over the years, I have probably found over 200 vulnerabilities in high-profile client- and server-side apps. I think it is a pretty good data set to work with - and curiously, I am strongly convinced that none of these findings should be attributed to my unique skill. It seems that the vast majority of these findings were just a matter of the security community reaching a certain critical body of knowledge - gaining a better understanding of what can go wrong, where to look for it, and how to automate the testing with simple fuzzers and similar validation frameworks. At that point, finding bugs is simply a matter of picking a target to go after; who happens to be behind the wheel is largely immaterial.


What's more, I found that when you go after a sufficiently buggy and complex application, most of the problems you find turn out to be dupes of what other researchers discovered weeks or months earlier. This pattern proved to be particularly prevalent in the browser world, where I had multiple bug collisions with Georgi Guninski or Amit Klein.


I suspect the same can be said by a vast majority of other security researchers - though not all of them are willing to make the same self-deprecating admission in public. Sadly, by enjoying being portrayed as wizards, we are also making it easier for vendors to advocate the view that the discovery of a vulnerability is what creates a threat - and that researchers have an obligation to wait indefinitely to help protect users against attacks.


While giving a responsive vendor some advance notification is often a good idea, creating a social pressure on researchers to wait for patches removes any incentive for vendors to respond in a timely manner. This would not be a problem if vendors were consistently awesome - but they certainly aren't today. We are commonly seeing some of the leading proponents of responsible disclosure taking from six months to two years to address even fairly simple, high-risk bugs - and seldom facing any criticism for this. Researchers who behave "irresponsibly", on the other hand, are routinely called names if they are lucky; and are formally or informally threatened if not.


Vulnerability disclosure, however done, does not make you less secure. More often than we are willing to admit, it merely brings an existing risk out of the thriving underground market and into the spotlight. Naturally, this can be disruptive in the short run, which is why the practice is controversial; it's certainly easier not to have to scramble to fix an issue on short notice. That said, timely and verbose disclosure also levels the playing field by keeping vendors accountable, and giving all users the information needed to limit exposure - even if that means ceasing to use a particular service until a fix is available.