Understanding and using skipfish

Skipfish, my open source web application security scanner, is now about eight months old - and, across more than 70 releases, has undergone a number of substantial changes. While I maintain detailed documentation and a short troubleshooting guide, it seems appropriate to share some additional hints on how to get the most out of this tool, should you be inclined to try it out.

A word on design goals (and what they really mean)



While skipfish tries to scratch quite a few itches, its primary goals - and the areas where it hopefully stands out - are:



  • Raw speed: I am constantly frustrated by the performance and memory footprint of many of the open source and commercial scanners and brute-force tools I have had to work with. Skipfish tries hard to improve on this - and to my knowledge, has by far the fastest HTTP and content analysis engine out there.


    This does not mean that skipfish scans always take the least amount of time, compared to other tools; it simply means that I can cram a whole lot more functionality, and get much better coverage, without making the assessment unreasonably long.


    In cases where the server is the bottleneck, this can obviously backfire; but when dealing with slow targets, you can configure the scanner for reduced coverage - roughly comparable to that of a more traditional tool.


  • Unique brute-force capabilities: the performance of the scanner allowed me to incorporate an extensive ${keyword}.${extension} brute-force functionality similar to that of DirBuster - coupled with highly customized, hand-picked dictionaries, and a unique auto-learning feature that builds an adaptive, target-specific dictionary based on site content analysis.


    Most other scanners simply can't afford to let you do this in any meaningful way - and when they do, the dictionaries you can use with them are much less sophisticated, and far more request-intensive. I consider this functionality to be one of the more important assets of the tool - but you are certainly not forced to use it where impractical. The brute-force testing features are completely optional, and can be turned off to improve scan times by as much as 500-fold.


  • High quality security checks: most scanners employ fairly naive security logic - for example, to test cross-site scripting, they may attempt injecting <script>alert(1)</script>; to detect directory traversal, they may try ../../../../../etc/passwd; and to test SQL injection, they may attempt supplying technology-specific code and look for equally technology-specific output strings.


    Needless to say, all these checks have many painfully simple failure modes: the XSS check will result in a false negative when the input is partly escaped - or appears inside an HTML comment; traversal checks will fail if the application always appends a fixed extension to the input string, or is running within a chroot() jail; and SQL injection logic will break when dealing with an unfamiliar backend or an uncommon application framework.


    To that effect, skipfish puts emphasis on well-crafted probes, and on testing for behavioral patterns, rather than signatures. For example, when testing for string-based SQL injection, we compare the results of passing '"original_value, \'\"original_value, and \\'\\"original_value. When the first response is similar to the third one, but different from the second one - we can say, with pretty high confidence, that there is an underlying query injection vulnerability (even if query results can't be observed directly). Interestingly, this check is versatile enough to do a pretty good job detecting eval()-related vulnerabilities in PHP, and injection bugs in many other non-SQL query languages. (Rough curl-based illustrations of this and the next two checks appear right after this list.)


    Similarly, when probing for file inclusion, the scanner will try to compare the output of original_value, ./original_value, ../original_value, and .../original_value; if the first two cases result in similar output, and the two remaining ones result in a different outcome, we are probably dealing with a traversal problem. The redundancy helps rule out differences that can be attributed to input validation - and hey, that check also triggers on many remote file inclusion vectors in PHP.


    For XSS, skipfish does depend on content analysis - but instead of the standard practice of throwing the entire XSS cheatsheet at the target, it injects a complex string that is guaranteed to break out of many different parsing modes (and much less likely to fail with crude XSS filters in place): -->">'>'", followed by a uniquely numbered tag. The unique identifier enables stored XSS detection later on, and is also interpreted in a special way inside <script> or <style> blocks.


    There are many other design decisions along these lines; I believe they have a profound impact on the ability to detect real-world security problems - although paradoxically, they make the scanner perform poorly with simulated vulnerabilities in demo sites.


  • Coverage of more nuanced problems: most web application assessment tools simply pay no attention to the security risks caused by subtle MIME type or character set mismatches - and the awareness that these problems may lead to exploitable XSS vectors is very low within the security community.


    Skipfish makes a point of noticing these and many other significant security issues usually neglected by other tools - such as caching intent mismatches, mixed content issues, XSSI, third-party scripts, cross-site request forgery, and so forth.


    Quite a few people complained about "pointless" or "odd" warnings for problems thought to be non-security issues. Rest assured, most of these cases could be shown to be exploitable. When in doubt, don't hesitate to ping me for a second opinion!


  • Adaptive scanning for real-world applications: many scanners don't handle complex, mixed technology sites particularly well; a common headache is dealing with closely related requests being handled by different application backends with different URL semantics, error messages, or even case-sensitivity. Another hard problem is recognizing obscure 404 behaviors, unusual parameter passing conventions, redirection patterns, content duplication, and so forth. All this often requires repeated, painstaking configuration tweaks.


    While skipfish certainly is not perfect - no scanner can be - the code is designed to cope with these scenarios exceptionally well. This is achieved chiefly by not relying on directory, file, or error message signatures - and instead, carrying out adaptive probes for every new fuzzed location, and quickly recognizing crawl tree branches that look very much alike (a simple sketch of such a probe appears after this list). While heuristics can fail in unexpected ways, I think this approach is of immense value.


  • Sleek reports with very little noise: skipfish generally does not complain about highly non-specific "vulnerabilities" commonly reported by other scanners; for example, it does not pay special attention to every non-httponly cookie, to every password form with autocomplete enabled, or to framework version or system path disclosure patterns on various pages. This means that in practice, auditors will see fewer issues in a skipfish report than in the output of most other assessment tools - and this is not a bug.


    Rest assured, the interactive report produced after a scan includes summary sections where the auditor can review all password forms, cookies, and so forth if necessary - but the assumption is that human evaluation can't and should not be substituted here.
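To make the behavioral probes described in the bullets above a bit more concrete, here is a rough manual approximation of several of them using curl. Everything below - host names, paths, parameter names, values, and the <xsstest123> marker - is made up, and the snippets only sketch the comparison logic; the scanner's real checks rely on response similarity rather than a byte-exact diff, so that minor dynamic page differences do not skew the verdict.

    # 1) String-based injection: pass '"val, \'\"val, and \\'\\"val (URL-encoded
    #    below). If r1 and r3 are alike, but r2 differs, a query injection bug
    #    is likely.
    curl -s 'http://www.example.com/item?id=%27%22val'             > r1
    curl -s 'http://www.example.com/item?id=%5C%27%5C%22val'       > r2
    curl -s 'http://www.example.com/item?id=%5C%5C%27%5C%5C%22val' > r3
    diff -q r1 r3; diff -q r1 r2

    # 2) Traversal / file inclusion: val, ./val, ../val, .../val. If r1 and r2
    #    match, but r3 and r4 do not, the parameter probably ends up in a path.
    curl -s 'http://www.example.com/view?file=report.txt'     > r1
    curl -s 'http://www.example.com/view?file=./report.txt'   > r2
    curl -s 'http://www.example.com/view?file=../report.txt'  > r3
    curl -s 'http://www.example.com/view?file=.../report.txt' > r4

    # 3) Reflected XSS: inject the breakout string followed by a unique marker
    #    (-->">'>'"<xsstest123>, URL-encoded below); a non-zero grep count means
    #    the marker came back unescaped.
    curl -s 'http://www.example.com/search?q=--%3E%22%3E%27%3E%27%22%3Cxsstest123%3E' \
      | grep -c '<xsstest123>'

    # 4) Adaptive "not found" fingerprinting: request two clearly nonexistent
    #    resources in the same location; if the responses match, that response
    #    becomes the local 404 baseline for subsequent brute-force probes.
    curl -s 'http://www.example.com/app/no-such-file-1a2b3c' > r1
    curl -s 'http://www.example.com/app/no-such-file-9x8y7z' > r2
    diff -q r1 r2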



While skipfish certainly isn't perfect, these are the core properties I care about - and try to continually improve.



The most important setting: dictionary modes



The single most misunderstood - and important - feature of skipfish is its dictionary management model. Quite simply, getting this part wrong can easily ruin your scans.


I encourage you to have a look at the recently revamped dictionaries/README-FIRST file, which explains the basics of dictionary management in greater detail; but at the very minimum, you should be aware of the following choice:


  • Absolutely no brute-force: in this mode, skipfish performs an orderly crawl of the target site, and behaves similarly to other basic scanners. The mode is not recommended in most cases due to limited coverage - resources such as /admin/ or /index.php.old may not be discovered - but is blazing fast. To use it, try:

    ./skipfish -W /dev/null -LV [...other options...]


  • Lightweight brute-force: in this mode, the scanner will only try fuzzing the file name (/admin/), or the extension (/index.php.old), but never both at the same time (/backup.tgz will typically not be hit). The cost of doing so is about 1,700 requests per fuzzed location. To use this mode, try:

    cp dictionaries/complete.wl dictionary.wl
    ./skipfish -W dictionary.wl -Y [...other options...]

    This mode is the preferred way of limiting scan time where fully-fledged brute-force testing is not feasible.


  • Normal dictionary brute-force: in this mode, the scanner will test all the possible file name and extension pairs (i.e., /backup.tgz will be discovered, too). The mode is significantly slower, but offers superior coverage - and should be your default pick in most cases. To enable it, try:

    cp dictionaries/minimal.wl dictionary.wl
    ./skipfish -W dictionary.wl [...other options...]

    The cost of this mode is about 50,000 requests per fuzzed location. You can replace minimal.wl with medium.wl or complete.wl for even better coverage, but at the expense of a 2x to 3x increase in scan time; see dictionaries/README-FIRST for an explanation of the difference between these files.



Other options you need to know about



Skipfish requires relatively little configuration but, rest assured, it is not a point-and-click tool. You should definitely review the documentation to understand the operation of rudimentary options such as -C (use cookie authentication), -I (only crawl matching URLs), -X (exclude matching URLs), -D (define domain scope), -m (limit simultaneous connections), and so forth.
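As a point of reference, a fairly typical authenticated scan - with every host name, path, and cookie value below being made up - might look something like this:

    ./skipfish -o output_dir -W dictionary.wl \
      -C 'session=0123456789abcdef' \
      -I /app/ -X /app/logout \
      -D test2.example.com -m 10 \
      http://www.example.com/app/

The -X entry keeps the crawler away from the logout handler, which is usually the first thing you want to exclude on a cookie-authenticated site.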


Here is a list of some of the more useful but under-appreciated options that you should consider using in your work (a combined example invocation follows the list):


  • Limit crawl tree fanout: options -c (immediate children limit) and -x (total descendant limit) allow you to fine-tune scan coverage for very large sites where -I and -X options are impractical to use. Non-deterministic crawl probability setting (-p) may also be helpful there.


  • More SSL checks: some sites care about getting SSL right; specify -M to warn about dangerous mixed content scenarios and insecure password forms.


  • Spoof another browser: use -b to convincingly pretend to be MSIE, Firefox, or an iPhone. This option does not merely change the User-Agent string, but also ensures that other headers have the right syntax and ordering.


  • Do not accept new cookies: on cookie-authenticated sites that do not maintain session state on the server side, accidental logout can be prevented without carefully specifying -X locations - adding -N simply instructs skipfish to ignore all attempts to delete or modify -C cookies.


  • Trust another domain: skipfish complains about dangerous script inclusion and content embedding from third-party domains, and can optionally warn you about outgoing links. To minimize noise, use -B to identify any domains you trust, and therefore, want to exclude from these checks.


  • Reduce memory footprint: to generate scan reports, the scanner keeps samples of all retrieved documents in memory. On some large, multimedia-heavy sites, this may consume a lot of RAM. In these cases, -e may be used to purge binary (non-ASCII) content without impacting report quality appreciably.


  • Be dumb: by specifying -O, skipfish can be instructed not to analyze returned HTML to extract links - turning it into a fast, purely brute-force tool.
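To tie several of these together: the hypothetical invocation below (the limits, cookie value, and domains are arbitrary, and -b f is assumed to select the Firefox personality) caps crawl fanout with -c and -x, crawls each branch with 90% probability, enables the extra SSL checks, pins the session cookies, trusts a separate static-content domain, and discards binary payloads to save memory:

    ./skipfish -o output_dir -W dictionary.wl \
      -c 512 -x 8192 -p 90 \
      -M -b f -N -e \
      -C 'session=0123456789abcdef' \
      -B static.example.com \
      http://www.example.com/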


Troubleshooting



Skipfish has bugs, features yet to be implemented, and web frameworks it hasn't encountered yet. When you run into any trouble, please check out this doc, but also do not hesitate to ping me directly. Your feedback is the only way this tool can be improved.

Expensive success

Thanks to an "overzealous" Riksrevisjonen (the Office of the Auditor General), we now know more about the bonus agreements held by the external managers of the Oil Fund. NBIM, which manages the Oil Fund on behalf of Norges Bank, is welcome to correct me, but it appears that one Swedish manager has received a bonus of 40% of an excess return of nearly one and a half billion kroner.
As is well known, none of the Oil Fund's managers accept any obligation to pay back money they lose; the worst that can happen to them is losing future mandates. This creates an obvious incentive to increase risk in ways the fund cannot easily detect. The increase in risk even has a concrete value in kroner and øre: such contracts are, in fact, options, and we have fairly exact methods for valuing them, as shown in the figure.

If, for example, the manager positions himself so that the expected loss or gain relative to the benchmark portfolio is on the order of 20%, the figure shows that the value of the contract will be slightly above 500 million. The manager can, in principle, choose for himself where on the red line he wants to be. It therefore does not seem unlikely that some managers will try to maximize the value of the contract by taking on as much risk as possible.

Perhaps even more interesting is that the manager does not need to wait to cash in this gain. Hypothetically, the manager could sell options in the market and immediately receive an amount corresponding to the value of the contract. There is no evidence that anything like this has happened, but it is food for thought that it would be entirely feasible in practice.

NBIM undoubtedly has extensive systems for controlling the risk that the external managers impose on the fund. The problem is that this control is far from perfect and has to happen after the fact, based on the information that reaches NBIM. The disastrous experience with the bond investments before the financial crisis suggests that this control is inadequate. It may well be that the managers back then deliberately took on a type of risk that NBIM did not monitor, precisely in order to pocket option gains of the kind described above.

Now, the option being written is not entirely free of cost for the manager: he can lose the mandate. On average, however, managers underperform the market because of costs, so the value of staying within the market's risk limits is very small. Managers who play by the rules simply cannot count on much of a bonus. It should therefore not come as a surprise if the temptation to load up on risk occasionally becomes too great.

In many cases, however, a loss with one manager will be covered by a gain with another. The main problem is therefore not the risk, but the cost-inflating bonuses. Since the fund is not repaid when a manager loses, it has to trust that the manager with a 40% bonus beats the market more than 62.5% of the time (the fund keeps only 60% of every gain but bears 100% of every loss, so with wins and losses of roughly equal size it breaks even only if the manager wins at least 1/1.6 = 62.5% of the time). If not, the fund will systematically pay out large sums whenever a manager gets lucky, and be left with the bill when things go wrong. One may therefore ask whether bonuses of this kind do not presuppose an overly optimistic view of what external managers can achieve.

The problem described above can to some extent be mitigated by capping bonuses. NBIM has, however, defended the agreement without any self-criticism, and has left the door open for similar agreements in the future. Let us hope they have good arguments for that.
One argument NBIM has put forward is that such contracts are necessary to get managers to accept the mandate at all. The fund has previously cited its ability to negotiate good agreements with external managers as one of its comparative advantages, and as an argument for active management. That suddenly does not seem so credible anymore.

Active management has increased the fund's return by roughly 0.1% per year relative to index management. NBIM claims the contribution is actually larger, pointing to large indirect costs of passive management, but the fund's own internal index portfolio does not incur such costs. This small gain can, moreover, be explained by well-known strategies that one does not need active management to follow. It does not look, then, as if we get much in return for the active management. Is it worth it?

Without arguments?

There has been no response to my op-ed in DN about a so-called Tobin tax on shares, neither from its supporters in LO nor from Attac. The argumentation in the piece is solid, but I had nevertheless expected some pushback. One must presumably conclude that the proponents have no weighty arguments beyond the old ones I have already refuted, and that there are no good reasons to introduce such a tax?

These are not the events you are looking for

Yeah, so this probably should not be possible.


The underlying problem is pretty cute: most browsers can be programmatically prevented from dequeuing and processing UI events delivered by the operating system; canonical examples include busy JavaScript loops, synchronous (blocking) XMLHttpRequest calls, and particularly complex HTML or XML documents.


Upon leaving this state, the queued events may not be properly purged, and may end up getting delivered to an incorrect and unexpected context - possibly carrying out an undesirable action in another domain, or interacting with browser chrome.


I filed bug 608899 for this particular demo in Firefox - but given the general, cross-browser state of disrepair when it comes to UI timing and related attacks, I am not getting my hopes up.