Not long ago, the HTML5 specification extended the semantics of <a href=...> links by adding the download attribute. In a nutshell, the markup allows you to specify that an outgoing link should always be treated as a download, even if the hosting site does not serve the file with Content-Disposition: attachment:
<a href="http://i.imgur.com/b7sajuK.jpg" download>What a cute kitty!</a>
I am unconvinced that this feature scratches any real itch for HTTP links, but it's already supported in Firefox, Chrome, and Opera.
Of course, there are some kinks: in the absence of the Content-Disposition header, the browser needs to figure out the correct file name for the download. In practice, this is always done based on the path seen in the URL. That's not great, because a good majority of web frameworks will tolerate trailing garbage in the path segment; indeed, so does imgur.com. Let's try it out:
<a href="http://i.imgur.com/b7sajuK.jpg/KittyViewer.exe" download>What a cute kitty!</a>
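To make the problem concrete, here is a minimal sketch of path-based naming, assuming the browser simply takes the last non-empty segment of the URL path. The function name guess_download_name and the "download" fallback are hypothetical, made up for this illustration; real browsers apply additional sanitization that is not modeled here.

```python
from posixpath import basename
from urllib.parse import unquote, urlsplit


def guess_download_name(url: str, fallback: str = "download") -> str:
    """Return the last non-empty path segment of `url`, or `fallback`.

    Hypothetical helper: a simplified stand-in for what a browser does
    when no Content-Disposition header is present.
    """
    path = urlsplit(url).path
    # Strip trailing slashes so ".../foo/" still yields "foo".
    name = basename(path.rstrip("/"))
    return unquote(name) or fallback


print(guess_download_name("http://i.imgur.com/b7sajuK.jpg"))
# → b7sajuK.jpg
print(guess_download_name("http://i.imgur.com/b7sajuK.jpg/KittyViewer.exe"))
# → KittyViewer.exe
```

The server still returns the same JPEG for both URLs, but the name that reaches the user's disk is whatever the link author appended.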
But we shouldn't dwell on this, because the download syntax makes it easy for the originating page to simply override that logic and pick any file name and extension it likes:
<a href="http://i.imgur.com/b7sajuK.jpg" download="KittyViewer.exe">What a cute kitty!</a>
That's odd - and keep in mind that the image we are seeing is at least partly user-controlled. A location like this can be found on any major destination on the Internet: if not an image, you can always find a JSON API or an HTML page that echoes something back.
It also helps to remember that it's usually pretty trivial to build files that are semantically valid to more than one parser, and have a different meaning to each one of them. Let's put it all together for a trivial PoC:
<a href="http://api.bing.com/qsonhs.aspx?q=%22%26notepad%26"
download="AltavistaToolbar.bat">Download Bing toolbar from bing.com</a>
That's pretty creepy: if you download the file on Windows and click "open", the payload will execute and invoke notepad.exe. Still, is it a security bug? Well... the answer to that is not very clear.
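For readers wondering what the PoC link actually smuggles in, the sketch below decodes its q parameter. It assumes, as the PoC implies, that the endpoint echoes the query back somewhere in its response; the response body itself is not reproduced here, and the helper name decoded_param is made up for this illustration.

```python
from urllib.parse import parse_qs, urlsplit


def decoded_param(url: str, name: str) -> str:
    """Return the first percent-decoded value of query parameter `name`."""
    return parse_qs(urlsplit(url).query)[name][0]


poc_url = "http://api.bing.com/qsonhs.aspx?q=%22%26notepad%26"
print(decoded_param(poc_url, "q"))
# → "&notepad&
```

In cmd.exe, an unquoted & separates commands, so once the echoed text is saved with a .bat extension, the word between the two ampersands is presumably run as a command of its own, while the surrounding response data just produces harmless errors.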
For one, there is a temptation to trust the tooltip you see when you hover over a download link. But if you do that, you are in serious trouble, even in the absence of that whole download bit: JavaScript code can intercept the onclick event and take you somewhere else. Luckily, most browsers provide you with a real security indicator later on: the download UI in Internet Explorer, Firefox, and Safari prominently shows the origin from which the document is being retrieved. And that's where the problem becomes fairly evident: bing.com never really meant to serve you with an attacker-controlled AltavistaToolbar.bat, but the browser says otherwise.
The story gets even more complicated when you consider that some browsers don't show the origin of the download in the UI at all; this is the case for Chrome and Opera. In such a design, you simply have to put all your faith in the origin from which you initiated the download. In principle, it's not a completely unreasonable notion, although I am not sure it aligns with user expectations particularly well. Sadly, there are other idiosyncrasies of the browser environment that mean the download you are seeing on a trusted page might have been initiated from another, unrelated document. Oops.
So, yes, browsers are messy. Over the past few years, I have repeatedly argued against <a download> on the standards mailing lists (most recently in 2013), simply because I think that nothing good comes out of suddenly taking control over how documents are interpreted by the browser away from the hosting site. I don't think that my arguments were particularly persuasive, in part because nobody seemed to have a clear vision for the overall trust model around downloads on the Web.
PS. Interestingly, Firefox decided against the added exposure and implemented the semantics in a constrained way: the download attribute is honored only if the final download location is same-origin with the referring URL.
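A rough sketch of that kind of check, assuming the usual definition of an origin as the (scheme, host, port) triple. This is not Firefox's actual code; the function names and the default-port table are assumptions made for illustration, and the final URL is taken as an input rather than discovered by following redirects.

```python
from urllib.parse import urlsplit

# Default ports, so that http://example.com and http://example.com:80 match.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple for `url`."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""  # urlsplit already lowercases the hostname
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def honors_download_attribute(referring_url: str, final_url: str) -> bool:
    """Hypothetical policy: honor download= only for same-origin targets."""
    return origin(referring_url) == origin(final_url)


print(honors_download_attribute("http://example.com/page.html",
                                "http://example.com/files/kitty.jpg"))
# → True
print(honors_download_attribute("http://example.com/page.html",
                                "http://i.imgur.com/b7sajuK.jpg"))
# → False
```

Under such a policy, a page can still rename downloads of its own files, but it cannot relabel content served by an unrelated site.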