The fact that SVG files can contain scripts was a bit of a mistake. On one hand, the animations and entire interactive demos and even games in a single SVG are cool. But on the other hand, it opens up a serious can of worms of security vulnerabilities. As a result, SVG files are often banned from various image upload tools, they do not unfurl previews, and so on. If you upload an SVG to discord, it just shows the raw code; and don't even think about sharing an SVG image via Facebook Messenger, Wechat, Google Hangouts, or whatever. In 2025, raster formats remain way more accessible and easily shared than SVGs.
This is very sad because SVGs often have way smaller file size, and obviously look much better at various scales. If only there was a widely used vector format that does not have any script support and can be easily shared.
All SVGs should be properly sanitized going into a backend and out of it and when rendered on a page.
Do you allow SVGs to be uploaded anywhere on your site? This is a PSA that you're probably at risk unless you can find the few hundred lines of code doing the sanitization.
Note to Ruby on Rails developers, your active storage uploaded SVGs are not sanitized by default.
Is there SVG sanitization code which has been formally proven correct and itself free of security vulnerabilities?
It would be better if they were sanitized by design and could not contain scripts and CSS. For interactive pictures, one could simply use HTML with inline SVG and scripts.
GitLab has some code in their repo if you want to see how to do it.
This is what they actually use: https://github.com/flavorjones/loofah
Sanitisation is a tricky process, it can be real easy for something to slip through the cracks.
Yes. Much better to handle all untrusted data safely rather than try to transform untrusted data into trusted data.
I found this page a helpful summary of ways to prevent SVG XSS: https://digi.ninja/blog/svg_xss.php
Notably, the sanitization option is risky because one sanitizer's definition of "safe" might not actually be "safe" for all clients and usages.
Plus as soon as you start sanitizing data entered by users, you risk accidentally sanitizing out legitimate customer data (Say you are making a DropBox-like fileshare and a customer's workflow relies on embedding scripts in an SVG file to e.g. make interactive self-contained graphics. Maybe not a great idea, but that is for the customer to decide, and a sanitization script would lose user data. Consider for example that GitHub does not sanitize JavaScript out of HTML files in git repositories.)
Yeah I’ve worked on a few pieces of software now that tried SVG sanitizing on uploads, got hacked, and banned the uploads.
I guess it is a matter of parsing svg. Trying to hack around with regex is asking for trouble indeed.
just run them through `svgo` and get the benefits of smaller filesizes as well
svgo is a minifier, not a sanitizer.
I should have clarified `svgo + removeScripts`
https://svgo.dev/docs/plugins/removeScripts/
External entities in XML[1] were a similar issue back when everyone was using XML for everything, and parsers processed external-entities by default.
1: https://owasp.org/www-community/vulnerabilities/XML_External...
XXE should have never existed.
Whoever decided it should be enabled by default should be put into some sort of cybersecurity jail.
It's no different from links to googlesyndication in offline html docs.
At least with external entities you could deny the parser an internet connection and force it to only load external documents from a cache you prepopulated and vetted. Turing completeness is a bullshit idea in document formats.
Postscript is pretty neat IMHO and it’s Turing complete. I really appreciated my raytraced page finally coming out of that poor HP laser after an hour or so.
I once sent a Sierpinski's Triangle postscript program to a shared printer. It took 90 minutes, and pissed off everybody else trying to print.
PostScript can emulate the ZMachine (Zork text adventures and all of infocom) with "zmachine.ps". Look it up at DDG/GG.
How does it do I/O?
A monad. It’s just a class of abstract Endor Moon or something. Probably you have to send all commands up to the current state to it. :)
One of the very first SVG documents I encountered was a port of the PS Tiger to SVG. It loaded a lot faster than the PostScript Tiger.
Sounds almost like a fun crypto mining opportunity.
With SVGs you can serve them from a different domain. IIUC the issue from TFA was that the SVGs were served from the primary domain; had they been on a different domain, they would have not been allowed to do as much.
calling Leonard Rosenthol ...
IIUC, an untrusted inline SVG is bad. An image tag pointing to an SVG is not.
I feel like this is common knowledge. Just like you don't inject untrusted HTML into your page. Untrusted HTML also has scripts. You either sanitize it. OR you just don't allow it in the first place. SVG is, at this point, effectively more HTML tags.Also remember that if the untrusted SVG file is served from the same origin and is missing a `Content-Disposition: attachment` header (or a CSP that disables scripts), an attacker could upload a malicious SVG and send the SVG URL to an unsuspecting user with pretty bad consequences.
That SVG can then do things like history.replaceState() and include <foreignObject> with HTML to change the URL shown to the user away from the SVG source and show any web UI it would like.
how is that special/different from an HTML URL?
Because displaying user-submitted images is pretty common and doesn't feel like a security footgun, but displaying user-submitted HTML is less common (and will raise more careful security scrutiny).
Would it be possible for messenger apps to simply ignore <script> tags (and accept that this will break a small fraction of SVGs)? Or is that not a sufficient defense?
I looked into it for work at some point as we wanted to support SVG uploads. Stripping <script> is not enough to have an inert file. Scripts can also be attached as attributes. If you want to prevent external resources it gets more complex.
The only reliable solution would be an allowlist of safe elements and attributes, but it would quickly cause compat issues unless you spend time curating the rules. I did not find an existing lib doing it at the time, and it was too much effort to maintain it ourselves.
The solution I ended up implementing was having a sandboxed Chromium instance and communicating with it through the dev tools to load the SVG and rasterize it. This allowed uploading SVG files, but it was then served as rasterized PNGs to other users.
Shouldn't the ignoring of scripting be done at the user agent level? Maybe some kind of HTTP header to allow sites to disable scripts in SVG ala CORS?
It's definitely a possible solution if you control how the file are displayed. In my case I preferred the files to be safe regardless of the mechanism used to view them (less risk of misconfiguration).
Content-Security-Policy: default-src 'none'
No, svgs can do `onload` and `onerror` and also reference other svgs that can themselves contain those things (base64'd or behind a URI).
But you can use an `img` tag (`<img src="evil.svg">`) and that'll basically Just Work, or use a CSP. I wouldn't rely on sanitizing, but I'd still sanitize.
> But you can use an `img` tag (`<img src="evil.svg">`) and that'll basically Just Work
That doesn't help too much if evil.svg is hosted on the same domain (with default "Content-Type: image/svg+xml" header), because attacker can send a direct link to the file.
Reddit horribly breaks direct links to images and serves html instead.
IMO, the bigger problem with SVGs as an image format is that different software often renders them (very) differently! It's a class of problem that raster image formats basically don't have.
> It's a class of problem that raster image formats basically don't have.
That took way too long to be this way. Some old browsers couldn't even get the colors of PNGs correct, let alone the transparency.
I would have expected SVGs to be like PDFs and render the same across devices. Is the issue that some renderers don’t implement the full spec, or that some implement parts incorrectly?
They are like PDFs in that they do not render the same with different software or on different devices.
I would say PDFs are actually reasonably consistent though. Weird things happen on occasion, but I've certainly had more success than with SVGs.
They are reasonably consistent because there is a de-facto reference implementation (Adobe Acrobat) which, if your implementation does not match exactly, users will think your implementation is broken.
There isn't such an implementation for SVG.
We live in a world where Adobe set the standard, and anything that didn't render like Adobe was considered "incorrect".
You definitely don't understand PDFs, let alone SVGs.
PDFs can also contain scripts. Many applications have had issues rendering PDFs.
Don't get me wrong, the folks creating the SVG standard should've used their heads. This is like the 5th time (that I am aware of) this type of issue has happened, (and at least 3 of them were Adobe). Allowing executable code in an image/page format shouldn't be a thing.
SVG can for example contain text elements rendered with a font. If the font is not available it will render in a different one. The issue can be avoided by turning text elements into paths, but not all SVGs do that.
Also text zoom.
More like HTML and getting different browsers to render pixel perfectly identical result (which they don't) including text layout and shaping. Where different browser don't mean just Chrome, Firefox, Safari but also also IE6 and CLI based browsers like Lynx.
PDFs at least usually embed the used subset of fonts and contain explicit placement of each glyph. Which is also why editing or parsing text in PDFs is problematic. Although it also has many variations of Standard and countless Adobe exclusive extensions.
Even when you have exactly the same font text shaping is tricky. And with SVGs lack of ability to embed fonts, files which unintentionally reference system font or a generic font aren't uncommon. And when you don't have the same font, it's very likely that any carefully placed text on top of diagram will be more or less misplaced, badly wrap or even copletely disappear due to lack of space. Because there is 0 consistency between the metrics across different fonts.
The situation with specification is also not great. Just SVG 1.1 defines certain official subsets, but in practice many software pick whatever is more convenient for them.
SVG 2.0 specification has been in limbo for years although seems like recently the relevant working group has resumed discussions. Browser vendors are pushing towards synchronizing certain aspects of it with HTML adjacent standards which would make fully supporting it outside browsers even more problematic. It's not just polishing little details many major parts that were in earlier drafts are getting removed, reworked or put on backlog.
There are features which are impractical to implement or you don't want to implement outside major web browsers that have proper sandboxing system (and even that's not enough once uploads get involved) like CSS, Javascript, external resource access across different security contexts.
There are multiple different parties involved with different priorities and different threshold for what features are sane to include:
- SVG as scalable image format for icons and other UI elements in (non browser based) GUI frameworks -> anything more complicated than colored shapes/strokes can problematic
- SVG as document format for Desktop vector graphic editors (mostly Inkscape) -> the users expect feature parity with other software like Adobe Illustrator or Affinity designer
- SVG in Browsers -> get certain parts of SVG features for free by treating it like weird variation of HTML because they already have CSS and Javascript functionality
- SVG as 2d vector format for CAD and CNC use cases (including vinyl cutters, laser cutters, engravers ...) -> rarely support anything beyond shapes of basic paths
Beside the obviously problematic features like CSS, Javascript and animations, stuff like raster filter effects, clipping, text rendering, and certain resource references are also inconsistently supported.
From Inkscape unless you explicitly export as plain 1.1 compatible SVG you will likely get an SVG with some cherry picked SVG2 features and a bunch of Inkscape specific annotations. It tries to implement any extra features in standard compatible way so that in theory if you ignore all the inkscape namespaced properties you would loose some of editing functionality but you would still get the same result. In practice same of SVG renderers can't even do that and the specification for SVG2 not being finalized doesn't help. And if you export as 1.1 plain SVG some features either lack good backwards compatibility converters or they are implemented as JavaScript making files incompatible with anything except browsers including Inkscape itself.
Just recently Gnome announced working on new SVG render. But everything points that they are planning to implement only the things they need for the icons they draw themselves and official Adwaita theme and nothing more.
And that's not even considering the madness of full XML specification/feature set itself. Certain parts of it just asking for security problems. At least in recent years some XML parsers have started to have safer defaults disabling or not supporting that nonsense. But when you encounter an SVG with such XML whose fault is it? SVG renderer for intentionally not enabling insane XML features or the person who hand crafted the SVG using them.
Even PDFs don't always render the same from one platform to another. I've mostly seen it due to missing fonts.
Most renderers don't implement the full spec.
Yeah, I spent a bit of time trying to figure out some masking issues with a file I created in Inkscape but which chrome would butcher. Turned out to be opacity on a mask layer or something.
Could there be a limited format that disables scripting? Like in Excel: xlsx files have no macros, but xlsm (and the old xls) can contain macros.
But how else would we revisit all the security bugs of Flash/Macromedia?
It's wild how often we rediscover that executing untrusted code leads to decades of whack-a-mole security. Excel/Word plus macros, HTML plus JavaScript, SVG plus JavaScript, ...
It’s wild how often specs are ok for 9 versions, and then at version 10, standard bodies decide to transform them into a trojan firehose.
It’s so regular like clockwork that it has to be a nation state doing this to us.
Any notable examples you can share?
PDF was purposely a non-Turing adaptation of PostScript. Then they added JavaScript support.
Does it need to be as complicated as a new format? Or would it be enough to not allow any scripting in the provided SVGs (or stripping it out). I can't imagine there are that many SVGs out there which take advantage of the feature.
Yeah, it's still insane to me that the SVG can contain scripts. Wholly unnecessary; the DOM subtree it defines could be manipulated by external scripts just fine.
Anyway, I just set `svg.disabled` in Firefox. Scary world out there.
Update: this breaks quite a few things. It seems legitimate SVGs are used more often for UI icons than random diagrams and such. I suppose I shouldn't be surprised. I'll have to rethink this.
What we got was html for vector graphics and what we wanted was jpeg for vector graphics.
If only there was a widely used vector format that had script support and also decades of work on maintaining a battle-tested security layer around it with regular updates on a faster release cycle than your browser. That'd be crazy. Sure would suck if we killed it because we didn't want to bother maintaining it anymore.
(Yes I'm still salty about Flash.)
> because we didn't want to bother maintaining it anymore
That wasn't the only reason. Flash was also proprietary, and opaque, and single-vendor, among many other problems with it.
Uh... Flash was a genuine firehose of security flaws. I mean, yeah, they patched them. So "battle tested security layer" isn't wrong in a technical sense. But, yikes, no.
The Flash revisionism I see around here occasionally is bizarre.
No, Flash was terrible and killing it was good.
There is artistically no equivalent to Flash ever since it died. Nothing else has allowed someone with artistic skills but no programming skills to create animations and games to the same degree and with the same ease.
what is missing was a replacement for the flash editor itself, not the format.
I'd say Roblox is absolutely filling that market need. And as mentioned elsewhere, the "animations and games" demographic has moved on in the intervening decades to social media, and tools like CapCut make creating online content easier than it ever has been.
Honestly I think a lot of the Flash mania is just middle aged nerds fondly remembering their youth. The actual tool was a flash in the pan, and part of a much more complicated history of online content production. And the world is doing just fine without it.
It was terrible from a security POV, but the tooling was superb.
I remember my teenage friends creating things with flash in a way that doesn't happen on the modern web.
Sure, but that's because the media and forums change, not so much a point about tool capability. The equivalent of teenaged geeks hacking on flash games today is influencer wannabes editting trends in CapCut. If anything content production is far more accessible now than in the 90's.
I think it depends on whether you see Flash as competing with webvideo or with downloadable executables.
SVG without <script> would do just fine.
SVG also supports event attributes, so you should probably strip those too.
Wikipedia, which allows uploading media, deals with this by rendering svgs on the server side.
is santizing SVGs hard, or just everyone forgets they can contain js?
I gather from the HN discussion that it's not simple to disable scripting in an SVG, in retrospect a tragically missing feature.
I guess the next step is to propose a simple "noscripting" attribute, which if present in the root of the SVG doc inhibits all scripting by conforming renderers. Then the renderer layer at runtime could also take a noscripting option, so the rendering context could force it if appropriate. Surely someone at HN is on this committee, so see what you can do!
Edit: thinking about it a little more - maybe it's best to just require noscripting as a parameter to the rendering function. Then the browsers can have a corresponding checkbox to control SVG scripting and that's it.
Disabling script execution in svgs is very easy, it's just also easy to not realize you're about to embed an svg. `<img src="evil.svg">` will not execute scripts, a bit like your "noscripting" attribute except it's already around and works. Content Security Policy will prevent execution as well, you should be setting one for image endpoints that blocks scripts.
Sanitizing is hard to get right by comparison (svgs can reference other svgs) but it's still a good idea.
I had the impression from elsewhere in this thread that loading the svg in some other way, then you are not protected. This makes a no-brainer "don't run these ever" option in the browser seem appealing.
> This makes a no-brainer "don't run these ever" option in the browser seem appealing.
Firefox has this: svg.disabled in about:config. It doesn't seem to be properly documented, and might cause other problems for the developer (I found it accidentally, and a more deliberate search turns up mainly bug tracker entries.)
its common to santize html string to parse it and remove/error on script tags (and other possible vulnerabilities)
i wonder do people not do this with svgs?
User name checks out.
I believe the username is from the AI simulation of HN in 10 years.
> On one hand, the animations and entire interactive demos and even games in a single SVG are cool. But on the other hand
Didn’t we do this already with Flash? Why would this lesson not have stuck?
I agree, when animating SVGs I never put the js inside them so having the ability embed it is just dangerious I think
Wow, I learned one thing today!
Do other vector formats have the same vulnerabilities?
"The script doesn't run unless the file is directly opened (you can't run scripts from (<img src="/image.svg">)."
It will run if its in an <object> tag.
So if you're directly embedding the thing. This is a somewhat rare use case, should not be banned almost anywhere...
There is: PDF. You may not like it or adobe, but its there and widely supported.
PDF also has script support unfortunately.
That's apparently how 4chan got hacked a while back. They were letting users upload PDFs and were using ghostscript to generate thumbnails. From what I understand, the hackers uploaded a PDF which contained PostScript which exploited a ghostscript bug.
Yes but the primary issue was that 4chan was using over a decade old version of the library that contained a vulnerability first disclosed in 2012: https://nvd.nist.gov/vuln/detail/CVE-2012-4405
Does that mean that opening arbitrary pdfs on your laptop is unsafe?
Let me put it this way...
In one of my penetration testing training classes, in one of the lessons, we generated a malicious PDF file that would give us a shell when the victim opened it in Adobe.
Granted, it relied on a specific bug in the JavaScript engine of Adobe Reader, so unless they're using a version that's 15 years old, it wouldn't work today, but you can't be too cautious. 0-days can always exist.
Yes, opening random pdfs especially in random and old pdf viewers is not a good idea.
If you must open a possibly infected pdf, then do it in browser, pdf.js is considered mostly safe, and updated.
Use the PDF to JPG online services, convenient and you still get your result without having to deal with any sandbox
Except of course that you're sharing the contents of that PDF with a random online service.
True, I just considered that once you handle a PDF with so much care like if it was poisoned, it's perhaps better to send this poison to someone else to handle.
Better a DJVU file generated at a high DPI.