Wednesday, June 30, 2010

Injection-safe templating languages

The state of the art for Cross Site Scripting (XSS) software engineering defense is, of course, contextual output encoding. This involves manually escaping/encoding each piece of user data within the right context of a HTML document. The best programmer-centric OWASP resource around XSS defense can be found here: http://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet

However, manually escaping user data can be a complex, error prone and time consuming process - especially if you are battling DOM based XSS vulns. We need a more efficient way. We need our frameworks to automatically defend against XSS so programmers can focus on innovation and functionality.

The future of XSS defense is HTML templating languages that are injection-safe by default.

Thanks to Mike Samuel from Google's AppSec team for pointing these projects out to me.

First we have GXP : http://code.google.com/p/gxp/ . It's an older Google offering that is much closer structurally to JSP and so possibly a better option for someone who has a bunch of broken JSPs and wants to migrate piecemeal to a better system.

There are also Java libraries like http://gxp.googlecode.com/svn/trunk/javadoc/com/google/gxp/html/HtmlClosure.html - this Library throws exceptions that are captured in the java type system which makes auditing them and logging and assertions around them fairly easy. They've done a really bad job documenting and advocating GXP but it's very well thought out, easy to use, and feature complete. https://docs.google.com/a/google.com/present/view?id=dcbpz3ck_8gphq8bdt is the best intro.

Another angle on the problem of generating safe HTML is http://google-caja.googlecode.com/svn/changes/mikesamuel/string-interpolation-29-Jan-2008/trunk/src/js/com/google/caja/interp/index.html which talks about ways to redefine string interpolation in languages like perl and PHP.

Marcel Laverdet from Facebook is trying another tack for PHP with his XHP scheme : http://www.facebook.com/notes/facebook-engineering/xhp-a-new-way-to-write-php/294003943919 . Rasmus has publicly been very skeptical of XHP, but I think a lot of his criticisms were a result of conflating XHP with other Facebook PHP schemes, such as precompilation to C and the like.

And course, there is the Google Auto-Escape project to keep a close eye on. It was first announced on March 31st of 2009. http://googleonlinesecurity.blogspot.com/2009/03/reducing-xss-by-way-of-automatic.html

Today, we need to manually output encode each piece of user driven data that we display. Perhaps tomorrow, our frameworks will do that work for us.

1 comment:

Jim Manico said...

More from Ivan:

Hi Jim,

I saw your blog post today, so I thought you might find this interesting.

I got bored with having to think which output encoding to use where, so
I decided to implement context-aware output encoding as an extension for
Apache Velocity (but it can be abstracted to work with any data stream).

The idea is to allow developers to freely embed variables in templates,
but to parse output in real-time and apply correct output encoding to
prevent XSS.

Have a look at helloStrangerWithTemplate.vm in the attachment.

The only catch is that the parser needs to be smart enough to detect
every single context switch. The advantage is that templates are
controlled by developers and the assumption they will not actively work
to defeat the parser. Still, it is possible that they will use some
obscure feature.

I have the following contexts at the moment: HTML, JAVASCRIPT, CSS, and
URI. I am attaching a very concise list of how the parser decides to
switch context. I hope it makes sense.

Thanks!
Ivan Ristic