Wednesday, June 30, 2010

Injection-safe templating languages

The state of the art for Cross Site Scripting (XSS) software engineering defense is, of course, contextual output encoding. This involves manually escaping/encoding each piece of user data within the right context of an HTML document. The best programmer-centric OWASP resource around XSS defense can be found here: http://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet
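To make that concrete, here is a minimal Java sketch of manual contextual encoding using the OWASP ESAPI encoder. The page fragment and class are illustrative, not taken from the cheat sheet - the point is that each sink needs the encoder matching its context:

    import org.owasp.esapi.ESAPI;

    public class ProfileFragment {
        // Illustrative sketch: each output context gets its matching encoder.
        String render(String userName, String userColor) {
            StringBuilder html = new StringBuilder();
            // HTML body context: entity-encode
            html.append("<p>Hello, ")
                .append(ESAPI.encoder().encodeForHTML(userName))
                .append("</p>");
            // HTML attribute context: attribute-encode
            html.append("<div title=\"")
                .append(ESAPI.encoder().encodeForHTMLAttribute(userName))
                .append("\">");
            // JavaScript string context: JavaScript-encode
            html.append("<script>var color = '")
                .append(ESAPI.encoder().encodeForJavaScript(userColor))
                .append("';</script></div>");
            return html.toString();
        }
    }

Miss one sink, or pick the wrong encoder for a context, and the page is injectable.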

However, manually escaping user data can be a complex, error-prone, and time-consuming process - especially if you are battling DOM-based XSS vulnerabilities. We need a more efficient way. We need our frameworks to defend against XSS automatically so programmers can focus on innovation and functionality.

The future of XSS defense is HTML templating languages that are injection-safe by default.

Thanks to Mike Samuel from Google's AppSec team for pointing these projects out to me.

First we have GXP: http://code.google.com/p/gxp/. It's an older Google offering that is structurally much closer to JSP, and so possibly a better option for someone who has a bunch of broken JSPs and wants to migrate piecemeal to a better system.
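For flavor, a GXP template reads like XHTML with typed parameters. It looks roughly like the following (reconstructed from the project's documentation, so treat the exact namespace details as approximate) - the evaluated expression is escaped for its HTML context automatically:

    <gxp:template name="com.example.Greeting"
                  xmlns="http://www.w3.org/1999/xhtml"
                  xmlns:gxp="http://google.com/2001/gxp">
      <gxp:param name="userName" type="String"/>
      <!-- userName is HTML-escaped automatically on output -->
      <div class="greeting">Hello, <gxp:eval expr="userName"/>!</div>
    </gxp:template>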

There are also Java libraries like HtmlClosure (http://gxp.googlecode.com/svn/trunk/javadoc/com/google/gxp/html/HtmlClosure.html). This library throws exceptions that are captured in the Java type system, which makes auditing them and putting logging and assertions around them fairly easy. They've done a really bad job documenting and advocating GXP, but it's very well thought out, easy to use, and feature-complete. https://docs.google.com/a/google.com/present/view?id=dcbpz3ck_8gphq8bdt is the best intro.
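To give a feel for the API, here is a rough Java sketch of code consuming an HtmlClosure. The write() method and checked IOException come from the linked javadoc; the Locale-based GxpContext constructor and the surrounding class are my assumptions:

    import com.google.gxp.base.GxpContext;
    import com.google.gxp.html.HtmlClosure;
    import java.io.IOException;
    import java.util.Locale;

    public class PageWriter {
        // An HtmlClosure can only come from code known to emit well-formed,
        // properly escaped HTML (e.g. a compiled GXP template), so the type
        // itself documents that the value is safe to write to an HTML sink.
        void renderTo(Appendable out, HtmlClosure body) throws IOException {
            // IOException is checked, which makes failures easy to audit,
            // log, and assert around.
            body.write(out, new GxpContext(Locale.US));
        }
    }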

Another angle on the problem of generating safe HTML is http://google-caja.googlecode.com/svn/changes/mikesamuel/string-interpolation-29-Jan-2008/trunk/src/js/com/google/caja/interp/index.html , which talks about ways to redefine string interpolation in languages like Perl and PHP.
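Java has no string interpolation to redefine, but the core idea carries over: make the substitution step itself apply the escaping, so raw values never reach the markup. A toy, entirely hypothetical sketch:

    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public final class SafeInterp {
        private static final Pattern VAR = Pattern.compile("\\$\\{(\\w+)\\}");

        // Toy "injection-safe interpolation": every substituted value is
        // HTML-escaped on its way into the template, so forgetting to
        // escape is no longer possible at the call site.
        public static String interpolateHtml(String template, Map<String, String> vars) {
            Matcher m = VAR.matcher(template);
            StringBuffer out = new StringBuffer();
            while (m.find()) {
                String value = vars.getOrDefault(m.group(1), "");
                m.appendReplacement(out, Matcher.quoteReplacement(escapeHtml(value)));
            }
            m.appendTail(out);
            return out.toString();
        }

        private static String escapeHtml(String s) {
            return s.replace("&", "&amp;").replace("<", "&lt;")
                    .replace(">", "&gt;").replace("\"", "&quot;")
                    .replace("'", "&#39;");
        }
    }

The Caja write-up goes further: it picks the escaper based on the HTML context surrounding each interpolation point, where this sketch hard-codes a single HTML-body escaper.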

Marcel Laverdet from Facebook is trying another tack for PHP with his XHP scheme: http://www.facebook.com/notes/facebook-engineering/xhp-a-new-way-to-write-php/294003943919 . Rasmus Lerdorf has publicly been very skeptical of XHP, but I think a lot of his criticisms were a result of conflating XHP with other Facebook PHP schemes, such as precompilation to C and the like.

And of course, there is the Google Auto-Escape project to keep a close eye on. It was first announced on March 31, 2009: http://googleonlinesecurity.blogspot.com/2009/03/reducing-xss-by-way-of-automatic.html

Today, we need to manually output encode each piece of user driven data that we display. Perhaps tomorrow, our frameworks will do that work for us.

Tuesday, March 30, 2010

Shure SM-7B

Thank you to OWASP for this new studio-quality microphone, a Shure SM-7B. It's an incredible piece of equipment that makes my life a lot easier - and takes up a lot less space in my very crowded computer area.
I have quite a few podcasts on deck - including a 5-show batch to be released in sync with the Top Ten release!

Thanks all.

Aloha,
Jim

Thursday, January 21, 2010

How bad is it?

Thank you to John Menerick and Ben Nagy for entertaining my questions on the Daily Dave list.

Q: Is the recent IE6 0-day anything special?

John: Not really. Not as special as the NT <-> Win 7 issue recently highlighted.

Q: How many similar 0-days are for sale on the black market?

John: Quite a few.

Ben: I'd love to see your basis for this assertion. I'm not saying that in the "I don't believe you" sense, only in the "everyone always says that but nobody ever puts up any facts" sense.

Q: What is the rate/difficulty of discovery of new Windows-based 0-days for the common MS and Adobe products that are installed on almost every corporate client? (I heard Dave mention that discovery is getting more difficult.)

John: Not terribly difficult for someone who is dedicated. Then again, my idea of difficult is much different from the average person's.

Ben: I think that while finding 0-days might be 'not terribly difficult', selecting and properly weaponising useful 0-days from the masses of dreck your fuzzer spits out IS difficult - at least in my experience. There was some discussion of the 'too many bugs' problem on this list previously and I know several of the other fuzzing guys are currently researching the same area. Of course you'd explain this to your 'avg. person', as well as explaining that the skillset for finding bugs is not necessarily the same as the skillset for writing reliable exploits for them, and that 'dedication' may not sufficiently substitute for either.

Lurene Grenier: I really feel that the "selecting good crashes" problem is not that hard to overcome if you have a proper bucketing system, and the ability to do just a bit of auto-triage at crash time. For example, the fuzzer I use now both separates crashes by what it perceives to be the base issue at hand, and provides a brief notes file with some information about the crash and what is controlled. This requires just a bit of sense in providing fuzzed input, and very little smarts on the part of the debugger. I really think the next step is automating that brain-jutsu; much of it is hard to keep in your head, but not hard to do in code.

Using this output, it's pretty easy to spend a lazy morning with your coffee grepping the notes files for the sorts of things you usually find to be reliably exploitable. From there you can call in your 30 ninjas and have at.

Creating reliable exploits is for sure the hardest part, but once you've done the initial work on a program, the next few exploits in it are of course more quickly and easily done.

As for the thought experiment, I think that the benefit of the top four researchers is that they've trained themselves over a long period of time (and with passion) to have a very good set of pattern-recognition tools which they call instincts. They know how to get crashes, and they know having seen one crash what's likely to find more. They know how to think about a process to get proper execution, and they're rewarded by success emotionally which makes the lesson learned this time around stick for when they need it again.

I honestly think that there is more pattern recognition "muscle-memory" type skill involved in RE, bug hunting, and exploit dev than pure mechanical process, which is why the numbers are so skewed. It's like taking 4 native speakers of a language (who love to read!) and 100 students of general linguistics with a zillion dollars. Who will read a book in the language faster?
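As an aside, the "separate crashes by base issue" step Lurene describes can be sketched in a few lines - the toy Java version below keys each crash on its top few stack frames, so an analyst reviews one representative (plus its notes file) per bucket. Everything here is hypothetical; real triage systems key on much richer signals:

    import java.util.*;

    public class CrashBuckets {
        static final int FRAMES = 5; // how many top frames define the "base issue"

        record Crash(List<String> stackFrames, String notes) {}

        // Crashes whose top FRAMES stack frames match are assumed to share
        // a root cause and land in the same bucket.
        static Map<String, List<Crash>> bucket(List<Crash> crashes) {
            Map<String, List<Crash>> buckets = new HashMap<>();
            for (Crash c : crashes) {
                List<String> top =
                    c.stackFrames().subList(0, Math.min(FRAMES, c.stackFrames().size()));
                String key = String.join("|", top); // crude crash signature
                buckets.computeIfAbsent(key, k -> new ArrayList<>()).add(c);
            }
            return buckets;
        }
    }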

Q: How easy is discovery for someone with resources like the Chinese government?

John: Much simpler.

Ben: Setting aside the previous point that discovery is only the start, I think it's instructive to consider which elements of the process scale well with money.

Finding the bugs: You need a fuzzing infrastructure that scales - running peach on one laptop with 30 ninjas standing around it with IDA Pro open is not going to work. Also consider tracking what you've already tested, tracking the results, storing all the crashes, blah blah blah. This does scale well with money, but it's an area that not as many people have looked at as I would like.

Seeing which bugs are exploitable: Using a naive approach, this scales horribly poorly with money - non-linearly, to put it mildly. There are only so many analysts you will be able to hire that have enough smarts to look at a non-trivial bug and correctly determine its exploitability. You only have to look at some of the Immunity guys' (hi Kostya) records with turning bugs that other people had discarded as DoS or "Just Too Hard" into tight exploits. Even for ninjas, it's slow. There is research being done into doing 'some' of this process automatically (well, I'm doing some, and I know a couple of other guys are too, so that counts), but I don't know of anyone that has a great result in the area yet - I'd love to be corrected.

Creating nice, reliable exploits: I'd assert that this is like the previous point, but even harder. To be honest, it's not really my thing, so probably one of the people that write exploits for a living would be better to comment, but from talking to those kind of guys, it's often a very long road from 'woo we control ebx' to reliable exploitation, especially against modern OSes and modern software that has lots of stuff built in to make your life harder. I don't know how much of the process can really be automated - I mean there are some nice things like the (old now) EEREAP and newer windbg extensions from the Metasploit guys that will find you jump targets according to parameters and so forth, but up until now I was labouring under the impression that a lot of it remains brain-jitsu, which is hard to scale linearly with money.

So, while I think that 'simpler' is certainly unassailable, I would need more than a two-word assertion to be convinced that it is 'much' simpler. If you give one team a million dollars and 100 people selected at random from the top 10% of graduating computer science students, and you give the other team their pick of any 4 researchers in the world and 3 iMacs, who does the smart money think will produce more weapons-grade 0day after 6 months?

(No it's not a fair comparison. It's a thought experiment.)

Food for thought, perhaps, since sound bites need little care and feeding.

Q: How bad is it really?

John: Look at the CVSSv2 score and adjust it to the environments where you determine "how bad it is." It could be much worse.

Q: I suspect we are just looking at one grain of sand in a beach of 0-days....

John: Correct. No one wants to let everyone else know what cards they hold in their hand, the tools in their toolbox, etc.