Jindřich Kubec

13 May 2011

Why we love specifications (not)!


A few days ago we blogged about another trick in PDF parsing. One commenter recommended that we read the specifications. As AV guys (not PDF-reader-writing guys), we usually don't read them to the full extent, because most of the specifications we've seen have been misleading at best.

To illustrate this, let me quote from the Microsoft RTF docs:

The header has the following syntax:

<header> \rtf <charset> \deff? <fonttbl> <filetbl>? <colortbl>? <stylesheet>? <listtables>? <revtbl>

An entire RTF file is considered a group and must be enclosed in braces. The \rtfN control word must follow the opening brace. The numeric parameter N identifies the major version of the RTF Specification used. The RTF standard described in this RTF Specification, although titled as version 1.6, continues to correspond syntactically to RTF Specification version 1. Therefore, the numeric parameter N for the \rtf control word should still be emitted as 1.

So let's assume we do it exactly as the docs say. What do we get in return? Malware slipping past our filters, because Microsoft Word does not conform to the spec and opens anything starting with {\rt.

The following is a complete RTF file which, when saved, opens without any problem in Microsoft Word 2000, 2003 and 2007. We assume the same is true of 2010. It won't open in WordPad, though.

{\rtHereYouCanSeeWhyWeDontLikeSpecsEspeciallyFromMicrosoft\ansi\ansicpg1250\uc1\deff29\deflang1029\deflangfe1029{\f0\lang1029\langfe1033\langnp1029 Hello MS world!\par}}
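To make the gap concrete, here is a minimal Python sketch comparing a spec-faithful header check with one that mirrors what Word accepts. The function names and the regular expression are our own illustration, not code from any real scanner; the strict check assumes only the "\rtfN right after the opening brace" rule quoted above.

```python
import re

# Strict check, following the quoted specification:
# an opening brace, then the \rtf control word with a numeric version N.
STRICT_RTF = re.compile(rb"^\{\\rtf\d")

# Lenient check, mirroring the behaviour described in this post:
# Word opens anything that starts with "{\rt".
LENIENT_RTF_PREFIX = b"{\\rt"


def looks_like_rtf_strict(data: bytes) -> bool:
    """Return True only if the data starts with {\\rtfN as the spec requires."""
    return STRICT_RTF.match(data) is not None


def looks_like_rtf_lenient(data: bytes) -> bool:
    """Return True if the data starts with {\\rt, whatever follows."""
    return data.startswith(LENIENT_RTF_PREFIX)


# The sample file from this post, as raw bytes.
sample = (
    rb"{\rtHereYouCanSeeWhyWeDontLikeSpecsEspeciallyFromMicrosoft"
    rb"\ansi\ansicpg1250\uc1\deff29\deflang1029\deflangfe1029"
    rb"{\f0\lang1029\langfe1033\langnp1029 Hello MS world!\par}}"
)

if __name__ == "__main__":
    print("strict: ", looks_like_rtf_strict(sample))
    print("lenient:", looks_like_rtf_lenient(sample))
```

A filter built on the strict check would not treat the sample as RTF at all, while the lenient one would. That difference is exactly the hole malware can walk through.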

So, tell me something about reading specs next time... <g>



Virus Lab, Analyses