From: 3APA3A [3APA3A@SECURITY.NNOV.RU]
Sent: Friday, February 15, 2002 2:08 PM
To: BUGTRAQ@securityfocus.com
Subject: SECURITY.NNOV: Bypassing content filtering software

Dear BUGTRAQ,

I planned to release this advisory later, but this problem is being actively discussed on Bugtraq now. So I decided to publish it without any information on vulnerable products (I have found a few). Sorry for bad English, please feel free to ask me if something is not clear. Of course, this advisory doesn't pretend to be complete; it's intended to show the basic approach and is targeted mostly at content filtering software vendors. I will be grateful for information on actual vulnerable products.

Original version of this article:
http://www.security.nnov.ru/advisories/content.asp

-=-=-

There are common methods that allow bypassing almost any content filtering software (antiviral products, CVP firewalls, personal firewalls, mail attachment filters, etc). I believe multiple products are vulnerable.

Contents:

I. Bypassing attachment detection or invalid detection of attachment type.
  1. Encoded filename or boundary in Content-Type/Content-Disposition
  2. Multiple filename or boundary fields in Content-Type / Content-Disposition
  3. Exploitation of poisoned NULL byte
  4. Exploitation of unsafe fgets() problem
  5. MIME part inside MIME part
  6. UUENCODE problems
  7. Additional space symbol
  8. CR without LF
II. Bypassing detection of potentially dangerous content
  1. Inability to check Unicode (UCS-2) content
  2. Inability to check UTF-7 content
  3. Inability to check file marked as UTF-7 Content
III. What should be done?
  1. What client software vendors should do.
  2. What server software vendors should do.
  3. What system administrators should do.

I. Bypassing attachment detection or invalid detection of attachment type.

Imagine an administrator who has set his server to strip mail attachments with dangerous extensions: .exe, .com, .bat, .cmd, .pif, .scr etc.
Now he is sure that his users can't get an executable file via e-mail. He's wrong, because server and client software may use different ways to find attachments and to discover the type of attachments. Also, some servers have vulnerabilities that prevent them from discovering attachments. There are a few exploitation scenarios:

1. Encoded filename in Content-Type/Content-Disposition

Mail software finds that a MIME part is actually an attachment by the 'name' attribute in Content-Type or the 'filename' attribute in Content-Disposition. If neither the name nor the filename attribute is present, most software will fail to find the attachment. name and filename may contain encoded-words. Usually Content-Type looks like

  Content-Type: application/binary; name="eicar.com"

or

  Content-Type: application/binary; name="=?us-ascii?Q?eicar=2Ecom?="

But there are different sub-variants server software may fail to check:

  Content-Type: text/plain; name==?us-ascii?Q?eicar.com?=

or

  name=eicar.com
  name=""eicar.com
  name=eicar .com
  name="eicar.com
  name==?us-ascii?Q?eicar.com?=
  name==?us-ascii?Q?eicar?=.com
  name="eicar.=?us-ascii?Q?com?="
  name="eicar.=?us-ascii?Q?com?=
  name=eicar.=?us-ascii?Q?com?=
  name=eicar.=?us-ascii?Q?co?=m

With names like these, many programs fail to detect the .com extension or to find the attachment at all (please note: base64 may be used instead of quoted-printable). Another example is

  name="=?us-ascii?B?eica.com

In this case the encoded word is incomplete and it's not clear whether it should or shouldn't be decoded from base64. It will depend on the client program's implementation. Good content filtering software should try both cases.

Some programs also rely on the boundary to detect attachments. If Content-Type contains something like

  boundary==?koi8-r?Q?aaa?=

they may try to use the boundary "aaa" while most clients will use exactly "=?koi8-r?Q?aaa?=".
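As a rough illustration of the "try both cases" advice for filenames, here is a minimal C sketch, not taken from any real product. All function names (normalize, has_blocked_ext, name_looks_dangerous) and the 512-byte limit are assumptions made for this example. It reads a raw name/filename value twice, once literally and once with Q-encoded words expanded wherever they occur, and flags the value if either reading ends in a blocked extension. Base64 ("B") encoded words are not handled here; a real filter would need them as well.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of a hex digit, or -1 if c is not one. */
static int hexval(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Append one lower-cased byte. Quotes, whitespace and NUL bytes are dropped,
   so name=""eicar.com and name=eicar .com both become "eicar.com".
   (NUL bytes need their own handling; that is outside this sketch.) */
static void put(char *dst, size_t cap, size_t *len, unsigned char c)
{
    if (c == '\0' || c == '"' || isspace(c)) return;
    if (*len + 1 < cap) dst[(*len)++] = (char)tolower(c);
}

/* Normalize a raw value. With decode != 0, "=?charset?Q?text?=" is expanded
   in place, even when unquoted, glued to other text, or missing "?=". */
static void normalize(const char *src, int decode, char *dst, size_t cap)
{
    size_t len = 0;
    const char *q;
    int hi, lo;

    assert(dst != NULL && cap > 0);
    while (*src) {
        if (decode && src[0] == '=' && src[1] == '?' &&
            (q = strchr(src + 2, '?')) != NULL &&
            (q[1] == 'Q' || q[1] == 'q') && q[2] == '?') {
            src = q + 3;                      /* start of the encoded text */
            while (*src && !(src[0] == '?' && src[1] == '=')) {
                if (src[0] == '=' &&
                    (hi = hexval((unsigned char)src[1])) >= 0 &&
                    (lo = hexval((unsigned char)src[2])) >= 0) {
                    put(dst, cap, &len, (unsigned char)(hi * 16 + lo));
                    src += 3;
                } else if (*src == '_') {
                    src++;                    /* '_' means space in Q encoding */
                } else {
                    put(dst, cap, &len, (unsigned char)*src++);
                }
            }
            if (*src) src += 2;               /* skip "?=" when it is present */
        } else {
            put(dst, cap, &len, (unsigned char)*src++);
        }
    }
    dst[len] = '\0';
}

/* 1 if the normalized name ends in one of the blocked extensions. */
static int has_blocked_ext(const char *name)
{
    static const char *const blocked[] =
        { ".exe", ".com", ".bat", ".cmd", ".pif", ".scr" };
    size_t n = strlen(name), i, e;

    for (i = 0; i < sizeof blocked / sizeof blocked[0]; i++) {
        e = strlen(blocked[i]);
        if (n >= e && strcmp(name + n - e, blocked[i]) == 0) return 1;
    }
    return 0;
}

/* 1 if either reading (literal or decoded) looks like a blocked file. */
int name_looks_dangerous(const char *raw)
{
    char buf[512];
    int decode;

    if (strlen(raw) >= sizeof buf) return 1;  /* fail closed on long values */
    for (decode = 0; decode <= 1; decode++) {
        normalize(raw, decode, buf, sizeof buf);
        if (has_blocked_ext(buf)) return 1;
    }
    return 0;
}
```

A filter built this way would call name_looks_dangerous() on every name and filename value it finds. Checking both readings errs on the side of false positives, which seems the safer choice when the client's behaviour is unknown.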
Another case is when software tries to decode an encoded word; for example, multiple programs miss the attachment if it's marked as

  Content-Type: text/plain;=?us-ascii?B?;name="eicar.com";?=

2. Multiple filenames/boundaries.

Another point is how software behaves if there are multiple name or boundary attributes. Example:

  Content-Type: text/plain; name="safe.txt"; name="eicar.com"

Most client programs will use the last name or boundary, but good content filtering software should block that kind of message or check all possible interpretations.

3. Exploitation of "poisoned null byte".

I believe there is no need to explain that the ASCII 0 byte may be a string terminator. A NULL byte may be present in data as is or may be encoded using base64 or quoted-printable. There are a lot of situations where server and client software may react to a null byte in different ways. At least Outlook Express treats NULL as CRLF.

3.1 Filename and boundary.

There is no need to explain that both

  name="file.txt\0.exe"

and

  name="file.exe\0.txt"

may be dangerous, and that

  boundary="aaa\0bbb"

may be treated as is or as "aaa".

3.2 MIME header and MIME body

Imagine there is a MIME part with

  Content-type: text/plain; name=eicar.com
  \0Any: text
  EICAR-SIGNATURE

Client software may think that EICAR-SIGNATURE is the beginning of the file data, while content filtering software will think it's a part of the header. Or vice versa. The only good solution is to not allow NULL bytes in headers.

4. Exploitation of unsafe fgets() problem

I used the term "unsafe fgets()" some time ago regarding a mailbox parsing problem in a few applications. This is an input validation bug in programs that process string input, where long strings are processed incorrectly in specific situations. It has nothing in common with overflowing a buffer. Let's review a small example. Imagine the following code looks for an empty line consisting of only '\n' to find the end of MIME headers:

  while ( fgets(buffer, BUFFERSIZE, input) ) {
      ...
      if (*buffer == '\n') header = 0;
      ...
  }

There is a bug in this code.
Imagine a line exactly BUFFERSIZE bytes long (the last byte is '\n'). The first fgets() call will return BUFFERSIZE-1 characters. The second call will return a string consisting of only the '\n' character. It will be incorrectly believed to be an empty line. A lot of client and server software has this kind of bug. It makes it possible to fool this software into detecting headers where it shouldn't, for example:

  Header:(number of spaces)Content-Type: text/plain; name="eicar.exe"

or, as in case 3.2, into treating some header fields as part of the body.

5. MIME part inside MIME part

This bug is very common in software which strips attached files. Example:

  --aaa
  Content-Type=text/plain;
  --bbb
  Content-Type=application/exe; name="eicar.com"

  EICAR SIGNATURE
  --bbb--
   name="eicar.com"

  EICAR SIGNATURE
  --aaa

When the bbb part is removed, the aaa part will contain eicar.com.

6. UUENCODE problems

UUENCODE is an older format for file attachments that doesn't require a MIME part. In the classic case a uuencoded file begins with

  begin XXX filename.ext

(XXX is the file permissions in octal). The problem arises if the filename contains spaces, for example

  begin 666 eicar .com

This is a valid filename, but multiple attachment filters fail to check everything after the space.

7. Additional space symbol

An additional space symbol at the end of a filename or boundary may be treated in different ways by client and mail filtering software. For example:

  boundary=aaa\r\r\n

may be treated by client software as either "aaa" or "aaa\r", and both cases should be checked. The same applies to a filename in MIME or UUENCODE.

8. CR without LF

At least Outlook Express treats CR without LF as end of line. It makes it possible to create Content-Type headers and a body invisible to content filtering software (this was reported by Valentijn Sessink). BTW: older versions of The Bat! crash on CR without LF, see http://www.security.nnov.ru/advisories/thebat2.asp

II. Bypassing detection of potentially dangerous content

There is a lot of software that tries to detect and block or remove dangerous file content (HTML strippers, antiviral products, etc). The inability of this software to handle specific data makes it useless.

1. Inability to check Unicode content

Multiple products (including Internet Explorer/Outlook Express) support Unicode encoding for text formats including text/html. Unicode (UCS-2) text begins with the bytes 0xFF 0xFE, followed by wide (WORD) characters in Intel host byte order (least significant byte first). Content filtering software may fail to strip potentially dangerous information (scripts, ActiveX, etc) from Unicode format text. For example, "