Monday, March 29, 2021

Remotely Exploitable 0day in Internet Explorer Gets a Free Micropatch


by Mitja Kolsek, the 0patch Team

 

[Note: This blog post was originally published on February 17, 2021 under URL https://blog.0patch.com/2021/02/remotely-exploitable-0day-in-internet.html but was for some reason deleted. Perhaps it was our fat fingers, perhaps evil forces - we'll never know. We have now reconstructed it from the Internet Archive which is an incredible service that you should donate to if you like this post, as we did.]

[Update 3/19/2021: This issue has been fixed by March 2021 Windows Updates. 0patch users had this critical issue, now assigned CVE-2021-26411, patched since February 17, full 20 days before an official patch became available. Since the official fix is available, this micropatch is no longer FREE and requires a PRO license.]

On February 4, 2021, security researchers at ENKI, a South Korean security consultancy, published a blog post detailing an unpatched vulnerability in Internet Explorer. This "0day" vulnerability was used in an attack campaign against various security researchers, including ENKI researchers, who noticed the attack and took the exploit apart to extract the vulnerability information. ENKI researchers kindly shared their proof of concept with us, so we could quickly start analyzing the vulnerability and create a micropatch for it.

The vulnerability is a "double free" bug that can be triggered with JavaScript code and causes memory corruption in Internet Explorer's process space. As is often the case, this memory corruption could be carefully managed and turned into arbitrary read/write memory access - which can then be leveraged to arbitrary code execution. Attackers delivered the exploit in an MHTML file to ensure recipients would open it in Internet Explorer (which is registered to open this file type). While this delivery method required recipients to confirm a security warning about executing active content, the exploit could be delivered without such warning if the victim visited a malicious web site with Internet Explorer. 

In such case, just opening the malicious web site would instantly, or a benign web site hosting a malicious ad, would result in malicious native code execution inside Internet Explorer's render process running (by default) in Low Integrity. Such code could read any data from the computer and network that the user running Internet Explorer could read, and silently send it to attacker. An additional vulnerability would be needed to escape the "Low Integrity sandbox" and achieve a long-term compromise of the computer.

Is anyone still browsing the web with Internet Explorer? While Internet Explorer is not widely used for browsing web sites anymore, it is installed on every Windows computer and (a) opens MHT/MHTML files by default, (b) is being used internally in many large organizations, and (c) executes HTML content inside various Windows applications.


The Vulnerability

Exploit and proof-of-concept have not been published yet and we won't be the first to do so, but the root of this vulnerability is not new - it's about tricking the browser to delete an object that has already been deleted in some unexpected way that existing sanitization checks don't notice. In this case, it's about deleting a node value of an HTML Attribute. The trick is to create an attribute, assign it a value that is not a string or a number, but an object (why is this even allowed?) - then when deleting this attribute, said object makes sure that the attribute is deleted before it gets deleted, so to speak.


The Micropatch

While Internet Explorer developers will probably fix the way the attribute node is deleted so that it doesn't actually get deleted while references to it still exist, we decided that such approach would simply require too much time for us and would introduce an unnecessary risk of breaking something. We thus rather decided to break the obscure browser functionality that allows setting an HTML Attribute value to an object. We assess this functionality to be useful to very few web developers whose apps are supposed to work with Internet Explorer.

Our micropatch gets applied inside the CAttribute::put_ie9_nodeValue function of mshtml.dll, where it checks the VARIANT type of the value that JavaScript code wants to assign to an attribute - and prevents that from happening if the type is 9 (VT_DISPATCH) - which is used for Object, Array, or Date.



The 64bit micropatch only has 5 CPU instructions, and the 32bit one has 6 CPU instructions.



MODULE_PATH "..\Affected_Modules\mshtml.dll_11.0.9600.19597_64bit\mshtml.dll"
PATCH_ID 560
PATCH_FORMAT_VER 2
VULN_ID 6943
PLATFORM win64

patchlet_start
 PATCHLET_ID 1
 PATCHLET_TYPE 2
 PATCHLET_OFFSET 0xbf34b4

 N_ORIGINALBYTES 5
 PIT mshtml.dll!0xbf359f ; address of exit block

 code_start

  ; we're going to check the incoming VARIANT's data type; if it's 9 (object), we're going
  ; to prevent it from being copied to the attribute.
  ; The incoming VARIANT is pointed to by rdx, and the type is in the first byte.

  mov r14, rcx         ; replicate the instruction we're injected in front of to make sure
                       ; rcx is stored in r14 in case we jump to the exit block (where rcx is
                       ; restored from r14)
  cmp byte [rdx], 0x09 ; is the incoming VARIANT data type 9 (object)?
  jne DO_NOTHING       ; if not, we don't interfere
 
  mov rbx, 0            ; return value - we simulate a successful operation
  jmp PIT_0xbf359f     ; jump to exit block
 
 DO_NOTHING:

 code_end
    
patchlet_end


Here's a video of the micropatch in action:




The micropatch applies to the following Windows versions (32bit and 64bit). 

Updated to February 2021:

  1. Windows 7 + ESU (first update from ESU year 2)
  2. Windows Server 2008 R2 + ESU (first update from ESU year 2)
  3. Windows 10 v1809, v1909, v2004, v20H2
  4. Windows Server 2016, 2019

Updated to January 2021:

  1. Windows 7 + ESU (last update from ESU year 1)
  2. Windows Server 2008 R2 + ESU (last update from ESU year 1)
  3. Windows 10 v1809, v1909, v2004, v20H2
  4. Windows Server 2016, 2019 

Updated to January 2020:

  1. Windows 7 without ESU (last free update without ESU)
  2. Windows Server 2008 R2 without ESU (last free update without ESU)
 

Online Test

 
We have prepared a simple online test to demonstrate how our micropatch changes the behavior of Internet Explorer. To perform this test, you have to use Internet Explorer 11 on one of the Windows systems listed above.

Step 1: With 0patch disabled, open https://0patch.com/poc/IE_Attribute_nodeValue_0day/test.html in Internet Explorer 11. The web page should look like the image below, indicating 6 FAILed tests.
 

Step 2: With 0patch enabled, press F5 to refresh the test page in Internet Explorer 11. The web page should look like the image below, indicating no failed tests.


 

According to our guidelines, this micropatch is free for everyone until Microsoft issues an official fix for it. By the time you're reading this the micropatch has already been distributed to all online 0patch Agents and also automatically applied except where Enterprise policies prevented that. If you're not a 0patch user and would like to use this micropatch on your computer(s), create an account in 0patch Central, install 0patch Agent and register it to your account. Note that no computer restart is needed for installing the agent or applying/un-applying any 0patch micropatch.
 
We'd like to thank ENKI researchers for their analysis of the vulnerability and an elegant proof-of-concept, which allowed us to create a micropatch.

While you're here: If your organization has Windows 7 or Server 2008 R2 machines with Extended Security Updates and wouldn't mind saving lots of money on less expensive security patches in 2021 that don't even need your machines to be restarted, proceed to our New Year's Resolution. The same applies if you're still using Office 2010 and want to keep patching critical vulnerabilities now that support has ended.

To learn more about 0patch, please visit our Help Center.  

Analyzing And Micropatching With Tetrane REVEN (Part 1, CVE-2021-26897)

 

by Mitja Kolsek, the 0patch Team


March 2021 Windows Updates included fixes for seven vulnerabilities in Windows DNS Server, two of which were marked by Microsoft as "Exploitation More Likely": CVE-2021-26877 and CVE-2021-26897. They were not known to be exploited and no details were publicly available until security researchers Eoin Carroll and Kevin McGrath published their analysis on McAfee Labs blog. Their article included enough information for us to reproduce both vulnerabilities, and then create micropatches for them.

This article will be about CVE-2021-26897, while CVE-2021-26877 will be covered in a parallel article.

These two vulnerabilities were the first we have ever analyzed with the help of Tetrane REVEN, an incredibly powerful reverse engineering tool that allows you to record a virtual machine and then browse or search through all recorded instructions in all processes and the kernel, or taint any data value forward or back in time, across processes and between user and kernel space (plus much more). I wanted to use this opportunity to show how REVEN helped us perform these analyses, which would otherwise have been done with WinDbg and countless re-launching of dns.exe process, having all interesting objects bouncing around on different memory addresses every time.


Analysis

CVE-2021-26897 is a buffer overflow issue, whereby a series of oversized "dynamic update" DNS queries with SIG (signature) records causes writing beyond the buffer boundary when these records are saved to file. DNS server periodically saves all received updates to file (so they don't get lost on restart or crash), and the issue gets triggered by simply waiting for this file write to happen after sending a number of requests, or by stopping the DNS Server service, which was a convenient time saver for us.

Our proof-of-concept (POC) sends ten malformed DNS requests, and upon stopping the DNS Server service, the dns.exe process crashes. Let's use REVEN to see what goes on.

We used a virtual machine with Windows Server 2019 and DNS Server role without March updates to keep it in the vulnerable state. It is important to record as few events (called "transitions") as possible, so "lightening" of a machine - stopping unnecessary services, and disabling Windows Defender and Windows Error Reporting - is generally a good idea. We did not stop any services but did the latter as error reporting gets triggered upon every process crash and, well, executes a lot of instructions. While it's easiest to start and stop recording manually, even one second of extra recording can create tens of millions of unneeded transitions that will just slow down post-processing of the recording. To optimize start and end of recording, REVEN provides a cool trick they call "ASM stubs", which is a fancy name for calling the int3 CPU instruction while having the ecx register set to some magic value. In other words, you can trigger starting and stopping of recording from within the virtual machine you're recording, which means you can start right before the interesting stuff happens, and stop right after it's done.

In our case, the mere sending of malformed DNS requests does not crash the process, but the stopping of the DNS Server service does. So we used our POC to send the requests, and then launched a batch script that started the recording and stopped the service. Before that, we have "abused" the Postmortem Debugging mechanism to make it launch a small executable that stops recording whenever a process crashes - instead of launching a debugger, which postmortem debugging was designed for.

Our recording generated around 1.25 billion transitions.Yes, that's billion. But it's really no problem because REVEN handles that effortlessly. The only price you pay for a larger recording is the time you have to wait for REVEN to "replay" it, which extracts all machine states and transitions, indexes them, extracts everything that was happening to the memory, downloads symbol files and a bunch of other things to assure a swift analysis thereafter. Our replay took just over an hour and generated 55 GB of data, upon which the analysis would then actually be done.

Granted, we could have tried to further reduce the recording and possibly succeed, but that doesn't make too much sense - one hour for preparation while you're having lunch is more than acceptable, and fiddling with recording optimization also takes time that can quickly exceed the time saved by a potentially quicker replay.

Proceeding to "Axion", the analysis user interface where the magic stuff happens. Axion has multiple widgets; prominently positioned in the middle are Transitions, displaying a small part of the entire recorded execution conveniently grouped in code blocks with one or more CPU instructions. Other widgets include Backtrace (the call stack for the current transition), CPU (register values before and after the current transition), Memory (at chosen address, before or after the current transition, along with the entire history of read and write accesses), Search (immensely useful, allows you to find calls to specific function, or all executions of a specific instruction), Bookmarks, and - my favorite - Taint. REVEN allows you to select any piece of data (e.g., the current value in register r8, or the value at some memory address) and taint it either forward or back in time, to see what this value affects or where it came from, respectively. This is a huge value-add for our analyses - although by no means the only one.

Now let's dive into the analysis. The first thing we need to do is find where dns.exe crashed. Scrolling through a billion+ transitions is obviously out of the question, but we can search for one of the functions that get called when an unhandled exception is thrown. KiUserExceptionDispatch is one such function.

 

 

Search finds a single hit, as expected, and here we are looking at the first executed instruction in this user-space function, as it was launched by kernel's KiExceptionDispatch. (The kernel code is on grey background because we filtered out only user-space execution.) Now we want to see why this exception was thrown. A simple press of the "%" key transfers our to the other end of a call-ret pair, or in our case, to the other end of the kernel call.

 

The "%" key landed us on the exact instruction that caused access violation in function Dns_SecurityKeyToBase64String. It was an attempted write to address r8+1, and the Memory view shows this address to be immediately after a valid memory block, where we see a bunch of A's that were likely written to this buffer in some loop we're probably currently in. We did not use REVEN-IDA integration here but if we did, we would immediately see the code graph for this function in IDA, with the current instruction selected in IDA. And we would see that we are, indeed, in a loop that copies base64-encoded signature value from our DNS request to this buffer that just got overflown, and uses r8 as the destination pointer that gets increased in every iteration. 

(Note that we enabled Page Heap for dns.exe to make it crash immediately on buffer overflow, otherwise the overflow could just corrupt the heap and eventually cause some random malfunction. With Page Heap, every heap buffer is allocated at the very end of a read-write memory block, followed by unallocated memory page - which means any typical buffer overflow will immediately trigger an access violation exception.)

Execution of a loop is shown in REVEN as a repetition of loop instructions, over and over again, but you can of course select any of these instructions and see how registers and memory looked like at that exact moment. What we want here, though, is to see where the buffer was allocated.

If we were in WinDbg, we'd have used the !heap command to get the call stack from the moment the overflown buffer was allocated. In REVEN, however, we can not only find the code that allocated the buffer, but also values of registers and content of memory in that precise moment. The simplest way (that I know of) would be to simply taint register r8 to see where its value came from - going back to the past far enough, at some point its value must have been determined by whoever allocated this buffer, or it would not have pointed to the end of this same buffer now.

However, tainting r8 backward produces too long a path that just keeps bouncing in the loop that we're in. While it eventually gets to where we want to be, it slows down the UI. So our first goal is to get to the beginning of our loop's execution. We copied the address of our access-violation-triggering instruction (0x7ff729ca224a) and went to the Search widget, where we searched for all uses of this address.



Results: this exact instruction has been executed 130705 times in our recording; in other words, it would take a significant chunk of one's lifetime to just scroll up to the start of the loop. However, it only took one press of the "Next" button to get to the first iteration of the loop - because we were already positioned on the last iteration.

 



 

This got us to the very first execution of our instruction in this recording, and we can see that it wrote 0x41 to memory (to the very same buffer it overflowed 130704 iterations later). The 0x41 left to it was written earlier by a similar instruction in the same loop. Now that we're out of the loop, so to speak, let's taint r8 and see where it got from.



We launched tainting for value of register r8 from the current transition, back to some function we selected high in the call stack; we chose function Zone_WriteBackDirtyZones. Why not taint all the way back to the very first transition? Because we're only interested in who has allocated the memory buffer that got overflown, but the taint would go way further back in time because the address of this allocation was influenced by earlier allocations (that's how the heap works) and that is just irrelevant to us.

Tainting identified a couple of hundred transitions, and we're interested in those at the very end of the list (i.e., the earliest ones). When one is often looking for memory allocations, some familiar Windows functions catch one's eye - and RtlAllocateHeap is one of them.



RtlAllocateHeap takes three arguments, allocation size being the third one - which in x64 calling convention means register r8. We can see that the value of r8 when this function was called was 0x80010. This means that the actual buffer allocated was of this size, but we still want to see if this is a hard-coded size or perhaps dynamically calculated. So we go further back in time through the taint results.



In the very first transition found with tainting, inside funtion File_WriteZoneToFile, we see a call was made to a function Mem_Alloc, which is not publicly documented but is clearly used for allocating the memory block we're after. Most importantly, we can see that a hard-coded value 0x80000 was provided to it, which is clearly the size of the buffer. (0x10 was subsequently added to it inside Mem_Alloc, which is nicely seen by walking through the taint.)

Now we know what happened: a fixed-sized memory block was allocated in function File_WriteZoneToFile, whose name implies that the DNS zone we've updated with our malformed requests was going to be written to file. At some point function Dns_SecurityKeyToBase64String was called that overflowed this buffer after base64-encoding the signature from our requests. Let's just see if function Dns_SecurityKeyToBase64String was called more than once, as we know that a single malformed request doesn't produce the crash.

To do that, we used the "Symbol call" search and provided the function name.



The search produced 6 hits, meaning that function Dns_SecurityKeyToBase64String was called 6 times. This indicates that it was our 6th DNS request that finally caused the writing to go beyond the buffer end. Some additional analysis showed that all these calls added their output to the same growing string inside the fixed-size buffer, which was supposed to be finally saved to the zone file.

Our REVEN analysis was done here. At this point we could have created a micropatch in various ways, making sure to prevent writing past the buffer end, but since we had Microsoft's official patches available, we wanted to see what they have done.

We used IDA and BinDiff to compare dns.exe from February 2021 and March 2021, and found 9 functions modified by the March update.



Technically, we could just search our recording for any execution inside these functions, and hopefully find that just one of them was executed - which would mean that the fix was included there. Actually, we did exactly that and found SigFileWrite to be the only one - but this would be very cumbersome if hundreds of functions were modified, especially if the recording included many of the modified functions that have nothing to do with our bug.

The most reliable method would be to inspect the entire taint list to see which of the modified functions affected the value we were tainting. Taint search is currently not supported by REVEN user interface, but we could undoubtedly use the API to achieve that (note to self: send a feature request to Tetrane). We're not that fluent in REVEN API yet so we took the third route: the call stack.

It is quite likely that one of the modified functions would appear in the call stack of our crash instruction. But in our case we don't see it (see the call stack on one of the screenshots above). We do see, however, that function Dns_SecurityKeyToBase64String seems to have been called by function LdrpDispatchUserCallTarget, which has a suspect name. Let's just click on that in REVEN and see what happened there.

 


We see that function RR_WriteToFile made a call to LdrpDispatchUserCallTarget, which then made a jump to SigFileWrite function. The latter is executed, but not seen in the call stack because LdrpDispatchUserCallTarget jumped to it instead of calling. Note that function LdrpDispatchUserCallTarget is part of Control Flow Guard as explained in this Trend Micro article. So whenever you see LdrpDispatchUserCallTarget in the call stack, you'll want to look for which function it jumped to.

Now that we have our "patch suspect", let's see how the original (February) and modified (March) versions of SigFileWrite function compare in BinDiff.



We see that four sanity checks were added, perhaps excessively but efficiently:

  1. If length of the SIG record is less than 0x12 (minimum possible), then exit.
  2. If length of the SIG record, subtracted by 0x12, is less than length of signer's name, then exit.
  3. If pointer to the end of the string (where it will be appended) is larger than end of buffer, then exit.
  4. The final, most complex-looking check, is compiler's "artistic rendition" of multiplication of signature length by 4/3, which is the number of characters that base64-encoding will require. If signature length multiplied by 4/3 is larger than the difference between string end pointer and end of buffer, then exit.

These checks make sure that the buffer will not get overflown, and will silently prevent DNS update records from being written to the zone file if end of buffer has been reached.

 

The Micropatch

Our micropatch does logically the same as Microsoft's, but it also adds an Exploit Blocked alert and log entry in case the buffer would have been overflown, as this would highly likely be a result of an exploitation attempt instead of something that would occur under normal circumstances.



MODULE_PATH "..\Affected_Modules\dns.exe_6.1.7601.24437_64bit\dns.exe"
PATCH_ID 597
PATCH_FORMAT_VER 2
VULN_ID 6993
PLATFORM win64

patchlet_start
    PATCHLET_ID 1
    PATCHLET_TYPE 2
    PATCHLET_OFFSET 0x7be32
    N_ORIGINALBYTES 5
    JUMPOVERBYTES 21            ; eliminate some original instructions that we'll add back in
    PIT dns.exe!0x7be6c
    
    code_start
        movzx eax, word[rdi+0Eh]    ; SIG request length
        cmp ax, 12h            ; is length below 12h?
        jb EXPLOIT_EXIT            ; packet is not valid - jump to exploit exit.
       
        movzx r8d, byte[rdi+42h]    ; get signer's name length (A.mal in the exploit example)
        sub ax, 12h            ; sub 12h from SIG request length
        lea ecx, [r8+2]            ; add 2 to signers name length
        cmp ax, cx            ; compare SIG request length and signer's name length
        jb EXPLOIT_EXIT            ; if signers name length is longer, jump to exploit exit.
       
        sub ax, cx            ; sub signer's name length from SIG request length
        cmp rsi, r11            ; r11 = sprintfSafeA return (end of string), rsi = end of buffer
        jb EXPLOIT_EXIT            ; jump to exploit exit if rsi < r11
       
        movzx r10d, ax            ; r10 = SIG request length - signer's name length
        mov eax, 0x55555556        ; eax = 0x100000000 / 3
        imul r10d            ; edx:eax = r10 * 0x100000000 / 3 (rdx = r10 / 3)
        mov eax, edx            ; eax = r10 / 3
        shr eax, 1Fh            ; eax = lowest bit of (r10 / 3)
        add edx, eax            ; add the lowest bit of eax ro edx (i.e., round up the division)
        lea eax, [4+rdx*4]        ; eax = 4*(rdx+1) - length of the base64-encoded string
        movsxd rcx, eax            ; rcx = length of the base64-encoded string
        mov rax, rsi            ; rax = end of buffer
        sub rax, r11            ; rax = end of buffer - end of existing string (i.e., space available)
        cmp rax, rcx            ; is space available less than length of base64-encoded string?
        jl EXPLOIT_EXIT            ; if so, go to exploit exit
       
        movzx eax, byte[rdi+42h]    ; original code preparing arguments for Dns_SecurityKeyToBase64String call
        lea rcx, [rax+rdi+44h]
        mov edx, r10d
        add rcx, r8
        mov r8, r11
        jmp END
       
    EXPLOIT_EXIT:
        mov r11, 0            ; r11 is later moved to rax which is the return of the function
        call PIT_ExploitBlocked        ; we call Exploit Blocked
        jmp PIT_0x7be6c            ; jump to last function block
    
    END:                    ; continue normal execution
    code_end
    
patchlet_end


Here's a video of the micropatch in action. You can see that without our micropatch, the POC, launched by a local non-admin user, successfully gets the DNS Service to crash (manually stopping the service just makes the crash happen earlier so we don't have to wait). This could be leveraged to remote arbitrary code execution of attacker's code as Local System. With our micropatch applied, the POC is blocked because the corrected code prevents writing beyond the allocated memory buffer.




We created this micropatch for the following Windows versions:

  1. Windows Server 2008 R2 without Extended Security Updates, updated to January 2020
  2. Windows Server 2008 R2 with year 1 of Extended Security Updates, updated to January 2021

According to our guidelines, this micropatch requires a 0patch PRO license. By the time you're reading this, the micropatch has already been distributed to all licensed online 0patch Agents and also automatically applied except where Enterprise policies prevented that. If you're not a 0patch user and would like to use this micropatch on your computer(s), create an account in 0patch Central, install 0patch Agent and register it to your account with appropriate amount of PRO licenses. Note that no computer restart is needed for installing the agent or applying/un-applying any 0patch micropatch.

We'd like to thank Eoin Carroll and Kevin McGrath for their analysis of the vulnerability, which allowed us to create a micropatch.



While you're here: If your organization has Windows 7 or Server 2008 R2 machines with Extended Security Updates and wouldn't mind saving lots of money on less expensive security patches in 2021 that don't even need your machines to be restarted, proceed to our New Year's Resolution. The same applies if you're still using Office 2010 and want to keep patching critical vulnerabilities now that support has ended.

To learn more about 0patch, please visit our Help Center.