Friday, November 17, 2017

Did Microsoft Just Manually Patch Their Equation Editor Executable? Why Yes, Yes They Did. (CVE-2017-11882)

And They Did an Absolutely Stellar Job

by Mitja Kolsek, the 0patch Team


[Update 11/21/2017]  Today Embedi published their proof-of-concept exploit, which allowed us to see exactly where Microsoft's manual patch blocks it. Contrary to this article's original claim that CVE-2017-11882 was patched in function 4164FA while six other buffer overflow checks we found were for some other attack vectors, it is actually one of those six checks that blocks Embedi's exploit. This article has been slightly corrected to reflect that. In addition, we have now been able to create a micropatch for Equation Editor that also blocks all exploits targeting the vulnerability found by Embedi. All Internet-connected computers with a registered 0patch Agent running have already received this micropatch and have it automatically applied whenever Equation Editor is launched. [End update 11/21/2017]


A Pretty Old Executable

The recent Patch Tuesday brought, among other things, a new version of "old" Equation Editor, which introduced a fix for a buffer overflow issue reported by Embedi.

The "old" Equation Editor is an ancient component of Microsoft Office (Office now uses an integrated Equation Editor), which is confirmed by looking at the properties of the unpatched EQNEDT32.EXE:




We can see that File version is 2000.11.9.0 (implying it was built in 2000), while Date modified is in 2003, which matches the time of its signature (signing modifies the file as the signature is attached to it). Furthermore, the TimeDateStamp in its PE header (3A0ACEBF), which the linker writes into the executable module when building it, indicates that the file was built on November 9, 2000 - exactly matching the date in the above version number.
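The TimeDateStamp field counts seconds since January 1, 1970 (UTC), so the hex value quoted above can be checked by hand. Here is a minimal Python sketch of that conversion; the helper name is ours and purely illustrative.

```python
from datetime import datetime, timezone


def decode_pe_timestamp(hex_value: str) -> datetime:
    """Interpret a PE TimeDateStamp (seconds since 1970-01-01 UTC)."""
    return datetime.fromtimestamp(int(hex_value, 16), tz=timezone.utc)


built = decode_pe_timestamp("3A0ACEBF")
print(built.date())  # 2000-11-09
```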


We're therefore safe to claim that the vulnerable EQNEDT32.EXE has been with us since 2000. That's 17 years, which is a pretty respectable life span for software!

So now a vulnerability was reported in this executable and Microsoft spawned their fixing procedure: they reproduced the issue using Embedi's proof-of-concept, confirmed it, took the source code, fixed the issue in the source code, re-built EQNEDT32.EXE, and distributed the fixed version to Office users, who now see version 2017.8.14.0 under its properties.

At least that's how it would work for most other vulnerabilities. But something was different here. For some reason, Microsoft didn't fix this issue in the source code - but rather by manually patching the binary executable.


Manually Patching an EXE?

Really, quite literally, some pretty skilled Microsoft employee or contractor reverse engineered our friend EQNEDT32.EXE, located the flawed code, and corrected it by manually overwriting existing instructions with better ones (making sure to only use the space previously occupied by original instructions).

How do we know that? Well, have you ever met a C/C++ compiler that would put all functions in a 500+ KB executable at exactly the same address in the module after rebuilding modified source code, especially when these modifications changed the amount of code in several functions?

To clarify, let's look at BinDiff results between the fixed (2017.8.14.0, "primary") and vulnerable version (2000.11.9.0, "secondary") of EQNEDT32.EXE:



If you're diffing binaries a lot, you'll notice something highly peculiar: All EA primary values are identical to EA secondary values of matched functions. Even the matched but obviously different functions listed at the bottom are at the same address in both EQNEDT32.EXE versions.

As we already noted on Twitter, Microsoft modified five functions in EQNEDT32.EXE, namely the bottom-most five functions listed on the above image. Let's look at the most-modified one first, the one at address 4164FA. The patched version is on the left, the vulnerable one on the right.



This function takes a pointer to the destination buffer and copies characters, one by one in a loop, from user-supplied string to this buffer. It is also the very function that Embedi found to be vulnerable in their research; namely, there was no check whether the destination buffer was large enough for the user-supplied string, and a too-long font name provided through the Equation object could cause a buffer overflow.

Microsoft's fix introduced an additional parameter to this function, specifying the destination buffer length. The original logic of the character-copying loop was then modified so that the loop ends not only when the source string end is reached, but also when the destination buffer length is reached - preventing buffer overflow. In addition, the copied string in the destination buffer is zero-terminated after copying, in case the destination buffer length was reached (which would leave the string unterminated).
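To make the described logic concrete, here is a rough Python model of a length-limited, character-by-character copy. It is a sketch of the idea only, not a translation of the patched assembly: the function and parameter names are ours, and we assume the limit reserves one byte for the terminator, which the real code may handle differently.

```python
def bounded_copy(dest: bytearray, src: bytes, dest_len: int) -> int:
    """Copy a zero-terminated string into dest, writing at most dest_len bytes.

    Assumption (ours): dest_len includes room for the terminating zero.
    Returns the number of characters copied, not counting the terminator.
    """
    if dest_len <= 0:
        return 0
    i = 0
    # Stop at the end of the source string OR at the destination limit.
    while i < dest_len - 1 and i < len(src) and src[i] != 0:
        dest[i] = src[i]
        i += 1
    dest[i] = 0  # the copied string is always zero-terminated
    return i


buf = bytearray(b"\xAA" * 8)
copied = bounded_copy(buf, b"A" * 100, 4)  # over-long "font name"
print(copied, bytes(buf))  # bytes past index 3 stay untouched
```

The vulnerable version corresponds to the same loop without the `i < dest_len - 1` condition, which is why an over-long string could run past the end of the buffer.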

Let's look at the code in its text form (again, patched function on left, vulnerable on right):

As you can see, whoever patched this function not only added a check for buffer length in it, but also managed to make the function 14 bytes shorter (and padded the resulting gap before the adjacent function with 0xCC bytes for style points :). Impressive.


Patching The Callers

Moving on. If the patched function got an additional parameter, all those calling it would have to change as well, right? There are exactly two callers of this function, at addresses 43B418 and 4181FA, and in the patched version they both have a push instruction added before the call to specify the length of their buffers, 0x100 and 0x1F4 respectively.
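The size of each added push matters for the space juggling in the callers. The x86 `push imm32` form is opcode 0x68 followed by the 32-bit value in little-endian order, which this small Python sketch encodes for the two buffer lengths (the helper name is ours).

```python
import struct


def encode_push_imm32(value: int) -> bytes:
    """Encode x86 `push imm32`: opcode 0x68 plus a little-endian dword."""
    return struct.pack("<BI", 0x68, value)


for length in (0x100, 0x1F4):
    print(encode_push_imm32(length).hex(" "))
# 68 00 01 00 00
# 68 f4 01 00 00
```

The shorter two-byte `push imm8` form (opcode 0x6A) only holds sign-extended values from -128 to 127, so neither 0x100 nor 0x1F4 fits in it.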

Now, a push instruction with a 32-bit literal operand takes 5 bytes. In order to add this instruction to these two functions while staying within the tight space of the original code (whose logic must also remain intact), the patcher did the following:

For the function at address 43B418, the patched function temporarily stores some value - which it will need later on - in ebx instead of a local stack-based variable, which releases enough bytes for injecting the push instruction. (By the way, additional evidence of manual patching is that while the local variable is no longer used, space for it is still made on the stack; otherwise sub esp, 0x10C would turn into sub esp, 0x108.)





For the other caller, function at address 4181FA, the patched function mysteriously has the push instruction injected without any other modifications to the code that would introduce the needed extra space.


As you can see on the above image, the push instruction is injected at the beginning of the yellow block, and all original instructions in that block are pushed down 5 bytes. But why does this not overwrite 5 bytes of the original code somewhere else? It's as if there were 5 or more unused bytes already in existence just after this block of code that the patcher could safely overwrite.

To solve this mystery, let's look at the code in its text form.


Surprise, the vulnerable version actually had an extra jmp loc_418318 instruction at the end of the modified code block. How convenient! This allows the code in this block to be moved down 5 bytes, making space for the push instruction at the top.

Coincidence? Perhaps, but it looks an awful lot like this code block had already been manually modified in the past, whereby it got shortened by 5 bytes and its last instruction (jmp loc_418318) was left there.


Additional Security Checks

What we've covered so far was related to Embedi's published research and CVE-2017-11882, but is not what blocks Embedi's exploit. The new version of EQNEDT32.EXE has two additional modified functions at addresses 41160F and 4219F0. Let's have a look at them.

In the patched executable, these two functions got a bunch of injected boundary checks for copying to what appear to be 0x20-byte buffers. These checks all look the same: ecx (which is the counter for copying) is compared to 0x21; if it's greater than or equal to that, ecx gets set to 0x20. All these checks are injected right before inlined memcpy operations. Let's look at one of them to see how the patcher made room for the additional instructions.
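Expressed in Python, each of these injected checks amounts to clamping the copy counter. This is a sketch under the assumption that the counter is treated as an unsigned value; the text above does not say whether the actual comparison is signed or unsigned, and the names are ours.

```python
MAX_COPY = 0x20  # apparent size of the destination buffers


def clamp_copy_count(ecx: int) -> int:
    """Model of the injected check: compare ecx to 0x21, cap it at 0x20."""
    if ecx >= MAX_COPY + 1:  # cmp ecx, 0x21 / taken when greater or equal
        ecx = MAX_COPY       # mov ecx, 0x20
    return ecx


print(clamp_copy_count(0x10), clamp_copy_count(0x20), clamp_copy_count(0x500))
# 16 32 32
```

For non-negative counters this is the same as `min(ecx, 0x20)`: anything that fits is copied in full, and anything longer is truncated to the buffer size.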



As shown on the above image, a check is injected before the inlined memcpy code. Note that in 32-bit code, memcpy is typically implemented by first copying blocks of 4 bytes using the movsd (move double word) instruction, while any remaining bytes are then copied using movsb (move byte). This is efficient in terms of performance, but whoever was patching this noticed that some space can be freed by only using movsb, and perhaps sacrificing a nanosecond or two. After doing so, the code remained logically identical but now there was space for injecting the check before it, as well as for zero-terminating the copied string. Again, an impressive and clever hack (and there was still an extra byte to spare - notice the nop?)
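The claim that the byte-only copy is logically identical can be checked with a small Python model of the two instruction sequences. It models the typical compiler-generated pattern described above rather than the exact bytes in EQNEDT32.EXE, and the function names are ours.

```python
def copy_movsd_then_movsb(src: bytes, count: int) -> bytes:
    """Model of the usual inlined memcpy: dwords first, then leftover bytes."""
    out = bytearray()
    pos = 0
    for _ in range(count >> 2):  # rep movsd with ecx = count / 4
        out += src[pos:pos + 4]
        pos += 4
    for _ in range(count & 3):   # rep movsb with ecx = count % 4
        out += src[pos:pos + 1]
        pos += 1
    return bytes(out)


def copy_movsb_only(src: bytes, count: int) -> bytes:
    """Model of the de-optimized copy: one byte per iteration."""
    out = bytearray()
    for pos in range(count):     # rep movsb with ecx = count
        out += src[pos:pos + 1]
    return bytes(out)


source = bytes(range(64))
same = all(
    copy_movsd_then_movsb(source, n) == copy_movsb_only(source, n)
    for n in range(0x21)
)
print(same)  # True
```

Dropping the dword pass removes the shift, the mask and one of the two string instructions, which is the kind of saving that frees bytes for an extra compare-and-cap in front of the copy.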

There are six such length checks in two modified functions, and just one of them is directly responsible for blocking Embedi's exploit. We believe that Microsoft noticed some additional attack vectors that could also cause a buffer overflow and decided to proactively patch the other five memcpys and the patched function we covered earlier.


Final Touches

After patching the vulnerable code and effectively manually building a new version of Equation Editor, the patcher also corrected the version number of EQNEDT32.EXE to 2017.8.14.0, and the TimeDateStamp in the PE header to August 14, 2017 (hex value 5991FA38) - which is just 10 days after Microsoft acknowledged receipt of Embedi's report. (Note however that due to the manual nature of setting these values it's possible that code has been modified after that date.)

[Update 11/20/2017] Another thing Microsoft patched in EQNEDT32.EXE was the "ASLR bit", i.e., they set the IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE flag in the PE optional header structure:







This is good. Enabling ASLR on EQNEDT32.EXE will make it harder to exploit any remaining memory corruption vulnerabilities. For instance, Embedi's exploit would not work with ASLR because it relied on the fact that the call to WinExec would always be present at the same memory address; this allowed them to simply put that address on stack and wait for the ret to do all the work.

Interestingly, Microsoft decided not to also set the IMAGE_DLLCHARACTERISTICS_NX_COMPAT ("DEP") flag, which would prevent code execution from data pages (e.g., from stack). They surely had good reasons, but should any additional vulnerabilities be found and exploited in EQNEDT32.EXE, the exploit will likely include execution of data on stack or heap. [End update 11/20/2017]
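For readers who want to check these two flags themselves, here is a minimal Python sketch that reads the DllCharacteristics field from the raw bytes of a PE file. The flag values and header offsets follow the published PE format; the function names are ours, and error handling is kept to the bare minimum.

```python
import struct

IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040  # the "ASLR bit"
IMAGE_DLLCHARACTERISTICS_NX_COMPAT = 0x0100     # the "DEP" flag


def read_dll_characteristics(image: bytes) -> int:
    """Return the DllCharacteristics word from a PE image's optional header."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    optional_header = pe_offset + 4 + 20  # skip signature and COFF header
    # DllCharacteristics is at offset 70 in both PE32 and PE32+ headers.
    (flags,) = struct.unpack_from("<H", image, optional_header + 70)
    return flags


def describe(flags: int) -> dict:
    """Report which of the two mitigation flags are set."""
    return {
        "aslr": bool(flags & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE),
        "dep": bool(flags & IMAGE_DLLCHARACTERISTICS_NX_COMPAT),
    }
```

Feeding it the contents of an executable, e.g. `describe(read_dll_characteristics(open(path, "rb").read()))`, shows both flags at once. Going by the observations above, the patched EQNEDT32.EXE should report ASLR as set and DEP as not set.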


Conclusion  

Maintaining a software product in its binary form instead of rebuilding it from modified source code is hard. We can only speculate as to why Microsoft used the binary patching approach, but being binary patchers ourselves we think they did a stellar job.

This old Equation Editor is now under the spotlight, and many researchers are likely to start fuzzing it for additional vulnerabilities. If any are found, we'll probably see additional rounds of manual binary patches in EQNEDT32.EXE. While Office has had a new Equation Editor integrated since at least version 2007, Microsoft can't simply remove EQNEDT32.EXE (the old Equation Editor) from Office as there are probably tons of old documents out there containing equations in this old format, which would then become un-editable.

Now how would we micropatch CVE-2017-11882 with 0patch? It would actually be much easier: we wouldn't have to shrink existing code to make room for the injected one, because 0patch makes sure that we get all the space we need. So we wouldn't have to come up with clever hacks like de-optimizing memcpy or finding an alternative place to temporarily store a value for later use. This freedom and flexibility makes developing an in-memory micropatch much easier and quicker than in-file patching, and we believe software vendors like Microsoft could benefit greatly from using in-memory micropatching for fixing critical vulnerabilities.

Oh by the way, Microsoft also updated Office's wwlib.dll this Patch Tuesday, prompting us to port our DDE / DDEAUTO patches to these new versions. 0patch Agent running on your computer will automatically download and apply these new patches without interrupting you. If you don't have 0patch Agent installed yet, we have good news for you: IT'S FREE! Just download, install and register, and you're all set. 


Cheers!

@mkolsek
@0patch

P.S.: If you happen to know the person(s) who did the binary patching of EQNEDT32.EXE, please send them a link to this blog post. We'd like them to know how much we admire their work. Thanks! 










27 comments:

  1. Can you swap primary and secondary in your graphics?

  2. They could not find source code?

    1. This is the most obvious and probably the real reason. 17 year old source code got lost... God knows, maybe they were not even using a source code version control system back then.

  3. Having the patched version on the left has lately become a habit to us (often diffing the latest version with multiple older versions so primary stays open). But wondering, is there some consensus on this in the community?

    1. +1 to the swap -- it's extremely confusing to read: I had to re-read the article after the first comment to make full sense of it.

    2. I've been drinking, and that definitely took too long to figure out...

    3. I haven't heard of any consensus (or seen the question asked before). I typically do Old Left - New Right so I can see what was present when I started and what changes I've made.
      When doing a merge I'll put my changes on the left with the new file on the right when comparing so the destination is on the right.

      That said: without looking it up I don't know how I diff current vs past revisions. (Having checked: I do have past on the left, with current on the right.)

  4. Which Tool were you using to view the Assembly code?

  5. IDA Pro (https://www.hex-rays.com/products/ida/) and BinDiff (https://www.zynamics.com/software.html)

  6. Every graphical diff tool I've ever seen always shows the newer version on the right.

    1. ditto the above, it's like reading left to right, like an x-axis timeline increasing from left to right. Seems a reasonable request.

  7. Replies
    1. Raymond Chen? A highly plausible guess or do you actually know that?

    2. Who else could it be? ;)
      (I have no way of knowing)

  8. The copyright tag in the first screenshot is interesting. Who are "Design Science, Inc."? Perhaps that's why Microsoft doesn't have the source?

  9. Equation Editor was written by a third party vendor (Design Science). Most probably MS did not have the source code.

    1. Design Science continued development of this product into their fully featured MathType software
      https://en.wikipedia.org/wiki/MathType

    2. That's not correct: MathType has been around for a long time: first on the Mac, and then on Windows 2. Microsoft licensed a scaled-down version of MathType, called Equation Editor, for inclusion with some of its products on both Windows and Macintosh.

  10. feh - we were doing this for mfc40.dll and y2k 18 years ago. https://jeffpar.github.io/kbarchive/kb/231/Q231327/

  11. And yes the nop [eax+eax] is a 0F 1F NOP that doesn't work on older CPUs.

  12. Reported a bug to MS in GDI+ Image.Save() about 2 years ago, which is present from XP to Win10. Some encoders do not report an error if write operations fail (e.g. out of disk space) and generate invalid files. MS offered 3 different approaches, how to work around this problem, but refused to accept this being a bug. I got the impression, that they don't want to touch old code. After a lot of arguing (maybe 10 emails) they finally accepted it being a bug. Maybe they are scared that a rebuild with a current tool chain may cause strange side effects. Maybe the old code base contains "dirty tricks", that would cause issues with current tools.

    1. I'm pretty sure that a company like Microsoft not only keeps old sources, but also the relevant tool chains and target OSes (to build and test upon). Otherwise things would be pretty futile and indeed dangerous.

      That doesn't mean, of course, that they aren't reluctant to change old software - every little change could introduce a compatibility issue on the other end.

  13. Technically, Microsoft could just have a replicable build toolchain. This is a thing in safety critical computing, where you should be able to recreate binaries bit-for-bit; usually this means version controlling the source code together with the entire toolchain.

  14. MS had source code (or Design Science compiled it for them) at least as far as around 2005, since a different build comes with Office 2007 and the file is dated 10/04/2005 (also, the TimeDateStamp in both builds corresponds to the file modification date). For whatever reason they reverted to older builds since Office 2010, I think.

    Besides recompilation I see no other differences, version numbers are the same, copyright date is the same etc.

    Discovered that while updating manually Equation Editor on various PCs with different Office versions.

    TL;DR: Office 2007 had a different version of Equation Editor, built in 2005.

  15. I was an engineer at Microsoft working on compatibility fixes and did a lot of these patches. You are very often better off making a very specific tweak to older codebases rather than doing a full build. Minor changes to compilers, other tools, etc are an issue, but the big one is recreating the full test suite for older apps. Doing a professional job releasing software that could be in use by millions of people is a HUGE endeavor to do properly.
