Security is now a strong differentiator in picking the right browser. We all use browsers for day-to-day activities like staying in touch with loved ones, but also for editing sensitive private and corporate documents, and even managing our financial assets. A single compromise through a web browser can have catastrophic results. It doesn’t help that browsers are also on their way to becoming some of the most complex pieces of consumer software in existence, increasing potential attack surface.
Our job in the Microsoft Offensive Security Research (OSR) team is to make computing safer. We do this by identifying ways to exploit software, and working with other teams across the company on solutions to mitigate attacks. This workflow typically involves identifying software vulnerabilities to exploit. However, we believe that there will always be more vulnerabilities to find, so that isn’t our primary focus. Instead, our job is really all about asking: assuming a vulnerability exists, what can we do with it?
We’ve so far had success with this approach. We have helped improve the security of several Microsoft products, including Microsoft Edge. We continue to make strides in preventing both Remote Code Execution (RCE) with mitigations like Control Flow Guard (CFG), export suppression, and Arbitrary Code Guard (ACG), and isolation, notably with Less Privileged AppContainer (LPAC) and Windows Defender Application Guard (WDAG). Still, we believe it’s important for us to validate our security strategy. One way we do this is to look at what other companies are doing and study the results of their efforts.
For this project, we set out to examine Google’s Chrome web browser, whose security strategy shows a strong focus on sandboxing. We wanted to see how Chrome held up against a single RCE vulnerability, and try to answer: is having a strong sandboxing model sufficient to make a browser secure?
Some of our key findings include the following:
- Our discovery of CVE-2017-5121 indicates that it is possible to find remotely exploitable vulnerabilities in modern browsers
- Chrome’s relative lack of RCE mitigations means the path from memory corruption bug to exploit can be a short one
- Several security checks being done within the sandbox result in RCE exploits being able to, among other things, bypass Same Origin Policy (SOP), giving RCE-capable attackers access to victims’ online services (such as email, documents, and banking sessions) and saved credentials
- Chrome’s process for servicing vulnerabilities can result in the public disclosure of details for security flaws before fixes are pushed to customers
Finding and exploiting a remote vulnerability
Identifying the bug
One of the disadvantages of fuzzing, compared to manual code review, is that it’s not always immediately clear what causes a given test case to trigger a vulnerability, or if the unexpected behavior even constitutes a vulnerability at all. This is especially true for us at OSR; we have no prior experience working with V8 and therefore know fairly little about its internal workings. In this instance, the test case produced by ExprGen reliably crashed V8, but not always in the same way, and not in a way that could be easily influenced by an attacker.
Looking at the address the crash occurs at (0x000002d168004f14), we can tell it does not happen within a static module. It must therefore be in code dynamically generated by V8‘s Just-In-Time (JIT) compiler. We also see that the crash happens because the rax register is zero.
At first glance, this looks like a classic null dereference bug, which would be a let-down: those are typically not exploitable because modern operating systems prevent the zero virtual address from being mapped. We can look at surrounding code in order to get a better idea of what might be going on:
The second piece of information we can glean from the disassembly above is that the zero value contained in register rax at the time of the crash is loaded from memory. It also looks like the value that should have been loaded is being passed to the toString function call as a parameter. We can tell that it’s being loaded from [rdi + 0x18]. Based on that, we can take a look at that piece of memory:
This doesn’t yield very useful information. We can see that most of these values are pointers, but that’s about it. However, it’s useful to know where the value (which is meant to be a pointer) is loaded from, because it can help us figure out why this value is zero in the first place. Using the WinDbg‘s newly public Time Travel Debugging (TTD) feature, we can place a memory write breakpoint at that location (baw 8 0000025e`a6845dd0), then place an execution breakpoint at the start of the function, and finally re-run the trace backwards (g-).
Interestingly, our memory write breakpoint doesn’t trigger, meaning that this memory slot does not get initialized in this function, or at least not before it’s used. This might be normal, but if we play around with the test case, for example by replacing the o.b.bc.bca.bcab = 0; line with o.b.bc.bca.bcab = 0xbadc0de;, then we start noticing changes to the memory region where our crash value originates:
We see that our 0xbadc0de constant value ends up in that memory region. Although this doesn’t prove anything, it makes it seem likely that this memory region is used by the JIT-compiled function to store local variables. That idea is reinforced by how the code from earlier looked like the value we crash trying to load was being passed to Object.toString as a parameter.
Combined with the fact that TTD confirmed that this memory slot is not initialized by the function, a possible explanation is that the JIT compiler is failing to emit code that would initialize the pointers representing the object members used to access the o.b.ba.bab field.
To confirm this, we can run the test case in D8 with the –trace-turbo and –trace-turbo-graph parameters. Doing so will cause D8 to output information about how TurboFan, V8‘s JIT compiler, goes about building and optimizing the relevant code. We can use this in conjunction with turbolizer to visualize the graphs that TurboFan uses to represent and optimize code.
TurboFan works by applying various optimization phases to the graph one after the other. About half-way through the optimization pipeline, after the Load elimination optimization phase, this is what our code’s flow looks like:
It’s fairly straightforward: the optimizer apparently inlined func0 into the infinite loop, and then pulled the first loop iteration out. This information is useful to see how the blocks relate to each other. However, this representation omits nodes that correspond to loading function call parameters, as well as the initialization of local variables, which is the information we’re interested in.
Thankfully, we can use turbolizer‘s interface to display those. Focusing on the second Object.toString call, we can see where the parameter comes from, as well as where it’s allocated and initialized:
(NOTE: node labels were manually edited for readability)
At this stage in the optimization pipeline, the code looks perfectly reasonable:
- A memory block is allocated to store local object o.b.ba (node 235), and its fields baa and bab are initialized
- A memory block is allocated to store local object o.b (node 259), and its fields are all initialized, with ba specifically being initialized with a reference to the previous o.b.ba allocation
- A memory block is allocated to store local object o (node 303), and its fields are all initialized
- Local object o‘s field b is overwritten with a reference to object o.b (node 185)
- Local object field o.b.ba.bab is loaded (nodes 199, 209, and 212)
- The Object.toString method is called, passing o.b.ba.bab as first argument
Code compiled at this stage in the optimization pipeline looks like it shouldn’t exhibit the uninitialized local variable behavior that we’re hypothesizing is the root cause of the bug. That being said, certain aspects of this representation do lend credence to our hypothesis. Looking at nodes 209 and 212, which load o.b.ba and o.b.ba.bab, respectively, for use as a function call parameter, we can see that the offsets +24 and +32 correspond to the disassembly of the crashing code:
As we still don’t have an explanation for the bug, we keep looking through optimization passes until we find something strange. This happens after the Escape analysis pass. At that point, the graph looks like the following:
There are two notable differences:
- The code no longer goes through the trouble of loading o and then o.b—it was optimized to reference o.b directly, probably because that field’s value is never changed
- The code no longer initializes o.b.ba; as can be seen in the graph, turbolizer grays out node 264, which means it is no longer live, and therefore won’t be built into the final code
Looking through all the live nodes at this stage seems to confirm that this field is no longer being initialized. As another sanity check, we run d8 on this test case with the flag –no-turbo-escape in order to omit this optimization phase: d8 no longer crashes, confirming that this is where the issue stems from. In fact, that turned out to be Google’s fix for the bug: completely disable the escape analysis phase in v8 6.1 until the new escape analysis module was ready for production in v8 6.2.
With all this information about the bug’s root cause in hand, we need to find ways to exploit it. It looks like it could be a very powerful bug, but it depends entirely on our ability to control the uninitialized memory slot, as well as how it ends up being used.
Getting an info leak
At this point, the easiest way to get an idea of what we can or can’t do with the bug is simply to play around with the test case. For example, we can look at the effect of changing the type of the field we’re loading from an uninitialized pointer:
The result is that the field is now loaded directly as a float, rather than an object pointer or SMI:
Similarly, we can try adding more fields to the object:
Running this, we get the following crash:
This is interesting, because it looks like adding fields to the object modifies the offsets from where the object fields are loaded. In fact, if we do the math, we see that (0x67 – 0x1f) / 8 = 9, which is exactly the number of fields we added from o.b.ba.bab. The same applies to the new offset that rbx is loaded from.
Playing around with the test case a bit more, we are able to confirm that we have extensive control over the offset where the uninitialized pointer is loaded from, even though none of these fields are being initialized. At this point, it would be useful to see if we can place arbitrary data into this memory region. The earlier test with 0xbadc0de seemed to indicate that we could, but the offset appeared to change with each run of the test case. Often, exploits get around that by spraying values. The rationale is that if we can’t accurately trap our target to a given location, we can instead just make our target bigger. In practice, we can try spraying values by using inline arrays:
Looking at the crash dump, we see:
The crash is essentially the same as previously, but if we look at the memory where our uninitialized data is coming from, we see:
We now have a big block of arbitrary script-controlled values at an offset from r11. Combining this observation with the previous one about offsets, we can come up with something even better:
The result is that we are now dereferencing a float value arbitrary from an arbitrary address:
This, of course, is extremely powerful: it immediately results in an arbitrary read primitive, impractical as it may be. Unfortunately, an arbitrary read primitive without an initial info leak is not that useful: we need to know which addresses to read from in order to make use of it.
As the variable v can be anything we want it to be, we can replace it with an object, and then read its internal fields. For example, we can replace the call to Object.toString() with a custom callback, and replace v with a DataView object, reading back the address of that object’s backing store. This produces a way for us to locate fully script-controlled data:
The above code returns (modulo ASLR):
Using WinDbg we can validate that this is indeed the backing store for our buffer:
Building an arbitrary read/write primitive
This places an SMI representing the integer 0xbadc0de at the location where our arbitrary object pointer would be loaded from. Since we didn’t set the least significant bit, it will be interpreted by V8 as an inline integer:
As expected, V8 prints the following output:
Given this, we have the ability to create arbitrary objects. From there, we can put together a convenient arbitrary read/write primitive by creating fake DataView and ArrayBuffer objects. We again place our fake object data at a known location using WinDbg:
As expected, the call to DataView.prototype.setUint32 triggers a crash, attempting to write value 0xdeadcafe to address 0x00000badbeefc0de:
Controlling the address where the data will be written to or read from is just a matter of modifying the obj.arraybuffer.backing_store slot populated through WinDbg. Since in the case of a real exploit that memory would be part of the backing store of a real ArrayBuffer object, doing so wouldn’t be difficult. For example, a write primitive might look like this:
Achieving Arbitrary Code Execution
Neither of those techniques would be directly applicable to Microsoft Edge, which features both CFG and ACG. ACG, which was introduced in Windows 10 Creators Update, enforces strict Data Execution Prevention (DEP) and moves the JIT compiler to an external process. This creates a strong guarantee that attackers cannot overwrite executable code without first somehow compromising the JIT process, which would require the discovery and exploitation of additional vulnerabilities.
CFG, on the other hand, guarantees that indirect call sites can only jump to a certain set of functions, meaning they can’t be used to directly start ROP execution. Creators Update also introduced CFG export suppression, which significantly reduced the set of valid CFG indirect call targets by removing most exported functions from the valid target set. All these mitigations and others make the exploitation of RCE vulnerabilities in Microsoft Edge that much more complex.
The dangers of RCE
Being a modern web browser, Chrome adopts a multi-process model. There are several process types involved: the browser process, the GPU process, and renderer processes. As its name indicates, the GPU process brokers interactions between the GPU and all the processes that need to use it, while the browser is the global manager process that brokers access to everything from file system to networking.
This is an interesting observation, because it indicates that renderer processes are not locked down to any single origin. This means that achieving arbitrary code execution within a renderer process can give attackers the ability to access other origins. While attackers gaining the ability to bypass the Single Origin Policy (SOP) in such a way may not seem like a big deal, the ramifications can be significant:
- Attackers can steal saved passwords from any website by hijacking the PasswordAutofillAgent interface.
- Attackers can navigate to any website in the background without the user noticing, for example, by creating stealthy pop-unders. This is possible because many user-interaction checks happen in the renderer process, with no ability for the browser process to validate. The result is that something like ChromeContentRendererClient::AllowPopup can be hijacked such that no user interaction is required, and attackers can then hide the new windows. They can also keep opening new pop-unders whenever one is closed, for example, by hooking into the onbeforeunload window event.
A better implementation of this kind of attack would be to look into how the renderer and browser processes communicate with each other and to directly simulate the relevant messages, but this shows that this kind of attack can be implemented with limited effort. While the democratization of two-factor authentication mitigates the dangers of password theft, the ability to stealthily navigate anywhere as that user is much more troubling, because it can allow an attacker to spoof the user’s identity on websites they’re already logged into.
This kind of attack drives our commitment to keep on making our products secure on all fronts. With Microsoft Edge, we continue to both improve the isolation technology and to make arbitrary code execution difficult to achieve in the first place. For their part, Google is working on a site isolation feature which, once complete, should make Chrome more resilient to this kind of RCE attack by guaranteeing that any given renderer process can only ever interact with a single origin. A highly experimental version of this site isolation feature can be enabled by users through the chrome://flags interface.
We responsibly disclosed the vulnerability that we discovered along with a reliable RCE exploit to Google on September 14, 2017. The vulnerability was assigned CVE-2017-5121, and the report was awarded a $7,500 bug bounty by Google. Along with other bugs our team reported but didn’t exploit, the total bounty amount we were awarded was $15,837. Google matched this amount and donated $30,000 to Denise Louie Education Center, our chosen organization in Seattle. The bug tracker item for the vulnerability described in this article is still private at time of writing.
Servicing security fixes is an important part of the process and, to Google’s credit, their turnaround was impressive: the bug fix was committed just four days after the initial report, and the fixed build was released three days after that. However, it’s important to note that the source code for the fix was made available publicly on Github before being pushed to customers. Although the fix for this issue does not immediately give away the underlying vulnerability, other cases can be less subtle.
For example, this security bug tracker item was also kept private at the time, but the public fix made the vulnerability obvious, especially as it came with a regression test. This can be expected of an open source project, but it is problematic when the vulnerabilities are made known to attackers ahead of the patches being made available. In this specific case, the stable channel of Chrome remained vulnerable for nearly a month after that commit was pushed to git. That is more than enough time for an attacker to exploit it. Some Microsoft Edge components, such as Chakra, are also open source. Because we believe that it’s important to ship fixes to customers before making them public knowledge, we only update the Chakra git repository after the patch has shipped.
Our strategies may differ, but we believe in collaborating across the security industry in order to help protect customers. This includes disclosing vulnerabilities to vendors through Coordinated Vulnerability Disclosure (CVD), and partnering throughout the process of delivering security fixes.
Microsoft Offensive Security Research team
Talk to us