Android Trusted Execution Environment Security Research (1): TEE, TrustZone and TEEGRIS

0x00 Preface

In the past few years, Trusted Execution Environments (TEEs) have become widespread in the Android ecosystem. In this series of articles, we analyze the security of TEEGRIS, the TEE operating system used in the Samsung Galaxy S10, and examine several vulnerabilities in it together with their fixes. All the vulnerabilities discussed in this series were reported to Samsung and had been fixed by the end of 2019.
The goal of our research is to evaluate the security of Samsung's TEE OS and to determine whether it can be attacked to gain runtime control and extract all protected data, for example to decrypt user data. We did not consider the full exploit chain. Instead, we assumed that the attacker already controls the Android environment and focused only on the security of the TEE.

This series of articles builds on our previous presentation at the Riscure Workshop in September 2020.

  • 1. In the first article, we introduce TEEs, TrustZone and TEEGRIS;
  • 2. In the second article, we study vulnerabilities in TAs running in TEEGRIS and exploit one of the TAs to gain runtime control;
  • 3. In the final article, we show how to further escalate privileges and gain access to the full TEE memory.

0x01 Trusted Execution Environment (TEE)

Trusted Execution Environments are designed to provide a secure environment for performing sensitive tasks (e.g. payments, user authentication, user data protection).
The secure environment is isolated from the non-secure or untrusted environment, which is called the Rich Execution Environment (REE). In the case we analyzed, the REE is the Android OS, so in the rest of this series REE and Android can be treated as equivalent.

The TEE operating system usually consists of a kernel with higher privileges and multiple applications with lower privileges, called Trusted Applications (TAs).

TAs are isolated from each other and from the TEE kernel.

This way, if one application is compromised, it cannot compromise other applications or the TEE kernel.

In short, a strong TEE mechanism can achieve the following three types of isolation:

  • 1. Isolation between the TEE and the REE;
  • 2. Isolation between TAs and the TEE kernel;
  • 3. Isolation between TAs.

In order to meet these security requirements, a TEE needs the support of hardware primitives to enforce isolation. Hardware and software must cooperate closely, and their configuration has to be kept consistent at all times.

Broadly speaking, TEE is composed of multiple components, including:

  • 1. TEE-aware hardware;
  • 2. A robust secure boot chain for initializing the TEE software;
  • 3. The TEE OS kernel, used to manage the secure world and the trusted applications;
  • 4. The trusted applications, which provide functionality to the users of the TEE.

In our series of articles, we will focus on 1, 3, and 4, and we will assume that the platform has properly implemented secure boot. We assume that the attacker has control of the REE and can therefore communicate with the TEE, and that the goal of the attack is to completely compromise the TEE.

The TEE kernel usually exposes only a very limited interface, and most of the functionality is implemented by TAs.

So our plan is to first find exploitable vulnerabilities in a TA, and then escalate privileges to the kernel. But before jumping into the disassembler, we first have to look at the ARM extension used to implement the TEE: TrustZone.

0x02 TrustZone for ARM

TrustZone is a hardware architecture developed by ARM that allows software to execute in two domains, secure and non-secure.

This is identified using the "NS" bit, which indicates whether the master is running in secure or non-secure mode. The master mentioned here can be a CPU core or a hardware peripheral (such as a DMA or an encryption engine).

Whether a master is secure can be defined by hardwiring it in the design or by configuration. For example, the secure state of a CPU core can be switched by executing the SMC instruction (more on that later) or by toggling the NS bit in the SCR register.

To define access restrictions for slaves (such as peripherals or memory), TrustZone usually includes two components, the TrustZone Address Space Controller (TZASC) and the TrustZone Protection Controller (TZPC).

TZASC can be used to define security ranges in DRAM. ARM offers two different implementations, the latest being the TZC-400. The figure below outlines how it is usually implemented in an SoC, quoted from the official technical documentation.

Overview of TZASC:

As the figure shows, any access to DRAM first passes through TZASC and is then forwarded to the memory controller. TZASC decides whether to allow the access based on a set of internal rules.

TZASC contains an always-enabled base region (Region 0) that can span the entire DRAM memory range. Additionally, a number of other regions can be defined to which access can be restricted. For each of these regions, the following can be set:

  • 1. Start address and end address.
  • 2. Secure read and write permissions. These permissions apply to any secure master attempting to access the memory range. Note that TZASC has no concept of execute permission; enforcing that is delegated to the MMU.
  • 3. Non-secure ID filtering. Regions can be configured to allow access by non-secure masters. For the TZC-400, bitmasks can be set for read and write permissions, giving fine-grained control over which non-secure masters are allowed to access the memory range.
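The region settings above can be modelled in a few lines of Python. This is only a sketch of the decision logic, not of the real register interface: the `Region` class, the address values and the rule that a higher-numbered region takes precedence over Region 0 are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Region:
    start: int              # first byte covered by the region
    end: int                # last byte covered by the region (inclusive)
    secure_read: bool       # secure masters may read
    secure_write: bool      # secure masters may write
    ns_read_mask: int = 0   # bit i set: non-secure master ID i may read
    ns_write_mask: int = 0  # bit i set: non-secure master ID i may write

def access_allowed(regions, addr, secure, master_id, write):
    """Decide an access using the highest-numbered region covering addr.

    Letting later regions override Region 0 is a simplifying assumption.
    """
    for region in reversed(regions):
        if region.start <= addr <= region.end:
            if secure:
                return region.secure_write if write else region.secure_read
            mask = region.ns_write_mask if write else region.ns_read_mask
            return bool((mask >> master_id) & 1)
    return False

# Hypothetical layout: Region 0 opens all DRAM, Region 1 is a secure carve-out.
REGIONS = [
    Region(0x8000_0000, 0xFFFF_FFFF, True, True, 0xFFFF, 0xFFFF),
    Region(0xBF00_0000, 0xBFFF_FFFF, True, True),
]
```

With this layout, a non-secure master is served by Region 0 everywhere except inside the carve-out, where only secure masters get through.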

TZPC implements a similar concept, but for internal peripherals and SRAM rather than external DRAM. It contains a register (R0size) that specifies the size of the secure on-chip SRAM in units of 4 KB.
Compared with TZASC, it is less flexible, because it only allows one secure region and one non-secure region to be defined: the secure region runs from 0 to the specified size, and the remaining SRAM is treated as non-secure.
In addition, a number of other registers specify, for each peripheral, whether it is secure (accessible only by a secure master). Which peripheral each bit in these TZPC registers corresponds to is not defined by the architecture and is entirely SoC-specific.
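The SRAM split described above boils down to a single comparison. A minimal sketch, assuming offsets are counted from the start of the on-chip SRAM (the function name is ours):

```python
SRAM_UNIT = 4 * 1024  # R0size is expressed in 4 KB units

def sram_is_secure(offset, r0size):
    """True if the SRAM byte offset lies in the secure part [0, r0size * 4 KB)."""
    return 0 <= offset < r0size * SRAM_UNIT
```

For example, with R0size set to 4, the first 16 KB of SRAM are secure and everything above is non-secure.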

Typically, most settings for TZASC and TZPC are configured during initialization and are never changed. However, some of them need to be modified dynamically at runtime.

One example is the Trusted User Interface (TUI) used for secure payments. Take Samsung Pay on the S10 as an example. When the user needs to enter a PIN to authorize a payment, the TEE takes over and directly controls the display and touch sensor. The reasoning is that, since the PIN is sensitive data, the entire process should be handled by the TEE instead of the untrusted Android OS.
Therefore, TZPC must be used to reconfigure the display and touch controller as "secure", so that even an attacker running kernel-level code in Android cannot obtain the PIN. To display an image on the screen, a secure frame buffer needs to be stored in DRAM, so the TEE also uses TZASC to reconfigure a part of DRAM as "secure" and uses it as the frame buffer.

After the user finishes entering the PIN, the TZASC and TZPC settings are restored to their previous values, and Android takes over again.

Transitions between secure and non-secure modes are managed by a component called the "Secure Monitor". The monitor is the main interface between the TEE and the REE, and it is the only component that can modify the security state of a CPU core.

As in the REE, the TEE maintains user-mode/kernel-mode isolation between the kernel and the TAs. The TEE OS is also responsible for loading TAs and passing parameters between the REE and the TAs.

TAs run in user space in the secure world and provide services to the REE.

An ARMv8-A CPU supports four privilege levels in each world, also known as exception levels:

  • (S-)EL0 – User mode/applications
  • (S-)EL1 – Kernel
  • EL2 – Hypervisor
  • EL3 – Secure Monitor

In REE, our Android application runs on EL0, while the Linux kernel runs on EL1.

EL2 exists only in the non-secure world (prior to ARMv8.4-A) and hosts the hypervisor. It was originally designed to manage multiple virtual environments running in parallel at lower privilege levels, but in the Android environment it is often used as a kernel hardening mechanism. This is also the case on Samsung phones, where the hypervisor component is called Real-time Kernel Protection (RKP). Among other things, RKP limits the memory the kernel can access and makes certain kernel structures read-only, which increases the difficulty of kernel exploitation. In the third article in the series, we will analyze RKP in detail.

Finally, let's look at the secure components, which are the targets of our study: EL3 (which always runs in secure mode), S-EL0 and S-EL1. There are multiple ways of implementing a TEE, but by far the most common design is to run a very small component at EL3 that is responsible for switching between the two worlds, a full-fledged kernel at S-EL1, and multiple TAs at S-EL0. Samsung's TEE OS, TEEGRIS, also follows this design.

Although a completely isolated environment is very secure, to be useful it must communicate with the untrusted components running in Android. Communication between the REE and the TEE is triggered by a dedicated instruction called Secure Monitor Call (SMC). Both worlds can execute this instruction, but only when EL > 0, which means that Android applications cannot directly initiate communication with the TEE.

It is often the case that the Linux kernel acts as a proxy and exposes a driver that applications can use to interact with the TEE.

The advantage of this design is that access restriction policies (e.g. using SELinux) can be applied to the driver to ensure that only some applications can communicate with the TEE, thus reducing the attack surface. The same is true for the S10, which allows only a limited set of apps and services to communicate with the TEE.

Note that in our follow-up research, we assume that the attacker is able to communicate with the TEE. This is the case when the phone is rooted with a tool like Magisk, when the attacker has gained runtime control of the Linux kernel, or when the attacker has compromised an Android application or service that is allowed to communicate with the TEE.

Once the SMC instruction is executed, an exception is raised and handled by the secure monitor running at EL3. The SMC handler routes the SMC to the appropriate component. If the monitor can handle the SMC directly, it does so and returns immediately. Otherwise, the request is forwarded to the TEE kernel (running at S-EL1), which either processes it internally or forwards it further to a TA running at S-EL0.
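The routing just described can be summarized as a small decision function. This is a conceptual model only: the function IDs and the two ID sets are hypothetical, and a real monitor dispatches on register contents rather than Python sets.

```python
def route_smc(caller_el, function_id, monitor_ids, kernel_ids):
    """Return the component that ends up serving an SMC request."""
    if caller_el < 1:
        # SMC is not available to user mode (EL0 or S-EL0).
        raise PermissionError("SMC cannot be issued from EL0")
    if function_id in monitor_ids:
        return "EL3 monitor"      # handled directly, returns immediately
    if function_id in kernel_ids:
        return "S-EL1 kernel"     # processed inside the TEE kernel
    return "S-EL0 TA"             # forwarded by the kernel to a TA

# Hypothetical function IDs, for illustration only.
MONITOR_IDS = {0x01}
KERNEL_IDS = {0x10, 0x11}
```

A call such as `route_smc(1, 0x42, MONITOR_IDS, KERNEL_IDS)` models the Linux kernel (EL1) issuing a request that neither the monitor nor the TEE kernel handles itself, so it reaches a TA.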

Now that we understand how TrustZone works, let's analyze how Samsung implements it.

0x03 TEEGRIS

TEEGRIS is a relatively new TEE operating system, first introduced by Samsung on the Galaxy S10 models. Since 2019, most new Samsung phones based on Exynos chips have been running TEEGRIS in the TEE.

Before the launch of the S10 in March 2019, Exynos chips used another TEE OS called Kinibi developed by Trustonic, which was thoroughly analyzed in some previous research articles. However, since TEEGRIS is a relatively new operating system, there isn't much publicly available information online.

In fact, we can only find some usable information from an online article, which gives a good introduction to TEEGRIS and its kernel, mainly explaining how to set up QEMU for fuzzing. Even though we mainly focus on the reverse engineering direction, this article still provides us with some useful information, such as the boot image layout (where TEEGRIS is stored), and how to identify the system calls processed in the kernel.

Based on this information, let's analyze the main components of TEEGRIS: kernel, TA, driver.

As mentioned earlier, the monitor code plays a very important role in TrustZone, but in Samsung's implementation the monitor is stored encrypted in memory.

Therefore, we did not analyze it and instead focused on the other components, all of which are stored in plaintext.

0x04 TEEGRIS kernel

The TEEGRIS kernel is a small component running at S-EL1. Even though it is small, it is not strictly a microkernel. For example, many drivers that can be used by TAs are integrated into it. It runs in 64-bit mode and supports both 64-bit and 32-bit TAs and drivers running in user space. Since the kernel is stored in plaintext in the boot partition, we can easily extract and disassemble it.

The kernel implements many POSIX-compliant system calls, and also adds some TEEGRIS-specific system calls. In Alexander Tarasikov's article we noticed that there are two system call wrappers implemented in shared libraries (please refer to the TA section below for details on how to deal with shared libraries), namely libtzsl.so and libteesl.so. This allows us to quickly identify two tables in the kernel for syscall handlers for 64-bit and 32-bit TAs, respectively.

64-bit and 32-bit system call tables:

By analyzing the system calls, we found that Samsung makes extensive use of two routines familiar from Linux: copy_to_user and copy_from_user. These routines are used to access userland data and ensure that a TA cannot make the kernel reference internal kernel structures.

copy_from_user decompiled code:
The code in the image above first checks a flag to decide whether the remaining checks are skipped. This flag is set when the kernel directly invokes a system call handler with known safe parameters. If the flag is set, the function becomes a plain wrapper around memcpy.

In other cases, the code will call check_address, as shown in the figure below.
Address checking routine:

The code snippet in the figure above gives us some important information: the first page of a TA must never be mapped (line 10, probably to catch NULL pointer dereferences), and a valid TA address must be lower than 0x40000000 (line 12). Any value above this limit is considered invalid and is rejected.
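To make the two checks concrete, here is a small model of the logic as we understand it from the decompiled code. It is a sketch, not Samsung's implementation: the 4 KB page size, the choice to validate the end of the range, and the dictionary standing in for TA memory are all assumptions.

```python
PAGE_SIZE = 0x1000          # assumed 4 KB pages
TA_ADDR_LIMIT = 0x40000000  # upper bound observed in check_address

def check_address(addr, size):
    """Accept a range only if it skips the first page and stays below the limit."""
    return size >= 0 and addr >= PAGE_SIZE and addr + size <= TA_ADDR_LIMIT

def copy_from_user(user_mem, addr, size, trusted=False):
    """Copy size bytes from a dict (address -> byte) that stands in for TA memory.

    trusted=True models the flag that turns the routine into a plain memcpy.
    """
    if not trusted and not check_address(addr, size):
        raise MemoryError("invalid user address")
    return bytes(user_mem[addr + i] for i in range(size))
```

A NULL-based pointer and a range that crosses 0x40000000 are both rejected, while the same arguments pass when the trusted flag is set.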

Another thing to note is that the copy is performed using the LDTR instruction. LDTR behaves like the regular LDR* instructions, but the memory access is performed with EL0 privileges.

This is necessary because PAN (Privileged Access Never) is enabled, so a regular privileged load from user memory would fault. It also acts as a safety net: an unprivileged access to kernel memory causes an access violation even if the check_address function misses some edge case.

The upper limit of 0x40000000 for the TA address space probably means that there is relatively little randomness in ASLR, especially given that 64-bit TAs are supported.

To confirm whether this hypothesis holds, we analyzed the process of how the TA image is loaded.

Note that in TEEGRIS, TA is a slightly modified ELF file, so we can look in the code for functions to parse the standard ELF format.

Eventually, we found the map_elf32_image function, which has an equivalent for 64-bit TAs.

Randomization of code and data segments:

Note that this code enforces that only the PIE executable can be loaded (line 120). It then generates a 2-byte random number (line 132), masks it with 0x7FFF (line 134), and uses this as the page offset to add to the entry point (and the base address, later done in the same function). This means that the ASLR offset can only have up to 32768 values, and it applies to all segments specified in the ELF.
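The arithmetic behind this observation is easy to reproduce. The sketch below assumes 4 KB pages and uses `os.urandom` as a stand-in for the kernel's random source; the function and constant names are ours.

```python
import os

PAGE_SIZE = 0x1000   # assumed 4 KB pages
ASLR_MASK = 0x7FFF   # mask applied to the 2-byte random value

def aslr_slide(random_bytes=None):
    """Return the byte offset added to the ELF entry point and base address."""
    raw = random_bytes if random_bytes is not None else os.urandom(2)
    pages = int.from_bytes(raw, "little") & ASLR_MASK
    return pages * PAGE_SIZE

POSSIBLE_SLIDES = ASLR_MASK + 1                    # 32768 distinct offsets
ENTROPY_BITS = POSSIBLE_SLIDES.bit_length() - 1    # 15 bits of entropy
MAX_SLIDE = ASLR_MASK * PAGE_SIZE                  # 0x7FFF000
```

Under these assumptions the largest slide is 0x7FFF000 bytes, comfortably below the 0x40000000 limit, and an attacker who can retry has to guess among only 2^15 page-aligned offsets.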

Dynamic memory (for example, the heap and memory shared with the REE) uses different values when "forking" syscalls, but is randomized in a similar way.

Dynamic memory randomization:
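Note that ASLR is not only applied to TAs but also to the kernel (commonly known as KASLR). We won't go into too much detail here, but it is something to keep in mind if we eventually want to exploit the kernel. In its entry function, the kernel generates another random value and modifies the page tables and relocation table accordingly.

KASLR (Kernel Address Space Layout Randomization) is a security technique that randomizes the layout of the kernel address space, so that an attacker cannot attack the system by guessing where kernel code and data are located.

During early boot, the kernel obtains a random value and uses it to shift its load address by a corresponding random offset. As a result, the kernel address space layout differs on every boot, which makes the correct layout hard to guess.

After the kernel has been mapped at its randomized location, symbol addresses must also be relocated so that symbol references in the kernel code resolve correctly and the kernel executes normally.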

As mentioned earlier, there are many drivers built into the kernel. Drivers are mainly used to communicate with peripheral devices (such as SPI and I2C) or perform encryption operations.

Partial list of drivers implemented in the kernel:

Considering that Samsung implements TEEGRIS following the POSIX specification, this way of interacting with the driver is not surprising.

Driver names usually start with "dev://", and a TA accesses a driver by opening the corresponding file.

The TA can then interact with the driver using a number of system calls (e.g. read, write, ioctl, mmap). Inside the kernel, a structure is used to store each driver's implementation of these system calls.
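Conceptually, such a per-driver structure is a table of function pointers looked up by name. The following sketch models the idea in Python. The class, the registry and the driver name "example" are hypothetical and do not reflect the actual TEEGRIS structures.

```python
class DriverOps:
    """Per-driver table of syscall implementations; unimplemented calls stay None."""
    def __init__(self, read=None, write=None, ioctl=None, mmap=None):
        self.read, self.write, self.ioctl, self.mmap = read, write, ioctl, mmap

DRIVERS = {}  # maps a "dev://" name to its DriverOps table

def register_driver(name, ops):
    DRIVERS["dev://" + name] = ops

def sys_open(path):
    """Resolve a driver path to its ops table, as open() would inside the kernel."""
    if path not in DRIVERS:
        raise FileNotFoundError(path)
    return DRIVERS[path]

def sys_ioctl(ops, cmd, arg):
    """Dispatch an ioctl to the driver, failing if the driver does not implement it."""
    if ops.ioctl is None:
        raise OSError("ioctl not supported by this driver")
    return ops.ioctl(cmd, arg)

# Hypothetical driver that simply echoes its ioctl arguments.
register_driver("example", DriverOps(ioctl=lambda cmd, arg: (cmd, arg)))
```

A TA-side call sequence then looks like `sys_ioctl(sys_open("dev://example"), 1, b"data")`.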

Not every TA is granted access to every driver and syscall. Instead, the permission level differs according to the group a TA belongs to.

The kernel keeps track of which permissions are granted to each TA and performs checks to ensure that only permitted TAs have access to restricted functionality. The following two figures show the permissions granted to two different groups, namely samsung_ta and samsung_drv.
Access permissions for the samsung_ta group:

Access permissions for the samsung_drv group:

As shown in the figures, each TA has 19 permissions. A value of 0 means that the permission is not granted, and other values indicate that partial or complete permission is granted.

Most of them are simply granted or not granted, but a few (such as MMAP) also include a specific mask that determines whether memory can be mapped with read/write/execute privileges.

In the two examples above, samsung_ta is the more restricted group and is granted only a few permissions, whereas the samsung_drv group has broader permissions. Other groups with different permissions exist, but the two above are the most common ones we have found so far.
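The permission model can be sketched as an array of 19 integers per group. In the sketch below, the index of the MMAP entry, the bit values of the protection mask and the contents of both example groups are made up for illustration; only the "0 means not granted" rule and the existence of an MMAP mask come from our analysis.

```python
NUM_PERMISSIONS = 19
PERM_MMAP = 0                               # hypothetical index of the MMAP entry
PROT_READ, PROT_WRITE, PROT_EXEC = 1, 2, 4  # hypothetical mask bits

def may_use(group_perms, index):
    """A permission is granted when its entry is non-zero."""
    return group_perms[index] != 0

def may_mmap(group_perms, prot):
    """Allow a mapping only if every requested protection bit is in the mask."""
    return (group_perms[PERM_MMAP] & prot) == prot

# Made-up permission arrays for a restricted and a privileged group.
RESTRICTED = [PROT_READ | PROT_WRITE] + [0] * (NUM_PERMISSIONS - 1)
PRIVILEGED = [PROT_READ | PROT_WRITE | PROT_EXEC] + [1] * (NUM_PERMISSIONS - 1)
```

In this model the restricted group can map memory read/write but not executable, and it is denied every other restricted feature.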

0x05 TA and userland drivers

So far, we have analyzed how the kernel works and how to interact with it. Next, let's look at TAs. Generally, there are two ways of deploying TAs in a TEE.

  • One is an immutable blob bound to the TEE OS, which is always loaded at initialization;
  • The other one can be loaded by Android at runtime.

Samsung's TEEGRIS takes a hybrid approach, supporting both options.
The boot partition contains a special archive (startup.tzar) that holds all the shared libraries needed by TAs, as well as some special TAs and drivers that the system needs early on, before Android is fully booted. These include TSS (for managing shared memory), ACSD (a driver that supports TA authentication), and root_task (for loading TAs and authenticating them with the help of ACSD).

The binaries in the tzar archive are standard ELF files that can be directly loaded into a disassembler for analysis.

Since the archive is part of the boot image, its signature is verified at boot time.

TAs can also be loaded from Android at runtime. Loadable TAs are stored in the /vendor/tee and /system/tee directories.

In the S10 series, there are about 30 different loadable TAs, each stored in the following format:

  • 1. A header of 8 bytes, consisting of 4 bytes of version information (SEC2, SEC3 or SEC4) and 4 bytes giving the size of the content.
  • 2. The content part is a regular ELF file containing the actual TA. If the TA type is SEC4, the content is encrypted; otherwise it is stored in plaintext.
  • 3. The metadata part includes the TA group. Starting with SEC3, there is an additional field containing a version number, which root_task and ACSD use to manage TAs and prevent rollback. When a SEC3 or SEC4 TA is loaded, its version number is extracted and compared with the version number stored in RPMB storage. If it is lower than the stored number, the TA is not loaded and an error is reported. If it is higher, the version number in the RPMB is increased to match the TA version, so that older copies of the same TA can no longer be loaded. This also means that, from that point on, the SEC2 version of the TA can no longer be loaded. This feature is critical for protecting against older TA versions that contain known vulnerabilities, and we will describe it in detail in the second article.
  • 4. The signature section contains the RSA signature of the rest of the image. It follows the X.509 format and is parsed by ACSD.
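The anti-rollback rule for SEC3 and SEC4 images described above amounts to a monotonic counter per TA. A minimal sketch, in which a dictionary stands in for RPMB storage and keying the counter by TA UUID is our assumption:

```python
def check_rollback(stored_versions, ta_uuid, ta_version):
    """Return True if the TA may be loaded, ratcheting the stored version forward.

    stored_versions is a dict standing in for the RPMB-backed version store.
    """
    stored = stored_versions.get(ta_uuid, 0)
    if ta_version < stored:
        return False                           # older than recorded: refuse to load
    if ta_version > stored:
        stored_versions[ta_uuid] = ta_version  # remember the newest version seen
    return True
```

Once version 3 of a TA has been loaded, an attempt to load version 2 of the same TA fails, while reloading version 3 still succeeds.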

From this brief description, we can see that a TA is easy to disassemble: remove the initial header and load the rest into the disassembler as an ELF file. The only troublesome format is SEC4, because its ELF is encrypted, but in practice we found only SEC2 and SEC3 in use on the Galaxy S10 and S20. TAs can import libraries from the tzar archive. The libraries are also regular ELF files and implement C library functions, the GlobalPlatform (GP) API and TEEGRIS-specific functions.
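Stripping the header can be scripted in a few lines. The sketch below rests on two assumptions that should be verified against a real image: that the second header field holds the length of the content part, and that it is stored big-endian. The function name is ours.

```python
import struct

KNOWN_FORMATS = (b"SEC2", b"SEC3", b"SEC4")

def extract_content(image):
    """Split a TA image into (format tag, content bytes) using the 8-byte header."""
    if len(image) < 8:
        raise ValueError("image too short for a TA header")
    tag = image[:4]
    if tag not in KNOWN_FORMATS:
        raise ValueError("unknown TA format")
    # Assumption: bytes 4..8 are the content length, stored big-endian.
    (content_len,) = struct.unpack(">I", image[4:8])
    content = image[8:8 + content_len]
    if len(content) != content_len:
        raise ValueError("truncated TA image")
    return tag.decode("ascii"), content
```

For SEC2 and SEC3 images the returned content can be written to a file and opened as an ELF in a disassembler; for SEC4 it is still encrypted.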

TAs in TEEGRIS implement the GP API, which specifies five entry points that a TA must implement to interact with the REE:

  • 1. TA_CreateEntryPoint, called when loading TA.
  • 2. TA_OpenSessionEntryPoint, which is called when a client application (CA, in our scenario an Android application) running in REE establishes a connection with TA for the first time.
  • 3. TA_InvokeCommandEntryPoint, which contains the main command handler to be invoked by each command sent by the CA. This is where most of the TA functionality is implemented.
  • 4. TA_CloseSessionEntryPoint, called when CA ends the session with TA.
  • 5. TA_DestroyEntryPoint, executed when the TA is unloaded from the memory.

Even though TA is an executable file, its execution is complicated because the actual main() function is inside the libteesl.so library. When starting TA, the following actually happens:

  • 1. Execute the start() function inside the TA. This function is usually just a wrapper around main().
    Example of start() function:

  • 2. The main function is actually not inside the TA, but in the libteesl.so library. This is where most of the logic for communication between the TEEGRIS kernel and the REE is set up.
    A standard mechanism based on POSIX epoll is established for communicating with the root TA. In the code snippet below, the main function first calls TA_CreateEntryPoint() and then jumps to start_event_loop().
    Code snippet of libteesl's main function:

  • 3. In start_event_loop(), the library then receives events, such as requests from CAs. Each event is forwarded to the corresponding GP API entry point.
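The startup sequence above can be summarized as a tiny dispatcher. This is a conceptual model of the control flow only: the event names ("open", "invoke", "close") and the dictionary-based TA are hypothetical, and a plain list replaces the epoll-based loop.

```python
def ta_main(ta, events):
    """Model of the library-side main(): create, dispatch events, destroy."""
    log = [ta["TA_CreateEntryPoint"]()]
    dispatch = {
        "open": "TA_OpenSessionEntryPoint",
        "invoke": "TA_InvokeCommandEntryPoint",
        "close": "TA_CloseSessionEntryPoint",
    }
    for kind, payload in events:   # stands in for start_event_loop()
        log.append(ta[dispatch[kind]](payload))
    log.append(ta["TA_DestroyEntryPoint"]())
    return log

# A toy TA whose entry points just report that they were called.
TOY_TA = {
    "TA_CreateEntryPoint": lambda: "create",
    "TA_OpenSessionEntryPoint": lambda p: ("open", p),
    "TA_InvokeCommandEntryPoint": lambda p: ("invoke", p),
    "TA_CloseSessionEntryPoint": lambda p: ("close", p),
    "TA_DestroyEntryPoint": lambda: "destroy",
}
```

The point to take away is that the TA itself only supplies the five entry points, while the loop that decides when they run lives in the library.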

This chapter is titled "TAs and Userland Drivers," but so far we've been talking about TAs. So, where are the drivers?

In fact, drivers are essentially the same as TAs. They have the same format and implement the same GP API, but additionally call a TEEGRIS-specific API named TEES_InitDriver.

This function allows a driver to specify a driver name and structure that can be used to interact with userland drivers in a manner similar to how it interacts with kernelland drivers. By default, userland drivers do not have any special privileges, but they usually belong to a higher privileged group.

0x06 Exploit Mitigations

Having analyzed the kernel and the TAs, we can now summarize which exploit mitigations are implemented in them. Some have already been introduced in the kernel analysis above. The full list is as follows:

  • 1. XN (eXecute Never) is used in both the kernel and TAs. This means data memory is never executable and code is never writable.
  • 2. Stack canaries are used in the kernel and TAs to protect against stack buffer overflows.
  • 3. ASLR and KASLR are used to randomize the address space of TAs and the kernel.
  • 4. PAN and PXN are used to prevent the kernel from accessing or executing user-mode memory.

Historically, exploit mitigations in TEE OSes have been relatively sparse compared to other popular OSes. Previous attacks against Samsung's TEE mainly targeted older models, which used only XN for protection.

S10 is undoubtedly a step in the right direction. If an attacker wants to fully compromise TEE, they may need to combine multiple vulnerabilities.

0x07 Communicating with TAs

Now that we know more about TAs, we need to understand how to communicate with TAs from the Android environment. Fortunately, the GP standard defines a set of APIs not only for TAs, but also for CAs wishing to communicate with TAs. Each entry point has a corresponding call that can be used by the CA, for example TEEC_OpenSession can be used to open a session, TEEC_InvokeCommand can be used to send commands, etc.

For TEEGRIS, the libteecl.so library implements the GP API, so communicating with the TA is as simple as using dlopen/dlsym to resolve the symbols required by the GP API. To open a session, the UUID of the target TA needs to be specified. The library then looks for a TA with that UUID in /vendor/tee or /system/tee (the UUID is the filename) and passes the entire TA image to the TEE, which then authenticates it and loads it. All operations are performed transparently to the CA, so the CA does not know how the actual communication takes place.
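The lookup step can be illustrated with a short helper. This is a sketch of the search described above, not code taken from libteecl.so: the function name, the search order and the assumption that the file name is exactly the UUID string passed in are ours.

```python
import os

TA_DIRS = ("/vendor/tee", "/system/tee")

def find_ta_image(uuid, search_dirs=TA_DIRS):
    """Return the path of the first file named after the UUID, or None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, uuid)
        if os.path.isfile(candidate):
            return candidate
    return None
```

On a device, the file found this way is what gets passed to the TEE for authentication and loading when the CA opens a session.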

As mentioned earlier, not every Android application is allowed to communicate with the TEE. This is a limitation: a full exploit chain requires the attacker to first gain runtime control of an application that can communicate with the TEE.

0x08 Follow-up work

At this point, the first part of the series of articles has come to an end. In the next post, we will show how to identify and exploit vulnerabilities in TA to gain runtime control in the context of TA. In the final article, we will analyze how it can be exploited to escalate privileges and compromise the entire TEE.

0x09 References

  • [1] https://developer.arm.com/documentation/ddi0504/c/
  • [2] https://medium.com/taszksec/unbox-your-phone-part-i-331bbf44c30c
  • [3] https://labs.bluefrostsecurity.de/blog/2019/05/27/tee-exploitation-on-samsung-exynos-devices-introduction/
  • [4] https://blog.quarkslab.com/a-deep-dive-into-samsungs-TrustZone-part-1.html
  • [5] http://allsoftwaresucks.blogspot.com/2019/05/reverse-engineering-samsung-exynos-9820.html
  • [6] https://globalplatform.org/wp-content/uploads/2018/06/GPD_TEE_Internal_Core_API_Specification_v1.1.2.50_PublicReview.pdf
Original link: https://www.anquanke.com/post/id/236483
