Windows 10 (Build 14316) Linux Subsystem Analysis

0. Background

At the Build 2016 conference, Microsoft announced that bash would be supported natively on Windows 10, and said this would be done by adding a Linux subsystem (Windows Subsystem for Linux) to Windows rather than by running a virtual machine. Bash was then enabled in the Windows 10 build 14316 (x64) update; the 32-bit version does not include bash.

I downloaded 14316 right away, both to try running bash commands on Windows and to understand how this Linux subsystem is implemented.

 

1. Introduction

First, open system32\bash.exe and run an ls command. Procmon shows that a process accessed the file C:\Users\xxxxx\AppData\Local\lxss\rootfs\bin\ls.

Picture 1.png

It turns out that the Linux root directory is mounted at C:\Users\xxxxx\AppData\Local\lxss\rootfs, and its directory structure is essentially the same as Ubuntu's.

Picture 2.png

The process that accesses the ls ELF file looks strange, however. Procmon cannot display its name, and in the debugger the process object turns out to have not only no name but also no section, no PEB, and no similar information.

2. Pico Process

This nameless process is called a pico process, and it is the host process of the ELF binary. An introduction to picoprocesses: http://research.microsoft.com/en-us/projects/drawbridge/#Picoprocess . Put simply, a pico process is a container, a mechanism used to implement the Drawbridge sandbox technology. "Project Astoria", which Microsoft cancelled earlier, also used the lxcore + picoprocess technology. Microsoft now applies the same technology to the Windows 10 Linux subsystem.

All pico processes are created by PspCreatePicoProcess, and most of the work in PspCreatePicoProcess is done by PsCreateMinimalProcess. As its name suggests, PsCreateMinimalProcess creates a stripped-down process. PsCreateMinimalProcess calls PspAllocateProcess, which is the function that normally creates the EPROCESS object, the process address space, the PEB, and related information.

The prototype of the PspAllocateProcess function is roughly as follows:

NTSTATUS PspAllocateProcess(void *ParentProcess,
                            int   PreviousMode,
                            void *ObjectAttributes,
                            char  Protection,
                            char  SignatureLevel,
                            char  SectionSignatureLevel,
                            void *SectionObject,
                            void *TokenObject,
                            int   Flags,
                            void *UserProcessParameters,
                            int   bSystemToken,
                        OUT int   a12,
                        OUT void *Process);

The arguments that PsCreateMinimalProcess passes to PspAllocateProcess are shown in the following figure. ObjectAttributes, SectionObject, and so on are all NULL. This confirms what we observed earlier: a pico process has no process name and no user-space information.

Picture 3.png

Looking back at the call stack of PspCreatePicoProcess, we can see that the call from ring 3 is dispatched to the lxcore driver by PsPicoSystemCallDispatch. This brings in another concept: the PicoProvider.

Picture 4.png

3. PicoProvider 

NT exports a function, PsRegisterPicoProvider, for registering a pico provider. At present only lxss.sys uses this interface. Two drivers, lxss.sys and lxcore.sys, implement most of the Linux subsystem. lxss.sys is a boot-start driver; during initialization it calls into lxcore.sys, which in turn calls PsRegisterPicoProvider. PsRegisterPicoProvider may only be called once after system startup. Once all boot drivers have been initialized, NT sets PspPicoRegistrationDisabled to TRUE, which blocks every later call to PsRegisterPicoProvider. In other words, the current system allows only one provider.

Picture 5.png

Note: the latest public symbols are currently for build 14295, and Microsoft has not yet released symbols for 14316. The files related to lxss.sys already exist in 14295 and most of the functionality is already used in NT there, although the upper-layer mechanism had not been implemented yet. I therefore used the 14295 symbols together with the 14316 binaries to analyze the lxss kernel mechanisms. If anything here is incorrect, please point it out.

The prototype of PsRegisterPicoProvider is roughly as follows:

NTSTATUS PsRegisterPicoProvider(IN  PICO_INTERFACE *PicoInterface,
                                OUT PSP_INTERFACE  *PspInterface);

struct PICO_INTERFACE
{
   __int64 cbSize; // currently only 0x48 is supported; an EX version may appear later
   PVOID PicoSystemCallDispatch;
   PVOID PicoThreadExit;
   PVOID PicoProcessExit;
   PVOID PicoDispatchException;
   PVOID PicoProcessTerminate;
   PVOID PicoWalkUserStack;
   PVOID LxpProtectedRanges;
   PVOID PicoGetAllocatedProcessImageName;
};

struct PSP_INTERFACE
{
   __int64 cbSize; // currently only 0x60 is supported
   PVOID PspCreatePicoProcess;
   PVOID PspCreatePicoThread;
   PVOID PspGetPicoProcessContext;
   PVOID PspGetPicoThreadContext;
   PVOID PspGetContextThreadInternal;
   PVOID PspSetContextThreadInternal;
   PVOID PspTerminateThreadByPointer;
   PVOID PsResumeThread;
   PVOID PspSetPicoThreadDescriptorBase;
   PVOID PsSuspendThread;
   PVOID PspTerminatePicoProcess;
};
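The registration rules described in this section (a fixed cbSize for each structure, a single provider, and no registration once boot drivers have finished) can be modelled in user mode. The following is only a sketch under stated assumptions: every Model* name and status code is hypothetical, pointer slots are modelled as 64-bit values as on x64, and the check of the caller-supplied PSP_INTERFACE size is assumed from the cbSize comments rather than confirmed by the analysis.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* User-mode model only. Pointer slots are 64-bit values, as on x64. */
typedef struct {
    uint64_t cbSize;        /* expected to be 0x48 */
    uint64_t callbacks[8];  /* the eight Pico* callbacks supplied by the provider */
} ModelPicoInterface;

typedef struct {
    uint64_t cbSize;        /* expected to be 0x60 */
    uint64_t routines[11];  /* the eleven Psp and Ps routines handed back by NT */
} ModelPspInterface;

/* Hypothetical status codes; the article does not give the real NTSTATUS values. */
enum {
    MODEL_OK = 0,
    MODEL_TOO_LATE = -1,    /* registration disabled, or a provider already exists */
    MODEL_BAD_SIZE = -2     /* cbSize does not match the supported layout */
};

typedef struct {
    bool registration_disabled;   /* models PspPicoRegistrationDisabled */
    bool provider_registered;
    ModelPicoInterface provider;  /* copy of the registered callbacks */
} ModelKernel;

/* Models PsRegisterPicoProvider: at most one provider, and only during boot. */
int model_register_pico_provider(ModelKernel *k,
                                 const ModelPicoInterface *pico,
                                 ModelPspInterface *psp)
{
    if (k->registration_disabled || k->provider_registered)
        return MODEL_TOO_LATE;
    if (pico->cbSize != sizeof(ModelPicoInterface) ||
        psp->cbSize != sizeof(ModelPspInterface))
        return MODEL_BAD_SIZE;

    k->provider = *pico;
    k->provider_registered = true;

    /* Placeholder values stand in for the real kernel routine addresses. */
    for (size_t i = 0; i < 11; i++)
        psp->routines[i] = 0x1000u + i;
    return MODEL_OK;
}

/* Models the point where all boot drivers have been initialized. */
void model_boot_drivers_done(ModelKernel *k)
{
    k->registration_disabled = true;
}
```

With this layout, sizeof(ModelPicoInterface) is 0x48 and sizeof(ModelPspInterface) is 0x60, which matches the sizes noted in the structure comments: one 8-byte size field plus 8 or 11 pointer slots.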

Through PsRegisterPicoProvider, lxss hands NT a set of pico callbacks so that it is notified of system call dispatch, process and thread exit, exceptions, and similar events, and handles them itself. In return it receives a set of NT process and thread routines that it uses to operate on NT threads. One of the pico callbacks, PicoGetAllocatedProcessImageName, is particularly interesting. As mentioned above, Procmon shows no name for the pico process and the kernel debugger cannot see one either. Knowing which ELF a pico process corresponds to would be a great help for analysis.

Analysis shows why the pico provider exposes this callback to NT. When NtQuerySystemInformation collects process name information, it normally reads the name from EPROCESS->SeAuditProcessCreationInfo, but if the process is a pico process it calls PicoGetAllocatedProcessImageName instead. So NT is, in principle, already compatible with pico processes. Why, then, can Procmon still not get the process name? I tested this with a plain CreateToolhelp32Snapshot enumeration, and it lists the names of the ELF processes without difficulty.

So Procmon probably does not use the standard interface to obtain process names.
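The name lookup described above comes down to one branch: ordinary processes take their name from the audit information, while pico processes are delegated to the provider callback. The sketch below is a user-mode model of that branch only. All Model* types and functions are hypothetical, and the pico context is reduced to a plain string for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for EPROCESS; only the fields the lookup needs. */
typedef struct ModelProcess {
    bool        is_pico;           /* true for a pico process */
    const char *audit_image_name;  /* models SeAuditProcessCreationInfo */
    const void *pico_context;      /* models PicoContext, owned by the provider */
} ModelProcess;

/* Models the PicoGetAllocatedProcessImageName callback slot. */
typedef const char *(*ModelGetImageNameFn)(const ModelProcess *process);

/* Models the name query: delegate to the provider for pico processes. */
const char *model_query_image_name(const ModelProcess *process,
                                   ModelGetImageNameFn provider_callback)
{
    if (process->is_pico) {
        /* No provider registered means no name can be produced. */
        return provider_callback ? provider_callback(process) : NULL;
    }
    return process->audit_image_name;
}

/* Example provider callback: here the context is simply the ELF path string. */
const char *model_provider_get_name(const ModelProcess *process)
{
    return (const char *)process->pico_context;
}
```

A tool that bypasses this path and reads the name straight from the process object would see nothing for a pico process, which is consistent with what Procmon shows.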

Picture 6.png

But where is the name of a pico process stored? Analysis of PicoGetAllocatedProcessImageName shows that the ELF path of a pico process is stored at EPROCESS->PicoContext + 0x15b0. PicoContext is set by PspCreatePicoProcess when the pico process is created, and it holds the various pieces of state belonging to the pico process. A simple script can walk the pico processes on the current system and print their names. This is the pico process information while bash was running a wget download:

Picture 7.png

Returning to PsRegisterPicoProvider, the PicoSystemCallDispatch entry in the pico interface is also important. It dispatches system calls coming from a pico process and is the communication path between the pico process and the provider. When ring 3 code in a pico process executes its system call instruction (syscall on x64), the kernel entry point is KiSystemCall64. If the Minimal flag is set in the current thread's _DISPATCHER_HEADER, KiSystemCall64 calls nt!PsPicoSystemCallDispatch to do the pico-specific dispatch instead of going through the normal SDT. lxcore!LxpSyscalls holds the addresses of the dispatch targets.

The figure below is a printout of part of this dispatch table. As with the SDT, rax is used as the call index, but the arguments are passed differently from SDT calls: they are passed in rdi, rsi, rdx, and r10.

(During the analysis I also found that build 14316 adds a new SDT, KeServiceDescriptorTableFilter. When a thread has the RestrictedGuiThread flag set, KeServiceDescriptorTableFilter is used. This table is meant to restrict many native API calls, mainly to stop Edge processes from calling win32k and other functions, for stricter security isolation. For now, though, the content of KeServiceDescriptorTableFilter is identical to the ordinary SDT, so it does not appear to be in use yet.)

Picture 8.png
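The dispatch path described above can be sketched as a table lookup. The following is a minimal user-mode model, not the real kernel code. The Model* names and the sample handler are hypothetical, the table size reuses the 0x138 entry count reported in the summary, and returning a negated ENOSYS for an empty slot is the usual Linux convention, assumed here rather than taken from lxcore.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SYSCALL_COUNT 0x138  /* entry count reported for LxpSyscalls */
#define MODEL_ENOSYS        38     /* Linux ENOSYS value */

/* Only the registers the dispatch uses. */
typedef struct {
    uint64_t rax;                  /* call index on entry, result on return */
    uint64_t rdi, rsi, rdx, r10;   /* arguments */
} ModelTrapFrame;

typedef int64_t (*ModelSyscallFn)(uint64_t a0, uint64_t a1,
                                  uint64_t a2, uint64_t a3);

/* Models lxcore!LxpSyscalls; empty slots model unimplemented calls. */
static ModelSyscallFn model_lxp_syscalls[MODEL_SYSCALL_COUNT];

/* Hypothetical handler used only to exercise the table. */
static int64_t model_sys_sum(uint64_t a0, uint64_t a1, uint64_t a2, uint64_t a3)
{
    return (int64_t)(a0 + a1 + a2 + a3);
}

void model_install_sample_handler(void)
{
    model_lxp_syscalls[1] = model_sys_sum;
}

/* Models the provider's PicoSystemCallDispatch: index by rax, pass four args. */
void model_pico_syscall_dispatch(ModelTrapFrame *frame)
{
    ModelSyscallFn handler = NULL;
    if (frame->rax < MODEL_SYSCALL_COUNT)
        handler = model_lxp_syscalls[frame->rax];

    int64_t result = handler
        ? handler(frame->rdi, frame->rsi, frame->rdx, frame->r10)
        : -MODEL_ENOSYS;
    frame->rax = (uint64_t)result;
}

/* Models the branch in KiSystemCall64. Returns true if the pico path ran. */
bool model_system_call_entry(bool thread_is_minimal, ModelTrapFrame *frame)
{
    if (!thread_is_minimal)
        return false;              /* ordinary SDT path, not modelled here */
    model_pico_syscall_dispatch(frame);
    return true;
}
```

The bounds check before indexing matters in any such table, since rax comes directly from user mode.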

Let's look at how lxcore implements some of the functionality it provides to the Linux subsystem. For example, when bash creates an ELF process, the call path in lxcore is roughly: LxpSyscall_FORK -> LxpThreadGroupFork -> LxpThreadGroupCreate -> PspCreatePicoProcess. PspCreatePicoProcess was obtained from NT earlier through PsRegisterPicoProvider, which is how lxcore can operate on NT processes and threads. But what about functionality for which NT provides no corresponding interface?

For file system operations, lxcore calls the corresponding NT API rather than talking to the file system directly. For example, when the rm command deletes a file, the call path is: LxpSyscall_UNLINK -> LxpUnlinkHelper -> VfsPerformUnlink -> VfsUnlinkChild -> LxDrvFsRemoveChild -> LxDrvFsDeleteFile -> ZwSetInformationFile. LxDrvFsDeleteFile opens the file with IoCreateFile and then deletes it. lxcore.sys implements a set of VfsXXXX interfaces that present a virtual VFS to the layers above, and uses a set of LxDrvFsXXXX APIs below it to provide the actual file access.

Network operations are split into the UNIX protocol family and the INET protocol family, and the two are dispatched differently. For example, when the upper layer sends a UDP packet, the call path in lxcore is: LxpSyscall_SENDMMSG -> LxpSocketSendMultipleMessages -> LxpSocketInetSendMessage -> LxpSocketInetSend -> LxpSocketInetDatagramSend -> afd!WskProAPISendTo. The call ends up in afd, which performs the actual operation. The following printout shows the afd functions that lxcore uses:
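The two-layer split described above, a generic Vfs layer on top and a backend that does the real work underneath, is the familiar operations-table pattern. The sketch below shows only that pattern. ModelFsBackend, model_vfs_unlink, and the one-file in-memory backend are hypothetical illustrations and do not reflect the real lxcore structures.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_ENOENT 2   /* Linux ENOENT value */

/* Lower layer: plays the role of the LxDrvFsXXXX routines. */
typedef struct ModelFsBackend {
    /* Returns 0 on success or a negative errno-style value. */
    int (*remove_child)(void *state, const char *name);
    void *state;
} ModelFsBackend;

/* Upper layer: plays the role of the VfsXXXX routines. It validates the
 * request and forwards it to whichever backend is mounted. */
int model_vfs_unlink(const ModelFsBackend *fs, const char *name)
{
    if (fs == NULL || fs->remove_child == NULL ||
        name == NULL || name[0] == '\0')
        return -MODEL_ENOENT;
    return fs->remove_child(fs->state, name);
}

/* Hypothetical in-memory backend holding a single file. */
typedef struct {
    const char *name;
    int         present;
} ModelOneFileVolume;

int model_onefile_remove(void *state, const char *name)
{
    ModelOneFileVolume *volume = state;
    if (!volume->present || strcmp(volume->name, name) != 0)
        return -MODEL_ENOENT;
    volume->present = 0;   /* the real backend would call into NT here */
    return 0;
}
```

Because the upper layer only sees the function table, a backend built on NT file APIs and a purely in-memory one can be swapped without touching the system call handlers.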

Picture 9.png

For other things, such as querying current system information, lxcore also calls the corresponding NT function directly. For the date command, for example, lxcore gets the time through LxpSyscall_CLOCK_GETTIME -> KeQuerySystemTimePrecise.
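Serving clock_gettime from an NT time source involves a change of units and epoch: NT system time counts 100-nanosecond intervals since 1601-01-01 UTC, while Linux expects seconds and nanoseconds since 1970-01-01 UTC. The analysis does not show how lxcore performs this step, so the following is just the standard conversion, with hypothetical names.

```c
#include <stdint.h>

#define NT_TICKS_PER_SECOND      10000000ULL     /* 100 ns units per second */
#define NT_TO_UNIX_EPOCH_SECONDS 11644473600ULL  /* seconds from 1601 to 1970 */

/* Hypothetical stand-in for the Linux struct timespec. */
typedef struct {
    int64_t tv_sec;
    int64_t tv_nsec;
} ModelTimespec;

/* Convert an NT system time (100 ns ticks since 1601-01-01 UTC) into
 * seconds and nanoseconds since the Unix epoch. */
ModelTimespec model_nt_time_to_timespec(uint64_t nt_ticks)
{
    ModelTimespec ts;
    uint64_t seconds_since_1601 = nt_ticks / NT_TICKS_PER_SECOND;

    ts.tv_sec  = (int64_t)seconds_since_1601 - (int64_t)NT_TO_UNIX_EPOCH_SECONDS;
    ts.tv_nsec = (int64_t)(nt_ticks % NT_TICKS_PER_SECOND) * 100;
    return ts;
}
```

For example, an NT time of 116444736000000000 ticks corresponds to exactly the Unix epoch, so both fields come out as zero.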

4. Injecting into a Pico Process

Finally, I tried injecting code into a pico process, with the aim of invoking the pico process's system call dispatch from inside it. Because a pico process has no section object, remote thread injection does not work. In the end I injected into the pico process using the SetThreadContext method. The injection succeeds, but WinDbg cannot debug the pico process, because lxcore takes over exception dispatch for it. When nt!KiDispatchException sees that an exception occurred in a pico thread, it hands the exception to PicoDispatchException, and lxcore uses APCs to simulate the Linux signal mechanism for inter-process communication and exception dispatch.

5. Summary

LxpSyscalls currently contains 0x138 entries, and some of them have no internal implementation yet, so Microsoft will presumably keep improving support for more commands. The analysis above only briefly covers how lxcore supports the Linux subsystem, and only for some operations. ELF loading, memory management, device management, and other areas have not been analyzed. Even so, the results so far show that Microsoft really has implemented a Linux subsystem rather than a Linux virtual machine, so execution efficiency should be much better. For example, pico processes are given time slices in the same way as ordinary processes. There are drawbacks too. Adding the lxss mechanism also adds complexity, which means that from Windows 10 onward the system may have to withstand both Windows and Linux binary security threats, and that may create new security problems for Windows.

Origin blog.csdn.net/u010164190/article/details/105695938