EPROCESS, KPROCESS, PEB

The Executive Process Block (EPROCESS) is the root structure on which both the KPROCESS and the PEB depend. The KPROCESS holds the physical address of the process's top-level page table (DirectoryTableBase), which is loaded into the CR3 register, together with kernel scheduling information.

The PEB contains information about the process image, such as its loaded DLLs. EPROCESS and KPROCESS are accessible only from kernel mode, while the PEB can also be accessed by its own process.

To get the EPROCESS address (in this case ffffd60b4aa5b240):

0: kd> !process  0 0 lsass.exe
PROCESS ffffd60b4aa5b240
    SessionId: 0  Cid: 02b4    Peb: fa07b9000  ParentCid: 0210
    DirBase: 3035d002  ObjectTable: ffff9d8f1e2cae80  HandleCount: 1230.
    Image: lsass.exe

And inspect its values:

0: kd> dt nt!_eprocess ffffd60b4aa5b240
   +0x000 Pcb              : _KPROCESS
    ...
   +0x360 Token            : _EX_FAST_REF
    ...
   +0x3f8 Peb              : 0x0000000f`a07b9000 _PEB

The KPROCESS is the very first member of the EPROCESS (offset 0x000), so it can be dumped from the same address:

0: kd> dt  nt!_kprocess ffffd60b4aa5b240
   +0x000 Header           : _DISPATCHER_HEADER
   +0x018 ProfileListHead  : _LIST_ENTRY [ 0xffffd60b`4aa5b258 - 0xffffd60b`4aa5b258 ]
   +0x028 DirectoryTableBase : 0x3035d002
   +0x030 ThreadListHead   : _LIST_ENTRY [ 0xffffd60b`4aa5f378 - 0xffffd60b`505f5378 ]
   +0x040 ProcessLock      : 0
   +0x044 ProcessTimerDelay : 0
   +0x048 DeepFreezeStartTime : 0
   +0x050 Affinity         : _KAFFINITY_EX
   +0x0f8 ReadyListHead    : _LIST_ENTRY [ 0xffffd60b`4aa5b338 - 0xffffd60b`4aa5b338 ]
   +0x108 SwapListEntry    : _SINGLE_LIST_ENTRY
   +0x110 ActiveProcessors : _KAFFINITY_EX
   ...
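
As a small illustration of the DirectoryTableBase value above, the following Python sketch splits a CR3-style value into the page-aligned physical base of the top-level table and its low 12 bits. The function name split_cr3 is hypothetical; the only assumption is the architectural x64 layout, in which bits 12 to 51 hold the physical base.

```python
def split_cr3(value: int) -> tuple[int, int]:
    """Split a CR3-style value into (physical base of the PML4, low 12 bits)."""
    base = value & 0x000FFFFFFFFFF000   # bits 12..51: page-aligned physical address
    low = value & 0xFFF                 # bits 0..11: flags or PCID, depending on CR4.PCIDE
    return base, low


# DirBase / DirectoryTableBase reported for lsass.exe in the dumps above
base, low = split_cr3(0x3035D002)
print(hex(base), hex(low))
```

The low bits are not part of the address, which is why the reported DirBase does not look page-aligned.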

And the same goes for the PEB:

0: kd> dt  nt!_peb 0x0000000f`a07b9000
   +0x000 InheritedAddressSpace : 0 ''
   +0x001 ReadImageFileExecOptions : 0 ''
   +0x002 BeingDebugged    : 0 ''
   +0x003 BitField         : 0x4 ''
   +0x003 ImageUsesLargePages : 0y0
   +0x003 IsProtectedProcess : 0y0
   +0x003 IsImageDynamicallyRelocated : 0y1
   +0x003 SkipPatchingUser32Forwarders : 0y0
   +0x003 IsPackagedProcess : 0y0
   +0x003 IsAppContainer   : 0y0
   +0x003 IsProtectedProcessLight : 0y0
   +0x003 IsLongPathAwareProcess : 0y0
   +0x004 Padding0         : [4]  ""
   +0x008 Mutant           : 0xffffffff`ffffffff Void
   +0x010 ImageBaseAddress : 0x00007ff6`dcde0000 Void
   +0x018 Ldr              : 0x00007ff9`150453c0 _PEB_LDR_DATA
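
The individual flags that dt prints at offset +0x003 are all packed into the single BitField byte. A minimal Python sketch of that unpacking follows, assuming the flags occupy consecutive bits starting at bit 0 in the order dt lists them (which is consistent with 0x4 mapping to IsImageDynamicallyRelocated in this dump). The helper name is hypothetical.

```python
# Flag names in the order shown by "dt nt!_peb" at offset +0x003
PEB_BITFIELD_FLAGS = [
    "ImageUsesLargePages",
    "IsProtectedProcess",
    "IsImageDynamicallyRelocated",
    "SkipPatchingUser32Forwarders",
    "IsPackagedProcess",
    "IsAppContainer",
    "IsProtectedProcessLight",
    "IsLongPathAwareProcess",
]


def decode_peb_bitfield(value: int) -> dict[str, bool]:
    """Map each flag name to the state of its bit (bit 0 is the first name)."""
    return {name: bool((value >> bit) & 1)
            for bit, name in enumerate(PEB_BITFIELD_FLAGS)}


flags = decode_peb_bitfield(0x4)        # BitField value from the dump above
print([name for name, is_set in flags.items() if is_set])
```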

System Calls

Interrupt Dispatch Table (IDT)

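We can inspect the interrupt service routines (ISRs) registered in the interrupt dispatch table (IDT). First we read the IDT base (idtr) and limit (idtl), then dump every entry and disassemble one of the handlers.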


0: kd> rM 0x100
gdtr=fffff8011465dfb0   gdtl=0057 idtr=fffff8011465b000   idtl=0fff tr=0040  ldtr=0000
nt!DbgBreakPointWithStatus:


0: kd> !idt -a

Dumping IDT: fffff8011465b000

00:	fffff80111d5a100 nt!KiDivideErrorFaultShadow
01:	fffff80111d5a180 nt!KiDebugTrapOrFaultShadow	Stack = 0xFFFFF8011465F9D0
02:	fffff80111d5a200 nt!KiNmiInterruptShadow	Stack = 0xFFFFF8011465F7D0
03:	fffff80111d5a280 nt!KiBreakpointTrapShadow
04:	fffff80111d5a300 nt!KiOverflowTrapShadow
05:	fffff80111d5a380 nt!KiBoundFaultShadow
06:	fffff80111d5a400 nt!KiInvalidOpcodeFaultShadow
07:	fffff80111d5a480 nt!KiNpxNotAvailableFaultShadow
08:	fffff80111d5a500 nt!KiDoubleFaultAbortShadow	Stack = 0xFFFFF8011465F3D0
09:	fffff80111d5a580 nt!KiNpxSegmentOverrunAbortShadow
0a:	fffff80111d5a600 nt!KiInvalidTssFaultShadow
0b:	fffff80111d5a680 nt!KiSegmentNotPresentFaultShadow
0c:	fffff80111d5a700 nt!KiStackFaultShadow
0d:	fffff80111d5a780 nt!KiGeneralProtectionFaultShadow
0e:	fffff80111d5a800 nt!KiPageFaultShadow
0f:	fffff80111d5b2f8 nt!KiIsrThunkShadow+0x78
10:	fffff80111d5a880 nt!KiFloatingErrorFaultShadow
11:	fffff80111d5a900 nt!KiAlignmentFaultShadow
12:	fffff80111d5a980 nt!KiMcheckAbortShadow	Stack = 0xFFFFF8011465F5D0
13:	fffff80111d5aa80 nt!KiXmmExceptionShadow
14:	fffff80111d5ab00 nt!KiVirtualizationExceptionShadow
15:	fffff80111d5ab80 nt!KiControlProtectionFaultShadow
16:	fffff80111d5b330 nt!KiIsrThunkShadow+0xB0

0: kd> u fffff80111d5b330
nt!KiIsrThunkShadow+0xb0:
fffff801`11d5b330 6a16            push    16h
fffff801`11d5b332 e989070000      jmp     nt!KxIsrLinkageShadow (fffff801`11d5bac0)
fffff801`11d5b337 cc              int     3
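
The numbers in this output can be cross-checked with a little arithmetic. The Python sketch below assumes the architectural 16-byte gate size of 64-bit mode and the standard 5-byte jmp rel32 encoding; the function names are hypothetical and only illustrate the calculation.

```python
IDT_GATE_SIZE = 16  # bytes per gate descriptor in 64-bit mode


def idt_gate_count(limit: int) -> int:
    """Number of gates described by an IDT limit (the limit is size minus 1)."""
    return (limit + 1) // IDT_GATE_SIZE


def idt_gate_address(base: int, vector: int) -> int:
    """Address of the gate descriptor for a given vector."""
    return base + vector * IDT_GATE_SIZE


def jmp_rel32_target(address: int, code: bytes) -> int:
    """Target of an E9 rel32 jump located at 'address'."""
    if len(code) < 5 or code[0] != 0xE9:
        raise ValueError("not a jmp rel32 instruction")
    rel = int.from_bytes(code[1:5], "little", signed=True)
    return (address + 5 + rel) & 0xFFFFFFFFFFFFFFFF


print(idt_gate_count(0x0FFF))                             # idtl from rM 0x100
print(hex(idt_gate_address(0xFFFFF8011465B000, 0x16)))    # gate for vector 0x16
print(hex(jmp_rel32_target(0xFFFFF80111D5B332, bytes.fromhex("e989070000"))))
```

The last line resolves the jump in the thunk to the same address WinDbg annotates as nt!KxIsrLinkageShadow.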

Service Descriptor Table

segmentation

3: kd> dt nt!*DescriptorTable* -v
Enumerating symbols matching nt!*DescriptorTable*
Address   Size Symbol
fffff80111e51cfc   000 ntkrnlmp!KiOpDescriptorTableStoreSkip (no type info)
fffff80111f95880   000 ntkrnlmp!KeServiceDescriptorTable (no type info)
fffff80111f7da80   000 ntkrnlmp!KeServiceDescriptorTableShadow (no type info)

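Each of these two tables contains System Service Tables (SSTs). An SST is a lookup structure that holds a pointer to the table of service entries (nt!KiServiceTable), the number of services, and a pointer to the argument table (nt!KiArgumentTable).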

3: kd> dps nt!KeServiceDescriptorTable
fffff801`11f95880  fffff801`11e2dc10 nt!KiServiceTable
fffff801`11f95888  00000000`00000000
fffff801`11f95890  00000000`000001d0
fffff801`11f95898  fffff801`11e2e354 nt!KiArgumentTable
fffff801`11f958a0  00000000`00000000
fffff801`11f958a8  00000000`00000000
fffff801`11f958b0  00000000`00000000
fffff801`11f958b8  00000000`00000000
fffff801`11f958c0  fffff801`11d5a280 nt!KiBreakpointTrapShadow
fffff801`11f958c8  fffff801`11d5a300 nt!KiOverflowTrapShadow
fffff801`11f958d0  fffff801`11d5ad00 nt!KiRaiseSecurityCheckFailureShadow
fffff801`11f958d8  fffff801`11d5ad80 nt!KiRaiseAssertionShadow
fffff801`11f958e0  fffff801`11d5ae00 nt!KiDebugServiceTrapShadow
fffff801`11f958e8  fffff801`11d5c180 nt!KiSystemCall64Shadow
fffff801`11f958f0  fffff801`11d5be00 nt!KiSystemCall32Shadow
fffff801`11f958f8  00000000`00000000

If we inspect the contents of KiServiceTable, we can verify that it holds encoded offsets to the actual kernel routines rather than absolute addresses.

3: kd> dd /c1 KiServiceTable L4
fffff801`11e2dc10  fced7304
fffff801`11e2dc14  fcf77c00
fffff801`11e2dc18  02b98402
fffff801`11e2dc1c  04746e00

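The routine address is recovered by shifting the entry right by 4 bits (an arithmetic shift, since the offset is signed) and adding the result to the base of KiServiceTable. Taking the first value as an example, which resolves to nt!NtAccessCheck: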

3: kd> u KiServiceTable + (fced7304 >>> 4)
nt!NtAccessCheck:
fffff801`11b1b340 4c8bdc          mov     r11,rsp
fffff801`11b1b343 4883ec68        sub     rsp,68h
fffff801`11b1b347 488b8424a8000000 mov     rax,qword ptr [rsp+0A8h]
fffff801`11b1b34f 4533d2          xor     r10d,r10d
fffff801`11b1b352 458853f0        mov     byte ptr [r11-10h],r10b
fffff801`11b1b356 498943e8        mov     qword ptr [r11-18h],rax
fffff801`11b1b35a 488b8424a0000000 mov     rax,qword ptr [rsp+0A0h]
fffff801`11b1b362 498943e0        mov     qword ptr [r11-20h],rax

As an example, we can now take an arbitrary API from NTDLL, such as NtReadFile, and find its dispatch routine.

3: kd> u ntdll!ntreadfile L2
ntdll!NtReadFile:
00007ff9`d285c170 4c8bd1          mov     r10,rcx
00007ff9`d285c173 b806000000      mov     eax,6

The syscall number is 6, so we can use it as an index into KiServiceTable, where each entry is 4 bytes wide.

3: kd> dd /c1 KiServiceTable+4*0x6 L1
fffff801`11e2dc28  01c06105

Finally, we verify that the decoded address resolves to the kernel routine of the same name.

3: kd> u KiServiceTable + (01c06105 >>> 4) L1
nt!NtReadFile:
fffff801`11fee220 4c894c2420      mov     qword ptr [rsp+20h],r9
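
The index step can be sketched in Python as well. The helper names are hypothetical. The service count 0x1d0 is the value at nt!KeServiceDescriptorTable+0x10 in the earlier dps output, and reading the low 4 bits of an entry as the number of stack-passed arguments is an assumption based on how this encoding is commonly documented, not something the dumps here prove.

```python
KI_SERVICE_TABLE = 0xFFFFF80111E2DC10   # base seen in this session
SERVICE_COUNT = 0x1D0                   # value at KeServiceDescriptorTable+0x10
ENTRY_SIZE = 4                          # each table entry is a 32-bit value


def ssdt_slot_address(table_base: int, syscall_number: int,
                      service_count: int) -> int:
    """Address of the table slot that belongs to a syscall number."""
    if not 0 <= syscall_number < service_count:
        raise ValueError("syscall number outside the service table")
    return table_base + ENTRY_SIZE * syscall_number


def stack_argument_count(entry: int) -> int:
    """Low 4 bits of an entry (assumed: arguments passed on the stack)."""
    return entry & 0xF


print(hex(ssdt_slot_address(KI_SERVICE_TABLE, 0x6, SERVICE_COUNT)))
print(stack_argument_count(0x01C06105))   # entry read for NtReadFile
```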

Thanks to spotless's contribution, we can also dump the whole SSDT together with the symbol names.

3: kd> .foreach /ps 1 /pS 1 ( offset {dd /c 1 nt!KiServiceTable L poi(nt!KeServiceDescriptorTable+10)}){ r $t0 = ( offset >>> 4) + nt!KiServiceTable; .printf "%p - %y\n", $t0, $t0 }

fffff80111b1b340 - nt!NtAccessCheck (fffff801`11b1b340)
fffff80111b253d0 - nt!NtWorkerFactoryWorkerReady (fffff801`11b253d0)
fffff801120e7450 - nt!NtAcceptConnectPort (fffff801`120e7450)
fffff801122a22f0 - nt!NtMapUserPhysicalPagesScatter (fffff801`122a22f0)
fffff80111ffcb50 - nt!NtWaitForSingleObject (fffff801`11ffcb50)
fffff80111bcde10 - nt!NtCallbackReturn (fffff801`11bcde10)
fffff80111fee220 - nt!NtReadFile (fffff801`11fee220)
fffff80111ff1770 - nt!NtDeviceIoControlFile (fffff801`11ff1770)
fffff8011204eef0 - nt!NtWriteFile (fffff801`1204eef0)
fffff801120b7da0 - nt!NtRemoveIoCompletion (fffff801`120b7da0)
fffff801120b9d10 - nt!NtReleaseSemaphore (fffff801`120b9d10)
fffff80111fd74f0 - nt!NtReplyWaitReceivePort (fffff801`11fd74f
[...]

Syscall Walkthrough

As an example, we can inspect the lifecycle of the ReadFile API.

From KERNELBASE (an abstraction of KERNEL32) we see that the actual user-mode function is imported from NTDLL.

0: kd> uf ReadFile
Flow analysis was incomplete, some code may be missing
KERNELBASE!ReadFile:
00007fff`048651b0 48895c2410      mov     qword ptr [rsp+10h],rbx
00007fff`048651b5 4c894c2420      mov     qword ptr [rsp+20h],r9
[...]
00007fff`04865220 48ff15e9171800  call    qword ptr [KERNELBASE!_imp_NtReadFile (00007fff`049e6a10)]


0: kd> dps 00007fff`049e6a10
00007fff`049e6a10  00007fff`06bbc170 ntdll!NtReadFile
00007fff`049e6a18  00007fff`06c1bf40 ntdll!RtlRaiseStatus
00007fff`049e6a20  00007fff`06b39f60 ntdll!RtlCompareUnicodeString
00007fff`049e6a28  00007fff`06b66b90 ntdll!RtlTryAcquirePebLock
00007fff`049e6a30  00007fff`06b954b0 ntdll!RtlReleasePebLock
00007fff`049e6a38  00007fff`06bb1f60 ntdll!wcsspn

When we inspect the NTDLL NtReadFile function, we can spot the value 0x6, which is the syscall number. A flag in KUSER_SHARED_DATA (SharedUserData+0x308) is then tested to decide how to enter the kernel: if the bit is clear, the SYSCALL instruction is executed; otherwise interrupt 0x2e is triggered.

0: kd> uf  ntdll!NtReadFile
ntdll!NtReadFile:
00007fff`06bbc170 4c8bd1          mov     r10,rcx
00007fff`06bbc173 b806000000      mov     eax,6
00007fff`06bbc178 f604250803fe7f01 test    byte ptr [SharedUserData+0x308 (00000000`7ffe0308)],1
00007fff`06bbc180 7503            jne     ntdll!NtReadFile+0x15 (00007fff`06bbc185)  Branch

ntdll!NtReadFile+0x12:
00007fff`06bbc182 0f05            syscall
00007fff`06bbc184 c3              ret

ntdll!NtReadFile+0x15:
00007fff`06bbc185 cd2e            int     2Eh
00007fff`06bbc187 c3              ret
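
Because the stub always starts with the same two instructions, the syscall number can be read straight from its bytes. The Python sketch below does that for the bytes shown in the disassembly above; the helper name is hypothetical, and it assumes an unmodified stub that begins with mov r10,rcx followed by mov eax,imm32.

```python
from typing import Optional

STUB_PREFIX = bytes.fromhex("4c8bd1b8")   # mov r10,rcx ; opcode byte of mov eax,imm32


def syscall_number_from_stub(code: bytes) -> Optional[int]:
    """Return the imm32 loaded into EAX, or None if the stub looks different."""
    if len(code) < 8 or code[:4] != STUB_PREFIX:
        return None
    return int.from_bytes(code[4:8], "little")


# First 16 bytes of ntdll!NtReadFile as shown in the disassembly above
stub = bytes.fromhex("4c8bd1b806000000f604250803fe7f01")
print(syscall_number_from_stub(stub))
```

Returning None for an unexpected prologue keeps the helper from reporting a bogus number when the first bytes have been altered.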

Once we break on the user-mode side of the function, we can inspect the code segment and its privilege level, which is 3 as expected.

Breakpoint 1 hit
ntdll!NtReadFile+0x12:
0033:00007ff9`8ae5c182 0f05            syscall
3: kd> r cs
cs=0033
3: kd> dg 33
                                                    P Si Gr Pr Lo
Sel        Base              Limit          Type    l ze an es ng Flags
---- ----------------- ----------------- ---------- - -- -- -- -- --------
0033 00000000`00000000 00000000`00000000 Code RE Ac 3 Nb By P  Lo 000002fb

On the other hand, if we continue to the kernel-side routine, we’ll see Ring 0 in the code segment.

3: kd> g
Breakpoint 0 hit
nt!NtReadFile:
fffff805`54cb1830 4c894c2420      mov     qword ptr [rsp+20h],r9
3: kd> r cs
cs=0010
3: kd> dg 10
                                                    P Si Gr Pr Lo
Sel        Base              Limit          Type    l ze an es ng Flags
---- ----------------- ----------------- ---------- - -- -- -- -- --------
0010 00000000`00000000 00000000`00000000 Code RE Ac 0 Nb By P  Lo 0000029b
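
The privilege level is also visible in the selector value itself, without consulting the descriptor. A short Python sketch of the architectural selector layout (index in bits 3 to 15, table indicator in bit 2, requested privilege level in bits 0 to 1) follows; the function name is hypothetical.

```python
def decode_selector(selector: int) -> dict[str, int]:
    """Split a segment selector into descriptor index, table indicator and RPL."""
    return {
        "index": selector >> 3,      # entry number in the descriptor table
        "ti": (selector >> 2) & 1,   # 0 = GDT, 1 = LDT
        "rpl": selector & 0x3,       # ring number for a loaded CS
    }


print(decode_selector(0x33))   # user-mode CS seen at ntdll!NtReadFile
print(decode_selector(0x10))   # kernel-mode CS seen at nt!NtReadFile
```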

We can also double check in the Service Table that the correct offset will point to the actual kernel system call implementation.

2: kd> dd /c1 kiservicetable+4*0x6 L1
fffff803`75e2dc28  01c06105

2: kd> u kiservicetable + (01c06105>>>4) L1
nt!NtReadFile:
fffff803`75fee220 4c894c2420      mov     qword ptr [rsp+20h],r9

nt!KiSystemCall64Shadow under the microscope

Looking at the SYSCALL instruction itself: with the syscall number still in EAX (0x6), it jumps to the system call handler whose address is stored in the IA32_LSTAR MSR (see the Intel manual), which on this system is nt!KiSystemCall64Shadow. This is the kernel's RIP for SYSCALL in long mode (64-bit only). Model Specific Registers (MSRs) are special, CPU-specific registers that are accessed by index through the rdmsr (read) and wrmsr (write) instructions. For x64, the values we are after are:

MSR          Index       Description
IA32_STAR    0xC0000081  Ring 0 and Ring 3 segments + SYSCALL EIP:
                           bits 00-31 = SYSCALL EIP
                           bits 32-47 = kernel segment base
                           bits 48-63 = user segment base
IA32_LSTAR   0xC0000082  The kernel’s RIP for SYSCALL in long mode (64-bit software)
IA32_CSTAR   0xC0000083  The kernel’s RIP for SYSCALL in compatibility mode
IA32_SFMASK  0xC0000084  The low 32 bits are the SYSCALL flag mask. If a bit in this is set, the corresponding bit in EFLAGS is cleared.

If we take a peek at the IA32_STAR register:

3: kd> rdmsr 0xc0000081
msr[c0000081] = 00230010`00000000

which breaks down as 0023 | 0010 | 00000000, according to the layout of IA32_STAR:

sysret CS : 0023
sysret SS : 002B ; CS + 8
sysret CS 64bit : 0033 ; CS + 16

syscall CS : 0010
syscall SS : 0018 ; CS + 8

syscall 32bit EIP : 00000000
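
The breakdown above can be reproduced with a few shifts and masks. The Python sketch below follows the IA32_STAR layout from the table (bits 32 to 47 for the SYSCALL selectors, bits 48 to 63 for the SYSRET selectors); the function name and dictionary keys are hypothetical.

```python
def decode_star(star: int) -> dict[str, int]:
    """Derive the SYSCALL/SYSRET selectors from an IA32_STAR value."""
    syscall_base = (star >> 32) & 0xFFFF
    sysret_base = (star >> 48) & 0xFFFF
    return {
        "syscall_cs": syscall_base,
        "syscall_ss": syscall_base + 8,
        "sysret_cs_32": sysret_base,
        "sysret_ss": sysret_base + 8,
        "sysret_cs_64": sysret_base + 16,
        "syscall_eip_32": star & 0xFFFFFFFF,
    }


# Value returned by "rdmsr 0xc0000081" above
for name, value in decode_star(0x0023001000000000).items():
    print(f"{name:15} {value:04x}")
```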

We can verify both the SYSRET and SYSCALL values by placing two breakpoints: one at ntdll!NtReadFile (user mode, the SYSRET selectors) and the other at nt!NtReadFile (kernel mode, the SYSCALL selectors).

2: kd> bl
     0 e Disable Clear  fffff807`3c3f1220     0001 (0001) nt!NtReadFile
     1 e Disable Clear  00007ff8`ed23c170     0001 (0001) ntdll!NtReadFile

2: kd> g
Breakpoint 1 hit
ntdll!NtReadFile:
0033:00007ff8`ed23c170 4c8bd1          mov     r10,rcx

2: kd> r
rax=00007ff8ed23c170 rbx=0000000000000000 rcx=0000000000000768
rdx=0000000000000000 rsi=0000000000000000 rdi=000000f7e4b1f398
rip=00007ff8ed23c170 rsp=000000f7e4b1f298 rbp=0000000000000001
 r8=0000000000000000  r9=0000000000000000 r10=00000fff1da4782f
r11=8888848888555555 r12=0000000000000019 r13=0000020640231700
r14=000000f7e4b1f384 r15=0000000000000768
iopl=0         nv up ei pl nz na pe cy
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000203
ntdll!NtReadFile:
0033:00007ff8`ed23c170 4c8bd1          mov     r10,rcx

cs=0033 ss=002b match our SYSRET values as expected.

2: kd> g
Breakpoint 0 hit
nt!NtReadFile:
fffff807`3c3f1220 4c894c2420      mov     qword ptr [rsp+20h],r9
2: kd> r
rax=fffff8073c3f1220 rbx=ffff838352610080 rcx=0000000000000768
rdx=0000000000000000 rsi=000000f7e4b1f2b8 rdi=ffffdb0b0408fa28
rip=fffff8073c3f1220 rsp=ffffdb0b0408fa08 rbp=ffffdb0b0408fb00
 r8=0000000000000000  r9=0000000000000000 r10=fffff8073c3f1220
r11=fffff8073bfdecc8 r12=0000000000000019 r13=0000020640231700
r14=000000f7e4b1f384 r15=0000000000000768
iopl=0         nv up ei pl zr na po nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00040246
nt!NtReadFile:
fffff807`3c3f1220 4c894c2420      mov     qword ptr [rsp+20h],r9 ss:0018:ffffdb0b`0408fa28=0000000000000000

Likewise, cs=0010 and ss=0018 match the SYSCALL values decoded from the IA32_STAR MSR.

Now let’s look at the content of the IA32_LSTAR MSR, which should hold the address of the syscall handler.

1: kd> rdmsr 0xc0000082
msr[c0000082] = fffff803`75d5c180

1: kd> u fffff803`75d5c180
nt!KiSystemCall64Shadow:
fffff803`75d5c180 0f01f8          swapgs
fffff803`75d5c183 654889242510700000 mov   qword ptr gs:[7010h],rsp
fffff803`75d5c18c 65488b242500700000 mov   rsp,qword ptr gs:[7000h]
fffff803`75d5c195 650fba24251870000001 bt  dword ptr gs:[7018h],1
fffff803`75d5c19f 7203            jb      nt!KiSystemCall64Shadow+0x24 (fffff803`75d5c1a4)
fffff803`75d5c1a1 0f22dc          mov     cr3,rsp
fffff803`75d5c1a4 65488b242508700000 mov   rsp,qword ptr gs:[7008h]
fffff803`75d5c1ad 6a2b            push    2Bh

There are actually two different syscall handlers, with and without the ‘Shadow’ suffix. “Shadow” comes from the “Kernel Virtual Address Shadow” feature, which mitigates the Meltdown vulnerability.

The privileged swapgs instruction exchanges the current GS base register value with the value stored in the MSR at address C0000102H (IA32_KERNEL_GS_BASE). To be clear, the current GS base is the value held in the IA32_GS_BASE MSR. On x64 Windows systems, while executing in user mode, the values are:

  • IA32_KERNEL_GS_BASE – Pointer to current processor control region (PCR), specifically the Kernel Processor Control Region(KPCR)
  • IA32_GS_BASE – Pointer to current execution thread TEB

So, in 64-bit long mode, the GS segment points to the current thread's TEB in user mode, and to the current processor's PCR in kernel mode.

The next instruction, mov qword ptr gs:[7010h],rsp, saves the user-land stack pointer into the per-processor slot at gs:[7010h] (the “previous stack” entry in the summary below).

Reading further into the syscall routine, the value at gs:[7000h], which holds the base of the x64 top-level page table (PML4), is loaded into RSP. Quoting the Fortinet article about the next instructions:
bt dword ptr gs:[7018h],1
mov cr3,rsp

A flag […] will be checked, and if swapping is needed then the base of PML4 (corresponding to the kernel address space) will be moved into CR3. At this point the kernel stack is finally accessible and everything works normally. We note that swapping may not always be needed as it may have already happened previously (for example, interrupt while servicing system calls). One subtle thing to notice is that after the new PML4 has been moved into CR3 the address space is switched instantaneously, and the very next instruction fetch happens on the new address space (private to the kernel). But since KiSystemCall64Shadow is mapped into the very same virtual address, everything “just works”.

A summary of the GS values:

Offset    Description
gs:7000   PML4
gs:7008   Kernel Stack
gs:7010   Previous Stack
gs:7018   Flag

So let’s expand the syscall routine code once more, with a comment on each instruction:

nt!KiSystemCall64Shadow:
fffff804`17f63180 0f01f8          swapgs                             ; swap the GS base from the TEB (user) to the PCR (kernel)
fffff804`17f63183 654889242510700000 mov   qword ptr gs:[7010h],rsp  ; save the current user stack pointer at gs:[7010h]
fffff804`17f6318c 65488b242500700000 mov   rsp,qword ptr gs:[7000h]  ; load the kernel PML4 base from gs:[7000h] into RSP
fffff804`17f63195 650fba24251870000001 bt  dword ptr gs:[7018h],1    ; test bit 1 of the flag at gs:[7018h]
fffff804`17f6319f 7203            jb      nt!KiSystemCall64Shadow+0x24 ; if the bit is set, skip the CR3 switch

nt!KiSystemCall64Shadow+0x21:
fffff804`17f631a1 0f22dc          mov     cr3,rsp                     ; otherwise, load the kernel PML4 base into CR3
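
To make the control flow of this prologue easier to follow, here is a toy Python model of it. It only mirrors the instructions visible in the disassembly (including the following mov rsp,gs:[7008h] from the earlier listing); the Cpu class, the dictionary standing in for the GS-relative region, and every numeric value in the usage lines are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    rsp: int
    cr3: int
    gs_base: int
    kernel_gs_base: int


def syscall_entry_prologue(cpu: Cpu, gs_region: dict[int, int]) -> None:
    """Mirror the first instructions of nt!KiSystemCall64Shadow."""
    # swapgs: exchange the active GS base with IA32_KERNEL_GS_BASE
    cpu.gs_base, cpu.kernel_gs_base = cpu.kernel_gs_base, cpu.gs_base
    gs_region[0x7010] = cpu.rsp           # save the user stack pointer
    cpu.rsp = gs_region[0x7000]           # RSP temporarily holds the kernel PML4 base
    if not (gs_region[0x7018] >> 1) & 1:  # bt ...,1 / jb: skip when bit 1 is set
        cpu.cr3 = cpu.rsp                 # mov cr3,rsp
    cpu.rsp = gs_region[0x7008]           # switch to the kernel stack


# Illustrative values only; none of them come from a real system
cpu = Cpu(rsp=0x1000, cr3=0xAAA000, gs_base=0x2000, kernel_gs_base=0x3000)
region = {0x7000: 0xBBB000, 0x7008: 0x4000, 0x7010: 0, 0x7018: 0}
syscall_entry_prologue(cpu, region)
print(hex(cpu.cr3), hex(cpu.rsp), hex(region[0x7010]))
```

Using RSP as a scratch register works here because the user stack pointer has already been saved at gs:[7010h] before it is overwritten.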

As a summary, the whole system call flow can be visualized through the following graph:

[Figure: syscall_flow]