Asynchronous Procedure Calls (APC) Enumeration
In this blog post I explain my approach to solving one of the exercises from the book Practical Reverse Engineering: enumerating the kernel-mode and user-mode Asynchronous Procedure Calls (APCs) of a process.
I started by reverse engineering some of the APC-related functions, writing a Windows kernel driver that uses APCs in different situations, and using WinDbg to set breakpoints on the APC callback functions to see how they are dispatched. I also read many blog posts and documentation on how APCs work and how they are dispatched.
APC Internals:
First, a short introduction to APCs and their internals from a kernel-mode perspective.
Basically, APCs allow user programs and system components to execute code in the context of a particular thread and, therefore, within the address space of a particular process.
The Windows kernel uses APCs to complete asynchronously initiated I/O operations, to suspend threads, and so on. Malware authors also use them to inject code into other processes (APC Injection 1, 2).
There are two types of APC: user-mode and kernel-mode. A user-mode APC executes in user space in the context of the target thread's process, and it is delivered only when the target thread is in an alertable wait state. A kernel-mode APC executes in kernel space and is classified as either regular (normal) or special. Both kernel-mode and user-mode APCs have three functions:
- KernelRoutine: executed in kernel mode at IRQL = APC_LEVEL when the APC is delivered.
- RundownRoutine: called in kernel mode if the thread terminates before the APC is delivered.
- NormalRoutine: for a kernel-mode APC, called in kernel mode (at IRQL = PASSIVE_LEVEL for a normal kernel APC; a special kernel APC has no NormalRoutine); for a user-mode APC, called in user mode.
Each thread has two members of type _KAPC_STATE, named ApcState and SavedApcState, in its _KTHREAD data structure:
- ApcState: holds the APCs for the process context the thread is currently running in, whether that is its own process or a process it has attached to.
- SavedApcState: stores APCs that belong to a process context other than the current one and that must wait (for example, when a thread attaches to another process and an APC is queued for its own process).
The _KAPC_STATE structure has a member called ApcListHead, an array of two LIST_ENTRY structures that serve as the list heads of the kernel-mode and user-mode APC queues of the thread.
1: kd> dt nt!_KTHREAD
............ //Truncated for visibility
+0x098 ApcState : _KAPC_STATE /*APCs that will be dispatched regarding the current process context*/
+0x098 ApcStateFill : [43] UChar
+0x0c3 Priority : Char
+0x0c4 UserIdealProcessor : Uint4B
+0x0c8 WaitStatus : Int8B
............ //Truncated for visibility
+0x186 WaitIrql : UChar
+0x187 WaitMode : Char
+0x140 WaitBlockFill6 : [116] UChar
+0x1b4 WaitTime : Uint4B
+0x140 WaitBlockFill7 : [164] UChar
+0x1e4 KernelApcDisable : Int2B
+0x1e6 SpecialApcDisable : Int2B
+0x1e4 CombinedApcDisable : Uint4B
+0x140 WaitBlockFill8 : [40] UChar
+0x168 ThreadCounters : Ptr64 _KTHREAD_COUNTERS
............ //Truncated for visibility
+0x240 AffinityFill : [10] UChar
+0x24a ApcStateIndex : UChar
+0x24b WaitBlockCount : UChar
+0x24c IdealProcessor : Uint4B
+0x250 NpxState : Uint8B
+0x258 SavedApcState : _KAPC_STATE /*APCs that will be saved as it can not be dispatched in the current context*/
+0x258 SavedApcStateFill : [43] UChar
+0x283 WaitReason : UChar
+0x284 SuspendCount : Char
+0x285 Saturation : Char
1: kd> dt nt!_KAPC_STATE
+0x000 ApcListHead : [2] _LIST_ENTRY /*Kernel\user APC queue list head*/
+0x020 Process : Ptr64 _KPROCESS
+0x028 InProgressFlags : UChar
+0x028 KernelApcInProgress : Pos 0, 1 Bit
+0x028 SpecialApcInProgress : Pos 1, 1 Bit
+0x029 KernelApcPending : UChar
+0x02a UserApcPendingAll : UChar
+0x02a SpecialUserApcPending : Pos 0, 1 Bit
+0x02a UserApcPending : Pos 1, 1 Bit
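To make the role of ApcListHead concrete, here is a minimal user-mode sketch of one list head per processor mode. The types list_entry, apc_sim and apc_state_sim are simplified stand-ins defined only for this illustration, not the real kernel definitions; the index convention (0 for kernel mode, 1 for user mode) follows the values of KPROCESSOR_MODE.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's doubly linked LIST_ENTRY. */
typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

/* Same numeric values as KPROCESSOR_MODE. */
enum { KERNEL_MODE = 0, USER_MODE = 1 };

/* Simplified APC: only what this sketch needs. */
typedef struct apc_sim {
    int        id;     /* hypothetical tag, for the demo only */
    list_entry link;   /* plays the role of KAPC.ApcListEntry */
} apc_sim;

/* Simplified _KAPC_STATE: one circular list per processor mode. */
typedef struct apc_state_sim {
    list_entry apc_list_head[2];
} apc_state_sim;

/* Recover the containing structure from a pointer to its list link. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(list_entry *head)
{
    head->flink = head->blink = head;
}

static void list_insert_tail(list_entry *head, list_entry *e)
{
    e->flink = head;
    e->blink = head->blink;
    head->blink->flink = e;
    head->blink = e;
}

static void apc_state_init(apc_state_sim *s)
{
    list_init(&s->apc_list_head[KERNEL_MODE]);
    list_init(&s->apc_list_head[USER_MODE]);
}

static void queue_apc(apc_state_sim *s, int mode, apc_sim *apc)
{
    assert(mode == KERNEL_MODE || mode == USER_MODE);
    list_insert_tail(&s->apc_list_head[mode], &apc->link);
}

/* Walk one queue starting at its head; copy ids to out[], return the count. */
static size_t collect_ids(apc_state_sim *s, int mode, int *out, size_t cap)
{
    list_entry *head = &s->apc_list_head[mode];
    size_t n = 0;
    for (list_entry *e = head->flink; e != head && n < cap; e = e->flink) {
        apc_sim *apc = CONTAINER_OF(e, apc_sim, link);
        out[n++] = apc->id;
    }
    return n;
}
```

Queuing two entries with KERNEL_MODE and one with USER_MODE and then calling collect_ids for each mode returns the two queues separately, which is the separation ApcListHead[0] and ApcListHead[1] provide.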
APC Enumeration:
- First Approach
The first idea I had for enumerating the APC queues of every thread in a process was to walk the process's thread list, starting from the ThreadListHead member (a _LIST_ENTRY) of the _KPROCESS structure. For each thread I would then read the ApcState member (a _KAPC_STATE) from its _KTHREAD structure and walk the kernel-mode and user-mode APC lists.
The problem with this approach is that it relies on parsing undocumented structures such as _KPROCESS and _KTHREAD to reach the required members, and the layout of those structures changes between Windows versions and even between build numbers.
0: kd> dt nt!_KPROCESS
+0x000 Header : _DISPATCHER_HEADER
+0x018 ProfileListHead : _LIST_ENTRY
+0x028 DirectoryTableBase : Uint8B
+0x030 ThreadListHead : _LIST_ENTRY /*process threads list head*/
+0x040 ProcessLock : Uint4B
+0x044 ProcessTimerDelay : Uint4B
+0x048 DeepFreezeStartTime : Uint8B
+0x050 Affinity : _KAFFINITY_EX
+0x0f8 AffinityPadding : [12] Uint8B
+0x158 ReadyListHead : _LIST_ENTRY
............ //Truncated for visibility
- Second Approach
The first approach was not a suitable solution for me, as it might crash the system, and I wanted an implementation that is stable across Windows versions.
My idea was this: when you initialize an APC with KeInitializeApc and queue it with KeInsertQueueApc, the APC structure is inserted into the appropriate queue (kernel-mode or user-mode). So why not initialize and insert my own APC, then use that structure as a starting point to traverse the entire queue? After some experimenting and studying how kernel-mode and user-mode APCs are dispatched, I found the following problems that had to be solved:
- Threads might be created or terminated during the enumeration, so thread creation and termination in the process has to be stopped (at least for threads created or terminated by the process itself, not remotely).
- The IDs of all threads in the target process have to be obtained in a stable and documented way.
- While the APC queue is being traversed, it might be accessed concurrently (to queue or dispatch an APC), which might lead to a crash or an inaccurate result.
- The NormalRoutine of the user-mode APC needs a valid user-mode address, otherwise the target process might crash.
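The core of this approach, walking a circular list from an entry you inserted yourself, can be modeled in user mode. The sketch below is an illustration under simplifying assumptions: list_entry and walk_ring are names made up here, and in the driver the starting entry would be the list link of a KAPC queued with KeInsertQueueApc. Note that the walk also passes through the list head, because from inside the ring the head looks like any other node.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's doubly linked LIST_ENTRY. */
typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

static void list_init(list_entry *head)
{
    head->flink = head->blink = head;
}

static void list_insert_tail(list_entry *head, list_entry *e)
{
    e->flink = head;
    e->blink = head->blink;
    head->blink->flink = e;
    head->blink = e;
}

/*
 * Visit every node reachable from `own`, beginning with `own` itself,
 * and stop when the walk comes back around to it.
 * Up to `cap` node pointers are stored in visited[]; the return value is
 * the total number of nodes in the ring, list head included.
 */
static size_t walk_ring(list_entry *own, list_entry **visited, size_t cap)
{
    size_t n = 0;
    list_entry *e = own;

    assert(own != NULL);
    do {
        if (n < cap)
            visited[n] = e;
        n++;
        e = e->flink;
    } while (e != own);
    return n;
}
```

With a head, two existing entries and one entry of our own appended at the tail, walk_ring reports four nodes: our entry, then the head, then the two existing entries. The caller still has to decide which of those nodes are real APCs.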
I will address these issues one at a time. The first issue is that a thread might be created or terminated during the enumeration. To solve it, I decided to suspend the process before the APC enumeration and resume it afterwards, using the NtSuspendProcess and NtResumeProcess APIs.
The second issue is that I need the IDs of all threads in the target process. There are two ways to get them: the first is to call ZwQuerySystemInformation from the kernel with SystemProcessInformation as the SystemInformationClass argument; the second is to enumerate them from user mode using the documented Windows method (CreateToolhelp32Snapshot -> Thread32First -> Thread32Next).
I decided to use the second method and enumerate the thread IDs from a user-mode process:
hThreadSnap = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0);
if (hThreadSnap == INVALID_HANDLE_VALUE)
    return(FALSE);
// Fill in the size of the structure before using it.
te32.dwSize = sizeof(THREADENTRY32);
if (!Thread32First(hThreadSnap, &te32)) {
    // Error calling Thread32First
    CloseHandle(hThreadSnap); // Must clean up the snapshot object!
    return(FALSE);
}
// The snapshot covers every thread on the system, so filter by owner PID.
do
{
    if (te32.th32OwnerProcessID == dwOwnerPID)
    {
        _tprintf(TEXT("\n THREAD ID = 0x%08X"), te32.th32ThreadID);
        _tprintf(TEXT("\n base priority = %d"), te32.tpBasePri);
        _tprintf(TEXT("\n delta priority = %d"), te32.tpDeltaPri);
        Threadarray[counter] = te32.th32ThreadID; // Store each thread ID of the target process in a global array that is passed to the driver
        counter++;
    }
} while (Thread32Next(hThreadSnap, &te32));
The third issue is that the APC list might be accessed while it is being enumerated, which might lead to a crash or an inaccurate result. My first thought was that the Windows kernel must use some kind of synchronization to lock the list, so I started searching for it. I ran into the same problem as in my first approach: I would need to parse the undocumented _KTHREAD structure for each thread (1), so that idea died at the start.
The second idea comes from the book The Rootkit Arsenal. It works by stalling all other CPUs on the system: create as many threads as there are processors minus one, have each thread raise the IRQL to DISPATCH_LEVEL, and then raise the IRQL on the current processor to DISPATCH_LEVEL as well. This way I will not be interrupted by the Windows kernel or any other driver, and since APCs are delivered at APC_LEVEL or PASSIVE_LEVEL, both below DISPATCH_LEVEL, the APC queues should not change during the enumeration.
NTSTATUS DeviceControl(PDEVICE_OBJECT, PIRP Irp) {
    .................. //Truncated for visibility
    HookAllCPU();
    // Raise IRQL to DISPATCH_LEVEL on the current processor
    KIRQL oldIrql = 0;
    if (DISPATCH_LEVEL >= KeGetCurrentIrql()) {
        KeRaiseIrql(DISPATCH_LEVEL, &oldIrql);
    }
    // ThreadIDArray is terminated by a zero entry
    while (ThreadIDArray[counter] != 0) {
        DbgPrint("APC Enumeration for TID: %d\n", ThreadIDArray[counter]);
        PETHREAD Ethread;
        NTSTATUS status = PsLookupThreadByThreadId((HANDLE)ThreadIDArray[counter], &Ethread);
        if (NT_SUCCESS(status)) {
            DbgPrint("Start Enumerating Kernel APC\n");
            EnumerateKernelAPC(Ethread);
            DbgPrint("Start Enumerating User APC\n");
            EnumerateNormalAPC(Ethread);
            ObDereferenceObject(Ethread);
        }
        counter++;
    }
    // Lower the IRQL on the current processor
    KeLowerIrql(oldIrql);
    // Signal the threads on the other processors to stop and release their processors
    InterlockedDecrement(&ReleaseFlag);
    // Loop until all the other threads have stopped
    while (TotalCpuNumberHooked) {}
    .................. //Truncated for visibility
}
VOID HookAllCPU() {
    DbgPrint(("Attempting Hooking All CPUs Except the current processor\n"));
    DWORD64 nProcessor = KeNumberProcessors;
    NumberOfCPUToHook = (LONG)nProcessor - 1;
    HANDLE ThreadHandel = NULL;
    OBJECT_ATTRIBUTES inizializedattributes;
    InitializeObjectAttributes(&inizializedattributes, NULL, 0, NULL, NULL);
    // Create one spinning thread per processor other than the current one
    for (int i = 0; i < nProcessor - 1; i++) {
        DbgPrint("CPU %d is beining hooked\n", i);
        PsCreateSystemThread(&ThreadHandel, THREAD_ALL_ACCESS,
            NULL, NtCurrentProcess(), NULL,
            (PKSTART_ROUTINE)HookCPU, NULL);
        // Close the handle; the thread keeps running
        ZwClose(ThreadHandel);
    }
    // Loop until the other CPUs are hooked
    while (NumberOfCPUToHook != TotalCpuNumberHooked) {}
    return;
}
VOID HookCPU(PVOID) {
    // Raise IRQL
    KIRQL oldIrql = 0;
    if (DISPATCH_LEVEL >= KeGetCurrentIrql()) {
        KeRaiseIrql(DISPATCH_LEVEL, &oldIrql);
    }
    //NT_ASSERT(KeGetCurrentIrql() == DISPATCH_LEVEL);
    InterlockedIncrement(&TotalCpuNumberHooked);
    // Loop until released; the release happens after the enumeration
    while (ReleaseFlag) {}
    // Lower IRQL
    KeLowerIrql(oldIrql);
    InterlockedDecrement(&TotalCpuNumberHooked);
    // Terminate itself
    PsTerminateSystemThread(STATUS_SUCCESS);
}
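The rendezvous between HookAllCPU and HookCPU (workers announce themselves, spin until released, then announce their exit) can be modeled in user mode to experiment with the protocol safely. The following is only a sketch under assumptions of my own: it uses POSIX threads and C11 atomics instead of system threads and Interlocked calls, the names park_worker and run_with_workers_parked are invented for the example, and it does not raise any IRQL or pin threads to processors. It does show two details worth considering for the driver: the shared flags are atomic, so the spin loops are guaranteed to re-read them, and only successfully created workers are waited for.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define WORKERS 3                  /* stands in for KeNumberProcessors - 1 */

static atomic_int parked;          /* plays the role of TotalCpuNumberHooked */
static atomic_int release_flag;    /* plays the role of ReleaseFlag: 1 = hold, 0 = go */

/* Worker: announce arrival, spin until released, announce departure. */
static void *park_worker(void *arg)
{
    (void)arg;
    atomic_fetch_add(&parked, 1);
    while (atomic_load(&release_flag) != 0)
        sched_yield();             /* user-mode courtesy; the driver spins at DISPATCH_LEVEL */
    atomic_fetch_sub(&parked, 1);
    return NULL;
}

/*
 * Coordinator: park all workers, run the critical work, release them.
 * Returns how many workers were parked while work() ran.
 */
static int run_with_workers_parked(void (*work)(void))
{
    pthread_t tid[WORKERS];
    int started = 0;
    int seen;

    atomic_store(&parked, 0);
    atomic_store(&release_flag, 1);

    for (int i = 0; i < WORKERS; i++) {
        /* Count only threads that really started, so the wait below cannot hang. */
        if (pthread_create(&tid[started], NULL, park_worker, NULL) == 0)
            started++;
    }

    while (atomic_load(&parked) != started)
        sched_yield();             /* wait until every started worker is parked */

    seen = atomic_load(&parked);
    if (work != NULL)
        work();                    /* the enumeration would happen here */

    atomic_store(&release_flag, 0);
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return seen;
}
```

Calling run_with_workers_parked with a callback runs that callback while all workers are spinning, and returns once every worker has left its loop.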
The last issue to solve before writing the final code is that the user-mode APC needs a valid user-mode address (any user-mode address will work, but if it is not valid the target process will crash). I therefore need the address of an ordinary Windows API function to use as the NormalRoutine of the user-mode APC. Instead of resolving the API address in kernel mode (there is no easy way to do that), I let the user-mode part of my code resolve it with the Windows API GetProcAddress; in my case I use the address of LoadLibraryA.
APC Enumeration Testing:
Now I am ready to test my approach and see whether it runs into any issues. The complete code can be found on my GitHub ( https://github.com/MahmoudZohdy/Practical_Reverse_Engineering/tree/main/APCEnumeration ).
It worked well, with one known issue: the APC list head itself is also parsed and printed as if it were a KAPC structure, because I could not think of a way to distinguish a queued APC from the APC list head.
00000007 1.72731602 Attempting Hooking All CPUs Except the current processor
00000008 1.72731876 CPU 0 is beining hooked
00000009 1.72758520 APC Enumeration for TID: 7308
00000010 1.72758675 Start Enumerating Kernel APC
00000011 1.72760320 KernelRoutine: 0xFFFFF806203989F0 RundownRoutine: 0xFFFFF806203989F0 NormalRoutine: 0xFFFFF806202DCA00
00000012 1.72760475 KernelRoutine: 0xFFFFCC8B3CB86128 RundownRoutine: 0xFFFFCC8B3CB86128 NormalRoutine: 0xFFFFCC8B3D744080
00000013 1.72760642 KernelRoutine: 0xFFFFF806390B1000 RundownRoutine: 0x0000000000000000 NormalRoutine: 0x0000000000000000 // my Kernel APC
00000014 1.72760701 Start Enumerating User APC
00000015 1.72761536 KernelRoutine: 0xFFFFCC8B3D744080 RundownRoutine: 0x0000000009000100 NormalRoutine: 0x0000000000000100 // APC List Head, not valid function address
00000016 1.72761679 KernelRoutine: 0xFFFFF806390B1000 RundownRoutine: 0x0000000000000000 NormalRoutine: 0x00007FFE5AC604F0 // LoadLibraryA address
00000017 1.72761762 APC Enumeration for TID: 7312
00000018 1.72761846 Start Enumerating Kernel APC
00000019 1.72762036 KernelRoutine: 0xFFFFCC8B3D63B128 RundownRoutine: 0xFFFFCC8B3D63B128 NormalRoutine: 0xFFFFCC8B3D744080
00000020 1.72762203 KernelRoutine: 0xFFFFF806390B1000 RundownRoutine: 0x0000000000000000 NormalRoutine: 0x0000000000000000
00000021 1.72762263 Start Enumerating User APC
00000022 1.72762883 KernelRoutine: 0xFFFFCC8B3D744080 RundownRoutine: 0x0000000108000101 NormalRoutine: 0x0000000000000100
00000023 1.72763026 KernelRoutine: 0xFFFFF806390B1000 RundownRoutine: 0x0000000000000000 NormalRoutine: 0x00007FFE5AC604F0
00000024 1.72763085 APC Enumeration for TID: 7316
00000025 1.72763157 Start Enumerating Kernel APC
00000026 1.72763395 KernelRoutine: 0xFFFFCC8B3D6D6128 RundownRoutine: 0xFFFFCC8B3D6D6128 NormalRoutine: 0xFFFFCC8B3D744080
00000027 1.72764325 KernelRoutine: 0xFFFFF806390B1000 RundownRoutine: 0x0000000000000000 NormalRoutine: 0x0000000000000000
00000028 1.72764397 Start Enumerating User APC
00000029 1.72765243 KernelRoutine: 0xFFFFCC8B3D744080 RundownRoutine: 0x0000000008000101 NormalRoutine: 0x0000000000000100
00000030 1.72765398 KernelRoutine: 0xFFFFF806390B1000 RundownRoutine: 0x0000000000000000 NormalRoutine: 0x00007FFE5AC604F0
00000031 1.72765481 APC Enumeration for TID: 7332
00000032 1.72765565 Start Enumerating Kernel APC
00000033 1.72765803 KernelRoutine: 0xFFFFCC8B3D647128 RundownRoutine: 0xFFFFCC8B3D647128 NormalRoutine: 0xFFFFCC8B3D744080
00000034 1.72765923 KernelRoutine: 0xFFFFF806390B1000 RundownRoutine: 0x0000000000000000 NormalRoutine: 0x0000000000000000
00000035 1.72765994 Start Enumerating User APC
00000036 1.72766721 KernelRoutine: 0xFFFFCC8B3D744080 RundownRoutine: 0x0000000108000101 NormalRoutine: 0x0000000000000100
00000037 1.72766840 KernelRoutine: 0xFFFFF806390B1000 RundownRoutine: 0x0000000000000000 NormalRoutine: 0x00007FFE5AC604F0
00000038 1.72766924 APC Enumeration for TID: 7336
00000039 1.72766995 Start Enumerating Kernel APC
00000040 1.72767246 KernelRoutine: 0xFFFFF806203989F0 RundownRoutine: 0xFFFFF806203989F0 NormalRoutine: 0xFFFFF806202DCA00
00000041 1.72767377 KernelRoutine: 0xFFFFCC8B3D646128 RundownRoutine: 0xFFFFCC8B3D646128 NormalRoutine: 0xFFFFCC8B3D744080
00000042 1.72767520 KernelRoutine: 0xFFFFF806390B1000 RundownRoutine: 0x0000000000000000 NormalRoutine: 0x0000000000000000
00000043 1.72767580 Start Enumerating User APC
00000044 1.72768116 KernelRoutine: 0xFFFFCC8B3D744080 RundownRoutine: 0x0000000008000100 NormalRoutine: 0x0000000000000100
00000045 1.72768235 KernelRoutine: 0xFFFFF806390B1000 RundownRoutine: 0x0000000000000000 NormalRoutine: 0x00007FFE5AC604F0
00000046 1.72768319 APC Enumeration for TID: 7340
00000047 1.72768378 Start Enumerating Kernel APC
00000048 1.72768605 KernelRoutine: 0xFFFFCC8B3D645128 RundownRoutine: 0xFFFFCC8B3D645128 NormalRoutine: 0xFFFFCC8B3D744080
00000049 1.72768760 KernelRoutine: 0xFFFFF806390B1000 RundownRoutine: 0x0000000000000000 NormalRoutine: 0x0000000000000000
00000050 1.72768819 Start Enumerating User APC
00000051 1.72769403 KernelRoutine: 0xFFFFCC8B3D744080 RundownRoutine: 0x0000000108000101 NormalRoutine: 0x0000000000000100
00000052 1.72769558 KernelRoutine: 0xFFFFF806390B1000 RundownRoutine: 0x0000000000000000 NormalRoutine: 0x00007FFE5AC604F0
00000054 1.72774374 Finished UnHooking other CPUs
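One possible way to filter out the list-head entries seen in the log above, offered as an untested heuristic and not something the code in the repository does: KeInitializeApc stamps each KAPC with a Type and a Size in its first bytes, so a node whose would-be containing record does not carry the expected tag is probably the list head. The sketch below models that check in user mode. The structures kapc_sim and thread_sim, the field layout, and the constant APC_OBJECT_TYPE (18, my assumption for the ApcObject type value) are all illustrative and would have to be verified against the target build.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

#define APC_OBJECT_TYPE 18   /* assumed value of ApcObject; verify on the target build */

/* Simplified KAPC: a type/size stamp at the start, then the list link. */
typedef struct kapc_sim {
    unsigned char type;
    unsigned char spare0;
    unsigned char size;
    unsigned char spare1;
    int           id;        /* hypothetical tag, for the demo only */
    list_entry    link;
} kapc_sim;

/* Simplified thread: the list head is preceded by unrelated bytes,
 * just as ApcListHead is preceded by other fields inside _KTHREAD. */
typedef struct thread_sim {
    unsigned char before_head[sizeof(kapc_sim)];
    list_entry    apc_list_head;
} thread_sim;

static void thread_init(thread_sim *t)
{
    memset(t, 0, sizeof *t);
    t->apc_list_head.flink = t->apc_list_head.blink = &t->apc_list_head;
}

static void apc_queue(thread_sim *t, kapc_sim *k, int id)
{
    list_entry *head = &t->apc_list_head;

    memset(k, 0, sizeof *k);
    k->type = APC_OBJECT_TYPE;
    k->size = (unsigned char)sizeof *k;
    k->id = id;
    k->link.flink = head;
    k->link.blink = head->blink;
    head->blink->flink = &k->link;
    head->blink = &k->link;
}

/* Check the bytes where a KAPC's type and size stamp would sit. */
static int looks_like_apc(const list_entry *e)
{
    const unsigned char *base =
        (const unsigned char *)e - offsetof(kapc_sim, link);

    return base[offsetof(kapc_sim, type)] == APC_OBJECT_TYPE &&
           base[offsetof(kapc_sim, size)] == (unsigned char)sizeof(kapc_sim);
}

/* Walk the ring from our own entry and count the nodes that pass the check. */
static size_t count_real_apcs(const list_entry *own)
{
    size_t n = 0;
    const list_entry *e = own;

    do {
        if (looks_like_apc(e))
            n++;
        e = e->flink;
    } while (e != own);
    return n;
}
```

Because the bytes in front of a real list head are arbitrary thread fields, the check can in principle produce a false positive, so it should be treated as a filter for display purposes rather than a guarantee.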