The main purpose of this article is to give an overview of Windows NT kernel initialization (Windows Trusted Boot). What happens behind the scenes when we boot a Windows device? There are not many detailed posts on this, so I thought of writing it up.
This is a continuation of my previous article “Understanding UEFI Secure Boot – How it helps to secure the Windows 10 pre-boot phase”. You can consider it the foundation on which I will explain the role of Windows Trusted Boot (Code Integrity) and ELAM.
Windows OS Architecture – Shell View
Execution of the OS Boot Loader winload.efi marks the end of the Windows pre-boot phase (and the end of Secure Boot's jurisdiction as well) and the beginning of NT kernel (NTOS) initialization.
An overview of the Windows OS architecture in shell view
Activities performed by Windows Boot Loader
The actions performed by winload.efi (synchronously):
- Loads the NT Kernel (ntoskrnl.exe) into memory (RAM)
- Loads the Hardware Abstraction Layer module (hal.dll) and the Local Kernel Debugger (kd.dll). If debugging is enabled, it loads the other debugging libraries such as kdcom.dll, kd1394.dll and kdusb.dll
- Loads the SYSTEM registry hive %SystemRoot%\System32\config\SYSTEM and scans it to pick the SERVICE_BOOT_START drivers, identified via a REG_DWORD value
- Loads the scanned SERVICE_BOOT_START drivers into memory
- Executes the NT Kernel (ntoskrnl.exe) and terminates itself
Before terminating itself, it does one last important thing: it terminates the EFI Boot Services by calling the EFI function ExitBootServices(), thus reclaiming the memory that was being used by EFI during the EFI boot phase.
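The hive scan described above can be pictured with a minimal sketch. It assumes the services key has already been parsed into a plain dictionary; the `SERVICES` data and the `boot_start_drivers` helper are hypothetical stand-ins for illustration, not the loader's actual code. The numeric start types are the standard Windows service start values.

```python
# Standard Windows service start types (stored in the REG_DWORD "Start" value).
SERVICE_BOOT_START = 0
SERVICE_SYSTEM_START = 1
SERVICE_AUTO_START = 2
SERVICE_DEMAND_START = 3
SERVICE_DISABLED = 4

# Hypothetical, hand-written stand-in for the parsed Services key of the SYSTEM hive.
SERVICES = {
    "ACPI": {"Start": SERVICE_BOOT_START},
    "disk": {"Start": SERVICE_BOOT_START},
    "ExampleFilter": {"Start": SERVICE_SYSTEM_START},
    "ExampleService": {"Start": SERVICE_DEMAND_START},
}


def boot_start_drivers(services):
    """Return the names of entries whose Start value marks them as boot-start."""
    return sorted(
        name
        for name, values in services.items()
        if values.get("Start") == SERVICE_BOOT_START
    )


print(boot_start_drivers(SERVICES))
```

Only the entries with a Start value of 0 are picked at this stage; the system-start entries are left for the I/O Manager later in the boot.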
Phases of NT Kernel Initialization
Execution of the NT Kernel marks the beginning of the NT Kernel (ntoskrnl.exe) initialization. Kernel initialization happens in two phases as explained below.
NT Kernel Initialization (Phase 0)
- Phase 0 starts with the Kernel call to initialize Hardware Abstraction Layer (hal.dll)
HAL forms the base for the NT kernel. It encapsulates the system hardware, allowing the OS Kernel and Executive services to communicate with the hardware using instructions (routines and macros) in a generalized way without accessing the hardware directly. This lets the Kernel and the Executive services function without depending on the underlying platform hardware.
- HAL initializes the processor (on a multi-processor system, a single processor is initialized; on a multi-core processor, a single core is initialized), prepares the System Interrupt Controller (interrupts are not yet enabled) and returns control to the Kernel (ntoskrnl.exe)
- Kernel invokes the Memory Manager Executive service, which constructs the page tables and internal data structures necessary to provide basic memory services. It divides the virtual address space into two regions: the lower part, accessible from both user and kernel space, and the upper part, reserved for kernel space only. It creates areas for the file system cache, the paged and non-paged pools of memory, and the page tables, which can be accessed by the Kernel and the CPU. Dynamic allocation and deallocation of virtual memory is managed by this Executive service.
- Kernel invokes the Object Manager Executive service by defining the Object Manager namespace (Process/Thread/Job/File/Section/Access Token/Event/Semaphore/Timer/Key/Desktop/Clipboard/WindowStation/Symbolic Link) such that it can start creating objects to be used by other Executive subsystems. More information on Object structure can be read here. The kernel also creates the Handle table for the purpose of resource (object) tracking.
- Kernel invokes the Security Reference Monitor which initializes the token type object in Object Manager namespace and then uses the object as returned to create and prepare the first token for assignment to the initial process.
- Kernel invokes the Process and Thread Manager Executive subsystem, which initializes the process and thread object types and sets up lists to track active processes and threads within the Object Manager namespace. It then creates a process object for the first kernel-mode process, the System Idle Process, with a single thread running on the processor.
The sole task of this process is to keep the processor occupied when it isn't processing any other threads. In such scenarios, the idle thread calls routines in the Hardware Abstraction Layer to reduce the CPU clock speed or to apply other power-saving mechanisms. Since this process has the lowest possible priority, the NT Kernel scheduler can readily switch it out for an incoming thread. Without the System Idle Process, a processor with nothing to process could cause the system to freeze. Thus this process keeps the CPU running and waiting for anything the Kernel throws at it. In the reference screenshot below, the System Idle Process shows 79% CPU, which means 79% of the CPU is free to be allocated to other processes. This is commonly mistaken for a process with high CPU utilization.
The Process and Thread Manager also creates another process object at this stage, the System process, which is spawned by the NT Kernel (ntoskrnl.exe), and launches a thread of this process to drive the next phase of Kernel initialization.
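To make the role of the System Idle Process concrete, here is a toy model of a priority scheduler in which the idle thread only runs when nothing else is ready. It is a deliberately simplified sketch and not how the NT scheduler is implemented; `run_schedule`, `idle_percent` and the sample workload are hypothetical names and data.

```python
import heapq

IDLE = "System Idle Process"


def run_schedule(arrivals, ticks):
    """Simulate `ticks` time slices on one CPU.

    arrivals maps a tick number to a list of (priority, name, burst) tuples.
    The highest-priority ready thread runs; if none is ready, the idle thread runs.
    """
    ready = []      # min-heap keyed on negated priority
    timeline = []   # which thread ran in each tick
    sequence = 0    # tie-breaker that preserves arrival order
    for tick in range(ticks):
        for priority, name, burst in arrivals.get(tick, []):
            heapq.heappush(ready, (-priority, sequence, name, burst))
            sequence += 1
        if ready:
            neg_priority, order, name, burst = heapq.heappop(ready)
            timeline.append(name)
            if burst > 1:
                heapq.heappush(ready, (neg_priority, order, name, burst - 1))
        else:
            timeline.append(IDLE)
    return timeline


def idle_percent(timeline):
    """Share of time slices spent in the idle thread, i.e. free CPU."""
    return 100 * timeline.count(IDLE) / len(timeline)


timeline = run_schedule({1: [(8, "worker", 2)], 5: [(8, "io", 1)]}, ticks=10)
print(timeline)
print(idle_percent(timeline))
```

In this model a high idle percentage simply means the CPU had nothing else to do, which is why the figure reported for the System Idle Process reads as free capacity rather than load.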
NT Kernel Initialization (Phase 1)
- System process thread requests HAL to enable interrupts.
- System process thread initializes Local Kernel Debugger (kd.dll) already loaded in memory by bootloader (winload.efi)
- System process thread invokes the rest of the Executive services to initialize and have their objects created. The below steps are not sequential but asynchronous in nature.
- Configuration Manager Executive service defines the HKEY_LOCAL_MACHINE\SYSTEM part of the Windows Registry, as found at %SystemRoot%\System32\config\SYSTEM. It also defines the HKEY_LOCAL_MACHINE\HARDWARE\ part of the Registry, which will be filled by the I/O Manager Executive service during driver initialization.
Windows drivers, in addition to the START value, have three other properties (GROUP, TAG and DependOnGroup) which decide when a driver is initialized during the OS boot.
- I/O Manager sorts the SERVICE_BOOT_START drivers (loaded into memory by winload.efi based on the START value) according to their GROUP value, using the order defined in the registry at HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ServiceGroupOrder. It then orders the drivers within each group by their TAG value, using the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GroupOrderList, to prepare the final list. It then traverses the list and initializes each driver according to its place in the list.
- Power Manager Executive service becomes functional with the initialization of ACPI.sys (a SERVICE_BOOT_START driver), the role of which is to support power management and Plug and Play (PnP) device enumeration. HAL causes ACPI.sys to be at the base of the Device Tree. The current system time is recorded as the System Boot time.
Up to this point, only a single processor (or a single core, in the case of a multi-core processor) is active. The System process thread calls HAL to initialize the rest of the processors (or the rest of the cores, in the case of a multi-core processor).
- I/O Manager scans the registry to load the SERVICE_SYSTEM_START drivers. The drivers are initialized in the same manner as above.
During initialization, if the I/O Manager encounters a driver which has a DependOnGroup value defined, it waits to initialize that driver until a driver belonging to that particular group has been initialized. As part of initialization, the I/O Manager also checks the return status. If a driver reports an error on initialization, the I/O Manager takes action according to the ErrorControl value defined in the driver's registry key. In some cases, an error may prevent the NT Kernel from continuing to boot, resulting in a BSOD. This type of scenario can be investigated by enabling the Windows Boot log, which is created by the NT Kernel and saved at %WinDir%\ntbtlog.txt
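The group-then-tag ordering described for the boot-start and system-start drivers can be sketched as a two-level sort. This is only an illustration under stated assumptions: the group names, tag orders and driver records below are hypothetical sample data, and `init_order` is an invented helper, not a Windows API.

```python
# Hypothetical subset of ServiceGroupOrder: groups in initialization order.
SERVICE_GROUP_ORDER = ["Boot Bus Extender", "System Bus Extender", "SCSI miniport", "Filter"]

# Hypothetical GroupOrderList data: for each group, the tags in initialization order.
GROUP_ORDER_LIST = {"SCSI miniport": [2, 1]}

# Hypothetical driver records, in the arbitrary order the loader found them.
DRIVERS = [
    {"name": "filt", "group": "Filter", "tag": None},
    {"name": "stor_b", "group": "SCSI miniport", "tag": 1},
    {"name": "stor_a", "group": "SCSI miniport", "tag": 2},
    {"name": "bus", "group": "Boot Bus Extender", "tag": None},
    {"name": "nogroup", "group": None, "tag": None},
]


def init_order(drivers, group_order, tag_order):
    """Order drivers by group position first, then by tag position within the group.

    Drivers with an unlisted group or tag sort after the listed ones.
    """
    def sort_key(driver):
        group = driver["group"]
        group_rank = group_order.index(group) if group in group_order else len(group_order)
        tags = tag_order.get(group, [])
        tag_rank = tags.index(driver["tag"]) if driver["tag"] in tags else len(tags)
        return (group_rank, tag_rank)

    return [driver["name"] for driver in sorted(drivers, key=sort_key)]


print(init_order(DRIVERS, SERVICE_GROUP_ORDER, GROUP_ORDER_LIST))
```

Placing unlisted groups and tags last is a choice made for this sketch; the point is only that the group order is applied first and the tag order refines it within each group.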
- Post initializing the kernel space drivers, I/O Manager calls HAL to define the drive-letter mappings which it does by reading the information from registry path
- I/O Manager invokes its subsystem Cache Manager which works closely with the File-System subsystem and Memory Manager Executive service to cache file-system data in memory. Cache Manager helps in optimizing the OS performance as it greatly reduces expensive Disk operations every time a process/thread needs to access a file data.
- I/O Manager invokes its subsystem PnP Manager which enables the OS to respond to Plug and Play devices.
PnP Manager initializes with one virtual device on the system, named Root. HAL uses it to detect the buses and everything connected to the system board. Each bus and device discovered by HAL returns a Vendor ID and Product ID. The PnP Manager reads HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum to find a key which matches the returned Vendor ID and Product ID. The key contains a value, Driver, which points to a particular Registry key under HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\Class. That key contains the value InfPath (pointing to C:\WINDOWS\INF), which tells the kernel where to load the driver from. If a match is not found, meaning the driver is not present, a prompt to install the driver is triggered once Windows Explorer starts.
- Kernel Transaction Manager Executive service implements transaction processing in kernel mode, enabling processes/threads to use atomic transactions (ACID property to ensure data integrity) on resources (both kernel/user mode)
Communication between the Executive subsystems and the Kernel goes through Advanced Local Procedure Call (ALPC), implemented via the Native APIs: a lightweight API used within kernel space by the Kernel Executive services for communication and for providing services to user-mode clients, and exposed to user-mode space via ntdll.dll
With all the kernel space drivers initialized and the Executive services functional, the System process thread invokes the Process and Thread Manager Executive service to create and launch the Session Manager Subsystem process (smss.exe). This is the first user-mode process to be created, and it is responsible for creating the user-mode environment that provides the visible interface to the Windows NT Kernel.
This marks the end of Kernel Initialization and start of GUI interface.
Kernel Initialization – Components Schema representation
Below image shows the components of NT Kernel and is colored according to the initialization phase – blue denotes Phase 0 and green denotes Phase 1.
Windows Trusted Boot – Code Integrity?
The above gives you an overview of the internals of NT Kernel initialization. Now let's relate this to the security aspect.
The pre-boot UEFI phase is already protected by Secure Boot. But is it enough to ensure that the OS boots to a trusted and secure state? No. The Kernel initialization phase is when the drivers are executed to prepare the runtime, so without a check mechanism at this phase, untrusted code could still run during boot.
Enter Windows Trusted Boot – which takes over from where Secure Boot left off. The bootloader verifies the digital signature of the Windows 10 kernel before loading it. The Windows 10 kernel, in turn, verifies every other component of the Windows startup process, including the boot drivers, startup files, and ELAM.
The logic in the Kernel-Mode Code Signing Policy responsible for enforcing code integrity is shared between the Windows kernel image and the kernel-mode library ci.dll
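The idea of each stage verifying the next before handing over can be sketched as follows. This is a loose analogy only: real Code Integrity validates digital signatures and certificate chains, whereas this sketch swaps in a flat table of SHA-256 digests to keep the example self-contained. The names `verify_chain` and `TRUSTED`, the file name example.sys and the sample byte strings are all hypothetical.

```python
import hashlib


def digest(image: bytes) -> str:
    """SHA-256 digest of an image, used here in place of a real signature check."""
    return hashlib.sha256(image).hexdigest()


def verify_chain(stages, trusted):
    """Walk the boot stages in order and stop at the first one that fails the check.

    stages:  ordered list of (name, image_bytes)
    trusted: mapping of name -> expected digest
    Returns (names_loaded, name_of_failed_stage_or_None).
    """
    loaded = []
    for name, image in stages:
        if trusted.get(name) != digest(image):
            return loaded, name
        loaded.append(name)
    return loaded, None


kernel = b"kernel image"
driver = b"boot driver image"
TRUSTED = {"ntoskrnl.exe": digest(kernel), "example.sys": digest(driver)}

ok = verify_chain([("ntoskrnl.exe", kernel), ("example.sys", driver)], TRUSTED)
tampered = verify_chain([("ntoskrnl.exe", kernel), ("example.sys", driver + b"!")], TRUSTED)
print(ok)
print(tampered)
```

The tampered run loads the kernel but refuses the modified driver, which mirrors the behaviour described above: a component that fails the integrity check is not loaded.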
For devices that clean install Windows 10, and where Secure Boot is On (this is standard for all new devices since the release of Windows 8.0), all new drivers must be signed by WHQL/Sysdev, or they will fail the code integrity check. Read this Microsoft document for more information.
If a critical component fails the integrity check, it is prevented from loading and executing, which may cause a BSOD.
In most cases, however, Windows 10 will automatically try to repair the corrupted component, restoring the integrity of Windows and allowing the PC to start normally.
If you are getting this frequently, the probable action plan would be:
1. Run DISM /Online /Cleanup-Image /RestoreHealth
Deployment Image Servicing and Management tool will connect to the
Windows Update servers to download and replace any damaged files in the
local image for Windows 10 as necessary. Once this is complete…
2. Run sfc /scannow
This will scan all protected system files, and replace corrupted files with a
cached copy that is located in a compressed folder at %WinDir%\System32\dllcache
More info here.
If you still get the same BSOD after this, it could mean an issue
with the system itself, and you should run the UEFI diagnostics to check the memory.
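If you want to script the two repair steps above, a minimal sketch could look like this. It assumes an elevated prompt on Windows with both tools on the PATH. `run_repairs` is a hypothetical helper, and stopping after a failed step is a design choice made here, not documented tool behaviour.

```python
import subprocess

# The two commands from the action plan above, in order.
REPAIR_STEPS = [
    ["DISM", "/Online", "/Cleanup-Image", "/RestoreHealth"],
    ["sfc", "/scannow"],
]


def run_repairs(runner=subprocess.run):
    """Run each repair step in order and collect (tool, exit_code) pairs.

    `runner` is injectable so the flow can be exercised without touching the system.
    Later steps are skipped if an earlier one returns a non-zero exit code.
    """
    results = []
    for command in REPAIR_STEPS:
        completed = runner(command, check=False)
        results.append((command[0], completed.returncode))
        if completed.returncode != 0:
            break
    return results
```

Calling `run_repairs()` with no arguments would execute the real tools; passing a stub runner lets you dry-run the sequence first.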
Configurable Code Integrity
Further, a Windows 10 device can be restricted to only run authorized apps by using a feature called Configurable Code Integrity, part of Windows Defender Application Control features of Windows 10. The advantage of this is stated here in the Microsoft documentation.
You can configure and deploy a WDAC policy from Intune. Click here to know more on this.
Running Code Integrity as a separate service from the OS kernel?
With Virtual Secure Mode (VSM), a feature that leverages the virtualization extensions of the CPU to add protection for data in memory, Windows 10 can run the entire Code Integrity service as a VSM instance under the hypervisor's authority, completely separate from the OS Kernel.
The feature is named Hypervisor-protected Code Integrity (HVCI). It helps to further harden the OS Kernel against memory attacks.
With the Code Integrity service running as a virtual instance separate from the OS Kernel itself, the protected boot flow schema becomes
Role of ELAM in securing the Windows Boot
Secure Boot ensures that only a trusted OS bootloader is allowed to execute, whereas Trusted Boot ensures that the OS kernel is trusted. Here trust means the components are signed by Microsoft. But when a system boots, it loads and executes numerous drivers, not all of which are from Microsoft. Thus the next opportunity for malware to start is by infecting a non-Microsoft boot driver.
As I mentioned in my previous article “Understanding UEFI Secure Boot – How it helps to secure the Windows 10 pre-boot phase”, traditional anti-malware apps start after Kernel initialization, thereby giving a rootkit disguised as a driver the opportunity to work.
This is where Microsoft introduced the Early Launch Anti Malware (ELAM) driver: a detection mechanism for Windows systems that allows third-party security software, such as antivirus software, to register a kernel-mode driver that is guaranteed to start early in the boot process, before other third-party boot-start drivers are initialized.
The ELAM driver registers callback routines, which the Plug and Play manager executive invokes for each boot driver so that the ELAM driver can classify it. Based on the returned classification and the defined policy, the PnP manager decides whether to initialize the boot driver.
Working of ELAM driver
The ELAM driver evaluates a driver signature and classifies it as good, bad, or unknown based on the malware signature data stored under HKLM\ELAM\<VendorName>\, where the ELAM driver vendor stores the whitelist/blacklist AV signatures for the ELAM driver to use.
This registry hive is loaded into memory by the OS Bootloader (winload.efi) from c:\Windows\System32\config\ELAM. However, you won't find it when you view the registry after the system has completed boot. This is for performance reasons: once the role of ELAM is over, this registry hive is unloaded.
Kernel decides to load a driver post ELAM classification as per the policy defined at registry path
DriverLoadPolicy can have the below values
8 = Good only
1 = Good and unknown
3 = Good, unknown and bad but critical
7 = All
This can also be configured using GPO under
Computer Configuration \ Administrative Templates \ System \ Early Launch Antimalware \ Boot-Start Driver Initialization Policy
By default there is no policy defined, and the Windows Kernel will load drivers with the following ELAM classifications: Good, Bad but critical for boot, and Unknown. By defining a policy, you can add more security by allowing only Good drivers to load. However, this might sometimes lead to instability.
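How the classification and the DriverLoadPolicy value combine can be sketched as a small lookup. The policy numbers are the ones listed above. Everything else is a simplification for illustration: the function names, the set-based signature lists and the `DEFAULT_POLICY` constant (chosen as 3 only because it matches the default behaviour just described) are assumptions, not the kernel's actual implementation.

```python
GOOD = "good"
BAD = "bad"
BAD_BUT_CRITICAL = "bad_but_critical"
UNKNOWN = "unknown"

# DriverLoadPolicy value -> classifications that are allowed to initialize.
ALLOWED_BY_POLICY = {
    8: {GOOD},
    1: {GOOD, UNKNOWN},
    3: {GOOD, UNKNOWN, BAD_BUT_CRITICAL},
    7: {GOOD, UNKNOWN, BAD_BUT_CRITICAL, BAD},
}

# Assumption: with no policy configured, behaviour matches value 3.
DEFAULT_POLICY = 3


def classify(driver_signature, known_good, known_bad, boot_critical=False):
    """Classify a boot driver against the vendor's signature data (modelled as sets)."""
    if driver_signature in known_good:
        return GOOD
    if driver_signature in known_bad:
        return BAD_BUT_CRITICAL if boot_critical else BAD
    return UNKNOWN


def should_initialize(classification, policy=DEFAULT_POLICY):
    """Decide whether a driver with this classification is initialized under the policy."""
    return classification in ALLOWED_BY_POLICY[policy]


known_good = {"sig-a"}
known_bad = {"sig-b"}
for signature in ("sig-a", "sig-b", "sig-c"):
    result = classify(signature, known_good, known_bad)
    print(signature, result, should_initialize(result), should_initialize(result, policy=8))
```

Under the default, an unknown driver is still initialized; switching the policy to 8 blocks it, which is the extra security (and the potential instability) mentioned above.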
As the Kernel initialization completes and Windows subsystem starts (win32 environment) the runtime AV service gets initialized. At this point, the ELAM driver gives the handoff and is unloaded. The runtime AV service takes the mantle for AV check and protection from there onwards.
The default Windows ELAM driver
Yes, Windows Defender provides the default ELAM driver on Windows 10 and it is pretty good at its work considering that it comes free. Microsoft being named the leader in the Gartner Magic Quadrant in Endpoint protection, it is quite easy to state that the AV from Microsoft is quite robust and effective.
Here you see the START_TYPE as DEMAND_START because my system has a 3rd party AV service registering its ELAM driver.
3rd party AV service also allowed to register ELAM driver
Microsoft allows 3rd party AV services to register their ELAM driver if you choose to use them instead of the inbuilt AV solution that Windows 10 comes with.
You can see here that McAfee, my 3rd party anti-malware service, has its own ELAM driver registered with START_TYPE set as BOOT_START and
By default, it is the responsibility of the 3rd party AV service to store a copy of the ELAM driver in the registry defined backup location, from where a copy of the ELAM driver can be obtained, in case the original file gets corrupted.
This is usually the path C:\Windows\ELAMBKUP, defined in registry location HKLM\SYSTEM\CurrentControlSet\Control\EarlyLaunch via reg_key
Does ELAM guarantee complete AV protection?
Sadly no. ELAM works when nothing else on the system has been initialized, and it does not get to inspect the driver package itself. It only gets the driver signature details through the callback it receives from the Kernel, which it checks against its fixed AV signature database as defined in the registry.
This limits the ability of ELAM to detect malware that requires heuristic analysis, or polymorphic malware.
But some protection/check mechanism is better than nothing, and in that sense ELAM plays its role well in securing the Windows boot flow.
If you have read till this, you would be able to sum up the entire boot flow as protected by Secure Boot and Windows Trusted Boot as shown below
To be contd…
This article will be followed by my next article where I will be talking about the Windows Measured Boot and how it helps in securing the Windows Boot flow.
Till then, as I always say, read something new everyday, learn something new everyday…