Understanding Windows Trusted Boot – Integrity Check and ELAM


The main purpose of this article is to give an overview of the Windows NT kernel initialization (Windows Trusted Boot). What happens behind the scenes when we boot a Windows device? There are not many detailed posts on this, so I thought of writing this up.

This is a continuation of my previous article “Understanding UEFI Secure Boot – How it helps to secure the Windows 10 pre-boot phase“. You can consider it as the foundation on which I will explain the role of Windows Trusted Boot (Code Integrity) and ELAM.

Windows OS Architecture – Shell View

Execution of the OS boot loader winload.efi marks the end of the Windows pre-boot phase (and the end of Secure Boot's jurisdiction) and the beginning of NT kernel (NTOS) initialization. An overview of the Windows OS architecture in shell view is shown below.

Windows OS Architecture – Shell View – Windows Trusted Boot

Activities performed by Windows Boot Loader

The actions as performed by winload.efi (synchronous) –

  • Loads NT Kernel (ntoskrnl.exe) to memory (RAM)
  • Loads the Hardware Abstraction Layer module (hal.dll) and the Local Kernel Debugger (kd.dll). If debugging is enabled, it also loads the other debugging libraries such as kdcom.dll, kd1394.dll and kdusb.dll
  • Loads the SYSTEM registry hive (HKEY_LOCAL_MACHINE\SYSTEM) from %SystemRoot%\System32\config\SYSTEM and scans the driver entries under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services to pick the SERVICE_BOOT_START drivers, identified by a Start value (reg_dword) of 0x00000000
  • Loads the scanned SERVICE_BOOT_START drivers to memory
  • Executes the NT Kernel (ntoskrnl.exe) and terminates itself

Before terminating itself, it does one last important thing – it terminates the EFI Boot Services by making a call to the EFI function ExitBootServices() thus reclaiming the memory that was being used by EFI during the EFI boot phase.
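
The boot-start selection described above can be pictured with a small sketch. It does not read a real registry hive: the `services` dictionary and the driver names in it are invented stand-ins, and `select_boot_start` is a hypothetical helper. Only the Start value constants are the real ones.

```python
# Start values used for Windows drivers and services.
SERVICE_BOOT_START = 0x0
SERVICE_SYSTEM_START = 0x1
SERVICE_AUTO_START = 0x2
SERVICE_DEMAND_START = 0x3
SERVICE_DISABLED = 0x4

# Hypothetical stand-in for the driver entries found in the SYSTEM hive.
# Driver names and values are invented for illustration.
services = {
    "bus_driver": {"Start": SERVICE_BOOT_START},
    "disk_driver": {"Start": SERVICE_BOOT_START},
    "net_driver": {"Start": SERVICE_SYSTEM_START},
    "print_service": {"Start": SERVICE_AUTO_START},
}


def select_boot_start(entries):
    """Return the names of entries whose Start value is SERVICE_BOOT_START."""
    return sorted(
        name for name, values in entries.items()
        if values.get("Start") == SERVICE_BOOT_START
    )


print(select_boot_start(services))  # ['bus_driver', 'disk_driver']
```

Everything with a Start value other than 0 is left for later stages of the boot (for example, the SERVICE_SYSTEM_START drivers are picked up by the I/O Manager in Phase 1).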

Phases of NT Kernel Initialization

Execution of the NT Kernel marks the beginning of the NT Kernel (ntoskrnl.exe) initialization. Kernel initialization happens in two phases as explained below.

NT Kernel Initialization (Phase 0)

  • Phase 0 starts with the Kernel call to initialize Hardware Abstraction Layer (hal.dll)
HAL forms the base for the NT kernel, it encapsulates the system hardware,
allowing the OS Kernel and Executive services to communicate with the hardware
using instructions (routines and macros) in a generalized way without accessing
the hardware directly - thus allowing the Kernel and the Executive Services to
function without being dependent on the underlying platform hardware.
  • HAL initializes the processor (on a multi-processor system only a single processor is initialized; on a multi-core processor only a single core), prepares the System Interrupt Controller (interrupts are not yet enabled) and returns control to the Kernel (ntoskrnl.exe)
  • Kernel invokes the Memory Manager Executive service, which constructs the page tables and internal data structures necessary to provide basic memory services. It divides the virtual address space into two regions: the lower part is accessible from both user and kernel mode, and the upper part is reserved for kernel mode only. It creates areas for the file system cache, the paged and non-paged pools of memory, and the page tables used by the Kernel and CPU. This Executive service manages the dynamic allocation and deallocation of virtual memory.
Memory Space – Windows Trusted Boot
  • Kernel invokes the Object Manager Executive service, which defines the Object Manager namespace and its object types (Process/Thread/Job/File/Section/Access Token/Event/Semaphore/Timer/Key/Desktop/Clipboard/WindowStation/Symbolic Link) so that it can start creating objects to be used by other Executive subsystems. More information on Object structure can be read here. The kernel also creates the Handle table for the purpose of resource (object) tracking.
  • Kernel invokes the Security Reference Monitor, which initializes the token object type in the Object Manager namespace and then uses the returned object to create and prepare the first token for assignment to the initial process.
  • Kernel invokes the Process and Thread Manager Executive subsystem, which initializes the process and thread object types and sets up lists to track active processes and threads within the Object Manager namespace. It then creates a process object for the first kernel-mode process: the System Idle process, a single thread running on the processor.
The sole task of this process is to keep the processor occupied when it isn't
processing any other threads. In such scenarios, idle thread will call routines
in the Hardware Abstraction Layer to reduce CPU clock speed or to implement other
power-saving mechanisms. Since this process has the lowest possible priority, 
the NT Kernel scheduler can easily switch this with an incoming process. 

Without the System Idle Process, if a situation arises where the processor has
nothing to process, it can result in system freeze. Thus this process keeps the
CPU running and waiting for anything the Kernel throws at it.

The reference snapshot below shows the System Idle Process using 79% of CPU,
which means 79% of CPU is free to be allocated to other processes. This is
commonly mistaken for a process with high CPU utilization.

Windows Trusted Boot

The Process and Thread Manager also creates another process object at this stage: the System process, which is spawned by the NT Kernel (ntoskrnl.exe). It launches a thread of this process to drive the next phase of Kernel initialization.

NT Kernel Initialization (Phase 1)

  • System process thread requests HAL to enable interrupts.
  • System process thread initializes Local Kernel Debugger (kd.dll) already loaded in memory by bootloader (winload.efi)
  • System process thread invokes the rest of the Executive services to initialize and have their objects created. The below steps are not sequential but asynchronous in nature.
  • Configuration Manager Executive service defines the HKEY_LOCAL_MACHINE\SYSTEM part of Windows Registry as found at %SystemRoot%\System32\config\SYSTEM. It also defines the HKEY_LOCAL_MACHINE\HARDWARE\ part of the Registry which will be filled by I/O Manager Executive service during driver initialization.
In addition to the START value, Windows drivers have three other properties -
GROUP, TAG and DependOnGroup - which decide when the driver is initialized
during the OS boot.
  • I/O Manager sorts the SERVICE_BOOT_START drivers (loaded to memory by winload.efi based on the START value) according to their GROUP value, using the group order in the registry at HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ServiceGroupOrder. Within each group, it then orders the drivers by their TAG value, using the registry at HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GroupOrderList, to prepare the final list. It then traverses the list and initializes each driver in turn.
  • Power Manager Executive service becomes functional with the initialization of ACPI.sys (a SERVICE_BOOT_START driver), whose role is to support power management and Plug and Play (PnP) device enumeration. HAL causes ACPI.sys to be at the base of the Device Tree. The current system time is recorded as the System Boot time.
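
The two-level ordering by GROUP and then by TAG can be sketched as follows. This is only an illustration under stated assumptions: `group_order` stands in for the ServiceGroupOrder list, `tag_order` for the per-group entries of GroupOrderList, and the driver names, group excerpts and the helper `order_boot_drivers` are all hypothetical. Placing drivers without a known group or tag last is also an assumption of this sketch.

```python
# Hypothetical excerpt of the group order (stand-in for ServiceGroupOrder).
group_order = ["Boot Bus Extender", "System Bus Extender", "SCSI Class"]

# Hypothetical per-group tag order (stand-in for GroupOrderList).
# Note that the order is given by the list position, not by the tag number.
tag_order = {
    "Boot Bus Extender": [1],
    "SCSI Class": [2, 1],
}

# Invented boot-start drivers with their GROUP and TAG values.
drivers = [
    {"Name": "drv_c", "Group": "SCSI Class", "Tag": 2},
    {"Name": "drv_a", "Group": "Boot Bus Extender", "Tag": 1},
    {"Name": "drv_d", "Group": None, "Tag": None},
    {"Name": "drv_b", "Group": "SCSI Class", "Tag": 1},
]


def order_boot_drivers(drivers, group_order, tag_order):
    """Sort drivers by group position first, then by tag position in the group."""
    def sort_key(driver):
        group = driver.get("Group")
        # Assumption: drivers in no known group go after all grouped drivers.
        group_rank = group_order.index(group) if group in group_order else len(group_order)
        tags = tag_order.get(group, [])
        tag = driver.get("Tag")
        # Assumption: drivers with no listed tag go last within their group.
        tag_rank = tags.index(tag) if tag in tags else len(tags)
        return (group_rank, tag_rank)

    return sorted(drivers, key=sort_key)


ordered = order_boot_drivers(drivers, group_order, tag_order)
print([d["Name"] for d in ordered])  # ['drv_a', 'drv_c', 'drv_b', 'drv_d']
```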

Up to this point, only a single processor (or a single core, in the case of a multi-core processor) is active. The System process thread calls HAL to initialize the rest of the processors (or, in the case of a multi-core processor, the rest of the cores).

  • I/O Manager scans the registry to load the SERVICE_SYSTEM_START drivers. The drivers are initialized in the same manner as above.
During initialization, if I/O Manager encounters a driver which has a DependOnGroup
value defined, it waits to initialize that driver until a driver belonging to that
particular group has been initialized.

As part of initialization, I/O Manager also checks the return status. If a
driver reports an error on initialization, I/O Manager takes action according
to the ErrorControl value as defined in the driver's registry key. In some cases,
an error may prevent the NT Kernel from continuing to boot, resulting in a BSOD.
This type of scenario can be investigated by enabling the Windows Boot log, which
is created by the NT Kernel and saved at %WinDir%\ntbtlog.txt
  • After initializing the kernel space drivers, I/O Manager calls HAL to define the drive-letter mappings, which it does by reading the information from registry path HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk
  • I/O Manager invokes its subsystem Cache Manager, which works closely with the File-System subsystem and the Memory Manager Executive service to cache file-system data in memory. Cache Manager helps optimize OS performance, as it greatly reduces the expensive disk operations otherwise needed every time a process/thread accesses file data.
  • I/O Manager invokes its subsystem PnP Manager which enables the OS to respond to Plug and Play devices.
PnP Manager initializes with one Virtual Device on the system named Root.
This is used by HAL to detect the buses and everything connected to the System Board.
Each bus and device discovered by HAL returns a Vendor ID and Product ID.
The PnP Manager reads HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum to
find a key which matches the returned Vendor ID and Product ID.
The key contains a value Driver which points to a particular Registry key under
HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\Class which contains the value
InfPath (points to C:\WINDOWS\INF), telling the kernel from where to load the
driver. Alternatively, if a match is not found, meaning the driver is not present,
a prompt to install the driver will be triggered once Windows Explorer starts.
  • Kernel Transaction Manager Executive service implements transaction processing in kernel mode, enabling processes/threads to use atomic transactions (ACID property to ensure data integrity) on resources (both kernel/user mode)
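
The ErrorControl handling mentioned in the list above can be summarized in a short sketch. The four numeric values are the documented ErrorControl levels for services and drivers; the function `action_on_init_failure`, its `using_last_known_good` parameter and the returned strings are hypothetical and only paraphrase the documented behavior, which may differ in detail between Windows versions.

```python
from enum import IntEnum


class ErrorControl(IntEnum):
    IGNORE = 0    # ignore the error and continue
    NORMAL = 1    # log the error and continue
    SEVERE = 2    # fall back to the last-known-good configuration
    CRITICAL = 3  # fall back, or fail the boot if already on last-known-good


def action_on_init_failure(error_control, using_last_known_good=False):
    """Return a description of what happens when a driver fails to initialize."""
    level = ErrorControl(error_control)
    if level is ErrorControl.IGNORE:
        return "continue"
    if level is ErrorControl.NORMAL:
        return "log and continue"
    if not using_last_known_good:
        # Both SEVERE and CRITICAL first retry with last-known-good.
        return "restart with last-known-good"
    if level is ErrorControl.SEVERE:
        return "log and continue"
    return "fail startup"


print(action_on_init_failure(1))        # log and continue
print(action_on_init_failure(3, True))  # fail startup
```

The last case, a critical driver failing even on the last-known-good configuration, is the kind of failure that can stop the boot.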

The communication between the Executive subsystems and the Kernel is through Asynchronous Local Inter-Process Communication (ALPC, also known as Advanced Local Procedure Call), implemented via the Native API: a lightweight API used within kernel space by the Kernel Executive services for communication and for providing services to user-mode clients, and exposed to user mode via ntdll.dll

With all the kernel space drivers initialized and the Executive services functional, the System process thread invokes the Process and Thread Manager Executive service to create and launch the Session Manager Subsystem process (smss.exe). This is the first user-mode process to be created, and it is responsible for creating the user-mode environment that provides the visible interface to the Windows NT Kernel.

smss.exe – 1st user mode process created post kernel initialization ( Windows Trusted Boot)

This marks the end of Kernel Initialization and the start of the user-mode GUI environment.

Kernel Initialization – Components Schema representation

The image below shows the components of the NT Kernel, colored according to the initialization phase: blue denotes Phase 0 and green denotes Phase 1.

Windows 10 Kernel Components – Kernel Initialization – Windows Pre-Boot (Windows Trusted Boot)

Windows Trusted Boot – Code Integrity?

The above gives you an overview of the internals of the NT Kernel Initialization. Now let's relate this to the security aspect.

The pre-boot UEFI phase is already protected by Secure Boot. But is it enough to ensure that the OS boots to a trusted and secure state? No, because without a check mechanism at the Kernel Initialization phase, when the drivers are executed to prepare the runtime, tampered or malicious code could still be loaded.

Enter Windows Trusted Boot – which takes over from where Secure Boot left off. The bootloader verifies the digital signature of the Windows 10 kernel before loading it. The Windows 10 kernel, in turn, verifies every other component of the Windows startup process, including the boot drivers, startup files, and ELAM.
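
The idea that each stage verifies the next one before handing over control can be sketched as below. This is a simplified model, not how Windows implements it: real Trusted Boot checks digital signatures, whereas this sketch swaps in a plain SHA-256 allowlist as a stand-in. The image contents, the `trusted` set and the helpers `digest`, `verify_and_load` and `boot` are all hypothetical.

```python
import hashlib


def digest(image: bytes) -> str:
    """Return the SHA-256 digest of an image as a hex string."""
    return hashlib.sha256(image).hexdigest()


def verify_and_load(stage_name, image, trusted_digests):
    """Load a stage only if its digest is trusted (stand-in for a signature check)."""
    if digest(image) not in trusted_digests:
        raise RuntimeError(f"{stage_name} failed the integrity check")
    return stage_name


def boot(chain, trusted_digests):
    """Verify every stage in order; stop at the first one that fails."""
    loaded = []
    for name, image in chain:
        loaded.append(verify_and_load(name, image, trusted_digests))
    return loaded


# Invented image contents standing in for real binaries.
kernel = b"kernel image"
driver = b"boot driver image"
trusted = {digest(kernel), digest(driver)}

print(boot([("ntoskrnl.exe", kernel), ("boot driver", driver)], trusted))
# ['ntoskrnl.exe', 'boot driver']
```

If any image in the chain is modified, its digest no longer matches and `boot` raises before that stage is loaded, which mirrors the point that an unverified component is never given control.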

The logic in the Kernel-Mode Code Signing Policy responsible for enforcing code integrity is shared between the Windows kernel image and the kernel-mode library ci.dll

Windows Trusted Boot – Code Integrity component – ci.dll exported function view

For devices with a clean install of Windows 10 and with Secure Boot on (the standard for all new devices since the release of Windows 8), all new drivers must be signed by WHQL/Sysdev, or they will fail the code integrity check. Read this Microsoft document for more information.

If a critical component fails the integrity check, it will be blocked from loading and executing, which may cause a BSOD.

Driver Check Failed – Code Integrity Error BSOD – Windows Trusted Boot

However, in most cases Windows 10 will automatically try to repair the corrupted component, restoring the integrity of Windows and allowing the PC to start normally.

However if you are getting this frequently, the probable action plan would be

1. Run DISM /Online /Cleanup-Image /RestoreHealth

Deployment Image Servicing and Management tool will connect to the
Windows Update servers to download and replace any damaged files in the
local image for Windows 10 as necessary. Once this is complete…


2. Run sfc /scannow

This will scan all protected system files, and replace corrupted files with a
cached copy that is located in a compressed folder at %WinDir%\System32\dllcache

More info here.

If you are still getting the same BSOD after this, it could mean an issue
with the system hardware itself, and you should run the UEFI diagnostics to
check memory and HDD.

Configurable Code Integrity

Further, a Windows 10 device can be restricted to only run authorized apps by using a feature called Configurable Code Integrity, part of Windows Defender Application Control features of Windows 10. The advantage of this is stated here in the Microsoft documentation.

Windows Trusted Boot – WDAC Configurable Code Integrity from Intune

You can configure and deploy a WDAC policy from Intune. Click here to know more on this.

Running Code Integrity as a separate service from the OS kernel?

With Virtual Secure Mode (VSM), a feature that leverages the virtualization extensions of the CPU to provide added security for data in memory, Windows 10 can run Code Integrity in its entirety as a VSM instance under the hypervisor's authority, completely separate from the OS Kernel.

The feature name is Hypervisor-protected Code Integrity (HVCI). This helps to further harden the OS Kernel against memory attacks.

With Code Integrity service running as a virtual instance separate from the OS Kernel itself, the protected boot flow schema becomes

Windows Trusted Boot – Boot Flow Schema with Virtual Based Security

Role of ELAM in securing the Windows Boot

Secure Boot ensures that only a trusted OS bootloader is allowed to execute, whereas Trusted Boot ensures that the OS kernel can be trusted. Here trust means the components are signed by Microsoft. But when a system boots, it loads and executes numerous drivers, not all of which come from Microsoft. Thus the next opportunity for malware to start is by infecting a non-Microsoft boot driver.

As I mentioned in my previous article “Understanding UEFI Secure Boot – How it helps to secure the Windows 10 pre-boot phase”, traditional anti-malware apps start after Kernel Initialization, giving a rootkit disguised as a driver the opportunity to run.

This is where Microsoft introduced the Early Launch Anti Malware (ELAM) driver: a detection mechanism for Windows systems that allows third-party security software, such as antivirus software, to register a kernel-mode driver that is guaranteed to execute early in the boot process, before other boot-start drivers are initialized.

The ELAM driver registers callback routines, which the Plug and Play manager executive invokes for each boot driver so that the ELAM driver can classify it. Based on the returned classification and the defined policy, the PnP manager decides whether to initialize the boot driver.

Working of ELAM driver

The ELAM driver evaluates a driver's signature and classifies it as good, bad, or unknown based on the malware signature data stored under HKLM\ELAM\<VendorName>\, where the ELAM driver vendor stores the whitelist/blacklist AV signatures for the ELAM driver to use.

This registry hive is loaded to memory by the OS bootloader (winload.efi) from C:\Windows\System32\config\ELAM. However, you won't find it when you view the registry after the system has completed booting. This is for performance reasons: once the role of ELAM is over, the registry hive is unloaded.
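
The good/bad/unknown classification can be pictured as a lookup against the vendor's signature data. The sketch below is a simplification under stated assumptions: the signature data is modeled as two plain sets of identifiers, the identifiers themselves are invented, and `classify` is a hypothetical helper rather than a real ELAM interface.

```python
# Invented identifiers standing in for the vendor's signature data.
known_good = {"hash-of-trusted-driver"}
known_bad = {"hash-of-rootkit-driver"}


def classify(driver_id, known_good, known_bad):
    """Classify a boot driver as 'good', 'bad' or 'unknown'."""
    # A blacklist match wins over a whitelist match in this sketch.
    if driver_id in known_bad:
        return "bad"
    if driver_id in known_good:
        return "good"
    return "unknown"


print(classify("hash-of-trusted-driver", known_good, known_bad))  # good
print(classify("hash-of-rootkit-driver", known_good, known_bad))  # bad
print(classify("hash-never-seen", known_good, known_bad))         # unknown
```

Because the data set is fixed at boot time, anything not present in either set can only be reported as unknown.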

The Kernel decides whether to load a driver after ELAM classification according to the policy defined at registry path HKLM\SYSTEM\CurrentControlSet\Policies\EarlyLaunch\DriverLoadPolicy

The reg_dword DriverLoadPolicy can have the below values

8 = Good only
1 = Good and unknown
3 = Good, unknown and bad but critical
7 = All

This can also be configured using GPO from under Computer Configuration \ Administrative Templates\ System \ Early Launch Antimalware\ Boot-Start Driver Initialization Policy

By default there is no policy defined and the Windows Kernel will load drivers with the following ELAM classifications: Good, Unknown, and Bad but critical for boot. By defining a policy, you can add more security by allowing only Good drivers to load. However, this might sometimes lead to instability.
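
The DriverLoadPolicy values listed above map naturally to sets of allowed classifications. The following sketch encodes that table. The numeric values and the default behavior come from the text above; the classification labels, `POLICY_ALLOWED`, `DEFAULT_POLICY` and `should_initialize` are hypothetical names chosen for this illustration.

```python
# DriverLoadPolicy value -> classifications that are allowed to initialize.
POLICY_ALLOWED = {
    0x8: {"good"},
    0x1: {"good", "unknown"},
    0x3: {"good", "unknown", "bad_critical"},
    0x7: {"good", "unknown", "bad_critical", "bad"},
}

# With no policy defined, the behavior matches value 3 as described above.
DEFAULT_POLICY = 0x3


def should_initialize(classification, policy=None):
    """Return True if a driver with this classification may be initialized."""
    allowed = POLICY_ALLOWED[DEFAULT_POLICY if policy is None else policy]
    return classification in allowed


print(should_initialize("unknown"))       # True  (default policy)
print(should_initialize("unknown", 0x8))  # False (good only)
print(should_initialize("bad", 0x7))      # True  (all)
```

The "Good only" setting is the strictest: in this model it rejects every driver the ELAM vendor has not explicitly whitelisted, which is why it can lead to instability.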

As the Kernel initialization completes and the Windows subsystem (win32 environment) starts, the runtime AV service gets initialized. At this point, the ELAM driver hands off and is unloaded. The runtime AV service takes over AV checking and protection from there onwards.

The default Windows ELAM driver

Windows Trusted Boot – Windows Defender default Windows 10 ELAM driver

Yes, Windows Defender provides the default ELAM driver on Windows 10 and it is pretty good at its work considering that it comes free. Microsoft being named the leader in the Gartner Magic Quadrant in Endpoint protection, it is quite easy to state that the AV from Microsoft is quite robust and effective.

Here you see the START_TYPE as DEMAND_START because my system has a 3rd party AV service that registers its own ELAM driver.

3rd party AV service also allowed to register ELAM driver

Microsoft allows 3rd party AV services to register their ELAM driver if you choose to use them instead of the inbuilt AV solution that Windows 10 comes with.

Windows Trusted Boot – 3rd party AM service ELAM driver

You can see here that McAfee, my 3rd party anti-malware service, has its own ELAM driver registered with START_TYPE set to BOOT_START and LOAD_ORDER_GROUP set to Early-Launch.

ELAM recovery

By default, it is the responsibility of the 3rd party AV service to store a copy of the ELAM driver in the registry defined backup location, from where a copy of the ELAM driver can be obtained, in case the original file gets corrupted.

Windows Trusted Boot – ELAM Backup Path as defined in registry

This is usually the path C:\Windows\ELAMBKUP, defined in registry location HKLM\SYSTEM\CurrentControlSet\Control\EarlyLaunch via the registry value BackupPath

Windows Trusted Boot – ELAM Backup location

Does ELAM guarantee complete AV protection?

Sadly no. ELAM works when nothing else on the system has been initialized, and it does not get to inspect the driver package itself. It only gets the driver signature details through the callback it receives from the Kernel, which it checks against its fixed AV signature database stored in the registry.

This limits the ability of ELAM to detect polymorphic malware or malware that can only be caught by heuristic analysis.

But some protection/check mechanism is better than nothing, and in that sense ELAM plays its role well in securing the Windows boot flow.

Ending

If you have read this far, you should be able to sum up the entire boot flow as protected by Secure Boot and Windows Trusted Boot, as shown below

Windows Trusted Boot – Overall Block Schema

To be contd…

This article will be followed by my next article where I will be talking about the Windows Measured Boot and how it helps in securing the Windows Boot flow.

Till then, as I always say, read something new every day, learn something new every day…
