- TCP connection arrives at HTTP.sys kernel driver (port 80/443)
- TLS handshake completed by HTTP.sys (kernel mode, via Schannel)
- HTTP parsing — HTTP.sys reads request line + Host header
- Route lookup — HTTP.sys maps Host header to registered app pool
- Request queued — placed in the app pool's request queue
- w3wp.exe (started by WAS if it is not already running) dequeues and processes the request
- IIS native pipeline runs modules (auth, caching, filter, handler)
- Response returned via HTTP.sys back to the TCP connection
IIS Architecture: HTTP.sys, Worker Process, and Request Flow
The deep internals of how IIS processes every request — from the kernel-mode HTTP driver through WAS and the worker process to your application code. The knowledge that separates guessing from systematic diagnosis.
🧒 Simple Explanation (ELI5)
Imagine a busy restaurant. The front door greeter (HTTP.sys) takes every customer's order ticket the moment they arrive, without making them wait for a waiter. The shift manager (WAS) sends the right cook (worker process, w3wp.exe) to a specific kitchen (application pool) to prepare orders for that table. The cook reads the recipe book (application code) and prepares the meal. HTTP.sys delivers the finished plate back to the customer. If one cook burns down their kitchen, the other kitchens keep running because they are all separate rooms.
🔧 Why Do We Need It?
- Understand why crashes don't take down all sites: process isolation (each app pool = separate w3wp.exe) means a crashed application does not affect other sites — but only if app pools are correctly separated.
- Diagnose 503 without IIS restart: knowing that HTTP.sys keeps port 80 open while WAS recycles worker processes explains why 503 is transient and why IISRESET is often unnecessary.
- Debug startup performance: first-request latency ("cold start") on .NET Framework apps comes from CLR JIT compilation occurring in the worker process — understanding this explains warm-up techniques like Application Initialization.
- Understand Web Garden limits: multiple worker processes per app pool (Web Garden) increase throughput but create session state sharing problems — you can only use this correctly when you understand the architecture.
🌍 Real-world Analogy
HTTP.sys is like a telephone switchboard operator at the front desk of a large building — they answer every incoming call instantly (accept TCP connection), determine which department (IIS site) the call is for, and put the caller on hold if the department is unavailable. WAS is the department manager — when calls arrive, WAS wakes up the right employee (worker process) to handle them. Each department (app pool) runs in a locked, sound-proof office — what happens in one office doesn't affect others. The building keeps the switchboard running even if every employee quits — HTTP.sys outlives individual worker process crashes.
⚙️ Technical Explanation
HTTP.sys is a kernel-mode device driver (http.sys) that is the sole listener on TCP ports 80 and 443 for IIS deployments. It owns the socket at the network layer. When a TCP connection arrives, HTTP.sys completes the TCP handshake and SSL handshake (in kernel mode for performance) without involving any user-mode process. It reads the HTTP request line and the Host header, then looks up its internal routing table (populated by WAS from applicationHost.config) to determine which application pool and site should handle it. If the worker process for that pool is already running, HTTP.sys places the request in the pool's request queue. If not, it signals WAS to spin up a new worker process.
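The lookup described above can be pictured as a small table keyed by binding. The sketch below is only a mental model, not the HTTP.sys API: the `ROUTES` table, the pool names, and the `route` helper are hypothetical, and real bindings also involve IP addresses and URL prefixes.

```python
from collections import deque

# Hypothetical binding table: (port, host header) -> application pool name.
# In IIS this information comes from applicationHost.config; these entries are made up.
ROUTES = {
    (443, "shop.example.com"): "ShopPool",
    (80, "intranet.example.com"): "IntranetPool",
}

# One request queue per application pool.
queues = {pool: deque() for pool in set(ROUTES.values())}


def route(port, host, path):
    """Queue the request for the matching pool and return the pool name, or None."""
    pool = ROUTES.get((port, host.lower()))  # host names compare case-insensitively
    if pool is None:
        return None  # no binding matches, so no pool ever sees the request
    queues[pool].append(path)
    return pool


if __name__ == "__main__":
    print(route(443, "Shop.Example.com", "/products"))
    print(route(80, "unknown.example.com", "/"))
```

The point of the model is that the routing decision needs only the port and the Host header, which is why it can happen before any worker process is involved.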
WAS (Windows Process Activation Service) is the supervisor of worker processes. It reads applicationHost.config on startup and builds its routing table, registering all site/app bindings with HTTP.sys via the HTTP server API. WAS monitors worker processes for health: if a w3wp.exe crashes, WAS detects the exit code, logs the error, and determines whether to restart it (based on Rapid Fail Protection settings). WAS enforces resource limits set on App Pools (max CPU, max memory via Job Objects). WAS also manages process recycling: scheduled time-of-day recycles, request-count recycles, memory threshold recycles, and on-demand recycles.
w3wp.exe (World Wide Web Worker Process) is the user-mode process that runs application code. There is one w3wp.exe per application pool (unless Web Garden mode is configured). Inside w3wp.exe, the IIS native request pipeline processes the request through all registered native modules. If the request maps to a .NET handler, the managed engine module (ManagedEngine, implemented by webengine4.dll in integrated mode; classic mode uses the aspnet_isapi.dll ISAPI extension instead) loads the CLR and routes into the ASP.NET pipeline. For ASP.NET Core with in-process hosting, ANCM loads the CoreCLR inside the same w3wp.exe. For out-of-process Core, ANCM inside w3wp.exe forwards the request via HTTP to a separate Kestrel process and returns Kestrel's response to HTTP.sys.
📊 Visual Representation
⌨️ Commands / Syntax
# --- HTTP.sys inspection ---
netsh http show servicestate # URL registrations + queue info
netsh http show servicestate view=requestq # Per-app-pool request queue
netsh http show iplisten # IP addresses HTTP.sys listens on
typeperf "\HTTP Service Request Queues(*)\CurrentQueueSize" -sc 1 # HTTP.sys per-queue counter, one sample
# --- WAS and worker process ---
sc query was # WAS service state
%windir%\system32\inetsrv\appcmd list wp # Worker processes: PID + pool name
# Map each w3wp.exe PID to its app pool (the pool name follows -ap on the command line)
Get-CimInstance Win32_Process -Filter "Name='w3wp.exe'" | Select-Object ProcessId, CommandLine
# --- Per-site performance counters ---
# Current connections per site (this counter reports connections, not queued requests)
(Get-Counter '\Web Service(*)\Current Connections').CounterSamples |
Where-Object {$_.InstanceName -ne '_total'} |
Select-Object InstanceName, CookedValue
# --- Native pipeline tracing (FREB) ---
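# Enable Failed Request Tracing for the site, then add a rule for slow or failing requests
%windir%\system32\inetsrv\appcmd configure trace "Default Web Site" /enablesite
%windir%\system32\inetsrv\appcmd configure trace "Default Web Site" /enable /path:* /statuscodes:500 /timeTaken:00:00:30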
# --- Worker process memory and thread info ---
Get-Process w3wp | Select-Object Id, WorkingSet64, PrivateMemorySize64, @{N='ThreadCount';E={$_.Threads.Count}}
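When the command line of a worker process has been captured (for example from the Win32_Process query above), the pool name can be pulled out of the `-ap` argument. The helper below is a minimal sketch; the function name and the sample command line are illustrative, and the sample is shortened.

```python
import re


def pool_from_command_line(command_line):
    """Return the application pool name from a w3wp.exe command line, or None."""
    # The pool name is the quoted value that follows the -ap switch.
    match = re.search(r'-ap\s+"([^"]+)"', command_line)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Shortened, illustrative command line; real ones carry more switches.
    sample = r'c:\windows\system32\inetsrv\w3wp.exe -ap "ShopPool" -v "v4.0"'
    print(pool_from_command_line(sample))
```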
💼 Example (Real-world Use Case)
An operations team sees periodic 503 errors on a high-traffic API site. The errors last about 2 seconds. Using netsh http show servicestate view=requestq, they observe the app pool request queue reaching 1000 (the default maximum). HTTP.sys still accepts connections, but once the queue is full it rejects further requests with 503. The root cause is that w3wp.exe has exhausted its thread pool: all threads are blocked waiting for a slow downstream SQL Server query. Solution: adopt async/await so that threads are not blocked during I/O (application code fix), raise maxConcurrentRequestsPerCPU (set in aspnet.config for .NET Framework apps) as a temporary buffer, and tune the database connection pool. The engineers could only diagnose this because they understood that the 503 originated from HTTP.sys queue overflow, not from the application crashing.
🧪 Hands-on
- Run netsh http show servicestate and examine the output: identify the "Request queues" section. For each registered URL, note the number of requests currently queued. Confirm the Default Web Site or your configured site appears here.
- Run %windir%\system32\inetsrv\appcmd list wp. For each w3wp.exe entry, note the PID and Application Pool name. Open Task Manager → Details tab and confirm the same PIDs appear as w3wp.exe processes.
- Run sc query was and sc query w3svc. Stop W3SVC: net stop w3svc. Browse to http://localhost and note the error you get. Now run netsh http show servicestate again and compare it with the earlier output to see which URL registrations and request queues remain while W3SVC is stopped. Start W3SVC: net start w3svc.
- Enable Failed Request Tracing on the Default Web Site for status code 404: in IIS Manager → Default Web Site → Failed Request Tracing → Enable, add rule → All Content, status codes 404. Manually browse a non-existent URL to generate a 404, then check C:\inetpub\logs\FailedReqLogFiles\W3SVC1\ for the trace XML file.
- Review the FREB XML trace in a browser (it renders as an HTML table): expand the pipeline stages and identify which module returned the 404 status code (should be StaticFile or ManagedPipelineHandler) and the total time taken.
ASP.NET Framework and ASP.NET Core apps have a "cold start": the JIT compiler compiles IL to machine code the first time the app pool starts or recycles, causing the first request to be slow (2-30+ seconds for large apps). Without the Application Initialization feature (the Web-AppInit Windows feature), users who hit the recycled pool can see slow responses or a 503 while the app is warming up. Solution: install the Application Initialization module and configure initializationPage (in the applicationInitialization section of applicationHost.config or web.config). IIS pre-warms the application after a recycle by sending an internal warm-up request before the new worker process takes public traffic. This removes cold-start delays from zero-downtime deployments and scheduled recycles.
🐛 Debugging Scenario
Failure: A worker process crashes repeatedly every few minutes. IIS recycles it automatically but users experience 503 for 5-10 seconds each time. The app pool eventually gets disabled (Rapid Fail Protection fires after 5 failures in 5 minutes).
- Event Viewer → System log → WAS Event ID 5002 (pool disabled by Rapid Fail Protection), and Application log → Event ID 1000 (Application Error with faulting module). Note the faulting module and exception code.
- If faulting module is the application DLL: application code exception. Enable crash dumps via WER: create the key HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\w3wp.exe and set DumpType=2 (full dump). Let it crash once more, then collect the dump.
- If faulting module is clr.dll or mscorlib: CLR crash, likely StackOverflowException (not catchable) or ExecutionEngineException. Check for unbounded recursive calls in the code.
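- Immediate remediation: make Rapid Fail Protection less sensitive (IIS Manager → App Pool → Advanced Settings → Rapid Fail Protection) by raising Maximum Failures to 10. Lengthening the Failure Interval to 10 minutes without raising Maximum Failures has the opposite effect, because 5 crashes then only need to fall within 10 minutes instead of 5. This buys time while WAS keeps restarting the worker, but the pool is still disabled once the threshold is reached.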
- Long-term: fix the root cause in application code and add health-check monitoring to alert before the pool is disabled, not after.
🎯 Interview Questions
Beginner
HTTP.sys is the Windows kernel-mode driver that provides the HTTP/HTTPS protocol listener for IIS and other Windows applications. It runs in kernel mode for performance: handling TCP connections, HTTP parsing, and SSL negotiation at the kernel level avoids expensive user-mode to kernel-mode context switches that would occur on every request if a user-mode process like w3wp.exe owned the socket. HTTP.sys also maintains a kernel-mode HTTP response cache that can serve frequently accessed static responses without involving any user-mode process at all, reducing latency and CPU usage for cached content.
w3wp.exe is the IIS Worker Process — the user-mode executable that hosts the IIS request pipeline and runs web application code. There is one w3wp.exe per application pool (unless Web Garden mode creates multiple per pool). The name means "World Wide Web Worker Process." Each instance runs in its own isolated memory space under its own security identity (IIS AppPool\PoolName by default). W3SVC and WAS create and manage w3wp.exe processes — you should never start w3wp.exe manually. It appears in Task Manager as multiple instances when multiple app pools are running. When an application pool is recycled, the old w3wp.exe is replaced with a new one.
WAS (Windows Process Activation Service) creates and manages worker processes (w3wp.exe instances) for all application pools. It reads the IIS configuration from applicationHost.config, registers URL reservations with HTTP.sys, monitors worker process health, and enforces recycling conditions. W3SVC (World Wide Web Publishing Service) handles the HTTP-specific aspects: it manages the mapping between IIS sites and HTTP.sys listeners and coordinates the W3 service infrastructure. They are separate services so that WAS can also support non-HTTP activation protocols (net.tcp, net.pipe, and MSMQ for WCF). W3SVC depends on WAS and cannot run without it, so stopping WAS also stops W3SVC and with it all IIS request processing.
An Application Pool is a configuration container that maps one or more IIS applications to a worker process instance. Each pool runs as a separate w3wp.exe with its own memory, security identity, and .NET runtime state. Process isolation means: (1) a crash in one pool does not kill other pools; (2) each pool has its own .NET CLR instance, so different .NET versions can coexist on the same server; (3) each pool runs as a different identity, separating security contexts; (4) pools can be recycled independently without affecting other sites. Best practice: one application per application pool — never share a pool between multiple applications if they have different criticality, release cadence, or security requirements.
IIS uses "overlapping recycle" (also called "graceful recycle"): when a recycle is triggered, WAS starts a new w3wp.exe process alongside the old one. The old process continues handling its in-flight requests. HTTP.sys begins routing new incoming requests to the new process. When the old process completes or the shutdown timeout (default 90 seconds) expires, the old w3wp.exe terminates. This means most recycling operations are invisible to end users — active requests complete on the old process, new requests go to the new process. Edge cases: session state stored in-process (in-memory) is lost when the old process terminates. If in-flight requests take longer than the shutdown time limit, they are forcibly terminated. Using out-of-process session state (SQL Server, Redis) eliminates session loss during recycles.
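The effect of the shutdown time limit on in-flight requests can be worked through with a small model. This is a sketch under simplifying assumptions: the function name is hypothetical, each in-flight request is represented only by its remaining run time in seconds, and all new requests are assumed to go to the new process from the moment the recycle starts.

```python
def overlapping_recycle(in_flight_seconds, shutdown_time_limit=90):
    """Model what happens to the old worker's in-flight requests during a recycle.

    in_flight_seconds: remaining run time of each request on the old process.
    shutdown_time_limit: seconds the old process gets before it is terminated.
    Returns (completed, terminated, old_process_lifetime).
    """
    completed = [t for t in in_flight_seconds if t <= shutdown_time_limit]
    terminated = [t for t in in_flight_seconds if t > shutdown_time_limit]
    # The old process exits when its last request finishes,
    # or when the limit expires if something is still running.
    if terminated:
        old_process_lifetime = shutdown_time_limit
    else:
        old_process_lifetime = max(completed, default=0)
    return completed, terminated, old_process_lifetime


if __name__ == "__main__":
    print(overlapping_recycle([2, 30, 120]))
```

With remaining times of 2, 30, and 120 seconds and the 90-second default, the first two requests complete on the old process and the third is cut off when the limit expires.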
Intermediate
Rapid Fail Protection is an IIS safety mechanism that disables an application pool if its worker process crashes too many times within a configured time window. Default: 5 failures in 5 minutes triggers pool disable. After disabling, the pool returns HTTP 503 for all requests until it is started again manually. Purpose: prevent a buggy process from crashing in a tight loop, consuming CPU and flooding Event Viewer with error logs. Configuration: IIS Manager → App Pool → Advanced Settings → Rapid Fail Protection. In production: tune the thresholds based on the application (aggressive thresholds for critical APIs; more lenient for low-priority apps). Always alert on WAS Event ID 5002 (pool disabled) in the System log: a disabled pool is invisible to basic TCP port checks, because HTTP.sys still accepts connections and answers with 503.
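The threshold logic can be sketched as a counter over a time window. The class below is a toy model, not WAS's implementation: the class and method names are hypothetical, the defaults mirror the 5 failures in 5 minutes stated above, and a sliding window is assumed for simplicity.

```python
from collections import deque


class RapidFailProtection:
    """Toy failure counter; disables the pool after too many crashes in a window."""

    def __init__(self, max_failures=5, interval_seconds=300):
        self.max_failures = max_failures
        self.interval_seconds = interval_seconds
        self.failures = deque()  # timestamps (seconds) of recent crashes
        self.pool_disabled = False

    def record_crash(self, now):
        """Register a worker crash at time `now`; return True once the pool is disabled."""
        self.failures.append(now)
        # Forget crashes that have fallen out of the window.
        while self.failures and now - self.failures[0] > self.interval_seconds:
            self.failures.popleft()
        if len(self.failures) >= self.max_failures:
            self.pool_disabled = True
        return self.pool_disabled


if __name__ == "__main__":
    rfp = RapidFailProtection()
    for t in (0, 60, 120, 180, 240):
        print(t, rfp.record_crash(t))
```

In this model a crash every 60 seconds disables the pool on the fifth crash, while a crash every 100 seconds never does, because at most four crashes fall inside any 300-second window. Raising `max_failures` makes the pool harder to disable; lengthening `interval_seconds` makes it easier.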
A Web Garden creates multiple worker processes for a single application pool (configured via Maximum Worker Processes > 1). Benefits: better CPU utilisation on multi-core servers when the application is not designed for multi-threading; the load is distributed across multiple processes. Drawbacks: each process has its own in-process state, so static variables, in-memory caches, and in-process session state are NOT shared across processes. This causes cache inconsistency (each process warms up its cache independently) and broken sessions: consecutive requests from the same session can land on different processes, and sticky sessions on ARR or a load balancer only pin a client to a server, not to a process within the pool. Recommendation: use out-of-process cache (Redis, Memcached) and external session state (SQL Server) before enabling Web Garden. In modern architectures, horizontal scaling across servers is preferred over Web Garden within one server.
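The state-sharing problem is easy to reproduce in a model. In the sketch below each `WorkerProcess` object stands in for one w3wp.exe with a private session store; the names are hypothetical, and strict round-robin dispatch is an assumption made to keep the outcome predictable, since the real distribution across worker processes is not guaranteed to alternate.

```python
import itertools


class WorkerProcess:
    """Stand-in for one w3wp.exe with its own in-memory session store."""

    def __init__(self, pid):
        self.pid = pid
        self.sessions = {}  # in-process session state, invisible to other workers

    def handle(self, session_id, key, value=None):
        """Write `value` when given, then return what this worker holds for `key`."""
        store = self.sessions.setdefault(session_id, {})
        if value is not None:
            store[key] = value
        return store.get(key)


def make_web_garden(worker_count):
    """Return a dispatcher that hands requests to the workers in turn."""
    workers = [WorkerProcess(pid) for pid in range(1, worker_count + 1)]
    next_worker = itertools.cycle(workers)

    def dispatch(session_id, key, value=None):
        return next(next_worker).handle(session_id, key, value)

    return dispatch


if __name__ == "__main__":
    garden = make_web_garden(2)
    print(garden("abc", "cart", "book"))  # handled by the first worker
    print(garden("abc", "cart"))          # handled by the second worker
```

With two workers, the write lands on the first worker and the following read on the second, which has never seen the session. With `make_web_garden(1)` the same two calls hit the same store and the read succeeds.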
The application pool request queue is maintained by HTTP.sys in kernel memory. It holds incoming HTTP requests that have been accepted at the network layer but not yet dequeued and processed by a w3wp.exe thread. Default queue length is 1000 requests (configurable via applicationHost.config appPool queueLength or the AppCmd command). When the queue is full, HTTP.sys returns 503 Service Unavailable immediately: the client connection is accepted (no TCP timeout) but gets a 503 response. Queue overflow indicates the application pool cannot process requests as fast as they arrive. Causes: insufficient threads, slow application code, slow downstream dependencies (DB, external APIs), or insufficient worker processes (Web Garden). Monitor with the performance counters HTTP Service Request Queues(<pool name>)\CurrentQueueSize and RejectedRequests.
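The overflow behaviour amounts to a bounded queue that rejects instead of blocking. The class below is a toy model with hypothetical names, not the HTTP.sys API; `queue_length` defaults to the 1000 mentioned above.

```python
from collections import deque


class PoolQueue:
    """Toy model of one application pool's request queue."""

    def __init__(self, queue_length=1000):
        self.queue_length = queue_length
        self.pending = deque()
        self.rejected = 0  # requests that would have received a 503

    def try_enqueue(self, request_id):
        """Return True if the request is queued, False if it is rejected with 503."""
        if len(self.pending) >= self.queue_length:
            self.rejected += 1
            return False
        self.pending.append(request_id)
        return True

    def dequeue(self):
        """A free worker thread takes the oldest request, or None if the queue is empty."""
        return self.pending.popleft() if self.pending else None


if __name__ == "__main__":
    q = PoolQueue(queue_length=3)
    print([q.try_enqueue(i) for i in range(5)])
    print(q.dequeue(), q.try_enqueue(5), q.rejected)
```

With a queue length of 3, the fourth and fifth requests are rejected; as soon as a worker dequeues one request, the next arrival is accepted again. This is why the 503s stop by themselves once the worker threads catch up.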
ANCM (ASP.NET Core Module) is a native IIS module installed by the .NET Core Hosting Bundle. It bridges IIS and the .NET Core runtime in two modes: Out-of-process: ANCM acts as a reverse proxy, forwarding HTTP requests from IIS to a separate Kestrel process. IIS handles SSL termination, logging, and auth before forwarding. The Kestrel process runs the Core app on a random loopback port. If Kestrel crashes, ANCM waits and retries, returning 502 in the interim. In-process: ANCM loads the CoreCLR directly into w3wp.exe memory (introduced in .NET Core 2.2). The Core app runs inside the IIS worker process. Better performance (no HTTP hop between IIS and Kestrel) but the app shares the worker process lifetime — a Core app crash takes down the worker process. Recommended for production unless architectural reasons require out-of-process isolation.
Recycle: WAS creates a new worker process alongside the current one, the new process starts, HTTP.sys routes new requests to it, and the old process shuts down after completing in-flight requests or hitting the shutdown timeout. Application state is re-initialised in the new process. Zero or minimal downtime via overlapping recycle. Restart: Instantly kills the current worker process and starts a fresh one. Causes brief downtime (the 503 window between death and start). Used when a clean kill is required — e.g., forceful recycle after an app is completely stuck. In IIS Manager: right-click pool → Recycle (graceful) vs. Stop then Start (hard restart). In appcmd: appcmd recycle apppool /apppool.name:X is a recycle; appcmd stop apppool /apppool.name:X then appcmd start apppool /apppool.name:X is a hard restart.
Scenario-based
1. Check whether it's all pools or one: appcmd list apppool /state:Stopped lists any pools in the Stopped state.
2. Check Event Viewer → System log for WAS Event ID 5002 (Rapid Fail Protection disabled a pool). If present, identify which pool and which app crashed repeatedly.
3. Run sc query was and sc query w3svc. If either service is stopped, start it: net start was, then net start w3svc.
4. Check netsh http show servicestate for request queue registrations. If there are none, HTTP.sys lost its registrations (a service restart will re-register them).
5. If a pool is disabled: restart it manually with appcmd start apppool /apppool.name:X, confirm the site responds, and check the logs to prevent recurrence.
6. Document every step for the post-incident review.
Configure the app pool's recycling settings in IIS Manager → App Pool → Recycling: set a Regular Time Interval recycle (minutes since last recycle) or a Specific Time recycle (e.g., 3:00 AM daily when traffic is lowest in the application's time zone). Keep Overlapping Recycle enabled (the default): the new process starts before the old one stops, giving zero downtime for most requests. Set Maximum Private Memory (KB) as a safety net, e.g., 2,000,000 KB (2 GB): if memory exceeds this, WAS triggers an immediate recycle. Keep the shutdown time limit at 90 seconds (the default) or longer to give in-flight requests time to complete. Enable recycle event logging in IIS Manager → App Pool → Advanced Settings → Generate Recycle Event Log Entry for all reasons: WAS then writes an entry to the System log for every recycle with its reason, documenting each one for the post-mortem. Long-term action: profile the memory leak with a dump analysis tool and fix the root cause.
Risks of habitual iisreset: (1) drops all active connections to all sites simultaneously — a shopping cart or form submission in progress is lost. (2) On a server with 10+ sites, it resets sites that are working fine along with the problematic one. (3) Masks the root cause rather than fixing it — if an app leaks memory or deadlocks, iisreset makes it temporarily better but the issue returns. (4) On a production server serving thousands of users, a 30-second iisreset window at peak time causes measurable revenue loss. Correct approach: (a) identify which specific app pool is degraded via appcmd list wp and Task Manager; (b) recycle ONLY that pool (appcmd recycle apppool /apppool.name:X); (c) investigate the root cause; (d) apply a permanent fix. Never use iisreset during business hours on a production server hosting more than one site unless every site needs to be restarted simultaneously (e.g., after applicationHost.config schema changes).
1. DNS resolves shop.example.com to IP 10.0.1.100.
2. Browser initiates TCP 3-way handshake to 10.0.1.100:443. HTTP.sys kernel driver accepts the connection.
3. TLS handshake: HTTP.sys presents the SSL certificate configured in the IIS binding for shop.example.com port 443, negotiates cipher suite and TLS version.
4. Browser sends the request over HTTP/1.1 or HTTP/2: GET /products with Host: shop.example.com (carried as the :authority pseudo-header in HTTP/2).
5. HTTP.sys looks up the URL reservation and routing table → maps to "ShopPool" application pool and places the request in that pool's queue.
6. If "ShopPool" has no w3wp.exe running: WAS creates a new w3wp.exe process. CLR loads and JIT-compiles ASP.NET application assemblies (cold start, which can take 5-30 seconds for large apps).
7. w3wp.exe dequeues the request from the pool's HTTP.sys request queue.
8. IIS pipeline runs through modules: Request Filtering → Anonymous Auth → URL Rewrite → ManagedEngine → ASP.NET MVC route → ProductsController.Index().
9. Controller queries DB, builds response.
10. w3wp.exe calls HTTP.sys response API with 200 OK and body.
11. HTTP.sys sends response over the TLS session to the browser.
netsh http show servicestate shows the current runtime state of the HTTP.sys kernel-mode server. Output sections: Server Session (HTTP server sessions created by applications): each IIS site creates a server session. URL Groups: each site binding (IP:port:hostname) is a URL group registered in HTTP.sys with the application pool queue reference. Request queues: currently pending requests in each pool's queue plus queue statistics (completed requests, rejected requests). SSL certificate bindings: SSL certs associated with each IP:port or SNI hostname. Use cases: (1) Confirm that URL registrations survived a service restart (sites that aren't showing in servicestate despite being configured in IIS Manager are a major clue). (2) Identify queue build-up in real time during a performance incident. (3) Verify SSL certificate is bound to the correct IP/hostname without browsing to the site. (4) Identify orphaned URL reservations from uninstalled applications that may conflict with IIS.
🌐 Real-world Usage
Architecture knowledge lets you move from "I hope iisreset fixes it" to "I know exactly which component failed and why." Production IIS support teams use this architecture knowledge daily — mapping PID to pool to application, understanding why a 503 is transient vs. permanent, and explaining the root cause clearly in incident reports. In interviews, describing the HTTP.sys → WAS → w3wp.exe flow is the benchmark answer that separates a junior candidate from an IIS specialist.
📝 Summary
HTTP.sys in kernel mode accepts all connections and performs TLS. WAS supervises worker processes, reads config, and enforces health policies. w3wp.exe runs application code in an isolated user-mode process per app pool. Each boundary in this architecture has failure modes — knowing them is the difference between a 5-minute resolution and a 5-hour bridge call.