How the Great Firewall of China Actually Works — A Technical Deep Dive

Building Vexonik required a deep understanding of what we were working against. The Great Firewall is not a simple IP blocklist; it is a sophisticated, layered system that evolves continuously. This article is a technical breakdown of how it works.

Overview: The GFW Is Not One System

The Great Firewall (GFW) is the colloquial name for the censorship infrastructure deployed at the backbone level of China's internet. Its operation is generally attributed to the Cyberspace Administration of China (CAC), while the Ministry of Industry and Information Technology (MIIT) regulates the carriers it runs on. It is deployed across the major ISPs simultaneously (China Telecom, China Unicom, China Mobile), which means there is no single chokepoint to bypass.

The GFW operates through several distinct mechanisms, each targeting different aspects of internet traffic. Understanding each one is necessary to understand why naive circumvention fails.

Layer 1: DNS Poisoning

DNS poisoning is the cheapest and most widely deployed mechanism. When a user in China queries a DNS resolver for a blocked domain (e.g., google.com), the GFW injects a forged DNS response before the legitimate one arrives.

The forged response typically contains:

127.0.0.1 (localhost)
0.0.0.0
A random non-existent IP
An IP owned by a Chinese entity (for traffic analysis purposes)

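The injection is possible because classic DNS queries are unencrypted, unauthenticated UDP packets. The GFW monitors traffic at the backbone level and wins a race against the legitimate resolver: its forged reply arrives first, and most clients accept the first answer that matches the query and discard the real one when it arrives later.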

User → DNS query for google.com → [passes through backbone]
                                        ↓
                               GFW detects "google.com"
                                        ↓
                               GFW injects poisoned response (fast)
                                        ↓
User receives 127.0.0.1 ← [before real response arrives]
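
To make the "unencrypted UDP" point concrete, here is a minimal Python sketch that builds a standard DNS query and then reads the queried name back out of the raw bytes, the way any on-path device could. The function names and the `BLOCKED` set are hypothetical and exist only for this illustration; nothing here reflects the GFW's real implementation.

```python
import struct


def build_dns_query(domain: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query (type A, class IN) as sent over UDP port 53."""
    # Header: ID, flags (recursion desired), 1 question, 0 answer/authority/additional
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in domain.split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def extract_qname(packet: bytes) -> str:
    """Read the queried name out of a raw DNS query. No key or secret is needed."""
    labels = []
    offset = 12  # the DNS header is always 12 bytes
    while packet[offset] != 0:
        length = packet[offset]
        labels.append(packet[offset + 1 : offset + 1 + length].decode("ascii"))
        offset += 1 + length
    return ".".join(labels)


BLOCKED = {"google.com"}  # hypothetical blocklist, for illustration only

query = build_dns_query("google.com")
name = extract_qname(query)
print(name, "-> inject forged answer" if name in BLOCKED else "-> pass through")
```

The domain sits in the packet as plain length-prefixed ASCII labels, so matching it against a blocklist is a cheap byte-level operation.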

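Partial bypass: Using encrypted DNS (DoH, DoT) prevents on-path injection, because the GFW can neither read the query nor forge a valid reply. But well-known encrypted DNS servers may themselves be blocked by IP, and the domain is still exposed afterwards: when you connect to the site, the hostname appears in plaintext in the SNI field of the TLS ClientHello.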

Layer 2: IP Blocking

The GFW maintains blocklists of IP addresses and CIDR ranges. Address space belonging to major cloud providers (AWS, GCP, Azure, DigitalOcean, Vultr) is well represented on these lists, in both IPv4 and IPv6, with specific datacenter ranges heavily blocked.

IP blocking is:

Fast: packets are dropped in the routing layer, with near-zero added latency
Blunt: blocks legitimate services that share IPs (CDN edge nodes, Cloudflare)
Dynamic: a new VPN node's IP is sometimes added to the blocklists within hours of going live
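
As a rough illustration of CIDR-based blocking, the sketch below checks addresses against a small blocklist using Python's standard `ipaddress` module. The ranges are reserved documentation prefixes and `BLOCKED_NETWORKS` and `is_blocked` are hypothetical names; a real deployment would do this in router hardware rather than in Python.

```python
import ipaddress

# Hypothetical blocklist built from reserved documentation ranges
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.128/25"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_blocked(address: str) -> bool:
    """Return True if the address falls inside any blocked CIDR range."""
    ip = ipaddress.ip_address(address)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so one mixed list can serve both address families.
    return any(ip in network for network in BLOCKED_NETWORKS)


for candidate in ("192.0.2.77", "198.51.100.5", "198.51.100.200", "2001:db8::1"):
    print(candidate, "blocked" if is_blocked(candidate) else "allowed")
```

The bluntness is visible here: one `/24` entry takes out every host in the range, whether or not it has anything to do with a VPN.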

This is why VPN providers burn through IP addresses rapidly when targeting China. A fresh datacenter IP might work for days or weeks; a known VPN IP gets blocked within hours of detection.

Important: IP blocking alone doesn't explain the GFW's effectiveness. Protocols can be tunneled through many different IPs. The deeper threat is traffic analysis.

Layer 3: Deep Packet Inspection (DPI)

DPI is the core of the GFW's power. The GFW inspects the content and metadata of packets, not just their source and destination.

Protocol Fingerprinting

Every network protocol has characteristic byte patterns. OpenVPN sends a recognizable handshake. WireGuard's handshake initiation is a fixed 148-byte message that begins with a type byte of 1 followed by three zero bytes. Standard TLS has a recognizable ClientHello format. The GFW maintains signatures for known VPN protocols.

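Example: OpenVPN's first packet begins with a single byte that encodes the packet opcode and key ID, followed by an 8-byte session ID. When tls-auth is enabled, an HMAC follows at a fixed offset. In TCP mode the same packet is preceded by a 2-byte length prefix. A stateful inspection engine can identify this within the first few packets of a connection.

OpenVPN client hard reset, UDP mode (simplified):
[0x38]                                        ← Opcode and key ID (P_CONTROL_HARD_RESET_CLIENT_V2, key 0)
[8 random bytes]                              ← Client session ID
[HMAC, packet ID, timestamp]                  ← Present only with tls-auth
[0x00]                                        ← Acknowledgment count
[0x00][0x00][0x00][0x00]                      ← Message packet ID (0 for the first packet)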

DPI signatures exist for:

OpenVPN (both UDP and TCP modes)
WireGuard
IPSec/L2TP
PPTP
Shadowsocks (original implementation)
Standard SSH port-forwarding patterns
Tor (default configuration)
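
The fingerprinting idea can be sketched in a few lines of Python. The checks below rely on publicly documented wire formats (WireGuard's 148-byte initiation with a `01 00 00 00` header, OpenVPN's opcode in the top five bits of the first byte, the TLS record header), but the function name and the exact rules are assumptions made for illustration. The GFW's real signatures are not public and are certainly more elaborate and stateful.

```python
from typing import Optional


def classify_first_packet(payload: bytes) -> Optional[str]:
    """Guess the protocol from the first client payload of a flow."""
    # WireGuard handshake initiation: type 1, three reserved zero bytes, 148 bytes total
    if len(payload) == 148 and payload[:4] == b"\x01\x00\x00\x00":
        return "wireguard"
    # OpenVPN over UDP: opcode 7 (P_CONTROL_HARD_RESET_CLIENT_V2) in the top 5 bits
    if len(payload) >= 14 and payload[0] >> 3 == 7:
        return "openvpn-udp"
    # OpenVPN over TCP: the same packet behind a 2-byte big-endian length prefix
    if (
        len(payload) >= 16
        and payload[2] >> 3 == 7
        and int.from_bytes(payload[:2], "big") == len(payload) - 2
    ):
        return "openvpn-tcp"
    # TLS: handshake record (0x16), major version 3, handshake type ClientHello (0x01)
    if len(payload) >= 6 and payload[0] == 0x16 and payload[1] == 0x03 and payload[5] == 0x01:
        return "tls-client-hello"
    return None  # unknown; a real system would fall back to statistical analysis


# Synthetic packets with correct headers and zero-filled bodies
wireguard_init = b"\x01\x00\x00\x00" + bytes(144)
openvpn_reset = bytes([0x38]) + bytes(8) + b"\x00" + bytes(4)
print(classify_first_packet(wireguard_init))
print(classify_first_packet(openvpn_reset))
```

Note how little state is needed: each rule reads at most six bytes and a length, which is why this kind of matching scales to backbone traffic volumes.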

Traffic Analysis Without Protocol Identification

Even when a protocol is properly obfuscated, traffic patterns leak information:

Packet size distribution: VPN traffic tends to have different packet size distributions than typical HTTPS browsing
Inter-arrival timing: Constant-rate keepalive packets are a fingerprint
Entropy analysis: Encrypted traffic has near-maximum entropy; the GFW uses entropy tests to identify potential tunnels
Connection duration and bandwidth: Long-lived, high-bandwidth connections to foreign datacenters are flagged for deeper inspection
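
Two of these signals are easy to compute, which is part of why they are attractive to a censor. The Python sketch below measures byte entropy and the regularity of packet arrival times. The function names are hypothetical, and any thresholds a real classifier applies to these numbers are unknown; this only shows that the raw measurements are cheap.

```python
import math
import os
import statistics
from collections import Counter
from typing import Sequence


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(data).values())


def timing_jitter(arrival_times: Sequence[float]) -> float:
    """Coefficient of variation of inter-arrival gaps; near 0 means a constant rate."""
    gaps = [later - earlier for earlier, later in zip(arrival_times, arrival_times[1:])]
    return statistics.pstdev(gaps) / statistics.mean(gaps)


plaintext = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
ciphertext_like = os.urandom(4096)  # stand-in for an encrypted payload

print(f"plaintext entropy:  {shannon_entropy(plaintext):.2f} bits/byte")
print(f"random-data entropy: {shannon_entropy(ciphertext_like):.2f} bits/byte")
print(f"keepalive jitter:   {timing_jitter([0.0, 10.0, 20.0, 30.0]):.2f}")
```

High entropy alone does not prove a tunnel, since ordinary HTTPS payloads are also encrypted. It becomes a useful signal when the first packet of a flow is high-entropy with no recognizable protocol header in front of it.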

Layer 4: Active Probing

This is the most sophisticated and least understood component. When the GFW detects a suspicious connection — based on IP, traffic patterns, or partial protocol match — it doesn't just block it. It actively probes the server to confirm it's a proxy.

The probe simulates different types of clients:

Sending a random byte sequence and observing the response
Sending a malformed protocol handshake and observing error behavior
Sending a valid HTTPS ClientHello and checking if the server responds like a normal HTTPS server

For most VPN protocols, a server that receives an unexpected probe responds in a way that confirms it's a proxy — wrong handshake, unexpected error codes, or silence where a legitimate HTTPS server would send a certificate.
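
The decision a prober has to make can be sketched as a pure function over the replies it collects. Everything below is hypothetical: the function names, the two-probe setup, and the verdict strings are assumptions chosen to mirror the behaviors described above, not a reconstruction of the GFW's actual logic.

```python
from typing import Optional


def looks_like_tls_reply(response: Optional[bytes]) -> bool:
    """True if the reply starts with a TLS handshake (0x16) or alert (0x15) record header."""
    return (
        response is not None
        and len(response) >= 3
        and response[0] in (0x15, 0x16)
        and response[1] == 0x03
    )


def probe_verdict(
    reply_to_client_hello: Optional[bytes],
    reply_to_garbage: Optional[bytes],
) -> str:
    """Combine two probe results. None means the server stayed silent or timed out."""
    if looks_like_tls_reply(reply_to_client_hello):
        # A real HTTPS server answers a valid ClientHello with a ServerHello or an alert
        return "plausible web server"
    if reply_to_client_hello is None and reply_to_garbage is None:
        return "suspicious: silent on a port that accepted the connection"
    return "suspicious: non-TLS reply"


print(probe_verdict(b"\x16\x03\x03\x00\x5a\x02", None))
print(probe_verdict(None, None))
```

The asymmetry is the important part: a server only has to fail one probe to be flagged, while passing requires behaving like a genuine web server for every input.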

Illustrative timeline of detection (actual delays vary):
T+0s: User connects to VPN node
T+0.1s: GFW DPI flags suspicious traffic
T+0.5s: GFW probe system initiates active probing from multiple vantage points
T+1-5s: Probe confirms proxy behavior
T+5-60s: IP added to blocklist

This is why "fresh IPs" get blocked so quickly. The probing system is highly automated.

Why Standard VPNs Fail

Putting it together:

OpenVPN TCP: DPI fingerprints the handshake within the first 3 packets → blocked
WireGuard: Fixed-size handshake messages (148-byte initiation, 92-byte response) with a constant 4-byte header → fingerprinted → blocked
Shadowsocks (original): No plaintext header, but the fully random-looking first packet and the server's reaction to replayed or malformed probes are identifiable → blocked
SSH tunneling: SSH handshake is identifiable; long-lived high-bandwidth SSH → flagged
V2Ray/VMess with TLS: Better, but active probing can distinguish real TLS servers from proxies that impersonate TLS

The Temporal Dimension: The GFW Tightens During Sensitive Periods

The GFW is not static. During politically sensitive periods (National Day, Party Congress, significant anniversaries), filtering is dramatically tightened:

More aggressive traffic analysis
Lower thresholds for probing triggers
Faster blocklist propagation
Default-deny blocking of recognized foreign VPN protocols

Infrastructure that works reliably for 10 months may become unusable for 2 months during these periods. This is a key operational challenge for any VPN service targeting China.

The GFW's Limitations

Understanding the limits of the GFW is as important as understanding its capabilities:

Cannot break TLS encryption — it can block TLS connections, but not read their content
Performance constraints — inspecting every packet at China's scale has computational limits; the GFW makes probabilistic decisions, not deterministic ones
Cannot block what it doesn't know — truly novel protocols have a grace period before signatures are developed
International business pressure — completely blocking all foreign traffic would be economically catastrophic; a balance is maintained

In the next article, I'll cover the specific protocols (VLESS + Reality, TUIC, Hysteria2) that currently evade GFW detection and the architectural reasons why they work.