C++17 · JavaScript · Chart.js · Multi-Threaded

NetScope

Deep Packet Inspection Engine & Real-Time Traffic Dashboard

A high-performance network traffic analysis system that inspects packets at the protocol level — extracting TLS SNI, HTTP hosts, and DNS queries to classify and control application traffic. Built with a multi-threaded C++17 engine and a zero-dependency browser dashboard.

System Architecture

A multi-threaded packet processing pipeline designed for high-throughput deep packet inspection

PCAP Reader (1 thread) → Load Balancers (N threads) → Fast-Path Workers (M threads) → Output Writer (1 thread)
01

PCAP Reader

A single reader thread parses the PCAP global header and sequentially reads packet records from disk. Each packet's Ethernet, IP, and transport headers are decoded into a structured PacketJob and pushed into a thread-safe queue for distribution.

Single-threaded · Sequential I/O · Zero-copy reads
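As a minimal sketch of the reader's decode step, the snippet below parses Ethernet, IPv4, and transport headers out of a raw frame. The PacketJob fields and the decodeFrame helper are illustrative names, not the project's actual definitions:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical sketch of the PacketJob the reader might produce; the
// field names are illustrative, not the project's actual definitions.
struct PacketJob {
    uint32_t srcIP = 0, dstIP = 0;
    uint16_t srcPort = 0, dstPort = 0;
    uint8_t  protocol = 0;           // 6 = TCP, 17 = UDP
    std::vector<uint8_t> payload;    // bytes past the transport header
};

// Decode Ethernet + IPv4 + TCP/UDP headers from a raw frame. Returns
// false if the frame is too short or not IPv4 carrying TCP/UDP.
bool decodeFrame(const uint8_t* frame, size_t len, PacketJob& job) {
    if (len < 14) return false;                      // Ethernet header
    uint16_t ethType = (frame[12] << 8) | frame[13];
    if (ethType != 0x0800) return false;             // IPv4 only
    const uint8_t* ip = frame + 14;
    if (len < 14 + 20) return false;
    size_t ihl = (ip[0] & 0x0F) * 4;                 // IP header length
    if (len < 14 + ihl + 4) return false;            // need the ports too
    job.protocol = ip[9];
    std::memcpy(&job.srcIP, ip + 12, 4);
    std::memcpy(&job.dstIP, ip + 16, 4);
    const uint8_t* l4 = ip + ihl;
    job.srcPort = (l4[0] << 8) | l4[1];
    job.dstPort = (l4[2] << 8) | l4[3];
    return job.protocol == 6 || job.protocol == 17;
}
```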
02

Load Balancers

Load balancer threads pull packets from the reader queue and route them to the correct fast-path worker using consistent hashing on the five-tuple (srcIP, dstIP, srcPort, dstPort, protocol). This ensures all packets from the same connection always land on the same worker.

Consistent hashing · Flow affinity · Lock-free distribution
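As a sketch, the routing step can be as simple as a stable hash of the five-tuple modulo the worker count. The FiveTuple and workerFor names are hypothetical, and the real engine's consistent-hashing scheme may be more elaborate; the key property shown here is that the mapping is deterministic per flow:

```cpp
#include <cstdint>
#include <functional>

// Illustrative five-tuple (names are not the project's actual code).
struct FiveTuple {
    uint32_t srcIP, dstIP;
    uint16_t srcPort, dstPort;
    uint8_t  protocol;
};

// Map a flow deterministically to one of numWorkers fast-path workers,
// so every packet of the same connection lands on the same worker.
size_t workerFor(const FiveTuple& t, size_t numWorkers) {
    // Fold the fields into one 64-bit value, then hash it. Any
    // deterministic mix works; stability per flow is what matters.
    uint64_t key = (uint64_t(t.srcIP) << 32) ^ t.dstIP;
    key ^= (uint64_t(t.srcPort) << 24) ^ (uint64_t(t.dstPort) << 8) ^ t.protocol;
    return std::hash<uint64_t>{}(key) % numWorkers;
}
```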
03

Fast-Path Workers

Each worker manages its own connection table and independently performs deep packet inspection — extracting TLS SNI, HTTP Host headers, and DNS queries. It classifies traffic against 20+ app signatures and applies blocking rules. Since flows are pinned to a single worker, no locks are needed on connection state.

Per-thread state · DPI extraction · Rule enforcement
04

Output Writer

The output writer collects processed packets from all workers and writes forwarded packets to an output PCAP file. Blocked packets are dropped. It also aggregates global statistics — app distribution, domain rankings, and connection summaries.

Aggregation · PCAP output · Global statistics

Key Design Decisions

Why Consistent Hashing?

A TCP connection spans many packets (SYN, data, FIN). The engine must see all packets from a flow to track its state — from the initial handshake through TLS negotiation to app classification. Consistent hashing on the five-tuple guarantees this without a global lookup table.

Why Flow Affinity?

Pinning flows to a single worker thread means each worker's connection table is thread-local — no mutexes, no contention, no cache invalidation. This enables wire-speed processing: each core runs independently on its own slice of traffic.

Engine vs Dashboard Capabilities

The C++ engine is a real packet inspection system that processes live traffic at wire speed. The browser dashboard is a visualization and analysis tool that replays captured traffic for interactive exploration.

Capability · C++ Engine · Browser Dashboard
Live Network Capture · ✓ Supported: hooks into network interfaces to process live traffic · ✗ Not supported: parses uploaded PCAP files locally using ArrayBuffers
Packet Blocking · ✓ Real blocking: physically drops packets, acting as a firewall · ~ Simulated: visual only; marks packets as blocked while traffic still flows
Multi-Threading · ✓ Full pipeline: Reader → Load Balancers → Workers → Writer · ✗ Single-threaded: JavaScript main thread only
Performance · ✓ Millions pkt/s: thread-safe queues, zero-copy, per-core scaling · ~ 10K–50K pkt/s: browser ArrayBuffer parsing
Deployment · ~ Compiled binary: requires a C++17 compiler, Linux/Windows · ✓ Zero install: opens in any browser, GitHub Pages ready

Both implementations share the same core DPI algorithms — SNI extraction, HTTP Host header extraction, flow tracking, app classification, and rule matching — ported from C++ to JavaScript for browser accessibility. The repository actually contains two C++ codebases: main_working.cpp, a single-threaded version for learning, and dpi_mt.cpp, the multi-threaded system built for speed.

Supported Application Signatures

The engine currently identifies and classifies more than 20 prominent applications directly from TLS and HTTP handshakes, including Google, YouTube, Facebook, WhatsApp, Telegram, TikTok, Netflix, Cloudflare, and more. If no signature matches, the traffic is categorized as Unknown but still mapped to its underlying flow.

TCP vs UDP Breakdown

Under the hood, the engine explicitly differentiates between connection-oriented (TCP) and connectionless (UDP) traffic. The state machine tracks TCP flags (SYN, ACK, FIN, RST) to classify flow lifecycle states (NEW → ESTABLISHED → CLOSED), while providing best-effort, timeout-based flow tracking for UDP datagrams.
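A minimal sketch of such a flag-driven lifecycle is shown below. The state names come from the text above; the advance function and flag constants are illustrative, not the engine's actual code:

```cpp
#include <cstdint>

// Flow lifecycle states, as described in the text.
enum class FlowState { NEW, ESTABLISHED, CLOSED };

// Standard TCP flag bits.
constexpr uint8_t TH_FIN = 0x01, TH_SYN = 0x02, TH_RST = 0x04, TH_ACK = 0x10;

// Advance a flow's state based on one packet's TCP flags (sketch):
// RST or FIN closes the flow; a bare ACK on a NEW flow completes the
// three-way handshake (the SYN-ACK still carries SYN, so it is excluded).
FlowState advance(FlowState s, uint8_t tcpFlags) {
    if (tcpFlags & TH_RST) return FlowState::CLOSED;  // abortive close
    if (tcpFlags & TH_FIN) return FlowState::CLOSED;  // graceful close
    if (s == FlowState::NEW && (tcpFlags & TH_ACK) && !(tcpFlags & TH_SYN))
        return FlowState::ESTABLISHED;                // handshake done
    return s;
}
```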

How TLS SNI Extraction Works

Even HTTPS leaks the destination domain: the TLS ClientHello is sent before encryption starts, exposing the Server Name Indication (SNI) in plaintext. The engine reads this field to classify traffic — no decryption needed.

TLS Record (0x16 = Handshake)
└─ ClientHello (0x01)
   ├─ Version: TLS 1.2 / 1.3
   ├─ Random: 32 bytes
   ├─ Cipher Suites: [0x1301, 0x1302, …]
   └─ Extensions
      ├─ Supported Groups
      ├─ Key Share
      └─ SNI (Type: 0x0000)
         └─ server_name: "www.youtube.com"
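The extraction itself is a bounds-checked walk through these nested, length-prefixed structures. The sketch below is a simplified version of that walk, not the engine's actual code; it validates every length field before advancing and returns an empty string on anything malformed:

```cpp
#include <cstdint>
#include <string>

// Simplified SNI extraction from a TLS ClientHello (illustrative).
// Returns "" if the buffer is malformed or the SNI extension is absent.
std::string extractSNI(const uint8_t* p, size_t len) {
    auto ok  = [&](size_t off, size_t n) { return off + n <= len; };
    auto u16 = [&](size_t off) { return size_t(p[off]) << 8 | p[off + 1]; };

    if (!ok(0, 5) || p[0] != 0x16) return "";       // handshake record
    size_t off = 5;
    if (!ok(off, 4) || p[off] != 0x01) return "";   // ClientHello
    off += 4;                                       // type + 3-byte length
    if (!ok(off, 2 + 32 + 1)) return "";
    off += 2 + 32;                                  // version + random
    size_t sid = p[off]; off += 1 + sid;            // session ID
    if (!ok(off, 2)) return "";
    size_t cs = u16(off); off += 2 + cs;            // cipher suites
    if (!ok(off, 1)) return "";
    size_t comp = p[off]; off += 1 + comp;          // compression methods
    if (!ok(off, 2)) return "";
    size_t extEnd = off + 2 + u16(off); off += 2;   // extensions block
    if (extEnd > len) return "";
    while (off + 4 <= extEnd) {                     // walk extensions
        size_t type = u16(off), elen = u16(off + 2); off += 4;
        if (off + elen > extEnd) return "";
        if (type == 0x0000 && elen >= 5) {          // server_name
            // list length (2), name type (1), then length-prefixed host
            size_t nameLen = u16(off + 3);
            if (5 + nameLen <= elen)
                return std::string(
                    reinterpret_cast<const char*>(p + off + 5), nameLen);
        }
        off += elen;
    }
    return "";
}
```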

The 5-Tuple (Connection Tracking)

To properly block a connection, the engine must track every packet belonging to that flow. It does this by hashing the 5-Tuple: Source IP, Destination IP, Source Port, Destination Port, and Protocol (TCP/UDP). By linking the extracted SNI or HTTP Host header to this 5-Tuple, the engine knows exactly which subsequent packets to drop without needing to parse their encrypted payloads.

Build & Run the C++ Engine

Compile and run the multi-threaded DPI engine on your own machine:

1. Build (Multi-Threaded)
bash — build
g++ -std=c++17 -pthread -O2 -I include -o dpi_engine \
    src/dpi_mt.cpp \
    src/pcap_reader.cpp \
    src/packet_parser.cpp \
    src/sni_extractor.cpp \
    src/types.cpp

For learning, you can build the single-threaded version by compiling src/main_working.cpp instead.

2. Run
bash — run
./dpi_engine test_dpi.pcap output.pcap \
    --block-app YouTube \
    --block-domain facebook.com \
    --block-ip 10.0.0.5

This reads test_dpi.pcap, applies 3 blocking rules, and writes forwarded packets to output.pcap.

3. Example Output
dpi_engine — output
╔══════════════════════════════════════════════════════════════╗
║                   DPI Engine — Run Report                    ║
╠══════════════════════════════════════════════════════════════╣
║  Input:    test_dpi.pcap (77 packets, 5.6 KB)               ║
║  Output:   output.pcap                                       ║
║  Duration: 0.003s                                            ║
╠══════════════════════════════════════════════════════════════╣
║  PACKET STATISTICS                                           ║
║  ├─ Total Packets:    77                                     ║
║  ├─ Forwarded:        62 (80.5%)                             ║
║  ├─ Blocked:          15 (19.5%)                             ║
║  ├─ TCP:              68                                     ║
║  └─ UDP:               9                                     ║
╠══════════════════════════════════════════════════════════════╣
║  APPLICATIONS DETECTED                                       ║
║  ├─ Google         12 packets    1,240 bytes                 ║
║  ├─ YouTube         8 packets      892 bytes  ← BLOCKED     ║
║  ├─ Facebook        7 packets      756 bytes  ← BLOCKED     ║
║  ├─ GitHub          6 packets      654 bytes                 ║
║  ├─ Netflix         5 packets      580 bytes                 ║
║  └─ 4 more apps...                                           ║
╠══════════════════════════════════════════════════════════════╣
║  BLOCKING RULES APPLIED                                      ║
║  ├─ Block App:     YouTube       → 8 packets dropped         ║
║  ├─ Block Domain:  facebook.com  → 7 packets dropped         ║
║  └─ Block IP:      10.0.0.5     → 0 packets matched         ║
╚══════════════════════════════════════════════════════════════╝

How the Engine Processes Traffic

When you run the engine, the reader thread opens the PCAP file and pushes decoded packets into thread-safe queues. Load balancers distribute packets across fast-path workers using consistent hashing on each packet's five-tuple — ensuring all packets from the same connection reach the same worker. Each worker independently performs DPI (SNI extraction, HTTP parsing, DNS decoding), classifies the application, checks blocking rules, and either forwards or drops the packet. The output writer aggregates statistics and writes forwarded packets to the output file.

Engineering Challenges

The hard problems encountered while building a real multi-threaded DPI engine from scratch

01

Parsing Variable-Length TLS Structures

TLS records don't have a fixed format. The ClientHello message contains a chain of variable-length fields — cipher suites, session IDs, compression methods — each prefixed with a length tag. Parsing these correctly required careful offset arithmetic and bounds checking at every step to avoid reading past the packet boundary.

The SNI extension itself is nested three levels deep, inside an extensions list, inside the ClientHello, inside the TLS record. A single off-by-one error silently produces garbage — the engine has to validate each length field before advancing the read pointer.

TLS 1.2 / 1.3 · ClientHello · Extension Parsing · Bounds Checking
02

Maintaining Flow State via Five-Tuple Hashing

A single TCP connection spans dozens to thousands of packets. To track its state (NEW → ESTABLISHED → CLASSIFIED → CLOSED), every packet must be mapped to the correct flow entry. The engine uses a five-tuple hash — (srcIP, dstIP, srcPort, dstPort, protocol) — to build a stable connection key.

A subtle problem: packets arrive in both directions. A hash that doesn't normalize direction produces two separate flow entries for the same connection. The solution was to sort the endpoint pair before hashing, so both directions always resolve to the same key.

Flow Tracking · Hash Maps · Bidirectional Normalization · State Machine
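That normalization can be sketched as follows (FlowKey and makeKey are illustrative names, not the project's actual code): order the (IP, port) endpoints canonically before building the key, so both directions of a connection produce the same key.

```cpp
#include <cstdint>
#include <tuple>

// Direction-normalized flow key: endpoint A is always the
// lexicographically smaller (IP, port) pair.
struct FlowKey {
    uint32_t ipA, ipB;
    uint16_t portA, portB;
    uint8_t  protocol;
    bool operator==(const FlowKey& o) const {
        return ipA == o.ipA && ipB == o.ipB && portA == o.portA &&
               portB == o.portB && protocol == o.protocol;
    }
};

FlowKey makeKey(uint32_t srcIP, uint16_t srcPort,
                uint32_t dstIP, uint16_t dstPort, uint8_t proto) {
    // Sort the endpoint pair so client->server and server->client
    // packets resolve to the same key.
    if (std::tie(srcIP, srcPort) <= std::tie(dstIP, dstPort))
        return {srcIP, dstIP, srcPort, dstPort, proto};
    return {dstIP, srcIP, dstPort, srcPort, proto};
}
```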
03

Building a Multi-Threaded Packet Processing Pipeline

Moving from a single-threaded loop to a pipelined architecture introduced coordination complexity that doesn't exist in sequential code. The pipeline has four stages — PCAP reader, load balancers, fast-path workers, and an output writer — each running on separate threads and communicating via thread-safe queues.

Designing a shutdown sequence was particularly tricky. Each stage needs to drain its queue before signaling termination downstream, otherwise packets in-flight at shutdown silently disappear. This required careful use of sentinel values and atomic flags.

C++17 Threads · Lock-Free Queues · Pipeline Design · Graceful Shutdown
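The drain-before-terminate idea can be sketched with a blocking queue whose close() wakes consumers, but whose pop() keeps returning items until the queue is both closed and empty, so nothing in flight is lost. This is illustrative code, not the engine's actual queue:

```cpp
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>

// Sketch of a drain-then-terminate blocking queue. pop() returns
// std::nullopt only once the queue is closed AND fully drained.
template <typename T>
class ThreadSafeQueue {
public:
    void push(T v) {
        { std::lock_guard<std::mutex> g(m_); q_.push(std::move(v)); }
        cv_.notify_one();
    }
    // Blocks until an item is available or the queue is closed and empty.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return !q_.empty() || closed_; });
        if (q_.empty()) return std::nullopt;   // closed and drained
        T v = std::move(q_.front()); q_.pop();
        return v;
    }
    // Signals termination downstream; pending items remain consumable.
    void close() {
        { std::lock_guard<std::mutex> g(m_); closed_ = true; }
        cv_.notify_all();
    }
private:
    std::queue<T> q_;
    std::mutex m_;
    std::condition_variable cv_;
    bool closed_ = false;
};
```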
04

Ensuring Flow Affinity Across Workers

With multiple fast-path workers running in parallel, there's no guarantee that packets from the same TCP connection land on the same worker — unless you explicitly enforce it. If two workers each see half the packets of a TLS handshake, neither will ever extract the SNI because neither sees the complete ClientHello.

The solution is consistent hashing on the five-tuple at the load balancer layer, mapping each connection deterministically to a single worker. Since each worker owns its own slice of flows, there are no locks on connection state — concurrent reads and writes are structurally impossible.

Consistent Hashing · Flow Affinity · Lock-Free State · Worker Isolation
05

Handling Incomplete and Truncated Packets

Real-world PCAP files contain truncated packets, typically cut short by the capture's snapshot length: the per-packet record header reports a captured length smaller than the original wire length. Any attempt to parse headers past the captured length will read uninitialized memory or crash.

Every parsing function in the engine validates available bytes before advancing. For TCP segments spanning multiple packets (fragmented payloads), the engine tracks the reassembly state per-flow and defers DPI until enough bytes are accumulated — gracefully marking the flow as INCOMPLETE rather than producing a false classification.

PCAP Truncation · Length Validation · Payload Reassembly · Defensive Parsing
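The validate-before-advance discipline can be sketched as a small bounds-checked cursor. ByteCursor is a hypothetical name; the engine's parsers follow the same pattern of checking the captured length before every read:

```cpp
#include <cstddef>
#include <cstdint>

// Bounds-checked read cursor over a possibly truncated capture: every
// read checks the captured length first, so a snaplen-truncated packet
// yields a clean failure instead of an out-of-bounds read.
struct ByteCursor {
    const uint8_t* data;
    size_t capLen;     // bytes actually captured (may be < wire length)
    size_t pos = 0;

    bool canRead(size_t n) const { return pos + n <= capLen; }
    bool readU8(uint8_t& out) {
        if (!canRead(1)) return false;
        out = data[pos++];
        return true;
    }
    bool readU16(uint16_t& out) {           // big-endian, as on the wire
        if (!canRead(2)) return false;
        out = uint16_t(data[pos] << 8 | data[pos + 1]);
        pos += 2;
        return true;
    }
    bool skip(size_t n) {
        if (!canRead(n)) return false;
        pos += n;
        return true;
    }
};
```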

What I Learned

Building this project filled in a lot of gaps that textbooks leave out — the kind of knowledge you only get by actually pushing packets through a parser.

Packet Parsing

Reading raw bytes from a PCAP file forces you to understand Ethernet, IP, TCP, and UDP headers at the bit level — not just conceptually, but as actual struct offsets in a buffer.

TLS Handshake Structure

The TLS ClientHello is a surprisingly information-rich message that leaks the destination hostname in plaintext even over an encrypted channel — and parsing it manually makes that very concrete.

Flow Tracking

Stateful packet inspection isn't just about individual packets — it's about reconstructing the narrative of a connection across time, which requires robust hash maps and lifecycle state machines.

Multi-Threaded Architectures

A staged pipeline where each layer runs on its own thread scales better than a shared thread pool, especially when work can be partitioned by flow — it eliminates contention at the source rather than managing it with locks.

Producer-Consumer Queues

Thread-safe queues are the connective tissue of any pipeline. Getting the blocking, draining, and shutdown semantics right is harder than it looks — and getting it wrong causes subtle data loss that only shows up under load.

Live Dashboard

Drop a PCAP file to analyze traffic (policy simulation in browser)

Drop a PCAP file here, or click to browse · supports .pcap format (not .pcapng)

Dashboard panels: Total Packets · Forwarded · Simulated Blocked · Connections · TCP/UDP breakdown · Apps Detected (SNIs extracted) · Global Traffic Timeline · Live Packet Feed (#, Time, Source, Destination, Proto, Size, App, Info, Payload, Action) · Application Distribution · Top Domains · Simulated Blocked vs Forwarded · Policy Simulation Rules · Connection Flow Map

Dashboard-only simulation: packets are re-labeled in analysis; live network traffic is not intercepted here.