Reading the cPanel/WHM emergency patch the day it dropped

Today, cPanel quietly shipped an emergency security update across every supported branch. The advisory was short: a vague reference to “session loading and saving” under internal ticket CPANEL-52908, a note that exploitation predated the patch, and six new build numbers, one per supported major. That left one question: how do I tell which hosts are patched and which aren’t?

On the wire there is nothing to compare against; the response shape did not change. So the only honest answer was to read the diff, walk it end-to-end, and decide what an external probe could legitimately observe.

This is a write-up of the evening I spent doing that.

cPanel publishes its release tree at httpupdate.cpanel.net/cpanelsync/, which makes side-by-side diffs surprisingly easy. I pulled the last vulnerable build (11.110.0.96) and the first patched build (11.110.0.97) and ran diff -ruN. The whole patch was 5.3 KB across three files: Cpanel/Session.pm, Cpanel/Session/Load.pm, and Cpanel/Session/Encoder.pm.

The security-relevant change was effectively one line. In Cpanel::Session::saveSession, the patched version adds a call to filter_sessiondata($session_ref) before the hashref is flushed to disk. Everything else in the diff was hardening around it: a no-ob: branch in the encoder for sessions that have no obfuscation key, a hex_encode_only helper, and a rewrite of a local $h{k} = X if COND idiom that perlsyn flags as undefined behaviour.

cPanel’s session state lives in /var/cpanel/sessions/raw/<sessid> and is written by Cpanel::Config::FlushConfig::flushConfig. The format is one entry per line, key=value\n, with no escaping at all. The reader splits each line on the first =, and when the same key appears twice, the last occurrence wins.

That format has an obvious property: if a \n byte ends up inside a value, the next read sees that value as multiple keys.
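A minimal Python sketch of that property, written as a toy model of the format described above rather than cPanel's actual Perl. The function names flush_config and load_config and the sample keys are illustrative assumptions.

```python
def flush_config(session: dict) -> str:
    """Serialize as key=value lines with no escaping (toy model of the format)."""
    return "".join(f"{key}={value}\n" for key, value in session.items())


def load_config(raw: str) -> dict:
    """Split each line on the first '='; a later duplicate key overwrites an earlier one."""
    parsed = {}
    for line in raw.split("\n"):
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        parsed[key] = value
    return parsed


# One value carries an embedded newline followed by what looks like another entry.
original = {"theme": "default", "note": "hello\ntheme=injected"}
round_tripped = load_config(flush_config(original))
# round_tripped == {"theme": "injected", "note": "hello"}
```

The writer emits three lines for two keys, and the reader has no way to tell that the third line was once part of a value.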

The codebase already had the right defense for this. filter_sessiondata strips \r, \n, = and , from origin values, and \r and \n from everything else. It was wired into the two safe wrappers:

  • Cpanel::Session::create, the new-session path
  • Cpanel::Session::Modify::save, the locked-update path

It was simply not wired into saveSession itself. Inside the published Perl tree, every direct caller went through one of those two wrappers, with one exception that pre-stripped \r\n from its inputs by hand. So on the .pm side there was no obvious caller of the bug. The unsafe direct caller had to live somewhere else.
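For illustration, here is a hedged Python sketch of a filter with the behaviour described above. It is not the Perl implementation. It assumes that origin is a nested mapping of strings and that every other value is a flat string.

```python
import re

# Characters removed from origin values vs. from every other value.
_ORIGIN_UNSAFE = re.compile(r"[\r\n=,]")
_VALUE_UNSAFE = re.compile(r"[\r\n]")


def filter_sessiondata(session: dict) -> dict:
    """Return a copy of the session with line-breaking characters stripped.

    Assumption: 'origin' is a nested mapping; everything else is a scalar.
    """
    cleaned = {}
    for key, value in session.items():
        if key == "origin" and isinstance(value, dict):
            cleaned[key] = {
                k: _ORIGIN_UNSAFE.sub("", str(v)) for k, v in value.items()
            }
        elif value is None:
            cleaned[key] = None
        else:
            cleaned[key] = _VALUE_UNSAFE.sub("", str(value))
    return cleaned
```

The design point is where the filter runs. If it lives inside the lowest-level write function, no caller can skip it. If it lives only in wrappers, every direct caller has to remember it.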

cpsrvd is the cPanel daemon that fronts ports 2082 through 2096. On disk it ships as a single xz-compressed ELF: cpanelsync/11.110.0.96/binaries/linux-c8-x86_64/cpsrvd.xz. I pulled it for both builds, decompressed both, and started picking the binaries apart.

The first interesting marker was B::C in the strings output. B::C is the Perl B::C compiler backend, which turns Perl source into native C and links it into the host binary. cpsrvd embeds a private package tree that way: Cpanel::Server::Routes, Cpanel::Server::Handlers::*, Cpanel::Server::Type::Change, Cpanel::Server::Utils, and the constant Cpanel::Server::_SESSION_PARTS. All of those package names appear in the cpsrvd ELF as symbols, but none of them ship as a .pm file in cpanelsync. They are baked in once at build time and never reloaded.
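A rough Python equivalent of that strings pass, as a sketch. The helpers printable_strings and package_names are hypothetical names, and the four-character minimum mirrors the usual default of the strings utility.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def package_names(data: bytes, prefix: str = "Cpanel::Server") -> set:
    """Collect Perl-style package names under a prefix from a binary blob."""
    name = re.compile(re.escape(prefix) + r"(?:::\w+)*")
    return {
        m.group()
        for text in printable_strings(data)
        for m in name.finditer(text)
    }
```

To run it against the real daemon, decompress the download first (Python's lzma.open reads .xz files) and pass the resulting bytes to package_names.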

Adjacent in the binary’s string table I found set_session_values, successful_login, keep_session, previous_session_user, $session_ref, and $user_provided_session_ref, alongside saveSession, write_session, needs_auth, cp_security_token, filter_sessiondata, and the literal path /usr/local/cpanel/Cpanel/Session.pm. _SESSION_PARTS is the list of fields cpsrvd considers writable from a request; it includes theme, external_validation_token, tfa_verified, session_temp_pass, session_temp_user, ip_address, and needs_auth.

That gives the full picture. cpsrvd carries a compiled-in Cpanel::Server handler that does its own loadSession → mutate → saveSession. It calls Cpanel::Session::saveSession directly, which means it skips the create and Modify::save wrappers that would have filtered the values for it. That is the unsafe direct caller.

I diffed the two cpsrvd ELFs byte for byte. They differ by 9,272,115 bytes between v110.96 and v110.97. Most of that is B::C output churn: a one-line source change in saveSession rebuilds a large fraction of the compiled Perl, and the linker reshuffles addresses in everything downstream. The string table is unchanged. So the patch is in the compiled code path, not the metadata, which has a sharp implication: an operator who patches the on-disk .pm files but does not get a rebuilt cpsrvd is still vulnerable, because the running daemon never looks at the on-disk Session.pm for its own session-write path.
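One simple way to put a number on a byte-for-byte comparison is sketched below. It is not necessarily how the figure above was produced, and the function name is an assumption. It counts positions where the two files disagree, plus any difference in length.

```python
def count_differing_bytes(a: bytes, b: bytes) -> int:
    """Count mismatched byte positions, plus any length difference."""
    differing = sum(x != y for x, y in zip(a, b))
    return differing + abs(len(a) - len(b))
```

A positional count like this overstates the real change whenever code shifts by a few bytes, because everything after the shift then mismatches. That is consistent with a one-line source change producing a multi-megabyte delta.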

The actual primitive is cleaner than I expected. cpsrvd’s HTTP Basic auth handler stores the submitted password verbatim in the pre-auth session file. So a Basic auth header with a CRLF-bearing password writes attacker-controlled key=value lines into /var/cpanel/sessions/raw/<sess> on the first request.

The second step is what makes it land. cpsrvd’s session loader reads a JSON cache file at /var/cpanel/sessions/cache/<sess> first and only falls back to the raw text parser if the cache is missing or fails. In the cache, injected newlines are inert string content inside the pass value, so on a cache hit nothing happens. The trigger is a request to a /cpsess<token>/ URL with a wrong token: cpsrvd serves the Token-Denied page and, in the process, calls Cpanel::Session::Modify::new(nocache => 1). That re-parses the raw file (which still has the injected newlines) and regenerates the JSON cache with the attacker-chosen keys promoted to top-level entries.

From there the session has whatever fields you injected. The four that matter are user=root, hasroot=1, tfa_verified=1, and successful_internal_auth_with_timestamp set to a recent value. After the regen, the next request to WHM is authenticated as root.

WHM’s daemon runs as root, so a forged WHM session is a root shell on the host. On a shared-hosting node that is every tenant’s mail spool, every tenant’s database, every tenant’s home directory, and the kernel.

I wanted empirical confirmation that the cpsrvd-binary hypothesis was right, so I stood up an AlmaLinux 8 t3.medium in eu-central-1, locked the security group to my IP, and ran the cPanel installer. My first attempt pinned /etc/cpupdate.conf to 11.110.0.96; the installer refused with FATAL: You cannot install versions of cPanel & WHM prior to cPanel & WHM version 125. I switched to 11.126.0.53, which is the last pre-patch build on the 11.126 branch. The installer ran for about 40 minutes and brought cpsrvd up clean.

I confirmed the live Cpanel/Session.pm on disk matched the vulnerable v53 source, then ran the full chain through a nuclei template against https://<public-ip>:2087. Ten attempts, ten hijacks. The on-disk session cache for each successful run showed user=root, hasroot=1, and tfa_verified=1 as proper top-level keys.

Then I tested the cpsrvd-binary hypothesis directly. I copied the patched Session.pm from 11.126.0.54 over the live /usr/local/cpanel/Cpanel/Session.pm, restarted cpsrvd, and re-ran the nuclei template. Ten attempts, ten hijacks. The patch on disk had no effect on the running daemon, exactly as the strings analysis predicted.

Finally I ran /scripts/upcp to pull cPanel up to 11.126.0.54 properly, which rebuilds the cpsrvd binary as part of the upgrade. Ten attempts, zero hijacks. That is the empirical confirmation. The fix lives in the compiled cpsrvd, not the on-disk module.
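Since the fix arrives with the rebuilt binary, the installed build number is a more useful local signal than the contents of Session.pm. Below is a small sketch of that check. It covers only the two branches whose first patched builds are named in this post, and the helper names are hypothetical. How you obtain the build string from a host is left out.

```python
from typing import Optional

# First patched build per branch, taken from the builds named in this post.
# The other supported branches are deliberately left out rather than guessed.
FIRST_PATCHED = {
    (11, 110): (11, 110, 0, 97),
    (11, 126): (11, 126, 0, 54),
}


def parse_build(build: str) -> tuple:
    """Turn '11.126.0.54' into (11, 126, 0, 54)."""
    return tuple(int(part) for part in build.strip().split("."))


def is_patched(build: str) -> Optional[bool]:
    """True or False for a known branch, None when the branch is not in the table."""
    parsed = parse_build(build)
    first_fixed = FIRST_PATCHED.get(parsed[:2])
    if first_fixed is None:
        return None
    return parsed >= first_fixed
```

Returning None for unknown branches keeps the check from reporting a host as safe just because its branch is missing from the table.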

I terminated the instance, deleted the security group and key pair, and let the EBS volume go with it.

The bug itself is the kind of thing that, in retrospect, looks almost mundane: a sanitizer that existed in the codebase, was wired into two paths, and was simply missing from a third. There was no exotic primitive, no novel chain. The exploitable surface was a line-oriented config file and a forgotten filter_sessiondata call.

What was less mundane was the layer below. The Perl side of the patch was almost a decoy. The fix that actually mattered shipped inside a compiled binary, where the same source change produces a roughly 9 MB delta and no visible string-table difference. That is the kind of thing a casual diff misses, and it is the kind of thing an operator who hot-patches the on-disk modules without replacing the compiled daemon misses too. I would not have believed it without the cpsrvd-restart experiment on the live box; the strings analysis was suggestive, the live test was conclusive.

The wider gap is the more uncomfortable one. The advisory dropped with exploitation already happening, with no on-wire change between vulnerable and patched, and with the actual fix sitting in a place most defenders never look. If you wanted to know whether your perimeter was safe, the only path was to read the diff, understand it well enough to identify externally-observable signatures, validate them against a live build, and ship the check yourself. The window in which “a patch exists” and “every box you care about is actually patched” diverge is exactly the window attackers operate in, and on this one it was measured in hours.

That is the part of vulnerability research that does not show up in any advisory. It is also the part that is becoming dramatically faster to do: running four research streams in parallel, diffing the patch, reverse engineering the binary, standing up a live target, and writing the detection check is a single evening now, where two years ago it was a week. I used to think the bottleneck was the reading. It isn’t. The bottleneck is choosing what not to do.

