Processing Truncated Requests? A PHP Debugging Deep Dive

Created on November 12, 2023 at 11:32 am

In my previous blog post, I came across a bug in Report URI that took some effort to debug and fully understand before I could fix it. Whilst I’d identified what the issue was, and how it was happening, I never got to the bottom of why it was happening. In this post, I’m going to delve into the PHP source to figure it out!

Truncated POST Requests

If you haven’t read my previous blog post, Unravelling The Mystery Of Truncated POST Requests On Report URI, you should start there as it sets the scene for what I’m investigating here. It’s a bit lengthy, but it has all of the details that are required to understand the journey in this post. Just like in that previous post, I’m investigating things outside of my area of knowledge, but sometimes, I have an itch that I just have to scratch, and this was one of those times!

It Wasn’t Nginx

At the end of the last blog post, I wasn’t 100% sure whether Nginx was at fault, or whether PHP was at fault. Admittedly, I did have a feeling that it was going to be PHP that was responsible, but I hadn’t tested my idea and didn’t have any evidence, so I didn’t share that view at the time.

The problem we had was that POST requests were coming through, and by the time they got to PHP the payload was truncated, which caused a failure in parsing the JSON payload because it was invalid. I knew that Nginx was receiving the full payload, and we use the default value for fastcgi_request_buffering (link), so "the entire request body is read from the client before sending the request to a FastCGI server". I could verify this with some simple debug logging in Nginx, which I did, and the failure was happening between Nginx and PHP during the fastcgi_pass. Nginx has the full request body to pass; it’s just not making it to the other side because the socket is closed before everything can be sent.

Previously, we were using a TCP socket between Nginx and PHP, and we were experiencing a lot of memory pressure on TCP resources. That resulted in an out of memory -- consider tuning tcp_mem error (source), and Linux does warn us that it will "kill" sockets if there is "strong memory pressure" (source).

bool tcp_check_oom(struct sock *sk, int shift)
{
	bool too_many_orphans, out_of_socket_memory;

	too_many_orphans = tcp_too_many_orphans(shift);
	out_of_socket_memory = tcp_out_of_memory(sk);

	if (too_many_orphans)
		net_info_ratelimited("too many orphaned sockets\n");
	if (out_of_socket_memory)
		net_info_ratelimited("out of memory -- consider tuning tcp_mem\n");
	return too_many_orphans || out_of_socket_memory;
}

/**
 *  tcp_out_of_resources() - Close socket if out of resources
 *  @sk:        pointer to current socket
 *  @do_reset:  send a last packet with reset flag
 *
 *  ...
 *
 *  Criteria is still not confirmed experimentally and may change.
 *  We kill the socket, if:
 *  1.  If number of orphaned sockets exceeds an administratively configured
 *      limit.
 *  2.  If we have strong memory pressure.
 *  3.  If our net namespace is exiting.
 */
static int tcp_out_of_resources(struct sock *sk, bool do_reset)
{

Since we swapped away from a TCP socket to a Unix socket, the same problem could still be reproduced, but this time by pressuring PHP instead.

Reproducing the error with a Unix Socket

Using the same slow script from the previous blog, with a small tweak to remove the outbound TCP connection, we can keep all of the PHP processes busy easily.

<?php
sleep(30);

$json = json_decode(file_get_contents('php://input'));

if ($json == null) {
    syslog(LOG_WARNING, "JSON decode failed: " . $_SERVER['CONTENT_LENGTH'] . " " . strlen(file_get_contents('php://input')));
}
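With every worker stuck in sleep(30), saturating the pool is just a matter of having more concurrent requests in flight than there are PHP-FPM children. As a rough sketch, here’s one way to generate that load from PHP itself; the URL, payload size, and request count are illustrative and would need matching to your own setup:

<?php
// Fire a batch of concurrent POST requests so that every PHP-FPM worker
// ends up stuck in sleep(30), leaving none free for new connections.
$payload = str_repeat('{"a":1}', 150000); // ~1 MB body, like a large report
$mh = curl_multi_init();
$handles = [];

for ($i = 0; $i < 50; $i++) { // comfortably more than pm.max_children
    $ch = curl_init('http://localhost/slow.php'); // illustrative URL
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $payload);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}

// Standard curl_multi loop: run all transfers to completion.
do {
    $status = curl_multi_exec($mh, $active);
    if ($active) {
        curl_multi_select($mh);
    }
} while ($active && $status === CURLM_OK);

foreach ($handles as $ch) {
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);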

I then tweaked the Nginx config so that it would time out on fastcgi operations quickly.

fastcgi_connect_timeout 3s;
fastcgi_send_timeout 3s;

The new Nginx config, coupled with the really long-running PHP scripts, means that it will be easy to get Nginx to start timing out on the fastcgi_pass, and that’s when the error occurs.

Debugging PHP

Fortunately, I had quite a specific area of the php-src that I’d need to look at to investigate, so this helped me narrow things down quite quickly. We’re running v8.2.10 in production on Report URI, and I’ve also verified my findings here against the latest 8.2 branch, which is v8.2.11 at the time of writing. I will do my work here against that branch.

Following the processing of a request, we go through:

php_request_startup()

sapi_activate()

sapi_read_post_data()

SAPI_POST_READER_FUNC()

sapi_read_post_block()

sapi_module.read_post()

sapi_fcgi_read_post()

fcgi_read()

safe_read()

In safe_read(), we make the final system call to read(), to read data from the socket. I’ve snipped the WIN32 code for ease of reading because it isn’t relevant here.

static inline ssize_t safe_read(fcgi_request *req, const void *buf, size_t count)
{
	int    ret;
	size_t n = 0;

	do {
#ifdef _WIN32
		/* snip */
#endif
		errno = 0;
#ifdef _WIN32
		/* snip */
#else
		ret = read(req->fd, ((char*)buf)+n, count-n);
#endif
		if (ret > 0) {
			n += ret;
		} else if (ret == 0 && errno == 0) {
			return n;
		} else if (ret <= 0 && errno != 0 && errno != EINTR) {
			return ret;
		}
	} while (n != count);
	return n;
}

Looking at the manual page for read(), I notice a couple of things that stand out in Description and later in Return Value:

If the file offset is at or past the end of file, no bytes are read, and read() returns zero.

On success, the number of bytes read is returned (zero indicates end of file)
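What the manual page doesn’t spell out is what "end of file" means for a socket: a read() that returns zero means the peer has shut down its end of the connection, not that data is still on its way. Here’s a minimal PHP sketch of the same semantics, using stream_socket_pair() to stand in for the Nginx-to-FPM Unix socket (fread() and feof() sit on top of the same underlying read() call):

<?php
// A connected pair of Unix stream sockets, like the one between
// Nginx and PHP-FPM.
[$writer, $reader] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);

fwrite($writer, "partial payload");
fclose($writer); // the "Nginx" side gives up and closes the socket

// The first read returns whatever made it into the kernel buffer...
var_dump(fread($reader, 16384)); // string(15) "partial payload"

// ...and the next read returns zero bytes: EOF, not "try again later".
var_dump(fread($reader, 16384)); // string(0) ""
var_dump(feof($reader));         // bool(true)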

The PHP source has handling for ret being zero, and also checks errno in the two else if statements, but I wasn’t sure what was being returned by read(). I decided to add some of my own logging to see exactly what was going on by patching PHP. I spun up a test server with some hefty resources so I’d be able to build PHP quickly, and set it up as follows.

add-apt-repository ppa:ondrej/php
add-apt-repository ppa:ondrej/nginx
apt update
apt install nginx php8.2 php8.2-cli php8.2-common php8.2-curl php8.2-fpm php8.2-gd php8.2-igbinary php8.2-intl php8.2-mbstring php8.2-soap php8.2-xml php8.2-zip php8.2-redis
apt upgrade -y
reboot
mkdir build
cd build
sudo sed -i 's/^#\s*\(.*\)/\1/' /etc/apt/sources.list.d/ondrej-ubuntu-php-jammy.list
sudo apt-get update -y
sudo apt-get install -y devscripts build-essential fakeroot
sudo apt-get build-dep -y php8.2-fpm
apt-get source -y php8.2-fpm
cd php8.2-8.2.10
dpkg-buildpackage -rfakeroot -uc

Even running on a hefty VPS with 16x Intel vCPU, 32 GB RAM and a 200 GB NVMe drive, it still took a little over 30 minutes to build, but it was ready. I made the following patch for main/fastcgi.c:

--- fastcgi.c.orig	2023-09-29 10:06:52.180803585 +0000
+++ fastcgi.c	2023-09-29 10:06:15.932395211 +0000
@@ -982,8 +982,10 @@
 		if (ret > 0) {
 			n += ret;
 		} else if (ret == 0 && errno == 0) {
+			php_syslog(LOG_NOTICE, "errno 0: %s", strerror(errno));
 			return n;
 		} else if (ret <= 0 && errno != 0 && errno != EINTR) {
+			php_syslog(LOG_NOTICE, "errno non-0: %s", strerror(errno));
 			return ret;
 		}
 	} while (n != count);

I used strerror() to get the string version of the error number for ease of reading, and I also wanted to monitor the progress of reading chunks from the socket. I made another patch for main/SAPI.c:

--- SAPI.c.orig	2023-09-29 12:32:46.020740222 +0000
+++ SAPI.c	2023-09-29 12:34:35.769896620 +0000
@@ -248,6 +248,9 @@
 		SG(post_read) = 1;
 	}
 
+	php_syslog(LOG_NOTICE, "Read %zu bytes (" ZEND_LONG_FMT " bytes remaining)",
+		read_bytes, SG(request_info).content_length - SG(read_post_bytes));
+
 	return read_bytes;
 }
 
@@ -285,6 +288,7 @@
 		}
 		if (read_bytes < SAPI_POST_BLOCK_SIZE) {
+			php_syslog(LOG_NOTICE, "POST data reading complete. Size: " ZEND_LONG_FMT " bytes (out of " ZEND_LONG_FMT " bytes)", SG(read_post_bytes), SG(request_info).content_length);
 			break;
 		}
 	}

Now it was just a case of running make to incorporate my changes and then overwriting the existing binary.

cd fpm-build
make
systemctl stop php8.2-fpm.service
cp ./sapi/fpm/php-fpm /usr/sbin/php-fpm8.2
systemctl start php8.2-fpm.service

All I had to do then was cause the issue and check syslog for the errors. I’ve tidied up a snippet here and removed unrelated entries.

Sep 28 21:13:30 test-server php: Read 16384 bytes (1043937 bytes remaining)
Sep 28 21:13:30 test-server php: Read 16384 bytes (1027553 bytes remaining)
Sep 28 21:13:30 test-server php: Read 16384 bytes (1011169 bytes remaining)
Sep 28 21:13:30 test-server php: Read 16384 bytes (994785 bytes remaining)
Sep 28 21:13:30 test-server php: Read 16384 bytes (978401 bytes remaining)
Sep 28 21:13:30 test-server php: Read 16384 bytes (962017 bytes remaining)
Sep 28 21:13:30 test-server php: Read 16384 bytes (945633 bytes remaining)
Sep 28 21:13:30 test-server php: Read 16384 bytes (929249 bytes remaining)
Sep 28 21:13:30 test-server php: Read 16384 bytes (912865 bytes remaining)
Sep 28 21:13:30 test-server php: Read 16384 bytes (896481 bytes remaining)
Sep 28 21:13:30 test-server php: Read 16384 bytes (880097 bytes remaining)
Sep 28 21:13:30 test-server php: Read 16384 bytes (863713 bytes remaining)
Sep 28 21:13:30 test-server php: Read 16384 bytes (847329 bytes remaining)
Sep 28 21:13:30 test-server php: Read 16384 bytes (830945 bytes remaining)
Sep 28 21:13:30 test-server php: errno 0: Success
Sep 28 21:13:30 test-server php: Read 0 bytes (830945 bytes remaining)
Sep 28 21:13:30 test-server php: POST data reading complete. Size: 229376 bytes (out of 1060321 bytes)
Sep 28 21:13:30 test-server php: JSON decode failed: 1060321 229376

Gotcha! 😎

But we didn’t get everything?

At least now I can finally understand what’s happening. Because PHP thinks it is hitting EOF before it should, it begins to process the request without having the entire payload. What surprises me is that PHP knows it hasn’t got all of the request payload, but it continues anyway. This is the part where I can’t decide what it should do, but saying "hey, 50% of this payload is missing, let’s go!" is what caught me out here.

To try and determine why the truncation was happening at a particular size, I’d spotted that sapi_read_post_block() was using SAPI_POST_BLOCK_SIZE, defined as #define SAPI_POST_BLOCK_SIZE 0x4000, or, converting that to decimal, 16,384… Going back to the previous blog post, the POST payload was consistently being truncated at 65,536 bytes, and 16,384 * 4 = 65,536, so is that a possible cause for the specific size of the truncation? I’m not so sure, because, when using TCP sockets, tweaking the values for /proc/sys/net/ipv4/tcp_rmem will change the size of the truncation, and the truncated size is not always a multiple of 16,384. Also, when we switched from the TCP socket to the Unix socket, the size of the truncated payload changed to 229,376 bytes, which closely aligns with /proc/sys/net/core/[rw]mem_default defined here. I even patched PHP again to modify SAPI_POST_BLOCK_SIZE, but that didn’t have an impact on the size of the truncated payload. At this point, it’s still not clear, but I know PHP will read from the socket until all of the data is read; what changes is how much data is there.
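To sanity-check that alignment on your own system, you can read the kernel defaults straight out of /proc and compare them with the truncation size from the logs. A quick sketch, Linux-only, with the 229,376 figure taken from my test server:

<?php
// Compare the observed truncation size with the kernel's default
// socket buffer sizes, exposed by Linux under /proc/sys.
$truncatedAt = 229376; // bytes PHP read before the premature EOF

$rmemDefault = (int) file_get_contents('/proc/sys/net/core/rmem_default');
$wmemDefault = (int) file_get_contents('/proc/sys/net/core/wmem_default');

printf("rmem_default: %d bytes, wmem_default: %d bytes\n", $rmemDefault, $wmemDefault);
printf("truncated at: %d bytes = %d blocks of SAPI_POST_BLOCK_SIZE (16384)\n",
    $truncatedAt, intdiv($truncatedAt, 16384));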

Once the read() system call in safe_read() returns 0, we’re going all the way back up the call stack to SAPI_POST_READER_FUNC() before any checking of the payload size takes place. In this function, there are only three checks taking place, and only one relates to the size of the payload (source).

if ((SG(post_max_size) > 0) && (SG(read_post_bytes) > SG(post_max_size))) {
	php_error_docref(NULL, E_WARNING, "Actual POST length does not match Content-Length, and exceeds " ZEND_LONG_FMT " bytes", SG(post_max_size));
	break;
}

This seems like a sensible sanity check that the size of the POST payload is not exceeding the post_max_size limitation, which defaults to 8M if not set, but the error message threw me a little. The first part of the error message states Actual POST length does not match Content-Length, which is precisely the error I’m hoping to catch, but I can’t see how the conditions match the text of the message, so this seems like a mistake in the text. Based on the conditions, a more accurate error text would be POST length exceeds post_max_size, but I have to let them off as this code is 23 years old! The size of the POST is also checked earlier in the same function (source), before the body is read, by comparing the declared Content-Length against post_max_size, so the check is backed up later after reading the body, and the error text here is accurate.

if ((SG(post_max_size) > 0) && (SG(request_info).content_length > SG(post_max_size))) {
	php_error_docref(NULL, E_WARNING, "POST Content-Length of " ZEND_LONG_FMT " bytes exceeds the limit of " ZEND_LONG_FMT " bytes", SG(request_info).content_length, SG(post_max_size));
	return;
}

After that, we can run back up through the rest of the call stack and find no additional checks comparing the size of the POST that was read against the declared size of the payload. The request will be processed even though we didn’t get all of it.

Is this a bug?

I’ve certainly gone backwards and forwards on my answer to this question, but ultimately, I think this is a bug. It boils down to choosing between two options:

1. Should PHP dump the partial request with an error?
2. Should PHP attempt to process a partial request?

I have looked at this from a security angle and, so far, I’ve not found a way this can be obviously abused, but it depends on whether receiving partial request payloads would be dangerous for your application. Arguably, this is something you should be handling anyway.
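Until something changes upstream, that handling is easy enough to do in userland by comparing the declared Content-Length against what actually arrived, before trusting the body. A minimal sketch; the 400 response and the log format are my own choices, not anything PHP does for you:

<?php
// Refuse to process a body that is shorter than its declared length.
$body = file_get_contents('php://input');
$declared = (int) ($_SERVER['CONTENT_LENGTH'] ?? 0);

if ($declared > 0 && strlen($body) !== $declared) {
    http_response_code(400);
    syslog(LOG_WARNING, sprintf(
        'Truncated POST: got %d of %d declared bytes',
        strlen($body), $declared
    ));
    exit;
}

$json = json_decode($body);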

To see behaviour closer to what I was expecting, although not quite, I created a patch for SAPI.c that will dump the whole request body if we get a partial request.

--- SAPI.c.orig	2023-09-29 12:32:46.020740222 +0000
+++ SAPI.c	2023-09-29 20:28:36.530234632 +0000
@@ -288,6 +288,12 @@
 			break;
 		}
 	}
+
+	if (SG(read_post_bytes) != SG(request_info).content_length) {
+		php_stream_truncate_set_size(SG(request_info).request_body, 0);
+		php_error_docref(NULL, E_WARNING, "POST length of " ZEND_LONG_FMT " bytes does not match declared Content-Length " ZEND_LONG_FMT " bytes; all data discarded", SG(read_post_bytes), SG(request_info).content_length);
+	}
+
 	php_stream_rewind(SG(request_info).request_body);
 	}
 }

It’s not perfect, but it does give me some clear feedback as to what happened, and I don’t receive a partial payload to process now. You can see the problem being detected on the final line of this log slice.

Sep 29 20:23:31 test-server php: Accepted connection: 0ce264eae07198c3c59ae90d04127c39
Sep 29 20:23:31 test-server php: Read 16384 bytes (1024737 bytes remaining)
Sep 29 20:23:31 test-server php: Read 16384 bytes (1008353 bytes remaining)
Sep 29 20:23:31 test-server php: Read 16384 bytes (991969 bytes remaining)
Sep 29 20:23:31 test-server php: Read 16384 bytes (975585 bytes remaining)
Sep 29 20:23:31 test-server php: Read 16384 bytes (959201 bytes remaining)
Sep 29 20:23:31 test-server php: Read 16384 bytes (942817 bytes remaining)
Sep 29 20:23:31 test-server php: Read 16384 bytes (926433 bytes remaining)
Sep 29 20:23:31 test-server php: Read 16384 bytes (910049 bytes remaining)
Sep 29 20:23:31 test-server php: Read 16384 bytes (893665 bytes remaining)
Sep 29 20:23:31 test-server php: Read 16384 bytes (877281 bytes remaining)
Sep 29 20:23:31 test-server php: Read 16384 bytes (860897 bytes remaining)
Sep 29 20:23:31 test-server php: Read 16384 bytes (844513 bytes remaining)
Sep 29 20:23:31 test-server php: Read 16384 bytes (828129 bytes remaining)
Sep 29 20:23:31 test-server php: Read 16384 bytes (811745 bytes remaining)
Sep 29 20:23:31 test-server php: errno 0: Success
Sep 29 20:23:31 test-server php: Read 0 bytes (811745 bytes remaining)
Sep 29 20:23:31 test-server php: POST data reading complete. Size: 229376 bytes (out of 1041121 bytes)
Sep 29 20:23:31 test-server php: POST length of 229376 bytes does not match declared Content-Length 1041121 bytes; all data discarded

I guess the only thing left to do is to raise a bug (done) and see what happens, but for now, at least I finally understand why PHP was trying to process those truncated JSON payloads! 🎉

Outstanding Questions

The final piece of the puzzle that I’m still trying to put together is how we end up in this situation. How does PHP end up reading only part of the request?

My knowledge of the inner workings of TCP and Unix sockets is basically non-existent, and I’ve learnt the little that I do know between this blog post and the last one. My current theory looks something like this, and bear in mind, this is mostly guesswork!

1. Nginx receives the full POST request from the client.
2. Nginx opens the socket to PHP and starts writing until the buffer is filled? (See the sketch below.)
3. PHP is busy, so no child process can spawn to handle the request?
4. Nginx times out and goes away.
5. PHP finally gets around to processing the request, which is limited to the partial content in the buffer.
6. PHP processes the partial request.
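Steps 2 and 5 are easy to imitate in miniature, even if it’s a toy version of what Nginx really does. In this sketch, a non-blocking writer stuffs a Unix socket until the kernel buffer refuses any more, then gives up, and the late-arriving reader only ever sees what was buffered:

<?php
// Toy imitation of the theory: the writer fills the socket buffer and
// goes away; the busy reader only gets the buffered portion.
[$nginx, $php] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
stream_set_blocking($nginx, false); // don't wait when the buffer is full

$payload = str_repeat('x', 1060321); // same size as the failing request
$sent = 0;
while ($sent < strlen($payload)) {
    // PHP may emit a notice once the write would block, hence the @.
    $n = @fwrite($nginx, substr($payload, $sent, 16384));
    if ($n === false || $n === 0) {
        break; // kernel buffer full and nobody is reading: give up
    }
    $sent += $n;
}
fclose($nginx); // "Nginx times out and goes away"

// The busy "PHP" side finally reads: only the buffered bytes remain.
$received = stream_get_contents($php);
printf("writer managed %d of %d bytes; reader later got %d bytes\n",
    $sent, strlen($payload), strlen($received));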

Without knowing more about how sockets work, a large portion of my assumptions might be wrong, so it’d be great to hear from someone with real knowledge in this area. If you can help out, please do drop by the comments below and give me some tips or answers!

Another huge thanks has to go to João Poupino again for assistance with this post, it simply wouldn’t have been possible otherwise 🍻
