

PE Files


1. General Info

  • Based on the UNIX COFF file format (sometimes referred to as PE/COFF)
  • .exe and .dll files are PE
  • File Format:
      +-----------------+
      ~   DOS MZ Hdr    ~  64b
      +-----------------+
      ~    DOS Stub     ~  (var)
      +-----------------+
      ~     PE Hdr      ~  248b (4b Signature + 20b File Hdr + 224b Optional Hdr)
      +-----------------+
      ~  Section Table  ~  40b x No. Sections
      +-----------------+
      ~    Section 1    ~
      +-----------------+
      ~    Section 2    ~
      +-----------------+
      ~    Section ...  ~
      +-----------------+

2. Definitions

  • Image Base: Preferred starting address of the executable when it is loaded in memory
    • Locating the Image Base:
      • Read the DWORD at file offset [0x3C] + 0x34, e.g. if [0x3C] = 0xD8 then 0xD8 + 0x34 = 0x10C
    • The most common Image Base for Windows EXEs is 0x400000
  • VA
    • Absolute Address in Memory
  • RVA
    • Memory address relative to the Image Base (RVA = VA - IB)
    • Most addresses stored in a PE file are RVAs
  • Virtual Address of a Section
    • RVA of the beginning of the Section
  • Section Alignment
    • Very common to see 0x1000
    • Sections start on multiples of the SA
    • Locating SA: [0x3C] + 0x38
  • File Alignment
    • Raw Offset of Section must be a multiple of the FA value.
  • File on Disk:
    • The file on disk does not necessarily end after the last section. Any data beyond it is called an Overlay.
    • When the file loads into memory, the distance between sections can be completely different from what it was on disk.
  • Entry Point
    • RVA at which the loader will begin execution (conceptually main(), although in practice it is usually runtime startup code that later calls main())
    • Location of the first instruction to be executed
  • File Header
    • struct _IMAGE_FILE_HEADER
    • Fixed size of 20 bytes; its SizeOfOptionalHeader field gives the size of the Optional Header, which is commonly 224 bytes (but not always)
    • Follows immediately after the PE Signature
  • Optional Header
    • Magic: 0x010B for PE32 (stored on disk as bytes 0B 01), 0x020B for PE32+
    • AddressOfEntryPoint
    • ImageBase: All relative addresses are based on this one. Typically the headers of the executable (starting with the MZ header) can be found at this address in memory.
    • SectionAlignment: Alignment of the sections in memory
    • FileAlignment: Alignment of the sections on disk
  • Section Header:
    • Follows the Optional Hdr
    • Virtual Size: size in memory
    • Virtual Address: address of the section in memory, relative to ImageBase
    • SizeOfRawData: size of the section on disk
    • PointerToRawData: offset within the file to the contents to be loaded in memory
  • Export Table
    • Both EXE and DLLs can export symbols (EXEs rarely do)
    • Structures
      • Image Export Directory Structure
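
The "[0x3C] + offset" notation used in the definitions above can be sketched in Python as follows. This is a minimal sketch, assuming a 32-bit (PE32) image already read into a bytes object. The helper names (read_dword, pe32_basic_fields, rva_to_va, va_to_rva) are hypothetical, and the offsets 0x28 (AddressOfEntryPoint) and 0x3C (FileAlignment) are assumed from the standard PE32 Optional Header layout.

```python
import struct


def read_dword(data: bytes, offset: int) -> int:
    """Read a little-endian 32-bit value at the given file offset."""
    return struct.unpack_from("<I", data, offset)[0]


def pe32_basic_fields(data: bytes) -> dict:
    """Return a few PE32 header fields addressed as [0x3C] + offset."""
    pe = read_dword(data, 0x3C)  # e_lfanew: file offset of the PE header
    return {
        "pe_offset": pe,
        "entry_point_rva": read_dword(data, pe + 0x28),
        "image_base": read_dword(data, pe + 0x34),
        "section_alignment": read_dword(data, pe + 0x38),
        "file_alignment": read_dword(data, pe + 0x3C),
    }


def rva_to_va(rva: int, image_base: int) -> int:
    """VA = Image Base + RVA."""
    return image_base + rva


def va_to_rva(va: int, image_base: int) -> int:
    """RVA = VA - Image Base."""
    return va - image_base
```

For example, with an Image Base of 0x400000 an entry point RVA of 0x1000 corresponds to the VA 0x401000.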

3. Sections

  • A PE file typically has at least 2 sections: one for code and the other for data.
  • A Windows NT app. has 9 predefined sections:
  .text   .bss   .rdata   .data   .rsrc   .edata   .idata   .pdata   .debug
  • Most common sections:
     Executable Code	.text
     Data 		.data   .rdata   .bss
     Resources		.rsrc
     Export Data	.edata
     Import Data	.idata
     Debug Info		.debug
  • Facts:
    • The data structures of a PE file are the same on disk as in memory, but the file is not copied byte for byte into memory: the Windows loader decides which parts need mapping and omits the others. Data that is not mapped is placed at the end of the file, past the parts that are mapped.
    • The location of an item in the file often differs from its location in memory due to page-based virtual memory management.
      • In memory each section typically starts on a new 4K (0x1000) page.
      • On disk sections are typically aligned to 512 bytes (0x200).
    • At load time the memory manager sets the access rights on memory pages for the different sections based on their settings in the section header (rwx).
    • When PE files are loaded into memory they are known as modules.
  • Section Header (one 0x28-byte entry of the Section Table; its first 8 bytes hold the Section Name)
    • Section Header entry Size: 0x28 bytes
    • Fields
      • Section Name
      • Virtual Size
      • Virtual Address
      • Raw Size
      • Raw Address
      • Characteristics
  • Facts:
    • Every binary can contain one or several of these sections:
      • .bss, .data, .edata, .pdata, .rdata, .xdata, .reloc, .rsrc, .text (or .code), .tls
    • Typical Sections: .text, .data, .rsrc, .reloc
  • Number of sections: [0x3c] + 0x06
  • Section Table
    • Begins at offset: [0x3C] + 0xF8 (assuming the usual 224-byte Optional Header; in general [0x3C] + 0x18 + SizeOfOptionalHeader)
    • Size: NumberOfSections * 0x28
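
The two lookups above (number of sections and start of the Section Table) can be combined in a short Python sketch. It assumes the file has been read into a bytes object. The function name section_table_location is hypothetical. Instead of hard-coding 0xF8 it reads SizeOfOptionalHeader at [0x3C] + 0x14, which yields the same result when the Optional Header has its usual 224 (0xE0) bytes.

```python
import struct


def section_table_location(data: bytes) -> tuple:
    """Return (number_of_sections, table_offset, table_size) for a PE file."""
    pe = struct.unpack_from("<I", data, 0x3C)[0]             # e_lfanew
    num_sections = struct.unpack_from("<H", data, pe + 0x06)[0]
    opt_size = struct.unpack_from("<H", data, pe + 0x14)[0]  # SizeOfOptionalHeader
    # 0x18 = 4-byte PE signature + 20-byte File Header
    table_offset = pe + 0x18 + opt_size
    table_size = num_sections * 0x28                          # 40 bytes per entry
    return num_sections, table_offset, table_size
```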

3.1. The DOS Header

  • Length: 64b
  • Purpose: in case the program is run from DOS.
    • When building an app in Windows, the linker links a default stub program called WINSTUB.EXE.
	 *  2b e_magic		0x4D 0x5A ("MZ" - Mark Zbikowski)
	    2b e_cblp
	    2b e_cp
	    2b e_crlc
	    2b e_cparhdr
	    2b e_minalloc
	    2b e_maxalloc
	    2b e_ss
	    2b e_sp
	    2b e_csum
	    2b e_ip
	    2b e_cs
	    2b e_lfarlc
	    2b e_ovno
	    8b e_res		(4 reserved words)
	    2b e_oemid
	    2b e_oeminfo
	   20b e_res2		(10 reserved words)
	 *  4b e_lfanew		Offset of PE Hdr relative to the file beginning
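
A minimal Python sketch of reading the two starred fields (e_magic and e_lfanew) from the 64-byte DOS Header follows. The names DOS_HEADER and parse_dos_header are hypothetical. The struct format assumes the standard layout in which e_res is 4 words and e_res2 is 10 words, so the total is 64 bytes.

```python
import struct

# e_magic (2s), 13 words from e_cblp to e_ovno, e_res (8 bytes),
# e_oemid, e_oeminfo, e_res2 (20 bytes), e_lfanew (DWORD)
DOS_HEADER = struct.Struct("<2s13H8sHH20sI")


def parse_dos_header(data: bytes) -> dict:
    """Validate the MZ signature and return e_magic and e_lfanew."""
    if len(data) < DOS_HEADER.size:
        raise ValueError("file too small to hold a DOS header")
    fields = DOS_HEADER.unpack_from(data, 0)
    if fields[0] != b"MZ":
        raise ValueError("missing MZ signature")
    return {"e_magic": fields[0], "e_lfanew": fields[-1]}
```

The returned e_lfanew is the value written as [0x3C] elsewhere in these notes, since the field sits at file offset 0x3C.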

3.2. The DOS Stub


3.3. The NT Headers & The Data Directory

  • Position determined by the e_lfanew value in the DOS Hdr.
  • The NT Headers comprise: PE Signature, File Header & Optional Header
  • Structure:
	   0 -   4b Signature			0x50 0x45 0x00 0x00 (PE..)
	        20b File Header			Physical Layout & Properties
	   4 -      |   2b  Machine
	   6 -      | * 2b  NumberOfSections
	   8 -      |   4b  TimeDateStamp
	  12 -      |   4b  PointerToSymbolTable
	  16 -      |   4b  NumberOfSymbols
	  20 -      |   2b  SizeOfOptionalHeader
	  22 -      | * 2b  Characteristics		(EXE / DLL)
	       224b  Optional Header
	  24 -      |   2b  Magic			0x010b
	  26 -      |   1b  MajorLinkerVersion
	  27 -      |   1b  MinorLinkerVersion
	  28 -      |   4b  SizeOfCode
	  32 -      |   4b  SizeOfInitializedData
	  36 -      |   4b  SizeOfUninitializedData
	  40 -      | * 4b  AddressOfEntryPoint
	  44 -      |   4b  BaseOfCode
	  48 -      |   4b  BaseOfData
	  52 -      | * 4b  ImageBase		0x400000 in most EXEs
	  56 -      | * 4b  SectionAlignment
	  60 -      | * 4b  FileAlignment
	  64 -      |   2b  MajorOperatingSystemVersion
	  66 -      |   2b  MinorOperatingSystemVersion
	  68 -      |   2b  MajorImageVersion
	  70 -      |   2b  MinorImageVersion
	  72 -      |   2b  MajorSubsystemVersion
	  74 -      |   2b  MinorSubsystemVersion
	  76 -      |   4b  Win32VersionValue
	  80 -      | * 4b  SizeOfImage
	  84 -      | * 4b  SizeOfHeaders
	  88 -      |   4b  Checksum
	  92 -      |   2b  Subsystem
	  94 -      |   2b  DllCharacteristics
	  96 -      |   4b  SizeOfStackReserve
	 100 -      |   4b  SizeOfStackCommit
	 104 -      |   4b  SizeOfHeapReserve
	 108 -      |   4b  SizeOfHeapCommit
	 112 -      |   4b  LoaderFlags
	 116 -      | * 4b  NumberOfRvaAndSizes
	            | 128b  DATA DIRECTORY	16 Image Data Dir. Structures
	 120 -           |   4b  Export         VirtualAddress
	 124 - 	         |   4b    ''           isize
	 128 -           |   4b  Import         VirtualAddress
	 132 -	         |   4b    ''           isize
	 136 -           |   4b  Resource       VirtualAddress
	 140 -	         |   4b    ''           isize
	 144 -           |   4b  Exception      VirtualAddress
	 148 -	         |   4b    ''           isize
	 152 -           |   4b  Security       VirtualAddress
	 156 -	         |   4b    ''           isize
	 160 -           |   4b  Basereloc      VirtualAddress
	 164 -	         |   4b    ''           isize
	 168 -           |   4b  Debug          VirtualAddress
	 172 -	         |   4b    ''           isize
	 176 -           |   4b  Copyright      VirtualAddress
	 180 -	         |   4b    ''           isize
	 184 -           |   4b  Globalptr      VirtualAddress
	 188 -	         |   4b    ''           isize
	 192 -           |   4b  TLS            VirtualAddress
	 196 -	         |   4b    ''           isize
	 200 -           |   4b  Load Config    VirtualAddress
	 204 -	         |   4b    ''           isize
	 208 -           |   4b  Bound Import   VirtualAddress
	 212 -	         |   4b    ''           isize
	 216 -           |   4b  IAT            VirtualAddress
	 220 -	         |   4b    ''           isize
	 224 -           |   4b  Delay Import   VirtualAddress
	 228 -	         |   4b    ''           isize
	 232 -           |   4b  Com Descriptor VirtualAddress
	 236 -	         |   4b    ''           isize
	 240 - 	         |   4b  Reserved       VirtualAddress (must be 0)
	 244 - 	         |   4b    ''           isize (must be 0)
  • Facts:
    • A bogus NumberOfRvaAndSizes value will make OllyDbg decide this is a bad image and run the app without breaking at the entry point. Changing the value back to 0x10 solves the problem.
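
The Data Directory layout in the table above can be walked with a few lines of Python. This is a sketch under stated assumptions: a PE32 image (Data Directory at PE header offset 120 = 0x78, NumberOfRvaAndSizes at 116 = 0x74) held in a bytes object. The names DIRECTORY_NAMES and read_data_directory are hypothetical, the 16th slot is labeled "Reserved" here, and clamping a bogus NumberOfRvaAndSizes to 16 is a choice made for this sketch, not necessarily what any particular tool does.

```python
import struct

DIRECTORY_NAMES = [
    "Export", "Import", "Resource", "Exception", "Security", "Basereloc",
    "Debug", "Copyright", "Globalptr", "TLS", "Load Config", "Bound Import",
    "IAT", "Delay Import", "Com Descriptor", "Reserved",
]


def read_data_directory(data: bytes) -> dict:
    """Return {name: (rva, size)} for each Data Directory entry of a PE32 file."""
    pe = struct.unpack_from("<I", data, 0x3C)[0]
    if data[pe:pe + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    count = struct.unpack_from("<I", data, pe + 0x74)[0]  # NumberOfRvaAndSizes
    count = min(count, len(DIRECTORY_NAMES))              # tolerate bogus values
    entries = {}
    for i in range(count):
        # Each entry is 8 bytes: VirtualAddress (RVA) followed by isize
        rva, size = struct.unpack_from("<II", data, pe + 0x78 + 8 * i)
        entries[DIRECTORY_NAMES[i]] = (rva, size)
    return entries
```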

3.4. The Section Table

  • Array of IMAGE_SECTION_HEADER structures, each containing info about one section of the PE file, with no padding between them. There are as many structures in the array as sections in the PE file (the number of sections is given by NumberOfSections in the File Header).
  • Structure (40b), offsets from the start of the entry:
	   0 - * 8b Name1			Label. Can be blank.
	   8 - * 4b VirtualSize			Section Size in bytes in memory
	  					(shares its 4 bytes with
						PhysicalAddress).
	  12 - * 4b VirtualAddress		RVA of Section
	  16 - * 4b SizeOfRawData		Section Size in File on Disk
	  20 - * 4b PointerToRawData		Offset from beginning of file to
	  					Section Data.
						0 -> no data in file
	  24 -   4b PointerToRelocations
	  28 -   4b PointerToLinenumbers
	  32 -   2b NumberOfRelocations
	  34 -   2b NumberOfLinenumbers
	  36 -   4b Characteristics		Flags (e.g. executable code in sect.,
	  					initialized/uninitialized data,
						can it be written or read from)
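
One Section Table entry can be decoded as sketched below. The sketch assumes the winnt.h layout of IMAGE_SECTION_HEADER, in which VirtualSize shares its 4 bytes with PhysicalAddress and NumberOfRelocations and NumberOfLinenumbers are 2 bytes each, giving 40 bytes in total. The names SECTION_HEADER, parse_section_header and permissions are hypothetical; the three flag values are the standard IMAGE_SCN_MEM_* constants.

```python
import struct

SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")  # 40 bytes

IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def permissions(characteristics: int) -> str:
    """Render the memory access flags of a section as an 'rwx' style string."""
    return (
        ("r" if characteristics & IMAGE_SCN_MEM_READ else "-")
        + ("w" if characteristics & IMAGE_SCN_MEM_WRITE else "-")
        + ("x" if characteristics & IMAGE_SCN_MEM_EXECUTE else "-")
    )


def parse_section_header(data: bytes, offset: int) -> dict:
    """Decode the starred fields of the section header found at `offset`."""
    (name, vsize, vaddr, raw_size, raw_ptr,
     _reloc_ptr, _line_ptr, _n_reloc, _n_line, chars) = SECTION_HEADER.unpack_from(data, offset)
    return {
        "name": name.rstrip(b"\0").decode("ascii", "replace"),
        "virtual_size": vsize,
        "virtual_address": vaddr,
        "size_of_raw_data": raw_size,
        "pointer_to_raw_data": raw_ptr,
        "permissions": permissions(chars),
    }
```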

3.5. The PE File Sections

  • Facts:
    • In the file on disk each section starts at an offset that is a multiple of the FileAlignment. The gap between one section's data and the next is padded with 0x00 bytes.
    • When loaded into RAM sections always start on a page boundary (x86 -> 4k aligned ; IA-64 -> 8k aligned). Alignment is specified in SectionAlignment.
  • Executable Code
    • Code normally resides in a single section called .text (or CODE in Borland-built binaries)
  • Data
    • .bss: uninitialized data for the app (inc. vars declared as static)
    • .rdata: read-only data (literal strings, constants, debug dir. info).
    • .data: all other vars (except automatic, which appear on the stack), app or module global vars.
  • Resources
    • .rsrc: contains resource info for a module.
      • The first 16 bytes comprise a header like most other sections, but this section's data is further structured into a resource tree which is best viewed using a resource editor (e.g. ResHacker).
  • Export Data
    • .edata: contains the Export Directory for an application or DLL (names & addresses of exported functions).
  • Import Data
    • .idata: contains misc info about imported functions including the Import Directory and Import Address Table.
  • Debug Info
    • .debug: contains debug info but the debug dirs live in .rdata (they reference debug info in the .debug section).
      • The PE format also supports separate debug files (normally with .DBG extension)
  • Thread Local Storage (TLS):
    • Each thread of a process has its own private storage (TLS) to keep data specific to that thread, such as pointers to data structures and resources that the thread is using.
    • .tls: defines the layout for the TLS needed by routines in the exec (and any DLLs to which it directly refers). Each time the process creates a thread, the new thread gets its own TLS created using the .tls section as a template.
      • The .tls section is created by the Linker.
  • Base Relocations
    • .reloc
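
The alignment facts at the top of this section come down to rounding a size or offset up to the next multiple of FileAlignment (on disk) or SectionAlignment (in memory). A minimal Python sketch, with a hypothetical helper name align_up and example values chosen only for illustration:

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


# A section holding 0x1234 bytes of data:
raw_size = align_up(0x1234, 0x200)     # space used on disk (FileAlignment)
mem_size = align_up(0x1234, 0x1000)    # space used in memory (SectionAlignment)
print(hex(raw_size), hex(mem_size))    # prints: 0x1400 0x2000
```

This is why the distance between two sections in memory usually differs from their distance on disk.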

3.6. The Export Section


3.7. The Import Section


3.8. The Loader


5. Import Table

  • IT is an array of Image Import Descriptors
    • They contain info on how to locate the names of the DLLs and their exported functions.
  • Facts:
    • Typically removed by packers
    • There is an Image Import Descriptor in the Import Table for each imported DLL. Each descriptor is 0x14 bytes long
  • Locating the Import Table
    • Locate the Data Directory (DD)
    • The necessary info is contained in the Data Directory (array of structures containing the RVA and Size of a specific table)
      • Start of the Data Directory: [0x3C] + 0x78
    • Find the entry in the Data Directory
      • The second entry in the DD belongs to the import table: Offset from Start of DD: 0x08
    • Read the DD entry and extract RVA and Size
      • RVA of IT: first 4 bytes
      • Size of IT: last 4 bytes
    • Convert RVA to File Offset
      • Traverse the Section Table to find which section contains the IT
        • The VA of the section containing the IT will be <= the RVA of the IT, and the VA + VSize of that section will be greater than the RVA of the IT
        • File Offset = RVA of the IT - VA of the section + PointerToRawData of the section
  • Image Import Descriptor is made up of 5 elements:
    • Import Lookup Table (ILT)
    • Import Address Table (IAT)
      • On disk it's an array of RVAs, each RVA pointing to an imported function's name.
      • Once loaded in memory, the loader replaces each element in the array with the respective function's address in memory.
      • Helps the loader locate API functions and other symbols needed by the executable
      • Gives the analyst a summary of the range of actions used by the executable
      • The IAT can be rebuilt by different packers/obfuscators with different degrees of complexity.
      • IAT Descriptor Structure
        • Contains info about the DLL containing the symbols to import
      • IAT Thunk Data Structure
        • Contains info about the specific symbol imported
      • Image Import By Name Structure
        • Contains the name of the symbol to import
      • Intermission: packers/obfuscators may bypass the IAT and resolve symbols themselves by
        • Manually going through the LoadLibrary, GetProcAddress sequence for all symbols
        • Looking them up through hashes of their names
        • Looking them up through signatures of their code
        • Once mapped they can be integrated into the binary through:
          • Peculiar jump tables
          • Skipping the DLL function's entry point
      • Locating the IAT
        • Locate the Data Directory
        • Find the entry in the DD: The 13th entry belongs to the IAT
        • Read the DD entry to extract RVA & Size
        • Convert RVA to File Offset
    • ...
  • Packers and Import Tables:
    • Packers typically remove the target binary's ILT & IAT, create a replacement IT with the values needed by the stub to run, and modify the Data Directory so the RVA of the IT points to the newly created IT.
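
The "Locating the Import Table" steps above can be sketched end to end in Python. This is only an illustration, assuming a PE32 file held in a bytes object. All function names are hypothetical. The descriptor layout (ILT RVA, TimeDateStamp, ForwarderChain, Name RVA, IAT RVA, 4 bytes each) and the all-zero descriptor that ends the array are assumed from the standard IMAGE_IMPORT_DESCRIPTOR definition, not from these notes.

```python
import struct


def parse_sections(data: bytes, pe: int) -> list:
    """Return (VA, VSize, PointerToRawData) for every Section Table entry."""
    count = struct.unpack_from("<H", data, pe + 0x06)[0]
    opt_size = struct.unpack_from("<H", data, pe + 0x14)[0]
    table = pe + 0x18 + opt_size
    sections = []
    for i in range(count):
        # Skip the 8-byte name, then read VSize, VA, raw size, raw pointer
        vsize, vaddr, _raw_size, raw_ptr = struct.unpack_from(
            "<IIII", data, table + 0x28 * i + 8)
        sections.append((vaddr, vsize, raw_ptr))
    return sections


def rva_to_offset(rva: int, sections: list) -> int:
    """Find the section with VA <= RVA < VA + VSize and convert to a file offset."""
    for vaddr, vsize, raw_ptr in sections:
        if vaddr <= rva < vaddr + vsize:
            return rva - vaddr + raw_ptr
    raise ValueError("RVA 0x%X is not inside any section" % rva)


def read_cstring(data: bytes, offset: int) -> str:
    """Read a NUL-terminated ASCII string starting at a file offset."""
    end = data.index(b"\0", offset)
    return bytes(data[offset:end]).decode("ascii", "replace")


def imported_dlls(data: bytes) -> list:
    """List the DLL names referenced by the Import Table of a PE32 file."""
    pe = struct.unpack_from("<I", data, 0x3C)[0]
    sections = parse_sections(data, pe)
    # Second Data Directory entry: start of DD (0x78) + 0x08
    it_rva, _it_size = struct.unpack_from("<II", data, pe + 0x78 + 0x08)
    if it_rva == 0:
        return []  # no Import Table (e.g. removed by a packer)
    offset = rva_to_offset(it_rva, sections)
    names = []
    while True:
        ilt, _stamp, _fwd, name_rva, iat = struct.unpack_from("<IIIII", data, offset)
        if ilt == 0 and name_rva == 0 and iat == 0:
            break  # terminating all-zero descriptor
        names.append(read_cstring(data, rva_to_offset(name_rva, sections)))
        offset += 0x14  # size of one Image Import Descriptor
    return names
```

A real tool would also bound the loop by the file size, since malformed or packed files may lack the terminating descriptor.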

6. Navigating Imports on Disk


7. Adding Code to a PE File


8. Adding Import to an Executable


9. PE32+

  • Expanded to accommodate 64-bit architectures
  • Tips:
    • Most PE headers stay the same
    • The Exception Directory is supposed to contain most of the functions of the binary, more specifically, the "non-leaf" ones.
  • Magic Number: 0x20B
  • Updated fields:
    • IMAGE_TLS_DIRECTORY
    • IMAGE_LOAD_CONFIG_DIRECTORY
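
Telling PE32 from PE32+ by the Magic Number, and reading ImageBase accordingly, can be sketched as follows. The function names are hypothetical. The PE32+ detail used here (ImageBase widened to 8 bytes at PE header offset 0x30, because BaseOfData is dropped) is assumed from the standard PE32+ Optional Header layout rather than from these notes.

```python
import struct


def optional_header_kind(data: bytes) -> str:
    """Return 'PE32' or 'PE32+' according to the Optional Header Magic."""
    pe = struct.unpack_from("<I", data, 0x3C)[0]
    magic = struct.unpack_from("<H", data, pe + 0x18)[0]
    if magic == 0x10B:
        return "PE32"
    if magic == 0x20B:
        return "PE32+"
    raise ValueError("unexpected Optional Header magic 0x%X" % magic)


def image_base(data: bytes) -> int:
    """Read ImageBase as a DWORD for PE32 or as a QWORD for PE32+."""
    pe = struct.unpack_from("<I", data, 0x3C)[0]
    if optional_header_kind(data) == "PE32+":
        return struct.unpack_from("<Q", data, pe + 0x30)[0]
    return struct.unpack_from("<I", data, pe + 0x34)[0]
```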

Copyright © 2000-2008 Jessland - Jess Garcia's Website - All rights reserved.