PE Files
1. General Info
-
Based on the UNIX COFF file format (sometimes referred to as PE/COFF)
-
.exe and .dll files are PE
+-----------------+
~ DOS MZ Hdr ~ 64b
+-----------------+
~ DOS Stub ~ (var)
+-----------------+
~ PE Hdr ~ 248b (4b Signature + 20b File Hdr + 224b Optional Hdr, PE32)
+-----------------+
~ Section Table ~ 40b x No. Sections
+-----------------+
~ Section 1 ~
+-----------------+
~ Section 2 ~
+-----------------+
~ Section ... ~
+-----------------+
2. Definitions
-
Image Base: Preferred starting address of the executable when it is loaded in memory
-
Locating the Image Base:
-
Read the DWORD at file offset: e.g. [0x3C] + 0x34 (D8 + 34 = 10C)
-
The most common Image Base in Windows is 0x400000
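The lookup above can be sketched in Python. This is a minimal sketch, assuming a PE32 (not PE32+) file whose bytes are already in memory; the function name `read_image_base` is hypothetical.

```python
import struct


def read_image_base(data: bytes) -> int:
    """Return the ImageBase field of a PE32 file given its raw bytes."""
    # The DWORD at file offset 0x3C (e_lfanew) is the offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # ImageBase is the DWORD located 0x34 bytes after the PE signature (PE32 only).
    (image_base,) = struct.unpack_from("<I", data, e_lfanew + 0x34)
    return image_base
```

With e_lfanew = 0xD8, as in the example above, the value is read from file offset 0x10C.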
-
VA
-
Absolute Address in Memory
-
RVA
-
Memory address relative to the Image Base (RVA = VA - Image Base)
-
Almost every address stored in a PE file is an RVA
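The VA/RVA relationship is plain arithmetic. A small sketch with hypothetical helper names:

```python
def rva_to_va(rva: int, image_base: int) -> int:
    """Absolute address in memory for an RVA, given where the image is based."""
    return image_base + rva


def va_to_rva(va: int, image_base: int) -> int:
    """Address relative to the Image Base for an absolute address."""
    if va < image_base:
        raise ValueError("VA lies below the image base")
    return va - image_base


print(hex(rva_to_va(0x1000, 0x400000)))  # 0x401000
```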
-
Virtual Address of a Section
-
RVA of the beginning of the Section (relative to the Image Base)
-
Section Alignment
-
Very common to see 0x1000
-
Sections start on multiples of the SA
-
Locating SA: [0x3C] + 0x38
-
File Alignment
-
Raw Offset of Section must be a multiple of the FA value.
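Both alignments boil down to rounding a value up to the next multiple. A minimal sketch; `align_up` is a hypothetical helper, and 0x1000 / 0x200 are only the common values, not guaranteed ones.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    return (value + alignment - 1) // alignment * alignment


# A 0x1234-byte section occupies 0x2000 bytes in memory with SA = 0x1000
print(hex(align_up(0x1234, 0x1000)))  # 0x2000
# ...and 0x1400 bytes on disk with FA = 0x200
print(hex(align_up(0x1234, 0x200)))   # 0x1400
```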
-
File on Disk:
-
The file on disk does not necessarily end after the last section. Any data found after the last section is called the Overlay.
-
When the file is loaded into memory, the distance between sections can be completely different from what it was on disk.
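One way to spot an overlay is to compare the file size with the end of the last section's raw data. A sketch under that assumption; `overlay_offset` is a hypothetical name and the section values are expected to come from the Section Table.

```python
def overlay_offset(file_size, sections):
    """Return the file offset where the overlay starts, or None if there is none.

    sections: iterable of (PointerToRawData, SizeOfRawData) pairs.
    """
    # End of the raw data of the section that reaches furthest into the file.
    # Sections with PointerToRawData == 0 have no data on disk and are skipped.
    end_of_sections = max(
        (ptr + size for ptr, size in sections if ptr != 0), default=0
    )
    return end_of_sections if end_of_sections < file_size else None
```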
-
Entry Point
-
RVA at which the loader will begin execution (usually runtime startup code that later calls main(), not main() itself)
-
Location of the first instruction to be executed
-
File Header
-
struct _IMAGE_FILE_HEADER
-
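The File Header itself is always 20 bytes; the Optional Header after it is commonly 224 bytes in PE32, but not always (see SizeOfOptionalHeader)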
-
Follows immediately after the PE Signature
-
Optional Header
-
Magic: 0x010B for PE32 (bytes 0B 01 on disk), 0x020B for PE32+
-
AddressOfEntryPoint
-
ImageBase: All relative addresses are based on this one. The PE header of the executable can typically be found at this address in memory.
-
SectionAlignment: Alignment of the sections in memory
-
FileAlignment: Alignment on Disk
-
Section Header:
-
Follows the Optional Hdr
-
Virtual Size: size in memory
-
Virtual Address: address of the section in memory, relative to ImageBase
-
SizeOfRawData: size of the section on disk
-
PointerToRawData: offset within the file to the contents to be loaded in memory
-
Export Table
-
Both EXE and DLLs can export symbols (EXEs rarely do)
-
Structures
-
Image Export Directory Structure
3. Sections
-
At a minimum, a PE file will have 2 sections: one for code and the other for data.
-
A Windows NT app. has 9 predefined sections:
.text .bss .rdata .data .rsrc .edata .idata .pdata .debug
Executable Code   .text
Data              .data .rdata .bss
Resources         .rsrc
Export Data       .edata
Import Data       .idata
Debug Info        .debug
-
Facts:
-
The data structures of a PE file are the same on disk as when it is loaded in memory, but the file is not copied byte-for-byte into memory: the Windows loader decides which parts need mapping and omits the rest. Data that is not mapped is placed at the end of the file, past the parts that are.
-
The location of an item in the file often differs from its location in memory due to the page-based virtual memory management.
-
In memory each section will start in a new 4K (0x1000) page.
-
On disk, sections are typically aligned to 512 bytes (0x200)
-
At load time the memory manager sets the access rights on memory pages for the different sections based on their settings in the section header (rwx).
-
When PE files are loaded into memory they are known as modules.
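The rwx settings mentioned above live in the Characteristics field of each section header. A minimal sketch that decodes them, using the `IMAGE_SCN_MEM_*` flag values from winnt.h; `section_permissions` is a hypothetical helper name.

```python
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def section_permissions(characteristics: int) -> str:
    """Render the memory access flags of a section as an 'rwx'-style string."""
    r = "r" if characteristics & IMAGE_SCN_MEM_READ else "-"
    w = "w" if characteristics & IMAGE_SCN_MEM_WRITE else "-"
    x = "x" if characteristics & IMAGE_SCN_MEM_EXECUTE else "-"
    return r + w + x


print(section_permissions(0x60000020))  # r-x (a typical .text section)
print(section_permissions(0xC0000040))  # rw- (a typical .data section)
```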
-
Section Header (the first 8 bytes of every 0x28-byte entry in the Section Table hold the section name)
-
Section Header entry Size: 0x28 bytes
-
Fields
-
Section Name
-
Virtual Size
-
Virtual Address
-
Raw Size
-
Raw Address
-
Characteristics
-
Facts:
-
A binary can contain one or several of these sections:
-
.bss, .data, .edata, .pdata, .rdata, .xdata, .reloc, .rsrc, .text (or .code), .tls
-
Typical Sections: .text, .data, .rsrc, .reloc
-
Number of sections: [0x3c] + 0x06
-
Section Table
-
Begins at offset: [0x3C] + 0xF8 (PE32 with the usual 224-byte Optional Header; in general [0x3C] + 0x18 + SizeOfOptionalHeader)
-
Size: NumberOfSections * 0x28
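The Section Table can be located from the raw bytes as sketched below. The sketch uses SizeOfOptionalHeader (at [0x3C] + 0x14) rather than a fixed 0xF8, so it also covers files with a non-standard Optional Header size; `section_table_range` is a hypothetical name.

```python
import struct


def section_table_range(data: bytes):
    """Return (file offset, size in bytes) of the Section Table."""
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # NumberOfSections: WORD at PE signature + 0x06
    (number_of_sections,) = struct.unpack_from("<H", data, e_lfanew + 0x06)
    # SizeOfOptionalHeader: WORD at PE signature + 0x14
    (size_of_optional_header,) = struct.unpack_from("<H", data, e_lfanew + 0x14)
    # Signature (4) + File Header (20) = 0x18, then the Optional Header
    start = e_lfanew + 0x18 + size_of_optional_header
    return start, number_of_sections * 0x28
```

With SizeOfOptionalHeader = 0xE0 the start works out to [0x3C] + 0xF8.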
3.1. The DOS Header
-
Purpose: in case the program is run from DOS.
-
When building an app in Windows, the linker links a default stub program called WINSTUB.EXE.
* 2b e_magic bytes 0x4D 0x5A (MZ - Mark Zbikowski)
2b e_cblp
2b e_cp
2b e_crlc
2b e_cparhdr
2b e_minalloc
2b e_maxalloc
2b e_ss
2b e_sp
2b e_csum
2b e_ip
2b e_cs
2b e_lfarlc
2b e_ovno
8b e_res (4 reserved WORDs)
2b e_oemid
2b e_oeminfo
20b e_res2 (10 reserved WORDs)
* 4b e_lfanew Offset of PE Hdr relative to the file beginning
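Only the two starred fields matter for locating the PE header. A minimal sketch that validates the MZ signature and reads e_lfanew; `read_e_lfanew` is a hypothetical name.

```python
import struct


def read_e_lfanew(data: bytes) -> int:
    """Check the DOS header signature and return the offset of the PE header."""
    if len(data) < 64:
        raise ValueError("too short to hold a 64-byte DOS header")
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew is the last field of the DOS header: a DWORD at offset 0x3C
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return e_lfanew
```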
3.2. The DOS Stub
3.3. The NT Headers & The Data Directory
-
Position determined by the e_lfanew value in the DOS Hdr.
-
The NT Headers comprise: File Header & Optional Header
0 - 4b Signature 0x50 0x45 0x00 0x00 (PE..)
20b File Header Physical Layout & Properties
4 - | 2b Machine
6 - | * 2b NumberOfSections
8 - | 4b TimeDateStamp
12 - | 4b PointerToSymbolTable
16 - | 4b NumberOfSymbols
20 - | 2b SizeOfOptionalHeader
22 - | * 2b Characteristics (EXE / DLL)
224b Optional Header
24 - | 2b Magic 0x010b
26 - | 1b MajorLinkerVersion
27 - | 1b MinorLinkerVersion
28 - | 4b SizeOfCode
32 - | 4b SizeOfInitializedData
36 - | 4b SizeOfUninitializedData
40 - | * 4b AddressOfEntryPoint
44 - | 4b BaseOfCode
48 - | 4b BaseOfData
52 - | * 4b ImageBase Usually 0x400000 for EXEs
56 - | * 4b SectionAlignment
60 - | * 4b FileAlignment
64 - | 2b MajorOperatingSystemVersion
66 - | 2b MinorOperatingSystemVersion
68 - | 2b MajorImageVersion
70 - | 2b MinorImageVersion
72 - | 2b MajorSubsystemVersion
74 - | 2b MinorSubsystemVersion
76 - | 4b Win32VersionValue
80 - | * 4b SizeOfImage
84 - | * 4b SizeOfHeaders
88 - | 4b Checksum
92 - | 2b Subsystem
94 - | 2b DllCharacteristics
96 - | 4b SizeOfStackReserve
100 - | 4b SizeOfStackCommit
104 - | 4b SizeOfHeapReserve
108 - | 4b SizeOfHeapCommit
112 - | 4b LoaderFlags
116 - | * 4b NumberOfRvaAndSizes
| 128b DATA DIRECTORY 16 Image Data Dir. Structures
120 - | 4b Export VirtualAddress
124 - | 4b '' isize
128 - | 4b Import VirtualAddress
132 - | 4b '' isize
136 - | 4b Resource VirtualAddress
140 - | 4b '' isize
144 - | 4b Exception VirtualAddress
148 - | 4b '' isize
152 - | 4b Security VirtualAddress
156 - | 4b '' isize
160 - | 4b Basereloc VirtualAddress
164 - | 4b '' isize
168 - | 4b Debug VirtualAddress
172 - | 4b '' isize
176 - | 4b Copyright VirtualAddress
180 - | 4b '' isize
184 - | 4b Globalptr VirtualAddress
188 - | 4b '' isize
192 - | 4b TLS VirtualAddress
196 - | 4b '' isize
200 - | 4b Load Config VirtualAddress
204 - | 4b '' isize
208 - | 4b Bound Import VirtualAddress
212 - | 4b '' isize
216 - | 4b IAT VirtualAddress
220 - | 4b '' isize
224 - | 4b Delay Import VirtualAddress
228 - | 4b '' isize
232 - | 4b Com Descriptor VirtualAddress
236 - | 4b '' isize
240 - | 4b Reserved VirtualAddress (16th entry, unused)
244 - | 4b '' isize
-
Facts:
-
A bogus NumberOfRvaAndSizes value makes OllyDbg decide this is a bad image and run the app without breaking at the entry point. Changing the value back to 0x10 solves the problem.
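The offsets in the table above are enough to walk the Data Directory. A minimal sketch for PE32 only; the function name and the `DIRECTORY_NAMES` list are hypothetical, and the last slot is labelled "Reserved" here as an assumption about the unused 16th entry.

```python
import struct

# Labels follow the table above; "Reserved" for the 16th slot is an assumption.
DIRECTORY_NAMES = [
    "Export", "Import", "Resource", "Exception", "Security", "Basereloc",
    "Debug", "Copyright", "Globalptr", "TLS", "Load Config", "Bound Import",
    "IAT", "Delay Import", "Com Descriptor", "Reserved",
]


def read_data_directory(data: bytes) -> dict:
    """Return {name: (rva, size)} for each Data Directory entry of a PE32 file."""
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (magic,) = struct.unpack_from("<H", data, e_lfanew + 24)
    if magic != 0x010B:
        raise ValueError("not a PE32 Optional Header")
    (count,) = struct.unpack_from("<I", data, e_lfanew + 116)
    count = min(count, len(DIRECTORY_NAMES))  # guard against bogus values
    entries = {}
    for i in range(count):
        # Each entry is 8 bytes: RVA then size; the array starts at offset 120
        rva, size = struct.unpack_from("<II", data, e_lfanew + 120 + 8 * i)
        entries[DIRECTORY_NAMES[i]] = (rva, size)
    return entries
```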
3.4. The Section Table
-
Array of IMAGE_SECTION_HEADER structures, each containing info about one section of the PE file, with no padding between them. There are as many structures in the array as sections in the PE file (the count is given by NumberOfSections in the File Header).
 0 - * 8b Name1                 Label. Can be blank.
 8 - * 4b VirtualSize           Section size in memory, in bytes
                                (union with PhysicalAddress, same 4 bytes).
12 - * 4b VirtualAddress        RVA of Section
16 - * 4b SizeOfRawData         Section size in the file on disk
20 - * 4b PointerToRawData      Offset from beginning of file to
                                Section Data.
                                0 -> no data in file
24 -   4b PointerToRelocations
28 -   4b PointerToLinenumbers
32 -   2b NumberOfRelocations
34 -   2b NumberOfLinenumbers
36 -   4b Characteristics       Flags (e.g. executable code in sect.,
                                initialized/uninitialized data,
                                can it be written or read from)
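A sketch of parsing the Section Table with `struct`, following the `IMAGE_SECTION_HEADER` layout from winnt.h (0x28 bytes per entry, with NumberOfRelocations and NumberOfLinenumbers as 2-byte fields). The `SectionHeader` tuple and `parse_section_table` are hypothetical names.

```python
import struct
from typing import NamedTuple


class SectionHeader(NamedTuple):
    name: str
    virtual_size: int
    virtual_address: int
    size_of_raw_data: int
    pointer_to_raw_data: int
    characteristics: int


# 8-byte name, six DWORDs, two WORDs, one DWORD = 40 (0x28) bytes
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")


def parse_section_table(data: bytes, offset: int, count: int):
    """Parse `count` section headers starting at file offset `offset`."""
    sections = []
    for i in range(count):
        (name, vsize, vaddr, raw_size, raw_ptr,
         _reloc_ptr, _line_ptr, _n_reloc, _n_line, flags) = \
            SECTION_HEADER.unpack_from(data, offset + i * SECTION_HEADER.size)
        # Names are NUL-padded to 8 bytes and may be blank
        label = name.rstrip(b"\0").decode("ascii", "replace")
        sections.append(
            SectionHeader(label, vsize, vaddr, raw_size, raw_ptr, flags)
        )
    return sections
```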
3.5. The PE File Sections
-
Facts:
-
In the file on disk each section starts at an offset that is a multiple of the FileAlignment. The gap between one section's data and the next is padded with zeros.
-
When loaded into RAM sections always start on a page boundary (x86 -> 4k aligned ; IA-64 -> 8k aligned). Alignment is specified in SectionAlignment.
-
Executable Code
-
All code segments reside in a single section called .text or CODE
-
Data
-
.bss: uninitialized data for the app (inc. vars declared as static)
-
.rdata: read-only data (literal strings, constants, debug dir. info).
-
.data: all other vars (except automatic, which appear on the stack), app or module global vars.
-
Resources
-
.rsrc: contains resource info for a module.
-
The first 16 bytes comprise a header like most other sections, but this section's data is further structured into a resource tree which is best viewed using a resource editor (e.g. ResHacker).
-
Export Data
-
.edata: contains the Export Directory for an application or DLL (names & addresses of exported functions).
-
Import Data
-
.idata: contains misc info about imported functions including the Import Directory and Import Address Table.
-
Debug Info
-
.debug: contains debug info but the debug dirs live in .rdata (they reference debug info in the .debug section).
-
The PE format also supports separate debug files (normally with .DBG extension)
-
Thread Local Storage (TLS):
-
Each thread of a process has its own private storage (TLS) to keep data specific to that thread, such as pointers to data structures and resources that the thread is using.
-
.tls: defines the layout for the TLS needed by routines in the exec (and any DLLs to which it directly refers). Each time the process creates a thread, the new thread gets its own TLS created using the .tls section as a template.
-
The .tls section is created by the Linker.
3.6. The Export Section
3.7. The Import Section
3.8. The Loader
5. Import Table
-
The IT is an array of Image Import Descriptors
-
They contain info on how to locate the names of the DLLs and their exported functions.
-
Facts:
-
Typically removed by packers
-
There is an Image Import Descriptor in the Import Table for each imported DLL. Each descriptor is 0x14 bytes long
-
Locating the Import Table
-
Locate the Data Directory (DD)
-
The necessary info is contained in the Data Directory (array of structures containing the RVA and Size of a specific table)
-
Start of the Data Directory: [0x3C] + 0x78
-
Find the entry in the Data Directory
-
The second entry in the DD belongs to the import table: Offset from Start of DD: 0x08
-
Read the DD entry and extract RVA and Size
-
RVA of IT: first 4 bytes
-
Size of IT: last 4 bytes
-
Convert RVA to File Offset
-
Traverse the Section Table to find which section contains the IT
-
The VA of the section containing the IT will be less than or equal to the RVA of the IT, and the section's VA + VSize will be greater than the RVA of the IT. File offset = RVA of IT - section VA + section PointerToRawData
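The section lookup and conversion can be sketched as a small helper. `rva_to_file_offset` is a hypothetical name, and the section tuples are assumed to come from the Section Table.

```python
def rva_to_file_offset(rva, sections):
    """Convert an RVA to a file offset, or return None if no section holds it.

    sections: iterable of (VirtualAddress, VirtualSize, PointerToRawData).
    """
    for vaddr, vsize, raw_ptr in sections:
        # The RVA belongs to the section whose [VA, VA + VSize) range covers it
        if vaddr <= rva < vaddr + vsize:
            return rva - vaddr + raw_ptr
    return None


sections = [(0x1000, 0x500, 0x400), (0x2000, 0x300, 0xA00)]
print(hex(rva_to_file_offset(0x2010, sections)))  # 0xa10
```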
-
Image Import Descriptor is made up of 5 elements:
-
Import Lookup Table (ILT)
-
Import Address Table (IAT)
-
On disk it's an array of RVAs, each RVA pointing to an imported function's name.
-
Once loaded in memory, the loader replaces each element in the array by the respective function's address in memory.
-
Helps the loader locate API functions and other symbols needed by the executable
-
Gives a summary of the range of actions (API calls) available to the executable
-
The IAT can be rebuilt by different packers/obfuscators with different degrees of complexity.
-
IAT Descriptor Structure
-
Contains info about the DLL containing the symbols to import
-
IAT Thunk Data Structure
-
Contains info about the specific symbol imported
-
Image Import By Name Structure
-
Contains the name of the symbol to import
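A sketch of walking the array of Image Import Descriptors once the file offset of the Import Table is known. It assumes the `IMAGE_IMPORT_DESCRIPTOR` layout from winnt.h (five DWORDs, 0x14 bytes) and that the array ends with an all-zero descriptor; the function name and dictionary keys are hypothetical.

```python
import struct

# OriginalFirstThunk (ILT RVA), TimeDateStamp, ForwarderChain,
# Name (RVA of the DLL name), FirstThunk (IAT RVA)
IMPORT_DESCRIPTOR = struct.Struct("<IIIII")  # 0x14 bytes


def read_import_descriptors(data: bytes, offset: int):
    """Return one dict per imported DLL, read from file offset `offset`."""
    descriptors = []
    while offset + IMPORT_DESCRIPTOR.size <= len(data):
        fields = IMPORT_DESCRIPTOR.unpack_from(data, offset)
        if not any(fields):
            break  # all-zero descriptor marks the end of the array
        ilt_rva, _timestamp, _forwarder, name_rva, iat_rva = fields
        descriptors.append(
            {"ilt_rva": ilt_rva, "name_rva": name_rva, "iat_rva": iat_rva}
        )
        offset += IMPORT_DESCRIPTOR.size
    return descriptors
```

The RVAs in each descriptor still have to be converted to file offsets before the DLL name or the thunk arrays can be read from disk.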
-
Intermission
-
Instead of relying on the IAT, packers/obfuscators may resolve symbols themselves:
-
Manually going through the LoadLibrary, GetProcAddress sequence for all symbols
-
Looking them up through hashes of their names
-
Looking them up through signatures of their code
-
Once mapped they can be integrated into the binary through:
-
Peculiar jump tables
-
Skipping the DLL function's entry point
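To illustrate lookup by name hash: the resolver hashes each exported name and compares it with a precomputed constant, so no API name strings appear in the binary. The notes do not name a specific algorithm; the rotate-right-by-13 additive hash below is one commonly seen choice and is used here purely as an assumption.

```python
def ror13_hash(name: bytes) -> int:
    """32-bit hash: rotate the accumulator right by 13 bits, then add each byte."""
    h = 0
    for byte in name:
        h = ((h >> 13) | (h << 19)) & 0xFFFFFFFF  # 32-bit rotate right by 13
        h = (h + byte) & 0xFFFFFFFF
    return h


def find_by_hash(export_names, wanted_hash):
    """Return the first exported name whose hash matches, or None."""
    for name in export_names:
        if ror13_hash(name) == wanted_hash:
            return name
    return None
```

In a real stub `export_names` would be read from the target DLL's export table; here it is just a list of byte strings.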
-
Locating the IAT
-
Locate the Data Directory
-
Find the entry in the DD: The 13th entry belongs to the IAT
-
Read the DD entry to extract RVA & Size
-
Convert RVA to File Offset
-
...
-
Packers and Import Tables:
-
Packers typically remove the target binary's ILT & IAT, create a replacement IT with the values needed by the stub to run, and modify the Data Directory so the RVA of the IT points to the newly created IT.
6. Navigating Imports on Disk
7. Adding Code to a PE File
8. Adding an Import to an Executable
9. PE32+
-
Expanded to accommodate 64-bit architectures
-
Tips:
-
Most PE headers stay the same
-
The Exception Directory is supposed to contain most of the functions of the binary, more specifically, the "non-leaf" ones.
-
Updated fields:
-
IMAGE_TLS_DIRECTORY
-
IMAGE_LOAD_CONFIG_DIRECTORY
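Telling PE32 from PE32+ comes down to the Optional Header Magic. The sketch below also reads ImageBase for both variants, relying on the PE specification's layout: in PE32+ the BaseOfData field is dropped and ImageBase widens to 8 bytes at PE signature + 0x30. The function name is hypothetical.

```python
import struct


def read_format_and_image_base(data: bytes):
    """Return ("PE32" or "PE32+", ImageBase) from the raw bytes of a PE file."""
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # Magic is the first WORD of the Optional Header (PE signature + 0x18)
    (magic,) = struct.unpack_from("<H", data, e_lfanew + 0x18)
    if magic == 0x010B:
        # PE32: 4-byte ImageBase at PE signature + 0x34
        (image_base,) = struct.unpack_from("<I", data, e_lfanew + 0x34)
        return "PE32", image_base
    if magic == 0x020B:
        # PE32+: 8-byte ImageBase at PE signature + 0x30
        (image_base,) = struct.unpack_from("<Q", data, e_lfanew + 0x30)
        return "PE32+", image_base
    raise ValueError("unknown Optional Header magic: %#06x" % magic)
```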