Earlier, I wrote a post on Understanding PE Structure – The Layman’s Way, and this one continues from it. I highly recommend reading that post, where I cover the PE file format in detail, before jumping into this article.
Since the earlier post was already too long and I hadn’t gone into much detail about the Data Directory structure of PE, I decided to continue from where I left off in this post.
What’s the Context
When the Windows loader loads an executable, it does several things in the background. First, it reads the fields of the PE structure and maps the executable image into memory. Second, it scans the executable's import table, which includes the Import Address Table (IAT), to find the DLLs and functions the executable uses, then loads those DLLs and maps them into the process address space.
The operating system needs all of this to start the executable successfully. But how are the DLLs loaded and the functions imported? That is what we are going to see in this post.
Every executable that imports functions contains an array of data structures, one per imported DLL. Each of these structures gives the name of the imported DLL and points to an array of function pointers. The Import Address Table (IAT) is that array of function pointers, into which the Windows loader writes the address of each imported function.
Here, we will discuss only the fields and structures of the PE file format that are relevant to this topic, as I don’t want this post to become too lengthy and exhausting for you.
Let’s Begin From Where We Left Off
In the earlier post, I did not go into much detail about the Data Directories, which are just another array of structures of type IMAGE_DATA_DIRECTORY.
As a quick overview, the Data Directory array is the last member of the Optional Header. It is one of the most important data structures in the file because it holds the pointers to the other data structures.
It holds pointers to the functions the module exports, the functions it imports, where the debug information is found, where the digital certificate is, where the relocation information is, where the resources are, and so on. It is like a big map pointing to all the other data structures that we will be looking at later.
So, the last field of the IMAGE_OPTIONAL_HEADER structure is the DataDirectory array of IMAGE_DATA_DIRECTORY entries, and this is where we locate the import table and, through it, the Import Address Table (IAT). Each entry indicates where to find one of the other important components of the executable. The definitions below show only the relevant parts.
// <winnt.h>
#define IMAGE_NUMBEROF_DIRECTORY_ENTRIES 16

// Optional header format.
typedef struct _IMAGE_OPTIONAL_HEADER {
    ...
    IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;

// Directory Entries
#define IMAGE_DIRECTORY_ENTRY_EXPORT     0   // Export Directory
#define IMAGE_DIRECTORY_ENTRY_IMPORT     1   // Import Directory
#define IMAGE_DIRECTORY_ENTRY_RESOURCE   2   // Resource Directory
#define IMAGE_DIRECTORY_ENTRY_BASERELOC  5   // Base Relocation Table
#define IMAGE_DIRECTORY_ENTRY_DEBUG      6   // Debug Directory
#define IMAGE_DIRECTORY_ENTRY_TLS        9   // TLS Directory
Since our focus here is entirely on the imports of an executable file, that is, on which functions it imports from other modules, we will take up the 2nd entry of the DataDirectory array (index 1), which represents the Import Directory.
// Import Directory
#define IMAGE_DIRECTORY_ENTRY_IMPORT 1
Each data directory entry is a structure of type IMAGE_DATA_DIRECTORY. Although every entry has the same layout, the data each one points to has its own format. The IMAGE_DATA_DIRECTORY structure is defined as:
typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD VirtualAddress; // RVA of data
    DWORD Size;           // Size of the data in bytes
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
Each data directory entry specifies a relative virtual address and size of the directory. The VirtualAddress member in this array element describes the location of the import directory which in turn is also an array. This array consists of structures of type IMAGE_IMPORT_DESCRIPTOR. One structure of this type is assigned for each DLL that is imported by the module.
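To make the lookup concrete, here is a minimal sketch of picking the import entry out of a data directory array. It assumes the array has already been read from the optional header; `DataDirectory`, `DIRECTORY_ENTRY_IMPORT` and `find_import_directory` are my own stand-in names (not winnt.h identifiers), chosen so the snippet builds without the Windows headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the winnt.h names, so the sketch builds anywhere. */
#define DIRECTORY_ENTRY_IMPORT 1u /* same value as IMAGE_DIRECTORY_ENTRY_IMPORT */

typedef struct {
    uint32_t VirtualAddress; /* RVA of the data */
    uint32_t Size;           /* size of the data in bytes */
} DataDirectory;

/*
 * Returns the import entry of a data directory array, or NULL when the
 * image declares no import directory. `count` is the number of entries
 * actually present (NumberOfRvaAndSizes in the optional header).
 */
const DataDirectory *find_import_directory(const DataDirectory *dirs,
                                           uint32_t count)
{
    assert(dirs != NULL);
    if (count <= DIRECTORY_ENTRY_IMPORT)
        return NULL; /* array too short to contain index 1 */
    if (dirs[DIRECTORY_ENTRY_IMPORT].VirtualAddress == 0)
        return NULL; /* an RVA of zero means there is no import directory */
    return &dirs[DIRECTORY_ENTRY_IMPORT];
}
```

The returned VirtualAddress is still an RVA, so when working on a file on disk it has to be translated to a file offset through the section headers before the descriptors can be read.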
The import entry of the data directory takes us to the position of the import table inside the file image. Each element of that table is a structure with several important fields, such as Characteristics, OriginalFirstThunk, Name and FirstThunk, and there is one such structure per DLL. So, every imported DLL has its own IMAGE_IMPORT_DESCRIPTOR entry.
typedef struct _IMAGE_IMPORT_DESCRIPTOR {
    union {
        DWORD Characteristics;
        DWORD OriginalFirstThunk; // RVA of ILT
    };
    ...
    ...
    DWORD Name;                   // RVA of imported DLL name
    DWORD FirstThunk;             // RVA to IAT
} IMAGE_IMPORT_DESCRIPTOR;
- OriginalFirstThunk (OFT)
It holds the RVA of the Import Lookup Table (ILT), also known as the Import Name Table (INT). Each ILT entry describes how one import is to be resolved: either by ordinal or by name.
- Name
This is just the RVA which will point at the specific name of the module from where the imports are taken. E.g. hal.dll, ntdll.dll, etc.
- FirstThunk (FT)
This holds the RVA of the Import Address Table (IAT). The structure and content of the import address table are identical to those of the Import Lookup Table until the file is bound.
On disk, both OriginalFirstThunk (INT) and FirstThunk (IAT) point to the same kind of data structure, i.e. IMAGE_THUNK_DATA.
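Since there is one descriptor per DLL, the list of imported DLLs can be found by walking the descriptor array until its terminating all-zero entry. The sketch below assumes the descriptors are already in memory; `ImportDescriptor` is a local mirror of IMAGE_IMPORT_DESCRIPTOR (including the TimeDateStamp and ForwarderChain fields that the listing above elides), and `count_imported_dlls` is a hypothetical helper name of mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of IMAGE_IMPORT_DESCRIPTOR: five DWORDs, 20 bytes. */
typedef struct {
    uint32_t OriginalFirstThunk; /* RVA of the INT/ILT (shares a union with Characteristics) */
    uint32_t TimeDateStamp;
    uint32_t ForwarderChain;
    uint32_t Name;               /* RVA of the imported DLL's name string */
    uint32_t FirstThunk;         /* RVA of the IAT */
} ImportDescriptor;

/* The descriptor array ends with an entry whose fields are all zero. */
static int is_terminator(const ImportDescriptor *d)
{
    return d->OriginalFirstThunk == 0 && d->TimeDateStamp == 0 &&
           d->ForwarderChain == 0 && d->Name == 0 && d->FirstThunk == 0;
}

/*
 * Counts descriptors up to the terminator. `max_entries` caps the walk so
 * that a malformed table without a terminator cannot run past the buffer.
 */
size_t count_imported_dlls(const ImportDescriptor *table, size_t max_entries)
{
    assert(table != NULL);
    size_t n = 0;
    while (n < max_entries && !is_terminator(&table[n]))
        n++;
    return n;
}
```

Note that a descriptor with OriginalFirstThunk set to zero is still a valid entry as long as the other fields are set, which is why the terminator check looks at every field.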
For each function that the executable module imports, we will encounter an IMAGE_THUNK_DATA structure. These structures form an array; in a 32-bit image each one is the size of a single DWORD and represents one imported function. The structure is defined in the WinNT.H header file as below:
typedef struct _IMAGE_THUNK_DATA {
    union {
        ...
        PDWORD Function;
        DWORD Ordinal;
        PIMAGE_IMPORT_BY_NAME AddressOfData;
    } u1;
} IMAGE_THUNK_DATA32;
The IMAGE_THUNK_DATA structures within the IAT serve two purposes. In the executable file, they contain either the ordinal of the imported API or an RVA to an IMAGE_IMPORT_BY_NAME structure.
typedef struct _IMAGE_IMPORT_BY_NAME {
    WORD Hint;
    BYTE Name[1];
} IMAGE_IMPORT_BY_NAME, *PIMAGE_IMPORT_BY_NAME;
The IMAGE_IMPORT_BY_NAME structure is just a WORD, followed by a string naming the imported API. The WORD value is a “Hint” to the OS loader as to what the ordinal of the imported API may be. When the OS loader loads the executable in the memory, it overwrites each IAT entry with the actual address of the imported function.
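As a rough illustration of that layout, the sketch below reads one hint/name entry out of a raw byte buffer: a little-endian WORD followed by a NUL-terminated name. The `ImportByName` type and `read_import_by_name` function are hypothetical names of mine, and the sketch assumes `offset` is already a buffer offset (not an RVA).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t hint;    /* the WORD that precedes the name */
    const char *name; /* points into the caller's buffer, NUL-terminated */
} ImportByName;

/*
 * Reads one hint/name entry located at `offset` inside `buf`.
 * Returns 0 on success, -1 if the entry does not fit in the buffer.
 */
int read_import_by_name(const uint8_t *buf, size_t len, size_t offset,
                        ImportByName *out)
{
    assert(buf != NULL && out != NULL);
    if (offset > len || len - offset < 3)
        return -1; /* need 2 bytes of hint plus at least the NUL */

    const uint8_t *p = buf + offset;
    const uint8_t *name = p + 2;
    size_t remaining = len - offset - 2;
    if (memchr(name, '\0', remaining) == NULL)
        return -1; /* name is not terminated inside the buffer */

    out->hint = (uint16_t)(p[0] | (p[1] << 8)); /* little-endian WORD */
    out->name = (const char *)name;
    return 0;
}
```

The bounds checks matter in practice because the RVAs in a PE file come from untrusted input, and a crafted file can point a thunk anywhere.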
Now, to know whether the function is imported by an ordinal or by its name, one must check the high bit of the IMAGE_THUNK_DATA value:
- If the high bit is set, the function is imported by ordinal and no name is available. The ordinal is read from the remaining bits (31 bits in a 32-bit executable, 63 bits in a 64-bit one); since ordinals are 16-bit values, only the low 16 bits are significant.
- If the high bit is not set, the value is an RVA to an IMAGE_IMPORT_BY_NAME structure.
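The two rules above can be sketched for 32-bit thunks as follows. `ORDINAL_FLAG32` has the same value as winnt.h's IMAGE_ORDINAL_FLAG32, and the 16-bit mask matches what the IMAGE_ORDINAL32 macro does; the function names themselves are my own, and the walk assumes the thunk array ends with a zero entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ORDINAL_FLAG32 0x80000000u /* same value as IMAGE_ORDINAL_FLAG32 */

/* High bit set: the import is by ordinal. */
bool thunk32_is_ordinal(uint32_t thunk)
{
    return (thunk & ORDINAL_FLAG32) != 0;
}

/* Ordinals are 16-bit values, so only the low WORD is kept. */
uint16_t thunk32_ordinal(uint32_t thunk)
{
    return (uint16_t)(thunk & 0xFFFFu);
}

/* High bit clear: the low 31 bits are the RVA of a hint/name entry. */
uint32_t thunk32_name_rva(uint32_t thunk)
{
    return thunk & 0x7FFFFFFFu;
}

/*
 * Walks a zero-terminated thunk array and counts each kind of import.
 * Returns the total number of thunks seen before the terminator.
 */
size_t classify_thunks32(const uint32_t *thunks, size_t max_entries,
                         size_t *by_ordinal, size_t *by_name)
{
    assert(thunks != NULL && by_ordinal != NULL && by_name != NULL);
    *by_ordinal = 0;
    *by_name = 0;
    size_t i = 0;
    for (; i < max_entries && thunks[i] != 0; i++) {
        if (thunk32_is_ordinal(thunks[i]))
            (*by_ordinal)++;
        else
            (*by_name)++;
    }
    return i;
}
```

For a 64-bit executable the same logic applies with 64-bit thunks and bit 63 as the flag.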
So, What’s the difference between IAT and INT?
On disk, the INT is another array of IMAGE_THUNK_DATA structures whose content is identical to the IAT. The main difference shows up at load time: the Windows loader overwrites each IAT entry with the actual address of the imported function, while the INT is left untouched.
Also, the INT is not required for an executable to load, but the IAT is essential. Without it, the executable may fail to load.
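The following toy simulation shows that difference in isolation. It is not how the Windows loader is implemented: `bind_iat32` and `demo_resolve` are hypothetical names, and the resolver just fabricates an address, standing in for the real export lookup in the target DLL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Maps one thunk (ordinal or name RVA) to a function address. */
typedef uint32_t (*ResolveFn)(uint32_t thunk);

/* Hypothetical resolver for the demo: invents an address from the thunk. */
uint32_t demo_resolve(uint32_t thunk)
{
    return 0x77000000u + (thunk & 0xFFFFu);
}

/*
 * Reads each entry of the INT (never modified, hence const) and writes
 * the resolved address into the matching IAT slot. Stops at the zero
 * terminator and returns the number of slots that were overwritten.
 */
size_t bind_iat32(const uint32_t *int_table, uint32_t *iat,
                  size_t max_entries, ResolveFn resolve)
{
    assert(int_table != NULL && iat != NULL && resolve != NULL);
    size_t i = 0;
    for (; i < max_entries && int_table[i] != 0; i++)
        iat[i] = resolve(int_table[i]);
    return i;
}
```

Starting from two identical arrays, only the IAT changes after the call. That is why tools can still recover import names from the INT of a loaded image, while the IAT holds the addresses the code actually calls through.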
Wrap-up
In the next article, we will look inside the Import Address Table (IAT) with the help of WinDbg or some other tools. It will be more practical, as we are already done with the theory.