Hello readers! In this article, we will take a look at the PE Header which is very much important in understanding the internal part of an executable file. Once you have an overall idea about what’s inside the executable file and how that executable file works in Windows it will then become easy for you to analyze any executable file as you advance the journey to the Malware Analysis path. Hopefully, this article will make you understand the overall scenario as to why I wrote this up and what’s the importance of PE Header while analyzing any malware binary. Also, I would try to keep this post as simple as possible since I am assuming that you are new to this exciting world of Malware Analysis and I don’t want you to get distracted. So, let’s get started.
Each executable file has a common format called Common Object File Format (COFF), a format for executable, object code, shared library computer files used on Unix systems. And PE (Portable Executable) format is one such COFF format available today for executables, object code, DLLs, FON font files, and core dumps in Windows. And if you ask me what’s on the plate for Linux then? Well, we have an Executable Link File (ELF) format for the Linux. Since I have dedicated this post to the Windows PE headers, I will discuss ELF format in some other post later.
PE format is actually a data structure that tells Windows OS loader what information is required in order to manage the wrapped executable code. The data structures on disk are the same data structures used in the memory and if you know how to find something in a PE file, you can almost certainly find the same information after the file is loaded in the memory. A module in memory represents all the code, data, and resources from an executable file that is needed by a process. Other parts of a PE file may be read, but not mapped in (for instance, relocations). Some parts may not be mapped in at all, for example, when debug information is placed at the end of the file. A field in the PE header tells the system how much memory needs to be set aside for mapping the executable into memory. Data that won’t be mapped in is placed at the end of the file, past any parts that will be mapped in.
The PE data structures include: DOS Header, DOS Stub, PE File Header, Image Optional Header, Section Table
Let’s just explain you these data structures with the help of an example. So, I am taking an example of Calculator (calc.exe) here which I’ll be opening in Hex Editor (HxD). You can grab this handy tool from here.
Below diagram illustrates the DOS Header of the PE file format. DOS Header occupies the first 64 bytes of the file. i.e. the first 4 rows of the hex editor as seen in the image below. If you notice you will see the ASCII strings “MZ” mentioned at the beginning of the file. This MZ occupies the first two bytes (hexadecimal: 4D 5A or 0x54AD) of the DOS Header which is read as 5Ah 4Dh. MZ is the initials of Mark Zbikowski, one of the developers of MS-DOS. This field is called e_magic or the magic number which is one such important field to identify an MS-DOS-compatible file type. All MS-DOS-compatible executable files set this value to 0x54AD i.e. “MZ” in ASCII.
The final field, e_lfanew, is a 4-byte offset (E8 00 00 00) and tells where the PE Header is located. Check the PE header section below for this offset location.
A stub is a program or a piece of code that is run by default when the execution of an application starts. In this case, the real-mode stub program is run by MS-DOS when the executable is loaded. The programs typically do no more than output a line of text, such as: “This program requires Microsoft Windows v3.1 or greater“. Or “This program cannot be run in DOS mode“.
PE File Header
The PE header is located by looking at the e_lfanew field of the MS-DOS Header. The e_lfanew gives the offset of the PE header location. In this case, the offset set for PE header is 000000F0 and the PE signature starts at 50 45 00 00 (the letter PE followed by two terminating zeroes). File header is the next 20 bytes of the PE file and contains information about the layout of the file.
The above-highlighted part of the image signifies the file header of any portable executable file.