Before we move on to the PE file format, which describes the internal structure of Windows executable files, you should understand three concepts: Virtual Address (VA), Relative Virtual Address (RVA), and file offset. They are the foundation for understanding the technical parts of the PE file format.
This post provides the building blocks for the later posts on the PE file format. I highly recommend reading it before proceeding to the more advanced topic of exploring the internals of a Windows executable file.
Understanding how VA, RVA, and file offset are related, and how to calculate one from another, is critical. We will deal with these conversions often, in nearly every reversing challenge that we take up later.
Virtual Address (VA)
Applications do not access physical memory directly; they access only virtual memory. In other words, Virtual Addresses (VAs) are the memory addresses that an application references.
Virtualizing access to memory provides flexibility in the way applications use available physical memory. In fact, an application doesn’t have to occupy a contiguous piece of physical memory; it can be broken down into parts, without the application even needing to know about it.
Relative Virtual Address (RVA)
A Relative Virtual Address (RVA) is the difference between two Virtual Addresses: the address of an item in memory and the address at which the image is loaded. Put differently, the VA is the full address in memory, whereas the RVA is that same address expressed relative to the ImageBase. ImageBase here means the base address at which the executable file is loaded into memory. (The ImageBase field in the PE optional header holds the preferred load address; if the loader places the image somewhere else, the actual load address is used as the base instead.)
We can calculate RVA with the help of the following formula:
RVA = VA – ImageBase
Have a look at the example below for more clarification:
An application is loaded into memory with a base address of 0x00400000, and the VA we are interested in is 0x00401000. The RVA is calculated as:
Virtual Address = 0x00401000
ImageBase = 0x00400000
RVA = 0x00001000
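The same arithmetic can be written as a short Python sketch. The function names `va_to_rva` and `rva_to_va` are made up for illustration and do not come from any library; the values are the ones from the example above.

```python
def va_to_rva(va: int, image_base: int) -> int:
    """Convert a Virtual Address to an RVA (hypothetical helper)."""
    if va < image_base:
        raise ValueError("VA lies below the image base")
    return va - image_base


def rva_to_va(rva: int, image_base: int) -> int:
    """Convert an RVA back to a Virtual Address (hypothetical helper)."""
    return image_base + rva


image_base = 0x00400000
va = 0x00401000

rva = va_to_rva(va, image_base)
print(f"RVA = 0x{rva:08X}")                        # RVA = 0x00001000
print(f"VA  = 0x{rva_to_va(rva, image_base):08X}")  # VA  = 0x00401000
```

Going the other way is just an addition, which is why the PE headers can store RVAs and still work wherever the image ends up being loaded.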
File Offset
When we talk about offsets, we usually refer to positions in physical memory, in a file on disk, or, more generally, in any data that we treat as raw bytes.
A file offset is a location within a particular file: the distance in bytes from the start of that file. More generally, an offset is the distance from some starting point, such as the start of a file or a base memory address, and it is added to that base value to obtain the actual location.
To calculate the file offset of the entry point in a PE file, consider the table below, which lists the relevant fields from the PE optional header and section header of a particular application.
|Number of Sections
|Address of Entry Point
|Size of Raw Data
|Pointer to Raw Data
The steps to calculate the file offset at which execution starts are as follows:
- First, determine the Address of entry point from the field under Optional Header.
- Next, check in which section’s virtual space the address of entry point lies.
- Once the right section header is determined, make a note of its virtual address and pointer to raw data fields.
- Now, calculate the difference between the address of entry point and the virtual address of the section identified earlier, in which the entry point lies.
- Finally, add that difference to the section's pointer to raw data. The result is the file offset at which execution starts.
In short, the formula for calculating the file offset of the execution start is:
Offset of entry point in EXE file = (AddressOfEntryPoint – .section[VirtualAddress]) + .section[PointerToRawData]
In this case, the address of entry point (0x0000739D) lies in the .text section, because the virtual address range of .text starts at 0x00001000 and ends at 0x00007748.
So, the file offset for the execution start is:
(0x0000739D – 0x00001000) + 0x00000400 = 0x0000679D
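The steps above can be sketched in Python as follows. This is a minimal illustration, not a full PE parser: the `Section` class and the `rva_to_file_offset` function are hypothetical names, the section values are typed in by hand from the example, and the virtual size of 0x6748 is an assumption derived from the stated start (0x00001000) and end (0x00007748) of .text.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Section:
    """Subset of a section header needed for the conversion (illustrative)."""
    name: str
    virtual_address: int      # RVA at which the section starts in memory
    virtual_size: int         # size of the section in memory
    pointer_to_raw_data: int  # file offset at which the section's data starts


def rva_to_file_offset(rva: int, sections: Iterable[Section]) -> int:
    """Find the section containing the RVA and translate it to a file offset."""
    for s in sections:
        start = s.virtual_address
        end = s.virtual_address + s.virtual_size
        if start <= rva < end:
            # Distance into the section, re-based onto its raw data in the file.
            return (rva - s.virtual_address) + s.pointer_to_raw_data
    raise ValueError(f"RVA 0x{rva:08X} does not fall inside any section")


# Values from the worked example; virtual_size is derived, see the note above.
text = Section(".text", virtual_address=0x1000, virtual_size=0x6748,
               pointer_to_raw_data=0x400)

address_of_entry_point = 0x739D
offset = rva_to_file_offset(address_of_entry_point, [text])
print(f"Entry point file offset = 0x{offset:08X}")  # 0x0000679D
```

A real tool would read these fields from the section headers of the file instead of hard-coding them, and would also check the result against the section's size of raw data, since part of a section may exist only in memory and have no bytes in the file.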