X86 Memory Segmentation - Real Mode

Real Mode

In real mode, the 16-bit segment selector is interpreted as the most significant 16 bits of a linear 20-bit address, called a segment address, of which the remaining four least significant bits are all zeros. The segment address is always added with a 16-bit offset to yield a linear address, which is the same as physical address in this mode. For instance, the segmented address 06EFh:1234h has a segment selector of 06EFh, representing a segment address of 06EF0h, to which we add the offset, yielding the linear address 06EF0h + 1234h = 08124h (hexadecimal).

Because of the way the segment address and offset are added, a single linear address can be mapped to up to 4096 distinct segment:offset pairs. For example, the linear address 08124h can have the segmented addresses 06EFh:1234h, 0812h:0004h, 0000h:8124h, etc. This could be confusing to programmers accustomed to unique addressing schemes, but it can also be used to advantage, for example when addressing multiple nested data structures. While real mode segments are always 64 KiB long, the practical effect is only that no segment can be longer than 64 KiB, rather than that every segment must be 64 KiB long. Because there is no protection or privilege limitation in real mode, even if a segment could be defined to be smaller than 64 KiB, it would still be entirely up to the programs to coordinate and keep within the bounds of their segments, as any program can always access any memory (since it can arbitrarily set segment selectors to change segment addresses with absolutely no supervision). Therefore, real mode can just as well be imagined as having a variable length for each segment, in the range 1 to 65536 bytes, that is just not enforced by the CPU.

(Note that the leading zeros of the linear address, segmented addresses, and the segment and offset fields, which are usually neglected, were shown here for clarity.)

The effective 20-bit address space of real mode limits the addressable memory to 220 bytes, or 1,048,576 bytes. This derived directly from the hardware design of the Intel 8086 (and, subsequently, the closely related 8088), which had exactly 20 address pins. (Both were packaged in 40-pin DIP packages; even with only 20 address lines, the address and data buses were multiplexed to fit all the address and data lines within the limited pin count.)

Each segment begins at a multiple of 16 bytes, from the beginning of the linear (flat) address space. That is, at 16 byte intervals. Since all segments are 64 KiB long, this explains the huge overlap that can occur between segments and that any location in the linear memory address space can be accessed with many segment:offset pairs. The actual location of the beginning of a segment in the linear address space can be calculated with segment×16. A segment value of 0Ch (12) would give an linear address at C0h (192) in the linear address space. The address offset can then be added to this number. 0Ch:0Fh (12:15) would be C0h+0Fh=CFh (192+15=207), CFh (207) being the linear address. Such address translations are carried out by the segmentation unit of the CPU. The last segment, FFFFh (65535), begins at linear address FFFF0h (1048560), 16 bytes before the end of the 20 bit address space, and thus, can access, with an offset of up to 65,536 bytes, up to 65,520 (65536−16) bytes past the end of the 20 bit 8088 address space. On the 8088, these address accesses were wrapped around to the beginning of the address space such that 65535:16 would access address 0 and 65533:1000 would access address 952 of the linear address space. Programmers using this feature led to the Gate A20 compatibility issues in later CPU generations, where the linear address space was expanded past 20 bits.

In 16-bit real mode, enabling applications to make use of multiple memory segments (in order to access more memory than available in any one 64K-segment) is quite complex, but was viewed as a necessary evil for all but the smallest tools (which could do with less memory). The root of the problem is that no appropriate address-arithmetic instructions suitable for flat addressing of the entire memory range are available. Flat addressing is possible by applying multiple instructions, which however leads to slower programs.

Read more about this topic:  X86 Memory Segmentation

Famous quotes containing the words real and/or mode:

    Talking about dreams is like talking about movies, since the cinema uses the language of dreams; years can pass in a second and you can hop from one place to another. It’s a language made of image. And in the real cinema, every object and every light means something, as in a dream.
    Frederico Fellini (1920–1993)

    Almost any mode of observation will be successful at last, for what is most wanted is method.
    Henry David Thoreau (1817–1862)