Very Long Instruction Word

Design

In superscalar designs, the number of execution units is invisible to the instruction set. Each instruction encodes only one operation. For most superscalar designs, the instruction width is 32 bits or fewer. VLIW is sometimes described as a type of MIMD, although a VLIW processor still fetches a single instruction stream.

In contrast, one VLIW instruction encodes multiple operations; specifically, one instruction encodes at least one operation for each execution unit of the device. For example, if a VLIW device has five execution units, then a VLIW instruction for that device would have five operation fields, each field specifying what operation should be done on that corresponding execution unit. To accommodate these operation fields, VLIW instructions are usually at least 64 bits wide, and on some architectures are much wider.
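To make the idea of operation fields concrete, here is a minimal sketch of packing five fields into one wide instruction word. The layout is purely hypothetical (five 16-bit fields, each holding a 4-bit opcode and three 4-bit register numbers, for 80 bits in total) and does not correspond to any real VLIW encoding; the names `encode_field`, `encode_bundle` and `decode_bundle` are illustrative only.

```python
# Hypothetical layout: 5 slots x 16 bits = one 80-bit instruction word.
SLOTS = 5
FIELD_BITS = 16
OPCODES = {"nop": 0, "add": 1, "mul": 2, "load": 3, "store": 4}  # made-up opcodes


def encode_field(op, dst=0, src1=0, src2=0):
    """Pack one operation into a 16-bit field: opcode | dst | src1 | src2."""
    for reg in (dst, src1, src2):
        if not 0 <= reg < 16:
            raise ValueError("register numbers must fit in 4 bits")
    return (OPCODES[op] << 12) | (dst << 8) | (src1 << 4) | src2


def encode_bundle(fields):
    """Place one field per execution unit; unused units receive a nop."""
    if len(fields) > SLOTS:
        raise ValueError("more operations than execution units")
    padded = list(fields) + [encode_field("nop")] * (SLOTS - len(fields))
    word = 0
    for slot, field in enumerate(padded):
        word |= field << (slot * FIELD_BITS)
    return word


def decode_bundle(word):
    """Split a wide instruction word back into its per-unit fields."""
    mask = (1 << FIELD_BITS) - 1
    return [(word >> (slot * FIELD_BITS)) & mask for slot in range(SLOTS)]


bundle = encode_bundle([encode_field("mul", 12, 0, 4),
                        encode_field("add", 8, 8, 12)])
```

Slots that the compiler cannot fill are padded with no-ops in this sketch, which is one reason a wide instruction format can cost code density.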

For example, the following is an instruction for the SHARC. In one cycle, it does a floating-point multiply, a floating-point add, and two autoincrement loads (one from data memory, `dm`, and one from program memory, `pm`). All of this fits into a single 48-bit instruction.

f12=f0*f4, f8=f8+f12, f0=dm(i0,m3), f4=pm(i8,m9);
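One way to read the instruction above is as six register updates that take effect together. The sketch below models that in Python. It is an illustration, not a SHARC simulator, and it rests on two assumptions that are not stated in the text: every operation reads the register values from before the instruction, and each load post-increments its index register by the given modifier. The function names and the zero-padding of memory are invented for the example.

```python
def multifunction_step(regs, dm, pm):
    """Apply one instance of the instruction to a dict of registers.

    Assumption: all sources are read from the old register state, and
    results become visible only after the instruction completes.
    """
    old = regs
    new = dict(regs)
    new["f12"] = old["f0"] * old["f4"]   # floating-point multiply
    new["f8"] = old["f8"] + old["f12"]   # floating-point add, using the previous f12
    new["f0"] = dm[old["i0"]]            # load from data memory ...
    new["i0"] = old["i0"] + old["m3"]    # ... then advance the index register
    new["f4"] = pm[old["i8"]]            # load from program memory ...
    new["i8"] = old["i8"] + old["m9"]    # ... then advance the index register
    return new


def dot_product(a, b):
    """Repeat the instruction to accumulate a dot product in f8."""
    n = len(a)
    # Two zero entries of padding let the loads, multiply and add drain.
    dm = list(a) + [0.0, 0.0]
    pm = list(b) + [0.0, 0.0]
    regs = {"f0": 0.0, "f4": 0.0, "f8": 0.0, "f12": 0.0,
            "i0": 0, "m3": 1, "i8": 0, "m9": 1}
    for _ in range(n + 2):
        regs = multifunction_step(regs, dm, pm)
    return regs["f8"]
```

Under these assumptions the add always consumes the product from the previous repetition, so repeating the single instruction behaves like a three-stage software pipeline: load, multiply, accumulate.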

Since the earliest days of computer architecture, some CPUs have included several arithmetic logic units (ALUs) that run in parallel. Superscalar CPUs use hardware to decide which operations can run in parallel. VLIW CPUs use software (the compiler) to decide which operations can run in parallel. Because the complexity of instruction scheduling is pushed onto the compiler, the hardware's complexity can be substantially reduced.
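The compiler's side of this bargain can be sketched as a small greedy scheduler that packs independent operations into the same instruction word. This is a simplified illustration rather than how any particular VLIW compiler works: it assumes every operation takes one cycle, every destination is written exactly once (so only read-after-write dependencies matter), and any operation can go to any execution unit. The `schedule` function and the operation tuples are hypothetical.

```python
def schedule(ops, width):
    """Greedily pack operations into bundles of at most `width` operations.

    ops: list of (label, dest, sources) in program order.
    Returns a list of bundles; each bundle is a list of labels that
    would issue in the same cycle.
    """
    ready_cycle = {}  # register -> first cycle in which its value can be read
    bundles = []
    for label, dest, sources in ops:
        # An operation may not issue before all of its inputs are ready.
        cycle = max((ready_cycle.get(s, 0) for s in sources), default=0)
        # Skip bundles whose execution units are already all in use.
        while cycle < len(bundles) and len(bundles[cycle]) >= width:
            cycle += 1
        while len(bundles) <= cycle:
            bundles.append([])
        bundles[cycle].append(label)
        ready_cycle[dest] = cycle + 1  # unit latency (assumption)
    return bundles


ops = [
    ("t1 = a * b", "t1", ["a", "b"]),
    ("t2 = c * d", "t2", ["c", "d"]),
    ("t3 = t1 + t2", "t3", ["t1", "t2"]),
    ("t4 = e + f", "t4", ["e", "f"]),
    ("t5 = t3 + t4", "t5", ["t3", "t4"]),
]
bundles = schedule(ops, width=2)
```

With two execution units, the five operations fit in three instruction words instead of five. All of this analysis happens at compile time, so the hardware only has to issue each bundle as written.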

Deciding what can run in parallel becomes harder when the result of an otherwise parallelisable instruction is used as input for a branch. Most modern CPUs "guess" which branch will be taken even before the calculation is complete, so that they can load the instructions for that branch, or (in some architectures) even start to execute them speculatively. If the CPU guesses wrong, all of these instructions and their context need to be "flushed" and the correct ones loaded, which is time-consuming.

This has led to increasingly complex instruction-dispatch logic that attempts to guess correctly, and the simplicity of the original RISC designs has been eroded. VLIW lacks this logic, and therefore lacks its power consumption, possible design defects and other negative features.

In a VLIW, the compiler uses heuristics or profile information to guess the direction of a branch. This allows it to move and preschedule operations speculatively before the branch is taken, favoring the most likely path it expects through the branch. If the branch goes the unexpected way, the compiler has already generated compensatory code to discard speculative results to preserve program semantics.
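The effect of this compile-time speculation can be sketched by comparing a small function before and after the transformation. The example is a hand-written illustration of the idea, not the output of a real compiler; the profile counts and the function names are invented, and the "compensation" here is simply that the speculative temporary is never committed on the unlikely path.

```python
def likely_taken(profile):
    """Guess the branch direction from (hypothetical) profile counts."""
    return profile["taken"] >= profile["not_taken"]


def original(a, b, cond):
    """Source-order version: nothing is computed until the branch resolves."""
    if cond:
        r = a * b
    else:
        r = a + b
    return r


def scheduled_for_taken(a, b, cond):
    """Version a compiler might emit when it expects `cond` to be true."""
    spec = a * b      # hoisted above the branch, written to a temporary only
    if cond:
        r = spec      # expected path: commit the speculative result
    else:
        r = a + b     # unexpected path: `spec` is discarded, never committed
    return r


profile = {"taken": 90, "not_taken": 10}
use_speculation = likely_taken(profile)
```

Because the hoisted multiply writes only to a temporary and has no side effects, both versions return the same value on either path; the speculative version merely starts the likely work earlier, where it can share an instruction word with other operations.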

The acronym VLIW may also refer to variable-length instruction word, a CPU instruction set designed to load (or copy) a literal value count of inline machine code to the on-chip RAM for higher speed CPU decoding.

Vector processor (SIMD) cores can be combined with a VLIW architecture, as in the Fujitsu FR-V, further increasing throughput and speed.
