Machine Language Programs

The machine language level instruction set of a computer is the lowest level language available to application programmers. (There have been some examples of machines whose microprograms could be modified but they are very rare and even with those machines no one would write applications at the microinstruction level.) A complete description of the machine language instruction set for our example architecture can be found in the documentation for the simulator and will not be repeated here. The purpose of this page is to take an initial look at writing machine language programs.

Example 1: Basic Arithmetic and Logic

Write a program that will implement the following Java-like program:

int A = 129 (binary 10000001);
int B = 127 (binary 01111111);
int C, D, E, F;
C = A + B;
D = A - B;
E = A AND B;
F = A OR B;

Variables

The variables declared in the main method are stored in the local variables block of memory which starts at word address 0x2000. Each variable is associated with a specific address within this block and the addresses are allocated sequentially starting at 0x2000. Which variable goes where is arbitrary but let's make the following allocations:

Variable	Address
A	0x2000
B	0x2001
C	0x2002
D	0x2003
E	0x2004
F	0x2005

In IJVM, however, a variable is identified not by its absolute memory address but by its offset from the base of the local variables section of memory, that is, from the base address stored in the LV register. In our case, those offsets are:

Variable	Offset
A	0
B	1
C	2
D	3
E	4
F	5

Variable Initialization

We'll defer a discussion of how to initialize variables till later. In this example, we will initialize our variables by simply entering the values directly into the corresponding memory locations in the simulator.

Arithmetic and Logic Operations

Here is the sequence of steps for performing arithmetic and logic operations:

Load the first (left-hand) operand onto the stack.
Load the second (right-hand) operand onto the stack.
Perform the desired operation. (This pops the top two values, performs the operation, and pushes the result onto the stack.)
Pop the top value from the stack and store in the variable on the left side of the assignment operator.

Following this sequence of steps, the stack will be in exactly the same state as it was when the sequence began.

The ILOAD instruction (opcode: decimal 21) pushes the value of a variable onto the stack. The operand for this instruction is a one-byte unsigned integer whose value is the offset of the desired variable. The value of the operand is added to the contents of the LV register to determine the address of the variable. The format for this instruction is:

Instruction	Opcode	Operand
ILOAD	21	Offset (0-255)

The instructions to perform the arithmetic and logic operations are IADD, ISUB, IAND, and IOR. (The "I" in the instruction names is for "Integer".) Each operation removes the top two integers from the stack (the operands), performs the operation, and pushes the result onto the stack. None of these instructions have an operand because the operands are already on the stack. The opcodes, in decimal, are:

Instruction	Opcode (Decimal)
IADD	96
ISUB	100
IAND	126
IOR	176

The ISTORE instruction pops the top value from the stack and stores it in the variable corresponding to its one-byte unsigned operand (the offset). The address of the variable is determined by adding the offset to the base address in the LV register. The format for this instruction is:

Instruction	Opcode	Operand
ISTORE	54	Offset (0 - 255)

The HALT instruction (opcode: decimal 255) terminates the execution of a machine language program.

Now we are ready to put all of the pieces together:

Java	Instruction	Opcode	Operand
C = A + B	ILOAD A	21	0
	ILOAD B	21	1
	IADD	96
	ISTORE C	54	2
D = A - B	ILOAD A	21	0
	ILOAD B	21	1
	ISUB	100
	ISTORE D	54	3
E = A AND B	ILOAD A	21	0
	ILOAD B	21	1
	IAND	126
	ISTORE E	54	4
F = A OR B	ILOAD A	21	0
	ILOAD B	21	1
	IOR	176
	ISTORE F	54	5
	HALT	255

Here is our final machine language program (in decimal):

21 0 21 1 96 54 2 21 0 21 1 100 54 3 21 0 21 1 126 54 4 21 0 21 1 176 54 5 255

To test this machine language program:

Click the "Goto LV" button, enter 129 at address 8192 (the value of LV) and enter 127 at address 8193.
Click the "Go to" button (with the address 0 in the text box).
Click the "Display Bytes" radio button and enter the machine language program, one byte at a time, starting at address zero.
Click the "Run" button.
Click the "Goto LV" button. (This automatically selects the "Display Words" radio button.)
Verify that the memory locations associated with C, D, E, and F contain the correct values (256, 2, 1, and 255, respectively).
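
As a cross-check that does not require the simulator, the load, operate, store sequence described above can be sketched in Python. This is a minimal model and not the simulator's implementation: the function name `run` and the use of a plain list for the local variables (list index = offset from LV) are assumptions made for illustration only.

```python
# Minimal sketch of the Example 1 instructions (an assumed model, not the real simulator).
ILOAD, ISTORE, IADD, ISUB, IAND, IOR, HALT = 21, 54, 96, 100, 126, 176, 255

def run(program, local_vars):
    """Execute the byte program; local_vars[i] is the variable at offset i from LV."""
    stack = []
    pc = 0
    while True:
        opcode = program[pc]
        if opcode == HALT:
            return local_vars
        if opcode == ILOAD:                  # push the variable at the given offset
            stack.append(local_vars[program[pc + 1]])
            pc += 2
        elif opcode == ISTORE:               # pop into the variable at the given offset
            local_vars[program[pc + 1]] = stack.pop()
            pc += 2
        elif opcode in (IADD, ISUB, IAND, IOR):
            right = stack.pop()              # right-hand operand was pushed last
            left = stack.pop()
            if opcode == IADD:
                stack.append(left + right)
            elif opcode == ISUB:
                stack.append(left - right)
            elif opcode == IAND:
                stack.append(left & right)
            else:
                stack.append(left | right)
            pc += 1                          # these instructions have no operand
        else:
            raise ValueError(f"unknown opcode {opcode} at address {pc}")

program = [21, 0, 21, 1, 96, 54, 2, 21, 0, 21, 1, 100, 54, 3,
           21, 0, 21, 1, 126, 54, 4, 21, 0, 21, 1, 176, 54, 5, 255]
variables = run(program, [129, 127, 0, 0, 0, 0])   # A and B entered by hand
print(variables[2:])   # C, D, E, F -> [256, 2, 1, 255]
```

The initial list plays the role of entering 129 and 127 at the first two addresses of the local variables block before clicking "Run".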

Example 2: Variable Initialization

In the previous example, we deferred the question of how to initialize variables by simply entering the initial values of the variables A and B directly into memory. In reality, of course, that is not the way it is done. A program is responsible for initializing its own variables.

Recall that, in our Java-like program, we wanted to initialize A and B:

int A = 129 (binary 10000001);
int B = 127 (binary 01111111);

Let's begin by writing the code to initialize B. The BIPUSH instruction pushes its one-byte signed operand onto the stack. Once it is on the stack, we can use an ISTORE instruction to assign that value to a variable. Since the operand for this instruction is only a one-byte signed integer, the possible values range from -128 to 127. The format for this instruction is:

Instruction	Opcode	Operand
BIPUSH	16	-128 to 127

Here is the code needed to initialize B:

Java	Instruction	Opcode	Operand
int B = 127;	BIPUSH 127	16	127
	ISTORE B	54	1

The BIPUSH instruction cannot be used to initialize A because the value, 129, is too large to represent in a one-byte signed integer operand. Consequently, we need to adopt a different strategy. We will store 129 as a constant in the constant pool (which starts at 0x3000, the address stored in the CPP register). The LDC_W instruction is used to push the value of a constant onto the stack. Its two-byte unsigned operand is the offset of the constant from the base of the constant pool section of memory. The address of the constant is found by adding the offset to the address in the CPP register. Note that a two-byte offset gives us the capability of accessing up to 65,536 constants, but in our example architecture the size of the constant pool limits us to only 4,096 constants. Once the value of a constant has been pushed onto the top of the stack, we can use the ISTORE instruction to copy it to the corresponding variable. The format for the LDC_W instruction is:

Instruction	Opcode	Operand
LDC_W	19	0 to 65,535

The code for initializing A is given here:

Java	Assembly	Opcode	Operand
int A = 129;	LDC_W 0	19	0 (2-byte)
	ISTORE 0	54	0 (1-byte)

In this case, the offset of the constant within the constant pool is 0 (assuming that we stored 129 at 0x3000) and the offset of the variable in the local variables section of memory is also 0.
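
The choice between BIPUSH and LDC_W can be sketched as a small helper. The function `emit_init` and the list `pool` are hypothetical names used only for this illustration, and the sketch assumes the constant pool is modeled as a Python list whose index is the offset from CPP, with two-byte operands written high byte first.

```python
BIPUSH, LDC_W, ISTORE = 16, 19, 54

def emit_init(value, var_offset, constant_pool):
    """Return the bytes that initialize the variable at var_offset to value."""
    if -128 <= value <= 127:
        # Fits in a one-byte signed operand; store the byte as 0..255.
        code = [BIPUSH, value & 0xFF]
    else:
        # Too large for BIPUSH: place the value in the constant pool and
        # load it through a two-byte unsigned offset (high byte first).
        constant_pool.append(value)
        offset = len(constant_pool) - 1
        code = [LDC_W, offset >> 8, offset & 0xFF]
    return code + [ISTORE, var_offset]

pool = []
init_b = emit_init(127, 1, pool)   # [16, 127, 54, 1]
init_a = emit_init(129, 0, pool)   # [19, 0, 0, 54, 0]
print(init_b + init_a, pool)
```

Here 127 still fits in the BIPUSH operand, while 129 falls outside the range -128 to 127 and is therefore routed through the constant pool.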

Now, the question is how do we get the 129 loaded into the constant pool? In the simulator, we will simply enter it into memory much as we entered the initial values of the variables in the previous example. What, you might ask, is the difference? In either case, the programmer must manipulate specific memory locations in order to supply the initial values of a variable. In the first example, the programmer had to initialize the memory locations used for the variables. In this example, the programmer enters one initial value directly into the machine language program as the operand of BIPUSH and enters the second value in the constant pool section of memory.

The machine language programmer must know the offset for every variable and constant used in the program and must have access to memory in order to enter the initial values of those variables and constants. Both of these limitations are removed at the next level which is the assembly language level.

The code for initializing the variables in our example program would precede the code for performing the arithmetic and logic operations. It doesn't make any difference whether we initialize A first or B first but we will initialize B first since we discussed that first. The first 4 bytes initialize B and the next 5 bytes initialize A. From that point on, the program is exactly the same as it was before:

16 127 54 1 19 0 0 54 0 21 0 21 1 96 54 2 21 0 21 1 100 54 3 21 0 21 1 126 54 4 21 0 21 1 176 54 5 255

To test this machine language program:

Click the "Goto CPP" button, enter 129 at address 12,288 (the value of CPP).
Click the "Go to" button (with the address 0 in the text box).
Click the "Display Bytes" radio button and enter the machine language program, one byte at a time, starting at address zero.
Click the "Run" button.
Click the "Goto LV" button. (This automatically selects the "Display Words" radio button.)
Verify that the memory locations associated with C, D, E, and F contain the correct values (256, 2, 1, and 255, respectively).

Example 3: If-Then-Else

Write a program that will implement the following Java-like program:

int A, B;
if (A < 0)
    B = -A;
else
    B = A;

Our machine language instruction set has four branching instructions all of which have a 2-byte signed operand. The operand is an offset that is added to the memory address at which the branching instruction was located in order to find the address of the next machine language instruction. A negative offset means the program will branch to a previous instruction (useful for looping) and a positive offset means the program will branch to a subsequent instruction (i.e., it will skip ahead).

The OPC (old program counter) register is used to store the address at which a branching instruction is located. The program counter itself has to keep incrementing in order to read the 2-byte operand. If a branch occurs, the value of the operand is added to the value in the OPC register and the result is copied to the PC register. If no branch occurs, then the value of the PC register contains the address of the first instruction following the failed branching instruction.

Instruction	Opcode	Operand (2-Byte Signed Offset)	Action
GOTO	167	-32,768 to 32,767	PC = OPC + Offset
IFEQ	153	-32,768 to 32,767	IF TOS = 0, PC = OPC + Offset
IFLT	155	-32,768 to 32,767	IF TOS < 0, PC = OPC + Offset
IF_ICMPEQ	156	-32,768 to 32,767	IF (top 2 elements on stack are equal), PC = OPC + Offset

When a conditional branching instruction is executed, the value (or values) used to perform the test are removed from the stack.

In an if-then-else statement, there are two blocks of instructions; one (the if clause) is executed if the tested condition is true and the other (the else clause) is executed if the tested condition is false. As a general rule, the else clause immediately follows the branching instruction and the if clause follows the else clause:

if condition is true go to IF_CLAUSE (conditional branch)
    else clause code goes here
    go to REST_OF_PROGRAM (unconditional branch)
IF_CLAUSE:
    if clause code goes here
REST_OF_PROGRAM:
    rest of program goes here

Note that an if-then-else structure requires a conditional branching instruction to jump over the else clause (when the condition is true) and an unconditional branching instruction to jump over the if clause (when the condition is false). A simple if-then structure has no else clause but other than that the design is exactly the same. In the code that follows, A is at offset 0 and B is at offset 1 in the local variables block of memory. (The execution time is in clock cycles based on the number of microinstructions required to execute the instruction.)

Java	Instruction	Opcode	Operand	Bytes	Time (A < 0)	Time (A >= 0)
If (A < 0)	ILOAD A	21	0	2	6	6
	IFLT goto If_Clause	155	? ? (0 10)	3	11	8
else B = A;	ILOAD A	21	0	2		6
	ISTORE B	54	1	2		7
	GOTO End	167	? ? (0 10)	3		7
then B = -A;	If_Clause: BIPUSH 0	16	0	2	4
	ILOAD A	21	0	2	6
	ISUB	100		1	4
	ISTORE B	54	1	2	7
	End: HALT	255		1	1	1
			Total	20	39	35

Here is the machine language program:

21, 0, 155, 0, 10, 21, 0, 54, 1, 167, 0, 10, 16, 0, 21, 0, 100, 54, 1, 255

You might be asking how you can know what the branching offsets are as you write the program. The answer is, you don't. For each branching offset, start out by writing two question marks (one for each byte of the two-byte signed offset). If you use only one question mark, you are likely to forget that the operand is actually two bytes. When you are all done, go back and count how many bytes there are after the branching opcode 155 up to and including the opcode 16 at the beginning of the if clause. There are ten bytes: ?, ?, 21, 0, 54, 1, 167, ?, ?, and 16. Replace the question marks in the operand for opcode 155 with 0 10 (the first byte is 0 and the second byte is 10). As a 2-byte integer, the offset will be 10.

Similarly, count the number of bytes after the branching opcode 167 up to and including the opcode at the beginning of the rest of the program (HALT in this case). There are ten bytes: ?, ?, 16, 0, 21, 0, 100, 54, 1, and 255. Replace the question marks in the operand for opcode 167 with 0 10.
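
The counting procedure above is the kind of bookkeeping an assembler automates. The following sketch records label addresses in a first pass and fills in the offsets in a second. The tuple-based source format and the function name `assemble` are assumptions made for this illustration and are not part of the simulator.

```python
SIZES = {"ILOAD": 2, "ISTORE": 2, "BIPUSH": 2, "ISUB": 1, "HALT": 1,
         "IFLT": 3, "GOTO": 3}
OPCODES = {"ILOAD": 21, "ISTORE": 54, "BIPUSH": 16, "ISUB": 100, "HALT": 255,
           "IFLT": 155, "GOTO": 167}
BRANCHES = {"IFLT", "GOTO"}

def assemble(source):
    """source: list of (label or None, mnemonic, operand or None)."""
    # Pass 1: record the byte address of every label.
    labels, address = {}, 0
    for label, mnemonic, _ in source:
        if label:
            labels[label] = address
        address += SIZES[mnemonic]
    # Pass 2: emit bytes; a branch offset is the target address minus
    # the address of the branch opcode itself.
    code = []
    for _, mnemonic, operand in source:
        here = len(code)
        code.append(OPCODES[mnemonic])
        if mnemonic in BRANCHES:
            offset = (labels[operand] - here) & 0xFFFF   # 2-byte two's complement
            code += [offset >> 8, offset & 0xFF]
        elif operand is not None:
            code.append(operand & 0xFF)
    return code

A, B = 0, 1   # offsets of the variables from LV
source = [
    (None, "ILOAD", A), (None, "IFLT", "IF_CLAUSE"),
    (None, "ILOAD", A), (None, "ISTORE", B), (None, "GOTO", "END"),
    ("IF_CLAUSE", "BIPUSH", 0), (None, "ILOAD", A), (None, "ISUB", None),
    (None, "ISTORE", B),
    ("END", "HALT", None),
]
print(assemble(source))
```

Both branches come out with the operand bytes 0 10, matching the hand count.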

To test this machine language program:

"Goto LV" and enter 100 at offset 0.
"Go to" address 0.
"Display Bytes" and enter the machine language program.
"Run".
"Goto LV" and verify that B (at offset 1) is 100.
Enter -200 at offset 0.
"Reset" and "Run".
"Goto LV" and verify that B (at offset 1) is 200.
Enter 0 at offset 0.
"Reset" and "Run".
"Goto LV" and verify that B (at offset 1) is 0.

This program requires 20 bytes of memory and the average execution time is 37 clock cycles.

Example 5: Efficiency

One reason for writing machine language programs is to maximize speed and minimize size. In general, we want small programs that run as fast as possible. Sometimes these two goals conflict. For example, I may be able to make a program faster at the expense of requiring more memory. Or, perhaps, I can reduce the size of a program at the expense of slowing it down a little.

Let's consider the program we just wrote:

    ILOAD A
    IFLT IF_CLAUSE
    ILOAD A
    ISTORE B
    GOTO END
IF_CLAUSE:
    BIPUSH 0
    ILOAD A
    ISUB
    ISTORE B
END:
    HALT

Refinement 1

Notice that the last instruction in both the if clause and the else clause is ISTORE B. Let's move that statement out of each clause and put it just before the HALT instruction:

    ILOAD A
    IFLT IF_CLAUSE
    ILOAD A
    GOTO END
IF_CLAUSE:
    BIPUSH 0
    ILOAD A
    ISUB
END:
    ISTORE B
    HALT

By eliminating one ISTORE instruction, the program is 2 bytes smaller. Execution time is unchanged since only one of the ISTOREs would have been executed anyway.

Refinement 2

Notice that A is pushed onto the stack in order to be tested. The test, however, removes it from the stack and in both the if clause and the else clause a second copy of A is pushed onto the stack. Why not push a second copy of A onto the stack before the test? The test will remove the second copy leaving the first copy of A on the top of the stack ready for use:

    ILOAD A
    ILOAD A
    IFLT IF_CLAUSE
    GOTO END
IF_CLAUSE:
    BIPUSH 0
    ISUB
END:
    ISTORE B
    HALT

See if you can figure out why this program will not work. Stop! Don't continue reading until you have thought about why this program will not work. Remember that the subtraction instruction subtracts the value on the top of the stack from the value that was underneath it. In the if clause of this program, the subtraction yields A - 0 = A which is incorrect (we want 0 - A = -A). The problem is that the top two elements on the stack are in the wrong order. We can fix this by using the SWAP instruction before we subtract:

    ILOAD A
    ILOAD A
    IFLT IF_CLAUSE
    GOTO END
IF_CLAUSE:
    BIPUSH 0
    SWAP
    ISUB
END:
    ISTORE B
    HALT

In this second refinement, we have, in effect, replaced one ILOAD instruction with a SWAP. What are the consequences of this change? The SWAP instruction is 1 byte long and the ILOAD instruction is 2 bytes long, so our new version is one byte smaller than the first refinement (which was 2 bytes smaller than the original for a total reduction of 3 bytes).

Unfortunately, the SWAP instruction adds 7 clock cycles to the execution time if A is negative (for a total of 46 clock cycles). Execution time is unchanged if A is not negative (35 clock cycles). The average execution time has increased from 37 to 40.5 clock cycles (a 9.5% increase).

Refinement 3

In refinement 2, we used two ILOAD instructions in order to get two copies of A on top of the stack. We can use the DUP instruction to do the same thing:

    ILOAD A
    DUP
    IFLT IF_CLAUSE
    GOTO END
IF_CLAUSE:
    BIPUSH 0
    SWAP
    ISUB
END:
    ISTORE B
    HALT

Compared to refinement 2, we have replaced an ILOAD (2 bytes) with a DUP (1 byte) saving one more byte of memory. An ILOAD requires 6 clock cycles while the DUP requires only 3. That is a savings of 3 clock cycles whether A is negative or not. This version requires 16 bytes of memory (20% less memory than the original version) and the average execution time is 37.5 clock cycles (1.4% more than the original version). That is a pretty good compromise.

Instruction	Opcode	Operand	Bytes	Time (A < 0)	Time (A >= 0)
ILOAD A	21	0	2	6	6
DUP	89		1	3	3
IFLT If_Clause	155	? ? (0 6)	3	11	8
GOTO End	167	? ? (0 7)	3		7
If_Clause: BIPUSH 0	16	0	2	4
SWAP	95		1	7
ISUB	100		1	4
End: ISTORE B	54	1	2	7	7
HALT	255		1	1	1
		Totals	16	43	32

Refinement 4

The primary cause for the increase in the average execution time is the SWAP statement that we added to get the operands in the right order for the subtraction in the if clause. Let's begin our program with BIPUSH 0. If A < 0 then we will already have both 0 and A on the stack in the correct order and we can do away with the SWAP instruction. Otherwise, pushing 0 onto the stack was a waste of time and only increases the number of clock cycles along the path through the else clause.

Instruction	Opcode	Operand	Bytes	Time (A < 0)	Time (A >= 0)
BIPUSH 0	16	0	2	4	4
ILOAD A	21	0	2	6	6
DUP	89		1	3	3
IFLT If_Clause	155	? ? (0 6)	3	11	8
GOTO End	167	? ? (0 4)	3		7
If_Clause: ISUB	100		1	4
End: ISTORE B	54	1	2	7	7
HALT	255		1	1	1
		Totals	15	36	36

Here is the machine language program:

16, 0, 21, 0, 89, 155, 0, 6, 167, 0, 4, 100, 54, 1, 255

This program requires 15 bytes (25% less than the original version) and the average execution time is 36 clock cycles (2.7% fewer than the original version). In this case, we were able to make our original program considerably smaller and slightly faster.
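
The size and timing figures can be recomputed mechanically from the per-instruction numbers in the tables. The sketch below does so for the original program and for refinement 4. The function `summarize` and the path lists are hypothetical names for this illustration, and the average assumes, as the text does, that negative and non-negative values of A are equally likely.

```python
SIZE = {"ILOAD": 2, "ISTORE": 2, "BIPUSH": 2, "IFLT": 3, "GOTO": 3,
        "ISUB": 1, "DUP": 1, "SWAP": 1, "HALT": 1}
# Clock cycles as listed in the tables; IFLT costs more when the branch is taken.
CYCLES = {"ILOAD": 6, "ISTORE": 7, "BIPUSH": 4, "GOTO": 7, "ISUB": 4,
          "DUP": 3, "SWAP": 7, "HALT": 1, "IFLT_TAKEN": 11, "IFLT_NOT_TAKEN": 8}

def summarize(code, negative_path, nonnegative_path):
    """Return (bytes, cycles if A < 0, cycles if A >= 0, average cycles)."""
    size = sum(SIZE[m] for m in code)
    neg = sum(CYCLES[m] for m in negative_path)
    nonneg = sum(CYCLES[m] for m in nonnegative_path)
    return size, neg, nonneg, (neg + nonneg) / 2

original = summarize(
    ["ILOAD", "IFLT", "ILOAD", "ISTORE", "GOTO",
     "BIPUSH", "ILOAD", "ISUB", "ISTORE", "HALT"],
    ["ILOAD", "IFLT_TAKEN", "BIPUSH", "ILOAD", "ISUB", "ISTORE", "HALT"],
    ["ILOAD", "IFLT_NOT_TAKEN", "ILOAD", "ISTORE", "GOTO", "HALT"])
refinement4 = summarize(
    ["BIPUSH", "ILOAD", "DUP", "IFLT", "GOTO", "ISUB", "ISTORE", "HALT"],
    ["BIPUSH", "ILOAD", "DUP", "IFLT_TAKEN", "ISUB", "ISTORE", "HALT"],
    ["BIPUSH", "ILOAD", "DUP", "IFLT_NOT_TAKEN", "GOTO", "ISTORE", "HALT"])
print(original)      # (20, 39, 35, 37.0)
print(refinement4)   # (15, 36, 36, 36.0)
```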

Example 6: Looping

Write a program that will implement the following Java-like program:

sum = 0;
counter = 1;
do while (counter < 11)
{
    sum = sum + counter;
    counter++;
}

In the branching example above, we saw that there are three conditional branching instructions: IFEQ, IFLT, and IF_ICMPEQ. None of these performs the test "counter < 11" that is indicated in the Java program we are trying to implement. That means we have to rethink the structure of the loop. Here are several possibilities:

while (counter - 11 < 0) do
{
    body of loop
}

until (counter - 11 == 0) do
{
    body of loop
}

until (counter == 11) do
{
    body of loop
}

In a while-do loop, the condition being tested is a continuation condition. If the test succeeds then the program branches to the first statement in the body of the loop. The last statement in the body of the loop is an unconditional branch back up to the code that performs the test. If the test fails, the next instruction in the program is usually an unconditional branch to the first statement following the body of the loop. The while loop given above might be implemented like this:

test:
    prepare for the test
    IFLT top
    GOTO end
top:
    body of loop
    GOTO test
end:
    rest of program

In an until-do loop, the condition being tested is an exit condition. (Java does not have an until-do loop.) If the test succeeds, then the program branches to the first statement following the body of the loop. Otherwise, the program goes right on into the body of the loop. As before, the last statement in the body of the loop is an unconditional branch back up to the code that performs the test. Our until-do loops might be implemented like this:

test:
    prepare for the test
    IFEQ end  (or IF_ICMPEQ end)
    body of loop
    GOTO test
end:
    rest of program

While the choice is somewhat arbitrary, let's go with an until-do loop with an IFEQ instruction:

Java	Label	Instruction	Opcode	Operand	Bytes	Time (False)	Time (True)
		BIPUSH 0	16	0	2	4	4
sum = 0;		ISTORE sum	54	0	2	7	7
		BIPUSH 1	16	1	2	4	4
counter = 1;		ISTORE counter	54	1	2	7	7
	test:	ILOAD counter	21	1	2	6	6
		BIPUSH 11	16	11	2	4	4
		ISUB	100		1	4	4
Until (counter - 11 == 0) do		IFEQ end	153	? ? (0 16)	3	8	11
{		ILOAD sum	21	0	2	6
		ILOAD counter	21	1	2	6
		IADD	96		1	4
sum = sum + counter;		ISTORE sum	54	0	2	7
counter ++;		IINC counter, 1	132	1 1	3	7
}		GOTO test	167	? ? (255 238)	3	7
	end:	HALT	255		1		1
				Totals	30

The offset for the GOTO instruction near the bottom needs some explanation. Since this instruction branches backwards the offset is negative. Count the number of bytes preceding the opcode of 167 back to and including the opcode of 21 which is the first instruction following the test label. You should get 18 bytes which means the offset is -18. We have to find the decimal equivalent of the 2's complement representation of -18:

00000000  00010010  18 in Binary
11111111  11101101  1's complement
                +1  Add 1
11111111  11101110  -18 in 2's complement form
   255      238     Decimal equivalent
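
The same conversion can be written as a short Python sketch. The function names `encode_offset` and `decode_offset` are hypothetical and chosen for this illustration. Masking with 0xFFFF yields the 16-bit two's complement pattern, which is then split into its high and low bytes.

```python
def encode_offset(offset):
    """Return the (high, low) bytes of a 2-byte signed branch offset."""
    if not -32768 <= offset <= 32767:
        raise ValueError("offset does not fit in two bytes")
    raw = offset & 0xFFFF          # two's complement in 16 bits
    return raw >> 8, raw & 0xFF

def decode_offset(high, low):
    """Inverse of encode_offset: rebuild the signed value from two bytes."""
    raw = (high << 8) | low
    return raw - 0x10000 if raw & 0x8000 else raw

print(encode_offset(-18))        # (255, 238)
print(encode_offset(16))         # (0, 16)
print(decode_offset(255, 238))   # -18
```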

Here is the final machine language program:

16, 0, 54, 0, 16, 1, 54, 1, 21, 1, 16, 11, 100, 153, 0, 16, 21, 0, 21, 1, 96, 54, 0, 132, 1, 1, 167, 255, 238, 255

This program is 30 bytes long. Taking ISTORE at 7 clock cycles, as in the if-then-else example, the preamble (everything up to the test) takes 22 clock cycles (4 + 7 + 4 + 7). Each pass through the loop takes 59 clock cycles: 22 for the failed test (6 + 4 + 4 + 8) and 37 for the body (6 + 6 + 4 + 7 + 7 + 7). The loop is executed 10 times for a total of 590 clock cycles. It takes an additional 26 clock cycles to end the program: 25 for the successful test (6 + 4 + 4 + 11) and 1 for HALT. Altogether this program requires 22 + 590 + 26 = 638 clock cycles to execute.
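
To check that the branch offsets send the program where we expect, the loop can be traced with a small interpreter. This is a sketch under stated assumptions and not the simulator itself: the name `run_with_branches` is hypothetical, local variables are modeled as a Python list indexed by offset from LV, and branch offsets are taken relative to the address of the branch opcode, as described for the OPC register.

```python
def to_signed(value, bits):
    """Interpret an unsigned value of the given width as two's complement."""
    return value - (1 << bits) if value >> (bits - 1) else value

def run_with_branches(program, local_count):
    local_vars, stack, pc = [0] * local_count, [], 0
    while True:
        op = program[pc]
        if op == 255:                                    # HALT
            return local_vars
        if op == 16:                                     # BIPUSH byte
            stack.append(to_signed(program[pc + 1], 8))
            pc += 2
        elif op == 21:                                   # ILOAD offset
            stack.append(local_vars[program[pc + 1]])
            pc += 2
        elif op == 54:                                   # ISTORE offset
            local_vars[program[pc + 1]] = stack.pop()
            pc += 2
        elif op in (96, 100):                            # IADD / ISUB
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == 96 else left - right)
            pc += 1
        elif op == 132:                                  # IINC offset, constant
            local_vars[program[pc + 1]] += to_signed(program[pc + 2], 8)
            pc += 3
        elif op in (153, 167):                           # IFEQ / GOTO
            offset = to_signed((program[pc + 1] << 8) | program[pc + 2], 16)
            taken = op == 167 or stack.pop() == 0        # IFEQ pops the tested value
            pc = pc + offset if taken else pc + 3
        else:
            raise ValueError(f"unknown opcode {op} at address {pc}")

program = [16, 0, 54, 0, 16, 1, 54, 1, 21, 1, 16, 11, 100, 153, 0, 16,
           21, 0, 21, 1, 96, 54, 0, 132, 1, 1, 167, 255, 238, 255]
result = run_with_branches(program, 2)
print(result)   # [sum, counter] -> [55, 11]
```

The final values, sum = 55 and counter = 11, confirm that the body ran ten times before IFEQ branched to HALT.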

Example 7: Methods

Machine language instruction sets have some mechanism for calling subroutines (or procedures, or functions, or methods). While the name may vary, the concept is the same. Programmers are able to write independent blocks of code that can be invoked by other blocks of code. In Java, these blocks of code are called methods. In IJVM, every method is implemented as a function whose result is left on top of the stack when the function terminates. The code that invoked the method then pops that value from the stack and uses it or, if the function is a void function, just throws it away (so to speak). The key point here is that the return value must be popped from the stack before the stack is used for anything else.

The machine language instruction INVOKEVIRTUAL (opcode decimal 182) is used to invoke a method. This instruction has a 2-byte operand that is an unsigned offset into the constant pool section of memory where the address of the method is stored (address of method = CPP + offset). The IRETURN instruction (opcode decimal 172) returns control to the invoking method. IRETURN has no operands.

Before executing the INVOKEVIRTUAL instruction, the programmer must push the parameters that will be used by the method onto the stack. Every method has at least one parameter: an object reference. Recall that in Java, an instance method is invoked by a particular instance of a class (a particular object). The object reference identifies the object that is invoking the method. Our simple IJVM doesn't actually support objects but, nevertheless, it retains this feature of the real JVM. In our case, all we need to do is declare a constant in the constant pool and use it as the object reference. No matter how many parameters may be passed to a function, the object reference must be the first one pushed onto the stack.

The remaining parameters (if any) are passed by value. Consequently, the value of the parameter is copied onto the stack. Recall that modifying a value parameter within a method has no effect on the value of the parameter in the invoking routine. In the context of C++ and Java programming, we say that changing the value of the formal parameter has no effect on the actual parameter. In Java, all parameters are value parameters.

In addition to the machine language instructions, a method has a 4-byte header; two 2-byte integers. The first integer is the number of parameters that were pushed onto the stack and the second is the number of local variables. The INVOKEVIRTUAL instruction finds a place in memory to store the values of the parameters and local variables and stores the address of this location in the LV register. INVOKEVIRTUAL needs the two 2-byte integers to know how much space is needed to store the corresponding values. In this block of memory, the values of the parameters come first (in the order that they were pushed onto the stack) and the values of the local variables follow.

Write a program that implements this Java-like code:

main
{
    p = Product(20, 30);
}

int Product(a, b)
{
    int prod = 0;
    while (b > 0)
    {
        prod = prod + a;
        b = b - 1;
    }
    return prod;
}

The machine language implementation is given below. The parameters (object reference, 20, and 30) were pushed onto the stack first so their offsets from LV in the product method are 0, 1, and 2 respectively. The local variable is at offset 3.

 MAIN
    ldc_w 0 (object ref)     19  0   0  // 0 is CPP offset of obj ref
    bipush 20                16 20
    bipush 30                16 30
    invokevirtual Product   182  0   1  // 1 is CPP offset of Product
    istore p                 54  0      // 0 is LV offset of p
    halt                    255

PRODUCT
    3 parameters              0  3      // obj ref, 20, and 30
    1 variable                0  1      // prod

    bipush 0                 16  0
    istore prod              54  3
TOP:
    iload b                  21  2
    ifeq END                153  ?   ?  (  0  16)
    iload prod               21  3
    iload a                  21  1
    iadd                     96
    istore prod              54  3
    iinc b -1               132  2 255
    goto TOP                167  ?   ?  (255 241)
END:
    iload prod               21  3
    ireturn                 172

The code for main must start at byte address zero. You can put the code for the product method anywhere you want in the code section of memory, but you must enter the corresponding address at offset 1 in the constant pool (the object reference is at offset 0). When the program terminates, the product will be stored at the address held in the LV register (which corresponds to the variable p in main at offset 0).

Here is our final machine language program:

MAIN (Start at byte 0)
19, 0, 0, 16, 20, 16, 30, 182, 0, 1, 54, 0, 255

PRODUCT
0, 3, 0, 1, 16, 0, 54, 3, 21, 2, 153, 0, 16, 21, 3, 21, 1, 96, 54, 3, 132, 2, 255, 167, 255, 241, 21, 3, 172
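
The call and return can be traced with a simplified interpreter. This is only a sketch under several assumptions that are not in the original: PRODUCT is placed immediately after MAIN (byte address 13), the object reference constant is the arbitrary value 0, and each call is modeled as a Python tuple holding the return address and the caller's variables rather than as the in-memory stack frame of the real machine. The name `run_with_calls` is hypothetical.

```python
def signed(value, bits):
    """Interpret an unsigned value of the given width as two's complement."""
    return value - (1 << bits) if value >> (bits - 1) else value

def word(code, address):
    """Read a 2-byte unsigned value, high byte first."""
    return (code[address] << 8) | code[address + 1]

def run_with_calls(code, constant_pool, main_locals):
    stack, frames = [], []          # frames: (return address, caller's variables, stack base)
    local_vars, pc = main_locals, 0
    while True:
        op = code[pc]
        if op == 255:                                    # HALT
            return local_vars
        if op == 16:                                     # BIPUSH
            stack.append(signed(code[pc + 1], 8))
            pc += 2
        elif op == 19:                                   # LDC_W
            stack.append(constant_pool[word(code, pc + 1)])
            pc += 3
        elif op == 21:                                   # ILOAD
            stack.append(local_vars[code[pc + 1]])
            pc += 2
        elif op == 54:                                   # ISTORE
            local_vars[code[pc + 1]] = stack.pop()
            pc += 2
        elif op == 96:                                   # IADD
            right = stack.pop()
            stack.append(stack.pop() + right)
            pc += 1
        elif op == 132:                                  # IINC
            local_vars[code[pc + 1]] += signed(code[pc + 2], 8)
            pc += 3
        elif op in (153, 167):                           # IFEQ / GOTO
            taken = op == 167 or stack.pop() == 0
            pc += signed(word(code, pc + 1), 16) if taken else 3
        elif op == 182:                                  # INVOKEVIRTUAL
            method = constant_pool[word(code, pc + 1)]
            n_params, n_locals = word(code, method), word(code, method + 2)
            base = len(stack) - n_params
            args = stack[base:]                          # object reference first
            del stack[base:]
            frames.append((pc + 3, local_vars, base))
            local_vars = args + [0] * n_locals           # parameters, then variables
            pc = method + 4                              # skip the 4-byte header
        elif op == 172:                                  # IRETURN
            result = stack.pop()
            pc, local_vars, base = frames.pop()
            del stack[base:]
            stack.append(result)                         # result replaces the parameters
        else:
            raise ValueError(f"unknown opcode {op} at address {pc}")

MAIN = [19, 0, 0, 16, 20, 16, 30, 182, 0, 1, 54, 0, 255]
PRODUCT = [0, 3, 0, 1, 16, 0, 54, 3, 21, 2, 153, 0, 16, 21, 3, 21, 1, 96,
           54, 3, 132, 2, 255, 167, 255, 241, 21, 3, 172]
constant_pool = [0, len(MAIN)]      # offset 0: object reference, offset 1: address of PRODUCT
result = run_with_calls(MAIN + PRODUCT, constant_pool, [0])
print(result)   # [p] -> [600]
```

Because branch offsets are relative, PRODUCT runs unchanged wherever it is placed; only the address stored at offset 1 of the constant pool has to match.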

The Stack Frame

Let's take a look at exactly what the INVOKEVIRTUAL and IRETURN statements do. There are a number of significant questions that must be answered with regard to methods:

Where do we store the code for the method?
Where do we store the data (parameters and local variables)?
How do we get back to the invoking routine?

In order to get back to the invoking routine, the computer has to keep track of the address of the first instruction following the INVOKEVIRTUAL instruction (this is called the return address). It must save the return address before loading the PC register with the address of the method (which is stored in the constant pool). When the method has terminated, the return address is loaded into the PC register and execution resumes in the invoking routine.

Modern computers store the code for a method and the data for the method in two separate locations. In our machine, the code can be stored just about anywhere. The data is stored on the system stack in what is called a stack frame.

Before executing the INVOKEVIRTUAL instruction, the function parameters must be pushed onto the stack. Remember that there is always at least one parameter; the object reference. The object reference is always pushed onto the stack first followed by the other parameters. The stack frame for a method begins at the address of the object reference parameter. While we don't need the object reference in IJVM, this memory location will be used for two purposes. Initially, it will be used to store a pointer to the return address. When the method terminates, the value returned by the method will be stored at this same location.

Here is what happens when INVOKEVIRTUAL is executed:

1. The operand of the INVOKEVIRTUAL instruction is read. The address of the method is stored at CPP + operand.

2. The value in PC is copied to OPC.

3. The address of the method (from CPP + operand) is loaded into the PC register.

4. The number of parameters is read from the method header.

5. The value of SP - number of parameters is stored in TOS. TOS contains the address of the object reference which is the base of the stack frame.

6. The number of local variables is read from the method header.

7. The SP register is incremented by the number of local variables plus one. This provides spaces in the stack frame for the local variables and for the return address though none of these spaces are initialized at this point.

8. Write the value in SP to the base of the stack frame. At this point, the object reference has been replaced by the address of the return address (a pointer to the return address).

9. The return address, itself, is written to the top of the stack (the location pointed to by SP).

10. Push the value of the LV register onto the stack frame.

11. Load the LV with the address of the base of the stack frame (which was stored in TOS back up in step 5).

12. Fetch the opcode of the first instruction in the method.

From this point on, the method executes its instructions as normal. The only real difference is that the LV register now points to the base of the local stack frame rather than the block of memory reserved for the local variables of the main method. The method must push its return value onto the stack before executing the IRETURN instruction. To be more specific, the return value must be on the top of the stack immediately above the previous value of the LV register.

The IRETURN instruction performs these tasks:

1. Read the address of the return address from the base of the stack frame.

2. Read the return address (the old PC value) and store it in the PC register.

3. Read the old LV value and store it in the LV register.

4. Reset the SP to point at the base of the stack frame (where the obj ref parameter had been and then the return address pointer).

5. Copy the return value to the new top of stack location.

6. Fetch the opcode of the first instruction following the INVOKEVIRTUAL instruction in the invoking method.