Assembly Language Format


The assembler is case-insensitive and requires no special coding format. You may lay out your assembly language programs any way you want. The only restriction is that individual program tokens must be separated by white space. As with any programming language, it is best to write no more than one instruction per line. Consistent indentation and good use of comments make your source code easier to read.

Assembly language source files are ordinary text files that are saved with a "jas" file extension. You can write your programs in any text editor or use the simulator's integrated editor/assembler. To assemble a program, load the source code into the simulator (Source menu; Load Source Code...). Then assemble the program (Source menu; Assemble Source Code; or simply tap the F2 function key).

When a program is successfully assembled, both the source code and the assembled machine language program are saved, and the machine language program is loaded into the memory of the simulator. If there is no open source code file, a "Save As" dialog will ask you to supply a file name. If you cancel this dialog, neither the source file nor the machine language file will be saved. However, the machine language code will still be loaded into the memory of the simulator.

Grammar Rules

Assembly language programs must conform to the grammar rules given below. The symbol ::= means "is defined as". Square brackets [ ] indicate that the enclosed construct is optional. Braces { } indicate that the enclosed construct may be repeated zero (0) or more times. When a construct has more than one possible form, those forms are separated by vertical bars. When a construct consists of multiple parts, those parts must appear in the order in which they are listed (from left to right).

For example, a program is defined as an optional constant section, a required main method, and an optional list of program methods. If present, the constant section must precede the main method and the list of program methods must follow the main method. A constant section is defined as the keyword ".constant" followed by an optional constant list followed by the keyword ".end-constant". Note that while there is no logical reason to have a constant section with no constants, it is not a syntax error. A constant list is defined as a constant declaration followed by 0 or more additional constant declarations (i.e., a constant list has at least one constant declaration). A constant declaration is defined as a constant identifier followed by a constant. Finally, a constant is defined as a literal integer value or a character (such as 'A').

       <program> ::= [<const_section>] <main_method> [<method_list>]
 <const_section> ::= .constant [<const_list>] .end-constant
    <const_list> ::= <const_dec> { <const_dec> }
     <const_dec> ::= <constId> <constant>
      <constant> ::= integer | character
   <main_method> ::= .main <body> .end-main
   <method_list> ::= <method> { <method> }
        <method> ::= .method <methodId> ( [<param_list>] ) <body> .end-method
          <body> ::= [<var_section>] <statement_list>
   <var_section> ::= .var [<id_list>] .end-var
       <id_list> ::= <varId> { <varId> }
    <param_list> ::= <varId> { , <varId> }
<statement_list> ::= <statement> { <statement> }
     <statement> ::= [<label>:] <instruction>
   <instruction> ::= bipush byte
                   | dup
                   | err
                   | goto <label>
                   | halt
                   | iadd
                   | iand
                   | ifeq <label>
                   | iflt <label>
                   | if_icmpeq <label>
                   | iinc <varId> byte
                   | iload <varId>
                   | in
                   | invokevirtual <methodId>
                   | ior
                   | ireturn
                   | istore <varId>
                   | isub
                   | ldc_w <constId>
                   | nop
                   | out
                   | pop
                   | swap
         <label> ::= <id>
         <varId> ::= <id>
      <methodId> ::= <id>
       <constId> ::= <id>
            <id> ::= letter { _ | letter | digit }
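
As a rough illustration of how these rules fit together, the sketch below is a small program with a constant section and a main method that has a variable section. The identifiers limit and total are hypothetical names chosen for this example, and the comments point back to the grammar rules each line is meant to match.

```jas
.constant
    limit 100           // <const_dec>: a <constId> followed by an integer
.end-constant

.main
    .var
        total           // <id_list>: one or more <varId> names
    .end-var
    ldc_w limit         // operand is a <constId>
    istore total        // operand is a <varId>
    halt
.end-main
```

The constant section comes first and the variable section precedes the statements, which is the order the grammar requires.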

Comments

C++ style comments are supported. Comments begin with a double forward slash (//) and continue to the end of the line. Multi-line comments are not supported. Comments are totally ignored by the assembler but can make your programs easier to read.
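
A short sketch of both comment placements follows. The instructions themselves are arbitrary and only serve to carry the comments.

```jas
// A comment may occupy a whole line.
.main
    bipush 5    // A comment may also follow an instruction on the same line.
    halt        // Everything from the slashes to the end of the line is ignored.
.end-main
```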

 

Identifiers

All identifiers must start with a letter and contain only letters, numerals, and underscore characters.
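
The sketch below shows a few identifiers that should satisfy this rule, along with some that should not. All of the names are hypothetical and were chosen only for illustration.

```jas
.main
    .var
        total           // letters only
        x1              // letters followed by a digit
        loop_count      // an underscore is allowed after the first letter
    .end-var
    // Names such as 1st, _tmp, or my-var would not match the rule:
    // the first starts with a digit, the second starts with an
    // underscore, and the third contains a hyphen.
    halt
.end-main
```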

 

Constant Section

The constant section is optional. If present, it must come first in the program. Here is the general form for the constant section:

General Format
.constant
    id1 value1
    id2 value2
    ...
.end-constant

Example
.constant
    one 1
    a 'a'
.end-constant

The value of a constant must be a signed decimal integer or a single character enclosed in single quotes.

 

Main Method

Every assembly language program must have a main method and there must be at least one instruction in the method. Every possible path through the main method code should end with a HALT statement. Ideally, the HALT statement should be the last statement in the method (for the sake of readability) but that is not a requirement.

General Format
.main
    [variable declarations]
    executable code
.end-main
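
The following sketch shows a main method with two possible paths, each of which ends with a HALT statement. The names flag, result, and off are hypothetical, and the comment on IFEQ assumes the usual IJVM behavior (pop a value and branch if it is zero), which is described in the instruction set section rather than on this page.

```jas
.main
    .var
        flag
        result
    .end-var
        bipush 0
        istore flag
        iload flag
        ifeq off            // assumed: pops the value and branches if it is zero
        bipush 1            // path taken when flag is not zero
        istore result
        halt                // the first path ends here
off:    bipush 2            // path taken when flag is zero
        istore result
        halt                // the second path ends here
.end-main
```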

 

Variable Declarations

Variable declarations are optional. A method does not have to use any variables. If present, the variable declarations must precede the executable code in the method.

General Format
.var
    variable list
.end-var

Example
.var
    time
    speed
    distance
.end-var

The variable list is a whitespace-delimited list of identifiers.

 

Instruction Format

The instructions that have operands are listed below with the general format for the instruction and an example or two.

BIPUSH

General Format
bipush byte

Examples
bipush 5
bipush -4

The byte operand is a decimal value in the range -128 to 127.

GOTO, IFEQ, IFLT, IF_ICMPEQ

General Format
goto label
ifeq label
iflt label
if_icmpeq label

Examples
goto top
ifeq top
iflt top
if_icmpeq top

The label operand is a program label in the same method as the branching instruction. A program label is an identifier followed (immediately) by a colon. A program label corresponds to the address of the opcode of the first instruction following the label.
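
To show labels and branches working together, here is a sketch of a simple countdown loop. The names n, top, and done are hypothetical, and the comment on IFEQ assumes the usual IJVM behavior (pop a value and branch if it is zero).

```jas
.main
    .var
        n
    .end-var
        bipush 3
        istore n            // n = 3
top:    iload n             // top: refers to the address of this iload opcode
        ifeq done           // assumed: pops n and leaves the loop when it is zero
        iinc n -1           // n = n - 1
        goto top            // branch back to the label in the same method
done:   halt
.end-main
```

Both labels are defined in the same method as the instructions that branch to them, as the rule above requires.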

ILOAD, ISTORE

General Format
iload varId
istore varId

Examples
iload speed
istore distance

The varId operand is a variable or parameter identifier declared in the same method as the instruction.

IINC

General Format
iinc varId byte

Examples
iinc n 1
iinc time -1

The varId operand is a variable or parameter identifier declared in the same method as the instruction.
The byte operand is a decimal value in the range -128 to 127.

INVOKEVIRTUAL

General Format
invokevirtual methodId

Example
invokevirtual add

The methodId operand is a method identifier. While not a requirement of the assembler, a method definition should be placed in the program after a method in which it is invoked. Otherwise, the disassembler will not be able to properly disassemble the machine language program.

LDC_W

General Format
ldc_w constId

Example
ldc_w one

The constId operand is a constant identifier.
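
The sketch below ties LDC_W to the constant section. One plausible use, assumed here rather than stated on this page, is pushing a value that falls outside the -128 to 127 range accepted by BIPUSH. The constant names big and letter are hypothetical.

```jas
.constant
    big 1000            // too large to be a bipush operand
    letter 'A'          // a character constant
.end-constant

.main
    ldc_w big           // push the value of the constant named big
    ldc_w letter        // push the value of the constant named letter
    halt
.end-main
```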

 

Method Section

The method section is optional. If present, it must follow the main method. The method section consists of one or more methods having this general format:

General Format
.method methodId ( paramList )
    [variable declarations]
    executable code
.end-method

The paramList is a comma-delimited list of identifiers. If there are no parameters, the parameter list can be left empty but the parentheses are still required.
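
The following sketch shows a method with two parameters together with a main method that invokes it. It rests on two assumptions that this page does not confirm: that the caller pushes a placeholder object reference before the arguments (as in Tanenbaum's description of IJVM), and that IRETURN leaves the returned value on the caller's stack. Check the instruction set section for the calling convention the simulator actually uses. The names sum, a, and b are hypothetical.

```jas
.main
    .var
        sum
    .end-var
    bipush 0            // ASSUMPTION: placeholder object reference
    bipush 2            // first argument
    bipush 3            // second argument
    invokevirtual add
    istore sum          // ASSUMPTION: the return value is on top of the stack
    halt
.end-main

.method add ( a, b )    // defined after main, the method that invokes it
    iload a
    iload b
    iadd
    ireturn
.end-method
```

A method with no parameters would still be written with the parentheses, for example .method reset ( ), where reset is again a hypothetical name.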

 

Example Programs

A number of example programs are supplied with the simulator. The section on the instruction set has links to example programs for each instruction except NOP (which does nothing). You'll also find example programs in the assembly language tutorials.