
The PowerPC 604 RISC Microprocessor

The PowerPC 604 RISC microprocessor uses out-of-order and speculative execution techniques to extract instruction-level parallelism. Its nonblocking execution pipelines, fast branch misprediction recovery, and decoupled memory queues support speculative execution.

S. Peter Song
Marvin Denman
Joe Chang
Somerset Design Center

IEEE Micro, 0740-7475/94/$04.00 © 1994 IEEE

The 604 microprocessor is the third member of the PowerPC family being developed jointly by Apple, IBM, and Motorola. Developed for use in desktop personal computers, workstations, and servers, this 32-bit implementation works with the software and bus in the PowerPC 601 and 603 microprocessors. While keeping the system interface compatible with the 601 microprocessor, we improved upon it by incorporating a phase-locked loop and an IEEE-Std 1149.1 boundary-scan (JTAG) interface on chip. In addition, an advanced machine organization delivers one and a half to two times the 601's integer performance.

Performance strategy

Processor performance depends on three factors: the number of instructions in a task, the number of cycles a processor takes to complete the task, and the processor's frequency. Our architecture, which we optimized to produce compact code while adhering to the reduced instruction set computer (RISC) philosophy, addresses the first factor. The high instruction execution rate and clock frequency address the other two factors. The 604 provides deep pipelines, multiple execution units, register renaming, branch prediction, speculative execution, and serialization.

Six-stage superscalar pipeline. As shown in Figure 1, this deep pipeline enables the 604 to achieve its 100-MHz design. The stages are

• Fetch. This stage translates an instruction fetch address and accesses the cache for up to four instructions.
• Decode. Instruction decoding determines needed resources, such as source operands and an execution unit.
• Dispatch. When the resources are available, dispatch sends instructions to a reservation station in the appropriate execution unit. A reservation station permits an instruction to be dispatched before all of its operands are available. As they become available, the reservation station forwards operands to the execution units. Dispatch can send up to four instructions in program order (in-order dispatch) to four of six execution units: two single-cycle integer units, a multicycle integer unit, a load/store unit, a floating-point unit, and a branch unit.
• Issue/execute. In each execution unit, this stage issues one instruction from its reservation station and executes it to produce results. The instructions can execute out of program order (out-of-order execution) across the six execution units as well as within an execution unit that has an out-of-order issue reservation station. Table 1 lists the latency and throughput of the execution stages.

• Completion. An instruction is said to be finished when it passes the execute stage. A finished instruction can be completed 1) if it does not cause an exception condition and 2) when all instructions that appear earlier in program order complete. This is known as in-order completion.
• Write back. This stage writes the results of completed instructions to the architectural state (the state that is visible to the programmer). Bypass logic permits most instructions to complete and write back in one cycle.

[Figure 1. Pipeline description: the stages traversed by branch, integer, load/store, and floating-point instructions.]

Although some designs use even deeper pipelines to achieve higher clock frequencies than the 604 does, we felt that such a design point does not suit today's personal computers. It relies too heavily on one of, or a combination of, a very large on-chip cache, a wide data bus, or a fast memory system to deliver its performance. It would be less than competitive in today's cost-sensitive personal computer market.

Precise interrupts and register renaming. Most programmers expect a pipelined processor to behave as a nonpipelined processor, in which one instruction goes through the fetch to write-back stages before the next one begins. A processor meets that expectation if it supports precise interrupts, in which it stops at the first instruction that should not be processed. When it stops (to process an interrupt), the processor's state reflects the results of executing all instructions prior to the interrupt-causing instruction and none of the later instructions, including the interrupt-causing instruction.

This is not a trivial problem to solve in multiple, out-of-order execution pipelines.
An earlier instruction executing after a later instruction can change the processor's state to make later instruction processing illegal. Sohi gives a general overview of the design issues and solutions.

The 604 uses a variant of the reorder buffer described by Smith and Pleszkun to implement precise interrupts. The 16-entry reorder buffer keeps track of instruction order as well as the state of the instructions. The dispatch stage assigns each instruction a reorder buffer entry as it is dispatched. When the instruction finishes execution, the execution unit records the instruction's execution status in the assigned reorder buffer entry. Since the reorder buffer is managed as a first-in/first-out queue, its examining order matches the instruction flow sequence. To enforce in-order completion, all prior instructions in the reorder buffer must complete before an instruction can be considered for completion. The reorder buffer examines four entries every cycle to allow completion of up to four instructions per cycle.

Table 1. 604 execution timings.

Instruction                               Latency  Throughput
Most integer                                    1           1
Integer multiply (32x32)                        4           2
Integer multiply (others)                                   1
Integer divide                                 20          19
Integer load                                    2           1
Floating-point load                             3           1
Store                                           3           1
Floating-point multiply-add                     3           1
Single-precision floating-point divide         18          18
Double-precision floating-point divide         31          31

Unlike Smith and Pleszkun's reorder buffer, the 604's reorder buffer does not store instruction results. Temporary buffers hold them until the instructions that generated them complete. At that time, the write-back stage copies the results to the architectural registers. The 604 renames registers to achieve this: instead of being written directly to the specified registers, results are written to rename buffers and later copied to the specified registers. Since instructions can execute out of order, their results can also be produced and written out of order into the rename buffers.
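The in-order completion rule described above can be illustrated with a small model. This is only a sketch under stated assumptions: the names `RobEntry` and `complete_cycle` are hypothetical, and the model ignores the additional completion restrictions and the 16-entry capacity limit. It shows that finished instructions wait behind an unfinished older instruction and that at most four complete per cycle.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class RobEntry:
    """One reorder buffer entry (hypothetical, simplified layout)."""
    tag: str
    finished: bool = False   # set when the execution unit reports status
    exception: bool = False  # set if execution raised an exception


def complete_cycle(rob, width=4):
    """Complete up to `width` entries from the head of the FIFO, in order.

    Completion stops at the first entry that has not finished or that
    finished with an exception, so younger entries never complete first.
    """
    completed = []
    while rob and len(completed) < width:
        head = rob[0]
        if not head.finished or head.exception:
            break
        completed.append(rob.popleft().tag)
    return completed


if __name__ == "__main__":
    rob = deque(RobEntry(t) for t in ["i0", "i1", "i2", "i3", "i4"])
    for entry in list(rob)[1:]:
        entry.finished = True      # i1..i4 finish out of order, before i0
    print(complete_cycle(rob))     # nothing completes while i0 is pending
    rob[0].finished = True
    print(complete_cycle(rob))     # four oldest complete; i4 waits a cycle
```

The FIFO discipline is what makes interrupts precise in this model: an entry that finished with an exception blocks the head, so no younger instruction can reach the architectural state.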
Results are nevertheless copied from the rename buffers to the specified registers in program order. Register renaming minimizes architectural resource dependencies, namely the output dependency (or write-after-write hazard) and antidependency (or write-after-read hazard), that would otherwise limit opportunities for out-of-order execution.

Figure 2 depicts the format of a rename buffer entry. The 604 contains a 12-entry rename buffer for the general-purpose registers (GPRs) that are used for 32-bit integer operations. The 604 allocates a GPR rename buffer entry upon dispatch of an instruction that modifies a GPR. The dispatch stage writes the destination register number of the instruction to the Reg num field, sets the Rename valid bit, and clears the Result valid bit. When the instruction executes, the execution unit writes its result to the Result field and sets the Result valid bit. After the instruction completes, the write-back stage copies its result from the rename buffer entry to the GPR specified by the Reg num field, freeing the entry for reallocation. For a load-with-update instruction that modifies two GPRs, one for load data and another for the address, the 604 allocates two rename buffer entries.

[Figure 2. Rename buffer entry format. Fields: Rename valid, Reg num, Result, Result valid.]

Register renaming complicates the process of locating the source operands for an instruction, since they can also reside in rename buffers. In dispatching an integer instruction, the dispatch stage looks up its source operands simultaneously in the GPR file and its rename buffer. If a source operand has not been renamed, the processor uses the value read from the GPR file. If a rename exists (indicated by an entry with the Rename valid bit set and its Reg num field matching the source register number), the Result in the rename buffer is used. It is, however, possible that the result is not yet valid because the instruction that produces the GPR value has not yet executed. The dispatch stage still dispatches the instruction, since the operand will be supplied by the reservation station when the result is produced. The dispatched instruction contains the rename buffer entry identifier in place of the operand. The GPR file and its rename buffer can use eight read ports for source operands to support dispatching of four integer instructions each cycle.

The 604 also uses a rename buffer for floating-point registers (FPRs) and one more for the condition register (CR). The FPR rename buffer has eight 90-bit-wide entries to hold a double-precision result with its data type and exception status.
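The GPR rename buffer life cycle described above (allocate at dispatch, fill at execute, copy at write back, search at operand lookup) can be sketched as follows. This is an illustrative model, not the 604's logic: the class and method names are hypothetical, and the "newest matching entry wins" rule in `read_operand` is an assumption that the text does not spell out.

```python
from dataclasses import dataclass


@dataclass
class RenameEntry:
    """Fields follow the rename buffer entry format in Figure 2."""
    rename_valid: bool = False
    reg_num: int = 0
    result: int = 0
    result_valid: bool = False


class GprRename:
    def __init__(self, entries=12, regs=32):
        self.gpr = [0] * regs                      # architectural GPR file
        self.buf = [RenameEntry() for _ in range(entries)]
        self.order = []                            # live entries, oldest first

    def allocate(self, reg_num):
        """Dispatch: claim a free entry, or return None to stall dispatch."""
        for i, entry in enumerate(self.buf):
            if not entry.rename_valid:
                self.buf[i] = RenameEntry(True, reg_num, 0, False)
                self.order.append(i)
                return i
        return None

    def write_result(self, idx, value):
        """Execute: the execution unit fills in the result."""
        self.buf[idx].result = value
        self.buf[idx].result_valid = True

    def read_operand(self, reg_num):
        """Dispatch lookup: a value if available, else the entry identifier."""
        for i in reversed(self.order):             # assumption: newest wins
            entry = self.buf[i]
            if entry.reg_num == reg_num:
                if entry.result_valid:
                    return ("value", entry.result)
                return ("tag", i)                  # wait in reservation station
        return ("value", self.gpr[reg_num])        # not renamed: use GPR file

    def write_back(self, idx):
        """Write back: copy the result to the GPR and free the entry."""
        entry = self.buf[idx]
        self.gpr[entry.reg_num] = entry.result
        entry.rename_valid = False
        self.order.remove(idx)
```

Returning a tag instead of a value is the point at which renaming hands off to the reservation stations: the dispatched instruction carries the entry identifier and picks up the result when it appears on a result bus.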
The FPR file and its rename buffer access three read ports for dispatching one floating-point instruction per cycle.

In addition to compare instructions, most integer and floating-point instructions can also generate negative, positive, zero, and overflow condition results. One of the eight fields in the 32-bit CR stores these 4-bit condition results. The 604 treats each field as a 4-bit register and applies register renaming using an eight-entry CR rename buffer.

Branch prediction and speculative execution. Because today's application software contains a high percentage of branch instructions, correctly predicting the outcome of these instructions is crucial to keeping the multiple instruction pipelines flowing and to achieving two to three times the execution rate of scalar processors. The 604 uses dynamic branch prediction in the fetch, decode, and dispatch stages to predict as well as correct branch instructions early.

The 604's speculative execution strategy complements its branch prediction mechanisms. The strategy is to fetch and execute beyond two unresolved branch instructions. The results of these speculatively executed instructions reside in rename buffers and in other temporary registers. If the prediction is correct, the write-back stage copies the results of speculatively executed instructions to the specified registers after the instructions complete.

Upon detection of a branch misprediction, the 604 takes quick action to recover in one cycle. It selectively cancels the instructions that belong to the mispredicted path from the reservation stations, execution units, and memory queues. It also discards their results from the temporary buffers.
In addition, the processor resumes its previous state to start executing from the correct path even before the mispredicted branch and its earlier instructions have completed. Since the 604 detects a branch misprediction many cycles before the branch instruction completes, its fast recovery scheme helps to maintain the performance of applications that have high data cache miss rates and branches that are difficult to predict.

Serialization. A serialization mechanism delays execution of certain instructions that would otherwise be expensive to execute speculatively in the 604's multiple-pipeline, out-of-order execution design. This mechanism delays infrequently used instructions until they can safely execute while permitting later instructions to execute. Some examples are the move to and from special-purpose register instructions, the extended arithmetic instructions that read the carry bit, and the instructions that directly operate on the CR, which the PowerPC architecture provides for calculating complex branch conditions. This mechanism also controls store instructions, since it is difficult to undo stores.

The dispatch stage sends a serialized instruction to the proper execution unit with an indication that it should not be executed. When all prior instructions have completed and updated their results to the architectural state, the completion stage allows the serialized instruction to execute. Once the serialized instruction is dispatched, dispatch continues to dispatch the following instructions so they can execute before the serialized instruction. When the serialized instruction is completed, the later instructions also complete upon finishing execution. This minimizes the penalty of serialized instructions.

Machine organization

Figure 3 shows the fetch address generation logic. The fetch stage selects an address from the addresses generated in the different pipeline stages each cycle.
Since an address generated in a later stage belongs to an earlier instruction, it takes priority over an address from an earlier stage.

The completion stage detects exception conditions and generates an exception handler address. This stage also updates the program counter (PC) with the target address of a taken branch instruction, or advances it by the number of instructions being completed. The branch execute stage may correct the instruction fetch with the branch target address if the branch is mispredicted as not taken and with the sequential address if the branch is mispredicted as taken. The dispatch and decode stages may change the fetch address with either the target or sequential address of a branch instruction being predicted. There are two copies of the target and sequential address registers in the decode, dispatch, and execute stages, since there can be up to two branch instructions in each stage. The completion stage also has two target registers to handle up to two finished branch instructions.

If the fetch address hits in the branch target address cache (BTAC), the target address becomes the fetch address. Otherwise, the instruction fetch continues sequentially. The 64-entry, fully associative BTAC holds the target addresses of the branches that are predicted to be taken. If a branch is predicted as not taken for its next encounter, the branch execute stage removes it from the BTAC. The BTAC is accessed with the fetch address, and not with a branch instruction address, providing a zero-cycle fetch penalty for taken branches. Although there may be multiple branch instructions in the four instructions being fetched, the BTAC provides the target address of the first predicted-taken branch instruction.

Instruction decode and dispatch. The pipeline decodes four instructions every cycle to determine exception conditions, as well as the resources needed by the instructions. The resources include the execution unit, source operands, and destination registers.
Decoding the instructions before the dispatch stage simplifies the dispatch logic without using predecoded bits in the instruction cache. Predicting branch instructions in the first two entries of the decode buffer minimizes the performance penalty of adding the decode stage.

When the decode stage detects an unconditional branch that was not in the BTAC, it corrects the instruction fetch to the target address of the branch. It also predicts conditional branches with the execution history found in the branch history table (BHT). Each entry of the 512-entry BHT denotes one of the four history states: strong not taken, weak not taken, weak taken, and strong taken. The table updates the history state with the actual outcome of the branch that is mapped to the entry. To simplify the design, each entry in the BHT maps to every 512th instruction address. This allows the BHT to be accessed with the fetch address and to return the four entries mapped to the four instructions starting with the fetch address.

[Figure 3. Instruction fetch address generation logic. BHT: branch history table; BTAC: branch target address cache; FAR: fetch address register; PC: program counter; SEQ: sequential address.]

Not all conditional branch instructions use the BHT. The architecture provides a count register (CTR) value as a branch condition to support loops in programs. Only the conditional branch instructions that do not depend on the CTR value use, as well as update, the BHT. Those that do depend on the CTR are predicted based on the value of the shadow CTR. The shadow CTR has a future state of the CTR that is updated by speculatively executed move-to-CTR or branch-and-decrement instructions. This prediction scheme is very effective on branches that control loop iteration.

The dispatch buffer sends up to four instructions to four of the six execution units each cycle. As space allows, more instructions advance from the decode stage into the four-entry dispatch buffer. The 604 places only a few restrictions on dispatch to enable a high-speed implementation. They are

1) One instruction per execution unit. Since each execution unit can start only one instruction per cycle and an instruction can bypass the reservation station if the execution unit is available, dispatching one instruction per unit simplifies the logic without imposing an undue performance penalty. Two identical single-cycle integer units handle the more frequent instructions.
2) Resources available. Each instruction needs a reorder buffer entry, a reservation station entry in the appropriate execution unit, and rename buffer entries to hold its results. Available resources depend on the state of the instructions previously dispatched as well as those currently being dispatched.
3) Stop dispatch after branch. Instructions following a branch instruction are not dispatched in the same cycle as the branch is dispatched. This restriction simplifies saving the processor state, which allows immediate canceling of speculatively executed instructions that follow predicted branches.
4) In-order dispatch. Dispatching instructions in order results in only a small cost in performance and greatly simplifies resource allocation and dispatch logic. Out-of-order execution is introduced with six independent execution pipelines and out-of-order issue reservation stations to achieve performance comparable to an out-of-order dispatch design.

Reservation stations and result forwarding. A two-entry reservation station on every execution unit allows instructions to be dispatched before obtaining all of their operands.
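Looking back at the branch history table described earlier in this section, its four history states can be modeled as a two-bit saturating counter per entry. The sketch below rests on several assumptions: the text names the four states but not the exact transitions, so the saturating-counter update is a common textbook choice rather than the documented 604 behavior; the initial state, the class name, and the method names are likewise hypothetical. Indexing by word address modulo 512 follows the statement that each entry maps to every 512th instruction address.

```python
STRONG_NOT_TAKEN, WEAK_NOT_TAKEN, WEAK_TAKEN, STRONG_TAKEN = range(4)


class BranchHistoryTable:
    def __init__(self, entries=512, initial=WEAK_NOT_TAKEN):
        # The initial state is an assumption; the article does not give one.
        self.state = [initial] * entries

    def index(self, addr):
        """Map a byte address to an entry: 4-byte instructions, modulo size."""
        return (addr >> 2) % len(self.state)

    def predict(self, addr):
        """Predict taken in either of the two 'taken' states."""
        return self.state[self.index(addr)] >= WEAK_TAKEN

    def update(self, addr, taken):
        """Move one step toward the actual outcome, saturating at the ends."""
        i = self.index(addr)
        if taken:
            self.state[i] = min(self.state[i] + 1, STRONG_TAKEN)
        else:
            self.state[i] = max(self.state[i] - 1, STRONG_NOT_TAKEN)

    def predict_group(self, fetch_addr, count=4):
        """Return predictions for the instructions starting at fetch_addr."""
        return [self.predict(fetch_addr + 4 * k) for k in range(count)]
```

With this kind of counter, a loop-closing branch that is almost always taken stays in a taken state across a single not-taken outcome. Because the table is untagged, two branches whose addresses are 512 instructions apart share one entry.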
Without a reservation station, an instruction cannot be dispatched until all of its source operands become available, either in the register file or in its rename buffer. Without reservation stations, the 604's in-order dispatch design would be more complex, since it would have to detect data dependencies and would frequently stall. The reservation stations in the three integer units can issue instructions out of order to allow a later instruction to bypass an earlier stalled instruction. The branch, load/store, and floating-point unit reservation stations may only issue instructions in order.

Each execution unit provides one result bus for each type of result it produces. For instance, the multicycle integer unit has one result bus for the GPR and another for the CR data types. Figure 4 shows the four GPR result buses and the reservation stations and GPR rename buffer that are connected to them. Each GPR reservation station entry monitors all four GPR result buses for any missing A or B operands, denoted as A op and B op in the figure. When an execution unit returns a result and the associated GPR rename buffer entry identifier, the reservation station compares the identifier against those in its entries. When a match is found, it forwards the result to the waiting instruction. For returning the update address of a load-with-update instruction while executing one load instruction per cycle, the load/store unit shares the result bus of the less frequently used multicycle integer unit.

It is interesting to note why the 604 uses a reorder buffer, rename buffers, and reservation stations to provide the same functions that a DRIS (deferred-scheduling, register-renaming instruction shelf) in the Metaflow architecture provides. A DRIS entry consists of instruction status fields that a reorder buffer entry would have, source operand fields that a reservation station entry would have, and destination fields that a rename buffer entry would have.
(The 604's reservation station entry uses a separate source operand field to store either an immediate or a copy of the source operand. Although the DRIS figure in the Metaflow article shows only the ID field to indicate the DRIS entry with the source operand, it is likely that another field is needed to store an immediate operand.) Two disadvantages of the DRIS, had we used it in the 604 design, are

• Scheduling overhead. The DRIS instruction scheduling is more complicated than the 604's dedicated reservation stations, since the next instruction for an execution unit must be the first "ready" instruction of the "right" type in all DRIS entries.
• Single result type. DRIS supports renaming of only one register type, whereas the 604 needs three. Say that more than one DRIS is used, as described in Popescu et al., to support separate integer and floating-point registers. One of them would have to house all instructions to provide precise interrupts while not being able to provide register renaming. An alternative is to design one DRIS to accommodate the largest register type.

Execution units. The branch execution unit can hold two branch instructions in its reservation station and two more finished branch instructions. It serves to validate branch predictions made in earlier stages, and also verifies that the predicted target matches the actual target address. If a misprediction is detected, the branch execution unit redirects the fetch to the correct address and starts the branch misprediction recovery.

The 604 has a three-stage complex integer unit (CFX) to execute integer multiply, divide, and all move to and from special-purpose register instructions. The CFX can sustain one multiply instruction per cycle for 32x16-bit multiplies and for those 32x32-bit multiplies whose B operand is representable as a 17-bit signed integer. It can sustain one multiply per two cycles for larger 32x32-bit operands. The CFX also uses the multiply pipeline stages to execute a divide instruction in 20 cycles. The 604's two simple integer units execute all other integer instructions in one cycle.

The three-stage floating-point unit can sustain one double-precision multiply-add per cycle, one single-precision divide every 18 cycles, or one double-precision divide every 31 cycles. It complies fully with the IEEE-Std 754 floating-point arithmetic standard. The 604 provides hardware support for denormalization, exceptions, and three graphics instructions. It also provides a non-IEEE mode for graphics support. The non-IEEE mode converts a denormalized result to zero to avoid prenormalization in subsequent operations.

Instruction completion and write back. After an instruction executes, the execution unit copies results to its rename buffer entries and the execution status to its reorder buffer entry. Among other things, the execution status indicates whether the instruction finished execution without an exception. Of the four reorder buffer entries examined every cycle, up to four instructions that finished without an exception complete in program order.

Other than the in-order completion necessary to support precise exceptions, the 604 imposes only a few additional restrictions on instruction completion. They are

1) Stop before a store instruction. Since a store data operand is read from the register file in the completion stage, a store instruction cannot complete if its store operand is still in the rename buffer. Stopping completion before a store instruction allows the store operand to be written to the register file, even if it is produced by an instruction currently being completed.
2) Stop after a taken branch instruction. Since a taken branch instruction sets the program counter to its target address, it is speed critical to advance the program counter from the new target address by the number of instructions completed after the taken branch in one cycle. Stopping completion after a taken branch instruction avoids this logic altogether.

To minimize the effects of long execution latency on in-order completion, the completion stage overlaps with the last execution cycle for those instructions with a multicycle execution stage. These include the multiply, divide, store, load miss, and execution-serialized instructions. A store instruction completes as soon as it is translated without an exception. Similarly, a load instruction that misses in the data cache completes upon translation without an exception. Since the reservation stations can forward the load data to the dependent instructions when it is available, the load miss can safely be completed.

[Figure 4. GPR result buses, reservation stations, and rename buffer. GPR: general-purpose register; Op: operand.]

Most superscalar designs impose additional restrictions due to a limited number of ports to register files. For instance, four write ports would be required to complete up to four instructions if each one can update one register. The 604 GPR file would require eight write ports to complete four load-with-update instructions per cycle. The 604 avoids this problem by decoupling instruction completion from register file updates using the write-back stage. Instructions complete without regard to the type or number of registers they update. The completion stage updates their results if ports are available; if not, the write-back stage updates them. The rename buffer entries function as temporary buffers for those instructions that are not completed and as write-back buffers for those that are. All three GPR, FPR, and CR rename buffers contain two read ports for write back.
Correspondingly, the three register files have two write ports for write back.

Memory operations

High-speed superscalar processors require a greater memory bandwidth to sustain their performance. The 604 meets the increased demand with large on-chip caches, nonblocking memory operations, and a high-bandwidth system interface. The 604 takes advantage of the weakly ordered memory model, to which the PowerPC architecture subscribes, to offer efficient memory operations. Although loads and stores that hit in the data cache can bypass earlier loads and stores, program order memory access can be enforced with instructions provided for this purpose.

Load/store unit. Figure 5 shows a block diagram of the load/store unit and the memory queues. This unit has a two-cycle execution stage. It calculates the memory address and translates that address with a 128-entry, two-way set-associative translation lookaside buffer (TLB) in the first cycle. The second cycle processes loads, making a speculative cache access and aligning bytes when the access hits in the cache. The pipelined execution stage executes one load or store instruction per cycle.

[Figure 5. Load/store unit and memory queues.]

In the first half cycle, the load/store unit calculates a load instruction's memory address, denoted as EA in the figure. It translates this into the real address, denoted as RA in the figure, and the data cache access begins in the second half cycle. If the access hits in the cache, the unit aligns the data and forwards it to the rename buffer and the execution units in the second cycle. If the access misses, the unit places the instruction and its real address in the four-entry load queue. When a load miss completes, it accesses the cache a second time. If the load is still a miss, the unit moves it to the load miss register while reloading the missing cache line. This permits a second load miss to access the cache and to initiate the second cache line reload before the first is brought in.

The unit calculates the memory address of a store and translates it in the first cycle. It does not write the data to the cache, however, until after the store instruction completes. The unit places the instruction and its real address in the six-entry store queue. Since the data cache is not accessed in the second cycle, it is available for an earlier store from the store queue (if necessary) or a load miss from the load queue (if necessary).

When a store instruction completes, the load/store unit marks it completed in the store queue so that instruction completion can continue without waiting for storage to the cache or memory. If the store hits in the cache, the unit writes it to the cache and removes it from the store queue. If the store is a miss, the unit will bypass it in the store queue to allow later stores to take place while cache reloading proceeds. Multiple store misses can be bypassed in the store queue.

Figure 6 shows the store queue structure. Four pointers identify the state of the store instructions in the circular store queue. When a store has finished execution (or successful translation), the load/store unit places it in the finished state. When it completes, the finish pointer advances to place it in the completed state. When it is committed to cache or memory, the completion pointer advances to place it in the committed state. If the store hits in the cache, advancing the commit pointer removes it from the queue. If the store is a miss, the commit pointer does not advance until the missing cache line is reloaded and the store is written to the cache. While the cache line is being reloaded, the next store indicated by the completion pointer can access the cache. If this store hits in the cache, the unit removes it from the queue. If it misses, another cache line reload begins.

Caches. The 604 provides separate instruction and data caches to allow simultaneous instruction and data accesses every cycle. Both 16-Kbyte caches provide byte parity protection and a four-way set-associative organization with 32-byte lines. They are indexed with physical addresses, have physical tags, and make use of the least recently used replacement algorithm.

The instruction cache provides a 16-byte interface to the fetch unit to support the four-instruction dispatch design. This nonblocking cache allows subsequent instructions to be fetched while a prior cache line is being reloaded. This design is particularly beneficial if the missing cache line belongs to a mispredicted path, since it allows the correct instructions to be accessed immediately after a branch misprediction recovery.
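The cache geometry given above (16 Kbytes, four-way set-associative, 32-byte lines, physically indexed and tagged) fixes how a 32-bit physical address divides into tag, set index, and byte offset. The following worked arithmetic is a sketch: the constant and function names are hypothetical, and the bit layout assumes the conventional arrangement with the offset in the low-order bits.

```python
CACHE_BYTES = 16 * 1024   # capacity of each cache
WAYS = 4                  # four-way set-associative
LINE_BYTES = 32           # cache line size

SETS = CACHE_BYTES // (WAYS * LINE_BYTES)   # 16384 / 128 = 128 sets
OFFSET_BITS = LINE_BYTES.bit_length() - 1   # log2(32)  = 5
INDEX_BITS = SETS.bit_length() - 1          # log2(128) = 7
TAG_BITS = 32 - OFFSET_BITS - INDEX_BITS    # remaining 20 bits


def split_address(physical_addr):
    """Split a 32-bit physical address into (tag, set index, byte offset)."""
    offset = physical_addr & (LINE_BYTES - 1)
    index = (physical_addr >> OFFSET_BITS) & (SETS - 1)
    tag = physical_addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


if __name__ == "__main__":
    tag, index, offset = split_address(0x12345678)
    print(hex(tag), hex(index), hex(offset))
```

The offset and index together span 12 bits, and the 16-byte fetch interface corresponds to four 4-byte instructions, which is half of one 32-byte line.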
The instruction cache also provides streaming, a mechanism to forward instructions as they are received from off-chip cache or memory.

The instruction cache does not maintain coherency; instead, the architecture provides a set of instructions for software to manage coherency. In particular, the instruction cache block invalidate (ICBI) instruction causes all copies of the addressed cache line to be invalidated in the system. The ICBI generates an invalidation request, to which all coherent caches must comply.

The data cache contains a 64-bit data interface to the load/store unit for data access, MMU for tablewalking, and bus interface unit (BIU) for cache line reloading and snooping. (The architecture specifies an algorithm to traverse page table entries that define the virtual-to-physical memory mappings. "Tablewalk" refers to a hardware implementation of the algorithm.) The data cache's two copy-back buffers support nonblocking cache operations. A copy-back buffer holds a dirty (modified) cache line that is being replaced or that hits on a snoop request. The data cache moves an entire cache line into a copy-back buffer in one cycle to minimize the cycles the cache is unavailable.

Figure 6. Store queue structure. (Pointers: allocation, finish, completion, commit. Entry states: empty, finished, completed, committed.)

The MESI (modified, exclusive, shared, invalid) coherency protocol defined in the PowerPC 60x Processor Interface Specification keeps the data cache coherent.12 A duplicated data tag array supports two-cycle snoop response with minimum performance impact to the normal cache operations.

To further support nonblocking cache operations, we've extended the MESI coherency protocol with one more state. As illustrated in a simplified state diagram in Figure 7, the protocol assigns the new state, Allocated, to the block selected to hold the missing cache line. All necessary information for this particular miss, including the address and set number, remains in the memory queue and later completes the cache line reload. The cache line becomes either shared or exclusive, based on the coherency response.

(Figure 7 state diagram labels. States: Invalid, Shared, Exclusive, Modified, Allocated. Transitions: replacement, reload shared, reload exclusive, store hit, cache-line clean, snoop read, snoop read (cast out), snoop RWITM (cast out).)

The software can individually disable, invalidate, or lock both instruction and data caches. While a cache is disabled, all accesses bypass it and directly access the off-chip cache or memory. While a cache is locked, it is accessible but its contents cannot be replaced. All cache misses, in this case, are accessed from off chip as cache-inhibited accesses. Coherency is, however, maintained even when a cache is locked. The data cache supports the cache touch instructions, which initiate reloading of the specified cache line if it is not in the cache.
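The MESI protocol with the extra Allocated state, as described above, can be sketched as a transition table. This is only a sketch: the event names and the exact set of arcs are assumptions pieced together from the text and from the labels that survive from Figure 7, combined with standard MESI behavior. It is not a transcription of the figure.

```python
# (state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    # A block chosen to hold a missing line becomes Allocated.
    ("Invalid", "replacement"): "Allocated",
    ("Shared", "replacement"): "Allocated",
    ("Exclusive", "replacement"): "Allocated",
    ("Modified", "replacement"): "Allocated",   # dirty line goes to a copy-back buffer
    # The reload completes according to the coherency response.
    ("Allocated", "reload_shared"): "Shared",
    ("Allocated", "reload_exclusive"): "Exclusive",
    # Stores that hit make the line Modified.
    ("Exclusive", "store_hit"): "Modified",
    ("Shared", "store_hit"): "Modified",
    # Snoops from other bus masters.
    ("Exclusive", "snoop_read"): "Shared",
    ("Modified", "snoop_read"): "Shared",       # cast out
    ("Shared", "snoop_rwitm"): "Invalid",
    ("Exclusive", "snoop_rwitm"): "Invalid",
    ("Modified", "snoop_rwitm"): "Invalid",     # cast out
}


def step(state, event):
    """Return the next state; events with no listed arc leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(state, events):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

For example, a line that is allocated, reloaded exclusive, written, and then read by another bus master ends up Shared.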
These instructions can effectively reduce cache miss rates and latency.

Bus interface unit and memory queues. The 604's BIU implements the PowerPC 60x Processor Interface Specification to ensure bus compatibility with the 601 and 603 microprocessors. A split transaction mode allows the address bus to operate independently of the data bus, freeing the data bus during memory wait states. To support the split transaction mode, the BIU uses the address and data buses only during what are known as address tenure and data tenure cycles. The BIU also provides a pipelined mode, in which up to three address tenures can be outstanding before data for the first address is received. If permitted, the BIU will complete one or more write transactions between the address and data tenures of a read transaction. Byte parity protects its 32-bit address and 64-bit data buses.

Figure 7. Simplified data cache state diagram.

Figure 8 shows the address and data queues that implement split and pipelined transaction modes. Four types of memory queues support the four types of operations: line fill, write, copy back, and cache control. For a line-fill operation, the line-fill address from either the instruction or data cache remains in the memory address queue until the address can be sent out in an address tenure. After the address tenure, the address transfers to the line-fill address queue. This releases the address bus for other transactions in the split transaction mode. As each double word for the cache line returns, it moves to the line-fill buffer and also forwards to the load/store unit.

During a write operation the address stays in the memory address queue, and the data in the write buffer, until both the address and data can move out in a write transaction. The size of a write transaction can vary from 1 to 8 bytes to handle nondouble-word aligned writes. Similarly, during a copy-back operation,
the address remains in the copy-back address queue, and the data in the copy-back buffer, until both the address and data transfer in a 32-byte burst write transaction. For a cache control instruction or a store to a shared cache line, the address stays in the cache control address queue until an address-only transaction broadcasts the cache control command. Since all address queues in the 604 are considered part of the coherent memory system, the BIU checks them against data cache and snoop addresses to ensure data consistency and maintain the MESI coherency protocol.

Figure 8. Address and data queue organization. (Address queues: memory address, line-fill address, copy-back address, cache control address. Data buffers: line-fill buffer (8 words), write buffer (2 words), copy-back buffer (8 words). Inputs: data address, instruction address, snoop address.)

System support features
The 604 provides several features to support robust system design such as instruction and data address breakpoints and single-step and branch instruction tracing facilities for software debugging. It also provides performance-monitoring functions for the system to profile software performance without additional hardware. These functions can determine many key performance parameters, such as instruction execution rate, branch prediction rate, cache hit rates, and average cache miss latency.

The 604 design follows the level-sensitive scan design (LSSD) methodology to provide high test coverage. As required by LSSD rules, every storage element, except in arrays, connects to a scan chain that starts with a chip input pin and ends on a chip output pin. During test mode, storage elements in a scan chain behave as a shift register that can also capture inputs to exercise a sequential digital network in a combinational manner. The 604's common on-chip processor (COP) provides many functions to control and observe the storage elements. Some of the functions useful for chip and system debugging include setting instruction or data address breakpoints, single stepping, running N cycles, and reading and writing system memory locations as well as any storage element within the processor. The COP functions are implemented as an extension to the IEEE-1149.1 specification, and are controlled entirely through that interface.

System designers can configure the 604 processor operating frequency as one, one-and-a-half, two, or three times the system bus frequency. The on-chip phase-locked loop generates the necessary processor clocks from the bus clock. The 604 also provides a nap mode, which clocks only external interrupt detection logic and the phase-locked loop. It enters nap mode under software control and exits from the mode upon detecting an interrupt. The 604 can still service snoop requests if the system asserts the RUN pin to run the clocks while in the nap mode. We estimate nap mode power consumption at less than 0.4 watts.

Table 2 lists some of the key physical characteristics of the 604. Figure 9 shows the 604 die photo.

Table 2. The 604 physical characteristics.
Item                 Characteristics
Technology           0.5-um CMOS, 4 metal layers
Die size             196 mm^2, 12.4 x 15.8 mm
Transistor count     3.6 million
Cache size           16-Kbyte I-cache and D-cache
Voltage              3.3V, 5V I/O tolerant
Power dissipation    Less than 10W at 100 MHz
Signal I/Os          171; CMOS/TTL compatible
Package              304-pin CQFP

DESIGNED TO MEET LOW-COST needs of the personal computer market, the 604 performs well with inexpensive, as well as expensive, memory systems. The 604's large on-chip caches help to maintain performance of well-behaved applications that exhibit localities. For those with erratic behaviors and access patterns, speculative execution guided by dynamic branch prediction helps to reduce on-chip cache miss latency. The nonblocking execution pipelines and the memory queues that decouple the pipelines from memory access further help to reduce the effects of cache misses. The split and pipelined modes use the system bus to provide greater bandwidth while maintaining compatibility with the 601 and 603 microprocessors.

References
1. E. Silha, "The PowerPC Architecture," IBM RISC System/6000 Technology: Volume II, IBM Corporation, Austin, Tex., 1993.
2. C. Moore, "The PowerPC 601 Microprocessor," IBM RISC System/6000 Technology: Volume II, IBM Corporation, 1993.
3. B. Burgess et al., "The PowerPC 603 Microprocessor: A High Performance, Low Power, Superscalar RISC Microprocessor."

4. [illegible in source]
5. [illegible in source]
6. R. Tomasulo, "An Efficient Algorithm for Exploiting Multiple Arithmetic Units," IBM J. Research and Development, Vol. 11, Jan. 1967, pp. 25-33.
7. G. Sohi, "Instruction Issue Logic for High-Performance, Interruptible, Multiple Functional Unit, Pipelined Computers," IEEE Trans. Computers, Mar. 1990, pp. 349-359.
8. J. Smith and A. Pleszkun, "Implementation of Precise Interrupts in Pipelined Processors," Proc. 12th Ann. Symp. Computer Architecture, IEEE, Piscataway, N.J., 1985.
9. M. Johnson, Superscalar Microprocessor Design, Prentice Hall, Englewood Cliffs, N.J., 1991.
10. J. Lee and A. Smith, "Branch Prediction Strategies and Branch Target Buffer Design," Computer, Jan. 1984, pp. 6-22.
11. V. Popescu et al., "The Metaflow Architecture," IEEE Micro, June 1991, pp. 10-13, 63-73.
12. M.S. Allen et al., [remainder illegible in source]

Figure 9. Die photo of the PowerPC 604.

S. Peter Song is a senior engineer in the Systems Technology and Architecture Division of IBM. He led the definition of the 604 microarchitecture and later designed the speculative execution, completion, and exception control logic. Song holds BS, MS, and PhD degrees in electrical and computer engineering from the University of Texas at Austin. He is a member of the IEEE Computer Society, Eta Kappa Nu, and Tau Beta Pi.

Marvin Denman is a principal staff engineer in the RISC Microprocessor Division of Motorola, Inc. He has contributed to the definition of the 604 microarchitecture and later designed the fetch and branch-processing [...]. Denman [...] BS degree in computer science [...] University and an MS in [...] University of Texas at Austin. He is a member [...]

Joe Chang's biography, photograph, and address appear on p. 51 of this issue.

Direct questions concerning this article to Song, Somerset Design Center, 11400 Burnet Road, M[...], Austin, TX; sp[?]ong@ibmoto.com.

Threaded Code

threaded = strung together, like beads on a thread

The internal representation of a threaded interpreter is a list of addresses of previously defined internal representations (subroutines). These representations are strung together in a linear list.

The individual elements are compiled.
The abstract machine is usually a stack machine.
Example: FORTH

subroutine threaded code

  machine code word        intermediate code word
    header                   header
    machine code             jsr Name
    ...                      jsr Name
    rts                      jsr Name
                             rts

direct threaded code

  machine code word        intermediate code word
    header                   header
    machine code             jmp enter
    jmp next                 X
                             ...
                             return

  enter:   push ic
           move ic = instr + jmp len
  next:    move instr = (ic)+
           jmp (instr)
  return:  pop ic
           jmp next

indirect threaded code

  machine code word        intermediate code word
    header                   header
    (code address)           enter
    machine code             X
    jmp next                 ...
                             return

  enter:   push ic
           move ic = instr
  next:    move instr = (ic)+
           move ind = (instr)+
           jmp (ind)
  return:  pop ic
           jmp next
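The enter, next, and return routines for indirect threaded code can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions: each word is a Python list whose first cell plays the role of the code field, Python functions stand in for machine code, and the names VM, colon, primitive, and execute are illustrative and not taken from the slides.

```python
class VM:
    def __init__(self):
        self.data = []       # data stack
        self.rstack = []     # return stack of saved (thread, ic) pairs
        self.thread = None   # word whose body is being interpreted
        self.ic = 0          # interpreter counter: index into self.thread
        self.running = False


def enter(vm, word):
    """enter: push ic; ic = first cell after the code field."""
    vm.rstack.append((vm.thread, vm.ic))
    vm.thread, vm.ic = word, 1


def leave(vm, word):
    """return: pop ic."""
    vm.thread, vm.ic = vm.rstack.pop()


def primitive(fn):
    """Machine-code word: the code field points at its own code."""
    return [lambda vm, word: fn(vm)]


def colon(*body):
    """Intermediate-code word: code field 'enter', then the thread."""
    return [enter, *body, RETURN]


def lit(vm):
    """Push the cell that follows LIT in the thread and skip over it."""
    vm.data.append(vm.thread[vm.ic])
    vm.ic += 1


def halt(vm):
    vm.running = False


RETURN = [leave]
LIT = primitive(lit)
DUP = primitive(lambda vm: vm.data.append(vm.data[-1]))
MUL = primitive(lambda vm: vm.data.append(vm.data.pop() * vm.data.pop()))
HALT = primitive(halt)


def execute(vm, word):
    vm.thread, vm.ic = [None, word, HALT], 1   # small bootstrap thread
    vm.running = True
    while vm.running:
        instr = vm.thread[vm.ic]   # next: instr = (ic)+
        vm.ic += 1
        instr[0](vm, instr)        #       ind = (instr); jmp (ind)


SQUARE = colon(DUP, MUL)
MAIN = colon(LIT, 7, SQUARE, SQUARE)

vm = VM()
execute(vm, MAIN)
```

After the run, vm.data holds the single value 2401 (7 squared twice). In direct threaded code the extra indirection through the code field disappears, because every thread cell points straight at executable code.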

token threaded code

  machine code word        intermediate code word
    header                   header
    machine code             jmp enter
    ...                      X
    jmp next                 ...
                             return token

  enter:   push ic
           move ic = addr + jmp len
  next:    move instr = (ic)+
           move addr = tab(instr)
           jmp (addr)
  return:  pop ic
           jmp next

indirect token threaded code

  machine code word        intermediate code word
    header                   header
    token                    enter token
    machine code             X
    ...                      ...
    jmp next                 return token

  enter:   push ic
           move ic = addr + token len
  next:    move instr = (ic)+
           move addr = tab(instr)
           move addr = (addr)
           move addr = tab(addr)
           jmp (addr)
  return:  pop ic
           jmp next
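The next routine of token threaded code, which fetches a one-byte token and looks its handler up in a table, can be sketched as follows. The token numbering and the four handlers are invented for this illustration, and nested words (enter and return) are left out to keep it short.

```python
def op_lit(stack, code, ic):
    """Push the byte that follows the token and skip over it."""
    stack.append(code[ic])
    return ic + 1


def op_add(stack, code, ic):
    stack.append(stack.pop() + stack.pop())
    return ic


def op_mul(stack, code, ic):
    stack.append(stack.pop() * stack.pop())
    return ic


def op_halt(stack, code, ic):
    return None   # signals the dispatch loop to stop


# tab: token value -> handler (hypothetical token assignment)
TABLE = [op_lit, op_add, op_mul, op_halt]
LIT, ADD, MUL, HALT = range(4)


def run(code):
    stack, ic = [], 0
    while ic is not None:
        token = code[ic]                # next: instr = (ic)+
        ic += 1
        handler = TABLE[token]          #       addr = tab(instr)
        ic = handler(stack, code, ic)   #       jmp (addr)
    return stack


program = bytes([LIT, 2, LIT, 3, ADD, LIT, 4, MUL, HALT])
result = run(program)
```

The program computes (2 + 3) * 4. Each instruction costs one byte in the thread, at the price of an extra table lookup on every dispatch.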

Costs on the MC68000

subroutine threaded: 220 cycles, 4 bytes
  enter:
  next:    bsr + rts
  return:  rts
  enter = 0   next = 34   return = 16

indirect threaded: 212 cycles, 2 bytes
  enter:   move.w  a6,-(a7)
           move    a5,a6
  next:    move.w  (a6)+,a5
           move.w  (a5)+,a4
           jmp     (a4)
  return:  move.w  (a7)+,a6
           move.w  (a6)+,a5
           move.w  (a5)+,a4
           jmp     (a4)
  enter = 36   next = 24   return = 32

direct threaded: 162 cycles, 2 bytes
  enter:   move.w  a6,-(a7)
           lea     a6,#4(a5)
  next:    move.w  (a6)+,a5
           jmp     (a5)
  return:  move.w  (a7)+,a6
           move.w  (a6)+,a5
           jmp.w   (a5)
  enter = 42   next = 16   return = 24

token threaded: 248 cycles, 1 byte
  enter:   move.b  a6,-(a7)
           lea     a6,#4(a4)
  next:    move.b  (a6)+,a5
           move.w  tab(a5),a4
           jmp     (a4)
  return:  move.w  (a7)+,a6
           move.b  (a6)+,a5
           move.w  tab(a5),a4
           jmp     (a4)
  enter = 44   next = 28   return = 36

Costs on the MC68000 (continued)

indirect token threaded: 408 cycles, 1 byte
  enter:   move.b  a6,-(a7)
           lea     a6,#1(a4)
  next:    move.b  (a6)+,a5
           move.w  tab(a5),a4
           move.b  (a4)+,a5
           move.w  tab(a5),a3
           jmp     (a3)
  return:  move.w  (a7)+,a6
           move.b  (a6)+,a5
           move.w  tab(a5),a4
           move.b  (a4)+,a5
           move.w  tab(a5),a3
           jmp     (a3)
  enter = 64   next = 48   return = 56

Costs on the MIPS R3000

One delay slot for load and jump instructions
r31 is the return address register

subroutine threaded: 17 cycles, 8 bytes
  enter:   add  r30,r30,-4
           sw   r31,(r30)
  next:    bal label + j r31
  return:  lw   r31,(r30)
           add  r30,r30,4
           j    r31
  enter = 2   next = 2   return = 3
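The slides do not say how the total cycle counts are obtained from the per-routine costs. The listed numbers are consistent with a workload of one enter, six next, and one return; that is an inference from the figures, not something the slides state. The check below recomputes each total under that assumption.

```python
# (enter, next, return, total) in cycles, copied from the slides.
COSTS = {
    "MC68000 subroutine threaded": (0, 34, 16, 220),
    "MC68000 indirect threaded": (36, 24, 32, 212),
    "MC68000 direct threaded": (42, 16, 24, 162),
    "MC68000 token threaded": (44, 28, 36, 248),
    "MC68000 indirect token threaded": (64, 48, 56, 408),
    "MIPS R3000 subroutine threaded": (2, 2, 3, 17),
}

NEXTS_PER_WORD = 6   # assumed workload: one enter, six next, one return


def total_cycles(enter, nxt, ret, nexts=NEXTS_PER_WORD):
    return enter + nexts * nxt + ret


mismatches = [
    name
    for name, (enter, nxt, ret, total) in COSTS.items()
    if total_cycles(enter, nxt, ret) != total
]
```

Under this assumption mismatches is empty, so every listed total is reproduced.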

Costs on the MIPS R3000 (continued)

direct threaded: 30 cycles, 4 bytes
  enter:   sw   r1,-4(r30)
           add  r1,r2,8
           lw   r2,4(r2)
           add  r30,r30,-4
           jmp  r2
           nop
  next:    lw   r2,(r1)
           (delay slot)
           jmp  r2
           add  r1,r1,4
  return:  lw   r1,(r30)
           add  r30,r30,4
           (continue at next)
  enter = 6   next = 3   return = 6

The ZIP Language
(Z80 Interpretative Processor)

ZIP is a FORTH-like language.
Main features:
• indirect threaded code
• stack oriented
• linearly linked names, 3 characters long
• Z80 machine code and ZIP
• 4 Kbytes of memory
• integer arithmetic
• loop statements (no goto)
• simple line editor

ZIP Functions

(Flow diagram: Start/Restart, Mass/Inline, Token, Search, Execute if found, otherwise Number, then Stack; Question on error; OK at end of line.)

Start/Restart: initializes the stack pointer, the system variables, and execute mode
Mass/Inline: reads a line into the line buffer
Token: reads the next token from the line buffer and places it in the dictionary
Ok: prints an OK message
Search: searches the dictionary for the token and pushes the address and a found flag onto the stack
Execute: in execute mode, executes the command; in compile mode, executes the command if it is an immediate command and compiles it otherwise
Stack: prints an error message and goes to Start
Number: pushes the number onto the stack or compiles it
Question: prints the token and an error message
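The Token, Search, Execute, Number, and Question steps above form the outer interpreter loop. A minimal sketch in Python, covering execute mode only: the dictionary is modeled as a plain Python dict of callables, and compile mode and the Stack check are omitted. The function name interpret and the message format are assumptions made for the example.

```python
def interpret(line, dictionary, stack, base=10):
    """Process one input line; return 'OK' or an error message."""
    for token in line.split():                 # Token
        word = dictionary.get(token)           # Search
        if word is not None:
            word(stack)                        # Execute (execute mode only)
            continue
        try:
            stack.append(int(token, base))     # Number
        except ValueError:
            return token + " ?"                # Question
    return "OK"                                # Ok


def add(stack):
    stack.append(stack.pop() + stack.pop())


def mul(stack):
    stack.append(stack.pop() * stack.pop())


def dup(stack):
    stack.append(stack[-1])


WORDS = {"+": add, "*": mul, "DUP": dup}

stack = []
status = interpret("2 3 + DUP *", WORDS, stack)
```

Here status is 'OK' and the stack holds 25. A token that is neither in the dictionary nor a number in the current base stops the line with the Question message.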

Dictionary Format

(Diagram: each entry holds the name length and the first three characters of the name, for example 7 E X E and 4 D R O, followed by a link to the next entry and the body. The last link is nil.)

Numbers: integer, 2 bytes on the stack; in the body, first the literal handler and then the value
Strings: on the stack, the last character is at the bottom with its high byte set; in the body, first the length, then the individual characters
Boolean values: 1 = true, 0 = false; in tests, any value other than 0 is true

Constants and Variables

Constant: n CONSTANT name
  DECIMAL 255 CONSTANT max
  : ... max ... ;
  : ... 255 ... ;
  The name and value are stored with a pointer to the constant activation code.

Variable: n VARIABLE name
  When the name is activated, the address is pushed onto the stack.

Arrays: n VARIABLE name size DP +!
  The variable is created, then the dictionary pointer is increased by size. For indexing, the index must be added to the address.

Users: USERS n
  Adds n to the address of the user memory block. This memory block can be located anywhere in memory.
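A dictionary of linearly linked entries that store only the name length and the first three characters can be sketched as a linked list. The reading of the diagram (length byte plus three characters, link, body) is an assumption, and the Entry class and the helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    length: int               # full length of the name
    prefix: str               # first three characters of the name
    link: Optional["Entry"]   # previously defined entry, or None (nil)
    body: object              # code or data belonging to the word


def header(name):
    return len(name), name[:3]


def define(head, name, body):
    """Prepend a new entry and return it as the new head of the list."""
    length, prefix = header(name)
    return Entry(length, prefix, head, body)


def search(head, name):
    """Walk the links from the newest entry; return the match or None."""
    length, prefix = header(name)
    entry = head
    while entry is not None:
        if entry.length == length and entry.prefix == prefix:
            return entry
        entry = entry.link
    return None


head = None
head = define(head, "DROP", "drop body")
head = define(head, "EXECUTE", "execute body")
```

Because only the length and three characters are kept, two names such as EXECUTE and EXEMPTS are indistinguishable to search. That is the price of the compact header.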

System Parameters

Can be used as variables or as registers.

IR:       instruction register
WA:       word-address variable
SP:       data stack pointer
RSP:      return stack pointer
MODE:     execute/compile mode (false/true)
STATE:    immediate word if MODE = STATE
DP:       dictionary pointer variable
CONTEXT:  search vocabulary
CURRENT:  definition vocabulary
START:    true on the first start
LPB:      line buffer pointer variable
BASE:     number base variable

Stack Commands

DROP:   remove the top stack element
DUP:    duplicate the top stack element
SWAP:   exchange the top 2 stack elements
OVER:   copy the 2nd stack element to the top of the stack
2DUP:   triplicate the top stack element
2SWAP:  exchange the top and the 3rd stack element
2OVER:  copy the 3rd stack element to the top of the stack
RROT:   ABC > CAB
LROT:   ABC > BCA
!:      *stack[top] = stack[top-1]; top-=2;
+!:     *stack[top] += stack[top-1]; top-=2;
0SET:   *stack[top--] = 0;
1SET:   *stack[top--] = 1;
@:      stack[top] = *stack[top];
<R:     pop the data stack and push onto the return stack
R>:     pop the return stack and push onto the data stack
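Several of the stack commands above can be sketched in Python, with a list as the data stack (top of stack at the end) and a dict standing in for memory. The function names are illustrative only.

```python
def drop(s):
    s.pop()


def dup(s):
    s.append(s[-1])


def swap(s):
    s[-1], s[-2] = s[-2], s[-1]


def over(s):
    s.append(s[-2])


def rrot(s):
    """RROT: A B C -> C A B (the top element moves below the other two)."""
    c = s.pop()
    s.insert(len(s) - 2, c)


def lrot(s):
    """LROT: A B C -> B C A (the third element moves to the top)."""
    s.append(s.pop(-3))


def store(s, mem):
    """!: the address is on top, the value below it; both are consumed."""
    addr = s.pop()
    mem[addr] = s.pop()


def plus_store(s, mem):
    """+!: add the value below the address to the addressed cell."""
    addr = s.pop()
    mem[addr] += s.pop()


def fetch(s, mem):
    """@: replace the address on top with the contents of that cell."""
    s[-1] = mem[s[-1]]
```

For example, starting from [1, 2, 3], rrot gives [3, 1, 2], and a following lrot restores [1, 2, 3].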

Arithmetic Commands

ABS, MINUS, +, *, /, /MOD, MOD/, */, */MOD
MAX, MIN, 2*, 2/, 1+, 2+, 1-, 2-
AND, OR, XOR, NOT
=,

Control Structures

BEGIN ... END loop
  BEGIN ... flag END
  flag IF ... ELSE ... THEN
  flag IF ... THEN
  BEGIN ... flag IF ... WHILE
  BEGIN ... flag IF ... ELSE ... WHILE
  end start DO ... LOOP
  end start DO ... inc +LOOP

The loop index is kept on the return stack and is incremented by 1 or by inc.
LEAVE: loop exit for DO loops
switch: implement with a jump table

Compiling Keywords

CREATE:  places the next token in the dictionary
:        starts compile mode, performs a CREATE, and places the address of enter in the dictionary
;        places the address of return in the dictionary and ends compile mode
;CODE:   places the address of SCODE in the dictionary and ends compile mode

Further topics:
• assembler
• screens
• other dictionary formats
• module threaded code

Pascal P4 System

model for most other Pascal systems (UCSD)
compiler generates assembly language P4 intermediate code
assembler/interpreter assembles and executes P4 code

advantages
• readable intermediate code
• resolving of forward references in single pass in interpreter
• portable system

problems
• very slow

possible improvements
• compiler generates binary P4 code
• direct threaded code interpreter or JIT compiler

The P4 Virtual Machine

5 registers
  PC  program counter
  SP  stack pointer
  MP  mark stack pointer
  EP  extreme stack pointer
  NP  new pointer

(Memory diagram: program memory; data memory holding the stack from address 0 upward, the heap below NP, and the constants up to overm.)

Stack Frame

(from the top of the stack downward)
  local stack                 <- SP, bounded by EP
  locals
  parameters
  mark stack:
    return address
    old EP
    dynamic link
    static link
    function value            <- MP

  mst  mark stack
  cup  call user procedure
  ent  enter block
  ret  return

Assembler/Interpreter

2 instructions with 2 operands are stored in one machine word

  code [0 .. codemax]:   op1 p1 q1 | op2 p2 q2

  store:
    0 .. maxstk        stack/heap
    up to OVERI        integer constants   (ICP)
    up to OVERR        real constants      (RCP)
    up to OVERS        set constants       (SCP)
    up to OVERB        boundary pairs      (BCP)
    up to OVERM        strings             (MCP)
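Storing two instructions with two operands each in one machine word can be sketched with bit fields. The field widths of 6, 4, and 20 bits are an assumption suggested by the type names bit6, bit4, and bit20 used in the interpreter listing; the biased encoding of the signed q operand and all function names are choices made for this example.

```python
OP_BITS, P_BITS, Q_BITS = 6, 4, 20        # assumed field widths
HALF_BITS = OP_BITS + P_BITS + Q_BITS     # 30 bits per instruction
Q_BIAS = 1 << (Q_BITS - 1)                # q is stored as q + bias


def pack_half(op, p, q):
    if not 0 <= op < (1 << OP_BITS):
        raise ValueError("op out of range")
    if not 0 <= p < (1 << P_BITS):
        raise ValueError("p out of range")
    if not -Q_BIAS <= q < Q_BIAS:
        raise ValueError("q out of range")
    return (op << (P_BITS + Q_BITS)) | (p << Q_BITS) | (q + Q_BIAS)


def unpack_half(half):
    op = half >> (P_BITS + Q_BITS)
    p = (half >> Q_BITS) & ((1 << P_BITS) - 1)
    q = (half & ((1 << Q_BITS) - 1)) - Q_BIAS
    return op, p, q


def pack_word(first, second):
    """Combine two (op, p, q) triples into one 60-bit word."""
    return (pack_half(*first) << HALF_BITS) | pack_half(*second)


def fetch(code, pc):
    """Even pc selects op1/p1/q1 of word pc div 2, odd pc selects op2/p2/q2."""
    word = code[pc // 2]
    if pc % 2 == 0:
        return unpack_half(word >> HALF_BITS)
    return unpack_half(word & ((1 << HALF_BITS) - 1))


code = [pack_word((12, 0, -1), (7, 1, 42))]
```

Here fetch(code, 0) returns (12, 0, -1) and fetch(code, 1) returns (7, 1, 42), mirroring the way the program counter addresses half words.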

Assembler

(Diagram: the code array and the label table labeltab. Each label entry holds a value and a state, entered or defined.)

instruction names stored in linear table
multiple type instructions are translated into different instructions
identical constants are stored only once

Interpreter

maximum of 4 files
subroutines for
• post mortem dump
• computation of base
• string comparison
• standard input/output procedures
instruction fetch
case statement

Thus the reference in a Pascal program to a file variable, for example input↑, will be a direct reference to the contents of one of these locations.

P-CODE INSTRUCTION SET

The following table describes the complete instruction set showing the parameters and also the effect of the execution of each instruction on the stack. Only a brief description of each instruction is given here; a detailed version is given in Chapter 11.

Instruction  Stack before  Stack after  Parameters  Description
abi          (i)           i                        Absolute value of integer
abr          (r)           r                        Absolute value of real
adi          (i,i)         i                        Adds two integers on the top of the stack and leaves an integer result
adr          (r,r)         r                        Adds two reals on the top of the stack and leaves a real result
chk C        no change                  Q           Checks value is between upper and lower bounds
chr          (i)           c                        Converts integer to character
csp          special                    Q           Call standard procedure
cup          special                    P Q         Call user procedure
dec C        (x)           x            Q           Decrement
dif          (s,s)         s                        Set difference
dvi          (i,i)         i                        Integer division
dvr          (r,r)         r                        Real division
ent          special                    P Q         Enter block
eof          (a)           b                        Test on end of file
equ C        (x,x)         b            Q           Compare on equal
fjp          (b)                        Q           False jump
flo          (i,r)         r,r                      Float next to the top
flt          (i)           r                        Float top of the stack
geq C        (x,x)         b            Q           Compare on greater or equal
inc C        (x)           x            Q           Increment
ind C        (a)           x            Q           Indexed fetch
inn          (i,s)         b                        Test set membership
int          (s,s)         s                        Set intersection
ior          (b,b)         b                        Boolean inclusive OR
ixa          (a,i)         a            Q           Compute indexed address
lao                        a            Q           Load base-level address
lca                        a            Q           Load address of constant
lci                        x            P Q         Load constant indirect (assembler generated)
lda                        a            P Q         Load address with level P
ldc C                      x            Q           Load constant
ldo C                      x            Q           Load contents of base-level address
leq C        (x,x)         b            Q           Compare on less than or equal
les C        (x,x)         b            Q           Compare on less than
lod C                      x            P Q         Load contents of address
mod          (i,i)         i                        Modulo
mov          (a,a)                      Q           Move
mpi          (i,i)         i                        Integer multiplication
mpr          (r,r)         r                        Real multiplication
mst          special                    P           Mark stack
neq C        (x,x)         b            Q           Compare on not equal
ngi          (i)           i                        Integer sign inversion
ngr          (r)           r                        Real sign inversion
not          (b)           b                        Boolean not
odd          (i)           b                        Test on odd
ord C        (x)           i                        Convert to integer
ret C        special                                Return from block
sbi          (i,i)         i                        Integer subtraction
sbr          (r,r)         r                        Real subtraction
sgs          (i)           s                        Generate singleton set
sqi          (i)           i                        Square integer
sqr          (r)           r                        Square real
sro C        (x)                        Q           Store at base level address
sto C        (a,x)                                  Store at base-level address
stp          no effect                              Stop
str C        (x)                        P Q         Store at level P
trc          (r)           i                        Truncation
ujc          no effect                              Error in case statement
ujp          no effect                  Q           Unconditional jump
uni          (s,s)         s                        Set union
xjp          (i)                        Q           Indexed jump

Key to effect on stack:
  a  address
  b  boolean
  c  character
  i  integer
  r  real
  s  set
  x  any of the above types

The C parameter denotes one of the primitive types.

Assembler/Interpreter Listing

(*Assembler and interpreter of Pascal code*)
(*K. Jensen, N. Wirth, Ch. Jacobi, ETH May 76*)

program pcode(input,output,prd,prr);

(* Note for the implementation.

This interpreter is written for the case where all the fundamental types
take one storage unit.
In an actual implementation, the handling of the sp pointer has to take
into account the fact that the types may have lengths different from one:
in push and pop operations the sp has to be increased and decreased not
by 1, but by a number depending on the type concerned.
However, where the number of units of storage has been computed by the
compiler, the value must not be corrected, since the lengths of the types
involved have already been taken into account.
*)

label 1;
const codemax   = 8650;
      pcmax     = 17500;
      maxstk    = 13650;  (* size of variable store *)
      overi     = 13655;  (* size of integer constant table = 5 *)
      overr     = 13660;  (* size of real constant table = 5 *)
      overs     = 13730;  (* size of set constant table = 70 *)
      overb     = 13820;
      overm     = 18000;
      maxstr    = 18001;
      largeint  = 26144;
      begincode = 3;
      inputadr  = 5;
      outputadr = 6;
      prdadr    = 7;
      prradr    = 8;
      duminst   = 62;

type  bit4      = 0..15;
      bit6      = 0..127;
      bit20     = -26143..26143;
      datatype  = (undef,int,reel,bool,sett,adr,mark,car);
      address   = -1..maxstr;
      beta      = packed array[1..25] of char; (*error message*)

var   code      : array[0..codemax] of   (* the program *)
                    packed record op1 :bit6;
                                  p1  :bit4;
                                  q1  :bit20;
                                  op2 :bit6;
                                  p2  :bit4;
                                  q2  :bit20
                    end;
      pc        : 0..pcmax;  (*program address register*)
      op : bit6; p : bit4; q : bit20;  (*instruction register*)

Excerpt from the interpreter loop (listing lines 953-1014):

          end;

          46 (*int*): begin sp := sp-1;
                        store[sp].vs := store[sp].vs * store[sp+1].vs
                      end;

          47 (*uni*): begin sp := sp-1;
                        store[sp].vs := store[sp].vs + store[sp+1].vs
                      end;

          48 (*inn*): begin
                        sp := sp - 1; i := store[sp].vi;
                        store[sp].vb := i in store[sp+1].vs;
                      end;

          49 (*mod*): begin sp := sp-1;
                        store[sp].vi := store[sp].vi mod store[sp+1].vi
                      end;

          50 (*odd*): store[sp].vb := odd(store[sp].vi);

          51 (*mpi*): begin sp := sp-1;
                        store[sp].vi := store[sp].vi * store[sp+1].vi
                      end;

          52 (*mpr*): begin sp := sp-1;
                        store[sp].vr := store[sp].vr * store[sp+1].vr
                      end;

          53 (*dvi*): begin sp := sp-1;
                        store[sp].vi := store[sp].vi div store[sp+1].vi
                      end;

          54 (*dvr*): begin sp := sp-1;
                        store[sp].vr := store[sp].vr / store[sp+1].vr
                      end;

          55 (*mov*): begin i1 := store[sp-1].va;
                        i2 := store[sp].va; sp := sp-2;
                        for i := 0 to q-1 do store[i1+i] := store[i2+i]
                        (* q is a number of storage units *)
                      end;

          56 (*lca*): begin sp := sp+1;
                        store[sp].va := q;
                      end;

          100,101,102,103,104,
          57 (*dec*): store[sp].vi := store[sp].vi - q;

          58 (*stp*): interpreting := false;

          59 (*ord*): (*only used to change the tagfield*)
                      begin
                      end;

          60 (*chr*): begin
                      end;

          61 (*ujc*): errori(' case error              ');
    end
  end; (*while interpreting*)
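The binary operations in the excerpt all follow the same pattern: decrement sp, then combine store[sp] with store[sp+1] and leave the result in store[sp]. A minimal Python sketch of that pattern, plus the mov instruction, follows. The store is modeled as a plain list of values without the variant-record fields, and the helper names are illustrative.

```python
def pascal_div(a, b):
    """Integer division truncating toward zero, as Pascal's div does."""
    quotient = abs(a) // abs(b)
    return quotient if (a >= 0) == (b >= 0) else -quotient


def binary(store, sp, fn):
    """sp := sp-1; store[sp] := fn(store[sp], store[sp+1]); returns the new sp."""
    sp -= 1
    store[sp] = fn(store[sp], store[sp + 1])
    return sp


def mpi(store, sp):
    return binary(store, sp, lambda a, b: a * b)


def dvi(store, sp):
    return binary(store, sp, pascal_div)


def mov(store, sp, q):
    """Copy q storage units; destination address below source address on the stack."""
    i1 = store[sp - 1]   # destination
    i2 = store[sp]       # source
    sp -= 2
    for i in range(q):
        store[i1 + i] = store[i2 + i]
    return sp
```

Each function returns the updated stack pointer, because Python integers are passed by value, whereas the Pascal code updates the global sp in place.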

      store     : array [0..overm] of
                    record case datatype of
                      int  :(vi :integer);
                      reel :(vr :real);
                      bool :(vb :boolean);
                      sett :(vs :set of 0..47);
                      car  :(vc :char);
                      adr  :(va :address);
                             (*address in store*)
                      mark :(vm :integer)
                    end;
      mp,sp,np,ep : address;  (* address registers *)
      (*mp  points to beginning of a data segment
        sp  points to top of the stack
        ep  points to the maximum extent of the stack
        np  points to top of the dynamically allocated area*)

      interpreting: boolean;
      prd,prr     : text; (*prd for read only, prr for write only *)

      instr       : array[bit6] of alfa; (* mnemonic instruction codes *)
      cop         : array[bit6] of integer;
      sptable     : array[0..20] of alfa; (*standard functions and procedures*)

     (*locally used for interpreting one instruction*)
      ad,ad1      : address;
      b           : boolean;
      i,j,i1,i2   : integer;
      c           : char;

(*--------------------------------------------------------------------*)

procedure load;
   const maxlabel = 1850;
   type  labelst  = (entered,defined); (*label situation*)
         labelrg  = 0..maxlabel;       (*label range*)
         labelrec = record
                      val: address;
                       st: labelst
                    end;
   var  icp,rcp,scp,bcp,mcp : address; (*pointers to next free position*)
        word : array[1..10] of char; i : integer; ch : char;
        labeltab: array[labelrg] of labelrec;
        labelvalue: address;

   procedure init;
      var i: integer;
   begin instr[ 0]:='lod       ';   instr[ 1]:='ldo       ';
         instr[ 2]:='str       ';   instr[ 3]:='sro       ';
         instr[ 4]:='lda       ';   instr[ 5]:='lao       ';
         instr[ 6]:='sto       ';   instr[ 7]:='ldc       ';
         instr[ 8]:='...       ';   instr[ 9]:='ind       ';
         instr[10]:='inc       ';   instr[11]:='mst       ';
         instr[12]:='cup       ';   instr[13]:='ent       ';
         instr[14]:='ret       ';   instr[15]:='csp       ';
         instr[16]:='ixa       ';   instr[17]:='equ       ';
         instr[18]:='neq       ';   instr[19]:='geq       ';
         instr[20]:='grt       ';   instr[21]:='leq       ';
         instr[22]:='les       ';   instr[23]:='ujp       ';
         instr[24]:='fjp       ';   instr[25]:='xjp       ';
         instr[26]:='chk       ';   instr[27]:='eof       ';
         instr[28]:='adi       ';   instr[29]:='adr       ';
         instr[30]:='sbi       ';   instr[31]:='sbr       ';
         instr[32]:='sgs       ';   instr[33]:='flt       ';
         instr[34]:='flo       ';   instr[35]:='trc       ';
         instr[36]:='ngi       ';   instr[37]:='ngr       ';
         instr[38]:='sqi       ';   instr[39]:='sqr       ';
         instr[40]:='abi       ';   instr[41]:='abr       ';
         instr[42]:='not       ';   instr[43]:='and       ';
         instr[44]:='ior       ';   instr[45]:='dif       ';
         instr[46]:='int       ';   instr[47]:='uni       ';
         instr[48]:='inn       ';   instr[49]:='mod       ';
         instr[50]:='odd       ';   instr[51]:='mpi       ';
         instr[52]:='mpr       ';   instr[53]:='dvi       ';
         instr[54]:='dvr       ';   instr[55]:='mov       ';
         instr[56]:='lca       ';   instr[57]:='dec       ';
         instr[58]:='stp       ';   instr[59]:='ord       ';
         instr[60]:='chr       ';   instr[61]:='ujc       ';

         sptable[ 0]:='get       ';   sptable[ 1]:='put       ';
         sptable[ 2]:='rst       ';   sptable[ 3]:='rln       ';
         sptable[ 4]:='new       ';   sptable[ 5]:='wln       ';
         sptable[ 6]:='wrs       ';   sptable[ 7]:='eln       ';
         sptable[ 8]:='wri       ';   sptable[ 9]:='wrr       ';
         sptable[10]:='wrc       ';   sptable[11]:='rdi       ';
         sptable[12]:='rdr       ';   sptable[13]:='rdc       ';
         sptable[14]:='sin       ';   sptable[15]:='cos       ';
         sptable[16]:='exp       ';   sptable[17]:='log       ';
         sptable[18]:='sqt       ';   sptable[19]:='atn       ';
         sptable[20]:='sav       ';

         cop[ 0] := 105;  cop[ 1] :=  65;
         cop[ 2] :=  70;  cop[ 3] :=  75;
         cop[ 6] :=  80;  cop[ 9] :=  85;
         cop[10] :=  90;  cop[26] :=  95;
         cop[57] := 100;

         pc  := begincode;
         icp := maxstk + 1;
         rcp := overi + 1;
         scp := overr + 1;
         bcp := overs + 2;
         mcp := overb + 1;
         for i:= 1 to 10 do word[i]:= ' ';
         for i:= 0 to maxlabel do
             with labeltab[i] do begin val:=-1; st:= entered end;
         reset(prd);
   end;(*init*)

   procedure errorl(string: beta); (*error in loading*)
   begin writeln;
      write(string);
      halt
   end; (*errorl*)

   procedure update(x: labelrg); (*when a label definition lx is found*)
      var curr,succ: -1..pcmax;  (*resp. current element and successor element
                                   of a list of future references*)
          endlist: boolean;
   begin
      if labeltab[x].st=defined then errorl(' duplicated label        ')
      else begin
             if labeltab[x].val<>-1 then (*forward reference(s)*)
             begin curr:= labeltab[x].val; endlist:= false;
                while not endlist do
                      with code[curr div 2] do
                      begin

                        if odd(curr) then begin succ:= q2;
                                            q2:= labelvalue
                                          end
                                     else begin succ:= q1;
                                            q1:= labelvalue
                                          end;
                        if succ=-1 then endlist:= true
                                   else curr:= succ
                      end;
             end;
             labeltab[x].st := defined;
             labeltab[x].val:= labelvalue;
      end
   end;(*update*)

   procedure assemble; forward;

   procedure generate;(*generate segment of code*)
      var x: integer; (* label number *)
          again: boolean;
   begin
      again := true;
      while again do
            begin read(prd,ch);(* first character of line*)
                  case ch of
                       'i': readln(prd);
                       'l': begin read(prd,x);
                                  if not eoln(prd) then read(prd,ch);
                                  if ch='=' then read(prd,labelvalue)
                                            else labelvalue:= pc;
                                  update(x); readln(prd);
                            end;
                       'q': begin again := false; readln(prd) end;
                       ' ': begin read(prd,ch); assemble end
                  end;
            end
   end; (*generate*)

   procedure assemble; (*translate symbolic code into machine code and store*)
      label 1;        (*goto 1 for instructions without code generation*)
      var name :alfa;  b :boolean;  r :real;  s :set of 0..58;
          c1 :char;  i,s1,lb,ub :integer;

      procedure lookup(x: labelrg); (* search in label table*)
      begin case labeltab[x].st of
                entered: begin q := labeltab[x].val;
                           labeltab[x].val := pc
                         end;
                defined: q:= labeltab[x].val
            end(*case label..*)
      end;(*lookup*)

      procedure labelsearch;
         var x: labelrg;
      begin while (ch<>'l') and not eoln(prd) do read(prd,ch);
            read(prd,x); lookup(x)
      end;(*labelsearch*)

      procedure getname;
      begin  word[1] := ch;
         read(prd,word[2],word[3]);
         if not eoln(prd) then read(prd,ch) (*next character*);
         pack(word,1,name)
      end; (*getname*)

      procedure typesymbol;
        var i: integer;
      begin
        if ch <> 'i' then
          begin
            case ch of
              'a': i := 0;
              'r': i := 1;
              's': i := 2;
              'b': i := 3;
              'c': i := 4;
            end;
            op := cop[op]+i;
          end;
      end (*typesymbol*) ;

   begin  p := 0;  q := 0;  op := 0;
      getname;
      instr[duminst] := name;
      while instr[op]<>name do op := op+1;
      if op = duminst then errorl(' illegal instruction     ');

      case op of  (* get parameters p,q *)

          (*equ,neq,geq,grt,leq,les*)
          17,18,19,
          20,21,22: begin case ch of
                              'a': ; (*p = 0*)
                              'i': p := 1;
                              'r': p := 2;
                              'b': p := 3;
                              's': p := 4;
                              'c': p := 6;
                              'm': begin p := 5;
                                     read(prd,q)
                                   end
                          end
                    end;

          (*lod,str*)
          0,2: begin typesymbol; read(prd,p,q)
               end;

          4  (*lda*): read(prd,p,q);

          12 (*cup*): begin read(prd,p); labelsearch end;

          11 (*mst*): read(prd,p);

          14 (*ret*): case ch of
                            'p': p:=0;
                            'i': p:=1;
                            'r': p:=2;
                            'c': p:=3;
                            'b': p:=4;
                            'a': p:=5
                      end;

          (*lao,ixa,mov*)
          5,16,55: read(prd,q);

          (*ldo,sro,ind,inc,dec*)
          1,3,9,10,57: begin typesymbol; read(prd,q)

72 Pascal Implementation: Compiler and Assemble Iterpreter Assembler/Interpreter Listing 7331331431531631731831932U32132232332432532632132632933033133233)33433533633733833934034134234334434534634134834935035135235335435535635i358359360361362363364365366361360369370371372373314375376end;("uJP, ( 314 0( JP")23,24,25; labeleearch;13 ("ent"), begin read(prd,p), labeluearch end;15 ("cep"): begin for i:.1 to 9 do read(prd,ch); getname;while name0aptablefql do q t. q+1end;7 ("ldc"): begin case ch of ("get q")'1': begin p ;• 1; read(prd,1),if abe(i)>-largeint thenbegin op :- 8;atorelicpj.vi i; q maxstk,repeat q q+1 until stora(ql.vi-1;if q-icp thenbegin icp icp+1;if icp-overi thenandendend else q t ■end;errorl(' integer table overflow'r': begin op 8; p :- 2;read(prd,r);etore(rcp).vr r; q ;. overt;repeat q q+1 until store(q).vr.r;if q-rcp thanbegin rcp rcp+1;if rcp overr thenendend;'n': ; (Ap,q • 0")'b'i begin p I. 3;errorl(' real table overflowread(prd,q) and;'CI begin p 1• 6;repeat read(prd,ch); until ch ' ';if ch '"' thenerrorl(' illegal character')iread(prd,ch); q t. ord(ch);read(prd,ch),if ch '"' thenerrorl(' illegal character ,);end;'CI begin op 8, p 4;8 I" ( J; read(prd,ch),while chO')' dobegin read(prd,s1,ch)1 • 1. a + [•1)end;store(•cpkve 1• e; q t. overr;repeat q I. q+1 until •tore(q).ve.•;if q•scp thanbegin •cp 1. •cp+1;if acp.overs thenerrorl(' set table overflowandendend ("case")');'),377378 26 ("chk"): begin typcsymbol;379 read(prd,lb,ub);380 if op . 
95 then q lb301 else362 begin383 etore(bcp-1).vi t• lb; etoro(bcpl.vi ub;304 q °yore;365 repeat q q+2366until (store(q-11.vi•lb)and (store(ql.vi.ub).367 it q•bcp then388 begin bcp bcp+2;369 if bcp.overb then390 crrorl(' boundary table overflow ');391 end392 end393 end;394395 56 ("lea"): begin396 if mcp + 16 >• overre then397 errorl(' multiple table overflow '),398 mcp mcp+16;399 q mcp;40U for 0 to 15 ("etringlgth") do401 begin read(prd,ch);402 etore(q+1).vc ;• ch403 end;404 end;405406 6 ("810): typesymbol;407.406 27,28,29,30,31,32,33,34,35,36,37,38,39, 40 , 41 , 42,43,44,45,46,47,409 48,49,50,51,52,53,54,561 ;410411 ("ord,chr")412 59,601 goto 1;413414 61 ("ujc")1 ; ("must have same length as uip")415416 end; (*case")417418 ( 8 store instruction h)419 with code(pc div 21 do42Uif odd(pc) then421 begin op2 ih op; p2 th p; q2 t• q422 end else423 t2gin opl :• op; pl th p; ql th q424 ends425 pc t. pc+1;426 1: readln(prd);427 ends ("assemble")428429 begin (*load")430 snits431 generate;432 pc :. 0;433 generate;434 end; (*loci") ,435' ) ; 436 ( 8437438 procedure pmd;439 var a !integer; it integer'4408)

74 Pascal Implementation: Compiler and Assemb! Interpreter Assembler/Interpreter Listing 75441 procedure pt;442 begin write(11;6);443 if abu(store(s).v1) < maxint then wrIte(storo(akvi)444 else writc('too big ');445 e e - 1;446 1 1 + 1;447 if 1 - 4 then448 begin writeln(output); 1 1. 0 end;449 end; ("pt")450451 bogin452 write(' pc .',pc-1;5,' op -',op13,' sp mp453 ' np454 writoln; writeln(' ,),455456 a up, t 1.. 0;457 while s>-0 do pt;458 s 1. maxstk,459 while s>-rip do pt;460 end; (*pmd")461462 procedure orrori(stringi beta),463 begin writeln; writeln(string).464 plod; goto 1465 end;( 1 orroriA)466467 function base(ld tinteger)laddrees;468 var ad taddress;469 begin ad ;. mp;47U while ld>0 do471 begin ad t. store(ad+11.vm;472 end;Id to Id-1473 base ;. ad474 end; ("base*)475476 procedure compare;477 (*comparing Is only correct if reuult by comparing integers will be*)478 begin479 11 store(ap).va;480 12 1. etore(sp+11.va;481 1 t. 0; b t ■ true;482 while b and (1q) do483 if store(11+1).vi484 else b (alesstore(12+1).vi then i 1 ,- 1+1485 end; (*compare")486487 procedure callsp;488 var line: booloan; adptr,adelnt: address;489 it integer;490491 proceduru readi(var (:text),492 var ad: address;493 begin adt.. store(sp-1).va;494 read(f,store(ad).vi),495 storo(ators(spkva).vc496 spt. ap-2497 end;(*readl")498499 procedure readr(var f: text);5U0 var ad; address;501 begin adt. store(sp - 1).va;SU2ruad(f,stors(ad).vr);503 stora(stors(sp).val.vc :-504 ap-2505 end;(*readr * )506507 procedure roadc(var ft text);508 var c: char; ad: address;509 begin road(f,c);510 adt.. wtore(sp-11.va;511 etoro(adkvc c;512 store(storo(spj•va).vc t. f";513 storeletore(spi.va).vi 1- ord(f").514 sp-2515 ond;(*roadc*)516517 procedure wrltestr(var to text);510 var 1,j,kt integer;519 ad: address,520 begin ad:. 
storelsp-3).va;521 k store(sp-2).vi; j store(ep-1).vi;522 j and k aro numbers of characters 01 )523 if k>j then for 1:..1 to k-j do write(f,' ')524 else k;525 for i 0 to j - 1 do write((,stors(ad+1).vc);526 sp:• ap - 4527 end.(*vritescr*)528529 procedure getfile(var ft text),530 var ad: address;531 begin ad:•store(apj.va;532 get(f), scorelad).vc f";533 apt-sp-1534 end;(*getfile*)535536 procedure putfilo(var ft text);537 var ad: address;538 begin ad:. stora(sp).va;539 atore(ad).vcl put(f),540 op:- op-11541 end;(*putfile*)542543 begin ("callsp*)544 case q of545 0 (*get*); case storelap).va of546 51 gctfile(input);547 6: errori(' got on output file ');548 7: getfile(prd);,)549 8: errori(' get on pry file550 end;551 1 (*put*)i case storelapj.va of,),552 51 errori(' put on road file553 6: putfile(outpot).554 7t error!(' put on prd file ');555 81 putfile(prr)556 end;557 2 ("rate): begin558 (*for teetphase*)559 np I. store(sp).va; ep sp-1560 end;561 3 (*rin*)1 begin case store(sp).va of562 51 begin readln(input).563storo(inputadr).vc input'564 end;565 61 errori(' readln on output file ');566 71 begin readln(rnpot);567 store(ihputadr).vc ilmut" poet568 end;

76 Pascal Implementation: Compiler and Assemb: interpreter Assembler/Interpreter Listing 7761 errori(' road on output file569 81 errori(' roadln on prr file ') 63311(Ardi 4 ); csao atore(al).va of570 end; 6345: readi(input),571 ep.- ep-1 6356: errori(' read on output file,),572 end; 63671 readi(prd),573 4 (Anew th bogin nd: ■ np-store(spj.va; 637end; 81 errori(' read on prr file,)574 (Atop of stack gives tho length in unite of storage A) 630575 if ad 0. ep then 639 12(ArdrA): case store(epl.va of576 errori(' store overflow '); 640 5: readr(input);577 np:. ad; ad:• otore(sp-1).va; 6411 ).518 atoro(adj,va 1. np; 6427i readr(prd).,)579 o p: il- 6p - 2 643end; et errori(' read on prr fe50u end; 644581 5 ("win•)1 begin caao atore(sp).va of 645 13(ArdcA)t case etore(apj.va of582 Si errori(' writein on input file '), 646 Si 5 i readc(input);583 6: writeln(output), t 647 61 actor!(' read on output file.584 ,),7: error!(' writein on prd file64871 readc(pr8):,)585 8: writeln(prr) 649end; 8i errori(' road on prr file586 end; 650587 ep:- op-1 651140'010)1 etore(op).vr:0 sin(store(opkvr);588 end; 65215("cos.91 atore(sp).vr:- coo(storo(ep).vr),589 6 (Awre")1 case otorolopl.va of 6536(*exp")1 atorelep),vrp. exp(atore(spi.vr),590Si 51 errori(' write on input file 1 )1 65417("logl9t atore(apj.vr: ■ ln(store(sp).vr);591 61 writestr(output)i 655592 71 errori(' write on prd file '); 656593 8: writestr(prr) 657594 end; 6513 etore(adj.va :A np;595 7 ("olnA), begin coso store(opkva of 659 Bpi. sp-1596 5: linos - eoln(input); 660 and;591 6: errori(' ooln output file '). 
661 end;(*case q')598 7i line:-eoln(prd)1 662 end;(Acallse)18(Asq0)1 storo(sp).vrI• eqrt(storelaP).vr);19(*atnA)t store(sp).vr:A arctan(storelepkvr),20(AsavA): begin ed:Astore(spkval599 8: errori(' eoln on prr file,)663600 end; 664 begin (A main A)601 store(spl.vb IA line 665 rewrito(prr);602 end; 666 load; (" e•semblos and stores code A)603 0 (*wri"); begin case atorelapj.va of 667 writein(output), (" for testing A )604 5: errori(' write on input file 1 );668 pc :- ap :. -1; rep I-0, np :. maxatk+1; op :- 5;605 6: write(out put ,669 store(inputadrj.vc I.. input";606 atorelep - 21.vi: storo(op - 1).vi);670 atoro(prdadd.vc i.. prd';607 7: orrori(' write on prd file I ),671 interpreting,:- true;608 8: wrlto(prr,672609 store(sp - 2).vil store(ap - 11.vi)673 while interpreting do610 end;674 begin611 ep:..ep-3 675 (*fetch*)612 end; 676 with code(pc div 2) do613 9 ("wrr")i begin cane etorelepj.va of 677 if odd(pc) then614 Si error!('write on input file '). 678 begin op tA opt; p IA p2, q tA q2615 6: writs(output, 679 end el..616 litorelep-21.vri store(ap-11.vi); 680 begin op tA opl; p i• pl; q 1. ql617 7: errori(' write on prd file 1 )1 681 and;618 8: write(prr, 682 pc p. pc+I;619 store(ep-2j.vr: store(ap-1).vi) 683620 end; 684 (*execute*)621 sp:-ep-3 605 case op of622 end; 68662) 10("wro9t begin case ntore(sp).ve of 607 105,106,107,108,109,624 5: errori(' write on input file.);688 0 (AlodA): begin ad :A base(p) + q;625 6; writa(output,store(sp-21.vct 689 sp ;A sp+1;626 store(sp-11.v1), 690 store(sp) s-. etore(ad)627 1: error!(' write on prd file '); 691 end;628 0: write(prr,chr(store(sp-21.vi)1 692629 storelep-1).vi), 693 65,66,67,68,69,630 end; 694 1 Oldo")1 begin631 ep.-ep-3 , 695632 end; 696sp tA I'M;store(sp) 1• store(q)

78Pascal Implementation: Compiler and Assemt" -/InterpreterAssembler/Interpreter Listing 79691698699700701702703end;70,71,72,73,74,2 ("etr")1 begin store(bsee(p)+q) 1.. storelsehsp i• sp-1end;704 75,76,77,78,79, 768705 3 ("aro")1 begin store(qj I. store(sp); 769706 ep 1.- ap-1 770707 end; 771708 772709 4 ("1.da"); begin sp t. sp+1; 773710 store(spf.va 1- base(p) + q 714711 end; 775712 776711 5 (also"), begin sp :- sp+1; 777714 storeIspkva 1- q 778715 end; 779716 780717 80,81,82,83,84, 781718 6 ("sto")t begin 782719 storelstore(sp-1).vs) :. etore(sp); 783720 ap :. sp-2; 784721 end; 785722 786723 7 ("Ade"); begin sp ;- sp+1; 707724 if p-I then 708125 begin storelepj.vi :- q; 789726 end else 790727 if p - 6 then store(sp).ve 1. chr(q) 791728 else 1792729 if p - 3 then etorelspl.vb 1. q - 1 793710 else (" load nil *) store(spj.va ;- maxatr 794731 end; 795732 796733 8 ("lei"): begin ep 1. sp+1; 797734 store(epl I. store(q) 798715 end; 799716 000737 85,86,87,88,89, 801738 9 (*Ind"): begin ad :- etore(ep).va + q; 802739 ( 0 q i8 a number of storage unite *) 803740 store(sp) 1. storo(adj 804741 end; 005742 806743 90,91,92,93,94, 807744 10 ("inc$91 store(sp).vi 1. storelapl.vi+q; 808145 809746 11 ("mst")i begin ("p-level of calling procedure minus level of called 810747 procedure + 1; set dl and el, increment sp") 011748 (" then length of this element is 812149 mex(Intsize,realeite,boolsito,charsize,ptrsize *) 813150 storelep+21.vm 1. base(p); 814751 (" the length of this element is ptraiee *) 815752 storo(sp+3).vm 1. me; 816751 (" idem *) 817754 atore(sp+4).vm 1- ep; 818755 (" idem *) 819756 sp :. 01)+5 820757 end; 821758 822159 12 ("cup")1 begin ("p-no of locations for parameters, q-entry point") B23760 op 1. sp-(p+4); 824761762763764765766767atore(mpt41.vm 1. pc;pc I. qend;13 ("ent")i if p 1 thenbegin sp 1. mp + q; (*el length of dataseg")if ap > np then errort(' store overflowendelsebegin ep sp+q;if ep > np then (wort(' store overflowend;(*q . 
max space required on stack")14 ("ret")t begin case p of0: sp:- mp-1,1,2 ; 3,4,5: op'. alpend;pc 1. storeimp+4j.vm;op store(mp+3).vm;mpl ■ store(mp+2j.vm;end;15 ("cep"): callsp;16 ("ixa"); begint- store(spl.vi;ap t. ep-1;etorelapj.va q*i+etore(spl.v*1end;17 ("equ"); begin ap sp-1;case p of11 store(ep).vb t. s•ore(ap).vi storeOs stora(sp).vb 1- store(spl.va store6: store(sp).vb 1. atore(ep).vc • store2t store(sp).vb store(sp).vr . store3: store(ep).vb I. storelapj.vb store41 store(sp).vb 1. store(spl.vs storeSt begin compare;store(apj.vb t. b;end;end; ("case p")end;sp+1ep+1sp+1sp+1sp+1sp+1.vi;.va;.vc;•vr;.vb;•va;18 (*neq")t begin ep t. ep-1,case p ofOt store epl.vb i ■ storelsp).va store(sp+1).vs;11 store ep).vb t ■ store(sp).vi store(sp+1).vi;61 store epj.vb storo(sp).vc store(sp+1).vc;2: store epl.vb store(sp).vr store(sp+ ► ).vr;3t store sp).vb storelspl.vb storelsptlj.vb;41 store sp).vb l ■ stora(spj.vs •torelsp+1).vs;5: begin compare;store(sp).vb not b;endend; (*case p")end;19 ;ogees); begin sp sp-1;case p ofOt Wort(' - for address ');It storelaphvb 1. store(spl.vi >- storsIsp+1).v1;61 etorelepj.vb I. storo(opl.vc storolop+11.vc;' ) ;' ) ;

80 Pascal Implementation: Compiler and Asse ► 'r/Interpreter Assembler/Interpreter Listing 81U2552613278205292: storeisp),vb :. storefsp).vr >- store(sp+1).vr;3: store(ep).vb I. otoro(sp).vb >.•store(sp+1).vb;4: ecore(apkvb :- store(sp).vm >. store(sp+1J've;2: storeisp),vb :. storefsp).vr3: store(ep).vb I. otoro(sp).vb4: ecore(apkvb :- store(sp).vm5: begin compere;storc(spj.vb :... b orstorc(spj.vb :... b or88989U95 ("chkah). if (storc(op).ve < np) or891 (store(epl.va > (mexstr-q)) then892 error!(' bad pointer value ').893!(' value out of range,),835 20 (hgrth)1 begin sp :- sp-I; 399 27 (heofh)1 begin i :.. etore(sp).vl;836 case p of 90U if t-inputadr then..----- 7 537 0: errori(' - for address '). 901 begin store(apI.vb 1.. eof(lnput), store(sp+11.vi; 902 end else errori(' code in error539 6: store[spkvb :N etore(epl.vc ) atore(sp+1).vc; 903 end;840 2: etore(sp).vb :- storo(sp).vr > I:tor:3(•1)+11.yr; 904841 3: store(sp).vb :- store(ep).vb > etore(sp+11.vb; 905 28 (*adi"): begin sp : - sp - 1;843 5: begin compare; 907 end;844 store(sp).vb t., not b and 908545 (store(il+i).vi > •tore(l2+tkvi) 909 29 (hadrh); begin ap :.., sp-I;846 end 910, store(spj.vr ;.., store(sp).vr + store(sp+11.vr847 end; (*caps ph) 911 end;540 end; 912,)851 case p of 915 end;852 0: orrori(' - for address '); 916853 1: storelsp).vb ... storol•p).vi (- storelep+1).vi; 917 31 (Asbr*): begin sp 1... op-1;855 2: otore(sp).vb t ■ atoro(sp).vr O. store(sp+1).vr; 919 end;556 3: store(sp).vb 1... storo(sp).vb O. storelap+1).vb; 920057 4: ntore(sp).vb :.. store(sp).v• Os stordsp+11.vo, 921 32 (hags's)! store(sp).vs :... (store(sp1.vil.858 5: begin compare; 922860 (store(11+il.vi C. store(i2+1).vi) 924861 end; 925 34 ("110): storelep - 11.vr :P store(sp - 1).vi;862 end; ("case ph) 926863 end; 927 15 (Acre"): store(sp).vi : - trunc(store(sp).vr);864 928865 22 ( 4 1o0)1 begin tip ;- sp-I; 929 36 (o.ngi*); storo(epj.vi ;., -atore(sp).vi;866 case p of 930867 01 crrori(' ,>-. 
for address '); 931 37 (ongr")1 store(sp).vr 1. -atore(sp).vr;868 I: "torelsp).vb :.. store(sp).vi < atore(•p+!).vi; 932869 6: storelepl.vb IP storelsid.vc < store(sp+1).vc; 933 38 (*eqi*): store(spl.vi t ■ sqr(store(sP).vi);U70 2: store(sp).vb 1.• store(sp).vr < storelep+11.vr; 934871 3: etore(ep).vb :P store(sp).vb < storo(sp+11.vb; 935 39 (*sqr"): •tora(sp).vr ;.. sqr(store(spl.vr);872 5: begin compare; 936873 store(sp).vb :.. not b and 937 40 (*obi"): •tore(sp).vi ,i- abs(store(sp).vi);074 (store(11+1).vi < store(i2+ikvi) 938875 end I 939 41 (*abr"): store(sp).vr :. abs(storelsp).vr);876 end; ("case ph) 940877 end; 941 42 (*not*): store(ep).vb 1. not store(sp).vb;879 23 (hujph)1 pc I. q; 943 43 ("and")1 begin •p JP sp-1;880 944 store(•phvb :P store(•p).vb and •tore(sp+1).vb881 24 (*t)p*): begin it not store(ep).vb then pc 1- (I; 945 ends882 sp :- ap - 1 946883 end; 947 44 ("loth): begin sp 10 sp-1;504 948 ' •toro(spl.vb 1... store(sp).vb or store(ap+1).vb1355 25 (Aajp*)! begin 949 end;886 pc 10 sfore(sp).vi + q; 950857 sp :- ap-1 951 45 (hdifh): begin ap 1. op-I;888 end; 952 storelsp).vs 1-. storolep1.vs - •tore(spt11.vs575 942

Pascal P4 Compiler

single pass compiler
control part: syntactic analysis calls lexical analysis (insymbol), semantic check and code generation
generated code: assembly language source
about 4000 lines of Pascal code
portable through constant definitions

Lexical Analysis

program driven lexical analysis (main routine: insymbol)
determines identifiers, keywords, numbers and other symbols
skips comments (option recognition)
output of source and error messages
spaces in identifiers are added economically
storage of constants
integer computation can cause overflow
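The step "determines identifiers, keywords" can be illustrated with the compiler's reserved-word tables (frw, rw, rsy, rop). The following is a minimal sketch in Python rather than Pascal. It assumes 0-based indexing and fills in only six two-letter words; the function name classify and the dictionary form of FRW are inventions of this sketch, not part of the compiler.

```python
# Reserved words sorted by length, with parallel symbol and operator tables.
# Only six two-letter words are filled in; a real table holds many more.
RW = ["if", "do", "of", "to", "in", "or"]
RSY = ["ifsy", "dosy", "ofsy", "tosy", "relop", "addop"]
ROP = ["noop", "noop", "noop", "noop", "inop", "orop"]

# FRW[k] is the index in RW of the first word of length k, and FRW[k + 1]
# is one past the last such word (0-based here, an assumption of this sketch).
FRW = {1: 0, 2: 0, 3: 6}


def classify(word):
    """Return (symbol, operator) for a scanned word."""
    k = len(word)
    if k in FRW and k + 1 in FRW:
        # Search only the reserved words that have the same length.
        for i in range(FRW[k], FRW[k + 1]):
            if RW[i] == word:
                return RSY[i], ROP[i]
    # Anything not found in the table is treated as an identifier.
    return "ident", "noop"


print(classify("in"))
print(classify("or"))
print(classify("x1"))
```

Grouping the words by length keeps the search short: a scanned word is compared only against reserved words of the same length.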

Tables of Lexical Analysis

frw    1  2  3  4  5  6  7  8  9

rw           rsy      rop
if           ifsy     noop
do           dosy     noop
of           ofsy     noop
to           tosy     noop
in           relop    inop
or           addop    orop
end
for
...
forward
program
function
procedure

Syntax Analysis

program driven: recursive descent parser

procedure whilestatement;
begin
  expression(fsys + [dosy]);
  if sy = dosy then insymbol else error(54);
  statement(fsys)
end;

skip skips symbols until continuation is possible
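The whilestatement procedure and the skip rule can be sketched as a small recursive descent parser. This is an illustration in Python under stated assumptions, not the compiler's code: expressions and statements are reduced to a single identifier, the Parser class and parse helper are hypothetical, and only error number 54 comes from the slide (900 is made up for the sketch).

```python
ERR_DO_EXPECTED = 54        # number used on the slide
ERR_BAD_EXPRESSION = 900    # hypothetical, for this sketch only


class Parser:
    def __init__(self, symbols):
        self.symbols = list(symbols) + ["eof"]
        self.pos = 0
        self.errors = []

    @property
    def sy(self):
        """The current symbol, as delivered by the scanner."""
        return self.symbols[self.pos]

    def insymbol(self):
        # Advance to the next symbol, never past the end marker.
        if self.sy != "eof":
            self.pos += 1

    def error(self, number):
        self.errors.append(number)

    def skip(self, fsys):
        # Drop symbols until one appears with which parsing can continue.
        while self.sy != "eof" and self.sy not in fsys:
            self.insymbol()

    def expression(self, fsys):
        # Toy grammar: an expression is a single identifier.
        if self.sy == "ident":
            self.insymbol()
        else:
            self.error(ERR_BAD_EXPRESSION)
            self.skip(fsys)

    def statement(self, fsys):
        # Toy grammar: a while statement, an identifier, or nothing.
        if self.sy == "whilesy":
            self.insymbol()
            self.whilestatement(fsys)
        elif self.sy == "ident":
            self.insymbol()

    def whilestatement(self, fsys):
        # 'do' is added to the follow set while the condition is parsed.
        self.expression(fsys | {"dosy"})
        if self.sy == "dosy":
            self.insymbol()
        else:
            self.error(ERR_DO_EXPECTED)
        self.statement(fsys)


def parse(symbols):
    parser = Parser(symbols)
    parser.statement({"semicolon", "endsy"})
    return parser.errors, parser.sy


print(parse(["whilesy", "ident", "dosy", "ident"]))
print(parse(["whilesy", "ident", "ident"]))
print(parse(["whilesy", "comma", "comma", "dosy", "ident"]))
```

Passing fsys down and widening it with dosy is what lets a damaged condition be skipped up to the next do, so that one error does not hide the rest of the statement.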

Semantic Analysis

enterid, searchid, searchsection, getbounds, equalbounds, comptypes
no endless recursion for cyclic data structures (pointers)

Code Generation

gen0, gen1, gen2, gen0t, gen1t, gen2t: generate code for 0, 1 or 2 parameters with or without types
mes: computes maximum stack depth
genfjp, genujpxjp, gencupent: branch, switch and procedure call
alignquot, align: address computations
load, store, loadaddress: operand loads and stores
checkbnds: checks bounds
genlabel, putlabel: generation of labels
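How the gen routines and mes cooperate can be sketched as follows. This is a minimal Python illustration, not the compiler's code: the STACK_EFFECT table and its values are hypothetical, the counters topnew and topmax are named by assumption, and the emitted text format is simplified.

```python
# Hypothetical stack effect of each instruction: +1 pushes, -1 pops one cell.
STACK_EFFECT = {"lod": +1, "ldc": +1, "adi": -1, "str": -1}


class CodeGen:
    def __init__(self):
        self.code = []      # emitted symbolic instructions
        self.topnew = 0     # current depth of the expression stack
        self.topmax = 0     # largest depth seen so far

    def mes(self, op):
        # Track the stack depth so the maximum space needed is known.
        self.topnew += STACK_EFFECT[op]
        self.topmax = max(self.topmax, self.topnew)

    def gen0(self, op):
        # Instruction without parameters.
        self.code.append(op)
        self.mes(op)

    def gen1(self, op, q):
        # Instruction with one parameter.
        self.code.append(f"{op} {q}")
        self.mes(op)

    def gen2(self, op, p, q):
        # Instruction with two parameters (for example level and offset).
        self.code.append(f"{op} {p} {q}")
        self.mes(op)


# a := b + 1, with a at offset 5 and b at offset 6 (made-up addresses)
cg = CodeGen()
cg.gen2("lod", 0, 6)
cg.gen1("ldc", 1)
cg.gen0("adi")
cg.gen2("str", 0, 5)
print(cg.code, cg.topmax)
```

Recording the maximum during generation means the stack space for a procedure is known at compile time and need not be checked on every push.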

TYPES

Types are represented internally by the type structure [118-32]. All types use the size field, which contains the run-time store size needed to hold an object of that type. (The marked field is used only by the procedure printtables [676-845] when printing out the compiler tables, if the t option is switched on.)

All other fields depend on the form of the type, if it is a pointer, or an array, and so on.

Scalar
A scalar type is either declared, that is, an enumeration, in which case fconst points to the last identifier in the list (they are linked together by their next field) [1061]; or it is standard, when the type is integer, real, or char (boolean is declared). These latter four can be distinguished by comparing the pointer value to the structure with one of the four pointers intptr, realptr, charptr, boolptr, which are initialised [3646 et seq] (see for example [652-7]).

Subrange
Here rangetype points to the type of which this is a subrange (for example, integer in 1..10); and min and max hold the minimum and maximum values (1 and 10 in the above case).

Pointer
Eltype points to the type pointed at (integer in ^integer).

Power
For sets, elset points to the element type of the set (for example, char in set of char).

Arrays
Inxtype points to the index type of the array and aeltype to the element type (e.g. char and integer respectively in array [char] of integer).
A multi-dimensional array, like array [1..10, 1..10] of real, is treated identically to array [1..10] of array [1..10] of real. (The diagram of the resulting structure is not reproduced here.)

Files
Filtype points to the file type (for example integer in file of integer).

Records
Fstfld points to the first field of the record, the other fields being linked in the binary tree described before.
Recvar points to a structure of form tagfld representing the variant part of the record (it is nil if there is no variant part).
A tagfld has two fields: tagfieldp points to the identifier for the tag field, and fstvar points to a list of structures of form variant, each representing one of the case labels.
As an example:

type r = record i: integer;
           case b: colour of
             red, blue: (j: integer);
             green: (k: integer;
                     case c: boolean of
                       true: (a: real))
         end;
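The treatment of multi-dimensional arrays and the identity test for standard types can be sketched with a small model of the type structure. This is a Python illustration under assumptions: only a few of the fields are modelled, every scalar is given size 1, and the array size is taken as element count times element size. The helpers subrange and array_of are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # compare by identity, as with pointer values
class Structure:
    form: str
    size: int = 1
    rangetype: Optional["Structure"] = None
    min: int = 0
    max: int = 0
    inxtype: Optional["Structure"] = None
    aeltype: Optional["Structure"] = None


# Standard scalar types exist once; a type is "integer" if it *is* INTPTR.
INTPTR = Structure("scalar")
REALPTR = Structure("scalar")


def subrange(lo, hi):
    """Subrange of integer, e.g. 1..10."""
    return Structure("subrange", size=INTPTR.size,
                     rangetype=INTPTR, min=lo, max=hi)


def array_of(index_types, element):
    """Build array [i1, i2, ...] of element as nested one-dimensional arrays."""
    result = element
    for inx in reversed(index_types):
        count = inx.max - inx.min + 1
        result = Structure("arrays", size=count * result.size,
                           inxtype=inx, aeltype=result)
    return result


# array [1..10, 1..10] of real
matrix = array_of([subrange(1, 10), subrange(1, 10)], REALPTR)
print(matrix.form, matrix.aeltype.form, matrix.aeltype.aeltype is REALPTR)
print(matrix.size)
```

Because the two-dimensional array is stored as an array whose element type is itself an array, every routine that handles one-dimensional arrays handles the general case with no extra code.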

TABLE OF CONTENTS

Compiler Listing 3
Assembler/Interpreter Listing 67

The sources of the Pascal program are available in machine-readable form on magnetic tape on application to the publishers.

British Library Cataloguing in Publication Data
Pemberton, Steven
Pascal implementation. - (Ellis Horwood series in computer science)
1. PASCAL (Computer program language)
I. Title II. Daniels, Martin
001.64'21 QA76.73.P2
Library of Congress Card No. 81-20184 AACR2
ISBN 0-85312-358-6 Book (Ellis Horwood Ltd., Publishers)
ISBN 0-85312-437-X Compiler (Ellis Horwood Ltd., Publishers)
ISBN 0-470-27323-9 (Halsted Press)

Compiler Listing

 1 (*$c+,t-,d-,l-*)
 2 (***********************************************
 3  *                                             *
 4  *                                             *
 5  *      Portable Pascal compiler               *
 6  *      ************************               *
 7  *                                             *
 8  *             Pascal P4                       *
 9  *                                             *
10  *                                             *
11  *     Authors:                                *
12  *           Urs Ammann                        *
13  *           Kesav Nori                        *
14  *           Christian Jacobi                  *
15  *                                             *
16  *     Address:                                *
17  *                                             *
18  *       Institut Fuer Informatik              *
19  *       Eidg. Technische Hochschule           *
20  *       CH-8096 Zuerich                       *
21  *                                             *
22  *                                             *
23  *                                             *
24  *                                             *
25  *                                             *
26  ***********************************************)
27
28
29 program pascalcompiler(input,output,prr);
30
31
32
33 const displimit = 20; maxlevel = 10;
34    intsize     = 1;
35    intal       = 1;
36    realsize    = 1;
37    realal      = 1;
38    charsize    = 1;
39    charal      = 1;
40    charmax     = 1;
41    boolsize    = 1;
42    boolal      = 1;
43    ptrsize     = 1;
44    adral       = 1;
45    setsize     = 1;
46    setal       = 1;
47    stackal     = 1;
48    stackelsize = 1;
49    strglgth    = 16;
50    sethigh     = 47;
51    setlow      = 0;
52    ordmaxchar  = 63;
53    ordminchar  = 0;
54    maxint      = 32767;
55    lcaftermarkstack = 5;

4Pascal implementation: Compiler and Assembler/InterpreterCompiler Listing57 (" steckoleite - minimum alto (or 1 steekelement121_ette form: structform of58k"steckel122scalar: (case scelkindi doclkind of59 ntackel scm(sli other al-conatante) 123 declared: ((coned ctp));60 charmax ■ scm(chersize,charal) 124 put:range& (rangetypet etp; m1n,maxt valu),61 scm smalleet common multiple 125 pointer: (sitypet stp),62 lca(termarketock 4"ptreitnimnx(x-ales) 126 powers (cleat, sip),63 • klAsteckeleiee A)127 errL/st (aeltype,inxtypet etp);64 maxetack 1; 128 records! ((etfIdt ctp; recvart etp);65 pormnl eteckell 129 Most (filtype: etp);66 parmeise stockelsixe: 130 tegfldt (tegfieldpt ctp; fotvar: stp);61 recd; - etackal; 131 variant! (nxtvar,subvar: etp; vArvalt vale)613 filebuffor ■ 4; 132 end;69 mexaddr mexint; 13370 13471 13572 13673 type ("describing: ")137 idcless (typee,konst,vars,field,proc,func);(AAAAAAAAAAAAA)74 138 eetofide - set of 1dcleas;15 139 idkind (actuel,formel);76 140 alpha .• packed array (1,.(1) of char;71 ("bneic symbols")141711 (AAAAAAAAAAAAAAA)142 identifier ■ packed record79 143 name: alpha; llink, rllnk: ctp;80 symbol - (ldent,lntconst,realconot,stringconst,notey,mulop,addop,relop, 144 1dtypo: stp; next: ctp;81 1parent,rparent,lbrack,rbreck,comma,semicolon,period,arrow, 145 case klassi 1dclees of82 colon,bocomoe,laboley,conetey,typeey,verey,funcey,progey, 146 konet: (values: valu),03 procey,setey,packedey,arreyey,recordey,fileey,forwardey, 147 vertu (vkindi idklnd; vlev: levrange; vaddr: eddrrange),84 beginey,ifey,casetty,repeetey,whilesy,forey,withey, 148 field: (fldaddr: eddrrange);85 gotosy,encley,eiseey,untiley,ofey,doey,tosy,downtotty, 149 proc,86 theney,othorey), 150 tune: (case pideckind: declklnd of01 operator .• (mul,rdiv,andop,idiv i lmod,plun,mlnua,orop,ltop,loop,geop,gtop, 151 standard: (key, 1..15);88 neop,eqop,lnop,noop); 152 declared: (pflev: levrange; p(nnmc: Integer;09 sotofeyo - set of symbol; 153 case pfkindt ldkind of90 chtp - 
(letter,numbor,epoclal,illogal, 154 actual: (forwdecl, extern:91 chstrquo,chcolon,chporiod,chlt,chgt,chlparen,chopece), 155 boolean)))92 156 end;93 ("constants") 15794 (AAAAAAAAAAA)15095 159 dieptange ■ 0..diapllmit;96 cetclase (rcel,peet,strg); 160 where (61ck,crec,vrec,rec);97 cep - " constant; 16190 conetnnt record case cciette: cetcleee of 162 ("expresslona")99 reel: (rvalt pecked array (1..otrglgthj of char); 163 (ss AAAAA 401A41014)100 peat: (pvalt set of setlow—sothigh), 164 attrkind (cst,varbl,expr);101 etrg: (elgth: 0..etrglgth; 165 vaccoos (drct ; indrct,inxd);102 ovals packed array (1..strglgth) of char) 166103 end; 167 attr record typtr: stp;104 168 case kind: attrklnd of105 valu record case intvell boolean of ("intval never sat nor teated") 169 coat (cvalt val.);106 true: (ival, integer); 110 varbl: (case Access: vaccess of107 false: (yelp: cep) 171 drct1 (vlevel: levrange; dplmtt eddrrange);108 and; 172 lndrctt (idplmt: eddrrange))109 173 end;110 ("date structures") 114111 •(AAAAAAAAAAA AAAAA A) 175 testy " testpotnter;112 levrange ■ 0..maxlevel; eddrrange ■ 0..maxeddr; 176 testpointer pecked record113 etructform (ecaler,eubrangc,pointer,power,arreye,recordo,filee, Ill eltl,elt2 stp;114 tegfld,varient)1 176 leetteetp t teetp115 declkind ■ (stsndard,declared); 179 end;116 stp " structure; ctp ■ " identlfiet; 1 180117 181 ("labels")118 structure packed record 182119 marked: boolnan; ("for test phase only") 183 lbp " labl;120 size' eddrrange{ 184 1abl record nextlab: lbp; defined: boolcon;

6 Pascal Implementation; Compiler and Assembler/Interpreter185 labvol, labnamet integer186 end;187188 extfliep - 'Mettle;189 fllerec record filenamelalphal nextfllelextfilop end;190191 ("192193194 var195 ("returned by source program scanner196 ineymbol:197AAAAAAAAAA)190199 cyt symbol; ("last symbol")200 op' operator; ("classification of last symbol")201 vol: volu; ("value of last constant")202 lgth: Integer; ("length of last string constant")2U3 id: alpha; ("last identifier (possibly truncated)")204 kkt 1..8; ("nr of chars in last identifier")205 chi char; ("last character")206 toll boolosn; ("end of lino flag")207208209 ("counters!")210(AAAAAAAAAAA)211212 client: integer; ("character counter")213 lc,ici addrrangel ("data location and instruction counter")214 linecounti integer;215216217 ("switches!")218(AAAAAAAAAAA)219220 dp, ("declaration part")221 prterr, ("to allow forward references in pointer type222 declaration by suppressing error message")223 list,prcode,prtablosi booleanl ("output options for224 -- source program listing225 -- printing eymbolic code226 -- displaying ident and etruct tables227 --> procedure option")228 debug' booleang229230231 ("pointers!")232 (AAAAAAAAAAA)233 parmptr,234 intptr,realptr,charptc,235 boolptroilptc,textptr: stp; ("pointers .to entries of etandard ids")236 utypptr,ucetptr,uverptr,23• ufldptr,up'rcptr,ufctptr, ("pointers to entries for undeclared (ds')238 fwptri ctp; (Ahead of chain of torw dccl type Ida")239 fextfilnp: extfilop; (Ahead of chain of external (nee")240 globtentp: teetp; ("last tnetpointer")241242243 (Abookkeoping of declaration levels:")244(AAAAAAAAAA 44444 AAAAAAAAAAAAAAAAAAAAA)245246 level' levrange; ("current ototic level")247 dies, (Alevul of last id searched by searchid")248 top, dieprangei (Atop of display")A)Compiler Listing 7249250 displays (*whore! moans1")251 array fdisprange) of252 packed record ("back! id is variable 10)253 fnamel ctp; !label! 
Ibpg ("creel id is (told Id in record with')254 case occur; where of (" constant address")255 cruel (clevi levrange; (".vreci id is field ld in record with')256 cdspli addrrange);( 4 yr:I- table address")257 vrect (vdsplt eddrrange)258 end; (" --> procedure withs6itement")259260261 (*error messages:")262 (AAAAAAAAAAAAAAAAA)263264 errir..:1 0..101 ('or of errors in current source line")265 orrliett266 array 11..10) of267 packed record post integer;260 nmr, 1..400269 end;270271272273 ("expression compilation:")274 (AAAAAAAAAAAAAAAAAAAAAAAAA)275276 gettr: attr; ("describes the expr currently compiled')271278279 ("structured constants:")280 (A AAAAA AAAAAAAAA AAAAAAAA )281282 constbeggys,simptypebegays,typebegsye,blockbegsys,selectamfacbegsya,283 statb,gsys,typedele: setofdys;284 chartp array(char) of chip;285 rut array (1..35("nr. of red. words")) of alpha;286 ft. wl array (1..9) of 1..36(Anr. of tee. words + I");287 royt irray (1..35(Anr. of res. words")) of symbol;288 sey1 array char) of symbol;209 rope nrruy 1..35("nr. of res. words")) of operator;290 cope nrtny char) of operator;291 nat Array 1..351 of slpha;292 ant array 0..601 of packed array (1..4) of char;293 sna: array 1..23) of packed array (1..4) of char;294 cdx: artily 0..60) of -4..+4;295 pdx: array 1..231 of -7..+7;296 ordintt array (char) of integer;297298 intlnbel,mxint10,digmaxi integer;299300 (A A)301302303 procedure endoflitle;304 var lastpos,freepos,currpoa,currnmr,f,k1 integer;305 begin306 if erring > 0 then ("output error messages")307 begin write(output,' **** ':15);308 lastpos 1. 0; freopos r. 1;309 for k 1- 1 to erring do310 begin311 with errlist(k) do312 begin currpoe 1. poe; currnmr nmr end;

313           if currpos = lastpos then write(output,',')
314           else
315             begin
316               while freepos < currpos do
317                 begin write(output,' '); freepos := freepos + 1 end;
318               write(output,'^');
319               lastpos := currpos
320             end;
321           if currnmr < 10 then f := 1
322           else if currnmr < 100 then f := 2
323             else f := 3;
324           write(output,currnmr:f);
325           freepos := freepos + f + 1
326         end;
327       writeln(output); errinx := 0
328     end;
329   linecount := linecount + 1;
330   if list and (not eof(input)) then
331     begin write(output,linecount:6,'  ':2);
332       if dp then write(output,lc:7) else write(output,ic:7);
333       write(output,' ')
334     end;
335   chcnt := 0
336 end (*endofline*) ;
337
338 procedure error(ferrnr: integer);
339 begin
340   if errinx >= 9 then
341     begin errlist[10].nmr := 255; errinx := 10 end
342   else
343     begin errinx := errinx + 1;
344       errlist[errinx].nmr := ferrnr
345     end;
346   errlist[errinx].pos := chcnt
347 end (*error*) ;
348
349 procedure insymbol;
350   (*read next basic symbol of source program and return its
351   description in the global variables sy, op, id, val and lgth*)
352   label 1,2,3;
353   var i,k: integer;
354       digit: packed array [1..strglgth] of char;
355       string: packed array [1..strglgth] of char;
356       lvp: csp; test: boolean;
357
358   procedure nextch;
359   begin if eol then
360     begin if list then writeln(output); endofline
361     end;
362     if not eof(input) then
363       begin eol := eoln(input); read(input,ch);
364         if list then write(output,ch);
365         chcnt := chcnt + 1
366       end
367     else
368       begin writeln(output,' *** eof ','encountered');
369         test := false
370       end
371   end;
372
373   procedure options;
374   begin
375     repeat nextch;
376       if ch <> '*' then
377         begin
378           if ch = 't' then
379             begin nextch; prtables := ch = '+' end
380           else
381             if ch = 'l' then
382               begin nextch; list := ch = '+';
383                 if not list then writeln(output)
384               end
385             else
386               if ch = 'd' then
387                 begin nextch; debug := ch = '+' end
388               else
389                 if ch = 'c' then
390                   begin nextch; prcode := ch = '+' end;

10 Pascal Implementation: Compiler and Asser - 'iler/Interpreter Compiler Listing I I441 if ch • 'e' then442 begin k t. k+1; if k 0 , digmax then digit(k) :. ch;44) nextch;444 if (ch • '+') or (ch -'-') then445 begin k 1.. k+1; if k C. digmax then digit(k) t. chi446 nextch447 end;448 if chartplchj number then errur(201)449 else450 repeat k 1. k+1;451 if k ' then521 begin op sm noop; nextch end522 else op t ■ !top523 end;524 chgtt525 begin nextch; sy 1. relop;526 if ch '-' then527 begin op geop; noxtch end520 else op 1.. gtop529 end;530 chlparen:531 begin nextch;532 if ch - 'C' then533 begin nextch;534 if ch '$' then options;535 repeat536 while (ch 'A') and not eof(input) do nextch;537 nextch538 until (ch • ')') or oof(input);539 nextch; goto 1540 and;541 ny 1porent; op :- noop542 enu;543 specials544 begin ey t• saylohj; op t ■ sop(ch);545 nextch546 end;547 chepecet sy 1. othersy548 end (*caso 04 )549 end (Cinaymbol*)550551 procedure entorid(fcp; ctp);552 (*enter id pointed at by fop into the name-table,553 which on each declaration level is organised as554 an unbalanced binary tree*)555 var namt alpha; lcp, lcpli ctp; lleftt boolesn;556 begin nero t. fer.neme;557 lcp t. display(top)ansmel558 if lcp nil then559 dispinyltopj.fneme t• fcp560 else561 begin562 repeat lepl t ■ lap;563 if lcp".name nem then (*name conflict, follow right link*)564 begin arror(101); lop t• lor.rlink; !left565 else4false end566 if ler.nome < nom then567568begin lop t- lcp".rlink; !left 1- false endelse begin lcp 1. lcp".11ink; heft t. true end

        until lcp = nil;
        if lleft then lcp1^.llink := fcp else lcp1^.rlink := fcp
      end;
    fcp^.llink := nil; fcp^.rlink := nil
  end (*enterid*) ;

  procedure searchsection(fcp: ctp; var fcp1: ctp);
    (*to find record fields and forward declared procedure id's
     --> procedure proceduredeclaration
     --> procedure selector*)
    label 1;
  begin
    while fcp <> nil do
      if fcp^.name = id then goto 1
      else if fcp^.name < id then fcp := fcp^.rlink
        else fcp := fcp^.llink;
1:  fcp1 := fcp
  end (*searchsection*) ;

  procedure searchid(fidcls: setofids; var fcp: ctp);
    label 1;
    var lcp: ctp;
  begin
    for disx := top downto 0 do
      begin lcp := display[disx].fname;
        while lcp <> nil do
          if lcp^.name = id then
            if lcp^.klass in fidcls then goto 1
            else
              begin if prterr then error(103);
                lcp := lcp^.rlink
              end
          else
            if lcp^.name < id then
              lcp := lcp^.rlink
            else lcp := lcp^.llink
      end;
    (*search not successful; suppress error message in case
     of forward referenced type id in pointer type definition
     --> procedure simpletype*)
    if prterr then
      begin error(104);
        (*to avoid returning nil, reference an entry
         for an undeclared id of appropriate class
         --> procedure enterundecl*)
        if types in fidcls then lcp := utypptr
        else
          if vars in fidcls then lcp := uvarptr
          else
            if field in fidcls then lcp := ufldptr
            else
              if konst in fidcls then lcp := ucstptr
              else
                if proc in fidcls then lcp := uprcptr
                else lcp := ufctptr;
      end;
1:  fcp := lcp
  end (*searchid*) ;

  procedure getbounds(fsp: stp; var fmin,fmax: integer);
    (*get internal bounds of subrange or scalar type*)
    (*assume fsp<>intptr and fsp<>realptr*)
  begin
    fmin := 0; fmax := 0;
    if fsp <> nil then
      with fsp^ do
        if form = subrange then
          begin fmin := min.ival; fmax := max.ival end
        else
          if fsp = charptr then
            begin fmin := ordminchar; fmax := ordmaxchar
            end
          else
            if fconst <> nil then
              fmax := fconst^.values.ival
  end (*getbounds*) ;

  function alignquot(fsp: stp): integer;
  begin
    alignquot := 1;
    if fsp <> nil then
      with fsp^ do
        case form of
          scalar:   if fsp=intptr then alignquot := intal
                    else if fsp=boolptr then alignquot := boolal
                    else if scalkind=declared then alignquot := intal
                    else if fsp=charptr then alignquot := charal
                    else if fsp=realptr then alignquot := realal
                    else (*parmptr*) alignquot := parmal;
          subrange: alignquot := alignquot(rangetype);
          pointer:  alignquot := adral;
          power:    alignquot := setal;
          files:    alignquot := fileal;
          arrays:   alignquot := alignquot(aeltype);
          records:  alignquot := recal;
          variant,tagfld: error(501)
        end
  end (*alignquot*);

  procedure align(fsp: stp; var flc: integer);
    var k,l: integer;
  begin
    k := alignquot(fsp);
    l := flc-1;
    flc := l + k - (k+l) mod k
  end (*align*);

  procedure printtables(fb: boolean);
    (*print data structure and name table*)
    var i, lim: disprange;

    procedure marker;
      (*mark data structure entries to avoid multiple printout*)
      var i: integer;

      procedure markctp(fp: ctp); forward;

      procedure markstp(fp: stp);
        (*mark data structures, prevent cycles*)
      begin
        if fp <> nil then
          with fp^ do
            begin marked := true;
              case form of
              scalar:   ;
              subrange: markstp(rangetype);
              pointer:  (*don't mark eltype: cycle possible; will be marked
                          anyway, if fp = true*) ;

              power:    markstp(elset) ;
              arrays:   begin markstp(aeltype); markstp(inxtype) end;
              records:  begin markctp(fstfld); markstp(recvar) end;
              files:    markstp(filtype);
              tagfld:   markstp(fstvar);
              variant:  begin markstp(nxtvar); markstp(subvar) end
              end (*case*)
            end (*with*)
      end (*markstp*);

      procedure markctp;
      begin
        if fp <> nil then
          with fp^ do
            begin markctp(llink); markctp(rlink);
              markstp(idtype)
            end
      end (*markctp*);

    begin (*marker*)
      for i := top downto lim do
        markctp(display[i].fname)
    end (*marker*);

    procedure followctp(fp: ctp); forward;

    procedure followstp(fp: stp);
    begin
      if fp <> nil then
        with fp^ do
          if marked then
            begin marked := false; write(output,' ':4,ord(fp):6,size:10);
              case form of
              scalar:   begin write(output,'scalar':10);
                          if scalkind = standard then
                            write(output,'standard':10)
                          else write(output,'declared':10,' ':4,ord(fconst):6);
                          writeln(output)
                        end;
              subrange: begin
                          write(output,'subrange':10,' ':4,ord(rangetype):6);
                          if rangetype <> realptr then
                            write(output,min.ival,max.ival)
                          else
                            if (min.valp <> nil) and (max.valp <> nil) then
                              write(output,' ',min.valp^.rval:9,
                                    ' ',max.valp^.rval:9);
                          writeln(output); followstp(rangetype);
                        end;
              pointer:  writeln(output,'pointer':10,' ':4,ord(eltype):6);
              power:    begin writeln(output,'set':10,' ':4,ord(elset):6);
                          followstp(elset)
                        end;
              arrays:   begin
                          writeln(output,'array':10,' ':4,ord(aeltype):6,' ':4,
                            ord(inxtype):6);
                          followstp(aeltype); followstp(inxtype)
                        end;
              records:  begin
                          writeln(output,'record':10,' ':4,ord(fstfld):6,' ':4,
                            ord(recvar):6); followctp(fstfld);
                          followstp(recvar)
                        end;
              files:    begin write(output,'file':10,' ':4,ord(filtype):6);
                          followstp(filtype)
                        end;
              tagfld:   begin writeln(output,'tagfld':10,' ':4,ord(tagfieldp):6,
                            ' ':4,ord(fstvar):6);
                          followstp(fstvar)
                        end;
              variant:  begin writeln(output,'variant':10,' ':4,ord(nxtvar):6,
                            ' ':4,ord(subvar):6,varval.ival);
                          followstp(nxtvar); followstp(subvar)
                        end
              end (*case*)
            end (*if marked*)
    end (*followstp*);

    procedure followctp;
      var i: integer;
    begin
      if fp <> nil then
        with fp^ do
          begin write(output,' ':4,ord(fp):6,' ',name:9,' ':4,ord(llink):6,
            ' ':4,ord(rlink):6,' ':4,ord(idtype):6);
            case klass of
              types: write(output,'type':10);
              konst: begin write(output,'constant':10,' ':4,ord(next):6);
                       if idtype <> nil then
                         if idtype = realptr then
                           begin
                             if values.valp <> nil then
                               write(output,' ',values.valp^.rval:9)
                           end
                         else
                           if idtype^.form = arrays then  (*stringconst*)
                             begin
                               if values.valp <> nil then
                                 begin write(output,' ');
                                   with values.valp^ do
                                     for i := 1 to slgth do
                                       write(output,sval[i])
                                 end
                             end
                           else write(output,values.ival)
                     end;
              vars:  begin write(output,'variable':10);
                       if vkind = actual then write(output,'actual':10)
                       else write(output,'formal':10);
                       write(output,' ':4,ord(next):6,vlev,' ':4,vaddr:6 );
                     end;
              field: write(output,'field':10,' ':4,ord(next):6,' ':4,fldaddr:6);
              proc,
              func:  begin
                       if klass = proc then write(output,'procedure':10)
                       else write(output,'function':10);
                       if pfdeckind = standard then
                         write(output,'standard':10, key:10)
                       else
                         begin write(output,'declared':10,' ':4,ord(next):6);
                           write(output,pflev,' ':4,pfname:6);
                           if pfkind = actual then
                             begin write(output,'actual':10);
                               if forwdecl then write(output,'forward':10)
                               else write(output,'notforward':10);
                               if extern then write(output,'extern':10)
                               else write(output,'not extern':10);
                             end

                           else write(output,'formal':10)
                         end
                     end
            end (*case*);
            writeln(output);
            followctp(llink); followctp(rlink);
            followstp(idtype)
          end (*with*)
    end (*followctp*);

  begin (*printtables*)
    writeln(output); writeln(output); writeln(output);
    if fb then lim := 0
    else begin lim := top; write(output,' local') end;
    writeln(output,' tables '); writeln(output);
    marker;
    for i := top downto lim do
      followctp(display[i].fname);
    writeln(output);
    if not eol then write(output,' ':chcnt+16)
  end (*printtables*);

  procedure genlabel(var nxtlab: integer);
  begin intlabel := intlabel + 1;
    nxtlab := intlabel
  end (*genlabel*);

  procedure block(fsys: setofsys; fsy: symbol; fprocp: ctp);
    var lsy: symbol; test: boolean;

    procedure skip(fsys: setofsys);
      (*skip input string until relevant symbol found*)
    begin
      if not eof(input) then
        begin while not(sy in fsys) and (not eof(input)) do insymbol;
          if not (sy in fsys) then insymbol
        end
    end (*skip*) ;

    procedure constant(fsys: setofsys; var fsp: stp; var fvalu: valu);
      var lsp: stp; lcp: ctp; sign: (none,pos,neg);
          lvp: csp; i: 2..strglgth;
    begin lsp := nil; fvalu.ival := 0;
      if not(sy in constbegsys) then
        begin error(50); skip(fsys+constbegsys) end;
      if sy in constbegsys then
        begin
          if sy = stringconstsy then
            begin
              if lgth = 1 then lsp := charptr
              else
                begin
                  new(lsp,arrays);
                  with lsp^ do
                    begin aeltype := charptr; inxtype := nil;
                      size := lgth*charsize; form := arrays
                    end
                end;
              fvalu := val; insymbol
            end
          else
            begin
              sign := none;
              if (sy = addop) and (op in [plus,minus]) then
                begin if op = plus then sign := pos else sign := neg;
                  insymbol
                end;
              if sy = ident then
                begin searchid([konst],lcp);
                  with lcp^ do
                    begin lsp := idtype; fvalu := values end;
                  if sign <> none then
                    if lsp = intptr then
                      begin if sign = neg then fvalu.ival := -fvalu.ival end
                    else
                      if lsp = realptr then
                        begin
                          if sign = neg then
                            begin new(lvp,reel);
                              if fvalu.valp^.rval[1] = '-' then
                                lvp^.rval[1] := '+'
                              else lvp^.rval[1] := '-';
                              for i := 2 to strglgth do
                                lvp^.rval[i] := fvalu.valp^.rval[i];
                              fvalu.valp := lvp;
                            end
                        end
                      else error(105);
                  insymbol;
                end
              else
                if sy = intconst then
                  begin if sign = neg then val.ival := -val.ival;
                    lsp := intptr; fvalu := val; insymbol
                  end
                else
                  if sy = realconst then
                    begin if sign = neg then val.valp^.rval[1] := '-';
                      lsp := realptr; fvalu := val; insymbol
                    end
                  else
                    begin error(106); skip(fsys) end
            end;
          if not (sy in fsys) then
            begin error(6); skip(fsys) end
        end;
      fsp := lsp
    end (*constant*) ;

    function equalbounds(fsp1,fsp2: stp): boolean;
      var lmin1,lmin2,lmax1,lmax2: integer;
    begin
      if (fsp1=nil) or (fsp2=nil) then equalbounds := true
      else
        begin
          getbounds(fsp1,lmin1,lmax1);
          getbounds(fsp2,lmin2,lmax2);
          equalbounds := (lmin1=lmin2) and (lmax1=lmax2)
        end
    end (*equalbounds*) ;

    function comptypes(fsp1,fsp2: stp) : boolean;
      (*decide whether structures pointed at by fsp1 and fsp2 are compatible*)
      var nxt1,nxt2: ctp; comp: boolean;
          ltestp1,ltestp2 : testp;
    begin
      if fsp1 = fsp2 then comptypes := true
      else

        if (fsp1 <> nil) and (fsp2 <> nil) then
          if fsp1^.form = fsp2^.form then
            case fsp1^.form of
              scalar:
                comptypes := false;
                (* identical scalars declared on different levels are
                 not recognized to be compatible*)
              subrange:
                comptypes := comptypes(fsp1^.rangetype,fsp2^.rangetype);
              pointer:
                begin
                  comp := false; ltestp1 := globtestp;
                  ltestp2 := globtestp;
                  while ltestp1 <> nil do
                    with ltestp1^ do
                      begin
                        if (elt1 = fsp1^.eltype) and
                           (elt2 = fsp2^.eltype) then comp := true;
                        ltestp1 := lasttestp
                      end;
                  if not comp then
                    begin new(ltestp1);
                      with ltestp1^ do
                        begin elt1 := fsp1^.eltype;
                          elt2 := fsp2^.eltype;
                          lasttestp := globtestp
                        end;
                      globtestp := ltestp1;
                      comp := comptypes(fsp1^.eltype,fsp2^.eltype)
                    end;
                  comptypes := comp; globtestp := ltestp2
                end;
              power:
                comptypes := comptypes(fsp1^.elset,fsp2^.elset);
              arrays:
                begin
                  comp := comptypes(fsp1^.aeltype,fsp2^.aeltype)
                      and comptypes(fsp1^.inxtype,fsp2^.inxtype);
                  comptypes := comp and (fsp1^.size = fsp2^.size) and
                      equalbounds(fsp1^.inxtype,fsp2^.inxtype)
                end;
              records:
                begin nxt1 := fsp1^.fstfld; nxt2 := fsp2^.fstfld; comp:=true;
                  while (nxt1 <> nil) and (nxt2 <> nil) do
                    begin comp:=comp and comptypes(nxt1^.idtype,nxt2^.idtype);
                      nxt1 := nxt1^.next; nxt2 := nxt2^.next
                    end;
                  comptypes := comp and (nxt1 = nil) and (nxt2 = nil)
                              and(fsp1^.recvar = nil)and(fsp2^.recvar = nil)
                end;
                (*identical records are recognized to be compatible
                 iff no variants occur*)
              files:
                comptypes := comptypes(fsp1^.filtype,fsp2^.filtype)
            end (*case*)
          else (*fsp1^.form <> fsp2^.form*)
            if fsp1^.form = subrange then
              comptypes := comptypes(fsp1^.rangetype,fsp2)
            else
              if fsp2^.form = subrange then
                comptypes := comptypes(fsp1,fsp2^.rangetype)
              else comptypes := false
        else comptypes := true
    end (*comptypes*) ;

    function string(fsp: stp) : boolean;
    begin string := false;
      if fsp <> nil then
        if fsp^.form = arrays then
          if comptypes(fsp^.aeltype,charptr) then string := true
    end (*string*) ;

    procedure typ(fsys: setofsys; var fsp: stp; var fsize: addrrange);
      var lsp,lsp1,lsp2: stp; oldtop: disprange; lcp: ctp;
          lsize,displ: addrrange; lmin,lmax: integer;

      procedure simpletype(fsys:setofsys; var fsp:stp; var fsize:addrrange);
        var lsp,lsp1: stp; lcp,lcp1: ctp; ttop: disprange;
            lcnt: integer; lvalu: valu;
      begin fsize := 1;
        if not (sy in simptypebegsys) then
          begin error(1); skip(fsys + simptypebegsys) end;
        if sy in simptypebegsys then
          begin
            if sy = lparent then
              begin ttop := top;   (*decl. consts local to innermost block*)
                while display[top].occur <> blck do top := top - 1;
                new(lsp,scalar,declared);
                with lsp^ do
                  begin size := intsize; form := scalar;
                    scalkind := declared
                  end;
                lcp1 := nil; lcnt := 0;
                repeat insymbol;
                  if sy = ident then
                    begin new(lcp,konst);
                      with lcp^ do
                        begin name := id; idtype := lsp; next := lcp1;
                          values.ival := lcnt; klass := konst
                        end;
                      enterid(lcp);
                      lcnt := lcnt + 1;
                      lcp1 := lcp; insymbol
                    end
                  else error(2);
                  if not (sy in fsys + [comma,rparent]) then
                    begin error(6); skip(fsys + [comma,rparent]) end
                until sy <> comma;
                lsp^.fconst := lcp1; top := ttop;
                if sy = rparent then insymbol else error(4)
              end
            else
              begin
                if sy = ident then
                  begin searchid([types,konst],lcp);
                    insymbol;
                    if lcp^.klass = konst then
                      begin new(lsp,subrange);
                        with lsp^, lcp^ do
                          begin rangetype := idtype; form := subrange;
                            if string(rangetype) then
                              begin error(148); rangetype := nil end;
                            min := values; size := intsize
                          end;
                        if sy = colon then insymbol else error(5);
                        constant(fsys,lsp1,lvalu);
                        lsp^.max := lvalu;
                        if lsp^.rangetype <> lsp1 then error(107)

                  begin name := id; idtype := nil;
                    extern := false; pflev := level; genlabel(lbname);
                    pfdeckind := declared; pfkind := actual; pfname := lbname;
                    if fsy = procsy then klass := proc
                    else klass := func
                  end;
                enterid(lcp)
              end
            else
              begin lcp1 := lcp^.next;
                while lcp1 <> nil do
                  begin
                    with lcp1^ do
                      if klass = vars then
                        if idtype <> nil then
                          begin lcm := vaddr + idtype^.size;
                            if lcm > lc then lc := lcm
                          end;
                    lcp1 := lcp1^.next
                  end
              end;
            insymbol
          end
        else
          begin error(2); lcp := ufctptr end;
        oldlev := level; oldtop := top;
        if level < maxlevel then level := level + 1 else error(251);
        if top < displimit then
          begin top := top + 1;
            with display[top] do
              begin
                if forw then fname := lcp^.next
                else fname := nil;
                flabel := nil;
                occur := blck
              end
          end
        else error(250);
        if fsy = procsy then
          begin parameterlist([semicolon],lcp1);
            if not forw then lcp^.next := lcp1
          end
        else
          begin parameterlist([semicolon,colon],lcp1);
            if not forw then lcp^.next := lcp1;
            if sy = colon then
              begin insymbol;
                if sy = ident then
                  begin if forw then error(122);
                    searchid([types],lcp1);
                    lsp := lcp1^.idtype;
                    lcp^.idtype := lsp;
                    if lsp <> nil then
                      if not (lsp^.form in [scalar,subrange,pointer]) then
                        begin error(120); lcp^.idtype := nil end;
                    insymbol
                  end
                else begin error(2); skip(fsys + [semicolon]) end
              end
            else
              if not forw then error(123)
          end;
        if sy = semicolon then insymbol else error(14);
        if sy = forwardsy then
          begin
            if forw then error(161)
            else lcp^.forwdecl := true;
            insymbol;
            if sy = semicolon then insymbol else error(14);
            if not (sy in fsys) then
              begin error(6); skip(fsys) end
          end
        else
          begin lcp^.forwdecl := false; mark(markp);
            repeat block(fsys,semicolon,lcp);
              if sy = semicolon then
                begin if prtables then printtables(false); insymbol;
                  if not (sy in [beginsy,procsy,funcsy]) then
                    begin error(6); skip(fsys) end
                end
              else error(14)
            until (sy in [beginsy,procsy,funcsy]) or eof(input);
            release(markp); (* return local entries on runtime heap *)
          end;
        level := oldlev; top := oldtop; lc := llc;
      end (*procdeclaration*) ;

    procedure body(fsys: setofsys);
      const cstoccmax=65; cixmax=1000;
      type oprange = 0..63;
      var
          llcp:ctp; saveid:alpha;
          cstptr: array [1..cstoccmax] of csp;
          cstptrix: 0..cstoccmax;
          (*allows referencing of noninteger constants by an index
           (instead of a pointer), which can be stored in the p2-field
           of the instruction record until writeout.
           --> procedure load, procedure writeout*)
          i, entname, segsize: integer;
          stacktop, topnew, topmax: integer;
          lcmax,llc1: addrrange; lcp: ctp;
          llp: lbp;


      procedure mes(i: integer);
      begin topnew := topnew + cdx[i]*maxstack;
        if topnew > topmax then topmax := topnew
      end;

      procedure putic;
      begin if ic mod 10 = 0 then writeln(prr,'i',ic:5) end;

      procedure gen0(fop: oprange);
      begin
        if prcode then begin putic; writeln(prr,mn[fop]:4) end;
        ic := ic + 1; mes(fop)
      end (*gen0*) ;

      procedure gen1(fop: oprange; fp2: integer);
        var k: integer;
      begin
        if prcode then
          begin putic; write(prr,mn[fop]:4);
            if fop = 30 then
              begin writeln(prr,sna[fp2]:12);
                topnew := topnew + pdx[fp2]*maxstack;
                if topnew > topmax then topmax := topnew
              end

            else
              begin
                if fop = 38 then
                  begin write(prr,'''');
                    with cstptr[fp2]^ do
                    begin
                      for k := 1 to slgth do write(prr,sval[k]:1);
                      for k := slgth+1 to strglgth do write(prr,' ');
                    end;
                    writeln(prr,'''')
                  end
                else if fop = 42 then writeln(prr,chr(fp2))
                     else writeln(prr,fp2:12);
                mes(fop)
              end
          end;
        ic := ic + 1
      end (*gen1*) ;

      procedure gen2(fop: oprange; fp1,fp2: integer);
        var k : integer;
      begin
        if prcode then
          begin putic; write(prr,mn[fop]:4);
            case fop of
              45,50,54,56:
                writeln(prr,' ',fp1:3,fp2:8);
              47,48,49,52,53,55:
                begin write(prr,chr(fp1));
                  if chr(fp1) = 'm' then write(prr,fp2:11);
                  writeln(prr)
                end;
              51:
                case fp1 of
                  1: writeln(prr,'i ',fp2);
                  2: begin write(prr,'r ');
                       with cstptr[fp2]^ do
                         for k := 1 to strglgth do write(prr,rval[k]);
                       writeln(prr)
                     end;
                  3: writeln(prr,'b ',fp2);
                  4: writeln(prr,'n');
                  6: writeln(prr,'c ''':3,chr(fp2),'''');
                  5: begin write(prr,'(');
                       with cstptr[fp2]^ do
                         for k := setlow to sethigh do
                           if k in pval then write(prr,k:3);
                       writeln(prr,')')
                     end
                end
            end;
          end;
        ic := ic + 1; mes(fop)
      end (*gen2*) ;

      procedure gentypindicator(fsp: stp);
      begin
        if fsp<>nil then
          with fsp^ do
            case form of
              scalar: if fsp=intptr then write(prr,'i')
                      else
                        if fsp=boolptr then write(prr,'b')
                        else
                          if fsp=charptr then write(prr,'c')
                          else
                            if scalkind = declared then write(prr,'i')
                            else write(prr,'r');
              subrange: gentypindicator(rangetype);
              pointer:  write(prr,'a');
              power:    write(prr,'s');
              records,arrays: write(prr,'m');
              files,tagfld,variant: error(500)
            end
      end (*typindicator*);

      procedure gen0t(fop: oprange; fsp: stp);
      begin
        if prcode then
          begin putic;
            write(prr,mn[fop]:4);
            gentypindicator(fsp);
            writeln(prr);
          end;
        ic := ic + 1; mes(fop)
      end (*gen0t*);

      procedure gen1t(fop: oprange; fp2: integer; fsp: stp);
      begin
        if prcode then
          begin putic;
            write(prr,mn[fop]:4);
            gentypindicator(fsp);
            writeln(prr,fp2:11)
          end;
        ic := ic + 1; mes(fop)
      end (*gen1t*);

      procedure gen2t(fop: oprange; fp1,fp2: integer; fsp: stp);
      begin
        if prcode then
          begin putic;
            write(prr,mn[fop]: 4);
            gentypindicator(fsp);
            writeln(prr,fp1:3+5*ord(abs(fp1)>99),fp2:8);
          end;
        ic := ic + 1; mes(fop)
      end (*gen2t*);

      procedure load;
      begin
        with gattr do
          if typtr <> nil then
            begin
              case kind of
                cst:   if (typtr^.form = scalar) and (typtr <> realptr) then
                         if typtr = boolptr then gen2(51(*ldc*),3,cval.ival)
                         else
                           if typtr=charptr then
                             gen2(51(*ldc*),6,cval.ival)
                           else gen2(51(*ldc*),1,cval.ival)
                       else
                         if typtr = nilptr then gen2(51(*ldc*),4,0)
                         else
                           if cstptrix >= cstoccmax then error(254)
                           else
                             begin cstptrix := cstptrix + 1;
                               cstptr[cstptrix] := cval.valp;

                               if typtr = realptr then
                                 gen2(51(*ldc*),2,cstptrix)
                               else
                                 gen2(51(*ldc*),5,cstptrix)
                             end;
                varbl: case access of
                         drct:   if vlevel<=1 then
                                   gen1t(39(*ldo*),dplmt,typtr)
                                 else gen2t(54(*lod*),level-vlevel,dplmt,typtr);
                         indrct: gen1t(35(*ind*),idplmt,typtr);
                         inxd:   error(400)
                       end;
                expr:
              end;
              kind := expr
            end
      end (*load*) ;

      procedure store(var fattr: attr);
      begin
        with fattr do
          if typtr <> nil then
            case access of
              drct:   if vlevel ...
(* the rest of listing line 2000 through the start of line 2015 is missing from this copy *)
                         ... cstoccmax then error(254)
                         else
                           begin cstptrix := cstptrix + 1;
                             cstptr[cstptrix] := cval.valp;
                             gen1(38(*lca*),cstptrix)
                           end
                       else error(400);
                varbl: case access of
                         drct:   if vlevel ...
(* the listing breaks off here in this copy *)

        store(lattr); genujpxjp(57(*ujp*),laddr); putlabel(lcix);
        lc := llc;
      end (*forstatement*) ;


      procedure withstatement;
        var lcp: ctp; lcnt1: disprange; llc: addrrange;
      begin lcnt1 := 0; llc := lc;
        repeat
          if sy = ident then
            begin searchid([vars,field],lcp); insymbol end
          else begin error(2); lcp := uvarptr end;
          selector(fsys + [comma,dosy],lcp);
          if gattr.typtr <> nil then
            if gattr.typtr^.form = records then
              if top < displimit then
                begin top := top + 1; lcnt1 := lcnt1 + 1;
                  with display[top] do
                    begin fname := gattr.typtr^.fstfld;
                      flabel := nil
                    end;
                  if gattr.access = drct then
                    with display[top] do
                      begin occur := crec; clev := gattr.vlevel;
                        cdspl := gattr.dplmt
                      end
                  else
                    begin loadaddress;
                      align(nilptr,lc);
                      gen2t(56(*str*),0,lc,nilptr);
                      with display[top] do
                        begin occur := vrec; vdspl := lc end;
                      lc := lc+ptrsize;
                      if lc > lcmax then lcmax := lc
                    end
                end
              else error(250)
            else error(140);
          test := sy <> comma;
          if not test then insymbol
        until test;
        if sy = dosy then insymbol else error(54);
        statement(fsys);
        top := top-lcnt1; lc := llc;
      end (*withstatement*) ;

    begin (*statement*)
      if sy = intconst then (*label*)
        begin llp := display[level].flabel;
          while llp <> nil do
            with llp^ do
              if labval = val.ival then
                begin if defined then error(165);
                  putlabel(labname); defined := true;
                  goto 1
                end
              else llp := nextlab;
          error(167);
    1:    insymbol;
          if sy = colon then insymbol else error(5)
        end;
      if not (sy in fsys + [ident]) then
        begin error(6); skip(fsys) end;
      if sy in statbegsys + [ident] then
        begin
          case sy of
            ident:    begin searchid([vars,field,func,proc],lcp); insymbol;
                        if lcp^.klass = proc then call(fsys,lcp)
                        else assignment(lcp)
                      end;
            beginsy:  begin insymbol; compoundstatement end;
            gotosy:   begin insymbol; gotostatement end;
            ifsy:     begin insymbol; ifstatement end;
            casesy:   begin insymbol; casestatement end;
            whilesy:  begin insymbol; whilestatement end;
            repeatsy: begin insymbol; repeatstatement end;
            forsy:    begin insymbol; forstatement end;
            withsy:   begin insymbol; withstatement end
          end;
          if not (sy in [semicolon,endsy,elsesy,untilsy]) then
            begin error(6); skip(fsys) end
        end
    end (*statement*) ;

  begin (*body*)
    if fprocp <> nil then entname := fprocp^.pfname
    else genlabel(entname);
    cstptrix := 0; topnew := lcaftermarkstack; topmax := lcaftermarkstack;
    putlabel(entname); genlabel(segsize); genlabel(stacktop);
    gencupent(32(*ent1*),1,segsize); gencupent(32(*ent2*),2,stacktop);
    if fprocp <> nil then (*copy multiple values into local cells*)
      begin llc1 := lcaftermarkstack;
        lcp := fprocp^.next;
        while lcp <> nil do
          with lcp^ do
            begin
              align(parmptr,llc1);
              if klass = vars then
                if idtype <> nil then
                  if idtype^.form > power then
                    begin
                      if vkind = actual then
                        begin
                          gen2(50(*lda*),0,vaddr);
                          gen2t(54(*lod*),0,llc1,nilptr);
                          gen1(40(*mov*),idtype^.size);
                        end;
                      llc1 := llc1 + ptrsize
                    end
                  else llc1 := llc1 + idtype^.size;
              lcp := lcp^.next;
            end;
      end;
    lcmax := lc;
    repeat
      repeat statement(fsys + [semicolon,endsy])
      until not (sy in statbegsys);
      test := sy <> semicolon;
      if not test then insymbol
    until test;
    if sy = endsy then insymbol else error(13);
    llp := display[top].flabel; (*test for undefined labels*)
    while llp <> nil do
      with llp^ do
        begin
          if not defined then
            begin error(168);
              writeln(output); writeln(output,' label ',labval);

The Java Virtual Machine Specification
Release 1.0 Beta
DRAFT

August 21, 1995

© 1993, 1994, 1995 Sun Microsystems, Inc.
2550 Garcia Avenue, Mountain View, California 94043-1100 U.S.A.

All rights reserved. This BETA quality release and related documentation are protected by copyright and distributed under licenses restricting its use, copying, distribution, and decompilation. No part of this release or related documentation may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any.

Portions of this product may be derived from the UNIX and Berkeley 4.3 BSD systems, licensed from UNIX System Laboratories, Inc. and the University of California, respectively. Third-party font software in this release is protected by copyright and licensed from Sun's Font Suppliers.

RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the United States Government is subject to the restrictions set forth in DFARS 252.227-7013 (c)(1)(ii) and FAR 52.227-19.

The release described in this manual may be protected by one or more U.S. patents, foreign patents, or pending applications.

TRADEMARKS

Sun, Sun Microsystems, Sun Microsystems Computer Corporation, the Sun logo, the Sun Microsystems Computer Corporation logo, WebRunner, Java, FirstPerson and the FirstPerson logo and agent are trademarks or registered trademarks of Sun Microsystems, Inc. The "Duke" character is a trademark of Sun Microsystems, Inc. and Copyright (c) 1992-1995 Sun Microsystems, Inc. All Rights Reserved. UNIX is a registered trademark in the United States and other countries, exclusively licensed through X/Open Company, Ltd. OPEN LOOK is a registered trademark of Novell, Inc. All other product names mentioned herein are the trademarks of their respective owners.

All SPARC trademarks, including the SCD Compliant Logo, are trademarks or registered trademarks of SPARC International, Inc. SPARCstation, SPARCserver, SPARCengine, SPARCworks, and SPARCompiler are licensed exclusively to Sun Microsystems, Inc. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.

The OPEN LOOK and Sun Graphical User Interfaces were developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun's licensees who implement OPEN LOOK GUIs and otherwise comply with Sun's written license agreements.

X Window System is a trademark and product of the Massachusetts Institute of Technology.

THIS PUBLICATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.

THIS PUBLICATION COULD INCLUDE TECHNICAL INACCURACIES OR TYPOGRAPHICAL ERRORS. CHANGES ARE PERIODICALLY ADDED TO THE INFORMATION HEREIN; THESE CHANGES WILL BE INCORPORATED IN NEW EDITIONS OF THE PUBLICATION. SUN MICROSYSTEMS, INC. MAY MAKE IMPROVEMENTS AND/OR CHANGES IN THE PRODUCT(S) AND/OR THE PROGRAM(S) DESCRIBED IN THIS PUBLICATION AT ANY TIME.

Contents

Preface

Chapter 1: Java Virtual Machine Architecture
    1.1 Supported Data Types
    1.2 Registers
    1.3 Local Variables
    1.4 The Operand Stack
    1.5 Execution Environment
    1.6 Garbage Collected Heap
    1.7 Method Area
    1.8 The Java Instruction Set
    1.9 Limitations

Chapter 2: Class File Format
    2.1 Format
    2.2 Signatures
    2.3 Constant Pool
    2.4 Fields
    2.5 Methods
    2.6 Attributes

Chapter 3: The Virtual Machine Instruction Set
    3.1 Format for the Instructions
    3.2 Pushing Constants onto the Stack
    3.3 Loading Local Variables Onto the Stack
    3.4 Storing Stack Values into Local Variables
    3.5 Wider index for Loading, Storing and Incrementing
    3.6 Managing Arrays
    3.7 Stack Instructions
    3.8 Arithmetic Instructions
    3.9 Logical Instructions
    3.10 Conversion Operations
    3.11 Control Transfer Instructions
    3.12 Function Return
    3.13 Table Jumping
    3.14 Manipulating Object Fields
    3.15 Method Invocation
    3.16 Exception Handling
    3.17 Miscellaneous Object Operations
    3.18 Monitors

Appendix A: An Optimization
    A.1 Constant Pool Resolution
    A.2 Pushing Constants onto the Stack (_quick variants)
    A.3 Managing Arrays (_quick variants)
    A.4 Manipulating Object Fields (_quick variants)
    A.5 Method Invocation (_quick variants)
    A.6 Miscellaneous Object Operations (_quick variants)

Index of Instructions

Preface

This document describes version 1.0 of the Java Virtual Machine and its instruction set. We have written this document to act as a specification both for compiler writers who wish to target the machine and for others who may wish to implement a compliant Java Virtual Machine.

The Java Virtual Machine is an imaginary machine that is implemented by emulating it in software on a real machine. Code for the Java Virtual Machine is stored in .class files, each of which contains the code for at most one public class.

Simple and efficient emulations of the Java Virtual Machine are possible because the machine's instruction format consists of compact, efficient bytecodes. Implementations whose native code speed approximates that of compiled C are also possible, by translating the bytecodes to machine code, although Sun has not released such implementations at this time.

The rest of this document is structured as follows:

• Chapter 1 describes the architecture of the Java Virtual Machine.
• Chapter 2 describes the .class file format.
• Chapter 3 describes the bytecodes.
• Appendix A contains some instructions generated internally by Sun's implementation of the Java Virtual Machine. While not strictly part of the specification, we describe these here so that this specification can serve as a reference for our implementation. As more implementations of the Java Virtual Machine become available, we may remove Appendix A from future releases.

Sun will license the Java Virtual Machine trademark and logo for use with compliant implementations of this specification. If you are considering constructing your own implementation of the Java Virtual Machine, please contact us at the email address below so that we can work together to ensure 100% compatibility of your implementation.

Send comments on this specification or questions about implementing the Java Virtual Machine to our electronic mail address: Java@java.sun.com.

4171 Java Virtual Machine Architecture1.1 Supported Data TypesThe vir tual machine data types include the basic data types of the Java language:byteshortintlongfloatdoublechar// 1-byte signed 2's complement integer// 2-byte signed 2's complement integer// 4-byte signed 2's complement integer// 8-byte signed 2's complement integer// 4-byte IEEE 754 single-precision float// 8-byte IEEE 754 double-precision float// 2-byte unsigned tinicode characterNearly all Java type checking is done at compile time. Data of the primitive types shown above need not betagged by the hardware to allow execution of Java. Instead, the bytecodes that operate on primitive valuesindicate the types of the operands so that, for example, the i add, 1 add, fadd, and dadd instructions each addtwo numbers, whose types are int, long, float, and double, respectivelyThe virtual machine doesn't have separate instructions for boo I ean types. Ennead, integer instructions,including integer returns, are used to operate on boolean values; byte arrays are used for arrays ofbool eon.The virtual machine specifies that floating point be done in IEEE 754 format, with support for gradualunderflow. Older computer architectures that do not have support for IEEE format may run Java numericprograms very slowly.Other virtual machine data types include:objectreturnAddressNote: lava arrays are treated as objects.// 4 - byte reference to a Java object// 4 bytes, used with jar/ret/Jsr_w/ret_w instructionsThis specification does not require any particular internal structure for objects. In our implementation anobject reference is to a handle, which is a pair of pointers: one to a method table for the object, and the other tothe data allocated for the object. 
Other implementations may use inline caching, rather than method table dispatch; such methods are likely to be faster on hardware that is emerging between now and the year 2000.

Programs represented by Java Virtual Machine bytecodes are expected to maintain proper type discipline, and an implementation may refuse to execute a bytecode program that appears to violate such type discipline.

While the Java Virtual Machine would appear to be limited by the bytecode definition to running on a 32-bit address space machine, it is possible to build a version of the Java Virtual Machine that automatically translates the bytecodes into a 64-bit form. A description of this transformation is beyond the scope of this specification.

1.2 Registers

At any point the virtual machine is executing the code of a single method, and the pc register contains the address of the next bytecode to be executed.

Each method has memory space allocated for it to hold:
• a set of local variables, referenced by a vars register,
• an operand stack, referenced by an optop register, and
• an execution environment structure, referenced by a frame register.

All of this space can be allocated at once, since the sizes of the local variables and operand stack are known at compile time, and the size of the execution environment structure is well-known to the interpreter.

All of these registers are 32 bits wide.

1.3 Local Variables

Each Java method uses a fixed-sized set of local variables. They are addressed as word offsets from the vars register. Local variables are all 32 bits wide.

Long integers and double precision floats are considered to take up two local variables but are addressed by the index of the first local variable. (For example, a local variable with index n containing a double precision float actually occupies storage at indices n and n+1.) The virtual machine specification does not require 64-bit values in local variables to be 64-bit aligned.
Implementors are free to decide the appropriate way to divide long integers and double precision floats into two words.

Instructions are provided to load the values of local variables onto the operand stack and store values from the operand stack into local variables.

1.4 The Operand Stack

The machine instructions all take operands from an operand stack, operate on them, and return results to the stack. We chose a stack organization so that it would be easy to emulate the machine efficiently on machines with few or irregular registers such as the Intel 486.

The operand stack is 32 bits wide. It is used to pass parameters to methods and receive method results, as well as to supply parameters for operations and save operation results.

For example, the iadd instruction adds two integers together. It expects that the integers to be added are the top two words on the operand stack, pushed there by previous instructions. Both integers are popped from the stack, added, and their sum pushed back onto the operand stack. Subcomputations may be nested on the operand stack, and result in a single operand that can be used by the nesting computation.

Each primitive data type has specialized instructions that know how to operate on operands of that type. Each operand requires a single location on the stack, except for long and double, which require two locations.

Operands must be operated on by operators appropriate to their type. It is illegal, for example, to push two ints and then treat them as a long. This restriction is enforced, in the Sun implementation, by the bytecode verifier. However, a small number of operations (the dup opcodes and swap) operate on runtime data areas as raw values of a given width without regard to type.

In our description of the virtual machine instructions below, the effect of an instruction's execution on the operand stack is represented textually, with the stack growing from left to right, and each 32-bit word separately represented.
Thus:

    Stack: ..., value1, value2 => ..., value3

shows an operation that begins by having value2 on top of the stack with value1 just beneath it. As a result of the execution of the instruction, value1 and value2 are popped from the stack and replaced by value3, which has been calculated by the instruction. The remainder of the stack, represented by an ellipsis, is unaffected by the instruction's execution.

The types long and double take two 32-bit words on the operand stack:

    Stack: ... => ..., value-word1, value-word2

This specification does not say how the two words are selected from the 64-bit long or double value; it is only necessary that a particular implementation be internally consistent.

1.5 Execution Environment

The information contained in the execution environment is used to do dynamic linking, normal method returns, and exception propagation.

1.5.1 Dynamic Linking

The execution environment contains references to the interpreter symbol table for the current method and current class, in support of dynamic linking of the method code. The class file code for a method refers to methods to be called and variables to be accessed symbolically. Dynamic linking translates these symbolic method calls into actual method calls, loading classes as necessary to resolve as-yet-undefined symbols, and translates variable accesses into appropriate offsets in storage structures associated with the runtime location of these variables.

This late binding of the methods and variables makes changes in other classes that a method uses less likely to break this code.

1.5.2 Normal Method Returns

If execution of the current method completes normally, then a value is returned to the calling method. This occurs when the current method executes a return instruction appropriate to the return type.

The execution environment is used in this case to restore the registers of the caller, with the program counter of the caller appropriately incremented to skip the method call instruction. Execution then continues in the calling method's execution environment.

1.5.3 Exception and Error Propagation

An exceptional condition, known in Java as an Error or Exception (both subclasses of Throwable), may arise in a program because of:
• a dynamic linkage failure, such as a failure to find a needed class file,
• a run-time error, such as a reference through a null pointer,
• an asynchronous event, such as is thrown by Thread.stop, from another thread,
• the program using a throw statement.

When an exception occurs:
• A list of catch clauses associated with the current method is examined. Each catch clause describes the instruction range for which it is active, describes the type of exception that it is to handle, and has the address of the code to handle it.
• An exception matches a catch clause if the instruction that caused the exception is in the appropriate instruction range, and the exception type is a subtype of the type of exception that the catch clause handles. If a matching catch clause is found, the system branches to the specified handler.
  If no handler is found, the process is repeated until all the nested catch clauses of the current method have been exhausted.
• The order of the catch clauses in the list is important. The virtual machine execution continues at the first matching catch clause. Because Java code is structured, it is always possible to sort all the exception handlers for one method into a single list that, for any possible program counter value, can be searched in linear order to find the proper (innermost containing applicable) exception handler for an exception occurring at that program counter value.
• If there is no matching catch clause then the current method is said to have as its outcome the uncaught exception. The execution state of the method that called this method is restored from the execution environment, and the propagation of the exception continues, as though the exception had just occurred in this caller.

1.5.4 Additional Information

The execution environment may be extended with additional implementation-specific information, such as debugging information.

1.6 Garbage Collected Heap

The Java heap is the runtime data area from which class instances (objects) are allocated. The Java language is designed to be garbage collected: it does not give the programmer the ability to deallocate objects explicitly. Java does not presuppose any particular kind of garbage collection; various algorithms may be used depending on system requirements.

1.7 Method Area

The method area is analogous to the store for compiled code in conventional languages or the text segment in a UNIX process. It stores method code (compiled Java code) and symbol tables. In the current Java implementation, method code is not part of the garbage-collected heap, although this is planned for a future release.

1.8 The Java Instruction Set

An instruction in the Java instruction set consists of a one-byte opcode specifying the operation to be performed, and zero or more operands supplying parameters or data that will be used by the operation. Many instructions have no operands and consist only of an opcode.

The inner loop of the virtual machine execution is effectively:

    do {
        fetch an opcode byte
        execute an action depending on the value of the opcode
    } while (there is more to do);

The number and size of the additional operands is determined by the opcode. If an additional operand is more than one byte in size, then it is stored in big-endian order, high order byte first. For example, a 16-bit parameter is stored as two bytes whose value is:

    first_byte * 256 + second_byte

The bytecode instruction stream is only byte-aligned, with the exception being the tableswitch and lookupswitch instructions, which force alignment to a 4-byte boundary within their instructions.

These decisions keep the virtual machine code for a compiled Java program compact and reflect a conscious bias in favor of compactness at some possible cost in performance.

1.9 Limitations

The per-class constant pool has a maximum of 65535 entries. This acts as an internal limit on the total complexity of a single class.

The amount of code per method is limited to 65535 bytes by the sizes of the indices in the code in the exception table, the line number table, and the local variable table. This may be fixed for 1.0beta2.

Besides this limit, the only other limitation of note is that the number of words of arguments in a method call is limited to 255.
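The fetch/execute loop of section 1.8 and the stack discipline of section 1.4 can be made concrete with a very small sketch. It is not Sun's interpreter: the class name TinyInterpreter and its run method are illustrative, it handles only three opcodes, and the opcode number used for iadd (96) is taken from the published instruction set rather than from this chapter.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative sketch of the inner loop; handles three opcodes only. */
final class TinyInterpreter {
    // bipush = 16 and sipush = 17 appear in section 3.2.
    // iadd = 96 is an assumption taken from the published opcode table.
    static final int BIPUSH = 16, SIPUSH = 17, IADD = 96;

    /** Executes the code to its end and returns the top of the operand stack. */
    static int run(byte[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (pc < code.length) {               // "while there is more to do"
            int opcode = code[pc++] & 0xFF;      // fetch an opcode byte
            switch (opcode) {                    // act on the value of the opcode
                case BIPUSH:
                    stack.push((int) code[pc++]);     // one signed operand byte
                    break;
                case SIPUSH: {
                    int first = code[pc++];           // high-order byte, keeps the sign
                    int second = code[pc++] & 0xFF;   // low-order byte, unsigned
                    stack.push(first * 256 + second); // big-endian assembly
                    break;
                }
                case IADD: {
                    int value2 = stack.pop();         // top of stack
                    int value1 = stack.pop();         // just beneath it
                    stack.push(value1 + value2);
                    break;
                }
                default:
                    throw new IllegalArgumentException("unhandled opcode " + opcode);
            }
        }
        return stack.pop();
    }
}
```

Running it on the byte sequence 16, 5, 17, 1, 0, 96 pushes 5, pushes 256, and leaves their sum on the stack.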

2 Class File Format

This chapter documents the Java class (.class) file format.

Each class file contains the compiled version of either a Java class or a Java interface. Compliant Java interpreters must be capable of dealing with all class files that conform to the following specification.

A Java class file consists of a stream of 8-bit bytes. All 16-bit and 32-bit quantities are constructed by reading in two or four 8-bit bytes, respectively. The bytes are joined together in network (big-endian) order, where the high bytes come first. This format is supported by the Java java.io.DataInput and java.io.DataOutput interfaces, and classes such as java.io.DataInputStream and java.io.DataOutputStream.

The class file format is described here using a structure notation. Successive fields in the structure appear in the external representation without padding or alignment. Variable size arrays, often of variable sized elements, are called tables and are commonplace in these structures.

The types u1, u2, and u4 mean an unsigned one-, two-, or four-byte quantity, respectively, which are read by methods such as readUnsignedByte, readUnsignedShort and readInt of the java.io.DataInput interface.

2.1 Format

The following pseudo-structure gives a top-level description of the format of a class file:

    ClassFile {
        u4 magic;
        u2 minor_version;
        u2 major_version;
        u2 constant_pool_count;
        cp_info constant_pool[constant_pool_count - 1];
        u2 access_flags;
        u2 this_class;
        u2 super_class;
        u2 interfaces_count;
        u2 interfaces[interfaces_count];
        u2 fields_count;
        field_info fields[fields_count];
        u2 methods_count;
        method_info methods[methods_count];
        u2 attributes_count;
        attribute_info attributes[attributes_count];
    }

magic
    This field must have the value 0xCAFEBABE.

minor_version, major_version
    These fields contain the version number of the Java compiler that produced this class file. An implementation of the virtual machine will normally support some range of minor version numbers 0-n of a particular major version number.
    If the minor version number is incremented the new code won't run on the old virtual machines, but it is possible to make a new virtual machine which can run versions up to n+1.
    A change of the major version number indicates a major incompatible change, one that requires a different virtual machine that may not support the old major version in any way.
    The current major version number is 45; the current minor version number is 3.

constant_pool_count
    This field indicates the number of entries in the constant pool in the class file.

constant_pool
    The constant pool is a table of values. These values are the various string constants, class names, field names, and others that are referred to by the class structure or by the code.
    constant_pool[0] is always unused by the compiler, and may be used by an implementation for any purpose.
    Each of the constant_pool entries 1 through constant_pool_count - 1 is a variable-length entry, whose format is given by the first "tag" byte, as described in section 2.3.

access_flags
    This field contains a mask of up to sixteen modifiers used with class, method, and field declarations. The same encoding is used on similar fields in field_info and method_info as described below.
    Here is the encoding.

    Flag Name          Value    Meaning                                                               Used By
    ACC_PUBLIC         0x0001   Visible to everyone                                                   Class, Method, Variable
    ACC_PRIVATE        0x0002   Visible only to the defining class                                    Method, Variable
    ACC_PROTECTED      0x0004   Visible to subclasses                                                 Method, Variable
    ACC_STATIC         0x0008   Variable or method is static                                          Method, Variable
    ACC_FINAL          0x0010   No further subclassing, overriding, or assignment after initialization  Class, Method, Variable
    ACC_SYNCHRONIZED   0x0020   Wrap use in monitor lock                                              Method
    ACC_VOLATILE       0x0040   Can't cache                                                           Variable
    ACC_TRANSIENT      0x0080   Not to be written or read by a persistent object manager              Variable
    ACC_NATIVE         0x0100   Implemented in a language other than Java                             Method
    ACC_INTERFACE      0x0200   Is an interface                                                       Class
    ACC_ABSTRACT       0x0400   No body provided                                                      Class, Method

this_class
    This field is an index into the constant pool; constant_pool[this_class] must be a CONSTANT_Class.
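Because access_flags is a plain bit mask, a reader can recover the modifiers by testing one bit at a time. The following is a minimal sketch under that reading; the class name AccessFlags and the method decode are illustrative and not part of the specification, and the flag names and bit positions are copied from the encoding table.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative decoder for the access_flags mask. */
final class AccessFlags {
    // Index in this array is the bit position: 0x0001 is bit 0, 0x0400 is bit 10.
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_PRIVATE", "ACC_PROTECTED", "ACC_STATIC",
        "ACC_FINAL", "ACC_SYNCHRONIZED", "ACC_VOLATILE", "ACC_TRANSIENT",
        "ACC_NATIVE", "ACC_INTERFACE", "ACC_ABSTRACT"
    };

    /** Returns the names of all flags set in the mask, lowest bit first. */
    static List<String> decode(int mask) {
        List<String> names = new ArrayList<>();
        for (int bit = 0; bit < NAMES.length; bit++) {
            if ((mask & (1 << bit)) != 0) {
                names.add(NAMES[bit]);
            }
        }
        return names;
    }
}
```

For example, a mask of 0x0009 decodes to ACC_PUBLIC and ACC_STATIC. The sketch does not check which flags are legal for a class, a method, or a field; that is left to the "Used By" column.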

super_class
    This field is an index into the constant pool. If the value of super_class is nonzero, then constant_pool[super_class] must be a class, and gives the index of this class's superclass in the constant pool.
    If the value of super_class is zero, then the class being defined must be java.lang.Object, and it has no superclass.

interfaces_count
    This field gives the number of interfaces that this class implements.

interfaces
    Each value in this table is an index into the constant pool. If a table value is nonzero (interfaces[i] != 0, where 0 ...

attributes_count
    This field indicates the number of additional attributes about this class.

attributes
    A class can have any number of optional attributes associated with it. Currently, the only class attribute recognized is the "SourceFile" attribute, which indicates the name of the source file from which this class file was compiled. See section 2.6 for more information on the attribute_info structure.

2.2 Signatures

A signature is a string representing a type of a method, field or array.

A field signature is a series of bytes in the following grammar:

    <field_signature> ::= <field_type>
    <field_type>      ::= <base_type> | <object_type> | <array_type>
    <base_type>       ::= B | C | D | F | I | J | S | Z
    <object_type>     ::= L<fullclassname>;
    <array_type>      ::= [<optional_size><field_type>
    <optional_size>   ::= [0-9]*

The meaning of the base types is as follows:

    B                    byte       signed byte
    C                    char       character
    D                    double     double precision IEEE float
    F                    float      single precision IEEE float
    I                    int        integer
    J                    long       long integer
    L<fullclassname>;    ...        an object of the given class
    S                    short      signed short
    Z                    boolean    true or false
    [<field sig>         ...        array

A return-type signature represents the return value from a method. It is a series of bytes in the following grammar:

    <return_signature> ::= <field_type> | V

The character V indicates that the method returns no value. Otherwise, the signature indicates the type of the return value.

An argument signature represents an argument passed to a method:

    <argument_signature> ::= <field_type>

A method signature represents the arguments that the method expects, and the value that it returns.

    <method_signature>    ::= (<arguments_signature>) <return_signature>
    <arguments_signature> ::= <argument_signature>*
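Signature strings are easy to take apart with a small recursive scanner. The sketch below assumes the usual reading of the signature syntax: a single letter for each base type, L followed by a class name and a semicolon for objects, and [ with an optional run of digits for arrays. The class name Signatures and both method names are illustrative, not part of the specification.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative scanner for field and method signatures. */
final class Signatures {
    /** Returns the index just past the field type that starts at pos. */
    static int endOfFieldType(String sig, int pos) {
        char c = sig.charAt(pos);
        switch (c) {
            case 'B': case 'C': case 'D': case 'F':
            case 'I': case 'J': case 'S': case 'Z':
                return pos + 1;                       // base type: one letter
            case 'L': {
                int semi = sig.indexOf(';', pos);     // object type ends at ';'
                if (semi < 0) {
                    throw new IllegalArgumentException("unterminated class name");
                }
                return semi + 1;
            }
            case '[': {
                int p = pos + 1;                      // skip the optional size digits
                while (p < sig.length() && sig.charAt(p) >= '0' && sig.charAt(p) <= '9') {
                    p++;
                }
                return endOfFieldType(sig, p);        // then the element type
            }
            default:
                throw new IllegalArgumentException("bad type character " + c);
        }
    }

    /** Splits "(arguments)return" into the argument types followed by the return type. */
    static List<String> splitMethodSignature(String sig) {
        if (sig.isEmpty() || sig.charAt(0) != '(') {
            throw new IllegalArgumentException("method signature must start with '('");
        }
        List<String> parts = new ArrayList<>();
        int pos = 1;
        while (sig.charAt(pos) != ')') {
            int end = endOfFieldType(sig, pos);
            parts.add(sig.substring(pos, end));
            pos = end;
        }
        pos++;                                        // skip ')'
        // V (no value) is legal only as the return type.
        int end = sig.charAt(pos) == 'V' ? pos + 1 : endOfFieldType(sig, pos);
        if (end != sig.length()) {
            throw new IllegalArgumentException("trailing characters after return type");
        }
        parts.add(sig.substring(pos, end));
        return parts;
    }
}
```

The scanner does not care which separator the class name uses; splitting (ILjava/lang/String;[I)V yields I, Ljava/lang/String;, [I and finally V.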

2.3 Constant Pool

Each item in the constant pool begins with a 1-byte tag. The table below lists the valid tags and their values.

    Constant Type                  Value
    CONSTANT_Class                 7
    CONSTANT_Fieldref              9
    CONSTANT_Methodref             10
    CONSTANT_InterfaceMethodref    11
    CONSTANT_String                8
    CONSTANT_Integer               3
    CONSTANT_Float                 4
    CONSTANT_Long                  5
    CONSTANT_Double                6
    CONSTANT_NameAndType           12
    CONSTANT_Utf8                  1
    CONSTANT_Unicode               2

Each tag byte is then followed by one or more bytes giving more information about the specific constant.

2.3.1 CONSTANT_Class

CONSTANT_Class is used to represent a class or an interface.

    CONSTANT_Class_info {
        u1 tag;
        u2 name_index;
    }

tag
    The tag will have the value CONSTANT_Class.

name_index
    constant_pool[name_index] is a CONSTANT_Utf8 giving the string name of the class.

Because arrays are objects, the opcodes anewarray and multianewarray can reference array "classes" via CONSTANT_Class items in the constant pool. In this case, the name of the class is its signature. For example, the class name for

    int[][]

is

    [[I

The class name for

    Thread[]

is

    "[Ljava.lang.Thread;"

2.3.2 CONSTANT_{Fieldref,Methodref,InterfaceMethodref}

Fields, methods, and interface methods are represented by similar structures.

    CONSTANT_Fieldref_info {
        u1 tag;
        u2 class_index;
        u2 name_and_type_index;
    }

    CONSTANT_Methodref_info {
        u1 tag;
        u2 class_index;
        u2 name_and_type_index;
    }

    CONSTANT_InterfaceMethodref_info {
        u1 tag;
        u2 class_index;
        u2 name_and_type_index;
    }

tag
    The tag will have the value CONSTANT_Fieldref, CONSTANT_Methodref, or CONSTANT_InterfaceMethodref.

class_index
    constant_pool[class_index] will be an entry of type CONSTANT_Class giving the name of the class or interface containing the field or method.
    For CONSTANT_Fieldref and CONSTANT_Methodref, the CONSTANT_Class item must be an actual class. For CONSTANT_InterfaceMethodref, the item must be an interface which purports to implement the given method.

name_and_type_index
    constant_pool[name_and_type_index] will be an entry of type CONSTANT_NameAndType.
    This constant pool entry indicates the name and signature of the field or method.

2.3.3 CONSTANT_String

CONSTANT_String is used to represent constant objects of the built-in type String.

    CONSTANT_String_info {
        u1 tag;
        u2 string_index;
    }

tag
    The tag will have the value CONSTANT_String.

string_index
    constant_pool[string_index] is a CONSTANT_Utf8 string giving the value to which the String object is initialized.

2.3.4 CONSTANT_Integer and CONSTANT_Float

CONSTANT_Integer and CONSTANT_Float represent four-byte constants.

    CONSTANT_Integer_info {
        u1 tag;
        u4 bytes;
    }

    CONSTANT_Float_info {
        u1 tag;
        u4 bytes;
    }

tag
    The tag will have the value CONSTANT_Integer or CONSTANT_Float.

bytes
    For integers, the four bytes are the integer value. For floats, they are the IEEE 754 standard representation of the floating point value. These bytes are in network (high byte first) order.

2.3.5 CONSTANT_Long and CONSTANT_Double

CONSTANT_Long and CONSTANT_Double represent eight-byte constants.

    CONSTANT_Long_info {
        u1 tag;
        u4 high_bytes;
        u4 low_bytes;
    }

    CONSTANT_Double_info {
        u1 tag;
        u4 high_bytes;
        u4 low_bytes;
    }

All eight-byte constants take up two spots in the constant pool. If this is the nth item in the constant pool, then the next item will be numbered n+2.

tag
    The tag will have the value CONSTANT_Long or CONSTANT_Double.

high_bytes, low_bytes
    For CONSTANT_Long, the 64-bit value is (high_bytes << 32) + low_bytes.
    For CONSTANT_Double, high_bytes and low_bytes together represent the standard IEEE 754 representation of the double-precision floating point number.

2.3.6 CONSTANT_NameAndType

CONSTANT_NameAndType is used to represent a field or method, without indicating which class it belongs to.

    CONSTANT_NameAndType_info {
        u1 tag;
        u2 name_index;
        u2 signature_index;
    }

tag
    The tag will have the value CONSTANT_NameAndType.

name_index
    constant_pool[name_index] is a CONSTANT_Utf8 string giving the name of the field or method.

signature_index
    constant_pool[signature_index] is a CONSTANT_Utf8 string giving the signature of the field or method.

2.3.7 CONSTANT_Utf8 and CONSTANT_Unicode

CONSTANT_Utf8 and CONSTANT_Unicode are used to represent constant string values.

CONSTANT_Utf8 strings are "encoded" so that strings containing only non-null ASCII characters can be represented using only one byte per character, but characters of up to 16 bits can be represented.

All characters in the range 0x0001 to 0x007F are represented by a single byte:

    +-+-+-+-+-+-+-+-+
    |0|7 bits of data|
    +-+-+-+-+-+-+-+-+

The null character (0x0000) and characters in the range 0x0080 to 0x07FF are represented by a pair of two bytes:

    +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+
    |1|1|0|  5 bits | |1|0|  6 bits   |
    +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+

Characters in the range 0x0800 to 0xFFFF are represented by three bytes:

    +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+
    |1|1|1|0| 4 bits| |1|0|  6 bits   | |1|0|  6 bits   |
    +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+

There are two differences between this format and the "standard" UTF-8 format. First, the null byte (0x00) is encoded in two-byte format rather than one-byte, so that our strings never have embedded nulls. Second, only the one-byte, two-byte, and three-byte formats are used. We do not recognize the longer formats.

    CONSTANT_Utf8_info {
        u1 tag;
        u2 length;
        u1 bytes[length];
    }

    CONSTANT_Unicode_info {
        u1 tag;
        u2 length;
        u2 bytes[length];
    }

tag
    The tag will have the value CONSTANT_Utf8 or CONSTANT_Unicode.

length
    The number of bytes in the string. These strings are not null terminated.

bytes
    The actual bytes of the string.
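The three byte layouts of CONSTANT_Utf8 translate directly into bit masks. The following is a minimal decoding sketch, assuming the input is the bytes array of a CONSTANT_Utf8_info entry without its tag and length; the class name Utf8Constant and the method decode are illustrative.

```java
/** Illustrative decoder for the one-, two-, and three-byte forms. */
final class Utf8Constant {
    static String decode(byte[] bytes) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < bytes.length) {
            int b0 = bytes[i] & 0xFF;
            if ((b0 & 0x80) == 0) {
                // 0xxxxxxx: 7 bits of data
                out.append((char) b0);
                i += 1;
            } else if ((b0 & 0xE0) == 0xC0) {
                // 110xxxxx 10xxxxxx: 5 + 6 bits
                int b1 = bytes[i + 1] & 0xFF;
                out.append((char) (((b0 & 0x1F) << 6) | (b1 & 0x3F)));
                i += 2;
            } else if ((b0 & 0xF0) == 0xE0) {
                // 1110xxxx 10xxxxxx 10xxxxxx: 4 + 6 + 6 bits
                int b1 = bytes[i + 1] & 0xFF;
                int b2 = bytes[i + 2] & 0xFF;
                out.append((char) (((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)));
                i += 3;
            } else {
                // Longer formats are not recognized by this encoding.
                throw new IllegalArgumentException("unsupported lead byte at offset " + i);
            }
        }
        return out.toString();
    }
}
```

The sketch does not validate continuation bytes. In practice java.io.DataInputStream.readUTF reads the same encoding preceded by a two-byte length, which matches the length and bytes fields of CONSTANT_Utf8_info once the tag has been consumed.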

2.4 Fields

The information for each field immediately follows the fields_count field in the class file. Each field is described by a variable length field_info structure. The format of this structure is as follows:

    field_info {
        u2 access_flags;
        u2 name_index;
        u2 signature_index;
        u2 attributes_count;
        attribute_info attributes[attributes_count];
    }

access_flags
    This is a set of sixteen flags used by classes, methods, and fields to describe various properties and how they may be accessed by methods in other classes. See the access flags table in section 2.1, which indicates the meaning of the bits in this field.
    The possible flags that can be set for a field are ACC_PUBLIC, ACC_PRIVATE, ACC_PROTECTED, ACC_STATIC, ACC_FINAL, ACC_VOLATILE, and ACC_TRANSIENT.
    At most one of ACC_PUBLIC, ACC_PROTECTED, and ACC_PRIVATE can be set for any field.

name_index
    constant_pool[name_index] is a CONSTANT_Utf8 string which is the name of the field.

signature_index
    constant_pool[signature_index] is a CONSTANT_Utf8 string which is the signature of the field. See the section "Signatures" for more information on signatures.

attributes_count
    This value indicates the number of additional attributes about this field.

attributes
    A field can have any number of optional attributes associated with it. Currently, the only field attribute recognized is the "ConstantValue" attribute, which indicates that this field is a static numeric constant, and indicates the constant value of that field.
    Any other attributes are skipped.

2.5 Methods

The information for each method immediately follows the methods_count field in the class file. Each method is described by a variable length method_info structure.
The structure has the following format:

    method_info {
        u2 access_flags;
        u2 name_index;
        u2 signature_index;
        u2 attributes_count;
        attribute_info attributes[attributes_count];
    }

access_flags
    This is a set of sixteen flags used by classes, methods, and fields to describe various properties and how they may be accessed by methods in other classes. See the access flags table in section 2.1, which gives the various bits in this field.
    The possible flags that can be set for a method are ACC_PUBLIC, ACC_PRIVATE, ACC_PROTECTED, ACC_STATIC, ACC_FINAL, ACC_SYNCHRONIZED, ACC_NATIVE, and ACC_ABSTRACT.
    At most one of ACC_PUBLIC, ACC_PROTECTED, and ACC_PRIVATE can be set for any method.

name_index
    constant_pool[name_index] is a CONSTANT_Utf8 string giving the name of the method.

signature_index
    constant_pool[signature_index] is a CONSTANT_Utf8 string giving the signature of the method. See the section "Signatures" for more information on signatures.

attributes_count
    This value indicates the number of additional attributes about this method.

attributes
    A method can have any number of optional attributes associated with it. Each attribute has a name, and other additional information. Currently, the only method attributes recognized are the "Code" and "Exceptions" attributes, which describe the bytecodes that are executed to perform this method, and the Java Exceptions which are declared to result from the execution of the method, respectively.
    Any other attributes are skipped.

2.6 Attributes

Attributes are used at several different places in the class format. All attributes have the following format.

    GenericAttribute_info {
        u2 attribute_name;
        u4 attribute_length;
        u1 info[attribute_length];
    }

The attribute_name is a 16-bit index into the class's constant pool; the value of constant_pool[attribute_name] is a CONSTANT_Utf8 string giving the name of the attribute. The field attribute_length indicates the length of the subsequent information in bytes.
This length does not include the six bytes of the attribute_name and attribute_length.

In the following text, whenever we allow attributes, we give the name of the attributes that are currently understood. In the future, more attributes will be added. Class file readers are expected to skip over and ignore the information in any attribute they do not understand.
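Skipping an attribute that a reader does not understand needs only the six-byte header. The sketch below works on an in-memory byte array rather than a stream so that it stays self-contained; the class name AttributeWalker and its helpers are illustrative, not part of the specification.

```java
/** Illustrative helper for stepping over attributes in a byte array. */
final class AttributeWalker {
    /** Reads an unsigned 16-bit big-endian value at off. */
    static int u2(byte[] b, int off) {
        return ((b[off] & 0xFF) << 8) | (b[off + 1] & 0xFF);
    }

    /** Reads an unsigned 32-bit big-endian value at off. */
    static long u4(byte[] b, int off) {
        return ((long) (b[off] & 0xFF) << 24)
                | ((b[off + 1] & 0xFF) << 16)
                | ((b[off + 2] & 0xFF) << 8)
                | (b[off + 3] & 0xFF);
    }

    /** Returns the offset of the first byte after the attribute starting at off. */
    static int nextAttribute(byte[] data, int off) {
        long length = u4(data, off + 2);            // attribute_length follows the u2 name
        return Math.toIntExact(off + 6L + length);  // the length excludes the 6 header bytes
    }
}
```

A reader would call u2(data, off) to obtain attribute_name, compare the referenced string with the names it understands, and otherwise jump to nextAttribute(data, off).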

2.6.1 SourceFile

The "SourceFile" attribute has the following format:

    SourceFile_attribute {
        u2 attribute_name_index;
        u4 attribute_length;
        u2 sourcefile_index;
    }

attribute_name_index
    constant_pool[attribute_name_index] is the CONSTANT_Utf8 string "SourceFile".

attribute_length
    The length of a SourceFile_attribute must be 2.

sourcefile_index
    constant_pool[sourcefile_index] is a CONSTANT_Utf8 string giving the source file from which this class file was compiled.

2.6.2 ConstantValue

The "ConstantValue" attribute has the following format:

    ConstantValue_attribute {
        u2 attribute_name_index;
        u4 attribute_length;
        u2 constantvalue_index;
    }

attribute_name_index
    constant_pool[attribute_name_index] is the CONSTANT_Utf8 string "ConstantValue".

attribute_length
    The length of a ConstantValue_attribute must be 2.

constantvalue_index
    constant_pool[constantvalue_index] gives the constant value for this field.
    The constant pool entry must be of a type appropriate to the field, as shown by the following table:

    long                                  CONSTANT_Long
    float                                 CONSTANT_Float
    double                                CONSTANT_Double
    int, short, char, byte, boolean       CONSTANT_Integer

2.6.3 Code

The "Code" attribute has the following format:

    Code_attribute {
        u2 attribute_name_index;
        u4 attribute_length;
        u2 max_stack;
        u2 max_locals;
        u4 code_length;
        u1 code[code_length];
        u2 exception_table_length;
        {   u2 start_pc;
            u2 end_pc;
            u2 handler_pc;
            u2 catch_type;
        } exception_table[exception_table_length];
        u2 attributes_count;
        attribute_info attributes[attributes_count];
    }

attribute_name_index
    constant_pool[attribute_name_index] is the CONSTANT_Utf8 string "Code".

attribute_length
    This field indicates the total length of the "Code" attribute, excluding the initial six bytes.

max_stack
    Maximum number of entries on the operand stack that will be used during execution of this method. See the other chapters in this spec for more information on the operand stack.

max_locals
    Number of local variable slots used by this method. See the other chapters in this spec for more information on the local variables.

code_length
    The number of bytes in the virtual machine code for this method.

code
    These are the actual bytes of the virtual machine code that implement the method. When read into memory, if the first byte of code is aligned onto a multiple-of-four boundary then the tableswitch and lookupswitch opcode entries will be aligned; see their description for more information on alignment requirements.

exception_table_length
    The number of entries in the following exception table.

exception_table
    Each entry in the exception table describes one exception handler in the code.

start_pc, end_pc
    The two fields start_pc and end_pc indicate the ranges in the code at which the exception handler is active. The values of both fields are offsets from the start of the code. start_pc is inclusive. end_pc is exclusive.

handler_pc
    This field indicates the starting address of the exception handler. The value of the field is an offset from the start of the code.

catch_type
    If catch_type is nonzero, then constant_pool[catch_type] will be the class of exceptions that this exception handler is designated to catch. This exception handler should only be called if the thrown exception is an instance of the given class.
    If catch_type is zero, this exception handler should be called for all exceptions.

attributes_count
    This field indicates the number of additional attributes about code. The "Code" attribute can itself have attributes.

attributes
    A "Code" attribute can have any number of optional attributes associated with it. Each attribute has a name, and other additional information. Currently, the only code attributes defined are the "LineNumberTable" and "LocalVariableTable," both of which contain debugging information.

2.6.4 Exceptions Table

This table is used by compilers to indicate which Exceptions a method is declared to throw:

    Exceptions_attribute {
        u2 attribute_name_index;
        u4 attribute_length;
        u2 number_of_exceptions;
        u2 exception_index_table[number_of_exceptions];
    }

attribute_name_index
    constant_pool[attribute_name_index] will be the CONSTANT_Utf8 string "Exceptions".

attribute_length
    This field indicates the total length of the Exceptions_attribute, excluding the initial six bytes.

number_of_exceptions
    This field indicates the number of entries in the following exception index table.

exception_index_table
    Each value in this table is an index into the constant pool. For each table element (exception_index_table[i] != 0, where 0 ...

local_variable_table_length
    This field indicates the number of entries in the following local variable table.

local_variable_table
    Each entry in the local variable table indicates a code range during which a local variable has a value. It also indicates where on the stack the value of that variable can be found.

start_pc, length
    The given local variable will have a value at the code between start_pc and start_pc + length. The two values are both offsets from the beginning of the code.

name_index, signature_index
    constant_pool[name_index] and constant_pool[signature_index] are CONSTANT_Utf8 strings giving the name and signature of the local variable.

slot
    The given variable will be the slot-th local variable in the method's frame.
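Pulling the chapter's layout rules together, the first ten bytes of a class file can be read with the java.io.DataInputStream methods named at the start of the chapter. This is a sketch only: the class name ClassFileHeader and its fields are illustrative, it stops after constant_pool_count, and it wraps IOException in an unchecked exception purely to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Illustrative reader for magic, versions, and constant_pool_count. */
final class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(int minor, int major, int cpCount) {
        this.minorVersion = minor;
        this.majorVersion = major;
        this.constantPoolCount = cpCount;
    }

    static ClassFileHeader read(byte[] classFile) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(classFile))) {
            int magic = in.readInt();                 // u4, big-endian
            if (magic != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file: bad magic");
            }
            int minor = in.readUnsignedShort();       // u2 minor_version
            int major = in.readUnsignedShort();       // u2 major_version
            int cpCount = in.readUnsignedShort();     // u2 constant_pool_count
            return new ClassFileHeader(minor, major, cpCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that constant_pool_count is one larger than the number of entries that follow, because entry 0 is not stored in the file.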

3 The Virtual Machine Instruction Set

3.1 Format for the Instructions

Java Virtual Machine instructions are represented in this document by an entry of the following form.

instruction name
Short description of the instruction
Syntax:
    opcode = number
    operand1
    operand2
    ...
Stack: ..., value1, value2 => ..., value3
A longer description that explains the functions of the instruction and indicates any exceptions that might be thrown during execution.

Each line in the syntax diagram represents a single 8-bit byte.

Operations of the Java Virtual Machine most often take their operands from the stack and put their results back on the stack. As a convention, the descriptions do not usually mention when the stack is the source or destination of an operation, but will always mention when it is not. For instance, the iload instruction has the short description "Load integer from local variable." Implicitly, the integer is loaded onto the stack. The iadd instruction is described as "Integer add"; both its source and destination are the stack.

Instructions that do not affect the control flow of a computation may be assumed to always advance the virtual machine pc to the opcode of the following instruction. Only instructions that do affect control flow will explicitly mention the effect they have on pc.

3.2 Pushing Constants onto the Stack

bipush
Push one-byte signed integer
Syntax:
    bipush = 16
    byte1
Stack: ... => ..., value
byte1 is interpreted as a signed 8-bit value. This value is expanded to an integer and pushed onto the operand stack.

sipush
Push two-byte signed integer
Syntax:
    sipush = 17
    byte1
    byte2
Stack: ... => ..., item
byte1 and byte2 are assembled into a signed 16-bit value. This value is expanded to an integer and pushed onto the operand stack.

ldc1
Push item from constant pool
Syntax:
    ldc1 = 18
    indexbyte1
Stack: ... => ..., item
indexbyte1 is used as an unsigned 8-bit index into the constant pool of the current class. The item at that index is resolved and pushed onto the stack. If a String is being pushed and there isn't enough memory to allocate space for it then an OutOfMemoryError is thrown.
Note: A String push results in a reference to an object; what other constants do, and explain this somewhere here.

ldc2
Push item from constant pool
Syntax:
    ldc2 = 19
    indexbyte1
    indexbyte2
Stack: ... => ..., item
indexbyte1 and indexbyte2 are used to construct an unsigned 16-bit index into the constant pool of the current class. The item at that index is resolved and pushed onto the stack. If a String is being pushed and there isn't enough memory to allocate space for it then an OutOfMemoryError is thrown.
Note: A String push results in a reference to an object; what other constants do, and explain this somewhere here.
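The operand handling described for bipush, sipush and ldc2 can be sketched in Java as follows. This is only an illustration: the class and method names are hypothetical, and the byte order (first operand byte is the high-order byte) is an assumption that this section does not state explicitly.

```java
final class PushOperands {
    // bipush: byte1 is a signed 8-bit value, expanded to an int.
    static int bipush(int byte1) {
        return (byte) byte1; // narrowing to byte and widening back sign-extends
    }

    // sipush: byte1 and byte2 are assembled into a signed 16-bit value.
    // Assumption: byte1 is the high-order byte.
    static int sipush(int byte1, int byte2) {
        return (short) (((byte1 & 0xFF) << 8) | (byte2 & 0xFF));
    }

    // ldc2: indexbyte1 and indexbyte2 form an unsigned 16-bit constant pool index.
    // Assumption: indexbyte1 is the high-order byte.
    static int ldc2Index(int indexbyte1, int indexbyte2) {
        return ((indexbyte1 & 0xFF) << 8) | (indexbyte2 & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(bipush(0xFF));          // -1
        System.out.println(sipush(0xFF, 0xFE));    // -2
        System.out.println(ldc2Index(0xFF, 0xFE)); // 65534
    }
}
```

The same two bytes give different results for sipush and ldc2 because only the former is sign-extended.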

ldc2w
Push long or double from constant pool
Syntax:
    ldc2w = 20
    indexbyte1
    indexbyte2
Stack: ... => ..., constant-word1, constant-word2
indexbyte1 and indexbyte2 are used to construct an unsigned 16-bit index into the constant pool of the current class. The two-word constant at that index is resolved and pushed onto the stack.

aconst_null
Push null object reference
Syntax: aconst_null = 1
Stack: ... => ..., null
Push the null object reference onto the stack.

iconst_m1
Push integer constant -1
Syntax: iconst_m1 = 2
Stack: ... => ..., -1
Push the integer -1 onto the stack.

iconst_<n>
Push integer constant
Syntax: iconst_<n>
Stack: ... => ..., <n>
Forms: iconst_0 = 3, iconst_1 = 4, iconst_2 = 5, iconst_3 = 6, iconst_4 = 7, iconst_5 = 8
Push the integer <n> onto the stack.

lconst_<l>
Push long integer constant
Syntax: lconst_<l>
Stack: ... => ..., <l>-word1, <l>-word2
Forms: lconst_0 = 9, lconst_1 = 10
Push the long integer <l> onto the stack.

fconst_<f>
Push single float
Syntax: fconst_<f>
Stack: ... => ..., <f>
Forms: fconst_0 = 11, fconst_1 = 12, fconst_2 = 13
Push the single-precision floating point number <f> onto the stack.

dconst_<d>
Push double float
Syntax: dconst_<d>
Stack: ... => ..., <d>-word1, <d>-word2
Forms: dconst_0 = 14, dconst_1 = 15
Push the double-precision floating point number <d> onto the stack.

3.3 Loading Local Variables Onto the Stack

iload
Load integer from local variable
Syntax:
    iload = 21
    vindex
Stack: ... => ..., value
The value of the local variable at vindex in the current Java frame is pushed onto the operand stack.

iload_<n>
Load integer from local variable
Syntax: iload_<n>
Stack: ... => ..., value
Forms: iload_0 = 26, iload_1 = 27, iload_2 = 28, iload_3 = 29
The value of the local variable at <n> in the current Java frame is pushed onto the operand stack.
This instruction is the same as iload with a vindex of <n>, except that the operand <n> is implicit.
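Because the forms of iconst_<n> and iload_<n> are numbered consecutively, the implicit operand <n> can be recovered from the opcode by subtracting the opcode of the first form. The following Java sketch illustrates this; the class and method names are hypothetical, and only the opcode numbers come from the Forms lines above.

```java
final class ImplicitOperands {
    static final int ICONST_0 = 3, ICONST_5 = 8; // iconst_0 .. iconst_5
    static final int ILOAD_0 = 26, ILOAD_3 = 29; // iload_0 .. iload_3

    // Returns the integer <n> pushed by an iconst_<n> opcode.
    static int iconstValue(int opcode) {
        if (opcode < ICONST_0 || opcode > ICONST_5) {
            throw new IllegalArgumentException("not an iconst_<n> opcode: " + opcode);
        }
        return opcode - ICONST_0;
    }

    // Returns the local variable index <n> read by an iload_<n> opcode.
    static int iloadIndex(int opcode) {
        if (opcode < ILOAD_0 || opcode > ILOAD_3) {
            throw new IllegalArgumentException("not an iload_<n> opcode: " + opcode);
        }
        return opcode - ILOAD_0;
    }
}
```

An interpreter could use such a helper to route iload_<n> through the same code path as iload with an explicit vindex.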

lload
Load long integer from local variable
Syntax:
    lload = 22
    vindex
Stack: ... => ..., value-word1, value-word2
The value of the local variables at vindex and vindex+1 in the current Java frame is pushed onto the operand stack.

lload_<n>
Load long integer from local variable
Syntax: lload_<n>
Stack: ... => ..., value-word1, value-word2
Forms: lload_0 = 30, lload_1 = 31, lload_2 = 32, lload_3 = 33
The value of the local variables at <n> and <n>+1 in the current Java frame is pushed onto the operand stack.
This instruction is the same as lload with a vindex of <n>, except that the operand <n> is implicit.

fload
Load single float from local variable
Syntax:
    fload = 23
    vindex
Stack: ... => ..., value
The value of the local variable at vindex in the current Java frame is pushed onto the operand stack.

fload_<n>
Load single float from local variable
Syntax: fload_<n>
Stack: ... => ..., value
Forms: fload_0 = 34, fload_1 = 35, fload_2 = 36, fload_3 = 37
The value of the local variable at <n> in the current Java frame is pushed onto the operand stack.
This instruction is the same as fload with a vindex of <n>, except that the operand <n> is implicit.

dload
Load double float from local variable
Syntax:
    dload = 24
    vindex
Stack: ... => ..., value-word1, value-word2
The value of the local variables at vindex and vindex+1 in the current Java frame is pushed onto the operand stack.

dload_<n>
Load double float from local variable
Syntax: dload_<n>
Stack: ... => ..., value-word1, value-word2
Forms: dload_0 = 38, dload_1 = 39, dload_2 = 40, dload_3 = 41
The value of the local variables at <n> and <n>+1 in the current Java frame is pushed onto the operand stack.
This instruction is the same as dload with a vindex of <n>, except that the operand <n> is implicit.

aload
Load object reference from local variable
Syntax:
    aload = 25
    vindex
Stack: ... => ..., value
The value of the local variable at vindex in the current Java frame is pushed onto the operand stack.

aload_<n>
Load object reference from local variable
Syntax: aload_<n>
Stack: ... => ..., value
Forms: aload_0 = 42, aload_1 = 43, aload_2 = 44, aload_3 = 45
The value of the local variable at <n> in the current Java frame is pushed onto the operand stack.
This instruction is the same as aload with a vindex of <n>, except that the operand <n> is implicit.

3.4 Storing Stack Values into Local Variables

istore
Store integer into local variable
Syntax:
    istore = 54
    vindex
Stack: ..., value => ...
value must be an integer. Local variable vindex in the current Java frame is set to value.

istore_<n>
Store integer into local variable
Syntax: istore_<n>
Stack: ..., value => ...
Forms: istore_0 = 59, istore_1 = 60, istore_2 = 61, istore_3 = 62
value must be an integer. Local variable <n> in the current Java frame is set to value.
This instruction is the same as istore with a vindex of <n>, except that the operand <n> is implicit.

lstore
Store long integer into local variable
Syntax:
    lstore = 55
    vindex
Stack: ..., value-word1, value-word2 => ...
value must be a long integer. Local variables vindex and vindex+1 in the current Java frame are set to value.

lstore_<n>
Store long integer into local variable
Syntax: lstore_<n>
Stack: ..., value-word1, value-word2 => ...
Forms: lstore_0 = 63, lstore_1 = 64, lstore_2 = 65, lstore_3 = 66
value must be a long integer. Local variables <n> and <n>+1 in the current Java frame are set to value.
This instruction is the same as lstore with a vindex of <n>, except that the operand <n> is implicit.

fstore
Store single float into local variable
Syntax:
    fstore = 56
    vindex
Stack: ..., value => ...
value must be a single-precision floating point number. Local variable vindex in the current Java frame is set to value.

fstore_<n>
Store single float into local variable
Syntax: fstore_<n>
Stack: ..., value => ...
Forms: fstore_0 = 67, fstore_1 = 68, fstore_2 = 69, fstore_3 = 70
value must be a single-precision floating point number. Local variable <n> in the current Java frame is set to value.
This instruction is the same as fstore with a vindex of <n>, except that the operand <n> is implicit.

dstore
Store double float into local variable
Syntax:
    dstore = 57
    vindex
Stack: ..., value-word1, value-word2 => ...
value must be a double-precision floating point number. Local variables vindex and vindex+1 in the current Java frame are set to value.

dstore_<n>
Store double float into local variable
Syntax: dstore_<n>
Stack: ..., value-word1, value-word2 => ...
Forms: dstore_0 = 71, dstore_1 = 72, dstore_2 = 73, dstore_3 = 74
value must be a double-precision floating point number. Local variables <n> and <n>+1 in the current Java frame are set to value.
This instruction is the same as dstore with a vindex of <n>, except that the operand <n> is implicit.

astore
Store object reference into local variable
Syntax:
    astore = 58
    vindex
Stack: ..., value => ...
value must be a return address or a reference to an object. Local variable vindex in the current Java frame is set to value.

astore_<n>
Store object reference into local variable
Syntax: astore_<n>
Stack: ..., value => ...
Forms: astore_0 = 75, astore_1 = 76, astore_2 = 77, astore_3 = 78
value must be a return address or a reference to an object. Local variable <n> in the current Java frame is set to value.
This instruction is the same as astore with a vindex of <n>, except that the operand <n> is implicit.

iinc
Increment local variable by constant
Syntax:
    iinc = 132
    vindex
    const
Stack: no change
Local variable vindex in the current Java frame must contain an integer. Its value is incremented by the value const, where const is treated as a signed 8-bit quantity.

3.5 Wider index for Loading, Storing and Incrementing

wide
Wider index for accessing local variables in load, store and increment.
Syntax:
    wide = 196
    vindex2
Stack: no change
This bytecode must precede one of the following bytecodes: iload, lload, fload, dload, aload, istore, lstore, fstore, dstore, astore, iinc. The vindex of the following bytecode and vindex2 from this bytecode are assembled into an unsigned 16-bit index to a local variable in the current Java frame. The following bytecode operates as normal except for the use of this wider index.

3.6 Managing Arrays

newarray
Allocate new array
Syntax:
    newarray = 188
    atype
Stack: ..., size => result
size must be an integer. It represents the number of elements in the new array.
atype is an internal code that indicates the type of array to allocate. Possible values for atype are as follows:

    T_BOOLEAN   4
    T_CHAR      5
    T_FLOAT     6
    T_DOUBLE    7
    T_BYTE      8
    T_SHORT     9
    T_INT       10
    T_LONG      11

A new array of atype, capable of holding size elements, is allocated, and result is a reference to this new object. Allocation of an array large enough to contain size items of atype is attempted. All elements of the array are initialized to zero.
If size is less than zero, a NegativeArraySizeException is thrown. If there is not enough memory to allocate the array, an OutOfMemoryError is thrown.
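The index assembly performed by wide and the signed constant used by iinc can be sketched in Java as below. The names are hypothetical, and the choice of vindex2 as the high-order byte is an assumption; the text only says that the two bytes are assembled into an unsigned 16-bit index.

```java
final class WideIndex {
    // Assumption: vindex2 (from wide) is the high-order byte and
    // vindex (from the following bytecode) is the low-order byte.
    static int wideIndex(int vindex2, int vindex) {
        return ((vindex2 & 0xFF) << 8) | (vindex & 0xFF);
    }

    // iinc: const is treated as a signed 8-bit quantity.
    static void iinc(int[] locals, int vindex, int constByte) {
        locals[vindex] += (byte) constByte;
    }

    public static void main(String[] args) {
        int[] locals = new int[300];
        int index = wideIndex(1, 2); // 258
        locals[index] = 10;
        iinc(locals, index, 0xFF);   // const byte 0xFF means -1
        System.out.println(index + " -> " + locals[index]); // 258 -> 9
    }
}
```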

anewarray
Allocate new array of references to objects
Syntax:
    anewarray = 189
    indexbyte1
    indexbyte2
Stack: ..., size => result
size must be an integer. It represents the number of elements in the new array.
indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The item at that index is resolved. The resulting entry must be a class.
A new array of the indicated class type and capable of holding size elements is allocated, and result is a reference to this new object. Allocation of an array large enough to contain size items of the given class type is attempted. All elements of the array are initialized to null.
If size is less than zero, a NegativeArraySizeException is thrown. If there is not enough memory to allocate the array, an OutOfMemoryError is thrown.
anewarray is used to create a single dimension of an array of object references. For example, to create

    new Thread[7]

the following code is used:

    bipush 7
    anewarray <Class "java.lang.Thread">

anewarray can also be used to create the first dimension of a multi-dimensional array. For example, the following array declaration:

    new int[6][]

is created with the following code:

    bipush 6
    anewarray <Class "[I">

See CONSTANT_Class in the "Class File Format" chapter for information on array class names.

multianewarray
Allocate new multi-dimensional array
Syntax:
    multianewarray = 197
    indexbyte1
    indexbyte2
    dimensions
Stack: ..., size1, size2, ..., sizen => result
Each size must be an integer. Each represents the number of elements in a dimension of the array.
indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The item at that index is resolved. The resulting entry must be an array class of one or more dimensions.
dimensions has the following aspects:
• It must be an integer ≥ 1.
• It represents the number of dimensions being created. It must be ≤ the number of dimensions of the array class.
• It represents the number of elements that are popped off the stack. All must be integers greater than or equal to zero. These are used as the sizes of the dimensions.
For example, to create

    new int[6][3][]

the following code is used:

    bipush 6
    bipush 3
    multianewarray <Class "[[[I"> 2

arraylength

iaload

laload
Stack: ..., arrayref, index => ..., value-word1, value-word2
arrayref must be a reference to an array of long integers. index must be an integer. The long integer value at position number index in the array is retrieved and pushed onto the top of the stack.
If arrayref is null a NullPointerException is thrown. If index is not within the bounds of the array an ArrayIndexOutOfBoundsException is thrown.

faload
Load single float from array
Syntax: faload = 48
Stack: ..., arrayref, index => ..., value
arrayref must be a reference to an array of single-precision floating point numbers. index must be an integer. The single-precision floating point number value at position number index in the array is retrieved and pushed onto the top of the stack.
If arrayref is null a NullPointerException is thrown. If index is not within the bounds of the array an ArrayIndexOutOfBoundsException is thrown.

daload
Load double float from array
Syntax: daload = 49
Stack: ..., arrayref, index => ..., value-word1, value-word2
arrayref must be a reference to an array of double-precision floating point numbers. index must be an integer. The double-precision floating point number value at position number index in the array is retrieved and pushed onto the top of the stack.
If arrayref is null a NullPointerException is thrown. If index is not within the bounds of the array an ArrayIndexOutOfBoundsException is thrown.

aaload
Load object reference from array
Syntax: aaload = 50
Stack: ..., arrayref, index => ..., value
arrayref must be a reference to an array of references to objects. index must be an integer. The object reference at position number index in the array is retrieved and pushed onto the top of the stack.
If arrayref is null a NullPointerException is thrown. If index is not within the bounds of the array an ArrayIndexOutOfBoundsException is thrown.

baload
Load signed byte from array
Syntax: baload = 51
Stack: ..., arrayref, index => ..., value
arrayref must be a reference to an array of signed bytes. index must be an integer. The signed byte value at position number index in the array is retrieved, expanded to an integer, and pushed onto the top of the stack.
If arrayref is null a NullPointerException is thrown. If index is not within the bounds of the array an ArrayIndexOutOfBoundsException is thrown.

caload
Load character from array
Syntax: caload = 52
Stack: ..., arrayref, index => ..., value
arrayref must be a reference to an array of characters. index must be an integer. The character value at position number index in the array is retrieved, zero-extended to an integer, and pushed onto the top of the stack.
If arrayref is null a NullPointerException is thrown. If index is not within the bounds of the array an ArrayIndexOutOfBoundsException is thrown.

saload
Load short from array
Syntax: saload = 53
Stack: ..., arrayref, index => ..., value
arrayref must be a reference to an array of short integers. index must be an integer. The signed short integer value at position number index in the array is retrieved, expanded to an integer, and pushed onto the top of the stack.
If arrayref is null, a NullPointerException is thrown. If index is not within the bounds of the array an ArrayIndexOutOfBoundsException is thrown.

iastore
Store into integer array
Syntax: iastore = 79
Stack: ..., arrayref, index, value => ...
arrayref must be a reference to an array of integers, index must be an integer, and value an integer. The integer value is stored at position index in the array.
If arrayref is null, a NullPointerException is thrown. If index is not within the bounds of the array an ArrayIndexOutOfBoundsException is thrown.

lastore
Store into long integer array
Syntax: lastore = 80
Stack: ..., arrayref, index, value-word1, value-word2 => ...
arrayref must be a reference to an array of long integers, index must be an integer, and value a long integer. The long integer value is stored at position index in the array.
If arrayref is null, a NullPointerException is thrown. If index is not within the bounds of the array, an ArrayIndexOutOfBoundsException is thrown.
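The difference between the sign-extending loads (baload, saload) and the zero-extending load (caload) is mirrored by ordinary Java array reads, which also raise the two exceptions named above. The sketch below is illustrative only; the class and method names are hypothetical.

```java
final class ArrayLoads {
    // baload: the signed byte is expanded (sign-extended) to an int.
    static int baload(byte[] arrayref, int index) {
        return arrayref[index];
    }

    // caload: the character is zero-extended to an int.
    static int caload(char[] arrayref, int index) {
        return arrayref[index];
    }

    // saload: the signed short is expanded (sign-extended) to an int.
    static int saload(short[] arrayref, int index) {
        return arrayref[index];
    }

    public static void main(String[] args) {
        System.out.println(baload(new byte[] {(byte) 0x80}, 0));     // -128
        System.out.println(caload(new char[] {'\uFFFF'}, 0));        // 65535
        System.out.println(saload(new short[] {(short) 0x8000}, 0)); // -32768
    }
}
```

Passing a null array reference to any of these methods throws NullPointerException, and an out-of-range index throws ArrayIndexOutOfBoundsException, matching the behavior the instruction descriptions require.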

fastore
Store into single float array
Syntax: fastore = 81
Stack: ..., arrayref, index, value => ...
arrayref must be an array of single-precision floating point numbers, index must be an integer, and value a single-precision floating point number. The single float value is stored at position index in the array.
If arrayref is null, a NullPointerException is thrown. If index is not within the bounds of the array an ArrayIndexOutOfBoundsException is thrown.

dastore
Store into double float array
Syntax: dastore = 82
Stack: ..., arrayref, index, value-word1, value-word2 => ...
arrayref must be a reference to an array of double-precision floating point numbers, index must be an integer, and value a double-precision floating point number. The double float value is stored at position index in the array.
If arrayref is null, a NullPointerException is thrown. If index is not within the bounds of the array an ArrayIndexOutOfBoundsException is thrown.

aastore
Store into object reference array
Syntax: aastore = 83
Stack: ..., arrayref, index, value => ...
arrayref must be a reference to an array of references to objects, index must be an integer, and value a reference to an object. The object reference value is stored at position index in the array.
If arrayref is null, a NullPointerException is thrown. If index is not within the bounds of the array, an ArrayIndexOutOfBoundsException is thrown.
The actual type of value must be conformable with the actual type of the elements of the array. For example, it is legal to store an instance of class Thread in an array of class Object, but not vice versa. (See the Java Language Specification for information on how to determine whether an object reference is an instance of a class.) An ArrayStoreException is thrown if an attempt is made to store an incompatible object reference.
Note: Mustn't refer to the Java Language Specification; give semantics here.

bastore
Store into signed byte array
Syntax: bastore = 84
Stack: ..., arrayref, index, value => ...
arrayref must be a reference to an array of signed bytes, index must be an integer, and value an integer. The integer value is stored at position index in the array. If value is too large to be a signed byte, it is truncated.
If arrayref is null, a NullPointerException is thrown. If index is not within the bounds of the array an ArrayIndexOutOfBoundsException is thrown.

castore
Store into character array
Syntax: castore = 85
Stack: ..., arrayref, index, value => ...
arrayref must be an array of characters, index must be an integer, and value an integer. The integer value is stored at position index in the array. If value is too large to be a character, it is truncated.
If arrayref is null, a NullPointerException is thrown. If index is not within the bounds of the array an ArrayIndexOutOfBoundsException is thrown.

sastore
Store into short array
Syntax: sastore = 86
Stack: ..., arrayref, index, value => ...
arrayref must be an array of shorts, index must be an integer, and value an integer. The integer value is stored at position index in the array. If value is too large to be a short, it is truncated.
If arrayref is null, a NullPointerException is thrown. If index is not within the bounds of the array an ArrayIndexOutOfBoundsException is thrown.

3.7 Stack Instructions

nop
Do nothing
Syntax: nop = 0
Stack: no change
Do nothing.
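The truncation applied by bastore, castore and sastore, and the type check applied by aastore, correspond to behavior that can be observed directly in Java. The following sketch uses hypothetical class and method names and reads "truncated" as keeping the low-order bits, which is an assumption about the intended meaning.

```java
final class ArrayStores {
    // bastore: an int too large for a signed byte keeps only its low 8 bits.
    static void bastore(byte[] arrayref, int index, int value) {
        arrayref[index] = (byte) value;
    }

    // castore: an int too large for a character keeps only its low 16 bits.
    static void castore(char[] arrayref, int index, int value) {
        arrayref[index] = (char) value;
    }

    // sastore: an int too large for a short keeps only its low 16 bits.
    static void sastore(short[] arrayref, int index, int value) {
        arrayref[index] = (short) value;
    }

    // aastore: the runtime rejects a value that does not conform to the
    // actual element type of the array with an ArrayStoreException.
    static void aastore(Object[] arrayref, int index, Object value) {
        arrayref[index] = value;
    }
}
```

For example, storing a Thread into an array created as new Object[1] succeeds, while storing a plain Object into an array created as new Thread[1] throws ArrayStoreException, which is the "not vice versa" case described for aastore.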

pop
Pop top stack word
Syntax: pop = 87
Stack: ..., any => ...
Pop the top word from the stack.

pop2
Pop top two stack words
Syntax: pop2 = 88
Stack: ..., any2, any1 => ...
Pop the top two words from the stack.

dup
Duplicate top stack word
Syntax: dup = 89
Stack: ..., any => ..., any, any
Duplicate the top word on the stack.

dup2
Duplicate top two stack words
Syntax: dup2 = 92
Stack: ..., any2, any1 => ..., any2, any1, any2, any1
Duplicate the top two words on the stack.

dup_x1
Duplicate top stack word and put two down
Syntax: dup_x1 = 90
Stack: ..., any2, any1 => ..., any1, any2, any1
Duplicate the top word on the stack and insert the copy two words down in the stack.

dup2_x1
Duplicate top two stack words and put two down
Syntax: dup2_x1 = 93
Stack: ..., any3, any2, any1 => ..., any2, any1, any3, any2, any1
Duplicate the top two words on the stack and insert the copies two words down in the stack.

dup_x2
Duplicate top stack word and put three down
Syntax: dup_x2 = 91
Stack: ..., any3, any2, any1 => ..., any1, any3, any2, any1
Duplicate the top word on the stack and insert the copy three words down in the stack.

dup2_x2
Duplicate top two stack words and put three down
Syntax: dup2_x2 = 94
Stack: ..., any4, any3, any2, any1 => ..., any2, any1, any4, any3, any2, any1
Duplicate the top two words on the stack and insert the copies three words down in the stack.

swap
Swap top two stack words
Syntax: swap = 95
Stack: ..., any2, any1 => ..., any1, any2
Swap the top two elements on the stack.

3.8 Arithmetic Instructions

iadd
Integer add
Syntax: iadd = 96
Stack: ..., value1, value2 => ..., result
value1 and value2 must be integers. The values are added and are replaced on the stack by their integer sum.
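The dup family can be modeled with one helper that copies the top words and re-inserts the copies below a given number of further words. The sketch below is a hypothetical model in which the top of the stack is the end of a list; the helper name and its parameters are not part of the instruction set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class StackWords {
    // Copies the top `count` words and inserts the copies beneath the
    // originals and `skipped` further words.
    // dup = (1, 0), dup_x1 = (1, 1), dup_x2 = (1, 2),
    // dup2 = (2, 0), dup2_x1 = (2, 1), dup2_x2 = (2, 2).
    static void dup(List<String> stack, int count, int skipped) {
        int size = stack.size();
        List<String> copies = new ArrayList<>(stack.subList(size - count, size));
        stack.addAll(size - count - skipped, copies);
    }

    // swap: exchange the top two words.
    static void swap(List<String> stack) {
        int size = stack.size();
        Collections.swap(stack, size - 1, size - 2);
    }

    public static void main(String[] args) {
        List<String> stack = new ArrayList<>(Arrays.asList("any3", "any2", "any1"));
        dup(stack, 2, 1); // dup2_x1
        System.out.println(stack); // [any2, any1, any3, any2, any1]
    }
}
```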

ladd
Long integer add
Syntax: ladd = 97
Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result-word1, result-word2
value1 and value2 must be long integers. The values are added and are replaced on the stack by their long integer sum.

fadd
Single floats add
Syntax: fadd = 98
Stack: ..., value1, value2 => ..., result
value1 and value2 must be single-precision floating point numbers. The values are added and are replaced on the stack by their single-precision floating point sum.

dadd
Double floats add
Syntax: dadd = 99
Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result-word1, result-word2
value1 and value2 must be double-precision floating point numbers. The values are added and are replaced on the stack by their double-precision floating point sum.

isub
Integer subtract
Syntax: isub = 100
Stack: ..., value1, value2 => ..., result
value1 and value2 must be integers. value2 is subtracted from value1, and both values are replaced on the stack by their integer difference.

lsub
Long integer subtract
Syntax: lsub = 101
Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result-word1, result-word2
value1 and value2 must be long integers. value2 is subtracted from value1, and both values are replaced on the stack by their long integer difference.

fsub
Single float subtract
Syntax: fsub = 102
Stack: ..., value1, value2 => ..., result
value1 and value2 must be single-precision floating point numbers. value2 is subtracted from value1, and both values are replaced on the stack by their single-precision floating point difference.

dsub
Double float subtract
Syntax: dsub = 103
Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result-word1, result-word2
value1 and value2 must be double-precision floating point numbers. value2 is subtracted from value1, and both values are replaced on the stack by their double-precision floating point difference.

imul
Integer multiply
Syntax: imul = 104
Stack: ..., value1, value2 => ..., result
value1 and value2 must be integers. Both values are replaced on the stack by their integer product.

lmul
Long integer multiply
Syntax: lmul = 105
Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result-word1, result-word2
value1 and value2 must be long integers. Both values are replaced on the stack by their long integer product.

fmul
Single float multiply
Syntax: fmul = 106
Stack: ..., value1, value2 => ..., result
value1 and value2 must be single-precision floating point numbers. Both values are replaced on the stack by their single-precision floating point product.
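Since value2 is the topmost operand, a stack-based implementation of a non-commutative instruction such as isub has to pop value2 first and value1 second. A minimal Java sketch, using a hypothetical class name and an ArrayDeque as the operand stack:

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class OperandOrder {
    // isub: Stack: ..., value1, value2 => ..., result
    static void isub(Deque<Integer> stack) {
        int value2 = stack.pop(); // top of stack
        int value1 = stack.pop(); // word beneath it
        stack.push(value1 - value2);
    }

    public static void main(String[] args) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(10); // value1
        stack.push(3);  // value2
        isub(stack);
        System.out.println(stack.peek()); // 7
    }
}
```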

dmul
Double float multiply
Syntax: dmul = 107
Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result-word1, result-word2
value1 and value2 must be double-precision floating point numbers. Both values are replaced on the stack by their double-precision floating point product.

idiv
Integer divide
Syntax: idiv = 108
Stack: ..., value1, value2 => ..., result
value1 and value2 must be integers. value1 is divided by value2, and both values are replaced on the stack by their integer quotient.
The result is truncated to the nearest integer that is between it and 0. An attempt to divide by zero results in a "/ by zero" ArithmeticException being thrown.

ldiv
Long integer divide
Syntax: ldiv = 109
Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result-word1, result-word2
value1 and value2 must be long integers. value1 is divided by value2, and both values are replaced on the stack by their long integer quotient.
The result is truncated to the nearest integer that is between it and 0. An attempt to divide by zero results in a "/ by zero" ArithmeticException being thrown.

fdiv
Single float divide
Syntax: fdiv = 110
Stack: ..., value1, value2 => ..., result
value1 and value2 must be single-precision floating point numbers. value1 is divided by value2, and both values are replaced on the stack by their single-precision floating point quotient.
Divide by zero results in the quotient being NaN.

ddiv
Double float divide
Syntax: ddiv = 111
Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result-word1, result-word2
value1 and value2 must be double-precision floating point numbers. value1 is divided by value2, and both values are replaced on the stack by their double-precision floating point quotient.
Divide by zero results in the quotient being NaN.

irem
Integer remainder
Syntax: irem = 112
Stack: ..., value1, value2 => ..., result
value1 and value2 must both be integers. value1 is divided by value2, and both values are replaced on the stack by their integer remainder.
An attempt to divide by zero results in a "/ by zero" ArithmeticException being thrown.
Note: need a description of the integer remainder semantics

lrem
Long integer remainder
Syntax: lrem = 113
Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result-word1, result-word2
value1 and value2 must both be long integers. value1 is divided by value2, and both values are replaced on the stack by their long integer remainder.
An attempt to divide by zero results in a "/ by zero" ArithmeticException being thrown.
Note: need a description of the integer remainder semantics

frem
Single float remainder
Syntax: frem = 114
Stack: ..., value1, value2 => ..., result
value1 and value2 must both be single-precision floating point numbers. value1 is divided by value2, and the quotient is truncated to an integer, and then multiplied by value2. The product is subtracted from value1. The result, as a single-precision floating point number, replaces both values on the stack.
result = value1 - (integral_part(value1 / value2) * value2), where integral_part() rounds to the nearest integer, with a tie going to the even number.
An attempt to divide by zero results in NaN.
Note: gls to provide a better definition of the floating remainder semantics
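The integer division rule and the floating remainder formula can be tried out in Java as sketched below. The class and method names are hypothetical. The remainder helper follows the formula as written, with integral_part() implemented by Math.rint (round to nearest, ties to even); it is not claimed to be the final semantics, which the note above leaves open.

```java
final class DivisionSemantics {
    // idiv: Java's int division also truncates toward zero and throws
    // ArithmeticException on a zero divisor.
    static int idiv(int value1, int value2) {
        return value1 / value2;
    }

    // Remainder by the stated formula:
    // result = value1 - (integral_part(value1 / value2) * value2)
    static double remByFormula(double value1, double value2) {
        return value1 - Math.rint(value1 / value2) * value2;
    }

    public static void main(String[] args) {
        System.out.println(idiv(-7, 2));            // -3
        System.out.println(remByFormula(5.0, 2.0)); // 1.0  (2.5 rounds to 2)
        System.out.println(remByFormula(7.0, 2.0)); // -1.0 (3.5 rounds to 4)
        System.out.println(remByFormula(1.0, 0.0)); // NaN
    }
}
```

Note that a round-to-nearest integral part can yield a negative remainder for positive operands, which a truncating integral part would not.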

drem
Double float remainder
Syntax: drem = 115
Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result-word1, result-word2
value1 and value2 must both be double-precision floating point numbers. value1 is divided by value2, and the quotient is truncated to an integer, and then multiplied by value2. The product is subtracted from value1. The result, as a double-precision floating point number, replaces both values on the stack.
result = value1 - (integral_part(value1 / value2) * value2), where integral_part() rounds to the nearest integer, with a tie going to the even number.
An attempt to divide by zero results in NaN.
Note: gls to provide a better definition of the floating remainder semantics

ineg
Integer negate
Syntax: ineg = 116
Stack: ..., value => ..., result
value must be an integer. It is replaced on the stack by its arithmetic negation.

lneg
Long integer negate
Syntax: lneg = 117
Stack: ..., value-word1, value-word2 => ..., result-word1, result-word2
value must be a long integer. It is replaced on the stack by its arithmetic negation.

fneg
Single float negate
Syntax: fneg = 118
Stack: ..., value => ..., result
value must be a single-precision floating point number. It is replaced on the stack by its arithmetic negation.

dneg
Double float negate
Syntax: dneg = 119
Stack: ..., value-word1, value-word2 => ..., result-word1, result-word2
value must be a double-precision floating point number. It is replaced on the stack by its arithmetic negation.

3.9 Logical Instructions

ishl
Integer shift left
Syntax: ishl = 120
Stack: ..., value1, value2 => ..., result
value1 and value2 must be integers. value1 is shifted left by the amount indicated by the low five bits of value2. The integer result replaces both values on the stack.

ishr
Integer arithmetic shift right
Syntax: ishr = 122
Stack: ..., value1, value2 => ..., result
value1 and value2 must be integers. value1 is shifted right arithmetically (with sign extension) by the amount indicated by the low five bits of value2. The integer result replaces both values on the stack.

iushr
Integer logical shift right
Syntax: iushr = 124
Stack: ..., value1, value2 => ..., result
value1 and value2 must be integers. value1 is shifted right logically (with no sign extension) by the amount indicated by the low five bits of value2. The integer result replaces both values on the stack.

lshl
Long integer shift left
Syntax: lshl = 121
Stack: ..., value1-word1, value1-word2, value2 => ..., result-word1, result-word2
value1 must be a long integer and value2 must be an integer. value1 is shifted left by the amount indicated by the low six bits of value2. The long integer result replaces both values on the stack.

lshr
Long integer arithmetic shift right
Syntax: lshr = 123
Stack: ..., value1-word1, value1-word2, value2 => ..., result-word1, result-word2
value1 must be a long integer and value2 must be an integer. value1 is shifted right arithmetically (with sign extension) by the amount indicated by the low six bits of value2. The long integer result replaces both values on the stack.
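The "low five bits" and "low six bits" rules for shift amounts can be made explicit by masking value2 before shifting. The Java sketch below uses hypothetical names; the masks are written out for clarity even though Java's own shift operators apply the same masking.

```java
final class ShiftAmounts {
    // Integer shifts use only the low five bits of value2 (0..31).
    static int ishl(int value1, int value2)  { return value1 << (value2 & 0x1F); }
    static int ishr(int value1, int value2)  { return value1 >> (value2 & 0x1F); }  // sign extension
    static int iushr(int value1, int value2) { return value1 >>> (value2 & 0x1F); } // no sign extension

    // Long shifts use only the low six bits of value2 (0..63).
    static long lshl(long value1, int value2)  { return value1 << (value2 & 0x3F); }
    static long lshr(long value1, int value2)  { return value1 >> (value2 & 0x3F); }
    static long lushr(long value1, int value2) { return value1 >>> (value2 & 0x3F); }

    public static void main(String[] args) {
        System.out.println(ishl(1, 33));   // 2, because 33 & 0x1F == 1
        System.out.println(ishr(-8, 1));   // -4
        System.out.println(iushr(-1, 28)); // 15
        System.out.println(lshl(1L, 65));  // 2, because 65 & 0x3F == 1
    }
}
```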

lushr
Long integer logical shift right
Syntax: lushr = 125
Stack: ..., value1-word1, value1-word2, value2 => ..., result-word1, result-word2
value1 must be a long integer and value2 must be an integer. value1 is shifted right logically (with no sign extension) by the amount indicated by the low six bits of value2. The long integer result replaces both values on the stack.

iand
Integer boolean AND
Syntax: iand = 126
Stack: ..., value1, value2 => ..., result
value1 and value2 must both be integers. They are replaced on the stack by their bitwise logical and (conjunction).

land
Long integer boolean AND
Syntax: land = 127
Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result-word1, result-word2
value1 and value2 must both be long integers. They are replaced on the stack by their bitwise logical and (conjunction).

ior
Integer boolean OR
Syntax: ior = 128
Stack: ..., value1, value2 => ..., result
value1 and value2 must both be integers. They are replaced on the stack by their bitwise logical or (disjunction).

lor
Long integer boolean OR
Syntax: lor = 129
Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result-word1, result-word2
value1 and value2 must both be long integers. They are replaced on the stack by their bitwise logical or (disjunction).

ixor
Integer boolean XOR
Syntax: ixor = 130
Stack: ..., value1, value2 => ..., result
value1 and value2 must both be integers. They are replaced on the stack by their bitwise exclusive or (exclusive disjunction).

lxor
Long integer boolean XOR
Syntax: lxor = 131
Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result-word1, result-word2
value1 and value2 must both be long integers. They are replaced on the stack by their bitwise exclusive or (exclusive disjunction).

3.10 Conversion Operations

i2l
Integer to long integer conversion
Syntax: i2l = 133
Stack: ..., value => ..., result-word1, result-word2
value must be an integer. It is converted to a long integer. The result replaces value on the stack.

i2f
Integer to single float
Syntax: i2f = 134
Stack: ..., value => ..., result
value must be an integer. It is converted to a single-precision floating point number. The result replaces value on the stack.

i2d
Integer to double float
Syntax: i2d = 135
Stack: ..., value => ..., result-word1, result-word2
value must be an integer. It is converted to a double-precision floating point number. The result replaces value on the stack.

l2i

Long integer to integer

Syntax:
    l2i = 136

Stack: ..., value-word1, value-word2 => ..., result

value must be a long integer. It is converted to an integer by taking the low-order 32 bits. The result replaces value on the stack.

l2f

Long integer to single float

Syntax:
    l2f = 137

Stack: ..., value-word1, value-word2 => ..., result

value must be a long integer. It is converted to a single-precision floating point number. The result replaces value on the stack.

l2d

Long integer to double float

Syntax:
    l2d = 138

Stack: ..., value-word1, value-word2 => ..., result-word1, result-word2

value must be a long integer. It is converted to a double-precision floating point number. The result replaces value on the stack.

f2i

Single float to integer

Syntax:
    f2i = 139

Stack: ..., value => ..., result

value must be a single-precision floating point number. It is converted to an integer. The result replaces value on the stack. See The Java Language Specification for details on converting floating point numbers to integers.

Note: Mustn't refer to the Java Language Specification; give semantics here.

f2l

Single float to long integer

Syntax:
    f2l = 140

Stack: ..., value => ..., result-word1, result-word2

value must be a single-precision floating point number. It is converted to a long integer. The result replaces value on the stack. See The Java Language Specification for details on converting floating point numbers to integers.

Note: Mustn't refer to the Java Language Specification; give semantics here.

f2d

Single float to double float

Syntax:
    f2d = 141

Stack: ..., value => ..., result-word1, result-word2

value must be a single-precision floating point number. It is converted to a double-precision floating point number. The result replaces value on the stack.

d2i

Double float to integer

Syntax:
    d2i = 142

Stack: ..., value-word1, value-word2 => ..., result

value must be a double-precision floating point number. It is converted to an integer. The result replaces value on the stack. See The Java Language Specification for details on converting floating point numbers to integers.

Note: Mustn't refer to the Java Language Specification; give semantics here.

d2l

Double float to long integer

Syntax:
    d2l = 143

Stack: ..., value-word1, value-word2 => ..., result-word1, result-word2

value must be a double-precision floating point number. It is converted to a long integer. The result replaces value on the stack. See The Java Language Specification for details on converting floating point numbers to integers.

Note: Mustn't refer to the Java Language Specification; give semantics here.
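Of the conversions above, l2i is one whose result this section states exactly: the low-order 32 bits of the long integer. A minimal Java sketch of that rule follows. The class and method names are hypothetical, and a Java long stands in for the two-word stack value.

```java
// Hypothetical helper, for illustration only; not part of the specification.
final class L2iSketch {
    // l2i: keep the low-order 32 bits of a long integer, discard the rest.
    static int l2i(long value) {
        long low32 = value & 0xFFFFFFFFL; // mask off the high-order 32 bits
        return (int) low32;               // reinterpret the low 32 bits as a signed int
    }
}
```

The floating point to integer conversions are not sketched here because their semantics are not given in this section.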

d2f

Double float to single float

Syntax:
    d2f = 144

Stack: ..., value-word1, value-word2 => ..., result

value must be a double-precision floating point number. It is converted to a single-precision floating point number. If overflow occurs, the result must be infinity with the same sign as value. The result replaces value on the stack.

int2byte

Integer to signed byte

Syntax:
    int2byte = 145

Stack: ..., value => ..., result

value must be an integer. It is truncated to a signed 8-bit result, then sign extended to an integer. The result replaces value on the stack.

int2char

Integer to char

Syntax:
    int2char = 146

Stack: ..., value => ..., result

value must be an integer. It is truncated to an unsigned 16-bit result, then zero extended to an integer. The result replaces value on the stack.

int2short

Integer to short

Syntax:
    int2short = 147

Stack: ..., value => ..., result

value must be an integer. It is truncated to a signed 16-bit result, then sign extended to an integer. The result replaces value on the stack.

3.11 Control Transfer Instructions

ifeq

Branch if equal to 0

Syntax:
    ifeq = 153
    branchbyte1
    branchbyte2

Stack: ..., value => ...

value must be an integer. It is popped from the stack. If value is zero, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the ifeq.

ifnull

Branch if null

Syntax:
    ifnull = 198
    branchbyte1
    branchbyte2

Stack: ..., value => ...

value must be a reference to an object. It is popped from the stack. If value is null, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the ifnull.

iflt

Branch if less than 0

Syntax:
    iflt = 155
    branchbyte1
    branchbyte2

Stack: ..., value => ...

value must be an integer. It is popped from the stack. If value is less than zero, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the iflt.
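The branch entries above all build a signed 16-bit offset from two operand bytes and add it to the address of the branching instruction. The following Java sketch illustrates this for iflt. The class and method names are hypothetical. The sketch assumes that branchbyte1 is the high-order byte, which this section does not state, and that the fall-through address is three bytes past the opcode (one opcode byte plus two operand bytes, as in the syntax boxes).

```java
// Hypothetical helper, for illustration only; not part of the specification.
final class BranchSketch {
    // Assumption: branchbyte1 holds the high-order 8 bits of the offset.
    static int branchOffset(int branchbyte1, int branchbyte2) {
        int unsigned16 = ((branchbyte1 & 0xFF) << 8) | (branchbyte2 & 0xFF);
        return (short) unsigned16;  // the cast to short sign-extends bit 15
    }

    // iflt: returns the address at which execution proceeds.
    static int iflt(int pc, int value, int branchbyte1, int branchbyte2) {
        if (value < 0) {
            return pc + branchOffset(branchbyte1, branchbyte2); // relative to this instruction
        }
        return pc + 3; // opcode byte plus two operand bytes
    }
}
```

For example, with the bytes 0xFF and 0xFB the sketch yields an offset of -5, so a taken branch at address 100 continues at address 95.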

ifle

Branch if less than or equal to 0

Syntax:
    ifle = 158
    branchbyte1
    branchbyte2

Stack: ..., value => ...

value must be an integer. It is popped from the stack. If value is less than or equal to zero, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the ifle.

ifge

Branch if greater than or equal to 0

Syntax:
    ifge = 156
    branchbyte1
    branchbyte2

Stack: ..., value => ...

value must be an integer. It is popped from the stack. If value is greater than or equal to zero, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the ifge.

ifne

Branch if not equal to 0

Syntax:
    ifne = 154
    branchbyte1
    branchbyte2

Stack: ..., value => ...

value must be an integer. It is popped from the stack. If value is not equal to zero, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the ifne.

ifnonnull

Branch if not null

Syntax:
    ifnonnull = 199
    branchbyte1
    branchbyte2

Stack: ..., value => ...

value must be a reference to an object. It is popped from the stack. If value is not null, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the ifnonnull.

ifgt

Branch if greater than 0

Syntax:
    ifgt = 157
    branchbyte1
    branchbyte2

Stack: ..., value => ...

value must be an integer. It is popped from the stack. If value is greater than zero, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the ifgt.

if_icmpeq

Branch if integers equal

Syntax:
    if_icmpeq = 159
    branchbyte1
    branchbyte2

Stack: ..., value1, value2 => ...

value1 and value2 must be integers. They are both popped from the stack. If value1 is equal to value2, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the if_icmpeq.

if_icmpne

Branch if integers not equal

Syntax:
    if_icmpne = 160
    branchbyte1
    branchbyte2

Stack: ..., value1, value2 => ...

value1 and value2 must be integers. They are both popped from the stack. If value1 is not equal to value2, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the if_icmpne.

if_icmplt

Branch if integer less than

Syntax:
    if_icmplt = 161
    branchbyte1
    branchbyte2

Stack: ..., value1, value2 => ...

value1 and value2 must be integers. They are both popped from the stack. If value1 is less than value2, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the if_icmplt.

if_icmpgt

Branch if integer greater than

Syntax:
    if_icmpgt = 163
    branchbyte1
    branchbyte2

Stack: ..., value1, value2 => ...

value1 and value2 must be integers. They are both popped from the stack. If value1 is greater than value2, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the if_icmpgt.

if_icmple

Branch if integer less than or equal to

Syntax:
    if_icmple = 164
    branchbyte1
    branchbyte2

Stack: ..., value1, value2 => ...

value1 and value2 must be integers. They are both popped from the stack. If value1 is less than or equal to value2, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the if_icmple.

if_icmpge

Branch if integer greater than or equal to

Syntax:
    if_icmpge = 162
    branchbyte1
    branchbyte2

Stack: ..., value1, value2 => ...

value1 and value2 must be integers. They are both popped from the stack. If value1 is greater than or equal to value2, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the if_icmpge.

lcmp

Long integer compare

Syntax:
    lcmp = 148

Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result

value1 and value2 must be long integers. They are both popped from the stack and compared. If value1 is greater than value2, the integer value 1 is pushed onto the stack. If value1 is equal to value2, the value 0 is pushed onto the stack. If value1 is less than value2, the value -1 is pushed onto the stack.

fcmpl

Single float compare (-1 on NaN)

Syntax:
    fcmpl = 149

Stack: ..., value1, value2 => ..., result

value1 and value2 must be single-precision floating point numbers. They are both popped from the stack and compared. If value1 is greater than value2, the integer value 1 is pushed onto the stack. If value1 is equal to value2, the value 0 is pushed onto the stack. If value1 is less than value2, the value -1 is pushed onto the stack.

If either value1 or value2 is NaN, the value -1 is pushed onto the stack.

fcmpg

Single float compare (1 on NaN)

Syntax:
    fcmpg = 150

Stack: ..., value1, value2 => ..., result

value1 and value2 must be single-precision floating point numbers. They are both popped from the stack and compared. If value1 is greater than value2, the integer value 1 is pushed onto the stack. If value1 is equal to value2, the value 0 is pushed onto the stack. If value1 is less than value2, the value -1 is pushed onto the stack.

If either value1 or value2 is NaN, the value 1 is pushed onto the stack.

dcmpl

Double float compare (-1 on NaN)

Syntax:
    dcmpl = 151

Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result

value1 and value2 must be double-precision floating point numbers. They are both popped from the stack and compared. If value1 is greater than value2, the integer value 1 is pushed onto the stack. If value1 is equal to value2, the value 0 is pushed onto the stack. If value1 is less than value2, the value -1 is pushed onto the stack.

If either value1 or value2 is NaN, the value -1 is pushed onto the stack.
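The floating point comparisons above differ only in the result they push when an operand is NaN. A minimal Java sketch of the single float pair follows. The class and method names are hypothetical, and the NaN test is made first so that the ordinary comparisons only ever see ordered operands.

```java
// Hypothetical helper, for illustration only; not part of the specification.
final class FloatCompareSketch {
    // Shared comparison; nanResult is what to push if either operand is NaN.
    static int fcmp(float value1, float value2, int nanResult) {
        if (Float.isNaN(value1) || Float.isNaN(value2)) {
            return nanResult;
        }
        if (value1 > value2) return 1;
        if (value1 == value2) return 0;
        return -1;
    }

    static int fcmpl(float value1, float value2) { return fcmp(value1, value2, -1); }

    static int fcmpg(float value1, float value2) { return fcmp(value1, value2, 1); }
}
```

A double float version would have the same shape with double operands and Double.isNaN.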

dcmpg

Double float compare (1 on NaN)

Syntax:
    dcmpg = 152

Stack: ..., value1-word1, value1-word2, value2-word1, value2-word2 => ..., result

value1 and value2 must be double-precision floating point numbers. They are both popped from the stack and compared. If value1 is greater than value2, the integer value 1 is pushed onto the stack. If value1 is equal to value2, the value 0 is pushed onto the stack. If value1 is less than value2, the value -1 is pushed onto the stack.

If either value1 or value2 is NaN, the value 1 is pushed onto the stack.

if_acmpeq

Branch if object references are equal

Syntax:
    if_acmpeq = 165
    branchbyte1
    branchbyte2

Stack: ..., value1, value2 => ...

value1 and value2 must be references to objects. They are both popped from the stack. If the objects referenced are the same, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the if_acmpeq.

if_acmpne

Branch if object references not equal

Syntax:
    if_acmpne = 166
    branchbyte1
    branchbyte2

Stack: ..., value1, value2 => ...

value1 and value2 must be references to objects. They are both popped from the stack. If the objects referenced are not the same, branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction. Otherwise execution proceeds at the instruction following the if_acmpne.

goto

Branch always

Syntax:
    goto = 167
    branchbyte1
    branchbyte2

Stack: no change

branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. Execution proceeds at that offset from the address of this instruction.

goto_w

Branch always (wide index)

Syntax:
    goto_w = 200
    branchbyte1
    branchbyte2
    branchbyte3
    branchbyte4

Stack: no change

branchbyte1, branchbyte2, branchbyte3, and branchbyte4 are used to construct a signed 32-bit offset. Execution proceeds at that offset from the address of this instruction.

jsr

Jump subroutine

Syntax:
    jsr = 168
    branchbyte1
    branchbyte2

Stack: ... => ..., return-address

branchbyte1 and branchbyte2 are used to construct a signed 16-bit offset. The address of the instruction immediately following the jsr is pushed onto the stack. Execution proceeds at the offset from the address of this instruction.

Note: The jsr instruction is used in the implementation of Java's finally keyword.

jsr_w

Jump subroutine (wide index)

Syntax:
    jsr_w = 201
    branchbyte1
    branchbyte2
    branchbyte3
    branchbyte4

Stack: ... => ..., return-address

branchbyte1, branchbyte2, branchbyte3, and branchbyte4 are used to construct a signed 32-bit offset. The address of the instruction immediately following the jsr_w is pushed onto the stack. Execution proceeds at the offset from the address of this instruction.

ret

Return from subroutine

Syntax:
    ret = 169
    vindex

Stack: no change

Local variable vindex in the current Java frame must contain a return address. The contents of the local variable are written into the pc.

Note that jsr pushes the address onto the stack, and ret gets it out of a local variable. This asymmetry is intentional.

Note: The ret instruction is used in the implementation of Java's finally keyword.

ret_w

Return from subroutine (wide index)

Syntax:
    ret_w = 209
    vindexbyte1
    vindexbyte2

Stack: no change

vindexbyte1 and vindexbyte2 are assembled into an unsigned 16-bit index to a local variable in the current Java frame. That local variable must contain a return address. The contents of the local variable are written into the pc. See the ret instruction for more information.

3.12 Function Return

ireturn

Return integer from function

Syntax:
    ireturn = 172

Stack: ..., value => [empty]

value must be an integer. The value value is pushed onto the stack of the previous execution environment. Any other values on the operand stack are discarded. The interpreter then returns control to its caller.

lreturn

Return long integer from function

Syntax:
    lreturn = 173

Stack: ..., value-word1, value-word2 => [empty]

value must be a long integer. The value value is pushed onto the stack of the previous execution environment. Any other values on the operand stack are discarded. The interpreter then returns control to its caller.

freturn

Return single float from function

Syntax:
    freturn = 174

Stack: ..., value => [empty]

value must be a single-precision floating point number. The value value is pushed onto the stack of the previous execution environment. Any other values on the operand stack are discarded. The interpreter then returns control to its caller.

dreturn

Return double float from function

Syntax:
    dreturn = 175

Stack: ..., value-word1, value-word2 => [empty]

value must be a double-precision floating point number. The value value is pushed onto the stack of the previous execution environment. Any other values on the operand stack are discarded. The interpreter then returns control to its caller.

areturn

Return object reference from function

Syntax:
    areturn = 176

Stack: ..., value => [empty]

value must be a reference to an object. The value value is pushed onto the stack of the previous execution environment. Any other values on the operand stack are discarded. The interpreter then returns control to its caller.

return

Return (void) from procedure

Syntax:
    return = 177

Stack: ... => [empty]

All values on the operand stack are discarded. The interpreter then returns control to its caller.

breakpoint

Stop and pass control to breakpoint handler

Syntax:
    breakpoint = 202

Stack: no change

3.13 Table Jumping

tableswitch

Access jump table by index and jump

Syntax:
    tableswitch = 170
    ...0-3 byte pad...
    default-offset1
    default-offset2
    default-offset3
    default-offset4
    low1
    low2
    low3
    low4
    high1
    high2
    high3
    high4
    ...jump offsets...

Stack: ..., index => ...

tableswitch is a variable length instruction. Immediately after the tableswitch instruction, between zero and three 0's are inserted as padding so that the next byte begins at an address that is a multiple of four. After the padding follow a series of signed 4-byte quantities: default-offset, low, high, and then high-low+1 further signed 4-byte offsets. The high-low+1 signed 4-byte offsets are treated as a 0-based jump table.

The index must be an integer. If index is less than low or index is greater than high, then default-offset is added to the address of this instruction. Otherwise, low is subtracted from index, and the index-low'th element of the jump table is extracted, and added to the address of this instruction.

lookupswitch

Access jump table by key match and jump

Syntax:
    lookupswitch = 171
    ...0-3 byte pad...
    default-offset1
    default-offset2
    default-offset3
    default-offset4
    npairs1
    npairs2
    npairs3
    npairs4
    ...match-offset pairs...

Stack: ..., key => ...

lookupswitch is a variable length instruction. Immediately after the lookupswitch instruction, between zero and three 0's are inserted as padding so that the next byte begins at an address that is a multiple of four.

Immediately after the padding are a series of pairs of signed 4-byte quantities. The first pair is special. The first item of that pair is the default offset, and the second item of that pair gives the number of pairs that follow. Each subsequent pair consists of a match and an offset.

The key must be an integer. The integer key on the stack is compared against each of the matches. If it is equal to one of them, the offset is added to the address of this instruction. If the key does not match any of the matches, the default offset is added to the address of this instruction.

3.14 Manipulating Object Fields

putfield

Set field in object

Syntax:
    putfield = 181
    indexbyte1
    indexbyte2

Stack: ..., objectref, value => ...
OR
Stack: ..., objectref, value-word1, value-word2 => ...

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The constant pool item will be a field reference to a class name and a field name. The item is resolved to a field block pointer which has both the field width (in bytes) and the field offset (in bytes).

The field at that offset from the start of the object referenced by objectref will be set to the value on the top of the stack.

This instruction deals with both 32-bit and 64-bit wide fields.

If objectref is null, a NullPointerException is generated.

If the specified field is a static field, an IncompatibleClassChangeError is thrown.
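Returning to the table-jumping instructions of section 3.13, the padding rule and the two target computations can be sketched in Java as follows. The class and method names are hypothetical. The sketch assumes that the padding starts at the byte after the one-byte opcode and that addresses are counted from an origin that is itself a multiple of four. The already-decoded operands are passed in as plain ints and arrays.

```java
// Hypothetical helper, for illustration only; not part of the specification.
final class TableSwitchSketch {
    // Number of zero bytes (0..3) inserted after the opcode so that the
    // next byte falls on an address that is a multiple of four.
    static int padding(int opcodeAddress) {
        int next = opcodeAddress + 1;   // first byte after the opcode
        return (4 - (next % 4)) % 4;
    }

    // tableswitch: jumpTable holds the high-low+1 offsets, 0-based.
    static int tableswitchTarget(int pc, int index, int defaultOffset,
                                 int low, int high, int[] jumpTable) {
        if (index < low || index > high) {
            return pc + defaultOffset;
        }
        return pc + jumpTable[index - low];
    }

    // lookupswitch: each pair is {match, offset}.
    static int lookupswitchTarget(int pc, int key, int defaultOffset, int[][] pairs) {
        for (int[] pair : pairs) {
            if (pair[0] == key) {
                return pc + pair[1];
            }
        }
        return pc + defaultOffset;
    }
}
```

Both computations add the selected offset to the address of the switch instruction itself, as the entries require, not to the address of the table.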

getfield

Fetch field from object

Syntax:
    getfield = 180
    indexbyte1
    indexbyte2

Stack: ..., objectref => ..., value
OR
Stack: ..., objectref => ..., value-word1, value-word2

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The constant pool item will be a field reference to a class name and a field name. The item is resolved to a field block pointer which has both the field width (in bytes) and the field offset (in bytes).

objectref must be a reference to an object. The value at offset into the object referenced by objectref replaces objectref on the top of the stack.

This instruction deals with both 32-bit and 64-bit wide fields.

If objectref is null, a NullPointerException is generated.

If the specified field is a static field, an IncompatibleClassChangeError is thrown.

putstatic

Set static field in class

Syntax:
    putstatic = 179
    indexbyte1
    indexbyte2

Stack: ..., value => ...
OR
Stack: ..., value-word1, value-word2 => ...

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The constant pool item will be a field reference to a static field of a class. That field will be set to have the value on the top of the stack.

This instruction works for both 32-bit and 64-bit wide fields.

If the specified field is a dynamic field, an IncompatibleClassChangeError is thrown.

getstatic

Get static field from class

Syntax:
    getstatic = 178
    indexbyte1
    indexbyte2

Stack: ..., => ..., value
OR
Stack: ..., => ..., value-word1, value-word2

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The constant pool item will be a field reference to a static field of a class.

This instruction deals with both 32-bit and 64-bit wide fields.

If the specified field is a dynamic field, an IncompatibleClassChangeError is generated.

3.15 Method Invocation

There are four instructions that implement method invocation.

invokevirtual       Invoke an instance method of an object, dispatching based on the runtime (virtual) type of the object. This is the normal method dispatch in Java.

invokenonvirtual    Invoke an instance method of an object, dispatching based on the compile-time (non-virtual) type of the object. This is used, for example, when the keyword super or the name of a superclass is used as a method qualifier.

invokestatic        Invoke a class (static) method in a named class.

invokeinterface     Invoke a method which is implemented by an interface, searching the methods implemented by the particular run-time object to find the appropriate method.

invokevirtual

Invoke instance method, dispatch based on run-time type

Syntax:
    invokevirtual = 182
    indexbyte1
    indexbyte2

Stack: ..., objectref, [arg1, [arg2 ...]], ... => ...

The operand stack must contain a reference to an object and some number of arguments. indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The item at that index in the constant pool contains the complete method signature. A pointer to the object's method table is retrieved from the object reference. The method signature is looked up in the method table. The method signature is guaranteed to exactly match one of the method signatures in the table.

The result of the lookup is an index into the method table of the named class, which is used with the object's dynamic type to look in the method table of that type, where a pointer to the method block for the matched method is found. The method block indicates the type of method (native, synchronized, and so on) and the number of arguments expected on the operand stack.

If the method is marked synchronized the monitor associated with objectref is entered.

The objectref and arguments are popped off this method's stack and become the initial values of the local variables of the new method. Execution continues with the first instruction of the new method.

If the object reference on the operand stack is null, a NullPointerException is thrown. If during the method invocation a stack overflow is detected, a StackOverflowError is thrown.

invokenonvirtual

Invoke instance method, dispatching based on compile-time type

Syntax:
    invokenonvirtual = 183
    indexbyte1
    indexbyte2

Stack: ..., objectref, [arg1, [arg2 ...]], ... => ...

The operand stack must contain a reference to an object and some number of arguments. indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The item at that index in the constant pool contains a complete method signature and class. The method signature is looked up in the method table of the class indicated. The method signature is guaranteed to exactly match one of the method signatures in the table.

The result of the lookup is a method block. The method block indicates the type of method (native, synchronized, and so on) and the number of arguments (nargs) expected on the operand stack.

If the method is marked synchronized the monitor associated with objectref is entered.

The objectref and arguments are popped off this method's stack and become the initial values of the local variables of the new method. Execution continues with the first instruction of the new method.

If the object reference on the operand stack is null, a NullPointerException is thrown. If during the method invocation a stack overflow is detected, a StackOverflowError is thrown.

invokestatic

Invoke a class (static) method

Syntax:
    invokestatic = 184
    indexbyte1
    indexbyte2

Stack: ..., [arg1, [arg2 ...]], ... => ...

The operand stack must contain some number of arguments. indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The item at that index in the constant pool contains the complete method signature and class. The method signature is looked up in the method table of the class indicated. The method signature is guaranteed to exactly match one of the method signatures in the class's method table.

The result of the lookup is a method block. The method block indicates the type of method (native, synchronized, and so on) and the number of arguments (nargs) expected on the operand stack.

If the method is marked synchronized the monitor associated with the class is entered.

The arguments are popped off this method's stack and become the initial values of the local variables of the new method. Execution continues with the first instruction of the new method.

If during the method invocation a stack overflow is detected, a StackOverflowError is thrown.

invokeinterface

Invoke interface method

Syntax:
    invokeinterface = 185
    indexbyte1
    indexbyte2
    nargs
    reserved

Stack: ..., objectref, [arg1, [arg2 ...]], ... => ...

The operand stack must contain a reference to an object and nargs-1 arguments. indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The item at that index in the constant pool contains the complete method signature. A pointer to the object's method table is retrieved from the object reference. The method signature is looked up in the method table. The method signature is guaranteed to exactly match one of the method signatures in the table.

The result of the lookup is a method block. The method block indicates the type of method (native, synchronized, and so on) but unlike invokevirtual and invokenonvirtual, the number of available arguments (nargs) is taken from the bytecode.

If the method is marked synchronized the monitor associated with objectref is entered.

The objectref and arguments are popped off this method's stack and become the initial values of the local variables of the new method. Execution continues with the first instruction of the new method.

If the objectref on the operand stack is null, a NullPointerException is thrown. If during the method invocation a stack overflow is detected, a StackOverflowError is thrown.

3.16 Exception Handling

athrow

Throw exception or error

Syntax:
    athrow = 191

Stack: ..., objectref => [undefined]

objectref must be a reference to an object which is a subclass of Throwable, which is thrown. The current Java stack frame is searched for the most recent catch clause that catches this class or a superclass of this class. If a matching catch list entry is found, it contains the location of the code for this exception. The pc is reset to the address indicated by the catch list entry, and execution continues there.

If no appropriate catch clause is found in the current stack frame, that frame is popped and the objectref is rethrown.

If objectref is null, then a NullPointerException is thrown instead.
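The search that athrow performs can be sketched in Java as a loop over stack frames. Everything named here (ThrowSketch, Frame, CatchEntry, the return value of -1 when no frame catches the object) is a hypothetical simplification. In particular, the sketch assumes each frame's catch list is already ordered with the most recent catch clause first, and it uses Class.isInstance to test "this class or a superclass of this class".

```java
import java.util.Deque;
import java.util.List;

// Hypothetical model, for illustration only; not part of the specification.
final class ThrowSketch {
    static final class CatchEntry {
        final Class<? extends Throwable> caught; // class named by the catch clause
        final int handlerPc;                     // location of the handler code
        CatchEntry(Class<? extends Throwable> caught, int handlerPc) {
            this.caught = caught;
            this.handlerPc = handlerPc;
        }
    }

    static final class Frame {
        final List<CatchEntry> catchList; // assumed: most recent clause first
        Frame(List<CatchEntry> catchList) { this.catchList = catchList; }
    }

    // Returns the pc at which execution continues, or -1 if every frame
    // was popped without finding a matching catch clause.
    static int athrow(Deque<Frame> frames, Throwable objectref) {
        // A null objectref causes a NullPointerException to be thrown instead.
        Throwable thrown = (objectref != null) ? objectref : new NullPointerException();
        while (!frames.isEmpty()) {
            for (CatchEntry entry : frames.peek().catchList) {
                if (entry.caught.isInstance(thrown)) {
                    return entry.handlerPc; // pc is reset; the frame stays in place
                }
            }
            frames.pop(); // no handler here: pop the frame and rethrow in the caller
        }
        return -1;
    }
}
```

The frame that supplies the handler is left on the stack, since execution continues inside it.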

3.17 Miscellaneous Object Operations

new

Create new object

Syntax:
    new = 187
    indexbyte1
    indexbyte2

Stack: ... => ..., objectref

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The item at that index must be a class name that can be resolved to a class pointer, class. A new instance of that class is then created and a reference to the object is pushed on the stack.

checkcast

Make sure object is of given type

Syntax:
    checkcast = 192
    indexbyte1
    indexbyte2

Stack: ..., objectref => ..., objectref

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The string at that index of the constant pool is presumed to be a class name which can be resolved to a class pointer, class. objectref must be a reference to an object.

checkcast determines whether objectref can be cast to be a reference to an object of class class. A null objectref can be cast to any class. Otherwise the referenced object must be an instance of class or one of its subclasses. (See the Java Language Specification for information on how to determine whether an objectref is an instance of a class.) If objectref can be cast to class, execution proceeds at the next instruction, and the objectref remains on the stack.

If objectref cannot be cast to class, a ClassCastException is thrown.

Note: Mustn't refer to the Java Language Specification; give semantics here.

instanceof

Determine if an object is of given type

Syntax:
    instanceof = 193
    indexbyte1
    indexbyte2

Stack: ..., objectref => ..., result

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The string at that index of the constant pool is presumed to be a class name which can be resolved to a class pointer, class. objectref must be a reference to an object.

instanceof determines whether objectref can be cast to be a reference to an object of the class class. This instruction will overwrite objectref with 1 if objectref is an instance of class or one of its subclasses. (See the Java Language Specification for information on how to determine whether an object reference is an instance of a class.) Otherwise, objectref is overwritten by 0. If objectref is null, it is overwritten by 0.

Note: Mustn't refer to the Java Language Specification; give semantics here.

3.18 Monitors

monitorenter

Enter monitored region of code

Syntax:
    monitorenter = 194

Stack: ..., objectref => ...

objectref must be a reference to an object.

The interpreter attempts to obtain exclusive access via a lock mechanism to objectref. If another thread already has objectref locked, then the current thread waits until the object is unlocked. If the current thread already has the object locked, then continue execution. If the object is not locked, then obtain an exclusive lock.

If objectref is null, then a NullPointerException is thrown instead.

monitorexit

Exit monitored region of code

Syntax:
    monitorexit = 195

Stack: ..., objectref => ...

objectref must be a reference to an object.

The lock on the object is released. If this is the last lock that this thread has on that object (one thread is allowed to have multiple locks on a single object), then other threads that are waiting for the object to be available are allowed to proceed.

If objectref is null, then a NullPointerException is thrown instead.
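The locking behaviour described for monitorenter and monitorexit, including one thread holding several locks on the same object, can be modelled with an owner field and an entry count. The Java sketch below is only an illustration under stated assumptions: the class and method names are hypothetical, the monitor is modelled as a standalone object instead of being attached to objectref, the null check is omitted, and throwing IllegalMonitorStateException when a non-owner exits is an assumption not taken from this section.

```java
// Hypothetical model, for illustration only; not part of the specification.
final class MonitorSketch {
    private Thread owner; // thread that currently holds the lock, or null
    private int count;    // number of locks the owner holds on this object

    // monitorenter: wait while another thread holds the lock, then take it.
    synchronized void enter() {
        Thread me = Thread.currentThread();
        while (owner != null && owner != me) {
            try {
                wait(); // another thread has the object locked
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        owner = me; // either unlocked, or already ours: continue execution
        count++;
    }

    // monitorexit: release one lock; wake waiters when the last one is released.
    synchronized void exit() {
        if (owner != Thread.currentThread()) {
            throw new IllegalMonitorStateException("not the owner"); // assumption
        }
        if (--count == 0) {
            owner = null;
            notifyAll(); // waiting threads are allowed to proceed
        }
    }

    synchronized int holdCount() { return count; }
}
```

A thread that calls enter twice must call exit twice before any waiting thread can proceed.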

Appendix A: An Optimization

The following set of pseudo-instructions suffixed by _quick are variants of Java virtual machine instructions. They are used to improve the speed of interpreting bytecodes. They are not part of the virtual machine specification or instruction set, and are invisible outside of a Java virtual machine implementation. However, inside a virtual machine implementation they have proven to be an effective optimization.

A compiler from Java source code to the Java virtual machine instruction set emits only non-_quick instructions. If the _quick pseudo-instructions are used, each instance of a non-_quick instruction with a _quick variant is overwritten on execution by its _quick variant. Subsequent execution of that instruction instance will be of the _quick variant.

In all cases, if an instruction has an alternative version with the suffix _quick, the instruction references the constant pool. If the _quick optimization is used, each non-_quick instruction with a _quick variant performs the following:
• Resolves the specified item in the constant pool
• Signals an error if the item in the constant pool could not be resolved for some reason
• Turns itself into the _quick version of the instruction. The instructions putstatic, getstatic, putfield, and getfield each have two _quick versions.
• Performs its intended operation

This is identical to the action of the instruction without the _quick optimization, except for the additional step in which the instruction overwrites itself with its _quick variant.

The _quick variant of an instruction assumes that the item in the constant pool has already been resolved, and that this resolution did not generate any errors. It simply performs the intended operation on the resolved item.

Note: some of the invoke methods only support a single-byte offset into the method table of the object; for objects with 256 or more methods some invocations cannot be "quicked" with only these bytecodes. We also need to define or change existing getfield and putfield bytecodes to support more than a byte of offset.

This Appendix doesn't give the opcode values of the pseudo-instructions, since they are invisible and subject to change.

A.1 Constant Pool Resolution

When the class is read in, an array constant_pool[] of size nconstants is created and assigned to a field in the class. constant_pool[0] is set to point to a dynamically allocated array which indicates which fields in the constant pool have already been resolved. constant_pool[1] through constant_pool[nconstants - 1] are set to point at the "type" field that corresponds to this constant item.

When an instruction is executed that references the constant pool, an index is generated, and constant_pool[0] is checked to see if the index has already been resolved. If so, the value of constant_pool[index] is returned. If not, the value of constant_pool[index] is resolved to be the actual pointer or data, and overwrites whatever value was already in constant_pool[index].

A.2 Pushing Constants onto the Stack (_quick variants)

ldc1_quick
Push item from constant pool onto stack
Syntax:
    ldc1_quick
    indexbyte1
Stack: ... => ..., item

indexbyte1 is used as an unsigned 8-bit index into the constant pool of the current class. The item at that index is pushed onto the stack.

ldc2_quick
Push item from constant pool onto stack
Syntax:
    ldc2_quick
    indexbyte1
    indexbyte2
Stack: ... => ..., item

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The constant at that index is resolved and the item at that index is pushed onto the stack.

ldc2w_quick
Push long integer or double float from constant pool onto stack
Syntax:
    ldc2w_quick
    indexbyte1
    indexbyte2
Stack: ... => ..., constant-word1, constant-word2

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The constant at that index is pushed onto the stack.
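The self-overwriting behaviour described at the start of this appendix can be illustrated with a small interpreter step for ldc1 and ldc1_quick. This is a minimal sketch, not the implementation: the tuple encoding of instructions, the separate resolved list (standing in for the array that constant_pool[0] points to), and the function names resolve and step are all assumptions made for illustration.

```python
def resolve(pool, resolved, index):
    """Resolve a constant pool entry once; later calls return the stored value."""
    if not resolved[index]:
        kind, payload = pool[index]        # assumed encoding: unresolved entries are (kind, payload)
        if kind != "string":
            raise LookupError(f"cannot resolve constant {index}")  # signal a resolution error
        pool[index] = payload              # overwrite the entry with the actual data
        resolved[index] = True
    return pool[index]


def step(code, pc, pool, resolved, stack):
    """Execute the instruction at pc and return the next pc."""
    opcode, index = code[pc]
    if opcode == "ldc1":
        item = resolve(pool, resolved, index)   # resolve, possibly raising an error
        code[pc] = ("ldc1_quick", index)        # the instruction turns itself into its _quick form
        stack.append(item)                      # then performs its intended operation
    elif opcode == "ldc1_quick":
        stack.append(pool[index])               # assumes resolution has already succeeded
    else:
        raise ValueError(opcode)
    return pc + 1
```

Executing the same code position twice takes the slow path only the first time; the second execution sees the _quick opcode.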

A.3 Managing Arrays (_quick variants)

anewarray_quick
Allocate new array of references to objects
Syntax:
    anewarray_quick
    indexbyte1
    indexbyte2
Stack: ..., size => result

size must be an integer. It represents the number of elements in the new array.

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The entry must be a class.

A new array of the indicated class type and capable of holding size elements is allocated, and result is a reference to this new array. Allocation of an array large enough to contain size items of the given class type is attempted. All elements of the array are initialized to zero.

If size is less than zero, a NegativeArraySizeException is thrown. If there is not enough memory to allocate the array, an OutOfMemoryError is thrown.

multianewarray_quick
Allocate new multi-dimensional array
Syntax:
    multianewarray_quick
    indexbyte1
    indexbyte2
    dimensions
Stack: ..., size1, size2, ... => result

Each size must be an integer. Each represents the number of elements in a dimension of the array.

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The resulting entry must be a class.

dimensions has the following aspects:
• It must be an integer ≥ 1.
• It represents the number of dimensions being created. It must be ≤ the number of dimensions of the array class.
• It represents the number of elements that are popped off the stack. All must be integers greater than or equal to zero. These are used as the sizes of the dimension.

If any of the size arguments on the stack is less than zero, a NegativeArraySizeException is thrown. If there is not enough memory to allocate the array, an OutOfMemoryError is thrown.

The result is a reference to the new array object.

Note: More explanation needed about how this is an array of arrays.

A.4 Manipulating Object Fields (_quick variants)

putfield_quick
Set field in object
Syntax:
    putfield_quick
    offset
    unused
Stack: ..., objectref, value => ...

objectref must be a reference to an object. value must be a value of a type appropriate for the specified field. offset is the offset for the field in that object. value is written at offset into the object. Both objectref and value are popped from the stack.

If objectref is null, a NullPointerException is generated.

putfield2_quick
Set long integer or double float field in object
Syntax:
    putfield2_quick
    offset
    unused
Stack: ..., objectref, value-word1, value-word2 => ...

objectref must be a reference to an object. value must be a value of a type appropriate for the specified field. offset is the offset for the field in that object. value is written at offset into the object. Both objectref and value are popped from the stack.

If objectref is null, a NullPointerException is generated.

getfield_quick
Fetch field from object
Syntax:
    getfield_quick
    offset
    unused
Stack: ..., objectref => ..., value

objectref must be a handle to an object. The value at offset into the object referenced by objectref replaces objectref on the top of the stack.

If objectref is null, a NullPointerException is generated.

getfield2_quick
Fetch field from object
Syntax:
    getfield2_quick
    offset
    unused
Stack: ..., objectref => ..., value-word1, value-word2

objectref must be a handle to an object. The value at offset into the object referenced by objectref replaces objectref on the top of the stack.

If objectref is null, a NullPointerException is generated.

putstatic_quick
Set static field in class
Syntax:
    putstatic_quick
    indexbyte1
    indexbyte2
Stack: ..., value => ...

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The constant pool item will be a field reference to a static field of a class. value must be the type appropriate to that field. That field will be set to have the value value.

putstatic2_quick
Set static field in class
Syntax:
    putstatic2_quick
    indexbyte1
    indexbyte2
Stack: ..., value-word1, value-word2 => ...

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The constant pool item will be a field reference to a static field of a class. That field must either be a long integer or a double precision floating point number. value must be the type appropriate to that field. That field will be set to have the value value.

getstatic_quick
Get static field from class
Syntax:
    getstatic_quick
    indexbyte1
    indexbyte2
Stack: ... => ..., value

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The constant pool item will be a field reference to a static field of a class. The value of that field is pushed onto the stack.

getstatic2_quick
Get static field from class
Syntax:
    getstatic2_quick
    indexbyte1
    indexbyte2
Stack: ... => ..., value-word1, value-word2

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The constant pool item will be a field reference to a static field of a class. The field must be a long integer or a double precision floating point number. The value of that field is pushed onto the stack.

A.5 Method Invocation (_quick variants)

invokevirtual_quick
Invoke instance method, dispatching based on run-time type
Syntax:
    invokevirtual_quick
    offset
    nargs
Stack: ..., objectref, [arg1, [arg2 ...]] => ...

The operand stack must contain objectref, a reference to an object, and nargs-1 arguments. The method block at offset in the object's method table, as determined by the object's dynamic type, is retrieved. The method block indicates the type of method (native, synchronized, etc.).

If the method is marked synchronized the monitor associated with the object is entered.

The base of the local variables array for the new Java stack frame is set to point to objectref on the stack, making objectref and the supplied arguments (arg1, arg2, ...) the first nargs local variables of the new frame. The total number of local variables used by the method is determined, and the execution environment of the new frame is pushed after leaving sufficient room for the locals. The base of the operand stack for this method invocation is set to the first word after the execution environment. Finally, execution continues with the first instruction of the matched method.

If objectref is null, a NullPointerException is thrown. If during the method invocation a stack overflow is detected, a StackOverflowError is thrown.

invokevirtualobject_quick
Invoke instance method of class java.lang.Object, specifically for benefit of arrays
Syntax:
    invokevirtualobject_quick
    offset
    nargs
Stack: ..., objectref, [arg1, [arg2 ...]] => ...

The operand stack must contain objectref, a reference to an object or to an array, and nargs-1 arguments. The method block at offset in java.lang.Object's method table is retrieved.
The method block indicates the type of method (native, synchronized, etc.).

If the method is marked synchronized the monitor associated with handle is entered.

The base of the local variables array for the new Java stack frame is set to point to objectref on the stack, making objectref and the supplied arguments (arg1, arg2, ...) the first nargs local variables of the new frame. The total number of local variables used by the method is determined, and the execution environment of the new frame is pushed after leaving sufficient room for the locals. The base of the operand stack for this method invocation is set to the first word after the execution environment. Finally, execution continues with the first instruction of the matched method.

If objectref is null, a NullPointerException is thrown. If during the method invocation a stack overflow is detected, a StackOverflowError is thrown.

invokenonvirtual_quick
Invoke instance method, dispatching based on compile-time type
Syntax:
    invokenonvirtual_quick
    indexbyte1
    indexbyte2
Stack: ..., objectref, [arg1, [arg2 ...]] => ...

The operand stack must contain objectref, a reference to an object, and some number of arguments. indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The item at that index in the constant pool contains a method slot index and a pointer to a class. The method block at the method slot index in the indicated class is retrieved. The method block indicates the type of method (native, synchronized, etc.) and the number of arguments (nargs) expected on the operand stack.

If the method is marked synchronized the monitor associated with the object is entered.

The base of the local variables array for the new Java stack frame is set to point to objectref on the stack, making objectref and the supplied arguments (arg1, arg2, ...) the first nargs local variables of the new frame. The total number of local variables used by the method is determined, and the execution environment of the new frame is pushed after leaving sufficient room for the locals. The base of the operand stack for this method invocation is set to the first word after the execution environment. Finally, execution continues with the first instruction of the matched method.

If objectref is null, a NullPointerException is thrown.
If during the method invocation a stack overflow is detected, a StackOverflowError is thrown.

invokestatic_quick
Invoke a class (static) method
Syntax:
    invokestatic_quick
    indexbyte1
    indexbyte2
Stack: ..., [arg1, [arg2 ...]] => ...

The operand stack must contain some number of arguments. indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The item at that index in the constant pool contains a method slot index and a pointer to a class. The method block at the method slot index in the indicated class is retrieved. The method block indicates the type of method (native, synchronized, etc.) and the number of arguments (nargs) expected on the operand stack.

If the method is marked synchronized the monitor associated with the method's class is entered.

The base of the local variables array for the new Java stack frame is set to point to the first argument on the stack, making the supplied arguments (arg1, arg2, ...) the first nargs local variables of the new frame. The total number of local variables used by the method is determined, and the execution environment of the new frame is pushed after leaving sufficient room for the locals. The base of the operand stack for this method invocation is set to the first word after the execution environment. Finally, execution continues with the first instruction of the matched method.

If the object handle on the operand stack is null, a NullPointerException is thrown. If during the method invocation a stack overflow is detected, a StackOverflowError is thrown.

invokeinterface_quick
Invoke interface method
Syntax:
    invokeinterface_quick
    idbyte1
    idbyte2
    nargs
    guess
Stack: ..., objectref, [arg1, [arg2 ...]] => ...

The operand stack must contain objectref, a reference to an object, and nargs-1 arguments. idbyte1 and idbyte2 are used to construct an index into the constant pool of the current class. The item at that index in the constant pool contains the complete method signature.
A pointer to the object's method table is retrieved from the object handle.

The method signature is searched for in the object's method table. As a short-cut, the method signature at slot guess is searched first. If that fails, a complete search of the method table is performed. The method signature is guaranteed to exactly match one of the method signatures in the table.

The result of the lookup is a method block. The method block indicates the type of method (native, synchronized, etc.) but the number of available arguments (nargs) is taken from the bytecode.

If the method is marked synchronized the monitor associated with handle is entered.

The base of the local variables array for the new Java stack frame is set to point to handle on the stack, making handle and the supplied arguments (arg1, arg2, ...) the first nargs local variables of the new frame. The total number of local variables used by the method is determined, and the execution environment of the new frame is pushed after leaving sufficient room for the locals. The base of the operand stack for this method invocation is set to the first word after the execution environment. Finally, execution continues with the first instruction of the matched method.

If objectref is null, a NullPointerException is thrown. If during the method invocation a stack overflow is detected, a StackOverflowError is thrown.

guess is the last guess. Each time through, guess is set to the method offset that was used.

A.6 Miscellaneous Object Operations (_quick variants)

new_quick
Create new object
Syntax:
    new_quick
    indexbyte1
    indexbyte2
Stack: ... => ..., objectref

indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The item at that index must be a class. A new instance of that class is then created and objectref, a reference to that object, is pushed on the stack.
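The guess short-cut described for invokeinterface_quick can be sketched as a lookup that tries the remembered slot before scanning the whole method table. This is an illustration only: the representation of the method table as a list of (signature, method block) pairs and the name lookup_interface_method are assumptions, and the final LookupError is purely defensive, since the text guarantees that a match exists.

```python
def lookup_interface_method(method_table, signature, guess):
    """Return (slot, method_block), trying slot `guess` before a full table search."""
    # Short-cut: check the slot that matched last time.
    if 0 <= guess < len(method_table) and method_table[guess][0] == signature:
        return guess, method_table[guess][1]
    # Fallback: complete search of the method table.
    for slot, (sig, block) in enumerate(method_table):
        if sig == signature:
            return slot, block
    raise LookupError(signature)
```

The caller would write the returned slot back into the guess operand, so that the next execution of the same instruction usually succeeds on the first comparison.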

checkcast_quick
Make sure object is of given type
Syntax:
    checkcast_quick
    indexbyte1
    indexbyte2
Stack: ..., objectref => ..., objectref

objectref must be a reference to an object. indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The object at that index of the constant pool must have already been resolved.

checkcast then determines whether objectref can be cast to a reference to an object of class class. A null reference can be cast to any class, and otherwise the superclasses of objectref's type are searched for class. If class is determined to be a superclass of objectref's type, or if objectref is null, it can be cast to class. If objectref cannot be cast to class, a ClassCastException is thrown.

Note: here (and probably in other places) we assume casts don't change the reference; this is implementation dependent.

instanceof_quick
Determine if object is of given type
Syntax:
    instanceof_quick
    indexbyte1
    indexbyte2
Stack: ..., objectref => ..., result

objectref must be a reference to an object. indexbyte1 and indexbyte2 are used to construct an index into the constant pool of the current class. The item of class class at that index of the constant pool must have already been resolved.

instanceof determines whether objectref can be cast to an object of the class class. A null objectref can be cast to any class, and otherwise the superclasses of objectref's type are searched for class. If class is determined to be a superclass of objectref's type, result is 1 (true). Otherwise, result is 0 (false). If handle is null, result is 0 (false).
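The superclass search used by checkcast_quick and instanceof_quick can be sketched as a walk along the superclass chain. The Klass descriptor and the function name below are hypothetical, None stands for a null reference, and treating the object's own class as a match is an assumption that the text does not spell out.

```python
class Klass:
    """Hypothetical class descriptor holding a link to its superclass."""

    def __init__(self, name, superclass=None):
        self.name = name
        self.superclass = superclass


def instanceof_quick(object_class, target):
    """Return 1 if target is object_class or one of its superclasses, else 0."""
    k = object_class            # None models a null objectref and yields 0
    while k is not None:
        if k is target:
            return 1
        k = k.superclass        # continue the search in the superclass
    return 0
```

A checkcast_quick along the same lines would accept a null reference and raise the cast exception whenever this search returns 0 for a non-null reference.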

Index of Instructions

aaload, aastore, aconst_null, aload, aload_<n>, anewarray, anewarray_quick, areturn, arraylength, astore, astore_<n>, athrow
baload, bastore, bipush, breakpoint
caload, castore, checkcast, checkcast_quick
d2f, d2i, d2l, dadd, daload, dastore, dcmpg, dcmpl, dconst_<d>, ddiv, dload, dload_<n>, dmul, dneg, drem, dreturn, dstore, dstore_<n>, dsub, dup, dup_x1, dup_x2, dup2, dup2_x1, dup2_x2
f2d, f2i, f2l, fadd, faload, fastore, fcmpg, fcmpl, fconst_<f>, fdiv, fload, fload_<n>, fmul, fneg, frem, freturn, fstore, fstore_<n>, fsub
getfield, getfield_quick, getfield2_quick, getstatic, getstatic_quick, getstatic2_quick, goto, goto_w
i2d, i2f, i2l, iadd, iaload, iand, iastore, iconst_<n>, iconst_m1, idiv, if_acmpeq, if_acmpne, if_icmpeq, if_icmpge, if_icmpgt, if_icmple, if_icmplt, if_icmpne, ifeq, ifge, ifgt, ifle, iflt, ifne, ifnonnull, ifnull, iinc, iload, iload_<n>, imul, ineg, instanceof, instanceof_quick, int2byte, int2char, int2short, invokeinterface, invokeinterface_quick, invokenonvirtual, invokenonvirtual_quick, invokestatic, invokestatic_quick, invokevirtual, invokevirtual_quick, invokevirtualobject_quick, ior, irem, ireturn, ishl, ishr, istore, istore_<n>, isub, iushr, ixor
jsr, jsr_w
l2d, l2f, l2i, ladd, laload, land, lastore, lcmp, lconst_<l>, ldc1, ldc1_quick, ldc2, ldc2_quick, ldc2w, ldc2w_quick, ldiv, lload, lload_<n>, lmul, lneg, lookupswitch, lor, lrem, lreturn, lshl, lshr, lstore, lstore_<n>, lsub, lushr, lxor
monitorenter, monitorexit, multianewarray, multianewarray_quick
new, new_quick, newarray, nop
pop, pop2, putfield, putfield_quick, putfield2_quick, putstatic, putstatic_quick, putstatic2_quick
ret, ret_w, return
saload, sastore, sipush, swap
tableswitch
wide

Efficient JavaVM Just-in-Time Compilation

Andreas Krall
http://www.complang.tuwien.ac.at/andi/

Abstract

Conventional compilers are designed for producing highly optimized code without paying much attention to compile time. The design goals of Java just-in-time compilers are different: produce fast code at the smallest possible compile time. In this article we present a very fast algorithm for translating JavaVM byte code to high quality machine code for RISC processors. This algorithm combines instructions, does copy elimination and coalescing and does register allocation. It comprises three passes: basic block determination, stack analysis and register preallocation, and final register allocation and machine code generation. This algorithm replaces an older one in the CACAO JavaVM implementation, reducing the compile time by a factor of seven and producing slightly faster machine code. The speedup comes mainly from the following simplifications: fixed assignment of registers at basic block boundaries, a simple register allocator, better exception handling, better memory management and fine tuning of the implementation. The CACAO system is currently faster than every other JavaVM implementation for the Alpha processor and generates machine code for all used methods of the javac compiler and its libraries in 60 milliseconds on an Alpha workstation.

Copyright 1998 IEEE. Published in the Proceedings of PACT'98, 12-18 October 1998 in Paris, France. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE. Contact: Manager, Copyrights and Permissions / IEEE Service Center / 445 Hoes Lane / P.O. Box 1331 / Piscataway, NJ 08855-1331, USA. Telephone: + Intl. 732-562-3966.

1 Introduction

Java's [2] success as a programming language results from its role as an Internet programming language. The basis for this success is the machine-independent distribution format of programs with the Java virtual machine [12]. The standard interpretive implementation of the Java virtual machine makes execution of programs slow. This does not matter if small applications are executed in a browser, but becomes intolerable if big applications are executed. There are two solutions to this problem:
• specialized JavaVM processors,
• compilation of byte code to the native code of a standard processor.

SUN took both paths and is developing both Java processors and native code compilers. In our CACAO system we chose to go for native code compilation since it is more portable and gives more opportunities for improving the execution speed. Compiling to native code can be done in two different ways: compilation of the complete program in advance or compilation on demand of only the methods which are executed (just-in-time compiler, JIT). The CACAO system [10] uses a JIT compiler and is freely available via the world wide web.

1.1 Previous Work

The idea of machine independent program representations is quite old and goes back to the year 1960 [14]. An intermediate language UNCOL (UNiversal Computer Oriented Language) was proposed for use in compilers to reduce the development effort of compiling many different languages to many different architectures. The design of the JavaVM has been strongly influenced by P code, the abstract machine used by many Pascal implementations [13]. P code is well known from its use in the UCSD Pascal system. There have even been efforts to develop microprocessors which execute P code directly.

The Amsterdam compiler kit [16] [15] uses a stack oriented intermediate language.
This language has been designed for fast compilers which emit efficient code. The intermediate representation of the Gardens Point compiler project is also based on a stack machine called Dcode [8]. Dcode was influenced by Pascal P code. Both Dcode interpreters and code generators for different architectures exist.

The problems of compiling a stack oriented abstract machine code to native code are well known from the programming language Forth. In his thesis [5] and in [7] Ertl describes RAFTS, a Forth system that generates native code at run time. Translating the stack operations to native code is done by translating the operations back to expressions represented as directed acyclic graphs as an intermediate step. In [6] he translates Forth to native code using C as an intermediate language. In this system the stack slots are translated to local variables of a function. Optimization and code generation are performed by the C compiler.

The first implementations of JIT compilers became available last year for the browsers from Netscape and Microsoft on PCs. They were followed by Symantec's development environment. Recently SUN released a JIT compiler for the Sparc and PowerPC processors. Silicon Graphics developed a JIT compiler for the MIPS processor and recently Digital released a JIT for the Alpha processor.

A public domain JIT compiler for several architectures is the kaffe system developed by Tim Wilkinson (http://www.kaffe.org). For all the above mentioned systems, no publicly available description of the compilation techniques exists.

The translation scheme of the Caffeine system is described in [9]. It supports both a simple translation scheme which emulates the stack architecture and a more sophisticated one which eliminates the stack completely and uses registers instead. Caffeine is not intended as a JIT compiler. It compiles a complete program in advance. DAISY (Dynamically Architected Instruction Set from Yorktown) is a VLIW architecture developed at IBM for fast execution of PowerPC, S/390 and JavaVM code. Compatibility with different old architectures is achieved by using a JIT compilation technique. The JIT compilation scheme for the JavaVM is described in [4].

Adl-Tabatabai and others [1] describe a fast and effective code generation system for a JIT compiler. This compiler does optimizations like bound check elimination, common subexpression elimination and two kinds of register allocation, a simple one and a global priority based one.
The results show that for most benchmark programs the complex register allocator and the subexpression eliminator incur too much overhead, which does not pay off at run time.

2 Translation of stack code to register code

The JavaVM is a typed stack architecture [12]. There are different instructions for integer, long integer, floating point and address types. The main instruction set consists of arithmetic/logical and load/store/constant instructions. All these instructions either work directly on the stack or move values between the stack and local variables. There are special instructions for array access and for accessing the fields of objects (memory access), for method invocation, and for type checking.

The architecture of a RISC processor is completely different from the stack architecture of the JavaVM. RISC processors have large sets of registers. They execute arithmetic and logic operations only on values which are held in registers. Load and store instructions are provided to move data between memory and registers. Local variables of methods usually reside in registers and are saved in memory only during a method call or if there are too few registers.

2.1 Machine code translation examples

The example expression a = b - c * d would be translated by an optimizing C compiler to the following two Alpha instructions (the variables a, b, c and d reside in registers):

    MULL c,d,tmp0   ; tmp0 = c * d
    SUBL b,tmp0,a   ; a = b - tmp0

If JavaVM code is translated to machine code, the stack is eliminated and the stack slots are represented by temporary variables usually residing in registers. A naive translation of the previous example would result in the following Alpha instructions:

    iload b   --> MOVE b,t0
    iload c   --> MOVE c,t1
    iload d   --> MOVE d,t2
    imul      --> MULL t1,t2,t1
    isub      --> SUBL t0,t1,t0
    istore a  --> MOVE t0,a

The problems of translating JavaVM code to machine code are primarily the elimination of the unnecessary copy instructions and finding an efficient register allocation algorithm. A common but expensive technique is to do the naive translation and use an additional pass for copy elimination and coalescing.

2.2 The old translation scheme

The old CACAO compiler did the translation to machine code in four steps. First, basic blocks were determined. Then, the JavaVM code was translated into a register oriented intermediate representation, the registers were allocated, and finally machine code was generated. The intermediate representation was oriented towards a RISC architecture target and assumed that all operands reside in registers (assuming an unlimited number of pseudo registers). The intermediate instructions contained a MOVE instruction for register moves, OP1, OP2 and OP3 instructions for the arithmetic/logical operations, a MEM instruction for accessing the fields of objects, BRA instructions and special instructions for method invocation (METHOD). Two special instructions (ACTIVATE and DROP) maintained live range information for the register allocator.

The second pass of the compiler translates each JavaVM load or store instruction into a corresponding intermediate code MOVE instruction, using a new register as the destination register in the case of a load. Always using a new register yields code in a form similar to static single assignment form [3], which is commonly used for compiler optimizations. A JavaVM iadd instruction is translated into an OP2 instruction, again using a new destination register.

This naive translation scheme would generate many MOVE instructions. Therefore MOVE instructions are generated lazily. The translator keeps lists which track which registers should contain the same values (that is, registers which are just copies of another register). Instead of generating a MOVE instruction, the translator enters the register into a copy list. If the translator should later generate a DROP instruction, it deletes the register from the list.

When the register lists did not match at a control flow join, the corresponding MOVE instruction had to be generated. But for most joins the stack, and therefore the register lists, are empty or else the registers are compatible. Furthermore the register allocator tries to assign the same hardware register to the same stack slots so that MOVE instructions can be eliminated.

2.3 Old register allocation

For a just-in-time compiler, expensive register allocation algorithms like graph coloring cannot be used. We therefore designed a simple and fast scheme. There are two different sets of registers: registers for stack slots and registers for local variables. First, registers for stack slots are assigned. Afterwards, the remaining registers are assigned to the local variables which are active in the whole method. All registers are assigned to a CPU register at the beginning of a basic block.
An existing allocation is left unchanged.

The allocator scans the instructions and, for each instruction which activates a register to which no CPU register has been assigned, a new CPU register is selected. If the allocator has run out of CPU registers, the register is spilled to memory. There exist some conventions for the assignment of registers when calling methods. To prevent unnecessary copy instructions at a method call, pseudo registers which are method parameters or return values are assigned the correct register prior to the allocation pass (precoloring).

2.4 Problems of the old scheme

The old compiler used a lot of doubly linked lists and allocated every object explicitly. So a large amount of memory was used and a large percentage of the compile time was spent in object allocation. It had to do four passes over the code and there were examples in our applications where the compiler took up to fifty percent of the total run time. So we searched for improvements and designed a new translation algorithm.

3 The new translation algorithm

The new translation algorithm can get by with three passes. The first pass determines basic blocks and builds a representation of the JavaVM instructions which is faster to decode. The second pass analyses the stack and generates a static stack structure. During stack analysis variable dependencies are tracked and register requirements are computed. In the final pass register allocation of temporary registers is combined with machine code generation.

The new compiler computes the exact number of objects needed or computes an upper bound and allocates the memory for the necessary temporary data structures in three big blocks (the basic block array, the instruction array and the stack array).
Eliminating all the doubly linked lists also reduced the memory requirements by a factor of five.

3.1 Basic block determination

The first pass scans the JavaVM instructions, determines the basic blocks and generates an array of instructions which has fixed size and is easier to decode in the following passes. Each instruction contains the opcode, two operands and a pointer to the static stack structure after the instruction (see the next sections). The different opcodes of JavaVM instructions which fold operands into the opcode are represented by just one opcode in the instruction array.

3.2 Basic block interfacing convention

The handling of control flow joins was quite complicated in the old compiler. We therefore introduced a fixed interface at basic block boundaries. Every stack slot at a basic block boundary is assigned a fixed interface register. The stack analysis pass determines the type of the register and whether it has to be saved across method invocations. To enlarge the size of basic blocks, method invocations do not end basic blocks. To guide our compiler design we did some static analysis on a large application written in Java: the javac compiler and the libraries it uses. Table 1 shows that in more than 93% of the cases the stack is empty at basic block boundaries and that the maximal stack depth is 6. From this data it becomes clear that the old join handling did not improve the quality of the machine code.

3.3 Copy elimination

To eliminate unnecessary copies, the loading of values is delayed until the instruction is reached which consumes the
value. To compute the information the run time stack is simulated at compile time. Instead of values, the compile time stack contains the type of the value, whether a local variable was loaded to a stack location, and similar information. Adl-Tabatabai [1] used a dynamic stack which is changed at every instruction. A dynamic stack only gives the possibility to move information from earlier instructions to later instructions. We use a static stack structure which enables information flow in both directions.

stack depth      0    1    2    3   4  5  6  >6
occurrences   7930  258  136  112  36  8  3   0
Table 1. distribution of stack depth at block boundary

Fig. 1 shows our instruction and stack representation. An instruction has a reference to the stack before the instruction and the stack after the instruction. The stack is represented as a linked list. The two stacks can be seen as the source and destination operands of an instruction. In the implementation only the destination stack is stored; the source stack is the destination stack of the previous instruction.

Figure 1. instruction and stack representation

This representation can easily be used for copy elimination. Each stack element not only contains the type of the stack slot but also the local variable number of which it is a copy, the argument number if it is an argument, and the interface register number if it is an interface. Load (push the content of a variable onto the stack) and store instructions do not generate a copy machine instruction if the stack slot contains the same local variable. Generated machine instructions for arithmetic operations directly use the local variables as their operands.

There are some pitfalls with this scheme. Take the example of fig. 2. The stack bottom contains the local variable a. The instruction istore a will write a new value for a and will make a later use of this variable invalid. To avoid this we have to copy the local variable to a stack variable. An important decision is at which position the copy instruction should be inserted. Since there is a high number of dup instructions in Java programs (around 4%) and it is possible that a local variable resides in memory, the copy should be done with the load instruction. Since the stack is represented as a linked list, only the destination stack has to be checked for occurrences of the offending variable, and these occurrences are replaced by a stack variable.

Figure 2. anti dependence (instruction shown: istore a)

To answer the question of how often this could happen and how expensive the stack search is, we again analyzed the javac compiler. In more than 98% of the cases the stack is empty (see table 2). In only 0.2% of the cases the stack depth is higher than 1, and the biggest stack depth is 3.

stack depth      0   1  2  3  >3
occurrences   2167  31  1  3   0
Table 2. distribution of store stack depth

To avoid copy instructions when executing a store it is necessary to connect the creation of a value with the store which consumes it. In that case a store can conflict not only with copies of a local variable which result from load instructions before the creator of the value, but also with load and store instructions which exist between the creation of the value and the store. In fig. 3 the iload a instruction conflicts with the istore a instruction.

The anti dependences are detected by checking the stack locations of the previous instructions for conflicts. Since the stack locations are allocated as one big array, just the stack elements which have a higher index than the current stack element have to be checked. Table 3 gives the distribution of the distance between the creation of the value and the corresponding store.
In 86% of the cases the distance is one.

chain length   1 2 3 4 5 6 7 8 9 >9
occurrences    1892 62 23 1 62 30 11 41 9 7 65
Table 3. distribution of creator-store distances

Figure 3. anti dependence (instructions shown: iadd, iload a, istore b, istore a)

The output dependences are checked by storing the instruction number of the last store in each local variable. If a store conflicts due to dependences, the creator places the value in a stack register. Additional dependences arise because of exceptions. The exception mechanism in Java is precise. Therefore store instructions are not allowed to be executed before an exception raising instruction. This is checked easily by remembering the last instruction which could raise an exception. In methods which contain no exception handler this conflict can be safely ignored because no exception handler can have access to these variables.

3.4 Register allocation

Expensive register allocation algorithms are neither suitable nor necessary. The javac compiler does a coloring of the local variables and assigns the same number to variables which are not active at the same time. The stack variables have their live ranges implicitly encoded. When a value is pushed, the live range starts. When a value is popped, the live range ends.

Complications arise only with stack manipulation instructions like dup and swap. We therefore flag the first creation of a stack variable and mark a duplicated one as a copy. The register used for this variable can be reused only after the last copy is popped.

During stack analysis, stack variables which have to survive a method invocation are marked. These stack variables and local variables are assigned callee saved registers. If there are not enough registers available, these variables are allocated in memory.

Efficient implementation of method invocation is crucial to the performance of Java. Therefore, we preallocate the argument registers and the return value in a similar way as we handle store instructions.
Input arguments (in Java input arguments are the first variables) for leaf procedures (and input arguments for processors with register windows) are preassigned, too.

3.5 Instruction combining

Together with stack analysis we combine constant loading instructions with selected instructions which follow immediately. The class of combinable instructions contains add, subtract, multiply and divide instructions, logical and shift instructions, and compare/branch instructions. During code generation the constant is checked to see whether it lies in the range for immediate operands of the target architecture, and appropriate code is generated.

The old translator expanded some complex instructions into multiple instructions to avoid complex instructions in the later passes. One such expansion was that of the lookup instruction into a series of load constant and compare and branch instructions. Since the constants are usually quite small, this unnecessarily increased the size of the intermediate representation and the final code. The new compiler delays the expansion into multiple instructions to the code generation pass, which reduces the size of all representations and speeds up the compilation.

3.6 Example

Fig. 4 shows the intermediate representation and stack information as produced by the compiler for debugging purposes. The Local Table gives the types and register assignment for the local variables. The Java compiler reuses the same local variable slot for different local variables if their live ranges do not overlap. In this example the variable slot 3 is even used for local variables of different types (integer and address). The JIT compiler assigned the saved register 12 to this variable.

One interface register is used in this example, entering the basic block with label L004. At the entry of the basic block the interface register has to be copied to the argument register A00.
This is one of the rare cases where a more sophisticated coalescing algorithm could have allocated an argument register for the interface.

The combining of a constant with an arithmetic instruction happens at instructions 2 and 3. Since the instructions are allocated in an array, the empty slot has to be filled with a NOP instruction. The ADDCONSTANT instruction already has the local variable L02 as destination, information which comes from the later ISTORE at number 4. Similarly the INVOKESTATIC at number 31 has marked all its operands as arguments. In this example all copies (besides the one to the interface register) have been eliminated.
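
The copy elimination of section 3.3 can be illustrated with a small compile-time stack simulation. The following Python sketch is not CACAO's implementation, which uses a static linked stack structure; the instruction tuples, the `analyse` function and the `T<n>` names for stack temporaries are assumptions made for illustration. It treats a load as a lazy copy, lets arithmetic use locals directly, and saves a local to a temporary only when a store would overwrite a value still referenced on the stack (the anti dependence of fig. 2). It does not attempt the coupling of a value's creator with its store.

```python
def analyse(instructions):
    """Simulate the operand stack at compile time.

    A stack slot named 'L<n>' is only a copy of local n (no move emitted yet);
    a slot named 'T<n>' is a stack temporary holding a computed value.
    Returns the pseudo machine operations that would be emitted.
    """
    stack = []      # compile-time stack, top of stack is the last element
    emitted = []    # pseudo machine code
    temps = 0

    def new_temp():
        nonlocal temps
        temps += 1
        return f"T{temps}"

    for op, arg in instructions:
        if op == "ILOAD":
            # Lazy load: remember that the slot is a copy of the local.
            stack.append(f"L{arg}")
        elif op == "IADD":
            right, left = stack.pop(), stack.pop()
            dest = new_temp()
            # Locals are used directly as operands, no prior copy needed.
            emitted.append(f"ADD {left}, {right} -> {dest}")
            stack.append(dest)
        elif op == "ISTORE":
            value = stack.pop()
            local = f"L{arg}"
            # Anti dependence: slots that still refer to the old value of
            # the local are redirected to a temporary before the overwrite.
            saved = None
            for i, slot in enumerate(stack):
                if slot == local:
                    if saved is None:
                        saved = new_temp()
                        emitted.append(f"MOVE {local} -> {saved}")
                    stack[i] = saved
            if value != local:   # storing a local into itself needs no code
                emitted.append(f"MOVE {value} -> {local}")
        else:
            raise ValueError(f"unsupported opcode {op}")
    return emitted


# iload 1; iload 2; iadd; istore 3: no copies for the two loads
print(analyse([("ILOAD", 1), ("ILOAD", 2), ("IADD", None), ("ISTORE", 3)]))
# iload 0; iload 1; istore 0: local 0 is saved before it is overwritten
print(analyse([("ILOAD", 0), ("ILOAD", 1), ("ISTORE", 0)]))
```

In the second call the old value of local 0 is still on the stack when the store executes, so one extra move to a temporary is emitted; in the first call no copy is needed for either load.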

                            sieve  JavaLex   javac  espresso   Toba  java_cup
run time on 21164A 600MHz (in seconds)
CACAO old  total            1.120    0.720   1.336     0.858  1.208     0.398
           load             0.040    0.067   0.224     0.141  0.068     0.077
           compile          0.022    0.116   0.343     0.235  0.139     0.196
           run              1.058    0.537   0.769     0.481  1.000     0.125
CACAO new  total            0.902    0.522   0.925     0.614  0.982     0.218
           load             0.040    0.067   0.223     0.141  0.068     0.077
           compile          0.004    0.018   0.060     0.050  0.019     0.026
           run              0.858    0.437   0.642     0.423  0.895     0.115
speedup
speedup total old/new        1.24     1.38    1.44      1.40   1.23      1.82
speedup compile old/new      7.33     6.44    5.62      4.70   7.31      7.53
number of compiled JavaVM instructions
                             2514    13412   34759     27281  14430     17489
number of cycles per compiled JavaVM instruction
                              955      805    1035      1099    790       891
Table 4. comparison between old and new compiler

3.7 Complexity of the algorithm

The complexity of the algorithm is mostly linear with respect to the number of instructions and the number of local variables plus the number of stack slots. There are only a small number of spots where it is not linear.

• At the beginning of a basic block the stack has to be copied to separate the stacks of different basic blocks. Table 1 shows that the stack depth at the boundary of a basic block is in most cases zero. Therefore, this copying does not influence the linear performance of the algorithm.
• A store has to check for a later use of the same variable. Table 2 shows that this is not a problem either.
• A store additionally has to check for the previous use of the same variable between creation of the value and the store.
  The distances between the creation and the use are small (in most cases only 1) as shown by table 3.

When compiling javac, 29% of the compile time is spent in parsing and basic block determination, 18% in stack analysis, 16% in register allocation and 37% in machine code generation.

4 Results

To evaluate the differences between the old and the new compiler we used six different programs: sieve is a Java implementation of the well known prime number generation program, JavaLex is a scanner generator, javac is the Java compiler from Sun, espresso is a compiler for an enhanced Java dialect, Toba is a system which translates Java class files to C, and java_cup is a parser generator. As input data for javac, espresso and Toba we used all source files of Toba (18 files).

Table 4 shows the total run time, the load time, the compile time and the run time for the old and the new system on an Alpha workstation with a 600 MHz 21164A processor. The new compiler is between 5 and 7 times faster than the old compiler. The new system also has some improvements in the code generation and uses a hardware null pointer check ([11]). Both improvements together speed up the new system between 23% and 82%. On average only 800 to 1100 cycles are needed to compile one JavaVM instruction. A profiler which assumes that all memory accesses go to the first level cache computed 423 cycles per compiled JavaVM instruction.

To evaluate the performance of CACAO we compared it with Sun's JDK and with kaffe version 0.8 (see section 1.1). We also got access to a beta version of Digital's JIT compiler. Due to problems with the monitor implementation ([11]) this compiler gives very bad results for javac and similar programs (three times slower than the JDK), but produced efficient code for the sieve benchmark.

Table 5 gives the run times for all these systems on an Alpha workstation with a 300 MHz 21064A processor. The CACAO system is between 3 and 5 times faster than the kaffe system and twice as fast as the Digital JIT compiler.
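
The cycles-per-instruction row of Table 4 appears to follow from the new compiler's compile time, the 600 MHz clock and the number of compiled JavaVM instructions. The paper does not state the formula, so the relation used in this short Python check (cycles = compile time × clock frequency / instruction count) is an assumption; the numbers are copied from Table 4 and the names `TABLE4` and `cycles_per_instruction` are illustrative.

```python
CLOCK_HZ = 600e6  # 600 MHz 21164A, as stated for Table 4

# benchmark: (new compile time in seconds, compiled JavaVM instructions,
#             cycles per instruction as reported in Table 4)
TABLE4 = {
    "sieve":    (0.004,  2514,  955),
    "JavaLex":  (0.018, 13412,  805),
    "javac":    (0.060, 34759, 1035),
    "espresso": (0.050, 27281, 1099),
    "Toba":     (0.019, 14430,  790),
    "java_cup": (0.026, 17489,  891),
}


def cycles_per_instruction(compile_seconds, instructions, clock_hz=CLOCK_HZ):
    """Assumed relation: total compile cycles divided by instruction count."""
    return compile_seconds * clock_hz / instructions


for name, (seconds, count, reported) in TABLE4.items():
    computed = cycles_per_instruction(seconds, count)
    print(f"{name:9s} computed {computed:7.1f}  reported {reported}")
```

Under this assumption the computed values agree with the reported row to within about one cycle for all six programs.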

                         sieve  JavaLex  javac  espresso  Toba  java_cup
run time on 21064A 300MHz (in seconds)
JDK                       83.2     29.8   18.5       8.7  32.1       3.5
Digital JIT               6.27     84.4   47.6      14.1     -       9.8
kaffe                     9.14      9.9   17.8      12.5     -      2.98
CACAO old                 4.80     2.65   4.74      3.17  4.58      1.52
CACAO new                 3.87     1.92   3.29      2.26  3.72      0.83
speedup with respect to interpreter
speedup JDK/DEC-JIT       13.3     0.35   0.38      0.48     -      0.36
speedup JDK/kaffe         9.10     3.01   1.04       0.7     -      1.17
speedup JDK/CACAO old     17.3    11.24   3.90      2.74  7.01      2.30
speedup JDK/CACAO new     21.5    15.52   5.62      3.85  8.62      4.22
Table 5. comparison between JDK, Digital JIT, kaffe and CACAO

5 Conclusion and further work

We presented an efficient algorithm for translating JavaVM code to efficient native code for RISC processors. This new algorithm is about seven times faster than the compiler used before. CACAO executes Java programs up to 5 times faster than other JIT compilers. CACAO can be obtained via the world wide web at http://www.complang.tuwien.ac.at/java/cacao/ .

Currently additional code generators for the Sparc, MIPS and PowerPC processors are being developed. We are working to integrate bound check removal, instruction scheduling and method inlining.

Acknowledgement

We express our thanks to Manfred Brockhaus, David Gregg and Anton Ertl for their comments on earlier drafts of this paper.

References

[1] A.-R. Adl-Tabatabai, M. Cierniak, G.-Y. Lueh, V. M. Parikh, and J. M. Stichnoth. Fast, effective code generation in a just-in-time Java compiler. In Conference on Programming Language Design and Implementation, volume 33(6) of SIGPLAN, page to appear. Montreal, 1998. ACM.
[2] K. Arnold and J. Gosling. The Java Programming Language. Addison-Wesley, 1996.
[3] R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck. Efficiently computing static single assignment form and the control flow graph. ACM Transactions on Programming Languages and Systems, 13(4):451-490, October 1991.
[4] K. Ebcioglu, E. Altman, and E. Hokenek. A Java ILP machine based on fast dynamic compilation. In MASCOTS'97 - International Workshop on Security and Efficiency Aspects of Java, 1997.
[5] M. A. Ertl. Implementation of Stack-Based Languages on Register Machines. PhD thesis, Technische Universität Wien, April 1996.
[6] M. A. Ertl and M. Maierhofer. Translating Forth to native C. In EuroForth '95, 1995.
[7] M. A. Ertl and C. Pirker. The structure of a Forth native code compiler. In EuroForth '97 Conference Proceedings, pages 107-116, 1997.
[8] K. J. Gough. Multi-language, multi-target compiler development: Evolution of the Gardens Point compiler project. In H. Mössenböck, editor, JMLC'97 - Joint Modular Languages Conference, Linz, 1997. LNCS 1204.
[9] C.-H. A. Hsieh, J. C. Gyllenhaal, and W. W. Hwu. Java bytecode to native code translation: The Caffeine prototype and preliminary results. In 29th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'29), 1996.
[10] A. Krall and R. Grafl. CACAO - a 64 bit JavaVM just-in-time compiler. Concurrency: Practice and Experience, 9(11):1017-1030, 1997.
[11] A. Krall and M. Probst. Monitors and exceptions: How to implement Java efficiently. In S. Hassanzadeh and K. Schauser, editors, ACM 1998 Workshop on Java for High-Performance Computing, pages 15-24, Palo Alto, March 1998. ACM.
[12] T. Lindholm and F. Yellin. The Java Virtual Machine Specification. Addison-Wesley, 1996.
[13] S. Pemberton and M. C. Daniels. Pascal Implementation, The P4 Compiler. Ellis Horwood, 1982.
[14] T. B. Steel. A first version of UNCOL. In Proceedings of the Western Joint IRE-AIEE-ACM Computer Conference, pages 371-377, 1961.
[15] A. S. Tanenbaum, M. F. Kaashoek, K. G. Langendoen, and C. J. H. Jacobs. The design of very fast portable compilers. ACM SIGPLAN Notices, 24(11):125-131, Nov. 1989.
[16] A. S. Tanenbaum, H. van Staveren, E. G. Keizer, and J. W. Stevenson. A practical tool kit for making portable compilers. Communications of the ACM, 26(9):654-660, September 1983.

java.io.ByteArrayOutputStream.write (int)void

Local Table:
  0: (addr) S15
  1: (int) S14
  2: (int) S13
  3: (int) S12 (addr) S12

Interface Table:
  0: (int) T24

  [ L00]                     0 ALOAD 0
  [ T23]                     1 GETFIELD 16
  [ L02]                     2 IADDCONST 1
  [ L02]                     3 NOP
  [ ]                        4 ISTORE 2
  [ L02]                     5 ILOAD 2
  [ L00 L02]                 6 ALOAD 0
  [ T23 L02]                 7 GETFIELD 8
  [ T23 L02]                 8 ARRAYLENGTH
  [ ]                        9 IF_ICMPLE L005
  [ ]                       18 IF_ICMPLT L003
  [ ]                          L002:
  [ I00]                    19 ILOAD 3
  [ I00]                    20 GOTO L004
  [ ]                          L003:
  [ I00]                    21 ILOAD 2
  [ A00]                       L004:
  [ L03]                    22 BUILTIN1 newarray_byte
  [ ]                       23 ASTORE 3
  [ L00]                    24 ALOAD 0
  [ A00]                    25 GETFIELD 8
  [ A01 A00]                26 ICONST 0
  [ A02 A01 A00]            27 ALOAD 3
  [ A03 A02 A01 A00]        28 ICONST 0
  [ L00 A03 A02 A01 A00]    29 ALOAD 0
  [ A04 A03 A02 A01 A00]    30 GETFIELD 16
  [ ]                       31 INVOKESTATIC java/lang/System.arraycopy
  [ L00]                    32 ALOAD 0
  [ L03 L00]                33 ALOAD 3
  [ ]                       34 PUTFIELD 8
  [ ]                          L005:
  [ ]                       45 RETURN

Figure 4. Example: intermediate instructions and stack contents
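
The constant combining visible at instructions 2 and 3 of Figure 4 can be sketched as a single scan over the fixed-size instruction array. This Python sketch is an illustration under stated assumptions, not the CACAO code: only IADDCONST appears in the figure, so the other combined opcode names in `COMBINABLE` are hypothetical, the instruction tuples are an invented representation, and the scan assumes both instructions lie in the same basic block.

```python
# Opcode that follows the constant -> combined opcode.
# Only IADDCONST is taken from Figure 4; the other names are assumed.
COMBINABLE = {
    "IADD": "IADDCONST",
    "ISUB": "ISUBCONST",
    "IMUL": "IMULCONST",
}


def combine_constants(instructions):
    """Fold an ICONST into the immediately following combinable instruction.

    The instruction array keeps its size, so the freed slot becomes a NOP.
    """
    result = list(instructions)
    for i in range(len(result) - 1):
        op, arg = result[i]
        next_op, _ = result[i + 1]
        if op == "ICONST" and next_op in COMBINABLE:
            result[i] = (COMBINABLE[next_op], arg)   # constant becomes the operand
            result[i + 1] = ("NOP", None)            # fill the empty slot
    return result


# Presumed bytecode behind instructions 0 to 4 of Figure 4
code = [("ALOAD", 0), ("GETFIELD", 16), ("ICONST", 1), ("IADD", None), ("ISTORE", 2)]
for instruction in combine_constants(code):
    print(instruction)
```

Whether the constant actually fits the immediate operand range of the target is, as section 3.5 describes, checked later during code generation and is left out here.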

Technical Overview of the Common Language Runtime

Erik Meijer
Microsoft
Redmond WA
emeijer@microsoft.com

John Gough
QUT
Brisbane, Australia
j.gough@qut.edu.au

Abstract

The functionality of the recently announced Microsoft .NET system is founded on the capabilities of the Common Language Infrastructure (CLI). Unlike some other recent systems based on virtual machines, the CLI was designed from the start to support a wide range of programming languages. It is also expected that ECMA standardization will make the CLI available on a wide range of computing platforms. This combination of multi-language capability and multi-platform implementation makes the CLI an important target for future language compilers.

In this paper, the technical details of the CLI are briefly described. To motivate some of the discussion a comparison is made with the Java™ virtual machine (JVM). The JVM was designed under rather different constraints, making it a much more difficult target for languages other than Java™. We also briefly discuss the issues involved in mapping various language constructs to the primitives of the CLI.

1 Introduction

The ideas of virtual machines, intermediate languages and language independent execution platforms have fascinated language researchers for a long time. Well known examples include UNCOL [6], UCSD P-code [23], ANDF [20], AS-400 [25], hardware emulators such as VMWare, Transmeta Crusoe™ [30], binary translation [26], the JVM [19], and most recently Microsoft's Common Language Infrastructure (CLI) [2].

There are several reasons why people are looking at alternative implementation paths for native compilers:

Portability: By using an intermediate language, you need only n + m translators instead of n * m translators to implement n languages on m platforms.

Compactness: Intermediate code is often much more compact than the original source.
This was an important property back in the days when memory was a limited resource, and has recently regained importance in the context of dynamically downloaded code.

Efficiency: By delaying the commitment to a specific native platform as much as possible, the execution platform can make optimal use of the knowledge of the underlying machine, or even adapt to the dynamic behavior of the program.

Security: High-level intermediate code is more amenable to deployment and runtime enforcement of security and typing constraints than low level binaries.

Interoperability: By sharing a common type system and high-level execution environment (that provides services such as a common garbage collected heap, threading, security, etc.), interoperability between different languages becomes easier than binary interoperability. Easy interoperability is a prerequisite for multi-language library design and software component reuse.

Flexibility: Combining high level intermediate code with metadata enables the construction of (typesafe) metaprogramming concepts such as reflection, dynamic code generation, serialization, type browsing etc.

Attracted by the high-level runtime support and the wide availability of the JVM, and the rich set of libraries on the Java™ platform, quite a number of language implementers have recently turned to the JVM as the execution environment for their language [29, 7].

The JVM is a great target for Java™, but even though the JVM designers hope to attract implementers of other languages [19, Chapter 1.2], we will argue that the JVM is essentially a suboptimal multi-language platform.

For a start, the JVM provides no way of encoding type-unsafe features of typical programming languages, such as pointers, immediate descriptors (tagged pointers), and unsafe type conversions. Furthermore, in many cases the JVM lacks the primitives to implement language features that are not found in Java™ but are present in other languages. Examples of such features include unboxed structures and unions (records and variant records), reference parameters, varargs, multiple return values, function pointers, overflow sensitive arithmetic, lexical closures, tail calls, fully dynamic dispatch, generics, structural type equivalence etc. [17, 18, 14, 9, 12, 11, 24].

The CLI has been designed from the ground up as a target for multiple languages, and explicitly addresses many of the issues mentioned above that are needed to efficiently compile a wide variety of languages. To ensure this, from early on in the development process of the CLI, Microsoft has worked closely with a large number of language implementers (both commercial and academic; for an up to date list see www.gotdotnet.com). For instance, the tail call instruction was added as a direct result of feedback from language researchers; tail calls are a necessary condition for efficiency in many declarative languages that use recursion as their sole way of expressing repetition.

It would be unfair to state that the CLI, as it is now, is already the perfect multi-language platform.
It currently has good support for imperative (COBOL, C, Pascal, Fortran) and statically typed OO languages (such as Eiffel, Oberon, Component Pascal). Microsoft continues to work with language implementers and researchers to improve support for languages in non-standard paradigms [16].

In the remainder of this paper, we give a quick overview of the architecture, instruction set and type system of the CLI and point out specific points where we think the CLI is a better multi-language execution environment than the JVM. The treatment is necessarily brief. For a more detailed and tutorial overview of the CLI, see the recent book [10].

2 Architecture of the Common Language Infrastructure (CLI)

The CLI manages multiple concurrent threads of control (which are not necessarily native OS threads). A thread can be viewed as a singly linked list of activation records [13, 3], where an activation record is created and linked back to the current record by a method call instruction, and removed when the method call completes (either by a normal return, a tail call, or by an exception). It is usual, but not necessary, that the activation records of a single thread are allocated on a runtime stack. However, since the management of activation records is abstracted away in the CLI, and to avoid confusion, we shall use the term "stack" here exclusively to refer to the evaluation stack of the virtual machine.

An instruction pointer (IP): points to the next CLI instruction to be executed by the CLI in the present method.

An evaluation stack: contains intermediate values of the computation performed by the executing method (the operand stack in JVM terminology).

A (zero-based) array of local variables: A local variable may hold any data type.
However, a particular variable must be used in a type-consistent way (in the JVM, a local variable can contain an integer at one point in time and a float at another).

A (zero-based) array of incoming arguments: Unlike the JVM, the argument array and the local variable array are not the same.

A methodInfo handle: contains information about the method, such as its signature, the types of its local variables, and data about its exception handlers.

A local memory pool: The CLI includes instructions for dynamic allocation of objects from the local memory pool (e.g. [3, Chapter 7.3, page 408]).

A return state handle: used to restore the method state on return from the current method. This corresponds to what in conventional compiler terminology would be the dynamic link.

A security descriptor: used by the CLI security system to record security overrides (assert, permit-only, and deny). This descriptor is not directly accessible to managed code. Although extremely important and interesting, the security mechanism of the CLI is outside the scope of this paper.
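
The per-method state listed above can be pictured as a record with one field per item. The following Python sketch is only a mental model under assumed names (`MethodState`, `call`, `ret` and all field names are hypothetical); it says nothing about how an actual CLI implementation lays out activation records.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class MethodState:
    """Illustrative model of one activation record (all names are assumed)."""
    method_info: Any                  # signature, local types, exception handler data
    arguments: List[Any]              # zero-based incoming argument array
    local_vars: List[Any]             # zero-based local variable array, separate from arguments
    ip: int = 0                       # index of the next instruction in this method
    eval_stack: List[Any] = field(default_factory=list)   # intermediate values
    local_pool: List[Any] = field(default_factory=list)   # local memory pool
    return_state: Optional["MethodState"] = None          # dynamic link to the caller
    security: Any = None              # security descriptor, hidden from managed code


def call(caller: Optional[MethodState], method_info: Any,
         args: List[Any], n_locals: int) -> MethodState:
    """A method call creates a record and links it back to the current one."""
    return MethodState(method_info, list(args), [None] * n_locals,
                       return_state=caller)


def ret(state: MethodState) -> Optional[MethodState]:
    """Completing the call removes the record and resumes the caller."""
    return state.return_state


main = call(None, "Main", [], n_locals=2)
swap = call(main, "Swap", ["&x", "&y"], n_locals=1)
print(ret(swap) is main)
```

The chain of `return_state` links is the singly linked list of activation records that section 2 describes for a thread.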

In contrast to the JVM where all storage locations (local variables, stack slots, arguments) are 4 bytes wide, storage locations in the CLI are polymorphic, in the sense that they might be 4 bytes (such as a 32 bit integer) or hundreds of bytes (such as a user-defined value type), but their type is fixed for the lifetime of the frame.

3 Assemblies

Every execution environment has a notion of "software component" [28]. An assembly is a set of files (modules) containing Common Intermediate Language (CIL) code and metadata, that serves as the primary unit of a software component in the CLI. Security, versioning, type resolution, and processes (application domains) all work on a per assembly basis. In JVM terms an assembly could roughly be compared to a JAR file.

An assembly manifest describes information about the assembly itself, such as its version, which files make up the assembly, which types are exported from this assembly, and optionally a digital signature and public key of the manifest itself. Here is an example manifest for an assembly using ILASM syntax [2]:

    .assembly HelloWorld {}
    .assembly extern mscorlib {
      .publickeytoken = (B7 7A 5C 56 19 34 E0 89)
      .ver 1:0:2411:0
    }

Inside an assembly or module we can define reference types (such as classes, interfaces, arrays, delegates; see section 7), value types (such as structs, enums; see section 6), and nested types. In contrast to the JVM, the CLI allows top-level methods and fields. All these declarations are included in the assembly's metadata. A unique feature of the CLI is that its metadata is user extensible via the notion of custom attributes.

For a more detailed and tutorial overview of the role of assemblies as software components, see [21].

4 Type System

In this section we give an informal overview of the CLI type system; a more formal introduction is given by Gordon and Syme [8].

In addition to user defined types (section 6 and section 7), the CLI supports the following set of primitive types:

- object, shorthand for System.Object; string, shorthand for System.String; void, void return type.
- bool, 8-bit 2's complement signed value; char, 16-bit Unicode character.
- int8, unsigned int8, int16, unsigned int16, int32, unsigned int32, int64, unsigned int64, unsigned and 2's complement signed integers of respective width; native int, unsigned native int, machine dependent unsigned and 2's complement signed value.
- float32, float64, IEEE-754 floating point value of respective width; native float, machine dependent floating point number (not user visible).
- typed reference, an opaque descriptor of a pair of a pointer and a type, used for type safe varargs.

Primitive types can be combined into composite types using the following set of type constructors:

- valuetype typeref, class typeref, reference to value or reference type.
- type pinned, prevents the object at which a local variable points from being moved by the GC. This is outside the scope of this paper.
- type [bounds], (multi-dimensional) array. This is outside the scope of this paper; suffice it to note that in contrast to the JVM, the CLI does support true multi-dimensional arrays.
- method callConv type* (parameters), function pointer. This is outside the scope of this paper.

- type&, managed pointer to type.
- type*, unmanaged pointer to type.

The natural-size, or generic, types (the primitive types native int, unsigned native int, object, and the two type constructors &, *) are a mechanism in the CLI for deferring the choice of a value's size. The CLI maps each to the natural size for a specific processor at JIT- or run-time. For example, a native int would map to int32 on a Pentium processor, but to int64 on an IA64 processor.

The object type represents an object reference that is managed by the CLI. A managed pointer & is similar to the object type, but points to the interior of an object. Managed pointers are not interchangeable with object references. Unmanaged pointers * or native int are the traditional pointers of other runtime systems, that is, the addresses of data. Unmanaged pointers are an essential element for the interoperation of CLI programs with native code components. Such pointers may not point into the managed heap, since such heap values are under the control of a garbage collector that is free to move and compact objects. Conversely, values of managed pointer type may safely point outside the managed heap, since the garbage collector knows the heap limits.

Natural sized types offer a significant advantage over the JVM, which prematurely commits all storage locations to be 32 bits wide. This implies for example that values of type long or double occupy two locations, which makes things unnecessarily hard for compiler writers.

A more important weakness of the JVM as a target for multiple languages is the fact that its type system lumps together all pointers into one reference type, closing the door for languages or compilers that do need a more fine-grained level of detail.
We will expand on the usefulness of the CLI pointer types in more detail in section 9.

5 Base Instruction set

The CLI has about 220 instructions, so obviously we do not have space to cover all of them in this paper. Instead we will highlight a few representative instructions from each group below¹.

When comparing to JVM instructions, you will notice a difference in how types are handled. In the JVM most instructions have the types of their arguments hard-coded in the instruction (which makes it easier to interpret JVM byte code, but puts a burden on every compiler that generates JVM byte codes). The CLI instruction set is much more polymorphic and usually only requires explicit type information for the result of an instruction (which makes it easier for compilers to generate CIL code, but requires more work from the JIT).

5.1 Constants, arguments, local variables, and pointers

The CLI provides a number of instructions for transferring values to and from the evaluation stack. Instructions that push values on the evaluation stack are called "loads", and instructions that pop elements from the stack into local variables are called "stores".

The simplest load instruction is ldc.t v, which pushes the value v of type T² on the evaluation stack. The ldnull instruction pushes a null reference (of type object) on the stack.

The ldarg n instruction pushes the contents of the n-th argument on the evaluation stack. The ldarga n instruction pushes the address (as a managed pointer of type T&) of the n-th argument on the evaluation stack. The starg n instruction pops a value from the stack and stores it in the n-th argument. In each case, the JIT knows the type of the value from the signature of the method.

The ldloc n instruction pushes the contents of the n-th local variable onto the evaluation stack, and ldloca n pushes the address of the n-th local variable on the evaluation stack as a managed pointer. The stloc n instruction pops a value from the stack and stores it in the n-th local variable.
Again, the JIT can figure out the types of these values from the context.Many of the CLI instruction also have short forms, that allow more compact representation in certain special cases. We will not discussthese variants here2 Here T E { int 3 2, int 64, f loat 32, float 64} and t is the short form of T. The short form of types is used in all instructions thathave a type index.

The ldind.t instruction expects an address (which can be a native int, or an unmanaged or managed pointer) on the stack, dereferences that pointer and puts the value on the stack. The stind.t v instruction stores a value v of type T at the address found at the top of the stack. In both cases, the type t is needed because the JIT cannot always infer what the type of the resulting value is.

The other load and store instructions include ldfld, ldsfld, stfld, stsfld, and ldflda and ldsflda to manipulate instance and static fields, and a similar family of instructions for arrays.

Example: reference arguments. The ability to load the address of local variables, and to dereference pointers to indirectly get the value they point at, allows compiler writers to efficiently implement languages that support passing arguments by reference. For example, here is the CIL version of the Swap function that swaps the values of two variables:

    .method static void Swap(int32& xa, int32& ya) {
      .maxstack 2
      .locals (int32 z)
      ldarg xa; ldind.i4; stloc z
      ldarg xa; ldarg ya; ldind.i4
      stind.i4; ldarg ya; ldloc z
      stind.i4; ret // return
    }

To call this function (see section 8), we just pass the addresses of the local variables as arguments to function Swap:

    .locals (int32 x, int32 y)
    // initialize x and y
    ldloca x
    ldloca y
    call void Swap(int32&, int32&)

In the JVM there is a separate load (and store) instruction for each type, i.e. iload_n pushes the integer content of the n-th local variable on the stack, and similarly for aload_n (reference), dload_n (double, so it will be moved as two 32-bit values), fload_n (float), and lload_n (long; again, two items will be moved).

The JVM does not allow compilers to take the address of local variables, hence it is impossible to implement byref arguments directly. Instead compiler writers have to resort to tricks such as passing one-element arrays, or introducing explicit box classes (the JVM does not support boxing and unboxing either).
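The one-element-array workaround can be sketched in Java as follows. This is only an illustration of the idiom, not code from the paper; the class name ByRefSwap and all variable names are invented for the example.

```java
// Sketch: emulating by-reference arguments on the JVM with one-element arrays.
// ByRefSwap and its member names are hypothetical.
public class ByRefSwap {
    // Each "reference" is a one-element array; the callee reads and writes element 0.
    static void swap(int[] xa, int[] ya) {
        int z = xa[0];
        xa[0] = ya[0];
        ya[0] = z;
    }

    public static void main(String[] args) {
        int x = 1, y = 2;
        // The caller must copy each local into a fresh array before the call ...
        int[] xRef = { x };
        int[] yRef = { y };
        swap(xRef, yRef);
        // ... and copy the results back afterwards.
        x = xRef[0];
        y = yRef[0];
        System.out.println(x + " " + y); // prints "2 1"
    }
}
```

Compared with the CIL version, every call site pays for two array allocations and two copy-backs, which is the overhead the text attributes to the missing address-of operation.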
Gough [12] gives a detailed overview of the intricate design space of implementing reference arguments on the JVM.

5.2 Arithmetic

The add instruction adds the two topmost values on the stack together (and similarly for other arithmetic instructions). Overflow is not normally detected for integral operations unless you specify .ovf (signed) or .ovf.un (unsigned); floating-point overflow returns +∞ or −∞.

The JVM never indicates overflow during operations on integer data types, which means that the time penalty may be significant for procedures which perform intensive arithmetic in languages (such as Ada95 [1] or SML [22]) that require overflow detection. A minor issue in this context is that there is a separate add instruction for each type (and similarly for other arithmetic instructions), just as is the case for load and store.

5.3 Simple control flow

The CLI supports the usual variety of (conditional) branch instructions (such as br, beq, bge etc.). There is no analog of the JVM "jump subroutine" instruction. Also, the CLI does not limit the length of branches to 64K as the JVM does (which might not be a big deal for humans programming in Java, but it is a real problem for compilers generating JVM byte code).
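Returning to the overflow discussion in section 5.2: because the JVM add instructions wrap silently, a compiler for a language with checked arithmetic has to emit an explicit test after each operation. The Java sketch below shows one common form of that test for 32-bit addition; the class and method names are hypothetical. The standard library method Math.addExact (Java 8 and later) performs the same check.

```java
// Sketch: the explicit overflow test a JVM-targeting compiler has to generate
// for a checked 32-bit add. CheckedAdd and addChecked are hypothetical names.
public class CheckedAdd {
    static int addChecked(int a, int b) {
        int r = a + b; // wraps around silently on overflow
        // Overflow happened iff both operands have a sign different from the result.
        if (((a ^ r) & (b ^ r)) < 0) {
            throw new ArithmeticException("integer overflow");
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println(addChecked(1, 2)); // prints 3
        try {
            addChecked(Integer.MAX_VALUE, 1);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```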

6 Value Types

A value type is similar to a struct in C or record in Pascal, i.e. a sequence of named fields of various types. In contrast to reference types, which are always allocated on the GC heap, value types are allocated "in place". In the CLI, value types can also contain (static, virtual, or instance) methods [2], the details of which are outside the scope of this paper.

6.1 Structures

Here is the definition of a simple Point structure that contains two fields x and y (which the CLI may store in any order):

    .class value Point {
      .field public int x
      .field public int y
    }

6.2 Unions

The CLI also supports sequential and explicit layout control of fields. The latter is needed to implement C-style union types (or variant records in Pascal), a structure where the fields may overlap. For example, the following value class defines a union that may hold either a float or an int:

    .class value explicit FloatOrInt {
      .field [0] public float32 f
      .field [0] public int32 n
    }

6.3 Enums

Besides structures, there is another kind of value type, enumerations, which correspond to C-style enums. Enumerations provide a type-safe way to associate names with integer values. For example, the following enum defines a new value type Shape with two constants RECTANGLE and CIRCLE:

    .class enum Shape {
      .field public static valuetype Shape RECTANGLE = int32(0)
      .field public static valuetype Shape CIRCLE = int32(1)
    }

The CLI also allows you to specify enum details such as the internal storage type, or to indicate that the enumeration is a collection of bits; for more details see [2].

6.4 Initializing valuetypes

Except for boxing and the .locals directive, the CLI does not have special mechanisms or instructions to explicitly allocate memory for a valuetype. The initobj T instruction expects the address of a valuetype T on the stack, and initializes all the fields of the valuetype to either null or a 0 of the appropriate primitive type (this is a nice example of a polytypic instruction). For example, to initialize the Point struct that we introduced in section 6.1, we would load the address of the local variable p of type Point on the stack and call initobj Point:

    .locals (valuetype Point p)
    ldloca p
    initobj Point

It should be obvious that having value types is essential for compiling Pascal or C-like languages that have enums, record and union types. Compiling such languages to the JVM is inefficient to start with, as you need to represent enums and structs by classes and unions by class hierarchies [4, Chapter 5]. A much more serious consequence is that it is impossible to support the full semantics of such languages, as it is impossible to implement the common (type-unsafe) trick where you store a float in a FloatOrInt union type, and read it as an int:

    .locals (valuetype FloatOrInt fi, int32 n)
    // fi.f = 3.14
    ldloca fi
    ldc.r4 3.14
    stfld float32 FloatOrInt::f
    // n = fi.n
    ldloca fi
    ldfld int32 FloatOrInt::n

7 Reference types

The CLI supports types such as classes, interfaces, arrays, and delegates. Because of lack of space, we will restrict our attention to classes. Classes can contain methods and fields; but yet again, to support as many languages as possible, besides virtual and static methods (as in Eiffel and Java), the CLI also supports instance methods (as in C++).

For example, here are two classes Foo and Bar that both define a virtual method f and an instance method g:

    .class public Foo {
      .method public virtual void f() {...}
      .method public instance void g() {...}
      .method public static void h() {...}
      .method public specialname void .ctor() {...}
    }

    .class public Bar extends Foo {
      .method public virtual void f() {...}
      .method public instance void g() {...}
      .method public static void h() {...}
      .method public specialname void .ctor() {...}
    }

Constructors are always named .ctor and have to be marked as specialname.

7.1 Instantiating Reference types

The newobj c instruction allocates a new instance of the class associated with constructor c and initializes all the fields in the new instance.
It then calls the constructor with the given arguments along with the newly created instance.

For example, we can create an instance f with static type Foo of our class Foo, and an instance b with static type Foo of our class Bar using the following instruction sequence:

    .locals (class Foo f, class Foo b)
    newobj void Foo::.ctor(); stloc f
    newobj void Bar::.ctor(); stloc b

To create an instance of a class c in the JVM, you always have to use the sequence new c; dup; invokespecial c.<init>()V (and similarly for using a constructor that takes arguments), and the Java verifier must do a complex dataflow analysis to ensure that no object is used before it is properly initialized or that it is initialized more than once [19, Chapter 4.9.4]. It seems much simpler to avoid all the complexity to start with and just do allocation and initialization in a single instruction.

8 Invoking methods

The CLI has two call instructions for directly invoking methods and interfaces. A third call instruction, calli, allows indirect calls on a function pointer, but this is outside the scope of this paper.

The call m instruction is normally used to call a static method m (i.e. it is comparable to the invokestatic instruction in the JVM). For example, to call method Foo::h(), we just write:

    call void Foo::h()

It is legal to call a virtual or instance method using call instance (rather than callvirt), in which case method lookup is done statically; in other words, you will get an early bound call (i.e. the effect is comparable to an invokespecial on the JVM). Assuming that bar is a local variable that contains an instance of class Bar, the following call would actually execute method Foo::f():

    ldloc bar;
    call instance void Foo::f()

The instance calling convention indicates that Foo::f() expects an additional "this" parameter.

The callvirt m instruction makes a late bound call to a virtual method m; in other words, the actual method that is invoked depends on the dynamic type of the "this" parameter (the JVM has two separate instructions, invokevirtual and invokeinterface, for this purpose, which once again makes life harder for compiler writers).
So in the example below, the method that will be invoked is Bar::f(), since the this parameter passed to the call has static type class Foo, but dynamic type class Bar:

    ldloc bar;
    callvirt void Foo::f()

For instance methods, callvirt will still result in an early bound call.

8.1 Tailcalls

Some people find it hard to believe, but there are programming languages where recursion is the only way of expressing repetition (examples include Haskell, Scheme, Mercury). For these languages, it is essential that the underlying execution environment supports tailcalls. The tail. prefix instructs the JIT compiler to discard the caller's stack frame prior to making the call, which means that the following method will indeed loop forever instead of throwing a stack overflow exception:

    .method public static void Bottom() {
      .maxstack 8
      tail. call void Bottom(); ret
    }

If the call is from untrusted code to trusted code, the frame cannot be fully discarded for security reasons.

Since the JVM does not support tailcalls, compiler writers are forced to use tricks like trampolines to artificially force the JVM to discard stack frames [5, 27, 15, 18].
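To make the trampoline trick concrete, here is a minimal Java sketch. It is an illustration under assumptions, not the scheme used by any of the cited compilers: the names Trampoline, Step, done, more and run are all invented. Each tail call returns a thunk instead of calling directly, and a driver loop runs the thunks, so the stack depth stays constant.

```java
import java.util.function.Supplier;

// Sketch: a trampoline that runs tail-recursive code in constant stack space.
// All names here are hypothetical.
public class Trampoline {
    // A step is either a finished result or a thunk that yields the next step.
    interface Step<T> {
        boolean isDone();
        T result();
        Step<T> next();
    }

    static <T> Step<T> done(final T value) {
        return new Step<T>() {
            public boolean isDone() { return true; }
            public T result() { return value; }
            public Step<T> next() { throw new IllegalStateException("already done"); }
        };
    }

    static <T> Step<T> more(final Supplier<Step<T>> thunk) {
        return new Step<T>() {
            public boolean isDone() { return false; }
            public T result() { throw new IllegalStateException("not done yet"); }
            public Step<T> next() { return thunk.get(); }
        };
    }

    // The driver loop: each iteration returns to this frame before the next call.
    static <T> T run(Step<T> step) {
        while (!step.isDone()) {
            step = step.next();
        }
        return step.result();
    }

    // Tail-recursive sum of 1..n, written so the "tail call" is returned as a thunk.
    static Step<Long> sum(long n, long acc) {
        if (n == 0) {
            return done(acc);
        }
        return more(() -> sum(n - 1, acc + n));
    }

    public static void main(String[] args) {
        // A direct recursive version would risk a StackOverflowError at this depth.
        System.out.println(run(sum(1_000_000L, 0L))); // prints 500000500000
    }
}
```

The price is one heap-allocated thunk per call, which is the kind of overhead a native tail. prefix avoids.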

9 Interaction between value and reference types

If you have both valuetypes and reference types, programmers will want to use valuetypes in contexts where reference types are required (for instance to store a Point in a collection). The same problem occurs in dynamic languages like Scheme and statically typed polymorphic functional languages like Haskell and SML, where polymorphic functions expect a uniform argument representation.

To support these scenarios, it is essential to have efficient support from the execution environment to move between the worlds of value and reference types. Having to create an instance of a class every time you want to pass a valuetype as a reference type has too much performance overhead. Moreover, this would also force you to define a new class for every valuetype, or introduce many unnecessary casts.

The CLR provides built-in support for boxing and unboxing. A valuetype T can be turned into reference type object using the box T instruction, and back into a valuetype using the unbox T instruction.

10 Various exotica

As a consequence of its multi-language focus, the CLR provides a number of special facilities that are otherwise difficult to synthesize. The case of tail calls has already been mentioned, but there are others as well.

The manipulation of function pointers as values is critical to the implementation of OO languages with arbitrary mechanisms of method dispatch. Support for virtual dispatch in the case of single implementation inheritance with multiple interface implementation is built in. All other cases must rely on explicitly constructed dispatch tables. The ldftn instruction loads a function pointer on the stack, and the calli instruction invokes the function pointer on the top of the evaluation stack. Another handy instruction for the implementation of multiple inheritance is the jmp instruction. This takes a method reference as an argument, and transfers control to the entry point of the nominated method. The instruction provides the functionality required for constructing "trampoline" stubs that are often the preferred way of performing the "this adjustment" in the dispatch of virtual methods with multiple inheritance.

Languages that pass conformant arrays by value must allocate space for the array copy as part of the procedure call. In this case the use of the localloc instruction expands the current activation record. The use of this instruction is much preferable to the dynamic allocation of space for the copy on the heap, as is necessary on the JVM. This instruction thus provides the semantics of the C alloca function.

The final example that will be mentioned here is the ldtoken instruction. This instruction loads the runtime type handle of the type reference in the instruction argument. This operation is a basic building block in the reflection mechanisms. It is used when the type reference is known, but no instance of the type is conveniently available. The same functionality may be gained on the JVM by use of the Class.forName() function, but in that case the name is bound at runtime.

11 Verified and unverified code

The CLR provides a rich set of primitives for the implementation of both typesafe and non-typesafe features. In cases where memory safety is an important factor, the infrastructure allows for a rich subset of the primitives to be used in ways that allow for verification of safety. As is the case with the JVM, the analysis is necessarily conservative, but provides strong guarantees of freedom from certain classes of runtime errors. Verification may take place either at component deployment time, or at load time. As might be expected, verification is based on analysis of the component, and does not rely on trust of the component producer.

In general terms the guarantees that verification provides are similar to those given by the more strict of contemporary statically typed languages.
The verifier guarantees that locations holding object references can only reference objects of types that fulfill the contracts of the statically declared type, and that field selection can only access fields valid for the known type.

There are some guarantees that cannot be statically verified. In such cases the verifier checks that all usages that cannot be statically checked are protected by runtime tests. For example, it is seldom possible to check that all array indices are within the known bounds of the array, so the verifier must check that all array accesses are protected by a bounds check. A similar principle applies to field or method accesses that depend on the success of a narrowing type cast.

Apart from the obvious type guarantees that verification must provide, there are also a number of checks that depend on well-formedness of the control flow. For example, the evaluation stack must have the same height and type-compatible content along all paths which join at control flow merge points. Furthermore, in the case of object references the statically known bound on the type of an evaluation stack element is the least common ancestor of the set of bounds on the types incident on the merge point.

If a compiler wishes to generate intermediate code that can be verified, certain constraints must be met. Some instructions, such as the block copy instruction, are inherently unverifiable, while operations such as addition are unverifiable when used for address arithmetic. Apart from avoiding certain instructions, it is also necessary to avoid type-unsafe assignments, and some uses of undiscriminated unions.³

Languages that are statically type-safe should always be able to be compiled down to verifiable CIL, although the mapping of data types may require some inventiveness. In unverified contexts, as an example, languages that have value arrays of statically declared size would normally declare a value class of the required runtime size. In this case array elements would be accessed by indexing into the memory blob that represents the value object at runtime, using address arithmetic in the usual way. If the compiler performs its own index bounds checks then such usage will be completely type safe. However, since the verifier does not permit address arithmetic, and cannot recognize all possible explicit bounds checks, an alternative mapping must be found to achieve verifiability.
In this example, the solution is to transparently allocate a reference array of the required size, and use the built-in array support of the CLI. This mechanism, using a reference type to represent a value object, is a common idiom for verifiable code. We call such representation objects reference surrogates. The mapping of the value semantics to such surrogates is treated in detail in [10].

The important point to be emphasized is that most of the innovations of the CLR are preserved in a verified environment. Thus the use of value classes, reference parameters and even type-safe unions is permitted.

For the compiler writer, verification provides an unexpected and welcome bonus. The offline verifier, peverify, detects (and diagnoses) most of the common errors that are made when coding a CIL emitter. During the early stages of testing, submitting output to peverify allows most such errors to be detected. In the case of one of the Project 7 compilers, Gardens Point Component Pascal, the compiler successfully bootstrapped itself on the first attempt, once peverify certified the CIL as being verifiable.

³ Strangely, some unions are tolerable to the verifier, provided that references are not overlapped with other types.

12 Conclusions and future work

In the previous sections we have argued that the CLI is already strictly more powerful than the JVM as a multi-language platform. Microsoft Research and the .NET product group continue to work with language implementors to improve support for a wide variety of language paradigms.

We explicitly solicit language implementors (including those who now target the JVM) to try to target the CLI and provide us with feedback on how we can make the CLI even better than it is today.

Acknowledgements

We would like to thank all Project 7 participants, and Jim Miller, Patrick Dussud, Jim Hogg, Clemens Szyperski, Don Syme, Andrew Kennedy, and Nick Benton, for many discussions on the topics discussed in this paper. Nick's notes on his previous experiences with compiling SMLj to the JVM were especially helpful.

References

1. Ada 95 Reference Manual, 1995. ANSI/ISO/IEC-8652:1995.

2. CLI Partition II: Metadata. http://msdn.microsoft.com/net/ecma/, 2001. ECMA TG3.
3. A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques and Tools. Addison-Wesley, Reading, Mass., 1986.
4. J. Bloch. Effective Java Programming Language Guide. Addison Wesley, 2001.
5. P. Bothner. Kawa - Compiling Dynamic Languages to the Java VM. In USENIX'98 Technical Conference, 1998.
6. M. E. Conway. Proposal for an UNCOL. CACM, 1(10):5-8, 1958.
7. J. Engel. Programming for the Java Virtual Machine. Addison Wesley, 1999.
8. A. Gordon and D. Syme. Typing a Multi-Language Intermediate Code. In Proceedings POPL'01, pages 248-260, 2001.
9. J. Gough. Parameter Passing for the Java Virtual Machine. In Proceedings of the Australasian Computer Science Conference, 1998.
10. J. Gough. Compiling for the .NET Common Language Runtime. Prentice-Hall, Upper Saddle River NJ, 2001.
11. J. Gough. Stacking them up: A Comparison of Virtual Machines. In Proceedings ACSAC-2001, 2001.
12. J. Gough and D. Corney. Evaluating the Java Virtual Machine as a Target for Languages other than Java. In Proceedings Joint Modular Languages Conference, 2000.
13. D. Grune, H. Bal, C. Jacobs, and K. Langendoen. Modern Compiler Design. Wiley, 2001.
14. J. C. Hardwick and J. Sipelstein. Java as an Intermediate Language. Technical Report CMU-CS-96-161, Carnegie Mellon University, August 1996.
15. S. P. Jones, N. Ramsey, and F. Reig. C--: a Portable Assembly Language that Supports Garbage Collection. In International Conference on Principles and Practice of Declarative Programming, 1999.
16. A. Kennedy and D. Syme. Design and Implementation of Generics for the .NET Common Language Runtime. In Proceedings PLDI'01, 2001.
17. A. Krall and J. Vitek. On Extending Java. In H. Mössenböck, editor, Joint Modular Languages Conference (JMLC'97), pages 321-335, Linz, 1997. Springer.
18. C. League, Z. Shao, and V. Trifonov. Representing Java Classes in a Typed Intermediate Language. In International Conference on Functional Programming, pages 183-196, 1999.
19. T. Lindholm and F. Yellin. The Java Virtual Machine Specification (2e). Addison Wesley, 1999.
20. S. Macrakis. From UNCOL to ANDF: Progress in Standard Intermediate Languages. Technical report, Open Software Foundation Research Institute, 1993.
21. E. Meijer and C. Szyperski. What's in a name: .NET as a Component Framework. In 1st OOPSLA Workshop on Language Mechanisms for Programming Software Components, pages 22-28, 2001.
22. R. Milner, M. Tofte, and R. W. Harper. The Definition of Standard ML. MIT Press, 1990.
23. P. A. Nelson. A Comparison of PASCAL Intermediate Languages. ACM SIGPLAN Notices, 14(8):208-213, 1979.
24. M. Odersky and P. Wadler. Pizza into Java: Translating Theory into Practice. In Proceedings of the 24th ACM Symposium on Principles of Programming Languages (POPL'97), Paris, France, pages 146-159. ACM Press, New York (NY), USA, 1997.
25. D. L. Schleicher and R. L. Taylor. System Overview of the Application System/400. IBM Systems Journal, 38(2/3):398-413, 1999.
26. R. Sites, A. Chernoff, M. Kirk, and M. Marks. Binary Translation. CACM, 36(2):69-81, 1993.
27. G. L. Steele Jr. Rabbit: A compiler for Scheme. Technical Report AI-TR-474, MIT Artificial Intelligence Laboratory, 1978.
28. C. Szyperski. Component Software: Beyond Object-Oriented Programming. ACM Press and Addison-Wesley, New York, N.Y., 1998.
29. R. Tolksdorf. Programming Languages for the Java Virtual Machine. http://grunge.cs.tu-berlin.de/~tolk/vmlanguages.html.
30. TRANSMETA. The Technology Behind Crusoe Processors. http://www.transmeta.com/crusoe/download/pdf/crusoetechwp.pdf, 2000.

Syntax-Directed Editors and Tree Machines

Text input is guided by templates:

    PROGRAM ( );
    BEGIN := END.

Requirements for the intermediate code
• bijective mapping between source program and intermediate code
• memory-efficient
• run-time efficient
• backward execution must be possible
• semantics easy to recognize
• simple insertion and deletion of code

Hierarchical intermediate code (abstract syntax tree)

Special features of the intermediate code
• pointer to the next node or back to the parent node
• allows the tree to be processed without a stack
• an offset field gives the position within the parent node
• the predecessor node is reached via the parent node
• pointers can be jump instructions; the field before the jump target then contains the offset to the start of the node

Karel the Robot

A computer game for learning to program
• the robot moves in a rectangular grid of streets
• it carries a bag of beepers
• it can sense, put down, and pick up beepers
• it can move forward and turn left
• it is programmed in a simple programming language
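The stackless traversal claimed above can be sketched in Java. This is a simplified model, not the layout used in the listing further below: instead of an integer offset field, a boolean flag marks the last sibling, whose next pointer leads back to the parent. The names ThreadedTree, Node, node and preorder are invented for the example.

```java
// Sketch: pre-order traversal of a tree without a stack or recursion.
// The last sibling's "next" pointer leads back to the parent node.
// All names are hypothetical; the slides use an integer offset instead of "last".
public class ThreadedTree {
    static final class Node {
        final String label;
        Node child;   // first child, or null for a leaf
        Node next;    // next sibling, or the parent when this is the last sibling
        boolean last; // true when next points back to the parent
        Node(String label) { this.label = label; }
    }

    // Builds a node and threads its children: siblings are chained,
    // and the last child points back to the parent.
    static Node node(String label, Node... children) {
        Node parent = new Node(label);
        Node prev = null;
        for (Node c : children) {
            if (prev == null) parent.child = c; else prev.next = c;
            prev = c;
        }
        if (prev != null) {
            prev.next = parent;
            prev.last = true;
        }
        return parent;
    }

    static String preorder(Node root) {
        StringBuilder out = new StringBuilder();
        Node cur = root;
        while (true) {
            out.append(cur.label).append(' ');
            if (cur.child != null) { // go down first
                cur = cur.child;
                continue;
            }
            // Climb through parent links until a node with a right sibling is found.
            while (cur != root && cur.last) {
                cur = cur.next;
            }
            if (cur == root) break;
            cur = cur.next;          // move to the right sibling
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        Node program = node("program",
            node("define", node("block", node("turnleft"), node("turnleft"), node("turnleft"))),
            node("execution", node("turnoff")));
        System.out.println(preorder(program));
        // prints "program define block turnleft turnleft turnleft execution turnoff"
    }
}
```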

The Robot Programming Language
• no data
• recursive subprograms

Language elements:
• subprogram definition
• block statement
• WHILE loop
• counting loop (ITERATE)
• IF-THEN-ELSE statement
• fixed set of conditions
• subprogram call
• move, turnleft, pickbeeper, putbeeper, turnoff

Node definition

    abstract class Node {
        int offset;
        Node next;
        abstract void print(int level);
        abstract int exec_step();
        abstract int back_step();
    }

Node kinds

    ProgramNode     {DefineNode define; ExecutionNode execution};
    DefineNode      {Name name; Node instruction};
    ExecutionNode   {Node instruction};
    BlockNode       {Node instruction};
    WhileNode       {int test; Node instruction};
    IterateNode     {int count; Node instruction};
    IfThenElseNode  {int test; Node then_stmt; Node else_stmt};
    CallNode        {Name name};
    BasicInstrNode  {int instruction};
    Name            {String name; DefineNode define; Name next};

Implementation of the interpreter
• the program counter consists of a node pointer and an offset
• a stack holds return addresses and ITERATE counters

Backward execution:
• simple instructions are reversible
• the predecessor node is reached via the parent node

Backward execution of loops and branches

Conditions are stored on a stack after the instruction has been executed.

[Figure: forward and backward execution, with the recorded condition values (T/F)]
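The reason conditions have to be recorded can be shown with a small Java sketch. It is a toy model under assumptions, not the interpreter from the listing: the robot lives on a line with a wall at a fixed position, and the only statement is "IF front is clear THEN move ELSE turn". Once the robot has moved, re-evaluating the condition may give a different answer, so the backward step pops the saved outcome instead. All names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: reversible execution of a branch by saving each condition outcome.
// ReversibleIf and its members are hypothetical names.
public class ReversibleIf {
    int position = 0;   // robot position on a line
    int turns = 0;      // number of turns made in front of the wall
    final int wallAt;   // assumed world: a wall at this position
    private final Deque<Boolean> conditions = new ArrayDeque<>();

    ReversibleIf(int wallAt) { this.wallAt = wallAt; }

    // Forward step: IF front_is_clear THEN move ELSE turn.
    void forwardStep() {
        boolean clear = position + 1 < wallAt;
        if (clear) position++; else turns++;
        conditions.push(clear); // saved after the instruction has executed
    }

    // Backward step: undo whichever branch the saved outcome says was taken.
    // Returns false when the start of the execution has been reached.
    boolean backStep() {
        if (conditions.isEmpty()) return false;
        if (conditions.pop()) position--; else turns--;
        return true;
    }

    public static void main(String[] args) {
        ReversibleIf robot = new ReversibleIf(3);
        for (int i = 0; i < 4; i++) robot.forwardStep();
        System.out.println(robot.position + " " + robot.turns); // prints "2 2"
        while (robot.backStep()) { /* rewind to the start */ }
        System.out.println(robot.position + " " + robot.turns); // prints "0 0"
    }
}
```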

Program flow of the editor

    Initialization
    repeat
        Prompt for input
        Read the command
        Analyze and select the command
        Execute the command
        Output the program text
        while a name or number is expected
            Read the name or number
            Output the program text
    until the command equals Quit

Command analysis is table-driven
• indices: state of the program tree, character entered
• content: subprogram with optional parameters

Output of the program text:
• templates are stored in a table (with line information)
• the indentation depth is the depth of the syntax tree
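A table-driven command analysis of this kind might look as follows in Java. The states, command characters and action texts are invented for illustration; the slides do not list the actual table contents.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Sketch: command dispatch through a table indexed by (tree state, input character).
// States, command letters and actions are hypothetical.
public class CommandTable {
    enum State { EMPTY_PROGRAM, AT_STATEMENT }

    private static final Map<String, UnaryOperator<String>> TABLE = new HashMap<>();

    private static String key(State state, char command) {
        return state + ":" + command;
    }

    static {
        // Each entry is a subprogram that takes an optional parameter.
        TABLE.put(key(State.EMPTY_PROGRAM, 'p'), arg -> "insert program node");
        TABLE.put(key(State.AT_STATEMENT, 'w'), arg -> "insert WHILE " + arg);
        TABLE.put(key(State.AT_STATEMENT, 'i'), arg -> "insert ITERATE " + arg);
        TABLE.put(key(State.AT_STATEMENT, 'q'), arg -> "quit");
    }

    // Looks up the action for the current state and command; a missing entry
    // means the command is not allowed at this position in the tree.
    static String dispatch(State state, char command, String arg) {
        UnaryOperator<String> action = TABLE.get(key(state, command));
        return action == null ? "invalid command" : action.apply(arg);
    }

    public static void main(String[] args) {
        System.out.println(dispatch(State.AT_STATEMENT, 'w', "front_is_clear"));
        System.out.println(dispatch(State.EMPTY_PROGRAM, 'w', "front_is_clear"));
    }
}
```

Because the table is keyed by the tree state, the editor can only ever offer commands that keep the program syntactically valid, which is the point of a syntax-directed editor.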

Printed by andi from a0.complang.tuwien.ac.at, Mar 29 1996 16:42, KarelTheRobot.java

    /* Karel the Robot, a computer language learning game after Pattis */
    /* Author: Andreas Krall */
    /* Last Change: 96/03/28 */

    interface Globals {
        // error and return codes
        static final int simple_instr_finished    = -1;  // must be < 0
        static final int no_error                 = 0;   // must be 0
        static final int robot_made_turnoff       = 1;
        static final int double_definition_error  = 2;
        static final int incomplete_program_error = 3;
        static final int blocked_robot_error      = 4;
        static final int end_of_world_error       = 5;
        static final int no_beeper_error          = 6;
        static final int empty_bag_error          = 7;
        static final int missing_turnoff_error    = 8;
        static final int back_end_reached_error   = 9;
        static final int stack_overflow_error     = 10;
        static final int internal_program_error   = 11;

        // test codes and names
        static final int undef_test           = 0;   // must be 0
        static final int front_is_clear       = 1;   // must be previous + 1
        static final int left_is_clear        = 2;
        static final int right_is_clear       = 3;
        static final int next_to_a_beeper     = 4;
        static final int facing_north         = 5;   // north must be east - 1
        static final int facing_east          = 6;   // east must be south - 1
        static final int facing_south         = 7;   // south must be west - 1
        static final int facing_west          = 8;   // west must be south + 1
        static final int any_beepers_in_bag   = 9;
        static final int front_is_blocked     = 10;
        static final int left_is_blocked      = 11;
        static final int right_is_blocked     = 12;
        static final int not_next_to_a_beeper = 13;
        static final int not_facing_north     = 14;
        static final int not_facing_east      = 15;
        static final int not_facing_south     = 16;
        static final int not_facing_west      = 17;
        static final int no_beepers_in_bag    = 18;
        static final String test_names[] = {"",
            "front_is_clear", "left_is_clear", "right_is_clear",
            "next_to_a_beeper", "facing_north", "facing_east",
            "facing_south", "facing_west", "any_beepers_in_bag",
            "front_is_blocked", "left_is_blocked", "right_is_blocked",
            "not_next_to_a_beeper", "not_facing_north", "not_facing_east",
            "not_facing_south", "not_facing_west", "no_beepers_in_bag"};

        // basic instruction codes and names
        static final int undef_instr      = 0;   // must be 0
        static final int move_instr       = 1;   // must be previous + 1
        static final int turnleft_instr   = 2;
        static final int putbeeper_instr  = 3;
        static final int pickbeeper_instr = 4;
        static final int turnoff_instr    = 5;
        static final String instruction_names[] = {"",
            "move", "turnleft", "putbeeper", "pickbeeper", "turnoff"};

        // node codes and node element description codes
        // (all codes from is_undef to is_stmt_list must be less than is_number)
        static final int is_undef          = 0;
        static final int program_node      = 1;
        static final int define_node       = 2;
        static final int execution_node    = 3;
        static final int block_node        = 4;
        static final int while_node        = 5;
        static final int iterate_node      = 6;
        static final int if_then_else_node = 7;
        static final int call_node         = 8;
        static final int basic_instr_node  = 9;
        static final int is_program        = 10;
        static final int is_execution      = 11;
        static final int is_def_list       = 12;
        static final int is_stmt           = 13;
        static final int is_stmt_list      = 14;
        static final int is_number         = 15;
        static final int is_test           = 16;  // must be greater than is_number
        static final int is_name           = 17;  // must be greater than is_number
        static final int is_instr          = 18;  // must be greater than is_number
    }

    final class KarelTheRobot implements Globals {
        static int offset = 0;           // offset of current instruction
        static Node instruction = null;  // current instruction

        // left searches to the next left position
        static void left() {
            Node instr = instruction;
            if (offset == 0) {                       // find previous node in list
                while (instruction.offset == 0)      // find last node in list
                    instruction = instruction.next;
                offset = instruction.offset - 1;     // set parent offset
                instruction = instruction.next;      // set parent node
                if (instruction.get_node_at_pos(offset) == instr)
                    offset--;                        // original node is list head
                else {
                    instruction = instruction.get_node_at_pos(offset);
                    while (instruction.next != instr) // find previous node in list
                        instruction = instruction.next;
                    offset = instruction.length();   // offset of last element
                }
            } else
                offset--;
            if (offset == 0)
                return;
            // go down the tree to the rightmost node of the tree
            while ((instruction.description(offset) < is_number) &&
                   (instruction.get_node_at_pos(offset) != null)) {
                // go down the tree
                instruction = instruction.get_node_at_pos(offset);
                while (instruction.offset == 0)      // find last node in list
                    instruction = instruction.next;
                offset = instruction.length();       // offset of last element
            }
        }

        // right searches to next right position
        static void right() {
            offset++;
            while (offset >= instruction.length()) { // while behind last element
                offset = instruction.offset;         // set parents offset and
                instruction = instruction.next;      // go up to parent node
            }
            if ((offset != 0) &&                     // go down to child node
                (instruction.description(offset) < is_number) &&
                (instruction.get_node_at_pos(offset) != null)) {
                instruction = instruction.get_node_at_pos(offset);
                offset = 0;                          // first position in child node
            }
        }

        // insert_node inserts a node at the current position
        static void insert_node(int node_code) {
            Node node, help;
            node = null;
            switch (node_code) {
            case program_node:
                offset = 0;
                instruction = new ProgramNode();
                return;
            case define_node:       node = new DefineNode();     break;
            case execution_node:    node = new ExecutionNode();  break;
            case block_node:        node = new BlockNode();      break;
            case while_node:        node = new WhileNode();      break;
            case iterate_node:      node = new IterateNode();    break;
            case if_then_else_node: node = new IfThenElseNode(); break;
            case call_node:         node = new CallNode();       break;
            case basic_instr_node:  node = new BasicInstrNode(); break;
            }
            if (instruction.description(offset) == is_undef) {
                node.offset = instruction.offset;            // middle of list
                node.next = instruction.next;
                instruction.offset = 0;
                instruction.next = node;
            } else if ((help = instruction.get_node_at_pos(offset)) == null) {
                instruction.put_object_at_pos(offset, node); // empty
                node.offset = offset + 1;
                node.next = instruction;
            } else {
                instruction.put_object_at_pos(offset, node); // head of list
                node.offset = help.offset;
                node.next = help.next;
            }
            offset = 0;
            instruction = node;
        }

        static public void print(int level, String text) {
            while (--level >= 0)
                System.out.print(" ");  // indent string is illegible in the source listing
            System.out.print(text);
        }

        static public void println(int level, String text) {
            while (--level >= 0)
                System.out.print(" ");  // indent string is illegible in the source listing
            System.out.println(text);
        }

        static public void print_instr(int level, Node instr) {
            if (instr == null)
                KarelTheRobot.println(level + 1, "");
            else
                while (instr != null) {
                    instr.print(level + 1);
                    if (instr.offset != 0)
                        return;
                    instr = instr.next;
                }
        }

        static public void main(String args[])
                throws java.io.IOException {
            ProgramNode program;
            Node n1, n2, n3;
            IfThenElseNode if_stmt;
            int error;
            int ch;

            n1 = new BasicInstrNode(turnleft_instr);
            n1.next = new BasicInstrNode(turnleft_instr);
            n3 = n1.next.next = new BasicInstrNode(turnleft_instr);
            n2 = new BlockNode();
            ((BlockNode) n2).instruction = n1;
            n1.next = n2;
            n3.offset = 2;
            n1 = new DefineNode();
            ((DefineNode) n1).name = Name.enter_name("turnright", (DefineNode) n1);
            ((DefineNode) n1).instruction = n2;
            program = new ProgramNode();
            program.define = (DefineNode) n1;
            n2.offset = 3;
            n1.next = program.execution;
            program.execution.instruction = if_stmt = new IfThenElseNode();
            if_stmt.test = 3;
            if_stmt.next = new BasicInstrNode(turnoff_instr);
            // if_stmt.next.offset = ...;  (value illegible in the source listing)
            if_stmt.next.next = program.execution;
            n1 = if_stmt.then_stmt = new CallNode();
            n1.offset = 3;
            n1.next = if_stmt;
            ((CallNode) n1).definition = program.define.name;
            n1 = if_stmt.else_stmt = new IterateNode();
            n1.offset = 4;
            n1.next = if_stmt;
            ((IterateNode) n1).count = 3;
            program.print(0);
            offset = 0;
            instruction = program;
            BackBuffer.reset();
            DefineNode.reset();
            IterateNode.reset();
            ch = 'c';
            // while ((error = instruction.exec_step()) ...
            // [The source listing breaks off here. The rest of main() and the
            //  beginning of the KarelsWorld class are missing; the text resumes
            //  inside a KarelsWorld method that clamps and stores the robot's
            //  position and direction.]

                x = world.length - 1;
            KarelsWorld.x = x;
            if (y < 0)
                y = 0;
            if (y >= world[0].length)
                y = world[0].length - 1;
            KarelsWorld.y = y;
            KarelsWorld.facing = facing_north;
            if ((facing == facing_north) || (facing == facing_east) ||
                (facing == facing_south) || (facing == facing_west))
                KarelsWorld.facing = facing;
        }

        static void fill_bag(int count) {
            if (count < 0)
                count = 0;
            bag = count;
        }

        private static boolean position_out_of_bounds(int x, int y) {
            if ((x < 0) || (x >= world.length) ||
                (y < 0) || (y >= world[0].length))
                return true;
            return false;
        }

        static void set_north_wall(int x, int y) {
            if (position_out_of_bounds(x, y))
                return;
            world[x][y] |= north_wall;
        }

        static void set_east_wall(int x, int y) {
            // [body missing in the source listing]
        }

        static boolean test(int test) {
            switch (test) {
            case front_is_clear:
                switch (facing) {
                case facing_north:
                    return ((world[x][y] & north_wall) == 0);
                case facing_east:
                    return ((world[x][y] & east_wall) == 0);
                case facing_south:
                    if (y == 0)
                        return false;
                    return ((world[x][y-1] & north_wall) == 0);
                case facing_west:
                    if (x == 0)
                        return false;
                    return ((world[x-1][y] & east_wall) == 0);
                }
                break;
            case left_is_clear:
                switch (facing) {
                case facing_north:
                    if (x == 0)
                        return false;
                    return ((world[x-1][y] & east_wall) == 0);
                case facing_east:
                    return ((world[x][y] & north_wall) == 0);
                case facing_south:
                    return ((world[x][y] & east_wall) == 0);
                case facing_west:
                    if (y == 0)
                        return false;
                    return ((world[x][y-1] & north_wall) == 0);
                }
                break;
            case right_is_clear:
                switch (facing) {
                case facing_north:
                    return ((world[x][y] & east_wall) == 0);
                case facing_east:
                    if (y == 0)
                        return false;
                    return ((world[x][y-1] & north_wall) == 0);
                case facing_south:
                    if (x == 0)
                        return false;
                    return ((world[x-1][y] & east_wall) == 0);
                case facing_west:
                    return ((world[x][y] & north_wall) == 0);
                }
                break;
            case next_to_a_beeper:
                return ((world[x][y] >>> 2) > 0);
            // [The source listing ends here.]

            case facing_north:
                return (facing == facing_north);
            case facing_east:
                return (facing == facing_east);
            case facing_south:
                return (facing == facing_south);
            case facing_west:
                return (facing == facing_west);
            case any_beepers_in_bag:
                return (bag > 0);
            case front_is_blocked:
                return !test(front_is_clear);
            case left_is_blocked:
                return !test(left_is_clear);
            case right_is_blocked:
                return !test(right_is_clear);
            case not_next_to_a_beeper:
                return ((world[x][y] >> 2) == 0);
            case not_facing_north:
                return (facing != facing_north);
            case not_facing_east:
                return (facing != facing_east);
            case not_facing_south:
                return (facing != facing_south);
            case not_facing_west:
                return (facing != facing_west);
            case no_beepers_in_bag:
                return (bag

Printed by andi from a0.complang.tuwien.ac.atMar 29 1996 16:42 KarelTheRobot.java Page 91Name get_name_at_pos(int pos)(return null:Node got_node_at_poe(int pos)(return null;void put_object_at_poslint pos, int val) (Mar 29 1996 16:42 KarelTheRobot.java Page 10KarelTheRobot.println(level. 'END-OF-PROGRAM");int exec_step11if (KerelTheRobot.offset !- 0)return missing_turnoff_orror;if (execution null)return incomplete_program_error;KarelTheRobot.instruction . execution;return 0;void put_object_at_pos(int pos, Name name)void put_object_st_pos(Int pos. Node node) (abstract void print(int level);abstract int exec_stepl);final class ProgramNode extends Node implements Globalestatic final int description()(program_node, is_def_list, is_execution, is_undef, is_undef);DefineNode define;ExecutionNode execution;ProgramNode()define = null;execution - new ExecutionNode();execution.next • this;execution.offset 3;lnt length()return 2;int description(int poetreturn descrlption(pos);Node get_node_atpos(int pos))if (pos I)return define;else if )pos == 2)return execution;return null;void put_object_st_poslint pos, Node node)if )pos 1)define - (DefineNode) node:else if (pos 2)execution - (ExecutionNode) node;void print(int level)KarelTheRobot.println(level, 'BEGINNING-OF-PROGRAM");If (define null)KarelTheRobot.println(level 1, "");elsedefine.print(level I);execution.print(level a 11;final class DefineNode extends Node implements Globe'sstatic final int description()(define_node, is_name, is_stmt. is_undef, is_undef);Name name;Node instruction;static CellNode stack()static int top ■ 0;DefineNode() (name ■ null;instruction . null;static void reset()top . 
0;int length() (return 2;int description(Int pos)return description/poshName get_name_st_pos(int pos)(if (pos 1)return name;return null;new CallNode(10241;void put_object_at_pos(Int pos, Name name) (If )pos 1)this.name = name;Node get_node_at_pos(int pos)(if (pos 2)return instruction;return null;void put_object_at_pos(int pos, Node node)if (pos =. 2)instruction - node;void print(int level)KarelTheRobot.prInt(level, "DEFINE-NEW-INSTRUCTION "1;if (name null)KarelTheRobot.print(0, '');elseKarelTheRobot.java

6f)Printed by andi from a0.complang.tuwien.ac.atMar 29 1996 16:42 KarelTheRobot.java Page 11KarelTheRobot.print(0, name.oet_name(1);KarelTheRobot.println(0, AS'1;instructlon.printIlevel • 1);static int pushICallNode caller) (to••;if (top >- stack.lengthlreturn stack_overflow_error;Astack(top) • caller;Kat elTheRoloot . instruct ion m cal Le...r....eielITITFTTITi7tttif lne instruct ion;return 0;int exec_step() (if (KarelTheRobot.offset 0)return internal_program_error;KarelTheRobot.offset • stack(top).offset;KerelTheRobot.instruction stackltopl.next;top--;return 0;final class Name implements Olobals(private static Name name_listString name;DefineNode define;private Name next;null;NamelString name, DefineNode define)this.name name;thls.define define;this.next ■ name_list;name_list • this;public String get_name))return this.name;public static Name enter_namelString name, DefineNode definition)Name nlist • name_list;while Inlist null)if inlist.name.equalsIneme))if Inlist.deflne null)nlist.define . definition;return fillet;► elsereturn null;nlist • nlist.neKt;return new NameIname, definition);public static DefineNode find_name(String name)Name nlist name_list;while (nlist I. 
null) (if (nlist.name.equals(name))return nlist.define;nlist • nlist.next;new NameIname, null);return null;Mar 29 1996 16:42 KarelTheRobot.java Page 12final class ExecutionNode extends Node implements Globalsstatic final int description()(execution_node,Node instruction;ExecutionNode()instruction • null;int description(int poe)return description(pos);Node get_node_at_pos(int pos)(if (pos 1)return instruction;return null;void put_object_at_pos(int pos, Node node)if (pos 1)instruction node;is_undef, is_undef, is_undef);void print(int level)KarelTheRobot.println(level, 'BEGINNING-OF-EXECUTION');KarelTheRobot.print_inetrIlevel, instruction);KarelTheRobot.println(level, 'END-OF-EXECUTION•);int exec_otep()if (KarelTheRobot.offset .• 0) (if (instruction null)return incomplete_program_error;KarelTheRobot.instruction instruction;) elseKarelTheRobot.offeet 0;KarelTheRobot.inetruction . Instruction;return 0;final class DlockNode extends Node implements Globalsstatic final int description() .(block_node,is_undef, is_undef, is_undef);Node instruction;BlockNode)) (instruction . null;int description(int pos)return description(pos);Node get_node_st_pos(lnt pos)(if (poe •. 1)return instruction;return null;KarelTheRobot.java 6

Printed by andi from a0.complang.tuwien.ac.atMar 29 1996 16:42 KarelTheRobot.java Page 13void put_object_at_pos(int pos. Node node)if (pos 1)instruction . node;void printlint level) (KarelTheRobot.println(level, 'BEGIN');KarelTheRobot.print_instrIlevel, instruction);KarelTheRobot.println(level. 'END");int exec_stepilif (KarelTheRobot.offset 0)if (instruction null)return incomplete_program_error;KerelTheRobot.instruction • instruction;) elseKarelTheRobot.offset 0;KarelTheRobot.instruction instruction;return 0;Mar 29 1996 16:42 KarelTheRobot.java Page 14KarelTheRobot.println(level, 'MULE • • teat_names(test1 • ' DO');KarelTheRobot.print_instr(level, instruction);int exec_step()if (test 0)return incomplete_program_error;if (KarelTheRobot.offnet 0)SackBuffer push(true);elseBackBuffer.push(felse);KarelTheRobot.offset 0;if (KarelsWorld.test(test111if (instruction null)return incomplete_program_error;KarelTheRobot.instruction • instruction;elseKarelTheRobot.offaet . offset;KarelTheRobot.instruction next;return 0;final class whileNode extends Node implements Globale (static final int description()(while_node, is_test, is_stmt, is_undef, is_undef);int test;Node instruction;whileNode()test undef_test;instruction . null;int length()return 2;int description(int pos)return description(pos);int get_int_at_pos(int post(it (pos 1)return test;return 0;void put_object_st_pos(int pos, int val) (if (pos 1)test • val;Node get_node_at_pos(int pos)(it (pos 2)return instruction;return null;void put_object_at_pos)int pos. 
Node node) (if (pos 21instruction • node;void print(int level)final class IterateNode extends Node implements Clobalsstatic final int description()(iterate_node, is_number, is_stmt, is_undef, is_undef);int count;Node instruction;static int stack() ■ new int(1024);static int top ■ 0;IterateNode()count 0;instruction • null;static void reset()top 0;int length()return 2;int description(int pos)return descrlption(pos);int get_int_at_pos(int post(if (pos 11return count;return 0;void put_object_at_pos(int pos, int val) (if (pos ... 1)count val;Node get_node_at_pos(int pos)(if (pos 2)return instruction;return null;)KarelTheRobot.java 7

Printed by andi from a0.complang.tuwien.ac.atMar 29 1996 16:42 KarelTheRobotejava Page 15void put_object_at_poslint pos. Node node)if (pos 2)instruction . node;void print lint level) (if (count .= 01KarelTheRobot.println(level, *ITERATE TIMES');elseKarelTheRobot.println(level, 'ITERATE ' count • ' TIMES");KarelTheRobot.print_instr(level, instruction);int exec_step() (it (KarelTheRobot.offset 0)if ((count 0) 11 (instruction null))return incomplete_program_error;top++;if (top >- stack.length)return stack_overflow_error;stack(topl ■ count;KarelTheRobot.instruction • instruction;else (►if 1--stack(topl > 0)KarelTheRobot.offset 0;1KarelTheRobot.inatruction ■) elsetop--;KarelTheRobot.offset ■ offset;KarelTheRobot.instruction ■ next;return 0;instruction;class IfThenElseNode extends Node implements Globale (static final int description() .(if_then_else_node, is_test, is_stmt, is_stmt, is_undef);int test;Node then_stmt;Node else_stmt;IfThenElseNodel)test . undef_test;then_stmt - null;else_stmt . null;int length()return 3;int description(int pos) (return description(pos);int get_int_at_pos(int pos)(if (pos 1)return test;return 0;void put_object_at_pos(int pos, int val)if (pos 1)Mar 29 1996 16:42 KarelTheRobotjava Page 16test . val;Node get_node_at_pos(int pos)(if (pos 2)return then_stmt;else if (pos 3)return elee_stmt;return null;void put_object_at_pos(int pos. 
Node node) (if (pos 2)then_stmt ■ node;else if (poselse_stmt3)node;void print(int level)KarelTheRobot.println(level, 'IF + test_names(test));KarelTheRobot.println(level, 'THEW):KarelTheRobot.print_instr(level, then_stmt);KarelTheRobot.println(level, 'ELSE');KarelTheRobot.print_instr(level, else_stmt);int exec_step()if (test 0)return incomplete_program_error;if (KarelTheRobot.offset 0)if (KarelsWorld.test(teet))if (then_stmt null)return incomplete_program_error;KarelTheRobot.inntruction • then_stmt;) elseif (else_stmt null)return incomplete_program_error;KarelTheRobot.instruction ■ else_stmt;1) elseif (KarelTheRobot.offset .. 4)BackBuffer.push(true);elseDackBuffer.push(false);KarelTheRobot.offset . offset;KarelTheRobot.inetruction • next;return 0;final class CallNode extends Node implements Globalestatic final int description() ■(call_node, is_name, is_undef, is_undef, is_undef);Name definition;CallNode()definition . null;int description(int pos)return description(pos);Name get_name_at_pos(int pos)(if (pos .. 1)KarelTheRobot.java 8

            return definition;
        return null;
    }

    void put_object_at_pos(int pos, Name name) {
        if (pos == 1)
            this.definition = name;
    }

    void print(int level) {
        KarelTheRobot.println(level, definition.get_name());
    }

    int exec_step() {
        if (KarelTheRobot.offset != 0)
            return internal_program_error;
        if ((definition == null) || (definition.define.instruction == null))
            return incomplete_program_error;
        // push() records the caller and transfers control to the definition body
        return definition.define.push(this);
    }
}

final class BasicInstrNode extends Node implements Globals {
    static final int description[] =
        {basic_instr_node, is_instr, is_undef, is_undef, is_undef};
    int instruction;

    BasicInstrNode() {
        instruction = undef_instr;
    }

    BasicInstrNode(int instr) {
        instruction = instr;
    }

    int description(int pos) {
        return description[pos];
    }

    int get_int_at_pos(int pos) {
        if (pos == 1)
            return instruction;
        return 0;
    }

    void put_object_at_pos(int pos, int val) {
        if (pos == 1)
            instruction = val;
    }

    void print(int level) {
        KarelTheRobot.println(level, instruction_names[instruction]);
    }

    int exec_step() {
        int retval;

        // execute the basic instruction, then continue with the successor node
        retval = KarelsWorld.exec_instr(instruction);
        KarelTheRobot.offset = offset;
        KarelTheRobot.instruction = next;
        return retval;
    }
}

Syntax-Directed Editors, Interpreters, and Compilers

Mentor: project under way since 1975; structure editor; Mentol = editor definition language; software development environment; INRIA, France
Gandalf (ALOE): A Language Oriented Editor; editor generator; software development environment; CMU
COPS: Cornell Program Synthesizer; editor generator with semantic checking (attribute grammar); interpreter and debugger for the syntax tree; Cornell Univ.
PSG: Programming System Generator; TH Darmstadt

power but retain the disciplined viewpoint of the rest of the system.

A final example illustrates an awkwardness arising not from the structural constraints of the Synthesizer, but from the textual constraints of a language whose concrete syntax was defined to be unambiguous for parsers. Inserting the template

    IF ( condition )
      THEN statement

into

    IF ( condition )
      THEN statement
      ELSE PUT LIST ( 'whose else am i?' );

leads to an inconsistency between the explicitly derived structure (an IF-THEN within an IF-THEN-ELSE) and the structure implied by the parser-oriented concrete syntax (an IF-THEN-ELSE within an IF-THEN). Although tempted to adopt the derived interpretation (because prettyprinting easily distinguishes one interpretation from the other), we elected, instead, to maintain compatibility with PL/I. Therefore, we prevent such an insertion and require that the user provide a compound statement explicitly.

There are many possible alternative designs, among them the following four:
a) the compound statement could be inserted automatically when necessary;
b) a compound statement could be displayed automatically when necessary;
c) the IF-THEN-ELSE template could be defined as

    IF ( condition )
      THEN DO; {statement} END;
      ELSE DO; {statement} END;

d) the IF-THEN template could be eliminated, thereby requiring that every conditional statement have an ELSE-clause. In this final case, the display of an empty ELSE clause could be suppressed unless necessary for disambiguation.

VI. Implementation

A. File Trees

Synthesizer files are represented internally as executable derivation trees. Each template or phrase is represented in this tree by a separate node. The pointers connecting nodes are, in fact, goto instructions for the interpreter; the null pointer is a halt instruction. Nodes are variable length; each is composed of three sections:

    | extension | code | continuation |

The extension identifies the node type and contains any other information needed to generate the display of the node but not necessary to execute it. The code section contains interpretable op-codes for executing the node. The entry point of the node is the first byte of the code section. The continuation contains a goto linking this node to the next op-code to be executed. The target of this goto is either the entry point of a sibling node or an interior op-code of a parent node.

For example, the template

    IF ( condition )
      THEN statement
      ELSE statement

has the internal representation given below.

    [Figure: IF node. Extension: IF. Code: halt, skip_2 on_false, halt, skip_1, halt. Continuation: goto. An arrow from the previous op-code enters at the first code op-code; the continuation goto leads to the next op-code.]

This node is tagged in the extension as an IF-node. It contains op-codes that implement the proper control flow and three halt instructions that represent the unexpanded placeholders. When the template has been expanded to

    IF ( k > 0 )
      THEN statement
      ELSE PUT LIST ( list-of-expressions );

a link to Polish postfix code for the phrase k > 0 replaces the first halt op-code, and a link to the node for the PUT statement replaces the third halt op-code. A halt instruction remains for the other statement placeholder:

    [Figure: expanded IF node. Code: goto (to the condition code for k > 0), skip_2 on_false, halt, skip_1, goto (to the PUT node). Continuation: goto. The condition code for k > 0 ends in a goto, and the PUT node consists of a halt followed by a goto.]

Communications of the ACM, September 1981, Volume 24, Number 9, p. 571

The interpreter is classical: it executes straight line code and goto instructions. It is completely blind to the structure of the tree and requires neither recursion nor a stack to execute a file tree. Access to variables and procedure definitions is through a symbol table.

The editor walks the tree using the same goto pointers as the interpreter. Each cursor position designates one of the nodes of the tree. Cursor motion is defined with respect to a preorder traversal. There are no backward pointers; thus, backward cursor motion is implemented internally by going all the way around.

B. Declarations

As demonstrated in Sec. II.C, declarations present a special problem: modifying a declaration can simultaneously introduce errors and correct errors at other locations in the program. Internally, information about identifiers is stored in a symbol table. When a declaration is modified, the Synthesizer discards the old symbol table and traverses the tree in preorder, reparsing and redoing the semantics of every phrase. Phrases with errors are marked as invalid and are printed in the highlighted font when the screen is redrawn. Because the allocation of variables within an activation record is recomputed in the process of reconstructing the symbol table, access to the variables of a suspended activation record is lost in the process. Therefore, execution cannot be resumed after such modifications.

C. Displaying the Tree

The print representation of a file is generated from the tree; a text representation is not saved. The external representation of each kind of template is stored in a table. The entries of this table alternate between terminal strings and placeholder-descriptors. For example, the IF template is encoded as:

    "IF ("
    condition-descriptor
    ") \{\nTHEN "
    statement-1-descriptor
    "\nELSE "
    statement-2-descriptor

The placeholder-descriptors identify the placeholders and their positions within the code section of an internal node.
The terminal strings contain key words, punctuation marks, and formatting control characters that are interpreted on output. For example,

    \{ means move left-margin right one unit,
    \n means line-feed, carriage-return to current left-margin,
    \} means move left-margin left one unit,
    \r means carriage-return to current left-margin.

The print routine traverses the tree in preorder, simultaneously keeping track of position within the external representation of the appropriate template. Each terminal string encountered is printed and its formatting commands obeyed. Each phrase is translated from postfix to infix for display. (The parentheses of a phrase are saved in the extension of the node, encoded one bit per operator.)

As the tree is traversed for display, a table mapping internal node addresses to external screen coordinates is updated. This table is used both for cursor motion in the editor, and at runtime for the trace feature.

D. Implementation of Debugging Features

The tracing, pacing, and single-step features are implemented by taking appropriate action on the interpretation of each goto leading to a new node.

When tracing, each goto uses the map from internal node addresses to screen coordinates to determine the new cursor position. If the map is not defined for a given target node, then the cursor lies outside the window and the program is redrawn with the new cursor position centered in the window. Traced programs are never permitted to run any faster than one cursor update per refresh of the video screen in order to avoid stroboscopic effects such as loops that appear to run backwards. When pacing, the interpreter waits appropriately at each goto before continuing execution. When stepping, the interpreter waits for a resume command before continuing.

The variable-monitoring feature is implemented in a straightforward manner: a table mapping identifiers to screen positions is maintained.
Assignment to a monitored variable is detected by the interpreter, whereupon the appropriate position is updated on the screen.

Reverse execution also has a straightforward implementation: the forward execution interpreter maintains a history file of the flow of control and the values destroyed by assignments to variables. The reverse execution interpreter restores values and updates the screen to give the illusion of the program executing backwards.

VII. The Synthesizer Generator

Continuing research and development of the Synthesizer will increase its power, versatility, and range of application, complementing the unique syntax-directed mechanisms the environment already provides. For example, global data flow analysis techniques will be used to answer queries about static program structure, as in [18]. The video display can be used to express static relationships between components of a program; the multiple fonts of a terminal can be exploited to highlight regions of interest. For example, the programmer might request the highlighting of all uses or all assignments to a variable X. Alternatively, the analysis can be keyed to the present location of the editing cursor. For example, the programmer might request the highlighting of all assignments to X that can account for its value at the present cursor location, or all possible uses of X that can be reached from the present cursor location.

To facilitate such further development, we are implementing a language-independent system for generating Synthesizer-like systems from a grammatical specification of a given programming language. An attribute grammar will be used to define the syntax, display format, and semantics of each template and phrase. In our application, where program units are inserted and deleted in arbitrary order, semantic analysis must be both incremental and reversible. For this purpose, attribute grammars have the advantage of expressing semantics and context-sensitive constraints applicatively and on a modular basis; the arguments to each semantic function are imported explicitly from neighboring nodes in the derivation tree.

Because propagation of semantic information through the tree is implicit in the formalism, an incremental attribute evaluator can update the appropriate attribute values in conjunction with each editing operation. In particular, because the attribute dependencies are known, the evaluator can delete semantic information automatically when program units are deleted; a separate mechanism to undo semantics is not needed. We have described one such incremental attribute evaluator in [8]; more recently, we have developed an optimal-time incremental evaluator that runs in time proportional to the number of attribute values that actually must be changed [21].

Acknowledgments. Many people have participated in the development of the Synthesizer. We are deeply indebted to A. Demers for many stimulating discussions and for writing the LSI-11 operating system kernel; his insights and assistance have been invaluable. We are also extremely grateful for the generous help of J. Archer, R. Conway, M. Fingerhut, D. Gries, C. Hauser, S. Horwitz, D. Jacobs, R. Johnson, D. Krafft, S. Mahoney, and R. Olsson.

Received 5/80; revised and accepted 4/81

References

1. Alberga, C.N., Brown, A.L., Leeman, G.B., Mikelsons, M., and Wegman, M.N.
A program development tool. Conference Record of the 8th Ann. Symp. on Principles of Programming Languages, Williamsburg, VA, Jan. 1981, 92-104.
2. Archer, J., Conway, R., Shore, A., and Silver, L. The CORE user interface. Tech. Report No. TR80-437, Dept. of Comptr. Sci., Cornell Univ., Ithaca, NY, Sept. 1980.
3. Balzer, R.M. EXDAMS-EXtendable Debugging and Monitoring System. AFIPS Proc. V. 34 (SJCC 1969), 567-580.
4. Constable, R. and O'Donnell, M.J. A Programming Logic. Winthrop, Cambridge, MA, 1978.
5. Conway, R. and Constable, R. PL/CS-A disciplined subset of PL/I. Tech. Rept. No. 76-293, Dept. of Comptr. Sci., Cornell, 1976.
6. Conway, R. Primer on Disciplined Programming Using PL/CS. Winthrop, Cambridge, MA, 1978.
7. Conway, R. and Gries, D. An introduction to programming-a structured approach using PL/I and PL/C. Winthrop, Cambridge, MA, 1979, 135-137.
8. Demers, A., Reps, T., and Teitelbaum, T. Incremental evaluation for attribute grammars with application to syntax-directed editors. Conference Record of the 8th Ann. Symp. on Principles of Programming Languages, Williamsburg, VA, Jan. 1981.
9. Donzeau-Gouge, V., Huet, G., Kahn, G., Lang, B., and Levy, J.J. A structure-oriented program editor. Tech. Rept., IRIA-LABORIA, France, 1975.
10. Engelbart, D.C. and English, W.K. A research center for augmenting human intellect. AFIPS Proc. V. 33 (FJCC, 1968).
11. Feiler, P.H. and Medina-Mora, R. An incremental programming environment. Dept. of Comptr. Sci., Carnegie-Mellon Univ., Pittsburgh, PA, April 1980.
12. Hansen, W. Creation of hierarchic text with a computer display. Ph.D. Thesis, Comptr. Sci. Dept., Stanford University, Stanford, CA, June 1971.
13. Habermann, A.N. An overview of the Gandalf project. Comptr. Sci. Res. Rev. 1978-79, Carnegie-Mellon Univ., Pittsburgh, PA, 1979.
14. Hodgson, Ll., and Porter, M. BIDOPS: A bi-directional programming system. Dept. of Comptr. Sci., Univ. of New England, Armidale, N.S.W., Australia, 1980.
15. Joy, B. Ex Reference manual. Dept. of Electrical Eng. and Comptr. Sci., Univ. California, Berkeley, CA, 1977.
16. Kurtz, T.E. BASIC. SIGPLAN Notices, Aug. 1978.
17. Lewis, J.W. and Porges, D.F. ALBE/P: a language-based editor for Pascal. Dept. of Comptr. Sci., Yale Univ., New Haven, CT.
18. Masinter, L.M. Global program analysis in an interactive environment. Xerox PARC Report SSL-80-1, Jan. 1980.
19. Mikelsons, M. and Wegman, M.N. PDE1L: The PL1L program development environment principles of operation. Res. Rept. RC8513, IBM Thomas J. Watson Research Center, Yorktown Heights, NY, Nov. 1980.
20. Pine, J.H. and Schweppe, E.J. A Fortran language anticipation and prompting system. Proc. ACM Nat. Conf., Atlanta, Georgia, 1973.
21. Reps, T. Optimal-time incremental semantic analysis for syntax-directed editors. Tech. Report No. 81-453, Dept. of Comptr. Sci., Cornell University, Ithaca, NY, March 1981.
22. Skinner, G. God user documentation. Dept. of Comptr. Sci., Cornell Univ., Ithaca, NY.
23. Teitelbaum, T. A formal syntax for PL/CS. Tech. Rept. 76-281, Dept. of Comptr. Sci., Cornell Univ., Ithaca, NY, 1976.
24. Teitelbaum, T. The Cornell Program Synthesizer: a microcomputer implementation of PL/CS. Tech. Report No. TR79-370, Dept. of Comptr. Sci., Cornell Univ., Ithaca, NY, June 1979.
25. Teitelbaum, T. The Cornell program synthesizer: A tutorial introduction. Tech. Report No. TR79-381, Dept. Comptr. Sci., Cornell Univ., Ithaca, NY, July 1979, Revised Jan. 1980.
26. Teitelman, W. INTERLISP reference manual. Xerox PARC, 1974.
27. Teitelman, W. A display-oriented programmer's assistant. Xerox PARC, March 1977.
28. Wilcox, T.R., Davis, A.M., and Tindall, M.E. The design and implementation of a table driven, interactive diagnostic programming system. Comm. ACM 19, 11 (Nov. 1976), 609-616.
29. Zelkowitz, M. Reversible execution as a diagnostic tool. Ph.D. Thesis, Dept. of Comptr. Sci., Cornell Univ., Ithaca, N.Y., Jan. 1971.
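As a closing illustration of the reverse-execution scheme from Sec. VI.D (a history of the flow of control plus the values destroyed by assignments), the following is a minimal sketch. The class and method names are hypothetical, the history is kept in memory rather than in a file, and a simple counter stands in for the flow of control; none of this is taken from the Synthesizer itself.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Sketch of reverse execution via a history of destroyed values (names are assumptions). */
final class ReverseExecutionSketch {
    /** One history record: where control was and which value the assignment destroyed. */
    private static final class Entry {
        final int pc;
        final String name;
        final Integer destroyed; // null means the variable had no value before

        Entry(int pc, String name, Integer destroyed) {
            this.pc = pc;
            this.name = name;
            this.destroyed = destroyed;
        }
    }

    private final Map<String, Integer> vars = new HashMap<String, Integer>();
    private final Deque<Entry> history = new ArrayDeque<Entry>();
    private int pc = 0;

    /** Forward step: log the old value and the current position, then assign. */
    void assign(String name, int value) {
        history.push(new Entry(pc, name, vars.get(name)));
        vars.put(name, value);
        pc++;
    }

    /** Backward step: restore the most recently destroyed value and position. */
    boolean stepBack() {
        if (history.isEmpty()) {
            return false;
        }
        Entry e = history.pop();
        if (e.destroyed == null) {
            vars.remove(e.name);
        } else {
            vars.put(e.name, e.destroyed);
        }
        pc = e.pc;
        return true;
    }

    Integer get(String name) {
        return vars.get(name);
    }

    int pc() {
        return pc;
    }
}
```

A display layer would redraw the affected screen positions after each `stepBack()`, which is what gives the impression of the program running backwards.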

Experiences with the PSG - Programming System Generator

G. Snelting
Institut für praktische Informatik
Technische Hochschule Darmstadt

Abstract. The programming system generator developed at the Technical University of Darmstadt generates sophisticated interactive programming environments from formal language definitions. From a formal, entirely nonprocedural definition of the language's syntax, context conditions and denotational semantics, it produces a hybrid editor, an interpreter and a library system. The editor allows both structure editing and text editing, guaranteeing immediate recognition of syntax and semantic errors. The generator has been used to generate environments for PASCAL, MODULA-2 and the formal language definition language itself. A brief description of the generated environments and the definition language is given, and our experiences with formal language definitions are discussed from the language definer's point of view as well as from the programmer's point of view using the generated environments.

1. Introduction

The Programming System Generator PSG developed at the Technical University of Darmstadt generates language-dependent interactive programming environments from formal language definitions. From a formal definition of a language's syntax, context conditions, denotational semantics and additional information it produces an integrated software development environment. One of the major components of a PSG environment is a powerful hybrid editor which allows structure oriented editing as well as text editing. In structure mode, the editor guarantees prevention of both syntactic and semantic errors, whereas in textual mode it guarantees their immediate recognition. The editor is generated from the language's syntax and context conditions. Furthermore, a PSG environment includes an interpreter which is generated from the language's denotational semantics. A language-independent library system is part of a PSG environment.

The basic units for editing and interpreting are called fragments. A fragment is an arbitrary part of a program, for example a statement, a procedure declaration or a whole program. Fragments are internally stored as abstract syntax trees. Fragments may be incomplete, that is, subcomponents may be missing. Missing subcomponents are called templates. Bottom-up system development is provided by combining fragments, while the fragments themselves are constructed top-down.

The editor supports two input modes, which may be mixed freely by the user. In textual mode, the editor behaves like a normal screen-oriented text editor with the usual capabilities to enter, modify, delete, search etc. text. By keystroke, incremental syntactic and semantic analysis are invoked. If the input was error-free, the text will be pretty-printed and editing may proceed. If any syntactic or semantic errors are detected, an error message will be displayed by a menu-driven error recovery routine. Earliest possible detection of both syntactic and semantic errors is guaranteed†: as soon as a fragment cannot be embedded into a syntactically and semantically correct program, it will be classified as erroneous. For semantic errors, this works even if declarations of e.g. variable types are still missing.

In structured mode, programs are developed in menu-driven refinement or modification steps. The menus are generated according to the abstract syntax of the language. The usual structure oriented commands are offered to the user, such as refinement of a structure, selection from alternatives of a syntactic class, modification, insertion, and deletion of substructures, zooming of substructures, copying of substructures etc. However, the menus are filtered dynamically by the context analysis, such that only those menu-items producing syntactically and semantically correct refinements after selection will be offered to the user.
Thus, in structural input mode, neither syntactic nor semantic errors can occur. In addition the user may retrieve the context information which has been derived so far. For example, he might ask the system which variables are already declared, which variables are still undeclared, what possible types the undeclared variables may possess etc.

Like the other system components, the interpreter is able to handle arbitrary incomplete fragments. As long as control flow in the interpreted fragment does not touch any syntactically incomplete structure, the fragment can be interpreted without difficulties. If flow of control encounters a template, the editor will be invoked asking the user to enter the missing parts of the fragment. Alternatively, the language definer may force the interpreter to ask the user for e.g. values of uninitialized variables or missing expressions.

A language-independent fragment library system where fragments are stored as abstract syntax trees is also part of a generated environment. Reading, writing and rewriting of fragments is automatically performed by the editor if required. Deletion of fragments requires an explicit user command. PSG environments offer the facility of redirecting input to external text files. Furthermore, fragments may be written in pretty-printed style onto external files.

* According to our philosophy, declaration before use is not required. An undeclared variable is considered a semantic error as soon as the last template offering the possibility of declaring that variable has been deleted.
* Work of this author was supported by the 'Deutsche Forschungsgemeinschaft'

2. What the language definer has to do

One of the most important goals during the development of PSG has been the definition of a formal language definition language covering the whole spectrum of a language's syntax, context conditions and dynamic semantics as well as all of the additional information required by an interactive environment, e.g. menu texts or pretty-printing information. Thus, the language definer working with PSG is offered a formal, nonprocedural definition language. This is in striking contrast to most existing environment generators, which frequently support only the formal definition of the syntactic aspects of a language. For example, the language definer working with GANDALF [Hab82a] has to write so-called action routines in an ordinary programming language; these action routines will perform tasks such as type checking, code generation etc. Using the Cornell Program Synthesizer (CPS) [Rep81], which is based on attributed grammars, the language definer has to code certain attribute functions in the language C.

A PSG language definition consists of three major parts, the definition of the syntax, the context conditions, and the denotational semantics of the language. The first part is mandatory, the others are optional. Syntax and semantics definition rely on well-known concepts. However,
new concepts based on AI technology had to be developed for defining and checking context conditions, due to the specific requirements of interactive environments where programs are usually incomplete, containing e.g. pending variable declarations.

The syntax definition part starts with the definition of the lexical structure of the language, which is used to generate a scanner. The language definer has to specify all reserved words and all delimiters (special symbols). Each lexical entity is given a name. For PASCAL*, this looks as follows:

    if      -> 'IF';
    then    -> 'THEN';
    else    -> 'ELSE';
    becomes -> ':=';
    equal   -> '=';
    sem     -> ';';
    etc.

The abstract syntax, which forms the second part of the syntax definition, is the core of any language definition. All other parts of a language definition refer to the abstract syntax. Abstract syntax rules look like this:

    CLASS statement = assignment, forstatement, compound, ifstatement, call;
    NODE assignment :: variable expression;
    NODE forstatement :: Id expression to_or_downto expression statement;
    NODE compound :: statementlist;
    LIST statementlist = statement+;
    NODE ifstatement :: expression statement [statement];
    NODE call :: Id [parameterlist];
    LIST parameterlist = expression+;
    CLASS variable = Id, record_ref, array_ref, pointer_ref;
    CLASS expression = variable, constant, addition, subtraction, ...;
    NODE addition :: expression expression;
    etc.

CLASS rules describe syntactic alternatives. NODE rules define substructures of a syntactic entity. Substructures which are optional are enclosed in square brackets. The number of a node's substructures is fixed, although they may be of different syntactic type. LIST rules define syntactic entities with a variable number of substructures of the same syntactic type.

In a PSG environment, fragments are internally represented by abstract syntax trees. Missing substructures of a node are represented by tree templates; they serve as placeholders for pending refinements. Missing sublists of a list are called list templates; they may be moved, deleted and inserted freely within a list.

* In the following, all examples refer to PASCAL.
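
As an illustration of the fragment representation described above, the following Python sketch models NODE and LIST instances whose missing substructures are held by templates. It is only a sketch under stated assumptions: the names Template, Node, ListNode, Leaf, templates and is_complete are hypothetical and do not come from PSG, and the distinction between tree templates and list templates is collapsed into one placeholder class.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Template:
    """Placeholder for a missing substructure of the given syntactic class."""
    syntactic_class: str


@dataclass
class Leaf:
    """A terminal such as an identifier or a constant."""
    kind: str
    text: str


@dataclass
class Node:
    """Instance of a NODE rule: a fixed number of children."""
    rule: str
    children: List["Tree"] = field(default_factory=list)


@dataclass
class ListNode:
    """Instance of a LIST rule: any number of items of one syntactic class."""
    rule: str
    items: List["Tree"] = field(default_factory=list)


Tree = Union[Template, Leaf, Node, ListNode]


def templates(tree):
    """Yield every template (pending refinement) contained in a fragment."""
    if isinstance(tree, Template):
        yield tree
    elif isinstance(tree, Node):
        for child in tree.children:
            yield from templates(child)
    elif isinstance(tree, ListNode):
        for item in tree.items:
            yield from templates(item)


def is_complete(tree):
    """A fragment is complete when it contains no templates."""
    return next(templates(tree), None) is None


# Incomplete fragment: an ifstatement whose condition has not been entered yet.
assignment = Node("assignment", [Leaf("Id", "x"), Leaf("Int", "1")])
fragment = Node("ifstatement", [Template("expression"), assignment])
pending = [t.syntactic_class for t in templates(fragment)]
```

In this sketch, pending lists the syntactic classes for which an editor would still have to offer a refinement menu.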

The structure oriented commands and menus offered to the user are generated according to the abstract syntax. For example, each template is associated with a menu of refinement possibilities. However, this menu is dynamically filtered with respect to context conditions (see below).

The concrete syntax, which is the third part of the syntax definition, is used to generate an incremental parser. The concrete syntax is restricted to full LL(1) grammars. It includes transformation rules which specify how to build abstract trees from textual input. Thus the concrete syntax is actually a string-to-tree transformation grammar. Concrete syntax rules look like this:

    statement ::= NODE for, Id, becomes, expression, toordownto, expression, do, statement => forstatement
                | NODE begin, statementlist, end => compound
                | NODE Id, optparameterlist => call;
    statementlist ::= LIST statement+-sem;
    optparameterlist ::= [lp, parameterlist, rp];
    toordownto ::= TERMINAL to | TERMINAL downto;
    etc.

The NODE, LIST and TERMINAL keywords and the '[', ']' and '=>' delimiters specify how to build the abstract tree during the parsing process. However, the situation is not always that simple. Frequently, a concrete syntax does not merely reflect the rules of the abstract syntax, due to operator precedences or left-factorization used to avoid LL(1)-conflicts. For example,

    expression ::= simple_expression, simple_expr_tail;
    simple_expression ::= factor, ...;
    simple_expr_tail ::= UPDATENODE equal, simple_expression => equal_expr
                       | EMPTY;

Here, the UPDATENODE and EMPTY rules will construct a correct equal_expr node, although the rules reflect operator precedence and are left-factorized.

The parser will parse any input entered in textual mode. It accepts arbitrary valid prefixes of any input conforming to the syntactical category of a given template. If any syntax errors are detected, a recovery routine will compute a menu comprising all local correction possibilities, which is presented to the user. The user may then correct his input either in textual mode or by selection among the menu items.

Being the fourth part of the syntax definition, the format definition is a tree-to-string transformation grammar which is used to construct the external textual representation of an abstract tree. Pretty-printing information is part of the format definition:

    forstatement => ! for Id becomes expression to expression do statement[2];
    ifstatement  => ! if expression then statement[2] (statement[2] -> ! else);

In the example, '!' means start of a new line, and indentation factors may be specified inside square brackets. Parentheses are used to specify conditional formatting: the keyword 'ELSE' will be displayed only if the optional else-part of an 'ifstatement' is indeed present. Conditional formatting is used also to re-insert parentheses into expressions if necessary due to operator precedence (note that parentheses are discarded during parsing and that operator precedences are reflected by the abstract tree's structure). A string-to-tree-to-string transformation which is performed by parsing textual input, building the abstract tree and pretty-printing the abstract tree must yield the original input text exactly except for spaces, newlines and redundant parentheses.

In the last part of the syntax definition, headers and menu texts have to be specified which are used to generate the textual representation of templates and menus.
For each name occurring in the abstract syntax an external name has to be specified:

    statement   -> 'Anweisung';
    ifstatement -> 'Bedingte Anweisung';

For each syntactic class, menu texts have to be specified:

    statement -> 'FOR-Anweisung', 'Verbundanweisung';

For purposes of generality, syntactic entities may possess different external names, depending on their occurrence in templates or in menus.

The definition of context conditions

The context analysis of PSG has been of special interest, since the classical methods like attributed grammars [Knu68] turned out to be inadequate even if attribute evaluation is performed incrementally [Rep83]. Consider the following situation: In a PASCAL program fragment, the variables 'a' and 'i' have not yet been declared or used, and a declaration template is still present. Now the user enters an incomplete assignment

    a[a[i+1]] := ■

Although 'a' and 'i' are still undeclared, the context analysis must derive immediately that 'i' has type integer (or a subrange thereof), that 'a' is a one-dimensional array with index and component type integer, and that the still missing right-hand side of the assignment must also be compatible with integer. If a user types 'TRUE' as the right side, a semantic error must immediately be reported. In addition, the menu for the right-hand side template should be filtered in such a way that the menu item for the constant 'TRUE' will not be displayed, as well as all other non-integer expression items.

The classical methods follow the scheme: first inspect the declarations and collect information about e.g. types of variables, then use this information to check type incompatibilities in expressions etc. This scheme does not work in the above example.

The concept of context relations [Hen84] has been developed to overcome these difficulties with the classical methods. The basic idea is to compute a set of still possible attributes for each node of an incomplete fragment. A collection of still possible attribute assignments to the nodes of a fragment is called a context relation. If such a relation consists of exactly one tuple, the context information is unambiguous. If a relation is empty, a semantic error has been detected. It can be shown that the context relation of a composite fragment is just the natural join of the relations of its subfragments. Therefore context conditions may be computed incrementally during editing. As context relations are in general of infinite size, they are represented in a finite way using so-called term form relations with variables. The basic idea is to describe the set of possible attributes by a grammar, the so-called data attribute grammar.
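
The join-based analysis described above can be sketched in Python for the special case of finite relations. This is a minimal illustration under stated assumptions, not PSG's mechanism: PSG uses term form relations with variables, whereas this sketch enumerates a small, assumed universe of types, and the names TYPES, join, assignment and constant are hypothetical.

```python
from itertools import product

# Assumed finite universe of types; real context relations may be infinite.
TYPES = ("integer", "real", "boolean")


def join(r, s):
    """Natural join of two relations, each given as a list of dicts."""
    out = []
    for a, b in product(r, s):
        # Tuples combine only if they agree on all shared attribute names.
        if all(a[k] == b[k] for k in a.keys() & b.keys()):
            merged = {**a, **b}
            if merged not in out:
                out.append(merged)
    return out


def assignment(lhs, rhs):
    """Relation of 'lhs := rhs': equal types, or real := integer."""
    same_type = [{lhs: t, rhs: t} for t in TYPES]
    return same_type + [{lhs: "real", rhs: "integer"}]


def constant(name, type_):
    """Relation of a constant occurrence with a known type."""
    return [{name: type_}]


# Fragment 1: x := TRUE, with x still undeclared.
r1 = join(assignment("x", "e1"), constant("e1", "boolean"))
# Fragment 2: x := 1, analysed on its own.
r2 = join(assignment("x", "e2"), constant("e2", "integer"))
# Combining both fragments: no type for x satisfies both assignments.
both = join(r1, r2)
```

Here r1 leaves exactly one possibility for x, r2 leaves two, and the empty join both corresponds to a detected semantic error, even though no declaration of x has been seen.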
Infinite sets of attributes are then represented by incomplete derivation trees according to the data attribute grammar; in addition these derivation trees may contain arbitrary functional dependencies between (sub)trees.

To specify context conditions, the language definer first has to define the scope and visibility rules of the language. This information is used to determine whether all the different occurrences of an identifier in a fragment actually denote the same "abstract" identifier. If so, their corresponding sets of still possible attribute values may be intersected.

The second part of the context conditions definition is the specification of the data attribute grammar. Here, the structure of the attributes of the language is defined. Typical rules look like this:

    ...
    NODE attribute :: type class;
    CLASS type = simple_type, array_type, set_type, ...;
    CLASS simple_type = arithmetic, ordinal;
    CLASS ordinal = integer, Boolean, subrange, enumeration;
    NODE set_type :: ordinal;
    NODE array_type :: index_types type;
    LIST index_types = ordinal+;
    CLASS class = variable, ctype, constant, procedure, ...;
    etc.

The attribute format definition forms the third part of the context conditions definition. Similar to the format definition of the context-free part of the language definition, it specifies how attributes shall be displayed to the user if he looks at the symbol table.

The last and most important part of the context condition definition is the specification of the so-called basic relations, which must be specified for all terminals and each node rule of the abstract syntax. As the context relation of a fragment is the join of the relations of its components, specification of the basic relations provides enough information to analyse each fragment incrementally. A basic relation consists of a set of tuples which define a (possibly infinite) set of attribute assignments to the components of a node rule or a terminal, respectively. For instance, the basic relation of a syntactic integer number consisting of a single tuple might be:

    Int: MK-attribute(integer, constant);

which specifies that an integer number has type integer and is a constant. More sophisticated specifications can be obtained by using variables, which specify that certain subattributes must be identical. The basic relation for an assignment

    NODE assignment :: variable expression;

contains three tuples, which use the variable TYPE:

    assignment: NIL MK-attribute(TYPE, variable) MK-attribute(TYPE, computational)
              | NIL MK-attribute(real, variable) MK-attribute(integer, computational)
              | NIL MK-attribute(TYPE, function) MK-attribute(TYPE, computational);

which says that in an assignment either
- the left-hand side is a variable of a certain TYPE, and the right-hand side is an expression of the same TYPE, or
- the left-hand side is a real variable, and the right-hand side is an integer expression, or
- the left-hand side is a function identifier with a certain result TYPE, and the right-hand side is an expression of the same TYPE.

During editing, an inference engine is used to derive context information from the basic relations as demonstrated in the above example. Note the similarity to the AI paradigm of inference-rule-based deduction systems.

The definition of semantics

Within the PSG system, the dynamic semantics of a language is defined in denotational style [Gor79]. The denotational semantics is used to generate an interpreter. The semantic functions are defined in a META-IV-like [Bjo78] extension of type-free lambda calculus. This metalanguage offers high-level concepts like lists and maps and allows the definition of higher-order functionals of arbitrary rank. The terms of the metalanguage are used as a universal intermediate language. If a fragment is to be executed, it will be translated into a term of the metalanguage, using the definitions of the semantic functions. This term will be interpreted, that is, reduced to normal form. The resulting term is the result of program execution.

In contrast to systems like SIS [Mos79] our interpreter allows interaction with the user during program execution in order to supply input data, to enter values of uninitialised variables etc.

The definition of the semantics consists of three parts. First of all, a set of auxiliary functions to be used elsewhere in the semantics definition may be defined. For example, the definition of a "distributed concatenation" function for a list of lists (which is supposed to be used in several distinct semantic functions for different types of lists) looks as follows:

    disconc = LAM list_of_lists. IF NULL list_of_lists THEN <>
              ELSE CONC HEAD list_of_lists, (disconc TAIL list_of_lists);

Here, LAM denotes functional abstraction, parentheses denote functional application, NULL is a test for the empty list, CONC, HEAD and TAIL have their usual meanings, and '<>'
denotes the empty list.

The main part of the semantics definition comprises the semantic functions for each syntactic entity. In a PASCAL subset without GOTOs and side effects of functions, the meaning of a statement may be defined as a functional which maps environments onto functions which map states to states. The meaning of an expression is a functional which maps environments onto functions from states to values. An environment is a map which maps identifiers to <location, descriptor> pairs. A state is a map which maps locations to values. Thus, the semantic function for a conditional statement might look as follows*:

    ifstatement = LAM env. LAM state.
        IF (([[ expression ]] env) state)
        THEN (([[ statement 1 ]] env) state)
        ELSE ((| statement 2 : LAM env. LAM state. state | env) state);

The '[[' and ']]' brackets are the "meta-brackets" which denote the meaning functions of the subcomponents of a node. The special form '|' ... '|' is used for subcomponents which are optional (as the ELSE-part in our example). If the optional subcomponent is missing, the function following the colon will be used.

The third part of the semantics definition describes the meanings of the executable fragments. Typical examples are

    procedure_declaration = ERROR 'Procedure declaration is not executable';
    statement = 'Result of statement execution with no variables declared or initialized', (([[ statement ]] []) []);

where '[]' denotes the empty map. Note the difference between the result of a 'statement' execution specified here and the semantic function for the syntactic class 'statement', to which the above definition refers.

* For the sake of readability, this specification does not exactly ...

3. Experiences with the generator and the generated environments

Until now, environments have been generated for Algol60, PASCAL, MODULA-2, the language definition language itself, and some experimental specification languages.
The language definition environment has been used intensively not only by the members of the project team, but also by lots of students, as most of our environments have been generated as part of diploma theses. At Kaiserslautern, PSG has been used along with other systems (GAG [KasV] and GANDALF) in student projects for the implementation of a PASCAL subset. The PASCAL environment was used to implement other parts of the PSG system. Thus, we feel that by now we have gathered substantial experience and that we are able to compare our approach to others, especially GANDALF and CPS.

The benefits of a formal language definition language

In [Hab84], Habermann states that "the state of the art has not reached the point where all of the task-specific (i.e. language specific) part (of an environment) can be formally described and automatically generated". However, our experience with PSG indicates that this point has been reached by now, at least for languages of a complexity not greater than that of e.g. PASCAL.

The use of a formal language definition language has many advantages:

- In view of the power and complexity of the generated environments, PSG language definitions are very short. Typically, they vary in size between 240 lines for an Algol60 environment without context conditions and semantics and 3600 lines for a MODULA-2 environment including full specification of context conditions and denotational semantics.
- The expressive power of the language definition language allows concentration on the relevant aspects of a language definition. The language definer does not have to concern himself with minor details such as the organization of symbol tables etc.
- PSG language definitions are safe, since all inconsistencies in a definition are detected at generation time.*
- The modular design of the language definition language improves readability and reliability. It allows the independent definition of the syntactic, context dependent, and semantic aspects of a language, once the abstract syntax has been defined.
- A formal language definition language is an ideal tool during the development of new languages. In a "language design lab", language definitions are easily modified and tested.

As a consequence, the amount of manpower to generate an environment is small: A moderately awake graduate student with some background in programming languages and some initial knowledge of the PSG user interface will specify and debug an Algol60 definition without context conditions and semantics within ten days. The MODULA-2 environment including full specification of context conditions and denotational semantics was defined as part of a diploma thesis within eight months. Thus, the use of a formal language definition language allows the quick generation of correct, reliable and powerful programming environments.

The benefits of the hybrid editor approach

In [Fei84], Kaiser and Feiler state for structure oriented editors that "in order to modify an expression, ...
the user must understand the tree representation and enter a tedious series of tree oriented clip, delete and insert commands. Unfortunately, complete parsing of all expressions is also suboptimal". This is true not only for expressions, but also for arbitrary structured statements as well as for any syntactic entity including complete programs. In [Rep81], Teitelbaum and Reps state that "(the change of a while loop into a repeat loop) must be accomplished by moving the constituents of the existing WHILE-template into a newly inserted UNTIL-template. Although such modifications can be made rapidly ..., they are admittedly awkward". Within a PSG environment, problems of this kind do not exist, since users may switch freely between textual mode and structure mode. Furthermore, our experience indicates that experienced programmers prefer textual mode not only for modifications, but also to enter e.g. a sequence of statements or even a whole procedure. Since the parser accepts arbitrary incomplete input and, in case of syntax errors, generates a menu of all possible local recovery actions, textual input mode seems to be quite attractive for users who know the concrete syntax of their language. Furthermore, arbitrary parts of a fragment may be read in from an external text file. On the other hand, inexperienced users tend to prefer structured mode. By simply selecting menu items, they need not bother about syntactic details which they do not know.

* At the moment, this is not true for the semantics definition, as it is based on type free lambda calculus. However, the implementation of a type inference algorithm allowing handling of polymorphism, overloading and coercions is about to be completed (see [Let84]).

Thus, the possibility to mix textual mode and structure mode freely seems to be the most flexible, general, and user friendly solution to the dichotomy of viewing programs either as text or as structure.

The benefits of dynamic context sensitive menu filtering

In [Hab82b] Habermann states that "We believe that preventing mistakes is far superior to making the user fix them. ... (however) as to semantic errors it is difficult to see how to avoid them". Within a PSG environment, structured mode prevents syntactical and semantical errors due to the dynamic context-sensitive menu filtering. This feature is not provided by any other environment known to us. In textual mode, the user may always type arbitrary nonsense, but syntactical and semantical errors will be detected immediately. This guarantees that programs are correct at every stage of their development.

We believe that our environments support syntactic and semantic error prevention in the best possible way. There is, however, one problem in connection with certain modifications: if a user modifies e.g. a procedure declaration by adding an extra parameter, context incompatibilities will occur at each place where the procedure is called. If the calls are modified first, they will become incompatible with the procedure declaration. Although it might be considered bad programming style to modify the types of objects in an uncontrolled manner, the user can circumvent such situations by temporarily deactivating the context analysis. It is planned to modify the context analysis in a way that enables it to tolerate faulty subtrees temporarily.

Drawbacks in generality and performance

PSG is not the ultimate system, as there remain several unsatisfying points. A formal language definition language enables language definers to generate environments in a rapid and reliable way. However, the current implementation of the definition language imposes some restrictions on the class of languages which may be defined with PSG.

First of all, if the concrete syntax of a language cannot be made LL(1), the language cannot be defined within PSG. It is, however, difficult to see how to incorporate a more powerful parsing technique. Bottom-up techniques such as incremental LR parsing [Cel78] do not fit in our framework. LL(k) with k > 1 is problematic in view of the requirement that arbitrary valid prefixes of sentential forms must be parseable.

Certain languages have context conditions which are not definable within the current definition language. The scope and visibility analysis cannot handle features like elliptical record references in PL/1 or FORWARD procedure declarations in PASCAL (which will lead to a 'double declaration' error). Within our framework - no declarations required before use - FORWARD declarations do not make sense anyway. The context analysis phase is unable to handle user-defined polymorphic or overloaded objects such as overloaded functions in ADA. We are currently working on a more powerful specification language for context conditions which will overcome these shortcomings. Finally, the semantics definition language is unable to handle any form of parallelism.

The performance of PSG environments has not yet reached production quality, as far as context analysis and program execution are concerned. For the context analysis, this is primarily a problem of the current implementation, which is merely a prototype.
We expect that a more sophisticated implementation will result in a speedup factor of at least five. However, the intrinsic complexity of the method is greater than that of e.g. attributed grammars: For an abstract syntax tree containing n nodes the Reps/Teitelbaum algorithm will perform with O(n), whereas our method requires O(n*ln(n)).

The performance difficulties concerning program execution are of a slightly different nature, as we have difficulties to see how to speed up the interpreter simply by improving its implementation. The interpreter is much faster than that of SIS. However, it is not fast enough for production programs, as is also noted by Pleban for PSP [Ple84]. We think that these shortcomings may be overcome by compilation of the metalanguage terms [Bah84b], utilizing techniques like data flow analysis, elimination of unnecessary call-by-name and delayed evaluation, and elimination of tail recursion and linear recursion.

4. Conclusion

We presented the PSG programming system generator, which generates powerful interactive programming environments from formal language definitions. The pros and cons of using a formal, entirely nonprocedural language definition language have been discussed. It turned out that use of a formal definition language allows very simple and rapid generation of reliable and powerful environments. On the other hand, certain strange and complicated features of certain languages are not definable with the currently implemented definition language, and the performance of the generated environments has not yet reached production quality. Nevertheless, we believe that the use of formal language definitions is an appropriate tool, and that the shortcomings in performance will be overcome by more sophisticated implementations, which are still under way.

5. Acknowledgements

I thank the other members of the project team, namely R. Bahlke, W. Henhapl, H. Minkel, H. Jäger and T. Letschert for their valuable comments during the development of this paper.

6.
References

[Bah84a] Bahlke, R. and Snelting, G.: Programmiersystemgenerator. Arbeitsbericht 1984. Bericht PG2112/84, Fachgebiet Programmiersprachen und Übersetzer II, Technische Hochschule Darmstadt, Februar 1984.
[Bah84b] Bahlke, R. and Letschert, T.: Ausführbare denotationale Semantik. Proc. 4. GI-Fachgespräch Implementierung von Programmiersprachen, März 1984.
[Bjo78] Bjørner, D. and Jones, C.B. (eds.): The Vienna Development Method: The Meta-Language. LNCS 61, Springer Verlag 1978.
[Cel78] Celentano, A.: Incremental LR parsers. Acta Informatica 10 (1978), 307-321.
[Fei84] Kaiser, G.E. and Feiler, P.: Generation of language-oriented editors. Proc. Programmierumgebungen und Compiler, Berichte des German Chapter

Implementation Techniques for Prolog

Andreas Krall
Institut für Computersprachen
Technische Universität Wien
Argentinierstraße 8
A-1040 Wien
andi@mips.complang.tuwien.ac.at

Abstract

This paper is a short survey about currently used implementation techniques for Prolog. It gives an introduction to unification and resolution in Prolog and presents the memory model and a basic execution model. These models are expanded to the Vienna Abstract Machine (VAM) with its two versions, the VAM2P and the VAM1P, and the most famous abstract machine, the Warren Abstract Machine (WAM). The continuation passing style model of Prolog, binary Prolog, leads to the BinWAM. Abstract interpretation can be applied to gather information about a program. This information is used in the generation of very specialized machine code and in optimizations like clause indexing and instruction scheduling on each kind of abstract machine.

1 Introduction

The implementation of Prolog has a long history [Col93]. Early systems were implemented by the group around Colmerauer in Marseille. The first system was an interpreter written in Algol by Philippe Roussel in 1972. With this experience a more efficient and usable system was developed by Gerard Battani, Henry Meloni and Rene Bazzoli [BM73]. It was a structure sharing interpreter and had essentially the same built-in predicates as modern Prolog systems. This system was reasonably efficient and convinced others of the usefulness of Prolog. Together with Fernando and Luis Pereira, David Warren developed DEC-10 Prolog, the first Prolog compiler [War77]. This compiler and the portable interpreter C-Prolog spread around the world and contributed to the success of Prolog. Further developments are described in [Roy94] and partly in this paper.

Section 2 presents a basic execution model for Prolog. This model helps to understand the Warren Abstract Machine described in section 3 and the Vienna Abstract Machine described in section 4.
Section 5 gives an overview of optimizations.

2 A basic execution model

2.1 Introduction

The two basic parts of a Prolog interpreter are the unification part and the resolution part. The resolution is quite simple. It just implements a simplified SLD-resolution mechanism that searches the clauses top-down and evaluates the goals from left to right. This strategy immediately leads to the backtracking implementation and the usual layout of the data areas and stacks. Resolution handles stack frame allocation, calling of procedures, and backtracking.

Unification in Prolog is defined as follows:

• two constants unify if they are equal
• two structures unify if their functors (name and arity) are equal and all arguments unify
• two unbound variables unify and they are bound together
• an unbound variable and a constant or structure unify and the constant or structure is bound to the variable

This definition of unification determines the data representation. A thorough analysis of the recursive unification algorithm pays off because the interpreter spends most of the time in this part.

2.2 The representation of data

Since Prolog is not statically typed, the type and value of a variable can in general be determined only at run time. Therefore, a variable cell is divided into a value part and a tag part which determines the kind of the value. Fig. 1 shows a tagged value cell.

[Figure 1: a tagged value cell (fields: tag, value)]

Basic data objects in Prolog are constants (atom and integer), structures and unbound variables. Since unbound variables can be bound together, there are references between variables which are represented by pointers. To access a variable it can be necessary to follow the chain of references, which is called dereferencing. The following tagged cells are needed in a Prolog system:

atom       unique identifier of character string
integer    integer number
reference  pointer to another tagged cell
unbound    (self-reference pointer)
functor    name and arity of a structure followed by a tagged cell for each argument

Most Prolog implementations do not use a separate tag for unbound variables. They represent unbound variables by a self reference. This can eliminate a tag check during unification of an unbound variable with another variable. The comparison can be replaced by an assignment. Since structures need more than one cell the variable cell contains a reference to the functor cell (see fig. 2). If the last cell of a structure is again a structure and the second structure is allocated directly after the first structure, the reference can be omitted (see fig. 3). This compact allocation can be obtained either at the first allocation or on garbage collection.

[Figure 2: representation of X = 1.2.[]]

[Figure 3: compacted representation of X = 1.2.[]]

Another solution is a special reference tag for structures. The advantage of this method is that the type of a value cell can be determined without a memory access. Many implementations distinguish further between the empty list (nil) and other atoms, and between lists and other structures in order to allow more efficient implementations of lists. Big numbers and floating point numbers are represented as structures.

The tag field can be represented in different ways. It can be an additional memory cell of the standard word size, or it can be a small part of a memory cell.
If the tag consists of some bits, the tag is either fixed-sized or variable-sized and uses the most or least significant part of the word.

Useful tag representations try to minimize the tag extraction and insertion overhead. An example is the use of zeroes in the least significant part of the word as an integer tag. Addition and subtraction can then be done without tag manipulation. Another example is to have the stack pointer displaced by the list tag, so that the allocation of list cells is free. A comprehensive study of tag representations can be found in [SH87].

Problems arise if a variable occurring inside a structure should be bound to this structure. In theorem provers the unification should fail in such a case. This test for the occurrence of the variable in a structure, called occur check, is expensive. It is omitted in many unification algorithms employed by Prolog systems. If such a structure is assigned to a variable, a recursive structure is created. A simple unification algorithm would enter an infinite loop when unifying two infinite structures. There exist linear time unification algorithms for infinite structures [Jaf84], but many Prolog systems do without them: they create infinite structures, but cannot unify or print them.
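The cell kinds and the unification rules of sections 2.1 and 2.2 can be sketched in a few lines. The following Python fragment is only an illustration under stated assumptions: the flat `store` list, the tag names and the helper functions are hypothetical and not taken from any particular Prolog system. Unbound variables are self-references, structure arguments follow the functor cell, and the occur check is omitted as described above.

```python
# Tagged cells in a flat store. All names here are illustrative assumptions.
REF, ATOM, INT, FUNCTOR = "ref", "atom", "int", "functor"

store = []  # list of [tag, value] cells; an address is an index into it


def new_var():
    """Allocate an unbound variable: a reference cell pointing to itself."""
    addr = len(store)
    store.append([REF, addr])
    return addr


def new_const(tag, value):
    """Allocate an atom or integer cell."""
    addr = len(store)
    store.append([tag, value])
    return addr


def new_struct(name, args):
    """Allocate a functor cell followed by one cell per argument.
    Each argument cell is a reference to an already allocated term."""
    addr = len(store)
    store.append([FUNCTOR, (name, len(args))])
    for a in args:
        store.append([REF, a])
    return addr


def deref(addr):
    """Follow the reference chain to a non-reference or an unbound cell."""
    while store[addr][0] == REF and store[addr][1] != addr:
        addr = store[addr][1]
    return addr


def unify(a, b):
    """Recursive unification without occur check (and without a trail)."""
    a, b = deref(a), deref(b)
    if a == b:
        return True
    tag_a, val_a = store[a]
    tag_b, val_b = store[b]
    if tag_a == REF:                 # a is unbound: binding is an assignment
        store[a][1] = b
        return True
    if tag_b == REF:                 # b is unbound
        store[b][1] = a
        return True
    if tag_a != tag_b or val_a != val_b:
        return False                 # different constants or functors
    if tag_a == FUNCTOR:             # same name and arity: unify arguments
        arity = val_a[1]
        return all(unify(a + i, b + i) for i in range(1, arity + 1))
    return True
```

Unifying f(X, 1) with f(a, Y) in this sketch binds X to the atom a and Y to the integer 1. Bindings are not recorded anywhere, so this version cannot undo them on backtracking.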

2.3 The data areas

Variables in a Prolog clause are stored in a stack frame similar to variables in a conventional programming language. SLD-resolution was chosen as the resolution scheme for Prolog because of its simple stack implementation and efficient memory use. An early description of the memory management of Prolog can be found in [Bru82].

The clause

    a(C) :- b(C), c(C).

can be represented by the tree in fig. 5.

Figure 5: clause

Subtrees can be combined to a complete proof tree, also called AND-OR tree. As an example, take the following short Prolog program:

    x(X) :- a(X).
    a(C) :- b(C), c(C).
    a(D) :- b(D), d(D).
    b(s(0)).
    c(s(0)).
    d(s(0)).

The AND-OR tree is shown in fig. 6. The thick lines belong to the AND-tree of the last solution, the thin lines belong to the AND-tree of the first solution. The AND-tree represents the calls of the different goals of a clause. The OR-tree represents the alternative solutions.

Figure 6: proof tree

The AND-OR tree can be represented in linearized form by a stack (see fig. 4). Since we are only interested in one solution at a time, only an AND-tree is stored in the stack. The OR-tree corresponds to different contents of the stack between backtracking. Fig. 4 represents the AND-OR tree at three different moments. The left part of the figure shows the first solution, the middle shows the stack after backtracking and the right part shows the second solution.

Figure 4: stacks (left: first solution; middle: after failure and backtracking; right: second solution, success)

The cells of structures are allocated on a stack. In fig. 4 the cells for the structure s(0) would be allocated after the stack frame for b(s(0)). When the stack frame for c(s(0)) is allocated, the stack frame for b(s(0)) can be discarded if there are no references into the discarded stack frame and if there are no structure cells on the stack. In order to allow memory reuse the stack is divided into two parts.
The environment (or local) stack holds the stack frames and the copy stack (global stack or heap) holds structure cells. The dangling reference problem can be solved if references within the environment stack are directed towards the bottom of the stack or to the heap.

In order to facilitate the removal of stack frames, there is a distinction between deterministic and indeterministic stack frames. A stack frame is deterministic if no alternative clauses are left for this procedure. An indeterministic stack frame is called choice point.

During unification variables in a stack frame may become bound. On backtracking they should be reset to unbound. An additional stack, the trail, solves this problem. During unification the addresses of bound variables are pushed onto the trail. On backtracking these addresses are popped from the trail and the variables are reset to unbound. It is only necessary to trail the addresses of variables which are closer to the bottom of the stack than the last choice point. Testing this condition is called trail check.

Figure 7: stack frame with choice point
    variables
    caller's goal
    caller's frame
    alternative clauses
    top of trail
    top of copy stack
    previous choice point

Fig. 7 shows a stack frame with a choice point. A deterministic stack frame contains the cells for the variables, a pointer to the caller of this clause, comparable to the return address in a conventional stack frame, and a pointer to the stack frame of the caller. These two pointers are usually called continuation. A choice point additionally contains a pointer to the next alternative clause, a pointer to the top of trail and a pointer to the top of copy stack at the time the choice point was created, and a pointer to the previous choice point.

Figure 8: data areas (copy stack, trail, environment stack, code area)

Fig. 8 shows the stacks and data areas in a Prolog system. The check for pointer directions is simplified if copy and environment stack grow in the same direction, and the copy stack grows towards the environment stack. The code area is needed to store the program and string representations of the atoms.

To enable fast unification, only unique identifiers of atoms are stored in variables. A hash table or search tree is constructed over these strings to enable fast searching when only the string representation is known. The same concept is applied to functors (name and arity of structures).

2.4 Simple Optimizations

2.4.1 The Representation of Terms

In the previous sections we have used a representation of structures known as structure copying. This technique was introduced by Maurice Bruynooghe [Bru82] and Christopher Mellish [Mel82]. Structure copying is now the standard implementation method because it is faster than the previously used structure sharing [BM72]. In general, structure copying also consumes less memory than structure sharing [Mel82].

Structure sharing is based on the assumption that a large part of a structure is constant and contains only few variables. A structure is here divided into a constant part, called skeleton, and a variable part, called environment. The skeleton contains the constants and the offsets into the environment, the environment contains the variables. The skeleton is stored in the code area, the environment in the global stack. A structure is represented by two pointers, one to the skeleton and one to the environment. Therefore, a variable cell has to hold a tag and two pointers. On modern machine architectures this means that a cell needs two words and that much time is spent decoding skeletons. Thus only the first Prolog systems [BM73] and David Warren's first Prolog compiler [War77] used structure sharing. In conjunction with binary Prolog (see section 3.3), however, structure sharing may become interesting again.

2.4.2 Interpreters and Compilers

We have not yet addressed the problem of how to represent programs. A simple solution is to use the term representation of the clauses directly. The interpreter then has two instructions: the unification, which operates on a whole goal and the head of the matching clause, and the resolution, which pushes whole clauses onto the stack and does the backtracking. This simple model is called clause or goal stacking model. Using structure sharing for the goal level of the term leads to the classical interpreter model with two term pointers and two environment (frame) pointers.
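The goal stacking model, together with the trail and the choice points of section 2.3, can be illustrated by a very small interpreter. The Python sketch below is an assumption-laden illustration, not the implementation of any system discussed here: terms are tuples `(functor, arg, ...)`, the class `Var` and the functions `unify`, `rename`, `solve` and `resolve` are hypothetical names, a choice point is reduced to a goal list, the next clause to try and the trail height, and every binding is trailed (there is no trail check).

```python
class Var:
    """A logic variable; ref is None while the variable is unbound."""
    def __init__(self):
        self.ref = None


def deref(t):
    while isinstance(t, Var) and t.ref is not None:
        t = t.ref
    return t


def unify(a, b, trail):
    a, b = deref(a), deref(b)
    if a is b:
        return True
    if isinstance(a, Var):
        a.ref = b
        trail.append(a)              # remember the binding for backtracking
        return True
    if isinstance(b, Var):
        b.ref = a
        trail.append(b)
        return True
    if isinstance(a, tuple) and isinstance(b, tuple):
        if a[0] != b[0] or len(a) != len(b):
            return False
        return all(unify(x, y, trail) for x, y in zip(a[1:], b[1:]))
    return a == b


def rename(t, mapping):
    """Copy a clause term with fresh variables (structure copying)."""
    if isinstance(t, Var):
        return mapping.setdefault(t, Var())
    if isinstance(t, tuple):
        return (t[0],) + tuple(rename(x, mapping) for x in t[1:])
    return t


def undo(trail, height):
    """Reset all variables bound since the trail had the given height."""
    while len(trail) > height:
        trail.pop().ref = None


def solve(program, goals):
    """Yield once per solution; bindings are valid while suspended."""
    trail = []
    # Entry: (goal list, index of the next clause to try, trail height)
    choice_points = [(list(goals), 0, 0)]
    while choice_points:
        goals, next_clause, height = choice_points.pop()
        undo(trail, height)                      # backtracking
        if not goals:
            yield
            continue
        goal, rest = goals[0], goals[1:]
        for i in range(next_clause, len(program)):
            head, body = program[i]
            mapping = {}
            if unify(goal, rename(head, mapping), trail):
                # Later clauses remain as alternatives for this goal.
                choice_points.append((goals, i + 1, height))
                new_goals = [rename(g, mapping) for g in body] + rest
                choice_points.append((new_goals, 0, len(trail)))
                break
            undo(trail, height)                  # undo partial bindings


def resolve(t):
    """Replace bound variables by their values for inspection."""
    t = deref(t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(x) for x in t[1:])
    return t
```

Run on the six-clause example program of section 2.3 with the query x(Q), this sketch yields two solutions, each binding Q to s(0): the first through the clause using c, the second, after backtracking, through the clause using d.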

Unification in general consists of assignments, conditional assignments and comparisons. So it is quite natural to break the unification up into its atomic parts. The program is analysed and instructions specialized for the argument types of the goals are generated. The resolution can be divided into stack allocation, clause indexing and calling instructions. The program is represented as a sequence of such instructions which can be either executed by an interpreter or compiled to machine code. Such an instruction set definition together with the memory model is called an abstract machine. Several abstract machines have been defined; this paper deals only with the common Warren Abstract Machine (WAM) and the Vienna Abstract Machine (VAM).

2.4.3 Variable Classification

In the simple execution model presented above it is assumed that during allocation of a stack frame all variable cells are initialized to unbound. Furthermore, for every variable occurring in a clause a cell is allocated.

Variables occurring only once in a clause, called void variables, can be bound only by a single instruction. The value bound to such a variable will never be used. So it is not necessary to reserve space for such variables. Another case are variables which occur only within one subgoal. It is not necessary to reserve their space across different goals. Space for these temporary variables is not reserved in the stack frame but in an additional fixed area.
To avoid dangling pointers, references must always point from temporary variables to the environment or copy stack.

The initialization of the stack frame and of temporary variables can be eliminated if the first occurrence and further occurrences of a variable are distinguished. The improvement comes not only from the elimination of some initializations but also from the elimination of a complex unification for the first occurrence.

2.4.4 Clause Indexing

Indexing of Prolog clauses is an optimization whose aim is to reduce the number of clauses to be tried and to avoid the creation of choice points if possible. The results are better execution times and lower memory consumption.

The most trivial optimization, done by every Prolog system, is to try only the clauses of a procedure instead of all clauses of a program during the search for a unifying clause. First argument indexing is more complicated: only clauses which unify with the goal in the first argument are selected. For this purpose an indexing structure is built over the clauses which differentiates the clauses depending on their first arguments. This indexing structure is either a hash table or a search tree. The search tree has the advantage that it easily handles variables in the head of the clauses and allows dynamic clause insertion. Sophisticated clause indexing schemes are presented in section 5.2.

2.4.5 Last-call Optimization

In section 2.3 we noticed that stack frames can be discarded after the subtree has been proved and no alternatives are left. This check is simple. The stack frame has to be the top-most frame, and no choice point allocated later may be left on the stack. A deterministic stack frame can be discarded not only after the call of the last subgoal, but also before this call. The general solution is to copy the stack frame of the called clause over the stack frame of the clause with the last call after the unification of the variables has been done (see fig. 9).

Figure 9: general last-call optimization (frames labelled: called clause, last call)

This frame moving is complicated by the fact that there could be references to the moved stack frame and references to unbound variables in the discarded stack frame. Therefore, the variables have to be checked and updated prior to the moving of the stack frame. Instead of updating the references, the variables can be globalized, that is, allocated on the global (copy) stack. The overhead of moving the stack frame can be avoided by copying the discarded stack frame to registers. The new stack frame is then created directly at the place of the discarded frame (WAM). Another solution is to create the new stack frame in registers and copy the registers to the place of the discarded frame (VAM).

Last-call optimization can be generalized to every call. A deterministic stack frame can be moved over that part of a stack frame which is not used at later calls. For that purpose the variables have to be ordered by their last occurrence. A simple stack trimming without the overhead of generalized last-call optimization can be achieved by discarding only variables which have their last occurrence before the call. Last-call optimization can reduce an infinite memory consumption to a finite one. So it has to be implemented in every Prolog system. Specialized implementations also reduce the run time because unifications can be eliminated if variables occupy the same location.

2.4.6 Garbage Collection

In Prolog unreferenced data (garbage) can be produced both in the code area and in the copy stack, but different kinds of garbage collection algorithms can be applied to these data areas. At least the copy stack needs a compacting collector which preserves the order of the cells. An algorithm which uses pointer reversal has the best space-time complexity. When the copy stack is compacted, the trail must be updated too. Some Prolog garbage collectors collect only part of the stack due to wrong interpretations of uninitialized variables. Unused data in the code area is easily detected by the retract procedure. If the code is not moved, no updates of the environment stack are necessary.

3 The Warren Abstract Machine

Six years after the development of his successful compiler for the DEC-10, David Warren presented a new abstract Prolog instruction set [War83]. This New Prolog Engine has become very popular under the name Warren Abstract Machine (WAM). It has been the basis of nearly all Prolog systems developed after 1983. The aim of the WAM was to serve as a simple and efficient implementation model for byte code interpreters as well as machine code generating compilers. So the first implementation was a structure copying byte code emulator.

3.1 The Original Warren Abstract Machine

The WAM is closer to the execution model of imperative languages than all other implementation models. The main idea is the division of the unification into two parts: the copying of the arguments of the calling goal into argument registers, and the unification of the argument registers with the arguments of the head of the called clause. This is very similar to the parameter passing in imperative languages like C. The first parameters are passed via registers. If the registers are exhausted, the stack can be used for additional parameters. The partitioning of the unification reduces the number of instruction pointers to one and the number of frame pointers to one, if all parameters can be kept in registers.

This parameter passing is mirrored in the instruction set. The put instructions copy the arguments of the goal into the registers, the get instructions unify the registers with the arguments of the head. The unify instructions handle the unification of structure arguments. They can be executed in two modes. In write mode a new structure is created, in read mode the structure arguments are unified with the arguments of the head. Procedural instructions manage the stack and execute procedure calls. Indexing instructions build the indexing structure. The data areas are identical to the previously presented simple model (see fig. 10), but the choice point is quite different. The original WAM added a push down list used as a stack for the recursive unification procedure. But in a byte code emulator this push down list is hidden in the run time stack of the implementation language. In a machine code generating compiler the environment or the copy stack can be used for this purpose.

Figure 10: data areas of the WAM (stack, heap, trail, code area, with the machine registers pointing into them)

Since all variables in the stack frame are copied into the argument registers before calling a procedure, last-call optimization is simplified. The stack frame of the called procedure can be created directly at the place of the stack frame of a deterministic caller. To avoid the overhead of recreating the argument registers on backtracking using put instructions, the argument registers are saved in the choice point. This permits last-call optimization also in those cases where the called procedure has alternative clauses.
Furthermore, this leads to a relaxed definition of temporary variables. The head, the first subgoal and all builtin predicates between head and first subgoal count as one subgoal for the classification of temporary variables. Unfortunately, the problem of dangling references is not solved. Therefore, there are special versions of put instructions which check if the last occurrence of a variable in the last subgoal has a reference to the discarded stack frame. Such variables are called unsafe variables and are allocated on the copy stack.

After this introduction we can present the machine registers of the WAM:

P            program counter
CP           continuation program counter
E            current environment pointer
B            most recent choice point
A            top of stack (not strictly necessary)
TR           top of trail
H            top of heap
S            structure pointer
A1, A2, ...  argument registers
X1, X2, ...  temporary registers

The continuation program counter is a register which caches the pointer to the continuation goal. It can be compared with the return address in an imperative language. Holding this value in a register speeds up the execution of leaf procedures. The environment pointer is comparable to the frame pointer in an imperative language. The original WAM contained an HB register (heap backtrack point) which caches the top of heap corresponding to the most recent choice point. It is used to check if a variable has to be trailed. In general it is faster to take this value directly from the choice point than to update this register at every choice point creation and deallocation. The structure pointer S is used during the unification of the arguments of structures. Although named differently, argument registers and temporary registers share the same pool of registers. Register allocation tries to use the registers in such an order that the number of instructions can be reduced.

The environment contains the local variables, the continuation code pointer CP' and a pointer to the previous environment E'. The choice point is shown in fig. 11. B', H', TR', CP', E' and the Ai' are copies of the values of the machine registers before the creation of the choice point.
The value BP of the retry pointer is supplied by the instruction which creates the choice point and points to the code of the next alternative clause.

Figure 11: choice point in the WAM
    B'            previous choice point
    H'            top of heap
    TR'           top of trail
    BP            retry program pointer
    CP'           continuation program pointer
    E'            environment pointer
    A1' ... An'   argument registers

Fig. 12 shows the complete WAM instruction set. Vn describes either temporary or local variables. Ri designates the argument registers. C is a constant (integer or atom) in its internal representation and F is the functor of a structure, which contains the name and the arity of the structure.

3.2 Optimizing the basic WAM

In an interpreter the execution mode of the unify instructions is hidden in the state of the interpreter. There are just two instruction decoding loops, one for the read mode and one for the write mode. In a machine code generating compiler the mode has to become explicit. The simple solution of a flag register, which is checked in every instruction, is not very efficient. The first step is to divide the unify instructions into write and read instructions. The optimal solution, which splits all paths for read and write mode, has exponential code size. Linear space is consumed if the mode flag is only tested once per structure. This scheme can be improved if write mode is propagated down a nested structure and read mode is propagated up. A more detailed description and further references can be found in [Roy94].

In the WAM it is very common that unbound variables are bound to a value shortly after their initialization. This happens, for example, if a variable has its first occurrence in the subgoal which calls a procedure with a constant argument. The variable has to be created in memory and needs to be dereferenced and trailed before being bound. Beer [Bee88] recognized that this is time consuming and additionally would require an occur check if implemented.
He developed the idea of an uninitialized variable.

An uninitialized variable is defined to be an unbound variable that is unaliased, that is, it is not shared with another variable. Such a variable gets a special reference tag. Creation of an uninitialized variable is simpler, and it does not have to be dereferenced or trailed. Binding reduces to a single store operation. It is necessary to keep track of such variables at run time. If they remain uninitialized after the execution of the subgoal in which they have been created, they must be initialized to unbound.
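The two execution modes of the unify instructions can be made concrete with a small emulator fragment. The Python sketch below shows the simple variant with a mode flag that every unify instruction checks. It is an illustration under assumptions: the class `Machine`, the cell tags and the simplified heap layout (functor cell followed by argument cells, as in section 2.2) are hypothetical, and the methods only borrow the get/unify naming of the text rather than reproducing the WAM definition.

```python
REF, CON, FUN = "ref", "con", "fun"   # hypothetical cell tags


class Machine:
    def __init__(self):
        self.heap = []      # list of (tag, value) cells
        self.trail = []     # addresses of bound variables
        self.mode = None    # "read" or "write", set by get_structure
        self.s = None       # structure pointer S (used in read mode)

    def push(self, tag, value):
        self.heap.append((tag, value))
        return len(self.heap) - 1

    def new_var(self):
        addr = len(self.heap)
        self.heap.append((REF, addr))        # unbound: self-reference
        return addr

    def deref(self, addr):
        while True:
            tag, val = self.heap[addr]
            if tag != REF or val == addr:
                return addr
            addr = val

    def bind(self, addr, target):
        self.heap[addr] = (REF, target)
        self.trail.append(addr)

    def get_structure(self, functor, addr):
        """Head argument is a structure; addr holds the goal argument."""
        addr = self.deref(addr)
        tag, val = self.heap[addr]
        if tag == REF:                       # unbound: build the structure
            self.bind(addr, self.push(FUN, functor))
            self.mode = "write"
            return True
        if tag == FUN and val == functor:    # existing structure: match it
            self.s = addr + 1
            self.mode = "read"
            return True
        return False

    def unify_constant(self, c):
        if self.mode == "write":             # write mode: create the cell
            self.push(CON, c)
            return True
        addr = self.deref(self.s)            # read mode: unify with argument
        self.s += 1
        tag, val = self.heap[addr]
        if tag == REF:
            self.bind(addr, self.push(CON, c))
            return True
        return tag == CON and val == c

    def unify_variable(self):
        """First occurrence of a head variable; returns its address."""
        if self.mode == "write":
            return self.new_var()
        addr = self.s
        self.s += 1
        return addr
```

Executing the instruction sequence for a head argument f(a, Y) against an unbound goal argument runs in write mode and builds the structure. Executing the same sequence against the resulting structure runs in read mode and only compares. A compiler that splits the paths, as discussed above, would replace the flag test inside each method by separate read and write code.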

3.3 Binary Prolog

The key idea of binary Prolog is the transformation of clauses to binary clauses using a continuation passing style. BinProlog, an efficient emulator for binary Prolog, has been developed by Paul Tarau [Tar91] [Tar92]. The implementation is based on the WAM, which can be greatly simplified in that case.

In binary Prolog a clause has at most one subgoal. A clause can be transformed to a binary clause by representing the call of subgoals explicitly using continuations [App92]. For that purpose the first subgoal is given an additional argument containing the success continuation. The success continuation is the list of subgoals to be executed if the first subgoal is executed successfully. The head is given an additional argument which passes on the continuation. A fact is transformed to a clause whose subgoal executes a meta-call of the continuation. For example, the following clauses

    nrev([], []).
    nrev([H|T], R) :- nrev(T, L), append(L, [H], R).

are transformed into

    nrev([], [], Cont) :- call(Cont).
    nrev([H|T], R, Cont) :- nrev(T, L, append(L, [H], R, Cont)).

When compiling binary Prolog to the WAM, it appears that the environment stack is superfluous since all variables are temporary. Therefore, all instructions dealing with local variables or managing the stack can be eliminated, and a small and efficient interpreter can be implemented. But this simplification has a big problem. The continuation, which also contains the variables previously contained in the stack frame, is stored on the copy stack. This means that there is no last-call optimization. So for a working BinWAM an efficient garbage collector is crucial. In some sense the BinWAM can be seen as a mixture of a clause stacking model with the WAM.

4 The Vienna Abstract Machine

4.1 Introduction

The VAM has been developed at the TU Wien as an alternative to the WAM. The WAM divides the unification process into two steps.
During the first step the arguments of the calling goal are copied into argument registers and during the second step the values in the argument registers are unified with the arguments of the head of the called predicate. The VAM eliminates the register interface by unifying goal and head arguments in one step. The VAM can be seen as a partial evaluation of the call. There are two variants of the VAM, the VAM1P and the VAM2P.

A complete description of the VAM2P can be found in [KN90]. Here we give a short introduction to the VAM2P which helps to understand the VAM1P and the compilation method. The VAM2P (VAM with two instruction pointers) is well suited for an intermediate code interpreter implemented in C or in assembly language using direct threaded code [Bel73]. The goal instruction pointer points to the instructions of the calling goal, the head instruction pointer points to the instructions of the head of the called clause. During an inference the VAM2P fetches one instruction from the goal and one instruction from the head, combines them and executes the combined instruction. Because information about the calling goal and the called head is available at the same time, more optimizations than in the WAM are possible. The VAM features cheap backtracking, needs less dereferencing and trailing, has smaller stack sizes and implements a faster cut.

The VAM1P (VAM with one instruction pointer) uses one instruction pointer and is well suited for native code compilation. It combines instructions at compile time and supports additional optimizations like instruction elimination, resolving temporary variables during compile time, extended clause indexing, fast last-call optimization, and loop optimization.

4.2 The VAM2P

Like the WAM, the VAM2P uses three stacks. Stack frames and choice points are allocated on the environment stack, structures and unbound variables are stored on the copy stack, and bindings of variables are marked on the trail.
The intermediate code of the clauses is held in the code area. The machine registers are the goalptr and headptr (pointers to the code of the calling goal and of the called clause respectively), the goalframeptr and the headframeptr (frame pointers of the clause containing the calling goal and of the called clause respectively), the top of the environment stack, the top of the copy stack, the top of the trail, and the pointer to the last choice point.

Values are stored together with a tag in one machine word. We distinguish integers, atoms, nil, lists, structures, unbound variables and references. Unbound variables are allocated on the copy stack to avoid dangling references and the unsafe variables of the WAM. Furthermore, this simplifies the check for the trailing of bindings. Structure copying is used for the representation of structures.
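The fetch-combine-execute step of the VAM2P can be sketched as a dispatch on the sum of a goal instruction code and a head instruction code. The Python fragment below is a toy illustration: the four instruction kinds, their numeric codes, the dictionaries standing in for stack frames and the function `unify_args` are all assumptions made for this example. Only the principle, that goal and head codes are chosen so that every sum is unique and can index a table of combined instructions, follows the description of the VAM2P.

```python
# Hypothetical instruction codes. Goal codes are 0..1 and head codes are
# multiples of 2, so the sum of one goal and one head code is unique.
G_CONST, G_FIRSTVAR = 0, 1
H_CONST, H_FIRSTVAR = 0, 2


class Unbound:
    """A fresh unbound cell shared by a goal and a head variable."""


def const_const(g, h, goal_frame, head_frame):
    return g == h                    # pure comparison, no memory access


def firstvar_const(g, h, goal_frame, head_frame):
    goal_frame[g] = h                # first occurrence: a plain store
    return True


def const_firstvar(g, h, goal_frame, head_frame):
    head_frame[h] = g
    return True


def firstvar_firstvar(g, h, goal_frame, head_frame):
    cell = Unbound()
    goal_frame[g] = cell
    head_frame[h] = cell
    return True


# Table of combined instructions, indexed by goal code + head code.
COMBINED = [None] * 4
COMBINED[G_CONST + H_CONST] = const_const
COMBINED[G_FIRSTVAR + H_CONST] = firstvar_const
COMBINED[G_CONST + H_FIRSTVAR] = const_firstvar
COMBINED[G_FIRSTVAR + H_FIRSTVAR] = firstvar_firstvar


def unify_args(goal_code, head_code):
    """Step two instruction streams in lockstep, like goalptr and headptr.
    Returns (goal_frame, head_frame) on success and None on failure."""
    goal_frame, head_frame = {}, {}
    for (g_op, g_arg), (h_op, h_arg) in zip(goal_code, head_code):
        if not COMBINED[g_op + h_op](g_arg, h_arg, goal_frame, head_frame):
            return None
    return goal_frame, head_frame
```

For the goal p(a, X) and the head p(a, b), the first pair of instructions reduces to a comparison of two constants and the second to a single store into X. No argument registers are involved, which is the point of unifying goal and head arguments in one step.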

Figure 13: VAM data areas (copy stack, trail, environment stack, code area; registers copyptr, trailptr, choicepntptr, goalframeptr, headframeptr, goalptr, headptr)

Figure 15: stack frame
    variables       local variables
    goalptr'        continuation code pointer
    goalframeptr'   continuation frame pointer

Figure 16: choice point
    trailptr'       copy of top of trail
    copyptr'        copy of top of copy stack
    headptr'        alternative clauses
    goalptr'        restart code pointer (VAM2P)
    goalframeptr'   restart frame pointer
    choicepntptr'   previous choice point

Variables are classified into void, temporary and local variables. Void variables occur only once in a clause and need neither storage nor unification instructions. In contrast to the WAM, temporary variables occur only in the head or in one subgoal, counting a group of builtin predicates as one goal. The builtin predicates following the head are treated as belonging to the head. Temporary variables need storage only during one inference and can be held in registers. All other variables are local and are allocated on the environment stack. During an inference the variables of the head are held in registers. Prior to the call of the first subgoal the registers are stored in the stack frame. To avoid initialization of variables we distinguish between their first occurrence and further occurrences.

The clauses are translated to the VAM2P abstract machine code (see fig. 14). This translation is simple due to the direct mapping between source code and VAM2P code. During run time a goal and a head instruction are fetched and the two instructions are combined. Unification instructions are combined with unification instructions and resolution instructions are combined with termination instructions. A different encoding is used for goal unification instructions and head unification instructions.
To enable fast encoding, the instruction combination is solved by adding the instruction codes; therefore, the sum of two instruction codes must be unique.

4.3 The VAM1P

The VAM1P has been designed for native code compilation. A complete description can be found in [KB92]. The main difference to the VAM2P is that instruction combination is done during compile time instead of run time. The representation of data, the stacks and the stack frames (see fig. 15) are identical to the VAM2P. The VAM1P has one machine register less than the VAM2P. The two instruction pointers goalptr and headptr are replaced by one instruction pointer called codeptr. Therefore, the choice point (see fig. 16) is also smaller by one element since there is only one instruction pointer. The pointer to the alternative clauses now points directly to the code of the remaining matching clauses.

Due to instruction combination during compile time it is possible to eliminate instructions, to eliminate all temporary variables and to use an extended clause indexing, a fast last-call optimization and loop optimization. In WAM based compilers abstract interpretation is used to derive information about mode, type and reference chain length. Some of this information is locally available in the VAM1P because the information of the calling goal is available.

All constants and functors are combined and evaluated to true or false. For a true result no code is emitted. All clauses which have an argument evaluated to false are removed from the list of alternatives. In general no code is emitted for a combination with a void variable. In a combination of a void variable with the first occurrence of a local variable, the next occurrence of this variable is treated as the first occurrence.

Temporary variables are eliminated completely. The unification partner of the first occurrence of a temporary variable is unified directly with the unification partners of the further occurrences of the temporary variable.
If the unification partners are constants, no code is emitted at all. Flattened code is generated for structures. The paths for unifying and copying structures are split and different code is generated for each path. This makes it possible to reference each argument of a structure as an offset from the top of the copy stack or as an offset from the base pointer of the structure. If a temporary variable is contained in more than one structure, combined unification or copying instructions are generated.

All necessary information for clause indexing is computed during compile time. Some alternatives are eliminated because of failing constant combinations. The remaining alternatives are indexed on the argument that contains the most constants or structures. For reasons of compatibility with the VAM2P a balanced binary tree is used for clause selection.

The VAM1P implements two versions of last-call optimization. The first variant (we call it post-optimization) is identical to that of the VAM2P. If the determinacy of a clause can be determined during run time, the registers containing the head variables are stored in the caller's stack frame. Head variables which reside in the stack frame due to the lack of registers are copied from the head (callee's) stack frame to the goal (caller's) stack frame.

If the determinacy of a clause can be detected during compile time, the caller's and the callee's stack frames are equal. Now all unifications between variables with the same offset can be eliminated. If not all head variables are held in registers, reading and writing variables must be done in the right order. We call this variant of last-call optimization pre-optimization.

Loop optimization is done for a determinate recursive call of the last and only subgoal. The restriction to a single subgoal is due to the use of registers for value passing and possible aliasing of variables. Unification between two structures is performed by unifying the arguments directly. The code for the unification of a variable and a structure is split into unification code and copy code.

5 Optimizations

5.1 Abstract Interpretation

Information about types, modes, trailing, reference chain length and aliasing of variables of a program can be inferred using abstract interpretation. Abstract interpretation is a technique of describing and implementing global flow analysis of programs.
It was introduced by [CC77] for dataflow analysis of imperative languages. This work was the basis of much of the recent work in the field of logic programming [AH87] [Bru91] [Deb92] [Mel85] [RD92] [Tay89]. Abstract interpretation executes programs over an abstract domain. Recursion is handled by computing fixpoints. To guarantee the termination and completeness of the execution a suitable choice of the abstract domain is necessary. Completeness is achieved by iterating the interpretation until the computed information no longer changes. Termination is assured by bounding the size of the domain. The systems cited above are all meta-interpreters written in Prolog and are very slow.

A practical implementation of abstract interpretation has been done by Tan and Lin [TL92]. They modified a WAM emulator implemented in C to execute the abstract operations on the abstract domain. They used this abstract emulator to infer mode, type and alias information. They analysed a set of small benchmark programs in a few milliseconds, which is about 150 times faster than the previous systems.

5.2 Sophisticated Clause Indexing

The standard indexing method used in WAM-based Prolog systems can create two choice points. Therefore, this method has been called two-level indexing. Carlsson [Car87] introduced one-level indexing by delaying the creation of a choice point as long as possible. By discriminating first on the type of the first argument and, when appropriate, on its principal functor, the set of potentially matching clauses is filtered out. A choice point is then needed only for non-singleton sets. In the worst case the number of indexing instructions can be quadratic in the number of clauses. The VAM2P uses pointers instead of indexing instructions to avoid two-level indexing and to enable assert and retract [Kra88]. A similar strategy is used in [DMC89].

The use of field encoded and superimposed code words for clause indexing was proposed by Wise and Powers [WP84] and was refined by Colomb [CJ86] [Col91].
The method is based on content addressable memory (CAM). The CAM consists of an array of bit columns. Logical operations on columns and lines of the CAM can be computed in one cycle. The results of operations can be held in result columns or lines. The idea is to hold hash values for the arguments of a clause in the CAM. The encoding scheme is based on m-in-n coding, which sets m bits in a word of size n to 1. Field encoding uses n/2-in-n coding and gives each argument some bits of a line. Superimposed coding uses m-in-n coding, where n is the size of a whole line and m is so small that m times the number of arguments is n/2. Variables are either represented by a special column or by hash values with all bits set to 1 or 0. The CAM is fast, but too specialized and expensive to be used in general purpose computer systems.

In [KS88] Kliger and Shapiro describe an algorithm for the compilation of an FCP(|,:,?) procedure into a control-flow decision tree that analyses the possible data states in a procedure call. This tree is translated to a header for the corresponding machine code of the predicate. At run time the generated instructions control the flow, which finally reaches the jump instruction pointing to the correct clause. Redundant tests in a process reduction attempt are eliminated and the candidate clause is found efficiently. The decision tree may need program space exponential in the number of clauses and argument positions. Consequently, in [KS90] they choose decision graphs rather than decision trees to encode the possible traces of each predicate.

Hickey and Mudambi [HM89] were the first to apply decision trees as an indexing structure to Prolog. They compile a program as a whole and apply mode inference to determine which arguments are bound. The decision tree is compiled into switching instructions which can be combined with unification instructions and primitive tests. Equivalent unifications which occur in different clauses are therefore evaluated only once. Reusing the result of such a unification requires a consistent register use. A complete algorithm for generating indexing schemes is presented which takes into account the effects of aliasing and gives a consistent register use. They also show that the size of the switching tree is exponential in the worst case and that finding an optimal switching tree is NP-complete. For cases where the size of the switching tree is a problem they also present a quadratic indexing algorithm. In general the size is no problem and the speedup is a factor of two.

Palmer and Naish [PN91] and Hans [Han92] also noticed the potential exponential size of decision trees. They compute the decision tree for each argument separately and store the set of applicable clauses for each argument. At run time the arguments are evaluated and the intersection of the applicable clause sets of each argument is computed. The disadvantage of this method is the high run time overhead.
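
The per-argument scheme just described can be illustrated with a small sketch. It is not the algorithm of [PN91] or [Han92]; it only shows the idea of one clause set per argument position, intersected at call time. The names `build_index` and `candidates`, the flat keys used instead of decision trees, and the use of `None` for variables are all assumptions made for this illustration.

```python
from collections import defaultdict

VAR = None  # assumption: None marks a variable (unbound) argument


def build_index(clauses):
    """Build one table per argument position.

    clauses: list of tuples, one tuple of head-argument keys per clause
             (a constant or functor name, or VAR).
    Returns a list of (key -> set of clause numbers, clauses with a
    variable at this position) pairs, one pair per argument position.
    """
    arity = len(clauses[0])
    index = []
    for pos in range(arity):
        by_key = defaultdict(set)
        var_clauses = set()
        for number, head in enumerate(clauses):
            if head[pos] is VAR:
                var_clauses.add(number)  # matches any call argument
            else:
                by_key[head[pos]].add(number)
        index.append((dict(by_key), var_clauses))
    return index


def candidates(index, n_clauses, call_args):
    """Intersect the applicable clause sets of all bound call arguments."""
    result = set(range(n_clauses))
    for (by_key, var_clauses), arg in zip(index, call_args):
        if arg is VAR:
            continue  # an unbound call argument filters nothing
        result &= by_key.get(arg, set()) | var_clauses
    return sorted(result)  # keep textual clause order


clauses = [("a", "x"), ("a", "y"), ("b", VAR), (VAR, "x")]
index = build_index(clauses)
print(candidates(index, len(clauses), ("a", "x")))  # [0, 3]
```

Every call pays for one set intersection per bound argument, which is the run time overhead mentioned above.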
Furthermore, the size of the clause sets is quadratic in the number of clauses, whereas decision trees are rarely exponential with respect to the number of arguments.

5.3 Stack Checking

Since a Prolog system has many stacks, the run time checking of stack overflow can be very time consuming. There are two methods to reduce this overhead. The more effective one uses the memory management unit of the processor to perform the stack check. A write protected page of memory is allocated between the stacks. Catching the operating system trap makes it possible to give the user a more meaningful error message. A problem with this scheme occurs in combination with garbage collection. The trap can occur at a point in the program where the internal state of the system is unclear, so that it is difficult to start garbage collection.

The second idea is to reduce the scattered overflow checks to one check per call. It is possible to compute at compile time the maximum number of cells allocated during a single call on the copy and the environment stack. If these stacks grow into one another (possible only if no references are on the environment stack) both stacks can be tested with a single overflow check. The maximum use of the trail during a call cannot be determined at compile time.

5.4 Instruction Scheduling

Modern processors can issue instructions while preceding instructions are not yet finished and can issue several instructions in each cycle. It can happen that an instruction has to wait for the results of another instruction. Instruction scheduling tries to reorder instructions so that they can be executed in the shortest possible time.

The simplest instruction schedulers work on basic blocks. The most common technique is list scheduling [War90]. It is a heuristic method which yields nearly optimal results. It encompasses a class of algorithms that schedule operations one at a time from a list of operations to be scheduled, using prioritization to resolve conflicts.
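
A minimal sketch of list scheduling for one basic block follows. It is an illustration under stated assumptions, not the scheduler of [War90]: the function name `list_schedule`, the dictionary-based dependence graph, the `width` parameter for the number of instructions issued per cycle, and the example latencies are all hypothetical. The dependence graph is assumed to be acyclic and every latency at least one cycle. The priority used here is the longest latency-weighted path to the end of the block.

```python
def list_schedule(latency, deps, width=1):
    """Greedy list scheduling of one basic block.

    latency: dict op -> cycles until its result is available (>= 1)
    deps:    dict op -> set of ops whose results it needs (acyclic)
    width:   number of instructions that may issue per cycle
    Returns a dict op -> issue cycle.
    """
    succs = {op: [] for op in latency}
    for op, preds in deps.items():
        for pred in preds:
            succs[pred].append(op)

    # Priority: longest latency-weighted path from op to the block end.
    priority = {}

    def path_length(op):
        if op not in priority:
            tail = max((path_length(s) for s in succs[op]), default=0)
            priority[op] = latency[op] + tail
        return priority[op]

    for op in latency:
        path_length(op)

    issue = {}
    cycle = 0
    while len(issue) < len(latency):
        # An op is ready once all operands have been produced.
        ready = [
            op for op in latency
            if op not in issue
            and all(p in issue and issue[p] + latency[p] <= cycle
                    for p in deps.get(op, ()))
        ]
        # Resolve resource conflicts in favour of the higher priority.
        ready.sort(key=lambda op: -priority[op])
        for op in ready[:width]:
            issue[op] = cycle
        cycle += 1
    return issue


latency = {"load": 2, "add": 1, "mul": 3, "store": 1}
deps = {"add": {"load"}, "store": {"add", "mul"}}
print(list_schedule(latency, deps, width=1))
# {'load': 0, 'mul': 1, 'add': 2, 'store': 4}
```

With `width=2` the load and the multiply issue together in cycle 0 and the store can issue in cycle 3.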
If there is a conflict between instructions for a processor resource, this conflict is resolved in favour of the instruction which lies on the longest executing path to the end of the basic block. A problem with basic block scheduling is that in Prolog basic blocks are small due to tag checks and dereferencing. Instruction scheduling therefore relies on global program analysis to eliminate conditional instructions and increase basic block sizes. Just as important is alias analysis. Loads and stores can be moved around freely only if they do not address the same memory location.

A technique called trace scheduling can be applied to schedule the instructions for a complete inference [Fis81]. A trace is a possible path through a section of code. In general it would be the path from the entry of a call to the exit of a call. Trace scheduling uses list scheduling, starting with the most frequent path and continuing with less frequent paths. During scheduling it can happen that an instruction has to be moved over a branch or join. In this case compensation code has to be inserted on the other path. In Prolog the less frequent path is often the branch to the backtracking code. In such cases it is often not necessary to compensate for the moved instruction.

Acknowledgement

We express our thanks to Alexander Forst, Franz Puntigam and Jian Wang for their comments on earlier drafts of this paper.
