
Maybe you're interested in learning to program in assembler, but don't know where to start. You've heard that this language is used to write boot sectors, device drivers, compilers and operating systems, so you try to write something like that, and then you flood the technical forums with questions, admitting that you're new to assembler and that something doesn't work. Before embarking on such specialized tasks, you need a basic orientation and some practice with tools such as an assembler, disassembler, debugger or analyser. This is best gained by using them often, first on simple examples like "Hello world", a calculator or ASCII graphics, and later on less trivial applications. You don't have to worry that typing in assembler is slower than in other languages; typing on the keyboard takes many times less time than thinking about the logic of the program, and this is true in all languages.

Suppose you have seen some assembly instructions somewhere that you'd like to try. Now you face the main problem of working with any new programming language: where to write these instructions and how to make your computer execute them. Here you'll learn how.

Expected basic knowledge

This tutorial is intended for those interested in programming in assembly language for x86 and x86-64 (Intel, AMD) personal computers. We will write programs for MS Windows, Linux and DOS.

Let's assume you have at least basic knowledge of English and can operate a computer, install and run programs from the command line, and edit files with a plain text editor (nano, joe, notepad, etc.). We also assume you know hexadecimal notation and can perform basic operations with hexadecimal numbers.

How a computer works

The smallest unit of information is one bit, whose physical realization in the CPU can be imagined as a flip-flop circuit with two stable states. Its output voltage is assigned a value of either logic 1 or logic 0 (and nothing in between). The flip-flop circuit remembers its state, which can be read or changed (written) on demand.
A group of flip-flop circuits connected so that their bits can be written or read at the same time is called a register. A register can be 8-bit, 16-bit, 32-bit, etc. up to 512-bit. Writing the contents of a register using zeros and ones would be cluttered; instead, the contents are usually written using two hexadecimal digits for each of its 8-bit bytes.

The registers are located on the CPU chip. Unlike memory, registers are very fast, but their number and size are limited. We will be working mainly with General Purpose Registers (GPR). The registers are accessed by the programmer using their name (not the address), and for most GPRs, subsets of registers can be named in addition to the entire register. The lower half of the 64-bit RAX register is called EAX, the lower quarter is AX, and the lower eighth is AL. For a complete list of registers see the manual. In some places we use the names rAX, rBX, etc. The lower case letter r here indicates that rAX can represent RAX, EAX or AX registers, depending on the current processor mode.

Similarly to RAX, the other general registers RBX, RCX, RDX, RBP, RSI, RDI, RSP, R8..R15 can also be divided.

The general registers are specialized or fixed in some instructions, which partly corresponds to the mnemonic of their name.

Other registers, such as ST0..ST7 (math coprocessor), MM0..MM7 (multimedia), YMM0..YMM15 (SIMD registers) are orthogonal, i. e. not specialized and interchangeable with one another.

In addition to registers, there are flip-flop circuits on the CPU chip called flags that are set automatically after certain operations, especially arithmetic. We will be mainly interested in the following:

Other flags (Parity, Auxiliary, Trap, Interrupt) are not normally used in common applications. The flags can be viewed as separate flip-flop circuits with a memory of one bit. Only for the purpose of saving them on the stack are they grouped into a virtual register, which can be stored and restored by the instructions PUSHF and POPF, thus handling all flags at once.

The registers also include the instruction pointer rIP, which during the execution of any instruction holds the address of the next instruction, except for transfer (jump) instructions, which replace it with the destination address.

The Carry Flag is exceptional in that we can set it to 1 with STC, set it to 0 with CLC or change its value to the opposite with CMC. Similarly, we can also set and reset the Direction Flag using STD and CLD, and Interrupt Flag using STI and CLI. Other flags cannot be explicitly changed in this way but Zero Flag can be set to 1 by zeroing any register with SUB reg, reg.

The CPU is connected to memory by a data bus and an address bus (sets of wires). Whenever the CPU needs to read or write something, it sets the address on the address bus and reads or writes the data on the data bus.

Reading from and writing to devices works in a similar way. Devices include the keyboard, monitor, mouse, network card, and other peripherals. Unlike memory, the values used to select them are not called addresses but ports, e.g. a keyboard has a fixed port of 64h, a printer has a port of 378h, etc. For an overview of personal computer ports, see TechHelp.

In general, from the point of view of an assembler programmer, it can be said that

the processor reads some information from memory or a device into a register, manipulates it, and then writes it somewhere.

This manipulation can be an arithmetic or logical operation, changing bits, setting to some value, etc. The steps to be performed by the processor are determined by machine instructions. These have a variable length of 1 to 15 bytes and are stored in the operating memory, one after the other. The CPU fetches and executes them sequentially from memory.

Each instruction has a mnemonic abbreviation (specified by the CPU manufacturer) followed by operands that specify where the information is to be read from and written to. The job of a program called an assembler is to convert the mnemonic abbreviations and operands into the binary code of machine instructions and store them in a file so that they are executable by the operating system.

A typical instruction has two operands, input and output, and in Intel syntax they are written in the order instruction output, input. For example, ADD EAX,ECX instructs the processor to add the contents of the input register ECX to the contents of the (output) register EAX. The contents of the ECX register remain unchanged. The contents of the general registers are treated as fixed-point integers.

Operational code     before executing ADD EAX,ECX
                          EAX              ECX
   ┌──┬──┐           ┌──┬──┬──┬──┐    ┌──┬──┬──┬──┐
   │01│C8│           │12│34│56│78│    │56│78│9A│BC│
   └──┴──┘           └──┴──┴──┴──┘    └──┴──┴──┴──┘
                            AH AL            CH CL

                     after executing ADD EAX,ECX
                          EAX              ECX
   ┌──┬──┐           ┌──┬──┬──┬──┐    ┌──┬──┬──┬──┐
   │01│C8│           │68│AC│F1│34│    │56│78│9A│BC│
   └──┴──┘           └──┴──┴──┴──┘    └──┴──┴──┴──┘
                            AH AL            CH CL
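The effect of ADD EAX,ECX can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not assembler code; the helper name add32 is made up for this illustration.

```python
MASK32 = 0xFFFFFFFF  # a 32-bit register holds values 0..FFFFFFFFh

def add32(dest, src):
    """Model ADD dest,src on 32-bit registers; return (new dest, Carry Flag)."""
    total = dest + src
    return total & MASK32, int(total > MASK32)

eax, ecx = 0x12345678, 0x56789ABC
eax, cf = add32(eax, ecx)                       # ADD EAX,ECX
print(f"EAX={eax:08X} ECX={ecx:08X} CF={cf}")   # EAX=68ACF134 ECX=56789ABC CF=0
```

The input operand is returned unchanged, just as ECX keeps its value in the diagram.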

The example above works with the 32-bit wide EAX and ECX registers, but we would apply the same to add registers with widths of 8, 16, 32, or 64 bits, for example, ADD AH,CL, ADD AH,CH, ADD RAX,RCX. Fractional registers that do not have a name, such as the upper half of EAX, the third eighth of RCX, etc., cannot be directly added in this way, but we could use rotation instructions to temporarily move the contents of the desired fractional register to the named part, perform the addition using ADD AL,CL, and then, if necessary, reverse the rotation to return the fractional register back to position.

In addition to the Intel syntax, there is also a syntax developed by AT&T, in which the input and output are swapped. However, we will not cover that here, since almost all assemblers and processor manufacturers prefer the Intel syntax.

When viewing assembly tutorials, we can notice many inconsistencies in the data description:

The contents of the ECX register in the ADD EAX,ECX example above are written in hexadecimal as 56 78 9A BC and thus begin with the most significant byte 56. This is seemingly inconsistent with storing a 32-bit word in memory starting with the least significant byte, but we think of a word in a register differently than a word in memory. If we were to store the ECX register in memory at, say, address 0 (which would be done with MOV [0],ECX) and then display the memory contents with a debugger or similar tool, we would see the bytes BC 9A 78 56 at addresses 0 to 3, i.e. the least significant byte first.

This shows that it depends on whether we view the contents of memory space as a multi-byte word or as a series of bytes.
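The two views of the same memory can be tried out in Python. The sketch below is an illustration only; the names store_dword and load_dword and the 8-byte "memory" are assumptions of this example.

```python
def store_dword(memory, address, value):
    """Model MOV [address],reg32: store 4 bytes, least significant byte first."""
    memory[address:address + 4] = value.to_bytes(4, "little")

def load_dword(memory, address):
    """Model MOV reg32,[address]: read 4 bytes back as one 32-bit word."""
    return int.from_bytes(memory[address:address + 4], "little")

memory = bytearray(8)                   # hypothetical 8 bytes of memory at address 0
store_dword(memory, 0, 0x56789ABC)      # MOV [0],ECX
print(memory[0:4].hex(" ").upper())     # BC 9A 78 56  (a series of bytes)
print(f"{load_dword(memory, 0):08X}")   # 56789ABC     (one 32-bit word)
```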

Processor modes

Personal computers nowadays run almost exclusively in protected mode. In this mode, the operating system protects itself, primarily to prevent users from inadvertently or deliberately disturbing the memory of the system or of other users who might be working on the computer at the same time. Access is denied if a program attempts to read or write memory that has not been allocated to it. Similarly, access to input/output ports is restricted. The direct read and write instructions IN and OUT are privileged in protected mode and their use is reserved for the operating system.

In the prehistoric times of computing, personal computers (then only 16-bit) ran in real mode, where the user had the entire operating memory and all ports to themselves. On today's PCs, this mode is only available through an emulator such as DOSBox, which is somewhat inconvenient compared to the native Linux or Windows environment. Yet, oddly enough, 16-bit mode is preferred in assembler courses, perhaps out of the (mistaken) belief that 16-bit mode is easier for beginners than 32-bit or 64-bit mode.

If the computer is switched to 16-bit real mode, either as part of emulation or by booting the computer into DOS, we will only have the 16-bit registers AX, BX, CX, DX, BP, SP, SI, DI available, and we must consider the segment registers CS, DS, SS, ES when addressing memory. We can address memory either by writing the address directly, e.g. MOV AX,[1234h], or by loading the address into a register first and then using that register for addressing:

MOV BX,1234h
MOV AX,[BX]

In real mode, at most one base register BX or BP and at most one index register SI or DI can be used for addressing, e. g. MOV AX,[BX+SI+1234h]. When BP is used, SS becomes the default segment register (unless explicitly specified otherwise); otherwise the default data segment register DS is used.

The address (offset), calculated as the sum of the contents of the BX and SI registers and the immediate value 1234h, is added to the contents of the 16-bit segment register multiplied by 16, and the resulting linear address is then used to access memory.
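The calculation can be sketched in Python as follows. The register values (DS=2000h, BX=0100h, SI=0020h) are arbitrary values chosen for this illustration, and the function name real_mode_address is hypothetical. The sketch assumes that the 16-bit offset wraps around at 64 KB.

```python
def real_mode_address(segment, *offset_parts):
    """Return the linear address segment*16 + offset used in real mode."""
    offset = sum(offset_parts) & 0xFFFF   # the offset is a 16-bit value
    return segment * 16 + offset

ds, bx, si = 0x2000, 0x0100, 0x0020       # assumed register contents
linear = real_mode_address(ds, bx, si, 0x1234)   # MOV AX,[BX+SI+1234h]
print(f"{linear:05X}")                    # 21354
```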

In protected 32-bit mode memory addressing is much simpler. We have 32-bit registers EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI, and we can use any combination of up to two of these registers to address memory, for example, MOV EAX,[ECX+EDX+12346789h]. If ESP or EBP is used in addressing, the segment register becomes the SS register instead of the default DS register. However, this usually doesn't matter because DS, SS and ES contain the same address in protected mode and we don't have to bother with segment registers at all. So it is obvious that 32-bit mode is much easier for the programmer than 16-bit mode.

Similar addressing rules are valid in 64-bit mode, where besides RAX, RBX, RCX, RDX, RBP, RSP, RSI, RDI we can also use the additional general registers R8..R15. The segment registers CS, DS, SS, ES play no role in addressing in 64-bit mode.

Data types

The data type determines how to view a data item: whether to treat it as an integer value, a floating-point number, a text string, a bitmap, or some other structure. See the manual. Unlike in higher-level programming languages, data types are not checked by the assembler. The basic data type is determined by the width of the data item (e.g. 8, 16, 32, 64 bits), but nothing prevents us from interpreting, for example, a 64-bit floating-point number as an integer or as a string of characters (and getting the wrong result, of course). We can display the contents of an item (memory location) as text, convert it to a number in the format expected by humans, play it as a sound, display it as a photo, whatever.

The data type is defined in the assembler by the operations we will perform on the item.
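How the same eight bytes yield different "values" depending on how we look at them can be tried in Python. This is a sketch using the standard struct module; the helper names float_bits and bits_as_text are made up for this example.

```python
import struct

def float_bits(number):
    """Return the 8 bytes of a 64-bit float reinterpreted as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", number))[0]

def bits_as_text(value):
    """Interpret the same 8 bytes as ASCII text (least significant byte first)."""
    return value.to_bytes(8, "little").decode("ascii", errors="replace")

print(f"{float_bits(1.0):016X}")          # 3FF0000000000000
print(bits_as_text(0x6F77206F6C6C6548))   # Hello wo
```

Nothing in the data itself says which interpretation is the right one; only the operations we apply decide.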

Machine instructions

What can we use as the output and the input in the format mnemonic output, input? There are four possibilities.

These instructions are called machine instructions because they are built into the machine (processor). In assembler source, we can also encounter pseudoinstructions that look similar, but they are directives for the assembler, not for the processor.

The latest processors of the x86-64 architecture can have over two thousand different machine instructions. The good news is that we only need a few dozen for normal programming, which we will now describe.

The best place to look for authoritative descriptions of each instruction is on the CPU manufacturer's web site, such as Intel. However, these are usually large PDFs that are difficult to work with and reference, so you'll probably prefer to look at their form converted to web-based HTML format.

Instructions for copying data

Probably the most versatile instruction is MOV, short for MOVE. This is a rather unfortunate choice of name; COPY would be better, as it is about copying data from one place to another, not moving it. The information in the input register or memory location is preserved. In fact, there is no way to remove information from a register or memory; there is always some information left, at least all zeros.

In addition to copying between 8,16,32,64-bit registers and memory locations, the same mnemonic MOV is also used to move between general (GPR) and other registers such as segment, control, debug, MMX or SIMD.

We use MOV from memory to register to load the contents of a memory location (byte, word, doubleword) into a register, which is expressed by enclosing the memory location in square brackets: MOV ECX,[1234h], MOV ECX,[EBX] etc. Omitting the brackets would load the address of the memory location instead of the contents stored in memory. Thus, MOV ECX,1234h would fill the ECX register with the number 1234h, and MOV ECX,EBX would just copy the contents of the EBX register into the ECX register. Not realizing the difference between the address (offset) of a memory location and its contents is a common source of error.

Another very commonly used instruction is LEA. While MOV register, [memory] fetches the contents of memory and MOV register, memory fetches the address of memory, LEA register, [memory] fetches the address even though the second operand is always in square brackets. The LEA instruction is also one byte longer than the MOV, so why use it? For instance when we are interested in calculating the address and don't want to know its contents (which may not even exist in memory). In 32-bit and 64-bit mode, we can use memory addressing with two general registers (one is the base register and the other is the index register), and the index register can be scaled, i.e. internally multiplied by two, four or eight. This allows us to use address arithmetic for simpler calculations:
LEA EDX,[ESI+ECX] fills EDX with the sum of the contents of the ESI and ECX registers.
LEA EAX,[8*EAX] fills EAX with eight times the original contents of EAX.
LEA EBX,[EDX+4*EDX] fills EBX with five times the contents of EDX.
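The address arithmetic behind these three LEA examples can be modelled in Python. The function lea32 and the register values below are assumptions made for this sketch; it only computes base + index*scale + displacement truncated to 32 bits, and never touches memory.

```python
MASK32 = 0xFFFFFFFF

def lea32(base=0, index=0, scale=1, disp=0):
    """Model the effective address base + index*scale + disp in 32-bit mode."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("the index register can only be scaled by 1, 2, 4 or 8")
    return (base + index * scale + disp) & MASK32

esi, ecx, eax, edx = 0x1000, 0x20, 5, 7      # arbitrary register contents
print(hex(lea32(base=esi, index=ecx)))       # LEA EDX,[ESI+ECX]   -> 0x1020
print(lea32(index=eax, scale=8))             # LEA EAX,[8*EAX]     -> 40
print(lea32(base=edx, index=edx, scale=4))   # LEA EBX,[EDX+4*EDX] -> 35
```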

Another area of application of LEA is fetching an address in 64-bit mode, where, unlike MOV, it uses addressing relative to rIP instead of absolute addressing, allowing memory within plus or minus 2 GB of the LEA instruction to be addressed.

Data copy instructions also include XCHG, which is the mutual exchange of information between two registers or between a register and a memory location.

With a single MOV instruction, we can load 1 to 8 bytes from memory into a register, or up to 64 bytes using VMOV*. The processor groups memory accesses, so when we load or write only one byte, e.g. if we use MOV AL,[RBX], MOV [RBX],AL, the CPU actually loads, say, 16 contiguous bytes at a time, and then selects the one desired byte and leaves the other bytes in their original state. The smallest granularity of a memory access is one byte.

What if we only need to change part of a byte, say just one bit? Then the programmer has to take care of that: read the whole byte, change only the desired bit (and leave the others as they are), and then write the byte back to memory. To manipulate individual bits, we can use the specialized instructions BTS (set bit to 1), BTR (reset bit to 0) or BTC (change the bit to opposite). Or we can use bitwise logic operations OR to set bits, AND to reset bits, or XOR to flip bits to the opposite state.
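The read, modify, write sequence can be sketched in Python with the same logical operations. The helper names and the sample byte are assumptions of this illustration.

```python
def set_bit(byte, n):
    """Set bit n to 1, as BTS or OR with a one-bit mask would."""
    return byte | (1 << n)

def reset_bit(byte, n):
    """Reset bit n to 0, as BTR or AND with an inverted mask would."""
    return byte & ~(1 << n) & 0xFF

def flip_bit(byte, n):
    """Invert bit n, as BTC or XOR with a one-bit mask would."""
    return byte ^ (1 << n)

memory = bytearray([0b10100000])   # one hypothetical byte in memory
value = memory[0]                  # read the whole byte
value = set_bit(value, 3)          # 10101000b
value = reset_bit(value, 7)        # 00101000b
value = flip_bit(value, 0)         # 00101001b
memory[0] = value                  # write the whole byte back
print(f"{memory[0]:08b}")          # 00101001
```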

Arithmetic and logical instructions

Arithmetic instructions compute with integers in binary. The contents of an eight-bit register can range from 00h to FFh, or in decimal notation from 0 to 255. We call this interpretation unsigned numbers. In addition to the 8-bit register, we can also store integers in 16, 32, or 64-bit registers. If we add, say, 9Ah=154 to an 8-bit register containing 78h=120 with the instruction ADD AL,BL, the result is 112h=274. This exceeds the capacity of the 8-bit register (FFh), so only the low byte 12h is stored in the AL register and the processor sets the Carry Flag to 1 to indicate the overflow. This 1 overflowed from AL does not go to the higher register AH as it would in a 16-bit addition with ADD AX,BX. However, it can be included in the next addition if that is done with ADC (Add with Carry) instead of the plain ADD. With ADC an extra 1 is added to the result if the Carry Flag has been set: ADC AH,0. In practice, this is used to add and subtract numbers longer than the GPR capacity.
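The 8-bit addition described above can be checked with a small Python model. The function add_with_carry is a name invented for this sketch; with carry_in=0 it behaves like ADD, with the previous Carry Flag as carry_in it behaves like ADC.

```python
def add_with_carry(dest, src, carry_in=0, bits=8):
    """Model ADD/ADC on a register of the given width; return (result, Carry Flag)."""
    mask = (1 << bits) - 1
    total = dest + src + carry_in
    return total & mask, int(total > mask)

al, cf = add_with_carry(0x78, 0x9A)          # ADD AL,BL
ah, _ = add_with_carry(0x00, 0x00, cf)       # ADC AH,0
print(f"AL={al:02X} CF={cf} AH={ah:02X}")    # AL=12 CF=1 AH=01
```

Taken together, AH:AL then holds 0112h=274, the full sum.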

The following example illustrates the addition of two large numbers in 16-bit registers when 32-bit registers are not available, as used to be the case in DOS. For example, let us add two 32-bit numbers 89ABCDEFh and 55556666h in 16-bit mode. We will split the numbers into the register pairs DX:AX and BX:CX. The colon between the register names here represents the concatenation of two 16-bit registers into one virtual 32-bit register.

MOV DX,89ABh
MOV AX,0CDEFh
MOV BX,5555h
MOV CX,6666h
 CF    DX        AX        BX        CX
┌─┐  ┌──┬──┐   ┌──┬──┐   ┌──┬──┐   ┌──┬──┐
│?│  │89│AB│   │CD│EF│   │55│55│   │66│66│
└─┘  └──┴──┘   └──┴──┘   └──┴──┘   └──┴──┘
ADD AX,CX
 CF    DX        AX        BX        CX
┌─┐  ┌──┬──┐   ┌──┬──┐   ┌──┬──┐   ┌──┬──┐
│1│  │89│AB│   │34│55│   │55│55│   │66│66│
└─┘  └──┴──┘   └──┴──┘   └──┴──┘   └──┴──┘
ADC DX,BX
 CF    DX        AX        BX        CX
┌─┐  ┌──┬──┐   ┌──┬──┐   ┌──┬──┐   ┌──┬──┐
│0│  │DF│01│   │34│55│   │55│55│   │66│66│
└─┘  └──┴──┘   └──┴──┘   └──┴──┘   └──┴──┘

In addition to ADD and ADC, other arithmetic operations include SUB and SBB (Subtract with Borrow). SBB differs from simple subtraction (SUB) in that it subtracts an extra 1 when the Carry Flag is set.

CMP is an instruction similar to SUB, but it does not store the result (register contents are not changed); it only sets the flags according to the result of the hypothetical subtraction.

The logical OR, AND and XOR instructions perform the same logical operations with operands of width 8, 16, 32, 64 bits each, i.e. the zero bit of the output operand with the zero bit of the input operand, the first bit with the first bit, the second bit with the second bit, etc.

The integers in the 8-bit register 0..255 were considered unsigned. But this is not the only possible interpretation; we can reserve the most significant bit for a sign and thus treat the number as signed. Then the values 01h..7Fh will correspond to the positive numbers 1..127 and the values FFh..80h to the negative ones -1..-128. Zero remains zero. So the numeric range has changed to -128..+127 for the 8-bit register, and of course it will be much larger for wider registers. The beauty of binary arithmetic is that signed and unsigned numbers add and subtract in the same way, using the same ADD and SUB instructions. It doesn't matter to these instructions whether we've presented them with signed or unsigned numbers; we can interpret the result of the arithmetic operation either way.

If we are operating on signed numbers, an overflow (going out of the allowed numeric range) is indicated by an Overflow Flag instead of a Carry Flag.
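The difference between the two flags can be seen in a Python model of an 8-bit addition. This is only a sketch; the names to_signed8 and add8_flags are assumptions of the example, and the Overflow Flag is computed by the usual rule that both operands have the same sign and the result has the opposite one.

```python
def to_signed8(value):
    """Interpret an 8-bit pattern 00h..FFh as a signed number -128..+127."""
    return value - 0x100 if value & 0x80 else value

def add8_flags(a, b):
    """Model ADD on 8-bit values; return (result, Carry Flag, Overflow Flag)."""
    total = a + b
    result = total & 0xFF
    cf = int(total > 0xFF)                               # unsigned range exceeded
    of = int(((a ^ result) & (b ^ result) & 0x80) != 0)  # signed range exceeded
    return result, cf, of

print(add8_flags(0x7F, 0x01))   # (128, 0, 1): 127+1 leaves the signed range
print(add8_flags(0xFF, 0x01))   # (0, 1, 0): 255+1 leaves the unsigned range, -1+1=0 is fine
print(to_signed8(0x80))         # -128
```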

The NEG instruction converts a positive binary number to its negative value and vice versa. It does this by changing all bits to the opposite and adding one to the result. In an 8-bit register, the NEG AL instruction changes the value of AL from 02h to FEh, from 01h to FFh, from 00h to 00h, from FFh to 01h, etc.

A similar instruction is NOT, which differs from NEG in that it does not add one to the inverted bits, so it is more suited to logical operations.
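Both instructions are easy to model in Python for an 8-bit register. The function names neg8 and not8 are invented for this sketch.

```python
def not8(value):
    """Model NOT AL: invert all eight bits."""
    return ~value & 0xFF

def neg8(value):
    """Model NEG AL: invert all bits and add one (two's complement)."""
    return (not8(value) + 1) & 0xFF

for al in (0x02, 0x01, 0x00, 0xFF):
    print(f"{al:02X} -> NEG {neg8(al):02X}, NOT {not8(al):02X}")
# 02 -> NEG FE, NOT FD
# 01 -> NEG FF, NOT FE
# 00 -> NEG 00, NOT FF
# FF -> NEG 01, NOT 00
```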

Useful arithmetic operations are INC and DEC, which increment and decrement the contents of a register or memory location by one. With these two instructions, we must remember that they change the arithmetic flags except for CF: the Carry Flag remains unchanged by executing INC or DEC.

The arithmetic instructions include multiplication and division. However, they do not treat positive and negative numbers the same. If we want to multiply or divide signed binary numbers, we must either convert them to positive (using NEG) and then eventually convert the calculated value back to negative, or instead of the unsigned variants MUL and DIV use the signed multiplication and division IMUL and IDIV. For multiplication and division, it is not true that we can use any register. The result of multiplying two 64-bit numbers may require up to 128 bits, so the fixed register pair rDX:rAX is used to store the product. For 32-bit multiplication, the result is stored in the pair EDX:EAX, for 16-bit multiplication in DX:AX; only 8-bit multiplication is an exception, and the result of multiplying AL by the input 8-bit value goes into the AX register (DX remains unchanged). Overflow can no longer occur in principle, but setting the Carry Flag and Overflow Flag to ones simultaneously indicates that the result is large and has overflowed into the upper register of the output pair (DX, EDX or RDX).

For integer division, the reverse procedure is used: the dividend is placed in the rDX:rAX register pair; the divisor can be another register or memory location of the appropriate width. However, with DIV an overflow occurs here if the divisor is not greater than the number in the upper half of the input register pair (DX, EDX, RDX). The quotient would then not fit in the lower half (AX, EAX, RAX) and would therefore not be defined. The x86 architecture doesn't know what number it should store in the output register in this case and thus raises the same program exception (interrupt) as for division by zero, which can cause our program to crash. Therefore, before division, the upper half of the input register pair must be zeroed (in the case of DIV) or filled with copies of the sign bit of rAX (in the case of IDIV), which means all ones when the dividend is negative. This is best done by zeroing rDX using SUB rDX,rDX for unsigned division, and by the short instruction CWD, CDQ or CQO before signed division.
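The use of the register pair can be illustrated by a Python model of the unsigned 32-bit MUL and DIV. The function names mul32 and div32 are hypothetical. The model also relies on one fact not stated in the text above: DIV leaves the quotient in EAX and the remainder in EDX. The processor exception is represented here by Python's ZeroDivisionError.

```python
MASK32 = 0xFFFFFFFF

def mul32(eax, src):
    """Model MUL src: EDX:EAX = EAX * src; return (EDX, EAX, CF=OF)."""
    product = eax * src
    edx, eax = product >> 32, product & MASK32
    return edx, eax, int(edx != 0)    # flags are set when the upper half is used

def div32(edx, eax, divisor):
    """Model DIV divisor: divide EDX:EAX; return (quotient, remainder)."""
    if divisor == 0 or edx >= divisor:
        raise ZeroDivisionError("divide error: quotient does not fit in EAX")
    dividend = (edx << 32) | eax
    return dividend // divisor, dividend % divisor

print(mul32(0x10000, 0x10000))   # (1, 0, 1): product 100000000h needs EDX
print(div32(0, 100, 7))          # (14, 2) after zeroing EDX with SUB EDX,EDX
```

Calling div32(5, 0, 5) raises the error, because the upper half is not smaller than the divisor.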

Instructions working with the stack

A stack is a contiguous area reserved from the total amount of operational memory and declared as a stack. The general register rSP (stack pointer) is used for addressing in the stack. The stack is most often used to temporarily store and then restore the contents of the general registers with the PUSH and POP instructions. When a program is loaded into memory, the operating system makes sure to reserve enough memory for the stack and sets its ESP or RSP pointer to its beginning, which is not the lowest address, but rather the highest. The addresses gradually decrease when stored on the stack using PUSH and, conversely, increase when removed from the stack using POP.

The operand of PUSH can be a general register 16, 32 or 64 bits wide, a memory variable of the same width, a segment register, or an immediate numeric value, which the processor sign-extends to the width of the operand. The processor first decrements the stack pointer rSP by 2, 4 or 8 bytes and stores the operand in the resulting space. The register rSP thus addresses the most recently stored item.

The POP operation works in reverse: it moves the contents of the 2, 4, or 8 bytes addressed by the rSP register to the operand and then increments the rSP by 2, 4, or 8 bytes.

The use of the rSP register to address the stack is implicit; only the input or output operand is specified in the PUSH and POP instructions. Some assemblers allow more than one operand to be written to a single PUSH/POP instruction, but this is implemented internally as a series of separate PUSH or POP instructions. Writing multiple operands is mainly used to save source program lines.

Saving and restoring from the stack is done in principle by the LIFO method, i.e. Last In, First Out. The register saved to the stack last by PUSH is then restored first by the subsequent POP instruction; thus, we must restore them in the reverse order of saving. Example:

PUSH EAX,EBX,ECX ; Store three registers on stack.
 ; Here are instructions which clobber EAX, EBX, ECX. Register ESP is 3*4 lower than it was before PUSH.
POP ECX,EBX,EAX ; Restore registers from the stack. ESP is back, memory under its value is undefined.
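The behaviour of ESP in this example can be followed in a small Python model of a 32-bit stack. The class Stack and its 64-byte size are assumptions of this sketch, not a description of how an operating system sets the stack up.

```python
class Stack:
    """A toy 32-bit stack growing from the highest address downwards."""

    def __init__(self, size=64):
        self.memory = bytearray(size)
        self.esp = size                       # ESP starts at the top of the area

    def push(self, value):
        self.esp -= 4                         # decrement first...
        self.memory[self.esp:self.esp + 4] = value.to_bytes(4, "little")

    def pop(self):
        value = int.from_bytes(self.memory[self.esp:self.esp + 4], "little")
        self.esp += 4                         # ...increment after reading
        return value

stack = Stack()
eax, ebx, ecx = 1, 2, 3
for register in (eax, ebx, ecx):              # PUSH EAX,EBX,ECX
    stack.push(register)
print(stack.esp)                              # 52, i.e. 3*4 lower than 64
ecx, ebx, eax = stack.pop(), stack.pop(), stack.pop()   # POP ECX,EBX,EAX
print(eax, ebx, ecx, stack.esp)               # 1 2 3 64
```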

In 16- and 32-bit mode, instead of saving and then restoring several registers, we can use the PUSHA and POPA instructions, which save and restore all 8 GPRs at once in the order eAX, eCX, eDX, eBX, eSP, eBP, eSI, eDI. While saving all eight registers is often unnecessary and slower, it saves code size because both PUSHA and POPA are encoded in just one byte. In 64-bit mode, PUSHA is not available and we have to store the registers individually.

Jump instructions

If you remember from your programming language lessons the prohibition of jumps and the harmfulness of the GOTO statement, you can forget about it in assembler. All program constructs such as IF, ELSE, WHILE, SWITCH, REPEAT UNTIL, etc. are implemented here using conditional jumps, where a condition is first evaluated, e.g. by the CMP or TEST instructions, and then a jump is made (or not made) to another place in the code using the Jcc instruction. This jump has a number of variants differing by the condition cc in the instruction's mnemonic name. For instance, JA (Jump if Above) first examines whether CF=0 and ZF=0 are simultaneously true and jumps to the target address (label) only if both conditions are met; otherwise the jump is not taken and execution continues with the instruction below it.

The terms Above and Below are used when we compare unsigned numbers, such as two addresses, using CMP. The terms Greater and Less, on the other hand, are used after comparing signed integers.
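The difference between Above and Greater can be tried in a Python model of CMP on 8-bit values. The function names are invented for this sketch. The JA condition is the one given above; the JG condition (ZF=0 and SF=OF) is not stated in this text and is taken from the processor's instruction set definition.

```python
def cmp8(a, b):
    """Model CMP a,b on 8-bit patterns: return the flags of the subtraction a-b."""
    result = (a - b) & 0xFF
    return {
        "CF": int(a < b),                                   # unsigned borrow
        "ZF": int(result == 0),
        "SF": result >> 7,
        "OF": int(((a ^ b) & (a ^ result) & 0x80) != 0),    # signed overflow
    }

def ja(flags):
    """Jump if Above (unsigned): CF=0 and ZF=0."""
    return flags["CF"] == 0 and flags["ZF"] == 0

def jg(flags):
    """Jump if Greater (signed): ZF=0 and SF=OF."""
    return flags["ZF"] == 0 and flags["SF"] == flags["OF"]

flags = cmp8(0xFF, 0x01)        # FFh is 255 unsigned, but -1 signed
print(ja(flags), jg(flags))     # True False
```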

We don't need to mind the differences between short and near jumps; it is the assembler's concern to use the correct one.

The conditional jumps include the LOOPcc instruction, which first decrements the contents of the rCX register by 1 and, if rCX is then non-zero, jumps to the label specified in the instruction operand; otherwise execution continues below the LOOP instruction. If rCX was already zero before the LOOP instruction, it is first decremented to CX=65535 or even ECX=4294967295, which is non-zero, and the loop therefore repeats that many more times. That is probably not what we wanted, so the rCX register is tested with JCXZ or JECXZ before the loop, and the loop is skipped if it is zero:

 JECXZ Skip
Label: ; Instructions performed ECX-times.
 LOOP Label
Skip:
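The wrap-around of the counter is easy to demonstrate with a Python model of a loop controlled by the 16-bit CX register. The function loop_iterations and its guard parameter (standing for the JCXZ test) are assumptions of this sketch.

```python
def loop_iterations(cx, guard=False, bits=16):
    """Count how many times the loop body runs for an initial counter cx."""
    mask = (1 << bits) - 1
    if guard and cx == 0:        # JCXZ Skip
        return 0
    count = 0
    while True:
        count += 1               # the loop body is executed
        cx = (cx - 1) & mask     # LOOP decrements the counter first...
        if cx == 0:              # ...and falls through once it reaches zero
            break
    return count

print(loop_iterations(3))                # 3
print(loop_iterations(0))                # 65536
print(loop_iterations(0, guard=True))    # 0
```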

In addition to conditional jumps, we can also jump to another location in the program unconditionally, i. e., each time a JMP appears in the instruction stream. This instruction replaces the rIP register (which normally contains the address of the next instruction) with the address being jumped to.

Related to unconditional jumps are a pair of CALL and RET instructions. Like JMP, the CALL instruction replaces the rIP with the jump address, but in addition, it stores the contents of the rIP on the stack beforehand, much as we would hypothetically perform a non-existent PUSH rIP operation. The RET instruction performs a hypothetical POP rIP operation, which is equivalent to jumping to the return address that was stored on the stack by the CALL instruction.

The CALL and RET instructions allow you to divide the flow of instructions into shorter subroutines (procedures) to structure the program in a clean way. We can treat each subroutine or program macro as a black box, document its input, output, and function, and then forget about the details of its implementation.

 CALL BlackBox
 ; Continue with the main program.

BlackBox: PROC
 PUSHAD ; Save all registers.
 ; Instructions of the black box.
 POPAD ; Restore registers.
 RET ; Return to the main program below CALL BlackBox.
ENDPROC

Similar to the CALL machine instruction is INT, which causes a software interrupt. Its operand is a number 0..255, which in real mode specifies the index of the doubleword in the interrupt vector table (IVT) that holds the address of the routine handling the interrupt. For example, on INT 21h the CPU looks at the address 21h*4 and loads two 16-bit words from that address into the IP and CS registers. At the resulting address there should be a subroutine performing the function expected from INT 21h; it is terminated by IRET. The difference between CALL/RET and INT/IRET is that INT additionally pushes the flags on the stack before storing the return address, and IRET then restores them.

Shifts and rotations

The following eight shift and rotation instructions allow the contents of an 8, 16, 32, or 64-bit register or memory location to be manipulated. The number of shifts is specified in the second operand as an immediate number or as the contents of the fixed register CL. The contents are shifted or rotated by this number of bits either to the left, i. e., from the least significant bit LSb to the most significant bit MSb, or to the right, i. e., from MSb to LSb.

           RCL                        RCR
┌──┐  ┌───────────┐              ┌───────────┐  ┌──┐
│CF│<-│MSb <-- LSb│<-CF      CF->│MSb --> LSb│->│CF│
└──┘  └───────────┘              └───────────┘  └──┘
           ROL                        ROR
┌──┐  ┌───────────┐              ┌───────────┐  ┌──┐
│CF│<-│MSb <-- LSb│<-MSb    LSb->│MSb --> LSb│->│CF│
└──┘  └───────────┘              └───────────┘  └──┘
           SAL                        SAR
┌──┐  ┌───────────┐              ┌───────────┐  ┌──┐
│CF│<-│MSb <-- LSb│<-0      MSb->│MSb --> LSb│->│CF│
└──┘  └───────────┘              └───────────┘  └──┘
           SHL                        SHR
┌──┐  ┌───────────┐              ┌───────────┐  ┌──┐
│CF│<-│MSb <-- LSb│<-0        0->│MSb --> LSb│->│CF│
└──┘  └───────────┘              └───────────┘  └──┘

For rotate via carry (RCL, RCR) instructions, the Carry Flag bit is added to the register bits as the ninth (or 17th, 33rd, 65th) bit.

The arithmetic instructions SAL, SAR are used to quickly multiply and divide signed numbers by powers of two, e. g. SAL AX,4 is equivalent to multiplying the contents of AX by sixteen (2⁴), SAR EBX,3 divides the contents of EBX by eight (rounding towards minus infinity), etc. For this reason, in an arithmetic right shift (SAR) the most significant bit, which holds the sign, keeps its original value at each step.

Logical shifts (SHL, SHR) are useful for logical operations. The SHL and SAL instructions behave identically. Two examples:

STC
MOV EAX,12345678h
 CF       EAX
┌─┐  ┌──┬──┬──┬──┐
│1│  │12│34│56│78│
└─┘  └──┴──┴──┴──┘
RCL EAX,4
 CF       EAX
┌─┐  ┌──┬──┬──┬──┐
│1│  │23│45│67│88│
└─┘  └──┴──┴──┴──┘

MOV ECX,89ABCDEFh
     ECX         CF
┌──┬──┬──┬──┐   ┌─┐
│89│AB│CD│EF│   │?│
└──┴──┴──┴──┘   └─┘
SHR ECX,2
     ECX         CF
┌──┬──┬──┬──┐   ┌─┐
│22│6A│F3│7B│   │1│
└──┴──┴──┴──┘   └─┘
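Both examples, and the arithmetic right shift, can be verified with a Python model of 32-bit operands. The function names are made up for this sketch, and it assumes a shift count between 1 and 31.

```python
MASK32 = 0xFFFFFFFF

def rcl32(value, cf, count):
    """Model RCL: rotate the 33-bit quantity CF:value left; return (value, CF)."""
    wide = (cf << 32) | value
    wide = ((wide << count) | (wide >> (33 - count))) & 0x1FFFFFFFF
    return wide & MASK32, wide >> 32

def shr32(value, count):
    """Model SHR: logical shift right; CF receives the last bit shifted out."""
    return value >> count, (value >> (count - 1)) & 1

def sar32(value, count):
    """Model SAR: arithmetic shift right, the sign bit is copied from the left."""
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> count) & MASK32

eax, cf = rcl32(0x12345678, 1, 4)     # STC / RCL EAX,4
print(f"{eax:08X} CF={cf}")           # 23456788 CF=1
ecx, cf = shr32(0x89ABCDEF, 2)        # SHR ECX,2
print(f"{ecx:08X} CF={cf}")           # 226AF37B CF=1
print(f"{sar32(0xFFFFFFF0, 3):08X}")  # FFFFFFFE, i.e. -16 / 8 = -2
```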

String instructions

As we know, instructions can move data between registers and between registers and memory, but not from one memory location to another. This is not quite true: there is an instruction MOVS that does this, at the cost of bypassing the standard way of encoding the address (ModRM+SIB). Instead, it expects the input address to be stored in register rSI and the output address in rDI. The amount of data transferred depends on the suffix of the instruction, i.e. MOVSB, MOVSW, MOVSD, MOVSQ transfer a single byte, a 16-bit word (WORD), a 32-bit word (DWORD) or a 64-bit word (QWORD). The number of elements transferred by a single instruction can be larger if the repeat prefix REP is used before the instruction. It specifies that the transfer of a single element is repeated as many times as the contents of rCX say; after each transfer, rCX is decremented by 1 and the addresses in the rSI and rDI registers are changed by the size of the element being transferred, i.e., by 1, 2, 4, or 8 bytes. Whether the rSI and rDI addresses are incremented or decremented by this size depends on the Direction Flag.

Two examples of using MOVS in 32-bit mode where ESI, EDI are used as index registers rSI, rDI:

CLD MOV ESI,02h MOV ECX,3 MOV EDI,07h ┌──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬───── │01│23│45│67│89│AB│CD│EF│01│23│45│67│89│.. └──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴───── 00 01 02 03 04 05 06 07 08 09 0A 0B 0C .. ^ ^ DF=0 ESI=02h EDI=07h ECX=3 REP MOVSB ┌──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬───── │01│23│45│67│89│AB│CD│45│67│89│45│67│89│.. └──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴───── 00 01 02 03 04 05 06 07 08 09 0A 0B 0C .. ^ ^ DF=0 ESI=05h EDI=0Ah ECX=0 STD MOV ECX,2 ┌──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬───── │01│23│45│67│89│AB│CD│45│67│89│45│67│89│.. └──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴───── 00 01 02 03 04 05 06 07 08 09 0A 0B 0C .. ^ ^ DF=1 ESI=05h EDI=0Ah ECX=2 REP MOVSW ┌──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬───── │01│23│45│67│89│AB│CD│45│67│89│AB│CD│89│.. └──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴───── 00 01 02 03 04 05 06 07 08 09 0A 0B ... ^ ^ ESI EDI ECX=0

The REP MOVSB instruction in the previous example first transferred ECX=3 bytes from the address ESI=02h to the address EDI=07h. In the continuation of the example we used STD to reverse the direction of transfer (towards lower addresses), changed the width of the element from one byte (MOVSB) to one 16-bit word (MOVSW), and requested the transfer of only ECX=2 of these elements from the address ESI=05h to the address EDI=0Ah. The contents of the two addressing registers rSI, rDI change after the transfer of each element. If the REP prefix is not specified before the MOVS instruction, just one element is transferred (and both registers are incremented or decremented by the size of the element). With REP, the contents of rCX are examined first, and as long as they are non-zero, the transfer of one element including the change of rSI, rDI is repeated and rCX is decremented by one. The contents of rCX in REP MOVS therefore determine the number of elements transferred. If rCX=0, REP MOVS is not executed even once and the contents of registers rCX, rSI, rDI are not changed.
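The loop that REP MOVS performs can be written down as an ordinary loop. The following Python sketch is not assembler; rep_movs and its parameter names are invented for this illustration, and memory is modelled as a bytearray indexed from address 00h.

```python
def rep_movs(mem, esi, edi, ecx, size, df):
    """Model of REP MOVSB/MOVSW/MOVSD/MOVSQ (size = 1, 2, 4 or 8 bytes)."""
    step = -size if df else size
    while ecx != 0:                                 # rCX is tested before each element
        mem[edi:edi + size] = mem[esi:esi + size]   # transfer one element
        esi += step                                 # then adjust both addresses
        edi += step
        ecx -= 1
    return esi, edi, ecx

mem = bytearray.fromhex("0123456789ABCDEF0123456789")
state = rep_movs(mem, 0x02, 0x07, 3, size=1, df=0)          # CLD, REP MOVSB
print(mem.hex(" ").upper(), state)
state = rep_movs(mem, state[0], state[1], 2, size=2, df=1)  # STD, REP MOVSW
print(mem.hex(" ").upper(), state)
```

The first call leaves ESI=05h, EDI=0Ah, ECX=0 and the second one ESI=01h, EDI=06h, ECX=0. A call with ecx=0 returns immediately with all three values unchanged.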

A more interesting situation occurs when the source and destination fields partially overlap. In the following example, the source string (address ESI=02h, length ECX=5) overlaps with the destination string (address EDI=05h, length ECX=5). In each of the five copy steps, one element (in this case a byte) is read and then written, even if the byte being read was itself written a moment ago. If ESI<EDI and DF=0, or if ESI>EDI and DF=1, the string is not copied faithfully; instead, the portion between the two starting addresses is propagated repeatedly along the length of the output string. For an overlapping source string to be moved forward or backward intact, ESI<EDI and DF=1, or ESI>EDI and DF=0, would have to be true.

CLD MOV ESI,02h MOV ECX,5 MOV EDI,05h ┌──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬───── │01│23│45│67│89│AB│CD│EF│01│23│45│67│89│.. └──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴───── 00 01 02 03 04 05 06 07 08 09 0A 0B 0C .. ^ ^ DF=0 ESI=02h EDI=07h ECX=3 REP MOVSB ┌──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬───── │01│23│45│67│89│45│67│89│45│67│45│67│89│.. └──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴───── 00 01 02 03 04 05 06 07 08 09 0A 0B 0C .. ^ ^ DF=0 ESI=07h EDI=0Ah ECX=0 STD MOV ECX,3 ┌──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬───── │01│23│45│67│89│45│67│89│45│67│45│67│89│.. └──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴───── 00 01 02 03 04 05 06 07 08 09 0A 0B 0C .. ^ ^ DF=1 ESI=07h EDI=0Ah ECX=3 REP MOVSW ┌──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬───── │01│23│45│67│89│45│67│89│45│67│89│45│89│.. └──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴───── 00 01 02 03 04 05 06 07 08 09 0A 0B ... ^ ^ ESI=01h EDI=04h ECX=0
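To see the difference between propagating and really moving an overlapping string, both directions can be tried on a copy of the same memory. This is a Python sketch with an invented helper rep_movsb; the second call is not part of the example above, it assumes that with DF=1 the registers are pointed at the last bytes of both strings (ESI=06h, EDI=09h).

```python
def rep_movsb(mem, esi, edi, ecx, df):
    """Model of REP MOVSB: copy ecx bytes strictly one at a time."""
    step = -1 if df else 1
    for _ in range(ecx):
        mem[edi] = mem[esi]   # a byte written earlier may be read again here
        esi += step
        edi += step
    return mem

original = bytes.fromhex("0123456789ABCDEF0123456789")

# DF=0 with ESI<EDI: the bytes 45 67 89 are propagated through the destination.
forward = rep_movsb(bytearray(original), 0x02, 0x05, 5, df=0)
print(forward.hex(" ").upper())    # 01 23 45 67 89 45 67 89 45 67 45 67 89

# DF=1 with ESI<EDI, starting at the last bytes: the five source bytes
# 45 67 89 AB CD arrive intact at addresses 05h..09h.
backward = rep_movsb(bytearray(original), 0x06, 0x09, 5, df=1)
print(backward.hex(" ").upper())   # 01 23 45 67 89 45 67 89 AB CD 45 67 89
```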

Another useful string instruction is STOS, which stores the contents of the AL, AX, EAX or RAX register into memory at the address held in the rDI register and then increments or decrements rDI by the size of the element, depending on the Direction Flag. The REP prefix is often used before this instruction as well; it allows large sections of RAM to be filled with one value, for example to be cleared.

    CLD
    MOV EDI,MemBlock ; Address of the first byte of the 64 KB memory block to be cleared.
    MOV ECX,64K / 4  ; Number of repetitions = block size divided by DWORD size.
    SUB EAX,EAX      ; Clear the DWORD register EAX, whose contents will be stored in the block.
    REP STOSD        ; Clear the entire MemBlock.

The opposite of STOS is LODS, which loads an element from the memory address given by the rSI register into AL, AX, EAX or RAX, and increments or decrements rSI after the load. It makes little sense to use the REP prefix with this instruction, since each loaded value would overwrite the contents of rAX before we could process it. LODSB is used in conjunction with STOSB when we want to copy a string byte by byte while reacting to the copied characters. An example of a simple conversion of a zero-terminated string from lowercase a..z to uppercase A..Z:

          CLD
          MOV ESI,MixedCase ; Load the addresses of both strings.
          MOV EDI,UpperCase ; The length is unknown, but the string is terminated by a zero (null) character.
    Next: LODSB             ; Load one character of the input string from [ESI] to AL and increment ESI by 1.
          CMP AL,0          ; Test whether we are at the end of the input string.
          JE End            ; If so, jump to the End label.
          CMP AL,'a'        ; Check whether it is a character in the range 'a'..'z'.
          JB Store          ; If it is below 'a', do not change it and jump to the Store label.
          CMP AL,'z'        ; Check whether it is a character in the range 'a'..'z'.
          JA Store          ; If it is above 'z', do not change it and jump to the Store label.
          AND AL,11011111b  ; Clear the 5th bit of the character in AL, which converts it to a capital 'A'..'Z'.
    Store:STOSB             ; Store the character from AL at address [EDI] and increment EDI by 1.
          JMP Next          ; Jump back to get the next character of the input string.
    End:  STOSB             ; Done, store the AL register, which terminates the output string with a zero.
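For readers who want to check the logic of the loop before assembling it, here is the same algorithm step by step in Python. It is only a sketch: to_upper is an invented name, the input is a bytes object ending with a zero byte, and each line notes the machine instruction it stands for.

```python
def to_upper(mixed_case):
    """Follow the LODSB/STOSB loop on a zero-terminated byte string."""
    esi = 0
    output = bytearray()                 # plays the role of the UpperCase buffer
    while True:
        al = mixed_case[esi]             # LODSB
        esi += 1
        if al == 0:                      # CMP AL,0 / JE End
            break
        if ord("a") <= al <= ord("z"):   # CMP AL,'a' / JB and CMP AL,'z' / JA
            al &= 0b11011111             # AND AL,11011111b
        output.append(al)                # Store: STOSB
    output.append(0)                     # End: STOSB stores the terminating zero
    return bytes(output)

print(to_upper(b"Hello, world!\x00"))    # b'HELLO, WORLD!\x00'
```

Characters just outside the range, such as '`' (below 'a') and '{' (above 'z'), pass through unchanged, which is why both comparisons are needed.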

The SCAS instruction is used to find the position of the value held in the AL, AX, EAX, or RAX register within the input string addressed by rDI. The comparison sets the flags and then increments or decrements rDI by the width of the register being compared (1, 2, 4, or 8) depending on the Direction Flag. The instruction is typically used with the REPNE (repeat while not equal) prefix, with the maximum number of repeats given by the contents of rCX. Execution of the instruction is terminated either by finding the value in the string (the Zero Flag is set as the sign of a match), or by exhausting the length of the string in the rCX register. If the operation terminated because rCX was exhausted, we can still use the state of ZF to determine whether the value was found in the last element of the input string.

    CLD
    MOV EDI,String       ; The address of the string being searched.
    MOV ECX,SizeOfString ; The number of characters in the string.
    MOV AL,0             ; We will search for the null byte.
    REPNE SCASB          ; EDI now points behind the null byte, if it has been found.
    JNE Error            ; Jump if ECX was exhausted and the string does not contain the null byte.

Scanning repeatedly with the REPE prefix (repeat while equal) is used less often. We would use it, for example, if the EDI register points to a block that begins with null characters and we want to find the position of the first non-zero character:

    CLD
    MOV EDI,String       ; The address of the string being searched.
    MOV ECX,SizeOfString ; The number of characters in the string.
    MOV AL,0             ; We will search for a non-zero byte.
    REPE SCASB           ; EDI now points behind the first non-zero byte, if found.
    JE Error             ; Jump if the string contains only zero bytes.
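The stopping conditions of REPNE SCASB and REPE SCASB are easy to confuse, so a small model may help. The following Python sketch is not assembler; rep_scasb and the while_equal parameter are invented names, DF=0 is assumed, and the value None stands for "flags were not touched because rCX was zero".

```python
def rep_scasb(mem, edi, ecx, al, while_equal):
    """Model of REPE SCASB (while_equal=True) or REPNE SCASB (False), DF=0.

    Returns (edi, ecx, zf); zf is None when ecx was 0 and nothing was compared.
    """
    zf = None
    while ecx != 0:
        zf = mem[edi] == al      # SCASB compares AL with the byte at [rDI]...
        edi += 1                 # ...then advances rDI
        ecx -= 1                 # and the prefix decrements rCX
        if zf != while_equal:    # REPNE stops on a match, REPE on a mismatch
            break
    return edi, ecx, zf

print(rep_scasb(b"abc\x00xyz", 0, 7, 0, while_equal=False))       # (4, 3, True)
print(rep_scasb(b"\x00\x00\x00A\x00", 0, 5, 0, while_equal=True)) # (4, 1, False)
print(rep_scasb(b"ab\x00", 0, 3, 0, while_equal=False))           # (3, 0, True)
```

In the first call the null byte sits at address 3, one less than the returned EDI. The third call ends with ECX=0 and ZF still set, which is the case of a match in the very last element.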

The last useful string instruction is CMPS, which compares an element in memory (BYTE, WORD, DWORD, or QWORD) addressed by rSI with an element of the same width addressed by rDI. It sets the arithmetic flags and then increments or decrements rSI and rDI by the size of the element. This instruction can be repeated using the REPE prefix, too. The following example shows a search for a word or expression (Needle) within a longer character string (Haystack). The algorithm first searches for the first character of Needle using SCASB and then compares the subsequent characters of the searched word using CMPSB. If they do not match, it continues to search for the first character and repeats the process.

          CLD
          MOV ESI,Needle       ; Address of the searched term.
          MOV EDI,Haystack     ; Address of the block in which we will search for Needle.
          MOV ECX,HaystackSize ; Size of the block.
          LODSB                ; Load the first character of Needle.
    Next: REPNE SCASB          ; Search for the first character in Haystack starting at EDI.
          JNE NotFound         ; Needle is not contained in Haystack, we are done.
          PUSH ECX,ESI,EDI     ; Save the work-in-progress state on the stack.
          MOV ECX,NeedleSize-1 ; ESI points to the second character of Needle.
          REPE CMPSB           ; Compare with Needle, starting with the second character.
          POP EDI,ESI,ECX      ; Restore the state from the stack; POP leaves the flags intact.
          JE Found             ; If ZF=1, then REPE CMPSB has found a match, we're done.
          JMP Next             ; Otherwise continue looking for the first character, which is still in AL.
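The control flow of the search (scan for the first character, compare the rest, resume scanning) may be easier to follow in a high-level form. This Python sketch uses the invented name find and works on bytes objects. Note one difference: the slice comparison here never reads past the end of the haystack, whereas the assembler version above leaves that check to the programmer.

```python
def find(haystack, needle):
    """Offset of needle (non-empty bytes) in haystack, or -1 if absent."""
    first, rest = needle[0], needle[1:]   # LODSB: the first character stays in AL
    edi = 0
    while True:
        # REPNE SCASB: scan for the first character of needle.
        while edi < len(haystack) and haystack[edi] != first:
            edi += 1
        if edi == len(haystack):
            return -1                     # JNE NotFound
        edi += 1                          # SCASB leaves EDI behind the match
        # REPE CMPSB: compare the remaining characters of needle.
        if haystack[edi:edi + len(rest)] == rest:
            return edi - 1                # JE Found
        # JMP Next: resume scanning at the saved EDI

print(find(b"a haystack with needle", b"needle"))   # 16
print(find(b"aab", b"ab"))                          # 1
print(find(b"abc", b"x"))                           # -1
```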

Both Linux and Windows require the Direction Flag to be reset before calling their functions, and they also guarantee that DF=0 on return. If we need to use a string instruction with the Direction Flag set (STD), we should reset the flag again using CLD, preferably right after the string instruction is executed. If we keep to this rule, we don't even need the CLD at the beginning of the examples given.

Remember that string instructions increment registers after execution, so for example REPNE SCASB stops one element later after finding a byte, and thus the byte found is at address EDI-1.

The SCAS and CMPS instructions set flags, the other string instructions leave them unchanged.

The first program

After familiarizing ourselves with the basic machine instructions, we finally get to try them out. The instructions need to be written to a source file as plain text in 8-bit encoding (not UTF-16), without any embedded formatting such as bold, italics or headings. We then submit the file to a program called an assembler, which converts it into another file, an executable one. This tutorial recommends using EuroAssembler, which is convenient because we don't have to specify any command line parameters, just the name of the source file. The output can be a directly executable file for DOS, Linux or Windows, so no linker is needed. We will not use any third party libraries either, only the API (Application Programming Interface) of the operating system.

To make programming amusing, every program should do something interesting, at least output something to the monitor. It usually starts by typing out the phrase "Hello, world!". The monitor, keyboard, and mouse are devices which are under the control of the operating system, which does not allow us to use IN and OUT instructions to write to the device. Maybe this would be possible in real DOS, but we would have a hard time dealing with the ports of the various video adapters that were used in the DOS era anyway.
We will need to use the operating system services to output the character string to the monitor.

We'll show how to output "Hello, world!" in 16-bit DOS, then in 32-bit Linux, 64-bit Linux, 32-bit MS Windows, and 64-bit MS Windows.

DOS 16 bits

The DOS services are invoked by the INT 21h machine instruction and are described for example in the DOS Fn Index or INT 21h. When we expand the INT 21h function list, we see two lines dealing with standard output, i.e., writing to the console: Int 21/AH=02h - DOS 1+ - WRITE CHARACTER TO STANDARD OUTPUT and Int 21/AH=09h - DOS 1+ - WRITE STRING TO STANDARD OUTPUT. We want to output the entire string, not just a single character, so we'll look at the second function, which has two input parameters:

    MOV AH,09h
    MOV DX,HelloWorldAddress
    INT 21h

There is a small problem to solve: how to write the address of the string "Hello, world!" provisionally symbolized by the word HelloWorldAddress, into the DS:DX pair?

In DOS we will prefer programs in the COM executable format. Such a program lives entirely within a single 64 KB block of memory, which is plenty for our small tasks, and it has the great advantage that all four segment registers are already set to the necessary contents by DOS, so we don't have to deal with them. When DOS executes our program, it loads it into memory and places a 256-byte data structure called the PSP (Program Segment Prefix) in front of it. The first instruction of our program follows right after the PSP, at address 256. The program cannot start at address 256 with the definition of the string "Hello, world!", because the processor would try to execute the characters of the string as program code and would probably report an error or freeze. So we will put the string definition at the end of the program; the assembler doesn't care. After the INT 21h instruction we should terminate our program, otherwise it would print the greeting, but then it would try in vain to execute the bytes of the string as machine instructions and probably freeze. Looking through the list of DOS 21h functions, we find two items containing the text TERMINATE: Int 21/AH=00h - DOS 1+ - TERMINATE PROGRAM and Int 21/AH=4Ch - DOS 2+ - EXIT - TERMINATE WITH RETURN CODE. But there is another and much simpler way to terminate a COM format program: with a simple RET instruction. This fetches a null word from the stack, causing the program to return to address CS:0, where the beginning of the PSP is, and where the INT 20h machine instruction that terminates a DOS program is stored. So our program will look like this:

    MOV AH,09h
    MOV DX,HelloWorldAddress
    INT 21h
    RET
    HelloWorldAddress DB "Hello, world!"

The text string was defined in the program using the DB (define byte) pseudoinstruction and written in quotes. The assembler takes the address following the RET instruction, stores the 13 bytes of the string at that address, and assigns the symbolic name HelloWorldAddress to it. So we have the source code; we write it to a file called e.g. hello.asm and save it. Assuming that we have installed EuroAssembler according to the installation instructions, we can try to translate it into executable form by typing euroasm "hello.asm" in the console (the quotes around the filename can be omitted if it does not contain spaces).

From the output messages generated by EuroAssembler during compilation, note the line I0760 16bit TINY BIN file "hello.bin" created from source, size=29. A COM format program should have the file name extension .com, but a .bin file was created instead. This is because we didn't tell the assembler to compile to COM format, so it used the default BIN format, which among other things does not take the PSP into account, so our program wouldn't work anyway.

How to tell EuroAssembler to generate COM? This is done by using a pair of pseudoinstructions PROGRAM and ENDPROGRAM and their operands FORMAT=, WIDTH=, MODEL=, SUBSYSTEM= (and many others). So we have to wrap our program between PROGRAM and ENDPROGRAM. As a mandatory label for the PROGRAM command we will give the name of the program (without the suffix). Let's call it HelloDos, for example. The same name is also given for ENDPROGRAM, but not in the label field, but as the first operand, as is usual for EuroAssembler block pseudo-instructions. The program name does not necessarily have to match the source file name, as is the case with other assemblers. In fact, we could define more than one PROGRAM/ENDPROGRAM block in a single file and thus produce several different executable programs at once. But we won't try that yet.

In addition to the name of the program, we have to specify the format and width of the resulting file in the parameters of the PROGRAM pseudoinstruction. However, EuroAssembler derives the width for COM format itself as WIDTH=16, so we could omit this operand. Similarly, we could omit the ENTRY=256 operand, since the program entry point is always fixed at this address for COM-format programs.

    HelloDos PROGRAM FORMAT=COM, WIDTH=16, ENTRY=256
      MOV AH,09h
      MOV DX,HelloWorldAddress
      INT 21h
      RET
    HelloWorldAddress DB "Hello, world!"
    ENDPROGRAM HelloDos

    ...
    I0180 Assembling source file "hello.asm".
    I0270 Assembling source "hello".
    I0310 Assembling source pass 1.
    I0330 Assembling source pass 2 - final.
    I0470 Assembling program "HelloDos". "hello.asm"{1}
    I0510 Assembling program pass 1. "hello.asm"{1}
    I0510 Assembling program pass 2. "hello.asm"{1}
    I0530 Assembling program pass 3 - final. "hello.asm"{1}
    I0660 16bit TINY COM file "HelloDos.com" created, size=29. "hello.asm"{7}
    I0650 Program "HelloDos" assembled in 3 passes with errorlevel 0. "hello.asm"{7}
    I0750 Source "hello" (6 lines) assembled in 2 passes with errorlevel 0.
    I0860 Listing file "hello.asm.lst" created, size=1039.
    I0980 Memory allocation 448 KB. 32 statements assembled in 1 s.
    I0990 EuroAssembler terminated with errorlevel 0.

The last line is important, as it reports errorlevel 0, i.e. no errors. Otherwise we would first have to find and remove the errors in the source text. Similarly, the line I0660 16bit TINY COM file "HelloDos.com" created, size=29. confirms that a program in COM format was generated and states its size. After almost every message, EuroAssembler also appends the position in the source file to which the message refers. For example, "hello.asm"{7} associates the message with the seventh line of the source file, which is ENDPROGRAM HelloDos.

Note also the message I0860 Listing file "hello.asm.lst" created, size=1039. EuroAssembler creates a listing file of the translated file without being asked. This is again a plain text file, so we can view it with Notepad or a similar viewer.

We can see that the listing contains a copy of the source code indented to the right, and on the left a column delimited by | (pipe character), which contains a four-digit hexadecimal address terminated by a colon, followed by the machine code of the instruction. At the end of the listing there is a map showing how the program's sections were linked into the resulting file (**** ListMap), and also a list of global symbols (**** ListGlobals), in our case empty.

The translation of the program ended with errorlevel 0. We can then open the DOS emulator (DOSBox), go to the directory where we have HelloDos.com and try to run it by typing HelloDos or HelloDos.com. You will probably see the text Hello, world! followed by a jumble of nonsense characters in the DOS window. This is caused by overlooking a detail in the service description, DS:DX -> '$'-terminated string: the string being written out must end with a dollar sign. Along with the dollar sign, we can also add a Carriage Return and Line Feed pair, 13 and 10, to the end of the string to cause a line break:

    HelloWorldAddress DB "Hello, world!",13,10,"$"

After correcting the line and the new translation, everything should work as expected.
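Why the missing dollar sign produced garbage can be shown with a toy model of the service. This is a Python sketch, not DOS code: dos_write_string is an invented name, and the leftover bytes are a made-up stand-in for whatever happens to follow the string in memory.

```python
def dos_write_string(memory, dx):
    """Model of INT 21h/AH=09h: return the bytes from offset dx up to '$'."""
    end = memory.index(b"$", dx)      # the '$' itself is not printed
    return memory[dx:end]

# Hypothetical bytes that happen to follow the string in memory.
leftover = b"\x07\xc3?$"

print(dos_write_string(b"Hello, world!" + leftover, 0))
print(dos_write_string(b"Hello, world!\r\n$" + leftover, 0))
```

The first call runs on into the leftover bytes until it meets a dollar sign by chance; the second stops right after the Carriage Return and Line Feed.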

BIOS 16 bits

When the computer is turned on, the CPU starts in real mode and executes the program hardwired in memory on the motherboard. At that time, DOS or other operating system is not yet booted from the disk, but the BIOS or UEFI interface is working, which can perform several basic functions of the computer: print characters and strings to the monitor, read the keyboard, load a program from the disk to boot the OS. These functions can be called with INT instructions, especially INT 10h for working with the video adapter and thus the monitor. For a complete list of functions invoked by INT interrupts, see the Interrupt Jump Table.

In the previous example, the DOS interface was used to write out the string. If we needed to write to the screen immediately after the computer is turned on, before DOS or any other operating system is booted, we would have to use the BIOS interface hardwired into the motherboard firmware. This is the situation of, for example, a boot sector program: the BIOS loads the 512-byte contents of one disk sector into memory and passes control to it. Let's try writing out "Hello, world!" using the BIOS services. These, like the DOS services, preserve the contents of all registers except those that return a result. We'll use the TELETYPE OUTPUT service, which expects the character to be written out in AL, AH=0Eh as the service identifier, and BH=0 as the video adapter's internal page number.

Although EuroAssembler can generate a boot sector directly, by selecting PROGRAM FORMAT=BOOT, booting from such a boot sector in order to test it is complicated and inconvenient. For practical reasons, we will again generate the good old proven COM format:

    HelloBio PROGRAM FORMAT=COM, WIDTH=16, ENTRY=Start:
    Start: MOV AX,CS  ; Fill the DS segment register with the same value as CS.
           MOV DS,AX  ; MOV DS,CS cannot be used for this, as such an instruction is not supported.
           MOV SI,HelloWorldAddress
           MOV CX,HelloWorldSize
           CLD        ; Reset the Direction Flag for safety.
           SUB BX,BX  ; Use the base (zero) page of the video adapter.
           MOV AH,0Eh ; Function number of the BIOS INT 10h service TELETYPE OUTPUT.
    Next:  LODSB      ; Load one character of the string into AL, increment SI.
           INT 10h    ; Call the BIOS function to print the character.
           LOOP Next  ; Jump for the next character CX times.
           JMP $      ; End the program by endlessly jumping to itself. Without the OS, there is no other way.
    HelloWorldAddress DB "Hello, world!",10
    HelloWorldSize EQU $ - HelloWorldAddress
    ENDPROGRAM HelloBio

After running HelloBio.com in the DOSBox emulator, we again get the expected output Hello, world!. However, because the program loops endlessly at the end, we then have to terminate DOSBox forcibly.

Linux 32 bits

The executable format for Linux in EuroAssembler is called ELFX. Generated programs in this format get the file extension .x, which we can get rid of by renaming with mv HelloL32.x HelloL32 to obtain an executable program without an extension, as is customary in Linux. In addition to the parameters FORMAT=ELFX and WIDTH=32 of the pseudoinstruction PROGRAM, we must also specify the entry point of the program, i.e. the first instruction to be executed. We mark the entry point (ENTRY=) symbolically with a label, for example Start: or Main: etc.

If we work through this tutorial under MS Windows, we need to have WSL (Windows Subsystem for Linux) installed in Windows in order to run the Linux variants.

To output text in Linux, we again have to use the application interface of its kernel. In the case of a 32-bit system, a kernel function is called with the INT 80h instruction and its parameters are entered into the registers EBX, ECX, EDX, ESI, EDI, EBP, while the identifier of the called function is entered in EAX. Of course, we only fill the input registers that the kernel function requires; in the case of the write function (sys_write has three parameters) these are the registers EBX, ECX, EDX. The kernel call returns the result of the function in the EAX register, the other registers are returned unchanged.
To output the string "Hello, world!" we'll need the sys_write function to write to standard output. According to the Linux Syscall Reference (32 bit) this function has the identifier EAX=0x04. The first parameter in the EBX register specifies fd, which is the file descriptor (also known as file handle). For standard output this is the number 1, for error output it would be 2. In the next two registers we specify the second and third parameters, which are the address and size of the string to be output. We could specify the size as a number (in our example MOV EDX,13), but it makes more sense to specify it indirectly, as the difference between the address $ and HelloWorldAddress. If we later lengthened or shortened the HelloWorldAddress string, its size would be adjusted automatically. The dollar symbol $ denotes the current address of the instruction, in this case the EQU instruction. Since we specify the length explicitly, there is no need to end the string with a null character. So let's try it:

    HelloL32 PROGRAM FORMAT=ELFX, WIDTH=32, ENTRY=Start:
    Start: MOV EAX,4 ; Function sys_write.
           MOV EBX,1 ; File descriptor for standard output.
           MOV ECX,HelloWorldAddress
           MOV EDX,HelloWorldSize
           INT 80h   ; Print the string.
           RET
    HelloWorldAddress DB "Hello, world!"
    HelloWorldSize EQU $ - HelloWorldAddress
    ENDPROGRAM HelloL32

Compile the program in the hello.asm file again using euroasm hello.asm. If everything went well and EuroAssembler reported errorlevel 0, we can try to run it in native Linux or in WSL with the command ./HelloL32.x and watch what the console prints.

So the program works, but after printing the string it crashed with a Segmentation fault. The error was in the way the program exits: the RET instruction is not enough in Linux, we have to use the kernel function sys_exit. At the same time, we add a Line Feed character (10) to the end of the string, so that the console moves to a new line after the greeting is printed. Carriage Return is not usually used in Linux.

    HelloL32 PROGRAM FORMAT=ELFX, WIDTH=32, ENTRY=Start:
    Start: MOV EAX,4 ; Function sys_write.
           MOV EBX,1 ; File descriptor for standard output.
           MOV ECX,HelloWorldAddress
           MOV EDX,HelloWorldSize
           INT 80h   ; Print the string.
           MOV EAX,1 ; Function sys_exit.
           MOV EBX,0 ; Errorlevel when the program terminates.
           INT 80h   ; Exit program.
    HelloWorldAddress DB "Hello, world!",10
    HelloWorldSize EQU $ - HelloWorldAddress
    ENDPROGRAM HelloL32

After replacing the RET instruction with the system call, the program works as expected.
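For comparison, the same kernel service can be reached from a high-level language, which may help to see what the three registers carry. This Python sketch uses the standard os module; sys_write here is just an invented wrapper name around os.write, not the kernel entry itself.

```python
import os

MESSAGE = b"Hello, world!\n"       # "\n" is the Line Feed character (10)

def sys_write(fd, data):
    """Write data to file descriptor fd and return the number of bytes written.

    The arguments correspond to EBX (fd), ECX (address) and EDX (size);
    the return value corresponds to what the kernel leaves in EAX.
    """
    return os.write(fd, data)

count = sys_write(1, MESSAGE)      # 1 = standard output
print(count)                       # 14
```

The counterpart of sys_exit with EBX=0 would be os._exit(0), which ends the process immediately; it is left out here so that the sketch can be run inside a larger script.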

Linux 64 bits

The program for 64-bit Linux looks similar. One difference is the parameter WIDTH=64, which we have to specify in the PROGRAM pseudoinstruction, since its default is WIDTH=32. Another thing that distinguishes 64-bit from 32-bit Linux is that kernel functions are called not by INT 80h, but by the SYSCALL instruction. The numeric identifiers of kernel functions passed in RAX and the order of registers for parameter passing also differ: instead of EBX, ECX, EDX, ESI, EDI, EBP, the parameters are entered in registers RDI, RSI, RDX, R10, R8, R9. Otherwise the functions remain the same. For an overview of kernel calls, see e.g. Linux System Call Table for x86 64.

    HelloL64 PROGRAM FORMAT=ELFX, WIDTH=64, ENTRY=Start:
    Start: MOV RAX,1  ; Function sys_write.
           MOV RDI,1  ; File descriptor for standard output.
           MOV RSI,HelloWorldAddress
           MOV RDX,HelloWorldSize
           SYSCALL    ; Print the string.
           MOV RAX,60 ; Function sys_exit.
           MOV RDI,0  ; Errorlevel when the program terminates.
           SYSCALL    ; Exit program.
    HelloWorldAddress DB "Hello, world!",10
    HelloWorldSize EQU $ - HelloWorldAddress
    ENDPROGRAM HelloL64

We get a number of warnings when we try to compile it: W2340 This instruction requires option "EUROASM CPU=X64".

Even with these warnings our program would work, but we prefer to give the assembler the required option by adding the line EUROASM CPU=X64 before the HelloL64 PROGRAM pseudoinstruction, so that it does not needlessly point out that we are using 64-bit registers. After compiling again we should get errorlevel 0 and we can test our 64-bit program.

Windows 32 bits

Function calls in MS Windows use the Win32 application interface defined in dynamically linked libraries. Most of the basic functions are available in the kernel32.dll library.

We will again write to standard output. Unlike Linux, however, Windows has no fixed file identifiers such as 0 for standard input, 1 for standard output and 2 for error output. Instead, we must first ask the operating system which identifier (file handle) is currently used for standard output. To do this, we use a Win32 function called GetStdHandle. It takes as a parameter the identifier of the standard output device, which is the number -11. The value returned by the function in the EAX register is the file handle that we then pass to the WriteFile function.

What remains to be solved is how to call the Win32 functions and how to pass parameters (address and string size) to them. 32-bit Windows uses the standard call convention, where the parameters are first stored on the stack (PUSH) in order from last to first, and then the imported function is called (CALL). The programmer of the function in question takes care of removing the stored parameters from the stack. It's usually done by using RET n*4 instead of a regular RET return, where n is the number of parameters on the stack. The RET n*4 instruction works like a regular return from a function (RET), but then it also increases the ESP by n*4 bytes.
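The bookkeeping of the standard call convention can be traced with a small model of the stack. This is a Python sketch under simplifying assumptions: stdcall_trace is an invented name, the starting value 1000h of ESP is arbitrary, and the strings in the parameter list are mere placeholders for the five WriteFile arguments.

```python
def stdcall_trace(esp, parameters):
    """Model a StdCall invocation; return (callee's view of the stack, final ESP)."""
    stack = {}                               # address -> stored DWORD
    for value in reversed(parameters):       # PUSH from the last parameter to the first
        esp -= 4
        stack[esp] = value
    esp -= 4
    stack[esp] = "return address"            # CALL pushes the return address
    view = [stack[esp + 4 * i] for i in range(len(parameters) + 1)]
    esp += 4 + 4 * len(parameters)           # RET n*4: pop return address and n DWORDs
    return view, esp

view, esp = stdcall_trace(0x1000, ["hFile", "lpBuffer", "nBytes", "lpWritten", 0])
print(view)       # ['return address', 'hFile', 'lpBuffer', 'nBytes', 'lpWritten', 0]
print(hex(esp))   # 0x1000
```

Because the parameters are pushed in reverse order, the called function finds the first parameter at [ESP+4], the second at [ESP+8] and so on, and after RET n*4 the stack pointer is back where it was before the first PUSH.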

How do we make the CALL instruction recognize that we are calling a dynamically linked function? Either we include the winapi.lib import library in the program, or we define the name of the function with the IMPORT pseudoinstruction. This pseudoinstruction has the LIB= parameter, which defines the file (library) that contains the function. The library is defined in the description of each function on the Microsoft website in the Requirements paragraph. If the library name is "kernel32.dll", we do not need to specify it with the LIB= parameter. So let's try to write an executable for Windows that will have the PE (Portable Executable) format:

    HelloW32 PROGRAM FORMAT=PE, WIDTH=32, ENTRY=Start:, IconFile=
      IMPORT GetStdHandle, WriteFile, ExitProcess ; We use 3 imported functions from "kernel32.dll".
    Start: PUSH -11               ; 1. parameter: Identifier for the standard output.
      CALL GetStdHandle           ; This function returns file handle in EAX.
      PUSH 0                      ; 5. parameter (lpOverlapped): not used.
      PUSH lpNumberOfBytesWritten ; 4. parameter: address of DWORD where the number of written bytes is stored.
      PUSH HelloWorldSize         ; 3. parameter: number of bytes to write.
      PUSH HelloWorldAddress      ; 2. parameter: string address.
      PUSH EAX                    ; 1. parameter: file handle.
      CALL WriteFile              ; Function with parameters.
      PUSH 0                      ; 1. parameter: errorlevel.
      CALL ExitProcess            ; Exit program. This function returns nothing.
    HelloWorldAddress DB "Hello, world!",10
    HelloWorldSize EQU $ - HelloWorldAddress
    lpNumberOfBytesWritten DD 0   ; This is where Windows writes how many characters WriteFile has written.
    ENDPROGRAM HelloW32

    ...
    I0180 Assembling source file "hello.asm".
    I0270 Assembling source "hello".
    I0310 Assembling source pass 1.
    I0330 Assembling source pass 2 - final.
    I0470 Assembling program "HelloW32". "hello.asm"{1}
    I0510 Assembling program pass 1. "hello.asm"{1}
    I0510 Assembling program pass 2. "hello.asm"{1}
    I0530 Assembling program pass 3 - final. "hello.asm"{1}
    I0660 32bit FLAT PE file "HelloW32.exe" created, size=2588. "hello.asm"{16}
    I0650 Program "HelloW32" assembled in 3 passes with errorlevel 0. "hello.asm"{16}
    I0750 Source "HelloW32" (15 lines) assembled in 2 passes with errorlevel 0.
    I0860 Listing file "HelloW32.asm.lst" created, size=2514.
    I0980 Memory allocation 512 KB. 68 statements assembled in 1 s.
    I0990 EuroAssembler terminated with errorlevel 0.

After running the program in Windows with the command HelloW32.exe or just HelloW32 we should get the greeting Hello, world!.

Windows 64 bits

MS Windows in 64-bit mode uses the same Win32 functions and the same libraries such as kernel32.dll as 32-bit Windows, but the calling convention differs significantly: instead of StdCall, FastCall is used. In this convention, RCX, RDX, R8, R9 are used for the first four parameters. If a function requires more than four parameters, the remaining ones are stored on the stack, again in reverse order (from the last to the fifth). Then 32 bytes of stack space (the so-called shadow space) are reserved for the four parameters passed in the registers, even if the function has fewer than four parameters. In addition, the stack pointer (register RSP) must be aligned to an integral multiple of 16 bytes before calling an external function (before the CALL instruction). If floating point parameters are passed to the function, the lower half of XMM0, XMM1, XMM2, XMM3 is used instead of the corresponding RCX, RDX, R8, R9. The called function does not remove parameters from the stack when it is finished; this must be taken care of by the caller.

Equipped with this knowledge, we will try to write out the string in 64-bit Windows:

    EUROASM CPU=X64
    HelloW64 PROGRAM FORMAT=PE, WIDTH=64, ENTRY=Start:, IconFile=
      IMPORT GetStdHandle, WriteFile, ExitProcess
    Start: TEST SPL,08h              ; Check if the stack alignment is 16.
           JZ Round                  ; Jump over PUSH RAX, if stack was rounded to 16.
           PUSH RAX                  ; Otherwise execute instruction PUSH to reduce RSP by 8.
    Round: SUB RSP,4*8               ; Reserve shadow space for 4 registers.
           MOV RCX,-11               ; 1. parameter: identifier for standard output.
           CALL GetStdHandle         ; Function returns file handle in RAX.
           ADD RSP,4*8               ; Return stack pointer to Round state.
           PUSH RAX                  ; Instruction to decrease RSP by 8 (keeps the alignment).
           PUSH 0                    ; 5. parameter (lpOverlapped): not used.
           SUB RSP,4*8               ; Reserve shadow space for 4 registers.
           MOV R9,lpNumberOfBytesWritten ; 4. parameter: address of DWORD where the number of written bytes is stored.
           MOV R8,HelloWorldSize     ; 3. parameter: number of bytes to write.
           MOV RDX,HelloWorldAddress ; 2. parameter: string address.
           MOV RCX,RAX               ; 1. parameter: file handle.
           CALL WriteFile            ; Function with parameters.
           ADD RSP,6*8               ; Return the stack pointer to the Round state.
           SUB RSP,4*8               ; Reserve shadow space for 4 registers.
           MOV RCX,0                 ; 1. parameter - errorlevel.
           CALL ExitProcess          ; The function returns nothing.
    HelloWorldAddress DB "Hello, world!",10
    HelloWorldSize EQU $ - HelloWorldAddress
    lpNumberOfBytesWritten DQ 0      ; This is where Windows writes how many characters WriteFile has written.
    ENDPROGRAM HelloW64
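Whether RSP really is a multiple of 16 at every CALL can be checked by following the stack pointer through the listing. The Python sketch below does only this arithmetic; trace_rsp is an invented name and the two entry values of RSP are hypothetical, chosen to cover both outcomes of the TEST SPL,08h instruction.

```python
def trace_rsp(rsp):
    """Follow RSP through the HelloW64 listing; return RSP at each CALL."""
    at_calls = []
    if rsp & 0x08:           # TEST SPL,08h / JZ Round
        rsp -= 8             # PUSH RAX
    round_state = rsp        # label Round:
    rsp -= 4 * 8             # SUB RSP,4*8  (shadow space)
    at_calls.append(rsp)     # CALL GetStdHandle
    rsp += 4 * 8             # ADD RSP,4*8
    rsp -= 8                 # PUSH RAX     (padding that keeps the alignment)
    rsp -= 8                 # PUSH 0       (5th parameter)
    rsp -= 4 * 8             # SUB RSP,4*8  (shadow space)
    at_calls.append(rsp)     # CALL WriteFile
    rsp += 6 * 8             # ADD RSP,6*8
    assert rsp == round_state
    rsp -= 4 * 8             # SUB RSP,4*8  (shadow space)
    at_calls.append(rsp)     # CALL ExitProcess
    return at_calls

for entry in (0x7FF8, 0x7FF0):   # hypothetical values of RSP at Start:
    calls = trace_rsp(entry)
    print([hex(r) for r in calls], all(r % 16 == 0 for r in calls))
```

Both entry values lead to the same three stack pointers, all divisible by 16. The padding PUSH RAX before the fifth parameter is what keeps the number of pushed QWORDs even.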

Let's try this with macros

As you can see, especially in the last example for 64-bit Windows, calling a function as simple as writing out a greeting requires quite a large number of machine instructions. Let's try to do something about it. The key to reducing the programmer's workload is the use of macro instructions. Each macro instruction (macro for short) can replace a number of machine instructions, and it can accept input parameters and thus modify its operation according to the programmer's needs.

The syntax of the macro language and its nuances are properties of the particular assembler; virtually every assembler handles them in its own way. Here we will focus on writing macro instructions in the EuroAssembler language. It uses the percent sign % for everything related to macros: pseudoinstructions and names starting with a percent sign refer to macros or to auxiliary assembler variables. While common memory variables defined with the D pseudoinstruction, such as OrdinaryVar DD 1234h, define a memory location called OrdinaryVar containing a DWORD with the value 1234h, the %OrdinaryVar variable represents something quite different: a variable of EuroAssembler itself. It is not located in the compiled program code or its data; it exists only in EuroAssembler's memory while the assembler is running. Its content can be set by the %SET pseudoinstruction and it can be any text, arithmetic expression, string, number, etc. Whenever %OrdinaryVar appears in the source text, it is replaced by that text.
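The difference between the two kinds of variables can be illustrated by a short sketch. It assumes that %SET takes the variable name in the label field; the value 5678h is chosen only to tell the two variables apart.

```asm
; Sketch only: the label-field form of %SET is an assumption about EuroAssembler syntax.
OrdinaryVar:  DD 1234h              ; Memory variable: a DWORD stored in the program's data.
%OrdinaryVar  %SET 5678h            ; Assembler variable: exists only while EuroAssembler runs.
              MOV EAX,[OrdinaryVar] ; At run time, load the DWORD 1234h from memory.
              MOV EBX,%OrdinaryVar  ; At assembly time this line becomes MOV EBX,5678h.
```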

Macroinstructions are defined by a pair of block pseudoinstructions %MACRO and %ENDMACRO. The identifier in the %MACRO label field becomes the name of the macroinstruction. The machine instructions within the %MACRO/%ENDMACRO block are the body of the macro. Whenever the name of a macro instruction appears in the source code, it is replaced with all the machine instructions from its definition. Example macro for 64-bit Linux:
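A minimal sketch of such a macro, assuming the usual 64-bit Linux sys_write call (function number 1 in RAX, file descriptor in RDI, buffer address in RSI, length in RDX). The names Message and MessageSize in the invocation are hypothetical and must be defined elsewhere in the program.

```asm
; Sketch only: a two-parameter macro wrapping the 64-bit Linux sys_write call.
WriteString %MACRO StringAddress, StringSize
      MOV RAX,1      ; Kernel function sys_write.
      MOV RDI,1      ; File descriptor of the standard output.
      MOV RSI,%1     ; 1st macro parameter: address of the string.
      MOV RDX,%2     ; 2nd macro parameter: size of the string in bytes.
      SYSCALL        ; Call the kernel.
    %ENDMACRO WriteString

; Hypothetical invocation, which expands into the five instructions above:
      WriteString Message, MessageSize
```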

We have defined a macro called WriteString with two parameters: the address of the string and its length. Specifying the name of the macro in the program then causes it to be expanded into five machine instructions, with the first and second parameters available as variables %1 and %2. Instead of using numeric labels for the parameter variables (%1, %2), we could also use formal parameter names by prefixing the parameter name in the macro definition (StringAddress, StringSize) with a percent sign in the body of the macro:
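With formal names the body could look as follows. This is again only a sketch under the same assumptions about the 64-bit Linux sys_write call.

```asm
; Sketch only: the same macro, referring to its parameters by their formal names.
WriteString %MACRO StringAddress, StringSize
      MOV RAX,1              ; Kernel function sys_write.
      MOV RDI,1              ; File descriptor of the standard output.
      MOV RSI,%StringAddress ; Formal name instead of %1.
      MOV RDX,%StringSize    ; Formal name instead of %2.
      SYSCALL                ; Call the kernel.
    %ENDMACRO WriteString
```

The named form is longer to type, but the body stays readable when parameters are later added or reordered.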

We can further improve the macro by not hard-coding the file descriptor for the standard output passed in RDI as the number 1, but by specifying it as a parameter. And to avoid having to specify this parameter if we use its usual value of 1, we will specify it as a key parameter, i. e. with an equals sign and a default value of 1:
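A sketch of the macro extended this way. It assumes that the keyword parameter fd is referenced in the body as %fd, in the same way as the formal names; Message and MessageSize are hypothetical symbols defined elsewhere.

```asm
; Sketch only: WriteString with a keyword parameter fd and the default value 1.
WriteString %MACRO StringAddress, StringSize, fd=1
      MOV RAX,1              ; Kernel function sys_write.
      MOV RDI,%fd            ; File descriptor: 1 unless fd= is given at invocation.
      MOV RSI,%StringAddress ; Address of the string.
      MOV RDX,%StringSize    ; Size of the string in bytes.
      SYSCALL                ; Call the kernel.
    %ENDMACRO WriteString

      WriteString Message, MessageSize        ; Writes to the standard output (fd=1).
      WriteString Message, MessageSize, fd=2  ; Writes to the error output.
```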

We can now use the same macro to write to the error output instead of the standard output; just add the fd=2 parameter.

Literal instead of symbol

EuroAssembler allows you to define memory variables using literals, i. e., directly specified values. Instead of defining a DB pseudoinstruction string and inventing its symbolic name, we define it only when its value is used in an instruction:
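As a sketch, the 64-bit Linux sys_write call could take its string directly from a literal. The length 13 is counted by hand for this particular text.

```asm
; Sketch only: 64-bit Linux sys_write with the string given as a literal.
      MOV RAX,1                   ; Kernel function sys_write.
      MOV RDI,1                   ; File descriptor of the standard output.
      MOV RSI,=B "Hello, world!"  ; Address of the anonymous string defined by the literal.
      MOV RDX,13                  ; Length of "Hello, world!" in bytes.
      SYSCALL                     ; Call the kernel.
```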

A literal is defined by an equals sign = followed by a type specification (BYTE, WORD, DWORD, QWORD, or just B, W, D, Q) and then its value. The advantage of a literal over a symbol is that we don't have to invent a name for it and we can immediately see its value in the instruction where it was used.

Using the macro language and literals, we can now write our own macros to output the string to standard output. For example, let's call them StdOutput. This work has already been done and the macros StdOutput are listed in the libraries supplied with EuroAssembler, namely the macrolibraries dosapi.htm for DOS 16 bits, linapi.htm for Linux 32 bits, linabi.htm for Linux 64 bits, winapi.htm for Windows 32 bits, winabi.htm for Windows 64 bits.

With the StdOutput macro and literals, our test programs become much simpler. Just include the appropriate library according to the target platform. The INCLUDE pseudoinstruction causes the named library (which is just another source file) to replace the INCLUDE line with its contents.

API (16 and 32 bit) or ABI (64 bit) libraries contain definitions for the macros StdOutput, TerminateProgram and several others.

For simplicity, we will write the DOS, Linux and Windows programs into a single source file hello.asm. Since the macros for writing to standard output and for terminating the program have the same names in all libraries (StdOutput and TerminateProgram), we should tell EuroAssembler to forget their definitions from the previous library before defining the next program, using the %DROPMACRO pseudoinstruction.

         EUROASM CPU=X64
HelloDos PROGRAM FORMAT=COM, WIDTH=16 ; DOS version.
         INCLUDE dosapi.htm
         StdOutput =B "Hello, world!"
         TerminateProgram
         ENDPROGRAM HelloDos
         %DROPMACRO *
HelloL32 PROGRAM FORMAT=ELFX, WIDTH=32, ENTRY=Start ; Linux 32 bit version.
         INCLUDE linapi.htm
Start:   StdOutput =B "Hello, world!"
         TerminateProgram
         ENDPROGRAM HelloL32
         %DROPMACRO *
HelloL64 PROGRAM FORMAT=ELFX, WIDTH=64, ENTRY=Start ; Linux 64 bit version.
         INCLUDE linabi.htm
Start:   StdOutput =B "Hello, world!"
         TerminateProgram
         ENDPROGRAM HelloL64
         %DROPMACRO *
HelloW32 PROGRAM FORMAT=PE, WIDTH=32, ENTRY=Start, ICONFILE= ; Windows 32 bit version.
         INCLUDE winapi.htm
Start:   StdOutput =B "Hello, world!"
         TerminateProgram
         ENDPROGRAM HelloW32
         %DROPMACRO *
HelloW64 PROGRAM FORMAT=PE, WIDTH=64, ENTRY=Start, ICONFILE= ; Windows 64 bit version.
         INCLUDE winabi.htm
Start:   StdOutput =B "Hello, world!"
         TerminateProgram
         ENDPROGRAM HelloW64

After translating the above file with the familiar euroasm hello.asm command, we should get five programs named HelloDos.com, HelloL32.x, HelloL64.x, HelloW32.exe, HelloW64.exe, which we can immediately try with the help of emulators (DosBox, WSL, wine).

Definition and expansion of macros

The block of instructions between %MACRO and %ENDMACRO represents the definition of a macro. The definition itself does not do anything interesting yet; it just takes up space in the source code. Only when we use the macro in the program is it expanded, i. e. the macro name is replaced by the instructions from the macro body (and errors may be reported, if we made any while writing the macro).

The macro definition can be written at the beginning of the PROGRAM/ENDPROGRAM block, or before this block, or even in a separate included file (library), but always before the macro is used (expanded) for the first time. Macros (together with EuroAssembler variables starting with the percent sign) pass through the boundaries of the PROGRAM/ENDPROGRAM block; they are visible throughout the source code starting from their definition. This is where they differ from symbols, which must be unique within the PROGRAM/ENDPROGRAM block and must not be repeated.

Thus, macros and %variables can be redefined within the same source file. However, macro redefinitions are somewhat uncommon, so EuroAssembler responds with the warning message W2512 Overwriting previously defined macro. If we need to overwrite a macro with a different macro definition with the same name, it is better to first make it forget its previous definition using the %DROPMACRO pseudo-instruction.

Retrieving information from the user

Passing text to the user of our program was easy: in the previous examples we used an operating system service usually called write or similar, perhaps wrapped in the StdOutput macro. Now let's look at the opposite case, where we want to get something from the user. One possibility is to retrieve the arguments from the command line that was used to run our program.

From the command line

If our program is called MyCalc.exe, for example, and we typed MyCalc.exe 2 + 3 in the console, the operating system will provide us with a string containing the same information MyCalc.exe 2 + 3. That is the name of the running program (the 0-th argument) and then an exact copy of the following characters, including spaces or other characters separating the arguments. Where is this string stored?

In DOS, it is in the PSP structure starting at byte 81h. The previous byte at address 80h contains the length of the string.
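A minimal sketch of walking through this string in a COM program. It assumes that the program has not changed DS since it was started, so DS still addresses the PSP; the program name ArgsDos and the labels are chosen for illustration only.

```asm
; Sketch only: assumes a COM program at its entry, where DS still addresses the PSP.
ArgsDos PROGRAM FORMAT=COM, WIDTH=16
        MOV CL,[80h]   ; Length of the string stored in the PSP.
        SUB CH,CH      ; CX = the length as a 16-bit count.
        MOV SI,81h     ; DS:SI points to the first character of the string.
        JCXZ Done:     ; Nothing was typed after the program name.
Next:   LODSB          ; Load one character of the string to AL.
                       ; ... examine AL here (hypothetical processing) ...
        LOOP Next:     ; Repeat for all CX characters.
Done:   MOV AX,4C00h   ; DOS function 4Ch: terminate with errorlevel 0.
        INT 21h
        ENDPROGRAM ArgsDos
```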

In Windows we get a pointer to a similar string using the API function GetCommandLine. If we need each argument separately, we have to read the string character by character, e.g. with the LODSB instruction, and respond to the separator characters (this is called parsing).

It's a bit different in Linux: the string is already parsed into the program name and its space-separated arguments, and each of these items is terminated by a null character and the pointers to them are stored on the stack. This is shown schematically in the figure for the GetArg macro. The addresses of the stack items in this figure grow upwards. The width of each item is 8 bytes in a 64-bit program or 4 bytes in a 32-bit program. The MOV RCX,[RSP] instruction at the beginning of the program would load the total number of arguments into RCX. The address of the first argument is obtained by MOV RSI,[RSP+2*8], the address of the second argument is obtained by MOV RSI,[RSP+3*8] and so on.
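The following sketch measures the length of the first argument in a 64-bit Linux program. It must run at the program entry, before anything else is pushed on the stack; the labels Scan, Found and NoArg are hypothetical.

```asm
; Sketch only: find the address and length of the first argument (64-bit Linux).
Start:  MOV RCX,[RSP]      ; Number of arguments, including the program name.
        CMP RCX,2          ; Was at least one argument supplied?
        JB NoArg:          ; Jump if only the program name is present.
        MOV RSI,[RSP+2*8]  ; Pointer to the first argument.
        SUB EDX,EDX        ; RDX = 0, it will count the characters.
        SUB EAX,EAX        ; AL = 0, the terminating character we look for.
Scan:   CMP AL,[RSI+RDX]   ; Is this the terminating null character?
        JE Found:          ; Jump if yes.
        INC RDX            ; Otherwise count the character
        JMP Scan:          ; and examine the next one.
Found:                     ; RSI = address and RDX = length of the first argument.
NoArg:                     ; Execution continues here in both cases.
```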

Or we can use the ready-made GetArg macro, which delivers the individual arguments already parsed, regardless of the operating system. Just use the appropriate macro library for DOS, Linux or Windows in the required 16, 32 or 64 bit width. Macro libraries are loaded with the INCLUDE pseudoinstruction, which takes as its parameter the file dosapi.htm, linapi.htm, linabi.htm, winapi.htm or winabi.htm. The GetArg macro has the same name in all of these libraries, and it returns a pointer to the requested argument in rSI and its length in rCX. If the requested argument was not supplied on the command line, it returns with the Carry Flag set.

Let's try programming a primitive four-function calculator. After entering two integers separated by the sign of an arithmetic operation, our calculator should return the correct result. The allowed operations are selected by the signs +, -, *, / for addition, subtraction, multiplication and division.

The target platform will be 32-bit Windows, so the corresponding library will be winapi.htm. If we write 3 + 4 to the CalcW32 command line, we want to get a result of 7. Let's call the source code file calc.asm, for example:

CalcW32 PROGRAM FORMAT=PE, WIDTH=32, ENTRY=Start:, IconFile=
        INCLUDE winapi.htm
Start:  GetArg 1         ; Use the macro to retrieve the 1st argument of our program.
        JC Error:        ; If the argument is missing, jump to the Error label.
; ESI points to the first argument (digit 3), ECX holds its length (1 byte).
; We don't have enough registers to permanently remember all arguments,
; so to store both numeric arguments and the sign of the operation
; we create three empty string variables Arg1, Arg2, Arg3 of length one byte.
Arg1:   D BYTE           ; Define a memory variable of type BYTE. Its name is Arg1.
Arg2:   D BYTE
Arg3:   D BYTE
; Since we have successfully loaded the first argument (CF=0, did not jump to the Error:),
; let's store the argument in a string variable and load the next one.
        MOV EDI,Arg1
        REP MOVSB        ; Copy the string from the ESI address to the EDI address in ECX byte length.
        GetArg 2
        JC Error:
        MOV EDI,Arg2
        REP MOVSB        ; Store the character of the requested operation (+ in this case).
        GetArg 3
        JC Error:
        MOV EDI,Arg3
        REP MOVSB        ; Store the third argument (digit 4).
; The arguments are specified. Let's try to display them for checking:
        StdOutput Arg1, Arg2, Arg3, Size=1 ; Check listing of the arguments.
        CMP [Arg2],'+'   ; Has an addition operation been specified?
        JNE Error:       ; We have not yet taught our program another one.
        StdOutput =B "=" ; Print the equals sign, specified as a literal.
        MOV AL,[Arg1]    ; Load the first argument to the AL register.
        ADD AL,[Arg3]    ; Add the third argument to it.
; Now we should print the sum from the AL register.
; However, the StdOutput macro prints the contents of memory, not the register.
; Therefore we need to save AL first to memory, perhaps named Result.
Result: D BYTE
        MOV [Result],AL
        StdOutput Result, Size=1 ; Print the resulting sum.
        TerminateProgram ; Macro from winapi.htm library to exit.
Error:  StdOutput Help
Help:   DB "Calculate number1 operation number2.", 13, 10
        DB "Example: %^PROGRAM 3 + 4", 13, 10, 0
        ENDPROGRAM

You may be puzzled by the definition of the variables Arg1, Arg2, Arg3 in the middle of the instruction flow. This is a bad programming technique, but it works in EuroAssembler thanks to the EUROASM AUTOSEGMENT=ENABLED parameter, which is enabled by default. Autosegmentation distinguishes whether a machine instruction or a data definition has been placed on a line, and accordingly divides the output of the translated instructions into separate sections for program code, initialized data, and uninitialized data. These sections have the traditional names [.text], [.data], [.bss]. If we look at the listing calc.asm.lst, we can see the automatic change of sections there.
However, it is not very wise to rely on autosegmentation in longer programs; rather, we should place the Arg* data items below the program code right when writing the program.

Next, note the help line DB "Example: %^PROGRAM 3 + 4", 13, 10, 0. The %^PROGRAM expression here is not a user-defined variable due to the ^ (caret) following the percent sign. Such variables are system variables, their value is set by EuroAssembler itself, in this particular case to the program name (CalcW32). EuroAssembler has many system variables, each parameter of the EUROASM and PROGRAM pseudoinstructions has one, see the help crib.

Thanks to the line StdOutput Arg1, Arg2, Arg3, Size=1 ; Check listing of the arguments. the program lists the arguments with which CalcW32 was run. This is a good tactic for debugging. We can see that all three arguments were entered correctly.

A more experienced programmer would have found several more errors in CalcW32. For example, it is not specified where the program should continue after the help text is printed at the Error label. We want it to exit the program. So let's add an End: label to the TerminateProgram macro and let the program jump to it after printing the Help.
Another error, which causes the incorrect result 3+4=g, is not distinguishing between a binary number and the ASCII codes of the individual digits that GetArg returns. When given the digit 3 as an argument, the GetArg macro returns its ASCII code, which is 33h. Similarly, instead of the digit 4 we get 34h, and since we have added the ASCII codes (and not the numbers), the result is 33h + 34h = 67h, which is the ASCII code of the letter g and not of the digit 7. Before using the ADD instruction we need to convert the ASCII codes to plain binary numbers. This is easily done by subtracting 30h, i. e. '0', from the ASCII code of the digit. After the binary addition, we convert the sum back by adding 30h, and we can print the resulting ASCII code using StdOutput. Let's correct the program CalcW32:

CalcW32 PROGRAM FORMAT=PE, WIDTH=32, ENTRY=Start:, IconFile=
        INCLUDE winapi.htm
[.text]                  ; This is the EuroAssembler way to indicate that machine instructions will follow.
Start:  GetArg 1         ; Use the macro to retrieve the 1st argument of our program.
        JC Error:        ; If an argument is missing, jump to the Error label.
; ESI points to the first argument (to the digit 3), ECX holds its length (1 byte).
        MOV EDI,Arg1
        REP MOVSB        ; Copy the string of ECX bytes from the ESI to the EDI address.
        GetArg 2
        JC Error:
        MOV EDI,Arg2
        REP MOVSB        ; Save the character of the requested operation (in this case +).
        GetArg 3
        JC Error:
        MOV EDI,Arg3
        REP MOVSB        ; Save the third argument (digit 4).
; The arguments are specified. Let's try listing them for checking:
        StdOutput Arg1, Arg2, Arg3, Size=1 ; Check listing of the arguments.
        CMP [Arg2],'+'   ; Has an addition operation been specified?
        JNE Error:       ; We have not yet taught our program another one.
        StdOutput =B "=" ; Print equals sign, specified as a literal.
        MOV AL,[Arg1]    ; First argument to the AL register.
        SUB AL,'0'       ; Convert ASCII to a binary value.
        MOV BL,[Arg3]    ; Third argument to the BL register.
        SUB BL,'0'       ; Convert ASCII to a binary value.
        ADD AL,BL        ; Binary sum of two integer numbers.
        ADD AL,'0'       ; Convert a binary number to ASCII.
        MOV [Result],AL  ; Store the ASCII result to memory.
        StdOutput Result, Size=1 ; Print the resulting sum.
End:    TerminateProgram ; Macro from the winapi.htm library to terminate.
Error:  StdOutput Help
        JMP End:
[.data]                  ; This is where the program code section ends and the data begins.
Help:   DB "Calculate number1 operation number2.", 13, 10
        DB "Example: %^PROGRAM 3 + 4", 13, 10, 0
Arg1:   D BYTE           ; Define a memory variable named Arg1 of type BYTE.
Arg2:   D BYTE
Arg3:   D BYTE
Result: D BYTE
        ENDPROGRAM

The program already works fine, but only for adding single-digit numbers, and we have to keep the spaces between the arguments. When entering CalcW32 3+4 without spaces, the program prints the Help, so something is wrong. Apparently the whole string 3+4 was taken as a single argument, and GetArg reported an error when asked for the nonexistent second and third ones.

We will also have to solve the input and output of binary numbers in ASCII form for numbers longer than a single digit. Unfortunately, operating systems do not offer any function to convert ASCII numbers to binary and back; we have to program everything ourselves. For example, if the input string of the first number consists of the characters "123", we expect the result in binary form as the number 123, i. e. hexadecimal 7Bh in an 8-bit register or 0000007Bh in a 32-bit register. How do we convert the string "123" to a binary number? First we convert each successively read character in the range "0" to "9" to a number in the range 0 to 9 by subtracting "0", i. e. 30h. Then there are two possible approaches to combining the digits obtained this way into the result: either multiply each digit by its decimal weight (1, 10, 100, etc.) and add up the products, or keep a running result that is multiplied by 10 each time before the next digit is added to it.

The second method looks simpler because we don't have to calculate increasing weights of 1, 10, 100, etc., instead we make do with repeated multiplication by 10. We'll call this algorithm ASCIItoInteger. We'll probably use it more often in various programs, so we'll encapsulate it in a procedure using the PROC and ENDPROC pseudoinstructions. Then we will call it with CALL ASCIItoInteger whenever we need to convert a number from ASCII characters to an integer binary value.
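To see how repeated multiplication by 10 builds the number, here is the computation for the input "123" written out step by step. It is only a trace with the digits entered as constants, not the procedure itself.

```asm
; Sketch only: repeated multiplication by 10, traced for the input string "123".
      SUB EAX,EAX   ; Accumulator EAX = 0.
      MOV EDI,10    ; Constant multiplier.
      MUL EDI       ; EAX = 0 * 10 = 0. MUL also overwrites EDX with the upper half of the product.
      ADD EAX,1     ; Digit '1' - '0' = 1, EAX = 1.
      MUL EDI       ; EAX = 1 * 10 = 10.
      ADD EAX,2     ; Digit '2' - '0' = 2, EAX = 12.
      MUL EDI       ; EAX = 12 * 10 = 120.
      ADD EAX,3     ; Digit '3' - '0' = 3, EAX = 123 = 7Bh.
```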

ASCIItoInteger PROC      ; Define a procedure.
        SUB EAX,EAX      ; Result accumulator.
        SUB EBX,EBX      ; Number conversion register.
        SUB ECX,ECX      ; Input number length.
        MOV EDI,10       ; A constant multiplier.
Next:   MOV BL,[ESI+ECX] ; Retrieve the ECX-th character from the input string ESI.
        SUB BL,'0'       ; Convert to binary if it was a decimal digit.
        JB End:          ; Jump if it was not a digit.
        CMP BL,9         ; Check for an upper bound of the digit.
        JA End:          ; Jump if it was not a digit.
        INC ECX          ; Add 1 to the register specifying the length of the input number.
        MUL EDI          ; Multiply the accumulator by ten.
        JC Over:         ; Jump if 32 bits overflow.
        ADD EAX,EBX      ; Add the last digit to the accumulator.
        JMP Next:        ; Process the next character.
End:    CLC              ; Return with the CF flag cleared. Binary number is in EAX.
Over:   RET              ; Return with CF flag set. The result is not defined.
        ENDPROC ASCIItoInteger ; End of procedure.

All that remains is to complete the procedure with a description of exactly what it does, what values it expects on input, and what it provides on output:

; Description: The ASCIItoInteger procedure converts a string of ASCII digits to a binary integer.
; Input:    ESI contains a pointer to the first character of the number being converted in memory.
;           The procedure reads characters as long as they contain digits or until an overflow occurs.
; Output:   Carry Flag is zero, EAX contains the converted number in unsigned binary form.
;           ECX contains the number of ASCII characters processed from the input string.
; Error:    Carry Flag is set when the input number would exceed 32 bits. EAX is undefined.
;           ECX contains the number of processed ASCII characters from the input string.
; Clobbers: EBX,EDX,EDI.

We have thus produced something like a black box, for which we no longer have to think about its instructions, but we are concerned only with its description as a whole. Later we can put the whole procedure in a separate file and thus create a library that we can then include in all programs that require string-to-number conversions. We can call the library for example libcvt32.asm.

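In the simple calculator example, we still need to solve the reverse conversion of a binary number to a series of ASCII characters representing that number in a format that we can print using StdOutput. Again, there are two options here: either repeatedly divide the number by 10 and take the remainders, which yields the digits from the last to the first, or divide it successively by decreasing powers of ten, which yields the digits from the first to the last.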

The second approach is made more complicated by the need to maintain divisors of 1_000_000_000, 100_000_000, 10_000_000 etc., so let's use the first method. Again, we will program this as a separate procedure as it is a frequently used function.
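Repeated division by 10 can be traced for the number 123 as follows. This is only an illustration with a fixed input; note that the digits appear in reverse order, from the last to the first.

```asm
; Sketch only: repeated division by 10, traced for the input number 123.
      MOV EAX,123   ; The number to convert.
      MOV EDI,10    ; Constant divisor.
      SUB EDX,EDX   ; The upper half of the dividend EDX:EAX must be zero.
      DIV EDI       ; EAX = 12, remainder EDX = 3: the last digit '3'.
      SUB EDX,EDX
      DIV EDI       ; EAX = 1, remainder EDX = 2: the middle digit '2'.
      SUB EDX,EDX
      DIV EDI       ; EAX = 0, remainder EDX = 1: the first digit '1'.
```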

; Description: The IntegerToASCII procedure converts a binary number from the EAX register to a string of ASCII digits.
; Input:    EAX contains an input unsigned binary number in the range 0..4294967295.
;           ESI contains a pointer to an output field of 10 bytes.
; Output:   ESI is incremented by 0..9 bytes and points to the first valid digit of the result.
; Error:    No error can occur.
; Clobbers: EAX,ECX,EDX,EDI.
IntegerToASCII PROC      ; Define the procedure.
        MOV ECX,10       ; Number of digits in the result.
        MOV EDI,10       ; Divisor.
Next1:  SUB EDX,EDX      ; The high 32 bits of the dividend EDX:EAX must be zeroed before division.
        DIV EDI          ; EAX is now the quotient, EDX is the remainder 0..9.
        ADD DL,'0'       ; Convert remainder DL to the digit '0'...'9'.
        DEC ECX          ; Store digits backwards.
        JS End1:         ; Jump when all ten digits have been stored.
        MOV [ESI+ECX],DL ; Store a digit at the end of the result.
        JMP Next1:       ; Process the next digit.
End1:   ; The result field now contains '0000000123' for input number 123.
        ; Let ESI point to the first non-zero digit of the result.
        LEA EDI,[ESI+10] ; EDI marks the end of the field; ESI will not be read beyond it.
Next2:  CMP ESI,EDI      ; End of output number?
        JAE End2:        ; Jump if yes, ESI will point to the last '0'.
        LODSB            ; Read result digit from ESI, increment ESI.
        CMP AL,'0'       ; Is this still a leading zero?
        JE Next2:        ; Jump if yes and examine the next digit.
End2:   DEC ESI          ; Return ESI to the first valid digit.
        RET              ; Return from the procedure.
        ENDPROC IntegerToASCII

Let's return to our calculator. The disadvantage of reading input from the command line is that it is only usable for one example; to enter another calculation, the program must be run again. Therefore, we will learn to read input characters entered from the keyboard so that we can enter different numerical examples repeatedly.

Direct reading of the keyboard and mouse using IN and OUT instructions is no longer practical in the age of USB peripherals and protected operating systems. The lowest level of keyboard handling in DOS and its emulations is provided by the INT 16h interface, specifically the function CHECK FOR KEYSTROKE and, once it detects that a key has been pressed, the function GET KEYSTROKE, which returns the ASCII character in AL.
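A minimal sketch of this polling in a 16-bit DOS program, using the BIOS functions 01h and 00h of INT 16h. The label Poll is chosen for illustration.

```asm
; Sketch only: wait for a key through the INT 16h interface (16-bit DOS program).
Poll: MOV AH,01h   ; INT 16h function 01h: CHECK FOR KEYSTROKE.
      INT 16h      ; ZF=1 if no keystroke is waiting.
      JZ Poll:     ; Keep polling until a key is pressed.
      MOV AH,00h   ; INT 16h function 00h: GET KEYSTROKE.
      INT 16h      ; AL = ASCII character, AH = scan code of the key.
```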

From standard input

In Windows, Linux, and even DOS, a function that reads from standard input is preferable. It is redirectable and returns the text entered from the keyboard line by line. The contents of a text file can be redirected to a program that reads standard input, for example in Linux with cat answers.txt | program and in DOS/Windows with type answers.txt | program.exe.

To read standard input, use the function READ FROM FILE OR DEVICE in DOS, sys_read in Linux, and ReadFile in Windows. These functions do not return an ASCII character each time a key is pressed; instead they implement a line editor. This means that the typed text can be overwritten or deleted using the Backspace key, and it is returned to the program only when the Enter key is pressed. The returned text is terminated by a Line Feed character (10) in Linux, and by a pair of Carriage Return and Line Feed characters (13, 10) in DOS and Windows. In addition, these functions report the number of characters actually read, including the terminating 0Dh,0Ah.
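For DOS, the call could be sketched as follows (INT 21h function 3Fh with handle 0). The symbols Buffer and Error are assumed to be defined elsewhere in the program, with Buffer at least 80 bytes long.

```asm
; Sketch only: read one line from standard input in a 16-bit DOS program.
      MOV AH,3Fh     ; DOS function 3Fh: READ FROM FILE OR DEVICE.
      MOV BX,0       ; Handle 0 is the standard input.
      MOV CX,80      ; Maximum number of bytes to read.
      MOV DX,Buffer  ; DS:DX points to the input buffer (assumed to exist).
      INT 21h        ; CF=0: AX = number of bytes read, including 0Dh,0Ah.
      JC Error:      ; CF=1: AX contains an error code.
```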

Let's try to improve our not-very-functional-yet calculator in the calc.asm source file by reading from the standard input. Let's assume we already have a text file called libcvt32.asm, in which we have stored the two conversion routines above, ASCIItoInteger and IntegerToASCII. We will insert this file into calc.asm using INCLUDE libcvt32.asm, so the calculator source code will be much shorter.

CalcL32 PROGRAM FORMAT=ELFX, WIDTH=32, ENTRY=Start:
        INCLUDE libcvt32.asm, linapi.htm ; Use the functions from these libraries.
[.text]                  ; This is the EuroAssembler way to indicate that machine instructions will follow.
Start:  StdOutput Prompt ; Introduce the program, prompt for input.
        MOV EAX,3        ; Linux32 kernel function #3 - Read from a file or device.
        MOV EBX,0        ; The 1st parameter is the file descriptor of the standard input.
        MOV ECX,Buffer   ; The 2nd parameter is the address of the buffer where the entire input is read, e.g. the string "3 + 7".
        MOV EDX,SIZE# Buffer ; The SIZE# attribute returns the size of the Buffer in bytes. We could also use MOV EDX,80.
        INT 80h          ; Call a Linux kernel function.
; The Buffer is now filled with EAX characters of the input.
; It should contain two ASCII numbers separated by an arithmetic operator.
        MOV ESI,Buffer
Next1:  LODSB            ; Load the next character.
        CMP AL,0         ; End of input string?
        JE Error:
        CMP AL,' '       ; Separating spaces should be skipped.
        JE Next1:
        DEC ESI          ; Return a pointer to a valid character.
        CALL ASCIItoInteger ; Procedure from libcvt32.asm library.
        JC Error:
        MOV [Arg1],EAX   ; Store the first number.
        ADD ESI,ECX      ; Place ESI after the first number loaded.
Next2:  LODSB            ; Load the next character.
        CMP AL,0         ; End of input string?
        JE Error:
        CMP AL,' '       ; Separating spaces should be skipped.
        JE Next2:
        MOV [Arg2],AL    ; Store operator.
Next3:  LODSB            ; Load the next character.
        CMP AL,0
        JE Error:
        CMP AL,' '       ; Separating spaces must be skipped.
        JE Next3:
        DEC ESI          ; Return pointer to valid character.
        CALL ASCIItoInteger ; Procedure from libcvt32.asm.
        JC Error:
        MOV [Arg3],EAX   ; Store the second number.
        CMP [Arg2],'+'
        JE Addition:
        CMP [Arg2],'-'
        JE Subtraction:
        CMP [Arg2],'*'
        JE Multiplication:
        CMP [Arg2],'/'
        JE Division:
Error:  StdOutput Help   ; If an invalid operator was specified, a help is printed.
        JMP End:
Addition:
        MOV EAX,[Arg1]
        ADD EAX,[Arg3]
        MOV ESI,Result
        CALL IntegerToASCII
        StdOutput ESI
        JMP End:
Subtraction:
        MOV EAX,[Arg1]
        SUB EAX,[Arg3]
        MOV ESI,Result   ; The output field must be set before each conversion.
        CALL IntegerToASCII
        StdOutput ESI
        JMP End:
Multiplication:
        MOV EAX,[Arg1]
        MUL [Arg3]
        MOV ESI,Result
        CALL IntegerToASCII
        StdOutput ESI
        JMP End:
Division:
        MOV EAX,[Arg1]
        SUB EDX,EDX
        DIV [Arg3]
        MOV ESI,Result
        CALL IntegerToASCII
        StdOutput ESI
        JMP End:
End:    TerminateProgram ; Macro to terminate the program.
[.data]                  ; This is where the program code section ends and the data begins.
Arg1    DD 0
Arg2    DD 0
Arg3    DD 0
Result  DB 10 * B
        DB 10,0
Buffer  DB 80 * B
Prompt: DB "%^PROGRAM: Enter two integer numbers separated with arithmetic operator + - * /.",13, 10, 0
Help:   DB "Calculate number1 operator number2.", 13, 10
        DB "Example: 3 + 4", 13, 10, 0
        ENDPROGRAM

When trying to compile the previous program, EuroAssembler reported the following errors:

The error occurred because we used the labels called Next1, Next2, End multiple times in the procedures and in the main program. We should come up with more unique names, but there is an even better solution to get rid of duplicates: make the symbols local. Local names start with a period, and EuroAssembler actually remembers them associated with the name of the procedure or program in which the label was defined. So End in the IntegerToASCII procedure, when renamed to .End, will be stored internally as IntegerToASCII.End, and will not conflict with End in the CalcL32 program.
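A small sketch of the idea. Twice is a hypothetical procedure that is not part of libcvt32.asm, and TerminateProgram is assumed to come from one of the included macro libraries.

```asm
; Sketch only: a local label inside a procedure next to a plain label in the program.
Twice  PROC             ; Hypothetical procedure: doubles EAX, returns CF=1 on overflow.
       ADD EAX,EAX
       JC .End:         ; Refers to the local label, stored internally as Twice.End.
                        ; ... further instructions could follow here ...
.End:  RET
       ENDPROC Twice

End:   TerminateProgram ; Plain End belongs to the program and does not collide with Twice.End.
```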

What about the periods before the symbol name and the colons after it? The period at the beginning indicates the symbol as local, in fact the symbol name is modified by prefixing the namespace name (program, procedure, structure) before the local name. See also the namespace paragraph in manual.

The colon may, but need not, be appended after the symbol name to emphasize that it is a symbol and not the name of a structure, register or instruction. Unlike in most other assemblers, in €ASM a colon can be appended after the symbol name not only when defining it in the label field, but also whenever the symbol is mentioned, for example in MOV RSI, Symbol:. And if the colon is doubled, it also declares the global visibility of the symbol, so we don't have to specify it explicitly with GLOBAL Symbol, or with PUBLIC and EXTERN.

After changing the label names inside the procedures in libcvt32.asm (Next, End, Over in ASCIItoInteger and Next1, Next2, End1, End2 in IntegerToASCII) to their local versions by prefixing them with a period, the program should compile without errors, and we can test it with longer numbers and different operations.

We can see that retrieving arguments from standard input works better than getting them from the program command line. Still, the ./CalcL32.x program had to be run again after each example, since we have not yet programmed a transition to a new input after each successful calculation. This is easily remedied by simply including a jump to the beginning, i. e., End: JMP Start: instead of TerminateProgram at the End label. Better yet, replace all jumps to End: with a jump to Start:.

Another change would be to replace the Linux kernel call with the StdInput macro from the linapi.htm library, which does essentially the same thing and is easily replaceable by the macro of the same name from the libraries for other operating systems.

The last thing to fix in the previous source code is the redundancy in the arithmetic operations Addition, Subtraction, Multiplication, Division: loading the first number with MOV EAX,[Arg1] needs to be done only once, before the operator is examined, and it then serves all four possible operations. Likewise, the instructions that convert and print the result (CALL IntegerToASCII, StdOutput ESI and the final jump) are repeated, so for the second, third and fourth arithmetic operations we can replace them by jumping to the first one. The program source code in the calc.asm file is shortened by these interventions:

CalcL32 PROGRAM FORMAT=ELFX, WIDTH=32, ENTRY=Start:
        INCLUDE libcvt32.asm, linapi.htm ; Use the functions from these libraries.
[.text]                  ; This is the EuroAssembler way of indicating that machine instructions will follow.
Start:  StdOutput Prompt ; Introduce the program, prompt for input.
        StdInput Buffer  ; Load the input into the Buffer variable.
        MOV ESI,Buffer
Next1:  LODSB            ; Load the next character.
        CMP AL,0         ; End of input string?
        JE Error:
        CMP AL,' '       ; Separating spaces should be skipped.
        JE Next1:
        DEC ESI          ; Return a pointer to a valid character.
        CALL ASCIItoInteger ; Procedure from libcvt32.asm library.
        JC Error:
        MOV [Arg1],EAX   ; Store the first number.
        ADD ESI,ECX      ; Place ESI after the first number loaded.
Next2:  LODSB            ; Load the next character.
        CMP AL,0         ; End of input string?
        JE Error:
        CMP AL,' '       ; Separating spaces should be skipped.
        JE Next2:
        MOV [Arg2],AL    ; Store operator.
Next3:  LODSB            ; Load the next character.
        CMP AL,0
        JE Error:
        CMP AL,' '       ; Separating spaces must be skipped.
        JE Next3:
        DEC ESI          ; Return a pointer to a valid character.
        CALL ASCIItoInteger ; Procedure from libcvt32.asm library.
        JC Error:
        MOV [Arg3],EAX   ; Save the second number.
        MOV EAX,[Arg1]   ; Load the first argument into EAX and then look at the operator.
        CMP [Arg2],'+'
        JE Addition:
        CMP [Arg2],'-'
        JE Subtraction:
        CMP [Arg2],'*'
        JE Multiplication:
        CMP [Arg2],'/'
        JE Division:
Error:  StdOutput Help   ; If an invalid operator was specified, a help is printed
        TerminateProgram ; and the program terminates.
Addition:
        ADD EAX,[Arg3]
Print:  MOV ESI,Result
        CALL IntegerToASCII
        StdOutput ESI
        JMP Start:
Subtraction:
        SUB EAX,[Arg3]
        JMP Print:
Multiplication:
        MUL [Arg3]
        JMP Print:
Division:
        SUB EDX,EDX
        DIV [Arg3]
        JMP Print:
[.data]                  ; This is where the program code section ends and the data begins.
Arg1    DD 0
Arg2    DD 0
Arg3    DD 0
Result  DB 10 * B
        DB 10,0
Buffer  DB 80 * B
Prompt: DB 13,10,"%^PROGRAM: Enter two integer numbers separated with arithmetic operator + - * /.",13, 10, 0
Help:   DB "Calculate number1 operator number2.", 13, 10
        DB "Example: 3 + 4", 13, 10, 0
        ENDPROGRAM

So we have a working calc.asm calculator for 32-bit Linux. Porting to 32-bit Windows is easy: instead of CalcL32 PROGRAM FORMAT=ELFX, WIDTH=32, ENTRY=Start: use CalcW32 PROGRAM FORMAT=PE, WIDTH=32, ENTRY=Start: and instead of INCLUDE libcvt32.asm, linapi.htm use INCLUDE libcvt32.asm, winapi.htm. That's it: the other instructions do not change, and the macros StdOutput, StdInput and TerminateProgram change only their body, not their name.

Porting the program to 64 bits is a bit more complicated. While 32-bit registers were used to handle both data and addresses in 32-bit mode, in 64-bit mode the address width increases to 64 bits, while the default data width remains 32 bits (it can also be increased to 64 bits by using registers whose names start with R). Also remember that writing to a 32-bit register zeroes the upper 32 bits of the corresponding 64-bit register. Thus SUB EAX,EAX clears not only the EAX register but also the upper (more significant) half of the RAX register.
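The zero-extension rule can be modelled outside the CPU. The following Python sketch is only an illustration of the rule (the helper names write32, write16 and write8_low are made up for this example): a 32-bit write clears the upper half of the 64-bit register, whereas 16-bit and 8-bit writes leave the remaining bits untouched.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
MASK32 = 0xFFFFFFFF

def write32(reg64: int, value: int) -> int:
    """Model a write to a 32-bit register such as EAX: bits 32..63 of RAX become zero."""
    return value & MASK32

def write16(reg64: int, value: int) -> int:
    """Model a write to a 16-bit register such as AX: bits 16..63 of RAX are preserved."""
    return (reg64 & ~0xFFFF & MASK64) | (value & 0xFFFF)

def write8_low(reg64: int, value: int) -> int:
    """Model a write to a low 8-bit register such as AL: bits 8..63 of RAX are preserved."""
    return (reg64 & ~0xFF & MASK64) | (value & 0xFF)

rax = 0x1122334455667788
print(hex(write32(rax, 0)))     # like SUB EAX,EAX: the whole RAX is cleared
print(hex(write16(rax, 0)))     # like SUB AX,AX: only the lowest 16 bits are cleared
print(hex(write8_low(rax, 0)))  # like SUB AL,AL: only the lowest 8 bits are cleared
```

This is why the 64-bit code can keep using instructions such as SUB ECX,ECX and then safely address memory with [RSI+RCX].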

Our library for converting integers to ASCII and back will look slightly different after porting to 64 bits. Let's call it libcvt64.asm:

; Description: ASCIItoInteger procedure converts a string of ASCII digits to a binary integer.
; Input:  RSI contains a pointer to the first character of the number being converted.
;         The procedure reads characters as long as they contain digits or until an overflow occurs.
; Output: Carry Flag is zero, EAX contains the converted number in binary form.
;         RCX contains the number of ASCII characters processed from the input string.
; Error:  Carry Flag is set if the input number would exceed 32 bits. EAX is undefined.
;         RCX contains the number of processed ASCII characters from the input string.
; Clobbers: RBX,RDX,RDI.
ASCIItoInteger PROC    ; Define the procedure.
       SUB EAX,EAX     ; Result accumulator.
       SUB EBX,EBX     ; Number conversion register.
       SUB ECX,ECX     ; Length of the input number.
       MOV EDI,10      ; Multiplier.
.Next: MOV BL,[RSI+RCX] ; Retrieve the RCX-th character from the input string RSI.
       SUB BL,'0'      ; Convert to binary if it was a digit.
       JB .End:        ; Jump if it was not a digit.
       CMP BL,9        ; Check for an upper limit of a digit.
       JA .End:        ; Jump if it was not a digit.
       INC ECX         ; Add a character to the register specifying the length of the input number.
       MUL EDI         ; Multiply the accumulator by 10.
       JC .Over:       ; Jump if 32 bits overflowed.
       ADD EAX,EBX     ; Add the last digit to the accumulator.
       JC .Over:       ; Jump if adding the digit overflowed 32 bits.
       JMP .Next:      ; Process the next character.
.End:  CLC             ; Return with the CF flag cleared. Binary number is in EAX.
.Over: RET             ; Return with CF flag set. Result not defined.
ENDPROC ASCIItoInteger ; End of procedure.

; Description: IntegerToASCII procedure converts a binary number from the EAX register to a string of ASCII digits.
; Input:  EAX contains an input unsigned binary number in the range 0..4294967295.
;         RSI contains a pointer to an output field of size 10 bytes.
; Output: RSI is incremented by 0..9 bytes and points to the first valid digit of the result.
; Error:  No error can occur.
; Clobbers: RAX,RCX,RDX,RDI.
IntegerToASCII PROC    ; Define the procedure.
       MOV ECX,10      ; Number of digits in the result.
       MOV EDI,10      ; Divisor.
.Next1: SUB EDX,EDX    ; The high 32 bits of the input number must be reset before dividing EDX:EAX.
       DIV EDI         ; EAX is now the quotient, EDX the remainder 0..9.
       ADD DL,'0'      ; Convert DL to the digit '0'...'9'.
       DEC ECX         ; Store numbers backwards.
       JS .End1:       ; Jump at the end of all ten digits.
       MOV [RSI+RCX],DL ; Store a digit at the end of the result.
       JMP .Next1:     ; Process the next digit.
.End1:                 ; The result field now contains "0000000123" for input number 123.
                       ; Position RSI to the first non-zero digit of the result.
       LEA RDI,[RSI+10] ; Limit beyond which RSI will no longer be read.
.Next2: CMP RSI,RDI    ; End of output number?
       JAE .End2:      ; Jump if yes, RSI will point to the last '0'.
       LODSB           ; Read result digit from RSI, increment RSI.
       CMP AL,'0'      ; Is this the beginning of valid digits?
       JE .Next2:      ; Jump if no.
.End2: DEC RSI         ; Return RSI to the first valid digit.
       RET             ; Return from procedure.
ENDPROC IntegerToASCII
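When testing the library it can help to have a reference to compare against. The following Python sketch models the contract documented in the header comments of both procedures; it is not a line-by-line translation, and the function names and the returned tuple (value, length, carry) are assumptions made for this illustration only.

```python
MAX32 = 0xFFFFFFFF

def ascii_to_integer(text: str):
    """Model of ASCIItoInteger: returns (value, length, carry).

    value  ~ EAX (None stands for "undefined" when carry is set),
    length ~ RCX, the number of characters processed,
    carry  ~ Carry Flag, True if the number does not fit into 32 bits.
    """
    value = 0
    length = 0
    while length < len(text) and "0" <= text[length] <= "9":
        digit = ord(text[length]) - ord("0")
        length += 1                 # the digit is counted before the overflow check
        value = value * 10 + digit
        if value > MAX32:
            return None, length, True
    return value, length, False

def integer_to_ascii(value: int) -> str:
    """Model of IntegerToASCII: fill a 10-digit field, then skip leading zeros."""
    field = ["0"] * 10
    for position in range(9, -1, -1):          # digits are stored backwards
        value, remainder = divmod(value, 10)   # like DIV EDI with EDX cleared
        field[position] = chr(ord("0") + remainder)
    start = 0
    while start < 9 and field[start] == "0":   # always keep at least the last digit
        start += 1
    return "".join(field[start:])

print(ascii_to_integer("123 + 4"))
print(integer_to_ascii(123))
```

Feeding the same inputs to the assembled procedures and to this model is a cheap way to spot differences at the boundaries, such as the input 0 or numbers close to 4294967295.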

The calculator code for a 64-bit operating system will change only slightly, as we will remain limited to 32-bit numbers, and working with non-address registers will remain the same. Instead of the 32-bit libraries libcvt32.asm, linapi.htm, winapi.htm we will include the libraries libcvt64.asm, linabi.htm, winabi.htm. Instead of MOV ESI,Buffer it will be better to use LEA RSI,[Buffer]. Other instructions may remain the same as in the 32-bit variant:

       EUROASM CPU=x64
CalcL64 PROGRAM FORMAT=ELFX, WIDTH=64, ENTRY=Start:
  INCLUDE libcvt64.asm, linabi.htm ; Use the functions from these libraries.
[.text]                ; This is how EuroAssembler indicates that machine instructions will follow.
Start: StdOutput Prompt            ; Introduce the program, prompt for input.
       StdInput Buffer             ; Load the input into the Buffer variable.
       LEA RSI,[Buffer]
Next1: LODSB                       ; Load the next character.
       CMP AL,0                    ; End of input string?
       JE Error:
       CMP AL,' '                  ; Separating spaces should be skipped.
       JE Next1:
       DEC RSI                     ; Return a pointer to a valid character.
       CALL ASCIItoInteger         ; Procedure from libcvt64.asm library.
       JC Error:
       MOV [Arg1],EAX              ; Store the first number.
       ADD RSI,RCX                 ; Place RSI after the first number loaded.
Next2: LODSB                       ; Load the next character.
       CMP AL,0                    ; End of input string?
       JE Error:
       CMP AL,' '                  ; Separating spaces should be skipped.
       JE Next2:
       MOV [Arg2],AL               ; Store operator.
Next3: LODSB                       ; Load the next character.
       CMP AL,0
       JE Error:
       CMP AL,' '                  ; Separating spaces should be skipped.
       JE Next3:
       DEC RSI                     ; Return a pointer to a valid character.
       CALL ASCIItoInteger         ; Procedure from libcvt64.asm library.
       JC Error:
       MOV [Arg3],EAX              ; Save the second number.
       MOV EAX,[Arg1]              ; Load the first argument into EAX and then look at the operator.
       CMP [Arg2],'+'
       JE Addition:
       CMP [Arg2],'-'
       JE Subtraction:
       CMP [Arg2],'*'
       JE Multiplication:
       CMP [Arg2],'/'
       JE Division:
Error: StdOutput Help              ; If an invalid operator was specified, help is printed
       TerminateProgram            ; and the program terminates.
Addition: ADD EAX,[Arg3]
Print: LEA RSI,[Result]
       CALL IntegerToASCII
       StdOutput RSI
       JMP Start:
Subtraction: SUB EAX,[Arg3]
       JMP Print:
Multiplication: MUL [Arg3]
       JMP Print:
Division: SUB EDX,EDX              ; Clear the high half of the dividend EDX:EAX, as in the 32-bit variant.
       DIV [Arg3]
       JMP Print:
[.data]                ; This is where the program code section ends and the data begins.
Arg1:   DD 0
Arg2:   DD 0
Arg3:   DD 0
Result: DB 10 * B
        DB 10,0
Buffer: DB 80 * B
Prompt: DB 13,10,"%^PROGRAM: Enter two integer numbers separated with arithmetic operator + - * /.",13, 10, 0
Help:   DB "Calculate number1 operator number2.", 13, 10
        DB "Example: 3 + 4", 13, 10, 0
ENDPROGRAM
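Because the calculator works with unsigned 32-bit numbers, some results may look surprising at first. The following Python sketch is a rough model of the four operations, assuming that EDX is zero before DIV (as the 32-bit listing arranges with SUB EDX,EDX); the function name calculate is invented for this illustration.

```python
MASK32 = 0xFFFFFFFF

def calculate(arg1: int, operator: str, arg3: int) -> int:
    """Model the value left in EAX by the calculator for unsigned 32-bit inputs."""
    if operator == "+":
        return (arg1 + arg3) & MASK32   # ADD wraps around at 2**32
    if operator == "-":
        return (arg1 - arg3) & MASK32   # SUB wraps around below zero
    if operator == "*":
        return (arg1 * arg3) & MASK32   # MUL leaves only the low 32 bits in EAX
    if operator == "/":
        return arg1 // arg3             # DIV; a zero divisor raises an exception here,
                                        # just as the CPU raises a divide error
    raise ValueError("invalid operator")  # the calculator prints Help instead

print(calculate(3, "+", 4))
print(calculate(3, "-", 4))
print(calculate(65536, "*", 65536))
```

The second call shows that 3 - 4 is displayed as 4294967295, and the third that a product which does not fit into 32 bits is silently truncated.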

Porting from 64-bit Linux to Windows is again easy: replace the linabi.htm library with winabi.htm, change the name and format of the program from CalcL64 PROGRAM FORMAT=ELFX to CalcW64 PROGRAM FORMAT=PE and that's it. You get a calculator for 64-bit Windows.

Debugging

Finding and removing program errors – debugging – is an essential part of a programmer's job, and this is doubly true in assembler. We'll demonstrate several methods for finding bugs.

Finding bugs with diagnostic printouts

In the previous examples we used the StdOutput macro to print the specified arguments for checking. This is a pretty good way of making sure that the processor passes through the parts of our program where we expect it to. First of all, let's put a StdOutput printout right at the beginning, where the program ENTRY= points. This confirms that our program has loaded the correct library and that writing to standard output works. The text printed by StdOutput doesn't need to be sophisticated, it just needs to tell us that the program has reached a certain checkpoint. For example StdOutput ="I am at line 123",Eol=Yes.

Finding bugs with macro Debug

Very often it would be useful to see the contents of a particular register or memory location in addition to knowing where we are. While we could convert the contents of any register to decimal or hexadecimal ASCII form, store it in some temporary memory variable and then dump it using StdOutput, this is an inconvenient solution. It's better to use a specialized macro that does the same for all registers and leaves no trace in the debugged program (except for increasing its length).

The maclib/debug.htm library is available for debugging and contains a single Debug macro, which is independent of the operating system and of the width of the program being debugged (16, 32 or 64 bits). The price of this OS independence is that the Debug macro does not print the register information itself; instead it sends a formatted dump string to a procedure with the default name DebugOutput, which is called as a callback. This procedure has to be included (temporarily) in the debugged program, which fortunately is not difficult. DebugOutput's job is to write the string addressed by the rSI register, rCX bytes long, to the standard output. We can use the well-known StdOutput macro for the output.

The output of the computer state after each use of the Debug macro looks something like this in a 32-bit program:

The macro first lists its position in the source code as "filename.ext"{1234}, i.e. the filename and the line number in curly braces, and then the contents of all GPRs in hexadecimal notation. Inserting the macro into the program does not affect it; all registers and flags are preserved.

Debug can also dump the contents of a single memory area whose address and size are specified by the macro parameters. For example, Debug ESI, Size=32 will print the 32 bytes of memory at the address in ESI, in addition to the contents of the GPRs. The Size= parameter is in the range 1..256; the default value is 16. On operating systems with memory protection, care must be taken that the specified memory actually exists (is allocated to the program), otherwise the program may crash.

Debugging with a debugger

A much more convenient approach to debugging is provided by specialized debugging applications, the debuggers.
For Linux we have the console application gdb and its graphical front ends nemiver and ddd.
For DOS there is Turbo Debugger, supplied by Borland along with Turbo Assembler or Turbo Pascal.
For 32-bit Windows there is OllyDbg.
For 64-bit Windows there is x64dbg.

The nice thing about these full-screen applications is that their authors have adopted the same basic controls from one another, using the function keys F4, F7, F8, etc. We will mostly use the basic CPU screen which, divided into four parts, displays machine instructions, register contents, memory contents and the stack. Stepping through the program (F7), possibly stepping over whole procedures (F8), allows us to see how the registers, program memory and stack have changed after each step.

The debugger shows the addresses and the disassembled machine code in the upper left quarter of the window. The addresses are those assigned by the linker, and we see them as hexadecimal numbers. It would be useful to see the labels used in the source program instead. Unfortunately, this mapping of numeric addresses to symbols does not happen automatically, even when a table of symbols with their addresses is present in the executable. If we want to know the addresses that EuroAssembler has assigned to the symbols, we can look at the listing. If we have left it enabled with the PROGRAM LISTGLOBALS=ON option, at the end of the listing calc.asm.lst we will see the virtual address (VA=) of each global symbol. If we need to know the addresses of other, non-global symbols, we have to give them global visibility. This could be done by adding explicit pseudoinstructions GLOBAL Division, Error, Multiplication etc., but it is even easier to add two colons to the end of each label name. After reassembling, we will see their global addresses in the listing:


EuroAssembler tutorial

the processor reads some information from memory or a device into a register, manipulates it, and then writes it somewhere.

The data type is defined in the assembler by the operations we will perform on the item.

The Carry Flag remains unchanged by executing INC or DEC.

The SCAS and CMPS instructions set flags, the other string instructions leave them unchanged.