EuroAssembler Manual


About €ASM ↓

Input/Output ↓

Structure of €ASM program ↓

Elements of source ↓

Instructions ↓

Program formats ↓

€ASM functions ↓


↑ About EuroAssembler

Product identification ↓

Short characteristic ↓

Notational typographic conventions ↓

Why Assembler ↓

Why Yet Another Assembler ↓

Why EuroAssembler ↓

Licence ↓

History

Download

Installation ↓


↑ Product Identification

Name of this software is EuroAssembler. Notice there is no space between Euro and Assembler.
The name is often shortened to €ASM.
In a 7-bit ASCII environment it may also be referred to as EUROASM, and in some internal identifiers it is just ea.

The Euro character is available on a Windows keyboard as Alt-0128 or as the HTML entity &euro;.

↑ Short Characteristic

Some features are rarely seen in other assemblers:

↑ Notational typographic conventions

This manual comprises a programmer's guide, examples, language reference and implementation remarks. Different styles are used to identify those elements.

Background color of the web page helps to distinguish between

   this manual and links, macroinstruction libraries, €ASM source files, test files, objects and samples.

Dashed hyperlinks refer to other paragraphs within the same page.

Underlined hyperlinks navigate to a different HTML page of this site.

Underlined hyperlinks with Link icon navigate to signpost page Links with external references.

Underlined hyperlinks with Exit icon navigate outside EuroAssembler website, you may want to open them in a new tab or window.

The contents of this manual are organized in chapters with a tree structure.

↑ Title

Up-arrow near the chapter title is a link which navigates from the Title one level higher.

Title ↓

Down-arrow following the title navigates from the Title downward to the actual text.
Statements and rules which are worth remembering are marked with a bulb icon.

Definitions of new terms are written in blue italics.

Implementation details, discussions and less important remarks are printed with smaller font.

File names are emphasized in quotes.

Characters used in text flow have white background.

Short piece of source code is displayed in monospace black on yellow in text flow.

; Longer examples of source code in this manual are presented in a box.
; They may have more lines.
; Negative examples are overstriked.
 
Examples of code in macrolibraries and €ASM sources are ignored by EuroAssembler, because their physical lines begin with HTML tag marker <.
|0000:0000| ; €ASM printed output (listing) is displayed black on white paper.
|0000:0000| ; It contains assembled machine code, copy of source instructions
|0000:0000| ; and error messages.

↑ Why Assembler

Assembly programming language (ASM) gives programmers the maximum possible control over the emitted machine code. It lets them prescribe any single instruction for the Central Processing Unit (CPU) and, on the other hand, create their own macroinstructions, functions and procedures which can do the same work as sophisticated statements in higher-level languages (HLL), so that programs can be developed in ASM almost as rapidly as in HLL.

The disadvantage of assemblers is the lack of standardized libraries, which unify programming in HLL such as C or Java. Many ASM programmers build their own, which makes their sources non-portable unless the libraries are included with the source. On the other hand, writing a library of your own functions is the best way to remember all the function and parameter names, and to learn a lot about the computer and operating system (OS).
€ASM is shipped with several macrolibraries for the quick start.

The advantage of mastering assembly shows when we are confronted with a third-party program without its source code, or when some bad program throws an exception and exits. DrWatson, a debugger or a disassembler can only show the foreign code converted to assembly instructions. People who have never met ASM will hardly know how to interpret the disassembled code, while an ASM programmer will feel like a fish in its natural environment.

Assembler is a universal construction kit. You may program whatever you can imagine, but first you have to prepare the building tools.

Phases of program creation
Phase          Used tool
design-time    imagination
write-time     text editor
assembly-time  assembler
combine-time   linker
link-time      linker
load-time      operating system loader
bind-time      operating system loader
run-time       processor

↑ Why Yet Another Assembler

Dissatisfaction with available tools is one of the reasons why some programmers want to invent their own language.

And last but not least, creating an assembler is a very interesting challenge. An incomplete list of assemblers and other tools that I had the pleasure to come into contact with is presented at the links [Assemblers] and [UsefulTools].

The first assembler I met when I started to flirt with assembly language in the early '80s was IBM's FDOS for S360 mainframe computers [HLASM]. It was a very sophisticated product with advanced features such as sections, keyword operands and literals, and with a macrolanguage which was able to manipulate not only the generated machine statements but also its own macro variables and their names.

I missed many of those features in assemblers for Intel architecture. Some of them brought new ideas but none seemed ideal for me. [NASM] ver.0.99 was quite good, in fact the first bootstrap version of €ASM was written in it, but I was irritated when it wasn't able to automatically select SHORT or NEAR distance and had other design flaws, such as not expanding preprocessing variables in quoted strings.

I always wondered why constant EQU symbols had to be declared before their first use. Why can't I declare a macro inside a macro? How should I handle the situation when file A includes files B and C, and file C also includes file B, duplicating its definitions?

I don't like language which is cluttered up with free space. In HLASM putting a space in the operand list signalised that everything up to the end of a punched card should be ignored. €ASM isn't that strict in this horror vacui, in fact white spaces may be put anywhere between language elements to improve readability. However, spaces are almost never required by syntax.

€ASM does not use English word modifiers such as SHORT, NEAR, DWORD PTR, NOSPLIT which are identified by their value only. Instead, it prefers the Name=Value paradigm with keyword instruction modifiers such as DATA=QWORD,IMM=BYTE,MASK=K5,ZEROING=ON, which remove ambiguity and replace the ugly decorators proposed in Intel documentation.

↑ Why EuroAssembler

  1. Euro because it comes from Czechia, the heart of Europe.
  2. Both Europe and €ASM are multilingual: €ASM supports national characters in identifiers and strings.
  3. € is one of the few characters left unoccupied among many *ASM assemblers :-)

↑ Licence

Permission to use EuroAssembler is granted to everybody who obeys this Licence.
There are no restrictions on purpose of applications created with this tool. It may be used in private, educational or commercial environment freely.

EuroAssembler is provided free of charge as-is, without any warranty guaranteed by its author.

This software may be redistributed in unmodified zipped form, as downloaded from EuroAssembler.eu. No fee may be requested for the right to use this software.

You may disseminate euroasm.zip on other websites, repositories, FTP archives, compact disks and similar media. Please be sure to always distribute the latest available €ASM version.

Source code of EuroAssembler was written by Pavel Šrubař, AKA vitsoft, and it is copyrighted as so.
Macrolibraries and sample projects are released as public domain and they may be modified freely.

You may modify €ASM source code for the sole purpose to fix a bug or to enhance it with new function, but you may not distribute such modified software. It may only be used by you on the same computer where it was edited, reassembled and linked.

EuroAssembler is not open source. I don't want €ASM development to fork into a bazaar of incompatible versions, where each branch provides a different enhancement. Please propose your modifications to the author or to the €ASM forum instead, so that they might be incorporated in future releases of EuroAssembler.

↑ Installation

Distribution file euroasm.zip contains folders and files as listed on the Sitemap page. Modification time of all files is equally set to the nominal release time. All file names are in lower case (Linux convention) and in 8.3 size (DOS convention), so any old DOS utility can be used for unpacking.
You may need to run the console as administrator for the installation on secure version of MS Windows.

Choose and create EuroAssembler home directory, for instance C:\euroasm, change to it and unzip the downloaded euroasm.zip. Move or copy the main executable euroasm.exe to some folder from system %PATH%, so it might be launched as euroasm from anywhere. When you run it without parameters for the first time, it will create the global configuration euroasm.ini, which you should tailor now with a plain-text editor.

You may want to replace relative IncludePath= and LinkPath= in [EUROASM] section with an absolute path identifying the €ASM home directory.
In [PROGRAM] section you can specify your preferred target format, for instance Format=PE, Subsystem=CON and Width=32. You could also replace IconFile="euroasm.ico" and copy your preferred personal icon to objlib subfolder.

For the (not-recommended) bare-bone minimal installation you are now done and you could erase the whole home directory now. The executable euroasm.exe itself does not need any other supporting files, environment or registry modification.

If you prefer to read this documentation in another language, rename the default English version of this manual eadoc\index.htm to eadoc\man_ENG.htm and then rename the chosen available translation, e.g. eadoc\man_CZE.htm, to eadoc\index.htm.

For a development installation go to the home directory and unzip the developer scripts from the subarchive generate.zip. You will also need a web server and PHP (version 5.3 or higher) installed on your localhost.

Most EuroAssembler files are in HTML format, so you may want to incorporate €ASM into your local web server if you run one on your localhost computer.

In my Apache installation I added the following paragraph to the httpd.conf or apache2.conf:

<VirtualHost *:80>
    DocumentRoot C:/euroasm/
    ServerName euroasm.localhost
</VirtualHost>

I appended the statement 127.0.0.1 euroasm.localhost into the file %SystemRoot%/SYSTEM32/drivers/etc/hosts. Now I can write euroasm.localhost into address line of my internet browser and explore the €ASM documentation and other files locally.


↑ Input/Output

Standard streams ↓

Other I/O ↓

Messages ↓

Input/Output files ↓


Computer programs exchange information with users through various channels: standard streams, disk files, devices, command-line parameters, environment variables.

↑ Standard streams

The basic form of communication between programs and a human user is character streams, which are by default directed to the console terminal from which the program was launched. They may also be redirected to a disk file or device driver with command-line operators >, >>, <, |.

Standard input is not used in €ASM.

Standard output prints warnings, errors and informative messages produced by €ASM.

Standard error output is not used in €ASM.

↑ Other I/O

Command-line parameters are not used. €ASM assumes that everything on the command line is the main source file name(s) to assemble. All options controlling the assembly & link process are defined in configuration files euroasm.ini or directly in the source file itself.

In fact there are semi-undocumented EUROASM options which are recognized on the command line; however, the preferred place for EUROASM options is the configuration file or the source file. Command-line options are employed in test examples to suppress some volatile informative messages, e.g. I0010 EuroAssembler version 20151231 started. or I0980 Peak memory allocation 840 KB. 1234 statements assembled in 3 s., which cannot be guaranteed to be identical in various environments and would therefore trigger false test-result differences.

Environment variables are not used in €ASM.

Environment variables may be incorporated into the source at assembly-time using pseudoinstruction %SETE. Of course it is also possible to read environment at run-time with the corresponding API call, such as GetEnvironmentVariable().

€ASM does not use any other devices (I/O ports, printer, sound card, graphic adapter etc.) at assembly-time.

↑ Messages

Important information detected by EuroAssembler during its activity is published in the form of short text messages. They are written on standard output (console window) and to the listing file.

Message severity ↓

Messages in standard output ↓

Messages in listing ↓

Each message is identified by a combination of capital letter followed with four decimal digits. The complete text of messages is defined in source file msg.htm.

The letter prefix and the first digit (0..9) declare message severity. The final errorlevel value, which euroasm.exe terminates with, is equal to the highest severity encountered during the assembly session.

Message severity
Kind of message          Prefix  Identifier range  Severity  Search marker
Informative              I       I0000..I0999      0         |#
Debugging                D       D1000..D1999      1         |#
Warning                  W       W2000..W3999      2..3      |##
Nonsuppressible warning  W       W4000..W4999      4         |##
User-defined error       U       U5000..U5999      5         |###
Error                    E       E6000..E8999      6..8      |###
Fatal                    F       F9000..F9999      9         |###

EuroAssembler is verbose by default, but it can be totally silenced when it is launched with the parameter NOWARN=0000..0999, provided that no error occurred in the source.

Warnings usually do not prevent the compiled target from executing; they are meant as a friendly reminder that the programmer might have forgotten something or made a typo.

Messages with severity level 5..8 indicate that some statements were not compiled due to error. Although the target file may be valid, it will probably not work as intended.

Fatal errors indicate failure of interaction with the operating system, exhaustion of resources, file errors or internal €ASM errors. The target and listing files might not have been written at all.

Warning messages in the range W2000..W3999 can be suppressed with EUROASM option NOWARN=, but this ostrich-like policy is not a good idea. It's always better to remedy the cause of message. If you intend to publish your code, it should always assemble with errorlevel 0.
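
The severity rule can be illustrated with a minimal Python sketch. It is not part of €ASM; the function names severity and errorlevel are made up for this illustration, and W2000 is used only as a stand-in for some warning identifier.

```python
import re

def severity(message_id: str) -> int:
    """Return the severity (0..9) encoded in an identifier such as 'E6601'."""
    match = re.fullmatch(r"[A-Z](\d)\d{3}", message_id)
    if match is None:
        raise ValueError(f"not a message identifier: {message_id!r}")
    # Per the severity table, the first of the four digits is the severity.
    return int(match.group(1))

def errorlevel(message_ids) -> int:
    """Highest severity among the given identifiers, 0 if there are none."""
    return max((severity(m) for m in message_ids), default=0)

# I0010, I0980 and E6601 are quoted in this manual; W2000 is just the lower
# bound of the warning range, used here as a stand-in for some warning.
session = ["I0010", "W2000", "E6601", "I0980"]
print(errorlevel(session))  # 6
```

A session which produced only informative messages would end with errorlevel 0 under this rule.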

↑ Messages on standard output

Typical message consists of the identifier followed by the actual tailored msg text. When it is printed on standard output, the text is accompanied with position indicator in the form of quoted file name followed with physical line number in curly brackets, for instance

E6601 Symbol "UnknownSym" mentioned at "t1646.htm"{71} was not found. "t1646.htm"{71}
▲▲▲▲▲                                                                 ▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲
Identifier                                                         position indicator

Usually there is just one position indicator per message, but when the error was discovered in a macro expansion, another indicator is added which identifies the line in the macro library. In the case of a macro expanded within another macro, position indicators will be further chained.
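
A script which post-processes console output might split such a line as sketched below in Python. This is only an illustration: the function name is hypothetical, and it assumes that chained position indicators simply follow one another at the end of the line, separated by white space.

```python
import re

# A position indicator at the very end of the line: "file"{line}
TRAILING_POSITION = re.compile(r'\s*"([^"]+)"\{(\d+)\}$')

def parse_console_message(line: str):
    """Split a console message into (identifier, text, positions)."""
    identifier, _, text = line.strip().partition(" ")
    positions = []
    while True:
        match = TRAILING_POSITION.search(text)
        if match is None:
            break
        # Indicators are peeled from the right, so prepend to keep their order.
        positions.insert(0, (match.group(1), int(match.group(2))))
        text = text[:match.start()]
    return identifier, text, positions

line = 'E6601 Symbol "UnknownSym" mentioned at "t1646.htm"{71} was not found. "t1646.htm"{71}'
print(parse_console_message(line))
```

The quoted file name inside the message text is left alone, because only indicators at the very end of the line are treated as positions.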

↑ Messages in listing

Messages printed to the listing file have a slightly different format. The position indicator is omitted, because they are inserted just below the source line which triggered the error:

|002B:             | MOV SI,UnknownSym: ; E6601 expected.
|### E6601 Symbol "UnknownSym" mentioned at "t1646.htm"{71} was not found.
▲▲▲▲ marker

Message text is prefixed with search marker which helps to find messages in listing.

Use internal function Find/FindNext (Ctrl-F) of the editor or viewer used to investigate the listing file.
€ASM syntax doesn't use multiple pound characters ##, so the search marker should be unique and helps to jump from one error|warning to the next.

Debugging messages D1??? produced by the pseudoinstruction %DISPLAY are published even when they are placed in false %IF branches or in blocks commented-out by %COMMENT..%ENDCOMMENT.

Listing is created only during the final assembly pass. Informative messages are not printed to listing at all, except for informative linker messages I056?.


↑ Input/Output files

Configuration file ↓

Source file ↓

Object file ↓

Listing file ↓

File path ↓

There are two kinds of input files which €ASM reads: configuration and source.

There are two kinds of output files which €ASM writes: object and listing.

If the output file already exists, €ASM will overwrite it without warning.

Configuration file

Configuration file with fixed name euroasm.ini specifies default options for assembler. €ASM consults two configuration files with identical name and structure:

Global configuration file is located in the same directory as the main executable (euroasm.exe) and it is processed once after €ASM has started. If the file does not exist, €ASM tries to create it with the factory-default contents.

Local configuration file is searched for in the same directory as the actual source file. If more than one source is specified on the command-line, local configuration is read each time when the actual source gets processed.
Local euroasm.ini is not automatically created by €ASM; you should clone the global file manually and, optionally, erase unchanged options from the local configuration for better performance.

Initial content of configuration file, which is built-in in euroasm.exe as factory-defaults, is defined in objlib/euroasm.ini. There are two sections in the file: [EUROASM] and [PROGRAM].

The former specifies parameters for €ASM itself, such as CPU generation, what information should go to the listing file, which warnings should be suppressed etc. Parameters from the [EUROASM] section of the configuration file can be redefined later in the source with the EUROASM pseudoinstruction, where you will find a detailed explanation of all parameters.

[PROGRAM] section of configuration file specifies default parameters of program which is to be created by €ASM, for instance the memory model, format and name of the object file etc. These parameters can be modified with PROGRAM pseudoinstruction.

The order of configuration parameters is not important. Names of parameters are case insensitive. Parameters with boolean value accept any of predefined enumerated constants ON, YES, TRUE, ENABLE, ENABLED as true and OFF, NO, FALSE, DISABLE, DISABLED as false. They also accept numeric expressions which evaluate as boolean.
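
The enumerated boolean constants can be modelled with a minimal Python sketch. It rests on two assumptions which the text does not confirm: that the constants are matched case-insensitively, and that a numeric value counts as true when it is nonzero. Real €ASM accepts full numeric expressions; this sketch handles plain decimal integers only.

```python
TRUE_WORDS = {"ON", "YES", "TRUE", "ENABLE", "ENABLED"}
FALSE_WORDS = {"OFF", "NO", "FALSE", "DISABLE", "DISABLED"}

def parse_boolean(value: str) -> bool:
    """Interpret the value of a boolean configuration parameter."""
    word = value.strip().upper()  # assumption: constants are case insensitive
    if word in TRUE_WORDS:
        return True
    if word in FALSE_WORDS:
        return False
    # Assumption: a plain decimal integer, where nonzero means true.
    return int(word) != 0

print(parse_boolean("YES"), parse_boolean("Disabled"), parse_boolean("0"))
```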

When you send somebody your source program written in EuroAssembler, you don't have to specify which command-line parameters were used to compile and link it, as they can be declared in the source itself. A typical €ASM source program begins with a configuration pseudoinstruction, such as EUROASM AUTOALIGN=YES,CPU=PENTIUM, so it is easy to guess in which assembler the program is written.

EuroAssembler options and directives can be specified in configuration files and in the source files (by pseudoinstruction EUROASM). Order of their processing:

  1. When €ASM starts, its options are already defined with built-in factory defaults.
  2. €ASM looks at the command line, and if some EUROASM keyword options were provided there, they overwrite the options currently in effect.
  3. €ASM looks for the global configuration file and reapplies its options.
  4. Command-line options are reapplied again.
  5. €ASM looks for source filename(s) on the command line, and if the local configuration file exists in the same directory, it is processed and applied to the configuration currently in effect.
  6. Source file is now assembled. Each pseudoinstruction EUROASM found in the source overwrites current options.
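
The processing order listed above can be pictured as successive dictionary updates, where a later layer overwrites an earlier one. The Python sketch below is only a model of that order, not €ASM's implementation; the function name and all option values are hypothetical.

```python
def resolve_options(factory, command_line, global_ini, local_ini, source_statements):
    """Merge option layers in the order described by the numbered list."""
    options = dict(factory)               # 1. built-in factory defaults
    options.update(command_line)          # 2. command-line keyword options
    options.update(global_ini)            # 3. global euroasm.ini
    options.update(command_line)          # 4. command-line options reapplied
    options.update(local_ini)             # 5. local euroasm.ini beside the source
    for statement in source_statements:   # 6. each EUROASM statement, in source order
        options.update(statement)
    return options

# Hypothetical values, chosen only to show which layer wins.
options = resolve_options(
    factory={"DUMPWIDTH": 27, "LISTMACRO": "OFF", "LISTVAR": "OFF"},
    command_line={"DUMPWIDTH": 18},
    global_ini={"DUMPWIDTH": 30, "LISTMACRO": "ON"},
    local_ini={"LISTVAR": "ON"},
    source_statements=[{"DUMPWIDTH": 40}],
)
print(options)
```

In this model a command-line option beats the global configuration file, but it can still be overwritten by the local configuration file or by an EUROASM statement in the source.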

↑ Source file

Source file contains instructions to be assembled, usually it is a plain-text file or HTML file arranged for €ASM. The file name will be provided as command-line parameter of euroasm.exe. The source file may be identified with absolute path in filesystem, e.g. euroasm /user/home/euroasm/MyProject/MySource.asm, or with relative or omitted path, which will be related to the current shell path.

Structure and syntax of source text, which €ASM is able to assemble and link, is described further in this document.

↑ Object file

Main purpose of programming is to obtain the target file, which may be an object module or library linkable to other files, binary file for special purposes, or an executable file.

Format of the output file is specified by PROGRAM parameter FORMAT=. Their layouts were standardized by their creators many years ago. For more details about supported output formats see the chapter Program formats.

The name of the target file is determined by the label used in the pseudoinstruction PROGRAM, and it is appended with a default extension depending on the program format. It isn't necessarily derived from the source filename, as in many other assemblers. For instance, if the source has the statement Hello PROGRAM FORMAT=COM, its output file will be created in the current directory with the name Hello.com, no matter how the source file is named. The default target name can be changed by the PROGRAM parameter OUTFILE=. If the OUTFILE= name is specified with a relative or omitted path, the current shell directory is assumed.

↑ Listing file

Dump parameters ↓
Dump separators ↓
Dump decoration ↓
List parameters ↓

Listing is a plain text file with two columns where EuroAssembler logs its activity:

  1. result of assembly of each statement is hexadecimally displayed in the dump column.
  2. statements, which were processed, are copied to the source column.

Name of the listing is determined by the name of source file, which is appended with extension .lst, and it is created in source file directory.
Default listing filename and location may be changed with EUROASM parameter LISTFILE=.

↑ Dump parameters

Let's create the source file Hello.asm with these contents:

      EUROASM DUMP=ON,DUMPWIDTH=18,DUMPALL=YES
Hello PROGRAM FORMAT=COM,LISTLITERALS=ON, \
              LISTMAP=OFF,LISTGLOBALS=OFF
       MOV DX,=B"Hello, world!$"
       MOV AH,9
       INT 21h
       RET
      ENDPROGRAM Hello

Submitting the file to EuroAssembler with the command euroasm Hello.asm will create listing file Hello.asm.lst.

Width of the dump column in characters can be specified with the EUROASM option DUMPWIDTH=. Other €ASM options which control the dump column are the boolean DUMPALL= and DUMP=; DUMP=OFF suppresses the dump column completely.

|<-Dump column-->|<--Source column--------
<--DumpWidth=18-->
|                |      EUROASM DUMP=ON,DUMPWIDTH=18,DUMPALL=YES
|                |Hello PROGRAM FORMAT=COM,LISTLITERALS=ON, \
|                |              LISTMAP=OFF,LISTGLOBALS=OFF
|[COM]           ::::Section changed.
|0100:BA[0801]   |       MOV DX,=B"Hello, world!$"
|0103:B409       |       MOV AH,9
|0105:CD21       |       INT 21h
|0107:C3         |       RET
|[@LT1]          ====ListLiterals in section [@LT1].
|0108:48656C6C6F =B"Hello, world!$"
|010D:2C20776F72 ----Dumping all. (because of DUMPALL=YES)
|0112:6C64212400 ----Dumping all.
|                |      ENDPROGRAM Hello
                 ▲ column separator

↑ Dump separators

The dump column on the left side always starts with machine comment indicator (pipe character |) and it is terminated with listing column separator, which determines the genesis of this line.

Listing column separators
Character        Function
| (pipe)         Termination of machine comment. Used in ordinary statements, which can be reused as €ASM source.
! (exclamation)  Copy of source line with expanded preprocessing %variables (when LISTVAR=ENABLED).
+ (plus)         Source line generated in %FOR,%WHILE,%REPEAT expansion (when LISTREPEAT=ENABLED).
+ (plus)         Source line generated in %MACRO expansion (when LISTMACRO=ENABLED).
: (colon)        Inserted listing line to display a changed [section].
. (fullstop)     Inserted listing line to display autoalignment stuff (when AUTOALIGN=ENABLED).
- (minus)        Inserted listing line to display the whole dump (when DUMPALL=ENABLED).
= (equal)        Inserted listing line to display data literals (when LISTLITERALS=ENABLED).
  (space)        Inserted envelope PROGRAM / ENDPROGRAM line.
* (asterisk)     Inserted listing line in INCLUDE* statement when filename wildcards are resolved.

When the column separator is not |, the whole listing line has the form of machine remark and is ignored if the listing is submitted as a program source.

↑ Dump decoration

Dump of emitting statements has hexadecimal address (offset in the current section), terminated with colon :. In 16bit section the offset is 16 bits wide (four hexadecimal digits), in 32bit and 64bit sections it is 32 bits. Then follow the emitted bytes. Data contents in the dump column is always in hexadecimal notation without explicit number modifier. If the chosen DUMPWIDTH= is too small for all emitted bytes to fit, they are either right-trimmed and replaced with tilde ~ (if DUMPALL=OFF), or additional lines with separator - are inserted to the listing (DUMPALL=ON).

Some other decorators are used in dumped bytes:

Dump column decoration
Decorator  Description
~          trimmed data indicator, used only when DUMPALL=OFF
..         byte of reserved data (instead of hexadecimal byte value when it's initialized)
[]         absolute relocation
()         relative relocation
{}         paragraph address relocation
<N         disp8*N compression used

Brackets, which may enclose the dumped word or dword, indicate that the address requires relocation at link-time. Value printed in the listing will differ from the offset viewed in linked code or in debugger at run-time.

Character < followed with one decimal digit (N) signalizes that the previous dumped byte is 8bit displacement which will be left-shifted by N bits at run-time to obtain the effective displacement (so called disp8*N compression). The digit 0..6 specifying scaling factor N is not emitted to the assembled code.

Brackets [ ] and { } indicate relocatable values.
|                            |  EUROASM DUMPWIDTH=30,CPU=X64,SIMD=AVX512,EVEX=ENABLED
|[CODE] ▼    ▼▼    ▼         |[CODE] SEGMENT WIDTH=16
|0000:EA[0500]{0000}         | JMPF Label ; Absolute far jump encodes immediate seg:offset.
|0005:CB                     |Label: RETF
|[CODE64]                    |[CODE64] SEGMENT WIDTH=64
|00000000:62F36D28234D02<504 | VSHUFF32X4 YMM1,YMM2,[RBP+40h],4
|00000008:C3            ▲▲   | RET
<5 is a nonemitted disp8*N decorator.
                      ▲▲ Byte displacement +02h will be bit-shifted 5 times to the left, so the effective displacement is in fact +40h.
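
The arithmetic behind the <N decorator can be checked with a short Python sketch. The function name is made up for this illustration; the sketch treats the dumped byte as a signed 8-bit value, which is how x86 encodes disp8.

```python
def effective_displacement(disp8: int, n: int) -> int:
    """Expand a compressed 8-bit displacement using the scaling digit N (0..6)."""
    if not 0 <= disp8 <= 0xFF:
        raise ValueError("disp8 must be a byte value 0..255")
    if not 0 <= n <= 6:
        raise ValueError("N must be in the range 0..6")
    # The dumped byte is a signed 8-bit displacement.
    signed = disp8 - 0x100 if disp8 >= 0x80 else disp8
    # Shifting left by N bits multiplies the displacement by 2**N.
    return signed << n

# The dump 02<5 from the listing above: +02h shifted left by 5 bits gives +40h.
print(hex(effective_displacement(0x02, 5)))  # 0x40
```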

Dump of non-emitting statements is empty or contains auxiliary information.

|[DATA]        |[DATA]                  ; Segment|section switch quotes its [name] in dump column.
|0000:         |; Empty or comment-only line just displays the offset in current section.
|0000:         |Label:                  ; Ditto.
|              |;; Line comment starting with double semicolon will suppress the offset in dump.
|[DATA]:0000   |Target EQU Label:       ; Address symbol definition is displayed as [segment]:offset.
|4358          |%Counter %SET CX        ; Assignment of preprocessing %variable dumps its contents in hexadecimal.
|TRUE          | %IF "%Counter" == "CX" ; Preprocessing construct displays the evaluated boolean condition.
|[]:0010       |   Bits EQU 16          ; Scalar symbol definition is displayed with empty segment.
|FALSE         | %ELSE                  ; Boolean condition concerns %IF, %ELSE, %WHILE, %UNTIL.
|              |   Bits EQU 32          ; Dump of statements in false conditional branches is empty.
|              | %ENDIF

↑ List parameters

Listing in the default configuration is more or less an exact copy of the source (except for the inserted dump column). Sometimes it is useful to check whether the high-level constructs worked as expected; this is controlled by the following boolean EUROASM options:
LISTINCLUDE= unrolls the contents of an included file, which is normally hidden from the main source.
LISTVAR= creates a copy of each statement which contains a preprocessing %variable, and replaces the %variable name with its expanded value in the copied line.
LISTMACRO= inserts statements expanded by the macroinstruction.
LISTREPEAT= inserts all iterations of repeating constructs %FOR..%ENDFOR, %WHILE..%ENDWHILE, %REPEAT..%ENDREPEAT. Repeated expansion is listed as commented-out by dump column separator +. In the default state (LISTREPEAT=DISABLED) only the first expansion is listed.

A trait of the EuroAssembler listing is that it remains usable as the source again in the following debugging session. Messages generated in the listing are ignored by the €ASM parser, so they need not be removed when we want to submit the listing file to reassembly (however, those messages will be generated again if the cause of the error was not fixed).

I wanted to sustain this philosophy regardless of list parameters. In the default state with LISTINCLUDE=OFF the INCLUDE statement is listed normally and the contents of the included file are hidden. With option LISTINCLUDE=ON it is reversed: the original INCLUDE statement is commented out by dump column separator * and the included lines are inserted to the listing and they become valid source statements. See also t2220.

With options LISTVAR, LISTMACRO, LISTREPEAT=ENABLED the original line is kept as is and the expanded lines are inserted below it, commented out by dump column separator ! (LISTVAR) or + (LISTMACRO, LISTREPEAT). See also t2230.

EUROASM option LIST=DISABLE will switch off generating of listing lines until enabled again, or until the end of source. Of course such listing will be no longer reusable as the source.

↑ File path

Disk files can be specified with their absolute path, i.e. with a path which begins at filesystem root, e.g. C:\ProgFiles\euroasm.exe D:\Project\source.asm. Such files are unequivocally defined.

File may be specified with relative path, e.g. euroasm ..\prowin32\skeleton.asm. Position of relatively specified file is always related to the current directory.

Files can also be specified without path, i.e. when their name contains no colon and no slash :, \, /. Position of such files is recapitulated in the table below:

Directory used when a file is specified without path
Direction   File                           Directory          See also
Executable  euroasm.exe                    Exe-directory      OS PATH
Input       Global euroasm.ini             Exe-directory      OS PATH
Output      Global euroasm.ini             Exe-directory      OS PATH
Input       Local euroasm.ini              Source directory
Input       Source file                    Current directory
Input       Included source file           Include directory  EUROASM INCLUDEPATH=
Output      Target object file             Current directory  PROGRAM OUTFILE=
Output      Listing file                   Source directory   EUROASM LISTFILE=
Input       Linked module file             Link directory     EUROASM LINKPATH=
Input       Linked stub file               Link directory     PROGRAM STUBFILE=
Input       Linked icon file               Link directory     PROGRAM ICONFILE=
Import      Dynamically imported function  OS-dependent       IMPORT LIB=

Current directory is the actual folder assigned to the shell process at the moment when the euroasm.exe was launched. It's never changed by €ASM.

Exe-directory is the folder where euroasm.exe was found and executed, usually it is one of the directories specified by environment variable PATH.

Source directory is the folder where the currently assembled source file lies.

Include directory is one of the directories specified by the option EUROASM INCLUDEPATH=.

Link directory is one of the directories specified by the option EUROASM LINKPATH=.
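
The rule which decides whether a file was specified without path (its name contains no colon and no slash) can be written as a one-line test. The Python function below is a hypothetical helper for illustration only, not part of €ASM.

```python
PATH_CHARACTERS = (":", "\\", "/")

def is_specified_without_path(name: str) -> bool:
    """True when the name contains no colon, backslash or slash."""
    return not any(ch in name for ch in PATH_CHARACTERS)

# File names taken from the examples in this chapter.
for name in ("Hello.asm",
             "..\\prowin32\\skeleton.asm",
             "D:\\Project\\source.asm",
             "/user/home/euroasm/MyProject/MySource.asm"):
    print(name, is_specified_without_path(name))
```

Only names for which the test returns True are looked up in the directories recapitulated in the table above.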


↑ Structure of €ASM program

Character structure ↓

Horizontal structure ↓

Vertical structure ↓

This chapter describes the format of source file which €ASM understands and which it is able to compile.


↑ Character structure

Character width ↓

Character encoding ↓

Character case ↓

Character classification ↓


↑ Character width

Source file is a sequence of 8-bit characters.

If the source file is written in an editor which uses WIDE character encoding (Unicode UTF-16), it should be saved as plain text in UTF-8 or in an 8-bit ANSI or OEM code page before submitting the file for assembly.

↑ Character encoding

A program written in €ASM may output texts in languages other than English. Therefore, a string which defines the output text may contain characters with byte values 128..255. The relation between these codes and the corresponding glyphs is called a code page.

MS Windows uses different code pages in console applications and in GUI applications, and it makes automatic conversions between them in some circumstances.

€ASM never changes the code page of the source.

Texts in your program which are intended for the console (using WinAPI WriteConsoleA() function or ConsoleWrite macro) should be written in OEM code page. You may want to use a DOS plain-text editor, such as EDIT.COM, for writing console programs. Text mode editors use console fonts which are in OEM code page, so the text is displayed correctly both in the editor at write-time and in the console of your program at run-time.

Text which is presented in GUI windows (using WinAPI TextOutA() function) should be written in ANSI code page, using windowed editor such as Notepad.exe.

Programmer may use 16-bit wide characters instead of 8-bit ANSI in the text strings. They are declared with pseudoinstruction DU (Define data in Unicode) instead of DB (Define data in bytes). Wide variant of WinAPI call must be used for visual representation of Unicode strings at run-time, e.g. TextOutW() instead of TextOutA(). However, the definition of Unicode characters in DU strings is still 8-bit. You should tell €ASM which code page was used for writing the DU statement in source file. This information is provided by EUROASM CODEPAGE= option. Codepage may change dynamically in the source, allowing mixing of different languages in one program.

Default is EUROASM CODEPAGE=UTF-8, where characters are encoded with variable length of one to four bytes. Thanks to clever [UTF8] design, all non-ASCII UTF-8 characters are encoded as bytes with value 128..255, which are treated as letters in €ASM, so any UTF-8 character can be used in identifiers as is.

Some editors insert the Byte Order Mark characters 0xEF, 0xBB, 0xBF at the start of the source file. EuroAssembler treats those three characters as a 3-byte unused label at the start of source, which usually does no harm.
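The statement that every byte of a non-ASCII UTF-8 character lies in the range 128..255 can be checked with a short sketch. Python is used here only as a neutral illustration language; the helper name utf8_bytes is hypothetical and is not part of €ASM.

```python
import codecs

def utf8_bytes(text):
    """Return the UTF-8 encoding of text as a list of byte values."""
    return list(text.encode("utf-8"))

# The Euro sign occupies three bytes, all of them in the range 128..255,
# so each of them belongs to the class which this manual treats as letters.
euro = utf8_bytes("€")
print([hex(b) for b in euro])        # ['0xe2', '0x82', '0xac']
print(all(b >= 128 for b in euro))   # True

# The Byte Order Mark which some editors prepend consists of three such bytes.
print(codecs.BOM_UTF8 == bytes([0xEF, 0xBB, 0xBF]))   # True
```

Because all three BOM bytes are in the upper half of the table, they form a valid identifier, which is consistent with the BOM being read as an unused label.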

↑ Character case

€ASM is case semi-sensitive assembler.

All identifiers created by you, the programmer, are case sensitive: labels, constants, user-defined %variables, structures, macro names. On the other hand, all built-in names are case insensitive. Case insensitivity applies to all enumerations: register names, machine instructions and prefixes, built-in data types, number modifiers, pseudoinstruction names and parameters, symbol attributes, system %^variables.

Case insensitive names are presented in UPPER CASE in this manual but they may be used in lower or mixed case as well.

↑ Character classification

Each byte (8 bits) in €ASM source is treated as a character. Many characters have special purpose in assembler syntax unless they are quoted inside double or single quotes. A character is unquoted if zero or even number of quotes appears between the start of line and the character itself.

EOL
End-of-line control character is Line Feed alias EOL (ASCII 10).
White spaces
All other control characters, Delete and Space are white spaces. White spaces are mainly used as separators which can improve readability but only seldom have some syntactic significance. Unquoted multiple white spaces are treated the same way as a single space.
Digits
Digits 0..9 create numbers and identifiers. Hexadecimal numbers may also contain hexadecimal digits A..F, a..f.
Letters
Letters in €ASM are a..z, A..Z, underscore _, commercial at @, dollar-sign $, grave accent `, question mark ? and all characters from the upper half of ASCII table (128..255).
Some extra-letters are employed in €ASM for special purposes, too:
Underscore _ is used in identifiers and numbers as a word separator instead of space.
Commercial at @ indicates literal section name.
Dollar $ alone used as an identifier specifies a dynamic symbol representing current offset in a section.
Grave ` is used as a prefix when some filename not starting with a letter should represent a valid identifier.
Punctuation
All other characters have special semantic meaning – operators, delimiters, modifiers etc – unless they are enclosed in a pair of single ' or double " quotes. Punctuation characters except for percent sign % and EOL are treated as ordinary letters when they are placed in quoted string.
Character classification table
ASCII     glyph  name                         function in €ASM
0..9             controls                     white space
10               line feed                    end of line
11..31           controls                     white space
32               space                        white space
33        !      exclamation mark             logical operator
34        "      double quote                 string delimiter
35        #      number sign                  modifier
36        $      dollar sign                  letter
37        %      percent sign                 symbolic variable prefix
38        &      ampersand                    logical operator
39        '      apostrophe (single quote)    string delimiter
40        (      left parenthesis             priority parenthesis
41        )      right parenthesis            priority parenthesis
42        *      asterisk                     arithmetic and special operator
43        +      plus sign                    arithmetic operator
44        ,      comma                        operand separator
45        -      minus sign                   arithmetic operator
46        .      fullstop                     member separator
47        /      slash (solidus)              arithmetic operator
48..57    0..9   digits                       digit
58        :      colon                        field separator
59        ;      semicolon                    comment separator
60        <      less-than sign               logical operator, comment separator
61        =      equals sign                  logical operator, key separator, literal indicator
62        >      greater-than sign            logical operator
63        ?      question mark                letter
64        @      commercial at                letter
65..90    A..Z   uppercase letters            letter
91        [      left square bracket          content braces, substring operator
92        \      backslash (reverse solidus)  arithmetic operator, line continuation indicator
93        ]      right square bracket         content braces, substring operator
94        ^      caret (circumflex)           logical operator
95        _      underscore (low line)        letter, digit separator
96        `      grave accent                 letter
97..122   a..z   lowercase letters            letter
123       {      left curly bracket           sublist operator
124       |      vertical bar (pipe)          logical operator, comment separator
125       }      right curly bracket          sublist operator
126       ~      tilde                        logical operator, shortcut indicator
127              delete                       white space
128..255         NonASCII characters          letter
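
The classification above can be modelled with a small lookup function. This is only a sketch in Python, assuming the five classes named in this chapter; the function name classify and the class labels it returns are hypothetical, and the real €ASM parser may be organized differently.

```python
# Extra characters which this chapter counts as letters: _ @ $ ` ?
LETTER_EXTRAS = set(b"_@$`?")

def classify(byte):
    """Return the character class of one byte value (0..255)."""
    if byte == 10:
        return "end of line"                 # Line Feed is the only EOL character
    if byte <= 32 or byte == 127:
        return "white space"                 # other controls, Space and Delete
    if 48 <= byte <= 57:
        return "digit"
    if (65 <= byte <= 90 or 97 <= byte <= 122
            or byte in LETTER_EXTRAS or byte >= 128):
        return "letter"
    return "punctuation"                     # everything else has a special meaning

for sample in b"A?;\t":
    print(sample, classify(sample))
```

The sketch ignores quoting: inside a quoted string, punctuation other than the percent sign and EOL would be treated as ordinary text.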

↑ Horizontal structure

Physical line ↓

Statement ↓

Machine remark field ↓

Label field ↓

Prefix field ↓

Operation field ↓

Operand field ↓

Line remark field ↓

Line continuation ↓

Assembler source is treated as a text consisting of lines which are processed from left to right, from top to bottom.


↑ Physical line

Source file consists of physical lines. Physical line is a sequence of characters terminated with line feed (ASCII 10). The line feed (EOL) character is part of the physical line, too.

EOL may be omitted in the last physical line of source file.

↑ Statement

Statement is an order for €ASM to perform some action at assembly-time, usually to emit some code to the object file or to change its internal state. A typical statement is identical with a physical line, but long statements may span several lines when line continuation is used.

Statement consists of several fields which are recognized by their position in the line, by the separator or by their contents. All fields are optional, any of them may be omitted. However, no operand can be used when the operation field is omitted.

Fields in the statement
Order  Field name      Termination
1.     Machine remark  | or EOL
2.     Label           : or white space
3.     Prefix          : or white space
4.     Operation       white space
5.     Operand         ,
6.     Line comment    EOL

↑ Machine remark field

Machine remark begins with vertical bar |, if it is the first non-white character on the physical line. It is terminated with the second occurrence of the same vertical bar or with the end of physical line.

The contents of a machine remark is usually a hexadecimal address followed by the machine code emitted by the statement in question. As the field name indicates, this information is generated by computer into the €ASM listing file; the programmer should never need to write a machine remark manually. Machine remarks are ignored in assembler source, thus any valid €ASM listing file may be reused as the source file without change.

↑ Label field

Label field can accommodate any of these entities:

  1. Structure or symbol name or block identifier, for example My1stStructure, My1stLabel:, Outer
  2. Name of a segment | section | group, for example [.data]
  3. Name of symbolic %variable which is being set, for example %Count
  4. Colon itself : explicitly telling €ASM that empty label is used, so the following field must be a prefix or operation.

In the first case the symbolic name may begin with point ., making the label local. Symbol in the label field may be optionally terminated with one or more colons : immediately following the identifier. The white space between label field and the next field may be omitted when the colon is used.

↑ Prefix field

Machine prefix is an order for CPU to change its internal state at run-time. It is similar to machine instruction code but it only modifies the following instruction at run-time. Each prefix assembles to 1 byte machine opcode.

Prefix table
Name      Group  Opcode
LOCK      1      0xF0
REP       1      0xF3
REPE      1      0xF3
REPZ      1      0xF3
REPNE     1      0xF2
REPNZ     1      0xF2
XACQUIRE  1      0xF2
XRELEASE  1      0xF3
SEGCS     2      0x2E
SEGSS     2      0x36
SEGDS     2      0x3E
SEGES     2      0x26
SEGFS     2      0x64
SEGGS     2      0x65
SELDOM    2      0x2E
OFTEN     2      0x3E
OTOGGLE   3      0x66
ATOGGLE   4      0x67
The last four mnemonic names are not known in other assemblers.
SELDOM and OFTEN are used in front of conditional jump instruction as hints for newer CPU to help with prediction of the jump target.
OTOGGLE and ATOGGLE switch between 16- and 32-bit width of operand and address portion of machine code. They are normally generated by the assembler internally whenever needed, without explicit request.

Up to four prefixes can be defined in one statement but not more than one prefix from the same group.
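
The group rule can be illustrated with a short Python sketch built from the prefix table. The function name encode_prefixes is hypothetical, and emitting the bytes in the order in which the names are written is an assumption of this sketch, not a documented property of €ASM.

```python
# Prefix name -> (group, opcode), copied from the prefix table.
PREFIXES = {
    "LOCK": (1, 0xF0), "REP": (1, 0xF3), "REPE": (1, 0xF3), "REPZ": (1, 0xF3),
    "REPNE": (1, 0xF2), "REPNZ": (1, 0xF2), "XACQUIRE": (1, 0xF2),
    "XRELEASE": (1, 0xF3),
    "SEGCS": (2, 0x2E), "SEGSS": (2, 0x36), "SEGDS": (2, 0x3E),
    "SEGES": (2, 0x26), "SEGFS": (2, 0x64), "SEGGS": (2, 0x65),
    "SELDOM": (2, 0x2E), "OFTEN": (2, 0x3E),
    "OTOGGLE": (3, 0x66), "ATOGGLE": (4, 0x67),
}

def encode_prefixes(names):
    """Return one opcode byte per prefix, or raise ValueError if the rule is broken."""
    if len(names) > 4:
        raise ValueError("more than four prefixes in one statement")
    seen_groups = set()
    opcodes = bytearray()
    for name in names:
        # Names are case insensitive and may end with colon(s).
        group, opcode = PREFIXES[name.rstrip(":").upper()]
        if group in seen_groups:
            raise ValueError(f"two prefixes from group {group}")
        seen_groups.add(group)
        opcodes.append(opcode)
    return bytes(opcodes)

print(encode_prefixes(["lock", "SEGES:"]).hex())   # f026
```

An unknown name raises KeyError in this sketch; combining REP with LOCK raises ValueError because both belong to group 1.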

Prefix name may not be used as a label, regardless of character-case.

Names of prefixes are case insensitive and reserved, they cannot be used as labels. Prefix name may terminate with colon(s) : (same as the symbol in label field).

AMD and Intel 64bit architecture introduced special prefixes REX, XOP, VEX, MVEX, EVEX. €ASM treats them as part of operation encoding and does not provide mnemonic for their direct declaration.

[AMDSSE5] introduced another instruction prefix DREX, but DREX-encoded instructions are not supported by €ASM as they never made it to the production, AFAIK.

↑ Operation field

Operation is the most important field of assembler statement; it tells €ASM what to do: declare something, change its internal state or emit something to the object file. Often it gives its name to the whole statement, we may say EXTERN operation instead of statement with EXTERN pseudoinstruction in the operation field.

€ASM recognizes three kinds of operation: machine instructions, pseudoinstructions and macroinstructions.

Statement may have no operation at all:

[DATA]   ; Redirect further emitting to section [DATA].
Label:   ; Define a label but do not emit any data or code.
         ; Empty statement may be used for optical separation or comments.

Statements which tell €ASM to generate assembled code to the object file are called emitting instructions.

↑ Operand field

Ordinal operand ↓
Keyword operand ↓
Mixing operands ↓

Operands specify data which the operation works with. Number of operands in the statement is not limited and it depends on the operation. Operand can be a register name, number, expression, identifier, string, their various combinations etc.

Operation field is separated from the first operand with at least one white-space. Operands are separated with unquoted comma , from one another. There are two kinds of operands in €ASM: ordinal and keyword.


↑ Ordinal operands

Ordinal operands (or shortly ordinals) are referred to by their order in the statement. The first operand has number one; in macros it is identified as %1. For instance, in the MOV AL,BL statement the AL register is operand No. 1 and BL is operand No. 2. Machine instruction MOV is known to copy the contents of the second operand to the first. A comma between operands increases the ordinal number even when the operand is empty (nothing but white spaces).

↑ Keyword operands

Besides ordinal operands, €ASM introduces one more type of operand: the keyword operand. Keyword operands (or simply keywords) are referred to by name (key word) rather than by their position in the operand list. A keyword operand has the form name=value, where name is an identifier immediately followed by an equal sign.

Keyword operands have many advantages: they are self-describing (if their name is chosen thoughtfully), they don't depend on position in the operand list (no tedious counting of commas), they may be assigned a default value, and they may be omitted when they have the default value.

Keyword operands are best used with macroinstructions, but €ASM also employs them in some pseudoinstructions and even in machine instructions. For instance, in INC [EDI],DATA=DWORD the keyword operand DATA= tells which form of the INC machine instruction (increment byte, word or dword variable) should be used.

↑ Mixing keyword and ordinal operands

Order of keyword operands is not important. It's a good practice to list ordinal operands first and then all keyword operands, but keywords may be mixed with ordinals.

Keyword operand does not increase the ordinal number.
Label1: Operation1 Ordinal1,Ordinal2,,Ordinal4,,
Label2: Operation2 Ordinal1,Keyword1=Value1,Ordinal2,,Ordinal4

Operation1 in the previous example has three operands with ordinal numbers 1,2 and 4. The third operand is empty. The last two commas at the end of line are ignored, as no other nonempty operand follows.

Mixed operands are used in Operation2. Notice that Ordinal2 has ordinal number 2 although it is the third operand on the list. Keyword operands do not count into ordinal numbers but empty operands do.
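
The numbering rules for Operation1 and Operation2 can be reproduced with a small Python model. It is a sketch under stated assumptions: the names split_operands and number_operands are hypothetical, a keyword is recognized simply as an identifier immediately followed by =, and commas nested in brackets or braces are not handled.

```python
import re

# An identifier (letters as defined in this manual, ASCII part only) followed by "=".
KEYWORD = re.compile(r"^([A-Za-z_@$`?][A-Za-z0-9_@$`?]*)=(.*)$", re.S)

def split_operands(field):
    """Split an operand field on commas which are outside quotes."""
    parts, current, quote = [], "", None
    for ch in field:
        if quote:                       # inside a quoted string
            if ch == quote:
                quote = None
            current += ch
        elif ch in "'\"":
            quote = ch
            current += ch
        elif ch == ",":
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    parts.append(current.strip())
    return parts

def number_operands(field):
    """Return ({ordinal number: text}, {keyword name: value})."""
    ordinals, keywords, ordinal = {}, {}, 0
    for part in split_operands(field):
        match = KEYWORD.match(part)
        if match:                       # keywords do not count into ordinal numbers
            keywords[match.group(1)] = match.group(2)
            continue
        ordinal += 1                    # empty operands do count
        if part:
            ordinals[ordinal] = part
    return ordinals, keywords

print(number_operands("Ordinal1,Keyword1=Value1,Ordinal2,,Ordinal4"))
```

For the Operation2 line the sketch reports ordinals 1, 2 and 4 plus one keyword, which matches the description above; trailing empty operands leave no trace in the result.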

↑ Line comment field

Line comment begins with unquoted semicolon ; and it ends with the end of physical line. Line comments are ignored by the assembler; they are meant for the human reader of the code.

↑ Line continuation

Statement continues on the next physical line when line continuation character, which is the unquoted backslash \, is used at the beginning of any field.

 aLabel:       \ ; This semicolon is redundant.
     MOV EAX,  \ The first operand of MOV is destination
         EBX   ; and the second one is source.

Everything following the line continuation character is treated like a comment field, so the semicolon may be omitted in this case. In a multiline statement you may add comments to any physical line.

Line continuation may appear at the beginning of any field, but not inside the field.

The whole field of any statement must fit on one physical line.

Backslash \ is also used as modulo binary operator, which cannot appear at the beginning of operation, so the confusion is avoided.

;                   modulo  modulo line-continuation
;                      |      |    |  
|0000:01000200 |  DW 5 \ 4, 6 \ 4, \
|0004:03000000 |     7 \ 4, 8 \ 4

↑ Vertical structure

Block statements ↓

Switch statements ↓

Standalone statements ↓

Statements in the assembler source are processed one by one, from the top downwards. Some of them may influence the successive statements but most instructions are standalone. From this point of view there are three kinds of statements: block statements, switch statements and standalone statements.


↑ Block statements

Block statement must appear in pair with its corresponding ending statement. Internal state of €ASM is changed only within the range between them, which is called block.

Block is a continuous range of statements which starts with begin-block statement and ends with end-block statement.

Block actually begins at the operation field of begin-block statement and it ends at the operation field of end-block statement.

Some block statements may be prematurely cancelled (broken) with exit operation, for instance when an error is detected during macro expansion.

Block statements
Label field                                      Operation field
Obligation  Represents                Declares   Begin block  Break        End block
mandatory   program name              program    PROGRAM      not used     ENDPROGRAM
mandatory   procedure name            symbol     PROC         not used     ENDPROC
mandatory   procedure name            symbol     PROC1        not used     ENDPROC1
mandatory   structure name            structure  STRUC        not used     ENDSTRUC
optional    block identifier          nothing    HEAD         not used     ENDHEAD
optional    block identifier          nothing    %COMMENT     not used     %ENDCOMMENT
optional    block identifier          nothing    %IF          %ELSE        %ENDIF
optional    block identifier          nothing    %WHILE       %EXITWHILE   %ENDWHILE
optional    ids of Begin/End swapped  nothing    %REPEAT      %EXITREPEAT  %ENDREPEAT
mandatory   formal control variable   %variable  %FOR         %EXITFOR     %ENDFOR
mandatory   macro name                macro      %MACRO       %EXITMACRO   %ENDMACRO

Some end-block operations can be aliased:
ENDPROC alias ENDP,
ENDPROC1 alias ENDP1,
%ENDREPEAT alias %UNTIL.

Label field of a block statement specifies the name of program, procedure, structure or macro. In the preprocessing %FOR loop the label field declares formal variable which changes its value in each loop cycle. In other preprocessing loops the label field is optional and it may contain identifier which optically connects the beginning and ending block statements together (for nesting check) but has no further significance - it does not declare a symbol.

Assemblers are not united in the format of block pseudoinstructions. MASM uses the same block identifier in the label fields of both begin- and end-block statements:

MyProcedure PROC    ; MASM syntax
     ; some code
MyProcedure ENDP

This is good when you search the source for a procedure definition. Its name is on the left, so it catches your eye when you scan the leftmost column. On the other hand, the same label appears in the source twice, making an ugly exception to the rule that a unique symbol may be declared only once in the program.

Perhaps for that reason Borland chose a different syntax in TASM IDEAL mode:

 PROC MyProcedure   ; TASM syntax
        ; some code
 ENDP MyProcedure

It solves the double label problem but the name of MyProcedure never appears in the label field, although it is a regular label.

€ASM adopts a compromise solution: the name of the block is defined in the label field of the begin-block statement and it may appear in the end-block statement as the first and only operand:

MyProcedure PROC  ; €ASM syntax
                  ; some code
            ENDP MyProcedure

The operand in the end-block statement can be omitted but, if used, it must be the same as the label of the corresponding begin-block statement. This helps to maintain correct block nesting because €ASM will emit an error when block identifiers don't match.

Blocks of code may nest.

Two blocks are correctly nested when one block contains the entire other block.
All blocks in €ASM must be correctly nested.

%MACRO block in the example below contains correctly nested %IF block.

WriteCMOS %MACRO Address,Value
           %IF %1 <= 30h
             %ERROR "Checksum protected area!"
             %EXITMACRO WriteCMOS
           %ENDIF
           MOV AL,%1
           OUT 70h,AL
           MOV AL,%2
           OUT 71h,AL
          %ENDMACRO WriteCMOS
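
The nesting rule can be modelled with a stack, as in the following Python sketch. It is not the assembler's implementation: the name check_nesting and the (label, operation, operand) tuples are hypothetical, the error texts are invented for illustration, and the special identifier handling of %REPEAT is not modelled.

```python
# Begin-block operation -> end-block operation, from the block statements table.
BLOCKS = {
    "PROGRAM": "ENDPROGRAM", "PROC": "ENDPROC", "PROC1": "ENDPROC1",
    "STRUC": "ENDSTRUC", "HEAD": "ENDHEAD", "%COMMENT": "%ENDCOMMENT",
    "%IF": "%ENDIF", "%WHILE": "%ENDWHILE", "%REPEAT": "%ENDREPEAT",
    "%FOR": "%ENDFOR", "%MACRO": "%ENDMACRO",
}
ALIASES = {"ENDP": "ENDPROC", "ENDP1": "ENDPROC1", "%UNTIL": "%ENDREPEAT"}
END_OPERATIONS = set(BLOCKS.values())

def check_nesting(statements):
    """Return a list of error messages; an empty list means correct nesting."""
    errors, stack = [], []
    for label, operation, operand in statements:
        op = operation.upper()              # built-in names are case insensitive
        op = ALIASES.get(op, op)
        if op in BLOCKS:
            stack.append((op, label))       # remember the open block and its name
        elif op in END_OPERATIONS:
            if not stack or BLOCKS[stack[-1][0]] != op:
                errors.append(f"{op} does not close the innermost open block")
                continue
            _, begin_label = stack.pop()
            if operand and operand != begin_label:
                errors.append(f"{op} {operand} does not match {begin_label}")
    errors.extend(f"{op} {label} is never closed" for op, label in stack)
    return errors

macro = [("WriteCMOS", "%MACRO", "Address,Value"), ("", "%IF", "%1 <= 30h"),
         ("", "%ENDIF", ""), ("", "%ENDMACRO", "WriteCMOS")]
print(check_nesting(macro))   # []
```

Statements which neither open nor close a block, such as %EXITMACRO or MOV, are simply skipped by this sketch.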

↑ Switch statements

Switching statement changes the internal state of €ASM for all following statements until another switching statement changes the state again, or until the end of source is encountered.

There are two switching pseudoinstructions in €ASM: EUROASM, and SEGMENT. The latter has two forms:
[name] SEGMENT (define a new segment) and
[name] (define new section in current segment if not declared yet, and switch emitting to this section).
Examples of switching statements:

 EUROASM  AUTOSEGMENT=OFF, CPU=486 ; Change €ASM options for all following statements.
[Subprocedures] SEGMENT PURPOSE=CODE, ALIGN=BYTE  ; Declare a new segment.
[.data]                  ; Switch emitting of following statements to previously defined segment [.data]
[StringData]             ; Define a new section in the current segment (in [.data]).

↑ Standalone statements

All other pseudoinstructions and machine instructions are not logically bound with others in a vertical structure of a program, so they are standalone.


↑ Elements of €ASM program

Comments ↓

Identifiers ↓

Numbers ↓

Enumerated values ↓

Strings ↓

Addressing space ↓

Addresses ↓

Alignment ↓

Registers ↓

Condition codes ↓

Operators ↓

Expressions ↓

Sections ↓

Segments ↓

Groups ↓

Segmentation ↓

Distance ↓

Width ↓

Namespace ↓

Scope ↓

Data types ↓

Symbols ↓

Literals ↓

%Variables ↓


↑ Comments

Line comments ↓

Machine remarks ↓

Markup comments ↓

Block comments ↓

Comments are parts of source code which are not processed by the assembler; their only purpose is to explain the code to the human reader. There are four types of comments in €ASM: line comments, machine remarks, markup comments and block comments.


↑ Line comments

Line comment starts with unquoted semicolon; everything up to the end of line is ignored by €ASM. Line comments are copied to the listing file.

 Label: CALL SomeProc ; This is a line comment.

↑ Machine remarks

Machine remarks are created by €ASM in the listing file and they contain the generated machine code in hexadecimal notation.

Machine remark starts with vertical bar | which is the first non-white character on the physical line. Machine remark ends with the second occurrence of the same vertical bar | , or with the end of line (whichever comes first). So, when the closing | is omitted, the whole physical line is treated as remark. This is used for inserting error messages into the listing, just below the erroneous statement.

|0030:E81234   |Label1: CALL SomeProc ; This is a line comment.
|0033:         |Label2: COLL OtherProc ; Typing error in operation name.
|### E6860 Unrecognized operation "COLL", ignored.

Machine remarks are ignored by €ASM and they are not copied to the listing. €ASM creates them again instead, if the listing produced by previous assembly session is submitted as source to assemble.

Machine remarks are not intended to be manually inserted by programmer into source text, use ordinary line comment instead.
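
The way a listing line turns back into source can be sketched in Python as follows. The function name strip_machine_remark is hypothetical, and Python's notion of white space is used as an approximation of the white-space class defined in this manual.

```python
def strip_machine_remark(line):
    """Return the physical line without its machine remark."""
    stripped = line.lstrip()
    if not stripped.startswith("|"):
        return line                  # no machine remark on this line
    closing = stripped.find("|", 1)
    if closing < 0:
        return ""                    # no closing bar: the whole line is a remark
    return stripped[closing + 1:]

listing = [
    "|0030:E81234   |Label1: CALL SomeProc ; This is a line comment.",
    '|### E6860 Unrecognized operation "COLL", ignored.',
]
for physical_line in listing:
    print(repr(strip_machine_remark(physical_line)))
```

The first listing line is reduced to the original statement, while the inserted error message disappears completely, which is why a listing can be assembled again as source.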

↑ Markup comments

When a physical line begins with less-than character <, it is treated as a markup comment and ignored to the end of line. This makes it possible to mix the source code with hypertext markup language tags. Markup comments are not copied to the listing.

Thanks to markup comments, €ASM source can be stored not only as a plain-text but also as HTML or XML hypertext.

<h2>Description of SomeProcedure</h2>
<img src="SomeImage.png">
SomeProcedure  PROC  ; See the image above for description.

All source code shipped with €ASM is stored in HTML format, which makes it possible to document the source with hypertext links, tables, images and a better visual representation than simple line comments could yield.

If you want to keep your sources in HTML, make sure that €ASM statements do not start with < and rearrange the source so that every markup comment line starts with some HTML tag. You may also use void HTML tags <span/> or <!----> to start the comment line.

↑ Block comments

Block comment can be used to temporarily disable a portion of source code or to include documentation inside the source.

Block comment begins with %COMMENT statement and it ends with the corresponding %ENDCOMMENT. It can span over many lines of program, which don't have to start with semicolons.
Block comments are copied into the listing file.

€ASM does not assemble the text inside the commented-out block, but it needs to parse it anyway in order to find the corresponding %ENDCOMMENT statement, so the commented-out text should be a valid source as well.

Block comments are nestable.

The text in %COMMENT block must be correctly nested, although it is ignored.

The pseudoinstruction %COMMENT could be easily replaced with %IF 0, but the former one is more intuitive.
 CALL SomeProc ; This is a line comment.
 %COMMENT  ; This is a block comment.
 COLL OtherProc ; Typing error in operation name.
    %COMMENT ; This is a nested block comment.
    %ENDCOMMENT ; End of inner block comment.
    ; This statement is ignored, too.
 %ENDCOMMENT 
 ; Emitting assembly continues here.

↑ Identifiers

Identifier is a human readable text which gives name to an element of assembler program: a symbol, register, instruction, structure etc.

Identifier is a combination of letters and digits which begins with a letter.

Length of identifiers is not limited in €ASM and all characters are significant.


↑ Numbers

Decimal numbers ↓

Binary numbers ↓

Octal numbers ↓

Hexadecimal numbers ↓

Integer numbers overview ↓

Floating point numbers ↓

Floating point special values ↓

Character constants ↓

Number notation is the way to write numeric value. Numeric values are kept and computed internally by €ASM as 64-bit signed integers.

Number notation is a combination of digits and number modifiers, which begins with decimal digit.

Number modifier is one of the characters B D E G H K M P Q T appended to the end of the digit sequence, or one of 0N 0O 0X 0Y (zero followed by a letter) prefixed in front of other digits. All number modifiers are case insensitive. Except for decimal format, which is the default, a modifier must always be used.

Floating point numbers may use fullstop . to separate integer and decimal part of the number notation.

Another number modifier is underscore character _ which is ignored by number parser and can be used as digit separator instead of space for better readability of long numbers. No white spaces are allowed in the number notation.

↑ Decimal numbers

Decimal number is a combination of decimal digits optionally postfixed with decimal modifier D. There are five other decimal modifiers:
K (Kilo), which tells €ASM to multiply the number by 2¹⁰ = 1024,
M (Mega), which tells €ASM to multiply the number by 2²⁰ = 1_048_576,
G (Giga), which tells €ASM to multiply the number by 2³⁰ = 1_073_741_824,
T (Tera), which tells €ASM to multiply the number by 2⁴⁰ = 1_099_511_627_776,
P (Peta), which tells €ASM to multiply the number by 2⁵⁰ = 1_125_899_906_842_624.

Decimal numbers may be prefixed with 0N modifier.

All six numbers in the following example have the same value: 1048576, 1048576d, 0n1048576, 1_048_576, 1024K, 1M.

Maximal possible number which fits into 32 bits is 0xFFFF_FFFF=4_294_967_295.

Maximal possible number which fits into 63 bits is 0x7FFF_FFFF_FFFF_FFFF=9_223_372_036_854_775_807.

↑ Binary numbers

Binary number is made of digits 0 1 appended with binary number modifier B or prefixed by modifier 0Y. Examples: 0y101, 101b, 00110010b, 1_1111_0100B are equivalent to decimal numbers 5, 5, 50, 500 respectively.

Maximal 32bit binary number is 1111_1111__1111_1111__1111_1111__1111_1111b.

↑ Octal numbers

Each octal digit 0..7 represents three bits of equivalent binary notation. The number is terminated with octal suffix Q or prefixed with 0O alias 0o (digit zero followed by capital or small letter O).

Example: 177_377q = 0o177_377 = 0xFEFF

The biggest 32bit octal number is 37_777_777_777q.

The biggest 64bit octal number is 1_777_777_777_777_777_777_777q.

↑ Hexadecimal numbers

Hexadecimal digit encodes four bits in one character, which requires 2⁴ = 16 possible values. Therefore the ten decimal digits are extended with letters A, B, C, D, E, F with values 10, 11, 12, 13, 14, 15. Hexadecimal letters A..F are case insensitive. When the first digit of a hexadecimal number is represented with letter A..F, an additional leading zero must be prefixed to the number notation. Hexadecimal number is terminated with suffix H or it begins with prefix 0X.

Example: 5h, 0x32, 1F4H, 0x1388, 0C350H represent decimal numbers 5, 50, 500, 5000, 50000 respectively.

Keep in mind that all numbers in €ASM are internally kept as 64bit signed integer. Although instructions MOV EAX,0xFFFF_FFFF and MOV EAX,-1 assemble to identical codes, their operands are stored as 0x0000_0000_FFFF_FFFF and 0xFFFF_FFFF_FFFF_FFFF. Boolean expression 0xFFFF_FFFF = -1 is false.

↑ Integer numbers overview

Integers may be written in binary, decimal, octal or hexadecimal notation. Some number modifiers overlap with hexadecimal digits B, D, E. €ASM parses as much of the element as possible to resolve such ambiguity:
1BH is parsed as hexadecimal number 0x1B=27 and not binary 1 followed with letter H.
2DH is parsed as hexadecimal number 0x2D=45 and not decimal 2 followed with letter H.
3E2H is parsed as hexadecimal number 0x3E2=994 and not 3 * 10² followed with letter H.

Integer number notation
Notation     Prefix  Base  Suffix  Multiplier
Binary       0Y      2     B       1
Octal        0O      8     Q       1
Decimal      0N      10    D       1
                           K       2¹⁰
                           M       2²⁰
                           G       2³⁰
                           T       2⁴⁰
                           P       2⁵⁰
Hexadecimal  0X      16    H       1

Binary, octal and hexadecimal numbers must always be written with prefix or suffix (or both, however this is not recommended). There is no RADIX directive in €ASM.

For more examples of acceptable syntax see €ASM numbers tests.
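
The integer notations described in this chapter can be summarized by a short Python model. It is a sketch under stated assumptions, not the €ASM parser: the name parse_integer is hypothetical, a notation with a prefix is assumed to carry no suffix, and neither the 64-bit range nor floating point notation is checked.

```python
PREFIX_BASE = {"0Y": 2, "0O": 8, "0N": 10, "0X": 16}
SUFFIX_BASE = {"B": 2, "Q": 8, "D": 10, "H": 16}
MULTIPLIER = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40, "P": 2**50}

def parse_integer(notation):
    """Return the value of an integer number notation."""
    text = notation.replace("_", "").upper()   # digit separators are ignored
    if not text:
        raise ValueError("empty number notation")
    if text[:2] in PREFIX_BASE:
        return int(text[2:], PREFIX_BASE[text[:2]])
    last = text[-1]
    if last == "H":
        # Tested first, so that 1BH, 2DH and 3E2H are taken as hexadecimal.
        return int(text[:-1], 16)
    if last in SUFFIX_BASE:
        return int(text[:-1], SUFFIX_BASE[last])
    if last in MULTIPLIER:
        return int(text[:-1]) * MULTIPLIER[last]
    return int(text)                           # plain decimal number

for sample in ("1_048_576", "1024K", "0y101", "177_377q", "0C350H", "3E2H"):
    print(sample, parse_integer(sample))
```

Checking the H suffix before the other modifiers is one simple way to obtain the "parse as much as possible" behaviour described above. Note also that parse_integer("0xFFFF_FFFF") yields 4294967295, which differs from -1, in line with the remark about 64-bit signed integers.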

↑ Floating point numbers

Floating point alias real numbers are parsed from scientific notation with decimal point and exponent of 10, using this syntax:

FP number notation anatomy
Order  Field name          Contents
1      number sign         +, - or nothing
2      significand         digits 0..9, digit separators _
3      decimal point       .
4      fraction            digits 0..9, digit separators _
5      FP number modifier  E or e
6      exponent sign       +, - or nothing
7      exponent part       digits 0..9, digit separators _

For instance, floating point number 1234.56E3 has value 1234.56 * 10³ = 1234560.

Omitted sign is treated as +.

Decimal part can be omitted when zero(s). 123.00E2 = 123.E2

Decimal point may be omitted when decimal part is omitted (is zero). The E modifier still specifies the floating point format. 123.00E2 = 123.E2 = 123E2 = 12300.

Exponent can be omitted when it is zero. Modifier E may be omitted in this case, too. Without E modifier it is the presence of decimal point which decides if the number is integer or real. Example: 12345.67E0 = 12345.67E = 12345.67

No white space is allowed within FP number notation.

The number is considered as floating point when its notation contains either decimal point ., or modifier E (capital or small letter E), or both. Otherwise it is treated as integer.
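
The rules for recognizing a floating point notation can be condensed into one regular expression. The following Python sketch is only a model of the anatomy table; the name fp_value is hypothetical, and returning None for integers is a convention of this sketch.

```python
import re

FP_NOTATION = re.compile(
    r"^([+-]?)"               # 1: number sign
    r"([0-9][0-9_]*)"         # 2: significand
    r"(\.[0-9_]*)?"           # 3, 4: decimal point and fraction
    r"([Ee][+-]?[0-9_]*)?$"   # 5, 6, 7: modifier, exponent sign, exponent part
)

def fp_value(notation):
    """Return the value of an FP notation, or None if the text is not one."""
    match = FP_NOTATION.match(notation)
    if match is None:
        return None
    sign, significand, fraction, exponent = match.groups()
    if fraction is None and exponent is None:
        return None                      # neither point nor E: an integer
    digits = significand.replace("_", "")
    frac = (fraction or ".")[1:].replace("_", "") or "0"
    exp = (exponent or "E")[1:].replace("_", "")
    if exp in ("", "+", "-"):
        exp = "0"                        # omitted exponent is zero
    return float(f"{sign}{digits}.{frac}e{exp}")

for sample in ("1234.56E3", "123.E2", "123E2", "12345.67E", "123", "3E2H"):
    print(sample, fp_value(sample))
```

The last two samples are not floating point: 123 has neither a decimal point nor the E modifier, and 3E2H is a hexadecimal integer, so the sketch returns None for both.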

€ASM does not calculate with floating point numbers at assembly time.

All internal calculations in €ASM are performed with 64bit integers only. When an FP number is used in a mathematical expression, it is converted to integer first. Error E6130 (number overflow) is reported if the number does not fit to 64 bits. Warning W2210 (precision lost) is reported if the FP number had a decimal part which was rounded in conversion.

Actual FP number format [IEEE754] is maintained only when the scientific notation is used to define static FP variable with pseudoinstruction DD, DQ, DT.

Half-precision FP numbers (float16) are not supported by €ASM, nor are they supported by processors, with the exception of two packed SIMD instructions VCVTPS2PH and VCVTPH2PS, and a few MVEX-encoded up/down conversion operations.

Unlike integer numbers, the sign of FP notation is inseparable from digits which follow. If you by mistake put a space between the sign and the number, instead of FP definition it is treated as an operation (unary minus applied to a number), and therefore the FP number is converted to integer first, before the operation is evaluated.

|00000000:001DF1C7             | DD -123.45E3  ; Single-precision FP number.
|00000004:C61DFEFF             | DD - 123.45E3 ; Dword signed integer number.
|00000008:00000000A023FEC0     | DQ -123.45E3  ; Double-precision FP number.
|00000010:C61DFEFFFFFFFFFF     | DQ - 123.45E3 ; Qword signed integer number.
|00000018:0000000000001DF10FC0 | DT -123.45E3  ; Extended-precision FP number.
|00000022:                     | DT - 123.45E3 ; Tbyte integer number is not supported.
|### E6725 Datatype TBYTE expects plain floating-point number.
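
The emitted bytes of the DD and DQ lines can be reproduced with Python's standard struct module, which packs IEEE 754 single and double precision values in little-endian order. This is only a cross-check of the listing; the variable names are hypothetical, and the extended precision (DT) case is left out because struct has no such format.

```python
import struct

real_operand = -123.45E3           # "-123.45E3": the sign is part of the FP notation
integer_operand = -int(123.45E3)   # "- 123.45E3": converted to integer, then negated

print(struct.pack("<f", real_operand).hex().upper())     # 001DF1C7
print(struct.pack("<i", integer_operand).hex().upper())  # C61DFEFF
print(struct.pack("<d", real_operand).hex().upper())     # 00000000A023FEC0
print(struct.pack("<q", integer_operand).hex().upper())  # C61DFEFFFFFFFFFF
```

The four printed strings agree with the machine remarks of the first four listing lines, which shows how much a single space changes the stored data.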

↑ Floating point special values

Besides the standard scientific notation, a floating-point number may be written as one of the special FP constants:

Special floating-point constant values (in hexadecimal notation)
Constant  Interpretation                   single precision (DD)  double precision (DQ)  extended precision (DT)
#ZERO     zero                             00000000               00000000_00000000      0000_00000000_00000000
+#ZERO    positive zero                    00000000               00000000_00000000      0000_00000000_00000000
-#ZERO    negative zero                    80000000               80000000_00000000      8000_00000000_00000000
#INF      infinity                         7F800000               7FF00000_00000000      7FFF_80000000_00000000
+#INF     positive infinity                7F800000               7FF00000_00000000      7FFF_80000000_00000000
-#INF     negative infinity                FF800000               FFF00000_00000000      FFFF_80000000_00000000
#PINF     pseudo infinity                  7F800000               7FF00000_00000000      7FFF_00000000_00000000
+#PINF    positive pseudo infinity         7F800000               7FF00000_00000000      7FFF_00000000_00000000
-#PINF    negative pseudo infinity         FF800000               FFF00000_00000000      FFFF_00000000_00000000
#NAN      not a number                     7FC00000               7FF80000_00000000      7FFF_C0000000_00000000
+#NAN     positive not a number            7FC00000               7FF80000_00000000      7FFF_C0000000_00000000
-#NAN     negative not a number            FFC00000               FFF80000_00000000      FFFF_C0000000_00000000
#PNAN     pseudo not a number              7F800001               7FF00000_00000001      7FFF_00000000_00000001
+#PNAN    positive pseudo not a number     7F800001               7FF00000_00000001      7FFF_00000000_00000001
-#PNAN    negative pseudo not a number     FF800001               FFF00000_00000001      FFFF_00000000_00000001
#QNAN     quiet not a number               7FC00000               7FF80000_00000000      7FFF_C0000000_00000000
+#QNAN    positive quiet not a number      7FC00000               7FF80000_00000000      7FFF_C0000000_00000000
-#QNAN    negative quiet not a number      FFC00000               FFF80000_00000000      FFFF_C0000000_00000000
#SNAN     signaling not a number           7F800001               7FF00000_00000001      7FFF_80000000_00000001
+#SNAN    positive signaling not a number  7F800001               7FF00000_00000001      7FFF_80000000_00000001
-#SNAN    negative signaling not a number  FF800001               FFF00000_00000001      FFFF_80000000_00000001
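The single- and double-precision patterns from the table can be cross-checked by decoding them as IEEE 754 values. This is only a sketch in Python, not part of €ASM; the function names are assumptions, and the extended-precision column (where the pseudo values differ) is not covered because `struct` has no 80-bit format.

```python
import math
import struct

def decode_single(hex_text: str) -> float:
    """Decode a 32-bit pattern written as a hexadecimal number."""
    return struct.unpack(">f", bytes.fromhex(hex_text.replace("_", "")))[0]

def decode_double(hex_text: str) -> float:
    """Decode a 64-bit pattern written as a hexadecimal number."""
    return struct.unpack(">d", bytes.fromhex(hex_text.replace("_", "")))[0]

print(decode_single("7F800000"))             # #INF
print(decode_single("FF800000"))             # -#INF
print(decode_single("80000000"))             # -#ZERO
print(decode_double("7FF80000_00000000"))    # #NAN / #QNAN
```

Python cannot distinguish quiet from signaling NaN after decoding, so the sketch only confirms that both patterns are some kind of NaN.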

↑ Character constants

A number can also be written as a character constant, which is a string containing no more than eight characters (a longer string is accepted only when the ninth and all following characters are NUL). Its numeric value is taken from the ordinal number of each character in the ASCII table. Examples of character constants and their values:

'0'   =     30h =      48
'abc' = 636261h = 6513249
"4%%" =   2534h =    9524
Character with the least significant value is on the left position in the string.

Assemblers are not united in their treatment of character constants. MASM and TASM use the scriptual convention, where the order of characters in the written source corresponds with the way we write numbers: least significant digit on the right.

€ASM, as well as other newer assemblers, uses the memory convention, where the order of characters in the written source corresponds with the order in which they are stored in memory on little-endian processors.

|                    | ; MASM and TASM:
|00000000:616263     | DB 'abc'      ; String.
|00000003:63626100   | DD 'abc'      ; Character constant.
|00000007:B863626100 | MOV EAX,'abc' ; AL='c'.
|                    | ; €ASM, FASM, GoASM, NASM, SpASM:
|00000000:616263     | DB 'abc'      ; String.
|00000003:61626300   | DD 'abc'      ; Character constant.
|00000007:B861626300 | MOV EAX,'abc' ; AL='a'.
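The two conventions differ only in byte order, which can be illustrated with a short Python sketch. The function names are hypothetical; the sketch ignores quote and percent doubling and simply takes the string contents.

```python
def char_const_memory(text: str) -> int:
    """Memory convention (€ASM, NASM, ...): first character is least significant."""
    data = text.encode("ascii")
    if len(data) > 8:
        raise ValueError("character constant is limited to eight characters")
    return int.from_bytes(data, "little")

def char_const_scriptual(text: str) -> int:
    """Scriptual convention (MASM, TASM): last character is least significant."""
    data = text.encode("ascii")
    if len(data) > 8:
        raise ValueError("character constant is limited to eight characters")
    return int.from_bytes(data, "big")

print(hex(char_const_memory("abc")))      # value used by €ASM
print(hex(char_const_scriptual("abc")))   # value used by MASM
# The low byte is what ends up in AL after MOV EAX,'abc'.
print(chr(char_const_memory("abc") & 0xFF), chr(char_const_scriptual("abc") & 0xFF))
```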

↑ Enumerated values

Some operands may acquire only one of a few predefined values, e.g. the EUROASM option CPU= may be 086, 186, 286, 386, 486, 586, 686, PENTIUM, P6, X64.

All built-in €ASM boolean options accept any of enumerated tokens TRUE, YES, ON, ENABLE, ENABLED as logical true, and FALSE, NO, OFF, DISABLE, DISABLED as logical false (case insensitive).

Real enumeration is used only with operands built into €ASM. The enumerated tokens are not symbols that could be used elsewhere, such as in MOV EAX,TRUE. To achieve similar functionality in macros, the programmer would have to define such symbols first, e.g.

FALSE   EQU 0
false   EQU 0
TRUE    EQU -1
true    EQU !false
MOV EAX,TRUE

↑ Strings

String is a sequence of arbitrary characters enclosed in quotes. Either double quotes " or single quotes ' (also called apostrophes) may be used to delimit a string. Surrounding quotes do not count into the string contents. All characters within the string lose their semantic meaning with three exceptions:

  1. EOL cannot be used in strings. In other words, every string must fit into one physical line. This can be bypassed by declaring the end of line with its ASCII value, e.g. MultilineString: DB "This is the first line",13,10,"and this is the second one.",13,10,0
  2. The same quote character which is used to surround the string cannot be used inside, unless it is doubled, e.g. Surname: DB 'O''Brien',0
  3. The percent sign % keeps its function of a %variable prefix. Use two adjacent percents when a single % is required in a string, e.g. DB "100%% completed."
Preprocessing %variables are expanded in strings.

No escape character is employed in €ASM, in fact the percent sign and quote escape themselves. If you need to use any of the above mentioned characters within a string, they must be doubled. This duplication (self-escaping) concerns only the notation in the source and it does not increase the string size in computer memory.
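The self-escaping rule can be sketched as a small decoder which turns the source notation of a string into its contents. This is an illustration in Python, not the €ASM implementation; the function name is hypothetical, and expansion of %variables is deliberately not modelled (a single % raises an error instead).

```python
def string_contents(literal: str) -> str:
    """Strip the surrounding quotes and undo doubling of the quote and of %."""
    if len(literal) < 2 or literal[0] not in "\"'" or literal[-1] != literal[0]:
        raise ValueError("string must begin and end with the same quote")
    quote = literal[0]
    body = literal[1:-1]
    result = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == quote or ch == "%":
            if i + 1 < len(body) and body[i + 1] == ch:
                result.append(ch)       # doubled character stands for itself
                i += 2
                continue
            if ch == quote:
                raise ValueError("odd number of quotes")
            raise ValueError("single % starts a %variable (not modelled here)")
        result.append(ch)
        i += 1
    return "".join(result)

print(string_contents('"80 %% of users said ""yes"""'))
print(string_contents("'O''Brien'"))
```

The decoded contents are shorter than the source notation, which matches the remark that doubling does not increase the string size in memory.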

Strings enclosed in 'single quotes' and "double quotes" are equivalent with one exception: if the string contains a filename, only double quotes may be used, because the apostrophe is a valid character in filenames on most filesystems. Examples of strings:

"80 %% of users said ""yes"""
''        ; empty string
"'"       ; a single apostrophe
''''      ; a single apostrophe
'''       ; error: odd number of quotes
"It ain't necessarily so'   ; error: types of quotes do not match

↑ Addressing space

Processor, alias Central Processing Unit (CPU), operates with data and communicates with its environment (registers, memory and devices). A typical operation reads a piece of information from a register, memory or port (I/O device), manipulates the data and writes it back to the environment. The smallest addressable unit is one byte (1 B) and the number of addressable bytes is limited by the addressing space. A register is identified by its name, a device is identified by its port number, a byte in memory is identified by its address.

CPU addressing space
CPU mode  GPR space  I/O port space  Memory addressing space
16bit      8 * 2 B   64 KB (2^16)    1 MB (2^(16+4))
32bit      8 * 4 B   64 KB (2^16)    4 GB (2^32)
64bit     16 * 8 B   64 KB (2^16)    16384 PB (2^64)

↑ Addresses

Addressing space is limited by the CPU architecture and by the number of wires connecting CPU with memory chips. The combination of logical zeros and ones, which can be measured on these wires, is called physical address (PhA). From programmer's point of view the processor writes or reads from virtual address (VA). Both virtual and physical address were identical only in first generations of Intel processors operating in real mode without memory cache and memory paging.

Objects in the linked image of a protected-mode program are often addressed with an offset from the beginning of the image loaded in memory. Such an offset is called Relative Virtual Address (RVA).

The position of data items in file formats is sometimes identified with a file address (FA), which is defined as the distance between the start of the file and the actual data item.

Address is a symbolic representation of some position in memory.

PhA, VA, RVA, FA are plain non-negative integer numbers, but addressing at assembly-time is rather more complicated. For historical reasons the address space is divided into segments of memory, and each segment is identified by the contents of a segment register. An address at assembly-time is usually expressed as the number of bytes (the offset) between the position and the start of the segment in which the code lies. However, there is a fundamental difference between an address and a number. An address is a piece of information which consists of two numeric components:

offset
is the component which expressions explicitly calculate with; it is visible in machine code
segment
represents address of the first byte of memory block – segment – where the assembled machine code is loaded at run-time

Unlike with plain numbers, the repertory of numeric operations with addresses is very limited. Addresses cannot be added to one another, multiplied, shifted etc.; only the following operations are permitted:

  1. Positive or negative number may be added to an address; this will increase/decrease its offset.
  2. Two addresses within the same segment may be subtracted and the result is plain numeric value.
  3. Two addresses within the same segment may be compared to find out whether they are equal or which one is above the other.

Plain number is also called absolute number or scalar value, and address is called relative address or vector value in other assemblers. Number is internally stored by €ASM in eight bytes, but an address needs additional room to keep information of the segment it belongs to.
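The three permitted operations can be modelled with a small value type. This is a sketch in Python under stated assumptions: the class name `Address`, the segment names and the method `above` are hypothetical, and the segment is represented just by its name.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Address:
    segment: str   # segment the address belongs to
    offset: int    # distance from the start of that segment

    def _same_segment(self, other: "Address") -> None:
        if self.segment != other.segment:
            raise ValueError("addresses belong to different segments")

    def __add__(self, number: int) -> "Address":
        # 1. A plain number may be added; only the offset changes.
        if isinstance(number, Address):
            raise TypeError("two addresses cannot be added")
        return Address(self.segment, self.offset + number)

    def __sub__(self, other):
        # 2. Address - address (same segment) gives a plain number.
        if isinstance(other, Address):
            self._same_segment(other)
            return self.offset - other.offset
        # Subtracting a plain number decreases the offset.
        return Address(self.segment, self.offset - other)

    def above(self, other: "Address") -> bool:
        # 3. Addresses within the same segment may be compared.
        self._same_segment(other)
        return self.offset > other.offset

start = Address("DATA", 0x08)
end = Address("DATA", 0x20)
print(end - start, (start + 4).offset, end.above(start))
```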

↑ Alignment

Data and code are retrieved from memory faster when their address is aligned, which means rounded to a multiple of a power of two. Though most IA-32 CPU instructions can cope with unaligned data, access takes more time when the data read from memory do not lie in the same cache line and the CPU may need to shift the information internally during fetch-time.

For the best performance, memory variables should be aligned to their natural alignment which corresponds with their size, see the Autoalign column in Data types table. Doublewords, for instance, have autoalign value 4, which says that the last two bits of a properly aligned address must be zero. QWORDs are aligned to 8, therefore the last three bits (8 = 2^3) must be zero.

Alignment can be achieved explicitly with ALIGN pseudoinstruction, or with ALIGN= keyword given in machine instruction or in pseudoinstructions D, PROC, PROC1.

Memory variables are aligned implicitly when the EUROASM option AUTOALIGN=ON is set. For instance the statement SomeDword: DD 1234 is autoaligned by 4 (the offset of SomeDword can be divided by 4 without a remainder). Alignment stuff, which fills the space in front of the aligned statement, is NOP 0x90 in code segments and zero 0x00 in data segments.
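Rounding an offset up to the next aligned position, and the stuff bytes emitted in front of it, can be sketched as follows. This is an illustrative Python model with hypothetical function names; the fill values 0x90 and 0x00 are taken from the paragraph above, and a zero alignment is treated like 1.

```python
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the nearest multiple of alignment (a power of two)."""
    if alignment == 0:
        alignment = 1                      # zero behaves like ALIGN=1
    if alignment < 1 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    return (offset + alignment - 1) & ~(alignment - 1)

def alignment_stuff(offset: int, alignment: int, code_segment: bool) -> bytes:
    """Bytes which fill the gap in front of the aligned statement."""
    fill = 0x90 if code_segment else 0x00  # NOP in code, zero in data
    return bytes([fill]) * (align_up(offset, alignment) - offset)

print(hex(align_up(0x1235, 4)))            # next DWORD-aligned offset
print(alignment_stuff(5, 8, code_segment=True))
```

An aligned offset has its low bits clear, e.g. `align_up(x, 8) & 7` is always zero.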

The align value may be numeric expression which evaluates to 1, 2, 4, 8 or higher power of two. €ASM accepts without warning zero or empty value, too, which is identical to ALIGN=1 (has no effect). Beside the numeric values ALIGN also accepts enumerated values BYTE, WORD, DWORD, QWORD, OWORD, YWORD, ZWORD alias their short versions B, W, D, Q, O, Y, Z.

Alignment is always limited by the alignment of segment which the statement lies in. If the current segment is DWORD aligned, we cannot ask for QWORD or OWORD alignment in this segment. However, default segment alignment is OWORD in €ASM.

Beside the instruction modifier ALIGN=, the alignment may also be established with the explicit pseudoinstruction ALIGN, which allows intentional misalignment, too.

↑ Registers

Register is a small fast fixed-size variable located on CPU chip.

Though a register remembers information written to it, it is not a part of addressable memory. Registers can be referenced by their names only; they have no address.

Registers table
Family     Members  Size (bytes)
GPR 8bit   AL, AH, BL, BH, CL, CH, DL, DH, DIB, SIB, BPB, SPB, R8B, R9B, R10B, R11B, R12B, R13B, R14B, R15B, DIL, SIL, BPL, SPL, R8L, R9L, R10L, R11L, R12L, R13L, R14L, R15L  1
GPR 16bit  AX, BX, CX, DX, BP, SP, SI, DI, R8W, R9W, R10W, R11W, R12W, R13W, R14W, R15W  2
GPR 32bit  EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI, R8D, R9D, R10D, R11D, R12D, R13D, R14D, R15D  4
GPR 64bit  RAX, RBX, RCX, RDX, RBP, RSP, RSI, RDI, R8, R9, R10, R11, R12, R13, R14, R15  8
Segment    CS, SS, DS, ES, FS, GS  2
FPU        ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7  10
MMX        MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7  8
XMM        XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM16, XMM17, XMM18, XMM19, XMM20, XMM21, XMM22, XMM23, XMM24, XMM25, XMM26, XMM27, XMM28, XMM29, XMM30, XMM31  16
AVX        YMM0, YMM1, YMM2, YMM3, YMM4, YMM5, YMM6, YMM7, YMM8, YMM9, YMM10, YMM11, YMM12, YMM13, YMM14, YMM15, YMM16, YMM17, YMM18, YMM19, YMM20, YMM21, YMM22, YMM23, YMM24, YMM25, YMM26, YMM27, YMM28, YMM29, YMM30, YMM31  32
AVX-512    ZMM0, ZMM1, ZMM2, ZMM3, ZMM4, ZMM5, ZMM6, ZMM7, ZMM8, ZMM9, ZMM10, ZMM11, ZMM12, ZMM13, ZMM14, ZMM15, ZMM16, ZMM17, ZMM18, ZMM19, ZMM20, ZMM21, ZMM22, ZMM23, ZMM24, ZMM25, ZMM26, ZMM27, ZMM28, ZMM29, ZMM30, ZMM31  64
Mask       K0, K1, K2, K3, K4, K5, K6, K7  8
Bound      BND0, BND1, BND2, BND3  16
Control    CR0, CR2, CR3, CR4, CR8  4
Debug      DR0, DR1, DR2, DR3, DR6, DR7  4
Test       TR3, TR4, TR5  4

Register names are case insensitive. General Purpose Registers (GPR) are aliased, for instance AL is another name for the lower half of AX, which is the lower half of EAX, which is the lower half of RAX.

Similarly, AVX registers are aliased as well: XMM0 is another name for the lower half of YMM0, which is the lower half of ZMM0.

Eight-bit registers DIB, SIB, BPB, SPB, R8B..R15B are aliases for the least significant byte of RDI, RSI, RBP, RSP, R8..R15. They may also be referred to as DIL, SIL, BPL, SPL, R8L..R15L, as used in the Intel manual. €ASM supports both suffixes L and B. These registers are available in 64bit mode only.

Some other assemblers and Intel manuals use the notation ST(0), ST(1)..ST(7) for Floating-Point Unit register names, but this syntax is not accepted in €ASM. Nor can the ST0 register be aliased as ST (top of the FPU stack).

The x86 processor contains some other registers which hold flags, descriptor tables, FPU control and status, but they are not listed in the table because they are not directly accessible by their name.

↑ Condition codes

General condition codes ↓

SSE condition codes ↓

Result of some CPU operations is treated as a predicate with mnemonic shortcut that can be used as a part of instruction name.

↑ General condition codes

Some combinations of CPU flags ZF, CF, OF, SF, PF are given special names, so called condition codes. They are used in mnemonic of conditional branching using the jump instructions or in bit-manipulation general-purpose instructions.

Inverted code can be used in macroinstructions to bypass region of code when the condition is not met. See the automatic %variable inverted condition code.

General condition codes table
Num. value  Mnemonic code  Alias  Description             Condition        Inverted mnem. code
0x4         E              Z      Equal                   ZF=1             NE
0x5         NE             NZ     Not Equal               ZF=0             E
0x4         Z              E      Zero                    ZF=1             NZ
0x5         NZ             NE     Not Zero                ZF=0             Z
0x2         C              B      Carry                   CF=1             NC
0x3         NC             NB     Not Carry               CF=0             C
0x2         B              C      Borrow                  CF=1             NB
0x3         NB             NC     Not Borrow              CF=0             B
0x0         O                     Overflow                OF=1             NO
0x1         NO                    Not Overflow            OF=0             O
0x8         S                     Sign                    SF=1             NS
0x9         NS                    Not Sign                SF=0             S
0xA         P              PE     Parity                  PF=1             NP
0xB         NP             PO     Not Parity              PF=0             P
0xA         PE             P      Parity Even             PF=1             PO
0xB         PO             NP     Parity Odd              PF=0             PE
0x7         A              NBE    Above                   CF=0 && ZF=0     NA
0x6         NA             BE     Not Above               CF=1 || ZF=1     A
0x3         AE             NB     Above or Equal          CF=0             NAE
0x2         NAE            B      Not Above nor Equal     CF=1             AE
0x2         B              NAE    Below                   CF=1             NB
0x3         NB             AE     Not Below               CF=0             B
0x6         BE             NA     Below or Equal          CF=1 || ZF=1     NBE
0x7         NBE            A      Not Below nor Equal     CF=0 && ZF=0     BE
0xF         G              NLE    Greater                 SF=OF && ZF=0    NG
0xE         NG             LE     Not Greater             SF<>OF || ZF=1   G
0xD         GE             NL     Greater or Equal        SF=OF            NGE
0xC         NGE            L      Not Greater nor Equal   SF<>OF           GE
0xC         L              NGE    Less                    SF<>OF           NL
0xD         NL             GE     Not Less                SF=OF            L
0xE         LE             NG     Less or Equal           SF<>OF || ZF=1   NLE
0xF         NLE            G      Not Less nor Equal      SF=OF && ZF=0    LE
            CXZ                   CX register is Zero     CX=0
            ECXZ                  ECX register is Zero    ECX=0
            RCXZ                  RCX register is Zero    RCX=0
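A few rows of the table can be written down as predicates over CPU flags to check that each inverted code really is the logical negation of its counterpart, and that the numeric values of such a pair differ only in the lowest bit. This is a sketch in Python; the dictionary layout and function names are assumptions chosen for illustration, and only a subset of mnemonics is included.

```python
import itertools

# mnemonic: (numeric value, predicate over flags, inverted mnemonic)
CONDITIONS = {
    "E":  (0x4, lambda f: f["ZF"] == 1, "NE"),
    "NE": (0x5, lambda f: f["ZF"] == 0, "E"),
    "B":  (0x2, lambda f: f["CF"] == 1, "NB"),
    "NB": (0x3, lambda f: f["CF"] == 0, "B"),
    "A":  (0x7, lambda f: f["CF"] == 0 and f["ZF"] == 0, "NA"),
    "NA": (0x6, lambda f: f["CF"] == 1 or f["ZF"] == 1, "A"),
    "L":  (0xC, lambda f: f["SF"] != f["OF"], "NL"),
    "NL": (0xD, lambda f: f["SF"] == f["OF"], "L"),
    "G":  (0xF, lambda f: f["SF"] == f["OF"] and f["ZF"] == 0, "NG"),
    "NG": (0xE, lambda f: f["SF"] != f["OF"] or f["ZF"] == 1, "G"),
}

def holds(mnemonic: str, flags: dict) -> bool:
    """Evaluate the condition for the given flag values."""
    return CONDITIONS[mnemonic][1](flags)

def table_is_consistent() -> bool:
    """Check every code against its inverted code for all flag combinations."""
    for zf, cf, of, sf in itertools.product((0, 1), repeat=4):
        flags = {"ZF": zf, "CF": cf, "OF": of, "SF": sf}
        for mnemonic, (value, _, inverted) in CONDITIONS.items():
            if holds(mnemonic, flags) == holds(inverted, flags):
                return False
            if CONDITIONS[inverted][0] != value ^ 1:
                return False
    return True

print(table_is_consistent())
```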

↑ SSE condition codes

Streaming Single Instruction Multiple Data Extension instructions (V)CMPccSS, (V)CMPccSD, (V)CMPccPS, (V)CMPccPD use a different set of condition codes cc.

Only aliased mnemonic code is documented for legacy instructions CMPccSS,CMPccSD,CMPccPS,CMPccPD.
SSE condition codes table
Num. value  Mnemonic code  Alias  Description
0x00        EQ_OQ          EQ     Equal, Ordered, Quiet
0x01        LT_OS          LT     Less Than, Ordered, Signaling
0x02        LE_OS          LE     Less than or Equal, Ordered, Signaling
0x03        UNORD_Q        UNORD  Unordered, Quiet
0x04        NEQ_UQ         NEQ    Not Equal, Unordered, Quiet
0x05        NLT_US         NLT    Not Less Than, Unordered, Signaling
0x06        NLE_US         NLE    Not Less than or Equal, Unordered, Signaling
0x07        ORD_Q          ORD    Ordered, Quiet
0x08        EQ_UQ                 Equal, Unordered, Quiet
0x09        NGE_US         NGE    Not Greater than or Equal, Unordered, Signaling
0x0A        NGT_US         NGT    Not Greater Than, Unordered, Signaling
0x0B        FALSE_OQ       FALSE  False, Ordered, Quiet
0x0C        NEQ_OQ                Not Equal, Ordered, Quiet
0x0D        GE_OS          GE     Greater than or Equal, Ordered, Signaling
0x0E        GT_OS          GT     Greater Than, Ordered, Signaling
0x0F        TRUE_UQ        TRUE   True, Unordered, Quiet
0x10        EQ_OS                 Equal, Ordered, Signaling
0x11        LT_OQ                 Less Than, Ordered, Quiet
0x12        LE_OQ                 Less than or Equal, Ordered, Quiet
0x13        UNORD_S               Unordered, Signaling
0x14        NEQ_US                Not Equal, Unordered, Signaling
0x15        NLT_UQ                Not Less Than, Unordered, Quiet
0x16        NLE_UQ                Not Less than or Equal, Unordered, Quiet
0x17        ORD_S                 Ordered, Signaling
0x18        EQ_US                 Equal, Unordered, Signaling
0x19        NGE_UQ                Not Greater than or Equal, Unordered, Quiet
0x1A        NGT_UQ                Not Greater Than, Unordered, Quiet
0x1B        FALSE_OS              False, Ordered, Signaling
0x1C        NEQ_OS                Not Equal, Ordered, Signaling
0x1D        GE_OQ                 Greater than or Equal, Ordered, Quiet
0x1E        GT_OQ                 Greater Than, Ordered, Quiet
0x1F        TRUE_US               True, Unordered, Signaling
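The words Ordered and Unordered in the descriptions tell what the predicate returns when at least one operand is NaN. The first eight predicates (those with aliases EQ..ORD) can be modelled on scalar values as follows. This is only an illustrative Python sketch with hypothetical names; it ignores the Quiet/Signaling distinction, which concerns exception reporting rather than the result.

```python
import math

def unordered(a: float, b: float) -> bool:
    """Operands are unordered when at least one of them is NaN."""
    return math.isnan(a) or math.isnan(b)

# Predicates 0x00..0x07 indexed by their numeric value.
PREDICATES = [
    lambda a, b: a == b,                 # 0x00 EQ    (ordered)
    lambda a, b: a < b,                  # 0x01 LT    (ordered)
    lambda a, b: a <= b,                 # 0x02 LE    (ordered)
    lambda a, b: unordered(a, b),        # 0x03 UNORD
    lambda a, b: not (a == b),           # 0x04 NEQ   (true when unordered)
    lambda a, b: not (a < b),            # 0x05 NLT   (true when unordered)
    lambda a, b: not (a <= b),           # 0x06 NLE   (true when unordered)
    lambda a, b: not unordered(a, b),    # 0x07 ORD
]

def compare(code: int, a: float, b: float) -> bool:
    return PREDICATES[code](a, b)

nan = float("nan")
print(compare(0x00, nan, 1.0), compare(0x04, nan, 1.0))
```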

↑ Operators

Operator is an order to compute at assembly-time.

Combination of punctuation characters is used in €ASM to prescribe various operations with numbers, addresses, strings and registers in the assembly process. Placing a binary operator between two numbers tells €ASM to replace these three elements with the result of operation. Some operators are unary, they modify the value of operand which they stand in front of.

All operations implemented in €ASM are presented in the following table.

Operation table
Operation               Priority  Properties               Left operand  Operator  Right operand  Result              II (6)
Membership              16        binary noncomm. (1)      identifier    .         identifier     identifier
Attribute               15        unary noncomm. (3)                     attr#     element        number or address
Case-insens. Equal      14        binary commutative (2)   string        ==        string         boolean             CMPS
Case-sens. Equal        14        binary commutative       string        ===       string         boolean             CMPS
Case-insens. Nonequal   14        binary commutative (2)   string        !==       string         boolean             CMPS
Case-sens. Nonequal     14        binary commutative       string        !===      string         boolean             CMPS
Plus                    13        unary (3)                              +         number         numeric             NOP
Minus                   13        unary (3)                              -         number         numeric             NEG
Shift Logical Left      12        binary noncommutative    number        <<        number         numeric             SHL
Shift Arithmetic Left   12        binary noncommutative    number        #<<       number         numeric             SAL
Shift Logical Right     12        binary noncommutative    number        >>        number         numeric             SHR
Shift Arithmetic Right  12        binary noncommutative    number        #>>       number         numeric             SAR
Signed Division         11        binary noncommutative    number        #/        number         numeric             IDIV
Division                11        binary noncommutative    number        /         number         numeric             DIV
Signed Modulo           11        binary noncommutative    number        #\        number         numeric             IDIV
Modulo                  11        binary noncommutative    number        \         number         numeric             DIV
Signed Multiplication   11        binary commutative       number        #*        number         numeric             IMUL
Multiplication          11        binary commutative       number        *         number         numeric             MUL
Scaling                 10        binary commutative (5)   number        *         register       address expression
Addition                9         binary commutative       number        +         number         numeric             ADD
Subtraction             9         binary noncommutative    number        -         number         numeric             SUB
Indexing                9         binary commutative (5)   number        +         register       address expression
Bitwise NOT             8         unary (3)                              ~         number         numeric             NOT
Bitwise AND             7         binary commutative       number        &         number         numeric             AND
Bitwise OR              6         binary commutative       number        |         number         numeric             OR
Bitwise XOR             6         binary commutative       number        ^         number         numeric             XOR
Above                   5         binary noncommutative    number        >         number         boolean             JA
Greater                 5         binary noncommutative    number        #>        number         boolean             JG
Below                   5         binary noncommutative    number        <         number         boolean             JB
Lower                   5         binary noncommutative    number        #<        number         boolean             JL
Above or Equal          5         binary noncommutative    number        >=        number         boolean             JAE
Greater or Equal        5         binary noncommutative    number        #>=       number         boolean             JGE
Below or Equal          5         binary noncommutative    number        <=        number         boolean             JBE
Lower or Equal          5         binary noncommutative    number        #<=       number         boolean             JLE
Numeric Equal           5         binary commutative       number        =         number         boolean             JE
Numeric Nonequal        5         binary commutative (4)   number        != or <>  number         boolean             JNE
Logical NOT             4         unary (3)                              !         number         boolean             NOT
Logical AND             3         binary commutative       number        &&        number         boolean             AND
Logical OR              2         binary commutative       number        ||        number         boolean             OR
Logical XOR             2         binary commutative       number        ^^        number         boolean             XOR
Segment separation      1         binary noncommutative    number        :         number         address expression
Data duplication        0         binary noncomm. (1) (5)  number        *         datatype       data expression
Range                   0         binary noncomm. (1)      number        ..        number         range
Substring               0         binary noncomm. (1)      text          [ ]       range          text
Sublist                 0         binary noncomm. (1)      text          { }       range          text

(1) Special operations Membership, Duplication, Range, Substring, Sublist are solved at parser level rather than by the €ASM expression evaluator. They are listed here only for completeness.

(2) Case insensitive string-compare operations ignore the case of letters A..Z but not the case of accented national letters above ASCII 127.

(3) Unary operator applies to the following operand. Binary operators work with two operands. Attribute operator applies to the following element or expression in parenthesis/brackets.

(4) Numeric Nonequal operation has two aliased operators != and <>. You can choose whichever you like.

(5) Operations Multiplication, Scaling and Duplication share the same operator *. Similarly Addition and Indexing share operator +. The actual operation is determined by the type of operands.

(6) Column II illustrates which equivalent machine instruction is used internally to compute the operation at assembly-time.

The commutative property specifies whether both operands of a binary operation can be exchanged without affecting the result.

Priority column specifies the order of processing operators. Operations with higher priority are computed sooner, but this can be changed with priority parentheses ( ). Operations with equal priority are computed in their notation order (from left to right).

Operations which compute with signed integers have the operator prefixed with #. Operations Addition and Subtraction do not need a special "#signed" version because they compute with signed and unsigned integer numbers in the same way.

Both numeric and boolean operations return a 64bit number. In case of boolean operations the result number has one of two possible values: 0 (FALSE) or -1 = 0xFFFF_FFFF_FFFF_FFFF (TRUE), so the result can be used in subsequent bitwise operations with all bits. For example the result of the expression
('+' & (%1 #>= 0)) | ('-' & (%1 #< 0))
is the minus sign (45) if %1 is negative and the plus sign (43) otherwise. According to the priorities in the operation table, bitwise AND (7) and OR (6) are computed sooner than numeric compare (5), so the comparisons are enclosed in parentheses here.
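Selecting a value by masking it with an all-ones or all-zeros boolean result can be tried out in Python, with the comparisons parenthesized explicitly. The function names below are hypothetical; `compare_result` imitates the -1/0 convention described above, since Python's own `True` is 1 rather than -1.

```python
def compare_result(flag: bool) -> int:
    """Boolean result in the -1 (all bits set) / 0 convention."""
    return -1 if flag else 0

def sign_character(number: int) -> int:
    """Model of ('+' & (n #>= 0)) | ('-' & (n #< 0))."""
    plus = ord("+") & compare_result(number >= 0)
    minus = ord("-") & compare_result(number < 0)
    return plus | minus

print(sign_character(-5), sign_character(7))
```

With a TRUE value of 1 instead of -1 the masking would keep only the lowest bit of the character code, which is why the all-bits-set convention is convenient.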

Spaces which separate operands and operators in expression examples serve only for better readability and they are not required by €ASM syntax.

Rich set of operators allows €ASM to get rid of cloned pseudoinstructions such as IFE, IFB, IFIDN, IFIDNI, IFDIF, ERRIDNI, ERRNB...

The Shift operators family is given higher priority than in other languages because I treat shifts as a special kind of multiplication/division.
NASM evaluates the expression 4+3<<2 as (4+3)<<2 = 28, but in €ASM it is evaluated as 4+(3<<2) = 16.


↑ Expressions

Numeric and logical expressions ↓

Address expressions ↓

Register expressions ↓

Data expressions ↓

Special expressions ↓

Expression is a combination of operands, operators and priority parentheses ( ) which follows the rules in the table below.

Syntax of expression
What may follow          left parenthesis  unary operator  operand  binary operator  right parenthesis  end of expression
beginning of expression  yes               yes             yes      no               no                 yes (2)
left parenthesis         yes               yes             yes      no               yes (2)            no
unary operator           yes               no              yes      no               no                 no
operand                  no                no              no       yes              yes                yes
binary operator          yes               yes (1)         yes      no               no                 no
right parenthesis        no                no              no       yes              yes                yes

(1) Unary operator is permitted after binary operation, e.g. 5*-3 evaluates as 5*(-3).

(2) Empty expression, empty parentheses and redundant parentheses are valid.

The table shows which combinations are permitted. It should be read by rows, for instance the first line stipulates that expression may begin with left parenthesis, unary operator or an operand.

Expression is parsed into elementary unary and binary operations, which are calculated according to their priority. Operations with the same priority are computed from left to right. Priority can be increased using parentheses ( ).
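The syntax table can be read as a transition table between token classes. The following Python sketch checks a sequence of already classified tokens against it; the class names (`lpar`, `unary`, `operand`, `binary`, `rpar`) and the function name are assumptions, and the check for balanced parentheses is an addition which the table itself does not express.

```python
# For each token class: the set of classes which may follow it.
MAY_FOLLOW = {
    "begin":   {"lpar", "unary", "operand", "end"},
    "lpar":    {"lpar", "unary", "operand", "rpar"},
    "unary":   {"lpar", "operand"},
    "operand": {"binary", "rpar", "end"},
    "binary":  {"lpar", "unary", "operand"},
    "rpar":    {"binary", "rpar", "end"},
}

def is_well_formed(tokens) -> bool:
    """Return True if the token classes follow one another as the table permits."""
    previous = "begin"
    depth = 0
    for token in list(tokens) + ["end"]:
        if token not in MAY_FOLLOW[previous]:
            return False
        if token == "lpar":
            depth += 1
        elif token == "rpar":
            depth -= 1
            if depth < 0:
                return False      # closing parenthesis without an opening one
        previous = token
    return depth == 0

# 5*-3 is operand, binary, unary, operand.
print(is_well_formed(["operand", "binary", "unary", "operand"]))
print(is_well_formed(["unary", "unary", "operand"]))
```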

↑ Numeric and logical expressions

String compare ↓
Numeric compare ↓
Numeric arithmetic ↓
Shift ↓
Bitwise arithmetic ↓
Boolean algebra ↓
Numeric operations are calculated internally with 64-bit integers, no matter if the target program is intended to run in 64bit mode or not.

Result of a numeric or logical expression is a scalar 64-bit numeric value (signed integer). It may be treated as a number or as a logical value. Zero result is treated as boolean false and any nonzero result is boolean true. Pure logical expressions, such as logical NOT, AND, OR, XOR and all compare operations, return 0 when false and 0xFFFF_FFFF_FFFF_FFFF = -1 when true. This makes it possible to use the result of a logical expression in subsequent bitwise operations with all bits.

↑ String compare

String compare expressions return boolean value. Case insensitive versions convert both strings to the same case before actual comparing; however this concerns ASCII letters A..Z only. National letters with accents in any codepage are always compared case sensitively.

String compare is given the highest priority since no other assembly-time operation can be performed with strings beside the test of equality. At assembly time €ASM cannot tell which string is "bigger".

"EAX" ==  "eax" ; true, they are equal.
"EAX" === "eax" ; false, they differ in character case.
"I'm OK."  ===  'I''m OK.' ; true, their net contents are equal.
"Müller" == "MÜLLER" ; false because of different case of umlauted U's.
"012" == "12" ; false, strings are not equal.
"123" = 123 ; false; left operand is treated as a character constant with value 0x333231=3355185.
"123" == 123 ; syntax error; right operand is not a string.

↑ Numeric compare

Numeric compare operation uses single equal sign = and it can compare two plain numbers or two addresses within the same segment. Numeric compare can be used to test which side of operation is bigger. Terms above/below are used when comparing unsigned numbers or addresses and greater/lower are used for comparing signed numbers. Operators which treat numbers as signed are prefixed with # modifier. Addresses are always unsigned, therefore we cannot ask whether they are greater or lower.
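The difference between above/below and greater/lower is only in how the same 64 bits are interpreted. A minimal Python sketch (the helper names are assumptions, not €ASM identifiers):

```python
MASK64 = (1 << 64) - 1

def as_unsigned(number: int) -> int:
    """Interpret the low 64 bits as an unsigned number."""
    return number & MASK64

def as_signed(number: int) -> int:
    """Interpret the low 64 bits as a two's complement signed number."""
    number &= MASK64
    return number - (1 << 64) if number >> 63 else number

def below(a: int, b: int) -> bool:      # operator <
    return as_unsigned(a) < as_unsigned(b)

def lower(a: int, b: int) -> bool:      # operator #<
    return as_signed(a) < as_signed(b)

print(hex(as_unsigned(-7)))   # bit pattern of -7
print(below(5, -7))           # unsigned comparison
print(lower(5, -7))           # signed comparison
```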

5 < 7          ; true, 5 is below 7. 
5 #< 7         ; true, 5 is lower than 7.
5 #< -7        ; false, 5 is not lower than -7.
5 < -7         ; true, 5=0x0000_0000_0000_0005 is below -7=0xFFFF_FFFF_FFFF_FFF9.
123 = 0123     ; true, both numbers are equal.
"123" == "0123" ; false, both strings are different.
"123" = "0123" ; false, both sides are treated as character constants with different values.
"123" = "000000123" ; syntax error; "000000123" is not a number.
↑ Numeric arithmetic

Common arithmetic operations are Addition, Subtraction, Multiplication, Division and Modulo (remainder after division). Unary minus may be applied to a scalar numeric operand only. Unary plus does not change the value of the operand; it is included in the operator set only for completeness. A binary operator immediately followed by a unary numeric operator is accepted by €ASM, however weird this may seem. This is useful in evaluation of expressions with a substituted value, such as 5 + %1 where the symbolic argument %1 happens to be negative, e.g. -2. This expression is calculated as 5 + %1 -> 5 + -2 -> 5 + (-2) -> 3.

The greatest permitted value of a number in €ASM source is 0xFFFF_FFFF_FFFF_FFFF -> 18_446_744_073_709_551_615 as unsigned or 0x7FFF_FFFF_FFFF_FFFF -> 9_223_372_036_854_775_807 as signed. Overflow at assembly time is ignored in Addition, Subtraction and Shift Logical operations. Assembly error is reported when overflow occurs during Multiplication and Shift Arithmetic Left operation, or when division-by-zero happens during Division or Modulo operation. This maximum must not be exceeded even in intermediate results during the evaluation, such as 0x7FFF_FFFF_FFFF_FFFF * 2 / 2 (€ASM reports error). However, rearranged code 0x7FFF_FFFF_FFFF_FFFF * (2 / 2) assembles well.

Examples of numeric expressions evaluation:

 2 + 3 * 4 -> 2 + (3 * 4) -> 14
0xFFFF_FFFF_FFFF_FFF9 + 0x0000_0000_0000_0009 -> 0x0000_0000_0000_0002 -> 2 (no overflow is reported)
-7 + 9 -> 0xFFFF_FFFF_FFFF_FFF9 + 0x0000_0000_0000_0009 -> 0x0000_0000_0000_0002 -> 2
0xFFF9 + 0x0009 -> 0x0000_0000_0000_FFF9 + 0x0000_0000_0000_0009 -> 0x0000_0000_0001_0002 -> 65538
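Ignoring overflow in Addition and Subtraction means that the result is simply truncated to 64 bits. A short Python model of this behaviour (function names are hypothetical):

```python
MASK64 = (1 << 64) - 1

def add64(a: int, b: int) -> int:
    """Addition with the carry out of bit 63 silently discarded."""
    return (a + b) & MASK64

def sub64(a: int, b: int) -> int:
    """Subtraction with the borrow silently discarded."""
    return (a - b) & MASK64

print(add64(0xFFFF_FFFF_FFFF_FFF9, 0x9))   # same bits as -7 + 9
print(add64(0xFFF9, 0x0009))               # no wraparound at 16 bits
```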

€ASM implements truncated integer division and modulo at assembly-time, same as the machine instruction IDIV. Before signed division both dividend and divisor are converted to positive numbers. Then, having been divided as unsigned, the quotient is converted to negative if one of the operands (but not both) was negative.
Remainder in signed modulo operation is converted to negative only when the dividend was negative.

; Signed division:
+14 #/ +4 -> +(0x0000_0000_0000_000E / 0x0000_0000_0000_0004) -> +3
-14 #/ +4 -> -(0x0000_0000_0000_000E / 0x0000_0000_0000_0004) -> -3
+14 #/ -4 -> -(0x0000_0000_0000_000E / 0x0000_0000_0000_0004) -> -3
-14 #/ -4 -> +(0x0000_0000_0000_000E / 0x0000_0000_0000_0004) -> +3
; Unsigned division:
+14  / +4 ->   0x0000_0000_0000_000E / 0x0000_0000_0000_0004  -> 0x0000_0000_0000_0003
-14  / +4 ->   0xFFFF_FFFF_FFFF_FFF2 / 0x0000_0000_0000_0004  -> 0x3FFF_FFFF_FFFF_FFFC
+14  / -4 ->   0x0000_0000_0000_000E / 0xFFFF_FFFF_FFFF_FFFC  -> 0x0000_0000_0000_0000
-14  / -4 ->   0xFFFF_FFFF_FFFF_FFF2 / 0xFFFF_FFFF_FFFF_FFFC  -> 0x0000_0000_0000_0000
; Signed modulo:
+14 #\ +4 -> +(0x0000_0000_0000_000E \ 0x0000_0000_0000_0004) -> +2
-14 #\ +4 -> -(0x0000_0000_0000_000E \ 0x0000_0000_0000_0004) -> -2
+14 #\ -4 -> +(0x0000_0000_0000_000E \ 0x0000_0000_0000_0004) -> +2
-14 #\ -4 -> -(0x0000_0000_0000_000E \ 0x0000_0000_0000_0004) -> -2
; Unsigned modulo:
+14  \ +4 ->   0x0000_0000_0000_000E \ 0x0000_0000_0000_0004  -> 0x0000_0000_0000_0002
-14  \ +4 ->   0xFFFF_FFFF_FFFF_FFF2 \ 0x0000_0000_0000_0004  -> 0x0000_0000_0000_0002
+14  \ -4 ->   0x0000_0000_0000_000E \ 0xFFFF_FFFF_FFFF_FFFC  -> 0x0000_0000_0000_000E
-14  \ -4 ->   0xFFFF_FFFF_FFFF_FFF2 \ 0xFFFF_FFFF_FFFF_FFFC  -> 0xFFFF_FFFF_FFFF_FFF2
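The rules above can be written down directly. The following Python sketch follows the described procedure (divide absolute values, then fix the sign); the function names are assumptions. Note that Python's own `//` and `%` round toward negative infinity for negative operands, which is why they are applied to absolute values here.

```python
MASK64 = (1 << 64) - 1

def as_signed(number: int) -> int:
    number &= MASK64
    return number - (1 << 64) if number >> 63 else number

def unsigned_div(a: int, b: int) -> int:      # operator /
    return (a & MASK64) // (b & MASK64)

def unsigned_mod(a: int, b: int) -> int:      # operator \
    return (a & MASK64) % (b & MASK64)

def signed_div(a: int, b: int) -> int:        # operator #/
    a, b = as_signed(a), as_signed(b)
    quotient = abs(a) // abs(b)
    # Negative when exactly one of the operands was negative.
    return -quotient if (a < 0) != (b < 0) else quotient

def signed_mod(a: int, b: int) -> int:        # operator #\
    a, b = as_signed(a), as_signed(b)
    remainder = abs(a) % abs(b)
    # The remainder takes the sign of the dividend.
    return -remainder if a < 0 else remainder

print(signed_div(-14, 4), signed_mod(-14, 4))
print(hex(unsigned_div(-14, 4)), hex(unsigned_mod(-14, -4)))
```

Division by zero raises `ZeroDivisionError` in this model, which stands in for the assembly error.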
↑ Shift

Shift operations are not commutative. Operand on the left side is treated as a 64-bit integer and shifted to the left/right by the number of bits specified with operand on the right side.

Bits which enter the least significant bit (LSB) during Shift Left operation are always 0. Bits which enter the most significant bit (MSB) during Shift Right operation are either 0 (Shift Logical Right) or they copy their previous value (Shift Arithmetic Right), thus preserving the sign of operand.

Bits which leave LSB during Shift Right are discarded. Bits which leave MSB during Shift Left are discarded, too, but overflow error E6311 is reported by €ASM when the sign of result (kept in MSB) has changed during Shift Arithmetic Left. Overflow sensitivity is the only difference between Shift Arithmetic Left and Shift Logical Left.

The right operand may be an arbitrary number; however when it is greater than 64, the result is 0 with one exception: a negative number shifted arithmetic right by more than 64 bits results in 0xFFFF_FFFF_FFFF_FFFF -> -1.

Shift by 0 bits does nothing. Shift by negative number just reverses the direction of actual shift from left to right and vice versa.

Assembly-time rotate operations are not supported.

1 << 16 -> 65536
-3 #<< 2 -> -12
0x1122_3344_5566_7788  << 4 -> 0x1223_3445_5667_7880
0xFFEE_DDCC_BBAA_9988  >> 4 -> 0x0FFE_EDDC_CBBA_A998
0xFFEE_DDCC_BBAA_9988 #>> 4 -> 0xFFFE_EDDC_CBBA_A998
0x8000_0000_0000_0000 << 1  -> 0x0000_0000_0000_0000
0x8000_0000_0000_0000 #<< 1 ; overflow error E6311

Shifts are given higher priority than other numeric operations because they correspond with computing a power of 2 rather than with multiplication or division. For instance 1 << 7 is equivalent to 1 * 2^7.
NASM evaluates the expression 4 + 3 << 2 as (4 + 3) << 2 -> 28, but in €ASM it is evaluated as 4 + (3 << 2) -> 16.
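The shift rules described in this chapter can be modelled in Python as follows. This is a sketch under stated assumptions: the function names are hypothetical, a count of exactly 64 is treated the same as larger counts, and the overflow test of Shift Arithmetic Left only compares the sign bit before and after the shift.

```python
MASK64 = (1 << 64) - 1
SIGN = 1 << 63

def as_signed(number: int) -> int:
    number &= MASK64
    return number - (1 << 64) if number & SIGN else number

def shift_logical_left(value: int, count: int) -> int:       # <<
    if count < 0:
        return shift_logical_right(value, -count)
    return ((value & MASK64) << count) & MASK64 if count < 64 else 0

def shift_logical_right(value: int, count: int) -> int:      # >>
    if count < 0:
        return shift_logical_left(value, -count)
    return (value & MASK64) >> count

def shift_arithmetic_right(value: int, count: int) -> int:   # #>>
    if count < 0:
        return shift_arithmetic_left(value, -count)
    # Python's >> on a negative int copies the sign bit.
    return (as_signed(value) >> count) & MASK64

def shift_arithmetic_left(value: int, count: int) -> int:    # #<<
    if count < 0:
        return shift_arithmetic_right(value, -count)
    result = shift_logical_left(value, count)
    if (result ^ (value & MASK64)) & SIGN:
        raise OverflowError("E6311: sign changed during Shift Arithmetic Left")
    return result

print(hex(shift_logical_right(0xFFEE_DDCC_BBAA_9988, 4)))
print(hex(shift_arithmetic_right(0xFFEE_DDCC_BBAA_9988, 4)))
print(as_signed(shift_arithmetic_left(-3, 2)))
```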

↑ Bitwise arithmetic

Bitwise NOT, AND, OR, XOR perform the logical operation on whole operands bit by bit.

~5 -> ~0x0000_0000_0000_0005 -> 0xFFFF_FFFF_FFFF_FFFA -> -6
5 & 12 -> 0101b & 1100b -> 0100b -> 4
5 | 12 -> 0101b | 1100b -> 1101b -> 13
5 ^ 12 -> 0101b ^ 1100b -> 1001b -> 9
↑ Boolean algebra

Logical NOT, AND, OR, XOR operate with numbers as well as with boolean values. Each operand, which is internally stored as 64bit number, is converted to boolean true (0xFFFF_FFFF_FFFF_FFFF -> -1) or false (0x0000_0000_0000_0000) before the actual operation.

3 && 4 -> -1 && -1 -> -1 -> true (both operands are non-zero)
3 & 4 -> 0011b & 0100b -> 0000b -> false (no common bit is set)
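The difference between logical and bitwise operators lies in the conversion of each operand to -1 or 0 before the operation. A small Python model (function names are assumptions made for this sketch):

```python
MASK64 = (1 << 64) - 1

def as_signed(number: int) -> int:
    number &= MASK64
    return number - (1 << 64) if number >> 63 else number

def truth(number: int) -> int:
    """Convert any 64-bit number to boolean true (-1) or false (0)."""
    return -1 if number & MASK64 else 0

def logical_not(a: int) -> int:          # !
    return ~truth(a)

def logical_and(a: int, b: int) -> int:  # &&
    return truth(a) & truth(b)

def logical_or(a: int, b: int) -> int:   # ||
    return truth(a) | truth(b)

def logical_xor(a: int, b: int) -> int:  # ^^
    return truth(a) ^ truth(b)

def bitwise_not(a: int) -> int:          # ~
    return as_signed(~a)

print(logical_and(3, 4), 3 & 4)   # logical vs bitwise AND
print(bitwise_not(5))
```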

↑ Address expressions

Numeric expressions operate with literal numeric values, such as 1, 0x23, '4567' or with symbols representing scalar numeric value, such as NumericSymbolTen EQU 10. Most symbols in real assembler program represent address value, i.e. they point to some memory variable data or to a position in the program code. The set of operations defined with address symbols is very limited in comparison with numeric expressions. They cannot be multiplied, divided, shifted, logically operated. Only two kind of operation are allowed with addresses:

  1. Scalar numeric value may be added to the address symbol or subtracted from it. The result is address symbol again; operation affects the offset part of address; segment part remains intact.
  2. Two symbols may be subtracted from one another (or compared with one another) if they both belong to the same segment. The result is a scalar numeric value corresponding to the difference of their offsets.

Imagine yourself driving a car. You're passing the milestone 123 on a highway when a friend of yours rings you up that he is passing the milestone 97. How far from one another are you? The answer is as easy as subtracting only if you are both driving on the same highway.

The reason for this limitation is addressing in the IA-32 architecture, which calculates the physical memory address from two components: segment and offset. The assembler does not know how segments will be combined together or at what virtual address the program will be loaded at run-time; it only marks all references to relocatable addresses in the object code. The linker is responsible for patching (fixing up) those addresses at link time; however, the mathematical capability of linkers is restricted to adding and subtracting. Linkable file formats lack a specification of more sophisticated arithmetic.
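The two permitted address operations can be illustrated with a small Python model. The class Address and its fields are hypothetical and serve only to show the rules; they do not describe how €ASM stores symbols internally.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Address:
    segment: str   # name of the segment the symbol belongs to
    offset: int    # position inside that segment

    def __add__(self, scalar):
        # Rule 1: address + scalar -> address in the same segment.
        if not isinstance(scalar, int):
            return NotImplemented
        return Address(self.segment, self.offset + scalar)

    def __sub__(self, other):
        # Rule 1: address - scalar -> address in the same segment.
        if isinstance(other, int):
            return Address(self.segment, self.offset - other)
        # Rule 2: address - address -> scalar, same segment only.
        if isinstance(other, Address):
            if other.segment != self.segment:
                raise ValueError("addresses belong to different segments")
            return self.offset - other.offset
        return NotImplemented
```

In terms of the highway parable, Address("[DATA]", 123) - Address("[DATA]", 97) yields the scalar 26, while subtracting addresses from two different segments is rejected. Multiplication is not defined at all, so Python raises TypeError for it.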

↑ Register expressions

Unlike instructions with immediate number embedded in operation code, such as ADD EAX,1234, machine instructions which transfer data to or from memory must have the whole memory operand enclosed in square brackets [ ]. E.g. ADD EAX,[1234], where 1234 is offset of dword variable in data segment where the addend is loaded from.

When the address expression is used in machine instruction, it may be completed with register names; it becomes register address expression. Complete address expression follows the schema
segment: base + scale * index + displacement where
segment is segment register CS, DS, ES, SS, FS, GS,
base is BX, BP in 16-bit addressing mode, or EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI, R8D..R15D in 32-bit addressing mode, or RAX, RBX, RCX, RDX, RBP, RSP, RSI, RDI, R8..R15 in 64-bit addressing mode,
scale is numeric expression which evaluates to scalar number 0, 1, 2, 4, 8,
index is SI, DI in 16-bit addressing mode, or EAX, EBX, ECX, EDX, EBP, ESI, EDI, R8D..R15D in 32-bit addressing mode, or RAX, RBX, RCX, RDX, RBP, RSI, RDI, R8..R15 in 64-bit addressing mode,
displacement is address or numeric expression with magnitude (width) not exceeding the addressing mode.

The order of components in addressing expression is arbitrary. Any portion of register address expression may be omitted.
Scale is not permitted in 16-bit addressing mode and scale cannot be used if index register is not specified.
ESP and RSP cannot be used as index register (they cannot be scaled).
Addressing modes of different sizes cannot be mixed in the same instruction, e.g. [EBX+SI].
16bit addressing mode is not available in 64bit CPU mode.

Registers allowed in addressing modes

16bit addressing mode in 16bit and 32bit segment
  base register:  BX SS:BP
  index register: SI DI
32bit addressing mode in 16bit and 32bit segment
  base register:  EAX EBX ECX EDX ESI EDI SS:EBP SS:ESP
  index register: EAX EBX ECX EDX ESI EDI EBP
32bit addressing mode in 64bit segment
  base register:  EAX EBX ECX EDX ESI EDI SS:EBP SS:ESP R8D..R15D
  index register: EAX EBX ECX EDX ESI EDI EBP R8D..R15D
64bit addressing mode in 64bit segment
  base register:  RAX RBX RCX RDX RSI RDI SS:RBP SS:RSP R8..R15
  index register: RAX RBX RCX RDX RSI RDI RBP R8..R15

When segment register is not explicitly specified, default segment is used for addressing the operand. If BP, SP, EBP, ESP, RBP or RSP is used as base register, the default segment is SS, otherwise it is DS. Nondefault segment register used for data retrieving may be specified either as an explicit instruction prefix SEGCS SEGDS SEGES SEGSS SEGFS SEGGS, or as segment register which becomes part of the register expression. The segment register may be included in expression either with colon : (segment separator) or with plus + (indexing operator):

|0000:268A04 |SEGES MOV AL,[SI]
|0003:268A04 | MOV AL,[ES:SI]
|0006:268A04 | MOV AL,[ES+SI]

In expressions where scaling is not used and therefore it's not obvious which of the two registers is meant as an index, €ASM treats the leftmost register as a base. So in [ESI+EBP] the base is ESI and implicit segment is DS, while in [EBP+ESI] the implicit segment is register SS.

We don't have to bother with implicit segment selection in FLAT Windows programs, because both SS and DS are loaded with the same segment descriptor at load-time.

Although the operators * or + in register address expression look like an ordinary multiplication or addition, they specify very different kind of operation called Scaling or Indexing when applied to a register. The actual multiplication or addition is performed at run-time rather than at assembly time, because the assembler cannot know the contents of registers.

Indexing operation has lower priority than the corresponding Multiplication. Hence, the register expression [EBX + 5 + ESI * 2 * 2] is evaluated as [EBX + 5 + ESI * (2 * 2)] -> [EBX + 5 + ESI * 4].

↑ Data expressions

Data expression specifies static data declared with pseudoinstruction D or with literals. Format of data expression is
duplicator * type value, where duplicator is non-negative integer number, type is primitive data type in full BYTE UNICHAR WORD DWORD QWORD TBYTE OWORD YWORD ZWORD INSTR or short B U W D Q T S O Y Z I notation, or a structure name. Optional value defines the contents of data which is repeated duplicator times.

Duplication is not a commutative operation; the duplicator must be on the left side of duplication operator *. Default duplicator value is 1 (the duplication is not used). Nested duplication is not supported in €ASM. Priority of duplication is very low, so the data expression 2 + 3 * B 4 is evaluated as five bytes where each contains the value 4. Example:

D 3 * BYTE ; declare three bytes with uninitialized contents
D W 0x5 ; declare one word with value 5
D 2 * U "some text" ; declare Unicode string containing "some textsome text"
D 3 * MyStruc ; declare three instances of structured memory variable MyStruc

↑ Special expressions

Membership ↓
Range ↓
Substring ↓
Sublist ↓

The remaining expressions are not calculated with the mathematical expression evaluator; they are treated by the parser in a special way.

↑ Membership

The point . joining two identifiers creates complex identifier. Joined part can be the structure member name or the parent/child namespace. For instance, when a local symbol .bar is declared in a procedure Foo, it is treated by €ASM as symbol with complex name Foo.bar.

↑ Range

Range is defined as two numeric expressions separated with range operator, which is .. (two adjacent fullstops) and it represents set of integer numbers between those values, including the first and the last value.

A range has the property slope, which can be negative, zero or positive. Slope is defined as the sign of the difference between the right and the left value. Examples:

0 .. 15 ; Range represents sixteen numbers from 0 to 15; slope is positive.
-5 .. -4 ; Range represents values -5 and -4; slope is positive.
3 .. 4 - 1 ; Range represents one value 3; slope is zero.
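The slope definition can be written down as a short Python sketch. The function names are hypothetical; the member count simply follows from the statement that both the first and the last value are included.

```python
def slope(left, right):
    """Sign of (right - left): -1 for negative, 0 for zero, 1 for positive slope."""
    difference = right - left
    return (difference > 0) - (difference < 0)

def member_count(left, right):
    """Number of integers between left and right, both ends included."""
    return abs(right - left) + 1
```

For the three ranges above, slope(0, 15) and slope(-5, -4) are both 1, and slope(3, 4 - 1) is 0.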
↑ Substring

Substring is operation which returns only part of input text. Substring operator is a range enclosed in a pair of square brackets []. The text is treated as a sequence of characters and the range specifies which of them are used.

Substring can be applied on two kinds of resources only:
  1. preprocessing %variable at the time of its expansion
  2. filename string in INCLUDE family of pseudoinstructions
%Sample1 %SET ABCDEFGH ; Preprocessing variable %Sample1 now contains 8 characters
 DB "%Sample1[3..5]" ; -> DB "CDE"
↑ Sublist

Sublist operation is similar to Substring with the difference that curly brackets {} are used instead of square brackets and that it treats the input text as a sequence of comma-separated items (in case of %variable expansion) or as a sequence of physical lines (in case of file inclusion).

 INCLUDE "MySource.asm"{1..10} ; include first ten lines of file MySource.asm

Common properties of suboperations Substring and Sublist

Characters and items are 1-based; the first character/item/line has number 1.

Ordinal number of the last character/item/line of input text is assigned by €ASM to special preprocessing variable with the name %&. This %variable is valid only in the suboperation, it cannot be used outside the brackets. In Substring operation it corresponds to the number of characters in the source text or to the size of included file. In Sublist operation it represents the ordinal number of the last non-empty item or the number of physical lines in included file.

|4142432C4445462C2C4748492C4A4B4C |%Sample %SET ABC,DEF,,GHI,JKL
|0000: | ; %& is now 16 in %Sample[%&] and 5 in %Sample{%&}.
|0000:4B4C | DB "%Sample[15..%&]" ; DB "KL"
|0002:4445462C2C4748492C4A4B4C | DB "%Sample{2..%&}" ; DB "DEF,,GHI,JKL"

Suboperated included file must be enclosed in double quotes even when its name doesn't contain spaces. The opening square bracket must immediately follow the input value (%variable name or the quote which terminates the filename). No white spaces are allowed between the %variable and the suboperation left bracket.

Suboperations are very tolerant about the range values. No warning is reported when they refer to nonexisting character or item, for instance when the range member is zero or negative. Ranges with negative slope simply return nothing. Ranges with zero slope return one character/item/line when the index is between 1 and %&, otherwise they return nothing.

|4142434445464748 |%Sample %SET ABCDEFGH ; Variable %Sample now contains 8 characters.
|0000:4142434445 | DB "%Sample[-3..5]" ; DB "ABCDE"
|0005:434445464748 | DB "%Sample[ 3..99]" ; DB "CDEFGH"
|000B:43 | DB "%Sample[ 3..3]" ; DB "C"
|000C: | DB "%Sample[5..3]" ; DB ""
|000C:4142434445464748205B352E2E335D | DB "%Sample [5..3]" ; DB "ABCDEFGH [5..3]" ; Not a suboperation.

Suboperation range consists of three components:

  1. minimum range index
  2. range operator ..
  3. maximum range index

Some of these components can be omitted; they are given the default value in this case. The default minimum index is 1. The default maximum index is %&.

|4142434445464748 |%Sample %SET ABCDEFGH ; Preprocessing variable %Sample now contains 8 characters.
|0000:4142434445 | DB "%Sample[..5]" ; -> DB "%Sample[1..5]" -> DB "ABCDE"
|0005:434445464748 | DB "%Sample[3..]" ; -> DB "%Sample[3..8]" -> DB "CDEFGH"
|000B:4142434445464748 | DB "%Sample[..]" ; -> DB "%Sample[1..8]" -> DB "ABCDEFGH"
|0013:4142434445464748 | DB "%Sample[]" ; -> DB "%Sample[1..8]" -> DB "ABCDEFGH"

All following notations are identical in %variable expansion:

%variable
%variable[1..%&]
%variable[..%&]
%variable[1..]
%variable[..]
%variable[]
%variable{1..%&}
%variable{..%&}
%variable{1..}
%variable{..}
%variable{}

The last notation in previous example is useful in %variable concatenating when we need to append some literal text, for instance 123 to the %variable contents. We cannot write %variable123 because the appended digits change the name of original %variable. The solution is to use empty suboperation, which doesn't change the %variable contents but it separates its name from the successive text: %variable[]123 or %variable{}123.

When the range inside the brackets contains only one index without the range operator, it is treated as both the minimum and maximum value and only one character/item/line is expanded: %Sample1[3] -> %Sample1[3..3] -> C.

Suboperations may be chained. The chain is processed from left to right. Example:

|4142432C4445462C2C4748492C4A4B4C |%Sample %SET ABC,DEF,,GHI,JKL ; %& is now 16 in %Sample[%&] and 5 in %Sample{%&}.
|0000:4A4B | DB "%Sample{4..5}[2..6]{2}" ; DB "JK"

The first sublist in previous example takes items nr.4 and 5, giving the list of two items GHI,JKL. The next substring extracts characters from second to sixth from that sublist, giving HI,JK. The last sublist operation expands the second item, which is JK.
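The common rules of both suboperations (1-based indices, defaults 1 and %&, tolerance of out-of-range values, empty result for negative slope) can be sketched in Python as follows. The functions substring and sublist are hypothetical models, not €ASM internals. The sketch assumes that out-of-range limits are clipped to 1 and to %&, and it splits the text on every comma without any special handling of quotes or brackets.

```python
def substring(text, low=None, high=None):
    """Model of %variable[low..high] applied to text."""
    last = len(text)                      # plays the role of %&
    low = 1 if low is None else low       # default minimum is 1
    high = last if high is None else high # default maximum is %&
    if high < low:                        # negative slope returns nothing
        return ""
    low, high = max(low, 1), min(high, last)
    return text[low - 1:high]

def sublist(text, low=None, high=None):
    """Model of %variable{low..high} applied to comma-separated text."""
    items = text.split(",")
    last = len(items)
    while last > 0 and items[last - 1] == "":
        last -= 1                         # %& is the last non-empty item
    low = 1 if low is None else low
    high = last if high is None else high
    if high < low:
        return ""
    low, high = max(low, 1), min(high, last)
    return ",".join(items[low - 1:high])

# The chained example %Sample{4..5}[2..6]{2}, processed from left to right:
sample = "ABC,DEF,,GHI,JKL"
chained = sublist(substring(sublist(sample, 4, 5), 2, 6), 2, 2)
```

A single index such as {2} is modelled by passing the same number as both limits, so chained ends up holding JK as in the listing above.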

Suboperations may be nested. Inner ranges are calculated before the outer ones:

|31323334353637383930 |%Sample %SET 1234567890
|0000:3233343536 | DB "%Sample[2..%Sample[6]]" ; -> DB "%Sample[2..6]" -> DB "23456"

↑ Sections

For each emitting statement assembler generates some data or machine code which will be dumped to the output file in the end. Fortunately we don't have to write the whole program in the exact sequence which is required by the output file format. Assembled data and code is routed on demand to one of several output sections. The statement, which will switch assembly to a different section, is quite simple: just the name of section in square brackets [ ] in the label field of the statement.

Imagine that you (the programmer) act like a manager dictating some code and data to your secretary (EuroAssembler). You have dictated a few instructions, which were written in shorthand by your secretary on a sheet of paper labeled [TEXT]. Then you decided to dictate other kind of data. The secretary will grab another sheet, label it [DATA] and start to write there. Later, when you want to dictate some other instructions, your secretary takes the sheet labeled [TEXT] again, and continues from the point (origin) where it was interrupted.
You are free to open new sheets and to switch between them ad libitum. When the dictation ends, all used sheets will be stapled together (linked).

In EuroAssembler the term section is used for a named division of segment. Each segment has one or more sections. By default the segment has just one section with identical name (base section) which was created at segment definition.

↑ Segments

Intel Architecture divides memory to segments controlled by segment registers. Segment in €ASM is defined by pseudoinstruction SEGMENT.

In the dawn of the computer age, programmers demanded more memory than the mere 256 bytes or 64 kilobytes which were addressable by 8bit and 16bit registers. Designers at Intel in pre-32bit times might have chosen to join two 16bit general registers, such as DX:AX or SI:BX, and address an inconceivable 4 GB of memory with them, but they didn't. Instead, they invented 16bit segment registers specialized by the purpose of addressed memory: register CS for machine code, DS for data, SS for machine stack, ES for extra temporary usage.
Segment registers are used for addressing 16-byte chunks of memory called paragraphs (alias octonary word, OWORD). Virtual address in real CPU mode is calculated as a sum of the offset and the segment register contents multiplied by 16. Using segment registers for addressing of 16byte paragraphs yields 1 MB of memory addressable by each segment register, which seemed enough for everybody in those times.

Contents of segment register in real processor mode represents paragraph address of the segment.
Contents of segment register in protected processor mode represents index to descriptor table, which holds some auxiliary information about the addressed segment (beside its address and size limit): access privileges and width.

Those auxiliary properties are fixed in real mode:
segment bottom address is specified with segment register contents multiplied by 16,
segment size limit is 64KB+16 in 16bit addressing mode,
access privilege is allow everything,
segment width is 16 bits but using 32bit offsets is also allowed on CPU 386 or newer.
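The real-mode rule that the segment bottom address equals the segment register contents multiplied by 16 can be tried out with a one-line Python function. The name real_mode_address is hypothetical, and the sketch ignores any address wrapping that a particular CPU may apply above 1 MB.

```python
def real_mode_address(segment, offset):
    """Linear address = 16 * segment register contents + 16-bit offset."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)

# Different segment:offset pairs may address the same byte of memory.
same_byte = real_mode_address(0x1234, 0x0010) == real_mode_address(0x1235, 0x0000)
```

Because paragraphs are only 16 bytes apart while offsets span 64 KB, segments overlap heavily, which is why same_byte is true here.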

Segment at run-time is a continuous range of operational memory addressable with the contents of one segment register.

Segment at link-time is a named part of object file, which can be concatenated with segments of the same name from other linkable files.

In [MS_PECOFF] terminology the linkable segment is called section. I think the term segment would be more appropriate here, because COFF "sections" are differentiated by access privileges as they are addressed by different segment registers, ergo by different segment descriptors.
In our segment-highway parable, segments in flat protected mode are highway lanes running in parallel, so they have common milestones (offsets), but each lane is dedicated to a different kind of vehicles.

Segment at write-time is a part of assembler source which begins with section switching statement, and which ends with another switching statement or with the end of program. Segments and section divisions of assembler source do not have to be continuous. In fact, discontinuity is their main raison d'être. It allows keeping data close to the code which manipulates it, which is good for readability and understanding of program function.

↑ Groups

When segments of an assembler program are not too large, they may be coalesced into a segment group. The whole group of segments is addressable with one segment register. Group can be defined with pseudoinstruction GROUP.

When a group is defined, e.g. [DGRP] GROUP [DATA],[STRINGS], beside the group [DGRP] it automatically creates a segment with the same name [DGRP] (and consequently a section with the same name [DGRP]). It also declares that segments [DATA] and [STRINGS] belong to group [DGRP] together with its base segment [DGRP]. Nevertheless, when nothing is emitted to the implicitly defined segment [DGRP], it will be discarded in the end.

↑ Segmentation (more about sections, segments, groups)

Base segment and section ↓

Segmentation lifetime ↓

Implicit segments ↓

Segment naming conventions ↓

Loading segment registers ↓

Ordering of sections and segments ↓

Displaying the segment map ↓

The relation between segment and its sections in EuroAssembler is similar to the relation between group and its segments.

↑ Base section and segment

Whenever a segment is defined (with the pseudoinstruction SEGMENT), a section with the same name is created in it (it is called base section). Other sections of the same segment may be created on demand. This is done by the statement which has only the section name in its label field (there is no explicit SECTION directive in €ASM).

Section properties (class, purpose, combine, align) are inherited from the segment they belong to. The alignment is not inherited when special literal sections [@LT64] .. [@LT1], [@RT0], [@RT1].. are created, they are aligned according to the type of data which they keep.

Whenever a group is defined (with the pseudoinstruction GROUP), a segment with the same name is created in it (it is called base segment), together with other segments which we want to incorporate to the group.

↑ Segmentation lifetime

Each segment has one or more sections. Each section belongs to exactly one segment. During assembly time all segments are assumed to be loaded at virtual address 0. At the end of each assembly pass sections are virtually linked to their segment, so they begin at higher VA, where the preceding section ended. However, in pass 1 it is not known yet what size the sections will have, so all sections are assumed to start at VA=0 in pass 1. When the last assembly pass ends, all sections are linked physically (their emitted contents and relocations are concatenated to the segment=base section) and sections are then discarded. Linker is not aware of sections at all.

Why should we actually split a segment to sections? Well, it is not necessary, mostly we can get by with just one default section per segment. In big programs, on the other hand, it may be useful to group similar kind of data together; we may want to create separate section for double word sized variables, for floating-point numbers, for text strings. This may save a few bytes of alignment stuff, which would be necessary when variables of different sizes are mixed together.

Each group has one or more segments. Each segment belongs to exactly one group (even when it wasn't explicitly grouped, a group with the segment's name will be implicitly created at link time for the addressing purposes). When a program with executable format is linked, all groups are physically concatenated into image. Loader of realmode executable image is not aware of groups and segments.

↑ Implicit segments and groups

€ASM creates implicit segments when it starts to assemble a program. Implicit segment names depend on the chosen program format:

Implicit segments

FORMAT=          Implicit segment names
BIN              [BIN]
COM              [COM]
OMF | MZ         [CODE],[DATA],[BSS],[STACK]
COFF | PE | DLL  [.text],[.data],[.bss]

If you are not satisfied with the implicit segments created by €ASM, you may redefine them at the start of program or create a new set of segments with different names. Segments and sections which were not used (nothing was emitted to them) will not be linked to output file and they can be ignored.

When the assembly ends and segments from linked modules have been incorporated (combined) to the base program, €ASM looks at segments which are not part of any group, and creates implicit group for them (name of the group is the same as the segment). Here the memory model is taken into account:

Models with single code segment (TINY, SMALL, COMPACT) link all code into a single group, no matter how many code segments are actually defined in the program.

Multicode models (MEDIUM, LARGE) keep each code segment in its own implicit group (if they weren't grouped explicitly), hence intersegment jumps, calls and returns should have DIST=FAR.

Similarly, single data models (TINY, SMALL, MEDIUM) assume that all initialized and uninitialized data fits into one group not exceeding 64 KB, so the €ASM linker will assign all data segments into implicit group and register DS does not have to be changed when accessing data from various segments, which may have been defined in the base program or in linked modules.

↑ Segment naming conventions

Name of group, segment and section is always surrounded by square brackets in €ASM source.

Unlike symbols, namespace is not prepended to segment name when it starts with . (fullstop).

Number of characters in group/segment/section name is not limited by €ASM but it may be limited by output format. In OMF object module the name of group or segment must not exceed 255 characters. In PE COFF executables the name in section header is truncated to 8 characters.

€ASM treats all names as case sensitive. If you want to link your segment with object module produced by an external compiler which converts segment name to uppercase or which mangles the names by prepending underscores __, you should adapt your naming convention to it.

Segment name should be unique; you cannot define two segments with the identical name in a program, except for the implicitly created segments, if they were not used yet. However, it is possible to define segments with same names in different programs and link them together; their contents will be concatenated according to their COMBINE= property. Similar rule applies to groups.

Section names cannot be duplicated on principle. When a section name appears in source for the second time, it will only switch to that section rather than creating a new one.

Implicit literal section name begins with @LT or @RT, you'd better avoid names which begin with this combination of letters.

Segments which have dollar sign $ in their name are treated in a special way. If the characters on the left side of $ match, all such segments will be linked adjacently in alphabetic order.

There are conventions how "sections" are named in COFF modules, you may need to adapt to them to successfully link €ASM program with modules created by different compilers.

↑ Loading segment registers

When MZ executable program is prepared to start, its segment registers have been set by the DOS loader. CS:IP is set to program entry point, SS:SP is set to the top of machine stack, but both DS and ES point to PSP, which is not our data segment. Whenever programmer needs to access data in their own segment or to jump to some procedure in a different code segment, concerned segment register must be explicitly loaded with paragraph address of the group. There is no instruction in Intel architecture to load segment register with immediate value directly, so this is usually done via register or stack:

; Loading paragraph address of [DATA] to segment register
; using a general purpose register (which is faster):
MOV AX, PARA# [DATA]
MOV DS,AX   
; or using the machine stack (which is shorter):
PUSH PARA# [DATA]
POP ES

It is the responsibility of programmer to load segment register with the address of another segment, whenever used. €ASM makes no assumption about the contents of segment registers; there is no ASSUME or USING directive in €ASM.

Unlike some other assemblers, €ASM does not automatically create a homonymous symbol when a segment is defined. Use attribute operators PARA#, GROUP#, SEGMENT#, SECTION# instead.

↑ Ordering of sections and segments

Order of sections in segment and order of segments in linked program is generally specified by the order in which they were defined in source code, with a few exceptions. At the end of each assembly pass all sections are linked to their segment in this order:

  1. Base section.
  2. Other non-literal sections in the order as they were defined.
  3. Data-literal sections in descending order of their alignment ([@LT64], [@LT32],..[@LT1]).
  4. Code-literal section in alphabetical order ([@RT0], [@RT1], [@RT2]..).
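The four ordering rules for sections of one segment can be sketched in Python. The function section_order is a hypothetical illustration; it recognizes literal sections only by the name prefixes [@LT and [@RT mentioned above and takes the alignment from the number in the [@LT..] name.

```python
def section_order(base, defined):
    """Return sections of one segment in linking order.

    base    -- name of the base section, e.g. "[DATA]"
    defined -- section names in the order they were defined in source
    """
    def alignment(name):
        return int(name[len("[@LT"):-1])      # "[@LT16]" -> 16

    literal_prefixes = ("[@LT", "[@RT")
    plain = [s for s in defined
             if s != base and not s.startswith(literal_prefixes)]
    data_literals = sorted((s for s in defined if s.startswith("[@LT")),
                           key=alignment, reverse=True)
    code_literals = sorted(s for s in defined if s.startswith("[@RT"))
    return [base] + plain + data_literals + code_literals

order = section_order(
    "[DATA]",
    ["[@RT1]", "[DATA]", "[@LT4]", "[Strings]", "[@LT16]", "[@RT0]"])
```

In this made-up example (the section name [Strings] is just an illustration) the result is [DATA], [Strings], [@LT16], [@LT4], [@RT0], [@RT1].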

Segments are combined and linked at link time, when the final assembly pass ends.
Order of segments in output file:

  1. Group(s) of initialized segments in the order as they were defined.
  2. Initialized segments which are not in any group.
  3. Group(s) of uninitialized segments in the order as they were defined.
  4. Uninitialized segments which are not in any group.

Segments in each group are in the order as they were defined in the source (not as they were declared in the GROUP statement). The base segment is always the first in group.
Segments with $ in their name, which belong to the same group, and the left substring of their names up to the $ is identical, are kept together and sorted alphabetically.

When an executable format is linked, every segment belongs to some group, at least the implicit one.

↑ Displaying the segment map

Pseudoinstruction %DISPLAY Sections prints to the listing file a complete map of groups, segments and sections defined so far at assembly time, one object per line. Segment is indented with two spaces, section is indented with four spaces.

Instead of %DISPLAY Sections we could use %DISPLAY Segment or %DISPLAY Groups, the result is identical. The entire group/segment/section map is always displayed with those statements.

At link time €ASM prints to the listing similar map of groups and segments with finally used virtual addresses, unless it was disabled with option PROGRAM LISTMAP=OFF.

↑ Distance

Distance is property of difference between two addresses. It is not just the numeric difference of two offsets; in €ASM this term represents one of three enumerated values: FAR, NEAR, SHORT.

The distance of two addresses is FAR when they belong to different groups/segments, otherwise it is NEAR or SHORT. Difference of offsets is SHORT if it fits into 8-bit signed integer, i.e. -128..+127.
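The classification can be expressed as a small Python function. The representation of an address as a (group or segment name, offset) pair and the name distance are assumptions made for this sketch; it also takes the plain difference of offsets, without regard to where a real jump instruction measures its displacement from.

```python
def distance(source, target):
    """Classify the distance between two addresses as FAR, NEAR or SHORT.

    Each address is a (group_or_segment_name, offset) pair.
    """
    if source[0] != target[0]:
        return "FAR"                       # different groups/segments
    difference = target[1] - source[1]
    if -128 <= difference <= 127:          # fits into 8-bit signed integer
        return "SHORT"
    return "NEAR"
```

For instance, two addresses in [CODE] whose offsets differ by 127 are SHORT, while a difference of 128 is already NEAR.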

↑ Width

€ASM is a 64-bit assembler; it can also compile programs for older CPUs which worked with 32 and 16 bit words only. The number of bits which CPU works with simultaneously is called width and it is either 16, 32 or 64.

The width is a property of segment. Some 32-bit object file formats can mix segments of both widths in one file. A PROGRAM pseudoinstruction also has WIDTH property which will establish the default for all segments declared in the program.

↑ Namespace

Names of symbols created in the program must be unique. In large projects it might be difficult to maintain unique names, especially when more people work on separate parts of the program. That is why programmer can use local identifiers which must be unique only in a division of source file called namespace. Namespace is a range of the source specified by namespace block. There are four block-pseudoinstructions in €ASM which create the namespace: PROGRAM, PROC, PROC1, STRUC. The name of block is also the name of namespace. An identifier is local when its name begins with a fullstop (.). Unlike standard symbols, the characters following the leading fullstop may start with a decimal digit, and it is not an error when they form a reserved name. Example of valid local identifiers: .L1, .20, .AX.

Names of local identifiers are kept in €ASM internally concatenated with namespace name. This concatenation is called fully qualified name (FQN). Local symbols may be referred with .local name only within their native block; they may also be referred with fully qualified name anywhere in the program.

The namespace actually starts at the operation field of the block statement and it ends at the operation field of the corresponding endblock statement. Thanks to this, the namespace itself (label of the block) may be local, too, and the namespaces may be nested.

MyProg PROGRAM      ; PROGRAM starts a namespace MyProg           ;
                                                                  ;
Main    PROC        ; PROC starts inner namespace Main            ;
  .10:   RET        ; local label; its FQN is Main.10             ;
        ENDP Main   ; after ENDP we are in MyProg namespace again ;
                                                                  ;
.Local  PROC        ; Its FQN is MyProg.Local                     ;
  .10:   RET        ; FQN of this label is MyProg.Local.10        ;
        ENDP .Local ; MyProg.Local namespace ends right after ENDP;
                                                                  ;
       ENDPROGRAM MyProg

Beside the namespace blocks there is one more occasion where namespace is unfolded: operand fields of the structured data definition statement, which temporarily take over the namespace of the structure which is being instantiated.

DateProg PROGRAM      ; PROGRAM starts a namespace DateProg              ;
                                                                         ;
Datum STRUC  ; Declaration of structure Datum creates namespace Datum    ;
.day   DB 0                                                              ;
.month DB 0                                                              ;
.year  DW 0                                                              ;
      ENDSTRUC Datum ; Namespace Datum ends right behind ENDSTRUC field  ;
                                                                         ;
[.data] ; Segment name is not local label, namespace is ignored          ;
Birthday DS Datum, .day=1, .month=1, .year=1970                          ;
                                                                         ;
; The previous statement defines 4 bytes long structured memory variable ;
; called Birthday in section [.data] and statically sets its members.    ;
; On creating the variable "Birthday" €ASM uses properties               ;
; declared as Datum.day, Datum.month, Datum.year (B,B,W).                ;
; Members can be referred as Birthday.day, Birthday.month, Birthday.year.;

↑ Scope

Scope is property of symbol which specifies symbol visibility.

Symbol defined in assembler program, such as label or memory variable, may be referred anywhere withing the program at assembly time. Our program may be linked with other programs, object modules or libraries, which might have misused the same name for their own symbols, but it's OK and no conflict occurs because programs are compiled separately. This is the standard behaviour, such symbols have standard private scope and their visibility is limited to the inside of PROGRAM/ENDPROGRAM block.

When a symbol name begins with fullstop ., visibility of such private local name is even narrower, it is limited to the smallest namespace block in which was the symbol defined (PROC/ENDPROC, STRUC/ENDSTRUC).

On the other hand, executables which are linked from several programs (modules, libraries) need to access symbols outside their standard private scope, for instance to call an entry point of a library function. Names of such global symbols should be unique among all linked programs.

Scope recognized in €ASM
Private | Global, static link | Global, dynamic link
Standard (S), local | Public (P), Extern (E) | eXport (X), Import (I)

Scope of a symbol can be examined at assembly time with the attribute operator SCOPE#, which returns the ASCII value of the uppercase scope shortcut, for instance

MySymbol EXTERN
MOV AL,SCOPE# MySymbol ; equivalent to MOV AL,'E'

Available shortcuts are the letters shown in parentheses in the table above. The same shortcuts are also used when symbol properties are listed by %DISPLAY Symbols and after the link phase if LISTGLOBALS=ENABLED.

GLOBAL, PUBLIC, EXTERN, EXPORT and IMPORT scope of a symbol can be explicitly declared by pseudoinstruction with the corresponding name. GLOBAL scope can be also declared implicitly, using two (or more) terminating colons :: after the symbol name. Symbol declared as GLOBAL is either available as PUBLIC (if it is defined in the same program), or it is marked as EXTERN (if it is not defined in the program).

Only the scopes for static linking (PUBLIC, EXTERN) can be declared by the simplified global scope declaration (using two colons). When the symbol is to be exported (if a DLL file is created), or when it should be dynamically imported from another DLL, using two colons is not enough and either the explicit declaration EXPORT/IMPORT symbol or LINK import_library is required.

Word1:  DW 1   ; Standard private scope.
Word2:: DW 2   ; Public scope declared implicitly (with double colon).
Word3   PUBLIC ; Public scope declared explicitly.
Word4   GLOBAL ; Public or extern scope (which depends on Word4 definition in this program).
Word5   GLOBAL ; Public or extern scope (which depends on Word5 definition in this program).
Word6   EXTERN ; Extern scope. Symbol Word6 must not be defined anywhere else in this program.
Word4:         ; Definition of symbol Word4.
        MOV EAX,Word5 ; Reference of external symbol Word5.
; Scope of Word1 is PRIVATE.
; Scope of Word2, Word3, Word4 is PUBLIC.
; Scope of Word5, Word6 is EXTERN.

↑ Data types

Information in computer memory or in a register represents code or data. An important property of stored text and numbers is the data type, which is a rule specifying how to interpret the information. €ASM recognizes the following types of data:

Fundamental data types
Typename  Short  Size  Autoalign  Width  Typical storage  Character string  Integer number  Floating-point number  Packed vector
BYTE      B      1     1          8      R8               ANSI              8bit            -                      -
UNICHAR   U      2     2          16     R16              WIDE              -               -                      -
WORD      W      2     2          16     R16              -                 16bit           -                      -
DWORD     D      4     4          32     R32,ST           -                 32bit           Single precision       -
QWORD     Q      8     8          64     R64,ST           -                 64bit           Double precision       -
TBYTE     T      10    8          80     ST               -                 -               Extended precision     -
OWORD     O      16    16         128    XMM              -                 -               -                      4×D | 2×Q
YWORD     Y      32    32         256    YMM              -                 -               -                      8×D | 4×Q
ZWORD     Z      64    64         512    ZMM              -                 -               -                      16×D | 8×Q

Other data types
Typename        Short  Size      Autoalign                                          Usage
Structure name  S      variable  STRUC explicit alignment, otherwise program width  structured variables
INSTR           I      variable  1                                                  machine instructions

Fundamental typenames are often abbreviated to their first letter. Data types in short or long notation are used for explicit static data definition with pseudoinstruction D, for implicit data definition in literals, as an alignment specification, and in instruction modifiers.

€ASM has some type awareness, though not as strong as in higher-level programming languages. For instance, when processing the instruction INC [MemoryVariable] it looks at how MemoryVariable was defined and selects the appropriate encoding version (byte/word/dword/qword).
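
The following sketch illustrates the type awareness described above. The symbol names ByteVar and DwordVar and the section names are hypothetical and chosen for illustration only.

[.data]
ByteVar:  DB 0           ; Memory variable of type BYTE.
DwordVar: DD 0           ; Memory variable of type DWORD.
[.text]
    INC [ByteVar]        ; ByteVar was defined with DB, so the byte encoding is selected.
    INC [DwordVar]       ; DwordVar was defined with DD, so the dword encoding is selected.

Neither INC statement needs an explicit size specification, because the data type is taken from the definition of the memory variable.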


↑ Symbols

Name of symbols ↓

Numeric symbols ↓

Address symbols ↓

$ - current origin address ↓

Attributes of symbol ↓

Literal symbols ↓

An assembler symbol represents a number or address in assembler source. It is an alias for the number or address, therefore it should have a self-explanatory mnemonic name.

There are two kinds of symbols in assembler: numeric and address.

Numeric symbol answers the question how many and address symbol answers the question where (at which position in the program).

↑ Name of symbols

Symbol name is an identifier (letter or fullstop optionally followed with other letters, fullstops and digits), which is not a reserved symbol name in either character case.
Symbol name may always be terminated with one or more colons : which helps to identify the term as a symbol name. The colon itself is not a part of symbol name.

Symbol name must be unique in the program.

Symbols and structures may be referred (used in statement) before they are actually defined. However, it's a good practice to define numeric symbols and structures at the beginning of the program, because forward references require an additional program pass in assembly process.

Reserved symbol names
Category | Reserved names
Assembly-time current pointer | $
Segment register names | CS, DS, ES, FS, GS, SS
Prefix names | ATOGGLE, LOCK, OFTEN, OTOGGLE, REP, REPE, REPNE, REPNZ, REPZ, SEGCS, SEGDS, SEGES, SEGFS, SEGGS, SEGSS, SELDOM, XACQUIRE, XRELEASE

Name of symbol may contain fullstop ., which is often treated specially, as a membership operator (when it connects member of structure with its name). Leading . makes the symbol local, as it is in fact connected with current namespace internally.

Creating symbol names which collide with names of registers or instructions is discouraged. If you really want to use one of the not recommended names for a symbol, it must always be followed with a colon, e.g.

  Byte: DB 1 ; Define a symbol named "Byte".
  MOV AX,Byte: ; Load AX with offset of the symbol.

In other cases, terminating the symbol name with : is optional, but recommended.

Terminating each symbol name with : is a good practice both when the symbol is defined and when it is referred to, though many other assemblers do not support this. It is easier to copy and paste the symbol name without having to delete the colon at its end. The colon tells both the assembler and the human reader that the name represents a symbol, and it protects against mistakes when you choose a symbol name which accidentally collides with one of the thousands of instruction mnemonics.
Instruction mnemonics, registers (except for segment registers) and structure names are never colon-terminated.
Not recommended symbol names
Category | Not recommended names
Fundamental data types | B, BYTE, D, DWORD, I, INSTR, O, OWORD, Q, QWORD, S, T, TBYTE, U, UNICHAR, W, WORD, Y, YWORD, Z, ZWORD
Register names | AH, AL, AX, BH, BL, BND0, BND1, BND2, BND3, BP, BPB, BPL, BX, CH, CL, CR0, CR2, CR3, CR4, CR8, CX, DH, DI, DIB, DIL, DL, DR0, DR1, DR2, DR3, DR6, DR7, DX, EAX, EBP, EBX, ECX, EDI, EDX, ESI, ESP, K0, K1, K2, K3, K4, K5, K6, K7, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, R10, R10B, R10D, R10L, R10W, R11, R11B, R11D, R11L, R11W, R12, R12B, R12D, R12L, R12W, R13, R13B, R13D, R13L, R13W, R14, R14B, R14D, R14L, R14W, R15, R15B, R15D, R15L, R15W, R8, R8B, R8D, R8L, R8W, R9, R9B, R9D, R9L, R9W, RAX, RBP, RBX, RCX, RDI, RDX, RSI, RSP, SEGR6, SEGR7, SI, SIB, SIL, SP, SPB, SPL, ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7, TR3, TR4, TR5, XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM16, XMM17, XMM18, XMM19, XMM20, XMM21, XMM22, XMM23, XMM24, XMM25, XMM26, XMM27, XMM28, XMM29, XMM30, XMM31, YMM0, YMM1, YMM2, YMM3, YMM4, YMM5, YMM6, YMM7, YMM8, YMM9, YMM10, YMM11, YMM12, YMM13, YMM14, YMM15, YMM16, YMM17, YMM18, YMM19, YMM20, YMM21, YMM22, YMM23, YMM24, YMM25, YMM26, YMM27, YMM28, YMM29, YMM30, YMM31, ZMM0, ZMM1, ZMM2, ZMM3, ZMM4, ZMM5, ZMM6, ZMM7, ZMM8, ZMM9, ZMM10, ZMM11, ZMM12, ZMM13, ZMM14, ZMM15, ZMM16, ZMM17, ZMM18, ZMM19, ZMM20, ZMM21, ZMM22, ZMM23, ZMM24, ZMM25, ZMM26, ZMM27, ZMM28, ZMM29, ZMM30, ZMM31
Pseudoinstruction names | ALIGN, D, DB, DD, DI, DO, DQ, DS, DU, DW, DY, DZ, ENDHEAD, ENDP, ENDP1, ENDPROC, ENDPROC1, ENDPROGRAM, ENDSTRUC, EQU, EUROASM, EXTERN, GLOBAL, GROUP, HEAD, INCLUDE, INCLUDE1, INCLUDEBIN, INCLUDEHEAD, INCLUDEHEAD1, PROC, PROC1, PROGRAM, PUBLIC, SEGMENT, STRUC
Machine instruction mnemonics | AAA, AAD, ... XTEST, see IiHandlers in €ASM source for the complete list.

↑ Numeric symbols

Numeric symbol is defined with pseudoinstruction EQU (or with its alias =) specifying a number, numeric expression or other numeric symbol. Examples:

BufferSize: EQU 16K
WM_KEYDOWN = 0x0100
Total      EQU 2*BufferSize
   MOV ECX,BufferSize
Using numeric symbol instead of direct number notation has its advantages:

↑ Address symbols

Address symbol is defined when it appears as a label of machine instruction or prefix, as a label of empty instruction or as a label of pseudoinstruction D*, PROC, PROC1.

Examples:
[DATA]
SomeValue:   DD 1
[CODE]
             MOV EAX,SomeValue
StartOfLoop: CALL SomeProcedure
             LOOP StartOfLoop:

While numeric symbol BufferSize was completely defined with its value, in case of address symbol SomeValue it is not sufficient. Instruction MOV EAX,SomeValue loads EAX with the symbol offset, i.e. with the distance between its position and the start of its segment. Address symbol is defined with two properties: its segment and offset. That is why address symbol is sometimes called vector or relative symbol and numeric symbol is called scalar or absolute symbol.

There are four ways to create a symbol:
  1. Symbol is defined when its name occurs in the label field of a statement. Such a symbol represents an address within the section it was defined in, and the data emitted by the statement, too. The statement may be empty (plain label) or it may declare data, prefix or machine instruction. Pseudoinstructions PROC and PROC1 also define the symbol with their name, but pseudoinstructions PROGRAM, STRUC, SEGMENT do not.
  2. External symbols are created with pseudoinstructions EXTERN and GLOBAL, or when they are referred to with two colons appended to their name. An extern symbol is not defined in the current program; it must not appear in the label field (with the exception of the EXTERN pseudoinstruction which declares it as external).
  3. €ASM maintains a special dynamic symbol $ for each section, which represents the current assembly position in the section.
  4. Symbol can be defined with pseudoinstruction EQU or with its alias =. This is the only way how to define a plain numeric symbol.

↑ $ symbol

Special dynamic symbol $ represents the address of the next free position in emitted code at the beginning of assembly of the statement in which it is referred to. Value of this symbol is not constant; it is changed by €ASM after an emitting statement has been assembled.

Programmer may change the offset of current origin $ with EQU pseudoinstruction, this is equivalent to pseudoinstruction ORG known from other assemblers. There is no ORG pseudoinstruction in €ASM, $ is made l-value instead.
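
A hedged sketch of both uses of $ follows. The names Message and MessageSize are hypothetical, and the exact form of the statement which moves the origin ($ in the label field of EQU) is an assumption derived from the description above rather than a verified example.

[.data]
Message:    DB "Hello"
MessageSize EQU $ - Message ; $ is the position right behind the five bytes of Message.
$           EQU $ + 16      ; Assumed syntax: advance the origin by 16 bytes, as ORG $+16 would in other assemblers.

Subtracting two addresses which lie in the same section yields a plain number, which is why MessageSize can be used as a numeric symbol.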


↑ Symbol and file attributes

SIZE# ↓
TYPE# ↓
SCOPE# ↓
OFFSET# ↓
SECTION# ↓
SEGMENT# ↓
GROUP# ↓
PARA# ↓
FILESIZE# ↓
FILETIME# ↓

Some important symbol properties are available for further processing in a program at assembly time; they are called attributes. When a symbol is defined, it automatically gets its attributes. They can be referred to by prefixing the symbol name with an attribute operator. Attribute operator is an identifier which defines the kind of attribute, immediately followed with #. The object which the attribute operator is applied to may be separated by zero or more white spaces and it may be in parentheses. For instance SIZE#SymbolName or SIZE# SymbolName or SIZE#(SymbolName). Remember that the symbol name is case sensitive but the attribute name is not.

Attributes GROUP#, SEGMENT# and SECTION# return an address when applied to an address symbol; they return scalar zero when applied to numeric symbol. Other attributes always return scalar (plain number).

↑ SEGMENT#

Attribute SEGMENT# represents the address of beginning of the segment that the symbol belongs to. When applied to a numeric symbol, it returns scalar zero.

↑ GROUP#

Attribute GROUP# represents the address of beginning of the group that the symbol belongs to, i.e. address of the first byte of the first (lowest) segment of the group. When applied to a numeric symbol, it returns scalar zero.

↑ PARA#

Attribute PARA# represents the paragraph address of beginning of the group that the symbol belongs to. It is the value which has to be loaded to the segment register which will be used for addressing. When PARA# is applied to a numeric symbol, it returns scalar zero.

↑ SECTION#

Attribute SECTION# represents the address of beginning of the section that the symbol belongs to. When applied to a numeric symbol, it returns scalar zero. If the symbol lies in default section (with the same name as its segment), both SECTION# and SEGMENT# attributes return identical address.

↑ OFFSET#

Attribute OFFSET# returns the offset of symbol in the current segment as a plain number, i.e. the number of bytes between the start of segment and the symbol itself. If the symbol is numeric, its value is returned.

Symbol and OFFSET#Symbol are identical only when Symbol is a scalar value, otherwise the former represents its address and the latter represents plain number.

The expression Symbol - SEGMENT#Symbol is identical with OFFSET#Symbol for both numeric and address kind of symbols.

↑ SCOPE#

Attribute SCOPE# returns a number representing the ASCII value of capital letter corresponding with the symbol scope, which can be 'E' for external symbols, 'P' for public symbols, 'X' for exported symbols, 'I' for imported symbols, 'S' for standard (private) symbols, or '?' when the symbol is undeclared.

↑ SIZE#

SIZE# represents the amount of bytes emitted with the statement which defines the symbol. Typically it is the size of data defined with D pseudoinstruction or the size of machine instruction. Symbols defined with EQU pseudoinstruction or defined in non-emitting instruction have attribute SIZE# equal to zero.

↑ TYPE#

Attribute TYPE# returns a number representing the ASCII value of capital letter corresponding with the symbol type. It may be one of the fundamental data types 'B', 'U', 'W', 'D', 'Q', 'T', 'O', 'Y', 'Z', structured data type 'S' or machine instruction type 'I' when the symbol is defined with data definition pseudoinstruction D. Numeric symbol returns type attribute 'N'. Address symbol defined with just a label and external symbol returns 'A'. Undefined symbol returns '?'.

Forward reference to a symbol will create its record in the symbol table. However, in the first pass its type attribute is '?' (undefined) until its definition is encountered. On the other hand, applying an attribute to an undefined symbol does not make it referred. That is why we may test with the pseudoinstruction %IF TYPE#Symbol = '?' whether the symbol is undefined in the program.

Besides symbols, some attribute operators may be applied to other elements: register, structure name, string, expression in parentheses () or brackets [].

TYPE# of register is 'R' and its SIZE# is equal to the register width in bytes (1,2,4,8,10,16,32,64).

TYPE# of structure is 'S' and its SIZE# returns the size of the structure.
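
The sketch below applies several attribute operators to the symbols used in earlier examples of this chapter. The comments state the values expected from the descriptions above; the choice of destination registers is arbitrary.

[DATA]
SomeValue:  DD 1
BufferSize  EQU 16K
[CODE]
    MOV ECX,SIZE# SomeValue   ; 4, the number of bytes emitted by DD 1.
    MOV AL,TYPE# SomeValue    ; 'D', the symbol was defined with DD.
    MOV AH,TYPE# BufferSize   ; 'N', BufferSize is a numeric symbol.
    MOV EDX,OFFSET# SomeValue ; Offset of SomeValue in its segment as a plain number.
    MOV EBX,SIZE# EAX         ; 4, the width of register EAX in bytes.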

↑ FILESIZE#
↑ FILETIME#

Unlike previous attributes, FILESIZE# and FILETIME# can be applied only to files specified by their name, which must be surrounded with double quotes ".

The filename may have absolute, relative, or no path, it is related to the current directory.

FILESIZE# "filename" returns the number of bytes in the file, or 0 if the file was not found.
FILETIME# "filename" returns the timestamp of the file, i.e. the number of seconds between 1st of January 1970 midnight UTC and the last file modification. It returns 0 when the file was not found.
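
Since FILESIZE# returns 0 for a missing file, it can guard an optional inclusion. This is only a sketch; the file name "optional.inc" is hypothetical. Note that an existing but empty file is skipped as well.

%IF FILESIZE# "optional.inc" != 0
    INCLUDE "optional.inc" ; Included only when the file exists and is not empty.
%ENDIF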

↑ Literals

Literal symbols alias literals are similar to standard assembler symbols. The difference is that they don't have explicit definition and name. Literal is defined whenever it is referred and its name is represented with equal sign = followed with data expression, e.g. =D(5) or =B"Some text.". They may be duplicated, but unlike in D pseudoinstruction (which may have many operands), only one data expression can be specified. Examples of instructions with literals:

DIV [=W(10)] ; Divide DX:AX by an anonymous word variable with value 10
MOV DX,=B"This is a literal message.$"
LEA ESI,[=D 0]
CALL =I"RET"
LEA EBX,[=D 0,1,2,3] ; Error: multiple data expressions.
MOV DX,=B"This is a literal message.",13,10 ; Error: multiple data expressions.

The first example declares a word variable =W(10). Without literals we would have to explicitly define a data variable Ten DW 10 somewhere in data section.

Advantage of literal is that we don't need to invent unique symbol name and explicitly declare the symbol in data section with D pseudoinstruction. The data contents is visible directly in the instruction which uses the literal.

Literals are automatically aligned.

All literals are autoaligned according to their type, for instance =D 5 is DWORD aligned regardless of current EUROASM AUTOALIGN= option.

String literals are automatically zero-terminated.

String literals, such as =B"Some text" or =U"Some text" are always implicitly terminated with byte or unichar zero when they are declared as literals.
€ASM allows simplified declaration of literal strings, where the type identifier (B or U) is omitted, e.g. ="Some text". The actual type of string is then determined by system preprocessing variable %^UNICODE.

Implicit data definition with literals does not allow the programmer to control the location where the literals will be emitted. €ASM creates a subservient section for each type of alignment and assigns those sections either

  1. to the first data segment which was created with PURPOSE=DATA+LITERAL, or if no such segment was found,
  2. to the last data segment defined in the program.

Names of literal sections are @LT64, @LT32, @LT16, @LT8, @LT4, @LT2, @LT1. Literals with the INSTR data type, such as =8*I"MOVSD", are emitted to subservient section @RT0 which is similarly assigned to the segment with PURPOSE=CODE+LITERAL, or to the last code segment.

Repeated literals with the same declaration are reused, they represent the same memory variable. However, literals with non-verbatim match, such as =W+4, =W 4 and =W(4) are stored separately, although they represent the same value. Similarly =B"Some text", =B'Some text' and =B 'Some text' are different and they will be stored in data memory as three different strings.

Literals should always be treated as read-only memory variables. Although nothing can stop the programmer from overwriting the literal at run-time, that could corrupt behaviour of other parts of program, which might be reusing the same literal data.
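
The reuse rule can be illustrated with the following sketch, assuming a 32-bit program. The string contents and registers are chosen for illustration only.

    MOV ESI,=B"Ready"  ; The first reference creates the literal.
    MOV EDI,=B"Ready"  ; Verbatim identical declaration, the same memory variable is reused.
    MOV EBX,=B 'Ready' ; The declaration text differs, so this literal is stored separately.

ESI and EDI receive the same address; overwriting the data at ESI at run time would therefore also change what EDI points to.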


↑ %Variables

User-defined %variables ↓

Formal %variables ↓

Automatic %variables ↓

System %variables ↓

€ASM program uses preprocessing variables (alias %variables) for easy manipulation of the source text at assembly time. Hand in hand with macroinstructions they make a very strong tool which eases tedious repetitive work. Preprocessing machinery does not affect the object code directly, as the plain assembler does. Instead, it manipulates the source text, which can be modified with %variables and repeated with preprocessing %pseudoinstructions.

Preprocessing variables always treat their contents as a text, without inspecting its syntactic significance, no matter if they were assigned with text, string, numeric or logical expression or whatever.

Once assigned, the contents of %variable will be used (expanded) whenever the %variable appears in the source text. Expansion takes place before the physical line of source file is parsed into statement fields. By default the whole contents of %variable is expanded, but this can be changed with Substring or Sublist operation.


↑ User-defined %variables

User-defined %variable is represented as percent sign % immediately followed with an identifier, which is not a reserved %variable name in either case. Identifier name must begin with a letter and may not contain fullstop or other punctuation.

User-defined %variable name is case-sensitive.
Reserved %variable names
Category | Reserved names
Pseudoinstructions | %COMMENT, %DEBUG, %DISPLAY, %DROPMACRO, %ELSE, %ENDCOMMENT, %ENDFOR, %ENDIF, %ENDMACRO, %ENDREPEAT, %ENDWHILE, %ERROR, %EXITFOR, %EXITMACRO, %EXITREPEAT, %EXITWHILE, %FOR, %IF, %MACRO, %PROFILE, %REPEAT, %SET, %SET2, %SETA, %SETB, %SETC, %SETE, %SETL, %SETS, %SETX, %SHIFT, %UNTIL, %WHILE

User %variable is assigned (created) with one of the %SET* family of pseudoinstructions.

%Variables may be reassigned, they don't have to be unique in the source.

Scope of user-defined %variable begins at its definition and ends at the end of source file.

%Variables need not be assigned before the first use. Unassigned %variable expands to nothing (empty text). Once defined %variable cannot be unassigned, there is no UNDEFINE or UNASSIGN directive in €ASM. Nevertheless, setting a %variable to emptiness (e.g. %SomeVar %SET) is equivalent to unsetting it. €ASM reports no warning if it encounters user-defined %variable which is empty, which has not been defined earlier or which is not defined in the source file at all.
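
A minimal sketch of assignment, reassignment and emptying of a user-defined %variable. The name %Reg is hypothetical; comments are kept on separate lines so that they cannot be mistaken for part of the assigned text.

; Assign the text EAX to %Reg.
%Reg %SET EAX
     MOV %Reg,1   ; Expands to MOV EAX,1
; Reassign the same %variable.
%Reg %SET EBX
     MOV %Reg,1   ; Expands to MOV EBX,1
; Set %Reg to emptiness, which is equivalent to unsetting it.
%Reg %SET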

Differences between symbols and %variables
Symbols | %Variables
are properties of PROGRAM | are properties of EUROASM
never start with % | always start with %
may have membership fullstop in their name | never have fullstop in their name
are declared in label field of a statement | are assigned with %SET* pseudoinstruction
have assembly attributes such as TYPE# and SIZE# | are simply a piece of text without attributes
may be forward referenced | cannot be forward referenced
must be declared just once in a program | may be redeclared many times
cannot be referenced if not declared somewhere in the main or linked program | may be referenced without declaration. Unassigned %variable is silently replaced with nothing.
cannot be subject of sublist or substring operation | can be sublisted or substringed

↑ Formal %variables

Formal %variable represents a parameter used in a %FOR loop and in %MACRO invocation. It is defined with the identifier which stands for the control variable in the %FOR statement or for the operand in the %MACRO prototype.

Scope of formal variables is limited to the block which is being expanded.

Count %FOR 1..8
        DB %Count
      %ENDFOR Count

The previous example generates eight DB statements which define byte values from 1 to 8. Identifier Count used in %FOR and %ENDFOR statements is %FOR-control variable, which is accessible inside the %FOR block as a formal %variable %Count.

Formal variables are also used to access macro operand by name during the macro expansion. In the next example we have two %MACRO-formal variables provided in %MACRO definition as identifiers Where and Stuff. In macro body their values are available as formal %variables %Where and %Stuff.

Fill %MACRO Where, Stuff=0 ; Definition of macro Fill.
       MOV %Where,%Stuff
     %ENDMACRO Fill
   
; Invokations of macro Fill:
   Fill [Counter], Stuff=255 ; Will be assembled as MOV [Counter],255
   Fill EBX                  ; Will be assembled as MOV EBX,0

Notice that formal %variables are always used without the percent sign when they are declared, but % must be prefixed to their name when they are referred in the %FOR or %MACRO body. This is important for inheriting of arguments in nested macroinstructions.


↑ Automatic %variables

Automatic preprocessing variables are created and updated by €ASM; their names contain punctuation characters and, unlike user-defined %variables, they cannot be explicitly reassigned with %SET pseudoinstruction.

%&
Number of characters | list items | physical lines in suboperations.
%1
Ordinal operand.
%!1
Inverted condition code from ordinal operand.
%Formal
Operand with formal name.
%!Formal
Inverted condition code of operand with formal name.
%*
List of all ordinal operands used in macro invocation.
%#
Number of ordinal operands used in macro invocation.
%=*
List of keyword operands used in macro invocation.
%=#
Number of keyword operands used in macro invocation.
%:
Label of macro invocation.
%.
Expansion number.

Automatic suboperation variable %& is created when the expansion of included file or of another %variable uses suboperations.
When the substring operator [] is appended to the %variable name or to the included file name, automatic variable %& can be used inside the brackets, e.g. [1..%&] and it represents the number of bytes in expanded %variable or in included "file".
When the sublist operator {} is appended to the %variable name, the %variable contents is treated as an array of comma-separated items and %& represents their number (ordinal of the last nonempty item).
When the sublist operator {} is appended to the included file name, the file contents is treated as a set of physical lines and %& represents number of lines in the file. For instance INCLUDE "file.inc"{%&-10 .. %&} will include the last ten lines from "file.inc".

Using the %& variable outside brackets will throw an error.
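
The suboperations can be sketched as follows. The %variable name %Regs and its contents are hypothetical; the expansions in comments follow from the rules described above (items are comma-separated, character positions start at 1).

; Assign a list of three comma-separated items.
%Regs %SET EAX,EBX,ECX
      PUSH %Regs{1}    ; Sublist, the first item: PUSH EAX
      PUSH %Regs{%&}   ; Sublist, %& is the ordinal of the last item: PUSH ECX
      PUSH %Regs[5..7] ; Substring, characters 5 to 7: PUSH EBX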

All other automatic %variables can be used in macrodefinition only, they refer to operands used when the macro is expanded.

Ordinal automatic variable %1 is a percent sign followed with positive decimal number and it represents the corresponding ordinal operand of macro, e.g. %1, %2, %15 etc.

Inverted-condition automatic variable %!1 has the ordinal number or formal name prefixed with logical NOT operator. The referred operand must contain a general condition code (case insensitive) such as E, NE, C etc. Operand contents will be replaced with corresponding inverted code. €ASM reports error if the operand did not contain valid condition code.

NASM uses unary-minus operator - to achieve similar functionality. I believe logical-not operator ! is more appropriate for the inversion of logical values.

Example of using inverted-condition macro:

|00000008:                 |
|                          |AbortIf %MACRO Condition=, Errorlevel=1 ; Definition of macro AbortIf.
|                          |     J%!Condition Skip%.: ; Use inverted condition to bypass the abortion.
|                          |     PUSH %Errorlevel ; Prepare operand for API invokation.
|                          |     CALL ExitProcess:: ; Windows API for program termination.
|                          |Skip%.: ; Label where the program continues.
|                          |     %ENDMACRO AbortIf
|00000008:                 |
|00000008:                 | ; Example of conditional abortion:
|                          | EUROASM ListMacro=Yes, ListVar=Yes ; Display the expanded instructions.
|00000008:833D[04000000]00 | CMP [Something],0 ; Test the condition and then invoke macro.
|0000000F:                 | AbortIf Condition=E, Errorlevel=8 ; The program exits when Something is zero.
|                          +AbortIf %MACRO Condition=, Errorlevel=1 ; Definition of macro AbortIf.
|0000000F:7507             +     J%!Condition Skip%.: ; Use inverted condition to bypass the abortion.
|                          !JNE Skip1:
|00000011:6A08             +     PUSH %Errorlevel ; Prepare operand for API invokation.
|                          !PUSH 8
|00000013:E8(00000000)     +     CALL ExitProcess:: ; Windows API for program termination.
|00000018:                 +Skip%.: ; Label where the program continues.
|                          !Skip1:
|                          +     %ENDMACRO AbortIf
|00000018:                 | ; Continue with the program if not aborted.

Ordinal operand list %* is assigned with the ordinal operands at macro invocation. Keyword operands are omitted from the list. Length of the list (ordinal number of the highest non-empty operand) is stored in the ordinals count variable %#.

A macro operand can be referred to in various ways. The following example demonstrates three of them:

SomeMacro %MACRO FirstOp, SecondOp, ThirdOp
     MOV ESI,%FirstOp ; Using formal %variable name of the operand.
     MOV EDI,%2 ; Using ordinal number of the operand.
     MOV ECX,%*{3} ; Using the third item of operand list.
     REP MOVSB
    %ENDMACRO SomeMacro

Keyword operand list %=* is assigned with all keyword operands actually used in macro invocation and their number is stored in the keyword count variable %=#. Each item on the list contains the keyname, equal sign and the value.

MyStruc STRUC ; This is definition of structure MyStruc.
 .Word1  D W 1
 .Word2  D W 2
 .Word3  D W 3
     ENDSTRUC MyStruc
     
MyMacro %MACRO Name=Undefined ; This is definition of macro MyMacro.
      DD %#, %* ; Ordinals count, list of ordinals
%Name DS MyStruc, %=* ; List of keyword operands.
    %ENDMACRO MyMacro
    
MyMacro 11, 22, .Word1=33, 44, .Word2=55, Name=MyName ; This is invokation of MyMacro. 
                                                      ; It expands to
;        DD 3, 11, 22, 44
; MyName DS MyStruc, .Word1=33, .Word2=55
; Now we have defined those symbols:
; MyName.Word1 D W 33
; MyName.Word2 D W 55
; MyName.Word3 D W 3

When a macro is invoked, the label of invocation (if used) is by default placed at the first of the expanded statements. This behaviour can be changed when the automatic macro label %variable %: is placed somewhere in the macro definition. This may save a few clocks when jumping to a macro expansion which begins with code that would have to be skipped, see the following example:

SaveCursor %MACRO Videopage=BH
   %IF TYPE#CursorSave != 'W' ; if not declared yet
     JMP $ + 4 ; Skip over DW when the macro is entered in normal statements flow.
     CursorSave DW 0
   %ENDIF
%: MOV AH,3 ; Entry point of macro is here when the macro invokation is jumped to.
   MOV BH,%Videopage
   INT 10h ; get cursor shape via BIOS
   MOV [CursorSave],CX
  %ENDMACRO SaveCursor

If a label is declared within the macro definition, it will be defined in many places under the same name, which the assembler treats as an error. An identifier used as a label within a macro or other expanding pseudooperations (%FOR, %REPEAT, %WHILE) should be made unique. This can be achieved by extending the identifier with the automatic %variable expansion counter %.. Its value is replaced with a decimal number incremented with each expansion of any macro or any repeating block (%FOR, %REPEAT, %WHILE), so the label which uses it is unique. See the example of macro AbortIf above. The label Skip is postfixed with %. giving the label Skip%. which expands to Skip1 and which will expand to Skip2 on the next AbortIf invocation.

Automatic %variable %. can be used inside %MACRO, %FOR, %REPEAT and %WHILE block.


↑ System %variables

EUROASM system %variables ↓
PROGRAM system %variables ↓
€ASM system %variables ↓

EuroAssembler maintains a collection of preprocessing %variables with values of its configuration parameters. Their current values can be tested at assembly time and the program can branch accordingly.

The name of system %variable consists of %^ followed with one of enumerated identifiers.

System %^variable names are case insensitive.

Value of system %^variable cannot be assigned with %SET* pseudoinstruction; it is dynamically maintained by €ASM and reflects the current value in charge.

%^DumpWidth %SETA 32 ; Use EUROASM DumpWidth=32 instead.
System %^variables are read-only.

Programmer can influence the value of a system %^variable only indirectly, with options specified in the euroasm.ini configuration file or with EUROASM and PROGRAM pseudoinstructions.

Boolean type options, such as AutoSegment=, Priv= etc., are assigned to corresponding system %^variables %^Autosegment, %^Priv as 0 (false) or -1 (true), no matter whether they were defined using enumerated tokens ON/OFF, YES/NO, TRUE/FALSE or with a logical expression.
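
A boolean system %^variable can be tested directly in %IF, as in the following sketch. The symbol name Caption and the string contents are hypothetical, and the sketch assumes that -1 is accepted as true by %IF in the same way as the digit 1 in the %^WARN example.

%IF %^UNICODE
Caption: DU "EuroAssembler",0 ; Assembled when EUROASM UNICODE= is enabled.
%ELSE
Caption: DB "EuroAssembler",0 ; Assembled otherwise.
%ENDIF

Only one of the two definitions is assembled, so the symbol Caption remains unique in the program.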

EUROASM options WARN= and NOWARN= are assigned to system %variables %^Warn, %^NoWarn as series of 3999 digits 0 (false) and 1 (true). The first digit reflects the current status of message I0001, the second I0002, the last W3999.
Example: %IF %^WARN[2820] will assemble the following statements only if message W2820 is currently enabled.

Arithmetic type options are always assigned with decimal numbers. Positive sign + is omitted.

System preprocessing %variables
Category | %variable names (case insensitive)
EUROASM | %^AES, %^AMD, %^AUTOALIGN, %^AUTOSEGMENT, %^CODEPAGE, %^CPU, %^CYRIX, %^D3NOW, %^DEBUG, %^DISPLAYENC, %^DISPLAYSTM, %^DUMP, %^DUMPALL, %^DUMPWIDTH, %^EVEX, %^FPU, %^IMPORTPATH, %^INCLUDEPATH, %^LINKPATH, %^LIST, %^LISTFILE, %^LISTINCLUDE, %^LISTMACRO, %^LISTREPEAT, %^LISTVAR, %^LWP, %^MAXINCLUSIONS, %^MAXLINKS, %^MMX, %^MPX, %^MVEX, %^NOWARN, %^PRIV, %^PROFILE, %^PROT, %^RTF, %^RTM, %^SHA, %^SIMD, %^SPEC, %^SVM, %^TBM, %^TSX, %^UNDOC, %^UNICODE, %^VIA, %^VMX, %^WARN, %^XOP
PROGRAM | %^DllCharacteristics, %^Entry, %^FileAlign, %^Format, %^IconFile, %^ImageBase, %^ListGlobals, %^ListLiterals, %^ListMap, %^MajorImageVersion, %^MajorLinkerVersion, %^MajorOSVersion, %^MajorSubsystemVersion, %^MaxExpansions, %^MaxPasses, %^MinorImageVersion, %^MinorLinkerVersion, %^MinorOSVersion, %^MinorSubsystemVersion, %^Model, %^OutFile, %^SectionAlign, %^SizeOfHeapCommit, %^SizeOfHeapReserve, %^SizeOfStackCommit, %^SizeOfStackReserve, %^StubFile, %^Subsystem, %^TimeStamp, %^Width, %^Win32VersionValue
€ASM | %^Date, %^EuroasmOs, %^Proc, %^Program, %^Section, %^Segment, %^SourceExt, %^SourceFile, %^SourceLine, %^SourceName, %^Time, %^Version
↑ EUROASM system %variables

are set with values specified in [EUROASM] section of euroasm.ini or with EUROASM pseudoinstruction.

For description of system %variables of this category see the corresponding keyword of pseudoinstruction EUROASM.

↑ PROGRAM system %variables

are set with values specified in [PROGRAM] section of euroasm.ini or with PROGRAM pseudoinstruction.

For description of system %variables of this category see the corresponding keyword of pseudoinstruction PROGRAM.

↑ €ASM system %variables

Values of €ASM system %variables are maintained by €ASM itself and the programmer cannot change them directly. They are described here:

%^Version
Eight decimal digits which identify the version number of EuroAssembler. The version number can be deciphered as the day of €ASM release in the format YYYYMMDD.
%^Date, %^Time
Current date and time of assembly in the formats YYYYMMDD and HHMMSS. These two %^variables are set only once, when €ASM starts. All source files assembled with one command euroasm source*.asm will share the same %^Date and %^Time, which were set from the current local time.
%^EuroasmOs
identifies the operating system which EuroAssembler runs on during the assembly. It contains an abbreviation of the operating system name, such as win, lin, os2...
This is not necessarily the operating system which the output program is intended to run on.
%^SourceFile, %^SourceName, %^SourceExt
These three %^variables contain full file name including path, name (without path and extension) and extension (including the leading .) of the source file which is currently assembled. €ASM updates these %^variables at the start of assembly and whenever some other file is included.
%^SourceLine
contains physical line number of the current statement in the current source file. In multiline statements (with line continuation \) it is the last physical line.
%^Program
is the name of current PROGRAM / ENDPROGRAM block.
%^Proc
is the name of current procedure. This %^variable is empty outside PROC / ENDP block.
%^Segment
is the name of current segment (without braces).
%^Section
is the name of current section (without braces).

Example using system %^variables:

%MonthList %SET Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
%Day %SETA %^DATE[7..8] ; Using %SETA instead of %SET will assign %Day with decimal numeric value to get rid of leading zero.
InfoMsg DB "This program was assembled with €ASM %^EuroasmOs ver.%^Version",13,10
 DB "on %MonthList{%^Date[5..6]} %Day-th, %^Date[1..4] at %^Time[1..2]:%^Time[3..4].",13,10,0
; InfoMsg now contains something like
; This program was assembled with €ASM Win ver.20081231
; on Feb 8-th, 2009 at 22:05.
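
The substring and list-item operations used in the example above can be traced with a small model. This Python sketch assumes that %^Date[5..6] selects characters 5 to 6 counted from 1 (inclusive) and that %MonthList{n} selects the n-th comma-separated item, as the example suggests; the helper names are hypothetical.

```python
MONTHS = "Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec".split(",")

def substring(text, first, last):
    """1-based inclusive substring, modelled on %^Date[first..last]."""
    return text[first - 1:last]

def info_line(date, time):
    """Build the second line of InfoMsg from YYYYMMDD and HHMMSS strings."""
    day = int(substring(date, 7, 8))                 # numeric value drops the leading zero, like %SETA
    month = MONTHS[int(substring(date, 5, 6)) - 1]   # n-th list item, like %MonthList{...}
    year = substring(date, 1, 4)
    return f"on {month} {day}-th, {year} at {substring(time, 1, 2)}:{substring(time, 3, 4)}."

print(info_line("20090208", "220500"))
```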

Combination of €ASM system %^variables is used internally to identify position of statement in error messages: "%^SourceName%^SourceExt"{%^SourceLine}, e.g. "HelloWorld.asm"{3}


↑ Instructions

Machine instructions ↓

Pseudoinstructions ↓

Macroinstructions ↓

Instruction is an identifier specified in operation field of the statement.

There are three kinds of instructions in assembly language:
machine instructions invented by the CPU manufacturer,
pseudoinstructions invented by the assembler manufacturer,
macroinstructions invented by the programmer.


↑ Machine instructions

Instruction suffixes ↓

Instruction modifiers ↓

Instruction enhancements ↓

Undocumented instructions ↓

A machine instruction is the smallest order given to the CPU to perform some calculation or data manipulation at run-time.

EuroAssembler uses Intel syntax where the first instruction operand specifies destination (which is often one of source operands, too), and one or more sources may follow.

This is the syntax used in CPU-manufacturer documentation and also in most other assemblers, with the exception of Unix-based gas, which prefers an alternative paradigm represented by AT&T syntax with reversed operand order. For more differences between AT&T and Intel syntax see [ATTsyntax].

EuroAssembler implements machine instructions mnemonics as defined in specifications by CPU vendors. It also implements some undocumented instructions and instruction-format enhancements which are described below.

Some machine instructions allow alternative encodings of the same mnemonic; €ASM prefers the shortest one unless instructed otherwise. €ASM respects the mnemonic chosen by the programmer, therefore it never encodes e.g. LEA ESI,[MemoryVariable] as MOV ESI,MemoryVariable, although the latter encoding is one byte shorter. There are only two exceptions when the mnemonic is not obeyed:

↑ Instruction suffixes

Machine instructions can manipulate registers and memory variables of different widths, usually byte, word or doubleword operands. However, Intel architecture defines the same mnemonic regardless of data size. For instance, SUB [MemoryVariable],4 tells the CPU to subtract the immediate number 4 from MemoryVariable, which might have been defined as DB, DW, DD or DQ. €ASM looks at the type of MemoryVariable and selects the appropriate encoding according to its size. However, the offset might also be external or expressed as register contents or a plain number, such as in SUB [ESI],4, and the type of the memory variable is unknown in this case. One way to tell EuroAssembler which data width is desired is to use an instruction suffix, which is one of the letters B W D Q S N F appended to the mnemonic name.

€ASM allows many general-purpose instructions to be extended with the mnemonic suffix B, W, D, Q to specify operand size.

Control-transfer instructions CALL, JMP, RET may be modified with the suffix N or F, which tells whether the distance of the target is near or far, i.e. whether the target belongs to the same segment or whether the segment descriptor value needs to change, too. The unconditional JMP instruction may also be given the suffix S when the distance to the target can be encoded into 8 bits (-128..+127).

Suffix aware instructions in €ASM | Suffix
ADC, ADD, AND, CMP, CMPS, CRC32, DEC, DIV, IDIV, IMUL, INC, LODS, MOV, MOVS, MUL, NEG, NOT, OR, RCL, RCR, ROL, ROR, SAL, SAL2, SAR, SBB, SCAS, SHL, SHR, STOS, SUB, TEST, TEST2, XOR | B, W, D, Q
BT, BTC, BTS, BTR, ENTER, HINT_NOP, IRET, LEAVE, POP, POPF, PUSH, PUSHF | W, D, Q
PUSHA, POPA | W, D
INS, MOVSX, MOVZX, OUTS | B, W, D
XLAT | B
CALL, RET | N, F
JMP | S, N, F

Using an instruction suffix is not necessary in most cases, because the width of a memory variable can be deduced from its type attribute, or the width is determined by the register used as one of the operands. An error is reported if the register width is in conflict with the suffix, for instance in MOVW AL,[ESI].

Mnemonic suffix notation is sporadically used in other assemblers or in CPU documentation, see STOSB/W/D, OUTSB/W/D, RETN/F etc. €ASM just extends this enhancement.

Mnemonics of many SIMD instructions terminate with letters ~SS, ~SD, ~PS, ~PD which specify the type of operands, too (Scalar/Packed Single/Double-precision). €ASM does not treat them as mnemonic suffixes.

There are a few conflicts/overloads of suffixed mnemonics with IA-32 instructions, they are resolvable by the type and/or the number of operands:

|00000000: | ; Standard Move versus MMX Move Doubleword:
|00000000:C7450800000000 | MOVD [EBP+8],0 ; Store immediate number to DWORD memory location (suffix ~D).
|00000007:0F7E4508 | MOVD [EBP+8],MM0 ; Store DWORD from MMX register to the memory location.
|0000000B: |
|0000000B: | ; Shift versus Double Precision Shift:
|0000000B:C1650804 | SHLD [EBP+8],4 ; Shift left logical the DWORD in memory location by 4 bits (suffix ~D).
|0000000F:0FA4450804 | SHLD [EBP+8],EAX,4 ; Shift left 4 bits from the register to the memory location.
|00000014: |
|00000014: | ; Compare String versus Compare Scalar Double-precision FP number:
|00000014:A7 | CMPSD ; Compare DWORDs at [DS:ESI] and [ES:EDI] (suffix ~D).
|00000015:A7 | CMPSD [ESI],[EDI] ; Ditto, documented with explicit operands.
|00000016:F20FC2CA00 | CMPSD XMM1,XMM2,0 ; Compare scalar float64 numbers for EQUAL.

↑ Instruction modifiers

CODE= ↓
DATA= ↓
IMM= ↓
DISP= ↓
SCALE= ↓
DIST= ↓
ADDR= ↓
PREFIX= ↓
MASK= ↓
ZEROING= ↓
EH= ↓
SAE= ↓
ROUND= ↓
BCST= ↓
OPER= ↓
ALIGN= ↓
NESTINGCHECK= ↓

Machine instructions with the same mnemonic name and functionality sometimes may be encoded to different machine codes. For instance, immediate value can be optionally encoded in one byte when it does not exceed the range -128..+127, or it can be encoded as a full word or doubleword. Similar rule applies to encoding of displacement value in address expressions. Scaled address expression such as [1*ESI+EBX] may be encoded without SIB as [ESI+EBX] or using the SIB byte with explicit scaling factor 1.

€ASM prefers the shortest variant but this may be changed with additional keyword operands called instruction modifiers.

Many other assemblers decorate operands with special directives byte, word, dword, qword, short, near, far, ptr to achieve specific encoding, for instance add word ptr [StringOfBytes + 4], 0x20 or jmp short SomeLabel. Instead of those directives, €ASM uses either mnemonic suffix, or instruction modifiers.

AVX instruction modifiers MASK=,ZEROING=,SAE=,ROUND= replace inconsistent and poorly documented decorators, such as {k} {z} {ru-sae} {4to16} {uint16} {cdab} proposed by [IntelAVX512] and [IntelMVEX].

Typical value of modifier is enumerated token such as BYTE, WORD, DWORD etc. Most of enumerated modifier values may be abbreviated to their first letter. Both names and values of instruction modifiers are case insensitive.

Some modifiers are boolean type, their value may be TRUE, YES, ON, ENABLE, ENABLED if true, and FALSE, NO, OFF, DISABLE, DISABLED otherwise. Boolean modifier may also be an expression which evaluates to zero (false) or nonzero (true).

When the requested modifier cannot be satisfied, €ASM reports a warning and ignores the modifier.

Modifiers actually used for encoding can be displayed when EUROASM option DISPLAYENC= is switched ON. In this case €ASM accompanies each machine instruction with diagnostic message D1080 which explicitly documents which modifiers were used for encoding:

| | EUROASM DISPLAYENC=ON
|00000000:694D10C8000000 | IMUL ECX,[EBP+16],200
|# D1080 Emitted size=7,DATA=DWORD,DISP=BYTE,SCALE=SMART,IMM=DWORD.
|00000007: |
|00000007:62F1ED2CF44D02<5 | VPMULUDQ YMM1,YMM2,[EBP+40h],MASK=K4
|# D1080 Emitted size=7,PREFIX=EVEX,MASK=K4,ZEROING=OFF,DATA=YWORD,BCST=OFF,OPER=2,DISP=BYTE,SCALE=SMART.
↑ CODE=

As a heritage from the evolution of older processors, some machine instructions have more than one encoding. For instance, the instruction POP rAX may be encoded either as 0x58 or as 0x8FC0, keeping the same functionality. Modifier CODE= selects which encoding €ASM should use.

Operation-code modifier may be SHORT or LONG alias S or L. Default is the one which selects the shorter encoding, usually CODE=SHORT.

When an instruction has two possible encodings of the same size, CODE=SHORT selects the variant with the numerically lower opcode.

|00000000:43 | INC EBX
|00000001:43 | INC EBX,CODE=SHORT ; Intel 8086 legacy encoding, not available in 64bit mode.
|00000002:FFC3 | INC EBX,CODE=LONG
|00000004: |
|00000004:50 | PUSH EAX
|00000005:50 | PUSH EAX,CODE=SHORT ; Intel 8086 legacy encoding, not available in 64bit mode.
|00000006:FFF0 | PUSH EAX,CODE=LONG
|00000008: |
|00000008:87CA | XCHG ECX,EDX
|0000000A:87D1 | XCHG ECX,EDX,CODE=LONG ; Modifier swaps operands in commutative operations XCHG, TEST.
|0000000C:87D1 | XCHG EDX,ECX
|0000000E:87CA | XCHG EDX,ECX,CODE=LONG
|00000010: |
|00000010:C3 | RET
|00000011:C3 | RET CODE=LONG
|00000012:C20000 | RET CODE=SHORT ; Numerically lower opcode 0xC2 requested, which requires imm16.
|00000015: |
|00000015:83C07F | ADD EAX,127
|00000018:83C07F | ADD EAX,127,CODE=LONG
|0000001B:057F000000 | ADD EAX,127,CODE=SHORT ; Shorter opcode 0x05 requested, which cannot sign-extend imm8.
In some cases explicit request for numerically lower opcode with CODE=SHORT may lead to longer encoding, see the example ADD r32,imm8 above.
↑ DATA=

This modifier controls operation-size, i.e. the width of data which the instruction operates on. It may be one of BYTE, WORD, DWORD, QWORD, TBYTE, OWORD, YWORD, ZWORD alias B, W, D, Q, T, O, Y, Z. The default is not specified.

Modifier DATA= has the same function as an instruction suffix; there are only two differences:

There are two other ways in which the operand width is controlled. If one of the operands is a register, its width prevails and this cannot be overridden with a suffix or modifier. When the operand width is determined neither by a register nor by a suffix or modifier, €ASM looks at the TYPE# attribute of the target operand.

Priority of operand-size specifications:

  1. Width of register operand
  2. Mnemonics suffix
  3. Modifier DATA=
  4. Memory operand type

See the following examples:

|00000000:00000000 |MemoryVariable DB 0,0,0,0
|00000004:0107 | ADD [EDI],EAX ; Operand width is set by register (32bits).
|00000006:830701 | ADDD [EDI],1 ; Operand width is set by suffix (32bits).
|00000009:66830701 | ADD [EDI],1,DATA=W ; Operand width is set by modifier (16bits).
|0000000D:800701 | ADDB [EDI],1,DATA=W ; Operand width is set by suffix (8bits). Warning:modifier ignored.
|## W2401 Modifier "DATA=WORD" could not be obeyed in this instruction.
|00000010:660107 | ADDB [EDI],AX ; Operand width is set by register (16bits). Error:suffix ignored.
|### E6740 Impracticable operand-size requested with mnemonic suffix.
|00000013:8387[00000000]01 | ADDD [EDI+MemoryVariable],1 ; Operand width is set by suffix (32bits).
|0000001A:668387[00000000]01 | ADD [EDI+MemoryVariable],1,DATA=W ; Operand width is set by modifier (16bits).
|00000022:8087[00000000]01 | ADD [EDI+MemoryVariable],1 ; Operand width is set by TYPE#MemoryVariable='B' (8bits).
|00000029:800701 | ADD [EDI],1 ; Error:Operand width is not specified.
|### E6730 Operand size could not be determined, please use DATA= modifier.
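
The priority of operand-size specifications listed above can be summarised as a small decision procedure. The following Python sketch is a hypothetical model of that rule only; the function name, its parameters and the diagnostic texts are assumptions made for illustration and do not reflect how €ASM is implemented.

```python
WIDTHS = {"B": 8, "W": 16, "D": 32, "Q": 64}

def resolve_operand_width(register_bits=None, suffix=None, data_modifier=None, memory_type=None):
    """Return (width_in_bits, diagnostics) following the priority list above."""
    candidates = [
        ("register", register_bits),
        ("suffix", WIDTHS.get(suffix)),
        ("modifier", WIDTHS.get(data_modifier)),
        ("memory type", WIDTHS.get(memory_type)),
    ]
    chosen = None
    diagnostics = []
    for source, bits in candidates:
        if bits is None:
            continue
        if chosen is None:
            chosen = (source, bits)          # the first specification in priority order wins
        elif bits != chosen[1] and source in ("suffix", "modifier"):
            # A conflicting explicit request of lower priority is reported and ignored.
            diagnostics.append(f"{source} ignored, {chosen[0]} prevails")
    if chosen is None:
        raise ValueError("operand size could not be determined")
    return chosen[1], diagnostics

print(resolve_operand_width(suffix="B", data_modifier="W"))   # models ADDB [EDI],1,DATA=W
```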
↑ IMM=

Some instructions allow a small immediate value to be encoded as one byte, although they operate on full words. The byte value is sign-extended by the CPU at run-time.

Modifier IMM= may have the value BYTE, WORD, DWORD, QWORD alias B, W, D, Q and it specifies how the immediate operand should be encoded in the instruction.

|00000000:83D001 | ADC EAX,1
|00000003:83D001 | ADC EAX,1,IMM=BYTE
|00000006:81D001000000 | ADC EAX,1,IMM=DWORD
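
The choice between the two immediate encodings shown in the listing above can be reproduced with a short Python sketch. It covers only ADC EAX,imm in 32-bit mode with opcodes 0x83 (sign-extended imm8) and 0x81 (imm32); the function name is hypothetical, and the fallback to DWORD when IMM=BYTE cannot be obeyed is a simplification of the "warn and ignore" behaviour.

```python
import struct

def encode_adc_eax(imm, imm_modifier=None):
    """Encode ADC EAX,imm as 83 D0 ib or 81 D0 id, honouring an optional IMM= value."""
    fits_byte = -128 <= imm <= 127
    if imm_modifier is None:
        use_byte = fits_byte                 # default: prefer the shortest encoding
    else:
        # A request for BYTE which does not fit is ignored in this sketch.
        use_byte = imm_modifier == "BYTE" and fits_byte
    if use_byte:
        return bytes([0x83, 0xD0]) + struct.pack("<b", imm)
    return bytes([0x81, 0xD0]) + struct.pack("<i", imm)

for modifier in (None, "BYTE", "DWORD"):
    print(modifier, encode_adc_eax(1, modifier).hex().upper())
```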
↑ DISP=

Displacement address portion in some instructions may be encoded into one byte if its value is in the range -128..+127. The byte value is sign-extended by CPU at run-time. Values outside this range are encoded in full size, i.e. as WORD, or DWORD, according to the segment width (possibly inverted with ATOGGLE prefix). This is the default behaviour of €ASM. Modifier DISP= can have the same enumerated values as IMM= modifier (BYTE, WORD, DWORD, QWORD alias B, W, D, Q) and it controls whether the displacement is encoded with full size or as a byte.

|00000000:2945FC | SUB [EBP-4],EAX
|00000003:2945FC | SUB [EBP-4],EAX,DISP=BYTE
|00000006:2985FCFFFFFF | SUB [EBP-4],EAX,DISP=DWORD
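
The effect of DISP= on the ModR/M byte can be illustrated with a Python sketch which rebuilds the three encodings from the listing above. It is a simplified model under stated assumptions: 32-bit addressing, a base register other than ESP (no SIB byte), and a displacement that is always emitted, even when it is zero. The function name is hypothetical.

```python
import struct

REG = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3, "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}

def encode_sub_mem_reg(base, disp, src, disp_modifier=None):
    """Encode SUB [base+disp],src (opcode 0x29) with an 8-bit or 32-bit displacement."""
    if base == "ESP":
        raise ValueError("ESP as base requires a SIB byte, which this sketch does not model")
    fits_byte = -128 <= disp <= 127
    use_byte = fits_byte and disp_modifier != "DWORD"
    mod = 0b01 if use_byte else 0b10         # mod=01: disp8, mod=10: disp32
    modrm = (mod << 6) | (REG[src] << 3) | REG[base]
    tail = struct.pack("<b", disp) if use_byte else struct.pack("<i", disp)
    return bytes([0x29, modrm]) + tail

print(encode_sub_mem_reg("EBP", -4, "EAX").hex().upper())
print(encode_sub_mem_reg("EBP", -4, "EAX", "DWORD").hex().upper())
```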
↑ SCALE=

Scaling means multiplication of the contents of the index register with 0, 1, 2, 4 or 8 at run-time. The SCALE= modifier can be either SMART or VERBATIM (or shortly S, V). Default is SCALE=SMART.
In verbatim mode no optimization is performed with index and base registers and scaling is encoded in SIB byte even when the scale factor is 1 or 0. Encoding of instruction with SCALE=VERBATIM uses SIB byte, if possible.
In smart mode (default) €ASM tries to rearrange registers and not emit SIB byte unless absolutely necessary.
Here are the "smart" optimization rules (IR is indexregister, BR is baseregister, disp is displacement):

|00000000:A011000000 | MOV AL,[0x11] ; Special encoding without ModR/M.
|00000005:A011000000 | MOV AL,[0*ESI+0x11] ; Special encoding without ModR/M.
|0000000A:8A042511000000 | MOV AL,[0*ESI+0x11],SCALE=VERBATIM ; ModR/M with SIB. ESI is not used.
|00000011: |
|00000011:8A4611 | MOV AL,[ESI+0x11] ; ModR/M without SIB. ESI is base.
|00000014:8A4611 | MOV AL,[ESI+0x11],SCALE=SMART ; ModR/M without SIB. ESI is base.
|00000017:8A442611 | MOV AL,[ESI+0x11],SCALE=VERBATIM ; ModR/M with SIB, ESI is base.
|0000001B:8A4611 | MOV AL,[1*ESI+0x11] ; ModR/M without SIB. ESI is base.
|0000001E:8A4611 | MOV AL,[1*ESI+0x11],SCALE=SMART ; ModR/M without SIB. ESI is base.
|00000021:8A043511000000 | MOV AL,[1*ESI+0x11],SCALE=VERBATIM ; ModR/M with SIB, ESI is index.
|00000028: |
|00000028:8A443611 | MOV AL,[ESI+ESI+0x11] ; ModR/M with SIB. ESI is base and index.
|0000002C:8A443611 | MOV AL,[2*ESI+0x11] ; ModR/M with SIB. ESI is base and index.
|00000030:8A443611 | MOV AL,[2*ESI+0x11],SCALE=SMART ; ModR/M with SIB. ESI is base and index.
|00000034:8A047511000000 | MOV AL,[2*ESI+0x11],SCALE=VERBATIM ; ModR/M with SIB. ESI is index.
|0000003B: |
|0000003B:8A442D11 | MOV AL,[EBP+EBP+0x11] ; ModR/M with SIB, EBP is base and index.
|0000003F:8A442D11 | MOV AL,[2*EBP+0x11] ; ModR/M with SIB, EBP is base and index.
|00000043:8A442D11 | MOV AL,[2*EBP+0x11],SCALE=SMART ; ModR/M with SIB, EBP is base and index.
|00000047:8A046D11000000 | MOV AL,[2*EBP+0x11],SCALE=VERBATIM ; ModR/M with SIB, EBP is index.
Notice that optimization with SCALE=SMART may change the register role (base|index) and consequently the default segment register (SS|DS) used for addressing. This is usually not an issue in flat memory model, otherwise use SCALE=VERBATIM.
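
The SIB bytes which appear in the SCALE= examples above can be recomputed from the scale factor, index register and base register. The following Python sketch models only the standard 32-bit SIB layout (scale in bits 7..6, index in bits 5..3, base in bits 2..0); the function name is hypothetical and the sketch says nothing about how €ASM decides between smart and verbatim mode.

```python
REG = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3, "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}
SCALE_BITS = {1: 0, 2: 1, 4: 2, 8: 3}

def sib_byte(scale, index, base):
    """Compose a SIB byte; index=None encodes 'no index', base=None encodes 'no base' (disp32 follows)."""
    if index == "ESP":
        raise ValueError("ESP cannot be used as an index register")
    index_code = 0b100 if index is None else REG[index]
    base_code = 0b101 if base is None else REG[base]
    return (SCALE_BITS[scale] << 6) | (index_code << 3) | base_code

# [2*ESI+0x11] with SCALE=VERBATIM keeps ESI as a scaled index without base: SIB 0x75.
# In smart mode the same address is rearranged to base ESI + index ESI: SIB 0x36.
print(hex(sib_byte(2, "ESI", None)), hex(sib_byte(1, "ESI", "ESI")))
```

The smart variant is shorter because a base register allows an 8-bit displacement, whereas an index without a base always requires a 32-bit displacement.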
↑ DIST=

This modifier specifies the distance of target in control-transfer instructions. It can be one of FAR, NEAR, SHORT alias F, N, S.

DIST=FAR is used when the target is in a different segment and both rIP and CS registers need to be changed.

By default in intrasegment transfers €ASM automatically selects between SHORT and NEAR distance depending on the magnitude of offsets difference.

Modifier DIST= has the same function as an instruction suffix; there are only two differences:

Modifier DIST=NEAR or DIST=FAR can also be applied to pseudoinstructions PROC, PROC1. The consequence of making a procedure FAR is that calls and jumps to that procedure will be FAR by default, and that any RET inside this procedure will default to DIST=FAR, too.

|[CODE1] |[CODE1] SEGMENT
|0000:EB2A | JMP CloseLabel: ; Encoded DIST=SHORT.
|0002:E92701 | JMP DistantLabel: ; Encoded DIST=NEAR.
|0005:EA[0000]{0000} | JMP FarLabel: ; Encoded DIST=FAR.
|000A:EB20 | JMP CloseLabel:,DIST=SHORT ; Encoded DIST=SHORT.
|000C:E91D01 | JMP DistantLabel:,DIST=SHORT ; Encoded DIST=NEAR.
|## W2401 Modifier "DIST=SHORT" could not be obeyed in this instruction.
|000F:EA[0000]{0000} | JMP FarLabel:,DIST=SHORT ; Encoded DIST=FAR.
|## W2401 Modifier "DIST=SHORT" could not be obeyed in this instruction.
|0014:E91500 | JMP CloseLabel:,DIST=NEAR ; Encoded DIST=NEAR.
|0017:E91201 | JMP DistantLabel:,DIST=NEAR ; Encoded DIST=NEAR.
|001A:E9(0000) | JMP FarLabel:,DIST=NEAR ; Encoded DIST=NEAR.
|001D:EA[2C00]{0000} | JMP CloseLabel:,DIST=FAR ; Encoded DIST=FAR.
|0022:EA[2C01]{0000} | JMP DistantLabel:,DIST=FAR ; Encoded DIST=FAR.
|0027:EA[0000]{0000} | JMP FarLabel:,DIST=FAR ; Encoded DIST=FAR.
|002C: |CloseLabel:
|002C:90909090909090~~| DB 256 * B 0x90 ; Some stuff to stall off the DistantLabel.
|012C: |DistantLabel:
|[CODE2] |[CODE2] SEGMENT
|0000: |FarLabel:
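
The automatic selection between SHORT and NEAR distance in the listing above can be recomputed by hand. This Python sketch models intrasegment JMP in a 16-bit segment only (opcodes 0xEB rel8 and 0xE9 rel16); the function name is hypothetical, and the instruction address is taken as given, whereas a real assembler has to settle addresses over several passes.

```python
import struct

def encode_jmp16(address, target, dist=None):
    """Encode an intrasegment JMP located at `address` in a 16-bit segment."""
    short_rel = target - (address + 2)               # EB rel8 occupies 2 bytes
    if dist != "NEAR" and -128 <= short_rel <= 127:
        return bytes([0xEB]) + struct.pack("<b", short_rel)
    # DIST=SHORT which cannot be obeyed falls back to NEAR, as in the listing.
    near_rel = (target - (address + 3)) & 0xFFFF     # E9 rel16 occupies 3 bytes
    return bytes([0xE9]) + struct.pack("<H", near_rel)

CLOSE_LABEL, DISTANT_LABEL = 0x002C, 0x012C
print(encode_jmp16(0x0000, CLOSE_LABEL).hex().upper())
print(encode_jmp16(0x0002, DISTANT_LABEL).hex().upper())
```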
↑ ADDR=

This modifier chooses the reference frame of memory addressing in 64bit mode. Allowed values are ABS, REL alias A, R. With absolute addressing, the number encoded in the instruction code is related to the start of the segment, which is always 0 at assembly time.
With relative addressing it is related to the position of the next instruction, i.e. to the contents of register RIP. In legacy modes the reference frame is hardwired as ADDR=REL in control-transfer instructions (direct JMP, CALL, LOOP, Jcc), and as ADDR=ABS in all other instructions.

RIP-relative addressing is shorter by one byte and it does not require relocation, which saves space in object file and avoids patching of code at load-time. That is why ADDR=REL is preferred by default in 64-bit mode.
Explicit selection between absolute and RIP-relative addressing is relevant only in 64-bit mode when the absolute address would require relocation at link-time. This happens when the memory variable is specified as displacement of address symbol (not a plain number), and no index or base register is involved in addressing.

|00000000:00000000 | MemDword DD 0
|00000004: |
|00000004:0305F6FFFFFF | ADD EAX,[MemDword] ; Encoded with relative addressing.
|0000000A:0305F0FFFFFF | ADD EAX,[MemDword],ADDR=REL ; Encoded with relative addressing.
|00000010:030425[00000000] | ADD EAX,[MemDword],ADDR=ABS ; Encoded with absolute addressing.
|00000017: |
|00000017:034540 | ADD EAX,[RBP+0x40] ; Encoded with absolute addressing.
|0000001A:034540 | ADD EAX,[RBP+0x40],ADDR=ABS ; Encoded with absolute addressing.
|0000001D:034540 | ADD EAX,[RBP+0x40],ADDR=REL ; Encoded with absolute addressing.
|## W2401 Modifier "ADDR=REL" could not be obeyed in this instruction.
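
The RIP-relative displacements in the listing above follow from the rule that the encoded number is related to the position of the next instruction. A minimal Python sketch, assuming the 6-byte form 03 05 disp32 of ADD EAX,[rel32] and a target in the same section (the function name is hypothetical):

```python
import struct

def encode_add_eax_rip_relative(address, target):
    """Encode ADD EAX,[target] with RIP-relative addressing at the given instruction address."""
    length = 6                                   # opcode 03, ModR/M 05, four displacement bytes
    disp = target - (address + length)           # relative to the next instruction (RIP)
    return bytes([0x03, 0x05]) + struct.pack("<i", disp)

MEM_DWORD = 0x00000000
print(encode_add_eax_rip_relative(0x04, MEM_DWORD).hex().upper())
print(encode_add_eax_rip_relative(0x0A, MEM_DWORD).hex().upper())
```

Both results depend only on the distance between the instruction and MemDword, which is why no relocation is needed when the whole section is moved.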
↑ PREFIX=

Following modifiers apply only to instructions which use Advanced Vector eXtensions (AVX) encoding. Possible value of prefix is XOP, VEX, VEX2, VEX3, MVEX, EVEX (shortcuts are not available).

Most AVX encodable instructions have their mnemonics prefixed with V~. Some instructions are defined with only one kind of AVX prefix, they don't need explicit modifier. When an instruction can be alternatively encoded with different AVX prefixes, €ASM will by default choose the shortest one.

Prefix VEX exists in two variants: VEX2 and VEX3. The longer encoding (VEX3) is automatically selected when the instruction uses indexregister or baseregister R8..R15 or when it uses opcode from map 0F38 or 0F3A.

Prefix EVEX or MVEX will be selected instead of VEX when the instruction uses register XMM16..XMM31, YMM16..YMM31, ZMM0..ZMM31, K0..K7, or modifier EH=, SAE=, ROUND=, MASK=, ZEROING=, OPER=.

Instructions encodable with both EVEX and MVEX default to PREFIX=EVEX. Software written for Intel® Xeon Phi™ CPU needs to explicitly request PREFIX=MVEX in each such amphibious instruction. In this case it is useful to disable EVEX with EUROASM EVEX=DISABLED and thus be warned if some MVEX instruction encodes as EVEX by omission. Explicit specification of modifier EH= (which is available with MVEX only) will select MVEX too, and explicit PREFIX=MVEX is not necessary in this case.

CPU features required by using AVX prefix
Prefix | EUROASM options
XOP | SIMD=AVX, AMD=ENABLED, XOP=ENABLED
VEX | SIMD=AVX
MVEX | SIMD=AVX512, MVEX=ENABLED
EVEX | SIMD=AVX512, EVEX=ENABLED
|00000000:8FE868CCCB04 | VPCOMB XMM1,XMM2,XMM3,4 ; VPCOMB is defined with XOP only.
|00000006:62F1FA082917 | VMOVNRAPD [RDI],ZMM2 ; VMOVNRAPD is defined with MVEX only.
|0000000C:C5E958CB | VADDPD XMM1,XMM2,XMM3 ; VADDPD is defined with VEX,MVEX,EVEX.
|00000010:C5E958CB | VADDPD XMM1,XMM2,XMM3,PREFIX=VEX
|00000014:C5E958CB | VADDPD XMM1,XMM2,XMM3,PREFIX=VEX2
|00000018:C4E16958CB | VADDPD XMM1,XMM2,XMM3,PREFIX=VEX3
|0000001D:62F1ED0858CB | VADDPD XMM1,XMM2,XMM3,PREFIX=EVEX
|00000023:62F1ED4858CB | VADDPD ZMM1,ZMM2,ZMM3,PREFIX=EVEX
|00000029:62F1E90858CB | VADDPD ZMM1,ZMM2,ZMM3,PREFIX=MVEX
↑ MASK=

Modifier MASK= (as well as ZEROING=, EH=, SAE=, ROUND=, BCST=, OPER=) is applicable only with Enhanced Advanced Vector eXtensions (EVEX or MVEX). MASK specifies which opcode mask register is used to control which elements (floating-point or integer numbers) should be written to the destination SIMD register. Only elements which have corresponding bits in mask-register set, are written. Other elements are either zeroed (if modifier ZEROING=ON) or left unchanged (ZEROING=OFF).

Possible value of MASK= is K0, K1, K2, K3, K4, K5, K6, K7 or an expression which evaluates to a number 0..7. Default is MASK=0. Opmask register K0 is special: it is treated as if it had all bits set, thus no masking is applied in this case.

↑ ZEROING=

Modifier ZEROING= is boolean, it controls whether elements masked-off by the contents of opmask register should be set to zero or left unchanged, which is called merging. It has no meaning when MASK=K0 or when mask is not specified at all. Default is ZEROING=OFF (merging). Modifier is applicable only with EVEX encoding.

|00000000:C5E958CB | VADDPD XMM1,XMM2,XMM3 ; VADDPD is defined with VEX,MVEX,EVEX.
|00000004:62F1ED0C58CB | VADDPD XMM1,XMM2,XMM3,MASK=4 ; Using MASK= will force EVEX encoding.
|0000000A:62F1ED0C58CB | VADDPD XMM1,XMM2,XMM3,MASK=K4,ZEROING=NO
|00000010:62F1ED8C58CB | VADDPD XMM1,XMM2,XMM3,MASK=K4,ZEROING=YES
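
The run-time difference between merging and zeroing described above can be shown on plain lists. This Python sketch is only a semantic model of the masking rule: bit i of the mask controls element i, and masked-off elements are either kept or zeroed. The function name and the list representation of a SIMD register are assumptions made for illustration.

```python
def apply_mask(destination, result, mask, zeroing=False):
    """Write `result` elements to `destination` where the mask bit is set; merge or zero the others."""
    output = []
    for position, (old, new) in enumerate(zip(destination, result)):
        if (mask >> position) & 1:
            output.append(new)                   # element selected by the opmask register
        else:
            output.append(0.0 if zeroing else old)
    return output

xmm1 = [1.0, 2.0]            # previous contents of the destination (two float64 elements)
sums = [10.0, 20.0]          # results computed by the instruction
print(apply_mask(xmm1, sums, 0b01))                  # merging, as with ZEROING=OFF
print(apply_mask(xmm1, sums, 0b01, zeroing=True))    # as with ZEROING=ON
```

In this model the special behaviour of K0 corresponds to passing a mask with all bits set.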
↑ EH=

Boolean modifier EH= (Eviction Hint) is applicable to MVEX-encoded instructions only. EH=1 informs the CPU that the data is non-temporal and unlikely to be reused soon, so there is no benefit in storing it in the CPU cache. This concerns register-to-memory instructions only.

Value of EH is also consulted in register-to-register instructions where it will select between swizzle operations and static rounding.

↑ SAE=

If boolean modifier SAE= (Suppress All Exceptions) is switched on, the instruction will not raise any kind of floating-point exception flags, for instance when it operates on a not-a-number value. An instruction with SAE=ON behaves as if all the MXCSR mask bits were set.

In EVEX-encoding SAE is by default enabled whenever static rounding is used, this behaviour cannot be switched off.

↑ ROUND=

Modifier ROUND= specifies the static rounding mode; it is applicable to EVEX and MVEX instructions with rounding semantics, for instance conversion from double-precision to single-precision FP numbers. It has four possible enumerated values: NEAR, UP, DOWN, ZERO alias N, U, D, Z.

Static rounding is available only in ZMM register-to-register operations (not if one of the operands is in memory or when XMM and YMM registers are used). Default is no rounding, in this case general rounding mode controlled by RM bits in MXCSR applies.

↑ BCST=

Boolean modifier BCST= can be used to enable data broadcasting in operations which load data from memory. When BCST=ENABLED, memory source operand specifies only one element and its contents will be broadcast (copied) to all positions of destination register.

Default is BCST=OFF. Broadcasting cannot be used with register-to-register operations.

|00000000:62F16C48590E | VMULPS ZMM1,ZMM2,[RSI] ; Multiply 16 DWORD FP numbers in ZMM2 with 16 DWORD FP numbers at [RSI], store 16 products to ZMM1.
|00000006:62F16C58590E | VMULPS ZMM1,ZMM2,[RSI],BCST=ON ; Multiply 16 DWORD FP numbers in ZMM2 with the same DWORD FP number at [RSI], store 16 products to ZMM1.
|0000000C:62F16C4859CB | VMULPS ZMM1,ZMM2,ZMM3 ; Multiply 16 DWORD FP numbers in ZMM2 with 16 DWORD FP numbers in ZMM3, store 16 products to ZMM1.
|00000012:62F16C7859CB | VMULPS ZMM1,ZMM2,ZMM3,ROUND=ZERO ; Ditto, truncate each product toward zero.
↑ OPER=

Instruction modifier OPER= encodes kind of operation performed with source operand at run-time. Affected operations are broadcasting, rounding, conversion, swizzling. Possible value is numeric expression which evaluates to 0..7.

Value of operation will be encoded in bits 6, 5, 4 of the 32bit prefix EVEX or MVEX. These bits are named S2, S1, S0 in MVEX specification [IntelMVEX], and L', L, b in EVEX specification [IntelAVX512]. The same bits are also affected by modifiers BCST=, ROUND=, SAE= and by SIMD register width, but a direct OPER= specification has higher significance if a conflict occurs.

Modifier OPER= is the only way to request a special conversion or swizzle (shuffle) operation for MVEX-encoded instructions available on Intel® Xeon Phi™ CPU. Not all operation values from the table below are available with all MVEX instructions; documentation in [IntelMVEX] should always be consulted prior to using OPER=.

MVEX-encoded operations
OPER= | register-to-register, EH=0 | register-to-register, EH=1 | memory-to-register | register-to-memory
0 | no swizzle {dcba} | ROUND=NEAR,SAE=NO | no operation | no conversion
1 | swap (inner) pairs {cdab} | ROUND=DOWN,SAE=NO | bcst 1 element {1to16} or {1to8} | not available
2 | swap with two-away {badc} | ROUND=UP,SAE=NO | bcst 4 elements {4to16} or {4to8} | not available
3 | cross-product swizzle {dacb} | ROUND=ZERO,SAE=NO | convert from {float16} | convert to {float16}
4 | bcst a element across 4 {aaaa} | ROUND=NEAR,SAE=YES | convert from {uint8} | convert to {uint8}
5 | bcst b element across 4 {bbbb} | ROUND=DOWN,SAE=YES | convert from {sint8} | convert to {sint8}
6 | bcst c element across 4 {cccc} | ROUND=UP,SAE=YES | convert from {uint16} | convert to {uint16}
7 | bcst d element across 4 {dddd} | ROUND=ZERO,SAE=YES | convert from {sint16} | convert to {sint16}

EVEX-encoded operations
OPER= | register-to-register | memory-to-register
0 | DATA=OWORD,SAE=NO | DATA=OWORD,BCST=OFF
1 | DATA=ZWORD,SAE=YES,ROUND=NEAR | DATA=OWORD,BCST=ON
2 | DATA=YWORD,SAE=NO | DATA=YWORD,BCST=OFF
3 | DATA=ZWORD,SAE=YES,ROUND=DOWN | DATA=YWORD,BCST=ON
4 | DATA=ZWORD,SAE=NO | DATA=ZWORD,BCST=OFF
5 | DATA=ZWORD,SAE=YES,ROUND=UP | DATA=ZWORD,BCST=ON
6 | reserved | reserved
7 | DATA=ZWORD,SAE=YES,ROUND=ZERO | reserved
|00000000:62F16908DB4D01<6 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=0 ; No broadcast {16to16}.
|00000007:62F16918DB4D10<2 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=1 ; Broadcast one element {1to16}.
|0000000E:62F16928DB4D04<4 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=2 ; Broadcast four elements {4to16}.
|00000015:62F16948DB4D04<4 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=4 ; Convert from {uint8}.
|0000001C:62F16958DB4D04<4 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=5 ; Convert from {sint8}.
|00000023:62F16968DB4D02<5 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=6 ; Convert from {uint16}.
|0000002A:62F16978DB4D02<5 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=7 ; Convert from {sint16}.
|00000031: |
|00000031:62F1F9085A4D01<6 | VCVTPD2PS ZMM1,[RBP+40h],PREFIX=MVEX,OPER=0 ; No broadcast {8to8}.
|00000038:62F1F9185A4D08<3 | VCVTPD2PS ZMM1,[RBP+40h],PREFIX=MVEX,OPER=1 ; Broadcast one element {1to8}.
|0000003F:62F1F9285A4D02<5 | VCVTPD2PS ZMM1,[RBP+40h],PREFIX=MVEX,OPER=2 ; Broadcast four elements {4to8}.
|00000046: |
|00000046:62F1F9085ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=0 ; No swizzle {dcba}.
|0000004C:62F1F9185ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=1 ; Swap (inner) pairs {cdab}.
|00000052:62F1F9285ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=2 ; Swap with two-away {badc}.
|00000058:62F1F9385ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=3 ; Cross-product swizzle {dacb}.
|0000005E:62F1F9485ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=4 ; Broadcast a element to 4 {aaaa}.
|00000064:62F1F9585ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=5 ; Broadcast b element to 4 {bbbb}.
|0000006A:62F1F9685ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=6 ; Broadcast c element to 4 {cccc}.
|00000070:62F1F9785ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=7 ; Broadcast d element to 4 {dddd}.
|00000076: |
|00000076:62F1F9885ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=0 ; ROUND=NEAR,SAE=OFF {rn}.
|0000007C:62F1F9985ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=1 ; ROUND=DOWN,SAE=OFF {rd}.
|00000082:62F1F9A85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=2 ; ROUND=UP, SAE=OFF {ru}.
|00000088:62F1F9B85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=3 ; ROUND=ZERO,SAE=OFF {rz}.
|0000008E:62F1F9C85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=4 ; ROUND=NEAR,SAE=ON {rn-sae}.
|00000094:62F1F9D85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=5 ; ROUND=DOWN,SAE=ON {rd-sae}.
|0000009A:62F1F9E85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=6 ; ROUND=UP, SAE=ON {ru-sae}.
|000000A0:62F1F9F85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=7 ; ROUND=ZERO,SAE=ON {rz-sae}.
|000000A6:62F1F9F85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,ROUND=ZERO,SAE=ON
|000000AC:62F1F9F85ACA | VCVTPD2PS ZMM1,ZMM2,EH=1,ROUND=ZERO,SAE=ON
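
The placement of OPER= in bits 6, 5, 4 of the prefix can be checked against the fourth prefix byte in the listing above. In this Python sketch the position of EH in bit 7 and the low nibble 0x08 are read off the listing for these particular instructions (no opmask, low-numbered registers); they are assumptions of the sketch rather than a general encoding rule, and the function name is hypothetical.

```python
def mvex_last_prefix_byte(oper, eh=0, low_nibble=0x08):
    """Compose the fourth MVEX prefix byte from OPER= (bits 6..4) and EH= (bit 7)."""
    if not 0 <= oper <= 7:
        raise ValueError("OPER= must evaluate to 0..7")
    return ((eh & 1) << 7) | (oper << 4) | (low_nibble & 0x0F)

# Reproduces the bytes 08, 18 ... 78 (EH=0) and 88, 98 ... F8 (EH=1) seen in the listing.
print(" ".join(f"{mvex_last_prefix_byte(oper, eh):02X}" for eh in (0, 1) for oper in range(8)))
```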
↑ ALIGN=

Alignment request may be applied to any machine instruction, and to pseudoinstructions D, PROC, PROC1, STRUC. See the alignment paragraph for accepted values. This instruction modifier has the same effect as if explicit pseudoinstruction ALIGN was placed above the statement.
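
As a brief illustration, the sketch below puts the modifier form next to what should be its explicit equivalent. The labels Aligned1 and Aligned2 are made up for this example, and the operand form of the standalone ALIGN statement is an assumption; see the alignment paragraph for the values actually accepted.

```asm
Aligned1: MOV EAX,[ESI+16],ALIGN=DWORD ; Modifier form: alignment is requested by the instruction itself.

          ALIGN DWORD                  ; Assumed explicit form of the same request, placed above the statement.
Aligned2: MOV EAX,[ESI+16]
```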

↑ NESTINGCHECK=

This is a pseudoinstruction modifier, it can be applied only to pseudoinstructions PROC, ENDPROC, PROC1, ENDPROC1. Its value is boolean, default is NESTINGCHECK=ON. Switching the nesting control off will suppress error message on block mismatch. This enables to establish bounds between macros which enhance some block pseudoinstructions. See the definitions of macros Procedure and EndProcedure as an example.

↑ Instruction enhancements

FPU instruction default registers ↓
String instructions operands ↓
XLAT with nondefault [segment:base] ↓
LOOP with nondefault counter ↓
Near and far LOOP and JrCXZ ↓
Near and far Jcc ↓
PUSH, POP, INC, DEC multiple operands ↓
AAD, AAM operand ↓
TEST a register by itself ↓
Shift and rotate 2nd operand ↓
No-operation ↓
PINSR register source ↓
BLENDVPD, BLENDVPS, PBLENDVB 3rd operand ↓
MASKMOVQ, MASKMOVDQU 1st operand ↓
VERR, VERW, LAR, LSL ↓

Some x86 instructions work with registers fixed by design. €ASM accepts a voluntary explicit specification of such registers, which serves as documentation for the human reader and may sometimes be exploited as an address-size definition and/or a segment override.

↑ FPU instruction default registers

Unary FPU instructions with implicit destination ST0 may explicitly name this register as the first operand, or it may be omitted. In many other FPU instructions default destination is ST0 and default source is ST1, in which case one or both operands may be omitted. See also handlers of instructions FNOP, FCMOVB, FADD, FIADD, FADDP, FXCH, FCOM.

|00000000:000000000000F03F |Mem DQ 1.0
|00000008: |
|00000008:DAC1 | FCMOVB ; ST0 = ST1 if Below.
|0000000A:DAC1 | FCMOVB ST0,ST1 ; ST0 = ST1 if Below.
|0000000C: |
|0000000C:DAC7 | FCMOVB ST0,ST7 ; ST0 = ST7 if Below.
|0000000E:DAC7 | FCMOVB ST7 ; ST0 = ST7 if Below.
|00000010: |
|00000010:D8C1 | FADD ; ST0 += ST1.
|00000012:D8C1 | FADD ST0,ST1 ; ST0 += ST1.
|00000014: |
|00000014:DC05[00000000] | FADD ST0,[Mem] ; ST0 += [Mem].
|0000001A:DC05[00000000] | FADD [Mem] ; ST0 += [Mem].
|00000020: |
|00000020:DCC7 | FADD ST7,ST0 ; ST7 += ST0.
|00000022:DCC7 | FADD ST7 ; ST7 += ST0.
|00000024: |
|00000024:D9E9 | FLDL2T ; ST0 = log2(10).
|00000026:D9E9 | FLDL2T ST0 ; ST0 = log2(10).
↑ String instructions operands

String instructions implicitly address the source as memory [DS:rSI] or port DX, and the destination as memory [ES:rDI] or port DX. Besides the operandless version, €ASM accepts operand(s) which explicitly present the source and destination, with a possible segment override and address-size change.

|00000000:AC | LODSB
|00000001:AC | LODSB [DS:ESI] ; Default segment is DS, address-size is 32.
|00000002:2EAC | LODSB [CS:ESI] ; Segment override.
|00000004:67AC | LODSB [SI] ; Address-size changed.
|00000006: |
|00000006:AA | STOSB
|00000007:AA | STOSB [EDI]
|00000008: |
|00000008:AE | SCASB
|00000009:AE | SCASB [EDI]
|0000000A: |
|0000000A:A5 | MOVSD
|0000000B:A5 | MOVSD [EDI],[ESI]
|0000000C:2667A5 | MOVSD [DI],[ES:SI] ; Address-size and source segment changed.
|0000000F: |
|0000000F:666D | INSW
|00000011:666D | INSW [ES:EDI],DX
|00000013: |
|00000013:6E | OUTSB
|00000014:6E | OUTSB DX,[DS:ESI]
|00000015:2E6E | OUTSB DX,[CS:ESI] ; Source segment changed.
↑ XLAT with nondefault [segment:base]

Default translation table is implicitly addressed with [DS:rBX]. €ASM accepts optional memory operand which can specify nondefault segment override and nondefault rBX width.

↑ LOOP with nondefault counter

LOOP count register can be specified as the optional second operand.

|00000000:D7 | XLAT
|00000001:D7 | XLATB ; XLAT and XLATB are identical.
|00000002:D7 | XLATB [DS:EBX] ; Segment DS is default, no override necessary.
|00000003:26D7 | XLATB [ES:EBX]
|00000005:67D7 | XLATB [BX]
|00000007: |
|00000007:E2F6 | LOOP $-8
|00000009:E2F6 | LOOP $-8,ECX ; Default counter in 32bit mode is ECX.
|0000000B:67E2F5 | LOOP $-8,CX ; Counter register (its address-size) changed to 16 bit.
↑ Near and far LOOP and JrCXZ

Looping is not limited to short-range distance in €ASM. When the destination of LOOP, LOOPcc, JCXZ, JECXZ, JRCXZ is far or near (out of byte range), €ASM will assemble three instructions instead:

LOOP $+2+2 ; Loop to the proxy-jump instead of the original destination.
JMPS $+JMPSsize+JMPsize ; Skip the proxy-jump when the loop has finished (rCX is zero).
JMP target ; Near or far unconditional proxy-jump to the original destination.

|[CODE1] |[CODE1] SEGMENT
|00000000:E366 | JECXZ CloseLabel:
|00000002:E364 | JECXZ CloseLabel:,DIST=SHORT
|00000004:E302EB05E95B000000 | JECXZ CloseLabel:,DIST=NEAR
|0000000D:E302EB07EA[68000000]{0000}| JECXZ CloseLabel:,DIST=FAR
|00000018: |
|00000018:E302EB05E947010000 | JECXZ DistantLabel:
|00000021:E302EB05E93E010000 | JECXZ DistantLabel:,DIST=SHORT
|## W2401 Modifier "DIST=SHORT" could not be obeyed in this instruction.
|0000002A:E302EB05E935010000 | JECXZ DistantLabel:,DIST=NEAR
|00000033:E302EB07EA[68010000]{0000}| JECXZ DistantLabel:,DIST=FAR
|0000003E: |
|0000003E:E302EB07EA[00000000]{0000}| JECXZ FarLabel:
|00000049:E302EB07EA(00000000){0000}| JECXZ FarLabel:,DIST=SHORT
|## W2401 Modifier "DIST=SHORT" could not be obeyed in this instruction.
|00000054:E302EB05E9(00000000) | JECXZ FarLabel:,DIST=NEAR
|0000005D:E302EB07EA[00000000]{0000}| JECXZ FarLabel:,DIST=FAR
|00000068: |CloseLabel:
|00000068:909090909090909090909090~~| DB 256 * B 0x90 ; Some stuff to stall off the DistantLabel.
|00000168: |DistantLabel:
|[CODE2] |[CODE2] SEGMENT
|00000000: |FarLabel:
↑ Near and far Jcc

Conditional jump to the distance exceeding byte limit -128..127 was introduced with 386 CPU. When the program is intended to run on older processors as well, near and far conditional jump Jcc target will be assembled by €ASM as two instructions:

J!cc $+J!ccsize+JMPsize ; Skip the proxy-jump if inverted condition is true.
JMP target ; Near or far unconditional proxy-jump to the original destination.

Near proxy-jump instead of standard 386 near conditional jump is assembled when these three conditions are met:

  1. distance to target is out of byte range,
  2. segment width is 16,
  3. EUROASM option CPU= is 286 or lower.
|[CODE1] |[CODE1] SEGMENT WIDTH=16
| | EUROASM CPU=386
|0000:7419 | JE CloseLabel: ; Standard short conditional jump.
|0002:0F841501 | JE DistantLabel: ; Standard near conditional jump, available on CPU=386 and newer.
|0006:7505EA[0000]{0000}| JE FarLabel: ; Far unconditional proxy-jump skipped by inverted-condition J!cc.
| | EUROASM CPU=086 ; Following instructions should run on old PC/XT machine, too.
|000D:740C | JE CloseLabel: ; Standard short conditional jump.
|000F:7503E90701 | JE DistantLabel: ; Near unconditional proxy-jump skipped by inverted-condition J!cc.
|0014:7505EA[0000]{0000}| JE FarLabel ; Far unconditional proxy-jump skipped by inverted-condition J!cc.
|001B: |CloseLabel:
|001B:9090909090909090~~| DB 256 * B 0x90 ; Some stuff to stall off the DistantLabel.
|011B: |DistantLabel:
|[CODE2] |[CODE2] SEGMENT
|0000: |FarLabel:
↑ PUSH, POP, INC, DEC multiple operands

In many assemblers instructions PUSH, POP, INC, DEC may have just one operand. €ASM does not limit the number of operands, they are performed one by one in the specified order. If an instruction modifier or suffix is used, it applies to all operands.

|00000000:57FF370FA06A04 | PUSH EDI,[EDI],FS,4
|00000007:590FA18F0658 | POP ECX,FS,[ESI],EAX
|0000000D:40FF07 | INC EAX,[EDI],DATA=DWORD
|00000010:48664AFEC9 | DEC EAX,DX,CL

↑ AAD, AAM operand

Instructions AAD and AAM use radix 10 by default for adjusting AL before division or after multiplication of binary-coded decimals. In €ASM they accept an optional 8bit immediate operand, for instance AAD 16.

|00000000:D40A | AAM
|00000002:D40A | AAM 10
|00000004:D410 | AAM 16
|00000006:D50A | AAD
|00000008:D50A | AAD 10
|0000000A:D510 | AAD 16

↑ TEST a register by itself

When both operands in TEST instruction specify the same register, the second operand may be omitted.

↑ Shift and Rotate 2nd operand

When the number of bits to rotate or shift in instructions RCL, ROL, SAL, SHL, RCR, ROR, SAR, SHR is equal to 1, the second operand may be omitted.

|00000000:85D2 | TEST EDX,EDX
|00000002:85D2 | TEST EDX ; Operand2 of TEST is by default identical with Operand1.
|00000004: |
|00000004:D1D0 | RCL EAX,1
|00000006:D1D0 | RCL EAX ; Omitted rotate/shift count defaults to 1.
|00000008:D165F8 | SHL [EBP-8],1,DATA=DWORD
|0000000B:D165F8 | SHL [EBP-8],DATA=DWORD
↑ No-operation

Instruction which does nothing (no-operation) except for taking some time and incrementing instruction-pointer register, is implemented in all x86 processors as one-byte NOP, actually XCHG rAX,rAX (opcode 0x90). With Pentium II (CPU=686) Intel proposed dedicated multibyte no-operation instructions with opcodes 0x18..0x1F prefixed with 0x0F. Multibyte NOP is more suitable for alignment purposes than series of one-byte NOPs, as it's fetched and executed at once. On older CPU this real NOP must be emulated with legacy instructions, e.g. XCHG reg,reg or LEA reg,[reg].

[Sandpile] and [NasmInsns] define real-NOP as undocumented instructions HINT_NOP0, HINT_NOP1, HINT_NOP2..63 with one memory operand of the desired length. Instead of cluttering the instruction list with 64 new mnemonics, €ASM implements just one mnemonic HINT_NOP (suffixable as HINT_NOPW, HINT_NOPD, HINT_NOPQ) with the ordinal number defined in the first immediate operand, and the memory specification moved aside to the 2nd operand.

|00000000:0F18D9 | HINT_NOP 03q,ECX
|00000003:660F18E1 | HINT_NOP 04q,CX
|00000007:66670F182C | HINT_NOPW 05q,[SI]
|0000000C:66670F187400 | HINT_NOPW 06q,[SI],DISP=BYTE
|00000012:0F18BE00000000 | HINT_NOPD 07q,[ESI],DISP=DWORD
|00000019:0F19043500000000 | HINT_NOPD 10q,[1*ESI],DISP=DWORD,SCALE=VERBATIM
|00000021: |
|00000021:90 | NOP1
|00000022:6690 | NOP2
|00000024:0F1F00 | NOP3
|00000027:0F1F4000 | NOP4
|0000002B:0F1F442000 | NOP5
|00000030:660F1F442000 | NOP6
|00000036:0F1F8000000000 | NOP7
|0000003D:0F1F842000000000 | NOP8
|00000045:660F1F842000000000 | NOP9

Beside that, €ASM implements operandless instructions NOP1, NOP2, NOP3, NOP4, NOP5, NOP6, NOP7, NOP8, NOP9 which occupy the specified number of bytes, regardless of current CPU mode and level:

No-operation encoding

Mnemonic  Operation code (hexa)  Equivalent instruction in €ASM syntax

16bit mode, CPU=086
NOP1      90                     XCHG AX,AX
NOP2      87C9                   XCHG CX,CX
NOP3      9087C9                 XCHG AX,AX ; XCHG CX,CX
NOP4      87C987D2               XCHG CX,CX ; XCHG DX,DX
NOP5      9087C987D2             XCHG AX,AX ; XCHG CX,CX ; XCHG DX,DX
NOP6      87C987D287DB           XCHG CX,CX ; XCHG DX,DX ; XCHG BX,BX
NOP7      9087C987D287DB         XCHG AX,AX ; XCHG CX,CX ; XCHG DX,DX ; XCHG BX,BX
NOP8      87C987D287DB87E4       XCHG CX,CX ; XCHG DX,DX ; XCHG BX,BX ; XCHG SP,SP
NOP9      9087C987D287DB87E4     XCHG AX,AX ; XCHG CX,CX ; XCHG DX,DX ; XCHG BX,BX ; XCHG SP,SP

16bit mode, CPU=686
NOP1      90                     NOP DATA=WORD
NOP2      6690                   OTOGGLE NOP
NOP3      666790                 OTOGGLE ATOGGLE NOP
NOP4      670F1F00               NOP [EAX],DATA=WORD
NOP5      670F1F4000             NOP [EAX],DATA=WORD,DISP=BYTE
NOP6      670F1F442000           NOP [EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=BYTE
NOP7      66670F1F442000         NOP [EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP8      670F1F8000000000       NOP [EAX],DATA=WORD,DISP=DWORD
NOP9      670F1F842000000000     NOP [EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=DWORD

32bit mode, CPU=386
NOP1      90                     XCHG EAX,EAX,DATA=DWORD
NOP2      6690                   XCHG AX,AX,DATA=WORD
NOP3      8D4000                 LEA EAX,[EAX],DATA=DWORD
NOP4      8D442000               LEA EAX,[EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP5      3E8D442000             LEA EAX,[DS:EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP6      8D8000000000           LEA EAX,[EAX],DATA=DWORD,DISP=DWORD
NOP7      8D842000000000         LEA EAX,[EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=DWORD
NOP8      3E8D842000000000       LEA EAX,[DS:EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=DWORD
NOP9      663E8D842000000000     LEA AX,[DS:EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=DWORD

32bit mode, CPU=686
NOP1      90                     NOP DATA=DWORD
NOP2      6690                   NOP DATA=WORD
NOP3      0F1F00                 NOP [EAX],DATA=DWORD
NOP4      0F1F4000               NOP [EAX],DATA=DWORD,DISP=BYTE
NOP5      0F1F442000             NOP [EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP6      660F1F442000           NOP [EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=BYTE
NOP7      0F1F8000000000         NOP [EAX],DATA=DWORD,DISP=DWORD
NOP8      0F1F842000000000       NOP [EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=DWORD
NOP9      660F1F842000000000     NOP [EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=DWORD

64bit mode, CPU=X64
NOP1      90                     NOP DATA=DWORD
NOP2      6690                   NOP DATA=WORD
NOP3      0F1F00                 NOP [RAX],DATA=DWORD
NOP4      0F1F4000               NOP [RAX],DATA=DWORD,DISP=BYTE
NOP5      0F1F442000             NOP [RAX+0*RAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP6      660F1F442000           NOP [RAX+0*RAX],DATA=WORD,SCALE=VERBATIM,DISP=BYTE
NOP7      0F1F8000000000         NOP [RAX],DATA=DWORD,DISP=DWORD
NOP8      0F1F842000000000       NOP [RAX+0*RAX],DATA=DWORD,SCALE=VERBATIM,DISP=DWORD
NOP9      660F1F842000000000     NOP [RAX+0*RAX],DATA=WORD,SCALE=VERBATIM,DISP=DWORD

↑ PINSR register source

Instructions PINSRB, PINSRW, PINSRD (insert Byte/Word/Dword into destination register XMM) accept as source register (operand 2) not only GPR with the corresponding width, but any wider register. Only lowest byte/word/dword from this register is used.

|00000000:660F3A20C902 | PINSRB XMM1,CL,2
|00000006:660F3A20C902 | PINSRB XMM1,CX,2
|0000000C:660F3A20C902 | PINSRB XMM1,ECX,2
|00000012: |
|00000012:660FC4C902 | PINSRW XMM1,CX,2
|00000017:660FC4C902 | PINSRW XMM1,ECX,2
↑ BLENDVPS, BLENDVPD, PBLENDVB 3rd operand

Instructions for variable blending use the fixed implied register XMM0 as a mask register. €ASM allows explicit specification of XMM0 as the third operand.

|00000000:660F3815CA | BLENDVPD XMM1,XMM2
|00000005:660F3815CA | BLENDVPD XMM1,XMM2,XMM0
|0000000A: |
|0000000A:660F3814CA | BLENDVPS XMM1,XMM2
|0000000F:660F3814CA | BLENDVPS XMM1,XMM2,XMM0
|00000014: |
|00000014:660F3810CA | PBLENDVB XMM1,XMM2
|00000019:660F3810CA | PBLENDVB XMM1,XMM2,XMM0
↑ MASKMOVQ, MASKMOVDQU 1st operand

Maskable copy to memory uses [DS:rDI] as fixed destination. €ASM allows explicit specification of destination memory as the optional first operand.

|00000000:0FF7CA | MASKMOVQ MM1,MM2
|00000003:0FF7CA | MASKMOVQ [DS:EDI],MM1,MM2 ; Default destination is [DS:EDI].
|00000006:260FF7CA | MASKMOVQ [ES:EDI],MM1,MM2 ; Segment override.
|0000000A: |
|0000000A:660FF7CA | MASKMOVDQU XMM1,XMM2
|0000000E:660FF7CA | MASKMOVDQU [DS:EDI],XMM1,XMM2 ; Default destination is [DS:EDI].
|00000012:26660FF7CA | MASKMOVDQU [ES:EDI],XMM1,XMM2 ; Segment override.
↑ VERR, VERW, LAR, LSL

Segment descriptor in system instruction VERR, VERW (operand 1) and LAR, LSL (operand 2) may be specified as 16bit memory variable or 16, 32 or 64bit GPR (only lower 16 bits are used).

|00000000:0F00E6 | VERR SI
|00000003:0F00E6 | VERR ESI
|00000006: |
|00000006:0F00EE | VERW SI
|00000009:0F00EE | VERW ESI
|0000000C: |
|0000000C:660F02C6 | LAR AX,SI
|00000010:660F02C6 | LAR AX,ESI
|00000014:0F02C6 | LAR EAX,SI
|00000017:0F02C6 | LAR EAX,ESI
|0000001A: |
|0000001A:660F03C6 | LSL AX,SI
|0000001E:660F03C6 | LSL AX,ESI
|00000022:0F03C6 | LSL EAX,SI
|00000025:0F03C6 | LSL EAX,ESI

Undocumented instructions ↓

€ASM supports a few instructions which are not documented in the official specifications published by CPU manufacturers. They may not work with all processor generations and they require the explicit feature EUROASM UNDOC=ENABLED.

For more information see instruction handlers BB0_RESET, CMPXCHG486, F4X4, FCOM2, FCOMP5, FFREEP, FMUL4X4, FNSETPM, FRSTPM, FSBP1, FSBP2, FSBP3, FSTDW, FSTP1, FSTP8, FSTP9, FSTSG, FXCH4, FXCH7, HCF, HINT_NOP, IBTS, ICEBP, INT1, JMPE, LOADALL, LOADALL286, PREFETCHWT1, PSRAQ, SAL2, SALC, SETALC, SMINTOLD, TEST2, UD0, UD1, UD2A, UMOV, XBTS, VLDQQU.

↑ Pseudoinstructions

ALIGN ↓

D, DB, DU, DW, DD, DQ, DT, DO, DY, DZ, DI, DS ↓

ENDHEAD ↓

ENDP ↓

ENDP1 ↓

ENDPROC ↓

ENDPROC1 ↓

ENDPROGRAM ↓

ENDSTRUC ↓

EQU ↓

= ↓

EUROASM ↓

EXTERN ↓

EXPORT ↓

GLOBAL ↓

GROUP ↓

HEAD ↓

IMPORT ↓

INCLUDE ↓

INCLUDE1 ↓

INCLUDEBIN ↓

INCLUDEHEAD ↓

INCLUDEHEAD1 ↓

LINK ↓

PROC ↓

PROC1 ↓

PROGRAM ↓

PUBLIC ↓

SEGMENT ↓

STRUC ↓

%COMMENT ↓

%DEBUG ↓

%DISPLAY ↓

%DROPMACRO ↓

%ELSE ↓

%ENDCOMMENT ↓

%ENDFOR ↓

%ENDIF ↓

%ENDMACRO ↓

%ENDREPEAT ↓

%ENDWHILE

%ERROR ↓

%EXITFOR ↓

%EXITMACRO ↓

%EXITREPEAT ↓

%EXITWHILE

%FOR ↓

%IF ↓

%MACRO ↓

%PROFILE ↓

%REPEAT ↓

%SET ↓

%SETA ↓

%SETB ↓

%SETC ↓

%SETE ↓

%SETL ↓

%SETS ↓

%SETX ↓

%SET2 ↓

%SHIFT ↓

%UNTIL ↓

%WHILE


↑ EUROASM

AUTOALIGN= ↓
AUTOSEGMENT= ↓
CODEPAGE= ↓
CPU= ↓
CPU features ABM=, AES=, AMD=, AVX=, AVX512=, CYRIX=, D3NOW=, EVEX=, FMA=, FPU=, LWP=, MMX=, MPX=, MVEX=, PRIV=, PROT=, RTM=, SGX=, SHA=, SPEC=, SVM=, TBM=, TSX=, UNDOC=, VIA=, VMX=, XOP= ↓
DEBUG= ↓
DISPLAYENC= ↓
DISPLAYSTM= ↓
DUMPALL= ↓
DUMP= ↓
DUMPWIDTH= ↓
INCLUDEPATH= ↓
LINKPATH= ↓
LIST= ↓
LISTFILE= ↓
LISTINCLUDE= ↓
LISTMACRO= ↓
LISTREPEAT= ↓
LISTVAR= ↓
MAXINCLUSIONS= ↓
MAXLINKS= ↓
NOWARN= ↓
PROFILE= ↓
SIMD= ↓
UNICODE= ↓
WARN= ↓

With pseudoinstruction EUROASM programmer controls various settings of EuroAssembler - EUROASM options. Particular options are set with keyword operands. The same keywords are used in [EUROASM] section of euroasm.ini configuration file.

Options specified with this pseudoinstruction rewrite default options set in configuration file. Names of options are case-insensitive.

Options which expect Boolean value may be provided with enumerated tokens TRUE, YES, ON, ENABLE, ENABLED or FALSE, NO, OFF, DISABLE, DISABLED (case insensitive) or they may contain logical expression.
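
A few equivalent spellings are sketched below, using only options listed in this chapter. The last line assumes that the system %variable %^DEBUG expands to a value usable as a logical expression in this position; treat it as an illustration rather than verified syntax.

```asm
        EUROASM AUTOALIGN=ON        ; Enumerated token.
        EUROASM AUTOALIGN=yes       ; The same setting, tokens are case-insensitive.
        EUROASM LISTMACRO=DISABLED  ; Any of FALSE, NO, OFF, DISABLE, DISABLED switches the option off.
        EUROASM LISTVAR=%^DEBUG     ; Assumed: a logical expression in place of a token.
```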

Besides keyword options the EUROASM pseudoinstruction also recognizes ordinal operand(s), each of which may be one of the enumerated tokens PUSH or POP. €ASM maintains a special option stack and these two directives save the whole set of options to this stack and retrieve it again. This feature is handy in macros which temporarily need some special option. Setting an option in the macro permanently would have a side effect on the statements following the macro invocation, because EUROASM is a switching statement. When we save the current options on the stack at the beginning of the macro and restore them at the end, other statements will not be influenced. Example:

SomeMacro %MACRO  ; Macro definition.
            EUROASM PUSH, NOWARN=1234 ; Store all options to option-stack and suppress warning W1234.
             ; Here go instructions which may emit warning message W1234
             ...
            EUROASM POP ; Restore option-stack, W1234 is no longer suppressed.
          %ENDMACRO SomeMacro 
↑ AUTOALIGN=

Boolean option; default is AUTOALIGN=ON. Memory variables created with D pseudoinstruction will be implicitly aligned according to their TYPE#.

Aligned memory-variables can be accessed faster, on the other hand this option may blow up the size of your program if data definition of various types are mixed frequently. It's better to manually group data of the same size, so the alignment stuff is used only once per group.

Memory variables defined as literals are always autoaligned, regardless of EUROASM AUTOALIGN= status.
↑ AUTOSEGMENT=

Boolean option; default is AUTOSEGMENT=ON. The section which the current statement emits to is implicitly changed by €ASM according to the purpose of the statement. If the statement is a machine instruction or prefix, €ASM will switch to the most recently defined code section. Similarly, when the statement defines data, the current section is switched to the last data or bss section. Nonemitting statements, such as EQU or a solo label, do not change the current section.

Each explicit change of current section resets the value AUTOSEGMENT to OFF as a side effect.

AUTOSEGMENT= is a weak option: it is automatically switched off when the programmer changes the current section explicitly with [section_name] in the label field of a statement. If you want to keep AUTOSEGMENT enabled after a manual change of section, you need to explicitly switch it back on with EUROASM AUTOSEGMENT=ON, or save its state using EUROASM PUSH and restore it with EUROASM POP afterwards.
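
The PUSH/POP variant might look like the sketch below. The section name [Strings] and the label Message are hypothetical, and the section is assumed to have been defined earlier with a SEGMENT statement.

```asm
        EUROASM PUSH          ; Save all options, AUTOSEGMENT= among them.
[Strings]                     ; Explicit change of the current section resets AUTOSEGMENT= to OFF.
Message DB "Hello"            ; Emitted to [Strings].
        EUROASM POP           ; AUTOSEGMENT= returns to its saved state.
```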

↑ CODEPAGE=

€ASM can use Unicode strings but in the data definitions in source program they are defined in 8 bit ASCII. Option CODEPAGE= tells €ASM which code page it should internally use for conversion to Unicode.

Codepage may be specified with a direct 16bit integer value, for instance CODEPAGE=1253.

Codepage values can also be specified as an enumerated token, such as CODEPAGE=CP852, CODEPAGE=WINDOWS-1252, CODEPAGE=ISO-8859-2 etc, see DictCodePages for the complete list. Names of those specification are case insensitive.

Though some of those enumerated codepage constants may look like an arithmetic subtraction, they are recognized as verbatim tokens and not evaluated.

The factory default is CODEPAGE=UTF-8.
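
Both ways of specifying the codepage are shown in this short sketch, which uses only values quoted in the text above:

```asm
        EUROASM CODEPAGE=1253        ; Direct 16bit integer value.
        EUROASM CODEPAGE=ISO-8859-2  ; Enumerated token, taken verbatim; nothing is subtracted.
        EUROASM CODEPAGE=UTF-8       ; Back to the factory default.
```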

↑ INCLUDEPATH=

When an included file is specified without a path, €ASM will search for this file in the directories which are defined in the INCLUDEPATH= option. Paths can be separated with a semicolon ; or comma , and the whole list should be in double quotes. Both backward \ and forward slashes / may be used as the folder separator. The last slash can be omitted. Default is INCLUDEPATH="./,./maclib,../maclib,".
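
For instance, a project-specific macro directory could be added to the search list as sketched here. The directory ../shared/maclib is a made-up name used only for illustration.

```asm
        EUROASM INCLUDEPATH="./;./maclib;../shared/maclib" ; Semicolon-separated list in double quotes, last slash omitted.
```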

↑ LINKPATH=

When a linked file is specified without a path, €ASM will search for this file in the directories which are defined in the LINKPATH= option. Paths can be separated with a semicolon ; or comma , and the whole list should be in double quotes. Both backward \ and forward slashes / may be used as the folder separator. The last slash can be omitted. Default is LINKPATH="./,./objlib,../objlib,".

↑ MAXINCLUSIONS=

Parameter MAXINCLUSIONS limits the maximal number of successful executions of INCLUDE* statements in an €ASM source. This protects the assembler from resource exhaustion in case of a recursive inclusion loop.

Default value is EUROASM MAXINCLUSIONS=64.

↑ MAXLINKS=

Parameter MAXLINKS limits the maximal number of files specified by LINK statements in an €ASM source. This protects the assembler from resource exhaustion in case of a recursive link loop.

Default value is EUROASM MAXLINKS=64.

↑ Processor generation option CPU=

Not all IA-32 machine instructions are available on all types of Central Processing Unit (CPU). This EUROASM option specifies the minimal type of CPU which the program is intended for. Possible CPU= values are 086 alias 8086, 186, 286, 386, 486, 586 alias PENTIUM, 686 alias P6, X64. Default is EUROASM CPU=586.

↑ Processor features

This group of EUROASM boolean options tells €ASM which CPU features are required on the target computer. By default all options are switched OFF; you should explicitly enable each capability which you intend to program for.

ABM=: assembly of Advanced Bit Manipulation instructions.

AES= assembly of Intel's Advanced Encryption Standard (AESNI) instructions.

AMD= instructions specific for AMD CPU manufacturer.

CYRIX= instructions specific for CYRIX CPU manufacturer.

D3NOW= assembly of AMD 3DNow! instructions.

EVEX= assembly of Intel's EVEX-encoded AVX-512 instructions.

FMA=: assembly of Fused Multiply-Add instructions.

FPU= assembly of Floating-Point Unit instructions (math coprocessor).

LWP= assembly of AMD's LightWeight Profiling instructions.

MMX=: assembly of MultiMedia Extensions.

MPX=: assembly of Memory Protection Extensions.

MVEX= assembly of Intel's MVEX-encoded AVX-512 instructions.

PRIV=: assembly of privileged mode instructions.

PROT=: assembly of protected mode instructions.

SGX=: assembly of Software Guard Extensions.

SHA= assembly of Intel's Secure Hash Algorithm instructions.

SPEC= assembly of other special instructions.

SVM=: assembly of Shared Virtual Memory instructions.

TSX=: assembly of Intel's Transactional Synchronization Extensions.

UNDOC= assembly of undocumented instructions.

VIA= instructions specific for VIA Geode CPU manufacturers.

VMX= assembly of Virtual Machine Extensions.

XOP= assembly of AMD's XOP-encoded AVX instructions.

↑ Streaming SIMD Extension generation option SIMD=

This option defines which Single Instruction Multiple Data (SIMD) generation is required to assemble following instructions. Possible enumerated values are SSE alias SSE1 alias boolean true, SSE2, SSE3, SSSE3, SSE4, SSE4.1, SSE4.2, AVX, AVX2, AVX512. Default value is boolean false (no SIMD instructions are expected).

CPU generation, CPU features, SIMD generation options do not restrain €ASM from assembling instructions for higher CPU but a warning is issued when the instruction requires some capability currently not enabled with EUROASM. This should warn you that your program may not run on every PC, or that you may have made a typo in instruction mnemonics.
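
A minimal sketch of declaring the target capabilities up front follows. The option values and the instructions are taken from this manual; the particular combination is only an example.

```asm
        EUROASM CPU=686, SIMD=SSE4.1, FPU=ON ; Target: P6-class CPU with SSE4.1 and a math coprocessor.
        PINSRB XMM1,CL,2                     ; SSE4.1 instruction, covered by SIMD=SSE4.1.
        BLENDVPD XMM1,XMM2                   ; SSE4.1 instruction as well.
        FADD ST0,ST1                         ; FPU instruction, covered by FPU=ON.
```

With the EUROASM statement left out, the same instructions would still be assembled, but accompanied by warnings as described above.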
↑ DISPLAYSTM=
↑ DISPLAYENC=

These boolean options are designed for debugging the assembly process, see also pseudoinstruction %DISPLAY. When enabled, €ASM inserts a diagnostic message below each assembled statement, which displays how the statement is parsed into fields and what modifiers are used for instruction encoding. Example:

    EUROASM DISPLAYSTM=ON
.L: MOV EAX,[ESI+16],ALIGN=DWORD
    EUROASM DISPLAYSTM=OFF, DISPLAYENC=ON
    LEA EDX,[ESI+16]
    ADD EAX,EDX

Listing of previous example is here:

| | EUROASM DISPLAYSTM=ON
|00000000:8B4610 |.L: MOV EAX,[ESI+16],ALIGN=DWORD
|# D1010 **** DISPLAYSTM ".L: MOV EAX,[~~ALIGN=DWORD "
|# D1020 label=".L"
|# D1040 machine operation="MOV"
|# D1050 ordinal operand number=1,value="EAX"
|# D1050 ordinal operand number=2,value="[ESI+16]"
|# D1060 keyword operand,name="ALIGN",value="DWORD"
| | EUROASM DISPLAYSTM=OFF, DISPLAYENC=ON
|# D1010 **** DISPLAYSTM "EUROASM DISPL~~SPLAYENC=ON "
|# D1040 pseudo operation="EUROASM"
|# D1060 keyword operand,name="DISPLAYSTM",value="OFF"
|# D1060 keyword operand,name="DISPLAYENC",value="ON"
|00000003:8D5610 | LEA EDX,[ESI+16]
|# D1080 Emitted size=3,DATA=DWORD,DISP=BYTE,SCALE=SMART,ADDR=ABS.
|00000006:01D0 | ADD EAX,EDX
|# D1080 Emitted size=2,CODE=SHORT,DATA=DWORD.
↑ DUMP=
↑ DUMPWIDTH=
↑ DUMPALL=

Options DUMP=, DUMPWIDTH= and DUMPALL= control how the dump column with emitted code is presented in listing.

Boolean option DUMP= can switch off the dump completely, the listing copies the input source almost verbatim in this case. Default is DUMP=ON.

DUMPWIDTH= sets the width of dump column in €ASM listing. This option specifies how many characters of dumped data will fit between the starting | and ending | including those two border characters. Default value is 27 which is enough for 8byte long instruction.
Dump data consists of an offset (4 or 8 hexadecimal characters, depending on section width), separator : and 2 hexadecimal digits per each byte of generated code.

Minimal usable dump width is 13 in a 32bit section and 9 in a 16bit section, which displays only one byte of dumped data. Lower values of DUMPWIDTH= will switch the whole dump column off, which is the same as specifying DUMP=OFF.

When the generated code is too long to fit into the dump column, the Boolean option DUMPALL= decides if the rest will be omitted (the omission is indicated by tilde ~ in place of the last character), or if additional lines will be inserted to the listing until all generated code is dumped. Factory default is DUMPALL=OFF.

Be careful when setting DUMPALL=ON with long duplicated data definition, such as DB 2048 * B 0, because this may clutter the listing with many lines of useless dump.
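
The width arithmetic can be worked through for a concrete case. Following the rule above, a 32bit section needs 1 + 8 + 1 + 2*N + 1 characters to show N bytes, which gives 27 for the default N=8 and 13 for N=1. The sketch below widens the column to make room for 12 bytes (1 + 8 + 1 + 24 + 1 = 35); the value 35 is just an example.

```asm
        EUROASM DUMPWIDTH=35, DUMPALL=OFF ; | + 8 offset digits + : + 24 code digits + | = 35 characters.
```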
↑ LISTFILE=

This option defines the name of the listing file. By default it is LISTFILE="%^SourceName%^SourceExt.lst", i.e. it copies the name and extension of the source file and appends .lst to it.
If not specified otherwise, listing is always created in the same directory as the corresponding source file.

↑ LIST=
↑ LISTINCLUDE=
↑ LISTMACRO=
↑ LISTREPEAT=
↑ LISTVAR=

LIST* family of options controls what should be copied to the listing file. Boolean option LIST=OFF will suppress the generation of listing until it is switched on again. Default is LIST=ON.
Note that switching off even a minor part of the listing makes the listing file unusable as a source file, because some parts are not copied by €ASM from the source to the listing.

Contents of included files are by default omitted from the listing (LISTINCLUDE=OFF). When this option is ON, the INCLUDE statement will be replaced by the contents of file.

LISTMACRO= controls whether the instructions from macro expansion go to the listing. Default state is LISTMACRO=OFF and only the macroinstruction itself is presented.

EUROASM option LISTREPEAT= is similar to LISTMACRO= with the difference that it controls listing of statements expanded in %FOR, %WHILE and %REPEAT blocks.

When a preprocessing %variable is used in the statement and the option LISTVAR=ON, the statement is repeated in the form of a machine comment just below the original statement and the expanded text is shown instead of %variables. Factory default is LISTVAR=OFF.
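
Because the LIST* options are ordinary EUROASM options, they can be switched on around a single statement with the option stack. In this sketch SomeMacro stands for the macro from the PUSH/POP example earlier in this chapter.

```asm
        EUROASM PUSH, LISTMACRO=ON, LISTVAR=ON ; Save options, then list expanded statements and %variables.
        SomeMacro                              ; The expansion of this macro now goes to the listing.
        EUROASM POP                            ; LISTMACRO= and LISTVAR= return to their previous state.
```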

↑ UNICODE=

This boolean option specifies if quoted string in data definition, such as D "an explicit string" or ="a literal string" should be treated as a sequence of bytes (8bit characters) or unichars (16bit characters).

System %variable %^UNICODE may be tested in macros to determine which version of a Windows API function (ANSI or WIDE) should be invoked.

↑ DEBUG=

This boolean option specifies if debug version should be assembled. When EUROASM DEBUG=ENABLED, linker includes symbol table or other debugging information to the output program. Macros can change their behaviour depending on condition %IF %^DEBUG.

The final release should be assembled with this option turned off.
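
The condition mentioned above can guard statements that should exist in the debug version only. The breakpoint instruction is merely an example of such a statement.

```asm
        EUROASM DEBUG=ON
        %IF %^DEBUG
          INT 3        ; Breakpoint, assembled only when DEBUG= is enabled.
        %ENDIF
```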

↑ PROFILE=

This boolean option specifies if profileable version should be assembled. Profiling is not implemented yet in this version of EuroAssembler.

The final release should be assembled with this option turned off.

↑ WARN=
↑ NOWARN=

Options WARN= and NOWARN= control which informative and warning messages will be issued in the assembly process. With NOWARN= it is possible to suppress anticipated messages with identification number below W4000. Suppressed warnings do not affect the errorlevel. User-generated warnings (U5000..U5999) and errors with higher severity cannot be suppressed.

The value of the option is either a number or a range of numbers, which cannot exceed 3999. WARN= and NOWARN= operands may repeat in a statement; they are processed from left to right. For instance EUROASM NOWARN=0600..0999, WARN=705 will suppress informative messages I0600 to I0999 except for message I0705, which remains enabled.

Default value is WARN=0..4999 (all messages enabled).
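
As a further illustration, warning W2401 from the JECXZ listing earlier in this manual could be silenced for one statement only. DistantLabel is assumed to be defined out of byte range, as in that listing.

```asm
        EUROASM PUSH, NOWARN=2401      ; Save options and suppress W2401 only.
        JECXZ DistantLabel:,DIST=SHORT ; DIST=SHORT cannot be obeyed here, but W2401 is not reported.
        EUROASM POP                    ; W2401 is reported again from here on.
```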


↑ PROGRAM

↑ ENDPROGRAM

DLLCHARACTERISTICS= ↓
ENTRY= ↓
FILEALIGN= ↓
FORMAT= ↓
ICONFILE= ↓
IMAGEBASE= ↓
LISTLITERALS= ↓
LISTGLOBALS= ↓
LISTMAP= ↓
MAJORIMAGEVERSION= ↓
MAJORLINKERVERSION= ↓
MAJOROSVERSION= ↓
MAJORSUBSYSTEMVERSION= ↓
MAXEXPANSIONS= ↓
MAXPASSES= ↓
MINORIMAGEVERSION= ↓
MINORLINKERVERSION= ↓
MINOROSVERSION= ↓
MINORSUBSYSTEMVERSION= ↓
MODEL= ↓
OUTFILE= ↓
SECTIONALIGN= ↓
SIZEOFHEAPCOMMIT= ↓
SIZEOFHEAPRESERVED= ↓
SIZEOFSTACKCOMMIT= ↓
SIZEOFSTACKRESERVED= ↓
STUBFILE= ↓
SUBSYSTEM= ↓
TIMESTAMP= ↓
WIDTH= ↓
WIN32VERSIONVALUE= ↓

Pseudoinstructions PROGRAM and ENDPROGRAM delimit a block of source code which creates a standalone output file. In most other assemblers the whole source file creates the output file, sometimes called a module. For instance, the command nasm -f win32 HelloWorld.asm tells NetWide Assembler to create a COFF output file HelloWorld.obj. With €ASM more than one output file can be created with the command euroasm HelloWorld.asm, provided that there are more PROGRAM / ENDPROGRAM blocks in HelloWorld.asm.

The label of PROGRAM statement represents the name of output program. Although it does not define a symbol, its name must follow the rules for symbol names, i.e. at least one letter followed with letters and digits. The same identifier may be used as operand %1 in the corresponding ENDPROGRAM statement.

One source may contain more program blocks and the blocks may nest. Each program block assembles to a different output file.

Symbols defined in a program are not visible outside the block. When a program needs to call a label from another program, the label must be declared as public in the program which defines it and as extern in the program which calls it, even when both programs lie in the same source file or one program is nested in the other.

Preprocessing %variables, macro definitions and EUROASM options, on the other hand, are visible throughout the source and they can carry information between programs at assembly time.

The PROGRAM pseudoinstruction has many important keyword operands which specify properties of the output file. The same keywords are used in the [PROGRAM] section of the euroasm.ini configuration file.

Unlike EUROASM options, which affect only a division of the source, PROGRAM properties apply to the whole program en bloc. We cannot have half of the program with the graphic subsystem and the other half with the console subsystem, for instance. That is why options LISTMAP=, LISTGLOBALS=, LISTLITERALS= are properties of pseudoinstruction PROGRAM, but LISTINCLUDE=, LISTMACRO=, LISTREPEAT=, LISTVAR= are properties of pseudoinstruction EUROASM.
↑ FORMAT=

The format and file extension of the output file are determined by this parameter.

€ASM output file formats
FORMAT=  Default file extension  Default program width  Default program model  Description
BIN      .bin                    16 bits                TINY                   Binary file
COM      .com                    16 bits                TINY                   DOS/CPM 16bit executable
OMF      .obj                    16 bits                SMALL                  OMF relocatable Object Module Format
LIBOMF   .lib                    16 bits                SMALL                  Object library in OMFormat
MZ       .exe                    16 bits                SMALL                  DOS 16bit executable
COFF     .obj                    32 bits                FLAT                   Common Object File Format in Microsoft specification
LIBCOF   .lib                    32 bits                FLAT                   Object library in COFFormat
PE       .exe                    32 bits                FLAT                   Portable executable, COFF based
DLL      .dll                    32 bits                FLAT                   Dynamic Linked Library, COFF based
↑ WIDTH=

This parameter specifies the operating mode of the program.

Program width also defines the default width for all its segments. Its value is a numeric expression which evaluates to 16, 32, 64, or to 0. An empty or zero value (factory default) specifies that the program width should be set internally by €ASM according to its FORMAT=. Nevertheless, when a segment is defined, it may specify a different width, regardless of the default width of its program.
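The way a segment arrives at its width can be sketched in Python as follows. This is an illustrative model only: the function name is hypothetical, and the per-format defaults are copied from the table of output file formats in the FORMAT= chapter.

```python
# Default program width per FORMAT=, as listed in the output formats table.
DEFAULT_WIDTH = {
    "BIN": 16, "COM": 16, "OMF": 16, "LIBOMF": 16, "MZ": 16,
    "COFF": 32, "LIBCOF": 32, "PE": 32, "DLL": 32,
}


def effective_width(fmt, program_width=0, segment_width=None):
    """Return the width used by a segment.

    program_width 0 means "not specified, derive it from FORMAT=".
    segment_width None means the segment did not specify WIDTH= itself.
    """
    if program_width not in (0, 16, 32, 64):
        raise ValueError("PROGRAM WIDTH= must be 16, 32, 64 or 0")
    width = program_width or DEFAULT_WIDTH[fmt.upper()]
    if segment_width is not None:
        if segment_width not in (16, 32, 64):
            raise ValueError("SEGMENT WIDTH= must be 16, 32 or 64")
        width = segment_width       # the segment overrides its program
    return width


print(effective_width("PE"), effective_width("PE", 64), effective_width("MZ", 0, 32))
```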

↑ MODEL=

Memory model describes sizes and distances of code and data, and the number of code and noncode segments. The main function of memory model specification is to set default distance for segments and procedures defined in the program.
Program property MODEL= is queried in procedure pseudoinstructions (PROC, PROC1) and in control-transfer instructions (JMP, CALL, RET) without explicitly specified distance. In multicode models (MEDIUM, LARGE, HUGE) the default transfer distance is FAR, otherwise NEAR.

In monodata models (TINY, SMALL, MEDIUM, FLAT) all data are addressed relative to the start of the data segment. In multidata models (COMPACT, LARGE, HUGE) it is the programmer's responsibility to load the used segment register with the paragraph address of the data before they are accessed.

Properties implied by memory model
         Default segment properties                 Link properties                      Usual usage
MODEL=   CODE distance  DATA distance  Segm. width  Multicode  Multidata  Segm. overlap  CPU mode   Used in formats
TINY     NEAR           NEAR           16           no         no         yes            real       COM
SMALL    NEAR           NEAR           16           no         no         no             real       MZ, OMF
MEDIUM   FAR            NEAR           16           yes        no         no             real       MZ, OMF
COMPACT  NEAR           FAR            16           no         yes        no             real       MZ, OMF
LARGE    FAR            FAR            16           yes        yes        no             real       MZ, OMF
HUGE     FAR            FAR            32           yes        yes        no             real       MZ, OMF
FLAT     NEAR           NEAR           32,64        no         no         yes            protected  PE, DLL, COFF
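The default distances implied by MODEL= follow directly from whether the model is multicode or multidata. The Python sketch below restates that rule; the function names are hypothetical and the code is a model of the table, not of €ASM internals.

```python
MULTICODE_MODELS = {"MEDIUM", "LARGE", "HUGE"}
MULTIDATA_MODELS = {"COMPACT", "LARGE", "HUGE"}
ALL_MODELS = {"TINY", "SMALL", "MEDIUM", "COMPACT", "LARGE", "HUGE", "FLAT"}


def default_code_distance(model):
    """Distance assumed by PROC, JMP, CALL, RET when none is given."""
    model = model.upper()
    if model not in ALL_MODELS:
        raise ValueError("unknown memory model " + model)
    return "FAR" if model in MULTICODE_MODELS else "NEAR"


def default_data_distance(model):
    """Distance of data: FAR only in multidata models."""
    model = model.upper()
    if model not in ALL_MODELS:
        raise ValueError("unknown memory model " + model)
    return "FAR" if model in MULTIDATA_MODELS else "NEAR"


for name in ("TINY", "MEDIUM", "COMPACT", "FLAT"):
    print(name, default_code_distance(name), default_data_distance(name))
```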
↑ SUBSYSTEM=

Subsystem is a numeric identifier in the header of a Portable Executable file. This parameter specifies whether Windows should create a new console when the PE program starts. Default is SUBSYSTEM=CON. Set it to GUI when your PE program creates graphical windows rather than using standard text input and output. The value of subsystem is one of the enumerated tokens from the table below or a numeric expression which evaluates to the corresponding number.

Subsystems table
SUBSYSTEM=  Value   Remark
0           0       Unknown subsystem.
1           NATIVE  Subsystem is not used, i.e. device driver.
2           GUI     Windows GUI graphical windows.
3           CON     Windows console (character subsystem).
5           OS2     OS/2 character subsystem.
7           POSIX   Posix character subsystem.
8           WXD     Windows 95/98 native driver.
9           WCE     Windows CE graphical windows.
↑ ENTRY=

This parameter specifies an address where execution of the program begins. Usually this parameter contains a label whose address is set to CS:rIP when loader transfers execution to the program at run-time.

By default the ENTRY= parameter is empty; in this case €ASM will set it to 0 if PROGRAM FORMAT=BIN or to 256 if PROGRAM FORMAT=COM. This parameter may be left empty in linkable program formats but it must be specified in executable formats, otherwise €ASM reports an error.
If the executable links other programs (object modules), the entry point must be specified in exactly one such module.

↑ MAXPASSES=

This parameter limits the number of assembly passes through the source code. €ASM itself decides how many passes are necessary; this parameter only sets the upper limit. Accepted values are 3..32, default is 20.

↑ MAXEXPANSIONS=

This parameter limits the number of %FOR, %WHILE, %REPEAT or %MACRO block expansions. €ASM declares a program property named %. and increments its value whenever a preprocessing block is expanded. When this number exceeds the MAXEXPANSIONS value, €ASM emits an error message and stops further expansions.
Factory default is MAXEXPANSIONS=1999.

This mechanism protects €ASM from exhausting memory when some badly written preprocessing loop fails to exit. If your program is really big, you may need to increase the MAXEXPANSIONS value.

The same expansion counter is used to maintain the value of special macro %variable %..
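The guard can be pictured as a simple counter which is checked on every expansion. The Python model below is a sketch with hypothetical class and function names; it only illustrates why a runaway loop stops once the counter exceeds the limit.

```python
class ExpansionLimitError(Exception):
    """Raised when the expansion counter exceeds the configured limit."""


class ExpansionCounter:
    def __init__(self, max_expansions=1999):
        self.max_expansions = max_expansions
        self.count = 0

    def next_expansion(self):
        """Count one block expansion and fail when the limit is exceeded."""
        self.count += 1
        if self.count > self.max_expansions:
            raise ExpansionLimitError(
                "expansion %d exceeds MAXEXPANSIONS=%d"
                % (self.count, self.max_expansions))
        return self.count


def run_loop(iterations, max_expansions=1999):
    """Simulate a preprocessing loop which expands `iterations` times."""
    counter = ExpansionCounter(max_expansions)
    for _ in range(iterations):
        counter.next_expansion()
    return counter.count


print(run_loop(5, max_expansions=10))
```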

↑ OUTFILE=

OUTFILE= specifies the filename of the output of assembly, i.e. the executable or linkable object file. This filename is relative to the current shell directory, if not specified otherwise. Default value is OUTFILE="%^PROGRAM" followed by the extension specified by FORMAT=.
E.g.: Hello PROGRAM FORMAT=MZ will create output file "Hello.exe".
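The default output name is thus the program name joined with the default extension of the chosen format. A small Python model of this rule follows; the function name is hypothetical and the extensions are copied from the output formats table in the FORMAT= chapter.

```python
# Default file extension per FORMAT=, as listed in the output formats table.
DEFAULT_EXTENSION = {
    "BIN": ".bin", "COM": ".com", "OMF": ".obj", "LIBOMF": ".lib",
    "MZ": ".exe", "COFF": ".obj", "LIBCOF": ".lib", "PE": ".exe",
    "DLL": ".dll",
}


def default_outfile(program_name, fmt):
    """Return the output file name used when OUTFILE= is not specified."""
    return program_name + DEFAULT_EXTENSION[fmt.upper()]


print(default_outfile("Hello", "MZ"))
```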

↑ STUBFILE=

STUBFILE= is used in COFF-based executables only (PE and DLL formats). The stub is a 16bit MZ program which gets control when the output file is launched in a 16bit DOS operating system. Usually its only job is to tell the user that this program requires MS Windows.

When the STUBFILE= parameter is empty (default), €ASM will use its own built-in stub code.
Otherwise it looks for a previously compiled MZ executable. If STUBFILE= is specified without a path, €ASM looks for the file in the paths specified by EUROASM option LINKPATH=.

↑ ICONFILE=

ICONFILE= should specify an existing file with icon which will be built into resource segment of PE or DLL output file. This icon is used to graphically represent the output file in MS Windows environment (Desktop, Explorer etc). Icon file is searched for in the path specified by EUROASM option LINKPATH=.

Factory-default value is EUROASM ICONFILE="euroasm.ico" which represents the icon shipped with EuroAssembler in directory objlib.

Option ICONFILE= applies only when no resource file is linked to the output program, otherwise it is ignored and the first icon from resources (if any) is used by Windows Explorer to represent the executable.

When parameter ICONFILE= is empty, no icon is used and €ASM does not create resource section at all.

↑ LISTMAP=
↑ LISTGLOBALS=
↑ LISTLITERALS=

These three options control whether auxiliary information will be dumped near the end of the program in the listing file. If LISTLITERALS=ON, contents of data literal sections @LT16, @LT8, @LT4, @LT2, @LT1 will be dumped too.

↑ TIMESTAMP=

Specifies nominal time embedded in some output file formats.

↑ DLLCHARACTERISTICS=
↑ FILEALIGN=
↑ IMAGEBASE=
↑ MAJORIMAGEVERSION=
↑ MAJORLINKERVERSION=
↑ MAJOROSVERSION=
↑ MAJORSUBSYSTEMVERSION=
↑ MINORIMAGEVERSION=
↑ MINORLINKERVERSION=
↑ MINOROSVERSION=
↑ MINORSUBSYSTEMVERSION=
↑ SECTIONALIGN=
↑ SIZEOFHEAPCOMMIT=
↑ SIZEOFHEAPRESERVED=
↑ SIZEOFSTACKCOMMIT=
↑ SIZEOFSTACKRESERVED=
↑ WIN32VERSIONVALUE=

Other PROGRAM parameters are mostly important only in the COFF family of output formats (PE, DLL, COFF), where they form a PE header. See [MS PECOFF] specification for detailed description.

↑ SEGMENT

PURPOSE= ↓
WIDTH= ↓
ALIGN= ↓
COMBINE= ↓
CLASS= ↓

Pseudoinstruction SEGMENT declares a segment and sets its properties. Each segment definition simultaneously defines a section with the same name.

The name of segment is defined in the label field and it looks like an identifier in square brackets. Segment properties are set with keyword parameters.

€ASM automatically declares three default segments when it starts to assemble a program. In most cases there is no need to explicitly declare any segments. Default segments are [.text], [.data] and [.bss]. When these segments are not used in the program (no code was emitted into them), they are discarded at assembly time and do not appear in the object file. This happens when the programmer is not satisfied with the default segment names and properties and declares new segments (usually near the program beginning). Implicit segment declaration looks like this:

[.data]  SEGMENT PURPOSE=DATA,COMBINE=PRIVATE,ALIGN=PARA
[.bss]   SEGMENT PURPOSE=BSS, COMBINE=PRIVATE,ALIGN=PARA
[.text]  SEGMENT PURPOSE=CODE,COMBINE=PRIVATE,ALIGN=BYTE
↑ PURPOSE=

Parameter PURPOSE= specifies what kind of information is the segment intended for. It is important in protected mode (formats COFF, PE, DLL), where descriptor's access bits control the rights granted to read, write or execute the contents of segment.

Segment purpose table
PURPOSE=      Alias         Access         Default name                          Contents
CODE          TEXT          read, execute  [.code]|[CODE]                        Program code (instructions) (1)
STACK         -             read, write    [STACK]                               Machine stack (1)
DATA          IDATA         read, write    [.data]|[DATA]                        Initialized data (1)
BSS           UDATA         read, write    [.bss]|[BSS]                          Uninitialized data (1)
LITERALS      LITERAL       read           parasites on other data/code segment  Literal sections (2)
DRECTVE       -             discarded      [.drectve]                            Linker directives (3)
EXPORT        -             -              [.edata]                              Dynamic link export (4)
IMPORT        -             -              [.idata]                              Dynamic link import (4)
RESOURCE      -             -              [.rsrc]                               Programming resources (4)
EXCEPTION     -             -              [.pdata]                              Runtime exceptions (5)
SECURITY      -             -              -                                     Attribute certificate (5)
BASERELOC     -             discarded      [.reloc]                              Load-time relocations (4)
DEBUG         -             -              [.debug]                              Data for debugger (5)
COPYRIGHT     ARCHITECTURE  -              -                                     Architecture info (5)
GLOBALPTR     -             -              -                                     RVA of global pointer (5)
TLS           -             -              [.tls]                                Thread local storage (5)
LOAD_CONFIG   -             -              -                                     Load configuration (5)
BOUND_IMPORT  -             -              -                                     Bound import (5)
IAT           -             -              [.idata]                              Import address table (4)
DELAY_IMPORT  -             -              -                                     Delayed import descriptor (5)
CLR           -             -              [.cormeta]                            CLR metadata (5)
RESERVED      -             -              -                                     Reserved (5)
Remarks:
(1) Basic purposes used in all program formats.
(2) Programmer may specify which data/code segment should be used to host literal symbols.
(3) Synthetic section used for transfer of dynamic-link information in COFF format.
(4) Special sections directly supported by EuroAssembler. They should never be declared explicitly.
(5) Special sections whose contents are not supported. The programmer may include such a section in a PE file but its contents must be explicitly specified (with D or INCLUDEBIN), see program format PE.

Segments with special purpose names (4),(5) will be marked in the corresponding position of DataDirectory table in optional header of PE or DLL file format.

Although operand PURPOSE= accepts only enumerated values, they may be combined using the operator Addition + or Bitwise OR |, for instance
PROGRAM FORMAT=COM, PURPOSE=CODE|DATA|BSS|STACK or
[.lit] SEGMENT PURPOSE=DATA+LITERALS.

When this parameter is empty or not specified, €ASM will guess the segment's purpose by its class or [name], following these rules:

  1. If the name exactly case-insensitively matches any purpose enumerated in the table above, this purpose is assumed.
  2. If the name contains string STACK (case insensitive), PURPOSE=STACK is assumed.
  3. If the name contains string BSS or UDATA (case insensitive), PURPOSE=BSS is assumed.
  4. If the name contains string DATA, PURPOSE=DATA is assumed.
  5. If none of the previous rules applies, PURPOSE=CODE is assumed.
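The five rules above can be restated as a short Python function. This is a hedged model, not €ASM source: the function name is hypothetical, only the primary purpose names (not the aliases) are matched in rule 1, and rule 4 is assumed to be case insensitive like rules 2 and 3.

```python
# Primary purpose names from the segment purpose table (aliases left out).
PURPOSES = {
    "CODE", "STACK", "DATA", "BSS", "LITERALS", "DRECTVE", "EXPORT",
    "IMPORT", "RESOURCE", "EXCEPTION", "SECURITY", "BASERELOC", "DEBUG",
    "COPYRIGHT", "GLOBALPTR", "TLS", "LOAD_CONFIG", "BOUND_IMPORT", "IAT",
    "DELAY_IMPORT", "CLR", "RESERVED",
}


def guess_purpose(segment_name):
    """Guess PURPOSE= from a segment name such as "[.data]"."""
    name = segment_name.strip("[]").upper()
    if name in PURPOSES:                      # rule 1: exact match
        return name
    if "STACK" in name:                       # rule 2
        return "STACK"
    if "BSS" in name or "UDATA" in name:      # rule 3, tested before DATA
        return "BSS"
    if "DATA" in name:                        # rule 4
        return "DATA"
    return "CODE"                             # rule 5: fallback


for example in ("[.data]", "[MyUdata]", "[.text]", "[TLS]"):
    print(example, guess_purpose(example))
```

The order of the tests matters: a name such as [MyUdata] contains both UDATA and DATA, and it is classified as BSS only because rule 3 is evaluated before rule 4.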

PURPOSE=LITERALS is used together with CODE and/or DATA and it only suggests that this segment should preferably be used to host literal sections. If no segment is explicitly marked as PURPOSE=LITERAL, €ASM will choose the last data/code segment defined when some literal symbol was encountered.

↑ WIDTH=

Segment width value can be a numeric expression which evaluates to 16, 32 or 64. By default (if omitted) the width of the segment is determined by the program width.

↑ ALIGN=

This parameter requests alignment of the segment in memory at run-time. Default values are ALIGN=BYTE for code segments and ALIGN=OWORD for data segments. Memory variables cannot ask for better alignment than that of the segment they belong to.

↑ COMBINE=

This parameter specifies how segments from other program modules will be combined at link time. This is important only in MZ program format (16bit DOS executables) linked from several OBJ files. Possible values:

PUBLIC
All segments with the same name will be linked together. Total size is the sum of concatenated segments. This is the default option.
PRIVATE
Private segments will not be concatenated with other segments, no matter if they have the same name or not.
COMMON
All common segments with the same name will be linked to the same address so they overlay each other. The total segment size equals the greatest size of all segments with this name. Data variables declared in a common segment will be shared among separately assembled modules.
STACK
The STACK combine parameter is the same as PUBLIC; in addition, the SS:SP pointer in the target EXE file will be set to the end of such segment at run time.
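The effect of COMBINE= on the size of the linked segment can be modelled as follows. The Python sketch uses a hypothetical function name and ignores any alignment padding a real linker may insert between concatenated pieces.

```python
def combined_size(combine, sizes):
    """Size of the segment which results from same-named pieces.

    `sizes` lists the sizes contributed by the individual modules.
    Alignment padding between pieces is deliberately ignored here.
    """
    combine = combine.upper()
    if combine in ("PUBLIC", "STACK"):
        return sum(sizes)           # pieces are concatenated
    if combine == "COMMON":
        return max(sizes)           # pieces overlay each other
    if combine == "PRIVATE":
        raise ValueError("PRIVATE segments are never merged")
    raise ValueError("unknown COMBINE= value " + combine)


print(combined_size("PUBLIC", [100, 40, 8]), combined_size("COMMON", [100, 40, 8]))
```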
↑ CLASS=

The value of CLASS= is an arbitrary identifier. It may be used by the linker to guess the segment purpose (CODE|DATA|BSS) in object formats which do not carry purpose information.

↑ GROUP

This pseudoinstruction specifies segments addressed with the same addressing frame. Data in all grouped segments are referenced with the same value of segment register.

The name of the group must be defined in the label field; names of grouped segments are enumerated in operand fields. All names are in square brackets [ ]. Example:
[DGROUP] GROUP [DATA],[STRINGS]
A grouped segment may be defined before or after the GROUP statement. The pseudoinstruction has no keyword operands.

Segment groups are applicable in huge realmode 16bit programs. Only a 16bit segment can be a member of a group.

Relation between group and its segments in link time is similar to the relation between segment and its sections in assembly time.


↑ PROC

↑ ENDPROC alias ↑ ENDP

DIST= ↓
ALIGN= ↓
NESTINGCHECK= ↓

Pseudoinstructions PROC and ENDPROC declare a namespace procedure block. Most of the time it ends with the instruction RET, so it can be called to perform some function, and after execution it returns to the point just behind the CALL instruction.

The label declares assembler symbol which is the procedure name. The same label may be used as the first and only operand of ENDPROC pseudoinstruction.
Alias ENDP may be used instead of ENDPROC.

We could manage without PROC/ENDP pseudoinstruction easily but wrapping the code in PROC/ENDPROC block has some advantages:

↑ DIST=

Pseudoinstruction PROC may have keyword operands DIST= and ALIGN=. DIST= sets the distance of the procedure (NEAR or FAR). When DIST=FAR, all CALLs to this proc default to FAR, and all RETs within this proc default to FAR (of course this can be overridden with instruction suffix CALLN/CALLF, RETN/RETF). Default value of this parameter is DIST=NEAR.

↑ ALIGN=

Alignment of a procedure is ALIGN=BYTE by default. For the best use of the instruction cache it may sometimes be useful to declare a frequently called proc with ALIGN=OWORD, if code size is not an issue.

↑ NESTINGCHECK=

This boolean option switches off the internal check of PROC/ENDPROC label matching. This has only exceptional use in macros simulating a built-in pseudoinstruction, which need to hack the block context, such as Procedure and EndProcedure.

Pseudoinstruction PROC does not accept ordinal parameters. They can be passed in registers or on the machine stack managed manually. Macrolibrary StdCall shipped with EuroAssembler defines macros Procedure and EndProcedure with a function similar to PROC/ENDP, which allow an arbitrary number of arguments to be passed as macro parameters when the procedure is invoked.

↑ PROC1

↑ ENDPROC1 alias ↑ ENDP1

This is equivalent to PROC/ENDPROC with two differences:

  1. Procedure declared with PROC1/ENDPROC1 may occur in the program more than one time. Repeated declarations of PROC1/ENDPROC1 block with the same label are ignored.

    This predetermines PROC1 for semiinline macros, which contain both call of a procedure and the procedure itself. When the procedure is defined with PROC1/ENDPROC1, such macro can be invoked many times but the called procedure will be assembled and emitted only once.

  2. A block defined with PROC1/ENDPROC1 is not emitted to the current section. €ASM will automatically switch to another code section instead, and return to the previous section after ENDPROC1 has been processed. The section which €ASM will switch to has the name [@RT1] and it belongs to the most recently defined code segment. In some circumstances €ASM may also use runtime sections [@RT2], [@RT3] etc. This happens when the code inside the PROC1/ENDPROC1 block contains other semiinline macros, so the current runtime section already is [@RT1] and €ASM must choose another one.

    Emitting procedures to a different section than the one the main program currently uses has the advantage that the procedure need not be bypassed with a jump instruction. It also leads to shorter code, because jumps over the semiinline macros need not jump over the whole procedure code, which could easily make them exceed a distance of 128 bytes and require the longer form of jump instructions.

↑ ENDHEAD

HEAD / ENDHEAD just mark a portion of source code, which may be included from other source files with INCLUDEHEAD. The block usually contains the interface of programming objects (definition of structures, macros, constants) which needs to be included in other separately assembled programs.

The label field of pseudoinstruction HEAD may be used as a block identifier but it does not create a symbol. More than one HEAD/ENDHEAD block can be specified in a source. When these blocks are nested, the outer (larger) block will be included.

Languages which have not implemented this mechanism require the interface part to be put in separate header files. With HEAD/ENDHEAD it can be kept together with the execution body in one compact file.

↑ INCLUDE

This pseudoinstruction imports file(s) specified as its parameters to the main source file. The INCLUDE statement is virtually replaced with the contents of included file.

Inclusion may be nested, i.e. the included file may contain other INCLUDE statements.

When the file is specified without path, it will be searched for in folders specified with EUROASM option INCLUDEPATH=. If the included filename contains at least one slash, backslash or colon / \ : , it has specified its own path and the INCLUDEPATH= is ignored in this case.

The filename may contain wildcards * ?, in this case €ASM will include all files that apply (in unspecified order).

INCLUDE can have unlimited number of operands, for example INCLUDE "Win*.htm", ./MyConstants.asm, "C:*.inc".

Behaviour of INCLUDE statement is described in the following table:

Path  Wildcard  Example      When 1st file found                                When no file found
No    No        file.inc     Done, stops further searching in INCLUDEPATH.      Error E6914.
Yes   No        ./file.inc   Done.                                              Error E6914.
No    Yes       file*.inc    Continue searching for more files in INCLUDEPATH.  Nothing is included, no error.
Yes   Yes       ./file*.inc  Continue searching for more files in given path.   Nothing is included, no error.
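The four cases of the table can be modelled with the Python standard library. The sketch below is only an approximation of the described behaviour: the function name is hypothetical, FileNotFoundError stands in for error E6914, and wildcard matches are sorted merely to make the model deterministic (€ASM leaves their order unspecified).

```python
import glob
import os
import tempfile


def resolve_include(name, include_path):
    """Return the list of files an INCLUDE operand would pull in."""
    has_path = any(ch in name for ch in "/\\:")
    has_wildcard = any(ch in name for ch in "*?")
    # A name with its own path ignores INCLUDEPATH= entirely.
    folders = [""] if has_path else list(include_path)
    found = []
    for folder in folders:
        pattern = os.path.join(folder, name) if folder else name
        if has_wildcard:
            found.extend(sorted(glob.glob(pattern)))   # keep searching
        elif os.path.isfile(pattern):
            found.append(pattern)
            break                                      # first hit wins
    if not found and not has_wildcard:
        raise FileNotFoundError("E6914: " + name)
    return found


# Small demonstration with two temporary include folders.
with tempfile.TemporaryDirectory() as root:
    first, second = os.path.join(root, "inc1"), os.path.join(root, "inc2")
    for folder, names in ((first, ["file.inc", "fileA.inc"]),
                          (second, ["file.inc", "fileB.inc"])):
        os.mkdir(folder)
        for file_name in names:
            open(os.path.join(folder, file_name), "w").close()
    single = resolve_include("file.inc", [first, second])
    many = resolve_include("file*.inc", [first, second])
    print(len(single), len(many))
```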

Only part of source file will be included when substring or sublist operator immediately follows the file name. Example: INCLUDE "file.inc"{%&-20..%&} will include the last twenty lines of file.inc. Filename must be in double quotes in this case. When suboperation is used on wildcarded filename, it will be applied to all files.

↑ INCLUDE1

This pseudoinstruction behaves exactly like INCLUDE but first it looks if the same file (with the same size and contents, regardless of their names) was already included in the program, and skips the file in this case.
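Detecting a repeated inclusion by size and contents rather than by name can be sketched in Python like this. The class name is hypothetical, and comparing SHA-256 digests is an assumption made for the sake of the example; the manual does not say how €ASM compares the contents.

```python
import hashlib


class IncludeOnce:
    """Remembers which file contents have already been included."""

    def __init__(self):
        self._seen = set()

    def should_include(self, data):
        """Return True the first time these bytes are offered, else False."""
        key = (len(data), hashlib.sha256(data).digest())
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


tracker = IncludeOnce()
print(tracker.should_include(b"Msg: D 'Hello'"),   # first occurrence
      tracker.should_include(b"Msg: D 'Hello'"),   # same contents again
      tracker.should_include(b"Msg: D 'World'"))   # different contents
```

Because the key is derived from the contents only, a copy of the same file stored under a different name is skipped as well.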

↑ INCLUDEHEAD

The INCLUDEHEAD variant includes only the contents of HEAD/ENDHEAD block(s) of the included file. An error is reported if no such block is found in the file or if the block is incomplete (missing ENDHEAD). When suboperation is used with INCLUDEHEAD, it is applied first to the included file and the HEAD/ENDHEAD block is searched for in the subrange only.

↑ INCLUDEHEAD1

The INCLUDEHEAD1 will not include the HEAD block if the file or any part of it has already been included in the program using INCLUDE, INCLUDE1, INCLUDEHEAD or INCLUDEHEAD1.

↑ INCLUDEBIN

Unlike INCLUDE and INCLUDEHEAD, this pseudoinstruction does not treat the file contents as a source to assemble, but the contents is emitted as is at the position specified by offset pointer $ of current section.

Including binary data should not be confused with linking; it does not update relocatable addresses or external symbols. For instance the statement INCLUDEBIN "C:\WINNT\Media\chimes.wav"[0x2C..] will skip the first 0x2C bytes of WAV header and load the rest of the file (raw samples) to the assembled file, as if they were defined with DB statements.

↑ LINK

Pseudoinstruction LINK specifies file(s) which should be linked into the current program.

Each ordinal operand represents file name, which may have wildcards and may be specified with or without path. Relative path refers to the current directory.

If the linked file name does not contain a path, it will be searched for in all directories specified with the EUROASM LINKPATH= option, one after another. Unlike included files, suboperations with linked files are not supported.

Linkable files have specific internal structure, which probably would have been damaged if only suboperated part of the file were subjected to the link process. Therefore only whole object file or library can be linked.

The position of the LINK statement within the program is not important; the actual linking will be performed when the final program pass is about to end. The order in which the files are linked respects the order in which pseudoinstruction LINK appeared in the source. However, if linked files are specified with wildcards, e.g. LINK "modul*.lib", their order depends on the current filesystem and cannot be reliably predicted. Example:

 LINK Subproc.obj, "..\My modules\W*.obj"

↑ PUBLIC

Scope declaration pseudoinstructions GLOBAL, PUBLIC, EXTERN, EXPORT, IMPORT set the scope property of symbol(s), which is important in linking.

The symbol whose scope is being declared may be in the label field or in the operand field of the statement, or in both. More than one symbol may be declared with one statement. The symbols in question may be referenced forward or backward.

Explicit scope declaration may appear before or after the symbol is actually defined or referred.

Example: Explicit scope declaration of four symbols: Sym1 PUBLIC Sym2, Sym3, Sym4

Specifying a symbol as PUBLIC just tells €ASM that the symbol, which was or will be defined somewhere else in the program, should be accessible from other programs statically linked together. A public declaration does not create the symbol; a symbol with that name must be defined somewhere else in the same program.

↑ EXTERN

This property tells €ASM that this symbol is not defined in the program, and so references to its offset must be patched in the code at link time. It is an error to define symbol which is declared as EXTERN in the same program. Instead, it is searched for in other modules at link time, and the linker may report an error when the external symbol is not found.

↑ GLOBAL

Pseudoinstruction GLOBAL can be used to automate dealing with PUBLIC and EXTERN scopes. If the symbol is marked with a GLOBAL statement, it behaves either as public or external, depending on whether or not it is defined in the same program.

The programmer surely knows whether the declared symbol belongs to the current program or not, so why is the declaration of PUBLIC and EXTERN scope duplicated by GLOBAL? Let's have program PgmA which defines public symbol SymA and refers to external symbol SymB. Similarly, PgmB defines SymB and refers to SymA:
PgmA PROGRAM
      PUBLIC SymA
      EXTERN SymB
      CALL SymB: ; Reference to external symbol.
SymA: RET        ; Definition of public symbol.
     ENDPROGRAM PgmA

PgmB PROGRAM
      PUBLIC SymB
      EXTERN SymA
      CALL SymA: ; Reference to external symbol.
SymB: RET        ; Definition of public symbol.
     ENDPROGRAM PgmB
If we replace PUBLIC and EXTERN declarations with GLOBAL, the same declaration statement can be used in all statically linked programs, either copy&pasted or included from external file, which is easier to maintain:
PgmA PROGRAM
      GLOBAL SymA, SymB
      CALL SymB: ; Reference to external symbol.
SymA: RET        ; Definition of public symbol.
     ENDPROGRAM PgmA

PgmB PROGRAM
      GLOBAL SymA, SymB
      CALL SymA: ; Reference to external symbol.
SymB: RET        ; Definition of public symbol.
     ENDPROGRAM PgmB
Another raison d'être of GLOBAL is backward compatibility with NASM, which doesn't know directive PUBLIC at all. NASM uses directive GLOBAL instead whenever €ASM would require PUBLIC.
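The decision which GLOBAL makes for each symbol can be summarised in a few lines of Python. This is a model of the rule described in this chapter, with a hypothetical function name; it is not a description of €ASM internals.

```python
def resolve_global(global_symbols, defined_symbols):
    """Map each GLOBAL symbol to the scope it ends up with in one program."""
    return {
        name: ("PUBLIC" if name in defined_symbols else "EXTERN")
        for name in global_symbols
    }


# The same declaration GLOBAL SymA, SymB used in both programs:
print(resolve_global(["SymA", "SymB"], {"SymA"}))   # as seen by PgmA
print(resolve_global(["SymA", "SymB"], {"SymB"}))   # as seen by PgmB
```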

↑ IMPORT

Scopes IMPORT and EXPORT are used in dynamic linking, when our program calls an imported function from a DLL. This pseudoinstruction accepts keyword parameter LIB= which specifies the library file. Parameter LIB= may be omitted when the symbols are imported from kernel32.dll (this is the Windows dynamic library of most frequently used system functions).
The library file name need not be in quotes when it conforms to the DOS 8.3 convention. The library is always specified without a path. The operating system uses its own rules concerning the directories where libraries are searched for at bind-time.

↑ EXPORT

Scope EXPORT is used when we make a dynamic library and it declares symbols which are expected to be imported by other programs. Similar to PUBLIC scope, symbol marked for EXPORT must be defined in the program, sooner or later.

Pseudoinstruction EXPORT accepts two keyword parameters FWD= and LIB=, which specify that the exported symbol (function name) is in fact provided by another dynamic library (defined with LIB=) under a different symbol name (defined with FWD=). Example:

kernel32 PROGRAM FORMAT=DLL
          EXPORT EnterCriticalSection, LIB="NTDLL.dll", FWD=RtlEnterCriticalSection
         ENDPROGRAM kernel32

Library "kernel32.dll" yields API function EnterCriticalSection, which is in fact provided by the library "NTDLL.dll". In other Windows version it may be provided by a different library "XPDLL.dll" but programs importing the function from proxy library "kernel32.dll" need no update.

↑ ALIGN

This pseudoinstruction is used for explicit alignment of current section pointer $. For instance ALIGN OWORD in code section will emit several (0..15) bytes of NOP operation, so that the next statement will be emitted at octword-aligned address.

ALIGN statement may have no label but it can have two operands. The second operand is used for intentional misalignment, and it must be lower than the first one. For instance ALIGN 8,2 requests the current offset be set at the second byte in qword. Example of offsets which meet this requirement are 2, 10, 18, 26...
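The number of padding bytes implied by an ALIGN statement, including the optional misalignment operand, can be computed as shown in this Python sketch. The function name is hypothetical; the arithmetic merely restates the rule given above.

```python
def align_padding(offset, alignment, misalignment=0):
    """Bytes to skip so that the new offset % alignment == misalignment."""
    if alignment < 1:
        raise ValueError("alignment must be positive")
    if not 0 <= misalignment < alignment:
        raise ValueError("misalignment must be lower than alignment")
    return (misalignment - offset) % alignment


# ALIGN 8,2 applied at various current offsets:
for current in (0, 3, 10, 11, 19):
    print(current, "->", current + align_padding(current, 8, 2))
```

With ALIGN 8,2 every resulting offset belongs to the series 2, 10, 18, 26..., and with plain ALIGN OWORD (alignment 16, misalignment 0) the padding is always in the range 0..15.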

↑ STRUC

↑ ENDSTRUC

A structure represents virtual section of data declarations which can be used as a mask or a template laid over a piece of memory. Structure is declared with STRUC/ENDSTRUC block. The only statements which may be used within the STRUC/ENDSTRUC block are

  1. data definitions specified with D statement and its clones, either initialized or uninitialized
  2. explicit alignment statements (pseudoinstruction ALIGN)
  3. pseudoinstructions STRUC and ENDSTRUC (€ASM allows nested definitions of structures)
  4. line and markup comments

Declaration of a structure does not emit any data to the target file. Data are emitted or reserved only when the declared structure is actually used in a data definition (in pseudoinstruction D or DS).
When initialized data is defined in the structure declaration, it will be used to initialize corresponding members at the time of structured data definition, unless explicitly redefined.

Named data definitions in the structure must have local names (starting with .)
This allows to

  1. use the same name for members of different structures,
  2. avoid name conflict when more than one object of this structure is defined.

Each data definition is given its offset relative to the start of the structure, which is always at offset zero. The section in which the structure was declared is irrelevant.

Each structure must be given unique structure name, which is defined in the label field of STRUC statement and, optionally, in the operand field of ENDSTRUC statement.

The size of a structure can be referred to with attribute SIZE#Structure_name.

Pseudoinstruction STRUC accepts keyword attribute ALIGN=, which specifies alignment of instances of the structure when EUROASM AUTOALIGN=ON. If the alignment is not explicitly specified with STRUC declaration, alignment corresponding to PROGRAM WIDTH= is used as the default (WORD, DWORD or QWORD).

↑ D, DB, DU, DW, DD, DQ, DT, DO, DY, DZ, DI, DS

Both initialized and uninitialized data are defined and reserved with pseudoinstruction D. When a static value is specified, the data are defined. When the value is omitted, data are reserved. If EUROASM option AUTOSEGMENT=ON, data definition will switch to data section and data reservation will switch to bss (uninitialized data) section. Each operand of D is a data expression.

Pseudoinstruction mnemonic may be appended with suffix B, U, W, D, Q, T, O, Y, Z, I, S. Suffix defines the default datatype, which is used if not explicitly specified in operand. For instance DD 2,3,4 defines three dwords with static values 2, 3 and 4.

Types of data may mix in the same D statement.

Operands without explicit redefinition take the default data type from D-suffix, for instance DB 27, "$", W 120 defines two bytes followed with one word. Datatypes in operand may also be specified with long names, e.g. DB 27, "$", WORD 120.

Data from one operand may be duplicated.

For instance TranslateTable: D 256 * BYTE reserves 256 uninitialized bytes.
If duplication is not used, it defaults to 1. Negative duplicator is not permitted.
Duplicator 0 does not define or reserve any data, but it still provides the default datatype of the symbol and, if AUTOALIGN=ON, it aligns the current offset $.

If no suffix is used, the default datatype is taken from the first nonempty operand, e.g. D D 2,3,4 defines three dwords with static values 2,3 and 4. When no default is defined, as in D 2, €ASM reports an error.

The only exception, when datatype needs not be explicitly specified, is definition of text string, for instance D "Some text.". In this case the default datatype is B or U, which depends on current value of EUROASM option UNICODE=.

No data is defined/reserved when no operand is used.
L1: D  B 5      ; Define one byte with value 5.    TYPE#L1='B', SIZE#L1=1.
L2: D  2*WORD 3 ; Define two words with value 3.   TYPE#L2='W', SIZE#L2=4.
L3: DW W        ; Reserve one word.                TYPE#L3='W', SIZE#L3=2.
L4: DW 0*D      ; Reserve nothing, align to DWORD. TYPE#L4='W', SIZE#L4=0.
L5: DQ          ; Reserve nothing, align to QWORD. TYPE#L5='Q', SIZE#L5=0.
L6: D           ; Do nothing.                      TYPE#L6='A', SIZE#L6=0.
Unlike in other assemblers, an omitted operand doesn't emit any data; €ASM requests that the operand type and|or value be specified, no matter if the D operation is suffixed or not. For instance DB reserves one byte in MASM but it does nothing in €ASM. Use D B or DB B instead.
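The SIZE# values shown in the examples above follow from multiplying the duplicator by the size of the datatype. The Python sketch below models this for the four basic types only; the function name is hypothetical and the remaining datatypes (U, T, O, Y, Z, I, S) are intentionally left out.

```python
# Sizes in bytes of the basic datatypes used in the examples above.
TYPE_SIZE = {"B": 1, "W": 2, "D": 4, "Q": 8}
LONG_NAMES = {"BYTE": "B", "WORD": "W", "DWORD": "D", "QWORD": "Q"}


def data_size(datatype, duplicator=1):
    """Size in bytes of one D operand such as 2*WORD or 256*BYTE."""
    if duplicator < 0:
        raise ValueError("negative duplicator is not permitted")
    short = LONG_NAMES.get(datatype.upper(), datatype.upper())
    return duplicator * TYPE_SIZE[short]


print(data_size("B"), data_size("WORD", 2), data_size("W"), data_size("D", 0))
```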

↑ EQU

↑ =

Pseudoinstruction EQU (or its alias =) defines a symbol, which must be specified in the label field. The statement must have just one operand, which specifies the address or the numeric value of the symbol.

Instructions Label:EQU $ and Label:= $ are equivalent to Label:, i.e. to a statement which consists of the label only. EQU is also the only way to define a plain numeric symbol, such as FILE_ATTRIBUTE_ARCHIVE = 00000020h.

↑ %COMMENT

↑ %ENDCOMMENT

These pseudoinstructions define block comments, i.e. range of source code which is ignored by €ASM. In the label field of %COMMENT there may be an identifier, which gives the block a name (but does not create a symbol). The same identifier can be used as the first operand of %ENDCOMMENT statement. This helps €ASM to check correct matching of %COMMENT/%ENDCOMMENT, especially when comment blocks are nested.

↑ %DROPMACRO

%DROPMACRO tells €ASM to forget previously defined macroinstruction.

↑ %IF

↑ %ENDIF

↑ %ELSE

Instructions between %IF and %ENDIF are assembled only if the condition in the first and only %IF operand evaluates to TRUE. %IF also accepts an empty operand, which always evaluates to FALSE.

Pseudoinstruction %ELSE may occur in the %IF/%ENDIF block; and it reverses the logic of assembly: instructions between %IF and %ELSE are assembled when the %IF condition is TRUE and instructions between %ELSE and %ENDIF are assembled when the %IF condition is FALSE.

%IF may have an identifier in the label field which does not create a symbol but it identifies the block. The same identifier can be used in operand field of %ELSE and %ENDIF statements.

↑ %FOR

↑ %ENDFOR

↑ %EXITFOR

Pseudoinstructions %FOR and %ENDFOR create a block which is assembled (repeated) for each operand of the %FOR statement. The label field of the %FOR statement must be an identifier. It does not create a symbol; instead it defines a formal preprocessing %variable which is accessible in the %FOR/%ENDFOR block only. The name of this %variable consists of a percent sign followed by the identifier. The operands can be any text which we need to operate with: number, expression, string. The following example defines three memory variables:

data  %FOR "a", 3*B(5), "Long text"
          D %data
       %ENDFOR data

Repeating the identifier in the operand field of %ENDFOR and %EXITFOR statements is optional and can be used to check proper pairing of block instructions.

The operand of %FOR can also be a numeric range, the block is repeated with each integer value of the range in this case. Slope of the range can be negative; default step of control %variable is -1 in this case instead of +1.

i  %FOR  0..5   ; Slope is positive, therefore step = +1.
      DB "A"+%i ; define bytes "A","B","C","D","E","F"
    %ENDFOR i
j  %FOR 'z'..'x' ; Slope is negative, therefore step = -1.
      DB %j ; define bytes 'z','y','x'
    %ENDFOR j

%FOR accepts one keyword operand STEP= which explicitly defines how the control %variable is incremented. The default value is STEP=0, which is a special case: the actual step is then either +1 or -1, depending on the range slope.

Both kinds of operands (enumerated and range) can be combined. When the step is explicitly defined and its sign differs from the slope of a range, the %FOR/%ENDFOR body is not assembled for that range. On the other hand, if STEP= is omitted or set to 0, ranges with both slopes can be combined in one %FOR statement and each range operand will receive the appropriate step +1 or -1. Example:

a %FOR 1..3, 6..4, 7
     ; %a is 1,2,3,6,5,4,7
   %ENDFOR
   
b %FOR 0..64, 256, 400..300, 512, STEP=16
     ; %b is 0,16,32,48,64,256,512
   %ENDFOR
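The rules for ranges and STEP= can be checked with a small model. The following Python sketch is not part of €ASM; the function name expand_for is hypothetical and it assumes that all operands are plain integers or integer ranges. It reproduces the two value sequences from the example above.

```python
def expand_for(operands, step=0):
    """Return the successive values of the %FOR control %variable."""
    values = []
    for operand in operands:
        if ".." not in operand:
            values.append(int(operand))        # enumerated operand: used as is
            continue
        first, last = (int(part) for part in operand.split(".."))
        slope = 1 if last >= first else -1
        actual = step if step != 0 else slope  # STEP=0 follows the range slope
        if (actual > 0) != (slope > 0):
            continue                           # step sign differs from slope: range skipped
        value = first
        while (value <= last) if actual > 0 else (value >= last):
            values.append(value)
            value += actual
    return values

print(expand_for(["1..3", "6..4", "7"]))                         # [1, 2, 3, 6, 5, 4, 7]
print(expand_for(["0..64", "256", "400..300", "512"], step=16))  # [0, 16, 32, 48, 64, 256, 512]
```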

When €ASM encounters the %EXITFOR pseudoinstruction, it breaks the assembly of remaining instructions in the %FOR/%ENDFOR block and continues below the %ENDFOR statement, no matter how many unprocessed %FOR operands are left.

i  %FOR 0..9
    DB %i
    %IF %i>=3 ;
      %EXITFOR
    %ENDIF
    DB "a" + %i
    %ENDFOR i ; This will define bytes 0,"a",1,"b",2,"c",3

↑ %WHILE

↑ %EXITWHILE

↑ %ENDWHILE

The block of statements between %WHILE and %ENDWHILE is assembled repeatedly while the condition in first and only %WHILE operand is TRUE. If the condition is FALSE on entering the block, it is skipped.

Identifier may be used in the label of %WHILE and in the operand of %ENDWHILE and %EXITWHILE just for visual binding; it does not define a symbol.

Unlike %FOR, which declares and maintains its control %variable, %WHILE does not. It is the programmer's duty to declare some control %variable outside the block, and to change it within %WHILE/%ENDWHILE. Example:

%i  %SETA 3
id1 %WHILE %i
C%i:  DB %i
%i    %SETA %i - 1
    %ENDWHILE id1 ; Define C3: DB 3;  C2: DB 2;  C1: DB 1

%EXITWHILE in the block will cause skipping the rest of statements; €ASM will continue below %ENDWHILE.

↑ %REPEAT

↑ %EXITREPEAT

↑ %ENDREPEAT alias

↑ %UNTIL

The conditional assembly block %REPEAT / %ENDREPEAT is similar to %WHILE / %ENDWHILE but the condition is evaluated at the end of the block, and the logic is inverted. %REPEAT takes no label and no operand. The statements in the block are always assembled at least once. The control condition is in the operand field of %ENDREPEAT; if it evaluates to FALSE, €ASM will assemble the block again. Alias %UNTIL may be used instead of mnemonic %ENDREPEAT.

Block %REPEAT / %ENDREPEAT can use identifier for nesting check. Unlike other block statements, position of block identifier is different: Block identifier can be used as the first operand of %REPEAT, and as the label of %ENDREPEAT (alias %UNTIL).

%i  %SETA 3
    %REPEAT Id1
      C%i: DB %i
      %i %SETA %i - 1
Id1 %UNTIL %i = 0
; This will define C3: DB 3; C2: DB 2; C1: DB 1

↑ %SET

Pseudoinstruction %SET and other members of its family are designed to assign a value to preprocessing %variable, which is in the label field of the statement.

%SET assigns the whole list of operands verbatim, including the commas which separate operands from one another. White spaces between the operation mnemonic (%SET) and the first operand are omitted. White spaces after the last operand are omitted, too.

%CardList %SET Hearts, Diamonds, Clubs, Spades  ; comment

%CardList now will contain the string Hearts, Diamonds, Clubs, Spades (31 characters including spaces and commas).

↑ %SETA

%SETA accepts arithmetic values. They will be evaluated and assigned to the %variable as signed decimal number. When more than one operand is used, each value is set to the corresponding item of the %variable, which is being assigned. Example:

%Value %SETA PoolEnd - PoolBegin
%Sizes %SETA 2+3, 4, ,-5*2

The difference between labels PoolEnd and PoolBegin is calculated and assigned to %Value as a decimal number. %Sizes will contain the string 5,4,,-10 (8 characters). Items of %Sizes can be retrieved with sublist operation, such as %Sizes{2}.
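The contents of %Sizes can be verified with a short model. This Python sketch is only an illustration; the helper name seta is hypothetical, None stands for an empty operand, and it assumes that sublist items are counted from 1.

```python
def seta(values):
    """Model of %SETA: signed decimal items joined with commas."""
    # None stands for an empty operand, which stays an empty item.
    return ",".join("" if value is None else str(value) for value in values)

sizes = seta([2 + 3, 4, None, -5 * 2])
print(sizes)                 # 5,4,,-10
print(len(sizes))            # 8
print(sizes.split(",")[1])   # 4, the second item (assuming %Sizes{2} counts from 1)
```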

↑ %SETB

%SETB is similar to %SETA; it accepts logical (Boolean) expressions and assigns them as binary digits 1 and 0. An empty operand evaluates as FALSE, i.e. %Not %SETB will assign one digit 0 to %Not. Unlike with %SETA, the binary digits are not separated with commas if more than one operand is used in the %SETB statement. Items of the assigned variable can be retrieved with substring operation. Example:

%TooBig %SETB 5 > 4  ; %TooBig is assigned with one character 1
%Flags %SETB %TooBig, 2,,3>2,Symbol-Symbol,4,, ; %Flags are now 11010100
Flags DB %Flags[]b ; Byte memory variable 11010100b is defined here
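The digits in %Flags can be traced operand by operand. The following Python sketch is a model under stated assumptions, not €ASM code: the helper name setb is hypothetical, None stands for an empty operand, and Symbol-Symbol is represented by its value 0.

```python
def setb(values):
    """Model of %SETB: one binary digit per operand, no separators."""
    # None stands for an empty operand, which evaluates as FALSE.
    return "".join("1" if value else "0" for value in values)

too_big = 5 > 4
flags = setb([too_big, 2, None, 3 > 2, 0, 4, None, None])
print(flags)                # 11010100
print(hex(int(flags, 2)))   # 0xd4, the byte defined by DB %Flags[]b
```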

↑ %SETC

%SETC accepts a numeric operand, which must evaluate to a plain number not greater than 255 and not lower than -128. The result will be assigned as one character with the evaluated ASCII value. Example:

%Space %SETC 32
%Quote %SETC '"'
%NBSP  %SETC -1 ; assigns character 0xFF

Similarly to %SETB, multiple operands may be used in %SETC and the resulting characters are not separated with commas.

%Hexadigits %SETC 'A','B','C','D','E','F'
; %Hexadigits now contains six characters ABCDEF
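The mapping from numbers to characters, including the negative value -1, can be modelled as follows. This is a Python sketch only; the helper name setc is hypothetical and raising an exception for out-of-range values is an assumption, since the manual does not say how €ASM reports them.

```python
def setc(values):
    """Model of %SETC: each operand becomes one byte-sized character."""
    result = bytearray()
    for value in values:
        if not -128 <= value <= 255:
            raise ValueError(f"%SETC operand out of range: {value}")
        result.append(value & 0xFF)   # negative values wrap, -1 becomes 0xFF
    return bytes(result)

print(setc([32]))                          # b' '
print(setc([-1]))                          # b'\xff'
print(setc([ord(c) for c in "ABCDEF"]))    # b'ABCDEF'
```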

↑ %SETE

This pseudoinstruction reads an environment variable from the system at assembly time and assigns it to the preprocessing %variable. The name of the environment variable must be in the first and only operand and it is cited without percent signs or dollar sign, e.g.

%OS %SETE OS
Msg: DB "This program was assembled at %OS system."

↑ %SETS

%SETS looks at the %variable in the operand field(s) and assigns the number of characters which it occupies.

%SomeVar %SET ABC, DEF
%SomeSize %SETS %SomeVar ; %SomeSize is now 8 (six chars, comma and space)
%SizeOfSomeSize %SETS %SomeSize ; 1 (one digit)

↑ %SETL

%SETL is similar to %SETS except that it assigns the number of comma-separated items in the %variable.

%SomeVar %SET ABC, DEF
%SomeSize %SETL %SomeVar ; %SomeSize is now 2 (two comma separated items)
%SizeOfSomeSize %SETL %SomeSize ; 1 (one item)
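The difference between counting characters (%SETS) and counting items (%SETL) can be seen in a small model. The Python helpers sets and setl below are hypothetical names; the sketch assumes that no commas are hidden inside quotes or brackets and it does not model an empty %variable.

```python
def sets(text):
    """Model of %SETS: number of characters in the %variable."""
    return len(text)

def setl(text):
    """Model of %SETL: number of comma-separated items (nonempty text assumed)."""
    return len(text.split(","))

some_var = "ABC, DEF"
print(sets(some_var), setl(some_var))                        # 8 2
print(sets(str(sets(some_var))), setl(str(setl(some_var))))  # 1 1
```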

↑ %SET2

Consider assembling the statement %Var1 %SET %Var2. €ASM first expands the %Var2 and result of expansion is then assigned to %Var1. First two tokens of the statement are not expanded, because %Var1 is just being assigned, and %SET is reserved name which is never expanded.

%SET2 is similar to %SET except that the operand field is expanded twice before being assigned. Each expansion "swallows" one percent sign.

%D1 %SET B"A"
%D2 %SET B"B"
%D3 %SET B"C"
i %FOR 1..3
%DataExp %SET2 %%D%i
    D %DataExp
   %ENDFOR i ; Emit  D B"A"; D B"B"; D B"C"
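How each expansion swallows one percent sign can be traced with a simplified model. The Python function expand below is hypothetical and covers only what this example needs: a doubled percent sign becomes a single one and %name is replaced with the value of a known %variable. Real €ASM expansion has more rules which are not modelled here.

```python
import re

def expand(text, variables):
    """One simplified expansion pass: %% becomes %, %name becomes its value."""
    def replace(match):
        name = match.group(1)
        if name is None:                  # matched the doubled percent sign
            return "%"
        return variables.get(name, match.group(0))
    return re.sub(r"%%|%([A-Za-z_][A-Za-z0-9_]*)", replace, text)

variables = {"D1": 'B"A"', "D2": 'B"B"', "D3": 'B"C"'}
for i in range(1, 4):
    variables["i"] = str(i)
    once = expand("%%D%i", variables)     # single expansion, as with %SET
    twice = expand(once, variables)       # second expansion, as with %SET2
    print(once, twice)
```

With i = 1 the first pass yields %D1 and the second pass yields B"A", which is the operand emitted by D %DataExp.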

↑ %SETX

When pseudoinstruction of SET* family is being assembled, €ASM does not expand label field and operation field of statements such as %Label %SET* anything. This applies to %SET, %SETA, %SETB, %SETC, %SETU, %SETE, %SETS, %SETL, %SET2 but not to %SETX. In this statement the label field is expanded, too. %SETX then works like ordinary %SET, which means that there must be a valid %variable name in the label field after expansion. For instance %%Var1 %SETX ABC is equivalent to %Var1 %SET ABC.

Using %SETX we can assign %variables whose names are not explicitly set at assembly time. Example:

i %FOR 1..3
%%M%i %SETX M%i
   %ENDFOR  ; This will assign values M1,M2,M3 to preprocessing %variables %M1,%M2,%M3.

↑ %MACRO

↑ %EXITMACRO

↑ %ENDMACRO

Block of statements between %MACRO and %ENDMACRO is called macro declaration. Identifier in the label field of %MACRO statement is the name of macro. The whole %MACRO statement is called macro prototype. Once declared, macro can be expanded many times in the program.

When €ASM reads the macro declaration in source text, it does not emit any code. Instructions from the macro body will be emitted only when the macro is actually expanded with its macroinstruction.

%EXITMACRO allows to break the emitting process when it is encountered, most often when some error condition was detected. Example of macro declaration and macro expansion:

DWalignEAX %MACRO ; will round the contents of EAX to doubleword upward.
      ADD EAX,3
      AND EAX,-4
      %ENDMACRO DWalignEAX
      
      MOV EAX,13
      DWalignEAX
      ; Now EAX will contain 16.
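The arithmetic behind ADD EAX,3 and AND EAX,-4 is the usual trick for rounding up to a power of two. A minimal Python sketch of the same computation follows; the function name align_up is hypothetical and the 32-bit width of EAX is not modelled.

```python
def align_up(value, alignment=4):
    """Round value up to a multiple of alignment (which must be a power of two)."""
    return (value + alignment - 1) & -alignment

print(align_up(13))   # 16
print(align_up(16))   # 16, already aligned values are unchanged
```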

↑ %SHIFT

Pseudoinstruction %SHIFT is usable in macro block only. It will decrement the ordinal number of all operands by one or by the integer, which it has in operand field. %SHIFT may have no label and only one operand which evaluates to plain number. Default 1 is assumed when the operand is omitted.

%SHIFT 0 does nothing. Shifting by a negative number reverses the direction.

The effect of the operation is limited to macro operands accessed by ordinal number, such as %1, %2 etc. Accessing operands by name remains unaffected by the %SHIFT operation.

Operands, which are left-shifted from ordinal position 1 to position zero or negative, are not accessible by ordinal number any longer, but they are not lost forever, as they may be shifted back by negative number.

Sample %MACRO Oper1, Oper2, Oper3
L1: DB %1, %Oper1
    %SHIFT 1
L2: DB %1, %Oper1
    %SHIFT 1
L3: DB %1, %Oper1
    %ENDMACRO Sample

   Sample  1,2,3 ; This macro expands to three lines:
; L1: DB 1, 1
; L2: DB 2, 1
; L3: DB 3, 1
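The behaviour of ordinal operands under %SHIFT can be modelled as a moving offset into a fixed operand list. The Python class below is a sketch with hypothetical names; returning an empty string for an inaccessible ordinal is an assumption based on how omitted macro operands expand to nothing.

```python
class MacroOperands:
    """Model of ordinal macro operands under %SHIFT."""

    def __init__(self, *operands):
        self.operands = operands
        self.offset = 0               # total shift applied so far

    def shift(self, count=1):
        self.offset += count          # a negative count shifts back

    def ordinal(self, number):
        """Value of %1, %2, ...; empty when there is no such operand."""
        index = number - 1 + self.offset
        if number >= 1 and 0 <= index < len(self.operands):
            return self.operands[index]
        return ""

ops = MacroOperands("1", "2", "3")
print(ops.ordinal(1))   # 1
ops.shift()
print(ops.ordinal(1))   # 2
ops.shift()
print(ops.ordinal(1))   # 3
ops.shift(-2)
print(ops.ordinal(1))   # 1, operands shifted out are not lost
```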

↑ %ERROR

Pseudoinstruction %ERROR will insert into the listing file a user-defined error message similar to those emitted by €ASM itself when it finds some mistake in the source text. %ERROR is often used in macroinstructions and usually it warns the programmer that the macro was not used in the intended way.

User defined errors have severity code U and severity level 5, i.e. somewhere between warnings and assembler errors. Programmer may specify the actual message identifier with optional keyword operand ID= which can be plain decimal number between 5000 and 5999. %ERROR will also accept identifier with value 0..999 and it adds internally 5000 in this case. Default value is 0, so the user defined message has identifier U5000.
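The mapping of the ID= value to the final message identifier can be summarized in a few lines. This Python sketch uses the hypothetical function name error_message_id; rejecting other values with an exception is an assumption, as the manual does not describe what €ASM does with them.

```python
def error_message_id(identifier=0):
    """Model of the %ERROR ID= mapping to a message identifier U5000..U5999."""
    if 0 <= identifier <= 999:
        identifier += 5000            # short form, 5000 is added internally
    if not 5000 <= identifier <= 5999:
        raise ValueError("ID= must be 0..999 or 5000..5999")
    return f"U{identifier}"

print(error_message_id())       # U5000
print(error_message_id(123))    # U5123
print(error_message_id(5123))   # U5123
```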

↑ %DISPLAY

Pseudoinstruction %DISPLAY is used for retrieving information about internal objects created by €ASM during assembly process. Each such object is displayed in the form of debug message with severity level 1. The message is printed both to output console (in each pass) and to the listing file (in the final pass).
%DISPLAY is active even in non-emitting source passages, such as false %IF branch or block disabled with %COMMENT. It is intended to investigate €ASM internals when something is working not as expected.

Pseudoinstruction %DISPLAY accepts an arbitrary number of operands (object categories), which specify the kind of objects that we want to review. Categories may be provided as ordinal operands or as keyword operands with a value specifying the filter. The filter can restrict the amount of displayed lines. Category names are case insensitive but the filtering value, if used, is case sensitive. The filter value defines the first few characters of those object names which we want to display. The filter value may be terminated with an asterisk *, but this is not mandatory.

Operands of pseudoinstruction %DISPLAY have rather relaxed syntax. Object categories (ordinal operand name or keyword name) may be shortened, too. Only as many characters are required as is enough to identify the desired category. For instance %DISPLAY se will display the map of all segments and their sections. %DISPLAY File displays the list of input files (main source and included libraries). %DISPLAY sym=Num*, sym=En will list only those symbols whose name begins with Num or En.

%DISPLAY UserVar, %DISPLAY UserVar=* and %DISPLAY user= work equally (an empty filter value will match any %variable name). Nonfilterable categories, such as segments, context stack, automatic macro %variables, will always display their complete list; any filtering value is ignored.

When specifying (system) %variable names as the filtering value, the leading percent sign % or %^ may be omitted, or the percent sign must be doubled (otherwise the name would be expanded to its current contents). %DISPLAY UserVar=Loc, %DISPLAY us=Loc* and %DISPLAY user=%%Loc are equal in their function: they display the current contents of user-defined preprocessing %variables whose name begins with %Loc.

%DISPLAY object categories
%DISPLAY operand       Messages      Filter   Order         Displayed objects
All                    D1100..D1900  yes      alphabetical  All objects specified below (shortcut for Fil,Ch,Se,St,Co,Sym,L,Rel,M,V).
Files                  D1150..D1190  ignored  natural       Source files included in the program.
Chunks                 D1200..D1240  ignored  natural       Chunks of source code.
Sections               D1250..D1290  ignored  natural       Map of groups, segments and sections.
Segments               D1250..D1290  ignored  natural       Map of groups, segments and sections.
Groups                 D1250..D1290  ignored  natural       Map of groups, segments and sections.
Structures             D1300..D1340  yes      alphabetical  Structures declared in the program.
Context                D1350..D1390  ignored  stacked       Context stack of block statements.
Symbols                D1400..D1450  yes      alphabetical  All explicitly defined symbols (shortcut for Fix,Unf,Unr,Ref).
  UnfixedSymbols       D1410..D1450  yes      alphabetical  Symbols whose properties are not stable yet.
  FixedSymbols         D1420..D1450  yes      alphabetical  Symbols whose properties are already fixed.
  UnreferencedSymbols  D1430..D1450  yes      alphabetical  Symbols which were not used yet.
  ReferencedSymbols    D1440..D1450  yes      alphabetical  Symbols which were mentioned at least once, or used in a structure.
LiteralSymbols         D1500..D1540  ignored  alphabetical  All literal symbols declared in the program.
Relocations            D1550..D1590  ignored  natural       Relocation records.
Macros                 D1600..D1690  yes      alphabetical  Macroinstructions declared at this moment.
Variables              D1700..D1790  yes      alphabetical  All preprocessing %variables currently set (shortcut for Au,Fo,Us,Sys).
  AutomaticVariables   D1710..D1730  ignored  fixed         Automatic macro %variables.
  FormalVariables      D1740..D1750  yes      alphabetical  Formal macro/for %variables.
  UserVariables        D1760..D1770  yes      alphabetical  User-defined preprocessing %variables.
  SystemVariables      D1780..D1790  yes      alphabetical  System preprocessing %^variables.

Displayed message usually contains the object name, its attributes and other properties.

Property src= specifies whether the file or chunk is

Chunk property type= shows what kind of information is in this chunk of source text:

Boolean property ref= tells whether the symbol, structure or section was used (referenced at least once in the program). Members of a structure are automatically referenced when the structure is defined.
Similar property fix= specifies if the offset of symbol or section is already fixed, i.e. it is stable between assembly passes.
Context property emit= informs whether the block is in normal (emitting) status, or if it is just bypassed without emitting any code or data.

Context property %.= shows current value of expansion counter in this block.

Property src= identifies the position in source text where the displayed object was defined, in standard form "FileName"{LineNumber}.

Automatic and formal %variables are defined only in %macro | %for expansion, i.e. when %DISPLAY Auto,Formal is inserted in %MACRO/%ENDMACRO or %FOR/%ENDFOR body and the macro is then expanded.

Unlike other instructions, %DISPLAY is active even in non-emitting status. Be cautious about putting an unfiltered %DISPLAY into repeating preprocessing loops (%FOR, %WHILE, %REPEAT), as this may substantially flood the output.

The main purpose of %DISPLAY is to find errors at assembly-time, when €ASM doesn't work as expected, together with EUROASM options DISPLAYSTM=, DISPLAYENC= and with PROGRAM options LISTGLOBALS=, LISTLITERALS=, LISTMAP=.
For investigation of your program at run-time use a debugger.

↑ %DEBUG

↑ %PROFILE

Reserved for future extension. Not implemented yet.


↑ Macroinstructions

Macro is defined by a block of statements (macro body) encapsulated between pseudoinstructions %MACRO and %ENDMACRO. The %MACRO statement itself (macro prototype) must have a label, which can be used later for macro invocation (alias macro expansion).

Macro must be defined before it is invoked.

Statement, which has the name of previously declared %MACRO in its operation field, is called macroinstruction or simply macro. It will be replaced with statements from the %MACRO block. Macro can be a fixed static set of instructions, such as

CarriageReturn %MACRO
    MOV AH,2
    MOV DL,13
    INT 21h
    %ENDMACRO CarriageReturn

More useful are macros which can modify the expanded instructions depending on operands they are invoked with. When a macro is invoked, it is usually provided with operand values, which are available in macro body as formal %variables or as automatic ordinal %variables %1, %2, %3,.... Operands in macrodefinition may be given temporary formal symbolic name; they are accessible in the macro block by this name prefixed with percent sign %. Or they may be referred with their ordinal number prefixed with %. Keyword operands are only accessible with the formal key name prefixed with %. Example:

Copy %MACRO Source, Destination, Size=ECX
     MOV ESI, %Source      ; or MOV ESI, %1
     MOV EDI, %Destination ; or MOV EDI, %2
     MOV ECX, %Size
     REP MOVSB
     %ENDMACRO Copy

The previous macro needlessly moves the number of copied bytes to register ECX even if it is already there at the time of its invocation. The expanded instruction MOV ECX,ECX may be spared in this case:

Copy %MACRO Source, Destination, Size=ECX
     MOV ESI, %Source      ; or MOV ESI, %1
     MOV EDI, %Destination ; or MOV EDI, %2
     %IF "%Size" !== "ECX"
       MOV ECX, %Size
     %ENDIF
     REP MOVSB
    %ENDMACRO Copy

Now when the macro is invoked with Copy From, To, Size=ECX or with Copy From, To, no superfluous MOV ECX,ECX is expanded.

All macros in EuroAssembler may have variable number of operands.

The number of operands specified at macro invocation need not correspond with the number of operands specified when the macro was defined. If the macro is invoked with fewer operands than its prototype specifies, €ASM does not treat this as an error and silently expands the omitted operands to nothing.
When the macro is invoked with more operands than its prototype specifies, those superfluous operands are not accessible in macro expansion with formal names, but still they may be referred by their automatic ordinal name. See also pseudoinstruction %SHIFT.

Similarly to preprocessing %variables, macros may be redefined many times. However, this is not usual and €ASM will emit a warning W2512 in this case. Once defined, a macro may be undefined with pseudoinstruction %DROPMACRO.

An example of a situation where dropping the macro definition may be useful is the emulation of a machine instruction by a macro with the same name.
Instruction BSWAP, which reverses the byte order in 32bit register, was not available on Intel 80386. This could be solved by emulation using three ROR or ROL instructions. If we detect that our program runs on Pentium, we can drop the macro definition and €ASM will assemble BSWAP as a native machine instruction.

|00000000:          |
|                   |BSWAP %MACRO reg32 ; Swap the byte order in register.
|                   |   %IF TYPE# %reg32 <> 'R' || SIZE# %reg32 <> 4
|                   |     %ERROR 'Macro "BSWAP" requires 32bit GPR as its operand.'
|                   |     %EXITMACRO BSWAP
|                   |   %ENDIF
|                   |%reg16 %SET %reg32[2..3] ; Name of the lower half of reg32.
|                   |   ROL %reg16,8
|                   |   ROL %reg32,16
|                   |   ROL %reg16,8
|                   |  %ENDMACRO BSWAP
|00000000:          |
|00000000:BA78563412|  MOV EDX,0x12345678
|00000005:          |  BSWAP EDX ; Expected result is EDX=0x78563412.
|                   +BSWAP %MACRO reg32 ; Swap the byte order in register.
|FALSE              +   %IF TYPE# %reg32 <> 'R' || SIZE# %reg32 <> 4
|                   +     %ERROR 'Macro "BSWAP" requires 32bit GPR as its operand.'
|                   +     %EXITMACRO BSWAP
|                   +   %ENDIF
|4458               +%reg16 %SET %reg32[2..3] ; Name of the lower half of reg32.
|00000005:66C1C208  +   ROL %reg16,8
|00000009:C1C210    +   ROL %reg32,16
|0000000C:66C1C208  +   ROL %reg16,8
|                   +  %ENDMACRO BSWAP
|                   |  %DROPMACRO BSWAP
|00000010:0FCA      |  BSWAP EDX ; This time swap the byte order with native 486 instruction.
|00000012:          |
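That three rotations really reverse the byte order can be checked outside the assembler. The following Python sketch mirrors the macro body on a 32-bit value; the function names rol and bswap_by_rotation are hypothetical and the rotation count is assumed to lie between 1 and bits-1.

```python
def rol(value, count, bits):
    """Rotate a bits-wide value left by count positions."""
    mask = (1 << bits) - 1
    value &= mask
    return ((value << count) | (value >> (bits - count))) & mask

def bswap_by_rotation(edx):
    """Mirror ROL DX,8 / ROL EDX,16 / ROL DX,8 from the macro body."""
    edx = (edx & 0xFFFF0000) | rol(edx & 0xFFFF, 8, 16)   # swap the two low bytes
    edx = rol(edx, 16, 32)                                # exchange the 16-bit halves
    edx = (edx & 0xFFFF0000) | rol(edx & 0xFFFF, 8, 16)   # swap the new low bytes
    return edx

print(hex(bswap_by_rotation(0x12345678)))   # 0x78563412
```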

↑ Program formats

BIN ↓

COM ↓

MZ ↓

OMF ↓

LIBOMF ↓

COFF ↓

LIBCOF ↓

PE ↓

DLL ↓

RSRC ↓

Width of program formats ↓

Target of EuroAssembler's endeavour is an output file in one of the formats selected by PROGRAM FORMAT= option. There are three main categories of €ASM output files:

  1. linkable file (also called module or object file) is designed to be joined with other modules and libraries into final executable file or to an object library.
    €ASM supports two main standards of object files: OMF and COFF . Default object file name extension is .obj.
  2. library is a collection of modules, ready to be linked on demand into the final executable file. There are four kinds of libraries supported by EuroAssembler:

    Default filename extension of object | import library is .lib, in case of dynamic library it is .dll.

  3. executable file (also called image) can be loaded and launched directly by the shell of the hosting OS.
    €ASM can produce executables in formats PE, MZ, COM, they have file extension .exe or .com. It can also create dynamically loaded libraries DLL, very similar to PE format, but they can be executed only indirectly, through invocation of their exported function from another program, or through a special Windows loader, such as RUNDLL32.exe.
    Program format BIN is ranked as executable, too. However, as it lacks any red tape information, binary file needs its own ad hoc loader to be launched directly, or it must be loaded to a special place of the computer, such as firmware ROM or boot sector of disk device.

↑ BIN

Option PROGRAM FORMAT=BIN is chosen as default when FORMAT= is not explicitly specified. Default options for BIN format are PROGRAM OUTFILE=%^PROGRAM.bin,MODEL=TINY,WIDTH=16,IMAGEBASE=0,SECTIONALIGN=0,FILEALIGN=0. €ASM creates default segment [BIN] with universal purpose.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ENTRY, ICONFILE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

Structure of BIN file is straightforward: binary image is a concatenation of emitted contents of its segments. Noninitialized (BSS) segments are omitted.

Segment alignment in the image is specified by the highest value of PROGRAM FILEALIGN=0, PROGRAM SECTIONALIGN=0 and SEGMENT ALIGN=16. Gaps between segments are filled with alignment stuff, which is 0x90 (NOP) if the neighbouring segments have both SEGMENT PURPOSE=CODE, otherwise it is 0x00.
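The layout rules above can be illustrated with a small model of the image builder. This Python sketch is not the €ASM linker; the function name build_bin_image is hypothetical, one common alignment is assumed for all segments, and only initialized segments are expected in the input because BSS segments are omitted from the image.

```python
def build_bin_image(segments, alignment=16):
    """Concatenate (purpose, data) segments, padding each gap to the alignment."""
    image = bytearray()
    previous_purpose = None
    for purpose, data in segments:
        gap = -len(image) % alignment          # bytes needed to reach the next boundary
        both_code = previous_purpose == "CODE" and purpose == "CODE"
        image += bytes([0x90 if both_code else 0x00]) * gap   # NOP between code segments
        image += data
        previous_purpose = purpose
    return bytes(image)

image = build_bin_image([("CODE", b"\xC3"), ("CODE", b"\xC3"), ("DATA", b"AB")])
print(len(image))   # 34
```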

Typical applications of binary format are pure data files, conversion tables, Dos drivers, boot sectors etc.

↑ COM

Files in COM format are a legacy of the CP/M operating system. They are directly executable in Dos and 32bit Windows; in other systems they run only with a Dos emulator.

Default options for PROGRAM FORMAT=COM are PROGRAM OUTFILE=%^PROGRAM.com,MODEL=TINY,WIDTH=16,IMAGEBASE=0,ENTRY=256,SECTIONALIGN=0,FILEALIGN=0.
Options ENTRY=0x100 and IMAGEBASE=0 are fixed for this format and cannot be changed.

€ASM creates default implicit segment [COM] with universal purpose.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ICONFILE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

Structure of COM file is similar to BIN format: there is no metainformation stored in the file except for its extension .com, which tells OS to treat it as executable. OS loader will allocate 64KB of memory, load segment registers CS,DS,ES,SS with paragraph address of that block, initialize 256 bytes long [PSP] structure at offset 0, load the entire file contents at offset 256 (0x0100), set stack pointer to the top of allocated block (usually SP=0xFFFE) and finally set IP=0x0100.

Size of code+data+stack altogether should not exceed 64KB in TINY memory model. Program in COM format can use 32bit registers, if CPU is 386 or higher. Also additional memory blocks may be requested from OS at runtime. Typical applications of this obsolete format are fast and short little utilities and Terminate-and-Stay-Resident programs which provide services in Dos.
The following COM example is only 1 byte long, yet it is a formally valid computer program, though it does nothing:

         EUROASM
Shortest PROGRAM FORMAT=COM
          RET
         ENDPROGRAM Shortest

Program in COM format can link other object files or libraries, see the test table linker combinations.

↑ MZ

Specifying program format MZ creates 16bit or 32bit realmode executable file, which can be directly run in Dos and in 32bit Windows. Its structure is described in [MZ] and [MZEXE]. Dos executable file begins with MZ signature 'M','Z'.

Default options for PROGRAM FORMAT=MZ are

PROGRAM OUTFILE=%^PROGRAM.exe,MODEL=SMALL,WIDTH=16,IMAGEBASE=0, \
        SECTIONALIGN=0,FILEALIGN=0,SIZEOFSTACKCOMMIT=4K,SIZEOFHEAPCOMMIT=1M
.

€ASM creates four default implicit segments [CODE], [DATA], [BSS], [STACK] in program formats MZ, OMF, LIBOMF.
Parameter PROGRAM SizeOfStackCommit= specifies default size of segment [STACK], so we don't have to explicitly define stack segment if EUROASM option AUTOSEGMENT= is enabled at the ENDPROGRAM statement.
Parameter SizeOfHeapCommit= can be used to limit the requested amount of heap memory preallocated by the loader (member .e_maxalloc of file header).
If the memory model is HUGE or FLAT and program width is not explicitly specified, it defaults to 32.
ImageBase=0 is fixed for this format and cannot be changed.
Specifying program Entry= is mandatory in MZ format.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ICONFILE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SIZEOFHEAPRESERVE, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

↑ OMF

Object Module Format as specified in [OMF] is designed to be linked to 16bit and 32bit real-mode programs. Imports in this format are linkable to protected-mode executables.

File format OMF is recognized for LINK when it is composed of valid OMF records and the first record is THEADR or LHEADR.

Default options for PROGRAM FORMAT=OMF are PROGRAM OUTFILE=%^PROGRAM.obj,MODEL=SMALL,WIDTH=16.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, FILEALIGN, ICONFILE, IMAGEBASE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SECTIONALIGN, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

↑ LIBOMF

OMF library format is described in Appendix 2 of the same document as [OMF]. A hashed dictionary, required by the format specification at the end of library, is created on output, but €ASM linker ignores it. When the library is linked to another program, its public symbols are searched sequentially. Page size of LIBOMF libraries created by €ASM is fixed at 16.

File format LIBOMF is recognized for LINK when it starts with LIBHDR record with page size 16, 32, 64,..32K, and this record is followed by valid OMF modules, which start with THEADR or LHEADR records and which end with MODEND or MODEND32 record each. Library dictionary at the end of file is not checked.
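The accepted page sizes 16, 32, 64,..32K read as powers of two. Under that assumption the check can be sketched in Python as follows; the function name is hypothetical and this is not the test which €ASM itself performs.

```python
def is_valid_libomf_page_size(size):
    """True for powers of two from 16 up to 32K (assumed reading of the series)."""
    return 16 <= size <= 32 * 1024 and size & (size - 1) == 0

print([size for size in (8, 16, 48, 512, 32768, 65536) if is_valid_libomf_page_size(size)])
# [16, 512, 32768]
```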

Default options for PROGRAM FORMAT=LIBOMF are PROGRAM OUTFILE=%^PROGRAM.lib, other properties are inherited from library modules.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, FILEALIGN, ICONFILE, IMAGEBASE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MODEL, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SECTIONALIGN, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM, WIDTH.

Modules, which will be stored to the library, should be assembled beforehand to files in OMF format. If the program, which creates library, contains some code, it will be assembled and stored as the first library module. Modules which do not declare any global symbol, will not be included in the library at all. Example of static OMF library linked from 3 modules:

MyLib PROGRAM FORMAT=LIBOMF
     LINK "Module1.obj", "Module2.obj", "Module3.obj"
    ENDPROGRAM MyLib

Although format OMF was developed for real-mode programs, it can be enhanced with import declarations represented with OMF records COMENT/IMPDEF. Some librarians (for instance [ALIB]) create a longer alternative of import library, which adds LEDATA+FIXUPP records with relocatable machine code of proxy jumps to the imported function.
€ASM does not create the longer version of import libraries but both short and long versions are accepted by the linker. Example of a program creating pure import library in short OMF format:

ImpLib PROGRAM FORMAT=LIBOMF
  IMPORT LIB="kernel32.dll",TerminateProcess,TerminateThread
  IMPORT LIB="user32.dll",CreateCursor,CreateIcon,CreateMenu
 ENDPROGRAM ImpLib

↑ COFF

EuroAssembler implements object format COFF in Microsoft modification described in [MS_PECOFF]. This description is also valid for €ASM formats LIBCOF, PE, DLL (COFF-based formats).

Default options for PROGRAM FORMAT=COFF are PROGRAM OUTFILE=%^PROGRAM.obj,MODEL=FLAT,WIDTH=32.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, FILEALIGN, ICONFILE, IMAGEBASE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SECTIONALIGN, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

Value in PFCOFF_FILE_HEADER.Machine in 64bit mode PECOFF is either 0x0200 (Intel Itanium) when EUROASM AMD=OFF, or 0x8664 (AMD64) when EUROASM AMD=ON. In 32bit mode it is always 0x014C (Intel 386) regardless of EUROASM CPU= value.

PFCOFF_FILE_HEADER.TimeDateStamp corresponds with the current system time, unless it is forged by option EUROASM TIMESTAMP=.

Linked COFF module is recognized by the contents of PFCOFF_FILE_HEADER.Machine which should be one of the words with value 0x0000, 0x014C, 0x014D, 0x014E, 0x0200, 0x8664.
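
As a rough illustration of this test, the following Python sketch reads the first word of a file (Machine is the first field of the COFF file header) and compares it with the values listed above. The function name is hypothetical and the sketch does not claim to mirror €ASM's implementation.

```python
import struct

# Machine values accepted for a linked COFF module, as listed above.
KNOWN_MACHINES = {0x0000, 0x014C, 0x014D, 0x014E, 0x0200, 0x8664}


def looks_like_coff(data: bytes) -> bool:
    """Return True if the first little-endian word is a known Machine value."""
    if len(data) < 2:
        return False
    (machine,) = struct.unpack_from("<H", data, 0)
    return machine in KNOWN_MACHINES
```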

↑ LIBCOF

COFF library format is described in [COFFlib].

Default options for PROGRAM FORMAT=LIBCOF are PROGRAM OUTFILE=%^PROGRAM.lib,MODEL=FLAT,WIDTH=32.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, FILEALIGN, ICONFILE, IMAGEBASE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SECTIONALIGN, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

COFF library is identified by the signature !<arch> followed with byte 0x0A.
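
A minimal Python sketch of this signature check (the function name is hypothetical):

```python
ARCHIVE_SIGNATURE = b"!<arch>\n"   # seven characters followed by byte 0x0A


def looks_like_coff_library(data: bytes) -> bool:
    """Return True if data begins with the archive signature."""
    return data.startswith(ARCHIVE_SIGNATURE)
```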

Modules which are to be stored in the library should be assembled beforehand to files in COFF format. If the program which creates the library contains some code, it will be assembled and stored as the first library module. Modules which do not declare any global symbol will not be included in the library at all. Example of a COFF library linked from 3 modules:

MyLib PROGRAM FORMAT=LIBCOF
     LINK "Module1.obj", "Module2.obj", "Module3.obj"
    ENDPROGRAM MyLib


€ASM does not create the longer version of import libraries but both short and long versions are accepted by the linker. Example of a program creating import library in short COFF format:

ImpLib PROGRAM FORMAT=LIBCOF
  IMPORT LIB="kernel32.dll",TerminateProcess,TerminateThread
  IMPORT LIB="user32.dll",CreateCursor,CreateIcon,CreateMenu
 ENDPROGRAM ImpLib

↑ PE

Portable executable file format PE is described in [MS_PECOFF]. Default options for PROGRAM FORMAT=PE are

PROGRAM OUTFILE=%^PROGRAM.exe,MODEL=FLAT,WIDTH=32,IMAGEBASE=4M,FILEALIGN=512,SECTIONALIGN=4K, \
        SUBSYSTEM=CON,ICONFILE="euroasm.ico",MAJORLINKERVERSION=1,MINORLINKERVERSION=0,ENTRY=, \
        MAJOROSVERSION=4,MINOROSVERSION=0,MAJORIMAGEVERSION=1,MINORIMAGEVERSION=0, \
        MAJORSUBSYSTEMVERSION=4,MINORSUBSYSTEMVERSION=0,WIN32VERSIONVALUE=0,DLLCHARACTERISTICS=0x000F, \
        SIZEOFSTACKRESERVE=1M,SIZEOFSTACKCOMMIT=4K,SIZEOFHEAPRESERVE=4M,SIZEOFHEAPCOMMIT=1M

PE file begins with DOS program (stub) in MZ format, which is executed when the program is not launched in MS Windows. At the file address PFMZ_DOS_HEADER.e_lfanew it expects the PE format signature with bytes 'P','E',0,0.
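
The lookup of the PE signature can be sketched in Python as follows. The offset 0x3C of the e_lfanew field within the MZ header comes from the PE/COFF specification; the function name is hypothetical and the sketch only illustrates the principle.

```python
import struct


def find_pe_signature(data: bytes):
    """Return the file offset of the PE signature, or None if it is missing."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None                      # not an MZ program at all
    # e_lfanew is a 32bit little-endian file address stored at offset 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        return None                      # plain DOS executable, no PE header
    return e_lfanew
```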

Older file format with NE (New Executable) signature, used in 16bit Windows and OS/2, is not supported by €ASM.

COFF file header is followed with PFPE_OPTIONAL_HEADER. Almost all its fields are configurable with PROGRAM options.
PROGRAM ENTRY= must be explicitly specified in PE format.
Option PROGRAM STUBFILE= specifies file name of 16bit MZ program used when the program runs in DOS. If it is left empty, €ASM will use its own built-in stub, which reports error message This program was launched in DOS but it requires Windows. and terminates.
Factory default option ICONFILE="euroasm.ico" specifies the icon file which will be built in the resource section of linked PE file. It visually represents the compiled file in Windows Explorer or Desktop.

This parameter is ignored if some resource file is explicitly linked to PE (Explorer will then use the first icon found in the PE resources). If the ICONFILE= option is explicitly defined as empty, and if no resources are linked, the resource section [.rsrc] will be omitted from PE file completely.

Optional header is followed with 16 special directory entries which identify sections with special purposes (other than ordinary segment purposes CODE, DATA, BSS). See the last 16 lines in Segment purpose table, starting with EXPORT.

EuroAssembler natively supports only a few of the special directories:

EXPORT
automatically creates section [.edata] with the table of exported symbols, if they are declared
IMPORT
automatically creates section [.idata] with the table of imported symbols names and ordinals
RESOURCE
is created when a resource file is linked to the executable or when program option ICONFILE= specifies an existing icon
BASERELOC
contains table of relocation which must be applied by the loader when the executable could not be loaded at the preferred VA specified by program option IMAGEBASE=
IAT
import address table is created in section [.idata], same as the special directory IMPORT. Concatenation of tables IAT, IMPORT and thunk proxy jumps to one common section [.idata] reduces the size of image.

Other special directories are not supported by this EuroAssembler version. Nevertheless, their segment may be created explicitly, their contents created manually or by some third-party tool and emitted to the segment with INCLUDEBIN or directly with Data definition statements. If segment parameter PURPOSE= complies with the table (case insensitive), the corresponding directory entry in PE optional header will be created, covering the whole segment contents. Example:

[.cormeta] SEGMENT PURPOSE=CLR
 D '<compatibility xmlns="urn:schemas-microsoft-com:compatibility.v1">'
 D '  <application>'
 D '     <!-- A list of all Windows versions that this application is designed to work with.>'
 D '   </application>'
 D ' </compatibility>'

When EUROASM option DEBUG= is enabled at the ENDPROGRAM pseudoinstruction, the symbol table is appended to the PECOFF image.

Debuggers should be able to retrieve symbol names from the debugged executable and associate them with disassembled source lines. Unfortunately, none of the tools which I tried was able to exploit the symbol table from PE.

↑ DLL

File format DLL is almost identical with the format PE, with some minor differences:
file header field PFCOFF_FILE_HEADER.Characteristics is flagged with pfcoffFILE_DLL = 0x2000,
default file extension and image base are PROGRAM OUTFILE=%^PROGRAM.dll,IMAGEBASE=256M,
option ENTRY= is optional in DLL.

Dynamically linkable symbols should be explicitly declared with exported scope.
Pseudoinstruction EXPORT supports dynamic DLL forwarding of exported function to a different function in other DLL, using EXPORT key operands FWD= and LIB=. See the test t7583 as an example.

Format DLL is sometimes used as a resource library which contains only the [.rsrc] section, typically a collection of icons. This is achieved by linking a compiled resource file, as created by a third-party resource compiler. An example of a resource-only DLL which contains 3 icons can be found in tests t7586 and t7616.

↑ RSRC

Microsoft resources is a common name for multimedia data, such as bitmap pictures, icons, cursor shapes, fonts etc. Resources used in a GUI program are described in a resource script as a tree referring to individual graphic files. A typical script is a plain text file with extension .rc, and it should be converted by a resource compiler into a binary resource file with extension .res, which is linkable by €ASM or other linkers. Its format is described in [RSRC].

MyCompiledResource PROGRAM FORMAT=RSRC does not work; EuroAssembler cannot compile resource scripts. Use a third-party tool instead, such as [MS_RC] or [ResourceHacker].

When a resource file is linked to PE or DLL image created by €ASM, program option ICONFILE= is ignored. The file is converted by €ASM to internal PECOFF binary-tree structure in special section [.rsrc] and referred with optional-header directory entry RESOURCE.

↑ Width of program formats

Width of output files linked by EuroAssembler is determined by program option WIDTH= and it defaults to 32 in COFF-based formats. To create a 64bit program PE, DLL or COFF, program width must be explicitly specified and 64bit CPU should be enabled, too.

   EUROASM CPU=X64, AMD=YES
MyProgram64 PROGRAM FORMAT=PE, WIDTH=64

Differences between 32bit and 64bit PECOFF

Columns are selected by PROGRAM WIDTH= and, in 64bit mode, by EUROASM AMD=.

Member                                                  WIDTH=32            WIDTH=64, AMD=YES  WIDTH=64, AMD=NO
PFCOFF_FILE_HEADER.Machine                              0x014C (Intel 386)  0x8664 (AMD64)     0x0200 (Intel Itanium)
PFCOFF_FILE_HEADER.Characteristics:LARGE_ADDRESS_AWARE  0 (false)           0x0020 (true)      0x0020 (true)
PFPE_OPTIONAL_HEADER.Magic                              0x010B (PE32)       0x020B (PE32+)     0x020B (PE32+)
SIZE# PFPE_OPTIONAL_HEADER                              224                 240                240

Enabling AMD= is required for 64bit MS Windows programs on both AMD and Intel processors, because Windows refuses to execute a 64bit PE with .Machine type 0x0200.
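
The dependence of these header fields on the two options can be summarized in a small Python sketch. The function and dictionary key names are hypothetical; the values are those given in the table of differences above.

```python
def pecoff_header_values(width: int, amd: bool) -> dict:
    """Header values implied by PROGRAM WIDTH= and EUROASM AMD=."""
    if width == 32:
        # In 32bit mode the AMD= option has no influence.
        return {"Machine": 0x014C, "LargeAddressAware": 0x0000,
                "Magic": 0x010B, "OptionalHeaderSize": 224}
    if width == 64:
        return {"Machine": 0x8664 if amd else 0x0200,
                "LargeAddressAware": 0x0020,
                "Magic": 0x020B, "OptionalHeaderSize": 240}
    raise ValueError("PECOFF width must be 32 or 64")
```

Only the combination width=64 with amd=True yields Machine 0x8664, which is the value that 64bit Windows accepts.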

↑ EuroAssembler functions

Preprocessing ↓

Refactoring ↓

Assembler ↓

Assembly debugging ↓

Linker ↓

Librarian ↓

Object convertor ↓

Makefile manager ↓

Optimization ↓

This chapter describes EuroAssembler capabilities.

↑ Preprocessing

Many assemblers provide tools which help the programmer with tedious and repetitive work; such assemblers are called macroassemblers. The preprocessing (macro) apparatus in EuroAssembler is recognizable by the percent sign % prefixed to pseudoinstructions which control generation of repeated blocks of source code (%REPEAT, %WHILE, %FOR, %MACRO), conditional assembly (%IF, %COMMENT), and assignment and expansion of preprocessing %variables (%SET* family).

This set of tools manipulates the source text before it is submitted to the final assembly processing (to the plain assembler, which is no longer aware of preprocessing directives).
Some compilers perform preprocessing in a special 0-th pass, which takes the input source file and emits plain assembly source. The preprocessed intermediate file may then be checked manually.

EuroAssembler takes a different approach: rather than preprocessing the whole source file at once, it preprocesses statement by statement in each assembly pass. This makes it possible to work with data which change dynamically and which are not fixed before €ASM has passed through the source program at least once, for instance the distance between labels, or the size of not-yet-defined structures and segments.

The relation between preprocessing and the plain assembly is similar to the relation between Javascript and the plain HTML text in internet browsers.

Proper function of €ASM preprocessing can be checked in the listing, by enabling options EUROASM LISTVAR=ENABLE, LISTREPEAT=ENABLE, LISTMACRO=ENABLE.

↑ Refactoring

Inline code ↓

Bypassed PROC ↓

PROC in own section ↓

PROC1 ↓

PROC in INCLUDE ↓

Statically linked PROC ↓

Dynamically linked PROC ↓

Inline macro ↓

Macro calling PROC ↓

Semiinline macro ↓

This chapter demonstrates various methods of breaking up the program functionality into small subprograms in EuroAssembler.

Suppose that we need a function which calculates the third power of a positive integer input. The result should fit in 32 bits, otherwise the program will report overflow and abort.

Assuming 32bit mode and the input number loaded in register EAX, the simplest solution uses instruction MUL (unsigned multiplication) two times.

↑ Inline code


Straightforward solution inserts the code directly to the main program flow.

    ; EAX contains the input number N.
    MOV ECX,EAX ; Copy the input value N to register ECX.
    MUL ECX     ; Let EDX:EAX = N*N
    JC Abort    ; CF=OF=1 when EDX is nonzero (32bit overflow).
    MUL ECX     ; Let EDX:EAX = N*N*N
    JC Abort    ; Abort on overflow.
    ; EAX now contains N³, continue the main program flow.
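
The overflow behaviour of the two MUL instructions can be modelled in Python. This is just a sketch for checking the boundary values, with hypothetical function names; it mimics the unsigned 32bit MUL, which sets CF=OF=1 whenever the upper half of the product (EDX) is nonzero.

```python
MASK32 = 0xFFFFFFFF


def mul32(eax: int, operand: int):
    """Model unsigned 32bit MUL: return (EAX, EDX, CF) for EDX:EAX = EAX * operand."""
    product = (eax & MASK32) * (operand & MASK32)
    low, high = product & MASK32, product >> 32
    return low, high, high != 0          # CF=OF=1 when EDX is nonzero


def cube(n: int):
    """Return N**3, or None when the result does not fit in 32 bits."""
    ecx = n                              # MOV ECX,EAX
    eax, edx, cf = mul32(n, ecx)         # MUL ECX
    if cf:
        return None                      # JC Abort
    eax, edx, cf = mul32(eax, ecx)       # MUL ECX
    if cf:
        return None                      # JC Abort
    return eax
```

The largest input which does not overflow is 1625, since 1625³ = 4291015625 is below 2³² = 4294967296 while 1626³ = 4298942376 is above it.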

↑ Bypassed PROC

When such a calculation is needed more than once, we should consider refactoring the direct code into a subprocedure which can be called repeatedly. We will insert the procedure named Cube into the program flow where its function is needed for the first time. A callable procedure inserted into the program flow must be bypassed with a jump. The procedure should also be accompanied with remarks which document its function.

       ; EAX contains the input number N.
       CALL Cube:  ; Invoke the function which calculates N³.
       JC Abort:   ; Abort on overflow.
       JMP Bypass: ; Skip the function code.
Cube PROC  ; Define a function which calculates 3rd power of N.
; Input:   EAX=integer number N.
; Output:  CF=OF=0, EAX=N³, ECX=N, EDX=0.
; Overflow:CF=OF=1, EAX,ECX,EDX undefined.
       MOV ECX,EAX ; Copy the input value N to register ECX.
       MUL ECX     ; Let EDX:EAX = N*N
       JC .Abort   ; CF=OF=1 when EDX is nonzero (32bit overflow).
       MUL ECX     ; Let EDX:EAX = N*N*N
.Abort:RET         ; CF=OF=1 when EDX is nonzero (32bit overflow).
     ENDPROC Cube
Bypass: ; EAX now contains N³, continue the main program flow.

↑ PROC in own section

The instruction JMP Bypass: could be spared if the procedure code were defined somewhere else, below the main program flow. This can be achieved by emitting the procedure to a different code section (for instance [Subproc]).

       ; EAX contains the input number N.
       CALL Cube:  ; Invoke the function which calculates N³.
       JC Abort:   ; Abort on overflow.
%CurrentSect %SET %^Section ; Backup the current section name to a variable.
[Subproc]  ; Switch emitting to a different code section.
Cube PROC  ; Define a function which calculates 3rd power of N.
; Input:   EAX=integer number N.
; Output:  CF=OF=0, EAX=N³, ECX=N, EDX=0.
; Overflow:CF=OF=1, EAX,ECX,EDX undefined.
       MOV ECX,EAX ; Copy the input value N to register ECX.
       MUL ECX     ; Let EDX:EAX = N*N
       JC .Abort   ; CF=OF=1 when EDX is nonzero (32bit overflow).
       MUL ECX     ; Let EDX:EAX = N*N*N
.Abort:RET         ; CF=OF=1 when EDX is nonzero (32bit overflow).
     ENDPROC Cube
[%CurrentSect]     ; Return to the original code section.
        ; EAX now contains N³, continue the main program flow.

↑ PROC1

Rather than switching sections manually, we could also utilize the €ASM block PROC1..ENDPROC1, which switches to a different section [@RT1] and returns to the original section automatically.

       ; EAX contains the input number N.
       CALL Cube:  ; Invoke the function which calculates N³.
       JC Abort:   ; Abort on overflow.
Cube PROC1 ; Define a function which calculates 3rd power of N in section [@RT1].
; Input:   EAX=integer number N.
; Output:  CF=OF=0, EAX=N³, ECX=N, EDX=0.
; Overflow:CF=OF=1, EAX,ECX,EDX undefined.
       MOV ECX,EAX ; Copy the input value N to register ECX.
       MUL ECX     ; Let EDX:EAX = N*N
       JC .Abort   ; CF=OF=1 when EDX is nonzero (32bit overflow).
       MUL ECX     ; Let EDX:EAX = N*N*N
.Abort:RET         ; CF=OF=1 when EDX is nonzero (32bit overflow).
     ENDPROC1 Cube ; End of subprocedure in section [@RT1]. Return to [.text].
     ; EAX now contains N³, continue the main program flow.

↑ PROC in INCLUDE

Definition of function Cube at the place where it is used is good for understandability. On the other hand, when there are more such definitions, they clutter the main program thread. It could be more clearly organized if those helper functions were put away to a different file, for instance functions.inc. This file will be included to the main source file at assembly-time.

       INCLUDE "functions.inc" ; File with Cube: PROC source definition.
       ; EAX contains the input number N.
       CALL Cube:  ; Invoke the function which calculates N³.
       JC Abort:   ; Abort on overflow.
       ; EAX now contains N³, continue the main program flow.

↑ Statically linked PROC

Functions defined in the included file functions.inc can be wrapped in a block functions PROGRAM..ENDPROGRAM and assembled separately to an OMF or COFF object file functions.obj, or to a library. The function name (Cube) must be declared as GLOBAL or PUBLIC in the object file, and as GLOBAL or EXTERN in the main file. Instead of an explicit GLOBAL declaration, the name may also be specified with a double colon (Cube::). The assembled object will then be statically linked to the main program at link-time.

       LINK "functions.obj" ; Object file with assembled code of function Cube.
       ; EAX contains the input number N.
       CALL Cube:: ; Invoke the external function which calculates N³.
       JC Abort:   ; Abort on overflow.
       ; EAX now contains N³, continue the main program flow.

↑ Dynamically linked PROC

Functions defined in the included file functions.inc can be wrapped in a block functions PROGRAM..ENDPROGRAM and assembled separately to a dynamically linked library file functions.dll. The function name (Cube) must be declared as EXPORT in the library file, and as IMPORT in the main executable file. The assembled function in the DLL will then be dynamically bound to the main program at run-time.

       IMPORT Cube, LIB="functions.dll"
       ; EAX contains the input number N.
       CALL Cube:: ; Invoke the DLL function which calculates N³.
       JC Abort:   ; Abort on overflow.
       ; EAX now contains N³, continue the main program flow.

↑ Inline macro

An alternative approach to the repeated inline code is utilizing a macro which will expand whenever the functionality is requested.

Statements which define the macro need not be bypassed, because they don't emit any code, but the macrodefinition must appear before the macro is used. The definition could be put aside to an included file as well, similarly to the PROC in INCLUDE method.

Cube %MACRO
       MOV ECX,EAX ; Copy the input value N to register ECX.
       MUL ECX     ; Let EDX:EAX = N*N
       JC Abort%.: ; CF=OF=1 when EDX is nonzero (32bit overflow).
       MUL ECX     ; Let EDX:EAX = N*N*N
Abort%.:           ; Label name is modified by %. variable, which increments in each macro expansion.
     %ENDMACRO Cube
     ; EAX contains the input number N.
     Cube          ; Expansion of the macro.
     JC Abort:     ; Abort on overflow.
     ; EAX now contains N³, continue the main program flow.

↑ Macro calling PROC

Inline macros are fast, but each invocation repeats the whole function code. The size of the program can be reduced if the macro calls a procedure with the function code, which can also be put aside to functions.inc. The role of the macro is then limited to processing any parameters and hiding the calling convention (though no parameters are actually used in our simple example).

     INCLUDE "functions.inc" ; File with Cube: PROC source definition.
Cube %MACRO       ; Definition of the macro Cube.
       CALL Cube: ; Calling the procedure Cube:
     %ENDMACRO Cube
     ; EAX contains the input number N.
     Cube         ; Invoke macro which calls the included PROC.
     JC Abort     ; Abort on overflow.
     ; EAX now contains N³, continue the main program flow.

↑ Semiinline macro

A disadvantage of the previous method is that we have to maintain two blocks of code: the macro definition and the procedure definition. €ASM provides the procedure block PROC1, which is assembled only once, even if the macro which contains it is invoked repeatedly. Thanks to this, the procedure code is emitted only once, when the macro is invoked for the first time, and if the macro is never invoked, the code is not emitted at all. A macrolibrary with such semiinline macros can be included in any program and does not increase the final code size if the macro is not used (expanded) in the program.

Cube %MACRO          ; Definition of the semiinline macro Cube.
       CALL Cube:    ; Calling the procedure Cube:
 Cube: PROC1         ; The PROC1 block is assembled only once, on the first macro invocation.
         MOV ECX,EAX ; Copy the input value N to register ECX.
         MUL ECX     ; Let EDX:EAX = N*N
         JC .Abort:  ; CF=OF=1 when EDX is nonzero (32bit overflow).
         MUL ECX     ; Let EDX:EAX = N*N*N
  .Abort:RET         ; CF=OF=1 when EDX is nonzero (32bit overflow).
       ENDPROC1 Cube:
     %ENDMACRO Cube
     ; EAX contains the input number N.
     Cube            ; Invoke the macro, which calls the procedure defined inside it.
     JC Abort        ; Abort on overflow.
     ; EAX now contains N³, continue the main program flow.

↑ Assembler

Source envelope ↓

Chained programs ↓

Nested programs ↓

This chapter takes a closer look at how a program block of statements is processed by EuroAssembler.

↑ Source envelope

Consider a plain text file src.asm submitted to assembler:

 DB 'This source "src.asm" has'
 DB ' no PROGRAM statement.',13,10
 DB 'EuroAssembler will use '
 DB 'a fictive envelope instead.'

As no PROGRAM..ENDPROGRAM block is defined in this source, the output format of €ASM object file is configured only by [PROGRAM] section in configuration file euroasm.ini, or by built-in default, which is PROGRAM FORMAT=BIN,MODEL=TINY,WIDTH=16.

EuroAssembler formally wraps each source file into two fictive envelope statements. Prefixed envelope PROGRAM statement derives its label (module name) from the source file name, cutting off its extension. Thus it will assemble the source src.asm to a data file src.bin. This behaviour is compatible with most other assemblers.

If the source file name starts with a digit, such label is not acceptable by €ASM, so the module name will be prefixed with grave ` and source 123.asm is assembled to `123.bin.

Similarly, when the label of the PROGRAM statement contains ? or other characters unacceptable to the filesystem, each such character in the module file name will be replaced with underscore _. Statement IsNumlockOn? PROGRAM FORMAT=COM will produce a program named IsNumlockOn_.com.
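
The naming rules from the last three paragraphs can be sketched in Python. Both function names are hypothetical, and the set of characters treated as unacceptable is an assumption based on the names which Windows filesystems reject; the manual itself names only the question mark.

```python
import os

# Assumption: characters which Windows filesystems reject in file names.
UNACCEPTABLE = set('<>:"/\\|?*')


def module_name_from_source(path: str) -> str:
    """Derive the envelope module name from the source file name."""
    name = os.path.splitext(os.path.basename(path))[0]   # cut off the extension
    # A label may not start with a digit, so it is prefixed with a grave.
    return "`" + name if name[:1].isdigit() else name


def output_file_name(module: str, extension: str) -> str:
    """Replace characters unacceptable in file names with underscore."""
    safe = "".join("_" if ch in UNACCEPTABLE else ch for ch in module)
    return safe + extension
```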

€ASM uses the ANSI version of the Windows API for dealing with file names, so I recommend abstaining from national characters outside the current codepage in source file names.

When the source file is loaded in memory, €ASM begins to read the source, starting with the envelope statement PROGRAM. When the corresponding ENDPROGRAM is found, an assembly pass is over. €ASM checks all symbols, which might have been defined in the program, and looks whether their offset is marked fixed, i.e. it did not change between passes. If at least one symbol has its offset not fixed yet, another assembly pass is needed and €ASM goes back to the PROGRAM statement. When all symbols are fixed, €ASM starts the final assembly pass, in which code+data is generated to the target file and listing is produced. Each source requires at least two passes to assemble.
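
The pass logic can be illustrated with a toy model in Python. It is not €ASM's algorithm, only a sketch under stated assumptions: statements are hypothetical (kind, argument) tuples, a jump takes 2 bytes when its target is known and within short range and 5 bytes otherwise, and a pass counts as "fixed" when no label offset changed during it.

```python
def run_pass(statements, symbols):
    """Walk the statements once; return (emitted size, any offset changed)."""
    offset, changed = 0, False
    for kind, arg in statements:
        if kind == "label":
            if symbols.get(arg) != offset:
                symbols[arg] = offset        # offset not fixed yet
                changed = True
        elif kind == "db":
            offset += arg                    # arg = number of data bytes
        elif kind == "jmp":
            target = symbols.get(arg)        # may come from the previous pass
            short = target is not None and -128 <= target - (offset + 2) <= 127
            offset += 2 if short else 5
    return offset, changed


def assemble(statements, max_passes=32):
    """Repeat passes until all offsets are fixed, then run the final pass."""
    symbols, passes = {}, 0
    while True:
        passes += 1
        _, changed = run_pass(statements, symbols)
        if not changed:
            break
        if passes >= max_passes:
            raise RuntimeError("symbol offsets did not stabilize")
    passes += 1                              # the final pass emits code+data
    size, _ = run_pass(statements, symbols)
    return size, passes, symbols
```

In this model a source without symbols needs two passes and a source with a single label needs three. A forward jump such as assemble([("jmp", "End"), ("db", 10), ("label", "End")]) needs four, because the jump shrinks from the long to the short form once the target offset is known.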

                                                     assembly progress ─>
┌─────────┬──────────────────────────────────┐
│envelope │src: PROGRAM                      │      █       ┌█
├─────────┼──────────────────────────────────┤       █      │ █
│      {1}│ DB 'This source "src.asm" has'   │        █     │  █
│"src.asm"│ DB ' no PROGRAM statement.',13,10│         █    │   █
│      {3}│ DB 'EuroAssembler will use '     │          █   │    █
│      {4}│ DB 'a fictive envelope instead.' │           █  │     █
├─────────┼──────────────────────────────────┤            █ │      █
│envelope │ ENDPROGRAM src:                  │             █┘       █─┐
└─────────┴──────────────────────────────────┘
                                                   ││        │      │ │
I0010 EuroAssembler started.───────────────────────┤│        │      │ │
I0180 Assembling source file "src.asm".────────────┤│        │      │ │
I0270 Assembling source "src".─────────────────────┘│        │      │ │
I0310 Assembling source pass 1.─────────────────────┘        │      │ │
I0330 Assembling source pass 2 - final.──────────────────────┘      │ │
I0760 16bit TINY BIN file "src.bin" created from source, size=99.───┘ │
I0750 Source "src" (4 lines) assembled in 2 passes with errorlevel 0.─┤
I0860 Listing file "src.asm.lst" created, size=717.───────────────────┤
I0990 EuroAssembler terminated with errorlevel 0.─────────────────────┘
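
The size=99 reported by message I0760 can be cross-checked by counting the bytes emitted by the four DB statements, for instance with this short Python snippet (the variable names are arbitrary):

```python
# Data emitted by the four DB statements of src.asm, in source order.
emitted = [
    b'This source "src.asm" has',
    b' no PROGRAM statement.' + bytes([13, 10]),
    b'EuroAssembler will use ',
    b'a fictive envelope instead.',
]
image = b"".join(emitted)
sizes = [len(part) for part in emitted]      # 25, 24, 23 and 27 bytes
print(len(image))                            # total size of src.bin
```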

Envelope statements are used regardless of whether an explicit PROGRAM block is defined in the source text. Source lines between the start of file and the explicit PROGRAM statement, as well as lines between the explicit ENDPROGRAM and the end of source, should not emit any data or code. In this case the envelope program is empty and no target file is created from it.

Consider the following source file src.asm. There is an explicit block Src:PROGRAM (lines 5..8) inside the envelope src: PROGRAM. When an inner PROGRAM..ENDPROGRAM block is found during assembly, the block is skipped until the final pass is performed. Then €ASM puts the currently assembled final pass aside and assembles the inner block in as many passes as necessary, creating the inner program's target file. After that, €ASM returns to finish the final pass of the outer (envelope) program.

    EUROASM ; Common options.
    ; Source file "src.asm"
    ; with PROGRAM defined
explicitly.
Src:PROGRAM FORMAT=BIN
     DB 'Data emitted '
     DB 'by program Src.'
     ENDPROGRAM Src:

Notice the bug: the wrap of comment line {3} yields a non-comment line {4}. The expression explicitly. is treated as a valid label (definition of an address symbol). This causes the envelope to be treated as not empty, and target file src.bin is created from it, although with zero file size, as it contains only a zero-sized address symbol.
The inner program from lines {5..8} creates target file Src.bin with size 28 bytes, but it is soon overwritten by the envelope's zero-sized target src.bin, which happens to have an almost identical name (the filesystem in Windows is case-insensitive).
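
Both observations can be verified with a few lines of Python. The comparison function is a hypothetical simplification: it treats two names as the same file when they differ only in letter case, which is how the name collision arises on Windows.

```python
# Data emitted by the inner program Src (two DB statements).
inner_image = b"Data emitted " + b"by program Src."


def same_file_on_windows(name1: str, name2: str) -> bool:
    """Simplified model of a case-insensitive file name comparison."""
    return name1.casefold() == name2.casefold()


print(len(inner_image))
print(same_file_on_windows("Src.bin", "src.bin"))
```

Giving the inner program a label which differs from the source file name in more than letter case avoids the collision.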


┌─────────┬──────────────────────────────────┐  █              assembly progress ─────────>
│envelope │src: PROGRAM                      │   █         ┌█         ┌█
├─────────┼──────────────────────────────────┤    █        │ █        │ █
│      {1}│ EUROASM ; Common options.        │     █       │  █       │  █
│      {2}│    ; Source file "src.asm"       │      █      │   █      │   █
│      {3}│    ; with PROGRAM defined        │       █     │    █     │    █
│      {4}│explicitly.                       │        █┐   │     █┐   │     █
│"src.asm"│Src:PROGRAM FORMAT=BIN            │         │   │      │   │      █─█   ┌█
│      {6}│     DB 'Data emitted '           │         │   │      │   │         █  │ █
│      {7}│     DB 'by program Src.'         │         │   │      │   │          █ │  █
│      {8}│     ENDPROGRAM Src:              │         └█  │      └█  │           █┘   █┐
├─────────┼──────────────────────────────────┤           █ │        █ │                 └█
│envelope │ ENDPROGRAM src:                  │            █┘         █┘                   █┐
└─────────┴──────────────────────────────────┘
                                                ││          │          │    │ ││    │  │  ││
I0010 EuroAssembler started.────────────────────┤│          │          │    │ ││    │  │  ││
I0180 Assembling source file "src.asm".─────────┤│          │          │    │ ││    │  │  ││
I0270 Assembling source "src".──────────────────┘│          │          │    │ ││    │  │  ││
I0310 Assembling source pass 1.──────────────────┘          │          │    │ ││    │  │  ││
I0310 Assembling source pass 2.─────────────────────────────┘          │    │ ││    │  │  ││
I0330 Assembling source pass 3 - final.────────────────────────────────┘    │ ││    │  │  ││
W2101 Symbol "explicitly." was defined but never used. "src.asm"{4}─────────┘ ││    │  │  ││
I0470 Assembling program "Src". "src.asm"{5}──────────────────────────────────┘│    │  │  ││
I0510 Assembling program pass 1. "src.asm"{5}──────────────────────────────────┘    │  │  ││
I0530 Assembling program pass 2 - final. "src.asm"{5}───────────────────────────────┘  │  ││
I0660 16bit TINY BIN file "Src.bin" created, size=28. "src.asm"{8}─────────────────────┤  ││
I0650 Program "Src" assembled in 2 passes with errorlevel 0. "src.asm"{8}──────────────┘  ││
W3990 Overwriting previously generated output file "Src.bin".─────────────────────────────┤│
I0760 16bit TINY BIN file "src.bin" created from source, size=0.──────────────────────────┤│
I0750 Source "src" (8 lines) assembled in 3 passes with errorlevel 3.─────────────────────┤│
I0860 Listing file "src.asm.lst" created, size=1372.──────────────────────────────────────┘│
I0990 EuroAssembler terminated with errorlevel 3.──────────────────────────────────────────┘

↑ Chained programs

EuroAssembler allows defining more than one program block in a single source file and assembling all of them with one command. Remember that symbols used in different PROGRAM..ENDPROGRAM blocks have private scope, so they don't see each other, although they are defined in the same source file. If we want to call a procedure defined in Pgm1 from Pgm2, the called symbol must be declared global and both assembled modules must be linked together.

┌─────────┬──────────────────────────────────┐ █            assembly progress ─────────────────>
│envelope │src: PROGRAM                      │  █       ┌█
├─────────┼──────────────────────────────────┤   █      │ █
│      {1}│     EUROASM ; Common options.    │    █     │  █
│      {2}│Pgm1:PROGRAM FORMAT=PE,ENTRY=Run1:│     █┐   │   █─█   ┌█   ┌█
│      {3}│      ; Pgm1 data.                │      │   │      █  │ █  │ █
│      {4}│Run1: ; Pgm1 code.                │      │   │       █ │  █ │  █
│"src.asm"│     ENDPROGRAM Pgm1:             │      │   │        █┘   █┘   █┐
│      {6}│     ; Pgm2 description.          │      │   │                   █
│      {7}│Pgm2:PROGRAM FORMAT=PE,ENTRY=Run2:│      │   │                   └█   ┌█   ┌█
│      {8}│      ; Pgm2 data.                │      │   │                     █  │ █  │ █
│      {9}│Run2: ; Pgm2 code.                │      │   │                      █ │  █ │  █
│     {10}│      ENDPROGRAM Pgm2:            │      └█  │                       █┘   █┘   █┐
├─────────┼──────────────────────────────────┤        █ │                                  └█
│envelope │ ENDPROGRAM src:                  │         █┘                                    █┐
└─────────┴──────────────────────────────────┘
                                               ││        │    │    │    │  │ │    │    │   │ ││
I0010 EuroAssembler started.───────────────────┤│        │    │    │    │  │ │    │    │   │ ││
I0180 Assembling source file "src.asm".────────┤│        │    │    │    │  │ │    │    │   │ ││
I0270 Assembling source "src".─────────────────┘│        │    │    │    │  │ │    │    │   │ ││
I0310 Assembling source pass 1.─────────────────┘        │    │    │    │  │ │    │    │   │ ││
I0330 Assembling source pass 2 - final.──────────────────┘    │    │    │  │ │    │    │   │ ││
I0470 Assembling program "Pgm1". "src.asm"{2}─────────────────┤    │    │  │ │    │    │   │ ││
I0510 Assembling program pass 1. "src.asm"{2}─────────────────┘    │    │  │ │    │    │   │ ││
I0510 Assembling program pass 2. "src.asm"{2}──────────────────────┘    │  │ │    │    │   │ ││
I0530 Assembling program pass 3 - final. "src.asm"{2}───────────────────┘  │ │    │    │   │ ││
I0660 32bit FLAT PE file "Pgm1.exe" created, size=14320. "src.asm"{5}──────┤ │    │    │   │ ││
I0650 Program "Pgm1" assembled in 3 passes with errorlevel 0. "src.asm"{5}─┘ │    │    │   │ ││
I0470 Assembling program "Pgm2". "src.asm"{7}────────────────────────────────┤    │    │   │ ││
I0510 Assembling program pass 1. "src.asm"{7}────────────────────────────────┘    │    │   │ ││
I0510 Assembling program pass 2. "src.asm"{7}─────────────────────────────────────┘    │   │ ││
I0530 Assembling program pass 3 - final. "src.asm"{7}──────────────────────────────────┘   │ ││
I0660 32bit FLAT PE file "Pgm2.exe" created, size=14320. "src.asm"{10}─────────────────────┤ ││
I0650 Program "Pgm2" assembled in 3 passes with errorlevel 0. "src.asm"{10}────────────────┘ ││
I0750 Source "src" (10 lines) assembled in 2 passes with errorlevel 0.───────────────────────┤│
I0860 Listing file "src.asm.lst" created, size=1736.─────────────────────────────────────────┘│
I0990 EuroAssembler terminated with errorlevel 0.─────────────────────────────────────────────┘

Why should we pack multiple modules together with their documentation into a single file rather than scatter them across a bunch of small files? It's a matter of individual preference.

One reason could be the transfer of information between modules using preprocessing %variables. Unlike ordinary symbols, their scope is not limited by PROGRAM block boundaries. Suppose that in Pgm2 we need to know the size of the data segment from Pgm1. Let's read the size into a %variable with the statement %Pgm1DataSize %SETA SIZE# [DATA], which is placed in Pgm1 just above ENDPROGRAM Pgm1. In the final pass of Pgm1 the segment size is reliably known, and the variable %Pgm1DataSize is visible in the whole source below its definition, so Pgm2 can calculate with it.
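A minimal sketch of this transfer follows. It assumes that both programs use a format whose data segment is named [DATA] (FORMAT=MZ is chosen here for illustration only); the entry labels Run1 and Run2 are hypothetical.

```
Pgm1 PROGRAM FORMAT=MZ,ENTRY=Run1:
Run1:  ; Pgm1 data + code.
       %Pgm1DataSize %SETA SIZE# [DATA] ; Remember the size of Pgm1's data segment.
     ENDPROGRAM Pgm1
Pgm2 PROGRAM FORMAT=MZ,ENTRY=Run2:
Run2:  MOV CX,%Pgm1DataSize ; The %variable is still visible here, outside Pgm1.
       ; More Pgm2 code.
     ENDPROGRAM Pgm2
```

The %variable is expanded to a plain number before Pgm2 is assembled, so Pgm2 needs no symbol exported from Pgm1.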

Another example where grouping programs is profitable is when the programs are similar or they share common data, declared with preprocessing %variables. The following example creates three similar short programs RstLPT1.com, RstLPT2.com, RstLPT3.com in a loop:

Nr %FOR 1,2,3 ; Repeat the %FOR..%ENDFOR block three times.
 RstLPT%Nr PROGRAM FORMAT=COM ; Program to reset LinePrinter port.
   MOV DX,%Nr-1 ; BIOS expects zero-based LPT port number (0,1,2).
   MOV AH,1 ; BIOS function INITIALIZE LPT PORT.
   INT 17h  ; Use BIOS function to reset printer.
   MOV DX,Message ; Put the address of $-terminated string to DS:DX.
   MOV AH,9 ; DOS function WRITE STRING TO STDOUT.
   INT 21h ; Use DOS function to report success.
   RET     ; Terminate program.
   Message:DB "LPT%Nr was reset.$"
 ENDPROGRAM RstLPT%Nr
%ENDFOR Nr ; Generate 3 clones of the program.

↑ Nested programs

Program modules can be nested in one another. For instance, when building an amphibious program executable in both DOS and Windows, we may want to reflect the fact that the DOS-executable MZ file is embedded as a stub in the Windows-executable PE file, both providing the same functionality.

Again, when the outer program sees an inner program block in a non-final pass, the block is skipped. In the final pass the assembly of the outer program is temporarily suspended, the inner program is completely assembled, and then the final pass of the outer program continues.

┌─────────┬──────────────────────────────────┐ █                   assembly progress ──────────────>
│envelope │src: PROGRAM                      │  █       ┌█
├─────────┼──────────────────────────────────┤   █      │ █
│      {1}│      EUROASM ; Common options.   │    █     │  █
│      {2}│Pgm1: PROGRAM FORMAT=PE,ENTRY=Run:│     █┐   │   █─█       ┌█       ┌█
│      {3}│Run:   ; Pgm1 data + code.        │      │   │      █      │ █      │ █
│      {4}│ Pgm2: PROGRAM FORMAT=COFF        │      │   │       █┐    │  █┐    │  █─█  ┌█
│"src.asm"│        ; Pgm2 data + code.       │      │   │        │    │   │    │     █ │ █
│      {6}│       ENDPROGRAM Pgm2:           │      │   │        └█   │   └█   │      █┘  █─█
│      {7}│       ; Pgm1 more code.          │      │   │          █  │     █  │             █
│      {8}│       LINK "Pgm2.obj"            │      │   │           █ │      █ │              █
│      {9}│      ENDPROGRAM Pgm1:            │      └█  │            █┘       █┘               █─█
├─────────┼──────────────────────────────────┤        █ │                                         █
│envelope │ ENDPROGRAM src:                  │         █┘                                          █─┐
└─────────┴──────────────────────────────────┘
                                               ││        │    │        │        │   │   │ │    │   │ │
I0010 EuroAssembler started. ──────────────────┤│        │    │        │        │   │   │ │    │   │ │
I0180 Assembling source file "src.asm".────────┤│        │    │        │        │   │   │ │    │   │ │
I0270 Assembling source "src".─────────────────┘│        │    │        │        │   │   │ │    │   │ │
I0310 Assembling source pass 1.─────────────────┘        │    │        │        │   │   │ │    │   │ │
I0330 Assembling source pass 2 - final.──────────────────┘    │        │        │   │   │ │    │   │ │
I0470 Assembling program "Pgm1". "src.asm"{2}─────────────────┤        │        │   │   │ │    │   │ │
I0510 Assembling program pass 1. "src.asm"{2}─────────────────┘        │        │   │   │ │    │   │ │
I0510 Assembling program pass 2. "src.asm"{2}──────────────────────────┘        │   │   │ │    │   │ │
I0530 Assembling program pass 3 - final. "src.asm"{2}───────────────────────────┘   │   │ │    │   │ │
I0470 Assembling program "Pgm2". "src.asm"{4}───────────────────────────────────────┤   │ │    │   │ │
I0510 Assembling program pass 1. "src.asm"{4}───────────────────────────────────────┘   │ │    │   │ │
I0530 Assembling program pass 2 - final. "src.asm"{4}───────────────────────────────────┘ │    │   │ │
I0660 32bit FLAT COFF file "Pgm2.obj" created, size=78. "src.asm"{6}──────────────────────┤    │   │ │
I0650 Program "Pgm2" assembled in 2 passes with errorlevel 0. "src.asm"{6}────────────────┘    │   │ │
I0560 Linking COFF module ".\Pgm2.obj". "src.asm"{9}───────────────────────────────────────────┤   │ │
I0660 32bit FLAT PE file "Pgm1.exe" created, size=14320. "src.asm"{9}──────────────────────────┤   │ │
I0650 Program "Pgm1" assembled in 3 passes with errorlevel 0. "src.asm"{9}─────────────────────┘   │ │
I0750 Source "src" (9 lines) assembled in 2 passes with errorlevel 0.──────────────────────────────┤ │
I0860 Listing file "src.asm.lst" created, size=1237.───────────────────────────────────────────────┘ │
I0990 EuroAssembler terminated with errorlevel 0.────────────────────────────────────────────────────┘

↑ Assembly debugging

Many features of EuroAssembler can help the programmer to assure that the source is assembled as intended.

The dump column of the listing displays the assembled code. Repeated stretches, which are considered bug-free, are suppressed by default, but they can be displayed on demand with directives EUROASM LISTINCLUDE=ON, LISTVAR=ON, LISTMACRO=ON, LISTREPEAT=ON.

Recognition of fields in statements can be investigated with option EUROASM DISPLAYSTM=ON, which inserts comment lines identifying each field. As this option blows up the output size significantly, it's better to limit DISPLAYSTM only to the suspected lines, and then switch the option OFF or restore the previous set of options:

   EUROASM PUSH, DISPLAYSTM=ON ; Store all current EUROASM options with PUSH first.
   MyMacro Operand1, Operand2  ; "MyMacro" was not defined yet as a %MACRO, so it's treated like a label.
D1010 **** DISPLAYSTM "MyMacro Operand1, Operand2"
D1020 label="MyMacro"
D1040 unknown operation="Operand1"
D1050 ordinal operand number=1,value="Operand2"
   EUROASM POP                 ; Restore EUROASM options.
D1010 **** DISPLAYSTM "EUROASM POP"
D1040 pseudo operation="EUROASM"
D1050 ordinal operand number=1,value="POP"
   ; Statement fields are no longer displayed.

Detailed encoding of machine instructions can be displayed with option EUROASM DISPLAYENC=ON, which inserts a comment line below each machine instruction with the list of modifiers actually used.

   EUROASM PUSH, DISPLAYENC=ON ; Store all current EUROASM options with PUSH first.
   SHRD [RDI+64],RDX,2
D1080 Emitted size=6,DATA=QWORD,DISP=BYTE,SCALE=SMART,ADDR=ABS,IMM=BYTE.
   VMOVNTDQA XMM17,[RBP+40h]
D1080 Emitted size=7,PREFIX=EVEX,DATA=OWORD,OPER=0,DISP=BYTE,SCALE=SMART,ADDR=ABS.
   EUROASM POP ; Restore EUROASM options. Encodings are no longer displayed.

All configuration options, which can be specified with EUROASM and PROGRAM keyword operands, are retrievable in the form of system %^variables, thus their current value can be checked or otherwise exploited:

   %IF %^NOWARN[2101]
     %ERROR You shouldn't suppress the warning W2101. Move unused symbols to included file instead.
   %ENDIF

The most powerful assembly-debugging tool is the pseudoinstruction %DISPLAY, which displays internal €ASM objects at assembly time and helps to find out why €ASM doesn't work as expected.

Static linking ↓

Dynamic linking ↓

Linking in IT terminology is the process in which separately assembled or compiled modules are joined, interactions between globally accessible symbols are resolved, and their code and data are combined and reformatted to the target file format. See [Linkers] for more details.

Unlike many other linkers, EuroAssembler can create not only executable files, but also linkable formats COFF and OMF, and their libraries LIBCOF and LIBOMF (see Object convertor and the table of supported linker combinations).

Linking in EuroAssembler takes place when pseudoinstruction ENDPROGRAM is processed in the final pass.

Linking is requested with the pseudoinstruction LINK followed by the filenames of input modules. Input files acceptable for the EuroAssembler linker are of two kinds:

  1. linkable file formats for static linking are COFF, OMF, LIBCOF, LIBOMF, RSRC.
  2. importable file formats for dynamic linking are DLL, LIBCOF, LIBOMF.
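File formats accepted by EuroAssembler statement LINK
┌─────┬───────┬────────────┬───────────────────────────────┬──────────────────────────────┐
│CPU  │Program│Output      │Input                          │Input                         │
│mode │width  │executable  │linkable                       │importable                    │
├─────┼───────┼────────────┼───────────────────────────────┼──────────────────────────────┤
│Real │16     │BIN, COM, MZ│OMF, LIBOMF                    │-                             │
│Real │32     │BIN, COM, MZ│OMF, LIBOMF, COFF, LIBCOF      │-                             │
│Prot │32     │PE, DLL     │OMF, LIBOMF, COFF, LIBCOF, RSRC│OMF, LIBOMF, COFF, LIBCOF, DLL│
│Prot │64     │PE, DLL     │COFF, LIBCOF, RSRC             │OMF, LIBOMF, COFF, LIBCOF, DLL│
└─────┴───────┴────────────┴───────────────────────────────┴──────────────────────────────┘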

The actual format of linked file is recognized by file contents, not by file name extension. Each linked module is loaded and converted to an €ASM internal format (PGM) in memory.

Notice that object format OMF cannot be linked in 64bit programs.

↑ Static linking

Code and data from linked object files in formats COFF or OMF will be combined and concatenated with code and data from the base program (i.e. the one to which it's linked). Linker also resolves mutual references between public and external symbols from all linked modules.

Besides standalone object modules, the code and data can also be linked from object libraries in formats LIBCOF and LIBOMF.

When the target base program is executable, €ASM links only those modules from the library which are referenced at least once by other modules (smart linking). This helps to keep the size of the linked file small by eliminating dead (never-to-be-executed) code.

If we nevertheless need to combine unreferenced library procedures into our executable program, we have to explicitly declare their names GLOBAL in the base program.
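A sketch of such a declaration is shown below. The names UnusedProc1, UnusedProc2 and MyLibrary.lib are hypothetical, and the operand list of GLOBAL is assumed to follow the same comma-separated form as the IMPORT statements shown elsewhere in this manual.

```
Main PROGRAM FORMAT=PE,ENTRY=Start:
       GLOBAL UnusedProc1, UnusedProc2 ; Hypothetical library procedures which nobody calls.
Start: ; Main data + code.
       LINK "MyLibrary.lib"            ; Hypothetical object library which defines them.
     ENDPROGRAM Main
```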

Smart linking does not apply when the target file is linkable, for instance when a LIBCOF library is created from other libraries. In this case all modules (referenced or unreferenced) will be linked to the target file.

A good reason to split a big project into smaller, separately assembled modules is a faster build.

When a project grows and its source doubles in size, the number of symbols in it is likely to double, too. Each symbol needs to be compared with the array of already declared symbols to avoid duplication, so n symbols require roughly n·(n−1)/2 comparisons. The number of checks, and with it the consumed time, therefore grows almost quadratically with source size.

During the development process we usually concentrate on one part (module) of the project, so the remaining unchanged modules do not need to be recompiled in each development cycle (see also Makefile manager).

Recapitulation: If you want to statically link your own function (procedure), declare it PUBLIC function (or terminate its definition label with two colons function:: PROC) and assemble the function to an object or library module.
Then assemble the main program, declare the linked function EXTERN function (or terminate the called name with two colons) and insert the pseudoinstruction LINK module.obj into the main program. The main program can then CALL function:: as if it were assembled in its own body.
The same applies to functions from a 3rd party library. In both cases you must observe the published name, calling convention, and the number, order and type of arguments.
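The recapitulation above can be sketched as two programs in one source. The names MyModule, MyFunction, MyMain and Start are hypothetical; the output name MyModule.obj is assumed from the way the COFF program Pgm2 produced "Pgm2.obj" in the listing of nested programs.

```
MyModule PROGRAM FORMAT=COFF             ; Creates linkable file "MyModule.obj".
MyFunction:: PROC                        ; Two colons make the symbol global; defined here, it acts as PUBLIC.
             ; Body of the function.
             RET
            ENDP MyFunction
         ENDPROGRAM MyModule

MyMain   PROGRAM FORMAT=PE,ENTRY=Start:
Start:      CALL MyFunction::            ; Two colons again; not defined here, so it acts as EXTERN.
            ; More code of the main program, including its termination.
            LINK "MyModule.obj"          ; Statically link the object module.
         ENDPROGRAM MyMain
```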

↑ Dynamic linking

The code and data of dynamically linked functions are not copied to the target executable image; they remain in a dynamic library (DLL), which has to be available on the system where our executable runs. When our program calls a function from a DLL, it actually calls a thunk, which is a single proxy jump instruction (stub).
€ASM generates stubs in a special import section [.idata] in the form of an indirect absolute JMPN. Each such proxy jump is 7 bytes long (0xFF2425[00000000]) and uses a pointer into the Import Address Table (IAT) as its indirect DWORD target. The virtual address in the pointer [00000000] is resolved by the linker, but the actual 32bit or 64bit virtual address of the library function (pointed to by the resolved dword) is fixed up later by the loader, at bind time when the application starts.

Loader, implemented in Windows kernel, needs two pieces of information to dynamically link library functions and to fix up their addresses in IAT:

1) The name of linked symbol (function name) or its ordinal number in the table of exported symbols.

Calling by ordinals is not supported in €ASM.

2) The name of library file which exports the symbol (without path).

Path to the library file will be established by the loader. The order of directories where MS Windows searches for the library is explained in [WinDllSearchOrder].

A program which needs to call a symbol (imported function) from a dynamic library should declare the symbol as imported. It may be declared GLOBAL as well, either explicitly or implicitly (CALL ImportedSymbol::), but €ASM will treat such a global symbol as EXTERN (statically linked) and complain that the corresponding public symbol was not found.
There are several methods how to tell €ASM that the symbol should be dynamically linked:

Recapitulation: If you want to dynamically link your own function (procedure) in other programs, declare it EXPORT function and assemble the function to the DLL format (mylib PROGRAM FORMAT=DLL). Be sure to distribute mylib.dll together with your programs.
Then assemble the main executable program, declaring the linked function IMPORT function, LIB=mylib.dll. The main program can then invoke it using CALL function.
More often you will need to call functions from a 3rd party dynamic library, which is the case with the MS Windows API. You might enumerate each WinAPI function used with a pseudoinstruction such as IMPORT function1,function2,LIB=user32.dll, but a more comfortable solution is to use an import library, which declares all function names exported by the DLL. Then you don't have to add new import declarations every time a new function is used in your program. Simply call the new function with a double colon and, when its name appears in some import library, it will be treated as imported. You may also want to use the macro WinAPI, which takes care of the IMPORT declaration and automatic selection between the ANSI and WIDE variant.
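For the first case (your own function in your own DLL), a minimal sketch might look as follows. The names MyFunction, MyMain and Start are hypothetical; everything else uses only the statements named in the recapitulation.

```
mylib  PROGRAM FORMAT=DLL                ; Creates dynamic library "mylib.dll".
         EXPORT MyFunction               ; Make the function available to other programs.
MyFunction PROC
           ; Body of the function.
           RET
          ENDP MyFunction
       ENDPROGRAM mylib

MyMain PROGRAM FORMAT=PE,ENTRY=Start:
         IMPORT MyFunction, LIB=mylib.dll ; Resolved by the loader when MyMain.exe starts.
Start:   CALL MyFunction                 ; Calls the proxy jump (stub) in [.idata].
         ; More code of the main program, including its termination.
       ENDPROGRAM MyMain
```

No LINK statement is needed in MyMain here, because the IMPORT declaration already names the library.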

↑ Librarian

EuroAssembler can create libraries from previously assembled object modules (files in OMF or COFF format). When the library program itself contains some code and data, it will be implicitly linked to the library as the first module.

Library PROGRAM FORMAT=LIBOMF  ; or FORMAT=LIBCOF
  ObjModule1 PROC ; One of the object modules can also be defined here.
                  ; Source code of ObjModule1.
             ENDP ObjModule1
             LINK "ObjModule2.obj", "ObjModule3.obj" ; Other OMF and/or COFF object modules.
        ENDPROGRAM Library

If the linked modules contain import information, it is copied to the output library, too. A pure import library contains import declarations only. They may be explicitly declared as IMPORT, or loaded from a dynamic library, or linked from other import libraries. The following example exploits all three methods:

ImpLibrary PROGRAM FORMAT=LIBOMF ; or FORMAT=LIBCOF
             IMPORT Symbol1, Symbol2, LIB="DynamicLibrary1.dll" ; Explicit declaration.
             LINK "C:\MyDLLs\DynamicLibrary2.dll"               ; Automatic export detection from DLL.
             LINK "OtherImportLibrary.lib"                      ; Reimport from library.
           ENDPROGRAM ImpLibrary

Examples of libraries created from three separately assembled modules can be found in €ASM tests:
t7016 (object library LIBOMF for 16bit Dos),
t7337 (object library LIBCOF for 32bit Windows),
t7361 (object library LIBCOF for 64bit Windows),
t7184 (import library LIBOMF for Windows),
t7412 (import library LIBCOF for Windows).

↑ Object convertor

EuroAssembler can link both main object formats OMF and COFF, so the demand for explicit object conversion between them should be rare. Example:

OMFobject PROGRAM FORMAT=OMF ; Convert COFF object file to the format OMF.
            LINK "COFFobject.obj"
          ENDPROGRAM OMFobject
COFFobject PROGRAM FORMAT=COFF ; Convert OMF object file to the format COFF.
            LINK "OMFobject.obj"
          ENDPROGRAM COFFobject
OMFlibrary PROGRAM FORMAT=LIBOMF ; Convert COFF object library to the format LIBOMF.
            LINK "COFFlibrary.lib"
          ENDPROGRAM OMFlibrary
COFFlibrary PROGRAM FORMAT=LIBCOF ; Convert OMF object library to the format LIBCOF.
            LINK "OMFlibrary.lib"
          ENDPROGRAM COFFlibrary

↑ Makefile manager

Operator FILETIME# retrieves the last modification time of a file at assembly time, which can be used to detect whether the target file needs reassembly. Just compare the filetime of the target with the filetime of each source which the target depends on. If the target file does not exist, its attribute operator FILETIME# returns 0, which is the same as if it were very old, so its reassembly will be required anyway.

    %IF FILETIME# "target.exe" > FILETIME# "source.asm" && FILETIME# "target.exe" > FILETIME# "included2source.inc"
       %ERROR "target.exe" is fresh, no need to assemble again.
    %ELSE
       ; Recompile "source.asm" only if "target.exe" doesn't exist or if it is older than its sources.
target PROGRAM FORMAT=PE
         INCLUDE "source.asm"
       ENDPROGRAM target
    %ENDIF

For an example of a more sophisticated makefile script see the main EuroAssembler source file euroasm.htm.


↑ Optimization

Computer programs are often written in assembler because we want them to be fast and/or small. However, those are not the only criteria by which a program can be optimized:

By program size ↓

By program speed ↓

By assembly speed ↓

By source writeability ↓

By source readability ↓

See also optimization tutorials.

Let's look at how EuroAssembler can help with optimization.

↑ Optimization by program size

€ASM will always select by default the shortest possible encoding of machine instruction. On the other hand, it respects instruction mnemonic chosen by the programmer, which doesn't always have to be the shortest variant. A couple of rules worth remembering:

|0000:B80000      |   MOV AX,0
|0003:29C0        |   SUB AX,AX        ; Using SUB or XOR for zeroing is shorter. Side effect: flags are changed.
|0005:            |
|0005:89D8        |   MOV AX,BX
|0007:93          |   XCHG AX,BX       ; XCHG is shorter than MOV. Side effect: 2nd register is changed, too.
|0008:            |
|0008:            |Label:
|0008:8D06[0800]  |   LEA AX,[Label]
|000C:B8[0800]    |   MOV AX,Label     ; Moving offset to register is shorter than loading its address by LEA.
|000F:            |
|000F:5053        |   PUSH AX,BX
|0011:60          |   PUSHAW           ; Pushing/popping all registers at once is shorter than individual push/pop.
|0012:            |
|0012:050100      |   ADD AX,1
|0015:40          |   INC AX           ; Increment/decrement is shorter than add/subtract.
|0016:            |
|0016:            |LoopStart:
|0016:49          |   DEC CX
|0017:75FD        |   JNZ LoopStart:
|0019:E2FB        |   LOOP LoopStart:  ; LOOP, JCXZ are shorter than separate test+jump.

Programs which aspire to the small-size category should use PROGRAM FORMAT=COM and EUROASM AUTOALIGN=OFF. They may be terminated by a simple near RET instead of invoking the DOS function TERMINATE PROCESS, because the return address on the stack of a COM program is initialized to 0, so the final RET transfers execution to the DOS terminating interrupt at the beginning of the PSP block (CS:0), which was established by the loader.

Hello PROGRAM FORMAT=COM
       MOV DX,=B "Hello world!$"
       MOV AH,9
       INT 21h
       RET
      ENDPROGRAM Hello

For more inspiration check Hugi Size Coding Competition Series,
Assembly nibbles competition,
Graphical Tetris in 1986 bytes by Sebastian Mihai,
BootChess play in 487 bytes by Oliver Poudade.

Windows executable program created by €ASM will be shorter when the option PROGRAM ICONFILE= is explicitly specified as empty and no resource file is linked. In this case the resource section will not be included in PE file at all. You may also experiment with PE file properties using program options, such as PROGRAM FILEALIGN= value.

↑ Optimization by program speed

Writing fast programs is fully in the hands of the programmer. EuroAssembler cannot help much here, as it does no optimizations behind your back the way high-level compilers do. You may want to set EUROASM AUTOALIGN=ON to be sure that all data will be aligned for the best performance. Total control of instruction encoding in €ASM allows you to select a variant with the exact code size required, which is faster than a size-optimized encoding padded with NOPs.

There are many tricks to squeeze out every CPU clock: loop unrolling, parallelization, avoiding memory access, and last but not least, choosing the fastest algorithm. Performance also heavily depends on CPU model and generation. A good guide is [SoftwareOptimization] by Agner Fog.

Performance is usually traded off against program size; for instance, many of the size-saving tricks mentioned in the previous chapter lead to slower execution. You may want to optimize only the critical parts of code which are executed many times in your program.

↑ Optimization by assembly speed

EuroAssembler is not optimized for speed; nevertheless, assembly time is usually not an issue. It mostly depends on the number of passes, which is governed by €ASM itself and cannot be directly influenced by the programmer. At least two passes are always required. The number of passes increases when the program contains forward references, assembly-time loops, or macroinstructions.

When assembling forward referenced jumps, €ASM at first anticipates a short distance to the not-yet-defined target and reserves room for only a 2-byte (short) opcode. If we know at write time that the forward target will be further than 127 bytes away, it is recommended to explicitly specify DIST=NEAR, which can save one pass at assembly time. However, the pass will be spared only when the distances of all such jumps are specified, which is usually not worth the effort.
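An explicitly specified distance might look like the sketch below. The label FarTarget is hypothetical, and the modifier is assumed to be written as a keyword operand in the same NAME=VALUE form in which DISPLAYENC reports modifiers.

```
         JZ FarTarget, DIST=NEAR ; The forward target is known to be more than 127 bytes away.
         ; More than 127 bytes of code.
FarTarget:
```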

If you are interested in why €ASM performs so many passes, put the statement %DISPLAY UnfixedSymbols in front of ENDPROGRAM to find out which symbols oscillate between assembly passes.
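A sketch of the placement, using a hypothetical program Pgm with entry label Run in the style of the earlier diagrams:

```
Pgm PROGRAM FORMAT=PE,ENTRY=Run:
Run:  ; Pgm data + code.
      %DISPLAY UnfixedSymbols ; Report symbols whose values still change between passes.
    ENDPROGRAM Pgm
```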

Build time of big projects can be reduced significantly by splitting the code to smaller, separately assembled modules, which will be finally linked together. See also the euroasm.htm source itself.

When a project grows and its source doubles in size, the number of symbols in it is likely to double, too. Each symbol needs to be compared with the array of already declared symbols to avoid duplication. The number of checks, and with it the consumed time, grows almost quadratically with source size.
During the development process we usually concentrate on one part (module) of the project, so the remaining unchanged modules do not need to be recompiled in each development cycle (see also Makefile manager).

↑ Optimization by writeability

EuroAssembler introduced some new comfortable features which are not usual among other assemblers:

↑ Optimization by readability

A well commented and structured program is easy to read and maintain. EuroAssembler allows HTML formatting in comments, so the source code can be directly published on web sites and each part of the source can be immediately documented with rich format text, tables, images, and hypertext links.

The size and language of identifiers are not limited, so they can be self-describing. If English is not your mother tongue, it is a good idea to give labels non-English names, such as Drucken rather than Print, файл rather than file, etc. This helps the reader of your program to distinguish built-in reserved words from identifiers created by the author.

Identifiers in EuroAssembler language use decorators which indicate their category:

