EuroAssembler Manual


About €ASM ↓

Input/Output ↓

Structure of €ASM program ↓

Elements of source ↓

Instructions ↓

Program formats ↓

€ASM functions ↓


↑ About EuroAssembler

Product identification ↓

Short characteristic ↓

Notational typographic conventions ↓

Why Assembler ↓

Why Yet Another Assembler ↓

Why EuroAssembler ↓

Licence ↓

History

Download

Installation ↓


↑ Product identification

The name of this software is EuroAssembler. Notice that there is no space between Euro and Assembler.
The name is often abbreviated as €ASM.
In a 7-bit ASCII environment it may also be referred to as EUROASM, and in some internal identifiers it is just ea.

The Euro character is available on Windows keyboard as Alt+0128 or as the HTML entity &euro;.

↑ Short Characteristic

Some features are rarely seen in other assemblers:

↑ Notational typographic conventions

This manual covers programmer's guide, examples, language references, implementation remarks. Different styles are used to identify those elements.

Background color of the web page helps to distinguish between this manual and links, macroinstruction libraries, €ASM source files, test files, and objects and samples.

Dashed hyperlinks refer to other paragraphs within the same page.

Underlined hyperlinks navigate to a different HTML page of this site.

Underlined hyperlinks with Link icon navigate to signpost page Links with external references.

Underlined hyperlinks with Exit icon navigate outside EuroAssembler website, you may want to open them in a new tab or window.

Contents of this manual are organized in chapters with a tree structure.

↑ Title

Up-arrow near the chapter title is a link which navigates from the Title one level higher.

Title ↓

Down-arrow following the title navigates from the Title downward to the actual text.
Statements and rules which are worth remembering are marked with a bulb icon.

Definitions of new terms are written in blue bold italics.

Implementation details, discussions and less important personal remarks are printed with smaller font.

File names are emphasized in quotes.

Characters used in text flow have white background.

Short piece of source code is displayed in monospace black on yellow in text flow.

; Longer examples of source code in this manual are presented in a box.
; They may have more lines.
; Negative examples are overstriked.
 
Examples of code in macrolibraries and €ASM sources are ignored by EuroAssembler, because their physical lines begin with HTML tag marker <.

explaining metainformation ┐
|0000:0000| ; €ASM printed output (listing) is displayed black on white paper.
|0000:0000| ; It contains assembled machine code, copy of source instructions
|0000:0000| ; and error messages.

↑ Why Assembler

Assembly programming language (ASM) gives programmers the maximal possible control of emitted machine code. Of course, having to prescribe every instruction for Central Processing Unit (CPU) by hand is very tedious. That is why subprograms were invented: procedures, functions, macroinstructions.
Subprogram is like a black box with documented purpose, input and output. The main difference between our own ASM subprogram and HLL function is that when it doesn't work as expected, we can easily trace down the mistake, stepping each machine instruction in debugger, and that there is no-one else to blame but us.

ASM subprograms can do the same work as statements of higher level languages (HLL) or invocations of operating system (OS) application programming interface (API). The EuroAssembler macrolanguage allows macros tailored to the task at hand to be prepared in advance; they are similar to functions from OS or HLL libraries, and they allow programs to be developed in ASM almost as rapidly as in HLL.

The advantage of mastering assembly manifests itself when we are challenged with a third-party program whose source code is not available, or when some badly written program throws an exception and exits. DrWatson, a debugger or a disassembler can only show the foreign code converted to assembly instructions. People who have never met ASM will hardly know how to interpret the disassembled code, while an ASM programmer will feel like a fish in its natural environment.

The disadvantage of assemblers is the lack of standardized libraries which unify programming in HLL such as C or Java. Many ASM programmers build their own, which makes their sources not portable unless the necessary libraries are shipped together with the source. On the other hand, writing a library of one's own functions is the best way to remember all the function and parameter names, and to learn a lot about the computer and operating system.
EuroAssembler package euroasm.zip contains several macrolibraries for the quick start and for inspiration.
Assembler is a universal construction kit. You may program whatever is possible to imagine, but first you have to prepare the building tools.

Phases of program creation
Phase            Used tool
design-time      imagination
write-time       text editor
assembly-time    assembler
combine-time     linker
link-time        linker
load-time        operating system loader
bind-time        operating system loader
run-time         processor

↑ Why Yet Another Assembler

Dissatisfaction with available tools is one of the reasons why some programmers want to invent their own language.

And last but not least, creating an assembler is a very interesting challenge. An incomplete list of assemblers and other tools that I had the pleasure to come into contact with is presented at the links [Assemblers] and [UsefulTools].

The first assembler I met, when I started to flirt with assembly language in the early 80's, was IBM's FDOS for S360 mainframe computers [HLASM]. That was a very sophisticated product with advanced features such as sections, keyword operands and literals, with a macrolanguage which was able to manipulate not only the generated machine statements, but also its own macro variables and their names.

I missed many of those features in assemblers for Intel architecture. Some of them brought new ideas but none seemed ideal for me. [NASM] ver.0.99 was quite good, in fact the first bootstrap version of €ASM was written in it, but I was irritated when it wasn't able to automatically select SHORT or NEAR distance and had other design flaws, such as not expanding preprocessing variables in quoted strings.

I always wondered why constant EQU symbols had to be declared before their first use. Why can't I declare a macro inside a macro? How should one solve the situation when file A includes files B and C, and file C also includes file B, duplicating its definitions?

I don't like language which is cluttered up with free space. In HLASM a space in the operand list signalised that everything up to the end of the punched card should be ignored. €ASM isn't that strict in this horror vacui, in fact white spaces may be put anywhere between language elements to improve readability. However, spaces are almost never required by syntax.

€ASM does not use English word modifiers such as SHORT, NEAR, DWORD PTR, NOSPLIT which are identified by their value only. Instead, it prefers the Name=Value paradigm with keyword instruction modifiers such as DATA=QWORD,IMM=BYTE,MASK=K5,ZEROING=ON, which remove ambiguity and replace ugly decorators proposed in Intel documentation.

↑ Why EuroAssembler

  1. Euro because it comes from Czechia, the heart of Europe.
  2. Both Europe and €ASM are multilingual, as it supports national characters in identifiers and strings.
  3. € is one of the few characters left unoccupied among many *ASM assemblers :-)

↑ Licence

Permission to use EuroAssembler is granted to everybody who obeys this Licence.
There are no restrictions on purpose of applications created with this tool. It may be used in private, educational or commercial environment freely.

EuroAssembler is provided free of charge as-is, without any warranty guaranteed by its author.

This software may be redistributed in unmodified zipped form, as downloaded from EuroAssembler.eu. No fee may be requested for the right to use this software.

You may disseminate euroasm.zip on other websites, repositories, FTP archives, compact disks and similar media. Please be sure to always distribute the latest available €ASM version.

Source code of EuroAssembler was written by Pavel Šrubař, AKA vitsoft, and it is copyrighted as so.

Macrolibraries and sample projects are released as public domain and they may be modified freely.

I cannot recommend modifying the libraries, though, because they may be changed in future releases of €ASM and your enhancements would have been overwritten. Create your own files with vacant names instead.

You may modify €ASM source code for the sole purpose to fix a bug or to enhance it with new function, but you may not distribute such modified software. It may only be used by you on the same computer where it was edited, reassembled and linked.

EuroAssembler is not open source. I don't want to fork €ASM development into a bazaar of incompatible versions, where each branch provides a different enhancement. Please propose your modifications to the author or to the €ASM forum instead, so they might be incorporated in future releases of EuroAssembler.

↑ Installation

Distribution file euroasm.zip contains folders and files as listed on the Sitemap page. Modification time of all files is equally set to the nominal release time. All file names are in lower case (Linux convention) and in 8.3 size (DOS convention), so any old DOS utility can be used for unpacking.
You may need to run the console as administrator for the installation on secure version of MS Windows.

Choose and create EuroAssembler home directory, for instance C:\euroasm, change to it and unzip the downloaded euroasm.zip. Move or copy the main executable euroasm.exe to some folder from system %PATH%, so it might be launched as euroasm from anywhere. When you run it without parameters for the first time, it will create the global configuration euroasm.ini, which you should tailor now with a plain-text editor.

You may want to replace relative IncludePath= and LinkPath= in [EUROASM] section with an absolute path identifying the €ASM home directory.
In [PROGRAM] section you can specify your preferred target format, for instance Format=PE, Subsystem=CON and Width=32. You could also replace IconFile="euroasm.ico" and copy your preferred personal icon to objlib subfolder.

For the (not-recommended) bare-bone minimal installation you are now done and you could erase the whole home directory now. The executable euroasm.exe itself does not need any other supporting files, environment or registry modification.

If you prefer to read this documentation in other language, rename the default English version of this manual eadoc\index.htm to eadoc\man_eng.htm and then rename the chosen available mutation, e.g. eadoc\man_cze.htm, to eadoc\index.htm.

For development installation go to the home directory and unzip developer scripts from the subarchive generate.zip. You will also need a web server and PHP (version 5.3 or higher) installed on your localhost.

Most EuroAssembler files are in HTML format, so you may want to incorporate €ASM into your local web server, if you run one on your localhost computer.

In my Apache installation I added the following paragraph to the httpd.conf or apache2.conf:

<VirtualHost *:80>
    DocumentRoot C:/euroasm/
    ServerName euroasm.localhost
</VirtualHost>

I appended the statement 127.0.0.1 euroasm.localhost into the file %SystemRoot%/SYSTEM32/drivers/etc/hosts. Now I can write euroasm.localhost into address line of my internet browser and explore the €ASM documentation and other files locally.


↑ Input/Output

Standard streams ↓

Other I/O ↓

Messages ↓

Input/Output files ↓


Computer programs exchange information with users through various channels: standard streams, command-line parameters, environment variables, errorlevel value, disk files, devices.

↑ Standard streams

The basic form of communication between a program and its human user is the character stream, which is by default directed to the console terminal from which the program was launched. Streams may also be redirected to a disk file or device driver with command-line operators >, >>, <, |.

Standard input is not used in €ASM.

Standard output prints warnings, errors and informative messages produced by €ASM.

Standard error output is not used in €ASM.

↑ Other I/O

Command-line parameters are not used. €ASM assumes that everything on the command line is the main source file name(s) to assemble. All options controlling the assembly & link process are defined in configuration files euroasm.ini or directly in the source file itself.

In fact there are semi-undocumented EUROASM options which are recognized in command-line, however the preferred place for EUROASM options is the configuration file or the source file. Command-line options are employed in test examples to suppress some variable informative messages.

Environment variables are not used in €ASM.

Environment variables may be incorporated into the source at assembly-time using pseudoinstruction %SETE. Of course it is also possible to read environment at run-time with the corresponding API call, such as GetEnvironmentVariable().

€ASM does not use any other devices (I/O ports, printer, sound card, graphic adapter etc.) at assembly-time.

↑ Messages

Important information detected by EuroAssembler during its activity is published in the form of short text messages. They are written on standard output (console window) and to the listing file.

Message severity ↓

Messages in standard output ↓

Messages in listing ↓

Each message is identified by a combination of capital letter followed with four decimal digits. The complete text of messages is defined in source file msg.htm.

The letter prefix and the first digit (0..9) declare message severity. The final errorlevel value, which euroasm.exe terminates with, is equal to the highest message severity encountered during the assembly session.

Message severity
Kind of message            Prefix  Identifier range  Severity  Search marker
Informative                I       I0000..I0999      0         |#
Debugging                  D       D1000..D1999      1         |#
Warning                    W       W2000..W3999      2..3      |##
Nonsuppressible warning    W       W4000..W4999      4         |##
User-defined error         U       U5000..U5999      5         |###
Error                      E       E6000..E8999      6..8      |###
Fatal                      F       F9000..F9999      9         |###

EuroAssembler is verbose by default, but it may be totally silenced when launched with parameter NOWARN=0000..0999, provided that no error occurred in the source.

Warnings usually do not prevent the compiled target from execution; they are meant as a friendly reminder that the programmer might have forgotten about something or has made a typo.

Messages with severity level 5..8 indicate that some statements were not compiled due to error. Although the target file may be valid, it will probably not work as intended.

Fatal errors indicate failure of interaction with the operating system, exhaustion of resources, file errors or internal €ASM errors. Target and listing file might not have been written at all.

Warning messages in the range W2000..W3999 can be suppressed with EUROASM option NOWARN=, but this ostrich-like policy is not a good idea. It's always better to remedy the cause of message. If you intend to publish your code, it should always assemble with errorlevel 0.
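
The severity rules above can be sketched in a few lines of Python. This is only an illustration of the table, not code taken from €ASM: the function names are hypothetical, and the sketch assumes that every identifier is one capital letter followed by four decimal digits, with the first digit giving the severity.

```python
import re

# Assumption: identifier = one capital letter + four decimal digits,
# where the first digit is the message severity (see the table above).
_MESSAGE_ID = re.compile(r"^[A-Z](\d)\d{3}$")


def severity(message_id: str) -> int:
    """Return the severity 0..9 encoded in a message identifier."""
    match = _MESSAGE_ID.match(message_id)
    if match is None:
        raise ValueError(f"not a message identifier: {message_id!r}")
    return int(match.group(1))


def search_marker(message_id: str) -> str:
    """Return the listing search marker which the table assigns to the severity."""
    level = severity(message_id)
    if level <= 1:      # informative and debugging messages
        return "|#"
    if level <= 4:      # warnings, including nonsuppressible ones
        return "|##"
    return "|###"       # user-defined errors, errors, fatal errors


def errorlevel(message_ids) -> int:
    """Errorlevel is the highest severity met in the session (0 if none)."""
    return max((severity(i) for i in message_ids), default=0)
```

For instance, a session which produced only the message E6601 would end with errorlevel 6 under these rules, and its listing line would carry the marker |###.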

↑ Messages on standard output

A typical message consists of its identifier followed by the actual message text. When it is printed on standard output, the text is accompanied with a position indicator in the form of quoted file name followed with physical line number in curly brackets, for instance

E6601 Symbol "UnknownSym" mentioned at "t1646.htm"{71} was not found. "t1646.htm"{71}
▲▲▲▲▲                                                                 ▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲
Identifier                                                         position indicator

Usually there is just one position indicator per message, but when the error was discovered in macro expansion, another indicator is added which determines the line in macro library. In case of macro expanded in other macro, position indicators will be further chained.
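
A tool which post-processes €ASM console output might pick the identifier and the position indicators out of a message line as sketched below. The helper name parse_message is hypothetical, and the regular expression only assumes the "file"{line} shape shown in the example above. Note that it returns every indicator found in the line, including the one quoted inside the message text.

```python
import re

# A position indicator: quoted file name followed by a line number in braces.
_POSITION = re.compile(r'"([^"]+)"\{(\d+)\}')


def parse_message(line: str):
    """Split a message line into (identifier, [(file name, line number), ...])."""
    identifier, _, text = line.partition(" ")
    positions = [(name, int(number)) for name, number in _POSITION.findall(text)]
    return identifier, positions


example = ('E6601 Symbol "UnknownSym" mentioned at "t1646.htm"{71} '
           'was not found. "t1646.htm"{71}')
identifier, positions = parse_message(example)
# identifier is 'E6601'; the last element of positions is the trailing indicator.
```

With chained indicators from macro expansions the list would simply grow longer; how exactly €ASM orders them is not assumed here.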

↑ Messages in listing

Messages printed to the listing file have a slightly different format. The position indicator is omitted, because they are inserted just below the source line which triggered the error:

|002B: | MOV SI,UnknownSym: ; E6601 expected.
|### E6601 Symbol "UnknownSym" mentioned at "t1646.htm"{71} was not found.
▲▲▲▲
marker

Message text is prefixed with search marker which helps to find messages in listing.

Use internal function Find/FindNext (Ctrl-F) of the editor or viewer used to investigate the listing file.
€ASM syntax never uses multiple pound characters ##, so the search marker is unique in listing and it helps to skip from one error|warning to the next.
You could also try the specialized €ASM listing viewer distributed as one of the sample projects.

Debugging messages D1??? produced by the pseudoinstruction %DISPLAY are published even when they are placed in false %IF branches or in blocks commented-out by %COMMENT..%ENDCOMMENT.

Listing is created only during the final assembly pass. Informative messages are not printed to listing at all, except for informative linker messages I056?.


↑ Input/Output files

Configuration file ↓

Source file ↓

Object file ↓

Listing file ↓

File path ↓

There are two kinds of input files which €ASM reads: configuration and source.

There are two kinds of output files which €ASM writes: object and listing.

If the output file already exists, €ASM will overwrite it without warning.

Configuration file

Configuration file with fixed name euroasm.ini specifies default options for assembler. €ASM consults two configuration files with identical name and structure:

Global configuration file is located in the same directory as the main executable (euroasm.exe) and it is processed once after €ASM has started. If the file does not exist, €ASM tries to create it with the factory-default contents.

Local configuration file is searched for in the same directory as the actual source file. If more than one source is specified on the command-line, local configuration is read each time when the actual source gets processed.
Local euroasm.ini is not automatically created by €ASM; you will need to copy|clone the global file manually, and optionally erase unchanged options from the local configuration for better performance.

Initial content of the configuration file, which is built into euroasm.exe as factory-defaults, is defined in objlib/euroasm.ini. There are two sections in the file: [EUROASM] and [PROGRAM].

The former specifies parameters for €ASM itself, such as CPU generation, what information should go to the listing file, which warnings should be suppressed etc. Parameters from [EUROASM] section of configuration file can be redefined later in the source with EUROASM pseudoinstruction, where you will find detailed explanation of each parameter.

[PROGRAM] section of configuration file specifies default parameters of program which is to be created by €ASM, for instance the memory model, format and name of the object file etc. These parameters can be modified with PROGRAM pseudoinstruction.

The order of configuration parameters is not important. Names of parameters are case insensitive. Parameters with boolean value accept any of predefined enumerated tokens ON, YES, TRUE, ENABLE, ENABLED as true and OFF, NO, FALSE, DISABLE, DISABLED as false. They also accept numeric expressions which evaluate as boolean.
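
The enumerated boolean tokens can be modelled as two sets. The following Python sketch only illustrates the rule above; it is not the €ASM parser. It assumes that the tokens are matched case-insensitively (the manual states this explicitly only for parameter names), and it deliberately leaves out numeric expressions.

```python
TRUE_TOKENS = {"ON", "YES", "TRUE", "ENABLE", "ENABLED"}
FALSE_TOKENS = {"OFF", "NO", "FALSE", "DISABLE", "DISABLED"}


def parse_boolean(value: str) -> bool:
    """Translate an enumerated boolean token to True or False."""
    token = value.strip().upper()  # assumption: token case does not matter
    if token in TRUE_TOKENS:
        return True
    if token in FALSE_TOKENS:
        return False
    # Numeric expressions are also accepted by EuroAssembler,
    # but they are outside the scope of this sketch.
    raise ValueError(f"not an enumerated boolean token: {value!r}")
```

Under these assumptions DUMPALL=YES and DUMPALL=ENABLED mean the same thing, as do LISTMAP=OFF and LISTMAP=DISABLED.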

When you give away your source program written in EuroAssembler, you don't have to specify which command-line parameters were used to compile and link, because they can be declared in the source itself. A typical €ASM source program begins with a configuration pseudoinstruction, such as EUROASM AUTOALIGN=YES,CPU=PENTIUM, so it is easy to tell in which assembler the program is written.

EuroAssembler options and directives can be specified in configuration files and in the source files (by pseudoinstruction EUROASM). Order of their processing:

  1. When euroasm.exe starts, its options are already defined with built-in factory defaults.
  2. €ASM looks at the command-line; if some EUROASM keyword options were detected here, they overwrite the current options in charge.
  3. €ASM looks for the global configuration file and reapplies its options.
  4. Command-line options are applied again.
  5. €ASM looks for source filename(s) at the command-line, and if the local configuration file exists in the same directory, it is processed and applied to the current configuration in charge.
  6. Source file is now assembled. Each pseudoinstruction EUROASM found in the source overwrites current options.
  7. If another source file is provided on command-line in the same assembly session, €ASM restores configuration which was saved at the end of step 4 and then continues from step 5.
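
The seven steps above amount to layering several option sets over each other. The Python sketch below models each layer as a dictionary. The function name, the data layout and the option values used in the usage note are hypothetical; only the order of application follows the list above.

```python
def effective_options(factory, command_line, global_ini, sources):
    """Return the options in charge at the end of each source file.

    sources is a list of tuples (source name, local euroasm.ini options
    or None, list of option sets given by EUROASM statements in source order).
    """
    options = dict(factory)            # step 1: built-in factory defaults
    options.update(command_line)       # step 2: command-line keyword options
    options.update(global_ini)         # step 3: global euroasm.ini
    options.update(command_line)       # step 4: command line applied again
    saved = dict(options)              # configuration saved after step 4

    result = {}
    for name, local_ini, statements in sources:
        current = dict(saved)          # step 7: restore the saved configuration
        current.update(local_ini or {})    # step 5: local euroasm.ini, if any
        for statement in statements:       # step 6: EUROASM pseudoinstructions
            current.update(statement)
        result[name] = current
    return result
```

For example, with a hypothetical global setting DUMPWIDTH=18 and the command-line option DUMPWIDTH=30, the command-line value wins because it is applied once more in step 4. A local euroasm.ini or an EUROASM statement in the first source does not leak into the second source.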

↑ Source file

Source file contains instructions to be assembled, usually it is a plain-text file or HTML file arranged for €ASM. The file name will be provided as command-line parameter of euroasm.exe. The source file may be identified with absolute path in filesystem, e.g. euroasm /user/home/euroasm/MyProject/MySource.asm, or with relative or omitted path, which will be related to the current shell path.

Structure and syntax of source text, which €ASM is able to assemble and link, is described further in this document.

↑ Object file

Main purpose of programming is to obtain the target file, which can be an object module or library linkable to other files, binary file for special purposes, or an executable file.

Format of the output file is specified by PROGRAM parameter FORMAT=. Their layouts were standardized by their creators many years ago. For more details about supported output formats see the chapter Program formats.

Name of the target file is determined by the label used in pseudoinstruction PROGRAM, and it is appended with default extension depending on program format. It isn't necessarily derived from the source filename, as in many other assemblers. For instance, if the source has statement Hello PROGRAM FORMAT=COM, its output file will be created in the current directory with name Hello.com, no matter how the source file is named. Default target name can be changed by PROGRAM parameter OUTFILE=. If the OUTFILE= name is specified with relative or omitted path, current shell directory is assumed.

↑ Listing file

Dump parameters ↓
Dump separators ↓
Dump decoration ↓
List parameters ↓

Listing is a plain text file with two columns where EuroAssembler logs its activity:

  1. result of assembly of each statement is hexadecimally displayed in the dump column.
  2. statements, which were processed, are copied to the source column.

Name of the listing is determined by the name of source file, which is appended with extension .lst, and it is created in source file directory.
Default listing filename and location may be changed with EUROASM parameter LISTFILE=.

↑ Dump parameters

Let's create the source file Hello.asm with these contents:

      EUROASM DUMP=ON,DUMPWIDTH=18,DUMPALL=YES
Hello PROGRAM FORMAT=COM,LISTLITERALS=ON, \
              LISTMAP=OFF,LISTGLOBALS=OFF
       MOV DX,=B"Hello, world!$"
       MOV AH,9
       INT 21h
       RET
      ENDPROGRAM Hello

Submitting the file to EuroAssembler with the command euroasm Hello.asm will create listing file Hello.asm.lst.

Width of the dump column in characters can be specified with EUROASM option DUMPWIDTH=. Other €ASM options which control the dump column are boolean DUMPALL= and DUMP=OFF, which can suppress the dump column completely.

|<-Dump column-->|<--Source column--------
<--DumpWidth=18-->
|                |      EUROASM DUMP=ON,DUMPWIDTH=18,DUMPALL=YES
|                |Hello PROGRAM FORMAT=COM,LISTLITERALS=ON, \
|                |              LISTMAP=OFF,LISTGLOBALS=OFF
|[COM]           ::::Section changed.
|0100:BA[0801]   |       MOV DX,=B"Hello, world!$"
|0103:B409       |       MOV AH,9
|0105:CD21       |       INT 21h
|0107:C3         |       RET
|[@LT1]          ====ListLiterals in section [@LT1].
|0108:48656C6C6F =B"Hello, world!$"
|010D:2C20776F72 ----Dumping all.   (because of DUMPALL=YES)
|0112:6C64212400 ----Dumping all.
|                |      ENDPROGRAM Hello
                 ▲ column separator

↑ Dump separators

The dump column on the left side always starts with machine comment indicator (pipe character |) and it is terminated with listing column separator, which determines the genesis of this line.

Listing column separators
Character          Function
| (pipe)           Termination of machine comment. Used in ordinary statements, which can be reused as €ASM source.
! (exclamation)    Copy of source line with expanded preprocessing %variables (when LISTVAR=ENABLED).
+ (plus)           Source line generated in %FOR,%WHILE,%REPEAT expansion (when LISTREPEAT=ENABLED).
+ (plus)           Source line generated in %MACRO expansion (when LISTMACRO=ENABLED).
: (colon)          Inserted listing line to display a changed [section].
. (full stop)      Inserted listing line to display an autoalignment stuff (when AUTOALIGN=ENABLED).
- (minus)          Inserted listing line to display the whole dump (when DUMPALL=ENABLED).
= (equal)          Inserted listing line to display data literals (when LISTLITERALS=ENABLED).
  (space)          Inserted envelope PROGRAM / ENDPROGRAM line.
* (asterisk)       Inserted listing line in INCLUDE* statement when filename wildcards are resolved.

When the column separator is not |, the whole listing line has the form of machine remark and is ignored if the listing is submitted as a program source.
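
A script which inspects a listing could classify its lines by the column separator, as in the Python sketch below. It rests on an assumption read from the DUMPWIDTH=18 example above, namely that the separator is the last character of the dump column, that is, the character at zero-based index DUMPWIDTH-1. The function names are hypothetical, and message lines marked with |# are not handled.

```python
def column_separator(line: str, dump_width: int) -> str:
    """Return the listing column separator of one listing line.

    Assumption: the dump column is dump_width characters wide, it starts
    with the pipe character and it ends with the separator.
    """
    if not line.startswith("|") or len(line) < dump_width:
        raise ValueError("not a listing line with a complete dump column")
    return line[dump_width - 1]


def is_reusable_statement(line: str, dump_width: int) -> bool:
    """True when the source column would be assembled again on resubmission."""
    return column_separator(line, dump_width) == "|"


# Two lines built to the layout of the Hello.asm listing (DUMPWIDTH=18).
statement = "|0103:B409".ljust(17) + "|       MOV AH,9"
inserted = "|[COM]".ljust(17) + "::::Section changed."
```

Here is_reusable_statement(statement, 18) holds, while the inserted line with the colon separator is a machine remark only.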

↑ Dump decoration

Dump of emitting statements has hexadecimal address (offset in the current section), terminated with colon :. In 16bit section the offset is 16 bits wide (four hexadecimal digits), in 32bit and 64bit sections it is 32 bits. Then follow the emitted bytes. Data contents in the dump column is always in hexadecimal notation without explicit number modifier. If the chosen DUMPWIDTH= is too small for all emitted bytes to fit, they are either right-trimmed and replaced with tilde ~ (if DUMPALL=OFF), or additional lines with separator - are inserted to the listing (DUMPALL=ON).

Some other decorators are used in dumped bytes:

Dump column decoration
Decorator    Description
~            trimmed data indicator, used only when DUMPALL=OFF
..           byte of reserved data (instead of hexadecimal byte value when it's initialized)
[]           absolute relocation
()           relative relocation
{}           paragraph address relocation
<N           disp8*N compression used

Brackets, which may enclose the dumped word or dword, indicate that the address requires relocation at link-time. Value printed in the listing will differ from the offset viewed in linked code or in debugger at run-time.

Character < followed with one decimal digit (N) signalizes that the previous dumped byte is 8bit displacement which will be left-shifted by N bits at run-time to obtain the effective displacement (so called disp8*N compression). The digit 0..6 specifying scaling factor N is not emitted to the assembled code.

Brackets [ ] and { } indicate relocable values.
|                            | EUROASM DUMPWIDTH=30,CPU=X64,SIMD=AVX512,EVEX=ENABLED
|[CODE] ▼    ▼▼    ▼         |[CODE] SEGMENT WIDTH=16
|0000:EA[0500]{0000}         | JMPF Label ; Absolute far jump encodes immediate seg:offset.
|0005:CB                     |Label: RETF
|[CODE64]                    |[CODE64] SEGMENT WIDTH=64
|00000000:62F36D28234D02<504 | VSHUFF32X4 YMM1,YMM2,[RBP+40h],4
|00000008:C3            ▲▲   | RET
                        <5 is a nonemitted disp8*N decorator.
                      ▲▲Byte displacement +02h will be bit-shifted 5 times to the left, so the effective displacement is in fact +40h.
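
The arithmetic behind the <N decorator is a plain bit shift. The Python sketch below, with hypothetical function names, reproduces the example above (+02h shifted by 5 gives +40h) and also shows the reverse direction, assuming the usual rule that a displacement can be compressed only when it is a multiple of 2**N and the quotient fits a signed byte.

```python
def effective_displacement(disp8: int, n: int) -> int:
    """Expand a stored 8-bit displacement: disp8 shifted left by n bits."""
    if not -128 <= disp8 <= 127:
        raise ValueError("disp8 must fit a signed byte")
    if not 0 <= n <= 6:
        raise ValueError("scaling factor N must be 0..6")
    return disp8 << n


def compressed_disp8(displacement: int, n: int):
    """Return the byte to store, or None if disp8*N cannot be used."""
    if displacement % (1 << n) != 0:
        return None                     # not a multiple of 2**N
    disp8 = displacement >> n
    return disp8 if -128 <= disp8 <= 127 else None


assert effective_displacement(0x02, 5) == 0x40   # the VSHUFF32X4 example
assert compressed_disp8(0x40, 5) == 0x02
```

A displacement such as +41h is not a multiple of 32, so in this model it could not be stored as a compressed byte with N=5.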

Dump of not emitting statements is empty or contains auxiliary information.

|[DATA]      |[DATA] ; Segment|section switch quotes its [name] in dump column.
|0000:       |; Empty or comment-only line just displays the offset in current section.
|0000:       |Label: ; Ditto.
|            |;; Line comment starting with double semicolon will suppress the offset in dump.
|[DATA]:0000 |Target EQU Label: ; Address symbol definition is displayed as [segment]:offset.
|4358        |%Counter %SET CX ; Assignment of preprocessing %variable dumps its contents in hexadecimal.
|TRUE        | %IF "%Counter" == "CX" ; Preprocessing construct displays the evaluated boolean condition.
|[]:0010     | Bits EQU 16 ; Scalar symbol definition is displayed with empty segment.
|FALSE       | %ELSE ; Boolean condition concerns %IF, %ELSE, %WHILE, %UNTIL.
|            | Bits EQU 32 ; Dump of statements in false conditional branches is empty.
|            | %ENDIF

↑ List parameters

Listing in default configuration is a more or less exact copy of the source (except for the inserted dump column). Sometimes it is useful to check whether the high-level constructs worked as expected; this is controlled by the following boolean EUROASM options:
LISTINCLUDE= unrolls the contents of an included file, which is normally hidden from the main source.
LISTVAR= creates a copy of statements which contain a preprocessing %variable, and replaces the %variable name with its expanded value in the copied line.
LISTMACRO= inserts statements expanded by the macroinstruction.
LISTREPEAT= inserts all iterations of repeating constructs %FOR..%ENDFOR, %WHILE..%ENDWHILE, %REPEAT..%ENDREPEAT. Repeated expansion is listed as commented-out by dump column separator +. In the default state (LISTREPEAT=DISABLED) only the first expansion is listed.

A trait of EuroAssembler listing is that the generated listing stays usable as the source again, in the following assembly session. Messages generated in the listing are ignored by €ASM parser, so they need not be removed when we want to submit the listing file to reassembly (nevertheless those messages will be generated again if the cause of error was not fixed).

I wanted to sustain this philosophy regardless of list parameters. In the default state with LISTINCLUDE=OFF the statement INCLUDE is listed normally and the contents of the included file are hidden. With option LISTINCLUDE=ON it is reversed: the original INCLUDE statement is commented out by dump column separator * but the included lines are inserted to the listing and they become valid source statements. See also t2220.

With options LISTVAR, LISTMACRO, LISTREPEAT=ENABLED the original line is kept as is and expanded lines are inserted below it, commented-out by dump column separator ! or +. See also t2230.

EUROASM option LIST=DISABLE will switch off generating of listing lines until enabled again, or until the end of source. Of course such listing will be no longer reusable as the source.

↑ File path

Disk files can be specified with their absolute path, i.e. with a path which begins at filesystem root, e.g. C:\ProgFiles\euroasm.exe D:\Project\source.asm. Such files are unequivocally defined.

File may be specified with relative path, e.g. euroasm ..\prowin32\skeleton.asm. Position of relatively specified file is always related to the current directory.

Files can also be specified without path, i.e. when their name contains no colon and no slash :, \, /. The position of such files is summarized in the table below:

Directory used when a file is specified without path
Direction     File                             Directory            See also
Executable    euroasm.exe                      Exe-directory        OS PATH
Input         Global euroasm.ini               Exe-directory        OS PATH
Output        Global euroasm.ini               Exe-directory        OS PATH
Input         Local euroasm.ini                Source directory
Input         Source file                      Current directory
Input         Included source file             Include directory    EUROASM INCLUDEPATH=
Output        Target object file               Current directory    PROGRAM OUTFILE=
Output        Listing file                     Source directory     EUROASM LISTFILE=
Input         Linked module file               Link directory       EUROASM LINKPATH=
Input         Linked stub file                 Link directory       PROGRAM STUBFILE=
Input         Linked icon file                 Link directory       PROGRAM ICONFILE=
Import        Dynamically imported function    OS-dependent         IMPORT LIB=

Current directory is the actual folder assigned to the shell process at the moment when euroasm.exe was launched. It's never changed by €ASM.

Exe-directory is the folder where euroasm.exe was found and executed, usually it is one of the directories specified by environment variable PATH.

Source directory is the folder where the currently assembled source file lies.

Include directory is one of the directories specified by the option EUROASM INCLUDEPATH=.

Link directory is one of the directories specified by the option EUROASM LINKPATH=.
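The three ways of specifying a file can be told apart mechanically. The following Python sketch is only an illustration of the rules above, not €ASM code; the function names are hypothetical, and the test for a filesystem root (a leading slash, or a drive letter followed by a colon and a slash) is an assumption of this sketch.

```python
def has_path(name: str) -> bool:
    """A name carries a path when it contains a colon or a slash (: \\ /)."""
    return any(ch in name for ch in ":\\/")


def is_absolute(name: str) -> bool:
    """Assumption of this sketch: a root is a leading slash or 'X:\\' / 'X:/'."""
    if name[:1] in ("\\", "/"):
        return True
    return (len(name) >= 3 and name[0].isalpha()
            and name[1] == ":" and name[2] in "\\/")


def classify_file_spec(name: str) -> str:
    """Return 'no path', 'relative' or 'absolute' for a file specification."""
    if not has_path(name):
        return "no path"
    return "absolute" if is_absolute(name) else "relative"


if __name__ == "__main__":
    for spec in (r"C:\ProgFiles\euroasm.exe", r"..\prowin32\skeleton.asm", "euroasm.ini"):
        print(spec, "->", classify_file_spec(spec))
```

Only names classified as "no path" are looked up in the directories listed in the table.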


↑ Structure of €ASM program

Character structure ↓

Horizontal structure ↓

Vertical structure ↓

This chapter describes the format of source file which €ASM understands and which it is able to compile.


↑ Character structure

Character width ↓

Character encoding ↓

Character case ↓

Character classification ↓


↑ Character width

Source file is a sequence of characters with 8-bit width or with variable width 8..32 bits (in UTF-8 encoding).

If the source file is written in editor which uses WIDE (16bit) character encoding (UTF-16), it should be saved as a plain text in UTF-8 or in 8-bit ANSI or OEM codepage before submitting the file for assembly.

↑ Character encoding

A program written in €ASM may need to display texts in languages other than English. Therefore, a string which defines the output text will contain characters with codepoint values above 127 (a codepoint is the ordinal number of the character in the [Unicode] chart).
Many European languages are satisfied with a limited set of 256 characters. The relation between their codes and the corresponding glyphs is called a code page.

MS Windows uses different code pages in console applications (OEM) and in GUI applications (ANSI) and it makes automatic conversion between them in some circumstances. €ASM itself never changes the code page of the source.

A programmer who needs to mix several languages may prescribe the use of 16-bit WIDE characters instead of 8-bit ANSI in text strings at run-time. See cpmix32 as a demo example. Wide (UTF-16) strings are declared with pseudoinstruction DU (Define data in Unichars) instead of DB (Define data in Bytes). The wide variant of a WinAPI call must be used for visual representation of Unichar strings at run-time, e.g. TextOutW() instead of TextOutA(). However, the in-source definition of characters in a DU statement is still 8-bit. You should tell €ASM which code page was used for writing the DU statement in the source file. This information is provided by the EUROASM CODEPAGE= option. The code page may change dynamically in the source, allowing different languages to be mixed in one program.

Texts in your program which are aimed at the console (using WinAPI WriteConsoleA() function or StdOutput macro) should be written in OEM code page. You may want to use a DOS plain-text editor, such as EDIT.COM, for writing console programs. Text-mode editors use console fonts which are in OEM code page, so the text is displayed correctly both in the editor at write-time and in the console of your program at run-time.

Text which is presented in GUI windows (using WinAPI TextOutA() function) should be written in ANSI code page, using windowed editor such as Notepad.exe.
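Why the choice between OEM and ANSI matters can be shown with a single letter. The Python sketch below is an illustration only: code pages 1252 (ANSI) and 850 (OEM) are picked as examples for Western European Windows, and the actual pair depends on the system locale.

```python
# Example code pages; an assumption of this sketch, not a fixed €ASM setting.
ANSI = "cp1252"   # GUI applications on Western European Windows
OEM = "cp850"     # console applications on Western European Windows

text = "é"
ansi_bytes = text.encode(ANSI)   # one byte, value 0xE9
oem_bytes = text.encode(OEM)     # one byte, value 0x82

# The same byte shows a different glyph when interpreted in the other code page.
misread = oem_bytes.decode(ANSI)

if __name__ == "__main__":
    print("ANSI byte:", ansi_bytes.hex(), "OEM byte:", oem_bytes.hex())
    print("OEM byte displayed by a GUI font:", repr(misread))
```

A string written in one code page and displayed in the other therefore shows a wrong glyph for every character above 127.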

Default is EUROASM CODEPAGE=UTF-8, where characters are encoded with variable length of one to four bytes. Thanks to clever [UTF8] design, all non-ASCII UTF-8 characters are encoded as bytes with value 128..255, which are treated as letters in €ASM, so any UTF-8 character can be used in identifiers as is.

Recommended encoding of EuroAssembler source files is UTF-8.

Unlike 8-bit ANSI or OEM encodings, which limit the repertoire to 256 glyphs, CODEPAGE=UTF8 allows mixing of arbitrary character codepoints defined in [Unicode], including non-European alphabets. MS Windows API does not directly support UTF-8 strings; they need run-time reencoding to UTF-16, which is used by the WIDE variant of WinAPI functions, such as TextOutW(). Reencoding can be performed by WinAPI MultiByteToWideChar() or by macro DecodeUTF8. Exotic characters will be displayed correctly only if the used font supports their glyphs, of course.

An example of a freeware text editor which supports UTF-8 encoding is [PSPad].
Some UTF-8 text editors insert the Byte Order Mark characters 0xEF, 0xBB, 0xBF at the start of the source file. EuroAssembler treats those three characters as a 3-byte long unused label at the start of source, which usually does no harm.
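The UTF-8 properties mentioned in this chapter can be checked with a few lines of Python. This is a sketch for illustration only; the function names are hypothetical and the conversion merely mimics what MultiByteToWideChar() or DecodeUTF8 would do at run-time.

```python
import codecs


def non_ascii_bytes_are_letters(text: str) -> bool:
    """True when every byte of every non-ASCII character lies in 128..255."""
    return all(b >= 128
               for ch in text if ord(ch) > 127
               for b in ch.encode("utf-8"))


def to_wide(utf8_bytes: bytes) -> bytes:
    """Reencode UTF-8 bytes to little-endian UTF-16 (the WIDE string layout)."""
    return utf8_bytes.decode("utf-8").encode("utf-16-le")


def starts_with_bom(source: bytes) -> bool:
    """Detect the three Byte Order Mark bytes 0xEF, 0xBB, 0xBF."""
    return source.startswith(codecs.BOM_UTF8)


if __name__ == "__main__":
    euro = "€".encode("utf-8")
    print(euro.hex(), to_wide(euro).hex(), starts_with_bom(b"\xef\xbb\xbfLabel:"))
```

Because the BOM bytes are all above 127, they are letters in the sense of €ASM, which is why they form an identifier rather than a syntax error.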

↑ Character case

€ASM is case semi-sensitive assembler.

All identifiers created by you, the programmer, are case sensitive: labels, constants, user-defined %variables, structures, macro names. On the other hand, all built-in names are case insensitive. Case insensitivity concerns all enumerations: register names, machine instructions and prefixes, built-in data types, number modifiers, pseudoinstruction names and parameters, symbol attributes, system %^variables.

Case insensitive names are presented in UPPER CASE in this manual but they may be used in lower or mixed case as well.

↑ Character classification

Each byte (8 bits) in €ASM source is treated as a character. Many characters have special purpose in assembler syntax unless they are quoted inside double or single quotes. A character is unquoted if zero or even number of quotes appears between the start of line and the character itself.

EOL
End-of-line control character is Line Feed alias EOL (ASCII 10).
White spaces
All other control characters, Delete and Space are white spaces. White spaces are mainly used as separators which can improve readability but only seldom have some syntactic significance. Unquoted multiple white spaces are treated the same way as a single space.
Digits
Digits 0..9 create numbers and identifiers. Hexadecimal numbers may also contain hexadecimal digits A..F, a..f.
Letters
Letters in €ASM are a..z, A..Z, underscore _, commercial at @, dollar-sign $, grave accent `, question mark ? and all characters from the upper half of ASCII table (128..255).
Some extra-letters are employed in €ASM for special purposes, too:
Underscore _ is used in identifiers and numbers as a word separator instead of space.
Commercial at @ indicates literal section name.
Dollar $ alone used as an identifier specifies a dynamic symbol representing current offset in a section.
Grave ` is used as a prefix when some filename not starting with a letter should represent a valid identifier.
Punctuation
All other characters have special semantic meaning – operators, delimiters, modifiers etc – unless they are enclosed in a pair of single ' or double " quotes. Punctuation characters except for percent sign % and EOL are treated as ordinary letters when they are placed in quoted string.
Character classification table

ASCII     glyph  name                         function in €ASM
0..9             controls                     white space
10               line feed                    end of line
11..31           controls                     white space
32               space                        white space
33        !      exclamation mark             logical operator
34        "      double quote                 string delimiter
35        #      number sign                  modifier
36        $      dollar sign                  letter
37        %      percent sign                 preprocessing apparatus prefix
38        &      ampersand                    logical operator
39        '      apostrophe (single quote)    string delimiter
40        (      left parenthesis             priority parenthesis
41        )      right parenthesis            priority parenthesis
42        *      asterisk                     arithmetic and special operator
43        +      plus sign                    arithmetic operator
44        ,      comma                        operand separator
45        -      minus sign                   arithmetic operator
46        .      fullstop                     member separator
47        /      slash (solidus)              arithmetic operator
48..57    0..9   digits                       digit
58        :      colon                        field separator
59        ;      semicolon                    comment separator
60        <      less-than sign               logical operator, comment separator
61        =      equals sign                  logical operator, key separator, literal indicator
62        >      greater-than sign            logical operator
63        ?      question mark                letter
64        @      commercial at                letter
65..90    A..Z   uppercase letters            letter
91        [      left square bracket          content braces, substring operator
92        \      backslash (reverse solidus)  arithmetic operator, line continuation indicator
93        ]      right square bracket         content braces, substring operator
94        ^      caret (circumflex)           logical operator
95        _      underscore (low line)        letter, digit separator
96        `      grave accent                 letter
97..122   a..z   lowercase letters            letter
123       {      left curly bracket           sublist operator
124       |      vertical bar (pipe)          logical operator, comment separator
125       }      right curly bracket          sublist operator
126       ~      tilde                        logical operator, shortcut indicator
127              delete                       white space
128..255         NonASCII characters          letter
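The classification of unquoted bytes can be condensed into one small function. The Python sketch below follows the five classes described in this chapter; the function name and the returned class names are chosen for this illustration only.

```python
# Extra letters besides a..z and A..Z: underscore, at, dollar, grave, question mark.
LETTER_EXTRAS = b"_@$`?"


def classify(byte: int) -> str:
    """Return the class of one unquoted source byte (0..255)."""
    if byte == 10:
        return "EOL"
    if byte <= 32 or byte == 127:
        return "white space"          # other controls, space and delete
    if 48 <= byte <= 57:
        return "digit"
    if (65 <= byte <= 90 or 97 <= byte <= 122
            or byte >= 128 or byte in LETTER_EXTRAS):
        return "letter"
    return "punctuation"              # operators, delimiters, modifiers


if __name__ == "__main__":
    for sample in b"A7 $%\n":
        print(sample, classify(sample))
```

Every byte value falls into exactly one class; the punctuation class is simply what remains of the printable ASCII range.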

↑ Horizontal structure

Physical line ↓

Statement ↓

Machine remark field ↓

Label field ↓

Prefix field ↓

Operation field ↓

Operand field ↓

Line remark field ↓

Line continuation ↓

Assembler source is treated as a text consisting of lines which are processed from left to right, from top to bottom.


↑ Physical line

Source file consists of physical lines. Physical line is a sequence of characters terminated with line feed (ASCII 10). The line feed (EOL) character is part of the physical line, too.

EOL may be omitted in the last physical line of source file.
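Splitting a source into physical lines can be sketched as follows. This Python illustration (the function name is hypothetical) splits on line feed only, keeps the EOL as part of each line and accepts a last line without EOL.

```python
import re


def physical_lines(source: bytes) -> list:
    """Split source bytes into physical lines terminated by LF (ASCII 10).

    The LF stays part of its line; the last line may lack it.
    A carriage return is not a terminator here, it remains in the line.
    """
    return re.findall(rb"[^\n]*\n|[^\n]+", source)


if __name__ == "__main__":
    print(physical_lines(b"Label: NOP\r\n\n RET"))
```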

↑ Statement

Statement is an order for €ASM to perform some action at assembly-time, usually to emit some code to the object file or to change its internal state. A typical statement is identical with a physical line, but long statements may span several lines when line continuation is used.

Statement consists of several fields which are recognized by their position in the line, by the separator or by their contents. All fields are optional; any of them may be omitted. However, no operand can be used when the operation field is omitted.

Fields in the statement

Order  Field name      Termination
1.     Machine remark  | or EOL
2.     Label           : or white space
3.     Prefix          : or white space
4.     Operation       white space
5.     Operand         ,
6.     Line comment    EOL

Example of a statement:

|  machine remark          |Label |Prefix|Operation| Operands    | Line comment
|00001234:F08705[78560000] |Mutex: LOCK:  XCHG      EAX,[TheLock] ; Guard the thread.

↑ Machine remark field

Machine remark begins with vertical bar | when it is the first non-white character on the physical line. It is terminated with the second occurrence of the same vertical bar or with the end of physical line.

The contents of machine remark is usually a hexadecimal address followed by the machine code emitted by the statement in question. As the field name indicates, this information is generated by computer into the €ASM listing file; the programmer should never need to write a machine remark manually. Machine remarks are ignored in assembler source, thus any valid €ASM listing file may be reused as the source file without change.
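How a listing line turns back into a source line can be sketched in Python. The function below is a hypothetical illustration of the rule above, not €ASM code. It approximates "white character" with Python's own notion of whitespace, which is narrower than the €ASM definition.

```python
def strip_machine_remark(line: str) -> str:
    """Remove a leading machine remark from one physical line.

    The remark starts with | as the first non-white character and ends with
    the second | or, when there is none, with the end of the line.
    """
    stripped = line.lstrip()
    if not stripped.startswith("|"):
        return line                      # no machine remark on this line
    end = stripped.find("|", 1)
    return "" if end < 0 else stripped[end + 1:]


if __name__ == "__main__":
    listing = "|00001234:F08705[78560000] |Mutex: LOCK: XCHG EAX,[TheLock] ; Guard the thread."
    print(strip_machine_remark(listing))
```

A line whose remark is never closed, such as an inserted warning message, is dropped as a whole.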

↑ Label field

Label field can accommodate any of these elements:

  1. Structure or symbol name or block identifier, for example My1stStructure, My1stLabel:, Outer
  2. Name of a segment | section | group, for example [.data]
  3. Name of symbolic %variable which is being set, for example %Count
  4. Colon itself : explicitly telling €ASM that an empty label is used, so the following field must be a prefix or operation.

In the first case the symbolic name may begin with point ., making the label local. Symbol in the label field may be optionally terminated with one or more colons : immediately following the identifier. The white space between label field and the next field may be omitted when the colon is used.

↑ Prefix field

Machine prefix is an order for CPU to change its internal state at run-time. It is similar to machine instruction code but it only modifies the following instruction at run-time. Each prefix assembles to 1 byte machine opcode.

Prefix table

Name      Group  Opcode
LOCK      1      0xF0
REP       1      0xF3
REPE      1      0xF3
REPZ      1      0xF3
REPNE     1      0xF2
REPNZ     1      0xF2
XACQUIRE  1      0xF2
XRELEASE  1      0xF3
SEGCS     2      0x2E
SEGSS     2      0x36
SEGDS     2      0x3E
SEGES     2      0x26
SEGFS     2      0x64
SEGGS     2      0x65
SELDOM    2      0x2E
OFTEN     2      0x3E
OTOGGLE   3      0x66
ATOGGLE   4      0x67

The last four mnemonic names are not known in other assemblers.
SELDOM and OFTEN are used in front of a conditional jump instruction as hints which help newer CPUs with prediction of the jump target.
OTOGGLE and ATOGGLE switch between 16-bit and 32-bit width of the operand and address portion of machine code. They are normally generated by the assembler internally whenever needed, without explicit request.

Up to four prefixes can be defined in one statement but not more than one prefix from the same group.
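The group rule can be sketched with a lookup table taken from the prefix table of this chapter. The Python function below is a hypothetical illustration; raising an exception on a group conflict is a choice of this sketch and says nothing about how €ASM itself reports the problem.

```python
# name: (group, opcode), copied from the prefix table
PREFIXES = {
    "LOCK": (1, 0xF0), "REP": (1, 0xF3), "REPE": (1, 0xF3), "REPZ": (1, 0xF3),
    "REPNE": (1, 0xF2), "REPNZ": (1, 0xF2), "XACQUIRE": (1, 0xF2),
    "XRELEASE": (1, 0xF3),
    "SEGCS": (2, 0x2E), "SEGSS": (2, 0x36), "SEGDS": (2, 0x3E),
    "SEGES": (2, 0x26), "SEGFS": (2, 0x64), "SEGGS": (2, 0x65),
    "SELDOM": (2, 0x2E), "OFTEN": (2, 0x3E),
    "OTOGGLE": (3, 0x66), "ATOGGLE": (4, 0x67),
}


def encode_prefixes(names) -> bytes:
    """Return prefix opcodes in the given order; allow one prefix per group."""
    used_groups = set()
    code = bytearray()
    for name in names:
        group, opcode = PREFIXES[name.rstrip(":").upper()]  # case insensitive
        if group in used_groups:
            raise ValueError(f"more than one prefix from group {group}: {name}")
        used_groups.add(group)
        code.append(opcode)
    return bytes(code)


if __name__ == "__main__":
    print(encode_prefixes(["rep", "SEGES:", "OTOGGLE", "ATOGGLE"]).hex())
```

With four groups and one prefix per group, the limit of four prefixes per statement follows automatically.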

Names of prefixes are case insensitive and reserved; they cannot be used as labels, regardless of character case. A prefix name may be terminated with colon(s) : (the same as symbols).

AMD and Intel 64bit architecture introduced special prefixes REX, XOP, VEX, MVEX, EVEX. €ASM treats them as part of operation encoding and does not provide mnemonic for their direct declaration.

[AMDSSE5] introduced another instruction prefix DREX, but DREX-encoded instructions are not supported by €ASM as they never made it to the production, AFAIK.

Segment-override prefixes SEG*S can be alternatively requested as a component of memory-variable register expression. In this case they are emitted only when they are not redundant (when they specify non-default segment). Explicitly specified prefixes are emitted always, in the order as they appeared in statement.

EuroAssembler warns when a prefix is used in contradiction with the CPU specification. This can be circumvented by specifying the prefix in a separate statement.

|0000:F091 |LOCK: XCHG AX,CX       ; Prefix Lock should not be used with register operands.
|## W2356 Prefix LOCK: is not expected in this instruction.
|0002:F0   |LOCK:                  ; This can be circumvented when the prefix is separated into an extra statement,
|0003:91   |      XCHG AX,CX       ; for instance to investigate CPU behaviour in such situation.
|0004:     |
|0004:6691 |      XCHG EAX,ECX     ; Operand-size prefix 0x66 is emitted internally (in 16bit segment).
|0006:6691 |OTOGGLE: XCHG EAX,ECX  ; Its explicit specification has no effect,
|0008:6691 |OTOGGLE: XCHG AX,CX    ; but here it overrides the registers sizes from 16 to 32 bits.

↑ Operation field

Operation is the most important field of assembler statement; it tells €ASM what to do: declare something, change its internal state or emit something to the object file. Often it gives its name to the whole statement, we may say EXTERN operation instead of statement with EXTERN pseudoinstruction in the operation field.

€ASM recognizes three genders of operation:

Statement may have no operation at all:

[CODE]   ; Redirect further emitting to section [CODE].
         ; Empty statement may be used for optical separation or for comments.
Label:   ; Define a label but do not emit any data or code.
LOCK:    ; Define a machine prefix for the following instruction.

Some statements tell €ASM to generate assembled code to the object file, they are called emitting instructions:

↑ Operand field

Ordinal operand ↓
Keyword operand ↓
Mixing operands ↓

Operands specify data which the operation works with. Number of operands in the statement is not limited and it depends on the operation. Operand can be a register name, number, expression, identifier, string, their various combinations etc.

Operation field is separated from the first operand with at least one white-space. Operands are separated with unquoted comma , from one another. There are two kinds of operands in €ASM: ordinal and keyword.


↑ Ordinal operands

Ordinal operands (or shortly ordinals) are referred by the order in the statement. The first operand has number one; in macros it is identified as %1. For instance, in MOV AL,BL statement the AL register is operand nr.1 and BL is nr.2. Machine instruction MOV is known to copy contents of the second operand to the first. Comma between operands will increase the ordinal number even when the operand is empty (nothing but white-spaces).

Operand of machine instruction may represent a register, immediate integer number, address, memory variable enclosed in square braces, for instance MOV AL,[ES:SI+16].

Some other assemblers allow different syntax of address expression, which is not supported by EuroAssembler, for instance MOV AL,ES:[SI+16] or MOV AL,[ES:16]+SI.
€ASM requires that the entire memory operand is in braces [].

↑ Keyword operands

Besides ordinal operands, €ASM introduces one more type of operand: keyword operands (or simply keywords). They are referred to by name (key word) rather than by their position in the operand list. A keyword operand has the form name=value, where name is an identifier immediately followed by an equal sign.

Keyword operands have many advantages: they are self-describing (if their name is chosen thoughtfully), they don't depend on position in the operand list (no tedious counting of commas), they may be assigned a default value and they may be completely omitted when they have the default value.

Keyword operands are best used with macroinstructions, but €ASM also employs them in some pseudoinstructions and even in machine instructions. For instance, in INC [EDI],DATA=DWORD the keyword operand DATA= tells which form of the possible INC machine instruction (increment byte, word or dword variable) should be used.

Beware of putting a space between the keyword name and the equal sign:

|0000:       |; Let's define two memory variables (with not recommended names).
|0000:3412   |DATA: DW 1234h
|0002:7856   |WORD: DW 5678h
|0004:       |
|0004:50     | PUSH AX, DATA=WORD
|0005:       |; Assembled as PUSH AX.
|0005:       |; Operand DATA=WORD is recognized as a redundant but valid instruction modifier.
|0005:       |
|0005:506A00 | PUSH AX, DATA = WORD
|0008:       |; Operand DATA = WORD is not recognized as keyword modifier
|0008:       |; due to the space which follows identifier DATA.
|0008:       |; €ASM sees the 2nd operand as a numerical comparison between symbols DATA and WORD,
|0008:       |; which happen to exist in this program (otherwise E6601 would have been issued).
|0008:       |; Their offsets (0000h and 0002h) are different, the result is boolean FALSE
|0008:       |; represented with value 0. The statement is recognized as PUSH AX, 0
|0008:       |; which is legal, because €ASM accepts integration of multiple ordinal operands
|0008:       |; to one statement in machine instructions PUSH, POP, INC, DEC.
|0008:       |; The statement is assembled as two instructions: PUSH AX and PUSH 0.

↑ Mixing keyword and ordinal operands

Order of keyword operands is not important. It's a good practice to list ordinal operands first and then all keyword operands, but keywords may be mixed with ordinals.

Keyword operand does not increase the ordinal number.
Label1: Operation1 Ordinal1,Ordinal2,,Ordinal4,,
Label2: Operation2 Ordinal1,Keyword1=Value1,Ordinal2,,Ordinal4

Operation1 in the previous example has three operands with ordinal numbers 1,2 and 4. The third operand is empty. The last two commas at the end of line are ignored, as no other nonempty operand follows.

Mixed operands are used in Operation2. Notice that Ordinal2 has ordinal number 2 although it is the third operand on the list. Keyword operands do not count into ordinal numbers but empty operands do.
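The numbering rules can be tried out with a small parser. The following Python sketch is an illustration under stated assumptions, not the €ASM parser: the function names are hypothetical, keyword names are limited to ASCII letters and digits, and quoting is handled in a simplified way (a quote is closed by the next quote of the same kind).

```python
import re

# An identifier immediately followed by = makes a keyword operand.
KEYWORD = re.compile(r"([A-Za-z_@$`?][A-Za-z0-9_@$`?]*)=(.*)", re.S)


def split_operands(text: str) -> list:
    """Split the operand field on unquoted commas."""
    parts, current, quote = [], [], None
    for ch in text:
        if quote:
            if ch == quote:
                quote = None             # closing quote
        elif ch in "'\"":
            quote = ch                   # opening quote
        elif ch == ",":
            parts.append("".join(current))
            current = []
            continue
        current.append(ch)
    parts.append("".join(current))
    return parts


def number_operands(text: str):
    """Return ({ordinal number: operand}, {keyword: value})."""
    ordinals, keywords, number = {}, {}, 0
    for part in (p.strip() for p in split_operands(text)):
        match = KEYWORD.fullmatch(part)
        if match:
            keywords[match.group(1)] = match.group(2).strip()
            continue                     # keywords do not count
        number += 1                      # empty operands do count
        if part:
            ordinals[number] = part
    return ordinals, keywords


if __name__ == "__main__":
    print(number_operands("Ordinal1,Ordinal2,,Ordinal4,,"))
    print(number_operands("Ordinal1,Keyword1=Value1,Ordinal2,,Ordinal4"))
```

Applied to `AX, DATA = WORD`, the sketch returns two ordinal operands and no keyword, because the space after DATA prevents recognition of the keyword.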

↑ Line comment field

Line comment begins with unquoted semicolon ; and it ends with the end of physical line. Line comments are ignored by the assembler; they are intended for the human reader of the source.

↑ Line continuation

Statement continues on the next physical line when line continuation character, which is the unquoted backslash \, is used at the beginning of any field.

 aLabel:       \ ; This semicolon is redundant.
     MOV EAX,  \ The first operand of MOV is destination
         EBX   ; and the second one is source.

Everything following the line continuation character is treated like a comment field, so the semicolon may be omitted in this case. In a multiline statement you may add comments to any physical line.

Line continuation may appear at the beginning of any field, but not inside the field.

The whole field of any statement must fit on one physical line.

Backslash \ is also used as modulo binary operator, which cannot appear at the beginning of operation, so the confusion is avoided.

;                   modulo  modulo line-continuation
;                      |      |    |  
|0000:01000200 |  DW 5 \ 4, 6 \ 4, \
|0004:03000000 |     7 \ 4, 8 \ 4

↑ Vertical structure

Block statements ↓

Switch statements ↓

Standalone statements ↓

Statements in the assembler source are processed one by one, from the top downwards. Some of them may influence the successive statements but most instructions are standalone. From this point of view there are three kinds of statements:


↑ Block statements

Block statement must appear in pair with its corresponding ending statement. Internal state of €ASM is changed only within the range between them, which is called block.

Block is a continuous range of statements which starts with begin-block statement and ends with end-block statement.

Block actually begins at the operation field of begin-block statement and it ends at the operation field of end-block statement.

Some block statements may be prematurely cancelled (broken) with exit operation, for instance when an error is detected during macro expansion.

Block statements

Label field                                       Operation field
Obligation  Represents                Declares    Begin block  Break        End block
mandatory   program name              program     PROGRAM      not used     ENDPROGRAM
mandatory   procedure name            symbol      PROC         not used     ENDPROC
mandatory   procedure name            symbol      PROC1        not used     ENDPROC1
mandatory   structure name            structure   STRUC        not used     ENDSTRUC
optional    block identifier          nothing     HEAD         not used     ENDHEAD
optional    block identifier          nothing     %COMMENT     not used     %ENDCOMMENT
optional    block identifier          nothing     %IF          %ELSE        %ENDIF
optional    block identifier          nothing     %WHILE       %EXITWHILE   %ENDWHILE
optional    ids of Begin/End swapped  nothing     %REPEAT      %EXITREPEAT  %ENDREPEAT
mandatory   formal control variable   %variable   %FOR         %EXITFOR     %ENDFOR
mandatory   macro name                macro       %MACRO       %EXITMACRO   %ENDMACRO

Some end-block operations can be aliased:
ENDPROC alias ENDP,
ENDPROC1 alias ENDP1,
%ENDREPEAT alias %UNTIL.

Label field of a block statement specifies the name of program, procedure, structure or macro. In the preprocessing %FOR loop the label field declares formal variable which changes its value in each loop cycle. In other preprocessing loops the label field is optional and it may contain identifier which optically connects the beginning and ending block statements together (for nesting check) but has no further significance - it does not declare a symbol.

The same block identifier may be used as the first and only operand of the corresponding end-block statement.

Assemblers do not agree on the format of block pseudoinstructions. MASM uses the same block identifier in the label fields of both begin- and end-block statements:

MyProcedure PROC    ; MASM syntax
     ; some code
MyProcedure ENDP

This is good when you search the source for a procedure definition. Its name is on the left, so it catches your eye when you scan the leftmost column. On the other hand, the same label appears in the source twice, making an ugly exception from the rule that a nonlocal symbol declaration may occur only once in the program.

Perhaps for that reason Borland chose different syntax in TASM IDEAL mode:

 PROC MyProcedure   ; TASM syntax
        ; some code
 ENDP MyProcedure

It solves the double label problem but the name of MyProcedure never appears in the label field, although it is a regular label.

€ASM introduces a compromise solution: the name of the block is defined in the label field of the begin-block statement and it may appear as the operand of the end-block statement:

MyProcedure PROC  ; €ASM syntax
                  ; some code
            ENDP MyProcedure

The operand in the end-block statement may be omitted but, if used, it must be identical with the label of the corresponding begin-block statement. This helps to maintain correct block nesting, because €ASM will emit an error when block identifiers don't match.

Blocks of code may nest, but only correctly.

Two blocks are correctly nested when one block contains the entire other block.

%MACRO block in the example below contains correctly nested %IF block.

WriteCMOS %MACRO Address,Value
           %IF %1 <= 30h
             %ERROR "Checksum protected area!"
             %EXITMACRO WriteCMOS
           %ENDIF
           MOV AL,%1
           OUT 70h,AL
           MOV AL,%2
           OUT 71h,AL
          %ENDMACRO WriteCMOS
Incorrect block nesting will only be tolerated in procedures declared with option NESTINGCHECK=OFF.

Block identifier in operand field of end-block and exit-block statements usually only guards the correct binding. When blocks of the same type are nested one in another, exit-block operand can be used to identify the exiting block. As an example see t2642 where one Inner %FOR block is nested in Outer %FOR block, and the operand of %EXITFOR statement specifies which block is exited.
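Correct nesting is naturally checked with a stack. The Python sketch below is an illustration only and not the €ASM implementation: statements are passed in as hypothetical (label, operation, operand) tuples, error texts are invented for this sketch, and the special handling of identifiers in %REPEAT blocks is not modelled.

```python
# end-block operation -> begin-block operation, including the aliases
BLOCK_ENDS = {
    "ENDPROGRAM": "PROGRAM", "ENDPROC": "PROC", "ENDP": "PROC",
    "ENDPROC1": "PROC1", "ENDP1": "PROC1", "ENDSTRUC": "STRUC",
    "ENDHEAD": "HEAD", "%ENDCOMMENT": "%COMMENT", "%ENDIF": "%IF",
    "%ENDWHILE": "%WHILE", "%ENDREPEAT": "%REPEAT", "%UNTIL": "%REPEAT",
    "%ENDFOR": "%FOR", "%ENDMACRO": "%MACRO",
}
BLOCK_BEGINS = set(BLOCK_ENDS.values())


def check_nesting(statements) -> list:
    """Return a list of nesting errors for (label, operation, operand) tuples."""
    stack, errors = [], []
    for label, operation, operand in statements:
        op = operation.upper()
        if op in BLOCK_BEGINS:
            stack.append((op, label))
        elif op in BLOCK_ENDS:
            if not stack or stack[-1][0] != BLOCK_ENDS[op]:
                errors.append(f"{op} does not close the innermost open block")
                continue
            _, begin_label = stack.pop()
            if operand and operand != begin_label:
                errors.append(f"{op} {operand} does not match {begin_label}")
    errors.extend(f"{op} {label} is never closed" for op, label in stack)
    return errors


if __name__ == "__main__":
    macro = [("WriteCMOS", "%MACRO", "Address,Value"), ("", "%IF", "%1 <= 30h"),
             ("", "%ENDIF", ""), ("", "%ENDMACRO", "WriteCMOS")]
    print(check_nesting(macro))
```

The WriteCMOS macro from the example above passes the check, while a block that closes before its inner block has ended is reported.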

↑ Switch statements

Switching statement changes the internal state of €ASM for all following statements until another switching statement changes the state again, or until the end of source is encountered.

There are two switching pseudoinstructions in €ASM: EUROASM, and SEGMENT. The latter has two forms:
[name] SEGMENT (define a new segment) and
[name] (define new section in current segment if it wasn't defined yet, and switch emitting to this section).
Examples of switching statements:

 EUROASM  AUTOSEGMENT=OFF, CPU=486 ; Change €ASM options for all following statements.
[Subprocedures] SEGMENT PURPOSE=CODE, ALIGN=BYTE  ; Declare a new segment.
[.data]                  ; Switch emitting of following statements to previously defined segment [.data]
[StringData]             ; Define a new section in the current segment (in [.data]).

↑ Standalone statements

All other pseudoinstructions and machine instructions are not logically bound with others in a vertical structure of a program, so they are standalone.


↑ Elements of €ASM program

The size of EuroAssembler elements is not limited by design. This concerns the length of strings, physical text lines, identifiers, number notations, expressions, nesting depth, number of operands. They are kept internally as a signed 32bit integer number so the theoretical size limit of each such element is 2 GB = 2_147_483_647 bytes (characters).

In reality it is the amount of available virtual memory and stack space which restrict elements of this size, and EuroAssembler may terminate with fatal error message F9110 Cannot allocate virtual memory. or F9210 Memory reserved for machine stack is too small for this source file.

Addresses ↓

Addressing space ↓

Alignment ↓

Boolean values ↓

Boolean extensions ↓

Comments ↓

Condition codes ↓

Data types ↓

Distance ↓

Enumerated values ↓

Expressions ↓

Groups ↓

Identifiers ↓

Length ↓

Literals ↓

Memory variables ↓

Namespace ↓

Numbers ↓

Operators ↓

Registers ↓

Scope ↓

Sections ↓

Segmentation ↓

Segments ↓

Size ↓

Strings ↓

Structures↓

Symbols ↓

%Variables ↓

Width ↓


↑ Comments

Block comments ↓

Line comments ↓

Machine remarks ↓

Markup comments ↓

Comments are parts of source code which are not processed by assembler and their only purpose is to explain the code for human reader. There are four types of comments in €ASM:


↑ Line comments

Line comment starts with unquoted semicolon; everything up to the end of line is ignored by €ASM. Line comments are copied to the listing file.

 Label: CALL SomeProc ; This is a line comment.

↑ Machine remarks

Machine remarks are created by €ASM in the listing file and they contain the generated machine code in hexadecimal notation.

Machine remark starts with vertical bar | which is the first non-white character on the physical line. Machine remark ends with the second occurrence of the same vertical bar | , or with the end of line (whichever comes first). So, when the closing | is omitted, the whole physical line is treated as remark. This is used for inserting error messages into the listing, just below the erroneous statement.

|0030:E81234   |Label1: CALL SomeProc ; This is a line comment.
|0033:         |Label2: COLL OtherProc ; Typing error in operation name.
|### E6860 Unrecognized operation "COLL", ignored.

Machine remarks are ignored by €ASM and they are not copied to the listing. Instead, €ASM creates them again when the listing produced by a previous assembly session is submitted as the source.

Machine remarks are not intended to be manually inserted by programmer into source text, use ordinary line comment instead.

↑ Markup comments

When a physical line begins with the less-than character <, it is treated as a markup comment and ignored to the end of line. This makes it possible to mix the source code with hypertext markup language tags. Markup comments are not copied to the listing.

Thanks to markup comments, €ASM source can be stored not only as a plain-text but also as HTML or XML hypertext.

<h2>Description of SomeProcedure</h2>
<img src="SomeImage.png"/>
SomeProcedure  PROC  ; See the image above for description.

All source code shipped with €ASM is completely stored in HTML format, which makes it possible to document the source with hypertext links, tables, images and better visual representation than simple line comments could yield.

If you want to keep your sources in HTML, make sure that assembler statements do not start with < and rearrange the source so that every markup comment line starts with some HTML tag. You may also use void HTML tags <span/> or <!----> to start the comment line.

↑ Block comments

Block comment can be used to temporarily disable a portion of source code or to include documentation inside the source.

Block comment begins with %COMMENT statement and it ends with the corresponding %ENDCOMMENT. It can span over many lines of program, which don't have to start with semicolons.
Block comments are copied into the listing file.

€ASM does not assemble the text inside the commented-out block, but it needs to parse it anyway in order to find the corresponding %ENDCOMMENT statement, so the commented-out text should be a valid source as well.

Block comments are nestable.

The text in %COMMENT block must be correctly nested, although it is ignored.

The pseudoinstruction %COMMENT could be easily replaced with %IF 0, but the former one is more intuitive.
 CALL SomeProc ; This is a line comment.
 %COMMENT  ; This is a block comment.
 COLL OtherProc ; Typing error in operation name.
    %COMMENT ; This is a nested block comment.
    %ENDCOMMENT ; End of inner block comment.
    ; This statement is ignored, too.
 %ENDCOMMENT
 ; Emitting assembly continues here.

↑ Identifiers

Identifier is a human readable text which gives name to an element of assembler program: a symbol, register, instruction, structure etc.

Identifier is a combination of letters and digits, which begins with a letter.

Length of identifiers is not limited in €ASM and all characters are significant.
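The rule for identifiers can be written as one regular expression over source bytes. The Python sketch below is an illustration and uses the set of letters from the chapter on character classification; the function name is hypothetical.

```python
import re

# A letter (a..z, A..Z, _ @ $ ` ? or any byte 128..255),
# followed by any number of letters and digits.
IDENTIFIER = re.compile(rb"[A-Za-z_@$`?\x80-\xff][A-Za-z0-9_@$`?\x80-\xff]*")


def is_identifier(text: bytes) -> bool:
    """True when the whole byte string is an identifier."""
    return IDENTIFIER.fullmatch(text) is not None


if __name__ == "__main__":
    for name in (b"My1stLabel", b"1stLabel", "Žluťoučký".encode("utf-8")):
        print(name, is_identifier(name))
```

Since all bytes of non-ASCII UTF-8 characters are letters, identifiers written in national alphabets pass the test unchanged.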


↑ Numbers

Decimal numbers ↓

Binary numbers ↓

Octal numbers ↓

Hexadecimal numbers ↓

Integer numbers overview ↓

Floating point numbers ↓

Floating point special values ↓

Character constants ↓

Number notation is the way to write numeric value. Numeric values are kept and computed internally by €ASM as 64-bit signed integers.

Number notation is a combination of digits and number modifiers, which begins with decimal digit.

Number modifier is one of the characters B D E G H K M P Q T appended to the end of the digit sequence, or 0N 0O 0X 0Y (zero followed by a letter) prefixed in front of other digits. All number modifiers are case insensitive. Except for decimal format, which is the default, a modifier must always be used.

Floating point numbers may use fullstop . to separate integer and decimal part of the number notation.

Another number modifier is underscore character _ which is ignored by number parser and can be used as digit separator instead of space|comma for better readability of long numbers. No white spaces are allowed in the number notation.

↑ Decimal numbers

Decimal number is a combination of decimal digits 0..9 optionally suffixed with decimal modifier D. There are five other decimal suffixes:
K (Kilo), which tells €ASM to multiply the number by 2¹⁰ = 1024,
M (Mega), which tells €ASM to multiply the number by 2²⁰ = 1_048_576,
G (Giga), which tells €ASM to multiply the number by 2³⁰ = 1_073_741_824,
T (Tera), which tells €ASM to multiply the number by 2⁴⁰ = 1_099_511_627_776,
P (Peta), which tells €ASM to multiply the number by 2⁵⁰ = 1_125_899_906_842_624.

Decimal numbers may be prefixed with 0N modifier.

All six numbers in the following example have the same value: 1048576, 1048576d, 0n1048576, 1_048_576, 1024K, 1M.

Maximal possible number which fits into 32 bits is 0xFFFF_FFFF=4_294_967_295.

Maximal possible number which fits into 63 bits is 0x7FFF_FFFF_FFFF_FFFF=9_223_372_036_854_775_807.

↑ Binary numbers

Binary number is made of digits 0 1 appended with binary number modifier B or prefixed by modifier 0Y. Examples: 0y101, 101b, 00110010b, 1_1111_0100B are equivalent to decimal numbers 5, 5, 50, 500 respectively.

Maximal 32bit binary number is 1111_1111__1111_1111__1111_1111__1111_1111b.

↑ Octal numbers

Each octal digit 0..7 represents three bits of equivalent binary notation. The number is terminated with octal suffix Q or prefixed with 0O alias 0o (digit zero followed by capital or small letter O).

Example: 177_377q = 0o177_377 = 0xFEFF

The biggest 32bit octal number is 37_777_777_777q.

The biggest 64bit octal number is 1_777_777_777_777_777_777_777q.

↑ Hexadecimal numbers

Hexadecimal digit encodes four bits in one character, which requires 2⁴=16 possible values. Therefore the ten decimal digits are extended with letters A, B, C, D, E, F with values 10, 11, 12, 13, 14, 15. Hexadecimal letters A..F are case insensitive. When the first digit of a hexadecimal number is represented with letter A..F, an additional leading zero must be prefixed to the number notation. Hexadecimal number is terminated with suffix H or it begins with prefix 0X.

Example: 5h, 0x32, 1F4H, 0x1388, 0C350H represent decimal numbers 5, 50, 500, 5000, 50000 respectively.

Keep in mind that all numbers in €ASM are internally kept as 64bit signed integer. Although instructions MOV EAX,0xFFFF_FFFF and MOV EAX,-1 assemble to identical codes, their operands are represented as 0x0000_0000_FFFF_FFFF and 0xFFFF_FFFF_FFFF_FFFF. Boolean expression 0xFFFF_FFFF = -1 is false.
|00000000:B8FFFFFFFF | MOV EAX, 0xFFFF_FFFF
|00000005:B8FFFFFFFF | MOV EAX, -1
|FALSE | %IF 0xFFFF_FFFF = -1
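
The difference between the internal 64bit value and the emitted 32bit immediate can be reproduced in Python. This is only an illustrative sketch; the function name mov_eax_imm is hypothetical and it models just the truncation to four bytes, not €ASM's encoder.

```python
MASK32 = 0xFFFF_FFFF

def mov_eax_imm(value: int) -> str:
    """Hex dump of MOV EAX,imm32: opcode B8 followed by the low 32 bits, little endian."""
    return (bytes([0xB8]) + (value & MASK32).to_bytes(4, "little")).hex().upper()

a = 0xFFFF_FFFF   # kept internally as 0x0000_0000_FFFF_FFFF
b = -1            # kept internally as 0xFFFF_FFFF_FFFF_FFFF

print(mov_eax_imm(a), mov_eax_imm(b))   # both encodings are identical
print(a == b)                           # but the 64bit values differ
```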

↑ Integer numbers overview

Integers may be written in binary, decimal, octal or hexadecimal notation. Some number modifiers overlap with hexadecimal digits B, D, E. €ASM parses as much of the element as possible to resolve such ambiguity:
1BH is recognized as hexadecimal number 0x1B=27 and not binary 1 followed with letter H.
2DH is recognized as hexadecimal number 0x2D=45 and not decimal 2 followed with letter H.
3E2H is recognized as hexadecimal number 0x3E2=994 and not 3 * 10² followed with letter H.
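
The three cases above follow from letting the last character of the notation decide the base. A minimal Python sketch of that rule, assuming only suffix modifiers and no multipliers (the name parse_suffixed is made up for this example):

```python
# Hypothetical model: the final character selects the base, so digits such as
# B, D, E inside the number are never mistaken for modifiers.
SUFFIX_BASE = {"H": 16, "Q": 8, "B": 2, "D": 10}

def parse_suffixed(notation: str) -> int:
    text = notation.replace("_", "").upper()
    if not text[:1].isdigit():
        raise ValueError("number notation must begin with a decimal digit")
    base = SUFFIX_BASE.get(text[-1])
    if base is None:                 # no modifier: plain decimal number
        return int(text, 10)
    return int(text[:-1], base)      # raises ValueError on digits invalid in that base

for sample in ("1BH", "2DH", "3E2H", "101b", "2D"):
    print(sample, parse_suffixed(sample))
```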

Integer number notation
Notation     Prefix  Base  Suffix  Multiplier
Binary       0Y      2     B       1
Octal        0O      8     Q       1
Decimal      0N      10    D       1
                           K       2¹⁰
                           M       2²⁰
                           G       2³⁰
                           T       2⁴⁰
                           P       2⁵⁰
Hexadecimal  0X      16    H       1

Binary, octal and hexadecimal numbers must always be written with prefix or suffix (or both, however this is not recommended). There is no RADIX directive in €ASM.

For more examples of acceptable syntax see €ASM numbers tests.

↑ Floating point numbers

Floating point alias real numbers are parsed from scientific notation with decimal point and exponent of 10, using this syntax:

FP number notation anatomy
Order  Field name          Contents
1      number sign         +, - or nothing
2      significand         digits 0..9, digit separators _
3      decimal point       .
4      fraction            digits 0..9, digit separators _
5      FP number modifier  E or e
6      exponent sign       +, - or nothing
7      exponent part       digits 0..9, digit separators _

For instance, floating point number 1234.56E3 has value 1234.56 * 10³=1234560.

Omitted sign is treated as +.

Decimal part can be omitted when zero(s). 123.00E2 = 123.E2

Decimal point may be omitted when decimal part is omitted (is zero). The E modifier still specifies the floating point format. 123.00E2 = 123.E2 = 123E2 = 12300.

Exponent can be omitted when it is zero. Modifier E may be omitted in this case, too. Without E modifier it is the presence of decimal point which decides if the number is integer or real. Example: 12345.67E0 = 12345.67E = 12345.67

No white space is allowed within FP number notation.

The number is considered as floating point when its notation contains either decimal point ., or modifier E (capital or small letter E), or both. Otherwise it is treated as integer.
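
The decision rule can be modelled with a regular expression built from the anatomy table. This is a sketch under the assumption that the notation is already known to be decimal (hexadecimal digits such as E in 3E2H are out of scope); the name is_float_notation is hypothetical.

```python
import re

# sign, significand, optional point with fraction, optional E with exponent
FP_NOTATION = re.compile(r"[+-]?[0-9][0-9_]*(\.[0-9_]*)?([Ee][+-]?[0-9_]*)?")

def is_float_notation(notation: str) -> bool:
    """True when the decimal notation contains a decimal point, modifier E, or both."""
    match = FP_NOTATION.fullmatch(notation)
    if match is None:                       # white space or foreign characters
        raise ValueError(f"not a decimal number notation: {notation!r}")
    return match.group(1) is not None or match.group(2) is not None

for sample in ("123.00E2", "123.E2", "123E2", "12345.67E", "12345.67", "12300"):
    print(sample, is_float_notation(sample))
```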

€ASM does not calculate with floating point numbers at assembly time.

All internal calculations in €ASM are performed with 64bit integers only. When an FP number is used in a mathematical expression, it is converted to integer first. Error E6130 (number overflow) is reported if the number does not fit into 64 bits. Warning W2210 (precision lost) is reported if the FP number had a decimal part which was rounded in conversion.

Actual FP number format [IEEE754] is maintained only when the scientific notation is used to define static FP variable with pseudoinstruction DD, DQ, DT.

Half-precision FP numbers (float16) are not supported by €ASM, nor are they supported by processors, with the exception of two packed SIMD instructions VCVTPS2PH and VCVTPH2PS, and a few MVEX-encoded up/down conversion operations.

Unlike integer numbers, the sign of FP notation is inseparable from digits which follow. If you by mistake put a space between the sign and the number, instead of FP definition it is treated as an operation (unary minus applied to a number), and therefore the FP number is converted to integer first, before the operation is evaluated.
|00000000:001DF1C7 | DD -123.45E3 ; Single-precision FP number -123.45*10³.
|00000004:C61DFEFF | DD - 123.45E3 ; Dword signed integer number -123450.
|00000008:00000000A023FEC0 | DQ -123.45E3 ; Double-precision FP number -123.45*10³.
|00000010:C61DFEFFFFFFFFFF | DQ - 123.45E3 ; Qword signed integer number -123450.
|00000018:0000000000001DF10FC0 | DT -123.45E3 ; Extended-precision FP number -123.45*10³.
|00000022: | DT - 123.45E3 ; Tbyte integer number is not supported.
|### E6725 Datatype TBYTE expects plain floating-point number.
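
The DD and DQ dumps in this listing can be reproduced with Python's struct module, which packs IEEE 754 single and double precision values. The sketch only illustrates the two interpretations (FP definition versus unary minus applied to a converted integer); it does not cover the 10-byte DT format, which struct does not provide.

```python
import struct

fp_value = float("-123.45E3")        # sign belongs to the FP notation: -123450.0
int_value = -int(float("123.45E3"))  # "- 123.45E3": convert to integer, then negate

dd_float = struct.pack("<f", fp_value).hex().upper()   # DD -123.45E3
dd_int = struct.pack("<i", int_value).hex().upper()    # DD - 123.45E3
dq_float = struct.pack("<d", fp_value).hex().upper()   # DQ -123.45E3
dq_int = struct.pack("<q", int_value).hex().upper()    # DQ - 123.45E3

print(dd_float, dd_int, dq_float, dq_int)
```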

↑ Floating point special values

Beside the standard scientific notation of floating-point numbers they may have a special FP constant value:

Special floating-point constant values (in hexadecimal notation)
Constant  Interpretation                   single precision (DD)  double precision (DQ)  extended precision (DT)
#ZERO     zero                             00000000               00000000_00000000      0000_00000000_00000000
+#ZERO    positive zero                    00000000               00000000_00000000      0000_00000000_00000000
-#ZERO    negative zero                    80000000               80000000_00000000      8000_00000000_00000000
#INF      infinity                         7F800000               7FF00000_00000000      7FFF_80000000_00000000
+#INF     positive infinity                7F800000               7FF00000_00000000      7FFF_80000000_00000000
-#INF     negative infinity                FF800000               FFF00000_00000000      FFFF_80000000_00000000
#PINF     pseudo infinity                  7F800000               7FF00000_00000000      7FFF_00000000_00000000
+#PINF    positive pseudo infinity         7F800000               7FF00000_00000000      7FFF_00000000_00000000
-#PINF    negative pseudo infinity         FF800000               FFF00000_00000000      FFFF_00000000_00000000
#NAN      not a number                     7FC00000               7FF80000_00000000      7FFF_C0000000_00000000
+#NAN     positive not a number            7FC00000               7FF80000_00000000      7FFF_C0000000_00000000
-#NAN     negative not a number            FFC00000               FFF80000_00000000      FFFF_C0000000_00000000
#PNAN     pseudo not a number              7F800001               7FF00000_00000001      7FFF_00000000_00000001
+#PNAN    positive pseudo not a number     7F800001               7FF00000_00000001      7FFF_00000000_00000001
-#PNAN    negative pseudo not a number     FF800001               FFF00000_00000001      FFFF_00000000_00000001
#QNAN     quiet not a number               7FC00000               7FF80000_00000000      7FFF_C0000000_00000000
+#QNAN    positive quiet not a number      7FC00000               7FF80000_00000000      7FFF_C0000000_00000000
-#QNAN    negative quiet not a number      FFC00000               FFF80000_00000000      FFFF_C0000000_00000000
#SNAN     signaling not a number           7F800001               7FF00000_00000001      7FFF_80000000_00000001
+#SNAN    positive signaling not a number  7F800001               7FF00000_00000001      7FFF_80000000_00000001
-#SNAN    negative signaling not a number  FF800001               FFF00000_00000001      FFFF_80000000_00000001

Names of special constants are case insensitive. If sign + or - is used, it is inseparable. Examples:
FourNans DY 4 * QWORD #NaN ; Define vector of four double-precision not-a-number FP values.
MOV ESI,=8*Q#ZERO ; Define 8*8 zero bytes in literal section and set ESI to point at them.
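
The zero and infinity rows of the table can be cross-checked against IEEE 754 with Python's struct module. This sketch prints the single and double precision patterns most significant byte first, as the table does; NaN patterns are left out because their payload bits depend on the platform, and the helper name fp_hex is an assumption.

```python
import struct

def fp_hex(value: float, fmt: str) -> str:
    """Big-endian hex pattern of value; fmt is 'f' (single) or 'd' (double)."""
    return struct.pack(">" + fmt, value).hex().upper()

for name, value in (("#ZERO", 0.0), ("-#ZERO", -0.0),
                    ("#INF", float("inf")), ("-#INF", float("-inf"))):
    print(name, fp_hex(value, "f"), fp_hex(value, "d"))
```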

↑ Character constants

A number can also be written as a character constant, which is a string containing no more than eight characters (longer strings are accepted only when the ninth and all following characters are NUL). Its numeric value is taken from the ordinal number of each character in the ASCII table. Example of character constants and their values:

'0'   =     30h =      48
'abc' = 636261h = 6513249
"4%%" =   2534h =    9524
Character with the least significant value is on the left position in the string.

Assemblers do not agree on how to treat character constants. MASM and TASM use scriptual convention where the order of characters in written source corresponds with the way we write numbers: least significant digit on the right.

€ASM as well as other newer assemblers use the memory convention where the order of characters in the written source corresponds with the order in which they are stored in memory on little endian architecture processors.

| | ; MASM and TASM:
|00000000:616263 | DB 'abc' ; String.
|00000003:63626100 | DD 'abc' ; Character constant.
|00000007:B863626100 | MOV EAX,'abc' ; AL='c'.
| | ; €ASM, FASM, GoASM, NASM, SpASM:
|00000000:616263 | DB 'abc' ; String.
|00000003:61626300 | DD 'abc' ; Character constant.
|00000007:B861626300 | MOV EAX,'abc' ; AL='a'.
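
Both conventions amount to reading the same bytes in opposite byte order, which Python's int.from_bytes can illustrate. This is only a sketch; the function name char_constant and its convention parameter are assumptions made for the example.

```python
def char_constant(text: str, convention: str = "memory") -> int:
    """Numeric value of a character constant.

    "memory": first character is least significant (as described for €ASM).
    "script": last character is least significant (as described for MASM and TASM).
    """
    data = text.encode("ascii").rstrip(b"\0")   # trailing NULs do not change the value
    if len(data) > 8:
        raise ValueError("character constant does not fit into 64 bits")
    return int.from_bytes(data, "little" if convention == "memory" else "big")

print(hex(char_constant("abc")))             # memory convention
print(hex(char_constant("abc", "script")))   # scriptual convention
```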

↑ Enumerated values

Some operands may acquire only one of the few predefined values, e.g. the EUROASM option CPU= may be 086, 186, 286, 386, 486, 586, 686, PENTIUM, P6, X64.

Although some enumerated values may look like a number, they are not countable.

↑ Boolean values

Any number can be interpreted as a boolean (logical) value, too. Boolean values can acquire one of the two states: false or true. Number 0 is treated as boolean false in logical expression, any nonzero number is treated as true.

↑ Boolean extended values

All built-in €ASM boolean options have extended repertoire of possible values. Those boolean values accept

This concerns

Extended boolean enumeration is used only with operands built in the €ASM. They are not symbols that could be used elsewhere, such as MOV EAX,TRUE. To achieve similar functionality in macros, the programmer would have to define such symbols first, e.g.

FALSE   EQU 0
false   EQU 0
TRUE    EQU -1
true    EQU !false
MOV EAX,TRUE

When an extended Boolean value is used as macro keyword operand, it can be also tested in macro body with %IF, %WHILE, %UNTIL, for instance

MacroWithBool  %MACRO Bool=On
  %IF %Bool
    ; Do something when Bool is set to TRUE.
  %ELSE
    ; Do something when Bool is set to FALSE.
  %ENDIF
 %ENDMACRO MacroWithBool

Now we may invoke the macro as MacroWithBool Bool=Enable, MacroWithBool Bool=No etc.

Extended Boolean values are not allowed in logical expressions:
MacroWithBool  %MACRO Bool=0
  %IF ! %Bool
    ; Do something when Bool is set to FALSE.
  %ENDIF
  %ENDMACRO MacroWithBool

The previous example would not work with extended Boolean values, for instance MacroWithBool Bool=False will complain that E6601 Symbol "False" was not found. However, reversing the logic will work:

MacroWithBool  %MACRO Bool=0
  %IF  %Bool
  %ELSE
    ; Do something when Bool is set to FALSE.
  %ENDIF
  %ENDMACRO MacroWithBool

↑ Strings

String is a set of arbitrary characters enclosed in quotes. Either double " or single quotes ' (also called apostrophes) may be used to claim the borders of a string. The surrounding quotes do not count into the string contents. All characters within the string lose their semantic meaning with three exceptions:

  1. EOL cannot be used in strings. In other words, each portion of quoted "string data" must fit to one physical line. Definition of long strings can be split, e.g.
     |0000:5468697320697320 |MultilineString: DB "This is the first line",13,10, \
     |0008:7468652066697273~| "and this is the second one.",13,10,0
     |0036: |
  2. The same quote character which is used to surround the string cannot be used inside, unless it is doubled, e.g.
     |0000:4F27427269656E00 |Surname: DB 'O''Brien',0
     |0008: |
  3. The percent sign % keeps its function of a %variable prefix. Use two adjacent percents when a single % is required in a string, e.g.
     |0000:313030252073617665642E00 |Status: DB "100%% saved.",0
     |000C: |
Preprocessing %variables are expanded in strings.

No escape character is employed in €ASM; in fact the percent sign and the quote escape themselves. If you need to use any of the above mentioned characters within a string, they must be doubled. This duplication (self-escaping) concerns only the notation in the source and it does not increase the net string size in emitted computer memory.

Strings enclosed in 'single quotes' and "double quotes" are equivalent with one exception: if the contents of the string is a filename, only double quotes may be used, because the apostrophe is a valid character in filenames on most filesystems. More examples of string definitions:

|0000:3830202520 |DB "80 %% "
|0005:766F74656420224E6F22 |DB "voted ""No"""
|000F: |DB '' ; Empty string.
|000F:27 |DB "'" ; Single apostrophe.
|0010:27 |DB '''' ; Single apostrophe.
|0011: |; Examples of invalid syntax (odd number of quotes):
|0011: |DB """
|### E6721 Invalid data expression """"".
|0011: |DB "It ain't necessarilly so'
|### E6721 Invalid data expression ""It ain't necessarilly so'".
|0011: |
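
The self-escaping rules can be summarised in a short Python model. It is a sketch only: the function name string_contents is hypothetical, and expansion of preprocessing %variables is deliberately not modelled, only the doubled percent sign.

```python
def string_contents(literal: str) -> str:
    """Return the emitted contents of a quoted string written in source notation."""
    quote = literal[:1]
    if quote not in ("'", '"') or len(literal) < 2 or literal[-1] != quote:
        raise ValueError("string must be enclosed in matching quotes")
    body = literal[1:-1]
    if quote in body.replace(quote * 2, ""):       # a lone surrounding quote inside
        raise ValueError("odd number of quotes")
    # Doubled quotes and doubled percent signs each stand for one character.
    return body.replace(quote * 2, quote).replace("%%", "%")

print(string_contents("'O''Brien'"))
print(string_contents('"100%% saved."'))
print(string_contents('"voted ""No"""'))
```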

↑ Addressing space

Processor, alias Central Processing Unit (CPU), operates with data and communicates with its environment (registers, memory and devices). Typical operation reads a piece of information from register, memory or port (I/O device), makes some manipulation with the data and writes it back to the environment. The smallest addressable unit is one byte (1 B) and the number of addressable bytes is limited by the addressing space. Register is identified by its name, device is identified by its port number, byte in memory is identified by its address.

CPU addressing space
CPU mode  GPR space  I/O port space  Memory addressing space
16bit      8 * 2 B   64 KB (2¹⁶)     1 MB (2¹⁶⁺⁴)
32bit      8 * 4 B   64 KB (2¹⁶)     4 GB (2³²)
64bit     16 * 8 B   64 KB (2¹⁶)     16384 PB (2⁶⁴)

↑ Addresses

Addressing space is limited by the CPU architecture and by the number of wires connecting address pins between CPU and memory chips. The combination of logical zeros and ones, which can be measured on these wires, is called physical address (PhA).

From application programmer's point of view the processor writes or reads from virtual address (VA). If segmentation of memory is not taken into account, virtual address is sometimes called linear address (LA). Both virtual and physical address were identical only in first generations of processors operating in real mode without memory cache and memory paging.

Objects in the linked image of protected-mode program are often addressed with an offset from the beginning of image loaded in memory (from the ImageBase). Such offset is called relative virtual address (RVA).

Position of data items in file formats are sometimes identified with file address (FA), which is defined as the distance between start of the file and the actual data item position in this file.

Address is a symbolic representation of some position in memory.

PhA, VA, LA, RVA, FA are plain non-negative integer numbers, but addressing at assembly-time is rather more complicated. For historical reasons the addressing space is divided into segments of memory and each segment is identified by the contents of a segment register. Address at assembly-time is expressed as the number of bytes (offset) between the position and the start of its segment, together with the segment identification. See also the chapters Address symbols and Address expressions.

↑ Alignment

Data and code are retrieved from memory faster when their address is aligned, which means rounded to a value which is a multiple of a power of two. Though most IA-32 CPU instructions can cope with unaligned data, it takes more time, as the data read from memory may not be in the same cache page and the CPU may need to shift the information internally during fetch-time.

For the best performance, memory variables should be aligned to their natural alignment which corresponds with their size, see the Autoalign column in Data types table. Doublewords, for instance, have autoalign value 4, which says that the last two bits of properly aligned address must be zero. QWORD are aligned to 8, therefore the last three bits (8=2³) must be zero.

Alignment can be achieved explicitly with ALIGN pseudoinstruction, or with ALIGN= keyword given in machine instruction or in pseudoinstructions PROC and PROC1.

Memory variables are being aligned by €ASM implicitly when EUROASM option AUTOALIGN=ON. For instance the statement SomeDword: DD 1234 is autoaligned by 4 (offset of SomeDword can be divided by 4 without a remainder). Alignment stuff, which fills the space in front of aligned instruction, is NOP 0x90 in code segments and zero 0x00 in data segments.

The align value may be numeric expression which evaluates to 1, 2, 4, 8 or higher power of two. €ASM accepts without warning zero or empty value, too, which is identical to ALIGN=1 (has no effect). Beside the numeric values ALIGN also accepts enumerated values BYTE, WORD, DWORD, QWORD, OWORD, YWORD, ZWORD alias their short versions B, W, D, Q, O, Y, Z.
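
The arithmetic behind alignment is a rounding up to the next multiple of a power of two. A minimal Python sketch follows; the names align_up and padding are hypothetical, and the fill bytes are the ones stated above (0x90 in code, 0x00 in data).

```python
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the nearest multiple of alignment (a power of two)."""
    if alignment in (0, 1):                 # zero or empty value behaves like ALIGN=1
        return offset
    if alignment < 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    return (offset + alignment - 1) & -alignment

def padding(offset: int, alignment: int, code: bool = False) -> bytes:
    """Stuff bytes emitted in front of the aligned item."""
    fill = b"\x90" if code else b"\x00"     # NOP in code segments, zero in data segments
    return fill * (align_up(offset, alignment) - offset)

print(align_up(5, 4), padding(5, 4), padding(6, 8, code=True))
```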

Alignment is always limited by the alignment of segment which the statement lies in. If the current segment is DWORD aligned, we cannot ask for QWORD or OWORD alignment in this segment. Default segment alignment is OWORD (10h) in €ASM and it is increased to SectionAlign (usually 1000h) when the assembled program is in PE or DLL format.

Beside instruction modifier ALIGN= the alignment may also be established with explicit pseudoinstruction ALIGN, which allows intentional disalignment, too.

↑ Registers

Register is a small fast fixed-size variable located on CPU chip.

Though a register remembers information written to it, it is not part of addressable memory. Registers can be referenced by their names only, they have no address.

Registers table
Family | REGTYPE# | Members | Size
GPR 8bit | 'B' | AL, AH, BL, BH, CL, CH, DL, DH, DIB, SIB, BPB, SPB, R8B, R9B, R10B, R11B, R12B, R13B, R14B, R15B, DIL, SIL, BPL, SPL, R8L, R9L, R10L, R11L, R12L, R13L, R14L, R15L | 1
GPR 16bit | 'W' | AX, BX, CX, DX, BP, SP, SI, DI, R8W, R9W, R10W, R11W, R12W, R13W, R14W, R15W | 2
GPR 32bit | 'D' | EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI, R8D, R9D, R10D, R11D, R12D, R13D, R14D, R15D | 4
GPR 64bit | 'Q' | RAX, RBX, RCX, RDX, RBP, RSP, RSI, RDI, R8, R9, R10, R11, R12, R13, R14, R15 | 8
Segment | 'S' | CS, SS, DS, ES, FS, GS | 2
FPU | 'F' | ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7 | 10
MMX | 'M' | MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7 | 8
XMM | 'X' | XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM16, XMM17, XMM18, XMM19, XMM20, XMM21, XMM22, XMM23, XMM24, XMM25, XMM26, XMM27, XMM28, XMM29, XMM30, XMM31 | 16
AVX | 'Y' | YMM0, YMM1, YMM2, YMM3, YMM4, YMM5, YMM6, YMM7, YMM8, YMM9, YMM10, YMM11, YMM12, YMM13, YMM14, YMM15, YMM16, YMM17, YMM18, YMM19, YMM20, YMM21, YMM22, YMM23, YMM24, YMM25, YMM26, YMM27, YMM28, YMM29, YMM30, YMM31 | 32
AVX-512 | 'Z' | ZMM0, ZMM1, ZMM2, ZMM3, ZMM4, ZMM5, ZMM6, ZMM7, ZMM8, ZMM9, ZMM10, ZMM11, ZMM12, ZMM13, ZMM14, ZMM15, ZMM16, ZMM17, ZMM18, ZMM19, ZMM20, ZMM21, ZMM22, ZMM23, ZMM24, ZMM25, ZMM26, ZMM27, ZMM28, ZMM29, ZMM30, ZMM31 | 64
Mask | 'K' | K0, K1, K2, K3, K4, K5, K6, K7 | 8
Bound | 'N' | BND0, BND1, BND2, BND3 | 16
Control | 'C' | CR0, CR2, CR3, CR4, CR8 | 4
Debug | 'E' | DR0, DR1, DR2, DR3, DR6, DR7 | 4
Test | 'T' | TR3, TR4, TR5 | 4

Register names are case insensitive. General Purpose Registers (GPR) are aliased, for instance AL is another name for the lower half of AX, which is the lower half of EAX, which is the lower half of RAX.

Similarly, SIMD (AVX) registers are aliased as well: XMM0 is another name for the lower half of YMM0, which is the lower half of ZMM0.

Names of 8bit registers DIB, SIB, BPB, SPB, R8B..R15B are aliases for the least significant byte of RDI, RSI, RBP, RSP, R8..R15. They may also be referred to as DIL, SIL, BPL, SPL, R8L..R15L, as used in Intel manual. €ASM supports both suffixes L and B. Those registers are available in 64bit mode only.

Some other assemblers and Intel manuals use notation ST(0), ST(1)..ST(7) for Floating-Point Unit register names, but this syntax is not accepted in €ASM. Nor can the ST0 register be aliased as ST (top of the FPU stack).

Processor x86 contains some other registers which hold flags, descriptor tables, FPU control and status registers, but they are not listed in the table above because they are not directly accessible by their name.

↑ Condition codes

General condition codes ↓

SSE condition codes ↓

Result of some CPU operations is treated as a predicate with mnemonic shortcut that can be used as a part of instruction name.

↑ General condition codes

Some combinations of CPU flags ZF, CF, OF, SF, PF are given special names, so called condition codes. They are used in mnemonic of conditional branching using the jump instructions or in bit-manipulation general-purpose instructions.

Inverted code can be used in macroinstructions to bypass region of code when the condition is not met. See the automatic %variable inverted condition code.

General condition codes table
Num.   Mnemonic  Alias  Description             Condition         Inverted
value  code                                                       mnem.code
0x4    E         Z      Equal                   ZF=1              NE
0x5    NE        NZ     Not Equal               ZF=0              E
0x4    Z         E      Zero                    ZF=1              NZ
0x5    NZ        NE     Not Zero                ZF=0              Z
0x2    C         B      Carry                   CF=1              NC
0x3    NC        NB     Not Carry               CF=0              C
0x2    B         C      Borrow                  CF=1              NB
0x3    NB        NC     Not Borrow              CF=0              B
0x0    O                Overflow                OF=1              NO
0x1    NO               Not Overflow            OF=0              O
0x8    S                Sign                    SF=1              NS
0x9    NS               Not Sign                SF=0              S
0xA    P         PE     Parity                  PF=1              NP
0xB    NP        PO     Not Parity              PF=0              P
0xA    PE        P      Parity Even             PF=1              PO
0xB    PO        NP     Parity Odd              PF=0              PE
0x7    A         NBE    Above                   CF=0 && ZF=0      NA
0x6    NA        BE     Not Above               CF=1 || ZF=1      A
0x3    AE        NB     Above or Equal          CF=0              NAE
0x2    NAE       B      Not Above nor Equal     CF=1              AE
0x2    B         NAE    Below                   CF=1              NB
0x3    NB        AE     Not Below               CF=0              B
0x6    BE        NA     Below or Equal          CF=1 || ZF=1      NBE
0x7    NBE       A      Not Below nor Equal     CF=0 && ZF=0      BE
0xF    G         NLE    Greater                 SF=OF && ZF=0     NG
0xE    NG        LE     Not Greater             SF<>OF || ZF=1    G
0xD    GE        NL     Greater or Equal        SF=OF             NGE
0xC    NGE       L      Not Greater nor Equal   SF<>OF            GE
0xC    L         NGE    Less                    SF<>OF            NL
0xD    NL        GE     Not Less                SF=OF             L
0xE    LE        NG     Less or Equal           SF<>OF || ZF=1    NLE
0xF    NLE       G      Not Less nor Equal      SF=OF && ZF=0     LE
       CXZ              CX register is Zero     CX=0
       ECXZ             ECX register is Zero    ECX=0
       RCXZ             RCX register is Zero    RCX=0
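
In the table every condition and its inverted counterpart carry numeric values which differ only in the lowest bit, so inverting a code amounts to flipping that bit. The following Python sketch illustrates the observation with one mnemonic per numeric value; the dictionary and the function name are assumptions made for this example, and CXZ, ECXZ, RCXZ are left out because they have no numeric value.

```python
# One representative mnemonic per numeric value, taken from the table above.
CODES = {"O": 0x0, "NO": 0x1, "B": 0x2, "NB": 0x3, "E": 0x4, "NE": 0x5,
         "BE": 0x6, "A": 0x7, "S": 0x8, "NS": 0x9, "P": 0xA, "NP": 0xB,
         "L": 0xC, "GE": 0xD, "LE": 0xE, "G": 0xF}
NAMES = {value: name for name, value in CODES.items()}

def inverted(mnemonic: str) -> str:
    """Mnemonic of the opposite condition: flip the lowest bit of the numeric value."""
    return NAMES[CODES[mnemonic.upper()] ^ 1]

print(inverted("E"), inverted("A"), inverted("G"))
```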

↑ SSE condition codes

Streaming Single Instruction Multiple Data Extension instructions (V)CMPccSS,(V)CMPccSD,(V)CMPccPS,(V)CMPccPD use different set of condition codes cc.

Only aliased mnemonic code is documented for legacy instructions CMPccSS,CMPccSD,CMPccPS,CMPccPD.
SSE condition codes table
Num.   Mnemonic  Alias  Description
value  code
0x00   EQ_OQ     EQ     Equal, Ordered, Quiet
0x01   LT_OS     LT     Less Than, Ordered, Signaling
0x02   LE_OS     LE     Less than or Equal, Ordered, Signaling
0x03   UNORD_Q   UNORD  Unordered, Quiet
0x04   NEQ_UQ    NEQ    Not Equal, Unordered, Quiet
0x05   NLT_US    NLT    Not Less Than, Unordered, Signaling
0x06   NLE_US    NLE    Not Less than or Equal, Unordered, Signaling
0x07   ORD_Q     ORD    Ordered, Quiet
0x08   EQ_UQ            Equal, Unordered, Quiet
0x09   NGE_US    NGE    Not Greater than or Equal, Unordered, Signaling
0x0A   NGT_US    NGT    Not Greater Than, Unordered, Signaling
0x0B   FALSE_OQ  FALSE  False, Ordered, Quiet
0x0C   NEQ_OQ           Not Equal, Ordered, Quiet
0x0D   GE_OS     GE     Greater than or Equal, Ordered, Signaling
0x0E   GT_OS     GT     Greater Than, Ordered, Signaling
0x0F   TRUE_UQ   TRUE   True, Unordered, Quiet
0x10   EQ_OS            Equal, Ordered, Signaling
0x11   LT_OQ            Less Than, Ordered, Quiet
0x12   LE_OQ            Less than or Equal, Ordered, Quiet
0x13   UNORD_S          Unordered, Signaling
0x14   NEQ_US           Not Equal, Unordered, Signaling
0x15   NLT_UQ           Not Less Than, Unordered, Quiet
0x16   NLE_UQ           Not Less than or Equal, Unordered, Quiet
0x17   ORD_S            Ordered, Signaling
0x18   EQ_US            Equal, Unordered, Signaling
0x19   NGE_UQ           Not Greater than or Equal, Unordered, Quiet
0x1A   NGT_UQ           Not Greater Than, Unordered, Quiet
0x1B   FALSE_OS         False, Ordered, Signaling
0x1C   NEQ_OS           Not Equal, Ordered, Signaling
0x1D   GE_OQ            Greater than or Equal, Ordered, Quiet
0x1E   GT_OQ            Greater Than, Ordered, Quiet
0x1F   TRUE_US          True, Unordered, Signaling

↑ Operators

Operator is an order to compute at assembly-time.

Combination of punctuation characters is used in €ASM to prescribe various operations with numbers, addresses, strings and registers in the assembly process. Placing a binary operator between two numbers tells €ASM to replace these three elements with the result of operation. Some operators are unary, they modify the value of operand which they stand in front of.

All operations implemented in €ASM are presented in the following table.

Operation table
Operation               Priority  Properties               Left        Operator  Right       Result              II (6)
                                                           operand               operand
Membership              16        binary noncomm. (1)      identifier  .         identifier  identifier
Attribute               15        unary noncomm. (3)                   attr#     element     number or address
Case-insens. Equal      14        binary commutative (2)   string      ==        string      boolean             CMPS
Case-sens. Equal        14        binary commutative       string      ===       string      boolean             CMPS
Case-insens. Nonequal   14        binary commutative (2)   string      !==       string      boolean             CMPS
Case-sens. Nonequal     14        binary commutative       string      !===      string      boolean             CMPS
Plus                    13        unary (3)                            +         number      numeric             NOP
Minus                   13        unary (3)                            -         number      numeric             NEG
Shift Logical Left      12        binary noncommutative    number      <<        number      numeric             SHL
Shift Arithmetic Left   12        binary noncommutative    number      #<<       number      numeric             SAL
Shift Logical Right     12        binary noncommutative    number      >>        number      numeric             SHR
Shift Arithmetic Right  12        binary noncommutative    number      #>>       number      numeric             SAR
Signed Division         11        binary noncommutative    number      #/        number      numeric             IDIV
Division                11        binary noncommutative    number      /         number      numeric             DIV
Signed Modulo           11        binary noncommutative    number      #\        number      numeric             IDIV
Modulo                  11        binary noncommutative    number      \         number      numeric             DIV
Signed Multiplication   11        binary commutative       number      #*        number      numeric             IMUL
Multiplication          11        binary commutative       number      *         number      numeric             MUL
Scaling                 10        binary commutative (5)   number      *         register    address expression
Addition                9         binary commutative       number      +         number      numeric             ADD
Subtraction             9         binary noncommutative    number      -         number      numeric             SUB
Indexing                9         binary commutative (5)   number      +         register    address expression
Bitwise NOT             8         unary (3)                            ~         number      numeric             NOT
Bitwise AND             7         binary commutative       number      &         number      numeric             AND
Bitwise OR              6         binary commutative       number      |         number      numeric             OR
Bitwise XOR             6         binary commutative       number      ^         number      numeric             XOR
Above                   5         binary noncommutative    number      >         number      boolean             JA
Greater                 5         binary noncommutative    number      #>        number      boolean             JG
Below                   5         binary noncommutative    number      <         number      boolean             JB
Lower                   5         binary noncommutative    number      #<        number      boolean             JL
Above or Equal          5         binary noncommutative    number      >=        number      boolean             JAE
Greater or Equal        5         binary noncommutative    number      #>=       number      boolean             JGE
Below or Equal          5         binary noncommutative    number      <=        number      boolean             JBE
Lower or Equal          5         binary noncommutative    number      #<=       number      boolean             JLE
Numeric Equal           5         binary commutative       number      =         number      boolean             JE
Numeric Nonequal        5         binary commutative (4)   number      != or <>  number      boolean             JNE
Logical NOT             4         unary (3)                            !         number      boolean             NOT
Logical AND             3         binary commutative       number      &&        number      boolean             AND
Logical OR              2         binary commutative       number      ||        number      boolean             OR
Logical XOR             2         binary commutative       number      ^^        number      boolean             XOR
Segment separation      1         binary noncommutative    number      :         number      address expression
Data duplication        0         binary noncomm. (1) (5)  number      *         datatype    data expression
Range                   0         binary noncomm. (1)      number      ..        number      range
Substring               0         binary noncomm. (1)      text        [ ]       range       text
Sublist                 0         binary noncomm. (1)      text        { }       range       text

(1) Special operations Membership, Duplication, Range, Substring, Sublist are solved at parser level rather than by the €ASM expression evaluator. They are listed here only for completeness.

(2) Case insensitive string-compare operations ignore the character case of letters A..Z but not the case of accented national letters above ASCII 127.

(3) Unary operator applies to the following operand. Binary operators work with two operands. Attribute operator applies to the following element or expression in parenthesis/brackets.

(4) Numeric Nonequal operation has two aliased operators != and <>. You can choose whichever you like.

(5) Operations Multiplication, Scaling and Duplication share the same operator *. Similarly, Addition and Indexing share operator +. The actual operation is determined by the type of operands.

(6) Column II illustrates which equivalent machine instruction is used internally to compute the operation at assembly-time.

The commutative property specifies whether both operands of a binary operation can be exchanged without affecting the result.

Priority column specifies the order of processing operators. Higher priority operations compute sooner but this can be changed with priority parenthesis ( ). Operation with equal priority compute in their notation order (from left to right).

Operations which calculate with signed integers have the operator prefixed with #. Operations Addition and Subtraction do not need a special "#signed" version because they compute with signed and unsigned integer numbers in the same way.

Both numeric and boolean operations return 64bit number. In case of boolean operations the result number has one of the two possible values: 0 (FALSE) or -1 = 0xFFFF_FFFF_FFFF_FFFF (TRUE). For example the expression
'+' & %1 #>= 0 | '-' & %1 #< 0 is evaluated as
('+' & (%1 #>= 0)) | ('-' & (%1 #< 0)) and its result is the minus sign (45) if %1 is negative and plus sign (43) otherwise.
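
Because TRUE is -1 (all bits set) and FALSE is 0, a boolean result can serve as a bit mask. A Python sketch of the expression above follows; the helper names boolean and sign_char are hypothetical, and Python's unbounded integers behave like the 64bit values for this purpose.

```python
def boolean(condition: bool) -> int:
    """Boolean result as described above: -1 (all bits set) for TRUE, 0 for FALSE."""
    return -1 if condition else 0

def sign_char(number: int) -> int:
    """Model of  '+' & %1 #>= 0 | '-' & %1 #< 0  with %1 replaced by number."""
    return (ord("+") & boolean(number >= 0)) | (ord("-") & boolean(number < 0))

print(sign_char(-5), sign_char(0), sign_char(7))
```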

Spaces which separate operands and operators in expression examples serve only for better readability and they are not required by €ASM syntax.

Rich set of operators allows €ASM to get rid of cloned pseudoinstructions such as IFE, IFB, IFIDN, IFIDNI, IFDIF, ERRIDNI, ERRNB...

The Shift operators family is given higher priority than in other languages because I treat shifts as a special kind of multiplication/division.
NASM evaluates the expression 4+3<<2 as (4+3)<<2 = 28 but in €ASM it is evaluated as 4+(3<<2) = 16.
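
Python happens to give shifts a lower priority than addition, so it can illustrate both readings of the same expression. This is merely an arithmetic check of the two values, with parentheses spelling out the grouping described for €ASM.

```python
shift_after_addition = 4 + 3 << 2      # Python groups this as (4 + 3) << 2
shift_before_addition = 4 + (3 << 2)   # grouping described above for €ASM

print(shift_after_addition, shift_before_addition)
```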


↑ Expressions

Numeric and logical expressions ↓

Address expressions ↓

Register expressions ↓

Data expressions ↓

Special expressions ↓

Expression is a combination of operands, operators and priority parenthesis () which follows rules in the table below.

Syntax of expression
What may follow          left parenthesis  unary operator  operand  binary operator  right parenthesis  end of expression
beginning of expression  yes               yes             yes      no               no                 yes (2)
left parenthesis         yes               yes             yes      no               yes (2)            no
unary operator           yes               no              yes      no               no                 no
operand                  no                no              no       yes              yes                yes
binary operator          yes               yes (1)         yes      no               no                 no
right parenthesis        no                no              no       yes              yes                yes

(1) Unary operator is permitted after binary operation, e.g. 5*-3 evaluates as 5*(-3).

(2) Empty expression, empty parenthesis contents and superabundant parenthesis are valid.

The table shows which combinations are permitted. It should be read by rows, for instance the first line stipulates that expression may begin with left parenthesis, unary operator or an operand.

Expression is parsed to elementary unary and binary operations, which are calculated according to the priority. Operations with the same priority are computed from left to right. Priority can be increased using parentheses ( ).

↑ Numeric and logical expressions

String compare ↓
Numeric compare ↓
Numeric arithmetic ↓
Shift ↓
Bitwise arithmetic ↓
Boolean algebra ↓
Numeric operations calculate internally with 64-bit integers, no matter if the target program is intended to run in 64bit mode or not.

Result of numeric or logical expression is a scalar 64-bit numeric value (signed integer). It may be treated as a number or as a logical value. Zero result is treated as boolean false and any nonzero result is boolean true. Pure logical expressions, such as logical NOT, AND, OR, XOR and all compare operations return 0 when false and 0xFFFF_FFFF_FFFF_FFFF = -1 when true. This makes it possible to use the result of a logical expression in subsequent bitwise operations with all bits.

↑ String compare

String compare expressions return boolean value. Case insensitive versions convert both strings to the same case before actual comparing; however this concerns ASCII letters A..Z only. National letters with accents in any codepage are always compared case sensitively.

String compare is given the highest priority since no other assembly-time operation can be performed with strings beside the test of equality. At assembly time €ASM cannot tell which string is "bigger".
|00000000:FFFFFFFFFFFFFFFF | DQ "EAX" == "eax" ; TRUE, strings are equal.
|00000008:0000000000000000 | DQ "EAX" === "eax" ; FALSE, strings differ in character case.
|00000010:FFFFFFFFFFFFFFFF | DQ "I'm OK." === 'I''m OK.' ; TRUE, their netto value is equal.
|00000018:0000000000000000 | DQ "Müller" == "MÜLLER" ; FALSE because of different case of umlauted U's.
|00000020:0000000000000000 | DQ "012" == "12" ; FALSE, strings are not equal.
|00000028:0000000000000000 | DQ "123" = 123 ; FALSE; character constant "123"=3355185 which is not 123.
|00000030: | DQ "123" == 123 ; Syntax error; right operand is not a string.
|### E6321 String compare InsensEqual with non-string operand in expression ""123" == 123".
|00000030:
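
The ASCII-only case folding can be modelled in Python as follows. This is a sketch of the described behaviour, not of €ASM's implementation; the names ascii_upper, insens_equal and sens_equal are made up for the example.

```python
def ascii_upper(text: str) -> str:
    """Convert only the letters a..z to upper case; accented letters stay untouched."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in text)

def insens_equal(left: str, right: str) -> int:
    """Model of the == operator: -1 (TRUE) or 0 (FALSE)."""
    return -1 if ascii_upper(left) == ascii_upper(right) else 0

def sens_equal(left: str, right: str) -> int:
    """Model of the === operator: -1 (TRUE) or 0 (FALSE)."""
    return -1 if left == right else 0

print(insens_equal("EAX", "eax"), sens_equal("EAX", "eax"))
print(insens_equal("Müller", "MÜLLER"))   # ü and Ü are compared case sensitively
```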

Case insensitive string compare should be used with built-in €ASM elements, such as register or datatype names, e.g.

 %IF '%1' !== 'ECX'
   %ERROR Only register ECX is expected as the first macro operand.
 %ENDIF

When we are investigating the presence of punctuation, it's better to use case-sensitive compare, because it assembles faster (€ASM doesn't have to convert both sides to a common character case):

DoSomethingWithMemoryVar %MACRO
 %IF '%1[1]' !=== '['  ; Test if the 1st operand begins with square bracket.
   %ERROR The first operand should be memory variable in [brackets].
 %ENDIF
%ENDMACRO DoSomethingWithMemoryVar

The test on square bracket in previous example fails if the macro operand is a string or character-constant in quotes, e.g. DoSomethingWithMemoryVar 'xyz'. The string compare operation will raise E6101 Expression "''' !=== '" is followed by unexpected character "[". because of syntax error. A trick how to avoid E6101 is to compare doubled values. In this case both single or double quotes escape themselves:

DoSomethingWithMemoryVar %MACRO
 %IF '%1[1]%1[1]' !=== '[['  ; Test if the 1st operand begins with a square bracket.
   %ERROR The first operand should be memory variable in [brackets].
 %ENDIF
%ENDMACRO DoSomethingWithMemoryVar

↑ Numeric compare

Numeric compare operations use single equal sign =, optionally combined with < or > and they can compare values of two plain numbers or offsets of two addresses within the same segment.

Numeric compare can be used to test which side of operation is bigger. Terms above/below are used when comparing unsigned numbers or addresses. Terms greater/lower are used for comparing signed numbers. Operators which treat numbers as signed are prefixed with # modifier. Virtual addresses are always unsigned, therefore we cannot ask whether they are greater or lower.

|00000000:FFFFFFFFFFFFFFFF | DQ 5 < 7 ; TRUE, 5 is below 7.
|00000008:FFFFFFFFFFFFFFFF | DQ 5 #< 7 ; TRUE, 5 is lower than 7.
|00000010:0000000000000000 | DQ 5 #< -7 ; FALSE, 5 is not lower than -7.
|00000018:FFFFFFFFFFFFFFFF | DQ 5 < -7 ; TRUE, 5=0x0000_0000_0000_0005 is below -7=0xFFFF_FFFF_FFFF_FFF9.
|00000020:FFFFFFFFFFFFFFFF | DQ 123 = 0123 ; TRUE, both numbers are equal.
|00000028:0000000000000000 | DQ "123" == "0123" ; FALSE, both strings are different.
|00000030:0000000000000000 | DQ "123" = "0123" ; FALSE, both sides are treated as character constants with different values.
|00000038: | DQ "123" = "000000123" ; "000000123" is not a number, it's too big for a character constant.
|### E6131 Character constant "123" = "000000123" is too big for 64 bits.
|00000038: |
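
The difference between unsigned (above/below) and signed (greater/lower) comparison can be modelled outside the assembler. The following Python sketch is only an illustration of the rules described above, not €ASM code; the function names `below` and `lower` are made up for this example.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF
TRUE = MASK64    # all 64 bits set, which reads as -1 when signed
FALSE = 0


def to_signed(value):
    """Interpret a 64-bit pattern as a signed integer."""
    value &= MASK64
    return value - (1 << 64) if value & (1 << 63) else value


def below(a, b):
    """Unsigned compare, a model of the plain < operator."""
    return TRUE if (a & MASK64) < (b & MASK64) else FALSE


def lower(a, b):
    """Signed compare, a model of the #< operator."""
    return TRUE if to_signed(a) < to_signed(b) else FALSE
```

With these helpers `below(5, -7)` is true while `lower(5, -7)` is false, matching the third and fourth rows of the listing.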
↑ Numeric arithmetic

Common arithmetic operations are Addition, Subtraction, Multiplication, Division and Modulo (remainder after division).

Unary minus may be applied to scalar numeric operand only. Unary plus does not change the value of operand; it is included in the operator set only for completeness. A binary operator immediately followed by a unary one is accepted by €ASM, however weird this may seem. This is useful when evaluating expressions with a substituted value, such as 5 + %1 where the symbolic argument %1 happens to be negative, e.g. -2. This expression is calculated as 5 + %1 -> 5 + -2 -> 5 + (-2) -> 3.

The greatest permitted value of number in €ASM source is 0xFFFF_FFFF_FFFF_FFFF -> 18_446_744_073_709_551_615 as unsigned, or 0x7FFF_FFFF_FFFF_FFFF -> 9_223_372_036_854_775_807 as signed. Overflow at assembly time is ignored in Addition, Subtraction and Shift Logical operation. Assembly error is reported when overflow occurs during Multiplication and Shift Arithmetic Left operation, or when division-by-zero happens during Division or Modulo operation. This maximum must not be exceeded even in intermediate results during the evaluation, such as 0x7FFF_FFFF_FFFF_FFFF * 2 / 2 (€ASM reports error). However, rearranged code 0x7FFF_FFFF_FFFF_FFFF * (2 / 2) assembles well.

No overflow is reported in following examples of numeric expressions evaluation:

|00000000:0E00000000000000 | DQ 2 + 3 * 4 ; Result is 14.
|00000008:0200000000000000 | DQ 0xFFFF_FFFF_FFFF_FFF9 + 0x0000_0000_0000_0009 ; Result is 2.
|00000010:0200000000000000 | DQ -7 + 9 ; Result is 2 (0xFFFF_FFFF_FFFF_FFF9 + 0x0000_0000_0000_0009).
|00000018:0200010000000000 | DQ 0xFFF9 + 0x0009 ; Result is 65538 (0x0000_0000_0000_FFF9 + 0x0000_0000_0000_0009).
|00000020: |

€ASM calculates with integer truncated division and [Modulo] at assembly-time, same as machine instruction IDIV.

Before signed division applies, both dividend and divisor are internally converted to positive numbers. Then, having been divided as unsigned, the quotient is converted to negative if one of the operands (but not both) was negative.
Remainder in signed modulo operation is converted to negative only when the dividend was negative.

|00000000: |; Signed division:
|00000000:0300000000000000 | DQ +14 #/ +4 ; +(0x0000_0000_0000_000E / 0x0000_0000_0000_0004) is +3.
|00000008:FDFFFFFFFFFFFFFF | DQ -14 #/ +4 ; -(0x0000_0000_0000_000E / 0x0000_0000_0000_0004) is -3.
|00000010:FDFFFFFFFFFFFFFF | DQ +14 #/ -4 ; -(0x0000_0000_0000_000E / 0x0000_0000_0000_0004) is -3.
|00000018:0300000000000000 | DQ -14 #/ -4 ; +(0x0000_0000_0000_000E / 0x0000_0000_0000_0004) is +3.
|00000020: |; Unsigned division:
|00000020:0300000000000000 | DQ +14 / +4 ; (0x0000_0000_0000_000E / 0x0000_0000_0000_0004) is 3.
|00000028:FCFFFFFFFFFFFF3F | DQ -14 / +4 ; (0xFFFF_FFFF_FFFF_FFF2 / 0x0000_0000_0000_0004) is 4_611_686_018_427_387_900.
|00000030:0000000000000000 | DQ +14 / -4 ; (0x0000_0000_0000_000E / 0xFFFF_FFFF_FFFF_FFFC) is 0.
|00000038:0000000000000000 | DQ -14 / -4 ; (0xFFFF_FFFF_FFFF_FFF2 / 0xFFFF_FFFF_FFFF_FFFC) is 0.
|00000040: |; Signed modulo:
|00000040:0200000000000000 | DQ +14 #\ +4 ; +(0x0000_0000_0000_000E \ 0x0000_0000_0000_0004) is +2.
|00000048:FEFFFFFFFFFFFFFF | DQ -14 #\ +4 ; -(0x0000_0000_0000_000E \ 0x0000_0000_0000_0004) is -2.
|00000050:0200000000000000 | DQ +14 #\ -4 ; +(0x0000_0000_0000_000E \ 0x0000_0000_0000_0004) is +2.
|00000058:FEFFFFFFFFFFFFFF | DQ -14 #\ -4 ; -(0x0000_0000_0000_000E \ 0x0000_0000_0000_0004) is -2.
|00000060: |; Unsigned modulo:
|00000060:0200000000000000 | DQ +14 \ +4 ; (0x0000_0000_0000_000E \ 0x0000_0000_0000_0004) is 2.
|00000068:0200000000000000 | DQ -14 \ +4 ; (0xFFFF_FFFF_FFFF_FFF2 \ 0x0000_0000_0000_0004) is 2.
|00000070:0E00000000000000 | DQ +14 \ -4 ; (0x0000_0000_0000_000E \ 0xFFFF_FFFF_FFFF_FFFC) is 14.
|00000078:F2FFFFFFFFFFFFFF | DQ -14 \ -4 ; (0xFFFF_FFFF_FFFF_FFF2 \ 0xFFFF_FFFF_FFFF_FFFC) is 18_446_744_073_709_551_602.
|00000080: |
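
The sign rules of truncated division and modulo can be checked with a short Python model. This is a sketch of the rules stated above, not the assembler's own code; the names `sdiv`, `smod`, `udiv` and `umod` are hypothetical.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def udiv(a, b):
    """Unsigned division: both operands are taken as 64-bit patterns."""
    return (a & MASK64) // (b & MASK64)


def umod(a, b):
    """Unsigned remainder of two 64-bit patterns."""
    return (a & MASK64) % (b & MASK64)


def sdiv(a, b):
    """Signed truncated division: divide magnitudes, then fix the sign."""
    quotient = abs(a) // abs(b)
    return -quotient if (a < 0) != (b < 0) else quotient


def smod(a, b):
    """Signed remainder: negative only when the dividend was negative."""
    remainder = abs(a) % abs(b)
    return -remainder if a < 0 else remainder
```

A zero divisor raises `ZeroDivisionError` in this model, which plays the role of the assembly error mentioned earlier. Note that Python's own `//` and `%` round toward negative infinity, which is why the magnitudes are divided first.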
↑ Shift

Shift operations are not commutative. Operand on the left side is treated as a 64-bit integer and shifted to the left|right by the number of bits specified by the operand on the right side.

Shift operations at assembly time are given higher priority than other numeric operations because they correspond to computing a power of 2 rather than to multiplication or division. For instance 1 << 7 is equivalent to 1 * 2^7.

NASM evaluates the expression 4 + 3 << 2 as (4 + 3) << 2 -> 28, but in €ASM it is evaluated as 4 + (3 << 2) -> 16.

Bits which enter the least significant bit (LSb) during Shift Left operation are always 0. Bits which enter the most significant bit (MSb) during Shift Right operation are either 0 (Shift Logical Right), or they copy their previous value (Shift Arithmetic Right), thus preserving the sign of operand.

Bits which leave LSb during Shift Right are discarded. Bits which leave MSb during Shift Left are discarded, too, but overflow error E6311 is reported by €ASM when the sign of result (kept in MSb) has changed during Shift Arithmetic Left. Overflow sensitivity is the only difference between Shift Arithmetic Left and Shift Logical Left.

The right operand may be an arbitrary number; however when it is greater than 64, the result is 0 with one exception: a negative number shifted arithmetic right by more than 64 bits results in 0xFFFF_FFFF_FFFF_FFFF -> -1.

Shift by 0 bits does nothing. Shift by negative number just reverses the direction of actual shift from left to right and vice versa.

Assembly-time rotate operations are not supported.

|00000000:0000010000000000 | DQ 1 << 16 ; Result is 65536.
|00000008:F4FFFFFFFFFFFFFF | DQ -3 #<< 2 ; Result is -12.
|00000010:8078675645342312 | DQ 0x1122_3344_5566_7788 << 4 ; Result is 0x1223_3445_5667_7880.
|00000018:98A9BACBDCEDFE0F | DQ 0xFFEE_DDCC_BBAA_9988 >> 4 ; Result is 0x0FFE_EDDC_CBBA_A998.
|00000020:98A9BACBDCEDFEFF | DQ 0xFFEE_DDCC_BBAA_9988 #>> 4 ; Result is 0xFFFE_EDDC_CBBA_A998.
|00000028:0000000000000000 | DQ 0x8000_0000_0000_0000 << 1 ; Result is 0x0000_0000_0000_0000.
|00000030: | DQ 0x8000_0000_0000_0000 #<< 1 ; Overflow, MSb would have been changed.
|### E6311 ShiftArithmeticLeft 64bit overflow in "0x8000_0000_0000_0000 #<< 1".
|00000030: |
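
The four shift variants can be modelled in Python as follows. This is a minimal sketch for shift counts 0..63 only; larger and negative counts are left out. The overflow test in `sal` (the signed result must still fit in 64 bits) is an assumption about how the sign change is detected, and all function names are hypothetical.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF
SIGN = 1 << 63


def to_signed(value):
    """Interpret a 64-bit pattern as a signed integer."""
    value &= MASK64
    return value - (1 << 64) if value & SIGN else value


def shl(value, count):
    """Shift Logical Left (<<): bits leaving the MSb are discarded."""
    return (value << count) & MASK64


def shr(value, count):
    """Shift Logical Right (>>): zeros enter the MSb."""
    return (value & MASK64) >> count


def sar(value, count):
    """Shift Arithmetic Right (#>>): the sign bit is replicated."""
    return (to_signed(value) >> count) & MASK64


def sal(value, count):
    """Shift Arithmetic Left (#<<): like shl, but overflow is an error."""
    result = to_signed(value) << count
    if not -(1 << 63) <= result < (1 << 63):
        raise OverflowError("ShiftArithmeticLeft 64bit overflow")
    return result & MASK64
```

For example `sal(-3, 2)` gives the 64-bit pattern of -12, while `sal(0x8000_0000_0000_0000, 1)` raises, mirroring E6311 in the listing.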
↑ Bitwise arithmetic

Bitwise NOT, AND, OR, XOR perform the logical operation on the whole operands, bit by bit.

|0000:FA | DB ~ 5 ; ~ 0000_0101b is 1111_1010b which is -6.
|0001:04 | DB 5 & 12 ; 0000_0101b & 0000_1100b is 0000_0100b which is 4.
|0002:0D | DB 5 | 12 ; 0000_0101b | 0000_1100b is 0000_1101b which is 13.
|0003:09 | DB 5 ^ 12 ; 0000_0101b ^ 0000_1100b is 0000_1001b which is 9.
↑ Boolean algebra

Logical NOT, AND, OR, XOR operate with numbers as well as with boolean values.
Each operand, which is internally stored as nonzero 64bit number, is converted to boolean true (0xFFFF_FFFF_FFFF_FFFF) before the actual logical operation.
Operand with value 0 is treated as false.

|0000:FF | DB 3 && 4 ; 0000_0011b && 0000_0100b is TRUE && TRUE (both operands are non-zero) which is TRUE.
|0001:00 | DB 3 & 4 ; 0000_0011b & 0000_0100b have no common bit set, result is 0000_0000b, which is FALSE.
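
The contrast between logical and bitwise AND can be expressed as a small Python model. It is a sketch of the conversion rule described above; the helper names are invented for this illustration.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF
TRUE, FALSE = MASK64, 0


def as_bool(value):
    """Any nonzero operand becomes TRUE (all 64 bits set)."""
    return TRUE if value & MASK64 else FALSE


def logical_and(a, b):
    """Model of &&: operands are converted to booleans first."""
    return as_bool(a) & as_bool(b)


def bitwise_and(a, b):
    """Model of &: operands are combined bit by bit."""
    return (a & b) & MASK64
```

Because TRUE has all bits set, a logical result can serve directly as a mask: `bitwise_and(logical_and(3, 4), 0x1234)` keeps the value 0x1234 unchanged, while a FALSE result would clear it.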

↑ Address expressions

Numeric expressions operate with literal numeric values, such as 1, 0x23, '4567' or with symbols representing scalar numeric value, such as NumericSymbolTen EQU 10. Most symbols in real assembler program represent address value which points to some data in memory or to some position in the program code.

While a plain number (scalar) is internally stored by €ASM in eight bytes, an address needs additional room to keep information of the segment it belongs to.

Imagine yourself driving a car. You're passing milestone 123 on a highway when a friend of yours rings you up to say they're passing milestone 97. How far from one another are you? The answer is as easy as subtracting only if you are both driving on the same highway.

The set of operations defined with address symbols is very limited in comparison with numeric expressions. They cannot be multiplied, divided, shifted, logically operated. Only two kinds of operations are allowed with addresses:

  1. Scalar numeric value may be added to the address symbol or subtracted from it. The result is address symbol again; operation affects the offset part of address; segment part remains intact.
  2. Two symbols may be subtracted from one another (or compared with one another) if they both belong to the same segment. The result is a scalar numeric value calculated as the difference of their offsets.
The reason for this limitation is the segment:offset addressing paradigm in IA-32 architecture. Assembler does not know how segments will be combined together at link-time or at which virtual address the segment will be loaded at run-time; it only marks all references to relocatable addresses in object code. Linker is responsible for patching (fixing up) those addresses at link time; however the mathematical capability of linkers is restricted to adding and subtracting. Linkable file formats lack specification of more sophisticated arithmetic.
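
The two permitted operations can be sketched as a tiny Python class. It only models the rules above (an address as a segment plus an offset); the class name `Address` and the segment names used with it are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    segment: str   # name of the segment the address belongs to
    offset: int    # position within that segment

    def __add__(self, scalar):
        # Rule 1: address + scalar is an address in the same segment.
        if not isinstance(scalar, int):
            return NotImplemented
        return Address(self.segment, self.offset + scalar)

    def __sub__(self, other):
        if isinstance(other, int):
            # Rule 1: address - scalar is an address in the same segment.
            return Address(self.segment, self.offset - other)
        if isinstance(other, Address):
            # Rule 2: address - address is a scalar, same segment only.
            if other.segment != self.segment:
                raise ValueError("addresses belong to different segments")
            return self.offset - other.offset
        return NotImplemented
```

In terms of the highway parable, `Address("[DATA]", 123) - Address("[DATA]", 97)` yields the scalar 26, while subtracting addresses from two different segments is rejected.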

↑ Register expressions

Memory variables are addressed as the offset from the first byte of used memory segment (displacement) which may be updated at run-time with the contents of one or two registers. Notation of such address is called register expression or memory address expression.

Unlike instructions with immediate number embedded in the instruction code, such as ADD EAX,1234, machine instructions which load|store data somewhere from|to memory, must have the whole operand enclosed in brackets [ ]. E.g. ADD EAX,[1234], where 1234 is offset of dword variable in data segment where the addend is loaded from.

MASM allows omitting square brackets even when the operand is a variable defined in memory, for instance ADD EAX,Something. Poor reader of MASM program has to look for the definition of variable to learn whether it was defined in memory (Something DD 1) or if it was defined as a constant (Something EQU 1). Newer assemblers abandoned this design flaw, luckily.

When the address expression is used in machine instruction, it may be completed with register names; it becomes register address expression. Complete address expression follows the schema
segment: base + scale * index + displacement where
segment is segment register CS, DS, ES, SS, FS, GS,
base is BX, BP in 16-bit addressing mode, or EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI, R8D..R15D in 32-bit addressing mode, or RAX, RBX, RCX, RDX, RBP, RSP, RSI, RDI, R8..R15 in 64-bit addressing mode,
scale is numeric expression which evaluates to scalar number 0, 1, 2, 4, 8,
index is SI, DI in 16-bit addressing mode, or EAX, EBX, ECX, EDX, EBP, ESI, EDI, R8D..R15D in 32-bit addressing mode, or RAX, RBX, RCX, RDX, RBP, RSI, RDI, R8..R15 in 64-bit addressing mode,
displacement is address or numeric expression with magnitude (width) not exceeding the addressing mode.

Some assemblers allow different syntax of memory addressing, for instance MOV EAX,Displ[ESI], MOV EAX,dword ptr [Displ+ESI], MOV EAX,Displ+[4*ESI], MOV EAX,Displ+4*[ESI]+[EBX].
EuroAssembler requires that the whole operand is enclosed in square brackets: MOV EAX,[Displ+4*ESI+EBX].

The order of components in addressing expression is arbitrary. Any portion of register address expression may be omitted.
Scale is not permitted in 16-bit addressing mode and scale cannot be used if index register is not specified.
ESP and RSP cannot be used as index register (they cannot be scaled).
Addressing modes of different sizes cannot be mixed in the same instruction, e.g. [EBX+SI] is not accepted.
16bit addressing mode is not available in 64bit CPU mode.

Registers allowed in addressing modes

16bit addressing mode in 16bit and 32bit segment
  base register:  BX SS:BP
  index register: SI DI
  displacement:   16bit signed integer, sign-extended to segment's width at run-time

32bit addressing mode in 16bit and 32bit segment
  base register:  EAX EBX ECX EDX ESI EDI SS:EBP SS:ESP
  index register: EAX EBX ECX EDX ESI EDI EBP
  displacement:   32bit signed integer, sign-extended|truncated to segment's width at run-time

32bit addressing mode in 64bit segment
  base register:  EAX EBX ECX EDX ESI EDI SS:EBP SS:ESP R8D..R15D
  index register: EAX EBX ECX EDX ESI EDI EBP R8D..R15D
  displacement:   32bit signed integer, sign-extended to segment's width at run-time

64bit addressing mode in 64bit segment
  base register:  RAX RBX RCX RDX RSI RDI SS:RBP SS:RSP R8..R15
  index register: RAX RBX RCX RDX RSI RDI RBP R8..R15
  displacement:   32bit signed integer, sign-extended to segment's width at run-time

MOFFS addressing mode in 16bit, 32bit and 64bit segment
  base register:  none
  index register: none
  displacement:   unsigned integer of segment's width (16|32|64 bits)

When segment register is not explicitly specified, default segment is used for addressing the operand. If BP, EBP, RBP, ESP or RSP is used as base register, the default segment is SS, otherwise it is DS. Nondefault segment register used for data retrieving may be specified either as an explicit instruction prefix SEGCS SEGDS SEGES SEGSS SEGFS SEGGS, or as segment register which becomes part of the register expression (implicit segment override). The segment register may be included in expression either with colon : (segment separator) or with plus + (indexing operator):

|0000:268A04 |SEGES MOV AL,[SI]
|0003:268A04 | MOV AL,[ES:SI]
|0006:268A04 | MOV AL,[ES+SI]

There is a subtle difference between implicit and explicit segment override: if it requests the same segment register which is already used as default, €ASM emits the prefix only when it is specified explicitly (in prefix field of the statement):

|0000:8B04 | MOV AX,[SI]
|0002:8B04 | MOV AX,[DS:SI]
|0004:3E8B04 | SEGDS: MOV AX,[SI]
|0007:3E8B04 | SEGDS: MOV AX,[DS:SI]

See t3021, t3022, t3023 for more examples.

In expressions where scaling is not used and therefore it's not obvious which of the two registers is meant as an index, €ASM treats the leftmost register as a base. So in [ESI+EBP] the base is ESI and implicit segment is DS, while in [EBP+ESI] the implicit segment is register SS.
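
The selection of the default segment can be summarized in a few lines of Python. This is only a sketch of the rule stated above, under the assumption that an expression without any base register defaults to DS; the function `default_segment` and its parameters are hypothetical and do not exist in €ASM.

```python
STACK_BASES = {"BP", "EBP", "RBP", "ESP", "RSP"}


def default_segment(registers, scaled=None):
    """Return 'SS' or 'DS' for a register address expression.

    registers: register names in the order they appear in the source.
    scaled:    name of the register multiplied by a scale, if any.
    """
    bases = [name.upper() for name in registers]
    if scaled is not None and scaled.upper() in bases:
        bases.remove(scaled.upper())      # the scaled register is the index
    if not bases:
        return "DS"                       # assumption: no base means DS
    # Without scaling, the leftmost register is treated as the base.
    return "SS" if bases[0] in STACK_BASES else "DS"
```

So `default_segment(["ESI", "EBP"])` gives DS and `default_segment(["EBP", "ESI"])` gives SS, as in the paragraph above.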

We don't have to bother with implicit segment selection in 32bit and 64bit FLAT model programs, because both SS and DS are loaded with the same segment descriptor at load-time.

Although the operators * or + in register address expression look like an ordinary multiplication or addition, they specify very different kind of operation called Scaling or Indexing when applied to a register. The actual multiplication or addition is performed at run-time rather than at assembly time, because the assembler cannot know the contents of registers.

Indexing operation has lower priority than the corresponding Multiplication. Hence, the register expression [EBX + 5 + ESI * 2 * 2] is evaluated as [EBX + 5 + ESI * (2 * 2)] -> [EBX + 5 + ESI * 4].

↑ Data expressions

Data expression specifies static data declared with pseudoinstruction D or with literals. Format of data expression is
duplicator * type value, where duplicator is non-negative integer number, type is primitive data type in full BYTE UNICHAR WORD DWORD QWORD TBYTE OWORD YWORD ZWORD INSTR or short B U W D Q T S O Y Z I notation, or a structure name. Optional value defines the contents of data which is repeated duplicator times.

Duplication is not commutative operation; duplicator must be on the left side of duplication operator *. Default duplicator value is 1 (the duplication is not used). Nested duplication is not supported in €ASM. Priority of duplication is very low, so the data expression 2 + 3 * B 4 is evaluated as five bytes where each contains the value 4. Example:

D 3 * BYTE          ; Declare three bytes with uninitialized contents.
D W 0x5             ; Declare one word with value 5.
D 2 * U "some text" ; Declare Unicode (UTF-16) string containing "some textsome text".
D 3 * MyStruc       ; Declare three instances of structured memory variable MyStruc.

See also pseudoinstruction D and tests t2480, t2481, t2482 for more examples.
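
What a data expression emits can be approximated with a short Python sketch. It covers only a few integer types and UNICHAR strings, models uninitialized data as zero bytes (an assumption made for the sake of the example), and the function `emit` is hypothetical.

```python
SIZES = {"BYTE": 1, "WORD": 2, "DWORD": 4, "QWORD": 8}


def emit(duplicator, type_name, value=None):
    """Return the bytes for `duplicator * type value` (little-endian)."""
    if duplicator < 0:
        raise ValueError("duplicator must be non-negative")
    if type_name == "UNICHAR":
        # Strings are stored as UTF-16 code units.
        unit = b"\x00\x00" if value is None else value.encode("utf-16-le")
    else:
        size = SIZES[type_name]
        number = 0 if value is None else value
        unit = (number & ((1 << (8 * size)) - 1)).to_bytes(size, "little")
    return unit * duplicator
```

The expression 2 + 3 * B 4 from the text corresponds to `emit(2 + 3, "BYTE", 4)`, which produces five bytes with the value 4.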

↑ Special expressions

Membership ↓
Range ↓
Substring ↓
Sublist ↓

The remaining expressions are not calculated with the mathematical expression evaluator; they are treated by the parser in a special way.

↑ Membership

The point . joining two identifiers makes a fully qualified name (FQN), which looks like a namespace identifier followed by a local name. FQN is nonlocal, it never starts with fullstop. For instance, when a local symbol .bar is declared in a procedure or structure Foo, it is treated by €ASM as symbol with FQN Foo.bar.

Namespace can be local, too, so the membership operation can nest.

↑ Range

Range is defined as two numeric expressions separated with range operator, which is .. (two adjacent fullstops) and it represents the set of integer numbers between those values, including the first and the last value.

A range has the property slope, which can be negative, zero or positive. Slope is defined as the sign of the difference between the right and the left value. Examples:

0 .. 15    ; Range represents sixteen numbers from 0 to 15; slope is positive.
-5 .. -4   ; Range represents values -5 and -4; slope is positive.
3 .. 4 - 1 ; Range represents one value 3; slope is zero.
2..-2      ; Range represents five values; slope is negative.
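
The members and the slope of a range can be enumerated with a short Python sketch. It merely restates the definitions above; `slope` and `members` are names chosen for this illustration.

```python
def slope(left, right):
    """Sign of (right - left): -1, 0 or +1."""
    return (right > left) - (right < left)


def members(left, right):
    """All integers of the range left..right, both ends included."""
    step = slope(left, right) or 1
    return list(range(left, right + step, step))
```

For instance `members(2, -2)` lists the five values 2, 1, 0, -1, -2 and `slope(2, -2)` is -1.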
↑ Substring

Substring is an operation which returns only a part of the input text. Substring operator is a range enclosed in a pair of square brackets []. The text is treated as a sequence of 8bit characters (bytes) and the range specifies which of them are used.

%Sample1 %SET ABCDEFGH ; Preprocessing variable %Sample1 now contains 8 characters.
 DB "%Sample1[3..5]"   ; This actually assembles as  DB "CDE"
↑ Sublist

Sublist operation is similar to Substring with the difference that curly brackets {} are used instead of square brackets and that it treats the input text as an array of comma-separated items (in case of %variable expansion), or as a sequence of physical lines (in case of file inclusion).

 INCLUDE "MySource.asm"{1..10} ; include first ten lines of file MySource.asm

  Common properties of suboperations Substring and Sublist:

Suboperator is appended to the suboperated resource without spaces.
Suboperations can be applied on four kinds of elements:

When applied to files, the file name must always be specified in double quotes.

Character and items are 1-based, the first suboperable member (character/item/line) has number 1.
Number of the last suboperable member is automatically assigned to a special variable %&.

Ordinal number of the last character/item/line of input text is assigned by €ASM to an automatic preprocessing variable with the name %&. This %variable is valid only in the suboperation, it cannot be used outside the brackets.

You can use pseudoinstruction %SETS to get the number of characters assigned to a %variable, or pseudoinstruction %SETL to get the number of items in it (array length).
You can use attribute operator FILESIZE# to get the number of bytes in a file at assembly-time.

In Substring operation the value of automatic %& is the number of characters assigned to the %variable, or the size of the included/object file in bytes.
In Sublist operation it represents the ordinal number of the last non-empty item in the %variable, or the number of physical lines in the included file.

|4142432C4445462C2C4748492C4A4B4C |%Sample %SET ABC,DEF,,GHI,JKL
|0000: | ; %& is now 16 in %Sample[%&] and 5 in %Sample{%&}.
|0000:4B4C | DB "%Sample[15..%&]" ; DB "KL"
|0002:4445462C2C4748492C4A4B4C | DB "%Sample{2..%&}" ; DB "DEF,,GHI,JKL"

Suboperated included file must be enclosed in double quotes even when its name doesn't contain spaces. The opening square bracket must immediately follow the input value (%variable name or the quote which terminates the filename). No white spaces are allowed between the %variable and the suboperation left bracket.

Suboperations are very tolerant about the range values. No warning is reported when they refer to a nonexisting character or item, for instance when the range member is zero or negative. Ranges with negative slope simply return nothing. Ranges with zero slope return one character/item/line when the index is between 1 and %&, otherwise they return nothing.

|4142434445464748 |%Sample %SET ABCDEFGH ; Variable %Sample now contains 8 characters.
|0000:4142434445 | DB "%Sample[-3..5]" ; DB "ABCDE"
|0005:434445464748 | DB "%Sample[ 3..99]" ; DB "CDEFGH"
|000B:43 | DB "%Sample[ 3..3]" ; DB "C"
|000C: | DB "%Sample[5..3]" ; DB ""
|000C:4142434445464748205B352E2E335D | DB "%Sample [5..3]" ; DB "ABCDEFGH [5..3]" ; Not a suboperation.

Suboperation range consists of three components:

  1. minimum range index
  2. range operator ..
  3. maximum range index

Some of these components may be omitted; they will be given the default value. The default minimum index is 1. The default maximum index is %&.

|4142434445464748 |%Sample %SET ABCDEFGH ; Preprocessing variable %Sample now contains 8 characters.
|0000:4142434445 | DB "%Sample[..5]" ; -> DB "%Sample[1..5]" -> DB "ABCDE"
|0005:434445464748 | DB "%Sample[3..]" ; -> DB "%Sample[3..8]" -> DB "CDEFGH"
|000B:4142434445464748 | DB "%Sample[..]" ; -> DB "%Sample[1..8]" -> DB "ABCDEFGH"
|0013:4142434445464748 | DB "%Sample[]" ; -> DB "%Sample[1..8]" -> DB "ABCDEFGH"

All following notations are identical in %variable expansion:

%variable
%variable[1..%&]
%variable[..%&]
%variable[1..]
%variable[..]
%variable[]
%variable{1..%&}
%variable{..%&}
%variable{1..}
%variable{..}
%variable{}

The last notation in the previous example is useful when we need to append some literal text, for instance 123, to the %variable contents. We cannot write %variable123 because the appended digits change the name of the original %variable. The solution is to use an empty suboperation, which doesn't change the %variable contents but separates its name from the successive text: %variable[]123 or %variable{}123.

When the range inside brackets contains only one index without range operator, it is treated as both minimum and maximum value and only one character/item/line is expanded: %Sample1[3] -> %Sample1[3..3] -> C.

Suboperations may be chained. The chain is processed from left to right. Example:

|4142432C4445462C2C4748492C4A4B4C |%Sample %SET ABC,DEF,,GHI,JKL ; %& is now 16 in %Sample[%&] and 5 in %Sample{%&}.
|0000:4A4B | DB "%Sample{4..5}[2..6]{2}" ; DB "JK"

The first sublist in previous example takes items nr.4 and 5, giving the list of two items GHI,JKL. The next substring extracts characters from second to sixth from that sublist, giving HI,JK. The last sublist operation expands the second item, which is JK.
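
The behaviour of both suboperations on %variable contents (1-based indices, tolerant clipping, default bounds, chaining) can be modelled in Python. This is a simplified sketch: items are split on every comma without regard to quotes or brackets, and the functions `substring` and `sublist` are hypothetical names for this illustration.

```python
def _clip(lo, hi, last):
    """Apply default bounds and clip the range to 1..last."""
    lo = 1 if lo is None else max(lo, 1)
    hi = last if hi is None else min(hi, last)
    return lo, hi


def substring(text, lo=None, hi=None):
    """Model of text[lo..hi]; len(text) plays the role of %&."""
    lo, hi = _clip(lo, hi, len(text))
    return text[lo - 1:hi] if lo <= hi else ""


def sublist(text, lo=None, hi=None):
    """Model of text{lo..hi} on comma-separated items."""
    items = text.split(",")
    while items and items[-1] == "":
        items.pop()                 # %& is the last non-empty item
    lo, hi = _clip(lo, hi, len(items))
    return ",".join(items[lo - 1:hi]) if lo <= hi else ""
```

The chained example then reads `sublist(substring(sublist("ABC,DEF,,GHI,JKL", 4, 5), 2, 6), 2, 2)`, which evaluates step by step to GHI,JKL, then HI,JK and finally JK. A single index such as [3] corresponds to passing the same value as both bounds.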

Suboperations may be nested. Inner ranges are calculated before the outer ones:

|31323334353637383930 |%Sample %SET 1234567890
|0000:3233343536 | DB "%Sample[2..%Sample[6]]" ; -> DB "%Sample[2..6]" -> DB "23456"

↑ Sections

For each emitting statement assembler generates some data or machine code which will be dumped to the output file in the end. Fortunately we don't have to write the whole program in the exact sequence which is required by the output file format. Assembled data and code is tossed on demand to one of several output sections. The statement, which will switch assembly to a different section, is quite simple: just the name of section in square brackets [ ] in the label field of the statement.

Imagine that you (the programmer) act like a manager dictating some code and data to your secretary (EuroAssembler). You have dictated a few instructions, which were written in shorthand by your secretary on a sheet of paper labeled [TEXT]. Then you decided to dictate other kind of data. The secretary will grab another sheet, label it [DATA] and start to write there. Later, when you want to dictate some other instructions, your secretary takes the sheet labeled [TEXT] again, and continues from the point (origin) where it was interrupted.
You are free to open new sheets and to switch between them ad libitum. When the dictation ends, all used sheets will be stapled together (linked).

In EuroAssembler the term section is used for a named division of a segment. Each segment has one or more sections. By default the segment has just one section with identical name (base section) which was created at segment definition.

↑ Segments

Intel Architecture divides memory to segments controlled by segment registers. Segment in €ASM is defined by pseudoinstruction SEGMENT.

In the dawn of computer age, programmers demanded more memory than the mere 256 bytes or 64 kilobytes which were addressable by 8bit and 16bit registers. Designers at Intel in pre-32bit times might have chosen to use a pair of 16bit general registers, such as DX:AX or SI:BX, and to address an inconceivable 4 GB of memory with them, but they didn't. Instead, they invented new 16bit segment registers specialized by the purpose of addressed memory: register CS for machine code, DS for data, SS for machine stack, ES for extra temporary usage.
Segment registers are used for addressing of 16 bytes long chunks of memory called paragraphs (alias octonary word, OWORD). Linear address in real CPU mode is calculated as a sum of the offset and the paragraph address (contents of the segment register) multiplied by 16. Using segment registers for addressing of 16byte paragraphs yields 1 MB of memory addressable by each segment register, which seemed enough for everybody in those times.
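
Real-mode address calculation can be written down in one line: the paragraph address held in the segment register is multiplied by 16 and the 16bit offset is added. The Python function below is just an illustration of that arithmetic; its name is made up, and it does not model address wrap-around at 1 MB.

```python
def linear_address(segment, offset):
    """Real-mode linear address from a 16-bit segment and a 16-bit offset."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)
```

For example segment 0x1234 with offset 0x0010 addresses the linear byte 0x12350.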

Contents of segment register in real processor mode represents paragraph address of the segment.
Contents of segment register in protected processor mode represents index to descriptor table, which holds some auxiliary information about the addressed segment (beside its address and size limit): access privileges and width.

Those auxiliary properties are fixed in real mode:

Segment at run-time is a continuous range of operational memory addressable with the contents of one segment register.

Segment at link-time is a named part of object file, which can be concatenated with segments of the same name from other linkable files.

In [MS_PECOFF] terminology the linkable segment is called a section. I think the term segment would be more appropriate here, because COFF "sections" are differentiated by access privileges as they are addressed by different segment registers, ergo by different segment descriptors.
In our segment-highway parable, segments in flat protected mode are highway lanes running in parallel, so they share common milestones (offsets), but each lane is dedicated to a different kind of vehicles.

Segment at write-time is a part of assembler source which begins with section switching statement, and which ends with another switching statement or with the end of program.

There is no ENDS (end-of-segment) directive in €ASM. It is not possible to say that this part of source code doesn't belong to any segment. When you write the very first statement of your source text, it already belongs to the default (envelope) program, and every program implicitly defines its default segments. Nevertheless, when a structure or numeric constant is being defined, it is irrelevant which segment is currently in charge, because structures and scalar symbols do not belong to any segment, no matter where the structure or symbol was defined.

Segments and section divisions of assembler source do not have to be continuous. In fact, discontinuity is their main raison d'être. It allows keeping data in source text near the code which manipulates it, and this is good for readability and understanding of program function.

↑ Groups

When segments of assembler program are not too large, they may be coalesced into a segment group. The whole group of segments is addressable with one segment register. Group can be defined with pseudoinstruction GROUP.

When a group is defined, e.g. [DGRP] GROUP [DATA],[STRINGS], beside the group [DGRP] it automatically creates a segment with the same name [DGRP] (and consequently a section with the same name [DGRP]). It also declares that segments [DATA] and [STRINGS] belong to group [DGRP] together with its base segment [DGRP]. Nevertheless, when nothing is emitted to the implicitly defined segment [DGRP], it will be discarded in the end.

↑ Segmentation (more about sections, segments, groups)

Base segment and section ↓

Segmentation lifetime ↓

Implicit segments ↓

Segment naming conventions ↓

Loading segment registers ↓

Ordering of sections and segments ↓

Displaying the segment map ↓

The relation between segment and its sections in EuroAssembler is similar to the relation between group and its segments.

↑ Base section and segment

Whenever a segment is defined (with the pseudoinstruction SEGMENT), a section with the same name is automatically created in it (it is called base section). Other sections of the same segment may be created on demand later. This is done by the statement which has only the section name in its label field (there is no explicit SECTION directive in €ASM).

Section properties (class, purpose, combine, align) are inherited from the segment they belong to. The alignment is not inherited when special literal sections [@LT64] .. [@LT1], [@RT0], [@RT1].. are created; literal sections are aligned according to the type of data which they keep.

Whenever a group is defined (with the pseudoinstruction GROUP), a segment with the same name is created in it (it is called base segment), together with other segments which we want to incorporate to the group.

↑ Segmentation lifetime

Each segment has one or more sections. Each section belongs to exactly one segment. During assembly time all segments are assumed to be loaded at virtual address 0. At the end of each assembly pass, sections are virtually linked to their segment, so each section begins at the VA where the preceding section ended. However, in pass 1 it is not yet known what size the sections will have, so all sections are assumed to start at VA=0 in pass 1. When the last assembly pass ends, all sections are linked physically (their emitted contents and relocations are concatenated to the segment=base section) and the sections are then discarded. The linker is not aware of sections at all.
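The virtual linking of sections at the end of a pass can be pictured with a small Python model. This is only a sketch under stated assumptions: the function name link_sections and the sample section names and sizes are hypothetical, alignment padding is ignored, and it does not claim to mirror €ASM's internal implementation.

```python
def link_sections(sizes):
    """Return the start VA of each section when sections are laid
    end to end inside a segment which itself starts at VA 0.

    `sizes` is a list of (section_name, size_in_bytes) pairs in link order.
    Alignment padding is deliberately ignored in this sketch.
    """
    starts = {}
    va = 0
    for name, size in sizes:
        starts[name] = va  # a section begins where the preceding one ended
        va += size
    return starts


# Hypothetical section sizes, as they might be known after a completed pass.
layout = link_sections([("[DATA]", 0x20), ("[Strings]", 0x100), ("[Tables]", 0x40)])
print({name: hex(va) for name, va in layout.items()})
```

In pass 1 the sizes are not known yet, which corresponds to calling the function with all sizes equal to zero: every section then starts at VA 0.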

Why should we actually split a segment into sections? Well, it is not necessary; mostly we can get by with just one default section per segment. In big programs, on the other hand, it may be useful to group similar kinds of data together; we may want to create separate sections for double-word-sized variables, for floating-point numbers, for text strings. This may save a few bytes of alignment padding, which would be necessary when variables of different sizes are mixed together.

Another occasion where sections are handy is fast retrieving from read-only "databases" defined statically somewhere in data segment.
Database can be mentally visualized as a table with many rows and with columns containing data items of constant size. For fast selection of a particular row by the value of an "indexed" key item, it is profitable to emit all items from one column sequentially to a section, one after another. Data from every column will have their own section. The width of the "indexed" column should be padded to 1, 2, 4 or 8 bytes, so that its items can be scanned with a single machine instruction REPNE SCAS. When an item is found, the difference between register rDI and the start of the section identifies the selected row index. The remaining items of this row can then be addressed using the row index.
This access method was used in the sample project EuroConvertor and in EuroAssembler itself, where it assigns the address of an instruction handler to each of two thousand mnemonics, see DistLookupIi.
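The column-per-section access method can be modelled in a few lines of Python. This is a hedged sketch, not the code used in EuroConvertor or €ASM: the names KEYS, HANDLERS and lookup are hypothetical, the sample values are invented for illustration, and a Python loop stands in for the sequential scan which REPNE SCAS performs in machine code.

```python
# Each column of the "database" is kept in its own sequence,
# the way each column would be emitted to its own section.
KEYS = [0x11, 0x22, 0x33, 0x44]              # indexed column, fixed-size items (hypothetical)
HANDLERS = [0x1000, 0x1040, 0x1080, 0x10C0]  # another column, same row order (hypothetical)


def lookup(key):
    """Scan the key column sequentially and use the row index
    of the match to address the item in the other column."""
    for row, item in enumerate(KEYS):
        if item == key:
            return HANDLERS[row]
    return None  # the key is not present in the indexed column


print(hex(lookup(0x33)))
```

Because all items of the indexed column lie next to each other, the scan touches only the key data; the other columns are read once, after the row index is known.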

Each group has one or more segments. Each segment belongs to exactly one group (even when it wasn't explicitly grouped, a group with the segment's name will be implicitly created at link time for the addressing purposes). When a program with executable format is linked, all groups are physically concatenated into image. Loader of realmode executable image is not aware of groups and segments.

↑ Implicit segments and groups

€ASM creates implicit segments when it starts to assemble a program. Implicit segment names depend on the chosen program format:

Implicit segments
FORMAT=            Implicit segment names
BIN                [BIN]
COM                [COM]
OMF | MZ           [CODE], [DATA], [BSS], [STACK]
COFF | PE | DLL    [.text], [.data], [.bss]

If you are not satisfied with the implicit segments created by €ASM, you may redefine them at the start of program or create a new set of segments with different names. Segments and sections which were not used (nothing was emitted to them) will not be linked to output file and they can be ignored.

When the assembly ends and segments from linked modules have been incorporated (combined) to the base program, €ASM looks at segments which are not part of any group, and creates implicit group for them (name of the group is the same as the segment). Here the memory model is taken into account:

Models with single code segment (TINY, SMALL, COMPACT) link all code into a single group, no matter how many code segments are actually defined in the program.

Multicode models (MEDIUM, LARGE) keep each code segment in its own implicit group (if they weren't grouped explicitly), hence intersegment jumps, calls and returns should have DIST=FAR.

Similarly, single data models (TINY, SMALL, MEDIUM) assume that all initialized and uninitialized data fit into one group not exceeding 64 KB, so the €ASM linker will assign all data segments to an implicit group, and register DS does not have to be changed when accessing data from various segments, which may have been defined in the base program or in linked modules.
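The effect of the memory model on implicit grouping can be summarized as a lookup. The following Python sketch covers only the five models named above; the function name implicit_groups and the returned strings are hypothetical, and treating COMPACT and LARGE as multi-data models is an inference from the list of single data models, not a statement taken from the text.

```python
SINGLE_CODE = {"TINY", "SMALL", "COMPACT"}  # all code linked into one group
MULTI_CODE = {"MEDIUM", "LARGE"}            # each code segment in its own implicit group
SINGLE_DATA = {"TINY", "SMALL", "MEDIUM"}   # all data assumed to fit one group


def implicit_groups(model):
    """Describe how ungrouped code and data segments are grouped at link time."""
    model = model.upper()
    if model not in SINGLE_CODE | MULTI_CODE:
        raise ValueError("model not covered by this sketch: " + model)
    return {
        "code": "one group" if model in SINGLE_CODE else "group per segment",
        # Assumption: models outside SINGLE_DATA keep data segments separate.
        "data": "one group" if model in SINGLE_DATA else "group per segment",
    }


print(implicit_groups("MEDIUM"))
```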

↑ Segment naming conventions

Name of group, segment and section is always surrounded by square brackets in €ASM source.

Unlike with symbols, the namespace is not prepended to a segment name when it starts with . (fullstop). Group, segment and section names are always nonlocal.

Number of characters in group/segment/section name is not limited by €ASM but it may be limited by the output format. In OMF object module the name of group or segment must not exceed 255 characters. In PE COFF executables the name in section header is truncated to 8 characters.

€ASM treats all names as case sensitive. If you want to link your segment with object module produced by an external compiler which converts segment name to uppercase or which mangles the names by prepending underscores __, you should adapt your naming convention to it.

Segment names should be unique: you cannot define two segments with an identical name in a program, except for the implicitly created segments, if they were not used yet. However, it is possible to define segments with the same names in different programs and link them together; their contents will be concatenated according to their COMBINE= property. A similar rule applies to groups.

Section names cannot be duplicated on principle. When a section name appears in source for the second time, it will only switch to that section rather than creating a new one.

Implicit literal section name begins with @LT or @RT, you'd better avoid names which begin with this combination of letters.

Segments which have a dollar sign $ in their name are treated in a special way. If the characters on the left side of $ match, all such segments will be linked adjacently in alphabetic order.

There are conventions for naming "sections" in COFF modules; you may need to adapt to them to successfully link an €ASM program with modules created by different compilers.

↑ Loading segment registers

When MZ executable program is prepared to start, its segment registers have been set by the DOS loader. CS:IP is set to the program entry point, SS:SP is set to the top of machine stack, but both DS and ES point to PSP, which is not our data segment. Whenever programmer needs to access data in their own segment or to jump to some procedure in a different code segment, concerned segment register must be explicitly loaded with paragraph address of the group. There is no instruction in Intel architecture to load segment register with immediate value directly, so this is usually done via register or stack:

; Loading paragraph address of [DATA] to segment register
; using a general purpose register (which is faster):
MOV AX, PARA# [DATA]
MOV DS,AX
; or using the machine stack (which is shorter):
PUSH PARA# [DATA]
POP DS

It is the responsibility of programmer to load segment register with the address of another segment, whenever it is used. €ASM makes no assumption about the contents of segment registers; there is no ASSUME, USING, WRT directive in €ASM.

Unlike some other assemblers, €ASM does not automatically create a homonymous symbol when a segment is defined. Use attribute operators PARA#, GROUP#, SEGMENT#, SECTION# instead.

↑ Ordering of sections and segments

The order of sections in a segment and the order of segments in a linked program is generally given by the order in which they were defined in the source code, with a few exceptions. At the end of each assembly pass, all sections are linked to their segment in this order:

  1. Base section.
  2. Other non-literal sections in the order as they were defined.
  3. Data-literal sections in descending order of their alignment ([@LT64], [@LT32],..[@LT1]).
  4. Code-literal sections in alphabetical order ([@RT0], [@RT1], [@RT2]..).
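
The four ordering rules above can be sketched in Python as follows. The function name order_sections and the sample section names are hypothetical; the sketch assumes that literal sections are recognized purely by their [@LT and [@RT name prefixes and that "alphabetical order" means plain string sorting.

```python
def order_sections(base, defined):
    """Return section names in link order.

    `base` is the base section name; `defined` lists all section names
    of the segment in the order of their definition (base included).
    """
    def lt_alignment(name):
        return int(name[4:-1])  # "[@LT64]" -> 64

    literal_prefixes = ("[@LT", "[@RT")
    plain = [s for s in defined if s != base and not s.startswith(literal_prefixes)]
    data_literals = sorted((s for s in defined if s.startswith("[@LT")),
                           key=lt_alignment, reverse=True)
    code_literals = sorted(s for s in defined if s.startswith("[@RT"))
    return [base] + plain + data_literals + code_literals


print(order_sections("[.data]", ["[.data]", "[@LT4]", "[Strings]", "[@RT1]",
                                 "[@LT16]", "[@RT0]", "[Tables]"]))
```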

Segments are combined and linked at link time, when the final assembly pass ends.
Order of segments in output file:

  1. Group(s) of initialized segments in the order as they were defined.
  2. Initialized segments which are not in any group.
  3. Group(s) of uninitialized segments in the order as they were defined.
  4. Uninitialized segments which are not in any group.

Segments in each group are in the order as they were defined in the source (not as they were declared in the GROUP statement). The base segment is always the first in a group.
Segments which belong to the same group and have $ in their names are kept together and sorted alphabetically if the left-side substrings of their names up to the $ are identical.

When an executable format is linked, every segment belongs to some group, at least the implicit one (with identical name).

↑ Displaying the segment map

Pseudoinstruction %DISPLAY Sections prints to the listing file a complete map of groups, segments and sections defined so far at assembly time, one object per line represented by a debugging message D1260 (group), D1270 (segment), D1280 (section). Segment is indented with two spaces, section is indented with four spaces.

Instead of %DISPLAY Sections we could use %DISPLAY Segment or %DISPLAY Groups, the result is identical. The entire group/segment/section map is always displayed with those statements.

At link time €ASM prints to the listing similar map of groups and segments with finally used virtual addresses, unless it was disabled with option PROGRAM LISTMAP=OFF.

↑ Distance

Distance is property of difference between two addresses. It is not just the numeric difference of two offsets; in €ASM this term represents one of three enumerated values: FAR, NEAR, SHORT.

The distance of two addresses is FAR when they belong to different groups/segments, otherwise it is NEAR or SHORT. Difference of offsets is SHORT if it fits into 8-bit signed integer, i.e. -128..+127.
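
The classification can be written down as a short Python function. This is an illustrative sketch only: an address is modelled as a hypothetical (group_or_segment, offset) pair, and the difference is taken directly between the two offsets, without accounting for instruction length as a real jump encoder would.

```python
def distance(addr_a, addr_b):
    """Classify the distance between two addresses as FAR, NEAR or SHORT.

    Each address is a (group_or_segment_name, offset) pair.
    """
    seg_a, off_a = addr_a
    seg_b, off_b = addr_b
    if seg_a != seg_b:
        return "FAR"              # different groups/segments
    diff = off_b - off_a
    if -128 <= diff <= 127:       # fits into an 8-bit signed integer
        return "SHORT"
    return "NEAR"


print(distance(("[CODE]", 0x100), ("[CODE]", 0x17F)))  # difference is +127
print(distance(("[CODE]", 0x100), ("[CODE]", 0x180)))  # difference is +128
print(distance(("[CODE]", 0x100), ("[LIB]", 0x100)))
```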

↑ Width

€ASM is a 64-bit assembler, but it can also compile programs for older CPUs which worked with 32-bit and 16-bit words only. The number of bits which the CPU works with simultaneously is called width and it is either 16, 32 or 64.

Width is always measured in bits.

The width is a property of a segment. Some 32-bit object file formats allow mixing segments of different widths in one file. The width of addressing and operating mode can be changed ad hoc with the instruction prefixes ATOGGLE, OTOGGLE.

Pseudoinstruction PROGRAM has WIDTH= property, too. It will establish the default for all segments declared in the program. Program width is also used to select the format of output file, for instance if PExecutable should be created as 32bit or 64bit.

↑ Size

Size is a plain non-negative number which specifies the number of bytes in object (register, memory variable, structure, segment, file etc). Size of string is specified in bytes, no matter if the string is composed of ANSI or WIDE characters.

The size of an object can be computed at assembly time, using the attribute operators SIZE#, FILESIZE#.

Size of a preprocessing %variable contents can be retrieved with pseudoinstruction %SETS.

Size is always measured in bytes.

Size and length of €ASM elements (identifiers, numbers, structures, expressions, file contents, nesting depth, number of operands, etc.) is not limited by design, but such sizes are internally stored as signed 32bit integers, so the actual limitation is 2_147_483_647 characters. In practice we will be restricted by the amount of available virtual memory, of course.

↑ Length

This term is used to count the number of comma-separated items in an array, for instance the length of operand list in the statement VPERMI2B XMM1,XMM2,XMM3,MASK=K4,ZEROING=ON is 5.

Length of a preprocessing %variable contents can be retrieved with pseudoinstruction %SETL.

↑ Namespace

Names of symbols and structures created in the program must be unique. In large projects it might be difficult to maintain unique names, especially when more people work on separate parts of the program. That is why the programmer can use local identifiers which must be unique only in a division of the source file called namespace. Namespace is a range of the source specified by a namespace block. There are four block-pseudoinstructions in €ASM which create a namespace: PROGRAM, PROC, PROC1, STRUC. The block name is also the name of the namespace. An identifier is local when its name begins with a fullstop (.). Unlike with standard symbols, the characters following the leading fullstop may start with a decimal digit and it is not an error when they form a reserved name. Examples of valid local identifiers: .L1, .20, .AX.

Names of local identifiers are kept in €ASM internally concatenated with namespace name, so they form fully qualified name (FQN). Local symbols may be referred with .local name only within their native namespace block; they may also be referred with fully qualified name anywhere in the program.

The namespace actually starts at the operation field of the block statement and it ends at the operation field of the corresponding endblock statement. Thanks to this, the namespace itself (label of the block) may be local, too, and the namespaces may be nested.

MyProg PROGRAM      ; PROGRAM starts the namespace MyProg.         ;
                                                                    ;
Main    PROC        ; PROC starts inner namespace Main.             ;
  .10:   RET        ; Local label; its FQN is Main.10.             ;
        ENDP Main   ; After ENDP we are in MyProg namespace again. ;
                                                                    ;
.Local  PROC        ; Its FQN is MyProg.Local.                     ;
  .10:   RET        ; FQN of this label is MyProg.Local.10.        ;
        ENDP .Local ; MyProg.Local namespace ends right after ENDP.;
                                                                    ;
       ENDPROGRAM MyProg
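
How the fully qualified names in the example above are formed can be modelled with a one-line rule. The Python sketch below is an illustration, not €ASM's internal code; the function name fqn is hypothetical.

```python
def fqn(namespace, name):
    """Return the fully qualified name of an identifier.

    A local name (leading fullstop) gets the current namespace prepended;
    a nonlocal name is already fully qualified.
    """
    return namespace + name if name.startswith(".") else name


# The label of a block is resolved in the enclosing namespace,
# and its FQN then becomes the namespace inside the block.
main_ns = fqn("MyProg", "Main")     # nonlocal block label
local_ns = fqn("MyProg", ".Local")  # local block label

print(fqn(main_ns, ".10"))
print(fqn(local_ns, ".10"))
```

The two printed names correspond to the labels .10 in procedures Main and .Local of the example.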

Beside the namespace blocks there is one more occasion where a namespace is unfolded: the operand fields of a structured data definition statement temporarily take over the namespace of the structure which is being instantiated.

DateProg PROGRAM      ; PROGRAM starts the namespace DateProg.           ;
                                                                         ;
Datum STRUC  ; Declaration of structure Datum creates namespace Datum.   ;
.day   DB 0                                                              ;
.month DB 0                                                              ;
.year  DW 0                                                              ;
      ENDSTRUC Datum ; Namespace Datum ends right behind ENDSTRUC field. ;
                                                                         ;
[.data] ; Segment name is not local label, namespace is ignored here.    ;
Birthday DS Datum, .day=1, .month=1, .year=1970                          ;
                                                                         ;
; The previous statement defines 4 bytes long structured memory variable ;
; called Birthday in section [.data] and statically sets its members.    ;
; On creating the variable "Birthday" €ASM uses properties               ;
; declared as Datum.day, Datum.month, Datum.year (B,B,W).                ;
; Members can be referred as Birthday.day, Birthday.month, Birthday.year.;

↑ Scope

Scope is property of symbol which specifies symbol visibility.

A symbol defined in an assembler program, such as a label or a memory variable, may be referred to anywhere within the program at assembly time. Our program may be linked with other programs, object modules or libraries, which might have used the same name for their own symbols, but that is OK and no conflict occurs because the programs are compiled separately. This is the standard behaviour; such symbols have standard private scope and their visibility is limited to the inside of the PROGRAM..ENDPROGRAM block.

When a symbol name begins with a fullstop ., the visibility of such a private local name is even narrower; it is limited to the smallest namespace block in which the symbol was defined (PROC..ENDPROC, STRUC..ENDSTRUC).

On the other hand, executables which are linked from several programs (modules, libraries) need to access symbols outside their standard private scope, for instance to call an entry point of a library function. Names of such global symbols should be unique among all linked programs.

Scope recognized in €ASM
private               Global
Standard    local     static link         dynamic link
                      Public    Extern    eXport    Import

Scope of symbol can be examined at assembly time with attribute operator SCOPE#, which returns ASCII value of uppercase scope shortcut, for instance

MySymbol EXTERN
MOV AL,SCOPE# MySymbol ; equivalent to MOV AL,'E'

Available shortcuts are the capital letters S, P, E, X, I of the scope names Standard, Public, Extern, eXport, Import in the table above. The same shortcuts are also used when symbol properties are listed by %DISPLAY Symbols and after the link phase if LISTGLOBALS=ENABLED.

GLOBAL, PUBLIC, EXTERN, EXPORT and IMPORT scope of a symbol can be explicitly declared by pseudoinstruction with the corresponding name. GLOBAL scope can be also declared implicitly, using two (or more) terminating colons :: after the symbol name. Symbol declared as GLOBAL is either available as PUBLIC (if it is defined in the same program), or it is marked as EXTERN (if it is not defined in the program).

Only the scopes for static linking (PUBLIC, EXTERN) can be declared by simplified global scope declaration (using two colons). When the symbol will be exported (if a DLL file is created), or when it should be dynamically imported from other DLL, using two colons is not enough and either explicit declaration EXPORT/IMPORT symbol or LINK import_library is required.

Word1:  DW 1   ; Standard private scope.
Word2:: DW 2   ; Public scope declared implicitly (with double colon).
Word3   PUBLIC ; Public scope declared explicitly.
Word4   GLOBAL ; Public or extern scope (which depends on Word4 definition in this program).
Word5   GLOBAL ; Public or extern scope (which depends on Word5 definition in this program).
Word6   EXTERN ; Extern scope. Symbol Word6 must not be defined anywhere else in this program.
Word4:         ; Definition of symbol Word4.
        MOV EAX,Word5 ; Reference of external symbol Word5.
; Scope of Word1 is PRIVATE.
; Scope of Word2, Word3, Word4 is PUBLIC.
; Scope of Word5, Word6 is EXTERN.
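
The way a GLOBAL declaration resolves, and the value which SCOPE# returns for the result, can be sketched in Python. The function names effective_scope and scope_code are hypothetical and the sketch only restates the rules of this chapter; it is not derived from €ASM source.

```python
# Scope shortcuts as returned by the SCOPE# attribute.
SHORTCUTS = {"STANDARD": "S", "PUBLIC": "P", "EXTERN": "E", "EXPORT": "X", "IMPORT": "I"}


def effective_scope(declared, defined_in_program):
    """Resolve a declared scope; GLOBAL becomes PUBLIC or EXTERN."""
    declared = declared.upper()
    if declared == "GLOBAL":
        return "PUBLIC" if defined_in_program else "EXTERN"
    return declared


def scope_code(scope):
    """ASCII value of the uppercase scope shortcut."""
    return ord(SHORTCUTS[scope])


word4 = effective_scope("GLOBAL", defined_in_program=True)   # like Word4 above
word5 = effective_scope("GLOBAL", defined_in_program=False)  # like Word5 above
print(word4, word5, scope_code(word5))
```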

↑ Data types

Information in computer memory or in a register represents code or data. An important property of stored text and numbers is the data type, which is a rule specifying how to interpret the information. €ASM recognizes the following types of data:

Fundamental data types
Typename  Short  Size  Autoalign  Width  Typical storage  Character string  Integer number  Floating-point number  Packed vector
BYTE      B      1     1          8      R8               ANSI              8bit
UNICHAR   U      2     2          16     R16              WIDE
WORD      W      2     2          16     R16                                16bit
DWORD     D      4     4          32     R32,ST                             32bit           Single precision
QWORD     Q      8     8          64     R64,ST                             64bit           Double precision
TBYTE     T      10    8          80     ST                                                 Extended precision
OWORD     O      16    16         128    XMM                                                                       4×D | 2×Q
YWORD     Y      32    32         256    YMM                                                                       8×D | 4×Q
ZWORD     Z      64    64         512    ZMM                                                                       16×D | 8×Q

Other data types
Typename        Short  Size      Autoalign                                           Usage
Structure name  S      variable  STRUC explicit alignment, otherwise program width   structured variables
INSTR           I      variable  1                                                   machine instructions

Fundamental typenames are often reduced to their first letter. Data types in short or long notation are used for explicit static data definition with pseudoinstruction D, for implicit data definition in literals, as an alignment specification, and in instruction modifiers.

€ASM has some type awareness, though not as strong as in higher-level programming languages. For instance, when processing the instruction INC [MemoryVariable] it looks at how MemoryVariable was defined and selects the appropriate encoding version (byte/word/dword/qword).


↑ Symbols

Name of symbols ↓

Numeric symbols ↓

Address symbols ↓

$ - current origin address ↓

Attributes of symbol ↓

Literal symbols ↓

Symbol in assembly language is an alias to a number or address.

There are two kinds of symbols in assembler: numeric and address.

Numeric symbol answers the question how many and address symbol answers the question where (at which position in the program).

Numeric symbol is defined with pseudoinstruction EQU or with its alias =, for instance Dozen EQU 12.
Address symbol is defined when its name appears in a label field of a statement.

Value of numeric symbol is internally kept in 8 bytes (signed QWORD) but address symbols need an additional information about the section where they belong to.
It is not possible in €ASM to define numeric symbol as a label of other statement than EQU, or as a solo label without operation field. Each program statement compulsorily belongs to some section (either explicitly defined or implicitly created when assembly of program block starts).

↑ Name of symbols

Symbol name is an identifier (letter or fullstop optionally followed with other letters, fullstops and digits), which is not a reserved symbol name in either character case.
Symbol name may always be terminated with one or more colons : which helps to recognize the identifier as a symbol name. The colon itself is not a part of the symbol name. Symbols should have self-explaining mnemonic name.

Termination of each symbol name with : is a good habit both when the symbol is defined and when it is referred to, though many other assemblers do not support this. It's easier to copy&paste the symbol name without having to delete the colon at its end. The colon tells both the assembler and the human reader that the name represents a symbol, and it protects from mistakes when you choose a symbol name which accidentally happens to collide with one of thousands of instruction mnemonics.
Instruction mnemonics, registers (except for segment registers), structure names are never colon-terminated.
Symbol name must be unique in the program.

Symbols and structures may be referred (used in statement) before they are actually defined. However, it's a good practice to define numeric symbols and structures at the beginning of the program, because forward references require additional program passes, which extends the duration of assembly.

Reserved symbol names
Category                         Reserved names
Assembly-time current pointer    $
Segment register names           CS, DS, ES, FS, GS, SS
Prefix names                     ATOGGLE, LOCK, OFTEN, OTOGGLE, REP, REPE, REPNE, REPNZ, REPZ, SEGCS, SEGDS, SEGES, SEGFS, SEGGS, SEGSS, SELDOM, XACQUIRE, XRELEASE

Name of symbol may contain fullstop ., which usually connects namespace with symbol's local name. Leading . makes the symbol local, as it is in fact connected with current namespace internally.

Creating symbol names which collide with names of registers or instructions is discouraged. If you really want to use one of the not recommended names for a symbol, it must always be followed by a colon, e.g.

  Byte: DB 1 ; Define a symbol named "Byte".
  MOV AX,Byte: ; Load AX with offset of the symbol.

In other cases, terminating symbol name with : is voluntary, but recommended.

Not recommended symbol names
Category                         Not recommended names
Fundamental data types           B, BYTE, D, DWORD, I, INSTR, O, OWORD, Q, QWORD, S, T, TBYTE, U, UNICHAR, W, WORD, Y, YWORD, Z, ZWORD
Register names                   AH, AL, AX, BH, BL, BND0, BND1, BND2, BND3, BP, BPB, BPL, BX, CH, CL, CR0, CR2, CR3, CR4, CR8, CX, DH, DI, DIB, DIL, DL, DR0, DR1, DR2, DR3, DR6, DR7, DX, EAX, EBP, EBX, ECX, EDI, EDX, ESI, ESP, K0, K1, K2, K3, K4, K5, K6, K7, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, R10, R10B, R10D, R10L, R10W, R11, R11B, R11D, R11L, R11W, R12, R12B, R12D, R12L, R12W, R13, R13B, R13D, R13L, R13W, R14, R14B, R14D, R14L, R14W, R15, R15B, R15D, R15L, R15W, R8, R8B, R8D, R8L, R8W, R9, R9B, R9D, R9L, R9W, RAX, RBP, RBX, RCX, RDI, RDX, RSI, RSP, SEGR6, SEGR7, SI, SIB, SIL, SP, SPB, SPL, ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7, TR3, TR4, TR5, XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM16, XMM17, XMM18, XMM19, XMM20, XMM21, XMM22, XMM23, XMM24, XMM25, XMM26, XMM27, XMM28, XMM29, XMM30, XMM31, YMM0, YMM1, YMM2, YMM3, YMM4, YMM5, YMM6, YMM7, YMM8, YMM9, YMM10, YMM11, YMM12, YMM13, YMM14, YMM15, YMM16, YMM17, YMM18, YMM19, YMM20, YMM21, YMM22, YMM23, YMM24, YMM25, YMM26, YMM27, YMM28, YMM29, YMM30, YMM31, ZMM0, ZMM1, ZMM2, ZMM3, ZMM4, ZMM5, ZMM6, ZMM7, ZMM8, ZMM9, ZMM10, ZMM11, ZMM12, ZMM13, ZMM14, ZMM15, ZMM16, ZMM17, ZMM18, ZMM19, ZMM20, ZMM21, ZMM22, ZMM23, ZMM24, ZMM25, ZMM26, ZMM27, ZMM28, ZMM29, ZMM30, ZMM31
Pseudoinstruction names          ALIGN, D, DB, DD, DI, DO, DQ, DS, DU, DW, DY, DZ, ENDHEAD, ENDP, ENDP1, ENDPROC, ENDPROC1, ENDPROGRAM, ENDSTRUC, EQU, EUROASM, EXTERN, GLOBAL, GROUP, HEAD, INCLUDE, INCLUDE1, INCLUDEBIN, INCLUDEHEAD, INCLUDEHEAD1, PROC, PROC1, PROGRAM, PUBLIC, SEGMENT, STRUC
Machine instruction mnemonics    AAA, AAD, ... XTEST, see IiHandlers in €ASM source for the complete list.

↑ Numeric symbols

Numeric symbol is defined with pseudoinstruction EQU (or with its alias =) which specifies a number, numeric expression or other numeric symbol. Examples:

BufferSize: EQU 16K
WM_KEYDOWN = 0x0100
Total      EQU 2*BufferSize
   MOV ECX,BufferSize

Using a numeric symbol instead of direct number notation has its advantages:

↑ Address symbols

Address symbol is defined when it appears as a label of machine instruction or prefix, as a label of empty instruction or as a label of pseudoinstruction D*, PROC, PROC1.

Examples:
[DATA]
SomeValue:   DD 1
[CODE]
             MOV EAX,SomeValue:
StartOfLoop: CALL SomeProcedure:
             DEC EAX
             JNZ  StartOfLoop:

While numeric symbol BufferSize was completely defined with its value, in case of address symbol SomeValue it is not sufficient. Instruction MOV EAX,SomeValue loads EAX with the symbol offset, i.e. with the distance between its position and the start of its segment. Address symbol is defined with two properties: its segment and offset. That is why address symbol is sometimes called vector or relative symbol and numeric symbol is called scalar or absolute symbol or constant.

There are four ways to create a symbol:
  1. Symbol is defined when its name occurs in the label field of a statement. Such a symbol represents an address within the section it was defined in, and the data or code emitted by the statement, too. The statement may be empty (solo label) or it may declare data, prefix or machine instruction. Pseudoinstructions PROC and PROC1 also define the symbol with their name, but pseudoinstructions PROGRAM, STRUC, SEGMENT do not.
  2. External symbols are created with pseudoinstructions EXTERN and GLOBAL, or when they are referred with two colons appended to their name. Extern symbol is not defined in the current program, it must not appear in label field (with an exception of EXTERN pseudoinstruction itself, which declares it as external).
  3. €ASM maintains a special dynamic symbol $ for each section, which represents the current assembly position in the section.
  4. Symbol can be defined with pseudoinstruction EQU or with its alias =. This is the only way how to define a plain numeric symbol.

↑ $ symbol

Special dynamic symbol $ represents the address of next free position in emitted code at the beginning of assembly of the statement, in which it is referred. Value of this symbol is not constant but it is changed by €ASM after an emitting statement has been assembled.

Programmer may change the offset of current origin $ with EQU pseudoinstruction, this is equivalent to pseudoinstruction ORG known from other assemblers.

There is no ORG pseudoinstruction in €ASM, $ is made l-value instead.
|00000100:44444444      |DataDword  DD 0x44444444
|00000104:              |; Redefine DataDword as a word-accessible union:
|00000100:              | $ EQU DataDword      ; Return emitting pointer back.
|00000100:1111          |DataLoWord DW 0x1111  ; Re-emit new data which will overwrite
|00000102:2222          |DataHiWord DW 0x2222  ; data defined at DataDword.
|00000104:              |
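
The effect of moving the origin back can be reproduced with a small Python model of an emitting buffer. This is a sketch under assumptions: the class Emitter and its members are hypothetical, the buffer size is chosen only to fit the example, and data are stored little-endian as on x86.

```python
import struct


class Emitter:
    """A toy output buffer with a movable origin (the role of $)."""

    def __init__(self, size):
        self.image = bytearray(size)
        self.origin = 0

    def emit(self, data):
        end = self.origin + len(data)
        self.image[self.origin:end] = data  # overwrite whatever was there
        self.origin = end                   # $ advances past the emitted data


out = Emitter(0x104)
out.origin = 0x100
data_dword = out.origin                   # address of DataDword
out.emit(struct.pack("<I", 0x44444444))   # DataDword  DD 0x44444444
out.origin = data_dword                   # $ EQU DataDword
out.emit(struct.pack("<H", 0x1111))       # DataLoWord DW 0x1111
out.emit(struct.pack("<H", 0x2222))       # DataHiWord DW 0x2222

value = struct.unpack_from("<I", out.image, data_dword)[0]
print(hex(value), hex(out.origin))
```

After the two words are re-emitted, the double word at DataDword no longer holds 0x44444444; read as a little-endian DWORD it combines DataHiWord and DataLoWord.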

See also test t2551 or sample project boot16.


↑ Symbol, register and file attributes

SIZE# ↓
TYPE# ↓
REGTYPE# ↓
SCOPE# ↓
OFFSET# ↓
SECTION# ↓
SEGMENT# ↓
GROUP# ↓
PARA# ↓
FILESIZE# ↓
FILETIME# ↓

Some important symbol properties are available for further processing in a program at assembly time; they are called attributes. When a symbol is defined, it automatically gets its attributes. They can be referred to by prefixing the symbol name with an attribute operator. Attribute operator is an identifier which defines the kind of attribute, immediately followed by #. The object which the attribute operator is applied to may be separated by zero or more white spaces and it may be in parentheses. For instance SIZE#SymbolName or SIZE# SymbolName or SIZE#(SymbolName). Remember that the symbol name is case sensitive but the attribute name is not.

Attributes GROUP#, SEGMENT# and SECTION# return an address when applied to an address symbol; they return scalar zero when applied to numeric symbol. Other attributes always return scalar (plain number).

↑ SEGMENT#

Attribute SEGMENT# represents the address of beginning of the segment that the symbol belongs to. When applied to a numeric symbol, it returns scalar zero.

↑ GROUP#

Attribute GROUP# represents the address of beginning of the group that the symbol belongs to, i.e. address of the first byte of the first (lowest) segment of the group. When applied to a numeric symbol, it returns scalar zero.

↑ PARA#

Attribute PARA# represents the paragraph address of beginning of the group that the symbol belongs to. It is the value which has to be loaded to the segment register which will be used for addressing. When PARA# is applied to a numeric symbol, it returns scalar zero.
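
The relation between a paragraph address and a linear address in real mode is simple arithmetic, shown here as a Python sketch. The function names and the sample address are hypothetical; the sketch assumes 16-byte paragraphs and a group which starts on a paragraph boundary.

```python
PARAGRAPH = 16  # bytes per paragraph in real mode


def paragraph_address(linear_address):
    """Value to load into a segment register for a paragraph-aligned address."""
    if linear_address % PARAGRAPH:
        raise ValueError("address is not paragraph aligned")
    return linear_address // PARAGRAPH


def linear_address(segment_register, offset):
    """Real-mode address formed from a segment register value and an offset."""
    return segment_register * PARAGRAPH + offset


para = paragraph_address(0x12340)  # hypothetical start of a group
print(hex(para), hex(linear_address(para, 0x10)))
```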

↑ SECTION#

Attribute SECTION# represents the address of beginning of the section that the symbol belongs to. When applied to a numeric symbol, it returns scalar zero. If the symbol lies in default section (with the same name as its segment), both SECTION# and SEGMENT# attributes return identical address.

↑ OFFSET#

Attribute OFFSET# returns the offset of symbol in the current segment as a plain number, i.e. the number of bytes between the start of segment and the symbol itself. If the symbol is numeric, its value is returned.

Symbol and OFFSET#Symbol are identical only when Symbol is a scalar value, otherwise the former represents its address and the latter represents plain number.

The expression Symbol - SEGMENT#Symbol is identical with OFFSET#Symbol for both numeric and address kind of symbols.

↑ SCOPE#

Attribute SCOPE# returns a number representing the ASCII value of capital letter corresponding with the symbol scope, which can be 'E' for external symbols, 'P' for public symbols, 'X' for exported symbols, 'I' for imported symbols, 'S' for standard (private) symbols, or '?' when the symbol is undeclared.

↑ SIZE#

SIZE# represents the amount of bytes emitted with the statement which defines the symbol. Typically it is the size of data defined with D pseudoinstruction or the size of machine instruction. Symbols defined with EQU pseudoinstruction or defined in non-emitting instruction have attribute SIZE# equal to zero.

↑ TYPE#

Attribute TYPE# returns a number representing the ASCII value of capital letter corresponding with the symbol type. It may be one of the fundamental data types 'B', 'U', 'W', 'D', 'Q', 'T', 'O', 'Y', 'Z', structured data type 'S' or machine instruction type 'I' when the symbol is defined with data definition pseudoinstruction D.
Numeric symbol returns type attribute 'N'.
Label of a machine instruction or machine prefix have type attribute 'I'.
Address symbol defined with just a label and external symbol returns 'A'.
Undefined symbol returns '?'.

A forward reference to a symbol will create its record in the symbol table. However, in the first pass its type attribute is '?' (undefined) until its definition is encountered. On the other hand, applying an attribute to an undefined symbol does not make it referred. That is why we may test with the pseudoinstruction %IF TYPE#Symbol = '?' whether the symbol is undefined in the program.

Beside symbols, some attribute operators may be applied to other elements: register, structure name, string, expression in parentheses () or brackets [].

TYPE# of register is 'R' and its SIZE# is equal to the register width in bytes (1,2,4,8,10,16,32,64).

TYPE# of structure or segment is 'S' and SIZE# computes its size in bytes.

|[.data]                |[.data]
|00000000:456E642E0D0A00|Message D 'End.',13,10,0  ; Defined as DB or DU.
|[.text]                |[.text]
|TRUE                   | %IF TYPE# Message = 'B'  ; If UNICODE is disabled
|00000000:B907000000    |   MOV ECX,SIZE# Message  ; load ECX with size in bytes.
|FALSE                  | %ELSE                    ; Otherwise UNICODE is enabled,
|                       |   MOV ECX,SIZE#Message/2 ; and SIZE# returns 14 bytes.
|                       | %ENDIF                   ; ECX is now 7 (message length in characters).

Why should we use SIZE# or TYPE# attributes when the queried symbol is defined by ourselves and therefore we already know its size and type? If we decide to change the text of Message later, we won't have to bother with recalculation of its length.

Attribute operators are often used in macros to determine what type of operand the macro was provided with: whether it is a register, data symbol, immediate value etc. When we need to know in a macro whether the provided operand %1 is a plain number, we can test it with the query %IF TYPE# %1 = 'N'.
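As a hedged illustration of such a test, the following sketch defines a hypothetical macro Clear (it is not part of €ASM or its libraries) which zeroes its operand. It assumes that the operand is either a register or the name of a data symbol defined with D; other kinds of operands are not handled here.

```asm
Clear %MACRO Operand              ; Hypothetical macro, for illustration only.
        %IF TYPE# %Operand = 'R'  ; The operand is a register.
          XOR %Operand,%Operand
        %ELSE                     ; Assumed to be a data symbol; its type selects the operand width.
          MOV [%Operand],0
        %ENDIF
      %ENDMACRO Clear

; Possible invocations. Counter is an assumed memory variable defined elsewhere, e.g. Counter DD 0.
      Clear ECX                   ; Expected to be assembled as XOR ECX,ECX
      Clear Counter               ; Expected to be assembled as MOV [Counter],0
```

The macro body branches at assembly time, so only one of the two instructions is emitted in each expansion.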

See tests t16* for more attribute examples.

The detailed differentiation of data symbols which attribute TYPE# yields is sometimes not necessary. For instance, we may need to distinguish whether macro operand %1 needs relocation at link time. This happens when the operand is an address symbol or a memory variable whose address expression contains some address symbol. TYPE# DataSymbol or TYPE# [DataSymbol+RSI] may return 'A', 'B', 'W', 'D', 'Q', 'T' or whichever kind of data DataSymbol was defined with. Otherwise it returns 'N' when the operand is a number which doesn't use relocation, such as TYPE# MAX_PATH_SIZE or TYPE# [RBP-16]. Here we may unify all kinds of address and external symbols with attribute operator SEGMENT#, which returns the relocatable address of the bottom of the symbol's segment, regardless of the symbol's datatype. Attribute TYPE# applied to such SEGMENT# attribute always returns 'A'. On the other hand, SEGMENT# ScalarSymbol is a plain number and TYPE#(SEGMENT#ScalarSymbol) returns 'N'.

%IF TYPE# (SEGMENT# %1) = 'A'
  ; %1 is address expression which requires relocation.
%ELSE
  ; %1 is nonrelocable expression.
%ENDIF
Notice that chained attributes require parentheses. This is because all attribute operators have equal priority, so they are evaluated from left to right, and without parentheses the first operator would attempt to apply itself to another unary operator.
See also test t1695 for more examples.
↑ REGTYPE#

Attribute TYPE# applied on register returns value 'R', regardless of register family. Sometimes it is useful to know the exact kind of register. Attribute REGTYPE# returns a number representing the ASCII value of capital letter corresponding with the register family. General-purpose registers return 'B', 'W', 'D', 'Q', SIMD registers return 'X', 'Y', 'Z', segment registers return 'S' etc. See the Registers table for the complete list. When this attribute is applied to an element which is not a register, it returns '?'. See also test t1648.
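A minimal sketch of how a macro might validate its register operand with REGTYPE#. The macro Zero32 is hypothetical, and the sketch assumes that 'D' is the letter returned for 32-bit general-purpose registers, as the list above suggests; the Registers table remains the authoritative source.

```asm
Zero32 %MACRO Reg                 ; Hypothetical macro, for illustration only.
         %IF REGTYPE# %Reg = 'D'  ; Assumption: 'D' identifies a 32-bit general-purpose register.
           XOR %Reg,%Reg
         %ELSE                    ; Other register families, or a non-register which yields '?'.
           %ERROR Macro "%0" expects a 32-bit general-purpose register.
         %ENDIF
       %ENDMACRO Zero32
```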

↑ FILESIZE#
↑ FILETIME#

Unlike previous attributes, FILESIZE# and FILETIME# can be applied only to files specified by their name, which must be surrounded with double quotes ". The filename may have an absolute, relative, or no path; a relative or missing path is related to the current directory at assembly time.

Both file attribute operators investigate the file properties at assembly time.

FILESIZE# "filename" returns the number of bytes in the file, or 0 if the file was not found.
FILETIME# "filename" returns the timestamp of the file, i.e. the number of seconds between 1st of January 1970 midnight UTC and the last file modification. It returns 0 when the file was not found. See also test t1690.
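Both file attributes may be tested at assembly time, for instance to stop the assembly when a required file is missing or outdated. The following fragment is only a sketch: the file names banner.txt, table.src and table.inc are hypothetical, and the second test assumes that two attribute operations may be combined in one numeric comparison.

```asm
%IF FILESIZE# "banner.txt" = 0    ; 0 is also returned when the file was not found.
   %ERROR File "banner.txt" is missing or empty.
%ENDIF
%IF FILETIME# "table.src" > FILETIME# "table.inc" ; Compare the times of last modification.
   %ERROR File "table.inc" is older than "table.src" and should be regenerated.
%ENDIF
```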

↑ Literals

Literal symbols alias literals are similar to standard assembler symbols. The main difference is that they don't have explicit definition and name. Literal is defined whenever it is referred and its name is represented with equal sign = followed with data expression, for instance =D(5) or =B"Some text.". They may be duplicated, but unlike in D pseudoinstruction (which may have many operands), only one data expression can be specified. Examples of instructions with literals:

DIV [=W(10)]   ; Divide DX:AX by an anonymous word memory variable with value 10.
MOV DX,=B"This is a literal message.$" ; Load DX with offset of a string defined ad hoc somewhere in data segment.
LEA ESI,[=D 0] ; Load ESI with address of a DWORD memory variable which contains the value 0.
CALL =I"RET"   ; Push EIP and then load EIP with offset of machine instruction RET defined somewhere in code segment.
LEA EBX,[=D 0,1,2,3] ; Error: multiple data expressions.
MOV DX,=B"This is a literal message.",13,10 ; Error: multiple data expressions.
The first example declares a word variable =W(10). Without literals we would have to explicitly define a data variable Ten DW 10 somewhere in data section.

Advantage of literal is that we don't need to invent unique symbol name and explicitly declare the symbol in data section with D pseudoinstruction. The data contents is visible directly in the instruction which uses the literal.

Literals are automatically aligned.

All literals are autoaligned according to their type, for instance =D 5 is DWORD aligned regardless of current EUROASM AUTOALIGN= option.

String literals are automatically zero-terminated.

String literals, such as =B"Some text" or =U"Some text" are always implicitly terminated with byte or unichar zero when they are declared as literals.
€ASM allows simplified declaration of nonduplicated literal strings, where the type identifier (B or U) is omitted, e.g. ="Some text". The actual type of string is then determined by system preprocessing variable %^UNICODE.

Implicit data definition with literals does not allow the programmer to control the exact location where the literals will be emitted. €ASM creates a subservient section for each type of data depending on their natural alignment. The literal section is created either

  1. in the last data segment which was created with PURPOSE=DATA+LITERAL, or if no such segment was found,
  2. in the last data segment defined in the program, or if no data segment was declared in the program,
  3. in implicit data segment [@LT] created automatically by €ASM.

Names of literal sections are [@LT64], [@LT32], [@LT16], [@LT8], [@LT4], [@LT2], [@LT1].
Literals with INSTRUC data type, such as =8*I"MOVSD", are emitted to subservient section [@RT0] which is similarly created in segment with PURPOSE=CODE+LITERAL, or in the last code segment, or in automatically created implicit code segment [@RT].

Repeated literals with the same declaration are reused, they represent the same memory variable. Literals with non-verbatim match, such as =W+4, =W 4 and =W(2+2) are stored separately as different symbols, nevertheless their value is reused when it's identical, so it occupies common space in literal section. Similarly =B"Some text", =B'Some text' and =B 'Some text' are different but those three symbols together will occupy only 9+1 bytes in literal section memory at run-time.

Literals must always be treated as read-only memory variables.

Although nothing can stop the programmer from overwriting the literal value at run-time, this could corrupt behaviour of other parts of program, which might be reusing the same literal data.
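The reuse of literals described above might look like the following fragment. It is a sketch only, assuming 32-bit code; the strings and values are arbitrary.

```asm
      MOV ESI,=B"Ready"   ; The first reference declares the literal string.
      MOV EDI,=B"Ready"   ; Verbatim repetition refers to the same memory variable as above.
      MOV EAX,[=D 12]     ; Load EAX from an anonymous DWORD which contains the value 12.
      ADD EAX,[=D 12]     ; The same literal is reused, no second copy should be emitted.
;     MOV [=D 12],EAX     ; Not recommended: this would change the value seen by every user of =D 12.
```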

Comparison of standard symbols and literals
Property | Standard symbol | Literal symbol
Declaration | It is defined explicitly, with pseudoinstruction D or its clones, e.g. Dozen: DD 12 | It is declared when it is first used in any instruction, e.g. MOV ECX,=D 12
Name | Programmer must invent unique symbol name. | Name of literal symbol is created from its value.
Position in object code | Placement of the symbol is fully in programmer's hands. | Placement is not directly controlled by programmer.
Alignment | If required, it must be specified explicitly with pseudoinstruction ALIGN, or with modifier ALIGN= or with EUROASM option AUTOALIGN=. | Literals are always naturally aligned, as if EUROASM AUTOALIGN=ENABLED were set at their declaration.
Alignment stuff | In order to minimize necessary alignment stuff, programmer should pay attention when mixing aligned data with different sizes. | Literal data of all sizes are packed together in the descending order which minimizes alignment stuff between them.
Multioperands | Data definition pseudoinstruction D and its clones support multiple operands, e.g. Hello DB "Hello, world",13,10,'$' | Multiple literal operands are not supported.
String NUL termination | Only when explicitly declared, for instance Hello: DU "Hello, world",0 | Automatically, e.g. MOV ESI,=U "Hello, world"
Duplication | Duplication is supported, e.g. FourDoublePrecOnes: DY 4 * Q 1.0 | Duplication is supported, e.g. VMOVUPD YMM7, [= 4 * Q 1.0]
Value overwriting | Ad libitum. | This should be avoided.

↑ Structures

Structure is declared by a piece of assembly code represented with STRUC..ENDSTRUC block. The block declares names, datatypes, sizes and offsets of structure members. In OOP terminology the structure is a class and structured memory variable is an object. Example:

DATUM STRUC           ; Declaration of structure (class) DATUM.
 .Year  D W
 .Month D B
 .Day   D B
      ENDSTRUC DATUM

Today DS DATUM        ; Definition of memory variable (object) Today.
Members of structure should have local names (beginning with period .). Structure declaration defines namespace block.

Structure declaration creates symbols DATUM.Year, DATUM.Month, DATUM.Day with values 0, 2, 3 respectively. Those symbols are absolute (scalars) and they give names to relative offsets inside the structure.

Data definition creates structured memory variable - symbol Today. At the same time it creates symbols Today.Year, Today.Month, Today.Day. Their addresses are defined somewhere in data or bss section, they are not scalars but have relocable addresses.

Value of structure members is undefined (when the structured variable was defined in BSS segment) or it contains all zeroes (if defined in DATA segment). Members of structured memory variable can be defined statically at definition-time with keyword operands, for instance Today DS DATUM, .Day=31, see also pseudoinstruction DS.

Memory-variable member can be accessed directly, for instance

MOV [Today.Month],12

We could also use a register to address the whole memory-variable, and employ this register to address individual members with relative offsets specified in structure declaration:

 MOV EDI,Today
 MOV [EDI+DATUM.Month],12
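The following sketch puts the pieces together. The variable Xmas is hypothetical, and the sketch assumes that DS accepts more than one keyword operand and that no alignment padding is inserted between the members.

```asm
DATUM STRUC                        ; The same declaration as in the example above.
 .Year  D W
 .Month D B
 .Day   D B
      ENDSTRUC DATUM

Xmas  DS DATUM, .Month=12, .Day=24 ; Hypothetical variable with two members defined statically.
      MOV ECX,SIZE# DATUM          ; Size of the structure, 2+1+1 bytes for this declaration.
      MOV AL,[Xmas.Day]            ; Direct access to a member of the structured variable.
      MOV ESI,Xmas                 ; Address of the whole memory variable.
      MOV AH,[ESI+DATUM.Month]     ; Access via register and the relative offset of the member.
```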

↑ %Variables

User-defined %variables ↓

Formal %variables ↓

Automatic %variables ↓

System %variables ↓

€ASM program uses preprocessing variables (alias %variables) for easy manipulation of the source text at assembly-time. Hand in hand with macroinstructions they make a very strong tool to save repetitive programmer's labour. Preprocessing apparatus does not affect the object code directly, as plain assembler does. Instead, it manipulates the source text, which can be modified with %variables and repeated with preprocessing %pseudoinstructions.

Preprocessing variables always treat their contents as a sequence of characters, without inspecting its syntactic significance, no matter if they were assigned with literal text, string, numeric or logical expression or whatever.

Once assigned, the contents of %variable will be used (expanded) whenever the %variable appears in the source text (except for comments). Expansion takes place before the physical line of source file is parsed into statement fields. By default the whole contents of %variable is expanded, but this can be limited with Substring or Sublist operation.
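The textual nature of %variables can be seen in a short sketch which compares %SET (plain text assignment) with %SETA (arithmetic assignment, which stores a decimal number). The names %Sum and %Value are arbitrary.

```asm
%Sum   %SET  1+2       ; %Sum keeps the three characters 1+2, nothing is evaluated.
%Value %SETA 1+2       ; %SETA evaluates the expression and stores its decimal value 3.
       DB %Sum*3       ; Expands to DB 1+2*3, which defines a byte with value 7.
       DB %Value*3     ; Expands to DB 3*3, which defines a byte with value 9.
```

The difference arises because the expansion happens before the statement is parsed, so operator priority is applied to the already expanded text.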

See also €ASM function Preprocessing.

Preprocessing %variables families
%Variable family | User-defined | Formal | Automatic | System EUROASM | System PROGRAM | System €ASM
name format | %identifier | %identifier | %spec.character(s) | %^option | %^option | %^fixed
case-sensitive | Yes | Yes | Yes | No | No | No
(re)assignable | explicitly with %SET* | indirectly by for-loop or macro expansion | indirectly by macro expansion | indirectly by EUROASM option | indirectly by PROGRAM option | No

↑ User-defined %variables

Name of user-defined %variable is represented with a percent sign % immediately followed by an identifier, which is not a reserved %variable name in either case. Identifier name must begin with a letter and may not contain fullstop or other punctuation.

User-defined %variable name is case-sensitive.
Reserved %variable names
Category | Reserved names
Pseudoinstructions | %COMMENT, %DEBUG, %DISPLAY, %DROPMACRO, %ELSE, %ENDCOMMENT, %ENDFOR, %ENDIF, %ENDMACRO, %ENDREPEAT, %ENDWHILE, %ERROR, %EXITFOR, %EXITMACRO, %EXITREPEAT, %EXITWHILE, %FOR, %IF, %MACRO, %PROFILE, %REPEAT, %SET, %SET2, %SETA, %SETB, %SETC, %SETE, %SETL, %SETS, %SETX, %SHIFT, %UNTIL, %WHILE

User %variables are assigned (created) by the programmer with one of the %SET* family of pseudoinstructions.

%Variables may be reassigned later with a different value, they don't have to be unique in the source.

Scope of user-defined %variable begins at its definition and ends at the end of source file.

%Variables need not be assigned before the first use. Unassigned %variable expands to nothing (empty text). Once defined, a %variable cannot be unassigned; there is no %UNSET, UNDEFINE or UNASSIGN directive in €ASM. Nevertheless, setting a %variable to emptiness (e.g. %SomeVar %SET) is equivalent to unsetting it. €ASM reports no warning if it encounters a user-defined %variable which is empty, which has not been defined earlier or which is not defined in the source file at all.
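Because an unassigned %variable silently expands to nothing, its emptiness can be tested with a string comparison. In this sketch %Build is a hypothetical %variable which may or may not have been assigned earlier in the source.

```asm
%IF "%Build" !== ""               ; True when %Build expands to some nonempty text.
Banner DB "Build %Build",0
%ELSE                             ; %Build is empty or it was never assigned.
Banner DB "Unnamed build",0
%ENDIF
```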

See also test t8321.

Differences between symbols and %variables
Symbols | User-defined %variables
are properties of PROGRAM | are properties of EuroAssembler
their name never begins with % | their name always begins with %
may have membership fullstop in their name | never have fullstop in their name
are declared in label field of a statement | are assigned with %SET* pseudoinstruction
have assembly attributes such as TYPE# and SIZE# | are simply a piece of text without attributes
may be forward referenced | cannot be forward referenced
must be declared just once in a program | may be redeclared many times
cannot be referenced if not declared somewhere in the main or linked program | may be referenced without declaration
cannot be subject of sublist or substring operation | can be sublisted or substringed

↑ Formal %variables

Formal %variable expands to a value of parameter used in %FOR loop and in %MACRO invocation. It is represented by an identifier which stands in the label field of %FOR statement, or as an operand in %MACRO prototype.

Scope of formal variables is limited to the block which is being expanded.

Count %FOR 1..8
        DB %Count
      %ENDFOR Count

The previous example generates eight DB statements which define byte values from 1 to 8. Identifier Count used in %FOR and %ENDFOR statements is %FOR-control variable, which is accessible inside the %FOR block as a formal %variable %Count.

Formal variables are also used to access macro operand by name during the macro expansion. In the next example we have two %MACRO-formal variables provided in %MACRO definition as identifiers Where and Stuff. In macro body their values are available as formal %variables %Where and %Stuff.

Fill %MACRO Where, Stuff=0 ; Definition of macro Fill.
       MOV %Where,%Stuff
     %ENDMACRO Fill

; invocations of macro Fill:
   Fill [Counter], Stuff=255 ; Will be assembled as MOV [Counter],255
   Fill EBX                  ; Will be assembled as MOV EBX,0

Notice that formal %variables are always written without the percent sign when they are declared, but % must be prefixed to their name when they are referred in the %FOR or %MACRO body. This is important for inheriting of arguments in nested and recursively expanded macroinstructions, see t8233 as an example.

Scope of formal %variables has higher priority than user-defined %variables with identical name, no matter if they were assigned outside or inside the scope. Reassignment of a %variable with formal name inside the macro body will assign the new value to the user-defined %variable, but inside macro the value of formal %variable prevails, see t8347, t8349. %Variable with reassigned value will be visible outside the macro, though.


↑ Automatic %variables

Automatic preprocessing variables are created and maintained by EuroAssembler at asm-time; their names contain punctuation characters and, unlike user-defined %variables, they cannot be explicitly reassigned with %SET pseudoinstruction.

Scope of automatic %variables is limited, using them outside their scope leads to an error.

%&

Suboperation size | length %& represents the number of characters | list items | physical lines in suboperated object.
Its scope is constrained to suboperation braces [ ] or { }.

Automatic suboperation variable %& is created when the expansion of included file or of another %variable uses suboperations.

When the substring operator [ ] is appended to the %variable name or to the included file name, automatic variable %& can be used inside the brackets, e.g. [1..%&] and it represents the number of bytes in expanded %variable or in the included "file".
E.g. if the user has assigned %aVariable with five letters %aVariable %SET ABCDE then its size is 5 and the statement DB "%aVariable[4..%&]" expands to DB "DE".

When the sublist operator { } is appended to the %variable name, contents of this %variable is treated as an array of comma-separated items and %& represents their count (ordinal number of the last nonempty item).
E.g. if the user has assigned %aReglist %SET ax,cx,dx,bx,bp then its length is 5 operands (items) and the statement MOV %aReglist{3},%aReglist{%&} expands to MOV dx,bp.

When the same sublist operator { } is appended to the included file name, contents of the file is treated as a set of physical lines and %& represents number of lines in the file. For instance INCLUDE "file.inc"{%&-10 .. %&} will include the last ten lines from "file.inc".

Using the %& variable outside brackets will throw an error.

Indices of suboperation span from 1 to %&.
%.

Expansion counter %. maintains a decimal number which is incremented by €ASM in each expansion of preprocessing block and can be used to create unique labels in repeating blocks.
Its scope is limited to the body of preprocessing blocks %MACRO, %FOR, %WHILE, %REPEAT. If used outside those blocks, it will expand to a digit 0, see t8362.

If there is some private or local label declared within a macro or repeating block, and if the macro or block is expanded more than once, the symbols will be defined more than once, which assembler treats as an error. The identifier used as a label within a macro or other expanding pseudooperations (%FOR, %REPEAT, %WHILE) should be unique. This can be achieved with the expansion counter embedded into symbol name.

See the example of macro AbortIf below. The label Skip is postfixed with %., giving the label Skip%. which expands to Skip1 and which will expand to Skip2 on the next AbortIf invocation.

|00000008:                |
|                         |AbortIf %MACRO Condition=, Errorlevel=1 ; Definition of macro AbortIf.
|                         |  J%!Condition Skip%.:   ; Use inverted condition to bypass the abortion.
|                         |  PUSH %Errorlevel       ; Prepare operand for API invocation.
|                         |  CALL ExitProcess::     ; Windows API for program termination.
|                         |Skip%.:                  ; Label where the program continues.
|                         |  %ENDMACRO AbortIf
|00000008:                |
|00000008:                | ; Example of conditional abortion:
|                         |  EUROASM ListMacro=Yes, ListVar=Yes ; Display the expanded instructions.
|00000008:833D[04000000]00|  CMP [Something],0      ; Test the condition and then invoke macro.
|0000000F:                |  AbortIf Condition=E, Errorlevel=8 ; The program exits when Something is zero.
|                         +AbortIf %MACRO Condition=, Errorlevel=1 ; Definition of macro AbortIf.
|0000000F:7507            +  J%!Condition Skip%.:   ; Use inverted condition to bypass the abortion.
|                         !JNE Skip1:
|00000011:6A08            +  PUSH %Errorlevel       ; Prepare operand for API invocation.
|                         !PUSH 8
|00000013:E8(00000000)    +  CALL ExitProcess::     ; Windows API for program termination.
|00000018:                +Skip%.:                  ; Label where the program continues.
|                         !Skip1:
|                         +  %ENDMACRO AbortIf
|00000018:                | ; Continue with the program if not aborted.
Automatic variable %. helps to create unique symbol names.

All the following automatic macro %variables have their scope limited to the %MACRO block body. They refer to operands used when the macro is invoked (expanded).

%:

If a label is used in macro invocation, the label is by default placed in the first of expanded statements. This behaviour can be overridden when the automatic macro label %variable %: is explicitly declared somewhere in the macro definition. Only one such label may be defined in the macro. Moving the macro label may spare a few clocks when the program jumps to a macro expansion which begins with code that would otherwise have to be skipped, see the following example:

SaveCursor %MACRO Videopage=BH
   %IF TYPE#CursorSave != 'W' ; If the memory variable CursorSave was not defined yet.
     JMP %:                   ; Skip to $+4 (below the DW) when the macro is entered in normal statements flow.
     CursorSave DW 0          ; Space for storing the cursor is reserved here in the code section.
   %ENDIF
%: MOV AH,3                   ; Entry point of macro is here when the macro invocation is jumped to.
   MOV BH,%Videopage
   INT 10h                    ; Get cursor shape via BIOS API.
   MOV [CursorSave],CX
 %ENDMACRO SaveCursor
  ...
Save: SaveCursor Videopage=0  ; Use the macro in program.
  ...
  JMP Save:                   ; Jumps to the instruction MOV AH,3.
Automatic variable %: represents "entry" of macro body.

See also test t8215.

%1

Ordinal operands of the macro can be referred to by digits. Unlike in batch scripts for DOS and Windows, their number is not limited to 9; any decimal number is possible, for instance %11. Of course, when the eleventh operand is not specified in macro invocation, %11 expands to nothing.
See also pseudoinstruction %SHIFT.

Automatic %variable %0 expands to the macro name.

%Formal

Another way to refer to a macro operand (both ordinal and keyword) is to prefix the formal name of the operand with a percent sign.

%!1 or %!Formal

When the ordinal number or formal operand name is prefixed with logical NOT operator (exclamation !), it expands to inverted condition code from ordinal operand. This requires that the referred operand contains a general condition code (case insensitive) such as E, NE, C etc. Operand contents will be replaced with corresponding inverted code. €ASM reports error if the operand did not contain valid condition code.

NASM uses unary-minus operator - to achieve similar functionality. I believe logical-not operator ! is more appropriate for the inversion of logical values.

See the macro AbortIf above as an example.

%*

Ordinal operand list %* is assigned with all ordinal operands from macro invocation, comma-separated. Keyword operands are omitted from the list.

Macro operands can be referred to by various methods. The following example demonstrates three possible ways to refer to macro ordinal operands:

CopyStr %MACRO FirstOp, SecondOp, ThirdOp ; Macro prototype.
          MOV ESI,%FirstOp ; Using formal %variable name of the operand.
          MOV EDI,%2       ; Using ordinal number of the operand.
          MOV ECX,%*{3}    ; Using the third item of operand list.
          REP MOVSB
        %ENDMACRO CopyStr
          ...
        CopyStr Source, Dest, SIZE# Dest ; invocation of the macro.
%#

Ordinals count variable %# is set to the length of the ordinal operand list (the ordinal number of the last non-empty operand). It represents the number of ordinal operands used in macro invocation (not the number declared in macro prototype).

The same length could be also obtained with %NrOfOrdinals %SETL %*.
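A minimal sketch of a macro with a variable number of operands, which uses %# to check the invocation and %* to walk through the operands. The macro PushAll is hypothetical; the sketch assumes that %FOR accepts the comma-separated list which %* expands to.

```asm
PushAll %MACRO                    ; Hypothetical macro which accepts any number of ordinal operands.
          %IF %# = 0              ; No ordinal operand was provided in the invocation.
            %ERROR Macro "%0" requires at least one operand.
          %ENDIF
Item      %FOR %*                 ; Walk through all ordinal operands.
            PUSH %Item
          %ENDFOR Item
        %ENDMACRO PushAll

        PushAll EAX,EBX,ECX       ; %# is 3 here. Expected to expand to PUSH EAX, PUSH EBX, PUSH ECX.
```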
%=*

List of keyword operands %=* is similar to the automatic variable %* but it contains only comma-separated keyword=value operands actually used in macro invocation.

Both %* and %=* can be used to make cloned macros with different names. For example

copystr %MACRO
          CopyStr %*, %=*
        %ENDMACRO copystr

This creates a clone of previously defined macro CopyStr but with a different name copystr. All operands used in invocation of copystr will be passed verbatim to CopyStr.

%=#

Keyword count variable %=# represents the number of keyword operands actually used in macro invocation (not the number declared in macro prototype).
See also t8364.


↑ System %variables

EUROASM system %variables ↓
PROGRAM system %variables ↓
€ASM system %variables ↓

EuroAssembler maintains a collection of preprocessing variables with values specified by configuration parameters. Their current value can be tested at asm-time, so the assembly process can branch accordingly.

The name of system variable consists of %^ followed with one of enumerated identifiers.

System %^variable names are case insensitive.

Value of system %^variable cannot be assigned with %SET* pseudoinstruction; it is dynamically maintained by €ASM and reflects the value currently in effect.

%^DumpWidth %SETA 32 ; Use EUROASM DumpWidth=32 instead.
System %^variables are read-only.

Programmer can influence the value of system %^variable only indirectly, with options specified in euroasm.ini configuration file or with EUROASM and PROGRAM pseudoinstructions.

System preprocessing %variables
Category | %variable names (case insensitive)
EUROASM | %^AES, %^AMD, %^AutoAlign, %^AutoSegment, %^CodePage, %^CPU, %^CYRIX, %^D3NOW, %^Debug, %^DisplayEnc, %^DisplayStm, %^Dump, %^DumpAll, %^DumpWidth, %^EVEX, %^FPU, %^ImportPath, %^IncludePath, %^Linkpath, %^List, %^ListFile, %^ListInclude, %^ListMacro, %^ListRepeat, %^ListVar, %^LWP, %^MaxInclusions, %^MaxLinks, %^MMX, %^MPX, %^MVEX, %^NoWarn, %^Profile, %^Prot, %^Prov, %^RTF, %^RTM, %^SHA, %^SIMD, %^Spec, %^SVM, %^TBM, %^TimeStamp, %^TSX, %^Undoc, %^Unicode, %^VIA, %^VMX, %^Warn, %^XOP
PROGRAM | %^DllCharacteristics, %^Entry, %^FileAlign, %^Format, %^IconFile, %^ImageBase, %^ListGlobals, %^ListLiterals, %^ListMap, %^MajorImageVersion, %^MajorLinkerVersion, %^MajorOSVersion, %^MajorSubsystemVersion, %^MaxExpansions, %^MaxPasses, %^MinorImageVersion, %^MinorLinkerVersion, %^MinorOSVersion, %^MinorSubsystemVersion, %^Model, %^OutFile, %^SectionAlign, %^SizeOfHeapCommit, %^SizeOfHeapReserve, %^SizeOfStackCommit, %^SizeOfStackReserve, %^StubFile, %^Subsystem, %^TimeStamp, %^Width, %^Win32VersionValue
€ASM | %^Date, %^EuroasmOs, %^Proc, %^Program, %^Section, %^Segment, %^SourceExt, %^SourceFile, %^SourceLine, %^SourceName, %^Time, %^Version
↑ EUROASM system %^variables

are assigned with values specified in [EUROASM] section of euroasm.ini or with EUROASM pseudoinstruction.

For description of system %variables of this category see the corresponding keyword of pseudoinstruction EUROASM.

↑ PROGRAM system %^variables

are assigned with values specified in [PROGRAM] section of euroasm.ini or with PROGRAM pseudoinstruction.

For description of system %variables of this category see the corresponding keyword of pseudoinstruction PROGRAM.

↑ €ASM system %^variables

Value of €ASM system %variables is maintained by €ASM itself and programmer cannot change them directly. They are described here:

%^Version
Eight decimal digits which identify the version number of EuroAssembler. The version number can be deciphered as the day of €ASM release in the format YYYYMMDD.
%^Date, %^Time
Current date and time of assembly in the formats YYYYMMDD and HHMMSS, respectively. These two %^variables are set only once when €ASM starts. All source files assembled with one command euroasm source*.asm will share the same %^Date and %^Time which were set from the current local time at the moment when euroasm.exe launched.
%^EuroasmOs
identifies the operating system which EuroAssembler runs on during the assembly. It contains an abbreviation of the operating system name, such as Win, Lin, BSD...
This is not necessarily the operating system which the output program is intended to run on.
%^SourceFile, %^SourceName, %^SourceExt
These three %^variables contain full file name including path, name (without path and extension) and extension (including the leading .) of the source file which is currently assembled. €ASM updates these %^variables at the start of assembly and whenever some other file is included.
%^SourceLine
contains physical line number of the current statement in the current source file.
In multiline statements (with line continuation \) it is the last physical line.
%^Program
is the name of current PROGRAM..ENDPROGRAM block.
%^Proc
is the name of current procedure. This %^variable is empty outside PROC..ENDPROC or PROC1..ENDPROC1 block.
%^Segment
is the name of current segment (without braces).
%^Section
is the name of current section (without braces).

Combination of €ASM system %^variables is used internally to identify position of statement in error messages: "%^SourceName%^SourceExt"{%^SourceLine}, e.g. "HelloWorld.asm"{3}

€ASM %^variable %^Section can be used to save and restore the current section|segment in macros. Together with statement EUROASM PUSH it guarantees that the €ASM environment will not be modified by expanding a macro, even if the macro needs to change it temporarily.

aMacro %MACRO              ; Declaration of a macro which needs to emit to its own private section.
         EUROASM PUSH      ; Save all EUROASM options on their own stack.
%BackupSec %SET %^Section  ; Save the current section name to a user-defined %variable.
[.MacroPrivateSection]     ; Switch to the desired section.
               ...         ; Declare the macro body.
[%BackupSec]               ; Switch back to the original section, whatever it was.
         EUROASM POP       ; Restore EUROASM options.
        %ENDMACRO aMacro

Another example using system €ASM %^variables:

%MonthList %SET Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
%Day %SETA %^DATE[7..8] ; Using %SETA instead of %SET will assign %Day with decimal numeric value to get rid of leading zero.
InfoMsg DB "This program was assembled with €ASM %^EuroasmOs ver.%^Version",13,10
        DB "on %MonthList{%^Date[5..6]} %Day-th, %^Date[1..4] at %^Time[1..2]:%^Time[3..4].",13,10,0
; InfoMsg now contains something like
;           This program was assembled with €ASM Win ver.20081231
;           on Feb 8-th, 2009 at 22:05.

Enumerated options, such as %^CPU, %^FORMAT, %^MODEL etc., are assigned as text in upper case. They can be tested at assembly time with string-compare operations.

Arithmetic options are always assigned as numbers in decimal notation. Positive sign + is omitted. They can be tested at assembly time with numeric-compare operations.

Boolean options, such as AutoSegment=, Priv= etc., are assigned to corresponding system %^variables %^Autosegment, %^Priv as 0 (false) or -1 (true), no matter whether they were assigned using enumerated tokens ON/OFF, YES/NO, TRUE/FALSE or with a logical expression. They can be tested at assembly time with boolean expression or directly as an operand of %IF, e.g. %IF %^UNDOC.
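For instance, a boolean system %^variable may select between two data definitions at assembly time. This is only a sketch; the symbol name Greeting is arbitrary.

```asm
%IF %^Unicode                     ; -1 (true) when EUROASM option UNICODE= is enabled.
Greeting DU "Hello, world",0      ; String of wide characters.
%ELSE                             ; 0 (false) otherwise.
Greeting DB "Hello, world",0      ; String of bytes.
%ENDIF
```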

Range EUROASM options WARN= and NOWARN= are assigned to system %variables %^Warn, %^NoWarn as series of 3999 digits 0 (false) and 1 (true). The first digit reflects the current status of message I0001, the second I0002, the last W3999.
Example: %IF %^WARN[2820] will assemble the following statements only if message W2820 is currently enabled.

System %^variables can be used in macros to warn users of the macro that €ASM environment is not set as desired. Example:

 %IF "%^MODEL" !== "FLAT"
    %ERROR Macro "%0" is intended for flat memory model only.
 %ENDIF
 %IF %^SizeOfStackCommit < 16K
    %ERROR This recursive macro requires stack size at least 16 KB.
 %ENDIF
 %IF %^Width = 64 && ! %^AMD
    %ERROR 64bit programs for MS Windows should have AMD=Enabled.
 %ENDIF
 %IF %^NoWarn[2101]
    %ERROR You shouldn't suppress W2101. Move unused symbols to an included file instead.
 %ENDIF

↑ Instructions

Machine instructions ↓

Pseudoinstructions ↓

Macroinstructions ↓

Instruction is an identifier specified in operation field of the statement.

There are three kinds of instructions in assembly language:
machine instructions invented by CPU manufacturer,
pseudoinstructions invented by assembler manufacturer,
macroinstructions invented by programmer.


↑ Machine instructions

Instruction suffixes ↓

Instruction modifiers ↓

Instruction enhancements ↓

Undocumented instructions ↓

A machine instruction is the smallest order which tells the CPU to perform some calculation or data manipulation at run-time.

EuroAssembler uses Intel syntax where the first instruction operand specifies destination (which is often one of source operands, too), and one or more sources may follow.

This is the syntax used in CPU-manufacturer documentation and in most other assemblers, with the exception of the Unix-based gas, which prefers an alternative paradigm represented by AT&T syntax with reversed operand order. For more differences between AT&T and Intel syntax see [ATTsyntax].

EuroAssembler implements machine instruction mnemonics as defined in specifications by CPU vendors. It also implements some undocumented instructions and instruction-format enhancements which are described below.

Machine instruction mnemonic names and their suffixes are case-insensitive.

Some machine instructions allow alternative encodings of the same mnemonic; €ASM prefers the shortest one, if not instructed otherwise.
€ASM respects the mnemonic chosen by the programmer, therefore it never encodes e.g. LEA ESI,[MemoryVariable] as MOV ESI,MemoryVariable, although the latter encoding is one byte shorter. There are only two exceptions when the mnemonic is not obeyed:

↑ Instruction suffixes

Machine instructions can manipulate registers and memory variables of different widths, usually byte, word or doubleword operands. However, Intel architecture defines the same mnemonic regardless of data size. For instance, SUB [MemoryVariable],4 tells CPU to subtract immediate number 4 from the contents of MemoryVariable, which might have been defined as DB, DW, DD or DQ. €ASM looks at the type of MemoryVariable and selects appropriate encoding according to its size. However, the offset might also be external or expressed as register contents or a plain number, such as in SUB [ESI],4, and the type of memory variable is unknown in this case. One way to tell EuroAssembler which data width is desired is to use an instruction suffix, which is one of the letters B W D Q S N F appended to the mnemonic name.

€ASM allows many general-purpose instructions to be extended with mnemonic suffix B, W, D, Q to specify operand size.

Control-transfer instructions CALL, JMP, RET may be modified with suffix N or F which tells whether the distance of target is near or far, i.e. if the target belongs to the same segment or if segment descriptor value needs to change, too. The unconditional JMP instruction may also take suffix S when the distance to target can be encoded into 8 bits (-128..+127).

Suffix aware instructions in €ASM | Suffix
ADC, ADD, AND, CMP, CMPS, CRC32, DEC, DIV, IDIV, IMUL, INC, LODS, MOV, MOVS, MUL, NEG, NOT, OR, RCL, RCR, ROL, ROR, SAL, SAL2, SAR, SBB, SCAS, SHL, SHR, STOS, SUB, TEST, TEST2, XOR | B, W, D, Q
BT, BTC, BTS, BTR, ENTER, HINT_NOP, IRET, LEAVE, POP, POPF, PUSH, PUSHF | W, D, Q
PUSHA, POPA | W, D
INS, MOVSX, MOVZX, OUTS | B, W, D
XLAT | B
CALL, RET | N, F
JMP | S, N, F

Using an instruction suffix is not necessary in most cases because the width of a memory variable can be deduced from its type attribute, or the width is determined by the register used as one of the operands. An error is reported if the register width is in conflict with the suffix, for instance in MOVW AL,[ESI].

Mnemonic suffix notation is sporadically used in other assemblers or in CPU documentation, see STOSB/W/D, OUTSB/W/D, RETN/F etc. €ASM just extends this enhancement.

Mnemonics of many SIMD instructions terminate with letters ~SS, ~SD, ~PS, ~PD which specify the type of operands, too (Scalar/Packed Single/Double-precision). €ASM does not treat them as mnemonic suffixes.

There are a few conflicts/overloads of suffixed mnemonics with IA-32 instructions, they are resolvable by the type and/or the number of operands:

|00000000: | ; Standard Move versus MMX Move Doubleword:
|00000000:C7450800000000 | MOVD [EBP+8],0 ; Store immediate number to DWORD memory location (suffix ~D).
|00000007:0F7E4508 | MOVD [EBP+8],MM0 ; Store DWORD from MMX register to the memory location.
|0000000B: |
|0000000B: | ; Shift versus Double Precision Shift:
|0000000B:C1650804 | SHLD [EBP+8],4 ; Shift left logical the DWORD in memory location by 4 bits (suffix ~D).
|0000000F:0FA4450804 | SHLD [EBP+8],EAX,4 ; Shift left 4 bits from register EAX to the memory location.
|00000014: |
|00000014: | ; Compare String versus Compare Scalar Double-precision FP number:
|00000014:A7 | CMPSD ; Compare DWORDs at [DS:ESI] and [ES:EDI] (suffix ~D).
|00000015:A7 | CMPSD [ESI],[EDI] ; Ditto, documented with explicit operands.
|00000016:F20FC2CA00 | CMPSD XMM1,XMM2,0 ; Compare scalar float64 numbers for EQUAL.

↑ Instruction modifiers

CODE= ↓
DATA= ↓
IMM= ↓
DISP= ↓
SCALE= ↓
DIST= ↓
ADDR= ↓
PREFIX= ↓
MASK= ↓
ZEROING= ↓
EH= ↓
SAE= ↓
ROUND= ↓
BCST= ↓
OPER= ↓
ALIGN= ↓
NESTINGCHECK= ↓

Machine instructions with the same mnemonic name and functionality sometimes may be encoded to different machine codes. For instance, immediate value can be optionally encoded in one byte when it does not exceed the range -128..+127, or it can be encoded as a full word or doubleword. Similar rule applies to encoding of displacement value in address expressions. Scaled address expression such as [1*ESI+EBX] may be encoded without SIB as [ESI+EBX] or using the SIB byte with explicit scaling factor 1.

€ASM prefers the shortest variant but this may be changed with additional keyword operands called instruction modifiers.

Many other assemblers decorate operands with special directives byte, word, dword, qword, short, strict, near, far, ptr to achieve specific encoding, for instance add word ptr [StringOfBytes + 4], 0x20 or jmp short SomeLabel. Instead of those directives, €ASM uses either mnemonic suffix, or instruction modifiers.

AVX instruction modifiers MASK=, ZEROING=, SAE=, ROUND=, BCST= are used in €ASM instead of inconsistent and poorly documented decorators, such as {k} {z} {ru-sae} {4to16} {uint16} {cdab} proposed by [IntelAVX512] and [IntelMVEX].

Typical value of modifier is enumerated token such as BYTE, WORD, DWORD etc. Most of enumerated modifier values may be abbreviated to their first letter. Both names and values of instruction modifiers are case insensitive.

Some modifiers are boolean type, their value may be TRUE, YES, ON, ENABLE, ENABLED if true, and FALSE, NO, OFF, DISABLE, DISABLED otherwise. Boolean modifier may also be an expression which evaluates to zero (false) or nonzero (true), see boolean extended values.

When the requested modifier cannot be satisfied, €ASM reports warning and ignores it.

Modifiers actually used for encoding can be displayed when EUROASM option DISPLAYENC= is switched ON. In this case €ASM accompanies each machine instruction with diagnostic message D1080 which explicitly documents which modifiers were used for encoding:

| | EUROASM DISPLAYENC=ON
|00000000:694D10C8000000 | IMUL ECX,[EBP+16],200
|# D1080 Emitted size=7,DATA=DWORD,DISP=BYTE,SCALE=SMART,IMM=DWORD.
|00000007: |
|00000007:62F1ED2CF44D02<5 | VPMULUDQ YMM1,YMM2,[EBP+40h],MASK=K4
|# D1080 Emitted size=7,PREFIX=EVEX,MASK=K4,ZEROING=OFF,DATA=YWORD,BCST=OFF,OPER=2,DISP=BYTE,SCALE=SMART.
↑ CODE=

As a heritage from the evolution of older processors, some machine instructions have more than one encoding. For instance the instruction POP rAX may be encoded either as 0x58 or as 0x8FC0, keeping the same functionality. Modifier CODE= selects which encoding €ASM should use.

Operation-code modifier may be SHORT or LONG alias S or L. Default is the one which selects shorter encoding, usually CODE=SHORT.

When an instruction has two possible encodings with the same size, CODE=SHORT selects the variant with numerically lower opcode.

|00000000:43 | INC EBX
|00000001:43 | INC EBX,CODE=SHORT ; Intel 8086 legacy encoding, not available in 64bit mode.
|00000002:FFC3 | INC EBX,CODE=LONG
|00000004: |
|00000004:50 | PUSH EAX
|00000005:50 | PUSH EAX,CODE=SHORT ; Intel 8086 legacy encoding, not available in 64bit mode.
|00000006:FFF0 | PUSH EAX,CODE=LONG
|00000008: |
|00000008:87CA | XCHG ECX,EDX
|0000000A:87D1 | XCHG ECX,EDX,CODE=LONG ; Modifier swaps operands in commutative operations XCHG, TEST.
|0000000C:87D1 | XCHG EDX,ECX
|0000000E:87CA | XCHG EDX,ECX,CODE=LONG
|00000010: |
|00000010:C3 | RET
|00000011:C3 | RET CODE=LONG
|00000012:C20000 | RET CODE=SHORT ; Numerically lower opcode 0xC2 requested, which requires imm16.
|00000015: |
|00000015:83C07F | ADD EAX,127
|00000018:83C07F | ADD EAX,127,CODE=LONG
|0000001B:057F000000 | ADD EAX,127,CODE=SHORT ; Shorter opcode 0x05 requested, which cannot sign-extend imm8.
In some cases explicit request for numerically lower opcode with CODE=SHORT may lead to longer encoding, see the example ADD r32,imm8 above.
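
The selection rule can be sketched in Python. This is a simplified model, not €ASM's implementation: the candidate byte strings are copied from the listing above, the function names are hypothetical, and treating CODE=LONG as "the numerically higher opcode" is an assumption inferred from that listing.

```python
# Simplified model of CODE=; candidate encodings are copied from the listing.
CANDIDATES = {
    "INC EBX":     ["43", "FFC3"],
    "RET":         ["C3", "C20000"],
    "ADD EAX,127": ["83C07F", "057F000000"],
}

def opcode(encoding):
    """First byte of the encoding, taken as the operation code."""
    return int(encoding[:2], 16)

def select(instruction, code=None):
    """Pick an encoding: the shortest by default, otherwise by opcode value."""
    variants = CANDIDATES[instruction]
    if code is None:
        return min(variants, key=len)
    if code == "SHORT":
        return min(variants, key=opcode)   # numerically lower opcode
    if code == "LONG":
        return max(variants, key=opcode)   # assumed: numerically higher opcode
    raise ValueError("CODE= must be SHORT or LONG")
```

With this model select("RET", "SHORT") yields C20000 and select("ADD EAX,127", "SHORT") yields 057F000000, which reproduces the two cases where the requested lower opcode costs extra bytes.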
↑ DATA=

This modifier controls operation-size, i.e. the width of data which the instruction operates on. It may be one of BYTE, WORD, DWORD, QWORD, TBYTE, OWORD, YWORD, ZWORD alias B, W, D, Q, T, O, Y, Z. The default is not specified.

Modifier DATA= has the same function as instruction suffix, there are only two differences:

There are two other ways in which the operand width is controlled. If one of the operands is a register, its width prevails and this cannot be overridden with suffix or modifier. When the operand width is determined neither by a register nor by suffix or modifier, €ASM looks at the TYPE# attribute of the target operand.

Priority of operand-size specifications:

  1. Width of register operand
  2. Mnemonics suffix
  3. Modifier DATA=
  4. Memory operand type

See the following examples:

|00000000:00000000 |MemoryVariable DB 0,0,0,0
|00000004:0107 | ADD [EDI],EAX ; Operand width is set by register (32 bits).
|00000006:830701 | ADDD [EDI],1 ; Operand width is set by suffix (32 bits).
|00000009:66830701 | ADD [EDI],1,DATA=W ; Operand width is set by modifier (16 bits).
|0000000D:800701 | ADDB [EDI],1,DATA=W ; Operand width is set by suffix (8 bits). Warning:modifier ignored.
|## W2401 Modifier "DATA=WORD" could not be obeyed in this instruction.
|00000010:660107 | ADDB [EDI],AX ; Operand width is set by register (16 bits). Error:suffix ignored.
|### E6740 Impracticable operand-size requested with mnemonic suffix.
|00000013:8387[00000000]01 | ADDD [EDI+MemoryVariable],1 ; Operand width is set by suffix (32 bits).
|0000001A:668387[00000000]01 | ADD [EDI+MemoryVariable],1,DATA=W ; Operand width is set by modifier (16 bits).
|00000022:8087[00000000]01 | ADD [EDI+MemoryVariable],1 ; Operand width is set by TYPE#MemoryVariable = 'B' (8 bits).
|00000029:800701 | ADD [EDI],1 ; Error:Operand width is not specified.
|### E6730 Operand size could not be determined, please use DATA= modifier.
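
The priority of operand-size specifications can be modelled with a short Python sketch. The function operand_width and its parameters are hypothetical names chosen for illustration; the message identifiers are those shown in the listing above, and the model covers only the situations which that listing demonstrates.

```python
# Hypothetical helper modelling the priority list; not €ASM's actual code.
WIDTH_BITS = {"B": 8, "W": 16, "D": 32, "Q": 64}

def operand_width(register_bits=None, suffix=None, data=None, mem_type=None):
    """Return (bits, messages) using the order register, suffix, DATA=, TYPE#."""
    messages = []
    if register_bits is not None:
        if suffix is not None and WIDTH_BITS[suffix] != register_bits:
            messages.append("E6740")   # suffix conflicts with the register width
        return register_bits, messages
    if suffix is not None:
        if data is not None and data != suffix:
            messages.append("W2401")   # DATA= modifier could not be obeyed
        return WIDTH_BITS[suffix], messages
    if data is not None:
        return WIDTH_BITS[data], messages
    if mem_type is not None:
        return WIDTH_BITS[mem_type], messages
    messages.append("E6730")           # operand size could not be determined
    return None, messages
```

For instance operand_width(suffix="B", data="W") models ADDB [EDI],1,DATA=W and gives 8 bits together with W2401, while operand_width() models ADD [EDI],1 and reports E6730.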
↑ IMM=

Some instructions allow a small immediate value to be encoded as one byte, although they operate with full words. The byte value is sign-extended by CPU at run-time.

Modifier IMM= may have value BYTE, WORD, DWORD, QWORD alias B, W, D, Q and it specifies how the immediate operand should be encoded in the instruction.

|00000000:83D001 | ADC EAX,1
|00000003:83D001 | ADC EAX,1,IMM=BYTE
|00000006:81D001000000 | ADC EAX,1,IMM=DWORD
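
How the two encodings above come about can be shown with a minimal Python sketch. It assumes 32-bit code and covers only ADC EAX,imm; the function name adc_eax_imm and its imm parameter are illustrative, they do not exist in €ASM.

```python
import struct

def adc_eax_imm(value, imm=None):
    """Encode ADC EAX,value; imm is None (automatic), "BYTE" or "DWORD"."""
    fits_byte = -128 <= value <= 127
    if imm is None:
        imm = "BYTE" if fits_byte else "DWORD"   # prefer the shorter form
    modrm = 0xC0 | (2 << 3) | 0   # mod=11, /2 selects ADC, rm=000 is EAX
    if imm == "BYTE":
        if not fits_byte:
            raise ValueError("value does not fit in a sign-extended byte")
        return bytes([0x83, modrm]) + struct.pack("<b", value)
    if imm == "DWORD":
        return bytes([0x81, modrm]) + struct.pack("<i", value)
    raise ValueError("only BYTE and DWORD are modelled here")

print(adc_eax_imm(1).hex().upper(), adc_eax_imm(1, "DWORD").hex().upper())
```

A negative value such as -1 is stored as the single byte FF in the short form; the CPU sign-extends it back to the full operand width.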
↑ DISP=

The displacement portion of an address in some instructions may be encoded into one byte when its value is in the range -128..+127. The byte value is sign-extended by CPU at run-time. Values outside this range are encoded in full size, i.e. as WORD or DWORD, according to the segment width (possibly inverted with ATOGGLE prefix). This is the default behaviour of €ASM. Modifier DISP= can have the same enumerated values as IMM= modifier (BYTE, WORD, DWORD, QWORD alias B, W, D, Q) and it controls whether the displacement is encoded with full size or as a byte.

|00000000:2945FC | SUB [EBP-4],EAX
|00000003:2945FC | SUB [EBP-4],EAX,DISP=BYTE
|00000006:2985FCFFFFFF | SUB [EBP-4],EAX,DISP=DWORD
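
The difference between the two displacement sizes lies in the mod field of the ModR/M byte. The Python sketch below reproduces the listing above for SUB [EBP+disp],EAX in 32-bit code; the function name and the size parameter are hypothetical.

```python
import struct

def sub_ebp_disp_eax(disp, size=None):
    """Encode SUB [EBP+disp],EAX; size is None (automatic), "BYTE" or "DWORD"."""
    if size is None:
        size = "BYTE" if -128 <= disp <= 127 else "DWORD"
    reg, rm = 0, 5                               # reg=000 is EAX, rm=101 is EBP
    if size == "BYTE":
        modrm = (0b01 << 6) | (reg << 3) | rm    # mod=01: disp8 follows
        tail = struct.pack("<b", disp)
    elif size == "DWORD":
        modrm = (0b10 << 6) | (reg << 3) | rm    # mod=10: disp32 follows
        tail = struct.pack("<i", disp)
    else:
        raise ValueError("only BYTE and DWORD are modelled here")
    return bytes([0x29, modrm]) + tail
```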
↑ SCALE=

Scaling means multiplication of the contents of the index register with 0, 1, 2, 4 or 8 at run-time. The SCALE= modifier can be either SMART or VERBATIM (or shortly S, V). Default is SCALE=SMART.
In verbatim mode no optimisation is performed with index and base registers and scaling is encoded in SIB byte even when the scale factor is 1 or 0. Encoding of instruction with SCALE=VERBATIM uses SIB byte, if possible.
In smart mode (default) €ASM tries to rearrange registers and not emit SIB byte unless absolutely necessary.
Here are the "smart" optimisation rules (IR is indexregister, BR is baseregister, disp is displacement):

|00000000:A011000000 | MOV AL,[0x11] ; Special encoding without ModR/M.
|00000005:A011000000 | MOV AL,[0*ESI+0x11] ; Special encoding without ModR/M.
|0000000A:8A042511000000 | MOV AL,[0*ESI+0x11],SCALE=VERBATIM ; ModR/M with SIB. ESI is not used.
|00000011: |
|00000011:8A4611 | MOV AL,[ESI+0x11] ; ModR/M without SIB. ESI is base.
|00000014:8A4611 | MOV AL,[ESI+0x11],SCALE=SMART ; ModR/M without SIB. ESI is base.
|00000017:8A442611 | MOV AL,[ESI+0x11],SCALE=VERBATIM ; ModR/M with SIB, ESI is base.
|0000001B:8A4611 | MOV AL,[1*ESI+0x11] ; ModR/M without SIB. ESI is base.
|0000001E:8A4611 | MOV AL,[1*ESI+0x11],SCALE=SMART ; ModR/M without SIB. ESI is base.
|00000021:8A043511000000 | MOV AL,[1*ESI+0x11],SCALE=VERBATIM ; ModR/M with SIB, ESI is index.
|00000028: |
|00000028:8A443611 | MOV AL,[ESI+ESI+0x11] ; ModR/M with SIB. ESI is base and index.
|0000002C:8A443611 | MOV AL,[2*ESI+0x11] ; ModR/M with SIB. ESI is base and index.
|00000030:8A047511000000 | MOV AL,[2*ESI+0x11],SCALE=VERBATIM ; ModR/M with SIB. ESI is scaled index.
|00000037: |
|00000037:8A442D11 | MOV AL,[EBP+EBP+0x11] ; ModR/M with SIB, EBP is base and index.
|0000003B:8A442D11 | MOV AL,[2*EBP+0x11] ; ModR/M with SIB, EBP is base and index.
|0000003F:8A046D11000000 | MOV AL,[2*EBP+0x11],SCALE=VERBATIM ; ModR/M with SIB, EBP is scaled index.
Notice that optimisation with SCALE=SMART may change the register role (base|index) and consequently the default segment register (SS|DS) used for addressing. This is usually not an issue in flat memory model, otherwise use SCALE=VERBATIM.
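
The SIB-carrying rows of the listing above can be reproduced with a small Python sketch. It is an illustration only: the function mov_al_with_sib is a hypothetical name, it always emits a SIB byte, handles only scale factors 1, 2, 4, 8 and always emits a displacement.

```python
import struct

REG = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3,
       "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}
SCALE_BITS = {1: 0, 2: 1, 4: 2, 8: 3}
NO_INDEX = 4   # index field 100 means "no index register"
NO_BASE = 5    # base field 101 with mod=00 means "no base, disp32 follows"

def mov_al_with_sib(base=None, index=None, scale=1, disp=0):
    """Encode MOV AL,[base+scale*index+disp], always emitting a SIB byte."""
    sib = (SCALE_BITS[scale] << 6) | ((REG[index] if index else NO_INDEX) << 3)
    if base is None:
        mod, sib, tail = 0b00, sib | NO_BASE, struct.pack("<i", disp)
    elif -128 <= disp <= 127:
        mod, sib, tail = 0b01, sib | REG[base], struct.pack("<b", disp)
    else:
        mod, sib, tail = 0b10, sib | REG[base], struct.pack("<i", disp)
    modrm = (mod << 6) | (0 << 3) | 4   # reg=000 is AL, rm=100 announces SIB
    return bytes([0x8A, modrm, sib]) + tail
```

Calling mov_al_with_sib(index="ESI", scale=2, disp=0x11) gives the 7-byte verbatim form 8A047511000000, whereas mov_al_with_sib(base="ESI", index="ESI", disp=0x11) gives the 4-byte form 8A443611 which SCALE=SMART chooses for [2*ESI+0x11].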

When the instruction encoding is displayed with EUROASM DisplayEnc=Yes, modifier SCALE=VERBATIM tells that SIB was actually emitted in this encoding, otherwise SCALE=SMART signals no SIB byte.

↑ DIST=

This modifier specifies the distance of target in control-transfer instructions. It can be one of FAR, NEAR, SHORT alias F, N, S.

DIST=FAR is used when the target is in a different segment and both rIP and CS registers need to be changed.

By default in intrasegment transfers €ASM automatically selects between SHORT and NEAR distance depending on the magnitude of offsets difference.

Modifier DIST= has the same function as instruction suffix, there are only two differences:

Modifier DIST=NEAR or DIST=FAR can be also applied to pseudoinstructions PROC, PROC1. Consequence of making a procedure FAR is that calls and jumps to that procedure will be by default FAR, and that any RET inside this procedure will default to DIST=FAR, too.

|[CODE1] |[CODE1] SEGMENT
|0000:EB2A | JMP CloseLabel: ; Encoded DIST=SHORT.
|0002:E92701 | JMP DistantLabel: ; Encoded DIST=NEAR.
|0005:EA[0000]{0000} | JMP FarLabel: ; Encoded DIST=FAR.
|000A:EB20 | JMP CloseLabel:,DIST=SHORT ; Encoded DIST=SHORT.
|000C:E91D01 | JMP DistantLabel:,DIST=SHORT ; Encoded DIST=NEAR.
|## W2401 Modifier "DIST=SHORT" could not be obeyed in this instruction.
|000F:EA[0000]{0000} | JMP FarLabel:,DIST=SHORT ; Encoded DIST=FAR.
|## W2401 Modifier "DIST=SHORT" could not be obeyed in this instruction.
|0014:E91500 | JMP CloseLabel:,DIST=NEAR ; Encoded DIST=NEAR.
|0017:E91201 | JMP DistantLabel:,DIST=NEAR ; Encoded DIST=NEAR.
|001A:E9(0000) | JMP FarLabel:,DIST=NEAR ; Encoded DIST=NEAR.
|001D:EA[2C00]{0000} | JMP CloseLabel:,DIST=FAR ; Encoded DIST=FAR.
|0022:EA[2C01]{0000} | JMP DistantLabel:,DIST=FAR ; Encoded DIST=FAR.
|0027:EA[0000]{0000} | JMP FarLabel:,DIST=FAR ; Encoded DIST=FAR.
|002C: |CloseLabel:
|002C:90909090909090~~| DB 256 * B 0x90 ; Some stuff to stall off the DistantLabel.
|012C: |DistantLabel:
|[CODE2] |[CODE2] SEGMENT
|0000: |FarLabel:
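
The automatic choice between SHORT and NEAR in the listing above can be sketched in Python for a 16-bit segment. The function jmp16 is a hypothetical name and it assumes that both addresses are already known; a real assembler has to settle label offsets over several passes, because each choice changes the size of the code.

```python
import struct

def jmp16(address, target, dist=None):
    """Encode an intrasegment JMP in a 16-bit segment.

    dist is None (automatic), "SHORT" or "NEAR".
    """
    rel8 = target - (address + 2)      # the short form is 2 bytes long
    if dist in (None, "SHORT") and -128 <= rel8 <= 127:
        return bytes([0xEB]) + struct.pack("<b", rel8)
    # A SHORT request which does not fit falls back to NEAR (€ASM warns W2401).
    rel16 = target - (address + 3)     # the near form is 3 bytes long
    return bytes([0xE9]) + struct.pack("<h", rel16)
```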
↑ ADDR=

This modifier will choose the reference frame of memory addressing in 64bit mode. Allowed values are ABS, REL alias A, R. Number encoded in instruction code with absolute addressing is related to the start of segment, which is always 0 at assembly time.
In relative addressing it is related to the position of the next instruction, i.e. to the contents of register RIP. In legacy modes (16bit, 32bit) the reference frame is hardwired as ADDR=REL in control-transfer instructions (direct JMP, CALL, LOOP, Jcc), and as ADDR=ABS in all other instructions.

RIP-relative addressing is shorter by one byte and it does not require relocation, which saves space in object file and avoids patching of code at load-time. That is why ADDR=REL is preferred by default in 64bit mode.
Explicit selection between absolute and RIP-relative addressing is relevant only in 64bit mode when the absolute address would require relocation at link-time. This happens when the memory variable is specified as displacement of address symbol (not a plain number), and no index or base register is involved in addressing.

|00000000:00000000 | MemDword DD 0
|00000004: |
|00000004:0305F6FFFFFF | ADD EAX,[MemDword] ; Encoded with relative addressing.
|0000000A:0305F0FFFFFF | ADD EAX,[MemDword],ADDR=REL ; Encoded with relative addressing.
|00000010:030425[00000000] | ADD EAX,[MemDword],ADDR=ABS ; Encoded with absolute addressing.
|00000017: |
|00000017:034540 | ADD EAX,[RBP+0x40] ; Encoded with absolute addressing.
|0000001A:034540 | ADD EAX,[RBP+0x40],ADDR=ABS ; Encoded with absolute addressing.
|0000001D:034540 | ADD EAX,[RBP+0x40],ADDR=REL ; Encoded with absolute addressing.
|## W2401 Modifier "ADDR=REL" could not be obeyed in this instruction.
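
The RIP-relative displacements in the listing above follow from the distance between the target and the end of the instruction. A minimal Python sketch, with a hypothetical function name and limited to ADD EAX,[target] in 64-bit mode:

```python
import struct

def add_eax_rip_relative(address, target):
    """Encode ADD EAX,[target] with RIP-relative addressing in 64-bit mode."""
    size = 6                                  # opcode + ModR/M + disp32
    disp = target - (address + size)          # relative to the next instruction
    modrm = (0b00 << 6) | (0 << 3) | 0b101    # mod=00, reg=EAX, rm=101: RIP+disp32
    return bytes([0x03, modrm]) + struct.pack("<i", disp)
```

The same source operand [MemDword] therefore produces a different displacement at every position in the code, while no relocation is needed as long as the variable and the instruction stay in the same segment.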
↑ PREFIX=

The following modifiers apply only to instructions which use Advanced Vector eXtensions (AVX) encoding. Possible values of PREFIX= are XOP, VEX, VEX2, VEX3, MVEX, EVEX (shortcuts are not available).

Most AVX-encodable instructions have their mnemonics prefixed with V~. Some instructions are defined with only one kind of AVX prefix, they don't need explicit modifier. When an instruction can be alternatively encoded with different AVX prefixes, €ASM will by default choose the shortest one.

Prefix VEX exists in two variants: VEX2 and VEX3. The longer encoding (VEX3) is automatically selected when the instruction uses indexregister or baseregister R8..R15 or when it uses opcode from map 0F38 or 0F3A.

Prefix EVEX or MVEX will be selected instead of VEX when the instruction uses register XMM16..XMM31, YMM16..YMM31, ZMM0..ZMM31, K0..K7, or modifier EH=, SAE=, ROUND=, MASK=, ZEROING=, OPER=.

Instructions encodable with both EVEX and MVEX default to PREFIX=EVEX. Software written for Intel® Xeon Phi CPU needs to explicitly request PREFIX=MVEX in each such amphibious instruction. In this case it is useful to disable EVEX with EUROASM EVEX=DISABLED and thus be warned if some MVEX instruction encodes as EVEX by omission. Explicit specification of modifier EH= (which is available with MVEX only) will select MVEX too, and explicit PREFIX=MVEX is not necessary in this case.

CPU features required by using AVX prefix
Prefix | Required EUROASM options
XOP | SIMD=AVX, AMD=ENABLED, XOP=ENABLED
VEX | SIMD=AVX
MVEX | SIMD=AVX512, MVEX=ENABLED
EVEX | SIMD=AVX512, EVEX=ENABLED
|00000000:8FE868CCCB04 | VPCOMB XMM1,XMM2,XMM3,4 ; VPCOMB is defined with XOP only.
|00000006:62F1FA082917 | VMOVNRAPD [RDI],ZMM2 ; VMOVNRAPD is defined with MVEX only.
|0000000C:C5E958CB | VADDPD XMM1,XMM2,XMM3 ; VADDPD is defined with VEX,MVEX,EVEX.
|00000010:C5E958CB | VADDPD XMM1,XMM2,XMM3,PREFIX=VEX
|00000014:C5E958CB | VADDPD XMM1,XMM2,XMM3,PREFIX=VEX2
|00000018:C4E16958CB | VADDPD XMM1,XMM2,XMM3,PREFIX=VEX3
|0000001D:62F1ED0858CB | VADDPD XMM1,XMM2,XMM3,PREFIX=EVEX
|00000023:62F1ED4858CB | VADDPD ZMM1,ZMM2,ZMM3,PREFIX=EVEX
|00000029:62F1E90858CB | VADDPD ZMM1,ZMM2,ZMM3,PREFIX=MVEX
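
The relation between the VEX2 and VEX3 rows of the listing above can be illustrated by building the prefix bytes in Python. The sketch is limited to the VEX prefix; the function name and parameter names are hypothetical and follow the usual field names of the VEX format (R, X, B, W, mmmmm, vvvv, L, pp).

```python
def vex_prefix(vvvv, pp, l=0, w=0, r=0, x=0, b=0, mmmmm=1, force_vex3=False):
    """Build a VEX prefix; the bits R, X, B and the field vvvv are stored inverted."""
    # The two-byte form has no room for X, B, W or an opcode map other than 0F.
    need_vex3 = force_vex3 or x or b or w or mmmmm != 1
    low = ((~vvvv & 0xF) << 3) | (l << 2) | pp
    if not need_vex3:
        return bytes([0xC5, ((~r & 1) << 7) | low])
    byte1 = ((~r & 1) << 7) | ((~x & 1) << 6) | ((~b & 1) << 5) | mmmmm
    return bytes([0xC4, byte1, (w << 7) | low])

# VADDPD XMM1,XMM2,XMM3: vvvv=2 (XMM2), pp=1 (66 prefix), opcode 58, ModR/M CB.
tail = bytes([0x58, 0xCB])
print((vex_prefix(2, 1) + tail).hex().upper())
print((vex_prefix(2, 1, force_vex3=True) + tail).hex().upper())
```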
↑ MASK=

Modifier MASK= (as well as ZEROING=, EH=, SAE=, ROUND=, BCST=, OPER=) is applicable only with Enhanced Advanced Vector eXtensions (EVEX or MVEX). MASK specifies which opcode mask register is used to control which elements (floating-point or integer numbers) should be written to the destination SIMD register. Only elements which have corresponding bits in mask-register set, are written. Other elements are either zeroed (if modifier ZEROING=ON) or left unchanged (ZEROING=OFF).

Possible value of MASK= is K0, K1, K2, K3, K4, K5, K6, K7 or an expression which evaluates to a number 0..7. Default is MASK=0. Opmask register K0 is special, it is treated as if it had all bits set, thus no masking is applied in this case.

↑ ZEROING=

Modifier ZEROING= is boolean, it controls whether elements masked-off by the contents of opmask register should be set to zero or left unchanged, which is called merging. It has no meaning when MASK=K0 or when mask is not specified at all. Default is ZEROING=OFF (merging). Modifier is applicable only with EVEX encoding.

|00000000:C5E958CB | VADDPD XMM1,XMM2,XMM3 ; VADDPD is defined with VEX,MVEX,EVEX.
|00000004:62F1ED0C58CB | VADDPD XMM1,XMM2,XMM3,MASK=4 ; Using MASK= will force EVEX encoding.
|0000000A:62F1ED0C58CB | VADDPD XMM1,XMM2,XMM3,MASK=K4,ZEROING=NO
|00000010:62F1ED8C58CB | VADDPD XMM1,XMM2,XMM3,MASK=K4,ZEROING=YES
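
In the listing above only the fourth byte of the EVEX prefix changes between the masked variants (0C versus 8C). The Python sketch below assembles that byte from its fields; the function evex_p2 and its parameter names are hypothetical, chosen only for this illustration.

```python
def evex_p2(mask=0, zeroing=False, vector_length=0, b=0, v_high=0):
    """Assemble the last EVEX prefix byte: z | L'L | b | V' | aaa."""
    if not 0 <= mask <= 7:
        raise ValueError("opmask register must be K0..K7")
    return ((int(zeroing) << 7)        # z: zeroing (1) or merging (0)
            | (vector_length << 5)     # L'L: 0 = 128, 1 = 256, 2 = 512 bits
            | (b << 4)                 # b: broadcast or static rounding
            | ((~v_high & 1) << 3)     # V': stored inverted
            | mask)                    # aaa: opmask register number

print(hex(evex_p2(mask=4)), hex(evex_p2(mask=4, zeroing=True)))
```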
↑ EH=

Boolean modifier EH= (Eviction Hint) is applicable with MVEX-encoded instructions only. EH=1 informs CPU that the data is non-temporal and unlikely to be reused soon, so there is no benefit in storing it in CPU cache. This concerns register-to-memory instructions only.

Value of EH is also consulted in register-to-register instructions where it will select between swizzle operations and static rounding.

↑ SAE=

If boolean modifier SAE= (Suppress All Exceptions) is switched on, the instruction will not raise any floating-point exception flags, for instance when it operates on a not-a-number value. Instruction with SAE=ON behaves as if all the MXCSR mask bits were set.

In EVEX-encoding SAE is by default enabled whenever static rounding is used, this behaviour cannot be switched off.

↑ ROUND=

Modifier ROUND= specifies static rounding mode, it is applicable to EVEX and MVEX instructions with rounding semantics, for instance for conversion from double to single-precision FP numbers. It has four possible enumerated values: NEAR, UP, DOWN, ZERO alias N, U, D, Z.

Static rounding is available only in ZMM register-to-register operations (not if one of the operands is in memory or when XMM and YMM registers are used). Default is no rounding, in this case general rounding mode controlled by RM bits in MXCSR applies.

↑ BCST=

Boolean modifier BCST= can be used to enable data broadcasting in operations which load data from memory. When BCST=ENABLED, memory source operand specifies only one element and its contents will be broadcast (copied) to all positions of destination register.

Default is BCST=OFF. Broadcasting cannot be used with register-to-register operations.

|00000000:62F16C48590E | VMULPS ZMM1,ZMM2,[RSI] ; Multiply 16 DWORD FP numbers in ZMM2 with 16 DWORD FP numbers at [RSI], store 16 products to ZMM1.
|00000006:62F16C58590E | VMULPS ZMM1,ZMM2,[RSI],BCST=ON ; Multiply 16 DWORD FP numbers in ZMM2 with the same DWORD FP number at [RSI], store 16 products to ZMM1.
|0000000C:62F16C4859CB | VMULPS ZMM1,ZMM2,ZMM3 ; Multiply 16 DWORD FP numbers in ZMM2 with 16 DWORD FP numbers in ZMM3, store 16 products to ZMM1.
|00000012:62F16C7859CB | VMULPS ZMM1,ZMM2,ZMM3,ROUND=ZERO ; Ditto, truncate each product toward zero.
↑ OPER=

Instruction modifier OPER= encodes the kind of operation performed with the source operand at run-time. Affected operations are broadcasting, rounding, conversion, swizzling. Possible value is a numeric expression which evaluates to 0..7.

Value of operation will be encoded in bits 6, 5, 4 of 32bit prefix EVEX or MVEX. These bits are named S2, S1, S0 in MVEX specification [IntelMVEX], and L', L, b in EVEX specification [IntelAVX512]. The same bits are also affected by modifiers BCST=, ROUND=, SAE= and by SIMD register width, but direct OPER= specification has higher priority when a conflict occurs.

Modifier OPER= is the only way to request a special conversion or swizzle (shuffle) operation for MVEX-encoded instructions available on Intel® Xeon Phi CPU. Not all operation values from the table below are available with all MVEX instructions, documentation in [IntelMVEX] should always be consulted prior to using OPER=.

MVEX-encoded operations
OPER= | register-to-register, EH=0 | register-to-register, EH=1 | memory-to-register | register-to-memory
0 | no swizzle {dcba} | ROUND=NEAR,SAE=NO | no operation | no conversion
1 | swap (inner) pairs {cdab} | ROUND=DOWN,SAE=NO | bcst 1 element {1to16} or {1to8} | not available
2 | swap with two-away {badc} | ROUND=UP,SAE=NO | bcst 4 elements {4to16} or {4to8} | not available
3 | cross-product swizzle {dacb} | ROUND=ZERO,SAE=NO | convert from {float16} | convert to {float16}
4 | bcst a element across 4 {aaaa} | ROUND=NEAR,SAE=YES | convert from {uint8} | convert to {uint8}
5 | bcst b element across 4 {bbbb} | ROUND=DOWN,SAE=YES | convert from {sint8} | convert to {sint8}
6 | bcst c element across 4 {cccc} | ROUND=UP,SAE=YES | convert from {uint16} | convert to {uint16}
7 | bcst d element across 4 {dddd} | ROUND=ZERO,SAE=YES | convert from {sint16} | convert to {sint16}

EVEX-encoded operations
OPER= | register-to-register | memory-to-register
0 | DATA=OWORD,SAE=NO | DATA=OWORD,BCST=OFF
1 | DATA=ZWORD,SAE=YES,ROUND=NEAR | DATA=OWORD,BCST=ON
2 | DATA=YWORD,SAE=NO | DATA=YWORD,BCST=OFF
3 | DATA=ZWORD,SAE=YES,ROUND=DOWN | DATA=YWORD,BCST=ON
4 | DATA=ZWORD,SAE=NO | DATA=ZWORD,BCST=OFF
5 | DATA=ZWORD,SAE=YES,ROUND=UP | DATA=ZWORD,BCST=ON
6 | reserved | reserved
7 | DATA=ZWORD,SAE=YES,ROUND=ZERO | reserved
|00000000:62F16908DB4D01<6 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=0 ; No broadcast {16to16}.
|00000007:62F16918DB4D10<2 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=1 ; Broadcast one element {1to16}.
|0000000E:62F16928DB4D04<4 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=2 ; Broadcast four elements {4to16}.
|00000015:62F16948DB4D04<4 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=4 ; Convert from {uint8}.
|0000001C:62F16958DB4D04<4 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=5 ; Convert from {sint8}.
|00000023:62F16968DB4D02<5 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=6 ; Convert from {uint16}.
|0000002A:62F16978DB4D02<5 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=7 ; Convert from {sint16}.
|00000031: |
|00000031:62F1F9085A4D01<6 | VCVTPD2PS ZMM1,[RBP+40h],PREFIX=MVEX,OPER=0 ; No broadcast {8to8}.
|00000038:62F1F9185A4D08<3 | VCVTPD2PS ZMM1,[RBP+40h],PREFIX=MVEX,OPER=1 ; Broadcast one element {1to8}.
|0000003F:62F1F9285A4D02<5 | VCVTPD2PS ZMM1,[RBP+40h],PREFIX=MVEX,OPER=2 ; Broadcast four elements {4to8}.
|00000046: |
|00000046:62F1F9085ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=0 ; No swizzle {dcba}.
|0000004C:62F1F9185ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=1 ; Swap (inner) pairs {cdab}.
|00000052:62F1F9285ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=2 ; Swap with two-away {badc}.
|00000058:62F1F9385ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=3 ; Cross-product swizzle {dacb}.
|0000005E:62F1F9485ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=4 ; Broadcast a element to 4 {aaaa}.
|00000064:62F1F9585ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=5 ; Broadcast b element to 4 {bbbb}.
|0000006A:62F1F9685ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=6 ; Broadcast c element to 4 {cccc}.
|00000070:62F1F9785ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=7 ; Broadcast d element to 4 {dddd}.
|00000076: |
|00000076:62F1F9885ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=0 ; ROUND=NEAR,SAE=OFF {rn}.
|0000007C:62F1F9985ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=1 ; ROUND=DOWN,SAE=OFF {rd}.
|00000082:62F1F9A85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=2 ; ROUND=UP,SAE=OFF {ru}.
|00000088:62F1F9B85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=3 ; ROUND=ZERO,SAE=OFF {rz}.
|0000008E:62F1F9C85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=4 ; ROUND=NEAR,SAE=ON {rn-sae}.
|00000094:62F1F9D85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=5 ; ROUND=DOWN,SAE=ON {rd-sae}.
|0000009A:62F1F9E85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=6 ; ROUND=UP,SAE=ON {ru-sae}.
|000000A0:62F1F9F85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=7 ; ROUND=ZERO,SAE=ON {rz-sae}.
|000000A6:62F1F9F85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,ROUND=ZERO,SAE=ON
|000000AC:62F1F9F85ACA | VCVTPD2PS ZMM1,ZMM2,EH=1,ROUND=ZERO,SAE=ON
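
The last three rows of the listing above show that ROUND=ZERO,SAE=ON and EH=1,OPER=7 produce the same code. The Python sketch below models this correspondence for MVEX register-to-register operations, using the table of MVEX-encoded operations. The function names are hypothetical, and the bit positions (EH in bit 7, OPER in bits 6..4, bit 3 set) are read off the fourth prefix byte of the listing.

```python
ROUND_INDEX = {"NEAR": 0, "DOWN": 1, "UP": 2, "ZERO": 3}

def mvex_round_oper(rounding, sae):
    """OPER value of an MVEX register-to-register operation with EH=1."""
    return ROUND_INDEX[rounding] + (4 if sae else 0)

def mvex_last_prefix_byte(eh, oper, mask=0):
    """EH in bit 7, OPER in bits 6..4, bit 3 set as in the listing, mask in bits 2..0."""
    return (eh << 7) | (oper << 4) | 0x08 | mask

oper = mvex_round_oper("ZERO", sae=True)
print(oper, hex(mvex_last_prefix_byte(1, oper)))
```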
↑ ALIGN=

Alignment request may be applied to any machine instruction, and to pseudoinstructions D, PROC, PROC1, STRUC. See the alignment paragraph for accepted values. This instruction modifier has the same effect as if explicit pseudoinstruction ALIGN was placed above the statement.

↑ NESTINGCHECK=

This is a pseudoinstruction modifier, it can be applied only to pseudoinstructions PROC, ENDPROC, PROC1, ENDPROC1. Its value is boolean, default is NESTINGCHECK=ON. Switching the nesting check off will suppress the error message on block mismatch. This makes it possible to open a block in one macro and close it in another when macros enhance some block pseudoinstructions. See the definitions of macros Procedure and EndProcedure as an example.

↑ Instruction enhancements

FPU instruction default registers ↓
String instructions operands ↓
XLAT with nondefault [segment:base] ↓
LOOP with nondefault counter ↓
Near and far LOOP and JrCXZ ↓
Near and far Jcc ↓
PUSH, POP, INC, DEC multiple operands ↓
AAD, AAM operand ↓
TEST a register by itself ↓
Shift and rotate 2nd operand ↓
No-operation ↓
PINSR register source ↓
BLENDVPD, BLENDVPS, PBLENDVB 3rd operand ↓
MASKMOVQ, MASKMOVDQU 1st operand ↓
VERR, VERQ, LAR, LSL ↓

Some instructions in Intel architecture work with registers fixed by design. €ASM accepts an optional explicit specification of such registers, which serves as documentation for the human reader and sometimes may be exploited as address-size definition and/or segment override.

↑ FPU instruction default registers

Unary FPU instructions with implicit destination ST0 may explicitly name this register as the first operand, or it may be omitted. In many other FPU instructions default destination is ST0 and default source is ST1, in which case one or both operands may be omitted. See also handlers of instructions FNOP, FCMOVB, FADD, FIADD, FADDP, FXCH, FCOM.

|00000000:000000000000F03F |Mem DQ 1.0
|00000008: |
|00000008:DAC1 | FCMOVB ; ST0 = ST1 if Below.
|0000000A:DAC1 | FCMOVB ST0,ST1 ; ST0 = ST1 if Below.
|0000000C: |
|0000000C:DAC7 | FCMOVB ST0,ST7 ; ST0 = ST7 if Below.
|0000000E:DAC7 | FCMOVB ST7 ; ST0 = ST7 if Below.
|00000010: |
|00000010:D8C1 | FADD ; ST0 += ST1.
|00000012:D8C1 | FADD ST0,ST1 ; ST0 += ST1.
|00000014: |
|00000014:DC05[00000000] | FADD ST0,[Mem] ; ST0 += [Mem].
|0000001A:DC05[00000000] | FADD [Mem] ; ST0 += [Mem].
|00000020: |
|00000020:DCC7 | FADD ST7,ST0 ; ST7 += ST0.
|00000022:DCC7 | FADD ST7 ; ST7 += ST0.
|00000024: |
|00000024:D9E9 | FLDL2T ; ST0 = log2(10).
|00000026:D9E9 | FLDL2T ST0 ; ST0 = log2(10).
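
The operand defaults of the register forms of FADD in the listing above can be modelled in a few lines of Python. The function fadd is a hypothetical helper which covers only the register-to-register forms; the memory form is left out.

```python
def fadd(dst=0, src=None):
    """Encode register-form FADD STdst,STsrc with omitted operands defaulted."""
    if src is None:
        src = 1 if dst == 0 else 0         # FADD -> ST0,ST1 ; FADD ST7 -> ST7,ST0
    if dst == 0:
        return bytes([0xD8, 0xC0 + src])   # ST0 += STi
    if src == 0:
        return bytes([0xDC, 0xC0 + dst])   # STi += ST0
    raise ValueError("one operand of FADD must be ST0")

print(fadd().hex().upper(), fadd(7).hex().upper())
```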
↑ String instructions operands

String instructions implicitly address the source as memory [DS:rSI] or port DX, and the destination as memory [ES:rDI] or port DX. Besides the operand-less version, €ASM accepts operand(s) explicitly presenting source and destination, with possible segment override and address-size change.

|00000000:AC | LODSB
|00000001:AC | LODSB [DS:ESI] ; Default segment is DS, address-size is 32.
|00000002:2EAC | LODSB [CS:ESI] ; Segment override.
|00000004:67AC | LODSB [SI] ; Address-size changed.
|00000006: |
|00000006:AA | STOSB
|00000007:AA | STOSB [EDI]
|00000008: |
|00000008:AE | SCASB
|00000009:AE | SCASB [EDI]
|0000000A: |
|0000000A:A5 | MOVSD
|0000000B:A5 | MOVSD [EDI],[ESI]
|0000000C:2667A5 | MOVSD [DI],[ES:SI] ; Address-size and source segment changed.
|0000000F: |
|0000000F:666D | INSW
|00000011:666D | INSW [ES:EDI],DX
|00000013: |
|00000013:6E | OUTSB
|00000014:6E | OUTSB DX,[DS:ESI]
|00000015:2E6E | OUTSB DX,[CS:ESI] ; Source segment changed.
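
The explicit operand of a string instruction influences only the prefixes emitted before the opcode. The following Python sketch shows this for LODSB; the function name and its parameters are hypothetical, and the prefix order (segment override first, then address-size) is taken from the MOVSD [DI],[ES:SI] row of the listing above.

```python
SEGMENT_PREFIX = {"ES": 0x26, "CS": 0x2E, "SS": 0x36,
                  "DS": 0x3E, "FS": 0x64, "GS": 0x65}
ADDRESS_SIZE_PREFIX = 0x67

def lodsb(segment="DS", address_bits=32, mode_bits=32):
    """Encode LODSB [segment:rSI] assembled in a segment of width mode_bits."""
    code = bytearray()
    if segment != "DS":                 # DS is the default source segment
        code.append(SEGMENT_PREFIX[segment])
    if address_bits != mode_bits:       # e.g. [SI] used in 32-bit code
        code.append(ADDRESS_SIZE_PREFIX)
    code.append(0xAC)                   # opcode of LODSB
    return bytes(code)
```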
↑ XLAT with nondefault [segment:base]

Default translation table is implicitly addressed with [DS:rBX]. €ASM accepts optional memory operand which can specify nondefault segment override and nondefault rBX width.

↑ LOOP with nondefault counter

LOOP count register can be specified as the optional second operand.

|00000000:D7     | XLAT
|00000001:D7     | XLATB ; XLAT and XLATB are identical.
|00000002:D7     | XLATB [DS:EBX] ; Segment DS is default, no override necessary.
|00000003:26D7   | XLATB [ES:EBX]
|00000005:67D7   | XLATB [BX]
|00000007:       |
|00000007:E2F6   | LOOP $-8
|00000009:E2F6   | LOOP $-8,ECX ; Default counter in 32bit mode is ECX.
|0000000B:67E2F5 | LOOP $-8,CX ; Counter register (its address-size) changed to 16 bit.
↑ Near and far LOOP and JrCXZ

Looping is not limited to short-range distance in €ASM. When the destination of LOOP, LOOPcc, JCXZ, JECXZ, JRCXZ is far or near (out of byte range), €ASM will assemble three instructions instead:

LOOP $+2+2 ; Loop to the proxy-jump instead of the original destination.
JMPS $+JMPSsize+JMPsize ; Skip the proxy-jump when the loop has finished (rCX is zero).
JMP target ; Near or far unconditional proxy-jump to the original destination.

|[CODE1]                            |[CODE1] SEGMENT
|00000000:E366                      | JECXZ CloseLabel:
|00000002:E364                      | JECXZ CloseLabel:,DIST=SHORT
|00000004:E302EB05E95B000000        | JECXZ CloseLabel:,DIST=NEAR
|0000000D:E302EB07EA[68000000]{0000}| JECXZ CloseLabel:,DIST=FAR
|00000018:                          |
|00000018:E302EB05E947010000        | JECXZ DistantLabel:
|00000021:E302EB05E93E010000        | JECXZ DistantLabel:,DIST=SHORT
|## W2401 Modifier "DIST=SHORT" could not be obeyed in this instruction.
|0000002A:E302EB05E935010000        | JECXZ DistantLabel:,DIST=NEAR
|00000033:E302EB07EA[68010000]{0000}| JECXZ DistantLabel:,DIST=FAR
|0000003E:                          |
|0000003E:E302EB07EA[00000000]{0000}| JECXZ FarLabel:
|00000049:E302EB07EA(00000000){0000}| JECXZ FarLabel:,DIST=SHORT
|## W2401 Modifier "DIST=SHORT" could not be obeyed in this instruction.
|00000054:E302EB05E9(00000000)      | JECXZ FarLabel:,DIST=NEAR
|0000005D:E302EB07EA[00000000]{0000}| JECXZ FarLabel:,DIST=FAR
|00000068:                          |CloseLabel:
|00000068:909090909090909090909090~~| DB 256 * B 0x90 ; Some stuff to stall off the DistantLabel.
|00000168:                          |DistantLabel:
|[CODE2]                            |[CODE2] SEGMENT
|00000000:                          |FarLabel:
↑ Near and far Jcc

Conditional jump to a distance exceeding the byte limit -128..127 was introduced with the 386 CPU. When the program is intended to run on older processors as well, a near or far conditional jump Jcc target will be assembled by €ASM as two instructions:

J!cc $+J!ccsize+JMPsize ; Skip the proxy-jump if inverted condition is true.
JMP target ; Near or far unconditional proxy-jump to the original destination.

Near proxy-jump instead of standard 386 near conditional jump is assembled when these three conditions are met:

  1. distance to target is out of byte range,
  2. segment width is 16,
  3. EUROASM option CPU= is 286 or lower.
|[CODE1]                |[CODE1] SEGMENT WIDTH=16
|                       | EUROASM CPU=386
|0000:7419              | JE CloseLabel: ; Standard short conditional jump.
|0002:0F841501          | JE DistantLabel: ; Standard near conditional jump, available on CPU=386 and newer.
|0006:7505EA[0000]{0000}| JE FarLabel: ; Far unconditional proxy-jump skipped by inverted-condition J!cc.
|                       | EUROASM CPU=086 ; Following instructions should run on old PC/XT machine, too.
|000D:740C              | JE CloseLabel: ; Standard short conditional jump.
|000F:7503E90701        | JE DistantLabel: ; Near unconditional proxy-jump skipped by inverted-condition J!cc.
|0014:7505EA[0000]{0000}| JE FarLabel: ; Far unconditional proxy-jump skipped by inverted-condition J!cc.
|001B:                  |CloseLabel:
|001B:9090909090909090~~| DB 256 * B 0x90 ; Some stuff to stall off the DistantLabel.
|011B:                  |DistantLabel:
|[CODE2]                |[CODE2] SEGMENT
|0000:                  |FarLabel:
↑ PUSH, POP, INC, DEC multiple operands

In many assemblers instructions PUSH, POP, INC, DEC may have just one operand. €ASM does not limit the number of operands; they are performed one by one in the specified order. If an instruction modifier or suffix is used, it applies to all operands.

|00000000:57FF370FA06A04 | PUSH EDI,[EDI],FS,4
|00000007:590FA18F0658   | POP ECX,FS,[ESI],EAX
|0000000D:40FF07         | INC EAX,[EDI],DATA=DWORD
|00000010:48664AFEC9     | DEC EAX,DX,CL

↑ AAD, AAM operand

Instructions AAD and AAM by default use radix 10 for adjusting AL before division or after multiplication of unpacked binary-coded decimal (BCD) numbers. In €ASM they accept an optional 8bit immediate operand which specifies the radix, for instance AAD 16.

|00000000:D40A | AAM
|00000002:D40A | AAM 10
|00000004:D410 | AAM 16
|00000006:D50A | AAD
|00000008:D50A | AAD 10
|0000000A:D510 | AAD 16

↑ TEST a register by itself

When both operands in TEST instruction specify the same register, the second operand may be omitted.

↑ Shift and Rotate 2nd operand

When the number of bits to rotate or shift in instructions RCL, ROL, SAL, SHL, RCR, ROR, SAR, SHR is equal to 1, the second operand may be omitted.

|00000000:85D2   | TEST EDX,EDX
|00000002:85D2   | TEST EDX ; Operand2 of TEST is by default identical with Operand1.
|00000004:       |
|00000004:D1D0   | RCL EAX,1
|00000006:D1D0   | RCL EAX ; Omitted rotate/shift count defaults to 1.
|00000008:D165F8 | SHL [EBP-8],1,DATA=DWORD
|0000000B:D165F8 | SHL [EBP-8],DATA=DWORD
↑ No-operation

An instruction which does nothing (no-operation) except for taking some time and incrementing the instruction-pointer register is implemented in all x86 processors as one-byte NOP, actually XCHG rAX,rAX (opcode 0x90). With Pentium II (CPU=686) Intel proposed dedicated multibyte no-operation instructions with opcodes 0x18..0x1F prefixed with 0x0F. A multibyte NOP is more suitable for alignment purposes than a series of one-byte NOPs, as it's fetched and executed at once. On older CPUs this real NOP must be emulated with legacy instructions, e.g. XCHG reg,reg or LEA reg,[reg].

[Sandpile] and [NasmInsns] define real-NOP mnemonics as undocumented instructions HINT_NOP0, HINT_NOP1, HINT_NOP2..63 with one memory operand of desired length. Instead of cluttering the instruction list with 64 new mnemonics, €ASM implements just one mnemonic HINT_NOP (suffixable as HINT_NOPW, HINT_NOPD, HINT_NOPQ) with the ordinal number defined in the first immediate operand, and the memory specification moved to the 2nd operand.

|00000000:0F18D9             | HINT_NOP 03q,ECX
|00000003:660F18E1           | HINT_NOP 04q,CX
|00000007:66670F182C         | HINT_NOPW 05q,[SI]
|0000000C:66670F187400       | HINT_NOPW 06q,[SI],DISP=BYTE
|00000012:0F18BE00000000     | HINT_NOPD 07q,[ESI],DISP=DWORD
|00000019:0F19043500000000   | HINT_NOPD 10q,[1*ESI],DISP=DWORD,SCALE=VERBATIM
|00000021:                   |
|00000021:90                 | NOP1
|00000022:6690               | NOP2
|00000024:0F1F00             | NOP3
|00000027:0F1F4000           | NOP4
|0000002B:0F1F442000         | NOP5
|00000030:660F1F442000       | NOP6
|00000036:0F1F8000000000     | NOP7
|0000003D:0F1F842000000000   | NOP8
|00000045:660F1F842000000000 | NOP9

Beside that, €ASM implements operandless instructions NOP1, NOP2, NOP3, NOP4, NOP5, NOP6, NOP7, NOP8, NOP9 which occupy the specified number of bytes, regardless of current CPU mode and level:

No-operation encoding

Mnemonic | Operation code (hexa) | Equivalent instruction in €ASM syntax

16bit mode, CPU=086
NOP1 | 90 | XCHG AX,AX
NOP2 | 87C9 | XCHG CX,CX
NOP3 | 9087C9 | XCHG AX,AX ; XCHG CX,CX
NOP4 | 87C987D2 | XCHG CX,CX ; XCHG DX,DX
NOP5 | 9087C987D2 | XCHG AX,AX ; XCHG CX,CX ; XCHG DX,DX
NOP6 | 87C987D287DB | XCHG CX,CX ; XCHG DX,DX ; XCHG BX,BX
NOP7 | 9087C987D287DB | XCHG AX,AX ; XCHG CX,CX ; XCHG DX,DX ; XCHG BX,BX
NOP8 | 87C987D287DB87E4 | XCHG CX,CX ; XCHG DX,DX ; XCHG BX,BX ; XCHG SP,SP
NOP9 | 9087C987D287DB87E4 | XCHG AX,AX ; XCHG CX,CX ; XCHG DX,DX ; XCHG BX,BX ; XCHG SP,SP

16bit mode, CPU=686
NOP1 | 90 | NOP DATA=WORD
NOP2 | 6690 | OTOGGLE NOP
NOP3 | 666790 | OTOGGLE ATOGGLE NOP
NOP4 | 670F1F00 | NOP [EAX],DATA=WORD
NOP5 | 670F1F4000 | NOP [EAX],DATA=WORD,DISP=BYTE
NOP6 | 670F1F442000 | NOP [EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=BYTE
NOP7 | 66670F1F442000 | NOP [EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP8 | 670F1F8000000000 | NOP [EAX],DATA=WORD,DISP=DWORD
NOP9 | 670F1F842000000000 | NOP [EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=DWORD

32bit mode, CPU=386
NOP1 | 90 | XCHG EAX,EAX,DATA=DWORD
NOP2 | 6690 | XCHG AX,AX,DATA=WORD
NOP3 | 8D4000 | LEA EAX,[EAX],DATA=DWORD
NOP4 | 8D442000 | LEA EAX,[EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP5 | 3E8D442000 | LEA EAX,[DS:EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP6 | 8D8000000000 | LEA EAX,[EAX],DATA=DWORD,DISP=DWORD
NOP7 | 8D842000000000 | LEA EAX,[EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=DWORD
NOP8 | 3E8D842000000000 | LEA EAX,[DS:EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=DWORD
NOP9 | 663E8D842000000000 | LEA AX,[DS:EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=DWORD

32bit mode, CPU=686
NOP1 | 90 | NOP DATA=DWORD
NOP2 | 6690 | NOP DATA=WORD
NOP3 | 0F1F00 | NOP [EAX],DATA=DWORD
NOP4 | 0F1F4000 | NOP [EAX],DATA=DWORD,DISP=BYTE
NOP5 | 0F1F442000 | NOP [EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP6 | 660F1F442000 | NOP [EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=BYTE
NOP7 | 0F1F8000000000 | NOP [EAX],DATA=DWORD,DISP=DWORD
NOP8 | 0F1F842000000000 | NOP [EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=DWORD
NOP9 | 660F1F842000000000 | NOP [EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=DWORD

64bit mode, CPU=X64
NOP1 | 90 | NOP DATA=DWORD
NOP2 | 6690 | NOP DATA=WORD
NOP3 | 0F1F00 | NOP [RAX],DATA=DWORD
NOP4 | 0F1F4000 | NOP [RAX],DATA=DWORD,DISP=BYTE
NOP5 | 0F1F442000 | NOP [RAX+0*RAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP6 | 660F1F442000 | NOP [RAX+0*RAX],DATA=WORD,SCALE=VERBATIM,DISP=BYTE
NOP7 | 0F1F8000000000 | NOP [RAX],DATA=DWORD,DISP=DWORD
NOP8 | 0F1F842000000000 | NOP [RAX+0*RAX],DATA=DWORD,SCALE=VERBATIM,DISP=DWORD
NOP9 | 660F1F842000000000 | NOP [RAX+0*RAX],DATA=WORD,SCALE=VERBATIM,DISP=DWORD

↑ PINSR register source

Instructions PINSRB, PINSRW, PINSRD (insert Byte/Word/Dword into destination register XMM) accept as source register (operand 2) not only GPR with the corresponding width, but any wider register. Only lowest byte/word/dword from this register is used.

|00000000:660F3A20C902 | PINSRB XMM1,CL,2
|00000006:660F3A20C902 | PINSRB XMM1,CX,2
|0000000C:660F3A20C902 | PINSRB XMM1,ECX,2
|00000012:             |
|00000012:660FC4C902   | PINSRW XMM1,CX,2
|00000017:660FC4C902   | PINSRW XMM1,ECX,2
↑ BLENDVPS, BLENDVPD, PBLENDVB 3rd operand

Instruction for variable blending uses fixed implied register XMM0 as a mask register. €ASM allows explicit specification of XMM0 as the third operand.

|00000000:660F3815CA | BLENDVPD XMM1,XMM2
|00000005:660F3815CA | BLENDVPD XMM1,XMM2,XMM0
|0000000A:           |
|0000000A:660F3814CA | BLENDVPS XMM1,XMM2
|0000000F:660F3814CA | BLENDVPS XMM1,XMM2,XMM0
|00000014:           |
|00000014:660F3810CA | PBLENDVB XMM1,XMM2
|00000019:660F3810CA | PBLENDVB XMM1,XMM2,XMM0
↑ MASKMOVQ, MASKMOVDQU 1st operand

Maskable copy to memory uses [DS:rDI] as fixed destination. €ASM allows explicit specification of destination memory as the optional first operand.

|00000000:0FF7CA     | MASKMOVQ MM1,MM2
|00000003:0FF7CA     | MASKMOVQ [DS:EDI],MM1,MM2 ; Default destination is [DS:EDI].
|00000006:260FF7CA   | MASKMOVQ [ES:EDI],MM1,MM2 ; Segment override.
|0000000A:           |
|0000000A:660FF7CA   | MASKMOVDQU XMM1,XMM2
|0000000E:660FF7CA   | MASKMOVDQU [DS:EDI],XMM1,XMM2 ; Default destination is [DS:EDI].
|00000012:26660FF7CA | MASKMOVDQU [ES:EDI],XMM1,XMM2 ; Segment override.
↑ VERR, VERW, LAR, LSL

Segment descriptor in system instruction VERR, VERW (operand 1) and LAR, LSL (operand 2) may be specified as 16bit memory variable or 16, 32 or 64bit GPR (only lower 16 bits are used).

|00000000:0F00E6   | VERR SI
|00000003:0F00E6   | VERR ESI
|00000006:         |
|00000006:0F00EE   | VERW SI
|00000009:0F00EE   | VERW ESI
|0000000C:         |
|0000000C:660F02C6 | LAR AX,SI
|00000010:660F02C6 | LAR AX,ESI
|00000014:0F02C6   | LAR EAX,SI
|00000017:0F02C6   | LAR EAX,ESI
|0000001A:         |
|0000001A:660F03C6 | LSL AX,SI
|0000001E:660F03C6 | LSL AX,ESI
|00000022:0F03C6   | LSL EAX,SI
|00000025:0F03C6   | LSL EAX,ESI

Undocumented instructions ↓

€ASM supports a few instructions which are not documented in the official specifications published by CPU manufacturers. They may not work with all processor generations and they must be explicitly enabled with EUROASM UNDOC=ENABLED.

For more information see instruction handlers BB0_RESET, CMPXCHG486, F4X4, FCOM2, FCOMP5, FFREEP, FMUL4X4, FNSETPM, FRSTPM, FSBP1, FSBP2, FSBP3, FSTDW, FSTP1, FSTP8, FSTP9, FSTSG, FXCH4, FXCH7, HCF, HINT_NOP, IBTS, ICEBP, INT1, JMPE, LOADALL, LOADALL286, PREFETCHWT1, PSRAQ, SAL2, SALC, SETALC, SMINTOLD, TEST2, UD0, UD1, UD2A, UMOV, XBTS, VLDQQU.

↑ Pseudoinstructions

ALIGN ↓

D, DB, DU, DW, DD, DQ, DT, DO, DY, DZ, DI, DS ↓

ENDHEAD ↓

ENDP ↓

ENDP1 ↓

ENDPROC ↓

ENDPROC1 ↓

ENDPROGRAM ↓

ENDSTRUC ↓

EQU ↓

= ↓

EUROASM ↓

EXTERN ↓

EXPORT ↓

GLOBAL ↓

GROUP ↓

HEAD ↓

IMPORT↓

INCLUDE ↓

INCLUDE1 ↓

INCLUDEBIN ↓

INCLUDEHEAD ↓

INCLUDEHEAD1 ↓

LINK ↓

PROC ↓

PROC1 ↓

PROGRAM ↓

PUBLIC ↓

SEGMENT ↓

STRUC ↓

%COMMENT ↓

%DEBUG ↓

%DISPLAY ↓

%DROPMACRO ↓

%ELSE ↓

%ENDCOMMENT ↓

%ENDFOR ↓

%ENDIF ↓

%ENDMACRO ↓

%ENDREPEAT ↓

%ENDWHILE

%ERROR ↓

%EXITFOR ↓

%EXITMACRO ↓

%EXITREPEAT ↓

%EXITWHILE

%FOR ↓

%IF ↓

%MACRO ↓

%PROFILE ↓

%REPEAT ↓

%SET ↓

%SETA ↓

%SETB ↓

%SETC ↓

%SETE ↓

%SETL ↓

%SETS ↓

%SETX ↓

%SET2 ↓

%SHIFT ↓

%UNTIL ↓

%WHILE

Pseudoinstructions (sometimes also called directives) are orders for the assembler which are formally similar to ordinary machine instructions — many of them may have label field and operands. Some pseudoinstructions (ALIGN and D) may even emit data or code.

Pseudoinstruction names and their keyword operands are case-insensitive.

↑ EUROASM

AUTOALIGN= ↓
AUTOSEGMENT= ↓
CODEPAGE= ↓
CPU= ↓
CPU features ABM=, AES=, AMD=, AVX=, AVX512=, CYRIX=, D3NOW=, EVEX=, FMA=, FPU=, LWP=, MMX=, MPX=, MVEX=, PRIV=, PROT=, RTM=, SGX=, SHA=, SPEC=, SVM=, TBM=, TSX=, UNDOC=, VIA=, VMX=, XOP= ↓
DEBUG= ↓
DISPLAYENC= ↓
DISPLAYSTM= ↓
DUMPALL= ↓
DUMP= ↓
DUMPWIDTH= ↓
INCLUDEPATH= ↓
LINKPATH= ↓
LIST= ↓
LISTFILE= ↓
LISTINCLUDE= ↓
LISTMACRO= ↓
LISTREPEAT= ↓
LISTVAR= ↓
MAXINCLUSIONS= ↓
MAXLINKS= ↓
NOWARN= ↓
PROFILE= ↓
SIMD= ↓
UNICODE= ↓
WARN= ↓

With pseudoinstruction EUROASM programmer controls various settings of EuroAssembler - EUROASM options. Particular options are set with keyword operands. The same keywords are used in [EUROASM] section of euroasm.ini configuration file.

Options specified with this pseudoinstruction override the default options set in the configuration file. Names of options are case-insensitive.

The value currently in effect can be retrieved in the form of EUROASM system %^variables.

Options which expect Boolean value may be provided with enumerated tokens TRUE, YES, ON, ENABLE, ENABLED or FALSE, NO, OFF, DISABLE, DISABLED (case insensitive) or they may contain logical expression.

Beside keyword options the EUROASM pseudoinstruction also recognizes ordinal operand(s) which may have one of the enumerated tokens PUSH or POP. €ASM maintains a special option stack, and these two directives save the whole set of EUROASM options to this stack and restore it from there. This feature is handy in macros which temporarily require some unusual option value. Blindly setting the option in a macro would have a side effect on the statements following the macro invocation, because EUROASM is a switching statement. So it is better to save the current options on the stack at the beginning of the macro and restore them at the end; other statements will not be influenced. Example:

SomeMacro %MACRO  ; Macro definition.
            EUROASM PUSH, NOWARN=1234 ; Store all options to option-stack and then suppress warning W1234.
             ; Here go instructions which may emit warning message W1234
             ...
            EUROASM POP ; Restore option-stack, W1234 is no longer suppressed.
          %ENDMACRO SomeMacro
↑ AUTOALIGN=

Boolean option; default is AUTOALIGN=ON. Memory variables created or reserved with D pseudoinstruction will be implicitly aligned according to their TYPE#.

Aligned memory variables can be accessed faster; on the other hand this option may blow up the size of your program if data definitions of various types are mixed frequently. It's better to manually group data of the same size, so that the alignment stuff is inserted only once per group.
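
The grouping advice can be illustrated with a minimal sketch. It assumes that DB variables need no alignment and that DD variables are aligned to 4 bytes, as their TYPE# suggests; all symbol names are hypothetical.

```asm
        EUROASM AUTOALIGN=ON
; Mixed sizes: alignment stuff may be inserted before each DD.
FlagA   DB 1
CountA  DD 100
FlagB   DB 0
CountB  DD 200
; Grouped by size: alignment stuff is needed at most once per group.
CountC  DD 100
CountD  DD 200
FlagC   DB 1
FlagD   DB 0
```

Both variants define the same data; only the amount of alignment stuff emitted between the variables is expected to differ.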

Memory variables defined as literals are always autoaligned, regardless of EUROASM AUTOALIGN= status.

Structured data variables (defined with DS structure_name) do not autoalign by their largest member. They are aligned by the segment width (WORD, DWORD or QWORD) if AUTOALIGN=ENABLED.

Autoalignment does not work inside structure definition.

Programmers should design their structures with respect to the natural alignment of structure members. This is especially important in 64bit mode, where API requires all data be aligned. On conversion from badly designed 32bit structures they need manually inserted stuff-members which pad DWORD members to the QWORD alignment of the following members, and which round up the structure size to a multiple of 8. See the WinAPI structure MSG as an example.

Autoalignment does not apply to machine instructions. If we want to have a procedure aligned to the start of a cache boundary (for better performance), it should be aligned explicitly, for instance Rapid PROC ALIGN=OWORD.

↑ AUTOSEGMENT=

Boolean option; default is AUTOSEGMENT=ON. The section, where the current statement emits to, is implicitly changed by €ASM according to the purpose of the statement. When more than one section with this purpose is defined in a program, autosegment will switch to the last defined one.

If the statement is a machine instruction or prefix or PROC, €ASM will switch to the last defined CODE section.
Similarly, when the statement defines data (pseudoinstruction D and its clones, including DI), the current section is switched to the last DATA or BSS section.

Pseudoinstruction ALIGN, macros and all nonemitting operations, such as EQU or solo label, do not change the current section.

If you rely on autosegmentation, avoid a pitfall when the new section begins with a macro invocation, with an explicit ALIGN or with just a label itself. These statements will not autoswitch the current section. You may need to insert NOP or PROC to autoswitch to CODE, DB 0 statement to autoswitch to DATA, or DB to autoswitch to BSS. Example of such a pitfall:
      EUROASM AUTOSEGMENT=ON
Hello PROGRAM FORMAT=PE, ENTRY=Main:
       INCLUDE winapi.htm; Include some basic code macros.
Title  DB "World!",0     ; Correctly autoswitched to [.data].
Main:  StdOutput Title   ; Macro didn't switch to [.text] as desired.
       TerminateProgram
      ENDPROGRAM Hello   ; Hello.exe does not work because its entry is in [.data] section.

The label Main: incorrectly remained in previous [.data] section. Remedy is simple:

      EUROASM AUTOSEGMENT=ON
Hello PROGRAM FORMAT=PE, ENTRY=Main:
       INCLUDE winapi.htm; Include some basic code macros.
Title  DB "World!",0     ; Correctly autoswitched to [.data].
Main:  PROC              ; Correctly autoswitched to [.text].
        StdOutput Title
        TerminateProgram
       ENDPROC Main:
      ENDPROGRAM Hello   ; Hello.exe works as expected.
Each explicit change of current section disables AUTOSEGMENT as a side effect.

AUTOSEGMENT= is a weak option; it is automatically switched off when the programmer changes the current section explicitly with [section_name] in the label field of a statement.

If you want to keep AUTOSEGMENT enabled after a manual change of section, you need to explicitly switch it back on with EUROASM AUTOSEGMENT=ON, or save its state using EUROASM PUSH and restore it with EUROASM POP afterwards.
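
The PUSH and POP variant might look like the following sketch. It assumes that a section [.data] already exists, as in the PE example above; the symbol Message is hypothetical.

```asm
         EUROASM AUTOSEGMENT=ON
         EUROASM PUSH      ; Save all options, including AUTOSEGMENT=ON.
[.data]                    ; Explicit change of section; AUTOSEGMENT is switched off as a side effect.
Message  DB "Hello",0
         EUROASM POP       ; Restore the saved options; AUTOSEGMENT is ON again.
```

Note that EUROASM POP restores the whole set of options, so any other option changed between PUSH and POP is reverted as well.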
↑ CODEPAGE=

€ASM can use Unicode strings at run time but the data definitions in source program are defined in bytes. Option CODEPAGE= tells €ASM which code page it should internally use for conversion of the string in source text to Unicode at assembly-time.

Codepage may be specified with a direct 16bit integer value, as specified by [CodePageMS], for instance CODEPAGE=1253 for the Greek alphabet.

Codepage values can also be specified as an enumerated token, such as CODEPAGE=CP852, CODEPAGE=WINDOWS-1252, CODEPAGE=ISO-8859-2 etc, see DictCodePages for the complete list. Names of those specifications are case insensitive.

Though some of those enumerated codepage constants may look like an arithmetic subtraction, they are recognized as verbatim tokens and not evaluated.

The factory default and recommended value is CODEPAGE=UTF-8. See also Character encoding above.
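
A short sketch of the notations, using only values mentioned in this chapter:

```asm
        EUROASM CODEPAGE=1253        ; Numeric value: Greek alphabet.
        EUROASM CODEPAGE=ISO-8859-2  ; Enumerated token; the hyphens are not evaluated as subtraction.
        EUROASM CODEPAGE=UTF-8       ; Back to the factory default.
```

Since EUROASM is a switching statement, each of these lines is expected to stay in effect until the next CODEPAGE= change.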

↑ INCLUDEPATH=

When an included file is specified without path, €ASM will search for this file in the directories which are defined in INCLUDEPATH= option. Paths can be separated with semicolon ; or comma , and the whole list should be in double quotes. Both backward \ and forward slashes / may be used as folder separator. The last slash can be omitted. Default is INCLUDEPATH="./,./maclib,../maclib,".

This syntax doesn't support directory names which begin or end with a space as a significant part of the name. Nevertheless, such names should be avoided anyway.
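
For illustration, a sketch which adds one more directory to the search list; the directory ../mymacros is hypothetical, and winapi.htm is the library used in the examples of this manual.

```asm
        EUROASM INCLUDEPATH="./,./maclib,../mymacros" ; ../mymacros is a hypothetical directory.
        INCLUDE winapi.htm ; No path given, so the file is searched for in the listed directories.
```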
↑ LINKPATH=

When a linked file is specified without path, €ASM will search for this file in the directories which are defined in LINKPATH= option. Paths can be separated with semicolon ; or comma , and the whole list should be in double quotes. Both backward \ and forward slashes / may be used as folder separator. The last slash can be omitted. Default is LINKPATH="./,./objlib,../objlib,".

↑ MAXINCLUSIONS=

Parameter MAXINCLUSIONS limits the maximal number of successful executions of INCLUDE* statements in an €ASM source. This prevents the assembler from exhausting resources in case of a recursive inclusion loop.

Default value is EUROASM MAXINCLUSIONS=64.

↑ MAXLINKS=

Parameter MAXLINKS limits the maximal number of files specified by LINK statements in an €ASM source. This prevents the assembler from exhausting resources in case of a recursive link loop.

Default value is EUROASM MAXLINKS=64.

↑ Processor generation option CPU=

Not all IA-32 machine instructions are available on all types of Central Processing Unit (CPU). This EUROASM option specifies the minimal type of CPU which the program is intended for. Possible CPU= values are
086 alias 8086,
186,
286,
386,
486,
586 alias PENTIUM,
686 alias P6,
X64.

Default is EUROASM CPU=586. A 64bit program should have EUROASM CPU=X64 enabled.

EuroAssembler assumes that each later CPU also supports all instructions of the previous CPU generations.
↑ Processor features

This bunch of EUROASM boolean options tells €ASM which CPU features are required on the target computer. By default all options are switched OFF; you should explicitly enable each capability which you intend to program for.

ABM= assembly of Advanced Bit Manipulation instructions.

AES= assembly of Intel's Advanced Encryption Standard (AESNI) instructions.

AMD= instructions specific for AMD CPU manufacturer.

CYRIX= instructions specific for CYRIX CPU manufacturer.

D3NOW= assembly of AMD 3DNow! instructions.

EVEX= assembly of Intel's EVEX-encoded AVX-512 instructions.

FMA= assembly of Fused Multiply-Add instructions.

FPU= assembly of Floating-Point Unit instructions (math coprocessor).

LWP= assembly of AMD's LightWeight Profiling instructions.

MMX= assembly of MultiMedia Extensions.

MPX= assembly of Memory Protection Extensions.

MVEX= assembly of Intel's MVEX-encoded AVX-512 instructions.

PRIV= assembly of privileged mode instructions.

PROT= assembly of protected mode instructions.

SGX= assembly of Software Guard Extensions.

SHA= assembly of Intel's Secure Hash Algorithm instructions.

SPEC= assembly of other special instructions.

SVM= assembly of Shared Virtual Memory instructions.

TSX= assembly of Intel's Transactional Synchronization Extensions.

UNDOC= assembly of undocumented instructions.

VIA= instructions specific for VIA Geode CPU manufacturer.

VMX= assembly of Virtual Machine Extensions.

XOP= assembly of AMD's XOP-encoded AVX instructions.

↑ Streaming SIMD Extension generation option SIMD=

This option defines which Single Instruction Multiple Data (SIMD) generation is required to assemble following instructions. Possible enumerated values are
SSE1 alias SSE alias boolean true,
SSE2,
SSE3,
SSSE3,
SSE4,
SSE4.1,
SSE4.2,
AVX,
AVX2,
AVX512.

Default value is SIMD=DISABLED (no SIMD instructions are expected).

Options CPU generation, CPU features and SIMD generation do not prevent €ASM from assembling instructions for a higher CPU, but a warning is issued when the instruction requires some capability which is currently not enabled with EUROASM. This should warn you that your program may not run on every PC, or that you may have made a typo in the instruction mnemonic.
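
The interplay of these options could be sketched as follows. Which option each instruction actually requires is an assumption here (FADD is taken as an FPU instruction and PADDD with XMM registers as an SSE2 instruction); the exact warning text is not shown because it is not specified in this chapter.

```asm
        EUROASM CPU=686, FPU=ON, SIMD=SSE2
        FADD ST0,ST1       ; Floating-point instruction, assumed to be covered by FPU=ON.
        PADDD XMM1,XMM2    ; Assumed to be covered by SIMD=SSE2.
        EUROASM SIMD=DISABLED
        PADDD XMM1,XMM2    ; Still assembled, but a warning is expected here.
```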
↑ DISPLAYSTM=
↑ DISPLAYENC=

Those boolean options are designed for debugging of the assembly process, see also pseudoinstruction %DISPLAY. When enabled, €ASM inserts a diagnostic message below each assembled statement, which displays how the statement is parsed into fields, and what modifiers are used for instruction encoding. Example:

    EUROASM DISPLAYSTM=ON
.L: MOV EAX,[ESI+16],ALIGN=DWORD
    EUROASM DISPLAYSTM=OFF, DISPLAYENC=ON
    LEA EDX,[ESI+16]
    ADD EAX,EDX

Listing of previous example is here:

|                | EUROASM DISPLAYSTM=ON
|00000000:8B4610 |.L: MOV EAX,[ESI+16],ALIGN=DWORD
|# D1010 **** DISPLAYSTM ".L: MOV EAX,[~~ALIGN=DWORD "
|# D1020 label=".L"
|# D1040 machine operation="MOV"
|# D1050 ordinal operand number=1,value="EAX"
|# D1050 ordinal operand number=2,value="[ESI+16]"
|# D1060 keyword operand,name="ALIGN",value="DWORD"
|                | EUROASM DISPLAYSTM=OFF, DISPLAYENC=ON
|# D1010 **** DISPLAYSTM "EUROASM DISPL~~SPLAYENC=ON "
|# D1040 pseudo operation="EUROASM"
|# D1060 keyword operand,name="DISPLAYSTM",value="OFF"
|# D1060 keyword operand,name="DISPLAYENC",value="ON"
|00000003:8D5610 | LEA EDX,[ESI+16]
|# D1080 Emitted size=3,DATA=DWORD,DISP=BYTE,SCALE=SMART,ADDR=ABS.
|00000006:01D0   | ADD EAX,EDX
|# D1080 Emitted size=2,CODE=SHORT,DATA=DWORD.
↑ DUMP=
↑ DUMPWIDTH=
↑ DUMPALL=

Options DUMP=, DUMPWIDTH= and DUMPALL= control how the dump column with emitted code is presented in listing.

Boolean option DUMP= can switch off the dump completely, the listing copies the input source almost verbatim in this case. Default is DUMP=ON.

DUMPWIDTH= sets the width of dump column in €ASM listing. This option specifies how many characters of dumped data will fit between the starting | and ending | including those two border characters. Default value is DUMPWIDTH=27 which is enough for 8byte long instruction.

Accepted dump width value is between 16 and 128 characters.

Dump data consists of an offset (4 or 8 hexadecimal characters, depending on section width), separator : and 2 hexadecimal digits per each byte of generated code.

When the generated code is too long to fit into the dump column, the Boolean option DUMPALL= decides if the rest will be omitted (the omission is indicated by tilde ~ in place of the last character), or if additional lines will be inserted to the listing until all generated code is dumped. Factory default is DUMPALL=OFF.

See also the description of listing file.

Be careful when setting DUMPALL=ON with long duplicated data definition, such as DB 2048 * B 0, because this may clutter the listing with many lines of useless dump.
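
A possible way to widen the dump only for one data definition is sketched below. The width 43 assumes an 8-digit offset: two border characters, 8 offset digits, the separator and 2 digits for each of 16 bytes give 1+8+1+2*16+1 = 43. The symbol Filler is hypothetical.

```asm
        EUROASM PUSH, DUMPWIDTH=43, DUMPALL=ON ; Room for 16 dumped bytes per listing line.
Filler  DB 64 * B 0x90                          ; Expected to be dumped completely, on several listing lines.
        EUROASM POP                             ; Previous dump options are restored.
```

Saving the options with PUSH keeps the rest of the listing in its usual narrow format.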
↑ LISTFILE=

This option defines the name of the listing file. By default it is LISTFILE="%^SourceName%^SourceExt.lst", i.e. it copies the name and extension of the source file and appends .lst to it.
If not specified otherwise, listing is always created in the same directory as the corresponding source file.

↑ LIST=
↑ LISTINCLUDE=
↑ LISTMACRO=
↑ LISTREPEAT=
↑ LISTVAR=

LIST* family of options controls what should be copied to the listing file. Boolean option LIST=OFF will suppress the generation of listing until it is switched on again. Default is LIST=ON.
Note that switching off even a minor part of the listing makes the listing file no longer usable as a source file, because some parts are not copied by €ASM from the original source to the listing.

Contents of included files are by default omitted from the listing (LISTINCLUDE=OFF). When this option is ON, the INCLUDE statement will be replaced by the contents of file.

LISTMACRO= controls whether the instructions from macro expansion go to the listing. Default state is LISTMACRO=OFF and only the invocation of macroinstruction is presented.

EUROASM option LISTREPEAT= is similar to LISTMACRO= with the difference that it controls listing of statements expanded in %FOR, %WHILE and %REPEAT blocks.

When a preprocessing %variable is used in the statement and the option LISTVAR=ON, the statement is repeated in the form of a machine comment just below the original statement and the expanded text is shown instead of %variables. Factory default is LISTVAR=OFF.

See also the description of listing file above.
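
As a sketch, the listing of a single macro invocation can be made more verbose without touching the rest of the source. StdOutput and Title are borrowed from the Hello example in the AUTOSEGMENT= chapter and are assumed to be defined.

```asm
        EUROASM PUSH, LISTMACRO=ON, LISTVAR=ON ; Temporarily list macro expansion and expanded %variables.
        StdOutput Title                         ; Macro invocation to be inspected in the listing.
        EUROASM POP                             ; Restore the previous listing options.
```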

↑ UNICODE=

UNICODE= determines the character width. This boolean option specifies if data definition of unspecified string, such as D "an explicit string" or ="a literal string" should be treated as a sequence of bytes (8bit characters) or unichars (16bit characters).

System variable %^UNICODE is consulted in macros or structure definition which have different versions for ANSI (8bit) or WIDE (16bit) string encoding.
It is also consulted in macros WinAPI (32bit) and WinABI (64bit) to determine which version of Windows API function (ANSI or WIDE) should be invoked.

Some string-handling macros and WinAPI functions expect the string size be specified in characters rather than in bytes. Attribute operation SIZE# returns the size of its operand always in bytes. This can be solved by testing the system variable %^UNICODE:

aString D "String" ; Symbol aString defines 6 bytes if UNICODE=OFF or 12 bytes if UNICODE=ON.
  %IF %^UNICODE  ; WIDE version of aString.
     MOV ECX, SIZE# aString / 2
  %ELSE          ; ANSI version of aString.
     MOV ECX,SIZE# aString
  %ENDIF         ; ECX is now loaded with the number of characters in aString.

A trickier but more elegant solution exploits the fact that %^UNICODE (and all other boolean system %^variables) expands to either 0 or -1, and that shift left by a negative value is calculated as shift right by the negated value. When %^UNICODE is -1, the size in bytes is shifted to the right by 1 bit, which is equivalent to division by two.

aString D "String" ; Symbol aString defines 6 bytes if UNICODE=OFF or 12 bytes if UNICODE=ON.
  MOV ECX, SIZE# aString << %^UNICODE  ; ECX is now loaded with the number of characters in aString.
↑ DEBUG=

This boolean option specifies if a debug version should be assembled. When EUROASM DEBUG=ENABLED, the linker includes a symbol table or other debugging information in the output program. Macros can change their behaviour depending on the condition %IF %^DEBUG.

The final release should be assembled with this option turned off.
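
The same condition may be used directly in the source. A minimal sketch, assuming that a software breakpoint is wanted only in debug builds:

```asm
  %IF %^DEBUG
     INT 3       ; Breakpoint for a debugger; assembled only when DEBUG= is enabled.
  %ENDIF
```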

↑ PROFILE=

This boolean option specifies if profileable version should be assembled. Profiling is not implemented yet in this version of EuroAssembler.

The final release should be assembled with this option turned off.

↑ WARN=
↑ NOWARN=

Options WARN= and NOWARN= control which informative and warning messages will be issued in the assembly process. With NOWARN= it is possible to suppress anticipated messages with identification number below W4000. Suppressed warnings do not affect the errorlevel. User generated warnings (U5000..U5999) and errors with higher severity cannot be suppressed.

The value of the option is either a number or a range of numbers, which cannot exceed 3999. WARN= and NOWARN= operands may repeat in a statement; they are processed from left to right. For instance EUROASM NOWARN=0600..0999, WARN=705 will suppress informative messages I0600 to I0999 except for message I0705 which remains enabled.

Default value is WARN=0..4999 (all messages enabled).


↑ PROGRAM

↑ ENDPROGRAM

DLLCHARACTERISTICS= ↓
ENTRY= ↓
FILEALIGN= ↓
FORMAT= ↓
ICONFILE= ↓
IMAGEBASE= ↓
LISTLITERALS= ↓
LISTGLOBALS= ↓
LISTMAP= ↓
MAJORIMAGEVERSION= ↓
MAJORLINKERVERSION= ↓
MAJOROSVERSION= ↓
MAJORSUBSYSTEMVERSION= ↓
MAXEXPANSIONS= ↓
MAXPASSES= ↓
MINORIMAGEVERSION= ↓
MINORLINKERVERSION= ↓
MINOROSVERSION= ↓
MINORSUBSYSTEMVERSION= ↓
MODEL= ↓
OUTFILE= ↓
SECTIONALIGN= ↓
SIZEOFHEAPCOMMIT= ↓
SIZEOFHEAPRESERVED= ↓
SIZEOFSTACKCOMMIT= ↓
SIZEOFSTACKRESERVED= ↓
STUBFILE= ↓
SUBSYSTEM= ↓
TIMESTAMP= ↓
WIDTH= ↓
WIN32VERSIONVALUE= ↓

Pseudoinstructions PROGRAM and ENDPROGRAM delimit a block of source code which creates a standalone output file. In most other assemblers the whole source file creates the output file, sometimes called a module. For instance, the command nasm -f win32 HelloWorld.asm tells Netwide Assembler to create a COFF output file HelloWorld.obj. In €ASM more than one output file can be created with the command euroasm HelloWorld.asm, provided that HelloWorld.asm contains more than one PROGRAM / ENDPROGRAM block.

The label of the PROGRAM statement represents the name of the output program. Although it does not define a symbol, the name must follow the rules for symbol names, i.e. at least one letter followed by letters and digits. The same identifier may be used as operand %1 in the corresponding ENDPROGRAM statement.

One source may contain several program blocks, and the blocks may nest. Each program block assembles to a different output file.

Symbols defined in a program are not visible outside its block. When a program needs to call a label from another program, the label must be marked as extern in the calling program and as public in the defining program, even when both programs lie in the same source file or one program is nested in another.

Preprocessing %variables, macro definitions and EUROASM options, on the other hand, are visible throughout the source and they can carry information between programs at assembly time. See the sample program locktest as an example.
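
A minimal sketch of one source file with two program blocks which share a preprocessing %variable. The names Writer, Reader and %BufSize are hypothetical, and the example assumes the %SETA assignment described elsewhere in this manual.

```asm
%BufSize %SETA 512              ; %variable set once, visible in both programs below.

Writer   PROGRAM FORMAT=COM     ; Assembles to "Writer.com".
          MOV CX,%BufSize       ; Expands to MOV CX,512.
          RET
         ENDPROGRAM Writer

Reader   PROGRAM FORMAT=COM     ; Assembles to "Reader.com".
          MOV CX,%BufSize       ; The same value, carried over at assembly time.
          RET
         ENDPROGRAM Reader
```

Labels defined inside Writer would not be visible in Reader; only the %variable crosses the program boundary.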

The PROGRAM pseudoinstruction has many important keyword operands which specify properties of the output file. The same keywords are used in the [PROGRAM] section of the euroasm.ini configuration file.

Values of all PROGRAM options can be inspected as system %^variables at assembly-time.

Unlike EUROASM options, which affect only a division of the source, PROGRAM properties affect the whole program en bloc. We cannot have half of the program with graphic subsystem and the other half with console subsystem, for instance. That is why options LISTMAP=, LISTGLOBALS=, LISTLITERALS= are properties of pseudoinstruction PROGRAM, but LISTINCLUDE=, LISTMACRO=, LISTREPEAT=, LISTVAR= are properties of pseudoinstruction EUROASM.
↑ FORMAT=

Format and file extension of the output file are determined by this PROGRAM parameter.

€ASM output file formats

| FORMAT= | Default file extension | Default program width | Default memory model | Description |
|---|---|---|---|---|
| BIN | .bin | 16 bits | TINY | Binary file |
| COM | .com | 16 bits | TINY | DOS/CPM 16bit executable |
| OMF | .obj | 16 bits | SMALL | OMF relocatable Object Module Format |
| LIBOMF | .lib | 16 bits | SMALL | Object library in OMFormat |
| MZ | .exe | 16 bits | SMALL | DOS 16bit executable |
| COFF | .obj | 32 bits | FLAT | Common Object File Format in Microsoft specification |
| LIBCOF | .lib | 32 bits | FLAT | Object library in COFFormat |
| PE | .exe | 32 bits | FLAT | Portable executable, COFF based |
| DLL | .dll | 32 bits | FLAT | Dynamic Linked Library, COFF based |

See also Program formats for more details.

↑ WIDTH=

This parameter specifies operating mode of the program:

Program width also defines the default width for all its segments. Its value is a numeric expression which evaluates to 16, 32, 64, or 0. An empty or zero value (factory default) specifies that the program width should be set internally by €ASM according to its FORMAT=. Nevertheless, when a segment is defined, it may specify a different width, regardless of the default width of its program. €ASM doesn't argue against mixing 16bit and 32bit segments in one module.
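
A sketch of a program whose width is derived from its format, with one segment overriding it. The program name Mixed and the segment names [Code16] and [Code32] are hypothetical; the fragment only illustrates the declarations and is not a useful program.

```asm
Mixed    PROGRAM FORMAT=MZ, ENTRY=Start:   ; WIDTH= left empty: 16 bits implied by FORMAT=MZ.
[Code16] SEGMENT PURPOSE=CODE              ; Inherits the program width (16).
[Code32] SEGMENT PURPOSE=CODE, WIDTH=32    ; Explicitly declared 32bit segment in the same module.
[Code16]                                   ; Switch back to the 16bit section.
Start:   MOV AX,0x4C00                     ; DOS function "terminate program".
         INT 0x21
         ENDPROGRAM Mixed
```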

↑ MODEL=

Memory model describes sizes and distances of code and data, and the number of code and noncode segments. The main function of the memory model specification is to set the default distance for segments and procedures defined in the program.

Program property MODEL= is taken into account in procedure pseudoinstructions (PROC, PROC1) and in control-transfer instructions (JMP, CALL, RET) without explicitly specified distance.
In monocode models (TINY,SMALL,COMPACT,FLAT) the default transfer distance is NEAR.
In multicode models (MEDIUM,LARGE,HUGE) the default transfer distance is FAR.
In monodata models (TINY,SMALL,MEDIUM,FLAT) all data are addressed relative to the start of the data segment.
In multidata models (COMPACT,LARGE,HUGE) it is the programmer's responsibility to load the used segment register with the paragraph address of the data before they are accessed.

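Properties implied by memory model

Columns 2 to 4 are default segment properties, columns 5 to 7 are link properties, columns 8 and 9 show the usual usage.

| MODEL= | CODE distance | DATA distance | Segm. width | Multicode | Multidata | Segm. overlap | CPU mode | Used in formats |
|---|---|---|---|---|---|---|---|---|
| TINY | NEAR | NEAR | 16 | no | no | yes | real | COM |
| SMALL | NEAR | NEAR | 16 | no | no | no | real | MZ, OMF |
| MEDIUM | FAR | NEAR | 16 | yes | no | no | real | MZ, OMF |
| COMPACT | NEAR | FAR | 16 | no | yes | no | real | MZ, OMF |
| LARGE | FAR | FAR | 16 | yes | yes | no | real | MZ, OMF |
| HUGE | FAR | FAR | 32 | yes | yes | no | real | MZ, OMF |
| FLAT | NEAR | NEAR | 32,64 | no | no | yes | protected | PE, DLL, COFF |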
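
How a multicode model changes the default transfer distance can be sketched like this. The names Big, Main and Helper are hypothetical; the comments state what the rules above imply for the MEDIUM model.

```asm
Big     PROGRAM FORMAT=MZ, MODEL=MEDIUM, ENTRY=Main:
Main:    CALL Helper:            ; No explicit distance: assembled as a FAR call.
         MOV AX,0x4C00           ; DOS function "terminate program".
         INT 0x21
Helper: PROC                     ; DIST= not specified: defaults to FAR in the MEDIUM model.
         RET                     ; Assembled as a far return.
        ENDPROC Helper:
        ENDPROGRAM Big
```

The same source assembled with MODEL=SMALL would use NEAR call and return instead.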
↑ SUBSYSTEM=

Subsystem is a numeric identifier in the header of a Portable Executable file. This parameter specifies whether Windows should create a new console when the PE program starts. Default is SUBSYSTEM=CON. Set it to GUI when your PE program creates graphical windows rather than using standard text input and output. The value of subsystem is one of the enumerated tokens from the table below, or a numeric expression which evaluates to the corresponding number.

Subsystems table

| Numeric value | SUBSYSTEM= token | Remark |
|---|---|---|
| 0 | 0 | Unknown subsystem. |
| 1 | NATIVE | Subsystem is not used, i.e. device driver. |
| 2 | GUI | Windows GUI graphical windows. |
| 3 | CON | Windows console (character subsystem). |
| 5 | OS2 | OS/2 character subsystem. |
| 7 | POSIX | Posix character subsystem. |
| 8 | WXD | Windows 95/98 native driver. |
| 9 | WCE | Windows CE graphical windows. |
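
The two statements below are alternative ways of writing the same PROGRAM statement (use only one of them); the program name Window and the label Start are hypothetical.

```asm
Window  PROGRAM FORMAT=PE, SUBSYSTEM=GUI, ENTRY=Start:   ; Subsystem given as an enumerated token.
Window  PROGRAM FORMAT=PE, SUBSYSTEM=2, ENTRY=Start:     ; The same subsystem given as a number.
```
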
↑ ENTRY=

This parameter specifies an address where execution of the program begins. Usually this parameter contains a label whose address is set to CS:rIP when loader transfers execution to the program at run-time.

By default the ENTRY= parameter is empty; in this case €ASM will set it to 0 if PROGRAM FORMAT=BIN or to 256 if PROGRAM FORMAT=COM. This parameter may be left empty in linkable program formats but it must be specified in executable formats, otherwise €ASM reports an error.
If the executable links other programs (object modules), the entry point must be specified in exactly one such module.

↑ MAXPASSES=

This parameter limits the number of assembly passes through the source code. €ASM itself decides how many passes are necessary; this parameter only specifies the upper limit.

EuroAssembler keeps repeating assembly passes until offsets of symbols stop changing between passes (all symbols are fixed). Then it performs the final, emitting pass.

In very rare circumstances the emitted code size may oscillate due to optimisation of short|near jump encodings. In this case €ASM would request more and more passes forever, which is why their number is limited. When the pass number reaches %^MAXPASSES-1, this last-but-one pass is marked as the fixing pass. Symbol offsets may only grow in the fixing pass and the vacant code space is stuffed with NOP bytes. See the test t9181 as an example of oscillating code with fixing pass.

Factory default value is MAXPASSES=32. You may need to increase this option only in extremely large sources with lots of macros and conditional-assembly constructs. The maximum I have ever reached with my programs is 44 passes, consumed in assembly of module iiz.htm.
↑ MAXEXPANSIONS=

Parameter MAXEXPANSIONS= limits the number of %FOR, %WHILE, %REPEAT or %MACRO block expansions. €ASM declares a program property named %. and increments its value whenever a preprocessing block is expanded. When this number exceeds the MAXEXPANSIONS value, €ASM emits an error message and stops further expansions.
Factory default is MAXEXPANSIONS=65536.

This mechanism protects €ASM from exhausting memory resources when some incorrectly written preprocessing loop fails to exit. If your program is really big, you may need to increase the MAXEXPANSIONS value.

The same expansion counter is used to maintain the value of special automatic %variable %..

↑ OUTFILE=

OUTFILE= specifies the filename of the assembly output, which is an executable or linkable object file. This filename is relative to the current shell directory, unless specified otherwise. Default value is OUTFILE="%^PROGRAM" followed by the extension specified by FORMAT=.
E.g.: Hello PROGRAM FORMAT=MZ will create output file "Hello.exe".
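
The two statements below are alternatives (use only one of them). The file name Hello16.exe and the label Start are hypothetical.

```asm
Hello PROGRAM FORMAT=MZ, ENTRY=Start:                          ; Default name: "Hello.exe".
Hello PROGRAM FORMAT=MZ, ENTRY=Start:, OUTFILE="Hello16.exe"   ; Explicitly specified output file.
```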

↑ STUBFILE=

STUBFILE= is used only in COFF-based executables, i.e. in PE and DLL formats. The stub is a 16bit MZ program which gets control when the output file is launched in the 16bit DOS operating system. Usually its only job is to tell the user that this program requires MS Windows.

When the STUBFILE parameter is empty (default), €ASM will use its own built-in stub code.
Otherwise it looks for a previously compiled MZ executable. If STUBFILE= is specified without path, €ASM looks for the file in the paths specified by EUROASM option LINKPATH=.

↑ ICONFILE=

ICONFILE= should specify an existing file with icon which will be built into resource segment of PE or DLL output file. This icon is used to graphically represent the output file in MS Windows environment (Desktop, Explorer etc). Icon file is searched for in the path specified by EUROASM option LINKPATH=.

Factory-default value is PROGRAM ICONFILE="euroasm.ico", which represents the icon shipped with EuroAssembler in directory objlib.

Option ICONFILE= applies only when no resource file is linked to the output program, otherwise it is ignored and the first icon from resources (if any) is used by Windows Explorer to represent the executable.

When parameter ICONFILE= is empty, no icon is used and €ASM does not create resource section at all.

↑ LISTMAP=
↑ LISTGLOBALS=
↑ LISTLITERALS=

These three options control whether auxiliary information will be dumped near the end of the program in the listing file. See t7475 for an example of ListMap and ListGlobals format.

If LISTLITERALS=ON, contents of data and code literal sections @LT16, @LT8, @LT4, @LT2, @LT1, @RT0 will be dumped too. See t1711 for an example of ListLiterals format.

↑ TIMESTAMP=

Specifies nominal time which is provided by €ASM system variables %^DATE, %^TIME and which is embedded in some COFF-based file formats: PFCOFF_FILE_HEADER.TimeDateStamp, PFLIBCOF_IMPORT_OBJECT_HEADER.TimeDateStamp, PFRSRC_RESOURCE_DIRECTORY.TimeDateStamp.

The value of this parameter represents the number of seconds elapsed since midnight, 1st of January 1970, UTC. When it is set to -1 or left empty (factory default), it will be assigned from the system timer at the start of the assembly session.
TIMESTAMP= can be used to fake the time when the target file was created.
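
A sketch of a fixed nominal time. The program name Repro and the label Start are hypothetical; the number is the count of seconds from 1970-01-01 to 2024-01-01 00:00:00 UTC.

```asm
Repro PROGRAM FORMAT=PE, ENTRY=Start:, TIMESTAMP=1704067200  ; 2024-01-01 00:00:00 UTC.
```

With a fixed TIMESTAMP= the values of %^DATE, %^TIME and the embedded TimeDateStamp fields no longer depend on the moment of assembly.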

↑ DLLCHARACTERISTICS=
↑ FILEALIGN=
↑ IMAGEBASE=
↑ MAJORIMAGEVERSION=
↑ MAJORLINKERVERSION=
↑ MAJOROSVERSION=
↑ MAJORSUBSYSTEMVERSION=
↑ MINORIMAGEVERSION=
↑ MINORLINKERVERSION=
↑ MINOROSVERSION=
↑ MINORSUBSYSTEMVERSION=
↑ SECTIONALIGN=
↑ SIZEOFHEAPCOMMIT=
↑ SIZEOFHEAPRESERVED=
↑ SIZEOFSTACKCOMMIT=
↑ SIZEOFSTACKRESERVED=
↑ WIN32VERSIONVALUE=

Other PROGRAM parameters are mostly important only in the COFF family of output formats (PE, DLL, COFF) and they form a PE header. See [MS PECOFF] specification for detailed description. Do not change them if you don't know what you are doing.

↑ SEGMENT

PURPOSE= ↓
WIDTH= ↓
ALIGN= ↓
COMBINE= ↓
CLASS= ↓

Pseudoinstruction SEGMENT declares a segment and specifies its properties. Each segment definition simultaneously defines a section with the same name. Other sections of the segment may be declared (or switched to) later, with an operation-less statement which has the section name in its label field, for example
[Strings] ; Declare section [Strings].

The name of the segment is specified in the label field and it looks like an identifier in square brackets. Segment properties are assigned with keyword parameters.

€ASM automatically declares a few default segments when it starts to assemble a program. In most cases there is no need to explicitly declare any other segments. The number and purpose of default segments depend on the program format. If these segments are not used in the program (no code was emitted into them), they will be discarded at assembly time and will not appear in the object file. This happens when programmers are not satisfied with the default segment names and properties and declare new segments of their own choice, usually near the beginning of the program.

↑ PURPOSE=

Parameter SEGMENT PURPOSE= specifies what kind of information the segment is intended for. It is important in protected mode (formats COFF, PE, DLL), where the descriptor's access bits control the rights granted to read, write or execute the contents of the segment.

Segment purpose table

| PURPOSE= | Alias | Access | Default name | Contents |
|---|---|---|---|---|
| CODE | TEXT | read, execute | [.text] or [CODE] | Program code (instructions) (1) |
| STACK | | read, write | [STACK] | Machine stack (1) |
| DATA | IDATA | read, write | [.data] or [DATA] | Initialized data (1) |
| BSS | UDATA | read, write | [.bss] or [BSS] | Uninitialized data (1) |
| LITERALS | LITERAL | read | parasites on other data/code segment | Literal sections (2) |
| DRECTVE | | discarded | [.drectve] | Linker directives (3) |
| EXPORT | | | [.edata] | Dynamic link export (4) |
| IMPORT | | | [.idata] | Dynamic link import (4) |
| RESOURCE | | | [.rsrc] | Programming resources (4) |
| EXCEPTION | | | [.pdata] | Runtime exceptions (5) |
| SECURITY | | | | Attribute certificate (5) |
| BASERELOC | | discarded | [.reloc] | Load-time relocations (4) |
| DEBUG | | | [.debug] | Data for debugger (5) |
| COPYRIGHT | ARCHITECTURE | | | Architecture info (5) |
| GLOBALPTR | | | | RVA of global pointer (5) |
| TLS | | | [.tls] | Thread local storage (5) |
| LOAD_CONFIG | | | | Load configuration (5) |
| BOUND_IMPORT | | | | Bound import (5) |
| IAT | | | [.idata] | Import address table (4) |
| DELAY_IMPORT | | | | Delayed import descriptor (5) |
| CLR | | | [.cormeta] | CLR metadata (5) |
| RESERVED | | | | Reserved (5) |
Remarks:
(1) Basic purposes used in all program formats.
(2) Programmer may specify which data/code segment should be used to host literal symbols.
(3) Synthetic section used for transfer of dynamic-link information in COFF format.
(4) Special sections directly supported by EuroAssembler. They should never be declared explicitly.
(5) Special sections whose contents are not supported. The programmer may include such a section in their PE file but its contents must be explicitly specified (with D or INCLUDEBIN), see program format PE.

Segments with special purpose names (4),(5) will be marked in the corresponding position of DataDirectory table in optional header of PE or DLL file format.

Although operand PURPOSE= accepts only enumerated values, they may be combined using the operator Addition + or Bitwise OR |, for instance
[TINY] SEGMENT PURPOSE=CODE|DATA|BSS|STACK or
[.lit] SEGMENT PURPOSE=DATA+LITERALS.

When this parameter is empty or not specified, €ASM will guess the segment's purpose from its class or [name], following these rules:

  1. If the name exactly (case-insensitively) matches any purpose enumerated in the table above, this purpose is assumed.
  2. If the name contains string STACK (case insensitive), PURPOSE=STACK is assumed.
  3. If the name contains string BSS or UDATA (case insensitive), PURPOSE=BSS is assumed.
  4. If the name contains string DATA (case insensitive), PURPOSE=DATA is assumed.
  5. If none of the previous rules applies, PURPOSE=CODE is assumed.

PURPOSE=LITERALS is used together with CODE and/or DATA and it only suggests that this segment should be preferably used to host literal sections. If no segment is explicitly marked as PURPOSE=LITERAL, €ASM will choose the last data/code segment defined when some literal symbol was encountered.

Purpose guessing first looks at the SEGMENT CLASS= property, and only if it's empty, segment name is looked at. This mechanism can be used with segments defined in OMF object files to propagate their purpose to the linked executable.
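
The guessing rules can be illustrated with a few declarations without PURPOSE=. All segment names are hypothetical, and the comments assume that the rules are applied in the listed order, which is why a name containing UDATA is recognized as BSS before rule 4 could match its DATA substring.

```asm
[Data]     SEGMENT               ; Rule 1: the name matches purpose DATA exactly.
[MyStack]  SEGMENT               ; Rule 2: the name contains STACK, PURPOSE=STACK is assumed.
[UData1]   SEGMENT               ; Rule 3: the name contains UDATA, PURPOSE=BSS is assumed.
[RoData]   SEGMENT               ; Rule 4: the name contains DATA, PURPOSE=DATA is assumed.
[Main]     SEGMENT               ; Rule 5: nothing matches, PURPOSE=CODE is assumed.
[Tables]   SEGMENT CLASS=DATA    ; The class is not empty, so it is examined instead of the name.
```
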
↑ WIDTH=

Segment width value can be a numeric expression which evaluates to 16, 32 or 64. By default (if omitted) the width of the segment is determined by the program width.

↑ ALIGN=

This parameter requests alignment of the segment in memory at run-time. Default values are ALIGN=BYTE for code segments and ALIGN=OWORD for data segments.

The final effective segment alignment is in fact determined as the highest of three values:

  1. alignment explicitly specified at segment declaration with SEGMENT ALIGN=,
  2. alignment in the target file specified with PROGRAM FILEALIGN=, and
  3. alignment of the segment loaded in memory, specified with PROGRAM SECTIONALIGN=.

Typically the effective segment alignment is determined by the third value, which defaults to SECTIONALIGN=1K in COFF-based formats.
Memory variables cannot ask for better alignment than the effective alignment of their segment.

↑ COMBINE=

This parameter specifies how segments from other program modules will be combined at link time. This is important only in the MZ program format (16bit DOS executables) linked from several OBJ files. Possible values:

PUBLIC
All segments with the same name will be linked together. The total size is the sum of concatenated segments. This is the default option.
PRIVATE
Private segments will not be concatenated with other segments, no matter if they have the same name or not.
COMMON
All common segments with the same name will be linked to the same address so they overlay each other. The total segment size equals the greatest size of all segments with this name. Data variables declared in a common segment will be shared among separately assembled modules.
STACK
The STACK combine method is the same as PUBLIC; in addition, the SS:SP pointer in the target EXE file will be set to the end of such a segment at run time.
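
A sketch of segment declarations using the combine methods described above. The segment names [SHARED], [LOCALS] and [Strings] are hypothetical.

```asm
[STACK]   SEGMENT PURPOSE=STACK, COMBINE=STACK    ; Concatenated like PUBLIC; SS:SP is set to its end.
[SHARED]  SEGMENT PURPOSE=BSS, COMBINE=COMMON     ; Overlaid with [SHARED] segments of other modules.
[LOCALS]  SEGMENT PURPOSE=DATA, COMBINE=PRIVATE   ; Never concatenated with other segments.
[Strings] SEGMENT PURPOSE=DATA                    ; COMBINE= omitted: PUBLIC is the default.
```
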
↑ CLASS=

The value of CLASS= is an arbitrary identifier. It may be used by the linker to guess the segment purpose (CODE|DATA|BSS) in object formats which do not carry purpose information.

↑ GROUP

This pseudoinstruction specifies segments addressed with the same addressing frame. Data in all grouped segments are referenced with the same value of segment register.

Segment groups are applicable in big realmode 16bit programs. Only 16bit segments can be members of a group.

The name of the group must be defined in the label field; names of grouped segments are enumerated in operand fields. All names are in square brackets [ ]. The group name may be the same as the name of one of its segments. Example:
[DGROUP] GROUP [DATA],[STRINGS]
Grouped segments may be defined before or after the GROUP statement. This pseudoinstruction has no keyword operands.

Relation between group and its segments at link time is similar to the relation between segment and its sections at assembly time.


↑ PROC

↑ ENDPROC alias ↑ ENDP

DIST= ↓
ALIGN= ↓
NESTINGCHECK= ↓

Pseudoinstructions PROC and ENDPROC declare a namespace procedure block. Most often it ends with instruction RET, so the block can be called to perform some function and after the execution it returns just behind the CALL instruction.

The mandatory label of PROC declares assembler symbol which is the procedure name. The same identifier may be used as the first and only operand of the corresponding ENDPROC pseudoinstruction.
Alias ENDP may be used instead of ENDPROC.

Pseudoinstruction ENDPROC may define its own label, too. This label doesn't represent a return from the subprogram, it points to the code which follows PROC..ENDP block. Label of ENDPROC is useful only when the PROC..ENDP block is used to define namespace block rather than subprogram block. Examples:

SubPgm:PROC ; Define PROC as a call-able subprogram block.
          ; PROC body instructions.
          TEST SomeCondition
          JC .Abort:  ; Go to return below CALL SubPgm: statement.
          TEST OtherCondition
          JC .End:    ; Go to continue below .End: ENDP. Probably not what the programmer wanted.
          ; More body instructions.
.Abort:   RET         ; Return below CALL SubPgm: statement.
.End:  ENDP SubPgm:
NameSp:PROC ; Define PROC as a pass-through-able namespace block.
          ; PROC body instructions.
          TEST SomeCondition
          JC .End:  ; Go to continue below .End:  ENDP NameSp: statement.
          ; More body instructions. No RET instruction here.
.End:  ENDP NameSp: ; Continue below this statement.

Jumping to the ENDPROC label differs from jumping to macroinstruction EndProcedure defined in calling convention macrolibraries. Pseudoinstructions PROC, ENDPROC, PROC1, ENDPROC1 do not emit any machine code.

What are procedures good for? We could manage without PROC..ENDP pseudoinstructions easily but wrapping the block of code in PROC..ENDPROC block has some advantages:

↑ DIST=

Pseudoinstructions PROC and PROC1 accept keyword operands DIST= and ALIGN=. DIST= sets the distance of the procedure (NEAR or FAR). When DIST=FAR, all CALLs to this procedure default to FAR, and all RETs within this procedure default to FAR (of course this can be overridden with instruction suffix CALLN/CALLF, RETN/RETF). The default value of this parameter depends on the program memory model.
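
A fragment sketching the effect of DIST=FAR. The procedure name FarSub is hypothetical; the comments state what the rule above implies for instructions without an explicit distance suffix.

```asm
        CALL FarSub:       ; No suffix: assembled as a far call because FarSub has DIST=FAR.
        ; Other instructions of the caller.

FarSub: PROC DIST=FAR
         RET               ; No suffix: assembled as a far return.
        ENDPROC FarSub:
```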

↑ ALIGN=

Alignment of a procedure is ALIGN=BYTE by default. For the best use of the instruction cache it may sometimes be useful to align frequently called procedures with PROC ALIGN=OWORD, if code size is not an issue.

↑ NESTINGCHECK=

This boolean option makes it possible to switch off the internal check of matching PROC..ENDPROC labels. It has only exceptional use in macros simulating built-in pseudoinstructions, which need to hack their block context, such as Procedure and EndProcedure.

See also the instruction modifier NESTINGCHECK=.

Pseudoinstruction PROC does not accept ordinal parameters. They can be passed in registers or on the machine stack and managed individually. Calling convention macrolibraries shipped with EuroAssembler define macros Procedure and EndProcedure with a similar function as PROC and ENDPROC, which allow passing an arbitrary number of arguments as macro parameters when the Procedure is invoked.

↑ PROC1

↑ ENDPROC1 alias ↑ ENDP1

Pseudoinstructions PROC1 and ENDPROC1 are equivalent to PROC and ENDPROC with two differences:

  1. A procedure declared with PROC1..ENDPROC1 may occur in the program more than once. Repeated declarations of a PROC1..ENDPROC1 block with the same label are ignored; the block is emitted only once.

    This predetermines PROC1 for semiinline macros, which contain both the call of a procedure and the procedure itself. When the procedure is defined with PROC1..ENDPROC1, such a macro can be invoked many times but the called procedure will be assembled and emitted only once (during the first macro expansion).

  2. A block defined with PROC1..ENDPROC1 is not emitted to the current section. €ASM will automatically switch to another code section instead, and return to the previous section after ENDPROC1 has been processed. The section which €ASM will switch to has the name [@RT1] and it is automatically created in the segment with PURPOSE=CODE+LITERAL or in the most recently defined code segment. In some circumstances €ASM may also use runtime sections [@RT2], [@RT3] etc. This happens when the code inside the PROC1..ENDPROC1 block contains other semiinline macros, so the current runtime section already is [@RT1] and €ASM must choose another one.

    Emitting procedures to a different section than the one currently used by the main program has the advantage that the procedure body need not be bypassed with a jump instruction. It also leads to shorter code because jumps over the semiinline macros need not jump over the whole procedure body, which could easily make them exceed the distance of 128 bytes and require the longer form of jump instructions.
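
A minimal sketch of a semiinline macro built on PROC1. The macro name Beep and the procedure name BeepProc are hypothetical, the macro clobbers AH and DL, and the example assumes a DOS program.

```asm
Beep      %MACRO                ; Hypothetical semiinline macro without operands.
            CALL BeepProc:      ; Inline part, emitted at each expansion.
BeepProc:   PROC1               ; Out-of-line part, emitted only once to a runtime section.
             MOV AH,2           ; DOS function "write character to standard output".
             MOV DL,7           ; ASCII BEL character.
             INT 0x21
             RET
            ENDPROC1 BeepProc:
          %ENDMACRO Beep

          Beep                  ; First expansion emits the CALL and the procedure body.
          Beep                  ; Second expansion emits the CALL only.
```

Because the procedure body goes to the runtime section, the two CALL instructions follow each other directly in the current section.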

↑ HEAD

↑ ENDHEAD

Pseudoinstructions HEAD and ENDHEAD just delimit a division of source code. This division may be included in other source files with INCLUDEHEAD or INCLUDEHEAD1. The block usually contains the interface of programming objects (definitions of structures, macros, constants) which needs to be included in other separately assembled programs.

The label field of pseudoinstruction HEAD may be used as a block identifier but it does not create a symbol. More than one HEAD..ENDHEAD block can be specified in a source file. When these blocks are nested, the whole outer (larger) block will be included.

Languages which have not implemented this mechanism require the interface part to be put in separate header files. With HEAD..ENDHEAD it can be kept together with the implementation body in one compact file.
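
A sketch of a library file with its interface division, followed by a statement from another source file which uses it. The file name queue.asm, the block identifier Queue and the constant QueueSize are hypothetical; repeating the identifier as the operand of ENDHEAD is an assumption made by analogy with ENDPROGRAM.

```asm
; Contents of the hypothetical library file queue.asm:
Queue     HEAD                     ; Interface division begins.
QueueSize EQU 64                   ; Constant needed by users of the library.
          ENDHEAD Queue            ; Interface division ends.
; Implementation (procedures and data) follows in the same file.

; Statement in another source file which uses the library:
          INCLUDEHEAD "queue.asm"  ; Includes only the HEAD..ENDHEAD block.
```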

↑ INCLUDE

This pseudoinstruction incorporates the file(s) named in its operands into the main source file. The INCLUDE statement is virtually replaced with the contents of the included file.

Inclusion may be nested, i.e. included files may contain other INCLUDE statements.

Double quotes may be omitted if the filename contains only alphanumeric characters (no spaces or punctuation).

INCLUDE can have unlimited number of operands, for example INCLUDE "Win*.htm", ./MyConstants.asm, C:\MyLib\*.inc.

When the file is specified without path, it will be searched for in folders specified with EUROASM option INCLUDEPATH=. If the included filename contains at least one slash, backslash or colon / \ : , it specifies its own path and INCLUDEPATH= is ignored.

The filename may contain wildcards * ?, in this case €ASM will include all files matching this mask. The order of inclusion depends on the operating system.

Behaviour of INCLUDE statement is described in the following table:

| Path | Wildcard | Example | When 1st file found | When no file found |
|---|---|---|---|---|
| No | No | file.inc | Done, stops further searching in INCLUDEPATH. | Error E6914. |
| Yes | No | ./file.inc | Done. | Error E6914. |
| No | Yes | file*.inc | Continue searching for more files in INCLUDEPATH. | Nothing is included, no error. |
| Yes | Yes | ./file*.inc | Continue searching for more files in the given path. | Nothing is included, no error. |

It is possible to include only a part of the source file when a substring or sublist operator immediately follows the file name. Example: INCLUDE "file.inc"{%&-20..%&} will include the last twenty lines of file.inc (automatic %variable %& represents the number of lines in the file). The filename must be in double quotes when a suboperation is used. When a suboperation is used on a wildcarded filename, it will be applied to all files.
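
The four rows of the table above and the suboperation can be written as the following statements. The file names are the placeholder names used in the table; the comments restate the documented behaviour.

```asm
  INCLUDE "file.inc"              ; No path: searched for in INCLUDEPATH=, error E6914 if not found.
  INCLUDE "./file.inc"            ; Own path: INCLUDEPATH= is ignored, error E6914 if not found.
  INCLUDE "file*.inc"             ; All matching files from INCLUDEPATH=, no error if none is found.
  INCLUDE "./file*.inc"           ; All matching files from the given path, no error if none is found.
  INCLUDE "file.inc"{%&-20..%&}   ; Only the tail of the file, selected by the sublist operator.
```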

↑ INCLUDE1

INCLUDE1 (include once) behaves exactly like INCLUDE, but first it checks whether the same file (with the same size and contents, regardless of its name) was already included in the program, and skips the file in this case.

Using INCLUDE1 instead of INCLUDE makes it possible to resolve mutual dependencies of source libraries. When some included library uses macros, structures and constant definitions from another library, don't hesitate to INCLUDE1 another.library in each such library.

↑ INCLUDEHEAD

The INCLUDEHEAD variant includes only the contents of HEAD..ENDHEAD block(s) of included file, see the test t2420. Error is reported if no such block is found in the file or if the block is incomplete (missing ENDHEAD). When a suboperation is used with INCLUDEHEAD, it is applied first to the entire included file and HEAD..ENDHEAD block is searched for in the subrange only.

↑ INCLUDEHEAD1

The INCLUDEHEAD1 and INCLUDE1 will ignore the source if the file or any part of it has already been included in the program using INCLUDE, INCLUDE1, INCLUDEHEAD or INCLUDEHEAD1.

Library is treated as already-included when it was included as an entire file with INCLUDE or INCLUDE1, when its interface division was included with INCLUDEHEAD or INCLUDEHEAD1, or when only a suboperated part of it was included.

↑ INCLUDEBIN

Unlike INCLUDE and INCLUDEHEAD, this pseudoinstruction does not treat the file contents as a source to assemble; the contents are emitted as is at the position specified by the offset pointer $ of the current section.

Including binary data should not be confused with linking; it does not update relocatable addresses or external symbols. For instance the statement INCLUDEBIN "C:\WINNT\Media\chimes.wav"[0x2C..] will skip the first 0x2C bytes of WAV header in the sound file and load the rest (raw samples) to the assembled target, as if they were defined with DB statements.

See also t2470.
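
A sketch showing how the included bytes might be labeled and measured. The symbol names Samples and SamplesSize are hypothetical; the file name and the substring come from the example above.

```asm
Samples:                                               ; Label at the start of the emitted data.
            INCLUDEBIN "C:\WINNT\Media\chimes.wav"[0x2C..] ; Raw samples without the WAV header.
SamplesSize EQU $ - Samples                            ; Number of bytes emitted by INCLUDEBIN.
```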

↑ LINK

Pseudoinstruction LINK specifies file(s) which should be linked into the current program.

Each ordinal operand represents file name, which may have wildcards and may be specified with or without path. Relative path refers to the current directory.

If the linked file name does not contain a path, it will be searched for in all directories specified with the EUROASM LINKPATH= option. Unlike included files, suboperations with linked files are not supported.

Linkable files have a specific internal structure, which would probably be damaged if only a suboperated part of the file were subjected to the link process. Therefore only a whole object file or library can be linked.

The position of the LINK statement within the program is not important; the actual linking will be performed when the final program pass is about to end. The order in which the files are linked respects the order in which pseudoinstructions LINK appeared in the source. However, if linked files are specified with wildcards, e.g. LINK "modul*.lib", their order depends on the current filesystem and cannot be reliably predicted. Example:

 LINK Subproc.obj, "..\My modules\W*.obj"

See static linking for more info.

↑ PUBLIC

Scope declaration pseudoinstructions GLOBAL, PUBLIC, EXTERN, EXPORT, IMPORT set the scope property of symbol(s), which is important in linking.

The symbol whose scope is being declared may be in the label field or in the operand field of the statement, or in both. More than one symbol may be declared with one statement. The symbols in question may be referenced forward or backward.

Explicit scope declaration may appear before or after the symbol is actually defined or referred.

Example: Explicit scope declaration of four symbols: Sym1 PUBLIC Sym2, Sym3, Sym4

Specifying a symbol as PUBLIC just tells €ASM that the symbol, which was or will be defined somewhere else in the program, should be referable from other programs statically linked together. A public declaration does not create the symbol; a symbol with that name must be defined somewhere else in the same program.

↑ EXTERN

This property tells €ASM that this symbol is not defined in the program, and so references to its offset must be patched in the code at link time. It is an error to define symbol which is declared as EXTERN in the same program. Instead, it is searched for in other modules at link time, and the linker may report an error when the external symbol is not found.

↑ GLOBAL

Pseudoinstruction GLOBAL can be used to automate dealing with PUBLIC and EXTERN scopes. If the symbol is marked with a GLOBAL statement, it behaves either as public or as external, depending on whether or not it is defined in the same program.

The programmer surely knows whether the declared symbol belongs to the current program or not, so why is the declaration of PUBLIC and EXTERN scope duplicated by GLOBAL? Let's have program PgmA which defines public symbol SymA and refers to external symbol SymB. Similarly, PgmB defines SymB and refers to SymA:
PgmA PROGRAM
      PUBLIC SymA
      EXTERN SymB
      CALL SymB: ; Reference to external symbol.
SymA: RET        ; Definition of public symbol.
     ENDPROGRAM PgmA

PgmB PROGRAM
      PUBLIC SymB
      EXTERN SymA
      CALL SymA: ; Reference to external symbol.
SymB: RET        ; Definition of public symbol.
     ENDPROGRAM PgmB
If we replace PUBLIC and EXTERN declarations with GLOBAL, the same declaration statement can be used in all statically linked programs, either copy&pasted or included from external file, which is easier to maintain:
PgmA PROGRAM
      GLOBAL SymA, SymB
      CALL SymB: ; Reference to external symbol.
SymA: RET        ; Definition of public symbol.
     ENDPROGRAM PgmA

PgmB PROGRAM
      GLOBAL SymA, SymB
      CALL SymA: ; Reference to external symbol.
SymB: RET        ; Definition of public symbol.
     ENDPROGRAM PgmB
Another raison d'être of GLOBAL is backward compatibility with NASM, which doesn't know directive PUBLIC at all. NASM uses directive GLOBAL instead whenever €ASM would require PUBLIC.

↑ IMPORT

Scopes IMPORT and EXPORT are used in dynamic linking, when our program calls an imported function from a DLL. This pseudoinstruction accepts keyword parameter LIB= which specifies the library file. Parameter LIB= may be omitted when the symbols are imported from kernel32.dll (this is the Windows dynamic library of core WinAPI functions).
The library file name doesn't have to be in quotes when it follows the DOS convention 8.3. The library is always specified without path. The operating system uses its own rules ([WinDllSearchOrder]) concerning the directories where libraries are searched for at bind-time.

↑ EXPORT

Scope EXPORT is used when we make a dynamic library and it declares symbols which are expected to be imported by other programs. Similar to PUBLIC scope, symbol marked for EXPORT must be defined in the program, sooner or later.

Pseudoinstruction EXPORT accepts two keyword parameters FWD= and LIB=, which specify that the exported symbol (function name) is in fact provided by another dynamic library (defined with LIB=) under a different symbol name (defined with FWD=). Example:

kernel32 PROGRAM FORMAT=DLL
          EXPORT EnterCriticalSection, LIB="NTDLL.dll", FWD=RtlEnterCriticalSection
          ; Other kernel functions.
         ENDPROGRAM kernel32

Library "kernel32.dll" yields API function EnterCriticalSection, which is in fact provided by the library "NTDLL.dll". In another Windows version it may be provided by a different library "XPDLL.dll", but programs importing the function from the proxy library "kernel32.dll" need no update or recompilation.

↑ ALIGN

This pseudoinstruction is used for explicit alignment of current section pointer $. For instance ALIGN OWORD in code section will emit several (0..15) bytes of NOP operation, so that the next statement will be emitted at octword-aligned address. ALIGN in data sections uses NUL byte (0x00) instead of NOP (0x90) as the alignment stuff.

The operand can be type specifier in short or long notation: B, U, W, D, Q, T, O, Y, Z, BYTE, UNICHAR, WORD, DWORD, QWORD, TBYTE, OWORD, YWORD, ZWORD or arithmetic expression which evaluates to a power of two: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512.
ALIGN TBYTE aligns to 8.

ALIGN statement may have no label but it can have two operands. The second operand is used for intentional unalignment; it need not be a power of 2 and it must be lower than the first one. For instance ALIGN OWORD, QWORD aligns $ to an odd multiple of 8.
ALIGN 8,2 requests the current offset be set at the second byte in qword (counted from zero). Examples of offsets which meet such requirement are 2, 10, 18, 26...
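The arithmetic behind both operands can be modelled in a few lines. The following Python sketch only illustrates the rule described above and is not €ASM's implementation; the function name aligned_offset is hypothetical, and it assumes that no stuff is emitted when the offset already meets the requirement.

```python
def aligned_offset(offset, alignment, unalignment=0):
    """Return the lowest offset >= `offset` whose remainder modulo
    `alignment` equals `unalignment` (model of ALIGN alignment, unalignment)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("the first operand must be a power of two")
    if not 0 <= unalignment < alignment:
        raise ValueError("the second operand must be lower than the first one")
    padding = (unalignment - offset) % alignment   # number of stuff bytes
    return offset + padding

# ALIGN 8,2 applied at offsets 0, 3 and 11:
print([aligned_offset(o, 8, 2) for o in (0, 3, 11)])   # [2, 10, 18]
# ALIGN OWORD, QWORD applied at offset 9 lands on an odd multiple of 8:
print(aligned_offset(9, 16, 8))                         # 24
```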

↑ STRUC

↑ ENDSTRUC

A structure represents virtual section of data declarations which can be used as a mask or a grid-template laid over a piece of memory. Structure is declared with STRUC..ENDSTRUC block. The only statements which may be used within the block are

  1. data definitions specified with D statement and its clones, either initialized or uninitialized
  2. explicit alignment statements (pseudoinstruction ALIGN)
  3. pseudoinstructions STRUC and ENDSTRUC (€ASM allows nested definitions of structures)
  4. line and markup comments
|[.data]                   ::::Section changed.
|00000000:                 | ; Example of structure declaration:
|[MyStruc]                 |MyStruc: STRUC
|00000000:................ |.Member1 D Q     ; Uninitialized QWORD member.
|00000008:........         |.Member2 D D     ; Uninitialized DWORD member.
|0000000C:........         |         D D     ; Uninitialized anonymous DWORD member.
|00000010:FF               |.Member3 D B 255 ; Initialized BYTE member.
|00000011:..               |.Member4 D B     ; Uninitialized BYTE member.
|00000012:............     |  ALIGN QWORD    ; Increase size# MyStruc to QWORD.
|00000018:                 | ENDSTRUC MyStruc
|[.data]                   ::::Section changed.
|00000000:                 |
|00000000:                 |
|00000000:18000000         | DD SIZE# MyStruc    ; MyStruc is 0x18 bytes long.
|00000004:53000000         | DD TYPE# MyStruc    ; Type of any structure is 'S'.
|00000008:00000000         | DD SEGMENT# MyStruc ; Segment/section/offset of any struc declaration is scalar 0.
|0000000C:                 |

Declaration of a structure does not emit any data to the target file. Data are emitted or reserved only when the declared structure is actually used in data definition (in pseudoinstruction D or DS).
When initialized data is defined in the structure declaration, it will be used to initialize corresponding members at the time of structured data definition (with pseudoinstruction D or DS), unless explicitly redefined.

Named data definitions in the structure must have local names (starting with .).
This allows one to

  1. use the same name for members of different structures,
  2. avoid name conflict when more than one object of this structure is defined.

Each member is given its offset relative to the start of the structure. Section, which was current at the time of structure declaration, is irrelevant. Each structure declaration temporarily creates its own pseudosection with virtual address 0.
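How the member offsets of MyStruc arise can be followed with a simple offset counter. The following Python sketch is a simplified model, not €ASM's algorithm; the names SIZES and layout are hypothetical, only the types B, D and Q are covered, and automatic alignment (AUTOALIGN=) is not modelled.

```python
# Sizes in bytes of the data types used in MyStruc.
SIZES = {"B": 1, "D": 4, "Q": 8}

def layout(members):
    """Walk the declaration with an offset counter starting at 0,
    like the pseudosection which each structure declaration creates."""
    offsets, offset = {}, 0
    for name, kind in members:
        if name == "ALIGN":                  # explicit ALIGN statement
            offset += -offset % SIZES[kind]
            continue
        if name is not None:                 # anonymous members get no name
            offsets[name] = offset
        offset += SIZES[kind]
    return offsets, offset                   # member offsets and SIZE# of the structure

my_struc = [(".Member1", "Q"), (".Member2", "D"), (None, "D"),
            (".Member3", "B"), (".Member4", "B"), ("ALIGN", "Q")]
offsets, size = layout(my_struc)
print(offsets)    # {'.Member1': 0, '.Member2': 8, '.Member3': 16, '.Member4': 17}
print(hex(size))  # 0x18
```

The result agrees with the dump column of the listing above, whatever section was current when the structure was declared.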

Structure must be given a unique structure name, which is defined in the label field of STRUC statement and, optionally, in the operand field of ENDSTRUC statement.

Size of structure can be referred with attribute SIZE#Structure_name.

Pseudoinstruction STRUC accepts keyword operand ALIGN=, which specifies alignment of instances of the structure when EUROASM AUTOALIGN=ON.
If the alignment is not explicitly specified with STRUC declaration, alignment corresponding to PROGRAM WIDTH= is used as the default (WORD, DWORD or QWORD).

See tests t2500, t25010, t2504 for more examples of structure declaration.

↑ D, DB, DU, DW, DD, DQ, DT, DO, DY, DZ, DI, DS

Both initialized and uninitialized data are defined and reserved with pseudoinstruction D. When a static value is specified, the data are defined. When the value is omitted, data are reserved. If EUROASM option AUTOSEGMENT=ON, an INSTR data definition will switch to code section, all other data definitions will switch to data section and data reservations will switch to bss (uninitialized data) section.

|[.data]                   ::::Section changed.
|00000000:                 |; Integer numbers definitions:
|00000000:01               | D BYTE 1      ; Define byte integer with value 1, using long typename specification.
|00000001:00               ....AutoAlignment stuff.
|00000002:0200             | D W 2         ; Define word integer with value 2, using short typename specification.
|00000004:03000000         | D D 3         ; Define dword integer with value 3.
|00000008:0400000000000000 | D Q 4         ; Define qword integer with value 4.
|00000010:                 |; Floating-point numbers definitions:
|00000010:0000A040         | D D 5.0       ; Define single-precision number with value 5.
|00000014:00000000         ....AutoAlignment stuff.
|00000018:0000000000001840 | D Q 6.0       ; Define double-precision number with value 6.
|00000020:00000000000000E0~| D T 7.0       ; Define extended-precision number with value 7.
|0000002A:                 | ; String definitions:
|0000002A:4279746573       | D B "Bytes"   ; Define a string of bytes.
|0000002F:00               ....AutoAlignment stuff.
|00000030:55006E0069006300~| D U "Unichars" ; Define a string of unichars.
|00000040:4368617273       | D "Chars"     ; Define a string of bytes or unichars (depends on the option UNICODE=).
|00000045:                 | ; Instruction operation code definitions:
|[.text]                   ::::Section changed.
|00000000:90               | D INSTR "NOP" ; Define NOP opcode, using long typename specification.
|00000001:C3               | D I "RET"     ; Define RET opcode, using short typename specification.
|00000002:                 | ; String reservations:
|[.bss]                    ::::Section changed.
|00000000:................ | D 8 * B       ; Reserve eight bytes long string.
|00000008:................~| D 9 * U       ; Reserve nine unichars long string.
|0000001A:                 | ; Number reservations:
|0000001A:....             | D W           ; Reserve one word.
|0000001C:........         | D D           ; Reserve one dword.
|00000020:................ | D Q           ; Reserve one qword.
|00000028:................~| D T           ; Reserve one tenbyte.
|00000032:                 | ; Vector reservations:
|00000032:................~....AutoAlignment stuff.
|00000040:................~| D O           ; Reserve one oword, which can hold two qword or four dword numbers.
|00000050:................~....AutoAlignment stuff.
|00000060:................~| D Y           ; Reserve one yword, which can hold four qword or eight dword numbers.
|00000080:................~| D Z           ; Reserve one zword, which can hold eight qword or sixteen dword numbers.
|000000C0:                 |

See t2482 for more examples.
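Some of the bytes shown in the dump column above can be cross-checked outside €ASM. The following Python sketch is only an illustration and not part of €ASM; the helper name dump is hypothetical, and it assumes little-endian IEEE 754 numbers and UTF-16LE unichars, which is what the dump column suggests.

```python
import struct

def dump(data):
    """Format bytes the way the listing dump column does (uppercase hex)."""
    return data.hex().upper()

print(dump(struct.pack("<H", 2)))               # 0200              like D W 2
print(dump(struct.pack("<f", 5.0)))             # 0000A040          like D D 5.0
print(dump(struct.pack("<d", 6.0)))             # 0000000000001840  like D Q 6.0
print(dump("Bytes".encode("ascii")))            # 4279746573        like D B "Bytes"
print(dump("Unichars".encode("utf-16-le"))[:16])  # 55006E0069006300  start of D U "Unichars"
```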

Each operand of D is a data expression.

Pseudoinstruction mnemonic may be appended with suffix B, U, W, D, Q, T, O, Y, Z, I, S. Suffix defines the default datatype, which is used if not explicitly specified in operand. For instance DD 2,3,4 defines three dwords with static values 2, 3 and 4.

Suffix also determines datatype of symbol, which defines the data. For instance in definition Sym1 DQ B 1, W 2, D 4 the suffix specifies that datatype of Sym1 is QWORD, although it defines only byte, word and dword data.

Types of data may mix in the same D statement.

Default datatype specified with mnemonic suffix can be overridden in operand fields by explicit datatype in short or long notation. Operands without explicit redefinition take the default data type from D-suffix, for instance DB 27, "$", W 120 defines two bytes followed with one word. Datatypes in operand may be specified with long names as well, e.g. DB 27, "$", WORD 120.
See t2481 for more examples.

Data from one operand may be duplicated.

For instance TranslateTable: D 256 * BYTE reserves 256 uninitialized bytes.
If duplication is not used, it defaults to 1. Negative duplicator is not permitted.
Duplicator 0 does not define or reserve any data, but it still provides the default datatype of the symbol and, if AUTOALIGN=ON, it aligns the current offset $.

If no suffix is used, the default datatype is taken from the first nonempty operand, e.g. D D 2,3,4 defines three dwords with static values 2,3 and 4. When no default is defined, as in D 2, €ASM reports an error.

The only exception, when datatype needs not be explicitly specified, is definition of text string, for instance D "Some text.". In this case the default datatype is B or U, which depends on current value of EUROASM option UNICODE=.

No data is defined/reserved when no operand is used.
L1: D  B 5      ; Define one byte with value 5.    TYPE#L1='B', SIZE#L1=1.
L2: D  2*WORD 3 ; Define two words with value 3.   TYPE#L2='W', SIZE#L2=4.
L3: DW W        ; Reserve one word.                TYPE#L3='W', SIZE#L3=2.
L4: DW 0*D      ; Reserve nothing, align to DWORD. TYPE#L4='W', SIZE#L4=0.
L5: DQ          ; Reserve nothing, align to QWORD. TYPE#L5='Q', SIZE#L5=0.
L6: D           ; Do nothing.                      TYPE#L6='A', SIZE#L6=0.
Unlike in other assemblers, an omitted operand doesn't emit any data; €ASM requires that the operand type and/or value be specified, no matter whether the D operation is suffixed or not. For instance DB reserves one byte in MASM but it does nothing in €ASM. Use D B or DB B instead.

EuroAssembler can define operation code of machine instruction as data, with pseudoinstruction DI. It is similar to DB or DU but the string contents is not emitted verbatim, it is assembled first. The quoted text in DI operand(s) should be a valid machine instruction, it may have prefix and operands but not a label.

For instance DI "SEGES:MOVSB" defines bytes 0x26,0xA4.
D 8*I"MOVSD" defines eight bytes 0xA5.
See t2515 for more DI examples.

Structured memory variable is defined with pseudoinstruction DS struc_name or just D struc_name.

Only one structured object can be defined with one D statement.

€ASM does not allow multiple ordinal operands when a structured object is defined, such as DS MyStruc1, Mystruc2. Nevertheless, duplication is supported, e.g. DS 4*MyStruc.

Members of the structured object can be overridden statically, using keyword operands. Keyword name is the local name of the defined member, immediately followed by the equal sign = and by the new value of the statically defined member. Namespace of operand fields in DS statement is temporarily changed to the namespace of structure definition.

An instance of MyStruc declared above in the STRUC example could be defined, for example, as MyObject DS MyStruc, .Member2=2, .Member4=4. This initializes the contents of MyObject.Member2 to dword integer 2, and the contents of MyObject.Member4 to byte integer 4. Contents of MyObject.Member3 is already statically defined as byte integer 255, other members of MyObject remain uninitialized.
If at least one member is initialized, the object is by default emitted to data section, uninitialized members are filled with zeroes. See also test t2510.

↑ EQU

↑ =

Pseudoinstruction EQU (or its alias =) defines a symbol, which is given in the label field. The statement must have just one operand, which specifies the address or the numeric value of the symbol.

Statements Label:EQU $ and Label:= $ are equivalent to Label:, i.e. to a statement with label only, which assigns an address to the symbol Label.

Using EQU is the only way to define a plain numeric symbol, such as FILE_ATTRIBUTE_ARCHIVE = 00000020h.

See any macrolibrary within PROGRAM realm as an example of EQU symbol definitions, for example winsfile.htm.

↑ %COMMENT

↑ %ENDCOMMENT

These pseudoinstructions define block comments, i.e. range of source code which is ignored by €ASM. In the label field of %COMMENT there may be an identifier, which gives the block a name (but does not create a symbol). The same identifier can be used as the first operand of %ENDCOMMENT statement. This helps €ASM to check correct matching of %COMMENT/%ENDCOMMENT, especially when comment blocks are nested.

↑ %DROPMACRO

%DROPMACRO tells €ASM to forget previously defined macroinstruction. One %DROPMACRO statement may drop one or more macros specified as operands, e.g.
%DROPMACRO Macro1, Macro2, Macro3.

Alternatively we may drop all macros declared so far with %DROPMACRO *.

See also %DROPMACRO example below.

↑ %IF

↑ %ELSE

↑ %ENDIF

Instructions between %IF and %ENDIF are assembled only if the condition in the first and only %IF operand is evaluated as true. %IF accepts extended boolean expression and it also accepts empty operand, which is always evaluated as false.

Pseudoinstruction %ELSE may occur in the %IF..%ENDIF block. It reverses the logic of assembly: instructions between %IF and %ELSE are assembled when the %IF condition is true and instructions between %ELSE and %ENDIF are assembled when the %IF condition is false.

%IF may have an identifier in the label field which does not create a symbol but it identifies the block. The same identifier can be used in operand field of %ELSE and %ENDIF statements.

↑ %FOR

↑ %EXITFOR

↑ %ENDFOR

Pseudoinstructions %FOR and %ENDFOR create block which is assembled (repeated) for each operand of the %FOR statement. The label field of %FOR statement must be an identifier. It does not create a symbol, instead it defines a formal preprocessing %variable which is accessible in the %FOR..%ENDFOR block only. The name of this %variable consists of percent sign followed with the identifier.

Operands can be arbitrary elements which we need to operate with: register, number, expression, string. Formal %variable will be assigned with each %FOR operand respectively, and the block will be emitted with its value in the formal %variable. The following example defines %FOR loop with three operands and it emits three memory variables:

data %FOR "a", 3*B(5), "Long text"
       D %data
     %ENDFOR data
and it will be expanded to
|00000000:61                 + D "a"
|00000001:050505             + D 3*B(5)
|00000004:4C6F6E672074657874 + D "Long text"
|0000000D:                   |

Repeating the identifier in operand field of %ENDFOR and %EXITFOR statement is optional and it can be used to check proper pairing of block instructions.

The operand of %FOR can also be a numeric range, the block is repeated with each integer value of the range in this case. Slope of the range can be negative; default step of control %variable is -1 in this case instead of +1.

i  %FOR  0..5    ; Slope is positive, therefore implicit step = +1.
      DB "A"+%i  ; Define bytes "A","B","C","D","E","F".
   %ENDFOR i
j  %FOR 'z'..'x' ; Slope is negative, therefore implicit step = -1.
      DB %j      ; Define bytes 'z','y','x'.
   %ENDFOR j

See also t2640.

%FOR accepts keyword integer operand STEP= which explicitly defines how the control %variable is incremented when a range is used. The default value (also used when STEP= is omitted) is STEP=0, which is a special case: the actual effective step is then either +1 or -1, depending on the range slope.

Both kinds of operands (enumerated and range) can be combined. When the step is explicitly defined and its sign differs from the range slope, the %FOR..%ENDFOR body is not assembled for that range. On the other hand, if STEP= is omitted or set to 0, ranges with both slopes can be combined in one %FOR statement and each range-operand will receive the appropriate step +1 or -1. Example:

a %FOR 1..3, 6..4, 7
     ; Block is assembled with %a = 1,2,3,6,5,4,7.
   %ENDFOR

b %FOR 0..64, 256, 400..300, 512, STEP=16
     ; Block is assembled with %b = 0,16,32,48,64,256,512.
   %ENDFOR
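The sequence of values taken by the control %variable in examples a and b above can be reproduced with a short model. The following Python sketch is not €ASM's implementation; the function name expand_for and the use of (first, last) tuples for ranges are hypothetical, and a range whose both ends are equal is treated as having positive slope, which is an assumption.

```python
def expand_for(operands, step=0):
    """Return the values which the control %variable takes.

    operands: plain values, or (first, last) tuples standing for first..last.
    step:     models STEP=; 0 means +1 or -1 according to the range slope.
    """
    values = []
    for operand in operands:
        if not isinstance(operand, tuple):      # enumerated operand
            values.append(operand)
            continue
        first, last = operand
        slope = 1 if last >= first else -1
        effective = step if step != 0 else slope
        if (effective > 0) != (slope > 0):      # explicit step against the slope:
            continue                            # this range contributes nothing
        value = first
        while (value <= last) if effective > 0 else (value >= last):
            values.append(value)
            value += effective
    return values

print(expand_for([(1, 3), (6, 4), 7]))                         # [1, 2, 3, 6, 5, 4, 7]
print(expand_for([(0, 64), 256, (400, 300), 512], step=16))    # [0, 16, 32, 48, 64, 256, 512]
```

In the second call the range 400..300 yields nothing, because its slope is negative while STEP=16 is positive.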

If the formal %FOR variable has identical name with another user-defined %variable, it prevails and the user-defined %variable is not visible until %ENDFOR is encountered. See t2641.

When €ASM encounters %EXITFOR pseudoinstruction, it breaks the assembly of remaining instructions in %FOR..%ENDFOR block and continues below the %ENDFOR statement, no matter how many unprocessed %FOR operands are left.

i  %FOR 0..9
     DB %i
     %IF %i>=3
       %EXITFOR i
     %ENDIF
     DB "a" + %i
   %ENDFOR i ; This will define bytes 0,"a",1,"b",2,"c",3

In nested %FOR..%ENDFOR blocks the formal variable (the first and only operand of %EXITFOR) can be used to specify which of the nested blocks should be exited, see t2642 as an example.

↑ %WHILE

↑ %EXITWHILE

↑ %ENDWHILE

The block of statements between %WHILE and %ENDWHILE is being assembled repeatedly while the condition in first and only %WHILE operand is true. If the condition is false at the block entry, it is skipped entirely.

Identifier may be used in the label of %WHILE and in the operand of %ENDWHILE and %EXITWHILE just for visual binding; it does not define a symbol.

Unlike %FOR, which temporarily declares and maintains its own control %variable, %WHILE does not. It is the programmer's duty to declare some control %variable outside the block, and to change it within %WHILE..%ENDWHILE. Example:

%i  %SETA 3        ; Define %variable %i which will control the block expansion.
id1 %WHILE %i
C%i:  DB %i
%i    %SETA %i - 1 ; Alter the user-defined control %variable.
    %ENDWHILE id1
; Statements assembled with %WHILE..%ENDWHILE block: C3: DB 3, C2: DB 2, C1: DB 1.

%EXITWHILE in the block will cause skipping the rest of statements; €ASM will continue below %ENDWHILE.

See also t2700, t2701, t2702.

↑ %REPEAT

↑ %EXITREPEAT

↑ %ENDREPEAT alias

↑ %UNTIL

The conditional assembly block %REPEAT..%ENDREPEAT is similar to %WHILE..%ENDWHILE but the condition is evaluated at the end of block, and the logic is inverted. %REPEAT takes no label and no condition operand. The statements in the block are always assembled at least once. The control condition is in the operand field of %ENDREPEAT; if it evaluates to false, €ASM will assemble the block repeatedly. Alias %UNTIL may be used instead of mnemonic %ENDREPEAT.

Block %REPEAT..%ENDREPEAT can use identifier for nesting check. Unlike other block statements, position of block identifier is different: Block identifier can be specified as the first operand of %REPEAT, and as the label of %ENDREPEAT (alias %UNTIL).

%i  %SETA 3           ; Define %variable %i which will control the block expansion.
    %REPEAT Id1
      C%i: DB %i
      %i %SETA %i - 1 ; Alter the user-defined control %variable.
Id1 %UNTIL %i = 0
; Statements assembled with %REPEAT..%UNTIL block: C3: DB 3, C2: DB 2, C1: DB 1.

%EXITREPEAT in the block will cause skipping the rest of statements; €ASM will continue below %ENDREPEAT.

See also t2750, t2751, t2752.
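The difference between testing the condition at the block entry (%WHILE) and at the block end (%REPEAT..%UNTIL) shows only when the condition fails from the start. The following Python sketch models both blocks; it is not €ASM code, the function names are hypothetical, and the conditions are written as i > 0 and i <= 0 (instead of %i and %i = 0 from the examples above) so that the sketch also terminates for a start value of 0.

```python
def expand_while(i):
    """Model of  %WHILE %i > 0 ... %ENDWHILE  with %i decremented in the block."""
    emitted = []
    while i > 0:                     # condition is tested before the block
        emitted.append(f"C{i}: DB {i}")
        i -= 1
    return emitted

def expand_repeat(i):
    """Model of  %REPEAT ... %UNTIL %i <= 0  with %i decremented in the block."""
    emitted = []
    while True:                      # the block is assembled at least once
        emitted.append(f"C{i}: DB {i}")
        i -= 1
        if i <= 0:                   # condition is tested after the block
            break
    return emitted

print(expand_while(3))   # ['C3: DB 3', 'C2: DB 2', 'C1: DB 1']
print(expand_repeat(3))  # ['C3: DB 3', 'C2: DB 2', 'C1: DB 1']
print(expand_while(0))   # []
print(expand_repeat(0))  # ['C0: DB 0']
```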

↑ %SET

Pseudoinstruction %SET and other members of its family are designed to assign a value to preprocessing %variable, which is in the label field of the statement.

%SET assigns the whole list of operands as a verbatim text, including the commas which separate operands from one another. White spaces between the operation mnemonic (%SET) and the first operand are omitted. White spaces after the last operand are trimmed off, too. White spaces are similarly trimmed when line-continuation is used.

%CardList %SET Hearts, Diamonds, Clubs, Spades  ; Comment

%CardList now will contain the string Hearts, Diamonds, Clubs, Spades (31 characters including spaces and commas).

See also t2810.

↑ %SETA

%SETA accepts arithmetic expressions. They will be evaluated and assigned to the %variable as signed decimal number. Error is reported if the %SETA operand is not a valid expression.

When more than one operand is used, each value is set to the corresponding comma-separated item of the %variable, which is being assigned. Example:

%Value %SETA PoolEnd - PoolBegin
%Sizes %SETA 2+3, 4, ,-5*2

The difference between offsets PoolEnd and PoolBegin in the previous example was calculated and assigned to %Value as a decimal number.
%Sizes now contains the text 5,4,,-10 (8 characters). Individual items of %Sizes can be retrieved with sublist operation, such as %Sizes{2}.

See also t2821.

%SETA is better suited for modification of control %variable in preprocessing loop, such as %i %SETA %i+1. Though text assignment %i %SET %i+1 would work here as well, with %SET the expression is not evaluated immediately and we might wind up with something like +1+1+1+1+1+1+1+1+1+1+1+1+1+1+1 after the 15th expansion.
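The growth of the text can be imitated outside €ASM. The following Python sketch only models the two kinds of assignment on a loop counter; the function names are hypothetical and no €ASM expression evaluation is involved.

```python
def repeat_set(text, times):
    """Model of  %i %SET %i+1 : the text grows, nothing is evaluated."""
    for _ in range(times):
        text = text + "+1"
    return text

def repeat_seta(value, times):
    """Model of  %i %SETA %i+1 : the expression is evaluated at each assignment."""
    for _ in range(times):
        value = value + 1
    return str(value)

print(repeat_set("0", 3))   # 0+1+1+1
print(repeat_seta(0, 3))    # 3
```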

↑ %SETB

%SETB is similar to %SETA, it accepts extended boolean expressions and assigns them in the form of binary digits 1 or 0.

See also t2831.

Unlike with %SETA, the binary digits are not separated with commas if more than one operand is used in %SETB statement. Items of assigned variable can be retrieved with substring operation. Example:

%TooBig %SETB 5 > 4                   ; %TooBig is assigned with one character 1 (true).
%Flags  %SETB %TooBig, 2,,3>2,off,4,, ; %Flags  are assigned  with 110101.
        %IF %Flags[1]                 ; True, equals to 1st member of %Flags, i.e. %TooBig, i.e. 1.
Flags:  DB %Flags[]b                  ; Memory variable contains 00110101b.

↑ %SETC

%SETC accepts expression operand, which must evaluate to a plain number not greater than 255 and not lower than -128. The result will be assigned as one character with evaluated ASCII byte value. Example:

%Quote %SETC """" ; One character "quote" is assigned.
%Tab   %SETC 9    ; One character "tabulator" is assigned.
%NBSP  %SETC -1   ; One character with value 0xFF is assigned.

As with %SETB, multiple operands may be defined in %SETC and the resulting characters are not separated with commas.

%Hexadigits %SETC 'A','B','C','D','E','F'
; %Hexadigits now contains six characters ABCDEF

See also t2841.
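The mapping from the accepted number range to one byte per operand can be sketched as follows. This Python model is not €ASM's implementation; the function name setc is hypothetical, and raising an exception for out-of-range values merely stands in for whatever error €ASM reports.

```python
def setc(*values):
    """Model of %SETC: each operand is a number in -128..255
    and contributes one byte; the bytes are not separated with commas."""
    for value in values:
        if not -128 <= value <= 255:
            raise ValueError(f"{value} is outside -128..255")
    return bytes(value & 0xFF for value in values)

print(setc(9))                      # b'\t'
print(setc(-1))                     # b'\xff'
print(setc(*map(ord, "ABCDEF")))    # b'ABCDEF'
```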

%SETC allows assigning special characters to preprocessing %variable, which could not be assigned as plain text with %SET due to €ASM parser syntax rules.
%Space %SETC 32 assigns one space. This could also be achieved with
%QuotedSpace %SET " " and suboperating only the 2nd of three assigned characters:
%Space %SET %QuotedSpace[2].

↑ %SETE

This pseudoinstruction reads environment variable from system at assembly time and assigns its value to the preprocessing variable. Name of environment variable specified in the operand field(s) is cited without quotes, percent signs or dollar sign, e.g.

%OS %SETE OS
Msg: DB "This program was assembled at %OS system."

€ASM reports warning W2520 when the requested variable is empty or not defined.

%SETE allows retrieving more than one environment variable; their values will be assigned unquoted and comma-separated. Example:

%CpuInfo %SETE PROCESSOR_ARCHITECTURE, PROCESSOR_IDENTIFIER, \
               PROCESSOR_LEVEL, PROCESSOR_REVISION
On my old computer this will assign the following text to %CpuInfo:
x86,x86 Family 15 Model 1 Stepping 2, GenuineIntel,15,0102. Because Windows inserts a comma character into the value of %PROCESSOR_IDENTIFIER%, it wouldn't be easy to retrieve individual components from such a concatenation with a sublist such as %CpuInfo{4}. So it is always better to use %SETE for only one environment variable.

↑ %SETS

%SETS looks at the %variable in the operand field and assigns its size, i.e. the number of characters which its value occupies.

%SomeVar        %SET  ABC, DEF
%SomeSize       %SETS %SomeVar  ; %SomeSize is now 8 (3 letters + comma + space + 3 letters).
%SizeOfSomeSize %SETS %SomeSize ; %SizeOfSomeSize is now 1 (one digit).

%SETS must have just one operand, which looks like a preprocessing %variable (percent sign followed with an identifier).

See also t2861.

↑ %SETL

%SETL is similar to %SETS except that it assigns length of the %variable contents, i.e. the number of comma-separated items in the %variable contents.

%SomeVar            %SET  ABC, DEF
%SomeLength         %SETL %SomeVar    ; %SomeLength is now 2 (2 comma separated items).
%LengthOfSomeLength %SETL %SomeLength ; %LengthOfSomeLength is now 1 (one item).

%SETL must have just one operand, which looks like a preprocessing %variable (percent sign followed with an identifier).

See also t2866.
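The two measures provided by %SETS and %SETL can be compared side by side. The following Python sketch is a naive model, not €ASM's implementation; the function names are hypothetical, and commas inside quoted strings as well as empty %variables are not handled.

```python
def sets(value):
    """Model of %SETS: size, i.e. the number of characters."""
    return len(value)

def setl(value):
    """Model of %SETL: length, i.e. the number of comma-separated items."""
    return value.count(",") + 1

some_var = "ABC, DEF"
print(sets(some_var), setl(some_var))       # 8 2
print(sets("5,4,,-10"), setl("5,4,,-10"))   # 8 4
```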

↑ %SET2

Consider assembly of the statement %Var1 %SET %Var2. €ASM first expands the %Var2 and result of expansion is then assigned to %Var1. First two tokens of the statement are not expanded, because %Var1 is the target which is just being assigned, and %SET is reserved name which is never expanded.

%SET2 is similar to %SET except that the operand field is expanded 2 times before being assigned. Each expansion "swallows" one percent sign.

%V1 %SET "A"
%V2 %SET "B"
%V3 %SET "C"
i   %FOR 1..3
      %DataExp %SET2 %%V%i
      DB %DataExp
    %ENDFOR i ; Emit DB "A", DB "B", DB "C".

See also t2871.

Only special macros make use of %SET2, for instance EndProcedure where it is used to expand %variable with not-known-yet dynamically changing name.

↑ %SETX

When pseudoinstruction of SET* family is being assembled, €ASM does not expand label field and operation field of statements such as %Label %SET* anything. This applies to %SET, %SETA, %SETB, %SETC, %SETU, %SETE, %SETS, %SETL, %SET2 but not to %SETX. In this statement the label field is expanded, too. After the expansion of label field %SETX works like ordinary %SET, which means that it requires a valid %variable name in the label field. For instance %%Var1 %SETX ABC is equivalent to %Var1 %SET ABC.

Using %SETX we can assign %variables whose names are not explicitly set at assembly time and dynamically change. Example:

i %FOR 1..4
     %%M%i %SETX %i  ; Identical with %M1 %SET 1, %M2 %SET 2 etc.
  %ENDFOR  ; This will assign values 1,2,3,4 to preprocessing %variables %M1,%M2,%M3,%M4.

See also t2881.

Only special macros make use of %SETX, for instance Procedure where it is used to assign stack-frame addresses to %variables, whose names are not-known-yet at macro-write time.

↑ %MACRO

↑ %EXITMACRO

↑ %ENDMACRO

Block of statements claimed with pseudoinstructions %MACRO and %ENDMACRO is called macro declaration. Identifier in the label field of %MACRO statement is the name of macro.
%MACRO statement itself is called macro prototype, as it declares macro name and gives names to macro arguments. Once declared, macro can be expanded many times in the program.

When €ASM reads the macro declaration in source text, it does not emit any code. Instructions from the macro body will be emitted only when the macro is actually expanded with its macroinstruction.

%EXITMACRO breaks the emitting process when it is encountered, usually because some error condition was detected.

Both %EXITMACRO and %ENDMACRO pseudoinstructions may have the macro name in the operand field in order to emphasize block matching.

Example of macro declaration and macro expansion:

AlignEAX %MACRO ; Round-up the contents of EAX to multiple of 4.
           ADD EAX,3
           AND EAX,-4
         %ENDMACRO AlignEAX

         MOV EAX,13
         AlignEAX   ; After macro expansion EAX contains 16.

For more information see also the chapter MacroInstructions.

↑ %SHIFT

Pseudoinstruction %SHIFT is usable in macro block only. It will decrement the ordinal number of all macro operands by one or by the integer, which it has in operand field. %SHIFT may have no label and only one operand which evaluates to a plain integer number. Default 1 is assumed when the operand is omitted.

%SHIFT 0 does nothing. Shifting by a negative number will reverse the direction.

The operation affects only macro operands accessed by ordinal number, such as %1, %2 etc. Accessing operands by formal names remains unaffected by %SHIFT operation.

Operands, which are left-shifted from ordinal position %1 to position zero or negative, are not accessible by ordinal number any longer, but they are not lost forever, as they may be shifted back by negative number.

|          |Sample %MACRO Oper1, Oper2, Oper3
|          |L1: DB %1, %Oper1
|          |    %SHIFT 1
|          |L2: DB %1, %Oper1
|          |    %SHIFT 2
|          |L3: DB %1, %Oper1
|          |    %ENDMACRO Sample
|0000:     |
|0000:     |Sample 0x44, 0x55, 0x66, 0x77
|          +Sample %MACRO Oper1, Oper2, Oper3
|0000:4444 +L1: DB %1, %Oper1
|          +    %SHIFT 1
|0002:5544 +L2: DB %1, %Oper1
|          +    %SHIFT 2
|0004:7744 +L3: DB %1, %Oper1
|          +    %ENDMACRO Sample
|0006:     |

See also t8221.
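The renumbering of ordinal operands in the Sample macro above can be modelled with a single running shift value. The following Python sketch is not €ASM's implementation; the class name MacroOperands and its methods are hypothetical.

```python
class MacroOperands:
    """Model of ordinal macro operands %1, %2, ... under %SHIFT."""

    def __init__(self, *operands):
        self.operands = operands
        self.shifted = 0                 # sum of all %SHIFT operands so far

    def shift(self, count=1):
        """%SHIFT count: negative count shifts back."""
        self.shifted += count

    def ordinal(self, n):
        """Value of %n, or None when no operand sits at that position."""
        index = n + self.shifted - 1
        if 0 <= index < len(self.operands):
            return self.operands[index]
        return None

sample = MacroOperands(0x44, 0x55, 0x66, 0x77)
print(hex(sample.ordinal(1)))   # 0x44
sample.shift(1)
print(hex(sample.ordinal(1)))   # 0x55
sample.shift(2)
print(hex(sample.ordinal(1)))   # 0x77
sample.shift(-3)                # shifted-out operands are accessible again
print(hex(sample.ordinal(1)))   # 0x44
```

The formal name %Oper1 is not part of the model because, as the listing shows, it keeps the value 0x44 regardless of %SHIFT.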

↑ %ERROR

Pseudoinstruction %ERROR will insert into the listing file a user-defined error message similar to those emitted by €ASM itself when it finds some mistake in the source text. %ERROR is often used in macroinstructions and usually it warns the programmer that the macro was not used in the intended way.

User defined errors have severity code U and severity level 5, which is between warnings and assembler errors. Programmer may specify the actual message identifier with optional keyword operand ID= which can be plain decimal number between 5000 and 5999. %ERROR will also accept identifier with value 0..999 and it adds internally 5000 in this case. Default value is 0, so the user defined message has identifier U5000, if no keyword operand ID= was used.
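The mapping from the ID= value to the message identifier can be sketched as follows. This Python model is not €ASM's implementation; the function name message_id is hypothetical, and raising an exception for values outside both ranges is an assumption, since the text above does not say how €ASM treats them.

```python
def message_id(ident=0):
    """Model of %ERROR ID=: values 0..999 get 5000 added,
    values 5000..5999 are taken as they are."""
    if 0 <= ident <= 999:
        ident += 5000
    if not 5000 <= ident <= 5999:
        raise ValueError("ID= must be 0..999 or 5000..5999")  # assumed reaction
    return f"U{ident}"

print(message_id())       # U5000
print(message_id(123))    # U5123
print(message_id(5123))   # U5123
```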

The message text does not have to be in quotes. If the message text consists of more than one ordinal operand, they will be concatenated verbatim, including quotes, if used. Example:

%ERROR Id=5123, Something went wrong. Try again.

See also t2581 for more examples.

↑ %DISPLAY

Pseudoinstruction %DISPLAY is used for retrieving information about internal objects created by €ASM during assembly process. Each such object is displayed in the form of debug message with severity level 1. The message is printed both to output console (in each pass) and to the listing file (in the final pass).
%DISPLAY is active even in non-emitting source passages, such as false %IF branch or block disabled with %COMMENT. It is intended to investigate €ASM internals when something is not working as expected.

Pseudoinstruction %DISPLAY accepts arbitrary number of operands (object categories), which specify the kind of objects that we want to review. Categories may be provided as ordinal operands or as keyword operands with value which specifies the filter. Filter can restrict the amount of displayed lines. Category names are case insensitive but the filtering value, if used, is case sensitive. Filter value defines first few characters of those object names, which we want to display. Filter value may be terminated with asterisk *, but this is not mandatory. For instance the statement %DISPLAY Macros=Alig will display all macros whose names begin with "Alig".

Operands of pseudoinstruction %DISPLAY have rather relaxed syntax. Object categories (ordinal operand name or keyword name) may be shortened, too. Only such number of characters is required which is enough to identify the desired category. For instance %DISPLAY se will display map of all segments and their sections. %DISPLAY File displays the list of input files (main source and included libraries). %DISPLAY sym=Num*, sym=En will list only those symbols, whose name begins with Num or En.

%DISPLAY UserVar, %DISPLAY UserVar=*and %DISPLAY user= work equally (empty filter value will match any %variable name. Nonfilterable categories, such as segments, context stack, automatic macro %variables, will always display their complete list, any filterring value is ignored.

When specifying (system) %variable names as the filter value, the leading percent sign % or %^ may be omitted, or the percent sign must be doubled (otherwise the %variable would be expanded to its current contents). %DISPLAY UserVar=Loc, %DISPLAY us=Loc* and %DISPLAY user=%%Loc are equal in their function: they display the current contents of user-defined preprocessing %variables whose names begin with %Loc.

%DISPLAY object categories

%DISPLAY operand       Messages      Filter   Order         Displayed objects
All                    D1100..D1900  yes      alphabetical  All objects specified below (shortcut for Fil,Ch,Se,St,Co,Sym,L,Rel,M,V).
Files                  D1150..D1190  ignored  natural       Source files included in the program.
Chunks                 D1200..D1240  ignored  natural       Chunks of source code.
Sections               D1250..D1290  ignored  natural       Map of groups, segments and sections.
Segments               D1250..D1290  ignored  natural       Map of groups, segments and sections.
Groups                 D1250..D1290  ignored  natural       Map of groups, segments and sections.
Structures             D1300..D1340  yes      alphabetical  Structures declared in the program.
Context                D1350..D1390  ignored  stacked       Context stack of block statements.
Symbols                D1400..D1450  yes      alphabetical  All explicitly defined symbols (shortcut for Fix,Unf,Unr,Ref).
  UnfixedSymbols       D1410..D1450  yes      alphabetical  Symbols whose properties are not stable yet.
  FixedSymbols         D1420..D1450  yes      alphabetical  Symbols whose properties are already fixed.
  UnreferencedSymbols  D1430..D1450  yes      alphabetical  Symbols which were not used yet.
  ReferencedSymbols    D1440..D1450  yes      alphabetical  Symbols which were mentioned at least once, or used in a structure.
LiteralSymbols         D1500..D1540  ignored  alphabetical  All literal symbols declared in the program.
Relocations            D1550..D1590  ignored  natural       Relocation records.
Macros                 D1600..D1690  yes      alphabetical  Macroinstructions declared at this moment.
Variables              D1700..D1790  yes      alphabetical  All preprocessing %variables currently set (shortcut for Au,Fo,Us,Sys).
  AutomaticVariables   D1710..D1730  ignored  fixed         Automatic macro %variables.
  FormalVariables      D1740..D1750  yes      alphabetical  Formal macro/for %variables.
  UserVariables        D1760..D1770  yes      alphabetical  User-defined preprocessing %variables.
  SystemVariables      D1780..D1790  yes      alphabetical  System preprocessing %^variables.

A displayed message usually contains the object name, its attributes and other properties.

%DISPLAY operands Groups, Segments, Sections are identical; each of them always displays the complete tree.
A line with a group lists the names of all the group's segments.
A line with a segment is indented by 2 spaces and displays purpose, width, align, combine, class, src.
A line with a section is indented by 4 spaces and displays address, size, align, ref.

Property src= specifies whether the file or chunk is

Chunk property type= shows what kind of information is in this chunk of source text:

Boolean property ref= tells whether the symbol, structure or section was used (referenced at least once in the program). Members of a structure are automatically referenced when the structure is defined.
Similar property fix= specifies if the offset of symbol or section is already fixed, i.e. it is stable between assembly passes.
Context property emit= informs whether the block is in normal (emitting) status, or if it is just bypassed without emitting any code or data.

Context property %.= shows current value of expansion counter in this block.

Property src= identifies position in source text where the displayed object was defined, in standard form "FileName"{LineNumber}.

Automatic and formal %variables are defined only in %macro | %for expansion, i.e. when %DISPLAY Auto,Formal is inserted in %MACRO..%ENDMACRO or %FOR..%ENDFOR body and the macro is then expanded.

See tests t2901..t2917 for examples of %DISPLAY output.

Unlike other instructions, %DISPLAY is active even in non-emitting status. Be careful when putting an unfiltered %DISPLAY into repeating preprocessing loops (%FOR, %WHILE, %REPEAT), as this may substantially flood the output.

The main purpose of %DISPLAY is to find errors at assembly-time, when €ASM doesn't work as expected, together with EUROASM options DISPLAYSTM=, DISPLAYENC= and with PROGRAM options LISTGLOBALS=, LISTLITERALS=, LISTMAP=.
For investigation of your program at run-time use a debugger.

↑ %DEBUG

↑ %PROFILE

Those pseudoinstruction names are reserved for future extension of EuroAssembler; they are not implemented yet. See also EUROASM boolean options DEBUG= and PROFILE=.


↑ Macroinstructions

A macro is defined by a block of statements (macro body) enclosed between pseudoinstructions %MACRO and %ENDMACRO. The %MACRO statement itself (macro prototype) must have a label, which can be used later for macro invocation (alias macro expansion).

Macro must be defined before it is invoked.

Statement, which has the name of previously declared %MACRO in its operation field, is called macroinstruction or simply macro. It will be replaced with statements from the block %MACRO..%ENDMACRO. Macro can be a fixed static set of instructions, such as

CarriageReturn %MACRO
                 MOV AH,2  ; 3 statements between %MACRO and %ENDMACRO are macro body.
                 MOV DL,13
                 INT 21h
               %ENDMACRO CarriageReturn

More useful are macros which can modify the expanded instructions depending on operands they are invoked with. When a macro is invoked, it is usually provided with operand values, which are available in macro body as formal %variables or as automatic ordinal %variables %1, %2, %3,.... Operands in macrodefinition may be given temporary formal symbolic name; they are accessible in the macro block by this name prefixed with percent sign %. Or they may be referred with their ordinal number prefixed with %. Keyword operands are only accessible with the formal key name prefixed with %. Example:

Copy %MACRO Source, Destination, Size=ECX ; Statement %MACRO is called macro prototype.
       MOV ESI, %Source      ; or MOV ESI, %1
       MOV EDI, %Destination ; or MOV EDI, %2
       MOV ECX, %Size
       REP MOVSB
     %ENDMACRO Copy

The previous macro needlessly moves the number of copied bytes to register ECX even if it is already there at the time of its invocation. The expanded instruction MOV ECX,ECX could be spared in this case:

Copy %MACRO Source, Destination, Size=ECX
       MOV ESI, %Source      ; Instead of formal %Source we could use MOV ESI, %1
       MOV EDI, %Destination ; Or MOV EDI, %2
       %IF "%Size" !== "ECX"
         MOV ECX, %Size
       %ENDIF
       REP MOVSB
     %ENDMACRO Copy

Now when the macro is invoked as Copy From, To, Size=ecx or as Copy From, To, no superfluous MOV ECX,ECX is expanded.

If the name of a formal macro %variable happens to collide with some previously user-defined preprocessing %variable, visibility of the user-defined %variable is temporarily overridden by the formal %variable, see the test t8347.
Automatic variables, such as %*, %#, %:, %1, %2, ..., are not visible outside the macro body.

All macros in EuroAssembler may have variable number of operands.

The number of operands specified at macro invocation doesn't need to correspond with the number of operands specified at macro definition. If the macro is invoked with fewer ordinal operands than its prototype declares, €ASM does not treat this as an error and silently expands the omitted operands to nothing.
When the macro is invoked with more operands than its prototype specifies, those superfluous operands are not accessible in macro expansion by formal names, but still they may be referred by their automatic ordinal number. See also pseudoinstruction %SHIFT.
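The binding of invocation operands to formal and automatic %variables can be illustrated with a small Python model. It is a sketch under stated assumptions, not a description of €ASM internals: the function name `bind_operands` and the dictionary representation are hypothetical, and keyword operands that the prototype does not declare are simply ignored here.

```python
def bind_operands(formal_names, keyword_defaults, ordinals, keywords=None):
    """Model how macro operands become %variables (illustrative only)."""
    supplied = keywords or {}
    variables = {}
    # Every supplied ordinal operand is reachable as %1, %2, ...
    for position, value in enumerate(ordinals, start=1):
        variables[f"%{position}"] = value
    # Formal names bind by position; omitted operands expand to nothing.
    for index, name in enumerate(formal_names):
        variables[f"%{name}"] = ordinals[index] if index < len(ordinals) else ""
    # Keyword operands keep the prototype default unless the invocation overrides it.
    for name, default in keyword_defaults.items():
        variables[f"%{name}"] = supplied.get(name, default)
    return variables
```

For the prototype Copy %MACRO Source, Destination, Size=ECX, calling `bind_operands(["Source", "Destination"], {"Size": "ECX"}, ["From"])` leaves `%Destination` empty and `%Size` at its default, while a third ordinal operand would be reachable only as `%3`.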

When a keyword operand is omitted in macro invocation, it retains the value which was specified at macro definition. Adding optional keyword operands makes it possible to extend the functionality of a macroinstruction without breaking backward compatibility. Consider this simple macro:

Write %MACRO TextPtr,TextSize ; Write the text to standard output.
   MOV DX,%TextPtr
   MOV CX,%TextSize
   MOV BX,1       ; File handle of standard output.
   MOV AH,40h     ; Write string DS:DX to device or file.
   INT 21h        ; Invoke the DOS service.
 %ENDMACRO Write

Later we may want to use the same macro for writing to other devices, too. Let's extend it with keyword operand Handle= with predefined default value of standard output:

Write %MACRO TextPtr,TextSize,Handle=1 ; Write the text to standard output.
   MOV DX,%TextPtr
   MOV CX,%TextSize
   MOV BX,%Handle ; Handle of output device or file.
   MOV AH,40h     ; Write string DS:DX to device or file.
   INT 21h        ; Invoke the DOS service.
 %ENDMACRO Write

Now it's possible to write to other devices, too, for instance to the standard line printer: Write Message,80,Handle=4. The enhanced macro Write is backward compatible. Even if our old programs include updated macrolibrary with enhanced macro Write, they don't have to be recompiled.

Similarly to preprocessing %variables, macros may be redefined many times. However, this is not usual and €ASM will emit warning W2512 in this case. A macro, once defined, can be undefined with pseudoinstruction %DROPMACRO.

An example of a situation where dropping the macro definition may be useful is the emulation of a machine instruction by a macro with the same name.
Machine instruction BSWAP, which reverses the byte order in 32bit register, was not available on Intel 80386. This could be solved by emulation using three ROR or ROL instructions. If we detect that our program runs on Pentium, we can drop the macro definition and €ASM will assemble BSWAP as a native machine instruction.

|00000000:          |
|                   |BSWAP %MACRO reg32 ; Swap the byte order in register.
|                   |        %IF TYPE# %reg32 <> 'R' || SIZE# %reg32 <> 4
|                   |          %ERROR 'Macro "BSWAP" expects 32bit GPR as its operand.'
|                   |          %EXITMACRO BSWAP
|                   |        %ENDIF
|                   |%reg16  %SET %reg32[2..3] ; Name of the lower half of reg32 (omit the letter E).
|                   |        ROL %reg16,8
|                   |        ROL %reg32,16
|                   |        ROL %reg16,8
|                   |      %ENDMACRO BSWAP
|00000000:          |
|00000000:BA78563412|        MOV EDX,0x12345678
|00000005:          |        BSWAP EDX ; Expected result is EDX=0x78563412.
|                   +BSWAP %MACRO reg32 ; Swap the byte order in register.
|FALSE              +        %IF TYPE# %reg32 <> 'R' || SIZE# %reg32 <> 4
|                   +          %ERROR 'Macro "BSWAP" requires 32bit GPR as its operand.'
|                   +          %EXITMACRO BSWAP
|                   +        %ENDIF
|4458               +%reg16  %SET %reg32[2..3] ; Name of the lower half of reg32.
|00000005:66C1C208  +        ROL %reg16,8
|00000009:C1C210    +        ROL %reg32,16
|0000000C:66C1C208  +        ROL %reg16,8
|                   +      %ENDMACRO BSWAP
|                   |        ; If CPU is 486 or higher, prefer the machine instruction.
|                   |        %DROPMACRO BSWAP
|00000010:0FCA      |        BSWAP EDX ; This time swap the byte order with native 486 instruction.
|00000012:          |
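Why three rotations reverse the byte order can be checked outside the assembler. The Python sketch below mirrors the macro body step by step on plain integers; the helper names `rol` and `bswap_emulated` are hypothetical and exist only for this illustration.

```python
def rol(value, count, width):
    """Rotate a width-bit unsigned value left by count bits."""
    mask = (1 << width) - 1
    count %= width
    return ((value << count) | (value >> (width - count))) & mask


def bswap_emulated(reg32):
    """Mirror the macro body: ROL r16,8 / ROL r32,16 / ROL r16,8."""
    # ROL %reg16,8 swaps the two low bytes and leaves the upper half untouched.
    reg32 = (reg32 & 0xFFFF0000) | rol(reg32 & 0xFFFF, 8, 16)
    # ROL %reg32,16 exchanges the upper and lower 16-bit halves.
    reg32 = rol(reg32, 16, 32)
    # ROL %reg16,8 swaps the bytes of the new lower half.
    return (reg32 & 0xFFFF0000) | rol(reg32 & 0xFFFF, 8, 16)
```

Applied to 0x12345678 the intermediate values are 0x12347856, 0x78561234 and finally 0x78563412, the result expected in the listing.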

The advanced EuroAssembler macrolanguage makes it possible to change the style of programming. We can create macroinstructions which mimic the functions of high-level languages and customize the new "language" for a particular task. See the macros Ii* in €ASM source file ii.htm as an example of a pseudolanguage developed for intelligible description of the conversion from assembly instruction to machine code.

When something doesn't work as expected, it's always possible to look at the expanded macroinstruction body in the listing and stick to plain assembly code.


↑ Program formats

BIN ↓

COM ↓

MZ ↓

OMF ↓

LIBOMF ↓

COFF ↓

LIBCOF ↓

PE ↓

DLL ↓

RSRC ↓

Width of program formats ↓

Target of EuroAssembler's endeavour is an output file in one of the formats selected by PROGRAM FORMAT= option. There are three main categories of €ASM output files:

  1. linkable file (also called module or object file) is designed to be joined with other modules and libraries into final executable file or to an object library.
    €ASM supports two main standards of object files: OMF and COFF . Default object file name extension is .obj.
  2. library is a collection of modules, ready to be linked on demand into the final executable file. There are four kinds of libraries supported by EuroAssembler:

    Default filename extension of object | import library is .lib, in case of dynamic library it is .dll.

  3. executable file (also called image) can be loaded and launched directly by the shell of the hosting OS.
    €ASM can produce executables in formats PE, MZ, COM, they have file extension .exe or .com. It can also create dynamically loaded libraries DLL, very similar to PE format, but they can be executed only indirectly, through invocation of their exported function from another program, or through a special Windows loader, such as RUNDLL32.exe.
    Program format BIN is ranked as executable, too. However, as it lacks any red tape information, binary file needs its own ad hoc loader to be launched directly, or it must be loaded to a special storage place of the computer, such as firmware ROM or boot sector of disk device.

↑ BIN

Option PROGRAM FORMAT=BIN is chosen as default when FORMAT= is not explicitly specified. Default options for BIN format are:

Name: PROGRAM FORMAT=BIN, OUTFILE=%^PROGRAM.bin,MODEL=TINY,WIDTH=16, \
              ENTRY=0,IMAGEBASE=0,SECTIONALIGN=0,FILEALIGN=0

€ASM creates default segment [BIN] with universal purpose:

[BIN] SEGMENT WIDTH=16,ALIGN=16, \
              PURPOSE=CODE+DATA+BSS+STACK+LITERALS
These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ENTRY, ICONFILE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

Structure of BIN file is straightforward: binary image is a concatenation of emitted contents of its segments. Noninitialized (BSS) segments are omitted.

Segment alignment in the image is the highest of the values PROGRAM FILEALIGN= (0 by default), PROGRAM SECTIONALIGN= (0 by default) and SEGMENT ALIGN= (16 by default). Gaps between segments are filled with alignment stuff, which is 0x90 (NOP) if both neighbouring segments have SEGMENT PURPOSE=CODE, otherwise it is 0x00.
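The layout rules for a BIN image can be summarized in a short Python sketch. It is a simplified model, not the €ASM linker: the function name, the tuple representation of a segment and the reading of "has PURPOSE=CODE" as "CODE is among the segment's purposes" are assumptions.

```python
def build_bin_image(segments, file_align=0, section_align=0):
    """Concatenate segments given as (data, purposes, align) tuples.

    data is a bytes object, or None for an uninitialized (BSS) segment,
    which is omitted from the image.
    """
    image = bytearray()
    previous_purposes = None
    for data, purposes, seg_align in segments:
        if data is None:
            continue  # BSS segments occupy no space in the file
        alignment = max(file_align, section_align, seg_align, 1)
        gap = -len(image) % alignment
        if gap:
            # Fill with NOP between two code segments, otherwise with zeros.
            both_code = "CODE" in purposes and "CODE" in previous_purposes
            image += bytes([0x90 if both_code else 0x00]) * gap
        image += data
        previous_purposes = purposes
    return bytes(image)
```

For example, two one-byte code segments followed by a two-byte data segment, all with ALIGN=16, produce a 34-byte image: 15 NOP bytes after the first segment and 15 zero bytes after the second.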

Typical applications of binary format are pure data files, conversion tables, Dos drivers, boot sectors etc., see the sample BIN projects.

↑ COM

Files in COM format are a legacy of the CP/M operating system. They are directly executable in Dos and 32bit Windows; in other systems they run only with a Dos emulator.

Default options for PROGRAM FORMAT=COM are:

Name: PROGRAM FORMAT=COM,OUTFILE=%^PROGRAM.com,MODEL=TINY,WIDTH=16,IMAGEBASE=0, \
              ENTRY=256,SECTIONALIGN=0,FILEALIGN=0

Options ENTRY=0x100 and IMAGEBASE=0 are fixed for this format and cannot be changed.

€ASM creates default implicit segment [COM] with universal purpose:

[COM] SEGMENT WIDTH=16,ALIGN=16,PURPOSE=CODE+DATA+BSS+STACK+LITERALS
These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ICONFILE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

The structure of a COM file is similar to BIN format: there is no metainformation stored in the file except for its extension .com, which tells the OS to treat it as executable. The OS loader will allocate 64 KB of memory, load segment registers CS,DS,ES,SS with the paragraph address of that block, initialize the 256 bytes long [PSP] structure at offset 0, load the entire file contents at offset 256 (0x0100), set the stack pointer to the top of the allocated block (usually SP=0xFFFE) and finally set IP=0x0100.

Size of code+data+stack altogether should not exceed 64 KB in TINY memory model. A program in COM format can use 32bit registers if the CPU is 386 or higher. Additional memory blocks may also be requested from the OS at runtime. Typical applications of this obsolete format are fast and short little utilities and Terminate-and-Stay-Resident (TSR) programs which provide services in Dos, see the sample DOS projects.

The following COM example is only 1 byte long, yet it is a formally valid computer program, though it does nothing:

         EUROASM
Shortest PROGRAM FORMAT=COM
          RET
         ENDPROGRAM Shortest

Program in COM format can link other object files or libraries, see the test table linker combinations.

↑ MZ

Specifying program format MZ creates 16bit or 32bit realmode executable file, which can be directly run in Dos and in 32bit Windows. Its structure is described in [MZ] and [MZEXE]. Dos executable file begins with MZ signature 'M','Z'.

Default options for PROGRAM FORMAT=MZ are:

PROGRAM FORMAT=MZ,OUTFILE=%^PROGRAM.exe,MODEL=SMALL,WIDTH=16,IMAGEBASE=0, \
        SECTIONALIGN=0,FILEALIGN=0,SIZEOFSTACKCOMMIT=4K,SIZEOFHEAPCOMMIT=1M

€ASM creates four default implicit segments [CODE], [DATA], [BSS], [STACK] in program formats MZ, OMF, LIBOMF.

Parameter PROGRAM SizeOfStackCommit= specifies default size of segment [STACK], so we don't have to explicitly define stack segment if EUROASM option AUTOSEGMENT= is enabled at the ENDPROGRAM statement.

Parameter PROGRAM SizeOfHeapCommit= can be used to limit the requested amount of heap memory preallocated by the loader (member .e_maxalloc of DOS file header).

If the memory model is HUGE or FLAT and program width is not explicitly specified, it defaults to PROGRAM WIDTH=32, otherwise it is 16.

ImageBase=0 is fixed for this format and cannot be changed.
Explicit specification of PROGRAM Entry= is mandatory in MZ format.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ICONFILE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SIZEOFHEAPRESERVE, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

↑ OMF

Object Module Format as specified in [OMF] is designed to be linked to 16bit and 32bit real-mode programs. Imports in this format are linkable to protected-mode executables.

Default segments are the same as in MZ format.

File format OMF is recognized for LINK when it is composed of valid OMF records and the first record is THEADR or LHEADR.

Default options for PROGRAM FORMAT=OMF are:

Name: PROGRAM FORMAT=OMF,OUTFILE=%^PROGRAM.obj,MODEL=SMALL,WIDTH=16
These PROGRAM options are irrelevant: DLLCHARACTERISTICS, FILEALIGN, ICONFILE, IMAGEBASE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SECTIONALIGN, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

↑ LIBOMF

OMF library format is described in Appendix 2 of the same document as [OMF]. The hashed dictionary, required by the format specification at the end of the library, is created on output, but the €ASM linker ignores it. When the library is linked to another program, its public symbols are searched sequentially. Page size of LIBOMF libraries created by €ASM is fixed at 16.

Default segments are the same as in MZ format.

File format LIBOMF is recognized for LINK when it starts with LIBHDR record with page size 16, 32, 64,..32K, and this record is followed by valid OMF modules, which start with THEADR or LHEADR records and which end with MODEND or MODEND32 record each. Library dictionary at the end of file is not checked.
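The first step of this recognition, the check of the library header record, can be sketched in Python. The sketch relies on two details of the OMF library format that are not stated in this manual: the LIBHDR record type byte is 0xF0, and the page size equals the record length word plus 3. The function name is hypothetical, and validation of the modules which follow the header is not modelled.

```python
import struct

LIBHDR = 0xF0  # OMF library header record type


def libomf_page_size(header):
    """Return the page size if header starts with a plausible LIBHDR record, else None."""
    if len(header) < 3 or header[0] != LIBHDR:
        return None
    (record_length,) = struct.unpack_from("<H", header, 1)
    # The record spans exactly one page: type byte + length word + record body.
    page_size = record_length + 3
    is_power_of_two = (page_size & (page_size - 1)) == 0
    if 16 <= page_size <= 32768 and is_power_of_two:
        return page_size
    return None
```

A library written by €ASM would yield 16 here, since its page size is fixed at that value.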

Default options for PROGRAM FORMAT=LIBOMF are:

Name: PROGRAM FORMAT=LIBOMF,OUTFILE=%^PROGRAM.lib

Other properties are inherited from library modules.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, FILEALIGN, ICONFILE, IMAGEBASE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MODEL, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SECTIONALIGN, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM, WIDTH.

Modules, which will be stored to the library, should be assembled beforehand to files in OMF format. If the program, which creates library, contains some code, it will be assembled and stored as the first library module. Modules from other linked libraries, which do not declare any global symbol, will not be included in the target library at all. Example of static OMF library linked from 3 standalone modules:

MyLib: PROGRAM FORMAT=LIBOMF
        LINK "Module1.obj", "Module2.obj", "Module3.obj"
       ENDPROGRAM MyLib

Although format OMF was developed for real-mode programs, it can be enhanced with import declarations represented by OMF records COMENT/IMPDEF, and such an import library can be used in Windows programs.

Some librarians (for instance [ALIB]) create longer alternatives of import library, which add LEDATA+FIXUPP records with relocatable machine code of proxy jumps to the imported function.
€ASM does not create the longer version of import libraries but both short and long versions are accepted by the linker. Example of a program creating pure import library in short OMF format:

ImpLib PROGRAM FORMAT=LIBOMF
  IMPORT LIB="kernel32.dll",TerminateProcess,TerminateThread
  IMPORT LIB="user32.dll",CreateCursor,CreateIcon,CreateMenu
 ENDPROGRAM ImpLib

↑ COFF

EuroAssembler implements object format COFF in Microsoft modification described in [MS_PECOFF]. This description is also valid for €ASM formats LIBCOF, PE, DLL (COFF-based formats).

€ASM creates three default segments (sections) in COFF-based formats:
[.text], [.data], [.bss]. Machine stack for executables will be established by the loader at run-time.

Default options for PROGRAM FORMAT=COFF are:

PROGRAM FORMAT=COFF,OUTFILE=%^PROGRAM.obj,MODEL=FLAT,WIDTH=32
These PROGRAM options are irrelevant: DLLCHARACTERISTICS, FILEALIGN, ICONFILE, IMAGEBASE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SECTIONALIGN, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

Generated value in PFCOFF_FILE_HEADER.Machine for legacy mode COFF is always 0x014C (Intel 386) regardless of EUROASM CPU= value. In 64bit mode it is always 0x8664 (architecture AMD64). Architecture Itanium (0x0200) is currently not supported.

PFCOFF_FILE_HEADER.TimeDateStamp corresponds with the current system time, unless it is forged by option EUROASM TIMESTAMP=.

Linked COFF module is recognized by the contents of PFCOFF_FILE_HEADER.Machine which should be one of the words with value 0x0000, 0x014C, 0x014D, 0x014E, 0x0200, 0x8664.
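As a rough illustration of this recognition rule, the Python sketch below reads the Machine word, which is the first member of the COFF file header, and compares it with the listed values. The function name is hypothetical and the check is deliberately minimal; a real linker would validate much more of the file.

```python
import struct

# Machine values listed in the text as accepted for a linked COFF module.
ACCEPTED_MACHINES = {0x0000, 0x014C, 0x014D, 0x014E, 0x0200, 0x8664}


def looks_like_coff(data):
    """Check the first little-endian word (the Machine field of the file header)."""
    if len(data) < 2:
        return False
    (machine,) = struct.unpack_from("<H", data, 0)
    return machine in ACCEPTED_MACHINES
```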

↑ LIBCOF

COFF library format is described in [COFFlib].

Default options for PROGRAM FORMAT=LIBCOF are:

PROGRAM FORMAT=LIBCOF,OUTFILE=%^PROGRAM.lib,MODEL=FLAT,WIDTH=32

Default segments are the same as in COFF format.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, FILEALIGN, ICONFILE, IMAGEBASE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SECTIONALIGN, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

COFF library is identified by the signature !<arch> followed with byte 0x0A.

Modules, which will be stored to the library, should be assembled beforehand to files in COFF format. If the program, which creates library, contains some code, it will be assembled and stored as the first library module. Modules which do not declare any global symbol, will not be included in the library at all. Example of COFF library linked from 3 modules:

MyLib: PROGRAM FORMAT=LIBCOF
         LINK "Module1.obj", "Module2.obj", "Module3.obj"
       ENDPROGRAM MyLib

€ASM does not create the longer version of import libraries but both short and long versions are accepted by the linker. Example of a program creating import library in short COFF format:

ImpLib: PROGRAM FORMAT=LIBCOF
         IMPORT LIB="kernel32.dll",TerminateProcess,TerminateThread
         IMPORT LIB="user32.dll",CreateCursor,CreateIcon,CreateMenu
        ENDPROGRAM ImpLib:

↑ PE

Portable executable file format PE is described in [MS_PECOFF]. Default options for PROGRAM FORMAT=PE are:

Name: PROGRAM FORMAT=PE,OUTFILE=%^PROGRAM.exe,MODEL=FLAT,WIDTH=32,IMAGEBASE=4M,FILEALIGN=512,SECTIONALIGN=4K, \
              SUBSYSTEM=CON,ICONFILE="euroasm.ico",MAJORLINKERVERSION=1,MINORLINKERVERSION=0,ENTRY=,          \
              MAJOROSVERSION=4,MINOROSVERSION=0,MAJORIMAGEVERSION=1,MINORIMAGEVERSION=0,                      \
              MAJORSUBSYSTEMVERSION=4,MINORSUBSYSTEMVERSION=0,WIN32VERSIONVALUE=0,DLLCHARACTERISTICS=0x000F,  \
              SIZEOFSTACKRESERVE=1M,SIZEOFSTACKCOMMIT=4K,SIZEOFHEAPRESERVE=4M,SIZEOFHEAPCOMMIT=1M

Default segments are the same as in COFF format.

PE file begins with DOS program (stub) in MZ format, which is executed when the program is not launched in MS Windows. At the file address PFMZ_DOS_HEADER.e_lfanew it expects the PE format signature with bytes 'P','E',0,0.
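Locating the PE signature through the DOS header can be sketched in Python as follows. The sketch assumes the standard position of the e_lfanew member at offset 0x3C of the MZ header; the function name is hypothetical.

```python
import struct

E_LFANEW_OFFSET = 0x3C  # position of e_lfanew in the DOS (MZ) header


def has_pe_signature(data):
    """Return True if data starts with an MZ header that points at 'P','E',0,0."""
    if len(data) < E_LFANEW_OFFSET + 4 or data[:2] != b"MZ":
        return False
    (pe_offset,) = struct.unpack_from("<I", data, E_LFANEW_OFFSET)
    return data[pe_offset:pe_offset + 4] == b"PE\0\0"
```

A plain MZ executable fails this test because the bytes at the address stored in e_lfanew, if there are any, do not spell the PE signature.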

Older file format with NE (New Executable) signature, used in 16bit Windows and OS/2, is not supported by €ASM.

COFF file header is followed with PFPE_OPTIONAL_HEADER. Almost all its fields are configurable with PROGRAM options.
PROGRAM ENTRY= must be explicitly specified in PE format.
Option PROGRAM STUBFILE= specifies the file name of a 16bit MZ program used when the program runs in DOS. If it is left empty, €ASM will use its own built-in stub, which reports the error message "This program was launched in DOS but it requires Windows." and terminates.
Factory default option ICONFILE="euroasm.ico" specifies the icon file name, which will be built in the resource section of linked PE file. It visually represents the compiled file in Windows Explorer or Desktop.

This parameter is ignored if any resource file is explicitly linked to PE (Explorer will then use the first icon found in the PE resources). If the ICONFILE= option is explicitly defined as empty, and if no resources are linked, the resource section [.rsrc] will be omitted from PE file completely.

Optional header is followed with 16 special directory entries which identify sections with special purposes (other than ordinary segment purposes CODE, DATA, BSS). See the last 16 lines in Segment purpose table, starting with EXPORT.

EuroAssembler natively supports only a few of the special directories:

EXPORT
automatically creates section [.edata] with the table of exported symbols, if they are declared.
IMPORT
automatically creates section [.idata] with the table of imported symbol names and ordinals.
RESOURCE
is created when a resource file is linked to the executable or when program option ICONFILE= specifies an existing icon.
BASERELOC
contains the table of relocations which must be applied by the loader when the executable could not be loaded at the preferred VA specified by program option IMAGEBASE=.
IAT
import address table is created in section [.idata], same as the special directory IMPORT. Concatenation of tables IAT, IMPORT and thunk proxy jumps to one common section [.idata] reduces the size of image.

Other special directories are not supported by this EuroAssembler version. Nevertheless, their segment may be created explicitly, their contents created manually or by some third-party tool and emitted to the segment with INCLUDEBIN or directly with Data definition statements. If segment parameter PURPOSE= complies with the table (case insensitive), the corresponding directory entry in PE optional header will be created, covering the whole segment contents. Example:

[.cormeta] SEGMENT PURPOSE=CLR
 D '<compatibility xmlns="urn:schemas-microsoft-com:compatibility.v1">'
 D '  <application>'
 D '     <!-- A list of all Windows versions that this application is designed to work with. -->'
 D '   </application>'
 D ' </compatibility>'

When EUROASM option DEBUG= is enabled at the ENDPROGRAM pseudoinstruction, a symbol table is appended to the PECOFF image.

Debuggers should be able to retrieve symbol names from the debugged executable and associate them with disassembled source lines. Unfortunately, none of the tools which I tried was able to exploit the symbol table from PE.

↑ DLL

File format DLL is almost identical with the format PE, with some minor differences:
File header field PFCOFF_FILE_HEADER.Characteristics is flagged with pfcoffFILE_DLL = 0x2000,
default file extension and image base are:

Name: PROGRAM FORMAT=DLL,OUTFILE=%^PROGRAM.dll,IMAGEBASE=256M

option ENTRY= is optional in DLL.

Default segments are the same as in COFF format.

Dynamically linkable symbols should be explicitly declared with exported scope.
Pseudoinstruction EXPORT supports dynamic DLL forwarding of exported function to a different function in other DLL, using EXPORT key operands FWD= and LIB=. See the test t7583 as an example.

Format DLL is sometimes used as a resource library which contains only the [.rsrc] section, typically a collection of icons. This is achieved by linking a compiled resource file, as created by a 3rd party resource compiler. An example of a resource-only DLL, which contains 3 icons, can be found in tests t7586 and t7616.

↑ RSRC

Microsoft resources is a common name for multimedia data, such as bitmap pictures, icons, cursor shapes, fonts etc. Resources used in a GUI program are described in a resource script as a tree referring to individual graphic files. A typical script is a plain text file with extension .rc and it should be converted by a resource compiler into a binary resource file with extension .res, which is linkable by €ASM or other linkers. Its format is described in [RSRC].

MyCompiledResource PROGRAM FORMAT=RSRC does not work: EuroAssembler cannot compile resource scripts. Use a third-party tool instead, such as [MS_RC], [GoRC], or [ResourceHacker].

When a resource file is linked to PE or DLL image created by €ASM, program option ICONFILE= is ignored. The file is converted by €ASM to internal PECOFF binary-tree structure in special section [.rsrc] and referred with optional-header directory entry RESOURCE.

↑ Width of program formats

Width of output files linked by EuroAssembler is determined by program option WIDTH= and it defaults to 32 in COFF-based formats. To create a 64bit program PE, DLL, COFF or LIBCOF, program width must be explicitly specified and 64bit CPU + SIMD should be enabled, too.

   EUROASM CPU=X64,SIMD=SSE2
MyProgram64 PROGRAM FORMAT=PE, WIDTH=64
   ...

Differences between PE-COFF formats generated by EuroAssembler

Member                                                  PROGRAM WIDTH=16    PROGRAM WIDTH=32    PROGRAM WIDTH=64
PFCOFF_FILE_HEADER.Machine                              0x014C (Intel 386)  0x014C (Intel 386)  0x8664 (AMD64)
PFCOFF_FILE_HEADER.Characteristics:32BIT_MACHINE        0 (false)           0x0100 (true)       0 (false)
PFCOFF_FILE_HEADER.Characteristics:LARGE_ADDRESS_AWARE  0 (false)           0 (false)           0x0020 (true)
PFPE_OPTIONAL_HEADER.Magic                              0x010B (PE32)       0x010B (PE32)       0x020B (PE32+)
SIZE# PFPE_OPTIONAL_HEADER                              224                 224                 240
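Read in the opposite direction, the table allows the program width to be inferred from header fields of an existing file. The following Python sketch does that for files produced with the values listed above; the function and constant names are hypothetical and files created by other tools may not follow the same pattern.

```python
FLAG_LARGE_ADDRESS_AWARE = 0x0020  # Characteristics bit set for WIDTH=64
FLAG_32BIT_MACHINE = 0x0100        # Characteristics bit set for WIDTH=32


def program_width(machine, characteristics, magic):
    """Infer PROGRAM WIDTH= from header fields, following the table of differences."""
    if machine == 0x8664 and magic == 0x020B:
        return 64
    if machine == 0x014C and magic == 0x010B:
        return 32 if characteristics & FLAG_32BIT_MACHINE else 16
    return None  # combination not produced according to the table
```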

↑ EuroAssembler functions

Preprocessing ↓

Refactoring ↓

Assembler ↓

Assembly debugging ↓

Linker ↓

Librarian ↓

Object convertor ↓

Makefile manager ↓

Optimisation ↓

Where to begin ↓

This chapter describes EuroAssembler capabilities.

↑ Preprocessing

Many assemblers provide tools which help the programmer with tedious and repetitive work; such assemblers are called macroassemblers. The preprocessing (macro) apparatus in EuroAssembler is recognizable by the percent sign % prefixed to pseudoinstructions which control generating of repeated blocks of source code (%REPEAT, %WHILE, %FOR, %MACRO), conditional assembly (%IF, %COMMENT), assembly-time debugging (%DISPLAY), run-time debugging (%DEBUG, %PROFILE) and assignment and expansion of preprocessing %variables (%SET* family).

This set of tools manipulates the source text before it is submitted to the final assembly processing (to the plain assembler, which is not aware of the preprocessing apparatus at all).

Some compilers perform preprocessing in a special 0-th pass, which takes the input source file and emits plain assembly source. Preprocessed intermediate file can be manually inspected.

EuroAssembler utilizes a different approach: instead of preprocessing the source file as a whole at once, it preprocesses statement by statement in each assembly pass. This makes it possible to work with data which change dynamically and which are not fixed until €ASM has passed through the source program at least once, for instance the distance between labels, the size of not-yet-defined structures and segments etc.

The relation between preprocessing and the plain assembly is similar to the relation between Javascript and the plain HTML text in internet browsers.

Proper function of €ASM preprocessing can be checked in the listing, by enabling options EUROASM LISTVAR=ENABLE, LISTREPEAT=ENABLE, LISTMACRO=ENABLE.
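
A minimal sketch of how these listing options might be switched on only around the block under inspection. The PUSH/POP pair and the %FOR loop reuse constructs shown elsewhere in this manual; the emitted DB values are purely illustrative.

```asm
   EUROASM PUSH, LISTVAR=ENABLE, LISTREPEAT=ENABLE, LISTMACRO=ENABLE ; Save options, then list expansions.
Nr %FOR 1,2,3  ; Repeat the %FOR..%ENDFOR block three times.
   DB %Nr      ; %Nr expands to 1, 2 and 3 in turn.
   %ENDFOR Nr
   EUROASM POP ; Restore the previous listing options.
```

The expanded statements should then be visible in the listing only for the lines between PUSH and POP.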

↑ Refactoring

Inline code ↓

Bypassed PROC ↓

PROC in own section ↓

PROC1 ↓

PROC in INCLUDE ↓

Statically linked PROC ↓

Dynamically linked PROC ↓

Inline macro ↓

Macro calling PROC ↓

Semiinline macro ↓

This chapter demonstrates various ways to break the program functionality up into small subprograms in EuroAssembler.

Let's suppose that we need a function which calculates the third power of a positive integer input. The result should fit in 32 bits, otherwise the program will report an overflow and abort.

Assuming 32bit mode and the input number loaded in register EAX, the solution uses the instruction MUL (unsigned multiplication) twice.

↑ Inline code

The straightforward solution inserts the code directly into the main program flow.

    ; EAX contains the input number N.
    MOV ECX,EAX ; Copy the input value N to the register ECX.
    MUL ECX     ; Let EDX:EAX = N*N
    JC Abort:   ; CF=OF=1 when EDX is nonzero (32bit overflow).
    MUL ECX     ; Let EDX:EAX = N*N*N
    JC Abort:   ; Abort on overflow.
    ; EAX now contains N³, continue the main program flow.

↑ Bypassed PROC

When such a calculation is needed more than once, we should consider refactoring the inline code into a subprocedure which can be called repeatedly. We will insert the procedure named Cube into the program flow where its function is needed for the first time. A callable procedure inserted this way must be bypassed with a jump. The procedure should also be accompanied by remarks which document its function.

       ; EAX contains the input number N.
       CALL Cube:  ; Invoke the function which calculates N³.
       JC Abort:   ; Abort on overflow.
       JMP Bypass: ; Skip the function code.
Cube PROC  ; Define a function which calculates 3rd power of N.
; Input:   EAX=integer number N.
; Output:  CF=OF=0, EAX=N³, ECX=N, EDX=0.
; Overflow:CF=OF=1, EAX,ECX,EDX undefined.
       MOV ECX,EAX ; Copy the input value N to the register ECX.
       MUL ECX     ; Let EDX:EAX = N*N
       JC .Abort   ; CF=OF=1 when EDX is nonzero (32bit overflow).
       MUL ECX     ; Let EDX:EAX = N*N*N
.Abort:RET         ; Return; CF=OF=1 signals overflow to the caller.
     ENDPROC Cube
Bypass: ; EAX now contains N³, continue the main program flow.

↑ PROC in own section

The instruction JMP Bypass: could be spared if the procedure code were defined somewhere else, below the main program flow. This can be achieved by emitting the procedure to a different code section (for instance [Subproc]).

       ; EAX contains the input number N.
       CALL Cube:  ; Invoke the function which calculates N³.
       JC Abort:   ; Abort on overflow.
%CurrentSect %SET %^Section ; Backup the current section name to a variable.
[Subproc]  ; Switch emitting to a different code section.
Cube PROC  ; Define a function which calculates 3rd power of N.
; Input:   EAX=integer number N.
; Output:  CF=OF=0, EAX=N³, ECX=N, EDX=0.
; Overflow:CF=OF=1, EAX,ECX,EDX undefined.
       MOV ECX,EAX ; Copy the input value N to the register ECX.
       MUL ECX     ; Let EDX:EAX = N*N
       JC .Abort   ; CF=OF=1 when EDX is nonzero (32bit overflow).
       MUL ECX     ; Let EDX:EAX = N*N*N
.Abort:RET         ; Return; CF=OF=1 signals overflow to the caller.
     ENDPROC Cube
[%CurrentSect]     ; Return to the original code section.
        ; EAX now contains N³, continue the main program flow.

↑ PROC1

Rather than switching sections manually, we can use the €ASM block PROC1..ENDPROC1, which switches to a different section [@RT1] and returns to the original section automatically.

       ; EAX contains the input number N.
       CALL Cube:  ; Invoke the function which calculates N³.
       JC Abort:   ; Abort on overflow.
Cube PROC1 ; Define a function which calculates 3rd power of N in section [@RT1].
; Input:   EAX=integer number N.
; Output:  CF=OF=0, EAX=N³, ECX=N, EDX=0.
; Overflow:CF=OF=1, EAX,ECX,EDX undefined.
       MOV ECX,EAX ; Copy the input value N to the register ECX.
       MUL ECX     ; Let EDX:EAX = N*N
       JC .Abort   ; CF=OF=1 when EDX is nonzero (32bit overflow).
       MUL ECX     ; Let EDX:EAX = N*N*N
.Abort:RET         ; Return; CF=OF=1 signals overflow to the caller.
     ENDPROC1 Cube ; End of subprocedure in section [@RT1]. Return to [.text].
     ; EAX now contains N³, continue the main program flow.

↑ PROC in INCLUDE

Defining the function Cube at the place where it is used is good for understandability. On the other hand, when there are more such definitions, they clutter the main program thread. The source is more clearly organized if those helper functions are moved to a different file, for instance functions.inc. This file will be included in the main source file at assembly-time.

       INCLUDE "functions.inc" ; File with Cube: PROC source definition.
       ; EAX contains the input number N.
       CALL Cube:  ; Invoke the function which calculates N³.
       JC Abort:   ; Abort on overflow.
       ; EAX now contains N³, continue the main program flow.

↑ Statically linked PROC

Functions defined in the included file functions.inc can be wrapped in a block functions PROGRAM..ENDPROGRAM and assembled separately to an OMF or COFF object file functions.obj, or possibly to a library. The function name (Cube) must be declared as GLOBAL or PUBLIC in the object file, and as GLOBAL or EXTERN in the main file. Instead of an explicit GLOBAL declaration, the name may also be written with a double colon (Cube::). The assembled object will then be statically linked to the main program at link-time.

       LINK "functions.obj" ; Object file with assembled code of function Cube.
       ; EAX contains the input number N.
       CALL Cube:: ; Invoke the external function which calculates N³.
       JC Abort:   ; Abort on overflow.
       ; EAX now contains N³, continue the main program flow.
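
For illustration, the separately assembled module on the other side of this link might look like the following sketch. It assumes that the contents of functions.inc are wrapped in a block named functions and assembled with FORMAT=COFF and WIDTH=32; the double colon makes Cube global, as described above.

```asm
functions PROGRAM FORMAT=COFF, WIDTH=32 ; Assumed to produce the object file "functions.obj".
Cube:: PROC        ; Double colon declares Cube as a global symbol.
       MOV ECX,EAX ; Copy the input value N to the register ECX.
       MUL ECX     ; Let EDX:EAX = N*N
       JC .Abort   ; CF=OF=1 when EDX is nonzero (32bit overflow).
       MUL ECX     ; Let EDX:EAX = N*N*N
.Abort:RET         ; Return; CF=OF=1 signals overflow to the caller.
      ENDPROC Cube
     ENDPROGRAM functions
```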

↑ Dynamically linked PROC

Functions defined in the included file functions.inc can be wrapped in a block functions PROGRAM..ENDPROGRAM and assembled separately to a dynamically linked library file functions.dll. The function name (Cube) must be declared as EXPORT in the library file and as IMPORT in the main executable file. The assembled function in the DLL will then be dynamically bound to the main program at run-time.

       IMPORT Cube, LIB="functions.dll"
       ; EAX contains the input number N.
       CALL Cube:: ; Invoke the DLL function which calculates N³.
       JC Abort:   ; Abort on overflow.
       ; EAX now contains N³, continue the main program flow.
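
The library side of this arrangement might be sketched as follows. The block name functions, the operands FORMAT=DLL and WIDTH=32, and the operand form of EXPORT (assumed here by analogy with IMPORT) are illustrative; check them against the description of the EXPORT pseudoinstruction before use.

```asm
functions PROGRAM FORMAT=DLL, WIDTH=32 ; Assumed to produce the library "functions.dll".
       EXPORT Cube ; Assumed syntax: make Cube available to importing programs.
Cube   PROC
       MOV ECX,EAX ; Copy the input value N to the register ECX.
       MUL ECX     ; Let EDX:EAX = N*N
       JC .Abort   ; CF=OF=1 when EDX is nonzero (32bit overflow).
       MUL ECX     ; Let EDX:EAX = N*N*N
.Abort:RET         ; Return; CF=OF=1 signals overflow to the caller.
      ENDPROC Cube
     ENDPROGRAM functions
```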

↑ Inline macro

An alternative approach to the repeated inline code is utilizing a macro which will expand whenever the functionality is requested.

Statements which define the macro need not be bypassed, because they don't emit any code, but the macro definition must appear before the macro is used. The definition could be put aside in an included file as well, similarly to the PROC in INCLUDE method.

Cube %MACRO
       MOV ECX,EAX ; Copy the input value N to the register ECX.
       MUL ECX     ; Let EDX:EAX = N*N
       JC Abort%.: ; CF=OF=1 when EDX is nonzero (32bit overflow).
       MUL ECX     ; Let EDX:EAX = N*N*N
Abort%.:           ; Label name is modified by %. variable, which increments in each macro expansion.
     %ENDMACRO Cube
     ; EAX contains the input number N.
     Cube          ; Expansion of the macro.
     JC Abort:     ; Abort on overflow.
     ; EAX now contains N³, continue the main program flow.

↑ Macro calling PROC

Inline macros are fast, but each invocation repeats the whole function code. The size of the program can be reduced if the macro calls a procedure containing the function code, which can also be put aside in functions.inc. The role of the macro is then limited to processing any parameters and hiding the calling convention (no parameters are actually used in our simple example, though).

     INCLUDE "functions.inc" ; File with Cube: PROC source definition.
Cube %MACRO       ; Definition of the macro Cube.
       CALL Cube: ; Calling the procedure Cube:
     %ENDMACRO Cube
     ; EAX contains the input number N.
     Cube         ; Invoke macro which calls the included PROC.
     JC Abort     ; Abort on overflow.
     ; EAX now contains N³, continue the main program flow.

↑ Semiinline macro

The disadvantage of the previous method is that we have to maintain two blocks of code: the macro definition and the procedure definition. €ASM provides the procedure block PROC1, which is assembled only once, even if the macro which contains it is invoked repeatedly. Thanks to this, the procedure code is emitted only once, when the macro is invoked for the first time, and if the macro is never invoked, the code is not emitted at all. A macrolibrary with such semiinline macros can be included in any program and does not increase the size of the final code if the macro is not used (expanded) in the program.

This method is preferred in most macrolibraries shipped with EuroAssembler.

Cube %MACRO          ; Definition of the semiinline macro Cube.
       CALL Cube:    ; Calling the procedure Cube:
 Cube: PROC1         ; The PROC1 block is assembled only once on first macro invocation.
         MOV ECX,EAX ; Copy the input value N to the register ECX.
         MUL ECX     ; Let EDX:EAX = N*N
         JC .Abort:  ; CF=OF=1 when EDX is nonzero (32bit overflow).
         MUL ECX     ; Let EDX:EAX = N*N*N
  .Abort:RET         ; Return; CF=OF=1 signals overflow to the caller.
       ENDPROC1 Cube:
     %ENDMACRO Cube
     ; EAX contains the input number N.
     Cube            ; Invocation of the macro which calls the embedded PROC1.
     JC Abort        ; Abort on overflow.
     ; EAX now contains N³, continue the main program flow.

↑ Assembler

Source envelope ↓

Chained programs ↓

Nested programs ↓

This chapter takes a closer look at how a program block of statements is processed by EuroAssembler.

↑ Source envelope

Consider a plain text file src.asm submitted to assembler:

 DB 'This source "src.asm" has'
 DB ' no PROGRAM statement.',13,10
 DB 'EuroAssembler will use '
 DB 'a fictive envelope instead.'

As no PROGRAM..ENDPROGRAM block is defined in this source, the output format of €ASM object file is configured only by [PROGRAM] section in configuration file euroasm.ini, or by built-in default, which is PROGRAM FORMAT=BIN,MODEL=TINY,WIDTH=16.

EuroAssembler formally wraps each source file into two fictive envelope statements PROGRAM and ENDPROGRAM. Prefixed envelope PROGRAM statement derives its label (module name) from the source file name, cutting off its extension. Thus it will assemble the source src.asm to a data file src.bin. This behaviour is compatible with most other assemblers.

If the source file name starts with a digit, such a label is not acceptable to €ASM, so the module name will be prefixed with a grave accent ` and the source 123.asm is assembled to `123.bin.

Similarly, when the label of the PROGRAM statement contains ? or other characters which the filesystem does not accept, each such character will be replaced with an underscore _ in the module file name. The statement IsNumlockOn? PROGRAM FORMAT=COM will produce a program named IsNumlockOn_.com.

€ASM uses the ANSI version of the Windows API for dealing with file names, so I recommend abstaining from national characters outside the current codepage in source file names.

When the source file is loaded in memory, €ASM begins to read the source, starting with the envelope statement PROGRAM. When the corresponding ENDPROGRAM is found, an assembly pass is over. €ASM checks all symbols, which might have been defined in the program, and looks whether their offset is marked fixed, i.e. it did not change between passes. If at least one symbol has its offset not fixed yet, another assembly pass is needed and €ASM goes back to the PROGRAM statement. When all symbols are fixed, €ASM starts the final assembly pass, in which code+data is generated to the target file and listing is produced. Each source requires at least two passes to assemble.

                                                     assembly progress ─>
┌─────────┬──────────────────────────────────┐
│envelope │src: PROGRAM                      │      █       ┌█
├─────────┼──────────────────────────────────┤       █      │ █
│      {1}│ DB 'This source "src.asm" has'   │        █     │  █
│"src.asm"│ DB ' no PROGRAM statement.',13,10│         █    │   █
│      {3}│ DB 'EuroAssembler will use '     │          █   │    █
│      {4}│ DB 'a fictive envelope instead.' │           █  │     █
├─────────┼──────────────────────────────────┤            █ │      █
│envelope │ ENDPROGRAM src:                  │             █┘       █─┐
└─────────┴──────────────────────────────────┘
                                                   ││        │      │ │
I0010 EuroAssembler started.───────────────────────┤│        │      │ │
I0180 Assembling source file "src.asm".────────────┤│        │      │ │
I0270 Assembling source "src".─────────────────────┘│        │      │ │
I0310 Assembling source pass 1.─────────────────────┘        │      │ │
I0330 Assembling source pass 2 - final.──────────────────────┘      │ │
I0760 16bit TINY BIN file "src.bin" created from source, size=99.───┘ │
I0750 Source "src" (4 lines) assembled in 2 passes with errorlevel 0.─┤
I0860 Listing file "src.asm.lst" created, size=717.───────────────────┤
I0990 EuroAssembler terminated with errorlevel 0.─────────────────────┘

Envelope statements are used regardless of whether an explicit PROGRAM block is defined in the source text. Source lines between the start of the file and the explicit PROGRAM statement, as well as lines between the explicit ENDPROGRAM and the end of the source, should not emit any data or code. In that case the envelope source is empty and no target file is created from it.

Consider the following source file src.asm. There is an explicit block Src:PROGRAM..ENDPROGRAM Src: (lines 5..8) inside the invisible envelope statements src: PROGRAM and ENDPROGRAM src:. When the internal Src:PROGRAM..ENDPROGRAM Src: block is found during assembly, the entire block is skipped until the final pass of the outer block is performed. Then €ASM puts the currently assembled final pass aside and starts to assemble the inner block in as many passes as necessary, creating the target file of the inner program. After that, €ASM returns to finish the final pass of the outer (envelope) program.

    EUROASM ; Common options.
    ; Source file "src.asm"
    ; with PROGRAM defined
explicitly.
Src:PROGRAM FORMAT=BIN
     DB 'Data emitted '
     DB 'by program Src.'
     ENDPROGRAM Src:

Notice the bug: the wrap of comment line {3} yields a non-comment line {4}. The expression explicitly. is treated as a valid label (definition of an address symbol). This causes the envelope to be treated as non-empty, and the target file src.bin is created from it, albeit with zero file size, as it contains only a zero-sized address symbol.
The inner program from lines {5..8} creates the target file Src.bin with a size of 28 bytes, but it is soon overwritten by the zero-sized envelope target src.bin, which happens to have an almost identical name (the filesystem in DOS and Windows is case-insensitive).


┌─────────┬──────────────────────────────────┐  █              assembly progress ─────────>
│envelope │src: PROGRAM                      │   █         ┌█         ┌█
├─────────┼──────────────────────────────────┤    █        │ █        │ █
│      {1}│ EUROASM ; Common options.        │     █       │  █       │  █
│      {2}│    ; Source file "src.asm"       │      █      │   █      │   █
│      {3}│    ; with PROGRAM defined        │       █     │    █     │    █
│      {4}│explicitly.                       │        █┐   │     █┐   │     █
│"src.asm"│Src:PROGRAM FORMAT=BIN            │         │   │      │   │      █─█   ┌█
│      {6}│     DB 'Data emitted '           │         │   │      │   │         █  │ █
│      {7}│     DB 'by program Src.'         │         │   │      │   │          █ │  █
│      {8}│     ENDPROGRAM Src:              │         └█  │      └█  │           █┘   █┐
├─────────┼──────────────────────────────────┤           █ │        █ │                 └█
│envelope │ ENDPROGRAM src:                  │            █┘         █┘                   █┐
└─────────┴──────────────────────────────────┘
                                                ││          │          │    │ ││    │  │  ││
I0010 EuroAssembler started.────────────────────┤│          │          │    │ ││    │  │  ││
I0180 Assembling source file "src.asm".─────────┤│          │          │    │ ││    │  │  ││
I0270 Assembling source "src".──────────────────┘│          │          │    │ ││    │  │  ││
I0310 Assembling source pass 1.──────────────────┘          │          │    │ ││    │  │  ││
I0310 Assembling source pass 2.─────────────────────────────┘          │    │ ││    │  │  ││
I0330 Assembling source pass 3 - final.────────────────────────────────┘    │ ││    │  │  ││
W2101 Symbol "explicitly." was defined but never used. "src.asm"{4}─────────┘ ││    │  │  ││
I0470 Assembling program "Src". "src.asm"{5}──────────────────────────────────┘│    │  │  ││
I0510 Assembling program pass 1. "src.asm"{5}──────────────────────────────────┘    │  │  ││
I0530 Assembling program pass 2 - final. "src.asm"{5}───────────────────────────────┘  │  ││
I0660 16bit TINY BIN file "Src.bin" created, size=28. "src.asm"{8}─────────────────────┤  ││
I0650 Program "Src" assembled in 2 passes with errorlevel 0. "src.asm"{8}──────────────┘  ││
W3990 Overwriting previously generated output file "Src.bin".─────────────────────────────┤│
I0760 16bit TINY BIN file "src.bin" created from source, size=0.──────────────────────────┤│
I0750 Source "src" (8 lines) assembled in 3 passes with errorlevel 3.─────────────────────┤│
I0860 Listing file "src.asm.lst" created, size=1372.──────────────────────────────────────┘│
I0990 EuroAssembler terminated with errorlevel 3.──────────────────────────────────────────┘
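
One way to avoid the bug shown above is to keep the wrapped word inside the comment. In this sketch the lines outside the explicit block emit nothing, so, following the rule stated earlier, the envelope should stay empty and should not create a target file of its own. The messages which €ASM would print for this variant are not reproduced here.

```asm
    EUROASM ; Common options.
    ; Source file "src.asm"
    ; with PROGRAM defined
    ; explicitly.
Src:PROGRAM FORMAT=BIN
     DB 'Data emitted '
     DB 'by program Src.'
     ENDPROGRAM Src:
```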

↑ Chained programs

EuroAssembler allows more than one program block to be defined in a single source file and all of them to be assembled with one command. Remember that symbols used in different PROGRAM..ENDPROGRAM blocks have private scope, so they don't see each other, although they are defined in the same source file. If we want to call a procedure defined in Pgm1 from Pgm2, the called symbol must be declared global and both assembled modules must be linked together.

┌─────────┬──────────────────────────────────┐ █            assembly progress ─────────────────>
│envelope │src: PROGRAM                      │  █       ┌█
├─────────┼──────────────────────────────────┤   █      │ █
│      {1}│     EUROASM ; Common options.    │    █     │  █
│      {2}│Pgm1:PROGRAM FORMAT=PE,ENTRY=Run1:│     █┐   │   █─█   ┌█   ┌█
│      {3}│      ; Pgm1 data.                │      │   │      █  │ █  │ █
│      {4}│Run1: ; Pgm1 code.                │      │   │       █ │  █ │  █
│"src.asm"│     ENDPROGRAM Pgm1:             │      │   │        █┘   █┘   █┐
│      {6}│     ; Pgm2 description.          │      │   │                   █
│      {7}│Pgm2:PROGRAM FORMAT=PE,ENTRY=Run2:│      │   │                   └█   ┌█   ┌█
│      {8}│      ; Pgm2 data.                │      │   │                     █  │ █  │ █
│      {9}│Run2: ; Pgm2 code.                │      │   │                      █ │  █ │  █
│     {10}│      ENDPROGRAM Pgm2:            │      └█  │                       █┘   █┘   █┐
├─────────┼──────────────────────────────────┤        █ │                                  └█
│envelope │ ENDPROGRAM src:                  │         █┘                                    █┐
└─────────┴──────────────────────────────────┘
                                               ││        │    │    │    │  │ │    │    │   │ ││
I0010 EuroAssembler started.───────────────────┤│        │    │    │    │  │ │    │    │   │ ││
I0180 Assembling source file "src.asm".────────┤│        │    │    │    │  │ │    │    │   │ ││
I0270 Assembling source "src".─────────────────┘│        │    │    │    │  │ │    │    │   │ ││
I0310 Assembling source pass 1.─────────────────┘        │    │    │    │  │ │    │    │   │ ││
I0330 Assembling source pass 2 - final.──────────────────┘    │    │    │  │ │    │    │   │ ││
I0470 Assembling program "Pgm1". "src.asm"{2}─────────────────┤    │    │  │ │    │    │   │ ││
I0510 Assembling program pass 1. "src.asm"{2}─────────────────┘    │    │  │ │    │    │   │ ││
I0510 Assembling program pass 2. "src.asm"{2}──────────────────────┘    │  │ │    │    │   │ ││
I0530 Assembling program pass 3 - final. "src.asm"{2}───────────────────┘  │ │    │    │   │ ││
I0660 32bit FLAT PE file "Pgm1.exe" created, size=14320. "src.asm"{5}──────┤ │    │    │   │ ││
I0650 Program "Pgm1" assembled in 3 passes with errorlevel 0. "src.asm"{5}─┘ │    │    │   │ ││
I0470 Assembling program "Pgm2". "src.asm"{7}────────────────────────────────┤    │    │   │ ││
I0510 Assembling program pass 1. "src.asm"{7}────────────────────────────────┘    │    │   │ ││
I0510 Assembling program pass 2. "src.asm"{7}─────────────────────────────────────┘    │   │ ││
I0530 Assembling program pass 3 - final. "src.asm"{7}──────────────────────────────────┘   │ ││
I0660 32bit FLAT PE file "Pgm2.exe" created, size=14320. "src.asm"{10}─────────────────────┤ ││
I0650 Program "Pgm2" assembled in 3 passes with errorlevel 0. "src.asm"{10}────────────────┘ ││
I0750 Source "src" (10 lines) assembled in 2 passes with errorlevel 0.───────────────────────┤│
I0860 Listing file "src.asm.lst" created, size=1736.─────────────────────────────────────────┘│
I0990 EuroAssembler terminated with errorlevel 0.─────────────────────────────────────────────┘
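
As a sketch of the linking requirement mentioned above, the two chained programs below share one procedure. The names Helper, Pgm1 and Pgm2 and the choice of FORMAT=COFF for the first program are illustrative assumptions; the LINK statement follows the pattern used in the nested-programs example of this chapter.

```asm
Pgm1:PROGRAM FORMAT=COFF,WIDTH=32 ; Assembled first into the object module "Pgm1.obj".
Helper:: PROC        ; Double colon makes Helper global.
       RET
      ENDPROC Helper
     ENDPROGRAM Pgm1:
Pgm2:PROGRAM FORMAT=PE,ENTRY=Run2:
Run2:  CALL Helper:: ; Private symbols of Pgm1 are not visible here, global ones can be linked.
       ; Pgm2 code continues.
       LINK "Pgm1.obj" ; Link the module created by the first block.
     ENDPROGRAM Pgm2:
```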

Why should we pack multiple modules together with their documentation to a single file rather than scatter them to a bunch of small files? It's a matter of individual preferences.

One reason could be the transfer of information between modules with preprocessing %variables. Unlike ordinary symbols, the scope of %variables is not limited by PROGRAM..ENDPROGRAM block boundaries. Suppose that in Pgm2 we need to know the size of the data segment from Pgm1. Let's read the size into a %variable with the statement %Pgm1DataSize %SETA SIZE# [DATA], placed in Pgm1 just above ENDPROGRAM Pgm1. In the final pass of Pgm1 the segment size is reliably known, and the variable %Pgm1DataSize will be visible in the whole source below its definition, so Pgm2 can calculate with it.
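
A minimal sketch of this technique, assuming two chained PE programs in one source file. The section names [DATA] and [.text] are taken from the text of this manual and the data contents are illustrative; adjust them to the sections your program actually defines.

```asm
Pgm1:PROGRAM FORMAT=PE,ENTRY=Run1:
[DATA]                            ; Data section of Pgm1 (name as used in the text above).
      DB 'Some Pgm1 data.'        ; Illustrative contents.
[.text]                           ; Back to the code section.
Run1: ; Pgm1 code.
%Pgm1DataSize %SETA SIZE# [DATA]  ; Read the section size just above ENDPROGRAM.
     ENDPROGRAM Pgm1:
Pgm2:PROGRAM FORMAT=PE,ENTRY=Run2:
Run2: MOV ECX,%Pgm1DataSize       ; The %variable is still visible in Pgm2.
      ; Pgm2 code continues.
     ENDPROGRAM Pgm2:
```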

Another example where grouping programs is profitable is when the programs are similar or they share common data, declared with preprocessing %variables. The following example creates three similar short programs RstLPT1.com, RstLPT2.com, RstLPT3.com in a loop:

Nr %FOR 1,2,3 ; Repeat the %FOR..%ENDFOR block three times.
 RstLPT%Nr PROGRAM FORMAT=COM ; Program to reset LinePrinter port.
   MOV DX,%Nr-1 ; BIOS printer number is zero-based (0,1,2 for LPT1..LPT3).
   MOV AH,1 ; BIOS function INITIALIZE LPT PORT.
   INT 17h  ; Use BIOS function to reset printer.
   MOV DX,Message ; Put the address of $-terminated string to DS:DX.
   MOV AH,9 ; DOS function WRITE STRING TO STDOUT.
   INT 21h ; Use DOS function to report success.
   RET     ; Terminate program.
   Message:DB "LPT%Nr was reset.$"
 ENDPROGRAM RstLPT%Nr
%ENDFOR Nr ; Generate 3 clones of the program.

↑ Nested programs

Program modules can be nested in one another. For instance, when building an amphibious program executable in both DOS and Windows, we may want to reflect the fact that the DOS-executable MZ file is embedded as a stub in the Windows-executable PE file, both providing the same functionality.
See the sample projects Lock test or EuroConvertor as examples of a dual DOS & Windows program.

Again, when the outer program encounters an inner program block in a non-final pass, the block is skipped. In the final pass, the assembly of the outer program is temporarily suspended, the inner program is completely assembled, and then the final pass of the outer program continues.

┌─────────┬──────────────────────────────────┐ █                   assembly progress ──────────────>
│envelope │src: PROGRAM                      │  █       ┌█
├─────────┼──────────────────────────────────┤   █      │ █
│      {1}│      EUROASM ; Common options.   │    █     │  █
│      {2}│Pgm1: PROGRAM FORMAT=PE,ENTRY=Run:│     █┐   │   █─█       ┌█       ┌█
│      {3}│Run:   ; Pgm1 data + code.        │      │   │      █      │ █      │ █
│      {4}│ Pgm2: PROGRAM FORMAT=COFF        │      │   │       █┐    │  █┐    │  █─█  ┌█
│"src.asm"│        ; Pgm2 data + code.       │      │   │        │    │   │    │     █ │ █
│      {6}│       ENDPROGRAM Pgm2:           │      │   │        └█   │   └█   │      █┘  █─█
│      {7}│       ; Pgm1 more code.          │      │   │          █  │     █  │             █
│      {8}│       LINK "Pgm2.obj"            │      │   │           █ │      █ │              █
│      {9}│      ENDPROGRAM Pgm1:            │      └█  │            █┘       █┘               █─█
├─────────┼──────────────────────────────────┤        █ │                                         █
│envelope │ ENDPROGRAM src:                  │         █┘                                          █─┐
└─────────┴──────────────────────────────────┘
                                               ││        │    │        │        │   │   │ │    │   │ │
I0010 EuroAssembler started. ──────────────────┤│        │    │        │        │   │   │ │    │   │ │
I0180 Assembling source file "src.asm".────────┤│        │    │        │        │   │   │ │    │   │ │
I0270 Assembling source "src".─────────────────┘│        │    │        │        │   │   │ │    │   │ │
I0310 Assembling source pass 1.─────────────────┘        │    │        │        │   │   │ │    │   │ │
I0330 Assembling source pass 2 - final.──────────────────┘    │        │        │   │   │ │    │   │ │
I0470 Assembling program "Pgm1". "src.asm"{2}─────────────────┤        │        │   │   │ │    │   │ │
I0510 Assembling program pass 1. "src.asm"{2}─────────────────┘        │        │   │   │ │    │   │ │
I0510 Assembling program pass 2. "src.asm"{2}──────────────────────────┘        │   │   │ │    │   │ │
I0530 Assembling program pass 3 - final. "src.asm"{2}───────────────────────────┘   │   │ │    │   │ │
I0470 Assembling program "Pgm2". "src.asm"{4}───────────────────────────────────────┤   │ │    │   │ │
I0510 Assembling program pass 1. "src.asm"{4}───────────────────────────────────────┘   │ │    │   │ │
I0530 Assembling program pass 2 - final. "src.asm"{4}───────────────────────────────────┘ │    │   │ │
I0660 32bit FLAT COFF file "Pgm2.obj" created, size=78. "src.asm"{6}──────────────────────┤    │   │ │
I0650 Program "Pgm2" assembled in 2 passes with errorlevel 0. "src.asm"{6}────────────────┘    │   │ │
I0560 Linking COFF module ".\Pgm2.obj". "src.asm"{9}───────────────────────────────────────────┤   │ │
I0660 32bit FLAT PE file "Pgm1.exe" created, size=14320. "src.asm"{9}──────────────────────────┤   │ │
I0650 Program "Pgm1" assembled in 3 passes with errorlevel 0. "src.asm"{9}─────────────────────┘   │ │
I0750 Source "src" (9 lines) assembled in 2 passes with errorlevel 0.──────────────────────────────┤ │
I0860 Listing file "src.asm.lst" created, size=1237.───────────────────────────────────────────────┘ │
I0990 EuroAssembler terminated with errorlevel 0.────────────────────────────────────────────────────┘

↑ Assembly debugging

Some useful features of EuroAssembler can help the programmer to assure that the source is assembled as intended.

The dump column of the listing displays the assembled code. Repeated stretches, which are considered bug-free, are suppressed by default, but they can be displayed on demand with the directives EUROASM LISTINCLUDE=ON, LISTVAR=ON, LISTMACRO=ON, LISTREPEAT=ON.

Recognition of fields in statements can be investigated with option EUROASM DISPLAYSTM=ON, which inserts comment lines identifying each field. As this option blows up the listing size significantly, it's better to limit DISPLAYSTM only to the suspected lines, and then switch the option OFF or restore the previous set of options:

   EUROASM PUSH, DISPLAYSTM=ON ; Store all current EUROASM options with PUSH first.
   MyMacro Operand1, Operand2  ; "MyMacro" was not defined yet as a %MACRO, so it's treated like a label.
D1010 **** DISPLAYSTM "MyMacro Operand1, Operand2"
D1020 label="MyMacro"
D1040 unknown operation="Operand1"
D1050 ordinal operand number=1,value="Operand2"
   EUROASM POP                 ; Restore EUROASM options.
D1010 **** DISPLAYSTM "EUROASM POP"
D1040 pseudo operation="EUROASM"
D1050 ordinal operand number=1,value="POP"
                              ; Statement fields are no longer displayed.

Detailed machine instructions encoding can be displayed with option EUROASM DISPLAYENC=ON, which inserts comment line below machine instruction with the list of actually used modifiers.

   EUROASM PUSH, DISPLAYENC=ON ; Store all current EUROASM options with PUSH first.
   SHRD [RDI+64],RDX,2
D1080 Emitted size=6,DATA=QWORD,DISP=BYTE,SCALE=SMART,ADDR=ABS,IMM=BYTE.
   VMOVNTDQA XMM17,[RBP+40h]
D1080 Emitted size=7,PREFIX=EVEX,DATA=OWORD,OPER=0,DISP=BYTE,SCALE=SMART,ADDR=ABS.
   EUROASM POP         ; Restore EUROASM options. Encodings are no longer displayed.

All configuration options, which can be specified with EUROASM and PROGRAM keyword operands, are retrievable in the form of system %^variables, thus their current value can be checked or otherwise exploited:

   %IF %^NOWARN[2101]
     %ERROR You shouldn't suppress the warning W2101. Move unused symbols to included file instead.
   %ENDIF

The most powerful assembly-time debugging tool is the pseudoinstruction %DISPLAY, which displays internal €ASM objects at assembly-time and helps to find out, why €ASM doesn't work as expected.

See tests t2901..t2917 as examples.

↑ Linker

Static linking ↓

Dynamic linking ↓

Linking, in IT terminology, is the process in which separately assembled or compiled modules are joined, interactions between globally accessible symbols are resolved, and their code and data are combined and reformatted to the target file format. See [Linkers] for more details.

Unlike many other linkers, EuroAssembler can create not only executable files, but also linkable formats COFF and OMF, and their libraries LIBCOF and LIBOMF (see Object convertor and the table of supported linker combinations).

Linking in EuroAssembler takes place when pseudoinstruction ENDPROGRAM is processed in the final pass.

Linking is requested with the pseudoinstruction LINK, which is followed by the file names of input modules. Input formats acceptable to the EuroAssembler linker are of two kinds:

  1. linkable file formats for static linking are COFF, OMF, LIBCOF, LIBOMF, RSRC.
  2. importable file formats for dynamic linking are DLL, LIBCOF, LIBOMF.

File formats accepted by EuroAssembler statement LINK

CPU mode | Program width | Output executable | Output linkable           | Input linkable                  | Input importable
---------+---------------+-------------------+---------------------------+---------------------------------+-------------------------------
Real     | 16            | BIN, COM, MZ      | OMF, LIBOMF, COFF, LIBCOF | OMF, LIBOMF, COFF, LIBCOF       | -
Real     | 32            | BIN, COM, MZ      | OMF, LIBOMF, COFF, LIBCOF | OMF, LIBOMF, COFF, LIBCOF       | -
Prot     | 32            | PE, DLL           | COFF, LIBCOF, OMF, LIBOMF | COFF, LIBCOF, RSRC, OMF, LIBOMF | COFF, LIBCOF, DLL, OMF, LIBOMF
Prot     | 64            | PE, DLL           | COFF, LIBCOF              | COFF, LIBCOF, RSRC              | COFF, LIBCOF, DLL, OMF, LIBOMF

See also the table of tests on linker combinations.
Notice that object format OMF cannot be linked in 64bit programs.

The actual format of linked file is recognized by the file contents, not by the file name extension. Each linked module is loaded and converted to an €ASM internal format (PGM) in memory.

↑ Static linking

Code and data from linked object files in formats COFF or OMF will be combined and concatenated with code and data from the base program (i.e. the one to which it's linked). Base program may be empty, however. Linker also resolves mutual references between public and external symbols from all linked modules.

Unlike other linkers, EuroAssembler does not accept names of linked modules as its command line arguments. A linker script (€ASM source program) must be prepared beforehand when we want to employ EuroAssembler as a pure linker, for instance to convert object files created by a 3rd-party assembler or compiler to an executable file. The desired output file name and format are specified as the PROGRAM arguments:

MyExeFile PROGRAM FORMAT=PE, WIDTH=32, ListMap=Yes, ListGlobals=Yes
                 LINK MyCoff.obj, PascalOmf.obj, Win32.lib
               ENDPROGRAM MyExeFile
Save the linker script as MyScript.asm, execute euroasm MyScript.asm and it will produce the Windows program MyExeFile.exe and listing MyScript.asm.lst with the map of linked sections and global symbols.

Besides standalone object modules, code and data can also be linked from object libraries in formats LIBCOF and LIBOMF.

When the target base program is executable, €ASM links only those modules from the library which are referenced at least once by other modules (smart linking). This helps to keep the size of the linked file small by eliminating dead (never-to-be-executed) code.

If we nevertheless need to combine unreferenced library procedures into our executable program, we have to explicitly declare their names GLOBAL in the base program.

Smart linking does not apply when the target file is linkable, for instance when a LIBCOF library is created from other libraries and standalone object modules. In this case all modules (referenced or unreferenced) will be linked to the target file.

A good reason to split a big project into smaller, separately assembled modules is a faster build.

When a project grows and its source is doubled in size, the number of symbols in it is likely to double, too. Each symbol needs to be compared with the array of other already declared symbols to avoid duplication. The number of checks, and with it the consumed time, grows almost quadratically with source size.

During the development process we are usually focused on one part (module) of the project, so the remaining unchanged modules do not need to be recompiled in each development cycle (see also Makefile manager).

Recapitulation: If you want to statically link your own function (procedure), declare it PUBLIC function (or terminate its definition label with two colons function:: PROC) and assemble the function to an object or library module.
Then assemble the main program, declare the linked function EXTERN function (or terminate the called name with two colons) and insert pseudoinstruction LINK module.obj into the main program. The main program then can CALL function:: as if it were assembled in its own body.
The same applies for functions from 3rd party library. Again, you must observe its published name, calling convention, number, order and type of arguments.
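
The recapitulation above might be sketched as two source files for 16bit DOS. This is only an illustration under assumptions: the file names, the program names PrintMsg and Main, the procedure Print and the expectation that the first program is written to PrintMsg.obj are hypothetical and not taken from this manual.

```asm
; File PrintMsg.asm (hypothetical name). Assemble it first to get an OMF object module.
PrintMsg PROGRAM FORMAT=OMF, WIDTH=16
Print::   PROC              ; Two colons make the label PUBLIC.
           MOV AH,9         ; DOS function 9: write the $-terminated string at DS:DX.
           INT 21h
           RET
          ENDP Print::
         ENDPROGRAM PrintMsg

; File Main.asm (hypothetical name). Assemble it second.
Main PROGRAM FORMAT=COM
      MOV DX,=B "Statically linked.$"
      CALL Print::          ; Two colons mark the called name as global (external here).
      RET                   ; A COM program may terminate with a simple near RET.
      LINK PrintMsg.obj     ; Assumed output name of the first program.
     ENDPROGRAM Main
```

The calling convention here (string address passed in DX) is chosen only for this sketch; a real library dictates its own.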

↑ Dynamic linking

The code and data of dynamically linked functions are not copied to the target executable image; they remain in a dynamic library (DLL), which has to be available on the system where our executable runs. When our program calls a function from a DLL, it actually calls a thunk, which consists of a single proxy jump instruction (stub).
€ASM generates stubs in a special import section [.idata] in the form of indirect absolute JMPN. Each such proxy jump is 7 bytes long (0xFF2425[00000000]) and it uses pointer into Import Address Table (IAT) as its indirect DWORD target. Virtual address in the pointer [00000000] is resolved by the linker, but the actual 32bit or 64bit virtual address of the library function (pointed to by the resolved dword) will be fixed up later, by the loader at bind time when the application starts.

Loader, implemented in Windows kernel, needs two pieces of information to dynamically link library functions and to fix up their addresses in IAT:

1) The name of linked symbol (function name) or its ordinal number in the table of exported symbols.

Calling by ordinals is not supported in €ASM.

2) The name of library file which exports the symbol (without path).

Path to the library file will be established by the loader. The order of directories where MS Windows searches for the library is explained in [WinDllSearchOrder].

A program which needs to call a symbol (imported function) from a dynamic library should declare the symbol as imported. The symbol may be declared GLOBAL as well, either explicitly or implicitly ( CALL ImportedSymbol::), but without an import declaration €ASM will treat such a global symbol as EXTERN (statically linked) and complain that the corresponding public symbol was not found.
There are several methods how to tell €ASM that the symbol should be dynamically linked:

Recapitulation: If you want to dynamically link your own function (procedure) in other programs, declare it EXPORT function and assemble the function to the DLL format (mylib PROGRAM FORMAT=DLL). Be sure to distribute mylib.dll together with your programs.
Then assemble the main executable program, declaring the linked function IMPORT function, LIB=mylib.dll. The main program can then invoke it using CALL function.
More often you will need to call functions from a 3rd party dynamic library, which is the case of the MS Windows API. You might explicitly enumerate each used WinAPI function with a pseudoinstruction such as IMPORT function1,function2,LIB=user32.dll, but a more comfortable solution is to use an import library, which declares all function names exported by the DLL. Then you don't have to add new import declarations every time a new function is used in your program. Simply call the new function with a double colon and, when its name appears in some import library, it will be treated as imported. You may also want to use the macro WinAPI (32bit) or WinABI (64bit), which takes care of the IMPORT declaration and of the automatic selection between the ANSI and WIDE variant.
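
The first two steps of this recapitulation could look roughly like the following sketch. It rests on several assumptions which are not confirmed by this chapter: the names mylib, Twice, Main and Start are hypothetical, the PROGRAM keyword ENTRY= is used to mark the entry point, the register-based parameter passing is chosen only for brevity, and ExitProcess from kernel32.dll is imported explicitly to terminate the program.

```asm
; File mylib.asm (hypothetical). It should produce mylib.dll.
mylib PROGRAM FORMAT=DLL, WIDTH=32
       EXPORT Twice            ; Make the function visible to other programs.
Twice: PROC                    ; Hypothetical function: doubles the value in EAX.
        ADD EAX,EAX
        RET
       ENDP Twice:
      ENDPROGRAM mylib

; File Main.asm (hypothetical). mylib.dll must be available when Main.exe runs.
Main PROGRAM FORMAT=PE, WIDTH=32, ENTRY=Start:  ; ENTRY= is an assumption here.
       IMPORT Twice, LIB=mylib.dll
       IMPORT ExitProcess, LIB=kernel32.dll
Start: MOV EAX,21
       CALL Twice              ; Executes the proxy jump (stub) generated by €ASM.
       PUSH EAX                ; Use the result as the process exit code.
       CALL ExitProcess
     ENDPROGRAM Main
```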

↑ Librarian

EuroAssembler can create libraries from previously assembled object modules (files in OMF or COFF format). When the library program itself contains some code and data, it will be implicitly linked to the library as the first module.

Library PROGRAM FORMAT=LIBOMF  ; or FORMAT=LIBCOF
ObjModule1:: PROC ; One of the object modules can also be defined here.
                  ; Source code of ObjModule1.
             ENDP ObjModule1::
             LINK "ObjModule2.obj", "ObjModule3.obj" ; Other OMF and/or COFF object modules.
        ENDPROGRAM Library

If the linked modules contain import information, it is copied to the output library, too. A pure import library contains import declarations only. They may be explicitly declared with IMPORT, loaded from a dynamic library, or linked from other import libraries. The following example exploits all three methods:

ImpLibrary PROGRAM FORMAT=LIBOMF ; or FORMAT=LIBCOF
             IMPORT Symbol1, Symbol2, LIB="DynamicLibrary1.dll" ; Explicit declaration.
             LINK "C:\MyDLLs\DynamicLibrary2.dll"               ; Automatic export detection from DLL.
             LINK "OtherImportLibrary.lib"                      ; Reimport from library.
           ENDPROGRAM ImpLibrary

Examples of libraries created from three separately assembled modules can be found in €ASM tests:
t7016 (object library LIBOMF for 16bit Dos),
t7337 (object library LIBCOF for 32bit Windows),
t7361 (object library LIBCOF for 64bit Windows),
t7184 (import library LIBOMF for Windows),
t7412 (import library LIBCOF for Windows).

↑ Object convertor

EuroAssembler can link both main object formats OMF and COFF, so the demand for explicit object conversion between them should be rare. Example:

OMFobject PROGRAM FORMAT=OMF ; Convert COFF object file to the format OMF.
            LINK "COFFobject.obj"
          ENDPROGRAM OMFobject
COFFobject PROGRAM FORMAT=COFF; Convert OMF object file to the format COFF.
             LINK "OMFobject.obj"
           ENDPROGRAM COFFobject
OMFlibrary PROGRAM FORMAT=LIBOMF ; Convert COFF object library to the format LIBOMF.
             LINK "COFFlibrary.lib"
           ENDPROGRAM OMFlibrary
COFFlibrary PROGRAM FORMAT=LIBCOF ; Convert OMF object library to the format LIBCOF.
              LINK "OMFlibrary.lib"
            ENDPROGRAM COFFlibrary

↑ Makefile manager

Operator FILETIME# retrieves the last modification time of a file at assembly-time, which can be used for detection if the target file needs reassembly or not. Just compare the filetime of target with filetime of each source, which the target depends on. If the target file does not exist, its attribute-operator FILETIME# returns 0, which is the same as if it was very old, so its reassembly will be required anyway.

       ; Recompile "source.asm" only if "target.exe" doesn't exist or if it is older than its sources.
    %IF FILETIME# "target.exe" > FILETIME# "source.asm" && FILETIME# "target.exe" > FILETIME# "included2source.inc"
       %ERROR "target.exe" is fresh, no need to assemble again.
    %ELSE
       target PROGRAM FORMAT=PE
               INCLUDE "source.asm"
              ENDPROGRAM target
    %ENDIF

As an example of a more sophisticated makefile script see the main EuroAssembler source file euroasm.htm.


↑ Optimisation

Computer programs are often written in assembler because we want them to be fast and/or small. However, those are not the only criteria how a program can be optimised:

By program size ↓

By program speed ↓

By assembly speed ↓

By source writeability ↓

By source readability ↓

See also optimisation tutorials.

Let's look how EuroAssembler can help with optimisation.

↑ Optimisation by program size

€ASM selects by default the shortest possible encoding of machine instruction. On the other hand, it respects instruction mnemonic chosen by the programmer, which doesn't always have to be the shortest variant. A couple of rules worth remembering:

|0000:B80000      | MOV AX,0
|0003:29C0        | SUB AX,AX       ; Using SUB or XOR for zeroing is shorter. Side effect: flags are changed.
|0005:            |
|0005:89D8        | MOV AX,BX
|0007:93          | XCHG AX,BX      ; XCHG is shorter than MOV. Collateral damage: 2nd register is changed, too.
|0008:            |
|0008:            |Label:
|0008:8D06[0800]  | LEA AX,[Label]
|000C:B8[0800]    | MOV AX,Label    ; Moving offset to a register is shorter than loading its address by LEA.
|000F:            |
|000F:5053        | PUSH AX,BX
|0011:60          | PUSHAW          ; Pushing/popping all registers at once is shorter than individual push/pop.
|0012:            |
|0012:050100      | ADD AX,1
|0015:40          | INC AX          ; Increment/decrement is shorter than add/subtract.
|0016:            |
|0016:            |LoopStart:
|0016:49          | DEC CX
|0017:75FD        | JNZ LoopStart:
|0019:E2FB        | LOOP LoopStart: ; LOOP, JCXZ are shorter than separate test+jump.

Programs which aspire for short-size category should have PROGRAM FORMAT=COM and EUROASM AUTOALIGN=OFF. They may be terminated by simple near RET instead of invoking DOS function TERMINATE PROCESS, because the return address on stack of COM program is initialized to 0 and the final RET transfers execution to DOS terminating interrupt at the beginning of PSP block (CS:0), which was established by the loader.

Hello PROGRAM FORMAT=COM
       MOV DX,=B "Hello world!$"
       MOV AH,9
       INT 21h
       RET
      ENDPROGRAM Hello

For more inspiration check Hugi Size Coding Competition Series,
Assembly nibbles competition,
Graphical Tetris in 1986 bytes by Sebastian Mihai,
BootChess play in 487 bytes by Oliver Poudade.

Windows executable program created by €ASM will be shorter when the option PROGRAM ICONFILE= is explicitly specified as empty and no resource file is linked. In this case the resource section will not be included in PE file at all. You may also experiment with PE file properties using program options, such as PROGRAM FILEALIGN= value.

↑ Optimisation by program speed

Writing fast programs is fully in the hands of the programmer. EuroAssembler cannot help much here, because it does no optimizations behind your back as high-level compilers do. You may want to set EUROASM AUTOALIGN=ON to be sure that all data will be aligned for the best performance. Total control of instruction encoding in €ASM allows you to select a variant with the exact code size, which is faster than a size-optimised encoding padded with NOPs. €ASM supports optimised no-operation encodings for fast and easy manual alignment.

There are many tricks how to squeeze every CPU clock: by loop unrolling, parallelization, avoiding memory access, and last but not least, choosing the fastest algorithm. Performance also heavily depends on CPU model and generation. Good guide is [SoftwareOptimisation] by Agner Fog.

Performance is usually traded off against program size; for instance, many of the size-saving tricks mentioned in the chapter on optimisation by program size lead to slower execution. You may want to optimise only the critical parts of code which are executed many times in your program.

↑ Optimisation by assembly speed

EuroAssembler is not optimised for speed; nevertheless, the duration of assembly is usually not an issue. It mostly depends on the number of passes, which is governed by €ASM itself and cannot be directly influenced by the programmer. At least two passes are always required. The number of passes increases when the program contains forward references, assembly-time loops or macroinstructions.

When assembling forward-referenced jumps, €ASM at first anticipates a short distance to the not-yet-defined target and reserves room for only a 2-byte (short) encoding. If we know at write time that the forward target will be further than 127 bytes away, it is recommended to explicitly specify DIST=NEAR, which can save one pass at assembly time. However, the pass will be spared only when the distances of all such jumps are specified, which is usually not worth the effort.

If you are interested in why €ASM performs so many passes, put the statement %DISPLAY UnfixedSymbols in front of ENDPROGRAM to find out which symbols oscillate between assembly passes.
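
A minimal sketch of both hints follows. The program name Jumper and the label Finish are hypothetical, and the placeholder comment stands for code or data which would really put the target out of short range.

```asm
Jumper PROGRAM FORMAT=COM
         JMP Finish:, DIST=NEAR    ; Forward target is known to be more than 127 bytes away.
         ; ... more than 127 bytes of code or data would be here ...
Finish:  RET
         %DISPLAY UnfixedSymbols   ; Shows which symbols still oscillate between passes.
       ENDPROGRAM Jumper
```

As stated above, the explicit distance pays off only when every long forward jump in the program is marked this way.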

Build time of big projects can be reduced significantly by splitting the code to smaller, separately assembled modules, which will be finally linked together. See also the euroasm.htm source itself.

↑ Optimisation by writeability

EuroAssembler introduced some new comfortable features which are not usual among other assemblers:

↑ Optimisation by readability

A well commented and structured program is easy to read and maintain. EuroAssembler allows HTML formatting in comments, so the source code can be directly published on web sites and each part of the source can be immediately documented with richly formatted remarks, tables, images and hypertext links.

Size and language of identifiers are not limited, so they can be self-describing. If English is not your mother tongue, it is a good idea to prefer labels with non-English names, such as Drucken rather than Print, файл rather than file etc. This helps the reader of your program to distinguish built-in reserved words from identifiers created by the author.

Elements of EuroAssembler language use decorators which help the human reader to distinguish the category of decorated identifier:

↑ Where to begin

If you have read this manual this far and you want to try EuroAssembler, download the latest version, print a hardcopy of the paper crib and look at the sample projects. Good luck!
