EuroAssembler Manual


Czech version of this manual (Česká verze tohoto manuálu)

About EuroAssembler ↓

Input/Output ↓

Structure of €ASM program ↓

Elements of source ↓

Instructions ↓

Program formats ↓

€ASM functions ↓


↑ About EuroAssembler

Product identification ↓

Short characteristics ↓

Notational typographic conventions ↓

Why Assembler ↓

Why Yet Another Assembler ↓

Why EuroAssembler ↓

Licence ↓

History

Download

Installation ↓


↑ Product identification

The name of the software is EuroAssembler. Please notice that there is no space between Euro and Assembler.
The name is often abbreviated as €ASM.
In a 7-bit ASCII environment it may also be referred to as EUROASM, and in some internal identifiers it's just ea.

The Euro character is available on a Windows keyboard as Alt~0128 or as the HTML entity &euro;.

↑ Short Characteristics

Some features that are rarely seen in other assemblers:

↑ Notational typographic conventions

This manual covers the programmer's guide, examples, language references and implementation remarks. Different styles are used to identify those elements.

The background color of the element in the web page helps to distinguish between this manual and links, macroinstruction libraries, €ASM source files, test files, and objects and samples.

Dashed hyperlinks refer to another paragraph within the same page.

Underlined hyperlinks navigate to a different HTML page of this site.

Underlined hyperlinks with Link icon navigate to signpost page Links with external references.

Underlined hyperlinks with Exit icon navigate outside the EuroAssembler website. You may want to open them in a new tab or window.

The contents of this manual are organized in chapters with a tree structure.

↑ Title

Up-arrow near the chapter title is a link which navigates from the Title one level higher.

Title ↓

Down-arrow following the title navigates from the Title downward to the actual text.
Statements and rules which are worth remembering are marked with a bulb icon.

Definitions of new terms are written in blue bold italics.

Implementation details, discussions and less important personal remarks are printed with smaller font.

File names are emphasized in quotes.

Characters used in text have white background.

Short pieces of source code are displayed in a monospace font, black on yellow.

; Longer examples of source code in this manual are presented in a box.
; They may have more lines.
; Erroneous, negative or wrong examples are struck through.
 
The examples of code in macrolibraries and €ASM sources are ignored by EuroAssembler, because their physical lines begin with an HTML tag marker <.
explaining metainformation ┐
|0000:0000| ; €ASM printed output (listing) is displayed black on white background.
|0000:0000| ; It contains assembled machine code, copy of source instructions
|0000:0000| ; and error messages.

↑ Why Assembler

The assembly programming language (ASM) gives programmers the maximal possible control of emitted machine code. Of course, having to write every instruction for the Central Processing Unit (CPU) by hand is very tedious. That is why subprograms were invented: procedures, functions and macroinstructions.
A subprogram is like a black box with a documented purpose, input and output. The main difference between our own ASM subprogram and a HLL function is that when it doesn't work as expected, we can easily trace down the mistake, stepping on each machine instruction in a debugger, and that there is no-one else to blame but us.

ASM subprograms can do the same job as statements of higher-level languages (HLL) or invocations of the operating system (OS) application programming interface (API). The EuroAssembler macrolanguage lets you prepare in advance macros tailored to the problem and use them to solve a task. Such macros are similar to functions from OS or HLL libraries, and they allow programs to be developed in ASM almost as rapidly as in HLL.

The advantage of mastering the assembly language manifests itself when we are challenged with a third-party program whose source code is not available, or when a badly written program throws an exception and exits. DrWatson, debuggers or disassemblers can only show the alien code converted to assembly instructions. People who have never met ASM will hardly know how to interpret the disassembled code, while an ASM programmer will feel like a fish in its natural environment.

The main disadvantage of assemblers is the lack of standardized libraries, which unify programming in HLL such as C or Java. On the one hand, many ASM programmers build their own, which makes their sources not portable unless the necessary libraries are shipped together with the source. On the other hand, making a library with our own functions is the best way to remember all the function and parameter names, and to learn a lot about computers and operating systems.
The EuroAssembler package euroasm.zip contains several macrolibraries for a quick start and for inspiration.
Assembler is a universal construction kit. You may program whatever is possible to imagine, but first you have to prepare the building tools.

Phases of program creation

Phase          Used tool
design-time    imagination
write-time     text editor
assembly-time  assembler
combine-time   linker
link-time      linker
load-time      operating system loader
bind-time      operating system loader
run-time       processor

↑ Why Yet Another Assembler

Dissatisfaction with available tools is one of the reasons why some programmers want to invent their own language.

And last but not least, creating an assembler is a very interesting challenge. An incomplete list of assemblers and other tools that I had the pleasure to come into contact with is presented at the links [Assemblers] and [UsefulTools].

The first assembler I met when I started to flirt with the assembly language in the early 80's was IBM's FDOS for S360 mainframe computers [HLASM]. That was a very sophisticated product with advanced features such as sections, keyword operands, literals, and a macro language which was able to manipulate not only the generated machine statements, but also its own macro variables and their names.

I missed many of those features in assemblers for the Intel architecture. Some of them brought new ideas but none seemed ideal for me. [NASM] ver.0.99 was quite good, in fact the first bootstrap version of €ASM was written in it, but I was irritated when it wasn't able to automatically select SHORT or NEAR distance jumps and had other design flaws, such as not expanding preprocessing variables in quoted strings.

I always wondered why constant EQU symbols had to be declared before the first use. Why I can't declare a macro in a macro. How to solve situations when file A includes files B and C, and file C also includes file B, duplicating its definitions.

I don't like a language which is cluttered up with free space. In HLASM a space in the operand list signalised that everything up to the end of the punched card should be ignored. €ASM isn't that strict in this horror vacui, in fact white spaces may be put anywhere between language elements to improve readability. However, spaces are almost never required by syntax.

€ASM does not use English word modifiers such as SHORT, NEAR, DWORD PTR, NOSPLIT, which are identified by their value only. Instead, it prefers the Name=Value paradigm with keyword instruction modifiers such as DATA=QWORD,IMM=BYTE,MASK=K5,ZEROING=ON, which remove ambiguity and replace the ugly decorators proposed in the Intel documentation.

↑ Why EuroAssembler

  1. Euro because it comes from Czechia, the heart of Europe.
  2. Both Europe and €ASM are multilingual, as it supports national characters in identifiers and strings.
  3. € is one of the few characters left unoccupied among many *ASM assemblers :-)

↑ Licence

Permission to use EuroAssembler is granted to everybody who obeys this Licence.
There are no restrictions on purpose and scope of applications created with this tool. It may be used in private, educational or commercial environments freely.

EuroAssembler is provided free of charge as-is, without any warranty guaranteed by its author.

This software may be redistributed in unmodified zipped form, as downloaded from EuroAssembler.eu. No fee may be requested for the right to use this software.

You may disseminate euroasm.zip on other websites, repositories, FTP archives, compact disks and similar media. Please be sure to always distribute the latest available €ASM version.

Source code of EuroAssembler was written by Pavel Šrubař, AKA vitsoft, and it is copyrighted as such.

Macrolibraries and sample projects are released as public domain and they may be modified freely.

I cannot recommend modifying the libraries, though, because they may be changed in future releases of €ASM and your enhancements would be overwritten. Create your own files with vacant names instead.

You may modify €ASM source code for the sole purpose to fix a bug or to enhance it with new function, but you may not distribute such modified software. It may only be used by you on the same computer where it was edited, reassembled and linked.

EuroAssembler is not open source. I don't want to fork €ASM development into a bazaar of incompatible versions, where each branch provides different enhancements. Please propose your modifications to the author or to the €ASM forum instead, so they might be incorporated in future releases of EuroAssembler.

↑ Installation

The distribution file euroasm.zip contains folders and files as listed on the Sitemap page. The modification time of all files is equally set to the nominal release time. All file names are in lower case (Linux convention) and in 8.3 size (DOS convention), so any old DOS utility can be used for unpacking.
You may need to run the console as an administrator for an installation on a secure version of MS-Windows.

Choose and create EuroAssembler home directory, for instance C:\euroasm, change to it and unzip the downloaded euroasm.zip. Move or copy the main executable euroasm.exe to some folder from system %PATH%, so it might be launched as euroasm from anywhere. When you run it without parameters for the first time, it will create the global configuration euroasm.ini, which you should tailor now with a plain-text editor.

You may want to replace relative IncludePath= and LinkPath= in [EUROASM] section with an absolute path identifying the €ASM home directory.
In [PROGRAM] section you can specify your preferred target format, for instance Format=PE, Subsystem=CON and Width=32. You could also replace IconFile="euroasm.ico" and copy your preferred personal icon to objlib subfolder.

For the (not-recommended) bare-bone minimal installation you are now done and you could erase the whole home directory now. The executable euroasm.exe itself does not need any other supporting files, environment or registry modification.

If you prefer to read this documentation in another language, rename the default English version of this manual eadoc\index.htm to eadoc\man_eng.htm and then rename the chosen available human language translation, e.g. eadoc\man_cze.htm, to eadoc\index.htm.

For a development installation go to the home directory and unzip developer-scripts from the subarchive generate.zip. You will also need a web server and PHP (version 5.3 or higher) installed on your localhost.

Most EuroAssembler files are in HTML format, so you may want to incorporate €ASM into your local web server, if you run one on your localhost computer.

In my Apache installation I added the following paragraph to the httpd.conf or apache2.conf:

<VirtualHost *:80>
    DocumentRoot C:/euroasm/
    ServerName euroasm.localhost
</VirtualHost>

I appended the statement 127.0.0.1 euroasm.localhost to the file %SystemRoot%/SYSTEM32/drivers/etc/hosts. Now I can write euroasm.localhost into the address line of my internet browser and explore the €ASM documentation and other files locally.


↑ Input/Output

Standard streams ↓

Other I/O ↓

Messages ↓

Input/Output files ↓


Computer programs exchange information with users through various channels: standard streams, command-line parameters, environment variables, errorlevel value, disk files and devices.

↑ Standard streams

The basic form of communication between programs and a human user is a stream of characters, which is by default directed to the console terminal from which the program was launched. Streams may also be redirected to a disk file or device driver with the command-line operators >, >>, <, |.

Standard input is not used in €ASM.

Standard output prints warnings, errors and informative messages produced by €ASM.

Standard error output is not used in €ASM.

↑ Other I/O

Command-line parameters are not used. €ASM assumes that everything on the command line is the main source file name(s) intended to assemble. All options controlling the assembly & link process are defined in the configuration files euroasm.ini or directly in the source file itself.

In fact there are semi-undocumented EUROASM options which are recognized on the command line; however, the preferred place for EUROASM options is the configuration file or the source file. Command-line options are employed in test examples to suppress some variable informative messages, and their use should be kept to a minimum.

Environment variables are not used in €ASM.

Environment variables may be incorporated into the source at assembly-time using the pseudoinstruction %SETE. Of course, it is also possible to read environment variables at run-time with the corresponding API call, such as GetEnvironmentVariable().

€ASM does not use any other devices (I/O ports, printers, sound cards, graphic adapters, etc.) at assembly-time.

↑ Messages

Important information detected by EuroAssembler during its activity is published in the form of short text messages. They are written on standard output (console window) and to the listing file.

Message severity ↓

Messages in standard output ↓

Messages in listing ↓

Each message is identified by a combination of a capital letter followed by four decimal digits. The complete text of messages is defined in source file msg.htm.

The letter prefix and the first digit (0..9) declare message severity. The final errorlevel value, which euroasm.exe terminates with, is equal to the highest message severity encountered during the assembly session.

Message severity

Type of message          Prefix  Identifier range  Severity  Search marker
Informative              I       I0000..I0999      0         |#
Debugging                D       D1000..D1999      1         |#
Warning                  W       W2000..W3999      2..3      |##
Nonsuppressible warning  W       W4000..W4999      4         |##
User-defined error       U       U5000..U5999      5         |###
Error                    E       E6000..E8999      6..8      |###
Fatal                    F       F9000..F9999      9         |###

EuroAssembler is verbose by default, but it may be totally silenced when launched with the parameter NOWARN=0000..0999, provided that no error occurred in the source.

Warnings usually do not prevent the compiled target from execution; they are meant as a friendly reminder that the programmer might have forgotten something or made a typo.

Messages with a severity level ranging from 5..8 indicate that some statements were not compiled due to an error. Although the target file may be valid, it will probably not work as intended.

Fatal errors indicate an interaction failure with the operating system, resource exhaustion, file errors or internal €ASM errors. The target and listing file might have not been written at all.

Informative, debugging and warning messages in the range I0000..W3999 can be suppressed with EUROASM option NOWARN=, but this ostrich-like policy is not a good idea. It's always better to fix the root cause of the message. If you intend to publish your code, it should always assemble with an errorlevel 0.
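The relation between a message identifier, its severity and the final errorlevel can be sketched in a few lines of Python. This is only an illustration of the rules stated above, not code taken from €ASM; the function names are hypothetical.

```python
import re

# A message identifier is one capital letter followed by four decimal digits,
# e.g. "E6601". The first digit is the severity.
MESSAGE_ID = re.compile(r"^([A-Z])(\d)(\d{3})$")

def severity(message_id: str) -> int:
    """Return the severity (0..9), i.e. the first digit of the identifier."""
    match = MESSAGE_ID.match(message_id)
    if match is None:
        raise ValueError(f"not a message identifier: {message_id!r}")
    return int(match.group(2))

def search_marker(message_id: str) -> str:
    """Return the listing search marker which the table assigns to this severity."""
    level = severity(message_id)
    if level <= 1:
        return "|#"      # informative and debugging messages
    if level <= 4:
        return "|##"     # warnings, including nonsuppressible ones
    return "|###"        # user-defined errors, errors and fatal errors

def errorlevel(message_ids) -> int:
    """Return the highest severity among the given messages (0 if there are none)."""
    return max((severity(m) for m in message_ids), default=0)
```

For instance, a session which reported only one warning in the W2000..W3999 range and the error E6601 would terminate with errorlevel 6.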

↑ Messages on standard output

A typical message consists of its identifier followed by the actual tailored message text. When it is printed on standard output, the text is accompanied by a position indicator in the form of a quoted file name followed by a physical line number in curly brackets, for instance

E6601 Symbol "UnknownSym" mentioned at "t1646.htm"{71} was not found. "t1646.htm"{71}
▲▲▲▲▲                                                                 ▲▲▲▲▲▲▲▲▲▲▲▲▲▲▲
Identifier                                                         position indicator

Usually there is just one position indicator per message, but when the error was discovered in the macro expansion phase, another indicator is added which determines the line in the macro library. In case of a macro expanded in another macro, position indicators will be further chained.
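A tool which post-processes the standard output of €ASM might split a message into its identifier, text and position indicators roughly as follows. This is a minimal sketch which assumes that chained indicators are separated by white space at the end of the line; that detail is not specified here, and the function name is hypothetical.

```python
import re

# One position indicator: a quoted file name followed by a line number in curly brackets.
INDICATOR = re.compile(r'"([^"]+)"\{(\d+)\}')
# The run of one or more indicators which terminates the message line.
TRAILING = re.compile(r'(?:\s*"[^"]+"\{\d+\})+\s*$')

def parse_message(line: str):
    """Split a stdout message into (identifier, text, [(file, line_number), ...])."""
    identifier, _, rest = line.partition(" ")
    tail = TRAILING.search(rest)
    if tail is None:
        return identifier, rest.strip(), []
    text = rest[:tail.start()].strip()
    positions = [(name, int(number)) for name, number in INDICATOR.findall(tail.group())]
    return identifier, text, positions
```

Only the indicators at the very end of the line are treated as positions, so a file name quoted inside the message text itself stays part of the text.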

↑ Messages in listing

The messages printed to the listing file have a slightly different format. The position indicator is omitted, because they are inserted just below the source line which triggered the error:

|002B: | MOV SI,UnknownSym: ; E6601 expected.
|### E6601 Symbol "UnknownSym" mentioned at "t1646.htm"{71} was not found.
▲▲▲▲
marker

The message text is prefixed with a search marker which helps to find messages in listing.

So you can use the internal function Find/FindNext (Ctrl-F) of the editor or viewer used to examine the file listing.
As a matter of fact, €ASM syntax never uses multiple pound characters ##, so the search marker is unique in the listing and it helps to skip from one error or warning to the next.
You could also try the specialized €ASM listing viewer distributed as one of the sample projects.
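If no suitable viewer is at hand, the same filtering can be approximated by a short script. The following Python sketch assumes that every message line in the listing begins with the search marker in its first column, as in the example above; the function name is hypothetical.

```python
def marked_messages(listing_lines, min_hashes=2):
    """Yield (line_number, text) of listing lines which start with a search marker.

    min_hashes selects the minimal marker length:
    1 = all messages, 2 = warnings and errors, 3 = errors only.
    """
    for number, line in enumerate(listing_lines, start=1):
        if not line.startswith("|#"):
            continue
        body = line[1:]
        hashes = len(body) - len(body.lstrip("#"))   # count the leading pound characters
        if hashes >= min_hashes:
            yield number, line.rstrip("\n")
```

It could be used, for example, as `list(marked_messages(open("Hello.asm.lst", encoding="utf-8"), 3))` to collect errors only.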

Debugging messages D1??? produced by the pseudoinstruction %DISPLAY are published even when they are placed in false %IF branches or in blocks commented-out by %COMMENT..%ENDCOMMENT.

The listing file is created only during the final assembly pass, and informative messages are not printed to listing at all, except for informative linker messages in the I056? range.


↑ Input/Output files

Configuration file ↓

Source file ↓

Object file ↓

Listing file ↓

File path ↓

There are two kinds of input files which €ASM reads: configuration and source.

There are two kinds of output files which €ASM writes: object and listing.

If the output file already exists, €ASM will overwrite it without warning.

↑ Configuration file

The configuration file, which has the fixed (predetermined) name euroasm.ini, specifies default options for the assembler. €ASM queries two configuration files with identical name and structure:

A global configuration file is located in the same directory as the main executable (euroasm.exe) and it is processed once after €ASM has started. If the file does not exist, €ASM tries to create it with the factory-default contents.

The local configuration file is searched for in the same directory as the actual source file. If more than one source is specified at the command-line, local configuration files are read each time when the actual source gets processed.
The local euroasm.ini is not automatically created by €ASM. You may need to copy the global file manually, and optionally erase unchanged or unused options from the local configuration file for better performance.

Example of command line which assembles two sources:
C:\PgmFiles\euroasm.exe Source1.asm D:\Temp\Source2.asm
EuroAssembler will try to read its configuration from three files: C:\PgmFiles\euroasm.ini, .\euroasm.ini, D:\Temp\euroasm.ini.
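The list of configuration files consulted for a given command line can be derived mechanically. The sketch below reproduces the example above using Python's standard `ntpath` module (Windows path rules, available on every platform); the function name is hypothetical and the code only illustrates the lookup rule, it does not read any file.

```python
import ntpath  # Windows path rules, usable on any platform

CONFIG_NAME = "euroasm.ini"

def config_candidates(exe_path, source_paths):
    """Return the configuration files to look for, in reading order:
    the global one next to the executable, then one local file per source."""
    candidates = [ntpath.join(ntpath.dirname(exe_path), CONFIG_NAME)]
    for source in source_paths:
        # A source given without a path lies in the current directory.
        directory = ntpath.dirname(source) or "."
        candidates.append(ntpath.join(directory, CONFIG_NAME))
    return candidates
```

Calling `config_candidates(r"C:\PgmFiles\euroasm.exe", ["Source1.asm", r"D:\Temp\Source2.asm"])` yields the three files named in the example.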

The initial contents of the configuration file, which are built into euroasm.exe as factory defaults, are defined in objlib/euroasm.ini. There are two sections in the file: [EUROASM] and [PROGRAM].

The former specifies parameters for €ASM itself, such as CPU generation, what information should go to the listing file, which warnings should be suppressed etc. The parameters from [EUROASM] section of the configuration file can be redefined later in the source with the EUROASM pseudoinstruction, where you will find detailed explanation for each one of the parameters.

[PROGRAM] section of configuration file specifies the default working parameters of program which is to be created by €ASM, for instance the memory model, format and name of the object file etc. These parameters can be modified further with the PROGRAM pseudoinstruction.

The configuration parameters order is not important. Names of the parameters are case insensitive. The parameters with a boolean value accept any of the predefined enumerated tokens such as ON, YES, TRUE, ENABLE, ENABLED as true and OFF, NO, FALSE, DISABLE, DISABLED as false. They may also accept numeric expressions which are evaluated as boolean.
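The evaluation of boolean parameter values can be pictured with the following Python sketch. It covers only the enumerated tokens listed above plus plain decimal integers; treating the tokens as case insensitive and treating any nonzero number as true are assumptions of this sketch, and full numeric expressions are not handled.

```python
TRUE_TOKENS = {"ON", "YES", "TRUE", "ENABLE", "ENABLED"}
FALSE_TOKENS = {"OFF", "NO", "FALSE", "DISABLE", "DISABLED"}

def parse_boolean(value: str) -> bool:
    """Evaluate the value of a boolean configuration parameter."""
    token = value.strip().upper()        # assumption: tokens are case insensitive
    if token in TRUE_TOKENS:
        return True
    if token in FALSE_TOKENS:
        return False
    try:
        return int(token) != 0           # assumption: nonzero number means true
    except ValueError:
        raise ValueError(f"not a boolean value: {value!r}") from None
```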

When you give away the source code of your programs written in EuroAssembler, you don't have to specify which command-line parameters were used to compile and link, because they can be declared in the source itself. A typical €ASM source program begins with a configuration pseudoinstruction, such as EUROASM AUTOALIGN=YES,CPU=PENTIUM, so it is easy to tell which assembler the program was written in.
As a developer of a program written in EuroAssembler, you shouldn't rely on users of your distributed source having the same contents of euroasm.ini as you have. Specify all important settings at the beginning of the published source. A local configuration file is convenient during the development phase, when sources in the same directory do not have to explicitly specify all EUROASM and PROGRAM parameters.

The EuroAssembler options and directives can be defined in the configuration files and in the source files (by the pseudoinstruction EUROASM). They have the following order of precedence in their processing:

  1. When euroasm.exe starts, its options are already defined with built-in factory defaults.
  2. €ASM looks at the command-line; if some EUROASM keyword options are detected there, they override the options currently in charge (factory defaults).
  3. €ASM looks for the global configuration file and reapplies its options.
  4. The command-line options are reapplied again (step 2 is repeated).
  5. Then €ASM looks for source filename(s) at the command-line, and if a local configuration file exists in the same directory, it is processed and applied to the current configuration derived from the previous steps.
  6. Source file is now assembled. For each pseudoinstruction EUROASM found in the source that definition overwrites current working options.
  7. If another source file is provided at the command-line level in the same assembly session, €ASM restores configuration which was saved at the end of step 4 and then continues from step 5.
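The order of precedence described in the steps above amounts to layering several sets of options over each other. A minimal Python model, with options represented as dictionaries, might look like this; the function name and the sample values used in the test are hypothetical, only the order of the layers follows the list.

```python
def effective_options(factory, command_line, global_ini, local_ini, source_statements):
    """Layer EUROASM options in the documented order.

    Returns (options valid at the end of the source,
             snapshot restored before the next source is processed).
    """
    options = dict(factory)              # step 1: built-in factory defaults
    options.update(command_line)         # step 2: command-line keyword options
    options.update(global_ini)           # step 3: global euroasm.ini
    options.update(command_line)         # step 4: command line reapplied
    saved = dict(options)                # state restored in step 7
    options.update(local_ini)            # step 5: local euroasm.ini
    for statement in source_statements:  # step 6: each EUROASM pseudoinstruction
        options.update(statement)
    return options, saved
```

The second application of the command-line options is what makes them win over the global configuration file, while the local file and the source itself can still override them.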

↑ Source file

The source file contains the instructions to be assembled; usually it is a plain-text file or an HTML file arranged for €ASM. The file name is provided as a command-line parameter of euroasm.exe. The source file may be identified with an absolute path in the filesystem, e.g. euroasm /user/home/euroasm/MyProject/MySource.asm, or with a relative or omitted path, which will be related to the current shell or command line path.

The structure and syntax of source text, which €ASM is able to assemble and link, is described further down in this document.

↑ Object file

The main purpose of programming is to obtain the target file from the source code. The target file may be an object module or a library linkable to other files, or a binary file for special purposes, or an executable file.

The format of the output file is specified by the PROGRAM parameter FORMAT=. Their layouts were standardized by their creators many, many years ago. For more details about supported output formats see the chapter Program formats.

The final name of the target file is determined by the label used in the previously described pseudoinstruction PROGRAM, and it is appended with its default extension depending on program format. The target name is not necessarily derived from the source filename, as in many other assemblers. For instance, if the source code file has statement Hello PROGRAM FORMAT=COM, its output file will be created in the current directory with the name Hello.com, no matter what the source file is named. The default target name can be changed by the PROGRAM parameter OUTFILE=. If the OUTFILE= name is specified with relative or omitted path, current shell directory is assumed.

↑ Listing file

Dump parameters ↓
Dump separators ↓
Dump decoration ↓
List parameters ↓

A listing file is a plain text file with two columns where EuroAssembler logs its activity:

  1. The result of assembly of each statement is hexadecimally displayed in the dump column.
  2. Statements, which were processed in the previous step, are copied to the source column.

The name of the listing file is the name of the source file with the extension .lst appended, and it is created in the source file directory.
The default listing filename and location might be changed with the EUROASM parameter LISTFILE=.

↑ Dump parameters

Let's create the source file Hello.asm with the following contents:

      EUROASM DUMP=ON,DUMPWIDTH=18,DUMPALL=YES
Hello PROGRAM FORMAT=COM,LISTLITERALS=ON, \
              LISTMAP=OFF,LISTGLOBALS=OFF
       MOV DX,=B"Hello, world!$"
       MOV AH,9
       INT 21h
       RET
      ENDPROGRAM Hello

Submitting the file to EuroAssembler with the command euroasm Hello.asm will create the listing file Hello.asm.lst.

The width of the dump column expressed in characters can be specified with the EUROASM option DUMPWIDTH=. Other EUROASM options which control the dump column are the boolean DUMPALL= and DUMP=OFF, which can suppress the dump column completely.

|<-Dump column-->|<--Source column--------
<--DumpWidth=18-->
|                |      EUROASM DUMP=ON,DUMPWIDTH=18,DUMPALL=YES
|                |Hello PROGRAM FORMAT=COM,LISTLITERALS=ON, \
|                |              LISTMAP=OFF,LISTGLOBALS=OFF
|[COM]           ::::Section changed.
|0100:BA[0801]   |       MOV DX,=B"Hello, world!$"
|0103:B409       |       MOV AH,9
|0105:CD21       |       INT 21h
|0107:C3         |       RET
|[@LT1]          ====ListLiterals in section [@LT1].
|0108:48656C6C6F =B"Hello, world!$"
|010D:2C20776F72 ----Dumping all. (because of DUMPALL=YES)
|0112:6C64212400 ----Dumping all.
|                |      ENDPROGRAM Hello
                 ▲
                 column separator

↑ Dump separators

The dump column on the left side always starts with the machine comment indicator (pipe character |) and it is terminated with a listing column separator, which determines the origin of this line.

Listing column separators

Character        Function
| (pipe)         Termination of a machine comment. Used in ordinary statements, which can be reused as €ASM source.
! (exclamation)  Copy of the source line with expanded preprocessing %variables (when LISTVAR=ENABLED is used).
+ (plus)         Source line generated in %FOR,%WHILE,%REPEAT expansion (when LISTREPEAT=ENABLED is used).
+ (plus)         Source line generated in %MACRO expansion (when LISTMACRO=ENABLED is used).
: (colon)        Inserted listing line to display a changed [section].
. (full stop)    Inserted listing line to display an autoalignment stuff (when AUTOALIGN=ENABLED is used).
- (minus)        Inserted listing line to display the whole dump (when DUMPALL=ENABLED is used).
= (equal)        Inserted listing line to display data literals (when LISTLITERALS=ENABLED is used).
  (space)        Inserted envelope PROGRAM / ENDPROGRAM line.
* (asterisk)     Inserted listing line in INCLUDE* statement when filename wildcards are resolved.

As a side effect when the column separator is not |, the whole listing line has the form of a machine remark and it is ignored if the listing is submitted again as a program source.
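A script which inspects a listing could classify its lines by the column separator. The Python sketch below assumes that the separator sits exactly in column DUMPWIDTH, which is how the Hello listing with DUMPWIDTH=18 appears to be laid out; this is an inference from that example, not a documented guarantee, and the function names are hypothetical.

```python
SEPARATORS = {
    "|": "ordinary statement",
    "!": "copy with expanded %variables",
    "+": "repeat or macro expansion",
    ":": "section change",
    ".": "autoalignment",
    "-": "continuation of the dump",
    "=": "data literal",
    " ": "PROGRAM / ENDPROGRAM envelope",
    "*": "resolved INCLUDE* statement",
}

def separator_of(line, dump_width):
    """Return the column separator of a listing line, or None if there is none."""
    if not line.startswith("|") or line.startswith("|#"):
        return None                      # not a dump line, or a message line
    if len(line) < dump_width:
        return None
    return line[dump_width - 1]          # assumption: separator is in column dump_width

def is_reusable_source(line, dump_width):
    """True if the line would be assembled again when the listing is resubmitted."""
    return separator_of(line, dump_width) == "|"
```

With DUMPWIDTH=18 the line containing MOV AH,9 from the Hello listing is classified as an ordinary statement, while the inserted lines with the separators : and = are recognized as machine remarks.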

↑ Dump decoration

The dump of emitting statements starts with their hexadecimal address (offset in the current working section), terminated with a colon :. In a 16-bit section the offset is 16 bits wide (four hexadecimal digits); in 32-bit and 64-bit sections it is 32 bits wide. Then the emitted bytes follow. The data contents in the dump column are always in hexadecimal notation without an explicit number modifier. If the chosen DUMPWIDTH= is too small for all emitted bytes to fit, they are either right-trimmed and replaced with a tilde ~ (if DUMPALL=OFF), or additional lines with separator - are inserted to the listing (DUMPALL=ON).

Some other decorators are used in the dumped bytes:

Dump column decoration

Decorator  Description
~          Trimmed data indicator, used only when DUMPALL=OFF
..         Byte of reserved data (instead of hexadecimal byte value when it's initialized)
[]         Absolute relocation
()         Relative relocation
{}         Paragraph address relocation
<N         disp8*N compression used

The brackets [] and {}, which may enclose the dumped word or dword, indicate that the address requires relocation at link-time. The value printed in the listing will differ from the offset viewed in a linked code or in a debugger at run-time.

The character < followed by one decimal digit (N) signals that the previously dumped byte is an 8-bit displacement which will be left-shifted by N bits at run-time to obtain the effective displacement (the so-called disp8*N compression). The digit 1..6 specifying the scaling factor N is not emitted to the assembled code.

Brackets [ ] and { } indicate relocatable values.
|                            | EUROASM DUMPWIDTH=30,CPU=X64,SIMD=AVX512,EVEX=ENABLED
|[CODE] ▼    ▼▼    ▼         |[CODE] SEGMENT WIDTH=16
|0000:EA[0500]{0000}         | JMPF Label ; Absolute far jump encodes immediate seg:offset.
|0005:CB                     |Label: RETF
|[CODE64]                    |[CODE64] SEGMENT WIDTH=64
|00000000:62F36D28234D02<504 | VSHUFF32X4 YMM1,YMM2,[RBP+40h],4
|00000008:C3            ▲▲   | RET
<5 is a nonemitted disp8*N decorator.
▲▲ Byte displacement +02h will be bit-shifted 5 times to the left, so the effective displacement is in fact +40h.
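The arithmetic behind the <N decorator is a plain left shift of the signed displacement byte. The following Python sketch recomputes the effective displacement from the dumped byte and the digit N; the function name is hypothetical, and the check of the range 1..6 follows the description of the decorator above.

```python
def effective_displacement(disp8_byte: int, n: int) -> int:
    """Return the displacement encoded by a dumped byte followed by the <N decorator."""
    if not 0 <= disp8_byte <= 0xFF:
        raise ValueError("disp8 must fit in one byte")
    if not 1 <= n <= 6:
        raise ValueError("N is expected in the range 1..6")
    # The 8-bit displacement is a signed value in two's complement.
    signed = disp8_byte - 0x100 if disp8_byte >= 0x80 else disp8_byte
    return signed << n
```

For the dump 02<5 from the listing above, `effective_displacement(0x02, 5)` gives 0x40, which matches the operand [RBP+40h].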

The dump of not emitting statements is either empty or contains auxiliary information.

|[DATA]        |[DATA] ; Segment|section switch quotes its [name] in dump column.
|0000:         |; Empty or comment-only line just displays the offset in current section.
|0000:         |Label: ; Ditto.
|              |;; Line comment starting with double semicolon will suppress the offset in dump.
|[DATA]:0000   |Target EQU Label: ; Address symbol definition is displayed as [segment]:offset.
|4358          |%Counter %SET CX ; Assignment of preprocessing %variable dumps its contents in hexadecimal.
|TRUE          | %IF "%Counter" == "CX" ; Preprocessing construct displays the evaluated boolean condition.
|[]:0010       | Bits EQU 16 ; Scalar symbol definition is displayed with empty segment.
|FALSE         | %ELSE ; Boolean condition concerns %IF, %ELSE, %WHILE, %UNTIL.
|              | Bits EQU 32 ; Dump of statements in false conditional branches is empty.
|              | %ENDIF

↑ List parameters

A listing produced with the default (factory) configuration is more or less an exact copy of the source (except for the inserted dump column). Sometimes it is useful to check if the high-level constructs worked as expected, and this is controlled by the following boolean EUROASM options:
LISTINCLUDE=ON unrolls the contents of the included file, which is normally hidden from the main source.
LISTVAR=ON creates a copy of each statement which contains a preprocessing %variable, and replaces the %variable name with its expanded value in the copied line.
LISTMACRO=ON inserts statements expanded by the macroinstruction.
LISTREPEAT=ON inserts all iterations of the repeating constructs %FOR..%ENDFOR, %WHILE..%ENDWHILE, %REPEAT..%ENDREPEAT. A repeated expansion is listed commented-out by the dump column separator +. In the default state (defined by LISTREPEAT=DISABLED) only the first expansion is listed.

A very useful design trait of a EuroAssembler listing is that the generated listing stays reusable as source code in a following assembly session. The messages generated in the listing are ignored by the €ASM parser, so they need not be removed when we want to submit the listing file to a reassembly (nevertheless, those messages will be generated again if the cause of the error was not fixed).

I wanted to sustain this philosophy regardless of the LIST* parameters. In the default state with LISTINCLUDE=OFF the statement INCLUDE is normally listed and the contents of included file is hidden. With option LISTINCLUDE=ON it is reversed: the original INCLUDE statement is commented out by dump column separator * but the included lines are inserted into the listing and they become valid source statements. See also t2220.

When the options LISTVAR, LISTMACRO, LISTREPEAT are enabled, the original line is kept as is and the expanded lines are inserted below it, commented out by the dump column separator ! or +. See also t2230.

The EUROASM option LIST=DISABLE switches off the generation of listing lines until it is enabled again or until the end of source, whichever comes first. Such a listing is, of course, no longer reusable as source code.

↑ File path

Disk files can be specified by their absolute path, i. e. with a path which begins at filesystem root, e.g. C:\ProgFiles\euroasm.exe D:\Project\source.asm. Such files are unequivocally defined.

Files may also be specified with a relative path, e. g. euroasm ..\prowin32\skeleton.asm. Relative paths are always resolved against the current working directory.

Files can also be specified without a path, i. e. when their name contains no colon :, backslash \ or slash /. The directory used for such files is summarized in the table below:

Directory used when a file is specified without a path
Direction   File                           Directory          See also
Executable  euroasm.exe                    Exe-directory      OS PATH
Input       Global euroasm.ini             Exe-directory      OS PATH
Output      Global euroasm.ini             Exe-directory      OS PATH
Input       Local euroasm.ini              Source directory
Input       Source file                    Current directory
Input       Included source file           Include directory  EUROASM INCLUDEPATH=
Output      Target object file             Current directory  PROGRAM OUTFILE=
Output      Listing file                   Source directory   EUROASM LISTFILE=
Input       Linked module file             Link directory     EUROASM LINKPATH=
Input       Linked stub file               Link directory     PROGRAM STUBFILE=
Input       Linked icon file               Link directory     PROGRAM ICONFILE=
Import      Dynamically imported function  OS-dependent       IMPORT LIB=

The current directory is the actual folder assigned to the shell process at the moment when euroasm.exe was launched. It's never changed by €ASM.

The exe-directory is the folder where euroasm.exe was found and executed; usually it is one of the directories specified by the environment variable PATH.

The source directory is the folder where the currently assembled source file lies.

The include directory is one of the directories specified by the option EUROASM INCLUDEPATH=.

The link directory is one of the directories specified by the option EUROASM LINKPATH=.
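
The three ways of naming a file described in this chapter can be told apart by looking at the name alone. The following Python sketch only illustrates that distinction and is not part of €ASM. The function name classify_path is hypothetical, and the test for an absolute path (a leading slash or a drive letter followed by a colon and a slash) is an assumption about Windows-style names.

```python
def classify_path(name: str) -> str:
    """Classify a file name as 'absolute', 'relative' or 'no path'."""
    # A name without colon, backslash and slash carries no path at all.
    if not any(ch in name for ch in ':\\/'):
        return 'no path'
    # Assumption: absolute names start at the filesystem root, e.g. C:\ or /.
    drive_root = len(name) >= 3 and name[1] == ':' and name[2] in '\\/'
    if drive_root or name[0] in '\\/':
        return 'absolute'
    return 'relative'
```

Only names classified as 'no path' are subject to the directory lookup from the table above.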


↑ Structure of an €ASM program

Character structure ↓

Horizontal structure ↓

Vertical structure ↓

This chapter describes the format of a typical source file which €ASM understands and which it is able to compile.


↑ Character structure

Character width ↓

Character encoding ↓

Character case ↓

Character classification ↓


↑ Character width

A source file is a sequence of characters with 8-bit width or with a variable width of 8..32 bits (in UTF-8 encoding).

In particular, if the source file is written in an editor that uses WIDE (16-bit) character encoding (UTF-16), it should be saved as plain text in UTF-8 or in an 8-bit ANSI or OEM code page before it is submitted for assembly.

↑ Character encoding

A program written in €ASM may need to display messages and texts in languages other than English. A string which defines the output text will then contain characters with codepoint values above 127 (a codepoint is the ordinal number of the character in the [Unicode] chart).
Many European languages are satisfied with a limited set of 256 characters. Historically, the relation between their codes and the corresponding glyphs is called a code page.

Be aware that MS-Windows uses different code pages in console applications (OEM) and in GUI applications (ANSI), and that it converts between them automatically in some circumstances. €ASM itself never changes the code page of the source.

A programmer who needs to mix several human languages in an MS-Windows application may need to use 16-bit WIDE characters instead of 8-bit ANSI in text strings at run-time. See cpmix32 as a demo example. The wide (UTF-16) strings are declared with the pseudoinstruction DU (Define data in Unichars) instead of DB (Define data in Bytes). The wide variant of a WinAPI call must be used for visual representation of Unichar strings at run-time, e. g. TextOutW() instead of TextOutA(). However, the in-source definition of characters in a DU statement is still 8-bit, so you should tell €ASM which code page was used for writing the DU statement in the source file. This information is provided by the EUROASM CODEPAGE= option. The code page may change dynamically in the source, which allows different human languages to be mixed in one program.

The texts in your program which are meant to be displayed in the console (using the WinAPI function WriteConsoleA() or the macroinstruction StdOutput) should be written in the OEM code page. You may want to use a DOS plain-text editor, such as EDIT.COM, for writing console programs. As text-mode editors use console fonts which are in the OEM code page, the text is displayed correctly both in the editor at write-time and in the console of your program at run-time.

Conversely, text which is presented in GUI windows (using the WinAPI function TextOutA()) should be written in the ANSI code page, using a windowed editor such as Notepad.exe.

The default is EUROASM CODEPAGE=UTF-8, where characters are encoded with a variable length of one to four bytes. Thanks to the clever [UTF8] design, all non-ASCII UTF-8 characters are encoded as consecutive bytes with values in the 128..255 range, which are treated as letters in €ASM, so any UTF-8 defined character can be used in identifiers as is.
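
The byte-range property of UTF-8 can be checked with a few lines of Python. This is only an illustration outside of €ASM; the helper non_ascii_bytes and the sample identifier are made up for this sketch.

```python
def non_ascii_bytes(text: str) -> list:
    """Return the UTF-8 bytes of text that fall outside 7-bit ASCII."""
    return [b for b in text.encode('utf-8') if b > 127]

sample = 'Začátek€'              # Hypothetical identifier with non-ASCII letters.
encoded = sample.encode('utf-8')
# Every byte of a multi-byte UTF-8 sequence lies in the range 128..255,
# so it can never be mistaken for ASCII punctuation or white space.
high = non_ascii_bytes(sample)
```

Here the letters č and á occupy two bytes each and the € sign three bytes, and every one of these bytes is above 127.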

The recommended encoding of the EuroAssembler source files is UTF-8.

Unlike the 8-bit ANSI or OEM encodings, which limit the repertoire to 256 glyphs, CODEPAGE=UTF8 allows the mixing of arbitrary character codepoints defined in [Unicode], including non-European alphabets. MS-Windows API does not, by design, directly support UTF-8 strings, and they need run-time reencoding to UTF-16 which is used by the WIDE variant of the WinAPI functions, such as TextOutW(). This reencoding can be performed by WinAPI MultiByteToWideChar() or by macro DecodeUTF8. Exotic characters will be displayed correctly only if the font in use supports their glyphs, of course.

An example of a freeware text editor that supports UTF-8 encoding is [PSPad].
Some UTF-8 text editors insert the Byte Order Mark characters 0xEF, 0xBB, 0xBF at the start of the source file. EuroAssembler treats those three characters as a 3-byte long unused label at the start of the source, which usually does no harm.

↑ Character case

€ASM is a case semi-sensitive assembler.

All identifiers created by you, the programmer, are case sensitive: labels, constants, user-defined %variables, structures, macro names. On the other hand, all built-in names are case insensitive. Case insensitivity concerns all enumerations: register names, machine instructions and prefixes, built-in data types, number modifiers, pseudoinstruction names and parameters, symbol attributes, system %^variables.

The case insensitive names are presented in UPPER CASE in this manual but they may be used in lower or mixed case as well.

↑ Character classification

Each byte (8 bits) in €ASM source is treated as a character. Many characters have special purpose in assembler syntax unless they are quoted inside double or single quotes. A character is unquoted if zero or an even number of quotes appears between the start of the line and the character itself.
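
The parity rule for quotes can be modelled in Python as follows. This is a simplified sketch, not €ASM code: the function name is_unquoted is hypothetical, and as an assumption only double quotes are counted, so strings delimited by single quotes are not modelled.

```python
def is_unquoted(line: str, index: int) -> bool:
    """True if the character at line[index] is outside a quoted string."""
    # Assumption: only double quotes are counted in this sketch.
    # An even number of quotes before the character means it is unquoted.
    return line[:index].count('"') % 2 == 0
```

For the line DB "a;b" ; comment the first semicolon is quoted (one quote precedes it) while the second one is unquoted and starts a line comment.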

EOL
End-of-line control character is Line Feed alias EOL (ASCII 10).
White spaces
All other control characters, Delete and Space are considered white spaces. White spaces are mainly used as separators which can improve readability but only seldom have some syntactic significance. Unquoted multiple white spaces are treated the same way as a single one.
Digits
Digits 0..9 create numbers and identifiers. Hexadecimal numbers may also contain hexadecimal digits A..F, a..f.
Letters
Letters in €ASM are defined as a..z, A..Z, underscore _, at sign @, dollar sign $, grave accent `, question mark ? and all characters from the upper half of ASCII table (128..255).
Some of them are employed in €ASM for special purposes, too:
Underscore _ is used in identifiers and numbers as a word separator instead of space.
A leading at-sign @ indicates a literal section name.
The dollar sign $ alone is used as an identifier that specifies a dynamic symbol representing the current offset in a section.
The grave ` is used as a prefix when some filename not starting with a letter should represent a valid identifier.
Punctuation
All punctuation and other characters have special semantic meaning – operators, delimiters, modifiers etc. – unless they are enclosed in a pair of single ' or double " quotes. Punctuation characters except for the percent sign % and EOL are treated as ordinary letters when they are placed inside a quoted string.
Character classification table
ASCII     glyph  name                         function in €ASM
0..9             controls                     white space
10               line feed                    end of line
11..31           controls                     white space
32               space                        white space
33        !      exclamation mark             logical operator
34        "      double quote                 string delimiter
35        #      number sign                  modifier
36        $      dollar sign                  letter
37        %      percent sign                 preprocessor apparatus prefix
38        &      ampersand                    logical operator
39        '      apostrophe (single quote)    string delimiter
40        (      left parenthesis             priority parenthesis
41        )      right parenthesis            priority parenthesis
42        *      asterisk                     arithmetic and special operator
43        +      plus sign                    arithmetic operator
44        ,      comma                        operand separator
45        -      minus sign                   arithmetic operator
46        .      fullstop                     member separator
47        /      slash (solidus)              arithmetic operator
48..57    0..9   digits                       digit
58        :      colon                        field separator
59        ;      semicolon                    comment separator
60        <      less-than sign               logical operator, comment separator
61        =      equals sign                  logical operator, key separator, literal indicator
62        >      greater-than sign            logical operator
63        ?      question mark                letter
64        @      commercial at                letter
65..90    A..Z   uppercase letters            letter
91        [      left square bracket          content braces, substring operator
92        \      backslash (reverse solidus)  arithmetic operator, line continuation indicator
93        ]      right square bracket         content braces, substring operator
94        ^      caret (circumflex)           logical operator
95        _      underscore (low line)        letter, digit separator
96        `      grave accent                 letter
97..122   a..z   lowercase letters            letter
123       {      left curly bracket           sublist operator
124       |      vertical bar (pipe)          logical operator, comment separator
125       }      right curly bracket          sublist operator
126       ~      tilde                        logical operator, shortcut indicator
127              delete                       white space
128..255         NonASCII characters          letter
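
The classification can be condensed into a small Python function. It is a sketch for illustration only and does not reflect how €ASM is implemented; the name classify is hypothetical, and all characters with operator or delimiter roles are lumped into a single 'punctuation' class.

```python
def classify(byte: int) -> str:
    """Return the €ASM character class of a byte value 0..255."""
    if byte == 10:
        return 'end of line'
    if byte <= 32 or byte == 127:
        return 'white space'         # Controls, space and delete.
    if 48 <= byte <= 57:
        return 'digit'
    ch = chr(byte)
    if byte >= 128 or ch.isalpha() or ch in '_@$`?':
        return 'letter'              # Includes the upper half 128..255.
    return 'punctuation'
```

Counting over all 256 byte values gives 185 letters: 52 ASCII letters, the five special characters _ @ $ ` ? and the 128 non-ASCII values.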

↑ Horizontal structure

Physical line ↓

Statement ↓

Machine remark field ↓

Label field ↓

Prefix field ↓

Operation field ↓

Operand field ↓

Line remark field ↓

Line continuation ↓

An assembler source is treated as a text consisting of lines which are processed from left to right, from top to bottom.


↑ Physical line

A source file consists of physical lines. A physical line is a sequence of characters terminated with a line feed (ASCII 10). The line feed (EOL) character is part of the physical line, too.

The EOL may be omitted in the last physical line of source file.

↑ Statement

A statement is an order for €ASM to perform some action at assembly-time, that is usually to emit some code to the object file or to change its internal state. A typical statement is equivalent to a physical line but long statements might span several lines when line continuation is used.

A statement consists of several fields which are recognized by their position in the line, by the separator or by their contents. All fields are facultative (optional), any of them may be omitted. However, no operand can be used when the operation field is omitted.

Fields in the statement
Order  Field name      Termination
1.     Machine remark  | or EOL
2.     Label           : or white space
3.     Prefix          : or white space
4.     Operation       white space
5.     Operand         ,
6.     Line comment    EOL

Example of a statement:

|      machine remark      |Label |Prefix|Operation| Operands     | Line comment
|00001234:F08705[78560000] |Mutex: LOCK:  XCHG      EAX,[TheLock] ; Guard the thread.

↑ Machine remark field

A machine remark begins with a vertical bar | when it is the first non-white character on the physical line. It is terminated with the second occurrence of the same vertical bar or with the end of the physical line.

The contents of a machine remark are usually a hexadecimal address followed by the machine code emitted by the statement in question. As the field name indicates, this information is generated by the computer into the €ASM listing file, so the programmer should never need to write a machine remark manually. Machine remarks are ignored in assembler source, thus any valid €ASM listing file may be reused as the source file without change.
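
The rule for recognizing a machine remark can be sketched in Python. This is a model of the rule as stated above, not the actual €ASM parser; the function name strip_machine_remark is hypothetical.

```python
def strip_machine_remark(line: str) -> str:
    """Return the physical line without its leading machine remark."""
    stripped = line.lstrip()
    if not stripped.startswith('|'):
        return line                      # No machine remark present.
    end = stripped.find('|', 1)          # Second occurrence of the bar.
    if end < 0:
        return ''                        # Remark extends to end of line.
    return stripped[end + 1:]
```

Applied to every line of a listing, such a filter would recover the statements that the assembler actually processes when the listing is reused as source.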

↑ Label field

A label field can accommodate any of these elements:

  1. A structure or a symbol name or a block identifier, for example My1stStructure, My1stLabel:, Outer
  2. The name of a segment, section or group, for example [.data]
  3. The name of a symbolic %variable which is being set, for example %Count
  4. The colon : by itself, which explicitly tells €ASM that the label is empty, so the following field must be a prefix or an operation.

In the first case the symbolic name may begin with a period (point) ., making the label local. The symbol in the label field may be optionally terminated with one or more colons : immediately following the identifier. The white space between the label field and the next field may be omitted when the colon is used.

↑ Prefix field

The machine prefix is an order for the CPU to change its internal state at run-time. It is similar to a machine instruction code but it only modifies the following instruction at run-time. Each prefix assembles to a single-byte machine opcode.

Prefix table
Name      Group  Opcode
LOCK      1      0xF0
REP       1      0xF3
REPE      1      0xF3
REPZ      1      0xF3
REPNE     1      0xF2
REPNZ     1      0xF2
XACQUIRE  1      0xF2
XRELEASE  1      0xF3
SEGCS     2      0x2E
SEGSS     2      0x36
SEGDS     2      0x3E
SEGES     2      0x26
SEGFS     2      0x64
SEGGS     2      0x65
SELDOM    2      0x2E
OFTEN     2      0x3E
OTOGGLE   3      0x66
ATOGGLE   4      0x67

The last four mnemonic names are not known in other assemblers.
SELDOM and OFTEN may be used in front of conditional jump instructions as hints which help newer CPUs with the prediction of the jump.
OTOGGLE and ATOGGLE switch between the 16-bit and 32-bit width of the operand and address portions of machine code. They are normally generated by the assembler internally whenever needed, without an explicit request.

Up to four prefixes can be defined in one statement but not more than one prefix from the same group.
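
The grouping rule can be expressed as a short Python check. It is a sketch built from the prefix table above and says nothing about how €ASM implements the rule; the names PREFIX_GROUP and prefixes_allowed are hypothetical.

```python
# Prefix names and their groups, copied from the prefix table.
PREFIX_GROUP = {
    'LOCK': 1, 'REP': 1, 'REPE': 1, 'REPZ': 1, 'REPNE': 1, 'REPNZ': 1,
    'XACQUIRE': 1, 'XRELEASE': 1,
    'SEGCS': 2, 'SEGSS': 2, 'SEGDS': 2, 'SEGES': 2, 'SEGFS': 2, 'SEGGS': 2,
    'SELDOM': 2, 'OFTEN': 2,
    'OTOGGLE': 3, 'ATOGGLE': 4,
}

def prefixes_allowed(names) -> bool:
    """Check a list of prefix names against the one-per-group rule."""
    # Names are case insensitive and may end with a colon.
    groups = [PREFIX_GROUP[name.rstrip(':').upper()] for name in names]
    return len(groups) <= 4 and len(groups) == len(set(groups))
```

Because there are only four groups, the limit of four prefixes per statement follows from the one-per-group restriction.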

The names of the prefixes are case insensitive and reserved; they cannot be used as labels, regardless of character case. A prefix name may be terminated with colon(s) : (same as symbols).

AMD and Intel 64-bit architecture introduced special prefixes REX, XOP, VEX, MVEX, EVEX. €ASM treats them as part of the operation encoding and does not provide mnemonics for their direct declaration.

[AMDSSE5] introduced another instruction prefix DREX, but DREX-encoded instructions are not supported by €ASM as they never made it to production, as far as I know.

The segment-override prefixes SEG*S can be alternatively requested as a component of memory-variable register expression. In this case they are emitted only when they are not redundant (when they specify a non-default segment). Explicitly specified prefixes are emitted always, in the order as they appeared in the statement.

EuroAssembler warns when a prefix is used in contradiction with the CPU specification. This can be overridden by placing the prefix in a separate statement.

|0000:F091   |LOCK: XCHG AX,CX ; Prefix LOCK should not be used with register operands.
|## W2356 Prefix LOCK: is not expected in this instruction.
|0002:F0     |LOCK:            ; This can be overridden when the prefix is placed in a separate statement,
|0003:91     |      XCHG AX,CX ; for instance to investigate CPU behaviour in such a situation.
|0004:       |
|0004:6691   |         XCHG EAX,ECX ; Operand-size prefix 0x66 is emitted internally (in 16-bit segment).
|0006:6691   |OTOGGLE: XCHG EAX,ECX ; Its explicit specification has no effect,
|0008:6691   |OTOGGLE: XCHG AX,CX   ; but here it overrides the register sizes from 16 to 32 bits.

↑ Operation field

The operation field is the most important field of an assembler statement; it tells €ASM what to do: declare something, change its internal state or emit something to the object file. It often gives its name to the whole statement; we may say an EXTERN operation instead of a statement with the EXTERN pseudoinstruction in the operation field.

€ASM recognizes three types (genders) of operation:

A statement may have no operation at all:

[CODE]   ; Redirect further emitting to section [CODE].
         ; Empty statement may be used for optical separation or for comments.
Label:   ; Define a label but do not emit any data or code.
LOCK:    ; Define a machine prefix for the following instruction.

Some statements tell €ASM to generate assembled code|data to the object file, they are called emitting instructions:

↑ Operand field

Ordinal operand ↓
Keyword operand ↓
Mixing operands ↓

The operands specify the data which the operation works with. The number of operands in a statement is not limited and it depends on the operation. An operand can be a register name, number, expression, identifier, string, and almost any of their various combinations.

The operation field is separated from the first operand with at least one white-space. Operands are separated with an unquoted comma , from one another. There are two kinds of operands recognised in €ASM: ordinal and keyword.


↑ Ordinal operands

The ordinal operands (or shortly ordinals) are referred to by their order in the statement. The first operand has ordinal number one (a one-based index); in macros it is identified as %1. For instance, in the MOV AL,BL statement the AL register is operand number 1 and BL is number 2. The machine instruction MOV is known to copy the contents of the second operand to the first. A comma between operands increases the ordinal number even when the operand is empty (nothing but white spaces).

An operand of a machine instruction may represent a register, an immediate integer number, an address, or a memory variable enclosed in square braces, for instance MOV AL,[ES:SI+16].

Some other assemblers allow for different syntax of address expression, which is not supported by EuroAssembler, for instance MOV AL,ES:[SI+16] or MOV AL,[ES:16]+SI.
€ASM requires that the entire memory operand is placed inside square braces [].
↑ Keyword operands

Besides the ordinal operands €ASM introduces one more type of operand: the keyword operand (or shortly keyword). Keywords are referred to by name (key word) rather than by their position in the operand list. A keyword operand has the canonical form name=value where name is an identifier immediately followed by an equal sign.

Keyword operands have many advantages: they are self-describing (if their name is chosen wisely), they don't depend on the position in the operand list (no more tedious comma counting), they may be assigned a default value and they may be completely omitted when they have the default value.

Keyword operands are best used with macroinstructions but €ASM also employs them in some pseudoinstructions and even in machine instructions. For instance, in INC [EDI],DATA=DWORD the keyword parameter DATA= tells which form of the possible INC machine instruction (increment byte, word or dword variable) should be used.

There must be no space between the keyword and the equal sign, otherwise the operand is not recognized as a valid instruction modifier:

|0000:       |; Let's define two memory variables (with not recommended names).
|0000:3412   |DATA: DW 1234h
|0002:7856   |WORD: DW 5678h
|0004:       |
|0004:50     | PUSH AX, DATA=WORD
|0005:       |; Assembled as PUSH AX.
|0005:       |; Operand DATA=WORD is recognized as a redundant but valid instruction modifier.
|0005:       |
|0005:506A00 | PUSH AX, DATA = WORD
|0008:       |; Operand DATA = WORD is not recognized as keyword modifier
|0008:       |; due to the space which follows identifier DATA.
|0008:       |; €ASM sees the 2nd operand as a numerical comparison between symbols DATA and WORD,
|0008:       |; which happen to exist in this program (otherwise E6601 would have been issued).
|0008:       |; Their offsets (0000h and 0002h) are different, the result is boolean FALSE
|0008:       |; represented with value 0. The statement is recognized as PUSH AX, 0
|0008:       |; which is legal, because €ASM accepts integration of multiple ordinal operands
|0008:       |; to one statement in machine instructions PUSH, POP, INC, DEC.
|0008:       |; The statement is assembled as two instructions: PUSH AX and PUSH 0.
↑ Mixing keyword and ordinal operands

The order of keyword operands is not important. It is a good practice to list ordinal operands first and then all keyword operands, but keywords may be mixed freely with ordinals, too.

A keyword operand does not increase the ordinal number.
Label1: Operation1 Ordinal1,Ordinal2,,Ordinal4,,
Label2: Operation2 Ordinal1,Keyword1=Value1,Ordinal2,,Ordinal4

Operation1 in the previous example has three operands with ordinal numbers 1,2 and 4. The third operand is empty and the last two commas at the end of line are ignored, as no other nonempty operand follows.

Mixed operands are used in Operation2 and notice that Ordinal2 has an ordinal number 2 although it is the third operand on the list. Keyword operands do not count into ordinal numbers but empty operands do.
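
The numbering rules can be replayed with a small Python model. It is only a sketch under simplifying assumptions: the names KEYWORD and number_operands are hypothetical, commas inside quoted strings are not handled, and a keyword is recognized purely by an identifier immediately followed by a single equal sign.

```python
import re

# Assumption: a keyword operand starts with an identifier immediately
# followed by '=' (but not '=='); quoted commas are not handled here.
KEYWORD = re.compile(r'^[A-Za-z_@$`?][A-Za-z0-9_@$`?]*=(?!=)')

def number_operands(operand_field: str) -> dict:
    """Map ordinal numbers to the nonempty ordinal operands of a statement."""
    ordinals = {}
    number = 0
    for operand in operand_field.split(','):
        operand = operand.strip()
        if KEYWORD.match(operand):
            continue                 # Keywords do not increase the ordinal.
        number += 1                  # Empty operands still count.
        if operand:
            ordinals[number] = operand
    return ordinals
```

Both operand lists from the example above yield the ordinals 1, 2 and 4, and the operand DATA = WORD (with spaces) is counted as an ordinal rather than a keyword.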

↑ Line comment field

A line comment begins with an unquoted semicolon ; and extends to the end of the physical line. Line comments are ignored by the assembler; they are intended for the human reader of the source code.

↑ Line continuation

A statement continues on the next physical line when the line continuation character, which is an unquoted backslash \, is used at the position where the next field would normally begin.

 aLabel:       \ ; This semicolon is redundant.
     MOV EAX,  \ The first operand of MOV is destination
         EBX   ; and the second one is source.

Everything that follows the line continuation character is treated like a comment field, so the semicolon may be omitted in this case. In a multiline statement you may add comments to any physical line.

A line continuation may appear at the beginning of any field, but not inside the field.

The whole field of any statement must fit on one physical line.

The backslash \ is also used as modulo binary operator, which cannot appear at the beginning of operation, so the confusion is avoided.

;                   modulo  modulo line-continuation
;                      |      |    |  
|0000:01000200 |  DW 5 \ 4, 6 \ 4, \
|0004:03000000 |     7 \ 4, 8 \ 4
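
The dump column of this example can be verified by hand. The Python lines below recompute it, assuming that DW emits 16-bit words in little-endian byte order (as on x86) and using Python's % operator in place of the €ASM modulo operator \.

```python
import struct

# The four operands of the DW statement, evaluated as modulo operations.
values = [5 % 4, 6 % 4, 7 % 4, 8 % 4]
# Pack them as four unsigned 16-bit little-endian words.
dump = struct.pack('<4H', *values).hex().upper()
```

The remainders are 1, 2, 3 and 0, which gives the bytes 0100 0200 0300 0000 shown in the two listing lines.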

↑ Vertical structure

Block statements ↓

Switch statements ↓

Standalone statements ↓

Statements in assembler source code are processed one by one, from top to bottom. Some of them may influence the statements which follow, but most instructions are standalone. From this point of view there are three kinds of statements:


↑ Block statements

A block statement must appear in pair with its corresponding ending statement. The internal state of €ASM is changed only within the range between them, which is called a block.

A block is a continuous range of statements which starts with begin-block statement and ends with a matching end-block statement.

A block actually begins at the operation field of a begin-block statement and it ends at the operation field of the end-block statement.

Some block statements may be prematurely cancelled (broken) with an exit operation, for instance when an error is detected during a macro expansion.

Block statements
Label field                                      Operation field
Obligation  Represents                Declares   Begin block  Break        End block
mandatory   program name              program    PROGRAM      not used     ENDPROGRAM
mandatory   procedure name            symbol     PROC         not used     ENDPROC
mandatory   procedure name            symbol     PROC1        not used     ENDPROC1
mandatory   structure name            structure  STRUC        not used     ENDSTRUC
optional    block identifier          nothing    HEAD         not used     ENDHEAD
optional    block identifier          nothing    %COMMENT     not used     %ENDCOMMENT
optional    block identifier          nothing    %IF          %ELSE        %ENDIF
optional    block identifier          nothing    %WHILE       %EXITWHILE   %ENDWHILE
optional    ids of Begin/End swapped  nothing    %REPEAT      %EXITREPEAT  %ENDREPEAT
mandatory   formal control variable   %variable  %FOR         %EXITFOR     %ENDFOR
mandatory   macro name                macro      %MACRO       %EXITMACRO   %ENDMACRO

Some end-block operations can be aliased:
ENDPROC alias ENDP,
ENDPROC1 alias ENDP1,
%ENDREPEAT alias %UNTIL.

The label field of a block statement specifies the name of the program, procedure, structure or macro. In the preprocessing of a %FOR loop the label field declares a formal variable which changes its value in each loop cycle. In other preprocessing loops (%REPEAT, %WHILE) the label field is optional and it may contain identifier which optically connects the beginning and the ending of block statements together (for nesting check) but has no further significance - it does not declare a symbol.

The same block identifier may be used as the first and only operand of the corresponding end-block statement.

Assemblers do not agree on a canonical format of block pseudoinstructions. On the one hand, MASM uses the same block identifier in the label fields of both begin- and end-block statements:

MyProcedure PROC    ; MASM syntax
     ; some code
MyProcedure ENDP

This is good when you eyeball the source code for a procedure definition, as its name is on the left so it will hit your eyes when you scan the leftmost column. On the other hand, the same label appears in the source twice, making an ugly exception from the rule that a non-local symbol declaration may occur only once in the program.

Perhaps for that reason Borland chose a different syntax in TASM IDEAL mode:

 PROC MyProcedure   ; TASM syntax
        ; some code
 ENDP MyProcedure

It solves the double label problem but the name of MyProcedure never appears in the label field, although it is a regular label.

€ASM presents a compromise solution: the name of block is defined in the label field of a begin-block statement and it may appear in the end-block statement:

MyProcedure PROC  ; €ASM syntax
                  ; some code
            ENDP MyProcedure

The operand in the end-block statement may be omitted but, if used, it must be identical to the label of the corresponding begin-block statement. This helps to maintain correct block nesting because €ASM will emit an error when block identifiers don't match.

Blocks can be nested, but only correctly, that is, without overlapping one another.

Two blocks are correctly nested when one block contains the entire other block.

A %MACRO block in the example presented below contains a correctly nested %IF block.

WriteCMOS %MACRO Address,Value
           %IF %1 <= 30h
             %ERROR "Checksum protected area!"
             %EXITMACRO WriteCMOS
           %ENDIF
           MOV AL,%1
           OUT 70h,AL
           MOV AL,%2
           OUT 71h,AL
          %ENDMACRO WriteCMOS
Incorrect block nesting is only tolerated in procedures declared with the NESTINGCHECK=OFF option.
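
Correct nesting is the classic balanced-brackets problem, which a stack solves. The Python sketch below illustrates the idea on a list of operation names; it is not the €ASM implementation, the names END_OF and correctly_nested are hypothetical, and only a few of the block statements are included.

```python
# A subset of begin-block operations and their end-block counterparts.
END_OF = {'%MACRO': '%ENDMACRO', '%IF': '%ENDIF', '%FOR': '%ENDFOR',
          '%WHILE': '%ENDWHILE', 'PROC': 'ENDPROC'}

def correctly_nested(operations) -> bool:
    """Check that every block is closed before its enclosing block ends."""
    expected = []                        # Stack of pending end-block names.
    for op in operations:
        if op in END_OF:
            expected.append(END_OF[op])
        elif op in END_OF.values():
            if not expected or expected.pop() != op:
                return False             # Overlapping or unopened block.
    return not expected                  # All blocks must be closed.
```

The operations of the WriteCMOS macro above pass this check, whereas the sequence %MACRO, %IF, %ENDMACRO, %ENDIF does not.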

A block identifier in an operand field of end-block and exit-block statements usually only guards the correct binding. When blocks of the same type are nested one in another, exit-block operand can be used to identify the exiting block. As an example see t2642 where one Inner %FOR block is nested in Outer %FOR block, and the operand of %EXITFOR statement specifies which block is exited.

↑ Switch statements

A switching statement changes the internal state of €ASM for all following statements until another switching statement changes the state again, or until the end of source code is found.

There are two switching pseudoinstructions in €ASM: EUROASM and SEGMENT. The latter has two forms:
[name] SEGMENT (define a new segment) and
[name] (define new section in current segment if it wasn't defined yet, and switch emitting to this section).
Examples of switching statements:

 EUROASM  AUTOSEGMENT=OFF, CPU=486 ; Change €ASM options for all following statements.
[Subprocedures] SEGMENT PURPOSE=CODE, ALIGN=BYTE  ; Declare a new segment.
[.data]                  ; Switch emitting of following statements to previously defined segment [.data]
[StringData]             ; Define a new section in the current segment (in [.data]).

↑ Standalone statements

All the remaining pseudoinstructions and machine instructions are not logically bound with others in a vertical structure of a program, so they are standalone, by definition.


↑ Elements of an €ASM program

The size of EuroAssembler elements is not limited by design. This applies to the length of strings, physical text lines, identifiers, number notations, expressions, nesting depth and number of operands. They are kept internally as a signed 32-bit integer number so the theoretical size limit of each such element is 2 GB = 2_147_483_647 bytes (characters).

In reality it is the amount of available virtual memory and stack space which restrict elements of this size, and EuroAssembler may terminate well before with a fatal error message F9110 Cannot allocate virtual memory. or F9210 Memory reserved for machine stack is too small for this source file.

Addresses ↓

Addressing space ↓

Alignment ↓

Boolean values ↓

Boolean extensions ↓

Comments ↓

Condition codes ↓

Data types ↓

Distance ↓

Enumerated values ↓

Expressions ↓

Groups ↓

Identifiers ↓

Length ↓

Literals ↓

Memory variables ↓

Namespace ↓

Numbers ↓

Operators ↓

Registers ↓

Scope ↓

Sections ↓

Segmentation ↓

Segments ↓

Size ↓

Strings ↓

Structures↓

Symbols ↓

%Variables ↓

Width ↓


↑ Comments

Block comments ↓

Line comments ↓

Machine remarks ↓

Markup comments ↓

Comments are parts of the source code which are not processed by assembler and their only purpose is to explain the code for a human reader. There are four types of comments recognised in €ASM:


↑ Line comments

Line comments start with an unquoted semicolon; everything up to the end of line is ignored by €ASM. Line comments are copied to the listing file verbatim.

 Label: CALL SomeProc ; This is a line comment.

↑ Machine remarks

Machine remarks are written by €ASM into the listing file and they contain the generated machine code in hexadecimal notation.

A machine remark starts with a vertical bar | which is the first non-white character on the physical line. A machine remark ends with the second occurrence of the same vertical bar |. If the second | is omitted, the whole physical line is treated as a remark. This is used for inserting error messages into the listing, just below the erroneous statement.

|0030:E81234   |Label1: CALL SomeProc  ; This is a line comment.
|0033:         |Label2: COLL OtherProc ; A typing error in the operation name.
|### E6860 Unrecognized operation "COLL", ignored.

Machine remarks are ignored by €ASM and they are not copied to the listing. Instead, €ASM recreates them when the listing produced by previous assembly session is submitted as a source to the assembler.

Machine remarks are not intended to be manually inserted by a programmer into the source text; use an ordinary line comment instead.

↑ Markup comments

When a physical line begins with the less-than character <, it is treated as a markup comment and ignored up to the end of line. This makes it possible to mix source code with hypertext markup language tags. Markup comments are not copied to the listing.

Thanks to the markup comments, €ASM source code can be stored not only as plain text but also as HTML or XML hypertext.

<h2>Description of SomeProcedure</h2>
<img src="SomeImage.png"/>
SomeProcedure  PROC  ; See the image above for description.

All source code shipped with €ASM is completely stored in HTML format, which allows the source to be documented with hypertext links, tables, images and a better visual representation than simple line comments could yield.

If you want to keep your source codes in HTML, make sure that ordinary assembler statements do not start with < and rearrange the source so that every markup comment line starts with some HTML tag. You may also use void HTML tags <span/> or <!----> to start the comment line.
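
The effect of markup comments can be imitated with a one-line filter in Python. This is only an illustration; the function name drop_markup_comments is hypothetical, and it assumes that < must be the very first character of the physical line.

```python
def drop_markup_comments(source: str) -> list:
    """Return the physical lines that are not markup comments."""
    # Assumption: '<' must be the very first character of the physical line.
    return [line for line in source.split('\n') if not line.startswith('<')]
```

Applied to the three-line SomeProcedure example above, the filter keeps only the PROC statement.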

↑ Block comments

A block comment can be used to temporarily disable a portion of source code or to include documentation inside the source code.

A block comment begins with the %COMMENT statement and ends with the corresponding %ENDCOMMENT. It can span many lines of the program, and those lines do not have to start with a semicolon.
Block comments are copied into the listing file.

€ASM does not assemble the text inside the commented-out block, but it needs to parse it anyway in order to find the corresponding %ENDCOMMENT statement, so the commented-out text should be a valid source as well.

Block comments are nestable.

The text in a %COMMENT block must be correctly nested, although it is ignored.

The pseudoinstruction %COMMENT could easily be replaced with %IF 0, but the former is more intuitive.
 CALL SomeProc ; This is a line comment.
 %COMMENT  ; This is a block comment.
 COLL OtherProc ; Intentional typing error in operation name.
    %COMMENT ; This is a nested block comment.
    %ENDCOMMENT ; End of inner block comment.
    ; This statement is ignored, too.
 %ENDCOMMENT
 ; Emitting assembly continues here.
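Nesting can be tracked with a simple depth counter. The Python sketch below illustrates the idea under simplifying assumptions that are not taken from €ASM itself: the operation is the first token on the line, no label precedes it, and a semicolon always starts a line comment. The function name is hypothetical.

```python
def skip_block_comments(lines):
    """Yield only the lines that lie outside %COMMENT..%ENDCOMMENT blocks."""
    depth = 0
    for line in lines:
        tokens = line.split(";", 1)[0].split()   # Drop the line comment, then tokenize.
        operation = tokens[0].upper() if tokens else ""
        if operation == "%COMMENT":
            depth += 1                           # Entering an (inner) block comment.
            continue
        if operation == "%ENDCOMMENT":
            depth = max(depth - 1, 0)            # Leaving the innermost block comment.
            continue
        if depth == 0:
            yield line


source = [
    " CALL SomeProc ; This is a line comment.",
    " %COMMENT  ; This is a block comment.",
    " COLL OtherProc ; Intentional typing error in operation name.",
    "    %COMMENT ; This is a nested block comment.",
    "    %ENDCOMMENT ; End of inner block comment.",
    "    ; This statement is ignored, too.",
    " %ENDCOMMENT",
    " ; Emitting assembly continues here.",
]
print(list(skip_block_comments(source)))
```

Only the first and the last line of the example survive, because everything between the outer %COMMENT and its matching %ENDCOMMENT is skipped.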

↑ Identifiers

An identifier is a human-readable text which gives a name to an element of an assembler program: a symbol, register, instruction, structure etc.

Each identifier is a combination of letters and digits which begins with a letter.

The length of identifiers is not limited in €ASM and all characters are significant.


↑ Numbers

Decimal numbers ↓

Binary numbers ↓

Octal numbers ↓

Hexadecimal numbers ↓

Integer numbers overview ↓

Floating point numbers ↓

Floating point special values ↓

Character constants ↓

A number notation is a way to write a numeric value. Numeric values are kept and computed internally by €ASM as 64-bit signed integers.

Number notation is a combination of digits and number modifiers, which begins with a decimal digit (0..9).

A number modifier is one of the characters B D E G H K M P Q T appended to the end of a digit sequence, or one of 0N 0O 0X 0Y (a zero followed by a letter) prefixed in front of the other digits. All number modifiers are case insensitive. Except for the decimal format, which is the default, a modifier must always be used.

Floating point numbers shall use a period (full stop) . to separate the integer and decimal part of the number notation.

Another number modifier is the underscore character _ which is ignored by the number parser; it can be used as a digit separator instead of a space or comma for better readability of long numbers. No white space is allowed in number notation.

↑ Decimal numbers

A decimal number is a combination of decimal digits 0..9 optionally suffixed with a decimal modifier D. There are five other decimal suffixes:
K (Kilo), which tells €ASM to multiply the number by 2¹⁰=1024,
M (Mega), which tells €ASM to multiply the number by 2²⁰=1_048_576,
G (Giga), which tells €ASM to multiply the number by 2³⁰=1_073_741_824,
T (Tera), which tells €ASM to multiply the number by 2⁴⁰=1_099_511_627_776,
P (Peta), which tells €ASM to multiply the number by 2⁵⁰=1_125_899_906_842_624.

Decimal numbers may be prefixed with 0N modifier.

All six numbers in the following example have the same value: 1048576, 1048576d, 0n1048576, 1_048_576, 1024K, 1M.
Note that the decimal suffixes K, M, G, T, P multiply by powers of 2, not by powers of 10 as in the usual metric sense.

Maximal possible unsigned number which would fit into 32 bits is 0xFFFF_FFFF=4_294_967_295.

Maximal possible positive number which would fit into 63 bits is 0x7FFF_FFFF_FFFF_FFFF=9_223_372_036_854_775_807.

↑ Binary numbers

A binary number is made of digits 0 1 appended with a binary number modifier B or prefixed by a modifier 0Y. Examples: 0y101, 101b, 00110010b, 1_1111_0100B are equivalent to decimal numbers 5, 5, 50, 500 respectively.

Maximal 32-bit binary number is 1111_1111__1111_1111__1111_1111__1111_1111b.

↑ Octal numbers

Each octal digit 0..7 represents three bits of the equivalent binary notation. The number is terminated with octal suffix Q or prefixed with 0O alias 0o (digit zero followed by the capital or small letter O).

Example: 177_377q = 0o177_377 = 0xFEFF

The biggest 32-bit octal number is 37_777_777_777q.

The biggest 64-bit octal number is 1_777_777_777_777_777_777_777q.

↑ Hexadecimal numbers

Each hexadecimal digit encodes four bits in one character, which requires 2⁴=16 possible values. Therefore the ten decimal digits are extended with letters A, B, C, D, E, F with values 10, 11, 12, 13, 14, 15. Hexadecimal digits (letters) A..F are case insensitive. When the first digit of a hexadecimal number is represented by a letter A..F, an additional leading zero must be prefixed to the number notation to avoid confusion. A hexadecimal number is terminated with the suffix H or it begins with the prefix 0X.

Example: 5h, 0x32, 1F4H, 0x1388, 0C350H represent decimal numbers 5, 50, 500, 5000, 50000 respectively.

Keep in mind that all numbers in €ASM are internally kept as 64-bit signed integer. Although instructions MOV EAX,0xFFFF_FFFF and MOV EAX,-1 assemble to identical codes, their operands are internally represented as 0x0000_0000_FFFF_FFFF and 0xFFFF_FFFF_FFFF_FFFF. Boolean expression 0xFFFF_FFFF = -1 is false.

|00000000:B8FFFFFFFF | MOV EAX, 0xFFFF_FFFF
|00000005:B8FFFFFFFF | MOV EAX, -1
|FALSE               | %IF 0xFFFF_FFFF = -1
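The difference between the 64-bit internal value and the 32-bit encoded immediate can be checked with plain integer arithmetic. This Python sketch merely models the two views with bit masks; the names MASK32 and MASK64 are introduced here for illustration.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

a = 0xFFFF_FFFF   # Internally 0x0000_0000_FFFF_FFFF.
b = -1            # Internally 0xFFFF_FFFF_FFFF_FFFF.

same_imm32 = (a & MASK32) == (b & MASK32)   # The low 32 bits are identical.
equal_64 = (a & MASK64) == (b & MASK64)     # The full 64-bit values differ.

print(f"{a & MASK64:016X} {b & MASK64:016X}", same_imm32, equal_64)
```

The identical low 32 bits explain why both instructions emit the same machine code, while the comparison of the full 64-bit values yields false.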

↑ Integer numbers overview

Integers may be written in binary, decimal, octal or hexadecimal notation. Some number modifiers overlap with hexadecimal digits B, D, E. €ASM parses as much of the element as possible to solve such ambiguity:
1BH is recognized as a hexadecimal number 0x1B=27 and not binary 1 followed with letter H.
2DH is recognized as a hexadecimal number 0x2D=45 and not decimal 2 followed with letter H.
3E2H is recognized as a hexadecimal number 0x3E2=994 and not 3 * 10² followed with letter H.

Integer number notation
Notation     Prefix  Base  Suffix  Multiplier
Binary       0Y      2     B       1
Octal        0O      8     Q       1
Decimal      0N      10    D       1
                           K       2¹⁰
                           M       2²⁰
                           G       2³⁰
                           T       2⁴⁰
                           P       2⁵⁰
Hexadecimal  0X      16    H       1

Binary, octal and hexadecimal numbers must always be written with prefix or suffix (or both, however this is not recommended, and it feels awkward). There is no RADIX directive in €ASM.

For more examples of acceptable syntax see €ASM numbers tests.
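The integer notations above can be summarized by a small parser. The following Python sketch is an approximation written for this manual's examples, not €ASM's real number parser: it looks at the prefix first and otherwise at the last character, which gives the "longest reading" for cases such as 1BH. It does not handle a prefix and a suffix used together. The function name parse_integer is hypothetical.

```python
PREFIXES = {"0N": 10, "0O": 8, "0X": 16, "0Y": 2}
SUFFIX_BASE = {"B": 2, "Q": 8, "D": 10, "H": 16}
MULTIPLIERS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40, "P": 2**50}


def parse_integer(notation: str) -> int:
    """Convert an integer notation such as 1024K, 0y101 or 0C350H to its value."""
    text = notation.replace("_", "").upper()     # Underscores are digit separators.
    if not text or not text[0].isdigit():
        raise ValueError("number notation must begin with a decimal digit")
    if text[:2] in PREFIXES:
        return int(text[2:], PREFIXES[text[:2]])
    last = text[-1]
    if last in SUFFIX_BASE:                      # The final letter decides the base,
        return int(text[:-1], SUFFIX_BASE[last]) # so 1BH is hexadecimal 1B.
    if last in MULTIPLIERS:
        return int(text[:-1]) * MULTIPLIERS[last]
    return int(text)                             # Plain decimal is the default.


for sample in ("1048576d", "0n1048576", "1024K", "1M", "1BH", "3E2H", "177_377q"):
    print(sample, parse_integer(sample))
```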

↑ Floating point numbers

Floating point alias real numbers are parsed from the scientific notation with decimal point and exponent of 10, using this syntax:

FP number notation anatomy
Order  Field name          Contents
1      number sign         +, - or nothing
2      significand         digits 0..9, digit separators _
3      decimal point       .
4      fraction            digits 0..9, digit separators _
5      FP number modifier  E or e
6      exponent sign       +, - or nothing
7      exponent part       digits 0..9, digit separators _

For instance, the floating point number 1234.56E3 has the value 1234.56 * 10³=1234560.

An omitted sign is treated as +.

The decimal part can be omitted when it consists of zeros only, as in 123.00E2 = 123.E2.

The decimal point may be omitted as well when the decimal part is omitted (is equal to zero). The E modifier still specifies the floating point format. 123.00E2 = 123.E2 = 123E2 = 12300.

Exponent can be omitted when it is zero. The modifier E may be omitted in this case, too, and without the E modifier it is the presence of the decimal point which decides if the number is integer or real. In our example: 12345.67E0 = 12345.67E = 12345.67

No white space is allowed within FP number notation.

The number is considered as floating point when its notation contains either decimal point ., or modifier E (capital or small letter E), or both. Otherwise it is treated as an integer.

€ASM does not calculate with floating point numbers at assembly time.

All internal assembly-time calculations in €ASM are performed with 64-bit integers only. When an FP number is used in a mathematical expression, it is converted to an integer first. The error E6130 (number overflow) is reported if the number does not fit into 64 bits. Warning W2210 (precision lost) is reported if the FP number had a decimal part which was rounded in the conversion.

An actual FP number format [IEEE754] is maintained only when the scientific notation is used to define the static FP variable with pseudoinstruction DD, DQ, DT.

Half-precision FP numbers (float16) are not supported by €ASM, nor are they supported by processors, with the exception of two packed SIMD instructions VCVTPS2PH and VCVTPH2PS, and a few MVEX-encoded up/down conversion operations.

Unlike integer numbers, the sign of FP notation is inseparable from digits which follow. If you by mistake put a space between the sign and the number, instead of FP definition it is treated as an operation (unary minus applied to a number), and therefore the FP number is converted to integer first, before the operation is evaluated.

|00000000:001DF1C7             | DD -123.45E3  ; Single-precision FP number -123.45*10³.
|00000004:C61DFEFF             | DD - 123.45E3 ; Dword signed integer number -123450.
|00000008:00000000A023FEC0     | DQ -123.45E3  ; Double-precision FP number -123.45*10³.
|00000010:C61DFEFFFFFFFFFF     | DQ - 123.45E3 ; Qword signed integer number -123450.
|00000018:0000000000001DF10FC0 | DT -123.45E3  ; Extended-precision FP number -123.45*10³.
|00000022:                     | DT - 123.45E3 ; Tbyte integer number is not supported.
|### E6725 Datatype TBYTE expects plain floating-point number.
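The emitted bytes of the DD and DQ definitions can be cross-checked outside the assembler. This Python sketch uses the standard struct module to encode the same value as little-endian IEEE 754 and as a signed integer; it only reproduces the dumps shown in the listing and says nothing about how €ASM computes them.

```python
import struct

value = -123.45E3                     # Exactly -123450.0 as a double.

single = struct.pack("<f", value).hex().upper()      # Like DD -123.45E3
double = struct.pack("<d", value).hex().upper()      # Like DQ -123.45E3
dword = struct.pack("<i", int(value)).hex().upper()  # Like DD - 123.45E3
qword = struct.pack("<q", int(value)).hex().upper()  # Like DQ - 123.45E3

print(single, dword, double, qword)
```

The extended-precision (DT) format has no counterpart in struct, so it is left out of the sketch.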

↑ Floating point special values

Besides the standard scientific notation, floating-point numbers may be given one of the special FP constant values:

Special floating-point constant values (in hexadecimal notation)
Constant  Interpretation                   single precision (DD)  double precision (DQ)  extended precision (DT)
#ZERO     zero                             00000000               00000000_00000000      0000_00000000_00000000
+#ZERO    positive zero                    00000000               00000000_00000000      0000_00000000_00000000
-#ZERO    negative zero                    80000000               80000000_00000000      8000_00000000_00000000
#INF      infinity                         7F800000               7FF00000_00000000      7FFF_80000000_00000000
+#INF     positive infinity                7F800000               7FF00000_00000000      7FFF_80000000_00000000
-#INF     negative infinity                FF800000               FFF00000_00000000      FFFF_80000000_00000000
#PINF     pseudo infinity                  7F800000               7FF00000_00000000      7FFF_00000000_00000000
+#PINF    positive pseudo infinity         7F800000               7FF00000_00000000      7FFF_00000000_00000000
-#PINF    negative pseudo infinity         FF800000               FFF00000_00000000      FFFF_00000000_00000000
#NAN      not a number                     7FC00000               7FF80000_00000000      7FFF_C0000000_00000000
+#NAN     positive not a number            7FC00000               7FF80000_00000000      7FFF_C0000000_00000000
-#NAN     negative not a number            FFC00000               FFF80000_00000000      FFFF_C0000000_00000000
#PNAN     pseudo not a number              7F800001               7FF00000_00000001      7FFF_00000000_00000001
+#PNAN    positive pseudo not a number     7F800001               7FF00000_00000001      7FFF_00000000_00000001
-#PNAN    negative pseudo not a number     FF800001               FFF00000_00000001      FFFF_00000000_00000001
#QNAN     quiet not a number               7FC00000               7FF80000_00000000      7FFF_C0000000_00000000
+#QNAN    positive quiet not a number      7FC00000               7FF80000_00000000      7FFF_C0000000_00000000
-#QNAN    negative quiet not a number      FFC00000               FFF80000_00000000      FFFF_C0000000_00000000
#SNAN     signaling not a number           7F800001               7FF00000_00000001      7FFF_80000000_00000001
+#SNAN    positive signaling not a number  7F800001               7FF00000_00000001      7FFF_80000000_00000001
-#SNAN    negative signaling not a number  FF800001               FFF00000_00000001      FFFF_80000000_00000001

Names of special constants are case insensitive. If the sign + or - is used, it is inseparable from the name. Examples:
FourNans DY 4 * QWORD #NaN ; Define vector of four double-precision not-a-number FP values.
MOV ESI,=8*Q#ZERO ; Define 8*8 zero bytes in literal section and set ESI to point at them.
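Several of the single- and double-precision patterns from the table can be verified against the IEEE 754 encoding. The Python sketch below packs values most significant byte first, so that the output reads like the table columns; the helper names are hypothetical and the extended-precision column is not covered.

```python
import math
import struct


def single_bits(x: float) -> str:
    """Hexadecimal pattern of x as a single-precision number, MSB first."""
    return struct.pack(">f", x).hex().upper()


def double_bits(x: float) -> str:
    """Hexadecimal pattern of x as a double-precision number, MSB first."""
    return struct.pack(">d", x).hex().upper()


def single_from_bits(pattern: str) -> float:
    """Decode a single-precision pattern written as in the table."""
    return struct.unpack(">f", bytes.fromhex(pattern.replace("_", "")))[0]


print(single_bits(float("inf")), double_bits(float("-inf")), single_bits(-0.0))
print(math.isnan(single_from_bits("7FC00000")))
```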

↑ Character constants

A number can also be written as a character constant, which is a string containing not more than eight characters. Its numeric value is taken from ordinal number of each character in the ASCII table. Example of character constants and their values:

'0'   =     30h =      48
'abc' = 636261h = 6513249
"4%%" =   2534h =    9524
The least significant character is the leftmost one in the string.

Assemblers do not agree on how character constants are treated. MASM and TASM use the scriptual convention, where the order of characters in the written source code corresponds with the way we write numbers: the least significant digit is on the right side.

€ASM, as well as other newer assemblers, uses the memory convention, where the order of characters in the written source code corresponds with the order in which they are stored in memory on little-endian processors.

|                    | ; MASM and TASM:
|00000000:616263     | DB 'abc'      ; String.
|00000003:63626100   | DD 'abc'      ; Character constant.
|00000007:B863626100 | MOV EAX,'abc' ; AL='c'.
|                    | ; €ASM, FASM, GoASM, NASM, SpASM:
|00000000:616263     | DB 'abc'      ; String.
|00000003:61626300   | DD 'abc'      ; Character constant.
|00000007:B861626300 | MOV EAX,'abc' ; AL='a'.
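The two conventions differ only in byte order, which can be shown with integer conversion. In this Python sketch the function names are hypothetical, and the string is assumed to be already unescaped (so "4%%" is passed as 4%).

```python
def memory_convention(text: str) -> int:
    """Value of a character constant when the leftmost character is least significant."""
    return int.from_bytes(text.encode("ascii"), "little")


def scriptual_convention(text: str) -> int:
    """Value of a character constant when the rightmost character is least significant."""
    return int.from_bytes(text.encode("ascii"), "big")


print(hex(memory_convention("abc")), hex(scriptual_convention("abc")))
```

With the memory convention the low byte of 'abc' is 'a', which matches AL='a' in the second half of the listing.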

↑ Enumerated values

Some operands may acquire only one of the few predefined values, e.g. the EUROASM option CPU= may be 086, 186, 286, 386, 486, 586, 686, PENTIUM, P6, X64.

Although some enumerated values may look like a number, they are not countable, they merely represent a position in a predefined collection.

↑ Boolean values

Any number can be interpreted as a boolean (logical) value, too. Boolean values can acquire one of the two states: false or true. Number 0 is treated as boolean false in logical expression, any nonzero number is treated as true.

↑ Boolean extended values

All built-in €ASM boolean options have an extended repertoire of possible values. Those boolean values accept

This applies to the:

Extended boolean enumeration is used only with operands built into €ASM. They are not symbols that could be used elsewhere, such as MOV EAX,TRUE. To achieve similar functionality in macros, the programmer would have to define such symbols first, e.g.

FALSE   EQU 0
false   EQU 0
TRUE    EQU -1
true    EQU !false
MOV EAX,TRUE

When an extended Boolean value is used as the macro keyword operand, it can be also tested in the macro body with %IF, %WHILE, %UNTIL, for instance

MacroWithBool  %MACRO Bool=On
  %IF %Bool
    ; Do something when Bool is set to TRUE.
  %ELSE
    ; Do something when Bool is set to FALSE.
  %ENDIF
 %ENDMACRO MacroWithBool

Now we may invoke the macro as MacroWithBool Bool=Enable, MacroWithBool Bool=No etc.

Extended enumerated Boolean values are not allowed in logical expressions:
MacroWithBool  %MACRO Bool=0
  %IF ! %Bool
    ; Do something when Bool is set to FALSE.
  %ENDIF
  %ENDMACRO MacroWithBool

The previous example would not work with extended Boolean values, for instance MacroWithBool Bool=False will complain that E6601 Symbol "False" was not found. However, reversing the logic should work well:

MacroWithBool  %MACRO Bool=0
  %IF  %Bool
  %ELSE
    ; Do something when Bool is set to FALSE.
  %ENDIF
  %ENDMACRO MacroWithBool

↑ Strings

A string is a sequence of arbitrary characters enclosed in quotes. Either double " or single quotes ' (also called apostrophes) may be used to mark the borders of a string. The surrounding quotes do not count into the string contents. All characters within the string lose their semantic significance, with three exceptions:

  1. EOL cannot be used in strings. In other words, each portion of quoted "string data" must fit to one physical line. Definition of long strings can be split, e.g.
     |0000:5468697320697320 |MultilineString: DB "This is the first line",13,10, \
     |0008:7468652066697273~|                 "and this is the second one.",13,10,0
     |0036:                 |
  2. The same quote character which is used to surround the string cannot be used inside, unless it is doubled, e.g.
     |0000:4F27427269656E00 |Surname: DB 'O''Brien',0
     |0008:                 |
  3. The percent sign % keeps its function of a %variable prefix. Use two adjacent percent signs when a single % is required in a string, e.g.
     |0000:313030252073617665642E00 |Status: DB "100%% saved.",0
     |000C:                         |
Preprocessing %variables are expanded in strings.

No escape character is employed in €ASM, in fact the percent sign and quote escape themselves. If you need to use any of the above mentioned characters within a string, they must be doubled. This duplication (self-escaping) concerns only the notation in the source text and it does not increase the final string size in emitted computer memory.

Strings enclosed in 'single quotes' and "double quotes" are equivalent with a single exception: if the contents of a string is a filename, only double quotes may be used, because the apostrophe is a valid character when used in filenames on most filesystems. More examples of string definitions:

|0000:3830202520           |DB "80 %% "
|0005:766F74656420224E6F22 |DB "voted ""No"""
|000F:                     |DB ''   ; Empty string.
|000F:27                   |DB "'"  ; Single apostrophe.
|0010:27                   |DB '''' ; Single apostrophe.
|0011:                     |; Examples of invalid syntax (odd number of quotes):
|0011:                     |DB """
|### E6721 Invalid data expression """"".
|0011:                     |DB "It ain't necessarilly so'
|### E6721 Invalid data expression ""It ain't necessarilly so'".
|0011:                     |
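The self-escaping rules can be illustrated by a decoder which turns the source notation into the emitted characters. This Python sketch is a simplified model, not €ASM's parser: it does not expand %variables and simply rejects a lone percent sign, and the function name decode_string is hypothetical.

```python
def decode_string(literal: str) -> str:
    """Return the contents of a quoted string with doubled quotes and %% collapsed."""
    if len(literal) < 2 or literal[0] not in "\"'" or literal[-1] != literal[0]:
        raise ValueError("string must be enclosed in matching quotes")
    quote = literal[0]
    body = literal[1:-1]
    result = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch in (quote, "%"):
            if body[i:i + 2] != ch * 2:
                # A lone quote is invalid; a lone % would start a %variable,
                # which this sketch does not model.
                raise ValueError(f"unescaped {ch!r} at position {i}")
            i += 2                      # Two source characters emit one.
        else:
            i += 1
        result.append(ch)
    return "".join(result)


print(decode_string('"100%% saved."'), decode_string("'O''Brien'"))
```

Note that an apostrophe inside a double-quoted string needs no doubling, because only the surrounding quote character escapes itself.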

↑ Addressing space

The processor, otherwise known as Central Processing Unit (CPU), operates with data and communicates with its environment (registers, memory and devices). A typical operation reads a piece of information from a register, memory or port (I/O device), manipulates the data and writes it back to the environment. The smallest addressable unit is a single byte (1 B) and the number of addressable bytes is limited by the addressing space. A register is identified by its name, a device is identified by its port number, a byte in memory is identified by its address.

CPU addressing space
CPU mode  GPR       I/O port     Memory addressing
16-bit    8 * 2 B   64 KB (2¹⁶)  1 MB (2¹⁶⁺⁴)
32-bit    8 * 4 B   64 KB (2¹⁶)  4 GB (2³²)
64-bit    16 * 8 B  64 KB (2¹⁶)  16384 PB (2⁶⁴)

↑ Addresses

Addressing space is limited by the CPU architecture and by the number of wires connecting addressing pins between the CPU and the memory chips. A combination of logical zeros and ones, which can be measured on those wires, is called physical address (PhA).

From an application programmer's point of view, the processor writes or reads from virtual address (VA). If the memory segmentation is not taken into account, virtual address is sometimes called linear address (LA). As a matter of historical fact both virtual and physical address were identical only in first generations of processors operating in real mode without memory cache and memory paging.

The objects in the linked image of a protected-mode program are often addressed with an offset from the beginning of an image loaded in memory (from the ImageBase). Such offset is called relative virtual address (RVA).

Similarly, the position of data items in file formats is sometimes identified with a file address (FA), which is defined as the distance between the start of the file and the actual data item position in this file.

Address is a symbolic representation of some position in memory.

PhA, VA, LA, RVA, FA are plain non-negative integer numbers, but addressing objects or data at assembly time is rather more complicated. For historical reasons, the addressing space is divided into segments of memory and each segment is identified by the contents of a segment register. An address at assembly time is expressed as the number of bytes (the offset) between the position and the start of its segment, together with the segment identification. See also the chapters Address symbols and Address expressions.
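The relations between these address kinds are simple arithmetic. The Python sketch below uses hypothetical numbers (an ImageBase of 0x0040_0000 is only an example value) and hypothetical function names; the real-mode formula, segment shifted left by four bits plus offset, is the standard 8086 rule and is included as an assumption to illustrate the 2¹⁶⁺⁴ limit.

```python
def relative_virtual_address(va: int, image_base: int) -> int:
    """RVA is the distance of a virtual address from the start of the loaded image."""
    if va < image_base:
        raise ValueError("virtual address lies below the image base")
    return va - image_base


def real_mode_linear(segment: int, offset: int) -> int:
    """Linear address in 16-bit real mode: segment * 16 + offset."""
    return (segment << 4) + offset


print(hex(relative_virtual_address(0x0040_1000, 0x0040_0000)))
print(hex(real_mode_linear(0x1234, 0x0010)))
```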

↑ Alignment

Data and code are retrieved from memory faster when their address is aligned, which means that it is rounded up to a multiple of a power of two. Even though most IA-32 CPU instructions can cope with unaligned data, it takes more time, as the data read from memory may not be in the same cache page and the CPU may need to shift the information internally during the fetch time.

For the best performance, memory variables should be aligned to their natural alignment which corresponds with their size, see the Autoalign column in Data types table. Doublewords, for instance, have autoalign value 4, which says that the last two bits of a properly aligned address should be zero. QWORDs are aligned to 8, therefore the last three bits (8=2³) should be zero.

This alignment can be achieved explicitly with ALIGN pseudoinstruction, or with the ALIGN= keyword given in machine instruction or in PROC and PROC1 pseudoinstructions.

Memory variables are being aligned by €ASM implicitly when the EUROASM option AUTOALIGN=ON is set. For instance the statement SomeDword: DD 1234 is autoaligned by 4 (offset of SomeDword can be divided by 4 without a remainder). An important concept is the alignment stuff, which fills the space in front of the aligned instruction. It is zero 0x00 in data segments and NOP 0x90 or multibyte NOP in code segments.

The align value may be a numeric expression which evaluates to 1, 2, 4, 8 or a higher power of two. €ASM accepts without warning a zero or an empty value, too, which is identical to ALIGN=1 (it has no effect). Besides the numeric values, ALIGN also accepts the enumerated values BYTE, WORD, DWORD, QWORD, OWORD, YWORD, ZWORD or their short versions B, W, D, Q, O, Y, Z.

Alignment is always limited by the alignment of the segment in which the statement lies. If the current segment is DWORD aligned, we cannot ask for a QWORD or an OWORD alignment in this segment. The default segment alignment is OWORD (10h) in €ASM and it is increased to SectionAlign (usually 1000h) when the assembled program is in ELF or PE/DLL format.

Besides the instruction modifier ALIGN=, the alignment may also be established with the explicit ALIGN pseudoinstruction, which allows for intentional misalignment, too.
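Rounding an offset up to a power-of-two boundary is a common bit trick. The following Python sketch shows how the aligned offset and the size of the alignment stuff could be computed; it is an illustration with hypothetical function names, not the code used by €ASM.

```python
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the nearest multiple of alignment (a power of two)."""
    if alignment in (0, 1):
        return offset                        # Zero or 1 means no alignment at all.
    if alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    return (offset + alignment - 1) & -alignment


def stuff_size(offset: int, alignment: int) -> int:
    """Number of filler bytes inserted in front of the aligned statement."""
    return align_up(offset, alignment) - offset


print(hex(align_up(0x33, 4)), stuff_size(0x33, 8))
```

For a DWORD at offset 0x33 the aligned offset is 0x34, so one filler byte precedes it; its last two bits are then zero, as required.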

↑ Registers

A register is a small, fast, fixed-size variable located on the CPU chip.

Though a register remembers information written to it, it is not a part of the addressable memory. Registers can be referenced by their names only, they have no address.

Registers table
Family      REGTYPE#  Size  Members
GPR 8-bit   'B'       1     AL, AH, BL, BH, CL, CH, DL, DH,
                            DIB, SIB, BPB, SPB, R8B, R9B, R10B, R11B, R12B, R13B, R14B, R15B
                            DIL, SIL, BPL, SPL, R8L, R9L, R10L, R11L, R12L, R13L, R14L, R15L
GPR 16-bit  'W'       2     AX, BX, CX, DX, BP, SP, SI, DI, R8W, R9W, R10W, R11W, R12W, R13W, R14W, R15W
GPR 32-bit  'D'       4     EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI, R8D, R9D, R10D, R11D, R12D, R13D, R14D, R15D
GPR 64-bit  'Q'       8     RAX, RBX, RCX, RDX, RBP, RSP, RSI, RDI, R8, R9, R10, R11, R12, R13, R14, R15
Segment     'S'       2     CS, SS, DS, ES, FS, GS
FPU         'F'       10    ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7
MMX         'M'       8     MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7
XMM         'X'       16    XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7,
                            XMM8, XMM9, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15,
                            XMM16, XMM17, XMM18, XMM19, XMM20, XMM21, XMM22, XMM23,
                            XMM24, XMM25, XMM26, XMM27, XMM28, XMM29, XMM30, XMM31
AVX         'Y'       32    YMM0, YMM1, YMM2, YMM3, YMM4, YMM5, YMM6, YMM7,
                            YMM8, YMM9, YMM10, YMM11, YMM12, YMM13, YMM14, YMM15,
                            YMM16, YMM17, YMM18, YMM19, YMM20, YMM21, YMM22, YMM23,
                            YMM24, YMM25, YMM26, YMM27, YMM28, YMM29, YMM30, YMM31
AVX-512     'Z'       64    ZMM0, ZMM1, ZMM2, ZMM3, ZMM4, ZMM5, ZMM6, ZMM7,
                            ZMM8, ZMM9, ZMM10, ZMM11, ZMM12, ZMM13, ZMM14, ZMM15,
                            ZMM16, ZMM17, ZMM18, ZMM19, ZMM20, ZMM21, ZMM22, ZMM23,
                            ZMM24, ZMM25, ZMM26, ZMM27, ZMM28, ZMM29, ZMM30, ZMM31
Mask        'K'       8     K0, K1, K2, K3, K4, K5, K6, K7
Bound       'N'       16    BND0, BND1, BND2, BND3
Control     'C'       4     CR0, CR2, CR3, CR4, CR8
Debug       'E'       4     DR0, DR1, DR2, DR3, DR6, DR7
Test        'T'       4     TR3, TR4, TR5

Register names are case insensitive. General Purpose Registers (GPR) are aliased, for instance AL is another name for the lower half of AX, which is the lower half of EAX, which is the lower half of RAX.

Similarly, SIMD (AVX) registers are aliased as well: XMM0 is another name for the lower half of YMM0, which is the lower half of ZMM0.

Names of 8-bit registers DIB, SIB, BPB, SPB, R8B..R15B are aliases for the least significant byte of RDI, RSI, RBP, RSP, R8..R15. They may also be referred to as DIL, SIL, BPL, SPL, R8L..R15L, as used in the Intel manual. €ASM supports both suffixes ~L and ~B. Those registers are available in 64-bit mode only.

Some other assemblers and Intel manuals use the notation ST(0), ST(1)..ST(7) for Floating-Point Unit register names, but this syntax is not accepted in €ASM. Nor can the ST0 register be aliased as ST (top of the FPU stack).

The x86 processor contains some other registers which hold flags, descriptor tables, FPU control and status registers, but they are not listed in the table above because they are not directly accessible by their name.

↑ Condition codes

General condition codes ↓

SSE condition codes ↓

The result of some CPU operations is treated as a predicate with mnemonic shortcut that can be used as a part of instruction name.

↑ General condition codes

Some combinations of CPU flags ZF, CF, OF, SF, PF are given special names, so called condition codes. They are used in mnemonic of conditional branching using the jump instructions or in bit-manipulation general-purpose instructions.

Inverted code can be used in macroinstructions to bypass region of code when the condition is not met. See the automatic %variable inverted condition code.

General condition codes table
Num. value  Mnemonic code  Alias  Description            Condition         Inverted mnem. code
0x4         E              Z      Equal                  ZF=1              NE
0x5         NE             NZ     Not Equal              ZF=0              E
0x4         Z              E      Zero                   ZF=1              NZ
0x5         NZ             NE     Not Zero               ZF=0              Z
0x2         C              B      Carry                  CF=1              NC
0x3         NC             NB     Not Carry              CF=0              C
0x2         B              C      Borrow                 CF=1              NB
0x3         NB             NC     Not Borrow             CF=0              B
0x0         O                     Overflow               OF=1              NO
0x1         NO                    Not Overflow           OF=0              O
0x8         S                     Sign                   SF=1              NS
0x9         NS                    Not Sign               SF=0              S
0xA         P              PE     Parity                 PF=1              NP
0xB         NP             PO     Not Parity             PF=0              P
0xA         PE             P      Parity Even            PF=1              PO
0xB         PO             NP     Parity Odd             PF=0              PE
0x7         A              NBE    Above                  CF=0 && ZF=0      NA
0x6         NA             BE     Not Above              CF=1 || ZF=1      A
0x3         AE             NB     Above or Equal         CF=0              NAE
0x2         NAE            B      Not Above nor Equal    CF=1              AE
0x2         B              NAE    Below                  CF=1              NB
0x3         NB             AE     Not Below              CF=0              B
0x6         BE             NA     Below or Equal         CF=1 || ZF=1      NBE
0x7         NBE            A      Not Below nor Equal    CF=0 && ZF=0      BE
0xF         G              NLE    Greater                SF=OF && ZF=0     NG
0xE         NG             LE     Not Greater            SF<>OF || ZF=1    G
0xD         GE             NL     Greater or Equal       SF=OF             NGE
0xC         NGE            L      Not Greater nor Equal  SF<>OF            GE
0xC         L              NGE    Less                   SF<>OF            NL
0xD         NL             GE     Not Less               SF=OF             L
0xE         LE             NG     Less or Equal          SF<>OF || ZF=1    NLE
0xF         NLE            G      Not Less nor Equal     SF=OF && ZF=0     LE
            CXZ                   CX register is Zero    CX=0
            ECXZ                  ECX register is Zero   ECX=0
            RCXZ                  RCX register is Zero   RCX=0
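In the table, every condition and its inverse differ only in the lowest bit of the numeric value, so an inverted code can be derived by toggling that bit. The Python sketch below copies the numeric values from the table; the dictionary and function names are hypothetical, and the codes CXZ, ECXZ, RCXZ are omitted because they have no numeric value.

```python
CONDITION_CODES = {
    "O": 0x0, "NO": 0x1,
    "B": 0x2, "C": 0x2, "NAE": 0x2, "NB": 0x3, "NC": 0x3, "AE": 0x3,
    "E": 0x4, "Z": 0x4, "NE": 0x5, "NZ": 0x5,
    "BE": 0x6, "NA": 0x6, "A": 0x7, "NBE": 0x7,
    "S": 0x8, "NS": 0x9,
    "P": 0xA, "PE": 0xA, "NP": 0xB, "PO": 0xB,
    "L": 0xC, "NGE": 0xC, "NL": 0xD, "GE": 0xD,
    "LE": 0xE, "NG": 0xE, "G": 0xF, "NLE": 0xF,
}


def inverted_value(mnemonic: str) -> int:
    """Numeric value of the condition which is true when the given one is false."""
    return CONDITION_CODES[mnemonic.upper()] ^ 1


print(hex(inverted_value("E")), hex(inverted_value("NLE")))
```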

↑ SSE condition codes

Streaming Single Instruction Multiple Data Extension instructions (V)CMPccSS, (V)CMPccSD, (V)CMPccPS, (V)CMPccPD use a different set of condition codes cc.

Only aliased mnemonic code is documented for legacy instructions CMPccSS,CMPccSD,CMPccPS,CMPccPD.
SSE condition codes table
Num. value  Mnemonic code  Alias  Description
0x00        EQ_OQ          EQ     Equal, Ordered, Quiet
0x01        LT_OS          LT     Less Than, Ordered, Signaling
0x02        LE_OS          LE     Less than or Equal, Ordered, Signaling
0x03        UNORD_Q        UNORD  Unordered, Quiet
0x04        NEQ_UQ         NEQ    Not Equal, Unordered, Quiet
0x05        NLT_US         NLT    Not Less Than, Unordered, Signaling
0x06        NLE_US         NLE    Not Less than or Equal, Unordered, Signaling
0x07        ORD_Q          ORD    Ordered, Quiet
0x08        EQ_UQ                 Equal, Unordered, Quiet
0x09        NGE_US         NGE    Not Greater than or Equal, Unordered, Signaling
0x0A        NGT_US         NGT    Not Greater Than, Unordered, Signaling
0x0B        FALSE_OQ       FALSE  False, Ordered, Quiet
0x0C        NEQ_OQ                Not Equal, Ordered, Quiet
0x0D        GE_OS          GE     Greater than or Equal, Ordered, Signaling
0x0E        GT_OS          GT     Greater Than, Ordered, Signaling
0x0F        TRUE_UQ        TRUE   True, Unordered, Quiet
0x10        EQ_OS                 Equal, Ordered, Signaling
0x11        LT_OQ                 Less Than, Ordered, Quiet
0x12        LE_OQ                 Less than or Equal, Ordered, Quiet
0x13        UNORD_S               Unordered, Signaling
0x14        NEQ_US                Not Equal, Unordered, Signaling
0x15        NLT_UQ                Not Less Than, Unordered, Quiet
0x16        NLE_UQ                Not Less than or Equal, Unordered, Quiet
0x17        ORD_S                 Ordered, Signaling
0x18        EQ_US                 Equal, Unordered, Signaling
0x19        NGE_UQ                Not Greater than or Equal, Unordered, Quiet
0x1A        NGT_UQ                Not Greater Than, Unordered, Quiet
0x1B        FALSE_OS              False, Ordered, Signaling
0x1C        NEQ_OS                Not Equal, Ordered, Signaling
0x1D        GE_OQ                 Greater than or Equal, Ordered, Quiet
0x1E        GT_OQ                 Greater Than, Ordered, Quiet
0x1F        TRUE_US               True, Unordered, Signaling

↑ Operators

An operator is an instruction to compute something at assembly time.

Combinations of punctuation characters are used in €ASM to prescribe various operations with numbers, addresses, strings and registers in the assembly process. Placing a binary operator between two numbers tells €ASM to replace these three elements with the result of the operation. Some operators are unary; they modify the value of the operand which they stand in front of.

All operations implemented in €ASM are presented in the following table.

Operation table
Operation               Priority  Properties                   Left operand  Operator  Right operand  Result              II (6)
Membership              16        binary noncomm. (1)          identifier    .         identifier     identifier
Attribute               15        unary noncomm. (3)                         attr#     element        number or address
Case-insens. Equal      14        binary commutative (2)       string        ==        string         boolean             CMPS
Case-sens. Equal        14        binary commutative           string        ===       string         boolean             CMPS
Case-insens. Nonequal   14        binary commutative (2)       string        !==       string         boolean             CMPS
Case-sens. Nonequal     14        binary commutative           string        !===      string         boolean             CMPS
Plus                    13        unary (3)                                  +         number         numeric             NOP
Minus                   13        unary (3)                                  -         number         numeric             NEG
Shift Logical Left      12        binary noncommutative        number        <<        number         numeric             SHL
Shift Arithmetic Left   12        binary noncommutative        number        #<<       number         numeric             SAL
Shift Logical Right     12        binary noncommutative        number        >>        number         numeric             SHR
Shift Arithmetic Right  12        binary noncommutative        number        #>>       number         numeric             SAR
Signed Division         11        binary noncommutative        number        #/        number         numeric             IDIV
Division                11        binary noncommutative        number        /         number         numeric             DIV
Signed Modulo           11        binary noncommutative        number        #\        number         numeric             IDIV
Modulo                  11        binary noncommutative        number        \         number         numeric             DIV
Signed Multiplication   11        binary commutative           number        #*        number         numeric             IMUL
Multiplication          11        binary commutative           number        *         number         numeric             MUL
Scaling                 10        binary commutative (5)       number        *         register       address expression
Addition                9         binary commutative           number        +         number         numeric             ADD
Subtraction             9         binary noncommutative        number        -         number         numeric             SUB
Indexing                9         binary commutative (5)       number        +         register       address expression
Bitwise NOT             8         unary (3)                                  ~         number         numeric             NOT
Bitwise AND             7         binary commutative           number        &         number         numeric             AND
Bitwise OR              6         binary commutative           number        |         number         numeric             OR
Bitwise XOR             6         binary commutative           number        ^         number         numeric             XOR
Above                   5         binary noncommutative        number        >         number         boolean             JA
Greater                 5         binary noncommutative        number        #>        number         boolean             JG
Below                   5         binary noncommutative        number        <         number         boolean             JB
Lower                   5         binary noncommutative        number        #<        number         boolean             JL
Above or Equal          5         binary noncommutative        number        >=        number         boolean             JAE
Greater or Equal        5         binary noncommutative        number        #>=       number         boolean             JGE
Below or Equal          5         binary noncommutative        number        <=        number         boolean             JBE
Lower or Equal          5         binary noncommutative        number        #<=       number         boolean             JLE
Numeric Equal           5         binary commutative           number        =         number         boolean             JE
Numeric Nonequal        5         binary commutative (4)       number        != or <>  number         boolean             JNE
Logical NOT             4         unary (3)                                  !         number         boolean             NOT
Logical AND             3         binary commutative           number        &&        number         boolean             AND
Logical OR              2         binary commutative           number        ||        number         boolean             OR
Logical XOR             2         binary commutative           number        ^^        number         boolean             XOR
Segment separation      1         binary noncommutative        number        :         number         address expression
Data duplication        0         binary noncomm. (1) (5)      number        *         datatype       data expression
Range                   0         binary noncomm. (1)          number        ..        number         range
Substring               0         binary noncomm. (1)          text          [ ]       range          text
Sublist                 0         binary noncomm. (1)          text          { }       range          text

(1) Special operations Membership, Duplication, Range, Substring, Sublist are solved at parser level rather than by the €ASM expression evaluator. They are listed here only for completeness.

(2) Case insensitive string-compare operations ignore the character case of letters A..Z but not the case of accented national letters above ASCII 127.

(3) Unary operator applies to the following operand. Binary operators work with two operands. Attribute operator applies to the following element or expression in parenthesis/brackets.

(4) Numeric Nonequal operation has two aliased operators != and <>. You can choose whichever you like.

(5) Operation Multiplication, Scaling and Duplication share the same operator *. Similary Addition and Indexing share operator +. The actual operation is determined by the operands types.

(6) Column II illustrates which equivalent machine instruction is used internally to compute the operation at assembly-time.

The commutative property specifies whether both operands of a binary operation can be exchanged without affecting the result.

The Priority column specifies the order in which operators are processed. Higher-priority operations are computed sooner, but this can be changed with priority parentheses ( ). Operations with equal priority are computed in their notation order (from left to right).

Operations which calculate with signed integers have the operator prefixed with #. Operations Addition and Subtraction do not need a special "#signed" version because they compute with signed and unsigned integer numbers in the same way.

Both numeric and boolean operations return 64-bit number. In case of boolean operations the result number has one of the two possible values: 0 (FALSE) or -1 = 0xFFFF_FFFF_FFFF_FFFF (TRUE). For example the expression
'+' & %1 #>= 0 | '-' & %1 #< 0 is evaluated as
('+' & (%1 #>= 0)) | ('-' & (%1 #< 0)) and its result is the minus sign (45) if %1 is negative and plus sign (43) otherwise.
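The masking trick in this expression can be modeled outside €ASM. The following Python sketch only illustrates the rule described above; it is not €ASM code, the helper names boolean and sign_char are hypothetical, and the 64-bit TRUE value is emulated with a mask.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF  # all 64 bits set, the value of boolean TRUE


def boolean(condition: bool) -> int:
    """Return the 64-bit boolean described above: all ones or zero."""
    return MASK64 if condition else 0


def sign_char(value: int) -> int:
    """Model of  '+' & %1 #>= 0 | '-' & %1 #< 0  for a signed value."""
    plus = ord('+') & boolean(value >= 0)   # 43 when value is not negative, else 0
    minus = ord('-') & boolean(value < 0)   # 45 when value is negative, else 0
    return plus | minus
```

Because TRUE has all bits set, the bitwise AND passes the character code through unchanged, while FALSE clears it.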

Spaces which separate operands and operators in expression examples serve only for better readability and they are not required by €ASM syntax.

The rich set of operators allows €ASM to get rid of cloned pseudoinstructions such as IFE, IFB, IFIDN, IFIDNI, IFDIF, ERRIDNI, ERRNB...

The Shift operators family is given higher priority than in other languages because I treat shifts as a special kind of multiplication/division.
NASM evaluates the expression 4+3<<2 as (4+3)<<2 = 28, but in €ASM it is evaluated as 4+(3<<2) = 16.


↑ Expressions

Numeric and logical expressions ↓

Address expressions ↓

Register expressions ↓

Data expressions ↓

Special expressions ↓

An expression is a combination of operands, operators and priority parentheses () which follows the rules in the table below.

Syntax of expression
What may follow         | left parenthesis | unary operator | operand | binary operator | right parenthesis | end of expression
beginning of expression | yes              | yes            | yes     | no              | no                | yes (2)
left parenthesis        | yes              | yes            | yes     | no              | yes (2)           | no
unary operator          | yes              | no             | yes     | no              | no                | no
operand                 | no               | no             | no      | yes             | yes               | yes
binary operator         | yes              | yes (1)        | yes     | no              | no                | no
right parenthesis       | no               | no             | no      | yes             | yes               | yes

(1) Unary operator is permitted after the binary operation, e.g. 5*-3 evaluates as 5*(-3).

(2) Empty expression, empty parenthesis contents and superabundant parenthesis are valid.

The table shows which combinations are permitted. It should be read by rows, for instance the first line stipulates that expression may begin with the left parenthesis, unary operator or an operand.
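Reading the table by rows can be sketched as a small lookup. This Python model is only an illustration, not the €ASM parser: the token kind names and the function is_valid_sequence are hypothetical, and the check for balanced parentheses is an added assumption which the table itself does not express.

```python
# Which token kinds may follow each kind, transcribed from the syntax table.
FOLLOWERS = {
    "begin":   {"lparen", "unary", "operand", "end"},
    "lparen":  {"lparen", "unary", "operand", "rparen"},
    "unary":   {"lparen", "operand"},
    "operand": {"binary", "rparen", "end"},
    "binary":  {"lparen", "unary", "operand"},
    "rparen":  {"binary", "rparen", "end"},
}


def is_valid_sequence(kinds):
    """Check a list of token kinds against the table, row by row."""
    previous, depth = "begin", 0
    for kind in list(kinds) + ["end"]:
        if kind not in FOLLOWERS[previous]:
            return False
        if kind == "lparen":
            depth += 1
        elif kind == "rparen":
            depth -= 1
            if depth < 0:          # assumption: parentheses must balance
                return False
        previous = kind
    return depth == 0
```

For instance, 5*-3 corresponds to the kinds operand, binary, unary, operand, which the table accepts, while two adjacent operands are rejected.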

An expression is parsed into elementary unary and binary operations, which are calculated according to their priority. Operations with the same priority are computed from left to right. Priority can be increased using parentheses ( ).

↑ Numeric and logical expressions

String compare ↓
Numeric compare ↓
Numeric arithmetic ↓
Shift ↓
Bitwise arithmetic ↓
Boolean algebra ↓
Numeric operations calculate internally with 64-bit integers, no matter if the target program is intended to run in 64-bit mode or not.

The result of a numeric or logical expression is a scalar 64-bit numeric value (signed integer). It may be treated as a number or as a logical value. A zero result is treated as boolean false and any nonzero result as boolean true. Pure logical expressions, such as logical NOT, AND, OR, XOR and all compare operations, return 0 when false and 0xFFFF_FFFF_FFFF_FFFF = -1 when true. This makes it possible to use the result of a logical expression in subsequent bitwise operations on all bits.

↑ String compare

String compare expressions return a boolean value. Case insensitive versions convert both strings to the same case before the actual comparison; however, this concerns ASCII letters A..Z only. National letters with accents in any codepage are always compared case sensitively.

String compare is given the highest priority since no other assembly-time operation can be performed with strings beside the test of equality. At assembly time €ASM cannot tell which string is "bigger".

|00000000:FFFFFFFFFFFFFFFF | DQ "EAX" == "eax" ; TRUE, the strings are equal.
|00000008:0000000000000000 | DQ "EAX" === "eax" ; FALSE, the strings differ in character case.
|00000010:FFFFFFFFFFFFFFFF | DQ "I'm OK." === 'I''m OK.' ; TRUE, their netto value is equal.
|00000018:0000000000000000 | DQ "Müller" == "MÜLLER" ; FALSE because of the different case of umlauted U's.
|00000020:0000000000000000 | DQ "012" == "12" ; FALSE, the strings are not equal.
|00000028:0000000000000000 | DQ "123" = 123 ; FALSE; the character constant "123"=3355185 which is not 123.
|00000030: | DQ "123" == 123 ; Syntax error; right operand is not a string.
|### E6321 String compare InsensEqual with non-string operand in expression ""123" == 123".
|00000030:
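The ASCII-only case folding can be illustrated with a short Python model. It is a sketch under assumptions: the function names are hypothetical, and folding to upper case (rather than lower case) is an arbitrary choice, since the manual only says that both strings are converted to the same case.

```python
def ascii_upper(text: str) -> str:
    """Upper-case only the ASCII letters a..z; leave every other character alone."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in text)


def equal_insensitive(left: str, right: str) -> bool:
    """Model of the case insensitive compare ==."""
    return ascii_upper(left) == ascii_upper(right)


def equal_sensitive(left: str, right: str) -> bool:
    """Model of the case sensitive compare ===."""
    return left == right
```

With this model "Müller" and "MÜLLER" stay different, because the accented letter is never folded.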

Case insensitive string compare should be used with built-in €ASM elements, such as register or datatype names, e.g.

 %IF '%1' !== 'ECX'
   %ERROR Only register ECX is expected as the first macro operand.
 %ENDIF

When we are investigating the presence of punctuation, it's better to use case-sensitive compare, because it assembles faster (€ASM doesn't have to convert both sides to a common character case):

DoSomethingWithMemoryVar %MACRO
 %IF '%1[1]' !=== '['  ; Test if the 1st operand begins with a square bracket.
   %ERROR The first operand should be a memory variable in [brackets].
 %ENDIF
%ENDMACRO DoSomethingWithMemoryVar

The test for a square bracket in the previous example fails if the macro operand is a string or character constant in quotes, e.g. DoSomethingWithMemoryVar 'xyz'. Because of the syntax error, the string compare operation will raise E6101 Expression "''' !=== '" is followed by unexpected character "[". A trick to avoid E6101 is to compare doubled values. In this case both single and double quotes escape themselves:

DoSomethingWithMemoryVar %MACRO
 %IF '%1[1]%1[1]' !=== '[['  ; Test if the 1st operand begins with a square bracket.
   %ERROR The first operand should be a memory variable in [brackets].
 %ENDIF
%ENDMACRO DoSomethingWithMemoryVar

↑ Numeric compare

The numeric compare operations use a single equal sign =, optionally combined with < or > and they can compare values of two plain numbers or offsets of two addresses within the same segment.

Numeric compare can be used to test which side of operation is bigger. Terms above/below are used when comparing unsigned numbers or addresses. Terms greater/lower are used for comparing signed numbers. Operators which treat numbers as signed are prefixed with # modifier. Virtual addresses are always unsigned, therefore we cannot ask whether they are greater or lower.

|00000000:FFFFFFFFFFFFFFFF | DQ 5 < 7 ; TRUE, 5 is below 7.
|00000008:FFFFFFFFFFFFFFFF | DQ 5 #< 7 ; TRUE, 5 is lower than 7.
|00000010:0000000000000000 | DQ 5 #< -7 ; FALSE, 5 is not lower than -7.
|00000018:FFFFFFFFFFFFFFFF | DQ 5 < -7 ; TRUE, 5=0x0000_0000_0000_0005 is below -7=0xFFFF_FFFF_FFFF_FFF9.
|00000020:FFFFFFFFFFFFFFFF | DQ 123 = 0123 ; TRUE, both numbers are equal.
|00000028:0000000000000000 | DQ "123" == "0123" ; FALSE, both strings are different.
|00000030:0000000000000000 | DQ "123" = "0123" ; FALSE, both sides are treated as character constants with different values.
|00000038: | DQ "123" = "000000123" ; "000000123" is not a number, it's too big for a character constant.
|### E6131 Character constant "123" = "000000123" is too big for 64 bits.
|00000038: |

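The difference between below and lower comes down to how the same 64 bits are interpreted. A minimal Python sketch, with hypothetical helper names and 64-bit values emulated by masking:

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def to_unsigned(value: int) -> int:
    """Interpret the low 64 bits as an unsigned number."""
    return value & MASK64


def to_signed(value: int) -> int:
    """Interpret the low 64 bits as a two's complement signed number."""
    value &= MASK64
    return value - (1 << 64) if value & (1 << 63) else value


def below(a: int, b: int) -> bool:
    """Model of the unsigned compare a < b."""
    return to_unsigned(a) < to_unsigned(b)


def lower(a: int, b: int) -> bool:
    """Model of the signed compare a #< b."""
    return to_signed(a) < to_signed(b)
```

Here below(5, -7) is true because -7 is seen as 0xFFFF_FFFF_FFFF_FFF9, while lower(5, -7) is false.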
↑ Numeric arithmetic

Common arithmetic operations are Addition, Subtraction, Multiplication, Division and Modulo (remainder after division).

Unary minus may be applied to a scalar numeric operand only. Unary plus does not change the value of the operand; it is included in the operator set only for completeness. An adjacent binary and unary numeric operator is accepted by €ASM, however weird this may seem. This is useful when evaluating expressions with a substituted value, such as 5 + %1 where the symbolic argument %1 happens to be negative, e.g. -2. This expression is calculated as 5 + %1 -> 5 + -2 -> 5 + (-2) -> 3.

The greatest permitted value of an integer number in €ASM source is 0xFFFF_FFFF_FFFF_FFFF -> 18_446_744_073_709_551_615 as unsigned, or 0x7FFF_FFFF_FFFF_FFFF -> 9_223_372_036_854_775_807 as signed. Overflow at assembly time is ignored in Addition, Subtraction and Shift Logical operations. An assembly error is reported when overflow occurs during Multiplication or Shift Arithmetic Left, or when division by zero happens during Division or Modulo. This maximum must not be exceeded even in intermediate results during the evaluation, such as 0x7FFF_FFFF_FFFF_FFFF * 2 / 2 (€ASM reports an error). However, the rearranged code 0x7FFF_FFFF_FFFF_FFFF * (2 / 2) assembles well.

No overflow is reported in following examples of numeric expressions evaluation:

|00000000:0E00000000000000 | DQ 2 + 3 * 4 ; Result is 14.
|00000008:0200000000000000 | DQ 0xFFFF_FFFF_FFFF_FFF9 + 0x0000_0000_0000_0009 ; Result is 2.
|00000010:0200000000000000 | DQ -7 + 9 ; Result is 2 (0xFFFF_FFFF_FFFF_FFF9 + 0x0000_0000_0000_0009).
|00000018:0200010000000000 | DQ 0xFFF9 + 0x0009 ; Result is 65538 (0x0000_0000_0000_FFF9 + 0x0000_0000_0000_0009).
|00000020: |
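The ignored overflow in Addition is ordinary 64-bit wraparound. The Python sketch below models it with a mask and also reproduces the little-endian dump column of the listing; add64 and dump are hypothetical helper names.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def add64(a: int, b: int) -> int:
    """Add two numbers and keep only the low 64 bits (overflow is ignored)."""
    return (a + b) & MASK64


def dump(value: int) -> str:
    """Format a 64-bit value as the listing shows it: eight bytes, little-endian."""
    return (value & MASK64).to_bytes(8, "little").hex().upper()
```

For example dump(add64(0xFFF9, 0x0009)) gives the same sixteen hexadecimal digits as the fourth line of the listing.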

€ASM calculates with the integer truncated division and with [Modulo] at assembly-time in the same way as machine instruction IDIV.

Before the signed division applies, both dividend and divisor are internally converted to positive numbers. Then, having been divided as unsigned, the quotient is converted to negative if one of the operands (but not both) was negative.
The remainder in the signed modulo operation is converted to negative only when the dividend was negative.

|00000000: |; Signed division:
|00000000:0300000000000000 | DQ +14 #/ +4 ; +(0x0000_0000_0000_000E / 0x0000_0000_0000_0004) is +3.
|00000008:FDFFFFFFFFFFFFFF | DQ -14 #/ +4 ; -(0x0000_0000_0000_000E / 0x0000_0000_0000_0004) is -3.
|00000010:FDFFFFFFFFFFFFFF | DQ +14 #/ -4 ; -(0x0000_0000_0000_000E / 0x0000_0000_0000_0004) is -3.
|00000018:0300000000000000 | DQ -14 #/ -4 ; +(0x0000_0000_0000_000E / 0x0000_0000_0000_0004) is +3.
|00000020: |; Unsigned division:
|00000020:0300000000000000 | DQ +14 / +4 ; (0x0000_0000_0000_000E / 0x0000_0000_0000_0004) is 3.
|00000028:FCFFFFFFFFFFFF3F | DQ -14 / +4 ; (0xFFFF_FFFF_FFFF_FFF2 / 0x0000_0000_0000_0004) is 4_611_686_018_427_387_900.
|00000030:0000000000000000 | DQ +14 / -4 ; (0x0000_0000_0000_000E / 0xFFFF_FFFF_FFFF_FFFC) is 0.
|00000038:0000000000000000 | DQ -14 / -4 ; (0xFFFF_FFFF_FFFF_FFF2 / 0xFFFF_FFFF_FFFF_FFFC) is 0.
|00000040: |; Signed modulo:
|00000040:0200000000000000 | DQ +14 #\ +4 ; +(0x0000_0000_0000_000E \ 0x0000_0000_0000_0004) is +2.
|00000048:FEFFFFFFFFFFFFFF | DQ -14 #\ +4 ; -(0x0000_0000_0000_000E \ 0x0000_0000_0000_0004) is -2.
|00000050:0200000000000000 | DQ +14 #\ -4 ; +(0x0000_0000_0000_000E \ 0x0000_0000_0000_0004) is +2.
|00000058:FEFFFFFFFFFFFFFF | DQ -14 #\ -4 ; -(0x0000_0000_0000_000E \ 0x0000_0000_0000_0004) is -2.
|00000060: |; Unsigned modulo:
|00000060:0200000000000000 | DQ +14 \ +4 ; (0x0000_0000_0000_000E \ 0x0000_0000_0000_0004) is 2.
|00000068:0200000000000000 | DQ -14 \ +4 ; (0xFFFF_FFFF_FFFF_FFF2 \ 0x0000_0000_0000_0004) is 2.
|00000070:0E00000000000000 | DQ +14 \ -4 ; (0x0000_0000_0000_000E \ 0xFFFF_FFFF_FFFF_FFFC) is 14.
|00000078:F2FFFFFFFFFFFFFF | DQ -14 \ -4 ; (0xFFFF_FFFF_FFFF_FFF2 \ 0xFFFF_FFFF_FFFF_FFFC) is 18_446_744_073_709_551_602.
|00000080: |

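The sign rules described above can be written down directly. This Python sketch is a model of the described behaviour, not the €ASM evaluator; the function names are hypothetical and the error raised for a zero divisor is only a stand-in for the assembly error.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def signed_div(dividend: int, divisor: int) -> int:
    """Truncated division: divide magnitudes, negate if exactly one operand is negative."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = abs(dividend) // abs(divisor)
    return -quotient if (dividend < 0) != (divisor < 0) else quotient


def signed_mod(dividend: int, divisor: int) -> int:
    """Remainder of truncated division: its sign follows the dividend."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    remainder = abs(dividend) % abs(divisor)
    return -remainder if dividend < 0 else remainder


def unsigned_div(dividend: int, divisor: int) -> int:
    """Unsigned division: both operands are first seen as 64-bit unsigned numbers."""
    return (dividend & MASK64) // (divisor & MASK64)


def unsigned_mod(dividend: int, divisor: int) -> int:
    """Unsigned modulo on the 64-bit unsigned interpretation of both operands."""
    return (dividend & MASK64) % (divisor & MASK64)
```

Note that this differs from Python's own // and % operators, which round toward negative infinity: -14 // 4 is -4 in Python, whereas signed_div(-14, 4) is -3.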
↑ Shift

The shift operations are not commutative. Operand on the left side is treated as a 64-bit integer and shifted to the left or right by the number of bits specified by the operand on the right side.

Shift operations at assembly time are given higher priority than other numeric operations because they correspond with computing a power of 2 rather than with multiplication or division. For instance 1 << 7 is equivalent to 1 * 2⁷ = 128.

NASM evaluates the expression 4 + 3 << 2 as (4 + 3) << 2 -> 28, but in €ASM it is evaluated as 4 + (3 << 2) -> 16.

Bits which enter the least significant bit (LSb) during Shift Left operation are always 0. Bits which enter the most significant bit (MSb) during Shift Right operation are either 0 (Shift Logical Right), or they copy their previous value (Shift Arithmetic Right), thus preserving the sign of operand.

Bits which leave LSb during Shift Right are discarded. Bits which leave MSb during Shift Left are discarded, too, but overflow error E6311 is reported by €ASM when the sign of result (kept in MSb) has changed during Shift Arithmetic Left. Overflow sensitivity is the only difference between Shift Arithmetic Left and Shift Logical Left.

The right operand may be an arbitrary number; however, when it is greater than 64, the result is 0 with one exception: a negative number shifted arithmetic right by more than 64 bits results in 0xFFFF_FFFF_FFFF_FFFF -> -1.

Shift by 0 bits does nothing. Shift by a negative number just reverses the direction of actual shift from left to right and vice versa.

Assembly-time rotate operations are not supported.

|00000000:0000010000000000 | DQ 1 << 16 ; The result is 65536.
|00000008:F4FFFFFFFFFFFFFF | DQ -3 #<< 2 ; The result is -12.
|00000010:8078675645342312 | DQ 0x1122_3344_5566_7788 << 4 ; The result is 0x1223_3445_5667_7880.
|00000018:98A9BACBDCEDFE0F | DQ 0xFFEE_DDCC_BBAA_9988 >> 4 ; The result is 0x0FFE_EDDC_CBBA_A998.
|00000020:98A9BACBDCEDFEFF | DQ 0xFFEE_DDCC_BBAA_9988 #>> 4 ; The result is 0xFFFE_EDDC_CBBA_A998.
|00000028:0000000000000000 | DQ 0x8000_0000_0000_0000 << 1 ; The result is 0x0000_0000_0000_0000.
|00000030: | DQ 0x8000_0000_0000_0000 #<< 1 ; Overflow, MSb would have been changed.
|### E6311 ShiftArithmeticLeft 64-bit overflow in "0x8000_0000_0000_0000 #<< 1".
|00000030: |

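The four shift flavours can be modeled in a few lines of Python. This is a sketch under assumptions rather than the €ASM implementation: the function names are hypothetical, OverflowError stands in for E6311, overflow is detected as "the result no longer fits a signed 64-bit number", and a count of exactly 64 is treated like any larger count.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def to_signed(value: int) -> int:
    """Interpret the low 64 bits as a two's complement signed number."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def shift_logical_left(value: int, count: int) -> int:
    if count < 0:                                   # negative count reverses direction
        return shift_logical_right(value, -count)
    return 0 if count >= 64 else ((value & MASK64) << count) & MASK64


def shift_logical_right(value: int, count: int) -> int:
    if count < 0:
        return shift_logical_left(value, -count)
    return 0 if count >= 64 else (value & MASK64) >> count


def shift_arithmetic_right(value: int, count: int) -> int:
    if count < 0:
        return shift_arithmetic_left(value, -count)
    return (to_signed(value) >> min(count, 64)) & MASK64   # copies of the sign bit enter MSb


def shift_arithmetic_left(value: int, count: int) -> int:
    if count < 0:
        return shift_arithmetic_right(value, -count)
    result = to_signed(value) << min(count, 64)
    if not -(1 << 63) <= result < (1 << 63):        # the sign bit would change
        raise OverflowError("ShiftArithmeticLeft 64-bit overflow")
    return result & MASK64
```

The results are returned as unsigned 64-bit patterns, so -12 appears as 0xFFFF_FFFF_FFFF_FFF4, matching the dump column of the listing.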
↑ Bitwise arithmetic

Bitwise NOT, AND, OR, XOR perform the logical operation on whole operands, bit by bit.

|0000:FA | DB ~ 5 ; ~ 0000_0101b is 1111_1010b which is -6.
|0001:04 | DB 5 & 12 ; 0000_0101b & 0000_1100b is 0000_0100b which is 4.
|0002:0D | DB 5 | 12 ; 0000_0101b | 0000_1100b is 0000_1101b which is 13.
|0003:09 | DB 5 ^ 12 ; 0000_0101b ^ 0000_1100b is 0000_1001b which is 9.

↑ Boolean algebra

Logical NOT, AND, OR, XOR operate with the numbers as well as with the boolean values.
Each operand, which is internally stored as a nonzero 64-bit number, is converted to boolean true (0xFFFF_FFFF_FFFF_FFFF) before the actual logical operation.
Operand with the value 0 is treated as false.

|0000:FF | DB 3 && 4 ; 0000_0011b && 0000_0100b is TRUE && TRUE (both operands are non-zero) which is TRUE.
|0001:00 | DB 3 & 4 ; 0000_0011b & 0000_0100b have no common bit set, result is 0000_0000b, which is FALSE.

↑ Address expressions

Numeric expressions operate with immediate numeric values, such as 1, 0x23, '4567' or with symbols representing such a scalar numeric value, such as NumericSymbolTen EQU 10. On the other hand, most symbols in a real assembler program represent an address value which points to some data in memory or to some position in the program code.

While a plain number (scalar) is internally stored by €ASM in eight bytes, an address needs additional room to keep information of the segment it belongs to.

Imagine yourself driving a car. You're passing the milestone 123 on a highway when some friends of yours ring you up that they're passing the milestone 97. How far are you from one another? The answer is as easy as subtracting only when you are both driving on the same highway.

The set of operations defined with address symbols is very limited in comparison with numeric expressions. They cannot be multiplied, divided, shifted or logically operated. Only two kinds of operations are allowed with addresses:

  1. A scalar numeric value may be added to the address symbol or subtracted from it. The result is an address symbol again; this operation affects the offset part of the address; the segment part remains intact.
  2. Two symbols may be subtracted from one another (or compared with one another) if they both belong to the same segment. The result is a scalar numeric value calculated as the difference of their offsets.
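The two permitted operations can be sketched as a tiny Python type. This is only a model of the rules above, not how €ASM stores addresses internally; the class name Address and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """An address modeled as a segment name plus an offset within that segment."""
    segment: str
    offset: int

    def __add__(self, scalar):
        # Rule 1: address + scalar is an address in the same segment.
        if not isinstance(scalar, int):
            return NotImplemented
        return Address(self.segment, self.offset + scalar)

    def __sub__(self, other):
        # Rule 1: address - scalar is an address in the same segment.
        if isinstance(other, int):
            return Address(self.segment, self.offset - other)
        # Rule 2: address - address is a scalar, but only within one segment.
        if isinstance(other, Address):
            if other.segment != self.segment:
                raise ValueError("addresses belong to different segments")
            return self.offset - other.offset
        return NotImplemented
```

In terms of the highway parable, Address("A1", 123) - Address("A1", 97) yields the scalar 26, while subtracting milestones of two different highways raises an error. Any other operator, such as multiplication, is simply not defined for the type.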

↑ Register expressions

Memory variables are addressed as the offset from the first byte of used memory segment (displacement) which may be updated at run-time with the contents of one or two registers. Notation of such address is called register expression or memory address expression.

Unlike instructions with immediate number embedded in the instruction code, such as ADD EAX,1234, machine instructions which load|store data somewhere from|to memory, must have the entire operand enclosed in brackets [ ]. For instance ADD EAX,[1234], where 1234 is offset of dword variable in data segment where the addend is loaded from.

MASM allows omitting the square brackets even when the operand is a variable defined in memory, for instance ADD EAX,Something. A poor reader of a MASM program has to search for the definition of the variable to learn whether it was defined in memory (Something DD 1) or as a constant (Something EQU 1). Newer assemblers abandoned this design flaw, luckily.

When the address expression is used in a machine instruction, it may be completed with register names; it becomes a register address expression. The complete address expression follows the schema
segment: base + scale * index + displacement where
segment is segment register CS, DS, ES, SS, FS, GS,
base is BX, BP in 16-bit addressing mode, or EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI, R8D..R15D in 32-bit addressing mode, or RAX, RBX, RCX, RDX, RBP, RSP, RSI, RDI, R8..R15 in 64-bit addressing mode,
scale is a numeric expression which evaluates to a scalar number 0, 1, 2, 4 or 8,
index is SI, DI in 16-bit addressing mode, or EAX, EBX, ECX, EDX, EBP, ESI, EDI, R8D..R15D in 32-bit addressing mode, or RAX, RBX, RCX, RDX, RBP, RSI, RDI, R8..R15 in 64-bit addressing mode,
displacement is an address or numeric expression with magnitude (width) not exceeding the addressing mode.

Some assemblers allow different syntax of memory addressing, for instance MOV EAX,Displ[ESI], MOV EAX,dword ptr [Displ+ESI], MOV EAX,Displ+[4*ESI], MOV EAX,Displ+4*[ESI]+[EBX].
EuroAssembler requires that the whole operand is enclosed in square brackets: MOV EAX,[Displ+4*ESI+EBX].

The order of components in addressing expression is arbitrary. Any portion of register address expression may be omitted.
Scale is not permitted in 16-bit addressing mode and scale cannot be used if the index register is not specified.
ESP and RSP cannot be used as an index register (they cannot be scaled).
Addressing modes of different sizes cannot be mixed in the same instruction, e.g. [EBX+SI].
16-bit addressing mode is not available in 64-bit CPU mode.

Registers allowed in addressing modes

16-bit addressing mode in 16-bit and 32-bit segment
  base register  | BX SS:BP
  index register | SI DI
  displacement   | 16-bit signed integer, sign-extended to segment's width at run-time

32-bit addressing mode in 16-bit and 32-bit segment
  base register  | EAX EBX ECX EDX ESI EDI SS:EBP SS:ESP
  index register | EAX EBX ECX EDX ESI EDI EBP
  displacement   | 32-bit signed integer, sign-extended|truncated to segment's width at run-time

32-bit addressing mode in 64-bit segment
  base register  | EAX EBX ECX EDX ESI EDI SS:EBP SS:ESP R8D..R15D
  index register | EAX EBX ECX EDX ESI EDI EBP R8D..R15D
  displacement   | 32-bit signed integer, sign-extended to segment's width at run-time

64-bit addressing mode in 64-bit segment
  base register  | RAX RBX RCX RDX RSI RDI SS:RBP SS:RSP R8..R15
  index register | RAX RBX RCX RDX RSI RDI RBP R8..R15
  displacement   | 32-bit signed integer, sign-extended to segment's width at run-time

MOFFS addressing mode in 16-bit, 32-bit and 64-bit segment
  base register  | none
  index register | none
  displacement   | unsigned integer of segment's width (16|32|64 bits)

When the segment register is not explicitly specified, a default segment is used for addressing the operand. If BP, EBP, RBP, ESP or RSP is used as a base register, the default segment is SS, otherwise it is DS. A nondefault segment register used for data retrieval may be specified either as an explicit instruction prefix SEGCS SEGDS SEGES SEGSS SEGFS SEGGS, or as a segment register which becomes part of the register expression (implicit segment override). The segment register may be included in the expression either with colon : (segment separator) or with plus + (indexing operator):

|0000:268A04 |SEGES MOV AL,[SI]
|0003:268A04 | MOV AL,[ES:SI]
|0006:268A04 | MOV AL,[ES+SI]

There is a subtle difference between implicit and explicit segment override: if it requests the same segment register which is already used as a default, €ASM emits the prefix only when it is specified explicitly (in the prefix field of the statement):

|0000:8B04 | MOV AX,[SI]
|0002:8B04 | MOV AX,[DS:SI]
|0004:3E8B04 | SEGDS: MOV AX,[SI]
|0007:3E8B04 | SEGDS: MOV AX,[DS:SI]

See t3021, t3022, t3023 for more examples.

In expressions where scaling is not used and therefore it's not obvious which of the two registers is meant as an index, €ASM treats the leftmost register as a base. So in [ESI+EBP] the base is ESI and implicit segment is DS, while in [EBP+ESI] the implicit segment is register SS.
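The base selection rule and the resulting default segment can be sketched in Python. This is a model under assumptions, not the €ASM implementation: the function name default_segment and the input format (register name paired with its explicit scale, or None when no scale is written) are hypothetical, and treating an expression without any base register as DS-relative is an assumption.

```python
# Base registers which imply the stack segment, as listed in the text above.
STACK_BASES = {"BP", "EBP", "RBP", "ESP", "RSP"}


def default_segment(registers):
    """Return "SS" or "DS" for registers given in source order.

    registers is a list of (name, scale) pairs; scale is None when the
    register is written without explicit scaling.
    """
    unscaled = [name.upper() for name, scale in registers if scale is None]
    if not unscaled:
        return "DS"                      # assumption: no base register at all
    base = unscaled[0]                   # the leftmost unscaled register is the base
    return "SS" if base in STACK_BASES else "DS"
```

For [ESI+EBP] the call default_segment([("ESI", None), ("EBP", None)]) returns "DS", and for [EBP+ESI] it returns "SS".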

We don't have to bother with implicit segment selection in 32-bit and 64-bit FLAT model programs, because both SS and DS are loaded with the same segment descriptor at load-time.

Although the operators * or + in register address expression look like an ordinary multiplication or addition, they specify a very different kind of operation called Scaling or Indexing when applied to a register. The actual multiplication or addition is performed at run-time rather than at assembly-time, because the assembler cannot know the contents of registers.

Indexing operation has lower priority than the corresponding Multiplication. Hence, the register expression [EBX + 5 + ESI * 2 * 2] is evaluated as [EBX + 5 + ESI * (2 * 2)] -> [EBX + 5 + ESI * 4].

↑ Data expressions

Data expression specifies static data declared with pseudoinstruction D or with literals. Format of data expression is
duplicator * type value, where duplicator is a non-negative integer number, type is primitive data type in full BYTE UNICHAR WORD DWORD QWORD TBYTE OWORD YWORD ZWORD INSTR or short B U W D Q T S O Y Z I notation, or a structure name. Optional value defines the contents of data which is repeated duplicator times.

Duplication is not a commutative operation; duplicator must be on the left side of duplication operator *. Default duplicator value is 1 (the data is not duplicated). Nested duplication is not supported in €ASM. Priority of duplication is very low, so the data expression 2 + 3 * B 4 is evaluated as five bytes where each contains the value 4. Example:

D 3 * BYTE          ; Declare three bytes with uninitialized contents.
D W 0x5             ; Declare one word with value 5.
D 2 * U "some text" ; Declare Unicode (UTF-16) string containing "some textsome text".
D 3 * MyStruc       ; Declare three instances of structured memory variable MyStruc.

See also pseudoinstruction D and tests t2480, t2481, t2482 for more examples.

↑ Special expressions

Membership ↓
Range ↓
Substring ↓
Sublist ↓

The remaining expressions are not calculated by the mathematical expression evaluator; they are evaluated by the parser.

↑ Membership

The fullstop (alias point) . which joins two identifiers makes them a fully qualified name (FQN), which looks like a namespace identifier followed by the local name. An FQN is nonlocal; it never starts with a fullstop. For instance, when a local symbol .bar is declared in a procedure or structure Foo, it is treated by €ASM as a symbol with FQN Foo.bar.

Namespace can be local, too, so the membership operation can nest.

↑ Range

Range is defined as two numeric expressions separated with range operator, which is .. (two adjacent fullstops) and it represents the set of integer numbers between those values, including the first and the last value.

A range has the property slope, which can be negative, zero or positive. Slope is defined as the sign of the difference between the right and the left value. Examples:

0 .. 15    ; Range represents sixteen numbers from 0 to 15; slope is positive.
-5 .. -4   ; Range represents values -5 and -4; slope is positive.
3 .. 4 - 1 ; Range represents one value 3; slope is zero.
2..-2      ; Range represents five values; slope is negative.
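
The slope and the members of a range can be computed as follows. This Python sketch only restates the definition above; slope and members are hypothetical helper names.

```python
def slope(left: int, right: int) -> int:
    """Sign of the difference between the right and the left value: -1, 0 or +1."""
    return (right > left) - (right < left)


def members(left: int, right: int) -> list:
    """All integers of the range left..right, both ends included."""
    step = -1 if right < left else 1
    return list(range(left, right + step, step))
```

For the range 2..-2 the slope is -1 and members(2, -2) lists the five values 2, 1, 0, -1, -2.
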
↑ Substring

Substring is an operation which returns only part of the input text. Substring operator is a range enclosed in a pair of square brackets []. The text is treated as a sequence of 8-bit characters (bytes) and the range specifies which of them are used.

%Sample1 %SET ABCDEFGH ; Preprocessing variable %Sample1 now contains 8 characters.
 DB "%Sample1[3..5]"   ; This actually assembles as  DB "CDE"
↑ Sublist

Sublist operation is similar to Substring with the difference that curly brackets {} are used instead of square brackets and that it treats the input text as an array of comma-separated items (in case of %variable expansion), or as a sequence of physical lines (in case of file inclusion).

 INCLUDE "MySource.asm"{1..10} ; Include the first ten lines of file "MySource.asm"

  Common properties of suboperations Substring and Sublist:

Suboperator is appended to the suboperated resource (text) without spaces.
Suboperations can be applied on four kinds of elements:

When applied to files, the file name must always be specified in double quotes.

Characters and items are 1-based: the first suboperable member (character/item/line) has number 1.

The ordinal number of the last character|item|line of the input text is assigned by €ASM to an automatic preprocessing variable named %&. This %variable is valid only inside the suboperation; it cannot be used outside the brackets.

You can use pseudoinstruction %SETS to get the number of characters assigned to a %variable, or pseudoinstruction %SETL to get the number of items in it (array length).
You can use attribute operator FILESIZE# to get the number of bytes in a file at assembly-time.

In Substring operation the value of automatic %variable %& specifies the number of characters assigned in the %variable or it specifies the size of the included file or the object file in bytes.
In Sublist operation it represents the ordinal number of the last non-empty item in the %variable, or the number of physical lines in the included file.

|4142432C4445462C2C4748492C4A4B4C |%Sample %SET ABC,DEF,,GHI,JKL
|0000: | ; %& is now 16 in %Sample[%&] and 5 in %Sample{%&}.
|0000:4B4C | DB "%Sample[15..%&]" ; DB "KL"
|0002:4445462C2C4748492C4A4B4C | DB "%Sample{2..%&}" ; DB "DEF,,GHI,JKL"

A suboperated included file must be enclosed in double quotes even when its name doesn't contain spaces. The opening square bracket must immediately follow the input value (%variable name or the quote which terminates the filename). No white spaces are allowed between the %variable and the suboperation left bracket.

Suboperations are very tolerant about the range values. No warning is reported when they refer to a nonexisting character or item, for instance when the range member is zero or negative. Ranges with negative slope simply return nothing. Ranges with zero slope return one character|item|line when the index is between 1 and %&, otherwise they return nothing.

|4142434445464748 |%Sample %SET ABCDEFGH ; Variable %Sample now contains 8 characters.
|0000:4142434445 | DB "%Sample[-3..5]" ; DB "ABCDE"
|0005:434445464748 | DB "%Sample[ 3..99]" ; DB "CDEFGH"
|000B:43 | DB "%Sample[ 3..3]" ; DB "C"
|000C: | DB "%Sample[5..3]" ; DB ""
|000C:4142434445464748205B352E2E335D | DB "%Sample [5..3]" ; DB "ABCDEFGH [5..3]" ; Not a suboperation.

Suboperation range consists of three components:

  1. minimum range index
  2. range operator ..
  3. maximum range index

Some of those components may be omitted; they will be given the default value. The default minimum index is 1. The default maximum index is %&.

|4142434445464748 |%Sample %SET ABCDEFGH ; Preprocessing variable %Sample now contains 8 characters.
|0000:4142434445 | DB "%Sample[..5]" ; -> DB "%Sample[1..5]" -> DB "ABCDE"
|0005:434445464748 | DB "%Sample[3..]" ; -> DB "%Sample[3..8]" -> DB "CDEFGH"
|000B:4142434445464748 | DB "%Sample[..]" ; -> DB "%Sample[1..8]" -> DB "ABCDEFGH"
|0013:4142434445464748 | DB "%Sample[]" ; -> DB "%Sample[1..8]" -> DB "ABCDEFGH"

All the following notations are identical in %variable expansion:

%variable
%variable[1..%&]
%variable[..%&]
%variable[1..]
%variable[..]
%variable[]
%variable{1..%&}
%variable{..%&}
%variable{1..}
%variable{..}
%variable{}

The last notation in the previous example is useful when concatenating a %variable with literal text, for instance when we need to append 123 to the %variable contents. We cannot write %variable123 because the appended digits change the name of the original %variable. The solution is to use an empty suboperation, which doesn't change the %variable contents but separates its name from the successive text: %variable[]123 or %variable{}123.

When the range inside brackets contains only one index without a range operator, it is treated as both the minimum and maximum value and only one character|item|line is expanded: %Sample[3] -> %Sample[3..3] -> C.

Suboperations may be chained. The chain is processed from left to right. Example:

|4142432C4445462C2C4748492C4A4B4C |%Sample %SET ABC,DEF,,GHI,JKL ; %& is now 16 in %Sample[%&] and 5 in %Sample{%&}.
|0000:4A4B | DB "%Sample{4..5}[2..6]{2}" ; DB "JK"

The first sublist in the previous example takes items 4 and 5, giving the list of two items GHI,JKL. The next substring extracts the second to sixth characters from that sublist, giving HI,JK. The last sublist operation expands the second item, which is JK.

Suboperations may be nested. Inner ranges are calculated before the outer ones:

|31323334353637383930 |%Sample %SET 1234567890
|0000:3233343536 | DB "%Sample[2..%Sample[6]]" ; -> DB "%Sample[2..6]" -> DB "23456"
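
The tolerant range handling, the defaults, chaining and nesting of suboperations can be modeled in Python. This is a sketch of the described behaviour for %variable contents only, not the €ASM parser: the function names substring and sublist are hypothetical, and counting every comma-separated item (including trailing empty ones) as the Sublist maximum is a simplifying assumption.

```python
def substring(text: str, first: int = 1, last=None) -> str:
    """Model of text[first..last]: 1-based, inclusive, tolerant of bad indices."""
    end = len(text) if last is None else last       # default maximum is %&
    first, end = max(first, 1), min(end, len(text))
    return text[first - 1:end] if first <= end else ""


def sublist(text: str, first: int = 1, last=None) -> str:
    """Model of text{first..last} applied to comma-separated items."""
    items = text.split(",")
    end = len(items) if last is None else last      # default maximum is %&
    first, end = max(first, 1), min(end, len(items))
    return ",".join(items[first - 1:end]) if first <= end else ""
```

Chaining then becomes function composition from the inside out: with sample = "ABC,DEF,,GHI,JKL", the expression sublist(substring(sublist(sample, 4, 5), 2, 6), 2, 2) follows the same three steps as %Sample{4..5}[2..6]{2}. Nesting corresponds to evaluating the inner call first and passing its numeric value as an index.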

↑ Sections

For each emitting statement the assembler generates some data or machine code which will be dumped to the output file in the end. Fortunately we don't have to write the whole program in the exact sequence which is required by the output file format. Assembled data and code is tossed on demand to one of several output sections. The statement, which will switch assembly to a different section, is quite simple: just the name of the section in square brackets [ ] in the label field of the statement.

Imagine that you (the programmer) act like a manager dictating some code and data to your secretary (EuroAssembler). You have dictated a few instructions, which were written in shorthand by your secretary on a sheet of paper labeled [TEXT]. Then you decided to dictate other kind of data. The secretary will grab another sheet, label it [DATA] and start to write there. Later, when you want to dictate some other instructions, your secretary takes the sheet labeled [TEXT] again, and continues from the point (origin) where it was interrupted.
You are free to open new sheets and to switch between them ad libitum. When the dictation ends, all used sheets will be stapled together (linked).

In EuroAssembler the term section is used for a named division of a segment. Each segment has one or more sections. By default any segment has just one section with an identical name (base section), which was created at segment definition.

↑ Segments

Intel Architecture divides memory into segments controlled by segment registers. A segment is defined in €ASM by the pseudoinstruction SEGMENT.

At the dawn of the computer age, programmers demanded more memory than the mere 256 bytes or 64 kilobytes which were addressable by 8-bit and 16-bit registers. Designers at Intel in pre-32-bit times might have chosen to use a pair of 16-bit general registers, such as DX:AX or SI:BX, and to address an inconceivable 4 GB of memory with them, but they didn't. Instead, they invented new 16-bit segment registers specialized by the purpose of addressed memory: register CS for machine code, DS for data, SS for machine stack, ES for extra temporary usage.
Segment registers are used for addressing 16-byte chunks of memory called paragraphs (alias octonary word, OWORD). Linear address in real CPU mode is calculated as the sum of the 16-bit offset and the contents of the segment register multiplied by 16. Using segment registers for addressing of 16-byte paragraphs yields 1 MB of addressable memory, which seemed enough for everybody in those times.
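The real-mode address arithmetic can be modeled in a few lines. This is only an illustrative Python sketch, not part of €ASM; the function name linear_address is an assumption made for this example.

```python
def linear_address(segment: int, offset: int) -> int:
    """Model of real-mode addressing: paragraph address * 16 + offset.

    Both inputs are 16-bit values. The result is not wrapped to 20 bits,
    so the highest value slightly exceeds 1 MB.
    """
    if not 0 <= segment <= 0xFFFF or not 0 <= offset <= 0xFFFF:
        raise ValueError("segment and offset must be 16-bit values")
    return (segment << 4) + offset


if __name__ == "__main__":
    # Segment 1234h, offset 0010h addresses linear 12350h.
    print(hex(linear_address(0x1234, 0x0010)))
```

Different segment:offset pairs may address the same byte, for instance 1234h:0010h and 1235h:0000h.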

Contents of the segment register in real processor mode represents paragraph address of the segment.
Contents of the segment register in protected processor mode represents index to a descriptor table, which holds some auxiliary information about the addressed segment (beside its address and size limit): access privileges and width.

Those auxiliary properties are fixed in real mode:

Segment at run-time is a continuous range of operational memory addressable with the contents of one segment register.

Segment at link-time is a named part of object file, which can be concatenated with segments of the same name from other linkable files.

In [MS_PECOFF] terminology the linkable segment is called a section. I think the term segment would be more appropriate here, because COFF "sections" are differentiated by access privileges as they are addressed by different segment registers, ergo by different segment descriptors.
In our segment-highway parable, segments in flat protected mode are highway lanes running in parallel, so they share common milestones (offsets), but each lane is dedicated to a different kind of vehicles.

Segment at write-time is a part of assembler source which begins with section switching statement, and which ends with another switching statement or with the end of program.

There is no ENDS (end-of-segment) directive in €ASM. It is not possible to say that some part of source code doesn't belong to any segment. When you write the very first statement of your source text, it already belongs to the default (envelope) program, and every program implicitly defines its default segments. Nevertheless, when a structure or numeric constant is being defined, it is irrelevant which segment is currently in charge, because structures and scalar symbols do not belong to any segment, no matter where the structure or symbol was defined in the source.

Segments and section divisions of assembler source do not have to be continuous. In fact, discontinuity is their main raison d'être. It allows data to be kept in the source text near the code which manipulates it, and this is good for readability and understanding of program function.

↑ Groups

When segments of an assembler program are not too large, they may be coalesced into a segment group. The whole group of segments is addressable with one segment register. A group can be defined with the pseudoinstruction GROUP.

When a group is defined, e. g. [DGRP] GROUP [DATA],[STRINGS], then beside the group [DGRP] it automatically creates a segment with the same name [DGRP] (and consequently a section with the same name [DGRP]). It also declares that segments [DATA] and [STRINGS] belong to group [DGRP] together with its base segment [DGRP]. Nevertheless, when nothing is emitted to the implicitly defined segment [DGRP], it will be discarded in the end.

↑ Segmentation (more about sections, segments, groups)

Base segment and section ↓

Segmentation lifetime ↓

Implicit segments ↓

Segment naming conventions ↓

Loading segment registers ↓

Ordering of sections and segments ↓

Displaying the segment map ↓

The relation between segment and its sections in EuroAssembler is similar to the relation between group and its segments.

↑ Base section and segment

Whenever a segment is defined (with the pseudoinstruction SEGMENT), a section with the same name is automatically created in it (it is called base section). Other sections of the same segment may be created on demand later. This is done by the statement which has only the section name in its label field (there is no explicit SECTION directive in €ASM).

Section properties (class=, purpose=, combine=, align=) are inherited from the segment which they belong to. The alignment is not inherited when special literal sections [@LT64] .. [@LT1], [@RT0], [@RT1].. are created; literal sections are aligned according to the type of data which they keep.

Whenever a group is defined (with the pseudoinstruction GROUP), a segment with the same name is created in it (it is called base segment), together with other segments which we want to incorporate to the group.

↑ Segmentation lifetime

Each segment has one or more sections. Each section belongs to exactly one segment. During assembly time all segments are assumed to be loaded at virtual address 0. At the end of each assembly pass sections are virtually linked to their segment, so they begin at a higher VA, where the preceding section ended. However, in pass 1 it is not known yet what size those sections will have, so all sections are assumed to start at VA=0 in pass 1. When the last assembly pass ends, all sections are linked physically (their emitted contents and relocations are concatenated to the segment=base section) and sections are then discarded. The linker is not aware of €ASM sections at all.

Why should we actually split a segment into sections? Well, it is not necessary, mostly we can get by with just one default section per segment. In big programs, on the other hand, it may be useful to group similar kinds of data together; we may want to create a separate section for double word sized variables, for floating-point numbers, for text strings. This may save a few bytes of alignment stuff, which would be necessary when variables of different sizes are mixed together. Literal sections are organized in that way, too.
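How many bytes of alignment stuff are saved can be estimated with a small model. The Python sketch below is an illustration only; it assumes that every variable is aligned to its own size, and the helper name emitted_size and the sample list are hypothetical.

```python
# Sizes of a few fundamental data types in bytes.
SIZES = {"B": 1, "W": 2, "D": 4, "Q": 8}


def emitted_size(types):
    """Total bytes emitted, padding included, when each item is aligned to its size."""
    offset = 0
    for short_name in types:
        size = SIZES[short_name]
        offset = (offset + size - 1) // size * size  # round up to the alignment
        offset += size
    return offset


mixed = ["B", "D", "B", "Q", "W"]                      # variables in source order
grouped = sorted(mixed, key=lambda t: -SIZES[t])       # same variables, grouped by size

if __name__ == "__main__":
    print(emitted_size(mixed), emitted_size(grouped))
```

Under these assumptions the mixed order needs padding before the D and Q items, while the grouped order needs none.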

Another occasion where sections are handy is fast retrieving from read-only "databases" defined statically somewhere in data segment.
Database can be mentally visualized as a table with many rows and with columns containing data items of constant size. For fast selection of a particular row by an item of an "indexed" key value it is profitable to emit all items from one column sequentially to a section, one after another. The data from every column will have their own section. The width of the "indexed" column should be padded to 1, 2, 4 or 8 bytes, so its items can be scanned with a single machine instruction REPNE SCAS. When an item is found, the difference between register rDI and the start of section identifies the selected row index. Remaining items of this row then can be addressed with the knowledge of row index.
This access method was used in a sample project EuroConvertor and in EuroAssembler itself, where it assigns the address of an instruction handler to each of two thousand mnemonics, see DistLookupIi.
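The column-per-section layout can be modeled outside assembler as well. The following Python sketch is only an analogy: the array KEYS stands for the section with the "indexed" 2-byte column, the list NAMES for another column, and both names as well as the function lookup are assumptions made for this example.

```python
from array import array

# "Indexed" column: fixed-width 16-bit keys stored one after another.
KEYS = array("H", [0x0100, 0x0101, 0x0201, 0x0204])
# Another column of the same table; row N here belongs to row N of KEYS.
NAMES = ["WM_KEYDOWN", "WM_KEYUP", "WM_LBUTTONDOWN", "WM_RBUTTONDOWN"]


def lookup(key):
    """Scan the key column sequentially and return the item from the same row."""
    try:
        row = KEYS.index(key)  # plays the role of REPNE SCASW over the section
    except ValueError:
        return None            # key not present in the column
    return NAMES[row]


if __name__ == "__main__":
    print(lookup(0x0201))
```

In real machine code SCAS leaves rDI pointing behind the matching item, so the row index has to be computed from the byte difference with the item width taken into account.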

Each group has one or more segments. Each segment belongs to exactly one group (even when it wasn't explicitly grouped, a group with the segment's name will be implicitly created at link time for the addressing purposes). When a program with executable format is linked, all groups are physically concatenated into image. Loader of realmode executable image is not aware of groups and segments.

↑ Implicit segments and groups

€ASM creates implicit segments when it starts to assemble a program. Implicit segment names depend on the chosen program format:

Implicit segments
FORMAT=                                 Implicit segment names
BIN                                     [BIN]
BOOT                                    [BOOT]
COM                                     [COM]
OMF | MZ                                [CODE],[RODATA],[DATA],[BSS],[STACK]
COFF | PE | DLL | ELF | ELFX | ELFSO    [.text],[.rodata],[.data],[.bss]

If you are not satisfied with the implicit segments created by €ASM, you may redefine them at the start of program or create a new set of segments with different names. Segments and sections which were not used (nothing was emitted to them) will not be linked to output file and they can be ignored.

When the assembly ends and all segments from linked modules have been incorporated (combined) to the base program, €ASM looks at segments which are not part of any group, and creates implicit group for them (name of the group is the same as the segment). Here the memory model is taken into account:

Models with single code segment (TINY, SMALL, COMPACT) link all code into a single group, no matter how many code segments are actually defined in the program.

Multicode models (MEDIUM, LARGE) keep each code segment in its own implicit group (if they weren't grouped explicitly), hence intersegment jumps, calls and returns should have DIST=FAR.

Similarly, single data models (TINY, SMALL, MEDIUM) assume that all initialized and uninitialized data fits into one group not exceeding 64 KB, so the €ASM linker will assign all data segments into the implicit group and register DS does not have to be changed when accessing data from various segments, which may have been defined in the base program or in the linked modules.

↑ Segment naming conventions

Name of the group, segment and section is always surrounded by square brackets in €ASM source.

Unlike with symbols, the namespace is not prepended to a segment name when it starts with . (fullstop). Group, segment and section names are always nonlocal.

Number of characters in group|segment|section name is not limited by €ASM but it may be limited by the output format. In OMF object module the name of a group or segment must not exceed 255 characters. In PE COFF executables the name in section header is truncated to 8 characters.

€ASM treats all names as case sensitive. If you want to link your segment with object module produced by an external compiler which converts segment name to uppercase or which mangles the names by prepending underscores __, you should adapt your naming convention to it.

Segment name should be unique; you cannot define two segments with an identical name in a program, except for the implicitly created segments, if they were not used yet. However, it is possible to define segments with the same names in different programs and link them together; their contents will be concatenated according to their COMBINE= property. A similar rule applies to groups.

Section names cannot be duplicated on principle. When a section name appears in the source for the second time, it will only switch to that section rather than creating a new one.

Implicit literal section name begins with @LT or @RT, you'd better avoid names which begin with this combination of letters.

Segments which have a dollar sign $ in their name are treated in a special way. If the characters on the left side of this $ match, all such segments will be linked adjacently in alphabetic order.

There are conventions how "sections" are named in COFF modules, you may need to adapt to them to successfully link an €ASM program with modules created by different compilers.

↑ Loading segment registers

When €ASM creates a protected executable ELFX or PE 32-bit or 64-bit program format, we don't have to bother with segments, groups or stack at all. All segment registers are preloaded by Linux or Windows and the stack is established automatically.

When the DOS launches a tiny COM program, it loads CS=DS=SS=ES with the paragraph address of its PSP, sets IP=100h and SP to the end of the stack segment, usually 0FFFEh. Again, we don't have to bother with segment registers at all.

When an MZ executable program is prepared to start, its segment registers have been set by the DOS loader. CS:IP is set to the program entry point, SS:SP is set to the top of machine stack, but both DS and ES point to PSP, which is not our data segment.

There is no instruction in Intel architecture to load segment register with immediate value directly, so this is usually done via register or stack:

; Loading paragraph address of [DATA] to segment register
; using a general purpose register (which is faster):
MOV AX, PARA# [DATA]
MOV DS,AX
; or using the machine stack (which is shorter):
PUSH PARA# [DATA]
POP DS

It is the responsibility of programmer to load segment register with the address of another segment, whenever it is used. €ASM makes no assumption about the contents of segment registers; there is no ASSUME, USING, WRT directive in €ASM.

↑ Ordering of sections and segments

Order is generally based on the following sorting keys:

  1. Purpose of the segment, which stipulates access privileges Read|Write|Execute.
  2. Order in which segments were declared in source text.
  3. Segments with $ in their name, which belong to the same group, and the left-side substrings of their names up to the $ are identical, are kept together and sorted alphabetically by name.

Order of sections

At the end of each assembly pass all sections are linked to their segments in this order:

  1. Base section, defined implicitly together with each segment.
  2. Other non-literal sections in the order as they were defined.
  3. Data-literal sections in descending order of their alignment ([@LT64], [@LT32],..[@LT1]).
  4. Code-literal sections in alphabetical order ([@RT0], [@RT1], [@RT2]..).
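
The four rules above can be expressed as a sort key. The Python sketch below is a model only, not the €ASM implementation; the function name section_order and the sample section names are hypothetical, and section names are written without square brackets.

```python
import re


def section_order(segment, sections):
    """Order sections of one segment; `sections` lists names in definition order."""
    def key(item):
        index, name = item
        if name == segment:                        # 1. base section
            return (0, 0, "")
        data_literal = re.fullmatch(r"@LT(\d+)", name)
        if data_literal:                           # 3. descending alignment
            return (2, -int(data_literal.group(1)), "")
        if name.startswith("@RT"):                 # 4. alphabetical order
            return (3, 0, name)
        return (1, index, "")                      # 2. order of definition

    return [name for _, name in sorted(enumerate(sections), key=key)]


if __name__ == "__main__":
    print(section_order("DATA", ["Strings", "@RT0", "@LT4", "DATA", "Floats", "@LT16", "@RT1"]))
```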

Order of segments

Segments are combined and linked at link time in this order:

  1. Group(s) of initialized segments in the order as they were defined.
  2. Initialized segments which are not in any group.
  3. Group(s) of uninitialized segments in the order as they were defined.
  4. Uninitialized segments which are not in any group.

Segments in each group are in the order as they were defined in the source (not as they were declared in the GROUP statement). The base segment is always the first in a group.

When an executable format is linked, every segment is assigned to some group, at least to the implicit one (with identical name).

Implicit groups of segments are used internally for relocation purposes only. Protected mode programs (MODEL=FLAT) do not care much about segment registers, so we don't have to bother with groups in programs for Windows or Linux.
Anatomy of COFF-based file:
Name             Segment purpose Access  Size (32-bit | 64-bit)  File alignment (32-bit | 64-bit)  Remark
MZ DOS header                    RW      128 | 128                                                 0) 2)
MZ stub program                  RW                              16 | 16                           0) 2)
PE signature                     RW      4 | 4                   32 | 32                           0) 2)
File header                      R       20 | 20                 16 | 16                           0)
Optional header                  R       224 | 240               16 | 16                           0) 2)
Section headers                  R       NrOfSe*40               16 | 16                           0)
.text            CODE            RX                              FiAl|SeAl
.rodata          RODATA          R                               FiAl|SeAl
.data            DATA            RW                              FiAl|SeAl
.bss             BSS             RW                              FiAl|SeAl
.idata           IMPORT+IAT      RWX                             16 | 16                           0) 2)
.edata           EXPORT          RW                              16 | 16                           0) 2) 5)
.reloc           BASERELOC       RW                              16 | 16                           0) 2)
.rsrc            RESOURCE        RW                              16 | 16                           0)
Symbol table     (SYMBOLS)       R       NrOfSym*18              16 | 16                           0) 1) 3)
String table     (STRINGS)                                       4 | 4                             0) 1) 3)
Anatomy of ELF-based file:
Name             Segment purpose Access  Size (32-bit | 64-bit)  File alignment (32-bit | 64-bit)  Remark
File header                      R       52 | 64                                                   0)
Program headers                  R       NrOfPh*(32|56)          16 | 8                            0) 2)
Section headers                  R       NrOfSe*(40|64)          8 | 16                            0)
.symtab          SYMBOLS                 NrOfSym*(16|24)         16 | 8
.hash            HASH            R                               4 | 4                             4)
.strtab          STRINGS                                         1 | 1
.shstrtab        STRINGS                                         1 | 1
.interp          RODATA          R                               1 | 1                             4)
.plt             PLT             RX      NrOfJmp*16              16                                4)
.text            CODE            RX                              FiAl|SeAl
.rodata          RODATA          R                               FiAl|SeAl
.data            DATA            RW                              FiAl|SeAl
.bss             BSS             RW                              FiAl|SeAl
.dynamic         DYNAMIC         RW      NrOfRec*(8|16)          8 | 16                            4)

Remarks:
0) Special structure without its own section header.
1) Used in relocatable module only.
2) Used in executable image only.
3) Used in executable image only when EUROASM DEBUG=ENABLED.
4) Used in executable image only when linked with shared object library.

Access rights:
R Allocate memory in process address space and allow read.
W Allow write.
X Allow execute.
FiAl|SeAl maximum of File Alignment | Segment Alignment.

↑ Displaying the segment map

Pseudoinstruction %DISPLAY Sections prints to the listing file a complete map of groups, segments and sections defined so far at assembly time, one object per line represented by a debugging message D1260 (group), D1270 (segment), D1280 (section). Segment is indented with two spaces, section is indented with four spaces.

Instead of %DISPLAY Sections we could use %DISPLAY Segment or %DISPLAY Groups, the result is identical. The entire group/segment/section map is always displayed with those statements.

At link time €ASM prints a similar map of groups and segments to the listing, with finally used virtual addresses, unless it was disabled with option PROGRAM LISTMAP=OFF.

↑ Distance

The distance is a property of the difference between two addresses. It is not just the numeric difference of two offsets; in €ASM this term represents one of three enumerated values: FAR, NEAR, SHORT.

The distance of two addresses is FAR when they belong to different groups/segments, otherwise it is NEAR or SHORT. Difference of offsets is SHORT if it fits into 8-bit signed integer, i. e. -128..+127.
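
The classification can be sketched in Python as follows. This is a simplified model, not €ASM code: an address is represented by a hypothetical pair of group (or segment) name and offset, and the function name distance is an assumption of this example.

```python
def distance(group_a, offset_a, group_b, offset_b):
    """Classify the distance between two addresses as FAR, NEAR or SHORT."""
    if group_a != group_b:
        return "FAR"                      # different groups/segments
    if -128 <= offset_b - offset_a <= 127:
        return "SHORT"                    # fits into a signed 8-bit integer
    return "NEAR"


if __name__ == "__main__":
    print(distance("CODE", 0x0100, "CODE", 0x0150))
```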

↑ Width

€ASM is a 64-bit assembler, but it can also compile programs for older CPUs which worked with 32-bit and 16-bit words only. The number of bits which the CPU works with simultaneously is called width and it is either 16, 32 or 64.

Width is always measured in bits.

The width is a property of a segment. Some 32-bit object file formats allow mixing segments of different widths in one file. Width of addressing and operating mode can be changed ad hoc with the instruction prefixes ATOGGLE, OTOGGLE.

Pseudoinstruction PROGRAM has the WIDTH= property, too. It will establish the default for all segments declared in the program. Program width is also used to select the format of the output file, for instance whether the Portable Executable should be created as 32-bit or 64-bit.

↑ Size

Size is a plain non-negative number which specifies the number of bytes in object (register, memory variable, structure, segment, file etc). Size of a string is specified in bytes, no matter if the string is composed of ANSI or WIDE characters.

Size of an object can be computed at assembly time, using the attribute operator SIZE# or FILESIZE#.

Size of a preprocessing %variable contents can be retrieved with pseudoinstruction %SETS.

Size is always measured in bytes.

Size and length of €ASM elements (identifiers, numbers, structures, expressions, file contents, nesting depth, number of operands, etc.) are not limited by design, but such sizes are internally stored as signed 32-bit integers, so the actual limitation is 2_147_483_647 characters. In practice we will be restricted by the amount of available memory, of course.

↑ Length

This term is used to count the number of comma-separated items in an array, for instance the length of operand list in the statement VPERMI2B XMM1,XMM2,XMM3,MASK=K4,ZEROING=ON is 5.

Length of a preprocessing %variable contents can be retrieved with pseudoinstruction %SETL.

↑ Namespace

The names of symbols and structures created in a program must be unique. In large projects it might be difficult to maintain unique names, especially when more people work on separate parts of the program. That is why the programmer can use local identifiers, which must be unique only within a division of the source file called namespace. The namespace is a range of the source specified by a namespace block. There are four block pseudoinstructions in €ASM which create a namespace: PROGRAM, PROC, PROC1, STRUC. The block name is also the name of the namespace. An identifier is local when its name begins with a fullstop (.). Unlike standard symbols, the characters following the leading fullstop may start with a decimal digit, and it is not an error when they form a reserved name. Examples of valid local identifiers: .L1, .20, .AX.

Names of local identifiers are kept in €ASM internally concatenated with namespace name, so they form fully qualified name (FQN). Local symbols may be referred with .local name only within their native namespace block; they may also be referred with fully qualified name anywhere in the program.

The namespace actually starts at the operation field of the block statement and it ends at the operation field of the corresponding endblock statement. Thanks to this, the namespace itself (label of the block) may be local, too, and the namespaces may be nested.

MyProg PROGRAM      ; PROGRAM starts the namespace MyProg.         ;
                                                                    ;
Main    PROC        ; PROC starts inner namespace Main.             ;
  .10:   RET        ; Local label; its FQN is Main.10.             ;
        ENDP Main   ; After ENDP we are in MyProg namespace again. ;
                                                                    ;
.Local  PROC        ; Its FQN is MyProg.Local.                     ;
  .10:   RET        ; FQN of this label is MyProg.Local.10.        ;
        ENDP .Local ; MyProg.Local namespace ends right after ENDP.;
                                                                    ;
       ENDPROGRAM MyProg
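
How the fully qualified names in the listing above are formed can be traced with a short model. The Python sketch below only illustrates the concatenation rule; the function name fully_qualified is hypothetical and the model ignores everything else that €ASM does with namespaces.

```python
def fully_qualified(name: str, namespace: str) -> str:
    """Local names (leading fullstop) get the current namespace prepended."""
    return namespace + name if name.startswith(".") else name


# Walk through the blocks of the example program.
program = fully_qualified("MyProg", "")          # PROGRAM opens namespace MyProg
main = fully_qualified("Main", program)          # nonlocal name, stays Main
main_label = fully_qualified(".10", main)        # local label inside Main
local = fully_qualified(".Local", program)       # local procedure name
local_label = fully_qualified(".10", local)      # local label inside .Local

if __name__ == "__main__":
    print(main_label, local, local_label)
```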

Beside the namespace blocks there is one more occasion where a namespace is unfolded: operand fields of the structured data definition statement, which temporarily take over the namespace of the structure which is being instantiated.

DateProg PROGRAM      ; PROGRAM starts the namespace DateProg.           ;
                                                                         ;
Datum STRUC  ; Declaration of structure Datum creates namespace Datum.   ;
.day   DB 0                                                              ;
.month DB 0                                                              ;
.year  DW 0                                                              ;
      ENDSTRUC Datum ; Namespace Datum ends right behind ENDSTRUC field. ;
                                                                         ;
[.data] ; Segment name is not local label, namespace is ignored here.    ;
Birthday DS Datum, .day=1, .month=1, .year=1970                          ;
                                                                         ;
; The previous statement defines 4 bytes long structured memory variable ;
; called Birthday in section [.data] and statically sets its members.    ;
; On creating the variable "Birthday" €ASM uses properties               ;
; declared as Datum.day, Datum.month, Datum.year (B,B,W).                ;
; Members can be referred as Birthday.day, Birthday.month, Birthday.year.;

↑ Scope

Scope is the property of a symbol which specifies symbol visibility.

A symbol defined in the assembler program, such as label or memory variable, may be referred anywhere within the program at assembly time. Our program may be linked with other programs, object modules or libraries, which might have misused the same name for their own symbols, but it's OK and no conflict occurs because programs are compiled separately. This is the standard behaviour, such symbols have standard private scope and their visibility is limited to the inside of PROGRAM..ENDPROGRAM block.

When a symbol name begins with fullstop ., visibility of such private local name is even narrower, it is limited to the smallest namespace block in which the symbol was defined (PROC..ENDPROC, STRUC..ENDSTRUC).

On the other hand, executables which are linked from several programs (modules, libraries) need to access symbols outside their standard private scope, for instance to call an entry point of a library function. Names of such global symbols should be unique among all linked programs.

Scope recognized in €ASM
private               Global
Standard   local      static link         dynamic link
                      Public   Extern     eXport   Import

Scope of a symbol can be examined at assembly time with attribute operator SCOPE#, which returns ASCII value of uppercase scope shortcut, for instance

MySymbol EXTERN
MOV AL,SCOPE# MySymbol ; This is equivalent to MOV AL,'E'

Available shortcuts are underlined in the table above. The same shortcuts are also used when symbol properties are listed by %DISPLAY Symbols and after the link phase if LISTGLOBALS=ENABLED.

GLOBAL, PUBLIC, EXTERN, EXPORT and IMPORT scope of a symbol can be explicitly declared by pseudoinstruction with the corresponding name. GLOBAL scope can be also declared implicitly, using two (or more) terminating colons :: after the symbol name. A symbol declared as GLOBAL is either available as PUBLIC (if it is defined in the same program), or it is marked as EXTERN (if it is not defined in the program).

Only the scopes for static linking (PUBLIC, EXTERN) can be declared by simplified global scope declaration (using two colons). When the symbol will be exported (if a DLL file is created), or when it should be dynamically imported from other DLL, using two colons is not enough and either explicit declaration EXPORT/IMPORT symbol or LINK import_library is required.

Word1:  DW 1   ; Standard private scope.
Word2:: DW 2   ; Public scope declared implicitly (with double colon).
Word3   PUBLIC ; Public scope declared explicitly.
Word4   GLOBAL ; Public or extern scope (which depends on Word4 definition in this program).
Word5   GLOBAL ; Public or extern scope (which depends on Word5 definition in this program).
Word6   EXTERN ; Extern scope. Symbol Word6 must not be defined anywhere else in this program.
Word4:         ; Definition of symbol Word4.
        MOV EAX,Word5 ; Reference of external symbol Word5.
; Scope of Word1 is PRIVATE.
; Scope of Word2, Word3, Word4 is PUBLIC.
; Scope of Word5, Word6 is EXTERN.
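
The rule that a GLOBAL symbol becomes PUBLIC or EXTERN depending on whether it is defined in the program can be modeled like this. The Python sketch is an illustration only; the function name resolve_global and the sets used below are assumptions derived from the listing above.

```python
def resolve_global(declared_global, defined):
    """Map each GLOBAL-declared name to PUBLIC (defined here) or EXTERN (not defined)."""
    return {
        name: "PUBLIC" if name in defined else "EXTERN"
        for name in declared_global
    }


# Names which appear in a label field of the example program.
defined_in_program = {"Word1", "Word2", "Word4"}

if __name__ == "__main__":
    print(resolve_global(["Word4", "Word5"], defined_in_program))
```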

↑ Data types

Information in computer memory or in a register represents code or data. An important property of stored text and numbers is the data type, which is a rule specifying how to interpret the information. €ASM recognizes the following types of data:

Fundamental data types
Typename Short Size Autoalign Width Typical storage Character string Integer number Floating-point number Packed vector
BYTE     B     1    1         8     R8              ANSI             8-bit
UNICHAR  U     2    2         16    R16             WIDE
WORD     W     2    2         16    R16                              16-bit
DWORD    D     4    4         32    R32,ST                           32-bit         Single precision
QWORD    Q     8    8         64    R64,ST                           64-bit         Double precision
TBYTE    T     10   8         80    ST                                              Extended precision
OWORD    O     16   16        128   XMM                                                                   4×D | 2×Q
YWORD    Y     32   32        256   YMM                                                                   8×D | 4×Q
ZWORD    Z     64   64        512   ZMM                                                                   16×D | 8×Q

Other data types
Typename        Short Size      Autoalign                                          Usage
Structure name  S     variable  STRUC explicit alignment, otherwise program width  structured variables
INSTR           I     variable  1                                                  machine instructions

Fundamental typenames are often abbreviated to their first letter. Data types in short or long notation are used for explicit static data definition with pseudoinstruction D, for implicit data definition in literals, as an alignment specification, or in instruction modifiers.

€ASM has some type awareness, though not as strong as in higher-level programming languages. For instance, when processing the instruction INC [MemoryVariable], it looks at how MemoryVariable was defined and then it selects the appropriate encoding version (byte|word|dword).


↑ Symbols

Name of symbols ↓

Numeric symbols ↓

Address symbols ↓

$ - current origin address ↓

Attributes of symbol ↓

Literal symbols ↓

Symbol in assembly language is an alias to a number or address.

There are two kinds of symbols in assembler: numeric and address.

Numeric symbol answers the question how many and address symbol answers the question where (at which position in the program).

Numeric symbol is defined with pseudoinstruction EQU or with its alias =, for instance Dozen EQU 12 or Gross = 144.
Address symbol is defined when its name appears in a label field of a statement.

Value of the numeric symbol is internally kept in 8 bytes (signed QWORD) but address symbols need additional information about the section which they belong to.
It is not possible in €ASM to define numeric symbol as a label of other statement than EQU, or as a solo label without operation field. Each program statement compulsorily belongs to some section (either explicitly defined or implicitly created when assembly of program block starts).

↑ Name of symbols

Symbol name is an identifier (letter or fullstop optionally followed with other letters, fullstops and digits), which is not a reserved symbol name in either character case.
Symbol name may always be terminated with one or more colons : which helps to recognize the identifier as a symbol name. The colon itself is not a part of the symbol name. Symbols should have self-explaining mnemonic name.

Termination of each symbol name with : is a good habit both when the symbol is defined and referred, though many other assemblers do not support this. It's easier to copy&paste the symbol name without having to delete the colon at its end. The colon tells both the assembler and the human reader that the name represents a symbol, and it protects from a mistake when you choose a symbol name which accidentally happens to collide with one of the thousands of instruction mnemonics.
Structure names, register names (except for segment registers), or machine instruction mnemonics names are never colon-terminated.
Symbol name must be unique in the program.

Symbols and structures may be referred (used in statement) before they are actually defined. However, it's a good practice to define numeric symbols and structures at the beginning of the program, because forward references require additional program passes, which extends the duration of assembly.

Reserved symbol names
Category                        Reserved names
Assembly-time current pointer   $
Segment register names          CS, DS, ES, FS, GS, SS
Prefix names                    ATOGGLE, LOCK, OFTEN, OTOGGLE, REP, REPE, REPNE, REPNZ, REPZ, SEGCS, SEGDS, SEGES, SEGFS, SEGGS, SEGSS, SELDOM, XACQUIRE, XRELEASE

Name of symbol may contain fullstop ., which usually connects namespace with symbol's local name. Leading . makes the symbol local, as it is in fact connected with the current namespace internally.

Creating symbol names which collide with names of registers or instructions is discouraged. If you really want to use one of those not recommended names for a symbol, it must always be followed with a colon, e.g.

  Byte: DB 1    ; Define a symbol named "Byte".
  MOV AX,Byte:  ; Load AX with offset of the symbol.

In other cases, terminating symbol name with : is voluntary, but recommended.

Not recommended symbol names
Category                        Not recommended names
Fundamental data types          B, BYTE, D, DWORD, I, INSTR, O, OWORD, Q, QWORD, S, T, TBYTE, U, UNICHAR, W, WORD, Y, YWORD, Z, ZWORD
Register names                  AH, AL, AX, BH, BL, BND0, BND1, BND2, BND3, BP, BPB, BPL, BX, CH, CL, CR0, CR2, CR3, CR4, CR8, CX, DH, DI, DIB, DIL, DL, DR0, DR1, DR2, DR3, DR6, DR7, DX, EAX, EBP, EBX, ECX, EDI, EDX, ESI, ESP, K0, K1, K2, K3, K4, K5, K6, K7, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, R10, R10B, R10D, R10L, R10W, R11, R11B, R11D, R11L, R11W, R12, R12B, R12D, R12L, R12W, R13, R13B, R13D, R13L, R13W, R14, R14B, R14D, R14L, R14W, R15, R15B, R15D, R15L, R15W, R8, R8B, R8D, R8L, R8W, R9, R9B, R9D, R9L, R9W, RAX, RBP, RBX, RCX, RDI, RDX, RSI, RSP, SEGR6, SEGR7, SI, SIB, SIL, SP, SPB, SPL, ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7, TR3, TR4, TR5, XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM16, XMM17, XMM18, XMM19, XMM20, XMM21, XMM22, XMM23, XMM24, XMM25, XMM26, XMM27, XMM28, XMM29, XMM30, XMM31, YMM0, YMM1, YMM2, YMM3, YMM4, YMM5, YMM6, YMM7, YMM8, YMM9, YMM10, YMM11, YMM12, YMM13, YMM14, YMM15, YMM16, YMM17, YMM18, YMM19, YMM20, YMM21, YMM22, YMM23, YMM24, YMM25, YMM26, YMM27, YMM28, YMM29, YMM30, YMM31, ZMM0, ZMM1, ZMM2, ZMM3, ZMM4, ZMM5, ZMM6, ZMM7, ZMM8, ZMM9, ZMM10, ZMM11, ZMM12, ZMM13, ZMM14, ZMM15, ZMM16, ZMM17, ZMM18, ZMM19, ZMM20, ZMM21, ZMM22, ZMM23, ZMM24, ZMM25, ZMM26, ZMM27, ZMM28, ZMM29, ZMM30, ZMM31
Pseudoinstruction names         ALIGN, D, DB, DD, DI, DO, DQ, DS, DU, DW, DY, DZ, ENDHEAD, ENDP, ENDP1, ENDPROC, ENDPROC1, ENDPROGRAM, ENDSTRUC, EQU, EUROASM, EXTERN, GLOBAL, GROUP, HEAD, INCLUDE, INCLUDE1, INCLUDEBIN, INCLUDEHEAD, INCLUDEHEAD1, PROC, PROC1, PROGRAM, PUBLIC, SEGMENT, STRUC
Machine instruction mnemonics   AAA, AAD, ... XTEST, see IiHandlers in €ASM source for the complete list.

↑ Numeric symbols

Numeric symbol is defined with pseudoinstruction EQU (or with its alias =) which specifies a number, numeric expression or other numeric symbol. Examples:

BufferSize: EQU 16K
WM_KEYDOWN = 0x0100
Total      EQU 2*BufferSize
   MOV ECX,BufferSize
Using a numeric symbol instead of direct number notation has its advantages:

↑ Address symbols

An address symbol is defined when it appears as a label of machine instruction or prefix, as a label of empty instruction or as a label of pseudoinstruction D*, PROC, PROC1.

Examples:
[DATA]
SomeValue:   DD 4
[CODE]
             MOV EAX,[SomeValue:]
StartOfLoop: CALL SomeProcedure:
             DEC EAX
             JNZ  StartOfLoop:

While numeric symbol BufferSize was completely defined with its value, in case of address symbol SomeValue it is not sufficient. Instruction MOV EAX,SomeValue loads EAX with the symbol offset, i. e. with the distance between its position and the start of its segment. Address symbol is defined with two properties: its segment and offset. That is why address symbol is sometimes called vector or relative symbol and numeric symbol is called scalar or absolute symbol or constant.

There are five ways to create a symbol in EuroAssembler:
  1. Symbol is defined when its name occurs in the label field of a statement. Such symbol represents address within the section it was defined in, and the data or code emitted by the statement, too. The statement may be empty (solo label) or it may declare data, prefix or machine instruction. Pseudoinstructions PROC and PROC1 also define the symbol with their name, but pseudoinstructions PROGRAM, STRUC, SEGMENT do not.
  2. External and imported symbols are created with pseudoinstructions EXTERN, IMPORT or GLOBAL, or when they are referred with two colons appended to their name. Extern symbol is not defined in the current program, it must not appear in label field (with an exception of EXTERN pseudoinstruction itself, which declares it as external).
  3. Literal symbol is created when it is referred for the first time. It does not have an explicit name, in fact its name is represented by its value, for instance the instruction LEA ESI,[=D 123] creates literal symbol, which is stored in €ASM symbol table under the pseudo-name =D 123.
  4. €ASM maintains a special dynamic symbol $ for each section, which represents the current assembly position in the section.
  5. Symbol can be defined with pseudoinstruction EQU or with its alias =. This is the only way to define a plain numeric symbol.

↑ $ symbol

A special dynamic symbol $ represents the address of next free position in emitted code at the beginning of assembly of the statement, in which it is referred. Value of this symbol is not constant but it is changed by €ASM after an emitting statement has been assembled.

The programmer may change the offset of the current origin $ with the EQU pseudoinstruction; this is equivalent to the ORG pseudoinstruction known from other assemblers.

There is no ORG pseudoinstruction in €ASM, $ is made l-value instead.
|00000100:44444444 |DataDword  DD 0x44444444
|00000104:         |  ; Redefine DataDword as a word-accessible union:
|00000100:         |$ EQU DataDword        ; Return emitting pointer back.
|00000100:1111     |DataLoWord DW 0x1111   ; Re-emit new data which will overwrite
|00000102:2222     |DataHiWord DW 0x2222   ; data defined at DataDword.
|00000104:         |

See also the test t2551 or sample project boot16.


↑ Symbol, register and file attributes

SIZE# ↓
TYPE# ↓
REGTYPE# ↓
SCOPE# ↓
OFFSET# ↓
SECTION# ↓
SEGMENT# ↓
GROUP# ↓
PARA# ↓
FILESIZE# ↓
FILETIME# ↓

Some important symbol properties are available for further processing at assembly time; they are called attributes. When a symbol is defined, it automatically gets its attributes. They can be referred to by prefixing the symbol name with an attribute operator. An attribute operator is an identifier which defines the kind of attribute, immediately followed by #. The object which the attribute operator is applied to may be separated by zero or more white spaces and it may be enclosed in parentheses. For instance SIZE#SymbolName or SIZE# SymbolName or SIZE#(SymbolName). Remember that the symbol name is case sensitive but the attribute name is not.

Attributes GROUP#, SEGMENT# and SECTION# return an address when applied to an address symbol; they return scalar zero when applied to a numeric symbol. Other attributes always return scalar (plain number).

↑ OFFSET#

Attribute OFFSET# returns the offset of symbol in the current segment as a plain number, i. e. the number of bytes between the start of the segment and the symbol itself. If the symbol is numeric, its value is returned.

Symbol and OFFSET#Symbol are identical only when Symbol is a scalar value, otherwise the former represents its address and the latter represents a plain number.

The expression Symbol - SEGMENT#Symbol is identical with OFFSET#Symbol for both numeric and address kind of symbols.
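
A minimal sketch of the difference, reusing the address symbol SomeValue from the Address symbols example above. The comments describe the behaviour stated in this chapter; they are not taken from a verified listing:

   MOV EAX,SomeValue                      ; Address of SomeValue, relocatable.
   MOV EAX,OFFSET# SomeValue              ; Plain number: distance of SomeValue from the start of its segment.
   MOV EAX,SomeValue - SEGMENT# SomeValue ; The same plain number computed explicitly.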

↑ PARA#

Attribute PARA# represents the paragraph address of beginning of the group that the symbol belongs to. It is the value which has to be loaded to the segment register which will be used for addressing. When PARA# is applied to a numeric symbol, it returns scalar zero.
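
A sketch of the typical use in a 16-bit real-mode program, assuming a hypothetical address symbol Message defined somewhere in a data segment:

   MOV AX,PARA# Message ; Paragraph address of the group which Message belongs to.
   MOV DS,AX            ; DS may now be used for addressing Message.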

↑ GROUP#

Attribute GROUP# represents the address of beginning of the group that the symbol belongs to, i.e. address of the first byte of the first (lowest) segment of the group. When applied to a numeric symbol, it returns scalar zero.

↑ SEGMENT#

Attribute SEGMENT# represents the address of beginning of the segment that the symbol belongs to. When applied to a numeric symbol, it returns scalar zero.

↑ SECTION#

Attribute SECTION# represents the address of beginning of the section that the symbol belongs to. When applied to a numeric symbol, it returns scalar zero. If the symbol lies in default section (with the same name as its segment), both SECTION# and SEGMENT# attributes return identical address.

↑ SCOPE#

Attribute SCOPE# returns a number representing the ASCII value of capital letter corresponding with the symbol scope, which can be 'E' for external symbols, 'P' for public symbols, 'X' for exported symbols, 'I' for imported symbols, 'S' for standard (private) symbols, or '?' when the symbol is undeclared.
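
A sketch of an assembly-time test, assuming a hypothetical symbol SomeProcedure:

%IF SCOPE# SomeProcedure = 'E'
  ; SomeProcedure is external, it is not defined in this program.
%ELSE
  ; SomeProcedure has some other scope, or it is undeclared ('?').
%ENDIF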

↑ SIZE#

SIZE# represents the amount of bytes emitted by the statement which defines the symbol. Typically it is the size of data defined with D pseudoinstruction or the size of machine instruction. Symbols defined with EQU pseudoinstruction or defined in non-emitting instruction have attribute SIZE# equal to zero.

↑ TYPE#

Attribute TYPE# returns a number representing the ASCII value of a capital letter corresponding with the symbol type. It may be one of the fundamental data types 'B', 'U', 'W', 'D', 'Q', 'T', 'O', 'Y', 'Z', structured data type 'S' or machine instruction type 'I' when the symbol is defined with data definition pseudoinstruction D.
Numeric symbol returns type attribute 'N'.
Label of a machine instruction or machine prefix have type attribute 'I'.
Address symbols defined with just a label, or as a label of PROC | PROC1, and external symbols return 'A'.
Undefined symbol returns '?'.

Forward reference to a symbol will create its record in the symbol table. However, in the first pass its type attribute is '?' (undefined) until its definition is encountered. On the other hand, applying an attribute to an undefined symbol does not make it referred. That is why we may test with the pseudoinstruction %IF TYPE#Symbol = '?' whether the symbol is undefined in the program.

Besides symbols, some attribute operators may also be applied to other elements: to a register, structure name, string, or an expression in parentheses () or square brackets [].

TYPE# of a register is 'R' and its SIZE# is equal to the register width in bytes (1,2,4,8,10,16,32,64).

TYPE# of a structure or segment is 'S' and SIZE# computes its size in bytes.

|[.data]                 |[.data]
|00000000:456E642E0D0A00 |Message D 'End.',13,10,0   ; Defined as DB or DU.
|[.text]                 |[.text]
|TRUE                    | %IF TYPE# Message = 'B'   ; If UNICODE is disabled
|00000000:B907000000     |   MOV ECX,SIZE# Message   ; load ECX with its size in bytes.
|FALSE                   | %ELSE                     ; Otherwise UNICODE is enabled,
|                        |   MOV ECX,SIZE#Message/2  ; and SIZE# returns 14 bytes.
|                        | %ENDIF                    ; ECX is now 7 (message length in characters).

Why should we use SIZE# or TYPE# attributes when the queried symbol is defined by ourselves and therefore we already know its size and type? If we decide to change the text of Message later, we won't have to bother with recalculating its length.

Attribute operators are often used in macros to determine what type of operand the macro was provided with: whether it is a register, data symbol, immediate value etc. When we need to check in a macro whether the provided operand %1 is a plain number, we can test it with the query %IF TYPE# %1 = 'N'.

See tests t16* for more attribute examples.

The detailed differentiation of data symbols which attribute TYPE# yields is sometimes not necessary. For instance, we may need to distinguish whether the macro operand %1 needs relocation at link time. This happens when it is an address symbol or a memory variable which contains some address symbol. TYPE# DataSymbol or TYPE# [DataSymbol+RSI] may return 'A', 'B', 'W', 'D', 'Q', 'T' or whichever kind of data DataSymbol was defined with. Otherwise it returns 'N' when the operand is a number which doesn't use relocation, such as TYPE# MAX_PATH_SIZE or TYPE# [RBP-16]. Here we can unify all kinds of address and external symbols with the attribute operator SEGMENT#, which returns the relocatable address of the segment bottom regardless of the operand's datatype. Attribute TYPE# applied to such SEGMENT# attribute will always return 'A'. On the other hand, SEGMENT# ScalarSymbol returns scalar zero and TYPE#(SEGMENT#ScalarSymbol) returns 'N'.

%IF TYPE# (SEGMENT# %1) = 'A'
  ; %1 is address expression which requires relocation.
%ELSE
  ; %1 is nonrelocatable expression.
%ENDIF
Notice that chained attributes require parentheses. This is because all attribute operators have equal priority, so they are evaluated from left to right, and without parentheses the first operator would attempt to apply itself to another unary operator.
See also test t1695 for more examples.
↑ REGTYPE#

Attribute TYPE# applied on register returns value 'R', regardless of register family. Sometimes it is useful to know the exact kind of register. Attribute REGTYPE# returns a number representing the ASCII value of capital letter corresponding with the register family. General-purpose registers return 'B', 'W', 'D', 'Q', SIMD registers return 'X', 'Y', 'Z', segment registers return 'S' etc. See the Registers table for the complete list. When this attribute is applied to an element which is not a register, it returns '?'. See also test t1648.
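
A sketch of such a test inside a macro body, where %1 is the first macro operand. It assumes that 'Q' denotes 64-bit general-purpose registers; the Registers table remains the authoritative reference for the returned letters:

%IF REGTYPE# %1 = 'Q'
  ; %1 is a 64-bit general-purpose register, such as RAX.
%ELSE
  ; %1 is a register of another family, or it is not a register at all ('?').
%ENDIF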

↑ FILESIZE#
↑ FILETIME#

Unlike the previous attributes, FILESIZE# and FILETIME# can be applied only to files specified by their name, which must be enclosed in double quotes ". The filename may have an absolute path, a relative path, or no path; in the latter two cases it is related to the current directory at assembly time.

Both file attribute operators investigate the file properties at assembly time.

FILESIZE# "filename" returns the number of bytes in the file, or 0 if the file was not available.
FILETIME# "filename" returns the timestamp of the file, i. e. the number of seconds between midnight, January 1st 1970 UTC and the last file modification. It returns 0 when the file was not found. See also test t1690.
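
A sketch of both outcomes of FILESIZE#, assuming a hypothetical file logo.bin in the current directory:

%IF FILESIZE# "logo.bin" = 0
  ; The file is empty or it was not available at assembly time.
%ELSE
   MOV ECX,FILESIZE# "logo.bin" ; Load ECX with the file size in bytes.
%ENDIF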

↑ Literals

Literal symbols alias literals are similar to the standard assembler symbols. The main difference is that they don't have explicit definition and name. A literal is defined whenever it is referred and its name is represented with equal sign = followed with data expression, for instance =D(5) or =B"Some text.". They may be duplicated, but unlike in D pseudoinstruction (which may have many operands), only one data expression can be specified. Examples of instructions with literals:

DIV [=W(10)]   ; Divide DX:AX by an anonymous word memory variable with value 10.
MOV DX,=B"This is a literal message.$" ; Load DX with offset of a string defined ad hoc somewhere in data segment.
LEA ESI,[=D 0] ; Load ESI with address of a DWORD memory variable which contains the value 0.
CALL =I"RET"   ; Push EIP and then load EIP with offset of machine instruction RET defined somewhere in code segment.
LEA EBX,[=D 0,1,2,3] ; Error: multiple data expressions.
MOV DX,=B"This is a literal message.",13,10 ; Error: multiple data expressions.
The first example declares a word variable =W(10). Without literals we would have to explicitly define a data variable Ten DW 10 somewhere in data section and give it an explicit unique name.

Advantage of literal is that we don't need to invent unique symbol name and explicitly declare the symbol in data section with D pseudoinstruction. The data contents is visible directly in the instruction which uses the literal.

Literals are automatically aligned.

All literals are autoaligned according to their type, for instance =D 5 is DWORD aligned regardless of current EUROASM AUTOALIGN= option.

String literals are automatically zero-terminated.

String literals, such as =B"Some text" or =U"Some text" are always implicitly terminated with byte or unichar zero when they are declared as literals.
€ASM allows simplified declaration of nonduplicated literal strings, where the type identifier (B or U) is omitted, e.g. ="Some text". The actual type of string (B or U) is then determined by system preprocessing variable %^UNICODE.

Implicit data definition with literals does not allow control over the exact location where the literals will be emitted. €ASM creates a subservient section for each type of data, depending on their natural alignment. The literal section is created in the first of the following segments which exists:

  1. the last segment with explicit purpose LITERAL combined with purpose RODATA or DATA
  2. if no LITERAL segment exists, the last segment with purpose RODATA
  3. if no RODATA segment exists, the last segment with purpose DATA
  4. if no DATA|RODATA segment exists, an implicit segment @LT, which is created with the purpose RODATA+LITERAL.

Names of literal sections are [@LT64], [@LT32], [@LT16], [@LT8], [@LT4], [@LT2], [@LT1].
Literals with INSTRUC data type, such as =8*I"MOVSD", are emitted to subservient section [@RT0] which is similarly created in the segment with PURPOSE=CODE+LITERAL, or in the last code segment, or in automatically created implicit code segment [@RT].

Repeated literals with the same declaration are reused, they represent the same memory variable. Literals with non-verbatim match, such as =W+4, =W 4 and =W(2+2) are stored separately as different symbols, nevertheless their value is reused when it's identical, so it occupies common space in literal section. Similarly =B"Some text", =B'Some text' and =B 'Some text' are different but those three symbols together will occupy only 9+1 bytes in literal section memory at run-time.
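
A sketch of the rule above; the comments restate what the paragraph says and are not taken from a verified listing:

   MOV ESI,=B"Some text" ; The first reference creates the literal.
   MOV EDI,=B"Some text" ; Verbatim repetition reuses the same literal symbol.
   MOV EBX,=B'Some text' ; Different symbol (other quotes), but its 9+1 bytes are shared with the previous one.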

Literals should always be treated as read-only memory variables.

Although the programmer cannot be stopped from overwriting the literal value at run-time, this could corrupt behaviour of other parts of the program, which might be reusing the same literal data.

Comparison of standard symbols and literals
Property | Standard symbol | Literal symbol
Declaration | It is defined explicitly, with pseudoinstruction D or its clones, e.g. Dozen: DD 12 | It is declared when it is first used in any instruction, e.g. MOV ECX,=D 12
Name | Programmer must invent unique symbol name. | Name of literal symbol is created from its value.
Position in object code | Placement of the symbol is fully in programmer's hands. | The placement is not directly controlled by a programmer.
Alignment | If required, it must be specified explicitly with pseudoinstruction ALIGN, or with modifier ALIGN= or with EUROASM option AUTOALIGN=. | Literals are always naturally aligned, as if EUROASM AUTOALIGN=ENABLED were set at their declaration.
Alignment stuff | In order to minimize necessary alignment stuff, the programmer should pay attention when mixing aligned data with different sizes. | Literal data of all sizes are packed together in descending order, which minimizes alignment stuff between them.
Multioperands | Data definition pseudoinstruction D and its clones support multiple operands, e.g. Hello DB "Hello, world",13,10,'$' | Multiple literal operands are not supported.
String NUL termination | Only when explicitly declared, for instance Hello: DU "Hello, world",0 | Automatically, e.g. MOV ESI,=U "Hello, world"
Duplication | Duplication is supported, e.g. FourDoublePrecOnes: DY 4 * Q 1.0 | Duplication is supported, e.g. VMOVUPD YMM7, [= 4 * Q 1.0]
Value overwriting | Ad libitum. | This should be avoided.

↑ Structures

The structure is declared by a piece of assembly code represented with STRUC..ENDSTRUC block. The block declares names, datatypes, sizes and offsets of structure members. In OOP terminology the structure is a class and structured memory variable is an object. Example:

DATUM STRUC           ; Declaration of the structure (class) DATUM.
 .Year  D W
 .Month D B
 .Day   D B
      ENDSTRUC DATUM

Today DS DATUM        ; Definition of memory variable (object) Today.
Members of structure should have local names (beginning with period .). Structure declaration defines namespace block.

Structure declaration creates symbols DATUM.Year, DATUM.Month, DATUM.Day with values 0, 2, 3 respectively. Those symbols are absolute (scalars) and they give names to relative offsets inside the structure.

Data definition creates structured memory variable - symbol Today. At the same time it also creates symbols Today.Year, Today.Month, Today.Day. Their addresses are defined somewhere in data or bss section, they are not scalars but have relocatable addresses.

Members of a structured variable are undefined (when the variable was defined in a BSS segment) or contain all zeroes (when it was defined in a DATA segment). Members of a structured memory variable can be initialized statically at definition time with keyword operands, for instance Today DS DATUM, .Day=31, see also pseudoinstruction DS.

Memory-variable member can be accessed directly, for instance

MOV [Today.Month],12

We could also use a register to address the whole memory-variable, and employ this register to address individual members with relative offsets specified in structure declaration:

 MOV EDI,Today
 MOV [EDI+DATUM.Month],12

More about structures see here.


↑ %Variables

User-defined %variables ↓

Formal %variables ↓

Automatic %variables ↓

System %variables ↓

€ASM program uses preprocessing variables (alias %variables) for easy manipulation of the source text at assembly time. Hand in hand with macroinstructions they make a powerful tool to save the programmer's repetitive labour. The preprocessing apparatus does not affect the object code directly, as the plain assembler does. Instead, it manipulates the source text, which can be modified with %variables and repeated with preprocessing %pseudoinstructions.

Preprocessing variables always treat their contents as a sequence of characters, without inspecting its syntactic significance, no matter if they were assigned with literal text, string, numeric or logical expression or whatever.

Once assigned, the contents of %variable will be used (expanded) whenever the %variable appears in the source text (except for comments). Expansion takes place before the physical line of source file is parsed into the statement fields. By default the whole contents of %variable is expanded, but this can be limited with Substring or Sublist operation.

See also €ASM function Preprocessing.

Preprocessing %variables families
%Variable family    Name format          Case-sensitive  (Re)assignable
User-defined        %identifier          Yes             explicitly with %SET*
Formal              %identifier          Yes             indirectly by %FOR loop or %MACRO expansion
Automatic           %spec.character(s)   Yes             indirectly by macro expansion
System (EUROASM)    %^option             No              indirectly by EUROASM option
System (PROGRAM)    %^option             No              indirectly by PROGRAM option
System (€ASM)       %^fixed              No              No

↑ User-defined %variables

Name of a user-defined %variable is represented with a percent sign % immediately followed by an identifier which is not a reserved %variable name in any letter case. The identifier must begin with a letter and may not contain fullstop or other punctuation.

User-defined %variable name is case-sensitive.
Reserved %variable names
Category | Reserved names
Pseudoinstructions | %COMMENT, %DEBUG, %DISPLAY, %DROPMACRO, %ELSE, %ENDCOMMENT, %ENDFOR, %ENDIF, %ENDMACRO, %ENDREPEAT, %ENDWHILE, %ERROR, %EXITFOR, %EXITMACRO, %EXITREPEAT, %EXITWHILE, %FOR, %IF, %MACRO, %PROFILE, %REPEAT, %SET, %SET2, %SETA, %SETB, %SETC, %SETE, %SETL, %SETS, %SETX, %SHIFT, %UNTIL, %WHILE

User %variables are assigned (created) by the programmer with one of the %SET* family of pseudoinstructions.

%Variables may be reassigned later with a different value, they don't have to be unique in the source.

Scope of user-defined %variable begins at its definition and it ends at the end of source file.

%Variables need not be assigned before the first use. Unassigned %variable expands to nothing (empty text). Once defined %variable cannot be unassigned, there is no %UNSET, UNDEFINE or UNASSIGN directive in €ASM. Nevertheless, setting a %variable to emptiness (e.g. %SomeVar %SET) is equivalent to unsetting it. €ASM reports no warning if it encounters user-defined %variable which is empty, which has not been defined earlier or which is not defined in the source file at all.
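
A sketch of assignment, reassignment and emptying, with a hypothetical %variable %Reg. Comments are kept on separate lines here so that they cannot be mistaken for a part of the assigned text:

; Assign the text EAX to the user-defined %variable %Reg.
%Reg %SET EAX
; The next statement is expected to expand to MOV EAX,1.
     MOV %Reg,1
; Reassign %Reg; the next MOV is expected to expand to MOV EBX,1.
%Reg %SET EBX
     MOV %Reg,1
; Set %Reg to emptiness, which is equivalent to unsetting it.
%Reg %SET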

See also test t7321.

Differences between symbols and %variables
Symbols | User-defined %variables
are properties of PROGRAM | are properties of EuroAssembler
their name never begins with % | their name always begins with %
may have membership fullstop in their name | never have fullstop in their name
are declared in label field of a statement | are assigned with %SET* pseudoinstruction
have assembly attributes such as TYPE# and SIZE# | are simply a piece of text without attributes
may be forward referenced | cannot be forward referenced
must be declared just once in a program | may be redeclared many times
cannot be referenced if not declared somewhere in the main or linked program | may be referenced without declaration
cannot be subject of sublist or substring operation | can be sublisted or substringed

↑ Formal %variables

Formal %variable expands to a parameter value used in a %FOR loop or in %MACRO invocation. It is represented by an identifier which stands in the label field of the %FOR statement, or as an operand in the %MACRO prototype.

The scope of formal variables is limited to the block which is being expanded.

Count %FOR 1..8
        DB %Count
      %ENDFOR Count

The previous example generates eight DB statements which define byte values from 1 to 8. Identifier Count used in %FOR and %ENDFOR statements is %FOR-control variable, which is accessible inside the %FOR block as a formal %variable %Count.

Formal variables are also used to access macro operand by name during the macro expansion. In the next example we have two %MACRO-formal variables provided in the %MACRO definition as identifiers Where and Stuff. In the macro body their values are available as formal %variables %Where and %Stuff.

Fill %MACRO Where, Stuff=0   ; Definition of macro Fill.
       MOV %Where,%Stuff
     %ENDMACRO Fill

; invocations of macro Fill:
   Fill [Counter], Stuff=255 ; Will be assembled as MOV [Counter],255
   Fill EBX                  ; Will be assembled as MOV EBX,0

Notice that formal %variables are always written without the percent sign when they are declared, but % must be prefixed to their name when they are referred in the %FOR or %MACRO body. This is important for inheriting of arguments in nested and recursively expanded macroinstructions, see t7233 as an example.

Scope of the formal %variables has higher priority than user-defined %variables with identical name, no matter if they were assigned outside or inside the scope. Reassignment of a %variable with formal name inside the macro body will assign the new value to the user-defined %variable, but inside the macro the value of formal %variable prevails, see t7347, t7362. %Variable with reassigned value will be visible outside the macro, though.


↑ Automatic %variables

Automatic preprocessing variables are created and maintained by EuroAssembler at assembly time; their names contain punctuation characters and, unlike user-defined %variables, they cannot be explicitly reassigned with %SET pseudoinstruction.

The scope of automatic %variables is limited, using them outside their scope leads to an error.

%&

Suboperation size | suboperation length (percent sign followed by an ampersand) %& represents the number of characters | list items | physical lines in the suboperated object.
Its scope is constrained to the suboperation braces [ ] or { }.

Automatic suboperation variable %& is created when the expansion of included file or of another %variable uses suboperations.

When the substring operator [ ] is appended to the %variable name or to the included file name, automatic variable %& can be used inside the brackets, e. g. [1..%&], and it represents the number of bytes in expanded %variable or in the included "file".
For instance, when the user has assigned %aVariable with five letters %aVariable %SET ABCDE, then its size is 5 and the statement DB "%aVariable[4..%&]" expands to DB "DE".

When the sublist operator { } is appended to the %variable name, the contents of this %variable is treated as an array of comma-separated items and %& represents their count (ordinal number of the last nonempty item).
E. g. when the user has assigned %aReglist %SET ax,cx,dx,bx,bp then its length is 5 operands (items) and the statement MOV %aReglist{3},%aReglist{%&} expands to MOV dx,bp.

When the same sublist operator { } is appended to the included file name, contents of the file is treated as a set of physical lines and %& represents number of lines in the file. For instance INCLUDE "file.inc"{%&-10 .. %&} will include the last ten lines from "file.inc".

Using the %& variable outside brackets will throw an error.

Indexes of a suboperation range from 1 to %&.
%.

The expansion counter %. (percent sign followed by a fullstop) maintains a decimal number which is incremented by €ASM in each expansion of a preprocessing block and which can be used to create unique labels in repeating blocks.
Its scope is limited to the body of preprocessing blocks %MACRO, %FOR, %WHILE, %REPEAT. If used outside those blocks, it will expand to the single digit 0, see t7362.

If there is some private or local label declared within a macro or repeating block, and if the macro or block is expanded more than once, the same symbols will be defined more than once, and assembler treats that as an error. The identifier used as a label within macro or other expanding pseudooperations (%FOR, %REPEAT, %WHILE) should be unique. This can be achieved with the expansion counter embedded into symbol name.

See the example of macro AbortIf below. The label Skip is postfixed with %., giving the label Skip%. which expands to Skip1 and which will expand to Skip2 on the next AbortIf invocation.

|00000008:                 |
|                          |AbortIf %MACRO Condition=, Errorlevel=1 ; Definition of macro AbortIf.
|                          |  J%!Condition Skip%.: ; Use inverted condition to bypass the abortion.
|                          |  PUSH %Errorlevel     ; Prepare operand for API invocation.
|                          |  CALL ExitProcess::   ; Windows API for program termination.
|                          |Skip%.:                ; Label where the program continues.
|                          |  %ENDMACRO AbortIf
|00000008:                 |
|00000008:                 |  ; Example of conditional abortion:
|                          |  EUROASM ListMacro=Yes, ListVar=Yes ; Display the expanded instructions.
|00000008:833D[04000000]00 |  CMP [Something],0    ; Test the condition and then invoke macro.
|0000000F:                 |  AbortIf Condition=E, Errorlevel=8 ; The program exits when Something is zero.
|                          +AbortIf %MACRO Condition=, Errorlevel=1 ; Definition of macro AbortIf.
|0000000F:7507             +  J%!Condition Skip%.: ; Use inverted condition to bypass the abortion.
|                          !JNE Skip1:
|00000011:6A08             +  PUSH %Errorlevel     ; Prepare operand for API invocation.
|                          !PUSH 8
|00000013:E8(00000000)     +  CALL ExitProcess::   ; Windows API for program termination.
|00000018:                 +Skip%.:                ; Label where the program continues.
|                          !Skip1:
|                          +  %ENDMACRO AbortIf
|00000018:                 |  ; Continue with the program if not aborted.
The automatic variable %. helps to create unique symbol names.

All the following automatic macro %variables have their scope limited to the %MACRO block body. They refer to operands used when the macro is invoked (expanded).

%:

If a label is used in a macro invocation, the label is by default placed at the first of the expanded statements. This behaviour can be overridden when the automatic macro label %variable %: (percent sign followed by a colon) is explicitly declared somewhere in the macro definition. Only one such label may be defined in the macro. Relocating the macro label may spare a few clocks when the program jumps to a macro expansion which begins with code that would otherwise have to be skipped, see the following example:

SaveCursor %MACRO Videopage=BH
   %IF TYPE#CursorSave != 'W' ; If the memory variable CursorSave was not defined yet.
     JMP %:                   ; Skip to $+4 (below the DW) when the macro is entered in normal statements flow.
     CursorSave DW 0          ; Space for storing the cursor is reserved here in the code section.
   %ENDIF
%: MOV AH,3                   ; Entry point of the macro is here when the macro invocation is jumped to.
   MOV BH,%Videopage
   INT 10h                    ; Get cursor shape via BIOS API.
   MOV [CursorSave],CX
 %ENDMACRO SaveCursor
  ...
Save: SaveCursor Videopage=0  ; Use the macro in program.
  ...
  JMP Save:                   ; Jumps to the instruction MOV AH,3.
The automatic variable %: represents the "entry" of macro body.

See also test t7215.

%1

Ordinal operands of the macro can be referred to by their ordinal number prefixed with the percent sign, for instance %1. Unlike in batch scripts for DOS and Windows, the number is not limited to 9; any positive decimal number is possible, for instance %11. Of course, when the eleventh operand is not specified in the macro invocation, %11 expands to nothing.
See also pseudoinstruction %SHIFT.

Automatic %variable %0 expands to the macro name.
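
A sketch of a hypothetical macro Swap which declares no formal operands in its prototype and refers to its operands by ordinal numbers only:

Swap %MACRO           ; Hypothetical macro, no formal operands are declared.
       XCHG %1,%2     ; The first and the second ordinal operand.
     %ENDMACRO Swap

     Swap EAX,EBX     ; Expected to be assembled as XCHG EAX,EBX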

%Formal

Another way to refer to a macro operand (both ordinal and keyword) is to prefix the formal name of the operand with the percent sign.

%!1 or %!Formal

When the ordinal number or formal operand name is prefixed with logical NOT operator (exclamation !), it expands to the inverted condition code from ordinal operand. This requires that the referred operand contains a general condition code (case insensitive) such as E, NE, C etc. Operand contents will be replaced with corresponding inverted code. €ASM reports error if the operand did not contain valid condition code.

NASM uses unary-minus operator - to achieve similar functionality. I believe that the logical-not operator ! is more appropriate for the inversion of logical values.

See the macro AbortIf above as an example.

%*

Ordinal operand list %* (percent sign followed by an asterisk) is assigned with all ordinal operands from macro invocation, comma-separated. Keyword operands are omitted from the list.

Macro operands can be referred by various methods. The following example demonstrates three possible ways how to refer the macro ordinal operands:

CopyStr %MACRO FirstOp, SecondOp, ThirdOp ; Macro prototype.
          MOV ESI,%FirstOp ; Using formal %variable name of the operand.
          MOV EDI,%2       ; Using ordinal number of the operand.
          MOV ECX,%*{3}    ; Using the third item of operand list.
          REP MOVSB
        %ENDMACRO CopyStr
          ...
        CopyStr Source, Dest, SIZE# Dest ; invocation of the macro.
%#

Length of the ordinal operand list (ordinal number of the last non-empty operand) is set to ordinals count variable %# (percent sign followed by a pound character) and it represents the number of ordinal operands used in macro invocation (not the number declared in macro prototype).

The same length could be also obtained with %NrOfOrdinals %SETL %*.
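
Because %* expands to a comma-separated list, it can presumably also feed a %FOR loop. A sketch with a hypothetical macro PushAll which accepts any number of ordinal operands:

PushAll %MACRO              ; Hypothetical macro, no formal operands are declared.
Reg       %FOR %*           ; Iterate through all ordinal operands.
            PUSH %Reg
          %ENDFOR Reg
        %ENDMACRO PushAll

        PushAll EAX,ECX,EDX ; Expected to expand to PUSH EAX, PUSH ECX, PUSH EDX. %# is 3 in this invocation.
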
%=*

List of keyword operands %=* is similar to the automatic variable %* but it contains only comma-separated keyword=value operands actually used in macro invocation.

Both %* and %=* can be used to make cloned macros with different names. For example

copystr %MACRO
          CopyStr %*, %=*
        %ENDMACRO copystr

This creates a clone of previously defined macro CopyStr but with a different name copystr. All operands used in invocation of copystr will be passed verbatim to CopyStr.

%=#

Keyword count variable %=# represents the number of keyword operands actually used in the macro invocation (not the number declared in the macro prototype).
See also t7364.


↑ System %variables

EUROASM system %variables ↓
PROGRAM system %variables ↓
€ASM system %variables ↓

EuroAssembler maintains a collection of preprocessing variables with the values specified by configuration parameters. Their current value can be tested at asm-time, so the assembly process can branch accordingly.

The name of a system variable consists of %^ followed by one of the enumerated identifiers.

System %^variable names are case insensitive.

The value of a system %^variable cannot be assigned with a %SET* pseudoinstruction; it is dynamically maintained by €ASM and it reflects the value currently in effect.

%^DumpWidth %SETA 32 ; Use EUROASM DumpWidth=32 instead.
System %^variables are read-only.

The programmer can influence the value of a system %^variable only indirectly, with options specified in the euroasm.ini configuration file or with the EUROASM and PROGRAM pseudoinstructions.

System preprocessing %variables

Category | %variable names (case insensitive)
EUROASM | %^AES, %^AMD, %^AutoAlign, %^AutoSegment, %^CET, %^CodePage, %^CPU, %^CYRIX, %^D3NOW, %^Debug, %^DisplayEnc, %^DisplayStm, %^Dump, %^DumpAll, %^DumpWidth, %^EVEX, %^FPU, %^ImportPath, %^IncludePath, %^Interpreter, %^Linkpath, %^List, %^ListFile, %^ListInclude, %^ListMacro, %^ListRepeat, %^ListVar, %^LWP, %^MaxInclusions, %^MaxLinks, %^MMX, %^MPX, %^MVEX, %^NoWarn, %^Profile, %^Prot, %^Prov, %^RunPath, %^RTF, %^RTM, %^SHA, %^SIMD, %^Spec, %^SVM, %^TBM, %^TimeStamp, %^TSX, %^Undoc, %^Unicode, %^VIA, %^VMX, %^Warn, %^XOP
PROGRAM | %^DllCharacteristics, %^Entry, %^FileAlign, %^Format, %^IconFile, %^ImageBase, %^ListGlobals, %^ListLiterals, %^ListMap, %^MajorImageVersion, %^MajorLinkerVersion, %^MajorOSVersion, %^MajorSubsystemVersion, %^MaxExpansions, %^MaxPasses, %^MinorImageVersion, %^MinorLinkerVersion, %^MinorOSVersion, %^MinorSubsystemVersion, %^Model, %^OutFile, %^SectionAlign, %^SizeOfHeapCommit, %^SizeOfHeapReserve, %^SizeOfStackCommit, %^SizeOfStackReserve, %^StubFile, %^Subsystem, %^TimeStamp, %^Width, %^Win32VersionValue
€ASM | %^Date, %^EuroasmOs, %^Pass, %^Proc, %^Program, %^Section, %^Segment, %^SourceExt, %^SourceFile, %^SourceLine, %^SourceName, %^Time, %^Version

↑ EUROASM system %^variables

are assigned with values specified in [EUROASM] division of the euroasm.ini or with the pseudoinstruction EUROASM.

For description of system %variables of this category see the corresponding keyword of pseudoinstruction EUROASM.

↑ PROGRAM system %^variables

are assigned with values specified in [PROGRAM] division of the euroasm.ini or with the PROGRAM pseudoinstruction.

For description of system %variables of this category see the corresponding keyword of the pseudoinstruction PROGRAM.

↑ €ASM system %^variables

Values of €ASM system %^variables are maintained by €ASM itself and the programmer cannot change them directly. They are described here:

%^Version
Eight decimal digits which identify the version number of EuroAssembler. The version number can be deciphered as the day of €ASM release in the format YYYYMMDD.
%^Date, %^Time
Current date and time of assembly in the formats YYYYMMDD and HHMMSS. These two %^variables are set only once when €ASM starts. All source files assembled with one command euroasm source*.asm will share the same %^Date and %^Time, which were set from the current local time at the moment when euroasm.exe was launched.
%^EuroasmOs
identifies the operating system which EuroAssembler runs on during the assembly. It contains an abbreviation of the operating system name, such as Win or Lin.
This is not necessarily the operating system which the output program is intended to run on.
%^SourceFile, %^SourceName, %^SourceExt
Those three %^variables contain full file name including path, name (without path and extension) and extension (including the leading .) of the source file which is currently assembled. €ASM updates the contents of %^Source* variables at the start of source assembly and whenever some other file is included.
When those %^variables are used in a macro, they specify the position of the macro invocation rather than the position within the macro body.
%^SourceLine
contains the physical line number of the current statement in the current source file.
In multiline statements (with line continuation \) it is the last physical line.
When %^SourceLine is used in a macro, it specifies the line number of the macro invocation rather than the position within the macro body.
%^Pass
expands to the number (1, 2, 3, ...) of the current pass through the current program.
%^Program
is the name of current PROGRAM..ENDPROGRAM block.
%^Proc
is the name of the current procedure. This %^variable is empty outside PROC..ENDPROC or PROC1..ENDPROC1 block.
%^Segment
is the name of current segment (without braces).
%^Section
is the name of current section (without braces).

Combination of €ASM system %^variables is used internally to identify position of statement in error messages: "%^SourceName%^SourceExt"{%^SourceLine}, e.g. "HelloWorld.asm"{3}
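
As an illustration of how the three %^Source* values relate to each other, the position string can be rebuilt from a full file name. This is a Python model only; the function name statement_position and the sample path are hypothetical, and POSIX path rules are assumed for the split.

```python
import posixpath


def statement_position(source_file: str, source_line: int) -> str:
    """Build "%^SourceName%^SourceExt"{%^SourceLine} from a full file name."""
    _directory, filename = posixpath.split(source_file)
    name, ext = posixpath.splitext(filename)   # ext keeps the leading "."
    return f'"{name}{ext}"{{{source_line}}}'


# Hypothetical path, used only as sample data.
print(statement_position("/home/user/HelloWorld.asm", 3))
```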

€ASM %^variable %^Section can be used to save and restore the current section or segment in macros. Together with the statement EUROASM PUSH it guarantees that the €ASM environment will not be modified by expanding a macro, even if the macro needs to change it temporarily.

aMacro %MACRO              ; Declaration of a macro which needs to emit to its own private section.
         EUROASM PUSH      ; Save all EUROASM options on their own stack.
%BackupSec %SET %^Section  ; Save the current section name to a user-defined %variable.
[.MacroPrivateSection]     ; Switch to the desired section.
               ...         ; Declare the macro body.
[%BackupSec]               ; Switch back to the original section, whatever it was.
         EUROASM POP       ; Restore EUROASM options.
        %ENDMACRO aMacro

Another example using system €ASM %^variables:

%MonthList %SET Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
%Day %SETA %^DATE[7..8] ; Using %SETA instead of %SET will assign %Day with decimal numeric value to get rid of leading zero.
InfoMsg DB "This program was assembled with €ASM %^EuroasmOs ver.%^Version",13,10
        DB "on %MonthList{%^Date[5..6]} %Day-th, %^Date[1..4] at %^Time[1..2]:%^Time[3..4].",13,10,0
; InfoMsg now contains something like
;           This program was assembled with €ASM Win ver.20081231
;           on Feb 8-th, 2009 at 22:05.
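
The substring and list-item operations used in this example count from 1 and include both bounds. A small Python model of the second line of InfoMsg may make the slicing easier to follow; the helper names sub and info_line are hypothetical, and the date and time strings are sample data chosen to match the comment above.

```python
MONTHS = "Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec".split(",")


def sub(text: str, first: int, last: int) -> str:
    """Substring with 1-based inclusive bounds, like %var[first..last]."""
    return text[first - 1:last]


def info_line(date: str, time: str) -> str:
    day = int(sub(date, 7, 8))                  # numeric value drops the leading zero
    month = MONTHS[int(sub(date, 5, 6)) - 1]    # list items are numbered from 1, too
    return (f"on {month} {day}-th, {sub(date, 1, 4)} "
            f"at {sub(time, 1, 2)}:{sub(time, 3, 4)}.")


print(info_line("20090208", "220500"))
```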

Enumerated options, such as %^CPU, %^FORMAT, %^MODEL etc., are assigned as upper case text. They can be tested at assembly time with string-compare operations.

Numeric options are always assigned as numbers in decimal notation where the positive sign + is omitted. They can be tested at assembly time with numeric-compare operations.

Boolean options, such as AutoSegment=, Priv= etc., are assigned to corresponding system %^variables %^Autosegment, %^Priv as 0 (false) or -1 (true), no matter whether they were specified using enumerated tokens ON/OFF, YES/NO, TRUE/FALSE or with a logical expression. They can be tested at assembly time with boolean expression or directly as an operand of %IF, e.g. %IF %^UNDOC.

Range EUROASM options WARN= and NOWARN= are assigned to system %variables %^Warn, %^NoWarn as series of 3999 digits 0 (false) and 1 (true). The first digit reflects the current status of message I0001, the second I0002, the last W3999.
Example: %IF %^WARN[2820] will assemble the following statements only if message W2820 is currently enabled.
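
The digit string can be pictured as a lookup table indexed by message number. The Python sketch below only models that indexing; the function name message_enabled and the sample status string are hypothetical and do not come from €ASM.

```python
def message_enabled(status_digits: str, message_number: int) -> bool:
    """status_digits models %^Warn: one digit per message, numbered from 1."""
    return status_digits[message_number - 1] == "1"


# Sample data only: every message enabled except number 2820.
digits = ["1"] * 3999
digits[2820 - 1] = "0"
warn = "".join(digits)

print(message_enabled(warn, 2820), message_enabled(warn, 2821))
```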

System %^variables can be used in macros to warn users of the macro that the €ASM environment is not set as desired. Examples:

 %IF "%^MODEL" !== "FLAT"
    %ERROR Macro "%0" is intended for flat memory model only.
 %ENDIF
 %IF %^SizeOfStackCommit < 16K
    %ERROR This recursive macro requires stack size at least 16 KB.
 %ENDIF
 %IF %^Width = 64 && ! %^AMD
    %ERROR This 64-bit program for MS-Windows should have AMD=Enabled.
 %ENDIF
 %IF %^NoWarn[2101]
    %ERROR You shouldn't suppress W2101. Move unused symbols to an included file instead.
 %ENDIF

↑ Instructions

Machine instructions ↓

Pseudoinstructions ↓

Macroinstructions ↓

An instruction is an identifier specified in the operation field of the statement.

There are three genders (types) of instructions in assembly language:
machine instructions invented by the CPU manufacturer,
pseudoinstructions invented by the assembler manufacturer,
macroinstructions invented by the programmer.


↑ Machine instructions

Instruction suffixes ↓

Instruction modifiers ↓

Instruction enhancements ↓

Undocumented instructions ↓

A machine instruction is the smallest order given to the CPU to perform some calculation or data manipulation at run-time.

EuroAssembler uses the Intel syntax where the first instruction operand specifies destination (which is often one of source operands, too), and one or more sources may follow.

This is the syntax used in CPU-vendor documentation and also in most other assemblers, with the exception of the Unix-based gas, which prefers an alternative paradigm represented by AT&T syntax with reversed operand order. For more differences between AT&T and Intel syntax see [ATTsyntax].

EuroAssembler implements machine instructions mnemonics as defined in specifications by CPU vendors. It also implements some undocumented instructions and instruction-format enhancements which are described below.

Machine instruction mnemonic names and their suffixes are case-insensitive.

Some machine instructions allow alternative encodings of the same mnemonic; €ASM prefers the shortest one, if not instructed differently.
€ASM respects the mnemonic chosen by the programmer, therefore it never encodes e. g. LEA ESI,[MemoryVariable] as MOV ESI,MemoryVariable, although the latter encoding is one byte shorter. There are only two notable exceptions when the mnemonic is not obeyed:

↑ Instruction suffixes

A machine instruction can manipulate registers and memory variables of different widths, usually byte, word or doubleword operands. However, CPU-manufacturer manuals define the same mnemonic regardless of data size. For instance, SUB [MemoryVariable],4 tells the CPU to subtract the immediate number 4 from the contents of MemoryVariable, which might have been defined as DB, DW, DD or DQ. €ASM looks at the type of MemoryVariable and selects the appropriate encoding according to its size. However, the offset might also be external or expressed as register contents or a plain number, such as in SUB [ESI],4, and the type of the memory variable is unknown in this case. One way to tell EuroAssembler which data width is desired is to use an instruction suffix, which is one of the letters B W D Q S N F appended to the mnemonic name.

€ASM allows many general-purpose instructions to be extended with the mnemonic suffix B, W, D, Q to specify the operand size.

Transfer control instructions CALL, JMP, RET may be modified with suffix N or F which tells whether the distance of the target is near or far, i. e. if the target belongs to the same segment or if segment descriptor value needs to change, too. The unconditional JMP instruction may be also completed with suffix S when the distance to the target can be encoded into 8 bits (-128..+127).

Suffix-aware instructions in €ASM | Suffix
ADC, ADD, AND, CMP, CMPS, CRC32, DEC, DIV, IDIV, IMUL, INC, LODS, MOV, MOVS, MUL, NEG, NOT, OR, RCL, RCR, ROL, ROR, SAL, SAL2, SAR, SBB, SCAS, SHL, SHR, STOS, SUB, TEST, TEST2, XOR | B, W, D, Q
BT, BTC, BTS, BTR, ENTER, HINT_NOP, IRET, LEAVE, POP, POPF, PUSH, PUSHF | W, D, Q
PUSHA, POPA | W, D
INS, MOVSX, MOVZX, OUTS | B, W, D
XLAT | B
CALL, RET | N, F
JMP | S, N, F

Using a suffix is not necessary in most cases because the width of the memory variable can be deduced from its type attribute, or the width is determined by the register used as one of the operands. An error is reported if the register is in conflict with the suffix, for instance in MOVW AL,[ESI].

Mnemonic suffix notation is sporadically used in other assemblers or in CPU documentation, see STOSB/W/D, OUTSB/W/D, RETN/F etc. €ASM just extends this enhancement.

Mnemonics of many SIMD instructions terminate with letters ~SS, ~SD, ~PS, ~PD which specify the type of operands, too (Scalar/Packed Single/Double-precision). €ASM does not treat them as mnemonic suffixes.

There are a few overloads (conflicts) of suffixed mnemonics with IA-32 instructions, they are resolvable by the type and by the number of operands:

|00000000: | ; Standard Move versus MMX Move Doubleword:
|00000000:C7450800000000 | MOVD [EBP+8],0 ; Store immediate number to DWORD memory location (suffix ~D).
|00000007:0F7E4508 | MOVD [EBP+8],MM0 ; Store DWORD from MMX register to the memory location.
|0000000B: |
|0000000B: | ; Shift versus Double Precision Shift:
|0000000B:C1650804 | SHLD [EBP+8],4 ; Shift left logical the DWORD in memory location by 4 bits (suffix ~D).
|0000000F:0FA4450804 | SHLD [EBP+8],EAX,4 ; Shift left 4 bits from register EAX to the memory location.
|00000014: |
|00000014: | ; Compare String versus Compare Scalar Double-precision FP number:
|00000014:A7 | CMPSD ; Compare DWORDs at [DS:ESI] and [ES:EDI] (suffix ~D).
|00000015:A7 | CMPSD [ESI],[EDI] ; Ditto, documented with explicit operands.
|00000016:F20FC2CA00 | CMPSD XMM1,XMM2,0 ; Compare scalar float64 numbers for EQUAL.

↑ Instruction modifiers

CODE= ↓
DATA= ↓
IMM= ↓
DISP= ↓
SCALE= ↓
DIST= ↓
ADDR= ↓
PREFIX= ↓
MASK= ↓
ZEROING= ↓
EH= ↓
SAE= ↓
ROUND= ↓
BCST= ↓
OPER= ↓
ALIGN= ↓
NESTINGCHECK= ↓

Machine instructions with the same mnemonic name and functionality may sometimes be encoded into different machine codes. For instance, an immediate value can be optionally encoded in one byte when it does not exceed the range -128..+127, or it can be encoded as a full word or doubleword. A similar rule applies to the encoding of the displacement value in address expressions. A scaled address expression such as [1*ESI+EBX] may be encoded without SIB as [ESI+EBX] or using the SIB byte with explicit scaling factor 1.

€ASM prefers the shortest variant but this may be changed with additional keyword operands called instruction modifiers.

Many other assemblers decorate operands with special directives byte, word, dword, qword, short, strict, near, far, ptr to achieve specific encoding, for instance add word ptr [StringOfBytes + 4], 0x20 or jmp short SomeLabel. Instead of those directives, €ASM uses either mnemonic suffix, or instruction modifiers.

Consequently, AVX instruction modifiers MASK=, ZEROING=, SAE=, ROUND=, BCST= are used in €ASM instead of inconsistent and poorly documented decorators, such as {k} {z} {ru-sae} {4to16} {uint16} {cdab} proposed by [IntelAVX512] and [IntelMVEX].

A typical modifier value is an enumerated token such as BYTE, WORD, DWORD etc. The majority of enumerated modifier values may be abbreviated to their first letter. Both names and values of the instruction modifiers are case insensitive.

Some modifiers are boolean type, their value may be TRUE, YES, ON, ENABLE, ENABLED if true, and FALSE, NO, OFF, DISABLE, DISABLED otherwise. Boolean modifier may also be an expression which evaluates to zero (false) or nonzero (true), see boolean extended values.

When the requested modifier cannot be satisfied, €ASM raises a warning and ignores it.

Modifiers actually used for encoding can be displayed by switching ON the EUROASM option DISPLAYENC=. In this case €ASM accompanies each machine instruction with a D1080 diagnostic message that explicitly documents which modifiers were used for encoding:

| | EUROASM DISPLAYENC=ON
|00000000:694D10C8000000 | IMUL ECX,[EBP+16],200
|# D1080 Emitted size=7,DATA=DWORD,DISP=BYTE,SCALE=SMART,IMM=DWORD.
|00000007: |
|00000007:62F1ED2CF44D02<5 | VPMULUDQ YMM1,YMM2,[EBP+40h],MASK=K4
|# D1080 Emitted size=7,PREFIX=EVEX,MASK=K4,ZEROING=OFF,DATA=YWORD,BCST=OFF,OPER=2,DISP=BYTE,SCALE=SMART.

↑ CODE=

As a heritage from the evolution of older processors, some machine instructions have more than one encoding. For instance the instruction POP rAX may be encoded either as 0x58 or as 0x8FC0, keeping the same functionality. Modifier CODE= selects which encoding €ASM should use.

Operation-code modifier may be SHORT or LONG alias S or L. Default behaviour is the one which selects shorter encoding, usually CODE=SHORT.

When an instruction has two possible encodings with the same size, CODE=SHORT selects the variant with numerically lower opcode.

|00000000:43 | INC EBX
|00000001:43 | INC EBX,CODE=SHORT ; Intel 8080 legacy encoding, not available in 64-bit mode.
|00000002:FFC3 | INC EBX,CODE=LONG
|00000004: |
|00000004:50 | PUSH EAX
|00000005:50 | PUSH EAX,CODE=SHORT ; Intel 8080 legacy encoding, not available in 64-bit mode.
|00000006:FFF0 | PUSH EAX,CODE=LONG
|00000008: |
|00000008:87CA | XCHG ECX,EDX
|0000000A:87D1 | XCHG ECX,EDX,CODE=LONG ; Modifier swaps operands in commutative operations XCHG, TEST.
|0000000C:87D1 | XCHG EDX,ECX
|0000000E:87CA | XCHG EDX,ECX,CODE=LONG
|00000010: |
|00000010:C3 | RET
|00000011:C3 | RET CODE=LONG
|00000012:C20000 | RET CODE=SHORT ; Numerically lower opcode 0xC2 requested, which requires imm16.
|00000015: |
|00000015:83C07F | ADD EAX,127
|00000018:83C07F | ADD EAX,127,CODE=LONG
|0000001B:057F000000 | ADD EAX,127,CODE=SHORT ; Shorter opcode 0x05 requested, which cannot sign-extend imm8.

In some cases explicit request for numerically lower opcode with CODE=SHORT may lead to a longer encoding, see the example ADD r32,imm8 above.
↑ DATA=

This modifier controls operation-size, i. e. the width of data that the instruction operates on. It may be one of BYTE, WORD, DWORD, QWORD, TBYTE, OWORD, YWORD, ZWORD alias B, W, D, Q, T, O, Y, Z. The default is not specified.

Modifier DATA= has the same function as an instruction suffix; there are only two differences:

There are two other ways in which the operand width is controlled. If one of the operands is a register, its width prevails and this cannot be overridden with a suffix or modifier. When the operand width is determined neither by a register, nor by a suffix or modifier, €ASM looks at the TYPE# attribute of the target operand.

Priority of operand-size specifications:

  1. Width of register operand
  2. Mnemonics suffix
  3. Modifier DATA=
  4. Memory operand type

See the following examples:

|00000000:00000000 |MemoryVariable DB 0,0,0,0
|00000004:0107 | ADD [EDI],EAX ; Operand width is set by the register (32 bits).
|00000006:830701 | ADDD [EDI],1 ; Operand width is set by the suffix (32 bits).
|00000009:66830701 | ADD [EDI],1,DATA=W ; Operand width is set by the modifier (16 bits).
|0000000D:800701 | ADDB [EDI],1,DATA=W ; Operand width is set by the suffix (8 bits). Warning:modifier ignored.
|## W2401 Modifier "DATA=WORD" could not be obeyed in this instruction.
|00000010:660107 | ADDB [EDI],AX ; Operand width is set by the register (16 bits). Error:suffix ignored.
|### E6740 Impracticable operand-size requested with mnemonic suffix.
|00000013:8387[00000000]01 | ADDD [EDI+MemoryVariable],1 ; Operand width is set by the suffix (32 bits).
|0000001A:668387[00000000]01 | ADD [EDI+MemoryVariable],1,DATA=W ; Operand width is set by the modifier (16 bits).
|00000022:8087[00000000]01 | ADD [EDI+MemoryVariable],1 ; Operand width is set by TYPE# MemoryVariable = 'B' (8 bits).
|00000029:800701 | ADD [EDI],1 ; Error:Operand width is not specified.
|### E6730 Operand size could not be determined, please use DATA= modifier.
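
The priority list can be read as a "first specified source wins" rule. The following Python sketch models only that rule; the function operand_size and its parameter names are hypothetical, widths are given as the letters B, W, D, Q, and the warnings and errors which €ASM reports on conflicts are not modelled.

```python
def operand_size(register=None, suffix=None, data=None, memory_type=None):
    """Return (source, width) of the highest-priority size specification."""
    candidates = (
        ("register", register),    # 1. width of register operand
        ("suffix", suffix),        # 2. mnemonic suffix
        ("DATA=", data),           # 3. modifier DATA=
        ("TYPE#", memory_type),    # 4. memory operand type
    )
    for source, width in candidates:
        if width is not None:
            return source, width
    raise ValueError("operand size could not be determined")


# ADDB [EDI],1,DATA=W from the listing: the suffix outranks the modifier.
print(operand_size(suffix="B", data="W"))
```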
↑ IMM=

Some instructions allow a small immediate value to be encoded as one byte, although they operate with full words. The byte value is sign-extended by the CPU at run-time.

Modifier IMM= may have the value BYTE, WORD, DWORD, QWORD alias B, W, D, Q and it specifies how the immediate operand should be encoded in the instruction.

|00000000:83D001 | ADC EAX,1
|00000003:83D001 | ADC EAX,1,IMM=BYTE
|00000006:81D001000000 | ADC EAX,1,IMM=DWORD
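
The choice between the two encodings in this listing depends on whether the value fits in a signed byte. The Python sketch below reproduces only the two forms shown (opcodes 0x83 and 0x81 with ModR/M 0xD0); the function adc_eax_imm is hypothetical, and other opcodes which €ASM might select for larger values are deliberately not modelled.

```python
import struct


def adc_eax_imm(value: int, imm: str = None) -> bytes:
    """Encode ADC EAX,value as 83 D0 ib (byte) or 81 D0 id (doubleword)."""
    fits_byte = -128 <= value <= 127
    if imm == "DWORD" or not fits_byte:
        # Full-size immediate; also the fallback of this sketch when a
        # requested IMM=BYTE cannot hold the value.
        return bytes([0x81, 0xD0]) + struct.pack("<i", value)
    # Signed byte, sign-extended by the CPU at run-time.
    return bytes([0x83, 0xD0]) + struct.pack("<b", value)


print(adc_eax_imm(1).hex().upper())
print(adc_eax_imm(1, imm="DWORD").hex().upper())
```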
↑ DISP=

The displacement portion of an address may in some instructions be encoded into one byte when its value is in the range -128..+127. The byte value is sign-extended by the CPU at run-time. Values outside this range are encoded in full size, i. e. as WORD or DWORD, according to the segment width (possibly inverted with the ATOGGLE prefix). This is the default behaviour of €ASM. Modifier DISP= can have the same enumerated values as the IMM= modifier (BYTE, WORD, DWORD, QWORD alias B, W, D, Q) and it controls whether the displacement is encoded with full size or as a byte.

|00000000:2945FC | SUB [EBP-4],EAX
|00000003:2945FC | SUB [EBP-4],EAX,DISP=BYTE
|00000006:2985FCFFFFFF | SUB [EBP-4],EAX,DISP=DWORD
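
In the listing the displacement size shows up in the two top bits of the ModR/M byte (01 for disp8, 10 for disp32). A Python sketch which rebuilds those bytes for a 32-bit segment follows; the function sub_ebp_disp_eax and its parameters are hypothetical names used for illustration only.

```python
import struct

EAX, EBP = 0, 5   # register numbers used in the ModR/M byte


def sub_ebp_disp_eax(disp: int, disp_size: str = None) -> bytes:
    """Encode SUB [EBP+disp],EAX with an 8-bit or 32-bit displacement."""
    use_byte = disp_size != "DWORD" and -128 <= disp <= 127
    mod = 0b01 if use_byte else 0b10
    modrm = (mod << 6) | (EAX << 3) | EBP
    packed = struct.pack("<b" if use_byte else "<i", disp)
    return bytes([0x29, modrm]) + packed   # 0x29 is SUB r/m32,r32


print(sub_ebp_disp_eax(-4).hex().upper())
print(sub_ebp_disp_eax(-4, "DWORD").hex().upper())
```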
↑ SCALE=

Scaling means multiplication of the contents of the index register by 0, 1, 2, 4 or 8 at run-time. The SCALE= modifier can be either SMART or VERBATIM (or shortly S, V). Default is SCALE=SMART.
In verbatim mode no optimisation is performed with index and base registers and the scaling is encoded in the SIB byte even when the scale factor is 1 or 0. Encoding of an instruction with SCALE=VERBATIM uses the SIB byte, if possible.
In smart mode (default) €ASM tries to rearrange registers and does not emit the SIB byte unless absolutely necessary.
Here are the "smart" optimisation rules (IR is indexregister, BR is baseregister, disp is displacement):

|00000000:A011000000 | MOV AL,[0x11] ; Special encoding without ModR/M.
|00000005:A011000000 | MOV AL,[0*ESI+0x11] ; Special encoding without ModR/M.
|0000000A:8A042511000000 | MOV AL,[0*ESI+0x11],SCALE=VERBATIM ; ModR/M with SIB. ESI is not used.
|00000011: |
|00000011:8A4611 | MOV AL,[ESI+0x11] ; ModR/M without SIB. ESI is base.
|00000014:8A4611 | MOV AL,[ESI+0x11],SCALE=SMART ; ModR/M without SIB. ESI is base.
|00000017:8A442611 | MOV AL,[ESI+0x11],SCALE=VERBATIM ; ModR/M with SIB, ESI is base.
|0000001B:8A4611 | MOV AL,[1*ESI+0x11] ; ModR/M without SIB. ESI is base.
|0000001E:8A4611 | MOV AL,[1*ESI+0x11],SCALE=SMART ; ModR/M without SIB. ESI is base.
|00000021:8A043511000000 | MOV AL,[1*ESI+0x11],SCALE=VERBATIM ; ModR/M with SIB, ESI is index.
|00000028: |
|00000028:8A443611 | MOV AL,[ESI+ESI+0x11] ; ModR/M with SIB. ESI is base and index.
|0000002C:8A443611 | MOV AL,[2*ESI+0x11] ; ModR/M with SIB. ESI is base and index.
|00000030:8A047511000000 | MOV AL,[2*ESI+0x11],SCALE=VERBATIM ; ModR/M with SIB. ESI is scaled index.
|00000037: |
|00000037:8A442D11 | MOV AL,[EBP+EBP+0x11] ; ModR/M with SIB, EBP is base and index.
|0000003B:8A442D11 | MOV AL,[2*EBP+0x11] ; ModR/M with SIB, EBP is base and index.
|0000003F:8A046D11000000 | MOV AL,[2*EBP+0x11],SCALE=VERBATIM ; ModR/M with SIB, EBP is scaled index.

Notice that an optimisation with SCALE=SMART may change the register role (base|index) and consequently the default segment register (SS|DS) used for addressing. This is usually not an issue in flat memory model, otherwise use SCALE=VERBATIM.

When the instruction encoding is displayed with EUROASM DisplayEnc=Yes, modifier SCALE=VERBATIM tells that SIB was actually emitted in this encoding, otherwise SCALE=SMART indicates that no SIB byte was emitted.
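
To check the register roles in the SCALE= listing, the SIB byte can be decoded by hand or with a few lines of code. The Python sketch below assumes 32-bit addressing; the function decode_sib is a hypothetical helper, not a part of €ASM, and it needs the mod field of the preceding ModR/M byte because base code 101b means "no base, disp32 follows" when mod is 00.

```python
REGS = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def decode_sib(mod: int, sib: int):
    """Return (scale, index register, base register) of a 32-bit SIB byte."""
    scale = 1 << (sib >> 6)            # bits 7..6 hold log2 of the scale
    index = (sib >> 3) & 7
    base = sib & 7
    index_reg = None if index == 4 else REGS[index]          # 100b = no index
    base_reg = None if (base == 5 and mod == 0) else REGS[base]
    return scale, index_reg, base_reg


# 8A 04 75 11000000 is MOV AL,[2*ESI+0x11],SCALE=VERBATIM: ModR/M=04, SIB=75.
print(decode_sib(0x04 >> 6, 0x75))
```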

↑ DIST=

This modifier specifies the distance of a target in control-transfer instructions. It can be one of FAR, NEAR, SHORT alias F, N, S.

DIST=FAR is used when the target is in a different segment and both rIP and CS registers need to be changed.

By default in intrasegment transfers €ASM automatically selects between SHORT and NEAR distance depending on the magnitude of the offsets difference.

Modifier DIST= has the same function as an instruction suffix; there are only two differences:

Modifier DIST=NEAR or DIST=FAR can also be applied to the PROC and PROC1 pseudoinstructions. A consequence of making a procedure FAR is that CALLs and JMPs to that procedure will be FAR by default, and that any RET inside this procedure will default to DIST=FAR, too.

|[CODE1] |[CODE1] SEGMENT
|0000:EB2A | JMP CloseLabel: ; Encoded DIST=SHORT.
|0002:E92701 | JMP DistantLabel: ; Encoded DIST=NEAR.
|0005:EA[0000]{0000} | JMP FarLabel: ; Encoded DIST=FAR.
|000A:EB20 | JMP CloseLabel:,DIST=SHORT ; Encoded DIST=SHORT.
|000C:E91D01 | JMP DistantLabel:,DIST=SHORT ; Encoded DIST=NEAR.
|## W2401 Modifier "DIST=SHORT" could not be obeyed in this instruction.
|000F:EA[0000]{0000} | JMP FarLabel:,DIST=SHORT ; Encoded DIST=FAR.
|## W2401 Modifier "DIST=SHORT" could not be obeyed in this instruction.
|0014:E91500 | JMP CloseLabel:,DIST=NEAR ; Encoded DIST=NEAR.
|0017:E91201 | JMP DistantLabel:,DIST=NEAR ; Encoded DIST=NEAR.
|001A:E9(0000) | JMP FarLabel:,DIST=NEAR ; Encoded DIST=NEAR.
|001D:EA[2C00]{0000} | JMP CloseLabel:,DIST=FAR ; Encoded DIST=FAR.
|0022:EA[2C01]{0000} | JMP DistantLabel:,DIST=FAR ; Encoded DIST=FAR.
|0027:EA[0000]{0000} | JMP FarLabel:,DIST=FAR ; Encoded DIST=FAR.
|002C: |CloseLabel:
|002C:90909090909090~~| DB 256 * B 0x90 ; Some stuff to stall off the DistantLabel.
|012C: |DistantLabel:
|[CODE2] |[CODE2] SEGMENT
|0000: |FarLabel:
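
The automatic choice between SHORT and NEAR in the intrasegment jumps of this listing can be reproduced with simple arithmetic: the encoded number is the target offset minus the offset of the next instruction. The Python sketch below assumes a 16-bit segment, as in the listing; the function jmp16 is hypothetical, it takes the instruction offset as already known, and it does not model FAR jumps or the W2401 warning.

```python
import struct


def jmp16(offset: int, target: int, dist: str = None) -> bytes:
    """Encode an intrasegment JMP in a 16-bit segment (SHORT or NEAR)."""
    rel8 = target - (offset + 2)            # a short jump is 2 bytes long
    if dist in (None, "SHORT") and -128 <= rel8 <= 127:
        return b"\xEB" + struct.pack("<b", rel8)
    rel16 = (target - (offset + 3)) & 0xFFFF   # a near jump is 3 bytes long
    return b"\xE9" + struct.pack("<H", rel16)


print(jmp16(0x0000, 0x002C).hex().upper())   # JMP CloseLabel:
print(jmp16(0x0002, 0x012C).hex().upper())   # JMP DistantLabel:
```

A request for DIST=SHORT which cannot be satisfied falls back to the NEAR form in this sketch, which mirrors the line at offset 000C.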
↑ ADDR=

This modifier will choose the reference frame of memory addressing in 64-bit mode. Allowed values are ABS, REL alias A, R. A number encoded in the instruction code with absolute addressing is related to the start of segment, which is always 0 at assembly time.
In a relative addressing frame it is related to the position of the next instruction, i. e. to the contents of register RIP.
In legacy modes (16-bit, 32-bit) the reference frame is hardwired as ADDR=REL in control-transfer instructions (direct JMP, CALL, LOOP, Jcc), and as ADDR=ABS in all other instructions.

RIP-relative addressing is shorter by one byte and it does not require relocation, which saves space in an object file and avoids patching of the code at load-time. That is why ADDR=REL is preferred by default in 64-bit mode.
Explicit selection between absolute and RIP-relative addressing is relevant only in 64-bit mode when the absolute address would require relocation at link-time. This happens when the memory variable is specified as a displacement of an address symbol (not a plain number), and no index or base register is involved in addressing.

|00000000:00000000 | MemDword DD 0
|00000004: |
|00000004:0305F6FFFFFF | ADD EAX,[MemDword] ; Encoded with relative addressing.
|0000000A:0305F0FFFFFF | ADD EAX,[MemDword],ADDR=REL ; Encoded with relative addressing.
|00000010:030425[00000000] | ADD EAX,[MemDword],ADDR=ABS ; Encoded with absolute addressing.
|00000017: |
|00000017:034540 | ADD EAX,[RBP+0x40] ; Encoded with absolute addressing.
|0000001A:034540 | ADD EAX,[RBP+0x40],ADDR=ABS ; Encoded with absolute addressing.
|0000001D:034540 | ADD EAX,[RBP+0x40],ADDR=REL ; Encoded with absolute addressing.
|## W2401 Modifier "ADDR=REL" could not be obeyed in this instruction.
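
The displacement emitted with ADDR=REL in this listing is the distance from the end of the instruction to the target. A short Python sketch of that calculation follows; the function rip_disp32 is a hypothetical helper, and the instruction size (6 bytes here) is passed in rather than computed.

```python
import struct


def rip_disp32(instr_offset: int, instr_size: int, target: int) -> bytes:
    """Return the little-endian disp32 of a RIP-relative memory operand."""
    next_instruction = instr_offset + instr_size   # value of RIP at run-time
    return struct.pack("<i", target - next_instruction)


# ADD EAX,[MemDword] at offset 4, MemDword at offset 0.
print(rip_disp32(0x04, 6, 0x00).hex().upper())
```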
↑ PREFIX=

All following modifiers apply only to instructions which use Advanced Vector eXtensions (AVX) encoding. Possible values of PREFIX= are XOP, VEX, VEX2, VEX3, MVEX, EVEX (shortcuts are not available).

Most AVX-encodable instructions have their mnemonics prefixed with V~. Some instructions are defined with only one kind of AVX prefix, they don't need explicit modifier. When an instruction can be alternatively encoded with different AVX prefixes, €ASM will by default choose the shortest one.

Prefix VEX exists in two variants: VEX2 and VEX3. The longer encoding (VEX3) is automatically selected when the instruction uses indexregister or baseregister R8..R15 or when it uses opcode from map 0F38 or 0F3A.

Prefix EVEX or MVEX will be selected instead of VEX when the instruction uses register XMM16..XMM31, YMM16..YMM31, ZMM0..ZMM31, K0..K7, or modifier EH=, SAE=, ROUND=, MASK=, ZEROING=, OPER=.

Instructions encodable with both EVEX and MVEX default to PREFIX=EVEX. Software written for Intel® Xeon Phi CPU needs to explicitly request PREFIX=MVEX in each such amphibious instruction. In this case it is useful to disable EVEX with EUROASM EVEX=DISABLED and thus be warned if some MVEX instruction encodes as EVEX by omission. Explicit specification of modifier EH= (which is available with MVEX only) will select MVEX too, and explicit PREFIX=MVEX is not necessary in this case.

CPU features required by using AVX prefix

Prefix | Required EUROASM options
XOP | SIMD=AVX, AMD=ENABLED, XOP=ENABLED
VEX | SIMD=AVX
MVEX | SIMD=AVX512, MVEX=ENABLED
EVEX | SIMD=AVX512, EVEX=ENABLED

|00000000:8FE868CCCB04 | VPCOMB XMM1,XMM2,XMM3,4 ; VPCOMB is defined with XOP only.
|00000006:62F1FA082917 | VMOVNRAPD [RDI],ZMM2 ; VMOVNRAPD is defined with MVEX only.
|0000000C:C5E958CB | VADDPD XMM1,XMM2,XMM3 ; VADDPD is defined with VEX,MVEX,EVEX.
|00000010:C5E958CB | VADDPD XMM1,XMM2,XMM3,PREFIX=VEX
|00000014:C5E958CB | VADDPD XMM1,XMM2,XMM3,PREFIX=VEX2
|00000018:C4E16958CB | VADDPD XMM1,XMM2,XMM3,PREFIX=VEX3
|0000001D:62F1ED0858CB | VADDPD XMM1,XMM2,XMM3,PREFIX=EVEX
|00000023:62F1ED4858CB | VADDPD ZMM1,ZMM2,ZMM3,PREFIX=EVEX
|00000029:62F1E90858CB | VADDPD ZMM1,ZMM2,ZMM3,PREFIX=MVEX
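
The difference between PREFIX=VEX2 and PREFIX=VEX3 in this listing can be followed bit by bit. The Python sketch below builds the prefix bytes from the standard VEX fields; the function vex_prefix and its parameter names are hypothetical, and the rule for falling back to the 3-byte form is this sketch's reading of the conditions described above (plus VEX.W, which also exists only in the 3-byte form).

```python
def vex_prefix(vvvv, pp, l=0, opcode_map=1, r=0, x=0, b=0, w=0,
               force_vex3=False) -> bytes:
    """Build a 2-byte (C5) or 3-byte (C4) VEX prefix.

    vvvv       number of the first source register (stored inverted)
    pp         implied legacy prefix: 0 none, 1 = 66, 2 = F3, 3 = F2
    opcode_map 1 = 0F, 2 = 0F38, 3 = 0F3A
    r, x, b    register extension bits (stored inverted)
    """
    inv_vvvv = (~vvvv) & 0xF
    if not force_vex3 and opcode_map == 1 and not (x or b or w):
        return bytes([0xC5, ((r ^ 1) << 7) | (inv_vvvv << 3) | (l << 2) | pp])
    byte1 = ((r ^ 1) << 7) | ((x ^ 1) << 6) | ((b ^ 1) << 5) | opcode_map
    byte2 = (w << 7) | (inv_vvvv << 3) | (l << 2) | pp
    return bytes([0xC4, byte1, byte2])


# VADDPD XMM1,XMM2,XMM3: first source XMM2, implied prefix 66, map 0F.
print(vex_prefix(vvvv=2, pp=1).hex().upper())
print(vex_prefix(vvvv=2, pp=1, force_vex3=True).hex().upper())
```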
↑ MASK=

Modifier MASK= (as well as ZEROING=, EH=, SAE=, ROUND=, BCST=, OPER=) is applicable only with Enhanced Advanced Vector eXtensions (EVEX or MVEX). MASK specifies which opmask register is used to control which elements (floating-point or integer numbers) should be written to the destination SIMD register. Only those elements which have the corresponding bit set in the mask register are written. Other elements are either zeroed (if modifier ZEROING=ON) or left unchanged (ZEROING=OFF).

Possible value of MASK= is K0, K1, K2, K3, K4, K5, K6, K7 or an expression which evaluates to a number 0..7. Default is MASK=0. Opmask register K0 is special, it is treated as if it had all bits set, thus no masking is applied in this case.

↑ ZEROING=

Modifier ZEROING= is boolean, it controls whether elements masked-off by the contents of opmask register should be set to zero or left unchanged, which is called merging. It has no meaning when MASK=K0 or when mask is not specified at all. Default is ZEROING=OFF (merging). Modifier is applicable only with EVEX encoding.

|00000000:C5E958CB | VADDPD XMM1,XMM2,XMM3 ; VADDPD is defined with VEX,MVEX,EVEX.
|00000004:62F1ED0C58CB | VADDPD XMM1,XMM2,XMM3,MASK=4 ; Using MASK= will force EVEX encoding.
|0000000A:62F1ED0C58CB | VADDPD XMM1,XMM2,XMM3,MASK=K4,ZEROING=NO
|00000010:62F1ED8C58CB | VADDPD XMM1,XMM2,XMM3,MASK=K4,ZEROING=YES
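
In the listing only the fourth byte of the EVEX prefix changes with MASK= and ZEROING= (0C versus 8C). The Python sketch below composes that byte from its fields as they are laid out in the EVEX format; the function evex_last_byte and its parameter names are hypothetical.

```python
def evex_last_byte(mask=0, zeroing=False, ll=0, b=0, v_high=0) -> int:
    """Compose the last EVEX prefix byte: z, L'L, b, V', aaa.

    mask    opmask register number 0..7 (aaa)
    zeroing True for ZEROING=ON (z bit)
    ll      vector length: 0 = 128-bit, 1 = 256-bit, 2 = 512-bit
    b       broadcast or rounding-control flag
    v_high  bit 4 of the first source register number (stored inverted)
    """
    return ((int(zeroing) << 7) | (ll << 5) | (b << 4)
            | ((v_high ^ 1) << 3) | mask)


print(f"{evex_last_byte(mask=4):02X}")                 # MASK=K4,ZEROING=NO
print(f"{evex_last_byte(mask=4, zeroing=True):02X}")   # MASK=K4,ZEROING=YES
```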
↑ EH=

Boolean modifier EH= (Eviction Hint) is applicable with the MVEX-encoded instructions only. EH=1 informs the CPU that the data is non-temporal and unlikely to be reused soon, so there is no benefit in storing it in the CPU cache. This concerns register-to-memory instructions only.

Value of EH is also consulted in register-to-register instructions where it will select between swizzle operations and static rounding.

↑ SAE=

If boolean modifier SAE= (Suppress All Exceptions) is switched on, the instruction will not raise any kind of floating-point exception flags, for instance when it operates on a not-a-number value. An instruction with SAE=ON behaves as if all the MXCSR mask bits were set.

In EVEX-encoding SAE is by default enabled whenever static rounding is used, this behaviour cannot be switched off.

↑ ROUND=

Modifier ROUND= specifies the static rounding mode; it is applicable to EVEX and MVEX instructions with rounding semantics, for instance conversion from double to single-precision FP numbers. It has four possible enumerated values: NEAR, UP, DOWN, ZERO alias N, U, D, Z.

Static rounding is available only in ZMM register-to-register operations (not if one of the operands is in memory or when XMM and YMM registers are used). Default is no rounding, in this case general rounding mode controlled by RM bits in MXCSR applies.

↑ BCST=

Boolean modifier BCST= can be used to enable data broadcasting in operations which load data from memory. When BCST=ENABLED, the memory source operand specifies only one element and its contents will be broadcast (copied) to all positions of the destination register.

Default is BCST=OFF. Broadcasting cannot be used with register-to-register operations.

|00000000:62F16C48590E | VMULPS ZMM1,ZMM2,[RSI] ; Multiply 16 DWORD FP numbers in ZMM2 with 16 DWORD FP numbers at [RSI], store 16 products to ZMM1.
|00000006:62F16C58590E | VMULPS ZMM1,ZMM2,[RSI],BCST=ON ; Multiply 16 DWORD FP numbers in ZMM2 with the same DWORD FP number at [RSI], store 16 products to ZMM1.
|0000000C:62F16C4859CB | VMULPS ZMM1,ZMM2,ZMM3 ; Multiply 16 DWORD FP numbers in ZMM2 with 16 DWORD FP numbers in ZMM3, store 16 products to ZMM1.
|00000012:62F16C7859CB | VMULPS ZMM1,ZMM2,ZMM3,ROUND=ZERO ; Ditto, truncate each product toward zero.

↑ OPER=

Instruction modifier OPER= encodes the kind of operation performed with the source operand at run-time. Affected operations are broadcasting, rounding, conversion and swizzling. The value is a numeric expression which evaluates to 0..7.

Value of the operation will be encoded in bits 6, 5, 4 of 32-bit prefix EVEX or MVEX. These bits are named S2, S1, S0 in MVEX specification [IntelMVEX], and L', L, b in EVEX specification [IntelAVX512]. The same bits are also affected by the modifiers BCST=, ROUND=, SAE= and by SIMD register width, but direct OPER= specification has higher priority when a conflict occurs.

Modifier OPER= is the only way to request a special conversion or swizzle (shuffle) operation for MVEX-encoded instructions available on Intel® Xeon Phi CPU. Not all operation values from the table below are available with all MVEX instructions; the documentation in [IntelMVEX] should always be consulted prior to using OPER=.

MVEX-encoded operations
OPER= | register-to-register, EH=0 | register-to-register, EH=1 | memory-to-register | register-to-memory
0 | no swizzle {dcba} | ROUND=NEAR,SAE=NO | no operation | no conversion
1 | swap (inner) pairs {cdab} | ROUND=DOWN,SAE=NO | bcst 1 element {1to16} or {1to8} | not available
2 | swap with two-away {badc} | ROUND=UP,SAE=NO | bcst 4 elements {4to16} or {4to8} | not available
3 | cross-product swizzle {dacb} | ROUND=ZERO,SAE=NO | convert from {float16} | convert to {float16}
4 | bcst a element across 4 {aaaa} | ROUND=NEAR,SAE=YES | convert from {uint8} | convert to {uint8}
5 | bcst b element across 4 {bbbb} | ROUND=DOWN,SAE=YES | convert from {sint8} | convert to {sint8}
6 | bcst c element across 4 {cccc} | ROUND=UP,SAE=YES | convert from {uint16} | convert to {uint16}
7 | bcst d element across 4 {dddd} | ROUND=ZERO,SAE=YES | convert from {sint16} | convert to {sint16}

EVEX-encoded operations
OPER= | register-to-register | memory-to-register
0 | DATA=OWORD,SAE=NO | DATA=OWORD,BCST=OFF
1 | DATA=ZWORD,SAE=YES,ROUND=NEAR | DATA=OWORD,BCST=ON
2 | DATA=YWORD,SAE=NO | DATA=YWORD,BCST=OFF
3 | DATA=ZWORD,SAE=YES,ROUND=DOWN | DATA=YWORD,BCST=ON
4 | DATA=ZWORD,SAE=NO | DATA=ZWORD,BCST=OFF
5 | DATA=ZWORD,SAE=YES,ROUND=UP | DATA=ZWORD,BCST=ON
6 | reserved | reserved
7 | DATA=ZWORD,SAE=YES,ROUND=ZERO | reserved

|00000000:62F16908DB4D01<6 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=0 ; No broadcast {16to16}.
|00000007:62F16918DB4D10<2 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=1 ; Broadcast one element {1to16}.
|0000000E:62F16928DB4D04<4 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=2 ; Broadcast four elements {4to16}.
|00000015:62F16948DB4D04<4 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=4 ; Convert from {uint8}.
|0000001C:62F16958DB4D04<4 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=5 ; Convert from {sint8}.
|00000023:62F16968DB4D02<5 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=6 ; Convert from {uint16}.
|0000002A:62F16978DB4D02<5 | VPANDD ZMM1,ZMM2,[RBP+40h],PREFIX=MVEX,OPER=7 ; Convert from {sint16}.
|00000031: |
|00000031:62F1F9085A4D01<6 | VCVTPD2PS ZMM1,[RBP+40h],PREFIX=MVEX,OPER=0 ; No broadcast {8to8}.
|00000038:62F1F9185A4D08<3 | VCVTPD2PS ZMM1,[RBP+40h],PREFIX=MVEX,OPER=1 ; Broadcast one element {1to8}.
|0000003F:62F1F9285A4D02<5 | VCVTPD2PS ZMM1,[RBP+40h],PREFIX=MVEX,OPER=2 ; Broadcast four elements {4to8}.
|00000046: |
|00000046:62F1F9085ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=0 ; No swizzle {dcba}.
|0000004C:62F1F9185ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=1 ; Swap (inner) pairs {cdab}.
|00000052:62F1F9285ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=2 ; Swap with two-away {badc}.
|00000058:62F1F9385ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=3 ; Cross-product swizzle {dacb}.
|0000005E:62F1F9485ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=4 ; Broadcast a element to 4 {aaaa}.
|00000064:62F1F9585ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=5 ; Broadcast b element to 4 {bbbb}.
|0000006A:62F1F9685ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=6 ; Broadcast c element to 4 {cccc}.
|00000070:62F1F9785ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=0,OPER=7 ; Broadcast d element to 4 {dddd}.
|00000076: |
|00000076:62F1F9885ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=0 ; ROUND=NEAR,SAE=OFF {rn}.
|0000007C:62F1F9985ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=1 ; ROUND=DOWN,SAE=OFF {rd}.
|00000082:62F1F9A85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=2 ; ROUND=UP, SAE=OFF {ru}.
|00000088:62F1F9B85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=3 ; ROUND=ZERO,SAE=OFF {rz}.
|0000008E:62F1F9C85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=4 ; ROUND=NEAR,SAE=ON {rn-sae}.
|00000094:62F1F9D85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=5 ; ROUND=DOWN,SAE=ON {rd-sae}.
|0000009A:62F1F9E85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=6 ; ROUND=UP, SAE=ON {ru-sae}.
|000000A0:62F1F9F85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,EH=1,OPER=7 ; ROUND=ZERO,SAE=ON {rz-sae}.
|000000A6:62F1F9F85ACA | VCVTPD2PS ZMM1,ZMM2,PREFIX=MVEX,ROUND=ZERO,SAE=ON
|000000AC:62F1F9F85ACA | VCVTPD2PS ZMM1,ZMM2,EH=1,ROUND=ZERO,SAE=ON
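
The relation between the dumped bytes and the OPER= value can be cross-checked with a short Python sketch (Python is used here only as a calculator, it is not part of €ASM). It assumes that "bits 6, 5, 4" are bits of the fourth byte of the 62h-prefix and that bit 7 of the same byte carries EH in MVEX encoding; both assumptions are read off the dumps in this chapter rather than taken from a specification. The function name oper_field is hypothetical.

```python
def oper_field(dump_hex):
    """Return (bit 7, bits 6..4) of the fourth byte of a 62h-prefixed instruction."""
    code = bytes.fromhex(dump_hex)
    if code[0] != 0x62:
        raise ValueError("not an EVEX/MVEX-prefixed instruction")
    last = code[3]                          # Fourth byte of the 32-bit prefix.
    return last >> 7, (last >> 4) & 0b111   # (EH in MVEX, OPER value).

# Dumps copied from the listings in this chapter.
print(oper_field("62F16C58590E"))  # VMULPS ZMM1,ZMM2,[RSI],BCST=ON          -> (0, 5)
print(oper_field("62F16C7859CB"))  # VMULPS ZMM1,ZMM2,ZMM3,ROUND=ZERO        -> (0, 7)
print(oper_field("62F1F9D85ACA"))  # VCVTPD2PS ...,PREFIX=MVEX,EH=1,OPER=5   -> (1, 5)
```

The first two results agree with rows 5 (memory-to-register) and 7 (register-to-register) of the EVEX table.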
↑ ALIGN=

Alignment request may be applied to any machine instruction, and to pseudoinstructions D, PROC, PROC1, STRUC. See the alignment paragraph for accepted values. This instruction modifier has the same effect as if explicit pseudoinstruction ALIGN was placed above the statement.

↑ NESTINGCHECK=

This is a pseudoinstruction modifier which can be applied only to pseudoinstructions PROC, ENDPROC, PROC1, ENDPROC1. Its value is boolean, default is NESTINGCHECK=ON. Switching the nesting check off suppresses the error message on block mismatch. This makes it possible to pair macros which enhance block pseudoinstructions. See the definitions of macros Procedure and EndProcedure as an example.

↑ Instruction enhancements

FPU instruction default registers ↓
String instructions operands ↓
XLAT with nondefault [segment:base] ↓
LOOP with nondefault counter ↓
Near and far LOOP and JrCXZ ↓
Near and far Jcc ↓
PUSH, POP, INC, DEC multiple operands ↓
AAD, AAM operand ↓
TEST a register by itself ↓
Shift and rotate 2nd operand ↓
No-operation ↓
PINSR register source ↓
BLENDVPD, BLENDVPS, PBLENDVB 3rd operand ↓
MASKMOVQ, MASKMOVDQU 1st operand ↓
VERR, VERW, LAR, LSL ↓

Some x86 instructions work with registers fixed by design. €ASM accepts an optional explicit specification of such registers, which serves as documentation for the human reader and sometimes may be exploited as an address-size definition and/or segment override.

↑ FPU instruction default registers

Unary FPU instructions with implicit destination ST0 may explicitly name this register as the first operand, or it may be omitted. In many other FPU instructions the default destination is ST0 and the default source is ST1, in which case one or both operands may be omitted. See also handlers of instructions FNOP, FCMOVB, FADD, FIADD, FADDP, FXCH, FCOM.

|00000000:000000000000F03F |Mem DQ 1.0
|00000008: |
|00000008:DAC1 | FCMOVB ; ST0 = ST1 if Below.
|0000000A:DAC1 | FCMOVB ST0,ST1 ; ST0 = ST1 if Below.
|0000000C: |
|0000000C:DAC7 | FCMOVB ST0,ST7 ; ST0 = ST7 if Below.
|0000000E:DAC7 | FCMOVB ST7 ; ST0 = ST7 if Below.
|00000010: |
|00000010:D8C1 | FADD ; ST0 += ST1.
|00000012:D8C1 | FADD ST0,ST1 ; ST0 += ST1.
|00000014: |
|00000014:DC05[00000000] | FADD ST0,[Mem] ; ST0 += [Mem].
|0000001A:DC05[00000000] | FADD [Mem] ; ST0 += [Mem].
|00000020: |
|00000020:DCC7 | FADD ST7,ST0 ; ST7 += ST0.
|00000022:DCC7 | FADD ST7 ; ST7 += ST0.
|00000024: |
|00000024:D9E9 | FLDL2T ; ST0 = log2(10).
|00000026:D9E9 | FLDL2T ST0 ; ST0 = log2(10).
↑ String instructions operands

String instructions implicitly address the source as memory [DS:rSI] or port DX, and the destination as memory [ES:rDI] or port DX. Besides the operandless form, €ASM accepts operand(s) explicitly representing the source and destination, with possible segment override and address-size change.

|00000000:AC | LODSB
|00000001:AC | LODSB [DS:ESI] ; Default segment is DS, address-size is 32.
|00000002:2EAC | LODSB [CS:ESI] ; Segment override.
|00000004:67AC | LODSB [SI] ; Address-size changed.
|00000006: |
|00000006:AA | STOSB
|00000007:AA | STOSB [EDI]
|00000008: |
|00000008:AE | SCASB
|00000009:AE | SCASB [EDI]
|0000000A: |
|0000000A:A5 | MOVSD
|0000000B:A5 | MOVSD [EDI],[ESI]
|0000000C:2667A5 | MOVSD [DI],[ES:SI] ; Address-size and source segment changed.
|0000000F: |
|0000000F:666D | INSW
|00000011:666D | INSW [ES:EDI],DX
|00000013: |
|00000013:6E | OUTSB
|00000014:6E | OUTSB DX,[DS:ESI]
|00000015:2E6E | OUTSB DX,[CS:ESI] ; Source segment changed.
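
How the explicit operand turns into prefixes can be modelled in a few lines of Python. This is only a sketch under stated assumptions: it covers one-byte string opcodes in a 32-bit segment, it puts the segment override before the address-size prefix because that is the order seen in the dump above, and the function name string_instruction is hypothetical. The prefix values themselves are the standard x86 ones.

```python
SEGMENT_PREFIX = {"ES": 0x26, "CS": 0x2E, "SS": 0x36, "DS": 0x3E, "FS": 0x64, "GS": 0x65}
ADDRESS_SIZE_PREFIX = 0x67

def string_instruction(opcode, segment="DS", address_bits=32, mode_bits=32):
    """Return the hexadecimal dump of a string instruction with its prefixes."""
    code = bytearray()
    if segment != "DS":                  # DS is the default source segment, no override is emitted.
        code.append(SEGMENT_PREFIX[segment])
    if address_bits != mode_bits:        # For instance [SI] used in a 32-bit segment.
        code.append(ADDRESS_SIZE_PREFIX)
    code.append(opcode)
    return code.hex().upper()

print(string_instruction(0xAC, segment="CS"))                   # LODSB [CS:ESI]    -> 2EAC
print(string_instruction(0xAC, address_bits=16))                # LODSB [SI]        -> 67AC
print(string_instruction(0xA5, segment="ES", address_bits=16))  # MOVSD [DI],[ES:SI] -> 2667A5
```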
↑ XLAT with nondefault [segment:base]

Default translation table is implicitly addressed with [DS:rBX]. €ASM accepts optional memory operand which can specify nondefault segment override and nondefault rBX width.

↑ LOOP with nondefault counter

LOOP count register can be specified as the optional second operand.

|00000000:D7 | XLAT
|00000001:D7 | XLATB ; XLAT and XLATB are identical.
|00000002:D7 | XLATB [DS:EBX] ; Segment DS is the default, no override is necessary.
|00000003:26D7 | XLATB [ES:EBX] ; Segment override.
|00000005:67D7 | XLATB [BX] ; Address-size changed from 32 to 16 bits.
|00000007: |
|00000007:E2F6 | LOOP $-8
|00000009:E2F6 | LOOP $-8,ECX ; Default counter in 32-bit mode is ECX.
|0000000B:67E2F5 | LOOP $-8,CX ; Counter register (its address-size) changed to 16 bits.
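
The last LOOP in the dump ends with F5 rather than F6 because the address-size prefix makes the instruction one byte longer. A small Python sketch of the displacement arithmetic illustrates it; it assumes that $ stands for the address of the statement itself, and the function name loop_rel8 is hypothetical.

```python
def loop_rel8(address, size, target):
    """8-bit displacement of a short jump: target minus the address of the next instruction."""
    displacement = target - (address + size)
    if not -128 <= displacement <= 127:
        raise ValueError("target is out of byte range")
    return displacement & 0xFF           # Two's complement byte.

print(f"{loop_rel8(0x07, 2, 0x07 - 8):02X}")  # LOOP $-8     at 00000007 -> F6
print(f"{loop_rel8(0x0B, 3, 0x0B - 8):02X}")  # LOOP $-8,CX  at 0000000B -> F5
```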
↑ Near and far LOOP and JrCXZ

Looping is not limited to a short-range distance in €ASM. When the destination of LOOP, LOOPcc, JCXZ, JECXZ, JRCXZ is far or near (out of byte range), €ASM will assemble three instructions instead:

 LOOP $+2+2 ; Loop to the proxy-jump instead of the original destination.
 JMPS $+JMPSsize+JMPsize ; Skip the proxy-jump when the loop has finished (rCX is zero).
 JMP target ; Near or far unconditional proxy-jump to the original destination.

|[CODE1] |[CODE1] SEGMENT
|00000000:E366 | JECXZ CloseLabel:
|00000002:E364 | JECXZ CloseLabel:,DIST=SHORT
|00000004:E302EB05E95B000000 | JECXZ CloseLabel:,DIST=NEAR
|0000000D:E302EB07EA[68000000]{0000}| JECXZ CloseLabel:,DIST=FAR
|00000018: |
|00000018:E302EB05E947010000 | JECXZ DistantLabel:
|00000021:E302EB05E93E010000 | JECXZ DistantLabel:,DIST=SHORT
|## W2401 Modifier "DIST=SHORT" could not be obeyed in this instruction.
|0000002A:E302EB05E935010000 | JECXZ DistantLabel:,DIST=NEAR
|00000033:E302EB07EA[68010000]{0000}| JECXZ DistantLabel:,DIST=FAR
|0000003E: |
|0000003E:E302EB07EA[00000000]{0000}| JECXZ FarLabel:
|00000049:E302EB07EA(00000000){0000}| JECXZ FarLabel:,DIST=SHORT
|## W2401 Modifier "DIST=SHORT" could not be obeyed in this instruction.
|00000054:E302EB05E9(00000000) | JECXZ FarLabel:,DIST=NEAR
|0000005D:E302EB07EA[00000000]{0000}| JECXZ FarLabel:,DIST=FAR
|00000068: |CloseLabel:
|00000068:909090909090909090909090~~| DB 256 * B 0x90 ; Some stuff to stall off the DistantLabel.
|00000168: |DistantLabel:
|[CODE2] |[CODE2] SEGMENT
|00000000: |FarLabel:
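
The nine bytes emitted for a near JECXZ can be reproduced with a short Python sketch. It is an illustration only, limited to the near case in a 32-bit segment where the target lies in the same section; the function name jecxz_near is hypothetical.

```python
import struct

def jecxz_near(address, target):
    """JECXZ to a near target, assembled as JECXZ short + JMP short + JMP near."""
    total_size = 2 + 2 + 5                           # Sizes of the three instructions.
    rel32 = target - (address + total_size)          # Relative to the end of the near JMP.
    return (bytes([0xE3, 0x02])                      # JECXZ $+2+2: go to the proxy-jump.
            + bytes([0xEB, 0x05])                    # JMPS over the 5-byte proxy-jump.
            + b"\xE9" + struct.pack("<i", rel32))    # JMP near to the original destination.

print(jecxz_near(0x04, 0x68).hex().upper())   # E302EB05E95B000000
print(jecxz_near(0x18, 0x168).hex().upper())  # E302EB05E947010000
```

Both results match the dump of JECXZ CloseLabel:,DIST=NEAR at 00000004 and JECXZ DistantLabel: at 00000018.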
↑ Near and far Jcc

A conditional jump to the distance exceeding the byte limit -128..127 was introduced with 386 CPU. When the program is intended to run on older processors as well, near and far conditional jump Jcc target will be assembled by €ASM as two instructions:

 J!cc $+J!ccsize+JMPsize ; Skip the proxy-jump if inverted condition is true.
 JMP target ; Near or far unconditional proxy-jump to the original destination.

Near proxy-jump instead of standard 386 near conditional jump is assembled when these three conditions are met:

  1. Distance to the target is out of byte range,
  2. Segment width is 16,
  3. EUROASM option CPU= is 286 or lower.
|[CODE1] |[CODE1] SEGMENT WIDTH=16
| | EUROASM CPU=386
|0000:7419 | JE CloseLabel: ; Standard short conditional jump.
|0002:0F841501 | JE DistantLabel: ; Standard near conditional jump, available on CPU=386 and newer.
|0006:7505EA[0000]{0000}| JE FarLabel: ; Far unconditional proxy-jump skipped by inverted-condition J!cc.
| | EUROASM CPU=086 ; The following instructions should run on old PC/XT machine, too.
|000D:740C | JE CloseLabel: ; Standard short conditional jump.
|000F:7503E90701 | JE DistantLabel: ; Near unconditional proxy-jump skipped by the inverted-condition J!cc.
|0014:7505EA[0000]{0000}| JE FarLabel: ; Far unconditional proxy-jump skipped by the inverted-condition J!cc.
|001B: |CloseLabel:
|001B:9090909090909090~~| DB 256 * B 0x90 ; Some stuff to stall off the DistantLabel.
|011B: |DistantLabel:
|[CODE2] |[CODE2] SEGMENT
|0000: |FarLabel:
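
The three conditions can be put together in a Python sketch which reproduces the short and near encodings of JE from the dump. It is a simplified model: it handles only JE in a 16-bit segment with the target in the same segment, and the names CPU_LEVEL and je_16bit are hypothetical.

```python
import struct

CPU_LEVEL = {"086": 0, "186": 1, "286": 2, "386": 3, "486": 4, "586": 5, "686": 6, "X64": 7}

def je_16bit(address, target, cpu):
    """Encode JE target in a 16-bit segment for the given EUROASM CPU= level."""
    short = target - (address + 2)
    if -128 <= short <= 127:
        return bytes([0x74, short & 0xFF])                  # Standard short JE.
    if CPU_LEVEL[cpu] >= CPU_LEVEL["386"]:
        rel16 = (target - (address + 4)) & 0xFFFF
        return b"\x0F\x84" + struct.pack("<H", rel16)       # Standard near JE (386 and newer).
    rel16 = (target - (address + 5)) & 0xFFFF               # Relative to the end of the proxy-jump.
    return bytes([0x75, 0x03, 0xE9]) + struct.pack("<H", rel16)  # JNE over the 3-byte JMP near.

print(je_16bit(0x0002, 0x011B, "386").hex().upper())  # 0F841501
print(je_16bit(0x000F, 0x011B, "086").hex().upper())  # 7503E90701
```

The inverted condition is obtained by flipping the lowest bit of the opcode (74h JE becomes 75h JNE).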
↑ PUSH, POP, INC, DEC multiple operands

In many assemblers instructions PUSH, POP, INC, DEC may have just one operand. €ASM does not limit the number of operands; they are performed one by one in the specified order. If an instruction modifier or suffix is used, it applies to all operands.

|00000000:57FF370FA06A04 | PUSH EDI,[EDI],FS,4
|00000007:590FA18F0658 | POP ECX,FS,[ESI],EAX
|0000000D:40FF07 | INC EAX,[EDI],DATA=DWORD
|00000010:48664AFEC9 | DEC EAX,DX,CL

↑ AAD, AAM operand

Instructions AAD and AAM use radix 10 by default for adjusting AL before division or after multiplication of unpacked binary-coded decimal (BCD) numbers. In €ASM they accept an optional 8-bit immediate operand, for instance AAD 16.

|00000000:D40A | AAM
|00000002:D40A | AAM 10
|00000004:D410 | AAM 16
|00000006:D50A | AAD
|00000008:D50A | AAD 10
|0000000A:D510 | AAD 16
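
What the immediate operand changes at run time can be shown with a Python model of both instructions. The sketch follows the architectural definition of AAM and AAD (flags are not modelled); the function names are hypothetical.

```python
def aam(al, base=10):
    """AAM base: AH = AL // base, AL = AL % base. Returns (AH, AL)."""
    return al // base, al % base

def aad(ah, al, base=10):
    """AAD base: AL = (AH * base + AL) modulo 256, AH = 0. Returns (AH, AL)."""
    return 0, (ah * base + al) & 0xFF

print(aam(53))         # (5, 3)
print(aad(5, 3))       # (0, 53)
print(aam(0xAB, 16))   # (10, 11)
```

With the operand 16, AAM simply splits AL into its two hexadecimal digits.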

↑ TEST a register by itself

When both operands in TEST instruction specify the same register, the second operand may be omitted.

↑ Shift and Rotate 2nd operand

When the number of bits to rotate or shift in instructions RCL, ROL, SAL, SHL, RCR, ROR, SAR, SHR is equal to 1, the second operand may be omitted.

|00000000:85D2 | TEST EDX,EDX
|00000002:85D2 | TEST EDX ; Operand2 of TEST is by default identical with Operand1.
|00000004: |
|00000004:D1D0 | RCL EAX,1
|00000006:D1D0 | RCL EAX ; Omitted rotate or shift count defaults to 1.
|00000008:D165F8 | SHL [EBP-8],1,DATA=DWORD
|0000000B:D165F8 | SHL [EBP-8],DATA=DWORD
↑ No-operation

An instruction which does nothing (no-operation) except take some time and increment the instruction-pointer register is implemented in all x86 processors as the one-byte NOP, actually XCHG rAX,rAX (opcode 0x90). With Pentium II (CPU=686) Intel proposed dedicated multibyte no-operation instructions with opcodes 0x18..0x1F prefixed with 0x0F. A multibyte NOP is more suitable for alignment purposes than a series of one-byte NOPs, as it is fetched and executed at once. On older CPUs this real NOP must be emulated with legacy instructions, e.g. XCHG reg,reg or LEA reg,[reg].

[Sandpile] and [NasmInsns] define the real-NOP mnemonics as undocumented instructions HINT_NOP0, HINT_NOP1, HINT_NOP2..63 with one memory operand of the desired length. Instead of cluttering the instruction list with 64 new mnemonics, €ASM implements just one mnemonic HINT_NOP (suffixable as HINT_NOPW, HINT_NOPD, HINT_NOPQ) with the ordinal number defined in the first immediate operand, and the memory specification moved to the second operand.

|00000000:0F18D9 | HINT_NOP 03q,ECX
|00000003:660F18E1 | HINT_NOP 04q,CX
|00000007:66670F182C | HINT_NOPW 05q,[SI]
|0000000C:66670F187400 | HINT_NOPW 06q,[SI],DISP=BYTE
|00000012:0F18BE00000000 | HINT_NOPD 07q,[ESI],DISP=DWORD
|00000019:0F19043500000000 | HINT_NOPD 10q,[1*ESI],DISP=DWORD,SCALE=VERBATIM
|00000021: |
|00000021:90 | NOP1
|00000022:6690 | NOP2
|00000024:0F1F00 | NOP3
|00000027:0F1F4000 | NOP4
|0000002B:0F1F442000 | NOP5
|00000030:660F1F442000 | NOP6
|00000036:0F1F8000000000 | NOP7
|0000003D:0F1F842000000000 | NOP8
|00000045:660F1F842000000000 | NOP9

Besides that, €ASM implements operandless instructions NOP1, NOP2, NOP3, NOP4, NOP5, NOP6, NOP7, NOP8, NOP9 which occupy the specified number of bytes, respecting the current CPU mode and level:

No-operation encoding
Mnemonic | Operation code (hexa) | Equivalent instruction in €ASM syntax
16-bit mode, CPU=086
NOP1 | 90 | XCHG AX,AX
NOP2 | 87C9 | XCHG CX,CX
NOP3 | 9087C9 | XCHG AX,AX ; XCHG CX,CX
NOP4 | 87C987D2 | XCHG CX,CX ; XCHG DX,DX
NOP5 | 9087C987D2 | XCHG AX,AX ; XCHG CX,CX ; XCHG DX,DX
NOP6 | 87C987D287DB | XCHG CX,CX ; XCHG DX,DX ; XCHG BX,BX
NOP7 | 9087C987D287DB | XCHG AX,AX ; XCHG CX,CX ; XCHG DX,DX ; XCHG BX,BX
NOP8 | 87C987D287DB87E4 | XCHG CX,CX ; XCHG DX,DX ; XCHG BX,BX ; XCHG SP,SP
NOP9 | 9087C987D287DB87E4 | XCHG AX,AX ; XCHG CX,CX ; XCHG DX,DX ; XCHG BX,BX ; XCHG SP,SP
16-bit mode, CPU=686
NOP1 | 90 | NOP DATA=WORD
NOP2 | 6690 | OTOGGLE NOP
NOP3 | 666790 | OTOGGLE ATOGGLE NOP
NOP4 | 670F1F00 | NOP [EAX],DATA=WORD
NOP5 | 670F1F4000 | NOP [EAX],DATA=WORD,DISP=BYTE
NOP6 | 670F1F442000 | NOP [EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=BYTE
NOP7 | 66670F1F442000 | NOP [EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP8 | 670F1F8000000000 | NOP [EAX],DATA=WORD,DISP=DWORD
NOP9 | 670F1F842000000000 | NOP [EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=DWORD
32-bit mode, CPU=386
NOP1 | 90 | XCHG EAX,EAX,DATA=DWORD
NOP2 | 6690 | XCHG AX,AX,DATA=WORD
NOP3 | 8D4000 | LEA EAX,[EAX],DATA=DWORD
NOP4 | 8D442000 | LEA EAX,[EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP5 | 3E8D442000 | LEA EAX,[DS:EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP6 | 8D8000000000 | LEA EAX,[EAX],DATA=DWORD,DISP=DWORD
NOP7 | 8D842000000000 | LEA EAX,[EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=DWORD
NOP8 | 3E8D842000000000 | LEA EAX,[DS:EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=DWORD
NOP9 | 663E8D842000000000 | LEA AX,[DS:EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=DWORD
32-bit mode, CPU=686
NOP1 | 90 | NOP DATA=DWORD
NOP2 | 6690 | NOP DATA=WORD
NOP3 | 0F1F00 | NOP [EAX],DATA=DWORD
NOP4 | 0F1F4000 | NOP [EAX],DATA=DWORD,DISP=BYTE
NOP5 | 0F1F442000 | NOP [EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP6 | 660F1F442000 | NOP [EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=BYTE
NOP7 | 0F1F8000000000 | NOP [EAX],DATA=DWORD,DISP=DWORD
NOP8 | 0F1F842000000000 | NOP [EAX+0*EAX],DATA=DWORD,SCALE=VERBATIM,DISP=DWORD
NOP9 | 660F1F842000000000 | NOP [EAX+0*EAX],DATA=WORD,SCALE=VERBATIM,DISP=DWORD
64-bit mode, CPU=X64
NOP1 | 90 | NOP DATA=DWORD
NOP2 | 6690 | NOP DATA=WORD
NOP3 | 0F1F00 | NOP [RAX],DATA=DWORD
NOP4 | 0F1F4000 | NOP [RAX],DATA=DWORD,DISP=BYTE
NOP5 | 0F1F442000 | NOP [RAX+0*RAX],DATA=DWORD,SCALE=VERBATIM,DISP=BYTE
NOP6 | 660F1F442000 | NOP [RAX+0*RAX],DATA=WORD,SCALE=VERBATIM,DISP=BYTE
NOP7 | 0F1F8000000000 | NOP [RAX],DATA=DWORD,DISP=DWORD
NOP8 | 0F1F842000000000 | NOP [RAX+0*RAX],DATA=DWORD,SCALE=VERBATIM,DISP=DWORD
NOP9 | 660F1F842000000000 | NOP [RAX+0*RAX],DATA=WORD,SCALE=VERBATIM,DISP=DWORD
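
The table can be used to fill a gap of arbitrary size with as few instructions as possible. The following Python sketch does so with the encodings of the 32-bit mode, CPU=686 rows. It is a hypothetical helper for illustration; whether €ASM itself pads alignment gaps in exactly this way is not stated in this chapter.

```python
# Operation codes copied from the "32-bit mode, CPU=686" rows of the table.
NOP_686_32 = {
    1: "90", 2: "6690", 3: "0F1F00", 4: "0F1F4000", 5: "0F1F442000",
    6: "660F1F442000", 7: "0F1F8000000000", 8: "0F1F842000000000",
    9: "660F1F842000000000",
}

def padding(length):
    """Fill `length` bytes with NOP9 instructions and one shorter NOPn for the rest."""
    code = bytearray()
    while length > 0:
        chunk = min(length, 9)                     # The longest available NOP is 9 bytes.
        code += bytes.fromhex(NOP_686_32[chunk])
        length -= chunk
    return bytes(code)

print(padding(13).hex().upper())  # NOP9 followed by NOP4.
```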

↑ PINSR register source

Instructions PINSRB, PINSRW, PINSRD (insert Byte/Word/Dword into the destination register XMM) accept as the source register (operand 2) not only a GPR with the corresponding width, but any wider register. Only the lowest byte, word or dword from this register is used.

|00000000:660F3A20C902 | PINSRB XMM1,CL,2
|00000006:660F3A20C902 | PINSRB XMM1,CX,2
|0000000C:660F3A20C902 | PINSRB XMM1,ECX,2
|00000012: |
|00000012:660FC4C902 | PINSRW XMM1,CX,2
|00000017:660FC4C902 | PINSRW XMM1,ECX,2
↑ BLENDVPS, BLENDVPD, PBLENDVB 3rd operand

Instructions for variable blending use the fixed implied register XMM0 as a mask register. €ASM allows explicit specification of XMM0 as the third operand.

|00000000:660F3815CA | BLENDVPD XMM1,XMM2
|00000005:660F3815CA | BLENDVPD XMM1,XMM2,XMM0
|0000000A: |
|0000000A:660F3814CA | BLENDVPS XMM1,XMM2
|0000000F:660F3814CA | BLENDVPS XMM1,XMM2,XMM0
|00000014: |
|00000014:660F3810CA | PBLENDVB XMM1,XMM2
|00000019:660F3810CA | PBLENDVB XMM1,XMM2,XMM0
↑ MASKMOVQ, MASKMOVDQU 1st operand

Maskable copy to memory uses [DS:rDI] as the fixed destination. €ASM allows explicit specification of the destination memory as the optional first operand.

|00000000:0FF7CA | MASKMOVQ MM1,MM2
|00000003:0FF7CA | MASKMOVQ [DS:EDI],MM1,MM2 ; Default destination is [DS:EDI].
|00000006:260FF7CA | MASKMOVQ [ES:EDI],MM1,MM2 ; Segment override.
|0000000A: |
|0000000A:660FF7CA | MASKMOVDQU XMM1,XMM2
|0000000E:660FF7CA | MASKMOVDQU [DS:EDI],XMM1,XMM2 ; Default destination is [DS:EDI].
|00000012:26660FF7CA | MASKMOVDQU [ES:EDI],XMM1,XMM2 ; Segment override.
↑ VERR, VERW, LAR, LSL

The segment descriptor in system instructions VERR, VERW (operand 1) and LAR, LSL (operand 2) may be specified as a 16-bit memory variable or as a 16, 32 or 64-bit GPR (only the lower 16 bits are used).

|00000000:0F00E6 | VERR SI
|00000003:0F00E6 | VERR ESI
|00000006: |
|00000006:0F00EE | VERW SI
|00000009:0F00EE | VERW ESI
|0000000C: |
|0000000C:660F02C6 | LAR AX,SI
|00000010:660F02C6 | LAR AX,ESI
|00000014:0F02C6 | LAR EAX,SI
|00000017:0F02C6 | LAR EAX,ESI
|0000001A: |
|0000001A:660F03C6 | LSL AX,SI
|0000001E:660F03C6 | LSL AX,ESI
|00000022:0F03C6 | LSL EAX,SI
|00000025:0F03C6 | LSL EAX,ESI

↑ Undocumented instructions

€ASM supports a few instructions which are not documented in the official specifications published by the CPU manufacturers. They may not work with all processor generations and they require the explicit feature EUROASM UNDOC=ENABLED.

For more information see instruction handlers BB0_RESET, CMPXCHG486, F4X4, FCOM2, FCOMP5, FFREEP, FMUL4X4, FNSETPM, FRSTPM, FSBP1, FSBP2, FSBP3, FSTDW, FSTP1, FSTP8, FSTP9, FSTSG, FXCH4, FXCH7, HCF, HINT_NOP, IBTS, ICEBP, INT1, JMPE, LOADALL, LOADALL286, PREFETCHWT1, PSRAQ, SAL2, SALC, SETALC, SMINTOLD, TEST2, UD0, UD1, UD2A, UMOV, XBTS, VLDQQU.

↑ Pseudoinstructions

ALIGN ↓

D, DB, DU, DW, DD, DQ, DT, DO, DY, DZ, DI, DS ↓

ENDHEAD ↓

ENDP ↓

ENDP1 ↓

ENDPROC ↓

ENDPROC1 ↓

ENDPROGRAM ↓

ENDSTRUC ↓

EQU ↓

= ↓

EUROASM ↓

EXTERN ↓

EXPORT ↓

GLOBAL ↓

GROUP ↓

HEAD ↓

IMPORT ↓

INCLUDE ↓

INCLUDE1 ↓

INCLUDEBIN ↓

INCLUDEHEAD ↓

INCLUDEHEAD1 ↓

LINK ↓

PROC ↓

PROC1 ↓

PROGRAM ↓

PUBLIC ↓

SEGMENT ↓

STRUC ↓

%COMMENT ↓

%DEBUG ↓

%DISPLAY ↓

%DROPMACRO ↓

%ELSE ↓

%ENDCOMMENT ↓

%ENDFOR ↓

%ENDIF ↓

%ENDMACRO ↓

%ENDREPEAT ↓

%ENDWHILE

%ERROR ↓

%EXITFOR ↓

%EXITMACRO ↓

%EXITREPEAT ↓

%EXITWHILE

%FOR ↓

%IF ↓

%MACRO ↓

%PROFILE ↓

%REPEAT ↓

%SET ↓

%SETA ↓

%SETB ↓

%SETC ↓

%SETE ↓

%SETL ↓

%SETS ↓

%SETX ↓

%SET2 ↓

%SHIFT ↓

%UNTIL ↓

%WHILE

Pseudoinstructions (sometimes also called directives) are orders for the assembler which are formally similar to ordinary machine instructions — many of them may have label field and operands. Some pseudoinstructions (ALIGN and D) can even emit data or code.

Pseudoinstruction names and their keyword operands are case-insensitive.

↑ EUROASM

AUTOALIGN= ↓
AUTOSEGMENT= ↓
CODEPAGE= ↓
CPU= ↓
CPU features ABM=, AES=, AMD=, AVX=, AVX512=, CET=, CYRIX=, D3NOW=, EVEX=, FMA=, FPU=, LWP=, MMX=, MPX=, MVEX=, PRIV=, PROT=, RTM=, SGX=, SHA=, SPEC=, SVM=, TSX=, UNDOC=, VIA=, VMX=, XOP= ↓
DEBUG= ↓
DISPLAYENC= ↓
DISPLAYSTM= ↓
DUMPALL= ↓
DUMP= ↓
DUMPWIDTH= ↓
INCLUDEPATH= ↓
LINKPATH= ↓
LIST= ↓
LISTFILE= ↓
LISTINCLUDE= ↓
LISTMACRO= ↓
LISTREPEAT= ↓
LISTVAR= ↓
MAXINCLUSIONS= ↓
MAXLINKS= ↓
NOWARN= ↓
PROFILE= ↓
RUNPATH= ↓
SIMD= ↓
UNICODE= ↓
WARN= ↓

With the EUROASM pseudoinstruction the programmer controls various settings of EuroAssembler - EUROASM options. Particular options are set with the keyword operands. The same keywords are used in [EUROASM] section of euroasm.ini configuration file.

Options specified with this pseudoinstruction override the default options set in the configuration file. Names of those options are case-insensitive.

Current value can be retrieved in the form of EUROASM system %^variables, for instance InfoMsg DB "This program uses code page %^CODEPAGE.",13,10,0

Options which expect a Boolean value accept the enumerated tokens TRUE, YES, ON, ENABLE, ENABLED or FALSE, NO, OFF, DISABLE, DISABLED (case-insensitive), or a logical expression.
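
The token rule can be sketched in Python as follows. This is only a model of the enumerated tokens listed above; logical expressions are not evaluated here, and the function name parse_boolean is hypothetical.

```python
TRUE_TOKENS = {"TRUE", "YES", "ON", "ENABLE", "ENABLED"}
FALSE_TOKENS = {"FALSE", "NO", "OFF", "DISABLE", "DISABLED"}

def parse_boolean(token):
    """Translate an enumerated Boolean token to True or False, ignoring character case."""
    word = token.strip().upper()
    if word in TRUE_TOKENS:
        return True
    if word in FALSE_TOKENS:
        return False
    raise ValueError(f"{token!r} is not an enumerated Boolean token")

print(parse_boolean("Enabled"))  # True
print(parse_boolean("off"))      # False
```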

Besides the keyword options, the EUROASM pseudoinstruction also recognizes ordinal operand(s) which may have one of two enumerated values, PUSH or POP. €ASM maintains a special option stack, and these two directives save the whole set of EUROASM options to this stack and restore it. This feature is handy in macros which temporarily require some unusual option value. Blindly setting the option in a macro would have a side effect on the statements following the macro invocation, because EUROASM is a switching statement. It is therefore better to save the current options on the stack at the beginning of the macro and restore them at the end, so that other statements are not influenced. Example:

SomeMacro %MACRO  ; Macro definition.
            EUROASM PUSH, NOWARN=2102 ; Store all options to the option-stack and then suppress the warning W2102.
             ; Here go instructions which may emit warning message W2102
             ...
            EUROASM POP ; Restore the option-stack, W2102 is no longer suppressed.
          %ENDMACRO SomeMacro
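
The behaviour of the option stack can be modelled in Python. The sketch below is a conceptual model only, not the €ASM implementation; the class name EuroasmOptions and the option values used are hypothetical.

```python
class EuroasmOptions:
    """The current option set plus a stack of saved snapshots."""

    def __init__(self, **defaults):
        self.current = dict(defaults)
        self._stack = []

    def push(self):
        self._stack.append(dict(self.current))   # Save a copy of the whole option set.

    def pop(self):
        self.current = self._stack.pop()         # Restore the most recently saved set.

    def set(self, **options):
        self.current.update(options)


options = EuroasmOptions(NOWARN=None, AUTOALIGN=True)
options.push()                    # EUROASM PUSH
options.set(NOWARN=2102)          # W2102 is suppressed inside the macro.
options.pop()                     # EUROASM POP
print(options.current["NOWARN"])  # None
```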
↑ AUTOALIGN=

This is a Boolean option; default is AUTOALIGN=ON. Memory variables created or reserved with D pseudoinstruction will be implicitly aligned according to their TYPE#.

Aligned memory variables can be accessed faster. On the other hand, this option may inflate the size of your program if data definitions of various types are mixed frequently. It is better to group data of the same size manually, so that the alignment stuff is used only once per group.
Memory variables defined as literals are always autoaligned, regardless of EUROASM AUTOALIGN= status.

Structured data variables (defined with DS structure_name) do not autoalign by their largest member. They are aligned by the segment width (WORD, DWORD or QWORD) if AUTOALIGN=ENABLED.

Autoalignment does not work inside structure definition.

Programmers should design their structures with respect to the natural alignment of structure members. This is especially important in 64-bit mode, where the API requires all data to be aligned. Structures converted from badly designed 32-bit ones need manually inserted stuff-members which pad DWORD members so that the following members are QWORD-aligned, and which round up the structure size to a multiple of 8. See the WinAPI structure MSG as an example.
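
Where the stuff-members are needed can be found with a little arithmetic, sketched here in Python. It assumes that the natural alignment of a member equals its size, which holds for the power-of-two sizes used below; the member names and the function names are hypothetical.

```python
def align_up(offset, alignment):
    """Round offset up to the nearest multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment

def layout(members, structure_alignment=8):
    """members: list of (name, size). Returns ({name: offset}, total size)."""
    offsets, offset = {}, 0
    for name, size in members:
        offset = align_up(offset, size)          # Natural alignment of the member.
        offsets[name] = offset
        offset += size
    return offsets, align_up(offset, structure_alignment)

offsets, size = layout([("Flags", 4), ("Pointer", 8), ("Count", 4)])
print(offsets)  # {'Flags': 0, 'Pointer': 8, 'Count': 16}
print(size)     # 24
```

The gaps at offsets 4..7 and 20..23 are the places where a DWORD stuff-member has to be inserted manually, because autoalignment does not work inside a structure definition.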

Autoalignment does not apply to machine instructions. If we want to have a procedure aligned to the start of a cache boundary (for better performance), it should be aligned explicitly, for instance Rapid PROC ALIGN=OWORD.

↑ AUTOSEGMENT=

This is a Boolean option; default is AUTOSEGMENT=ON. The section to which the current statement emits is implicitly changed by €ASM according to the purpose of the statement. When more than one section with this purpose is defined in a program, autosegment will switch to the last defined one.

If the statement is a machine instruction, prefix or PROC, €ASM will switch to the last defined CODE section.
Similarly, when the statement defines or reserves data (pseudoinstruction D and its clones, including DI), the current section is switched to the last DATA or BSS section.

Pseudoinstruction ALIGN, macros and all nonemitting operations, such as EQU or a solo label, do not change the current section.

If you rely on autosegmentation, avoid the pitfall when a new section begins with a macro invocation, with an explicit ALIGN or with just a label. These statements will not autoswitch the current section. You may need to insert NOP or PROC to autoswitch to CODE, a DB 0 statement to autoswitch to DATA, or DB to autoswitch to BSS. Example of such a pitfall:
      EUROASM AUTOSEGMENT=ON
Hello PROGRAM FORMAT=PE, ENTRY=Main:
       INCLUDE winapi.htm; Include some basic code macros.
Title  DB "World!",0     ; Correctly autoswitched to [.data].
Main:  StdOutput Title   ; Macro didn't switch to [.text] as desired.
       TerminateProgram
      ENDPROGRAM Hello   ; Hello.exe does not work because its entry is in [.data] section.

The label Main: incorrectly remained in previous [.data] section. Remedy is simple:

      EUROASM AUTOSEGMENT=ON
Hello PROGRAM FORMAT=PE, ENTRY=Main:
       INCLUDE winapi.htm; Include some basic code macros.
Title  DB "World!",0     ; Correctly autoswitched to [.data].
Main:  PROC              ; Correctly autoswitched to [.text].
        StdOutput Title
        TerminateProgram
       ENDPROC Main:
      ENDPROGRAM Hello   ; Hello.exe works as expected.
Each explicit change of current section disables AUTOSEGMENT as a side effect.

AUTOSEGMENT= is a weak option, it is automatically switched off when the programmer changes the current section explicitly with [section_name] in the label field of statement.

If you want to keep AUTOSEGMENT enabled after a manual change of section, you need to explicitly switch it back on with EUROASM AUTOSEGMENT=ON, or save its state using EUROASM PUSH and restore it with EUROASM POP afterwards.
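
The autoswitching rules and the pitfall described in this paragraph can be replayed with a small Python model. It is a deliberately simplified sketch: it knows only one CODE section [.text] and one DATA section [.data], ignores BSS and explicit section changes, and the statement kinds and the function name place_statements are hypothetical.

```python
def place_statements(statements):
    """statements: list of (label, kind). Returns {label: section the label lands in}."""
    current = None                               # No section selected yet.
    placement = {}
    for label, kind in statements:
        if kind in ("instruction", "prefix", "PROC"):
            current = "[.text]"                  # Autoswitch to the CODE section.
        elif kind == "data":
            current = "[.data]"                  # Autoswitch to the DATA section.
        # Macros, ALIGN, EQU and solo labels keep the current section.
        placement[label] = current
    return placement

pitfall = [("Title", "data"), ("Main", "macro")]
remedy = [("Title", "data"), ("Main", "PROC")]
print(place_statements(pitfall)["Main"])  # [.data]
print(place_statements(remedy)["Main"])   # [.text]
```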
↑ CODEPAGE=

€ASM can use Unicode strings at run time, but the data definitions in the source code are defined in bytes. Option CODEPAGE= tells €ASM which code page it should internally use for converting strings in the source text to Unicode at assembly time.

The code page may be specified with a direct 16-bit integer value, as specified by [CodePageMS], for instance CODEPAGE=1253 for the Greek alphabet.

Code page values can also be specified as an enumerated token, such as CODEPAGE=CP852, CODEPAGE=WINDOWS-1252, CODEPAGE=ISO-8859-2 etc.; see DictCodePages for the complete list. These names are case-insensitive.

Even though some of those enumerated codepage constants may look like an arithmetic subtraction, they are recognized as verbatim tokens and not evaluated.

The factory default and recommended value is CODEPAGE=UTF-8. See also Character encoding above.
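
Why the code page matters can be demonstrated outside €ASM with Python's own codecs, which stand in here for the assembler's conversion. The same three source bytes yield different Unicode strings depending on the code page; the use of UTF-16LE as the resulting Unicode form is an assumption of this sketch.

```python
source_bytes = b"\xe1\xe2\xe3"                 # Three bytes as they are stored in a source file.

greek = source_bytes.decode("cp1253")          # Interpreted with code page 1253 (Greek).
latin = source_bytes.decode("cp1252")          # Interpreted with code page 1252 (Western).

print(greek, greek.encode("utf-16-le").hex())  # αβγ b103b203b303
print(latin, latin.encode("utf-16-le").hex())  # áâã e100e200e300
```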

↑ INCLUDEPATH=

When an included file is specified without a path, €ASM will search for this file in the directories which are defined in INCLUDEPATH= option. Paths can be separated with a semicolon ; or comma , and the whole list should be in double quotes. Both backward \ and forward slashes / may be used as folder separator. The last slash can be omitted. Default is INCLUDEPATH="./,./maclib,../maclib,".

This syntax doesn't support directory names which begin or end with a space as a significant part of the name. Such names should be avoided anyway.
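
The syntax of the path list can be illustrated with a Python sketch which splits the option value into directories. It is a model under assumptions: empty items (such as the one after the trailing comma of the default value) are skipped, and surrounding spaces are trimmed, which is why names beginning or ending with a space cannot be expressed. The function name split_path_list is hypothetical.

```python
import re

def split_path_list(option_value):
    """Split a PATH-like option value into directories ending with a forward slash."""
    directories = []
    for item in re.split(r"[;,]", option_value.strip('"')):
        item = item.strip().replace("\\", "/")   # Both kinds of slashes are accepted.
        if not item:
            continue                             # Assumption: empty items are ignored.
        directories.append(item if item.endswith("/") else item + "/")
    return directories

print(split_path_list('"./,./maclib,../maclib,"'))  # ['./', './maclib/', '../maclib/']
```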
↑ LINKPATH=

When a linked file is specified without a path, €ASM will search for this file in the directories which are defined in LINKPATH= option. Paths can be separated with semicolon ; or comma , and the whole list should be in double quotes. Both backward \ and forward slashes / may be used as folder separator. The last slash can be omitted. Default is LINKPATH="./,./objlib,../objlib,".

↑ RUNPATH=

When a dynamic shared object (ELFSO module) is specified without a path, Linux dynamic linker will search for this file in the directories which are defined in RUNPATH= option. Paths can be separated with semicolon ; or comma , and the whole list should be in double quotes. Both backward \ and forward slashes / may be used as folder separator. The last slash can be omitted. Default is RUNPATH="./,./objlib,../objlib,".

↑ MAXINCLUSIONS=

Parameter MAXINCLUSIONS limits the maximal number of successful executions of INCLUDE* statements in an €ASM source. This prevents the assembler from exhausting its resources in the case of a recursive inclusion loop.

Default value is EUROASM MAXINCLUSIONS=64.

↑ MAXLINKS=

Parameter MAXLINKS limits the maximal number of files specified by LINK statements in an €ASM source. This prevents the assembler from exhausting its resources in the case of a recursive link loop.

Default value is EUROASM MAXLINKS=64.

↑ Processor generation option CPU=

Not all IA-32 machine instructions are available on all types of Central Processing Unit (CPU). This EUROASM option specifies the minimal type of CPU which the program is intended for. Possible CPU= values are
086 alias 8086,
186,
286,
386,
486,
586 alias PENTIUM,
686 alias P6,
X64.
The default is EUROASM CPU=586. A 64-bit program should have EUROASM CPU=X64 enabled.

EuroAssembler assumes that each later CPU also supports all instructions of the previous CPU generations.
↑ Processor features

This group of EUROASM boolean options tells €ASM which CPU features are required on the target computer. By default all options are switched OFF; you should explicitly enable each capability which you intend to program for.

ABM= assembly of Advanced Bit Manipulation instructions.

AES= assembly of Intel's Advanced Encryption Standard (AESNI) instructions.

AMD= instructions specific for AMD CPU manufacturer.

CET= Control-flow Enforcement Technology instructions.

CYRIX= instructions specific for CYRIX CPU manufacturer.

D3NOW= assembly of AMD 3DNow! instructions.

EVEX= assembly of Intel's EVEX-encoded AVX-512 instructions.

FMA= assembly of Fused Multiply-Add instructions.

FPU= assembly of Floating-Point Unit instructions (math coprocessor).

LWP= assembly of AMD's LightWeight Profiling instructions.

MMX= assembly of MultiMedia Extensions.

MPX= assembly of Memory Protection Extensions.

MVEX= assembly of Intel's MVEX-encoded AVX-512 instructions.

PRIV= assembly of privileged mode instructions.

PROT= assembly of protected mode instructions.

SGX= assembly of Software Guard Extensions.

SHA= assembly of Intel's Secure Hash Algorithm instructions.

SPEC= assembly of other special instructions.

SVM= assembly of AMD's Secure Virtual Machine instructions.

TSX= assembly of Intel's Transactional Synchronization Extensions.

UNDOC= assembly of undocumented instructions.

VIA= instructions specific for VIA Geode CPU manufacturer.

VMX= assembly of Virtual Machine Extensions.

XOP= assembly of AMD's XOP-encoded AVX instructions.

↑ Streaming SIMD Extension generation option SIMD=

This option defines which Single Instruction Multiple Data (SIMD) generation is required to assemble the following instructions. Possible enumerated values are
SSE1 alias SSE alias boolean true,
SSE2,
SSE3,
SSSE3,
SSE4,
SSE4.1,
SSE4.2,
AVX,
AVX2,
AVX512.
Default value is SIMD=DISABLED (no SIMD instructions are expected).

The options CPU generation, CPU features and SIMD generation do not prevent €ASM from assembling instructions for a higher CPU, but a warning is issued when an instruction requires some capability which is currently not enabled with EUROASM. This should warn you that your program may not run on every PC, or that you may have made a typo in an instruction mnemonic.
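As a hedged illustration of how the three groups of options combine, the following fragment enables a 64-bit CPU generation, the AVX2 SIMD generation and two feature options before using instructions which depend on them. The instructions are only examples; which option each mnemonic actually requires is an assumption based on its usual classification.

```asm
    EUROASM CPU=X64, SIMD=AVX2, AES=ON, FMA=ON
    AESENC XMM1,XMM2             ; AES-NI instruction, expected to require AES=ON.
    VFMADD231PS YMM0,YMM1,YMM2   ; Fused multiply-add, expected to require FMA=ON and an AVX generation.
    VPADDD YMM3,YMM4,YMM5        ; 256-bit integer addition, expected to require SIMD=AVX2.
    EUROASM AES=OFF              ; From here on, AES instructions should raise the warning again.
```

Enabling only what the program really needs keeps the warning useful as a check against accidentally used instructions.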
↑ DISPLAYSTM=
↑ DISPLAYENC=

These boolean options are designed for debugging of the assembly process, see also the pseudoinstruction %DISPLAY. When enabled, €ASM inserts a diagnostic message below each assembled statement, which displays how the statement is parsed into fields and which modifiers were used for the instruction encoding. Example:

    EUROASM DISPLAYSTM=ON
.L: MOV EAX,[ESI+16],ALIGN=DWORD
    EUROASM DISPLAYSTM=OFF, DISPLAYENC=ON
    LEA EDX,[ESI+16]
    ADD EAX,EDX

Listing of the previous example is here:

|                         |    EUROASM DISPLAYSTM=ON
|00000000:8B4610          |.L: MOV EAX,[ESI+16],ALIGN=DWORD
|# D1010 **** DISPLAYSTM ".L: MOV EAX,[~~ALIGN=DWORD "
|# D1020 label=".L"
|# D1040 machine operation="MOV"
|# D1050 ordinal operand number=1,value="EAX"
|# D1050 ordinal operand number=2,value="[ESI+16]"
|# D1060 keyword operand,name="ALIGN",value="DWORD"
|                         |    EUROASM DISPLAYSTM=OFF, DISPLAYENC=ON
|# D1010 **** DISPLAYSTM "EUROASM DISPL~~SPLAYENC=ON "
|# D1040 pseudo operation="EUROASM"
|# D1060 keyword operand,name="DISPLAYSTM",value="OFF"
|# D1060 keyword operand,name="DISPLAYENC",value="ON"
|00000003:8D5610          |    LEA EDX,[ESI+16]
|# D1080 Emitted size=3,DATA=DWORD,DISP=BYTE,SCALE=SMART,ADDR=ABS.
|00000006:01D0            |    ADD EAX,EDX
|# D1080 Emitted size=2,CODE=SHORT,DATA=DWORD.
↑ DUMP=
↑ DUMPWIDTH=
↑ DUMPALL=

Options DUMP=, DUMPWIDTH= and DUMPALL= control how the dump column with emitted code is presented in listing.

The boolean option DUMP= can switch off the dump completely, the listing copies the input source almost verbatim in this case. Default is DUMP=ON.

DUMPWIDTH= sets the width of the dump column in the €ASM listing. This option specifies how many characters of dumped data will fit between the starting | and the ending |, including those two border characters. Default value is DUMPWIDTH=27, which is enough for an 8-byte instruction.

Accepted dump width value is between 16 and 128 characters.

Dump data consists of an offset (4 or 8 hexadecimal characters, depending on section width), separator : and 2 hexadecimal digits per each byte of generated code.

When the generated code is too long to fit into the dump column, the boolean option DUMPALL= decides whether the rest will be omitted (the omission is indicated by a tilde ~ in place of the last character), or whether additional lines will be inserted into the listing until all generated code is dumped. Factory default is DUMPALL=OFF.

See also the description of listing file.

Be careful when setting DUMPALL=ON with long duplicated data definition, such as DB 2048 * B 0, because this may clutter the listing with many lines of the useless dump.
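The default width can be checked by simple arithmetic, assuming an 8-digit offset as in the listing examples of this manual: 27 characters minus 2 borders, minus 8 offset digits, minus 1 separator leaves 16 hexadecimal digits, i.e. 8 bytes of code. The following sketch widens the column for a longer data line; the figures in its comments rely on the same assumption.

```asm
    EUROASM DUMPWIDTH=43, DUMPALL=OFF ; 43-2-8-1 = 32 hexadecimal digits = 16 bytes per dump line.
    DB "0123456789ABCDEF"             ; 16 bytes, expected to fit into one dump line without the tilde.
    EUROASM DUMPWIDTH=27              ; Back to the factory default (8 bytes per dump line).
```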
↑ LISTFILE=

This option defines the name of the listing file. By default it is LISTFILE="%^SourceName%^SourceExt.lst", i.e. it copies the name and extension of the source file and appends .lst to it.
If not specified otherwise, the listing is always created in the same directory as the corresponding source file.

↑ LIST=
↑ LISTINCLUDE=
↑ LISTMACRO=
↑ LISTREPEAT=
↑ LISTVAR=

The LIST* family of options controls what should be copied to the listing file. The boolean option LIST=OFF will suppress the generation of the listing until it is switched on again. Default is LIST=ON.
Notice that switching off even a minor part of the listing makes the listing file no longer usable as a source file, because some parts of the original source are not copied to the listing.

The contents of included files are by default omitted from the listing (LISTINCLUDE=OFF). When this option is ON, the INCLUDE statement will be replaced by the contents of the file.

LISTMACRO= controls whether the instructions from macro expansion go to the listing. Default state is LISTMACRO=OFF and only the invocation of macroinstruction is presented.

EUROASM option LISTREPEAT= is similar to LISTMACRO= with the difference that it controls listing of statements expanded in %FOR, %WHILE and %REPEAT blocks.

When a preprocessing %variable is used in the statement and the option LISTVAR=ON, the statement is duplicated in the form of a machine comment just below the original statement and the expanded text is shown instead of %variables. Factory default is LISTVAR=OFF.

See also the description of listing file above.
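A sketch of how these switches might be combined in practice: the listing is suppressed around an uninteresting block, and macro expansions are shown only where they are being debugged. MyMacro is a hypothetical macro name used only for illustration.

```asm
    EUROASM LIST=OFF                   ; Do not copy the following block to the listing.
    ; ... long tables of data ...
    EUROASM LIST=ON                    ; Resume the listing.
    EUROASM LISTMACRO=ON, LISTVAR=ON   ; Show the expanded macro body and expanded %variables.
    MyMacro EAX                        ; Hypothetical macro invocation under inspection.
    EUROASM LISTMACRO=OFF, LISTVAR=OFF ; Restore the factory defaults.
```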

↑ UNICODE=

UNICODE= determines the character width. This boolean option specifies whether the data definition of a string with unspecified type, such as D "an explicit string" or ="a literal string", should be treated as a sequence of bytes (8-bit characters) or unichars (16-bit characters).

The system variable %^UNICODE is checked in macros or structure definitions which have different versions for ANSI (8-bit) or WIDE (16-bit) string encoding.
It is also consulted in macros WinAPI (32-bit) and WinABI (64-bit) to determine which version of Windows API function (ANSI or WIDE) should be invoked.

Some string-handling macros and WinAPI functions expect the string size to be specified in characters rather than in bytes. Attribute operation SIZE# always returns the size of its operand in bytes. This can be solved by testing the system variable %^UNICODE:

aString D "String" ; Symbol aString defines 6 bytes if UNICODE=OFF or 12 bytes if UNICODE=ON.
  %IF %^UNICODE  ; WIDE version of aString.
     MOV ECX, SIZE# aString / 2
  %ELSE          ; ANSI version of aString.
     MOV ECX,SIZE# aString
  %ENDIF         ; ECX is now loaded with the number of characters in aString.

A trickier but more elegant solution exploits the fact that %^UNICODE (like all other boolean system %^variables) expands to either 0 or -1, and that a shift left by a negative value is calculated as a shift right by the negated value. When %^UNICODE is -1, the size in bytes is shifted to the right by 1 bit, which is equivalent to division by two.

aString D "String" ; Symbol aString defines 6 bytes if UNICODE=OFF or 12 bytes if UNICODE=ON.
  MOV ECX, SIZE# aString << %^UNICODE  ; ECX is now loaded with the number of characters in aString.
↑ DEBUG=

This boolean option specifies whether a debug version should be assembled. When EUROASM DEBUG=ENABLED, the linker includes the symbol table and|or other debugging information in the output program. Macros can change their behaviour depending on the condition %IF %^DEBUG.

The final release of your programs should be assembled with this option turned off.

↑ PROFILE=

This boolean option specifies whether a profileable version should be assembled. Profiling is not implemented yet in this version of EuroAssembler.

The final release should be assembled with this option turned off.

↑ WARN=
↑ NOWARN=

Options WARN= and NOWARN= control which informative and warning messages will be issued in the assembly process. With NOWARN= it is possible to suppress anticipated messages with an identification number below 4000. Suppressed warnings have no effect on the final errorlevel. User-generated warnings (U5000..U5999) and errors with higher severity cannot be suppressed.

The value of the option is either a number or a range of numbers, which shouldn't exceed 3999. WARN= and NOWARN= operands may repeat in a statement; they are processed from left to right. For instance EUROASM NOWARN=0600..0999, WARN=705 will suppress informative messages I0600 to I0999 except for the message I0705, which remains enabled.

The default value is WARN=0..3999 (all messages are enabled).
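Because the operands are processed from left to right, their order matters. The following sketch reuses the message numbers from the text above to contrast both orders.

```asm
    EUROASM NOWARN=0600..0999, WARN=705 ; I0600..I0999 are suppressed, then I0705 is enabled again.
    EUROASM WARN=705, NOWARN=0600..0999 ; I0705 is enabled first, then suppressed with the whole range.
    EUROASM WARN=0..3999                ; Restore the default: all messages are enabled.
```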


↑ PROGRAM

↑ ENDPROGRAM

DLLCHARACTERISTICS= ↓
ENTRY= ↓
FILEALIGN= ↓
FORMAT= ↓
ICONFILE= ↓
IMAGEBASE= ↓
LISTLITERALS= ↓
LISTGLOBALS= ↓
LISTMAP= ↓
MAJORIMAGEVERSION= ↓
MAJORLINKERVERSION= ↓
MAJOROSVERSION= ↓
MAJORSUBSYSTEMVERSION= ↓
MAXEXPANSIONS= ↓
MAXPASSES= ↓
MINORIMAGEVERSION= ↓
MINORLINKERVERSION= ↓
MINOROSVERSION= ↓
MINORSUBSYSTEMVERSION= ↓
MODEL= ↓
OUTFILE= ↓
SECTIONALIGN= ↓
SIZEOFHEAPCOMMIT= ↓
SIZEOFHEAPRESERVED= ↓
SIZEOFSTACKCOMMIT= ↓
SIZEOFSTACKRESERVED= ↓
STUBFILE= ↓
SUBSYSTEM= ↓
TIMESTAMP= ↓
WIDTH= ↓
WIN32VERSIONVALUE= ↓

Pseudoinstructions PROGRAM and ENDPROGRAM delimit a block of source code which creates a standalone output file. In most other assemblers it is the whole source file which creates the output file; it is sometimes called a module or a unit of compilation. For instance, the command nasm -f win32 HelloWorld.asm -o HelloWorld.obj tells Netwide Assembler to create a COFF output file HelloWorld.obj. In €ASM more than one output file can be created with the command euroasm HelloWorld.asm, provided that there are more PROGRAM / ENDPROGRAM blocks in HelloWorld.asm.

The label of PROGRAM statement represents the name of output program. Although it does not define a symbol, its name must follow the rules for symbol names, that is at least one letter followed with letters and digits. The same identifier may be used as the first and only operand in the corresponding ENDPROGRAM statement.

One source may contain more program blocks and the blocks may nest. Each program block assembles to a different output file.
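A minimal sketch of one source with two program blocks follows. The program names and the placeholder bodies are assumptions made for illustration; the expected output names follow from the default OUTFILE= and from the FORMAT= extensions.

```asm
Hello16 PROGRAM FORMAT=COM              ; Expected output file Hello16.com, ENTRY= defaults to 256.
         RET                            ; Placeholder body.
        ENDPROGRAM Hello16
Hello32 PROGRAM FORMAT=PE, ENTRY=Start: ; Expected output file Hello32.exe.
Start:   RET                            ; Placeholder body.
        ENDPROGRAM Hello32
```

Assembling such a source with a single euroasm command should produce both files in one session.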

Symbols defined in the program are not visible outside the block. When a program needs to call a label from another program, the label must be marked as public in one program and as extern in the other, even when both programs lie in the same source file or when one program is nested in the other.

Preprocessing %variables, macro definitions and Euroasm options, on the other hand, are visible throughout the source and they can transfer information between programs at assembly time. See the sample program LockTest as an example.

The PROGRAM pseudoinstruction has many important keyword operands which specify properties of the output file. The same keywords are used in [PROGRAM] division of euroasm.ini configuration file.

The values of all PROGRAM options can be inspected as system %^variables at assembly-time. For instance in the message InfoMsg DB "This is a %^WIDTH-bit program.",13,10,0 the system variable %^WIDTH will be replaced with the actual width of the program (16, 32 or 64), it could be tested with %IF %^Width <> 64 etc.

Unlike EUROASM options, which involve only a part of source, PROGRAM options involve the whole program en bloc. We cannot have a half of the program in a graphic subsystem, and another half in a console subsystem, for instance. That is why options LISTMAP=, LISTGLOBALS=, LISTLITERALS= are properties of pseudoinstruction PROGRAM, but LISTINCLUDE=, LISTMACRO=, LISTREPEAT=, LISTVAR= are properties of pseudoinstruction EUROASM.
↑ FORMAT=

Format and file-extension of the output file is determined with this PROGRAM's parameter.

€ASM output file formats
FORMAT=  Default output   Default program  Default memory  Description
         file extension   width            model
BIN      .bin             16-bits          TINY            Binary file
BOOT     .sec             16-bits          TINY            Bootable file
COM      .com             16-bits          TINY            DOS/CPM executable file
ELF      .o               32-bits          FLAT            Linux relocatable object file
ELFX     .x               32-bits          FLAT            Linux executable file
ELFSO    .so              32-bits          FLAT            Linux dynamic shared object file
OMF      .obj             16-bits          SMALL           Object Module Format
LIBOMF   .lib             16-bits          SMALL           Object library in OMFormat
MZ       .exe             16-bits          SMALL           DOS executable file
COFF     .obj             32-bits          FLAT            Common Object File Format
LIBCOF   .lib             32-bits          FLAT            Object library in COFFormat
PE       .exe             32-bits          FLAT            Windows Portable Executable file
DLL      .dll             32-bits          FLAT            Windows Dynamic Linked Library

See also Program formats for more details.

↑ WIDTH=

This parameter specifies the operating mode of the program.

Program width also defines the default width for all its segments. Its value is a numeric expression which evaluates to 16, 32, 64, or to 0. Empty or zero value (factory default) specifies that program width should be set internally by €ASM according to its FORMAT=. Nevertheless, when a segment is defined, it may specify a different width, regardless of the default width of its program. €ASM doesn't protest against mixing 16-bit and 32-bit segments in one module.

↑ MODEL=

Memory model describes sizes and distances of code and data, and the number of code and noncode segments. The main function of memory model specification is to set the default distance for segments and procedures defined in the program.

Program property MODEL= is taken into account in procedure pseudoinstructions (PROC, PROC1) and in control-transfer instructions (JMP, CALL, RET) without explicitly specified distance.
In monocode models (TINY,SMALL,COMPACT,FLAT) the default transfer distance is NEAR.
In multicode models (MEDIUM,LARGE,HUGE) the default transfer distance is FAR.
In monodata models (TINY,SMALL,MEDIUM,FLAT) all data are addressed relative to the start of the data segment.
In multidata models (COMPACT,LARGE,HUGE) it is the programmer's responsibility to load the used segment register with the paragraph address of the data before they are accessed.

Properties implied by memory model
         Default segment properties   Link properties          Usual usage
MODEL=   CODE      DATA      Segm.    Multi-  Multi-  Segm.    CPU        Used in
         distance  distance  width    code    data    overlap  mode       formats
TINY     NEAR      NEAR      16       no      no      yes      real       COM
SMALL    NEAR      NEAR      16       no      no      no       real       MZ, OMF
MEDIUM   FAR       NEAR      16       yes     no      no       real       MZ, OMF
COMPACT  NEAR      FAR       16       no      yes     no       real       MZ, OMF
LARGE    FAR       FAR       16       yes     yes     no       real       MZ, OMF
HUGE     FAR       FAR       32       yes     yes     no       real       MZ, OMF
FLAT     NEAR      NEAR      32,64    no      no      yes      protected  ELF, PE, DLL, COFF
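The effect of a multicode model on default distances can be sketched as follows. The program name, labels and body are assumptions made for illustration; the comments state the behaviour expected from the rules above rather than verified output.

```asm
Big   PROGRAM FORMAT=MZ, MODEL=MEDIUM, ENTRY=Main:
Main:  CALL Sub:        ; No distance specified: a FAR call is expected in a multicode model.
       MOV AX,0x4C00    ; DOS function "terminate program".
       INT 0x21
Sub:   PROC             ; DIST= not specified: it should default to FAR in MODEL=MEDIUM.
       RET              ; Expected to assemble as a far return.
      ENDPROC Sub:
      ENDPROGRAM Big
```

With MODEL=SMALL the same source would be expected to use near calls and returns instead.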
↑ SUBSYSTEM=

Subsystem is a numeric identifier in the header of Portable Executable file. This parameter specifies whether MS-Windows should create a new console when the PE program starts. The default is SUBSYSTEM=CON. Set it to SUBSYSTEM=GUI when your PE program creates graphical windows rather than using the standard text input and output. Value of subsystem is one of the enumerated tokens from the table below, or a numeric expression which evaluates to the corresponding number.

Subsystems table
Value  SUBSYSTEM=  Remark
0      0           Unknown subsystem.
1      NATIVE      Subsystem is not used, i.e. device driver.
2      GUI         Windows GUI graphical windows.
3      CON         Windows console (character subsystem).
5      OS2         OS/2 character subsystem.
7      POSIX       Posix character subsystem.
8      WXD         Windows 95/98 native driver.
9      WCE         Windows CE graphical windows.
↑ ENTRY=

This parameter specifies an address where execution of the program begins. Usually this parameter contains a label whose address is set to CS:rIP when loader transfers execution to the program at run-time.

By default the ENTRY= parameter is empty; in this case €ASM will set it to 0 if PROGRAM FORMAT=BIN or to 256 if PROGRAM FORMAT=COM. This parameter should be left empty in linkable program formats but it must be specified in executable formats, otherwise €ASM reports an error.

↑ MAXPASSES=

This parameter limits the number of assembly passes through the source code. €ASM itself decides how many passes are necessary; this parameter only specifies the upper limit.

EuroAssembler repeats assembly passes until offsets of symbols do not change between passes (all symbols are fixed). Then it performs the last, emitting final pass.

In very rare circumstances this may lead to an oscillation of emitted code size due to optimisation of short|near jump encodings. In this very rare case €ASM would request more and more passes forever, that is why their number is limited. When the pass number approaches %^MAXPASSES-1, this (last but one pass) is marked as fixing pass. Symbol offsets may only grow up in the fixing pass and the vacant code space is stuffed with NOP bytes. See the test t7181 as an example of oscillating code with fixing pass.

Factory default value is MAXPASSES=32. You may need to increase this option only in extremely large sources with lots of macros and conditional-assembly constructs. The maximum ever reached within my programs is 44 passes consumed in the module iiz.htm.
↑ MAXEXPANSIONS=

Parameter MAXEXPANSIONS= limits the number of %FOR, %WHILE, %REPEAT or %MACRO block expansions. €ASM declares a numeric program property named %. and increments its value whenever a preprocessing block is expanded. When this number exceeds MAXEXPANSIONS value, €ASM emits error message and prevents further expansions.
Factory default is MAXEXPANSIONS=65536.

This mechanism protects €ASM from exhausting memory resources when some incorrectly written preprocessing loop fails to exit. If your program is really big, you may need to increase MAXEXPANSIONS value.

The special automatic %variable %. takes its value from the same expansion counter.

↑ OUTFILE=

OUTFILE= specifies the filename of the assembly output, i.e. the executable or linkable object file. This filename is relative to the current shell directory, if not specified otherwise. Default value is OUTFILE="%^PROGRAM" followed by the extension specified by FORMAT=.
E.g. Hello PROGRAM FORMAT=MZ will create the output file "Hello.exe", if not directed otherwise.

Suboperation can be applied to the name specified by this option, for instance OutFile="MyData.bin"[1..256] will assemble the whole module in memory but only its first 256 bytes will be written to the output file MyData.bin. See also the sample program boot16.htm as an example.
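A sketch of such a truncated output, using the suboperation syntax quoted above; the program name and the file name are hypothetical.

```asm
Image PROGRAM FORMAT=BIN, OUTFILE="image.bin"[1..512] ; Only the first 512 bytes are written.
      ; ... code and data, possibly longer than 512 bytes ...
      ENDPROGRAM Image
```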

↑ STUBFILE=

STUBFILE= is only used in COFF-based executables, i.e. in PE and DLL formats. The stub is a 16-bit MZ program which gets control when the output file is launched in a 16-bit disk operating system (DOS). Usually its only job is to tell the user that this program requires MS-Windows.

When the STUBFILE parameter is empty (default), €ASM will use its own built-in stub code. Otherwise it looks for the previously compiled MZ executable. If STUBFILE= is specified without a path, €ASM looks for the file in the paths specified by the EUROASM option LINKPATH=.

The user-selected 16-bit stub program may have the same functionality as the main 32-bit Windows application. Such executable file then works in the same way both in DOS and in MS-Windows. See the sample project LockTest as an example of this technique.
↑ ICONFILE=

ICONFILE= should specify an existing file with an icon which will be built into the resource segment of PE or DLL output file. This icon is used to graphically represent the output file in MS-Windows environment (Desktop, Explorer etc). Icon file is searched for in the path specified by the EUROASM option LINKPATH=.

Factory-default value is ICONFILE="euroasm.ico", which represents the icon shipped with EuroAssembler in directory objlib.

Option ICONFILE= applies only when no resource file is linked to the output program, otherwise it is ignored and the first icon from resources (if any) is used by Windows Explorer to represent the executable.

When the parameter ICONFILE= is empty, no icon is used and €ASM does not create resource section at all.

↑ LISTMAP=
↑ LISTGLOBALS=
↑ LISTLITERALS=

These three options control which auxiliary information will be dumped at the end of the listing file. See t8302 as an example of ListMap and ListGlobals format.

When LISTLITERALS=ON, contents of the data and code literal sections @LT16, @LT8, @LT4, @LT2, @LT1, @RT0 will be dumped too. See t1711 for an example of ListLiterals format.

↑ TIMESTAMP=

Specifies the nominal time which is provided by €ASM system variables %^DATE, %^TIME and which is embedded in some COFF-based file formats: PFCOFF_FILE_HEADER.TimeDateStamp, PFLIBCOF_IMPORT_OBJECT_HEADER.TimeDateStamp, PFRSRC_RESOURCE_DIRECTORY.TimeDateStamp.

The value of this parameter represents the number of seconds elapsed since midnight, January 1st 1970, UTC. When it is set to -1 or left empty (factory default), it will be assigned from the system timer at the start of the assembly session.
TIMESTAMP= can be used to fake the time when the target file was created.
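One possible use is a fixed nominal time, so that repeated builds of the same source embed identical time stamps. The following sketch assumes a hypothetical program named Repro; the value 946684800 corresponds to January 1st 2000, 00:00:00 UTC.

```asm
Repro PROGRAM FORMAT=PE, TIMESTAMP=946684800, ENTRY=Start:
Start: RET                 ; Placeholder body.
      ENDPROGRAM Repro
```

System variables %^DATE and %^TIME would then be expected to report this nominal time instead of the real time of assembly.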

↑ DLLCHARACTERISTICS=
↑ FILEALIGN=
↑ IMAGEBASE=
↑ MAJORIMAGEVERSION=
↑ MAJORLINKERVERSION=
↑ MAJOROSVERSION=
↑ MAJORSUBSYSTEMVERSION=
↑ MINORIMAGEVERSION=
↑ MINORLINKERVERSION=
↑ MINOROSVERSION=
↑ MINORSUBSYSTEMVERSION=
↑ SECTIONALIGN=
↑ SIZEOFHEAPCOMMIT=
↑ SIZEOFHEAPRESERVED=
↑ SIZEOFSTACKCOMMIT=
↑ SIZEOFSTACKRESERVED=
↑ WIN32VERSIONVALUE=

Other PROGRAM parameters are mostly important only in the COFF family of output formats (PE, DLL, COFF) and they form a PE header. See the [MS PECOFF] specification for a detailed description. Do not change them if you don't know what you are doing.

↑ SEGMENT

PURPOSE= ↓
WIDTH= ↓
ALIGN= ↓
COMBINE= ↓
CLASS= ↓

Pseudoinstruction SEGMENT declares a memory segment and specifies its properties. Each segment definition simultaneously defines a section with the same name. Other sections of the segment may be declared (or switched to) later, with an operation-less statement which has the section name in its label field, for example
[Strings] ; Declare section [Strings] in the current segment.

The name of segment is specified in the label field and it looks like an identifier in square brackets. Segment properties are assigned with keyword parameters.

€ASM automatically declares a few default segments when it starts to assemble a program. In most cases there is no need to explicitly declare any other segments. The number and purpose of default segments depend on the program format. If these segments are not used in the program (no code was emitted into them), they will be discarded at assembly time and will not appear in the object file. This happens when programmers are not satisfied with the default segment names and properties and declare new segments of their own choice, usually near the program beginning.

↑ PURPOSE=

Parameter SEGMENT PURPOSE= specifies what kind of information the segment is intended for. It is important in protected mode (formats ELFX, PE, DLL), where the descriptor's access bits control the rights granted to read, write or execute the contents of the segment.

Segment purpose table
PURPOSE=      Alias         Access         Default name                          Contents
CODE          TEXT          read, execute  [.text]|[CODE]                        Program code (instructions) (1)
RODATA        RDATA         read           [.rodata]|[RODATA]                    Initialized read-only data (1)
DATA          IDATA         read, write    [.data]|[DATA]                        Initialized data (1)
BSS           UDATA         read, write    [.bss]|[BSS]                          Uninitialized data (1)
STACK                       read, write    [STACK]                               Machine stack (1)
LITERALS      LITERAL       read           parasites on other data/code segment  Literal sections (2)
DRECTVE                     discarded      [.drectve]                            Linker directives (3)
PHDR                                                                             Program headers (4)
INTERP                                                                           Dynamic interpreter (4)
SYMBOLS                                    [.symtab] | [.dynsym]                 Program symbols (4)
HASH                                       [.hash]                               Hash of symbol names
STRINGS                                    [.strtab] | [.dynstr] | [.shstrtab]   Names of symbols|sections (4)
DYNAMIC                                    [.dynamic]                            Dynamic records
RELOC                                      [.rel(a)*]                            Relocations (4)
GOT                                        [.got]                                Global Offset Table
PLT                                        [.plt]                                Procedure Linkage Table
EXPORT                                     [.edata]                              Dynamic link export (4)
IMPORT                                     [.idata]                              Dynamic link import (4)
RESOURCE                                   [.rsrc]                               Programming resources (4)
EXCEPTION                                  [.pdata]                              Runtime exceptions (5)
SECURITY                                                                         Attribute certificate (5)
BASERELOC                   discarded      [.reloc]                              Load-time relocations (4)
DEBUG                                      [.debug]                              Data for debugger (5)
COPYRIGHT     ARCHITECTURE                                                       Architecture info (5)
GLOBALPTR                                                                        RVA of global pointer (5)
TLS                                        [.tls]                                Thread local storage (5)
LOAD_CONFIG                                                                      Load configuration (5)
BOUND_IMPORT                                                                     Bound import (5)
IAT                                        [.idata]                              Import address table (4)
DELAY_IMPORT                                                                     Delayed import descriptor (5)
CLR                                        [.cormeta]                            CLR metadata (5)
RESERVED                                                                         Reserved (5)
Remarks:
(1) Basic purposes used in all program formats.
(2) Programmer may specify which rodata|data|code segment should be used to host literal symbols.
(3) Synthetic section used for transfer of dynamic-link information in COFF format.
(4) Special sections directly supported by EuroAssembler. They should never be declared explicitly.
(5) Special sections whose contents are not supported. Programmers may include such a section in their PE file but its contents must be explicitly specified (with D or INCLUDEBIN), see the program format PE.

Segments with special purpose names (4),(5) will be marked in the corresponding position of DataDirectory table in the optional header of PE or DLL file format.

Although the operand PURPOSE= accepts only enumerated values, they may be combined using the operator Addition + or Bitwise OR |, for instance
[TINY] SEGMENT PURPOSE=CODE|DATA|BSS|STACK or
[.rodata] SEGMENT PURPOSE=DATA+LITERALS.

When this parameter is empty or not specified, €ASM will guess the segment's purpose by its class or [name], following these rules:

  1. If the name exactly (case-insensitively) matches any purpose enumerated in the table above, this purpose is assumed.
  2. If the name contains string STACK (case insensitive), PURPOSE=STACK is assumed.
  3. If the name contains string BSS or UDATA (case insensitive), PURPOSE=BSS is assumed.
  4. If the name contains string DATA (case insensitive), PURPOSE=DATA is assumed.
  5. If none of the previous rules applies, PURPOSE=CODE is assumed.

PURPOSE=LITERALS is used together with CODE and|or DATA and it only suggests that this segment should be preferably used to host the literal sections. If no segment is explicitly marked as PURPOSE=LITERAL, €ASM will choose the last data or code segment defined when some literal symbol was encountered.

Purpose guessing first looks at the SEGMENT CLASS= property, and only if it's empty, segment name is looked at. This mechanism can be used with segments defined in OMF object files to propagate their purpose to the linked executable.
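The guessing rules can be illustrated with a few segment declarations. The segment names below are hypothetical, and the purposes in the comments are those expected from rules 1 to 5 above, not verified assembler output.

```asm
[MyStack]   SEGMENT                       ; Name contains STACK: PURPOSE=STACK expected (rule 2).
[UDATA1]    SEGMENT                       ; Name contains UDATA: PURPOSE=BSS expected (rule 3 precedes rule 4).
[ConstData] SEGMENT                       ; Name contains DATA: PURPOSE=DATA expected (rule 4).
[Main]      SEGMENT                       ; No rule matches: PURPOSE=CODE expected (rule 5).
[Init]      SEGMENT CLASS=DATA            ; The class is consulted before the name: PURPOSE=DATA expected.
[Tables]    SEGMENT PURPOSE=DATA+LITERALS ; Explicit purpose, no guessing takes place.
```

Specifying PURPOSE= explicitly avoids surprises whenever a segment name accidentally contains one of the searched strings.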
↑ WIDTH=

Segment width value can be a numeric expression which evaluates to 16, 32 or 64. By default (if omitted) the width of segment is determined by the program width.

↑ ALIGN=

This parameter requests alignment of the segment in memory at run-time. Default alignment is ALIGN=OWORD (16 bytes).

Special ELF and PE segments, such as [.symtab], [.strtab], [.reloc] etc. may have different alignment.

↑ COMBINE=

This parameter specifies how segments from other program modules will be combined at link time. This is important only in the MZ program format (16-bit DOS executables) linked from several object files. Possible values:

PUBLIC
All segments with the same name will be linked together. Total size is the sum of concatenated segments. This is the default option.
PRIVATE
Private segments will not be concatenated with other segments, no matter if they have the same name or not.
COMMON
All common segments with the same name will be linked to the same address so they overlay each other. The total segment size equals to the greatest size of all segments with this name. Data variables declared in common segment will be shared among separately assembled modules.
STACK
The STACK combine method is the same as PUBLIC, in addition the SS:SP pointer in target EXE file will be set to the end of such segment at run-time.
↑ CLASS=

The value of CLASS= is an arbitrary identifier. It may be used by the linker to guess the segment purpose (CODE|DATA|BSS) in object formats which do not carry purpose information (OMF).

↑ GROUP

This pseudoinstruction enumerates segments addressed with the same addressing frame. Data in all grouped segments are addressed with the same value of segment register.

Segment groups are applicable in big realmode 16-bit programs. Only a 16-bit segment can be a member of the group.

The name of the group must be defined in the label field of the pseudoinstruction GROUP. The names of grouped segments are enumerated in operand fields. All names are enclosed in square brackets [ ]. The group name may be the same as the name of one of its segments. Example:
[DGROUP] GROUP [DATA],[STRINGS].
Grouped segment may be defined before or after the GROUP statement. This pseudoinstruction has no keyword operands.

In short, the relation between a group and its segments at link time is similar to the relation between a segment and its sections at assembly time.


↑ PROC

↑ ENDPROC alias ↑ ENDP

DIST= ↓
ALIGN= ↓
NESTINGCHECK= ↓

The PROC and ENDPROC pseudoinstructions declare a namespace procedure block. In most cases the block ends with the machine instruction RET, so it can be called to perform some function. After the execution it returns just behind the CALL instruction.

The mandatory label of PROC declares an assembler symbol which is the procedure name. The same identifier may be used as the first and only operand of the corresponding ENDPROC pseudoinstruction.
Alias ENDP may be used instead of ENDPROC.

Equally the ENDPROC may define its own label, too. This label doesn't represent a return from the subprogram, it points to the code which follows PROC..ENDP block. The label of ENDPROC is useful only when the PROC..ENDP block is used to define the namespace block rather than a callable subprogram block. Examples:

SubPgm:PROC ; Define PROC as a call-able subprogram block.
          ; PROC body instructions.
          TEST SomeCondition
          JC .Abort:  ; Go to return below CALL SubPgm: statement.
          TEST OtherCondition
          JC .End:    ; Go to continue below .End: ENDP. Probably not what the programmer wanted.
          ; More body instructions.
.Abort:   RET         ; Return below CALL SubPgm: statement.
.End:  ENDP SubPgm:
NameSp:PROC ; Define PROC as a pass-through-able namespace block.
          ; PROC body instructions.
          TEST SomeCondition
          JC .End:  ; Go to continue below .End:  ENDP NameSp: statement.
          ; More body instructions. No RET instruction here.
.End:  ENDP NameSp: ; Continue below this statement.

Jumping to the ENDPROC label differs from jumping to macroinstruction EndProcedure defined in calling convention macrolibraries. Pseudoinstructions PROC, ENDPROC, PROC1, ENDPROC1 do not emit any machine code.

What are procedures good for? We could manage to write an assembly program without PROC..ENDP pseudoinstructions easily but wrapping the block of code in PROC..ENDPROC block has some advantages:

↑ DIST=

Pseudoinstructions PROC and PROC1 accept keyword operands DIST= and ALIGN=. DIST= sets the distance of the procedure (NEAR or FAR). When DIST=FAR, all CALLs to this procedure default to FAR, and all RETs within this procedure default to FAR (of course this can be overridden with the instruction suffix CALLN/CALLF, RETN/RETF). The default parameter value depends on the program's memory model.

↑ ALIGN=

Procedure alignment is ALIGN=BYTE by default. For the best usage of instruction cache it sometimes may be useful to complete frequently called procedures with PROC ALIGN=OWORD, if code size is not an issue.
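A short sketch combining both keyword operands; the procedure name and its body are placeholders chosen for illustration.

```asm
Hot:  PROC DIST=NEAR, ALIGN=OWORD ; Near procedure aligned to a 16-byte boundary.
       ADD EAX,EBX                ; Placeholder body.
       RET                        ; Expected to assemble as a near return because DIST=NEAR.
      ENDPROC Hot:
```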

↑ NESTINGCHECK=

This boolean option allows you to switch off the internal check of PROC..ENDPROC label matching. It is needed only exceptionally, in macros which simulate built-in pseudoinstructions and need to hack their block context, such as Procedure and EndProcedure.

See also the instruction modifier NESTINGCHECK=.

Pseudoinstruction PROC does not accept ordinal parameters. Parameters can be passed in registers or on the machine stack and managed individually. Calling convention macrolibraries shipped with EuroAssembler define macros Procedure and EndProcedure with a similar function as PROC and ENDPROC, which allow an arbitrary number of arguments to be passed as macro parameters when the Procedure is invoked.

↑ PROC1

↑ ENDPROC1 alias ↑ ENDP1

Pseudoinstructions PROC1 and ENDPROC1 are equivalent to PROC and ENDPROC with two major differences:

  1. A procedure declared with a PROC1..ENDPROC1 block may occur in the program more than once. Repeated declarations of a PROC1..ENDPROC1 block with the same label are ignored; the block is emitted only once.

    This predetermines PROC1 for semiinline macros, which contain both 1) the call of a procedure, and 2) the procedure itself. When the procedure is defined with PROC1..ENDPROC1, such macro can be invoked many times but the called procedure will be assembled and emitted only once (during the first macro expansion).

  2. A block defined with PROC1..ENDPROC1 is not emitted to the current section. €ASM automatically switches to another code section instead, and returns to the previous section after ENDPROC1 has been processed. The section which €ASM switches to has the name [@RT1] and it is automatically created in the segment with PURPOSE=CODE+LITERAL or in the last defined code segment. In some circumstances €ASM may also use runtime sections [@RT2], [@RT3] etc. This happens when the code inside the PROC1..ENDPROC1 block contains other semiinline macros, so the current runtime section already is [@RT1] and €ASM must choose another one.

    Emitting procedures to a section other than the one the main program currently uses has the advantage that the procedure body need not be bypassed with a jump instruction. It also leads to shorter code, because jumps across the semiinline macros do not have to span the whole procedure body, which could easily push them beyond the 128-byte reach of the short jump form and would require the longer form of jump instructions.
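
The semiinline pattern described above might be sketched as follows. This assumes the usual %MACRO..%ENDMACRO definition block; the names Beep and BeepRT and the procedure body are hypothetical.

Beep %MACRO               ; Hypothetical semiinline macro.
       CALL BeepRT:       ; 1) The call, emitted at every expansion.
BeepRT: PROC1             ; 2) The procedure, emitted only once, to a runtime section.
          ; Body instructions.
          RET
        ENDPROC1 BeepRT:
     %ENDMACRO Beep

     Beep                 ; The first expansion emits the CALL and the procedure.
     Beep                 ; Further expansions emit only the CALL.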

↑ ENDHEAD

Pseudoinstructions HEAD and ENDHEAD merely delimit a division of source code. This division may be included in other source files with INCLUDEHEAD or INCLUDEHEAD1. The block usually contains the interface of programming objects (definitions of structures, macros, constants) which needs to be included in other separately assembled programs.

Label field of pseudoinstruction HEAD may be used as a block identifier but it does not create a symbol. More than one HEAD..ENDHEAD block can be specified in a source file. When these blocks are nested, the whole outer (larger) block will be included.

Languages which have not implemented this mechanism require the interface part to be put in separate header files. With HEAD..ENDHEAD the interface can be kept together with the implementation body in one compact file.
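
As a hedged illustration, a library source might keep its interface and implementation together like this. The file name mylib.asm, the block identifier Interface and all symbols are hypothetical.

; File mylib.asm:
Interface HEAD               ; The interface division starts here.
BUFFER_SIZE EQU 256          ; Constant shared with other programs.
Point:    STRUC              ; Structure shared with other programs.
.X          D D
.Y          D D
          ENDSTRUC Point
          ENDHEAD            ; The interface division ends here.
; The implementation body follows; it is not taken by INCLUDEHEAD.

; In another source file:
          INCLUDEHEAD "mylib.asm" ; Only the HEAD..ENDHEAD block is included.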

↑ INCLUDE

This pseudoinstruction incorporates file(s) with the name(s) specified as its operand into the source text. The INCLUDE statement is virtually replaced with the contents of the included file.

Inclusion may be nested, i. e. included files may contain other INCLUDE statements.

Double quotes may be omitted when the filename contains only alphanumeric characters (no spaces or punctuation).

The pseudoinstruction INCLUDE can have an unlimited number of operands, for example INCLUDE "Win*.htm", ./MyConstants.asm, C:\MyLib\*.inc.

When the file is specified without a path, it will be searched for in the folders specified with EUROASM option INCLUDEPATH=. If the included filename contains at least one slash, backslash or colon / \ : , it specifies its own path and INCLUDEPATH= is ignored.

The filename may contain wildcards * ?, in this case €ASM will include all files matching this mask. The order of inclusion depends on the operating system.

Behaviour of INCLUDE statement is described in the following table:

Path  Wildcard  Example      When the first file is found                          When no file is found
No    No        file.inc     Done, stops further searching in INCLUDEPATH.         Error E6914.
Yes   No        ./file.inc   Done.                                                 Error E6914.
No    Yes       file*.inc    Continue searching for more files in INCLUDEPATH.     Nothing is included, no error.
Yes   Yes       ./file*.inc  Continue searching for more files in the given path.  Nothing is included, no error.

Only a part of source file can be included when substring or sublist operator immediately follows the file name. Example: INCLUDE "file.inc"{%&-20..%&} will include the last twenty lines of file.inc (automatic %variable %& represents the number of lines in the file). Filename must be in double quotes when the suboperation is used. If the suboperation is used on wildcarded filename, it will be applied to all files.
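
The rules above might be combined as in the following sketch. All file names are hypothetical, and the range {1..10} assumes that lines of the file are numbered from 1.

        INCLUDE "MyConstants.asm"    ; No path: searched for in INCLUDEPATH=.
        INCLUDE ./MyConstants.asm    ; Own path: INCLUDEPATH= is ignored.
        INCLUDE "My Macros.inc"      ; Quotes are needed because of the space.
        INCLUDE "file.inc"{1..10}    ; Only the first ten lines of file.inc.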

↑ INCLUDE1

The pseudoinstruction INCLUDE1 (include once) behaves exactly like INCLUDE, but first it checks whether the same file (with the same size and contents, regardless of its name) has already been included in the program, and skips the file in this case.

Using INCLUDE1 instead of INCLUDE makes it possible to resolve mutual dependencies of source libraries. When an included library uses macros, structures and constant definitions from another library, we can do INCLUDE1 another.library in each such library.

↑ INCLUDEHEAD

The INCLUDEHEAD variant includes only the contents of HEAD..ENDHEAD block(s) of the included file, see the test t2420. An error is reported if no such block is found in the file or if the block is incomplete (missing ENDHEAD). When a suboperation is used with INCLUDEHEAD, it is applied first to the entire included file and HEAD..ENDHEAD block is searched for in the subrange only.

↑ INCLUDEHEAD1

The INCLUDEHEAD1 and INCLUDE1 will ignore the source if the file or any part of it has already been included in the program using INCLUDE, INCLUDE1, INCLUDEHEAD or INCLUDEHEAD1.

Library is treated as already-included when it was included as an entire file with INCLUDE or INCLUDE1, when its interface division was included with INCLUDEHEAD or INCLUDEHEAD1, or when only a suboperated part of it was included.

↑ INCLUDEBIN

Unlike INCLUDE and INCLUDEHEAD, this pseudoinstruction does not treat the file contents as a source to assemble, but the contents is emitted as is at the position specified by the offset pointer $ of current section.

Including binary data should not be confused with linking; it does not update relocatable addresses or external symbols. For instance the statement INCLUDEBIN "C:\WINNT\Media\chimes.wav"[0x2C..] will skip the first 0x2C bytes of WAV header in the sound file and load the rest (raw samples) to the assembled target, as if they were defined with DB statements.
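
Because the included bytes are emitted at $, they can be bracketed with ordinary labels. The following is a sketch only; the symbols Samples, SamplesEnd and SamplesSize are hypothetical.

Samples:                     ; Address of the first raw sample.
        INCLUDEBIN "C:\WINNT\Media\chimes.wav"[0x2C..]
SamplesEnd:                  ; Address just behind the last included byte.
SamplesSize EQU SamplesEnd - Samples ; Number of included bytes.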

See also t2470.

Pseudoinstruction LINK specifies file(s) which should be linked into the current program.

Each ordinal operand represents a file name, which may have wildcards and may be specified with or without path. Relative path refers to the current directory.

If the linked file name does not contain a path, it will be searched for in all directories specified with the EUROASM LINKPATH= option. Unlike included files, suboperations with linked files are not supported.

Linkable files have specific internal structure, which probably would have been damaged if only suboperated part of the file were subjected to the link process. Therefore only whole object file or library can be linked.

Position of the LINK statement within the program is not important, the actual linking will be performed when the final program pass is about to end. Order in which the files are linked respects the order in which pseudoinstruction LINK appeared in source. However, if linked files are specified with wildcards, e.g. LINK "modul*.lib", their order depends on current filesystem and cannot be reliably predicted. Example:

 LINK Subproc.obj, "..\My modules\W*.obj"

See static linking for more info.

↑ PUBLIC

Pseudoinstructions GLOBAL, PUBLIC, EXTERN, EXPORT, IMPORT set the scope property of symbol(s), which is used in linking.

The symbol, whose scope is being declared, may be in the label field or in the operand field of the statement, or in both. More than one symbol may be declared with one statement. Symbols in question may be forward or backward referred.

Explicit scope declaration may appear before or after the symbol is actually defined or referred.

Example: Explicit scope declaration of four symbols: Sym1 PUBLIC Sym2, Sym3, Sym4

Specifying the symbol as PUBLIC just tells €ASM that the symbol, which was or will be defined somewhere else in the program, should be referable from other statically linked programs. The public declaration does not create the symbol by itself; a symbol with that name must be defined somewhere else in the same program.

↑ EXTERN

Pseudoinstruction EXTERN symbol tells €ASM that the symbol is not defined in the program, so references to its address must be patched in the code at link time. It is an error to define a symbol which is declared as EXTERN in the same program. Instead, the symbol is searched for in other modules at link time, and only the linker may report an error when the external symbol is not found.

↑ GLOBAL

Pseudoinstruction GLOBAL can be used to automate dealing with PUBLIC and EXTERN scopes. If the symbol is marked with a GLOBAL statement, it behaves either as public or as external, depending on whether or not it is defined in the same program.

The programmer surely knows whether the declared symbol belongs to the current program or not, so why is the declaration of PUBLIC and EXTERN scope duplicated by GLOBAL? Let's have a program PgmA which defines the public symbol SymA and refers to the external symbol SymB. Similarly, PgmB defines SymB and refers to SymA:
PgmA PROGRAM
      PUBLIC SymA
      EXTERN SymB
      CALL SymB: ; Reference to external symbol.
SymA: RET        ; Definition of public symbol.
     ENDPROGRAM PgmA

PgmB PROGRAM
      PUBLIC SymB
      EXTERN SymA
      CALL SymA: ; Reference to external symbol.
SymB: RET        ; Definition of public symbol.
     ENDPROGRAM PgmB
If we replace PUBLIC and EXTERN declarations with GLOBAL, the same declaration statement can be used in all statically linked programs, either copy&pasted or included from external file, which is easier to maintain:
PgmA PROGRAM
      GLOBAL SymA, SymB
      CALL SymB: ; Reference to external symbol.
SymA: RET        ; Definition of public symbol.
     ENDPROGRAM PgmA

PgmB PROGRAM
      GLOBAL SymA, SymB
      CALL SymA: ; Reference to external symbol.
SymB: RET        ; Definition of public symbol.
     ENDPROGRAM PgmB
Another raison d'être of GLOBAL is backwards compatibility with NASM, which doesn't know the directive PUBLIC at all. NASM uses the directive GLOBAL instead whenever €ASM would require PUBLIC.

↑ IMPORT

Scopes IMPORT and EXPORT are used in dynamic linking. IMPORT is used when our program calls an imported function from a DLL. This pseudoinstruction accepts the keyword parameter LIB= which specifies the library file. The LIB= parameter may be omitted when the symbols are imported from the default MS-Windows library kernel32.dll.
The library file name doesn't have to be in quotes when it follows the DOS 8.3 convention. The library is always specified without a path. The operating system uses its own rules ([WinDllSearchOrder]) to decide in which directories the libraries are searched for at bind-time.
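
A short sketch of both forms, assuming a 32-bit MS-Windows program; the choice of functions is only illustrative.

        IMPORT ExitProcess                 ; LIB= omitted, kernel32.dll is assumed.
        IMPORT MessageBoxA, LIB=user32.dll ; 8.3 file name, quotes are not needed.
        ; Other program instructions.
        PUSH 0                             ; Exit code passed on the stack.
        CALL ExitProcess:                  ; Call of the imported function.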

↑ EXPORT

Scope EXPORT is used when we make a dynamic library and it declares symbols which are expected to be imported by other programs. Similar to the PUBLIC scope, symbol marked for EXPORT must be defined in the program, sooner or later.

Pseudoinstruction EXPORT accepts two keyword parameters FWD= and LIB=, which specify that the exported symbol (function name) is in fact provided by another dynamic library (defined with LIB=) under a different symbol name (defined with FWD=). Example:

kernel32 PROGRAM FORMAT=DLL
          EXPORT EnterCriticalSection, LIB="NTDLL.dll", FWD=RtlEnterCriticalSection
          ; Other kernel functions.
         ENDPROGRAM kernel32

Library "kernel32.dll" exports the API function EnterCriticalSection, which is in fact provided by the library "NTDLL.dll" under the name RtlEnterCriticalSection. In another Windows version it may be provided by a different library "XPDLL.dll", but programs importing the function from the proxy library "kernel32.dll" need no update or recompilation.

↑ ALIGN

This pseudoinstruction is used for explicit alignment of current section pointer $. For instance ALIGN OWORD in code section will emit several (0..15) bytes of NOP operation, so that the next statement will be emitted at octword-aligned address. ALIGN in data sections uses NUL byte (0x00) instead of NOP (0x90) as a stuff.

The operand can be a type specifier in short or long notation: B, U, W, D, Q, T, O, Y, Z, BYTE, UNICHAR, WORD, DWORD, QWORD, TBYTE, OWORD, YWORD, ZWORD or arithmetic expression which evaluates to the power of two: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512.
ALIGN TBYTE aligns to 8.

The ALIGN statement may have no label but it can have two operands. The second operand is used for intentional unalignment; it need not be a power of 2 and it must be lower than the first one. For instance ALIGN OWORD, QWORD aligns $ to an odd multiple of 8.
ALIGN 8,2 requests the current offset be set at the second byte in qword (counted from zero). Example of offsets which meet such requirement are 2, 10, 18, 26...
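
The variants described above, side by side. This is a sketch which assumes that sections [.text] and [.data] exist in the program.

[.text]
        ALIGN OWORD        ; Emits 0..15 bytes of NOP; $ becomes a multiple of 16.
[.data]
        ALIGN QWORD        ; Emits 0..7 NUL bytes; $ becomes a multiple of 8.
        ALIGN 8,2          ; $ becomes 2, 10, 18, 26...
        ALIGN OWORD,QWORD  ; $ becomes 8, 24, 40... (an odd multiple of 8).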

↑ STRUC

↑ ENDSTRUC

A structure represents a virtual section of data declarations which can be used as a mask or a grid-template laid over a piece of memory. Structure is declared with the STRUC..ENDSTRUC block. The only statements which may be used within the block are

  1. Data definitions specified with D statement and its clones, either initialized or uninitialized
  2. Explicit alignment statements (pseudoinstruction ALIGN)
  3. Pseudoinstructions STRUC and ENDSTRUC (€ASM allows nested definitions of structures)
  4. Line and markup comments
|[.data]                   ::::Section changed.
|00000000:                 | ; Example of structure declaration:
|[MyStruc]                 |MyStruc: STRUC
|00000000:................ |.Member1 D Q     ; Uninitialized QWORD member.
|00000008:........         |.Member2 D D     ; Uninitialized DWORD member.
|0000000C:........         |         D D     ; Uninitialized anonymous DWORD member.
|00000010:FF               |.Member3 D B 255 ; Initialized BYTE member.
|00000011:..               |.Member4 D B     ; Uninitialized BYTE member.
|00000012:............     |  ALIGN QWORD    ; Increase size# MyStruc to QWORD.
|00000018:                 | ENDSTRUC MyStruc
|[.data]                   ::::Section changed.
|00000000:                 |
|00000000:                 |
|00000000:18000000         | DD SIZE# MyStruc    ; MyStruc is 0x18 bytes long.
|00000004:53000000         | DD TYPE# MyStruc    ; Type of any structure is 'S'.
|00000008:00000000         | DD SEGMENT# MyStruc ; Segment/section/offset of any struc declaration is a scalar 0.
|0000000C:                 |

Declaration of a structure does not emit any data to the target file. Data are emitted or reserved only when the declared structure is actually used in a data definition (in pseudoinstruction D or DS). We say that the structure is instantiated.
When initialized data is defined in the structure declaration, it will be used to initialize corresponding members at the time of structured data definition (with pseudoinstruction D or DS), unless explicitly redefined.

Named data definitions in the structure must have local names (starting with .)
This allows us to:

  1. Use the same name for members of different structures,
  2. Avoid name conflict when more than one object of this structure is defined.

Each member is given its offset relative to the start of the structure. The program section which was current at the time of structure declaration is irrelevant. Each structure declaration temporarily creates its own pseudosection with virtual address starting at 0.

A structure must be given a unique structure name, which is defined in the label field of the STRUC statement and, optionally, in the operand field of the ENDSTRUC statement.

The size of the structure can be obtained with the attribute SIZE#Structure_name.

Pseudoinstruction STRUC accepts the keyword operand ALIGN=, which specifies alignment of instances of the structure when EUROASM AUTOALIGN=ON.
If the alignment is not explicitly specified with STRUC declaration, alignment corresponding to PROGRAM WIDTH= is used as the default (WORD, DWORD or QWORD).

See tests t2500, t25010, t2504 for more examples of structure declaration.

↑ D, DB, DU, DW, DD, DQ, DT, DO, DY, DZ, DI, DS

Both initialized and uninitialized data are defined and reserved with pseudoinstruction D. When a static value is specified, the data are defined. When the value is omitted, data are reserved. If EUROASM option AUTOSEGMENT=ON, INSTR data definition will switch to code section, all other data definition will switch to data section and data reservation will switch to bss (uninitialized data) section.

|[.data]                   ::::Section changed.
|00000000:                 |; Integer numbers definitions:
|00000000:01               | D BYTE 1 ; Define a byte integer with value 1, using long typename specification.
|00000001:00               ....AutoAlignment stuff.
|00000002:0200             | D W 2 ; Define a word integer with value 2, using short typename specification.
|00000004:03000000         | D D 3 ; Define a dword integer with value 3.
|00000008:0400000000000000 | D Q 4 ; Define a qword integer with value 4.
|00000010:                 |; Floating-point numbers definitions:
|00000010:0000A040         | D D 5.0 ; Define a single-precision number with value 5.
|00000014:00000000         ....AutoAlignment stuff.
|00000018:0000000000001840 | D Q 6.0 ; Define a double-precision number with value 6.
|00000020:00000000000000E0~| D T 7.0 ; Define an extended-precision number with value 7.
|0000002A:                 | ; String definitions:
|0000002A:4279746573       | D B "Bytes" ; Define a string of bytes.
|0000002F:00               ....AutoAlignment stuff.
|00000030:55006E0069006300~| D U "Unichars" ; Define a string of unichars.
|00000040:4368617273       | D "Chars" ; Define a string of bytes or unichars (depends on the option UNICODE=).
|00000045:                 | ; Instruction operation code definitions:
|[.text]                   ::::Section changed.
|00000000:90               | D INSTR "NOP" ; Define NOP opcode, using long typename specification.
|00000001:C3               | D I "RET" ; Define RET opcode, using short typename specification.
|00000002:                 | ; String reservations:
|[.bss]                    ::::Section changed.
|00000000:................ | D 8 * B ; Reserve eight bytes long string.
|00000008:................~| D 9 * U ; Reserve nine unichars long string.
|0000001A:                 | ; Number reservations:
|0000001A:....             | D W ; Reserve one word.
|0000001C:........         | D D ; Reserve one dword.
|00000020:................ | D Q ; Reserve one qword.
|00000028:................~| D T ; Reserve one tenbyte.
|00000032:                 | ; Vector reservations:
|00000032:................~....AutoAlignment stuff.
|00000040:................~| D O ; Reserve one oword, which can hold two qword or four dword numbers.
|00000050:................~....AutoAlignment stuff.
|00000060:................~| D Y ; Reserve one yword, which can hold four qword or eight dword numbers.
|00000080:................~| D Z ; Reserve one zword, which can hold eight qword or sixteen dword numbers.
|000000C0:                 |

See t2482 for more examples.

Each operand of D is a data expression.

Pseudoinstruction mnemonic D may be appended with suffix B, U, W, D, Q, T, O, Y, Z, I, S. Suffix defines the default datatype, which is used if it's not explicitly specified in operand. For instance DD 2,3,4 defines three dwords with static values 2, 3 and 4.

Suffix also determines datatype of symbol, which defines the data. For instance in definition Sym1 DQ B 1, W 2, D 4 the suffix specifies that the datatype of Sym1 is QWORD, although it defines only byte, word and dword data.

Types of data may mix in the same D statement.

The default datatype specified with mnemonic suffix can be overridden in operand fields by an explicit datatype in short or long notation. Operands without explicit redefinition take the default data type from D-suffix, for instance DB 27, "$", W 120 defines two bytes followed with one word. Datatypes in the operand may be specified with long names as well, e.g. DB 27, "$", WORD 120.
See t2481 for more examples.

Data from one operand may be duplicated.

For instance TranslateTable: D 256 * BYTE reserves 256 uninitialized bytes.
If duplication is not used, it defaults to 1. A negative duplicator is not permitted.
Duplicator 0 does not define or reserve any data, but it still provides the default datatype of the symbol and, if AUTOALIGN=ON, it aligns the current offset $.
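
A few duplicated definitions as a sketch; the symbol names are hypothetical.

Table:  D 256 * BYTE    ; Reserve 256 uninitialized bytes.
Zeros:  D 16 * BYTE 0   ; Define sixteen bytes with value 0.
Pairs:  D 2 * WORD 3    ; Define two words with value 3.
Mark:   DD 0 * D        ; No data; Mark gets type 'D' and $ is aligned if AUTOALIGN=ON.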

If no suffix is used, the default datatype is taken from the first nonempty operand, e.g. D D 2,3,4 defines three dwords with static values 2,3 and 4. When no default is defined, as in D 2, €ASM reports an error.

The only exception, when the datatype needs not to be explicitly specified, is the definition of a text string, for instance D "Some text.". In this case the default datatype is B or U, which depends on the current value of EUROASM option UNICODE=.

No data is defined or reserved when no operand is used.
L1: D  B 5      ; Define one byte with value 5.    TYPE#L1='B', SIZE#L1=1.
L2: D  2*WORD 3 ; Define two words with value 3.   TYPE#L2='W', SIZE#L2=4.
L3: DW W        ; Reserve one word.                TYPE#L3='W', SIZE#L3=2.
L4: DW 0*D      ; Reserve nothing, align to DWORD. TYPE#L4='W', SIZE#L4=0.
L5: DQ          ; Reserve nothing, align to QWORD. TYPE#L5='Q', SIZE#L5=0.
L6: D           ; Do nothing.                      TYPE#L6='A', SIZE#L6=0.
Unlike in other assemblers, an omitted operand doesn't emit any data. €ASM requires the operand type and/or value to be specified, no matter whether the D operation is suffixed or not. For instance DB reserves one byte in MASM but it does nothing in €ASM. Use D B or DB B instead.

EuroAssembler can define operation code of machine instruction as data, with pseudoinstruction DI. It is similar to DB or DU but the string contents is not emitted verbatim, it is assembled first. The quoted text in DI operand(s) should be a valid machine instruction, it may have prefix and operands but not a label.

For instance DI "SEGES:MOVSB" defines bytes 0x26,0xA4.
D 8*I"MOVSD" defines eight bytes 0xA5.
See t2515 for more DI examples.

A structured memory variable is defined with pseudoinstruction DS struc_name or just D struc_name.

Only one structured object can be defined with one D statement.

€ASM does not allow multiple ordinal operands when a structured object is defined, such as DS MyStruc1, MyStruc2. Nevertheless, duplication is supported, e. g. DS 4*MyStruc.

Members of the structured object can be overridden statically, using keyword operands. The keyword name is the local name of the defined member, immediately followed by the equal sign = and by the new value of the statically defined member. The namespace of operand fields in the DS statement is temporarily changed to the namespace of the structure definition.

An instance of MyStruc declared above in the STRUC example could be defined, for example, as MyObject DS MyStruc, .Member2=2, .Member4=4. This initializes the contents of MyObject.Member2 to dword integer 2, and the contents of MyObject.Member4 to byte integer 4. The contents of MyObject.Member3 is already statically defined as byte integer 255; other members of MyObject remain uninitialized.
If at least one member is initialized, the object is by default emitted to data section, uninitialized members are filled with zeroes. See also test t2510.
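
A sketch of instantiating and addressing the structure MyStruc from the STRUC example. It assumes that member offsets are referred to by their fully qualified names, such as MyStruc.Member2; the names MyObject and MyArray are hypothetical.

MyObject DS MyStruc, .Member2=2, .Member4=4   ; One instance with two members overridden.
MyArray  DS 4 * MyStruc                       ; Four instances by duplication.
         MOV EAX,[MyObject + MyStruc.Member2] ; Load the member through its offset in the structure.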

↑ EQU

↑ =

Pseudoinstruction EQU (or its alias =) defines a symbol, which is presented in the label field. The statement must have just one operand, which specifies the address or the numeric value of the symbol.

Instruction Label:EQU $ or Label:= $ are equivalent to Label:, i.e. specifying the statement with label only, which assigns an address to the symbol Label.

Using EQU is the only way to define a plain numeric symbol, such as FILE_ATTRIBUTE_ARCHIVE = 00000020h.

See any macrolibrary within PROGRAM realm as an example of EQU symbol definitions, for example winsfile.htm.
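
A brief sketch of the three uses mentioned above; the symbol names are hypothetical.

BufferSize EQU 4096           ; Plain numeric symbol.
PageMask   =   BufferSize - 1 ; The alias = works the same way.
Here:      EQU $              ; Address symbol, equivalent to the statement Here: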

↑ %COMMENT

↑ %ENDCOMMENT

Those pseudoinstructions define block comments, i.e. range of source code which is ignored by €ASM. In the label field of %COMMENT there may be an identifier, which gives the block a name (but it does not create a symbol). The same identifier can be used as the first operand of %ENDCOMMENT statement. This helps €ASM to check correct matching of %COMMENT & %ENDCOMMENT, especially when the comment blocks are nested.
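
A sketch of two nested block comments; the identifiers Draft and Inner are hypothetical.

Draft %COMMENT             ; Everything down to the matching %ENDCOMMENT is ignored.
        MOV EAX,EBX        ; Not assembled.
Inner   %COMMENT           ; Nested block comment.
          Any text may be here.
        %ENDCOMMENT Inner
      %ENDCOMMENT Draft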

↑ %DROPMACRO

%DROPMACRO tells €ASM to forget previously defined macroinstruction. One %DROPMACRO statement may drop one or more macros specified as operands, e.g.
%DROPMACRO Macro1, Macro2, Macro3.

Alternatively we may drop all macros declared so far with %DROPMACRO *.

See also %DROPMACRO example below.

↑ %IF

↑ %ELSE

↑ %ENDIF

Instructions between %IF and %ENDIF are assembled only if the condition in the first and only %IF operand evaluates as true. %IF accepts an extended boolean expression and it also accepts an empty operand, which is always evaluated as false.

Pseudoinstruction %ELSE may occur in the %IF..%ENDIF block. It reverses the logic of assembly: instructions between %IF and %ELSE are assembled when the %IF condition is true and instructions between %ELSE and %ENDIF are assembled when the %IF condition is false.

%IF may have an identifier in the label field which does not create a symbol but it identifies the block. The same identifier can be used in the operand field of %ELSE and %ENDIF statements.
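
A minimal sketch of a named %IF..%ELSE..%ENDIF block; the %variable %Debug and the block identifier Build are hypothetical.

%Debug %SETB 1                ; Hypothetical build switch.
Build  %IF %Debug
Msg:     DB "Debug build",0   ; Assembled, because %Debug is 1.
       %ELSE Build
Msg:     DB "Release build",0 ; Skipped.
       %ENDIF Build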

↑ %FOR

↑ %EXITFOR

↑ %ENDFOR

Pseudoinstructions %FOR and %ENDFOR create block which is assembled repeatedly for each operand of the %FOR statement. The label field of %FOR statement must be an identifier. It does not create a symbol, instead it defines a formal preprocessing %variable which is accessible in the %FOR..%ENDFOR block only. The name of this %variable consists of percent sign followed with the identifier.

Operands can be arbitrary elements which we need to operate with: register, number, expression, string. The formal %variable will be assigned with each %FOR operand respectively, and the block will be emitted with its value in the formal %variable. The following example defines %FOR loop with three operands and it emits three memory variables:

data %FOR "a", 3*B(5), "Long text"
       D %data
     %ENDFOR data
and it will be expanded to
|00000000:61                 +  D "a"
|00000001:050505             +  D 3*B(5)
|00000004:4C6F6E672074657874 +  D "Long text"
|0000000D:                   |

Repeating the identifier in the operand field of %ENDFOR and %EXITFOR statement is optional and it can be used to check proper pairing of block instructions.

The operand of %FOR can also be a numeric range, the block is repeated with each integer value of the range in this case. Slope of the range can be negative; default step of control %variable is -1 in this case instead of +1.

i  %FOR  0..5    ; Slope is positive, therefore implicit step = +1.
      DB "A"+%i  ; Define bytes "A","B","C","D","E","F".
   %ENDFOR i
j  %FOR 'z'..'x' ; Slope is negative, therefore implicit step = -1.
      DB %j      ; Define bytes 'z','y','x'.
   %ENDFOR j

See also t2640.

%FOR accepts the keyword integer operand STEP= which explicitly defines how the control %variable is incremented when a range is used. The default value is zero (STEP=0), which is a special case: the actual effective step is then either +1 or -1, depending on the range slope.

Both kinds of operands (enumerated and range) can be combined. When the step is explicitly defined and its sign differs from the range slope, the %FOR..%ENDFOR body is not assembled for that range. On the other hand, if STEP= is omitted or set to 0, ranges with both slopes can be combined in one %FOR statement and each range-operand will receive its own appropriate step +1 or -1. Example:

a %FOR 1..3, 6..4, 7
     ; Block is assembled with %a = 1,2,3,6,5,4,7.
   %ENDFOR

b %FOR 0..64, 256, 400..300, 512, STEP=16
     ; Block is assembled with %b = 0,16,32,48,64,256,512.
   %ENDFOR

When the formal %FOR variable has identical name with another previously user-defined %variable, it prevails and the user-defined %variable is not visible inside the %FOR..%ENDFOR block. See t2641.

When €ASM encounters the %EXITFOR pseudoinstruction, it breaks the assembly of the remaining instructions in the %FOR..%ENDFOR block and continues below the %ENDFOR statement, no matter how many unprocessed %FOR operands are left.

i  %FOR 0..9
     DB %i
     %IF %i>=3
       %EXITFOR i
     %ENDIF
     DB "a" + %i
   %ENDFOR i ; This will define bytes 0,"a",1,"b",2,"c",3

In nested %FOR..%ENDFOR blocks the formal variable (%EXITFOR's first and only operand) can be used for specification which of the nested block should be exited, see t2642 as an example.
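
A hedged sketch of such nesting, with hypothetical control variables %row and %col. The inner loop is left as soon as %col exceeds %row, while the outer loop goes on.

row %FOR 1..3
col   %FOR 1..3
        %IF %col > %row
          %EXITFOR col     ; Leave the inner block only.
        %ENDIF
        DB 10*%row + %col
      %ENDFOR col
    %ENDFOR row ; Under this reading the block defines bytes 11, 21,22, 31,32,33.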

↑ %WHILE

↑ %EXITWHILE

↑ %ENDWHILE

The block of statements between %WHILE and %ENDWHILE is being assembled repeatedly while the condition in the first and only %WHILE operand is true. If the condition is false at the block entry, it is skipped entirely.

An identifier may be used in the label of %WHILE and in the operand of %ENDWHILE and %EXITWHILE just for visual binding; it does not define a symbol.

Unlike %FOR, which temporarily declares and maintains its own control %variable, %WHILE does not. It is the programmer's duty to declare some control %variable outside the block, and to change it within %WHILE..%ENDWHILE. Example:

%i  %SETA 3        ; Define %variable %i which will control the block expansion.
id1 %WHILE %i
C%i:  DB %i
%i    %SETA %i - 1 ; Alter the user-defined control %variable.
    %ENDWHILE id1
; Statements assembled with %WHILE..%ENDWHILE block: C3: DB 3, C2: DB 2, C1: DB 1.

%EXITWHILE in the block will cause skipping the rest of statements; €ASM will continue below %ENDWHILE.

See also t2700, t2701, t2702.
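
A sketch of %EXITWHILE used to leave a loop whose condition is always true. The %variable %n and the block identifier pow are hypothetical.

%n  %SETA 1
pow %WHILE 1             ; The condition itself never becomes false.
      DW %n
%n    %SETA %n * 2
      %IF %n > 1000
        %EXITWHILE pow   ; Continue below %ENDWHILE.
      %ENDIF
    %ENDWHILE pow ; Defines words 1,2,4,8,16,32,64,128,256,512.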

↑ %REPEAT

↑ %EXITREPEAT

↑ %ENDREPEAT alias

↑ %UNTIL

The conditional assembly block %REPEAT..%ENDREPEAT is similar to %WHILE..%ENDWHILE but the condition is evaluated at the end of the block and its logic is inverted. %REPEAT takes no label and no condition operand. The statements in the block are always assembled at least once. The control condition is in the operand field of %ENDREPEAT; as long as it evaluates to false, €ASM assembles the block again. The alias %UNTIL may be used instead of the mnemonic %ENDREPEAT.

Block %REPEAT..%ENDREPEAT can use identifier for nesting check. Unlike other block statements, position of the block identifier is different: Block identifier can be specified as the first operand of %REPEAT, and as the label of %ENDREPEAT (alias %UNTIL).

%i  %SETA 3           ; Define %variable %i which will control the block expansion.
    %REPEAT Id1
      C%i: DB %i
      %i %SETA %i - 1 ; Alter the user-defined control %variable.
Id1 %UNTIL %i = 0
; Statements assembled with %REPEAT..%UNTIL block: C3: DB 3, C2: DB 2, C1: DB 1.

%EXITREPEAT in the block will cause skipping the rest of statements; €ASM will continue below %ENDREPEAT.

See also t2750, t2751, t2752.

↑ %SET

Pseudoinstruction %SET and other members of its family are designed to assign a value to preprocessing %variable. This %variable is in the label field of the statement.

%SET assigns the whole list of operands as a verbatim text, including the commas which separate operands from one another. White spaces between the operation mnemonic (%SET) and the first operand are omitted. White spaces after the last operand are trimmed off, too. White spaces are similarly trimmed when line-continuation is used.

%CardList %SET Hearts, Diamonds, Clubs, Spades  ; Comment

%CardList now will contain the string Hearts, Diamonds, Clubs, Spades (31 characters including spaces and commas).

See also t2810.

↑ %SETA

%SETA accepts arithmetic expressions. They will be evaluated and assigned to the %variable as a signed decimal number. An error is reported if the %SETA operand is not a valid expression.

When more than one operand is used, each value is set to the corresponding comma-separated item of the %variable, which is being assigned. Example:

%Value %SETA PoolEnd - PoolBegin
%Sizes %SETA 2+3, 4, ,-5*2

The difference between offsets PoolEnd and PoolBegin in the previous example was calculated and assigned to %Value as a decimal number.
%Sizes now contains the text 5,4,,-10 (8 characters). Individual items of %Sizes can be retrieved with sublist operation, such as %Sizes{2}.

See also t2821.

%SETA is better suited to modifying a control %variable in a preprocessing loop, such as %i %SETA %i+1. Though the text assignment %i %SET %i+1 would work here as well, with %SET the expression is not evaluated immediately and we might wind up with something like +1+1+1+1+1+1+1+1+1+1+1+1+1+1+1 after the 15th expansion.
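
The difference might be seen in this short sketch; the %variables %a and %b are hypothetical.

%a %SET  1
%a %SET  %a+1    ; %a holds the text 1+1 (three characters).
%b %SETA 1
%b %SETA %b+1    ; %b holds the number 2.
   DB %a, %b     ; Both operands evaluate to 2 here.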

↑ %SETB

%SETB is similar to %SETA; it accepts extended boolean expressions and assigns their results as the binary digits 1 or 0.

See also t2831.

Unlike with %SETA, the binary digits are not separated with commas when more than one operand is used in a %SETB statement. Items of the assigned %variable can be retrieved with a substring operation. Example:

%TooBig %SETB 5 > 4                   ; %TooBig is assigned one character 1 (true).
%Flags  %SETB %TooBig, 2,,3>2,off,4,, ; %Flags is assigned 110101.
        %IF %Flags[1]                 ; True, equals the 1st member of %Flags, i.e. %TooBig, i.e. 1.
Flags:  DB %Flags[]b                  ; Memory variable contains 00110101b.
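
How the digits combine into one byte can be checked with a short Python sketch. It does not evaluate €ASM boolean expressions; the six truth values are simply taken from the documented result 110101, and the name setb is hypothetical.

```python
def setb(truth_values):
    """Concatenate boolean results as the digits 1 and 0, without commas."""
    return "".join("1" if value else "0" for value in truth_values)

# Truth values of the six significant operands in the example above (assumption:
# copied from the documented result, not computed from the €ASM expressions).
flags = setb([True, True, False, True, False, True])

first_flag = flags[0]               # model of the substring %Flags[1]
byte_value = int(flags, 2)          # model of the binary number %Flags[]b
print(flags, first_flag, format(byte_value, "08b"))
```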

↑ %SETC

%SETC accepts expression in its operand, which must evaluate to a plain number not above 255 and not lower than -128. The result will be assigned as one character with evaluated ASCII byte value. Example:

%Quote %SETC """" ; One character "quote" (ASCII 34) is assigned.
%Tab   %SETC 9    ; One character "tabulator" (ASCII 9) is assigned.
%NBSP  %SETC -1   ; One character (ASCII 255) is assigned.

As with %SETB, multiple operands may be specified in %SETC and the resulting characters are not separated with commas.

%Hexadigits %SETC 'A','B','C','D','E','F'
; %Hexadigits now contains six characters ABCDEF

See also t2841.

%SETC makes it possible to assign special characters to a preprocessing %variable which could not be assigned as plain text with %SET because of €ASM parser syntax rules.
%Space %SETC 32 assigns one space. This could also be achieved with
%QuotedSpace %SET " " and taking only the second of the three assigned characters with a substring operation:
%Space %SET %QuotedSpace[2].
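
The mapping from the accepted range -128..255 to one byte can be sketched in Python. The name setc is hypothetical and the sketch takes already evaluated numbers; it only shows how a negative value such as -1 ends up as byte 255.

```python
def setc(*values):
    """Model of %SETC: each number in -128..255 becomes one byte, no commas."""
    result = bytearray()
    for value in values:
        if not -128 <= value <= 255:
            raise ValueError("value out of range -128..255: %d" % value)
        result.append(value & 0xFF)   # two's complement wrap: -1 becomes 255
    return bytes(result)

print(setc(9), setc(-1), setc(*map(ord, "ABCDEF")))
```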

↑ %SETE

This pseudoinstruction reads an environment variable from the operating system at assembly time and assigns its value to the preprocessing %variable. The name of the environment variable is given in the operand field without quotes, percent signs or dollar sign, e.g.

%OS %SETE OS
Msg: DB "This program was assembled at %OS system."

€ASM reports warning W2520 when the requested variable is empty or not defined.

%SETE can retrieve more than one environment variable; their values are assigned unquoted and comma-separated. Example:

%CpuInfo %SETE PROCESSOR_ARCHITECTURE, PROCESSOR_IDENTIFIER, \
               PROCESSOR_LEVEL, PROCESSOR_REVISION
On my old computer this assigns the following text to %CpuInfo:
x86,x86 Family 15 Model 1 Stepping 2, GenuineIntel,15,0102. Because Windows inserts a comma into the value of %PROCESSOR_IDENTIFIER%, it is not easy to retrieve individual components from such a concatenation with a sublist operation such as %CpuInfo{4}. So it is usually better to use %SETE for only one environment variable.
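
The ambiguity is easy to reproduce in a Python sketch. The function sete is a hypothetical model that reads from a plain dictionary instead of the real environment, and the sample values are copied from the example above.

```python
def sete(environment, *names):
    """Model of multi-operand %SETE: join the values with commas, unquoted."""
    return ",".join(environment.get(name, "") for name in names)

# Sample values copied from the example; a real machine will differ.
environment = {
    "PROCESSOR_ARCHITECTURE": "x86",
    "PROCESSOR_IDENTIFIER": "x86 Family 15 Model 1 Stepping 2, GenuineIntel",
    "PROCESSOR_LEVEL": "15",
    "PROCESSOR_REVISION": "0102",
}

cpu_info = sete(environment, "PROCESSOR_ARCHITECTURE", "PROCESSOR_IDENTIFIER",
                "PROCESSOR_LEVEL", "PROCESSOR_REVISION")
items = cpu_info.split(",")
print(len(items), items[3])
```

Four variables were requested, but the comma inside the identifier produces five items, so the fourth item is the processor level rather than the revision.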

↑ %SETS

%SETS looks at the %variable in its operand field and assigns its size, i.e. the number of bytes which its value occupies.

%SomeVar        %SET  ABC, DEF
%SomeSize       %SETS %SomeVar  ; %SomeSize is now 8 (3 letters + comma + space + 3 letters).
%SizeOfSomeSize %SETS %SomeSize ; %SizeOfSomeSize is now 1 (one digit).

%SETS must have just one operand, which looks like a preprocessing %variable (percent sign followed with an identifier).

See also t2861.

↑ %SETL

%SETL is similar to %SETS except that it assigns the length of the %variable contents, i.e. the number of comma-separated items in the %variable contents.

%SomeVar            %SET  ABC, DEF
%SomeLength         %SETL %SomeVar    ; %SomeLength is now 2 (2 comma separated items).
%LengthOfSomeLength %SETL %SomeLength ; %LengthOfSomeLength is now 1 (one item).

%SETL must have just one operand, which looks like a preprocessing %variable (percent sign followed with an identifier).
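
The difference between size (%SETS) and length (%SETL) can be shown with a Python sketch. The names size_of and length_of are hypothetical; the model assumes one byte per character and does not cover empty contents or commas inside quoted strings.

```python
def size_of(contents):
    """Model of %SETS: number of bytes (assumption: one byte per character)."""
    return len(contents)

def length_of(contents):
    """Model of %SETL: number of comma-separated items (empty contents not modelled)."""
    return len(contents.split(","))

some_var = "ABC, DEF"
some_size = str(size_of(some_var))       # the result is stored as decimal text
some_length = str(length_of(some_var))
print(some_size, size_of(some_size), some_length, length_of(some_length))
```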

See also t2866.

↑ %SET2

Consider assembly of the statement %Var1 %SET %Var2. €ASM first expands the %Var2 and the result of expansion is then assigned to %Var1. First two tokens of the statement are not expanded, because %Var1 is the target which is just being assigned, and %SET is reserved name which is never expanded.

%SET2 is similar to %SET except that the operand field is expanded 2 times before being assigned. Each expansion "swallows" one percent sign.

%V1 %SET "A"
%V2 %SET "B"
%V3 %SET "C"
i   %FOR 1..3
      %DataExp %SET2 %%V%i
      DB %DataExp
    %ENDFOR i ; Emit DB "A", DB "B", DB "C".
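
The double expansion can be traced step by step in a Python model. It is a sketch under assumptions: expand_once is a hypothetical name, %variable names are matched as plain identifiers, and an unknown %variable is left unchanged.

```python
import re

# %% collapses to one percent sign; %name is replaced by the variable contents.
_TOKEN = re.compile(r"%%|%([A-Za-z_][A-Za-z0-9_]*)")

def expand_once(text, variables):
    """One expansion pass; each pass swallows one percent sign of a doubled pair."""
    def replace(match):
        if match.group(0) == "%%":
            return "%"
        return variables.get(match.group(1), match.group(0))
    return _TOKEN.sub(replace, text)

variables = {"V1": '"A"', "V2": '"B"', "V3": '"C"'}
emitted = []
for i in range(1, 4):
    variables["i"] = str(i)
    once = expand_once("%%V%i", variables)    # first pass, for example %V1
    twice = expand_once(once, variables)      # second pass, for example "A"
    emitted.append("DB " + twice)
print(emitted)
```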

See also t2871.

Only special macros make use of %SET2, for instance EndProcedure, where it is used to expand a %variable whose name is not known in advance and changes dynamically.

↑ %SETX

When a pseudoinstruction of SET* family is being assembled, €ASM does not expand label field and operation field of statements such as %Label %SET* anything. This applies to %SET, %SETA, %SETB, %SETC, %SETU, %SETE, %SETS, %SETL, %SET2 but not to %SETX. In this statement the label field is expanded, too. After the expansion of label field %SETX works like ordinary %SET, which means that it requires a valid %variable name in the label field. For instance %%Var1 %SETX ABC is equivalent to %Var1 %SET ABC.

Using %SETX we can assign %variables whose names are not written explicitly in the source and change dynamically during assembly. Example:

i %FOR 1..4
     %%M%i %SETX %i  ; Identical with %M1 %SET 1, %M2 %SET 2 etc.
  %ENDFOR  ; This will assign values 1,2,3,4 to preprocessing %variables %M1,%M2,%M3,%M4.

See also t2881.

Only special macros make use of %SETX, for instance Procedure, where it is used to assign stack-frame addresses to %variables whose names are not yet known when the macro is written.

↑ %MACRO

↑ %EXITMACRO

↑ %ENDMACRO

A block of statements enclosed by the pseudoinstructions %MACRO and %ENDMACRO is called a macro declaration. The identifier in the label field of the %MACRO statement is the name of the macro.
The %MACRO statement itself is called the macro prototype, as it declares the macro name and gives names to the macro arguments. Once declared, a macro can be expanded many times in the program.

When €ASM reads the macro declaration in the source text, it does not emit any code. Instructions from the macro body are emitted only when the macro is actually expanded by its macroinstruction.

%EXITMACRO terminates the expansion at the point where it is encountered, usually when some error condition was detected.

Both %EXITMACRO and %ENDMACRO pseudoinstructions may have the macro name in the operand field in order to emphasize the block matching.

Example of a macro declaration and a macro expansion:

AlignEAX %MACRO       ; Round-up the contents of EAX to a multiple of 4.
           ADD EAX,3
           AND EAX,-4
         %ENDMACRO AlignEAX

         MOV EAX,13
         AlignEAX     ; After macro expansion EAX contains 16.
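
The arithmetic of the two instructions can be verified with a Python sketch. The function name align_eax is hypothetical; the masks only emulate a 32-bit register.

```python
MASK32 = 0xFFFFFFFF

def align_eax(eax):
    """Model of the macro body: round up to a multiple of 4 in a 32-bit register."""
    eax = (eax + 3) & MASK32         # ADD EAX,3
    eax = eax & (-4 & MASK32)        # AND EAX,-4 clears the two lowest bits
    return eax

for value in (12, 13, 16, 17):
    print(value, align_eax(value))
```

Adding 3 pushes any value that is not already a multiple of 4 past the next multiple, and the mask then drops the remainder.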

For more information see also the chapter MacroInstructions.

↑ %SHIFT

Pseudoinstruction %SHIFT is usable in macro block only. It will decrement the ordinal number of all macro operands by one or by the integer, which it has in the operand field. %SHIFT may have no label and only one operand which evaluates to a plain integer number. Default 1 is assumed when the operand is omitted.

%SHIFT 0 does nothing. Shifting by a negative number reverses the direction.

%SHIFT affects only macro operands accessed by their ordinal number, such as %1, %2 etc. Access to operands by their formal names is not affected by %SHIFT.

Operands, which are left-shifted from ordinal position %1 to position zero or negative, are not accessible by ordinal number any longer, but they are not lost forever, as they may be shifted back by a negative number.

|          |Sample %MACRO Oper1, Oper2, Oper3
|          |L1:      DB %1, %Oper1
|          |         %SHIFT 1
|          |L2:      DB %1, %Oper1
|          |         %SHIFT 2
|          |L3:      DB %1, %Oper1
|          |       %ENDMACRO Sample
|0000:     |
|0000:     |Sample 0x44, 0x55, 0x66, 0x77
|          +Sample %MACRO Oper1, Oper2, Oper3
|0000:4444 +L1:      DB %1, %Oper1
|          +         %SHIFT 1
|0002:5544 +L2:      DB %1, %Oper1
|          +         %SHIFT 2
|0004:7744 +L3:      DB %1, %Oper1
|          +       %ENDMACRO Sample
|0006:     |
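
The listing can be followed with a small Python model of ordinal and formal operands. The class MacroOperands and its methods are hypothetical names, and returning an empty string for an ordinal outside the operand list is an assumption of this sketch.

```python
class MacroOperands:
    """Model of macro operands: %SHIFT moves ordinals, formal names stay put."""

    def __init__(self, formal_names, values):
        self.formal = dict(zip(formal_names, values))  # %Oper1, %Oper2, ...
        self.values = list(values)
        self.offset = 0

    def shift(self, count=1):
        """Model of %SHIFT count; a negative count shifts back."""
        self.offset += count

    def ordinal(self, number):
        """Model of %1, %2, ... after all shifts performed so far."""
        index = number + self.offset - 1
        if 0 <= index < len(self.values):
            return self.values[index]
        return ""   # assumption: nothing is available at this position

operands = MacroOperands(["Oper1", "Oper2", "Oper3"], [0x44, 0x55, 0x66, 0x77])
line1 = (operands.ordinal(1), operands.formal["Oper1"])
operands.shift(1)
line2 = (operands.ordinal(1), operands.formal["Oper1"])
operands.shift(2)
line3 = (operands.ordinal(1), operands.formal["Oper1"])
operands.shift(-3)
restored = operands.ordinal(1)
print(line1, line2, line3, restored)
```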

See also t7221.

↑ %ERROR

Pseudoinstruction %ERROR inserts a user-defined error message into the listing file and the message output. The message is similar to those emitted by €ASM itself when it finds a mistake in the source text. %ERROR is often used in macroinstructions, where it usually warns the programmer that the macro was not used in the intended way.

User defined errors have severity code U and severity level 5, which is somewhere between warnings and assembler errors. The programmer may specify the actual message identifier with optional keyword operand ID= which can be a plain decimal number between 5000 and 5999. %ERROR will also accept identifier with value 0..999 and it adds internally 5000 in this case. Default value is 0, so the user defined message has identifier U5000, if no keyword operand ID= is used.
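
The identifier rule can be summarised in a Python sketch. The function user_message_id is a hypothetical name, and rejecting values outside 0..999 and 5000..5999 is an assumption, because the text does not say what €ASM does with them.

```python
def user_message_id(identifier=0):
    """Model of the ID= rule: 0..999 is offset by 5000, 5000..5999 is kept."""
    if 0 <= identifier <= 999:
        identifier += 5000
    if not 5000 <= identifier <= 5999:
        # Assumption: other values are rejected; the manual does not specify this.
        raise ValueError("identifier out of range")
    return "U%d" % identifier

print(user_message_id(), user_message_id(123), user_message_id(5123))
```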

The message text does not have to be in quotes. If the message text consists of more than one ordinal operand, they will be concatenated verbatim, including quotes, if used. Example:

%ERROR Id=5123, Something went wrong. Try again.

See also t2581 for more examples.

↑ %DISPLAY

Pseudoinstruction %DISPLAY is used for retrieving information about internal objects created by €ASM during the assembly process. Each such object is displayed in the form of debug message with severity level 1. The message is printed both to output console (in each pass) and to the listing file (in the final pass).
%DISPLAY is active even in non-emitting source passages, such as false %IF branch or block disabled with %COMMENT. It is intended to investigate €ASM internals when something is not working as expected.

Pseudoinstruction %DISPLAY accepts an arbitrary number of operands (object categories) which specify the kind of objects that we want to review. Categories may be provided as ordinal operands, or as keyword operands whose value specifies a filter. The filter can restrict the number of displayed lines. Category names are case insensitive, but the filter value, if used, is case sensitive. The filter value defines the first few characters of the names of the objects which we want to display. It may be terminated with an asterisk *, but this is not mandatory. For instance the statement %DISPLAY Macros=Alig will display all macros whose names begin with "Alig".

Operands of pseudoinstruction %DISPLAY have a rather relaxed syntax. Object categories (ordinal operand name or keyword name) may be shortened, too; only as many characters are required as are needed to identify the desired category. For instance %DISPLAY se will display the map of all segments and their sections. %DISPLAY File displays the list of input files (main source and included libraries). %DISPLAY sym=Num*, sym=En will list only those symbols whose names begin with Num or En.

%DISPLAY UserVar, %DISPLAY UserVar=* and %DISPLAY user= work equally (an empty filter value matches any %variable name). Nonfilterable categories, such as segments, context stack and automatic macro %variables, always display their complete list; any filter value is ignored.

When a user-defined or system %variable name is specified as the filter value, the leading percent sign % or %^ may be omitted, or the percent sign must be doubled (otherwise it would be expanded to its current contents). %DISPLAY UserVar=Loc, %DISPLAY us=Loc* and %DISPLAY user=%%Loc are equal in their function: they display the current contents of user-defined preprocessing %variables whose names begin with %Loc.
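
The filter rules for user-defined %variables can be modelled in Python. This is a sketch only: filter_names is a hypothetical name, the %variable names in the sample list are invented for illustration, and the doubled percent sign is assumed to have been reduced to a single one by expansion before the filter is applied.

```python
def filter_names(names, filter_value=""):
    """Model of a %DISPLAY filter: case-sensitive prefix match.
    An optional trailing asterisk and a leading percent sign are ignored."""
    prefix = filter_value
    if prefix.endswith("*"):
        prefix = prefix[:-1]
    prefix = prefix.lstrip("%")
    return [name for name in names if name.lstrip("%").startswith(prefix)]

# Hypothetical %variable names, used only for this illustration.
user_variables = ["%LocalCount", "%Loc1", "%Other"]

print(filter_names(user_variables, "Loc"))
print(filter_names(user_variables, "Loc*"))
print(filter_names(user_variables, "%Loc"))
print(filter_names(user_variables, ""))
```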

%DISPLAY object categories

%DISPLAY operand      | Messages     | Filter  | Order        | Displayed objects
All                   | D1100..D1900 | yes     | alphabetical | All objects specified below (shortcut for Fil,Ch,Se,St,Co,Sym,L,Rel,M,V).
Files                 | D1150..D1190 | ignored | natural      | Source files included in the program.
Chunks                | D1200..D1240 | ignored | natural      | Chunks of source code.
Sections              | D1250..D1290 | ignored | natural      | Map of groups, segments and sections.
Segments              | D1250..D1290 | ignored | natural      | Map of groups, segments and sections.
Groups                | D1250..D1290 | ignored | natural      | Map of groups, segments and sections.
Structures            | D1300..D1340 | yes     | alphabetical | Structures declared in the program.
Context               | D1350..D1390 | ignored | stacked      | Context stack of block statements.
Symbols               | D1400..D1450 | yes     | alphabetical | All explicitly defined symbols (shortcut for Fix,Unf,Unr,Ref).
  UnfixedSymbols      | D1410..D1450 | yes     | alphabetical | Symbols whose properties are not stable yet.
  FixedSymbols        | D1420..D1450 | yes     | alphabetical | Symbols whose properties are already fixed.
  UnreferencedSymbols | D1430..D1450 | yes     | alphabetical | Symbols which were not used yet.
  ReferencedSymbols   | D1440..D1450 | yes     | alphabetical | Symbols which were mentioned at least once, or used in a structure.
LiteralSymbols        | D1500..D1540 | ignored | alphabetical | All literal symbols declared in the program.
Relocations           | D1550..D1590 | ignored | natural      | Relocation records.
Macros                | D1600..D1690 | yes     | alphabetical | Macroinstructions declared at this moment.
Variables             | D1700..D1790 | yes     | alphabetical | All preprocessing %variables currently set (shortcut for Au,Fo,Us,Sys).
  AutomaticVariables  | D1710..D1730 | ignored | fixed        | Automatic macro %variables.
  FormalVariables     | D1740..D1750 | yes     | alphabetical | Formal macro/for %variables.
  UserVariables       | D1760..D1770 | yes     | alphabetical | User-defined preprocessing %variables.
  SystemVariables     | D1780..D1790 | yes     | alphabetical | System preprocessing %^variables.

A displayed message usually contains the object name, its attributes and other properties.

%DISPLAY operands Groups, Segments, Sections are identical, each of them always displays the complete tree.
A line with a group lists the names of all segments in that group.
A line with a segment is indented by 2 spaces and displays purpose, width, align, combine, class, src.
A line with a section is indented by 4 spaces and displays address, size, align, ref.

Property src= specifies whether the file or chunk is

Chunk property type= shows what kind of information is in this chunk of source text:

A boolean property ref= tells whether the symbol, structure or section was used (referenced at least once in the program). Members of the structure are automatically marked as used when the structure is defined.
Similar property fix= specifies if the offset of symbol or section is already fixed, i.e. it is stable between assembly passes.
Context property emit= informs whether the block is in normal (emitting) status, or if it is just bypassed without emitting any code or data.

Context property %.= shows current value of expansion counter in this block.

Property src= identifies position in source text where the displayed object was defined, in standard form "FileName"{LineNumber}.

Automatic and formal %variables are defined only in %macro or %for expansion, i. e. when the statement %DISPLAY Auto,Formal is inserted in %MACRO..%ENDMACRO or %FOR..%ENDFOR body and the macro is then expanded.

See tests t2901..t2917 for examples of %DISPLAY output.

Unlike other instructions, the statement %DISPLAY is alive and kicking even in non-emitting status. Be careful when putting an unfiltered %DISPLAY in repeating preprocessing loops (%FOR, %WHILE, %REPEAT), as this may substantially flood the output.

The main purpose of %DISPLAY is to find errors at assembly-time, when €ASM doesn't work as expected, together with EUROASM options DISPLAYSTM=, DISPLAYENC= and with PROGRAM options LISTGLOBALS=, LISTLITERALS=, LISTMAP=.
For investigation of your program at run-time use a debugger or the macro Debug.

↑ %DEBUG

↑ %PROFILE

These pseudoinstruction names are reserved for a future extension of EuroAssembler; they are not implemented yet. See also EUROASM boolean options DEBUG= and PROFILE=.


↑ Macroinstructions

A macro is defined by a block of statements (macro body) enclosed between the pseudoinstructions %MACRO and %ENDMACRO. The %MACRO statement itself (macro prototype) must have a label, which can be used later for macro invocation (alias macro expansion).

A macro must be defined before it is invoked.

A statement which has the name of a previously declared %MACRO in its operation field is called a macroinstruction or simply a macro. It will be replaced with the statements from the block %MACRO..%ENDMACRO. A macro can be a fixed static set of instructions, such as

CarriageReturn %MACRO
                 MOV AH,2  ; 3 statements between %MACRO and %ENDMACRO are macro body.
                 MOV DL,13
                 INT 21h
               %ENDMACRO CarriageReturn

More useful are macros which modify the expanded instructions depending on the operands they are invoked with. When a macro is invoked, it is usually provided with operand values, which are available in the macro body as formal %variables or as automatic ordinal %variables %1, %2, %3,.... Operands in the macro definition may be given a temporary formal symbolic name; they are accessible in the macro block by this name prefixed with the percent sign %. Alternatively, they may be referred to by their ordinal number prefixed with %. Keyword operands are accessible only by the formal key name prefixed with %. Example:

Copy %MACRO Source, Destination, Size=ECX ; Statement %MACRO is called macro prototype.
       MOV ESI, %Source      ; or MOV ESI, %1
       MOV EDI, %Destination ; or MOV EDI, %2
       MOV ECX, %Size
       REP MOVSB
     %ENDMACRO Copy

The previous macro needlessly moves the number of copied bytes (Size) to register ECX even when it is already there at the time of its invocation. The expanded instruction MOV ECX,ECX could be spared in this case:

Copy %MACRO Source, Destination, Size=ECX
       MOV ESI, %Source      ; Instead of formal %Source we could use MOV ESI, %1
       MOV EDI, %Destination ; Or MOV EDI, %2
       %IF "%Size" !== "ECX"
         MOV ECX, %Size
       %ENDIF
       REP MOVSB
     %ENDMACRO Copy

Now when the macro is invoked as Copy From, To, Size=ecx or as Copy From, To, no superfluous MOV ECX,ECX is expanded.
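
The effect of the conditional can be modelled as a Python function which returns the expanded lines. The name expand_copy is hypothetical. The comparison is modelled as case insensitive, because the text above states that Size=ecx also suppresses the superfluous instruction.

```python
def expand_copy(source, destination, size="ECX"):
    """Model of the second Copy macro: returns the instructions it would expand to."""
    lines = ["MOV ESI, " + source, "MOV EDI, " + destination]
    if size.upper() != "ECX":        # model of %IF "%Size" !== "ECX"
        lines.append("MOV ECX, " + size)
    lines.append("REP MOVSB")
    return lines

print(expand_copy("From", "To"))
print(expand_copy("From", "To", size="ecx"))
print(expand_copy("From", "To", size="256"))
```

The keyword default plays the same role as the default value Size=ECX in the macro prototype.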

If the name of a formal macro %variable happens to collide with some previously user-defined preprocessing %variable, the user-defined %variable is temporarily hidden by the formal %variable, see the test t7347.
Automatic variables, such as %*, %#, %:, %1, %2, ..., are not visible outside the macro body.

All macros in EuroAssembler may have variable number of operands.

The number of operands specified at macro invocation doesn't need to correspond with the number of operands specified at macro definition. If the macro is invoked with fewer ordinal operands than its prototype declares, €ASM does not treat this as an error and silently expands the omitted operands to nothing.
When the macro is invoked with more operands than its prototype specifies, those superfluous operands are not accessible in the macro expansion by formal names, but they may still be referred to by their automatic ordinal number. See also pseudoinstruction %SHIFT.

When a keyword operand is omitted in macro invocation, it retains the value which was specified at macro definition. Adding optional keyword operands makes it possible to extend the functionality of a macroinstruction without breaking backward compatibility. Consider this simple macro:

Write %MACRO TextPtr,TextSize ; Write the text to the standard output.
   MOV DX,%TextPtr
   MOV CX,%TextSize
   MOV BX,1       ; File handle of the standard output.
   MOV AH,40h     ; Write string DS:DX to a device or file.
   INT 21h        ; Invoke the DOS service.
 %ENDMACRO Write

Later we may want to use the same macro for writing to other devices, too. Let's extend it with keyword operand Handle= with predefined default value of standard output:

Write %MACRO TextPtr,TextSize,Handle=1 ; Write the text to the standard output or other device.
   MOV DX,%TextPtr
   MOV CX,%TextSize
   MOV BX,%Handle ; Handle of output device or file.
   MOV AH,40h     ; Write string DS:DX to a device or file.
   INT 21h        ; Invoke the DOS service.
 %ENDMACRO Write

Now it's possible to write to other devices, too, for instance to the standard line printer: Write Message,80,Handle=4. The enhanced macro Write is backward compatible. Even if our old programs include the updated macro library with the enhanced macro Write, they don't have to be recompiled.

Similarly to preprocessing %variables, macros may be redefined many times. However, this is not usual and €ASM will emit the warning W2512 in this case. A macro which has been defined can be undefined with the pseudoinstruction %DROPMACRO.

An example of a situation where dropping a macro definition may be useful is the emulation of a machine instruction by a macro with the same name.
The machine instruction BSWAP, which reverses the byte order in a 32-bit register, was not available on the Intel 80386. This can be solved by emulation using three ROR or ROL instructions. If we detect that our program runs on a Pentium, we can drop the macro definition and €ASM will assemble BSWAP as a native machine instruction.

|00000000:          |
|                   |BSWAP %MACRO reg32 ; Swap the byte order in register.
|                   |        %IF TYPE# %reg32 <> 'R' || SIZE# %reg32 <> 4
|                   |          %ERROR 'Macro "BSWAP" expects 32-bit GPR as its operand.'
|                   |          %EXITMACRO BSWAP
|                   |        %ENDIF
|                   |%reg16 %SET %reg32[2..3] ; Name of the lower half of reg32 (omit the letter E).
|                   |        ROL %reg16,8
|                   |        ROL %reg32,16
|                   |        ROL %reg16,8
|                   |      %ENDMACRO BSWAP
|00000000:          |
|00000000:BA78563412|  MOV EDX,0x12345678
|00000005:          |  BSWAP EDX ; Expected result is EDX=0x78563412.
|                   +BSWAP %MACRO reg32 ; Swap the byte order in register.
|FALSE              +        %IF TYPE# %reg32 <> 'R' || SIZE# %reg32 <> 4
|                   +          %ERROR 'Macro "BSWAP" requires 32-bit GPR as its operand.'
|                   +          %EXITMACRO BSWAP
|                   +        %ENDIF
|4458               +%reg16 %SET %reg32[2..3] ; Name of the lower half of reg32.
|00000005:66C1C208  +        ROL %reg16,8
|00000009:C1C210    +        ROL %reg32,16
|0000000C:66C1C208  +        ROL %reg16,8
|                   +      %ENDMACRO BSWAP
|                   |  ; If CPU is 486 or higher, prefer the machine instruction.
|                   |  %DROPMACRO BSWAP
|00000010:0FCA      |  BSWAP EDX ; This time swap the byte order with native 486 instruction.
|00000012:          |
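
That three rotations really swap the byte order can be verified with a Python sketch of the macro body. The names rol and bswap_emulated are hypothetical; the masks emulate 16-bit and 32-bit registers.

```python
def rol(value, count, width):
    """Rotate an unsigned value left by count bits within the given width."""
    mask = (1 << width) - 1
    value &= mask
    count %= width
    return ((value << count) | (value >> (width - count))) & mask

def bswap_emulated(reg32):
    """Model of the macro body: ROL reg16,8 / ROL reg32,16 / ROL reg16,8."""
    reg32 = (reg32 & 0xFFFF0000) | rol(reg32 & 0xFFFF, 8, 16)   # swap low two bytes
    reg32 = rol(reg32, 16, 32)                                  # exchange the halves
    reg32 = (reg32 & 0xFFFF0000) | rol(reg32 & 0xFFFF, 8, 16)   # swap new low bytes
    return reg32

result = bswap_emulated(0x12345678)
print(hex(result))
```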

The advanced EuroAssembler macro language allows us to change our programming style. We can create macroinstructions which mimic the functions of high-level languages and customize the new "language" for the particular task. See the macros Ii* in €ASM source file ii.htm as an example of a pseudolanguage developed for an intelligible description of the conversion from assembly instruction to machine code.

When something doesn't work as expected, it's always possible to look at the expanded macroinstruction body in the listing and fall back to plain assembly code.


↑ Program formats

BIN ↓

COM ↓

MZ ↓

OMF ↓

LIBOMF ↓

COFF ↓

LIBCOF ↓

ELF ↓

ELFX ↓

ELFSO ↓

PE ↓

DLL ↓

RSRC ↓

Width of program formats ↓

The target of EuroAssembler's endeavour is an output file in one of the formats selected by PROGRAM FORMAT= option. There are three main categories of €ASM output files:

  1. linkable file (also called module or object file) is designed to be joined with other modules and libraries into the final executable file or to the object library.
    €ASM supports three main standards of object files: ELF, OMF and COFF. Default object file name extension is .o or .obj.
  2. library is a collection of modules, ready to be linked on demand into the final executable file. There are four kinds of libraries supported by EuroAssembler:

    Default filename extension of object or import library is .lib, in case of dynamic library it is .so or .dll.

  3. executable file (also called image) can be loaded and launched directly by the shell of the hosting operating system.
    €ASM can produce executables in the formats ELFX, PE, MZ, COM, they have file extension .x, .exe or .com. It can also create dynamically loaded libraries DLL, very similar to PE format, but they can be executed only indirectly, through invocation of their exported function from another program, or through a special Windows loader, such as RUNDLL32.exe.
    Program formats BIN and BOOT are ranked as executable, too. However, as they lack any red-tape information, a binary file needs its own ad hoc loader to be launched directly, or it must be loaded to a special storage place of the computer, such as the firmware ROM or the boot sector of a disk device.

↑ BIN

Option PROGRAM FORMAT=BIN is chosen as the default when FORMAT= is not explicitly specified. Default options for BIN format are

Name: PROGRAM FORMAT=BIN, OUTFILE=%^PROGRAM.bin, MODEL=TINY, WIDTH=16, \
              ENTRY=0, IMAGEBASE=0, SECTIONALIGN=0, FILEALIGN=0

€ASM creates the default segment [BIN] with universal purpose:

[BIN] SEGMENT WIDTH=16,ALIGN=16, \
              PURPOSE=CODE+DATA+BSS+STACK+LITERALS
These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ENTRY, ICONFILE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

Structure of BIN file is straightforward: the binary image is a concatenation of emitted contents of its segments. Noninitialized (BSS) segments are omitted.

Segment alignment in the image is by default given by the highest of the values PROGRAM FILEALIGN=0, PROGRAM SECTIONALIGN=0 and SEGMENT ALIGN=16. Gaps between segments are filled with alignment filler, which is 0x90 (NOP) if both neighbouring segments have SEGMENT PURPOSE=CODE, otherwise it is 0x00.
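
The layout rule can be sketched in Python. This is a simplified model, not the €ASM linker: build_bin_image is a hypothetical name, each segment is given as a pair of purpose and contents, a purpose is treated as code only when it is exactly CODE, and BSS segments are skipped without affecting which segments count as neighbours.

```python
def build_bin_image(segments, file_align=0, section_align=0, segment_align=16):
    """Concatenate segment contents, aligning each segment start.
    Gaps are filled with 0x90 between two CODE segments, otherwise with 0x00."""
    alignment = max(file_align, section_align, segment_align, 1)
    image = bytearray()
    previous_is_code = False
    for purpose, contents in segments:
        if purpose == "BSS":
            continue                     # noninitialized segments are omitted
        is_code = purpose == "CODE"      # assumption: exact match only
        gap = -len(image) % alignment
        if gap:
            filler = 0x90 if (previous_is_code and is_code) else 0x00
            image += bytes([filler]) * gap
        image += contents
        previous_is_code = is_code
    return bytes(image)

image = build_bin_image([("CODE", b"\xC3"), ("CODE", b"\xC3"),
                         ("BSS", b""), ("DATA", b"AB")])
print(len(image), image.hex())
```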

Typical applications of the binary format are pure data files, conversion tables, DOS drivers, boot sectors etc., see the sample BIN projects.

↑ BOOT

Option PROGRAM FORMAT=BOOT creates a binary format file adapted for booting. It differs from the BIN format in these points:

  1. the size of the output file is 512 bytes,
  2. it is loaded at the linear address 07C00h,
  3. the code and data are padded to 510 bytes and the last two bytes are 0x55,0xAA,
  4. the default file extension is .sec.
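
The size rules of the list above can be sketched in Python. The name make_boot_sector is hypothetical, and padding with zero bytes is an assumption, because the text does not state the value of the padding byte.

```python
def make_boot_sector(code_and_data):
    """Pad code and data to 510 bytes and append the signature 0x55,0xAA."""
    if len(code_and_data) > 510:
        raise ValueError("code and data do not fit into 510 bytes")
    padded = code_and_data.ljust(510, b"\x00")   # assumption: zero padding
    return padded + b"\x55\xAA"

sector = make_boot_sector(b"\xEB\xFE")           # a two-byte endless loop (JMP $)
print(len(sector), sector[-2:].hex())
```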

Default options for BOOT format are

Name: PROGRAM FORMAT=BOOT, OUTFILE=%^PROGRAM.sec, MODEL=TINY, WIDTH=16, \
              ENTRY=, IMAGEBASE=0, SECTIONALIGN=0, FILEALIGN=0
These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ENTRY, ICONFILE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

See the sample projects boottest.htm and boot16.htm.

↑ COM

Files in COM format are a legacy of the CP/M operating system. They are directly executable in DOS and in 32-bit Windows; other systems need a DOS emulator.

Default options for PROGRAM FORMAT=COM are

Name: PROGRAM FORMAT=COM,OUTFILE=%^PROGRAM.com,MODEL=TINY,WIDTH=16,IMAGEBASE=0, \
              ENTRY=256,SECTIONALIGN=0,FILEALIGN=0

Options ENTRY=0x100 and IMAGEBASE=0 are fixed for this format and cannot be changed (they can be omitted from the PROGRAM statement).

€ASM creates default implicit segment [COM] with universal purpose:

[COM] SEGMENT WIDTH=16,ALIGN=16,PURPOSE=CODE+DATA+BSS+STACK+LITERALS
These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ICONFILE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

The structure of a COM file is similar to the BIN format: no meta-information is stored in the file except for its extension .com, which tells the OS to treat it as an executable. The OS loader will allocate 64 KB of memory, load segment registers CS,DS,ES,SS with the paragraph address of that block, initialize the 256 bytes long [PSP] structure located at offset 0, load the entire file contents at offset 256 (0x0100), set the stack pointer to the top of the allocated block (usually SP=0xFFFE) and finally set IP=0x0100.

The size of code+data+stack altogether should not exceed 64 KB in the TINY memory model. A program in COM format can use 32-bit registers if the CPU is 386 or higher. Additional memory blocks may also be requested from the OS at run-time. Typical applications of this obsolete format are short and fast utilities and Terminate-and-Stay-Resident (TSR) programs which provide services in DOS, see the sample project for DOS.

The following COM example is only 1 byte long, yet it is a formally valid computer program, though it does nothing:

         EUROASM
Shortest PROGRAM FORMAT=COM
           RET
         ENDPROGRAM Shortest

Program in COM format can link other object files or libraries, see the test table linker combinations.

↑ MZ

Specifying program format MZ creates a 16-bit or 32-bit real-mode executable file, which can be run directly in DOS and in 32-bit Windows. Its structure is described in [MZ] and [MZEXE]. A DOS executable file begins with the MZ signature 'M','Z'.

Default options for PROGRAM FORMAT=MZ are:

PROGRAM FORMAT=MZ, ENTRY=, OUTFILE=%^PROGRAM.exe, MODEL=SMALL, WIDTH=16, IMAGEBASE=0, \
        SECTIONALIGN=0, FILEALIGN=0, SIZEOFSTACKCOMMIT=8K, SIZEOFHEAPCOMMIT=1M

€ASM creates default implicit segments [CODE], [RODATA], [DATA], [BSS], [STACK] in program formats MZ, OMF, LIBOMF.

Parameter PROGRAM SizeOfStackCommit= specifies the default size of the segment [STACK], so we don't have to explicitly define stack segment when EUROASM option AUTOSEGMENT= is enabled at the ENDPROGRAM statement.

Parameter PROGRAM SizeOfHeapCommit= can be used to limit the requested amount of heap memory preallocated by the loader (member .e_maxalloc of DOS file header).

If the memory model is HUGE or FLAT and program width is not explicitly specified, it defaults to PROGRAM WIDTH=32, otherwise it is 16.

ImageBase=0 is fixed for this format and cannot be changed.
Explicit specification of PROGRAM Entry= is mandatory in MZ format.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ICONFILE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SIZEOFHEAPRESERVE, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

As an example of an MZ executable program for DOS see the test t8300.

↑ OMF

Object Module Format as specified in [OMF] is designed to be linked to 16-bit and 32-bit real-mode programs. Imports in this format are linkable to the protected-mode executables.

Default segments are the same as in MZ format.

File format OMF is recognized by LINK when it is composed of valid OMF records and the first record is THEADR or LHEADR.

Default options for this format are:

Name: PROGRAM FORMAT=OMF,OUTFILE=%^PROGRAM.obj,MODEL=SMALL,WIDTH=16
These PROGRAM options are irrelevant: DLLCHARACTERISTICS, FILEALIGN, ICONFILE, IMAGEBASE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SECTIONALIGN, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

As an example of an OMF module see the test t8400.

↑ LIBOMF

The OMF library format is described in Appendix 2 of the same document as [OMF]. The hashed dictionary, required by the format specification at the end of the library, is created on output, but the €ASM linker ignores it. When the library is linked to another program, its public symbols are searched sequentially. The page size of LIBOMF libraries created by €ASM is fixed at 16.

Default segments are the same as in MZ format.

File format LIBOMF is recognized by LINK when it starts with a LIBHDR record with page size 16, 32, 64,..32K, and this record is followed by valid OMF modules, each of which starts with a THEADR or LHEADR record and ends with a MODEND or MODEND32 record. The library dictionary at the end of the file is not checked.

Default options for PROGRAM FORMAT=LIBOMF are:

Name: PROGRAM FORMAT=LIBOMF,OUTFILE=%^PROGRAM.lib

Other properties are inherited from its library modules.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ENTRY, FILEALIGN, ICONFILE, IMAGEBASE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MODEL, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SECTIONALIGN, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM, WIDTH.

Modules, which will be stored to the library, should be assembled beforehand to the files in OMF format. If the program, which creates library, contains some code, it will be assembled and stored as the first library module. Modules from other linked libraries, which do not declare any global symbol, will not be included in the target library at all. Example of a static OMF library linked from 3 standalone modules:

MyLib: PROGRAM FORMAT=LIBOMF
        LINK "Module1.obj", "Module2.obj", "Module3.obj"
       ENDPROGRAM MyLib

Although format OMF was developed for real-mode programs, it can be enhanced with import declarations represented by OMF records COMENT/IMPDEF, and such an import library can be used in Windows programs.

Some librarians (for instance [ALIB]) create a longer variant of the import library, which adds LEDATA+FIXUPP records with relocatable machine code of proxy jumps to the imported function.
€ASM does not create the longer version of import libraries but both short and long versions are accepted by the linker. Example of a program creating pure import library in short OMF format:

ImpLib PROGRAM FORMAT=LIBOMF
  IMPORT LIB="kernel32.dll",TerminateProcess,TerminateThread
  IMPORT LIB="user32.dll",CreateCursor,CreateIcon,CreateMenu
 ENDPROGRAM ImpLib

As an example of a LIBOMF library see the test t8600.

↑ COFF

EuroAssembler implements the object format COFF in Microsoft modification described in [MS_PECOFF]. This description is also valid for €ASM formats LIBCOF, PE, DLL (COFF-based formats).

€ASM creates four default segments (sections) in COFF-based formats:
[.text], [.rodata], [.data], [.bss]. Machine stack for executables will be established by the loader at run-time.

Default options for PROGRAM FORMAT=COFF are:

PROGRAM FORMAT=COFF,OUTFILE=%^PROGRAM.obj,MODEL=FLAT,WIDTH=32
These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ENTRY, FILEALIGN, ICONFILE, IMAGEBASE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SECTIONALIGN, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

The value generated in PFCOFF_FILE_HEADER.Machine for legacy-mode COFF is always 0x014C (Intel 386) regardless of the EUROASM CPU= value. In 64-bit mode it is always 0x8664 (architecture AMD64). The Itanium architecture (0x0200) is currently not supported.

PFCOFF_FILE_HEADER.TimeDateStamp corresponds with the current system time, unless it is forged by the option EUROASM TIMESTAMP=.

Linked COFF module is recognized by the contents of PFCOFF_FILE_HEADER.Machine which should be one of the words with value 0x0000, 0x014C, 0x014D, 0x014E, 0x0200, 0x8664.
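A minimal Python sketch of this test follows. It assumes only what the text and [MS_PECOFF] state: Machine is the first little-endian word of the 20-byte COFF file header. The function name coff_machine is hypothetical.

```python
import struct

# Machine values listed in the paragraph above.
ACCEPTED_MACHINES = {0x0000, 0x014C, 0x014D, 0x014E, 0x0200, 0x8664}


def coff_machine(data: bytes):
    """Return the Machine word if it is one of the accepted values, else None."""
    if len(data) < 20:                  # a COFF file header is 20 bytes long
        return None
    (machine,) = struct.unpack_from("<H", data, 0)
    return machine if machine in ACCEPTED_MACHINES else None
```

Because 0x0000 is an accepted value, callers should compare the result with None rather than test its truthiness.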

As an example of a COFF program see the test t8850 (for Windows) or t9000 (for Linux).

↑ LIBCOF

COFF library format is described in [COFFlib].

Default options for PROGRAM FORMAT=LIBCOF are:

PROGRAM FORMAT=LIBCOF,OUTFILE=%^PROGRAM.lib,MODEL=FLAT,WIDTH=32

Default segments are the same as in COFF format.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ENTRY, FILEALIGN, ICONFILE, IMAGEBASE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SECTIONALIGN, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

COFF library is identified by the signature !<arch> followed with byte 0x0A.
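Expressed in Python, the signature test is a single prefix comparison. The function name looks_like_coff_library is hypothetical and the sketch checks nothing beyond the eight signature bytes.

```python
ARCHIVE_SIGNATURE = b"!<arch>\n"        # seven characters followed by byte 0x0A


def looks_like_coff_library(data: bytes) -> bool:
    """Return True if data begins with the archive signature."""
    return data.startswith(ARCHIVE_SIGNATURE)
```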

You can create the library by linking from other object files. Modules, which will be stored to the library, should be assembled beforehand to files in COFF format (or OMF or ELF). If the program, which creates the library, contains some code (beside the LINK statements), it will be assembled and stored as the first library module. Modules which do not declare any global symbol, will not be included in the library at all. Example of COFF library linked from 3 modules:

MyLib: PROGRAM FORMAT=LIBCOF
         LINK "Module1.obj", "Module2.obj", "Module3.obj"
       ENDPROGRAM MyLib

€ASM does not create the longer version of import libraries but both short and long versions are accepted by the linker. Example of a program creating import library in short COFF format:

ImpLib: PROGRAM FORMAT=LIBCOF
         IMPORT LIB="kernel32.dll",TerminateProcess,TerminateThread
         IMPORT LIB="user32.dll",CreateCursor,CreateIcon,CreateMenu
        ENDPROGRAM ImpLib:

As an example of a LIBCOF library see the test t9150.

↑ ELF

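ELF alias Executable and Linkable Format is the file format used in Linux. There are three kinds of ELF files: linkable object files (format ELF), executable programs (format ELFX) and dynamic shared objects (format ELFSO).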

Default options for PROGRAM FORMAT=ELF are

Name: PROGRAM FORMAT=ELF, OUTFILE=%^PROGRAM.o, MODEL=FLAT, WIDTH=32, \
      FILEALIGN=16

ELF is an object (linkable) file with extension .o. It has the default segments [.text], [.rodata], [.data], [.bss]. The segments are called sections in [ELF] documentation. Beside those regular sections €ASM also creates service sections [.symtab], [.strtab], [.shstrtab], [.rela.text], [.rela.data]. See t9750 as an example of ELF object.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ENTRY, ICONFILE, IMAGEBASE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SECTIONALIGN, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

↑ ELFX

This is an executable program for Linux. It has the default file extension .x, if not prescribed otherwise by PROGRAM OUTFILE=.

The format ELFX creates segment groups [LOAD.HDR], [LOAD.CODE], [LOAD.RODATA], [LOAD.DATA], see for instance the test t9850. The groups are called program headers in [ELF] documentation or in Linux tools such as readelf.

Name: PROGRAM FORMAT=ELFX, OUTFILE=%^PROGRAM.x, MODEL=FLAT, WIDTH=32, \
    ENTRY=, IMAGEBASE=4M, FILEALIGN=16, SECTIONALIGN=4K

Parameter ENTRY= is mandatory; it specifies the entry point of the program.

These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ICONFILE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

↑ ELFSO

This is a DSO (Dynamic Shared Object) for Linux. EuroAssembler creates DSOs with the file extension .so but it does not link them dynamically. €ASM does not encompass the capabilities of the specialized Linux dynamic linker GNU ld. When a DSO is linked to an ELFX program, it is linked only statically.

Name: PROGRAM FORMAT=ELFSO, OUTFILE=%^PROGRAM.so, MODEL=FLAT, WIDTH=32, \
    IMAGEBASE=4M, FILEALIGN=4K, SECTIONALIGN=4K
These PROGRAM options are irrelevant: DLLCHARACTERISTICS, ENTRY, ICONFILE, MAJOROSVERSION, MAJORSUBSYSTEMVERSION, MAJORIMAGEVERSION, MAJORLINKERVERSION, MINOROSVERSION, MINORSUBSYSTEMVERSION, MINORIMAGEVERSION, MINORLINKERVERSION, WIN32VERSIONVALUE, SECTIONALIGN, SIZEOFHEAPCOMMIT, SIZEOFHEAPRESERVE, SIZEOFSTACKCOMMIT, SIZEOFSTACKRESERVE, STUBFILE, SUBSYSTEM.

↑ PE

Portable executable file format PE for Windows is described in the document [MS_PECOFF]. Default options for PROGRAM FORMAT=PE are

Name: PROGRAM FORMAT=PE,OUTFILE=%^PROGRAM.exe,MODEL=FLAT,WIDTH=32,IMAGEBASE=4M,FILEALIGN=512,SECTIONALIGN=4K, \
              SUBSYSTEM=CON,ICONFILE="euroasm.ico",MAJORLINKERVERSION=1,MINORLINKERVERSION=0,ENTRY=,          \
              MAJOROSVERSION=4,MINOROSVERSION=0,MAJORIMAGEVERSION=1,MINORIMAGEVERSION=0,                      \
              MAJORSUBSYSTEMVERSION=4,MINORSUBSYSTEMVERSION=0,WIN32VERSIONVALUE=0,DLLCHARACTERISTICS=0x000F,  \
              SIZEOFSTACKRESERVE=1M,SIZEOFSTACKCOMMIT=8K,SIZEOFHEAPRESERVE=4M,SIZEOFHEAPCOMMIT=1M

Default segments are the same as in COFF format.

PE file begins with DOS program (stub) in MZ format, which is executed when the program is not launched in MS-Windows. At the file address PFMZ_DOS_HEADER.e_lfanew it expects the PE format signature with bytes 'P','E',0,0.
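The lookup of the PE signature can be sketched in Python. The offset 0x3C of e_lfanew within the MZ header comes from [MS_PECOFF]. The function name pe_header_offset is hypothetical and the sketch validates nothing beyond the two signatures.

```python
import struct

E_LFANEW_OFFSET = 0x3C                  # position of e_lfanew inside the MZ header


def pe_header_offset(data: bytes):
    """Return the file offset of the PE signature, or None if data is not a PE image."""
    if len(data) < E_LFANEW_OFFSET + 4 or data[:2] != b"MZ":
        return None
    (e_lfanew,) = struct.unpack_from("<I", data, E_LFANEW_OFFSET)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        return None
    return e_lfanew
```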

Older file format with NE (New Executable) signature, used in 16-bit Windows and OS/2, is not supported by €ASM.

COFF file header is followed by PFPE_OPTIONAL_HEADER. Almost all its fields are configurable with PROGRAM options.
PROGRAM ENTRY= must be explicitly specified in PE format.
Option PROGRAM STUBFILE= specifies the file name of the 16-bit MZ program used when the program runs in DOS. If it is left empty, €ASM will use its own built-in stub, which reports the error message "This program was launched in DOS but it requires Windows." and terminates.
Factory default option ICONFILE="euroasm.ico" specifies the file name of the icon which will be built into the resource section of the linked PE file. It visually represents the compiled file on the Desktop or in Windows Explorer.

This parameter is ignored if any resource file is explicitly linked into PE (Explorer will then use the first icon found in the PE resources). If the ICONFILE= option is explicitly defined as empty, and if no resources are linked, the resource section [.rsrc] will be omitted from PE file completely.

Optional header is followed with 16 special directory entries which identify sections with special purposes (other than ordinary segment purposes CODE, DATA, BSS). See the last 16 lines in Segment purpose table, starting with EXPORT.

EuroAssembler natively supports only a few of the special PE directories:

EXPORT
automatically creates section [.edata] with the table of exported symbols, if they are declared.
IMPORT
automatically creates section [.idata] with the table of imported symbols names and ordinals.
RESOURCE
is created when a resource file is linked to the executable or when the program option ICONFILE= specifies an existing icon.
BASERELOC
contains the table of relocations which must be applied by the loader when the executable could not be loaded at the preferred VA specified by the program option IMAGEBASE=.
IAT
import address table is created in section [.idata], same as the special directory IMPORT. Concatenation of tables IAT, IMPORT and thunk proxy jumps to the one common section [.idata] reduces the size of image.

Other special directories are not supported by this EuroAssembler version. Nevertheless, their segment may be created explicitly, their contents created manually or by some third-party tool and emitted to the segment with INCLUDEBIN or directly with Data definition statements. If segment parameter PURPOSE= complies with the name in purpose table (case insensitive), the corresponding directory entry in PE optional header will be created, covering the whole segment contents. Example:

[.cormeta] SEGMENT PURPOSE=CLR
 D '<compatibility xmlns="urn:schemas-microsoft-com:compatibility.v1">'
 D '  <application>'
 D '     <!-- A list of all Windows versions that this application is designed to work with. -->'
 D '   </application>'
 D ' </compatibility>'

When the EUROASM option DEBUG=ENABLED is in effect at the ENDPROGRAM pseudoinstruction, the symbol table is appended to the PECOFF image.

Debuggers should be able to retrieve symbol names from the debugged executable and associate them with disassembled source lines. Unfortunately, none of the tools which I tried was able to exploit the symbol table from PE.

↑ DLL

File format DLL is almost identical with the format PE, with a few minor differences:
File header field PFCOFF_FILE_HEADER.Characteristics is flagged with pfcoffFILE_DLL = 0x2000,
default file extension and image base are:

Name: PROGRAM FORMAT=DLL,OUTFILE=%^PROGRAM.dll,IMAGEBASE=256M

option ENTRY= is optional in DLL.

Default segments are the same as in COFF format.

Dynamically linkable symbols should be explicitly declared with exported scope.
Pseudoinstruction EXPORT supports dynamic DLL forwarding of exported function to a different function in other DLL, using the EXPORT key operands FWD= and LIB=. See the test t9475 as an example.

Format DLL is sometimes used as resource library which contains only [.rsrc] section, typically a collection of icons. This is achieved by linking of compiled resource file, as created by a third party resource compiler. Example of resource-only DLL, which contains 3 icons, can be found in tests t9485 and t9536.

↑ RSRC

Microsoft resources is the common name for multimedia data, such as bitmap pictures, icons, cursor shapes, fonts etc. The resources used in a GUI program are described in a resource script as a tree referring to individual graphic files. A typical script is a plain text file with extension .rc and it should be converted by a resource compiler into a binary resource file with extension .res, which is linkable by €ASM or other linkers. Its format is described in [RSRC].

MyCompiledResource PROGRAM FORMAT=RSRC does not work; EuroAssembler cannot compile resource scripts. Use a third-party tool instead, such as [MS_RC], [GoRC], or [ResourceHacker].

When a resource file is linked to the PE or DLL image created by €ASM, program option ICONFILE= is ignored. The file is converted by €ASM to an internal PECOFF binary-tree structure in the special section [.rsrc] and referred with an optional-header directory entry RESOURCE.

↑ Width of program formats

The width of output files linked by EuroAssembler is determined by the program option WIDTH= and it defaults to 32 in ELF and COFF-based formats. To create a 64-bit program ELF, ELFX, ELFSO, PE, DLL, COFF or LIBCOF, the program width must be explicitly specified. 64-bit CPU should be enabled, too (EUROASM CPU=X64).

Differences between PE-COFF formats generated by EuroAssembler

Member                                                  PROGRAM WIDTH=16    PROGRAM WIDTH=32    PROGRAM WIDTH=64
PFCOFF_FILE_HEADER.Machine                              0x014C (Intel 386)  0x014C (Intel 386)  0x8664 (AMD64)
PFCOFF_FILE_HEADER.Characteristics:32BIT_MACHINE        0 (false)           0x0100 (true)       0 (false)
PFCOFF_FILE_HEADER.Characteristics:LARGE_ADDRESS_AWARE  0 (false)           0 (false)           0x0020 (true)
PFPE_OPTIONAL_HEADER32.Magic                            0x010B (PE32)       0x010B (PE32)       0x020B (PE32+)
SIZE# PFPE_OPTIONAL_HEADER32                            224                 224                 240
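The members compared in the table above can be read back from a linked image with a few lines of Python. This sketch assumes the header layout from [MS_PECOFF]: a 4-byte signature, then the 20-byte COFF file header with Characteristics at offset 18, then the optional header starting with Magic. The function name describe_pe_headers and the keys of the returned dictionary are hypothetical.

```python
import struct


def describe_pe_headers(data: bytes, e_lfanew: int) -> dict:
    """Decode Machine, two Characteristics flags and Magic of a PE image."""
    file_header = e_lfanew + 4          # skip the 'PE\0\0' signature
    (machine,) = struct.unpack_from("<H", data, file_header)
    (characteristics,) = struct.unpack_from("<H", data, file_header + 18)
    (magic,) = struct.unpack_from("<H", data, file_header + 20)
    return {
        "machine": machine,
        "32bit_machine": bool(characteristics & 0x0100),
        "large_address_aware": bool(characteristics & 0x0020),
        "format": {0x010B: "PE32", 0x020B: "PE32+"}.get(magic, "unknown"),
    }
```

Magic alone does not distinguish WIDTH=16 from WIDTH=32, since both are stored as PE32. The 32BIT_MACHINE flag does.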

↑ EuroAssembler functions

Preprocessing ↓

Refactoring ↓

Assembler ↓

Assembly debugging ↓

Linker ↓

Librarian ↓

Object convertor ↓

Makefile manager ↓

Optimisation ↓

Where to begin ↓

This chapter describes EuroAssembler capabilities.

↑ Preprocessing

Many assemblers provide tools which help the programmer with tedious and repetitive work; such assemblers are called macroassemblers. The preprocessing (macro) apparatus in EuroAssembler is recognizable by the percent sign % prefixed to pseudoinstructions which control the generation of repeated blocks of source code (%REPEAT, %WHILE, %FOR, %MACRO), conditional assembly (%IF, %COMMENT), assembly-time debugging (%DISPLAY) and the assignment and expansion of preprocessing %variables (%SET* family).

This set of tools manipulates the source text before it is submitted to the final assembly processing (to the plain assembler, which is not aware of the preprocessing apparatus at all).

Some compilers perform preprocessing in a special 0-th pass, which takes the input source file and emits plain assembly source. Preprocessed intermediate file can be manually inspected then.

EuroAssembler utilizes a different approach: instead of preprocessing the source file as a whole at once, it preprocesses statement by statement in each assembly pass. This makes it possible to work with data which change dynamically and which are not fixed until €ASM has passed through the source program at least once, for instance the distance between labels, the size of not-yet-defined structures and segments etc.

When €ASM reads a line of source text, first it searches for percent character %. If found, it inspects the immediately following character(s) and prepares a copy of the source line for the plain assembler, expanded according to the following rules:
What follows after %   Example    What will replace it
%                      %%         self-escaped single percent sign %
&                      %&         suboperation size/length
.                      %.         expansion counter
:                      %:         macro label
!                      %!formal   inverted condition, e.g. NC
*                      %*         macro ordinal operand list
#                      %#         number of macro ordinal operands
=*                     %=*        macro keyword operand list
=#                     %=#        number of macro keywords
^identifier            %^Width    system variable value (digits 16, 32 or 64)
decimal digit(s)       %12        12th ordinal macro operand value
letter(s)              %If        pseudoinstruction name is left unexpanded: %If
                       %Size      if it is a formal operand of %FOR or %MACRO, it will be expanded to its value
                       %OtherId   otherwise it is expanded as a user-defined preprocessing variable
For more details about scope of %variable expansion see the source text VarExpand.
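The dispatch on the characters following % can be sketched in Python as a reading aid for the table above. This is not the logic of VarExpand: the function name classify_percent and the returned strings are hypothetical, and the sketch only names the rule without performing any expansion.

```python
def classify_percent(text: str) -> str:
    """Name the expansion rule selected by the characters that follow a % sign."""
    two_chars = {
        "=*": "macro keyword operand list",
        "=#": "number of macro keywords",
    }
    one_char = {
        "%": "single percent sign",
        "&": "suboperation size/length",
        ".": "expansion counter",
        ":": "macro label",
        "!": "inverted condition",
        "*": "macro ordinal operand list",
        "#": "number of macro ordinal operands",
    }
    if text[:2] in two_chars:
        return two_chars[text[:2]]
    if text[:1] in one_char:
        return one_char[text[:1]]
    if text[:1] == "^":
        return "system variable"
    if text[:1].isdigit():
        return "ordinal macro operand"
    if text[:1].isalpha():
        return "pseudoinstruction, formal operand or user-defined variable"
    return "not an expansion"
```

The argument is the text after the percent sign, so classify_percent("^Width") names the rule applied to %^Width.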

The relation between preprocessing and the plain assembly is similar to the relation between Javascript and the plain HTML text in internet browsers.

Proper function of €ASM preprocessing can be checked in the listing, by enabling options EUROASM LISTVAR=ENABLE, LISTREPEAT=ENABLE, LISTMACRO=ENABLE.

↑ Refactoring

Inline code ↓

Bypassed PROC ↓

PROC in own section ↓

PROC1 ↓

PROC in INCLUDE ↓

Statically linked PROC ↓

Dynamically linked PROC ↓

Inline macro ↓

Macro calling PROC ↓

Semiinline macro ↓

This chapter demonstrates various ways to break up program functionality into small subprograms in EuroAssembler.

Let's suppose that we need a function which calculates the third power of input positive integer number. The result should fit to 32 bits, otherwise the program will report an overflow and abort.

Assuming 32-bit mode and the input number loaded in register EAX, the solution uses instruction MUL (unsigned multiplication) two times.

↑ Inline code

Straightforward solution inserts the code directly to the main program flow.

    ; EAX contains the input number N.
    MOV ECX,EAX ; Copy the input value N to the register ECX.
    MUL ECX     ; Let EDX:EAX = N*N
    JC Abort:   ; CF=OF=1 when EDX is nonzero (32-bit overflow).
    MUL ECX     ; Let EDX:EAX = N*N*N
    JC Abort:   ; Abort on overflow.
    ; EAX now contains N^3, continue the main program flow.
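The overflow logic of this fragment can be modelled outside the assembler. The following Python sketch is only an illustration: mul32 and cube are hypothetical helper names, and the return value None stands for the jump to Abort.

```python
MASK32 = 0xFFFFFFFF


def mul32(eax: int, operand: int):
    """Model unsigned 32-bit MUL: return (EAX, EDX, CF) for EDX:EAX = EAX * operand."""
    product = (eax & MASK32) * (operand & MASK32)
    edx = product >> 32
    return product & MASK32, edx, edx != 0


def cube(n: int):
    """Return N cubed, or None where the assembly code would jump to Abort."""
    ecx = n                             # MOV ECX,EAX
    eax, _, carry = mul32(n, ecx)       # EDX:EAX = N*N
    if carry:
        return None
    eax, _, carry = mul32(eax, ecx)     # EDX:EAX = N*N*N
    if carry:
        return None
    return eax
```

The largest input whose cube still fits in 32 bits is 1625, so cube(1626) returns None.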

↑ Bypassed PROC

When such calculation is needed more than once, we should consider refactoring the direct code to a subprocedure which could be called repeatedly. We will insert the procedure named Cube to the program flow when its function is needed for the first time. Insertion of callable procedure requires a bypass skip. The procedure should be also accompanied with remarks which document its function.

       ; EAX contains the input number N.
       CALL Cube:  ; Invoke the function which calculates N^3.
       JC Abort:   ; Abort on overflow.
       JMP Bypass: ; Skip the function code.
Cube PROC  ; Define a function which calculates 3rd power of N.
; Input:   EAX=integer number N.
; Output:  CF=OF=0, EAX=N^3, ECX=N, EDX=0.
; Overflow:CF=OF=1, EAX,ECX,EDX undefined.
       MOV ECX,EAX ; Copy the input value N to the register ECX.
       MUL ECX     ; Let EDX:EAX = N*N
       JC .Abort   ; CF=OF=1 when EDX is nonzero (32-bit overflow).
       MUL ECX     ; Let EDX:EAX = N*N*N
.Abort:RET         ; CF=OF=1 when EDX is nonzero (32-bit overflow).
     ENDPROC Cube
Bypass: ; EAX now contains N^3, continue the main program flow.

↑ PROC in own section

The instruction JMP Bypass: could be omitted if the procedure code were defined somewhere else, below the main program flow. This can be achieved by emitting the procedure to a different code section (for instance [Subproc]).

       ; EAX contains the input number N.
       CALL Cube:  ; Invoke the function which calculates N^3.
       JC Abort:   ; Abort on overflow.
%CurrentSect %SET %^Section ; Backup the current section name to a variable.
[Subproc]  ; Switch emitting to a different code section.
Cube PROC  ; Define a function which calculates 3rd power of N.
; Input:   EAX=integer number N.
; Output:  CF=OF=0, EAX=N^3, ECX=N, EDX=0.
; Overflow:CF=OF=1, EAX,ECX,EDX undefined.
       MOV ECX,EAX ; Copy the input value N to the register ECX.
       MUL ECX     ; Let EDX:EAX = N*N
       JC .Abort   ; CF=OF=1 when EDX is nonzero (32-bit overflow).
       MUL ECX     ; Let EDX:EAX = N*N*N
.Abort:RET         ; CF=OF=1 when EDX is nonzero (32-bit overflow).
     ENDPROC Cube
[%CurrentSect]     ; Return to the original code section.
        ; EAX now contains N^3, continue the main program flow.

↑ PROC1

Rather than manual section switch we could also utilize €ASM block PROC1..ENDPROC1 which will switch to a different section [@RT1] and return to the original section automatically.

       ; EAX contains the input number N.
       CALL Cube:  ; Invoke the function which calculates N^3.
       JC Abort:   ; Abort on overflow.
Cube PROC1 ; Define a function which calculates 3rd power of N in section [@RT1].
; Input:   EAX=integer number N.
; Output:  CF=OF=0, EAX=N^3, ECX=N, EDX=0.
; Overflow:CF=OF=1, EAX,ECX,EDX undefined.
       MOV ECX,EAX ; Copy the input value N to the register ECX.
       MUL ECX     ; Let EDX:EAX = N*N
       JC .Abort   ; CF=OF=1 when EDX is nonzero (32-bit overflow).
       MUL ECX     ; Let EDX:EAX = N*N*N
.Abort:RET         ; CF=OF=1 when EDX is nonzero (32-bit overflow).
     ENDPROC1 Cube ; End of subprocedure in section [@RT1]. Return to [.text].
     ; EAX now contains N^3, continue the main program flow.

↑ PROC in INCLUDE

Definition of function Cube at the place where it is used is good for understandability. On the other hand, when there are more such definitions, they clutter the main program thread. It could be more clearly organized if those helper functions were put away to a different file, for instance functions.inc. This file will be included to the main source file at assembly-time.

       INCLUDE "functions.inc" ; File with Cube: PROC source definition.
       ; EAX contains the input number N.
       CALL Cube:  ; Invoke the function which calculates N^3.
       JC Abort:   ; Abort on overflow.
       ; EAX now contains N^3, continue the main program flow.

↑ Statically linked PROC

Functions defined in the included file functions.inc can be wrapped in a block (or blocks) functions PROGRAM..ENDPROGRAM and assembled separately to an OMF, ELF or COFF object file functions.obj, or possibly to a library. The function name (Cube) must be declared as GLOBAL or PUBLIC in the object file, and it must be declared as GLOBAL or EXTERN in the main file. Instead of an explicit GLOBAL declaration it may also be specified with the double colon (Cube::). The assembled object will then be statically linked to the main program at link-time.

       LINK "functions.obj" ; Object file with assembled code of function Cube.
       ; EAX contains the input number N.
       CALL Cube:: ; Invoke the external function which calculates N^3.
       JC Abort:   ; Abort on overflow.
       ; EAX now contains N^3, continue the main program flow.

↑ Dynamically linked PROC

Functions defined in the included file functions.inc can be wrapped in a block (or blocks) functions PROGRAM..ENDPROGRAM and assembled separately to a dynamically linked library file functions.dll. The function name (Cube) must be declared as EXPORT in the library file, and as IMPORT in the main executable file. The assembled function in the DLL program will then be dynamically bound to the main program at run-time.

       IMPORT Cube, LIB="functions.dll"
       ; EAX contains the input number N.
       CALL Cube:: ; Invoke the DLL function which calculates N^3.
       JC Abort:   ; Abort on overflow.
       ; EAX now contains N^3, continue the main program flow.

↑ Inline macro

An alternative approach to the repeated inline code is utilizing a macro which will expand itself whenever the functionality is requested.

Statements which define the macro need not be bypassed, because they don't emit any code, but the macro definition must appear before the macro is used. The definition could be put aside to an included file as well, similarly to the PROC in INCLUDE method.

Cube %MACRO
       MOV ECX,EAX ; Copy the input value N to the register ECX.
       MUL ECX     ; Let EDX:EAX = N*N
       JC Abort%.: ; CF=OF=1 when EDX is nonzero (32-bit overflow).
       MUL ECX     ; Let EDX:EAX = N*N*N
Abort%.:           ; Label name is modified by %. variable, which increments in each macro expansion.
     %ENDMACRO Cube
     ; EAX contains the input number N.
     Cube          ; Expansion of the macro.
     JC Abort:     ; Abort on overflow.
     ; EAX now contains N^3, continue the main program flow.

↑ Macro calling PROC

Inline macros are fast but each invocation repeats the whole function code. The size of the program can be reduced if the macro calls a procedure with the function code, which can also be put aside to functions.inc. The role of the macro is then limited to processing any parameters and hiding the calling convention (no parameters are actually used in our simple example, though).

     INCLUDE "functions.inc" ; File with Cube: PROC source definition.
Cube %MACRO       ; Definition of the macro Cube.
       CALL Cube: ; Calling the procedure Cube:
     %ENDMACRO Cube
     ; EAX contains the input number N.
     Cube         ; Invoke macro which calls the included PROC.
     JC Abort     ; Abort on overflow.
     ; EAX now contains N^3, continue the main program flow.

↑ Semiinline macro

A disadvantage of the previous method is that we have to maintain two blocks of code: the macro definition and the procedure definition. €ASM provides the procedure block PROC1, which is assembled only once even if the macro which contains it is invoked repeatedly. Thanks to this, the procedure code is emitted only once, when the macro is invoked for the first time, and if the macro is never invoked, the code is not emitted at all. A macrolibrary with such semiinline macros can be included in any program and does not increase the size of the final code if the macro is not used (expanded) in the program.

This method is preferred in most macrolibraries shipped with EuroAssembler.

Cube %MACRO          ; Definition of the semiinline macro Cube.
       CALL Cube:    ; Calling the procedure Cube:
 Cube: PROC1         ; The PROC1 block is assembled only once on first macro invocation.
         MOV ECX,EAX ; Copy the input value N to the register ECX.
         MUL ECX     ; Let EDX:EAX = N*N
         JC .Abort:  ; CF=OF=1 when EDX is nonzero (32-bit overflow).
         MUL ECX     ; Let EDX:EAX = N*N*N
  .Abort:RET         ; CF=OF=1 when EDX is nonzero (32-bit overflow).
       ENDPROC1 Cube:
     %ENDMACRO Cube
     ; EAX contains the input number N.
     Cube            ; Invoke of macro which calls the embedded PROC1.
     JC Abort        ; Abort on overflow.
     ; EAX now contains N^3, continue the main program flow.

↑ Assembler

Source envelope ↓

Chained programs ↓

Nested programs ↓

This chapter takes a closer look at how a program block of statements is processed by EuroAssembler.

↑ Source envelope

Consider a plain text file src.asm submitted to assembler:

 DB 'This source "src.asm" has'
 DB ' no PROGRAM statement.',13,10
 DB 'EuroAssembler will use '
 DB 'a fictive envelope instead.'

As no PROGRAM..ENDPROGRAM block is defined in this source, the output format of €ASM object file is configured only by [PROGRAM] section in the configuration file euroasm.ini, or by built-in default, which is PROGRAM FORMAT=BIN,MODEL=TINY,WIDTH=16.

EuroAssembler formally wraps each source file into the two fictive envelope statements PROGRAM and ENDPROGRAM. Prefixed envelope PROGRAM statement derives its label (module name) from the source file name, cutting off its extension. Thus it will assemble the source src.asm to a data file src.bin. This behaviour is compatible with most other assemblers.

If the source file name starts with a digit, for instance 123.asm, such label is not acceptable by €ASM, so the module name will be prefixed with grave ` and source 123.asm is assembled to `123.bin.

Similarly, when the label of the PROGRAM statement contains ? or other characters not accepted by the filesystem, each such character in the module file name will be replaced with an underscore _. The statement IsNumlockOn? PROGRAM FORMAT=COM will produce a program named IsNumlockOn_.com.
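The naming rules from the last three paragraphs can be summarized in a Python sketch. The helper names are hypothetical, and the set of forbidden characters is an assumption based on what Windows filesystems reject. The text itself names only the question mark.

```python
import os

# Assumption: characters rejected by Windows filesystems; the manual names only "?".
FORBIDDEN_IN_FILENAMES = set('\\/:*?"<>|')


def module_name_from_source(source_file: str) -> str:
    """Derive the envelope PROGRAM label from the source file name."""
    name, _extension = os.path.splitext(os.path.basename(source_file))
    if name[:1].isdigit():
        name = "`" + name               # a label must not start with a digit
    return name


def target_file_name(module_name: str, extension: str) -> str:
    """Build the output file name, replacing unacceptable characters with _."""
    safe = "".join("_" if ch in FORBIDDEN_IN_FILENAMES else ch for ch in module_name)
    return safe + "." + extension
```

For example, target_file_name(module_name_from_source("123.asm"), "bin") gives `123.bin, in line with the example above.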

€ASM uses the ANSI version of the Windows API for dealing with file names, so I recommend abstaining from national characters outside the current codepage in source file names.

When the source file is loaded in the memory, €ASM begins to read the source, starting with the envelope statement PROGRAM. When the corresponding ENDPROGRAM is found, an assembly pass is over. €ASM checks all symbols, which might have been defined in the program, and looks whether their offset is marked fixed, i. e. it did not change between passes. If at least one symbol has its offset not fixed yet, another assembly pass is needed and €ASM goes back to the PROGRAM statement. When all symbols are fixed, €ASM starts the final assembly pass, in which code+data is generated to the target file and listing is produced. Each source requires at least two passes to assemble.

                                                     assembly progress ─>
┌─────────┬──────────────────────────────────┐
│envelope │src: PROGRAM                      │      █       ┌█
├─────────┼──────────────────────────────────┤       █      │ █
│      {1}│ DB 'This source "src.asm" has'   │        █     │  █
│"src.asm"│ DB ' no PROGRAM statement.',13,10│         █    │   █
│      {3}│ DB 'EuroAssembler will use '     │          █   │    █
│      {4}│ DB 'a fictive envelope instead.' │           █  │     █
├─────────┼──────────────────────────────────┤            █ │      █
│envelope │ ENDPROGRAM src:                  │             █┘       █─┐
└─────────┴──────────────────────────────────┘
                                                   ││        │      │ │
I0010 EuroAssembler started.───────────────────────┤│        │      │ │
I0180 Assembling source file "src.asm".────────────┤│        │      │ │
I0270 Assembling source "src".─────────────────────┘│        │      │ │
I0310 Assembling source pass 1.─────────────────────┘        │      │ │
I0330 Assembling source pass 2 - final.──────────────────────┘      │ │
I0760 16-bit TINY BIN file "src.bin" created from source, size=99.───┘ │
I0750 Source "src" (4 lines) assembled in 2 passes with errorlevel 0.─┤
I0860 Listing file "src.asm.lst" created, size=717.───────────────────┤
I0990 EuroAssembler terminated with errorlevel 0.─────────────────────┘
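The pass loop described above can be modelled in Python. This is a toy model under stated assumptions, not the €ASM algorithm: statements are hypothetical (label, kind, argument) tuples, "db" emits a given number of bytes, and "jmp" takes 2 bytes when its target is already known and within short range, otherwise 5 bytes. A symbol counts as fixed when its offset equals the one from the previous pass.

```python
def run_pass(statements, known):
    """Walk the statements once and return {label: offset}.

    known holds the label offsets from the previous pass (empty in pass 1).
    """
    offsets = {}
    position = 0
    for label, kind, arg in statements:
        if label is not None:
            offsets[label] = position
        if kind == "db":
            position += arg                      # arg = number of data bytes
        elif kind == "jmp":
            target = known.get(arg)              # arg = name of the target label
            fits_short = target is not None and -128 <= target - (position + 2) <= 127
            position += 2 if fits_short else 5   # short or near jump encoding
    return offsets


def count_passes(statements, limit=32):
    """Repeat passes until no offset changes, then add the final emitting pass."""
    known = {}
    for passes in range(1, limit + 1):
        offsets = run_pass(statements, known)
        if offsets == known:
            return passes + 1                    # plus the final pass
        known = offsets
    raise RuntimeError("offsets did not settle")
```

In this model a source without any symbol needs 2 passes, and a source that defines a single label needs 3.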

Envelope statements are used regardless of whether an explicit PROGRAM block was defined in the source text. Source lines between the start of the file and the explicit PROGRAM statement, as well as lines between the explicit ENDPROGRAM and the end of the source, should not emit any data or code. In this case the envelope program is empty and no target file is created from it.

Consider the following source file src.asm. There is an explicit block Src:PROGRAM..ENDPROGRAM Src: (lines 5..8) inside the invisible envelope statements src: PROGRAM and ENDPROGRAM src:. When the internal Src:PROGRAM..ENDPROGRAM Src: block is found in assembly process, this entire block is skipped until a final pass of outer block is performed. Then €ASM puts the currently assembled final pass aside, and starts to assemble the inner block in as many passes as necessary, creating the inner program target file. After then €ASM returns to finish the final pass of the outer (envelope) program.

    EUROASM ; Common options.
    ; Source file "src.asm"
    ; with PROGRAM defined
explicitly.
Src:PROGRAM FORMAT=BIN
     DB 'Data emitted '
     DB 'by program Src.'
     ENDPROGRAM Src:

Notice the bug: the wrap of comment line {3} yields a non-comment line {4}. The expression explicitly. is treated as a valid label (definition of an address symbol). This causes the envelope to be treated as not empty, and the target file src.bin is created from it, albeit with zero file size, as it contains only a zero-sized address symbol.
Inner program from lines {5..8} creates target file Src.bin with size 28 bytes, but it is soon overwritten with the envelope zero-sized target src.bin which happens to have almost identical name (filesystem in Dos|Windows is case-insensitive).


┌─────────┬──────────────────────────────────┐  █              assembly progress ─────────>
│envelope │src: PROGRAM                      │   █         ┌█         ┌█
├─────────┼──────────────────────────────────┤    █        │ █        │ █
│      {1}│ EUROASM ; Common options.        │     █       │  █       │  █
│      {2}│    ; Source file "src.asm"       │      █      │   █      │   █
│      {3}│    ; with PROGRAM defined        │       █     │    █     │    █
│      {4}│explicitly.                       │        █┐   │     █┐   │     █
│"src.asm"│Src:PROGRAM FORMAT=BIN            │         │   │      │   │      █─█   ┌█
│      {6}│     DB 'Data emitted '           │         │   │      │   │         █  │ █
│      {7}│     DB 'by program Src.'         │         │   │      │   │          █ │  █
│      {8}│     ENDPROGRAM Src:              │         └█  │      └█  │           █┘   █┐
├─────────┼──────────────────────────────────┤           █ │        █ │                 └█
│envelope │ ENDPROGRAM src:                  │            █┘         █┘                   █┐
└─────────┴──────────────────────────────────┘
                                                ││          │          │    │ ││    │  │  ││
I0010 EuroAssembler started.────────────────────┤│          │          │    │ ││    │  │  ││
I0180 Assembling source file "src.asm".─────────┤│          │          │    │ ││    │  │  ││
I0270 Assembling source "src".──────────────────┘│          │          │    │ ││    │  │  ││
I0310 Assembling source pass 1.──────────────────┘          │          │    │ ││    │  │  ││
I0310 Assembling source pass 2.─────────────────────────────┘          │    │ ││    │  │  ││
I0330 Assembling source pass 3 - final.────────────────────────────────┘    │ ││    │  │  ││
W2101 Symbol "explicitly." was defined but never used. "src.asm"{4}─────────┘ ││    │  │  ││
I0470 Assembling program "Src". "src.asm"{5}──────────────────────────────────┘│    │  │  ││
I0510 Assembling program pass 1. "src.asm"{5}──────────────────────────────────┘    │  │  ││
I0530 Assembling program pass 2 - final. "src.asm"{5}───────────────────────────────┘  │  ││
I0660 16-bit TINY BIN file "Src.bin" created, size=28. "src.asm"{8}─────────────────────┤  ││
I0650 Program "Src" assembled in 2 passes with errorlevel 0. "src.asm"{8}──────────────┘  ││
W3990 Overwriting previously generated output file "Src.bin".─────────────────────────────┤│
I0760 16-bit TINY BIN file "src.bin" created from source, size=0.──────────────────────────┤│
I0750 Source "src" (8 lines) assembled in 3 passes with errorlevel 3.─────────────────────┤│
I0860 Listing file "src.asm.lst" created, size=1372.──────────────────────────────────────┘│
I0990 EuroAssembler terminated with errorlevel 3.──────────────────────────────────────────┘

↑ Chained programs

EuroAssembler allows more than one program block to be defined in a single source file, and all of them to be assembled with one command. Remember that symbols used in different PROGRAM..ENDPROGRAM blocks have private scope, so they don't see each other, although they are defined in the same source file. If we want to call a procedure defined in Pgm1 from Pgm2, the called symbol must be declared global and both assembled modules must be linked together.

┌─────────┬──────────────────────────────────┐ █            assembly progress ─────────────────>
│envelope │src: PROGRAM                      │  █       ┌█
├─────────┼──────────────────────────────────┤   █      │ █
│      {1}│     EUROASM ; Common options.    │    █     │  █
│      {2}│Pgm1:PROGRAM FORMAT=PE,ENTRY=Run1:│     █┐   │   █─█   ┌█   ┌█
│      {3}│      ; Pgm1 data.                │      │   │      █  │ █  │ █
│      {4}│Run1: ; Pgm1 code.                │      │   │       █ │  █ │  █
│"src.asm"│     ENDPROGRAM Pgm1:             │      │   │        █┘   █┘   █┐
│      {6}│     ; Pgm2 description.          │      │   │                   █
│      {7}│Pgm2:PROGRAM FORMAT=PE,ENTRY=Run2:│      │   │                   └█   ┌█   ┌█
│      {8}│      ; Pgm2 data.                │      │   │                     █  │ █  │ █
│      {9}│Run2: ; Pgm2 code.                │      │   │                      █ │  █ │  █
│     {10}│      ENDPROGRAM Pgm2:            │      └█  │                       █┘   █┘   █┐
├─────────┼──────────────────────────────────┤        █ │                                  └█
│envelope │ ENDPROGRAM src:                  │         █┘                                    █┐
└─────────┴──────────────────────────────────┘
                                               ││        │    │    │    │  │ │    │    │   │ ││
I0010 EuroAssembler started.───────────────────┤│        │    │    │    │  │ │    │    │   │ ││
I0180 Assembling source file "src.asm".────────┤│        │    │    │    │  │ │    │    │   │ ││
I0270 Assembling source "src".─────────────────┘│        │    │    │    │  │ │    │    │   │ ││
I0310 Assembling source pass 1.─────────────────┘        │    │    │    │  │ │    │    │   │ ││
I0330 Assembling source pass 2 - final.──────────────────┘    │    │    │  │ │    │    │   │ ││
I0470 Assembling program "Pgm1". "src.asm"{2}─────────────────┤    │    │  │ │    │    │   │ ││
I0510 Assembling program pass 1. "src.asm"{2}─────────────────┘    │    │  │ │    │    │   │ ││
I0510 Assembling program pass 2. "src.asm"{2}──────────────────────┘    │  │ │    │    │   │ ││
I0530 Assembling program pass 3 - final. "src.asm"{2}───────────────────┘  │ │    │    │   │ ││
I0660 32bit FLAT PE file "Pgm1.exe" created, size=14320. "src.asm"{5}──────┤ │    │    │   │ ││
I0650 Program "Pgm1" assembled in 3 passes with errorlevel 0. "src.asm"{5}─┘ │    │    │   │ ││
I0470 Assembling program "Pgm2". "src.asm"{7}────────────────────────────────┤    │    │   │ ││
I0510 Assembling program pass 1. "src.asm"{7}────────────────────────────────┘    │    │   │ ││
I0510 Assembling program pass 2. "src.asm"{7}─────────────────────────────────────┘    │   │ ││
I0530 Assembling program pass 3 - final. "src.asm"{7}──────────────────────────────────┘   │ ││
I0660 32bit FLAT PE file "Pgm2.exe" created, size=14320. "src.asm"{10}─────────────────────┤ ││
I0650 Program "Pgm2" assembled in 3 passes with errorlevel 0. "src.asm"{10}────────────────┘ ││
I0750 Source "src" (10 lines) assembled in 2 passes with errorlevel 0.───────────────────────┤│
I0860 Listing file "src.asm.lst" created, size=1736.─────────────────────────────────────────┘│
I0990 EuroAssembler terminated with errorlevel 0.─────────────────────────────────────────────┘

Why should we pack multiple modules together with their documentation to a single file rather than scatter them to a bunch of small files? It's a matter of individual preferences.

One reason could be the transfer of information between modules with preprocessing %variables. Unlike ordinary symbols, the scope of %variables is not limited by PROGRAM..ENDPROGRAM block boundaries. Suppose that in Pgm2 we need to know the size of the data segment from Pgm1. Let's read the size into a %variable with the statement %Pgm1DataSize %SETA SIZE# [DATA], placed in Pgm1 just above ENDPROGRAM Pgm1. In the final pass of Pgm1 the segment size is reliably known, and the variable %Pgm1DataSize will be visible in the whole source below its definition, so Pgm2 can calculate with it.
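
The scenario above can be sketched as follows. This is only a minimal illustration under assumptions: the register ECX and the placeholder comments are arbitrary choices, and the segment name [DATA] is taken from the statement quoted above, so it may differ in your program format.

```asm
Pgm1: PROGRAM FORMAT=PE,ENTRY=Run1:
       ; Pgm1 data.
Run1:  ; Pgm1 code.
%Pgm1DataSize %SETA SIZE# [DATA] ; Placed just above ENDPROGRAM, as described in the text.
      ENDPROGRAM Pgm1:
Pgm2: PROGRAM FORMAT=PE,ENTRY=Run2:
Run2:  MOV ECX,%Pgm1DataSize     ; Expands to the size captured in Pgm1 (ECX is an arbitrary choice).
       ; Pgm2 code.
      ENDPROGRAM Pgm2:
```

The order of the programs matters here: Pgm1 must precede Pgm2 in the source, because the %variable is visible only below its definition.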

Another example where grouping programs is profitable is when the programs are similar or they share common data, declared with preprocessing %variables. The following example creates three similar short programs RstLPT1.com, RstLPT2.com, RstLPT3.com in a loop:

Nr %FOR 1,2,3     ; Repeat the %FOR..%ENDFOR block three times.
 RstLPT%Nr PROGRAM FORMAT=COM ; Program to reset LinePrinter port.
   MOV DX,%Nr-1   ; BIOS printer number is zero-based (0,1,2).
   MOV AH,1       ; BIOS function INITIALIZE LPT PORT.
   INT 17h        ; Use BIOS function to reset printer.
   MOV DX,Message ; Put the address of $-terminated string to DS:DX.
   MOV AH,9       ; DOS function WRITE STRING TO STDOUT.
   INT 21h        ; Use DOS function to report success.
   RET            ; Terminate program.
   Message:DB "LPT%Nr was reset.$"
 ENDPROGRAM RstLPT%Nr
%ENDFOR Nr        ; Generate 3 clones of the program.

↑ Nested programs

Program modules can be nested in one another. For instance, when building an amphibious program executable both in Dos and in Windows, we may want to reflect the fact that the Dos-executable MZ file is embedded as a stub in the Windows-executable PE file, both providing the same functionality.
See the sample projects LockTest or EuroConvertor as examples of a dual DOS&Windows program.

Again, when the outer program sees an inner program block in a non-final pass, the block is skipped. In the final pass the assembly of the outer program is temporarily suspended, the inner program is completely assembled, and then the final pass of the outer program continues.

┌─────────┬──────────────────────────────────┐ █                   assembly progress ──────────────>
│envelope │src: PROGRAM                      │  █       ┌█
├─────────┼──────────────────────────────────┤   █      │ █
│      {1}│      EUROASM ; Common options.   │    █     │  █
│      {2}│Pgm1: PROGRAM FORMAT=PE,ENTRY=Run:│     █┐   │   █─█       ┌█       ┌█
│      {3}│Run:   ; Pgm1 data + code.        │      │   │      █      │ █      │ █
│      {4}│ Pgm2: PROGRAM FORMAT=COFF        │      │   │       █┐    │  █┐    │  █─█  ┌█
│"src.asm"│        ; Pgm2 data + code.       │      │   │        │    │   │    │     █ │ █
│      {6}│       ENDPROGRAM Pgm2:           │      │   │        └█   │   └█   │      █┘  █─█
│      {7}│       ; Pgm1 more code.          │      │   │          █  │     █  │             █
│      {8}│       LINK "Pgm2.obj"            │      │   │           █ │      █ │              █
│      {9}│      ENDPROGRAM Pgm1:            │      └█  │            █┘       █┘               █─█
├─────────┼──────────────────────────────────┤        █ │                                         █
│envelope │ ENDPROGRAM src:                  │         █┘                                          █─┐
└─────────┴──────────────────────────────────┘
                                               ││        │    │        │        │   │   │ │    │   │ │
I0010 EuroAssembler started. ──────────────────┤│        │    │        │        │   │   │ │    │   │ │
I0180 Assembling source file "src.asm".────────┤│        │    │        │        │   │   │ │    │   │ │
I0270 Assembling source "src".─────────────────┘│        │    │        │        │   │   │ │    │   │ │
I0310 Assembling source pass 1.─────────────────┘        │    │        │        │   │   │ │    │   │ │
I0330 Assembling source pass 2 - final.──────────────────┘    │        │        │   │   │ │    │   │ │
I0470 Assembling program "Pgm1". "src.asm"{2}─────────────────┤        │        │   │   │ │    │   │ │
I0510 Assembling program pass 1. "src.asm"{2}─────────────────┘        │        │   │   │ │    │   │ │
I0510 Assembling program pass 2. "src.asm"{2}──────────────────────────┘        │   │   │ │    │   │ │
I0530 Assembling program pass 3 - final. "src.asm"{2}───────────────────────────┘   │   │ │    │   │ │
I0470 Assembling program "Pgm2". "src.asm"{4}───────────────────────────────────────┤   │ │    │   │ │
I0510 Assembling program pass 1. "src.asm"{4}───────────────────────────────────────┘   │ │    │   │ │
I0530 Assembling program pass 2 - final. "src.asm"{4}───────────────────────────────────┘ │    │   │ │
I0660 32bit FLAT COFF file "Pgm2.obj" created, size=78. "src.asm"{6}──────────────────────┤    │   │ │
I0650 Program "Pgm2" assembled in 2 passes with errorlevel 0. "src.asm"{6}────────────────┘    │   │ │
I0560 Linking COFF module ".\Pgm2.obj". "src.asm"{9}───────────────────────────────────────────┤   │ │
I0660 32bit FLAT PE file "Pgm1.exe" created, size=14320. "src.asm"{9}──────────────────────────┤   │ │
I0650 Program "Pgm1" assembled in 3 passes with errorlevel 0. "src.asm"{9}─────────────────────┘   │ │
I0750 Source "src" (9 lines) assembled in 2 passes with errorlevel 0.──────────────────────────────┤ │
I0860 Listing file "src.asm.lst" created, size=1237.───────────────────────────────────────────────┘ │
I0990 EuroAssembler terminated with errorlevel 0.────────────────────────────────────────────────────┘

↑ Assembly debugging

Some useful features of EuroAssembler can help the programmer to make sure that the source is assembled as intended.

Keep in mind that this is asm-time debugging, which helps to discover misunderstandings and errors in EuroAssembler itself rather than bugs in the assembled program.

The dump column of the listing displays the assembled code. Repeated stretches, which are considered bug-free, are suppressed by default, but they can be displayed on demand with directives EUROASM LISTINCLUDE=ON, LISTVAR=ON, LISTMACRO=ON, LISTREPEAT=ON.

Recognition of fields in statements can be investigated with option EUROASM DISPLAYSTM=ON, which inserts comment lines identifying each field. As this option blows up the listing size significantly, it's better to limit DISPLAYSTM only to the suspected lines, and then switch the option OFF or restore the previous set of options:

   EUROASM PUSH, DISPLAYSTM=ON ; Store all current EUROASM options with PUSH first.
   MyMacro Operand1, Operand2  ; "MyMacro" was not defined yet as a %MACRO, so it's treated like a label.
D1010 **** DISPLAYSTM "MyMacro Operand1, Operand2"
D1020 label="MyMacro"
D1040 unknown operation="Operand1"
D1050 ordinal operand number=1,value="Operand2"
   EUROASM POP                 ; Restore EUROASM options.
D1010 **** DISPLAYSTM "EUROASM POP"
D1040 pseudo operation="EUROASM"
D1050 ordinal operand number=1,value="POP"
                              ; Statement fields are no longer displayed.

Detailed encoding of machine instructions can be displayed with option EUROASM DISPLAYENC=ON, which inserts a comment line below each machine instruction with the list of actually used modifiers.

   EUROASM PUSH, DISPLAYENC=ON ; Store all current EUROASM options with PUSH first.
   SHRD [RDI+64],RDX,2
D1080 Emitted size=6,DATA=QWORD,DISP=BYTE,SCALE=SMART,ADDR=ABS,IMM=BYTE.
   VMOVNTDQA XMM17,[RBP+40h]
D1080 Emitted size=7,PREFIX=EVEX,DATA=OWORD,OPER=0,DISP=BYTE,SCALE=SMART,ADDR=ABS.
   EUROASM POP         ; Restore EUROASM options. Encodings are no longer displayed.

All configuration options, which can be specified with EUROASM and PROGRAM keyword operands, are retrievable in the form of system %^variables, thus their current value can be checked or otherwise exploited:

   %IF %^NOWARN[2101]
     %ERROR You shouldn't suppress the warning W2101. Move unused symbols to included file instead.
   %ENDIF

The most powerful assembly-time debugging tool is the pseudoinstruction %DISPLAY, which displays internal €ASM objects at assembly time and helps to find out why €ASM doesn't work as expected.

See tests t2901..t2917 as examples.

Static linking ↓

Dynamic linking ↓

Linking in IT terminology is the process in which separately assembled or compiled modules are joined, interactions between the globally accessible symbols are resolved, and their code and data are combined and reformatted to the target file format. See [Linkers] for more details.

Unlike many other linkers, EuroAssembler can create not only executable files, but also linkable formats ELF, COFF and OMF, and their libraries LIBCOF and LIBOMF (see Object convertor and the table of supported linker combinations).

Linking in EuroAssembler takes place when the pseudoinstruction ENDPROGRAM is processed in the final pass.

Linking is mediated with pseudoinstruction LINK which is followed by filenames of input modules. Input formats acceptable for EuroAssembler linker are of two kinds:

  1. linkable file formats for static linking are ELF, COFF, OMF, LIBCOF, LIBOMF, RSRC.
  2. importable file formats for dynamic linking are DLL, LIBCOF, LIBOMF.
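File formats accepted by EuroAssembler statement LINK
┌──────────┬───────────────┬────────────────────┬────────────────────────────────┬──────────────────────────────────────┬─────────────────────────────────────┐
│ CPU mode │ Program width │ Output executable  │ Output linkable                │ Input linkable                       │ Input importable                    │
├──────────┼───────────────┼────────────────────┼────────────────────────────────┼──────────────────────────────────────┼─────────────────────────────────────┤
│ Real     │ 16            │ BIN, BOOT, COM, MZ │ OMF, LIBOMF, COFF, LIBCOF      │ OMF, LIBOMF, COFF, LIBCOF            │ -                                   │
│ Real     │ 32            │ BIN, BOOT, COM, MZ │ OMF, LIBOMF, COFF, LIBCOF      │ OMF, LIBOMF, COFF, LIBCOF            │ -                                   │
│ Prot     │ 32            │ ELFX, PE, DLL      │ ELF, COFF, LIBCOF, OMF, LIBOMF │ ELF, COFF, LIBCOF, RSRC, OMF, LIBOMF │ ELF, COFF, LIBCOF, DLL, OMF, LIBOMF │
│ Prot     │ 64            │ ELFX, PE, DLL      │ ELF, COFF, LIBCOF              │ ELF, COFF, LIBCOF, RSRC              │ ELF, COFF, LIBCOF, DLL, OMF, LIBOMF │
└──────────┴───────────────┴────────────────────┴────────────────────────────────┴──────────────────────────────────────┴─────────────────────────────────────┘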

See also the table of tests on linker combinations.
Notice that the object format OMF cannot be linked in 64-bit programs.

The actual format of linked file is recognized by the file contents, not by the file name extension. Each linked module is loaded and converted to an €ASM internal format (PGM) in memory prior to the actual linking.

The position of pseudoinstruction LINK within the block PROGRAM..ENDPROGRAM is not important; names of the linked modules are just collected and the linking is postponed till the end of the program.

↑ Static linking

Code and data from the linked object files in formats ELF, COFF or OMF will be combined and concatenated with code and data from the base program (i.e. the one to which they are linked). The base program may be empty, however. The linker also resolves mutual references between the public and external symbols from all linked modules.

Unlike other linkers, EuroAssembler does not accept names of linked modules as its command line arguments. A linker script (€ASM source program) must be prepared beforehand when we want to employ EuroAssembler as a pure linker, for instance to convert object files created by a 3rd-party assembler or compiler to an executable file. The desired output file name and format will be specified as the PROGRAM arguments:

MyExeFile PROGRAM FORMAT=PE, WIDTH=32, ListMap=Yes, ListGlobals=Yes
              LINK MyCoff.obj, PascalOmf.obj, Win32.lib
          ENDPROGRAM MyExeFile
Save the linker script as MyScript.asm, execute euroasm MyScript.asm and it will produce the Windows program MyExeFile.exe and listing MyScript.asm.lst with the map of linked sections and global symbols.

Besides standalone object modules, code and data can also be linked from object libraries in formats LIBCOF and LIBOMF.

When the target base program is executable, €ASM only links those modules from library, which are at least once referenced by other modules (smart linking). This helps to keep size of the linked file small, eliminating the dead (never-to-be-executed) code.

If we nevertheless need to combine unreferenced library procedures into our executable program, we have to explicitly declare their names GLOBAL in the base program.
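
As a minimal sketch, assuming a hypothetical library MyLibrary.lib which contains a procedure UnreferencedProc that nothing in the program calls, such a declaration might look like this (all names are illustrative only):

```asm
MyProgram PROGRAM FORMAT=PE, ENTRY=Start:
            GLOBAL UnreferencedProc ; Hypothetical name; declaring it GLOBAL asks for it to be linked although it is never called.
Start:      RET                     ; Placeholder for the program code.
            LINK "MyLibrary.lib"    ; Hypothetical object library.
          ENDPROGRAM MyProgram
```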

Smart linking does not apply when the target file is linkable, for instance when a LIBCOF library is created from other libraries and standalone object modules. In this case all modules (referenced or unreferenced) will be linked to the target file.

A good reason to split a big project into smaller, separately assembled modules is a faster build.

When a project grows and its source is doubled in size, the number of symbols in it is likely to double, too. Each symbol needs to be compared with an array of other already declared symbols to avoid duplication. Number of checks, and also the consumed time, grows almost quadratically with source size.

During the development process we are usually focused on only one part (module) of the project, so the remaining unchanged modules do not need to be recompiled in each development cycle (see also Makefile manager).

Recapitulation: If you want to statically link your own function (procedure), declare it PUBLIC function (or terminate its definition label with two colons function:: PROC) and assemble the function to an object or library module.
Then assemble the main program, declare the linked function EXTERN function (or terminate the called name with two colons) and insert pseudoinstruction LINK module.obj into the main program. The main program then can CALL function:: as if it were assembled in its own body.
The same applies for functions from 3rd party library. Again, you must observe its published name, calling convention, number, order and type of arguments.
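
The recapitulation can be sketched with two source files. This is an illustration under assumptions: the file and program names module and main, the label Start and the chosen formats are hypothetical, and the function body is only a placeholder.

```asm
; File "module.asm": the linked function, assembled to the object file "module.obj".
module PROGRAM FORMAT=COFF
function:: PROC              ; Two colons make the symbol global (public in this module).
            ; Body of the function.
            RET
           ENDP function::
       ENDPROGRAM module

; File "main.asm": the main program, assembled after "module.asm".
main PROGRAM FORMAT=PE, ENTRY=Start:
Start: CALL function::       ; Two colons declare the called name as global (external here).
       RET                   ; Placeholder for the rest of the program.
       LINK "module.obj"     ; The position of LINK within the block is not important.
     ENDPROGRAM main
```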

↑ Dynamic linking

This version of EuroAssembler does not support dynamic linking of Linux dynamic libraries (DSO). The command LINK DSO.so tries to link the file only statically. This chapter concerns dynamic linking for MS-Windows only.

The code and data of dynamically linked functions are not copied to the target executable image; they remain in the dynamic library (DLL), which has to be available on the system where our executable runs. When our program calls a function from a DLL, it actually calls a thunk consisting of a single proxy jump instruction (stub).
€ASM generates stubs in a special import section [.idata] in the form of indirect absolute near jump (JMPN). Each such proxy jump is 7 bytes long (0xFF2425[00000000]) and it uses pointer into Import Address Table (IAT) as its indirect DWORD target. Virtual address in the pointer [00000000] is resolved by the linker, but the actual 32-bit or 64-bit virtual address of the library function (pointed to by the resolved dword) will be fixed up later, by the loader at bind time when the application starts.

Loader, implemented in Windows kernel, needs two pieces of information to dynamically link library functions and to fix up their addresses in IAT:

1) The name of the linked symbol (function name) or its ordinal number in the table of exported symbols.

Calling by ordinals is not supported in €ASM.

2) The name of the library file which exports the symbol (without path).

Path to the library file will be established by the loader. The order of directories where MS-Windows searches for the library is explained in [WinDllSearchOrder].

A program which needs to call a symbol (imported function) from a dynamic library should declare the symbol as imported. It may be declared GLOBAL as well, either explicitly or implicitly (CALL ImportedSymbol::), but without an import declaration €ASM will treat such a global symbol as EXTERN (statically linked) and complain that the corresponding public symbol was not found.
There are several ways to tell €ASM that the symbol should be dynamically linked.

Recapitulation: If you want to dynamically link your own function (procedure) in other programs, declare it EXPORT function and assemble the function to a DLL format (mylib PROGRAM FORMAT=DLL). Be sure to distribute mylib.dll together with your programs.
Then assemble the main executable program, declaring the linked function IMPORT function, LIB=mylib.dll. The main program can then invoke it using CALL function.
More often you will need to call functions from a 3rd party dynamic library, which is the case of the MS-Windows API. You might explicitly enumerate each used WinAPI function with a pseudoinstruction such as IMPORT function1,function2,LIB=user32.dll, but a more comfortable solution is to use an import library, which declares all function names exported by the DLL. Then you won't have to add new import declarations every time a new function is used in your program during its development. Simply call the new function with a double colon and, when its name appears in some import library, it will be treated as imported. You may also want to use the macro WinAPI (32-bit) or WinABI (64-bit), which takes care of the IMPORT declaration and automatic selection between the ANSI and WIDE variants.
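
A minimal sketch of the first two steps of this recapitulation follows. The statements EXPORT, IMPORT and FORMAT=DLL are taken from the text above; the file names, the label Start and the empty function body are assumptions made for illustration.

```asm
; File "mylib.asm": the dynamic library, assembled to "mylib.dll".
mylib PROGRAM FORMAT=DLL
        EXPORT function      ; Make the function available to other programs.
function: PROC
           ; Body of the function.
           RET
          ENDP function
      ENDPROGRAM mylib

; File "main.asm": the main executable program.
main PROGRAM FORMAT=PE, ENTRY=Start:
        IMPORT function, LIB=mylib.dll ; The function will be bound by the loader when the program starts.
Start:  CALL function
        RET                  ; Placeholder for the rest of the program.
     ENDPROGRAM main
```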

↑ Librarian

EuroAssembler can create libraries from previously assembled object modules (files in ELF, OMF or COFF format). When the library program itself contains some code and data, it will be implicitly linked to the library as the first module.

Library PROGRAM FORMAT=LIBOMF  ; or FORMAT=LIBCOF
ObjModule1:: PROC ; One of the object modules can also be defined here.
                  ; Source code of ObjModule1.
             ENDP ObjModule1::
             LINK "ObjModule2.obj", "ObjModule3.obj" ; Other ELF, OMF or COFF object modules.
        ENDPROGRAM Library

If the linked modules contain import information, it is copied to the output library, too. Pure import library contains import declarations only. They may be explicitly declared as IMPORT, or loaded from dynamic library, or linked from other import libraries. Following example exploits all three methods:

ImpLibrary PROGRAM FORMAT=LIBOMF ; or FORMAT=LIBCOF
             IMPORT Symbol1, Symbol2, LIB="DynamicLibrary1.dll" ; Explicit declaration.
             LINK "C:\MyDLLs\DynamicLibrary2.dll"               ; Automatic export detection from DLL.
             LINK "OtherImportLibrary.lib"                      ; Reimport from another library.
           ENDPROGRAM ImpLibrary

Examples of libraries created from three separately assembled modules can be found in €ASM tests:
t8552 (object library LIBOMF for 16-bit Dos),
t9113 (object library LIBCOF for 32-bit Windows),
t9164 (object library LIBCOF for 64-bit Windows),
t8675 (import library LIBOMF for Windows),
t9225 (import library LIBCOF for Windows).

↑ Object convertor

EuroAssembler can directly link all main object formats OMF, ELF and COFF, so the demand for explicit object conversion between them should be rare. Example:

OMFobject PROGRAM FORMAT=OMF ; Convert COFF object file to the format OMF.
            LINK "COFFobject.obj"
          ENDPROGRAM OMFobject
COFFobject PROGRAM FORMAT=COFF; Convert OMF object file to the format COFF.
             LINK "OMFobject.obj"
           ENDPROGRAM COFFobject
ELFobject PROGRAM FORMAT=ELF ; Convert COFF object file to the format ELF.
            LINK "COFFobject.obj"
          ENDPROGRAM ELFobject
COFFobject PROGRAM FORMAT=COFF; Convert ELF object file to the format COFF.
             LINK "ELFobject.o"
           ENDPROGRAM COFFobject
OMFlibrary PROGRAM FORMAT=LIBOMF ; Convert COFF object library to the format LIBOMF.
             LINK "COFFlibrary.lib"
           ENDPROGRAM OMFlibrary
COFFlibrary PROGRAM FORMAT=LIBCOF ; Convert OMF object library to the format LIBCOF.
              LINK "OMFlibrary.lib"
            ENDPROGRAM COFFlibrary

↑ Makefile manager

Operator FILETIME# retrieves the last modification time of a file at assembly time, which can be used to detect whether the target file needs reassembly. Just compare the filetime of the target with the filetime of each source which the target depends on. If the target file does not exist, its attribute operator FILETIME# returns 0, the same as if the file were very old, so its reassembly will be required anyway.

       ; Recompile "source.asm" only if "target.exe" doesn't exist or if it is older than its sources.
    %IF FILETIME# "target.exe" > FILETIME# "source.asm" && FILETIME# "target.exe" > FILETIME# "included2source.inc"
       %ERROR "target.exe" is fresh, no need to assemble again.
    %ELSE
       target PROGRAM FORMAT=PE
               INCLUDE "source.asm"
              ENDPROGRAM target
    %ENDIF

As an example of a more sophisticated makefile script, see the main EuroAssembler source file euroasm.htm.


↑ Optimisation

Computer programs are often written in assembler because we want them to be fast and small. However, those are not the only criteria by which a program can be optimised:

By program size ↓

By program speed ↓

By assembly speed ↓

By source writeability ↓

By source readability ↓

See also optimisation tutorials.

Let's look at how EuroAssembler can help with optimisation.

↑ Optimisation by the program size

€ASM selects by default the shortest possible encoding of machine instruction. On the other hand, it respects instruction mnemonic chosen by the programmer, which doesn't always have to be the shortest variant. A couple of rules worth remembering:

|0000:B80000      | MOV AX,0
|0003:29C0        | SUB AX,AX ; Using SUB or XOR for zeroing is shorter. Side effect: flags are changed.
|0005:            |
|0005:89D8        | MOV AX,BX
|0007:93          | XCHG AX,BX ; XCHG is shorter than MOV. Collateral damage: 2nd register is changed, too.
|0008:            |
|0008:            |Label:
|0008:8D06[0800]  | LEA AX,[Label]
|000C:B8[0800]    | MOV AX,Label ; Moving offset to a register is shorter than loading its address by LEA.
|000F:            |
|000F:5053        | PUSH AX,BX
|0011:60          | PUSHAW ; Pushing/popping all registers at once is shorter than individual push/pop.
|0012:            |
|0012:050100      | ADD AX,1
|0015:40          | INC AX ; Increment/decrement is shorter than add/subtract.
|0016:            |
|0016:            |LoopStart:
|0016:49          | DEC CX
|0017:75FD        | JNZ LoopStart:
|0019:E2FB        | LOOP LoopStart: ; LOOP, JCXZ are shorter than separate test+jump.

Programs which aspire for short-size category should have PROGRAM FORMAT=COM and EUROASM AUTOALIGN=OFF. They may be terminated by a simple near RET instead of invoking DOS function TERMINATE PROCESS, because the return address on stack of COM program is initialized to 0 and the final RET transfers execution to DOS terminating interrupt at the beginning of PSP block (CS:0), which was established by the loader.

Hello PROGRAM FORMAT=COM
       MOV DX,=B "Hello world!$"
       MOV AH,9
       INT 21h
       RET
      ENDPROGRAM Hello

For some more inspiration check [Golfing_tips], Hugi Size Coding Competition Series,
Assembly nibbles competition,
Graphical Tetris in 1986 bytes by Sebastian Mihai,
BootChess play in 487 bytes by Oliver Poudade.

Windows executable program created by €ASM will be shorter when the option PROGRAM ICONFILE= is explicitly specified as empty and no resource file is linked. In this case the resource section will not be included in PE file at all. You may also experiment with PE file properties using program options, such as PROGRAM FILEALIGN= value.

↑ Optimisation by the program speed

Writing fast programs is fully in the hands of the programmer. EuroAssembler cannot help much here; it does no optimisations behind your back as high-level compilers do. You may want to set EUROASM AUTOALIGN=ON to be sure that all data will be aligned for the best performance. Total control of instruction encoding in €ASM lets you select a variant with the exact code size, which is faster than a size-optimised encoding padded with NOPs. €ASM supports optimised no-operation encodings for fast and easy manual alignment.

There are many tricks how to squeeze every CPU clock: by loop unrolling, parallelization, avoiding memory access, and last but not least, choosing the fastest algorithm. Performance also heavily depends on CPU model and generation. Good guide is [SoftwareOptimisation] by Agner Fog.

Performance is usually traded off against program size; for instance, many of the size-saving tricks mentioned in the previous chapter lead to slower execution. You may want to optimise only the critical parts of the code which are executed many times in your program.

↑ Optimisation by the assembly speed

EuroAssembler is not optimised for speed; nevertheless, the duration of assembly is usually not an issue. It mostly depends on the number of passes, which is governed by €ASM itself and cannot be directly influenced by the programmer. At least two passes are always required. The number of passes increases when the program contains forward references, assembly-time loops or macroinstructions.

When €ASM assembles forward-referenced jumps, it at first anticipates a short distance to the not-yet-defined target and reserves room for only a 2-byte (short) jump instruction. If we know at write time that the forward target will be further than 127 bytes away, it is recommended to specify DIST=NEAR explicitly, which can save one pass at assembly time. However, the pass will be spared only when the distances of all such jumps are specified, which is usually not worth the effort.
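
A forward jump with explicitly specified distance might look like the following sketch. The label names are hypothetical, and the modifier is assumed to be written as a keyword operand of the instruction, in the same spelling DIST=NEAR that is used in the text above.

```asm
Start:  JMP Finish:, DIST=NEAR ; The forward target is known to be further than 127 bytes away.
        ; ... more than 127 bytes of code or data ...
Finish: RET
```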

If you are interested in why €ASM performs so many passes, put the statement %DISPLAY UnfixedSymbols in front of ENDPROGRAM to find out which symbols oscillate between assembly passes.
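
For illustration, the statement could be placed like this; the program name and format are hypothetical:

```asm
MyProgram PROGRAM FORMAT=COM
            ; Source of the program.
            %DISPLAY UnfixedSymbols ; Placed in front of ENDPROGRAM, as suggested above.
          ENDPROGRAM MyProgram
```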

The build time of big projects can be reduced significantly by splitting the code to smaller, separately assembled modules, which will be finally linked together. See also the euroasm.htm source itself.

↑ Optimisation by writeability

EuroAssembler introduced some new comfortable features which are unusual among other assemblers.

↑ Optimisation by readability

A well-commented and structured program is easy to read and maintain. EuroAssembler allows HTML formatting in comments, so the source code can be directly published on web sites and each part of the source can be immediately documented with richly formatted remarks, tables, images and hypertext links.

The size and language of identifiers are not limited, so they can be self-describing. If English is not your mother tongue, it is a good idea to prefer labels with non-English names, such as Drucken rather than Print, файл rather than file etc. This helps the reader of your program to distinguish built-in reserved words from identifiers created by the author.

Elements of the EuroAssembler language use decorators which help the human reader to distinguish the category of the decorated identifier.

↑ Where to begin

If you have read this manual this far and want to try EuroAssembler, download the latest version, print a hardcopy of the paper crib and look at the sample projects. Good luck!
