EuroAssembler is written in EuroAssembler in a static pseudo object-oriented paradigm (OOP).
A programming object is represented by a collection of related memory variables
described by an assembler structure (object class).
The data are manipulated with procedures called object methods. Objects are bound to their methods only by naming convention. Inheritance and cascading of methods are not used.
Each €ASM object and its methods are encapsulated in a separate source file,
which is assembled into a COFF module file. All modules are then linked into the final executable
euroasm.exe (a 32bit console application for MS Windows).
|CHUNK||Chunk of assembled source text.||chunk.htm|
|DOS stub program for PE executables.||coffstub.htm|
|CTX||Assembly block context.||ctx.htm|
|EA||EuroAssembler main object.||ea.htm|
|EuroAssembler linker script.||euroasm.htm|
|MEMBER||Member of structure or library.||member.htm|
|PASS||Assembly pass through the source.||pass.htm|
|PGM||The assembled module.||pgm.htm|
|SRC||Input source file.||src.htm|
|SYS||System calls of MS Windows functions.||syswin.htm|
|Instruction category||Uses registers||Module file|
|all||Machine instruction handlers support|
|A||Vendor specific (AMD)|
|B||Intel Fused Multiply-Add (FMA)|
|D||3DNow! specific (AMD, D3NOW)|
|K||Mask-registers manipulation (AVX512)|
|S||System special (SPEC, UNDOC, PROT, PRIV, MPX, SGX)|
|T||Transactional & other extensions (TSX, RTM, VMX, SVM)|
|V||Advanced Vector extension (AVX)|
|Y||Advanced Vector extension (AVX2)|
|Z||Advanced Vector extension (AVX512)|
|all||Linker for all €ASM format output files||pf.htm|
|BIN||Binary output file||pfbin.htm|
|COFF||16/32/64bit Common Object Format module||pfcoff.htm|
|COM||16bit DOS executable||pfcom.htm|
|DLL||32/64bit Dynamically Linked Library||pfdll.htm|
|LIBCOF||Library of COFF modules||pflibcof.htm|
|LIBOMF||Library of OMF modules||pflibomf.htm|
|MZ||16bit DOS executable||pfmz.htm|
|OMF||16/32bit Object Module Format||pfomf.htm|
|PE||32/64bit Windows Portable Executable||pfpe.htm|
|RSRC||Compiled Windows resource (input only)||pfrsrc.htm|
|EUROASM||any||Extensions of CPU machine instructions||cpuext.htm|
|EUROASM||any||Extensions of CPU machine instructions||cpuext32.htm|
|EUROASM||any||Memory management macros.||memory.htm|
|EUROASM||any||Boolean flag manipulation.||status32.htm|
|EUROASM||any||StdCall 32bit calling-convention macros.||stdcall.htm|
|EUROASM||any||Operations with zero-terminated strings.||string32.htm|
|EUROASM||Win||List of MS Windows API functions with ANSI+WIDE variants.||winansi.htm|
|EUROASM||Win||Macros for core 32bit MS Windows functions.||winapi.htm|
|EUROASM||Win||Wrappers of Windows file functions.||winfile.htm|
euroasm.exe terminates its execution.
euroasm.exe as a command-line parameter. It exists in one and only one static instance named
Src. €ASM will open the file and its included file(s), create the object Src, assemble, link, close the source, write the output and listing, and then destroy the object Src.
euroasm.exe, the object
Src is reinitialized and its assembly repeats.
PROGRAM..ENDPROGRAM block in the source file. It is created at the start of each assembly pass when the pseudoinstruction PROGRAM is assembled, and it is destroyed in the ENDPROGRAM handler.
Some information about the mutual relations between objects (€ASM source procedures
and macros) is scattered throughout the source files:
Main €ASM execution and termination
Machine instruction assembly
Linkage processing procedures
Boolean data are implemented as a 1-bit flag in the object's DWORD member, usually called
Integer numbers with 64 bits (QWORD type) are usually accessible as two DWORD variables postfixed Low and High,
STM.OffsetHigh. When loaded into two 32bit registers,
such a pair is referred to in comments as colon-separated, e.g.
Pointers are referred to in comments as
Ptr or as a caret sign ^.
^Name represents the 32bit offset of the Name.
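The Low/High convention for 64-bit integers can be illustrated with a minimal sketch. It is written in Python rather than €ASM, the function names are hypothetical, and printing the high half before the colon is an assumption about the order of the colon-separated pair.

```python
MASK32 = 0xFFFFFFFF


def split_qword(value):
    """Split an unsigned 64-bit integer into its (high, low) DWORD halves."""
    if not 0 <= value <= 0xFFFFFFFFFFFFFFFF:
        raise ValueError("value does not fit into 64 bits")
    return (value >> 32) & MASK32, value & MASK32


def join_qword(high, low):
    """Rebuild the 64-bit integer from its high and low DWORD halves."""
    return ((high & MASK32) << 32) | (low & MASK32)


high, low = split_qword(0x0000000123456789)
print(f"{high:08X}:{low:08X}")  # prints 00000001:23456789
```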
Strings are referenced by several methods:
Pointer and size, where the first register or variable keeps a pointer to the first byte of the string and the second register or variable keeps the string size in bytes. They are referred to in comments as comma-separated pairs, e.g.
Begin and end, where the first register points to the first string byte and the second register points right behind the last string byte. They are referred to in comments as ellipsis-separated pairs, e.g.
The size of such a string can be computed by subtracting the two registers.
ASCIIZ termination, where the string is referred to by a single pointer. The string ends with the NULL control character 0x00 (C string). This convention is mostly used when the string specifies a file name.
Size-prefixed strings have their size encoded in their first byte (Pascal string). The size of such a string cannot exceed 255 bytes. This format is employed in some older file formats (OMF).
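The four string conventions above can be compared side by side in a short sketch. It is written in Python for illustration only. The `memory` image, the offsets and all function names are hypothetical, with byte offsets into a bytearray standing in for 32bit pointers.

```python
# Hypothetical memory image: padding, the string, then zero bytes.
memory = bytearray(b"\x00" * 4 + b"euroasm.htm" + b"\x00" * 4)


def from_ptr_size(mem, ptr, size):
    """Pointer and size: read `size` bytes starting at `ptr`."""
    return bytes(mem[ptr:ptr + size])


def from_begin_end(mem, begin, end):
    """Begin and end: `end` points right behind the last byte, so size = end - begin."""
    return bytes(mem[begin:end])


def from_asciiz(mem, ptr):
    """ASCIIZ: scan forward from `ptr` to the terminating 0x00 byte."""
    end = mem.index(0, ptr)
    return bytes(mem[ptr:end])


def to_size_prefixed(data):
    """Size prefixed: one leading size byte, hence at most 255 bytes of data."""
    if len(data) > 255:
        raise ValueError("size-prefixed string cannot exceed 255 bytes")
    return bytes([len(data)]) + data


def from_size_prefixed(mem, ptr):
    """Size prefixed: the byte at `ptr` holds the size of the data that follows."""
    size = mem[ptr]
    return bytes(mem[ptr + 1:ptr + 1 + size])


print(from_ptr_size(memory, 4, 11))   # b'euroasm.htm'
print(from_begin_end(memory, 4, 15))  # b'euroasm.htm'
print(from_asciiz(memory, 4))         # b'euroasm.htm'
print(to_size_prefixed(b"OMF"))       # b'\x03OMF'
```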
A project of such magnitude requires strict discipline in choosing symbol names.
They always begin with an abbreviated object identification (object shortcut, for instance
Stm is the shortcut of the statement object), so it is easy to tell
the class to which a method or symbol belongs, and therefore in which source file it is defined.
The character case indicates what kind of data the identifier represents:
Class/structure names are all in uppercase (C convention), for instance
Boolean flag names begin with lowercase shortcut (camel convention), for instance
Procedures and methods have the first letter of object shortcut capitalized (lochness convention),
Local labels in procedures usually do not have mnemonic names. A monotonic numeric sequence
is used instead, e.g.
.10:, .20:, .30: (Basic convention).
The verb, which follows object shortcut in method name, indicates the function of the method.
Object constructors & destructors are named Create & Destroy, e.g.
Boolean flags (max. 32 per class) are kept in the object's DWORD variable named
.Status and manipulated with
SetSt, RstSt, JSt, JNSt from library
Zero-terminated (ASCIIZ) strings and macros which operate with such strings have their name terminated with the dollar character $, see macro library string32.htm.
Objects which have the property name, such as symbol, %variable, structure, program etc.,
usually keep their name in their first two DWORD variables: pointer to the object name
.NamePtr and size of the name
All case-insensitive names (registers, prefixes, machine instructions, pseudoinstructions, keywords etc.) are written in upper case here in EuroAssembler sources. Names of variables, procedures, macroinstructions are in mixed case.
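The convention of keeping up to 32 Boolean flags in one DWORD .Status member can be sketched as plain bit manipulation. The Python below is only an illustration. The flag names, their bit positions and the helper names are hypothetical, loosely modelled on what the names SetSt and RstSt suggest, and they are not the actual macros.

```python
# Hypothetical flag names and bit positions inside the 32-bit status word.
FLAG_DEFINED = 1 << 0
FLAG_USED = 1 << 1
DWORD_MASK = 0xFFFFFFFF


class StatusObject:
    def __init__(self):
        self.status = 0  # plays the role of the DWORD member .Status


def set_flag(obj, mask):
    """Set the given flag bit(s), keeping the value within 32 bits."""
    obj.status = (obj.status | mask) & DWORD_MASK


def reset_flag(obj, mask):
    """Clear the given flag bit(s)."""
    obj.status &= ~mask & DWORD_MASK


def flag_is_set(obj, mask):
    """Test the flag; a conditional jump would branch on this result."""
    return (obj.status & mask) != 0


symbol = StatusObject()
set_flag(symbol, FLAG_DEFINED | FLAG_USED)
reset_flag(symbol, FLAG_USED)
print(flag_is_set(symbol, FLAG_DEFINED), flag_is_set(symbol, FLAG_USED))  # True False
```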
If a special character is part of €ASM term and should be embedded in an identifier or HTML anchor, it is replaced with two lowercase letters:
For instance the handler of pseudoinstruction %SHIFT has a label
and URL pseudo.htm#PseudopcSHIFT.
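A minimal sketch of this substitution follows, in Python for illustration. Only the pair `%` to `pc` is taken from the example above (pseudo.htm#PseudopcSHIFT). The remaining pairs of the real table are not reproduced, and the function names are hypothetical.

```python
# Only "%" -> "pc" is known from the PseudopcSHIFT example; other pairs are omitted.
REPLACEMENTS = {"%": "pc"}


def embed_term(term):
    """Replace special characters so the term can be part of an identifier or anchor."""
    parts = []
    for ch in term:
        if ch.isalnum() or ch == "_":
            parts.append(ch)
        elif ch in REPLACEMENTS:
            parts.append(REPLACEMENTS[ch])
        else:
            raise ValueError(f"no replacement known for {ch!r}")
    return "".join(parts)


def anchor(prefix, term):
    """Build an HTML anchor name from an object prefix and an €ASM term."""
    return prefix + embed_term(term)


print(anchor("Pseudo", "%SHIFT"))  # PseudopcSHIFT
```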
EuroAssembler uses three kinds of subprograms:
Ad 1.: Besides ad hoc macroinstructions defined in the same source that uses them (for instance macros in ii.htm), €ASM uses some generally usable macros from libraries shipped with EuroAssembler.
Ad 2.: PROC / ENDPROC blocks are used only sporadically as local subroutines in large procedures. They are called with a register calling convention, using input/output registers described in their header.
Ad 3.: €ASM extensively employs subprograms defined with
Procedure, LocalVar, EndProcedure, Invoke.
Their advantage is that those four macros encapsulate the StdCall calling convention,
register preservation, reservation of local stack variables, maintenance of the stack frame and the final return.
Another advantage is that a procedure can be declared with an arbitrary number of arguments. Nevertheless,
the number of arguments provided by Invoke must exactly match the Procedure declaration.
Where a variable number of arguments was required, the subprogram was implemented as a macro
(see Msg as an example).
Procedures used in €ASM preserve all registers except those which return the result. Usually it is EAX but the result is sometimes returned in other register(s), too. For instance the macro BufferRetrieve returns the contents of the buffer as a string in registers ESI,ECX.
Arithmetic CPU flags are not preserved by subprograms. The exceptions are the macros Msg and MsgUnexpected, which preserve all CPU flags and registers. However, many procedures use the Carry flag to signal an error, or the Zero flag to signal emptiness.
All procedures expect clear Direction flag on input, and they return it reset on output (DF=0).
Calling convention of operating system functions is hidden in system macros in the file sys*.htm.
euroasm.ini are loaded to memory, assembled and immediately released.
Input files (the currently assembled source file and its included files) are mapped to memory and kept open with sharing access (allow read, deny write) until the assembler/linker ends and the output file is completed in a memory stream.
Currently assembled source files may be kept open by the text editor in which they are being written, but you won't be able to save them until the assembly terminates.
Output files (the target object|executable file and listing) are compiled in memory. When they are complete, input files are closed and only then is the output compilation flushed at once to an output disk file.
This method makes it possible to create an output listing with the same name as the input source, overwriting the source with its listing (or even with the assembled output file).
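The order of operations described above (read the input, compile the output in memory, close the input, then flush at once) can be sketched as follows. This is a Python illustration under stated assumptions. It reads the file instead of memory-mapping it, and the listing format and all names are hypothetical.

```python
import io
import os
import tempfile


def write_listing_over_source(source_path, output_path):
    """Build a listing in memory and write it only after the input is closed."""
    with open(source_path, "rb") as source:
        lines = source.read().splitlines()
    # The input file is closed here; nothing has been written to disk yet.

    listing = io.BytesIO()
    for number, line in enumerate(lines, start=1):
        # Hypothetical listing format: 4-digit line number, bar, source text.
        listing.write(f"{number:04d} | ".encode("ascii") + line + b"\n")

    # Flush the completed in-memory output to disk at once.
    with open(output_path, "wb") as output:
        output.write(listing.getvalue())


def demo():
    """Overwrite a temporary source file with its own listing."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "hello.asm")
        with open(path, "wb") as f:
            f.write(b"NOP\nRET\n")
        write_listing_over_source(path, path)  # same name for input and output
        with open(path, "rb") as f:
            return f.read()


print(demo())  # b'0001 | NOP\n0002 | RET\n'
```

Because the output is opened only after the input has been read and closed, using the same path for both is safe in this sketch.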
Requests for service from the operating system are encapsulated in macroinstructions
easource/sys???.htm. In the Windows version of EuroAssembler
it is the source file easource/syswin.htm which imports the
following API services from system library
See syswin.htm for their description.
Encapsulation of OS calls by Sys* macroinstructions facilitates future porting of EuroAssembler from MS Windows to other operating systems.
Unique objects Ea, Src, dictionary of enumerated tokens used by EuroAssembler language, text of €ASM messages, literal strings and some ad hoc local tables are allocated statically, in [.data] or [.bss] segments.
All other €ASM objects are allocated dynamically at run time,
either on machine stack, or in the memory provided on request from the operating system.
Recursively invokable procedures protect themselves from stack overflow with macro EaStackCheck.
EuroAssembler does not use system heap. It allocates dynamic memory in portions called
pool, implemented as a linked list of pool blocks with typical size 64 KB (or larger, if requested so).
In MS Windows it is provided by API functions
Memory once allocated from a pool is not returned to the OS at the moment when the object is discarded;
there is no garbage collection.
Instead, the pool memory is returned as a whole to the operating system when the pool's owner is destroyed.
There are four classes in €ASM which maintain their own pools: EA, SRC, PGM, PASS. Object methods choose the appropriate pool depending on the lifetime of each stored object, see also DOM.
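The pool idea (blocks requested in large portions, nothing freed individually, everything released together) can be sketched as follows. This is a Python illustration, not the €ASM implementation. A Python list stands in for the linked list of blocks, and the class and method names are hypothetical.

```python
BLOCK_SIZE = 64 * 1024  # typical pool block size mentioned in the text


class Pool:
    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = []  # each entry is [storage, bytes_used]

    def new(self, size):
        """Reserve `size` bytes and return (block_index, offset)."""
        if size <= 0:
            raise ValueError("size must be positive")
        if self.blocks:
            storage, used = self.blocks[-1]
            if len(storage) - used >= size:
                self.blocks[-1][1] = used + size
                return len(self.blocks) - 1, used
        # Not enough room: request another block, larger than usual if needed.
        self.blocks.append([bytearray(max(self.block_size, size)), size])
        return len(self.blocks) - 1, 0

    def store(self, data):
        """Copy `data` into the pool and return where it was placed."""
        index, offset = self.new(len(data))
        self.blocks[index][0][offset:offset + len(data)] = data
        return index, offset

    def destroy(self):
        """Release all blocks at once; individual objects are never freed."""
        self.blocks.clear()


pool = Pool(block_size=16)       # tiny blocks, only to show the behaviour
print(pool.store(b"abcdefgh"))   # (0, 0)
print(pool.store(b"ijklmnop"))   # (0, 8)
print(pool.store(b"q"))          # (1, 0)  first block is full
pool.destroy()
```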
Although memory can be allocated from a pool directly (using macros PoolNew or PoolStore), the pool serves mainly as a container for more sophisticated access methods:
STACK keeps a table of objects of the same size.
It has nothing in common with the CPU stack SS:ESP except its name and its LIFO (Last In = First Out) access method.
STACK is used by €ASM to reflect the structure of nested block objects CTX, CHUNK_HEAD, EAOPT.
LIST keeps a bidirectionally linked list of objects of the same size,
which are not kept together in memory. Listed objects can be accessed only sequentially,
either forward (FIFO | LILO) or backward (FILO | LIFO).
This method is used to store €ASM objects whose number cannot be reliably estimated at the beginning, such as symbols, %variables, macros, sections.
STREAM is a write-only memory class
which stores unformatted data strings sequentially to a collection of memory blocks.
The StreamStore access method is similar to FileWrite. When the stream is completed, it may be flushed
to a disk file at once with macro StreamDump.
This method is used in €ASM when output files are formatted.
BUFFER is used to store data items (strings) of variable size. Unlike the stream or list method, all data in a buffer are stored contiguously. If the buffer size specified on BufferCreate was underestimated and is exhausted, the buffer silently allocates another block of memory with doubled size from its pool, and copies the whole previous contents to the new location. Thus the entire buffer contents returned by BufferRetrieve is always contiguous.
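The growth strategy of BUFFER can be sketched as follows. This is a Python illustration with hypothetical names. Doubling repeatedly until the new item fits is an assumption, since the text only says that the new block has doubled size.

```python
class Buffer:
    def __init__(self, size):
        self.data = bytearray(size)  # initial estimate of the required size
        self.used = 0
        self.reallocations = 0

    def store(self, item):
        """Append `item`; grow by doubling when the current block is exhausted."""
        needed = self.used + len(item)
        if needed > len(self.data):
            new_size = max(len(self.data), 1)
            while new_size < needed:
                new_size *= 2
            bigger = bytearray(new_size)
            bigger[:self.used] = self.data[:self.used]  # copy previous contents
            self.data = bigger
            self.reallocations += 1
        self.data[self.used:needed] = item
        self.used = needed

    def retrieve(self):
        """Return the whole contents as one contiguous string."""
        return bytes(self.data[:self.used])


buffer = Buffer(4)
buffer.store(b"abc")
buffer.store(b"defgh")    # does not fit into 4 bytes, the buffer grows to 8
print(buffer.retrieve())  # b'abcdefgh'
```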
Buffers are extensively used by €ASM. Leaving their contents abandoned, until
the termination of parent PASS or PGM, would have negative impact on total memory consumption.
Therefore buffers can also be borrowed from the stack of preallocated buffers
Ea.BufferStack by the invocation of
EaBufferReserve and returned with
EaBufferRelease, not wasting the once allocated memory.
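Borrowing from a stack of preallocated buffers can be sketched as follows. This is a Python illustration of the general idea only. The class and method names are hypothetical and do not reproduce Ea.BufferStack, EaBufferReserve or EaBufferRelease.

```python
class BorrowedBuffer:
    """A reusable buffer whose storage survives between borrowings."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.used = 0


class BufferStack:
    def __init__(self, count, size):
        self.size = size
        self.free = [BorrowedBuffer(size) for _ in range(count)]

    def reserve(self):
        """Borrow a preallocated buffer, or create one if none is free."""
        if self.free:
            return self.free.pop()
        return BorrowedBuffer(self.size)

    def release(self, buffer):
        """Mark the buffer as empty and push it back for the next borrower."""
        buffer.used = 0
        self.free.append(buffer)


stack = BufferStack(count=1, size=16)
first = stack.reserve()
first.used = 5              # pretend that 5 bytes were stored
stack.release(first)
second = stack.reserve()
print(second is first, second.used)  # True 0
```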
When the requested dynamic memory size exceeds the usual values for which buffers are preallocated, the buffer or stack will automatically request an additional memory portion from its pool, and if there is no more free memory in the pool, its manager requests another block from the operating system. Thus dynamic memory management works transparently for the programmer and it is limited only by the amount of OS virtual memory, no matter how big the identifiers, expressions, nesting levels etc. in the assembled source may be.
euroasm.exe can be recompiled in
with the command
..\euroasm euroasm.htm, assuming that the stable
euroasm.exe version has not yet been moved from the EuroAssembler home directory
to somewhere on the system %PATH%.
euroasm.exe is created in subdirectory
EuroAssembler can also be built from browser at page
On my Intel Pentium machine running at 3 GHz with 8 GB RAM the complete rebuild reports:
...
I0750 Source "euroasm" (954269 lines) assembled in 2 passes with errorlevel 0.
I0860 Listing file "euroasm.htm.lst" created, size=4884076.
I0980 Memory allocation 121344 KB. 26270552 statements assembled in 1069 s.
I0990 EuroAssembler terminated with errorlevel 0.
Hints and techniques for extending the EuroAssembler source are scattered throughout the source files:
Add a new EUROASM option
Add a new PROGRAM option
Add a new /DIRECTIVE option
Add a new operator
Add a new machine instruction
Add a new output format
Porting EuroAssembler to other OS
If you want to modify EuroAssembler, I recommend following these steps:
1. Copy the latest downloaded stable euroasm.exe both to the €ASM home directory and to its source subdirectory (easource\euroasm.exe).
2. Using testman.php or generate.php, make sure that all tests of easource\euroasm.exe pass without error.
3. Using generate.php#Build, make sure that all modules can be rebuilt without errors. Alternatively, change to easource\, delete the old modules with del *.obj and then rebuild everything with ..\euroasm.exe euroasm.htm. The build should terminate with errorlevel 0.
4. Modify the EuroAssembler sources (easource\*.htm) with your enhancements.
5. Rebuild the modified sources with the downloaded stable version from €ASM home (repeat step 3).
6. Perform all tests with the new easource\euroasm.exe (repeat step 2).
7. If all tests passed, you can copy the modified easource\euroasm.exe to %PATH% and use it on your computer.
See also Licence for information concerning the modification of EuroAssembler sources.
The €ASM assembler and linker are not optimized for speed; there are plenty of places where they could be made to run faster:
LOOP Target although it is usually slower than
SUB ECX,1 ; JNZ Target.
My goal was an application which gets along with a 486 processor and any 32bit version of MS Windows. The order of optimization criteria was:
1. Maintainability and extensibility,
2. readability and understandability,
3. debuggability and stability (proper treatment of errors),
4. size of the code,
5. speed of code,
6. economical usage of memory.
Computer users should never trust executable files downloaded from Internet. It is a good practice to download the project in the form of source files and recompile it on your own PC with a compiler which you trust.
In the case of a self-compiled program this is complicated, because you don't have a trusted compiler yet. A suggested sandbox solution for paranoid EuroAssembler users follows:
euroasm.zip, compute its hash (
md5 euroasm.zip) and compare the obtained value with the hash published on the distribution site. The hash of each €ASM release is also published on the discussion forum, on the Twitter account @EuroAsm, or, in a justified case, you can ask the author for confirmation.
easource and rebuild the source with the downloaded executable. Be sure to provide the same forged timestamp which was used when the original source was released, e.g.
..\euroasm.exe euroasm.htm, timestamp=1512345678. Otherwise the compiled file could not be binary-identical with the downloaded version.
easource\euroasm.htm.lst with the code in COFF objects or in the PE itself.
fc ..\euroasm.exe euroasm.exe. When both files are identical, the assumption made in step 3 is true and
euroasm.exe was successfully audited.
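The two mechanical checks in this procedure, hashing the downloaded archive and comparing two executables byte for byte, can also be scripted. The sketch below is a Python stand-in for the md5 and fc commands. The function names and the file names used in the demonstration are hypothetical.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=65536):
    """Return the MD5 hash of a file as a hexadecimal string."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_identical(path_a, path_b):
    """Binary comparison of two files, like `fc` reporting no differences."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        return a.read() == b.read()


def demo():
    """Exercise both checks on two small temporary files with equal contents."""
    with tempfile.TemporaryDirectory() as directory:
        downloaded = os.path.join(directory, "downloaded.bin")
        rebuilt = os.path.join(directory, "rebuilt.bin")
        for path in (downloaded, rebuilt):
            with open(path, "wb") as f:
                f.write(b"abc")
        return file_md5(downloaded), files_identical(downloaded, rebuilt)


print(demo())  # ('900150983cd24fb0d6963f7d28e17f72', True)
```

The hash printed by `file_md5` would then be compared by eye, or in code, with the value published for the release.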