EuroAssembler source
Architecture
Building EuroAssembler
Calling convention
Data representation
Document Object Model
Extensibility
File access method
Files
Interaction with OS
Memory management
Naming convention
Optimization
Source audit

↑ Architecture

EuroAssembler is written in EuroAssembler in a static pseudo object-oriented programming (OOP) paradigm. A programming object is represented by a collection of related memory variables described by an assembler structure (object class) STRUC..ENDSTRUC.

The data are manipulated with procedures – object methods. Objects are bound to their methods only by naming convention. Inheritance and cascading of methods are not used.

Each €ASM object and its methods are encapsulated in a separate source file which is assembled into a COFF module file. All modules are then linked into the final executable file euroasm.exe (a 32bit console application for MS Windows).



General object modules
Class   Representation                          Module file
CHUNK   Chunk of assembled source text.         chunk.htm
-       DOS stub program for PE executables.    coffstub.htm
CTX     Assembly block context.                 ctx.htm
DICT    Language dictionary.                    dict.htm
EA      EuroAssembler main object.              ea.htm
EAOPT   EUROASM options.                        eaopt.htm
-       EuroAssembler linker script.            euroasm.htm
EXP     Expression evaluator.                   exp.htm
LST     Assembly listing.                       lst.htm
MAC     Macroinstruction handler.               mac.htm
MEMBER  Member of structure or library.         member.htm
MSG     EuroAssembler message.                  msg.htm
PASS    Assembly pass through the source.       pass.htm
PGM     The assembled module.                   pgm.htm
PGMOPT  PROGRAM options.                        pgmopt.htm
PSEUDO  Pseudoinstruction handlers.             pseudo.htm
RELOC   Relocation.                             reloc.htm
SRC     Input source file.                      src.htm
SSS     Structure/section/segment/group.        sss.htm
STM     Statement.                              stm.htm
SYM     Symbol.                                 sym.htm
SYS     System calls of MS Windows functions.   syswin.htm
VAR     Preprocessing %variable.                var.htm

Machine instruction modules
Instruction category                                      Uses registers  Module file
all Machine instruction handlers support                  -               ii.htm
A Vendor specific (AMD)                                   RAX..YMM15      iia.htm
B Intel Fused Multiply-Add (FMA)                          XMM0..ZMM31     iib.htm
C Vendor-specific (CYRIX)                                 MM0..MM7        iic.htm
D 3DNow! specific (AMD, D3NOW)                            XMM0..XMM15     iid.htm
F Floating-point (FPU)                                    ST0..ST7        iif.htm
G General instructions                                    RAX..R15        iig.htm
K Mask-registers manipulation (AVX512)                    K0..K7          iik.htm
M Multimedia (MMX)                                        MM0..MM7        iim.htm
P Packed (SSE)                                            XMM0..XMM15     iip.htm
S System special (SPEC, UNDOC, PROT, PRIV, MPX, SGX)      special         iis.htm
T Transactional & other extensions (TSX, RTM, VMX, SVM)   -               iit.htm
V Advanced Vector extension (AVX)                         XMM0..YMM15     iiv.htm
X XOP-encodable AMD                                       XMM0..YMM15     iix.htm
Y Advanced Vector extension (AVX2)                        XMM0..YMM15     iiy.htm
Z Advanced Vector extension (AVX512)                      XMM0..ZMM31     iiz.htm

Program formats modules
Format  Platform                                  Module file
all     Linker for all €ASM format output files   pf.htm
BIN     Binary output file                        pfbin.htm
COFF    16|32|64bit Common Object Format module   pfcoff.htm
COM     16bit DOS executable                      pfcom.htm
DLL     32|64bit Dynamically Linked Library       pfdll.htm
LIBCOF  Library of COFF modules                   pflibcof.htm
LIBOMF  Library of OMF modules                    pflibomf.htm
MZ      16bit DOS executable                      pfmz.htm
OMF     16|32bit Object Module Format             pfomf.htm
PE      32|64bit Windows Portable Executable      pfpe.htm
RSRC    Compiled Windows resource (input only)    pfrsrc.htm

Includable libraries used in €ASM sources
Realm    OS   Support                                                     Maclib file
EUROASM  any  Extensions of CPU machine instructions                      cpuext.htm
EUROASM  any  Extensions of CPU machine instructions                      cpuext32.htm
EUROASM  any  Memory management macros.                                   memory.htm
EUROASM  any  Data sorting.                                               sort.htm
EUROASM  any  Boolean flag manipulation.                                  status32.htm
EUROASM  any  StdCall 32bit calling-convention macros.                    stdcall.htm
EUROASM  any  Operations with zero-terminated strings.                    string32.htm
EUROASM  Win  List of MS Windows API functions with ANSI+WIDE variants.   winansi.htm
EUROASM  Win  Macros for core 32bit MS Windows functions.                 winapi.htm
EUROASM  Win  Wrappers of Windows file functions.                         winfile.htm

↑ Document Object Model

EA
EA is the name of the main object class with global €ASM data. It exists as one and only static instance named Ea until euroasm.exe terminates its execution.
SRC
Object class SRC describes one source file which was submitted to euroasm.exe as a command-line parameter. It exists as one and only static instance named Src. €ASM will open the file and its included file(s), create an object Src, assemble, link, close the source, write output and listing and then destroy the object Src.
If more than one source file was submitted to euroasm.exe, the object Src is reinitialized and the assembly repeats.
PGM
Object of class PGM represents a program module, i.e. an autonomous unit, which can be assembled, linked and which creates one target (executable file or linkable object file). Its instance is created when €ASM first meets pseudoinstruction PROGRAM, and it is destroyed when it encounters the corresponding pseudoinstruction ENDPROGRAM in the last pass.
PASS
Object of class PASS keeps data of an assembly pass through the PROGRAM..ENDPROGRAM block in the source file. It is created at the start of each assembly pass when pseudoinstruction PROGRAM is assembled, and it is destroyed in ENDPROGRAM handler.

Some information about mutual relations between objects (€ASM source procedures and macros) is scattered throughout the source files:
Main €ASM execution and termination
Machine instruction assembly
Linkage processing procedures

↑ Data representation

Boolean data are implemented as 1-bit flags in an object's DWORD member, usually called .Status.

Integer numbers with 64 bits (QWORD type) are usually accessible as two DWORD variables postfixed Low and High, e.g. STM.OffsetLow and STM.OffsetHigh. When loaded into two 32bit registers, such a pair is referred to in comments as colon-separated, e.g. EDX:EAX.
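
The Low/High split can be illustrated with a minimal sketch. It is written in Python rather than €ASM, purely for illustration; the names split_qword and join_qword are hypothetical and do not appear in the EuroAssembler sources. It assumes the usual x86 reading of EDX:EAX, where EDX holds the high DWORD and EAX the low one.

```python
MASK32 = 0xFFFFFFFF

def split_qword(value):
    """Split an unsigned 64-bit integer into (low, high) 32-bit halves."""
    return value & MASK32, (value >> 32) & MASK32

def join_qword(low, high):
    """Rebuild the 64-bit value from its halves, read as high:low."""
    return ((high & MASK32) << 32) | (low & MASK32)

# Example: an offset that does not fit into one DWORD.
offset_low, offset_high = split_qword(0x0000000123456789)
```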

Pointers are referred to in comments as Ptr or with a caret sign ^. For instance, ^Name represents the 32bit offset of Name.

References to strings are implemented in several ways:
Pointer and size where the first register or variable keeps a pointer to the first byte of the string, and the second register or variable keeps the string size in bytes. They are referred to in comments as comma-separated pairs, e.g. ESI,ECX.
Begin and end where the first register points to the first string byte, and the second register points right behind the last string byte. They are referred to in comments as ellipsis-separated pairs, e.g. ESI..EDX. The size of such a string can be computed by subtracting the two registers.
ASCIIZ termination where the string is referred to with one and only pointer. The string ends with the NULL control character 0x00 (C string). This convention is mostly used when the string specifies a file name.
Size prefixed strings have their size encoded in their first byte (Pascal string). The size of such a string cannot exceed 255 bytes. This format is employed in some older file formats (OMF).
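
The four ways of referring to a string can be compared in a small sketch. It uses Python instead of €ASM, with a bytes object standing in for memory and integer indexes standing in for pointers; all function names are hypothetical.

```python
def from_ptr_size(mem, ptr, size):
    """Pointer and size, e.g. ESI,ECX."""
    return mem[ptr:ptr + size]

def from_begin_end(mem, begin, end):
    """Begin and end, e.g. ESI..EDX; end points right behind the last byte."""
    return mem[begin:end]

def from_asciiz(mem, ptr):
    """ASCIIZ: the string runs up to, but not including, the 0x00 byte."""
    return mem[ptr:mem.index(0, ptr)]

def from_size_prefixed(mem, ptr):
    """Size prefixed: the first byte holds the size (0..255)."""
    size = mem[ptr]
    return mem[ptr + 1:ptr + 1 + size]
```

Only the first two forms can describe a substring in place, without copying it or modifying the surrounding memory.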

↑ Naming convention

A project of such magnitude requires strict discipline in choosing symbol names. They always begin with an abbreviated object identification (object shortcut, for instance Stm is the shortcut of the statement object), so it is easy to tell which class a method or symbol belongs to, and therefore in which source file it is defined. The character case indicates what kind of data the identifier represents:

Class/structure names are all in uppercase (C convention), for instance STM.

Boolean flag names begin with lowercase shortcut (camel convention), for instance stmPrefixPresent.

Procedures and methods have the first letter of object shortcut capitalized (lochness convention), for instance StmParse.

Local labels in procedures usually do not have mnemonic names. An increasing numeric sequence is used instead, e.g. .10:, .20:, .30: (Basic convention).

The verb, which follows object shortcut in method name, indicates the function of the method. Object constructors & destructors are named Create & Destroy, e.g. StmCreate, StmDestroy.

Boolean flags (max.32 per class) are kept in object DWORD variable named .Status and manipulated with macros SetSt, RstSt, JSt, JNSt from library status32.htm.
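
A minimal sketch of the flag handling, in Python rather than €ASM. The mask values and the flag name stmLabelPresent are invented for illustration, and the helper functions only mirror what the macro names SetSt, RstSt, JSt and JNSt suggest (set, reset, jump if set, jump if not set); see status32.htm for their real definitions.

```python
# Hypothetical flag masks; the real values are defined in the €ASM sources.
stmPrefixPresent = 0x00000001
stmLabelPresent = 0x00000002      # invented name, for illustration only

class Stm:
    def __init__(self):
        self.Status = 0           # one DWORD holding up to 32 Boolean flags

def set_st(obj, flag):
    """Switch the flag on (the role suggested by the name SetSt)."""
    obj.Status |= flag

def rst_st(obj, flag):
    """Switch the flag off (the role suggested by the name RstSt)."""
    obj.Status &= ~flag & 0xFFFFFFFF

def is_st(obj, flag):
    """The condition on which JSt would jump, and JNSt would not."""
    return (obj.Status & flag) != 0
```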

Zero-terminated (ASCIIZ) strings and macros which operate with such strings have their name terminated with dollar character $, see macrolibrary string32.htm.

Objects which have the property name, such as symbol, %variable, structure, program etc., usually keep their name in their first two DWORD variables: pointer to the object name .NamePtr and size of the name .NameSize.

All case-insensitive names (registers, prefixes, machine instructions, pseudoinstructions, keywords etc.) are written in upper case here in EuroAssembler sources. Names of variables, procedures, macroinstructions are in mixed case.

If a special character is part of an €ASM term and should be embedded in an identifier or HTML anchor, it is replaced with two lowercase letters:

Char  Replacement    Char  Replacement
%     pc             &     am
$     do             :     co
#     ha             *     as
=     eq             .     pt

For instance the handler of pseudoinstruction %SHIFT has a label PseudopcSHIFT and URL pseudo.htm#PseudopcSHIFT.
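
The replacement rule can be sketched as follows, in Python rather than €ASM. The function names are hypothetical, and the assumption that every pseudoinstruction handler label is formed as Pseudo plus the encoded name is generalized from the single %SHIFT example above.

```python
REPLACEMENTS = {
    "%": "pc", "&": "am",
    "$": "do", ":": "co",
    "#": "ha", "*": "as",
    "=": "eq", ".": "pt",
}

def encode_term(term):
    """Replace each special character with its two-letter code."""
    return "".join(REPLACEMENTS.get(ch, ch) for ch in term)

def handler_label(pseudo_name):
    """Assumed label of a pseudoinstruction handler."""
    return "Pseudo" + encode_term(pseudo_name)

def handler_url(pseudo_name):
    """Assumed URL of the handler inside pseudo.htm."""
    return "pseudo.htm#" + handler_label(pseudo_name)
```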

↑ Calling convention

EuroAssembler uses three kinds of subprogrammes:

  1. expandable macros,
  2. callable native blocks PROC..ENDPROC,
  3. invokable procedures implemented by macros Procedure..EndProcedure from macrolibrary stdcall.htm.

Ad 1.: Besides ad-hoc macroinstructions defined in the same source which uses them (for instance macros in ii.htm), €ASM uses some generally usable macros from libraries shipped with EuroAssembler.

Ad 2.: PROC / ENDPROC blocks are used only sporadically as local subroutines in large procedures. They are called with register calling convention, using input/output registers described in their header.

Ad 3.: €ASM extensively employs subprograms defined with Procedure, LocalVar, EndProcedure and Invoke. Their advantage is that these four macros encapsulate the StdCall calling convention, register preservation, reservation of local stack variables, maintenance of the stack frame and the final return. Another advantage is that a procedure can be declared with an arbitrary number of arguments. Nevertheless, the number of arguments provided by Invoke must exactly match the Procedure declaration. Where a variable number of arguments was required, the subprogram was implemented as a macro (see Msg as an example).

Procedures used in €ASM preserve all registers except those which return the result. Usually it is EAX but the result is sometimes returned in other register(s), too. For instance the macro BufferRetrieve returns the contents of the buffer as a string in registers ESI,ECX.

Arithmetic CPU flags are not preserved by subprogrammes. The exceptions are the macros Msg and MsgUnexpected, which preserve all CPU flags and registers. Nonetheless, many procedures use the Carry flag to signal an error, or the Zero flag to signal emptiness.

All procedures expect the Direction flag to be clear on input, and they return it clear on output (DF=0).

Calling convention of operating system functions is hidden in system macros in the file sys*.htm.

↑ File access method

Configuration files euroasm.ini are loaded to memory, assembled and immediately released.

Input files (the source file currently being assembled and its included files) are mapped to memory and kept open with the sharing mode "allow read, deny write" until the assembler/linker ends and the output file is completed in a memory stream.

Source files that are being assembled may be kept open by the text editor in which they are being written, but you won't be able to save them until the assembly terminates.

Output files (the target object|executable file and listing) are compiled in memory. When they are complete, input files are closed and only then is the output compilation flushed at once to an output disk file.

This method makes it possible to create an output listing with the same name as the input source, overwriting the source with its listing (or even with the assembled output file).
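
The order of operations (map the input, complete the output in memory, close the input, flush the output at once) can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, transform stands in for the whole assembly, and the sketch does not model the "deny write" sharing mode.

```python
import mmap
import os
import tempfile

def assemble_in_place(path, transform):
    """Build the output in memory and write it only after the input is closed."""
    with open(path, "rb") as src:
        with mmap.mmap(src.fileno(), 0, access=mmap.ACCESS_READ) as view:
            output = transform(bytes(view))    # output completed in memory
    # The input file is closed here; only now is the output flushed at once,
    # which is why it may carry the same name as the input.
    with open(path, "wb") as dst:
        dst.write(output)

def demo():
    """Overwrite a small temporary source file with its 'listing'."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "source.txt")
        with open(path, "wb") as f:
            f.write(b"nop\n")
        assemble_in_place(path, lambda text: b"|listing| " + text)
        with open(path, "rb") as f:
            return f.read()
```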

↑ Interaction with OS

Requests for services from the operating system are encapsulated in macroinstructions gathered in easource/sys???.htm. In the Windows version of EuroAssembler it is the source file easource/syswin.htm which imports the following API services from the system library kernel32.dll:

CloseHandle, CreateDirectory, CreateFile, CreateFileMapping, ExitProcess, FileTimeToDosDateTime, FindClose, FindFirstFile, FindNextFile, GetCommandLine, GetEnvironmentVariable, GetLastError, GetModuleFileName, GetModuleHandle, GetSystemInfo, GetSystemTime, GlobalFree, MapViewOfFile, MultiByteToWideChar, SetFilePointer, SystemTimeToFileTime, UnmapViewOfFile, VirtualAlloc, VirtualFree, WriteFile.

See syswin.htm for their description.

Encapsulation of OS calls by Sys* macroinstructions facilitates future porting of EuroAssembler from MS Windows to other operating systems.

↑ Memory management

Unique objects Ea, Src, dictionary of enumerated tokens used by EuroAssembler language, text of €ASM messages, literal strings and some ad hoc local tables are allocated statically, in [.data] or [.bss] segments.

All other €ASM objects are allocated dynamically at run time, either on machine stack, or in the memory provided on request from the operating system.
Recursively invokable procedures protect themselves from stack overflow with macro EaStackCheck.

EuroAssembler does not use the system heap. It allocates dynamic memory in portions called pools. A pool is implemented as a linked list of pool blocks with a typical size of 64 KB (or larger, if requested). In MS Windows the memory is provided by the API functions VirtualAlloc() and VirtualFree().

Memory once allocated from a pool is not returned to the OS at the moment when the object is discarded; there is no garbage collection. Instead, the pool memory is returned as a whole to the operating system when the pool's owner is destroyed.
There are four classes in €ASM which maintain their own pools: EA, SRC, PGM, PASS. Object methods choose the appropriate pool depending on the lifetime of each stored object, see also DOM.
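
The pool idea can be sketched in a few lines of Python. This is not the €ASM implementation: the class and method names are hypothetical, a Python list stands in for the linked list of blocks, and bytearray stands in for memory obtained from VirtualAlloc().

```python
class Pool:
    """Chain of memory blocks which is released only as a whole."""
    BLOCK_SIZE = 64 * 1024        # typical block size mentioned above

    def __init__(self):
        self.blocks = []          # each block stands for memory from the OS
        self.used = 0             # bytes already used in the newest block

    def new(self, size):
        """Return (block, offset) of a fresh region of the given size."""
        if not self.blocks or self.used + size > len(self.blocks[-1]):
            # A larger block is requested when the region would not fit.
            self.blocks.append(bytearray(max(self.BLOCK_SIZE, size)))
            self.used = 0
        offset = self.used
        self.used += size
        return self.blocks[-1], offset

    def destroy(self):
        """Give all blocks back at once; single objects are never freed."""
        self.blocks.clear()
        self.used = 0
```

In this sketch an object with the lifetime of one assembly pass would be stored in the pool owned by PASS, so that destroying the pass releases it together with everything else stored there.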

Although the memory can be allocated from a pool directly (using macros PoolNew or PoolStore), the pool serves mainly as a container for more sophisticated access methods:

STACK keeps a table of objects of the same size. It has nothing in common with the CPU stack SS:ESP except its name and the LIFO access method (Last In = First Out).
STACK is used by €ASM to reflect the structure of nested block objects CTX, CHUNK_HEAD, EAOPT.

LIST keeps a bidirectionally linked list of objects of the same size, which are not kept together in memory. Listed objects can be accessed only sequentially, either forward (FIFO | LILO) or backward (FILO | LIFO).
This method is used to store €ASM objects whose number cannot be reliably estimated at the beginning, such as symbols, %variables, macros, sections.

STREAM is a write-only memory class which stores unformatted data strings sequentially to a collection of memory blocks. The StreamStore access method is similar to FileWrite. When the stream is completed, it may be flushed to a disk file at once with the macro StreamDump.
This method is used in €ASM when output files are formatted.
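
A minimal sketch of the stream idea in Python, not the €ASM implementation: data is appended to a chain of fixed-size blocks and later written out in one go. The class, its methods and the tiny default block size are hypothetical; store and dump only loosely correspond to StreamStore and StreamDump.

```python
import io

class Stream:
    """Write-only store which appends data to a chain of fixed-size blocks."""

    def __init__(self, block_size=16):
        self.block_size = block_size
        self.blocks = [bytearray()]

    def store(self, data):
        """Append data, opening a new block whenever the current one is full."""
        data = bytes(data)
        while data:
            room = self.block_size - len(self.blocks[-1])
            if room == 0:
                self.blocks.append(bytearray())
                room = self.block_size
            self.blocks[-1] += data[:room]
            data = data[room:]

    def dump(self, file):
        """Flush all blocks, in order, to a file-like object."""
        for block in self.blocks:
            file.write(block)

# Usage: fill the stream piece by piece, then flush it at once.
stream = Stream(block_size=4)
stream.store(b"hello ")
stream.store(b"world")
target = io.BytesIO()
stream.dump(target)
```

Unlike a buffer, the stored data never has to be moved, because the blocks need not be adjacent in memory.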

BUFFER is used to store data items (strings) of variable size. Unlike the stream or list method, all data in a buffer are stored contiguously. If the buffer size specified on BufferCreate was underestimated and is exhausted, the buffer silently allocates another block of memory with doubled size from its pool, and copies the whole previous contents to the new location. Thus the entire buffer contents returned by BufferRetrieve are always contiguous.
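
The doubling strategy can be sketched in Python. This is an illustration only, not the €ASM code: the class and its members are hypothetical, and a fresh bytearray stands in for the new block taken from the pool.

```python
class Buffer:
    """Contiguous store which doubles its capacity when exhausted."""

    def __init__(self, estimated_size):
        self.memory = bytearray(max(1, estimated_size))
        self.used = 0
        self.moves = 0            # how many times the contents were relocated

    def store(self, data):
        needed = self.used + len(data)
        if needed > len(self.memory):
            capacity = len(self.memory)
            while capacity < needed:
                capacity *= 2                        # doubled size
            bigger = bytearray(capacity)             # new block "from the pool"
            bigger[:self.used] = self.memory[:self.used]   # copy old contents
            self.memory = bigger
            self.moves += 1
        self.memory[self.used:needed] = data
        self.used = needed

    def retrieve(self):
        """Return the whole contents as one contiguous string."""
        return bytes(self.memory[:self.used])
```

Doubling keeps the number of relocations small: growing from the initial estimate to n bytes needs only about log2(n) copies.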

Buffers are extensively used by €ASM. Leaving their contents abandoned, until the termination of parent PASS or PGM, would have negative impact on total memory consumption. Therefore buffers can also be borrowed from the stack of preallocated buffers Ea.BufferStack by the invocation of EaBufferReserve and returned with EaBufferRelease, not wasting the once allocated memory.

When the requested dynamic memory size exceeds the usual values for which buffers are preallocated, the buffer or stack will automatically request an additional memory portion from its pool, and if there is no more free memory in the pool, its manager requests another block from the operating system. Thus dynamic memory management works transparently for the programmer and it is limited only by the amount of OS virtual memory, no matter how big the identifiers, expressions, nesting levels etc. in the assembled source may be.

↑ Building EuroAssembler

The executable file euroasm.exe can be recompiled in the easource subdirectory with the command ..\euroasm euroasm.htm, assuming that the stable euroasm.exe version has not yet been moved from the EuroAssembler home directory to somewhere on the system %PATH%.
The target euroasm.exe is created in the subdirectory easource.

EuroAssembler can also be built from browser at page generate.php.

On my Intel Pentium machine running at 3 GHz with 8 GB RAM the complete rebuild reports:

  ...
  I0750 Source "euroasm" (954269 lines) assembled in 2 passes with errorlevel 0.
  I0860 Listing file "euroasm.htm.lst" created, size=4884076.
  I0980 Memory allocation 121344 KB. 26270552 statements assembled in 1069 s.
  I0990 EuroAssembler terminated with errorlevel 0.

↑ Extensibility

Hints and techniques for extending the EuroAssembler source are scattered throughout the source files:
Add a new EUROASM option
Add a new PROGRAM option
Add a new /DIRECTIVE option
Add a new operator
Add a new machine instruction
Add a new output format
Porting EuroAssembler to other OS

If you want to modify EuroAssembler, I recommend following these steps:
  1. Copy the latest downloaded stable euroasm.exe both to €ASM home and to its source subdirectory (easource\euroasm.exe).
  2. Using testman.php or generate.php, make sure that all tests of easource\euroasm.exe pass without error.
  3. Using generate.php#Build, make sure that all modules can be rebuilt without errors. Or change to easource\, delete old modules with del *.obj and then rebuild all with ..\euroasm.exe euroasm.htm. The build should terminate with errorlevel 0.
  4. Modify the EuroAssembler sources (easource\*.htm) with your enhancements.
  5. Rebuild the modified sources with downloaded stable version from €ASM home (repeat step 3).
  6. Perform all tests with the new easource\euroasm.exe (repeat step 2).
  7. If all passed, you can copy the modified easource\euroasm.exe to %PATH% and use it on your computer.

See also Licence for information concerning the modification of EuroAssembler sources.

↑ Optimization of euroasm.exe

The €ASM assembler and linker are not optimized for speed; there are plenty of places where they could be made to run faster.

My goal was an application which gets along with a 486 processor and any 32bit version of MS Windows. The order of optimization criteria was:
  1. Maintainability and extensibility,
  2. readability and understandability,
  3. debuggability and stability (proper treatment of errors),
  4. size of the code,
  5. speed of code,
  6. economical usage of memory.

↑ Source audit

Computer users should never trust executable files downloaded from Internet. It is a good practice to download the project in the form of source files and recompile it on your own PC with a compiler which you trust.

In the case of a self-compiled program this is complicated, because you don't have a trusted compiler yet. A suggested sandbox solution for paranoid EuroAssembler users follows:

  1. Download euroasm.zip, compute its hash (md5 euroasm.zip) and compare the obtained value with the hash published on the distribution site. The hash of each €ASM release is also published on the discussion forum and on the Twitter account @EuroAsm, or, in a justified case, you can try to ask the author for confirmation.
  2. Unzip the downloaded archive in a sandbox, such as an isolated PC not connected to a network, or in a virtual PC.
  3. Assume for the moment that the downloaded and unzipped executable euroasm.exe is trustworthy.
  4. Change to the subdirectory easource and rebuild the source with the downloaded executable. Be sure to provide the same forged timestamp which was used when the original source was released, e.g. ..\euroasm.exe euroasm.htm, timestamp=1512345678. Otherwise the compiled file could not be binary-identical with the downloaded version.
  5. Audit the sources and make sure that they do not contain any vulnerabilities, backdoors or other malicious code. You may also want to compare the assembled code in the dump column of the listing file easource\euroasm.htm.lst with the code in the COFF objects or in the PE itself.
  6. You are now assured that, if the downloaded executable was trustworthy, the compiled executable is trustworthy as well.
  7. Binary compare the downloaded and the just compiled executables, e.g. fc ..\euroasm.exe euroasm.exe. When both files are identical, the assumption made in step 3 is true and euroasm.exe was successfully audited.
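
Where the md5 and fc utilities are not at hand, steps 1 and 7 can be approximated with a short Python sketch. The function names are hypothetical; hashlib.md5 and filecmp.cmp are standard-library calls.

```python
import filecmp
import hashlib

def md5_of(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file (step 1)."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def identical(path_a, path_b):
    """Compare the contents of two files byte by byte (step 7)."""
    return filecmp.cmp(path_a, path_b, shallow=False)
```

For example, md5_of("euroasm.zip") would be compared by eye with the published hash, and identical(r"..\euroasm.exe", "euroasm.exe") would be run in the easource subdirectory after the rebuild.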
