Autoincrement/decrement Addressing Modes

Post by **AndreyDmitriev** » 20 Jan 2026 16:37

Hi Pavel,

Thank you very much for the new €Assembler version — it works like a charm!

I have one suggestion that I would like to discuss.
A long time ago, I worked with the MACRO‑11 assembler on a DEC PDP‑11 computer.
Manual:
https://bitsavers.org/pdf/dec/pdp11/rst ... _Oct87.pdf

In section 5‑2 (page 46), the addressing modes are listed, and one thing I am missing in Intel‑style assembler syntax is the **Autoincrement Mode**.
You can see how it is used on page 3‑11 (page 37), Figure 3‑1:

1$: CLR (R0)+
    CMP R0, #IMPURT; Test if the end reached
    BNE 1$

Here, `CLR (R0)+` clears the word at the address held in `R0`, then increments `R0` (by 2, because the operand is a word), so `R0` points to the next word on the next iteration. No dedicated increment instruction is needed, because the increment is combined with `CLR`. As far as I know, this is supported directly by the DEC CPU.
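
For readers who think in C, the loop above corresponds roughly to a post-increment pointer store. This is only a sketch: the function name `clear_words` and its parameters are invented for illustration and do not come from the MACRO-11 manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: clears 16-bit words from `p` up to (not including) `end`.
 * Mirrors:  1$: CLR (R0)+  /  CMP R0, #IMPURT  /  BNE 1$  */
static void clear_words(uint16_t *p, const uint16_t *end)
{
    assert(p < end);         /* like the original, the body runs at least once */
    do {
        *p++ = 0;            /* CLR (R0)+ : store 0, then advance by one word */
    } while (p != end);      /* CMP R0, #IMPURT / BNE 1$ */
}
```

The `*p++` expression does in one step what `CLR (R0)+` does: it uses the old address, then advances the pointer by the operand size.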

Table 5‑3 shows that this works with any operation, such as `OPR R, (R)+`.

Perhaps you could add similar “syntax sugar” to your assembler as well?
For example, consider this code, which adds two byte arrays `a` and `b` of length `n` and writes the result to `c`:

EUROASM CPU=X64, SIMD=AVX2, AMD=ENABLED
AsmDLL64 PROGRAM FORMAT=DLL, MODEL=FLAT, WIDTH=64

EXPORT add_bytes_avx2
; void add_bytes_avx2(const uint8_t* a,
;                     const uint8_t* b,
;                     uint8_t* c,
;                     size_t n);
add_bytes_avx2 PROC
    test    r9, r9
    jz      done

    ; number of full 32-byte blocks
    mov     r10, r9
    shr     r10, 5            ; r10 = n / 32
    jz      tail

avx_loop:
    vmovdqu ymm0, [rcx]
    vmovdqu ymm1, [rdx]
    vpaddb  ymm0, ymm0, ymm1
    vmovdqu [r8], ymm0

    add     rcx, 32 ; < next 32 bytes
    add     rdx, 32
    add     r8,  32
    dec     r10
    jnz     avx_loop

tail:
    ; remaining bytes
    and     r9, 31
    jz      done

tail_loop:
    mov     al, [rcx]
    add     al, [rdx]
    mov     [r8], al

    inc     rcx, rdx, r8
    dec     r9
    jnz     tail_loop

done:
    vzeroupper ; important for ABI
    ret
ENDP add_bytes_avx2

ENDPROGRAM AsmDLL64

My suggestion is to add post‑increment support, for example:

    vmovdqu ymm0, [rcx]
    add     rcx, 32
    ; will be turned to
    vmovdqu ymm0, [rcx]+ ; < auto increment

or

    mov     al, [rcx]
    inc     rcx
    ; will be written as
    mov     al, [rcx]+ ; < auto increment

With this feature, the full code above could look like this, shorter and more elegant:

EUROASM CPU=X64, SIMD=AVX2, AMD=ENABLED
AsmDLL64 PROGRAM FORMAT=DLL, MODEL=FLAT, WIDTH=64

EXPORT add_bytes_avx2
; void add_bytes_avx2(const uint8_t* a,
;                     const uint8_t* b,
;                     uint8_t* c,
;                     size_t n);
add_bytes_avx2 PROC
    test    r9, r9
    jz      done

    ; number of full 32-byte blocks
    mov     r10, r9
    shr     r10, 5            ; r10 = n / 32
    jz      tail

avx_loop:
    vmovdqu ymm0, [rcx]+
    vmovdqu ymm1, [rdx]+
    vpaddb  ymm0, ymm0, ymm1
    vmovdqu [r8]+, ymm0

    dec     r10
    jnz     avx_loop

tail:
    ; remaining bytes
    and     r9, 31
    jz      done

tail_loop:
    mov     al, [rcx]+
    add     al, [rdx]+
    mov     [r8]+, al

    dec     r9
    jnz     tail_loop

done:
    vzeroupper ; important for ABI
    ret
ENDP add_bytes_avx2

ENDPROGRAM AsmDLL64
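
The proposed tail loop maps almost line by line onto the C post-increment idiom. The sketch below is a scalar illustration only, not part of the DLL; the name `add_bytes_postinc` is made up, and the byte addition wraps modulo 256 just as `add al, [rdx]` and `vpaddb` do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar counterpart of tail_loop: c[i] = a[i] + b[i] (mod 256). */
static void add_bytes_postinc(const uint8_t *a, const uint8_t *b,
                              uint8_t *c, size_t n)
{
    assert(n == 0 || (a != NULL && b != NULL && c != NULL));
    while (n != 0) {
        uint8_t al = *a++;          /* mov al, [rcx]+ */
        al = (uint8_t)(al + *b++);  /* add al, [rdx]+ */
        *c++ = al;                  /* mov [r8]+, al  */
        n--;                        /* dec r9 / jnz tail_loop */
    }
}
```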

This could also be extended to post‑ and pre‑increments and decrements:

mov  al, [rcx]+ == mov  al, [rcx] & inc rcx
mov  al, [rcx]- == mov  al, [rcx] & dec rcx
mov  al, +[rcx] == inc rcx & mov  al, [rcx]
mov  al, -[rcx] == dec rcx & mov  al, [rcx]

The amount of increment or decrement would be inferred from the data size and context: 1 for byte, 2 for word, and so on, up to 64 for AVX‑512.
This could reduce code size and improve readability.
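
To make the intended semantics of the four forms concrete, here is a small C model. It is a sketch under assumptions: `load_auto`, `auto_mode`, and the explicit `size` argument are hypothetical, with `size` standing in for the operand size the assembler would infer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the four proposed forms. */
typedef enum { POST_INC, POST_DEC, PRE_INC, PRE_DEC } auto_mode;

/* Loads `size` bytes through *reg into dst and adjusts *reg by `size`,
 * either before or after the load depending on `mode`. */
static void load_auto(void *dst, const uint8_t **reg, size_t size, auto_mode mode)
{
    assert(dst != NULL && reg != NULL && size > 0);
    if (mode == PRE_INC) *reg += size;   /* +[reg] : adjust first  */
    if (mode == PRE_DEC) *reg -= size;   /* -[reg]                 */
    memcpy(dst, *reg, size);             /* the load itself        */
    if (mode == POST_INC) *reg += size;  /* [reg]+ : adjust after  */
    if (mode == POST_DEC) *reg -= size;  /* [reg]-                 */
}
```

With `size` set to 1 this models the byte case from the table above; 32 would model a YMM load.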

Just an idea — what do you think?

Post by **vitsoft** » 20 Jan 2026 19:38

AndreyDmitriev wrote: 20 Jan 2026 16:37
Table 5‑3 shows that this works with any operation, such as `OPR R, (R)+`.
Perhaps you could add similar “syntax sugar” to your assembler as well?

I'm afraid postfix/prefix autoincrement conflicts with assembler syntax: the plus and minus signs are reserved for addition and subtraction.
The only case where it magically works is string instructions such as LODS, which have the autoincrementation hardwired in the x86-64 CPU.

Autoincrementation can, however, be easily programmed at the macro level.

vmovdqu ymm0, [rcx]
add rcx, 32
; will be turned to
vmovdqu ymm0, [rcx]+ ; < auto increment

will change to

Load32Bytes %MACRO DestYMM, SourceAddr
    VMOVDQU %DestYMM, [%SourceAddr]
    ADD %SourceAddr, 32
%ENDMACRO Load32Bytes

Store32Bytes %MACRO DestAddr, SourceYMM
    VMOVDQU [%DestAddr], %SourceYMM
    ADD %DestAddr, 32
%ENDMACRO Store32Bytes

and then instead of

avx_loop:
vmovdqu ymm0, [rcx]+
vmovdqu ymm1, [rdx]+
vpaddb ymm0, ymm0, ymm1
vmovdqu [r8]+, ymm0

you will write

    Load32Bytes ymm0, rcx
    Load32Bytes ymm1, rdx
    vpaddb ymm0, ymm0, ymm1
    Store32Bytes r8, ymm0

which is almost as elegant and does not conflict with assembler syntax.
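
The same "operate, then advance" pattern can be expressed with C preprocessor macros, which may help readers compare the two approaches. This is only an analogy under assumptions: `block32`, `LOAD32_ADV`, `STORE32_ADV`, and `add_one_block` are hypothetical names, and a plain 32-byte struct stands in for a YMM register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t b[32]; } block32;   /* stand-in for one YMM register */

/* Analogues of Load32Bytes / Store32Bytes: copy 32 bytes, then advance the pointer. */
#define LOAD32_ADV(dst, src)  do { memcpy((dst).b, (src), 32); (src) += 32; } while (0)
#define STORE32_ADV(dst, src) do { memcpy((dst), (src).b, 32); (dst) += 32; } while (0)

/* One iteration of avx_loop, written with the two macros. */
static void add_one_block(const uint8_t **a, const uint8_t **b, uint8_t **c)
{
    block32 r0, r1;
    assert(a != NULL && b != NULL && c != NULL);
    LOAD32_ADV(r0, *a);                               /* Load32Bytes ymm0, rcx */
    LOAD32_ADV(r1, *b);                               /* Load32Bytes ymm1, rdx */
    for (size_t i = 0; i < 32; i++)
        r0.b[i] = (uint8_t)(r0.b[i] + r1.b[i]);       /* vpaddb, wraps mod 256 */
    STORE32_ADV(*c, r0);                              /* Store32Bytes r8, ymm0 */
}
```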

Post by **AndreyDmitriev** » 22 Jan 2026 10:48

vitsoft wrote: 20 Jan 2026 19:38
instead of

avx_loop:
vmovdqu ymm0, [rcx]+
vmovdqu ymm1, [rdx]+
vpaddb ymm0, ymm0, ymm1
vmovdqu [r8]+, ymm0

you will write

    Load32Bytes ymm0, rcx
    Load32Bytes ymm1, rdx
    vpaddb ymm0, ymm0, ymm1
    Store32Bytes r8, ymm0

which is almost as elegant and does not conflict with assembler syntax.

No. Macros are the obvious solution, but unfortunately they are not very elegant: there are many possible combinations, and macros are not generic enough. For example, if I change the unaligned move `VMOVDQU` to the aligned `VMOVDQA`, I will need another pair of macros. Or, if I also pass the instruction as a parameter, I end up with an additional separator and more “information noise.”
In general, you already depart from canonical assembler syntax when you allow instructions like `inc rcx, rdx, r8` (the same applies to `dec`, `push`, and `pop`, where multiple registers are allowed, and I actually love this). In MASM, at least, this causes an A2008 syntax error.
So I’m fine with broken syntax; this is just an enhancement taken from the MACRO-11 assembler, where the plus and minus signs are also reserved for addition and subtraction, but mean autoincrement or autodecrement when used as `(Ri)+` or `-(Ri)`.
See:
https://www.teach.cs.toronto.edu/~ajr/258/pdp11.pdf
Maybe I’ll try to add this myself with the help of AI.

Post by **vitsoft** » 22 Jan 2026 12:25

AndreyDmitriev wrote: 22 Jan 2026 10:48
For example, if I change the unaligned move VMOVDQU to the aligned VMOVDQA, I will need another pair of macros.

You won't if a keyword operand is used:

MyVMOVDQ %MACRO DestYMM, SourceAddr, A=U
    vmovdq%A %DestYMM, [%SourceAddr]
    ADD %SourceAddr, 32
%ENDMACRO MyVMOVDQ

For an unaligned move you would then write `MyVMOVDQ ymm0, rcx`,
and for an aligned move `MyVMOVDQ ymm0, rcx, A=A`.

Or you could pass the instruction mnemonic as a macro parameter, such as

DoAndInc %MACRO Mnemo, DestYMM, SourceAddr
    %Mnemo %DestYMM, [%SourceAddr]
    ADD %SourceAddr, 32
%ENDMACRO DoAndInc

Instead of your `vmovdqu ymm0, [rcx]+` you would write `DoAndInc vmovdqu, ymm0, rcx`.

AndreyDmitriev wrote: 22 Jan 2026 10:48
this is just an enhancement taken from the MACRO-11 assembler

I understand the concept of prefix and postfix plus and minus signs. It works well in C, but I am not convinced that it belongs in assembly language.

My enhancement of PUSH|POP|INC|DEC is consistent with the concept of ordinal and keyword operands used in €ASM (BTW, it is copied from TASM);
however, adding a separate + at the end of the expression is not.
Nevertheless, interested parties can achieve this at the macro level:

To test whether the operand SourceAddr ends with a standalone +, such as in [rcx]+, use

 %IF "%SourceAddr[%&]" === "+"

To isolate the register name from the square brackets, use

 %RegWithoutBraces %SET %SourceAddr[2..%&-1]

You can make any enhancements you like without interfering with the concept of EuroAssembler's syntax itself.
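
The two string checks above (does the operand end in a standalone `+`, and what is the register name inside the brackets) can be illustrated outside the assembler with a few lines of C. This is a sketch of the parsing logic only, not of how €ASM evaluates `%SourceAddr[...]`; the function `parse_operand` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parser for operand text such as "[rcx]" or "[rcx]+".
 * On success, copies the register name into `reg` and reports in
 * `post_inc` whether a trailing '+' was present. */
static bool parse_operand(const char *op, char *reg, size_t reg_cap, bool *post_inc)
{
    assert(op != NULL && reg != NULL && post_inc != NULL);
    size_t len = strlen(op);
    *post_inc = (len > 0 && op[len - 1] == '+');   /* is the last character '+'? */
    if (*post_inc)
        len--;                                      /* ignore the trailing '+' */
    if (len < 3 || op[0] != '[' || op[len - 1] != ']')
        return false;                               /* not of the form [reg] */
    size_t n = len - 2;                             /* characters between the brackets */
    if (n >= reg_cap)
        return false;                               /* name does not fit in `reg` */
    memcpy(reg, op + 1, n);
    reg[n] = '\0';
    return true;
}
```

For `"[rcx]+"` this yields the name `rcx` with `post_inc` set; for `"[r8]"` it yields `r8` with `post_inc` cleared.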

Post by **AndreyDmitriev** » 22 Jan 2026 12:50

Thank you very much for your suggestions and insights — I’ll try different approaches.
I’m not sure which modern assemblers have something similar; perhaps ARM does.

By the way, I wrote a small article about your assembler (in Russian) to raise interest:
https://habr.com/ru/articles/986752/
You may be able to read it with Google Translate (scroll down a little):
https://habr-com.translate.goog/ru/arti ... x_tr_tl=cs
It received positive feedback. Thank you again.
