vitsoft wrote: ↑30 Aug 2018 22:20
ar18 wrote: ↑30 Aug 2018 14:58
After the call is made to the function, per the Win64 ABI it must then copy rcx,rdx,r8, and r9 into the SHADOW SPACE at the start of that function.
I didn't know that. Agner Fog, in book 5, page 19, says about 64-bit Windows:
Since the shadow space is owned by the called function, it is safe to use these 32 bytes of shadow space for any purpose by the called function.
Why would I care what Agner Fog says, when the author of the Win64 ABI spec, Microsoft, says at
https://msdn.microsoft.com/en-us/library/ms235286.aspx that "Space is allocated on the call stack as a shadow store for callees to save those registers. There is a strict one-to-one correspondence between the arguments to a function call and the registers used for those arguments"?
Can one ignore what Microsoft and the Win64 ABI say, and use the shadow space any way you please as Agner Fog suggests, and not have any problems? Probably, so long as you are consistent in how you save and address the values passed to your function in the rcx, rdx, r8, and r9 registers. But why have a shadow space to begin with? Why not ignore that too? Could you make it work? Certainly you could -- Agner Fog does it. I too don't have to follow any spec to make my program work, but I do need to follow the spec if I want my program to work with someone else's libraries or code. More importantly, if you want your program to work with modern debuggers, you would follow the spec and not Agner Fog's suggestion.
It is a very stupid thing to put parameters in registers, only to move them onto the stack. That wastes cycles and memory. The question then is, why ignore the ABI and use that shadow space for something else, like Agner Fog suggests, when you could also ignore the ABI and not create a shadow space to begin with? You will be in big trouble if you don't save the first four parameters from the registers to somewhere, but where to save the register parameters? How about the shadow space, as suggested by the ABI, which is the only reason the shadow space exists? Or should we instead just ignore the spec and slice and dice our program however we like?
Either pass parameters in registers (like a fastcall does) or put them on the stack (like cdecl does), but don't do both at the same time ... that is, unless you have a non-stupid reason for doing so. Does Microsoft have a good reason for doing this stupid thing? Yes. For one, they partially followed AMD's ABI spec. Two, ever notice how Win64 has no separate fastcall calling convention? There is only the default x64 convention and vectorcall. You can still specify __fastcall in C/C++, but the compiler will ignore it and use the default convention. You can only completely ignore the shadow space if your function behaves like an actual real-life fastcall (where parameters are passed only via registers and the stack is completely unused except for the return address). The default Win64 convention is neither a fastcall nor a cdecl; it is a hybrid between the two. That is the only reason why the shadow space exists.
The exception to this argument is when you pass three or fewer parameters to a function. Then I would say Agner Fog finally got it right, because any unused reserved stack locations in the shadow space can be used as local storage and would affect nothing else, not even a debugger.
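To make the layout concrete, here is a minimal sketch (NASM-style syntax) of a callee that stores its four register arguments into the shadow space on entry. The name my_func is made up for illustration, and unwind data is omitted, so treat it as a picture of the stack layout and not as a complete function.

```nasm
; Hypothetical callee taking four integer arguments in rcx, rdx, r8, r9.
; On entry: [rsp] holds the return address and [rsp+8]..[rsp+39] is the
; 32-byte shadow space that the caller reserved.
my_func:
    mov  [rsp+8],  rcx      ; home slot for argument 1
    mov  [rsp+16], rdx      ; home slot for argument 2
    mov  [rsp+24], r8       ; home slot for argument 3
    mov  [rsp+32], r9       ; home slot for argument 4
    sub  rsp, 40            ; 32 bytes of shadow space for our own callees
                            ; plus 8 bytes to restore 16-byte alignment

    ; ... body: the saved arguments now sit at [rsp+48], [rsp+56],
    ;     [rsp+64] and [rsp+72] ...

    add  rsp, 40
    ret
```

With three or fewer parameters, the slot at [rsp+32] on entry (and any other unused slot) is the part that could serve as scratch storage.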
vitsoft wrote: ↑30 Aug 2018 22:20
ar18 wrote:
If the only non-volatile register I use is RBX, I would much rather push one QWORD onto the stack instead of nine. Anything else will lead to exe bloat.
Sure, and it costs only PUSH RBX and POP RBX, which is not much more bothering than adding something like USES=EBX to the operands of macro Procedure.
Not exactly. A properly constructed USES=EBX would PUSH rbx at the beginning of my function, and anywhere there is a ret it would replace it with
pop rbx
ret
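As a rough sketch of what I mean, assuming a hypothetical USES=RBX on a procedure with two exit points, the generated code could look like this (NASM-style syntax; find_nonzero and its body are invented for illustration, and unwind data is left out):

```nasm
; Hypothetical example: rcx points to a qword; return it in rax,
; or 0 if it is zero. RBX is used only to show the save/restore.
find_nonzero:
    push rbx                ; emitted once by USES at procedure entry
    mov  rbx, [rcx]
    test rbx, rbx
    jnz  .found
    xor  eax, eax           ; zeroes all of rax
    pop  rbx                ; emitted in front of the first ret
    ret
.found:
    mov  rax, rbx
    pop  rbx                ; emitted in front of the second ret
    ret
```

The point is that the user writes a plain ret in both places and the macro supplies the matching pop each time.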
vitsoft wrote: ↑30 Aug 2018 22:20
X64 yields enough scratch registers, so I don't think implementation of USES= is necessary.
Here is a list of all the non-volatile registers in Win64: rbx, rbp, rsi, rdi, r12, r13, r14, r15, and xmm6 through xmm15 (rsp must of course be preserved as well). That isn't a small list and they are provided for your use, so why not support their usage?
vitsoft wrote: ↑30 Aug 2018 22:20
Some other problems to solve would arise:
- Syntax of the list of stored registers. Perhaps something like StrCopy Procedure Src,Dest,Size, USES=RSI:RDI:RBX
- Propagation of the list from Procedure macro to EndProcedure.
- Parsing and reversing the array of registers at assembly time.
- Allow other than callee-save registers, too?
- What if user specifies nonpushable register, e.g. USES=BX:MMX7:YMM15
- Save only the lowest 64 bits of SIMD register?
- I wouldn't worry about syntax too much, just so long as you are consistent with your syntax usage elsewhere.
- Well they don't call it "work" for nothing.
- Ditto.
- It wouldn't be smart but it wouldn't hurt anything either.
- Flag it as an obvious error. SIMD registers cannot be pushed onto the stack.
- Save the lowest 16 bytes (the full XMM part) of xmm6 through xmm15, per the ABI. The upper halves of the YMM registers are volatile and need not be saved.
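Putting the last few answers together, a hypothetical USES=RBX:XMM6 might expand roughly as below (NASM-style syntax; SomeProc and the list syntax are assumptions, not an existing feature, and unwind data is omitted). It saves all 16 bytes of xmm6, which is what the Win64 ABI treats as non-volatile for xmm6 through xmm15, and it restores in the reverse order of saving.

```nasm
; Hypothetical expansion of: SomeProc Procedure ..., USES=RBX:XMM6
SomeProc:
    push   rbx              ; general-purpose register: plain PUSH
    sub    rsp, 16          ; XMM6 cannot be pushed, so reserve 16 bytes
    movdqu [rsp], xmm6      ; store all 128 bits (unaligned form, so it
                            ; does not depend on rsp alignment)

    ; ... procedure body may now clobber rbx and xmm6 ...

    movdqu xmm6, [rsp]      ; restore in reverse order of saving
    add    rsp, 16
    pop    rbx
    ret
```

A register such as BX or YMM15 in the list would be rejected at assembly time instead of producing code like this.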