[openssl-dev] #GP happens in do_sse3_after_all

Andy Polyakov appro at openssl.org
Fri Oct 20 08:14:02 UTC 2017


Hi,

> I ran into an issue in crypto/chacha/chacha-x86_64.S; could you kindly
> take a look at it? Thanks very much.
> 
> Currently execution gets stuck in the function *do_sse3_after_all*, and
> a #GP occurs due to the following instruction:
> 
> “movdqa %xmm0,0(%rsp)” needs 16-byte alignment. However, after going
> through the code in detail, I found that it already adjusts rsp with
> “subq $64+8,%rsp”. I simply tried changing it to “subq $64,%rsp”, and
> then it works correctly.
> 
> I don’t know whether this is an actual issue; if I have made a mistake,
> please correct me. :)
> 
> I suppose that “subq $64+8,%rsp” is meant to align the stack to 16
> bytes, but in my case, if RSP is already 16-byte aligned on entry, then
> after executing it the stack becomes only 8-byte aligned, so the #GP
> happens. :( So could you please help check it?

All known x86_64 ABIs specify that the top of the stack is to be aligned
at 16 bytes. Obviously it can't be aligned at every given moment, not on
x86_64, so the question is *when* it has to be aligned. It has to be
aligned at least at the moment of a call to another subroutine. Since the
x86_64 call instruction pushes the return address onto the stack, this
means that upon entry to a function the stack is actually misaligned by
8 bytes. Hence a compliant function has to allocate a 16*n+8 frame to get
back to a 16-byte boundary. And that's what we see in the code, 64+8 in
the referred case.

Now, if you experience a crash at the point in question, it can only mean
one thing: the caller is not compliant with the ABI. Though there is
ambiguity, and it might be wrong to blame the direct caller, for the
following reason. Customarily compilers don't explicitly align the stack
in each subroutine, but instead assume that the caller aligned it. In
other words, stack alignment is a kind of collective effort, with each
subroutine relying on its caller. So all subroutines can be compliant and
there would still be a problem. This would be the case when the stack was
*initially* misaligned [upon its creation]. To summarize, either one of
the subroutines in the chain of calls leading to ChaCha20_ctr32 is not
compliant with the ABI, or the stack was initially seeded misaligned.
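
To make the arithmetic above concrete, here is a minimal sketch in C that
models only how %rsp moves across a call and a "subq $frame,%rsp"
prologue. It is not OpenSSL code: the helper names and the stack addresses
are hypothetical, and each callee in the chain is assumed to allocate
exactly 64+8 bytes and push nothing else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers: they model %rsp arithmetic only, nothing more. */

/* True if a 16-byte-aligned access such as movdqa would be legal here. */
bool is_aligned16(uint64_t addr) {
    return (addr & 15u) == 0;
}

/* "call" pushes an 8-byte return address, so %rsp drops by 8. */
uint64_t rsp_after_call(uint64_t rsp_at_call) {
    return rsp_at_call - 8;
}

/* "subq $frame,%rsp" in the callee's prologue. */
uint64_t rsp_after_prologue(uint64_t rsp_at_entry, uint64_t frame) {
    return rsp_at_entry - frame;
}

/* %rsp inside the innermost of `depth` nested callees, each of which
 * allocates a 16*n+8 frame (64+8 here) and then calls the next one. */
uint64_t rsp_after_chain(uint64_t rsp_at_first_call, unsigned depth) {
    uint64_t rsp = rsp_at_first_call;
    for (unsigned i = 0; i < depth; i++) {
        rsp = rsp_after_call(rsp);
        rsp = rsp_after_prologue(rsp, 64 + 8);
    }
    return rsp;
}

/* The cases discussed above, with made-up stack addresses. */
void check_alignment_cases(void) {
    /* Compliant caller: aligned at the call, so misaligned by 8 on entry,
     * and 64+8 restores 16-byte alignment for movdqa. */
    uint64_t entry = rsp_after_call(0x7fff0000u);
    assert(!is_aligned16(entry));
    assert(is_aligned16(rsp_after_prologue(entry, 64 + 8)));

    /* Non-compliant caller: %rsp is 16-byte aligned on entry, so 64+8
     * leaves it 8 bytes off and movdqa would raise #GP. Plain 64 only
     * appears to help because it cancels the caller's error. */
    uint64_t bad_entry = rsp_after_call(0x7fff0008u);
    assert(is_aligned16(bad_entry));
    assert(!is_aligned16(rsp_after_prologue(bad_entry, 64 + 8)));
    assert(is_aligned16(rsp_after_prologue(bad_entry, 64)));

    /* Collective effort: compliant callees preserve whatever alignment
     * the chain started with, good or bad. */
    assert(is_aligned16(rsp_after_chain(0x7fff0000u, 3)));
    assert(!is_aligned16(rsp_after_chain(0x7fff0008u, 3)));
}
```

Calling check_alignment_cases() from a main() runs all three cases. The
last pair shows why the direct caller is not necessarily the culprit: a
stack that starts out 8 bytes off stays 8 bytes off through any number of
compliant frames.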

