[openssl-dev] #GP happens in do_sse3_after_all

Yan, Shaopu shaopu.yan at intel.com
Tue Oct 17 03:01:37 UTC 2017


Hi OpenSSL maintainers,
I ran into an issue in crypto/chacha/chacha-x86_64.S. Could you please take a look at it? Thanks very much.

Currently execution gets stuck in the do_sse3_after_all code path, where a #GP occurs at the instruction
"movdqa %xmm0,0(%rsp)", which requires a 16-byte-aligned memory operand. Going through the code in detail, I found that it has already
adjusted %rsp with "subq $64+8,%rsp". I simply tried changing that to "subq $64,%rsp", and then it works correctly.

I am not sure whether this is a real issue; if I have made a mistake, please correct me. :)
I suppose "subq $64+8,%rsp" is meant to align the stack to 16 bytes, but in my case %rsp is already 16-byte aligned on entry,
so after the subtraction the stack is only 8-byte aligned and the #GP occurs. :( Could you please help check this?
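
To make the arithmetic concrete, here is a minimal C sketch (not taken from OpenSSL; the helper names and the stack addresses are made up for illustration). It assumes the usual x86-64 calling conventions, under which %rsp is 8 modulo 16 at function entry because the call instruction has just pushed the return address. Under that assumption "subq $64+8,%rsp" yields a 16-byte-aligned frame, while an entry %rsp that is already 16-byte aligned gives the 8-byte misalignment described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: distance of addr above the previous 16-byte boundary. */
unsigned misalignment16(uint64_t addr) {
    return (unsigned)(addr & 15u);
}

/* Models "subq $frame,%rsp": the stack pointer after reserving frame bytes. */
uint64_t rsp_after_sub(uint64_t rsp_at_entry, uint64_t frame) {
    return rsp_at_entry - frame;
}

/* Walks through the three cases discussed above, using made-up addresses. */
void check_alignment_cases(void) {
    uint64_t abi_entry = 0x7ffc12345678ULL;      /* entry rsp = 8 (mod 16) */
    uint64_t aligned_entry = 0x7ffc12345670ULL;  /* entry rsp = 0 (mod 16) */

    /* Conventional entry state: 64+8 lands on a 16-byte boundary. */
    assert(misalignment16(rsp_after_sub(abi_entry, 64 + 8)) == 0);
    /* Entry state from this report: 64+8 leaves the frame 8 bytes off. */
    assert(misalignment16(rsp_after_sub(aligned_entry, 64 + 8)) == 8);
    /* The experiment with "subq $64,%rsp" restores alignment in that case. */
    assert(misalignment16(rsp_after_sub(aligned_entry, 64)) == 0);
}
```

If this model is right, the question becomes why %rsp is 16-byte aligned at entry in the failing environment, rather than whether the constant 64+8 is wrong in general.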



ChaCha20_4x:
.LChaCha20_4x:
        movq    %rsp,%r9
        movq    %r10,%r11
        shrq    $32,%r10
        testq   $32,%r10
        jnz     .LChaCha20_8x
        cmpq    $192,%rdx
        ja      .Lproceed4x

        andq    $71303168,%r11
        cmpq    $4194304,%r11
        je      .Ldo_sse3_after_all
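
The dispatch above can be read as the following C sketch. This is only a model of the quoted instructions: the enum, the function name and the parameter names are hypothetical, and it assumes that execution falls through to .Lproceed4x when the last comparison fails, which the excerpt suggests but does not show. The constants decode as 71303168 = (1<<26)|(1<<22) and 4194304 = 1<<22. If %r10 holds OpenSSL's CPU capability words (the load is not part of the excerpt), these would be the XSAVE and MOVBE bits, and bit 5 of the upper word would be AVX2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the three code paths reachable from ChaCha20_4x. */
enum chacha_path { PATH_8X, PATH_4X, PATH_SSSE3 };

/* cap models the 64-bit value in %r10, len the byte count in %rdx. */
enum chacha_path choose_path(uint64_t cap, uint64_t len) {
    uint64_t low = cap;              /* movq %r10,%r11 */
    uint64_t high = cap >> 32;       /* shrq $32,%r10 */

    if (high & 32)                   /* testq $32,%r10; jnz .LChaCha20_8x */
        return PATH_8X;
    if (len > 192)                   /* cmpq $192,%rdx; ja .Lproceed4x */
        return PATH_4X;

    low &= 71303168;                 /* keep only bits 26 and 22 */
    if (low == 4194304)              /* bit 22 set, bit 26 clear */
        return PATH_SSSE3;           /* je .Ldo_sse3_after_all */
    return PATH_4X;                  /* assumed fall-through to .Lproceed4x */
}

/* Confirms how the decimal constants decode into single-bit masks. */
void check_masks(void) {
    assert(71303168 == ((1 << 26) | (1 << 22)));
    assert(4194304 == (1 << 22));
}
```

So the failing path is taken only for inputs of at most 192 bytes on a CPU that reports bit 22 without bit 26 and without the bit tested for the 8x path.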



.LChaCha20_8x:
        movq    %rsp,%r9
        subq    $0x280+8,%rsp
        andq    $-32,%rsp
        vzeroupper
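
For comparison, the 8x path does not depend on the incoming alignment: after the subtraction it masks %rsp with -32, which rounds it down to a 32-byte boundary, and the original value is kept in %r9. A minimal C sketch of that pattern follows (the function names and addresses are hypothetical and only model the arithmetic).

```c
#include <assert.h>
#include <stdint.h>

/* Models "subq $frame,%rsp; andq $-align,%rsp" for a power-of-two align. */
uint64_t reserve_aligned(uint64_t rsp, uint64_t frame, uint64_t align) {
    return (rsp - frame) & ~(align - 1);
}

/* Shows that the mask gives an aligned frame for either entry state. */
void check_masked_frames(void) {
    uint64_t entries[2] = { 0x7ffc12345678ULL, 0x7ffc12345670ULL };

    for (int i = 0; i < 2; i++) {
        uint64_t sp = reserve_aligned(entries[i], 0x280 + 8, 32);
        assert(sp % 32 == 0);                      /* always 32-byte aligned */
        assert(sp <= entries[i] - (0x280 + 8));    /* never reserves too little */
    }
}
```

The same masking with a 16-byte alignment would make a 64-byte frame safe for movdqa regardless of the entry state, but the masked amount varies, so the epilogue would have to restore %rsp from a saved copy instead of adding a constant back.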





.Lproceed4x:
        subq    $0x140+8,%rsp
        movdqa  .Lsigma(%rip),%xmm11
        movdqu  (%rcx),%xmm15
        movdqu  16(%rcx),%xmm7
        movdqu  (%r8),%xmm3
        leaq    256(%rsp),%rcx
        leaq    .Lrot16(%rip),%r10
        leaq    .Lrot24(%rip),%r11





.Ldo_sse3_after_all:
        subq    $64+8,%rsp
        movdqa  .Lsigma(%rip),%xmm0
        movdqu  (%rcx),%xmm1
        movdqu  16(%rcx),%xmm2
        movdqu  (%r8),%xmm3
        movdqa  .Lrot16(%rip),%xmm6
        movdqa  .Lrot24(%rip),%xmm7

        movdqa  %xmm0,0(%rsp)
        movdqa  %xmm1,16(%rsp)
        movdqa  %xmm2,32(%rsp)
        movdqa  %xmm3,48(%rsp)
        movq    $10,%r8
        jmp     .Loop_ssse3

Best Regards!
--Shaopu
