[openssl-dev] EXT :Re: [openssl.org #3931] OpenSSL 1.0.2(c, d) hangs on Sun T3 in OPENSSL_cpuid_setup()

Puckett, Rick via RT rt at openssl.org
Tue Jul 14 23:29:36 UTC 2015


Misaki, Andy,

I ran the truss command line you specified on the Sun T-3 and had to "kill -9" the process, as Ctrl-C and Ctrl-Z did not work. Attached is the truss.log output; the last few lines of that file, where the process hung, are below. Setting OPENSSL_sparcv9cap to "0x20" (or even "0") allowed the program to complete (the code appears to bypass the probes if this variable is set to anything), though I don't know the operational ramifications of any particular value.

If this helps, I noted that sending the process a "kill -BUS" or "kill -ILL" causes it to complete normally, even generating useful output. I don't know the state of the "OPENSSL_sparcv9cap_P" array or the correctness of any results under those circumstances, but the output of the "version" command is correct :-)

I applied the patch you sent and configured/compiled using "solaris-sparcv9-gcc" and the program completes normally.

As I am unable to use patched/unofficial code for our operational needs, what I did last week was to pass "solaris-sparcv7-gcc" or "solaris-sparcv7-cc" (we use both GCC and Sun C) to the OpenSSL Configure script instead, and that seemed to fix/bypass the problem on the T-3.

Thank you again and please let me know if I can be of further assistance.
- Rick

2783/1:		sigaction(SIGILL, 0xFFBFF448, 0xFFBFF528)	= 0
2783/1:		sigaction(SIGBUS, 0xFFBFF448, 0xFFBFF548)	= 0
2783/1 at 1:	    -> libcrypto:_sparcv9_rdtick(0x0, 0x1, 0x0, 0xa)
2783/1 at 1:	    <- libcrypto:_sparcv9_rdtick() = 0x99367288
2783/1 at 1:	    -> libcrypto:_sparcv9_vis1_probe(0x0, 0x1, 0x0, 0xa)
2783/1 at 1:	    <- libcrypto:_sparcv9_vis1_probe() = 0
2783/1 at 1:	    -> libcrypto:_sparcv9_vis1_instrument(0x0, 0xffbff48a, 0x0, 0xa)
2783/1 at 1:	    <- libcrypto:_sparcv9_vis1_instrument() = 19
2783/1 at 1:	    -> libcrypto:_sparcv9_fmadd_probe(0x0, 0x1, 0x0, 0x13)
2783/1 at 1:	    <- libcrypto:_sparcv9_fmadd_probe() = 0
2783/1 at 1:	    -> libcrypto:_sparcv9_vis3_probe(0x0, 0x1, 0x0, 0x13)
2783/1 at 1:	    <- libcrypto:_sparcv9_vis3_probe() = 0
2783/1 at 1:	    -> libcrypto:_sparcv9_rdcfr(0x0, 0xffbff548, 0x0, 0x13)


-----Original Message-----
From: Andy Polyakov via RT [mailto:rt at openssl.org] 
Sent: Tuesday, July 14, 2015 5:43 PM
To: Puckett, Rick (IS)
Cc: openssl-dev at openssl.org
Subject: EXT :Re: [openssl-dev] [openssl.org #3931] OpenSSL 1.0.2(c, d) hangs on Sun T3 in OPENSSL_cpuid_setup()

Hi,

Misaki.Miyashita wrote:
> Hi Rick,
> 
> Can you run the truss(1) command when you run "openssl version" as follows?
> 
> i.e.
> % truss -lf -u libcrypto:: -u libpkcs11:: -o /tmp/truss.out openssl version
> 
> The output will tell you more information about the function calls 
> made by the openssl(1) application.
> 
> Thank you,
> 
> -- misaki

Misaki,

There were a couple of private reports that make me think there is what can be classified as a kernel bug. When the processor hits an unimplemented instruction, an exception is raised, and it's either handled by the in-kernel emulator or passed down to the application as SIGILL. Consider http://git.openssl.org/gitweb/?p=openssl.git;a=commitdiff;h=3caeef94bd045608af03b061643992e3afd9c445.
The problem there was that the emulator was failing to handle a 16-bit load in a delay slot. This was on T1 (as the commit message suggests), where the 16-bit load is emulated. Another example (not reflected in history) is T4 detection, which had to be guarded by the VIS3 flag. The trouble there is that rd %asr26,%o0 sends the application into an endless uninterruptible loop.
"Uninterruptible" means that you can't terminate the application with ctrl-c or even suspend it with ctrl-z. The only thing that works is kill -KILL from another window (I never tried kill -ILL or -BUS as the OP suggested). Note that all these probes were developed and verified to work on Linux, which handles both cases gracefully. Which is basically why one can argue that it's a Solaris kernel bug.

Rick,

You are likely to suffer from the second part of the above problem description.
At least it's consistent with the private report, and with the fact that T3 is the first processor to implement VIS3, while the T4 detection probe (the above-mentioned rd %asr26,%o0) is guarded by the VIS3 capability. If you can confirm that neither ctrl-c nor ctrl-z works with the hung program, then that's more than likely it. As for what to do: if running with OPENSSL_sparcv9cap set system-wide to an appropriate value (0x20 would be right for T3) is not an option, then you'd have to modify OPENSSL_cpuid_setup. If the Solaris version running on your T3 is new enough, the attached patch should do the trick. Can you verify it?

> On 07/09/15 16:34, Puckett, Rick via RT wrote:
>> Request: Bug Report
>>
>> Hello,
>>
>> I recently compiled OpenSSL 1.0.2(c,d) for Solaris 5.10 using GCC
>> 4.8.2 on an UltraSPARC 45 and our group tested it on several 
>> different types of other systems (V245, T4, T3, etc...) and it runs 
>> as expected on all systems except the T3 where it hangs - even for a 
>> simple call like "openssl version".  The process continues normally 
>> when sent either a SIGBUS or SIGILL.
>>
>> I believe I've tracked it down to the function "OPENSSL_cpuid_setup"
>> in the file "crypto/sparcv9cap.c" after the initial sigaction calls 
>> to set the signal handlers for SIGILL and SIGBUS and before the 
>> trailing sigaction calls to reset the handlers for SIGILL and SIGBUS.  
>> There's a partial dtrace listing below, generated by my colleague 
>> Carolyn, with the last output lines showing the sigaction calls for 
>> SIGILL then SIGBUS (the trailing sigaction calls are in the reverse 
>> order in the code).
>>
>> The "OPENSSL_cpuid_setup" function supports reading the environment 
>> variable "OPENSSL_sparcv9cap" to skip further processing and setting 
>> this variable (to anything) prevents the process from hanging, so I'm 
>> also encouraged that the issue resides within this function, but am, 
>> obviously, hesitant to rely on this as an operational solution ...
>>
>> Is there any other information I can provide you and/or anything I 
>> can do on my side to investigate and resolve this?
>>
>> Thank you,
>> - Rick
>>
>>
>> 4503:   lwp_sigmask(SIG_SETMASK, 0xFFBFF827, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
>> 4503:   sigaction(SIGILL, 0xFFBFEC10, 0xFFBFECF0)       = 0
>> 4503:       new: hand = 0xFEF4F824 mask = 0xFFBFFEFF 0x0000FFFF 0 0 flags = 0x0000
>> 4503:       old: hand = 0x00000000 mask = 0 0 0 0 flags = 0x0000
>> 4503:   sigaction(SIGBUS, 0xFFBFEC10, 0xFFBFED10)       = 0
>> 4503:       new: hand = 0xFEF4F824 mask = 0xFFBFFEFF 0x0000FFFF 0 0 flags = 0x0000
>> 4503:       old: hand = 0x00000000 mask = 0 0 0 0 flags = 0x0000



-------------- next part --------------
A non-text attachment was scrubbed...
Name: truss.log
Type: application/octet-stream
Size: 10874 bytes
Desc: not available
URL: <http://mta.openssl.org/pipermail/openssl-dev/attachments/20150714/2a276fc4/attachment-0001.obj>