<div dir="ltr">Andy,<div>   1:2.5 is pretty good in my opinion for ARM!  </div><div><br></div><div>   We will check out Mongoose.</div><div><br></div><div>   Hmm - will try to get to the bottom of those cache misses (at a lower priority).</div><div><br></div><div>Thanks,</div><div>-vijay</div><div><br></div><div>   </div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Feb 7, 2017 at 11:07 AM, Andy Polyakov <span dir="ltr">&lt;<a href="mailto:appro@openssl.org" target="_blank">appro@openssl.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">> A72 is running at 1GHz compared to x86 at 2.1GHz. So that should hopefully<br>

> get down to ~1:5.<br>

<br>

</span>And Mongoose will take you to ~1:2.5 (scaled to the same frequency, that is).<br>

Which I'd say is a fair result. Well, it still could have been a bit<br>

better, but it's not unreasonable given ISA differences. Keep in mind<br>

that the presented x86_64 result is for code utilizing Intel-specific code<br>

extensions.<br>

<span class=""><br>

> There is no L3 cache on the A72 eval board and performance counters do<br>

> show 9x more DRAM accesses for ARM compared to x86.<br>

<br>

</span>This is unexpected, because it takes *fewer* references to memory to<br>

perform it on ARMv8, since it has a larger register bank. And the cache<br>

requirement is not so high that L3 would kick in... But in any case memory<br>

is not the bottleneck here...<br>

<div class="HOEnZb"><div class="h5"><br>

--<br>

openssl-users mailing list<br>

To unsubscribe: <a href="https://mta.openssl.org/mailman/listinfo/openssl-users" rel="noreferrer" target="_blank">https://mta.openssl.org/<wbr>mailman/listinfo/openssl-users</a><br>

</div></div></blockquote></div><br></div>