[dpdk-dev] [PATCH v2] eal: fix up bad asm in rte_cpu_get_features

Neil Horman nhorman at tuxdriver.com
Thu Mar 20 12:27:34 CET 2014


On Thu, Mar 20, 2014 at 07:03:23AM -0400, Neil Horman wrote:
> On Wed, Mar 19, 2014 at 09:22:03PM -0700, H. Peter Anvin wrote:
> > On 03/19/2014 05:40 PM, Neil Horman wrote:
> > > So after some discussion with hpa, I need to self NAK this again, apologies for
> > > the noise.  Theres some clean up to be done in this area, and I'm still getting
> > > a segfault that is in some way related to this code, but I need to dig deeper to
> > > understand it.
> > > 
> > > Neil
> > 
> > I still believe we should add the patch I posted in the previous email;
> > I should clean it up and put a proper header on it.
> > 
> I agree, but the fact of the matter is that I'm still getting a segfault very
> close to these instructions and I dont' understand why yet.  I'd hate to just
> make the problem go away without understanding the reason why.  The patch you
> propose doesn't fix (yet moving the xchgl to its own asm statement does).
> 
> > This is, if there is actually a need to feed %ebx and %edx into CPUID
> > (the native instruction is sensitive to %eax and %ecx, but not %ebx or
> > %edx.)
> > 
> > For reference, this is a version of CPUID I personally often use:
> > 
> > struct cpuid {
> > 	unsigned int eax, ecx, edx, ebx;
> > };
> > 
> > static inline void cpuid(unsigned int leaf, unsigned int subleaf,
> > 			 struct cpuid *out)
> > {
> > #if defined(__i386__) && defined(__PIC__)
> So, this is an additional difference and this in fact does make the problem
> clear up.  By applying only this patch:
> 
> @@ -192,7 +192,7 @@ rte_cpu_get_features(struct cpuid_parameters_t params)
>  {
>         int eax, ebx, ecx, edx;            /* registers */
>  
> -#ifndef __PIC__
> +#if !defined(__PIC__) || !defined(__i386__)
>     asm volatile ("cpuid"
>                   /* output */
>                   : "=a" (eax),
> 
> my build compiles the cpuid instruction branch, not the mov;cpuid; xchgl branch
> (its an x86_64 build).  Is there any reason that x86_64 doesn't need to save the
> ebx register when running cpuid while building PIE code?
> 
> Neil
> 
> 
So, I answered my own question, sort of.  The __i386__ is clear: x86_64 uses RIP
relative addressing, making the saving of ebx not needed - thats perfectly
clear.

Whats a bit less clear to me is why it matters.  Ideally moving ebx and
restoring it with an xchg should change the register state at all.  It would
clobber the lower part of rbx I think, but looking at the disassembly that
shouldn't be used, so as long as the calling function saves its value of rbx, it
should be ok.  The odd part is, if I look at the disassembly of
rte_cpu_get_flag_enabled compiled with and without the mov and xchgl operations,
I see that without those additional instructions the compiler adds a push rbx
and pop rbx instruction at the start and end of the assembly, but not when the
mov ebx, %0 and xchgl %ebx, %0 instructions are added.  I'm not sure what the
compiler is sensitive to when adding those instructions, but it seems like it
should be sensitive to the cpuid instruction, and should be adding it to both.

I'd like your thought Peter on that, but either way it seems clear that the
mov/xchgl aren't needed for x86_64 code, so I'll clean that up and post a new
patch shortly.

Neil



More information about the dev mailing list