[dpdk-dev,v1] build: add more implementers' IDs and PNs for Arm platforms

Message ID 1517384359-1438-1-git-send-email-herbert.guan@arm.com (mailing list archive)
State Accepted, archived
Delegated to: Bruce Richardson
Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Herbert Guan Jan. 31, 2018, 7:39 a.m. UTC
1) Add native PN option '-march=native' to allow automatic detection.
   Set 'arm_force_native_march' to 'true' in config/arm/meson.build
   to use native PN option.
2) Add implementor_pn option for part number selection in cross compile
3) Add known Arm cortex PN support
4) Add known implementers' IDs (use generic flags/archs by default)
5) Sync build options with config/common_armv8a_linuxapp

Signed-off-by: Herbert Guan <herbert.guan@arm.com>
---
 config/arm/arm64_armv8_linuxapp_gcc | 14 ++++++
 config/arm/meson.build              | 99 ++++++++++++++++++++++++++++---------
 2 files changed, 91 insertions(+), 22 deletions(-)
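
For context on items 2) to 4): the implementer ID and part number (PN) are fields of the MIDR_EL1 register, which the patch cites from ARM DDI 0487C.a. The following is a minimal sketch, not part of the patch, of how a raw MIDR value splits into those fields and how the lookup keys used in config/arm/meson.build ('impl_0x41', '0xd03', ...) could be derived from it. The function names are hypothetical, and the sample value is only an illustration whose part number field is 0xd03.

```python
def decode_midr(midr):
    """Split a 32-bit MIDR_EL1 value into its architected fields."""
    return {
        "implementer": (midr >> 24) & 0xFF,   # bits [31:24]
        "variant": (midr >> 20) & 0xF,        # bits [23:20]
        "architecture": (midr >> 16) & 0xF,   # bits [19:16]
        "part_num": (midr >> 4) & 0xFFF,      # bits [15:4]
        "revision": midr & 0xF,               # bits [3:0]
    }


def lookup_keys(midr):
    """Return (implementer variable name, part number key) in lower-case hex.

    Hypothetical helper: the key format mirrors the strings used in the
    patch ('impl_0x43', '0xa1'), it is not code taken from DPDK.
    """
    fields = decode_midr(midr)
    return "impl_0x%x" % fields["implementer"], "0x%x" % fields["part_num"]


if __name__ == "__main__":
    # Example value only: implementer 0x41, part number 0xd03.
    print(lookup_keys(0x410FD034))
```

Formatting with '%x' yields lower-case hex, which matches the keys in the tables and is consistent with the patch lower-casing the detection script's output before comparing.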
  

Comments

Bruce Richardson Feb. 1, 2018, 2:35 p.m. UTC | #1
On Wed, Jan 31, 2018 at 03:39:19PM +0800, Herbert Guan wrote:
> 1) Add native PN option '-march=native' to allow automatic detection.
>    Set 'arm_force_native_march' to 'true' in config/arm/meson.build
>    to use native PN option.
> 2) Add implementer_pn option for part num selection in cross compile
> 3) Add known Arm cortex PN support
> 4) Add known implementers' IDs (use generic flags/archs by default)
> 5) Sync build options with config/common_armv8a_linuxapp
> 
> Signed-off-by: Herbert Guan <herbert.guan@arm.com>
> ---

Is it intended to get this into 18.02, or can it be delayed till 18.05? 

Pavan, can you please review, as author of the existing ARM-specific
meson code?

Thanks,
/Bruce
  
Pavan Nikhilesh Feb. 5, 2018, 9:22 a.m. UTC | #2
Hi Herbert,

On Wed, Jan 31, 2018 at 03:39:19PM +0800, Herbert Guan wrote:
> 1) Add native PN option '-march=native' to allow automatic detection.
>    Set 'arm_force_native_march' to 'true' in config/arm/meson.build
>    to use native PN option.
> 2) Add implementer_pn option for part num selection in cross compile
> 3) Add known Arm cortex PN support
> 4) Add known implementers' IDs (use generic flags/archs by default)
> 5) Sync build options with config/common_armv8a_linuxapp
>
> Signed-off-by: Herbert Guan <herbert.guan@arm.com>
> ---
<snip>
> +
>  machine_args_generic = [
> -	['default', ['-march=armv8-a+crc+crypto']]]
> +	['default', ['-march=armv8-a']],

Any specific reason for this change?
Traditional make uses
MACHINE_CFLAGS += -march=armv8-a+crc+crypto
found at mk/machine/armv8a/rte.vars.mk

> +	['native', ['-march=native']],
> +	['0xd03', ['-mcpu=cortex-a53']],
> +	['0xd04', ['-mcpu=cortex-a35']],
> +	['0xd07', ['-mcpu=cortex-a57']],
> +	['0xd08', ['-mcpu=cortex-a72']],
> +	['0xd09', ['-mcpu=cortex-a73']],
> +	['0xd0a', ['-mcpu=cortex-a75']],
> +]
>  machine_args_cavium = [
>  	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
> +	['native', ['-march=native']],
>  	['0xa1', ['-mcpu=thunderxt88']],
>  	['0xa2', ['-mcpu=thunderxt81']],
>  	['0xa3', ['-mcpu=thunderxt83']]]
>
> -flags_generic = [[]]
> +flags_common_default = [
> +	# Accelarate rte_memcpy. Be sure to run unit test (memcpy_perf_autotest)
> +	# to determine the best threshold in code. Refer to notes in source file
> +	# (lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h) for more info.
> +	['RTE_ARCH_ARM64_MEMCPY', false],
> +	#	['RTE_ARM64_MEMCPY_ALIGNED_THRESHOLD', 2048],
> +	#	['RTE_ARM64_MEMCPY_UNALIGNED_THRESHOLD', 512],
> +	# Leave below RTE_ARM64_MEMCPY_xxx options commented out, unless there're
> +	# strong reasons.
> +	#	['RTE_ARM64_MEMCPY_SKIP_GCC_VER_CHECK', false],
> +	#	['RTE_ARM64_MEMCPY_ALIGN_MASK', 0xF],
> +	#	['RTE_ARM64_MEMCPY_STRICT_ALIGN', false],
> +
> +	['RTE_LIBRTE_FM10K_PMD', false],
> +	['RTE_LIBRTE_SFC_EFX_PMD', false],
> +	['RTE_LIBRTE_AVP_PMD', false],
> +
> +	['RTE_SCHED_VECTOR', false],
> +]
> +
> +flags_generic = [
> +	['RTE_MACHINE', '"armv8a"'],
> +	['RTE_CACHE_LINE_SIZE', 128]]
>  flags_cavium = [
>  	['RTE_MACHINE', '"thunderx"'],
>  	['RTE_CACHE_LINE_SIZE', 128],
> @@ -22,8 +55,21 @@ flags_cavium = [
>  	['RTE_MAX_VFIO_GROUPS', 128],
>  	['RTE_RING_USE_C11_MEM_MODEL', false]]
>
> +## Arm implementer ID (ARM DDI 0487C.a, Section G7.2.106, Page G7-5321)
>  impl_generic = ['Generic armv8', flags_generic, machine_args_generic]
> +impl_0x41 = ['Arm', flags_generic, machine_args_generic]
> +impl_0x42 = ['Broadcom', flags_generic, machine_args_generic]
>  impl_0x43 = ['Cavium', flags_cavium, machine_args_cavium]
> +impl_0x44 = ['DEC', flags_generic, machine_args_generic]
> +impl_0x49 = ['Infineon', flags_generic, machine_args_generic]
> +impl_0x4d = ['Motorola', flags_generic, machine_args_generic]
> +impl_0x4e = ['NVIDIA', flags_generic, machine_args_generic]
> +impl_0x50 = ['AppliedMicro', flags_generic, machine_args_generic]
> +impl_0x51 = ['Qualcomm', flags_generic, machine_args_generic]
> +impl_0x53 = ['Samsung', flags_generic, machine_args_generic]
> +impl_0x56 = ['Marvell', flags_generic, machine_args_generic]
> +impl_0x69 = ['Intel', flags_generic, machine_args_generic]
> +
>

One minor concern here: DPAA/DPAA2 use a cache line size of 64B, unlike the
traditional 128B for armv8 (see config/defconfig_arm64-dpaa/2-linuxapp-gcc).
Maybe Hemant could comment on this.

>  if cc.get_define('__clang__') != ''
>  	dpdk_conf.set_quoted('RTE_TOOLCHAIN', 'clang')
> @@ -55,19 +101,31 @@ else
>  				meson.current_source_dir(), 'armv8_machine.py'))
>  		cmd = run_command(detect_vendor.path())
>  		if cmd.returncode() == 0
> -			cmd_output = cmd.stdout().strip().split(' ')
> +			cmd_output = cmd.stdout().to_lower().strip().split(' ')
>  		endif

<snip>

Verified on thunderx with gcc 5.3.0/7.2.1 and clang 5.0.1

Regards,
Pavan

> @@ -79,22 +137,19 @@ else
>  	# for gcc versions > 7
>  	if cc.version().version_compare(
>  			'<7.0') or cmd_output.length() == 0
> -		foreach marg: machine[2]
> -			if marg[0] == 'default'
> -				foreach f: marg[1]
> -					machine_args += f
> -				endforeach
> -			endif
> -		endforeach
> -	else
> -		foreach marg: machine[2]
> -			if marg[0] == cmd_output[3]
> -				foreach f: marg[1]
> -					machine_args += f
> -				endforeach
> -			endif
> -		endforeach
> +		if not meson.is_cross_build() and arm_force_native_march == true
> +			impl_pn = 'native'
> +		else
> +			impl_pn = 'default'
> +		endif
>  	endif
> +	foreach marg: machine[2]
> +		if marg[0] == impl_pn
> +			foreach f: marg[1]
> +				machine_args += f
> +			endforeach
> +		endif
> +	endforeach
>  endif
>  message(machine_args)
>
> --
> 1.8.3.1
>
  
Herbert Guan Feb. 6, 2018, 5:51 a.m. UTC | #3
Hi Pavan,

> -----Original Message-----
> From: Pavan Nikhilesh [mailto:pbhagavatula@caviumnetworks.com]
> Sent: Monday, February 5, 2018 17:23
> To: Herbert Guan <Herbert.Guan@arm.com>;
> jerin.jacob@caviumnetworks.com; hemant.agrawal@nxp.com;
> bruce.richardson@intel.com; harry.van.haaren@intel.com
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v1] build: add more implementers' IDs and PNs for Arm
> platforms
>
> Hi Herbert,
>
> On Wed, Jan 31, 2018 at 03:39:19PM +0800, Herbert Guan wrote:
> > 1) Add native PN option '-march=native' to allow automatic detection.
> >    Set 'arm_force_native_march' to 'true' in config/arm/meson.build
> >    to use native PN option.
> > 2) Add implementer_pn option for part num selection in cross compile
> > 3) Add known Arm cortex PN support
> > 4) Add known implementers' IDs (use generic flags/archs by default)
> > 5) Sync build options with config/common_armv8a_linuxapp
> >
> > Signed-off-by: Herbert Guan <herbert.guan@arm.com>
> > ---
> <snip>
> > +
> >  machine_args_generic = [
> > -['default', ['-march=armv8-a+crc+crypto']]]
> > +['default', ['-march=armv8-a']],
>
> Any specific reason for this change?
> Traditional make uses
> MACHINE_CFLAGS += -march=armv8-a+crc+crypto
> found at mk/machine/armv8a/rte.vars.mk
>

Both CRC and Crypto are optional extensions on Armv8 CPUs. When making a
general build (e.g. a release build for distribution), we need to ensure that
every targeted CPU (all Armv8, for example) can run the compiled binary, so
enabling crc and crypto by default is risky. For a specific CPU/platform,
'-march=native' may be used, or CPU implementers can further customize these
args in this file.
In addition, rte_cpuflags.c already supports run-time detection of CPU flags
(instruction sets), and this is the preferred approach.
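
To illustrate why a generic binary cannot assume these extensions, here is a minimal sketch, not part of the patch and not how rte_cpuflags.c is implemented, that inspects the "Features" line Linux exposes in /proc/cpuinfo on arm64. The function names are hypothetical, and the mapping of '+crypto' to the aes/pmull/sha1/sha2 feature names is an assumption made for illustration.

```python
def parse_features(cpuinfo_text):
    """Return the set of feature names from the first 'Features' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def missing_extensions(cpuinfo_text,
                       required=("crc32", "aes", "pmull", "sha1", "sha2")):
    """List the required optional extensions this CPU does not report.

    Assumption: the default 'required' tuple approximates what a build
    with -march=armv8-a+crc+crypto would rely on.
    """
    features = parse_features(cpuinfo_text)
    return [name for name in required if name not in features]


# Hand-written sample resembling a core that has CRC but no crypto.
SAMPLE = "processor\t: 0\nFeatures\t: fp asimd evtstrm crc32 cpuid\n"

if __name__ == "__main__":
    print(missing_extensions(SAMPLE))
    # On a real arm64 Linux system one could pass open("/proc/cpuinfo").read().
```

A non-empty result means a binary built with the old default flags could hit an illegal instruction on that core, which is the risk described above.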

> > +['native', ['-march=native']],
> > +['0xd03', ['-mcpu=cortex-a53']],
> > +['0xd04', ['-mcpu=cortex-a35']],
> > +['0xd07', ['-mcpu=cortex-a57']],
> > +['0xd08', ['-mcpu=cortex-a72']],
> > +['0xd09', ['-mcpu=cortex-a73']],
> > +['0xd0a', ['-mcpu=cortex-a75']],
> > +]
> >  machine_args_cavium = [
> >  ['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
> > +['native', ['-march=native']],
> >  ['0xa1', ['-mcpu=thunderxt88']],
> >  ['0xa2', ['-mcpu=thunderxt81']],
> >  ['0xa3', ['-mcpu=thunderxt83']]]
> >
> > -flags_generic = [[]]
> > +flags_common_default = [
> > +# Accelarate rte_memcpy. Be sure to run unit test (memcpy_perf_autotest)
> > +# to determine the best threshold in code. Refer to notes in source file
> > +# (lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h) for more info.
> > +['RTE_ARCH_ARM64_MEMCPY', false],
> > +#['RTE_ARM64_MEMCPY_ALIGNED_THRESHOLD', 2048],
> > +#['RTE_ARM64_MEMCPY_UNALIGNED_THRESHOLD', 512],
> > +# Leave below RTE_ARM64_MEMCPY_xxx options commented out, unless there're
> > +# strong reasons.
> > +#['RTE_ARM64_MEMCPY_SKIP_GCC_VER_CHECK', false],
> > +#['RTE_ARM64_MEMCPY_ALIGN_MASK', 0xF],
> > +#['RTE_ARM64_MEMCPY_STRICT_ALIGN', false],
> > +
> > +['RTE_LIBRTE_FM10K_PMD', false],
> > +['RTE_LIBRTE_SFC_EFX_PMD', false],
> > +['RTE_LIBRTE_AVP_PMD', false],
> > +
> > +['RTE_SCHED_VECTOR', false],
> > +]
> > +
> > +flags_generic = [
> > +['RTE_MACHINE', '"armv8a"'],
> > +['RTE_CACHE_LINE_SIZE', 128]]
> >  flags_cavium = [
> >  ['RTE_MACHINE', '"thunderx"'],
> >  ['RTE_CACHE_LINE_SIZE', 128],
> > @@ -22,8 +55,21 @@ flags_cavium = [
> >  ['RTE_MAX_VFIO_GROUPS', 128],
> >  ['RTE_RING_USE_C11_MEM_MODEL', false]]
> >
> > +## Arm implementer ID (ARM DDI 0487C.a, Section G7.2.106, Page G7-5321)
> >  impl_generic = ['Generic armv8', flags_generic, machine_args_generic]
> > +impl_0x41 = ['Arm', flags_generic, machine_args_generic]
> > +impl_0x42 = ['Broadcom', flags_generic, machine_args_generic]
> >  impl_0x43 = ['Cavium', flags_cavium, machine_args_cavium]
> > +impl_0x44 = ['DEC', flags_generic, machine_args_generic]
> > +impl_0x49 = ['Infineon', flags_generic, machine_args_generic]
> > +impl_0x4d = ['Motorola', flags_generic, machine_args_generic]
> > +impl_0x4e = ['NVIDIA', flags_generic, machine_args_generic]
> > +impl_0x50 = ['AppliedMicro', flags_generic, machine_args_generic]
> > +impl_0x51 = ['Qualcomm', flags_generic, machine_args_generic]
> > +impl_0x53 = ['Samsung', flags_generic, machine_args_generic]
> > +impl_0x56 = ['Marvell', flags_generic, machine_args_generic]
> > +impl_0x69 = ['Intel', flags_generic, machine_args_generic]
> > +
> >
>
> One minor concern here: DPAA/DPAA2 use a cache line size of 64B, unlike the
> traditional 128B for armv8 (see config/defconfig_arm64-dpaa/2-linuxapp-gcc).
> Maybe Hemant could comment on this.
>
> >  if cc.get_define('__clang__') != ''
> >  dpdk_conf.set_quoted('RTE_TOOLCHAIN', 'clang')
> > @@ -55,19 +101,31 @@ else
> >  meson.current_source_dir(), 'armv8_machine.py'))
> >  cmd = run_command(detect_vendor.path())
> >  if cmd.returncode() == 0
> > -cmd_output = cmd.stdout().strip().split(' ')
> > +cmd_output = cmd.stdout().to_lower().strip().split(' ')
> >  endif
>
> <snip>
>
> Verified on thunderx with gcc 5.3.0/7.2.1 and clang 5.0.1
>
> Regards,
> Pavan
>
> > @@ -79,22 +137,19 @@ else
> >  # for gcc versions > 7
> >  if cc.version().version_compare(
> >  '<7.0') or cmd_output.length() == 0
> > -foreach marg: machine[2]
> > -if marg[0] == 'default'
> > -foreach f: marg[1]
> > -machine_args += f
> > -endforeach
> > -endif
> > -endforeach
> > -else
> > -foreach marg: machine[2]
> > -if marg[0] == cmd_output[3]
> > -foreach f: marg[1]
> > -machine_args += f
> > -endforeach
> > -endif
> > -endforeach
> > +if not meson.is_cross_build() and arm_force_native_march == true
> > +impl_pn = 'native'
> > +else
> > +impl_pn = 'default'
> > +endif
> >  endif
> > +foreach marg: machine[2]
> > +if marg[0] == impl_pn
> > +foreach f: marg[1]
> > +machine_args += f
> > +endforeach
> > +endif
> > +endforeach
> >  endif
> >  message(machine_args)
> >
> > --
> > 1.8.3.1
> >
  
Pavan Nikhilesh Feb. 6, 2018, 6:02 a.m. UTC | #4
On Tue, Feb 06, 2018 at 05:51:29AM +0000, Herbert Guan wrote:
> Hi Pavan,
>
> > -----Original Message-----
> > From: Pavan Nikhilesh [mailto:pbhagavatula@caviumnetworks.com]
> > Sent: Monday, February 5, 2018 17:23
> > To: Herbert Guan <Herbert.Guan@arm.com>;
> > jerin.jacob@caviumnetworks.com; hemant.agrawal@nxp.com;
> > bruce.richardson@intel.com; harry.van.haaren@intel.com
> > Cc: dev@dpdk.org
> > Subject: Re: [PATCH v1] build: add more implementers' IDs and PNs for Arm
> > platforms
> >
> > Hi Herbert,
> >
> > On Wed, Jan 31, 2018 at 03:39:19PM +0800, Herbert Guan wrote:
> > > 1) Add native PN option '-march=native' to allow automatic detection.
> > >    Set 'arm_force_native_march' to 'true' in config/arm/meson.build
> > >    to use native PN option.
> > > 2) Add implementer_pn option for part num selection in cross compile
> > > 3) Add known Arm cortex PN support
> > > 4) Add known implementers' IDs (use generic flags/archs by default)
> > > 5) Sync build options with config/common_armv8a_linuxapp
> > >
> > > Signed-off-by: Herbert Guan <herbert.guan@arm.com>
> > > ---
> > <snip>
> > > +
> > >  machine_args_generic = [
> > > -['default', ['-march=armv8-a+crc+crypto']]]
> > > +['default', ['-march=armv8-a']],
> >
> > Any specific reason for this change?
> > Traditional make uses
> > MACHINE_CFLAGS += -march=armv8-a+crc+crypto
> > found at mk/machine/armv8a/rte.vars.mk
> >
>
> Both CRC and Crypto are optional extensions on Armv8 CPUs. When making a
> general build (e.g. a release build for distribution), we need to ensure that
> every targeted CPU (all Armv8, for example) can run the compiled binary, so
> enabling crc and crypto by default is risky. For a specific CPU/platform,
> '-march=native' may be used, or CPU implementers can further customize these
> args in this file.
> In addition, rte_cpuflags.c already supports run-time detection of CPU flags
> (instruction sets), and this is the preferred approach.
>

Makes sense. As I mentioned in the previous mail, some vendors use a 64B
cache line instead of 128B, and as of now I don't see a way to detect that.
The vendor needs to modify the implementer-ID-specific flags, flags_<vendor>.
With that in mind:

Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
  
Bruce Richardson Feb. 6, 2018, 3:12 p.m. UTC | #5
On Tue, Feb 06, 2018 at 11:32:59AM +0530, Pavan Nikhilesh wrote:
> On Tue, Feb 06, 2018 at 05:51:29AM +0000, Herbert Guan wrote:
> > Hi Pavan,
> >
> > > -----Original Message-----
> > > From: Pavan Nikhilesh [mailto:pbhagavatula@caviumnetworks.com]
> > > Sent: Monday, February 5, 2018 17:23
> > > To: Herbert Guan <Herbert.Guan@arm.com>;
> > > jerin.jacob@caviumnetworks.com; hemant.agrawal@nxp.com;
> > > bruce.richardson@intel.com; harry.van.haaren@intel.com
> > > Cc: dev@dpdk.org
> > > Subject: Re: [PATCH v1] build: add more implementers' IDs and PNs for Arm
> > > platforms
> > >
> > > Hi Herbert,
> > >
> > > On Wed, Jan 31, 2018 at 03:39:19PM +0800, Herbert Guan wrote:
> > > > 1) Add native PN option '-march=native' to allow automatic detection.
> > > >    Set 'arm_force_native_march' to 'true' in config/arm/meson.build
> > > >    to use native PN option.
> > > > 2) Add implementer_pn option for part num selection in cross compile
> > > > 3) Add known Arm cortex PN support
> > > > 4) Add known implementers' IDs (use generic flags/archs by default)
> > > > 5) Sync build options with config/common_armv8a_linuxapp
> > > >
> > > > Signed-off-by: Herbert Guan <herbert.guan@arm.com>
> > > > ---
> > > <snip>
> > > > +
> > > >  machine_args_generic = [
> > > > -['default', ['-march=armv8-a+crc+crypto']]]
> > > > +['default', ['-march=armv8-a']],
> > >
> > > Any specific reason for this change?
> > > Traditional make uses
> > > MACHINE_CFLAGS += -march=armv8-a+crc+crypto
> > > found at mk/machine/armv8a/rte.vars.mk
> > >
> >
> > Both CRC and Crypto are optional extensions on Armv8 CPUs. When making a
> > general build (e.g. a release build for distribution), we need to ensure that
> > every targeted CPU (all Armv8, for example) can run the compiled binary, so
> > enabling crc and crypto by default is risky. For a specific CPU/platform,
> > '-march=native' may be used, or CPU implementers can further customize these
> > args in this file.
> > In addition, rte_cpuflags.c already supports run-time detection of CPU flags
> > (instruction sets), and this is the preferred approach.
> >
> 
> Makes sense. As I mentioned in the previous mail, some vendors use a 64B
> cache line instead of 128B, and as of now I don't see a way to detect that.
> The vendor needs to modify the implementer-ID-specific flags, flags_<vendor>.
> With that in mind:
> 
> Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>

Applied to dpdk-next-build

Thanks,
/Bruce
  

Patch

diff --git a/config/arm/arm64_armv8_linuxapp_gcc b/config/arm/arm64_armv8_linuxapp_gcc
index 3b4d3c4..987c02f 100644
--- a/config/arm/arm64_armv8_linuxapp_gcc
+++ b/config/arm/arm64_armv8_linuxapp_gcc
@@ -2,9 +2,23 @@ 
 c = 'aarch64-linux-gnu-gcc'
 cpp = 'aarch64-linux-gnu-cpp'
 ar = 'aarch64-linux-gnu-gcc-ar'
+strip = 'aarch64-linux-gnu-strip'
 
 [host_machine]
 system = 'linux'
 cpu_family = 'aarch64'
 cpu = 'armv8-a'
 endian = 'little'
+
+[properties]
+implementor_id = 'generic'
+
+# Valid options for Arm's implementor_pn:
+# 'default': valid for all armv8-a architectures (default value)
+# '0xd03':   cortex-a53
+# '0xd04':   cortex-a35
+# '0xd07':   cortex-a57
+# '0xd08':   cortex-a72
+# '0xd09':   cortex-a73
+# '0xd0a':   cortex-a75
+implementor_pn = 'default'
diff --git a/config/arm/meson.build b/config/arm/meson.build
index a5bfb96..4e788a4 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -5,15 +5,48 @@ 
 # for checking defines we need to use the correct compiler flags
 march_opt = '-march=@0@'.format(machine)
 
+arm_force_native_march = false
+
 machine_args_generic = [
-	['default', ['-march=armv8-a+crc+crypto']]]
+	['default', ['-march=armv8-a']],
+	['native', ['-march=native']],
+	['0xd03', ['-mcpu=cortex-a53']],
+	['0xd04', ['-mcpu=cortex-a35']],
+	['0xd07', ['-mcpu=cortex-a57']],
+	['0xd08', ['-mcpu=cortex-a72']],
+	['0xd09', ['-mcpu=cortex-a73']],
+	['0xd0a', ['-mcpu=cortex-a75']],
+]
 machine_args_cavium = [
 	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
+	['native', ['-march=native']],
 	['0xa1', ['-mcpu=thunderxt88']],
 	['0xa2', ['-mcpu=thunderxt81']],
 	['0xa3', ['-mcpu=thunderxt83']]]
 
-flags_generic = [[]]
+flags_common_default = [
+	# Accelarate rte_memcpy. Be sure to run unit test (memcpy_perf_autotest)
+	# to determine the best threshold in code. Refer to notes in source file
+	# (lib/librte_eal/common/include/arch/arm/rte_memcpy_64.h) for more info.
+	['RTE_ARCH_ARM64_MEMCPY', false],
+	#	['RTE_ARM64_MEMCPY_ALIGNED_THRESHOLD', 2048],
+	#	['RTE_ARM64_MEMCPY_UNALIGNED_THRESHOLD', 512],
+	# Leave below RTE_ARM64_MEMCPY_xxx options commented out, unless there're
+	# strong reasons.
+	#	['RTE_ARM64_MEMCPY_SKIP_GCC_VER_CHECK', false],
+	#	['RTE_ARM64_MEMCPY_ALIGN_MASK', 0xF],
+	#	['RTE_ARM64_MEMCPY_STRICT_ALIGN', false],
+
+	['RTE_LIBRTE_FM10K_PMD', false],
+	['RTE_LIBRTE_SFC_EFX_PMD', false],
+	['RTE_LIBRTE_AVP_PMD', false],
+
+	['RTE_SCHED_VECTOR', false],
+]
+
+flags_generic = [
+	['RTE_MACHINE', '"armv8a"'],
+	['RTE_CACHE_LINE_SIZE', 128]]
 flags_cavium = [
 	['RTE_MACHINE', '"thunderx"'],
 	['RTE_CACHE_LINE_SIZE', 128],
@@ -22,8 +55,21 @@  flags_cavium = [
 	['RTE_MAX_VFIO_GROUPS', 128],
 	['RTE_RING_USE_C11_MEM_MODEL', false]]
 
+## Arm implementer ID (ARM DDI 0487C.a, Section G7.2.106, Page G7-5321)
 impl_generic = ['Generic armv8', flags_generic, machine_args_generic]
+impl_0x41 = ['Arm', flags_generic, machine_args_generic]
+impl_0x42 = ['Broadcom', flags_generic, machine_args_generic]
 impl_0x43 = ['Cavium', flags_cavium, machine_args_cavium]
+impl_0x44 = ['DEC', flags_generic, machine_args_generic]
+impl_0x49 = ['Infineon', flags_generic, machine_args_generic]
+impl_0x4d = ['Motorola', flags_generic, machine_args_generic]
+impl_0x4e = ['NVIDIA', flags_generic, machine_args_generic]
+impl_0x50 = ['AppliedMicro', flags_generic, machine_args_generic]
+impl_0x51 = ['Qualcomm', flags_generic, machine_args_generic]
+impl_0x53 = ['Samsung', flags_generic, machine_args_generic]
+impl_0x56 = ['Marvell', flags_generic, machine_args_generic]
+impl_0x69 = ['Intel', flags_generic, machine_args_generic]
+
 
 if cc.get_define('__clang__') != ''
 	dpdk_conf.set_quoted('RTE_TOOLCHAIN', 'clang')
@@ -55,19 +101,31 @@  else
 				meson.current_source_dir(), 'armv8_machine.py'))
 		cmd = run_command(detect_vendor.path())
 		if cmd.returncode() == 0
-			cmd_output = cmd.stdout().strip().split(' ')
+			cmd_output = cmd.stdout().to_lower().strip().split(' ')
 		endif
 		# Set to generic if variable is not found
 		machine = get_variable('impl_' + cmd_output[0], 'generic')
+		if machine == 'generic'
+			machine = impl_generic
+			cmd_output = cmd_generic
+		endif
+		impl_pn = cmd_output[3]
+		if arm_force_native_march == true
+			impl_pn = 'native'
+		endif
 	else
 		impl_id = meson.get_cross_property('implementor_id', 'generic')
+		impl_pn = meson.get_cross_property('implementor_pn', 'default')
 		machine = get_variable('impl_' + impl_id)
 	endif
 
-	if machine == 'generic'
-		machine = impl_generic
-		cmd_output = cmd_generic
-	endif
+	# Apply Common Defaults. These settings may be overwritten by machine
+	# settings later.
+	foreach flag: flags_common_default
+		if flag.length() > 0
+			dpdk_conf.set(flag[0], flag[1])
+		endif
+	endforeach
 
 	message('Implementer : ' + machine[0])
 	foreach flag: machine[1]
@@ -79,22 +137,19 @@  else
 	# for gcc versions > 7
 	if cc.version().version_compare(
 			'<7.0') or cmd_output.length() == 0
-		foreach marg: machine[2]
-			if marg[0] == 'default'
-				foreach f: marg[1]
-					machine_args += f
-				endforeach
-			endif
-		endforeach
-	else
-		foreach marg: machine[2]
-			if marg[0] == cmd_output[3]
-				foreach f: marg[1]
-					machine_args += f
-				endforeach
-			endif
-		endforeach
+		if not meson.is_cross_build() and arm_force_native_march == true
+			impl_pn = 'native'
+		else
+			impl_pn = 'default'
+		endif
 	endif
+	foreach marg: machine[2]
+		if marg[0] == impl_pn
+			foreach f: marg[1]
+				machine_args += f
+			endforeach
+		endif
+	endforeach
 endif
 message(machine_args)
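
The selection logic in the last hunk can be hard to follow from the diff alone. The following is a minimal Python sketch of it, not part of the patch: the table is a shortened copy of machine_args_generic, while the function names and the boolean parameters (old_gcc, have_cmd_output, is_cross, force_native) are hypothetical stand-ins for the Meson conditions.

```python
# Shortened copy of machine_args_generic from the patch.
MACHINE_ARGS_GENERIC = [
    ("default", ["-march=armv8-a"]),
    ("native", ["-march=native"]),
    ("0xd03", ["-mcpu=cortex-a53"]),
    ("0xd08", ["-mcpu=cortex-a72"]),
]


def resolve_impl_pn(requested_pn, *, old_gcc, have_cmd_output,
                    is_cross, force_native):
    """Mirror how the patch settles on the part-number key.

    requested_pn is the detected PN (native build) or the cross file's
    implementor_pn property (cross build).
    """
    impl_pn = requested_pn
    if not is_cross and force_native:
        impl_pn = "native"
    # gcc older than 7.0, or no detection output: drop the specific PN.
    if old_gcc or not have_cmd_output:
        impl_pn = "native" if (not is_cross and force_native) else "default"
    return impl_pn


def machine_args_for(table, impl_pn):
    """Collect the flags of every table entry whose key equals impl_pn."""
    args = []
    for key, flags in table:
        if key == impl_pn:
            args.extend(flags)
    return args


if __name__ == "__main__":
    pn = resolve_impl_pn("0xd08", old_gcc=False, have_cmd_output=True,
                         is_cross=True, force_native=False)
    print(pn, machine_args_for(MACHINE_ARGS_GENERIC, pn))
```

One consequence visible in the sketch: a part number that has no entry in the table matches nothing, so no machine flags are added for it.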