Bug 1400 - net/ena: Failed to initialize ENA device
Status: UNCONFIRMED
Alias: None
Product: DPDK
Classification: Unclassified
Component: ethdev
Version: 23.11
Hardware: x86 Linux
Importance: Normal major
Target Milestone: ---
Assignee: shaibran
Reported: 2024-03-13 16:45 CET by Madhuker Mythri
Modified: 2024-03-15 09:54 CET

Description Madhuker Mythri 2024-03-13 16:45:29 CET
Hi,

With DPDK 23.11 on AWS with ENA network devices, DPDK initialization fails when 4 GB of memory is available for 2 ports.

During the rte_eal_init() call on ENA devices, ena_com_allocate_customer_metrics_buffer() reserves a DPDK memzone named "ena_p0_mz0" that takes gigabytes of memory for a single zone. Why?

In ena_com_allocate_customer_metrics_buffer(), the memory size is passed as 0 (customer_metrics->buffer_len = 0) to ena_mem_alloc_coherent(), which calls rte_memzone_reserve_aligned() with the RTE_MEMZONE_IOVA_CONTIG flag.
Because the requested size is 0, the memzone takes the largest available contiguous memory region, which is gigabytes in size.
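
The effect described above can be illustrated with a small toy model. This is only a sketch, not DPDK or ENA driver code: struct toy_pool, toy_reserve() and toy_reserve_checked() are hypothetical names, and the pool is assumed to be a single contiguous region. The one behaviour taken from DPDK is the documented convention that a length of 0 passed to rte_memzone_reserve() reserves the biggest contiguous zone available.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a pool made of one contiguous free region. */
struct toy_pool {
    size_t free_bytes;   /* bytes still available in the region */
};

/*
 * Mimics the len == 0 convention: a zero length means
 * "reserve the biggest contiguous region that is available".
 * Returns the number of bytes reserved, or 0 on failure.
 */
static size_t toy_reserve(struct toy_pool *pool, size_t len)
{
    assert(pool != NULL);
    if (len == 0)
        len = pool->free_bytes;      /* take everything that is left */
    if (len == 0 || len > pool->free_bytes)
        return 0;                    /* nothing left, or request too big */
    pool->free_bytes -= len;
    return len;
}

/* Hypothetical guard: refuse a zero length instead of passing it on. */
static size_t toy_reserve_checked(struct toy_pool *pool, size_t len)
{
    if (len == 0)
        return 0;
    return toy_reserve(pool, len);
}
```

In this model, one toy_reserve(&pool, 0) call drains the pool, so any later request fails, which matches the symptom reported here. Whether a guard of this kind is the right fix for the ENA driver is a separate question for the maintainers.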

So, for two DPDK ENA ports, these two zones take around 2 GB or more of memory in DPDK 23.11. The mbuf pools then cannot be allocated because memory is exhausted.

Why is customer_metrics->buffer_len passed as 0 to ena_mem_alloc_coherent()?

Regards,
Madhuker.
Comment 1 Madhuker Mythri 2024-03-15 09:54:22 CET
Here are the logs from the failing dpdk-testpmd run on AWS C5.2xl VMs:
==============================================

/boot # ./dpdk-testpmd  -l 0-3 -n 2 --legacy-mem -- -i --mbcache=64
EAL: Detected CPU lcores: 8
EAL: Detected NUMA nodes: 1
EAL: Static memory layout is selected, amount of reserved memory can be adjusted with -m or --socket-mem
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:06.0 (socket -1)
test-aws: ena_com_allocate_customer_metrics_buffer() Enter
test-aws: ena_com_allocate_customer_metrics_buffer() Enter
[ENA_COM: ena_com_allocate_customer_metrics_buffer]test-aws: size: sizeof(*customer_metrics->virt_addr) = 1
[ENA_COM: ena_com_allocate_customer_metrics_buffer]test-aws: customer_metrics->buffer_len = 0
test-aws: size: sizeof(*customer_metrics->virt_addr) = 1
test-aws: customer_metrics->buffer_len = 0
test-aws: customer_metrics->buffer_dma_addr = 0
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 0
ena_mem_alloc_coherent(): test-aws: size = 0
ena_mem_alloc_coherent(): test-aws: z_name = ena_p0_mz0
test-aws: port_id = 0
test-aws: size = 0
test-aws: z_name = ena_p0_mz0
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
[ENA_COM: ena_com_allocate_customer_metrics_buffer]test-aws: ena_com_allocate_customer_metrics_buffer() Exit
test-aws: ena_com_allocate_customer_metrics_buffer() Exit
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 0
ena_mem_alloc_coherent(): test-aws: size = 8
ena_mem_alloc_coherent(): test-aws: z_name = ena_p0_mz1
test-aws: port_id = 0
test-aws: size = 8
test-aws: z_name = ena_p0_mz1
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 0
ena_mem_alloc_coherent(): test-aws: size = 2048
ena_mem_alloc_coherent(): test-aws: z_name = ena_p0_mz2
test-aws: port_id = 0
test-aws: size = 2048
test-aws: z_name = ena_p0_mz2
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 0
ena_mem_alloc_coherent(): test-aws: size = 2048
ena_mem_alloc_coherent(): test-aws: z_name = ena_p0_mz3
test-aws: port_id = 0
test-aws: size = 2048
test-aws: z_name = ena_p0_mz3
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 0
ena_mem_alloc_coherent(): test-aws: size = 1024
ena_mem_alloc_coherent(): test-aws: z_name = ena_p0_mz4
test-aws: port_id = 0
test-aws: size = 1024
test-aws: z_name = ena_p0_mz4
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 0
ena_mem_alloc_coherent(): test-aws: size = 4096
ena_mem_alloc_coherent(): test-aws: z_name = ena_p0_mz5
test-aws: port_id = 0
test-aws: size = 4096
test-aws: z_name = ena_p0_mz5
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
ena_get_metrics_entries(): 0x6 customer metrics are supported
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 0
ena_mem_alloc_coherent(): test-aws: size = 640
ena_mem_alloc_coherent(): test-aws: z_name = ena_p0_mz6
test-aws: port_id = 0
test-aws: size = 640
test-aws: z_name = ena_p0_mz6
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 0
ena_mem_alloc_coherent(): test-aws: size = 512
ena_mem_alloc_coherent(): test-aws: z_name = ena_p0_mz7
test-aws: port_id = 0
test-aws: size = 512
test-aws: z_name = ena_p0_mz7
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 0
ena_mem_alloc_coherent(): test-aws: size = 256
ena_mem_alloc_coherent(): test-aws: z_name = ena_p0_mz8
test-aws: port_id = 0
test-aws: size = 256
test-aws: z_name = ena_p0_mz8
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:07.0 (socket -1)
[ENA_COM: ena_com_allocate_customer_metrics_buffer]test-aws: ena_com_allocate_customer_metrics_buffer() Enter
test-aws: ena_com_allocate_customer_metrics_buffer() Enter
[ENA_COM: ena_com_allocate_customer_metrics_buffer]test-aws: size: sizeof(*customer_metrics->virt_addr) = 1
[ENA_COM: ena_com_allocate_customer_metrics_buffer]test-aws: customer_metrics->buffer_len = 0
test-aws: size: sizeof(*customer_metrics->virt_addr) = 1
test-aws: customer_metrics->buffer_len = 0
test-aws: customer_metrics->buffer_dma_addr = 0
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 1
ena_mem_alloc_coherent(): test-aws: size = 0
ena_mem_alloc_coherent(): test-aws: z_name = ena_p1_mz0
test-aws: port_id = 1
test-aws: size = 0
test-aws: z_name = ena_p1_mz0
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
[ENA_COM: ena_com_allocate_customer_metrics_buffer]test-aws: ena_com_allocate_customer_metrics_buffer() Exit
test-aws: ena_com_allocate_customer_metrics_buffer() Exit
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 1
ena_mem_alloc_coherent(): test-aws: size = 8
ena_mem_alloc_coherent(): test-aws: z_name = ena_p1_mz1
test-aws: port_id = 1
test-aws: size = 8
test-aws: z_name = ena_p1_mz1
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 1
ena_mem_alloc_coherent(): test-aws: size = 2048
ena_mem_alloc_coherent(): test-aws: z_name = ena_p1_mz2
test-aws: port_id = 1
test-aws: size = 2048
test-aws: z_name = ena_p1_mz2
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 1
ena_mem_alloc_coherent(): test-aws: size = 2048
ena_mem_alloc_coherent(): test-aws: z_name = ena_p1_mz3
test-aws: port_id = 1
test-aws: size = 2048
test-aws: z_name = ena_p1_mz3
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 1
ena_mem_alloc_coherent(): test-aws: size = 1024
ena_mem_alloc_coherent(): test-aws: z_name = ena_p1_mz4
test-aws: port_id = 1
test-aws: size = 1024
test-aws: z_name = ena_p1_mz4
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 1
ena_mem_alloc_coherent(): test-aws: size = 4096
ena_mem_alloc_coherent(): test-aws: z_name = ena_p1_mz5
test-aws: port_id = 1
test-aws: size = 4096
test-aws: z_name = ena_p1_mz5
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
ena_get_metrics_entries(): 0x6 customer metrics are supported
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 1
ena_mem_alloc_coherent(): test-aws: size = 640
ena_mem_alloc_coherent(): test-aws: z_name = ena_p1_mz6
test-aws: port_id = 1
test-aws: size = 640
test-aws: z_name = ena_p1_mz6
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 1
ena_mem_alloc_coherent(): test-aws: size = 512
ena_mem_alloc_coherent(): test-aws: z_name = ena_p1_mz7
test-aws: port_id = 1
test-aws: size = 512
test-aws: z_name = ena_p1_mz7
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Enter
test-aws: ena_mem_alloc_coherent() Enter
ena_mem_alloc_coherent(): test-aws: port_id = 1
ena_mem_alloc_coherent(): test-aws: size = 256
ena_mem_alloc_coherent(): test-aws: z_name = ena_p1_mz8
test-aws: port_id = 1
test-aws: size = 256
test-aws: z_name = ena_p1_mz8
ena_mem_alloc_coherent(): test-aws: ena_mem_alloc_coherent() Exit
test-aws: ena_mem_alloc_coherent() Exit
TELEMETRY: No legacy callbacks, legacy socket not created
Interactive-mode selected
Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mb_pool_0>: n=76800, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
EAL: Error - exiting with code: 1
  Cause: Creation of mbuf pool for socket 0 failed: Cannot allocate memory
Port 0 is closed
Port 1 is closed
/boot #

====================================
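
For scale, the mbuf pool that fails in the log above is requested with n=76800 and size=2176. A rough lower bound on its footprint can be worked out as below. This is a sketch that assumes the footprint is at least n times size and ignores per-element headers and mempool bookkeeping, so the real requirement is somewhat higher; pool_min_bytes() and bytes_to_mib_ceil() are hypothetical helper names.

```c
#include <stdint.h>

/* Lower bound on pool memory: element count times per-element data size. */
static uint64_t pool_min_bytes(uint64_t n_elems, uint64_t elem_size)
{
    return n_elems * elem_size;
}

/* Convert a byte count to whole MiB, rounding up. */
static uint64_t bytes_to_mib_ceil(uint64_t bytes)
{
    const uint64_t mib = 1024u * 1024u;
    return (bytes + mib - 1) / mib;
}
```

With the values from the log, pool_min_bytes(76800, 2176) is 167116800 bytes, which bytes_to_mib_ceil() rounds up to 160 MiB. That is small compared with 4 GB, which is consistent with the memory having been consumed elsewhere before the pool was created.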
