[PATCH] eal: zero out new added memory

lic121 chengtcli at qq.com
Tue Aug 30 11:49:42 CEST 2022


On Tue, Aug 30, 2022 at 01:11:25AM +0000, lic121 wrote:
> On Mon, Aug 29, 2022 at 03:49:25PM +0300, Dmitry Kozlyuk wrote:
> > 2022-08-29 14:37 (UTC+0200), Morten Brørup:
> > > > From: David Marchand [mailto:david.marchand at redhat.com]
> > > > Sent: Monday, 29 August 2022 13.58
> > > >
> > > > > > > > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:  
> > > > > > > > > The kernel ensures that the newly mapped memory is zeroed,
> > > > > > > > > and DPDK ensures that files in hugetlbfs are not re-mapped.  
> > > 
> > > David, are you suggesting that this invariant - guaranteeing that DPDK memory is zeroed - was violated by SELinux in the SELinux/container issue you were tracking?
> > > 
> > > If so, the method to ensure the invariant is faulty for SELinux. Assuming DPDK supports SELinux, this bug should be fixed.
> > 
> > +1, I'd like to know more about that case.
> > 
> > EAL checks the unlink() result, so if it fails, the allocation should fail
> > and the invariant should not be broken.
> > Code from 20.11.5:
> > 
> > 	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
> > 			unlink(path) == -1 &&
> > 			errno != ENOENT) {
> > 		RTE_LOG(DEBUG, EAL, "%s(): could not remove '%s': %s\n",
> > 			__func__, path, strerror(errno));
> > 		return -1;
> > 	}
> > 
> > Can SELinux restriction result in errno == ENOENT?
> > I'd expect EPERM/EACCES.
> 
> Thanks for the info. SELinux is disabled on my server, and I checked
> that the SELinux fix is already in my DPDK. Could any other settings
> cause dirty memory? If you can think of anything related, I can give
> it a try.
> 
> BTW, this is my NIC info:
> ```
> Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
> 
> driver: ice
> version: 1.9.3
> firmware-version: 2.30 0x80005d22 1.2877.0
> expansion-rom-version:
> bus-info: 0000:3b:00.1
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
> ```


Update with more debugging:

Preparation:
1. Reserve 2 GB of hugepages (two 1 GB pages, as the meminfo output shows).
```
[root at gz15-compute-s3-55e247e16e22 huge]# grep -i huge /proc/meminfo
AnonHugePages:    124928 kB
ShmemHugePages:        0 kB
HugePages_Total:       2
HugePages_Free:        2
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
Hugetlb:         2097152 kB
```

2. Write a simple program to poison memory
```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Return 1 if all `size` bytes at `memory` equal `val`, else 0. */
static int memvcmp(void *memory, unsigned char val, size_t size)
{
    unsigned char *mm = (unsigned char*)memory;
    return (*mm == val) && memcmp(mm, mm + 1, size - 1) == 0;
}

int main(int argc, char *argv[]){
    /* Compute in size_t: 2 * (1 << 30) overflows a 32-bit int. */
    size_t size = 2 * ((size_t)1 << 30) - 1;
    void *ptr2 = mmap(NULL,  size,
                            PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS |
                            MAP_HUGETLB, -1, 0);
    /* mmap() reports failure with MAP_FAILED, not NULL. */
    if (ptr2 == MAP_FAILED) {
        perror("failed to allocate memory");
        return 1;
    }
    /* Any extra command-line argument poisons the mapping with 0xff. */
    if (argc > 1) {
        memset(ptr2, 0xff, size);
    }
    unsigned char * ss = ptr2;
    printf("ss: %x\n", *ss);
    if (memvcmp(ptr2, 0, size)){
        printf("all zero\n");
    } else {
        printf("not all zero\n");
    }
    return 0;
}
```
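
When a check like the one above reports "not all zero", it may help to know where the first dirty byte is, for example whether it sits at the very start of the region or deep inside it. The following is only a sketch of such a helper; `first_nonzero_offset` is a hypothetical name and is not part of DPDK or of the program above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of DPDK): return the offset of the first
 * non-zero byte in [memory, memory + size), or SIZE_MAX if every byte is
 * zero (including the case size == 0).
 */
static size_t first_nonzero_offset(const void *memory, size_t size)
{
    const unsigned char *p = memory;
    size_t i;

    /* A NULL pointer is only acceptable for an empty region. */
    assert(memory != NULL || size == 0);

    for (i = 0; i < size; i++) {
        if (p[i] != 0)
            return i;
    }
    return SIZE_MAX;
}
```

It could be called on the mapped region in place of `memvcmp(ptr2, 0, size)`, printing the returned offset with `%zu` when it is not `SIZE_MAX`. A plain byte loop is slower than `memcmp`, but for a one-off debug run that should not matter.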

3. Insert debug logging to check whether newly added memory is all zero
```
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 5a09247a6..026560333 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -91,16 +91,32 @@ malloc_socket_to_heap_id(unsigned int socket_id)
 /*
  * Expand the heap with a memory area.
  */
+static int memvcmp(void *memory, unsigned char val, size_t size)
+{
+    unsigned char *mm = (unsigned char*)memory;
+    return (*mm == val) && memcmp(mm, mm + 1, size - 1) == 0;
+}
 static struct malloc_elem *
 malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
                void *start, size_t len)
 {
        struct malloc_elem *elem = start;
+       void *ptr;
+       size_t data_len;
+

        malloc_elem_init(elem, heap, msl, len, elem, len);

        malloc_elem_insert(elem);

+       ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
+       data_len = elem->size - MALLOC_ELEM_OVERHEAD;
+    if (memvcmp(ptr, 0, data_len)){
+        RTE_LOG(ERR, EAL, "liiiiiiilog: all zero\n");
+    } else {
+        RTE_LOG(ERR, EAL, "liiiiiiilog: not all zero\n");
+    }
+
        elem = malloc_elem_join_adjacent_free(elem);

        malloc_elem_free_list_insert(elem);
```

Debug steps:
1. Poison 2 GB of memory
```
[root at gz15-compute-s3-55e247e16e22 secure]# rm -rf /dev/hugepages/rtemap_* ; huge/a.out 1
ss: ff
not all zero
```
2. Run testpmd (with no NIC bound to vfio-pci)
```
[root at gz15-compute-s3-55e247e16e22 secure]# dpdk-testpmd -l 0-3 -n 4 -- -i --nb-cores=3
EAL: Detected 64 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: liiiiiiilog: not all zero
EAL: No legacy callbacks, legacy socket not created
testpmd: No probed ethernet devices
Interactive-mode selected
testpmd: create a new mbuf pool <mb_pool_0>: n=171456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
testpmd: create a new mbuf pool <mb_pool_1>: n=171456, size=2176, socket=1
testpmd: preferred mempool ops selected: ring_mp_mc
EAL: liiiiiiilog: not all zero
Done
testpmd>
```

Dirty memory shows up even when no NIC is probed.
I tried two different CPUs and saw the same issue on both:
- Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
- Intel(R) Xeon(R) Platinum 8378A CPU @ 3.00GHz


