[PATCH 2/3] net/nfp: fix free resource problem

Ferruh Yigit ferruh.yigit at amd.com
Thu Jan 11 13:32:41 CET 2024


On 1/11/2024 2:02 AM, Chaoyong He wrote:
>> On 1/9/2024 7:56 AM, Chaoyong He wrote:
>>>> On 12/18/2023 1:50 AM, Chaoyong He wrote:
>>>>>> On 12/14/2023 10:24 AM, Chaoyong He wrote:
>>>>>>> From: Long Wu <long.wu at corigine.com>
>>>>>>>
>>>>>>> Set the representor array to NULL to avoid that close interface
>>>>>>> does not free some resource.
>>>>>>>
>>>>>>> Fixes: a135bc1644d6 ("net/nfp: fix resource leak for flower
>>>>>>> firmware")
>>>>>>> Cc: chaoyong.he at corigine.com
>>>>>>> Cc: stable at dpdk.org
>>>>>>>
>>>>>>> Signed-off-by: Long Wu <long.wu at corigine.com>
>>>>>>> Reviewed-by: Chaoyong He <chaoyong.he at corigine.com>
>>>>>>> Reviewed-by: Peng Zhang <peng.zhang at corigine.com>
>>>>>>> ---
>>>>>>>  drivers/net/nfp/flower/nfp_flower_representor.c | 15
>>>>>>> ++++++++++++++-
>>>>>>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c
>>>>>>> b/drivers/net/nfp/flower/nfp_flower_representor.c
>>>>>>> index 27ea3891bd..5f7c1fa737 100644
>>>>>>> --- a/drivers/net/nfp/flower/nfp_flower_representor.c
>>>>>>> +++ b/drivers/net/nfp/flower/nfp_flower_representor.c
>>>>>>> @@ -294,17 +294,30 @@ nfp_flower_repr_tx_burst(void *tx_queue,
>>>>>>> static int  nfp_flower_repr_uninit(struct rte_eth_dev *eth_dev)  {
>>>>>>> +	uint16_t index;
>>>>>>>  	struct nfp_flower_representor *repr;
>>>>>>>
>>>>>>>  	repr = eth_dev->data->dev_private;
>>>>>>>  	rte_ring_free(repr->ring);
>>>>>>>
>>>>>>> +	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT) {
>>>>>>> +		index = NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(repr-
>>>>>>> port_id);
>>>>>>> +		repr->app_fw_flower->phy_reprs[index] = NULL;
>>>>>>> +	} else {
>>>>>>> +		index = repr->vf_id;
>>>>>>> +		repr->app_fw_flower->vf_reprs[index] = NULL;
>>>>>>> +	}
>>>>>>> +
>>>>>>>  	return 0;
>>>>>>>  }
>>>>>>>
>>>>>>>  static int
>>>>>>> -nfp_flower_pf_repr_uninit(__rte_unused struct rte_eth_dev
>>>>>>> *eth_dev)
>>>>>>> +nfp_flower_pf_repr_uninit(struct rte_eth_dev *eth_dev)
>>>>>>>  {
>>>>>>> +	struct nfp_flower_representor *repr =
>>>>>>> +eth_dev->data->dev_private;
>>>>>>> +
>>>>>>> +	repr->app_fw_flower->pf_repr = NULL;
>>>>>>>
>>>>>>
>>>>>> Here it is assigned to NULL but is it freed? If freed, why not set
>>>>>> to NULL where it is freed?
>>>>>>
>>>>>> Same for above phy_reprs & vf_reprs.
>>>>>
>>>>> The whole invoke view:
>>>>> rte_eth_dev_close()
>>>>>     --> nfp_flower_repr_dev_close()
>>>>>         --> nfp_flower_repr_free()
>>>>>             --> nfp_flower_pf_repr_uninit()
>>>>>             --> nfp_flower_repr_uninit()
>>>>>            // In these two functions, we just assigned to NULL but not freed
>> yet.
>>>>>            // It is still refer by the `eth_dev->data->dev_private`.
>>>>>     --> rte_eth_dev_release_port()
>>>>>         --> rte_free(eth_dev->data->dev_private);
>>>>>         // And here it is really freed (by the rte framework).
>>>>>
>>>>
>>>> 'rte_eth_dev_release_port()' frees the device private data, but not
>>>> all pointers, like 'repr->app_fw_flower->pf_repr', in the struct are
>>>> freed, it is dev_close() or
>>>> unint() functions responsibility.
>>>>
>>>> Can you please double check if
>>>> 'eth_dev->data->dev_private->app_fw_flower->pf_repr' freed or not?
>>>
>>> (gdb) b nfp_flower_repr_dev_close
>>> Breakpoint 1 at 0x7f839a4ad37f:
>> file ../drivers/net/nfp/flower/nfp_flower_representor.c, line 356.
>>> (gdb) c
>>> Continuing.
>>>
>>> Thread 1 "dpdk-testpmd" hit Breakpoint 1, nfp_flower_repr_dev_close
>> (dev=0x7f839aed2340 <rte_eth_devices>)
>>>     at ../drivers/net/nfp/flower/nfp_flower_representor.c:356
>>> 356             if (rte_eal_process_type() != RTE_PROC_PRIMARY)
>>> (gdb) n
>>> 359             repr = dev->data->dev_private;
>>> (gdb)
>>> 360             app_fw_flower = repr->app_fw_flower;
>>> (gdb)
>>> 361             hw = app_fw_flower->pf_hw;
>>> (gdb)
>>> 362             pf_dev = hw->pf_dev;
>>> (gdb)
>>> 368             nfp_net_disable_queues(dev);
>>> (gdb) p repr
>>> $1 = (struct nfp_flower_representor *) 0x17c49c800
>>> (gdb) p dev->data->dev_private
>>> $2 = (void *) 0x17c49c800
>>> (gdb) p repr->app_fw_flower->pf_repr
>>> $3 = (struct nfp_flower_representor *) 0x17c49c800
>>>
>>> As we can see, these three pointers point the same block of memory.
>>>
>>
>> Ahh, I missed that 'repr->app_fw_flower->pf_repr' points to 'dev_private', so
>> your code makes sense.
>>
>> But if it is 'dev_private', why free it in 'nfp_pf_uninit()' as it will be freed by
>> 'rte_eth_dev_release_port()'?
> 
> Sorry, I'm not understanding this.
> The 'dev_private' is a 'struct nfp_flower_representor *', and it will be freed in 'rte_eth_dev_release_port()'.
> What I freed in 'nfp_pf_uninit()' is a 'struct nfp_pf_dev *', so I'm not catch your point about this.
> 
>> Won't removing 'rte_free(pf_dev);' from 'nfp_pf_uninit()' will have the same
>> effect, instead of setting it NULL in advance?
>>
> 
> If I remove the 'rte_free(pf_dev);' from 'nfp_pf_uninit()', there will be a memory leak as no one will free it, and actually I'm not 'setting it NULL in advance'.
> 
> 359             repr = dev->data->dev_private;
> 360             app_fw_flower = repr->app_fw_flower;
> 361             hw = app_fw_flower->pf_hw;
> 362             pf_dev = hw->pf_dev;
> 
> Maybe you just confuse the 'pf_repr' and 'pf_dev'? Just a guess.
>

Yes I did confuse those two, sorry about that.

'repr->app_fw_flower->pf_repr' is 'dev_private', and I assumed you are
setting it NULL to escape from double free (and was checking where that
double free happens), but I guess that is not the case.

'rte_eth_dev_destroy()' calls 'rte_eth_dev_release_port()' and frees
'dev_private' but 'repr->app_fw_flower->pf_repr' remains as dangling
pointer and perhaps prevents 'nfp_flower_repr_dev_close()' move forward
(because of "if (app_fw_flower->pf_repr != NULL)" check), and you are
fixing it, is it the case?



More information about the stable mailing list