[dpdk-dev,V20,4/4] app/testpmd: show example to handler hot unplug

Message ID 1524058689-4954-5-git-send-email-jia.guo@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation fail Compilation issues

Commit Message

Guo, Jia April 18, 2018, 1:38 p.m. UTC
  Use testpmd for example, to show how an application smoothly handle
failure when device being hot unplug. Once app detect the removal event,
the callback would be called, it first stop the packet forwarding, then
stop the port, close the port and finally detach the port.

Signed-off-by: Jeff Guo <jia.guo@intel.com>
---
v20->v19:
remove the auto binding example.
---
 app/test-pmd/testpmd.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)
  

Comments

Matan Azrad May 3, 2018, 7:25 a.m. UTC | #1
Hi Jeff

> From: Jeff Guo, Wednesday, April 18, 2018 4:38 PM
> Use testpmd for example, to show how an application smoothly handle
> failure when device being hot unplug. Once app detect the removal event,
> the callback would be called, it first stop the packet forwarding, then stop the
> port, close the port and finally detach the port.
> 
> Signed-off-by: Jeff Guo <jia.guo@intel.com>
> ---
> v20->v19:
> remove the auto binding example.
> ---
>  app/test-pmd/testpmd.c | 29 +++++++++++++++++++++++++----
>  1 file changed, 25 insertions(+), 4 deletions(-)
> 
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> 5986ff7..3751901 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -1125,6 +1125,9 @@ run_pkt_fwd_on_lcore(struct fwd_lcore *fc,
> packet_fwd_t pkt_fwd)
>  	tics_datum = rte_rdtsc();
>  	tics_per_1sec = rte_get_timer_hz();
>  #endif
> +	if (hot_plug)
> +		rte_dev_handle_hot_unplug();
> +

Again, I don't understand why the application should configure it - it already started the hot-plug,
Can't the EAL handle this automatically when the user starts the hot-plug?

>  	fsm = &fwd_streams[fc->stream_idx];
>  	nb_fs = fc->stream_nb;
>  	do {
> @@ -2069,6 +2072,26 @@ rmv_event_callback(void *arg)
>  			dev->device->name);
>  }
> 
> +static void
> +rmv_dev_event_callback(char *dev_name)
> +{
> +	uint16_t port_id;
> +	int ret;
> +
> +	ret = rte_eth_dev_get_port_by_name(dev_name, &port_id);
> +	if (ret) {
> +		printf("can not get port by device %s!\n", dev_name);
> +		return;
> +	}
> +
> +	RTE_ETH_VALID_PORTID_OR_RET(port_id);
> +	printf("removing port id:%u\n", port_id);
> +	stop_packet_forwarding();
> +	stop_port(port_id);
> +	close_port(port_id);
> +	detach_port(port_id);
> +}

We have also the rmv_event_callback() which is triggered by a RMV interrupt and running by the host thread.
What is the context thread of rmv_dev_event_callback()?
Shouldn't they be synchronized? Should we need both in the same time?

> +
>  /* This function is used by the interrupt thread */  static int
> eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void
> *param, @@ -2130,9 +2153,7 @@ eth_dev_event_callback(char
> *device_name, enum rte_dev_event_type type,
>  	case RTE_DEV_EVENT_REMOVE:
>  		RTE_LOG(ERR, EAL, "The device: %s has been removed!\n",
>  			device_name);
> -		/* TODO: After finish failure handle, begin to stop
> -		 * packet forward, stop port, close port, detach port.
> -		 */
> +		rmv_dev_event_callback(device_name);
>  		break;
>  	case RTE_DEV_EVENT_ADD:
>  		RTE_LOG(ERR, EAL, "The device: %s has been added!\n",
> @@ -2640,7 +2661,7 @@ main(int argc, char** argv)
>  			return -1;
>  		}
>  		eth_dev_event_callback_register();
> -
> +		rte_dev_handle_hot_unplug();
>  	}
> 
>  	if (start_port(RTE_PORT_ALL) != 0)
> --
> 2.7.4
  
Guo, Jia May 3, 2018, 9:35 a.m. UTC | #2
hi, matan


On 5/3/2018 3:25 PM, Matan Azrad wrote:
> Hi Jeff
>
>> From: Jeff Guo, Wednesday, April 18, 2018 4:38 PM
>> Use testpmd for example, to show how an application smoothly handle
>> failure when device being hot unplug. Once app detect the removal event,
>> the callback would be called, it first stop the packet forwarding, then stop the
>> port, close the port and finally detach the port.
>>
>> Signed-off-by: Jeff Guo <jia.guo@intel.com>
>> ---
>> v20->v19:
>> remove the auto binding example.
>> ---
>>   app/test-pmd/testpmd.c | 29 +++++++++++++++++++++++++----
>>   1 file changed, 25 insertions(+), 4 deletions(-)
>>
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
>> 5986ff7..3751901 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -1125,6 +1125,9 @@ run_pkt_fwd_on_lcore(struct fwd_lcore *fc,
>> packet_fwd_t pkt_fwd)
>>   	tics_datum = rte_rdtsc();
>>   	tics_per_1sec = rte_get_timer_hz();
>>   #endif
>> +	if (hot_plug)
>> +		rte_dev_handle_hot_unplug();
>> +
> Again, I don't understand why the application should configure it - it already started the hot-plug,
> Can't the EAL handle this automatically when the user starts the hot-plug?
please check v21,  agree with you and have already modify it.
>>   	fsm = &fwd_streams[fc->stream_idx];
>>   	nb_fs = fc->stream_nb;
>>   	do {
>> @@ -2069,6 +2072,26 @@ rmv_event_callback(void *arg)
>>   			dev->device->name);
>>   }
>>
>> +static void
>> +rmv_dev_event_callback(char *dev_name)
>> +{
>> +	uint16_t port_id;
>> +	int ret;
>> +
>> +	ret = rte_eth_dev_get_port_by_name(dev_name, &port_id);
>> +	if (ret) {
>> +		printf("can not get port by device %s!\n", dev_name);
>> +		return;
>> +	}
>> +
>> +	RTE_ETH_VALID_PORTID_OR_RET(port_id);
>> +	printf("removing port id:%u\n", port_id);
>> +	stop_packet_forwarding();
>> +	stop_port(port_id);
>> +	close_port(port_id);
>> +	detach_port(port_id);
>> +}
> We have also the rmv_event_callback() which is triggered by a RMV interrupt and running by the host thread.
> What is the context thread of rmv_dev_event_callback()?
> Shouldn't they be synchronized? Should we need both in the same time?
the context thread is interrupt thread.  and we might be discuss how to 
sync it. do you have comment if i combine these 2 into  1 callback?
>> +
>>   /* This function is used by the interrupt thread */  static int
>> eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void
>> *param, @@ -2130,9 +2153,7 @@ eth_dev_event_callback(char
>> *device_name, enum rte_dev_event_type type,
>>   	case RTE_DEV_EVENT_REMOVE:
>>   		RTE_LOG(ERR, EAL, "The device: %s has been removed!\n",
>>   			device_name);
>> -		/* TODO: After finish failure handle, begin to stop
>> -		 * packet forward, stop port, close port, detach port.
>> -		 */
>> +		rmv_dev_event_callback(device_name);
>>   		break;
>>   	case RTE_DEV_EVENT_ADD:
>>   		RTE_LOG(ERR, EAL, "The device: %s has been added!\n",
>> @@ -2640,7 +2661,7 @@ main(int argc, char** argv)
>>   			return -1;
>>   		}
>>   		eth_dev_event_callback_register();
>> -
>> +		rte_dev_handle_hot_unplug();
>>   	}
>>
>>   	if (start_port(RTE_PORT_ALL) != 0)
>> --
>> 2.7.4
  
Matan Azrad May 3, 2018, 11:27 a.m. UTC | #3
Hi Guo

From: Guo, Jia, Thursday, May 3, 2018 12:36 PM
> hi, matan
> 
> 
> On 5/3/2018 3:25 PM, Matan Azrad wrote:
> > Hi Jeff
> >
> >> From: Jeff Guo, Wednesday, April 18, 2018 4:38 PM Use testpmd for
> >> example, to show how an application smoothly handle failure when
> >> device being hot unplug. Once app detect the removal event, the
> >> callback would be called, it first stop the packet forwarding, then
> >> stop the port, close the port and finally detach the port.
> >>
> >> Signed-off-by: Jeff Guo <jia.guo@intel.com>
> >> ---
> >> v20->v19:
> >> remove the auto binding example.
> >> ---
> >>   app/test-pmd/testpmd.c | 29 +++++++++++++++++++++++++----
> >>   1 file changed, 25 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> >> 5986ff7..3751901 100644
> >> --- a/app/test-pmd/testpmd.c
> >> +++ b/app/test-pmd/testpmd.c
> >> @@ -1125,6 +1125,9 @@ run_pkt_fwd_on_lcore(struct fwd_lcore *fc,
> >> packet_fwd_t pkt_fwd)
> >>   	tics_datum = rte_rdtsc();
> >>   	tics_per_1sec = rte_get_timer_hz();
> >>   #endif
> >> +	if (hot_plug)
> >> +		rte_dev_handle_hot_unplug();
> >> +
> > Again, I don't understand why the application should configure it - it
> > already started the hot-plug, Can't the EAL handle this automatically when
> the user starts the hot-plug?
> please check v21,  agree with you and have already modify it.

Looks good, thanks.

> >>   	fsm = &fwd_streams[fc->stream_idx];
> >>   	nb_fs = fc->stream_nb;
> >>   	do {
> >> @@ -2069,6 +2072,26 @@ rmv_event_callback(void *arg)
> >>   			dev->device->name);
> >>   }
> >>
> >> +static void
> >> +rmv_dev_event_callback(char *dev_name) {
> >> +	uint16_t port_id;
> >> +	int ret;
> >> +
> >> +	ret = rte_eth_dev_get_port_by_name(dev_name, &port_id);
> >> +	if (ret) {
> >> +		printf("can not get port by device %s!\n", dev_name);
> >> +		return;
> >> +	}
> >> +
> >> +	RTE_ETH_VALID_PORTID_OR_RET(port_id);
> >> +	printf("removing port id:%u\n", port_id);
> >> +	stop_packet_forwarding();
> >> +	stop_port(port_id);
> >> +	close_port(port_id);
> >> +	detach_port(port_id);
> >> +}
> > We have also the rmv_event_callback() which is triggered by a RMV
> interrupt and running by the host thread.
> > What is the context thread of rmv_dev_event_callback()?
> > Shouldn't they be synchronized? Should we need both in the same time?
> the context thread is interrupt thread.  and we might be discuss how to sync
> it. do you have comment if i combine these 2 into  1 callback?


Please see the patch series I sent today regarding rmv_event_callback() function:
" [PATCH 0/6] Testpmd: fix port hotplug".


Yes, I think you should use rmv_event_callback() by your function (after the port id retrieving) to do code reuse.

Regarding synchronization, let's discuss:

So, the both callbacks are running from the same thread, but by different fd.
Right?
So, they will be triggered sequentially by the kernel when a device is plugged-out and we cannot know the order.
Right?
The second one may get an "invalid port" error  because the first one was detached the port - not a major issue,
but the port id can be reused between them and then the second one may detach an available port.
Right?

So, looks like it is better to choose only one of them,

If all the above conclusions are correct , I suggest to disable the ethdev mechanism when the EAL hotplug is enabled:


	@@ -2152,9 +2152,10 @@ struct pmd_test_command {
	 
	        switch (type) {
	        case RTE_ETH_EVENT_INTR_RMV:
	-               if (rte_eal_alarm_set(100000,
	-                               rmv_event_callback, (void *)(intptr_t)port_id))
	-                       fprintf(stderr, "Could not set up deferred device removal\n");
	+               if (!hot_plug)
	+                       if (rte_eal_alarm_set(100000, rmv_event_callback,
	+                           (void *)(intptr_t)port_id))
	+                               fprintf(stderr, "Could not set up deferred device removal\n");
	                break;
	        default:
	                break;


What do you think?

Matan.


> >>   /* This function is used by the interrupt thread */  static int
> >> eth_event_callback(portid_t port_id, enum rte_eth_event_type type,
> >> void *param, @@ -2130,9 +2153,7 @@ eth_dev_event_callback(char
> >> *device_name, enum rte_dev_event_type type,
> >>   	case RTE_DEV_EVENT_REMOVE:
> >>   		RTE_LOG(ERR, EAL, "The device: %s has been removed!\n",
> >>   			device_name);
> >> -		/* TODO: After finish failure handle, begin to stop
> >> -		 * packet forward, stop port, close port, detach port.
> >> -		 */
> >> +		rmv_dev_event_callback(device_name);
> >>   		break;
> >>   	case RTE_DEV_EVENT_ADD:
> >>   		RTE_LOG(ERR, EAL, "The device: %s has been added!\n",
> @@ -2640,7
> >> +2661,7 @@ main(int argc, char** argv)
> >>   			return -1;
> >>   		}
> >>   		eth_dev_event_callback_register();
> >> -
> >> +		rte_dev_handle_hot_unplug();
> >>   	}
> >>
> >>   	if (start_port(RTE_PORT_ALL) != 0)
> >> --
> >> 2.7.4
  

Patch

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 5986ff7..3751901 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1125,6 +1125,9 @@  run_pkt_fwd_on_lcore(struct fwd_lcore *fc, packet_fwd_t pkt_fwd)
 	tics_datum = rte_rdtsc();
 	tics_per_1sec = rte_get_timer_hz();
 #endif
+	if (hot_plug)
+		rte_dev_handle_hot_unplug();
+
 	fsm = &fwd_streams[fc->stream_idx];
 	nb_fs = fc->stream_nb;
 	do {
@@ -2069,6 +2072,26 @@  rmv_event_callback(void *arg)
 			dev->device->name);
 }
 
+static void
+rmv_dev_event_callback(char *dev_name)
+{
+	uint16_t port_id;
+	int ret;
+
+	ret = rte_eth_dev_get_port_by_name(dev_name, &port_id);
+	if (ret) {
+		printf("can not get port by device %s!\n", dev_name);
+		return;
+	}
+
+	RTE_ETH_VALID_PORTID_OR_RET(port_id);
+	printf("removing port id:%u\n", port_id);
+	stop_packet_forwarding();
+	stop_port(port_id);
+	close_port(port_id);
+	detach_port(port_id);
+}
+
 /* This function is used by the interrupt thread */
 static int
 eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void *param,
@@ -2130,9 +2153,7 @@  eth_dev_event_callback(char *device_name, enum rte_dev_event_type type,
 	case RTE_DEV_EVENT_REMOVE:
 		RTE_LOG(ERR, EAL, "The device: %s has been removed!\n",
 			device_name);
-		/* TODO: After finish failure handle, begin to stop
-		 * packet forward, stop port, close port, detach port.
-		 */
+		rmv_dev_event_callback(device_name);
 		break;
 	case RTE_DEV_EVENT_ADD:
 		RTE_LOG(ERR, EAL, "The device: %s has been added!\n",
@@ -2640,7 +2661,7 @@  main(int argc, char** argv)
 			return -1;
 		}
 		eth_dev_event_callback_register();
-
+		rte_dev_handle_hot_unplug();
 	}
 
 	if (start_port(RTE_PORT_ALL) != 0)