[v12,3/7] bus: add sigbus handler

Message ID: 1538483726-96411-4-git-send-email-jia.guo@intel.com (mailing list archive)
State: Superseded, archived
Delegated to: Thomas Monjalon
Series: [v12,1/7] bus: add hot-unplug handler

Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  fail     Compilation issues

Commit Message

Guo, Jia Oct. 2, 2018, 12:35 p.m. UTC
When a device is hot-unplugged, a sigbus error will occur if the datapath
can still read/write to the device. A handler is required here to capture
the sigbus signal and handle it appropriately.

This patch introduces a bus op to handle sigbus errors. Each bus can
implement its own case-dependent logic to handle the sigbus errors.

Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
---
v12->v11:
no change.
---
 lib/librte_eal/common/include/rte_bus.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
  

Comments

Anatoly Burakov Oct. 2, 2018, 2:32 p.m. UTC | #1
On 02-Oct-18 1:35 PM, Jeff Guo wrote:
> When a device is hot-unplugged, a sigbus error will occur if the datapath
> can still read/write to the device. A handler is required here to capture
> the sigbus signal and handle it appropriately.
> 
> This patch introduces a bus op to handle sigbus errors. Each bus can
> implement its own case-dependent logic to handle the sigbus errors.
> 
> Signed-off-by: Jeff Guo <jia.guo@intel.com>
> Acked-by: Shaopeng He <shaopeng.he@intel.com>
> ---
> v12->v11:
> no change.
> ---
>   lib/librte_eal/common/include/rte_bus.h | 18 ++++++++++++++++++
>   1 file changed, 18 insertions(+)
> 
> diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
> index 1bb53dc..201454a 100644
> --- a/lib/librte_eal/common/include/rte_bus.h
> +++ b/lib/librte_eal/common/include/rte_bus.h
> @@ -182,6 +182,21 @@ typedef int (*rte_bus_parse_t)(const char *name, void *addr);
>   typedef int (*rte_bus_hot_unplug_handler_t)(struct rte_device *dev);
>   
>   /**
> + * Implement a specific sigbus handler, which is responsible for handling
> + * the sigbus error which is either original memory error, or specific memory
> + * error that caused of device be hot-unplugged. When sigbus error be captured,
> + * it could call this function to handle sigbus error.
> + * @param failure_addr
> + *	Pointer of the fault address of the sigbus error.
> + *
> + * @return
> + *	0 for success handle the sigbus.
> + *	1 for no bus handle the sigbus.

I think the comment here should be reworded. I can't parse "no bus 
handle the sigbus" - what does that mean, and how is it different from 
an error?
  
Guo, Jia Oct. 4, 2018, 3:14 a.m. UTC | #2
On 10/2/2018 10:32 PM, Burakov, Anatoly wrote:
> On 02-Oct-18 1:35 PM, Jeff Guo wrote:
>> When a device is hot-unplugged, a sigbus error will occur if the datapath
>> can still read/write to the device. A handler is required here to capture
>> the sigbus signal and handle it appropriately.
>>
>> This patch introduces a bus op to handle sigbus errors. Each bus can
>> implement its own case-dependent logic to handle the sigbus errors.
>>
>> Signed-off-by: Jeff Guo <jia.guo@intel.com>
>> Acked-by: Shaopeng He <shaopeng.he@intel.com>
>> ---
>> v12->v11:
>> no change.
>> ---
>>   lib/librte_eal/common/include/rte_bus.h | 18 ++++++++++++++++++
>>   1 file changed, 18 insertions(+)
>>
>> diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
>> index 1bb53dc..201454a 100644
>> --- a/lib/librte_eal/common/include/rte_bus.h
>> +++ b/lib/librte_eal/common/include/rte_bus.h
>> @@ -182,6 +182,21 @@ typedef int (*rte_bus_parse_t)(const char *name, void *addr);
>>   typedef int (*rte_bus_hot_unplug_handler_t)(struct rte_device *dev);
>>
>>   /**
>> + * Implement a specific sigbus handler, which is responsible for handling
>> + * the sigbus error which is either original memory error, or specific memory
>> + * error that caused of device be hot-unplugged. When sigbus error be captured,
>> + * it could call this function to handle sigbus error.
>> + * @param failure_addr
>> + *    Pointer of the fault address of the sigbus error.
>> + *
>> + * @return
>> + *    0 for success handle the sigbus.
>> + *    1 for no bus handle the sigbus.
>
> I think the comment here should be reworded. I can't parse "no bus 
> handle the sigbus" - what does that mean, and how is it different from 
> an error?
>

OK, I will describe the return values in more detail.
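
To illustrate the three-way return convention under discussion, here is a
minimal sketch. It assumes that 1 means "the faulting address does not belong
to this bus", so the caller may treat the fault as an ordinary memory error,
while -1 means the bus claimed the address but failed to recover. The names
demo_bus, demo_bar, demo_sigbus_handler and demo_dispatch_sigbus are
hypothetical and do not come from the patch. Only the callback signature
mirrors rte_bus_sigbus_handler_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the rte_bus_sigbus_handler_t typedef in the patch. */
typedef int (*sigbus_handler_t)(const void *failure_addr);

/* Hypothetical stand-in for struct rte_bus, reduced to the one callback. */
struct demo_bus {
	const char *name;
	sigbus_handler_t sigbus_handler;
};

/* Hypothetical device memory window owned by the demo bus. */
static uint8_t demo_bar[64];

/* Hypothetical bus handler: claims only addresses inside demo_bar. */
static int
demo_sigbus_handler(const void *failure_addr)
{
	uintptr_t addr = (uintptr_t)failure_addr;
	uintptr_t base = (uintptr_t)demo_bar;

	if (addr < base || addr >= base + sizeof(demo_bar))
		return 1;	/* address is not on this bus */
	/* A real bus would run its hot-unplug recovery here, -1 on failure. */
	return 0;
}

/*
 * Walk all buses. Stop at the first one that handles the fault (0) or
 * fails while handling it (-1). Return 1 if no bus claims the address.
 */
static int
demo_dispatch_sigbus(const struct demo_bus *buses, size_t nb_buses,
		const void *failure_addr)
{
	size_t i;

	assert(buses != NULL || nb_buses == 0);
	for (i = 0; i < nb_buses; i++) {
		int ret;

		if (buses[i].sigbus_handler == NULL)
			continue;	/* bus does not implement the op */
		ret = buses[i].sigbus_handler(failure_addr);
		if (ret <= 0)
			return ret;
	}
	return 1;
}
```

Under this reading, a caller that gets 1 back would fall through to whatever
generic sigbus handling was in place before, and a caller that gets -1 would
report an unrecoverable hot-unplug failure.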
  

Patch

diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index 1bb53dc..201454a 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -182,6 +182,21 @@  typedef int (*rte_bus_parse_t)(const char *name, void *addr);
 typedef int (*rte_bus_hot_unplug_handler_t)(struct rte_device *dev);
 
 /**
+ * Implement a specific sigbus handler, which is responsible for handling
+ * the sigbus error which is either original memory error, or specific memory
+ * error that caused of device be hot-unplugged. When sigbus error be captured,
+ * it could call this function to handle sigbus error.
+ * @param failure_addr
+ *	Pointer of the fault address of the sigbus error.
+ *
+ * @return
+ *	0 for success handle the sigbus.
+ *	1 for no bus handle the sigbus.
+ *	-1 for failed to handle the sigbus
+ */
+typedef int (*rte_bus_sigbus_handler_t)(const void *failure_addr);
+
+/**
  * Bus scan policies
  */
 enum rte_bus_scan_mode {
@@ -228,6 +243,9 @@  struct rte_bus {
 	rte_dev_iterate_t dev_iterate; /**< Device iterator. */
 	rte_bus_hot_unplug_handler_t hot_unplug_handler;
 				/**< handle hot-unplug failure on the bus */
+	rte_bus_sigbus_handler_t sigbus_handler;
+					/**< handle sigbus error on the bus */
+
 };
 
 /**