[RFC PATCH 0/6] add json string escaping to telemetry

Morten Brørup mb at smartsharesystems.com
Fri Jun 24 11:12:05 CEST 2022


> From: Bruce Richardson [mailto:bruce.richardson at intel.com]
> Sent: Friday, 24 June 2022 10.14
> 
> On Thu, Jun 23, 2022 at 09:04:31PM +0200, Morten Brørup wrote:
> > > From: Bruce Richardson [mailto:bruce.richardson at intel.com]
> > > Sent: Thursday, 23 June 2022 18.43
> > >
> > > This RFC shows one possible approach for escaping strings for the
> json
> > > output of telemetry library. For now this RFC supports escaping
> strings
> > > for the cases of returning a single string, or returning an array
> of
> > > strings. Not done is escaping of strings in objs/dicts [see more
> below
> > > on TODO]
> >
> > Very good initiative.
> >
> > >
> > > As well as telemetry lib changes, this patchset includes unit tests
> for
> > > the above and also little bit of cleanup to the json tests.
> > >
> > > TODO:
> > > Beyond what is here in this RFC:
> > >
> > > 1. we need to decide what to do about name/value pairs. Personally,
> I
> > >    think we should add the restriction to the
> "rte_tel_data_add_obj_*"
> > > APIs
> > >    to only allow a defined subset of characters in names: e.g.
> > > alphanumeric
> > >    chars, underscore and dash. That means that we only need to
> escape
> > >    the data part in the case of string returns.
> >
> > I agree about only allowing a subset of characters in names, so JSON
> (and other) encoding is not required.
> >
> > However, I think we should be less restrictive, and also allow
> characters commonly used for separation, indexing and wildcard, such as
> '/', '[', ']', and '*', '?' or '%'.
> >
> > Obviously, we should disallow characters requiring escaping in not
> just JSON, but also other foreseeable encodings and protocols. So
> please bring your crystal ball to the discussion. ;-)
> >
> Exactly why I am looking for feedback - and why I'm looking to have an
> explicit allowed list of characters rather than trying to just block
> the
> known-bad in json ones.
> 
> For your suggestions: +1 to separators and indexing, i.e. '[', ']' and
> '/',
> though I would probably also add ',' and maybe '.' (unless it's likely
> to
> cause issues with some protocol we are likely to want to use).

After having slept on it, I think we should also allow characters that could appear in IP and MAC addresses, i.e. '.' and ':' (and '/' for subnetting).

> For the wildcarding, I find it hard to see why we would want those?

Initially, I thought a wildcard might be useful as a placeholder in templates.

But it might also be useful for partial IP or MAC addresses. E.g.:
- The SmartShare Systems OUI could be represented by the MAC address "00:1F:B4:??:??:??".
- A default gateway address in a template configuration could be "192.168.*.1".

On the other hand, wildcard characters could be disallowed or require escaping in other (non-JSON) protocols.

So I'm just being a bit creative here, throwing out ideas in our search for the right balance in the restrictions.

> 
> The other advantage of using an allowlist of characters is that it
> makes it
> possible to expand over time, compared to a blocklist which always runs
> the
> risk of breaking something if you expand it. Therefore I suggest we
> keep
> the list as small as we need right now, and expand it only as we need.

+1

> 
> > > 2. once agreed, need to implement a patch to escape strings in
> > >    dicts/objs
> >
> > Yes.
> >
> > >
> > > 3. need to add a patch to escape the input command if it contains
> > >    invalid chars
> >
> > What do you mean here? You mean unescape JSON encoded input (arriving
> on the JSON telemetry socket) to a proper binary string?
> >
> 
> The thing with the telemetry socket interface right now is that the
> input
> requests are not-json. The reasons for that is that they be kept as
> simple
> as possible, and to avoid needing a full json parser inside DPDK.
> Therefore, the input sent by the user could contain invalid characters
> for
> json output so we need to:
> 1. Guarantee that no command registered with the telemetry library
> contains
>    invalid json characters (though why someone would do so, I don't
> know!)
> 2. When we return the command back in the reply, properly escape any
>    invalid characters in the error case.
> 
> #1 is very important for sanity checking, but now that I think about it
> #2
> is probably optional, since if any user does start sending invalid
> garbage
> input that breaks their json parser on return, they are only hurting
> themselves and not affecting anything else on the system.
> 
> > > 4. some small refactoring of the main telemetry.c json-encoding
> > > function may be possible.
> >
> > Perhaps.
> >
> I saw some options for cleanup when I was working on the code, so
> including
> this as a note-to-self as much as anything else for feedback. :-)
> 
> /Bruce



More information about the dev mailing list