[dpdk-dev,6/6] app/crypto-perf: use single mempool

Message ID 20170818080520.43088-7-pablo.de.lara.guarch@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Pablo de Lara Guarch
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  fail     Compilation issues

Commit Message

De Lara Guarch, Pablo Aug. 18, 2017, 8:05 a.m. UTC
  In order to improve memory utilization, a single mempool
is created, containing the crypto operation and mbufs
(one if operation is in-place, two if out-of-place).
This way, a single object is allocated and freed
per operation, reducing the amount of memory in cache,
which improves scalability.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
---
 app/test-crypto-perf/cperf_ops.c             |  96 ++++++--
 app/test-crypto-perf/cperf_ops.h             |   2 +-
 app/test-crypto-perf/cperf_test_latency.c    | 350 ++++++++++++--------------
 app/test-crypto-perf/cperf_test_throughput.c | 347 ++++++++++++--------------
 app/test-crypto-perf/cperf_test_verify.c     | 356 ++++++++++++---------------
 5 files changed, 553 insertions(+), 598 deletions(-)
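
The layout described in the commit message can be sketched in plain C. This is only an illustration under stated assumptions, not the patch's code: `CACHE_LINE`, `struct layout`, `round_up()` and `compute_layout()` are hypothetical names, and the header sizes are passed in as parameters instead of being taken from DPDK's `struct rte_crypto_op` and `struct rte_mbuf`. The arithmetic follows what the constructor in the diff appears to do: pad the crypto operation to a cache line, place the source segments after it, and append one destination buffer when the test is out-of-place. Unlike the posted hunk, which adds only `max_size` for the destination, the sketch also reserves room for the destination header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cache-line size; DPDK takes this from its build configuration. */
#define CACHE_LINE 64u

/* Hypothetical description of one pool object: op | src segments | dst. */
struct layout {
	uint32_t src_buf_offset; /* offset of the first source mbuf */
	uint32_t dst_buf_offset; /* 0 means in-place (no destination mbuf) */
	uint32_t segments_nb;    /* source segments per object */
	uint32_t obj_size;       /* total bytes per pool object */
};

/* Round v up to the next multiple of align. */
static uint32_t round_up(uint32_t v, uint32_t align)
{
	return (v + align - 1) / align * align;
}

static struct layout
compute_layout(uint32_t op_size, uint32_t mbuf_hdr_size, uint32_t segment_sz,
		uint32_t max_size, int out_of_place)
{
	struct layout l;
	uint32_t mbuf_size = mbuf_hdr_size + segment_sz;

	assert(segment_sz > 0);

	/* Ceiling division: enough segments to hold max_size bytes. */
	l.segments_nb = (max_size + segment_sz - 1) / segment_sz;
	l.src_buf_offset = round_up(op_size, CACHE_LINE);
	l.obj_size = l.src_buf_offset + mbuf_size * l.segments_nb;
	l.dst_buf_offset = 0;

	if (out_of_place) {
		/* Destination follows the last source segment. */
		l.dst_buf_offset = l.obj_size;
		/* One single-segment destination: header plus full payload. */
		l.obj_size += mbuf_hdr_size + max_size;
	}
	return l;
}
```

With a 100-byte operation, a 128-byte header, 64-byte segments and a 200-byte maximum buffer, the sketch places four source segments starting at offset 128. Given a pointer to the start of the object, the source mbuf is then found by adding `src_buf_offset` to it, which is the pointer arithmetic the populate functions in the diff rely on.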
  

Comments

Akhil Goyal Aug. 30, 2017, 8:30 a.m. UTC | #1
Hi Pablo,
On 8/18/2017 1:35 PM, Pablo de Lara wrote:
> In order to improve memory utilization, a single mempool
> is created, containing the crypto operation and mbufs
> (one if operation is in-place, two if out-of-place).
> This way, a single object is allocated and freed
> per operation, reducing the amount of memory in cache,
> which improves scalability.
> 
> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
> ---
>   app/test-crypto-perf/cperf_ops.c             |  96 ++++++--
>   app/test-crypto-perf/cperf_ops.h             |   2 +-
>   app/test-crypto-perf/cperf_test_latency.c    | 350 ++++++++++++--------------
>   app/test-crypto-perf/cperf_test_throughput.c | 347 ++++++++++++--------------
>   app/test-crypto-perf/cperf_test_verify.c     | 356 ++++++++++++---------------
>   5 files changed, 553 insertions(+), 598 deletions(-)
> 
NACK.
This patch replaces rte_pktmbuf_pool_create with rte_mempool_create
for mbufs, which is not the preferred way to allocate memory for pktmbufs.

No example/test application in DPDK should do this, as this
kind of usage is not compatible with all DPDK drivers in general.

This use of rte_mempool_create will not work for any device
that uses hardware-offloaded memory pools for pktmbufs.
One such example is dpaa2.

-Akhil
  
De Lara Guarch, Pablo Sept. 11, 2017, 11:08 a.m. UTC | #2

Hi Akhil, 

Sorry for the delay on this reply and thanks for the review.

I think that, since we are not getting the buffers from the NIC but are allocating
them ourselves, it is not strictly required to call rte_pktmbuf_pool_create.
In the end, we only need them as memory for the crypto PMDs and we are not touching
anything in them, so I think calling rte_mempool_create should work OK.
Having a single mempool would perform much better and would avoid the scalability
issues that we are having in this application now. Given that this application
was created to test crypto PMD performance, I think it is worth trying this out.

What exactly is needed for dpaa2? Is it the mempool handler?
Would it work for you if I created the mempool in a similar way to what
rte_pktmbuf_pool_create does, i.e. by calling rte_mempool_set_ops_byname?

Thanks!
Pablo


  
Shreyansh Jain Sept. 11, 2017, 1:10 p.m. UTC | #3
Hello Pablo,

I have a comment inline:

On Monday 11 September 2017 04:38 PM, De Lara Guarch, Pablo wrote:
> Hi Akhil,
> 
> Sorry for the delay on this reply and thanks for the review.
> 
> I think that, since we are not getting the buffers from the NIC but are allocating
> them ourselves, it is not strictly required to call rte_pktmbuf_pool_create.
> In the end, we only need them as memory for the crypto PMDs and we are not touching
> anything in them, so I think calling rte_mempool_create should work OK.
> Having a single mempool would perform much better and would avoid the scalability
> issues that we are having in this application now. Given that this application
> was created to test crypto PMD performance, I think it is worth trying this out.
> 
> What exactly is needed for dpaa2? Is it the mempool handler?

If I recall correctly:
This is the call flow when rte_pktmbuf_pool_create is called:
  - rte_pktmbuf_pool_create
    `-> rte_mempool_create_empty
        `-> allocate and fill mempool object with defaults
    `-> rte_mempool_set_ops_byname
        `-> sets mempool handler to RTE_MBUF_DEFAULT_MEMPOOL_OPS
    `-> rte_mempool_populate_default
        `-> calls pool handler specific enqueue/dequeue

but that of rte_mempool_create is:
  - rte_mempool_create
    `-> rte_mempool_create_empty
        `-> allocate and fill mempool object with defaults
    `-> rte_mempool_set_ops_byname
        `-> set to one of ring_*_*
            No check/logic for configuration defined handler
            like RTE_MBUF_DEFAULT_MEMPOOL_OPS
    `-> rte_mempool_populate_default
        `-> calls ring* handler specific enqueue/dequeue

Calling rte_mempool_create bypasses the check for any mempool handler 
configured through the build system.

> Would it work for you if I created the mempool in a similar way to what
> rte_pktmbuf_pool_create does, i.e. by calling rte_mempool_set_ops_byname?

Yes, but that would mean using the combination of
rte_mempool_create_empty and rte_mempool_set_ops_byname, which would
effectively be equivalent to using rte_pktmbuf_pool_create.

Calling rte_mempool_set_ops_byname on a mempool created by rte_mempool_create
would mean changing the enqueue/dequeue operations *after* the mempool
has been populated. That would be incorrect.

I am not sure what the intent is, i.e. whether these buffers should be
allowed to be offloaded to hardware. If yes, then rte_mempool_create
wouldn't help.
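
The ordering constraint behind the two call flows can be illustrated with a toy model in plain C. Nothing below is DPDK code: `toy_pool`, `toy_create_empty()`, `toy_set_ops()`, `toy_populate()` and the handler names are hypothetical stand-ins, used only to show why the handler has to be chosen between creating the empty pool and populating it. The `-EEXIST` return value is an assumption modelled on how DPDK's `rte_mempool_set_ops_byname()` is documented to reject a pool that already has its objects in place.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a mempool; not the DPDK structure. */
struct toy_pool {
	const char *ops_name; /* handler that will own the objects */
	int populated;        /* set once objects were handed to the handler */
};

/* Step 1: create an empty pool with a default (ring-like) handler. */
static void toy_create_empty(struct toy_pool *p)
{
	p->ops_name = "ring_default";
	p->populated = 0;
}

/* Step 2: choose the handler; only legal while the pool is still empty. */
static int toy_set_ops(struct toy_pool *p, const char *name)
{
	if (p->populated)
		return -EEXIST; /* too late: objects are already enqueued */
	p->ops_name = name;
	return 0;
}

/* Step 3: populate; objects are enqueued through the current handler. */
static void toy_populate(struct toy_pool *p)
{
	p->populated = 1;
}

/* Helper: does the pool use the named handler? */
static int toy_uses(const struct toy_pool *p, const char *name)
{
	return strcmp(p->ops_name, name) == 0;
}
```

In this model, a pool that goes through create, set-ops, populate ends up on the requested handler, while a pool that is populated first stays on the default ring handler and the late `toy_set_ops()` call fails. That is the situation a one-shot create function produces when it picks the handler internally.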

  
De Lara Guarch, Pablo Sept. 11, 2017, 1:56 p.m. UTC | #4

OK, got it. I think I will go for the option of imitating what rte_pktmbuf_pool_create does,
while adding the flexibility of holding a crypto operation together with the mbuf, instead of just the mbuf.

Thanks for the input.
Pablo
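
Because each pool object would then hold the crypto operation together with its mbufs, the per-object initializer has to lay out and chain the segments itself. The sketch below shows one way to do that chaining in plain C. It is a toy under stated assumptions: `toy_seg` and `chain_segments()` are hypothetical, the header carries only three fields, and real code would fill in `struct rte_mbuf` instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy segment header; stands in for an mbuf header in this sketch. */
struct toy_seg {
	struct toy_seg *next;
	uint16_t data_len;
	uint16_t nb_segs;
};

/*
 * Chain segments_nb segments laid out back to back starting at base.
 * Each segment is a header followed by segment_sz bytes of data.
 * base must be suitably aligned (for example, memory from malloc).
 * Returns the first segment; the last segment's next pointer is NULL.
 */
static struct toy_seg *
chain_segments(uint8_t *base, uint16_t segment_sz, uint16_t segments_nb)
{
	size_t stride = sizeof(struct toy_seg) + segment_sz;
	uint16_t i;

	assert(segments_nb > 0);
	/* Keep every header pointer-aligned. */
	assert(segment_sz % sizeof(void *) == 0);

	for (i = 0; i < segments_nb; i++) {
		struct toy_seg *s =
			(struct toy_seg *)(base + (size_t)i * stride);

		s->data_len = segment_sz;
		s->nb_segs = segments_nb;
		/* Only link forward when another segment actually follows. */
		s->next = (i + 1 < segments_nb) ?
			(struct toy_seg *)(base + (size_t)(i + 1) * stride) :
			NULL;
	}
	return (struct toy_seg *)base;
}
```

The design point is that the last segment is terminated inside the loop, so no header is ever written past the memory reserved for the chain.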

  

Patch

diff --git a/app/test-crypto-perf/cperf_ops.c b/app/test-crypto-perf/cperf_ops.c
index ad32065..f76dbdd 100644
--- a/app/test-crypto-perf/cperf_ops.c
+++ b/app/test-crypto-perf/cperf_ops.c
@@ -37,7 +37,7 @@ 
 
 static int
 cperf_set_ops_null_cipher(struct rte_crypto_op **ops,
-		struct rte_mbuf **bufs_in, struct rte_mbuf **bufs_out,
+		uint32_t src_buf_offset, uint32_t dst_buf_offset,
 		uint16_t nb_ops, struct rte_cryptodev_sym_session *sess,
 		const struct cperf_options *options,
 		const struct cperf_test_vector *test_vector __rte_unused,
@@ -48,10 +48,18 @@  cperf_set_ops_null_cipher(struct rte_crypto_op **ops,
 	for (i = 0; i < nb_ops; i++) {
 		struct rte_crypto_sym_op *sym_op = ops[i]->sym;
 
+		ops[i]->status = RTE_CRYPTO_OP_STATUS_NOT_PROCESSED;
 		rte_crypto_op_attach_sym_session(ops[i], sess);
 
-		sym_op->m_src = bufs_in[i];
-		sym_op->m_dst = bufs_out[i];
+		sym_op->m_src = (struct rte_mbuf *)((uint8_t *)ops[i] +
+							src_buf_offset);
+
+		/* Set dest mbuf to NULL if in-place (dst_buf_offset = 0) */
+		if (dst_buf_offset == 0)
+			sym_op->m_dst = NULL;
+		else
+			sym_op->m_dst = (struct rte_mbuf *)((uint8_t *)ops[i] +
+							dst_buf_offset);
 
 		/* cipher parameters */
 		sym_op->cipher.data.length = options->test_buffer_size;
@@ -63,7 +71,7 @@  cperf_set_ops_null_cipher(struct rte_crypto_op **ops,
 
 static int
 cperf_set_ops_null_auth(struct rte_crypto_op **ops,
-		struct rte_mbuf **bufs_in, struct rte_mbuf **bufs_out,
+		uint32_t src_buf_offset, uint32_t dst_buf_offset,
 		uint16_t nb_ops, struct rte_cryptodev_sym_session *sess,
 		const struct cperf_options *options,
 		const struct cperf_test_vector *test_vector __rte_unused,
@@ -74,10 +82,18 @@  cperf_set_ops_null_auth(struct rte_crypto_op **ops,
 	for (i = 0; i < nb_ops; i++) {
 		struct rte_crypto_sym_op *sym_op = ops[i]->sym;
 
+		ops[i]->status = RTE_CRYPTO_OP_STATUS_NOT_PROCESSED;
 		rte_crypto_op_attach_sym_session(ops[i], sess);
 
-		sym_op->m_src = bufs_in[i];
-		sym_op->m_dst = bufs_out[i];
+		sym_op->m_src = (struct rte_mbuf *)((uint8_t *)ops[i] +
+							src_buf_offset);
+
+		/* Set dest mbuf to NULL if in-place (dst_buf_offset = 0) */
+		if (dst_buf_offset == 0)
+			sym_op->m_dst = NULL;
+		else
+			sym_op->m_dst = (struct rte_mbuf *)((uint8_t *)ops[i] +
+							dst_buf_offset);
 
 		/* auth parameters */
 		sym_op->auth.data.length = options->test_buffer_size;
@@ -89,7 +105,7 @@  cperf_set_ops_null_auth(struct rte_crypto_op **ops,
 
 static int
 cperf_set_ops_cipher(struct rte_crypto_op **ops,
-		struct rte_mbuf **bufs_in, struct rte_mbuf **bufs_out,
+		uint32_t src_buf_offset, uint32_t dst_buf_offset,
 		uint16_t nb_ops, struct rte_cryptodev_sym_session *sess,
 		const struct cperf_options *options,
 		const struct cperf_test_vector *test_vector,
@@ -100,10 +116,18 @@  cperf_set_ops_cipher(struct rte_crypto_op **ops,
 	for (i = 0; i < nb_ops; i++) {
 		struct rte_crypto_sym_op *sym_op = ops[i]->sym;
 
+		ops[i]->status = RTE_CRYPTO_OP_STATUS_NOT_PROCESSED;
 		rte_crypto_op_attach_sym_session(ops[i], sess);
 
-		sym_op->m_src = bufs_in[i];
-		sym_op->m_dst = bufs_out[i];
+		sym_op->m_src = (struct rte_mbuf *)((uint8_t *)ops[i] +
+							src_buf_offset);
+
+		/* Set dest mbuf to NULL if in-place (dst_buf_offset = 0) */
+		if (dst_buf_offset == 0)
+			sym_op->m_dst = NULL;
+		else
+			sym_op->m_dst = (struct rte_mbuf *)((uint8_t *)ops[i] +
+							dst_buf_offset);
 
 		/* cipher parameters */
 		if (options->cipher_algo == RTE_CRYPTO_CIPHER_SNOW3G_UEA2 ||
@@ -132,7 +156,7 @@  cperf_set_ops_cipher(struct rte_crypto_op **ops,
 
 static int
 cperf_set_ops_auth(struct rte_crypto_op **ops,
-		struct rte_mbuf **bufs_in, struct rte_mbuf **bufs_out,
+		uint32_t src_buf_offset, uint32_t dst_buf_offset,
 		uint16_t nb_ops, struct rte_cryptodev_sym_session *sess,
 		const struct cperf_options *options,
 		const struct cperf_test_vector *test_vector,
@@ -143,10 +167,18 @@  cperf_set_ops_auth(struct rte_crypto_op **ops,
 	for (i = 0; i < nb_ops; i++) {
 		struct rte_crypto_sym_op *sym_op = ops[i]->sym;
 
+		ops[i]->status = RTE_CRYPTO_OP_STATUS_NOT_PROCESSED;
 		rte_crypto_op_attach_sym_session(ops[i], sess);
 
-		sym_op->m_src = bufs_in[i];
-		sym_op->m_dst = bufs_out[i];
+		sym_op->m_src = (struct rte_mbuf *)((uint8_t *)ops[i] +
+							src_buf_offset);
+
+		/* Set dest mbuf to NULL if in-place (dst_buf_offset = 0) */
+		if (dst_buf_offset == 0)
+			sym_op->m_dst = NULL;
+		else
+			sym_op->m_dst = (struct rte_mbuf *)((uint8_t *)ops[i] +
+							dst_buf_offset);
 
 		if (test_vector->auth_iv.length) {
 			uint8_t *iv_ptr = rte_crypto_op_ctod_offset(ops[i],
@@ -167,9 +199,9 @@  cperf_set_ops_auth(struct rte_crypto_op **ops,
 			struct rte_mbuf *buf, *tbuf;
 
 			if (options->out_of_place) {
-				buf =  bufs_out[i];
+				buf = sym_op->m_dst;
 			} else {
-				tbuf =  bufs_in[i];
+				tbuf = sym_op->m_src;
 				while ((tbuf->next != NULL) &&
 						(offset >= tbuf->data_len)) {
 					offset -= tbuf->data_len;
@@ -219,7 +251,7 @@  cperf_set_ops_auth(struct rte_crypto_op **ops,
 
 static int
 cperf_set_ops_cipher_auth(struct rte_crypto_op **ops,
-		struct rte_mbuf **bufs_in, struct rte_mbuf **bufs_out,
+		uint32_t src_buf_offset, uint32_t dst_buf_offset,
 		uint16_t nb_ops, struct rte_cryptodev_sym_session *sess,
 		const struct cperf_options *options,
 		const struct cperf_test_vector *test_vector,
@@ -230,10 +262,18 @@  cperf_set_ops_cipher_auth(struct rte_crypto_op **ops,
 	for (i = 0; i < nb_ops; i++) {
 		struct rte_crypto_sym_op *sym_op = ops[i]->sym;
 
+		ops[i]->status = RTE_CRYPTO_OP_STATUS_NOT_PROCESSED;
 		rte_crypto_op_attach_sym_session(ops[i], sess);
 
-		sym_op->m_src = bufs_in[i];
-		sym_op->m_dst = bufs_out[i];
+		sym_op->m_src = (struct rte_mbuf *)((uint8_t *)ops[i] +
+							src_buf_offset);
+
+		/* Set dest mbuf to NULL if in-place (dst_buf_offset = 0) */
+		if (dst_buf_offset == 0)
+			sym_op->m_dst = NULL;
+		else
+			sym_op->m_dst = (struct rte_mbuf *)((uint8_t *)ops[i] +
+							dst_buf_offset);
 
 		/* cipher parameters */
 		if (options->cipher_algo == RTE_CRYPTO_CIPHER_SNOW3G_UEA2 ||
@@ -256,9 +296,9 @@  cperf_set_ops_cipher_auth(struct rte_crypto_op **ops,
 			struct rte_mbuf *buf, *tbuf;
 
 			if (options->out_of_place) {
-				buf =  bufs_out[i];
+				buf = sym_op->m_dst;
 			} else {
-				tbuf =  bufs_in[i];
+				tbuf = sym_op->m_src;
 				while ((tbuf->next != NULL) &&
 						(offset >= tbuf->data_len)) {
 					offset -= tbuf->data_len;
@@ -316,7 +356,7 @@  cperf_set_ops_cipher_auth(struct rte_crypto_op **ops,
 
 static int
 cperf_set_ops_aead(struct rte_crypto_op **ops,
-		struct rte_mbuf **bufs_in, struct rte_mbuf **bufs_out,
+		uint32_t src_buf_offset, uint32_t dst_buf_offset,
 		uint16_t nb_ops, struct rte_cryptodev_sym_session *sess,
 		const struct cperf_options *options,
 		const struct cperf_test_vector *test_vector,
@@ -329,10 +369,18 @@  cperf_set_ops_aead(struct rte_crypto_op **ops,
 	for (i = 0; i < nb_ops; i++) {
 		struct rte_crypto_sym_op *sym_op = ops[i]->sym;
 
+		ops[i]->status = RTE_CRYPTO_OP_STATUS_NOT_PROCESSED;
 		rte_crypto_op_attach_sym_session(ops[i], sess);
 
-		sym_op->m_src = bufs_in[i];
-		sym_op->m_dst = bufs_out[i];
+		sym_op->m_src = (struct rte_mbuf *)((uint8_t *)ops[i] +
+							src_buf_offset);
+
+		/* Set dest mbuf to NULL if in-place (dst_buf_offset = 0) */
+		if (dst_buf_offset == 0)
+			sym_op->m_dst = NULL;
+		else
+			sym_op->m_dst = (struct rte_mbuf *)((uint8_t *)ops[i] +
+							dst_buf_offset);
 
 		/* AEAD parameters */
 		sym_op->aead.data.length = options->test_buffer_size;
@@ -354,9 +402,9 @@  cperf_set_ops_aead(struct rte_crypto_op **ops,
 			struct rte_mbuf *buf, *tbuf;
 
 			if (options->out_of_place) {
-				buf =  bufs_out[i];
+				buf = sym_op->m_dst;
 			} else {
-				tbuf =  bufs_in[i];
+				tbuf = sym_op->m_src;
 				while ((tbuf->next != NULL) &&
 						(offset >= tbuf->data_len)) {
 					offset -= tbuf->data_len;
diff --git a/app/test-crypto-perf/cperf_ops.h b/app/test-crypto-perf/cperf_ops.h
index 1f8fa93..94951cc 100644
--- a/app/test-crypto-perf/cperf_ops.h
+++ b/app/test-crypto-perf/cperf_ops.h
@@ -47,7 +47,7 @@  typedef struct rte_cryptodev_sym_session *(*cperf_sessions_create_t)(
 		uint16_t iv_offset);
 
 typedef int (*cperf_populate_ops_t)(struct rte_crypto_op **ops,
-		struct rte_mbuf **bufs_in, struct rte_mbuf **bufs_out,
+		uint32_t src_buf_offset, uint32_t dst_buf_offset,
 		uint16_t nb_ops, struct rte_cryptodev_sym_session *sess,
 		const struct cperf_options *options,
 		const struct cperf_test_vector *test_vector,
diff --git a/app/test-crypto-perf/cperf_test_latency.c b/app/test-crypto-perf/cperf_test_latency.c
index 997844a..2415d77 100644
--- a/app/test-crypto-perf/cperf_test_latency.c
+++ b/app/test-crypto-perf/cperf_test_latency.c
@@ -50,17 +50,15 @@  struct cperf_latency_ctx {
 	uint16_t qp_id;
 	uint8_t lcore_id;
 
-	struct rte_mempool *pkt_mbuf_pool_in;
-	struct rte_mempool *pkt_mbuf_pool_out;
-	struct rte_mbuf **mbufs_in;
-	struct rte_mbuf **mbufs_out;
-
-	struct rte_mempool *crypto_op_pool;
+	struct rte_mempool *pool;
 
 	struct rte_cryptodev_sym_session *sess;
 
 	cperf_populate_ops_t populate_ops;
 
+	uint32_t src_buf_offset;
+	uint32_t dst_buf_offset;
+
 	const struct cperf_options *options;
 	const struct cperf_test_vector *test_vector;
 	struct cperf_op_result *res;
@@ -74,116 +72,128 @@  struct priv_op_data {
 #define min(a, b) (a < b ? (uint64_t)a : (uint64_t)b)
 
 static void
-cperf_latency_test_free(struct cperf_latency_ctx *ctx, uint32_t mbuf_nb)
+cperf_latency_test_free(struct cperf_latency_ctx *ctx)
 {
-	uint32_t i;
-
 	if (ctx) {
 		if (ctx->sess) {
 			rte_cryptodev_sym_session_clear(ctx->dev_id, ctx->sess);
 			rte_cryptodev_sym_session_free(ctx->sess);
 		}
 
-		if (ctx->mbufs_in) {
-			for (i = 0; i < mbuf_nb; i++)
-				rte_pktmbuf_free(ctx->mbufs_in[i]);
-
-			rte_free(ctx->mbufs_in);
-		}
-
-		if (ctx->mbufs_out) {
-			for (i = 0; i < mbuf_nb; i++) {
-				if (ctx->mbufs_out[i] != NULL)
-					rte_pktmbuf_free(ctx->mbufs_out[i]);
-			}
-
-			rte_free(ctx->mbufs_out);
-		}
-
-		if (ctx->pkt_mbuf_pool_in)
-			rte_mempool_free(ctx->pkt_mbuf_pool_in);
-
-		if (ctx->pkt_mbuf_pool_out)
-			rte_mempool_free(ctx->pkt_mbuf_pool_out);
-
-		if (ctx->crypto_op_pool)
-			rte_mempool_free(ctx->crypto_op_pool);
+		if (ctx->pool)
+			rte_mempool_free(ctx->pool);
 
 		rte_free(ctx->res);
 		rte_free(ctx);
 	}
 }
 
-static struct rte_mbuf *
-cperf_mbuf_create(struct rte_mempool *mempool,
-		uint32_t segment_sz,
-		uint32_t segment_nb,
-		const struct cperf_options *options)
-{
-	struct rte_mbuf *mbuf;
-	uint8_t *mbuf_data;
-	uint32_t remaining_bytes = options->max_buffer_size;
+struct obj_params {
+	uint32_t src_buf_offset;
+	uint32_t dst_buf_offset;
+	uint16_t segment_sz;
+	uint16_t segments_nb;
+};
 
-	mbuf = rte_pktmbuf_alloc(mempool);
-	if (mbuf == NULL)
-		goto error;
+static void
+fill_single_seg_mbuf(struct rte_mbuf *m, struct rte_mempool *mp,
+		void *obj, uint32_t mbuf_offset, uint16_t segment_sz)
+{
+	uint32_t mbuf_hdr_size = sizeof(struct rte_mbuf);
+
+	/* start of buffer is after mbuf structure and priv data */
+	m->priv_size = 0;
+	m->buf_addr = (char *)m + mbuf_hdr_size;
+	m->buf_physaddr = rte_mempool_virt2phy(mp, obj) +
+		mbuf_offset + mbuf_hdr_size;
+	m->buf_len = segment_sz;
+	m->data_len = segment_sz;
+
+	/* No headroom needed for the buffer */
+	m->data_off = 0;
+
+	/* init some constant fields */
+	m->pool = mp;
+	m->nb_segs = 1;
+	m->port = 0xff;
+	rte_mbuf_refcnt_set(m, 1);
+	m->next = NULL;
+}
 
-	mbuf_data = (uint8_t *)rte_pktmbuf_append(mbuf, segment_sz);
-	if (mbuf_data == NULL)
-		goto error;
+static void
+fill_multi_seg_mbuf(struct rte_mbuf *m, struct rte_mempool *mp,
+		void *obj, uint32_t mbuf_offset, uint16_t segment_sz,
+		uint16_t segments_nb)
+{
+	uint16_t mbuf_hdr_size = sizeof(struct rte_mbuf);
+	uint16_t remaining_segments = segments_nb;
+	struct rte_mbuf *next_mbuf;
+	phys_addr_t next_seg_phys_addr = rte_mempool_virt2phy(mp, obj) +
+			 mbuf_offset + mbuf_hdr_size;
+
+	do {
+		/* start of buffer is after mbuf structure and priv data */
+		m->priv_size = 0;
+		m->buf_addr = (char *)m + mbuf_hdr_size;
+		m->buf_physaddr = next_seg_phys_addr;
+		next_seg_phys_addr = (phys_addr_t)((uint8_t *)next_seg_phys_addr +
+				mbuf_hdr_size + segment_sz);
+		m->buf_len = segment_sz;
+		m->data_len = segment_sz;
+
+		/* No headroom needed for the buffer */
+		m->data_off = 0;
+
+		/* init some constant fields */
+		m->pool = mp;
+		m->nb_segs = segments_nb;
+		m->port = 0xff;
+		rte_mbuf_refcnt_set(m, 1);
+		next_mbuf = (struct rte_mbuf *) ((uint8_t *) m +
+					mbuf_hdr_size + segment_sz);
+		m->next = next_mbuf;
+		m = next_mbuf;
+		remaining_segments--;
+
+	} while (remaining_segments > 0);
+
+	m->next = NULL;
+}
 
-	if (options->max_buffer_size <= segment_sz)
-		remaining_bytes = 0;
+static void
+mempool_obj_init(struct rte_mempool *mp,
+		 void *opaque_arg,
+		 void *obj,
+		 __attribute__((unused)) unsigned int i)
+{
+	struct obj_params *params = opaque_arg;
+	struct rte_crypto_op *op = obj;
+	struct rte_mbuf *m = (struct rte_mbuf *) ((uint8_t *) obj +
+					params->src_buf_offset);
+	/* Set crypto operation */
+	op->type = RTE_CRYPTO_OP_TYPE_SYMMETRIC;
+	op->status = RTE_CRYPTO_OP_STATUS_NOT_PROCESSED;
+	op->sess_type = RTE_CRYPTO_OP_WITH_SESSION;
+
+	/* Set source buffer */
+	op->sym->m_src = m;
+	if (params->segments_nb == 1)
+		fill_single_seg_mbuf(m, mp, obj, params->src_buf_offset,
+				params->segment_sz);
 	else
-		remaining_bytes -= segment_sz;
-
-	segment_nb--;
-
-	while (remaining_bytes) {
-		struct rte_mbuf *m;
-
-		m = rte_pktmbuf_alloc(mempool);
-		if (m == NULL)
-			goto error;
-
-		rte_pktmbuf_chain(mbuf, m);
-
-		mbuf_data = (uint8_t *)rte_pktmbuf_append(mbuf, segment_sz);
-		if (mbuf_data == NULL)
-			goto error;
-
-		if (remaining_bytes <= segment_sz)
-			remaining_bytes = 0;
-		else
-			remaining_bytes -= segment_sz;
-
-		segment_nb--;
-	}
-
-	/*
-	 * If there was not enough room for the digest at the end
-	 * of the last segment, allocate a new one
-	 */
-	if (segment_nb != 0) {
-		struct rte_mbuf *m;
-
-		m = rte_pktmbuf_alloc(mempool);
-
-		if (m == NULL)
-			goto error;
-
-		rte_pktmbuf_chain(mbuf, m);
-		mbuf_data = (uint8_t *)rte_pktmbuf_append(mbuf,	segment_sz);
-		if (mbuf_data == NULL)
-			goto error;
-	}
-
-	return mbuf;
-error:
-	if (mbuf != NULL)
-		rte_pktmbuf_free(mbuf);
-
-	return NULL;
+		fill_multi_seg_mbuf(m, mp, obj, params->src_buf_offset,
+				params->segment_sz, params->segments_nb);
+
+
+	/* Set destination buffer */
+	if (params->dst_buf_offset) {
+		m = (struct rte_mbuf *) ((uint8_t *) obj +
+				params->dst_buf_offset);
+		fill_single_seg_mbuf(m, mp, obj, params->dst_buf_offset,
+				params->segment_sz);
+		op->sym->m_dst = m;
+	} else
+		op->sym->m_dst = NULL;
 }
 
 void *
@@ -194,7 +204,6 @@  cperf_latency_test_constructor(struct rte_mempool *sess_mp,
 		const struct cperf_op_fns *op_fns)
 {
 	struct cperf_latency_ctx *ctx = NULL;
-	unsigned int mbuf_idx = 0;
 	char pool_name[32] = "";
 
 	ctx = rte_malloc(NULL, sizeof(struct cperf_latency_ctx), 0);
@@ -218,83 +227,52 @@  cperf_latency_test_constructor(struct rte_mempool *sess_mp,
 	if (ctx->sess == NULL)
 		goto err;
 
-	snprintf(pool_name, sizeof(pool_name), "cperf_pool_in_cdev_%d",
-				dev_id);
-
+	/* Calculate the object size */
+	uint16_t crypto_op_size = sizeof(struct rte_crypto_op) +
+		sizeof(struct rte_crypto_sym_op);
+	uint16_t crypto_op_private_size = sizeof(struct priv_op_data) +
+				test_vector->cipher_iv.length +
+				test_vector->auth_iv.length +
+				options->aead_aad_sz;
+	uint16_t crypto_op_total_size = crypto_op_size +
+				crypto_op_private_size;
+	uint16_t crypto_op_total_size_padded =
+				RTE_CACHE_LINE_ROUNDUP(crypto_op_total_size);
+	uint32_t mbuf_size = sizeof(struct rte_mbuf) + options->segment_sz;
 	uint32_t max_size = options->max_buffer_size + options->digest_sz;
-	uint32_t segment_nb = (max_size % options->segment_sz) ?
+	uint16_t segments_nb = (max_size % options->segment_sz) ?
 			(max_size / options->segment_sz) + 1 :
 			max_size / options->segment_sz;
+	uint32_t obj_size = crypto_op_total_size_padded +
+				(mbuf_size * segments_nb);
 
-	ctx->pkt_mbuf_pool_in = rte_pktmbuf_pool_create(pool_name,
-			options->pool_sz * segment_nb, 0, 0,
-			RTE_PKTMBUF_HEADROOM + options->segment_sz,
-			rte_socket_id());
-
-	if (ctx->pkt_mbuf_pool_in == NULL)
-		goto err;
-
-	/* Generate mbufs_in with plaintext populated for test */
-	ctx->mbufs_in = rte_malloc(NULL,
-			(sizeof(struct rte_mbuf *) *
-			ctx->options->pool_sz), 0);
-
-	for (mbuf_idx = 0; mbuf_idx < options->pool_sz; mbuf_idx++) {
-		ctx->mbufs_in[mbuf_idx] = cperf_mbuf_create(
-				ctx->pkt_mbuf_pool_in,
-				options->segment_sz,
-				segment_nb,
-				options);
-		if (ctx->mbufs_in[mbuf_idx] == NULL)
-			goto err;
-	}
-
-	if (options->out_of_place == 1)	{
-
-		snprintf(pool_name, sizeof(pool_name),
-				"cperf_pool_out_cdev_%d",
-				dev_id);
-
-		ctx->pkt_mbuf_pool_out = rte_pktmbuf_pool_create(
-				pool_name, options->pool_sz, 0, 0,
-				RTE_PKTMBUF_HEADROOM +
-				max_size,
-				rte_socket_id());
-
-		if (ctx->pkt_mbuf_pool_out == NULL)
-			goto err;
-	}
-
-	ctx->mbufs_out = rte_malloc(NULL,
-			(sizeof(struct rte_mbuf *) *
-			ctx->options->pool_sz), 0);
-
-	for (mbuf_idx = 0; mbuf_idx < options->pool_sz; mbuf_idx++) {
-		if (options->out_of_place == 1)	{
-			ctx->mbufs_out[mbuf_idx] = cperf_mbuf_create(
-					ctx->pkt_mbuf_pool_out, max_size,
-					1, options);
-			if (ctx->mbufs_out[mbuf_idx] == NULL)
-				goto err;
-		} else {
-			ctx->mbufs_out[mbuf_idx] = NULL;
-		}
-	}
-
-	snprintf(pool_name, sizeof(pool_name), "cperf_op_pool_cdev_%d",
+	snprintf(pool_name, sizeof(pool_name), "pool_in_cdev_%d",
 			dev_id);
 
-	uint16_t priv_size = RTE_ALIGN_CEIL(sizeof(struct priv_op_data) +
-			test_vector->cipher_iv.length +
-			test_vector->auth_iv.length +
-			test_vector->aead_iv.length, 16) +
-			RTE_ALIGN_CEIL(options->aead_aad_sz, 16);
+	ctx->src_buf_offset = crypto_op_total_size_padded;
+
+	struct obj_params params = {
+		.segment_sz = options->segment_sz,
+		.segments_nb = segments_nb,
+		.src_buf_offset = crypto_op_total_size_padded,
+		.dst_buf_offset = 0
+	};
+
+	if (options->out_of_place) {
+		ctx->dst_buf_offset = ctx->src_buf_offset +
+				(mbuf_size * segments_nb);
+		params.dst_buf_offset = ctx->dst_buf_offset;
+		/* Destination buffer will be one segment only */
+		obj_size += max_size;
+	}
 
-	ctx->crypto_op_pool = rte_crypto_op_pool_create(pool_name,
-			RTE_CRYPTO_OP_TYPE_SYMMETRIC, options->pool_sz,
-			512, priv_size, rte_socket_id());
+	ctx->pool = rte_mempool_create(pool_name,
+			options->pool_sz, obj_size, 512, 0,
+			NULL, NULL, mempool_obj_init,
+			(void *)&params,
+			rte_socket_id(), 0);
 
-	if (ctx->crypto_op_pool == NULL)
+	if (ctx->pool == NULL)
 		goto err;
 
 	ctx->res = rte_malloc(NULL, sizeof(struct cperf_op_result) *
@@ -305,7 +283,7 @@  cperf_latency_test_constructor(struct rte_mempool *sess_mp,
 
 	return ctx;
 err:
-	cperf_latency_test_free(ctx, mbuf_idx);
+	cperf_latency_test_free(ctx);
 
 	return NULL;
 }
@@ -370,7 +348,7 @@  cperf_latency_test_runner(void *arg)
 
 	while (test_burst_size <= ctx->options->max_burst_size) {
 		uint64_t ops_enqd = 0, ops_deqd = 0;
-		uint64_t m_idx = 0, b_idx = 0;
+		uint64_t b_idx = 0;
 
 		uint64_t tsc_val, tsc_end, tsc_start;
 		uint64_t tsc_max = 0, tsc_min = ~0UL, tsc_tot = 0, tsc_idx = 0;
@@ -385,11 +363,9 @@  cperf_latency_test_runner(void *arg)
 							ctx->options->total_ops -
 							enqd_tot;
 
-			/* Allocate crypto ops from pool */
-			if (burst_size != rte_crypto_op_bulk_alloc(
-					ctx->crypto_op_pool,
-					RTE_CRYPTO_OP_TYPE_SYMMETRIC,
-					ops, burst_size)) {
+			/* Allocate objects containing crypto operations and mbufs */
+			if (rte_mempool_get_bulk(ctx->pool, (void **)ops,
+						burst_size) != 0) {
 				RTE_LOG(ERR, USER1,
 					"Failed to allocate more crypto operations "
 					"from the the crypto operation pool.\n"
@@ -399,8 +375,8 @@  cperf_latency_test_runner(void *arg)
 			}
 
 			/* Setup crypto op, attach mbuf etc */
-			(ctx->populate_ops)(ops, &ctx->mbufs_in[m_idx],
-					&ctx->mbufs_out[m_idx],
+			(ctx->populate_ops)(ops, ctx->src_buf_offset,
+					ctx->dst_buf_offset,
 					burst_size, ctx->sess, ctx->options,
 					ctx->test_vector, iv_offset);
 
@@ -429,7 +405,7 @@  cperf_latency_test_runner(void *arg)
 
 			/* Free memory for not enqueued operations */
 			if (ops_enqd != burst_size)
-				rte_mempool_put_bulk(ctx->crypto_op_pool,
+				rte_mempool_put_bulk(ctx->pool,
 						(void **)&ops[ops_enqd],
 						burst_size - ops_enqd);
 
@@ -445,16 +421,11 @@  cperf_latency_test_runner(void *arg)
 			}
 
 			if (likely(ops_deqd))  {
-				/*
-				 * free crypto ops so they can be reused. We don't free
-				 * the mbufs here as we don't want to reuse them as
-				 * the crypto operation will change the data and cause
-				 * failures.
-				 */
+				/* free crypto ops so they can be reused. */
 				for (i = 0; i < ops_deqd; i++)
 					store_timestamp(ops_processed[i], tsc_end);
 
-				rte_mempool_put_bulk(ctx->crypto_op_pool,
+				rte_mempool_put_bulk(ctx->pool,
 						(void **)ops_processed, ops_deqd);
 
 				deqd_tot += ops_deqd;
@@ -466,9 +437,6 @@  cperf_latency_test_runner(void *arg)
 			enqd_max = max(ops_enqd, enqd_max);
 			enqd_min = min(ops_enqd, enqd_min);
 
-			m_idx += ops_enqd;
-			m_idx = m_idx + test_burst_size > ctx->options->pool_sz ?
-					0 : m_idx;
 			b_idx++;
 		}
 
@@ -487,7 +455,7 @@  cperf_latency_test_runner(void *arg)
 				for (i = 0; i < ops_deqd; i++)
 					store_timestamp(ops_processed[i], tsc_end);
 
-				rte_mempool_put_bulk(ctx->crypto_op_pool,
+				rte_mempool_put_bulk(ctx->pool,
 						(void **)ops_processed, ops_deqd);
 
 				deqd_tot += ops_deqd;
@@ -585,5 +553,5 @@  cperf_latency_test_destructor(void *arg)
 
 	rte_cryptodev_stop(ctx->dev_id);
 
-	cperf_latency_test_free(ctx, ctx->options->pool_sz);
+	cperf_latency_test_free(ctx);
 }
diff --git a/app/test-crypto-perf/cperf_test_throughput.c b/app/test-crypto-perf/cperf_test_throughput.c
index 121ceb1..46889c4 100644
--- a/app/test-crypto-perf/cperf_test_throughput.c
+++ b/app/test-crypto-perf/cperf_test_throughput.c
@@ -43,131 +43,141 @@  struct cperf_throughput_ctx {
 	uint16_t qp_id;
 	uint8_t lcore_id;
 
-	struct rte_mempool *pkt_mbuf_pool_in;
-	struct rte_mempool *pkt_mbuf_pool_out;
-	struct rte_mbuf **mbufs_in;
-	struct rte_mbuf **mbufs_out;
-
-	struct rte_mempool *crypto_op_pool;
+	struct rte_mempool *pool;
 
 	struct rte_cryptodev_sym_session *sess;
 
 	cperf_populate_ops_t populate_ops;
 
+	uint32_t src_buf_offset;
+	uint32_t dst_buf_offset;
+
 	const struct cperf_options *options;
 	const struct cperf_test_vector *test_vector;
 };
 
 static void
-cperf_throughput_test_free(struct cperf_throughput_ctx *ctx, uint32_t mbuf_nb)
+cperf_throughput_test_free(struct cperf_throughput_ctx *ctx)
 {
-	uint32_t i;
-
 	if (ctx) {
 		if (ctx->sess) {
 			rte_cryptodev_sym_session_clear(ctx->dev_id, ctx->sess);
 			rte_cryptodev_sym_session_free(ctx->sess);
 		}
 
-		if (ctx->mbufs_in) {
-			for (i = 0; i < mbuf_nb; i++)
-				rte_pktmbuf_free(ctx->mbufs_in[i]);
-
-			rte_free(ctx->mbufs_in);
-		}
-
-		if (ctx->mbufs_out) {
-			for (i = 0; i < mbuf_nb; i++) {
-				if (ctx->mbufs_out[i] != NULL)
-					rte_pktmbuf_free(ctx->mbufs_out[i]);
-			}
-
-			rte_free(ctx->mbufs_out);
-		}
-
-		if (ctx->pkt_mbuf_pool_in)
-			rte_mempool_free(ctx->pkt_mbuf_pool_in);
-
-		if (ctx->pkt_mbuf_pool_out)
-			rte_mempool_free(ctx->pkt_mbuf_pool_out);
-
-		if (ctx->crypto_op_pool)
-			rte_mempool_free(ctx->crypto_op_pool);
+		if (ctx->pool)
+			rte_mempool_free(ctx->pool);
 
 		rte_free(ctx);
 	}
 }
 
-static struct rte_mbuf *
-cperf_mbuf_create(struct rte_mempool *mempool,
-		uint32_t segment_sz,
-		uint32_t segment_nb,
-		const struct cperf_options *options)
-{
-	struct rte_mbuf *mbuf;
-	uint8_t *mbuf_data;
-	uint32_t remaining_bytes = options->max_buffer_size;
+struct obj_params {
+	uint32_t src_buf_offset;
+	uint32_t dst_buf_offset;
+	uint16_t segment_sz;
+	uint16_t segments_nb;
+};
 
-	mbuf = rte_pktmbuf_alloc(mempool);
-	if (mbuf == NULL)
-		goto error;
+static void
+fill_single_seg_mbuf(struct rte_mbuf *m, struct rte_mempool *mp,
+		void *obj, uint32_t mbuf_offset, uint16_t segment_sz)
+{
+	uint32_t mbuf_hdr_size = sizeof(struct rte_mbuf);
+
+	/* start of buffer is after mbuf structure and priv data */
+	m->priv_size = 0;
+	m->buf_addr = (char *)m + mbuf_hdr_size;
+	m->buf_physaddr = rte_mempool_virt2phy(mp, obj) +
+		mbuf_offset + mbuf_hdr_size;
+	m->buf_len = segment_sz;
+	m->data_len = segment_sz;
+
+	/* No headroom needed for the buffer */
+	m->data_off = 0;
+
+	/* init some constant fields */
+	m->pool = mp;
+	m->nb_segs = 1;
+	m->port = 0xff;
+	rte_mbuf_refcnt_set(m, 1);
+	m->next = NULL;
+}
 
-	mbuf_data = (uint8_t *)rte_pktmbuf_append(mbuf, segment_sz);
-	if (mbuf_data == NULL)
-		goto error;
+static void
+fill_multi_seg_mbuf(struct rte_mbuf *m, struct rte_mempool *mp,
+		void *obj, uint32_t mbuf_offset, uint16_t segment_sz,
+		uint16_t segments_nb)
+{
+	uint16_t mbuf_hdr_size = sizeof(struct rte_mbuf);
+	uint16_t remaining_segments = segments_nb;
+	struct rte_mbuf *next_mbuf;
+	phys_addr_t next_seg_phys_addr = rte_mempool_virt2phy(mp, obj) +
+			 mbuf_offset + mbuf_hdr_size;
+
+	do {
+		/* start of buffer is after mbuf structure and priv data */
+		m->priv_size = 0;
+		m->buf_addr = (char *)m + mbuf_hdr_size;
+		m->buf_physaddr = next_seg_phys_addr;
+		/* Advance past this segment's mbuf header and data */
+		next_seg_phys_addr += mbuf_hdr_size + segment_sz;
+		m->buf_len = segment_sz;
+		m->data_len = segment_sz;
+
+		/* No headroom needed for the buffer */
+		m->data_off = 0;
+
+		/* init some constant fields */
+		m->pool = mp;
+		m->nb_segs = segments_nb;
+		m->port = 0xff;
+		rte_mbuf_refcnt_set(m, 1);
+		next_mbuf = (struct rte_mbuf *) ((uint8_t *) m +
+					mbuf_hdr_size + segment_sz);
+		remaining_segments--;
+
+		/* Terminate the chain at the last segment */
+		m->next = remaining_segments > 0 ? next_mbuf : NULL;
+		m = next_mbuf;
+
+	} while (remaining_segments > 0);
+}
 
-	if (options->max_buffer_size <= segment_sz)
-		remaining_bytes = 0;
+static void
+mempool_obj_init(struct rte_mempool *mp,
+		 void *opaque_arg,
+		 void *obj,
+		 __attribute__((unused)) unsigned int i)
+{
+	struct obj_params *params = opaque_arg;
+	struct rte_crypto_op *op = obj;
+	struct rte_mbuf *m = (struct rte_mbuf *) ((uint8_t *) obj +
+					params->src_buf_offset);
+	/* Set crypto operation */
+	op->type = RTE_CRYPTO_OP_TYPE_SYMMETRIC;
+	op->status = RTE_CRYPTO_OP_STATUS_NOT_PROCESSED;
+	op->sess_type = RTE_CRYPTO_OP_WITH_SESSION;
+
+	/* Set source buffer */
+	op->sym->m_src = m;
+	if (params->segments_nb == 1)
+		fill_single_seg_mbuf(m, mp, obj, params->src_buf_offset,
+				params->segment_sz);
 	else
-		remaining_bytes -= segment_sz;
-
-	segment_nb--;
-
-	while (remaining_bytes) {
-		struct rte_mbuf *m;
-
-		m = rte_pktmbuf_alloc(mempool);
-		if (m == NULL)
-			goto error;
-
-		rte_pktmbuf_chain(mbuf, m);
-
-		mbuf_data = (uint8_t *)rte_pktmbuf_append(mbuf, segment_sz);
-		if (mbuf_data == NULL)
-			goto error;
-
-		if (remaining_bytes <= segment_sz)
-			remaining_bytes = 0;
-		else
-			remaining_bytes -= segment_sz;
-
-		segment_nb--;
-	}
-
-	/*
-	 * If there was not enough room for the digest at the end
-	 * of the last segment, allocate a new one
-	 */
-	if (segment_nb != 0) {
-		struct rte_mbuf *m;
-
-		m = rte_pktmbuf_alloc(mempool);
-
-		if (m == NULL)
-			goto error;
-
-		rte_pktmbuf_chain(mbuf, m);
-		mbuf_data = (uint8_t *)rte_pktmbuf_append(mbuf,	segment_sz);
-		if (mbuf_data == NULL)
-			goto error;
-	}
-
-	return mbuf;
-error:
-	if (mbuf != NULL)
-		rte_pktmbuf_free(mbuf);
-
-	return NULL;
+		fill_multi_seg_mbuf(m, mp, obj, params->src_buf_offset,
+				params->segment_sz, params->segments_nb);
+
+
+	/* Set destination buffer */
+	if (params->dst_buf_offset) {
+		m = (struct rte_mbuf *) ((uint8_t *) obj +
+				params->dst_buf_offset);
+		fill_single_seg_mbuf(m, mp, obj, params->dst_buf_offset,
+				params->segment_sz);
+		op->sym->m_dst = m;
+	} else
+		op->sym->m_dst = NULL;
 }
 
 void *
@@ -178,7 +188,6 @@  cperf_throughput_test_constructor(struct rte_mempool *sess_mp,
 		const struct cperf_op_fns *op_fns)
 {
 	struct cperf_throughput_ctx *ctx = NULL;
-	unsigned int mbuf_idx = 0;
 	char pool_name[32] = "";
 
 	ctx = rte_malloc(NULL, sizeof(struct cperf_throughput_ctx), 0);
@@ -201,83 +210,56 @@  cperf_throughput_test_constructor(struct rte_mempool *sess_mp,
 	if (ctx->sess == NULL)
 		goto err;
 
-	snprintf(pool_name, sizeof(pool_name), "cperf_pool_in_cdev_%d",
-			dev_id);
-
+	/* Calculate the object size */
+	uint16_t crypto_op_size = sizeof(struct rte_crypto_op) +
+		sizeof(struct rte_crypto_sym_op);
+	uint16_t crypto_op_private_size = test_vector->cipher_iv.length +
+				test_vector->auth_iv.length +
+				options->aead_aad_sz;
+	uint16_t crypto_op_total_size = crypto_op_size +
+				crypto_op_private_size;
+	uint16_t crypto_op_total_size_padded =
+				RTE_CACHE_LINE_ROUNDUP(crypto_op_total_size);
+	uint32_t mbuf_size = sizeof(struct rte_mbuf) + options->segment_sz;
 	uint32_t max_size = options->max_buffer_size + options->digest_sz;
-	uint32_t segment_nb = (max_size % options->segment_sz) ?
+	uint16_t segments_nb = (max_size % options->segment_sz) ?
 			(max_size / options->segment_sz) + 1 :
 			max_size / options->segment_sz;
+	uint32_t obj_size = crypto_op_total_size_padded +
+				(mbuf_size * segments_nb);
 
-	ctx->pkt_mbuf_pool_in = rte_pktmbuf_pool_create(pool_name,
-			options->pool_sz * segment_nb, 0, 0,
-			RTE_PKTMBUF_HEADROOM + options->segment_sz,
-			rte_socket_id());
-
-	if (ctx->pkt_mbuf_pool_in == NULL)
-		goto err;
-
-	/* Generate mbufs_in with plaintext populated for test */
-	ctx->mbufs_in = rte_malloc(NULL,
-			(sizeof(struct rte_mbuf *) * ctx->options->pool_sz), 0);
-
-	for (mbuf_idx = 0; mbuf_idx < options->pool_sz; mbuf_idx++) {
-		ctx->mbufs_in[mbuf_idx] = cperf_mbuf_create(
-				ctx->pkt_mbuf_pool_in,
-				options->segment_sz,
-				segment_nb,
-				options);
-		if (ctx->mbufs_in[mbuf_idx] == NULL)
-			goto err;
-	}
-
-	if (options->out_of_place == 1)	{
-
-		snprintf(pool_name, sizeof(pool_name), "cperf_pool_out_cdev_%d",
-				dev_id);
-
-		ctx->pkt_mbuf_pool_out = rte_pktmbuf_pool_create(
-				pool_name, options->pool_sz, 0, 0,
-				RTE_PKTMBUF_HEADROOM +
-				max_size,
-				rte_socket_id());
-
-		if (ctx->pkt_mbuf_pool_out == NULL)
-			goto err;
-	}
+	snprintf(pool_name, sizeof(pool_name), "pool_in_cdev_%d",
+			dev_id);
 
-	ctx->mbufs_out = rte_malloc(NULL,
-			(sizeof(struct rte_mbuf *) *
-			ctx->options->pool_sz), 0);
-
-	for (mbuf_idx = 0; mbuf_idx < options->pool_sz; mbuf_idx++) {
-		if (options->out_of_place == 1)	{
-			ctx->mbufs_out[mbuf_idx] = cperf_mbuf_create(
-					ctx->pkt_mbuf_pool_out, max_size,
-					1, options);
-			if (ctx->mbufs_out[mbuf_idx] == NULL)
-				goto err;
-		} else {
-			ctx->mbufs_out[mbuf_idx] = NULL;
-		}
+	ctx->src_buf_offset = crypto_op_total_size_padded;
+
+	struct obj_params params = {
+		.segment_sz = options->segment_sz,
+		.segments_nb = segments_nb,
+		.src_buf_offset = crypto_op_total_size_padded,
+		.dst_buf_offset = 0
+	};
+
+	if (options->out_of_place) {
+		ctx->dst_buf_offset = ctx->src_buf_offset +
+				(mbuf_size * segments_nb);
+		params.dst_buf_offset = ctx->dst_buf_offset;
+		/* Destination buffer will be one segment only */
+		obj_size += max_size;
 	}
 
-	snprintf(pool_name, sizeof(pool_name), "cperf_op_pool_cdev_%d",
-			dev_id);
-
-	uint16_t priv_size = RTE_ALIGN_CEIL(test_vector->cipher_iv.length +
-		test_vector->auth_iv.length + test_vector->aead_iv.length, 16) +
-		RTE_ALIGN_CEIL(options->aead_aad_sz, 16);
+	ctx->pool = rte_mempool_create(pool_name,
+			options->pool_sz, obj_size, 512, 0,
+			NULL, NULL, mempool_obj_init,
+			(void *)&params,
+			rte_socket_id(), 0);
 
-	ctx->crypto_op_pool = rte_crypto_op_pool_create(pool_name,
-			RTE_CRYPTO_OP_TYPE_SYMMETRIC, options->pool_sz,
-			512, priv_size, rte_socket_id());
-	if (ctx->crypto_op_pool == NULL)
+	if (ctx->pool == NULL)
 		goto err;
 
 	return ctx;
 err:
-	cperf_throughput_test_free(ctx, mbuf_idx);
+	cperf_throughput_test_free(ctx);
 
 	return NULL;
 }
@@ -329,7 +311,7 @@  cperf_throughput_test_runner(void *test_ctx)
 		uint64_t ops_enqd = 0, ops_enqd_total = 0, ops_enqd_failed = 0;
 		uint64_t ops_deqd = 0, ops_deqd_total = 0, ops_deqd_failed = 0;
 
-		uint64_t m_idx = 0, tsc_start, tsc_end, tsc_duration;
+		uint64_t tsc_start, tsc_end, tsc_duration;
 
 		uint16_t ops_unused = 0;
 
@@ -345,11 +327,9 @@  cperf_throughput_test_runner(void *test_ctx)
 
 			uint16_t ops_needed = burst_size - ops_unused;
 
-			/* Allocate crypto ops from pool */
-			if (ops_needed != rte_crypto_op_bulk_alloc(
-					ctx->crypto_op_pool,
-					RTE_CRYPTO_OP_TYPE_SYMMETRIC,
-					ops, ops_needed)) {
+			/* Allocate objects containing crypto operations and mbufs */
+			if (rte_mempool_get_bulk(ctx->pool, (void **)ops,
+						ops_needed) != 0) {
 				RTE_LOG(ERR, USER1,
 					"Failed to allocate more crypto operations "
 					"from the the crypto operation pool.\n"
@@ -359,10 +339,11 @@  cperf_throughput_test_runner(void *test_ctx)
 			}
 
 			/* Setup crypto op, attach mbuf etc */
-			(ctx->populate_ops)(ops, &ctx->mbufs_in[m_idx],
-					&ctx->mbufs_out[m_idx],
-					ops_needed, ctx->sess, ctx->options,
-					ctx->test_vector, iv_offset);
+			(ctx->populate_ops)(ops, ctx->src_buf_offset,
+					ctx->dst_buf_offset,
+					ops_needed, ctx->sess,
+					ctx->options, ctx->test_vector,
+					iv_offset);
 
 			/**
 			 * When ops_needed is smaller than ops_enqd, the
@@ -407,12 +388,8 @@  cperf_throughput_test_runner(void *test_ctx)
 					ops_processed, test_burst_size);
 
 			if (likely(ops_deqd))  {
-				/* free crypto ops so they can be reused. We don't free
-				 * the mbufs here as we don't want to reuse them as
-				 * the crypto operation will change the data and cause
-				 * failures.
-				 */
-				rte_mempool_put_bulk(ctx->crypto_op_pool,
+				/* free crypto ops so they can be reused. */
+				rte_mempool_put_bulk(ctx->pool,
 						(void **)ops_processed, ops_deqd);
 
 				ops_deqd_total += ops_deqd;
@@ -425,9 +402,6 @@  cperf_throughput_test_runner(void *test_ctx)
 				ops_deqd_failed++;
 			}
 
-			m_idx += ops_needed;
-			m_idx = m_idx + test_burst_size > ctx->options->pool_sz ?
-					0 : m_idx;
 		}
 
 		/* Dequeue any operations still in the crypto device */
@@ -442,9 +416,8 @@  cperf_throughput_test_runner(void *test_ctx)
 			if (ops_deqd == 0)
 				ops_deqd_failed++;
 			else {
-				rte_mempool_put_bulk(ctx->crypto_op_pool,
+				rte_mempool_put_bulk(ctx->pool,
 						(void **)ops_processed, ops_deqd);
-
 				ops_deqd_total += ops_deqd;
 			}
 		}
@@ -532,5 +505,5 @@  cperf_throughput_test_destructor(void *arg)
 
 	rte_cryptodev_stop(ctx->dev_id);
 
-	cperf_throughput_test_free(ctx, ctx->options->pool_sz);
+	cperf_throughput_test_free(ctx);
 }
diff --git a/app/test-crypto-perf/cperf_test_verify.c b/app/test-crypto-perf/cperf_test_verify.c
index b18426c..aa065ed 100644
--- a/app/test-crypto-perf/cperf_test_verify.c
+++ b/app/test-crypto-perf/cperf_test_verify.c
@@ -43,135 +43,141 @@  struct cperf_verify_ctx {
 	uint16_t qp_id;
 	uint8_t lcore_id;
 
-	struct rte_mempool *pkt_mbuf_pool_in;
-	struct rte_mempool *pkt_mbuf_pool_out;
-	struct rte_mbuf **mbufs_in;
-	struct rte_mbuf **mbufs_out;
-
-	struct rte_mempool *crypto_op_pool;
+	struct rte_mempool *pool;
 
 	struct rte_cryptodev_sym_session *sess;
 
 	cperf_populate_ops_t populate_ops;
 
+	uint32_t src_buf_offset;
+	uint32_t dst_buf_offset;
+
 	const struct cperf_options *options;
 	const struct cperf_test_vector *test_vector;
 };
 
-struct cperf_op_result {
-	enum rte_crypto_op_status status;
-};
-
 static void
-cperf_verify_test_free(struct cperf_verify_ctx *ctx, uint32_t mbuf_nb)
+cperf_verify_test_free(struct cperf_verify_ctx *ctx)
 {
-	uint32_t i;
-
 	if (ctx) {
 		if (ctx->sess) {
 			rte_cryptodev_sym_session_clear(ctx->dev_id, ctx->sess);
 			rte_cryptodev_sym_session_free(ctx->sess);
 		}
 
-		if (ctx->mbufs_in) {
-			for (i = 0; i < mbuf_nb; i++)
-				rte_pktmbuf_free(ctx->mbufs_in[i]);
-
-			rte_free(ctx->mbufs_in);
-		}
-
-		if (ctx->mbufs_out) {
-			for (i = 0; i < mbuf_nb; i++) {
-				if (ctx->mbufs_out[i] != NULL)
-					rte_pktmbuf_free(ctx->mbufs_out[i]);
-			}
-
-			rte_free(ctx->mbufs_out);
-		}
-
-		if (ctx->pkt_mbuf_pool_in)
-			rte_mempool_free(ctx->pkt_mbuf_pool_in);
-
-		if (ctx->pkt_mbuf_pool_out)
-			rte_mempool_free(ctx->pkt_mbuf_pool_out);
-
-		if (ctx->crypto_op_pool)
-			rte_mempool_free(ctx->crypto_op_pool);
+		if (ctx->pool)
+			rte_mempool_free(ctx->pool);
 
 		rte_free(ctx);
 	}
 }
 
-static struct rte_mbuf *
-cperf_mbuf_create(struct rte_mempool *mempool,
-		uint32_t segment_sz,
-		uint32_t segment_nb,
-		const struct cperf_options *options)
-{
-	struct rte_mbuf *mbuf;
-	uint8_t *mbuf_data;
-	uint32_t remaining_bytes = options->max_buffer_size;
+struct obj_params {
+	uint32_t src_buf_offset;
+	uint32_t dst_buf_offset;
+	uint16_t segment_sz;
+	uint16_t segments_nb;
+};
 
-	mbuf = rte_pktmbuf_alloc(mempool);
-	if (mbuf == NULL)
-		goto error;
+static void
+fill_single_seg_mbuf(struct rte_mbuf *m, struct rte_mempool *mp,
+		void *obj, uint32_t mbuf_offset, uint16_t segment_sz)
+{
+	uint32_t mbuf_hdr_size = sizeof(struct rte_mbuf);
+
+	/* start of buffer is after mbuf structure and priv data */
+	m->priv_size = 0;
+	m->buf_addr = (char *)m + mbuf_hdr_size;
+	m->buf_physaddr = rte_mempool_virt2phy(mp, obj) +
+		mbuf_offset + mbuf_hdr_size;
+	m->buf_len = segment_sz;
+	m->data_len = segment_sz;
+
+	/* No headroom needed for the buffer */
+	m->data_off = 0;
+
+	/* init some constant fields */
+	m->pool = mp;
+	m->nb_segs = 1;
+	m->port = 0xff;
+	rte_mbuf_refcnt_set(m, 1);
+	m->next = NULL;
+}
 
-	mbuf_data = (uint8_t *)rte_pktmbuf_append(mbuf, segment_sz);
-	if (mbuf_data == NULL)
-		goto error;
+static void
+fill_multi_seg_mbuf(struct rte_mbuf *m, struct rte_mempool *mp,
+		void *obj, uint32_t mbuf_offset, uint16_t segment_sz,
+		uint16_t segments_nb)
+{
+	uint16_t mbuf_hdr_size = sizeof(struct rte_mbuf);
+	uint16_t remaining_segments = segments_nb;
+	struct rte_mbuf *next_mbuf;
+	phys_addr_t next_seg_phys_addr = rte_mempool_virt2phy(mp, obj) +
+			 mbuf_offset + mbuf_hdr_size;
+
+	do {
+		/* start of buffer is after mbuf structure and priv data */
+		m->priv_size = 0;
+		m->buf_addr = (char *)m + mbuf_hdr_size;
+		m->buf_physaddr = next_seg_phys_addr;
+		/* Advance past this segment's mbuf header and data */
+		next_seg_phys_addr += mbuf_hdr_size + segment_sz;
+		m->buf_len = segment_sz;
+		m->data_len = segment_sz;
+
+		/* No headroom needed for the buffer */
+		m->data_off = 0;
+
+		/* init some constant fields */
+		m->pool = mp;
+		m->nb_segs = segments_nb;
+		m->port = 0xff;
+		rte_mbuf_refcnt_set(m, 1);
+		next_mbuf = (struct rte_mbuf *) ((uint8_t *) m +
+					mbuf_hdr_size + segment_sz);
+		remaining_segments--;
+
+		/* Terminate the chain at the last segment */
+		m->next = remaining_segments > 0 ? next_mbuf : NULL;
+		m = next_mbuf;
+
+	} while (remaining_segments > 0);
+}
 
-	if (options->max_buffer_size <= segment_sz)
-		remaining_bytes = 0;
+static void
+mempool_obj_init(struct rte_mempool *mp,
+		 void *opaque_arg,
+		 void *obj,
+		 __attribute__((unused)) unsigned int i)
+{
+	struct obj_params *params = opaque_arg;
+	struct rte_crypto_op *op = obj;
+	struct rte_mbuf *m = (struct rte_mbuf *) ((uint8_t *) obj +
+					params->src_buf_offset);
+	/* Set crypto operation */
+	op->type = RTE_CRYPTO_OP_TYPE_SYMMETRIC;
+	op->status = RTE_CRYPTO_OP_STATUS_NOT_PROCESSED;
+	op->sess_type = RTE_CRYPTO_OP_WITH_SESSION;
+
+	/* Set source buffer */
+	op->sym->m_src = m;
+	if (params->segments_nb == 1)
+		fill_single_seg_mbuf(m, mp, obj, params->src_buf_offset,
+				params->segment_sz);
 	else
-		remaining_bytes -= segment_sz;
-
-	segment_nb--;
-
-	while (remaining_bytes) {
-		struct rte_mbuf *m;
-
-		m = rte_pktmbuf_alloc(mempool);
-		if (m == NULL)
-			goto error;
-
-		rte_pktmbuf_chain(mbuf, m);
-
-		mbuf_data = (uint8_t *)rte_pktmbuf_append(mbuf, segment_sz);
-		if (mbuf_data == NULL)
-			goto error;
-
-		if (remaining_bytes <= segment_sz)
-			remaining_bytes = 0;
-		else
-			remaining_bytes -= segment_sz;
-
-		segment_nb--;
-	}
-
-	/*
-	 * If there was not enough room for the digest at the end
-	 * of the last segment, allocate a new one
-	 */
-	if (segment_nb != 0) {
-		struct rte_mbuf *m;
-
-		m = rte_pktmbuf_alloc(mempool);
-
-		if (m == NULL)
-			goto error;
-
-		rte_pktmbuf_chain(mbuf, m);
-		mbuf_data = (uint8_t *)rte_pktmbuf_append(mbuf,	segment_sz);
-		if (mbuf_data == NULL)
-			goto error;
-	}
-
-	return mbuf;
-error:
-	if (mbuf != NULL)
-		rte_pktmbuf_free(mbuf);
-
-	return NULL;
+		fill_multi_seg_mbuf(m, mp, obj, params->src_buf_offset,
+				params->segment_sz, params->segments_nb);
+
+
+	/* Set destination buffer */
+	if (params->dst_buf_offset) {
+		m = (struct rte_mbuf *) ((uint8_t *) obj +
+				params->dst_buf_offset);
+		fill_single_seg_mbuf(m, mp, obj, params->dst_buf_offset,
+				params->segment_sz);
+		op->sym->m_dst = m;
+	} else
+		op->sym->m_dst = NULL;
 }
 
 static void
@@ -210,7 +216,6 @@  cperf_verify_test_constructor(struct rte_mempool *sess_mp,
 		const struct cperf_op_fns *op_fns)
 {
 	struct cperf_verify_ctx *ctx = NULL;
-	unsigned int mbuf_idx = 0;
 	char pool_name[32] = "";
 
 	ctx = rte_malloc(NULL, sizeof(struct cperf_verify_ctx), 0);
@@ -224,7 +229,7 @@  cperf_verify_test_constructor(struct rte_mempool *sess_mp,
 	ctx->options = options;
 	ctx->test_vector = test_vector;
 
-	/* IV goes at the end of the cryptop operation */
+	/* IV goes at the end of the crypto operation */
 	uint16_t iv_offset = sizeof(struct rte_crypto_op) +
 		sizeof(struct rte_crypto_sym_op);
 
@@ -233,83 +238,56 @@  cperf_verify_test_constructor(struct rte_mempool *sess_mp,
 	if (ctx->sess == NULL)
 		goto err;
 
-	snprintf(pool_name, sizeof(pool_name), "cperf_pool_in_cdev_%d",
-			dev_id);
-
+	/* Calculate the object size */
+	uint16_t crypto_op_size = sizeof(struct rte_crypto_op) +
+		sizeof(struct rte_crypto_sym_op);
+	uint16_t crypto_op_private_size = test_vector->cipher_iv.length +
+				test_vector->auth_iv.length +
+				options->aead_aad_sz;
+	uint16_t crypto_op_total_size = crypto_op_size +
+				crypto_op_private_size;
+	uint16_t crypto_op_total_size_padded =
+				RTE_CACHE_LINE_ROUNDUP(crypto_op_total_size);
+	uint32_t mbuf_size = sizeof(struct rte_mbuf) + options->segment_sz;
 	uint32_t max_size = options->max_buffer_size + options->digest_sz;
-	uint32_t segment_nb = (max_size % options->segment_sz) ?
+	uint16_t segments_nb = (max_size % options->segment_sz) ?
 			(max_size / options->segment_sz) + 1 :
 			max_size / options->segment_sz;
+	uint32_t obj_size = crypto_op_total_size_padded +
+				(mbuf_size * segments_nb);
 
-	ctx->pkt_mbuf_pool_in = rte_pktmbuf_pool_create(pool_name,
-			options->pool_sz * segment_nb, 0, 0,
-			RTE_PKTMBUF_HEADROOM + options->segment_sz,
-			rte_socket_id());
-
-	if (ctx->pkt_mbuf_pool_in == NULL)
-		goto err;
-
-	/* Generate mbufs_in with plaintext populated for test */
-	ctx->mbufs_in = rte_malloc(NULL,
-			(sizeof(struct rte_mbuf *) * ctx->options->pool_sz), 0);
-
-	for (mbuf_idx = 0; mbuf_idx < options->pool_sz; mbuf_idx++) {
-		ctx->mbufs_in[mbuf_idx] = cperf_mbuf_create(
-				ctx->pkt_mbuf_pool_in,
-				options->segment_sz,
-				segment_nb,
-				options);
-		if (ctx->mbufs_in[mbuf_idx] == NULL)
-			goto err;
-	}
-
-	if (options->out_of_place == 1)	{
-
-		snprintf(pool_name, sizeof(pool_name), "cperf_pool_out_cdev_%d",
-				dev_id);
-
-		ctx->pkt_mbuf_pool_out = rte_pktmbuf_pool_create(
-				pool_name, options->pool_sz, 0, 0,
-				RTE_PKTMBUF_HEADROOM +
-				max_size,
-				rte_socket_id());
-
-		if (ctx->pkt_mbuf_pool_out == NULL)
-			goto err;
-	}
+	snprintf(pool_name, sizeof(pool_name), "pool_in_cdev_%d",
+			dev_id);
 
-	ctx->mbufs_out = rte_malloc(NULL,
-			(sizeof(struct rte_mbuf *) *
-			ctx->options->pool_sz), 0);
-
-	for (mbuf_idx = 0; mbuf_idx < options->pool_sz; mbuf_idx++) {
-		if (options->out_of_place == 1)	{
-			ctx->mbufs_out[mbuf_idx] = cperf_mbuf_create(
-					ctx->pkt_mbuf_pool_out, max_size,
-					1, options);
-			if (ctx->mbufs_out[mbuf_idx] == NULL)
-				goto err;
-		} else {
-			ctx->mbufs_out[mbuf_idx] = NULL;
-		}
+	ctx->src_buf_offset = crypto_op_total_size_padded;
+
+	struct obj_params params = {
+		.segment_sz = options->segment_sz,
+		.segments_nb = segments_nb,
+		.src_buf_offset = crypto_op_total_size_padded,
+		.dst_buf_offset = 0
+	};
+
+	if (options->out_of_place) {
+		ctx->dst_buf_offset = ctx->src_buf_offset +
+				(mbuf_size * segments_nb);
+		params.dst_buf_offset = ctx->dst_buf_offset;
+		/* Destination buffer will be one segment only */
+		obj_size += max_size;
 	}
 
-	snprintf(pool_name, sizeof(pool_name), "cperf_op_pool_cdev_%d",
-			dev_id);
-
-	uint16_t priv_size = RTE_ALIGN_CEIL(test_vector->cipher_iv.length +
-		test_vector->auth_iv.length + test_vector->aead_iv.length, 16) +
-		RTE_ALIGN_CEIL(options->aead_aad_sz, 16);
+	ctx->pool = rte_mempool_create(pool_name,
+			options->pool_sz, obj_size, 512, 0,
+			NULL, NULL, mempool_obj_init,
+			(void *)&params,
+			rte_socket_id(), 0);
 
-	ctx->crypto_op_pool = rte_crypto_op_pool_create(pool_name,
-			RTE_CRYPTO_OP_TYPE_SYMMETRIC, options->pool_sz,
-			512, priv_size, rte_socket_id());
-	if (ctx->crypto_op_pool == NULL)
+	if (ctx->pool == NULL)
 		goto err;
 
 	return ctx;
 err:
-	cperf_verify_test_free(ctx, mbuf_idx);
+	cperf_verify_test_free(ctx);
 
 	return NULL;
 }
@@ -425,7 +403,7 @@  cperf_verify_test_runner(void *test_ctx)
 
 	static int only_once;
 
-	uint64_t i, m_idx = 0;
+	uint64_t i;
 	uint16_t ops_unused = 0;
 
 	struct rte_crypto_op *ops[ctx->options->max_burst_size];
@@ -465,11 +443,9 @@  cperf_verify_test_runner(void *test_ctx)
 
 		uint16_t ops_needed = burst_size - ops_unused;
 
-		/* Allocate crypto ops from pool */
-		if (ops_needed != rte_crypto_op_bulk_alloc(
-				ctx->crypto_op_pool,
-				RTE_CRYPTO_OP_TYPE_SYMMETRIC,
-				ops, ops_needed)) {
+		/* Allocate objects containing crypto operations and mbufs */
+		if (rte_mempool_get_bulk(ctx->pool, (void **)ops,
+					ops_needed) != 0) {
 			RTE_LOG(ERR, USER1,
 				"Failed to allocate more crypto operations "
 				"from the the crypto operation pool.\n"
@@ -479,8 +455,8 @@  cperf_verify_test_runner(void *test_ctx)
 		}
 
 		/* Setup crypto op, attach mbuf etc */
-		(ctx->populate_ops)(ops, &ctx->mbufs_in[m_idx],
-				&ctx->mbufs_out[m_idx],
+		(ctx->populate_ops)(ops, ctx->src_buf_offset,
+				ctx->dst_buf_offset,
 				ops_needed, ctx->sess, ctx->options,
 				ctx->test_vector, iv_offset);
 
@@ -520,10 +496,6 @@  cperf_verify_test_runner(void *test_ctx)
 		ops_deqd = rte_cryptodev_dequeue_burst(ctx->dev_id, ctx->qp_id,
 				ops_processed, ctx->options->max_burst_size);
 
-		m_idx += ops_needed;
-		if (m_idx + ctx->options->max_burst_size > ctx->options->pool_sz)
-			m_idx = 0;
-
 		if (ops_deqd == 0) {
 			/**
 			 * Count dequeue polls which didn't return any
@@ -538,13 +510,10 @@  cperf_verify_test_runner(void *test_ctx)
 			if (cperf_verify_op(ops_processed[i], ctx->options,
 						ctx->test_vector))
 				ops_failed++;
-			/* free crypto ops so they can be reused. We don't free
-			 * the mbufs here as we don't want to reuse them as
-			 * the crypto operation will change the data and cause
-			 * failures.
-			 */
-			rte_crypto_op_free(ops_processed[i]);
 		}
+		/* free crypto ops so they can be reused. */
+		rte_mempool_put_bulk(ctx->pool,
+					(void **)ops_processed, ops_deqd);
 		ops_deqd_total += ops_deqd;
 	}
 
@@ -566,13 +535,10 @@  cperf_verify_test_runner(void *test_ctx)
 			if (cperf_verify_op(ops_processed[i], ctx->options,
 						ctx->test_vector))
 				ops_failed++;
-			/* free crypto ops so they can be reused. We don't free
-			 * the mbufs here as we don't want to reuse them as
-			 * the crypto operation will change the data and cause
-			 * failures.
-			 */
-			rte_crypto_op_free(ops_processed[i]);
 		}
+		/* free crypto ops so they can be reused. */
+		rte_mempool_put_bulk(ctx->pool,
+					(void **)ops_processed, ops_deqd);
 		ops_deqd_total += ops_deqd;
 	}
 
@@ -628,5 +594,5 @@  cperf_verify_test_destructor(void *arg)
 
 	rte_cryptodev_stop(ctx->dev_id);
 
-	cperf_verify_test_free(ctx, ctx->options->pool_sz);
+	cperf_verify_test_free(ctx);
 }