eachParallel doesn't use all threads

Ken DeLong

I am confused about some behavior I’m seeing with GPars.  When I use eachParallel() I find that the last few elements do not use the entire thread pool:

 

import static groovyx.gpars.GParsPool.withPool

def nodes = [1,2,3,4,5,6]

withPool(3) {
    nodes.eachParallel() {
        sleep(2000)
        println "${new Date().format('mm:ss')} $it"
    }
}

 

The output is:

47:43 1
47:43 4
47:43 5
47:45 2
47:45 6
47:47 3

 

No matter how many elements I put in the list, the last few always run on fewer threads than the pool holds. Is this expected behavior? Is there a way to keep all 3 threads busy as long as there are unprocessed items in the list?

 

---------------------------------------------------------------------------------------

Kenneth DeLong |  Vice President of Engineering, Chief Software Architect

BabyCenter


 


RE: eachParallel doesn't use all threads

Bob Brown
A VERY slight modification, just to make the threads/times clearer:

///
import static groovyx.gpars.GParsPool.withPool

def nodes = [1,2,3,4,5,6,7,8]

withPool(3) {
    nodes.eachParallel() {
        sleep(2000)
        // Tag each line with the worker thread's name and a millisecond timestamp
        println "(${Thread.currentThread().name}) ${new Date().format('mm:ss.SS')} $it"
    }
}
 
(ForkJoinPool-2-worker-3) 35:50.669 7
(ForkJoinPool-2-worker-1) 35:50.669 1
(ForkJoinPool-2-worker-2) 35:50.669 5
(ForkJoinPool-2-worker-3) 35:52.670 8
(ForkJoinPool-2-worker-2) 35:52.671 6
(ForkJoinPool-2-worker-1) 35:52.671 2
(ForkJoinPool-2-worker-3) 35:54.670 3
(ForkJoinPool-2-worker-2) 35:54.671 4
Result: [1, 2, 3, 4, 5, 6, 7, 8]
///

In this run, I see two rounds on worker threads 1, 2 and 3, followed by one
round on threads 2 and 3.

Looks good to me.

A later run gives slightly different results:

///
(ForkJoinPool-5-worker-2) 39:49.483 5
(ForkJoinPool-5-worker-1) 39:49.483 1
(ForkJoinPool-5-worker-3) 39:49.483 7
(ForkJoinPool-5-worker-2) 39:51.484 6
(ForkJoinPool-5-worker-1) 39:51.484 2
(ForkJoinPool-5-worker-3) 39:51.484 8
(ForkJoinPool-5-worker-1) 39:53.484 3
(ForkJoinPool-5-worker-1) 39:55.485 4
///

In this run, I see what you mean: two rounds of 3 threads, then two rounds of
just 1 (worker-1 handles elements 3 and 4 back to back).

I see this second execution style more often, but NOT exclusively. Maybe 60%
of the time.

This is Groovy 2.1.6 (GPars 1.0.0).

I tried replacing gpars-1.0.0.jar with gpars-1.1.0.jar. The result is pretty
much the same, but this time I'd say the ratio between the two forms of
behaviour is 50:50.

Slightly less than optimal, but not actually 'buggy' per se?

BOB

> -----Original Message-----
> From: Ken DeLong [mailto:[hidden email]]
> Sent: Wednesday, 4 September 2013 10:07 AM
> To: [hidden email]
> Subject: [gpars-user] eachParallel doesn't use all threads
>

Re: eachParallel doesn't use all threads

Johannes Link
Bob,

I still don't understand where you see a problem. From the output I see a fairly evenly distributed usage of your 3 threads. Consider that println introduces (unwanted?) synchronisation between threads.

What output would you expect?

Johannes

On 04.09.2013 at 07:04, "Bob Brown" <[hidden email]> wrote:


RE: eachParallel doesn't use all threads

Bob Brown
Not really MY problem. I was just chipping in to Ken's issue. And also
trying to show how "Thread.currentThread().name" can be useful.

Consider: pool of 3 threads with 8 elements.

You'd expect processing to use threads in batches of 3, 3, 2. This is what
one would (naively?) expect.

He and I are sometimes seeing 3, 3, 1, 1.

The question boils down to: why aren't threads being taken from the pool
when they SEEM to be available?

A minor inefficiency perhaps.

I take your point about println (I'd considered it but didn't try to resolve
that particular issue).

Might be worth doing: how would you suggest going about it?

BOB
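
One possible way to take println out of the measurement, sketched under assumptions: each task appends a record to a thread-safe queue, and printing happens only after the pool has finished, so the workers never contend on standard output while they are being timed. The `records` queue, the map keys, and the shortened 500 ms sleep are illustrative choices, not part of GPars; the script assumes the GPars jar is on the classpath, as in the scripts above.

```groovy
import static groovyx.gpars.GParsPool.withPool
import java.util.concurrent.ConcurrentLinkedQueue

def nodes = [1, 2, 3, 4, 5, 6, 7, 8]
def records = new ConcurrentLinkedQueue()   // thread-safe, so tasks can append concurrently
long t0 = System.currentTimeMillis()

withPool(3) {
    nodes.eachParallel { node ->
        long startedMs = System.currentTimeMillis() - t0   // when this element was picked up
        sleep(500)                                         // simulated work (2000 ms in the thread)
        records.add([node: node, thread: Thread.currentThread().name, startedMs: startedMs])
    }
}

// Print once all work is done, ordered by the time each element started.
records.toList().sort { it.startedMs }.each {
    println "${it.startedMs} ms  ${it.thread}  ${it.node}"
}
```

Grouping the printed lines by start time should show the same rounds as the println version, without the output stream acting as a shared lock during the run.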


Re: eachParallel doesn't use all threads

Johannes Link
My guess: You don't see 3-3-2 because GParsPool uses a Fork/Join pool
(with work stealing) and not a plain fixed thread pool.

As for the synchronization of output, you definitely need it to make
the order visible; I just wanted to point out that it influences
the observed behaviour.

Johannes

2013/9/4 Bob Brown <[hidden email]>:


Re: eachParallel doesn't use all threads

Vaclav
Administrator
Hi guys,

I understand the behavior is somewhat surprising. As Johannes says, it is Fork/Join at work here, so the usual scheme of numbers sitting in a queue waiting for processing doesn't quite hold for this case. BTW, if you use GParsExecutorsPool, which doesn't use Fork/Join, you get the behavior that you expect.
It is the hierarchical Fork/Join algorithm that most likely introduces this inefficiency in how it splits the array into chunks, which then get stolen by other threads.

Cheers,

Vaclav
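
A minimal sketch of the switch described above: the only change from the earlier scripts is the static import, which takes `withPool` from `GParsExecutorsPool` instead of `GParsPool`. The `completed` list, the elapsed-time print, and the shortened 500 ms sleep are illustrative additions, and GPars is assumed to be on the classpath.

```groovy
import static groovyx.gpars.GParsExecutorsPool.withPool

def nodes = [1, 2, 3, 4, 5, 6, 7, 8]
def completed = Collections.synchronizedList([])   // illustrative: records completion order
long t0 = System.currentTimeMillis()

withPool(3) {
    nodes.eachParallel {
        sleep(500)   // simulated work (2000 ms in the thread)
        println "(${Thread.currentThread().name}) ${System.currentTimeMillis() - t0} ms $it"
        completed << it
    }
}

println "Completed: ${completed}"
```

With 8 elements and 3 threads, the elapsed times printed here should fall into rounds of 3, 3 and 2, which is the pattern Bob described as the naive expectation.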



On Wed, Sep 4, 2013 at 8:58 AM, Johannes Link <[hidden email]> wrote:
--
E-mail: [hidden email]
Blog: http://www.jroller.com/vaclav
Linkedin page: http://www.linkedin.com/in/vaclavpech

RE: eachParallel doesn't use all threads

Ken DeLong

Yes!  Using GParsExecutorsPool produces the desired behavior!  Thanks all!

 

The problem I was trying to solve is that we are driving our deployment system with Groovy. If we have a farm of, say, 9 servers, we might want to roll out or bounce them 3 at a time. But instead of 3-3-3, we were getting 3-3-1-1-1, which almost doubled the rollout times. But GParsExecutorsPool fixes it!

 

Thanks again!
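
To illustrate why a queue-backed pool gives the 3-at-a-time rollout described above, here is a sketch that uses a plain `java.util.concurrent` fixed thread pool rather than GPars itself (a stand-in, chosen so the queueing is visible). The server names and the `bounce` closure are hypothetical placeholders for a real deployment step. Because all three workers pull from one shared queue, a free thread takes the next server as soon as it finishes its current one.

```groovy
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Hypothetical stand-ins: nine server names and a bounce step that just sleeps.
def servers = (1..9).collect { 'app-' + it }
def bounce = { server -> sleep(300) }

def lock = new Object()
int inFlight = 0      // servers being bounced right now
int peak = 0          // highest value inFlight ever reached
def finished = Collections.synchronizedList([])

def pool = Executors.newFixedThreadPool(3)   // three workers share one task queue
servers.each { server ->
    pool.execute {
        synchronized (lock) { inFlight++; peak = Math.max(peak, inFlight) }
        try {
            bounce(server)
            finished << server
        } finally {
            synchronized (lock) { inFlight-- }
        }
    }
}
pool.shutdown()
pool.awaitTermination(1, TimeUnit.MINUTES)   // wait for the whole rollout

println "Bounced ${finished.size()} servers, at most ${peak} at a time"
```

The `peak` counter is only there to confirm the concurrency limit; a real rollout script would replace `bounce` with the actual restart call and decide how to handle a failed server.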

 

From: Václav Pech [mailto:[hidden email]]
Sent: Wednesday, September 04, 2013 12:38 AM
To: GPars Users
Subject: Re: [gpars-user] eachParallel doesn't use all threads

 

Hi guys,

 

I understand the behavior is somewhat surprising. As Johannes says, it is Fork/Join at work here, so the usual scheme of numbers sitting in a queue waiting for processing doesn't quite hold for this case. BTW, if you use GParsExecutorsPool, which doesn't use Fork/Join, you get the behavior that you expect.

It is the hierarchical Fork/Join algorithm that most likely introduces this inefficiency in how it splits the array into chunks, which then get stolen by other threads.

 

Cheers,

 

Vaclav

 

 

On Wed, Sep 4, 2013 at 8:58 AM, Johannes Link <[hidden email]> wrote:

My guess: You don't see 3-3-2 because GParsPool uses a Fork/Join pool
(with work stealing) and not a plain fixed thread pool.

As for the synchronization of output you definitely need it to make
the order visible; I just wanted to point to the fact that it has
influence on the observed behaviour.

Johannes

2013/9/4 Bob Brown <[hidden email]>:

> Not really MY problem. I was just chipping in to Ken's issue. And also


> trying to show how "Thread.currentThread().name" can be useful.
>
> Consider: pool of 3 threads with 8 elements.
>
> You'd expect processing to use threads in batches of 3, 3, 2. This is what
> one would (naively?) expect.
>
> He (and I) am sometimes seeing 3, 3, 1, 1.
>
> The question boils down to: why aren't threads being taken from the pool
> when they SEEM to be available.
>
> A minor inefficiency perhaps.
>
> I take your point about println (I'd considered it but didn't try to resolve
> that particular issue).
>
> Might be worth doing: how would you suggest?
>
> BOB
>
>> Bob,
>>
>> I still don't understand where you see a problem. From the output I see a
>> fairly evenly distributed usage of your 3 threads. Consider that println
>> introduces (unwanted?) synchronisation between threads.
>>
>> What output would you expect?
>>
>> Johannes
>>
>> Am 04.09.2013 um 07:04 schrieb "Bob Brown" <[hidden email]>:
>>
>> > A VERY slight modification, just to make the threads/times clearer:
>> >
>> > ///
>> > import static groovyx.gpars.GParsPool.withPool
>> >
>> > def nodes = [1,2,3,4,5,6,7,8]
>> >
>> > withPool(3) {
>> >    nodes.eachParallel() {
>> >        sleep(2000)
>> >        println "(${Thread.currentThread().name}) ${new
>> > Date().format('mm:ss.SS')} $it"
>> >    }
>> > }
>> >
>> > groovy> import static groovyx.gpars.GParsPool.withPool
>> > groovy>
>> > groovy> def nodes = [1,2,3,4,5,6,7,8]
>> > groovy>
>> > groovy> withPool(3) {
>> > groovy>     nodes.eachParallel() {
>> > groovy>         sleep(2000)
>> > groovy>         println "(${Thread.currentThread().name}) ${new
>> > Date().format('mm:ss.SS')} $it"
>> > groovy>     }
>> > groovy> }
>> >
>> > (ForkJoinPool-2-worker-3) 35:50.669 7
>> > (ForkJoinPool-2-worker-1) 35:50.669 1
>> > (ForkJoinPool-2-worker-2) 35:50.669 5
>> > (ForkJoinPool-2-worker-3) 35:52.670 8
>> > (ForkJoinPool-2-worker-2) 35:52.671 6
>> > (ForkJoinPool-2-worker-1) 35:52.671 2
>> > (ForkJoinPool-2-worker-3) 35:54.670 3
>> > (ForkJoinPool-2-worker-2) 35:54.671 4
>> > Result: [1, 2, 3, 4, 5, 6, 7, 8]
>> > ///
>> >
>> > In this run, I see 2 iterations of threads 1,2,3 followed by 1
>> > iteration using threads 2,3.
>> >
>> > Looks good to me.
>> >
>> > A later run gives slightly different results:
>> >
>> > ///
>> > (ForkJoinPool-5-worker-2) 39:49.483 5
>> > (ForkJoinPool-5-worker-1) 39:49.483 1
>> > (ForkJoinPool-5-worker-3) 39:49.483 7
>> > (ForkJoinPool-5-worker-2) 39:51.484 6
>> > (ForkJoinPool-5-worker-1) 39:51.484 2
>> > (ForkJoinPool-5-worker-3) 39:51.484 8
>> > (ForkJoinPool-5-worker-1) 39:53.484 3
>> > (ForkJoinPool-5-worker-1) 39:55.485 4
>> > ///
>> >
>> > In this run, I see what you mean: 2 *3, then 2 * 1
>> >
>> > I see this second execution style more often, but NOT exclusively.
>> > Maybe 60% of the time.
>> >
>> > This is Groovy 2.1.6 (Gpars 1.0.0).
>> >
>> > I tried replacing gpars-1.0.0.jar with gpars-1.1.0.jar. The result is
>> > pretty much the same but this time, I'd say the ratio between the two
>> > forms of behaviour is 50:50.
>> >
>> > Slightly less than optimal, but not actually 'buggy' per-se?
>> >
>> > BOB
>> >
>> >> -----Original Message-----
>> >> From: Ken DeLong [mailto:[hidden email]]
>> >> Sent: Wednesday, 4 September 2013 10:07 AM
>> >> To: [hidden email]
>> >> Subject: [gpars-user] eachParallel doesn't use all threads
>> >>
>> >> I am confused about some behavior I'm seeing with GPars.  When I use
>> >> eachParallel() I find that the last few elements do not use the
>> >> entire thread pool:
>> >>
>> >> import static groovyx.gpars.GParsPool.withPool
>> >>
>> >> def nodes = [1,2,3,4,5,6]
>> >>
>> >> withPool(3) {
>> >>     nodes.eachParallel() {
>> >>         sleep(2000)
>> >>         println "${new Date().format('mm:ss')} $it"
>> >>     }
>> >> }
>> >>
>> >> The output is:
>> >>
>> >> 47:43 1
>> >> 47:43 4
>> >> 47:43 5
>> >> 47:45 2
>> >> 47:45 6
>> >> 47:47 3
>> >>
>> >> No matter how many elements I put in the array, the last few are
>> >> always executed with less than the total thread pool.  Is this expected
>> >> behavior?  Is there a way to always have 3 threads executing as long as
>> >> there are unprocessed items in the list?

--
E-mail: [hidden email]
Blog: http://www.jroller.com/vaclav
Linkedin page: http://www.linkedin.com/in/vaclavpech


RE: eachParallel doesn't use all threads

Bob Brown
In reply to this post by Vaclav
> Hi guys,
>
> I understand the behavior is somewhat surprising. As Johannes says, it is
> Fork/Join at work here, so the usual scheme of numbers sitting in a queue
> waiting for processing doesn't quite hold for this case. BTW, if you use
> GParsExecutorsPool, which doesn't use Fork/Join, you get the behavior that
> you expect.
> It is the hierarchical Fork/Join algorithm that most likely introduces this
> inefficiency in how it splits the array into chunks, which then get stolen by
> other threads.

It IS surprising. I wonder if it is worth noting this in the documentation.

Something like:

"""
Because GParsPool uses a Fork/Join pool (with work stealing), threads may
not be applied to a waiting task even though they appear idle. With a
work-stealing algorithm, worker threads that run out of things to do can
steal tasks from other threads that are still busy.

If you use GParsExecutorsPool, which doesn't use Fork/Join, you get the
thread allocation behavior that you would naively expect.
"""

?

BOB
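
For comparison, here is a minimal sketch of the same test run on
GParsExecutorsPool, the non-Fork/Join alternative mentioned above. It
assumes GPars is on the classpath. The "processed" list is a hypothetical
addition for illustration only and was not part of the original test. No
output is shown, since thread names and timings will vary from run to run.

```groovy
// Sketch only: the thread's test, switched from GParsPool to GParsExecutorsPool.
import groovyx.gpars.GParsExecutorsPool

def nodes = [1, 2, 3, 4, 5, 6, 7, 8]

// Hypothetical helper: a thread-safe list recording which items were handled
def processed = Collections.synchronizedList([])

// withPool(3) here creates an executor-backed pool of three threads (no Fork/Join)
GParsExecutorsPool.withPool(3) {
    nodes.eachParallel {
        sleep(2000)
        println "(${Thread.currentThread().name}) ${new Date().format('mm:ss.SS')} $it"
        processed << it
    }
}
```

Comparing the timestamps from this run against a GParsPool run of the same
list should show whether all three threads stay busy until the list is
exhausted.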



Re: eachParallel doesn't use all threads

Vaclav
Administrator
OK, thanks. I will make the update.

Vaclav



On Thu, Sep 5, 2013 at 12:57 AM, Bob Brown <[hidden email]> wrote:
> > Hi guys,
> >
> > I understand the behavior is somewhat surprising. As Johannes says, it is
> > Fork/Join at work here, so the usual scheme of numbers sitting in a queue
> > waiting for processing doesn't quite hold for this case. BTW, if you use
> > GParsExecutorsPool, which doesn't use Fork/Join, you get the behavior that
> > you expect.
> > It is the hierarchical Fork/Join algorithm that most likely introduces this
> > inefficiency in how it splits the array into chunks, which then get stolen by
> > other threads.
>
> It IS surprising. I wonder if it is worth noting this in the documentation.
>
> Something like:
>
> """
> Because GParsPool uses a Fork/Join pool (with work stealing), threads may
> not be applied to a waiting task even though they appear idle. With a
> work-stealing algorithm, worker threads that run out of things to do can
> steal tasks from other threads that are still busy.
>
> If you use GParsExecutorsPool, which doesn't use Fork/Join, you get the
> thread allocation behavior that you would naively expect.
> """
>
> ?
>
> BOB

