CPUs go crazy on large withPool

CPUs go crazy on large withPool

strayph
import static groovyx.gpars.GParsPool.withPool

import java.util.concurrent.Callable
import java.util.concurrent.Future
import java.util.concurrent.FutureTask

Future<Integer> processAllAsync(List<ImpressionLog> logs) {
    def count = withPool(100) { // logs.size()) {
        logs.parallel.map { this.process(it) ? 1 : 0 }.reduce { a, b -> a + b }
    }
    def future = new FutureTask<Integer>({ count } as Callable)
    future.run()
    return future
}

Sorry, I can't quite break out the code for an example.  I'll work on that.

The issue is the withPool() argument.
Basically, this code processes log files in parallel.
I am testing with 3 log files.
If I use logs.size(), then it always works fine.
If I use 10, it works fine.
If I use 100 or 1000, the CPUs just lock up and the run hangs.

You see my confusion.  I do have 8 CPUs on my Mac, but 10 > 8 CPUs and 10 > 3 log files, so
it seems 10, 100, or 1000 should all behave the same.

They don't.
Instead, 100 and 1000 pin all my processors and the system goes crazy.
I tried with a simple math problem, but that worked fine with 10, 100, or 1000.

What else shall I investigate?

Have any others run into similar problems?

Strayph

Re: CPUs go crazy on large withPool

Vaclav
Administrator
Hi Strayph,

I can't really see why this is happening. Could you try removing the operations performed inside the withPool block, so that we can see whether the issue arises at pool construction time (inside withPool) or during the actual map/reduce work?

Vaclav
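
One way to run the experiment suggested above is to time an empty withPool block first and then the same pool doing trivial work. This is only a minimal sketch: it assumes GPars 0.12 resolves through @Grab, the pool size of 100 mirrors the failing case, and the `timed` helper and the squaring workload are hypothetical stand-ins, not part of the reported code.

```groovy
@Grab(group='org.codehaus.gpars', module='gpars', version='0.12')
import static groovyx.gpars.GParsPool.withPool

// Hypothetical helper: run a closure and return [result, elapsed milliseconds].
def timed = { Closure work ->
    long start = System.nanoTime()
    def result = work()
    [result, (System.nanoTime() - start).intdiv(1000000)]
}

int poolSize = 100   // the size that misbehaved in the report

// Step 1: pool construction and shutdown only, no work submitted.
def (ignored, emptyMs) = timed { withPool(poolSize) { /* no work */ } }
println "empty pool: ${emptyMs} ms"

// Step 2: same pool size with a trivial map/reduce, to exercise the workers.
def (sum, workMs) = timed {
    withPool(poolSize) {
        (1..10).parallel.map { it * it }.reduce { a, b -> a + b }
    }
}
println "trivial map/reduce: ${sum} in ${workMs} ms"
```

If step 1 already hangs, the pool itself is the suspect; if only the real workload hangs, the closures passed to map are worth a closer look.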


On Tue, May 3, 2011 at 9:04 AM, strayph <[hidden email]> wrote:
The issue is the withPool() argument.
Basically, this code processes log files in parallel.
I am testing with 3 log files.
If I use logs.size(), then it always works fine.
If I use 10, it works fine.
If I use 100 or 1000, the CPUs just lock up and the run hangs.

--
View this message in context: http://gpars-user-mailing-list.19372.n3.nabble.com/CPUs-go-crazy-on-large-withPool-tp2893107p2893107.html
Sent from the GPars - user mailing list mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

   http://xircles.codehaus.org/manage_email





--
E-mail: [hidden email]
Blog: http://www.jroller.com/vaclav
Linkedin page: http://www.linkedin.com/in/vaclavpech

Re: CPUs go crazy on large withPool

strayph
Okay, I made a simple real example, which exhibits the same behavior.
It definitely shows something happens when there is an excess of Threads in the ThreadPool.

> ./gpars.groovy  # All is well for the 3 files
331056
> THREADS=10 ./gpars.groovy  # All is well for the 3 files
331056
> THREADS=100 ./gpars.groovy  # All CPUs pin and never return


gpars.groovy:
#!/usr/bin/env groovy

@Grab(group='org.codehaus.gpars', module='gpars', version='0.11')
import static groovyx.gpars.GParsPool.withPool

import java.util.zip.GZIPInputStream

def root = System.getenv('LOG_ROOT')

def files = [
    '2011_05_02-15_00_01-i-251c2849-msnbc-video.log.gz',
    '2011_05_02-15_00_01-i-d89a46b7-msnbc-video.log.gz',
    '2011_05_02-15_00_02-i-da9a46b5-msnbc-video.log.gz',
]

def threadCount = System.getenv('THREADS') as Integer ?: files.size()

def rows = []

println withPool(threadCount) {
    files.collect { root + it }.parallel.map {
        def file = new File(it)
        def fis = new FileInputStream(file)
        def gzis = new GZIPInputStream(fis)
        def isr = new InputStreamReader(gzis)
        def reader = new BufferedReader(isr)
        def row
        while (row = reader.readLine()?.split(' - ')?.collect { it.replace(/"/, '') }) {
            rows += row
        }
        return rows.size()
    }.reduce { a, b -> a + b }
}

Re: CPUs go crazy on large withPool

Vaclav
Administrator
def rows = [] is not thread-safe and so shouldn't be accessed from within the parallel map operations.
Have you considered making it a variable local to the closures? This is more a correctness problem than a performance one, though, so I'll keep on exploring.
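
The suggestion to keep rows local to each closure could look roughly like the sketch below. It is an illustration under assumptions, not the original script: it uses GPars 0.12 via @Grab, a pool of 4 threads, and three small temporary text files as hypothetical stand-ins for the gzipped logs, so it runs without LOG_ROOT.

```groovy
@Grab(group='org.codehaus.gpars', module='gpars', version='0.12')
import static groovyx.gpars.GParsPool.withPool

// Hypothetical stand-in data: file n holds n lines of three quoted fields each.
def files = (1..3).collect { n ->
    def f = File.createTempFile('sample-' + n + '-', '.log')
    f.deleteOnExit()
    f.text = (1..n).collect { '"a" - "b" - "c"' }.join('\n')
    f
}

def total = withPool(4) {
    files.parallel.map { File file ->
        def rows = []   // declared inside the closure, so each task has its own list
        file.eachLine { line ->
            rows += line.split(' - ').collect { it.replace('"', '') }
        }
        rows.size()     // per-file count; nothing shared between threads
    }.reduce { a, b -> a + b }
}

println total
```

Because every task now returns its own count, the reduce step is the only place where results from different threads meet.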


On Tue, May 3, 2011 at 5:40 PM, strayph <[hidden email]> wrote:
Okay, I made a simple real example, which exhibits the same behavior.
It definitely shows something happens when there is an excess of Threads in
the ThreadPool.


Re: CPUs go crazy on large withPool

strayph
My first thoughts on this had been that it was a relative problem, not an absolute one.
Now I am pretty sure it is an absolute problem.

I had thought it was because I had extra threads with no work to do, and they were somehow misbehaving.
However, if I change my example to 100 files with 100 threads to work on them, the exact same issue occurs.

So I think there are some problems with GPars (or the underlying Java libraries) with large thread pools.

Strayph

Re: CPUs go crazy on large withPool

Vaclav
Administrator
Strayph,

could I ask you to try your code with GPars-0.12-beta-1-SNAPSHOT? We switched the underlying Fork/Join framework for the 0.12 version and my tests over here indicate that the pool behaves much better in the newer version.

Vaclav


On Tue, May 3, 2011 at 7:29 PM, strayph <[hidden email]> wrote:
My first thoughts on this had been that it was a relative problem, not an
absolute one.
Now I am pretty sure it is an absolute problem.


Re: CPUs go crazy on large withPool

strayph
Sure. Is it in a repo where I can connect from gradle?  I haven't built gpars from source. 

On May 3, 2011, at 1:24 PM, "Vaclav [via GPars - user mailing list]" <[hidden email]> wrote:

Strayph,

could I ask you to try your code with GPars-0.12-beta-1-SNAPSHOT? We switched the underlying Fork/Join framework for the 0.12 version and my tests over here indicate that the pool behaves much better in the newer version.


Re: CPUs go crazy on large withPool

Vaclav
Administrator
You can use
@GrabResolver(name='gpars', root='http://snapshots.repository.codehaus.org/', m2Compatible=true)
@Grab(group='org.codehaus.gpars', module='gpars', version='0.12-beta-1-SNAPSHOT')



On Tue, May 3, 2011 at 11:10 PM, strayph <[hidden email]> wrote:
Sure. Is it in a repo where I can connect from gradle?  I haven't built gpars from source.


Re: CPUs go crazy on large withPool

strayph
Nope.  Still getting CPU pin and no return if I set my withPool() to 100 threads.

:-(

Strayph

On May 3, 2011, at 10:01 PM, Vaclav [via GPars - user mailing list] wrote:

You can use
@GrabResolver(name='gpars', root='http://snapshots.repository.codehaus.org/', m2Compatible=true)
@Grab(group='org.codehaus.gpars', module='gpars', version='0.12-beta-1-SNAPSHOT')




Re: CPUs go crazy on large withPool

Vaclav
Administrator
I still have not managed to reproduce the problem here using 0.12. If you're using Groovy 1.8, please make sure you have removed the gpars jar file from the Groovy installation before you start the Groovy console. Otherwise you're still using GPars 0.11 with the old thread pool library.
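
A quick way to confirm which jar actually supplied a class is to ask the JVM where it was loaded from. The following is a minimal sketch; `originOf` is a hypothetical helper name, and the commented-out line assumes GPars is on the classpath of the script being checked.

```groovy
// Report the jar (or directory) a class was loaded from.
// Classes from the JDK itself usually have no code source, hence the fallback text.
String originOf(Class type) {
    def source = type.getProtectionDomain()?.getCodeSource()
    source?.getLocation()?.toString() ?: '(bootstrap or unknown)'
}

println originOf(groovy.lang.GroovyObject)   // where the running Groovy comes from

// With GPars available, the same call shows which gpars jar won, for example a jar
// under the Groovy installation's lib directory rather than the one fetched by @Grab:
// println originOf(groovyx.gpars.GParsPool)
```

If the printed location points into the Groovy installation rather than the Grape cache, the bundled jar is shadowing the grabbed snapshot.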



On Wed, May 4, 2011 at 11:46 PM, strayph <[hidden email]> wrote:
Nope.  Still getting CPU pin and no return if I set my withPool() to 100 threads.

:-(

Strayph


Re: CPUs go crazy on large withPool

strayph
Yes, you were correct.  I needed to remove the older JAR file.

This works fine with 0.12.  Yay!

Strayph

Re: CPUs go crazy on large withPool

Vaclav
Administrator
Great, I'm glad to hear that. Thank you for the confirmation, Strayph.
It seems we definitely need to do something about jar prioritization in Grapes.

Cheers,

Vaclav


On Mon, May 9, 2011 at 11:42 PM, strayph <[hidden email]> wrote:
Yes, you were correct.  I needed to remove the older JAR file.

This works fine with 0.12.  Yay!

Strayph


Re: CPUs go crazy on large withPool

Russel Winder
On Tue, 2011-05-10 at 05:10 +0200, Vaclav Pech wrote:
> Great, I'm glad to hear that. Thank you for the confirmation, Strayph.
> It seems we definitely need to do something about jar prioritization
> in Grapes.

Please vote (and get everyone else to vote) for:

http://jira.codehaus.org/browse/GROOVY-4809

and especially:

http://jira.codehaus.org/browse/GROOVY-4810

so as to get this problem resolved!

--
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:[hidden email]
41 Buckmaster Road    m: +44 7770 465 077   xmpp: [hidden email]
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder


Re: CPUs go crazy on large withPool

Vaclav
Administrator
Yes, everybody who likes to experiment with gpars snapshots, please vote.


On Tue, May 10, 2011 at 7:40 AM, Russel Winder <[hidden email]> wrote:
Please vote (and get everyone else to vote) for:

http://jira.codehaus.org/browse/GROOVY-4809

and especially:

http://jira.codehaus.org/browse/GROOVY-4810

so as to get this problem resolved!


Re: CPUs go crazy on large withPool

Dierk König
voted +1

On 10.05.2011 at 08:37, Vaclav Pech wrote:

Yes, everybody who likes to experiment with gpars snapshots, please vote.

