EachParallel and timeout

Davide Rossi
Hi,

I have a problem with eachParallel.

I have a Quartz job that at some point has to iterate through a map and execute a task on each element.
The task is usually short, but because it involves the network and the DB it can take a long time, hang, or throw any kind of exception.
The tasks are independent of each other: each one connects to some sites, fetches some information, processes it, and saves the results in a database.

I'm using an eachParallel closure, but I have a big problem: sometimes it stops or hangs (I don't know exactly what's happening, but the execution just doesn't end).
The problem is not only that the execution is suspended, but also that it sometimes resumes after some hours; by then job executions overlap, and it fails with a database connection timeout.
As a possible solution I was thinking of simply killing the thread after a timeout, to avoid hangs, DB locks, or whatever: how can I do this? It's not possible to set a timeout with eachParallel.
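
For illustration, one way to bound the run time, sketched in plain Java rather than GPars: `ExecutorService.invokeAll` accepts a timeout and cancels whatever has not finished by then. The names `TimedBatch` and `runAll` are made up for this sketch, and the timeout covers the whole batch, not each task separately. Cancellation relies on `Thread.interrupt()`, which typically does not unblock a thread stuck in a blocking socket read, so this limits how long the caller waits without necessarily freeing the stuck worker.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of GPars or Quartz.
class TimedBatch {
    // Runs the tasks in parallel and returns how many finished
    // (normally or by throwing) before the batch deadline.
    static int runAll(List<Callable<Void>> tasks, int threads, long timeoutMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Blocks until every task is done or the timeout expires;
            // tasks still running at that point are cancelled.
            List<Future<Void>> futures =
                    pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            int finished = 0;
            for (Future<Void> f : futures) {
                if (!f.isCancelled()) {
                    finished++;
                }
            }
            return finished;
        } finally {
            // Interrupts any worker that is still running.
            pool.shutdownNow();
        }
    }
}
```

A job could compare the returned count with `tasks.size()` and log or retry the items that were cancelled.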

Thanks
Davide

Re: EachParallel and timeout

BungleFeet
Hi Davide,

Just a thought, but are you using a connection pool to access the DB, and if so is it configured with enough connections?  The default configuration for many pools is to wait indefinitely for a connection to become available, so if your concurrent code is exhausting the connection pool, this may result in the behaviour you are seeing.
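
To make the difference between waiting indefinitely and waiting with a limit concrete, here is a toy pool in plain Java. It is only a sketch of the idea, not how any real pool is implemented; `BoundedPool`, `borrow`, and `giveBack` are invented names, and `maxWaitMillis` merely mimics the kind of setting a real pool exposes.

```java
import java.util.Collection;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Toy resource pool (hypothetical): callers wait at most maxWaitMillis.
class BoundedPool<T> {
    private final BlockingQueue<T> idle;
    private final long maxWaitMillis;

    // resources must be non-empty: the queue capacity is its size.
    BoundedPool(Collection<T> resources, long maxWaitMillis) {
        this.idle = new ArrayBlockingQueue<>(resources.size(), false, resources);
        this.maxWaitMillis = maxWaitMillis;
    }

    // Fails fast with an exception instead of blocking forever
    // when every resource is already borrowed.
    T borrow() throws InterruptedException, TimeoutException {
        T resource = idle.poll(maxWaitMillis, TimeUnit.MILLISECONDS);
        if (resource == null) {
            throw new TimeoutException("pool exhausted after " + maxWaitMillis + " ms");
        }
        return resource;
    }

    // Returns a borrowed resource so that other callers can use it.
    void giveBack(T resource) {
        idle.offer(resource);
    }
}
```

With a limit in place, an exhausted pool shows up as an exception in the log rather than as a silent hang.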

Regards,

Ewan 


Re: EachParallel and timeout

Davide Rossi
Thank you,

I will try changing the maxActive property to match the DB value and see whether that solves the problem. However, I already have the maxWait parameter set (maxWait = 1000), so I don't think this is the issue.

Davide

Re: EachParallel and timeout

BungleFeet
Davide,

A useful tool for debugging concurrent Java software is jvisualvm, which comes as part of the JDK. Using this tool, you can connect to the running JVM and create a thread dump. The stack trace produced for each thread will show you at which line of code the thread is waiting, which should help you figure out where the problem lies.
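
If attaching a GUI tool to the server is awkward, a similar dump can be produced from inside the application with the standard `java.lang.management` API. A minimal sketch follows; the class name `ThreadDump` is invented, and the output format is only a rough imitation of what jvisualvm shows.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper that renders all live threads as text.
class ThreadDump {
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip monitor and synchronizer details,
        // which not every JVM supports.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```

Logging `ThreadDump.dump()` when a job overruns its expected duration would capture where each worker is stuck at that moment.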

Cheers,
Ewan


Re: EachParallel and timeout

Davide Rossi
Hi, thanks for the answer.

After looking at the thread dump for a while I now understand the problem but cannot find a way to fix it.
The problem is the following: multiple threads are crawling some websites and saving the results in the DB. There are around 25 to 30 threads running concurrently. They delete some old data in the DB and insert the new data. Each thread works on different rows of the same tables.
Sometimes (it happens randomly) a thread hangs, with the following stack trace:

java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
- locked <7ddb75670> (a java.io.BufferedInputStream)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3049)
at com.mysql.jdbc.MysqlIO.readPacket(MysqlIO.java:597)
at com.mysql.jdbc.MysqlIO.doHandshake(MysqlIO.java:1084)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2483)
at com.mysql.jdbc.ConnectionImpl.connectWithRetries(ConnectionImpl.java:2324)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2306)
- locked <7ddb81680> (a com.mysql.jdbc.JDBC4Connection)
at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:834)
at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
at sun.reflect.GeneratedConstructorAccessor179.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.springsource.loaded.ri.ReflectiveInterceptor.jlrConstructorNewInstance(ReflectiveInterceptor.java:963)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:416)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)
at org.apache.commons.dbcp.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:38)
...

So it seems the thread is blocked on some kind of read.

Has anyone ever seen this? Any ideas?

Thank you
Davide


Re: EachParallel and timeout

chiquitinxx
Hello!

I'm not an expert, so these are just some ideas. Maybe the connection pool manager is not working properly; I doubt it is thread-safe, so it may sometimes hand the same connection to two threads.

So I can think of three solutions: don't use a pool at all and just create a connection each time you access the database; use a different pool manager; or write your own awesome thread-safe pool manager :) maybe with actors.

Re: EachParallel and timeout

Russel Winder-3
In reply to this post by Davide Rossi
On Thu, 2013-03-28 at 13:15 +0100, Davide Rossi wrote:
> Hi, thank for the answer.
>
> after looking at the thread dump for a while I now understand the problem
> but cannot find a way to fix it.
> The problem is the following: multiple threads are crawling some websites
> and are saving the results in the DB. There are around 25 / 30 threads
> running concurrently. They delete some old data in the DB and insert the
> new one. Each thread is working on different rows of the same tables.
> Sometimes (it happens random) a thread hangs, with the following exception:

My immediate reaction is to wonder whether this is a GPars problem at all,
rather than a database driver problem. My guess is that in your case the
JDBC driver is not totally thread-safe. Unless you know you are using a
database and driver that can handle parallelism, I suggest you use an
agent in the application code to act as a central proxy for the
database driver. Hacky, but it works.
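
As a rough illustration of the central-proxy idea, here is a plain Java stand-in that uses a single-thread executor instead of a GPars agent: every caller hands its database work to one dedicated thread, so the driver is never entered concurrently. `SerialGateway` and `call` are invented names, and the work passed in is assumed to be whatever JDBC code the application already has.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical gateway: serializes all submitted work on one thread.
class SerialGateway implements AutoCloseable {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Runs dbWork on the single worker thread and waits for its result.
    // Calls from many threads are queued and executed one at a time.
    <T> T call(Callable<T> dbWork) throws ExecutionException, InterruptedException {
        return worker.submit(dbWork).get();
    }

    // Stops accepting new work; already queued work still runs.
    @Override
    public void close() {
        worker.shutdown();
    }
}
```

The trade-off is throughput: the crawling still runs in parallel, but all database access is funnelled through one thread.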

--
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:[hidden email]
41 Buckmaster Road    m: +44 7770 465 077   xmpp: [hidden email]
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder

Re: EachParallel and timeout

Davide Rossi
Thanks to both of you for your answers. Yes, it is actually a DB driver problem.
This problem had been coming and going for a few months and was not a big issue, but now that there are more concurrent threads it has become critical.

It seems a few people have had the same problem over the last few years, but nobody could identify the real cause: people blamed the DB (MySQL), the driver (Connector/J), the pool manager (DBCP), or a combination of all of these and the OS (Linux).
The underlying problem is that for some reason the read operation performed by the driver hangs, maybe because the actual size of the buffer differs from what is expected, or something similar.

So it's not a GPars problem but a connection problem (I still wonder why it started when I migrated from plain Java threads to GPars; maybe GPars just made it more evident...).

The only working solution I could find is a really dirty one, but it's working so far: since I cannot interrupt the hung thread, I set a socket timeout on the connection, so that an exception is raised. Not elegant, but it works.
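
The mechanism can be reproduced with nothing but `java.net`: a read on a socket whose peer never answers blocks indefinitely unless a timeout is set with `setSoTimeout`. The sketch below is an illustration only and does not touch MySQL; `ReadTimeoutDemo` is an invented name. With Connector/J the equivalent knobs are, as far as I know, the `connectTimeout` and `socketTimeout` connection properties (both in milliseconds), e.g. `jdbc:mysql://host/db?connectTimeout=5000&socketTimeout=30000`, where the host, database, and values are placeholders.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo: a read against a silent peer, bounded by a timeout.
class ReadTimeoutDemo {
    // Returns true if the read was abandoned because the timeout expired.
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        // The server socket lets the connection complete (via its backlog)
        // but never sends a byte, like a peer that has gone quiet.
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silent.getInetAddress(), silent.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // 0 would mean "wait forever"
            try {
                client.getInputStream().read();
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        }
    }
}
```

The timeout turns an invisible hang into an exception that the task can catch, log, and retry.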

Davide
