Reasonable to use GPars to split one massive SQL query into n parallel ones?


soda

Hi, I've been reading lots of the GPars docs and samples but have yet to find a straightforward example of this.

My problem is the following: I have a query that runs in a JMS listener. It's a BIG query; it takes hours to run. It involves finding the top 3 most similar users for every user in my schema and then storing them somewhere, a bit like computing the results for Twitter's 'Suggested people to follow'.

I'm not using Hadoop or any of that jazz yet, merely a Hibernate session on top of a datasource to a single MySQL database.

So, is it possible or reasonable to use GPars to break my query into n smaller queries, each on a subset of the database (i.e. each query having a 'where id >= :minId and id <= :maxId' clause), have all these queries execute in parallel, and then do a reduce on the multiple result sets that the queries return? Or does this actually just put the same load on the DB at the end of the day?
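
To make the idea concrete, here is a rough sketch of the shape I have in mind. It is written in plain Java with java.util.concurrent rather than GPars, and it is only an illustration under assumptions: the names RangeSplit, split, runInParallel and fakeQuery are made up for this post, and fakeQuery is a stand-in that just returns the ids in its range where the real code would run the Hibernate query with :minId and :maxId bound.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiFunction;

public class RangeSplit {

    // Split the inclusive id range [minId, maxId] into at most n contiguous chunks.
    static List<long[]> split(long minId, long maxId, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        List<long[]> ranges = new ArrayList<>();
        long total = maxId - minId + 1;
        long size = (total + n - 1) / n; // ceiling division
        for (long lo = minId; lo <= maxId; lo += size) {
            ranges.add(new long[] {lo, Math.min(lo + size - 1, maxId)});
        }
        return ranges;
    }

    // Run one query per chunk on a fixed-size pool, then concatenate the
    // partial results in chunk order (the "reduce" step).
    static <T> List<T> runInParallel(List<long[]> ranges, int threads,
                                     BiFunction<Long, Long, List<T>> query) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<T>>> futures = new ArrayList<>();
            for (long[] r : ranges) {
                Callable<List<T>> task = () -> query.apply(r[0], r[1]);
                futures.add(pool.submit(task));
            }
            List<T> merged = new ArrayList<>();
            for (Future<List<T>> f : futures) {
                merged.addAll(f.get()); // blocks until that chunk has finished
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    // Hypothetical stand-in for the real query: returns the ids in the range.
    // The real version would execute "... where id >= :minId and id <= :maxId".
    static List<Long> fakeQuery(long minId, long maxId) {
        List<Long> ids = new ArrayList<>();
        for (long id = minId; id <= maxId; id++) {
            ids.add(id);
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        List<long[]> ranges = split(1, 10, 3);
        List<Long> merged = runInParallel(ranges, 3, RangeSplit::fakeQuery);
        System.out.println(ranges.size() + " chunks, " + merged.size() + " rows merged");
    }
}
```

In this sketch the pool size is the only thing that limits how many of the sub-queries hit MySQL at the same time, which is exactly the part I am unsure about.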

thanks