Target CPU run time

Message boards : Number crunching : Target CPU run time

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,820,785
RAC: 12,438
Message 101253 - Posted: 12 Apr 2021, 12:16:55 UTC

Should I make "Target CPU run time" different for different machines? Is it preferable from the project's point of view to have my slower computers run for longer than my faster computers so they produce a similar amount of data? The CPUMarks of the computers per thread are:

Ryzen: 1383
Glass: 1711
Xeon1, Xeon2: 467
Picard: 292
Black: 507
Laptop: 197
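
To make the question concrete, here is a rough Python sketch of how long each machine would have to run to match what the fastest one does in the 8-hour default run time mentioned later in the thread. It assumes throughput scales linearly with per-thread CPUMark, which is only an approximation and not something the project states; the function name `matching_hours` is made up for illustration.

```python
# Per-thread CPUMark scores quoted in the post above.
cpumarks = {
    "Ryzen": 1383,
    "Glass": 1711,
    "Xeon1": 467,
    "Xeon2": 467,
    "Picard": 292,
    "Black": 507,
    "Laptop": 197,
}

DEFAULT_HOURS = 8.0  # project default Target CPU run time


def matching_hours(marks, base_hours=DEFAULT_HOURS):
    """Hours each machine would need to match the fastest machine's
    output in base_hours, ASSUMING work done is proportional to CPUMark."""
    fastest = max(marks.values())
    return {name: base_hours * fastest / score for name, score in marks.items()}


hours = matching_hours(cpumarks)
for name, h in sorted(hours.items(), key=lambda kv: kv[1]):
    print(f"{name:7s} {h:6.1f} h")
```

Under that assumption the slowest machine would need close to three days per task, which gives a sense of the scale behind the question.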
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1679
Credit: 17,830,136
RAC: 22,939
Message 101271 - Posted: 13 Apr 2021, 7:34:20 UTC - in response to Message 101253.  

Should I make "Target CPU run time" different for different machines? Is it preferable from the project's point of view to have my slower computers run for longer than my faster computers so they produce a similar amount of data?
Nope.
While a faster CPU will do more work than a slower one within the same Target CPU time period, the project determined that 8 hours on even a lower-end CPU gives useful results. The extra work a faster CPU does is a nice bonus, and such systems get more Credit for that extra work done.
Plus you may run into deadline issues. The Estimated completion time is fixed regardless of the Target CPU time selected, so slower systems download more work than they can actually do in the extended Target CPU time period. Running multiple projects with differing runtimes on different systems will also lead to deadline issues.
Keeping a small (or better yet 0) cache, sticking with the project's Target CPU time default, and just letting BOINC work things out from there is the best way to avoid annoying issues.
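
The deadline point can be illustrated with a toy calculation in Python. This is a deliberately simplified model, not BOINC's actual work-fetch logic: it assumes the client fills its cache per thread using the fixed 8-hour estimate, while each task really runs for the selected Target CPU time. The 24-hour cache, the 36-hour target and the function names are hypothetical values chosen only for the example.

```python
import math


def tasks_fetched(cache_hours, estimated_hours):
    """Tasks one thread is handed to fill the cache, judged by the
    fixed estimate (toy model; at least one task is always running)."""
    return max(1, math.ceil(cache_hours / estimated_hours))


def hours_to_clear(cache_hours, estimated_hours, target_hours):
    """Real time one thread needs to finish everything it fetched."""
    return tasks_fetched(cache_hours, estimated_hours) * target_hours


ESTIMATE = 8.0  # fixed Estimated completion time, in hours

# Default target: the cache is sized correctly.
print(hours_to_clear(24.0, ESTIMATE, 8.0))   # 24.0

# Hypothetical extended target on a slow machine: same number of tasks
# fetched, but they take far longer than the cache was meant to hold.
print(hours_to_clear(24.0, ESTIMATE, 36.0))  # 108.0

# With a zero cache only the running task is at risk.
print(hours_to_clear(0.0, ESTIMATE, 36.0))   # 36.0
```

In this model the gap between estimate and real run time grows with the cache size, which is why a small or zero cache sidesteps the problem.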
Grant
Darwin NT
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,820,785
RAC: 12,438
Message 101278 - Posted: 13 Apr 2021, 12:20:02 UTC - in response to Message 101271.  

Ok, I'll leave it as is. 8 hour default run time, 0+3 hour cache, so tasks start pretty much immediately.

I'm interested in how the models within the tasks work though. Let's say I get handed some tasks, each containing 20 models. My slower machines do 5 of those models in each task, my faster machines do 15. What happens to the ones that weren't done? Are they collected and handed out in other tasks? Or are all the models I get in one task pretty similar, and they just need an average of x models per task?
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1679
Credit: 17,830,136
RAC: 22,939
Message 101298 - Posted: 14 Apr 2021, 5:27:33 UTC - in response to Message 101278.  

I'm interested in how the models within the tasks work though. Let's say I get handed some tasks, each containing 20 models. My slower machines do 5 of those models in each task, my faster machines do 15. What happens to the ones that weren't done? Are they collected and handed out in other tasks? Or are all the models I get in one task pretty similar, and they just need an average of x models per task?
Only the project staff can answer that.

My Wild Arse Guess: any particular Task can do thousands of possible models (they use a Random value to start from for each Task). Due to their complexity some Tasks might only produce 1 Decoy after 8 hours of processing on a low-end system, but produce 2, 10 or whatever Decoys on a higher-end system. Simpler Tasks might produce hundreds of Decoys on that same low-end system, but thousands on a higher-end system.
Even on a low-end system, you get some Tasks that run for less than 8 hours; they get to a point where they can't produce any more useful Decoys. Other Tasks could run for days & days on a high-end system, and still not come close to exhausting the number of Decoys they can produce.

The researchers put out a batch of Tasks, each with its own slight variation from the others, and then the Rosetta programme processes the data to see if something useful can be made from those Tasks. If not, then they know what direction not to go in. If useful information is returned, then they know to continue in that direction & will put out more Tasks to further develop along that line.

There is an almost limitless number of possible combinations, but only a small number of them will produce something that is physically possible & of use. But even that small number of possible models is a mind-bogglingly large number of possibilities, and so it takes a lot of computational work to further develop them to give useful results (but as I said, that's just my WAG/speculation).
Grant
Darwin NT
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 11,820,785
RAC: 12,438
Message 101305 - Posted: 14 Apr 2021, 17:34:07 UTC - in response to Message 101298.  

I sort of understand that. But... if the task I'm running on a fast machine produces 20 decoys, and if it had run on a slow machine it would have produced 4 decoys, then why do they need those extra 16?
Millenium

Joined: 20 Sep 05
Posts: 68
Credit: 184,283
RAC: 0
Message 101306 - Posted: 14 Apr 2021, 19:16:41 UTC

Nonono, it does not work like that!

A WU crunches as many decoys as possible of the protein that the WU represents. A decoy is a 3D structure; the correct term is "tertiary structure", meaning how the amino acid sequence folds.
So, you have the WU, which represents a single protein, with its amino acid sequence.
Then it computes a decoy, which models its 3D structure. Each decoy is thus a slightly different 3D structure. The best one is the one with the lowest energy.
So, no, a task does not contain "20 models". It just contains the protein's amino acid sequence. It then starts crunching decoys until the target time is reached, finishes the decoy currently in progress, and sends all the decoys to the server. If a protein is small, a decoy does not take long to complete, and in 8 hours a WU can complete a lot of them. If it's bigger, of course it takes more time, sometimes a single decoy per WU if the protein is particularly big and complex.
The project thus receives thousands and thousands of decoys from all the users. The best one is the one with the lowest energy, which is the one most similar to the real structure. More decoys, more chances to get a better simulation!

This is what Rosetta does: predict the protein's tertiary structure, which is very important for how the protein works, what it does and so on.
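
The loop described above can be sketched in Python as a small simulation. It is not Rosetta code: the clock is simulated, the energy scores are random made-up numbers, and `run_work_unit` and its parameters are hypothetical names. It only shows the control flow, which is to keep starting decoys until the target time is reached, let the decoy in progress finish, and have the server keep the lowest-energy decoy from everything returned.

```python
import random


def run_work_unit(target_seconds, seconds_per_decoy, seed):
    """Simulate one WU: returns (list of decoy energies, CPU seconds used)."""
    rng = random.Random(seed)  # each task starts from its own random value
    elapsed = 0.0
    energies = []
    # A new decoy is started only while under the target time;
    # once started, it always runs to completion.
    while elapsed < target_seconds:
        elapsed += seconds_per_decoy
        energies.append(rng.uniform(-300.0, -100.0))  # made-up energy score
    return energies, elapsed


TARGET = 8 * 3600  # 8-hour Target CPU run time, in seconds

# Small protein: 10 simulated minutes per decoy.
small, t_small = run_work_unit(TARGET, 600, seed=1)
# Big protein: one decoy takes longer than the whole target time.
big, t_big = run_work_unit(TARGET, 40000, seed=2)

print(len(small), "decoys in", t_small / 3600, "h")
print(len(big), "decoy in", round(t_big / 3600, 1), "h")

# Server side: pool decoys from all hosts and keep the lowest energy.
all_decoys = small + big
best = min(all_decoys)
print("best energy:", round(best, 2))
```

In this sketch a faster machine simply means a smaller `seconds_per_decoy`, so it contributes more samples to the pool rather than leaving part of a fixed list unfinished.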




©2024 University of Washington
https://www.bakerlab.org