Minirosetta 3.52

JugNut

Joined: 30 Apr 12
Posts: 11
Credit: 2,437,453
RAC: 0
Message 77874 - Posted: 29 Jan 2015, 13:50:45 UTC

Not sure what's going on here: https://boinc.bakerlab.org/rosetta/result.php?resultid=713895694
I have my WUs set to 1 hour, but this WU ran for 5.1 hrs. Is this behavior normal for some types of tasks?
On a side note, besides the long run time there is the small credit granted: instead of getting 5 times more credit, as was claimed, the WU received 5 times less. This certainly isn't the first time I've seen this happen, but thankfully it only seems to happen a few times a day. So is it normal?

Any ideas?

TIA
ID: 77874 · Rating: 0
JugNut

Joined: 30 Apr 12
Posts: 11
Credit: 2,437,453
RAC: 0
Message 77875 - Posted: 29 Jan 2015, 16:44:48 UTC
Last modified: 29 Jan 2015, 17:13:09 UTC

Sorry, my bad: it's not the above link, it's this one: https://boinc.bakerlab.org/rosetta/result.php?resultid=713894281

@ P . P . L: I have many of those validate errors too, but you still end up getting credited for them in the end. It's considered a normal part of the process. The way I understand it, these WUs are given credit by a script that runs once every 24 hrs, but the credit doesn't show up in your results in the normal spot. If you wait 24 to 48 hrs and then click the task details link, you'll see right at the very bottom that they did get credited eventually.

Like this one of yours: https://boinc.bakerlab.org/rosetta/result.php?resultid=713405534
ID: 77875 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2145
Credit: 41,560,787
RAC: 9,320
Message 77879 - Posted: 30 Jan 2015, 7:33:20 UTC - in response to Message 77875.  

> Sorry, my bad: it's not the above link, it's this one: https://boinc.bakerlab.org/rosetta/result.php?resultid=713894281

At the end of the log it looks like the watchdog had to force the task to shut down - the watchdog is a kind of fail-safe if something goes wrong with the task and it doesn't complete properly.

It happens very occasionally. Looks like you were unlucky with that one.

ID: 77879 · Rating: 0
JugNut

Joined: 30 Apr 12
Posts: 11
Credit: 2,437,453
RAC: 0
Message 77880 - Posted: 30 Jan 2015, 12:41:46 UTC
Last modified: 30 Jan 2015, 12:44:03 UTC

Thanks for answering, Sid. You're right; luckily there only seem to be about 4 or 5 a day, but it adds up, especially when for some reason they only get credited with a fraction of what they should. Also, it's hard to tell exactly how many there are, as I would have to search through hundreds of tasks each day to find them.

Rosetta's task viewing leaves much to be desired. As you'd know, on other projects you can click on, say, errors and get a list of errors, or search for a particular task name. That would be a big help here.

PS: I've just noticed that, out of the blue, I'm having compute errors like this https://boinc.bakerlab.org/rosetta/result.php?resultid=714075014 on one of my PCs. It's exactly the same error as described in this earlier post: https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6444&nowrap=true#76816
So far I've had more than a dozen of these over the last 4 hrs or so. The strange thing is that many of them end up getting validated by the next guy along. Not sure if that makes any difference, but in most cases where the task does get validated, it was by someone using Linux. Just a thought. I'll keep digging into it.

Thanks again..
ID: 77880 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2145
Credit: 41,560,787
RAC: 9,320
Message 77884 - Posted: 2 Feb 2015, 4:45:21 UTC - in response to Message 77880.  

> Thanks for answering, Sid. You're right; luckily there only seem to be about 4 or 5 a day, but it adds up, especially when for some reason they only get credited with a fraction of what they should. Also, it's hard to tell exactly how many there are, as I would have to search through hundreds of tasks each day to find them.
>
> Rosetta's task viewing leaves much to be desired. As you'd know, on other projects you can click on, say, errors and get a list of errors, or search for a particular task name. That would be a big help here...

You're right on that last bit about sorting tasks by error etc. 4 or 5 a day sounds a lot, so I just had a look at your machines and tasks and... holy moly!

Why is it you have really fast computers but you run with just a 1hr runtime? You have near 1000 tasks per machine either complete or in progress! That must be taking up massive bandwidth at both ends. In the context of 1000s, the occasional few tasks going wrong is trivial. I thought 4-5 would be a lot.
ID: 77884 · Rating: 0
JugNut

Joined: 30 Apr 12
Posts: 11
Credit: 2,437,453
RAC: 0
Message 77885 - Posted: 3 Feb 2015, 11:14:55 UTC
Last modified: 3 Feb 2015, 11:58:12 UTC

Hi Sid,
Thank you again for your reply. The reason I use 1hr is simply because it credits the most, or at least it certainly seems to. While credits are nowhere near the top of my list of reasons for crunching, they are, as for most others, a side interest. On different occasions I checked other PCs with rigs similar to mine, and those that were using longer runtimes than me never, on average, got credit equal to what I was getting. And I figure, since I'm helping anyway, what does it matter? After all, if using the 1hr option was bad, why would it still be an option?

If it was imperative to get crunchers to use longer times then there would be an advantage for them to do so; at the moment there isn't. A simple way to achieve this, if it is indeed a project necessity, would be to offer crunchers a bonus for crunching longer times, for the extra risk and commitment involved in doing so. Things go pear-shaped here more than on most other projects. Other projects give bonuses for quick return and for doing long tasks, so it could be done here too.

Plus, the extra overhead at my end seems negligible compared with running longer units, although I didn't do a thorough check when I last used the longer times, so I could be wrong about it. Of course, if it became a necessity for the project's good then I would oblige happily. As for the errors and problems I had: if I'm having them, then there could well be who knows how many others with such errors, so I thought they would be worth reporting, especially since the majority of crunchers don't use the forums at all and, when they find too many errors, will just move on.

Crunch-on Cheers Greg.
ID: 77885 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 77886 - Posted: 3 Feb 2015, 17:19:33 UTC

Short runtimes simply report and claim credit sooner. The level of runtime and network overhead is reduced by longer runtimes. Credit is very hard to compare, as different tasks can have different performance characteristics. There is some level of overhead just opening up the zip files and reference data that is used by a task, so the fewer times you do that in a day, the less overhead in the processing. Longer runtimes should be a smidge more efficient. They also reduce the number of tasks on your pending and completed lists, and reduce the number of hits to the project servers. I don't think anyone intended to imply a long runtime was "imperative", just that it may offer some benefits for you by reducing the overall number of tasks, disk space requirements, etc. The underlying work results are the same, so there is no premium either way for return time or run length. The choice is there to help adapt to various usage scenarios.
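
To put a rough shape on the overhead point, here is a minimal Python sketch. The 60-second startup cost is purely a hypothetical figure chosen for illustration (the post gives no number), and the function names are made up for this example.

```python
# Hypothetical figure: seconds spent unpacking zip files and reference
# data each time a task starts. Not a measured value.
ASSUMED_STARTUP_SECONDS = 60

def startup_overhead_fraction(runtime_hours, startup_seconds=ASSUMED_STARTUP_SECONDS):
    """Share of one task's run time that goes to per-task startup work."""
    return startup_seconds / (runtime_hours * 3600)

def startup_seconds_per_day(runtime_hours, startup_seconds=ASSUMED_STARTUP_SECONDS):
    """Startup work accumulated by one core crunching back-to-back tasks for 24 h."""
    tasks_started = 24 / runtime_hours
    return tasks_started * startup_seconds

for hours in (1, 6, 24):
    print(f"{hours:>2} h runtime: "
          f"{startup_overhead_fraction(hours):.2%} overhead per task, "
          f"{startup_seconds_per_day(hours):.0f} s of startup per core per day")
```

Whatever the real startup cost is, it is paid once per task, so its share shrinks in proportion to the runtime. That matches the "smidge more efficient" description above.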

BEWARE: changes to the runtime preference will affect tasks currently on your machine, and BOINC has to crunch a few with the new runtime preference before it accurately factors it into its future work requests. So ideally you reduce your work buffer, and change the runtime preference gradually over the course of a week. Then bump the buffer of work back up as desired. Also, there currently seems to be an issue with the 2 day preference, so I suggest using 1 day.
Rosetta Moderator: Mod.Sense
ID: 77886 · Rating: 0
JugNut

Joined: 30 Apr 12
Posts: 11
Credit: 2,437,453
RAC: 0
Message 77889 - Posted: 4 Feb 2015, 2:23:43 UTC
Last modified: 4 Feb 2015, 2:40:56 UTC

No worries, Mod.Sense, I'll give that some thought and also try some longer WUs later to see what they're like now.
Thanks for your time.

Greg
ID: 77889 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2145
Credit: 41,560,787
RAC: 9,320
Message 77890 - Posted: 4 Feb 2015, 5:02:05 UTC - in response to Message 77885.  

I didn't mean to make a big deal about bandwidth (though it is a side issue). I just meant it was such a chore to work through your task lists to see what issue you were having. 4 or 5 errors is a lot with the default 6hr runtimes, but at 1hr (with up to 16 cores running at a time) that's 4 or 5 out of 384 tasks a day, not 48.
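
The 384 figure can be checked with a couple of lines of Python. This is only a sketch of the arithmetic; the helper names are invented for the example, and it assumes every core crunches around the clock.

```python
def tasks_per_day(cores, runtime_hours):
    """Tasks finished per day if every core runs back-to-back tasks, 24 h a day."""
    return cores * 24 / runtime_hours

def error_rate(errors_per_day, total_per_day):
    """Fraction of the day's tasks that went wrong."""
    return errors_per_day / total_per_day

daily = tasks_per_day(cores=16, runtime_hours=1)
print(f"{daily:.0f} tasks a day on 16 cores at 1 hr")
print(f"5 errors out of {daily:.0f}: {error_rate(5, daily):.1%}")
# The same 5 errors against the 48-task figure mentioned in the post:
print(f"5 errors out of 48: {error_rate(5, 48):.1%}")
```

So the same handful of bad tasks is around one percent of the daily total at 1hr, against roughly ten percent if the daily total were only 48.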

For what it's worth, I did see someone experiment with different runtimes on a machine and the differences were barely perceptible, with just the slightest advantage to longer runtimes (nothing conclusive either way though). On the back of that, also with the bandwidth usage in mind, I admit, I decided to change from the default to 8hrs, but it's completely down to you.

If you're looking to maximise credit, I guess it's worth bearing in mind what happens if you have a rogue task like the one you first reported. The watchdog cuts in at runtime +4hrs, so a 1hr task that over-runs by 4hrs costs you 5 tasks' worth of processing, whereas a default 6hr task will run for 10hrs, only costing 1.67 tasks' worth of processing. This is very much splitting hairs though. Whatever suits you.
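
For anyone who wants to try other runtimes, the same sum as a small Python sketch. The 4-hour grace period is the figure quoted above; the function name is made up for illustration.

```python
WATCHDOG_GRACE_HOURS = 4  # "watchdog cuts in at runtime +4hrs", per the post

def processing_cost_in_tasks(target_hours, grace_hours=WATCHDOG_GRACE_HOURS):
    """Wall time used by a watchdog-terminated task, in units of normal tasks."""
    return (target_hours + grace_hours) / target_hours

for target in (1, 6, 8):
    print(f"{target} hr target: rogue task runs {target + WATCHDOG_GRACE_HOURS} hrs, "
          f"costing {processing_cost_in_tasks(target):.2f} tasks' worth")
```

The absolute over-run is the same 4 hours in every case; only its size relative to a normal task changes.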

As Mod.Sense says, don't make a dramatic change. Either run down tasks first before switching and/or only change runtime by one step at a time. If 1000 tasks at 1hr suddenly became 1000 at 6hrs you'd have a problem!
ID: 77890 · Rating: 0
JugNut

Joined: 30 Apr 12
Posts: 11
Credit: 2,437,453
RAC: 0
Message 77891 - Posted: 4 Feb 2015, 8:14:05 UTC

Hi Sid,
It was me who had the wrong slant on things, probably from skimming posts that I read before I read yours. Thanks again for your help. I hope I can return the favour some day.

Cheers Greg
ID: 77891 · Rating: 0
Jesse Viviano

Joined: 14 Jan 10
Posts: 42
Credit: 2,700,472
RAC: 0
Message 77893 - Posted: 4 Feb 2015, 20:24:12 UTC
Last modified: 4 Feb 2015, 20:26:07 UTC

Work unit 647152330 generated result files that were too big to upload when the work unit processing time limit is set to 24 hours. Please see my result log and the result log for someone who used a shorter work unit time limit.
ID: 77893 · Rating: 0
Jesse Viviano

Joined: 14 Jan 10
Posts: 42
Credit: 2,700,472
RAC: 0
Message 77895 - Posted: 5 Feb 2015, 4:36:37 UTC - in response to Message 77893.  

I found the relevant BOINC event log entries by digging into the appropriate BOINC data directory. By default, this file is located at C:\ProgramData\BOINC\stdoutdae.old in Windows 7. The BOINC event log entries are listed below.
02-Feb-2015 13:14:03 [rosetta@home] Computation for task A__2_2015_01_29_B__2_2015_01_29_patchdock_split_02_150129_SAVE_ALL_OUT__242418_37_0 finished
02-Feb-2015 13:14:03 [rosetta@home] Output file A__2_2015_01_29_B__2_2015_01_29_patchdock_split_02_150129_SAVE_ALL_OUT__242418_37_0_0 for task A__2_2015_01_29_B__2_2015_01_29_patchdock_split_02_150129_SAVE_ALL_OUT__242418_37_0 exceeds size limit.
02-Feb-2015 13:14:03 [rosetta@home] File size: 65833683.000000 bytes.  Limit: 50000000.000000 bytes

I will therefore have to change my preference to 12-hour work units once my current work units drain out, to prevent this error, unless the file upload size limit is raised.
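
As a rough check on the numbers in that last log line, here is a small Python sketch that pulls out the size and the limit. The regular expression is written only against the line quoted above, and the runtime estimate rests on an unverified assumption that output size grows linearly with run time.

```python
import re

# Last event-log line quoted in the post above.
LOG_LINE = ("02-Feb-2015 13:14:03 [rosetta@home] File size: 65833683.000000 bytes."
            "  Limit: 50000000.000000 bytes")

SIZE_LIMIT = re.compile(r"File size: ([\d.]+) bytes\.\s+Limit: ([\d.]+) bytes")

def parse_size_and_limit(line):
    """Return (size_bytes, limit_bytes) from a size-limit log line, or None."""
    match = SIZE_LIMIT.search(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))

def max_runtime_hours(runtime_hours, size_bytes, limit_bytes):
    """ASSUMPTION: output size scales linearly with run time (not confirmed)."""
    return runtime_hours * limit_bytes / size_bytes

size, limit = parse_size_and_limit(LOG_LINE)
print(f"over the limit by {size - limit:,.0f} bytes ({size / limit:.2f}x the limit)")
print(f"longest run time that would fit: {max_runtime_hours(24, size, limit):.1f} h")
```

If the linear assumption is anywhere near right, the ceiling for this kind of task sits at roughly 18 hours, so a 12-hour preference would leave some margin.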
ID: 77895 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 2145
Credit: 41,560,787
RAC: 9,320
Message 77899 - Posted: 6 Feb 2015, 3:01:11 UTC - in response to Message 77895.  

Blimey! That's a new one! I've never come across an output file that big and I never knew there was a limit to the filesize either.
ID: 77899 · Rating: 0
alvin

Joined: 19 Jul 15
Posts: 5
Credit: 6,550,555
RAC: 0
Message 78496 - Posted: 26 Jul 2015, 2:29:28 UTC

Do you do GPU crunching or plan to?
Do you support NVidia and/or ATI?
ID: 78496 · Rating: 0
P . P . L .

Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 78498 - Posted: 26 Jul 2015, 4:26:25 UTC
Last modified: 26 Jul 2015, 4:27:14 UTC

Short answer to both: NO.
ID: 78498 · Rating: 0
Sandman192

Joined: 22 Sep 07
Posts: 16
Credit: 2,018,819
RAC: 0
Message 99284 - Posted: 9 Oct 2020, 1:28:01 UTC

Ever since Rosetta added a change so you can set "Target CPU run time" (TCPU), I have stopped getting work. I changed TCPU to 2 hours and still got no work. I changed it to a day and a half and still no work.

I get this message:
10/8/2020 8:23:22 PM | Rosetta@home | Sending scheduler request: To fetch work.
10/8/2020 8:23:22 PM | Rosetta@home | Requesting new tasks for CPU
10/8/2020 8:23:23 PM | Rosetta@home | Scheduler request completed: got 0 new tasks
10/8/2020 8:23:23 PM | Rosetta@home | No tasks sent
10/8/2020 8:23:23 PM | Rosetta@home | Tasks won't finish in time: BOINC runs 99.0% of the time; computation is enabled 99.9% of that
10/8/2020 8:23:23 PM | Rosetta@home | Project requested delay of 31 seconds


I have 2 computers giving me this, which I'd call an error.
ID: 99284 · Rating: 0
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1731
Credit: 18,495,428
RAC: 20,518
Message 99287 - Posted: 9 Oct 2020, 9:11:35 UTC - in response to Message 99284.  

> Ever since Rosetta added a change so you can set "Target CPU run time" (TCPU), I have stopped getting work. I changed TCPU to 2 hours and still got no work. I changed it to a day and a half and still no work.
With the number of projects you are running, and the low priority for Rosetta work, your best chance of getting work is to set the Target CPU time to 2 hours and to set your cache to 0 days.
Computing preferences, other:
Store at least 0.00 days of work
Store up to an additional 0.01 days of work


Save the changes, then click Update in the BOINC Manager on each computer so that it picks up those changes.
After a while (it could be several hours, depending on the work you presently have running, especially for the Q8600 system) they should start to get some Rosetta work occasionally. Increase the Resource share value for Rosetta if you want them to do more Rosetta work.
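
The advice above is about the website preferences. As a side note, the same two cache values can also be set per machine through BOINC's local override file, global_prefs_override.xml, in the BOINC data directory. The Python sketch below only builds the XML text; the element names are my understanding of that file's format, so check them against the BOINC documentation before relying on them. The function name is made up for the example.

```python
import xml.etree.ElementTree as ET

def build_cache_override(min_days, additional_days):
    """Build the XML text for a local override of the two work-cache settings.

    Element names are assumed to match BOINC's global_prefs_override.xml.
    """
    root = ET.Element("global_preferences")
    # "Store at least N days of work"
    ET.SubElement(root, "work_buf_min_days").text = f"{min_days:.2f}"
    # "Store up to an additional N days of work"
    ET.SubElement(root, "work_buf_additional_days").text = f"{additional_days:.2f}"
    return ET.tostring(root, encoding="unicode")

# The values suggested in the post: 0.00 days plus an additional 0.01 days.
print(build_cache_override(0.0, 0.01))
```

Nothing is written to disk here on purpose. Saving the output into the data directory and getting the client to re-read its local preferences is left to the reader.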
Grant
Darwin NT
ID: 99287 · Rating: 0
Sandman192

Joined: 22 Sep 07
Posts: 16
Credit: 2,018,819
RAC: 0
Message 99298 - Posted: 10 Oct 2020, 0:10:11 UTC - in response to Message 99287.  

> With the number of projects you are running, and the low priority for Rosetta work,

Rosetta is too low priority??? It's set at 1000%. How is that low???

Why should I set "Store at least 0.00 days and additional 0.01 days of work"?
Maybe I want 10 days of work for both. And I want to get a day and a half of work. What's the point of giving these options if you can't use them?

Are you saying I can't use an option that BOINC and Rosetta give me to use?

I had no problems with Prime when it came to a WU that took a week to finish with 100% Priority.

If Prime can do it then Rosetta needs to fix it the same way.

Sounds like a bug somewhere.
ID: 99298 · Rating: 0
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1731
Credit: 18,495,428
RAC: 20,518
Message 99299 - Posted: 10 Oct 2020, 1:12:48 UTC - in response to Message 99298.  
Last modified: 10 Oct 2020, 1:16:39 UTC

> Rosetta is too low priority??? It's set at 1000%. How is that low???
It is not 1000%.
It is a ratio, compared to the values you have set for the other projects.
And being connected to World Community Grid will result in odd behaviour, as they don't honour Resource share settings in the same way as all other BOINC projects do.
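
To show what "a ratio" means in practice, a small Python sketch. The project names and share values below are hypothetical, since the other projects' actual settings are not given in this thread.

```python
def share_fractions(resource_shares):
    """Turn raw Resource share values into each project's fraction of the total."""
    total = sum(resource_shares.values())
    return {name: value / total for name, value in resource_shares.items()}

# Hypothetical settings: Rosetta at 1000, but so are three other projects.
example = {
    "Rosetta@home": 1000,
    "Project B": 1000,
    "Project C": 1000,
    "Project D": 1000,
}

for name, fraction in share_fractions(example).items():
    print(f"{name}: {fraction:.0%} of the machine's time")
```

A value of 1000 only means a large share if the other projects are set much lower. With every project at 1000, each one ends up with an equal quarter.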



> Why should I set "Store at least 0.00 days and additional 0.01 days of work"?
So you can get more Rosetta work, as Rosetta tasks have 3 day deadlines.
That was why you posted, wasn't it: you wanted more Rosetta work? If not, then just leave things as they are.

> Maybe I want 10 days of work for both. And I want to get a day and a half of work. What's the point of giving these options if you can't use them?
The point is they can be used, when appropriate. Just because you can do something doesn't mean you should.
There is no need for 10 days of work if you are connected to multiple projects. You will never run out of work, so there is no need for any cache at all. That was the only reason for the cache settings: back in the days of dialup, people didn't have 24/7 internet access.

> Are you saying I can't use an option that BOINC and Rosetta give me to use?
No, what I am saying is that it makes no sense to use something when there is no need to use it, and especially so when making use of it will impact on what you are actually trying to do. There are a lot of settings you can change, and many of them will act against each other. For example, setting a large cache when a project has short deadlines will impact on your ability to get work for that project when you are working on multiple other projects with long deadlines.

Just because you can do something doesn't mean you should.

> I had no problems with Prime when it came to a WU that took a week to finish with 100% Priority.
Different projects, different deadlines.
If you want to do multiple projects, then you need to use settings that make it possible. Many of the settings are of use with only a single project, of limited use with a couple of projects, and of no use with multiple projects, such as you are doing.

> Sounds like a bug somewhere.
No bug, just you using values suitable for a single project while running multiple projects, all with differing deadlines.
Grant
Darwin NT
ID: 99299 · Rating: 0
Brian Nixon

Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 0
Message 99302 - Posted: 10 Oct 2020, 9:04:25 UTC - in response to Message 99298.  
Last modified: 10 Oct 2020, 9:13:03 UTC

> Maybe I want 10 days of work for both
You do not want 10 days of work for Rosetta@home. Task deadlines are always 3 days from when they are delivered. If you ask for 10 days’ worth of work, you might get it – but stand no chance of completing more than 30% of it. All that will do is delay the analysis of the results of the other 70%, as researchers have to wait at least an extra 3 days for the server to resend the tasks to hosts that will actually do the work on time.
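
The 30% figure is just the deadline divided by the cache size. A minimal Python sketch, assuming the host works through its queue at exactly the rate the cache estimate implies (the function name is invented for this example):

```python
def completable_fraction(cache_days, deadline_days=3):
    """Largest share of a work cache that can finish before a fixed deadline.

    Assumes the host processes the queue at exactly the estimated rate.
    """
    if cache_days <= deadline_days:
        return 1.0  # everything fits inside the deadline
    return deadline_days / cache_days

for days in (1.5, 3, 10):
    print(f"{days} days of work cached: at most "
          f"{completable_fraction(days):.0%} done by the 3 day deadline")
```

With 10 days cached, the other 70% misses the deadline and has to be resent by the server.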
ID: 99302 · Rating: 0
©2024 University of Washington
https://www.bakerlab.org