Problems with Rosetta version 5.59

BennyRop

Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 39010 - Posted: 4 Apr 2007, 22:52:23 UTC

Tim: To give the programmers more to work with, what was your run time preference, keep in memory setting, and which WUs were you having problems with?


ID: 39010 · Rating: 0
Tim Kunz

Joined: 27 Dec 05
Posts: 9
Credit: 1,037,919
RAC: 1,730
Message 39030 - Posted: 5 Apr 2007, 3:13:50 UTC - in response to Message 39010.  

All of my preferences are the defaults except max memory use at idle (now 92% to accommodate some SETI WUs).

All run at all times, no screen savers (i.e., blank).

I think (but am not 100% sure) the problematic WUs were:

s029__BOINC_SYMM_FOLD_AND_DOCK_RELAX-s029_-truncate_hom023__1638_7173_0
s029__BOINC_SYMM_FOLD_AND_DOCK_RELAX-s029_-truncate_hom010__1638_9319_0



ID: 39030 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 39039 - Posted: 5 Apr 2007, 12:22:09 UTC

These tasks are the new science that Rhiju was talking about when he posted about the new 5.59 release. They do take a long time (45-90 min or so on my 3 GHz machine) to complete a model, and they do not (yet) save checkpoints (just based on my observations of them running).

Rosetta has had similar tasks to this for months, so I don't know what upset you so much that after 85,000 credits you wish to redirect your machine's time. The only thing that's changed is the % complete that is displayed. And the Project Team is already working on enhancing the checkpointing so that the new RNA and Fold_and_dock tasks can save more frequently. But they are running properly at present, so it's not as if they are doing any harm to your machine. Nor is any more time (on average) being lost when you turn off your PC than previously.
Rosetta Moderator: Mod.Sense
ID: 39039 · Rating: 0
Tim Kunz

Joined: 27 Dec 05
Posts: 9
Credit: 1,037,919
RAC: 1,730
Message 39042 - Posted: 5 Apr 2007, 15:02:39 UTC - in response to Message 39039.  
Last modified: 5 Apr 2007, 15:10:10 UTC

Not particularly upset...just that over 3 CPU hours were lost on one reset which then took 5+ CPU hours to recompute for 5.89 credit, so I'm applying resources where more is accomplished. I'll return once resolved.


ID: 39042 · Rating: 1
Thomas Leibold

Joined: 30 Jul 06
Posts: 55
Credit: 19,627,164
RAC: 0
Message 39092 - Posted: 6 Apr 2007, 16:49:24 UTC

I'm not sure it really is a 'problem'; better to call it an observation. One of my servers was shut down for a couple of days. When I restarted BOINC (this quad-core Linux server, with 2 dual-core CPUs, is dedicated to Rosetta), all 4 Rosetta tasks were restarted and immediately terminated (successfully and for credit), even though the workunit deadline was still far away and the tasks were still more than 1 hour short of the requested 8-hour runtime.

The workunits (result IDs) are 70914277, 70911561, 70911269, and 70910655.

The number of decoys generated (37-42) is lower than the number of nstructs (48-49), but that is to be expected since they didn't run for the full 8 hours.
Team Helix
ID: 39092 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 39094 - Posted: 6 Apr 2007, 17:05:26 UTC

Thomas' WUs all have the message "Completed 30 RNA decoys." that was described by feet1st on Ralph.

The RNA WUs, upon a full restart, seem to go ahead and end any time 30 decoys have been produced. I'm sure that will be an easy fix. It is a normal end in the sense that there were no errors. But, given any other calculation, there was still plenty of runtime left to complete more models.

This issue would be noticed more frequently by users with a longer runtime preference. This is simply because users with a short preference won't complete 30 models before their target runtime is reached.
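To make the runtime-preference point concrete, here is a minimal Python sketch. It assumes, without confirmation from the project team, that a restarted RNA task really does stop once 30 decoys exist. The names `DECOY_CAP`, `decoys_finished` and `hits_decoy_cap` are hypothetical, and the 10 minutes per decoy used in the note below is only an illustrative figure, roughly in line with the 37-42 decoys reported for a bit under 7 hours.

```python
import math

# Assumed cap, taken from the "Completed 30 RNA decoys." message; whether the
# application truly enforces it is not confirmed.
DECOY_CAP = 30


def decoys_finished(runtime_hours: float, minutes_per_decoy: float) -> int:
    """Number of whole decoys that fit into the target runtime."""
    if minutes_per_decoy <= 0:
        raise ValueError("minutes_per_decoy must be positive")
    return math.floor(runtime_hours * 60 / minutes_per_decoy)


def hits_decoy_cap(runtime_hours: float, minutes_per_decoy: float,
                   cap: int = DECOY_CAP) -> bool:
    """True if the cap would be reached before the target runtime runs out."""
    return decoys_finished(runtime_hours, minutes_per_decoy) >= cap
```

With 10 minutes per decoy, an 8-hour preference leaves room for 48 decoys, so the assumed cap would be reached early, while a 3-hour preference only fits 18 decoys and the task would end on its target runtime first.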
Rosetta Moderator: Mod.Sense
ID: 39094 · Rating: 0
Rhiju
Volunteer moderator

Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 39110 - Posted: 7 Apr 2007, 5:03:38 UTC - in response to Message 39094.  
Last modified: 7 Apr 2007, 5:05:06 UTC

Hi: that "30 decoys" message sounds a bit odd, but I might know the reason. There is an output message that always says that 30 decoys were created, but in actuality there can be anywhere from 1 to 1000 decoys made, depending on your CPU run time preferences. If you really do notice that WUs are taking significantly less time than your CPU run time preference, please post ... in the meantime, I'll try to fix that incorrect message!

My apologies for the long WUs without checkpointing -- we will have checkpointing for basically all Rosetta modes by the next update, hopefully in two weeks!



ID: 39110 · Rating: 0
(_KoDAk_)
Joined: 18 Jul 06
Posts: 109
Credit: 1,859,263
RAC: 0
Message 39114 - Posted: 7 Apr 2007, 11:19:56 UTC

!!! Crash (rosetta_5.59_windows_intelx86.exe) when viewing graphics mode
ID: 39114 · Rating: 0
Geoff Roynon

Joined: 4 Nov 05
Posts: 5
Credit: 960,686
RAC: 0
Message 39130 - Posted: 7 Apr 2007, 22:27:03 UTC

I was having problems with the 5.54 version on my G5 dual 1.8 GHz Mac and thought they had been resolved with version 5.59. However, I have noticed that one of the Rosetta workunits shows a CPU time of 00:09:47 (which doesn't change), while Activity Monitor shows a CPU time of 6:58:44 for it, which is still going up. My Rosetta workunits usually finish in less than 6 hours, so something is not right.
I leave my Mac running all the time and share BOINC with SETI, but have noticed that SETI hasn't been running, probably due to the Rosetta work unit that's stuck.
The name of the work unit is "1bxm__BOINC_POSEABINITIO__1645_2598_0"

I will abort the work unit and see if things run correctly.

PPC Mac dual 1.8 GHz, 3 GB RAM, Mac OS X 10.4.8

Geoff
PPC Mac G5 dual 1.8GHz, 3GB Ram, Mac OSX 10.5
BOINC Manager 5.10.28
Rosetta Beta 5.81
ID: 39130 · Rating: 0
rechenknecht123

Joined: 15 Oct 06
Posts: 17
Credit: 2,022
RAC: 0
Message 39146 - Posted: 8 Apr 2007, 12:01:40 UTC - in response to Message 38953.  
Last modified: 8 Apr 2007, 12:20:30 UTC

I have the same problem with this task:

So 8 Apr 13:50:16 2007|rosetta@home|Restarting task s029__BOINC_SYMM_FOLD_AND_DOCK_RELAX-s029_-truncate_hom001__1638_96906_0 using rosetta version 559

Result ID 71817153

State before the restart (PowerBook, Mac OS 10.3.9):
98.547% complete, running time 11.50h, remaining time 00:38.45h

What happened there?


When I started my Powerbook up again, I noticed that the Progress of the same work unit is 0% and the data processing has since restarted.

I didn't see any error messages at the time of when I switched my mac back on.

ID: 39146 · Rating: 0
anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 39147 - Posted: 8 Apr 2007, 12:08:48 UTC - in response to Message 39146.  

If you look at the graphics, what model did it restart at?

What running time is it showing?

Anders n

ID: 39147 · Rating: 0
rechenknecht123

Joined: 15 Oct 06
Posts: 17
Credit: 2,022
RAC: 0
Message 39148 - Posted: 8 Apr 2007, 12:25:25 UTC

Running time 00:24:00, 13.327% complete, time until ready 2281:34:26
Result ID 71817153
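
As a rough cross-check of these numbers, here is a minimal Python sketch of a naive linear extrapolation from elapsed time and percent complete. This is not how the BOINC client computes its own "time until ready" figure (that estimate uses other inputs); the helper names `parse_hms`, `linear_remaining_seconds` and `format_hms` are hypothetical.

```python
def parse_hms(text: str) -> int:
    """Convert an 'H:MM:SS' string (hours may exceed 24) into seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def linear_remaining_seconds(elapsed_seconds: float, percent_done: float) -> float:
    """Remaining time if progress continued at the same average rate."""
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    total = elapsed_seconds / (percent_done / 100.0)
    return total - elapsed_seconds


def format_hms(seconds: float) -> str:
    """Format a number of seconds as 'H:MM:SS'."""
    whole = int(round(seconds))
    hours, rest = divmod(whole, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"
```

For 24 minutes of run time at about 13.3% complete, `linear_remaining_seconds(parse_hms("00:24:00"), 13.327)` gives a little over two and a half hours, far below the displayed 2281 hours. That gap suggests the displayed figure has not yet caught up with actual progress, which fits with it dropping quickly.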

ID: 39148 · Rating: 0
anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 39149 - Posted: 8 Apr 2007, 12:32:50 UTC - in response to Message 39148.  


Is the time to completion dropping fast?

What model did it restart at?
ID: 39149 · Rating: 0
rechenknecht123

Joined: 15 Oct 06
Posts: 17
Credit: 2,022
RAC: 0
Message 39150 - Posted: 8 Apr 2007, 12:33:49 UTC

When I open the graphics, the left window labelled "Searching" is running.

Low energy at -71.xxx

Now at 00:31:00 h, 17.166% complete
Time until ready: 2105:44:05 h left to crunch
Mac OS 10.3.9

PowerBook G4 @ 400 MHz
768 MB RAM
rechenknecht



ID: 39150 · Rating: 0
anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 39151 - Posted: 8 Apr 2007, 12:36:35 UTC

If you look to the right in the graphics, it looks something like this:

Model: 5(?) Step: 300245

What model do you have?

ID: 39151 · Rating: 0
rechenknecht123

Joined: 15 Oct 06
Posts: 17
Credit: 2,022
RAC: 0
Message 39152 - Posted: 8 Apr 2007, 12:38:26 UTC - in response to Message 39149.  

Is the time to completion dropping fast?

Yes, the time to complete is dropping very fast in my eyes:
running time 00:35:00 h
complete 19.624%
time until ready 1981:05:45 h

What model did it restart at?
So 8 Apr 13:50:13 2007||Starting BOINC client version 5.8.17 for powerpc-apple-darwin
So 8 Apr 13:50:13 2007||log flags: task, file_xfer, sched_ops
So 8 Apr 13:50:13 2007||Libraries: libcurl/7.15.5 OpenSSL/0.9.7l zlib/1.1.4
So 8 Apr 13:50:13 2007||Data directory: /Library/Application Support/BOINC Data
So 8 Apr 13:50:15 2007||Processor: 1 Power Macintosh Power Macintosh [Power Macintosh Model PowerBook3,2] [AltiVec]
So 8 Apr 13:50:15 2007||Memory: 768.00 MB physical, 604.07 MB virtual
So 8 Apr 13:50:15 2007||Disk: 27.42 GB total, 354.07 MB free
So 8 Apr 13:50:16 2007|rosetta@home|URL: https://boinc.bakerlab.org/rosetta/; Computer ID: 328691; location: (none); project prefs: default
So 8 Apr 13:50:16 2007|boincsimap|URL: http://boinc.bio.wzw.tum.de/boincsimap/; Computer ID: 33988; location: (none); project prefs: default
So 8 Apr 13:50:16 2007|Predictor @ Home|URL: http://predictor.scripps.edu/; Computer ID: 282590; location: (none); project prefs: default
So 8 Apr 13:50:16 2007||General prefs: from Predictor @ Home (last modified 2007-02-16 01:17:52)
So 8 Apr 13:50:16 2007||Host location: none
So 8 Apr 13:50:16 2007||General prefs: using your defaults
So 8 Apr 13:50:16 2007|rosetta@home|Restarting task s029__BOINC_SYMM_FOLD_AND_DOCK_RELAX-s029_-truncate_hom001__1638_96906_0 using rosetta version 559
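
Startup logs like the one above follow a simple `timestamp|project|message` layout, so pulling out which task was restarted is easy to script when reporting a problem. Below is a minimal Python sketch based only on the lines shown here, not on any official BOINC log specification. The names `LogEntry`, `parse_line` and `restarted_tasks` are hypothetical, and the timestamp is kept as plain text because the day abbreviation ("So") depends on the system language.

```python
from typing import Iterable, List, NamedTuple, Tuple


class LogEntry(NamedTuple):
    timestamp: str  # kept as text; the weekday abbreviation is locale-dependent
    project: str    # empty for client-wide messages (the "||" lines)
    message: str


def parse_line(line: str) -> LogEntry:
    """Split one 'timestamp|project|message' line into its three fields."""
    timestamp, project, message = line.rstrip("\n").split("|", 2)
    return LogEntry(timestamp.strip(), project.strip(), message.strip())


def restarted_tasks(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (project, task name) for every 'Restarting task ...' message."""
    prefix = "Restarting task "
    found = []
    for line in lines:
        if line.count("|") < 2:
            continue  # not a log line in the assumed layout
        entry = parse_line(line)
        if entry.message.startswith(prefix):
            task_name = entry.message[len(prefix):].split(" ", 1)[0]
            found.append((entry.project, task_name))
    return found
```

Calling `restarted_tasks` on the lines above would yield a single entry for rosetta@home with the s029 FOLD_AND_DOCK task name, which is the piece of information most useful to include in a report.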



ID: 39152 · Rating: 0
rechenknecht123

Joined: 15 Oct 06
Posts: 17
Credit: 2,022
RAC: 0
Message 39153 - Posted: 8 Apr 2007, 12:41:37 UTC - in response to Message 39151.  

It is now running Model 1, step 23613.




ID: 39153 · Rating: 0
anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 39155 - Posted: 8 Apr 2007, 12:46:40 UTC


98.547% complete, running time 11.50h, remaining time 00:38.45h


Is that 11 h 50 min?
ID: 39155 · Rating: 0
rechenknecht123

Joined: 15 Oct 06
Posts: 17
Credit: 2,022
RAC: 0
Message 39156 - Posted: 8 Apr 2007, 12:50:41 UTC - in response to Message 39155.  



Yes, this was the running time before I restarted my PowerBook, after Rosetta had been running through the night.

After the restart everything was at 0% and 00:00:00 h running time, with 3012:06:07 h remaining until ready.

ID: 39156 · Rating: 0
anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 39157 - Posted: 8 Apr 2007, 12:59:56 UTC

I don't know what happened. After 11 hours there should have been at least a few models made.

Next time you shut down the Mac, please check in the graphics how much work it has done on the WU. Then check again when you restart it.

If it happens again, please post here again.

Remember that, for now, 1 model has to be completed before a checkpoint is made.
(They are working on adding more checkpoints within the WU.)
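
To show what that rule means for lost time, here is a minimal Python sketch. It assumes a checkpoint is written only when a model finishes and that every model takes the same time. The function name `work_lost_on_restart` is hypothetical, and the per-model times in the note below are illustrative, not measured.

```python
def work_lost_on_restart(elapsed_minutes: int, minutes_per_model: int) -> int:
    """CPU minutes that must be redone after a restart.

    Assumes a checkpoint is written only at the end of each completed model,
    so everything since the last finished model is lost.
    """
    if minutes_per_model <= 0:
        raise ValueError("minutes_per_model must be positive")
    completed_models = elapsed_minutes // minutes_per_model
    return elapsed_minutes - completed_models * minutes_per_model
```

With 90-minute models, stopping after 200 minutes would cost 20 minutes of rework. If a single model took longer than the whole 11 h 50 min (710 minutes) on a slow machine, no checkpoint would exist yet and the entire run would be lost, which is one possible explanation for a restart at 0%.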
Anders n
ID: 39157 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org