Problems with Rosetta version 5.46

Message boards : Number crunching : Problems with Rosetta version 5.46

288VKYUjwsXfAaTXn6SFJC4LVPRf
Joined: 16 Dec 05
Posts: 31
Credit: 153,110
RAC: 0
Message 36986 - Posted: 20 Feb 2007, 10:46:27 UTC

I also have a Core 2 Duo. Just to say, I now have a WU with -1300 energy and no problems. So forget about that ;)

TeAm Enterprise
Joined: 28 Sep 05
Posts: 18
Credit: 27,904,257
RAC: 24
Message 36999 - Posted: 20 Feb 2007, 14:48:32 UTC

Here is the result from the problem WU I posted earlier in message 36982.

https://boinc.bakerlab.org/rosetta/result.php?resultid=63079778

After restarting it took another 2-3 hours to complete.

Vagelis Stefas
Joined: 27 Aug 06
Posts: 5
Credit: 118,856
RAC: 0
Message 37010 - Posted: 20 Feb 2007, 17:23:21 UTC

Another one stuck today.
WU: DOC_1DQJ_R070216_pose_u_pert_bbmin_from_farlx_abs_tol_1571_9171

This one stuck at model 53 step 545. After pausing and resuming Rosetta it completed OK. The weird thing is that this one also stuck when it was nearly done. It needed about 10 minutes to complete but ran an additional 20 or so before I noticed it was actually stuck.

Viromancy
Joined: 23 Sep 06
Posts: 8
Credit: 125,713
RAC: 0
Message 37017 - Posted: 20 Feb 2007, 18:19:09 UTC

Conan
Joined: 11 Oct 05
Posts: 150
Credit: 3,794,203
RAC: 2,238
Message 37037 - Posted: 21 Feb 2007, 0:16:07 UTC

This workunit stopped doing anything: the run time stopped advancing and there was no CPU usage, but it still showed as running and 75.54% complete. I had to abort it. Linux machine with no graphics.

https://boinc.bakerlab.org/rosetta/result.php?resultid=62904144

Vagelis Stefas
Joined: 27 Aug 06
Posts: 5
Credit: 118,856
RAC: 0
Message 37038 - Posted: 21 Feb 2007, 0:16:33 UTC

And yet another Rosetta termination by the watchdog: stuck after 3600 sec without progress, or so it tells me. Claimed credit is 20 when by my estimate it should be 50. Normal crunching on my configuration claims about 70-80.

All of the problems so far involve DOC WUs. Should we blacklist them? :)
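
For anyone curious how a "no progress for 3600 sec" check might work, here is a minimal sketch. It is not Rosetta's actual watchdog: the class name `StallDetector`, the `(model, step)` progress tuple, and the injectable `clock` are all assumptions made for illustration. The idea is simply to remember when the progress reading last changed and to flag the task once that was too long ago.

```python
import time


class StallDetector:
    """Hypothetical stall check: flags a task whose progress has not changed."""

    def __init__(self, timeout_s=3600.0, clock=time.monotonic):
        self.timeout_s = timeout_s      # 3600 s matches the figure quoted above
        self.clock = clock              # injectable so the logic can be tested
        self.last_progress = None
        self.last_change = clock()

    def update(self, progress):
        """Record a progress reading; return True if the task looks stuck."""
        now = self.clock()
        if progress != self.last_progress:
            # Progress moved, so restart the stall timer.
            self.last_progress = progress
            self.last_change = now
            return False
        return (now - self.last_change) >= self.timeout_s
```

A caller would poll `update((model, step))` periodically and terminate or restart the task when it returns `True`.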

Alan Roberts
Joined: 7 Jun 06
Posts: 61
Credit: 6,901,926
RAC: 0
Message 37040 - Posted: 21 Feb 2007, 0:53:39 UTC

FWIW, the 5.46 upgrade doesn't seem to have fixed my stuck/timed-out WUs. Results from my small herd of cats below. All are 5.46, seem to fit the pattern described, and are either DOC... or CAPRI...DOCKING... WUs. Hope they did something useful before choking, and good luck in running down the problem.

Cheers,
Alan


63590248
63334842
62700842
63333917
63397464
62726409
63224255
62713986
63371353
62729329
63300964
62919358
62728295
62711260
62710446
63192709
63450095
63396659
63535809
62660685


tng*
Joined: 28 Oct 05
Posts: 14
Credit: 5,389,798
RAC: 0
Message 37041 - Posted: 21 Feb 2007, 1:26:28 UTC


Got a couple of compute errors on a system that's been crunching just fine:

63567729

63597368

Anybody else seeing problems with these, or do I need to check my system?

AMD_is_logical
Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 37043 - Posted: 21 Feb 2007, 1:58:54 UTC

(_KoDAk_)
Joined: 18 Jul 06
Posts: 109
Credit: 1,859,263
RAC: 0
Message 37049 - Posted: 21 Feb 2007, 6:39:18 UTC
Last modified: 21 Feb 2007, 6:40:36 UTC

!!!!!!!!!!!! "rosetta_5.46_windows_intelx86.exe" has an error in the first second, what the ************
It comes with a "send report" prompt ("to hell with it")

anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 37076 - Posted: 21 Feb 2007, 19:20:19 UTC

Stuck Wu. https://boinc.bakerlab.org/rosetta/result.php?resultid=63510605

Stuck at model 1 step 500 for 35 min. Restarted.
Stuck at model 1 step 500 for 1 h 35 min. Aborted.

Anders n

(_KoDAk_)
Joined: 18 Jul 06
Posts: 109
Credit: 1,859,263
RAC: 0
Message 37155 - Posted: 24 Feb 2007, 7:11:50 UTC

https://boinc.bakerlab.org/rosetta/results.php?hostid=350512
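Result ID 60938061
Workunit 54205701
Sent 4 Feb 2007 15:09:47 UTC
Received 22 Feb 2007 14:51:29 UTC
Server state Over
Outcome Success
Client state Done
CPU time 50,636.67
Claimed credit 166.39
Granted credit 0.00

Why is the granted credit "0.00"?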

anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 37156 - Posted: 24 Feb 2007, 7:51:15 UTC - in response to Message 37155.  

https://boinc.bakerlab.org/rosetta/results.php?hostid=350512
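Result ID 60938061
Workunit 54205701
Sent 4 Feb 2007 15:09:47 UTC
Received 22 Feb 2007 14:51:29 UTC
Server state Over
Outcome Success
Client state Done
CPU time 50,636.67
Claimed credit 166.39
Granted credit 0.00

Why is the granted credit "0.00"?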


Result was reported too late to validate

Anders n
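
The "too late" check can be sketched with a bit of date arithmetic. This is only an illustration under stated assumptions: the two timestamps in the result row are read as the sent and reported times, and the 10-day deadline is inferred from another result in this thread (sent 20 Feb 2007 21:45:07, report deadline 2 Mar 2007 21:45:07), so it may not be the deadline that applied to this particular WU. The function name `reported_late` is made up for the example.

```python
from datetime import datetime, timedelta

FMT = "%d %b %Y %H:%M:%S"  # e.g. "4 Feb 2007 15:09:47" (UTC suffix stripped)


def reported_late(sent, reported, deadline_days=10):
    """Return True if the result came back after the report deadline.

    deadline_days=10 is an assumption inferred from another result in
    this thread, not a documented project setting.
    """
    sent_at = datetime.strptime(sent, FMT)
    reported_at = datetime.strptime(reported, FMT)
    return reported_at > sent_at + timedelta(days=deadline_days)


# Sent 4 Feb, reported 22 Feb: 18 days, well past a 10-day deadline.
print(reported_late("4 Feb 2007 15:09:47", "22 Feb 2007 14:51:29"))  # True
```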

Cheryl
Joined: 17 Feb 07
Posts: 5
Credit: 3,564,449
RAC: 0
Message 37157 - Posted: 24 Feb 2007, 12:55:48 UTC

I am using BOINC 5.8.11 on Mac OS X 10.4.8 with Rosetta 5.46, and I am getting these messages:

Sat Feb 24 04:30:34 2007|rosetta@home|Starting task s030__BOINC_ABRELAX_NEWRELAXFLAGS_hom002__1576_9068_0 using rosetta version 546
Sat Feb 24 05:32:16 2007||Restarting s030__BOINC_ABRELAX_NEWRELAXFLAGS_hom002__1576_9068_0 - message timeout
Sat Feb 24 05:32:16 2007|rosetta@home|Restarting task s030__BOINC_ABRELAX_NEWRELAXFLAGS_hom002__1576_9068_0 using rosetta version 546
Sat Feb 24 05:32:17 2007||[error] Process 1558 not found

The process numbers change with each message. Is this a WU error or a BOINC error?
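
When comparing logs like these, it can help to split each line into its parts. The sketch below assumes the `timestamp|project|message` layout seen in the lines above; the function name `parse_boinc_line` is hypothetical and is not part of BOINC. It is used here to measure the gap between the task start and the "message timeout" restart.

```python
from datetime import datetime

STAMP_FMT = "%a %b %d %H:%M:%S %Y"  # e.g. "Sat Feb 24 04:30:34 2007"


def parse_boinc_line(line):
    """Split one pipe-delimited message line into time, project, message."""
    stamp, project, message = line.split("|", 2)
    return {
        "time": datetime.strptime(stamp, STAMP_FMT),
        "project": project or None,  # empty field means a client-level message
        "message": message,
    }


start = parse_boinc_line(
    "Sat Feb 24 04:30:34 2007|rosetta@home|Starting task "
    "s030__BOINC_ABRELAX_NEWRELAXFLAGS_hom002__1576_9068_0 using rosetta version 546"
)
restart = parse_boinc_line(
    "Sat Feb 24 05:32:16 2007||Restarting "
    "s030__BOINC_ABRELAX_NEWRELAXFLAGS_hom002__1576_9068_0 - message timeout"
)
print(restart["time"] - start["time"])  # 1:01:42
```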

Chuck
Joined: 16 Feb 07
Posts: 1
Credit: 55,064
RAC: 0
Message 37170 - Posted: 25 Feb 2007, 2:07:53 UTC - in response to Message 37157.  

I am using BOINC 5.8.11 on Mac OS X 10.4.8 with Rosetta 5.46, and I am getting these messages:

Sat Feb 24 04:30:34 2007|rosetta@home|Starting task s030__BOINC_ABRELAX_NEWRELAXFLAGS_hom002__1576_9068_0 using rosetta version 546
Sat Feb 24 05:32:16 2007||Restarting s030__BOINC_ABRELAX_NEWRELAXFLAGS_hom002__1576_9068_0 - message timeout
Sat Feb 24 05:32:16 2007|rosetta@home|Restarting task s030__BOINC_ABRELAX_NEWRELAXFLAGS_hom002__1576_9068_0 using rosetta version 546
Sat Feb 24 05:32:17 2007||[error] Process 1558 not found

The process numbers change with each message. Is this a WU error or a BOINC error?

I get exactly the same thing. Same setup.

Lots of "Client Error"s also.

I don't really know if there's any other info I can post that will help. (I'm not familiar with whether there are any appropriate log files, for instance.) Let me know and I will post it.

Charlie
Joined: 25 Mar 06
Posts: 53
Credit: 424,472
RAC: 0
Message 37181 - Posted: 25 Feb 2007, 20:49:01 UTC

I guess the question I have here is why I am getting the error in the first place. I have had 4 more in the last week. Is it something to do with the WU, is it my machine, or what? If it is my machine, I would like to try to fix the problem. Is anyone able to answer this?

Charlie

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 37186 - Posted: 26 Feb 2007, 1:07:35 UTC
Last modified: 26 Feb 2007, 1:14:10 UTC

I moved Charlie's post here.

He's concerned about occasional validator errors on his host.
For example:

Result ID 63695800
Name s021__BOINC_ABRELAX_NEWRELAXFLAGS_hom011__1568_6060_0
Workunit 56783985
Created 20 Feb 2007 21:44:36 UTC
Sent 20 Feb 2007 21:45:07 UTC
Received 21 Feb 2007 17:13:59 UTC
Server state Over
Outcome Validate error
Client state Done
Exit status 0 (0x0)
Computer ID 185711
Report deadline 2 Mar 2007 21:45:07 UTC
CPU time 33305.109375
stderr out <core_client_version>5.8.4</core_client_version>

Validate state Invalid
Claimed credit 36.570871790918
Granted credit 0


Rosetta Moderator: Mod.Sense

rinselberg
Joined: 8 Nov 06
Posts: 4
Credit: 276,842
RAC: 0
Message 37187 - Posted: 26 Feb 2007, 1:24:06 UTC

The last Rosetta work unit got "stuck" - status was "running", but after many hours it did not show any progress towards completion - still at 0.00 percent complete ... I aborted it.

Have not received any Rosetta work units for the last umpteen hours. Did a project reset, still no work unit.

Have been running Rosetta 5.46 for some time prior to today.

Running a dual core Intel architecture Mac with Mac OS X 10.4.8 and 1 GB RAM and BOINC 5.9.0 with SETI, SETI-beta and Einstein also eligible to run.
Are you reading more posts and enjoying it less? Make RadioFreeRinsel your next Internet port of call ...

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 37190 - Posted: 26 Feb 2007, 1:42:50 UTC

rinselberg, seems your WU escaped BOINC control. It shows zero CPU time recorded.

As for not having work for Rosetta, BOINC keeps track of how much work is done for each project. Since Rosetta had a run there of a few hours, it is just giving the other projects their fair share of runtime, trying to maintain your configured resource shares.

It can sometimes take a day or two for BOINC to come back around to getting more work for any particular project. It does not keep a work unit on-hand unless it is ready to allocate time to that project. But if you'd like it to work more like that, you might consider increasing the size of your work "cache". This is configured in your General Preferences by the setting for "connect to network every ... days".

If you would like to run Rosetta a higher percentage of the time, you can adjust your resource share configured in your Rosetta Preferences.
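
As a rough illustration of how resource shares translate into runtime: each project's long-term fraction is its share divided by the sum of all shares. The share values below are made-up examples, not anyone's actual settings, and `cpu_fractions` is a hypothetical helper, not a BOINC function.

```python
def cpu_fractions(resource_shares):
    """Map each project to its long-term fraction of runtime."""
    total = sum(resource_shares.values())
    return {name: share / total for name, share in resource_shares.items()}


# Hypothetical setup: four projects, all at the same share.
shares = {"rosetta": 100, "seti": 100, "seti-beta": 100, "einstein": 100}
print(cpu_fractions(shares)["rosetta"])  # 0.25

# Raising Rosetta's share to 300 gives it half of the runtime.
shares["rosetta"] = 300
print(cpu_fractions(shares)["rosetta"])  # 0.5
```

These are long-term targets, so on any given day the actual split can look quite different.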
Rosetta Moderator: Mod.Sense

champ
Joined: 28 Mar 06
Posts: 29
Credit: 42,108
RAC: 0
Message 37302 - Posted: 1 Mar 2007, 15:54:08 UTC

I got these messages:

01.03.2007 16:38:01|rosetta@home|Restarting task vp10__BOINC_ABRELAX_cterm_hom002__1581_12037_0 using rosetta version 546
01.03.2007 16:38:09|rosetta@home|Task vp10__BOINC_ABRELAX_cterm_hom002__1581_12037_0 exited with zero status but no 'finished' file
01.03.2007 16:38:09|rosetta@home|If this happens repeatedly you may need to reset the project.

What should I do? I run Rosetta on a 750 MHz AMD Duron under Windows 2000.
©2024 University of Washington
https://www.bakerlab.org