Problems with Rosetta version 5.98

Message boards : Number crunching : Problems with Rosetta version 5.98

Profile David E K
Volunteer moderator
Project administrator
Project developer
Project scientist

Send message
Joined: 1 Jul 05
Posts: 1480
Credit: 4,334,829
RAC: 0
Message 53999 - Posted: 25 Jun 2008, 22:44:19 UTC

Please post bugs/issues regarding version 5.98 here.
ID: 53999 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Mike Francis
Avatar

Send message
Joined: 24 Nov 05
Posts: 8
Credit: 623,519
RAC: 0
Message 54021 - Posted: 26 Jun 2008, 20:48:51 UTC

6/26/2008 3:31:54 PM|rosetta@home|Starting t434_1_NMRREF_1_t434_1_T0434_2QPWA_2JV0_hybridIGNORE_THE_REST_truncated_4104_1_1
6/26/2008 3:31:54 PM|rosetta@home|Starting task t434_1_NMRREF_1_t434_1_T0434_2QPWA_2JV0_hybridIGNORE_THE_REST_truncated_4104_1_1 using rosetta_beta version 598
6/26/2008 4:16:56 PM|rosetta@home|Computation for task t434_1_NMRREF_1_t434_1_T0434_2QPWA_2JV0_hybridIGNORE_THE_REST_truncated_4104_1_1 finished
6/26/2008 4:16:56 PM|rosetta@home|Output file t434_1_NMRREF_1_t434_1_T0434_2QPWA_2JV0_hybridIGNORE_THE_REST_truncated_4104_1_1_0 for task t434_1_NMRREF_1_t434_1_T0434_2QPWA_2JV0_hybridIGNORE_THE_REST_truncated_4104_1_1 absent
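For anyone wanting to pull the elapsed time out of a log excerpt like the one above, here is a minimal sketch. It assumes the layout shown in the excerpt (timestamp|project|message, with a month/day/year 12-hour timestamp). The function names and the shortened task name are made up for illustration; the timestamps are the ones from the excerpt.

```python
from datetime import datetime

# Timestamp layout as it appears in the excerpt above (assumed to be
# month/day/year with a 12-hour clock).
TIMESTAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_log_line(line):
    """Split one 'timestamp|project|message' line into its three fields."""
    stamp, project, message = line.strip().split("|", 2)
    return datetime.strptime(stamp, TIMESTAMP_FORMAT), project, message


def wall_clock_seconds(lines):
    """Seconds between the first 'Starting' message and the 'finished' message."""
    started = finished = None
    for line in lines:
        when, _project, message = parse_log_line(line)
        if message.startswith("Starting") and started is None:
            started = when
        elif message.startswith("Computation for task") and message.endswith("finished"):
            finished = when
    if started is None or finished is None:
        return None
    return (finished - started).total_seconds()


# Hypothetical, shortened task name; the timestamps are the ones posted above.
example = [
    "6/26/2008 3:31:54 PM|rosetta@home|Starting task example_task using rosetta_beta version 598",
    "6/26/2008 4:16:56 PM|rosetta@home|Computation for task example_task finished",
    "6/26/2008 4:16:56 PM|rosetta@home|Output file example_task_0 for task example_task absent",
]
print(wall_clock_seconds(example))
```

For these timestamps that is roughly 45 minutes of wall-clock time (not CPU time) between the start of the task and the "Output file ... absent" message.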

ID: 54021 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile [KWSN]John Galt 007
Avatar

Send message
Joined: 4 Aug 06
Posts: 6
Credit: 1,017,647
RAC: 0
Message 54025 - Posted: 27 Jun 2008, 2:36:06 UTC

ID: 54025 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Adam

Send message
Joined: 26 Jun 07
Posts: 7
Credit: 487,917
RAC: 0
Message 54031 - Posted: 27 Jun 2008, 17:10:47 UTC

Compute error,
https://boinc.bakerlab.org/rosetta/result.php?resultid=173774049
ID: 54031 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
P . P . L .

Send message
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 54034 - Posted: 28 Jun 2008, 0:39:23 UTC

This one fell over on both hosts, same error.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=158612212

Output file FRA_t449_CASP8_MANUAL_1_IGNORE_THE_RESTt449_1_ttxxxxT0449_1CHIM_0001_0001_0001_4126_3627_1_0 for task absent

<core_client_version>5.10.30</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# cpu_run_time_pref: 21600
# random seed: 2404847
ERROR:: Exit from: .loop_relax.cc line: 1745

</stderr_txt>
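When comparing several failed results, it can help to reduce a paste like the one above to a few fields. The following is only a sketch: the field names and the `summarize_result` function are hypothetical, and the patterns are based solely on the lines visible in this post, not on any documented format.

```python
import re


def summarize_result(text):
    """Extract a few fields from a pasted task result; missing ones are None."""
    patterns = {
        "client_version": r"<core_client_version>([^<]+)</core_client_version>",
        "exit_code": r"exit code (-?\d+)",
        "cpu_run_time_pref": r"# cpu_run_time_pref: (\d+)",
        "random_seed": r"# random seed: (\d+)",
        "error_source": r"ERROR:: Exit from: (\S+) line: (\d+)",
    }
    summary = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        if match is None:
            summary[name] = None
        elif name == "error_source":
            # Keep the file name exactly as printed, plus the line number.
            summary[name] = (match.group(1), int(match.group(2)))
        elif name == "client_version":
            summary[name] = match.group(1)
        else:
            summary[name] = int(match.group(1))
    return summary


# The text pasted in the post above.
pasted = """<core_client_version>5.10.30</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# cpu_run_time_pref: 21600
# random seed: 2404847
ERROR:: Exit from: .loop_relax.cc line: 1745

</stderr_txt>
"""
info = summarize_result(pasted)
print(info)
```

Two results that share the same `error_source` entry (here `.loop_relax.cc`, line 1745) are likely to be the same failure, which is what several posters in this thread are reporting.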

pete.



ID: 54034 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
RC

Send message
Joined: 27 Sep 05
Posts: 13
Credit: 262,048
RAC: 0
Message 54035 - Posted: 28 Jun 2008, 0:45:05 UTC - in response to Message 54031.  

Another compute error,
https://boinc.bakerlab.org/rosetta/result.php?resultid=173797309
ID: 54035 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
anti-cancers

Send message
Joined: 2 Sep 06
Posts: 9
Credit: 173,262
RAC: 0
Message 54037 - Posted: 28 Jun 2008, 6:20:01 UTC
Last modified: 28 Jun 2008, 6:20:48 UTC

ID: 54037 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Snags

Send message
Joined: 22 Feb 07
Posts: 198
Credit: 2,888,320
RAC: 0
Message 54038 - Posted: 28 Jun 2008, 10:33:46 UTC - in response to Message 54034.  

This one fell over on both hosts, same error.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=158612212

Output file FRA_t449_CASP8_MANUAL_1_IGNORE_THE_RESTt449_1_ttxxxxT0449_1CHIM_0001_0001_0001_4126_3627_1_0 for task absent

<core_client_version>5.10.30</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# cpu_run_time_pref: 21600
# random seed: 2404847
ERROR:: Exit from: .loop_relax.cc line: 1745

</stderr_txt>

pete.




same here
ID: 54038 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Konstantin Iliev

Send message
Joined: 22 May 06
Posts: 4
Credit: 2,205,841
RAC: 0
Message 54053 - Posted: 28 Jun 2008, 16:54:45 UTC

The same errors again as with 5.96 :(

https://boinc.bakerlab.org/rosetta/result.php?resultid=173787198
https://boinc.bakerlab.org/rosetta/result.php?resultid=173807571
https://boinc.bakerlab.org/rosetta/result.php?resultid=173821223
ID: 54053 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile adrianxw
Avatar

Send message
Joined: 18 Sep 05
Posts: 653
Credit: 11,840,739
RAC: 2
Message 54065 - Posted: 29 Jun 2008, 14:56:31 UTC
Last modified: 29 Jun 2008, 14:58:03 UTC

174088567
174032945
174000755

The first and last were validate errors after the full job (10,000+ seconds); the middle one failed after just a few seconds (Exit from: .loop_relax.cc line: 1745).
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 54065 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Wonderwall

Send message
Joined: 19 Mar 07
Posts: 1
Credit: 39,192
RAC: 0
Message 54067 - Posted: 29 Jun 2008, 16:25:51 UTC - in response to Message 53999.  

Please post bugs/issues regarding version 5.98 here.

rosetta@home Rosetta Beta 5.96 1405_CaspB_IUMPAB_Type2_RES81to19... 03:51:35 00.000% ...06/28/... Running high prio...
ID: 54067 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
AMD_is_logical

Send message
Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 54072 - Posted: 29 Jun 2008, 19:58:38 UTC

This WU did 1247 decoys, then was marked "invalid" for no apparent reason:

FRA_t449_CASP8_AUTO_1SNZ_1L7J_2CIQ_1_IGNORE_THE_RESTt449_1_ttttaaT0449_1L7JA_10_0001_0001_0002_4134_634
ID: 54072 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Feet1st
Avatar

Send message
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 54082 - Posted: 30 Jun 2008, 14:19:45 UTC

This little bugger has been running all weekend. 47hrs on a 24hr preference.
FRA_t449_CASP8_MANUAL_1_IGNORE_THE_RESTt449_1_ttxxxxT0449_1CHIM_0001_0001_0001_4142_1913
Yet it is still getting CPU time, and the step number is still incrementing. It says it is on model 151.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 54082 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile sslickerson

Send message
Joined: 14 Oct 05
Posts: 101
Credit: 578,497
RAC: 0
Message 54085 - Posted: 30 Jun 2008, 17:47:13 UTC

Here is a very fast error running version 5.98 on Windows XP: 174404220. It looks like it failed on at least one other host in the same manner.





ID: 54085 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
BrnmccO1

Send message
Joined: 26 Jun 07
Posts: 17
Credit: 578,825
RAC: 0
Message 54089 - Posted: 30 Jun 2008, 20:41:29 UTC
Last modified: 30 Jun 2008, 20:44:50 UTC

Bizarre problem with this WU: it had an 'unhandled exception error' after approx. 50 mins of CPU run time, with a lengthy Std_Out: 157316144

<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
# cpu_run_time_pref: 10800
# random seed: 2747207


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00B3C947 read attempt to address 0x000000A4

Engaging BOINC Windows Runtime Debugger...


Otherwise no other errors so far with 5.98 on both of my hosts (knocks on wood ;p)
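The two numbers in the exit code line above are the same 32-bit value written two ways: a signed decimal and an unsigned hex. A quick sketch of the conversion (the helper name is made up for illustration):

```python
def to_unsigned32(code):
    """Reinterpret a signed 32-bit exit code as its unsigned value."""
    return code & 0xFFFFFFFF


exit_code = -1073741819  # value from the result posted above
print(hex(to_unsigned32(exit_code)))
```

The result matches the 0xc0000005 that the exception record above labels as an access violation.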
ID: 54089 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile TeAm Enterprise
Avatar

Send message
Joined: 28 Sep 05
Posts: 18
Credit: 27,911,735
RAC: 17
Message 54096 - Posted: 1 Jul 2008, 4:56:20 UTC
Last modified: 1 Jul 2008, 4:57:11 UTC

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=158648907

Two validate errors after full crunch.

Rosetta needs to think about how to apply credit when the problems are obviously of project/WU source.

Jim
Crunch with friends - TeAm Anandtech
ID: 54096 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile dcdc

Send message
Joined: 3 Nov 05
Posts: 1832
Credit: 119,860,059
RAC: 3,073
Message 54097 - Posted: 1 Jul 2008, 8:19:57 UTC - in response to Message 54096.  

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=158648907

Two validate errors after full crunch.

Rosetta needs to think about how to apply credit when the problems are obviously of project/WU source.

Jim

Credit is applied to these as claimed - it doesn't show on the task's main page but does if you hit the Task ID link on the left.

HTH
Danny
ID: 54097 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Virtual Boss*
Avatar

Send message
Joined: 10 May 08
Posts: 35
Credit: 713,981
RAC: 0
Message 54101 - Posted: 1 Jul 2008, 10:58:04 UTC
Last modified: 1 Jul 2008, 11:18:21 UTC

WU FRA_t449_CASP8_MANUAL_1_IGNORE_THE_RESTt449_1_ttxxxxT0449_1CHIM_0001_0001_0001_4142_3294_1 using rosetta_beta version 598

Original estimated run time about 6 CPU Hrs

Still running at 10:10:00 CPU

Progress 98.386% and incrementing 0.001 about every 25 CPU secs

To Completion 00:09:55 (no change in the last 30 CPU minutes)

At the current rate of increase it will take another 11+ CPU hrs; or, if Prog% is calculated as time done / (time done + To Completion), it will run forever.

BTW Currently Model 22 Step 47795
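
The arithmetic behind those two guesses can be checked with a short sketch using only the figures quoted above. The ratio formula is the poster's hypothesis, not a documented BOINC or Rosetta calculation, and the variable names are made up for illustration.

```python
def hms_to_seconds(text):
    """Convert 'HH:MM:SS' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Figures quoted in the post above.
progress_pct = 98.386
step_pct = 0.001       # progress gained per step
secs_per_step = 25     # CPU seconds per step

# Guess 1: progress keeps climbing at the observed rate until it reaches 100%.
remaining_steps = (100.0 - progress_pct) / step_pct
hours_left = remaining_steps * secs_per_step / 3600
print(round(hours_left, 1))

# Guess 2 (hypothesis): progress = time done / (time done + To Completion).
elapsed = hms_to_seconds("10:10:00")
to_completion = hms_to_seconds("00:09:55")
ratio_pct = 100.0 * elapsed / (elapsed + to_completion)
print(round(ratio_pct, 3))
```

The first guess gives about 11.2 more CPU hours, in line with the "11+" above. The second gives about 98.4%, close to but not exactly the displayed 98.386%, so it is only a rough fit. If the display did work that way, with To Completion stuck at a fixed value, the percentage would creep toward 100% without ever reaching it.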
ID: 54101 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile dcdc

Send message
Joined: 3 Nov 05
Posts: 1832
Credit: 119,860,059
RAC: 3,073
Message 54103 - Posted: 1 Jul 2008, 11:24:21 UTC - in response to Message 54101.  

WU FRA_t449_CASP8_MANUAL_1_IGNORE_THE_RESTt449_1_ttxxxxT0449_1CHIM_0001_0001_0001_4142_3294_1 using rosetta_beta version 598

Original estimated run time about 6 CPU Hrs

Still running at 10:10:00 CPU

Progress 98.386% and incrementing 0.001 about every 25 CPU secs

To Completion 00:09:55 (no change in the last 30 CPU minutes)

At the current rate of increase it will take another 11+ CPU hrs; or, if Prog% is calculated as time done / (time done + To Completion), it will run forever.

BTW Currently Model 22 Step 47795

The % complete and time to completion aren't linear; they're estimates, so don't worry about them as long as Rosetta's CPU time is increasing in Task Manager.

Danny
ID: 54103 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Feet1st
Avatar

Send message
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 54105 - Posted: 1 Jul 2008, 14:37:53 UTC

Danny is correct about the time estimates.

But the t449 I reported earlier never did finish. I let it run for 68 hours before aborting it (my runtime preference is 24hrs so I'm sure the watchdog would have discovered it after 4x that preference, but I didn't want to waste the time).

My aborted task didn't seem to send the normal data in to the server, so 150 presumably good models were lost. So, if you have the patience, I would suggest exiting and restarting BOINC 5 times, each time leaving it running long enough to initialize itself and resume the problem task. Rosetta will then detect the lack of progress after 4 or 5 restarts, cut the task off more cleanly, and send it in.

I'm also wishing I had saved a copy of all the slots directories. Again, if you have the time, after your first exit of BOINC, I would save everything under the slots directory of your BOINC installation path and email it to the rosettamod.
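
Saving the slots directories could be scripted along these lines. This is only a sketch: `archive_slots` is a made-up helper, the location of your BOINC directory is something you have to supply yourself, and BOINC should already be shut down when it runs. The demonstration below uses a throwaway temporary directory rather than a real installation.

```python
import shutil
import tempfile
from pathlib import Path


def archive_slots(boinc_dir, dest_base):
    """Zip <boinc_dir>/slots into <dest_base>.zip and return the archive path."""
    slots = Path(boinc_dir) / "slots"
    if not slots.is_dir():
        raise FileNotFoundError(f"no slots directory under {boinc_dir}")
    # make_archive appends the ".zip" suffix to dest_base itself.
    return shutil.make_archive(str(dest_base), "zip", root_dir=str(slots))


# Demonstration against a fake BOINC directory (hypothetical layout).
with tempfile.TemporaryDirectory() as tmp:
    fake_boinc = Path(tmp) / "BOINC"
    (fake_boinc / "slots" / "0").mkdir(parents=True)
    (fake_boinc / "slots" / "0" / "stderr.txt").write_text("example\n")
    archive = archive_slots(fake_boinc, Path(tmp) / "slots_backup")
    print(Path(archive).name)
```

For a real run, pass your own BOINC directory and a destination outside it, then attach the resulting zip file to the email.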
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 54105 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote

©2024 University of Washington
https://www.bakerlab.org