Problems with Rosetta version 5.68 and 5.70

Message boards : Number crunching : Problems with Rosetta version 5.68 and 5.70

Profile Stacey Baird
Joined: 11 Apr 06
Posts: 19
Credit: 74,745
RAC: 0
Message 42920 - Posted: 30 Jun 2007, 19:40:28 UTC

I hope this is on topic: I have received no new work from Rosetta in more than two weeks. I have made Update Requests and Reset the project several times to no avail.

What do I need to do?

(I use a 3.2 GHz Pentium 4 dual core with 1 GB memory under XP 2nd Ed., Home, and am also running the SETI, Einstein, and Climate Prediction projects.)

Thanks

Stacey
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 42938 - Posted: 1 Jul 2007, 4:36:25 UTC

Stacey, have a look at the discussion in this thread, and if problems getting work persist, please post there.
Rosetta Moderator: Mod.Sense
P . P . L .

Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 43000 - Posted: 1 Jul 2007, 22:50:57 UTC

I'm the second to have a problem with this one!

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=81076322

Haven't had this before.

<core_client_version>5.8.16</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 36000
# random seed: 1897198
# cpu_run_time_pref: 36000
**********************************************************************
Rosetta score is stuck or going too long. Watchdog is ending the run!
Stuck at score 0 for 900 seconds
**********************************************************************
GZIP SILENT FILE: .cc2ci2.out

Profile Michael Casey

Joined: 8 Oct 05
Posts: 3
Credit: 171,969
RAC: 0
Message 43007 - Posted: 2 Jul 2007, 1:52:57 UTC

I just had this workunit error out: https://boinc.bakerlab.org/rosetta/result.php?resultid=89646602. Something about an incorrect function?

I'm running version 5.10.7 of the BOINC client on Windows XP SP2, on a 3.2 GHz dual Xeon with hyperthreading.
Profile Michael Casey

Joined: 8 Oct 05
Posts: 3
Credit: 171,969
RAC: 0
Message 43008 - Posted: 2 Jul 2007, 1:53:16 UTC

I just had this workunit error out: https://boinc.bakerlab.org/rosetta/result.php?resultid=89646602. Something about an incorrect function? Rosetta version is 5.68.

I'm running version 5.10.7 of the BOINC client on Windows XP SP2, on a 3.2 GHz dual Xeon with hyperthreading.
Profile Jon C Melusky
Joined: 29 Nov 05
Posts: 12
Credit: 187,710
RAC: 148
Message 43016 - Posted: 2 Jul 2007, 7:13:23 UTC

Hi All cool peeps & geeks !

Recently I got this result on Rosetta, and I was curious what it means and whether I should adjust any settings I have for Rosetta or BOINC.

Result ID 89702036
Name CNTRL_01ABRELAX_SAVE_ALL_OUT_-1c8cA-_filters_1792_403_0
Workunit 81064633
Created 30 Jun 2007 2:34:00 UTC
Sent 30 Jun 2007 2:36:36 UTC
Received 30 Jun 2007 13:55:34 UTC
Server state Over
Outcome Client error
Client state Done
Exit status 3 (0x3)
Computer ID 78725
Report deadline 10 Jul 2007 2:36:36 UTC
CPU time 30.671875
stderr out

<core_client_version>5.8.16</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
# cpu_run_time_pref: 7200

</stderr_txt>
]]>

Validate state Invalid
Claimed credit 0.0610518749062124
Granted credit 0
application version 5.68

Thank you for any input you may put forth. *smile*

Geothermal
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 43030 - Posted: 2 Jul 2007, 13:30:53 UTC
Last modified: 2 Jul 2007, 13:31:49 UTC

Jon seems to have this error frequently. Windows XP home SP1. Generally within 30 seconds of starting a task. Some tasks run fine, others get the:
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
link to Jon's results page
Rosetta Moderator: Mod.Sense
svincent

Joined: 30 Dec 05
Posts: 219
Credit: 11,805,838
RAC: 0
Message 43075 - Posted: 3 Jul 2007, 3:02:51 UTC

I seem to be having a number of tasks that appear to be stuck at 0%. The latest is CNTRL_01ABRELAX_SAVE_ALL_OUT_-1bk2_-_filters_1792_2981_0 ( workunit 81530519 on Rosetta 5.68 ). I don't see any other reports about this problem : is it just my system ( IMac2 under OSX 10.4.10 ) or are other users letting the watchdog thread terminate it automatically?
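For readers wondering what the watchdog mentioned here does: the stderr output quoted earlier in this thread says "Stuck at score 0 for 900 seconds", which suggests a check that ends the run when the score stops changing for a fixed period. The sketch below is only an illustration of that idea, not Rosetta's actual watchdog code. The class name `ScoreWatchdog`, its method `report`, and the injectable `clock` are all hypothetical; the 900-second limit is taken from the quoted message.

```python
import time


class ScoreWatchdog:
    """Illustrative stall detector (hypothetical, not Rosetta's implementation)."""

    def __init__(self, stuck_limit_seconds=900.0, clock=time.monotonic):
        # 900 seconds mirrors the "Stuck at score 0 for 900 seconds" message.
        self._limit = stuck_limit_seconds
        self._clock = clock
        self._last_score = None
        self._last_change = clock()

    def report(self, score):
        """Record the latest score; return True if the run should be ended."""
        now = self._clock()
        if score != self._last_score:
            # The score moved, so restart the stall timer.
            self._last_score = score
            self._last_change = now
            return False
        # Score unchanged: end the run once the stall limit is reached.
        return now - self._last_change >= self._limit
```

Under this assumption, a task whose score never moves off 0 would be ended automatically after 15 minutes, which would match what the quoted log shows.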
Profile thewildhaggis

Joined: 5 May 07
Posts: 2
Credit: 58,047
RAC: 0
Message 43211 - Posted: 5 Jul 2007, 8:37:25 UTC
Last modified: 5 Jul 2007, 8:58:04 UTC

Good Day Folks,
First time posting here, though I have been reading these forums for a while now. Recently all of my completed WUs are returning 'Client Error'.

<core_client_version>5.8.16</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
<file_name>rosetta_5.68_windows_intelx86.exe</file_name>
<error_code>-120</error_code>
<error_message>signature verification error</error_message>
</file_xfer_error>

</message>
]]>

I am using version 5.68

This has only recently happened.
Any ideas as to why?
Profile Greg_BE
Joined: 30 May 06
Posts: 5662
Credit: 5,701,005
RAC: 2,103
Message 43273 - Posted: 5 Jul 2007, 19:45:10 UTC
Last modified: 5 Jul 2007, 19:46:09 UTC

Every single time R@H goes to download a file (from the 3rd until now), it comes back with these types of messages:

7/5/2007 4:10:49 PM|rosetta@home|Sending scheduler request: To fetch work
7/5/2007 4:10:49 PM|rosetta@home|Requesting 12 seconds of new work
7/5/2007 4:10:54 PM|rosetta@home|Scheduler RPC succeeded [server version 509]
7/5/2007 4:10:54 PM|rosetta@home|Deferring communication for 4 min 2 sec
7/5/2007 4:10:54 PM|rosetta@home|Reason: requested by project
7/5/2007 4:10:56 PM|rosetta@home|[file_xfer] Started download of file nterm_hom001_t321_.fasta.gz
7/5/2007 4:10:56 PM|rosetta@home|[file_xfer] Started download of file nterm_hom001_t321_.psipred_ss2.gz
7/5/2007 4:10:59 PM||Project communication failed: attempting access to reference site
7/5/2007 4:10:59 PM|rosetta@home|[file_xfer] Temporarily failed download of nterm_hom001_t321_.fasta.gz: system connect
7/5/2007 4:10:59 PM|rosetta@home|[file_xfer] Temporarily failed download of nterm_hom001_t321_.psipred_ss2.gz: system connect
7/5/2007 4:10:59 PM|rosetta@home|[file_xfer] Started download of file boinc_nterm_hom001_aat321_03_05.200_v1_3.gz
7/5/2007 4:10:59 PM|rosetta@home|[file_xfer] Started download of file boinc_nterm_hom001_aat321_09_05.200_v1_3.gz
7/5/2007 4:11:01 PM||Access to reference site succeeded - project servers may be temporarily down.
7/5/2007 4:11:02 PM||Project communication failed: attempting access to reference site
7/5/2007 4:11:02 PM|rosetta@home|[file_xfer] Temporarily failed download of boinc_nterm_hom001_aat321_03_05.200_v1_3.gz: system connect
7/5/2007 4:11:02 PM|rosetta@home|[file_xfer] Temporarily failed download of boinc_nterm_hom001_aat321_09_05.200_v1_3.gz: system connect
7/5/2007 4:11:02 PM|rosetta@home|[file_xfer] Started download of file nterm_hom001_t321.pdb.gz
7/5/2007 4:11:03 PM||Access to reference site succeeded - project servers may be temporarily down.
7/5/2007 4:11:03 PM|rosetta@home|[file_xfer] Started download of file nterm_hom001_t321_.fasta.gz
7/5/2007 4:11:05 PM||Project communication failed: attempting access to reference site
7/5/2007 4:11:05 PM|rosetta@home|[file_xfer] Finished download of file nterm_hom001_t321_.fasta.gz
7/5/2007 4:11:05 PM|rosetta@home|[file_xfer] Throughput 402 bytes/sec
7/5/2007 4:11:05 PM|rosetta@home|[file_xfer] Started download of file nterm_hom001_t321_.psipred_ss2.gz
7/5/2007 4:11:05 PM|rosetta@home|[file_xfer] Temporarily failed download of nterm_hom001_t321.pdb.gz: system connect
7/5/2007 4:11:06 PM||Access to reference site succeeded - project servers may be temporarily down.
7/5/2007 4:11:06 PM|rosetta@home|[file_xfer] Finished download of file nterm_hom001_t321_.psipred_ss2.gz
7/5/2007 4:11:06 PM|rosetta@home|[file_xfer] Throughput 3420 bytes/sec
7/5/2007 4:11:06 PM|rosetta@home|[file_xfer] Started download of file boinc_nterm_hom001_aat321_03_05.200_v1_3.gz
7/5/2007 4:11:06 PM|rosetta@home|[file_xfer] Started download of file boinc_nterm_hom001_aat321_09_05.200_v1_3.gz
7/5/2007 4:11:10 PM|rosetta@home|[file_xfer] Finished download of file boinc_nterm_hom001_aat321_09_05.200_v1_3.gz
7/5/2007 4:11:10 PM|rosetta@home|[file_xfer] Throughput 80298 bytes/sec
7/5/2007 4:11:10 PM|rosetta@home|[file_xfer] Started download of file nterm_hom001_t321.pdb.gz
7/5/2007 4:11:12 PM|rosetta@home|[file_xfer] Finished download of file nterm_hom001_t321.pdb.gz
7/5/2007 4:11:12 PM|rosetta@home|[file_xfer] Throughput 32495 bytes/sec
7/5/2007 4:11:16 PM|rosetta@home|[file_xfer] Finished download of file boinc_nterm_hom001_aat321_03_05.200_v1_3.gz
7/5/2007 4:11:16 PM|rosetta@home|[file_xfer] Throughput 101298 bytes/sec
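A log like the one above can be summarized rather than read line by line. The sketch below assumes each message line has the form `timestamp|project|message`, as in the excerpt, and counts the "Temporarily failed download" entries per file. The function name `count_failed_downloads` and the constant `FAILED_PREFIX` are hypothetical helpers written for this illustration; they are not part of BOINC.

```python
from collections import Counter

# Message prefix copied from the log excerpt above.
FAILED_PREFIX = "[file_xfer] Temporarily failed download of "


def count_failed_downloads(log_text):
    """Count temporarily failed downloads per file name in a BOINC message log."""
    failures = Counter()
    for line in log_text.splitlines():
        # Expected layout: timestamp|project|message (project may be empty).
        parts = line.split("|", 2)
        if len(parts) != 3:
            continue
        _timestamp, _project, message = parts
        if message.startswith(FAILED_PREFIX):
            rest = message[len(FAILED_PREFIX):]
            # Drop the trailing ": reason" part, e.g. ": system connect".
            file_name, _, _reason = rest.rpartition(": ")
            failures[file_name or rest] += 1
    return failures
```

Running it over the saved message text would show which files failed and how often, which can be easier to report than the raw log.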



Rhiju
Volunteer moderator

Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 43285 - Posted: 6 Jul 2007, 0:55:19 UTC - in response to Message 42850.  

No, the workunits that greg_be posted about should be pretty standard workunits -- I don't think they're memory hungry and we've run that type before... so I'm afraid I don't have much advice here!


Rhiju? Is there any variation in these tasks as to some requesting large memory and some not?

I see now why you are saying that was an odd thing for BOINC to do. I had originally thought you were just confused about starting multiple tasks. You've got much more to your picture.

If the project sends out a large memory task, I believe the BOINC client knows that, and so it may have skipped a few of those in favor of a lower memory task on the later due date. Often the task names will be similar enough that they look the same.


Rhiju
Volunteer moderator

Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 43286 - Posted: 6 Jul 2007, 0:58:03 UTC - in response to Message 43030.  

Hi Jon and Mod.Sense -- weird, this looks like a system-specific issue. Might be worth restarting BOINC (or even the computer). I wonder if Jon is running something else in parallel that's asking a lot from his hard drive?

Jon seems to have this error frequently. Windows XP home SP1. Generally within 30 seconds of starting a task. Some tasks run fine, others get the:
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
link to Jon's results page


Rhiju
Volunteer moderator

Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 43287 - Posted: 6 Jul 2007, 1:01:21 UTC - in response to Message 42826.  

We've had a lot of trouble getting graphics working on Linux boxes, and that may not change... right now, our priorities are to get an optimized version for 64-bit Windows (and Linux!) so that Rosetta can get the most out of your machines. We're getting a test machine in the next few weeks...


ok... here's a post with no probs !!

:-)

Except no graphics

The Cruncher

Mem now fully popped (2gb)

Graphics Ati x800 (512mb)

Os = debian 64 / Dual boot with win XP that ONLY gets used for NLE


More graphics PRETTY please

Oh.. and a port for AmigaOs4 (PPC)

:-)


Evan

Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 43318 - Posted: 6 Jul 2007, 16:04:08 UTC

This one:

1vcc__BOINC_ABRELAX_HB1.5_SAVE_ALL_OUT_BARCODE-1vcc_-frags83__1824_793_0 has stalled twice at step 29043

resultid 90824783
Marky-UK

Joined: 1 Nov 05
Posts: 73
Credit: 1,689,495
RAC: 0
Message 43323 - Posted: 6 Jul 2007, 19:21:28 UTC

Just had this failure, the first one ever on this host I think: 91030536

<core_client_version>5.10.7</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
# random seed: 1086965
No heartbeat from core client for 31 sec - exiting
SIGSEGV: segmentation violation
Stack trace (13 frames):
[0x8cf2edb]
[0x8cedd0c]
[0xffffe420]
[0x8c5d2db]
[0x8b63872]
[0x8c44d10]
[0x849ba9e]
[0x80dae11]
[0x85c8e17]
[0x86f632b]
[0x86f63d6]
[0x8d56dd4]
[0x8048111]

Exiting...

</stderr_txt>
]]>
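A note on the three numbers in "process exited with code 193 (0xc1, -63)": they appear to be the same value shown three ways, as an unsigned byte, in hexadecimal, and as a signed 8-bit integer. The sketch below reproduces that arithmetic. The function name `describe_exit_status` is hypothetical, and that BOINC formats the message this way is an assumption based only on the numbers being consistent.

```python
def describe_exit_status(status):
    """Render an exit status as 'unsigned (hex, signed 8-bit)'."""
    low_byte = status & 0xFF  # keep only the low 8 bits
    # Two's-complement reading: byte values 128..255 map to -128..-1.
    signed = low_byte - 256 if low_byte >= 128 else low_byte
    return f"{low_byte} (0x{low_byte:x}, {signed})"


print(describe_exit_status(193))  # 193 (0xc1, -63)
```

So -63 is not a separate error code; it is 193 - 256.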
LBobrycki

Joined: 6 Jun 06
Posts: 1
Credit: 16,756
RAC: 0
Message 43363 - Posted: 8 Jul 2007, 6:40:22 UTC

I have been having trouble since the beginning with computer lockups. I downloaded the program over a year ago and want to participate but I can't keep cold-starting my computer when Boinc freezes. I lose all of my open files. I estimate that it has happened over 3 dozen times. What is going on?

Lori Bobrycki
lbobrycki@yahoo.com
FoX

Joined: 11 Dec 05
Posts: 4
Credit: 2,304,747
RAC: 0
Message 43415 - Posted: 9 Jul 2007, 10:03:55 UTC

Hi everyone

I am taking part in the project with my PC: a P3 500 Katmai, 192 MB RAM, Linux 2.6.21.
Using the Rosetta beta 5.70 version, I found a bug:
CPU time does not show the correct timing: the counter adds one second every 6-7 real seconds.
Work units last a long time (12-13 hours), and the time shown is wrong. This does not happen with version 5.68, in which WUs last about 2-3 hours.


Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 43417 - Posted: 9 Jul 2007, 10:33:40 UTC
Last modified: 9 Jul 2007, 16:56:52 UTC

LBobrycki: You are still using the same version of BOINC that you probably downloaded a year ago. There have been many improvements made to the BOINC environment (which Rosetta is running within) since that time. The current BOINC version should be more stable for you.

Here are some Q&A items to get you started with further info:
How do I know what version of BOINC I am running?
How can I check what the current BOINC version available is?
How do I install a new version of BOINC without losing my existing work?

It's all much easier than it sounds when you write it all out.

FoX: Nothing in the Rosetta version has changed the target runtimes. But each type of task runs differently, and some take a long time to complete a single model. The estimated time to completion is updated every 5 seconds. The twitches in between are BOINC making calculations based on the CPU time increasing as compared to the % completed, which is also only updated every 5 seconds.
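To make the relationship between CPU time and % completed concrete, here is a minimal sketch of one simple way a remaining-time figure could be derived from those two numbers. This is a plain proportional estimate chosen for illustration; it is an assumption, not BOINC's actual formula, and the function name `estimate_remaining_seconds` is hypothetical.

```python
from typing import Optional


def estimate_remaining_seconds(cpu_seconds: float,
                               fraction_done: float) -> Optional[float]:
    """Proportional estimate: time left = time spent * (work left / work done)."""
    if fraction_done <= 0.0 or fraction_done > 1.0:
        return None  # no progress reported yet, or an invalid value
    return cpu_seconds * (1.0 - fraction_done) / fraction_done


# One hour of CPU time at 25% done suggests three more hours.
print(estimate_remaining_seconds(3600.0, 0.25))  # 10800.0
```

With an estimate of this kind, CPU time keeps growing while the % completed only changes at each update, so the figure drifts between updates and then jumps, which would look like the twitches described above.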
Rosetta Moderator: Mod.Sense
FoX

Joined: 11 Dec 05
Posts: 4
Credit: 2,304,747
RAC: 0
Message 43430 - Posted: 9 Jul 2007, 16:39:53 UTC

Thank you for your message, Mod.Sense, but I think I did not explain the situation clearly.
I am referring to the "CPU time" bar, not the "To completion" bar (I know that "To completion" is an estimated time).
The WUs on Rosetta 5.70 beta show a "CPU time" of "X", but the true time is about "6*X".
This doesn't happen in Rosetta 5.68.
Is this normal?
Thank you, bye

Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 43431 - Posted: 9 Jul 2007, 16:56:32 UTC
Last modified: 9 Jul 2007, 17:02:46 UTC

Your CPU time is an indication of the amount of CPU time that has been spent processing the task. If your CPU time only adds one second every 5 seconds then it means your PC was busy working on other applications and only had enough low priority CPU time left to give 1 second to BOINC.

...that or there is a BOINC setting for the % of CPU that you want BOINC to utilize. If you had it set to 20% and had no other applications running on your PC, that would cause the same 1 second of CPU time per 5 seconds of time passing. This setting is in your General Preferences. It is generally used to control heat output of the CPU. For that purpose, you will probably find that a setting of 70-80% will reduce heat levels considerably.
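The arithmetic behind the 20% example can be written out as a small sketch. It assumes the task is the only load and always uses its full allowance, which is a simplification; the function names `cpu_seconds_accrued` and `wall_seconds_per_cpu_second` are hypothetical and not BOINC settings or APIs.

```python
def cpu_seconds_accrued(wall_seconds, cpu_usage_limit_pct):
    """CPU time credited over a wall-clock interval under a usage cap.

    Simplifying assumption: the task uses its whole allowance, so
    CPU time = wall time * (limit / 100).
    """
    if not 0 <= cpu_usage_limit_pct <= 100:
        raise ValueError("limit must be between 0 and 100 percent")
    return wall_seconds * cpu_usage_limit_pct / 100.0


def wall_seconds_per_cpu_second(cpu_usage_limit_pct):
    """Real seconds that pass for each second of CPU time at a given cap."""
    if not 0 < cpu_usage_limit_pct <= 100:
        raise ValueError("limit must be above 0 and at most 100 percent")
    return 100.0 / cpu_usage_limit_pct


print(cpu_seconds_accrued(5, 20))        # 1.0
print(wall_seconds_per_cpu_second(20))   # 5.0
```

By the same arithmetic, the 6-7 real seconds per CPU second reported above would correspond to roughly 15% of the CPU reaching the task, whether from a cap or from competing load.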

[edit] FoX, it looks like your Linux box only has 187MB of memory? You may have hit one of the tasks that requires more memory than most, and your machine may be spending considerable time handling page faults. It is also possible that the task is suspended by BOINC with a status of "waiting for memory".
Rosetta Moderator: Mod.Sense

©2024 University of Washington
https://www.bakerlab.org