Report Problems with Rosetta Version 5.07

Message boards : Number crunching : Report Problems with Rosetta Version 5.07

Moderator9
Volunteer moderator

Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 14922 - Posted: 28 Apr 2006, 22:56:29 UTC
Last modified: 12 May 2006, 20:39:55 UTC


Version 5.07 has been released.

This version contains a minor fix for run times exceeding your time preference setting.

Please do not abort your current 5.01 or 5.06 Work Units if they are running well. The science from them is still important to the project. Even the long-running 5.06 Work Units should complete after finishing a predetermined number of models (I suspect it is 10). If a Work Unit has run longer than 4 times your preferences "Time" setting, then consider aborting it.
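
As a rough illustration of the "4 times" rule of thumb, here is a minimal Python sketch. The function name, parameter names, and the 4.0 default are assumptions made for this example; nothing here is part of BOINC or Rosetta itself.

```python
def should_consider_abort(cpu_time_s: float, run_time_pref_s: float, factor: float = 4.0) -> bool:
    """Return True once CPU time is more than `factor` times the run-time preference.

    Hypothetical helper: it only encodes the rule of thumb from this post.
    """
    if run_time_pref_s <= 0:
        raise ValueError("run_time_pref_s must be positive")
    return cpu_time_s > factor * run_time_pref_s


# Example: a 3-hour preference (10800 s) gives a 12-hour (43200 s) threshold.
print(should_consider_abort(cpu_time_s=13 * 3600, run_time_pref_s=10800))  # True
print(should_consider_abort(cpu_time_s=6 * 3600, run_time_pref_s=10800))   # False
```

The CPU time would be read by hand from the tasks tab; the sketch only does the arithmetic.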

Many of you have asked if credit will be awarded for failed Work units, and the answer is YES. The credit is awarded on Fridays for failed Work Units.

For errors resulting from Rosetta Version 5.01, continue to report here.
For errors relating to Rosetta Version 5.06 report here.
For errors related to Version 5.07 use this thread.

For information on the new Version and what it is supposed to do see this post.

For a message from Dr. Baker about the work unit runs in preparation for CASP see his journal entry here.



Moderator9
ID: 14922 · Rating: 0
senatoralex85
Joined: 27 Sep 05
Posts: 66
Credit: 169,644
RAC: 0
Message 14931 - Posted: 29 Apr 2006, 0:25:06 UTC - in response to Message 14922.  

4/28/2006 7:20:57 PM|rosetta@home|Unrecoverable error for result HBLR_1.0_1mky_ROT_TRIALS_TRIE_462_1089_0 (WU download error: couldn't get input files:<file_xfer_error> <file_name>aa1mkyA03_05.400_v1_3.gz</file_name> <error_code>-200</error_code> <error_message></error_message></file_xfer_error>)
4/28/2006 7:20:58 PM|rosetta@home|Deferring communication with project for 3 minutes and 41 seconds
4/28/2006 7:22:25 PM|rosetta@home|Couldn't delete file projects/boinc.bakerlab.org_rosetta/aa1mkyA09_05.400_v1_3.gz
4/28/2006 7:22:34 PM|rosetta@home|Message from server: Not sending work - last RPC too recent: 17 sec
4/28/2006 7:22:34 PM|rosetta@home|No work from project
4/28/2006 7:22:35 PM|rosetta@home|Deferring communication with project for 4 minutes and 1 seconds

A few files downloaded fine. All of a sudden I got a download error. My internet connection is working fine and I have no problems downloading from any other project.
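
For anyone who wants to pull the failing file names out of a long message log before posting, the following is a minimal Python sketch. It assumes the `timestamp|project|message` layout and the `<file_name>`/`<error_code>` tags seen in the lines above; the function names are made up for this example and are not part of BOINC.

```python
import re
from datetime import datetime

# Matches the <file_name>...</file_name> <error_code>...</error_code> pair in a message.
FILE_ERROR = re.compile(
    r"<file_name>(?P<name>[^<]+)</file_name>\s*<error_code>(?P<code>-?\d+)</error_code>"
)


def parse_line(line):
    """Split 'timestamp|project|message' into (datetime, project, message)."""
    stamp, project, message = line.split("|", 2)
    when = datetime.strptime(stamp.strip(), "%m/%d/%Y %I:%M:%S %p")
    return when, project, message


def transfer_errors(lines):
    """Yield (timestamp, file name, error code) for each file transfer error."""
    for line in lines:
        if line.count("|") < 2:
            continue  # skip blank lines and anything that is not a log line
        when, _project, message = parse_line(line)
        for match in FILE_ERROR.finditer(message):
            yield when, match.group("name"), int(match.group("code"))


sample = (
    "4/28/2006 7:20:57 PM|rosetta@home|Unrecoverable error for result X "
    "(WU download error: couldn't get input files:<file_xfer_error> "
    "<file_name>aa1mkyA03_05.400_v1_3.gz</file_name> <error_code>-200</error_code> "
    "<error_message></error_message></file_xfer_error>)"
)
for when, name, code in transfer_errors([sample]):
    print(when.isoformat(), name, code)
```

The timestamp format string assumes an English locale for the AM/PM marker.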
ID: 14931 · Rating: 0
Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 14933 - Posted: 29 Apr 2006, 0:30:12 UTC - in response to Message 14931.  
Last modified: 29 Apr 2006, 0:30:52 UTC

This could just be a temporary network error. We all get them from time to time. You will have to wait for the fallback time to pass. If it happens again, assuming your queue is empty, you could try resetting the project from the Projects tab.

My systems are getting new work just fine, and there is work available.
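
To see how long the client is holding off before it contacts the project again, the "Deferring communication" messages can be converted to seconds. This is a small Python sketch that only handles the "N minutes and M seconds" wording seen in this thread; other wordings may exist, and the function name is an assumption for illustration.

```python
import re

# Only the "N minutes and M seconds" form seen in this thread is handled.
DEFER = re.compile(
    r"Deferring communication with project for (\d+) minutes and (\d+) seconds"
)


def deferral_seconds(message):
    """Return the deferral in seconds, or None if the message is something else."""
    match = DEFER.search(message)
    if match is None:
        return None
    minutes, seconds = int(match.group(1)), int(match.group(2))
    return minutes * 60 + seconds


print(deferral_seconds(
    "4/28/2006 7:20:58 PM|rosetta@home|Deferring communication with project for 3 minutes and 41 seconds"
))  # 221
```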

Moderator9
ID: 14933 · Rating: 0
senatoralex85
Joined: 27 Sep 05
Posts: 66
Credit: 169,644
RAC: 0
Message 14935 - Posted: 29 Apr 2006, 0:34:42 UTC

4/28/2006 7:30:35 PM|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
4/28/2006 7:30:35 PM|rosetta@home|Requesting 8640 seconds of work, returning 0 results
4/28/2006 7:30:36 PM|rosetta@home|Scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi succeeded
4/28/2006 7:30:38 PM|rosetta@home|Started download of rosetta_5.07_windows_intelx86.exe
4/28/2006 7:30:38 PM|rosetta@home|Started download of 1di2_.psipred_ss2.gz
4/28/2006 7:30:38 PM|rosetta@home|Unrecoverable error for result HBLR_1.0_1di2_ROT_TRIALS_TRIE_461_1133_0 (app_version download error: couldn't get input files:<file_xfer_error> <file_name>rosetta_5.07_windows_intelx86.exe</file_name> <error_code>-200</error_code> <error_message></error_message></file_xfer_error>)
4/28/2006 7:30:39 PM|rosetta@home|Deferring communication with project for 3 minutes and 59 seconds
4/28/2006 7:30:39 PM|rosetta@home|Finished download of 1di2_.psipred_ss2.gz
4/28/2006 7:30:39 PM|rosetta@home|Throughput 2535 bytes/sec
4/28/2006 7:30:39 PM|rosetta@home|Started download of frags400.txt
4/28/2006 7:30:41 PM|rosetta@home|Finished download of frags400.txt
4/28/2006 7:30:41 PM|rosetta@home|Throughput 819 bytes/sec
4/28/2006 7:30:41 PM|rosetta@home|Started download of 1di2.pdb.gz
4/28/2006 7:30:45 PM|rosetta@home|Finished download of 1di2.pdb.gz
4/28/2006 7:30:45 PM|rosetta@home|Throughput 2811 bytes/sec
4/28/2006 7:30:45 PM|rosetta@home|Started download of 1di2_.fasta
4/28/2006 7:30:46 PM|rosetta@home|Finished download of 1di2_.fasta
4/28/2006 7:30:46 PM|rosetta@home|Throughput 277 bytes/sec
4/28/2006 7:30:46 PM|rosetta@home|Started download of aa1di2_09_05.400_v1_3.gz


I got this error again. I included more information. Does this help at all?
ID: 14935 · Rating: 0
Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 14937 - Posted: 29 Apr 2006, 0:37:56 UTC - in response to Message 14935.  

It might. Let me draw additional attention to this. Hang on, we are with you.

Moderator9
ID: 14937 · Rating: 0
senatoralex85
Joined: 27 Sep 05
Posts: 66
Credit: 169,644
RAC: 0
Message 14939 - Posted: 29 Apr 2006, 0:48:52 UTC
Last modified: 29 Apr 2006, 0:52:52 UTC

4/28/2006 7:42:00 PM|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
4/28/2006 7:42:00 PM|rosetta@home|Requesting 8640 seconds of work, returning 0 results
4/28/2006 7:42:02 PM|rosetta@home|Scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi succeeded
4/28/2006 7:42:03 PM|rosetta@home|Started download of 1mkyA.psipred_ss2.gz
4/28/2006 7:42:03 PM|rosetta@home|Started download of 1mky.pdb.gz
4/28/2006 7:42:04 PM|rosetta@home|Finished download of 1mkyA.psipred_ss2.gz
4/28/2006 7:42:04 PM|rosetta@home|Throughput 3300 bytes/sec
4/28/2006 7:42:04 PM|rosetta@home|Finished download of 1mky.pdb.gz
4/28/2006 7:42:04 PM|rosetta@home|Throughput 20776 bytes/sec
4/28/2006 7:42:04 PM|rosetta@home|Started download of aa1mkyA03_05.400_v1_3.gz
4/28/2006 7:42:05 PM|rosetta@home|File aa1mkyA09_05.400_v1_3.gz exists already, skipping download
4/28/2006 7:42:05 PM|rosetta@home|Started download of frags400.txt
4/28/2006 7:42:07 PM|rosetta@home|Finished download of frags400.txt
4/28/2006 7:42:07 PM|rosetta@home|Throughput 519 bytes/sec
4/28/2006 7:42:07 PM|rosetta@home|Started download of 1mkyA.fasta
4/28/2006 7:42:09 PM|rosetta@home|Finished download of 1mkyA.fasta
4/28/2006 7:42:09 PM|rosetta@home|Throughput 68 bytes/sec
4/28/2006 7:42:25 PM|rosetta@home|Finished download of aa1mkyA03_05.400_v1_3.gz
4/28/2006 7:42:25 PM|rosetta@home|Throughput 72336 bytes/sec
4/28/2006 7:42:25 PM||request_reschedule_cpus: files downloaded
4/28/2006 7:42:25 PM|rosetta@home|CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
4/28/2006 7:42:26 PM|rosetta@home|CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
4/28/2006 7:42:26 PM|rosetta@home|CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
4/28/2006 7:42:27 PM|rosetta@home|CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
4/28/2006 7:42:27 PM|rosetta@home|CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
4/28/2006 7:42:27 PM|rosetta@home|Unrecoverable error for result HBLR_1.0_1mky_ROT_TRIALS_TRIE_461_1180_0 (CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20))
4/28/2006 7:42:27 PM||request_reschedule_cpus: start failed
4/28/2006 7:42:27 PM|rosetta@home|Deferring communication with project for 3 minutes and 37 seconds
4/28/2006 7:42:27 PM|rosetta@home|Computation for result HBLR_1.0_1mky_ROT_TRIALS_TRIE_461_1180_0 finished
4/28/2006 7:43:35 PM|LHC@home|Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
4/28/2006 7:43:35 PM|LHC@home|Requesting 8640 seconds of work, returning 0 results
4/28/2006 7:43:36 PM|LHC@home|Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
4/28/2006 7:43:36 PM|LHC@home|No work from project
4/28/2006 7:43:37 PM|LHC@home|Deferring communication with project for 4 minutes and 40 seconds
4/28/2006 7:46:05 PM|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
4/28/2006 7:46:05 PM|rosetta@home|Requesting 8640 seconds of work, returning 1 results
4/28/2006 7:46:06 PM|rosetta@home|Scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi succeeded
4/28/2006 7:46:08 PM|rosetta@home|Couldn't delete file projects/boinc.bakerlab.org_rosetta/aa1mkyA09_05.400_v1_3.gz
4/28/2006 7:46:08 PM|rosetta@home|Started download of aa1n0u_03_05.400_v1_3.gz
4/28/2006 7:46:08 PM|rosetta@home|Started download of aa1n0u_09_05.400_v1_3.gz
4/28/2006 7:46:38 PM|rosetta@home|Finished download of aa1n0u_03_05.400_v1_3.gz
4/28/2006 7:46:38 PM|rosetta@home|Throughput 41669 bytes/sec
4/28/2006 7:46:38 PM|rosetta@home|Started download of 1n0u_.fasta
4/28/2006 7:46:40 PM|rosetta@home|Finished download of 1n0u_.fasta
4/28/2006 7:46:40 PM|rosetta@home|Throughput 52 bytes/sec
4/28/2006 7:46:40 PM|rosetta@home|Started download of 1n0u_.psipred_ss2.gz
4/28/2006 7:46:42 PM|rosetta@home|Finished download of 1n0u_.psipred_ss2.gz
4/28/2006 7:46:42 PM|rosetta@home|Throughput 456 bytes/sec
4/28/2006 7:46:42 PM|rosetta@home|Started download of 1n0u.pdb.gz
4/28/2006 7:46:46 PM|rosetta@home|Finished download of 1n0u.pdb.gz
4/28/2006 7:46:46 PM|rosetta@home|Throughput 2252 bytes/sec
4/28/2006 7:47:03 PM|rosetta@home|Finished download of aa1n0u_09_05.400_v1_3.gz
4/28/2006 7:47:03 PM|rosetta@home|Throughput 58943 bytes/sec
4/28/2006 7:47:03 PM||request_reschedule_cpus: files downloaded
4/28/2006 7:47:04 PM|rosetta@home|CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
4/28/2006 7:47:04 PM|rosetta@home|CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
4/28/2006 7:47:04 PM|rosetta@home|CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
4/28/2006 7:47:05 PM|rosetta@home|CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
4/28/2006 7:47:05 PM|rosetta@home|CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20)
4/28/2006 7:47:06 PM|rosetta@home|Unrecoverable error for result HBLR_1.0_1n0u_ROT_TRIALS_TRIE_462_1198_0 (CreateProcess() failed - The process cannot access the file because it is being used by another process. (0x20))
4/28/2006 7:47:06 PM||request_reschedule_cpus: start failed
4/28/2006 7:47:06 PM|rosetta@home|Deferring communication with project for 3 minutes and 1 seconds
4/28/2006 7:47:06 PM|rosetta@home|Computation for result HBLR_1.0_1n0u_ROT_TRIALS_TRIE_462_1198_0 finished


Sorry for the long post. These are all of the errors I am getting from the last work unit. I have never seen them before. Let me know if you need more information and I will do my best to get it to you.

****Edit****

I can detach and reattach if you would like. I have not tried that yet.
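
A log like the one above mixes several kinds of failure, so it can help to list each failed result with its reason. The Python sketch below is an illustration only: it assumes the "Unrecoverable error for result NAME (REASON)" wording shown in these logs, and the function name is made up. For reference, the 0x20 in the CreateProcess() messages is decimal 32, which is the Windows sharing-violation error code.

```python
import re

# Captures the result name and the parenthesised reason that follows it.
UNRECOVERABLE = re.compile(
    r"Unrecoverable error for result (?P<result>\S+) \((?P<reason>.*)\)\s*$"
)


def failed_results(lines):
    """Map each failed result name to the reason reported for it."""
    failures = {}
    for line in lines:
        match = UNRECOVERABLE.search(line)
        if match:
            failures[match.group("result")] = match.group("reason")
    return failures


log = [
    "4/28/2006 7:42:27 PM|rosetta@home|Unrecoverable error for result "
    "HBLR_1.0_1mky_ROT_TRIALS_TRIE_461_1180_0 (CreateProcess() failed - The process "
    "cannot access the file because it is being used by another process. (0x20))",
    "4/28/2006 7:42:27 PM||request_reschedule_cpus: start failed",
]
for result, reason in failed_results(log).items():
    print(result, "->", reason)
```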
ID: 14939 · Rating: 0
Rhiju
Volunteer moderator
Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 14940 - Posted: 29 Apr 2006, 0:52:17 UTC - in response to Message 14937.  

Hi, thanks for posting. Every time we change the application, our file servers and database servers get hammered as everyone downloads it, so not all download requests finish. This also happened briefly yesterday with the 5.06 release. Hang on for the next couple of work units -- things will stabilize in about 12-24 hours. Please do let us know if this continues to happen next week.

ID: 14940 · Rating: 0
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 15016 - Posted: 29 Apr 2006, 16:55:39 UTC
Last modified: 29 Apr 2006, 16:57:20 UTC

Computation error on wuid=15371801. 5.07 HBLR_1.0_1mky_ROT_TRIALS_TRIE_462_4126

<core_client_version>5.4.7</core_client_version>
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
# random seed: 1932935
# cpu_run_time_pref: 10800


BOINC Windows Runtime Debugger Version 5.5.0

Dump Timestamp : 04/29/06 12:32:11
Debugger Engine : 4.0.5.0

*** UNHANDLED EXCEPTION ****
Reason: Access Violation (0xc0000005) at address 0x7C910F29 read attempt to address 0x00000000

*** Dump of the Worker(offending) thread: ***
eax=01544690 ebx=014f0000 ecx=00000000 edx=00000000 esi=01544688 edi=014ef710
eip=7c910f29 esp=062cdff0 ebp=062cdffc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246

ChildEBP RetAddr Args to Child
062cdffc 7c910d5c 0096a867 00000000 062ce0b4 00000000 ntdll!_RtlpCoalesceFreeBlocks@16+0x0
062ce0d0 0095e329 014f0000 00000000 014f0190 987cb48e ntdll!_RtlFreeHeap@12+0x0
062ce110 0040451b 014f0190 987cb4be 00000001 062ce3c0 rosetta_5.07_windows_intelx86!+0x0
062ce138 0074f967 987cb4da 00a31960 00a318d4 00000000 rosetta_5.07_windows_intelx86!+0x0
00000000 00000000 00000000 00000000 00000000 00000000 rosetta_5.07_windows_intelx86!+0x0

*** Dump of the Timer thread: ***
eax=7af9bb60 ebx=00000000 ecx=00000015 edx=0a7570ce esi=00000001 edi=00252620
eip=7c90eb94 esp=063cff08 ebp=063cffb4
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000296

ChildEBP RetAddr Args to Child
063cff04 7c90e9ab 76b5af21 00000002 063cff6c 00000001 ntdll!_KiFastSystemCallRet@0+0x0 FPO: [0,0,0]
063cff08 76b5af21 00000002 063cff6c 00000001 00000001 ntdll!_ZwWaitForMultipleObjects@20+0x0 FPO: [5,0,0]
063cffb4 7c80b50b 00000000 00252620 0015ba58 00000000 WINMM!_timeThread@4+0x0
063cffec 00000000 76b5aee7 00000000 00000000 06680000 kernel32!_BaseThreadStart@8+0x0

*** Dump of the Graphics thread: ***
eax=01c66baa ebx=77d496b8 ecx=3fc33d50 edx=01c66baa esi=0012ee2c edi=77d48bf6
eip=7c90eb94 esp=0012ede0 ebp=0012ee04
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246

ChildEBP RetAddr Args to Child
0012eddc 77d491be 77d51082 0012ee2c 00000000 00000000 ntdll!_KiFastSystemCallRet@0+0x0 FPO: [0,0,0]
0012ee04 00948b4f 0012ee2c 00000000 00000000 00000000 USER32!_NtUserGetMessage@16+0x0
0012ffc0 7c816d4f 00000000 000000b4 7ffd9000 80543dfd rosetta_5.07_windows_intelx86!+0x0
0012fff0 00000000 00966e4d 00000000 78746341 00000020 kernel32!_BaseProcessStart@4+0x0

Exiting...

</stderr_txt>


Validate state Invalid
Claimed credit 36.2380255388419
Granted credit 0
application version 5.07
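
The two numbers in the "exit code" line are the same value: the negative decimal is the signed 32-bit reading of the hexadecimal code. A minimal Python sketch of the conversion follows; the function name is an assumption for illustration.

```python
def exit_code_to_hex(exit_code: int) -> str:
    """Show a signed 32-bit exit code as the unsigned hex value Windows reports."""
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


print(exit_code_to_hex(-1073741819))  # 0xc0000005
```

0xc0000005 is the Windows access violation status, which matches the "Access Violation" reason in the debugger dump.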
ID: 15016 · Rating: 0
Kevin
Joined: 15 Jan 06
Posts: 21
Credit: 109,496
RAC: 0
Message 15061 - Posted: 30 Apr 2006, 3:01:26 UTC

Computation error on wuid=18589228

<core_client_version>5.3.12.tx36</core_client_version>
<message>The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
# random seed: 1933787
# cpu_run_time_pref: 21600

ID: 15061 · Rating: 0
cMw
Joined: 24 Apr 06
Posts: 9
Credit: 14,036
RAC: 0
Message 15066 - Posted: 30 Apr 2006, 4:04:53 UTC

I don't know if I'm getting an error or what, but I can't seem to complete a WU recently. While I was away today they were completing fine, but now that I'm getting new ones they seem to get stuck at a random percentage, never going over 10.
If you look at the past few results for this computer (https://boinc.bakerlab.org/rosetta/results.php?hostid=210073), maybe someone can help. Also, would it be possible for anyone to set all of my preferences back to default? I think that might have something to do with it :x
ID: 15066 · Rating: 0
Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 15067 - Posted: 30 Apr 2006, 4:18:33 UTC - in response to Message 15066.  

Something must have changed on your system. It seems to have been running fine, and then every work unit past a certain time started failing. Did you perform any system maintenance? Any software installs or upgrades? Firewall changes?

As for resetting your preferences to the defaults, there does not seem to be a "button" for that on Rosetta as there is on some of the other projects, so there really is no simple way to do that. But I don't think what you are seeing is from your preference settings. You might try resetting the project. While fairly severe, it sometimes fixes a lot of strange things like what you are seeing.

Moderator9
ID: 15067 · Rating: 0
Rhiju
Volunteer moderator
Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 15068 - Posted: 30 Apr 2006, 4:47:21 UTC - in response to Message 15066.  

Hi cMw. It looks like you had a string of error jobs a few days ago as well, and then somehow things stabilized. The -197 error is on our list of major remaining errors to trace -- see this link. It would help us a tremendous amount if you could attach your problem computer to ralph as well as r@h, if you haven't already. With ralph, we can get more backtrace information. With your computer we might have a consistent source of this error. Can you also try Moderator9's suggestion of resetting the project, and post if that helps or hurts?


ID: 15068 · Rating: 0
cMw
Joined: 24 Apr 06
Posts: 9
Credit: 14,036
RAC: 0
Message 15071 - Posted: 30 Apr 2006, 5:51:24 UTC - in response to Message 15067.  
Last modified: 30 Apr 2006, 5:51:52 UTC

Not gonna lie, I did change my BIOS settings around, as I have been having problems with CPU/RAM for the past 2 weeks (Rosetta isn't involved in this at all), but I had set everything back to stock in my BIOS. Right now the one WU I am doing seems to be going OK; it's just that the progress is weird and jumps from like 7% to like 24% and doesn't update constantly. But whatever, as long as I get the credit.


Also, Rhiju, how do I attach myself to ralph? Can you gimme the project link?
ID: 15071 · Rating: 0
anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 15072 - Posted: 30 Apr 2006, 5:54:46 UTC - in response to Message 15071.  

can you gimme the project link?



Here it is.

http://ralph.bakerlab.org/

Anders n

ID: 15072 · Rating: 0
blackbird
Joined: 4 Nov 05
Posts: 15
Credit: 93,414
RAC: 0
Message 15089 - Posted: 30 Apr 2006, 17:25:32 UTC

It looks like my Rosetta has problems with this WU:
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=14208762

The application has stopped at 1.19%. My computer has SUSE Linux 9.2 installed; details: https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=54409



Stderr.log:

# random seed: 3264160
# cpu_run_time_pref: 86400
SIGSEGV: segmentation violation
Stack trace (18 frames):
[0x8822533]
[0x883adec]
[0xffffe420]
[0x88bb810]
[0x88bd0c9]
[0x888be07]
[0x888e1f1]
[0x84bfcb8]
[0x84c08a3]
[0x84cf4e5]
[0x84d1325]
[0x87c0393]
[0x869f41a]
[0x86a1351]
[0x8487fd2]
[0x848b41b]
[0x889a2d4]
[0x8048111]

tail stdout.txt:

[T/F OPT]Default FALSE value for [-stringent_relax]
[REAL OPT]Default value for [-farlx_cycle_ratio] 1
CYCLES::number is 1 x total_residue: 155
[T/F OPT]Default FALSE value for [-more_relax_cycles]
initializing full atom coordinates
[T/F OPT]Default FALSE value for [-do_farlx_checkpointing]
starting score 1909.42151 rms 6.14713573
starting full atom minimization
[T/F OPT]Default FALSE value for [-infinite_loop]
[T/F OPT]Default FALSE value for [-relax_score_filter]
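
When comparing crash reports like the one above, it can help to pull the frame addresses out of the stderr text and check them against the declared frame count. The following Python sketch assumes the "Stack trace (N frames):" header and "[0x...]" lines shown in this post; the function name is made up, and the addresses are only comparable between reports from the same executable build.

```python
import re

HEADER = re.compile(r"Stack trace \((\d+) frames\)")
FRAME = re.compile(r"^\[0x([0-9a-fA-F]+)\]$")


def parse_stack_trace(text):
    """Return (declared frame count or None, list of frame addresses as ints)."""
    declared = None
    frames = []
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.search(line)
        if header:
            declared = int(header.group(1))
            continue
        frame = FRAME.match(line)
        if frame:
            frames.append(int(frame.group(1), 16))
    return declared, frames


stderr_text = """SIGSEGV: segmentation violationStack trace (3 frames):
[0x8822533]
[0x883adec]
[0xffffe420]
"""
declared, frames = parse_stack_trace(stderr_text)
print(declared, [hex(address) for address in frames])
```

If the declared count and the number of parsed frames differ, the paste was probably truncated.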

ID: 15089 · Rating: 0
tralala
Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 15094 - Posted: 30 Apr 2006, 18:24:38 UTC - in response to Message 15089.  

Those are very large WUs and some seem to get stuck. If you have Rosetta version 5.06 or 5.07, they should be aborted if stuck, but if you have 5.01 or lower you should abort immediately. Even with 5.06 or 5.07, I would abort if you have not finished the first model in 24 hours.
ID: 15094 · Rating: 0
Moderator9
Volunteer moderator
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 15095 - Posted: 30 Apr 2006, 18:29:59 UTC - in response to Message 15094.  

If the run time as shown by the CPU time in the tasks tab has exceeded 4 times your run time preference setting then abort the work unit.
Moderator9
ID: 15095 · Rating: 0
rbpeake
Joined: 25 Sep 05
Posts: 168
Credit: 247,828
RAC: 0
Message 15099 - Posted: 30 Apr 2006, 19:06:11 UTC - in response to Message 15095.  

If the run time as shown by the CPU time in the tasks tab has exceeded 4 times your run time preference setting then abort the work unit.

Just wanted to say thanks, guys, for keeping a close eye on things over the weekend, including Sunday. Knock on wood, looks like a very clean release! :)

Regards,
Bob P.
ID: 15099 · Rating: 0
Nightbird
Joined: 17 Sep 05
Posts: 70
Credit: 32,418
RAC: 0
Message 15102 - Posted: 30 Apr 2006, 20:20:48 UTC
Last modified: 30 Apr 2006, 20:30:08 UTC

First wu with 5.07
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=15414717

and a problem : the wu stopped at 33.22 % done.

(Edit: I don't wish to reboot the machine or restart BOINC immediately. Why? Because I'm also running uFluids@home on this machine and that application has no checkpoints; if needed, I will have to wait.)


ID: 15102 · Rating: 0
AMD_is_logical
Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 15104 - Posted: 30 Apr 2006, 21:44:48 UTC - in response to Message 15102.  

and a problem : the wu stopped at 33.22 % done.


It might just be a slow spot in the WU. Give it some time.

If it's really stuck, then the watchdog should get it. The watchdog then sends back a lot of information about the WU that is useful to the project.
ID: 15104 · Rating: 0


