Message boards : Number crunching : Report Problems with Rosetta Version 5.07
Author | Message |
---|---|
Jon C Melusky Joined: 29 Nov 05 Posts: 12 Credit: 192,743 RAC: 76 |
...I don't know what ralph is. I am attached to 4 of the other main BOINC projects. They run fine. Well, lately they have. (^: Thank you for the link to Ralph. Sadly, my system only has 384 ram and the min requirements of Ralph are 512 ram. I guess I have no choice but to scale back rosetta to 5% instead of 20%. Jonathan |
Moderator9 Volunteer moderator Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
...I don't know what ralph is. I am attached to 4 of the other main BOINC projects. They run fine. Well, lately they have. (^: Actually the basic requirements are the same for both Ralph and Rosetta. Moderator9 ROSETTA@home FAQ Moderator Contact |
Jon C Melusky Joined: 29 Nov 05 Posts: 12 Credit: 192,743 RAC: 76 |
Sadly, my system only has 384 ram and the min requirements of Ralph are 512 ram. Actually the basic requirements are the same for both Ralph and Rosetta. Well, all I know is that Rosetta worked perfectly from 29 Nov 2005 to early April 2006 with only 384 ram, so I don't know why it used to work so well below basic requirements. Was it 512 ram back in Nov of 2005? Should I not have been allowed to attach to Rosetta with 384 ram? Should I try Ralph with 384 ram? Please advise. Jonathan |
anders n Joined: 19 Sep 05 Posts: 403 Credit: 537,991 RAC: 0 |
Well, all I know is that Rosetta worked perfectly from 29 Nov 2005 to early April 2006 with only 384 ram, so I don't know why it used to work so well below basic requirements. Was it 512 ram back in Nov of 2005 ? Should I not have been allowed to attach to Rosetta with 384 ram ? Should I try Ralph with 384 ram ? Do try joining Ralph. There are computers there with less than 512 MB of memory. Anders n |
Astro Joined: 2 Oct 05 Posts: 987 Credit: 500,253 RAC: 0 |
like my Celeron 500, Win98, and 256 Mram. If I'm not mistaken it's "minimum Recommended" specs, not "min Required". |
TioSuper Joined: 2 May 06 Posts: 17 Credit: 164 RAC: 0 |
https://boinc.bakerlab.org/rosetta/result.php?resultid=19492107 Resulted in one of the now infamous 107 type of errors. |
kevint Joined: 8 Oct 05 Posts: 84 Credit: 2,530,451 RAC: 0 |
So I came in this morning and noticed that this machine had all its WUs (about 50 or so) aborted with errors for no apparent reason. This machine has been running very nicely for several months without a hiccup, and I have not changed anything on it for a long time. Did we get a batch of bad WUs?
5/8/2006 6:20:46 AM|rosetta@home|3 consecutive failures fetching scheduler list - deferring 604800 seconds
5/8/2006 6:24:06 AM|rosetta@home|Computation for result HBLR_1.0_1n0u_RDFLAGS_485_7128_0 finished
5/8/2006 6:24:06 AM|rosetta@home|Starting result AB_CASP6_JUMPING_STRAND2_STRAND5_t212_SAVE_ALL_OUT_488_3479_0 using rosetta version 507
5/8/2006 6:24:15 AM|rosetta@home|Unrecoverable error for result AB_CASP6_JUMPING_STRAND2_STRAND5_t212_SAVE_ALL_OUT_488_3479_0 ( - exit code -1073741819 (0xc0000005))
5/8/2006 6:24:15 AM|rosetta@home|3 consecutive failures fetching scheduler list - deferring 604800 seconds
5/8/2006 6:24:15 AM||Rescheduling CPU: application exited
5/8/2006 6:24:15 AM|rosetta@home|Computation for result AB_CASP6_JUMPING_STRAND2_STRAND5_t212_SAVE_ALL_OUT_488_3479_0 finished
5/8/2006 6:24:15 AM|rosetta@home|Starting result AB_CASP6_JUMPING__t242_SAVE_ALL_OUT_488_3479_0 using rosetta version 507
5/8/2006 6:24:26 AM|rosetta@home|Unrecoverable error for result AB_CASP6_JUMPING__t242_SAVE_ALL_OUT_488_3479_0 ( - exit code -1073741819 (0xc0000005))
5/8/2006 6:24:26 AM|rosetta@home|3 consecutive failures fetching scheduler list - deferring 604800 seconds
5/8/2006 6:24:26 AM||Rescheduling CPU: application exited
5/8/2006 6:24:26 AM|rosetta@home|Computation for result AB_CASP6_JUMPING__t242_SAVE_ALL_OUT_488_3479_0 finished
5/8/2006 6:24:26 AM|rosetta@home|Starting result JUMP_ALLBARCODES_ANTIPARALLEL_1tul__SAVE_ALL_OUT_490_1401_0 using rosetta version 507
5/8/2006 6:24:53 AM|rosetta@home|Unrecoverable error for result JUMP_ALLBARCODES_ANTIPARALLEL_1tul__SAVE_ALL_OUT_490_1401_0 ( - exit code -1073741819 (0xc0000005))
5/8/2006 6:24:53 AM|rosetta@home|3 consecutive failures fetching scheduler list - deferring 604800 seconds
5/8/2006 6:24:53 AM||Rescheduling CPU: application exited
5/8/2006 6:24:53 AM|rosetta@home|Computation for result JUMP_ALLBARCODES_ANTIPARALLEL_1tul__SAVE_ALL_OUT_490_1401_0 finished
SETI.USA |
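A note for anyone decoding the log above: the signed exit code and the hexadecimal value in parentheses are the same 32-bit number written two ways. The Python sketch below shows the conversion. The function names and the small lookup table are illustrative assumptions and are not part of BOINC or Rosetta; the one entry in the table, 0xC0000005, is the Windows status code for an access violation.

```python
# Illustrative helpers (hypothetical names, not BOINC or Rosetta code).

# Assumption: only one well-known Windows status code is listed here.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def to_unsigned32(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned value."""
    return exit_code & 0xFFFFFFFF


def format_exit_code(exit_code: int) -> str:
    """Format an exit code the way the log line shows it: decimal (hex)."""
    return f"{exit_code} (0x{to_unsigned32(exit_code):x})"


def describe(exit_code: int) -> str:
    """Return a known status name, or 'unknown' if it is not in the table."""
    return KNOWN_STATUS.get(to_unsigned32(exit_code), "unknown")


if __name__ == "__main__":
    code = -1073741819  # value taken from the log above
    print(format_exit_code(code), describe(code))
```

An access violation means the application touched memory it was not allowed to use, which says where the process died but not why.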
Charles Dennett Joined: 27 Sep 05 Posts: 102 Credit: 2,081,660 RAC: 345 |
like my Celeron 500, Win98, and 256 Mram. If I'm not mistaken it's "minimum Recommended" specs, not "min Required". My older son moved out on his own a few weeks ago. Took his laptop with him but left his old Dell Optiplex GX110. Said I could do what I wanted with it. It's got a 667 P3 with only 128 MB of memory. I'm running W2K on it. All I did was bump up the initial virtual memory allocation (I think it was from 192 MB to 256 MB) after it complained about running out of VM and it's been running R@H just fine. -Charlie |
Nite Owl Joined: 2 Nov 05 Posts: 87 Credit: 3,019,449 RAC: 0 |
I just had a failure on a machine that I believe was caused by my viewing the graphics. It is an A64 X2 4400+ with 1 GB memory and SLI dual graphics boards with 256 MB each.
Result ID: 19507834
Name: FA_CASP6_t212__470_13327_0
Workunit: 16182368
Created: 7 May 2006 20:29:38 UTC
Sent: 8 May 2006 0:15:29 UTC
Received: 8 May 2006 21:52:32 UTC
Server state: Over
Outcome: Client error
Client state: Computing
Exit status: 1 (0x1)
Computer ID: 201779
Report deadline: 22 May 2006 0:15:29 UTC
CPU time: 49699.375
stderr out:
<core_client_version>5.2.13</core_client_version>
<message>Incorrect function. (0x1) - exit code 1 (0x1) </message>
<stderr_txt>
    # random seed: 1549694
    # cpu_run_time_pref: 86400
    No heartbeat from core client for 31 sec - exiting
    # cpu_run_time_pref: 86400
    ERROR:: Exit at: .dock_structure.cc line:401
</stderr_txt>
Validate state: Invalid
Claimed credit: 355.538220736369
Granted credit: 0
application version: 5.07
Two other failures occurred on an Intel Pentium 4HT, 3218 MHz, with 1 GB memory, NVIDIA GeForce FX 5200 (128 MB) graphics and a 30 GB hard drive. Both failures were attributed to "Maximum disk usage exceeded".
Result ID: 19546463
Name: JUMP_CLOSE_CHAINBREAK_ALLBARCODE_1q7sA_SAVE_ALL_OUT_472_11569_0
Workunit: 16217955
Created: 8 May 2006 5:33:41 UTC
Sent: 8 May 2006 9:21:18 UTC
Received: 8 May 2006 21:25:16 UTC
Server state: Over
Outcome: Client error
Client state: Computing
Exit status: -177 (0xffffff4f)
Computer ID: 142263
Report deadline: 22 May 2006 9:21:18 UTC
CPU time: 43029.984375
stderr out:
<core_client_version>5.2.13</core_client_version>
<message>Maximum disk usage exceeded </message>
<stderr_txt>
    # random seed: 1491312
    # cpu_run_time_pref: 86400
</stderr_txt>
Validate state: Invalid
Claimed credit: 186.565840823042
Granted credit: 0
application version: 5.07
Join the Teddies@WCG |
Moderator9 Volunteer moderator Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
So I came in this morning and noticed that this machine Well, the larger work units now running on the system may be a problem for you. You could try increasing virtual memory. But it looks like a file problem. It is possible that the BOINC file system has become corrupted somehow. Try resetting the project. If that does not work, then increase the virtual memory for the system (expect more disk activity if you do that). Moderator9 ROSETTA@home FAQ Moderator Contact |
kevint Joined: 8 Oct 05 Posts: 84 Credit: 2,530,451 RAC: 0 |
So I came in this morning and noticed that this machine Will do. Virtual memory may be an issue, however; it looks like I need to install a 2nd hard drive. I think I have a couple of old junkers lying around. Thanks. SETI.USA |
Buffalo Bill Joined: 25 Mar 06 Posts: 71 Credit: 1,630,458 RAC: 0 |
Is there any way to save this WU which is stuck at: CPU Time: 05:37:38 Progress: 100% Status: Uploading I've tried rebooting etc. but it won't progress to "ready to report". https://boinc.bakerlab.org/rosetta/result.php?resultid=19494764 What should I do with it? Next WU is at 1 hour 30 min. and running. |
DeMatt Joined: 30 Apr 06 Posts: 2 Credit: 188,295 RAC: 0 |
Hmmm... I just joined Rosetta@home, after CPDN stopped giving out Mac units... but thus far have had no successful runs (3 failures, 1 just started processing). They've all failed with error code -161... which I think has to do with the fact that my computer isn't dedicated to Rosetta (I run 3 other projects on it) or even ON all the time. I noticed the latest unit failed the instant it (tried to) start up...
My computer: Power Mac G5 Dual Processor @ 2 GHz, running OS X 10.3.9
BOINC Client: Command-line version 5.2.13, set to "use only 1 CPU" and "leave apps in memory"; I just recently changed the timeslice setting from 60 to 120 minutes in the hopes it would help.
Some of the log file text:
Command-line error output:
2006-05-02 16:37:22 [rosetta@home] Unrecoverable error for result AB_CASP6_t241__465_2827_1 (<file_xfer_error> <file_name>AB_CASP6_t241__465_2827_1_0</file_name> <error_code>-161</error_code> <error_message></error_message> </file_xfer_error> )
2006-05-05 22:37:25 [rosetta@home] Unrecoverable error for result HBLR_1.0_1b72_RDFLAGS_474_909_0 (<file_xfer_error> <file_name>HBLR_1.0_1b72_RDFLAGS_474_909_0_0</file_name> <error_code>-161</error_code> <error_message></error_message> </file_xfer_error> )
2006-05-08 15:37:40 [rosetta@home] Unrecoverable error for result HBLR_1.0_1n0u_RDFLAGS_484_1900_0 (<file_xfer_error> <file_name>HBLR_1.0_1n0u_RDFLAGS_484_1900_0_0</file_name> <error_code>-161</error_code> <error_message></error_message> </file_xfer_error> )
From sched_request_boinc.bakerlab.org_rosetta.html:
<result>
<name>HBLR_1.0_1n0u_RDFLAGS_484_1900_0</name>
<final_cpu_time>0.950000</final_cpu_time>
<exit_status>0</exit_status>
<state>3</state>
<app_version_num>507</app_version_num>
<stderr_out>
<core_client_version>5.2.13</core_client_version>
<stderr_txt>
    # random seed: 3903101
    # random seed: 3903101
    # random seed: 3903101
    # cpu_run_time_pref: 10800
    # random seed: 3903101
    # random seed: 3903101
    Too many restarts with no progress. Keep application in memory while preempted.
    WARNING! attempt to gzip file ./aa1n0u.out failed: file does not exist.
    # DONE :: 0 starting structures built 0 (nstruct) times
    # This process generated 0 decoys from 0 attempts
</stderr_txt>
<message><file_xfer_error> <file_name>HBLR_1.0_1n0u_RDFLAGS_484_1900_0_0</file_name> <error_code>-161</error_code> <error_message></error_message> </file_xfer_error> </message>
</stderr_out>
</result>
Should I be looking for logging information somewhere else? |
Moderator9 Volunteer moderator Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
Hmmm... I just joined Rosetta@home, after CPDN stopped giving out Mac units... but thus far have had no successful runs (3 failures, 1 just started processing). They've all failed with error code -161... which I think has to do with the fact that my computer isn't dedicated to Rosetta (I run 3 other projects on it) or even ON all the time. I noticed the latest unit failed the instant it (tried to) start up... The -161 errors mean that the Work Unit did not generate an output file. When the system went to compress the file to send it in, the file was not there, so it generated an error. Your Work Units were terminated by the Rosetta application "Watchdog". This feature looks at the Work Unit's progress each time the work starts processing and decides whether progress has been made since the last start-up. If not, it terminates the Work Unit and returns any errors or results. For some reason your system is not processing the Work Units correctly. All of my systems are running OS 10.4.6, and while the project only requires 10.3.9, it is possible that the newer application may have a problem with 10.3.x. If that is the case, it is a bug and it will be fixed. However, you might want to try the GUI version of BOINC Manager, or even the menu bar version, unless there is some reason you need the CLI version. The GUI version seems to run better on the Mac. While it does take a few more cycles to run the GUI, it is not that significant, especially if you close the manager window when you are not using it. All of my systems are running multiple projects, some as many as 5, so your problem is not in that area. You might try resetting the project from the Projects tab. If that does not fix it, then you might want to try detaching and reattaching. Failing that, you should attach to the RALPH@Home project. Ralph is the beta test project for Rosetta, and we can diagnose the problem better there. Moderator9 ROSETTA@home FAQ Moderator Contact |
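The watchdog behaviour described above can be sketched roughly as follows. This is a minimal illustration of the idea (compare progress at each start-up with the progress recorded at the previous one, and give up after too many stalled starts); it is not Rosetta's actual code. The class name, the function name, and the default threshold of 5 stalled starts are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WatchdogState:
    """Hypothetical state that would persist between application starts."""
    progress_at_last_start: Optional[float] = None
    stalled_starts: int = 0


def should_continue(state: WatchdogState,
                    checkpointed_progress: float,
                    max_stalled_starts: int = 5) -> bool:
    """Call once at each start-up; return False when the work unit should end.

    max_stalled_starts is an assumed threshold, chosen only for illustration.
    """
    previous = state.progress_at_last_start
    if previous is not None and checkpointed_progress <= previous:
        # Nothing was gained since the previous start.
        state.stalled_starts += 1
    else:
        # First start, or real progress was made: reset the counter.
        state.stalled_starts = 0
    state.progress_at_last_start = checkpointed_progress
    return state.stalled_starts < max_stalled_starts


if __name__ == "__main__":
    state = WatchdogState()
    # A work unit that is restarted repeatedly without ever advancing.
    for start in range(1, 7):
        print(start, should_continue(state, checkpointed_progress=0.0))
```

Under this sketch, a work unit that is swapped out before it reaches its first checkpoint keeps restarting from the same point, which is one way frequent preemption without "leave apps in memory" could trip such a check.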
K1100LTSE Joined: 28 Feb 06 Posts: 7 Credit: 192,387 RAC: 0 |
|
David@home Joined: 7 Oct 05 Posts: 29 Credit: 185,330 RAC: 0 |
I have a Rosetta 5.07 WU apparently stuck at 1% progress. It has completed two one-hour project swap intervals and BOINC Manager shows progress at 1.03%. I will leave it running overnight and check in the morning. Are there any error log files I should look out for on my system that may help? |
Ian Joined: 14 Apr 06 Posts: 29 Credit: 326,863 RAC: 577 |
Had this WU fail on the 7th: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=16118342 Result: https://boinc.bakerlab.org/rosetta/result.php?resultid=19436273 Only just noticed - first error for ages. Ian Cundell, St Albans, UK |
Rhiju Volunteer moderator Joined: 8 Jan 06 Posts: 223 Credit: 3,546 RAC: 0 |
I think Moderator9's comments are right on, but I think what's triggering the error is that Rosetta has been preempted 5 times, and we have a "feature" that kills WUs that have started/stopped several times. I like the idea of increasing the time to 120 minutes (or even 240 minutes, which is what I run on my Mac!). It's a bit puzzling, though, because you have "leave apps in memory" set, so it shouldn't matter if it's preempted. So please double-check that your Rosetta@home settings include "leave apps in memory", and let us know if you continue to get errors. Hmmm... I just joined Rosetta@home, after CPDN stopped giving out Mac units... but thus far have had no successful runs (3 failures, 1 just started processing). They've all failed with error code -161... which I think has to do with the fact that my computer isn't dedicated to Rosetta (I run 3 other projects on it) or even ON all the time. I noticed the latest unit failed the instant it (tried to) start up... |
hugothehermit Joined: 26 Sep 05 Posts: 238 Credit: 314,893 RAC: 0 |
I have a "Rosetta@Home 5.07, Win98se, BOINC ver 5.22, 1 day WU setting" WU sitting at 100% completed. It has been doing that for about 2 days. (I've had one other like this; I restarted BOINC and the WU started from the beginning again, which is strange, so I thought I'd let this one go and see what happens. The answer is that it just sits there at 100%.) What would you like me to do with it? If there is some type of Win98 debugger that you could talk me through, I would be happy to do that, or some memory / thread dump that I don't know about in 98. I assume that watchdog will kill it sometime within about 30 hours or so. It seems that BOINC has lost the plot; the messages are:
10/05/06 10:12:34 AM||Suspending network activity - user request
10/05/06 10:54:59 AM|rosetta@home|Deferring communication with project for 1 days, 22 hours, 59 minutes, and 49 seconds
10/05/06 11:55:01 AM|rosetta@home|Deferring communication with project for 1 days, 21 hours, 59 minutes, and 46 seconds
10/05/06 12:55:07 PM|rosetta@home|Deferring communication with project for 1 days, 20 hours, 59 minutes, and 41 seconds
10/05/06 1:55:08 PM|rosetta@home|Deferring communication with project for 1 days, 19 hours, 59 minutes, and 40 seconds
10/05/06 2:55:08 PM|rosetta@home|Deferring communication with project for 1 days, 18 hours, 59 minutes, and 39 seconds
10/05/06 3:55:12 PM|rosetta@home|Deferring communication with project for 1 days, 17 hours, 59 minutes, and 36 seconds
10/05/06 4:55:14 PM|rosetta@home|Deferring communication with project for 1 days, 16 hours, 59 minutes, and 34 seconds
10/05/06 5:55:18 PM|rosetta@home|Deferring communication with project for 1 days, 15 hours, 59 minutes, and 30 seconds
10/05/06 6:55:22 PM|rosetta@home|Deferring communication with project for 1 days, 14 hours, 59 minutes, and 26 seconds
Which shouldn't happen, as suspending network activity should stop all attempts at network communication. I doubt that I have enough (watchdog) time left to give it access to the Internet and see what happens :?
edited to add: and some spelling and stuff
I can't see the graphics (I know, I tried) as it's run via the (Win 98se) DOS command line.
Can a mod get rid of the graphic(s) that is making this so wide? |
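For anyone comparing messages like the ones above, a small Python sketch can turn each "Deferring communication" line into a number of seconds so that successive entries are easier to compare. The regular expression is written against the message wording quoted in this post only, and the function name is a hypothetical one chosen for the example; other BOINC versions may word the message differently.

```python
import re
from typing import Optional

# Assumption: the wording matches the log lines quoted in this thread.
DEFERRAL_RE = re.compile(
    r"Deferring communication with project for "
    r"(\d+) days, (\d+) hours, (\d+) minutes, and (\d+) seconds"
)


def deferral_seconds(line: str) -> Optional[int]:
    """Return the deferral length in seconds, or None if the line differs."""
    match = DEFERRAL_RE.search(line)
    if match is None:
        return None
    days, hours, minutes, seconds = (int(part) for part in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


if __name__ == "__main__":
    first = ("10/05/06 10:54:59 AM|rosetta@home|Deferring communication with "
             "project for 1 days, 22 hours, 59 minutes, and 49 seconds")
    second = ("10/05/06 11:55:01 AM|rosetta@home|Deferring communication with "
              "project for 1 days, 21 hours, 59 minutes, and 46 seconds")
    # Difference between two consecutive messages, in seconds.
    print(deferral_seconds(first) - deferral_seconds(second))
```

In the quoted log the remaining time drops by about an hour between messages that are themselves about an hour apart, which would be consistent with an existing countdown being reported again and not with new contact attempts, though that reading is only a guess.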
duanra Joined: 12 Feb 06 Posts: 8 Credit: 36,223 RAC: 0 |
Hello! I am using Rosetta@home v. 5.07 on Windows XP with an ATI Mobility Radeon graphics card. Each time I open the Rosetta screensaver to look at the graphics, it stops after a couple of minutes and my screen goes black; then it reopens, and I have to close the screensaver window quickly or it keeps doing this over and over. Conclusion: I cannot view the graphics without my screen crashing. (sorry for my poor English) Duanra |
©2024 University of Washington
https://www.bakerlab.org