Message boards : Number crunching : Problems with Rosetta version 5.98
Author | Message |
---|---|
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
"...two previous files of Rosetta@home were deleted when the BOINC manager got server request of Rosetta@home! I wondered if this workunit's crash was in connection with the deletion of two previous files." The deleted files should be the databases used by the "mini" version of Rosetta, so if this had been a mini task, that would be likely. Others have reported that the BOINC client did not seem to recognize that existing tasks required the files. Since it was not a mini task, I do not believe the deleted files are related to the problem you ran into. Here is a link to DK's post about that over on Ralph. Rosetta Moderator: Mod.Sense |
Path7 Joined: 25 Aug 07 Posts: 128 Credit: 61,751 RAC: 0 |
Hello all, I have just had a Compute error twice, Exit status 193 (0xc1), on my Ubuntu 7.10 x86. Link to results From BOINC I received the following messages: za 26 jul 2008 19:04:38 CEST|rosetta@home|Computation for task t499__BOINC_SYMMETRY_D2SYMM_FOLD_AND_DOCK_RELAX-t499_-_4244_6669_0 finished za 26 jul 2008 19:04:38 CEST|rosetta@home|Output file t499__BOINC_SYMMETRY_D2SYMM_FOLD_AND_DOCK_RELAX-t499_-_4244_6669_0_0 for task t499__BOINC_SYMMETRY_D2SYMM_FOLD_AND_DOCK_RELAX-t499_-_4244_6669_0 absent zo 27 jul 2008 00:10:28 CEST|rosetta@home|Computation for task t498__BOINC_SYMMETRY_D2SYMM_FOLD_AND_DOCK_RELAX-t498_-_4244_13407_0 finished zo 27 jul 2008 00:10:28 CEST|rosetta@home|Output file t498__BOINC_SYMMETRY_D2SYMM_FOLD_AND_DOCK_RELAX-t498_-_4244_13407_0_0 for task t498__BOINC_SYMMETRY_D2SYMM_FOLD_AND_DOCK_RELAX-t498_-_4244_13407_0 absent Have a nice day, Path7. |
Robert Gammon Joined: 9 Nov 07 Posts: 14 Credit: 969,848 RAC: 0 |
BOINC Client 5.10.45 Rosetta 5.98 WinXP SP2 - intermittently connected to Internet This problem reproduces with almost any WU. The scenario is that the laptop is connected to the Internet long enough to upload completed results and to request/download new work. The laptop then disconnects from the Internet to begin number crunching. Rosetta processes the file a variable amount (I have seen 55%, 72%, 88%, and 97% completion), then for one reason or another, BOINC shuts down (XP locks up and needs a reboot, power fails, or it's time to shut down for the night). Note that some of these BOINC shutdowns are orderly, others are not. The result is the same, regardless of how we got there. Rosetta RESTARTS AT ZERO!! The WU gets reprocessed, redoing 2-4 hours of compute time. |
Robert Gammon Joined: 9 Nov 07 Posts: 14 Credit: 969,848 RAC: 0 |
BOINC Client 5.10.45 I just reproduced this again. I did an orderly shutdown to move the laptop. Rosetta was at 95.583% complete. When BOINC restarted, SETI@home was the selected task. I let that run for about 5 minutes, then suspended SETI and allowed Rosetta to restart. In a few moments, 0.00% complete, 3:55:20 to completion!!! On computers with more than one project active, if this is NOT unique to my laptop, switching from Rosetta to other projects, then back to Rosetta, should show the same behavior. Note that the interval between switching tasks is a configuration item on all project Account Info pages. Mine is set to 3 hours. I cannot test this on another machine as I only have access to a single computer. |
Robert Gammon Joined: 9 Nov 07 Posts: 14 Credit: 969,848 RAC: 0 |
BOINC Client 5.10.45 I tried again, putting the project on Suspend, waiting 30 minutes while I did some other work, then did a Resume, and EUREKA, it WORKED: execution continued from the spot where it left off when the Suspend was issued. So it seems that the signal BOINC issues when the user EXITS the application leaves the Rosetta work unit in an unstable state, the same as an abort due to a power failure on the computer. SUSPEND appears to act differently, and Rosetta does an orderly pause of the work. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Robert, suspend, at least when you have it set to leave applications in memory while suspended... is entirely different than BOINC shutting down. What you are seeing is normal, and not unexpected. It will differ for different types of work units. Some checkpoint more frequently than others. Some complete models more frequently than others. If you would like to discuss it further, since this is not a problem specific to v5.98, please open a new thread. Rosetta Moderator: Mod.Sense |
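To illustrate the difference described above, here is a minimal Python sketch of checkpoint-and-resume. It is not Rosetta's or BOINC's actual code: the function `run_task`, its parameters `checkpoint_every` and `stop_at`, and the JSON checkpoint file are all hypothetical. It assumes progress is saved only at checkpoints, so a task that is shut down before its first checkpoint restarts at zero, while a task that stays in memory while suspended never has to reload anything.

```python
import json
import os
import tempfile


def run_task(path, total_steps, checkpoint_every, stop_at=None):
    """Run a toy task, resuming from the checkpoint file at `path` if present.

    `stop_at` simulates the client exiting (or the power failing) once that
    many steps are done. Returns (step the run started from, step reached).
    """
    start = 0
    if os.path.exists(path):
        with open(path) as f:
            start = json.load(f)["step"]

    step = start
    while step < total_steps:
        if stop_at is not None and step >= stop_at:
            break  # the process ends here; progress since the last checkpoint is lost
        step += 1
        if step % checkpoint_every == 0:
            with open(path, "w") as f:
                json.dump({"step": step}, f)
    return start, step


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "checkpoint.json")
    print(run_task(ckpt, 100, 40, stop_at=30))  # (0, 30): stopped before the first checkpoint
    print(run_task(ckpt, 100, 40, stop_at=95))  # (0, 95): restarted at zero, saved at 40 and 80
    print(run_task(ckpt, 100, 40))              # (80, 100): resumed from the last checkpoint
```

In this toy model, a work unit that checkpoints rarely loses more work on each shutdown than one that checkpoints often, which matches the observation that behavior differs between types of work units.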
Korz53 Joined: 22 Apr 07 Posts: 2 Credit: 144,960 RAC: 0 |
No graphics appear when clicking Show graphics while Rosetta is running. CPU usage is high in kernel_task (36.5%) when minirosetta is running. BOINC shows (Not Responding), which may be due to minirosetta. I will need to quit BOINC and restart it to reset it. BOINC Not Responding has been happening on and off. Model Name: iMac Model Identifier: iMac6,1 Processor Name: Intel Core 2 Duo Processor Speed: 2.33 GHz Number Of Processors: 1 Total Number Of Cores: 2 L2 Cache: 4 MB Memory: 3 GB Bus Speed: 667 MHz Boot ROM Version: IM61.0093.B07 SMC Version: 1.10f2 |
AMD_is_logical Joined: 20 Dec 05 Posts: 299 Credit: 31,460,681 RAC: 0 |
A couple of t498__BOINC_SYMMETRY_D2SYMM_FOLD_AND_DOCK_RELAX-t498_-_4244_ WUs ended with a segmentation violation on two different Linux computers. The stack trace looks similar in each case. https://boinc.bakerlab.org/rosetta/result.php?resultid=180458759 https://boinc.bakerlab.org/rosetta/result.php?resultid=180399471 |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
t498__BOINC_SYMMETRY_D2SYMM_FOLD_AND_DOCK_RELAX-t498_-_4244_5892_0 50 minute computation and then: Exit status -1073741819 (0xc0000005) CPU runtime 3013.89 secs stderr out <core_client_version>5.10.45</core_client_version> <![CDATA[ <message> - exit code -1073741819 (0xc0000005) </message> <stderr_txt> # cpu_run_time_pref: 14400 # random seed: 3449163 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x0093E2E5 write attempt to address 0x1610F000 Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x7C911D8F read attempt to address 0xFFFFFFF8 Engaging BOINC Windows Runtime Debugger... # cpu_run_time_pref: 14400 WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... and a lot of other stuff, mostly PDB symbols. Is this going to become a common theme? |
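The two forms of the exit code in the log above are the same 32-bit value, printed once as a signed integer and once in hexadecimal. A small Python sketch of the conversion follows; the helper names `as_unsigned32` and `as_signed32` are made up for this example and are not part of BOINC.

```python
def as_unsigned32(code):
    """Reinterpret a (possibly negative) exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - 0x100000000 if value & 0x80000000 else value


print(hex(as_unsigned32(-1073741819)))  # 0xc0000005
print(as_signed32(0xC0000005))          # -1073741819
print(hex(193))                         # 0xc1, the Linux exit status reported earlier in the thread
```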
adrianxw Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 1 |
Had this one crash and put up the "Rosetta has encountered an error and needs to close" dialog box - not seen that for a while. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
UBT - The Prof.... Joined: 5 Nov 06 Posts: 1 Credit: 18,584 RAC: 0 |
Have had so many crash in the last few days with "client error", etc, that I think I am going to go crunch something else for a while till this gets properly de-bugged. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
"Have had so many crash in the last few days with "client error", etc, that I think I am going to go crunch something else for a while till this gets properly de-bugged." You should post this as well so they know what's going on... <core_client_version>5.10.45</core_client_version> <![CDATA[ <stderr_txt> Too many restarts with no progress. Keep application in memory while preempted. ====================================================== DONE :: 1 starting structures 96.3281 cpu seconds This process generated 0 decoys from 0 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... called boinc_finish </stderr_txt> <message> <file_xfer_error> <file_name>h001__BOINC_CASP8_ABRELAX_RANGE_tvat_d2r__IGNORE_THE_REST-S25-6-S3-9--h001_-_4307_155_0_0</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> You got that how many times? 6 or so? |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
"Too many restarts with no progress" Normally this would be due to ending BOINC before the tasks can checkpoint, or suspending the task before it can checkpoint and not leaving suspended tasks in memory... however, I don't see the output lines that indicate how many times it did start, and with what runtime preference, that I would normally expect to see if that was truly what had occurred. UBT, does your machine run 24/7? Do you run other projects? Some background may prove helpful. Rosetta Moderator: Mod.Sense |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
t499__BOINC_SYMMETRY_D2SYMM_FOLD_AND_DOCK_RELAX-t499_-_4244_11023_0 <core_client_version>5.10.45</core_client_version> <![CDATA[ <stderr_txt> # cpu_run_time_pref: 14400 # random seed: 3439032 # cpu_run_time_pref: 14400 # random seed: 3439032 WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... ====================================================== DONE :: 1 starting structures 12695.9 cpu seconds This process generated 7 decoys from 7 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... </stderr_txt> ]]> I LOST 7 points on this one... why would you lose points instead of breaking even? |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
t499__BOINC_SYMMETRY_C4SYMM_FOLD_AND_DOCK_RELAX-t499_-_4244_16487_0 14543.28 stderr out <core_client_version>5.10.45</core_client_version> <![CDATA[ <stderr_txt> # cpu_run_time_pref: 14400 # random seed: 3423568 # cpu_run_time_pref: 14400 WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... # cpu_run_time_pref: 14400 WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... ====================================================== DONE :: 1 starting structures 14543.1 cpu seconds This process generated 7 decoys from 7 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... </stderr_txt> ]]> .40 credit LOST on this one |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
t498__BOINC_SYMMETRY_C3SYMM_FOLD_AND_DOCK_RELAX-t498_-_4244_17219_0 CPU time 14213.28 stderr out <core_client_version>5.10.45</core_client_version> <![CDATA[ <stderr_txt> # cpu_run_time_pref: 14400 # random seed: 3447836 # cpu_run_time_pref: 14400 # cpu_run_time_pref: 14400 WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... WARNING! Not sure non-ideal rotamers are compatible with symmetry yet... ====================================================== DONE :: 1 starting structures 14212.5 cpu seconds This process generated 26 decoys from 26 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... </stderr_txt> ]]> gained some serious credit on this even with the errors |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Mod: check out all the errors in his profile. He has a lot of <error_code>-161</error_code> https://boinc.bakerlab.org/rosetta/results.php?hostid=837076 Too many restarts with no progress |
mike Joined: 10 Dec 05 Posts: 1 Credit: 77,598 RAC: 0 |
I seem to have a problem with the minirosetta application not starting. I just installed the new version of BOINC. I keep getting Windows error messages telling me that there was a problem and asking me to send an error report to Microsoft. I tried aborting the work units to get more, but it still gives me error messages. It seems that when it tries to start a work unit, it goes immediately to 100% and the error message pops up. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
"I seem to have a problem with the minirosetta application not starting. I just installed the new version of BOINC. I keep getting Windows error messages telling me that there was a problem and asking me to send an error report to Microsoft. I tried aborting the work units to get more, but it still gives me error messages. It seems that when it tries to start a work unit, it goes immediately to 100% and the error message pops up." You need to post this over in the 1.28 thread, not here in 5.98. |
adrianxw Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 1 |
Had this one crash on me today with an unhandled exception (while I was out, of course), and it put up the "Rosetta has encountered..." message box, which then leaves that core dead until I return to click the OK button. It should not do this under ANY circumstances, yet this is the second time recently. I have Rosetta on machines at remote sites which I don't visit often. If it happens out there, the core/machine is dead until I next get there. try { Rosetta } catch (...) { Bomb out } Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
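What the pseudocode above asks for could be sketched as follows in Python. This is only an illustration of the catch-everything-and-exit pattern, not how Rosetta or BOINC handle crashes; `run_guarded` and `crashing_task` are hypothetical names. In a native application, a hardware-level fault such as the access violations reported earlier in this thread may not be an ordinary language-level exception, so it could need an OS-level handler instead.

```python
import sys
import traceback


def run_guarded(work):
    """Run `work()`; on any exception, log it and return a non-zero exit code
    instead of letting the failure surface as a dialog that blocks the core."""
    try:
        work()
    except Exception:
        traceback.print_exc(file=sys.stderr)  # keep a record for later diagnosis
        return 1
    return 0


def crashing_task():
    # Stand-in for a science application that fails partway through.
    raise RuntimeError("simulated crash")


exit_code = run_guarded(crashing_task)
print("exit code:", exit_code)  # exit code: 1
```

The design choice is that the wrapper always returns control to its caller with a status code, so an unattended machine can report the failed task and move on to the next one.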
©2025 University of Washington
https://www.bakerlab.org