minirosetta v1.19 bug thread

Author	Message
James Thompson Joined: 13 Oct 05 Posts: 46 Credit: 186,109 RAC: 0	Message 52876 - Posted: 6 May 2008, 0:37:02 UTC
We have an updated version of minirosetta v1.19 which should fix some of the stability issues with v1.15. Post minirosetta v1.19 bugs here.

David Emigh Joined: 13 Mar 06 Posts: 158 Credit: 417,178 RAC: 0	Message 52900 - Posted: 7 May 2008, 17:55:14 UTC
Here is an access violation error after 68,000+ seconds of CPU time:
Reason: Access Violation (0xc0000005) at address 0x005C3051 write attempt to address 0x00000024
There is a large and detailed debugger message.
Rosie, Rosie, she's our gal, If she can't do it, no one shall!

glaesum Joined: 16 Oct 06 Posts: 21 Credit: 508,632 RAC: 0	Message 52910 - Posted: 8 May 2008, 12:54:00 UTC
things must be going pretty well as the thread is so quiet...
good news too with win98 OS - the 1.19 app is running, completing and validating although an error message is still getting thrown up. no idea if this matters or not. on all three wus completed so far this is the message:
Task ID 161439715
Name score13_hb_envtest62_A_1ctf__3171_14411_0
Workunit 147493846
Received 8 May 2008 11:10:33 UTC
Outcome Success
<core_client_version>5.10.30</core_client_version>
<![CDATA[
<stderr_txt>
AllocateAndInitializeSid Error 120
failed to create shared mem segment
# cpu_run_time_pref: 14400
======================================================
DONE :: 1 starting structures 13875.8 cpu seconds
This process generated 3 decoys from 3 attempts
======================================================
BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
called boinc_finish
</stderr_txt>
]]>
work unit ID nos are:
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=147390671
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=147405464
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=147493846

radu Joined: 7 May 08 Posts: 4 Credit: 66,301 RAC: 0	Message 52911 - Posted: 8 May 2008, 13:22:26 UTC Last modified: 8 May 2008, 13:24:08 UTC
I get a crash when I detach from the project. I'm not sure if this is a minirosetta bug. Log messages seem to show that minirosetta was running when the crash occurred.
I'm running Gentoo linux 2.6.24-r7. boinc-5.10.45
Logs:
08-May-2008 16:07:47 [rosetta@home] Starting task fa_max_dis_9-2vik_-test_2008-5-6_3222_134_0 using minirosetta version 119
08-May-2008 16:09:29 [rosetta@home] Resetting project
08-May-2008 16:09:30 [rosetta@home] Detaching from project
SIGSEGV: segmentation violation
Stack trace (9 frames):
/usr/bin/boinc_client[0x46cbf9]
/lib/libpthread.so.0[0x2aba6d950ed0]
/usr/bin/boinc_client[0x40afec]
/usr/bin/boinc_client[0x43060e]
/usr/bin/boinc_client[0x4310bc]
/usr/bin/boinc_client[0x422319]
/usr/bin/boinc_client[0x4516a4]
/lib/libc.so.6(__libc_start_main+0xf4)[0x2aba6ddfdb74]
/usr/bin/boinc_client(__gxx_personality_v0+0x1b1)[0x4048f9]
Exiting...

Pepo Joined: 28 Sep 05 Posts: 115 Credit: 101,358 RAC: 0	Message 52913 - Posted: 8 May 2008, 13:56:23 UTC - in response to Message 52911.
> I get a crash when I detach from the project. I'm not sure if this is a minirosetta bug. Log messages seem to show that minirosetta was running when the crash occurred.
It is quite possible (and logical IMO) that the client forcibly terminates all related processes upon detach. Otherwise it could not clean up client_state.xml, slots/ and projects/.
Peter

radu Joined: 7 May 08 Posts: 4 Credit: 66,301 RAC: 0	Message 52914 - Posted: 8 May 2008, 15:32:48 UTC - in response to Message 52913. Last modified: 8 May 2008, 15:37:21 UTC
>> I get a crash when I detach from the project. I'm not sure if this is a minirosetta bug. Log messages seem to show that minirosetta was running when the crash occurred.
> It is quite possible (and logical IMO) that the client forcibly terminates all related processes upon detach. Otherwise it could not clean up client_state.xml, slots/ and projects/.
> Peter
I'm new to BOINC so I don't know how the detach operation is handled. I don't use the GUI manager, and boinc_client appears to be the only BOINC-related process running:
$ ps -e | grep boinc
6279 ? 00:00:05 boinc_client
Anyway, killing related processes should not generate segmentation faults, so it's clearly an error in boinc_client. I don't know if it has anything to do with minirosetta though.

Pepo Joined: 28 Sep 05 Posts: 115 Credit: 101,358 RAC: 0	Message 52915 - Posted: 8 May 2008, 15:42:05 UTC - in response to Message 52914.
>>> I get a crash when I detach from the project. I'm not sure if this is a minirosetta bug. Log messages seem to show that minirosetta was running when the crash occurred.
>> It is quite possible (and logical IMO) that the client forcibly terminates all related processes upon detach. Otherwise it could not clean up client_state.xml, slots/ and projects/.
> I'm new to BOINC so I don't know how the detach operation is handled. Anyway, killing related processes should not generate segmentation faults, so it's clearly an error in boinc_client.
I'm sorry, you are right. I was thinking of Rosetta crashing and overlooked that it was actually the client that crashed. Of course it should not. (And actually the application should also exit cleanly if asked to by the client.)
> I don't know if it has anything to do with minirosetta though.
It should not. Which client, 5.10.45?
Peter

radu Joined: 7 May 08 Posts: 4 Credit: 66,301 RAC: 0	Message 52916 - Posted: 8 May 2008, 15:45:30 UTC - in response to Message 52915.
> It should not. Which client, 5.10.45?
yes, 5.10.45

Rob Joined: 16 Oct 06 Posts: 3 Credit: 121,375 RAC: 0	Message 52917 - Posted: 8 May 2008, 18:55:53 UTC
Someone forgot to post the Minirosetta 1.19 details on the version thread.

Alexander Klauer Joined: 10 Mar 08 Posts: 3 Credit: 110,308 RAC: 0	Message 52933 - Posted: 9 May 2008, 8:35:22 UTC
Hi, I switched off my computer yesterday, in the middle (maybe 60%) of a task. When I switched it back on today, I got
Fri 09 May 2008 09:51:30 AM CEST|rosetta@home|URL: https://boinc.bakerlab.org/rosetta/; Computer ID: 762923; location: (none); project prefs: default
Fri 09 May 2008 09:51:31 AM CEST|rosetta@home|Restarting task fa_max_dis_9-1ptq_-test_2008-5-6_3222_268_0 using minirosetta version 119
Fri 09 May 2008 09:52:00 AM CEST|rosetta@home|Computation for task fa_max_dis_9-1ptq_-test_2008-5-6_3222_268_0 finished
Fri 09 May 2008 09:52:01 AM CEST|rosetta@home|Starting lambda_repressor_folding_3191_8370_0
Fri 09 May 2008 09:52:01 AM CEST|rosetta@home|Starting task lambda_repressor_folding_3191_8370_0 using rosetta_beta version 596
Fri 09 May 2008 09:52:03 AM CEST|rosetta@home|Started upload of fa_max_dis_9-1ptq_-test_2008-5-6_3222_268_0_0
Fri 09 May 2008 09:52:14 AM CEST|rosetta@home|Finished upload of fa_max_dis_9-1ptq_-test_2008-5-6_3222_268_0_0
so the task finished virtually immediately after restart. When I switched my computer on yesterday morning, I also had some task crunching at 0%. Back then I believed an old task had been restarted from the beginning due to some fluke, but now it seems more likely that the same thing happened then as today. To me, it seems too much of a coincidence for a task interrupted in the middle to finish immediately after resuming, twice in a row.

Betting Slip Joined: 26 Sep 05 Posts: 71 Credit: 5,702,246 RAC: 0	Message 52937 - Posted: 9 May 2008, 11:54:47 UTC - in response to Message 52910.
Really
All access violations
https://boinc.bakerlab.org/rosetta/result.php?resultid=161740698
https://boinc.bakerlab.org/rosetta/result.php?resultid=160201341
https://boinc.bakerlab.org/rosetta/result.php?resultid=159794241
https://boinc.bakerlab.org/rosetta/result.php?resultid=160129454
https://boinc.bakerlab.org/rosetta/result.php?resultid=160185394
https://boinc.bakerlab.org/rosetta/result.php?resultid=161332559
https://boinc.bakerlab.org/rosetta/result.php?resultid=159408171

Rom Walton (BOINC) Volunteer moderator Project developer Joined: 17 Sep 05 Posts: 18 Credit: 40,071 RAC: 0	Message 52961 - Posted: 10 May 2008, 4:10:29 UTC - in response to Message 52937. Last modified: 10 May 2008, 4:10:59 UTC
> All access violations
> https://boinc.bakerlab.org/rosetta/result.php?resultid=161740698
> https://boinc.bakerlab.org/rosetta/result.php?resultid=160201341
> https://boinc.bakerlab.org/rosetta/result.php?resultid=159794241
> https://boinc.bakerlab.org/rosetta/result.php?resultid=160129454
> https://boinc.bakerlab.org/rosetta/result.php?resultid=160185394
> https://boinc.bakerlab.org/rosetta/result.php?resultid=161332559
> https://boinc.bakerlab.org/rosetta/result.php?resultid=159408171
All those crashes are a result of an out of memory error.
----- Rom My Blog

Ian_D Joined: 21 Sep 05 Posts: 55 Credit: 4,216,173 RAC: 0	Message 52967 - Posted: 10 May 2008, 6:48:21 UTC Last modified: 10 May 2008, 6:48:45 UTC
My latest weirdness
<core_client_version>5.10.30</core_client_version>
<![CDATA[
<message>
Maximum memory exceeded
</message>
]]>
resultid=161607307

Quidgydog Joined: 28 Sep 06 Posts: 3 Credit: 499,462 RAC: 0	Message 52969 - Posted: 10 May 2008, 8:22:42 UTC Last modified: 10 May 2008, 8:24:56 UTC
Having exactly the same issue as I was having with the v1.15 WU. WU just sits there, CPU time not running, no progress.
Log file......
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C82A714 read attempt to address 0x00D767E5
Engaging BOINC Windows Runtime Debugger...
I'm detaching this computer until this is resolved.

Betting Slip Joined: 26 Sep 05 Posts: 71 Credit: 5,702,246 RAC: 0	Message 52970 - Posted: 10 May 2008, 9:49:08 UTC - in response to Message 52961.
>> All access violations
>> https://boinc.bakerlab.org/rosetta/result.php?resultid=161740698
>> https://boinc.bakerlab.org/rosetta/result.php?resultid=160201341
>> https://boinc.bakerlab.org/rosetta/result.php?resultid=159794241
>> https://boinc.bakerlab.org/rosetta/result.php?resultid=160129454
>> https://boinc.bakerlab.org/rosetta/result.php?resultid=160185394
>> https://boinc.bakerlab.org/rosetta/result.php?resultid=161332559
>> https://boinc.bakerlab.org/rosetta/result.php?resultid=159408171
> All those crashes are a result of an out of memory error.
With 4 GB of memory, what do I do to put it right?

Pepo Joined: 28 Sep 05 Posts: 115 Credit: 101,358 RAC: 0	Message 52972 - Posted: 10 May 2008, 10:09:31 UTC - in response to Message 52970. Last modified: 10 May 2008, 10:10:08 UTC
>> All those crashes are a result of an out of memory error.
> With 4 GB of memory, what do I do to put it right?
One day you could run out of memory even with 64 GB of RAM... (Do you know the sentence about 64 KB of RAM?)
How much pagefile do you have available there? Any other memory load? Like other projects' applications, preempted and waiting in memory? Occasionally take a look at Task Manager, Performance tab: what are the Commit Charge values like? If the Total (or Peak) ever reaches the Limit, that's it.
You're running at least 7 projects on the host, each Rosetta task can require up to 600-900 MB, CPDN at least some 200-300 MB, other projects need something as well, and it is a quad...
Peter
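The arithmetic behind that last point can be sketched roughly as follows. This is only an illustration using the upper estimates quoted in the post; the scenario of four Rosetta tasks running with one CPDN task preempted but still resident is an assumption, and the function name is hypothetical.

```python
# Rough worst-case memory budget for a quad-core BOINC host.
# Figures are the upper estimates quoted in the post, in MB.
ROSETTA_MAX_MB = 900  # "each Rosetta task can require up to 600-900 MB"
CPDN_MAX_MB = 300     # "CPDN at least some 200-300 MB"


def worst_case_mb(rosetta_tasks: int, cpdn_tasks: int) -> int:
    """Sum the upper-bound footprints of the tasks held in memory."""
    return rosetta_tasks * ROSETTA_MAX_MB + cpdn_tasks * CPDN_MAX_MB


# Assumed scenario: four Rosetta tasks running (one per core) and one
# CPDN task preempted but still waiting in memory.
total_mb = worst_case_mb(rosetta_tasks=4, cpdn_tasks=1)
physical_ram_mb = 4 * 1024
print(f"worst case: {total_mb} MB of {physical_ram_mb} MB RAM")
```

Under these assumptions the tasks alone come close to the 4 GB of physical RAM, before the operating system and the other projects are counted, which is why the pagefile size and the Commit Charge Limit matter.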

Betting Slip Joined: 26 Sep 05 Posts: 71 Credit: 5,702,246 RAC: 0	Message 52973 - Posted: 10 May 2008, 10:30:52 UTC - in response to Message 52972.
>>> All those crashes are a result of an out of memory error.
>> With 4 GB of memory, what do I do to put it right?
> One day you could run out of memory even with 64 GB of RAM... (Do you know the sentence about 64 KB of RAM?)
> How much pagefile do you have available there? Any other memory load? Like other projects' applications, preempted and waiting in memory? Occasionally take a look at Task Manager, Performance tab: what are the Commit Charge values like? If the Total (or Peak) ever reaches the Limit, that's it.
> You're running at least 7 projects on the host, each Rosetta task can require up to 600-900 MB, CPDN at least some 200-300 MB, other projects need something as well, and it is a quad...
> Peter
Yes, I understand, but my commit charge is a fraction of my available charge, 10% at the moment. I have increased my page file to 6 GB, with a total memory of 4 GB, on Win XP Pro 64.
It just strikes me that the very knowledgeable Rom is arrogant enough to point to the cause without indicating any sort of a solution.

alpha Joined: 4 Nov 06 Posts: 27 Credit: 1,550,107 RAC: 0	Message 52974 - Posted: 10 May 2008, 13:58:09 UTC
This work unit finished earlier than expected, but with no errors:
https://boinc.bakerlab.org/rosetta/result.php?resultid=161362748
Claimed 130.48, granted 32.86. :(

Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0	Message 52977 - Posted: 10 May 2008, 15:04:09 UTC
Fat Loss, I'm guessing that the error is an indication that the task grew to exceed the maximum memory it was configured for, and so was terminated by BOINC. So, regardless of your machine's physical configuration, the % of memory you allow BOINC to use, etc., it still would have failed. That would tend to indicate a logic problem in Mini, or perhaps a task that should have been created with a higher maximum memory allowance. We'll have to wait to see what DK finds.
Rosetta Moderator: Mod.Sense

Rom Walton (BOINC) Volunteer moderator Project developer Joined: 17 Sep 05 Posts: 18 Credit: 40,071 RAC: 0	Message 52979 - Posted: 10 May 2008, 15:34:13 UTC - in response to Message 52973.
> It just strikes me that the very knowledgeable Rom is arrogant enough to point to the cause without indicating any sort of a solution.
In this particular case there isn't anything that any of us can do; I've passed the info on to the MiniRosetta devs.
Basically, MiniRosetta is a 32-bit process, and generally 32-bit processes are limited to 2 GB of user-mode memory. MiniRosetta hit that limit, so when it asked for more the OS said NO, leading to the crash.
The sign that this sort of problem has occurred is:
LoadLibraryA( dbghelp.dll ): GetLastError = 8
and
- Virtual Memory Usage -
VirtualSize: 2127511552, PeakVirtualSize: 2127511552
Sorry for not explaining the situation sooner. I was heading for bed and I started thinking about how I was going to help the devs debug this problem in the wild if they are unable to reproduce this issue in the lab. At present there isn't anything in the BOINC application framework that'll help them debug this in the wild.
----- Rom My Blog
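To make the two signs above concrete, here is a minimal sketch that checks the quoted numbers against the 2 GB user-mode limit. Windows error code 8 is ERROR_NOT_ENOUGH_MEMORY. The 64 MiB headroom threshold and the function names are assumptions chosen for illustration; they are not part of BOINC or of the debugger output.

```python
# Typical user-mode address space of a 32-bit Windows process: 2 GiB.
USER_MODE_LIMIT_BYTES = 2 * 1024**3
# Windows system error code 8 (the value reported by GetLastError above).
ERROR_NOT_ENOUGH_MEMORY = 8
# Assumed threshold: treat less than 64 MiB of headroom as "at the limit".
HEADROOM_THRESHOLD_BYTES = 64 * 1024**2


def headroom_bytes(virtual_size: int) -> int:
    """Address space left before the 2 GiB user-mode limit."""
    return USER_MODE_LIMIT_BYTES - virtual_size


def looks_like_address_space_exhaustion(virtual_size: int, last_error: int) -> bool:
    """True when both signs are present: error 8 and almost no headroom."""
    return (last_error == ERROR_NOT_ENOUGH_MEMORY
            and headroom_bytes(virtual_size) < HEADROOM_THRESHOLD_BYTES)


peak = 2127511552  # PeakVirtualSize from the debugger dump
print(f"{peak / 1024**3:.2f} GiB used, "
      f"{headroom_bytes(peak) / 1024**2:.1f} MiB of headroom")
print(looks_like_address_space_exhaustion(peak, last_error=8))
```

With the values from the dump, the process had used about 1.98 GiB and had roughly 19 MiB of address space left, which fits the explanation that the next allocation request was refused no matter how much physical RAM or pagefile the host had.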