Message boards : Number crunching : Minirosetta v1.34 bug thread
Author | Message |
---|---|
DanieI Joined: 13 Jun 07 Posts: 1 Credit: 13,969,568 RAC: 1,442 |
restarting task?
2008-09-29 17:57:19|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134
2008-09-29 18:14:25|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:21:06|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:26:24|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:28:58|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:31:10|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:33:49|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:36:05|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:38:48|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:41:30|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:44:23|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:58:04|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134
2008-09-29 19:16:08|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 19:28:25|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 19:35:40|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 19:39:09|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 19:41:36|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 19:45:52|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 19:50:01|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 19:53:22|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:06:08|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134
2008-09-29 20:24:17|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:29:11|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:33:04|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:36:11|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:38:54|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:43:22|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:46:51|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:49:16|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:51:35|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 21:00:29|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134
2008-09-29 21:13:21|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 21:17:02|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 21:20:15|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 21:23:10|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
What is it? |
Pepo Joined: 28 Sep 05 Posts: 115 Credit: 101,358 RAC: 0 |
restarting task? The task is being restarted every hour. Your cpu_run_time_pref seems to be 3 hours, so the cause is probably something different. The task is still not reported. Could you please try adding <cpu_sched> and/or <cpu_sched_debug> (and possibly also <task_debug>?) to your cc_config.xml and let the client reread it (BOINC Mgr / Advanced / Read config file) without stopping the client? Peter |
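The spacing between the "Restarting task" messages can be checked directly from the log DanieI posted. The following is a minimal Python sketch, assuming the `date time|project|message` layout seen in that log; the function name `restart_gaps` is made up for this illustration and is not part of BOINC, and the task names are shortened to `...` to keep the sample short.

```python
from datetime import datetime

def restart_gaps(log_lines):
    """Return the gaps, in seconds, between consecutive 'Restarting task' messages."""
    stamps = []
    for line in log_lines:
        # Each message is "timestamp|project|text"; split at most twice.
        parts = line.strip().split("|", 2)
        if len(parts) == 3 and parts[2].startswith("Restarting task"):
            stamps.append(datetime.strptime(parts[0], "%Y-%m-%d %H:%M:%S"))
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]

# Timestamps copied from the posted log; task names shortened for readability.
log = [
    "2008-09-29 17:57:19|rosetta@home|Restarting task ... using minirosetta version 134",
    "2008-09-29 18:14:25|rosetta@home|[checkpoint_debug] result ... checkpointed",
    "2008-09-29 18:58:04|rosetta@home|Restarting task ... using minirosetta version 134",
    "2008-09-29 20:06:08|rosetta@home|Restarting task ... using minirosetta version 134",
    "2008-09-29 21:00:29|rosetta@home|Restarting task ... using minirosetta version 134",
]

print(restart_gaps(log))  # [3645, 4084, 3261]
```

The three gaps come out at roughly 54 to 68 minutes, which fits the "about once an hour" reading but does not by itself say what triggers the restarts.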
l_mckeon Joined: 5 Jun 07 Posts: 44 Credit: 180,717 RAC: 0 |
Here's a weird one. I've had a couple of tasks that refuse to suspend, both hombench tasks on 1.34. I manually suspend the task (and BOINC reports Task Suspended) or the task is "Waiting to Run" and the task continues to run. I've only got a dual core processor but I have three tasks running at once, two Rosetta tasks plus one World Community grid. Task Manager reports all three tasks having 33 per cent of CPU cycles. The tasks complete and upload normally. |
Pepo Joined: 28 Sep 05 Posts: 115 Credit: 101,358 RAC: 0 |
I've had a couple of tasks that refuse to suspend, both hombench tasks on 1.34. It happens. Peter |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
restarting task? Don't confuse the WU runtime preference with the other BOINC preferences, such as how frequently to switch between applications. Basically, the message about restarting is an indication that the task was suspended and is now beginning to run again. Your BOINC preferences basically dictate what conditions would suspend a running task. The simplest being that another task from another project begins running. If you've included all of the messages, then that is not the case here. Perhaps you have set up BOINC to not run while the computer is in use? If so, then each time you step up to use it, BOINC suspends the tasks. Then once the machine is idle for the configured period of time, it resumes what it was doing. The message DOES NOT mean that the task is starting again from the beginning. You see all of those checkpoint messages? The worst case is that when it restarts, it begins from the last checkpoint. You want to double check your setting for leaving applications in memory while suspended though, especially if you are set to not run while the computer is in use. And you want to say "YES" to leave applications in memory. Or if you are looking at the preferences on your local machine, there is a checkbox indicating you want suspended applications to remain in memory. This ensures that work that hasn't been checkpointed yet is retained until the task "restarts" later. Rosetta Moderator: Mod.Sense |
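To put a rough number on "the worst case is that it begins from the last checkpoint", one can look at the longest stretch between two consecutive progress markers in a log. The sketch below is only an illustration under stated assumptions: it uses the first few timestamps from the first log posted in this thread, treats the "Restarting task" message as the start of the run, and the helper name `longest_gap_seconds` is hypothetical.

```python
from datetime import datetime

def longest_gap_seconds(timestamps):
    """Longest interval, in seconds, between consecutive timestamps."""
    times = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps]
    return max(int((b - a).total_seconds()) for a, b in zip(times, times[1:]))

# First restart and the four checkpoints that follow it in the posted log.
events = [
    "2008-09-29 17:57:19",  # Restarting task
    "2008-09-29 18:14:25",  # checkpointed
    "2008-09-29 18:21:06",  # checkpointed
    "2008-09-29 18:26:24",  # checkpointed
    "2008-09-29 18:28:58",  # checkpointed
]

print(longest_gap_seconds(events))  # 1026
```

In this sample the longest stretch is the 1026 seconds (about 17 minutes) between the restart and the first checkpoint, so under these assumptions that is the most work a suspension could cost if the application were not kept in memory.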
Pepo Joined: 28 Sep 05 Posts: 115 Credit: 101,358 RAC: 0 |
The task is being restarted every hour. Your cpu_run_time_pref seems to be 3 hours, so it is probably something different. I'm not. It was just an idea about one of the possible reasons for the task being restarted - something happening in the application algorithm at the end of the preferred time interval... Basically, the message about restarting is an indication that the task was suspended and is now beginning to run again. Your BOINC preferences basically dictate what conditions would suspend a running task. The simplest being that another task from another project begins running. ...Another was that the client is restarting the task (the default time slot is 1 hour), but the lack of any other messages suggested that this is not the case. The messages suggested the application is rather terminating itself :-) (But looking at the time stamps, there really do seem to be gaps of a few minutes where some small app could fit in, if there is one.) If you've included all of the messages, then that is not the case here. I failed to ask whether the message list is complete at all. I simply had not thought of this. Why? Because DanieI currently seems to be a dedicated Rosetta cruncher (sure, his CPIDs could be out of sync). Perhaps you have set up BOINC to not run while the computer is in use? If so, then each time you step up to use it, BOINC suspends the tasks. Then once the machine is idle for the configured period of time, it resumes what it was doing. I was hoping the additional logging flags would help to reveal this. But possibly just adding the discarded messages could solve it (a "constructed mystery"). Peter |
upstatelabs Joined: 22 Jun 06 Posts: 10 Credit: 516,767 RAC: 0 |
Can anyone explain why I might be getting this kind of repeating error?
9/28/2008 11:17:09 PM|rosetta@home|Task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 exited with zero status but no 'finished' file
9/28/2008 11:17:09 PM|rosetta@home|If this happens repeatedly you may need to reset the project.
9/28/2008 11:17:09 PM|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 using minirosetta version 134
9/28/2008 11:17:50 PM|rosetta@home|Task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 exited with zero status but no 'finished' file
9/28/2008 11:17:50 PM|rosetta@home|If this happens repeatedly you may need to reset the project.
9/28/2008 11:17:50 PM|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 using minirosetta version 134
9/28/2008 11:18:31 PM|rosetta@home|Task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 exited with zero status but no 'finished' file
9/28/2008 11:18:31 PM|rosetta@home|If this happens repeatedly you may need to reset the project.
This is only an excerpt of the series of messages. I've had a few WUs do this recently. |
Pepo Joined: 28 Sep 05 Posts: 115 Credit: 101,358 RAC: 0 |
Can anyone explain why I might be getting this kind of repeating error? Either the client or the system is busy (so the client fails to deliver heartbeat messages to the Rosetta app, which in turn quits after 30 seconds), or the application has a problem of its own and keeps terminating for an unknown reason. I'd bet the reason is the same as what DanieI described in Message 56087, although the behavior differs. You could try using the mentioned debugging logs for a couple of minutes to see whether they reveal something... Peter |
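The heartbeat idea described here (the app gives up if it has not heard from the client for 30 seconds) can be pictured with a small watchdog. This is a generic Python sketch of the technique, not BOINC's actual code: the class `HeartbeatWatchdog` and its methods are hypothetical, and the 30-second figure is simply the one quoted in the post. A fake clock is injected so the example runs instantly.

```python
import time

class HeartbeatWatchdog:
    """Remembers when the last heartbeat arrived and reports a timeout."""

    def __init__(self, timeout_s=30.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_beat = clock()

    def beat(self):
        # Called whenever a heartbeat message arrives from the client.
        self.last_beat = self.clock()

    def expired(self):
        # True once more than timeout_s seconds have passed without a heartbeat.
        return self.clock() - self.last_beat > self.timeout_s

# Simulated clock so no real waiting is needed.
now = [0.0]
dog = HeartbeatWatchdog(timeout_s=30.0, clock=lambda: now[0])

now[0] = 20.0
dog.beat()            # heartbeat received at t = 20 s
now[0] = 45.0         # 25 s of silence: still within the limit
print(dog.expired())  # False
now[0] = 51.0         # 31 s of silence: the app would give up here
print(dog.expired())  # True
```

If the client is starved of CPU time, `beat()` is simply never called in time, which is why a busy system and a misbehaving application can look similar from the message log alone.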
upstatelabs Joined: 22 Jun 06 Posts: 10 Credit: 516,767 RAC: 0 |
Where is this cc_config.xml file located? I can't seem to find it. Thanks. |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
In version 6 BOINC clients, it is located in the data directory. The data directory is shown in the messages as BOINC first starts. Rosetta Moderator: Mod.Sense |
Pepo Joined: 28 Sep 05 Posts: 115 Credit: 101,358 RAC: 0 |
You could try to use the mentioned debugging flags for a couple of minutes, whether it will reveal something... It does not exist unless you have already created it manually. Its description is here. As Mod.Sense said, the exact place is best described in the first messages. You need to put just (some of?) the following tags in: <cc_config> and then let the client read it (BOINC Mgr / Advanced / Read config file); stopping the client is not necessary. It will possibly generate a lot of output (maybe not), which will be similar across the restarts. You could select some text starting from the last checkpoint before a restart and ending at the first checkpoint after the same restart. We will see... (maybe we will not see anything obvious :-) Peter |
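For anyone creating the file from scratch, here is a hedged Python sketch that builds a small cc_config.xml. It assumes the usual layout in which logging flags sit inside a `<log_flags>` element under `<cc_config>`, and it uses the flag names mentioned earlier in this thread; it is not necessarily the exact block from the post above, so check it against the BOINC client configuration documentation. The helper name `build_cc_config` is made up for this example.

```python
import xml.etree.ElementTree as ET

def build_cc_config(flags):
    """Return cc_config XML text with each named log flag switched on."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")  # assumed container for log flags
    for name in flags:
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")

# Flag names taken from the suggestion earlier in the thread.
xml_text = build_cc_config(["cpu_sched", "cpu_sched_debug", "task_debug"])
print(xml_text)
```

The printed text would be saved as cc_config.xml in the BOINC data directory and then picked up with BOINC Mgr / Advanced / Read config file, as described in the post.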
Bob Browett Joined: 14 Dec 05 Posts: 11 Credit: 2,275,743 RAC: 0 |
Hi Multiple 1.34 failures: Units 179091999, 179091969, 179091951, 179091934, 179091933 and several others. All show the following: <core_client_version>6.2.19</core_client_version> <![CDATA[ <message> - exit code -1073741819 (0xc0000005) </message> <stderr_txt> needs psipred_ss2 to run filters needs psipred_ss2 to run filters Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x00573D62 read attempt to address 0x2319D810 Engaging BOINC Windows Runtime Debugger... ******************** BOINC Windows Runtime Debugger Version 6.3.10 Dump Timestamp : 10/04/08 03:03:07 Install Directory : C:Program FilesBOINC Data Directory : C:Documents and SettingsAll UsersApplication DataBOINC Project Symstore : Loaded Library : C:Program FilesBOINC\dbghelp.dll Loaded Library : C:Program FilesBOINC\symsrv.dll Loaded Library : C:Program FilesBOINC\srcsrv.dll LoadLibraryA( C:Program FilesBOINC\version.dll ): GetLastError = 126 Loaded Library : version.dll Debugger Engine : 4.0.5.0 </stderr_txt> ]]> AND also <core_client_version>6.2.19</core_client_version> <![CDATA[ <message> - exit code -1073741819 (0xc0000005) </message> <stderr_txt> Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x00573D9D write attempt to address 0x237660A8 Engaging BOINC Windows Runtime Debugger... 
******************** BOINC Windows Runtime Debugger Version 6.3.10 Dump Timestamp : 10/04/08 04:52:23 Install Directory : C:Program FilesBOINC Data Directory : C:Documents and SettingsAll UsersApplication DataBOINC Project Symstore : Loaded Library : C:Program FilesBOINC\dbghelp.dll Loaded Library : C:Program FilesBOINC\symsrv.dll Loaded Library : C:Program FilesBOINC\srcsrv.dll LoadLibraryA( C:Program FilesBOINC\version.dll ): GetLastError = 126 Loaded Library : version.dll Debugger Engine : 4.0.5.0 Symbol Search Path: C:Documents and SettingsAll UsersApplication DataBOINCslots2;C:Documents and SettingsAll UsersApplication DataBOINCprojectsboinc.bakerlab.org_rosetta;srv*C:DOCUME~1BobLOCALS~1Tempsymbols*http://msdl.microsoft.com/download/symbols;srv*C:DOCUME~1BobLOCALS~1Tempsymbols*http://boinc.berkeley.edu/symstore ModLoad: 00400000 00605000 C:Documents and SettingsAll UsersApplication DataBOINCprojectsboinc.bakerlab.org_rosettaminirosetta_1.34_windows_intelx86.exe (-nosymbols- Symbols Loaded) Linked PDB Filename : C:cygwinhomeboincboinc_buildminirosettaminirosetta_1.34miniVisual StudioBoincReleaseminirosetta_1.34_windows_intelx86.pdb ModLoad: 7c900000 000af000 C:WINDOWSsystem32ntdll.dll (5.1.2600.5512) (PDB Symbols Loaded) Linked PDB Filename : ntdll.pdb File Version : 5.1.2600.5512 (xpsp.080413-2111) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 5.1.2600.5512 ModLoad: 7c800000 000f6000 C:WINDOWSsystem32kernel32.dll (5.1.2600.5512) (PDB Symbols Loaded) Linked PDB Filename : kernel32.pdb File Version : 5.1.2600.5512 (xpsp.080413-2111) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 5.1.2600.5512 ModLoad: 7e410000 00091000 C:WINDOWSsystem32USER32.dll (5.1.2600.5512) (PDB Symbols Loaded) Linked PDB Filename : user32.pdb File Version : 5.1.2600.5512 (xpsp.080413-2105) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating 
System Product Version : 5.1.2600.5512 ModLoad: 77f10000 00049000 C:WINDOWSsystem32GDI32.dll (5.1.2600.5512) (PDB Symbols Loaded) Linked PDB Filename : gdi32.pdb File Version : 5.1.2600.5512 (xpsp.080413-2105) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 5.1.2600.5512 ModLoad: 77dd0000 0009b000 C:WINDOWSsystem32ADVAPI32.dll (5.1.2600.5512) (PDB Symbols Loaded) Linked PDB Filename : advapi32.pdb File Version : 5.1.2600.5512 (xpsp.080413-2113) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 5.1.2600.5512 ModLoad: 77e70000 00092000 C:WINDOWSsystem32RPCRT4.dll (5.1.2600.5512) (PDB Symbols Loaded) Linked PDB Filename : rpcrt4.pdb File Version : 5.1.2600.5512 (xpsp.080413-2108) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 5.1.2600.5512 ModLoad: 77fe0000 00011000 C:WINDOWSsystem32Secur32.dll (5.1.2600.5512) (PDB Symbols Loaded) Linked PDB Filename : secur32.pdb File Version : 5.1.2600.5512 (xpsp.080413-2113) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 5.1.2600.5512 ModLoad: 76390000 0001d000 C:WINDOWSsystem32IMM32.DLL (5.1.2600.5512) (PDB Symbols Loaded) Linked PDB Filename : imm32.pdb File Version : 5.1.2600.5512 (xpsp.080413-2105) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 5.1.2600.5512 ModLoad: 77690000 00021000 C:WINDOWSsystem32NTMARTA.DLL (5.1.2600.5512) (PDB Symbols Loaded) Linked PDB Filename : ntmarta.pdb File Version : 5.1.2600.5512 (xpsp.080413-2113) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 5.1.2600.5512 ModLoad: 77c10000 00058000 C:WINDOWSsystem32msvcrt.dll (7.0.2600.5512) (PDB Symbols Loaded) Linked PDB Filename : msvcrt.pdb File Version : 7.0.2600.5512 (xpsp.080413-2111) Company Name : 
Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 7.0.2600.5512 ModLoad: 774e0000 0013d000 C:WINDOWSsystem32ole32.dll (5.1.2600.5512) (PDB Symbols Loaded) Linked PDB Filename : ole32.pdb File Version : 5.1.2600.5512 (xpsp.080413-2108) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 5.1.2600.5512 ModLoad: 71bf0000 00013000 C:WINDOWSsystem32SAMLIB.dll (5.1.2600.5512) (PDB Symbols Loaded) Linked PDB Filename : samlib.pdb File Version : 5.1.2600.5512 (xpsp.080413-2113) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 5.1.2600.5512 ModLoad: 76f60000 0002c000 C:WINDOWSsystem32WLDAP32.dll (5.1.2600.5512) (PDB Symbols Loaded) Linked PDB Filename : wldap32.pdb File Version : 5.1.2600.5512 (xpsp.080413-2113) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 5.1.2600.5512 ModLoad: 02510000 00115000 C:Program FilesBOINCdbghelp.dll (6.8.4.0) (PDB Symbols Loaded) Linked PDB Filename : dbghelp.pdb File Version : 6.8.0004.0 (debuggers(dbg).070515-1751) Company Name : Microsoft Corporation Product Name : Debugging Tools for Windows(R) Product Version : 6.8.0004.0 ModLoad: 02730000 00048000 C:Program FilesBOINCsymsrv.dll (6.8.4.0) (PDB Symbols Loaded) Linked PDB Filename : symsrv.pdb File Version : 6.8.0004.0 (debuggers(dbg).070515-1751) Company Name : Microsoft Corporation Product Name : Debugging Tools for Windows(R) Product Version : 6.8.0004.0 ModLoad: 003c0000 0003b000 C:Program FilesBOINCsrcsrv.dll (6.8.4.0) (PDB Symbols Loaded) Linked PDB Filename : srcsrv.pdb File Version : 6.8.0004.0 (debuggers(dbg).070515-1751) Company Name : Microsoft Corporation Product Name : Debugging Tools for Windows(R) Product Version : 6.8.0004.0 ModLoad: 77c00000 00008000 C:WINDOWSsystem32version.dll (5.1.2600.5512) (PDB Symbols Loaded) Linked PDB Filename : version.pdb File Version : 
5.1.2600.5512 (xpsp.080413-2105) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 5.1.2600.5512 *** Dump of the Process Statistics: *** - I/O Operations Counters - Read: 4496, Write: 0, Other 1589 - I/O Transfers Counters - Read: 0, Write: 40419, Other 0 - Paged Pool Usage - QuotaPagedPoolUsage: 46364, QuotaPeakPagedPoolUsage: 46364 QuotaNonPagedPoolUsage: 2744, QuotaPeakNonPagedPoolUsage: 2744 - Virtual Memory Usage - VirtualSize: 169021440, PeakVirtualSize: 170872832 - Pagefile Usage - PagefileUsage: 119431168, PeakPagefileUsage: 124928000 - Working Set Size - WorkingSetSize: 122888192, PeakWorkingSetSize: 128348160, PageFaultCount: 1890158 *** Dump of thread ID 3252 (state: Waiting): *** - Information - Status: Wait Reason: UserRequest, , Kernel Time: 118593752.000000, User Time: 7576093696.000000, Wait Time: 2235388.000000 - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x00573D9D write attempt to address 0x237660A8 - Registers - eax=03776d40 ebx=ffffffff ecx=0851c448 edx=237660a4 esi=032cdf30 edi=030be688 eip=00573d9d esp=0012c900 ebp=01e6fac8 cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010213 - Callstack - ChildEBP RetAddr Args to Child 0012c908 00573f1f 0851c460 03776d40 035e9fe8 07f7a6d8 minirosetta_1.34_windows_intelx!+0x0 0012c924 006cd396 07f7a6d8 006cd523 00000000 0012d298 minirosetta_1.34_windows_intelx!+0x0 0012c92c 006cd523 00000000 0012d298 0000005b 004f87c0 minirosetta_1.34_windows_intelx!+0x0 0012c93c 004f87c0 07f7a6d8 0012d298 0012cef8 00000001 minirosetta_1.34_windows_intelx!+0x0 0012c964 004f908c 01e6fac8 0012d298 0012d298 0012d298 minirosetta_1.34_windows_intelx!+0x0 0012c978 004fb29d 00000000 0012cef8 004c59ce 0012c9cc minirosetta_1.34_windows_intelx!+0x0 0012d298 008c3470 00000000 00000090 008c6f9c 00000000 minirosetta_1.34_windows_intelx!+0x0 0012d29c 00000000 00000090 008c6f9c 00000000 03236848 minirosetta_1.34_windows_intelx!+0x0 *** Dump 
of thread ID 2876 (state: Waiting): *** - Information - Status: Wait Reason: ExecutionDelay, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 2235383.000000 - Registers - eax=0152f8b0 ebx=00000000 ecx=00000005 edx=00000078 esi=00000000 edi=0152ff70 eip=7c90e4f4 esp=0152ff40 ebp=0152ff98 cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202 - Callstack - ChildEBP RetAddr Args to Child 0152ff3c 7c90d1fc 7c8023f1 00000000 0152ff70 00000000 ntdll!_KiFastSystemCallRet@0+0x0 FPO: [0,0,0] 0152ff40 7c8023f1 00000000 0152ff70 00000000 7c802446 ntdll!_NtDelayExecution@8+0x0 FPO: [2,0,0] 0152ff98 7c802455 00000064 00000000 0152ffec 0041c24b kernel32!_SleepEx@8+0x0 0152ffa8 0041c24b 00000064 00000000 7c80b713 00000000 kernel32!_Sleep@4+0x0 0152ffec 00000000 0041c240 00000000 00000000 1dbb0000 minirosetta_1.34_windows_intelx!+0x0 *** Dump of thread ID 3780 (state: Waiting): *** - Information - Status: Wait Reason: ExecutionDelay, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 2235281.000000 - Registers - eax=0241fe18 ebx=01e8ce00 ecx=0241e82c edx=000001f9 esi=00000000 edi=0241fdf8 eip=7c90e4f4 esp=0241fdc8 ebp=0241fe20 cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202 - Callstack - ChildEBP RetAddr Args to Child 0241fdc4 7c90d1fc 7c8023f1 00000000 0241fdf8 000001fc ntdll!_KiFastSystemCallRet@0+0x0 FPO: [0,0,0] 0241fdc8 7c8023f1 00000000 0241fdf8 000001fc 01e8ceb8 ntdll!_NtDelayExecution@8+0x0 FPO: [2,0,0] 0241fe20 7c802455 000007d0 00000000 7c802446 0076bf14 kernel32!_SleepEx@8+0x0 0241fe30 0076bf14 000007d0 7824af0a 0012c8c8 01e8ceb8 kernel32!_Sleep@4+0x0 0241fe38 7824af0a 0012c8c8 01e8ceb8 0241ff6c 01e8ceb8 minirosetta_1.34_windows_intelx!+0x0 0241fe3c 0012c8c8 01e8ceb8 0241ff6c 01e8ceb8 00000001 minirosetta_1.34_windows_intelx!+0x0 SymFromAddr(): GetLastError = '126' SymGetLineFromAddr(): GetLastError = '126' SymGetModuleInfo(): GetLastError = '126' Address = '7824af0a' 0241ff3c 7c917de9 7c917ea0 7c800000 0241ff7c 00000000 
minirosetta_1.34_windows_intelx!+0x0 SymFromAddr(): GetLastError = '126' SymGetLineFromAddr(): GetLastError = '126' SymGetModuleInfo(): GetLastError = '126' Address = '0012c8c8' 0241ffe0 7c80b71f 00000000 00000000 00000000 00429456 ntdll!_LdrpGetProcedureAddress@20+0x0 SymFromAddr(): GetLastError = '126' SymGetLineFromAddr(): GetLastError = '126' SymGetModuleInfo(): GetLastError = '126' Address = '7c917de9' 0241ffe4 00000000 00000000 00000000 00429456 01e8ceb8 kernel32!_BaseThreadStart@8+0x0 FPO: [0,0,0] SymFromAddr(): GetLastError = '126' SymGetLineFromAddr(): GetLastError = '126' SymGetModuleInfo(): GetLastError = '126' Address = '7c80b71f' *** Debug Message Dump **** *** Foreground Window Data *** Window Name : Window Class : Window Process ID: 0 Window Thread ID : 0 Exiting... </stderr_txt> ]]> Yummy! |
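The two forms of the exit code in this report, -1073741819 and 0xc0000005, are the same 32-bit value read as signed and as unsigned. A short Python sketch of the conversion follows; the helper name `to_unsigned32` is just for illustration.

```python
def to_unsigned32(code):
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF

exit_code = -1073741819  # signed form shown in the <message> block
print(hex(to_unsigned32(exit_code)))  # 0xc0000005
```

Written out by hand: 4294967296 - 1073741819 = 3221225477, which is 0xC0000005, the Windows access violation status named in the "Reason" line of the dump.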
Bossone Joined: 6 Jan 06 Posts: 3 Credit: 37,125 RAC: 0 |
Hello, since this evening Rosetta v1.34 has been causing massive failures. It has a signature verification error and the output files are missing. See below a part of the report:
6-10-2008 21:35:30|rosetta@home|[error] Signature verification failed for minirosetta_1.34_windows_intelx86.exe
6-10-2008 21:35:30|rosetta@home|Starting abinitio_nohomfrag_70_A_1h75A_4466_32572_0
6-10-2008 21:35:31|rosetta@home|Computation for task abinitio_nohomfrag_70_A_1zd0A_4466_32533_0 finished
6-10-2008 21:35:31|rosetta@home|Output file abinitio_nohomfrag_70_A_1zd0A_4466_32533_0_0 for task abinitio_nohomfrag_70_A_1zd0A_4466_32533_0 absent
6-10-2008 21:35:31|rosetta@home|Computation for task abinitio_nohomfrag_70_A_1h75A_4466_32572_0 finished
6-10-2008 21:35:31|rosetta@home|Output file abinitio_nohomfrag_70_A_1h75A_4466_32572_0_0 for task abinitio_nohomfrag_70_A_1h75A_4466_32572_0 absent
6-10-2008 21:35:32|rosetta@home|Started upload of abinitio_nohomfrag_70_A_1tzaA_4466_28308_0_0
6-10-2008 21:35:36|rosetta@home|Finished upload of abinitio_nohomfrag_70_A_1tzaA_4466_28308_0_0
6-10-2008 21:36:34|rosetta@home|Sending scheduler request: To fetch work. Requesting 60480 seconds of work, reporting 4 completed tasks
6-10-2008 21:36:39|rosetta@home|Scheduler request succeeded: got 3 new tasks
6-10-2008 21:36:40|rosetta@home|[error] garbage_collect(); still have active task for acked result abinitio_nohomfrag_70_A_1faaA_4466_28786_0; state 0
6-10-2008 21:36:40|rosetta@home|Computation for task abinitio_nohomfrag_70_A_1faaA_4466_28786_0 finished
6-10-2008 21:36:40|rosetta@home|Output file abinitio_nohomfrag_70_A_1faaA_4466_28786_0_0 for task abinitio_nohomfrag_70_A_1faaA_4466_28786_0 absent
Thank you. |
Michael E.@ team Carl Sagan Joined: 5 Apr 08 Posts: 16 Credit: 1,942,656 RAC: 1,016 |
I have a task that may be looping. The Message tab shows the task restarting every 30 minutes and the time remaining has been stuck at 00:09:57 for almost two working days. These tasks usually take 9-10 hours on this computer (computer ID is 858463). Total CPU time and % complete follow: CPU time 12:34:xx and Progress 98.691% (Friday AM) CPU time 18:28:xx and Progress 99.105% (Monday late afternoon) Some of the log messages in the Message tab indicate a restart about every thirty minutes: 10/6/2008 4:13:53 PM|rosetta@home|Restarting task hombench_mtyka_foldcst_loopbuild_boinctest3_foldcst_loopbuild_t328__IGNORE_THE_REST_1VIMA_16_4578_2_0 using minirosetta version 134 10/6/2008 4:45:54 PM|rosetta@home|Restarting task hombench_mtyka_foldcst_loopbuild_boinctest3_foldcst_loopbuild_t328__IGNORE_THE_REST_1VIMA_16_4578_2_0 using minirosetta version 134 Should I let this task continue or abort it? Task ID is 194948950 and Work unit is 178087521. It is now past the Report Deadline, so it has been re-sent. The system is a laptop running Windows XP. It has development software installed. Michael E. (Mike) |
Sid Celery Joined: 11 Feb 08 Posts: 2124 Credit: 41,224,965 RAC: 11,021 |
Can anyone explain why I might be getting this kind of repeating error?... In short, no, but it's something that comes up massively on my Vista64 system. Seems like it's being reported on XPSP3 machines now too. The task itself reports "too many exit(0)s - Can't acquire lockfile - exiting" |
Bob Browett Joined: 14 Dec 05 Posts: 11 Credit: 2,275,743 RAC: 0 |
Hi 4 more errors to add; tasks: 199237177 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x00518F33 read attempt to address 0x235EBBB8 Engaging BOINC Windows Runtime Debugger... The others are not showing on the results page yet, but all failed within the last hour. |
Bob Browett Joined: 14 Dec 05 Posts: 11 Credit: 2,275,743 RAC: 0 |
Hi Overnight I had 2 successful crunchings and 61 (!) failed. Is it worth my computer time for so many errors? |
Bob Browett Joined: 14 Dec 05 Posts: 11 Credit: 2,275,743 RAC: 0 |
Hi Overnight tonight 1 successful 40 failed |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Bob, those same tasks failed on systems prior to yours. Rosetta has to try 2 systems to make sure it's not just a single computer that failed. It looks like you got a bad batch of abinitio_nohomfrag_70_A tasks. |
Bob Browett Joined: 14 Dec 05 Posts: 11 Credit: 2,275,743 RAC: 0 |
Bob those same tasks failed on systems prior to yours. 100 bad units at once!! I call that more than bad luck. I call that a credit crunch. I better nationalise Rosetta. How much do you want David? |
©2024 University of Washington
https://www.bakerlab.org