Minirosetta v1.34 bug thread

DanieI

Joined: 13 Jun 07
Posts: 1
Credit: 13,624,933
RAC: 2,041
Message 56087 - Posted: 29 Sep 2008, 19:54:56 UTC

restarting task?


2008-09-29 17:57:19|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134
2008-09-29 18:14:25|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:21:06|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:26:24|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:28:58|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:31:10|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:33:49|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:36:05|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:38:48|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:41:30|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:44:23|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 18:58:04|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134
2008-09-29 19:16:08|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 19:28:25|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 19:35:40|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 19:39:09|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 19:41:36|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 19:45:52|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 19:50:01|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 19:53:22|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:06:08|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134
2008-09-29 20:24:17|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:29:11|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:33:04|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:36:11|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:38:54|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:43:22|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:46:51|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:49:16|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 20:51:35|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 21:00:29|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134
2008-09-29 21:13:21|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 21:17:02|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 21:20:15|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed
2008-09-29 21:23:10|rosetta@home|[checkpoint_debug] result hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 checkpointed


What is it?
Pepo
Joined: 28 Sep 05
Posts: 115
Credit: 101,358
RAC: 0
Message 56090 - Posted: 29 Sep 2008, 21:11:24 UTC - in response to Message 56087.  

restarting task?

2008-09-29 17:57:19|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134
2008-09-29 18:58:04|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134
2008-09-29 20:06:08|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134
2008-09-29 21:00:29|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134


What is it?

The task is being restarted roughly every hour. Your cpu_run_time_pref seems to be 3 hours, so the cause is probably something else.
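For anyone who wants to check the spacing between restarts rather than eyeball it, here is a minimal sketch. It only uses the four restart timestamps quoted above; the function name is made up for illustration.

```python
from datetime import datetime

# Restart timestamps copied from the log excerpt quoted above.
restarts = [
    "2008-09-29 17:57:19",
    "2008-09-29 18:58:04",
    "2008-09-29 20:06:08",
    "2008-09-29 21:00:29",
]

def restart_gaps_seconds(stamps):
    """Return the gap in whole seconds between consecutive timestamps."""
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]

gaps = restart_gaps_seconds(restarts)
for gap in gaps:
    # Show each gap as minutes and seconds.
    print(f"{gap // 60} min {gap % 60} s")
```

The gaps come out close to an hour but not identical, so "roughly hourly" is about as much as the timestamps alone can support.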

The task is still not reported. Could you please try to add <cpu_sched> and/or <cpu_sched_debug> (and possibly also <task_debug>?) to your cc_config.xml and let the client reread it (BOINC Mgr / Advanced / Read config file) without stopping the client?

Peter
l_mckeon

Joined: 5 Jun 07
Posts: 44
Credit: 180,717
RAC: 0
Message 56092 - Posted: 29 Sep 2008, 22:17:51 UTC

Here's a weird one. I've had a couple of tasks that refuse to suspend, both hombench tasks on 1.34.

I manually suspend the task (and BOINC reports Task Suspended) or the task is "Waiting to Run" and the task continues to run.

I've only got a dual core processor but I have three tasks running at once, two Rosetta tasks plus one World Community grid. Task Manager reports all three tasks having 33 per cent of CPU cycles. The tasks complete and upload normally.

Pepo
Joined: 28 Sep 05
Posts: 115
Credit: 101,358
RAC: 0
Message 56093 - Posted: 29 Sep 2008, 23:08:10 UTC - in response to Message 56092.  

I've had a couple of tasks that refuse to suspend, both hombench tasks on 1.34.

I manually suspend the task (and BOINC reports Task Suspended) or the task is "Waiting to Run" and the task continues to run.

It happens.

Peter
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 56094 - Posted: 29 Sep 2008, 23:29:07 UTC - in response to Message 56090.  

restarting task?

2008-09-29 17:57:19|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134
2008-09-29 18:58:04|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134
2008-09-29 20:06:08|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134
2008-09-29 21:00:29|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t315___4592_834_0 using minirosetta version 134


What is it?

The task is being restarted roughly every hour. Your cpu_run_time_pref seems to be 3 hours, so the cause is probably something else.

The task is still not reported. Could you please try to add <cpu_sched> and/or <cpu_sched_debug> (and possibly also <task_debug>?) to your cc_config.xml and let the client reread it (BOINC Mgr / Advanced / Read config file) without stopping the client?

Peter


Don't confuse the WU runtime preference with the other BOINC preferences, such as how frequently to switch between applications.

Basically, the message about restarting is an indication that the task was suspended and is now beginning to run again. Your BOINC preferences basically dictate what conditions would suspend a running task. The simplest being that another task from another project begins running. If you've included all of the messages, then that is not the case here. Perhaps you have set up BOINC to not run while the computer is in use? If so, then each time you step up to use it, BOINC suspends the tasks. Then once the machine is idle for the configured period of time, it resumes what it was doing.

The message DOES NOT mean that the task is starting again from the beginning. You see all of those checkpoint messages? The worst case is that when it restarts, it begins from the last checkpoint. You want to double check your setting for leaving applications in memory while suspended though, especially if you are set to not run while the computer is in use. You want to say "YES" to leave applications in memory. Or, if you are looking at the preferences on your local machine, there is a checkbox indicating you want suspended applications to remain in memory. This ensures that work that hasn't been checkpointed yet is retained until the task "restarts" later.
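To see which checkpoint a restart would fall back to in the worst case, the log can be scanned for the last checkpoint line before each restart line. The following is a rough sketch only: the sample lines follow the format of the log posted earlier, the task name is shortened to TASK, and the function name is hypothetical.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# A few lines in the format of the log posted earlier in this thread
# (task name shortened to "TASK" for readability).
log = """\
2008-09-29 18:41:30|rosetta@home|[checkpoint_debug] result TASK checkpointed
2008-09-29 18:44:23|rosetta@home|[checkpoint_debug] result TASK checkpointed
2008-09-29 18:58:04|rosetta@home|Restarting task TASK using minirosetta version 134
2008-09-29 19:16:08|rosetta@home|[checkpoint_debug] result TASK checkpointed
"""

def checkpoint_before_each_restart(text):
    """Pair every 'Restarting task' line with the most recent checkpoint."""
    last_checkpoint = None
    pairs = []
    for line in text.splitlines():
        stamp, _project, message = line.split("|", 2)
        when = datetime.strptime(stamp, FMT)
        if message.endswith("checkpointed"):
            last_checkpoint = when
        elif message.startswith("Restarting task"):
            pairs.append((last_checkpoint, when))
    return pairs

pairs = checkpoint_before_each_restart(log)
for checkpoint, restart in pairs:
    print(f"restart at {restart:%H:%M:%S}, last checkpoint at {checkpoint:%H:%M:%S}")
```

The log does not say when the task was actually suspended, so this only identifies the fallback point, not how much work (if any) was lost.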
Rosetta Moderator: Mod.Sense
Pepo
Joined: 28 Sep 05
Posts: 115
Credit: 101,358
RAC: 0
Message 56105 - Posted: 30 Sep 2008, 11:02:45 UTC - in response to Message 56094.  

The task is being restarted roughly every hour. Your cpu_run_time_pref seems to be 3 hours, so the cause is probably something else.

The task is still not reported. Could you please try to add <cpu_sched> and/or <cpu_sched_debug> (and possibly also <task_debug>?) to your cc_config.xml and let the client reread it (BOINC Mgr / Advanced / Read config file) without stopping the client?

Don't confuse the WU runtime preference with the other BOINC preferences, such as how frequently to switch between applications.

I'm not. It was just an idea about one of the possible reasons for the task being restarted - something happening in the application algorithm at the end of preferred time interval...

Basically, the message about restarting is an indication that the task was suspended and is now beginning to run again. Your BOINC preferences basically dictate what conditions would suspend a running task. The simplest being that another task from another project begins running.

...Another was that the client is restarting the task (the default time slot is 1 hour), but the lack of any other messages suggested that this is not the case. The messages suggested that the application is terminating itself instead :-) (But looking at the time stamps, there really do seem to be short gaps of a few minutes where some small app could fit in. If there is one.)

If you've included all of the messages, then that is not the case here.

I failed to ask whether the message list was complete. I simply didn't think of it. Why? Because DanieI currently seems to be a dedicated Rosetta cruncher (sure, his CPIDs could be out of sync).

Perhaps you have set up BOINC to not run while the computer is in use? If so, then each time you step up to use it, BOINC suspends the tasks. Then once the machine is idle for the configured period of time, it resumes what it was doing.

I was hoping the additional logging flags would help to reveal this. But possibly just adding the discarded messages could solve it (a "constructed mystery").

Peter
upstatelabs

Joined: 22 Jun 06
Posts: 10
Credit: 516,767
RAC: 0
Message 56106 - Posted: 30 Sep 2008, 11:35:56 UTC
Last modified: 30 Sep 2008, 11:36:19 UTC

Can anyone explain why I might be getting this kind of repeating error?


9/28/2008 11:17:09 PM|rosetta@home|Task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 exited with zero status but no 'finished' file
9/28/2008 11:17:09 PM|rosetta@home|If this happens repeatedly you may need to reset the project.
9/28/2008 11:17:09 PM|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 using minirosetta version 134
9/28/2008 11:17:50 PM|rosetta@home|Task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 exited with zero status but no 'finished' file
9/28/2008 11:17:50 PM|rosetta@home|If this happens repeatedly you may need to reset the project.
9/28/2008 11:17:50 PM|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 using minirosetta version 134
9/28/2008 11:18:31 PM|rosetta@home|Task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 exited with zero status but no 'finished' file
9/28/2008 11:18:31 PM|rosetta@home|If this happens repeatedly you may need to reset the project.

This is only an excerpt of the series of messages. I've had a few WUs do this recently.
Pepo
Joined: 28 Sep 05
Posts: 115
Credit: 101,358
RAC: 0
Message 56107 - Posted: 30 Sep 2008, 11:49:16 UTC - in response to Message 56106.  

Can anyone explain why I might be getting this kind of repeating error?

9/28/2008 11:17:09 PM|rosetta@home|Task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 exited with zero status but no 'finished' file
9/28/2008 11:17:09 PM|rosetta@home|If this happens repeatedly you may need to reset the project.
9/28/2008 11:17:09 PM|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 using minirosetta version 134
9/28/2008 11:17:50 PM|rosetta@home|Task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 exited with zero status but no 'finished' file
9/28/2008 11:17:50 PM|rosetta@home|If this happens repeatedly you may need to reset the project.
9/28/2008 11:17:50 PM|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 using minirosetta version 134
9/28/2008 11:18:31 PM|rosetta@home|Task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 exited with zero status but no 'finished' file
9/28/2008 11:18:31 PM|rosetta@home|If this happens repeatedly you may need to reset the project.


This is only an excerpt of the series of messages. I've had a few WUs do this recently.

Either the client or the system is busy (and the client then fails to deliver heartbeat messages to the Rosetta app, which in turn quits after 30 seconds), or the application has a problem of its own and keeps terminating for an unknown reason. I'd bet the reason is the same as DanieI described in Message 56087, although the behavior differs.

You could try enabling the debugging logs mentioned above for a couple of minutes to see whether they reveal something...
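To make the heartbeat idea concrete, here is a toy model of it. This is not BOINC's actual code or API: the class and method names are invented for illustration, and only the 30-second figure comes from the explanation above.

```python
class HeartbeatWatchdog:
    """Hypothetical model: give up if no heartbeat arrives within `timeout` seconds."""

    def __init__(self, timeout=30.0, now=0.0):
        self.timeout = timeout
        self.last_beat = now

    def beat(self, now):
        # Called whenever a heartbeat from the client is received.
        self.last_beat = now

    def should_quit(self, now):
        # True once the silence has lasted longer than the timeout.
        return now - self.last_beat > self.timeout

dog = HeartbeatWatchdog(timeout=30.0, now=0.0)
dog.beat(now=10.0)
print(dog.should_quit(now=35.0))  # 25 s since the last heartbeat
print(dog.should_quit(now=45.0))  # 35 s since the last heartbeat
```

If the client is too busy to send heartbeats for longer than the timeout, an app behaving like this model would exit even though nothing is wrong with the task itself, which would fit a clean exit with no 'finished' file.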

Peter
upstatelabs

Joined: 22 Jun 06
Posts: 10
Credit: 516,767
RAC: 0
Message 56119 - Posted: 30 Sep 2008, 15:52:32 UTC - in response to Message 56107.  
Last modified: 30 Sep 2008, 15:53:25 UTC


I'd bet the reason is the same as DanieI described in Message 56087, although the behavior differs.

You could try enabling the debugging logs mentioned above for a couple of minutes to see whether they reveal something...

Peter


Where is this cc_config.xml file located? I can't seem to find it.

Thanks.
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 56126 - Posted: 30 Sep 2008, 19:38:55 UTC

In version 6 BOINC clients, it is located in the data directory. The data directory is shown in the messages when BOINC first starts.
Rosetta Moderator: Mod.Sense
Pepo
Joined: 28 Sep 05
Posts: 115
Credit: 101,358
RAC: 0
Message 56138 - Posted: 1 Oct 2008, 8:27:10 UTC - in response to Message 56119.  

You could try enabling the debugging flags mentioned above for a couple of minutes to see whether they reveal something...

Where is this cc_config.xml file located? I can't seem to find it.

It does not exist unless you have already created it manually. Its description is here. As Mod.Sense said, the exact location is best found in the first messages.

You just need to put (some of?) the following tags in:
<cc_config>
<log_flags>
<cpu_sched>1</cpu_sched>
<cpu_sched_debug>1</cpu_sched_debug>
<checkpoint_debug>1</checkpoint_debug>
<task_debug>1</task_debug>
</log_flags>
</cc_config>

and then let the client read it (BOINC Mgr / Advanced / Read config file); stopping the client is not necessary.
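If typing the XML by hand feels error-prone, the same structure can be generated. This is just a sketch: the helper name is hypothetical, the flag names are the four listed above, and where the text gets saved is left to you since the data directory differs per machine.

```python
import xml.etree.ElementTree as ET

def build_cc_config(flags):
    """Build a cc_config document with each given log flag set to 1."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")

# The four flags suggested in this post.
xml_text = build_cc_config(
    ["cpu_sched", "cpu_sched_debug", "checkpoint_debug", "task_debug"]
)
print(xml_text)
```

The output is a single line without indentation, which is still valid XML; save it as cc_config.xml in the data directory and have the client reread the config file.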

It may generate a lot of output (or maybe not), which will look similar across the restarts. You could select the text from the last checkpoint before a restart through the first checkpoint after that same restart. We will see... (maybe we will not see anything obvious :-)

Peter
Bob Browett

Joined: 14 Dec 05
Posts: 11
Credit: 2,275,743
RAC: 0
Message 56224 - Posted: 4 Oct 2008, 17:37:03 UTC

Hi
Multiple 1.34 failures:
Units 179091999, 179091969, 179091951, 179091934, 179091933
and several others.

All show the following:
<core_client_version>6.2.19</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
needs psipred_ss2 to run filters
needs psipred_ss2 to run filters


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00573D62 read attempt to address 0x2319D810

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 6.3.10


Dump Timestamp : 10/04/08 03:03:07
Install Directory : C:\Program Files\BOINC
Data Directory : C:\Documents and Settings\All Users\Application Data\BOINC
Project Symstore :
Loaded Library : C:\Program Files\BOINC\dbghelp.dll
Loaded Library : C:\Program Files\BOINC\symsrv.dll
Loaded Library : C:\Program Files\BOINC\srcsrv.dll
LoadLibraryA( C:\Program Files\BOINC\version.dll ): GetLastError = 126
Loaded Library : version.dll
Debugger Engine : 4.0.5.0

</stderr_txt>
]]>


AND also


<core_client_version>6.2.19</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00573D9D write attempt to address 0x237660A8

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 6.3.10


Dump Timestamp : 10/04/08 04:52:23
Install Directory : C:\Program Files\BOINC
Data Directory : C:\Documents and Settings\All Users\Application Data\BOINC
Project Symstore :
Loaded Library : C:\Program Files\BOINC\dbghelp.dll
Loaded Library : C:\Program Files\BOINC\symsrv.dll
Loaded Library : C:\Program Files\BOINC\srcsrv.dll
LoadLibraryA( C:\Program Files\BOINC\version.dll ): GetLastError = 126
Loaded Library : version.dll
Debugger Engine : 4.0.5.0
Symbol Search Path: C:\Documents and Settings\All Users\Application Data\BOINC\slots\2;C:\Documents and Settings\All Users\Application Data\BOINC\projects\boinc.bakerlab.org_rosetta;srv*C:\DOCUME~1\Bob\LOCALS~1\Temp\symbols*http://msdl.microsoft.com/download/symbols;srv*C:\DOCUME~1\Bob\LOCALS~1\Temp\symbols*http://boinc.berkeley.edu/symstore


ModLoad: 00400000 00605000 C:\Documents and Settings\All Users\Application Data\BOINC\projects\boinc.bakerlab.org_rosetta\minirosetta_1.34_windows_intelx86.exe (-nosymbols- Symbols Loaded)
Linked PDB Filename : C:cygwinhomeboincboinc_buildminirosettaminirosetta_1.34miniVisual StudioBoincReleaseminirosetta_1.34_windows_intelx86.pdb

ModLoad: 7c900000 000af000 C:\WINDOWS\system32\ntdll.dll (5.1.2600.5512) (PDB Symbols Loaded)
Linked PDB Filename : ntdll.pdb
File Version : 5.1.2600.5512 (xpsp.080413-2111)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.1.2600.5512

ModLoad: 7c800000 000f6000 C:\WINDOWS\system32\kernel32.dll (5.1.2600.5512) (PDB Symbols Loaded)
Linked PDB Filename : kernel32.pdb
File Version : 5.1.2600.5512 (xpsp.080413-2111)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.1.2600.5512

ModLoad: 7e410000 00091000 C:\WINDOWS\system32\USER32.dll (5.1.2600.5512) (PDB Symbols Loaded)
Linked PDB Filename : user32.pdb
File Version : 5.1.2600.5512 (xpsp.080413-2105)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.1.2600.5512

ModLoad: 77f10000 00049000 C:\WINDOWS\system32\GDI32.dll (5.1.2600.5512) (PDB Symbols Loaded)
Linked PDB Filename : gdi32.pdb
File Version : 5.1.2600.5512 (xpsp.080413-2105)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.1.2600.5512

ModLoad: 77dd0000 0009b000 C:\WINDOWS\system32\ADVAPI32.dll (5.1.2600.5512) (PDB Symbols Loaded)
Linked PDB Filename : advapi32.pdb
File Version : 5.1.2600.5512 (xpsp.080413-2113)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.1.2600.5512

ModLoad: 77e70000 00092000 C:\WINDOWS\system32\RPCRT4.dll (5.1.2600.5512) (PDB Symbols Loaded)
Linked PDB Filename : rpcrt4.pdb
File Version : 5.1.2600.5512 (xpsp.080413-2108)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.1.2600.5512

ModLoad: 77fe0000 00011000 C:\WINDOWS\system32\Secur32.dll (5.1.2600.5512) (PDB Symbols Loaded)
Linked PDB Filename : secur32.pdb
File Version : 5.1.2600.5512 (xpsp.080413-2113)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.1.2600.5512

ModLoad: 76390000 0001d000 C:\WINDOWS\system32\IMM32.DLL (5.1.2600.5512) (PDB Symbols Loaded)
Linked PDB Filename : imm32.pdb
File Version : 5.1.2600.5512 (xpsp.080413-2105)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.1.2600.5512

ModLoad: 77690000 00021000 C:\WINDOWS\system32\NTMARTA.DLL (5.1.2600.5512) (PDB Symbols Loaded)
Linked PDB Filename : ntmarta.pdb
File Version : 5.1.2600.5512 (xpsp.080413-2113)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.1.2600.5512

ModLoad: 77c10000 00058000 C:\WINDOWS\system32\msvcrt.dll (7.0.2600.5512) (PDB Symbols Loaded)
Linked PDB Filename : msvcrt.pdb
File Version : 7.0.2600.5512 (xpsp.080413-2111)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 7.0.2600.5512

ModLoad: 774e0000 0013d000 C:\WINDOWS\system32\ole32.dll (5.1.2600.5512) (PDB Symbols Loaded)
Linked PDB Filename : ole32.pdb
File Version : 5.1.2600.5512 (xpsp.080413-2108)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.1.2600.5512

ModLoad: 71bf0000 00013000 C:\WINDOWS\system32\SAMLIB.dll (5.1.2600.5512) (PDB Symbols Loaded)
Linked PDB Filename : samlib.pdb
File Version : 5.1.2600.5512 (xpsp.080413-2113)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.1.2600.5512

ModLoad: 76f60000 0002c000 C:\WINDOWS\system32\WLDAP32.dll (5.1.2600.5512) (PDB Symbols Loaded)
Linked PDB Filename : wldap32.pdb
File Version : 5.1.2600.5512 (xpsp.080413-2113)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.1.2600.5512

ModLoad: 02510000 00115000 C:\Program Files\BOINC\dbghelp.dll (6.8.4.0) (PDB Symbols Loaded)
Linked PDB Filename : dbghelp.pdb
File Version : 6.8.0004.0 (debuggers(dbg).070515-1751)
Company Name : Microsoft Corporation
Product Name : Debugging Tools for Windows(R)
Product Version : 6.8.0004.0

ModLoad: 02730000 00048000 C:\Program Files\BOINC\symsrv.dll (6.8.4.0) (PDB Symbols Loaded)
Linked PDB Filename : symsrv.pdb
File Version : 6.8.0004.0 (debuggers(dbg).070515-1751)
Company Name : Microsoft Corporation
Product Name : Debugging Tools for Windows(R)
Product Version : 6.8.0004.0

ModLoad: 003c0000 0003b000 C:\Program Files\BOINC\srcsrv.dll (6.8.4.0) (PDB Symbols Loaded)
Linked PDB Filename : srcsrv.pdb
File Version : 6.8.0004.0 (debuggers(dbg).070515-1751)
Company Name : Microsoft Corporation
Product Name : Debugging Tools for Windows(R)
Product Version : 6.8.0004.0

ModLoad: 77c00000 00008000 C:\WINDOWS\system32\version.dll (5.1.2600.5512) (PDB Symbols Loaded)
Linked PDB Filename : version.pdb
File Version : 5.1.2600.5512 (xpsp.080413-2105)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.1.2600.5512



*** Dump of the Process Statistics: ***

- I/O Operations Counters -
Read: 4496, Write: 0, Other 1589

- I/O Transfers Counters -
Read: 0, Write: 40419, Other 0

- Paged Pool Usage -
QuotaPagedPoolUsage: 46364, QuotaPeakPagedPoolUsage: 46364
QuotaNonPagedPoolUsage: 2744, QuotaPeakNonPagedPoolUsage: 2744

- Virtual Memory Usage -
VirtualSize: 169021440, PeakVirtualSize: 170872832

- Pagefile Usage -
PagefileUsage: 119431168, PeakPagefileUsage: 124928000

- Working Set Size -
WorkingSetSize: 122888192, PeakWorkingSetSize: 128348160, PageFaultCount: 1890158

*** Dump of thread ID 3252 (state: Waiting): ***

- Information -
Status: Wait Reason: UserRequest, , Kernel Time: 118593752.000000, User Time: 7576093696.000000, Wait Time: 2235388.000000

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00573D9D write attempt to address 0x237660A8

- Registers -
eax=03776d40 ebx=ffffffff ecx=0851c448 edx=237660a4 esi=032cdf30 edi=030be688
eip=00573d9d esp=0012c900 ebp=01e6fac8
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010213

- Callstack -
ChildEBP RetAddr Args to Child
0012c908 00573f1f 0851c460 03776d40 035e9fe8 07f7a6d8 minirosetta_1.34_windows_intelx!+0x0
0012c924 006cd396 07f7a6d8 006cd523 00000000 0012d298 minirosetta_1.34_windows_intelx!+0x0
0012c92c 006cd523 00000000 0012d298 0000005b 004f87c0 minirosetta_1.34_windows_intelx!+0x0
0012c93c 004f87c0 07f7a6d8 0012d298 0012cef8 00000001 minirosetta_1.34_windows_intelx!+0x0
0012c964 004f908c 01e6fac8 0012d298 0012d298 0012d298 minirosetta_1.34_windows_intelx!+0x0
0012c978 004fb29d 00000000 0012cef8 004c59ce 0012c9cc minirosetta_1.34_windows_intelx!+0x0
0012d298 008c3470 00000000 00000090 008c6f9c 00000000 minirosetta_1.34_windows_intelx!+0x0
0012d29c 00000000 00000090 008c6f9c 00000000 03236848 minirosetta_1.34_windows_intelx!+0x0

*** Dump of thread ID 2876 (state: Waiting): ***

- Information -
Status: Wait Reason: ExecutionDelay, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 2235383.000000

- Registers -
eax=0152f8b0 ebx=00000000 ecx=00000005 edx=00000078 esi=00000000 edi=0152ff70
eip=7c90e4f4 esp=0152ff40 ebp=0152ff98
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202

- Callstack -
ChildEBP RetAddr Args to Child
0152ff3c 7c90d1fc 7c8023f1 00000000 0152ff70 00000000 ntdll!_KiFastSystemCallRet@0+0x0 FPO: [0,0,0]
0152ff40 7c8023f1 00000000 0152ff70 00000000 7c802446 ntdll!_NtDelayExecution@8+0x0 FPO: [2,0,0]
0152ff98 7c802455 00000064 00000000 0152ffec 0041c24b kernel32!_SleepEx@8+0x0
0152ffa8 0041c24b 00000064 00000000 7c80b713 00000000 kernel32!_Sleep@4+0x0
0152ffec 00000000 0041c240 00000000 00000000 1dbb0000 minirosetta_1.34_windows_intelx!+0x0

*** Dump of thread ID 3780 (state: Waiting): ***

- Information -
Status: Wait Reason: ExecutionDelay, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 2235281.000000

- Registers -
eax=0241fe18 ebx=01e8ce00 ecx=0241e82c edx=000001f9 esi=00000000 edi=0241fdf8
eip=7c90e4f4 esp=0241fdc8 ebp=0241fe20
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202

- Callstack -
ChildEBP RetAddr Args to Child
0241fdc4 7c90d1fc 7c8023f1 00000000 0241fdf8 000001fc ntdll!_KiFastSystemCallRet@0+0x0 FPO: [0,0,0]
0241fdc8 7c8023f1 00000000 0241fdf8 000001fc 01e8ceb8 ntdll!_NtDelayExecution@8+0x0 FPO: [2,0,0]
0241fe20 7c802455 000007d0 00000000 7c802446 0076bf14 kernel32!_SleepEx@8+0x0
0241fe30 0076bf14 000007d0 7824af0a 0012c8c8 01e8ceb8 kernel32!_Sleep@4+0x0
0241fe38 7824af0a 0012c8c8 01e8ceb8 0241ff6c 01e8ceb8 minirosetta_1.34_windows_intelx!+0x0
0241fe3c 0012c8c8 01e8ceb8 0241ff6c 01e8ceb8 00000001 minirosetta_1.34_windows_intelx!+0x0 SymFromAddr(): GetLastError = '126' SymGetLineFromAddr(): GetLastError = '126' SymGetModuleInfo(): GetLastError = '126' Address = '7824af0a'
0241ff3c 7c917de9 7c917ea0 7c800000 0241ff7c 00000000 minirosetta_1.34_windows_intelx!+0x0 SymFromAddr(): GetLastError = '126' SymGetLineFromAddr(): GetLastError = '126' SymGetModuleInfo(): GetLastError = '126' Address = '0012c8c8'
0241ffe0 7c80b71f 00000000 00000000 00000000 00429456 ntdll!_LdrpGetProcedureAddress@20+0x0 SymFromAddr(): GetLastError = '126' SymGetLineFromAddr(): GetLastError = '126' SymGetModuleInfo(): GetLastError = '126' Address = '7c917de9'
0241ffe4 00000000 00000000 00000000 00429456 01e8ceb8 kernel32!_BaseThreadStart@8+0x0 FPO: [0,0,0] SymFromAddr(): GetLastError = '126' SymGetLineFromAddr(): GetLastError = '126' SymGetModuleInfo(): GetLastError = '126' Address = '7c80b71f'


*** Debug Message Dump ****


*** Foreground Window Data ***
Window Name :
Window Class :
Window Process ID: 0
Window Thread ID : 0

Exiting...

</stderr_txt>
]]>


Yummy!
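A side note on the numbers in the dumps above: the negative exit code and the hex value in parentheses are the same 32-bit pattern, read as signed and unsigned respectively. A minimal check (the helper name is only for illustration):

```python
def to_unsigned32(code):
    """Reinterpret a signed 32-bit exit code as an unsigned value."""
    return code & 0xFFFFFFFF

exit_code = -1073741819  # value reported in the <message> block
print(hex(to_unsigned32(exit_code)))  # 0xc0000005
```

That matches the "Access Violation (0xc0000005)" line in the exception record, so both reports describe the same failure.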

Bossone

Joined: 6 Jan 06
Posts: 3
Credit: 37,125
RAC: 0
Message 56258 - Posted: 6 Oct 2008, 21:00:36 UTC

Hello, since this evening Rosetta v1.34 has been causing massive failures.
It reports a signature verification error and the output files are missing.
Part of the report is below:
6-10-2008 21:35:30|rosetta@home|[error] Signature verification failed for minirosetta_1.34_windows_intelx86.exe
6-10-2008 21:35:30|rosetta@home|Starting abinitio_nohomfrag_70_A_1h75A_4466_32572_0
6-10-2008 21:35:31|rosetta@home|Computation for task abinitio_nohomfrag_70_A_1zd0A_4466_32533_0 finished
6-10-2008 21:35:31|rosetta@home|Output file abinitio_nohomfrag_70_A_1zd0A_4466_32533_0_0 for task abinitio_nohomfrag_70_A_1zd0A_4466_32533_0 absent
6-10-2008 21:35:31|rosetta@home|Computation for task abinitio_nohomfrag_70_A_1h75A_4466_32572_0 finished
6-10-2008 21:35:31|rosetta@home|Output file abinitio_nohomfrag_70_A_1h75A_4466_32572_0_0 for task abinitio_nohomfrag_70_A_1h75A_4466_32572_0 absent
6-10-2008 21:35:32|rosetta@home|Started upload of abinitio_nohomfrag_70_A_1tzaA_4466_28308_0_0
6-10-2008 21:35:36|rosetta@home|Finished upload of abinitio_nohomfrag_70_A_1tzaA_4466_28308_0_0
6-10-2008 21:36:34|rosetta@home|Sending scheduler request: To fetch work. Requesting 60480 seconds of work, reporting 4 completed tasks
6-10-2008 21:36:39|rosetta@home|Scheduler request succeeded: got 3 new tasks
6-10-2008 21:36:40|rosetta@home|[error] garbage_collect(); still have active task for acked result abinitio_nohomfrag_70_A_1faaA_4466_28786_0; state 0
6-10-2008 21:36:40|rosetta@home|Computation for task abinitio_nohomfrag_70_A_1faaA_4466_28786_0 finished
6-10-2008 21:36:40|rosetta@home|Output file abinitio_nohomfrag_70_A_1faaA_4466_28786_0_0 for task abinitio_nohomfrag_70_A_1faaA_4466_28786_0 absent
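For anyone wanting to pull the affected task names out of a longer message log, here is a rough sketch. It assumes the pipe-separated `timestamp|project|message` layout shown above and the exact "Output file ... for task ... absent" wording. The helper name `tasks_with_absent_output` is made up for illustration and is not part of BOINC.

```python
import re

# Matches the message part of an "Output file ... absent" log entry
# and captures the task name. The wording is assumed from the log above.
ABSENT = re.compile(r"Output file \S+ for task (\S+) absent$")

def tasks_with_absent_output(lines):
    """Return task names whose output file was reported absent (hypothetical helper)."""
    tasks = []
    for line in lines:
        # Each entry is "timestamp|project|message"; only the message matters here.
        message = line.split("|", 2)[-1]
        match = ABSENT.match(message)
        if match:
            tasks.append(match.group(1))
    return tasks

sample_log = [
    "6-10-2008 21:35:31|rosetta@home|Computation for task abinitio_nohomfrag_70_A_1zd0A_4466_32533_0 finished",
    "6-10-2008 21:35:31|rosetta@home|Output file abinitio_nohomfrag_70_A_1zd0A_4466_32533_0_0 for task abinitio_nohomfrag_70_A_1zd0A_4466_32533_0 absent",
    "6-10-2008 21:35:36|rosetta@home|Finished upload of abinitio_nohomfrag_70_A_1tzaA_4466_28308_0_0",
]

print(tasks_with_absent_output(sample_log))
```

Matching on the message text alone keeps the sketch independent of the timestamp format, which differs between clients in this thread.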


Thank you.
ID: 56258
Michael E.@ team Carl Sagan

Joined: 5 Apr 08
Posts: 16
Credit: 1,836,449
RAC: 0
Message 56261 - Posted: 6 Oct 2008, 21:43:42 UTC

I have a task that may be looping. The Message tab shows the task restarting every 30 minutes and the time remaining has been stuck at 00:09:57 for almost two working days. These tasks usually take 9-10 hours on this computer (computer ID is 858463).

Total CPU time and % complete follow:

CPU time 12:34:xx and Progress 98.691% (Friday AM)
CPU time 18:28:xx and Progress 99.105% (Monday late afternoon)

Some of the log messages in the Message tab indicate a restart about every thirty minutes:

10/6/2008 4:13:53 PM|rosetta@home|Restarting task hombench_mtyka_foldcst_loopbuild_boinctest3_foldcst_loopbuild_t328__IGNORE_THE_REST_1VIMA_16_4578_2_0 using minirosetta version 134
10/6/2008 4:45:54 PM|rosetta@home|Restarting task hombench_mtyka_foldcst_loopbuild_boinctest3_foldcst_loopbuild_t328__IGNORE_THE_REST_1VIMA_16_4578_2_0 using minirosetta version 134
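The gap between those two restarts can be checked with a small script. This is only a sketch, assuming the pipe-separated log layout above and US-style `month/day/year` timestamps with AM/PM. The function name `restart_gaps` is hypothetical.

```python
from datetime import datetime

def restart_gaps(lines):
    """Return the seconds between consecutive 'Restarting task' entries (hypothetical helper)."""
    stamps = []
    for line in lines:
        parts = line.split("|", 2)
        if len(parts) == 3 and parts[2].startswith("Restarting task"):
            # Timestamp format assumed from the log lines quoted above.
            stamps.append(datetime.strptime(parts[0], "%m/%d/%Y %I:%M:%S %p"))
    return [(later - earlier).total_seconds() for earlier, later in zip(stamps, stamps[1:])]

TASK = ("hombench_mtyka_foldcst_loopbuild_boinctest3_foldcst_loopbuild_t328"
        "__IGNORE_THE_REST_1VIMA_16_4578_2_0")

restart_log = [
    f"10/6/2008 4:13:53 PM|rosetta@home|Restarting task {TASK} using minirosetta version 134",
    f"10/6/2008 4:45:54 PM|rosetta@home|Restarting task {TASK} using minirosetta version 134",
]

for gap in restart_gaps(restart_log):
    print(f"{gap / 60:.1f} minutes between restarts")
```

For the two lines above the gap works out to 1921 seconds, or about 32 minutes.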

Should I let this task continue or abort it?

Task ID is 194948950 and Work unit is 178087521. It is now past the Report Deadline, so it has been re-sent.

The system is a laptop running Windows XP. It has development software installed.

Michael E. (Mike)
ID: 56261
Sid Celery

Joined: 11 Feb 08
Posts: 1965
Credit: 38,160,504
RAC: 9,210
Message 56264 - Posted: 6 Oct 2008, 23:44:55 UTC - in response to Message 56106.  

Can anyone explain why I might be getting this kind of repeating error?...

9/28/2008 11:17:50 PM|rosetta@home|Task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 exited with zero status but no 'finished' file
9/28/2008 11:17:50 PM|rosetta@home|If this happens repeatedly you may need to reset the project.
9/28/2008 11:17:50 PM|rosetta@home|Restarting task hombench_mtyka_foldcst_boinc_test3_foldcst_simple_t302___4585_718_0 using minirosetta version 134

[...]

This is only an excerpt of the series of messages. I've had a few WUs do this recently.

In short, no, but it's something that comes up massively on my Vista64 system. Seems like it's being reported on XPSP3 machines now too.

The task itself reports "too many exit(0)s - Can't acquire lockfile - exiting"

ID: 56264
Bob Browett

Joined: 14 Dec 05
Posts: 11
Credit: 2,275,743
RAC: 0
Message 56358 - Posted: 13 Oct 2008, 21:33:49 UTC

Hi
4 more errors to add; tasks:
199237177
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00518F33 read attempt to address 0x235EBBB8

Engaging BOINC Windows Runtime Debugger...

The others are not showing on the results page yet, but all failed within the last hour.


ID: 56358
Bob Browett

Joined: 14 Dec 05
Posts: 11
Credit: 2,275,743
RAC: 0
Message 56361 - Posted: 14 Oct 2008, 6:45:12 UTC

Hi
Overnight I had 2 successful crunchings and

61 (!) failed.

Is it worth my computer time for so many errors?
ID: 56361
Bob Browett

Joined: 14 Dec 05
Posts: 11
Credit: 2,275,743
RAC: 0
Message 56372 - Posted: 15 Oct 2008, 5:36:49 UTC

Hi
Overnight tonight

1 successful

40 failed
ID: 56372
Greg_BE

Joined: 30 May 06
Posts: 5658
Credit: 5,670,291
RAC: 2,328
Message 56373 - Posted: 15 Oct 2008, 10:04:03 UTC

Bob, those same tasks failed on systems prior to yours.
Rosetta has to try 2 systems to make sure it's not just a single computer that failed.

It looks like you got a bad batch of abinitio_nohomfrag_70_A tasks.
ID: 56373
Bob Browett

Joined: 14 Dec 05
Posts: 11
Credit: 2,275,743
RAC: 0
Message 56387 - Posted: 15 Oct 2008, 18:15:27 UTC - in response to Message 56373.  

Bob, those same tasks failed on systems prior to yours.
Rosetta has to try 2 systems to make sure it's not just a single computer that failed.

It looks like you got a bad batch of abinitio_nohomfrag_70_A tasks.


100 bad units at once!! I call that more than bad luck. I call that a credit crunch.
I'd better nationalise Rosetta. How much do you want, David?
ID: 56387



©2024 University of Washington
https://www.bakerlab.org