minirosetta v1.24 bug thread

Message boards : Number crunching : minirosetta v1.24 bug thread

David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 53254 - Posted: 21 May 2008, 23:31:35 UTC

Please post minirosetta v1.24 bugs/issues here.
ID: 53254 · Rating: 0
nouqraz
Joined: 8 Apr 08
Posts: 6
Credit: 328,006
RAC: 52
Message 53259 - Posted: 22 May 2008, 2:05:18 UTC

Just got 4 minirosetta 1.24 WUs on my 4-core Xeon machine (2 dual-core chips), and it appears to be doing the same thing that I reported in the 1.19 thread (https://boinc.bakerlab.org/rosetta/forum_thread.php?id=4094&nowrap=true#53225): the work units are listed as "running", but no CPU time is used (either as viewed in BOINC or through Task Manager).

I will leave them 'running' for a while and see if they 'wake up'. If they're still hung up tomorrow night I guess I will have to abort them as I have been doing with 1.19 WUs on this machine.



Thanks,
Adam
ID: 53259 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 53281 - Posted: 22 May 2008, 19:37:59 UTC

Adam, can you join our Ralph test project? http://ralph.bakerlab.org

With Rom Walton's help, we are reviving the use of the Windows symbol store debugging utility on Ralph, and hopefully this will give us some information.
ID: 53281 · Rating: 0
yose-ue
Joined: 30 Dec 05
Posts: 3
Credit: 228,710
RAC: 0
Message 53283 - Posted: 22 May 2008, 20:24:38 UTC

Task ID 165420228
Name h001__BOINC_ABRELAXt397__IGNORE_THE_REST-S25-7-S3-8--h001_-_3339_1086_0
Workunit 151201617
Created 22 May 2008 3:18:48 UTC
Sent 22 May 2008 3:19:39 UTC
Received 22 May 2008 20:14:49 UTC
Server state Over
Outcome Validate error
Client state Done
Exit status 0 (0x0)
Computer ID 404801
Report deadline 1 Jun 2008 3:19:39 UTC
CPU time 11603.78
stderr out <core_client_version>5.10.45</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 21600
# cpu_run_time_pref: 21600
# cpu_run_time_pref: 21600
# cpu_run_time_pref: 21600
======================================================
DONE :: 1 starting structures 11602.8 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
called boinc_finish

</stderr_txt>
]]>


Validate state Invalid
Claimed credit 19.6495769571169
Granted credit 0
application version 1.24

ID: 53283 · Rating: 0
nouqraz
Joined: 8 Apr 08
Posts: 6
Credit: 328,006
RAC: 52
Message 53285 - Posted: 22 May 2008, 22:14:41 UTC - in response to Message 53281.  

Adam, can you join our Ralph test project? http://ralph.bakerlab.org

With Rom Walton's help, we are reviving the use of the Windows symbol store debugging utility on Ralph, and hopefully this will give us some information.


Okay, I signed up at Ralph, set the host with the problem to work on Ralph, and suspended Rosetta on that host.

This is the host on Ralph: http://ralph.bakerlab.org/show_host_detail.php?hostid=13930

This is my account on Ralph: http://ralph.bakerlab.org/show_user.php?userid=4329



--Adam
ID: 53285 · Rating: 0
Thomas Maenner
Joined: 11 Apr 08
Posts: 1
Credit: 114,470
RAC: 0
Message 53286 - Posted: 22 May 2008, 22:24:38 UTC

Having problems with v1.24. Same as in v1.19, the app starts and aborts after 15 seconds.
Let me know what other information would help.

--------------------
cat stderr.txt:
SIGABRT: abort called
Stack trace (22 frames):
[0x878cccb]
[0x87b7210]
[0xffffe500]
[0x8819dd4]
[0x882f653]
[0x8834876]
[0x8834997]
[0x8805321]
[0x82bfb93]
[0x86b3cda]
[0x82145e8]
[0x84bc88d]
[0x80729a8]
[0x807d323]
[0x806f012]
[0x81730dc]
[0x8175d27]
[0x80a8542]
[0x80a8ee8]
[0x804bd19]
[0x8812cdc]
[0x8048111]
------------------------

cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 12
model name : AMD Athlon(tm) 64 Processor 3200+
stepping : 0
cpu MHz : 1000.000
cache size : 512 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow up
bogomips : 2011.00
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

--------------

uname -a
Linux xyz 2.6.22.17-0.1-default #1 SMP 2008/02/10 20:01:04 UTC x86_64 x86_64 x86_64 GNU/Linux

Having to abort all mini apps from the GUI until fixed...

Thanks
Tom

ID: 53286 · Rating: 0
Nothing But Idle Time
Joined: 28 Sep 05
Posts: 209
Credit: 139,545
RAC: 0
Message 53296 - Posted: 23 May 2008, 11:43:02 UTC

Several versions of minirosetta have been released, and each time I get another database. I now have four (Feb, Mar, Apr, May) of these huge files, totaling about 56 MB. The total space allocated to Rosetta in the BOINC folder is 104 MB, my largest ever. Are all these minirosetta databases necessary? Should old ones be deleted when newer DBs are downloaded? Just wondering if the auto-deletion is working or needs some attention.
ID: 53296 · Rating: 0
David Ball
Joined: 25 Nov 05
Posts: 25
Credit: 1,439,333
RAC: 0
Message 53297 - Posted: 23 May 2008, 12:16:50 UTC
Last modified: 23 May 2008, 12:28:50 UTC

For some reason a Mini-Rosetta 1.24 WU got the following error:

<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
ERROR: Option matching -description_file not found in command line top-level context

</stderr_txt>
]]>





Other 1.24 WUs on the same machine are OK.


EDIT: This was on a Linux machine. I just noticed that the same WU got the same error on an XP Pro machine, so it's not being re-issued due to "Too many error results". Could the command line have contained an illegal character?

The XP Pro machine said:

<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
ERROR: Option matching -description_file not found in command line top-level context

</stderr_txt>
]]>

Have you read a good Science Fiction book lately?
ID: 53297 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 53299 - Posted: 23 May 2008, 13:40:59 UTC - in response to Message 53296.  

Just wondering if the auto-deletion is working or needs some attention.

When your machine has room to spare within the amount of disk space BOINC is allowed to use, BOINC doesn't force as many cleanups to occur. Please watch them to be sure, but I'll bet they self-delete just fine in a week or two.

Rosetta Moderator: Mod.Sense
ID: 53299 · Rating: 0
IrishMike
Joined: 23 Nov 05
Posts: 1
Credit: 60,952
RAC: 0
Message 53301 - Posted: 23 May 2008, 15:03:52 UTC

I was just updated from some beta version of Rosetta to Rosetta Mini 1.24. For the year or so I have been running Rosetta, I had a very nice graphics display showing the protein being worked on and the changes being attempted, and it also explained the purpose of the current task. This 'updated' version does not show the purpose; it shows the protein and the very gross changes/states, but not the incremental tweaks. I find this new display less informative and less interesting. Why the change, and is there any plan to go back to the former display, or a way for me to return to the older beta version (I know the answer to that already, but I thought I would ask anyway)?
ID: 53301 · Rating: 0
Jeremy
Joined: 15 May 08
Posts: 13
Credit: 2,636
RAC: 0
Message 53302 - Posted: 23 May 2008, 15:06:25 UTC
Last modified: 23 May 2008, 15:07:44 UTC

Same issue as before: compute errors on almost every minirosetta task, while the normal Rosetta tasks work perfectly.
See my tasks for details
ID: 53302 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 53303 - Posted: 23 May 2008, 15:28:09 UTC - in response to Message 53301.  

...is there any plan to go to the former display or for me to return to the older beta version.


No, the application version is assigned when the task is sent to you, so you cannot return to the older version. But both applications are currently in use, so you may see some of each.

Yes, enhancements are planned for the new mini application's graphics.
Rosetta Moderator: Mod.Sense
ID: 53303 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 53305 - Posted: 23 May 2008, 17:16:51 UTC

Please ignore the -description_file errors. A bad batch was accidentally sent out by someone in our group. The batch was relatively small in size.

We are working on the graphics for minirosetta. The current graphics are just a placeholder until a better one is developed. We have the option to have graphics similar to the older app or use the FoldIt graphics. Maybe we should take a survey on this in a separate thread.

We will add code to delete the older database versions in the next application release. Thanks for reminding us about this issue. David Anderson said he will look into adding code to the BOINC client that will allow persistent input files to expire. Currently, this feature does not exist in the client.
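
As a rough illustration of the cleanup described above (not the project's actual code), here is a minimal Python sketch that picks which database files to delete by keeping only the newest one. The file names and the `stale_databases` helper are hypothetical; the real minirosetta database file names are not given in this thread.

```python
from typing import Iterable, List, Tuple


def stale_databases(files: Iterable[Tuple[str, float]], keep: int = 1) -> List[str]:
    """Given (filename, modification_time) pairs, return every name
    except the `keep` most recently modified ones."""
    # Sort newest first, then drop the first `keep` entries.
    ordered = sorted(files, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ordered[keep:]]


# Hypothetical file names and timestamps, for illustration only.
example = [
    ("db_feb.zip", 1.0),
    ("db_mar.zip", 2.0),
    ("db_apr.zip", 3.0),
    ("db_may.zip", 4.0),
]
print(stale_databases(example))  # the three older files, newest of them first
```

A real cleanup would also need to make sure no running task still uses an older database before removing it.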
ID: 53305 · Rating: 0
KC0ISW
Joined: 28 Sep 05
Posts: 2
Credit: 58,926
RAC: 0
Message 53306 - Posted: 23 May 2008, 17:28:21 UTC

Failed units:


https://boinc.bakerlab.org/rosetta/result.php?resultid=165864680

https://boinc.bakerlab.org/rosetta/result.php?resultid=165866472
ID: 53306 · Rating: 0
yose-ue
Joined: 30 Dec 05
Posts: 3
Credit: 228,710
RAC: 0
Message 53307 - Posted: 24 May 2008, 3:48:06 UTC

I had a process that completed successfully but had a warning:

WARNING: Override of option -out:nstruct sets a different value
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1

The second line repeated approximately the same number of times as the number of decoys it processed. Number 151511959.
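
One quick way to check how often each line repeats in a saved stderr log is to tally the lines. A minimal Python sketch, assuming the log has been copied into a string; `count_repeats` is a hypothetical helper, and the garbled character from the log is written as `?` here.

```python
from collections import Counter


def count_repeats(stderr_text: str) -> Counter:
    """Count how many times each non-empty line occurs in a stderr dump."""
    return Counter(
        line.strip() for line in stderr_text.splitlines() if line.strip()
    )


MISMATCH = "sequence mismatch between pose and psipred_ss2 S vs ? at seqpos 1"

# Sample input modeled on the log in this thread: one warning, 18 repeats.
sample = (
    "WARNING: Override of option -out:nstruct sets a different value\n"
    + (MISMATCH + "\n") * 18
)

counts = count_repeats(sample)
print(counts[MISMATCH])  # 18
```

Comparing that count with the "generated N decoys" line in the same log would confirm whether the message appears once per decoy.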
ID: 53307 · Rating: 0
Ortiz
Joined: 13 Jul 07
Posts: 2
Credit: 68,035
RAC: 0
Message 53317 - Posted: 24 May 2008, 18:24:27 UTC

Error while crunching

<core_client_version>6.1.0</core_client_version>
<![CDATA[
<stderr_txt>
can not open psipred_ss2 file tt
Error writing
Error writing
Error writing
...
Error writing
Error writing
Error wri
</stderr_txt>
]]>


on workunit:
https://boinc.bakerlab.org/rosetta/result.php?resultid=166141022
ID: 53317 · Rating: 0
googloo
Joined: 15 Sep 06
Posts: 133
Credit: 21,691,755
RAC: 5,442
Message 53320 - Posted: 24 May 2008, 20:06:37 UTC

Compute error, https://boinc.bakerlab.org/rosetta/result.php?resultid=165788486

<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
ERROR: Option matching -description_file not found in command line top-level context

</stderr_txt>
]]>
ID: 53320 · Rating: 0
Adam
Joined: 26 Jun 07
Posts: 7
Credit: 487,917
RAC: 0
Message 53321 - Posted: 24 May 2008, 23:16:02 UTC

Compute error, https://boinc.bakerlab.org/rosetta/result.php?resultid=165937764

<core_client_version>5.10.45</core_client_version>
<![CDATA[
<stderr_txt>
WARNING: Override of option -out:nstruct sets a different value
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
**********************************************************************
Rosetta is going too long. Watchdog is ending the run!
CPU time: 35697.6 seconds. Greater than 3X preferred time: 10800 seconds
**********************************************************************
called boinc_finish

</stderr_txt>
]]>
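
The watchdog message in this log reads like a simple threshold check: 3 x 10800 = 32400 seconds, and 35697.6 is above that. A minimal Python sketch, assuming the rule is "stop once CPU time exceeds three times the preferred run time"; the function name and the factor are inferred from the log wording, not taken from the minirosetta source.

```python
def watchdog_should_stop(
    cpu_seconds: float, preferred_seconds: float, factor: float = 3.0
) -> bool:
    """Return True when CPU time exceeds factor * preferred run time."""
    return cpu_seconds > factor * preferred_seconds


# Values from the log above: 35697.6 s used, 10800 s preferred.
print(watchdog_should_stop(35697.6, 10800))  # True

# Values from the earlier validate-error report: 11602.8 s used, 21600 s preferred.
print(watchdog_should_stop(11602.8, 21600))  # False
```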

ID: 53321 · Rating: 0
Adam
Joined: 26 Jun 07
Posts: 7
Credit: 487,917
RAC: 0
Message 53322 - Posted: 24 May 2008, 23:19:03 UTC
Last modified: 24 May 2008, 23:22:07 UTC

Compute error, https://boinc.bakerlab.org/rosetta/result.php?resultid=165937764

<core_client_version>5.10.45</core_client_version>
<![CDATA[
<stderr_txt>
WARNING: Override of option -out:nstruct sets a different value
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
sequence mismatch between pose and psipred_ss2 S vs � at seqpos 1
**********************************************************************
Rosetta is going too long. Watchdog is ending the run!
CPU time: 35697.6 seconds. Greater than 3X preferred time: 10800 seconds
**********************************************************************
called boinc_finish

</stderr_txt>
]]>

Sorry for the double post
ID: 53322 · Rating: 0
Klimax
Joined: 27 Apr 07
Posts: 38
Credit: 2,509,938
RAC: 4,060
Message 53324 - Posted: 25 May 2008, 5:27:38 UTC

This one was going well until I stopped it for a defrag. Then I restarted it and the WU just crashed.

WU:

<core_client_version>5.10.18</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
# cpu_run_time_pref: 86400


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x004484E8 read attempt to address 0x015B6000

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 6.3.2


Dump Timestamp : 05/25/08 07:12:02
Loaded Library : C:\Program Files\BOINC\dbghelp.dll
Loaded Library : C:\Program Files\BOINC\symsrv.dll
Loaded Library : C:\Program Files\BOINC\srcsrv.dll
LoadLibraryA( C:\Program Files\BOINC\version.dll ): GetLastError = 126
Loaded Library : version.dll
Debugger Engine : 4.0.5.0
Symbol Search Path: C:\Program Files\BOINC\slots;C:\Program Files\BOINC\projects\boinc.bakerlab.org_rosetta;srv*C:\DOCUME~1\Klimax\LOCALS~1\Temp\symbols*http://msdl.microsoft.com/download/symbols;srv*C:\DOCUME~1\Klimax\LOCALS~1\Temp\symbols*https://boinc.bakerlab.org/rosetta/symstore;srv*C:\DOCUME~1\Klimax\LOCALS~1\Temp\symbols*http://boinc.berkeley.edu/symstore

<snip>
*** Dump of the Process Statistics: ***

- I/O Operations Counters -
Read: 7, Write: 0, Other 209

- I/O Transfers Counters -
Read: 0, Write: 90, Other 0

- Paged Pool Usage -
QuotaPagedPoolUsage: 56076, QuotaPeakPagedPoolUsage: 56076
QuotaNonPagedPoolUsage: 2248, QuotaPeakNonPagedPoolUsage: 2304

- Virtual Memory Usage -
VirtualSize: 35221504, PeakVirtualSize: 35221504

- Pagefile Usage -
PagefileUsage: 10870784, PeakPagefileUsage: 10870784

- Working Set Size -
WorkingSetSize: 10285056, PeakWorkingSetSize: 10285056, PageFaultCount: 2542

*** Dump of thread ID 8052 (state: Ready): ***

- Information -
Status: Base Priority: Above Normal, Priority: Above Normal, , Kernel Time: 156250.000000, User Time: 7812500.000000, Wait Time: 11237628.000000

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x004484E8 read attempt to address 0x015B6000

- Registers -
eax=01958b4a ebx=0196a008 ecx=00000000 edx=00000001 esi=015b5ffc edi=015b6000
eip=004484e8 esp=0013ed18 ebp=0013ed38
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010202

- Callstack -
ChildEBP RetAddr Args to Child
0013ed38 00448669 0196a008 00000001 015b5ffc 00000004 minirosetta_1.24_windows_intelx!+0x0
0013ed50 00445a06 00000004 015b5ff0 83874561 01950e98 minirosetta_1.24_windows_intelx!+0x0
0013ed8c 0044656d 00000000 00000000 015a56f0 01950e98 minirosetta_1.24_windows_intelx!+0x0
00000000 00000000 00000000 00000000 00000000 00000000 minirosetta_1.24_windows_intelx!+0x0

*** Dump of thread ID 6492 (state: Waiting): ***

- Information -
Status: Wait Reason: ExecutionDelay, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 11237624.000000

- Registers -
eax=0000b471 ebx=00000000 ecx=018ff0d8 edx=00951634 esi=00000000 edi=018fff70
eip=7c90eb94 esp=018fff40 ebp=018fff98
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202

- Callstack -
ChildEBP RetAddr Args to Child
018fff98 7c802451 00000064 00000000 018fffec 0042945b ntdll!KiFastSystemCallRet+0x0
018fffa8 0042945b 00000064 00000000 7c80b683 00000000 kernel32!Sleep+0x0
018fffec 00000000 00429450 00000000 00000000 48920000 minirosetta_1.24_windows_intelx!+0x0


*** Debug Message Dump ****


*** Foreground Window Data ***
Window Name :
Window Class :
Window Process ID: 0
Window Thread ID : 0

Exiting...

</stderr_txt>
]]>

And I am on my way back to RALPH...
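
For anyone puzzled by the two forms of the exit code in this dump: -1073741819 and 0xc0000005 are the same 32-bit value, read as signed and unsigned respectively (-1073741819 + 2^32 = 3221225477 = 0xC0000005, the Windows status code for an access violation). A minimal Python sketch of the conversion; `to_ntstatus` is a hypothetical helper name.

```python
def to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as unsigned hexadecimal."""
    # Masking with 0xFFFFFFFF maps negative values into the range 0..2**32-1.
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


print(to_ntstatus(-1073741819))  # 0xc0000005
```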
ID: 53324 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org