Minirosetta v1.47 bug thread.

AMD_is_logical

Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 58012 - Posted: 18 Dec 2008, 22:00:40 UTC

This WU had a validate error:

normal_relax_rlbd_1ynv_IGNORE_THE_REST_DECOY_5565_171_0

It looks from the stderr file like it crunched normally for 16 hours (my current preference) with no error. However, it was then marked "Invalid" with no explanation. The only other thing I see is that it crunched an unusually high number of decoys (8777 decoys). Does that cause problems with the validator?

Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 58013 - Posted: 18 Dec 2008, 22:28:12 UTC
Last modified: 18 Dec 2008, 22:37:08 UTC

Jay, RE: page faults...

If you change the view you can add a column to display the number of faults since the task started. I have long runtimes, but currently have two tasks from Ralph that topped 100,000,000 page faults. One in 15hrs and the other in 19hrs. This is the highest fault rate I've ever seen. Indeed, I recall the days when I thought that 1M per hour of runtime was excessive.

The only solace I can offer is that not all faults are hard faults to disk. Some recorded faults are "soft". Perhaps someone else can further elaborate on the concepts.
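To make the soft/hard distinction concrete, here is a minimal sketch that reads both counters for the current process. It assumes a Unix-like system, since Python's `resource` module is not available on Windows, so it is not the same counter as the Task Manager column mentioned above. The helper name `fault_counts` and the 50 MB buffer are illustrative choices only.

```python
import resource


def fault_counts():
    """Return (soft, hard) page-fault counts for the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    # ru_minflt: faults served without disk I/O ("soft" or minor faults)
    # ru_majflt: faults that required disk I/O ("hard" or major faults)
    return usage.ru_minflt, usage.ru_majflt


before_soft, before_hard = fault_counts()
buffer = bytearray(50 * 1024 * 1024)  # touching fresh memory usually causes soft faults
after_soft, after_hard = fault_counts()

print("soft faults during allocation:", after_soft - before_soft)
print("hard faults during allocation:", after_hard - before_hard)
```

On most systems the soft count rises sharply while the hard count barely moves, which is why a large total fault count does not by itself mean heavy disk activity.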
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/

Stephen
Joined: 26 Apr 08
Posts: 32
Credit: 429,286
RAC: 0
Message 58024 - Posted: 19 Dec 2008, 4:07:51 UTC
Last modified: 19 Dec 2008, 4:35:03 UTC

A WU will get to around 85% complete, then progress stays the same and time to completion stays at around 10 minutes. If I suspend all tasks and then resume, the "stuck" WUs complete.

Edited: doing this also rolls back the "cpu time spent" to around 30 minutes.

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 58027 - Posted: 19 Dec 2008, 5:47:30 UTC
Last modified: 19 Dec 2008, 5:49:36 UTC

Stephen, this may be part of why you are having problems keeping all 8 CPUs busy. Suggest you just let BOINC manage the machine for the next 12 hours or so. Don't abort, suspend, update, anything at all.

Some tasks will take longer than 3 hours to run, and their % complete progress bar will not move steadily. Rather than tell you the task has -30 minutes left, they reflect the situation by making time move very slowly after the task gets to 10 minutes remaining.

It's simply a problem with the estimate, not the work being done.
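As a toy illustration of that behaviour (not the actual Rosetta or BOINC estimator, whose code is not shown in this thread), the sketch below compares a naive estimate that can go negative with one that slows down once 10 minutes remain. The function names, the exponential decay, and the `stretch_s` constant are all assumptions made up for the example.

```python
import math


def naive_remaining(target_s, elapsed_s):
    """Target run time minus elapsed time; goes negative when a task overruns."""
    return target_s - elapsed_s


def clamped_remaining(target_s, elapsed_s, threshold_s=600.0, stretch_s=3600.0):
    """Hypothetical estimate that never drops below zero.

    Above the threshold it matches the naive estimate. Below it, the
    displayed time decays slowly toward zero instead of turning negative.
    """
    remaining = target_s - elapsed_s
    if remaining > threshold_s:
        return remaining
    overrun = threshold_s - remaining  # seconds spent past the threshold
    return threshold_s * math.exp(-overrun / stretch_s)


three_hours = 3 * 3600
for elapsed in (0, 10200, 10800, 12600):
    print(elapsed, naive_remaining(three_hours, elapsed),
          round(clamped_remaining(three_hours, elapsed)))
```

In this model a task that runs 30 minutes past its 3 hour target still shows a few minutes left, which matches the "stuck near 10 minutes" symptom without any work being lost.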
Rosetta Moderator: Mod.Sense

Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,670,291
RAC: 2,328
Message 58029 - Posted: 19 Dec 2008, 8:52:37 UTC

How do you "lose credit" on a task?
On this task I claimed 83 and was granted 68 for 4 hrs of runtime. That is just weird, when most of the other work I have been running comes out on the plus side for granted credit.

rochester new york
Joined: 2 Jul 06
Posts: 2842
Credit: 2,020,043
RAC: 0
Message 58030 - Posted: 19 Dec 2008, 9:54:17 UTC


Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,670,291
RAC: 2,328
Message 58031 - Posted: 19 Dec 2008, 12:26:59 UTC - in response to Message 58030.  

https://boinc.bakerlab.org/rosetta/result.php?resultid=213832280


You didn't have to reboot your computer a few times during the task's run, did you?
That will kill a task.

rochester new york
Joined: 2 Jul 06
Posts: 2842
Credit: 2,020,043
RAC: 0
Message 58032 - Posted: 19 Dec 2008, 14:14:26 UTC - in response to Message 58031.  
Last modified: 19 Dec 2008, 14:15:09 UTC

Yes, I did... thanks for that info. A Microsoft upgrade required a reboot.



Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,670,291
RAC: 2,328
Message 58033 - Posted: 19 Dec 2008, 14:44:26 UTC - in response to Message 58032.  


Here's a tip: before rebooting (because you never know how many times Windows will want you to do that when you install an update), go to the activity tab of BOINC Manager and suspend all activity. Wait for your hard drive to stop grinding away with all the saving, and then you can reboot. Also be sure to have the "leave jobs/tasks in memory" option turned on. Then you will not lose your position in the task. Suspend seems to save everything to the hard drive, so you can reboot all you want and not lose any data for the task.

rochester new york
Joined: 2 Jul 06
Posts: 2842
Credit: 2,020,043
RAC: 0
Message 58034 - Posted: 19 Dec 2008, 14:50:09 UTC - in response to Message 58033.  




Thanks again... I'll do that next time.






Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 58035 - Posted: 19 Dec 2008, 15:18:35 UTC
Last modified: 19 Dec 2008, 17:09:18 UTC

I do not agree with Greg's comments about preservation of work and reasons why, but would prefer to take them up in another thread if you'd like to discuss further.

[edit]
We're discussing this under a new thread here.
Rosetta Moderator: Mod.Sense

rochester new york
Joined: 2 Jul 06
Posts: 2842
Credit: 2,020,043
RAC: 0
Message 58037 - Posted: 19 Dec 2008, 15:30:07 UTC - in response to Message 58035.  


OK, I just want to know what to do.



kr12
Joined: 6 Dec 07
Posts: 2
Credit: 85,902
RAC: 0
Message 58044 - Posted: 19 Dec 2008, 20:25:15 UTC

"graphic viewer" hangs with this task
cs_noe_fullw_nolin_homo_bench_cs_noe_abrelax_cs_mth1598_olange_5607_11086_0
(https://boinc.bakerlab.org/rosetta/result.php?resultid=215720373)

stewjack
Joined: 23 Apr 06
Posts: 39
Credit: 95,871
RAC: 0
Message 58050 - Posted: 20 Dec 2008, 4:43:14 UTC - in response to Message 58044.  

"graphic viewer" hangs with this task
cs_noe_fullw_nolin_homo_bench_cs_noe_abrelax_cs_mth1598_olange_5607_11086_0


I had the same thing happen with this similar WU.

cs_noe_fullw_nolin_homo_bench_cs_noe_abrelax_cs_nsp1_olange_5608_14752_0

Note: I didn't have time to mess with this one - so I just aborted it.


rhb
Joined: 19 Jan 07
Posts: 5
Credit: 277,050
RAC: 0
Message 58052 - Posted: 20 Dec 2008, 7:14:45 UTC

I had a computation error. Running Ubuntu Linux 6.06, Boinc 5.4.9.
This is the first error I've seen in the last two weeks.

https://boinc.bakerlab.org/rosetta/result.php?resultid=215760302

Task ID 215760302
Name cs_noe_fullw_nolin_homo_bench_cs_noe_abrelax_cs_nsp1_olange_5608_24330_0
Workunit 196639962

<core_client_version>5.4.9</core_client_version>
<message>
process exited with code 193 (0xc1)
</message>
<stderr_txt>
*** glibc detected *** double free or corruption (!prev): 0x0bd2d980 ***
SIGABRT: abort called



P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 58068 - Posted: 20 Dec 2008, 20:53:45 UTC

Hi.

This one has problems, it's failed twice.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=194507659

<core_client_version>6.2.14</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
SIGSEGV: segmentation violation
Stack trace (15 frames):
[0x8b979b7]
[0x8bc20b0]
[0xffffe500]
[0x84c0863]
[0x85ddf0a]
[0x85df32e]
[0x85e65b8]
[0x819a650]
[0x818d3b7]
[0x818ee89]
[0x8127771]
[0x8129a1a]
[0x804b9c8]
[0x8c1dbac]
[0x8048111]

Exiting...

</stderr_txt>

pete.
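
The three numbers in "process exited with code 193 (0xc1, -63)" appear to be one value written three ways: decimal, hexadecimal, and the same byte read as a signed 8-bit integer. That reading is inferred from the arithmetic alone, not from BOINC documentation, and the helper name `describe_exit_code` is made up for this sketch.

```python
def describe_exit_code(code):
    """Return (hex string, signed 8-bit value) for an exit code in 0..255."""
    if not 0 <= code <= 255:
        raise ValueError("expected an unsigned 8-bit exit code")
    # Values of 128 and above wrap to negative numbers in two's complement.
    signed = code - 256 if code >= 128 else code
    return hex(code), signed


print(describe_exit_code(193))  # ('0xc1', -63)
```

This only shows that the logged numbers agree with each other. It says nothing about why the task exited. The SIGSEGV or SIGABRT line in the stderr output is the more useful clue.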


svincent
Joined: 30 Dec 05
Posts: 219
Credit: 11,805,838
RAC: 0
Message 58084 - Posted: 21 Dec 2008, 3:19:52 UTC

I'm seeing problems when attempting to show graphics on workunits with names such as cs_noe* on Mac OS X 10.4.11. It seems like several other people are seeing similar problems.

The first time Show graphics is pressed the graphics app starts and displays a blank window. Moving the mouse causes the graphics app to crash.

The second and subsequent times Show graphics is pressed the graphics app starts and displays a blank window along with the spinning rainbow beach ball. The graphics app is frozen and you can't even force quit in the normal way: it's necessary to quit via the Activity Monitor.

lusvladimir
Joined: 18 Oct 05
Posts: 12
Credit: 1,784,854
RAC: 0
Message 58087 - Posted: 21 Dec 2008, 9:38:41 UTC
Last modified: 21 Dec 2008, 9:41:39 UTC

Running Debian Linux , Boinc 6.2.14.

https://boinc.bakerlab.org/result.php?resultid=215464278

Task ID 215464278
Name cc_nonideal_1_3_nocst4_hb_t286__IGNORE_THE_REST_1VYHA_6_5693_20_0
Workunit 196380006

<core_client_version>6.2.14</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
# cpu_run_time_pref: 3600
*** glibc detected *** double free or corruption (!prev): 0x0e13a4f0 ***
SIGABRT: abort called
Stack trace (23 frames):

NewtonianRefractor
Joined: 29 Sep 08
Posts: 19
Credit: 2,350,860
RAC: 0
Message 58088 - Posted: 21 Dec 2008, 10:17:10 UTC

The graphics for one of my Minirosetta 1.47 work units crash. If I click on the show graphics button in BOINC, a window is launched, but it remains black, and to close it I have to manually end the unresponsive process. The work unit runs fine though. It's under BOINC 6.2.19.

Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,670,291
RAC: 2,328
Message 58089 - Posted: 21 Dec 2008, 13:34:11 UTC

two more that wasted my cpu time crashing halfway

https://boinc.bakerlab.org/rosetta/result.php?resultid=215547790
t071_1_RDC_NMR_NESG_5480_118996_0
Client state Compute error
Exit status -1073741819 (0xc0000005)
CPU time 941.5781
--------------

https://boinc.bakerlab.org/rosetta/result.php?resultid=215490731
t072_1_RDC_NMR_NESG_5481_92626_0
Client state Compute error
Exit status -1073741819 (0xc0000005)
CPU time 12309.66
-----------------------------
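
For readers decoding the exit status above: the negative decimal and the hex value are the same 32-bit number, and 0xC0000005 is the Windows status code STATUS_ACCESS_VIOLATION (the process touched memory it was not allowed to access). The sketch below assumes these tasks ran on Windows. The lookup table is deliberately tiny, and the names `to_unsigned32` and `describe_status` are illustrative.

```python
def to_unsigned32(value):
    """Reinterpret a signed 32-bit integer as unsigned."""
    return value & 0xFFFFFFFF


# Small lookup for illustration only; real Windows status codes are far more numerous.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def describe_status(exit_status):
    """Return (hex string, symbolic name or 'unknown') for a signed exit status."""
    code = to_unsigned32(exit_status)
    return f"0x{code:08x}", KNOWN_STATUS.get(code, "unknown")


print(describe_status(-1073741819))  # ('0xc0000005', 'STATUS_ACCESS_VIOLATION')
```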



©2024 University of Washington
https://www.bakerlab.org