To those with a WU stuck at 1% - what does the gfx look like?

Message boards : Number crunching : To those with a WU stuck at 1% - what does the gfx look like?

Christian Diepold
Joined: 23 Sep 05
Posts: 37
Credit: 300,225
RAC: 0
Message 6546 - Posted: 17 Dec 2005, 9:47:35 UTC

I'd like to know from those of you who had a WU stuck at 1% what the gfx looked like.

I mean, were the gfx still moving and wiggling, or were they stuck as well? I'm just looking for a way to be totally sure that a WU is stuck, and thought that maybe the gfx could help with that. I'd just hate to abort a WU that wasn't really stuck, and we're having a big discussion about it over at TeAm Anandtech as well.

Thx all!
River~~
Joined: 15 Dec 05
Posts: 761
Credit: 285,578
RAC: 0
Message 6550 - Posted: 17 Dec 2005, 12:47:43 UTC - in response to Message 6546.  
Last modified: 17 Dec 2005, 12:49:55 UTC

I'd like to know from those of you, who had a WU stuck at 1%, ...



Can't tell you about the gfx because it's turned off on my Windows boxes, and my Linux boxes are pure command line!

But the advice from Jack in the "report stuck WU here" thread is that they are interested in WUs that have been stuck at 1% for 10 or more hours.

I'd just hate to abort a WU that wasn't really stuck, and we're having a big discussion about it over at TeAm Anandtech as well.


Yeah, tell me.

I panicked a bit and aborted three WUs in a row that had run for 6000 sec, 3222 sec, and 1700 sec respectively, and now I wonder whether they'd have come right after all. It seemed like a long time to me, until I saw the "10 hour" advice...

Doh!
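
For a sense of scale, here is a minimal sketch of that comparison. The 10-hour figure is the one quoted from Jack above; the function names are hypothetical and only illustrate the arithmetic, they are not part of BOINC or Rosetta.

```python
# Hypothetical helpers: only the 10-hour threshold comes from the thread.
STUCK_THRESHOLD_HOURS = 10


def seconds_to_hours(cpu_seconds):
    """Convert a CPU time in seconds to hours."""
    return cpu_seconds / 3600.0


def worth_reporting(cpu_seconds, threshold_hours=STUCK_THRESHOLD_HOURS):
    """True once a WU has sat at 1% for at least the threshold."""
    return seconds_to_hours(cpu_seconds) >= threshold_hours


# The three aborted WUs mentioned above, in seconds of run time.
for secs in (6000, 3222, 1700):
    print(f"{secs} sec is {seconds_to_hours(secs):.2f} h; "
          f"past the 10-hour mark: {worth_reporting(secs)}")
```

All three run times are under two hours, so none of them had come close to the 10-hour mark.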
Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 6573 - Posted: 17 Dec 2005, 17:03:22 UTC - in response to Message 6546.  

I'd like to know from those of you who had a WU stuck at 1% what the gfx looked like.

I mean, were the gfx still moving and wiggling, or were they stuck as well? I'm just looking for a way to be totally sure that a WU is stuck, and thought that maybe the gfx could help with that. I'd just hate to abort a WU that wasn't really stuck, and we're having a big discussion about it over at TeAm Anandtech as well.

Thx all!

The one *I* trapped after 25 hours was frozen, with the exception of the time counter ticking up. CPU use was 100% ...

So, it is not like Predictor, where the pop-up idles the CPU. Rosetta is madly trying to work, but is accomplishing exactly nothing. Since it has not checkpointed, you lose the time invested ... :(

Before you restart, also save the stdout and stderr text files from the slot directory (you will have to look through all of the slot directories to find the work unit, sorry ...).

Restart BOINC and watch that WU when it starts again; it is VERY likely that it will run to completion. A fix is in the works according to the project. I'm not sure when it will come out, but it will be a new science application.
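
Saving those files by hand across several slots is tedious, so here is a rough sketch of how it could be scripted. It assumes the slots live in a `slots` folder under the BOINC data directory and that the files are named `stdout.txt` and `stderr.txt`; check your own installation, since both are assumptions here. The function name `save_slot_logs` is made up for this example.

```python
import shutil
from pathlib import Path


def save_slot_logs(boinc_data_dir, backup_dir):
    """Copy stdout/stderr text files from every slot directory to backup_dir.

    Assumes a layout of <boinc_data_dir>/slots/<n>/stdout.txt and stderr.txt.
    Returns the list of copied files so you can see which slots had logs.
    """
    slots = Path(boinc_data_dir) / "slots"
    if not slots.is_dir():
        return []  # nothing to do if the assumed layout is not present

    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)

    copied = []
    for slot in sorted(p for p in slots.iterdir() if p.is_dir()):
        for name in ("stdout.txt", "stderr.txt"):
            src = slot / name
            if src.is_file():
                # Prefix with the slot name so files from different slots
                # do not overwrite each other.
                dest = backup / f"slot{slot.name}_{name}"
                shutil.copy2(src, dest)
                copied.append(dest)
    return copied


# Example call (the data directory path is a placeholder, adjust it):
# save_slot_logs("/path/to/BOINC", "stuck_wu_logs")
```

Run it before restarting BOINC, then look through the copies to find the one belonging to the stuck work unit.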
geojim

Joined: 11 Nov 05
Posts: 3
Credit: 8,726,153
RAC: 0
Message 6594 - Posted: 17 Dec 2005, 21:12:10 UTC - in response to Message 6573.  

geojim

Joined: 11 Nov 05
Posts: 3
Credit: 8,726,153
RAC: 0
Message 6595 - Posted: 17 Dec 2005, 21:16:19 UTC - in response to Message 6594.  

I've had several WUs stuck at 1%, sometimes 10%, and the time to completion increases instead of decreasing as normal. If I check the graphic, it too is stuck and non-responsive. I've then just aborted the WU and hit update. I've reset the project more than once, and it seems like every third or fourth WU is doing this, on more than one machine.
Jim
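
The symptom described in this thread (progress frozen while CPU time keeps climbing) can be written down as a simple check. This is only a sketch: it assumes you note down (CPU seconds, percent done) readings by hand or from some tool of your own, and the name `looks_stuck` and the sample format are invented for the example. The 10-hour default is the figure quoted earlier in the thread.

```python
def looks_stuck(samples, min_stalled_hours=10.0):
    """Guess whether a WU is stuck from a series of readings.

    samples: list of (cpu_seconds, percent_done) tuples in time order.
    Returns True if percent_done has not changed over the most recent
    stretch of at least min_stalled_hours of CPU time.
    """
    if len(samples) < 2:
        return False  # a single reading cannot show a stall

    last_cpu, last_pct = samples[-1]
    stalled_since = last_cpu
    # Walk backwards to find how long the progress figure has been unchanged.
    for cpu, pct in reversed(samples):
        if pct != last_pct:
            break
        stalled_since = cpu

    return (last_cpu - stalled_since) / 3600.0 >= min_stalled_hours


# Progress reached 1% after 10 minutes and has not moved in about 11 hours.
readings = [(0, 0), (600, 1), (5 * 3600, 1), (11 * 3600, 1)]
print(looks_stuck(readings))
```

A frozen graphic, as described above, is a second signal worth checking alongside this one before aborting.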





Christian Diepold
Joined: 23 Sep 05
Posts: 37
Credit: 300,225
RAC: 0
Message 6603 - Posted: 17 Dec 2005, 22:44:20 UTC

Thx for the answers so far, guys! I just wanted to make sure that the gfx really do show when a WU is stuck. I guess that should be made public somehow. It's an easy but effective way to check, and it avoids aborting WUs too early.
Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 6622 - Posted: 18 Dec 2005, 4:58:42 UTC - in response to Message 6595.  

I've had several WUs stuck at 1%, sometimes 10%, and the time to completion increases instead of decreasing as normal. If I check the graphic, it too is stuck and non-responsive. I've then just aborted the WU and hit update. I've reset the project more than once, and it seems like every third or fourth WU is doing this, on more than one machine.

Jim,

This is a bug in the application. You do not have to reset the project. In fact, if you just restart BOINC, it is likely that the work unit will process to completion. Again, if you don't want to run the risk, go ahead and abort them, but I have never seen one hang twice in a row.

A fix is in the works so we just have to wait and watch ...
geojim

Joined: 11 Nov 05
Posts: 3
Credit: 8,726,153
RAC: 0
Message 6796 - Posted: 19 Dec 2005, 17:53:26 UTC - in response to Message 6622.  

Thanks Paul,
I'll try this.
Jim Chandler







©2024 University of Washington
https://www.bakerlab.org