Report Maximum CPU Time Exceeded WU HERE

Cureseekers~VortoN
Joined: 11 Nov 05
Posts: 3
Credit: 1,396,786
RAC: 0
Message 10242 - Posted: 31 Jan 2006, 1:19:40 UTC - in response to Message 10221.  

ID: 10242 · Rating: 0
Cureseekers~Joschy
Joined: 8 Dec 05
Posts: 2
Credit: 1,969,809
RAC: 0
Message 10249 - Posted: 31 Jan 2006, 8:13:40 UTC
Last modified: 31 Jan 2006, 8:28:37 UTC

I had some bad WUs too.

Client error: exit status -197. This one was stuck at xx % for a very long time, with no progress.
PRODUCTION_ABINITIO_1dhn__250_1426_0

CPU time issue: exit status -177
PRODUCTION_ABINITIO_1bkrA_250_1031_0
PRODUCTION_ABINITIO_1dhn__250_911_0
PRODUCTION_ABINITIO_2chf__250_859_0
PRODUCTION_ABINITIO_1louA_250_913_0
PRODUCTION_ABINITIO_1louA_250_913_0
MORE_FRAGS_W_BARCODE_2reb_229_3721_0

Hope I still get the claimed credit.


ID: 10249 · Rating: 0
Los Alcoholicos~La Muis
Joined: 4 Nov 05
Posts: 34
Credit: 1,041,724
RAC: 0
Message 10250 - Posted: 31 Jan 2006, 8:24:44 UTC
Last modified: 31 Jan 2006, 8:28:00 UTC

Some more:

PRODUCTION_ABINITIO_2chf__250_1848_0 7356030
PRODUCTION_ABINITIO_1bgf__250_693_1 7321911

I aborted this one after 6:50 hours, when it was still at 20% (max CPU time for this machine is 25:30):

PRODUCTION_ABINITIO_1fna__250_1176_2 7943015
ID: 10250 · Rating: 0
[DPC]FOKschaap~Eronymus
Joined: 7 Nov 05
Posts: 2
Credit: 50,153
RAC: 0
Message 10313 - Posted: 1 Feb 2006, 12:45:48 UTC

I also have jobs with Maximum CPU time exceeded.
It seems that all jobs longer than 16 hours got this error.

PRODUCTION_ABINITIO_2vik__250_340_0
PRODUCTION_ABINITIO_1dhn__250_340_0
PRODUCTION_ABINITIO_1louA_250_337_0
PRODUCTION_ABINITIO_1npsA_250_336_0
PRODUCTION_ABINITIO_1tig__250_335_0
PRODUCTION_ABINITIO_1who__250_334_0

I aborted one, since all jobs longer than 16 hours get the same error:
PRODUCTION_ABINITIO_1ten__250_340_0

ID: 10313 · Rating: 0
BurnHard
Joined: 22 Nov 05
Posts: 4
Credit: 2,139,569
RAC: 0
Message 10320 - Posted: 1 Feb 2006, 21:09:57 UTC

Got a list as well


https://boinc.bakerlab.org/rosetta/result.php?resultid=7262113
https://boinc.bakerlab.org/rosetta/result.php?resultid=7249237
https://boinc.bakerlab.org/rosetta/result.php?resultid=7249220
https://boinc.bakerlab.org/rosetta/result.php?resultid=7249140
https://boinc.bakerlab.org/rosetta/result.php?resultid=7249134
https://boinc.bakerlab.org/rosetta/result.php?resultid=7247559
https://boinc.bakerlab.org/rosetta/result.php?resultid=7247550
https://boinc.bakerlab.org/rosetta/result.php?resultid=7247535
https://boinc.bakerlab.org/rosetta/result.php?resultid=7247345
https://boinc.bakerlab.org/rosetta/result.php?resultid=7246137
https://boinc.bakerlab.org/rosetta/result.php?resultid=7246125
https://boinc.bakerlab.org/rosetta/result.php?resultid=7246111
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245934
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245896
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245894
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245817
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245762
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245735
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245700
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245686
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245670
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245608
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245597
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245591
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245494
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245480
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245465
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245293
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245276
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245260
https://boinc.bakerlab.org/rosetta/result.php?resultid=7245198
https://boinc.bakerlab.org/rosetta/result.php?resultid=7027124


Hope to get the points anyway

BurnHard
ID: 10320 · Rating: 1
Dave Wilson
Joined: 8 Jan 06
Posts: 35
Credit: 379,049
RAC: 0
Message 10342 - Posted: 2 Feb 2006, 14:32:47 UTC

I kind of hate to say it, but without these points I will not transfer all my machines to Rosetta, nor will I continue with this machine. These points are a small price to pay for my help and expense. With nothing, get nothing. I do hope this is fixed soon, and I will be watching.
ID: 10342 · Rating: -2
Divide Overflow
Joined: 17 Sep 05
Posts: 82
Credit: 921,382
RAC: 0
Message 10400 - Posted: 3 Feb 2006, 4:56:28 UTC
Last modified: 3 Feb 2006, 5:27:56 UTC


ID: 10400 · Rating: 0
[DPC]FOKschaap~Eronymus
Joined: 7 Nov 05
Posts: 2
Credit: 50,153
RAC: 0
Message 10457 - Posted: 4 Feb 2006, 15:16:42 UTC

Getting more of those errors....

PRODUCTION_ABINITIO_1acf__250_357_0
PRODUCTION_ABINITIO_1tul__250_354_0

I really don't like it...
ID: 10457 · Rating: 0
Ricky@SETI.USA
Joined: 13 Dec 05
Posts: 20
Credit: 97,355
RAC: 0
Message 10471 - Posted: 5 Feb 2006, 0:54:42 UTC - in response to Message 10457.  

Getting more of those errors....

PRODUCTION_ABINITIO_1acf__250_357_0
PRODUCTION_ABINITIO_1tul__250_354_0

I really don't like it...


I have 5, and one of them is running where it says it is a 9-hour WU, but it has been running for over 19 hours and says it has 28 more hours to go! Should I abort them?


"Life is like an Ice Cream cone, just when you think you got it licked, it drips all over you!"

ID: 10471 · Rating: 0
Los Alcoholicos~La Muis
Joined: 4 Nov 05
Posts: 34
Credit: 1,041,724
RAC: 0
Message 10474 - Posted: 5 Feb 2006, 9:24:04 UTC

ID: 10474 · Rating: 0
Dalephi
Joined: 5 Dec 05
Posts: 2
Credit: 1,450,698
RAC: 0
Message 10528 - Posted: 7 Feb 2006, 6:59:14 UTC - in response to Message 9946.  

ID: 10528 · Rating: 1
Rhiju
Volunteer moderator
Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 10545 - Posted: 7 Feb 2006, 19:27:30 UTC

I've been working with David Baker on these runs (PRODUCTION_ABINITIO), and we really appreciate these reports of timeouts. As David mentioned, we're reducing the sizes of any further jobs -- by 4-fold -- and increasing the maxCPU time to prevent this error from occurring again.

We've just queued up a few more jobs with the tag PRODUCTION_ABINITIO_CENTROID_PACKING. Please let us know if there are more timeouts! So far the results are very enlightening -- stay tuned to the Rosetta@home Science message boards for an update after we get this next round of results.
ID: 10545 · Rating: 0
Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 10548 - Posted: 7 Feb 2006, 21:35:44 UTC - in response to Message 10545.  
Last modified: 7 Feb 2006, 21:35:57 UTC

We've just queued up a few more jobs with the tag PRODUCTION_ABINITIO_CENTROID_PACKING. Please let us know if there are more timeouts! So far the results are very enlightening -- stay tuned to the Rosetta@home Science message boards for an update after we get this next round of results.

This is how it is supposed to be guys ...

We have problems, we work together and communicate news and test results ...

The project is patient with the participants venting a little ...
The participants are patient (well, we need more practice maybe) with the project taking longer than we would like to solve the issues ...
ID: 10548 · Rating: 1
Team TMR
Joined: 2 Nov 05
Posts: 21
Credit: 1,583,679
RAC: 0
Message 10569 - Posted: 8 Feb 2006, 10:05:40 UTC
Last modified: 8 Feb 2006, 10:06:18 UTC

This one just timed out: WU 5610404, Result 9094342

I also have 3 other ABINITIO WUs in progress that have been running over 12 hours (2 are on 2+ GHz PCs) which might be heading the same way.

If these timed-out WUs are of use, are you still giving credit for them?
ID: 10569 · Rating: 0
Team TMR
Joined: 2 Nov 05
Posts: 21
Credit: 1,583,679
RAC: 0
Message 10574 - Posted: 8 Feb 2006, 13:05:29 UTC - in response to Message 10569.  
Last modified: 8 Feb 2006, 13:06:04 UTC

I also have 3 other ABINITIO WUs in progress that have been running over 12 hours (2 are on 2+ GHz PCs) which might be heading the same way.

One of them now has: Result 9027571
ID: 10574 · Rating: 0
Team TMR
Joined: 2 Nov 05
Posts: 21
Credit: 1,583,679
RAC: 0
Message 10575 - Posted: 8 Feb 2006, 14:37:24 UTC

And now the 3rd has failed.

Result 8433350

I hope we're going to get credit for these!
ID: 10575 · Rating: 0
Los Alcoholicos~La Muis
Joined: 4 Nov 05
Posts: 34
Credit: 1,041,724
RAC: 0
Message 10577 - Posted: 8 Feb 2006, 16:44:08 UTC - in response to Message 10548.  

We've just queued up a few more jobs with the tag PRODUCTION_ABINITIO_CENTROID_PACKING. Please let us know if there are more timeouts! So far the results are very enlightening -- stay tuned to the Rosetta@home Science message boards for an update after we get this next round of results.

This is how it is supposed to be guys ...

We have problems, we work together and communicate news and test results ...

The project is patient with the participants venting a little ...
The participants are patient (well, we need more practice maybe) with the project taking longer than we would like to solve the issues ...


I lost more than 380 hours of CPU time on the 'Stuck at 1%' and the 'Maximum CPU Time Exceeded' WUs over the last 30 days. How much more practice do I need?
ID: 10577 · Rating: 0
Los Alcoholicos~La Muis
Joined: 4 Nov 05
Posts: 34
Credit: 1,041,724
RAC: 0
Message 10578 - Posted: 8 Feb 2006, 16:53:57 UTC - in response to Message 10545.  

I've been working with David Baker on these runs (PRODUCTION_ABINITIO), and we really appreciate these reports of timeouts. As David mentioned, we're reducing the sizes of any further jobs -- by 4-fold -- and increasing the maxCPU time to prevent this error from occurring again.

We've just queued up a few more jobs with the tag PRODUCTION_ABINITIO_CENTROID_PACKING. Please let us know if there are more timeouts! So far the results are very enlightening -- stay tuned to the Rosetta@home Science message boards for an update after we get this next round of results.


The new PRODUCTION_ABINITIO_CENTROID_PACKING_1bm8_301_50_0 I received yesterday was still at 1% after 13.30 hours. After a restart it runs fine (it is now at 50% after 58 minutes).

It isn't a 'Maximum CPU Time Exceeded' but a 'Stuck at 1%' issue this time.
ID: 10578 · Rating: 0
Koen
Joined: 29 Sep 05
Posts: 8
Credit: 8,542,574
RAC: 0
Message 10590 - Posted: 9 Feb 2006, 10:57:21 UTC

Got one more:

WU5802261
ID: 10590 · Rating: 0
Dalephi
Joined: 5 Dec 05
Posts: 2
Credit: 1,450,698
RAC: 0
Message 10593 - Posted: 9 Feb 2006, 16:01:19 UTC - in response to Message 10528.  

Here are two more:

https://boinc.bakerlab.org/rosetta/result.php?resultid=7405116
https://boinc.bakerlab.org/rosetta/result.php?resultid=7403676

Could you tell us what your plans are for these work units? Will we get the credit for them? I would like to report back to my TeAm to let them know the status.

Dalephi
ID: 10593 · Rating: 0