Message boards : Number crunching : Please abort WUs with
Author | Message |
---|---|
AMDave Joined: 16 Dec 05 Posts: 35 Credit: 12,576,896 RAC: 0 | I don't think any information is left to be gained on these... so no, _I_ certainly don't need to see anything. I don't see any reason the project would either, but I can't be 100% sure of that. I have a couple of examples now (thanks Fuzzy!) for my email to DK. Bill: The WU I mentioned in this thread has yet to be processed from my cache. Would it be prudent for me to suspend it, assuming that the aborted status can be changed? If it can be changed and I do suspend it, will the BOINC mgr skip to the following WUs in my cache? |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,362 RAC: 7 | The WU I mentioned in this thread has yet to be processed from my cache. Would it be prudent for me to suspend it, assuming that the aborted status can be changed? If it can be changed and I do suspend it, will the BOINC mgr skip to the following WUs in my cache? I don't know if you can do anything with the status, but you can certainly try to "suspend" it in the Work tab. BOINC will indeed just ignore anything "suspended" and continue with other work. |
AMDave Joined: 16 Dec 05 Posts: 35 Credit: 12,576,896 RAC: 0 | Well, I highlighted the WU and clicked the Suspend button. The button changed to Resume; however, the status remained as Aborted. Will have to wait and see - but my cache is set for 2 days and the WUs have increased to over 5 hrs each. It may be over 45 hrs before the WU in question is looked at by the mgr. |
Fuzzy Hollynoodles Joined: 7 Oct 05 Posts: 234 Credit: 15,020 RAC: 0 | Yep, it will! I did that, when I got the first Graphic Rosetta WU, suspended everything else, and that WU started, so I could watch the show! :-D It will be suspended until you click the "Resume" button! "I'm trying to maintain a shred of dignity in this world." - Me |
River~~ Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 | Well, I highlighted the WU and clicked the Suspend button. The button changed to Resume; however, the status remained as Aborted. Will have to wait and see - but my cache is set for 2 days and the WUs have increased to over 5 hrs each. It may be over 45 hrs before the WU in question is looked at by the mgr. Normally an aborted WU will change later on to Computation Error. If it is suspended (and it must be for the button to say 'resume'), then it will not make this change, and will not ever become a candidate for reporting. Later on when you do want to let it be reported you will need to resume it. If you had suspended it after the change to computation error then I believe it would be shipped out at the next update. R~~ |
Scribe Joined: 2 Nov 05 Posts: 284 Credit: 157,359 RAC: 0 | With mine, even if showing 'aborted' and the project suspended, an 'update' will get them reported and out of your list. |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,362 RAC: 7 | With mine, even if showing 'aborted' and the project suspended, an 'update' will get them reported and out of your list. True, suspending a project is overridden on reporting by the "update" - but individual suspended results _shouldn't_ be affected by that. Still, might be a good idea to avoid hitting "update" unless you have to... |
Fuzzy Hollynoodles Joined: 7 Oct 05 Posts: 234 Credit: 15,020 RAC: 0 | Bill, do you want an example of one of the zombie WUs? Anyway, here it is: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=3774628 With the result: https://boinc.bakerlab.org/rosetta/result.php?resultid=5235969 So I was number 11 who was so "lucky" to draw one from the lottery! I wonder if it's killed by now, or what??? Anyway, it only ran for a mere 35 seconds or so! "I'm trying to maintain a shred of dignity in this world." - Me |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,362 RAC: 7 | Bill, do you want an example of one of the zombie WUs? You staked it! It's finally dead! :-) The "short WUs" actually appear to be flushing out better than the "205"s, just the opposite of what was hoped for. The staff is going to have to manually kill off the remaining 205's when they return. THOSE are the _real_ "zombies". If there are any "short" ones left by the time the staff is back, it'll be very few, only the ones hoarded at some point by people with a 10-day cache. I would guess those folks are just now starting to have computation errors... |
Fuzzy Hollynoodles Joined: 7 Oct 05 Posts: 234 Credit: 15,020 RAC: 0 | Bill, do you want an example of one of the zombie WUs? LOLLL!!!! :-D Whooo Hooooo!!!! So I guess the rants from them will start soon! ;-) "I'm trying to maintain a shred of dignity in this world." - Me |
Scribe Joined: 2 Nov 05 Posts: 284 Credit: 157,359 RAC: 0 | Still getting a few 'shorts' per day; looking at them, they seem to have been waiting in someone's cache for a while. Also had a '205' which ran for 5 hours+ before I noticed it and 'shot' it.... |
Fuzzy Hollynoodles Joined: 7 Oct 05 Posts: 234 Credit: 15,020 RAC: 0 | Here's a weird one! It didn't stop after a few seconds, it actually ran for some time: The message from my manager (with a very weird message from Seti in between!?!?): 12/27/2005 11:06:58 PM\|rosetta@home\|Starting result 1di2__abrelax_rand_len10_jit02_omega_sim_filters_23953_1 using rosetta version 481 12/27/2005 11:06:59 PM\|rosetta@home\|Started upload of DEFAULT_2tif_220_2691_1_0 12/27/2005 11:07:06 PM\|rosetta@home\|Finished upload of DEFAULT_2tif_220_2691_1_0 12/27/2005 11:07:06 PM\|rosetta@home\|Throughput 5931 bytes/sec 12/27/2005 11:45:03 PM\|\|Rescheduling CPU: project op 12/27/2005 11:45:09 PM\|rosetta@home\|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi 12/27/2005 11:45:09 PM\|rosetta@home\|Reason: Requested by user 12/27/2005 11:45:09 PM\|rosetta@home\|Reporting 1 results 12/27/2005 11:45:14 PM\|rosetta@home\|Scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi succeeded 12/28/2005 1:16:58 AM\|SETI@home\|Result 16fe05ab.14254.26049.798576.1.16_0 exited with zero status but no 'finished' file 12/28/2005 1:16:58 AM\|SETI@home\|If this happens repeatedly you may need to reset the project. 12/28/2005 1:16:58 AM\|rosetta@home\|Unrecoverable error for result 1di2__abrelax_rand_len10_jit02_omega_sim_filters_23953_1 ( - exit code -1073741819 (0xc0000005)) 12/28/2005 1:17:00 AM\|\|Rescheduling CPU: process exited https://boinc.bakerlab.org/rosetta/workunit.php?wuid=2232941 With the result: https://boinc.bakerlab.org/rosetta/result.php?resultid=5251325 "I'm trying to maintain a shred of dignity in this world." - Me |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,362 RAC: 7 | Here's a weird one! It didn't stop after a few seconds, it actually ran for some time: Very strange - there's not even a single "random seed" _in_ that result... |
Nuadormrac Joined: 27 Sep 05 Posts: 37 Credit: 202,469 RAC: 0 | This would explain the computational errors I've been seeing in Rosetta. BTW, still got DEFAULT... WUs even today. Guess we should still kill them then. BTW, I got a computational error today on a WU of a different sort. https://boinc.bakerlab.org/rosetta/result.php?resultid=5182702 As can be seen, it doesn't start with the DEFAULT... prefix that I've been seeing a fair amount of lately. Also, will our download quotas be negatively impacted by all these bad units, and if peeps start running out of work (as the servers don't allow any more downloads) will steps be taken to rectify this possible side effect? thx |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,362 RAC: 7 | BTW, still got DEFAULT... WUs even today. Guess we should still kill them then. Please see "Four kinds of errors" thread - ONLY the DEFAULT_xxxxx_205's should be killed. There are a lot of perfectly good "DEFAULT_xxxxx_somethingelse" WUs out there. Quotas are affected by the "short" WUs, but a couple of good ones is all it takes to compensate. Credit will be granted for all that they can, definitely the DEFAULT_xxxxx_205's. |
R/B Joined: 8 Dec 05 Posts: 195 Credit: 28,095 RAC: 0 | OH NO....I saw the message on the home page....and it wasn't immediately clear that other DEFAULT XXX above 205 were ok. I just aborted 8 unworked on units because I just read the 'DEFAULT' portion. Perhaps someone can rewrite the instructions and include something in all caps that says the Default xxx units above 205 are ok and not to abort them? Founder of BOINC GROUP - Objectivists - Philosophically minded rational data crunchers. |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,362 RAC: 7 | I just aborted 8 unworked on units because I just read the 'DEFAULT' portion. Well, no harm done to project (they'll be reissued), and only harm to you is lost download time... I don't know what happened to the title of _this_ thread, it did say "with DEFAULT_xxxxx_205". I don't think anybody from project will be around before Monday to do anything else, and by then these (hopefully) will either be gone, or staff will kill them. |
River~~ Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 | OH NO....I saw the message on the home page....and it wasn't immediately clear that other DEFAULT XXX above 205 were ok. NB, below 205 ok too - there may just be a few of those still being re-issued. I just aborted 8 unworked on units because I just read the 'DEFAULT' portion. Easily done even when you do know - I'm trying to keep track of 17 boxes and I noticed I'd mistakenly aborted a load of good WUs at one point - and I've no idea if I just made that mistake once. Most of us do other stuff besides BOINC and it is easy for the attention to slip for a moment, whether it is missing something on the boards or clicking the wrong button on the GUI. As Bill said, BOINC is designed so that most mistakes will not hurt the project and this one won't even have hurt your own stats either. Only mods can re-write existing postings. Apart from that these boards are pretty egalitarian. You are free, as free as I am, or Bill, to post a thread with the title NB- most DEFAULT are ok. If you made a mistake, maybe others will do the same. The very best person to explain how to avoid the mistake is someone who has already made it. You might feel that we have too many threads already - and I'd agree that we don't really want to spread discussion over yet another thread. If I made the posting I'd want to include in the body of the posting a mention of the titles or urls of a few other relevant threads and ask people to post responses there. River~~ |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,362 RAC: 7 | Only mods can re-write existing postings. Just to clarify - I can only edit my _own_ postings past the "one hour deadline", I think. I can't edit anyone else's postings at any time, only move or delete them. Nor can I change thread titles, or anything on any of the other pages on the site, or, or... |
Fuzzy Hollynoodles Joined: 7 Oct 05 Posts: 234 Credit: 15,020 RAC: 0 | Only mods can re-write existing postings. Yes, as the thread owner you can change the thread title. Just reply to the thread, not to any post, write something like "Change the thread title" or whatever, and post it. Then go back and edit that post, and then you'll have access to change the thread title. Just go ahead! Unless, of course, the devs here have decided they won't go with titles longer than x characters. But it works on other boards. "I'm trying to maintain a shred of dignity in this world." - Me |
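
The rule repeated in the posts above (abort ONLY the DEFAULT_xxxxx_205 WUs, and leave every other DEFAULT_ WU alone) can be sketched as a small name check. This is only an illustration, not anything the project provides. It assumes the "205" is the field that follows the target name, as the 220 does in `DEFAULT_2tif_220_2691_1_0` from the log quoted earlier, and that the target name contains no underscore of its own. The function name `should_abort` and the `DEFAULT_2tif_205_...` sample name are hypothetical.

```python
import re

# Assumption: a bad work unit is named DEFAULT_<target>_205..., where <target>
# has no internal underscore and may be followed by one or more underscores.
_BAD_205 = re.compile(r"^DEFAULT_[^_]+_+205(?:_|$)")


def should_abort(name: str) -> bool:
    """Return True only for names that match the DEFAULT_xxxxx_205 pattern."""
    return _BAD_205.match(name) is not None


if __name__ == "__main__":
    samples = [
        "DEFAULT_2tif_205_2691_1_0",  # hypothetical 205 name
        "DEFAULT_2tif_220_2691_1_0",  # name taken from the log in this thread
        "1di2__abrelax_rand_len10_jit02_omega_sim_filters_23953_1",
    ]
    for sample in samples:
        verdict = "abort" if should_abort(sample) else "keep"
        print(f"{verdict}: {sample}")
```

Matching on the whole "205" field rather than on the DEFAULT prefix is the point of the sketch: checking only the prefix is exactly the mistake that led to good WUs being aborted.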