Warning: Don't shut down BOINC Manager..!!

Message boards : Number crunching : Warning: Don't shut down BOINC Manager..!!

Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 290 - Posted: 21 Sep 2005, 20:21:56 UTC - in response to Message 287.  

I'm not sure if this is a problem with the WU or if it's a problem due to my system crash.

System crash ... :)

That is a common error for things like device drivers. You can look it up in the Wiki off the Messages link in the front page.

But, I would like to look at the log for the heck of it ...

Zip up the *.TXT files in the boinc directory and send to p.d.buck@comcast.net

I never know if I am going to find something good or not ... You just can never tell ...
ID: 290 · Rating: 0
Polian
Joined: 21 Sep 05
Posts: 152
Credit: 10,141,266
RAC: 0
Message 353 - Posted: 23 Sep 2005, 4:41:31 UTC - in response to Message 222.  

Don't want to work much during vacation.


haha :)


ID: 353 · Rating: 0
[B@H] Ray
Joined: 20 Sep 05
Posts: 118
Credit: 100,251
RAC: 0
Message 423 - Posted: 24 Sep 2005, 19:34:40 UTC

CPDN has been having the same problem for about 1/3 of their users. They found that if you suspend CPDN and Rosetta and exit BOINC before rebooting, this should not be a problem.
CPDN is worse: a lot of people have been reporting damaged models after losing power, so CPDN is recommending that their people back up the BOINC folder occasionally.
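
A backup like the one CPDN recommends could be scripted roughly as follows. This is only a minimal sketch, not an official CPDN or BOINC tool: the function name `backup_boinc_folder` is made up for illustration, and the folder locations are whatever you pass in. It assumes BOINC has been exited first so that no file is being written while it is copied, and that the backup folder lives outside the BOINC folder.

```python
import shutil
import time
from pathlib import Path


def backup_boinc_folder(boinc_dir, backup_root):
    """Zip the whole BOINC folder into a timestamped archive and return its path.

    Hypothetical helper: exit BOINC before calling it so no file is mid-write,
    and keep backup_root outside boinc_dir so the archive does not include itself.
    """
    boinc_dir = Path(boinc_dir)
    backup_root = Path(backup_root)
    if not boinc_dir.is_dir():
        raise FileNotFoundError(f"BOINC folder not found: {boinc_dir}")
    backup_root.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # make_archive appends ".zip" to the base name itself.
    archive = shutil.make_archive(
        str(backup_root / f"boinc-backup-{stamp}"), "zip", root_dir=str(boinc_dir)
    )
    return Path(archive)
```

Restoring would mean exiting BOINC, unpacking the archive over the BOINC folder, and starting BOINC again. Work done since the backup is lost, which is why a backup matters more for long-running models than for short work units.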


Pizza@Home Rays Place Rays place Forums
ID: 423 · Rating: 1
Desti
Joined: 16 Sep 05
Posts: 50
Credit: 3,018
RAC: 0
Message 435 - Posted: 24 Sep 2005, 23:57:35 UTC - in response to Message 423.  

CPDN has been having the same problem for about 1/3 of their users. They found that if you suspend CPDN and Rosetta and exit BOINC before rebooting, this should not be a problem.
CPDN is worse: a lot of people have been reporting damaged models after losing power, so CPDN is recommending that their people back up the BOINC folder occasionally.


Yeah, that happened to me today after I accidentally turned off the wrong power switch. Luckily, I had made a backup yesterday. :-)
LUE
ID: 435 · Rating: 0
[B@H] Ray
Joined: 20 Sep 05
Posts: 118
Credit: 100,251
RAC: 0
Message 440 - Posted: 25 Sep 2005, 3:15:01 UTC - in response to Message 435.  

CPDN has been having the same problem for about 1/3 of their users. They found that if you suspend CPDN and Rosetta and exit BOINC before rebooting, this should not be a problem.
CPDN is worse: a lot of people have been reporting damaged models after losing power, so CPDN is recommending that their people back up the BOINC folder occasionally.


Yeah, that happened to me today after I accidentally turned off the wrong power switch. Luckily, I had made a backup yesterday. :-)


You are lucky; as far as I know, only people running CPDN make backups of BOINC. If the WUs were larger here, this would be another project to back up for.


Pizza@Home Rays Place Rays place Forums
ID: 440 · Rating: 0
RDC
Joined: 16 Sep 05
Posts: 43
Credit: 101,644
RAC: 0
Message 444 - Posted: 25 Sep 2005, 6:48:54 UTC - in response to Message 440.  

CPDN has been having the same problem for about 1/3 of their users. They found that if you suspend CPDN and Rosetta and exit BOINC before rebooting, this should not be a problem.
CPDN is worse: a lot of people have been reporting damaged models after losing power, so CPDN is recommending that their people back up the BOINC folder occasionally.


Yeah, that happened to me today after I accidentally turned off the wrong power switch. Luckily, I had made a backup yesterday. :-)


You are lucky; as far as I know, only people running CPDN make backups of BOINC. If the WUs were larger here, this would be another project to back up for.


Weird, I've had WUs crash on me for every project except CPDN. That's usually the WU that gets hit the hardest too since it usually is the project being crunched when some idiot decides to make love to a nearby telephone pole with their car ;) It's happened 3 times so far in the past 6 months...

ID: 444 · Rating: 0
George
Joined: 27 Nov 05
Posts: 8
Credit: 634,319
RAC: 0
Message 4905 - Posted: 2 Dec 2005, 0:31:05 UTC

This is a bit off topic, but for some reason when Rosetta is running my CPU load is not at 100%, which is what I expected given that Seti@home uses all of the CPU. Is this normal, or have I set something wrong in my preferences? I have both Seti and Rosetta attached. When Seti runs the CPU load is 100%; when Rosetta runs it's near 0%.

Thanks for any help.
ID: 4905 · Rating: 0
Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,583,570
RAC: 3,169
Message 4908 - Posted: 2 Dec 2005, 2:12:20 UTC - in response to Message 4905.  

This is a bit off topic, but for some reason when Rosetta is running my CPU load is not at 100%, which is what I expected given that Seti@home uses all of the CPU. Is this normal, or have I set something wrong in my preferences? I have both Seti and Rosetta attached. When Seti runs the CPU load is 100%; when Rosetta runs it's near 0%.


You have two computers attached - one seems to be working, with a couple of errors. The other is returning nothing but errors. Both are single-CPU systems, so BOINC should run _either_ SETI or Rosetta at any one time, never both.

First question is which computer (by ID# not name please, we can't see the names) are you asking about? Do you see two "Running" status indications in the Work tab? How are you measuring CPU load, with Task Manager?

As far as preference settings - you MUST set it to "leave application in memory = YES" in order for Rosetta to work properly. They're chasing that bug now, but it's not fixed yet. SETI doesn't care, but actually you'll lose a _little_ crunching time even on SETI if that option is set to "NO", because it won't checkpoint when switched out.

ID: 4908 · Rating: 0
George
Joined: 27 Nov 05
Posts: 8
Credit: 634,319
RAC: 0
Message 4938 - Posted: 2 Dec 2005, 10:54:37 UTC

Thanks for the help. Both systems have the symptom. 78132 is the fast one, and I'm guessing it is the one with few errors. 76452 is an older, slow system, but it hosts some servers and so is on all the time. The errors may come from me suspending and resuming work in attempts to get the CPU usage up. I do see 'Running' when the CPU usage is near zero in Task Manager. As soon as I suspend Rosetta and Seti starts, usage jumps to 100%; then I resume Rosetta and it drops back down.

I did have 'leave app in memory' set to no in general prefs, just switched it to yes, we'll see if that's it. Thanks again for the pointer.
ID: 4938 · Rating: 0
Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,583,570
RAC: 3,169
Message 4957 - Posted: 2 Dec 2005, 15:52:50 UTC - in response to Message 4938.  

I did have 'leave app in memory' set to no in general prefs, just switched it to yes, we'll see if that's it.


76452 still hasn't turned in a single valid result... the error reported each time _looks_ like the "leave in memory" problem. I didn't look any further at it.

78132 is successful most of the time, possibly because it can complete a result fast enough to not be switched out, or if it is switched out, not as many times. It has only a couple of possible memory-swap errors. The CPU time it is taking to complete results looks to be in-line with what I would expect from a computer with these benchmarks, so when it _does_ get the CPU, it's using it.

For example, one result I picked at random was downloaded to your computer, took 2 hours of CPU time, and was returned 4 hours later. No problem at all, and if SETI and Rosetta are at a 50/50 resource share, this is exactly what I'd expect. From the results page, I'm just not seeing a performance problem.

From what you're describing, I would expect to see a WU downloaded and completed in 2 hours of CPU time but taking 24 or 48 hours (or longer) to be returned. Your average turnaround on 78132 is only 6 hours.
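
The arithmetic behind that expectation can be sketched as below. This is a rough model, not how BOINC actually schedules work: it assumes a single-CPU host that is on all the time and that a project's resource share translates directly into its fraction of wall-clock time, and it ignores download, upload, and reporting delays. The function name is made up for illustration.

```python
def expected_turnaround_hours(cpu_hours, resource_share, total_share):
    """Estimate wall-clock hours to finish a result on a single-CPU host.

    Simplifying assumption: the project gets resource_share / total_share
    of the CPU time, so wall time = CPU time / that fraction.
    """
    if cpu_hours < 0 or resource_share <= 0 or total_share < resource_share:
        raise ValueError("shares must be positive and cpu_hours non-negative")
    fraction = resource_share / total_share
    return cpu_hours / fraction


# 2 hours of CPU time with SETI and Rosetta at 50/50:
print(expected_turnaround_hours(2, 50, 100))  # 4.0
```

Under these assumptions a 2-hour result at a 50/50 share comes back in about 4 hours. A host where Rosetta sat idle most of the time would show turnaround far beyond that estimate.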

I would suggest now that you've changed the pref, just let both systems run for a couple of days. Then take a look at the results pages for both computers and see if you are still getting errors, and take a look at the RAC for each. 76452 isn't going to get much of a RAC, but it should do more than zero! I'm a Mac person, so I can't really tell you what to expect from 78132, but at a wild guess, I would say maybe 200 or 250 (again, if SETI and Rosetta are 50/50). It is already at 92.44 after just a few days, and it takes a week or so to "level out". If it has a similar RAC on SETI and Rosetta (it may take longer to level on SETI because of the need for a quorum on each result) and Rosetta is still showing 0% CPU use... well, you must have one of those CPUs that were altered by the aliens to help us fold proteins, and Rosetta work is being done in another dimension... :-)

Regardless, PLEASE post back here in a few days and let us know how it's going, and we'll look at things again, see if there are any "tweaks" we can suggest, make sure this is solved. And of course, if you have any other questions or problems, feel free to post before then.

ID: 4957 · Rating: 1
George
Joined: 27 Nov 05
Posts: 8
Credit: 634,319
RAC: 0
Message 4958 - Posted: 2 Dec 2005, 16:52:24 UTC

Wow. What a response. If only my cell phone provider and ISP could be a quarter as responsive! And I pay them big bucks.

Sitting here at 78132, the CPU is showing 0% usage in Task Manager while Rosetta shows as 'Running' and Seti shows 'Preempted' in BOINC Manager. Darn those pesky trans-dimensional aliens! I don't care about the reported CPU usage as long as the WU is getting done, but if there is some weirdness perhaps we should know about it and publish it so others don't get similarly confused. If so we should start another thread with a more germane title.

I do have "Do work while computer in use?" set to 'yes'.

Thanks so much, and I'll keep an eye on it and report back.
ID: 4958 · Rating: 0
Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,583,570
RAC: 3,169
Message 4968 - Posted: 2 Dec 2005, 18:22:19 UTC - in response to Message 4958.  

Wow. What a response. If only my cell phone provider and ISP could be a quarter as responsive! And I pay them big bucks.


O_O I have just recently finished yet another battle with BOTH my cell phone provider, AND my ISP... plus my local grocery store... Sigh. Very few now have ever heard of a thing called "customer service". It's "we want your money, we don't care if we piss you off, you don't have very many choices and the others are just as bad so deal with it, it would cost us another $0.10 on the quarterly share price if we actually paid our employees enough to be able to hire somebody with a brain!"

So needless to say, thank you. If I ever start providing service like a cell phone company or ISP, I'll just shut down the browser, 'cause it obviously isn't fun any more. :-)

ID: 4968 · Rating: 0
dgnuff
Joined: 1 Nov 05
Posts: 350
Credit: 24,773,605
RAC: 0
Message 4972 - Posted: 2 Dec 2005, 18:53:54 UTC - in response to Message 4958.  

Wow. What a response. If only my cell phone provider and ISP could be a quarter as responsive! And I pay them big bucks.

Sitting here at 78132, the CPU is showing 0% usage in Task Manager while Rosetta shows as 'Running' and Seti shows 'Preempted' in BOINC Manager. Darn those pesky trans-dimensional aliens! I don't care about the reported CPU usage as long as the WU is getting done, but if there is some weirdness perhaps we should know about it and publish it so others don't get similarly confused. If so we should start another thread with a more germane title.

I do have "Do work while computer in use?" set to 'yes'.

Thanks so much, and I'll keep an eye on it and report back.


Two more things to try.

On the "work" tab in Boinc manager, can you see the CPU time increasing?

Secondly, the processes tab of Task Manager will show you a breakdown of the CPU usage for each process. When you have a Rosetta unit "running" where does task manager say the most CPU cycles are going? i.e. what "image name" gets them?
ID: 4972 · Rating: 0
TPR_Mojo
Joined: 20 Sep 05
Posts: 4
Credit: 684,947
RAC: 0
Message 4973 - Posted: 2 Dec 2005, 19:09:03 UTC - in response to Message 4905.  

This is a bit off topic, but for some reason when Rosetta is running my CPU load is not at 100%, which is what I expected given that Seti@home uses all of the CPU. Is this normal, or have I set something wrong in my preferences? I have both Seti and Rosetta attached. When Seti runs the CPU load is 100%; when Rosetta runs it's near 0%.

Thanks for any help.


I have this occasionally too. I just aborted a unit which was sat at status "running" and CPU had been idling for at least an hour. Nothing of note in the output files to explain why, and no R@H tasks running. It was as if the task just disappeared but BOINC manager thought it was active.

When/if it happens again I'll take better notes and copy files etc.

ID: 4973 · Rating: 0
George
Joined: 27 Nov 05
Posts: 8
Credit: 634,319
RAC: 0
Message 4977 - Posted: 2 Dec 2005, 19:19:30 UTC

:( Sorry to hear about the service provider woes. I feel your pain - I have them too. I'm thinking about moving to Russia for the more enlightened customer service due to competition...

Right now, with Rosetta running (as shown in BOINC Mgr) on 78132, Task Manager shows 0% CPU usage. In the Task Manager process list, sorted by CPU load, Rosetta is near the top while memory for the process is ~20MB, but System Idle Process shows 90 - 100%. The next highest processes for CPU time are Task Manager and Konfabulator (have you tried it? It's great - konfabulator.com) at a few % each.

The CPU time in BOINC Mgr for Rosetta has not changed in the 10 or so minutes I've been watching. It certainly appears no work is being done and cycles are being wasted. One Seti process is preempted while another shows "Ready to report". If I click "Show Graphics", nothing happens, but I know this has worked for me before because I've seen the Rosetta graphics while a WU was running - very impressive.

I wonder if something is conflicting with Rosetta. It certainly seems to be stalled.
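
The check described above (watch the CPU time for a while and see whether it moves) can be written down as a small sketch. It is purely illustrative: `is_stalled` is a made-up helper, the samples are assumed to be noted by hand from the Work tab, and the 10-minute default window simply mirrors the time spent watching here.

```python
def is_stalled(samples, min_window_seconds=600):
    """Return True if CPU time did not advance over the observation window.

    samples: list of (wall_clock_seconds, cpu_time_seconds) pairs, oldest first,
    e.g. noted by hand from the Work tab. Hypothetical helper for illustration.
    """
    if len(samples) < 2:
        return False  # not enough data to judge
    (first_wall, first_cpu), (last_wall, last_cpu) = samples[0], samples[-1]
    if last_wall - first_wall < min_window_seconds:
        return False  # watched for too short a time
    return last_cpu <= first_cpu


# CPU time stuck at 5400 s across ten minutes of watching:
print(is_stalled([(0, 5400), (300, 5400), (600, 5400)]))  # True
# CPU time advancing normally:
print(is_stalled([(0, 5400), (300, 5690), (600, 5985)]))  # False
```

A task reported as 'Running' whose CPU time is flat over the whole window fits the stalled case, independent of what Task Manager shows for overall load.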
ID: 4977 · Rating: 0
Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,583,570
RAC: 3,169
Message 4980 - Posted: 2 Dec 2005, 19:48:14 UTC - in response to Message 4977.  

If I click "Show Graphics", nothing happens


Was the Rosetta result selected when you clicked it? This is really sounding like it's NOT working, which is confusing seeing results coming back...

ID: 4980 · Rating: 0
George
Joined: 27 Nov 05
Posts: 8
Credit: 634,319
RAC: 0
Message 4981 - Posted: 2 Dec 2005, 20:05:15 UTC

Okay, we may get to file this in the "shut up and reboot" Windows drawer. Just for kicks and giggles I restarted - hadn't done it for a couple of weeks, I think - and now everything seems fine. Rosetta's running, CPU is at 100%, graphics come up fine, everything looks peachy. I can see the CPU time for Rosetta (in BOINC Manager Work tab) counting up, but it seems to be counting up from the stalled number I saw before, implying that Rosetta really was making no progress while the CPU load was 0%.

I'll continue to monitor it. If it's all fixed by the reboot, I should see my average WU/day rate rise for this computer. But it does seem as if Windows can get into a state that starves Rosetta but not Seti without any alarms going off.

To answer your question, yes, I had the line that showed Rosetta running selected when I clicked "Show Graphics".

Lowfield, what happens if you reboot (if you can)?
ID: 4981 · Rating: 0
Tern
Joined: 25 Oct 05
Posts: 575
Credit: 4,583,570
RAC: 3,169
Message 4982 - Posted: 2 Dec 2005, 20:18:27 UTC - in response to Message 4981.  

Okay, we may get to file this in the "shut up and reboot" Windows drawer.


One of my favorite places (hey, I'm a Mac guy). But, but, but... what about the other computer? I'm so confused...

Looking back at your results, I do _not_ see _anything_ returned since shortly before you posted the first time. The other computer has been silent for longer than that. So if the problem had just appeared the day you posted, all my blathering about the times on your results was, um, blather. I need more sleep...

ID: 4982 · Rating: 0
Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 5013 - Posted: 3 Dec 2005, 5:50:08 UTC

Some of the older versions of BOINC had a habit of doing this after running for a long time and getting no contacts with the scheduler. Something about using up handles. Basically a resource leak (reason enough to not like C++). I forget which version clears this up. But, it used to be common on my PowerMac ...
ID: 5013 · Rating: 0
adrianxw
Joined: 18 Sep 05
Posts: 653
Credit: 11,694,291
RAC: 1,367
Message 5098 - Posted: 4 Dec 2005, 11:33:05 UTC
Last modified: 4 Dec 2005, 11:38:57 UTC

>>> (reason enough to not like C++)

Leaking handles is not a language-specific problem. You can leak handles in any language if you obtain them and don't free them. It is a design issue, or sloppy coding.
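
To illustrate the point in a garbage-collected language, here is a small Python sketch. It is not BOINC code, and the function names are made up. The first function obtains an OS-level file descriptor and never releases it; the second pairs every open with a close.

```python
import os
import tempfile


def leaky_read(path):
    """Obtain an OS file descriptor and never release it (a handle leak)."""
    fd = os.open(path, os.O_RDONLY)
    data = os.read(fd, 1024)
    return data, fd  # fd is returned only so the demo can show it is still open


def careful_read(path):
    """Obtain the descriptor and always release it, even if reading fails."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, 1024), fd
    finally:
        os.close(fd)


def is_open(fd):
    """Report whether the descriptor is still valid in this process."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


# Demo on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"work unit")
path = tmp.name

_, leaked = leaky_read(path)
_, released = careful_read(path)
print(is_open(leaked))    # True: still held until closed or the process exits
print(is_open(released))  # False
os.close(leaked)          # clean up the deliberate leak
os.remove(path)
```

Call `leaky_read` in a long-running loop and the process eventually runs out of descriptors, whatever language the loop is written in.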
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 5098 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org