Loads and loads of computing errors today

Message boards : Number crunching : Loads and loads of computing errors today

Webmaster Yoda
Joined: 17 Sep 05
Posts: 161
Credit: 162,253
RAC: 0
Message 1800 - Posted: 26 Oct 2005, 15:05:16 UTC - in response to Message 1795.  

I wonder if R@H causes some boxes to run hot, and that trashes the unit? Just a thought...


It may be the case sometimes, but having monitored this particular box, I can say it rarely gets above 50 Celsius, and the errors even happen within minutes of firing up the PC in the morning (I don't run all my PCs 24/7).

*** Join BOINC@Australia today ***
ID: 1800 · Rating: 0
Snake Doctor
Joined: 17 Sep 05
Posts: 182
Credit: 6,401,938
RAC: 0
Message 1808 - Posted: 26 Oct 2005, 21:11:30 UTC - in response to Message 1800.  

I wonder if R@H causes some boxes to run hot, and that trashes the unit? Just a thought...


It may be the case sometimes, but having monitored this particular box, I can say it rarely gets above 50 Celsius, and the errors even happen within minutes of firing up the PC in the morning (I don't run all my PCs 24/7).


Heat is not the issue. On the Mac we have both sensors and the tools to monitor them. None of the applications bring the temps above the normal operating range on any of my systems. The only one that even produces any spikes is CPDN, and the CPU runs at 130 F with it. The variable-speed fans do not even spin up for most applications. If temperature were the issue, it would have been an issue before now. Most of the systems on this thread had been running fine until the last 3 or 4 days.

So what one has to wonder is what has changed in those last three or four days. Well, we have at least two new classes of WU, we have a new version of the application (4.77), and we have the server-side upgrade to BOINC 5.x. With the exception of the upgraded application, most of the client machines have not been changed. Recently a few have upgraded to BOINC 5.x, but many of these upgrades were performed to address the WU failure issue.

So we are left with the application or the WUs. My suspicion is the WUs, but the app is not really off the hook. The two WUs of the old class that downloaded to my machine have not finished yet. If they make it without error, this might point a very large finger at the WUs. But as David pointed out, compiling for the dual G4 Mac is operating-system dependent and has to be done as an overt act, and this may not have been done properly.

It is mildly irritating that the recommended course of action from the project for the second-fastest machine I have is to take it offline for this project, particularly since I upgraded the operating system largely so it could run R@H.

Regards
Phil


We Must look for intelligent life on other planets as,
it is becoming increasingly apparent we will not find any on our own.
ID: 1808 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 1809 - Posted: 26 Oct 2005, 21:57:38 UTC

Sorry Phil, I asked David Anderson a while back about the possibility of OS X version-dependent applications to fix the gcc4 incompatibility issues. This would solve the problem and also make it easy to optimize for the specific CPU architecture, which the Mac gcc compiler allows. It would require a bit of development on their part (the BOINC developers), though, and I suspect they are very busy. You can suggest this to the Mac BOINC developer, Charlie Fenton.
ID: 1809 · Rating: 0
Snake Doctor
Joined: 17 Sep 05
Posts: 182
Credit: 6,401,938
RAC: 0
Message 1811 - Posted: 26 Oct 2005, 23:56:06 UTC - in response to Message 1809.  

Sorry Phil, I asked David Anderson a while back about the possibility of OS X version-dependent applications to fix the gcc4 incompatibility issues. This would solve the problem and also make it easy to optimize for the specific CPU architecture, which the Mac gcc compiler allows. It would require a bit of development on their part (the BOINC developers), though, and I suspect they are very busy. You can suggest this to the Mac BOINC developer, Charlie Fenton.


David,

Please don't misunderstand my frustration. R@H is one of the best-run projects I am involved with, if not the best. Certainly the support of the user community is second to none. It just seems that it wasn't broken before and now it is. And it is a shame that so many of the affected machines are the dual processors. I would think this would have some effect on the speed at which the science is produced. Oh well, maybe version 4.78?

Regards
Phil

We Must look for intelligent life on other planets as,
it is becoming increasingly apparent we will not find any on our own.
ID: 1811 · Rating: 0
The Pirate
Joined: 22 Sep 05
Posts: 20
Credit: 7,090,933
RAC: 0
Message 1812 - Posted: 27 Oct 2005, 0:02:09 UTC

Holy cow, I hadn't checked for a few days. Several of my Linux boxes are throwing up all over the place with R@H errors, both multiprocessor and single-processor. All my Windows computers seem to be running OK.

Question is, should I detach from R@H and wait for it to get fixed, or let it run so R@H can "test"? One is running the 5.2.2 client, but I'm not at them right now. I'll check later tonight.

ID: 1812 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 1814 - Posted: 27 Oct 2005, 0:20:43 UTC - in response to Message 1812.  

Holy cow, I hadn't checked for a few days. Several of my Linux boxes are throwing up all over the place with R@H errors, both multiprocessor and single-processor. All my Windows computers seem to be running OK.

Question is, should I detach from R@H and wait for it to get fixed, or let it run so R@H can "test"? One is running the 5.2.2 client, but I'm not at them right now. I'll check later tonight.


Try upgrading the 4.19 clients if you can. They seem to be giving the errors.
ID: 1814 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 1815 - Posted: 27 Oct 2005, 0:29:58 UTC - in response to Message 1811.  



David,

Please don't misunderstand my frustration. R@H is one of the best-run projects I am involved with, if not the best. Certainly the support of the user community is second to none. It just seems that it wasn't broken before and now it is. And it is a shame that so many of the affected machines are the dual processors. I would think this would have some effect on the speed at which the science is produced. Oh well, maybe version 4.78?

Regards
Phil


I hope we can fix this in the next version. Maybe you can help us test it out when the time comes. The previous app that worked was compiled using the Mac's gcc4 compiler, so it only worked on OS X 10.4. That app was producing lots of errors for users who were trying to run it on 10.3.9, so the success rate was low for this platform. With the new app the success rate has improved dramatically, to 99.25% from less than 50%, so it has definitely helped. I don't know if the errors are happening on all dual G4s out there.
ID: 1815 · Rating: 0
Snake Doctor
Joined: 17 Sep 05
Posts: 182
Credit: 6,401,938
RAC: 0
Message 1816 - Posted: 27 Oct 2005, 3:19:25 UTC - in response to Message 1815.  



David,

Please don't misunderstand my frustration. R@H is one of the best-run projects I am involved with, if not the best. Certainly the support of the user community is second to none. It just seems that it wasn't broken before and now it is. And it is a shame that so many of the affected machines are the dual processors. I would think this would have some effect on the speed at which the science is produced. Oh well, maybe version 4.78?

Regards
Phil


I hope we can fix this in the next version. Maybe you can help us test it out when the time comes. The previous app that worked was compiled using the Mac's gcc4 compiler, so it only worked on OS X 10.4. That app was producing lots of errors for users who were trying to run it on 10.3.9, so the success rate was low for this platform. With the new app the success rate has improved dramatically, to 99.25% from less than 50%, so it has definitely helped. I don't know if the errors are happening on all dual G4s out there.


Just before you released the version for 10.3.9, I upgraded to 10.4.2. For a few weeks the system ran fine. Even after the release of the 10.3.9 version I had no problems. Then out of nowhere this started. I would be happy to help you test. I have done testing for the current release version of E@H and a VERY fast S@H AltiVec-optimized application. The AltiVec code really makes the Mac sing. S@H WUs went from around 7 hours each to around 2 hours each. So if your guy can make use of AltiVec, your current 2-hour WUs might become very fast indeed.

Just let me know when you are ready to test. There are a few other people working on your project whom I have seen on other alpha and beta tests and who might also be willing to help out.

Regards
Phil


We Must look for intelligent life on other planets as,
it is becoming increasingly apparent we will not find any on our own.
ID: 1816 · Rating: 0
Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 1818 - Posted: 27 Oct 2005, 4:46:14 UTC

I just double checked, my dual G5 has no errors for 116 results. So, I don't know what to make of that ... I am running "Tiger" ... again, not sure what that may imply.
ID: 1818 · Rating: 0
Snake Doctor
Joined: 17 Sep 05
Posts: 182
Credit: 6,401,938
RAC: 0
Message 1819 - Posted: 27 Oct 2005, 5:10:00 UTC - in response to Message 1818.  

I just double checked, my dual G5 has no errors for 116 results. So, I don't know what to make of that ... I am running "Tiger" ... again, not sure what that may imply.


The dual G5 I have is running ok as well.

Regards
Phil


We Must look for intelligent life on other planets as,
it is becoming increasingly apparent we will not find any on our own.
ID: 1819 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 1820 - Posted: 27 Oct 2005, 5:13:39 UTC

What about dual G4?
ID: 1820 · Rating: 0
Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 1821 - Posted: 27 Oct 2005, 5:59:29 UTC - in response to Message 1820.  

What about dual G4?

Sneaking a peek at his account: his G5 has only 4-5 errors; the G4 is almost all bad.

Seems like the compile has a problem with the G4 for one reason or another ... though it does look like there were a couple successes. So, I am puzzled ...
ID: 1821 · Rating: 0
Doug Worrall
Joined: 19 Sep 05
Posts: 60
Credit: 58,445
RAC: 0
Message 1831 - Posted: 27 Oct 2005, 11:53:48 UTC

Hello,
I am running a Linux box. As of today I was getting errors from the Relaxmode WUs also, so now I suspend them all. No more 25-minute WUs; the average size is 1 hour. Credits are just as good as for the 3-hour WUs of the past.
Good luck all
Happy crunching
Doug
Boinc Synergy
ID: 1831 · Rating: 0
doc :)
Joined: 4 Oct 05
Posts: 47
Credit: 1,106,102
RAC: 0
Message 1883 - Posted: 28 Oct 2005, 18:40:59 UTC
Last modified: 28 Oct 2005, 18:42:10 UTC

The new 1n0u* units seem to be a problem for my Athlon XP machine: 4 out of the 6 processed so far ended with client errors (the 1hz6A* ones were no problem here). My old, slow Duron finished all 4 it has tried so far without problems, though.
ID: 1883 · Rating: 0
Snake Doctor
Joined: 17 Sep 05
Posts: 182
Credit: 6,401,938
RAC: 0
Message 1884 - Posted: 28 Oct 2005, 18:49:14 UTC - in response to Message 1820.  

What about dual G4?

All errors. The WU type does not seem to matter.

Regards
Phil

We Must look for intelligent life on other planets as,
it is becoming increasingly apparent we will not find any on our own.
ID: 1884 · Rating: 0
Snake Doctor
Joined: 17 Sep 05
Posts: 182
Credit: 6,401,938
RAC: 0
Message 1885 - Posted: 28 Oct 2005, 18:55:07 UTC - in response to Message 1821.  

What about dual G4?

Sneaking a peek at his account: his G5 has only 4-5 errors; the G4 is almost all bad.

Seems like the compile has a problem with the G4 for one reason or another ... though it does look like there were a couple successes. So, I am puzzled ...


I have two G4s: one is a dual processor, one is a PowerBook. The PowerBook does just fine. The few G5 errors are not related to the present problem. Also, the WUs are sorted by result number. On the second page of the stats list I have two WUs that have been on the dual G4 for a day or so, but they are also erroring out. At least in my case, the problem is almost entirely a dual G4 issue.

Regards
phil

We Must look for intelligent life on other planets as,
it is becoming increasingly apparent we will not find any on our own.
ID: 1885 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 1886 - Posted: 28 Oct 2005, 19:10:10 UTC

Is everyone running dual G4s having problems?
ID: 1886 · Rating: 0
Snake Doctor
Joined: 17 Sep 05
Posts: 182
Credit: 6,401,938
RAC: 0
Message 1887 - Posted: 28 Oct 2005, 19:12:41 UTC - in response to Message 1885.  
Last modified: 28 Oct 2005, 19:18:11 UTC

What about dual G4?

Sneaking a peek at his account: his G5 has only 4-5 errors; the G4 is almost all bad.

Seems like the compile has a problem with the G4 for one reason or another ... though it does look like there were a couple successes. So, I am puzzled ...


I have two G4s: one is a dual processor, one is a PowerBook. The PowerBook does just fine. The few G5 errors are not related to the present problem. Also, the WUs are sorted by result number. On the second page of the stats list I have two WUs that have been on the dual G4 for a day or so, but they are also erroring out. At least in my case, the problem is almost entirely a dual G4 issue.

The G5 results are here
Regards
phil

Now I am getting a lot of Phantom WUs on the G5 dual as well. The ones I actually have on my system are running ok, but the list shows a lot of WUs I just do not have. All of them are listed as in progress, and they have all been given to other systems that never reported results before they showed up in my list. I suspect that these WUs do not in fact exist even on your server.

Regards
Phil


We Must look for intelligent life on other planets as,
it is becoming increasingly apparent we will not find any on our own.
ID: 1887 · Rating: 0
The Pirate
Joined: 22 Sep 05
Posts: 20
Credit: 7,090,933
RAC: 0
Message 1893 - Posted: 28 Oct 2005, 21:18:26 UTC - in response to Message 1814.  

Holy cow, I hadn't checked for a few days. Several of my Linux boxes are throwing up all over the place with R@H errors, both multiprocessor and single-processor. All my Windows computers seem to be running OK.

Question is, should I detach from R@H and wait for it to get fixed, or let it run so R@H can "test"? One is running the 5.2.2 client, but I'm not at them right now. I'll check later tonight.


Try upgrading the 4.19 clients if you can. They seem to be giving the errors.


It's a bit early, but upgrading to the 5.2.4 Linux client appears to have corrected the problem.


ID: 1893 · Rating: 0
Doug Worrall
Joined: 19 Sep 05
Posts: 60
Credit: 58,445
RAC: 0
Message 1895 - Posted: 28 Oct 2005, 23:18:45 UTC - in response to Message 1893.  

Holy cow, I hadn't checked for a few days. Several of my Linux boxes are throwing up all over the place with R@H errors, both multiprocessor and single-processor. All my Windows computers seem to be running OK.

Question is, should I detach from R@H and wait for it to get fixed, or let it run so R@H can "test"? One is running the 5.2.2 client, but I'm not at them right now. I'll check later tonight.


Try upgrading the 4.19 clients if you can. They seem to be giving the errors.


It's a bit early, but upgrading to the 5.2.4 Linux client appears to have corrected the problem.

Hello Jim,
Glad to hear the new 5.2.4 Linux client is running well. If you do not mind me asking, which OS (*nix) are you running? I am running off the live CD, updated through Synaptic and KDE to LINUXOS 2.6.1; all these numbers are sending me for a spin. LOL
I found that my OLD 4.5 BOINC client is getting some special WUs. I have not had one error with Rosetta, aside from my own mistakes, for days. I use one PC, so I pause the rest and let Rosetta crunch away (being *nix users, no rebooting); I could let it run all day. I added 1 stick before signing on to Rosetta and fried the motherboard, yada yada, etc. #300 later, I am running Rosetta on a rock-solid OS with plenty of RAM.
Good luck with your quest to get rid of the WU errors. May they all be successes, for all.
Happy Crunching
Doug
ID: 1895 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org