Report Problems with Rosetta Version 5.07

Message boards : Number crunching : Report Problems with Rosetta Version 5.07

Moderator9
Volunteer moderator

Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 15275 - Posted: 2 May 2006, 12:31:25 UTC - in response to Message 15272.  
Last modified: 2 May 2006, 12:37:20 UTC


Jose,

You have a number of machines, and for the first time I have found the most recent connecting system. I notice that it is a quad-CPU system, but you have it set to use 1 CPU only. While it may be counterintuitive, have you tried setting it to use all four processors? This is a setting in your general preferences.



I have only one machine. And dear Lord, my machine has only one processor. If more than one machine appears, it is because of the quirks caused by the BOINC system when one has had to reattach to solve problems, and the absence of the merge function that would give the real picture.

As to the 4 processors... I really don't know what to say, but I doubt that something as obvious as a processor could be hidden when I inspected my motherboard.


This is the system I am looking at. In the CPU section it shows as a 4 CPU system, but under number of CPUs to use it says 1.

It is probable that your system will always use the same CPU for BOINC processing, and it is possible that that CPU is nearing failure and causing your problems.

EDIT: This is particularly true since your errors are a mix of illegal instructions and memory access violations.
Moderator9
ROSETTA@home FAQ
Moderator Contact
ID: 15275 · Rating: 0
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 15276 - Posted: 2 May 2006, 12:36:18 UTC - in response to Message 15274.  

I am calm. Right now detaching and removing BOINC is becoming the more rational of the possibilities. I will have my machine checked. But I need the frustration this is causing like I need a callus on my butt. I am sad. I thought I could do something useful but, alas, all I have been able to do is waste my time and yours.

Well, Jose, you must do what you must do. Remember, Boinc takes advantage of otherwise "unused" cycles. So in effect, you're choosing to waste those cycles, rather than allowing them to come to some benefit. You need to do what's best for you. Good luck in whatever you choose.

tony-
ID: 15276 · Rating: 0
Whl.

Joined: 29 Dec 05
Posts: 203
Credit: 275,802
RAC: 0
Message 15278 - Posted: 2 May 2006, 12:47:16 UTC - in response to Message 15275.  


This is the system I am looking at. In the CPU section it shows as a 4 CPU system, but under number of CPUs to use it says 1.

Pardon the intrusion, guys, but doesn't the 4 just mean it is a Pentium 4?

ID: 15278 · Rating: 0
Jose

Joined: 28 Mar 06
Posts: 820
Credit: 48,297
RAC: 0
Message 15279 - Posted: 2 May 2006, 12:49:16 UTC - in response to Message 15276.  
Last modified: 2 May 2006, 12:50:11 UTC

I am calm. Right now detaching and removing BOINC is becoming the more rational of the possibilities. I will have my machine checked. But I need the frustration this is causing like I need a callus on my butt. I am sad. I thought I could do something useful but, alas, all I have been able to do is waste my time and yours.

Well, Jose, you must do what you must do. Remember, Boinc takes advantage of otherwise "unused" cycles. So in effect, you're choosing to waste those cycles, rather than allowing them to come to some benefit. You need to do what's best for you. Good luck in whatever you choose.

tony-

Tony, the cycles are being wasted: most of the errors are producing waste.

And yes, I will do what I must do.

Take care

Jose

“This and no other is the root from which a Tyrant springs; when he first appears he is a protector.”
Plato
ID: 15279 · Rating: 0
Nightbird

Joined: 17 Sep 05
Posts: 70
Credit: 32,418
RAC: 0
Message 15280 - Posted: 2 May 2006, 12:51:13 UTC - in response to Message 15262.  
Last modified: 2 May 2006, 12:56:46 UTC





The CPU efficiency is a "guess" from Boincview and not necessarily true. If the WU is really stuck (which happens rarely), Rosetta will auto-terminate it after an hour and return the result.

The problem is that the WU 1di2 has been in this state for 2 days now.
Perhaps I must abort the WU?


ID: 15280 · Rating: 0
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 15281 - Posted: 2 May 2006, 12:59:42 UTC
Last modified: 2 May 2006, 13:00:14 UTC

Jose, How many puters do you really have? I see six IDENTICAL puters in your account and the benchmarks are all over the map.

1) Measured floating point speed 2009.88 million ops/sec
Measured integer speed 4014.11 million ops/sec

2) Measured floating point speed 2012.98 million ops/sec
Measured integer speed 4045.58 million ops/sec

3) Measured floating point speed 545.31 million ops/sec
Measured integer speed 3966.71 million ops/sec

4) Measured floating point speed 1276.07 million ops/sec
Measured integer speed 5114.47 million ops/sec

5) Measured floating point speed 1986.21 million ops/sec
Measured integer speed 3371.27 million ops/sec

6) Measured floating point speed 1154.1 million ops/sec
Measured integer speed 235.34 million ops/sec

If you just have one machine continuously being attached/detached, then you have an issue here.

Note: none of this conversation belongs in this thread; maybe a mod could move it.
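
To put a number on "all over the map", here is a minimal Python sketch that takes the six benchmark pairs listed above and computes the ratio between the highest and lowest value of each kind. The names `benchmarks` and `spread` are made up for illustration; only the figures come from this post.

```python
# Benchmark pairs copied from the six host entries listed above:
# (floating point, integer), both in million ops/sec.
benchmarks = [
    (2009.88, 4014.11),
    (2012.98, 4045.58),
    (545.31, 3966.71),
    (1276.07, 5114.47),
    (1986.21, 3371.27),
    (1154.1, 235.34),
]


def spread(values):
    """Ratio of the largest to the smallest value (1.0 means identical)."""
    return max(values) / min(values)


fp_spread = spread([fp for fp, _ in benchmarks])
int_spread = spread([integer for _, integer in benchmarks])

print(f"floating point: highest is {fp_spread:.2f}x the lowest")
print(f"integer:        highest is {int_spread:.2f}x the lowest")
```

If these six entries really are the same physical machine, a ratio far above 1 on either benchmark would be consistent with unstable hardware or with benchmarks being run under very different conditions, though the numbers alone cannot say which.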
ID: 15281 · Rating: 0
AMD_is_logical

Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 15282 - Posted: 2 May 2006, 13:01:08 UTC - in response to Message 15254.  

I took a screenshot of this WU "not working" (1di2) and another WU working (2tif).



Are you saying that the CPU time is not increasing, even though it's "running"? Is the idle process getting all the CPU time when this WU is "running"?

I've seen something like that months ago (but not recently). It happened when BOINC stopped the WU and ran the benchmark. For some reason the rosetta client didn't restart even though BOINC said it was "running". I was able to see this by looking through the "messages". Restarting BOINC got the WU going again.

This would be serious, because if the rosetta client isn't actually running then the watchdog won't be running either. Is there anything in the messages around the time that this WU stopped?
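
One way to check for the situation described here is to note the task's CPU time at intervals and see how long it has been flat. The sketch below is only an illustration in Python under stated assumptions: the samples are (wall-clock seconds, CPU seconds) pairs written down by hand or gathered by whatever tool you prefer, `seconds_since_cpu_advanced` is a hypothetical helper, and nothing in it talks to BOINC or Rosetta.

```python
from typing import Sequence, Tuple


def seconds_since_cpu_advanced(samples: Sequence[Tuple[float, float]]) -> float:
    """Return how many wall-clock seconds the CPU time has been flat.

    `samples` holds (wall_clock_seconds, cpu_seconds) pairs in
    chronological order. The result is 0.0 if the CPU time rose
    between the last two samples.
    """
    if not samples:
        return 0.0
    last_wall, last_cpu = samples[-1]
    flat_since = last_wall
    # Walk backwards until we find a sample with less CPU time,
    # i.e. the last moment the task was still making progress.
    for wall, cpu in reversed(samples):
        if cpu < last_cpu:
            break
        flat_since = wall
    return last_wall - flat_since


# Hypothetical readings taken every 5 minutes.
stuck = [(0, 100), (300, 250), (600, 250), (900, 250)]
healthy = [(0, 100), (300, 250), (600, 400)]

print(seconds_since_cpu_advanced(stuck))
print(seconds_since_cpu_advanced(healthy))
```

A task that reports "running" while this value keeps growing matches the symptom asked about above. What threshold counts as "stuck" is a judgment call and not something this thread specifies.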
ID: 15282 · Rating: 0
tralala

Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 15283 - Posted: 2 May 2006, 13:08:38 UTC - in response to Message 15280.  





The CPU efficiency is a "guess" from Boincview and not necessarily true. If the WU is really stuck (which happens rarely), Rosetta will auto-terminate it after an hour and return the result.

The problem is that the WU 1di2 has been in this state for 2 days now.
Perhaps I must abort the WU?


First of all I would exit BOINC and restart and see if the WU "revives". If that isn't the case I'd abort it.
ID: 15283 · Rating: 0
Jose

Joined: 28 Mar 06
Posts: 820
Credit: 48,297
RAC: 0
Message 15284 - Posted: 2 May 2006, 13:31:15 UTC - in response to Message 15281.  
Last modified: 2 May 2006, 13:33:17 UTC

Jose, How many puters do you really have? I see six IDENTICAL puters in your account and the benchmarks are all over the map.

1) Measured floating point speed 2009.88 million ops/sec
Measured integer speed 4014.11 million ops/sec

2) Measured floating point speed 2012.98 million ops/sec
Measured integer speed 4045.58 million ops/sec

3) Measured floating point speed 545.31 million ops/sec
Measured integer speed 3966.71 million ops/sec

4) Measured floating point speed 1276.07 million ops/sec
Measured integer speed 5114.47 million ops/sec

5) Measured floating point speed 1986.21 million ops/sec
Measured integer speed 3371.27 million ops/sec

6) Measured floating point speed 1154.1 million ops/sec
Measured integer speed 235.34 million ops/sec

If you just have one machine continuously being attached/detached, then you have an issue here.

Note: none of this conversation belongs in this thread; maybe a mod could move it.



I have only one. The other 5 are "ghosts" that could be corrected if the BOINC Merge function were in place, but it is not.

But that will no longer be a problem. See, I just got tired of checking and rechecking, and of calling the people who help me with my computer (at a cost to me), etc., etc. This is highly inefficient. It goes against my nature. It makes me feel useless. All the hassles have actually robbed me of the joy of participating.

I am exhausted. I am tired.

I just give up.

Exeunt

Jose
“This and no other is the root from which a Tyrant springs; when he first appears he is a protector.”
Plato
ID: 15284 · Rating: 0
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 15287 - Posted: 2 May 2006, 13:36:20 UTC

Jose, I looked through the results on each of those 6 puters. You have a very very high failure rate. I can see where it would be bothersome. I now really suspect you have an issue with your puter. All the software I linked you to is FREE. They're good tools. I'd start with the speedfan one and see what temps you have. I'd be willing to walk you through fixing it, step by step, if you're uncomfortable with looking in the case. If you haven't taken it apart in 6 months or more I'd suspect you just have a big build up of the fuzzies on the CPU (and everywhere else). I'm extending the offer anyway. You should start a new thread for this if you're interested.

tony
ID: 15287 · Rating: 0
Moderator9
Volunteer moderator

Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 15292 - Posted: 2 May 2006, 14:15:24 UTC - in response to Message 15278.  


This is the system I am looking at. In the CPU section it shows as a 4 CPU system, but under number of CPUs to use it says 1.

Pardon the intrusion, guys, but doesn't the 4 just mean it is a Pentium 4?

You are correct. It was early this morning, no coffee yet, and I was in a hurry. What can I say? I misread it.
Moderator9
ROSETTA@home FAQ
Moderator Contact
ID: 15292 · Rating: 0
Whl.

Joined: 29 Dec 05
Posts: 203
Credit: 275,802
RAC: 0
Message 15293 - Posted: 2 May 2006, 14:22:17 UTC

No probs, easily done and nobody got killed. ;-} :thumbsup:

ID: 15293 · Rating: 0
rbpeake
Joined: 25 Sep 05
Posts: 168
Credit: 247,828
RAC: 0
Message 15296 - Posted: 2 May 2006, 14:53:14 UTC - in response to Message 15284.  

I am exhausted. I am tired.

I just give up.

Exeunt

Jose

No doubt about it, hardware problems are a bitch! Maybe you know someone nearby who is one of those "computer gurus" and might be able to provide a fresh perspective? The folks who are whizzes can sometimes (but not always!) provide solutions not yet considered. And they are there on the spot, hands-on, which I have found can be very helpful, because long-distance help, although very well intentioned, is no substitute for someone having their hands directly on the machine. Just a thought! :)
Regards,
Bob P.
ID: 15296 · Rating: 0
Nightbird

Joined: 17 Sep 05
Posts: 70
Credit: 32,418
RAC: 0
Message 15315 - Posted: 2 May 2006, 19:04:16 UTC - in response to Message 15282.  

Are you saying that the CPU time is not increasing, even though it's "running"? Is the idle process getting all the CPU time when this WU is "running"?
............
Is there anything in the messages around the time that this WU stopped?

Exactly, the CPU time is not increasing.

I will check the messages.




ID: 15315 · Rating: 0
Nightbird

Joined: 17 Sep 05
Posts: 70
Credit: 32,418
RAC: 0
Message 15316 - Posted: 2 May 2006, 19:11:52 UTC - in response to Message 15283.  


..................

First of all I would exit BOINC and restart and see if the WU "revives". If that isn't the case I'd abort it.

I can't exit BOINC immediately because my machine is also running uFluids@home.
This application has no checkpoint. I have begun a "zerobubble" WU, and after 7h 34 min of CPU time the % done is at 39%.
So I need to wait.

For now I have suspended my Rosetta WU.



ID: 15316 · Rating: 0
cMw

Joined: 24 Apr 06
Posts: 9
Credit: 14,036
RAC: 0
Message 15350 - Posted: 2 May 2006, 22:05:08 UTC

Possible problem, as I'm boggled why this is happening: on WUs that take 10,000+ seconds (roughly 10,000 to 10,700), I used to get 92 credit points each, but recently I'm only getting 71 or 72. What gives?

https://boinc.bakerlab.org/rosetta/results.php?hostid=210073

^^ That's the computer; look at the last 10 results or so.
ID: 15350 · Rating: 0
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 15351 - Posted: 2 May 2006, 22:11:13 UTC - in response to Message 15350.  
Last modified: 2 May 2006, 22:33:22 UTC

Possible problem, as I'm boggled why this is happening: on WUs that take 10,000+ seconds (roughly 10,000 to 10,700), I used to get 92 credit points each, but recently I'm only getting 71 or 72. What gives?

https://boinc.bakerlab.org/rosetta/results.php?hostid=210073

^^ That's the computer; look at the last 10 results or so.

here's your benchmarks:

Measured floating point speed 3024.34 million ops/sec
Measured integer speed 9085.53 million ops/sec


Here's the formula for "claimed credit"

claimed credit = ([whetstone]+[dhrystone]) * wu_cpu_time_in_sec / 1728000

From this I can only assume you recently reran the benchmarks and they came out lower. The benchmarks run each time you install a new boinc client, and then every 5 days after that.

tony
you can manually rerun the benchmark as well.
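
As a worked check of the formula above, here is a small Python sketch that plugs in the benchmarks quoted in this post and the run times mentioned in the question (10,000 to 10,700 seconds). The function name `claimed_credit` is made up for illustration. The constant 1,728,000 is taken directly from the formula; it appears consistent with averaging the two benchmarks and granting 100 credits per day on a 1000 million ops/sec reference machine (2 × 1000 × 86,400 / 100), although this post does not derive it.

```python
def claimed_credit(whetstone_mops: float, dhrystone_mops: float,
                   cpu_seconds: float) -> float:
    """Claimed credit per the formula quoted in this post.

    whetstone_mops: measured floating point speed, million ops/sec
    dhrystone_mops: measured integer speed, million ops/sec
    cpu_seconds:    CPU time the work unit used, in seconds
    """
    return (whetstone_mops + dhrystone_mops) * cpu_seconds / 1_728_000


# Benchmarks quoted above for this host.
WHETSTONE = 3024.34
DHRYSTONE = 9085.53

low = claimed_credit(WHETSTONE, DHRYSTONE, 10_000)
high = claimed_credit(WHETSTONE, DHRYSTONE, 10_700)
print(f"claimed credit for 10,000 to 10,700 s: {low:.1f} to {high:.1f}")
```

With these inputs the result lands at roughly 70 to 75 credits, which is in line with the 71 or 72 reported in the question. Because credit scales linearly with the benchmark sum, a lower benchmark run lowers the claim for the same CPU time.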



ID: 15351 · Rating: 0
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 15354 - Posted: 2 May 2006, 22:47:47 UTC
Last modified: 2 May 2006, 22:48:17 UTC

My 24hr WUs (about 86,400 seconds) seem to range from 120-220 credits. The good news is that in my case there's no sudden marked decline with new versions or anything; it's just not a very accurate thing.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 15354 · Rating: 0
cMw

Joined: 24 Apr 06
Posts: 9
Credit: 14,036
RAC: 0
Message 15371 - Posted: 3 May 2006, 0:51:46 UTC - in response to Message 15351.  

Possible problem, as I'm boggled why this is happening: on WUs that take 10,000+ seconds (roughly 10,000 to 10,700), I used to get 92 credit points each, but recently I'm only getting 71 or 72. What gives?

https://boinc.bakerlab.org/rosetta/results.php?hostid=210073

^^ That's the computer; look at the last 10 results or so.

here's your benchmarks:

Measured floating point speed 3024.34 million ops/sec
Measured integer speed 9085.53 million ops/sec


Here's the formula for "claimed credit"

claimed credit = ([whetstone]+[dhrystone]) * wu_cpu_time_in_sec / 1728000

From this I can only assume you recently reran the benchmarks and they came out lower. The benchmarks run each time you install a new boinc client, and then every 5 days after that.

tony
you can manually rerun the benchmark as well.


Yeah dude, didn't even realize that :[ I just reran the benchmark because I had OC'd the CPU back up again, and my measured floating point speed jumped back up to 3779 :]

ID: 15371 · Rating: 0
Philip Hood

Joined: 11 Feb 06
Posts: 3
Credit: 35,986
RAC: 0
Message 15373 - Posted: 3 May 2006, 1:21:48 UTC

I still keep getting units that say they are running when they are not. No error messages in the log.
ID: 15373 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org