Posts by Stevea

1) Message boards : Number crunching : Problems with Rosetta version 5.81 (Message 48599)
Posted 13 Nov 2007 by Stevea
Post:
Detaching now, way tooooo many errors for 0 credit..... Get your act together.

Every rig now has errors...

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=108626751

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=108637515

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=108635707

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=108622177

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=108641769

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=108632329

C'mon....
2) Message boards : Number crunching : Problems with Rosetta version 5.81 (Message 48584)
Posted 12 Nov 2007 by Stevea
Post:
Here is one that errored out after 2 runs, neither received credit:

MFR_SYMM_FOLD_AND_DOCK_RELAX_GB1_mutant_2286_18566

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=108650563

And for what it's worth, this is still one of the lowest granting projects, for time computed....

Might be time for a credit adjustment, I only crunch this now when other projects are down. Crunching this project for 2-5 times less credit than others is just unacceptable for a credit counter like me, trying to keep up with the other members of my team. Sorry but I cannot afford to buy 3-5 quads to crunch whatever I want, I have to pick and choose now.

And to all you cross project parity people that IMO screwed this project up....

Have you figured out this is impossible........ just MO

Will not be responding to the naysayers..JMO
3) Message boards : Number crunching : Welcome Back! (Message 45786)
Posted 9 Sep 2007 by Stevea
Post:
Welcome back?

Still not uploading any wu's, giving a can't find file error.

I have 4 rigs that have not uploaded a single wu yet. Last contact was on Sept. 4th.

I can see us not getting credit for the work that was completed before the servers went down, if the file in question was on the server and cannot be recovered.

I can see a lot of people not returning after finding how much credit other projects are giving out compared to rosetta.

I can say for sure one of my machines will not be returning, as it's getting over 100 ppd more on another project. Seems like the dreaded fair credit question will be brought back up after this fiasco has been resolved.

So much for the industry standard 99.9% uptime....for critical systems.
4) Message boards : Number crunching : Problems with Rosetta version 5.76 (Message 45143)
Posted 18 Aug 2007 by Stevea
Post:
This may be part of the problem.
"This process generated 10449364 decoys from -542858906 attempts"

Anders n


That IS what I was pointing out!
5) Message boards : Number crunching : Problems with Rosetta version 5.76 (Message 45115)
Posted 18 Aug 2007 by Stevea
Post:
OK, this is starting to get annoying.

2 more results from a BETA release: 1 froze the rig, the other has a validation error.

BETA 5.67

1st : cspA_BOINC_MFR_ABRELAX_CONTROL_1912_13731_0
http://boinc.bakerlab.org/rosetta/result.php?resultid=100116577

Just how does this happen? And no credit! And It's done!

DONE :: 1 starting structures 13908.6 cpu seconds
This process generated 10449364 decoys from -542858906 attempts
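A negative attempt count like the one above is what a counter looks like once it runs past the top of a signed 32-bit range. Whether Rosetta actually keeps this counter in a 32-bit integer is an assumption, not something confirmed in this thread, and the helper name `wrap_int32` is made up for illustration. A minimal Python sketch of the wraparound:

```python
def wrap_int32(n: int) -> int:
    """Reinterpret an arbitrary integer as a signed 32-bit value.

    Assumption: the counter behaves like a two's-complement 32-bit int.
    """
    return (n + 2**31) % 2**32 - 2**31


reported_attempts = -542858906

# Smallest non-negative count that would display as the reported value,
# IF the counter wrapped exactly once (it could have wrapped more often).
candidate_true_count = reported_attempts + 2**32

# 3752108390 wraps back to -542858906 under the 32-bit assumption.
assert wrap_int32(candidate_true_count) == reported_attempts
```

If that assumption holds, the log line points at a runaway loop or a bad counter rather than real work, which would fit a result that fails validation.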



2nd : LFAB__BOINC_TOP5CHEAT_JUMPRELAX_BARCODE-LFAB_-_1947_3886_1
http://boinc.bakerlab.org/rosetta/result.php?resultid=100072160

Ran for 23 seconds, and froze the rig...waiting for permission to contact rosetta...this really needs to be removed, if it fails, move to the next one.

DO NOT freeze the whole rig up overnight, waiting for permission to run the debugger, COME ON.. You can name the app the same name and just overwrite it, one time permission setting!

Just name them all : rosetta_windows_intelx86.exe or a different name for Linux, and Mac.

You are changing apps so fast I cannot keep up with the permissions on my firewall. This really needs to be changed!!!

I really do have better things to do with my time than babysit computers for you!
6) Message boards : Number crunching : Problems with Rosetta versions 5.72 and 5.73 (Message 44900)
Posted 12 Aug 2007 by Stevea
Post:
Found this wu stalled on one of my dual core rigs; it required a system reboot to get it going again. Good thing I was home (Saturday) or it would have been stuck all day. If this had happened on the same remote machine, it would have been pulled today.

http://boinc.bakerlab.org/rosetta/result.php?resultid=98752680

Another beta 5.73 wu

At least this was on a rig in my house, but still 2 problems with 5.73 BETA within a couple of days now.

These rigs have been problem free since that last wu that had 0 cpu time on July 26th, until 5.73. Guess it's time for BETA 5.74 huh

beta = bahhh
I think that's going in my sig.
7) Message boards : Number crunching : Problems with Rosetta versions 5.72 and 5.73 (Message 44848)
Posted 9 Aug 2007 by Stevea
Post:
This one froze up the rig, did not notice it for a day...

http://boinc.bakerlab.org/rosetta/result.php?resultid=97706475

Using Rosetta beta 5.73

This was frozen waiting for permission for it to contact rosetta. I don't think that it's a wise idea for it to wait for permission, it should just move on to the next one. You lost a day of computational time because this is a remote rig and I had to go and allow it to contact rosetta for the debugger to run.

If I continue to have problems with remote machines, I will pull them from the project. Releasing new apps every 2 weeks will cause the remote users to remove the remote machines from this project...if the apps are not stable.

beta = bahhh

I have better things to do with my time than to babysit computers for you.
8) Message boards : Number crunching : Problems with Rosetta versions 5.72 and 5.73 (Message 44420)
Posted 28 Jul 2007 by Stevea
Post:
Yep, something's not right: 0 cpu seconds on each rig.
9) Message boards : Number crunching : Problems with Rosetta versions 5.72 and 5.73 (Message 44269)
Posted 26 Jul 2007 by Stevea
Post:
This one never got started..
On either rig..

1d3z_non_ideal_BOINC_MFR_ABRELAX_PICKED_1850_5161_0

http://boinc.bakerlab.org/rosetta/result.php?resultid=94897692
10) Message boards : Number crunching : Problems with Rosetta version 5.68 and 5.70 (Message 44022)
Posted 21 Jul 2007 by Stevea
Post:
Here's one that never even got started on 2 different rigs.


http://boinc.bakerlab.org/rosetta/workunit.php?wuid=85064822
11) Message boards : Number crunching : PENDING work units (Message 42085)
Posted 12 Jun 2007 by Stevea
Post:
Me

Most server apps are disabled now...

Pending credit: 1,000.84
12) Message boards : Number crunching : Problems with Rosetta version 5.68 (Message 41834)
Posted 4 Jun 2007 by Stevea
Post:
Here is another from a different rig that has never had a w/u crash before.

Result ID 84162036
Name 1gidA_BOINC_MG_CHAINBREAK5_RNA_ABINITIO_RNA_CONTACT_RNA_LONG_RANGE_CONTACT_RNA_SASA-1gidA-_1734_87054_1
Workunit 73952005
Created 3 Jun 2007 21:01:08 UTC
Sent 3 Jun 2007 21:02:21 UTC
Received 4 Jun 2007 20:53:12 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 410071
Report deadline 13 Jun 2007 21:02:21 UTC
CPU time 6
stderr out

<core_client_version>5.8.16</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
trouble finding jump_templates_RNA_basepairs_v2.dat
ERROR:: Exit from: .read_paths.cc line: 360

</stderr_txt>
]]>

Validate state Invalid
Claimed credit 0.0264771778312024
Granted credit 0
application version 5.68
13) Message boards : Number crunching : Problems with Rosetta version 5.68 (Message 41831)
Posted 4 Jun 2007 by Stevea
Post:
Here is another w/u that crashed. This is from a different rig that has never had a w/u crash before.

http://boinc.bakerlab.org/rosetta/result.php?resultid=84070989

Result ID 84070989
Name 1n0u__TREEJUMP_ABRELAX__NEWRELAXFLAGS_TJTOP3_TOR_BARCODE_BARCODE__1769_4930_0
Workunit 75769002
Created 3 Jun 2007 12:14:12 UTC
Sent 3 Jun 2007 12:15:25 UTC
Received 4 Jun 2007 13:03:13 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 411074
Report deadline 13 Jun 2007 12:15:25 UTC
CPU time 6080.59375
stderr out

<core_client_version>5.8.16</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 10737
sin_cos_range ERROR: -1.#IND000 is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: 1.#QNAN00 is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: 1.#QNAN00 is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: -1.#IND000 is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: -1.#IND000 is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: -1.#IND000 is outside of [-1,+1] sin and cos value legal range

it goes on and on....
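For anyone reading the log above: `-1.#IND000` and `1.#QNAN00` are how the Windows C runtime prints NaN (not-a-number) values. A NaN fails every range comparison, and once one appears it spreads through later arithmetic, which would explain the message repeating. The sketch below is only an illustration in Python; the function name `in_sin_cos_range` is hypothetical and is not taken from the Rosetta source.

```python
import math


def in_sin_cos_range(x: float) -> bool:
    """Return True if x is a legal sine/cosine value, i.e. within [-1, +1].

    NaN compares false against everything, so it is always rejected.
    """
    return -1.0 <= x <= 1.0


nan = float("nan")

# A NaN never passes the range check...
assert not in_sin_cos_range(nan)

# ...and arithmetic on a NaN just produces another NaN, so every later
# check on values derived from it fails too.
derived = nan * 0.5 + 1.0
assert math.isnan(derived)
assert not in_sin_cos_range(derived)
```

So the long run of identical errors most likely traces back to one bad value early in the run, not to many separate faults.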



14) Message boards : Number crunching : Problems with Rosetta version 5.68 (Message 41784)
Posted 3 Jun 2007 by Stevea
Post:
This rig has not had a w/u crash in over a month.
Now its 1st 5.68 w/u crashed.

http://boinc.bakerlab.org/rosetta/result.php?resultid=83768981
15) Message boards : Number crunching : Claimed/granted credit (Message 39858)
Posted 25 Apr 2007 by Stevea
Post:
Due to the different nature of projects, I don't believe there will ever be true credit parity.

Certainly there is an element of competitiveness among crunchers. And I did get a "warm fuzzy feeling" when I broke 100k.

IMO, near parity is not too much to expect from a project, but exact parity is.

While Rosetta may be at the "bottom of the pack", I don't believe it fair to say that it is not near-parity with other projects.

Well, let's see... 375 ppd at Predictor. I said in an earlier post that if the credits fell to 100 ppd less than Predictor, the boxes would be pulled. That's over 25% less credit. Guess what, they're gone...

My understanding is that in terms of all Boinc projects, Rosetta has the fourth highest cumulative total of credits. If Rosetta is so far away from parity (on the under-granting end of the spectrum), would such achievement be possible?

Because up until last Nov. you received whatever credit you claimed. And inflated client software was allowed. And my 4 boxes that were getting 375 ppd ea on Predictor with the quorum of 3 system were getting over 825 ppd ea here with the inflated client software. My 5 highest daily totals are still from the 2 weeks before the credit system changed, topping out with a whopping 3,825, the five days I ran the inflated client software. It was that way from the beginning. I was only here for the last 2 weeks, then the new credit system was implemented. That's right, more than half of this project's credits came from inflated credit allowance. It's only been 6 months since the new credit system has been implemented. That's how.

SETI@Home established the standard for all BOINC projects by being the first to conceive and implement DC/Grid Computing. SETI saw fit to establish a fair credit system to reflect as honestly as possible a person's/machine's contribution to the project.

I think the second statement really reflects what the credit system concept is really all about.

If all the projects were forced into being within 10% of each other, there would be no reason to be having this conversation. I think that would satisfy everyone.

Personal choice.
Yep my personal choice is to not let my 4 rigs suffer the indignity of 25% less credit. If nothing is done to correct this, 3 more rigs will be leaving soon. If it was your goal to get me miffed and sidetracked you have done a wonderful job.

And you are not helping the project by trying to be the Rosetta cop. You just want to make me leave sooner... Sometimes it's best to know what you are talking about before stating facts that are untrue. Go ahead and make my day by stating some other facts that are not backed up by the project... and see me gone.

I believe this conversation was between me and Mod Sense.

I really did not want to get into a fight with a Bad Penguin. But you just had to interject yourself into the middle of it. Please stay out of it. I don't care what you think is fair or not, I am stating facts. And trying to get the project admins to see they have a problem and to do something about it before more damage is done.

16) Message boards : Number crunching : Claimed/granted credit (Message 39834)
Posted 24 Apr 2007 by Stevea
Post:
Well in the real world it just cost them 2 rigs. 3 if you include Keith's.

Right around 800 ppd.


800 x 365 = 292,000


17) Message boards : Number crunching : Claimed/granted credit (Message 39827)
Posted 24 Apr 2007 by Stevea
Post:
Stevea, I cannot dispute your experience with RAC over time. But I point out that you cannot have enough information to assert what caused that RAC change. Overall, according to these cross project stats comparing projects on the same machines, Rosetta credit per CPU second is comparable to other projects.


It was not my intention to assert that I had enough information as to what caused this recent drop of credit. Only that I have seen my RAC drop, as well as others.

I stated it was around the time that the core2 duo's and quads came out that the credit dropped. Is it a factor? I do not know. I know that the time frames match up, and offered it as a possible cause. I remember Who? joking that it was all his fault. I never said it was, just that the time frames matched.

My point was if you do not want people like Keith and me to feel short changed, then something needs to change. It takes a long time for someone getting 235 ppd to accumulate 70,000 + credits. He has proved what I said, that this project is low on the average credit given. Even your link states the same thing. There are many other projects out there that offer a lot more credit per second than here.

And I ask again. Instead of being in the bottom of the pack as far as credit awarded, why not adjust the system and be one of the top awarding projects. Is there a Boinc rule that says you cannot be in the top 10%? Or even the top project for credit allowed?

You and the project managers may not care that a single Keith left, because he saw his credit drop without explanation. You just had 5000 new members join in a large team. But if nothing changes and 5000 other Keith's leave, what did you gain?

I was just trying to say that it would be nice if 5000 new members joined, without another 5000 leaving. If it takes a look at how credit is awarded, and adjusted to be able to keep the 5000, now would be a good time to do it. And how would that be a bad thing?

I will only crunch disease related projects for personal reasons, but many others may not be so tunnel visioned in their reasons. That being said, it does not mean that I will crunch here for 0 credits, or if I feel that I am being short changed. Basic human nature. There are other disease related projects out there. If I was getting the same 375 ppd on the boxes that were pulled from here, 2 of them would have stayed running, and I would have had 5 boxes crunching instead of 3.

If the system remains the same, and no logical explanation is given as to why this has happened, and no changes are made, who knows, I could be the next Keith...


And Keith, I in no way intended to offend you. Your post just happened to be here at the right time. If I did offend you in any way I apologise.
18) Message boards : Number crunching : Claimed/granted credit (Message 39786)
Posted 23 Apr 2007 by Stevea
Post:
I'm sorry to disagree with you, but in the real world I have seen my credits for my 4 AMD Mobile 512k barton core crunchers drop 50 ppd each since Dec. I was told it would even out. It did not. Something changed, and did not return to the way it was before.

3 of them have been pulled, parted out and sold, and the other is just crunching until the new owner picks it up.

These were all water cooled NF motherboards running at 2585 to 2700 MHz. These benched out between a FX 53-55. Before Dec. they were getting over 320 ppd each. As of late they were getting around 275 ppd each. These same 4 rigs were getting 375 ppd average on Predictor before it shut down.

I don't care what anyone says, I saw my overall RAC drop by 200 ppd in the last 4-5 months. It did not even out. There was a thread about this and ASTRO's spreadsheets proved that the credit had indeed fallen. This is now one of the lowest average credit projects in Boinc.

They are now replaced by 3 x2 Opteron rigs. And if you go by benchmark scores, they are still getting short changed. Is it a coincidence that this happened at the same time as the Core2 Duos came out? Highly unlikely. I believe that the new credit system is slanted to Intel. That's fine, there are more Intels out there. But if you want to keep the AMD crowd around, I think someone should look at the way credit is awarded.

Why not be one of the higher average credit projects? A small tweak of the credit system, and bingo. Lots of new users that are looking for nothing but credit. The project benefits as a whole. And cross project parity is still kept intact.

And to all that reply and say nothing has changed for them, good for you. But I saw my credits fall by 100 ppd per rig from Predictor to this project. 50 when I moved here and 50 in the last 4-5 months.

Now these are my observations of my 4 rigs that ran 24/7. If you were not affected, good for you!
19) Message boards : Number crunching : Problems with Rosetta version 5.59 (Message 39701)
Posted 21 Apr 2007 by Stevea
Post:
Another problem series?

From 3 different machines!
I even pulled a single core mobile XP machine yesterday and replaced it with another dual core Opteron, and my RAC continues to fall?

Half also ran longer than they were supposed to.

This is not funny any more......

Result ID 74101016
Name 2K3E_HEXAMER_CHAPERONE_DOCKING_1682_27127_0
Workunit 66457982
Created 20 Apr 2007 11:05:32 UTC
Sent 20 Apr 2007 11:16:57 UTC
Received 21 Apr 2007 15:42:26 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 411074
Report deadline 30 Apr 2007 11:16:57 UTC
CPU time 25171.453125
stderr out

<core_client_version>5.8.8</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 21600
# random seed: 1655494
**********************************************************************
Rosetta score is stuck or going too long. Watchdog is ending the run!
Stuck at score -491.601 for 900 seconds
**********************************************************************
GZIP SILENT FILE: .aa2K3E.out

</stderr_txt>
]]>

Validate state Valid
Claimed credit 80.8845111635969
Granted credit 20
application version 5.59

And more

Result ID 74050154
Name 2K3E_HEXAMER_CHAPERONE_DOCKING_1682_20372_0
Workunit 66410697
Created 20 Apr 2007 4:31:00 UTC
Sent 20 Apr 2007 4:31:41 UTC
Received 21 Apr 2007 7:48:07 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 410071
Report deadline 30 Apr 2007 4:31:41 UTC
CPU time 7202.125
stderr out

<core_client_version>5.8.8</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 1662249
**********************************************************************
Rosetta score is stuck or going too long. Watchdog is ending the run!
Stuck at score -491.094 for 900 seconds
**********************************************************************
GZIP SILENT FILE: .aa2K3E.out

</stderr_txt>
]]>

Validate state Valid
Claimed credit 30.7057917005236
Granted credit 20
application version 5.59

Result ID 74057237
Name 2K3E_HEXAMER_CHAPERONE_DOCKING_1682_21330_0
Workunit 66417403
Created 20 Apr 2007 5:30:08 UTC
Sent 20 Apr 2007 5:42:32 UTC
Received 21 Apr 2007 8:19:17 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 410071
Report deadline 30 Apr 2007 5:42:32 UTC
CPU time 10784.53125
stderr out

<core_client_version>5.8.8</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 1661291
**********************************************************************
Rosetta score is stuck or going too long. Watchdog is ending the run!
Stuck at score -197.897 for 900 seconds
**********************************************************************
GZIP SILENT FILE: .aa2K3E.out

</stderr_txt>
]]>

Validate state Valid
Claimed credit 45.9791478418227
Granted credit 20
application version 5.59

Result ID 74015760
Name 2K3E_HEXAMER_CHAPERONE_DOCKING_1682_15702_0
Workunit 66378007
Created 19 Apr 2007 23:45:27 UTC
Sent 19 Apr 2007 23:46:56 UTC
Received 21 Apr 2007 2:01:05 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 438532
Report deadline 29 Apr 2007 23:46:56 UTC
CPU time 17990.421875
stderr out

<core_client_version>5.8.15</core_client_version>
<![CDATA[
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 1666919
**********************************************************************
Rosetta score is stuck or going too long. Watchdog is ending the run!
Stuck at score -487.342 for 900 seconds
**********************************************************************
GZIP SILENT FILE: .aa2K3E.out

</stderr_txt>
]]>

Validate state Valid
Claimed credit 80.0340997052485
Granted credit 20
application version 5.59
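The "Stuck at score ... for 900 seconds" lines in the results above suggest a watchdog that ends the run when the score has not changed for a fixed time. How the real Rosetta watchdog is written is not shown anywhere in this thread, so the following is only a rough Python sketch of that idea; the class name `ScoreWatchdog`, its parameters, and the exact stop rule are all assumptions.

```python
class ScoreWatchdog:
    """Hypothetical sketch: flag a run whose score stops changing.

    stuck_limit_s mirrors the 900 seconds printed in the logs; the rest
    of the design is assumed, not taken from the Rosetta source.
    """

    def __init__(self, stuck_limit_s: float = 900.0) -> None:
        self.stuck_limit_s = stuck_limit_s
        self.last_score = None      # most recent distinct score seen
        self.last_change_s = None   # time at which that score first appeared

    def update(self, score: float, now_s: float) -> bool:
        """Record a score sample; return True if the run should be ended."""
        if self.last_score is None or score != self.last_score:
            # Score moved (or this is the first sample): restart the timer.
            self.last_score = score
            self.last_change_s = now_s
            return False
        # Score unchanged: stop once it has sat still for the full limit.
        return now_s - self.last_change_s >= self.stuck_limit_s


wd = ScoreWatchdog()
assert wd.update(-491.601, now_s=0.0) is False    # first sample
assert wd.update(-491.601, now_s=899.0) is False  # still under the limit
assert wd.update(-491.601, now_s=900.0) is True   # stuck for 900 seconds
```

Under this reading, a work unit ended by the watchdog still exits with status 0 and validates, which matches the "Success" and "Valid" states listed in each result above.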
20) Message boards : Number crunching : Problems with Rosetta version 5.59 (Message 39628)
Posted 19 Apr 2007 by Stevea
Post:
No one is going to look into this are they?

Validate state Valid
Claimed credit 93.0845086817385
Granted credit 8.09930888889199
application version 5.59

This is one of those OH Well Too Bad..

I thought the new and improved credit system was supposed to give credit for work done.

Can someone please explain to me how an Opteron with 1 MB cache can work on a wu for over 6 hrs and get awarded 8 credits?

