Discuss Rosetta Application Errors and Fixes (all Vers)

Message boards : Number crunching : Discuss Rosetta Application Errors and Fixes (all Vers)

Profile David E K
Volunteer moderator
Project administrator
Project developer
Project scientist

Send message
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 14905 - Posted: 28 Apr 2006, 19:15:27 UTC
Last modified: 28 Apr 2006, 19:17:24 UTC

We found a bug, accidentally introduced in the 5.06 release, that causes the app to ignore the CPU run time preference. It was not caught while testing on Ralph because it is a subtle bug. The effect is that work units run according to the command-line argument "-nstruct X", where X is the number of structure predictions to be made, instead of the CPU run time user preference. This was the behavior before we introduced the run time preference. We will release a fix soon.

Also, work units that error out will now be granted credit on a regular basis, so don't worry about credit lost due to errors; it will eventually be granted.

Sorry for any inconvenience.
ID: 14905 · Rating: -1 · rate: Rate + / Rate - Report as offensive    Reply Quote
Moderator9
Volunteer moderator

Send message
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 14907 - Posted: 28 Apr 2006, 19:45:37 UTC - in response to Message 14900.  
Last modified: 28 Apr 2006, 19:48:21 UTC

CmW Wrote (in part):

you know what im {edited} right now not even gonna lie..5.06 has given me nothing but problems so far

https://boinc.bakerlab.org/rosetta/result.php?resultid=18468821

https://boinc.bakerlab.org/rosetta/result.php?resultid=18452334

computation errors on BOTH OF THEM 5 hours on one and 8 HOURS on the other my rac has dropped 25 pts how am i supposed to contribute at all when ALL i have gotten so far is computation errors...


While I understand your frustration, when I look at your systems I only see 3 work units with errors. For these you will be granted credit as indicated in David Kim's post here. When this credit is awarded as a lump sum, your RAC will in fact be temporarily inflated to a higher level than would otherwise be the case.
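
To illustrate why a lump-sum grant temporarily inflates RAC, here is a minimal sketch of an exponentially decaying average. It assumes the one-week half-life BOINC normally uses for recent average credit; the function name `update_rac` and the sample numbers are hypothetical and are not taken from the project's server code.

```python
import math

HALF_LIFE_DAYS = 7.0  # assumption: RAC decays with a one-week half-life


def update_rac(rac, credit_granted, days_since_last_update):
    """Decay the old average, then blend in the newly granted credit rate."""
    # Weight of the old average after the elapsed time (halves every 7 days).
    weight = math.exp(-days_since_last_update * math.log(2) / HALF_LIFE_DAYS)
    # Newly granted credit expressed as credit per day.
    rate = credit_granted / days_since_last_update
    return rac * weight + (1.0 - weight) * rate


# Hypothetical host that normally earns 100 credits per day.
steady = update_rac(100.0, 100.0, 1.0)        # normal day: average stays put
with_lump = update_rac(100.0, 600.0, 1.0)     # same day plus a 500-credit lump sum
print(steady, with_lump)
```

Under these assumptions the normal day leaves the average unchanged, while the day with the lump sum pushes it above its usual level; it then decays back toward the host's real rate over the following weeks.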
Moderator9
ROSETTA@home FAQ
Moderator Contact
ID: 14907 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile David E K
Volunteer moderator
Project administrator
Project developer
Project scientist

Send message
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 14909 - Posted: 28 Apr 2006, 20:02:33 UTC

After looking into the CPU run time bug more, it appears that it should only affect users who keep the app in memory, since the logic will pick up the run time preference if the app makes decoys and restarts. For those who leave the app in memory, the app will never restart. We will update the app later today with a fix.
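
The behavior described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the Rosetta source: `WorkUnit`, `models_completed`, `preempt_every_s`, and the timing values are all hypothetical. The one thing it models from the post is that the run time preference is picked up only after a restart, so an app that stays in memory runs until `-nstruct` is exhausted.

```python
from dataclasses import dataclass


@dataclass
class WorkUnit:
    nstruct: int              # value of the "-nstruct X" argument
    seconds_per_model: float  # hypothetical CPU cost of one structure prediction


def models_completed(wu, runtime_pref_s, keep_in_memory, preempt_every_s=3600.0):
    """Simulate the 5.06 behavior as described in the post (assumed logic)."""
    elapsed = 0.0
    since_restart = 0.0
    done = 0
    pref_loaded = False  # the bug: the preference is not honored on first start
    while done < wu.nstruct:
        # The run time limit only applies once the preference has been loaded.
        if pref_loaded and elapsed >= runtime_pref_s:
            break
        elapsed += wu.seconds_per_model
        since_restart += wu.seconds_per_model
        done += 1
        # If the app is preempted and not kept in memory, it restarts
        # and picks up the run time preference.
        if not keep_in_memory and since_restart >= preempt_every_s:
            pref_loaded = True
            since_restart = 0.0
    return done, elapsed


wu = WorkUnit(nstruct=100, seconds_per_model=600.0)
print(models_completed(wu, 7200.0, keep_in_memory=True))   # (100, 60000.0)
print(models_completed(wu, 7200.0, keep_in_memory=False))  # (12, 7200.0)
```

With a 2-hour preference, the kept-in-memory run in this sketch finishes all 100 models, while the run that gets preempted and restarted stops once the preferred run time is reached.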
ID: 14909 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
cMw

Send message
Joined: 24 Apr 06
Posts: 9
Credit: 14,036
RAC: 0
Message 14913 - Posted: 28 Apr 2006, 20:18:42 UTC - in response to Message 14909.  

After looking into the CPU run time bug more, it appears that it should only affect users who keep the app in memory, since the logic will pick up the run time preference if the app makes decoys and restarts. For those who leave the app in memory, the app will never restart. We will update the app later today with a fix.

i hope so and im guessing the 502 pts i lost will be granted as well :D ?

btw when do you expect to have this updated by? cause im losing time :(
ID: 14913 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Jose

Send message
Joined: 28 Mar 06
Posts: 820
Credit: 48,297
RAC: 0
Message 14915 - Posted: 28 Apr 2006, 20:49:46 UTC

A subtle bug???? I just hope that my not so subtle harping and arguing helped in finding it. :) Since I am one of the ones that keeps the application in memory, it seems that subtle bug hit me with a vengeance.

I want to make this crystal clear:

As the self-appointed Poster Boy of the Rosetta Bugs and its Official Bug Magnet (yes, I can still keep my sense of humor), I have been vocal, constant, and (maybe for some tastes) too fast in pointing out the bugs. My level of posting has been so intense in the last few days that I know it has come close to prompting the creation on this board of a thread called "Jose is @@tching again!!!!" but, I digress.

The fact that I complained that much is an indication that:

1- I do believe this is a most worthwhile project. This is why all my computing resources, limited as they are, are totally committed to Rosetta.

2- I have to thank the moderator and all the participants in the exchanges of ideas and information I have been involved in for their civility and their desire to help me as an individual and the community as a whole. Some of the ideas that have been proposed to me seem to have worked. (This said as I am crossing my fingers, hoping the now-stable situation holds.)

3- I would be unfair if I didn't recognize and applaud the massive effort the project scientists and the software and model developers have undertaken in addressing our concerns, in paying attention to our complaints and suggestions, and in finding solutions as fast as humanly possible without sacrificing the scientific integrity and validity of the data produced. Let us remember that the points and the competition among teams are but the frosting of a very important cake: scientific progress. The cake is way more important than the frosting.

That they have done it in relatively fast time [Hey, not as fast as I would have wanted it, but I am in dire need of attending a nice and large BBQ party.] speaks very well of their commitment not only to the core scientific project but to US as a community.

That they continuously keep doing it, given our constant pressure (in addition to the extremely high pressure of their professional environments), speaks very loudly as to their personal and professional qualities.

My thanks to all.
My appreciation of all.

Jose

“This and no other is the root from which a Tyrant springs; when he first appears he is a protector.”
Plato
ID: 14915 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile dag
Avatar

Send message
Joined: 16 Dec 05
Posts: 106
Credit: 1,000,020
RAC: 0
Message 14916 - Posted: 28 Apr 2006, 21:00:10 UTC - in response to Message 14905.  
Last modified: 28 Apr 2006, 21:02:16 UTC

We found a bug that was accidentally introduced in the 5.06 release that ignores the cpu run time preference.
...

We will place a fix soon.

Sorry for any inconvenience.

...
We will update the app later today with a fix.



OH NO! The dreaded late-Friday-afternoon-code-release!
dag
--Finding aliens is cool, but understanding the structure of proteins is useful.
ID: 14916 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile rbpeake

Send message
Joined: 25 Sep 05
Posts: 168
Credit: 247,828
RAC: 0
Message 14917 - Posted: 28 Apr 2006, 21:29:40 UTC - in response to Message 14916.  

OH NO! The dreaded late-Friday-afternoon-code-release!

Maybe not so bad...the code was released late-Thursday, so technically I suppose this is the late-Friday-afternoon-fixed-code-release! :D

It remains to be seen whether this will come to be known in the annals of Rosetta@home lore as the start of the dreaded bug-fix late-Friday-afternoon-code-release.....! ;D (Let's hope not, and let's reconvene on Monday....!) :D

Regards,
Bob P.
ID: 14917 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile David E K
Volunteer moderator
Project administrator
Project developer
Project scientist

Send message
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 14918 - Posted: 28 Apr 2006, 22:06:41 UTC - in response to Message 14915.  

A subtle bug???? I just hope that my not so subtle harping and arguing helped in finding it. :) Since I am one of the ones that keeps the application in memory, it seems that subtle bug hit me with a vengeance.


By subtle, I meant it didn't stand out on ralph because two situations had to occur:

1. nstruct had to be high enough to present a problematic run time which it wasn't, I think.
2. it only affects leave in memory users or users that only run R@h (if app never gets preempted).

Obviously not subtle to those it affected. Sorry Jose. We just updated the app with the fix. I hope this takes care of your recent issues.
ID: 14918 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
cMw

Send message
Joined: 24 Apr 06
Posts: 9
Credit: 14,036
RAC: 0
Message 14920 - Posted: 28 Apr 2006, 22:21:59 UTC - in response to Message 14918.  

A subtle bug???? I just hope that my not so subtle harping and arguing helped in finding it. :) Since I am one of the ones that keeps the application in memory, it seems that subtle bug hit me with a vengeance.


By subtle, I meant it didn't stand out on ralph because two situations had to occur:

1. nstruct had to be high enough to present a problematic run time which it wasn't, I think.
2. it only affects leave in memory users or users that only run R@h (if app never gets preempted).

Obviously not subtle to those it affected. Sorry Jose. We just updated the app with the fix. I hope this takes care of your recent issues.

should it say in messages that i received an update or ?
ID: 14920 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Moderator9
Volunteer moderator

Send message
Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 14921 - Posted: 28 Apr 2006, 22:45:20 UTC - in response to Message 14920.  

A subtle bug???? I just hope that my not so subtle harping and arguing helped in finding it. :) Since I am one of the ones that keeps the application in memory, it seems that subtle bug hit me with a vengeance.


By subtle, I meant it didn't stand out on ralph because two situations had to occur:

1. nstruct had to be high enough to present a problematic run time which it wasn't, I think.
2. it only affects leave in memory users or users that only run R@h (if app never gets preempted).

Obviously not subtle to those it affected. Sorry Jose. We just updated the app with the fix. I hope this takes care of your recent issues.

should it say in messages that i received an update or ?


It will appear as a new version number in the Task tab, a message for the download in the message tab, and a longer download if you watch the transfers tab. Then we all hope it will show up as the fix to the errors.

Moderator9
ROSETTA@home FAQ
Moderator Contact
ID: 14921 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Jose

Send message
Joined: 28 Mar 06
Posts: 820
Credit: 48,297
RAC: 0
Message 14923 - Posted: 28 Apr 2006, 23:23:31 UTC - in response to Message 14918.  
Last modified: 28 Apr 2006, 23:29:01 UTC

Sorry Jose. We just updated the app with the fix. I hope this takes care of your recent issues.


Hope you read message 14915 (especially point 3). No need to apologize. I am in awe of how committed all of you are.

I will repeat it: 3- I would be unfair if I didn't recognize and applaud the massive effort the project scientists and the software and model developers have undertaken in addressing our concerns, in paying attention to our complaints and suggestions, and in finding solutions as fast as humanly possible without sacrificing the scientific integrity and validity of the data produced. Let us remember that the points and the competition among teams are but the frosting of a very important cake: scientific progress. The cake is way more important than the frosting.

That they have done it in relatively fast time [Hey, not as fast as I would have wanted it, but I am in dire need of attending a nice and large BBQ party.] speaks very well of their commitment not only to the core scientific project but to US as a community.

That they continuously keep doing it, given our constant pressure (in addition to the extremely high pressure of their professional environments), speaks very loudly as to their personal and professional qualities.

My thanks to all.
My appreciation of all.


“This and no other is the root from which a Tyrant springs; when he first appears he is a protector.”
Plato
ID: 14923 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Jose

Send message
Joined: 28 Mar 06
Posts: 820
Credit: 48,297
RAC: 0
Message 14924 - Posted: 28 Apr 2006, 23:25:50 UTC - in response to Message 14921.  

BTW what I said in post 14915 also applies to you. Thanks for your patience.

“This and no other is the root from which a Tyrant springs; when he first appears he is a protector.”
Plato
ID: 14924 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile David E K
Volunteer moderator
Project administrator
Project developer
Project scientist

Send message
Joined: 1 Jul 05
Posts: 1018
Credit: 4,334,829
RAC: 0
Message 14928 - Posted: 29 Apr 2006, 0:06:36 UTC

I did. Your comments are very much appreciated.
ID: 14928 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Cureseekers~Kristof

Send message
Joined: 5 Nov 05
Posts: 80
Credit: 689,603
RAC: 0
Message 14998 - Posted: 29 Apr 2006, 14:00:29 UTC

So if I understand it well,
the jobs with version 5.06 will not run for the number of hours set in the preferences, but a fixed number of hours?
How much is this? 3 hours?

Do you recommend aborting jobs with engine 5.06?
Member of Dutch Power Cows
ID: 14998 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Jose

Send message
Joined: 28 Mar 06
Posts: 820
Credit: 48,297
RAC: 0
Message 15003 - Posted: 29 Apr 2006, 14:59:16 UTC - in response to Message 14998.  

So if I understand it well,
the jobs with version 5.06 will not run for the number of hours set in the preferences, but a fixed number of hours?
How much is this? 3 hours?

Do you recommend aborting jobs with engine 5.06?


IMHO... Don't do that; there is good science to be obtained from them.

“This and no other is the root from which a Tyrant springs; when he first appears he is a protector.”
Plato
ID: 15003 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Feet1st
Avatar

Send message
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 15048 - Posted: 29 Apr 2006, 22:14:07 UTC - in response to Message 15003.  

So if I understand it well,
the jobs with version 5.06 will not run for the number of hours set in the preferences, but a fixed number of hours?
How much is this? 3 hours?

Do you recommend aborting jobs with engine 5.06?


IMHO... Don't do that; there is good science to be obtained from them.

I believe the 5.06 WUs run for a fixed number of models, rather than for the runtime preference. The work they produce is every bit as useful to the project as the 5.07 WUs. So, if they appear to be running normally (i.e. stepping through models in the graphics display), leave them to do their thing. Credits are issued based on the CPU time you put into the work, not the number of models or number of WUs. So the longer they take crunching, the more credit they earn.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
ID: 15048 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Rhiju
Volunteer moderator

Send message
Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 15057 - Posted: 30 Apr 2006, 2:23:54 UTC - in response to Message 15048.  

Feet1st is right! You can keep your 5.06 workunits in the queue... they don't have any known errors and will give us useful science results. They just may take shorter or longer than usual.

So if I understand it well,
the jobs with version 5.06 will not run for the number of hours set in the preferences, but a fixed number of hours?
How much is this? 3 hours?

Do you recommend aborting jobs with engine 5.06?


IMHO... Don't do that; there is good science to be obtained from them.

I believe the 5.06 WUs run for a fixed number of models, rather than for the runtime preference. The work they produce is every bit as useful to the project as the 5.07 WUs. So, if they appear to be running normally (i.e. stepping through models in the graphics display), leave them to do their thing. Credits are issued based on the CPU time you put into the work, not the number of models or number of WUs. So the longer they take crunching, the more credit they earn.


ID: 15057 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile anders n

Send message
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 15073 - Posted: 30 Apr 2006, 9:18:42 UTC - in response to Message 15057.  

Feet1st is right! You can keep your 5.06 workunits in the queue... they don't have any known errors and will give us useful science results. They just may take shorter or longer than usual.

Hi Rhiju

When I look at my results, they are either shorter due to a low nstruct or they finish according to my time setting.

Anders n
ID: 15073 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Jimi@0wned.org.uk

Send message
Joined: 10 Mar 06
Posts: 29
Credit: 335,252
RAC: 0
Message 15087 - Posted: 30 Apr 2006, 16:08:50 UTC

I've started getting lockups on my x2 3800+ system over the last week; I think it's a RAM problem as it has got worse and worse over time, I can barely get 20 minutes loaded out of it now w/o a lockup. Running dual prime seems to stress the system less than Rosetta, certainly CPU temps are 2 or 3 degrees lower. So I've detached the system and am priming again, trying to nail exactly what's wrong.
ID: 15087 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Jimi@0wned.org.uk

Send message
Joined: 10 Mar 06
Posts: 29
Credit: 335,252
RAC: 0
Message 15090 - Posted: 30 Apr 2006, 17:27:09 UTC
Last modified: 30 Apr 2006, 17:33:45 UTC

Is there something grossly different about 5.06 and 5.07, or are the current WUs just more demanding? Prime is still running after 90 minutes with lower temps than Rosetta, which seems to tell me that Rosetta has become extremely tough on the hardware. I know it's difficult to quantify, but is this acknowledged by those that know more about it?

NB: this seems to be the wrong thread for my comments, but it's possible WU problems are going to be mistaken for hardware problems if the WUs really have got tougher. Where shall I take this?
ID: 15090 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
©2024 University of Washington
https://www.bakerlab.org