Message boards : Number crunching : Discuss Rosetta Application Errors and Fixes (all Vers)
Author | Message |
---|---|
David E K Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
We found a bug, accidentally introduced in the 5.06 release, that causes the app to ignore the cpu run time preference. It was not caught while testing on Ralph because it is a subtle bug. The effect is that work units run according to the command argument "-nstruct X", where X is the number of structure predictions to be made, instead of the cpu run time user preference. This was the behavior before we introduced the run time preference. We will release a fix soon. Also, work units that error out will now be granted credit on a regular basis, so don't worry about credit lost due to errors; it will eventually be granted. Sorry for any inconvenience. |
Moderator9 Volunteer moderator Send message Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
CmW wrote (in part): "you know what im {edited} right now not even gonna lie..5.06 has given me nothing but problems so far" While I understand your frustration, when I look at your systems I only see 3 work units with errors. For these you will be granted credit as indicated in David Kim's post here. When this credit is awarded as a lump sum, your RAC will in fact be temporarily inflated to a higher level than would otherwise be the case. Moderator9 ROSETTA@home FAQ Moderator Contact |
David E K Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
After looking into the cpu run time bug more, it appears that it should only affect users who keep the app in memory, since the logic will pick up the run time preference if the app makes decoys and restarts. For those who leave the app in memory, the app will never restart. We will update the app later today with a fix. |
cMw Send message Joined: 24 Apr 06 Posts: 9 Credit: 14,036 RAC: 0 |
"After looking into the cpu run time bug more, it appears that it should only affect users who keep the app in memory, since the logic will pick up the run time preference if the app makes decoys and restarts. For those who leave the app in memory, the app will never restart. We will update the app later today with a fix." I hope so, and I'm guessing the 502 pts I lost will be granted as well :D ? BTW, when do you expect to have this updated? I'm losing time :( |
Jose Send message Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0 |
A subtle bug???? I just hope that my not so subtle harping and arguing helped in finding it. :) Since I am one of the ones that keeps the application in memory, it seems that subtle bug hit me with a vengeance. I want to make this crystal clear: As the self appointed Poster Boy of the Rosetta Bugs and its Official Bug Magnet (Yes I can still keep my sense of humor.), I have been vocal (maybe too fast for some tastes) and constant in pointing out the bugs. My level of posting has been so intense in the last days that I know it has come close to causing the creation in this Board of a thread called "Jose is @@tching again!!!!" but, I digress. The fact that I complained that much is an indication that: 1- I do believe this is a most worthwhile project. This is why all my computing resources, limited as they are, are totally committed to Rosetta. 2- I have to thank the moderator and all the participants in the exchanges of ideas and information I have been involved with for their civility, and their desire to help me as an individual and the community as a whole. Some of the ideas that have been proposed to me seem to have worked. (This said as I am crossing my fingers hoping the now-stable situation holds) 3- I would be unfair if I didn't recognize and applaud the massive effort the project scientists and the software and model developers have undertaken in addressing our concerns, in paying attention to our complaints and suggestions and, in finding solutions as fast as humanly possible without sacrificing the scientific integrity and validity of the data produced. Let us remember that the points and the competition among teams are but the frosting of a very important cake: scientific progress. The cake is way more important than the frosting. 
That they have done it in relatively fast time [Hey, not as fast as I would have wanted it but I am in dire need of attending a nice and large BBQ party.], speaks very well of their commitment not only to the core scientific project but to US as a community. That they continuously keep doing it, given our constant pressure (in addition to the extremely high pressure of their professional environments) speaks very loudly as to their personal and professional qualities. My thanks to all. My appreciation of all. Jose "This and no other is the root from which a Tyrant springs; when he first appears he is a protector." Plato |
dag Send message Joined: 16 Dec 05 Posts: 106 Credit: 1,000,020 RAC: 0 |
"We found a bug that was accidentally introduced in the 5.06 release that ignores the cpu run time preference." OH NO! The dreaded late-Friday-afternoon-code-release! dag --Finding aliens is cool, but understanding the structure of proteins is useful. |
rbpeake Send message Joined: 25 Sep 05 Posts: 168 Credit: 247,828 RAC: 0 |
"OH NO! The dreaded late-Friday-afternoon-code-release!" Maybe not so bad...the code was released late-Thursday, so technically I suppose this is the late-Friday-afternoon-fixed-code-release! :D It remains to be seen whether this will come to be known in the annals of Rosetta@home lore as the start of the dreaded bug-fix late-Friday-afternoon-code-release.....! ;D (Let's hope not, and let's reconvene on Monday....!) :D Regards, Bob P. |
David E K Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
"A subtle bug???? I just hope that my not so subtle harping and arguing helped in finding it. :) Since I am one of the ones that keeps the application in memory, it seems that subtle bug hit me with a vengeance." By subtle, I meant it didn't stand out on Ralph because two situations had to occur: 1. nstruct had to be high enough to produce a problematic run time, which I think it wasn't. 2. It only affects leave-in-memory users or users that only run R@h (if the app never gets preempted). Obviously not subtle to those it affected. Sorry Jose. We just updated the app with the fix. I hope this takes care of your recent issues. |
cMw Send message Joined: 24 Apr 06 Posts: 9 Credit: 14,036 RAC: 0 |
"A subtle bug???? I just hope that my not so subtle harping and arguing helped in finding it. :) Since I am one of the ones that keeps the application in memory, it seems that subtle bug hit me with a vengeance." Should it say in messages that I received an update or ? |
Moderator9 Volunteer moderator Send message Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
"A subtle bug???? I just hope that my not so subtle harping and arguing helped in finding it. :) Since I am one of the ones that keeps the application in memory, it seems that subtle bug hit me with a vengeance." It will appear as a new version number in the Task tab, a message for the download in the message tab, and a longer download if you watch the transfers tab. Then we all hope it will show up as the fix to the errors. Moderator9 ROSETTA@home FAQ Moderator Contact |
Jose Send message Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0 |
"Sorry Jose. We just updated the app with the fix. I hope this takes care of your recent issues." Hope you read message 14915 (especially point 3). No need to apologize. I am in awe of how committed all of you are. I will repeat it: 3- I would be unfair if I didn't recognize and applaud the massive effort the project scientists and the software and model developers have undertaken in addressing our concerns, in paying attention to our complaints and suggestions and, in finding solutions as fast as humanly possible without sacrificing the scientific integrity and validity of the data produced. Let us remember that the points and the competition among teams are but the frosting of a very important cake: scientific progress. The cake is way more important than the frosting. That they have done it in relatively fast time [Hey, not as fast as I would have wanted it but I am in dire need of attending a nice and large BBQ party.], speaks very well of their commitment not only to the core scientific project but to US as a community. That they continuously keep doing it, given our constant pressure (in addition to the extremely high pressure of their professional environments) speaks very loudly as to their personal and professional qualities. My thanks to all. My appreciation of all. "This and no other is the root from which a Tyrant springs; when he first appears he is a protector." Plato |
Jose Send message Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0 |
BTW what I said in post 14915 also applies to you. Thanks for your patience. "This and no other is the root from which a Tyrant springs; when he first appears he is a protector." Plato |
David E K Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
I did. Your comments are very much appreciated. |
Cureseekers~Kristof Send message Joined: 5 Nov 05 Posts: 80 Credit: 689,603 RAC: 0 |
So if I understand it well, the jobs with version 5.06 will not run for the number of hours set in the preferences, but for a fixed number of hours? How long is this? 3 hours? Do you recommend aborting jobs with engine 5.06? Member of Dutch Power Cows |
Jose Send message Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0 |
"So if I understand it well," IMHO... Don't do that; there is good science to be obtained from them. "This and no other is the root from which a Tyrant springs; when he first appears he is a protector." Plato |
Feet1st Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
"So if I understand it well," I believe the 5.06 WUs run for a fixed number of models, rather than the runtime preference. The work they produce is every bit as useful to the project as the 5.07 WUs. So, if they appear to be running normally (i.e. stepping through models in the graphics display), leave them to do their thing. Credits are issued based on the CPU time you put into the work, not number of models or number of WUs. So the longer they take crunching, the more credit they earn. Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
Rhiju Volunteer moderator Send message Joined: 8 Jan 06 Posts: 223 Credit: 3,546 RAC: 0 |
Feet1st is right! You can keep your 5.06 workunits in the queue... they don't have any known errors and will give us useful science results. They just may take shorter or longer than usual. "So if I understand it well," |
anders n Send message Joined: 19 Sep 05 Posts: 403 Credit: 537,991 RAC: 0 |
"Feet1st is right! You can keep your 5.06 workunits in the queue... they don't have any known errors and will give us useful science results. They just may take shorter or longer than usual." Hi Rhiju. When I look at my results, they are either shorter due to few nstruct or they finish according to my time setting. Anders n |
Jimi@0wned.org.uk Send message Joined: 10 Mar 06 Posts: 29 Credit: 335,252 RAC: 0 |
I've started getting lockups on my X2 3800+ system over the last week. I think it's a RAM problem, as it has got worse and worse over time; I can barely get 20 minutes loaded out of it now w/o a lockup. Running dual prime seems to stress the system less than Rosetta; certainly CPU temps are 2 or 3 degrees lower. So I've detached the system and am priming again, trying to nail exactly what's wrong. |
Jimi@0wned.org.uk Send message Joined: 10 Mar 06 Posts: 29 Credit: 335,252 RAC: 0 |
Is there something grossly different between 5.06 and 5.07, or are the current WUs just more demanding? Prime is still running after 90 minutes with lower temps than Rosetta, which seems to tell me that Rosetta has become extremely tough on the hardware. I know it's difficult to quantify, but is this acknowledged by those that know more about it? NB: this seems to be the wrong thread for my comments, but it's possible WU problems are going to be mistaken for hardware problems if the WUs really have got tougher. Where shall I take this? |
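
For readers trying to picture the 5.06 run time bug that David E K describes earlier in this thread, here is a minimal Python sketch. It is not the Rosetta source. The `Job` fields, the `models_completed` function, and the constant per-model cost are hypothetical names and simplifications, and the sketch assumes the app checks the run time preference once after each completed model.

```python
from dataclasses import dataclass


@dataclass
class Job:
    nstruct: int              # value of the "-nstruct X" argument (max models)
    seconds_per_model: float  # hypothetical: constant CPU cost per model
    runtime_pref: float       # user's cpu run time preference, in seconds


def models_completed(job: Job, honor_runtime_pref: bool) -> int:
    """Return how many models the job makes before it stops.

    honor_runtime_pref=False imitates the 5.06 behavior described in the
    thread for leave-in-memory users: only -nstruct ends the job.
    """
    cpu_seconds = 0.0
    done = 0
    while done < job.nstruct:
        cpu_seconds += job.seconds_per_model
        done += 1
        # Assumed check: stop at a model boundary once the preference is met.
        if honor_runtime_pref and cpu_seconds >= job.runtime_pref:
            break
    return done


if __name__ == "__main__":
    long_job = Job(nstruct=10, seconds_per_model=1800.0, runtime_pref=7200.0)
    print(models_completed(long_job, honor_runtime_pref=True))
    print(models_completed(long_job, honor_runtime_pref=False))
```

With a 2 hour preference and 30 minute models, the sketch stops after 4 models when the preference is honored and runs all 10 when it is ignored. A job with a small nstruct finishes early either way, which is consistent with what anders n reports above.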
©2024 University of Washington
https://www.bakerlab.org