Discuss Rosetta Application Errors and Fixes (all Vers)

Author	Message
David E K Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0	Message 14905 - Posted: 28 Apr 2006, 19:15:27 UTC Last modified: 28 Apr 2006, 19:17:24 UTC
We found a bug that was accidentally introduced in the 5.06 release that ignores the cpu run time preference. It was not caught while testing on Ralph because it is a subtle bug. The effect is that work units will run according to the command argument "-nstruct X", where X is the number of structure predictions to be made, instead of the cpu run time user preference. This was the behavior before we introduced the run time preference. We will place a fix soon.
Also, work units that error out will be granted credit regularly now, so don't worry about credit lost due to errors because it will eventually be granted. Sorry for any inconvenience.

Moderator9 Volunteer moderator Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0	Message 14907 - Posted: 28 Apr 2006, 19:45:37 UTC - in response to Message 14900. Last modified: 28 Apr 2006, 19:48:21 UTC
cMw wrote (in part):
[quote]you know what im {edited} right now not even gonna lie..5.06 has given me nothing but problems so far https://boinc.bakerlab.org/rosetta/result.php?resultid=18468821 https://boinc.bakerlab.org/rosetta/result.php?resultid=18452334 computation errors on BOTH OF THEM 5 hours on one and 8 HOURS on the other my rac has dropped 25 pts how am i supposed to contribute at all when ALL i have gotten so far is computation errors...[/quote]
While I understand your frustration, when I look at your systems I see only 3 work units with errors. You will be granted credit for these, as indicated in David Kim's post. When this credit is awarded as a lump sum, your RAC will in fact be temporarily inflated to a higher level than would otherwise be the case.
Moderator9 ROSETTA@home FAQ Moderator Contact
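A rough illustration of why a lump-sum grant can inflate RAC, assuming RAC behaves like an exponentially decaying average of credit per day. The function name `update_rac`, the one-week half-life, and the credit figures are assumptions made for this sketch; they are not taken from the project's server code, which may differ in detail.

```python
import math

# Assumption: a one-week half-life; the value used by the real server may differ.
HALF_LIFE_DAYS = 7.0


def update_rac(rac, credit, days_since_last_update):
    """Return the new average after `credit` is granted.

    Hypothetical model: the old average decays exponentially, and the new
    credit is blended in as a rate (credit per day since the last update).
    days_since_last_update must be greater than zero.
    """
    weight = math.exp(-days_since_last_update * math.log(2) / HALF_LIFE_DAYS)
    return rac * weight + (1.0 - weight) * (credit / days_since_last_update)


# A host that steadily earns 100 credits per day holds an average of about 100.
steady = update_rac(100.0, 100.0, 1.0)

# A 500-credit lump sum granted one hour after the previous update is treated
# as a very high rate for that hour, so the average jumps.
boosted = update_rac(steady, 500.0, 1.0 / 24.0)

# Seven more ordinary days pull the average back toward 100.
later = boosted
for _ in range(7):
    later = update_rac(later, 100.0, 1.0)

print(round(steady), round(boosted), round(later))
```

In this model the jump is temporary: once ordinary daily credit resumes, the average drifts back toward the host's usual rate.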

David E K Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0	Message 14909 - Posted: 28 Apr 2006, 20:02:33 UTC
After looking into the cpu run time bug more, it appears that it should only affect users who keep the app in memory, since the logic will pick up the run time preference if the app makes decoys and restarts. For those who leave the app in memory, the app will never restart. We will update the app later today with a fix.
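The behavior described in this post can be sketched as a loop with two stopping conditions. This is a minimal illustration, not the Rosetta source: `models_made` and its parameters are hypothetical names, and the assumption that the run time preference is only picked up after a restart is taken from the description above.

```python
def models_made(nstruct, runtime_pref_s, seconds_per_model, app_restarts):
    """Count the models (decoys) a work unit produces before it stops.

    Hypothetical model of the 5.06 behavior:
    - the -nstruct limit always applies;
    - the run time preference applies only once the app has restarted
      after making a model, which never happens if it stays in memory.
    """
    made = 0
    elapsed = 0.0
    pref_loaded = False  # in this sketch, the preference is unknown until a restart
    while made < nstruct:
        if pref_loaded and elapsed >= runtime_pref_s:
            break  # run time preference reached
        elapsed += seconds_per_model
        made += 1
        if app_restarts:
            pref_loaded = True  # a restart picks up the preference
    return made


# 3-hour preference, one model per hour, -nstruct 10:
print(models_made(10, 3 * 3600, 3600, app_restarts=True))   # stops at the preference
print(models_made(10, 3 * 3600, 3600, app_restarts=False))  # runs all nstruct models
```

With a low `nstruct`, both cases stop early at the model limit, so the work unit finishes in less time than the preference asks for.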

cMw Joined: 24 Apr 06 Posts: 9 Credit: 14,036 RAC: 0	Message 14913 - Posted: 28 Apr 2006, 20:18:42 UTC - in response to Message 14909.
[quote]After looking into the cpu run time bug more, it appears that it should only affect users who keep the app in memory, since the logic will pick up the run time preference if the app makes decoys and restarts. For those who leave the app in memory, the app will never restart. We will update the app later today with a fix.[/quote]
i hope so, and i'm guessing the 502 pts i lost will be granted as well :D ? btw, when do you expect to have this updated by? cause i'm losing time :(

Jose Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0	Message 14915 - Posted: 28 Apr 2006, 20:49:46 UTC
A subtle bug???? I just hope that my not so subtle harping and arguing helped in finding it. :) Since I am one of the ones that keeps the application in memory, it seems that subtle bug hit me with a vengeance.
I want to make this crystal clear: As the self-appointed Poster Boy of the Rosetta Bugs and its Official Bug Magnet (yes, I can still keep my sense of humor), I have been vocal in pointing out the bugs: vocal, constant and (maybe for some tastes) too fast in pointing them out. My level of posting has been so intense in the last days that I know it has come close to prompting the creation on this Board of a thread called "Jose is @@tching again!!!!", but I digress.
The fact that I complained that much is an indication that:
1- I do believe this is a most worthwhile project. This is why all my computing resources, limited as they are, are totally committed to Rosetta.
2- I have to thank the moderator and all the participants in the exchanges of ideas and information I have been involved with for their civility, and their desire to help me as an individual and the community as a whole. Some of the ideas that have been proposed to me seem to have worked. (This said as I am crossing my fingers, hoping the now-stable situation holds.)
3- I would be unfair if I didn't recognize and applaud the massive effort the project scientists and the software and model developers have undertaken in addressing our concerns, in paying attention to our complaints and suggestions, and in finding solutions as fast as humanly possible without sacrificing the scientific integrity and validity of the data produced. Let us remember that the points and the competition among teams are but the frosting of a very important cake: scientific progress. The cake is way more important than the frosting.
That they have done it in relatively fast time [Hey, not as fast as I would have wanted, but I am in dire need of attending a nice and large BBQ party.] speaks very well of their commitment not only to the core scientific project but to US as a community. That they continuously keep doing it, given our constant pressure (in addition to the extremely high pressure of their professional environments), speaks very loudly as to their personal and professional qualities. My thanks to all. My appreciation of all.
Jose
“This and no other is the root from which a Tyrant springs; when he first appears he is a protector.” Plato

dag Joined: 16 Dec 05 Posts: 106 Credit: 1,000,020 RAC: 0	Message 14916 - Posted: 28 Apr 2006, 21:00:10 UTC - in response to Message 14905. Last modified: 28 Apr 2006, 21:02:16 UTC
[quote]We found a bug that was accidentally introduced in the 5.06 release that ignores the cpu run time preference. ... We will place a fix soon. Sorry for any inconvenience. ... We will update the app later today with a fix.[/quote]
OH NO! The dreaded late-Friday-afternoon-code-release!
dag --Finding aliens is cool, but understanding the structure of proteins is useful.

rbpeake Joined: 25 Sep 05 Posts: 168 Credit: 247,828 RAC: 0	Message 14917 - Posted: 28 Apr 2006, 21:29:40 UTC - in response to Message 14916.
[quote]OH NO! The dreaded late-Friday-afternoon-code-release![/quote]
Maybe not so bad...the code was released late-Thursday, so technically I suppose this is the late-Friday-afternoon-fixed-code-release! :D It remains to be seen whether this will come to be known in the annals of Rosetta@home lore as the start of the dreaded bug-fix late-Friday-afternoon-code-release.....! ;D (Let's hope not, and let's reconvene on Monday....!) :D
Regards, Bob P.

David E K Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0	Message 14918 - Posted: 28 Apr 2006, 22:06:41 UTC - in response to Message 14915.
[quote]A subtle bug???? I just hope that my not so subtle harping and arguing helped in finding it. :) Since I am one of the ones that keeps the application in memory, it seems that subtle bug hit me with a vengeance.[/quote]
By subtle, I meant it didn't stand out on Ralph because two situations had to occur:
1. nstruct had to be high enough to present a problematic run time, which it wasn't, I think.
2. it only affects leave-in-memory users or users that only run R@h (if the app never gets preempted).
Obviously not subtle to those it affected. Sorry Jose. We just updated the app with the fix. I hope this takes care of your recent issues.

cMw Joined: 24 Apr 06 Posts: 9 Credit: 14,036 RAC: 0	Message 14920 - Posted: 28 Apr 2006, 22:21:59 UTC - in response to Message 14918.
[quote][quote]A subtle bug???? I just hope that my not so subtle harping and arguing helped in finding it. :) Since I am one of the ones that keeps the application in memory, it seems that subtle bug hit me with a vengeance.[/quote]
By subtle, I meant it didn't stand out on Ralph because two situations had to occur: 1. nstruct had to be high enough to present a problematic run time, which it wasn't, I think. 2. it only affects leave-in-memory users or users that only run R@h (if the app never gets preempted). Obviously not subtle to those it affected. Sorry Jose. We just updated the app with the fix. I hope this takes care of your recent issues.[/quote]
should it say in messages that i received an update or ?

Moderator9 Volunteer moderator Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0	Message 14921 - Posted: 28 Apr 2006, 22:45:20 UTC - in response to Message 14920.
[quote][quote][quote]A subtle bug???? I just hope that my not so subtle harping and arguing helped in finding it. :) Since I am one of the ones that keeps the application in memory, it seems that subtle bug hit me with a vengeance.[/quote]
By subtle, I meant it didn't stand out on Ralph because two situations had to occur: 1. nstruct had to be high enough to present a problematic run time, which it wasn't, I think. 2. it only affects leave-in-memory users or users that only run R@h (if the app never gets preempted). Obviously not subtle to those it affected. Sorry Jose. We just updated the app with the fix. I hope this takes care of your recent issues.[/quote]
should it say in messages that i received an update or ?[/quote]
It will appear as a new version number in the Task tab, as a message for the download in the message tab, and as a longer download if you watch the transfers tab. Then we all hope it will show up as the fix to the errors.

Jose Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0	Message 14923 - Posted: 28 Apr 2006, 23:23:31 UTC - in response to Message 14918. Last modified: 28 Apr 2006, 23:29:01 UTC
[quote]Sorry Jose. We just updated the app with the fix. I hope this takes care of your recent issues.[/quote]
Hope you read message 14915 (especially point 3). No need to apologize. I am in awe of how committed all of you are. I will repeat it:
3- I would be unfair if I didn't recognize and applaud the massive effort the project scientists and the software and model developers have undertaken in addressing our concerns, in paying attention to our complaints and suggestions, and in finding solutions as fast as humanly possible without sacrificing the scientific integrity and validity of the data produced. Let us remember that the points and the competition among teams are but the frosting of a very important cake: scientific progress. The cake is way more important than the frosting.
That they have done it in relatively fast time [Hey, not as fast as I would have wanted, but I am in dire need of attending a nice and large BBQ party.] speaks very well of their commitment not only to the core scientific project but to US as a community. That they continuously keep doing it, given our constant pressure (in addition to the extremely high pressure of their professional environments), speaks very loudly as to their personal and professional qualities. My thanks to all. My appreciation of all.

Jose Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0	Message 14924 - Posted: 28 Apr 2006, 23:25:50 UTC - in response to Message 14921.
BTW what I said in post 14915 also applies to you. Thanks for your patience.

David E K Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0	Message 14928 - Posted: 29 Apr 2006, 0:06:36 UTC
I did. Your comments are very much appreciated.

Cureseekers~Kristof Joined: 5 Nov 05 Posts: 80 Credit: 689,603 RAC: 0	Message 14998 - Posted: 29 Apr 2006, 14:00:29 UTC
So if I understand it well, the jobs with version 5.06 will not run for the number of hours set in the preferences, but a fixed number of hours? How much is this? 3 hours? Do you recommend aborting jobs with engine 5.06?
Member of Dutch Power Cows

Jose Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0	Message 15003 - Posted: 29 Apr 2006, 14:59:16 UTC - in response to Message 14998.
[quote]So if I understand it well, the jobs with version 5.06 will not run for the number of hours set in the preferences, but a fixed number of hours? How much is this? 3 hours? Do you recommend aborting jobs with engine 5.06?[/quote]
IMHO... Don't do that; there is good science to be obtained from them.

Feet1st Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0	Message 15048 - Posted: 29 Apr 2006, 22:14:07 UTC - in response to Message 15003.
[quote][quote]So if I understand it well, the jobs with version 5.06 will not run for the number of hours set in the preferences, but a fixed number of hours? How much is this? 3 hours? Do you recommend aborting jobs with engine 5.06?[/quote]
IMHO... Don't do that; there is good science to be obtained from them.[/quote]
I believe the 5.06 WUs run for a fixed number of models, rather than the runtime preference. The work they produce is every bit as useful to the project as the 5.07 WUs. So, if they appear to be running normally (i.e. stepping through models in the graphics display), leave them to do their thing. Credits are issued based on the CPU time you put into the work, not the number of models or number of WUs. So the longer they take crunching, the more credit they earn.
Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/

Rhiju Volunteer moderator Joined: 8 Jan 06 Posts: 223 Credit: 3,546 RAC: 0	Message 15057 - Posted: 30 Apr 2006, 2:23:54 UTC - in response to Message 15048.
Feet1st is right! You can keep your 5.06 workunits in the queue... they don't have any known errors and will give us useful science results. They just may take shorter or longer than usual.
[quote][quote][quote]So if I understand it well, the jobs with version 5.06 will not run for the number of hours set in the preferences, but a fixed number of hours? How much is this? 3 hours? Do you recommend aborting jobs with engine 5.06?[/quote]
IMHO... Don't do that; there is good science to be obtained from them.[/quote]
I believe the 5.06 WUs run for a fixed number of models, rather than the runtime preference. The work they produce is every bit as useful to the project as the 5.07 WUs. So, if they appear to be running normally (i.e. stepping through models in the graphics display), leave them to do their thing. Credits are issued based on the CPU time you put into the work, not the number of models or number of WUs. So the longer they take crunching, the more credit they earn.[/quote]

anders n Joined: 19 Sep 05 Posts: 403 Credit: 537,991 RAC: 0	Message 15073 - Posted: 30 Apr 2006, 9:18:42 UTC - in response to Message 15057.
[quote]Feet1st is right! You can keep your 5.06 workunits in the queue... they don't have any known errors and will give us useful science results. They just may take shorter or longer than usual.[/quote]
Hi Rhiju. When I look at my results, they are either shorter due to a low nstruct or they finish according to my time setting.
Anders n

Jimi@0wned.org.uk Joined: 10 Mar 06 Posts: 29 Credit: 335,252 RAC: 0	Message 15087 - Posted: 30 Apr 2006, 16:08:50 UTC
I've started getting lockups on my X2 3800+ system over the last week. I think it's a RAM problem, as it has got worse and worse over time; I can barely get 20 minutes loaded out of it now w/o a lockup. Running dual Prime seems to stress the system less than Rosetta; certainly CPU temps are 2 or 3 degrees lower. So I've detached the system and am priming again, trying to nail down exactly what's wrong.

Jimi@0wned.org.uk Joined: 10 Mar 06 Posts: 29 Credit: 335,252 RAC: 0	Message 15090 - Posted: 30 Apr 2006, 17:27:09 UTC Last modified: 30 Apr 2006, 17:33:45 UTC
Is there something grossly different between 5.06 and 5.07, or are the current WUs just more demanding? Prime is still running after 90 minutes with lower temps than Rosetta, which seems to tell me that Rosetta has become extremely tough on the hardware. I know it's difficult to quantify, but is this acknowledged by those that know more about it?
NB: this seems to be the wrong thread for my comments, but it's possible WU problems are going to be mistaken for hardware problems if the WUs really have got tougher. Where shall I take this?