Message boards : Number crunching : Rosetta@Home Version 3.24
Author | Message |
---|---|
Sysadm@Nbg Joined: 16 Mar 10 Posts: 1 Credit: 773,579 RAC: 0 |
I have some problems uploading results. I think this is related to the distribution of the new app (high network traffic?!). Network monitoring on rah_status.php, like PrimeGrid's server_status.php, would be helpful... |
TD Nickell Joined: 20 Jan 07 Posts: 10 Credit: 3,810,259 RAC: 0 |
Same problem here. Work units won't upload. |
Andrii Muliar Joined: 10 Nov 05 Posts: 12 Credit: 7,655,243 RAC: 0 |
Upload is very slow but it is working for me ("Retry Now"). |
TD Nickell Joined: 20 Jan 07 Posts: 10 Credit: 3,810,259 RAC: 0 |
Seems to be uploading okay now! |
pvh Joined: 7 Feb 10 Posts: 3 Credit: 2,487,638 RAC: 0 |
I noticed that the 3.24 app did not have the execute bit set after download (in openSUSE 11.4, Boinc 7.0.18), which caused all WUs to fail. I have fixed this manually, but that should not be necessary of course... |
rochester new york Joined: 2 Jul 06 Posts: 2842 Credit: 2,020,043 RAC: 0 |
"Rosetta@Home has been updated to version 3.24. If you encounter any problems, please let us know. Thank you for your continued support." Plus, remind new people that the update is automatic and there is nothing they have to download. |
In Memory of Kimsey M Fowler Sr Joined: 10 Mar 12 Posts: 26 Credit: 39,033,222 RAC: 0 |
Please post details about correcting the execute bit. I built a new machine over the weekend for R@H and the WU's all failed. As a consequence BOINC/R@H will only give me 8 new work units per day, and those are completed in three hours... a lot of processing time is being wasted. Also wasted were many hours testing the computer trying to figure out why it couldn't get a WU done correctly. |
Rocco Moretti Joined: 18 May 10 Posts: 66 Credit: 585,745 RAC: 0 |
pvh: "I noticed that the 3.24 app did not have the execute bit set after download (in openSUSE 11.4, Boinc 7.0.18), which caused all WUs to fail." Nothing was done differently on our end with respect to the executable bit compared with any of the previous versions, so it's likely a BOINC 7 issue. Note that the BOINC 7.0 series is currently still a development version, and people have reported a number of issues with BOINC 7 and R@h. As it's development code, we're really not supporting BOINC 7 at this point. In Memory of Kimsey M Fowler Sr: "I built a new machine over the weekend for R@H and the WU's all failed." If you're referring to this machine, it looks like the issue is not a faulty execute bit, but rather the successful completion/Exit status 0/Client Error/missing application version issue that others have experienced. (See https://boinc.bakerlab.org/forum_thread.php?id=5914#72425) We're looking into it, but the best lead so far is that it's related to GPU settings. If it won't impact computing for other projects, try turning off GPU usage for that machine. (Rosetta@home itself does not use GPUs, although other BOINC projects you're running might.) |
In Memory of Kimsey M Fowler Sr Joined: 10 Mar 12 Posts: 26 Credit: 39,033,222 RAC: 0 |
Quote: "If you're referring to this machine, it looks like the issue is not a faulty execute bit, but rather the successful completion/Exit status 0/Client Error/missing application version issue that others have experienced. (See https://boinc.bakerlab.org/forum_thread.php?id=5914#72425) We're looking into it, but the best lead so far is that it's related to GPU settings. If it won't impact computing for other projects, try turning off GPU usage for that machine. (Rosetta@home itself does not use GPUs, although other boinc projects you're running might.)" Thanks for taking the time to respond. Yes, that is the correct machine. I set the GPU Activity button to "Suspend GPU", and for the last few days WUs appear to be completing normally. I wonder if others who have experienced a similar problem are, like me, running F@H on one or more GPUs and R@H on the CPU? I'm doing this on a second, nearly identical machine (computer ID 1498519) without any problems, so I'm thinking that "Suspend GPU" is the ticket. I am still dealing with the problem of being limited to eight new WUs per day. From poking around various forums, it looks like BOINC may take several days to recognize that I can perform additional work units in the allotted time. I'm experimenting with a suggestion to accelerate that process by changing my preferences to indicate a connection to the internet every six days and to request two days of work at a time, even though the machine is always connected. |
Rocco Moretti Joined: 18 May 10 Posts: 66 Credit: 585,745 RAC: 0 |
Quote: "I set the GPU Activity button to "Suspend GPU", and the last few days WU's appear to be completing normally." Interesting ... But I'm wondering why you think they're completing normally, as according to the task list for that computer (https://boinc.bakerlab.org/rosetta/results.php?hostid=1525425) everything for the past couple of days (at least from 16 Mar 15:00 UTC on back) still seems to be suffering from Client Errors. |
In Memory of Kimsey M Fowler Sr Joined: 10 Mar 12 Posts: 26 Credit: 39,033,222 RAC: 0 |
Quote: "I set the GPU Activity button to "Suspend GPU", and the last few days WU's appear to be completing normally." Yep, I see that now. There were no reports of errors when the jobs completed and I saw credit was awarded. 'I assumed'.... shameful. I'm going to uninstall and reinstall BOINC for a fresh attempt. |
DmGun Joined: 21 Nov 10 Posts: 6 Credit: 706,645 RAC: 0 |
After updating to version 3.24: (1) tasks run anywhere from two to seven hours (runtime is set to 3:00); (2) granted credit has dropped about sixfold; (3) compute errors. https://boinc.bakerlab.org/rosetta/results.php?userid=402480 Restarting the project has not helped... |
ArcSedna Joined: 23 Oct 11 Posts: 16 Credit: 71,462,581 RAC: 35,877 |
Recently, "Granted credit" for Mac OS X clients has been relatively low compared to that for Windows clients. Client #1, Mac OS X (10.7.3): Measured floating point speed 2840.51 million ops/sec; Measured integer speed 4754.19 million ops/sec; CPU Time (sec) 21,972.88; Claimed Credit 96.57; Granted Credit 14.61; https://boinc.bakerlab.org/rosetta/workunit.php?wuid=448711523 Client #2, Windows 7: Measured floating point speed 2271.54 million ops/sec; Measured integer speed 6904.26 million ops/sec; CPU Time (sec) 21,434.99; Claimed Credit 113.82; Granted Credit 92.04; https://boinc.bakerlab.org/rosetta/workunit.php?wuid=448711904 |
DmGun Joined: 21 Nov 10 Posts: 6 Credit: 706,645 RAC: 0 |
transient, the same thing is happening for me on OS X 10.7.3: https://boinc.bakerlab.org/rosetta/results.php?userid=402480 Compare with what it was two days ago: all calculated results fell about sixfold. |
m2a2b2 Joined: 10 May 07 Posts: 2 Credit: 816,900 RAC: 0 |
I am also experiencing the same results with MacOS X 10.6.8. All results have dropped to 20-25% of what they were for jobs completed prior to March 16. |
Rocco Moretti Joined: 18 May 10 Posts: 66 Credit: 585,745 RAC: 0 |
It looks like the performance of the Rosetta@home application dropped on Macs (we believe all Macs) with 3.24. We're aware of the issue and looking into ways of remedying it. Note that the low performance is the direct cause of the variable runtimes. The R@h client will always try to produce at least one decoy. If execution slows down enough that a job takes 7 hours to produce the first decoy, that workunit will run for 7 hours, even if your runtime setting is 3 hours. But once that first decoy is produced, the client will only start on subsequent decoys if the estimated runtime falls under the run-time limit. So if the first decoy takes 2 hours to complete and your runtime is set for 3 hours, the client will stop early, rather than run for 4 hours. |
DmGun Joined: 21 Nov 10 Posts: 6 Credit: 706,645 RAC: 0 |
But why? Before the update to the new core there was nothing like this. You can see the results for the previous couple of months. Also, errors were very rare. Another very bad bug: after restarting the client, many jobs start again from zero (Windows users complain about this too). CASP9*** is not stable: there are big differences in computation time (from 2.5 to 7 hours), and tasks reset after a restart. Sorry for the bad English - Google Translate ((( |
DmGun Joined: 21 Nov 10 Posts: 6 Credit: 706,645 RAC: 0 |
I saw that you already answered ... |
Snags Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
Another "Maximum disk usage exceeded" error: CASP9_bj_benchmark_hybridization_run36_T0628_0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_44571_2431_0. CPU time 16596.04; CPU run time pref is 28800. Lots of "sin_cos_range ERROR: nan is outside of [-1,+1] sin and cos value legal range" in the stderr out. Best, Snags |
Snags Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
One more: CASP9_bj_benchmark_hybridization_run36_T0601_0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_44523_2703_0 <message> Maximum disk usage exceeded </message> CPU time 11739.44 Best, Snags |
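
The runtime rule Rocco Moretti describes in the thread (always finish the first decoy, then start another only if it is expected to fit in the run-time limit) can be sketched as a small simulation. This is only an illustration of the rule as stated in his post, not the actual Rosetta@home client code. The function name `simulate_workunit`, the use of the average time per decoy so far as the estimate, and treating an estimate that lands exactly on the limit as acceptable are all assumptions.

```python
def simulate_workunit(decoy_hours, runtime_limit_hours):
    """Return (decoys_completed, total_hours) for one hypothetical workunit.

    decoy_hours: hours each successive decoy would take, in order.
    runtime_limit_hours: the user's target CPU run time preference.
    """
    if not decoy_hours:
        raise ValueError("need at least one decoy duration")

    # The first decoy is always produced, however long it takes.
    elapsed = decoy_hours[0]
    completed = 1

    for _ in decoy_hours[1:]:
        # Assumption: the next decoy is estimated to take the average so far.
        estimate = elapsed / completed
        if elapsed + estimate > runtime_limit_hours:
            break  # starting another decoy would overshoot the limit
        elapsed += decoy_hours[completed]
        completed += 1

    return completed, elapsed


if __name__ == "__main__":
    # Slow first decoy: runs 7 hours despite a 3 hour preference.
    print(simulate_workunit([7.0, 7.0], 3.0))
    # 2 hour decoys with a 3 hour preference: stops early after one decoy.
    print(simulate_workunit([2.0, 2.0], 3.0))
    # Fast decoys: keeps going while the next one is expected to fit.
    print(simulate_workunit([1.0, 1.0, 1.0, 1.0], 3.5))
```

Under these assumptions the two cases from the post come out as described: the 7 hour decoy yields one decoy in 7 hours, and the 2 hour decoy yields one decoy in 2 hours rather than two decoys in 4.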
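
For the missing execute bit that pvh reported on Linux, and the request for details that followed, a manual fix could look roughly like the sketch below. It is not an official procedure from the project. The helper name `ensure_executable` is made up for illustration, and the path of the downloaded application has to be supplied by the user, since the thread does not say where BOINC stored it on that system.

```python
import os
import stat
import sys


def ensure_executable(path):
    """Add the execute bit for user, group and others (like `chmod a+x`).

    Returns True if the permissions were changed, False if the file
    already had all three execute bits set.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    wanted = mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    if wanted == mode:
        return False
    os.chmod(path, wanted)
    return True


if __name__ == "__main__":
    # Paths are given on the command line; none are assumed here.
    for app_path in sys.argv[1:]:
        changed = ensure_executable(app_path)
        status = "execute bit added" if changed else "already executable"
        print(f"{app_path}: {status}")
```

Note that, per Rocco Moretti's reply, the failures on the newly built machine were traced to a different issue, so this only applies where the downloaded application really lacks the execute bit.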
©2024 University of Washington
https://www.bakerlab.org