Rosetta@Home Version 3.24

Message boards : Number crunching : Rosetta@Home Version 3.24

Sysadm@Nbg
Joined: 16 Mar 10
Posts: 1
Credit: 773,579
RAC: 0
Message 72511 - Posted: 14 Mar 2012, 18:12:30 UTC

I am having some problems uploading results.
I think this is related to the distribution of the new app (high network traffic?!).

A network monitor on rah_status.php, like the one on PrimeGrid's server_status.php, would be helpful...

TD Nickell
Joined: 20 Jan 07
Posts: 10
Credit: 3,810,259
RAC: 0
Message 72513 - Posted: 14 Mar 2012, 21:26:01 UTC

Same problem here. Work units won't upload.

Andrii Muliar
Joined: 10 Nov 05
Posts: 12
Credit: 7,655,243
RAC: 0
Message 72514 - Posted: 14 Mar 2012, 22:27:51 UTC

Upload is very slow but it is working for me ("Retry Now").

TD Nickell
Joined: 20 Jan 07
Posts: 10
Credit: 3,810,259
RAC: 0
Message 72515 - Posted: 14 Mar 2012, 23:16:10 UTC

Seems to be uploading okay now!

pvh
Joined: 7 Feb 10
Posts: 3
Credit: 2,487,638
RAC: 0
Message 72519 - Posted: 15 Mar 2012, 16:51:51 UTC

I noticed that the 3.24 app did not have the execute bit set after download (in openSUSE 11.4, Boinc 7.0.18), which caused all WUs to fail. I have fixed this manually, but that should not be necessary of course...
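For anyone hitting the same symptom, here is a minimal sketch of that manual fix, assuming a Linux host. The function name `ensure_executable` is made up for illustration, and the path you pass in (the downloaded application binary inside your BOINC data directory) is an assumption that depends on your installation.

```python
import os
import stat


def ensure_executable(path: str) -> bool:
    """Set the owner-execute bit on `path` if it is missing.

    Returns True if the mode was changed, False if the bit was already set.
    """
    mode = os.stat(path).st_mode
    if mode & stat.S_IXUSR:
        return False  # already executable by the owner, nothing to do
    # Keep all existing permission bits and add owner-execute.
    os.chmod(path, mode | stat.S_IXUSR)
    return True
```

This has the same effect as running `chmod u+x` on the file. It only repairs the permission bit and does not address why the client left it unset.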

rochester new york
Joined: 2 Jul 06
Posts: 2842
Credit: 2,020,043
RAC: 0
Message 72520 - Posted: 15 Mar 2012, 16:55:42 UTC - in response to Message 72510.  

Rosetta@Home has been updated to version 3.24. If you encounter any problems, please let us know. Thank you for your continued support.

Among other things, this release includes support for symmetry in the hybrid protocol for comparative modeling.


Plus, remind new people that the update is automatic and there is nothing they have to download...

In Memory of Kimsey M Fowler Sr
Joined: 10 Mar 12
Posts: 26
Credit: 39,033,222
RAC: 0
Message 72522 - Posted: 15 Mar 2012, 18:23:19 UTC - in response to Message 72519.  

Please post details about correcting the execute bit. I built a new machine over the weekend for R@H and the WUs all failed. As a consequence, BOINC/R@H will only give me 8 new work units per day, and those are completed in three hours... a lot of processing time is being wasted. Also wasted were many hours testing the computer, trying to figure out why it couldn't get a WU done correctly.

Rocco Moretti
Joined: 18 May 10
Posts: 66
Credit: 585,745
RAC: 0
Message 72523 - Posted: 15 Mar 2012, 21:24:12 UTC - in response to Message 72522.  

pvh: I noticed that the 3.24 app did not have the execute bit set after download (in openSUSE 11.4, Boinc 7.0.18), which caused all WUs to fail.


There was nothing different done on our end, with respect to the executable bit, from any of the previous versions, so it's likely it's a Boinc 7 issue.

Note that the Boinc 7.0 series is currently still a development version, and people have reported a number of issues with Boinc 7 and R@h. As it's development code, we're really not supporting Boinc 7 at this point.

In Memory of Kimsey M Fowler Sr: I built a new machine over the weekend for R@H and the WUs all failed.


If you're referring to this machine, it looks like the issue is not a faulty execute bit, but rather the successful completion/Exit status 0/Client Error/missing application version issue that others have experienced. (See https://boinc.bakerlab.org/forum_thread.php?id=5914#72425) We're looking into it, but the best lead so far is that it's related to GPU settings. If it won't impact computing for other projects, try turning off GPU usage for that machine. (Rosetta@home itself does not use GPUs, although other boinc projects you're running might.)

In Memory of Kimsey M Fowler Sr
Joined: 10 Mar 12
Posts: 26
Credit: 39,033,222
RAC: 0
Message 72530 - Posted: 16 Mar 2012, 19:53:16 UTC - in response to Message 72523.  


If you're referring to this machine, it looks like the issue is not a faulty execute bit, but rather the successful completion/Exit status 0/Client Error/missing application version issue that others have experienced. (See https://boinc.bakerlab.org/forum_thread.php?id=5914#72425) We're looking into it, but the best lead so far is that it's related to GPU settings. If it won't impact computing for other projects, try turning off GPU usage for that machine. (Rosetta@home itself does not use GPUs, although other boinc projects you're running might.)


Thanks for taking the time to respond. Yes, that is the correct machine. I set the GPU Activity button to "Suspend GPU", and the last few days WUs appear to be completing normally. I wonder if others who have experienced a similar problem are, like me, running F@H on one or more GPUs and R@H on the CPU? I'm doing this on a second, nearly identical machine (computer ID 1498519) without any problems, so I'm thinking that "Suspend GPU" is the ticket.

I am still dealing with the problem of being limited to eight new WUs per day. From poking around various forums, it looks like BOINC may take several days to recognize that I can complete additional work units in the allotted time. I'm experimenting with a suggestion to accelerate that process by changing my preferences to indicate a connection to the internet every six days and to request two days of work at a time, even though the machine is always connected.

Rocco Moretti
Joined: 18 May 10
Posts: 66
Credit: 585,745
RAC: 0
Message 72533 - Posted: 17 Mar 2012, 0:36:22 UTC - in response to Message 72530.  

I set the GPU Activity button to "Suspend GPU", and the last few days WUs appear to be completing normally.


Interesting ... But I'm wondering why you think they're completing normally, as according to the task list for that computer (https://boinc.bakerlab.org/rosetta/results.php?hostid=1525425) everything for the past couple of days (at least from 16 Mar 15:00 UTC on back) seems to be still suffering from Client Errors.

In Memory of Kimsey M Fowler Sr
Joined: 10 Mar 12
Posts: 26
Credit: 39,033,222
RAC: 0
Message 72536 - Posted: 17 Mar 2012, 15:15:24 UTC - in response to Message 72533.  

I set the GPU Activity button to "Suspend GPU", and the last few days WUs appear to be completing normally.


Interesting ... But I'm wondering why you think they're completing normally, as according to the task list for that computer (https://boinc.bakerlab.org/rosetta/results.php?hostid=1525425) everything for the past couple of days (at least from 16 Mar 15:00 UTC on back) seems to be still suffering from Client Errors.


Yep, I see that now. There were no reports of errors when the jobs completed and I saw credit was awarded. 'I assumed'.... shameful.

I'm going to uninstall and reinstall BOINC for a fresh attempt.

DmGun
Joined: 21 Nov 10
Posts: 6
Credit: 706,645
RAC: 0
Message 72537 - Posted: 17 Mar 2012, 17:18:04 UTC

After updating to version 3.24:
- Tasks run for anywhere from two to seven hours (run time is set to 3:00)
- Granted credit is about 6 times lower
- Compute errors
https://boinc.bakerlab.org/rosetta/results.php?userid=402480
Restarting the project has not helped...

ArcSedna
Joined: 23 Oct 11
Posts: 16
Credit: 71,462,581
RAC: 35,877
Message 72540 - Posted: 18 Mar 2012, 0:19:13 UTC

Recently, "Granted credit" for Mac OS X clients has been low relative to that for Windows clients.

Client #1 Mac OS X(10.7.3)
- Measured floating point speed 2840.51 million ops/sec
- Measured integer speed 4754.19 million ops/sec
- CPU Time (sec) 21,972.88
- Claimed Credit 96.57
- Granted Credit 14.61
- https://boinc.bakerlab.org/rosetta/workunit.php?wuid=448711523

Client #2 Windows 7
- Measured floating point speed 2271.54 million ops/sec
- Measured integer speed 6904.26 million ops/sec
- CPU Time (sec) 21,434.99
- Claimed Credit 113.82
- Granted Credit 92.04
- https://boinc.bakerlab.org/rosetta/workunit.php?wuid=448711904
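
To put the two clients on a common scale, here is a rough calculation of granted credit per CPU hour and of the granted-to-claimed ratio, using only the figures listed above. The helper name `credit_per_cpu_hour` is made up for this illustration.

```python
def credit_per_cpu_hour(granted: float, cpu_seconds: float) -> float:
    """Granted credit normalised by CPU time, in credits per hour."""
    return granted / (cpu_seconds / 3600.0)


# Figures copied from the two task reports above.
mac_rate = credit_per_cpu_hour(14.61, 21972.88)   # Client #1, Mac OS X 10.7.3
win_rate = credit_per_cpu_hour(92.04, 21434.99)   # Client #2, Windows 7

# Share of the claimed credit that was actually granted.
mac_granted_share = 14.61 / 96.57
win_granted_share = 92.04 / 113.82

print(f"Mac:     {mac_rate:.2f} credits/CPU-hour, {mac_granted_share:.0%} of claim")
print(f"Windows: {win_rate:.2f} credits/CPU-hour, {win_granted_share:.0%} of claim")
print(f"Windows/Mac rate ratio: {win_rate / mac_rate:.1f}")
```

For nearly the same CPU time, the Windows client earns a bit more than six times the credit per hour, which lines up with the "about 6 times" drop reported earlier in the thread.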

DmGun
Joined: 21 Nov 10
Posts: 6
Credit: 706,645
RAC: 0
Message 72542 - Posted: 18 Mar 2012, 12:27:36 UTC
Last modified: 18 Mar 2012, 12:28:11 UTC

transient, the same thing is happening for me on OS X 10.7.3
https://boinc.bakerlab.org/rosetta/results.php?userid=402480
Compare with how it was two days ago: all calculated results have dropped by about a factor of six.

m2a2b2
Joined: 10 May 07
Posts: 2
Credit: 816,900
RAC: 0
Message 72546 - Posted: 18 Mar 2012, 21:23:27 UTC

I am also experiencing the same results with MacOS X 10.6.8. All results have dropped to 20-25% of what they were for jobs completed prior to March 16.

Rocco Moretti
Joined: 18 May 10
Posts: 66
Credit: 585,745
RAC: 0
Message 72549 - Posted: 18 Mar 2012, 23:35:26 UTC

It looks like the performance of the Rosetta@home application dropped on Macs (we believe all Macs) with 3.24. We're aware of the issue and looking into ways of remedying it.

Note that the low performance is the direct cause of the variable runtimes. The R@h client will always try to produce at least one decoy. If execution slows down enough that a job takes 7 hours to produce the first decoy, that workunit will run for 7 hours, even if your runtime setting is 3 hours. But once that first decoy is produced, the client will only start on subsequent decoys if the estimated runtime falls under the run-time limit. So if the first decoy takes 2 hours to complete and your runtime is set for 3 hours, the client will stop early, rather than run for 4 hours.
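
The rule described above can be sketched as a small simulation. This is only an illustration of the stated behaviour, not the actual client code: the function name `simulate_workunit` is made up, every decoy is assumed to take the same time as the first, and treating an estimate that lands exactly on the limit as acceptable is also an assumption.

```python
from typing import Tuple


def simulate_workunit(decoy_seconds: float, runtime_limit: float) -> Tuple[int, float]:
    """Return (decoys produced, total run time in seconds).

    Assumes every decoy takes `decoy_seconds`, which is a simplification.
    """
    # The first decoy is always produced, even if it overruns the limit.
    elapsed = decoy_seconds
    decoys = 1
    # Start another decoy only if it is estimated to finish within the limit.
    while elapsed + decoy_seconds <= runtime_limit:
        elapsed += decoy_seconds
        decoys += 1
    return decoys, elapsed


HOUR = 3600
print(simulate_workunit(7 * HOUR, 3 * HOUR))  # slow host: one decoy, runs the full 7 hours
print(simulate_workunit(2 * HOUR, 3 * HOUR))  # one decoy, stops early at 2 hours
```

Under these assumptions, a 3-hour setting can yield anything from a 2-hour task to a 7-hour task depending on how long a single decoy takes, which matches the spread of run times reported in this thread.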

DmGun
Joined: 21 Nov 10
Posts: 6
Credit: 706,645
RAC: 0
Message 72550 - Posted: 18 Mar 2012, 23:43:13 UTC

And why?
Before the update to the new application there was nothing like this. You can see the results for the previous couple of months.
Also, errors were very rare.
Another very bad bug: after restarting the client, many jobs start over from zero (Windows users complain about this too).
CASP9*** tasks are not stable: there is a big difference in computation time (from 2.5 to 7 hours), and progress resets after a restart.
Sorry for the bad English - Google Translate (((

DmGun
Joined: 21 Nov 10
Posts: 6
Credit: 706,645
RAC: 0
Message 72551 - Posted: 18 Mar 2012, 23:48:09 UTC

I saw that you already answered ...

Snags
Joined: 22 Feb 07
Posts: 198
Credit: 2,888,320
RAC: 0
Message 72552 - Posted: 19 Mar 2012, 0:02:53 UTC

Another "Maximum disk usage exceeded" error

CASP9_bj_benchmark_hybridization_run36_T0628_0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_44571_2431_0

CPU time 16596.04
cpu run time pref is 28800

Lots of "sin_cos_range ERROR: nan is outside of [-1,+1] sin and cos value legal range" in the stderr output


Best,
Snags

Snags
Joined: 22 Feb 07
Posts: 198
Credit: 2,888,320
RAC: 0
Message 72553 - Posted: 19 Mar 2012, 10:21:28 UTC

One more: CASP9_bj_benchmark_hybridization_run36_T0601_0_C2_SAVE_ALL_OUT_IGNORE_THE_REST_44523_2703_0


<message>
Maximum disk usage exceeded
</message>

CPU time 11739.44

Best,
Snags

©2024 University of Washington
https://www.bakerlab.org