Problems and Technical Issues with Rosetta@home

Message boards : Number crunching : Problems and Technical Issues with Rosetta@home

Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 80935 - Posted: 24 Dec 2016, 2:06:13 UTC - in response to Message 80932.  

I seem to get quite a few work units that end with "computation error", although more finish without any problem. I am new to Rosetta, so I'm just wondering if this is normal or if I should be looking for some kind of solution?

I think my error rate is less than 5% or so. Are you overclocking your CPU, or is it running hot? Do you have enough memory?
ID: 80935
jjch

Joined: 10 Nov 13
Posts: 14
Credit: 441,016,712
RAC: 27,831
Message 80942 - Posted: 27 Dec 2016, 20:40:48 UTC

Since December 24th I have noted that my Rosetta@home average work has been steadily dropping. Looking at it a bit further today, I found the message "Rosetta Mini for Android is not available for your type of computer".

I saw this message earlier this year, and it seems to have come back again. The Rosetta server appears to have plenty of work available, and my systems are all Windows-based.

If I shut down and restart BOINC, it will start retrieving work units again, but that is a painful process to go through on all the systems. These are all running BOINC version 7.6.33 and Rosetta version 3.73.

If there is a better method to keep up production please let me know. I would be willing to try testing some things if needed. Let me know if you need more information.


jjch
ID: 80942
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 80943 - Posted: 27 Dec 2016, 20:49:49 UTC - in response to Message 80942.  
Last modified: 27 Dec 2016, 20:57:22 UTC

...I found the message "Rosetta Mini for Android is not available for your type of computer"
...
If I shutdown and restart Boinc it will start retrieving work units again...


I tend to believe it was coincidence that ending and restarting BOINC Manager led to you receiving work. Instead, I believe it was just a matter of prepared WUs being available on your next project update. But I'd be interested to hear if you've seen this occur enough times that you see a correlation.
Rosetta Moderator: Mod.Sense
ID: 80943
Dr. Merkwürdigliebe

Joined: 5 Dec 10
Posts: 81
Credit: 2,657,273
RAC: 0
Message 80944 - Posted: 27 Dec 2016, 22:53:59 UTC - in response to Message 80942.  


If I shutdown and restart Boinc it will start retrieving work units again but that is a painful process to go through all the systems. These are all running Boinc version 7.6.33 and Rosetta version 3.73.

If there is a better method to keep up production please let me know.


You could use Powershell to restart the service on a number of hosts, e.g. like this.

Theoretically you should also be able to parse the output of
boinccmd --get_messages
and look for the "no work" string.

If you find it, restart the service. Use the Windows task scheduler...
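
A minimal Python sketch of that idea, under some assumptions: `boinccmd` is on the PATH and is run from the BOINC data directory (so it can read the GUI RPC password), the project URL below matches what `boinccmd --get_project_status` reports on your host, and the marker strings are the ones quoted in this thread. The helper names are hypothetical. Rather than restarting the service, it requests a project update, which is less disruptive; swap in a service restart if that is what works for you.

```python
import subprocess

# Assumption: the project URL as attached in your client. Confirm it with
# `boinccmd --get_project_status` before relying on this value.
PROJECT_URL = "http://boinc.bakerlab.org/rosetta/"

# Marker strings taken from the messages quoted in this thread.
NO_WORK_MARKERS = (
    "No work sent",
    "is not available for your type of computer",
)


def last_request_got_no_work(messages: str) -> bool:
    """Return True if the most recent scheduler reply reported no work."""
    no_work = False
    for line in messages.splitlines():
        if "Scheduler request completed" in line:
            # A new scheduler reply starts here; judge it on its own lines.
            no_work = False
        elif any(marker in line for marker in NO_WORK_MARKERS):
            no_work = True
    return no_work


def force_update_if_stalled(boinccmd: str = "boinccmd") -> bool:
    """Read the client's messages and request a project update if stalled."""
    result = subprocess.run(
        [boinccmd, "--get_messages"],
        capture_output=True, text=True, check=True,
    )
    if last_request_got_no_work(result.stdout):
        subprocess.run(
            [boinccmd, "--project", PROJECT_URL, "update"], check=True
        )
        return True
    return False


if __name__ == "__main__":
    print("update requested" if force_update_if_stalled() else "nothing to do")
```

Scheduled every hour or so (Task Scheduler on Windows), this would only act when the latest scheduler reply contained one of the markers. It does not override a server-imposed backoff by itself doing anything special; it simply issues the same update a manual click would.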

PS: This message board is SLOW. Where's the new hardware?
ID: 80944
Sid Celery

Joined: 11 Feb 08
Posts: 2132
Credit: 41,424,155
RAC: 12,857
Message 80945 - Posted: 28 Dec 2016, 4:03:04 UTC - in response to Message 80943.  

...I found the message "Rosetta Mini for Android is not available for your type of computer"
...
If I shutdown and restart Boinc it will start retrieving work units again...

I tend to believe it was coincidence that ending and restarting BOINC Manager led to you receiving work. Instead, I believe it was just a matter of prepared WUs being available on your next project update. But I'd be interested to hear if you've seen this occur enough times that you see a correlation.

I've had this message multiple times too, with the immediate 24hr deferral like last time. The whole site has been particularly unresponsive since before Christmas, with validation slightly delayed. And I also note Boincstats hasn't been able to pick up data for a few days either.

I'm not expecting this to improve until after new year. Work is coming through regularly enough, but the 24hr deferral thing has caused at least one of my team-members to run out of work. A forced update will often cure it, but only if I notice. Not all my PCs are attended.

On the plus side, Android tasks are readily available right now.
ID: 80945
BarryAZ

Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 0
Message 80946 - Posted: 28 Dec 2016, 5:10:21 UTC - in response to Message 80945.  

I had wondered if anyone else had noticed the 24 hour deferral along with the lack of stats site updates for the past several days.

Both of these issues surfaced in the past when there was a significant uptick in new users. I suspect something else may be the root cause this time.



I've had this message multiple times too - with the immediate 24hr deferral like last time too. The whole site is particularly unresponsive since before Christmas, with validation slightly delayed. And I also note Boincstats hasn't been able to pick up data for a few days either.

I'm not expecting this to improve until after new year. Work is coming through regularly enough, but the 24hr deferral thing has caused at least one of my team-members to run out of work. A forced update will often cure it, but only if I notice. Not all my PCs are attended.


ID: 80946
[VENETO] boboviz

Joined: 1 Dec 05
Posts: 2000
Credit: 9,747,451
RAC: 8,449
Message 80947 - Posted: 28 Dec 2016, 11:37:10 UTC

The site seems so slow....
ID: 80947
Sid Celery

Joined: 11 Feb 08
Posts: 2132
Credit: 41,424,155
RAC: 12,857
Message 80948 - Posted: 28 Dec 2016, 12:28:01 UTC - in response to Message 80946.  

I had wondered if anyone else had noticed the 24 hour deferral along with the lack of stats site updates for the past several days.

Both of these issues surfaced in the past when there was a significant uptick in new users. I suspect something else may be the root cause this time.

I've had this message multiple times too - with the immediate 24hr deferral like last time too. The whole site is particularly unresponsive since before Christmas, with validation slightly delayed. And I also note Boincstats hasn't been able to pick up data for a few days either.

I'm not expecting this to improve until after new year. Work is coming through regularly enough, but the 24hr deferral thing has caused at least one of my team-members to run out of work. A forced update will often cure it, but only if I notice. Not all my PCs are attended.

Looking at the reported credits, nothing has been picked up since Christmas Eve. The new users page indicates sign-ups numbered a few hundred, not several thousand like before, so I agree it's likely something different this time.

In past years I've suggested a server reboot before the holidays just to help ensure tasks keep coming through as best they can, but didn't this year. Still, with tasks coming through, it's better than it has been before, so we'll limp along until Jan 2 or 3. Anything better will be a bonus.
ID: 80948
mmonnin

Joined: 2 Jun 16
Posts: 61
Credit: 25,390,629
RAC: 70,221
Message 80949 - Posted: 28 Dec 2016, 21:46:35 UTC

Sure seems like the server could use a kick. Site is painfully slow. At one point I had 32 tasks ready to report on my 3770K, when it usually has only a couple. I'm guessing Free-DC stats is timing out when trying to pull stats. Credit is updated on My Account page but not on Free-DC. It's actually not even listed in my CPUIP page. :(
ID: 80949
spiko

Joined: 16 Dec 15
Posts: 2
Credit: 444,464
RAC: 0
Message 80950 - Posted: 29 Dec 2016, 16:36:51 UTC

Hi, I had 5 computation errors and 2 validation errors in the last 9 days. Is this normal? https://boinc.bakerlab.org/rosetta/results.php?userid=1202513&offset=0

The last task error is https://boinc.bakerlab.org/rosetta/result.php?resultid=893830336

[2016-12-28 16:34:18:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
failed to create shared mem segment: minirosetta Size: 25001672

Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0158CAE0 write attempt to address 0x017D7EC1


Another was https://boinc.bakerlab.org/rosetta/result.php?resultid=893258317

Too many restarts with no progress. Keep application in memory while preempted.
======================================================
DONE :: 0 starting structures 10177.8 cpu seconds
This process generated 0 decoys from 0 attempts
======================================================
BOINC :: WS_max 0

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>rb_12_21_70929_114867__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_461616_167_0_0</file_name>
<error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
ID: 80950
BarryAZ

Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 0
Message 80997 - Posted: 11 Jan 2017, 4:51:02 UTC

As noted in a separate thread.

Rosetta no longer generating XML statistics files.

Last XML stats update was three weeks ago.
ID: 80997
Boris

Joined: 29 Aug 12
Posts: 2
Credit: 9,234
RAC: 0
Message 81006 - Posted: 12 Jan 2017, 1:54:56 UTC

Hi, is there a problem with Rosetta and Boinc for PC?

I just received this message when I turned on Boinc and enabled Rosetta:

12/01/2017 12:49:06 PM | rosetta@home | Sending scheduler request: To fetch work.
12/01/2017 12:49:06 PM | rosetta@home | Requesting new tasks for CPU and AMD/ATI GPU
12/01/2017 12:49:08 PM | rosetta@home | Scheduler request completed: got 0 new tasks
12/01/2017 12:49:08 PM | rosetta@home | No work sent
12/01/2017 12:49:08 PM | rosetta@home | Rosetta Mini for Android is not available for your type of computer.

Umm, how will it be for Android if I am using a PC?

TIME: AEDT

Thank you
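
For anyone reading a log like the one above, a small Python sketch of how the lines could be taken apart. It assumes the `timestamp | project | message` layout shown in the paste and the "got N new tasks" wording; `LogEntry`, `parse_event_log_line` and `tasks_received` are hypothetical names, not part of BOINC.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str
    project: str
    message: str


def parse_event_log_line(line: str) -> Optional[LogEntry]:
    """Split 'timestamp | project | message'; return None for other lines."""
    parts = [part.strip() for part in line.split("|", 2)]
    if len(parts) != 3:
        return None
    return LogEntry(*parts)


def tasks_received(entries: Iterable[LogEntry]) -> List[int]:
    """Collect the N from every 'got N new tasks' scheduler reply."""
    counts = []
    for entry in entries:
        match = re.search(r"got (\d+) new tasks?", entry.message)
        if match:
            counts.append(int(match.group(1)))
    return counts


SAMPLE = """\
12/01/2017 12:49:06 PM | rosetta@home | Sending scheduler request: To fetch work.
12/01/2017 12:49:06 PM | rosetta@home | Requesting new tasks for CPU and AMD/ATI GPU
12/01/2017 12:49:08 PM | rosetta@home | Scheduler request completed: got 0 new tasks
12/01/2017 12:49:08 PM | rosetta@home | No work sent
12/01/2017 12:49:08 PM | rosetta@home | Rosetta Mini for Android is not available for your type of computer.
"""

entries = [e for e in map(parse_event_log_line, SAMPLE.splitlines()) if e]
print(tasks_received(entries))  # [0]
```

Read this way, the five lines are one scheduler exchange: a request for work, a reply with zero tasks, and two lines from the server giving its reason.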
ID: 81006
Kenneth DePrizio

Joined: 15 Jul 07
Posts: 15
Credit: 3,123,915
RAC: 0
Message 81007 - Posted: 12 Jan 2017, 2:11:01 UTC
Last modified: 12 Jan 2017, 2:17:38 UTC

Getting the same issue. Won't download new work. Don't know why it thinks my Windows PC is running Android.
ID: 81007
Sid Celery

Joined: 11 Feb 08
Posts: 2132
Credit: 41,424,155
RAC: 12,857
Message 81009 - Posted: 12 Jan 2017, 6:20:58 UTC

"Rosetta Mini for Android is not available for your type of computer" is a correct statement if you're running a PC.

What I think it means to say is it's only got Android tasks - none for a PC - and they're no good for you, so it can't send you anything.
ID: 81009
Boris

Joined: 29 Aug 12
Posts: 2
Credit: 9,234
RAC: 0
Message 81010 - Posted: 12 Jan 2017, 8:00:01 UTC - in response to Message 81009.  

"Rosetta Mini for Android is not available for your type of computer" is a correct statement if you're running a PC.

What I think it means to say is it's only got Android tasks - none for a PC - and they're no good for you, so it can't send you anything.


Hi,
My interpretation of the statement is that it thinks our client is asking for Android units while being a PC, so it will not send us units. :-)

Unless you are correct too, but then they should have put it as such, i.e. "No units available for your client" would have been way better :D No confusion that way.
ID: 81010
Sid Celery

Joined: 11 Feb 08
Posts: 2132
Credit: 41,424,155
RAC: 12,857
Message 81015 - Posted: 13 Jan 2017, 1:30:19 UTC - in response to Message 81010.  

My interpretation of the statement used, is that it thinks that our client is asking for Android units while being a PC, so it will not send us units. :-)

I'm pretty sure it doesn't mean that, but we can all agree it's a dreadful line whatever they're trying to tell us.

Are people getting tasks through ok yet? I can see some of my unattended PCs haven't polled for a long while and it always bugs me whether it's because the PC has crashed or it's been forced into a 24-hour backoff again.
ID: 81015
BarryAZ

Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 0
Message 81016 - Posted: 13 Jan 2017, 3:59:17 UTC - in response to Message 81015.  

I haven't received new tasks for quite a while. Not so big a deal, WorldGrid is picking up the slack for now.

But of course what would be seriously nice is to get some explanation of the status of things from project folks....

My interpretation of the statement used, is that it thinks that our client is asking for Android units while being a PC, so it will not send us units. :-)

I'm pretty sure it doesn't mean that, but we can all agree it's a dreadful line whatever they're trying to tell us.

Are people getting tasks through ok yet? I can see some of my unattended PCs haven't polled for a long while and it always bugs me whether it's because the PC has crashed or it's been forced into a 24-hour backoff again.

ID: 81016
Darrell

Joined: 28 Sep 06
Posts: 25
Credit: 51,934,631
RAC: 0
Message 81020 - Posted: 13 Jan 2017, 13:34:18 UTC

Why all the WUs with super short deadlines?

I finally got some new work today (1/13). I use a 1+1 day queue. Now the Rosetta work is going into panic mode, shutting off the use of some of my graphic cards from other projects (SETI and Einstein).

More than 32 new WUs came down around 6:46PM on 1/13 and have deadlines at 6:46PM on 1/15. They are 80,000 GFLOPs each and should run about 4 hours each. Running 7 threads, they will require about (32/7)*4 hours, so around 20 hours for all of them to finish.
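
The arithmetic can be checked with a short Python sketch. It uses only the figures from this post (32 tasks, 7 threads, about 4 hours each, a 48-hour deadline window) and assumes every task takes the same time and nothing else competes for the threads; `completion_estimate` is a hypothetical helper, and this is not how the BOINC client itself decides to enter high-priority mode.

```python
import math


def completion_estimate(tasks: int, threads: int, hours_per_task: float):
    """Return (ideal lower bound, wave-based estimate) in hours."""
    # Ideal case: work divides perfectly across all threads.
    lower_bound = tasks * hours_per_task / threads
    # Realistic case: tasks run in whole "waves" of `threads` at a time.
    waves = math.ceil(tasks / threads)
    return lower_bound, waves * hours_per_task


lower, wave_based = completion_estimate(tasks=32, threads=7, hours_per_task=4.0)
deadline_window = 48.0  # 6:46PM on 1/13 to 6:46PM on 1/15

print(f"{lower:.1f} h to {wave_based:.1f} h of {deadline_window:.0f} h available")
# prints "18.3 h to 20.0 h of 48 h available"
```

So the 20-hour figure corresponds to five waves of up to seven tasks, and on these numbers alone the batch would fit inside the two-day deadline with room to spare.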

Meanwhile, 3 of my 4 graphic cards are idle due to panic mode. Not nice.

Please assign AT LEAST 7 days to process work units.

Computer is https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=2098312
ID: 81020
Trotador

Joined: 30 May 09
Posts: 108
Credit: 291,214,977
RAC: 0
Message 81021 - Posted: 13 Jan 2017, 20:19:24 UTC - in response to Message 81009.  

"Rosetta Mini for Android is not available for your type of computer" is a correct statement if you're running a PC.

What I think it means to say is it's only got Android tasks - none for a PC - and they're no good for you, so it can't send you anything.


+1
ID: 81021
mmonnin

Joined: 2 Jun 16
Posts: 61
Credit: 25,390,629
RAC: 70,221
Message 81022 - Posted: 14 Jan 2017, 1:34:56 UTC - in response to Message 81020.  

Why all the WUs with super short deadlines?

I finally got some new work today (1/13). I use a 1+1 day queue. Now the Rosetta work is going into panic mode, shutting off the use of some of my graphic cards from other projects (SETI and Einstein).

More than 32 new WUs came down around 6:46PM on 1/13 and have deadlines at 6:46PM on 1/15. They are 80,000 GFLOPS and should run about 4 hours each. Running 7 threads, they will require (32/7)*4 hours to complete, which is around 20 hours to all finish.

Meanwhile, 3 of my 4 graphic cards are idle due to panic mode. Not nice.

Please assign AT LEAST 7 days to process work units.

Computer is https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=2098312


I've had CPU tasks get to high priority and my GPUs still ran, on multiple systems. Your GPU tasks must each require a CPU thread all to themselves, and that's what is causing the issue, not Rosetta.
ID: 81022


©2024 University of Washington
https://www.bakerlab.org