Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
bcrosby Joined: 13 Apr 07 Posts: 2 Credit: 2,742,612 RAC: 0 |
My issue is a persistent 'Communication deferred' situation - routinely 18 hrs plus. I'm able to download units, and have a set pending upload/reporting ... but with communication constantly 'deferred', that never happens. There are no local computational settings that I can see to invoke this delay in finished job reporting. Is there something more 'central' that I need to adjust (with my Account) to get this out of the way? |
ArcSedna Joined: 23 Oct 11 Posts: 16 Credit: 71,462,581 RAC: 58,883 |
My issue is a persistent 'Communication deferred' situation - routinely 18 hrs plus. I'm able to download units, and have a set pending upload/reporting ... but with communication constantly 'deferred', that never happens. Looking at your computer summary, your "Maximum daily WU quota per CPU" has dropped very low (only 1/day). This means you are allowed only 1 WU per CPU core per day. (In your case, 1 x 4 cores = 4 WUs per day.) The maximum for this value is 100. Each time the computer returns a WU error, the value is reduced little by little, until it finally drops to 1 (or zero?). Your computer seems to be returning a lot of error results (*1), so the WU quota has dropped to 1. The way to recover from this is to return valid, successful results. Then the WU quota will go back to its normal state (100/day), little by little. As for the computation errors, they do NOT seem to be your fault (and are hard to solve on the user side). Some issues similar to yours have been reported: Thread : Client error for ALL tasks since a month. (Linux 64 bits boinc 7.0.27) (*1) Your errors seem hard to notice, because the local BOINC Manager shows "Ready to report", not "Computation error". https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6050&nowrap=true#73836 What is most troubling about this problem, is that on the CLIENT, it shows that it completed successfully with "Ready to report". It doesn't even show that it resulted in an error at all! It is only after checking my Tasks that I see that it was Client error. |
J.S. Joined: 25 Jul 12 Posts: 3 Credit: 845 RAC: 0 |
Hi, I just became aware of an issue on my system: the BOINC client does not save the state of the tasks when I restart my netbook. So when I had to restart my system, the tasks were reset and started back from zero. Is there anything I can do to prevent this? The usual tasks run six to seven hours on my system and I'm running two at a time, so resetting it is like losing a whole day of CPU time I donated. :-( I'm running BOINC 7.0.28 on Lubuntu Linux 3.2.0-31. My machine is a Samsung N150 netbook, if that matters. |
Polian Joined: 21 Sep 05 Posts: 152 Credit: 10,141,266 RAC: 0 |
Hi, I just became aware of an issue on my system: the BOINC client does not save the state of the tasks when I restart my netbook. So when I had to restart my system, the tasks were reset and started back from zero. Although I'm not that familiar with netbooks, I believe that their processing power is generally pretty limited. Your tasks may simply not be making it to the first "checkpoint", or save point, therefore they start over from the beginning. |
BarryAZ Joined: 27 Dec 05 Posts: 153 Credit: 30,843,285 RAC: 0 |
Looks like the folks back at the shop had something go bump earlier today. The entire site was unreachable for a while. Then I could access the site, but the server status page showed a number of processes not running. Now the server status page is green, but uploads and reporting are still not working. Waiting on a Baker Lab update on this. |
P . P . L . Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
I'm not able to upload either. Fri 28 Sep 2012 09:14:09 EST Internet access OK - project servers may be temporarily down. Fri 28 Sep 2012 09:14:09 EST rosetta@home Temporarily failed upload of rb_09_26_33702_63856_t000__casp9_ben_IGNORE_THE_REST_03_08_60229_13_0_0: connect() failed Fri 28 Sep 2012 09:14:09 EST rosetta@home Backing off 1 min 0 sec on upload of rb_09_26_33702_63856_t000__casp9_ben_IGNORE_THE_REST_03_08_60229_13_0_0 Fri 28 Sep 2012 09:14:09 EST rosetta@home Temporarily failed upload of rb_09_26_33707_63866_h003__casp9_ben_IGNORE_THE_REST_05_18_60239_11_0_0: connect() failed Fri 28 Sep 2012 09:14:09 EST rosetta@home Backing off 1 min 0 sec on upload of rb_09_26_33707_63866_h003__casp9_ben_IGNORE_THE_REST_05_18_60239_11_0_0 At least the web page works. ;) |
KWSN THE Holy Hand Grenade! Joined: 3 May 07 Posts: 5 Credit: 2,542,452 RAC: 0 |
Agreed : uploading and any type of reporting (reporting a task, requesting new work, or just a stats update) are off-line! |
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,533,485 RAC: 10,732 |
Arghh!!! I set all my projects on my laptop to No New Tasks to clear an issue here, now can't grab new ones. Bad timing. No downtime all year then I get hit at the worst moment :( Oh well, let's give WCG a bit of love for a change... |
P . P . L . Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
System still broken, NO uploads or downloads! |
harlequin Joined: 29 Dec 06 Posts: 1 Credit: 655,030 RAC: 0 |
Hello! Same here: 28.09.2012 11:18:10 | rosetta@home | Reporting 15 completed tasks, not requesting new tasks 28.09.2012 11:18:11 | | Project communication failed: attempting access to reference site 28.09.2012 11:18:11 | rosetta@home | Scheduler request failed: Couldn't connect to server 28.09.2012 11:18:12 | | Internet access OK - project servers may be temporarily down. |
J.S. Joined: 25 Jul 12 Posts: 3 Credit: 845 RAC: 0 |
Although I'm not that familiar with netbooks, I believe that their processing power is generally pretty limited. Granted, but irrelevant for my question. Your tasks may simply not be making it to the first "checkpoint", or save point, therefore they start over from the beginning. Not that I know very much about the checkpoints you are referring to, but a task that has run for, say, five hours straight and indicates a progress of about 75% should have reached some kind of checkpoint by then - otherwise, the software would be fundamentally broken. Can anybody else help? |
Bockovic Joined: 22 Feb 10 Posts: 1 Credit: 3,975 RAC: 0 |
Your tasks may simply not be making it to the first "checkpoint", or save point, therefore they start over from the beginning. Not that I know very much about the checkpoints you are referring to, but a task that has run for, say, five hours straight and indicates a progress of about 75% should have reached some kind of checkpoint by then - otherwise, the software would be fundamentally broken. Can anybody else help? Maybe you should try to play a little bit with the settings? Go to Preferences > Disk and memory usage. Then check/uncheck "Leave applications in memory while suspended" and try to decrease the time for "Tasks checkpoints to disk every:" (I put 600 seconds here). Maybe there is a problem writing to disk if the partition is closed, or BOINC does not see it properly, or something else (I am not familiar with Linux). Try changing the default folder for tasks. Be careful if you do that, and monitor the behaviour of the BOINC client. I changed my BOINC folder to the D: partition, and in some older version of BOINC it would get stuck at "Reconnecting to client", as if it could not access 127.0.0.1. Very strange behaviour. When I let it install to the default C: drive everything worked fine, but there was no room on that partition because Windows is there, of course, so the PC crashed. Solved it with a newer version of BOINC :) Hope this helps someone ;) Bockovic |
J.S. Joined: 25 Jul 12 Posts: 3 Credit: 845 RAC: 0 |
Maybe you should try to play a little bit with the settings? Go to Preferences > Disk and memory usage. Then check/uncheck "Leave applications in memory while suspended" and try to decrease the time for "Tasks checkpoints to disk every:" (I put 600 seconds here). (...) My default there is 60 seconds. The application is supposed to stay in memory while suspended (box is checked). I often also suspend the application manually, so it can finish writing to disk before I suspend/hibernate the machine. Is there a place where I can read more about the "reset conditions" of a task - under which circumstances can this happen? Oh, by the way, yes, I tried turning it off and on again. I also uninstalled and reinstalled. ;-) |
TJ Joined: 29 Mar 09 Posts: 127 Credit: 4,799,890 RAC: 0 |
Maybe you should try to play a little bit with the settings? Go to Preferences > Disk and memory usage. Then check/uncheck "Leave applications in memory while suspended" and try to decrease the time for "Tasks checkpoints to disk every:" (I put 600 seconds here). (...) This won't help you, but I have read somewhere, for another project, that hibernating a PC (under Windows) will result in strange behavior of BOINC and that tasks error out eventually. Greetings, TJ. |
Doug_Hood Joined: 15 Dec 05 Posts: 2 Credit: 3,416,526 RAC: 0 |
Hello! Same here. I have about 50 work units trying to report for the last 15+ hours |
Bill Kozorra Joined: 25 Jan 11 Posts: 5 Credit: 86,550,972 RAC: 0 |
Is anyone having problems uploading results? Not one of my computers will upload. My Internet access is ok. I noticed that the server status page says that everything is ok. Still nothing will upload. |
TJ Joined: 29 Mar 09 Posts: 127 Credit: 4,799,890 RAC: 0 |
Is anyone having problems uploading results? Not one of my computers will upload. My Internet access is ok. I noticed that the server status page says that everything is ok. Still nothing will upload. Yes, see my thread "upload problem". No new work either. Greetings, TJ. |
Chuck Joined: 13 Aug 10 Posts: 3 Credit: 3,297 RAC: 0 |
Is anyone having problems uploading results? Not one of my computers will upload. My Internet access is ok. I noticed that the server status page says that everything is ok. Still nothing will upload. Getting these same uploading results myself: nothing from rosetta@home will upload, it just keeps saying project backoff with a timer delay of anywhere from 2 minutes to 2 hours or more. |
Chuck Joined: 13 Aug 10 Posts: 3 Credit: 3,297 RAC: 0 |
Hi Snagletooth. Please remember, David: even 2 out of 9 completed WUs get us closer to a cure. Every WU done is another step closer. Don't give up. You can always run BOINC in a Windows OS. Keep on crunching. |
BarryAZ Joined: 27 Dec 05 Posts: 153 Credit: 30,843,285 RAC: 0 |
While, on occasion, this project has encountered connectivity problems (perhaps a few times a year), that is relatively rare in the distributed processing world. Somewhat more troublesome is -- and this is VERY rare for Rosetta -- an informational blackout. As we move toward a full day of the outage, we've not seen any information from the project folks, not even an acknowledgement of what folks are reporting here. It may well be that they are aware of the problem and are working on it, but at this juncture, for the community here, it is all speculation. I am very much looking forward to at least an acknowledgement that the folks back at the lab are aware there is a problem. |
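
Editor's sketch, relating to the checkpoint discussion earlier in the thread: the idea that a task restarted before its first "checkpoint" starts over from the beginning can be illustrated with a few lines of Python. This is a generic illustration only, not how BOINC or Rosetta actually store their state. The function names (`load_checkpoint`, `save_checkpoint`, `run_task`), the JSON file layout, and the step counts are all hypothetical.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved step count, or 0 if no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["step"]


def save_checkpoint(path, step):
    """Write to a temp file, then rename, so a crash mid-write cannot corrupt the checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step}, f)
    os.replace(tmp, path)


def run_task(path, total_steps, checkpoint_every, stop_after=None):
    """Resume from the last checkpoint and run until done.

    stop_after (hypothetical) simulates a shutdown after that many steps;
    any work done since the last checkpoint is then lost.
    Returns the step reached when the function returns.
    """
    step = load_checkpoint(path)
    executed = 0
    while step < total_steps:
        if stop_after is not None and executed >= stop_after:
            return step  # simulated restart: nothing is saved here
        step += 1  # stand-in for one unit of real work
        executed += 1
        if step % checkpoint_every == 0:
            save_checkpoint(path, step)
    save_checkpoint(path, step)
    return step


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        ckpt = os.path.join(d, "task.ckpt")
        reached = run_task(ckpt, total_steps=100, checkpoint_every=30, stop_after=75)
        print("stopped at step", reached, "- would resume from", load_checkpoint(ckpt))
```

In this toy model, a task stopped at 75% with a checkpoint every 30 steps resumes from the last saved step rather than from zero. If the checkpoint interval is longer than the time the task gets to run, no checkpoint file is ever written and the task restarts from the beginning, which matches the symptom described above.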
©2024 University of Washington
https://www.bakerlab.org