Message boards : Number crunching : Stuck on Uploading
Author | Message |
---|---|
Luigi R. Joined: 7 Feb 14 Posts: 39 Credit: 2,045,527 RAC: 0 |
Changing hosts.txt is a stupid idea! There should be another way. No, it's not stupid. It worked fine for me. I uploaded 14 4-hour tasks and they got validated; that's 56 hours of computing. On Linux the file is /etc/hosts. Replacing "srv1.bakerlab.org" in client_state.xml is useless. Whenever I restart the BOINC Manager, it is there again. I can't find where BOINC restores it from. Yep, that is what I was worried about. Thanks to you lazy guys I have now lost two WUs! 24 hours of work for nothing. See above. |
Luigi R. Joined: 7 Feb 14 Posts: 39 Credit: 2,045,527 RAC: 0 |
..., we're all on the same team here man :) I share this point, but, you know, BOINC has plenty of medical projects. If Rosetta@home is not always reliable or doesn't meet your demands, you can choose another project to run alongside it or to replace it. For example, I don't run Rosetta@home tasks very much because of inefficiency, but that is off-topic. Anyway, the 3.73 app seems more CPU-intensive to me. |
sinspin Joined: 30 Jan 06 Posts: 29 Credit: 6,574,585 RAC: 0 |
It is not our problem that srv1 is not responding. And it is not on us to implement a solution for that, even more so if the solution is so damn stupid. 99% of the users will forget that they made this change and will later have other problems caused only by this solution! I think it is much better to change the server redirection on your side. Then you can spend all the time you need to find out what is wrong, especially since the weekend is at the doorstep. |
Luigi R. Joined: 7 Feb 14 Posts: 39 Credit: 2,045,527 RAC: 0 |
99% of the users will forget that they made this change and will later have other problems caused only by this solution! Maybe not. Both servers will be online, so there may be no difference in the future. Forgetful users could edit the hosts file only for the upload, then comment that line out right after the upload finishes. I agree we should not have to do any special configuration. We should only have to run BOINC. It's not our problem, but we want our work to get validated too. |
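For anyone who wants to undo the temporary entry without hand-editing, here is a minimal Python sketch of the "comment that line out" step. It only transforms text; the function name `comment_out_host` and the sample contents are hypothetical, and writing the result back to the real hosts file (which needs administrator rights) is left to the reader.

```python
def comment_out_host(hosts_text: str, hostname: str) -> str:
    """Return hosts_text with every active line that maps `hostname` commented out."""
    out = []
    for line in hosts_text.splitlines():
        # Drop anything after '#': that part is already a comment.
        fields = line.split("#", 1)[0].split()
        # fields[0] is the IP address; the remaining fields are hostnames/aliases.
        if len(fields) >= 2 and hostname in fields[1:]:
            out.append("# " + line)
        else:
            out.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(out) + ("\n" if hosts_text.endswith("\n") else "")


# Hypothetical hosts file contents, using the entry quoted in this thread.
sample = "127.0.0.1 localhost\n128.95.160.145 srv1.bakerlab.org\n"
print(comment_out_host(sample, "srv1.bakerlab.org"))
```

Lines that are already commented out are left untouched, so running it twice does no harm.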
vb Joined: 31 Dec 14 Posts: 1 Credit: 1,587,957 RAC: 0 |
Temporary fix in place (modified hosts file). Thanks, whoever proposed it! |
AMDave Joined: 16 Dec 05 Posts: 35 Credit: 12,576,896 RAC: 0 |
My suspicions confirmed; No go for me. Here is the line I added: 128.95.160.145 srv1.bakerlab.org, in accordance with these instructions. Here is the Event log:
6/24/2016 4:48:10 PM | rosetta@home | update requested by user
6/24/2016 4:48:15 PM | rosetta@home | Sending scheduler request: Requested by user.
6/24/2016 4:48:15 PM | rosetta@home | Not requesting tasks: too many uploads in progress
6/24/2016 4:48:17 PM | rosetta@home | Scheduler request completed |
Timo Joined: 9 Jan 12 Posts: 185 Credit: 45,649,459 RAC: 0 |
My suspicions confirmed; This fix is not meant to help with requesting new tasks, but rather to get your uploads going again. Judging from the snippet of your log file that you shared, you were attempting to fetch new work. Go to your Transfers tab, select one of your uploading tasks that is stuck on 'Retry in ....', and click Retry Now; it should work. Also, if you want to check that your hosts modification worked, you can try visiting http://srv1.bakerlab.org/ and it should bring you to the Rosetta@Home homepage :) Alternatively, you can ping srv1.bakerlab.org. Cheers. 38 cores crunching for R@H on behalf of cancercomputer.org, a non-profit supporting High Performance Computing in Cancer Research |
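The same check can be scripted. Below is a minimal Python sketch, assuming the system resolver consults the hosts file (the usual default, but not guaranteed on every setup). The function name `override_active` is hypothetical; the hostname and address are the ones quoted in this thread.

```python
import socket
from typing import Callable

HOSTNAME = "srv1.bakerlab.org"
EXPECTED_IP = "128.95.160.145"  # address quoted in this thread


def override_active(
    hostname: str = HOSTNAME,
    expected_ip: str = EXPECTED_IP,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """Return True if `hostname` currently resolves to `expected_ip`."""
    try:
        return resolve(hostname) == expected_ip
    except OSError:
        # socket.gaierror is a subclass of OSError: the name did not resolve.
        return False
```

Calling `override_active()` with no arguments performs a real lookup. The `resolve` parameter exists only so the logic can be exercised without network access.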
AMDave Joined: 16 Dec 05 Posts: 35 Credit: 12,576,896 RAC: 0 |
This fix is not to help with Requesting new tasks, but rather to help get your uploads going again. Judging from the snippet of your log file that you shared, you were attempting to fetch new work. I went to the Tasks tab and clicked Update. I thought that was an all-inclusive request. Retry Now worked; all "uploading" WUs are gone. I just need to remember to undo the mod to the hosts file when the problem is fixed. Thank you, Timo |
Oscar Järkvik Joined: 16 Jun 16 Posts: 1 Credit: 1,651,223 RAC: 0 |
I am running BOINC on headless Linux machines. After updating /etc/hosts with 128.95.160.145 srv1.bakerlab.org they still weren't able to upload recent work. I control BOINC through boinctui but couldn't find a way to force a retry of the uploads. The solution I found was to restart BOINC after the hosts file edit; the uploads were then retried and completed successfully. On a recent Debian system or derivative (Ubuntu and more), BOINC is restarted from the terminal by running sudo systemctl restart boinc. |
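On a headless box it can help to confirm that the entry really is in the hosts file before restarting anything. Here is a minimal Python sketch that looks a name up in hosts-file text. The function name `lookup_in_hosts` is hypothetical, and returning the first matching line is an assumption about how the resolver treats duplicates.

```python
from typing import Optional


def lookup_in_hosts(hosts_text: str, hostname: str) -> Optional[str]:
    """Return the IP mapped to `hostname` in hosts-file text, or None if absent."""
    wanted = hostname.lower()
    for line in hosts_text.splitlines():
        # Ignore comments, then split into: IP, name, optional aliases.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and wanted in (f.lower() for f in fields[1:]):
            return fields[0]
    return None


# Hypothetical file contents; on Linux the real text could be read with
# open("/etc/hosts").read().
sample = "127.0.0.1 localhost\n128.95.160.145 srv1.bakerlab.org\n"
print(lookup_in_hosts(sample, "srv1.bakerlab.org"))
```

A result of None means the line is missing or commented out, in which case restarting BOINC would not be expected to change anything.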
hjdghjdghjghjjggh Joined: 10 May 16 Posts: 7 Credit: 9,749 RAC: 0 |
Without making any changes to any hosts files and simply waiting (While still crunching at least), all my completed tasks finally uploaded and I got credit for them. Looks like they fixed it. |
Luigi R. Joined: 7 Feb 14 Posts: 39 Credit: 2,045,527 RAC: 0 |
I am running BOINC on headless Linux machines. After updating /etc/hosts with 128.95.160.145 srv1.bakerlab.org they still weren't able to upload recent work. I control BOINC through boinctui but couldn't find a way to force a retry of the uploads. The solution I found was to restart BOINC after the hosts file edit; the uploads were then retried and completed successfully. On a recent Debian system or derivative (Ubuntu and more), BOINC is restarted from the terminal by running sudo systemctl restart boinc. Yes, on Xubuntu BOINC did not need restarting for the hosts modification to take effect, and I'm not running BOINC as a service. You could have tried $boinccmd --get_file_transfers to list the transfers, then $boinccmd --file_transfer https://boinc.bakerlab.org/rosetta $filename retry, where $boinccmd is your boinccmd command including its path and $filename is the result you want to try to re-upload. Or simply run $boinccmd --set_network_mode never followed by $boinccmd --set_network_mode auto, and the retry should be automatic. See boinccmd. Without making any changes to any hosts files and simply waiting (While still crunching at least), all my completed tasks finally uploaded and I got credit for them. Looks like they fixed it. That solution was recommended for people with near-deadline tasks. Some robetta (rb_*) tasks were expiring on 24/06. |
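The boinccmd calls above can be wrapped in a small script. This is a minimal Python sketch, not a tested tool: it only uses the options named in the post (--file_transfer ... retry and --set_network_mode), the helper names are hypothetical, and the path to boinccmd and the file name are placeholders you must supply.

```python
import subprocess
from typing import List

PROJECT_URL = "https://boinc.bakerlab.org/rosetta"  # URL given in the post above


def retry_transfer_cmd(boinccmd: str, filename: str,
                       project_url: str = PROJECT_URL) -> List[str]:
    """Build the command that asks BOINC to retry one stuck transfer."""
    return [boinccmd, "--file_transfer", project_url, filename, "retry"]


def network_toggle_cmds(boinccmd: str) -> List[List[str]]:
    """Build the never/auto pair that should make BOINC retry on its own."""
    return [
        [boinccmd, "--set_network_mode", "never"],
        [boinccmd, "--set_network_mode", "auto"],
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

For example, `run_all(network_toggle_cmds("/usr/bin/boinccmd"))` would issue both mode changes, assuming boinccmd lives at that path and can reach the running client.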
sinspin Joined: 30 Jan 06 Posts: 29 Credit: 6,574,585 RAC: 0 |
It seems that the problem is fixed. Thank you guys very much! |
entigy Joined: 2 Nov 05 Posts: 5 Credit: 990,830 RAC: 0 |
This. Again.
28/06/2016 07:45:22 | rosetta@home | Computation for task gr062216_EEHEE_rd3_1211_fragments_fold_SAVE_ALL_OUT_384881_10_0 finished
28/06/2016 07:45:24 | rosetta@home | Started upload of gr062216_EEHEE_rd3_1211_fragments_fold_SAVE_ALL_OUT_384881_10_0_0
28/06/2016 07:46:07 | rosetta@home | Temporarily failed upload of gr062216_EEHEE_rd3_1211_fragments_fold_SAVE_ALL_OUT_384881_10_0_0: transient HTTP error
28/06/2016 07:46:07 | rosetta@home | Backing off 00:02:21 on upload of gr062216_EEHEE_rd3_1211_fragments_fold_SAVE_ALL_OUT_384881_10_0_0
28/06/2016 07:47:26 | rosetta@home | Started upload of gr062216_EEHEE_rd3_1211_fragments_fold_SAVE_ALL_OUT_384881_10_0_0
28/06/2016 07:48:13 | rosetta@home | Temporarily failed upload of gr062216_EEHEE_rd3_1211_fragments_fold_SAVE_ALL_OUT_384881_10_0_0: transient HTTP error
28/06/2016 07:48:13 | rosetta@home | Backing off 00:04:05 on upload of gr062216_EEHEE_rd3_1211_fragments_fold_SAVE_ALL_OUT_384881_10_0_0 |
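To see how the retry delay grows, the "Backing off" entries can be pulled out of a saved Event log. Here is a minimal Python sketch, assuming the delay is always printed as HH:MM:SS as in the log above; the function name `upload_backoffs` is hypothetical.

```python
import re
from typing import List, Tuple

# Matches e.g. "Backing off 00:02:21 on upload of <file name>".
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")


def upload_backoffs(log_text: str) -> List[Tuple[str, int]]:
    """Return (file name, back-off in seconds) for each back-off entry, in order."""
    results = []
    for match in BACKOFF_RE.finditer(log_text):
        hours, minutes, seconds = (int(g) for g in match.groups()[:3])
        results.append((match.group(4), hours * 3600 + minutes * 60 + seconds))
    return results


# Two entries copied from the log in the post above.
sample_log = (
    "28/06/2016 07:46:07 | rosetta@home | Backing off 00:02:21 on upload of "
    "gr062216_EEHEE_rd3_1211_fragments_fold_SAVE_ALL_OUT_384881_10_0_0\n"
    "28/06/2016 07:48:13 | rosetta@home | Backing off 00:04:05 on upload of "
    "gr062216_EEHEE_rd3_1211_fragments_fold_SAVE_ALL_OUT_384881_10_0_0\n"
)
for name, delay in upload_backoffs(sample_log):
    print(name, delay)
```

For the two entries shown, the delays are 141 and 245 seconds, so the client waits longer after each failed attempt.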
entigy Joined: 2 Nov 05 Posts: 5 Credit: 990,830 RAC: 0 |
And again.
04/08/2016 08:14:54 | rosetta@home | Started upload of FFD__300b59433e13d029e07395f92f38637f_abinitioDocking_16_08_09_53_15_globalDocking_7_SAVE_ALL_OUT_406840_18_0_0
04/08/2016 08:15:16 | rosetta@home | Temporarily failed upload of FFD__300b59433e13d029e07395f92f38637f_abinitioDocking_16_08_09_53_15_globalDocking_7_SAVE_ALL_OUT_406840_18_0_0: transient HTTP error
04/08/2016 08:15:16 | rosetta@home | Backing off 00:41:38 on upload of FFD__300b59433e13d029e07395f92f38637f_abinitioDocking_16_08_09_53_15_globalDocking_7_SAVE_ALL_OUT_406840_18_0_0 |
David E K Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
And again. Thanks for alerting us to this. I'll see what's going on. |
shanen Joined: 16 Apr 14 Posts: 195 Credit: 12,662,308 RAC: 0 |
Why revive this thread? Anyway, the server status is showing everything is disabled except for one process. Not sure why the "Disabled" boxes are green, since it obviously isn't an all-systems-go status, but... Seems like rosetta@home is a good project if you don't care much. Obviously I've given up caring, but I still wonder if the shadow extends to the results. I'd be a bit troubled if someone asked me to review a research paper under the condition of not caring much about the calculated results. #1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech) |
David E K Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
Why revive this thread? Anyway, the server status is showing everything is disabled except for one process. Not sure why the "Disabled" boxes are green, since it obviously isn't an all-systems-go status, but... Coincidentally, one of our servers had a very high load, so we looked into it and decided to reboot the server, which is why the project was disabled momentarily. It seems OK now, but I have to check further. This coincided with a researcher submitting over 20,000 unique jobs all at once, which was the likely culprit for the load, given the enormous number of files associated with so many unique jobs. |
David E K Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
Why revive this thread? Anyway, the server status is showing everything is disabled except for one process. Not sure why the "Disabled" boxes are green, since it obviously isn't an all-systems-go status, but... I'll ask the researcher to describe her work here on the forum. As more of our research may involve this kind of huge number of unique jobs for protein design, we will have to look into our hardware options for the upgrade we are planning to support this. |
Sid Celery Joined: 11 Feb 08 Posts: 2125 Credit: 41,249,734 RAC: 8,235 |
Why revive this thread? Anyway, the server status is showing everything is disabled except for one process. Not sure why the "Disabled" boxes are green, since it obviously isn't an all-systems-go status, but... This issue has returned for just two of my tasks. Later tasks are uploading fine, but two persist in being unable to upload. |
googloo Joined: 15 Sep 06 Posts: 133 Credit: 22,722,686 RAC: 3,377 |
Three tasks stuck. |
©2024 University of Washington
https://www.bakerlab.org