Problems and Technical Issues with Rosetta@home

Ananas
Joined: 1 Jan 06
Posts: 232
Credit: 752,471
RAC: 0
Message 77137 - Posted: 30 Jul 2014, 14:42:59 UTC

The scheduler is back, downloads are working (so we can get new work), uploads are still stuck.
googloo
Joined: 15 Sep 06
Posts: 133
Credit: 22,783,789
RAC: 5,547
Message 77138 - Posted: 30 Jul 2014, 15:46:58 UTC - in response to Message 77137.  

The scheduler is back, downloads are working (so we can get new work), uploads are still stuck.


Not working for me: "too many uploads in progress."
krypton
Volunteer moderator
Project developer
Project scientist
Joined: 16 Nov 11
Posts: 108
Credit: 2,164,309
RAC: 0
Message 77139 - Posted: 30 Jul 2014, 16:16:26 UTC

We are beginning to get jobs back on our end!

I'll update here as soon as I know more =]

Thanks for all the reports/logs (I've forwarded the logs to our server admin guy; these should be helpful in debugging the issue).
Usuario1_S
Joined: 24 Mar 14
Posts: 92
Credit: 3,059,705
RAC: 0
Message 77140 - Posted: 30 Jul 2014, 17:33:16 UTC

My time zone is UTC-4:
30/07/2014 13:24:39 | | Project communication failed: attempting access to reference site
30/07/2014 13:24:42 | | Internet access OK - project servers may be temporarily down.
30/07/2014 13:24:52 | | Suspending network activity - user request
Keith E. Laidig
Volunteer moderator
Project developer
Joined: 1 Jul 05
Posts: 154
Credit: 117,189,961
RAC: 0
Message 77142 - Posted: 30 Jul 2014, 18:03:10 UTC

This appears to be an upstream issue for us. Something has changed with the campus routing to 'throttle'
traffic into our servers. We're working with the UW's Network Operations team to hunt down the cause....
-KEL

Miklos M
Joined: 8 Dec 13
Posts: 29
Credit: 5,277,251
RAC: 0
Message 77143 - Posted: 30 Jul 2014, 18:07:34 UTC

Not uploading at all, and I'm concerned that work units will time out soon.
[TA]Assimilator1
Joined: 9 May 07
Posts: 7
Credit: 5,399,250
RAC: 0
Message 77145 - Posted: 30 Jul 2014, 18:46:02 UTC

Same problem here, can't upload atm.

And yeah, there should be a post about this in the front-page news section.
Team AnandTech - SETI@H, Muon1 DPAD, F@H, MW@H, A@H, LHC@H, POGS, R@H, DHEP, CPDN

Main - Ryzen 5 3600, TR Ultima90, MSI B450, 32GB DDR4 3200, RX580 8GB, Seasonic Prime PX-550
2nd - i7 4930k @4.1GHz, TR Ultra120 E, 16GB DDR3 1866, HD7870 XT 3GB(DS
Sascha Becker
Joined: 1 Oct 05
Posts: 2
Credit: 4,318,835
RAC: 0
Message 77146 - Posted: 30 Jul 2014, 18:49:33 UTC

I have had a problem downloading and uploading WUs for about 2 days. I currently have 13 finished WUs, but they will not upload over either my own internet connection or my parents'. I have already tried two different routers, and I have temporarily disabled the antivirus and firewalls, but there is still no upload or download of WUs. Can someone help me? I only have this problem with Rosetta@home; everything else works. Sincerely, oetker201 / Sascha Becker.
Daniel Kohn
Joined: 30 Dec 05
Posts: 18
Credit: 2,899,939
RAC: 0
Message 77147 - Posted: 30 Jul 2014, 19:00:47 UTC

I can't wait to see the TeraFLOP estimate spike when all the completed work units start pouring in after this is resolved.
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 77148 - Posted: 30 Jul 2014, 19:05:08 UTC - in response to Message 77142.  

This appears to be an upstream issue for us. Something has changed with the campus routing to 'throttle'
traffic into our servers. We're working with the UW's Network Operations team to hunt down the cause....
-KEL



I knew we would hear from KEL soon. He just has to hear about the problem first.
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 77149 - Posted: 30 Jul 2014, 19:08:19 UTC - in response to Message 77137.  

The scheduler is back, downloads are working (so we can get new work), uploads are still stuck.



Not for me, at least right now. All times are CEST (GMT+2):

7/30/2014 9:05:48 PM | rosetta@home | update requested by user
7/30/2014 9:05:51 PM | rosetta@home | Sending scheduler request: Requested by user.
7/30/2014 9:05:51 PM | rosetta@home | Requesting new tasks for CPU and NVIDIA
7/30/2014 9:06:14 PM | | Project communication failed: attempting access to reference site
7/30/2014 9:06:14 PM | rosetta@home | Scheduler request failed: Couldn't connect to server
7/30/2014 9:06:16 PM | | Internet access OK - project servers may be temporarily down.
Miklos M
Joined: 8 Dec 13
Posts: 29
Credit: 5,277,251
RAC: 0
Message 77150 - Posted: 30 Jul 2014, 19:25:45 UTC

7/30/2014 3:23:58 PM | rosetta@home | Reporting 1 completed tasks
7/30/2014 3:23:58 PM | rosetta@home | Not requesting tasks: too many uploads in progress
7/30/2014 3:24:21 PM | rosetta@home | Scheduler request failed: Couldn't connect to server
7/30/2014 3:24:29 PM | | Project communication failed: attempting access to reference site
7/30/2014 3:24:31 PM | | Internet access OK - project servers may be temporarily down.
[FI] OIKARINEN
Joined: 16 Nov 13
Posts: 6
Credit: 131,483
RAC: 0
Message 77151 - Posted: 30 Jul 2014, 21:42:17 UTC

Nice to know that downloads are available again, although uploads are still disabled due to the technical issues (I guess). I haven't been able to upload for 2 days.

Thu 31 July 2014 00.17.25 | rosetta@home | Started upload of tube9_5_A_tube9_5_B_patchdock_split_05_140726_SAVE_ALL_OUT__179846_70_0_0
Thu 31 July 2014 00.17.25 | rosetta@home | Started upload of tj_7_20_2helix_highRadius_X16_BAB_14_GB_8_r_fd_fragments_abinitio_SAVE_ALL_OUT_179220_884_0_0
Thu 31 July 2014 00.20.05 | | Project communication failed: attempting access to reference site
Thu 31 July 2014 00.20.05 | rosetta@home | Temporarily failed upload of tube9_5_A_tube9_5_B_patchdock_split_05_140726_SAVE_ALL_OUT__179846_70_0_0: transient HTTP error
Thu 31 July 2014 00.20.05 | rosetta@home | Backing off 00:06:15 on upload of tube9_5_A_tube9_5_B_patchdock_split_05_140726_SAVE_ALL_OUT__179846_70_0_0
Thu 31 July 2014 00.20.05 | rosetta@home | Temporarily failed upload of tj_7_20_2helix_highRadius_X16_BAB_14_GB_8_r_fd_fragments_abinitio_SAVE_ALL_OUT_179220_884_0_0: transient HTTP error
Thu 31 July 2014 00.20.05 | rosetta@home | Backing off 00:07:34 on upload of tj_7_20_2helix_highRadius_X16_BAB_14_GB_8_r_fd_fragments_abinitio_SAVE_ALL_OUT_179220_884_0_0
Thu 31 July 2014 00.20.05 | rosetta@home | Started upload of tube9_2_A_tube9_2_B_patchdock_split_01_140726_SAVE_ALL_OUT__179843_128_0_0
Thu 31 July 2014 00.20.05 | rosetta@home | Started upload of tj_7_11_2helix_highRadius_X18_GB_16_DDD_3_e_fa_fragments_abinitio_SAVE_ALL_OUT_174853_768_0_0
Thu 31 July 2014 00.20.08 | | Internet access OK - project servers may be temporarily down.
Thu 31 July 2014 00.22.10 | | Suspending network activity - user request

Life is too short to live concerned about its mysteries.
Rockhound57
Joined: 2 Mar 11
Posts: 16
Credit: 1,181,412
RAC: 0
Message 77152 - Posted: 30 Jul 2014, 22:02:47 UTC - in response to Message 77151.  

Not for me. No uploads, no downloads, and I have only 2 work units left.
Ananas
Joined: 1 Jan 06
Posts: 232
Credit: 752,471
RAC: 0
Message 77153 - Posted: 30 Jul 2014, 22:03:14 UTC - in response to Message 77150.  

... Scheduler request failed: Couldn't connect to server ...

Same here now, but I received three results on 30 July: one at ~07:00 and two at ~11:30 (UTC).
Just Jake
Joined: 18 Nov 06
Posts: 6
Credit: 18,223,581
RAC: 0
Message 77154 - Posted: 30 Jul 2014, 22:31:30 UTC

Uploads failing. I reset the project yesterday, throwing away 20+ good results, and got 20+ new work units. They're all done now and still won't upload.

554 7/30/2014 5:26:38 PM Internet access OK - project servers may be temporarily down.
553 7/30/2014 5:26:37 PM Project communication failed: attempting access to reference site
552 rosetta@home 7/30/2014 5:26:34 PM Backing off 00:10:54 on upload of tj_7_11_2helix_highRadius_X16_BAB_14_GB_2_j_fd_fragments_abinitio_SAVE_ALL_OUT_174728_891_0_0
551 rosetta@home 7/30/2014 5:26:34 PM Temporarily failed upload of tj_7_11_2helix_highRadius_X16_BAB_14_GB_2_j_fd_fragments_abinitio_SAVE_ALL_OUT_174728_891_0_0: connect() failed



robertmiles
Joined: 16 Jun 08
Posts: 1233
Credit: 14,324,975
RAC: 3,637
Message 77155 - Posted: 30 Jul 2014, 23:02:51 UTC
Last modified: 30 Jul 2014, 23:04:52 UTC

Finally, some response. Scroll down on the Rosetta@Home home page to see it.

I got one more workunit from the efforts so far. It's finished and won't upload.
krypton
Volunteer moderator
Project developer
Project scientist
Joined: 16 Nov 11
Posts: 108
Credit: 2,164,309
RAC: 0
Message 77157 - Posted: 30 Jul 2014, 23:16:12 UTC - in response to Message 77155.  
Last modified: 30 Jul 2014, 23:16:29 UTC

Finally, some response. Scroll down on the Rosetta@Home home page to see it.

I got one more workunit from the efforts so far. It's finished and won't upload.


We've been responding since yesterday morning =P

We are hoping that the upload issue will get resolved today or tomorrow (then everything will upload). If not, I'm gonna go through and stop any more jobs from being distributed.
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 77160 - Posted: 30 Jul 2014, 23:45:34 UTC - in response to Message 77157.  

Finally, some response. Scroll down on the Rosetta@Home home page to see it.

I got one more workunit from the efforts so far. It's finished and won't upload.


We've been responding since yesterday morning =P

We are hoping that the upload issue will get resolved today or tomorrow (then everything will upload). If not, I'm gonna go through and stop any more jobs from being distributed.


Just let KEL go yell at the computing center for a while, then things will get back in order. Pretty sure if he threatens to turn the other machines into food replicators, he will get some action.


shanen
Joined: 16 Apr 14
Posts: 195
Credit: 12,662,308
RAC: 0
Message 77161 - Posted: 31 Jul 2014, 0:42:08 UTC - in response to Message 77160.  



Just let KEL go yell at the computing center for a while, then things will get back in order. Pretty sure if he threatens to turn the other machines into food replicators, he will get some action.



I think a lot of it goes back to the money, and therefore much of the problem is their own. I don't believe I am the only person who noticed all those 80-Meg "Computation Error" tasks that the system was broadcasting for many months before (and at least a couple of months after) I commented on them in these discussions.

After my initial comments, I did some simple-minded testing on all of my machines. Whenever I noticed that sub-project in any queue, I briefly suspended the tasks ahead of it to let it run, and it ALWAYS crashed. On slower machines it took a bit longer, up to several minutes, but it never produced any valid results.

From that perspective of null results, I feel like congratulating the university for their tolerance.

Anyway, no sign of improvement in the situation, at least from the perspective of Japan. In case anyone thinks the problems have gone away, think again. My own priority remains getting fresh tasks to work on, and I believe that almost all of my computers have exhausted their input queues.



©2024 University of Washington
https://www.bakerlab.org