Stalled downloads

Peter Hucker
Joined: 12 Aug 06
Posts: 730
Credit: 5,404,916
RAC: 0
Message 91928 - Posted: 10 Mar 2020, 22:12:47 UTC - in response to Message 91926.  

This is odd, as I've not had any stuck ones for over a fortnight on 4 different computers. There must be a reason they fixed it for me and not you :-) It just suddenly started working, before then every machine stuck once a day. I don't think I changed anything that would have fixed it.

I have seen that too. I wonder if it is determined by the number of machines (that is, cores) you have on Rosetta?
As I go down, I will see. It could be that their server chokes up, but would be surprised if it starts working permanently.


I have increased the priority of Rosetta, so it's running almost all the time, but I don't think it coincides with it working better. I'm sure it improved before I did that, or I wouldn't have given it more.
ID: 91928 · Rating: 0
Mad_Max
Joined: 31 Dec 09
Posts: 197
Credit: 17,869,114
RAC: 8,606
Message 91929 - Posted: 10 Mar 2020, 22:24:27 UTC - in response to Message 91927.  
Last modified: 10 Mar 2020, 22:25:19 UTC

Yep, I have this file stuck too: boinc.bakerlab.org/rosetta/download/113/twc_method_msd_cpp_c3212_9mer_gb_000182_msd.zip

It does not download via a browser right now either. It always stops at 2.5/3 KB.
Clear R@H server error.
ID: 91929 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 641
Credit: 45,159,261
RAC: 5,652
Message 91930 - Posted: 10 Mar 2020, 22:31:55 UTC - in response to Message 91929.  

Clear R@H server error.

I will leave my four machines with the greatest memory-to-core ratio (12 cores at 32 GB each) on Rosetta.
The others I will have to put in quarantine.
ID: 91930 · Rating: 0
Peter Hucker
Joined: 12 Aug 06
Posts: 730
Credit: 5,404,916
RAC: 0
Message 91931 - Posted: 10 Mar 2020, 22:44:44 UTC - in response to Message 91930.  

Clear R@H server error.

I will leave my four machines with the greatest memory-to-core ratio (12 cores at 32 GB each) on Rosetta.
The others I will have to put in quarantine.


I saw no relationship with this here - all my machines were sticking equally about 2 weeks ago, and now not at all:
16GB 6 cores
8GB 4 cores
8GB 4 cores
8GB 2 of 4 cores (to prevent overheating, it's a laptop with a pathetic cooling fan)
ID: 91931 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 641
Credit: 45,159,261
RAC: 5,652
Message 91932 - Posted: 10 Mar 2020, 22:53:46 UTC - in response to Message 91931.  

I saw no relationship with this here - all my machines were sticking equally about 2 weeks ago, and now not at all:

Yes, I don't really expect a limitation based on the number of my own machines. It would be the overall load on their server most likely.

It might be nice if they were to tell us something. We are left to our own remedies otherwise.
ID: 91932 · Rating: 0
Peter Hucker
Joined: 12 Aug 06
Posts: 730
Credit: 5,404,916
RAC: 0
Message 91933 - Posted: 10 Mar 2020, 23:41:26 UTC - in response to Message 91932.  
Last modified: 10 Mar 2020, 23:42:27 UTC

I saw no relationship with this here - all my machines were sticking equally about 2 weeks ago, and now not at all:

Yes, I don't really expect a limitation based on the number of my own machines. It would be the overall load on their server most likely.

It might be nice if they were to tell us something. We are left to our own remedies otherwise.


I'm thinking either:

1) They half fixed it, so some of us are still having problems and some aren't.

or

2) A lot of people got fed up and stopped running it, lowering the load on their servers so they're coping better.

Maybe the coronavirus warped into a computer virus and has infected their servers to stop them finding a cure?
ID: 91933 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 641
Credit: 45,159,261
RAC: 5,652
Message 91934 - Posted: 11 Mar 2020, 2:14:27 UTC - in response to Message 91933.  

2) A lot of people got fed up and stopped running it, lowering the load on their servers so they're coping better.

Yes, exactly. I had another stall, so I am down to three machines.
I am doing my part to lighten the load, but they need to figure out what is going on and tell us, or it will get too light.
ID: 91934 · Rating: 0
Peter Hucker
Joined: 12 Aug 06
Posts: 730
Credit: 5,404,916
RAC: 0
Message 91943 - Posted: 11 Mar 2020, 18:08:18 UTC - in response to Message 91934.  

2) A lot of people got fed up and stopped running it, lowering the load on their servers so they're coping better.

Yes, exactly. I had another stall, so I am down to three machines.
I am doing my part to lighten the load, but they need to figure out what is going on and tell us, or it will get too light.


I've not seen a single word from anyone on the project. Is there another part of the forum they tend to hang out in?
ID: 91943 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 641
Credit: 45,159,261
RAC: 5,652
Message 91944 - Posted: 11 Mar 2020, 19:58:40 UTC - in response to Message 91943.  

I've not seen a single word from anyone on the project. Is there another part of the forum they tend to hang out in?

Actually, I did ask the Moderator about it, and he is forwarding a request.
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=1000
ID: 91944 · Rating: 0
adrianxw
Joined: 18 Sep 05
Posts: 614
Credit: 10,486,437
RAC: 4,918
Message 91945 - Posted: 11 Mar 2020, 20:07:59 UTC
Last modified: 11 Mar 2020, 20:19:36 UTC

Another stuck download here as well, and I noticed a work unit crash in my tasks list. I had always regarded Rosetta as a reliable project, but there have been a few cracks recently. Watching with interest; I have not set "No new tasks" yet...

<edit>
And another on a different machine...
</edit>
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 91945 · Rating: 0
Mad_Max
Joined: 31 Dec 09
Posts: 197
Credit: 17,869,114
RAC: 8,606
Message 91955 - Posted: 13 Mar 2020, 22:55:35 UTC

One more stalled download on two computers. Both auto-switched from Rosetta to WCG for 2 days due to this glitching file server.

This time it was that file:
13/03/2020 10:00:56 | Rosetta@home | Started download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip
13/03/2020 10:06:03 | Rosetta@home | Temporarily failed download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip: transient HTTP error
13/03/2020 10:06:03 | Rosetta@home | Backing off 05:48:01 on download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip
13/03/2020 15:54:05 | Rosetta@home | Started download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip
13/03/2020 15:59:12 | Rosetta@home | Temporarily failed download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip: transient HTTP error
13/03/2020 15:59:12 | Rosetta@home | Backing off 04:31:47 on download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip
13/03/2020 21:30:26 | Rosetta@home | Started download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip
13/03/2020 21:35:33 | Rosetta@home | Temporarily failed download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip: transient HTTP error
13/03/2020 21:35:33 | Rosetta@home | Backing off 05:31:02 on download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip
14/03/2020 01:47:39 | Rosetta@home | task twc_method_msd_cpp_c33591_11mer_gb_000211_msd_SAVE_ALL_OUT_901175_609_0 aborted by user
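
For anyone who wants to tally these stalls from their own event log, here is a minimal sketch in Python. It assumes the log lines keep exactly the `date time | project | message` layout shown above. The function name `summarize_downloads` and the `SAMPLE` constant are made up for illustration and are not part of BOINC.

```python
import re
from datetime import timedelta

# Layout assumed from the excerpt above: "date time | project | message".
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) \| ([^|]+?) \| (.*)$")
FAILED_RE = re.compile(r"^Temporarily failed download of (\S+): (.*)$")
BACKOFF_RE = re.compile(r"^Backing off (\d+):(\d{2}):(\d{2}) on download of (\S+)$")

FILE = "twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip"

# Failure and backoff lines copied from the log excerpt above.
SAMPLE = "\n".join([
    f"13/03/2020 10:06:03 | Rosetta@home | Temporarily failed download of {FILE}: transient HTTP error",
    f"13/03/2020 10:06:03 | Rosetta@home | Backing off 05:48:01 on download of {FILE}",
    f"13/03/2020 15:59:12 | Rosetta@home | Temporarily failed download of {FILE}: transient HTTP error",
    f"13/03/2020 15:59:12 | Rosetta@home | Backing off 04:31:47 on download of {FILE}",
    f"13/03/2020 21:35:33 | Rosetta@home | Temporarily failed download of {FILE}: transient HTTP error",
    f"13/03/2020 21:35:33 | Rosetta@home | Backing off 05:31:02 on download of {FILE}",
])


def summarize_downloads(log_text):
    """Count failed attempts and sum the backoff time per file name."""
    summary = {}
    for raw in log_text.splitlines():
        line = LINE_RE.match(raw.strip())
        if not line:
            continue  # skip anything that is not an event log line
        message = line.group(3)
        failed = FAILED_RE.match(message)
        backoff = BACKOFF_RE.match(message)
        if failed:
            entry = summary.setdefault(
                failed.group(1), {"failures": 0, "backoff": timedelta()}
            )
            entry["failures"] += 1
        elif backoff:
            hours, minutes, seconds = (int(backoff.group(i)) for i in (1, 2, 3))
            entry = summary.setdefault(
                backoff.group(4), {"failures": 0, "backoff": timedelta()}
            )
            entry["backoff"] += timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return summary


if __name__ == "__main__":
    for name, stats in summarize_downloads(SAMPLE).items():
        print(name, stats["failures"], "failed attempts, backed off", stats["backoff"])
```

On the excerpt above this counts three failed attempts and a total backoff of 15:50:50 for the one file, which is why a single stuck file can keep a machine off the project for most of a day.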
ID: 91955 · Rating: 0
Peter Hucker
Joined: 12 Aug 06
Posts: 730
Credit: 5,404,916
RAC: 0
Message 91956 - Posted: 13 Mar 2020, 23:04:13 UTC - in response to Message 91955.  

One more stalled download on two computers. Both auto-switched from Rosetta to WCG for 2 days due to this glitching file server.

This time it was that file:
13/03/2020 10:00:56 | Rosetta@home | Started download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip
13/03/2020 10:06:03 | Rosetta@home | Temporarily failed download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip: transient HTTP error
13/03/2020 10:06:03 | Rosetta@home | Backing off 05:48:01 on download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip
13/03/2020 15:54:05 | Rosetta@home | Started download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip
13/03/2020 15:59:12 | Rosetta@home | Temporarily failed download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip: transient HTTP error
13/03/2020 15:59:12 | Rosetta@home | Backing off 04:31:47 on download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip
13/03/2020 21:30:26 | Rosetta@home | Started download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip
13/03/2020 21:35:33 | Rosetta@home | Temporarily failed download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip: transient HTTP error
13/03/2020 21:35:33 | Rosetta@home | Backing off 05:31:02 on download of twc_method_msd_cpp_c33591_11mer_gb_000211_msd.zip
14/03/2020 01:47:39 | Rosetta@home | task twc_method_msd_cpp_c33591_11mer_gb_000211_msd_SAVE_ALL_OUT_901175_609_0 aborted by user


Yip, mine do the same, can't get work from project A, do project B. If they can't run their servers properly, they won't get the work done.
ID: 91956 · Rating: 0
James W
Joined: 25 Nov 12
Posts: 130
Credit: 1,766,254
RAC: 0
Message 91957 - Posted: 14 Mar 2020, 1:19:33 UTC

I haven't had a download problem yet (knock on wood!), though I'm still using Windows 7. What I've seen here and in other threads is that folks using Windows 10 have been reporting this problem.
ID: 91957 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 641
Credit: 45,159,261
RAC: 5,652
Message 91958 - Posted: 14 Mar 2020, 1:41:04 UTC - in response to Message 91957.  

I haven't had a download problem yet (knock on wood!), though I'm still using Windows 7. What I've seen here and in other threads is that folks using Windows 10 have been reporting this problem.

I have seen them on my Linux (Ubuntu) machines, but not yet on my Windows 7 one.
But that is probably just statistics. I have about 5 Ubuntu machines on Rosetta at any one time (sometimes more), and only one Windows machine. The OS probably doesn't make a difference.
ID: 91958 · Rating: 0
adrianxw
Joined: 18 Sep 05
Posts: 614
Credit: 10,486,437
RAC: 4,918
Message 91960 - Posted: 14 Mar 2020, 8:06:27 UTC
Last modified: 14 Mar 2020, 8:16:30 UTC

Both of my machines here had stuck downloads overnight. When visiting rarely visited sites, I will be suspending Rosetta. Problem has been around for some time now, without comment or action, a little worrying.
I saw that after killing the stuck downloads, BOINC did not immediately try for another. I can see the rationale there under normal circumstances, but the situation at present here is not really normal. One of the machines here has no Rosetta tasks at all on it, and is not requesting any. At the same time, the ready to send field shows a large number. I expect this is a BOINC issue however.
These machines are Windows 8.1 x64. Others I look after for other people/groups vary.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 91960 · Rating: 0
adrianxw
Joined: 18 Sep 05
Posts: 614
Credit: 10,486,437
RAC: 4,918
Message 91961 - Posted: 14 Mar 2020, 10:05:54 UTC

Both machines now have downloaded new work.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 91961 · Rating: 0
Dr Who Fan
Joined: 28 May 06
Posts: 59
Credit: 219,040
RAC: 0
Message 91971 - Posted: 14 Mar 2020, 19:21:29 UTC - in response to Message 91960.  

... saw that after killing the stuck downloads, BOINC did not immediately try for another ...

I found out that you can't just kill the stuck file(s); you have to kill the task(s) that are stuck downloading. Otherwise nothing gets downloaded from the project that has one or more files erroring on download.
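
If you would rather script the abort than click through BOINC Manager, the following is a minimal sketch in Python. It assumes `boinccmd` is on the PATH and uses its `--task URL task_name abort` form. The project URL below is an assumption based on this site's address, `example_task_0` is a hypothetical task name, and the helper names are invented for illustration.

```python
import subprocess

# Assumption: the project's master URL as registered in the BOINC client.
PROJECT_URL = "https://boinc.bakerlab.org/rosetta/"


def abort_task_command(project_url, task_name):
    """Build the boinccmd argument list that aborts a single task."""
    return ["boinccmd", "--task", project_url, task_name, "abort"]


def abort_task(project_url, task_name, dry_run=True):
    """Abort a task; with dry_run=True only return the command, do not run it."""
    cmd = abort_task_command(project_url, task_name)
    if not dry_run:
        # Raises CalledProcessError if boinccmd exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # "example_task_0" is a placeholder; use the task name from your event log.
    print(" ".join(abort_task("" + PROJECT_URL, "example_task_0")))
```

The dry run is the default so you can check the task name before anything is actually aborted.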

Last I saw was a post from Mod.Sense 3+ days ago here saying:
"I've sent an EMail to David Kim and asked that he look in to it, pointing out that the hung downloads are preventing machines from getting work so they can resume processing."

ID: 91971 · Rating: 0
Peter Hucker
Joined: 12 Aug 06
Posts: 730
Credit: 5,404,916
RAC: 0
Message 91973 - Posted: 14 Mar 2020, 21:27:02 UTC - in response to Message 91957.  

I haven't had a download problem yet (knock on wood!), though I'm still using Windows 7. What I've seen here and in other threads is that folks using Windows 10 have been reporting this problem.


I only have four Windows 10 machines, no other operating systems, and they were all having the problem a few weeks back, but then just started working. Nothing sticks any more. This is completely random. I think it's just the server having a fault or an overload and you get lucky or not.
ID: 91973 · Rating: 0
Dr Who Fan
Joined: 28 May 06
Posts: 59
Credit: 219,040
RAC: 0
Message 91975 - Posted: 14 Mar 2020, 22:22:25 UTC - in response to Message 91973.  

It is obviously a SERVER issue.

Does not matter what operating system your computer runs.

I have had downloads stalled on Windows Vista, 7 and 8.1.

ID: 91975 · Rating: 0
Om
Joined: 18 Feb 20
Posts: 16
Credit: 777,076
RAC: 0
Message 91977 - Posted: 15 Mar 2020, 4:50:42 UTC - in response to Message 91975.  
Last modified: 15 Mar 2020, 4:59:57 UTC

Two stalled on a Mac and one on Windows 10. Serve me some of the other variety.
ID: 91977 · Rating: 0



©2021 University of Washington
https://www.bakerlab.org