SERVER PROBLEMS - 2.

Heidi1
Joined: 11 Aug 07
Posts: 49
Credit: 1,786,248
RAC: 0
Message 64770 - Posted: 4 Jan 2010, 4:04:36 UTC - in response to Message 64766.  

Avant-garde chapel? Really? I'm a student at the UW, and I haven't noticed any chapels on campus. Are you perhaps thinking of Seattle University (a Jesuit school)?


Sorry, you're right. I was at an organists' convention in Seattle last summer, where we went all over the greater Seattle area. I thought we were at UW for a particular workshop, but apparently not (I just checked).
ID: 64770 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 1981
Credit: 38,437,492
RAC: 13,682
Message 64867 - Posted: 9 Jan 2010, 1:03:57 UTC

rah_validator_beta bk2 Not running
rah_validator_mini bk1 Not running
rah_assimilatorbeta1 bk1 Not running
rah_assimilatorbeta2 bk1 Not running
rah_assimilatorbeta3 bk2 Not running
rah_assimilatorbeta4 bk2 Not running

08/01/2010 23:42:38 rosetta@home Sending scheduler request: To fetch work.
08/01/2010 23:42:38 rosetta@home Reporting 1 completed tasks, requesting new tasks for GPU
08/01/2010 23:42:39 Project communication failed: attempting access to reference site
08/01/2010 23:42:39 rosetta@home Temporarily failed upload of homopt2b.t305_.t305_.IGNORE_THE_REST.S_00003_0000013_00086.pdb.JOB_16711_26_0_0: connect() failed
08/01/2010 23:42:39 rosetta@home Backing off 1 min 0 sec on upload of homopt2b.t305_.t305_.IGNORE_THE_REST.S_00003_0000013_00086.pdb.JOB_16711_26_0_0
08/01/2010 23:42:40 Internet access OK - project servers may be temporarily down.

ID: 64867 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 1981
Credit: 38,437,492
RAC: 13,682
Message 64888 - Posted: 10 Jan 2010, 4:07:43 UTC - in response to Message 64867.  

rah_validator_beta bk2 Not running
rah_validator_mini bk1 Not running
rah_assimilatorbeta1 bk1 Not running
rah_assimilatorbeta2 bk1 Not running
rah_assimilatorbeta3 bk2 Not running
rah_assimilatorbeta4 bk2 Not running

Still showing the same.

At the time, the server was down and I couldn't upload or download for maybe 15 hours. Was that just me?

This seemed to get 'solved' 6 or 8 hours ago, as I'm getting transfers and credits awarded just fine now. I guess it's just that the server page doesn't seem to be telling the truth atm. Or am I wrong? (If so, ignore me.)
ID: 64888 · Rating: 0
Evan
Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 64891 - Posted: 10 Jan 2010, 11:13:43 UTC - in response to Message 64888.  

rah_validator_beta bk2 Not running
rah_validator_mini bk1 Not running
rah_assimilatorbeta1 bk1 Not running
rah_assimilatorbeta2 bk1 Not running
rah_assimilatorbeta3 bk2 Not running
rah_assimilatorbeta4 bk2 Not running

Still showing the same.

At the time, the server was down and I couldn't upload or download for maybe 15 hours. Was that just me?

This seemed to get 'solved' 6 or 8 hours ago, as I'm getting transfers and credits awarded just fine now. I guess it's just that the server page doesn't seem to be telling the truth atm. Or am I wrong? (If so, ignore me.)


Ignore you? Wouldn't dream of it!
I also had results validated despite what the server page says. Looks like a case of
"don't believe what you read in the papers".
ID: 64891 · Rating: 0
rochester new york
Joined: 2 Jul 06
Posts: 2842
Credit: 2,020,043
RAC: 0
Message 64907 - Posted: 10 Jan 2010, 23:13:59 UTC

1/10/10, 6:14 PM: problem with the server again
ID: 64907 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 65098 - Posted: 25 Jan 2010, 7:01:16 UTC

I don't know if this is a server or tasks issue.

Mon 25 Jan 2010 17:56:42 EST|rosetta@home|[error] MD5 check failed for t309__boinc_filtered_loopbuild_threading_cst_all_tex.boinc.zip

Mon 25 Jan 2010 17:56:42 EST|rosetta@home|[error] expected 6b8c2b2035c720915a8593e055747dad, got 919e448c24f7d95710ece3cbd61ee98e

Mon 25 Jan 2010 17:56:42 EST|rosetta@home|[error] Checksum or signature error for t309__boinc_filtered_loopbuild_threading_cst_all_tex.boinc.zip

ID: 65098 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 65107 - Posted: 25 Jan 2010, 17:13:27 UTC

PPL, that is generally a "network issue". Some of the bytes got scrambled as that file was sent down to you. It is more common if the transfer happens to get interrupted or has to retry multiple times.

Looks like you've only had this happen on one task, so not a systemic problem. BOINC recovers, reports it as a download error and then sees it is short of work and asks for more. So recovery is automatic. Just has to use more bandwidth to get files needed for the next task it gets.
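As a rough illustration of the check described above, here is a minimal Python sketch of how a client could compare a downloaded file's MD5 digest with the value the server expects. The function names and the wording of the failure message are hypothetical, and this is not BOINC's actual code; it only shows why a few scrambled bytes are enough to make the comparison fail.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=65536):
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path):
    """Convenience wrapper for a file on disk."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle)


def check_download(stream, expected_md5):
    """Compare the computed digest with the expected one (hypothetical helper)."""
    actual = md5_of_stream(stream)
    if actual != expected_md5.lower():
        return f"MD5 check failed: expected {expected_md5}, got {actual}"
    return "OK"


# In-memory stand-in for a downloaded file.
print(check_download(io.BytesIO(b"hello"), "5d41402abc4b2a76b9719d911017c592"))  # OK
```

Changing even one byte of the input gives a completely different digest, so a failed check says the file arrived damaged but not where the damage happened.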

Here's a link to the one task with download error I found for future reference.
Rosetta Moderator: Mod.Sense
ID: 65107 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 65320 - Posted: 15 Feb 2010, 3:07:13 UTC
Last modified: 15 Feb 2010, 3:08:21 UTC

Hi.

Is nobody else having a problem with MD5 checksums on just the t*** tasks? No other projects I run are getting these errors, and I haven't had any other tasks from here with this problem. I can give you a list of all of them. They're fairly big downloads too, not that it has been a problem. And no, they haven't stopped and restarted; I'm on a 1.5mb d/l line speed and I get close to that all the time.

Mon 15 Feb 2010 13:52:38 EST|rosetta@home|[error] MD5 check failed for t290__boinc_filtered_loopbuild_threading_cst_lb_tex.boinc.zip

Mon 15 Feb 2010 13:52:38 EST|rosetta@home|[error] expected a95fbc4b0d096a088f0d21a3587e0234, got 1e6fc5f63bd40d4c8664c7f6f55e7b0d

Mon 15 Feb 2010 13:52:38 EST|rosetta@home|[error] Checksum or signature error for t290__boinc_filtered_loopbuild_threading_cst_lb_tex.boinc.zip
ID: 65320 · Rating: 0
adrianxw
Joined: 18 Sep 05
Posts: 652
Credit: 11,662,550
RAC: 1,151
Message 65341 - Posted: 16 Feb 2010, 10:14:20 UTC

Likewise. No problems here, well not Rosetta problems anyway...
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 65341 · Rating: 0
Conan
Joined: 11 Oct 05
Posts: 150
Credit: 3,818,279
RAC: 728
Message 65356 - Posted: 18 Feb 2010, 1:07:12 UTC

I have had a few scheduler request failures saying a timeout was reached
and that the project servers may be down.

A short while later (sometimes an hour) I can usually then get through with downloads and uploads working again.

Possibly a Rosetta network issue as my Internet connection has not failed and BOINC in fact says it is OK and that the Project servers are at fault.
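The two messages BOINC logs in this situation ("attempting access to reference site", then "Internet access OK - project servers may be temporarily down") suggest a simple decision: when the project cannot be reached, probe a reference site to tell a local outage from a server-side one. A minimal sketch of that logic follows, with a hypothetical function name and probe callables standing in for real network checks. It is an assumption about the general approach, not BOINC's actual implementation.

```python
from typing import Callable


def diagnose_connection(project_reachable: Callable[[], bool],
                        reference_reachable: Callable[[], bool]) -> str:
    """Classify a failed request as a project-side or a local problem.

    Both arguments are hypothetical probes that return True on success.
    """
    if project_reachable():
        return "Project communication OK"
    # The project failed, so only now contact the reference site.
    if reference_reachable():
        return "Internet access OK - project servers may be temporarily down."
    # Assumed wording; this message does not appear in the logs in this thread.
    return "No Internet access - check the local connection."


# Example: project down, reference site reachable.
print(diagnose_connection(lambda: False, lambda: True))
```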
ID: 65356 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1894
Credit: 8,767,285
RAC: 12,464
Message 65357 - Posted: 18 Feb 2010, 9:23:03 UTC - in response to Message 65356.  

I have had a few scheduler request failures saying a timeout was reached
and that the project servers may be down.

A short while later (sometimes an hour) I can usually then get through with downloads and uploads working again.

Possibly a Rosetta network issue as my Internet connection has not failed and BOINC in fact says it is OK and that the Project servers are at fault.


Probably all due to Seti being down and Malaria saying they will be down. Einstein has already crashed at least once from the load. Seti is down for an undetermined amount of time due to cooling issues!!
ID: 65357 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 65362 - Posted: 19 Feb 2010, 6:40:19 UTC

Hi.

I'm getting this type of message now on two of my rigs.

Fri 19 Feb 2010 17:34:45 EST|rosetta@home|Sending scheduler request: To fetch work. Requesting 20414 seconds of work, reporting 0 completed tasks
Fri 19 Feb 2010 17:36:19 EST||Project communication failed: attempting access to reference site
Fri 19 Feb 2010 17:36:21 EST||Internet access OK - project servers may be temporarily down.
Fri 19 Feb 2010 17:36:21 EST|rosetta@home|Scheduler request failed: Failed sending data to the peer

ID: 65362 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 65379 - Posted: 20 Feb 2010, 20:59:27 UTC
Last modified: 20 Feb 2010, 20:59:49 UTC

Hi.

I just noticed this, as I have a few tasks sitting there waiting.

And the flops are dropping like a stone!

rah_validator_mini__bk1__Not running
ID: 65379 · Rating: 0
rochester new york
Joined: 2 Jul 06
Posts: 2842
Credit: 2,020,043
RAC: 0
Message 65380 - Posted: 20 Feb 2010, 22:34:07 UTC - in response to Message 65379.  

Hi.

I just noticed this, as I have a few tasks sitting there waiting.

And the flops are dropping like a stone!

rah_validator_mini__bk1__Not running


Problems: see here

https://boinc.bakerlab.org/rosetta/rah_status.php
ID: 65380 · Rating: 0
Conan
Joined: 11 Oct 05
Posts: 150
Credit: 3,818,279
RAC: 728
Message 65385 - Posted: 22 Feb 2010, 8:19:26 UTC - in response to Message 65380.  

Hi.

I just noticed this, as I have a few tasks sitting there waiting.

And the flops are dropping like a stone!

rah_validator_mini__bk1__Not running


Problems: see here

https://boinc.bakerlab.org/rosetta/rah_status.php


Yes, still a problem: my pendings have been climbing for the last two days and the validator still seems to have a problem (it is still in RED and not running).
ID: 65385 · Rating: 0
dango
Joined: 22 Dec 08
Posts: 3
Credit: 75,820
RAC: 0
Message 65440 - Posted: 1 Mar 2010, 23:48:37 UTC

What is the problem now?

For 4 days I couldn't open the page. Now it is working, but BM still has no connection with the project.
ID: 65440 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 1981
Credit: 38,437,492
RAC: 13,682
Message 65441 - Posted: 2 Mar 2010, 0:55:30 UTC

Whatever it was, it seemed pretty serious. I'm only glad it happened on a Sunday night/Monday morning when staff were present.

Uploads returned about 30 minutes ago and Server Status seems to have just returned and I've just returned 13 tasks.

Hope all is well now.
ID: 65441 · Rating: 0
.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 65444 - Posted: 2 Mar 2010, 2:17:20 UTC

I was getting `Access forbidden` however I tried to get onto the Rosetta servers.
Uploads have finally got through, but no downloads of new work yet.
It will get sorted, just need to find the right box to kick :)
ID: 65444 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 67184 - Posted: 13 Aug 2010, 4:36:30 UTC
Last modified: 13 Aug 2010, 4:48:20 UTC

Hi.

You have a problem, if you didn't know already.

Unable to upload/download.

rah_make_work1__srv1__Not running

rah_make_work2__srv3__Not running

feeder__________srv4__Not running

file_deleter____srv1__Not running

db_purge________srv1__Not running

EDIT/ Been like this for a few hours.
ID: 67184 · Rating: 0
Jochen
Joined: 6 Jun 06
Posts: 133
Credit: 3,847,433
RAC: 0
Message 67193 - Posted: 13 Aug 2010, 12:12:41 UTC

Hm, AFAIR, it's now Friday morning, 5 am in CA (?), which hopefully means that the problem won't last over the weekend.
ID: 67193 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org