Problems and Technical Issues with Rosetta@home

P . P . L .

Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 78611 - Posted: 30 Aug 2015, 5:10:18 UTC

Half the server status page is now in the red; no new tasks ready.

Still having problems uploading, and when uploads do go through I'm getting validate errors.

ID: 78611 · Rating: 0
Killersocke@rosetta
Joined: 13 Nov 06
Posts: 29
Credit: 2,579,125
RAC: 0
Message 78613 - Posted: 30 Aug 2015, 11:27:44 UTC

- very slow download speed

- abort after download:

https://boinc.bakerlab.org/rosetta/result.php?resultid=755052556
https://boinc.bakerlab.org/rosetta/result.php?resultid=755052557
ID: 78613 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 1960
Credit: 38,076,311
RAC: 6,958
Message 78615 - Posted: 30 Aug 2015, 14:09:54 UTC - in response to Message 78609.  

Uploads are timing out for rosetta, all other projects are o.k.

Downloads from rosetta are working for now!

Ditto
ID: 78615 · Rating: 0
Dusty
Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 78616 - Posted: 30 Aug 2015, 18:02:59 UTC

Most of my machines are reporting upload timeouts. Here's an example:

8/30/2015 1:47:31 PM | rosetta@home | Started upload of gr081015_HEEHheeh_go3_nods_new_heeh_5.bp_r22_pass_20150720225429_fragments_fold_SAVE_ALL_OUT_279062_30_0_0
8/30/2015 1:47:53 PM | rosetta@home | Temporarily failed upload of gr081015_HEEHheeh_go3_nods_new_heeh_5.bp_r22_pass_20150720225429_fragments_fold_SAVE_ALL_OUT_279062_30_0_0: connect() failed
8/30/2015 1:47:53 PM | rosetta@home | Backing off 00:02:53 on upload of gr081015_HEEHheeh_go3_nods_new_heeh_5.bp_r22_pass_20150720225429_fragments_fold_SAVE_ALL_OUT_279062_30_0_0
8/30/2015 1:47:54 PM | | Project communication failed: attempting access to reference site
8/30/2015 1:47:55 PM | | Internet access OK - project servers may be temporarily down.

This problem started yesterday afternoon.
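
For anyone who wants to tally these failures across a longer event log, here is a minimal Python sketch. It assumes the `timestamp | project | message` layout seen in the excerpt above and only recognises the message wordings that appear there. The category labels, the function names, and the placeholder task name `example_task_0` are hypothetical and are not part of BOINC.

```python
import re
from collections import Counter

# Layout inferred from the excerpt above: "timestamp | project | message".
# The project field may be empty, as in the last two lines of the excerpt.
LINE_RE = re.compile(
    r"^(?P<ts>[^|]+?)\s*\|\s*(?P<project>[^|]*?)\s*\|\s*(?P<msg>.*)$"
)


def classify(message):
    """Map a message to a coarse category (labels are hypothetical)."""
    if message.startswith("Started upload of"):
        return "upload_started"
    if message.startswith("Temporarily failed upload of"):
        return "upload_failed"
    if message.startswith("Backing off"):
        return "backoff"
    return "other"


def summarize(log_text):
    """Count how many log lines fall into each category."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[classify(match.group("msg"))] += 1
    return counts


def backoff_seconds(message):
    """Return the HH:MM:SS back-off delay in seconds, or None if absent."""
    match = re.search(r"Backing off (\d+):(\d{2}):(\d{2})", message)
    if match is None:
        return None
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


# Shortened copy of the excerpt; "example_task_0" is a placeholder name.
SAMPLE = """\
8/30/2015 1:47:31 PM | rosetta@home | Started upload of example_task_0
8/30/2015 1:47:53 PM | rosetta@home | Temporarily failed upload of example_task_0: connect() failed
8/30/2015 1:47:53 PM | rosetta@home | Backing off 00:02:53 on upload of example_task_0
8/30/2015 1:47:54 PM | | Project communication failed: attempting access to reference site
8/30/2015 1:47:55 PM | | Internet access OK - project servers may be temporarily down.
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
    print(backoff_seconds("Backing off 00:02:53 on upload of example_task_0"))
```

Feeding a full day's log through `summarize` should make it easy to compare how many uploads started with how many failed on each machine.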
ID: 78616 · Rating: 0
svincent
Joined: 30 Dec 05
Posts: 219
Credit: 11,805,838
RAC: 0
Message 78617 - Posted: 30 Aug 2015, 18:14:51 UTC

There was a big windstorm in the Pacific Northwest yesterday. Power was cut to the University of Washington: I believe it's been restored but would assume Rosetta@home is in recovery mode.
ID: 78617 · Rating: 0
Dusty
Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 78620 - Posted: 30 Aug 2015, 19:30:10 UTC - in response to Message 78617.  

There was a big windstorm in the Pacific Northwest yesterday. Power was cut to the University of Washington: I believe it's been restored but would assume Rosetta@home is in recovery mode.


If that's the case, do you have any idea why I have still been able to download new tasks the entire time--just unable to upload?
ID: 78620 · Rating: 0
svincent
Joined: 30 Dec 05
Posts: 219
Credit: 11,805,838
RAC: 0
Message 78622 - Posted: 30 Aug 2015, 20:17:16 UTC - in response to Message 78620.  

There was a big windstorm in the Pacific Northwest yesterday. Power was cut to the University of Washington: I believe it's been restored but would assume Rosetta@home is in recovery mode.


If that's the case, do you have any idea why I have still been able to download new tasks the entire time--just unable to upload?


I'm not affiliated with the project and was really just speculating. But there are multiple servers: perhaps the file system of one but not the rest became corrupt and needs to be restored.

Or something.
ID: 78622 · Rating: 0
Dusty
Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 78623 - Posted: 30 Aug 2015, 20:23:29 UTC - in response to Message 78622.  

There was a big windstorm in the Pacific Northwest yesterday. Power was cut to the University of Washington: I believe it's been restored but would assume Rosetta@home is in recovery mode.


If that's the case, do you have any idea why I have still been able to download new tasks the entire time--just unable to upload?


I'm not affiliated with the project and was really just speculating. But there are multiple servers: perhaps the file system of one but not the rest became corrupt and needs to be restored.

Or something.


You might be right. I thought there might be some word from a Moderator regarding the status, but I appreciate your insight too!
ID: 78623 · Rating: 0
Dusty
Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 78624 - Posted: 30 Aug 2015, 20:52:39 UTC - in response to Message 78622.  

There was a big windstorm in the Pacific Northwest yesterday. Power was cut to the University of Washington: I believe it's been restored but would assume Rosetta@home is in recovery mode.


If that's the case, do you have any idea why I have still been able to download new tasks the entire time--just unable to upload?


I'm not affiliated with the project and was really just speculating. But there are multiple servers: perhaps the file system of one but not the rest became corrupt and needs to be restored.

Or something.


I just went through all my active tasks, and although most uploads from each of my computers aren't successful, occasionally some task uploads have been successful in the last two days. So it seems like however many servers are still working, the number must be significantly reduced, but not all servers are down.
ID: 78624 · Rating: 0
Ed Johnson
Joined: 9 Jun 06
Posts: 9
Credit: 4,738,577
RAC: 0
Message 78626 - Posted: 30 Aug 2015, 22:03:36 UTC - in response to Message 78624.  

According to the server status board, several boxes are down right now:

https://boinc.bakerlab.org/rosetta/rah_status.php

There was a big windstorm in the Pacific Northwest yesterday. Power was cut to the University of Washington: I believe it's been restored but would assume Rosetta@home is in recovery mode.


If that's the case, do you have any idea why I have still been able to download new tasks the entire time--just unable to upload?


I'm not affiliated with the project and was really just speculating. But there are multiple servers: perhaps the file system of one but not the rest became corrupt and needs to be restored.

Or something.


I just went through all my active tasks, and although most uploads from each of my computers aren't successful, occasionally some task uploads have been successful in the last two days. So it seems like however many servers are still working, the number must be significantly reduced, but not all servers are down.


ID: 78626 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1223
Credit: 13,806,125
RAC: 3,336
Message 78627 - Posted: 30 Aug 2015, 23:01:55 UTC

This could be a good reason for adding something to the server status page: Show which servers can handle uploads, and which can handle downloads. Perhaps also how much disk space is available for any uploads.
ID: 78627 · Rating: 0
Dusty
Joined: 1 Mar 08
Posts: 41
Credit: 2,667,354
RAC: 0
Message 78629 - Posted: 30 Aug 2015, 23:51:19 UTC - in response to Message 78626.  

According to the server status board, several boxes are down right now:

https://boinc.bakerlab.org/rosetta/rah_status.php

Thanks for the link! I see that 4 "make work" servers are down, but all of the "assimilators" are up, so I wonder why none of my completed tasks are uploading. Perhaps the "db_purge" server is key to accepting uploads.
ID: 78629 · Rating: 0
premier
Joined: 30 Dec 05
Posts: 14
Credit: 23,872,868
RAC: 146
Message 78630 - Posted: 31 Aug 2015, 8:56:14 UTC

Once, I believed that I was part of something good. After 10 years of crunching, I see I was wrong. I think it's time to leave this project and move my resources somewhere where people do care. You are wasting huge processing power.
ID: 78630 · Rating: 0
Murasaki
Joined: 20 Apr 06
Posts: 303
Credit: 511,418
RAC: 0
Message 78631 - Posted: 31 Aug 2015, 11:05:29 UTC - in response to Message 78630.  

Once, I believed that I was part of something good. After 10 years of crunching, I see I was wrong. I think it's time to leave this project and move my resources somewhere where people do care. You are wasting huge processing power.


You are quitting because a project is taking a bit of time (over a weekend) to recover from a major power failure?

That's your choice of course, but I would value the project more on the results of its research than the amount of processing power they are able to consume.
ID: 78631 · Rating: 0
Behemot
Joined: 3 Jul 07
Posts: 8
Credit: 4,742,308
RAC: 0
Message 78632 - Posted: 31 Aug 2015, 11:26:58 UTC - in response to Message 78631.  

Once, I believed that I was part of something good. After 10 years of crunching, I see I was wrong. I think it's time to leave this project and move my resources somewhere where people do care. You are wasting huge processing power.


You are quitting because a project is taking a bit of time (over a weekend) to recover from a major power failure?

That's your choice of course, but I would value the project more on the results of its research than the amount of processing power they are able to consume.

I think this may be more about the fact that nobody from the project cares to inform its donors about the problem.

I have already criticised this project for its absolute lack of communication, and it seems nothing has changed at all.
ID: 78632 · Rating: 0
Link
Joined: 4 May 07
Posts: 348
Credit: 382,349
RAC: 0
Message 78634 - Posted: 31 Aug 2015, 14:54:11 UTC - in response to Message 78632.  
Last modified: 31 Aug 2015, 15:01:51 UTC

I think this may be more about the fact that nobody from the project cares to inform its donors about the problem.

The server status page already informs everybody who cares to look at it that something is wrong with the servers (especially since it's not even updating anymore). That's what this page is for: to inform us about the servers.

What could be improved are the messages from the servers. Currently they just say "No work sent". Yeah, I can see that from the previous line, "Scheduler request completed: got 0 new tasks". So instead of this completely useless message, they should send "Project has no tasks available"; then I wouldn't even need to look at the SSP to find out why I'm not getting any work.
ID: 78634 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1223
Credit: 13,806,125
RAC: 3,336
Message 78635 - Posted: 31 Aug 2015, 17:07:44 UTC - in response to Message 78634.  

I think this may be more about the fact that nobody from the project cares to inform its donors about the problem.

The server status page already informs everybody who cares to look at it that something is wrong with the servers (especially since it's not even updating anymore). That's what this page is for: to inform us about the servers.

What could be improved are the messages from the servers. Currently they just say "No work sent". Yeah, I can see that from the previous line, "Scheduler request completed: got 0 new tasks". So instead of this completely useless message, they should send "Project has no tasks available"; then I wouldn't even need to look at the SSP to find out why I'm not getting any work.


Another possibility that should be shown if it happens: Server downloader not working.
ID: 78635 · Rating: 0
BarryAZ
Joined: 27 Dec 05
Posts: 153
Credit: 30,841,260
RAC: 0
Message 78636 - Posted: 31 Aug 2015, 17:53:55 UTC

My 'angst' isn't so much that the servers have crashed -- that happens in the computer world.

And it isn't that the server status messaging is less informative than some would like (I think that's more a function of the overall code that BOINC provides and thus not in specific project control).

Rather, it is the relative lack of update information posted when problems (and/or changes) occur in the project.

I realize that this particular problem occurred over the weekend -- and so I tend to give a pass on the lack of information over the weekend -- even researchers and admins have lives other than keeping us happy. But it is now late morning, and at this point the only information going around (aside from the status page) is end-user posts here. To me that is troublesome -- but there are other projects, and for my part I've simply diverted processing for now, pending a resolution of the problem and ideally an explanation from the project as to what went bump in the night.

All that said, I'd note that many folks in Washington state are likely more concerned (and perhaps affected) by the major forest fire going on there.

ID: 78636 · Rating: 0
premier
Joined: 30 Dec 05
Posts: 14
Credit: 23,872,868
RAC: 146
Message 78637 - Posted: 31 Aug 2015, 18:02:52 UTC - in response to Message 78632.  

Once, I believed that I was part of something good. After 10 years of crunching, I see I was wrong. I think it's time to leave this project and move my resources somewhere where people do care. You are wasting huge processing power.


You are quitting because a project is taking a bit of time (over a weekend) to recover from a major power failure?

That's your choice of course, but I would value the project more on the results of its research than the amount of processing power they are able to consume.

I think this may be more about the fact that nobody from the project cares to inform its donors about the problem.

I have already criticised this project for its absolute lack of communication, and it seems nothing has changed at all.



@Behemot - You've got the point. It's exactly what I meant. ATM this project looks like a totally automated project, with no "public relations" at all.
ID: 78637 · Rating: 0
Link
Joined: 4 May 07
Posts: 348
Credit: 382,349
RAC: 0
Message 78639 - Posted: 31 Aug 2015, 18:19:28 UTC - in response to Message 78635.  

I think this may be more about the fact that nobody from the project cares to inform its donors about the problem.

The server status page already informs everybody who cares to look at it that something is wrong with the servers (especially since it's not even updating anymore). That's what this page is for: to inform us about the servers.

What could be improved are the messages from the servers. Currently they just say "No work sent". Yeah, I can see that from the previous line, "Scheduler request completed: got 0 new tasks". So instead of this completely useless message, they should send "Project has no tasks available"; then I wouldn't even need to look at the SSP to find out why I'm not getting any work.


Another possibility that should be shown if it happens: Server downloader not working.

A download server not working would cause downloads to fail, but new work would still be assigned to a computer.
ID: 78639 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org