Problems and Technical Issues with Rosetta@home

TJ

Joined: 29 Mar 09
Posts: 127
Credit: 4,799,890
RAC: 0
Message 74167 - Posted: 4 Nov 2012, 15:41:31 UTC - in response to Message 74166.  

I would take the existence of this thread to be contrary to your assertions. Please don't attempt to characterize people you've never met.


True, but you must agree that these errors have been happening on a regular basis for a LONG time for some people! It is getting frustrating, and even when the Project Admins say they will look into it, they say "in a couple of weeks when I have more time". We put our time and energy and MONEY into running our PCs FOR Rosetta and get little to NO help in return when we have problems! You and I had a conversation a while back about how Rosetta is happy with the way things are; things haven't changed, and yet we users are STILL hoping for a change. Some of us BELIEVE in the idea of Rosetta, some are here for the credits and some are here for other reasons, but whatever the reason, when the software 'just works' everywhere else yet works SOOOOO badly here, it is VERY FRUSTRATING for some of us!!!


Mikey is so absolutely right. I am here for the science, so I will stick with the project, but indeed when problems occur we are on our own for a long time.
When someone posts an error or something at Einstein@home, he or she will get an answer very soon from other crunchers and within 12-16 hours from the admins.
Greetings,
TJ.
ID: 74167
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 74173 - Posted: 4 Nov 2012, 22:55:10 UTC

I understand the frustration. I agree that zero problems is optimal. I just don't want to see what amounts to name-calling and personal attacks on the message boards. As most of you know, I am an at-home volunteer like the rest of you, so I am not directly involved in the study or correction of the code problems. Please read the following with that in mind.

To me it seems very curious that the low credit granted when tasks fail is not reflected more significantly in the overall project RAC. There must still be some unique aspect of the failing machines that has not been identified. Or the user base overall is very slow to adopt BOINC v7. I suppose both could be the case.
Rosetta Moderator: Mod.Sense
ID: 74173
robertmiles

Joined: 16 Jun 08
Posts: 1226
Credit: 14,034,185
RAC: 2,900
Message 74175 - Posted: 5 Nov 2012, 2:51:01 UTC
Last modified: 5 Nov 2012, 2:51:48 UTC

I may have seen a reason for the RAC oddities. It seems that BOINC permits adding more credits to user accounts for workunits that did not pass the validator, but still returned useful information. These additional credits do NOT show up in the RAC sent back to the user's computer, but still show up in the total credits for that user. Could it be that a significant percentage of the current credits are granted that way, instead of automatically being granted by the validator?
ID: 74175
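
For anyone trying to picture the hypothesis above, here is a minimal sketch of how total credit and RAC could drift apart. It assumes RAC is an exponentially decaying average with a one-week half-life; that half-life, the `Account` class and both method names are assumptions made for illustration, and nothing in this thread confirms that Rosetta@home actually grants credit outside the validator.

```python
import math

SECONDS_PER_DAY = 86400.0
HALF_LIFE = 7 * SECONDS_PER_DAY  # assumption: one-week half-life for RAC


class Account:
    """Toy credit record: a running total plus a decaying average (RAC)."""

    def __init__(self):
        self.total_credit = 0.0
        self.rac = 0.0        # credit per day
        self.rac_time = 0.0   # time of the last RAC update, in seconds

    def grant_validated(self, now, credit):
        """Credit that passes through the averaging step, so RAC moves."""
        self.total_credit += credit
        elapsed = max(now - self.rac_time, 0.0)
        if elapsed > 0:
            # Old average decays; the new credit is blended in as a daily rate.
            weight = math.exp(-elapsed * math.log(2) / HALF_LIFE)
            rate = credit / (elapsed / SECONDS_PER_DAY)
            self.rac = self.rac * weight + (1.0 - weight) * rate
        self.rac_time = now

    def grant_manual(self, credit):
        """Hypothetical out-of-band grant: the total changes, RAC does not."""
        self.total_credit += credit


acct = Account()
for day in range(1, 8):
    acct.grant_validated(day * SECONDS_PER_DAY, 100.0)
rac_before = acct.rac
acct.grant_manual(500.0)
print(acct.total_credit, acct.rac == rac_before)  # prints: 1200.0 True
```

If a large share of credit arrived through something like `grant_manual`, total credit would keep growing while RAC stayed flat, which is the pattern being asked about.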
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 74201 - Posted: 8 Nov 2012, 4:27:41 UTC

I suppose that is a possibility.
Rosetta Moderator: Mod.Sense
ID: 74201
P . P . L .

Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 74907 - Posted: 16 Jan 2013, 5:39:49 UTC
Last modified: 16 Jan 2013, 5:41:04 UTC

Hi.

I've been having some problems with downloads for the last week or so. It doesn't matter what size the file is; they just stop for some reason and then time out. This can happen a few times on one file while others download OK; other times they get so far and then all get stuck.

P.S. Only Rosetta files are affected; W.C.G. is fine.

-----------------

Wed 16 Jan 2013 16:22:57 EST rosetta@home Temporarily failed download of input_hyb_ba_bench_01_3vdxC_yfsong.zip: HTTP error
Wed 16 Jan 2013 16:22:57 EST rosetta@home Temporarily failed download of input_rb_01_15_35935_68145__t000__0_C1_robetta.zip: HTTP error
Wed 16 Jan 2013 16:22:58 EST rosetta@home Started download of input_hyb_ba_bench_01_3vdxC_yfsong.zip
Wed 16 Jan 2013 16:22:58 EST rosetta@home Started download of input_rb_01_15_35935_68145__t000__0_C1_robetta.zip
Wed 16 Jan 2013 16:23:00 EST Internet access OK - project servers may be temporarily down.
Wed 16 Jan 2013 16:23:30 EST rosetta@home Finished download of input_rb_01_15_35935_68145__t000__0_C1_robetta.zip
Wed 16 Jan 2013 16:24:01 EST rosetta@home Finished download of input_hyb_ba_bench_01_3vdxC_yfsong.zip
ID: 74907
P . P . L .

Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 75063 - Posted: 7 Feb 2013, 20:54:48 UTC

Having more problems with the servers today, can't upload!

=======================

Fri 08 Feb 2013 07:49:24 EST rosetta@home Started upload of rb_02_05_36619_69591__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_73700_52_0_0
Fri 08 Feb 2013 07:49:24 EST rosetta@home Started upload of MB3_1_t000___robetta_IGNORE_THE_REST_03_09_72248_9964_1_0
Fri 08 Feb 2013 07:50:00 EST rosetta@home [error] Error reported by file upload server: can't open file
Fri 08 Feb 2013 07:50:00 EST rosetta@home [error] Error reported by file upload server: can't open file
Fri 08 Feb 2013 07:50:00 EST rosetta@home Temporarily failed upload of rb_02_05_36619_69591__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_73700_52_0_0: transient upload error
Fri 08 Feb 2013 07:50:00 EST rosetta@home Backing off 1 min 0 sec on upload of rb_02_05_36619_69591__t000__1_C1_SAVE_ALL_OUT_IGNORE_THE_REST_73700_52_0_0
Fri 08 Feb 2013 07:50:00 EST rosetta@home Temporarily failed upload of MB3_1_t000___robetta_IGNORE_THE_REST_03_09_72248_9964_1_0: transient upload error
Fri 08 Feb 2013 07:50:00 EST rosetta@home Backing off 1 min 0 sec on upload of MB3_1_t000___robetta_IGNORE_THE_REST_03_09_72248_9964_1_0

ID: 75063
Ed Parker

Joined: 8 May 07
Posts: 11
Credit: 132,966
RAC: 0
Message 75065 - Posted: 8 Feb 2013, 16:07:12 UTC

My RAC has dropped from almost 300 down to the 180s in the last few days. I've gone back to SETI, call me when you get your stuff fixed.
ID: 75065
Link

Joined: 4 May 07
Posts: 355
Credit: 382,349
RAC: 0
Message 75066 - Posted: 9 Feb 2013, 9:08:04 UTC - in response to Message 75065.  

My RAC has dropped from almost 300 down to the 180s in the last few days. I've gone back to SETI, call me when you get your stuff fixed.

ATM your RAC is 291, which is "almost 300"...
.
ID: 75066
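
As a rough yardstick for RAC drops like the one discussed above, the sketch below estimates how many days of zero granted credit it would take for RAC to fall from one value to another. It assumes pure exponential decay with a one-week half-life; both that half-life and the helper name `days_to_decay` are assumptions for illustration, not something stated in this thread.

```python
import math

HALF_LIFE_DAYS = 7.0  # assumption: RAC halves after one week with no new credit


def days_to_decay(start_rac, end_rac, half_life_days=HALF_LIFE_DAYS):
    """Days of zero granted credit needed for RAC to fall from start to end."""
    if not (0 < end_rac <= start_rac):
        raise ValueError("need 0 < end_rac <= start_rac")
    return half_life_days * math.log2(start_rac / end_rac)


print(round(days_to_decay(300, 180), 1))  # prints: 5.2
```

Under that assumption, a fall from 300 to 180 would need roughly five days with no credit granted at all, so a partial outage would take longer to produce the same drop.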
P . P . L .

Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 75076 - Posted: 13 Feb 2013, 20:51:57 UTC

Servers are down again, can't report or get work.

Thu 14 Feb 2013 07:43:29 EST | rosetta@home | Reporting 4 completed tasks, requesting new tasks for CPU
Thu 14 Feb 2013 07:43:31 EST | rosetta@home | Scheduler request completed: got 0 new tasks
Thu 14 Feb 2013 07:43:31 EST | rosetta@home | Server error: can't attach shared memory


ID: 75076
P . P . L .

Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 75085 - Posted: 17 Feb 2013, 4:41:26 UTC
Last modified: 17 Feb 2013, 5:32:51 UTC

Don't know what's going on with your servers. The server status page is showing all green, but I'm getting this on my rigs.

Sun 17 Feb 2013 15:38:24 EST | | Project communication failed: attempting access to reference site
Sun 17 Feb 2013 15:38:24 EST | rosetta@home | Scheduler request failed: Failure when receiving data from the peer
Sun 17 Feb 2013 15:38:26 EST | | Internet access OK - project servers may be temporarily down.

Sun 17 Feb 2013 15:42:09 EST | rosetta@home | Reporting 1 completed tasks, requesting new tasks for CPU
Sun 17 Feb 2013 15:43:13 EST | | Project communication failed: attempting access to reference site
Sun 17 Feb 2013 15:43:13 EST | rosetta@home | Scheduler request failed: Couldn't connect to server
Sun 17 Feb 2013 15:43:15 EST | | Internet access OK - project servers may be temporarily down.
ID: 75085
BarryAZ

Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 0
Message 75086 - Posted: 17 Feb 2013, 6:32:38 UTC - in response to Message 75085.  

I can confirm the problem here as well, as of about 3 hours ago (7 PM PST).
ID: 75086
TJ

Joined: 29 Mar 09
Posts: 127
Credit: 4,799,890
RAC: 0
Message 75087 - Posted: 17 Feb 2013, 9:00:38 UTC

Same problem here: not uploading, reporting work, or getting work. These are some of the error messages; it started at 2:15 PM on the 16th.

2/16/2013 2:15:38 PM | rosetta@home | Temporarily failed download of input_rb_02_15_36886_69972__t000__1_C1_robetta.zip: transient HTTP error
2/16/2013 2:15:40 PM | rosetta@home | Started download of input_rb_02_15_36886_69972__t000__1_C1_robetta.zip
2/16/2013 2:15:54 PM | | Project communication failed: attempting access to reference site
2/16/2013 2:15:55 PM | | Internet access OK - project servers may be temporarily down.
2/16/2013 2:16:12 PM | rosetta@home | Finished download of input_rb_02_15_36886_69972__t000__1_C1_robetta.zip
2/16/2013 2:35:21 PM | rosetta@home | Temporarily failed download of cys45__1227_fold_data.zip: transient HTTP error
2/16/2013 2:35:22 PM | rosetta@home | Started download of cys45__1227_fold_data.zip
2/16/2013 2:35:35 PM | | Project communication failed: attempting access to reference site
2/16/2013 2:35:37 PM | | Internet access OK - project servers may be temporarily down.
2/17/2013 5:13:40 AM | rosetta@home | Sending scheduler request: To fetch work.
2/17/2013 5:13:40 AM | rosetta@home | Reporting 3 completed tasks, requesting new tasks for CPU and ATI
2/17/2013 5:14:02 AM | rosetta@home | Scheduler request failed: Couldn't connect to server
2/17/2013 5:14:16 AM | | Project communication failed: attempting access to reference site
2/17/2013 5:14:18 AM | | Internet access OK - project servers may be temporarily down.
2/17/2013 6:56:14 AM | rosetta@home | Temporarily failed upload of Ross3X3_SAVE_ALL_OUT_t075_002_74654_1844_0_0: connect() failed

Server page is all green however...

Greetings,
TJ.
ID: 75087
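
When comparing logs like the ones posted in this thread, it can help to count failures per project instead of reading line by line. The sketch below handles only the pipe-delimited layout (`timestamp | project | message`) seen in the later posts; the function names and the list of failure prefixes are assumptions made for illustration, taken from the messages quoted here, and are not an official list of BOINC event-log messages.

```python
from collections import Counter

# Message prefixes treated as failures; drawn from the log lines in this thread.
FAILURE_MARKERS = (
    "Temporarily failed",
    "Scheduler request failed",
    "Project communication failed",
    "Server error",
)


def parse_line(line):
    """Split 'timestamp | project | message' into a 3-tuple, or return None."""
    parts = [p.strip() for p in line.split("|", 2)]
    if len(parts) != 3:
        return None
    return tuple(parts)


def count_failures(lines):
    """Count failure messages per project ('' means a client-wide message)."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _timestamp, project, message = parsed
        if message.startswith(FAILURE_MARKERS):
            counts[project] += 1
    return counts


sample = [
    "2/17/2013 5:13:40 AM | rosetta@home | Sending scheduler request: To fetch work.",
    "2/17/2013 5:14:02 AM | rosetta@home | Scheduler request failed: Couldn't connect to server",
    "2/17/2013 5:14:16 AM | | Project communication failed: attempting access to reference site",
    "2/17/2013 6:56:14 AM | rosetta@home | Temporarily failed upload of Ross3X3_SAVE_ALL_OUT_t075_002_74654_1844_0_0: connect() failed",
]
print(count_failures(sample))
```

Lines without two pipe separators, such as the older space-delimited format earlier in the thread, are skipped rather than guessed at.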
[VENETO] boboviz

Joined: 1 Dec 05
Posts: 1908
Credit: 8,773,998
RAC: 10,725
Message 75088 - Posted: 17 Feb 2013, 9:45:08 UTC - in response to Message 75087.  

Same problem here: not uploading, reporting work, or getting work. These are some of the error messages; it started at 2:15 PM on the 16th.

2/16/2013 2:35:35 PM | | Project communication failed: attempting access to reference site
2/16/2013 2:35:37 PM | | Internet access OK - project servers may be temporarily down.
Server page is all green however...


Same here....
ID: 75088
Warped

Joined: 15 Jan 06
Posts: 48
Credit: 1,788,185
RAC: 0
Message 75089 - Posted: 17 Feb 2013, 9:58:14 UTC

Oh well. It's Sunday so I expect we will have to wait another day for the comms to be sorted out.
Warped

ID: 75089
TJ

Joined: 29 Mar 09
Posts: 127
Credit: 4,799,890
RAC: 0
Message 75098 - Posted: 17 Feb 2013, 15:47:18 UTC

I see, however, that the number of "ready to send" WUs is decreasing, so some are getting new work...
Greetings,
TJ.
ID: 75098
TPCBF

Joined: 29 Nov 10
Posts: 109
Credit: 4,873,716
RAC: 3,065
Message 75101 - Posted: 17 Feb 2013, 18:56:33 UTC - in response to Message 75089.  

Oh well. It's Sunday so I expect we will have to wait another day for the comms to be sorted out.
Don't hold your breath. Monday is a holiday here in the US of A, so we will likely have to wait longer for anyone to fix whatever is broken.
Not that the project admins are more responsive during a work week though... :-(

Ralf
ID: 75101
BarryAZ

Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 0
Message 75102 - Posted: 17 Feb 2013, 19:19:27 UTC - in response to Message 75101.  

Could well be Tuesday due to the holiday tomorrow.

I'd note that at least some of the communications problems that happen with this project are related to administration over which the project has no control and precious little influence. It may simply be that someone in control of that aspect decided to change routing over the weekend and inadvertently neglected to inform the project people of the change. That has happened a few times in the past.

In any event, my own guess is 'best case' -- tomorrow someone on the project side becomes aware of the problem (and even posts something about it). Then, on Tuesday, the folks responsible for the IP addressing acknowledge they did something and either undo it or provide instructions so that folks can reset DNS to pick up the change.

Until then, I've suspended Rosetta processing and bumped up Malaria, Einstein, POEM and SETI to pick up the slack.




ID: 75102
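
To tell a routing or DNS change, as guessed at above, apart from a server that is simply down, one option is to check name resolution and the TCP connection separately. The following is a minimal sketch using Python's standard `socket` module; the function name `diagnose`, its return strings, and the injectable `resolve`/`connect` parameters are assumptions made for illustration, and the host to test has to come from your own client's project configuration since this thread does not give one.

```python
import socket


def diagnose(host, port=80, timeout=5.0,
             resolve=socket.getaddrinfo, connect=socket.create_connection):
    """Return 'dns-failure', 'connect-failure', or 'ok' for host:port."""
    try:
        # Step 1: can the name be resolved at all?
        resolve(host, port)
    except socket.gaierror:
        return "dns-failure"
    try:
        # Step 2: does anything accept a TCP connection at that address?
        conn = connect((host, port), timeout)
    except OSError:
        return "connect-failure"
    conn.close()
    return "ok"
```

A result of `dns-failure` would point toward the name not resolving, while `connect-failure` with working DNS fits "Couldn't connect to server" and suggests either the server itself or a route to it is the problem. The `resolve` and `connect` parameters exist only so the logic can be exercised without a network.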
GALAXY-VOYAGER

Joined: 25 Oct 12
Posts: 15
Credit: 47,437
RAC: 0
Message 75104 - Posted: 17 Feb 2013, 19:52:45 UTC - in response to Message 75102.  

Could well be Tuesday due to the holiday tomorrow.

I'd note that at least some of the communications problems that happen with this project are related to administration over which the project has no control and precious little influence. It may simply be that someone in control of that aspect decided to change routing over the weekend and inadvertently neglected to inform the project people of the change. That has happened a few times in the past.

In any event, my own guess is 'best case' -- tomorrow someone on the project side becomes aware of the problem (and even posts something about it). Then, on Tuesday, the folks responsible for the IP addressing acknowledge they did something and either undo it or provide instructions so that folks can reset DNS to pick up the change.

Until then, I've suspended Rosetta processing and bumped up Malaria, Einstein, POEM and SETI to pick up the slack.

Suspending The Project Will Not Prevent it from Uploading Completed Tasks to The Server. I've already Tried That.





ID: 75104
GALAXY-VOYAGER

Joined: 25 Oct 12
Posts: 15
Credit: 47,437
RAC: 0
Message 75105 - Posted: 17 Feb 2013, 20:08:51 UTC - in response to Message 75087.  
Last modified: 17 Feb 2013, 20:09:57 UTC

Same problem here: not uploading, reporting work, or getting work. These are some of the error messages; it started at 2:15 PM on the 16th.

2/16/2013 2:15:38 PM | rosetta@home | Temporarily failed download of input_rb_02_15_36886_69972__t000__1_C1_robetta.zip: transient HTTP error
2/16/2013 2:15:40 PM | rosetta@home | Started download of input_rb_02_15_36886_69972__t000__1_C1_robetta.zip
2/16/2013 2:15:54 PM | | Project communication failed: attempting access to reference site
2/16/2013 2:15:55 PM | | Internet access OK - project servers may be temporarily down.
2/16/2013 2:16:12 PM | rosetta@home | Finished download of input_rb_02_15_36886_69972__t000__1_C1_robetta.zip
2/16/2013 2:35:21 PM | rosetta@home | Temporarily failed download of cys45__1227_fold_data.zip: transient HTTP error
2/16/2013 2:35:22 PM | rosetta@home | Started download of cys45__1227_fold_data.zip
2/16/2013 2:35:35 PM | | Project communication failed: attempting access to reference site
2/16/2013 2:35:37 PM | | Internet access OK - project servers may be temporarily down.
2/17/2013 5:13:40 AM | rosetta@home | Sending scheduler request: To fetch work.
2/17/2013 5:13:40 AM | rosetta@home | Reporting 3 completed tasks, requesting new tasks for CPU and ATI
2/17/2013 5:14:02 AM | rosetta@home | Scheduler request failed: Couldn't connect to server
2/17/2013 5:14:16 AM | | Project communication failed: attempting access to reference site
2/17/2013 5:14:18 AM | | Internet access OK - project servers may be temporarily down.
2/17/2013 6:56:14 AM | rosetta@home | Temporarily failed upload of Ross3X3_SAVE_ALL_OUT_t075_002_74654_1844_0_0: connect() failed

Server page is all green however...



Yep !!! ... I'm still having Problems Uploading. They run for a while and then revert to try Again in nn:nn:nn, run for a while at that, then revert to Active, and keep reverting from one to the other. They are only Active for a few seconds to a few minutes at a time. After so many attempts to Upload and Retry, they show that they will Project Backoff in nn:nn:nn.
My Tasks are ready to Backoff at the moment in about 90 Minutes. I have been manually clicking Retry Now, but I've given up. I'm letting them count down to Project Backoff, and if it Fails, too bad. But if it Reverts to Retry again and doesn't work, I'm going to Abort Transfer. In the meantime, I've Selected NO NEW TASKS from R@H. I have a couple more Tasks ready to Report, and if They also cause the same Issues, I'm dropping R@H.
ID: 75105
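
The retry and backoff cycle described above is typical of clients that wait longer after each failed transfer. The sketch below shows one common scheme, exponential backoff with random jitter. It is not BOINC's actual algorithm; the function name, the 60-second base (loosely echoing the "Backing off 1 min 0 sec" log line earlier in the thread) and the four-hour cap are all assumptions made for illustration.

```python
import random


def backoff_delays(attempts, base=60.0, cap=4 * 3600.0, rng=None):
    """Yield one randomized delay in seconds for each failed attempt."""
    rng = rng or random.Random()
    for attempt in range(attempts):
        # The ceiling doubles with every failure until it reaches the cap.
        ceiling = min(cap, base * (2 ** attempt))
        # Pick uniformly below the ceiling so that many clients do not
        # all retry at the same instant once the server comes back.
        yield rng.uniform(0.0, ceiling)


for i, delay in enumerate(backoff_delays(5, rng=random.Random(0)), start=1):
    print(f"retry {i}: wait {delay:.0f} s")
```

Under a scheme like this, clicking Retry Now by hand mostly adds load while the server is unreachable; letting the timer run costs little, because the transfer is attempted again once the delay expires.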
amgthis

Joined: 25 Mar 06
Posts: 81
Credit: 203,879,282
RAC: 0
Message 75106 - Posted: 17 Feb 2013, 20:57:37 UTC - in response to Message 75105.  
Last modified: 17 Feb 2013, 20:59:05 UTC

Server page is all green however...

I'll bet I've seen that 'server status' page actually show something as 'down'
maybe once over a few years. AFAIK it's rarely if ever accurate or updated.

Oh well.
ID: 75106
©2024 University of Washington
https://www.bakerlab.org