Problems and Technical Issues with Rosetta@home

Pilgrim57
Joined: 31 Jul 08
Posts: 3
Credit: 1,965,851
RAC: 0
Message 86907 - Posted: 31 Jul 2017, 11:24:48 UTC

WU 838793241
I restarted the project after a 3-week lull. Despite my having set WUs to run for 10 hours, BOINC shows a 4-hour run time, and this hasn't updated after more than 10 WUs, so too many were downloaded to meet the deadline.
I paused this unit after 7 hours to start another so that both would finish in time. Upon restarting the PC this morning, it reset the WU as having run for only 7 minutes!
This is why I aborted it. I am not running it for another 10 hours for the same credit.
Please note I will abandon the project if this keeps happening. It is about time that you fixed the problem that either no work is sent until I run out, or so much is sent that it occupies 100% of CPU time when the project is only supposed to use 15% of 2 cores. Part of the problem is that WUs won't be sent while other projects are running, and when I suspend those projects to get more work, too many are sent.
ID: 86907 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 86918 - Posted: 31 Jul 2017, 16:35:18 UTC - in response to Message 86907.

The BOINC Manager handles all of the work requests to each project, so you may want to give your feedback on the BOINC message boards. Keep in mind that it is trying to manage multiple projects with various deadlines and preferences. If you are suspending projects to force work downloads, you are not helping the BOINC Manager cope. If you have a specific requirement, such as a limited window of time for network access, please post a question explaining it. BOINC has a lot of preferences that can be used to meet your objectives.

I also wanted to be certain you understand that the 15% resource share is a preference the BOINC Manager should be expected to achieve over the course of each week or so, not each hour or each afternoon. It is inefficient to try to run all of the tasks from all of your projects at the same time, so it rotates through each project over time.

WUs WILL be sent when other projects are running, but the BOINC Manager decided it was not yet time to request them. It will get work when doing so meets the preferences you have established and it knows that existing work will be completed.
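
As an illustration of the point about resource share being a long-run average, here is a toy sketch in Python. It is not BOINC's actual scheduling algorithm; the `simulate` function, the project names, and the one-hour time slice are all assumptions made up for this example. Each hour it simply runs whichever project is furthest behind its share.

```python
def simulate(shares, hours):
    """Toy scheduler (hypothetical, not BOINC's real logic)."""
    total = sum(shares.values())
    ran = {name: 0 for name in shares}   # hours each project has run so far
    schedule = []                        # which project ran in each hour
    for hour in range(1, hours + 1):
        # Deficit, scaled by `total` to stay in integers:
        # hours owed by now minus hours actually run.
        def deficit(name):
            return shares[name] * hour - ran[name] * total
        pick = max(shares, key=deficit)
        ran[pick] += 1
        schedule.append(pick)
    return ran, schedule


# Assumed setup: one project at 15% share, everything else lumped together.
ran, schedule = simulate({"rosetta": 15, "other": 85}, hours=168)
first_day = schedule[:24].count("rosetta") / 24
whole_week = ran["rosetta"] / 168
print(f"first day: {first_day:.0%}, whole week: {whole_week:.0%}")
```

In any single hour one project gets the whole slot, so a short observation window can look nothing like the configured share; only the total over the week approaches 15%.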
Rosetta Moderator: Mod.Sense
ID: 86918 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 1,227
Message 87214 - Posted: 5 Sep 2017, 22:26:24 UTC

Uploads currently aren't working. The delay before the next try is over 4 hours, so this may have started several hours ago. Is someone working on this problem?
ID: 87214 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 1,227
Message 87220 - Posted: 6 Sep 2017, 4:17:36 UTC - in response to Message 87214.

Uploads still aren't working, but reporting tasks that have finished uploading is working. I haven't tried downloads today.
ID: 87220 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1480
Credit: 4,334,829
RAC: 0
Message 87221 - Posted: 6 Sep 2017, 5:42:54 UTC

I think we've resolved this issue. Please let us know if it persists.

The file locking logic in the upload handler was failing. We don't know what caused this; rebooting the filesystem didn't help. But rebuilding the upload handler after commenting out the locking logic did appear to fix the issue.
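
For readers unfamiliar with the term, the following is a minimal Python sketch of advisory file locking on a Unix-like system. It is not the upload handler's code, and it does not reproduce the failure described above; the `try_lock` helper and the scratch file are hypothetical. It only shows the general idea: a second writer is refused while the first holds the lock.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return an open descriptor holding an exclusive lock, or None if busy."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


# Hypothetical upload target, created in a scratch directory.
path = os.path.join(tempfile.mkdtemp(), "result_file")

first = try_lock(path)    # first writer gets the lock
second = try_lock(path)   # second writer is refused while it is held
os.close(first)           # closing the descriptor releases the lock
third = try_lock(path)    # now the lock can be taken again
print(first is not None, second is None, third is not None)
os.close(third)
```

If the locking call itself fails for reasons unrelated to contention, every writer is turned away, which is one way a fault in locking logic could block all uploads; whether that matches what happened here is not stated in the post.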
ID: 87221 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2145
Credit: 41,560,787
RAC: 9,320
Message 87223 - Posted: 6 Sep 2017, 9:54:31 UTC - in response to Message 87221.

It took until 3.5 hours after your message, but everything has cleared up for me. Thanks.
ID: 87223 · Rating: 0
furukitsune
Joined: 19 Mar 16
Posts: 9
Credit: 7,233,115
RAC: 893
Message 87308 - Posted: 16 Sep 2017, 9:40:59 UTC

ID: 87308 · Rating: 0
David E K
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Jul 05
Posts: 1480
Credit: 4,334,829
RAC: 0
Message 87327 - Posted: 19 Sep 2017, 17:45:57 UTC - in response to Message 87308.

This is a different issue that is probably specific to your computer or BOINC client.
ID: 87327 · Rating: 0
JohnH
Joined: 25 Mar 13
Posts: 43
Credit: 2,319,355
RAC: 0
Message 87346 - Posted: 23 Sep 2017, 6:55:48 UTC

WUs are finishing and uploading but credit stays pending.
bwsrv2, hosting the assimilator, validator, and delete processes, shows as not running.
Update please.
ID: 87346 · Rating: 0
BarryAZ
Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 0
Message 87348 - Posted: 23 Sep 2017, 16:06:59 UTC - in response to Message 87346.

I have noticed this as well, starting on Friday (Sept. 22).
ID: 87348 · Rating: 0
Chilean
Joined: 16 Oct 05
Posts: 711
Credit: 26,694,507
RAC: 0
Message 87349 - Posted: 23 Sep 2017, 17:00:21 UTC

547 WUs pending validation....
ID: 87349 · Rating: 0
Chilean
Joined: 16 Oct 05
Posts: 711
Credit: 26,694,507
RAC: 0
Message 87350 - Posted: 23 Sep 2017, 21:56:48 UTC

The validator servers seem to be down.
ID: 87350 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2145
Credit: 41,560,787
RAC: 9,320
Message 87354 - Posted: 24 Sep 2017, 2:51:39 UTC

Validators aren't running

https://boinc.bakerlab.org/rosetta/server_status.php
ID: 87354 · Rating: 0
JohnH
Joined: 25 Mar 13
Posts: 43
Credit: 2,319,355
RAC: 0
Message 87356 - Posted: 24 Sep 2017, 6:40:25 UTC - in response to Message 87354.

Who can we tell? I'm never sure.
ID: 87356 · Rating: 0
BarryAZ
Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 0
Message 87361 - Posted: 24 Sep 2017, 18:29:02 UTC - in response to Message 87356.

And they remain offline -- I suspect when folks show up for the project after the weekend they will take notice and restart the validators. Until then..... meh
ID: 87361 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2145
Credit: 41,560,787
RAC: 9,320
Message 87365 - Posted: 25 Sep 2017, 14:15:50 UTC

It's of no consolation whatsoever, but the new task listings page makes it very easy to see my last validated task was timed at 22 Sep 2017, 19:05:14 UTC

Also, I just realised there are no Rosetta Android tasks available at the moment. Switching to my backup project
ID: 87365 · Rating: 0
rosetta_and_chill?
Joined: 14 Nov 16
Posts: 1
Credit: 10,324,920
RAC: 0
Message 87366 - Posted: 25 Sep 2017, 15:04:21 UTC

I am not receiving work units for any of my completed tasks; this has only become an issue since upgrading to the newest version of Rosetta.

Also, I will wake up, check my queue, and see 100+ jobs with 00:01 of computing time that resulted in "computational error". I then have to update the project in order to replace all of them...
ID: 87366 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2145
Credit: 41,560,787
RAC: 9,320
Message 87367 - Posted: 25 Sep 2017, 16:19:16 UTC - in response to Message 87365.
Last modified: 25 Sep 2017, 16:27:21 UTC

Servers back running and over half my backlog has been validated already.

Still no Android tasks at this precise moment but I hope that'll be next on the agenda

Edit: I have one completed job that's refusing to upload at the moment. Not sure if that's related or not.
Edit2: It's a Rosetta 4.03 task - one of the very few I've downloaded and possibly the first I've completed - checks: no, I've had just one other 4.03 task successfully uploaded and validated already
ID: 87367 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2145
Credit: 41,560,787
RAC: 9,320
Message 87368 - Posted: 25 Sep 2017, 17:18:50 UTC - in response to Message 87367.

Also now a Rosetta Mini 3.73 task, so nothing to do with being a 4.03 task.

Only just noticed that the naming convention has finally reverted to Rosetta rather than Rosetta Mini
ID: 87368 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2145
Credit: 41,560,787
RAC: 9,320
Message 87369 - Posted: 25 Sep 2017, 18:00:27 UTC - in response to Message 87368.
Last modified: 25 Sep 2017, 18:02:18 UTC

So some specific errors with the upload followed by a job finishing normally and uploading straight away with no issues. Getting a bit weird now

Monday 25/09/2017 18:13:57 | Rosetta@home | Started upload of rb_09_22_77689_120373__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_515725_239_1_r1801937373_0
Monday 25/09/2017 18:13:59 | Rosetta@home | [error] Error reported by file upload server: can't open log file '../log_bwsrv2/file_upload_handler.log' (errno: 9)
Monday 25/09/2017 18:13:59 | Rosetta@home | Temporarily failed upload of rb_09_22_77689_120373__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_515725_239_1_r1801937373_0: transient upload error
Monday 25/09/2017 18:13:59 | Rosetta@home | Backing off 00:15:23 on upload of rb_09_22_77689_120373__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_515725_239_1_r1801937373_0
Monday 25/09/2017 18:34:47 | Rosetta@home | Started upload of rb_09_22_77689_120373__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_515725_239_1_r1801937373_0
Monday 25/09/2017 18:34:49 | Rosetta@home | [error] Error reported by file upload server: can't open log file '../log_bwsrv2/file_upload_handler.log' (errno: 9)
Monday 25/09/2017 18:34:49 | Rosetta@home | Temporarily failed upload of rb_09_22_77689_120373__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_515725_239_1_r1801937373_0: transient upload error
Monday 25/09/2017 18:34:49 | Rosetta@home | Backing off 00:30:19 on upload of rb_09_22_77689_120373__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_515725_239_1_r1801937373_0
Monday 25/09/2017 18:42:43 | Rosetta@home | Computation for task rb_09_22_77372_120357__t000__ab_robetta_IGNORE_THE_REST_515702_3925_0 finished
Monday 25/09/2017 18:42:59 | Rosetta@home | Starting task de82a6e775c6314a7ae40a83155e9ff8_C2_docking_big_job_17_08_02_49_28_globalDocking_6_SAVE_ALL_OUT_514767_23_0
Monday 25/09/2017 18:42:59 | Rosetta@home | [cpu_sched] Starting task de82a6e775c6314a7ae40a83155e9ff8_C2_docking_big_job_17_08_02_49_28_globalDocking_6_SAVE_ALL_OUT_514767_23_0 using minirosetta version 373 in slot 5
Monday 25/09/2017 18:43:00 | Rosetta@home | Started upload of rb_09_22_77372_120357__t000__ab_robetta_IGNORE_THE_REST_515702_3925_0_r568864107_0
Monday 25/09/2017 18:43:11 | Rosetta@home | Finished upload of rb_09_22_77372_120357__t000__ab_robetta_IGNORE_THE_REST_515702_3925_0_r568864107_0
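
The two "Backing off" lines in the log above show the retry delay roughly doubling, which looks like exponential backoff with some randomization, though that is an inference from only two data points. A minimal Python sketch for pulling those delays out of a client log follows; the `backoff_delays` helper and the regular expressions are assumptions written for this log excerpt, not part of BOINC.

```python
import re

# Event log lines look like: "<timestamp> | <project> | <message>"
LINE = re.compile(r"^(?P<when>.+?) \| (?P<project>.+?) \| (?P<msg>.*)$")
BACKOFF = re.compile(r"Backing off (\d+):(\d+):(\d+) on upload of (\S+)")


def backoff_delays(lines):
    """Return (file name, delay in seconds) for every upload backoff line."""
    delays = []
    for line in lines:
        parsed = LINE.match(line.strip())
        if parsed is None:
            continue  # skip blank or unrecognized lines
        hit = BACKOFF.search(parsed.group("msg"))
        if hit:
            hours, minutes, seconds = (int(x) for x in hit.group(1, 2, 3))
            delays.append((hit.group(4), hours * 3600 + minutes * 60 + seconds))
    return delays


# The two backoff lines from the log above, with the file name factored out.
name = ("rb_09_22_77689_120373__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST"
        "_515725_239_1_r1801937373_0")
log = [
    f"Monday 25/09/2017 18:13:59 | Rosetta@home | Backing off 00:15:23 on upload of {name}",
    f"Monday 25/09/2017 18:34:49 | Rosetta@home | Backing off 00:30:19 on upload of {name}",
]
delays = [seconds for _, seconds in backoff_delays(log)]
print(delays, delays[1] / delays[0])
```

Comparing successive delays this way makes it easier to tell a server-side fault (every retry fails, delays keep growing) from a one-off glitch.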

ID: 87369 · Rating: 0


©2024 University of Washington
https://www.bakerlab.org