Problems and Technical Issues with Rosetta@home

Message boards : Number crunching : Problems and Technical Issues with Rosetta@home

Brian Nixon

Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 0
Message 97481 - Posted: 21 Jun 2020, 0:14:07 UTC - in response to Message 97480.  
Last modified: 21 Jun 2020, 0:18:46 UTC

"Timed out - no response"
This one I don’t know about.


"Not started by deadline - canceled"
Deadlines for Rosetta@home tasks are 3 days from the time they are issued. It can take BOINC several days to learn how much wall time it takes a new computer to complete the CPU time required for each task (8 hours by default), so it is important to configure your settings to ensure that BOINC doesn’t download more tasks than you will be able to complete before their deadline.

In your settings (configured either online or locally via Advanced view > Options > Computing preferences), set "Store at least" to no more than 0.3 days (≈ one task’s duration, if your CPU is doing nothing besides Rosetta), and "Store up to an additional" to 0.02 days. These low values should help ensure you don’t build up a backlog of tasks you can’t complete before the deadline; as long as your computer is frequently online you will not run out of tasks, because BOINC will still contact the server whenever it needs more work.

If you have a large backlog of "Ready to start" tasks that you suspect you have no chance of completing, you can "Abort" them to return them to the server to be allocated to another computer.
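As a rough sanity check on a backlog, the deadline arithmetic can be sketched in Python. This is only an illustration: the function names and parameters are invented here and are not part of BOINC, and it assumes every task needs the default 8 hours of CPU time, that all queued tasks were issued at the same moment, and that the 3-day deadline quoted above applies.

```python
import math

def tasks_completable(cores: int,
                      hours_per_day: float = 24.0,
                      task_cpu_hours: float = 8.0,
                      deadline_days: float = 3.0) -> int:
    """Rough upper bound on tasks one machine can finish before the deadline."""
    # Hours each core can actually crunch before the deadline expires.
    hours_per_core = deadline_days * hours_per_day
    # Whole tasks per core; a task cannot be split across cores.
    per_core = math.floor(hours_per_core / task_cpu_hours)
    return cores * per_core

def tasks_to_abort(queued: int, **machine) -> int:
    """How many queued tasks exceed what the machine can plausibly finish."""
    return max(0, queued - tasks_completable(**machine))

if __name__ == "__main__":
    # Hypothetical 4-core machine that crunches 12 hours a day with 20 tasks queued.
    print(tasks_to_abort(20, cores=4, hours_per_day=12))
```

Anything the second function reports as excess is a candidate for aborting, so that the server can reissue it to another computer.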
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1481
Credit: 14,608,684
RAC: 15,528
Message 97482 - Posted: 21 Jun 2020, 0:42:35 UTC - in response to Message 97480.  
Last modified: 21 Jun 2020, 0:43:55 UTC

I'm running Rosetta 19 hours/day using at most 32% of CPU time.
"Not started by deadline - canceled"
The problem is in that first line. "using at most 32% of CPU time" means it will take at least 24 hours (8 ÷ 0.32 = 25 hours of run time) to process one 8-hour task.
You would be much better off limiting the number of cores/threads you use, but letting them run at 100%.

I'd suggest
Computing
   Usage limits
      Use at most 50% of the CPUs
      Use at most 100% of CPU time
and use the caching values as suggested by Brian Nixon (or even smaller ones).
Grant
Darwin NT
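To make the throttling arithmetic concrete, here is a minimal Python sketch. It assumes the CPU-time limit scales progress linearly (a simplification) and uses the 8-hour default task length, the 32% limit and the 19 hours/day mentioned in the posts above; the function names are made up for illustration.

```python
def run_hours_per_task(task_cpu_hours: float, cpu_time_fraction: float) -> float:
    """Hours BOINC must be running to accumulate the task's CPU time."""
    if not 0 < cpu_time_fraction <= 1:
        raise ValueError("cpu_time_fraction must be in (0, 1]")
    return task_cpu_hours / cpu_time_fraction

def calendar_days_per_task(task_cpu_hours: float,
                           cpu_time_fraction: float,
                           hours_per_day: float) -> float:
    """Calendar days per task when BOINC only runs part of each day."""
    return run_hours_per_task(task_cpu_hours, cpu_time_fraction) / hours_per_day

if __name__ == "__main__":
    print(run_hours_per_task(8, 0.32))
    print(calendar_days_per_task(8, 0.32, 19))
```

With these inputs a single task needs about 25 hours of run time, or roughly 1.3 calendar days at 19 hours/day, so a core can finish only about two tasks inside a 3-day deadline.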
[VENETO] boboviz

Joined: 1 Dec 05
Posts: 1860
Credit: 8,160,158
RAC: 8,440
Message 97486 - Posted: 22 Jun 2020, 6:37:17 UTC

Error after PC reboot:
22/06/2020 08:32:06 | Rosetta@home | [error] garbage_collect(); still have active task for acked result rb_06_18_29700_29099__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_949298_51_1; state 1

Sid Celery

Joined: 11 Feb 08
Posts: 1982
Credit: 38,450,760
RAC: 14,492
Message 97489 - Posted: 22 Jun 2020, 12:23:07 UTC
Last modified: 22 Jun 2020, 12:23:44 UTC

It's been an excellent run for the last few months, but no tasks available for download
As of 22 Jun 2020, 11:02:34 UTC [ Scheduler running ]
Total queued jobs: 0
In progress: 599,832
Successes last 24h: 512,637
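The figures above give a rough idea of how long the work already in progress would last if nothing new were queued. The Python sketch below simply divides the two numbers; it assumes tasks keep being returned at the last-24-hours rate, which is an approximation, and the function name is invented for illustration.

```python
def runway_days(in_progress: int, successes_last_24h: int) -> float:
    """Days of work left if in-progress tasks drain at the recent completion rate."""
    if successes_last_24h <= 0:
        raise ValueError("need a positive completion rate")
    return in_progress / successes_last_24h

if __name__ == "__main__":
    # Values taken from the server status snapshot quoted above.
    print(round(runway_days(599_832, 512_637), 2))
```

For this snapshot the estimate comes to a little under 1.2 days.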

Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 97493 - Posted: 22 Jun 2020, 15:06:51 UTC - in response to Message 97489.  

It's been an excellent run for the last few months, but no tasks available for download

Thanks. I wish they would tell us whether it is an operational glitch with the servers, or a longer-term issue with supply.
We need to manage our machines too, if we are to do the work effectively.
Mr P Hucker

Joined: 12 Aug 06
Posts: 1600
Credit: 9,710,557
RAC: 8,359
Message 97495 - Posted: 22 Jun 2020, 17:16:05 UTC - in response to Message 97493.  
Last modified: 22 Jun 2020, 17:17:25 UTC

It's been an excellent run for the last few months, but no tasks available for download

Thanks. I wish they would tell us whether it is an operational glitch with the servers, or a longer-term issue with supply.
We need to manage our machines too, if we are to do the work effectively.


On the main page, https://boinc.bakerlab.org/rosetta/ , is "total queued jobs". This varies from 0 to 12 million. Just keep an eye on that. A server glitch wouldn't show that going down to 0. I'm assuming at the moment we're coming to the end of a set of work and awaiting the scientists analysing that before creating more.

Do what I do, use more than one project so the computer is always busy. If you class Rosetta as much more important than the other project, simply set the resource share at https://boinc.bakerlab.org/rosetta/prefs.php?subset=project and the equivalent at your other project to either a much higher number for Rosetta, or 0 for the other project (which means it only runs when Rosetta has no tasks).

Or of course just let the machines have a rest? Not sure why some people get upset when there's no work to do.
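As a rough illustration of the resource share setting, the Python sketch below computes the long-run fraction of computing each project would receive. It is a simplification: BOINC's real scheduler balances usage over time and is more involved. The project names and share values are hypothetical.

```python
def long_run_fractions(resource_shares: dict[str, float]) -> dict[str, float]:
    """Fraction of computing each project gets, proportional to its resource share."""
    total = sum(resource_shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive resource share")
    return {name: share / total for name, share in resource_shares.items()}

if __name__ == "__main__":
    # Rosetta weighted much higher than a second project.
    print(long_run_fractions({"Rosetta@home": 900, "Other project": 100}))
    # A share of 0 leaves the other project as a backup only.
    print(long_run_fractions({"Rosetta@home": 100, "Other project": 0}))
```

In the second case the other project gets a fraction of zero, matching the description above: it only runs when Rosetta has no tasks.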
Mr P Hucker

Joined: 12 Aug 06
Posts: 1600
Credit: 9,710,557
RAC: 8,359
Message 97496 - Posted: 22 Jun 2020, 17:19:45 UTC - in response to Message 97486.  

Error after pc reboot:
22/06/2020 08:32:06 | Rosetta@home | [error] garbage_collect(); still have active task for acked result rb_06_18_29700_29099__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_949298_51_1; state 1


I'm guessing, but could that mean your PC sent a task back but didn't finish getting the response before you rebooted? So the server has the result, but your PC thinks it doesn't. Do you have a completed task in the list?
Sid Celery

Joined: 11 Feb 08
Posts: 1982
Credit: 38,450,760
RAC: 14,492
Message 97498 - Posted: 22 Jun 2020, 18:46:38 UTC - in response to Message 97493.  

It's been an excellent run for the last few months, but no tasks available for download

Thanks. I wish they would tell us whether it is an operational glitch with the servers, or a longer-term issue with supply.
We need to manage our machines too, if we are to do the work effectively.

The reaction to a lack of new tasks coming down always amuses me.
As far as I can make out there are still over 500k un-returned tasks, so it's an early-warning notice of about a day to the project, not users.
Users (in total) can be running those 500k tasks quite merrily until the project has more tasks to be worked on.
Rosetta won't create meaningless tasks just to keep users active - that was Seti's job and even they got fed up of doing it <cough>

Users can do any number of things:
We can temporarily increase runtimes to eke out the remaining tasks we have
We can download tasks from backup projects we set up for just this eventuality (we did that years ago, right?)
We can complete existing tasks and wait for meaningful work to become available again, knowing that we offer our capacity without any guarantee tasks will always be available to fill it.
We can use the downtime to clean and check PCs for the first time in years so that when new tasks arrive they'll run cleaner/cooler/more productively (I did this last week before I switched two furloughed PCs back on to great effect)

Users can avoid:
Whining about it, knowing that any shortfall in productivity will affect the project vastly more than any 1 user (or all users put together)

As far as managing our machines goes, I know it's been fashionable to pare offline caches to the bone, and that makes sense if there are problems running work or in meeting deadlines, but I've never done that.
As long as my cache is below (2 days, minus runtime, minus a margin for variability) I'm good and I've got time to react to short-term availability issues at the project.
So, anything below 1.5 days total works for my 24/7 PCs, 1.1 for my 24/7 remote PC and 0.6 for my remote work PC.
I set that up 3 or 4 years ago and adjusted my remote PCs when I finally got back to work last week.
Being out of tasks at Rosetta, short or long-term, makes no difference to those settings, so I'm not particularly waiting for any official message here outside of a casual interest
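The cache rule of thumb above (2 days, minus runtime, minus a margin) can be written out as a small Python sketch. The 8-hour runtime is the project default mentioned earlier in the thread; the 4-hour margin is purely an assumption, picked because it reproduces the 1.5-day figure, and the function name is illustrative.

```python
def max_cache_days(ceiling_days: float = 2.0,
                   task_runtime_hours: float = 8.0,
                   margin_hours: float = 4.0) -> float:
    """Largest total cache (in days) under the '2 days minus runtime minus margin' rule."""
    # margin_hours is an assumed value, not one stated in the post.
    return ceiling_days - (task_runtime_hours + margin_hours) / 24.0

if __name__ == "__main__":
    print(max_cache_days())
```

Machines that are not crunching around the clock would need a lower value, as the smaller figures quoted for the remote PCs suggest.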
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 97499 - Posted: 22 Jun 2020, 18:52:35 UTC - in response to Message 97498.  

I will be out of work on a few cores by this evening.
I am quite capable of managing it.
robertmiles

Joined: 16 Jun 08
Posts: 1224
Credit: 13,847,051
RAC: 1,945
Message 97500 - Posted: 22 Jun 2020, 19:09:52 UTC - in response to Message 97498.  

It's been an excellent run for the last few months, but no tasks available for download

Thanks. I wish they would tell us whether it is an operational glitch with the servers, or a longer-term issue with supply.
We need to manage our machines too, if we are to do the work effectively.

The reaction to a lack of new tasks coming down always amuses me.
As far as I can make out there are still over 500k un-returned tasks, so it's an early-warning notice of about a day to the project, not users.
Users (in total) can be running those 500k tasks quite merrily until the project has more tasks to be worked on.
Rosetta won't create meaningless tasks just to keep users active - that was Seti's job and even they got fed up of doing it <cough>
[snip]

Before that, Predictor@home did even worse - after the project team split up and they lost the two members who could create useful new workunits, they kept the project running for a few more months by repeatedly increasing the number of times unfinished workunits could fail before they no longer had tasks sent out. Some of them had failed over 30 times before the project shut down.
Sid Celery

Joined: 11 Feb 08
Posts: 1982
Credit: 38,450,760
RAC: 14,492
Message 97503 - Posted: 22 Jun 2020, 21:26:58 UTC - in response to Message 97499.  
Last modified: 22 Jun 2020, 21:35:10 UTC

I will be out of work on a few cores by this evening.
I am quite capable of managing it.

I'd have thought that already - I was a bit surprised you suggested otherwise.
Still worth going through the stages for any newcomers.

Two other things:
UW are UTC -9 I think, so more time to get a solution together
Any switch of tasks to other projects creates a 'debt' with Boinc for Rosetta, which will be 'repaid' once tasks become available again

Edit: Just noticed my work PC completed all Rosetta tasks already and buffers starting to be filled on two other machines ahead of completing remaining Rosetta tasks. All going as expected
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 97507 - Posted: 23 Jun 2020, 1:15:47 UTC - in response to Message 97503.  

Edit: Just noticed my work PC completed all Rosetta tasks already and buffers starting to be filled on two other machines ahead of completing remaining Rosetta tasks. All going as expected

You are lucky, or maybe are getting re-sends. One of my machines is starting to go dry. That is my concern. They may come back immediately, or not. You don't know. Good luck.
Sid Celery

Joined: 11 Feb 08
Posts: 1982
Credit: 38,450,760
RAC: 14,492
Message 97521 - Posted: 23 Jun 2020, 8:20:03 UTC - in response to Message 97507.  

Edit: Just noticed my work PC completed all Rosetta tasks already and buffers starting to be filled on two other machines ahead of completing remaining Rosetta tasks. All going as expected

You are lucky, or maybe are getting re-sends. One of my machines is starting to go dry. That is my concern. They may come back immediately, or not. You don't know. Good luck.

Sorry, I missed out the important words "with WCG tasks", not Rosetta - completely changing the meaning of that edit...

And since then, a 2nd of my 4 PCs has run out of Rosetta tasks.

And since then, it looks like 70k+ tasks have been issued, increasing in progress tasks, 6 of which I grabbed for 1 PC and none for the others, and they've all been gobbled up.
So something is happening, but insufficient to meet demand yet. Fingers crossed for later today.
gw666

Joined: 30 Apr 20
Posts: 2
Credit: 387,116,939
RAC: 5
Message 97525 - Posted: 23 Jun 2020, 10:40:24 UTC

It seems I don't get new tasks, this is a typical message on one of my machines:
23-Jun-2020 10:58:33 [Rosetta@home] Scheduler request completed: got 0 new tasks
23-Jun-2020 10:58:33 [Rosetta@home] No tasks sent
23-Jun-2020 11:04:08 [Rosetta@home] Sending scheduler request: To fetch work.
23-Jun-2020 11:04:08 [Rosetta@home] Requesting new tasks for CPU
23-Jun-2020 11:04:11 [Rosetta@home] Scheduler request completed: got 0 new tasks
23-Jun-2020 11:04:11 [Rosetta@home] No tasks sent
23-Jun-2020 11:17:48 [Rosetta@home] Sending scheduler request: To fetch work.
23-Jun-2020 11:17:48 [Rosetta@home] Requesting new tasks for CPU
23-Jun-2020 11:17:51 [Rosetta@home] Scheduler request completed: got 0 new tasks
23-Jun-2020 11:17:51 [Rosetta@home] No tasks sent
23-Jun-2020 12:16:33 [Rosetta@home] Sending scheduler request: To fetch work.
23-Jun-2020 12:16:33 [Rosetta@home] Requesting new tasks for CPU
23-Jun-2020 12:16:37 [Rosetta@home] Scheduler request completed: got 0 new tasks
23-Jun-2020 12:16:37 [Rosetta@home] No tasks sent

This machine has 32 cores and only one is occupied at the moment.
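Log excerpts like the one above can be summarised with a few lines of Python. The sketch below assumes the event log lines have exactly the layout shown in the post (timestamp, project in square brackets, message); the regular expression and function names are illustrative and not part of BOINC.

```python
import re

# Matches lines such as:
# 23-Jun-2020 10:58:33 [Rosetta@home] Scheduler request completed: got 0 new tasks
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<project>[^\]]+)\] "
    r"Scheduler request completed: got (?P<n>\d+) new tasks?$"
)

def completed_requests(log_text: str) -> list[tuple[str, str, int]]:
    """Return (timestamp, project, tasks_received) for each completed scheduler request."""
    results = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            results.append((match["ts"], match["project"], int(match["n"])))
    return results

def count_empty_requests(log_text: str) -> int:
    """Number of completed scheduler requests that returned no tasks."""
    return sum(1 for _, _, n in completed_requests(log_text) if n == 0)
```

Run over the excerpt above, every completed request would be counted as empty, which is consistent with the server having no queued jobs at the time.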
robertmiles

Joined: 16 Jun 08
Posts: 1224
Credit: 13,847,051
RAC: 1,945
Message 97529 - Posted: 23 Jun 2020, 13:59:52 UTC - in response to Message 97525.  

gw666,

That has been typical for the last few days.

It's a good reason to add another BOINC project on your computer. I like World Community Grid, since their Open Pandemics subproject is currently working on COVID-19.

https://join.worldcommunitygrid.org?recruiterId=480838
Mr P Hucker

Joined: 12 Aug 06
Posts: 1600
Credit: 9,710,557
RAC: 8,359
Message 97535 - Posted: 23 Jun 2020, 20:04:54 UTC - in response to Message 97507.  

Edit: Just noticed my work PC completed all Rosetta tasks already and buffers starting to be filled on two other machines ahead of completing remaining Rosetta tasks. All going as expected

You are lucky, or maybe are getting re-sends. One of my machines is starting to go dry. That is my concern. They may come back immediately, or not. You don't know. Good luck.


Why is it of concern? I look at it as a good thing that we completed the workload. Now there are little chunks coming in, and they get taken immediately. Better for the science to get done rapidly than for them to have to wait until we finish it. I assume they've reached a key point in the research.
yoerik

Joined: 24 Mar 20
Posts: 128
Credit: 169,525
RAC: 0
Message 97539 - Posted: 23 Jun 2020, 21:28:54 UTC - in response to Message 97535.  

Edit: Just noticed my work PC completed all Rosetta tasks already and buffers starting to be filled on two other machines ahead of completing remaining Rosetta tasks. All going as expected

You are lucky, or maybe are getting re-sends. One of my machines is starting to go dry. That is my concern. They may come back immediately, or not. You don't know. Good luck.


Why is it of concern? I look at it as a good thing that we completed the workload. Now there are little chunks coming in, and they get taken immediately. Better for the science to get done rapidly than for them to have to wait until we finish it. I assume they've reached a key point in the research.

Can't speak for other users, but I know that I'm less concerned about the shortage overall and more concerned by the continued lack of communication with volunteers since I joined in late March 2020.
Sid Celery

Joined: 11 Feb 08
Posts: 1982
Credit: 38,450,760
RAC: 14,492
Message 97549 - Posted: 24 Jun 2020, 1:38:57 UTC - in response to Message 97521.  

And since then, it looks like 70k+ tasks have been issued, increasing in progress tasks, 6 of which I grabbed for 1 PC and none for the others, and they've all been gobbled up.
So something is happening, but insufficient to meet demand yet. Fingers crossed for later today.

Some time between 23:00 UTC and 01:25 UTC, in progress tasks increased by 130k with 30k more showing ready to send
Will have to wait until the front page updates to be sure, but it looks like a healthy batch of work has now become available
Mr P Hucker

Joined: 12 Aug 06
Posts: 1600
Credit: 9,710,557
RAC: 8,359
Message 97564 - Posted: 24 Jun 2020, 13:40:32 UTC - in response to Message 97549.  
Last modified: 24 Jun 2020, 13:41:02 UTC

And since then, it looks like 70k+ tasks have been issued, increasing in progress tasks, 6 of which I grabbed for 1 PC and none for the others, and they've all been gobbled up.
So something is happening, but insufficient to meet demand yet. Fingers crossed for later today.

Some time between 23:00 UTC and 01:25 UTC, in progress tasks increased by 130k with 30k more showing ready to send
Will have to wait until the front page updates to be sure, but it looks like a healthy batch of work has now become available


Looks more like small batches which are gobbled up immediately. Could it mean they're doing the finishing touches to a study?

And the set of 12 tasks I got an hour ago (which would be 12:40pm UTC 24th June) had a download error, permanent HTTP error on one of the data files.
KWSN Ekky Ekky Ekky

Joined: 3 Apr 20
Posts: 9
Credit: 5,062,511
RAC: 0
Message 97565 - Posted: 24 Jun 2020, 15:15:36 UTC

"Deadlines for Rosetta@home tasks are 3 days from the time they are issued. "
Is there any sensible reason for such a short time?
Spending lots of time crunching is pointless if a task is "Completed, too late to validate". Seems more like an insult to people donating their time to then slap them in the face.
©2024 University of Washington
https://www.bakerlab.org