Problems and Technical Issues with Rosetta@home

sgaboinc
Joined: 2 Apr 14
Posts: 282
Credit: 208,966
RAC: 0
Message 79245 - Posted: 18 Dec 2015, 16:26:11 UTC
Last modified: 18 Dec 2015, 16:27:06 UTC

I've set my target CPU run time to 4 hours. I once noticed a job that ran for almost 8 hours, perhaps longer (I didn't track it), but in the end the job finished and generated a single model/decoy! I've also had the other extreme, where a task generated more than 600 models/decoys in those 4 hours. One thing to consider is the time needed to complete moderately large tasks: I've seen 'large models' that generate only 2 decoys/models in a 4-hour run (and occasionally even larger, more complex models that generate only 1 model in the same 4 hours), and the extremes exceed 4 hours and go on running.

My own preference is to wait about double my target run time. If the task hasn't completed by then, I sometimes terminate it if I think I may not be able to finish the run after all, since I crunch on and off, mainly during the night. I think that may sometimes be the better option: hopefully someone else picks up the job and completes it, so the result is returned earlier than if it had waited so long that the deadline passed.

One idea, based on CPU performance: find the time within which at least 1 model completes in, say, 95% of cases (i.e. find the longest typical task); more than double that duration could then be considered a little 'too long' to wait.
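
That rule of thumb can be sketched in a few lines of Python. This is only an illustration: the per-model times are made-up numbers, and `percentile` and `too_long_threshold` are hypothetical helpers, not anything provided by BOINC or Rosetta.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def too_long_threshold(model_hours, pct=95, factor=2.0):
    """Waiting limit: `factor` times the pct-th percentile of per-model times."""
    return factor * percentile(model_hours, pct)


# Hypothetical hours needed to produce one model on a given machine.
sample_hours = [0.1, 0.2, 0.5, 1.0, 2.0, 2.0, 3.5, 4.0, 4.0, 8.0]
limit = too_long_threshold(sample_hours)
print(f"Consider a task 'too long' after about {limit:.1f} hours")
```

The nearest-rank method is used because it always returns a time that was actually observed; with only a handful of samples, interpolating between them would suggest more precision than the data has.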

If I expect to resume crunching soon, a better way may be to suspend the task and let it run again, say, the next day.
ID: 79245
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 79246 - Posted: 18 Dec 2015, 16:26:47 UTC

Normal operations would be that a task should be consuming CPU when the BOINC Manager indicates the task is in a "running" state.

The "CPU time" (seen in the task properties, not the "elapsed" time), should not exceed more than 4 hours beyond your runtime preference (6hrs is the default). At that point, the watchdog should be ending the task for you if it is still running. If the task is not getting CPU, then it is not something the task can control. The BOINC Manager allocates the CPU.
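
As a quick illustration of that rule, here is a small Python sketch. The function name and its arguments are made up for this example; the only facts taken from the post are the 4-hour allowance and the 6-hour default, and the sketch does not reproduce the watchdog itself.

```python
WATCHDOG_ALLOWANCE_HOURS = 4.0    # per the post: runtime preference + 4 hours
DEFAULT_RUNTIME_PREF_HOURS = 6.0  # default runtime preference quoted in the post


def past_watchdog_limit(cpu_hours, runtime_pref_hours=DEFAULT_RUNTIME_PREF_HOURS):
    """True if the task's CPU time (not elapsed time) exceeds the runtime
    preference by more than the 4-hour allowance."""
    return cpu_hours > runtime_pref_hours + WATCHDOG_ALLOWANCE_HOURS


# Hypothetical CPU times for a 4-hour runtime preference.
print(past_watchdog_limit(7.9, runtime_pref_hours=4))   # still within the allowance
print(past_watchdog_limit(11.0, runtime_pref_hours=4))  # beyond the allowance
```

Feed it the "CPU time" from the task properties rather than the elapsed time; a task that is not being given CPU can show a large elapsed time without ever approaching this limit.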
Rosetta Moderator: Mod.Sense
ID: 79246
sgaboinc
Joined: 2 Apr 14
Posts: 282
Credit: 208,966
RAC: 0
Message 79248 - Posted: 18 Dec 2015, 16:33:15 UTC
Last modified: 18 Dec 2015, 16:42:11 UTC

I'm guessing that some models are 'very complex' and thus take a very long time to run (perhaps to find even a single decoy). The dilemma is always whether the task will find a useful answer or whether it has hit a 'bifurcation', e.g. the algorithm has gone into a never-ending loop, unable to find the answer. It would be a pity if, say, it has been running for 11 hours and, for all anyone knows, it would find the answer in the next hour.
ID: 79248
Sid Celery
Joined: 11 Feb 08
Posts: 2140
Credit: 41,518,559
RAC: 10,612
Message 79270 - Posted: 22 Dec 2015, 1:27:58 UTC

A heads up

Total queued jobs: 440,072

Ready to send 56,012
In progress 674,016

Not too bad, but coming up to the holiday season (if it's not already too late) it would be nice to bump up what's coming through to us all, to carry us through into the new year. Those numbers have been edging down throughout the month.

Is that possible?
ID: 79270
sgaboinc
Joined: 2 Apr 14
Posts: 282
Credit: 208,966
RAC: 0
Message 79276 - Posted: 22 Dec 2015, 13:16:05 UTC
Last modified: 22 Dec 2015, 13:18:37 UTC

Maybe the researchers/scientists need to start sending proteins that fold into *merry x'mas & happy new year* :o :D lol
ID: 79276
Sid Celery
Joined: 11 Feb 08
Posts: 2140
Credit: 41,518,559
RAC: 10,612
Message 79286 - Posted: 23 Dec 2015, 1:56:29 UTC - in response to Message 79270.  

Total queued jobs: 283,422

Ready to send 25,596
In progress 452,000

Not looking good...
ID: 79286
Sid Celery
Joined: 11 Feb 08
Posts: 2140
Credit: 41,518,559
RAC: 10,612
Message 79297 - Posted: 24 Dec 2015, 1:58:52 UTC - in response to Message 79286.  

Total queued jobs: 403,009

Ready to send 15,120
In progress 125,240

While I'm aware fewer people will be running their machines over the holidays, I'm considering increasing my runtimes to 24hrs to eke out the tasks I have.

I've never really been a positive thinker in these matters...
ID: 79297
Sid Celery
Joined: 11 Feb 08
Posts: 2140
Credit: 41,518,559
RAC: 10,612
Message 79304 - Posted: 24 Dec 2015, 23:11:04 UTC - in response to Message 79297.  

Total queued jobs: 76,080

Ready to send 53,352
In progress 954,274

Plan going into operation tonight
ID: 79304
Sid Celery
Joined: 11 Feb 08
Posts: 2140
Credit: 41,518,559
RAC: 10,612
Message 79308 - Posted: 25 Dec 2015, 22:55:38 UTC - in response to Message 79304.  

Total queued jobs: 177,014

Ready to send 69,004
In progress 772,920

Not sure if I'm helping or not
ID: 79308
Sid Celery
Joined: 11 Feb 08
Posts: 2140
Credit: 41,518,559
RAC: 10,612
Message 79312 - Posted: 27 Dec 2015, 1:59:03 UTC - in response to Message 79308.  

Total queued jobs: 51,069

Ready to send 72,326
In progress 851,760

Just keeping our heads above water.
I'm fully stocked with 24hr jobs atm. I'll be glad when I can cut them back.
ID: 79312
Steve
Joined: 22 Nov 15
Posts: 8
Credit: 164,345
RAC: 0
Message 79314 - Posted: 27 Dec 2015, 14:07:50 UTC - in response to Message 70158.  

Hi,

I'm finding that although some tasks complete OK, many more go "waiting to run" and seem to stay that way. I've aborted those that are clearly long past their deadline date, but the others just sit there with varying % done and elapsed times. Is this normal? I'd have expected long-past-deadline tasks to be dropped and cleaned up by BOINC (but maybe that takes longer than a week?). Or is there something weird about my PC?

Any advice would be welcome.

Thanks in advance.
Steve
ID: 79314
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 79315 - Posted: 27 Dec 2015, 16:26:39 UTC - in response to Message 79314.  

Hi,

I'm finding that although some tasks complete OK, many more go "waiting to run" ...


Waiting to run is not a flaw in a task. It simply means that the BOINC Manager has decided to run something else first. It sounds like you may have several projects running, and the BOINC Manager is still getting used to the mix and may have downloaded too much work.

As to the deadlines you mentioned, yes, the BOINC Manager attempts to run tasks that are at risk of missing their deadlines first. And once the deadline has passed, you may as well "abort" the task. But once things settle in, this should not happen.
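
To picture the "closest to missing its deadline goes first" idea, here is a toy Python sketch. It is not BOINC's scheduler: the `Task` class, the `triage` function and the sample tasks are all invented for illustration, and the real client weighs more factors than this.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Task:
    name: str
    deadline: datetime
    remaining: timedelta  # estimated CPU time still needed


def triage(tasks, now):
    """Split tasks into (run_order, expired).

    Expired tasks are past their deadline and are candidates to abort.
    The rest are ordered by slack (time to deadline minus remaining work),
    so the task most at risk of missing its deadline runs first.
    """
    expired = [t for t in tasks if t.deadline <= now]
    live = [t for t in tasks if t.deadline > now]
    run_order = sorted(live, key=lambda t: t.deadline - now - t.remaining)
    return run_order, expired


# Hypothetical work queue.
now = datetime(2015, 12, 27, 16, 0)
queue = [
    Task("a", datetime(2015, 12, 30, 16, 0), timedelta(hours=4)),
    Task("b", datetime(2015, 12, 28, 16, 0), timedelta(hours=6)),
    Task("c", datetime(2015, 12, 25, 16, 0), timedelta(hours=2)),
]
run_order, expired = triage(queue, now)
print("run first:", [t.name for t in run_order])
print("may as well abort:", [t.name for t in expired])
```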

Does your machine run BOINC on a fairly regular schedule? How many hours per day?
Rosetta Moderator: Mod.Sense
ID: 79315
Sid Celery
Joined: 11 Feb 08
Posts: 2140
Credit: 41,518,559
RAC: 10,612
Message 79316 - Posted: 27 Dec 2015, 17:43:18 UTC - in response to Message 79314.  

Waiting to run only applies when other projects are prioritised ahead of Rosetta, but I notice your only other project is Malaria, which has been out of tasks for some while, so I'm wondering if you have "suspend when computer is in use" checked in Options > Computing Preferences. This should be unchecked.
ID: 79316
Sid Celery
Joined: 11 Feb 08
Posts: 2140
Credit: 41,518,559
RAC: 10,612
Message 79332 - Posted: 1 Jan 2016, 1:43:10 UTC - in response to Message 79312.  

Total queued jobs: 1,270,075

Ready to send 110,160
In progress 1,236,288

Looks like I can shut up now
ID: 79332
Steve
Joined: 22 Nov 15
Posts: 8
Credit: 164,345
RAC: 0
Message 79347 - Posted: 2 Jan 2016, 12:28:47 UTC - in response to Message 79315.  

Understood. I should have said "seem to stay stuck on Waiting to Run with only a small percentage of work completed".

I have only Rosetta running (I did run MalariaControl for a while but found it swamped BOINC such that no other tasks would start, so I set it to run no new tasks and now only Rosetta is getting work to do)

This PC runs very little other work - it's my retired desktop machine, now acting as a baby fileserver and occasional test machine in my home office, hence I decided in November to run some BOINC work on it. It is set to run BOINC tasks 24 hours a day and I left it running over Xmas and New Year and today found about a dozen unfinished Waiting to Run tasks that were past their deadlines which I've now aborted.

What I'm puzzled about is that BOINC is starting new tasks when older ones still are Waiting to Run, but I'm going to try some compute preference changes as suggested in another reply and see if that works better.

Thanks for the response
Steve
ID: 79347
Steve
Joined: 22 Nov 15
Posts: 8
Credit: 164,345
RAC: 0
Message 79348 - Posted: 2 Jan 2016, 12:37:05 UTC - in response to Message 79316.  

Thanks for the suggestion. I've not got "suspend when computer is in use" checked, but I did have "suspend GPU ... when in use" checked, so I've cleared that and also allowed tasks to stay in memory when suspended. I'll see if that helps.

I've also removed the dormant Malaria Control project (which I deactivated because it hogged the system) so BOINC only has one project to work on.

Will see how that goes.

Thanks for your response

Steve
ID: 79348
Snags
Joined: 22 Feb 07
Posts: 198
Credit: 2,888,320
RAC: 0
Message 79349 - Posted: 2 Jan 2016, 14:41:07 UTC - in response to Message 79347.  

A few things to consider:

Where did you set your preferences? Changes made in the BOINC Manager will override any web-based settings.

Double check the wording. In my version of BOINC Manager a box must be checked to keep tasks running while the computer is in use, whereas you must select the “no” radio button to achieve the same thing using web-based prefs.

What I'm puzzled about is that BOINC is starting new tasks when older ones still are Waiting to Run...

This can happen if there isn’t enough memory to continue running a particular task. BOINC will set that one aside and try another. Rosetta tasks are among the most memory hungry tasks you will encounter in the BOINC world. So how much memory per core do you have and, more importantly, how much is BOINC allowed to use?

Could computer (not BOINC) sleep/hibernation settings be coming into play?

Best,
Snags
ID: 79349
Steve
Joined: 22 Nov 15
Posts: 8
Credit: 164,345
RAC: 0
Message 79361 - Posted: 5 Jan 2016, 11:06:59 UTC - in response to Message 79349.  

Thanks Snags - useful input. I have used local settings and the option window confirms that it's using those (it has a button to use prefs from the web but I haven't clicked that)

PC is a quad core with 12GB RAM, but it's running several large java-based services so memory typically runs around 80-90% used but with very little swapping. However as I'm not using the largest of those services most days I've now stopped that (releasing around 4GB) and will only run it when I need to access it. Rosetta tasks are usually under 200MB each in task manager so that should now mean there's plenty of memory available.
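
For what it's worth, the back-of-envelope memory sums look like this in Python. The numbers are the ones quoted above (12GB RAM, 80-90% used, about 4GB released, under 200MB per task); the `free_mb` helper is made up for this example, and the sketch ignores both task memory peaks and BOINC's own memory limit setting.

```python
TOTAL_MB = 12 * 1024    # 12GB of RAM
PER_TASK_MB = 200       # "usually under 200MB each"
RELEASED_MB = 4 * 1024  # roughly 4GB freed by stopping the largest service


def free_mb(total_mb, used_pct, released_mb=0):
    """Free memory when used_pct percent is in use, plus anything released."""
    return total_mb * (100 - used_pct) // 100 + released_mb


for used_pct in (80, 90):
    before = free_mb(TOTAL_MB, used_pct)
    after = free_mb(TOTAL_MB, used_pct, RELEASED_MB)
    print(f"{used_pct}% used: {before} MB free ({before // PER_TASK_MB} tasks' worth) "
          f"before, {after} MB ({after // PER_TASK_MB} tasks' worth) after")
```

On these figures four 200MB tasks need about 800MB, so the margin at 90% used is thin for anything that peaks above its usual size, and stopping the large service widens it considerably.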

Making previously suggested changes seems to have improved things somewhat (only one overdue task waiting this morning) so I'll see if the latest change does any better.

Best,
Steve
ID: 79361
Sid Celery
Joined: 11 Feb 08
Posts: 2140
Credit: 41,518,559
RAC: 10,612
Message 79362 - Posted: 5 Jan 2016, 14:45:02 UTC - in response to Message 79361.  

I saw you had 12GB RAM so didn't expect RAM to be an issue, but now that I read this, it is likely to have been a factor. My 8 concurrent tasks typically contribute 1.5GB of the 6.5GB RAM in use, but I have 16GB RAM in total to utilise.
ID: 79362
BelgianEnthousiast
Joined: 25 May 15
Posts: 5
Credit: 1,023,045
RAC: 0
Message 79406 - Posted: 13 Jan 2016, 12:26:47 UTC

Hi All,

I've been running Rosetta for a while and am now encountering serious issues with near-endless or endless loops.
Normal running time is 6 hours per task, and half of the WUs seem to adhere to that. However, the other half are showing some weird behaviour:
1. Running forever without any estimated time left, going on for 20+ hours.
Examples: nkid_1_3_2016_final3_0716_00058_0043.pdb343_TG_dez_fold_SAVE_ALL_OUT_322141_663_0
nkid_1_3_2016_final3_0692_00366_0042.pdb342_TG_dez_fold_SAVE_ALL_OUT_322134_678_0

2. Running forever, but with an estimated time left that keeps creeping up.
I don't have examples here; I aborted them after 25+ hours of running.

This happens on a laptop. On my desktop it seems to work well, although I have other issues there with the scheduling of Rosetta.

Could you please investigate?

Many thanks in advance !

Kind Regards,

B.E.
ID: 79406



©2024 University of Washington
https://www.bakerlab.org