Problems and Technical Issues with Rosetta@home

Message boards : Number crunching : Problems and Technical Issues with Rosetta@home

EHM-1
Joined: 21 Mar 20
Posts: 23
Credit: 183,782
RAC: 0
Message 97867 - Posted: 2 Jul 2020, 0:11:22 UTC - in response to Message 97810.  

What's easiest is to set Rosetta at say 50%, WCG at 25% and some other project at 25% and let Boinc figure it out, which it will do over time. Just be sure to keep your cache sizes small so you don't run into deadline problems. With Rosetta's 3 day deadline, if you have 3 days of work NO other projects will crunch because their deadlines will be further out than 3 days.

Where are these resource share settings hidden?
Eric

system: up-to-date Windows 10, Intel quad-core 3.6 GHz processor, 8 GB RAM
ID: 97867 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2140
Credit: 41,518,559
RAC: 10,612
Message 97868 - Posted: 2 Jul 2020, 0:11:56 UTC - in response to Message 97864.  

Apart from laptops, I've never known a CPU overheat, even on stock fans.

Ha ha ha LOL.

I didn't like to say...
When I first began overclocking, but before I began to take cooling seriously, I took my PC to a repair shop only to be shown the sockets I had literally melted.
And over a few processors in this machine, I've blown two other motherboards.
But apart from that, never...
ID: 97868 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1725
Credit: 18,380,064
RAC: 20,136
Message 97871 - Posted: 2 Jul 2020, 7:15:45 UTC - in response to Message 97861.  

I was trying to get the six Rosetta tasks completed by their 2 July deadline, but that didn't work out. Today I had 13 download failures and one upload failure and they're all gone. Rosetta is trying to download 6 more, but they are taking very long and they may fail as well.

OOopps. I paused my other projects and all six just now downloaded successfully. They are due on 4 July. I have to suspend most of them to let Asteroids and WCG catch up. But I will let them resume in a short while.

Is it possible that these things happen because this "low-spec" computer was trying to run two Rosetta tasks while two Asteroids and one WCG tasks were running or waiting to run?

No, the problem (apart from the download issues) is that you keep fiddling with things.
As long as you set things, then let BOINC do its job, they will settle down. But if you keep suspending, unsuspending, and changing when BOINC can & can't run, then there is no chance of things ever settling down.
Grant
Darwin NT
ID: 97871 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1725
Credit: 18,380,064
RAC: 20,136
Message 97872 - Posted: 2 Jul 2020, 7:17:25 UTC - in response to Message 97867.  
Last modified: 2 Jul 2020, 7:19:43 UTC

What's easiest is to set Rosetta at say 50%, WCG at 25% and some other project at 25% and let Boinc figure it out, which it will do over time. Just be sure to keep your cache sizes small so you don't run into deadline problems. With Rosetta's 3 day deadline, if you have 3 days of work NO other projects will crunch because their deadlines will be further out than 3 days.
Where are these resource share settings hidden?
In your account, Rosetta@home preferences, Resource share.
The number there isn't a percentage. Together with the values from your other projects, it makes up the ratio for the work to be done.

And changing that will delay even further your Resource share settings being met.
Grant
Darwin NT
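
To illustrate the point about the number being a ratio rather than a percentage, here is a minimal Python sketch. The function name `share_fractions` and the project labels are made up for illustration; it only shows the arithmetic of turning resource share values into fractions and is not how the BOINC client itself is written.

```python
def share_fractions(resource_shares):
    """Convert resource share values into each project's fraction of the total."""
    total = sum(resource_shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive resource share")
    return {name: share / total for name, share in resource_shares.items()}


# Two projects left at the same value split the work evenly.
print(share_fractions({"Rosetta@home": 100, "WCG": 100}))

# Values of 200 / 100 / 100 give the 50% / 25% / 25% split quoted above.
print(share_fractions({"Rosetta@home": 200, "WCG": 100, "Other": 100}))
```

Only the ratio matters: 200/100/100 and 2/1/1 describe the same split, and the split is a long-term target rather than something the client enforces hour by hour.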
ID: 97872 · Rating: 0
EHM-1
Joined: 21 Mar 20
Posts: 23
Credit: 183,782
RAC: 0
Message 97877 - Posted: 2 Jul 2020, 13:10:14 UTC - in response to Message 97872.  

What's easiest is to set Rosetta at say 50%, WCG at 25% and some other project at 25% and let Boinc figure it out, which it will do over time. Just be sure to keep your cache sizes small so you don't run into deadline problems. With Rosetta's 3 day deadline, if you have 3 days of work NO other projects will crunch because their deadlines will be further out than 3 days.
Where are these resource share settings hidden?
In your account, Rosetta@home preferences, Resource share.
The number there isn't a percentage. Together with the values from your other projects, it makes up the ratio for the work to be done.
And changing that will delay even further your Resource share settings being met.

Thanks, Grant. I've always just used the local preferences in BOINC. I now see what I presume is the default setting of 100 in my Rosetta prefs. But no corresponding setting in WCG (my one other project). No matter as I'm planning to continue sitting back per your suggestion and watch how BOINC balances the two projects. It piques my curiosity when, as it has a couple times over the past couple days, BOINC pauses a nearly complete Rosetta unit due in less than 2 days to download and start processing a shorter-duration OpenPandemic unit that is due in five days. All shall be revealed in the fullness of time.
Eric

system: up-to-date Windows 10, Intel quad-core 3.6 GHz processor, 8 GB RAM
ID: 97877 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1233
Credit: 14,338,560
RAC: 2,014
Message 97878 - Posted: 2 Jul 2020, 13:46:39 UTC - in response to Message 97877.  
Last modified: 2 Jul 2020, 13:49:55 UTC

[snip]

Thanks, Grant. I've always just used the local preferences in BOINC. I now see what I presume is the default setting of 100 in my Rosetta prefs. But no corresponding setting in WCG (my one other project). No matter as I'm planning to continue sitting back per your suggestion and watch how BOINC balances the two projects. It piques my curiosity when, as it has a couple times over the past couple days, BOINC pauses a nearly complete Rosetta unit due in less than 2 days to download and start processing a shorter-duration OpenPandemic unit that is due in five days. All shall be revealed in the fullness of time.
Eric

WCG places that setting under Device Profiles. Once you select which profile to edit, you may need to select Custom Profile, then scroll down to the bottom of the page to see where you can set it, labelled Project weight.
ID: 97878 · Rating: 0
EHM-1
Joined: 21 Mar 20
Posts: 23
Credit: 183,782
RAC: 0
Message 97879 - Posted: 2 Jul 2020, 14:05:49 UTC - in response to Message 97878.  

Thanks, Grant. I've always just used the local preferences in BOINC. I now see what I presume is the default setting of 100 in my Rosetta prefs. But no corresponding setting in WCG (my one other project). No matter as I'm planning to continue sitting back per your suggestion and watch how BOINC balances the two projects. It piques my curiosity when, as it has a couple times over the past couple days, BOINC pauses a nearly complete Rosetta unit due in less than 2 days to download and start processing a shorter-duration OpenPandemic unit that is due in five days. All shall be revealed in the fullness of time.
Eric
WCG places that setting under Device Profiles. Once you select which profile to edit, you may need to select Custom Profile, then scroll down to the bottom of the page to see where you can set it, labelled Project weight.

Thanks, Robert. I feel a bit silly having missed Device Manager. Now I see under the custom option that it's set to 60% max. Tempted to adjust that now, but restraining myself for the time being.
Eric

system: up-to-date Windows 10, Intel quad-core 3.6 GHz processor, 8 GB RAM
ID: 97879 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,214,047
RAC: 1,450
Message 97882 - Posted: 2 Jul 2020, 14:29:22 UTC - in response to Message 97879.  

Thanks, Grant. I've always just used the local preferences in BOINC. I now see what I presume is the default setting of 100 in my Rosetta prefs. But no corresponding setting in WCG (my one other project). No matter as I'm planning to continue sitting back per your suggestion and watch how BOINC balances the two projects. It piques my curiosity when, as it has a couple times over the past couple days, BOINC pauses a nearly complete Rosetta unit due in less than 2 days to download and start processing a shorter-duration OpenPandemic unit that is due in five days. All shall be revealed in the fullness of time.
Eric
WCG places that setting under Device Profiles. Once you select which profile to edit, you may need to select Custom Profile, then scroll down to the bottom of the page to see where you can set it, labelled Project weight.


Thanks, Robert. I feel a bit silly having missed Device Manager. Now I see under the custom option that it's set to 60% max. Tempted to adjust that now, but restraining myself for the time being.
Eric


You can see what the setting is set to right now in the Boinc Manager under the Projects tab; it's one of the columns.
ID: 97882 · Rating: 0
Keith Myers
Joined: 29 Mar 20
Posts: 97
Credit: 332,619
RAC: 25
Message 97889 - Posted: 2 Jul 2020, 21:11:07 UTC

Another knob to twist is a setting in the cc_config.xml file. It shortens the averaging period of your projects' credit accumulation in BOINC's attempt to balance credit weighting.

<rec_half_life_days>1.000000</rec_half_life_days>

is the value I use. REC stands for Recent Estimated Credit. The client default is ten days and that means it takes a solid two weeks of constant production before the credit scales balance between all your projects.
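
For anyone unsure where that line goes, here is a sketch of a cc_config.xml that sets nothing else. The option sits inside the `<options>` element; a file that already has other options would just gain the one extra line.

```xml
<cc_config>
  <options>
    <rec_half_life_days>1.000000</rec_half_life_days>
  </options>
</cc_config>
```

Restarting the BOINC client after saving the file is the safe way to make sure the new value is picked up.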

And what that means is that the resource share ratio between all your projects takes at least two weeks to stabilize. You still have to deal with the differences in project work deadlines if there is a great disparity from one to the others.

By shortening the balance averaging period to a single day, projects play nicely with each other . . . . mostly.
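
To give a feel for why the half-life matters, here is a rough Python sketch. It assumes the averaging behaves like plain exponential decay with the given half-life, which is a simplification for illustration and not the client's actual code; `remaining_weight` is a made-up helper name.

```python
def remaining_weight(days_elapsed, half_life_days):
    """Fraction of an old credit estimate still counted after days_elapsed."""
    return 0.5 ** (days_elapsed / half_life_days)


# Compare the default 10-day half-life with the 1-day value used above.
for half_life in (10.0, 1.0):
    weight = remaining_weight(14, half_life)
    print(f"half-life {half_life:>4} days: {weight:.4%} of 14-day-old history remains")
```

Under that assumption, with a 10-day half-life more than a third of what happened two weeks ago still counts toward the balance, while with a 1-day half-life it has effectively vanished, so the client reacts to recent work much faster.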
ID: 97889 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1233
Credit: 14,338,560
RAC: 2,014
Message 97895 - Posted: 3 Jul 2020, 20:36:56 UTC
Last modified: 3 Jul 2020, 20:40:58 UTC

Recently, I've been stepping down my target time from 22 hours to 8 hours. Now that it's at 8 hours, I'm getting more tasks that have COVID in their names. This COULD mean that the project scientists want their COVID-19 work returned fast, even if less time is spent on it. Could those who have been experimenting with target times below 8 hours mention how those times affect how often you get COVID-19 tasks?

Also, I've noticed that the initial value of time remaining now seems to be about 8 hours, regardless of what target time you have set.
ID: 97895 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1725
Credit: 18,380,064
RAC: 20,136
Message 97897 - Posted: 3 Jul 2020, 21:38:59 UTC - in response to Message 97895.  

Recently, I've been stepping down my target time from 22 hours to 8 hours. Now that it's at 8 hours, I'm getting more tasks that have COVID in their names. This COULD mean that the project scientists want their COVID-19 work returned fast, even if less time is spent on it.
Or just the fact that you are doing more Tasks, so there is a greater chance of getting a particular type of Task, if it is available at the time, as you are requesting work more often.


Also, I've noticed that the initial value of time remaining now seems to be about 8 hours, regardless of what target time you have set.
If that is the case, then there is a problem with the project's revised estimated completion time mechanism.
It should be based on the user's Target CPU time setting, not the project's default Target CPU time setting.
Grant
Darwin NT
ID: 97897 · Rating: 0
Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 2,018
Message 97898 - Posted: 3 Jul 2020, 21:40:53 UTC

I've noticed the same: Tasks are arriving with an 8 hour estimated completion time.
Setting is at 12 hours.
ID: 97898 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1233
Credit: 14,338,560
RAC: 2,014
Message 97913 - Posted: 5 Jul 2020, 3:06:51 UTC - in response to Message 97897.  

Recently, I've been stepping down my target time from 22 hours to 8 hours. Now that it's at 8 hours, I'm getting more tasks that have COVID in their names. This COULD mean that the project scientists want their COVID-19 work returned fast, even if less time is spent on it.
Or just the fact that you are doing more Tasks, so there is a greater chance of getting a particular type of Task, if it is available at the time, as you are requesting work more often.

I'm seeing a higher percentage of the tasks on my computer at any one time having COVID in their names.

That will indirectly mean more total tasks with COVID in their names, but that's not the point.
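
The difference between "more COVID tasks" and "a higher percentage of COVID tasks" can be sketched with some simple arithmetic. The core count and the 30% share below are invented numbers, and the sketch assumes the server hands out COVID tasks with the same probability whatever the target time is; that assumption is exactly what is in question here.

```python
def tasks_per_day(cores, target_hours):
    """Tasks finished per day if every core runs tasks back to back."""
    return cores * 24 / target_hours


COVID_FRACTION = 0.30  # hypothetical share of COVID tasks in the server queue
CORES = 4              # hypothetical core count

for target_hours in (22, 8):
    total = tasks_per_day(CORES, target_hours)
    covid = total * COVID_FRACTION
    print(f"{target_hours:>2} h target: {total:.1f} tasks/day, "
          f"{covid:.1f} COVID, {covid / total:.0%} of the total")
```

If the server treats all target times alike, a shorter target raises the number of COVID tasks per day but leaves their percentage unchanged, so a rising percentage would need some other explanation.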
ID: 97913 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2140
Credit: 41,518,559
RAC: 10,612
Message 97915 - Posted: 5 Jul 2020, 3:17:45 UTC - in response to Message 97872.  

What's easiest is to set Rosetta at say 50%, WCG at 25% and some other project at 25% and let Boinc figure it out, which it will do over time. Just be sure to keep your cache sizes small so you don't run into deadline problems. With Rosetta's 3 day deadline, if you have 3 days of work NO other projects will crunch because their deadlines will be further out than 3 days.
Where are these resource share settings hidden?
In your account, Rosetta@home preferences, Resource share.
The number there isn't a percentage. Together with the values from your other projects, it makes up the ratio for the work to be done.

Yes, it isn't a %age.
I've seen someone's now pointed out where the setting is at WCG, but I could never find it before, so I just increased Rosetta to 2900. Amounts to the same thing
ID: 97915 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2140
Credit: 41,518,559
RAC: 10,612
Message 97916 - Posted: 5 Jul 2020, 3:21:19 UTC - in response to Message 97898.  

I've noticed the same: Tasks are arriving with an 8 hour estimated completion time.
Setting is at 12 hours.

Definitely, yes.
During the last outage I increased my runtimes to 12hrs to eke my last few out, and they ran for 12hrs, but when new tasks came through, the unstarted ones still showed 8hrs.

I've reduced my runtimes back to 8hrs. Boinc has enough trouble with scheduling without me or Rosetta making it worse.
ID: 97916 · Rating: 0
Stevie G
Joined: 15 Dec 18
Posts: 107
Credit: 865,910
RAC: 814
Message 97930 - Posted: 6 Jul 2020, 2:46:53 UTC - in response to Message 97866.  
Last modified: 6 Jul 2020, 2:47:37 UTC

8Gb RAM ought to be plenty for a 2-core processor.
Have you looked at the previous advice in this thread and compared to your own settings (even though the advice was for a different machine)? There should be plenty for you to consider.
Boinc <ought> to be able to give your other projects enough time to complete before their deadlines without you having to suspend them. The longer you can run without interfering, the better Boinc will be able to decide for you.

For some reason, the computer shut down and was unresponsive for 48 hours. No action from the power button, hard drive, etc. Nada, nichts, zip.

Power cable was OK, I don't think there's an inline fuse, so I dunno. There's a reset button on the power supply, but I didn't mess with that and the button is not popped out. Overheat? Usually, that just results in a restart. To be safe, I just vacuumed out all the accumulated cat hair and dust. We had some thunderstorms here last night, so maybe there was a power interruption. But nothing else in the house was affected and this machine is on a UPS backup, which did not register any action. A deep mystery

But when I just now turned it on et Voila!! It awoke from its coma. Which is how I'm writing to you at this moment.{:>) No explanation for that, but I'll take it.

However, I've been out of business for more than two days, with deadlines rapidly approaching.

So I will take your suggestion under advisement and scrutinize my settings and preferences.

Thanks again.

SGaber
ID: 97930 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1233
Credit: 14,338,560
RAC: 2,014
Message 97931 - Posted: 6 Jul 2020, 2:57:11 UTC - in response to Message 97930.  
Last modified: 6 Jul 2020, 3:03:01 UTC

[snip]

For some reason, the computer shut down and was unresponsive for 48 hours. No action from the power button, hard drive, etc. Nada, nichts, zip.

The shutdown is typical after a momentary loss of power.

The UPS may have let its battery or batteries run too low, for example if its rating was too low for your computer. If so, it would eventually recharge them after long enough with the computer drawing no power.

You may have needed to unplug it to keep it from being confused about whether it was still running.
ID: 97931 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2140
Credit: 41,518,559
RAC: 10,612
Message 97955 - Posted: 7 Jul 2020, 22:58:56 UTC - in response to Message 97930.  

8Gb RAM ought to be plenty for a 2-core processor.
Have you looked at the previous advice in this thread and compared to your own settings (even though the advice was for a different machine)? There should be plenty for you to consider.
Boinc <ought> to be able to give your other projects enough time to complete before their deadlines without you having to suspend them. The longer you can run without interfering, the better Boinc will be able to decide for you.

For some reason, the computer shut down and was unresponsive for 48 hours. No action from the power button, hard drive, etc. Nada, nichts, zip.

Power cable was OK, I don't think there's an inline fuse, so I dunno. There's a reset button on the power supply, but I didn't mess with that and the button is not popped out. Overheat? Usually, that just results in a restart. To be safe, I just vacuumed out all the accumulated cat hair and dust. We had some thunderstorms here last night, so maybe there was a power interruption. But nothing else in the house was affected and this machine is on a UPS backup, which did not register any action. A deep mystery

But when I just now turned it on et Voila!! It awoke from its coma. Which is how I'm writing to you at this moment.{:>) No explanation for that, but I'll take it.

However, I've been out of business for more than two days, with deadlines rapidly approaching.

So I will take your suggestion under advisement and scrutinize my settings and preferences.

Thanks again.
SGaber

That's not a great sign. It's quite an old PC and must have done a lot of work in its time.
The best thing you've done is vacuum it out, because it sounds heat-related to me and you'll have helped it run cooler by getting rid of the junk, which will extend its remaining life.
I'm actually in much the same situation myself and considering what my next PC should be within my budget.
ID: 97955 · Rating: 0
justsomeguy
Joined: 24 May 17
Posts: 1
Credit: 375,643
RAC: 0
Message 97961 - Posted: 8 Jul 2020, 15:43:42 UTC

Recently, I started seeing a lot of jobs completing with a status of "aborted by project". They were completed prior to the deadline, but it doesn't appear that I get any credit for them either.
Any ideas/thoughts on this?
ID: 97961 · Rating: 0
Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 2,018
Message 97964 - Posted: 8 Jul 2020, 19:56:05 UTC
Last modified: 8 Jul 2020, 20:04:34 UTC

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1092259599


Both tasks errored out after just a few seconds. Slightly different error codes but the same "upload failure":

</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>TFSCAFFOLD0001_6_SAVE_ALL_OUT_IGNORE_THE_REST_0ub6wd0j_953357_1_1_r1180454695_0</file_name>
<error_code>-240(stat() failed)</error_code>
</file_xfer_error>
</message>
]]>


</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>TFSCAFFOLD0001_6_SAVE_ALL_OUT_IGNORE_THE_REST_0ub6wd0j_953357_1_0_r1298488601_0</file_name>
<error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
]]>


EDIT: Got another FSCAFFOLD0001 WU, also errored after just a few seconds. Bad batch?
https://boinc.bakerlab.org/rosetta/result.php?resultid=1217236062
</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>TFSCAFFOLD0001_2_SAVE_ALL_OUT_IGNORE_THE_REST_1xl5lk3f_953353_2_0_r1523244009_0</file_name>
<error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
</message>
]]>
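
For anyone wanting to check a batch of results for the same failure, here is a small Python sketch that pulls the file name and error code out of stderr snippets like the ones above and counts how often each error appears. The pattern and the helper name `summarize_upload_failures` are written for this example only, based on the format of the messages quoted in this post; other tasks may log errors in a different shape.

```python
import re
from collections import Counter

# Matches a <file_name> followed by its <error_code>, with or without a
# space between the numeric code and the bracketed reason.
XFER_ERROR = re.compile(
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)\s*\((?P<reason>.*?)\)</error_code>",
    re.DOTALL,
)


def summarize_upload_failures(stderr_text):
    """Return a Counter of (error code, reason) pairs found in the text."""
    return Counter(
        (int(m["code"]), m["reason"]) for m in XFER_ERROR.finditer(stderr_text)
    )


sample = """
<file_xfer_error>
<file_name>TFSCAFFOLD0001_6_SAVE_ALL_OUT_IGNORE_THE_REST_0ub6wd0j_953357_1_1_r1180454695_0</file_name>
<error_code>-240(stat() failed)</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>TFSCAFFOLD0001_6_SAVE_ALL_OUT_IGNORE_THE_REST_0ub6wd0j_953357_1_0_r1298488601_0</file_name>
<error_code>-161 (not found)</error_code>
</file_xfer_error>
"""
print(summarize_upload_failures(sample))
```

Grouping by error code makes it easier to see whether a whole batch fails the same way or whether the failures differ from host to host.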
ID: 97964 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org