Posts by Michael E.

1) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 100702)
Posted 7 Mar 2021 by Michael E.
Post:
The problem I reported earlier to Grant with tasks not running has been solved. It may apply to the memory issues reported recently.

That is, tasks you expected to run were not running. I did not see anything in the event log that provided a clue (but I may need to enable certain messages).

My PC has a small SSD, so I was careful about how much disk space BOINC may use. The same applies to memory use, although I check Windows Task Manager to see how much memory each process is using. On my son's 8 GB RAM PC, I cannot run two Rosetta tasks at the same time.

If you see a task that should be running but is not, in Preferences (Options > Computing Preferences in the Advanced View), tap/click the Disk and Memory tab and check the settings there.

Mike
2) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 100678)
Posted 1 Mar 2021 by Michael E.
Post:
Without seeing what, for example, Milky Way is doing on that machine at the same time it’s impossible to say. You need to look at the full picture, not just one project.

As an example, if one of the other projects has had an off day and fallen behind on its resource share, then it will suspend processing on Rosetta, leaving all WUs as Ready to Start, until the other project has caught up.


Sorry for the incomplete info! No other CPU tasks are running from other projects. The other 3 projects on this PC do not allow new tasks (No New Tasks selected). Thanks for the questions!

I would suggest setting your cache to 0, as you are signed up to a dozen projects, almost half of them active.
The smaller the cache, the sooner the system can meet your resource share settings. With that many projects I'd suggest you'd be looking at weeks. With even a small cache, it will take months.
Preferences,
When and how BOINC uses your computer Computing preferences, Computing, Other
Store at least 0.00 days of work
Store up to an additional 0.01 days of work
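
For anyone who prefers a file to the Manager dialog, here is a minimal Python sketch that builds the two cache values above as an XML fragment. It assumes the tag names work_buf_min_days and work_buf_additional_days as used in BOINC's global_prefs_override.xml; please check them against the BOINC documentation before relying on this. The function name build_override is made up for illustration, and the sketch only prints the XML rather than writing into the BOINC data directory.

```python
import xml.etree.ElementTree as ET


def build_override(min_days, extra_days):
    """Return a preferences-override fragment holding the two cache settings.

    Tag names are an assumption based on BOINC's global_prefs_override.xml.
    """
    root = ET.Element("global_preferences")
    # "Store at least N days of work"
    ET.SubElement(root, "work_buf_min_days").text = f"{min_days:.2f}"
    # "Store up to an additional N days of work"
    ET.SubElement(root, "work_buf_additional_days").text = f"{extra_days:.2f}"
    return ET.tostring(root, encoding="unicode")


# Values suggested in the post above: 0.00 days plus 0.01 additional days.
xml_text = build_override(0.0, 0.01)
print(xml_text)
```

Setting the same two values in Options > Computing Preferences is the simpler route; the file form is only handy when configuring several machines.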

I would also run the benchmarks on that system- it is showing the default values, and as they are used when it comes to allocating work (as well as allocating Credit for work done) it is probably impacting on what work is done & when.
On the BOINC Manager: Tools, Run CPU benchmarks.


No other active CPU tasks. Four total projects on this PC.

CPU benchmarks result:
3/1/2021 11:04:04 AM | | Running CPU benchmarks
3/1/2021 11:04:05 AM | | Suspending computation - CPU benchmarks in progress
3/1/2021 11:04:36 AM | | Benchmark results:
3/1/2021 11:04:36 AM | | Number of CPUs: 3
3/1/2021 11:04:36 AM | | 4742 floating point MIPS (Whetstone) per CPU
3/1/2021 11:04:36 AM | | 13780 integer MIPS (Dhrystone) per CPU
3/1/2021 11:04:37 AM | | Resuming computation
3/1/2021 11:12:48 AM | | General prefs: from http://einstein.phys.uwm.edu/ (last modified ---)
3/1/2021 11:12:48 AM | | Computer location: home
3/1/2021 11:12:48 AM | | General prefs: using separate prefs for home
3/1/2021 11:12:48 AM | | Reading preferences override file
3/1/2021 11:12:48 AM | | Preferences:
3/1/2021 11:12:48 AM | | max memory usage when active: 2428.71 MB
3/1/2021 11:12:48 AM | | max memory usage when idle: 8095.70 MB
3/1/2021 11:12:48 AM | | max disk usage: 8.00 GB
3/1/2021 11:12:48 AM | | max CPUs used: 3
3/1/2021 11:12:48 AM | | suspend work if non-BOINC CPU load exceeds 35%
3/1/2021 11:12:48 AM | | (to change preferences, visit a project web site or select Preferences in the Manager)
. . .
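
The memory lines in that log can be checked with a little arithmetic. Below is a rough Python sketch that parses a few of the event-log lines and estimates how many tasks fit under the "when active" memory limit. The per-task figure (ASSUMED_TASK_MB) is a hypothetical number chosen only for illustration, not a measured Rosetta value, and the real BOINC client weighs more factors than this simple division.

```python
import re

# Event-log lines copied from the log above.
LOG = """\
3/1/2021 11:12:48 AM | | max memory usage when active: 2428.71 MB
3/1/2021 11:12:48 AM | | max memory usage when idle: 8095.70 MB
3/1/2021 11:12:48 AM | | max CPUs used: 3
"""

# Layout: timestamp | project | "key: value"
LINE_RE = re.compile(
    r"^(?P<stamp>[^|]+)\|(?P<project>[^|]*)\|\s*(?P<key>[^:]+):\s*(?P<value>.+)$"
)


def parse_prefs(log_text):
    """Collect 'key: value' messages from event-log lines into a dict."""
    prefs = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            prefs[match.group("key").strip()] = match.group("value").strip()
    return prefs


def tasks_that_fit(limit_mb, per_task_mb, max_cpus):
    """Tasks that fit under the memory limit, capped by the CPU limit."""
    return min(max_cpus, int(limit_mb // per_task_mb))


prefs = parse_prefs(LOG)
active_mb = float(prefs["max memory usage when active"].split()[0])
cpus = int(prefs["max CPUs used"])

ASSUMED_TASK_MB = 1000.0  # hypothetical per-task memory need, for illustration
runnable = tasks_that_fit(active_mb, ASSUMED_TASK_MB, cpus)
print(f"{runnable} of {cpus} tasks fit within {active_mb} MB")
```

With that assumed task size, the memory limit rather than the CPU limit is what caps the number of running tasks, which is one way tasks can sit at Ready to Start.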

Good suggestion to run the benchmarks. Yes, I use the Advanced View and I use Local Pref's. I removed Einstein a few days ago so not sure why it appeared in the benchmarks.

I changed the cache for now but do not see why that matters. Cache was previously set for 1 day.

I exited and restarted BOINC. I just enabled Rosetta to download new tasks and it downloaded 2 tasks. I will let them finish - all 3 are running.

I had an issue with GPUGrid a few weeks ago and had to remove BOINC (and its ProgramData directory) completely. Not sure that is related.

Anyhow, not sure why it is fixed (maybe reducing cache?) but it is working OK now.

Thanks!
3) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 100671)
Posted 1 Mar 2021 by Michael E.
Post:
I downloaded this work unit: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=1202147167

It never begins processing. It stays in the "Ready to Start" state. Tasks from other projects process just fine. I have used BOINC for two+ decades but never saw this happen before.

The work unit is Rosetta version 4.20, BOINC is at Version 7.16.11, and it is a Windows 10 system with a GPU. The Options > Computing Preferences are set at 50% of CPUs (6). There are no work units in the Transfers tab.

Should I abort it and get some new Rosetta work units? Or abort it and reset the Rosetta project?

Anyone ever seen this?
4) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 95528)
Posted 29 Apr 2020 by Michael E.
Post:
Many thanks to Mod.Sense, robertmiles, CIA and (previously) Grant for the clear explanations.

So it seems R@h users should stick to the default task size, which is 8 hours. For older systems or those not used 24x7, choose shorter length tasks.

OK about the system learning the Remaining time after about 12 tasks (good!). I do think limiting the number of work units for new hosts that have not run a dozen tasks would help.

Also, I ran into a strange situation where a 24-hour work unit reached zero Remaining time but kept processing for 10 extra hours. Work unit was 1043928617 and task 1043928617.
I had to restart my PC and that task reset to 10 hours remaining so I aborted it as it was out of time.

If you want a beta tester for the large R@h tasks, let me know.
5) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 95519)
Posted 29 Apr 2020 by Michael E.
Post:
My original question: Does it matter to the Rosetta@home research if we use 8-10 hour tasks rather than 24+ hour tasks?


Honestly if you are running 24 hour tasks why even have a cache? Are you running another project besides Rosetta? As long as you have a good internet connection the downtime from finishing a task and getting a fresh one is next to nothing and you are only hitting the server once per day for a new task (vs 3 times a day for 8 hour tasks). All my machines that are set to 24 hour tasks run 0 cache without issue.


So you are telling me that the same type of tasks are sent regardless of task length? That is, they get split up so there can be smaller tasks?

I want to understand the needs of the researchers. For example, if longer tasks do different types of calculations than small tasks and few people process them, I can do the long tasks.
6) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 95513)
Posted 28 Apr 2020 by Michael E.
Post:
This post asks about fixing the estimated Remaining time on long Rosetta tasks. I tend to be pretty direct so here goes...

I was using long 36-hour Rosetta tasks and cut them down to 24 hours, but I still have the same issue. This project-specific preference is set in the web interface: Your account > Rosetta@home preferences > Target CPU run time.

On the computer, under Advanced View > Options > Computing Preferences, I set "Store at least" to 1 day of work, but I still get jobs that do not complete and have to be aborted.

With 24+ hour tasks, the estimated Remaining time says about 6 hours until about 6-7 hours elapsed time, when the more accurate time gets calculated, such as 17 hours left.

Questions/strong suggestion:
+ Could the estimated Remaining time for such 24+ hour tasks be doubled to prevent the need to abort so many tasks?
+ Could the number of tasks downloaded at a time (maybe just long tasks) be limited to 2?

Could the option for long tasks > 10 or 12 hours be disabled until the estimated Remaining time can be fixed? I do not think it is a good practice for people to abort tasks.

Does it matter to the Rosetta@home research if we use 8-10 hour tasks rather than 24+ hour tasks?

Mike
7) Message boards : Number crunching : Tells us your thoughts on granting credit for large protein, long-running tasks (Message 95330)
Posted 24 Apr 2020 by Michael E.
Post:
I use a lot of BOINC projects. PrimeGrid applies a bonus for long-running tasks because most people like short-running tasks. For example, looking at CPU-only tasks:

Subprojects with a 10% long job credit bonus have recent average CPU times of 41:29:00 and 60:40:12 hours
Subprojects with a 20% long job credit bonus have recent average CPU times of 107:29:32 and 125:37:06 hours
Other subprojects with longer run-times have long job and conjecture bonuses.

To see details, create a PrimeGrid account and choose Your Account > PrimeGrid Preferences. Or send me a message and ask for a text/screen cap. The preferences also show completion times.

I used to choose projects in part by measuring the points per CPU hour to find those with a high reward. Now I am concerned about medical science more than points.
8) Message boards : Number crunching : Rosetta 4.1+ and 4.2+ (Message 94130)
Posted 11 Apr 2020 by Michael E.
Post:
Grant - Way cool! Thanks much for the help.

I changed the Computing Preferences as you suggested to allow disk writes at 60 seconds and increased the memory allowed to 90%. My PCs have 8 and 16 GB respectively.

Are these suggested Preferences written down anywhere? If not, would you like me to create a PDF-based guide, if you could review it? Send me a message. It needs to be easy to locate as well.

Mike
9) Message boards : Number crunching : Rosetta 4.1+ and 4.2+ (Message 94106)
Posted 10 Apr 2020 by Michael E.
Post:
Mod.Sense asked:"Do you have the box checked to leave tasks in memory while suspended?"

On both local PCs, this is checked/on: BOINC > Advanced View > Options > Computing Preferences > Disk and Memory > Leave non-GPU tasks in memory while suspended

On the web pref's, it was unchecked so I checked it. I usually use the local pref's only.

To minimize disk writes, I set the BOINC > Advanced View > Options > Computing Preferences > Computing > Request tasks to checkpoint at most every 300 seconds

The two PCs both had similar unexpectedly low elapsed time for a short time after they had been processing for a while. They both returned one task so far.

Mike
10) Message boards : Number crunching : Rosetta 4.1+ and 4.2+ (Message 94091)
Posted 10 Apr 2020 by Michael E.
Post:
I am running Windows 10, BOINC 7.14.2 (x86), with Rosetta preferences at 1 day 12 hours.

The Rosetta 4.12 tasks downloaded initially said they would execute in 6 hours but it is taking much longer.

How much longer than the Deadline time is allowed? Sorry but I had to abort several of these.

The Elapsed time seems to reset on some of these tasks. One task completed (1143119307) and others had to be aborted.

I just downloaded some Rosetta Mini v3.78 tasks and will see how those go.

How can I help? I am a retired software writing/support guy who knows what native code means :-).

Mike
11) Message boards : Number crunching : Minirosetta v1.34 bug thread (Message 56261)
Posted 6 Oct 2008 by Michael E.
Post:
I have a task that may be looping. The Message tab shows the task restarting every 30 minutes and the time remaining has been stuck at 00:09:57 for almost two working days. These tasks usually take 9-10 hours on this computer (computer ID is 858463).

Total CPU time and % complete follow:

CPU time 12:34:xx and Progress 98.691% (Friday AM)
CPU time 18:28:xx and Progress 99.105% (Monday late afternoon)
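
To put those two readings in perspective, here is a small Python sketch that extrapolates the remaining CPU time from them. It assumes progress is linear between the two readings and treats the unknown seconds (shown as xx) as zero; both are simplifications, so the result is a rough figure only.

```python
def hours(hhmm):
    """Convert an 'HH:MM' CPU-time reading to fractional hours."""
    h, m = hhmm.split(":")
    return int(h) + int(m) / 60


# The two readings above (seconds were not recorded, so they are left out).
t1, p1 = hours("12:34"), 98.691  # Friday AM
t2, p2 = hours("18:28"), 99.105  # Monday late afternoon

# Percent of progress gained per CPU hour between the two readings.
rate = (p2 - p1) / (t2 - t1)

# CPU hours still needed to reach 100% if that rate held (an assumption).
remaining_hours = (100.0 - p2) / rate

print(f"Progress rate: {rate:.3f} % per CPU hour")
print(f"Extrapolated CPU time remaining: {remaining_hours:.1f} hours")
```

Even at this slow rate the task would need many more CPU hours than the 00:09:57 shown as remaining, which is consistent with the displayed estimate being stuck.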

Some of the log messages in the Message tab indicate a restart about every thirty minutes:

10/6/2008 4:13:53 PM|rosetta@home|Restarting task hombench_mtyka_foldcst_loopbuild_boinctest3_foldcst_loopbuild_t328__IGNORE_THE_REST_1VIMA_16_4578_2_0 using minirosetta version 134
10/6/2008 4:45:54 PM|rosetta@home|Restarting task hombench_mtyka_foldcst_loopbuild_boinctest3_foldcst_loopbuild_t328__IGNORE_THE_REST_1VIMA_16_4578_2_0 using minirosetta version 134

Should I let this task continue or abort it?

Task ID is 194948950 and Work unit is 178087521. It is now past the Report Deadline, so it has been re-sent.

The system is a laptop running Windows XP. It has development software installed.

Michael E. (Mike)






©2021 University of Washington
https://www.bakerlab.org