Posts by Jim1348

1) Message boards : Number crunching : Discussion of the merits and challenges of using GPUs (Message 96736)
Posted 2 days ago by Jim1348
Post:
A Google search did not find Quarantine@Home. Can you give me a link to that project? Is it able to share a GPU with Folding@Home?

https://quarantine.infino.me/

But the GPU version is only for Linux.
The Windows version is only on the CPU at the moment.
2) Message boards : Number crunching : Discussion of the merits and challenges of using GPUs (Message 96728)
Posted 3 days ago by Jim1348
Post:
For those interested only in projects related to medical research, the only choice now appears to be Folding@home, which wasn't set up to be compatible with BOINC projects. It's possible, but difficult, to run it on a computer that has BOINC running at the same time. Their forums currently aren't working.

I run Folding on the GPU on all my machines, with BOINC running CPU work units. It is no more difficult than the usual annoyances with Folding.
That is, you have to set it up and then delete the "CPU" slot, or it will run by default (and check it again - you usually have to do it twice).
And you of course have to reserve a CPU core to support the GPU, as with most setups.
But they released a new version of their app recently, which may ease the setup. It won't take long to get the hang of it.

And their forums are up, and have been for some time. Maybe you were not trying the SSL version?
https://foldingforum.org/index.php


If you are interested in other types of GPU projects, note that Asteroids@home currently has disk space problems interfering with uploads.

I am about to post a comparison of how awful their GPU version is compared to the CPU version for efficiency. It will be something like 40 watt-hours per work unit for the GPU (i.e., GTX 1060 or 1070), and about 14 watt-hours for the CPU. They should ban the GPU version to save the planet.
(It has been stated by others before, but should be emphasized again.)
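
To make the comparison concrete, here is a minimal sketch of the arithmetic, assuming energy per work unit is simply average power draw times run time. The function names and the 120-watt example are hypothetical; only the 40 Wh and 14 Wh figures come from the post.

```python
def watt_hours_per_wu(avg_power_watts: float, hours_per_wu: float) -> float:
    """Energy used by one work unit, in watt-hours (power x time)."""
    return avg_power_watts * hours_per_wu


def energy_ratio(gpu_wh: float, cpu_wh: float) -> float:
    """How many times more energy the GPU uses per work unit than the CPU."""
    return gpu_wh / cpu_wh


# Hypothetical measurement: a card drawing 120 W that finishes a WU in 20 minutes.
example_wh = watt_hours_per_wu(120.0, 20.0 / 60.0)

# Figures quoted in the post: about 40 Wh per WU on the GPU, about 14 Wh on the CPU.
ratio = energy_ratio(40.0, 14.0)
print(f"GPU uses about {ratio:.1f}x the energy per work unit")
```

On those quoted figures the GPU comes out at a little under three times the energy per work unit.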
3) Message boards : Number crunching : "process got signal 11": How to fix? (Message 96395)
Posted 13 days ago by Jim1348
Post:
I know nothing about Darwin, but do you have the latest updates for the OS?
(I don't think the BOINC version usually matters for this type of problem).
4) Message boards : Number crunching : Rosetta 4.0+ (Message 96305)
Posted 16 days ago by Jim1348
Post:
Not the buffer size :-

<rec_half_life_days>X</rec_half_life_days>
A project's scheduling priority is determined by its estimated credit in the last X days. Default is 10; set it larger if you run long high-priority jobs.

OK, I see what you are saying, but I am not sure why you set that larger. I want the estimated time to converge faster.
So I routinely set mine as follows when installing BOINC:
<rec_half_life_days>1.000000</rec_half_life_days>

That was not the source of my problem. It was some incompatibility between the new BOINC (after 7.14.2) and the server.
It worked OK on some projects, and not others. I have not seen the problem for a while now, so it eventually corrects itself.
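
For anyone wondering what a shorter half-life does, here is a rough sketch. It assumes the "recent estimated credit" is a plain exponential decay with the given half-life, and that the option sits inside an <options> block of cc_config.xml; both are my understanding rather than something confirmed in this thread, and the helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def decay_weight(age_days: float, half_life_days: float) -> float:
    """Weight given to work done `age_days` ago under simple exponential decay."""
    return 0.5 ** (age_days / half_life_days)


def build_cc_config(half_life_days: float) -> str:
    """Build a minimal cc_config.xml body containing only rec_half_life_days."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")  # assumed location of the option
    ET.SubElement(options, "rec_half_life_days").text = f"{half_life_days:.6f}"
    return ET.tostring(root, encoding="unicode")


# Compare the default (10 days) with the 1-day setting used in the post.
for half_life in (10.0, 1.0):
    weights = [round(decay_weight(age, half_life), 4) for age in (1, 3, 10)]
    print(f"half-life {half_life:>4} days -> weights at 1, 3, 10 days: {weights}")

print(build_cc_config(1.0))
```

With a 1-day half-life, work from three days ago counts for only one eighth as much as work done now, so the client forgets old history much faster.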
5) Message boards : Number crunching : Rosetta 4.0+ (Message 96280)
Posted 17 days ago by Jim1348
Post:
I changed the 10 day limit in the cc_config file down to 1 day because I didn’t like the way one project would run away with the machine after not having WUs for a while. I’m not certain that this also controls the time it takes to learn a machine’s throughput, but I suspect it does.

Mine was set for the default (0.1 + 0.5 days). It ignored that. But it seems to have finally collapsed once it reached the 10-day limit, which maybe is part of the server code.
6) Message boards : Number crunching : Rosetta 4.0+ (Message 96267)
Posted 17 days ago by Jim1348
Post:
I'm afraid I spoke too soon: it started to request too many tasks again without my changing anything in my small cache. I still have 120 waiting to run.

It looks like that is your machine with BOINC 7.16.6. I had the same problem on WCG after I upgraded BOINC from 7.14.2 to the next version, whatever it was. It went berserk and downloaded work units until it reached the 10 day limit (or got exhausted, whichever came first). I ended up with hundreds of work units.

It is apparently due to a change in the BOINC scheduler. But the servers don't necessarily know how to deal with it, at least until they "learn". I posted about it on the WCG forum a few months ago.
I never found a good solution, except to manually control the downloads. After a while, it starts working again. Good luck.
7) Message boards : Number crunching : Server can't open database (Message 96218)
Posted 18 days ago by Jim1348
Post:
Are you using the new HTTPS address?

Yes, I converted everything over. I hope they know what they are doing.

EDIT: The work has started flowing again, and my machines are topped off.
8) Message boards : Number crunching : Server can't open database (Message 96214)
Posted 18 days ago by Jim1348
Post:
I have been getting this message for the last several hours on all of my machines that request work (three thus far).
9) Message boards : Rosetta@home Science : RosettaCommons, better togheter (Message 96115)
Posted 20 days ago by Jim1348
Post:
David Baker is as much a genius at organization as he is at protein structure prediction.
The two skills may go together.
10) Message boards : Number crunching : Rosetta 4.1+ and 4.2+ (Message 95608)
Posted 25 days ago by Jim1348
Post:
Security is the main problem indeed, having a Windows XP connected to internet is madness.

Some years ago, a guy did a test of a machine (probably XP) with absolutely no patches, but behind a router.
That was back in the days of Red Alert and the port vulnerabilities, when most machines did not last 10 seconds.

It had absolutely no infections after several days. But it was not used for web browsing.
A dedicated machine is safe with anything if you know what you are doing. Most people don't.
11) Message boards : Number crunching : All tasks failed : finish file present too long (Message 95354)
Posted 25 Apr 2020 by Jim1348
Post:
It's still a problem for large core-count machines. My 24 core Xeon machines have SSD storage and gigabit internet.

You can use a write-cache (don't bother wasting memory on a read-cache).

Windows: If you have a Samsung SSD, their Magician utility includes the "Rapid Mode cache" if you enable it.
The Crucial drives have "Storage Executive" with a cache.
For a larger cache, you can buy PrimoCache from Romex Software.

Linux has its own built-in cache; you just need to set the size. 1 GB of cache and a 1/2 hour write-delay should work wonders.
Probably half that amount of cache or even less would fix this problem, and a 5 minute delay should be more than enough.
https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/
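
As a sketch of how those numbers translate into Linux settings, the snippet below converts a cache size and write delay into the vm.dirty_* sysctl values. The keys are standard kernel sysctls, but the choice to set the background threshold to half the cache is only an illustrative assumption, and the function name is hypothetical. It only prints the lines; it does not change anything on the system.

```python
from typing import Dict


def dirty_cache_sysctl(cache_bytes: int, write_delay_seconds: int) -> Dict[str, int]:
    """Translate a write-cache size and delay into vm.dirty_* sysctl values."""
    return {
        # Hard limit on dirty (unwritten) page-cache data.
        "vm.dirty_bytes": cache_bytes,
        # Point where background flushing starts (assumed: half the cache).
        "vm.dirty_background_bytes": cache_bytes // 2,
        # Age at which dirty data must be written out, in hundredths of a second.
        "vm.dirty_expire_centisecs": write_delay_seconds * 100,
    }


# The figures from the post: 1 GB of cache and a half-hour write delay.
settings = dirty_cache_sysctl(1024 ** 3, 30 * 60)
for key, value in settings.items():
    print(f"{key} = {value}")
```

The printed lines are in the form used by /etc/sysctl.conf, so they can be reviewed before anything is applied.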
12) Message boards : Number crunching : Large proteins (Message 95014)
Posted 20 Apr 2020 by Jim1348
Post:
And I will continue to ask for the SSEx/AVX extensions :-P

You are a good man. But I don't believe for a minute that you have given up on GPUs.
13) Message boards : Number crunching : Large proteins (Message 94924)
Posted 19 Apr 2020 by Jim1348
Post:
I don't want my CPU cores idle "waiting for memory" and I can't check all day long.

With virtual cores, it is not much of a loss if you have one out of twelve idle, for example. You aren't losing a full core, only an instruction stream, and the hardware resources will still be in use.
If you are losing two virtual cores, then that is the equivalent of a full core. One defense (the best) is to get more memory.

But if they need to get the science done, then the loss of cores for less important work is really their decision, and not a problem for me.

EDIT: As was said, another way is to run a second project that requires less memory. I often use TN-Grid, a gene-expansion project that has some secondary implications for COVID-19 (among a lot of others). It requires only about 56 MB per WU. Just set it to maybe 10% resource share or less, and it will fill in automatically if you have a free core.
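
A small sketch of how a low resource share works out, assuming (as I understand BOINC) that shares are relative numbers and each project's long-run fraction is its share divided by the total. The share values of 100 and 10 are hypothetical examples, not settings taken from this thread.

```python
from typing import Dict


def share_fractions(shares: Dict[str, float]) -> Dict[str, float]:
    """Convert relative resource shares into fractions of total computing."""
    total = sum(shares.values())
    return {name: share / total for name, share in shares.items()}


# Hypothetical setup: main project at the default share, fill-in project at 10.
fractions = share_fractions({"Rosetta@home": 100.0, "TN-Grid": 10.0})
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```

The fill-in project ends up with roughly a tenth of the machine over time, but it can still take a core whenever the main project cannot start a task.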
14) Message boards : Number crunching : Large proteins (Message 94900)
Posted 19 Apr 2020 by Jim1348
Post:
Crossing the "bound" during the run would cause the BOINC Manager to abort the task. So it is a number you shouldn't actually see happen often.

What I have seen in the past on my 12-core Ryzens with 16 GB of memory (11 cores on BOINC) is that if there is not enough memory, then the last WU just won't start up, and I then have only 10 running.
If that is all that happens, no problem. If something crashes, then I think we need to make other arrangements.
15) Message boards : Number crunching : Project does not download new units, saying there is not enough disk space (Message 94552)
Posted 15 Apr 2020 by Jim1348
Post:
the disk space is enough, but the project does not download new units saying there is not enough disk space. This causes other projects to use the available, vacant space. And there we do have a problem!

You need to increase the disk space available to BOINC in the BOINC Manager. You have only 1.25 GB available now, not enough to download much.

Go to "Options" -> "Computing preferences" and then "Disk and memory".
Just set "Leave at least 5 GB free" (or whatever value you want), and then un-check the other two disk settings.
It uses the most restrictive of the three.
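
The "most restrictive of the three" rule can be sketched like this. It is a simplified model with hypothetical names and numbers, where an un-checked setting is passed as None and ignored; the real client may apply further details.

```python
from typing import Optional


def boinc_disk_allowance_gb(
    total_gb: float,
    free_gb: float,
    boinc_used_gb: float,
    max_used_gb: Optional[float] = None,   # "Use no more than X GB"
    min_free_gb: Optional[float] = None,   # "Leave at least X GB free"
    max_used_pct: Optional[float] = None,  # "Use no more than X% of total"
) -> float:
    """Total space BOINC may occupy: the smallest of the enabled limits."""
    limits = []
    if max_used_gb is not None:
        limits.append(max_used_gb)
    if min_free_gb is not None:
        # Space BOINC already uses counts as available to it.
        limits.append(free_gb + boinc_used_gb - min_free_gb)
    if max_used_pct is not None:
        limits.append(total_gb * max_used_pct / 100.0)
    if not limits:
        return free_gb + boinc_used_gb
    return max(0.0, min(limits))


# Hypothetical 500 GB disk with 300 GB free and BOINC already using 1 GB.
all_three = boinc_disk_allowance_gb(500, 300, 1, max_used_gb=10, min_free_gb=5, max_used_pct=50)
only_free = boinc_disk_allowance_gb(500, 300, 1, min_free_gb=5)
print(f"all three limits set: {all_three} GB")
print(f"only 'leave free' set: {only_free} GB")
```

In this example the small "use no more than" value is what holds BOINC back, which is why un-checking the other two settings frees up the space.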
16) Message boards : Number crunching : No more work units ? (Message 94508)
Posted 15 Apr 2020 by Jim1348
Post:
You should set BOINC to run 100% of the time.
It is the "Use at most 100% of the CPU time" setting.

Also, do not suspend when computer is in use.
17) Message boards : Number crunching : No more work units ? (Message 94460)
Posted 14 Apr 2020 by Jim1348
Post:
Check your BOINC settings. There is work available.

But one of your machines has only 2 GB of memory. That is not enough. I wouldn't try with less than 8 GB.
18) Message boards : Number crunching : Rosetta 4.15 (short time estimates) (Message 94358)
Posted 13 Apr 2020 by Jim1348
Post:
tl;dr I left everything alone, runtime sorted itself out without me having to touch anything and I don't have an excessive buffer either

You lucked out. It depends on the order of what they send you. It could go the wrong way.
I keep the default buffer of 0.1 + 0.5 days. That may be short enough to keep them from timing out, but I don't trust it.
The zero resource share is foolproof. I can increase it in a few days.
19) Message boards : Number crunching : Rosetta 4.15 (short time estimates) (Message 94279)
Posted 12 Apr 2020 by Jim1348
Post:
When you preload tasks, increase the settings of 0.1 day and 0.1 additional day in BOINC Manager to 3 or 4.
You'll get WUs with longer deadlines, which also means a higher chance of getting WUs that run longer.
Rosetta has mixed WUs, some of which run close to 10 hours.

Mine run for 18 hours, because that is what I set them to.
You would initially get way too many short ones if you set BOINC to a 3 to 4 day buffer.

Insofar as I know, that setting does not change the run time. It is just that the initial time estimates are way off.
The initial estimate is usually 4 hours 30 minutes on my machines, but is even less now.
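
Here is a rough sketch of why a large buffer plus a low initial estimate over-fetches. It assumes the client simply fills the buffer based on the estimated run time per task, which is a simplification; the 8-core machine is a hypothetical example, while the 4.5-hour estimate and 18-hour run time are the figures from the post.

```python
def tasks_fetched(cores: int, buffer_days: float, estimated_hours: float) -> int:
    """Tasks needed to fill the buffer if each task took the estimated time."""
    return int(cores * buffer_days * 24 / estimated_hours)


def days_to_finish(tasks: int, cores: int, actual_hours: float) -> float:
    """Wall-clock days to finish the tasks at their real run time."""
    return tasks * actual_hours / (cores * 24)


# Hypothetical 8-core machine with a 3-day buffer.
tasks = tasks_fetched(cores=8, buffer_days=3.0, estimated_hours=4.5)
real_days = days_to_finish(tasks, cores=8, actual_hours=18.0)
print(f"{tasks} tasks fetched, about {real_days:.0f} days of real work")
```

Under these assumptions a 3-day buffer turns into about four times that much real work, since 18 hours is four times the 4.5-hour estimate.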
20) Message boards : Rosetta@home Science : Interesting starting point for attacking Coronavirus (Message 94278)
Posted 12 Apr 2020 by Jim1348
Post:
Are you implying that none of the compounds in the ZINC database contain any zinc atoms?

I already know not to trust anything Donald says without checking elsewhere, but I haven't found tools for reading what is in the ZINC database.

They may include zinc insofar as I know. I was just suggesting that someone might have confused the names.





©2020 University of Washington
https://www.bakerlab.org