Message boards : Number crunching : Loads and loads of computing errors today
Author | Message |
---|---|
David Baker Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0 |
I've spent the last two hours poring over the results from the last few days of runs--they are pretty amazing! I don't know what the origin of the computing errors is, but all the results that have been returned are fine. The "random_length_20" runs are forcing one residue in each 20 amino acid segment into one randomly selected conformation--the idea is to keep the different runs spread out by fixing several randomly selected residues into specific but randomly selected states. This is like forcing different explorers to different (randomly selected) regions of the globe. (It is remotely possible that in some rare cases, the residues are fixed into states that cause an error like you saw below, but this doesn't seem very likely.) So far, of the different runs you all have done, the "random_length_20" runs are sampling the most broadly, and we are just about to start jobs where more residues are randomly fixed (they will have names like random_length_15, etc.). On a different note, many people put a lot of effort into reducing the memory footprint over the last few weeks--is this reducing some of the problems all of you were having earlier? |
AnRM Send message Joined: 18 Sep 05 Posts: 123 Credit: 1,355,486 RAC: 0 |
"I've spent the last two hours poring over the results from the last few days of runs--they are pretty amazing! I don't know what the origin of the computing errors is, ..." Well, for me personally, all errors on four boxes were cleared when I upgraded to BOINC 5.2.2 from BOINC 4.19. Our other boxes were running BOINC 5.2.1 and were just fine and didn't miss a beat when R@H 4.78 was introduced. IMHO it was pretty obvious that R@H 4.78 was not very happy running on the older BOINC version. Cheers, Rog. |
Rebirther Send message Joined: 17 Sep 05 Posts: 116 Credit: 41,315 RAC: 0 |
|
adrianxw Send message Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 51 |
I have not seen any errors using the 4.78 application and the 4.25 core client. 3.2GHz P-IV HT, Win XP SP2. I can't go to version 5 yet as LHC@Home does not accept 5.x yet. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Charles Dennett Send message Joined: 27 Sep 05 Posts: 102 Credit: 2,081,660 RAC: 513 |
I've been running for almost a month on my Linux system with AMD's XP2600+ processor and don't believe I've had any errors that I have not caused myself. I'm running R@H full time. I'm using version 4.72 of the core client and I compiled it myself in order to optimize it. (Did that before I even knew of R@H.) Won't switch to 5.x until I can compile it myself once I get some libraries updated (libcurl, etc.) or one of the other people who made optimized 4.x clients available does so for 5.x, too. Just another data point. -Charlie |
adrianxw Send message Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 51 |
Off topic, but I don't really see why people bother with the optimised core client, it uses so little CPU and runs for such a short time anyway. Client apps, that's different of course. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Charles Dennett Send message Joined: 27 Sep 05 Posts: 102 Credit: 2,081,660 RAC: 513 |
"Off topic, but I don't really see why people bother with the optimised core client, it uses so little CPU and runs for such a short time anyway. Client apps, that's different of course." I run BOINC on two systems at home. One is my Linux server, which has an AMD XP2600+ CPU. It runs at just over 2 GHz. The other is my old slow Windows box with 98SE. It's an old 300 MHz PII. With the stock core client on both, the Linux machine would claim about half the credit the Windows machine would for similar workunits. It didn't matter which project it was. It all had to do with the benchmarks run by the core client. The Linux client was simply not optimized the way the Windows client was (and still is with the latest 5.x client, as far as my experiments can tell). Therefore, the benchmarks ran proportionally slower on the Linux box, and since those are used to calculate the claimed credit, it would claim about half the credit the Windows box would. If I compile my own Linux core client and optimize it, the credit claimed by the Linux box is just about the same as that claimed by my old slow Windows box for similar workunits. I know I'm not pumping workunits through the Linux box any faster. I realize the important thing in any project is the science, not the credit. Indeed, I choose to run R@H 100% of the time (with other projects ready to go in case R@H goes down) because of the science. I feel a project with the potential to help fight cancer, diabetes and other afflictions is more important than searching for ET, as interesting and exciting as that search can be. However, the credit is an attraction. I know I can't compete with others who have several fast machines running, but I do like to keep an eye on how well I'm doing relative to others near me in the standings. It just adds to the fun. -Charlie |
adrianxw Send message Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 51 |
I hear what you say but am more puzzled than before. I don't see how compiling in MMX, SSE etc. would improve the integer or floating point performance that the benchmarks return. I would expect the superior floating point unit in the AMD to outperform a similar Pentium in large apps like Rosetta. In little programs like Seti, the small size means it basically fits in the large cache of the newer Pentiums, which makes up for the poorer floating point units. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Charles Dennett Send message Joined: 27 Sep 05 Posts: 102 Credit: 2,081,660 RAC: 513 |
"I hear what you say but am more puzzled than before. I don't see how compiling in MMX, SSE etc. would improve the integer or floating point performance that the benchmarks return." I don't think it's so much the MMX and/or SSE options. There are other optimizations the compiler can do. Anyway, here are the compile options I use. I believe this is the same as Ned Slider recommends on his website: CFLAGS="-march=athlon-xp -O3 -fomit-frame-pointer -funroll-loops -fforce-addr -ffast-math -ftracer" There is a discussion of the effects of these options at http://forums.pcper.com/showthread.php?t=354308. Got that from Ned Slider's website located at http://www.pperry.f2s.com/index.htm. -Charlie |
rbpeake Send message Joined: 25 Sep 05 Posts: 168 Credit: 247,828 RAC: 0 |
Pardon my ignorance, but does "reducing the memory footprint" mean the client uses less RAM, and thus perhaps boxes with 256 MB can successfully be utilized now? Thanks! :) (And by the way, I never did have any problems, thank goodness.) Regards, Bob P. |
adrianxw Send message Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 51 |
You got it in one Bob. Reducing the "footprint" of some parameter means making it smaller. Reduced memory footprint - uses less memory, reduced desk footprint - use less table space, reduced me footprint - less smelly trekking shoes for my wife to complain about. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Ulrich Metzner Send message Joined: 17 Sep 05 Posts: 22 Credit: 405,640 RAC: 0 |
Here is a little summary for application 4.78: 571322 446833 23 Oct 2005 20:34:07 UTC 24 Oct 2005 7:24:04 UTC Over Client error Computing 0.00 0.00 --- ; 571307 446742 23 Oct 2005 20:34:07 UTC 24 Oct 2005 7:24:04 UTC Over Client error Computing 0.00 0.00 --- ; 558197 453166 23 Oct 2005 16:11:33 UTC 23 Oct 2005 20:34:07 UTC Over Client error Computing 0.00 0.00 --- ; 558170 453140 23 Oct 2005 16:11:33 UTC 23 Oct 2005 20:34:07 UTC Over Client error Computing 0.00 0.00 --- ; 548794 444768 23 Oct 2005 13:09:45 UTC 23 Oct 2005 16:11:33 UTC Over Client error Computing 0.00 0.00 --- ; 548776 444750 23 Oct 2005 13:09:45 UTC 23 Oct 2005 16:11:33 UTC Over Client error Computing 0.00 0.00 --- ; 520387 417610 24 Oct 2005 7:24:04 UTC 21 Nov 2005 7:24:04 UTC In Progress Unknown New --- --- --- ; 520386 417609 24 Oct 2005 7:24:04 UTC 21 Nov 2005 7:24:04 UTC In Progress Unknown New --- --- --- ; 490343 393057 23 Oct 2005 1:01:58 UTC 23 Oct 2005 13:09:45 UTC Over Client error Computing 0.00 0.00 --- ; 490342 393056 23 Oct 2005 1:01:58 UTC 23 Oct 2005 13:09:45 UTC Over Client error Computing 0.00 0.00 --- ; 480167 382959 22 Oct 2005 19:02:22 UTC 23 Oct 2005 1:01:58 UTC Over Client error Computing 0.00 0.00 --- ; 470188 373076 22 Oct 2005 13:13:12 UTC 23 Oct 2005 1:01:58 UTC Over Client error Computing 0.00 0.00 --- ; 470187 373075 22 Oct 2005 13:13:12 UTC 22 Oct 2005 23:46:03 UTC Over Success Done 5,559.07 14.27 14.27 ; 441698 350373 21 Oct 2005 15:32:31 UTC 22 Oct 2005 13:13:12 UTC Over Client error Computing 0.00 0.00 --- Except for one single WU, all the others errored out... greetz, Uli |
Ulrich Metzner Send message Joined: 17 Sep 05 Posts: 22 Credit: 405,640 RAC: 0 |
OK, the problem is not limited to AMD Athlons. I just got the same on a 1.8 GHz P4 Willamette CPU: 592071 480093 24 Oct 2005 9:33:00 UTC 24 Oct 2005 10:34:03 UTC Over Client error Computing 0.00 0.00 --- ; 591998 480021 24 Oct 2005 9:33:00 UTC 24 Oct 2005 10:34:03 UTC Over Client error Computing 0.00 0.00 --- ; 571322 446833 23 Oct 2005 20:34:07 UTC 24 Oct 2005 7:24:04 UTC Over Client error Computing 0.00 0.00 --- ; 571307 446742 23 Oct 2005 20:34:07 UTC 24 Oct 2005 7:24:04 UTC Over Client error Computing 0.00 0.00 --- If nothing is done about this, I'll detach for a while, because this is simply a waste of bandwidth :/ greetz, Uli |
Ulrich Metzner Send message Joined: 17 Sep 05 Posts: 22 Credit: 405,640 RAC: 0 |
greetz, Uli |
rbpeake Send message Joined: 25 Sep 05 Posts: 168 Credit: 247,828 RAC: 0 |
"You got it in one Bob. Reducing the 'footprint' of some parameter means making it smaller. Reduced memory footprint - uses less memory, reduced desk footprint - use less table space, reduced me footprint - less smelly trekking shoes for my wife to complain about." Thanks! I think I will try a 256 MB box and see what happens... if the shoes are still too large (which I don't think they will be) I will retire the 256 MB box to something else. ;) Regards, Bob P. |
Andrew Send message Joined: 19 Sep 05 Posts: 162 Credit: 105,512 RAC: 0 |
@Ulrich It's too late now, but you should have uninstalled the 4.19 (or older) client and then installed the 5.2 client. You wouldn't have lost any WUs then :( |
Ulrich Metzner Send message Joined: 17 Sep 05 Posts: 22 Credit: 405,640 RAC: 0 |
@Ulrich Thanks, but that's exactly what I did. The installer already told me to first uninstall the old version :/ BTW: Nothing is really lost. I made a backup first (it's not the first time this has happened to me ;) ) and after I uninstalled 5.2.2, I copied back the old content. What's more annoying is that it already fetched a new CPDN WU, which I have now trashed... greetz, Uli |
Andrew Send message Joined: 19 Sep 05 Posts: 162 Credit: 105,512 RAC: 0 |
I see... well, that is annoying, but at least you made a backup :) |
atotos Send message Joined: 22 Oct 05 Posts: 8 Credit: 70 RAC: 0 |
I had 3 out of 4 WUs end in computation errors..... |
Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
I had one stuck on 1% for an hour and a half ... suspending it and restarting it did not help. But after a full exit and restart it went through ... So, there is something about the start-up that is wonky ... |
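For readers who want to picture the "random_length_20" idea David Baker describes at the top of this thread, here is a minimal sketch, not Rosetta's actual code. It only illustrates "one residue in each 20 amino acid segment is forced into one randomly selected state". The function name `pick_fixed_residues`, the integer state labels, and the default of 3 possible states per residue are all assumptions made up for the illustration; the thread does not say how many conformational states exist or how they are encoded.

```python
import random


def pick_fixed_residues(n_residues, segment_length=20, n_states=3, seed=None):
    """Return {residue_index: state} with exactly one fixed residue per segment.

    n_states is a hypothetical number of allowed conformations per residue;
    the thread does not give the real value.
    """
    rng = random.Random(seed)
    fixed = {}
    # Walk the chain in consecutive segments; the last one may be shorter.
    for start in range(0, n_residues, segment_length):
        end = min(start + segment_length, n_residues)
        position = rng.randrange(start, end)       # which residue to pin
        fixed[position] = rng.randrange(n_states)  # which state to pin it in
    return fixed


if __name__ == "__main__":
    # Two runs with different seeds pin different residues and states,
    # which is what keeps the runs spread out.
    print(pick_fixed_residues(100, seed=1))
    print(pick_fixed_residues(100, seed=2))
```

Under this reading, a run named random_length_15 would simply use `segment_length=15`, so more residues end up fixed per protein.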
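Charlie's point about benchmarks and claimed credit can also be sketched in a few lines. This assumes claimed credit is proportional to CPU time multiplied by the average of the client's two benchmark scores, scaled so that a machine scoring 1000 on both earns 100 credits per day. That scaling is recalled from BOINC's credit definition of the time and is not stated anywhere in this thread, so treat the constants as an assumption. The benchmark numbers in the example are invented.

```python
SECONDS_PER_DAY = 86400.0


def claimed_credit(cpu_seconds, whetstone_mflops, dhrystone_mips):
    """Claimed credit under the assumed benchmark-times-CPU-time rule."""
    benchmark_avg = (whetstone_mflops + dhrystone_mips) / 2.0
    # Assumed scaling: 100 credits per CPU-day at a benchmark average of 1000.
    return cpu_seconds / SECONDS_PER_DAY * benchmark_avg / 1000.0 * 100.0


if __name__ == "__main__":
    cpu_seconds = 5000.0  # same work unit, same hardware, same run time
    # Hypothetical benchmark scores: a stock build versus an optimized build.
    stock = claimed_credit(cpu_seconds, 500.0, 500.0)
    optimized = claimed_credit(cpu_seconds, 1000.0, 1000.0)
    print(stock, optimized)
```

The work unit takes the same CPU time either way; only the benchmark scores change, so a client whose benchmarks run at half speed claims half the credit for identical work.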
©2024 University of Washington
https://www.bakerlab.org