Posts by Nuadormrac

1) Message boards : Number crunching : minirosetta 2.14 (Message 66188)
Posted 18 May 2010 by Nuadormrac
Post:
There's also another aside with increasing complexity on a task, which depending on people's machines might or might not affect them. The bigger the dataset and the more complex the task, the more RAM it can use. The current task I'm working on is using 511 MB of RAM by itself.


Unless it gets to the point where the processes start thrashing, this isn't much of an issue. Modern OS's have very effective virtual memory systems which simply page out less used portions of the working set to disk.


Actually the "conservative swap feature" was a setting restricted to the Windows 9x branded operating systems, as the WinNT/2k/XP line did things a little differently. Windows Vista is essentially an offshoot of Windows Server 2003...

I can say this from experience running the Windows Vista 64 betas and release candidates on an Athlon 64 which had 1 GB of RAM at the time. My experience was this:

- Vista booted up allocating about 900 MB at the desktop; WinXP allocated around 200-340 MB at the desktop (before loading apps). Typical bloatware, we're all familiar with that.

- When Vista 64 (and I do think some of this was down to a 64-bit OS on a 64-bit CPU) was allocating less RAM than the machine physically had, the OS was snappier and more responsive. (To be honest, some paging goes on even when the physical RAM doesn't warrant it; that's part of the difference in how the WinNT line of OS's and its successors deal with paging versus how Win98 dealt with it.)

- Of course, keep in mind that physical memory also holds a disk cache, which the above isn't taking into account. But needless to say, extra RAM is good, especially if one has write caching enabled for the drives and not just read caching.

- But anyhow, the experience was that as soon as allocated RAM went beyond physical RAM by even a small amount, even just a tenth of a GB (about 110% of physical on that box), responsiveness degraded and the OS seemed slower to respond than even WinXP. Of course, there's also a reason many downgraded to XP :p This sort of thing can be especially noticeable with any form of computer gaming, where real-time response can be an issue, particularly in intensive situations (be it an FPS, or an MMO when one's in a large raid with a lot going on at once that must be responded to with as close to no delay as possible).

- When left to themselves, the swapfiles in Win2k, XP, Vista, and I would imagine Win7, have one fatal flaw in how they "grow" once the initial swapfile size is exceeded: they grow very conservatively, and this can result in a fragmented swapfile. This is also why utilities such as Diskeeper introduced a defrag-pagefile option (and later on an option to defrag the MFT). People in the know, however, don't go with the Windows default setting; they set a fixed swapfile size, where the initial and maximum sizes are the same, and follow MS's recommendation of making it at least 1.5x physical memory. (More on how this line of OS's handles paging versus how Win9x handled it.) To be honest, if speed and efficiency were the only concerns, I think Win98 arguably did pagefiles a little better, though this line of OS's can do other things with pagefiles, such as a degree of error handling through them.

Vista would not count as old, and would very much count as a "modern OS", even though Windows 7 is now out. And all I can say is that Vista, on this box here with 2 GB RAM and a dual core, has some of that same general sluggishness, which can leave me wanting to curse Vista at times :laugh: I wouldn't exactly call it the most responsive and snappy thing out there. To be honest, if I had the memory in this box, one of the changes I'd want to make would be to impose a "conservative swap" feature like in Win9x, except Vista doesn't allow for that. Some things it does allow, which I would do, are to go into regedt32 and alter some of the memory management settings: disable the paging executive (one wants enough spare RAM for that change though) and enable the large system cache. There are some other tweaks one can make, if the computer isn't bogged down relative to its physical RAM, that is.
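The fixed-swapfile rule described above can be put as a tiny helper. Purely an illustrative sketch: the 1.5x factor and the equal initial/maximum sizes come from the post; the function name and dictionary keys are made up.

```python
# Sketch only: the 1.5x-physical-RAM rule and equal initial/maximum
# sizes are from the post above; the names here are hypothetical.
def fixed_pagefile_mb(physical_ram_mb, factor=1.5):
    size = int(physical_ram_mb * factor)
    # Equal initial and maximum sizes mean the file never has to grow,
    # so it can't fragment the way a conservatively-growing one does.
    return {"initial_mb": size, "maximum_mb": size}
```

For a 2 GB box, that gives a fixed 3072 MB pagefile.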
2) Message boards : Number crunching : minirosetta 2.14 (Message 66187)
Posted 18 May 2010 by Nuadormrac
Post:
Nuadormrac, the watchdog tries to leave things alone and not interrupt useful work. So it will only end a task once it has gone 4 hours past the target runtime user preference. I believe what you are observing is a combination of long runtime per model, and variance in runtime between one model and the next.

If one model runs in 90 minutes with a 3-hour target runtime, the task will begin a second model on the thought that it will complete within the target. However, if the second model then takes 120 minutes to complete, the task will end a half hour past the target instead. This is normal behavior, and part of the reason why the watchdog has the patience I described above.


Well, I had a 2-hour target runtime, and it was at 2.5 hours when model 0 was complete... which is why I was surprised to come back and see it still chugging away at the task. I inspected it via "show graphics" and saw it was on model 1 rather than 0. It was on model 0 when I left.
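The scheduling described in the reply above can be sketched in a few lines. A minimal sketch under assumptions: the 4-hour watchdog grace and the "start a new model only if it should finish by the target" behavior are from that reply; function names and the exact decision point are mine, not the project's actual code.

```python
# Hypothetical sketch of the runtime logic described in the reply above.
WATCHDOG_GRACE_MIN = 4 * 60  # watchdog only ends a task 4 h past target

def crunch(target_min, model_times_min):
    """Start a new model only while elapsed time is under the target;
    a model already in flight is allowed to finish past it."""
    elapsed, done = 0, 0
    for t in model_times_min:
        if elapsed >= target_min:
            break  # stop at a model boundary once the target is reached
        elapsed += t
        done += 1
        if elapsed >= target_min + WATCHDOG_GRACE_MIN:
            break  # the watchdog would have force-ended the task by here
    return done, elapsed
```

With the reply's example, `crunch(180, [90, 120])` runs both models and ends at 210 minutes, a half hour past the 3-hour target.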
3) Message boards : Number crunching : Umm, why did this WU get it's credit grand dropped by so much? (Message 66152)
Posted 16 May 2010 by Nuadormrac
Post:
http://boinc.bakerlab.org/rosetta/workunit.php?wuid=309383381

This is one of those units that was supposed to run for only 2 hours; model 0 was done around 3 hours, and it went on and did the whole unit. And in the end, despite running to completion (regardless of preference settings), the credits it granted were only around 38 instead of 64.
4) Message boards : Number crunching : minirosetta 2.14 (Message 66150)
Posted 16 May 2010 by Nuadormrac
Post:
Has anyone else noticed that with this version of minirosetta, tasks are running to completion instead of cutting off as they should? I had one last night which was on model 0, already beyond the target time set in preferences; I went out to work, and when I came home it was onto model 1 (the 2nd model) and crunched the entire WU. I've noticed a couple of others like this.

Other tasks are completing within the selected time, but the early termination of WUs once the target time has elapsed seems to be gone. I think the watchdog, or whatever is responsible for telling the WU "you've crunched enough and are done", is not working as it should in this version.
5) Message boards : Number crunching : minirosetta 2.14 (Message 66141)
Posted 16 May 2010 by Nuadormrac
Post:
There's also another aside with increasing complexity on a task, which depending on people's machines might or might not affect them. The bigger the dataset and the more complex the task, the more RAM it can use. The current task I'm working on is using 511 MB of RAM by itself. Now this might or might not seem like much with today's computers, but remember that today's processors have 2 or even 4 cores on one CPU, which means each core is running a separate task, each taking up its own pool of RAM. If someone has a quad core and is running 4 Rosetta tasks as such, they're really using 511 MB x 4 = 2,044 MB, or (2,044/1024) = 1.996 GB of RAM, over and above Windows (Vista or 7, on today's comps).

Thinking of it another way, many of today's computers which come with 2 or 4 GB of RAM on a quad core would essentially have 512 MB or 1 GB respectively per core, if one were to break it up that way. And though you might not care for your web browser (which is what many OEMs are thinking about with pre-built systems), on BOINC you would...

As things become more crowded, their comps might swap a little more, and increased paging activity (as memory usage grows) can slow things down for that reason.
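The arithmetic above is simple enough to fold into a tiny helper, one task per core. A sketch only; the function name is hypothetical.

```python
# Back-of-the-envelope arithmetic from the post: per-task RAM times
# one concurrent task per core, over and above what the OS uses.
def boinc_task_ram_gb(mb_per_task, cores):
    total_mb = mb_per_task * cores      # e.g. 511 MB x 4 = 2044 MB
    return round(total_mb / 1024, 3)    # 2044 / 1024 ≈ 1.996 GB
```

Four 511 MB tasks on a quad core come out to just under 2 GB.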
6) Message boards : Number crunching : Discussion on increasing the default run time (Message 66125)
Posted 15 May 2010 by Nuadormrac
Post:
If the task failed, then for some reason it is not running well on your machine. It is more conservative to replace it with another task that may run better for your environment. In other words if model 1 or 2 failed from this task, let's not push our luck with more. Better to get word back to the project server about the failure sooner. Perhaps there is a trend that will indicate similar future work should be held until a specific issue is resolved.


If model one failed, then it both wouldn't cost people much, and yes, it can reasonably be argued that the tasks aren't working well on the machine. But then a longer WU time wouldn't affect things much if the unit aborted early on (for instance 5 minutes after starting) and a new unit needed to be downloaded. That's well below even the existing preferences.

Where this could more likely be an issue is if, let's say for the sake of argument, 20 models completed successfully and for whatever reason model number 21 failed, with the unit having run 2.5 hours. Only if partial validation for the 20 models occurs would one avoid losing 20 models (versus just 1); otherwise the user would lose the whole 2.5 hours' worth of credits, versus just the amount lost for the one model.

Now arguably I haven't tended to see units fail much on Rosetta (though some have, for there to be discussion, along with a recommendation on the team page for the Pentathlon challenge). But in the past I saw it from time to time on RALPH, which is good because it means many problems are being caught in the alpha/beta stage, before getting released to people in general. But it can be a consideration.

But for crunchers, there can be 2 big considerations with this proposed change. One is the effect on the BOINC queue. The other is a reason shorter run times can be chosen or preferred: less likelihood of running into the odd error if there's a smaller span of time in which it can occur, and a smaller impact on potential lost credits if it does happen.

For you, there's server load on the one hand, but also the potential for losing models/work already completed on the other. (Given we're talking about a change to a 1-3 hour minimum and a 3-6 hour default, units which successfully run for < 1 hour aren't a consideration with such a change, as they'd get thrown out and a new download would occur anyhow. Hence I'm presuming the first model or two has had a successful run, for the task to then error out prior to either 1 or 3 hours respectively. And yes, I know a few models do run for 2 hours or so, though many end earlier.)
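The credit-loss trade-off above can be made concrete. Illustrative only: the partial-validation idea is from the discussion above, but the names and numbers here are hypothetical, not anything the project implements.

```python
# Illustrative only: what's at stake when a task errors after N good
# models, with and without the partial validation discussed above.
def credit_lost(models_completed, credit_per_model, partial_validation):
    if partial_validation:
        return 0.0  # completed models still validate and grant credit
    return models_completed * credit_per_model  # all prior work is lost
```

For example, a failure after 20 good models at a (made-up) 2.5 credits each loses 50 credits without partial validation, and nothing with it.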
7) Message boards : Number crunching : Discussion on increasing the default run time (Message 66124)
Posted 15 May 2010 by Nuadormrac
Post:
This also brings up another issue with such a possible increase; though it's a credit-related one, so it might not take the same precedence. And yet, depending on how the units are treated, it might affect the science also.

If the processing time is increased and the unit deadlocks, hangs, or in some way crashes after the initial model(s) have been successfully processed, it will, after whatever time is spent hanging, error out. And yet not everything in the WU was bad. Now, because the units don't involve the runtimes of a CPDN unit, it's unlikely that trickles would be introduced.

However, an effect of lengthening the runtime is that a unit has a higher chance to error out later on. If this occurs, any science accumulated prior to the failing model within the WU could be lost, and the credits for the models which completed without error most assuredly would be, unless something along the lines of trickles or partial validation/crediting could be implemented to allow the successfully processed models within the unit to be validated and counted as such.

I understand completely the motivation behind increasing the default run time and if I only received Rosetta Beta 5.98 WUs I'm sure I'd hold to that default successfully.

But as I report here (and previously) I get Mini Rosetta WUs constantly crashing out with "Can't acquire lockfile - exiting" error messages - maybe 60% failure rate with a 3-hour runtime, reducing to 40% failure rate with a 2-hour run time.

I've seen this reported by several other people running a 64-bit OS - not just on Vista or with an AMD machine. That said, I don't know how widespread it is. Perhaps you can analyse results at your end.

As stated in the post linked above, I get no errors at all with Rosetta Beta, so I'm inclined to think it's not some aberration with my machine. I'd really like to see some feedback on this issue and some assurance it's being investigated in some way.

I'd ask that a minimum run time of 2 hours is allowed (I can just about handle that) or some mechanism that allows me to reject all Mini Rosetta WUs. If not, I'm prepared to abort all Mini Rosetta WUs before they run. It's really a waste of time me receiving them if 60% are going to crash out on me anyway.

I've commented on this before here, here, here and first of all and more extensively here - see follow-up messages in that thread.

No such issues arose for me with my old AMD single core XPSP2 machine - only when I got this new AMD quad-core Vista64 machine.

Any advice appreciated. It's a very big Rosetta issue for me, so while I'm sure you'll save a whole load of bandwidth if you go ahead with the proposed changes I just hope some allowance can be made for people in my situation.

8) Message boards : Number crunching : Not sure if this is going to be a continuing prob or not (Message 18492)
Posted 12 Jun 2006 by Nuadormrac
Post:
A second WU which was on my hard drive is crunching, and has been running for over 4 hours. So I'm keeping my fingers crossed that the first failure since this new install was just a coincidental fluke, and that this one will complete to the set 8-hour CPU time...
9) Message boards : Number crunching : Not sure if this is going to be a continuing prob or not (Message 18468)
Posted 11 Jun 2006 by Nuadormrac
Post:
Well, I just upgraded my comp to Windows Vista beta 2 tonight. The first Rosetta unit since the upgrade, from my backed-up older BOINC folder, errored out. Einsteins seem to be running OK...

http://boinc.bakerlab.org/rosetta/result.php?resultid=23472320

It might be a one-time thing, or Vista (when it does come closer to release, well beyond the beta being made publicly available to all) might pose a slight snafu for the app here. Not sure...
10) Message boards : Number crunching : Is this for real??? (Message 17763)
Posted 6 Jun 2006 by Nuadormrac
Post:
Of course, if you run the standard client (which I do) and then use akosf's app on Einstein@home (which I do, version u41-04), there will be people over there who will tell you how bad it is not to use an optimized client, because someone who didn't use akosf's app claimed 51 credits but only got 8 (2 people used akosf's optimized science app) for the WU.

Given that I, for one, am not going to run 2 separate BOINC clients on the same computer under the same OS install, and then be unable to have it task-switch between all projects, there's just no winning on that account. But oh well, I've done what seems reasonable, and what won't end up inflating credits here. Either way (though it has never been personally directed towards me), I s'pose I could hear of it in one form or another...

Oh, and on the team thing: go and search for one that suits your interests (as some teams do have a focus they got together for), perhaps visit the forums and see what the peeps are like, etc... Probably any of us would tell you to go join our team, or that our team is the best. After all, if we didn't like our teams, we woulda switched by now; that's also why many of us advertise our teams. In the end, it's best for you to settle on one that you will like :D
11) Message boards : Number crunching : Well, I've seen a new one (well for when I've looked) with a model here (Message 17698)
Posted 5 Jun 2006 by Nuadormrac
Post:
The accepted energy is a negative value in some of the graphs, and more than a little negative.

I suppose it can happen (if there's an exothermic reaction rather than an endothermic one. And truth be told, these reactions are coupled with each other in our bodies, as ATP [adenosine triphosphate] for instance is broken down into ADP to drive certain metabolic activities). As such, this could be exactly the sorta thing going on; though I thought I would mention this rather different behaviour from what I've seen in the past, nonetheless...
12) Message boards : Number crunching : ROSETTA.....vs......CLIMATE PREDECTION (Message 17695)
Posted 5 Jun 2006 by Nuadormrac
Post:
Actually, as I understand it, if the actual crunch time is longer than what BOINC expects, it immediately puts the higher value in place, but if it's less, it chops 10% off the expected crunch time or something until it "gets it right". For instance, I was doing a bunch of LHC WUs while they had work, and some were really short WUs where the beam was off or something, causing early WU completion (yeah, the other crunchers got the same thing, and they did validate, just with smaller credit). Each of those nudged the expected crunch time down. However, one standard-sized unit went the whole 1 million turns, and the expected time was instantly set to the crunch time of that WU...
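That asymmetric adjustment can be sketched in a couple of lines. A hedged sketch of the behavior as described above; the real BOINC heuristic may differ in its exact factor and bookkeeping.

```python
# Sketch of the estimate adjustment as described in the post above
# (the actual BOINC heuristic may differ): an overrun is adopted
# immediately, an underrun only nudges the estimate down by ~10%.
def adjust_estimate(expected, actual):
    if actual > expected:
        return actual          # jump straight up to the longer time
    return expected * 0.9      # creep down after each short WU
```

So one long WU raises the estimate at once, while a string of short WUs lowers it only gradually.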

This said, increasing it too much could easily result in the computer entering into panic EDF mode, as it thinks it's over-committed. So yes, caution is needed.

There is one thing to add here, however. The flip side of a larger target CPU time meaning you download fewer WUs is that if you are having problems (as you stated you were with CPDN), the longer a WU runs, the higher the likelihood a problem will crop up. In that sense, you might want to lower the target CPU time if you need to get some WUs to successful completion. If it was at 300 hours you were failing on CPDN, this might not matter. If you were failing within 24 hours each time, then setting a 24-hour target CPU time might be risky.

For starters, you might want to try setting this below whatever point you were getting failures at on other projects, to make sure you get some successful runs. You can then try upping it gradually, to see whether the same problems replicate. If you're set to an 8-hour run time and find WUs failing at 7.5 hours (rather consistently) on your machine, then you might want to bump this back down to 6 hours so they'll complete some models. We still haven't established (unless there's something I'm missing from the CPDN board in your case) why your machine was having all these errors.
13) Message boards : Number crunching : Team FaDBeens, we taunt in your general direction :D (Message 17494)
Posted 1 Jun 2006 by Nuadormrac
Post:
If my father smelled of elderberries, it was because he was delivering 'em to your father, in an attempt to counter that flatulent smell all 'em beens produced in your own household :rofl

Albeit say what you will, only shrubbing will avail in this contest Muhahahahahaha!!!!!!! :D
14) Message boards : Number crunching : Team FaDBeens, we taunt in your general direction :D (Message 17453)
Posted 31 May 2006 by Nuadormrac
Post:
Oh yes, we a gainin fast... And just do pray that Einstein comes back up line, or we might have a nice lil treat in store, when some people's E@H units run out. In that matter, we shall see Ni!!! :D
15) Message boards : Number crunching : Team FaDBeens, we taunt in your general direction :D (Message 17414)
Posted 31 May 2006 by Nuadormrac
Post:
As of this moment, or actually a few days ago; be advised that Team "Knights Who Say Ni!" declare this official taunting on Team FaDBeens... We gonna give you a run for your money, and show ya "beanie babies"

At Sir Fart's behest, the tauntage is on... Best batten down the hatches and set your affairs in order, cause in the Rosetta stats, we are a'commin :D
16) Message boards : Number crunching : RAC cheats, is this a problem (Message 13313)
Posted 9 Apr 2006 by Nuadormrac
Post:
My Einstein credits get very much under-estimated using akosf's algorithm. And what's more, D40, which I'm using now, is a major improvement over C37 on my Athlon 64. C37 gave a major improvement, bringing the then-current albert WUs down to about a half hour or so of crunch time (I don't exactly remember now). Then longer albert units came out on E@H that took about 1.5 hours. Anyhow, D40, which includes 3DNow! optimizations in addition to the SSE optimizations akosf included previously, is down to about 1 hour (or about 1/3 less than with optimized science app C37). My claimed credits are low there, and remain low...

However, unless someone else is using the optimized app, the quorum of 3 gives me more standard credit. I simply will not use an optimized core client, because most of my crunching is not on SETI (where I use crunch3r's science app), or even SETI and Einstein. True, projects like CPDN would be unaffected, but projects like Rosetta right here definitely would...

However, BOINC was made open source for a reason, and trying to force only "official clients" is not the answer. It was open-sourced in part to allow it to run on otherwise unsupported CPU platforms... What's more, some projects (like some of the Japanese cell computing projects) state where they're listed that they require a non-standard client.

The answer is what SETI is doing with Enhanced, which is now in beta. In all the SETI beta WUs I've received thus far, the claimed credits are virtually identical to each other, regardless of computer or speed. Looking at some of my results on SETI beta, the actual variance from one credit claim to another is < 1 credit point (and in many cases is within a tenth of a point). I think the SETI staff have a fine handle on that, one that will dispense with questions wrt optimized core clients once and for all.
17) Message boards : Number crunching : Miscellaneous Work Unit Errors (Message 13305)
Posted 9 Apr 2006 by Nuadormrac
Post:
I am new here and using version 4.97. I too have almost all my WU's failing with similar codes. ***unrecoverable error for result HBLR_1.0_2reb_426_1061_0 (-exit code -1073741819 (0xc0000005))***



I'm really sorry about these problems. I checked yesterday on RALPH and everything seemed fine, but there clearly is a problem. Unfortunately, I'm just leaving for a family weekend trip so can't figure things out right away. Please bear with us for a couple of days.

Nothing wrong with your trip. But I wonder if you do realize the consequences if nobody else seems to react to serious problems


Yeah, sorry we didn't catch something sooner, but on the WU types we were testing earlier, everything was going up and validating successfully. That was until we got the HBLR units in the morning, and only then did failures start coming out...

A couple of HBLR units did validate on my machine here (over on RALPH), but the vast majority were a no-go, with many of them getting 3 failures, and some 2 with 1 success. Others hadn't reported yet, so I'm not sure what became of them...

Until we got the newer WU types, we obviously weren't able to report problems with them, and could only report on the older types... Sorry us testers weren't able to catch this problem before it started rolling out...
18) Message boards : Number crunching : Please abort WUs with (Message 7785)
Posted 28 Dec 2005 by Nuadormrac
Post:
This would explain the computational errors I've been seeing in Rosetta. BTW, I still got DEFAULT... WUs even today. Guess we should still kill them then.

BTW, I got a computational error today on a WU of a different sort.

http://boinc.bakerlab.org/rosetta/result.php?resultid=5182702

As can be seen, it doesn't start with the DEFAULT... prefix I've been seeing a fair amount of lately. Also, will our download quotas be negatively impacted by all these bad units? And if peeps start running out of work (because the servers don't allow any more downloads), will steps be taken to rectify this possible side effect?

thx
19) Message boards : Number crunching : Noticed more oddities with BOINC benchmark (Message 1648)
Posted 23 Oct 2005 by Nuadormrac
Post:
I mentioned something about this earlier (though on another board), but just noticed it here again. I'm suspecting that the Linux client still trails behind the Windows one. To preface this, both computer listings are the same physical box. It contains:

- Athlon 64 3500+ OCed to 2.4 GHz
- MSI Neo 2-F
- 512 MB Corsair XMS
- Adaptec SCSI card 29160
- 2 10k rpm SCSI drives
- Radeon 9600
yadda, yadda

Now on the software side:

http://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=12772

- Windows XP Pro SP2

Listed benches

Measured floating point speed 2242.99 million ops/sec
Measured integer speed 4092.96 million ops/sec

This listed computer

http://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=32125

- Slamd64 10.2 (a port of Slackware 10.2, compiled over to the AMD64 (x86-64) instruction set).

- Otherwise, it now has ATI's Linux x86-64 drivers installed, with nVidia's x86-64 nForce drivers added as well.

Of course, BOINC isn't compiled for the 64-bit instruction set, but should it be trailing its Windows counterpart so substantially?

Measured floating point speed 1205.91 million ops/sec
Measured integer speed 2423.55 million ops/sec

What I noticed some time ago, which got me scratching my head, was an older Athlon XP 1900+ (a 1.6 GHz Athlon XP) getting better benchies than a 2.7 GHz Pentium 4 (OK, the Athlon has better performance clock for clock, but reviews from various hardware sites haven't shown anything that extreme for the AXP). The 2 processors had shown essentially identical computing times on WUs.

In this case, it's the same exact computer, but multi-booting...
20) Message boards : Number crunching : New type of WU? (Message 1646)
Posted 23 Oct 2005 by Nuadormrac
Post:
Earlier today, I saw several WUs that completed in about 20 minutes, and then another that took the usual 1.5-2 hours...

