Message boards : Number crunching : minirosetta 2.14
Author | Message |
---|---|
Yifan Song Volunteer moderator Project developer Project scientist Send message Joined: 26 May 09 Posts: 62 Credit: 7,322 RAC: 0 |
More CASP related updates. |
Felix Send message Joined: 10 Nov 08 Posts: 2 Credit: 107,587 RAC: 0 |
More CASP related updates. Question: Is there ever an end to the Rosetta workunits, or do you guys keep sending out the same ones thousands of times to be run with slightly different parameters? |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Different parameters, different techniques, different proteins, Rosetta is a development project. There is no fixed amount of predetermined work to be done from a list. Rosetta Moderator: Mod.Sense |
Brutall Send message Joined: 4 May 10 Posts: 1 Credit: 64,339 RAC: 0 |
Seems like minirosetta 2.14 workunits are far more complicated? My CPU processes them more slowly than workunits from 2.11. |
Chilean Send message Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
|
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
CASP often sends very challenging targets. These are larger proteins, made up of more amino acids. These larger proteins generally take longer per model to process than those typically processed otherwise. Rosetta Moderator: Mod.Sense |
svincent Send message Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
Task 338821144 failed on W7 rb_05_13_148_531_rs_stg0_lrlxcst_t000__casp9_SAVE_ALL_OUT_20582_2038_1 Seems a previous cruncher had the same issue ERROR: CORE ERROR: You must use the ThreadingJobInputter with the LoopRelaxThreadingMover - did you forget the -in:file:template_pdb option? ERROR:: Exit from: ....srcprotocolsloopsLoopRelaxThreadingMover.cc line: 80 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish </stderr_txt> ]]> |
Nuadormrac Send message Joined: 27 Sep 05 Posts: 37 Credit: 202,469 RAC: 0 |
There's also another aside with increasing complexity on a task, which depending on people's machines might/might not affect them. The bigger the dataset, the more complex something is, the more RAM it can use. The current task I'm working on is using 511 MB of RAM to itself. Now this might/might not seem like much with today's computers, but also remember that today's processors have either 2 or even 4 cores on one CPU. Which means that each core is running a separate task, each taking up its own pool of RAM. If someone has a quad core, and is running 4 Rosetta tasks as such, they're really using 511 MB x 4 or 2,044 MB or (2,044/1024) = 1.996 GB of RAM over and above Windows (Vista or 7, on today's comps). Now thinking of it another way, many of today's computers which come with 2 or 4 GB of RAM, on a quad core, would essentially have 512 MB or 1 GB respectively per core if one were to break it up that way. And though you might not care on your web browser (what many OEMs are thinking about with pre-built systems), on BOINC you would... As things become more crowded, their comps might swap a little more, and increased paging activity (as the memory pool usage grows larger) can slow things down for that reason. |
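The arithmetic in the post above can be checked with a small sketch. The 511 MB per-task figure and the 2 GB / 4 GB quad-core machines come from the post; the function names are purely illustrative and are not part of BOINC or Rosetta.

```python
MB_PER_GB = 1024

def total_task_ram_mb(per_task_mb, running_tasks):
    """Combined RAM used when each core runs its own task."""
    return per_task_mb * running_tasks

def ram_per_core_mb(installed_gb, cores):
    """Installed RAM split evenly across cores (ignores the OS's own usage)."""
    return installed_gb * MB_PER_GB / cores

total_mb = total_task_ram_mb(511, 4)
print(total_mb, "MB =", round(total_mb / MB_PER_GB, 3), "GB")

for installed_gb in (2, 4):
    print(installed_gb, "GB quad core:", ram_per_core_mb(installed_gb, 4), "MB per core")
```

On the 2 GB machine the even split (512 MB per core) is barely more than a single 511 MB task, before Windows itself is counted, which is the crowding the post describes.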
Nuadormrac Send message Joined: 27 Sep 05 Posts: 37 Credit: 202,469 RAC: 0 |
Has anyone else noticed that with this version of minirosetta tasks are running to completion, instead of cutting off as they should? I had one last night which was on model 0, beyond the target time set in preferences, went out to work, and when I came home, it was onto model 1 (the 2nd model), and crunched the entire WU. I've noticed a couple others like this. Other models are completing within the selected time, but the early completion of WUs once the target time has elapsed seems to be gone. I think watchdog, or whatever is responsible for telling the WU "you've crunched enough and are done" is not working as it should in this version. |
dgnuff Send message Joined: 1 Nov 05 Posts: 350 Credit: 24,773,605 RAC: 0 |
There's also another aside with increasing complexity on a task, which depending on people's machines might/might not affect them. The bigger the dataset, the more complex something is, the more RAM it can use. The current task I'm working on is using 511 MB of RAM to itself. Unless it gets to the point where the processes start thrashing, this isn't much of an issue. Modern OS's have very effective virtual memory systems which simply page out less used portions of the working set to disk. Looking at the four Rosetta tasks running on my system now, they all have a virtual size in the 400 to 450 MB range. However they all only have a working set size in the 250 MB to 300 MB range, and hardly any page faults happening. Therefore Windows is doing a first class job of figuring out which parts of the process aren't immediately necessary this instant, and paging them out. And I suspect that as the workload on my system increases, the working set size of each Rosetta task will decrease. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Nuadormrac, the watchdog tries to leave things alone and not interrupt useful work. So it will only end a task when it has gone 4 hours past the target runtime user preference. I believe what you are observing is a combination of long runtime per model, and variance in runtime between one model and the next. If I run one model in 90 minutes with a 3 hour target runtime, the task will begin a second model on the thought that it will complete within the target. However, if the second model then takes 120 minutes to complete, the task will end a half hour past the target instead. This is normal behavior, and part of the reason why the watchdog has the patience I described above. Rosetta Moderator: Mod.Sense |
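The behavior described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the actual minirosetta code: it assumes the next model is started whenever the previous model's runtime suggests it would still fit within the target, and that the watchdog ends the task 4 hours past the target. The function and constant names are hypothetical.

```python
WATCHDOG_GRACE_MIN = 4 * 60  # "4 hours past the target", per the post

def simulate_task(model_runtimes_min, target_min, grace_min=WATCHDOG_GRACE_MIN):
    """Return (models_completed, elapsed_min, ended_by_watchdog)."""
    elapsed = 0
    completed = 0
    last_runtime = None
    for runtime in model_runtimes_min:
        # Assumption: the previous model's runtime is the estimate for the next one.
        if last_runtime is not None and elapsed + last_runtime > target_min:
            break  # the next model is not expected to fit, so the task ends here
        if elapsed + runtime > target_min + grace_min:
            # Assumption: the watchdog steps in once the grace period is used up.
            return completed, target_min + grace_min, True
        elapsed += runtime
        completed += 1
        last_runtime = runtime
    return completed, elapsed, False

# The example from the post: 3 hour target, first model 90 min, second 120 min.
models, elapsed, killed = simulate_task([90, 120, 100], target_min=180)
print(models, elapsed, elapsed - 180, killed)
```

With these inputs the second model is started because 90 + 90 minutes still fits the 180 minute target, and the task then finishes 30 minutes past the target without the watchdog intervening.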
coturnix Send message Joined: 8 Oct 09 Posts: 4 Credit: 760,915 RAC: 0 |
|
Kyle Send message Joined: 9 Jun 07 Posts: 1 Credit: 15,292 RAC: 0 |
I've been having a problem on 2 of the 3 PCs I run BOINC on, where Rosetta hangs or stops, what have you. WUs will stop progressing to their finish, but the PC is still trying to fold. Using my older folder as an example, I would get WUs that take around a day for this PC to finish. If I didn't restart the PC at least twice a day, the WU would hang or pause while still counting elapsed time, and the completion time would also start to increase. I've had WUs stuck for over 13 hrs elapsed time and showing 60 hr completion time. My 3rd system only runs Rosetta and does so w/o any hassles. All systems are running Win XP Pro SP3; the one that's working fine is an Intel dual core laptop. The problem systems are desktops, one running an old AMD Athlon XP 1600+, the other an Athlon 64 X2. Sorry for the long post, but I would like to try and get this fixed; the 1600+ was a full time Rosetta rig. It's currently running SETI because I am tired of babysitting this one system. |
kashi Send message Joined: 23 Nov 07 Posts: 2 Credit: 647,336 RAC: 0 |
I had 4 tasks error with error message the same as svincent above. They only run for 11-13 seconds before erroring. All of these tasks have casp9 in the name, the other tasks without casp9 in the name gave no trouble. 338419916 338419899 338418442 38410731 I thought perhaps they exceeded the memory capacity of my computer because they used 400-500MB of ram each but 3 of them also errored on others' computers when they were sent out again. The 4th one that was resent has not been completed yet. One "casp9" task completed successfully however, so they don't all error on my computer. I have 4MB of ram, but a VM takes 900MB so only 3.1MB is available. With 8 cores to feed and these casp9 tasks taking 500MB each and Windows 7 taking a fair chunk too, I am thinking my computer does not have sufficient resources to process the current minirosetta tasks when 2 casp9 tasks try to run at the one time. |
Nuadormrac Send message Joined: 27 Sep 05 Posts: 37 Credit: 202,469 RAC: 0 |
Nuadormrac, the watchdog tries to leave things alone and not interrupt useful work. So it will only end a task when it has gone 4 hours past the target runtime user preference. I believe what you are observing is a combination of long runtime per model, and variance in runtime between one model and the next. Well I had a 2 hour run time, and it was 2.5 hours runtime when model 0 was complete.... Which is why I was surprised to come back and see it still chugging away at the task, inspected in the "show graphics again" and saw it was on model 1 rather than 0. It was on model 0 when I left. |
Nuadormrac Send message Joined: 27 Sep 05 Posts: 37 Credit: 202,469 RAC: 0 |
There's also another aside with increasing complexity on a task, which depending on people's machines might/might not affect them. The bigger the dataset, the more complex something is, the more RAM it can use. The current task I'm working on is using 511 MB of RAM to itself. Actually the "conservative swap feature" was a setting which was restricted to Windows 9x branded operating systems, as the winNT/2k/XP line did things a little differently. Windows Vista is essentially an offshoot of Windows 2003 server... I can say that from experience with running Windows Vista 64 beta and release candidates on a then Athlon 64 which had 1 GB of RAM at the time it was in beta; my experience was this. - Vista booted up allocating about 900 MB at desktop, winXP allocated around 200-340 MB at desktop (before loading apps). Typical bloatware, we're all familiar with that. - When Vista 64 (and I do think some of this was a 64-bit OS on a 64-bit CPU) was allocating less RAM than one had in the machine (though tbh there is paging that goes on when the physical RAM doesn't warrant needing it, part of the differences in how the winNT line of OS's, along with its successors, deal with paging, vs how win98 dealt with it), the OS was snappier and more responsive. - Course keep in mind, the physical memory also has a HD cache, which the above isn't taking into account. But needless to say extra RAM is good, especially if one has write caching enabled for the drives, and not just read caching. - But anyhow, the experience was that as soon as allocated RAM went beyond physical RAM by even a small amount, aka even just 1/10th of a GB or then 110% physical on that box, the responsiveness degraded, and the OS seemed slower to respond than even winXP. 
Course there's also a reason many downgraded to XP :p This sort of thing can be especially noticeable with any form of computer gaming, where real time response times can be an issue; especially in some intensive situations (be it from a FPS standpoint, or an MMO standpoint if one's in a large raid, with a lot going on at once which must be responded to with as next to no delay as possible). - When left to themselves, the swapfiles in win2k, XP, Vista, and I would imagine win7 have one fatal flaw with how they "grow" if the initial swapfile size is exceeded. They do so very conservatively, and this can also result in a fragmentation problem wrt the swapfile. This is also why utilities such as Diskeeper and the like introduced a defrag pagefile option (and later on an option to defrag the MFT). People in the know however don't go with the Windows default setting; they set a fixed swapfile size, where the initial and max sizes are the same, and follow MS's recommendation of making it at least 1.5x physical memory. (More on how this line of OS's handles paging vs how win9x handled it.) TBH, if speed and efficiency were the only concern I think win98 did pagefiles a little better (arguably), though this line of OS's does have other things it can do with pagefiles, such as a degree of error handling through them. Vista would not count as an old OS, and would very much count as a "modern OS" even though Windows 7 is now out. And all I can say is Vista, on this box here, with 2 GB RAM and a dual core, yes it's got some of that same sluggishness in general which can leave me wanting to curse Vista at times :laugh: I wouldn't exactly call it the most responsive and snappy thing out there. And tbh, if I had the memory in this box, a few of the changes I would want to make would be to impose a "conservative swap" feature like in win9x, except Vista doesn't allow for that. 
Though some things it does allow for, and which I would do, are to go into regedt32 and alter some of the memory management features to disable paging executive (one wants enough extra RAM for that change though) as well as enable large system cache. There are some other tweaks one can make, if the computer isn't bogged down that is, relative to its own physical RAM. |
kashi Send message Joined: 23 Nov 07 Posts: 2 Credit: 647,336 RAC: 0 |
Oops, my previous post should read 4GB of ram and 3.1GB available. I trust you all knew what I meant and have overlooked my error. |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
CASP 9 task that died: rs_stg0_lrlxjcst_t512__casp8_SAVE_ALL_OUT_20673_2315_0 Compute error Exit status -177 (0xffffff4f) CPU time 7788.453 Maximum elapsed time exceeded - Unhandled Exception Record - Reason: Breakpoint Encountered (0x80000003) at address 0x7C90120E And another one earlier: rb_05_13_148_531_rs_stg0_lrlxcst_t000__casp9_SAVE_ALL_OUT_20582_1917_1 Compute error Exit status 1 (0x1) CPU time 15.54688 <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> ERROR: CORE ERROR: You must use the ThreadingJobInputter with the LoopRelaxThreadingMover - did you forget the -in:file:template_pdb option? ERROR:: Exit from: ....srcprotocolsloopsLoopRelaxThreadingMover.cc line: 80 BOINC:: Error reading and gzipping output datafile: default.out called boinc_finish And one I missed at the beginning of the month: rb_05_04_128_339_rs_stg0_lrlx_t000__casp9_SAVE_ALL_OUT.IGNORE_THE_REST_A_20282_2406_0 Client state Compute error Exit status -177 (0xffffff4f) CPU time 9525.484 <message> Maximum elapsed time exceeded </message> - Unhandled Exception Record - Reason: Breakpoint Encountered (0x80000003) at address 0x7C90120E |
coturnix Send message Joined: 8 Oct 09 Posts: 4 Credit: 760,915 RAC: 0 |
Quite a few work units recently failed with segmentation faults on my Linux machine. However, most of these work units seem to succeed on Windows. Here is another work unit that segfaulted on both Linux and Windows: rb_05_17_152_540_rs_stg0_lrlx_t000__casp9_SAVE_ALL_OUT.IGNORE_THE_REST_B_20851_1555 |
Rui Pinheiro Send message Joined: 6 Feb 10 Posts: 3 Credit: 103,931 RAC: 0 |
Hi, sorry to post this here, but I've been searching for a while and still can't find the answers, as simple as they may be. 1 - How do I update to the 2.14 version? 2 - How can I get some info about the workunit I'm working on? Like, what is it related to? As a suggestion, I would say it would be nice if users could pick their path, choosing the way their computer time is spent. |
©2025 University of Washington
https://www.bakerlab.org