minirosetta 2.14

Message boards : Number crunching : minirosetta 2.14

Yifan Song
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 26 May 09
Posts: 62
Credit: 7,322
RAC: 0
Message 66050 - Posted: 10 May 2010, 19:48:20 UTC

More CASP related updates.
ID: 66050 · Rating: 0
Felix
Joined: 10 Nov 08
Posts: 2
Credit: 107,587
RAC: 0
Message 66062 - Posted: 11 May 2010, 2:05:44 UTC - in response to Message 66050.  

More CASP related updates.


Question: Is there ever an end to the Rosetta workunits, or do you guys keep sending out the same ones thousands of times to be run with slightly different parameters?
ID: 66062 · Rating: 0
Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3428
Credit: 0
RAC: 0
Message 66063 - Posted: 11 May 2010, 2:27:27 UTC

Different parameters, different techniques, different proteins. Rosetta is a development project; there is no fixed list of predetermined work to be done.
Rosetta Moderator: Mod.Sense
ID: 66063 · Rating: 0
Brutall
Joined: 4 May 10
Posts: 1
Credit: 64,339
RAC: 0
Message 66085 - Posted: 12 May 2010, 9:06:00 UTC

Seems like minirosetta 2.14 workunits are far more complicated? My CPU processes them more slowly than workunits from 2.11.
ID: 66085 · Rating: 0
Chilean
Joined: 16 Oct 05
Posts: 675
Credit: 15,833,672
RAC: 31,706
Message 66101 - Posted: 13 May 2010, 2:57:23 UTC - in response to Message 66085.  

Seems like minirosetta 2.14 workunits are far more complicated? My CPU processes them more slowly than workunits from 2.11.


Doubt it. Each WU runs for a fixed amount of CPU time (set by you; the default is 3 hrs, I believe), independent of its version.
ID: 66101 · Rating: 0
transient
Joined: 30 Sep 06
Posts: 376
Credit: 8,090,810
RAC: 3,242
Message 66105 - Posted: 13 May 2010, 8:50:41 UTC - in response to Message 66101.  

Seems like minirosetta 2.14 workunits are far more complicated? My CPU processes them more slowly than workunits from 2.11.


Doubt it. Each WU runs for a fixed amount of CPU time (set by you; the default is 3 hrs, I believe), independent of its version.


True, but the number of models generated could be lower in that timespan, because of higher complexity.
ID: 66105 · Rating: 0
Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3428
Credit: 0
RAC: 0
Message 66108 - Posted: 13 May 2010, 17:37:23 UTC

CASP often sends very challenging targets. These are larger proteins, made up of more amino acids, and they generally take longer per model to process than the targets typically run otherwise.
Rosetta Moderator: Mod.Sense
ID: 66108 · Rating: 0
svincent
Joined: 30 Dec 05
Posts: 205
Credit: 5,123,573
RAC: 2,150
Message 66131 - Posted: 15 May 2010, 16:38:30 UTC

Task 338821144 failed on W7

rb_05_13_148_531_rs_stg0_lrlxcst_t000__casp9_SAVE_ALL_OUT_20582_2038_1

Seems a previous cruncher had the same issue

ERROR: CORE ERROR: You must use the ThreadingJobInputter with the LoopRelaxThreadingMover - did you forget the -in:file:template_pdb option?
ERROR:: Exit from: ..\..\src\protocols\loops\LoopRelaxThreadingMover.cc line: 80
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

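For anyone comparing failed tasks, here is a minimal Python sketch that pulls the source file and line number out of a stderr dump like the one above. The regular expression and the function name `find_exit_location` are illustrative assumptions based only on the text quoted in this post; they are not part of BOINC or Rosetta.

```python
import re

# Pattern assumed from the "ERROR:: Exit from: <path> line: <n>" line quoted above.
EXIT_RE = re.compile(r"^ERROR:: Exit from: (?P<path>.+?) line: (?P<line>\d+)$")

# Sample text copied from the failed task's stderr output.
SAMPLE_STDERR = r"""ERROR: CORE ERROR: You must use the ThreadingJobInputter with the LoopRelaxThreadingMover - did you forget the -in:file:template_pdb option?
ERROR:: Exit from: ..\..\src\protocols\loops\LoopRelaxThreadingMover.cc line: 80
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish"""


def find_exit_location(stderr_text):
    """Return (source_path, line_number) for the first exit line, or None."""
    for raw_line in stderr_text.splitlines():
        match = EXIT_RE.match(raw_line.strip())
        if match:
            return match.group("path"), int(match.group("line"))
    return None


if __name__ == "__main__":
    print(find_exit_location(SAMPLE_STDERR))
```

Two tasks that report the same path and line number very likely hit the same check in the application, which is a quick way to confirm that "a previous cruncher had the same issue".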
ID: 66131 · Rating: 0
Nuadormrac
Joined: 27 Sep 05
Posts: 37
Credit: 82,230
RAC: 103
Message 66141 - Posted: 16 May 2010, 3:09:12 UTC
Last modified: 16 May 2010, 3:12:28 UTC

There's also another aspect to increasing task complexity which, depending on people's machines, might or might not affect them. The bigger the dataset and the more complex something is, the more RAM it can use. The current task I'm working on is using 511 MB of RAM by itself. Now this might not seem like much with today's computers, but also remember that today's processors have either 2 or even 4 cores on one CPU, which means that each core is running a separate task that takes up its own pool of RAM. If someone has a quad core and is running 4 Rosetta tasks like this, they're really using 511 MB x 4 = 2,044 MB, or 2,044/1024 = 1.996 GB of RAM, over and above Windows (Vista or 7, on today's comps).

Now thinking of it another way, many of today's computers come with 2 or 4 GB of RAM, which on a quad core would essentially be 512 MB or 1 GB per core respectively, if one were to break it up that way. And though you might not care for your web browser (which is what many OEMs are thinking about with pre-built systems), on BOINC you would...

As things become more crowded, their comps might swap a little more, and increased paging activity (as memory usage grows larger) can slow things down for that reason.
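The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the numbers quoted in this post (511 MB per task, 4 tasks, 2 or 4 GB installed); the function names are made up for the example.

```python
def total_task_memory_mb(per_task_mb, concurrent_tasks):
    """Memory used by all running tasks together, in MB."""
    return per_task_mb * concurrent_tasks


def mb_to_gb(megabytes):
    """Convert MB to GB using 1024 MB per GB, as in the post."""
    return megabytes / 1024


def ram_per_core_mb(installed_gb, cores):
    """Installed RAM split evenly across cores, in MB."""
    return installed_gb * 1024 / cores


if __name__ == "__main__":
    total = total_task_memory_mb(511, 4)
    print(total, "MB =", round(mb_to_gb(total), 3), "GB")  # 2044 MB = 1.996 GB
    print(ram_per_core_mb(2, 4), "MB per core with 2 GB")  # 512.0
    print(ram_per_core_mb(4, 4), "MB per core with 4 GB")  # 1024.0
```

On a 2 GB quad core, the per-core share (512 MB) is about the same size as a single 511 MB task, before the operating system takes anything, which is the squeeze described above.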
ID: 66141 · Rating: 0
Nuadormrac
Joined: 27 Sep 05
Posts: 37
Credit: 82,230
RAC: 103
Message 66150 - Posted: 16 May 2010, 15:48:56 UTC

Has anyone else noticed that with this version of minirosetta, tasks are running to completion instead of cutting off as they should? I had one last night which was on model 0, already beyond the target time set in preferences. I went out to work, and when I came home it was on model 1 (the 2nd model) and crunched the entire WU. I've noticed a couple of others like this.

Other models are completing within the selected time, but the early completion of WUs once the target time has elapsed seems to be gone. I think the watchdog, or whatever is responsible for telling the WU "you've crunched enough and are done", is not working as it should in this version.
ID: 66150 · Rating: 0
dgnuff
Joined: 1 Nov 05
Posts: 347
Credit: 24,023,248
RAC: 20
Message 66153 - Posted: 16 May 2010, 16:38:20 UTC - in response to Message 66141.  

There's also another aspect to increasing task complexity which, depending on people's machines, might or might not affect them. The bigger the dataset and the more complex something is, the more RAM it can use. The current task I'm working on is using 511 MB of RAM by itself.


Unless it gets to the point where the processes start thrashing, this isn't much of an issue. Modern OSes have very effective virtual memory systems which simply page out less-used portions of the working set to disk. Looking at the four Rosetta tasks running on my system now, they all have a virtual size in the 400 to 450 MB range. However, they all have a working set size of only 250 to 300 MB, with hardly any page faults happening. So Windows is doing a first-class job of figuring out which parts of each process aren't needed right now, and paging them out.

And I suspect that as the workload on my system increases, the working set size of each Rosetta task will decrease.
ID: 66153 · Rating: 0
Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3428
Credit: 0
RAC: 0
Message 66156 - Posted: 16 May 2010, 18:00:42 UTC

Nuadormrac, the watchdog tries to leave things alone and not interrupt useful work, so it will only end a task when it has gone 4 hours past the target runtime user preference. I believe what you are observing is a combination of long runtime per model and variance in runtime between one model and the next.

If I run one model in 90 minutes with a 3 hour target runtime, the task will begin a second model on the expectation that it will complete within the target. However, if the second model then takes 120 minutes to complete, the task will end a half hour past the target instead. This is normal behavior, and part of the reason why the watchdog has the patience I described above.
Rosetta Moderator: Mod.Sense
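The 90 minute / 120 minute example can be walked through with a small Python simulation. This is a rough sketch, not the actual minirosetta logic. It assumes that the task predicts the next model's runtime from the previous model's runtime (a guess made for illustration) and that the watchdog cuts in at exactly 4 hours past the target, as described above. All names are hypothetical.

```python
WATCHDOG_GRACE_MIN = 4 * 60  # watchdog ends a task 4 hours past the target


def simulate_task(model_runtimes_min, target_min):
    """Return (models_completed, elapsed_minutes, ended_by_watchdog).

    Assumption for illustration: a new model is started only if the previous
    model's runtime suggests it will finish within the target runtime.
    """
    elapsed = 0
    completed = 0
    previous_runtime = None
    for runtime in model_runtimes_min:
        # Decide whether to start another model, using the last runtime as the estimate.
        if previous_runtime is not None and elapsed + previous_runtime > target_min:
            break
        # The watchdog interrupts a model that would run beyond target + grace.
        if elapsed + runtime > target_min + WATCHDOG_GRACE_MIN:
            return completed, target_min + WATCHDOG_GRACE_MIN, True
        elapsed += runtime
        completed += 1
        previous_runtime = runtime
    return completed, elapsed, False


if __name__ == "__main__":
    # 3 hour target, first model 90 min, second model 120 min.
    models, elapsed, killed = simulate_task([90, 120, 100], target_min=180)
    print(models, elapsed, elapsed - 180, killed)  # 2 210 30 False
```

With these inputs the task finishes two models and ends 30 minutes past the target without the watchdog stepping in, which matches the example in this post.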
ID: 66156 · Rating: 0
coturnix
Joined: 8 Oct 09
Posts: 4
Credit: 750,672
RAC: 0
Message 66171 - Posted: 17 May 2010, 11:59:12 UTC

ID: 66171 · Rating: 0
Kyle
Joined: 9 Jun 07
Posts: 1
Credit: 15,292
RAC: 0
Message 66183 - Posted: 17 May 2010, 23:22:35 UTC

I've been having a problem on 2 of the 3 PCs I run BOINC on, where Rosetta hangs, stops, what have you. WUs stop progressing toward their finish, but the PC is still trying to fold. Using my older folder as an example, I would get WUs that take around a day for this PC to finish. If I didn't restart the PC at least twice a day, the WU would hang or pause while elapsed time kept counting, and the completion time would also start to increase. I've had WUs stuck at over 13 hrs elapsed time and showing a 60 hr completion time. My 3rd system only runs Rosetta and does so without any hassles. All systems are running Win XP Pro SP3; the one that's working fine is an Intel dual-core laptop. The problem systems are desktops, one running an old AMD Athlon XP 1600+ and the other an Athlon 64 X2. Sorry for the long post, but I would like to try and get this fixed; the 1600+ was a full-time Rosetta rig. It's currently running SETI because I am tired of babysitting this one system.
ID: 66183 · Rating: 0
kashi
Joined: 23 Nov 07
Posts: 2
Credit: 346,280
RAC: 0
Message 66185 - Posted: 18 May 2010, 10:44:36 UTC

I had 4 tasks error with error message the same as svincent above. They only run for 11-13 seconds before erroring. All of these tasks have casp9 in the name, the other tasks without casp9 in the name gave no trouble.
338419916
338419899
338418442
38410731

I thought perhaps they exceeded the memory capacity of my computer because they used 400-500MB of ram each, but 3 of them also errored on others' computers when they were sent out again. The 4th one that was resent has not been completed yet.

One "casp9" task completed successfully however, so they don't all error on my computer.

I have 4MB of ram, but a VM takes 900MB so only 3.1MB is available. With 8 cores to feed and these casp9 tasks taking 500MB each and Windows 7 taking a fair chunk too, I am thinking my computer does not have sufficient resources to process the current minirosetta tasks when 2 casp9 tasks try to run at the one time.
ID: 66185 · Rating: 0
Nuadormrac
Joined: 27 Sep 05
Posts: 37
Credit: 82,230
RAC: 103
Message 66187 - Posted: 18 May 2010, 11:01:10 UTC - in response to Message 66156.  
Last modified: 18 May 2010, 11:01:50 UTC

Nuadormrac, the watchdog tries to leave things alone and not interrupt useful work, so it will only end a task when it has gone 4 hours past the target runtime user preference. I believe what you are observing is a combination of long runtime per model and variance in runtime between one model and the next.

If I run one model in 90 minutes with a 3 hour target runtime, the task will begin a second model on the expectation that it will complete within the target. However, if the second model then takes 120 minutes to complete, the task will end a half hour past the target instead. This is normal behavior, and part of the reason why the watchdog has the patience I described above.


Well, I had a 2 hour run time, and it was at 2.5 hours of runtime when model 0 was complete... which is why I was surprised to come back and see it still chugging away at the task. I inspected it in "show graphics" again and saw it was on model 1 rather than 0. It was on model 0 when I left.
ID: 66187 · Rating: 0
Nuadormrac
Joined: 27 Sep 05
Posts: 37
Credit: 82,230
RAC: 103
Message 66188 - Posted: 18 May 2010, 11:16:54 UTC - in response to Message 66153.  
Last modified: 18 May 2010, 11:27:51 UTC

There's also another aspect to increasing task complexity which, depending on people's machines, might or might not affect them. The bigger the dataset and the more complex something is, the more RAM it can use. The current task I'm working on is using 511 MB of RAM by itself.


Unless it gets to the point where the processes start thrashing, this isn't much of an issue. Modern OSes have very effective virtual memory systems which simply page out less-used portions of the working set to disk.


Actually the "conservative swap feature" was a setting restricted to the Windows 9x operating systems, as the winNT/2k/XP line did things a little differently. Windows Vista is essentially an offshoot of Windows Server 2003...

I can say that from experience with running Windows Vista 64 beta and release candidates on a then Athlon 64 which had 1 GB of RAM at the time it was in beta; my experience was this.

- Vista booted up allocating about 900 MB at desktop; winXP allocated around 200-340 MB at desktop (before loading apps). Typical bloatware, we're all familiar with that.

- When Vista 64 (and I do think some of this was down to a 64-bit OS on a 64-bit CPU) was allocating less RAM than one had in the machine, the OS was snappier and more responsive. (Though tbh there is paging that goes on even when the physical RAM doesn't warrant needing it, part of the difference in how the winNT line of OSes and its successors deal with paging vs how win98 dealt with it.)

- Course keep in mind, the physical memory also holds an HD cache, which the above isn't taking into account. But needless to say extra RAM is good, especially if one has write caching enabled for the drives, and not just read caching.

- But anyhow, the experience was that as soon as allocated RAM went beyond physical RAM by even a small amount, say just 1/10th of a GB or 110% of physical on that box, the responsiveness degraded, and the OS seemed slower to respond than even winXP. Course there's also a reason many downgraded to XP :p This sort of thing can be especially noticeable with any form of computer gaming, where real-time response can be an issue, especially in some intensive situations (be it from an FPS standpoint, or an MMO standpoint if one's in a large raid with a lot going on at once which must be responded to with as little delay as possible).

- When left to themselves, the swapfiles in win2k, XP, Vista, and I would imagine win7, have one fatal flaw in how they "grow" if the initial swapfile size is exceeded. They do so very conservatively, and this can also result in a fragmentation problem with the swapfile. This is also why utilities such as Diskeeper and the like introduced a defrag pagefile option (and later on an option to defrag the MFT). People in the know however don't go with the Windows default setting; they set a fixed swapfile size, with the initial and max sizes the same, and follow MS's recommendation of making it at least 1.5x physical memory. (More on how this line of OSes handles paging vs how win9x handled it.) TBH, if speed and efficiency were the only concern I think win98 did pagefiles a little better (arguably), though this line of OSes does have other things it can do with pagefiles, such as a degree of error handling through them.

Vista would not count as old, and would very much count as a "modern OS" even though Windows 7 is now out. And all I can say is that Vista, on this box here with 2 GB RAM and a dual core, has some of that same general sluggishness which can leave me wanting to curse Vista at times :laugh: I wouldn't exactly call it the most responsive and snappy thing out there. And tbh, if I had the memory in this box, one of the changes I would want to make would be to impose a "conservative swap" feature like in win9x, except Vista doesn't allow for that. Some things it does allow for, and which I would do, are to go into regedt32 and alter some of the memory management settings to disable paging executive (one wants enough extra RAM for that change though) and enable large system cache. There are some other tweaks one can make, if the computer isn't bogged down relative to its own physical RAM, that is.
ID: 66188 · Rating: 0
kashi
Joined: 23 Nov 07
Posts: 2
Credit: 346,280
RAC: 0
Message 66191 - Posted: 18 May 2010, 12:58:21 UTC - in response to Message 66185.  

Oops, my previous post should read 4GB of ram and 3.1GB available. I trust you all knew what I meant and have overlooked my error.
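With the corrected figures (4 GB installed, about 900 MB taken by the VM, roughly 500 MB per casp9 task, 8 cores), a rough Python sketch can estimate how many such tasks fit in memory at once. The function name is hypothetical, and the 1,000 MB allowance for Windows 7 in the second call is an assumed figure chosen only for illustration; the post does not give one.

```python
def max_concurrent_tasks(total_ram_mb, reserved_mb, per_task_mb, cores):
    """Estimate how many tasks fit in RAM, capped at the number of cores."""
    available_mb = total_ram_mb - reserved_mb
    if available_mb <= 0:
        return 0
    return min(cores, int(available_mb // per_task_mb))


if __name__ == "__main__":
    total_mb = 4 * 1024  # 4 GB installed
    vm_mb = 900          # memory taken by the VM, from the post

    # Only the VM reserved: 3196 MB left for tasks.
    print(max_concurrent_tasks(total_mb, vm_mb, 500, cores=8))  # 6

    # Also reserving an assumed 1000 MB for Windows 7 (hypothetical figure).
    print(max_concurrent_tasks(total_mb, vm_mb + 1000, 500, cores=8))  # 4
```

Either way the estimate is below 8, so an 8-core machine with this much RAM cannot keep every core on a 500 MB task without paging, which fits the suspicion in the earlier post.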
ID: 66191 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 4835
Credit: 3,070,762
RAC: 732
Message 66193 - Posted: 18 May 2010, 13:46:00 UTC
Last modified: 18 May 2010, 13:53:10 UTC

Casp 9 task that died

rs_stg0_lrlxjcst_t512__casp8_SAVE_ALL_OUT_20673_2315_0

Compute error
Exit status -177 (0xffffff4f)
CPU time 7788.453

Maximum elapsed time exceeded

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x7C90120E


and another one earlier

rb_05_13_148_531_rs_stg0_lrlxcst_t000__casp9_SAVE_ALL_OUT_20582_1917_1


Compute error
Exit status 1 (0x1)
Cpu time 15.54688

<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>


ERROR: CORE ERROR: You must use the ThreadingJobInputter with the LoopRelaxThreadingMover - did you forget the -in:file:template_pdb option?
ERROR:: Exit from: ..\..\src\protocols\loops\LoopRelaxThreadingMover.cc line: 80
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

and one I missed at the beginning of the month

rb_05_04_128_339_rs_stg0_lrlx_t000__casp9_SAVE_ALL_OUT.IGNORE_THE_REST_A_20282_2406_0

Client state Compute error
Exit status -177 (0xffffff4f)
CPU time 9525.484

<message>
Maximum elapsed time exceeded
</message>


- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x7C90120E
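The exit statuses above are shown both as a signed number and as a 32-bit hexadecimal value, for example -177 and 0xffffff4f. A minimal Python sketch of the conversion between the two forms, assuming a 32-bit two's complement representation (the helper names are made up for this example):

```python
def exit_status_hex(status, bits=32):
    """Format a signed exit status as an unsigned hex value of the given width."""
    return f"0x{status & ((1 << bits) - 1):x}"


def hex_to_signed(hex_text, bits=32):
    """Interpret a hex string as a signed two's complement integer."""
    value = int(hex_text, 16)
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


if __name__ == "__main__":
    print(exit_status_hex(-177))        # 0xffffff4f
    print(exit_status_hex(1))           # 0x1
    print(hex_to_signed("0xffffff4f"))  # -177
```

So "-177 (0xffffff4f)" and "1 (0x1)" are each one value written two ways, not two separate error codes.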
ID: 66193 · Rating: 0
coturnix
Joined: 8 Oct 09
Posts: 4
Credit: 750,672
RAC: 0
Message 66214 - Posted: 19 May 2010, 9:24:25 UTC

Quite a few work units recently failed with segmentation faults on my Linux machine. However, most of these work units seem to succeed on Windows.

Here is another work unit that segfaulted on both Linux and Windows:
rb_05_17_152_540_rs_stg0_lrlx_t000__casp9_SAVE_ALL_OUT.IGNORE_THE_REST_B_20851_1555
ID: 66214 · Rating: 0



©2017 University of Washington
http://www.bakerlab.org