Problems with Minirosetta Version 1.71

Yifan Song
Volunteer moderator
Project developer
Project scientist
Joined: 26 May 09
Posts: 62
Credit: 7,322
RAC: 0
Message 61386 - Posted: 26 May 2009, 19:07:08 UTC

minirosetta updated to 1.71. Bug fixes and some local parameter file loading options added in this version.
ID: 61386 · Rating: 0
Nelson
Joined: 8 Jun 06
Posts: 1
Credit: 348,195
RAC: 0
Message 61404 - Posted: 27 May 2009, 2:20:46 UTC

I'm getting: "Message from server: Server error: Can't attach shared memory"
ID: 61404 · Rating: 0
adrianxw
Joined: 18 Sep 05
Posts: 653
Credit: 11,840,739
RAC: 4
Message 61410 - Posted: 27 May 2009, 7:40:57 UTC

I think that is a server issue, not client specific. I've seen it a few times in the last few months.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 61410 · Rating: 0
googloo
Joined: 15 Sep 06
Posts: 133
Credit: 22,813,645
RAC: 3,531
Message 61413 - Posted: 27 May 2009, 12:48:35 UTC

Once again, I beg you: please, please, please post version changes to Rosetta Application Version Release Log. That way we'll get an email and will be able to update our firewalls. PLEASE
ID: 61413 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 61419 - Posted: 28 May 2009, 0:19:07 UTC

Just a comment about these (lr5_E_yf_chbond_05_rlbd_1bq9_SAVE_ALL_OUT....) tasks: for me they run very quickly. I had 99 decoys in 1 hr 17 min on one task, which I think is the fastest I have ever done 99 decoys. Another one reached 99 in just over 2 hrs, still very fast.
ID: 61419 · Rating: 0
Speedy
Joined: 25 Sep 05
Posts: 163
Credit: 808,337
RAC: 1
Message 61422 - Posted: 28 May 2009, 4:48:19 UTC - in response to Message 61419.  

Just a comment about these (lr5_E_yf_chbond_05_rlbd_1bq9_SAVE_ALL_OUT....) tasks: for me they run very quickly. I had 99 decoys in 1 hr 17 min on one task, which I think is the fastest I have ever done 99 decoys. Another one reached 99 in just over 2 hrs, still very fast.

This is normal; lr5 tasks have always reached 99 decoys fast. I am unsure why this is the case, though.
Have a crunching good day!!
ID: 61422 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 61425 - Posted: 28 May 2009, 10:13:38 UTC - in response to Message 61422.  

Just a comment about these (lr5_E_yf_chbond_05_rlbd_1bq9_SAVE_ALL_OUT....) tasks: for me they run very quickly. I had 99 decoys in 1 hr 17 min on one task, which I think is the fastest I have ever done 99 decoys. Another one reached 99 in just over 2 hrs, still very fast.

This is normal; lr5 tasks have always reached 99 decoys fast. I am unsure why this is the case, though.


Looks like it is just the yf tasks that have this speedy result. I had a new icoor task that ran the full 4 hrs. yf must be a simple task.
ID: 61425 · Rating: 0
Ian McGregor
Joined: 21 Oct 08
Posts: 5
Credit: 1,778,357
RAC: 0
Message 61449 - Posted: 29 May 2009, 18:21:19 UTC

Not sure why, but the past 25 WUs of v1.71 I've gotten have all had computation errors and exited before finishing.
ID: 61449 · Rating: 0
Hammeh
Joined: 11 Nov 08
Posts: 63
Credit: 211,283
RAC: 0
Message 61450 - Posted: 29 May 2009, 18:24:43 UTC - in response to Message 61449.  

Not sure why, but the past 25 WUs of v1.71 I've gotten have all had computation errors and exited before finishing.


Your computer list shows no failed tasks.
ID: 61450 · Rating: 0
Speedy
Joined: 25 Sep 05
Posts: 163
Credit: 808,337
RAC: 1
Message 61455 - Posted: 29 May 2009, 22:11:26 UTC

Looks like it is just the yf tasks that have this speedy result. yf must be a simple task.
lr5_E_yf_chbond_05_run2_rlbd_4ubp_SAVE_ALL_OUT_12502_390_0 ran for 5.9 hours & my time preference is 6 hours.
Have a crunching good day!!
ID: 61455 · Rating: 0
PinkaDunka
Joined: 18 May 09
Posts: 2
Credit: 8,401,056
RAC: 0
Message 61456 - Posted: 30 May 2009, 0:42:17 UTC

My client (works only with rosetta) can't get any work units.
It is saying that I need 39 GB of space.
(And that requirement is increasing. Every time it connects to get work units it asks for slightly more space.)

boinc: 29-May-2009 19:56:41 [rosetta@home] Sending scheduler request: To fetch work. Requesting 33217 seconds of work, reporting 0 completed tasks
boinc: 29-May-2009 19:56:46 [rosetta@home] Scheduler request completed: got 0 new tasks
boinc: 29-May-2009 19:56:46 [rosetta@home] Message from server: No work sent
boinc: 29-May-2009 19:56:46 [rosetta@home] Message from server: There was work but you don't have enough disk space allocated.
boinc: 29-May-2009 19:56:46 [rosetta@home] Message from server: An additional 39382 MB is needed.

Is it true that I need 39 GB?
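
For anyone who wants to pull that figure out of the log automatically, here is a minimal sketch. The helper name `additional_mb_needed` is hypothetical (it is not part of BOINC), and the pattern only assumes the message wording shown in the log above. It converts units and says nothing about whether the server's estimate is correct.

```python
import re

# Hypothetical helper, not a BOINC API: extract the shortfall from a
# scheduler message worded like the one in the log above.
def additional_mb_needed(log_line):
    match = re.search(r"An additional (\d+(?:\.\d+)?) MB is needed", log_line)
    return float(match.group(1)) if match else None

line = ("boinc: 29-May-2009 19:56:46 [rosetta@home] Message from server: "
        "An additional 39382 MB is needed.")
mb = additional_mb_needed(line)

print(mb / 1000)  # decimal GB, if the server counts 1 MB as 10**6 bytes
print(mb / 1024)  # GiB, if the server counts 1 MB as 2**20 bytes
```

Either way the request is in the high 30s of gigabytes, which matches the "39" reported in the post.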
ID: 61456 · Rating: 0
PinkaDunka
Joined: 18 May 09
Posts: 2
Credit: 8,401,056
RAC: 0
Message 61461 - Posted: 30 May 2009, 4:14:50 UTC - in response to Message 61456.  

I have restarted the client, and that has fixed it.
ID: 61461 · Rating: 0
zpm
Joined: 21 Mar 09
Posts: 6
Credit: 349,801
RAC: 0
Message 61473 - Posted: 30 May 2009, 14:59:58 UTC - in response to Message 61461.  

I have restarted the client, and that has fixed it.


This issue of WUs needing more space also popped up at dd@h (drugdiscovery), but there it was server-side.


I recommend Secunia PSI: http://secunia.com/vulnerability_scanning/personal/
http://boinc.drugdiscoveryathome.com
ID: 61473 · Rating: 0
Michael Hoffmann
Joined: 5 Jun 08
Posts: 9
Credit: 1,307,108
RAC: 0
Message 61476 - Posted: 30 May 2009, 22:47:55 UTC

I'm running task lb_alnmatrix_threading_alncap__hb_t313__IGNORE_THE_REST_12577_3074_0 right now. The elapsed time is already over 11 hours, with another 30 predicted. It was the same with lr5_D_rama_map_iter05_rlbn_1kpe_SAVE_ALL_OUT_NATIVE_NOCON_12603_60_0_0. The outcome is nothing really special, so I wonder if this is normal. Usually, at least in the previous version, I needed between 3 and 5 hours for a task.
The current task's graphics also cannot be displayed. This is all a bit strange - is it due to the new minirosetta version?

(By the way, I'm running a Vista64 system with 2 x 3.25 GHz and 4 GB RAM)
ID: 61476 · Rating: 0
Toby Broom
Joined: 15 Oct 08
Posts: 11
Credit: 18,732,062
RAC: 1
Message 61477 - Posted: 30 May 2009, 22:58:51 UTC

I have a few tasks that seem to hang partway through. They're still "running" in BOINC but they are way over the default 3 hrs:



I aborted a few to keep my computer going e.g.:

255088102
255066919
255015300
254965104
254889680

Any other information that is of use?
ID: 61477 · Rating: 0
P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 61478 - Posted: 30 May 2009, 23:09:10 UTC

Hi.

Some of mine seem to go in the other direction. I just noticed this one is a bit odd. Any idea why it finished early? It only ran halfway!

lb_alnmatrix_threading_alncap__hb_t363__IGNORE_THE_REST_12591_1728

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=232788417

# cpu_run_time_pref: 14400
======================================================
DONE :: 22 starting structures 7877.82 cpu seconds
This process generated 22 decoys from 22 attempts
======================================================

pete.
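
For what it's worth, the numbers in that output can be put side by side with a few lines of Python. This is only arithmetic on the figures quoted above (the variable names are illustrative, not Rosetta's), and it does not explain why the task stopped when it did.

```python
# Figures copied from the task output quoted above.
cpu_run_time_pref = 14400    # target run time, in seconds
cpu_seconds_used = 7877.82   # "7877.82 cpu seconds"
decoys = 22                  # "22 decoys from 22 attempts"

fraction_of_target = cpu_seconds_used / cpu_run_time_pref
seconds_per_decoy = cpu_seconds_used / decoys

print(f"{fraction_of_target:.0%} of the target run time used")
print(f"{seconds_per_decoy:.0f} CPU seconds per decoy on average")
```

So the task used a little over half of the 4-hour preference, which fits the "only ran halfway" description.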

ID: 61478 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 2,014
Message 61481 - Posted: 31 May 2009, 0:26:31 UTC - in response to Message 61477.  

I have a few tasks that seem to hang partway through. They're still "running" in BOINC but they are way over the default 3 hrs:



I aborted a few to keep my computer going e.g.:

255088102
255066919
255015300
254965104
254889680

Any other information that is of use?


Looks like BOINC thinks you're running on 8 CPU cores. Do you actually have that many, or is hyperthreading making it look like you have twice as many as you actually have? Hyperthreading allows BOINC to use any CPU time that the other workunit on the same CPU core does not use, but BOINC is unable to keep good track of which workunits use how much CPU time when hyperthreading is in use.

Also, I'd expect the total memory requirements to be increased when that many workunits are trying to run at once. Just how much RAM do you have?

And how much disk space do you allow BOINC to use?
ID: 61481 · Rating: 0
Toby Broom
Joined: 15 Oct 08
Posts: 11
Credit: 18,732,062
RAC: 1
Message 61482 - Posted: 31 May 2009, 1:13:50 UTC
Last modified: 31 May 2009, 1:20:26 UTC

Hi Rob,

The systems that are having problems are Xeon 54xx's, both dual-processor systems without hyperthreading, so the 8 cores are real.

The systems use 2-3 GB of RAM and have 4 GB.

They're set to the default of 10 GB of disk space.

Version 1.71 seems to be less reliable, but I have also just upgraded to BOINC 6.6.28, so maybe that is it?

See here:
1060278
941259
ID: 61482 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 2,014
Message 61483 - Posted: 31 May 2009, 1:53:55 UTC - in response to Message 61481.  
Last modified: 31 May 2009, 1:59:32 UTC

I traced through your aborted workunits to the information BOINC maintains about your computer. It thinks you have 8 CPU cores and 4 GB of memory. I've found that at least under Vista SP1, you need at least 1 GB of memory per CPU core to run minirosetta, and some of those WCG programs, at full speed. Is your motherboard capable of handling another 4 GB of memory, and can you afford it?

WARNING - don't use the usual Crucial program for telling how much memory your motherboard can handle under any 64-bit operating system, unless you're prepared for an immediate operating system crash. It seems that program hadn't been adequately tested under 64-bit operating systems the last time I tried it on my 64-bit machine.

By default, BOINC won't use more than about half the available memory unless you go out of your way to tell it that it can. Therefore, the fact that it currently isn't using much more than half doesn't mean that BOINC has enough for the workload you're giving it. To check, try setting BOINC to use no more than half your CPU cores at a time, and see how much this changes the speed at which minirosetta workunits run.

On my 2 CPU core 32-bit machine, I found that either the default of 10 GB disk space or the default settings for how much swap space BOINC could use weren't enough. It's hard for me to tell which, because I changed them both at once.

I'd suggest that you change only one of these at a time, and record what the effects are:

1. Increase the disk space to 10 GB times the number of CPU cores. Expect BOINC to divide the allowed swap space equally among all the BOINC projects it's been told to connect to, before deciding how much to allocate to each workunit. Therefore, some BOINC projects can run short of swap space, while others aren't using all they're allocated.

2. Allow BOINC to use a higher percentage of the swap space, since BOINC is probably all you're running on that machine that needs much swap space, and Vista will base the total size of the swap space on how much of it is used.

Note that when the number of apparent CPU cores has been doubled by hyperthreading, you cannot run all of the new number at full speed at the same time, and BOINC has problems judging how much CPU time is used on one of a hyperthreaded pair when the other member of the pair is also in use. Therefore, don't expect hyperthreading to increase your total throughput very much over using only one member of each hyperthreaded pair.
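
The two rules of thumb in this post (at least 1 GB of memory per CPU core, and 10 GB of disk space times the number of cores) can be checked with a rough sketch like the one below. The function names are hypothetical and nothing here queries BOINC; the constants are simply the figures suggested above, not official requirements.

```python
# Rule-of-thumb figures taken from the post above, not from BOINC itself.
GB_RAM_PER_CORE = 1     # "at least 1 GB of memory per CPU core"
GB_DISK_PER_CORE = 10   # "10 GB times the number of CPU cores"

def ram_shortfall_gb(cores, ram_gb):
    """How many GB of RAM short of the rule of thumb a machine is (0 if none)."""
    return max(0, cores * GB_RAM_PER_CORE - ram_gb)

def suggested_disk_gb(cores):
    """Disk allowance suggested by the rule of thumb."""
    return cores * GB_DISK_PER_CORE

# The machine discussed in this thread: 8 cores, 4 GB of memory.
print(ram_shortfall_gb(8, 4))
print(suggested_disk_gb(8))
```

For the 8-core, 4 GB machine this gives a 4 GB shortfall, which is where the question about adding another 4 GB comes from.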
ID: 61483 · Rating: 0
Toby Broom
Joined: 15 Oct 08
Posts: 11
Credit: 18,732,062
RAC: 1
Message 61496 - Posted: 31 May 2009, 11:10:04 UTC

The Vista SP1 machine seems fine; it only has 4 cores and 4 GB of RAM.

I'll keep an eye out for some more RAM; the 8-core machines can take 8 GB easily.

I've upped the 10 GB of disk space and will see how it goes. If it doesn't drop the error rate, then I'll change the swap setting.

The older Xeons don't have hyperthreading, so no worries there.
ID: 61496 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org