Problems with Minirosetta 1.80

Yifan Song
Volunteer moderator
Project developer
Project scientist

Joined: 26 May 09
Posts: 62
Credit: 7,322
RAC: 0
Message 61886 - Posted: 22 Jun 2009, 19:42:37 UTC

In this version:
New protein-protein docking protocol.
New rotamer library.
ID: 61886 · Rating: 0
nick n

Joined: 26 Aug 07
Posts: 49
Credit: 219,102
RAC: 0
Message 61918 - Posted: 24 Jun 2009, 14:24:11 UTC
Last modified: 24 Jun 2009, 14:26:12 UTC

I am getting a lot of errors on my Mac. I have tried resetting, detaching, and reattaching, to no avail. Here are a few WU examples:

https://boinc.bakerlab.org/rosetta/result.php?resultid=261129500
https://boinc.bakerlab.org/rosetta/result.php?resultid=261082154
https://boinc.bakerlab.org/rosetta/result.php?resultid=261052997
https://boinc.bakerlab.org/rosetta/result.php?resultid=261042205
https://boinc.bakerlab.org/rosetta/result.php?resultid=260869175
https://boinc.bakerlab.org/rosetta/result.php?resultid=260866803
https://boinc.bakerlab.org/rosetta/result.php?resultid=260840258
ID: 61918 · Rating: 0
Bill Hepburn

Joined: 18 Sep 05
Posts: 14
Credit: 14,588,658
RAC: 3,963
Message 61924 - Posted: 24 Jun 2009, 18:08:57 UTC

I have now had three tasks end with a "compute error" when they were almost finished. I don't think the problem is on my end: they ran on two different computers (one XP Pro, one Win Server 2003). Two of them have been reissued, and the second person's attempt errored out too. The last one has just gone out again. Other 1.80 tasks run fine, and other projects are running fine.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=238330815
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=238113829
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=238093549
ID: 61924 · Rating: 0
RC

Joined: 27 Sep 05
Posts: 13
Credit: 262,048
RAC: 0
Message 61925 - Posted: 24 Jun 2009, 22:06:48 UTC - in response to Message 61924.  

I have also had a couple of failures on a Mac. In both cases the run time was less than 10 minutes:

https://boinc.bakerlab.org/rosetta/result.php?resultid=261100252
https://boinc.bakerlab.org/rosetta/result.php?resultid=261064311

ID: 61925 · Rating: 0
Snags

Joined: 22 Feb 07
Posts: 198
Credit: 2,790,804
RAC: 530
Message 61929 - Posted: 25 Jun 2009, 12:02:58 UTC

lb_cutback_all_multi_hb_t290__IGNORE_THE_REST_1LOPA_7_12941_28_0

Outcome = Success and Validate state = valid but

cpu time = 1637.58 secs and

no models appear in the stderr out but this does:

Hbond tripped: [2009- 6-25 5:28: 3:]

ERROR: dis==0 in pairtermderiv!
ERROR:: Exit from: src/core/scoring/methods/PairEnergy.cc line: 334
called boinc_finish




Snags
ID: 61929 · Rating: 0
slamb

Joined: 19 Oct 05
Posts: 2
Credit: 2,050,032
RAC: 0
Message 61930 - Posted: 25 Jun 2009, 12:24:06 UTC

Running out of work. Can't get any more work to download.
ID: 61930 · Rating: 0
nick n

Joined: 26 Aug 07
Posts: 49
Credit: 219,102
RAC: 0
Message 61940 - Posted: 25 Jun 2009, 18:13:17 UTC
Last modified: 25 Jun 2009, 18:16:55 UTC

Now just about everything is failing. I am going to leave for a while if this isn't fixed soon.....
ID: 61940 · Rating: 0
Greg_BE

Joined: 30 May 06
Posts: 5658
Credit: 5,633,150
RAC: 945
Message 61941 - Posted: 25 Jun 2009, 19:09:02 UTC - in response to Message 61930.  

Running out of work. Can't get any more work to download.


It seems the work server waits until you have completed, or are a long way into, your last running tasks before it sends new work.
I have seen this happen a lot lately.
I was down to my last 2 tasks (1 per core) and running them when I got my huge quota (current + 5 days extra) of new work.

See if that is happening on your system.
ID: 61941 · Rating: 0
TomaszPawel

Joined: 28 Apr 07
Posts: 54
Credit: 2,791,145
RAC: 0
Message 61945 - Posted: 25 Jun 2009, 20:19:56 UTC - in response to Message 61941.  

I found a "bug".

This WU earned only 84.37 credits but ran for 22,555.02 sec.

This WU earned 84.34 credits and ran for 10,652.67 sec.

So is it a bug, or is it normal that I get the same credit for a WU that ran 2x longer?
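
One way to put the two results side by side is credit per CPU hour. The sketch below is only an illustration built from the two figures quoted in this post; the function name `credit_per_hour` is made up for the example, and it says nothing about how Rosetta@home actually grants credit.

```python
def credit_per_hour(credit: float, cpu_seconds: float) -> float:
    """Granted credit normalized to one hour of CPU time."""
    if cpu_seconds <= 0:
        raise ValueError("cpu_seconds must be positive")
    return credit / cpu_seconds * 3600.0


# (granted credit, CPU seconds) copied from the two WUs quoted in the post
long_wu = (84.37, 22555.02)
short_wu = (84.34, 10652.67)

long_rate = credit_per_hour(*long_wu)
short_rate = credit_per_hour(*short_wu)
runtime_ratio = long_wu[1] / short_wu[1]

print(f"long WU:  {long_rate:.2f} credits/hour")
print(f"short WU: {short_rate:.2f} credits/hour")
print(f"runtime ratio: {runtime_ratio:.2f}x")
```

With these numbers the longer WU works out to roughly 13.5 credits per hour against roughly 28.5 for the shorter one, which is the discrepancy being asked about.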

WWW of Polish National Team - Join! Crunch! Win!
ID: 61945 · Rating: 0
P . P . L .

Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 61948 - Posted: 26 Jun 2009, 5:38:09 UTC

Hi.

This one ran for over ten hours against my six-hour target runtime, then fell over. NOT GOOD.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=238272279

Fri 26 Jun 2009 14:59:27 EST|rosetta@home|Output file lb_cutback_all_multi_hb_t325__IGNORE_THE_REST_1ZZMA_12_12955_12_0_0 for task lb_cutback_all_multi_hb_t325__IGNORE_THE_REST_1ZZMA_12_12955_12_0 absent

<error_code>-161</error_code>

pete.

ID: 61948 · Rating: 0
adrianxw

Joined: 18 Sep 05
Posts: 650
Credit: 11,632,350
RAC: 1,054
Message 61950 - Posted: 26 Jun 2009, 9:15:14 UTC
Last modified: 26 Jun 2009, 9:18:33 UTC

I don't know if this is the right place, but I have set 6 hours as the target runtime, and this WU has now been running for 54:05:12 and claims to be 15.255% complete. I have suspended the task pending comment. It claims to have 88:17:24 to completion.

<edit>

Mini Rosetta 1.80, Windows XP, BOINC 6.6.20.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 61950 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 61951 - Posted: 26 Jun 2009, 13:05:41 UTC

adrianxw, please click the task in the task list and click the Properties button. Does it show more than 10 hours of CPU time as well? (I ask because the task list now shows "elapsed time" with the new BOINC version.)

If you unsuspend the task (and get it running again, perhaps by suspending other tasks for a moment), is it using CPU time?

If it has more than 10 hours of actual CPU time, I would suggest aborting the task.
Rosetta Moderator: Mod.Sense
ID: 61951 · Rating: 0
Venturini Dario[VENETO]

Joined: 25 May 07
Posts: 22
Credit: 245,028
RAC: 0
Message 61952 - Posted: 26 Jun 2009, 13:47:17 UTC - in response to Message 61951.  

adrianxw, please click the task in the task list and click the Properties button. Does it show more than 10 hours of CPU time as well? (I ask because the task list now shows "elapsed time" with the new BOINC version.)

If you unsuspend the task (and get it running again, perhaps by suspending other tasks for a moment), is it using CPU time?

If it has more than 10 hours of actual CPU time, I would suggest aborting the task.


I also have a WU that got stuck; luckily I noticed after just 4 hours.

Here's a screenshot of the properties of that WU. As you can see, CPU time is just over 1 hour while run time is over 4 hours.



Suspending and resuming did not unstick it until I unchecked "keep WU's in memory when suspended". After that, suspending and resuming made it continue from the percentage it had reached before it stopped (43.43%).
ID: 61952 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 61953 - Posted: 26 Jun 2009, 14:18:40 UTC
Last modified: 26 Jun 2009, 14:20:54 UTC

Venturini, are you allowing BOINC to use 100% of CPU? And all of the available CPUs? Is the machine busy working on other applications that are running?
Rosetta Moderator: Mod.Sense
ID: 61953 · Rating: 0
adrianxw

Joined: 18 Sep 05
Posts: 650
Credit: 11,632,350
RAC: 1,054
Message 61955 - Posted: 26 Jun 2009, 14:44:52 UTC
Last modified: 26 Jun 2009, 14:49:29 UTC

The "Properties" box shows "CPU time" 00:58:34, "CPU time at last checkpoint" also 00:58:34, "Elapsed time" 54:05:12, and "Estimated time remaining" 88:17:24.

Resuming the task, it started running in "High priority" mode.

I think I would have noticed if it had really been sitting there for a couple of days. In the time it has taken to write this, the percentage complete has risen to 18.012% and the estimated time to completion has dropped to 83:58:43. Something weird is going on there. I'll leave it running for the moment at least.
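
The gap between the two clocks can be quantified with a few lines of Python. This is a rough sketch, not a BOINC tool: the helper names are invented for illustration, the times are the ones quoted in this post, and the 10-hour figure is simply the abort threshold Mod.Sense suggested earlier in the thread.

```python
CPU_LIMIT_HOURS = 10  # abort threshold suggested by Mod.Sense in this thread


def hms_to_seconds(text: str) -> int:
    """Convert an 'HH:MM:SS' string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def cpu_fraction(cpu_time: str, elapsed: str) -> float:
    """Share of the elapsed (wall-clock) time that was actual CPU time."""
    return hms_to_seconds(cpu_time) / hms_to_seconds(elapsed)


def should_abort(cpu_time: str) -> bool:
    """True if actual CPU time exceeds the suggested 10-hour limit."""
    return hms_to_seconds(cpu_time) > CPU_LIMIT_HOURS * 3600


# Values from the Properties box quoted in the post
fraction = cpu_fraction("00:58:34", "54:05:12")
print(f"CPU time is {fraction:.1%} of elapsed time")
print("abort suggested:", should_abort("00:58:34"))
```

For this task the CPU time is under 2% of the elapsed time and well below 10 hours, so by that rule of thumb it is a stalled or mis-timed task rather than a runaway one.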
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 61955 · Rating: 0
Venturini Dario[VENETO]

Joined: 25 May 07
Posts: 22
Credit: 245,028
RAC: 0
Message 61956 - Posted: 26 Jun 2009, 15:40:14 UTC - in response to Message 61953.  

Venturini, are you allowing BOINC to use 100% of CPU? And all of the available CPUs? Is the machine busy working on other applications that are running?


All of the cores (2) are dedicated to BOINC, both running 100%, and the only other application running is Word (I'm writing schemes for my next university exams) plus the background ones (antivirus and so on) ;)

Plus, I have only Rosetta on this PC (and WCG, but it's set to no new tasks).

OS is Windows Vista Home Premium, BOINC is 6.6.28, CPU is an Intel 7700.

And, btw, call me Dario, Venturini is my surname ;)
ID: 61956 · Rating: 0
PinkPenguin

Joined: 26 Apr 09
Posts: 5
Credit: 280,676
RAC: 0
Message 61957 - Posted: 26 Jun 2009, 15:41:03 UTC

Reporting a couple of -161 errors encountered at the end of lb_cutback_all_multi_hb work units which appear to have completed OK.

On Windows Vista (Intel Core Duo 2GHz) - BOINC 6.6.36 / Rosetta 1.80:
https://boinc.bakerlab.org/rosetta/result.php?resultid=261371341

On Linux Fedora v10 (Intel Pentium 4 3.00GHz) - BOINC 6.4.7 / Rosetta 1.80:
https://boinc.bakerlab.org/rosetta/result.php?resultid=261035946
In this case the other task with the same workunit (238257150) completed without errors.

I noticed that there are similar reports earlier in this thread (see also message 61948 from P.P.L.).

This may be similar to a series of lb_thread_all_multi errors reported earlier this month.

All the best,
Richard

ID: 61957 · Rating: 0
Chris Down

Joined: 19 Jun 09
Posts: 1
Credit: 11,750
RAC: 0
Message 61960 - Posted: 26 Jun 2009, 16:25:55 UTC

Also experiencing some compute errors and strange completion times. Seems to be ignoring my settings, too.
ID: 61960 · Rating: 0
Venturini Dario[VENETO]

Joined: 25 May 07
Posts: 22
Credit: 245,028
RAC: 0
Message 61961 - Posted: 26 Jun 2009, 18:00:40 UTC - in response to Message 61956.  

Venturini, are you allowing BOINC to use 100% of CPU? And all of the available CPUs? Is the machine busy working on other applications that are running?


All of the cores (2) are dedicated to BOINC, both running 100%, and the only other application running is Word (I'm writing schemes for my next university exams) plus the background ones (antivirus and so on) ;)

Plus, I have only Rosetta on this PC (and WCG, but it's set to no new tasks).

OS is Windows Vista Home Premium, BOINC is 6.6.28, CPU is an Intel 7700.

And, btw, call me Dario, Venturini is my surname ;)


Here you go: completed, reported, and validated successfully.

https://boinc.bakerlab.org/rosetta/result.php?resultid=261619500
ID: 61961 · Rating: 0
Rayburner

Joined: 4 Oct 05
Posts: 32
Credit: 16,518,823
RAC: 0
Message 61976 - Posted: 27 Jun 2009, 16:24:57 UTC
Last modified: 27 Jun 2009, 16:25:27 UTC

compute error after 4 hours

https://boinc.bakerlab.org/rosetta/result.php?resultid=261844121

real_core_1.5_low200_beta_low200_start_hb_t374__IGNORE_THE_REST_13119_137_0
ID: 61976 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org