Problems with Minirosetta v1.54

Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 59688 - Posted: 20 Feb 2009, 18:31:56 UTC
Last modified: 20 Feb 2009, 18:33:48 UTC

robertmiles, if you were directing the question to me, I try to stay out of that one. I am only recommending a change of BOINC version because problems are occurring with the version installed now. I know we've seen many work-fetch and DCF problems reported on the 6.6 series (which is the current test version), and I think the 6.4 series introduced those problems. So, if it were me, I'd try the 6.2.19 shown at the link below. I myself am on 6.2.18 and running well on WinXP. (Nothing against 6.2.28, but it's not listed anymore for some reason.)

You can see more BOINC versions for download on this page:
http://boinc.berkeley.edu/download_all.php
Rosetta Moderator: Mod.Sense

TimL
Joined: 16 Sep 06
Posts: 16
Credit: 13,614,275
RAC: 3,095
Message 59723 - Posted: 22 Feb 2009, 9:59:14 UTC

Hi all,
loopbuild_mamaln_ideal_hb_t305__IGNORE_THE_REST_1zc0_1_7630_19 finished early with error -
Access Violation (0xc0000005) at address 0x7C91AA01 read attempt to address 0x0D1BF548

Haven't had much luck getting errors of late but will mention that I had just bumped the bus speed up a touch when this error occurred.



TomaszPawel
Joined: 28 Apr 07
Posts: 54
Credit: 2,791,145
RAC: 0
Message 59751 - Posted: 23 Feb 2009, 7:06:15 UTC - in response to Message 59045.  


rembertw
Joined: 21 Apr 07
Posts: 14
Credit: 628,529
RAC: 0
Message 59752 - Posted: 23 Feb 2009, 7:50:20 UTC - in response to Message 59688.  

Mod.Sense
And am only recommending a change to BOINC version because problems are occurring with the version installed now.

I set up BOINC 6.4.5 on that computer, and it seems to be running fine with Rosetta. I think I will still hold off on a general upgrade until there are new BOINC versions.

robertmiles
"Current" is for me the version that the actual BOINC site gives as standard. Researching older versions and installing those is too much micromanagement for me. Same with posting on the boards... If this problem gets solved with 6.4.5 (and it seems to be solved) then I'm off again.

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 59756 - Posted: 23 Feb 2009, 14:09:26 UTC - in response to Message 59751.  

Hi:

https://boinc.bakerlab.org/rosetta/result.php?resultid=229237620
https://boinc.bakerlab.org/rosetta/result.php?resultid=229237620
https://boinc.bakerlab.org/rosetta/result.php?resultid=229237514
https://boinc.bakerlab.org/rosetta/result.php?resultid=229145242
https://boinc.bakerlab.org/rosetta/result.php?resultid=228892067
https://boinc.bakerlab.org/rosetta/result.php?resultid=228820491
https://boinc.bakerlab.org/rosetta/result.php?resultid=228820477

Any tips?


Looks like all of these were the ss-neg-1i17s that most people have been having trouble with. It seems to be something specific to the 1i17; the other ss-neg's do not seem to be having any trouble.

The exception is the last one on your list, which got a
"Too many restarts with no progress. Keep application in memory while preempted."
error. Perhaps you rebooted your machine several times in a row to install fixes or something?
Rosetta Moderator: Mod.Sense

Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 59761 - Posted: 23 Feb 2009, 18:49:59 UTC

-161 error on 230728890

RodrigoPS
Joined: 28 Nov 08
Posts: 3
Credit: 1,265,570
RAC: 30
Message 59782 - Posted: 24 Feb 2009, 22:01:20 UTC

I noticed that with minirosetta 1.54 the granted credit was very low on the Athlon X2 processors, sometimes half the claimed credit. This did not occur with the single-core Athlon.

RodrigoPS
Joined: 28 Nov 08
Posts: 3
Credit: 1,265,570
RAC: 30
Message 59834 - Posted: 27 Feb 2009, 0:05:01 UTC - in response to Message 59782.  

I noticed that with the minirosetta 1.54 the granted credit was very low in the Athlon X2 processors - sometimes half the claimed credit. This did not occur with the single core Athlon.


Problem solved. Updating the motherboard BIOS (F8 > F9) caused a considerable loss of performance on the PCs with Athlon X2 processors. Restoring BIOS F8 returned the system to normal.

Mike*
Joined: 16 Feb 09
Posts: 5
Credit: 102,030
RAC: 0
Message 59835 - Posted: 27 Feb 2009, 1:49:33 UTC

Hi all,
Had the below error show up.
I initially DLd 3 WU, the first 2 bombed, I aborted the 3rd.. I then detached, re-attached, then DLed 11 new ones.

Every one of them went south..

Boinc mgr is 6.2.18

Free disk is 88g
Used by boinc is 4.81
Use at most 100g
Leave 0
Use up to 50% disk
Leave apps in memory.
Only other project (which was suspended) was CPDN at 55% @ 1004 hrs (do not want to lose this)

My host is 1008545 (should be viewable)

At this point, I will wait till next week (SIMAP starting soon with its monthly run :)) and will try again.
Don't want to keep trashing WUs for no reason.

I do have the messages from boinc stored if they would be useful, but here is one thing I see, but it may only be due to the process crashing:

2/26/2009 8:04:04 PM|rosetta@home|Starting lr8_A_score12_rlbd_2ci2_IGNORE_THE_REST_DECOY_SAVE_ALL_OUT_7089_1093_0
2/26/2009 8:04:05 PM|rosetta@home|Starting task lr8_A_score12_rlbd_2ci2_IGNORE_THE_REST_DECOY_SAVE_ALL_OUT_7089_1093_0 using minirosetta version 154
2/26/2009 8:04:19 PM|rosetta@home|Computation for task lr8_A_score12_rlbd_2ci2_IGNORE_THE_REST_DECOY_SAVE_ALL_OUT_7089_1093_0 finished
2/26/2009 8:04:19 PM|rosetta@home|Output file lr8_A_score12_rlbd_2ci2_IGNORE_THE_REST_DECOY_SAVE_ALL_OUT_7089_1093_0_0 for task lr8_A_score12_rlbd_2ci2_IGNORE_THE_REST_DECOY_SAVE_ALL_OUT_7089_1093_0 absent
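For anyone wanting to check how long a task ran before it failed, the timestamps in these client messages can be compared directly. A minimal Python sketch, assuming every line follows the `date|project|message` layout seen above; `elapsed_seconds` is a hypothetical helper for illustration, not part of BOINC:

```python
from datetime import datetime

# Timestamp layout used in the messages above, e.g. "2/26/2009 8:04:05 PM"
STAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"

START = ("2/26/2009 8:04:05 PM|rosetta@home|Starting task "
         "lr8_A_score12_rlbd_2ci2_IGNORE_THE_REST_DECOY_SAVE_ALL_OUT_7089_1093_0 "
         "using minirosetta version 154")
END = ("2/26/2009 8:04:19 PM|rosetta@home|Computation for task "
       "lr8_A_score12_rlbd_2ci2_IGNORE_THE_REST_DECOY_SAVE_ALL_OUT_7089_1093_0 "
       "finished")


def parse_line(line):
    """Split one message into (timestamp, project, text)."""
    stamp, project, text = line.split("|", 2)
    return datetime.strptime(stamp, STAMP_FORMAT), project, text


def elapsed_seconds(start_line, end_line):
    """Seconds between the timestamps of two messages."""
    started, _, _ = parse_line(start_line)
    finished, _, _ = parse_line(end_line)
    return (finished - started).total_seconds()


print(elapsed_seconds(START, END))
```

For the two lines quoted here the difference is 14 seconds, which is consistent with the task failing shortly after launch rather than partway through a model.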

Thanks

mike

(extra blank lines removed)
<core_client_version>6.2.18</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
BOINC:: Initializing ... ok.
[2009- 2-26 20:10: 2:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing core...
Initializing options.... ok
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C910193 write attempt to address 0x009882EA
Engaging BOINC Windows Runtime Debugger...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C910193 write attempt to address 0x0040118E
Engaging BOINC Windows Runtime Debugger...
</stderr_txt>
]]>
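The two numbers in the `<message>` line are the same value written two ways: the exit status is shown as a signed 32-bit integer, and 0xc0000005 is its unsigned hexadecimal form, the same access-violation code that appears in the stderr text. A small Python sketch of the conversion; the function name is made up for illustration:

```python
def exit_code_to_hex(exit_code):
    """Reinterpret a signed 32-bit exit code as an unsigned hex status."""
    # Masking with 0xFFFFFFFF maps negative values onto their
    # two's-complement 32-bit representation.
    return "0x{:08X}".format(exit_code & 0xFFFFFFFF)


print(exit_code_to_hex(-1073741819))  # 0xC0000005
```

The same conversion can be applied to other negative exit codes reported for Windows hosts in this thread to see which status they correspond to.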


robertmiles
Joined: 16 Jun 08
Posts: 1061
Credit: 11,672,942
RAC: 7,736
Message 59839 - Posted: 27 Feb 2009, 6:05:40 UTC - in response to Message 59835.  
Last modified: 27 Feb 2009, 6:12:07 UTC

A few questions that may help pin down the problem:

Are you able to find BOINC 6.2.28, and willing to upgrade to it? That's the only version I have used since 5.10.45, and I don't have that problem.

Have you gone to any extra effort to tell BOINC that it could use more virtual memory than the default?

Have you gone to any extra effort to tell your copy of Windows to allow a bigger swap file than the default?

How many BOINC projects do you have your BOINC Manager set up to recognize? I've seen some signs, so far rather indistinct, that BOINC divides the disk space it is allowed to use into equal sections for each BOINC project it recognizes before it starts dividing those sections into smaller subsections for each workunit. Therefore, if one BOINC project is heavy on disk space use, workunits for that project might run out of disk space even if some other BOINC project doesn't need all that is reserved for it.

Does this site tell you how much memory your machine has now and what the maximum for that model of computer is?

http://www.crucial.com/

I had problems getting my dual-core CPU to run two Rosetta@home workunits at the same time back when I had only 1 GB of memory to share between Vista and the two workunits, so I ordered an upgrade to the 2 GB maximum my model of computer can handle; now I can run two such workunits at once even while typing this.

TomaszPawel
Joined: 28 Apr 07
Posts: 54
Credit: 2,791,145
RAC: 0
Message 59845 - Posted: 27 Feb 2009, 10:25:55 UTC - in response to Message 59756.  
Last modified: 27 Feb 2009, 10:26:31 UTC

Except for your last one on the list, it got a
"Too many restarts with no progress. Keep application in memory while preempted."
error. Perhaps you rebooted your machine several times in a row to install fixes or something?

Right, the last one was a multi-fix update from our "beloved" Microsoft....

Mike*
Joined: 16 Feb 09
Posts: 5
Credit: 102,030
RAC: 0
Message 59846 - Posted: 27 Feb 2009, 11:48:29 UTC - in response to Message 59839.  

The odd thing is that I had successfully finished 3 models a few days ago, and a couple before that (can't remember the version offhand, only 1 WU at a time), with no issues. I am attached to 7 projects but am not running them all. (I set the projects to NNT and keep a small buffer so as to not have to worry about having too much. Yeah, I know BOINC manages it, but I want to make sure everything gets done quickly.)
When you mentioned BOINC dividing the disk space, it made me wonder whether I had the non-active projects suspended, which I usually have done in the past..
I will retry after I get through the SIMAP run (this is why I keep the tasks low), making sure my buffer is small so that I hopefully do not grab 11 tasks.


Thanks

Mike



TomaszPawel
Joined: 28 Apr 07
Posts: 54
Credit: 2,791,145
RAC: 0
Message 59847 - Posted: 27 Feb 2009, 12:59:41 UTC
Last modified: 27 Feb 2009, 13:02:04 UTC

Another bug:
https://boinc.bakerlab.org/rosetta/result.php?resultid=231152575
loopbuild_reference_allmodels_hb_t360

rembertw
Joined: 21 Apr 07
Posts: 14
Credit: 628,529
RAC: 0
Message 59848 - Posted: 27 Feb 2009, 13:33:44 UTC - in response to Message 59655.  

[Mod.Sense]
I still have not seen anyone else reporting such a problem, and you've got a score of other hosts running fine.


Last update: everything seems to be ok after I updated the Boinc version to 6.4.5. The exact reason for the 0% progress with Mini Rosetta is still a mystery but at least that computer is crunching again.

robertmiles
Joined: 16 Jun 08
Posts: 1061
Credit: 11,672,942
RAC: 7,736
Message 59850 - Posted: 27 Feb 2009, 13:49:41 UTC - in response to Message 59846.  

Another question that may help pin down the problem:

Did you have graphics enabled at any time during those runs? When I run minirosetta 1.58 for RALPH@home, it completes successfully if I never enable graphics, but fails if I have graphics enabled for a short time during the run.

Yaroslav Isakov
Joined: 2 Nov 07
Posts: 11
Credit: 98,027
RAC: 0
Message 59858 - Posted: 27 Feb 2009, 16:05:56 UTC

Another bunch of Hbond tripped errors:
hw_mamaln_t290_3_hb_1xyh__IGNORE_THE_REST_1ihg_1_SAVE_ALL_OUT_7736_375_0
hw_mamaln_t290_3_hb_1ihg__IGNORE_THE_REST_1cyn_1_SAVE_ALL_OUT_7729_256_0
hw_mamaln_t290_3_hb_t290__IGNORE_THE_REST_1zkc_1_SAVE_ALL_OUT_7743_255_0
hw_mamaln_t290_3_hb_t290__IGNORE_THE_REST_1xwn_1_SAVE_ALL_OUT_7743_255_0

The first three of them have a valid status and show:
ERROR: dis==0 in pairtermderiv!
ERROR:: Exit from: src/core/scoring/methods/PairEnergy.cc line: 338
called boinc_finish

Mike*
Joined: 16 Feb 09
Posts: 5
Credit: 102,030
RAC: 0
Message 59867 - Posted: 27 Feb 2009, 23:34:09 UTC - in response to Message 59850.  
Last modified: 28 Feb 2009, 0:10:54 UTC

No, did not have the graphics running; the process crashed immediately upon startup (or at least within a few seconds).

Interesting thing..

Normally I only have 1 to 3 projects un-suspended at one time. This time I had more than that un-suspended, but set to No new tasks..
I suspended ALL projects, shut down, and re-booted.
Started up BOINC, set it to not keep projects in memory and 50% CPU (use the 1 core, non-HT), unsuspended Rosetta, allowed new tasks, hit update. It gave me 6, and then I let it do its thing..
Guess what.. no issues..
I suspended 5 of the tasks to let the 1 run.
I also re-adjusted to 100% to use HT, re-started Docking, and had several Docking and 1 Rosetta finish..

Might be due to allocating memory among the active projects..

Am wondering if any of the other bugs I saw here are the same issue with too many "active projects".
The programmer in me suspects that.. Not knowing what goes on inside BOINC, I could not tell (besides, I don't do C++ or later).

Thanks for the "insight"..
Mike

p.s. added answer on graphics and spellings.

Yaroslav Isakov
Joined: 2 Nov 07
Posts: 11
Credit: 98,027
RAC: 0
Message 59874 - Posted: 28 Feb 2009, 15:43:44 UTC

Very long WU (25000 seconds), probably ended by timeout (intended runtime + 4 hours):
wt_ub_BOINC_ABRELAX_3MERS_NOHOMS_t482_SAVE_ALL_OUT_IGNORE_THE_REST-S25-3-S3-3--wt_ub-_7707_42783_0

It slows down at about 90%, and I can see in the graphics that it spends about 4 hours in the SmallMoverMoverBase+Minimization stage.

And it's also a Hbond tripped result :(
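If the cutoff really is the intended runtime plus 4 hours, as guessed above, the intended runtime can be backed out of the reported total. A rough Python check, treating the 25000-second figure as approximate and the 4-hour grace period as an assumption taken from this post:

```python
# Assumption from the post: the task is ended at intended runtime + 4 hours.
GRACE_SECONDS = 4 * 3600


def implied_target_hours(total_seconds):
    """Intended runtime in hours implied by a run that hit the cutoff."""
    return (total_seconds - GRACE_SECONDS) / 3600


print(round(implied_target_hours(25000), 2))  # 2.94
```

That works out to just under 3 hours; the actual runtime preference for this host is not stated in the thread.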

Evan
Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 59877 - Posted: 28 Feb 2009, 16:53:46 UTC - in response to Message 59874.  

Very long WU (25000 seconds), probably ended by timeout (intended runtime + 4 hours):

And it's also a Hbond tripped result :(


This one is interesting as it was completed successfully by a second computer in less than half the time and both were run on Linux machines.


Yaroslav Isakov
Joined: 2 Nov 07
Posts: 11
Credit: 98,027
RAC: 0
Message 59878 - Posted: 28 Feb 2009, 18:25:30 UTC - in response to Message 59877.  

Very long WU (25000 seconds), probably ended by timeout (intended runtime + 4 hours):

And it's also a Hbond tripped result :(


This one is interesting as it was completed successfully by a second computer in less than half the time and both were run on Linux machines.


Maybe it's because I have a 64-bit Linux?
©2021 University of Washington
https://www.bakerlab.org