Problems with Minirosetta v1.54

Message boards : Number crunching : Problems with Minirosetta v1.54

root
Joined: 16 Feb 09
Posts: 6
Credit: 24,387
RAC: 0
Message 59931 - Posted: 2 Mar 2009, 20:34:01 UTC

I'm getting this same error for nearly all WUs on two Linux boxes running FC8 and FC9 with kernels 2.6.23.1-42.fc8 and 2.6.25.14-108.fc9.x86_64, respectively.

In addition, I have a third Linux laptop running FC9 with no problems whatsoever. All 3 machines are running with leave_apps_in_memory=0.

<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>

Any ideas?
ID: 59931 · Rating: 0
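
For reference, the three numbers in "process exited with code 193 (0xc1, -63)" are the same value written three ways: decimal, hexadecimal, and the signed 8-bit reading. The small sketch below only shows that arithmetic; the function name is made up for illustration, and it says nothing about what caused the exit.

```python
def describe_exit_code(code):
    """Return (hex form, signed 8-bit reading) of an exit code's low byte."""
    byte = code & 0xFF                      # keep only the low 8 bits
    signed = byte - 256 if byte >= 128 else byte
    return hex(byte), signed


print(describe_exit_code(193))
```

So 193, 0xc1 and -63 all describe one status byte, not three separate errors.
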
hedera
Joined: 15 Jul 06
Posts: 67
Credit: 4,429,147
RAC: 2,022
Message 59948 - Posted: 3 Mar 2009, 19:46:19 UTC

I've had two Windows error messages from Rosetta in the last couple of days. This is on a Win XP Pro SP2 system. The last one was this morning. I looked at my results today and this WU crashed at 15:13:50 UTC:

2p64__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2p64_-native_frag2__7622_431

Checking my message log, I found these messages:

03/03/2009 6:00:54 AM|rosetta@home|Restarting task 2p64__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2p64_-native_frag2__7622_431_2 using rosetta_beta version 598
03/03/2009 6:01:41 AM|rosetta@home|Task 2p64__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2p64_-native_frag2__7622_431_2 exited with zero status but no 'finished' file
03/03/2009 6:01:41 AM|rosetta@home|If this happens repeatedly you may need to reset the project.


Identical messages repeated until 7:12 AM when I got this:

03/03/2009 7:12:14 AM|rosetta@home|Computation for task 2p64__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2p64_-native_frag2__7622_431_2 finished
03/03/2009 7:12:14 AM|rosetta@home|Output file 2p64__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2p64_-native_frag2__7622_431_2_0 for task 2p64__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2p64_-native_frag2__7622_431_2 absent


If you look at the task details for WU 209583003 on computer 272841, you'll see this error followed by a dump:

<core_client_version>6.2.19</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
# cpu_run_time_pref: 10800
# random seed: 2834914

Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x008BB955 read attempt to address 0x09A9C000

Engaging BOINC Windows Runtime Debugger...

********************


I'm sure it isn't meant to do this...
--hedera

Never be afraid to try something new. Remember that amateurs built the ark. Professionals built the Titanic.

ID: 59948 · Rating: 0
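
The message log lines quoted above follow a "timestamp|project|message" layout, so counting how often a task hit the "no 'finished' file" restart is easy to script. This is a minimal sketch that assumes that three-field layout holds for every line of interest; the function names are hypothetical and the sample lines are the ones from the post.

```python
from collections import Counter

MARKER = "exited with zero status but no 'finished' file"


def parse_line(line):
    """Split 'timestamp|project|message' into three fields, or return None."""
    parts = line.strip().split("|", 2)
    return tuple(parts) if len(parts) == 3 else None


def count_failed_restarts(lines):
    """Count, per task name, how many times the MARKER message appears."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        message = parsed[2]
        if message.startswith("Task ") and MARKER in message:
            task_name = message.split()[1]  # second word is the task name
            counts[task_name] += 1
    return counts


TASK = "2p64__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2p64_-native_frag2__7622_431_2"
sample = [
    "03/03/2009 6:00:54 AM|rosetta@home|Restarting task " + TASK
    + " using rosetta_beta version 598",
    "03/03/2009 6:01:41 AM|rosetta@home|Task " + TASK
    + " exited with zero status but no 'finished' file",
    "03/03/2009 6:01:41 AM|rosetta@home|If this happens repeatedly you may "
    "need to reset the project.",
]
print(count_failed_restarts(sample))
```

Run over a saved copy of the full log, a high count for one task would match the "too many exit(0)s" message in the task details.
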
robertmiles
Joined: 16 Jun 08
Posts: 735
Credit: 9,979,591
RAC: 4,035
Message 59955 - Posted: 3 Mar 2009, 22:53:51 UTC - in response to Message 59948.  

I've had 2 Windows error messages in the last couple of days from Rosetta. This is on a Win XP Pro SP2 system. The last one was this morning. I looked at my results today and this WU has crashed at 15:13:50 UTC:

2p64__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2p64_-native_frag2__7622_431

Checking my message log, I found these messages:

03/03/2009 6:00:54 AM|rosetta@home|Restarting task 2p64__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2p64_-native_frag2__7622_431_2 using rosetta_beta version 598
03/03/2009 6:01:41 AM|rosetta@home|Task 2p64__BOINC_SYMM_FOLD_AND_DOCK_RELAX-2p64_-native_frag2__7622_431_2 exited with zero status but no 'finished' file
03/03/2009 6:01:41 AM|rosetta@home|If this happens repeatedly you may need to reset the project.


Could you check the results uploaded for this one and see if they include any mention of lockfile problems?

Also, a few questions that may help pin down the problem:

1. Do the error messages shown above repeat several times, and do the lockfile error messages if any repeat several times?

2. What version of BOINC are you using?

3. Have you enabled the leave in memory option?

4. What percentage of CPU time do you let BOINC projects use? The 60% setting typical for laptops, the 100% setting typical for desktops, or something else?

5. Did this workunit start with graphics enabled? Did you enable graphics later? Did you then shut down graphics for it?
ID: 59955 · Rating: 0
senatoralex85
Joined: 27 Sep 05
Posts: 66
Credit: 169,644
RAC: 0
Message 59959 - Posted: 4 Mar 2009, 2:43:32 UTC

Once in a while, I get a Microsoft Visual C++ Runtime Library error for minirosetta_1.54_windows_intelx86.exe. The error message reads "This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information."

I received it for this workunit: http://boinc.bakerlab.org/rosetta/result.php?resultid=232499308

I am currently using XP Service Pack 2 with BOINC version 5.10.45.
ID: 59959 · Rating: 0
Speedy
Joined: 25 Sep 05
Posts: 161
Credit: 654,963
RAC: 2,175
Message 59961 - Posted: 4 Mar 2009, 4:14:04 UTC
Last modified: 4 Mar 2009, 4:22:39 UTC

Task ID 232649967 isn't displaying graphics; instead it displays a black window. When I move my cursor around in the black window, a white block of what looks like unreadable text moves around under the cursor. It's a 2vik task. The task finished with a successful outcome.

Task ID 232649968 also isn't displaying graphics, only a black window. When I try to close the black window, an End Program dialog comes up; my options are End Now or Cancel, and I chose End Now. That task also finished with a successful outcome.

I am using XP Pro SP3, fully patched, and BOINC 6.4.7 on a quad 2.66 with 2.87GB RAM. I'm not sure if that will make a difference or not.
Has anybody else had any of the above issues?
Thanks in advance for any information as to why this could be happening.
Have a crunching good day!!
ID: 59961 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 735
Credit: 9,979,591
RAC: 4,035
Message 59967 - Posted: 4 Mar 2009, 15:04:11 UTC
Last modified: 4 Mar 2009, 15:50:43 UTC

The lockfile problem again:

http://boinc.bakerlab.org/rosetta/result.php?resultid=232787694

Starting work on structure: _1NRGA_7_00029
BOINC:: Initializing ... ok.
Can't acquire lockfile - exiting
BOINC:: Initializing ... ok.
Can't acquire lockfile - exiting
BOINC:: Initializing ... ok.
Can't acquire lockfile - exiting
BOINC:: Initializing ... ok.
Can't acquire lockfile - exiting
BOINC:: Initializing ... ok.
Can't acquire lockfile - exiting

Plus many more copies of the same error messages.

I run BOINC 6.2.28 at 95% CPU with the leave in memory option, under Vista SP1. I didn't enable graphics at all for this workunit.

A wingman, apparently with a shorter requested workunit length, completed only 9 decoys, but successfully.
ID: 59967 · Rating: 0
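
To make the repeated "Can't acquire lockfile - exiting" message easier to picture, here is a general illustration of how an exclusive lockfile behaves. It is not BOINC's or Rosetta's actual locking code; the helper names and the lockfile name are hypothetical. The point is only that a second attempt fails for as long as an earlier holder has not released the lock.

```python
import os
import tempfile


def try_acquire(path):
    """Create the lockfile exclusively; return a file descriptor, or None if it exists."""
    try:
        return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None


def release(path, fd):
    """Close the descriptor and delete the lockfile so others can acquire it."""
    os.close(fd)
    os.remove(path)


with tempfile.TemporaryDirectory() as slot_dir:
    lock_path = os.path.join(slot_dir, "hypothetical_lockfile")
    first = try_acquire(lock_path)    # succeeds: no lockfile yet
    second = try_acquire(lock_path)   # fails: first holder still has it
    release(lock_path, first)
    third = try_acquire(lock_path)    # succeeds again after the release
    release(lock_path, third)
    results = (first is not None, second is None, third is not None)

print(results)
```

If an earlier copy of the application never lets go of its lock, every restart would fail the same way, which would be consistent with the repeated messages in the result above.
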
Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 59978 - Posted: 4 Mar 2009, 21:31:15 UTC

@Robert,

A wild question ... did you enable or run graphics for any task for any project?
ID: 59978 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 735
Credit: 9,979,591
RAC: 4,035
Message 59979 - Posted: 4 Mar 2009, 22:16:07 UTC - in response to Message 59978.  

@Robert,

A wild question ... did you enable or run graphics for any task for any project?


Hard to remember. I often go for days without using graphics on any BOINC project these days.

When I was testing whether graphics triggered the problem for minirosetta 1.58 over on RALPH@home, though, only graphics for a 1.58 workunit seemed to trigger it, not graphics for a 1.54 workunit, and the problem appeared only in the 1.58 workunit.

I probably used graphics for purposes unrelated to BOINC projects, though, which hasn't triggered such a problem for me in the past.

ID: 59979 · Rating: 0
Feet1st
Joined: 30 Dec 05
Posts: 1740
Credit: 3,662,462
RAC: 555
Message 60004 - Posted: 6 Mar 2009, 21:50:18 UTC
Last modified: 6 Mar 2009, 21:50:58 UTC

Wow, this one has a 2.15MB result file for 24hrs of crunching. 100K is more what I am used to seeing. Task name is lrfrag_0_8_hb_t308__IGNORE_THE_REST_ 1M2OB_8_7783_69_0
If having a DC project with BOINC is of interest to you, with volunteer or cloud computing resources, but have no time for the BOINC learning curve,
use a hosting service that understands BOINC projects: http://DeepSci.com
ID: 60004 · Rating: 0
rembertw
Joined: 21 Apr 07
Posts: 14
Credit: 628,529
RAC: 0
Message 60077 - Posted: 11 Mar 2009, 17:14:20 UTC

Mod.Sense, another 0% progress Minirosetta task, on another computer: 84:25:24 of time progress against a projected duration of 10:17:32. Windows XP SP3, vintage computer, BOINC version 6.2.18.

The task is now aborted and BOINC upgraded to 6.4.7. Am I still the only one noticing this?
ID: 60077 · Rating: 0
Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3690
Credit: 0
RAC: 0
Message 60078 - Posted: 11 Mar 2009, 17:26:56 UTC
Last modified: 11 Mar 2009, 17:40:23 UTC

rembertw, I haven't heard any other reports. You seem to be translating the screen to English for me, and I appreciate that, but it's never entirely clear what you are referring to. On the English screen, there are three columns of interest, "CPU time", "Progress" (the percentage), and "To completion".

If I understand correctly, you are saying that
CPU time was 84hrs
progress was 0%
and to completion was still 10 more hours?

Is this the task you had to abort?

Now that you have aborted that one, what does the next task show for the "to completion" before it starts?

If the above is the correct task, it looks like the host only has 256MB of memory, and 32MB of that is likely devoted to your graphics card. The current recommendation for running Rosetta is a machine with 512MB of memory or more. (They recently increased that from 256MB when they began running more tasks that require more memory.)

It looks like that machine has been having trouble earning credit for some time. I see you also do work for WCG. I've noticed that the rice project there runs in about 10MB of memory! So, perhaps that would run better on that machine.

I see you have a very large list of projects you do work for. Are all of your hosts using an account manager and dividing their resource share across all 8 projects? You also have 13 machines active, at least for Rosetta. You might want to create a separate account, or a separate venue, to separate your P4 machines from your Core 2s. That way you could have some machines do more work for WCG, for example, and others do more for Rosetta, based on each machine's configuration.
Rosetta Moderator: Mod.Sense
ID: 60078 · Rating: 0
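
The two checks discussed above (a task whose CPU time far exceeds its estimate while progress stays at 0%, and a host whose usable memory falls short of the recommendation) can be sketched as below. The function names and the factor of 2 used as a "stalled" threshold are assumptions for illustration only; the figures are the ones quoted in the posts.

```python
def hms_to_seconds(text):
    """Convert an 'H:MM:SS' string such as '84:25:24' to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def looks_stalled(cpu_time, estimated, progress_percent, factor=2.0):
    """Flag a task with 0% progress whose CPU time exceeds factor x the estimate."""
    return (progress_percent == 0
            and hms_to_seconds(cpu_time) > factor * hms_to_seconds(estimated))


def usable_memory_mb(installed_mb, graphics_reserved_mb):
    """Memory left for applications after the graphics reservation."""
    return installed_mb - graphics_reserved_mb


RECOMMENDED_MB = 512  # recommendation quoted in the post above

print(looks_stalled("84:25:24", "10:17:32", 0))
print(usable_memory_mb(256, 32), usable_memory_mb(256, 32) >= RECOMMENDED_MB)
```

With the reported figures the task counts as stalled, and the host has roughly 224MB usable, well under the 512MB recommendation.
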
alpha
Joined: 4 Nov 06
Posts: 27
Credit: 1,545,892
RAC: 575
Message 60090 - Posted: 12 Mar 2009, 7:03:31 UTC

Visual C++ runtime error with this task after 51,711 seconds:

http://boinc.bakerlab.org/rosetta/result.php?resultid=234626846
ID: 60090 · Rating: 0
rembertw
Joined: 21 Apr 07
Posts: 14
Credit: 628,529
RAC: 0
Message 60095 - Posted: 12 Mar 2009, 11:44:29 UTC - in response to Message 60078.  

If I understand correctly, you are saying that
CPU time was 84hrs
progress was 0%
and to completion was still 10 more hours?

Mod.Sense, you understood correctly. Indeed, I sometimes have to guess how BOINC translated the English into Dutch. And indeed, it was that task.

I have all my computers under GridRepublic. It would simply take too much time to micromanage every single computer, so I don't even try to. Up until now I simply assumed that projects would not give work to computers that did not meet the minimum requirements, so I never bothered checking every project for that. From your reply I take it that Rosetta does not do such a test.

There is indeed a list of projects that I have active, but I never run them all at the same time. Right now there are only 2 projects active with a stable feed (Rosetta, WCG), 2 projects that send WUs when they have them (simap, LHC) and one that is only on a couple of computers, and set to NNT (orbit).

I have now set that last computer to NNT (no new tasks) for Rosetta, since it has a limited configuration.

I realise that this does not belong here, but it would be interesting if there were a manager like GridRepublic or BAM that looks at the connected computers and divides the projects over the available processors. Let's say that for now I have 20 processors available, and Rosetta gets 10% resource share; then Rosetta would get 1 computer with 2 processors working only for Rosetta. All this without having me driving from location to location if I want to change settings. It would also help with matching available memory, available disk space and so on.

Since there is no such thing for now, I'll just go on as I'm used to: if there's a problem, then upgrade BOINC, and set Rosetta to NNT on the older computers. I can even things out a little by increasing the resource share for every computer set to NNT.
ID: 60095 · Rating: 0
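
The allocation idea described above (20 processors, 10% resource share, so one 2-processor computer dedicated to Rosetta) can be sketched as follows. This is not a feature of GridRepublic or BAM; the function, the greedy selection rule, and the host names and processor counts are all hypothetical, chosen only so that the total comes to 20 processors.

```python
def dedicate_hosts(hosts, share_percent):
    """Pick whole hosts for one project so their processors approach its share.

    hosts maps a host name to its processor count. Returns the processor
    target and the list of chosen host names (greedy, largest hosts first).
    """
    total = sum(hosts.values())
    target = round(total * share_percent / 100)
    chosen, remaining = [], target
    for name, cpus in sorted(hosts.items(), key=lambda item: item[1], reverse=True):
        if cpus <= remaining:          # only take hosts that fit entirely
            chosen.append(name)
            remaining -= cpus
    return target, chosen


# Hypothetical fleet: one quad, five dual-processor and six single-processor hosts.
fleet = {"quad_1": 4}
fleet.update({"core2_%d" % i: 2 for i in range(1, 6)})
fleet.update({"p4_%d" % i: 1 for i in range(1, 7)})

print(dedicate_hosts(fleet, 10))
```

A real manager would also need to weigh memory and disk space per host, as the post points out; the sketch only covers the processor arithmetic.
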
Steven Pletsch
Joined: 17 Oct 07
Posts: 17
Credit: 282,298
RAC: 0
Message 60097 - Posted: 12 Mar 2009, 14:24:19 UTC

Ran into a couple errors, both on the same machine.

I really don't know how to make heads or tails of the errors, but there is a lot of information there.

http://boinc.bakerlab.org/rosetta/result.php?resultid=234977428

http://boinc.bakerlab.org/rosetta/result.php?resultid=234904088

I believe it's something with the machine, since I'm not having errors on any others, and it's only been attached for about 24 hours.

I am curious if there is anything in the debug info that might point to a clue as to what is up with it.

Anyone that can provide some insight would be much appreciated.

Thanks
"Every passing hour brings the Solar System forty-three thousand miles closer to Globular Cluster M13 in Hercules -- and still there are some misfits who insist that there is no such thing as progress." - Kurt Vonnegut
ID: 60097 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 735
Credit: 9,979,591
RAC: 4,035
Message 60099 - Posted: 12 Mar 2009, 16:03:34 UTC

A 1.54 workunit that hit the lockfile problem:

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=213726108

A wingman avoided the problem, apparently by choosing a workunit length short enough to end it before my copy hit the problem.
ID: 60099 · Rating: 0
rembertw
Joined: 21 Apr 07
Posts: 14
Credit: 628,529
RAC: 0
Message 60123 - Posted: 13 Mar 2009, 7:25:17 UTC - in response to Message 60078.  

Now that you have aborted that one, what does the next task show for the "to completion" before it starts?

I did not answer this question, but the answer would have been 10:17:32 time to completion if I accepted new tasks from Rosetta on that computer (I do not). On other computers the time to completion values that I get vary from 9:something up to 12:something, depending on the computer. That is, every computer has a different "time to completion", but different tasks on one computer show the same value.
ID: 60123 · Rating: 0
Ivor Cogdell
Joined: 7 Nov 06
Posts: 10
Credit: 15,627
RAC: 0
Message 60165 - Posted: 15 Mar 2009, 17:59:48 UTC

Hi folks,
I try to run Minirosetta 1.54 (Windows XP Home SP3, BOINC Manager 6.4.5, wxWidgets version 2.8.7), but my Kaspersky 2009 Internet Security (Version 8.0.0.0.506) blocks it from running and throws up a black error message. I have tried to view the report, but that does not give me any information on how to rectify the problem.
The standard Rosetta program runs OK (as of a 22 Feb workunit). Any suggestions, please?

Ivor Cogdell
Birmingham, UK
ID: 60165 · Rating: 0
rochester new york
Joined: 2 Jul 06
Posts: 2759
Credit: 1,726,531
RAC: 148
Message 60166 - Posted: 15 Mar 2009, 19:15:53 UTC

ID: 60166 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 735
Credit: 9,979,591
RAC: 4,035
Message 60204 - Posted: 17 Mar 2009, 23:33:24 UTC - in response to Message 60166.  

http://boinc.bakerlab.org/rosetta/results.php?hostid=267483


Looks like you've managed to make your queue of workunits so long you don't return them by the deadline, so other people run them and get credit for them before you do.
ID: 60204 · Rating: 0



©2020 University of Washington
http://www.bakerlab.org