Problems with Rosetta version 5.59

Anime-Addict
Joined: 21 Jan 07
Posts: 1
Credit: 447,569
RAC: 0
Message 39929 - Posted: 27 Apr 2007, 5:05:16 UTC

I have seen both problems: the graphics failing to render strings, and the graphics freezing up after a certain point. I know the graphics take up valuable CPU time, but they give me an excuse to bring my laptop into work at the hangar, under the false pretense of doing collegiate work.
ID: 39929 · Rating: 0
[B^S] BOINC-SG
Joined: 6 Oct 06
Posts: 4
Credit: 101,104
RAC: 0
Message 39986 - Posted: 28 Apr 2007, 11:23:44 UTC

ID: 39986 · Rating: 0
Stephen
Joined: 5 Jun 06
Posts: 23
Credit: 2,570,438
RAC: 0
Message 39998 - Posted: 28 Apr 2007, 15:37:25 UTC

I have a dual 1.8 GHz PPC Macintosh running the latest OS X.

Using Activity Monitor, I see two rosetta@home processes, both named rosetta_5.59_pow. One is using 94% of a CPU, and the other is using about 75% of the other CPU.

In the BOINC Manager window, I see two processes marked as "running". The CPU time, progress, and time-to-completion values only increment for one of the two processes. I speculate that those values for the process that is not incrementing, which seems to be the older one, stopped updating when the younger process started.

This looks like an OS X-specific bug, since my dual-core Athlon Ubuntu Linux client shows updates for two processes. Is there somewhere to report bugs? Thanks.

Stephen
ID: 39998 · Rating: 0
Michael.L
Joined: 12 Nov 06
Posts: 67
Credit: 31,295
RAC: 0
Message 40029 - Posted: 29 Apr 2007, 9:08:46 UTC

29/04/07 02:15:57 Can't rename current state file to previous state file; The process cannot access the file because it is being used by another process (0x20).

This message was repeated at 07:00:27.

Can someone explain??
ID: 40029 · Rating: 0
Computerwiesel
Joined: 18 Sep 05
Posts: 1
Credit: 13,879
RAC: 0
Message 40032 - Posted: 29 Apr 2007, 10:03:20 UTC

Here's another WU that crashed when I clicked on "Show graphics".

https://boinc.bakerlab.org/rosetta/result.php?resultid=75872123

ID: 40032 · Rating: 0
fastdude
Joined: 11 Nov 06
Posts: 10
Credit: 8,764
RAC: 0
Message 40071 - Posted: 29 Apr 2007, 23:59:44 UTC

The laptop went into standby or hibernate last night after finishing some downloads. This morning on startup there was a message about BOINC crashing ("we are sorry", etc.). I am not sure whether the BOINC crash happened on startup or last night.

On relaunching BOINC, the task that had been shown as almost finished (about 80%?) immediately reverted to 0.1%, with about 2:10 of CPU time and 10 hours to go.

The laptop is a (new) Toshiba L30 Celeron 1.6 with no screensaver enabled.
ID: 40071 · Rating: 0
D-Fens
Joined: 19 Mar 07
Posts: 1
Credit: 41,231
RAC: 0
Message 40106 - Posted: 30 Apr 2007, 20:24:42 UTC

Hi
I just finished a WU, and when I uploaded it I got 0 credits, and it also says Error, although there was no error.
https://boinc.bakerlab.org/rosetta/result.php?resultid=75587254

In the message list it shows normally: WU finished, and nothing about an error. It finished normally, as usual.

What is the problem?
ID: 40106 · Rating: 0
mdettweiler
Joined: 15 Oct 06
Posts: 33
Credit: 2,509
RAC: 0
Message 40108 - Posted: 30 Apr 2007, 20:55:20 UTC

I have the following workunit:

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=68288157

which is a 10-hour workunit, and has been running fine for a few days of off-and-on computing (I shut my computer down at night, and sometimes during the day). However, sometimes when I started up my computer again, the progress bar would temporarily go back to zero (although in a little while it would be back to normal; I know that some applications do this when they first restart). It also seemed to have started over from the beginning of the model. Not too terribly much of a biggie. Well, at least it wasn't too much of a biggie until today, when it started over from the beginning of model 1! I'm wondering if one of the checkpoints fell through, but the application thought it had checkpointed successfully, and it thus ended up starting over from the beginning.

Should I abort this WU? I'm suspending it for now so I don't lose any work on it if I do.
ID: 40108 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5652
Credit: 5,622,096
RAC: 0
Message 40118 - Posted: 30 Apr 2007, 23:22:48 UTC - in response to Message 40106.  

Hi
I just finished a WU, and when I uploaded it I got 0 credits, and it also says Error, although there was no error.
https://boinc.bakerlab.org/rosetta/result.php?resultid=75587254

In the message list it shows normally: WU finished, and nothing about an error. It finished normally, as usual.

What is the problem?


Your result says this: "Workunit error - check skipped".
Now we just need Mod.Sense or someone to tell us what that means.
There is also this, which makes things interesting: "too many results"?
ID: 40118 · Rating: 0
Michael.L
Joined: 12 Nov 06
Posts: 67
Credit: 31,295
RAC: 0
Message 40128 - Posted: 1 May 2007, 7:16:15 UTC - in response to Message 40118.  

Hi
I just finished a WU, and when I uploaded it I got 0 credits, and it also says Error, although there was no error.
https://boinc.bakerlab.org/rosetta/result.php?resultid=75587254

In the message list it shows normally: WU finished, and nothing about an error. It finished normally, as usual.

What is the problem?


Your result says this: "Workunit error - check skipped".
Now we just need Mod.Sense or someone to tell us what that means.
There is also this, which makes things interesting: "too many results"?

I got one of these too.
Result ID 76072216: Workunit error - check skipped.
Claimed credit 85, granted credit 0; too many total results.
I can see another PC beat me to completion and got credit too.
ID: 40128 · Rating: 0
Sailor
Joined: 19 Mar 07
Posts: 75
Credit: 89,192
RAC: 0
Message 40138 - Posted: 1 May 2007, 11:25:23 UTC

Hmm, I found an odd similarity in the WUs from Michael.L and D-Fens.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=65985061

The first PC is definitely over the turnaround time of 10 days:

Sent: 17 Apr 2007 19:24:06 UTC
Deadline: + 10 days, so 27 Apr 2007
Returned: 29 Apr 2007 18:20:37 UTC (2 days over the limit, but got credit)

Meanwhile, on 27 Apr (the WU's deadline on PC1), the WU was sent to PC2; however, PC1 returned a result before PC2 did and was granted credit.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=64752196

The same here:

Sent: 20 Apr 2007 7:40:31 UTC
Deadline: 30 Apr 2007 7:40:31 UTC
Rosetta re-sent the WU at 30 Apr 2007 7:41:21 UTC.
However, PC1 returned a result at 30 Apr 2007 15:07:00 UTC
(7 hours over the limit) and got the credit, leaving the second user empty-handed.

This seems to be a bug. Clients that are over the target time should not gain credit from their late results, and the two who returned results within the target time (Michael.L and D-Fens) should get their credit. Please fix it and give those two their deserved 80/85 credits. If this is happening more often, people are computing for nothing! As it looks to me, the results they turned in just went straight to the trash...
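
For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes how late each first result was. It assumes the 10-day turnaround mentioned in this post and uses only the sent and returned timestamps quoted above; the function name `lateness` is just an illustrative choice, not anything from BOINC or Rosetta.

```python
from datetime import datetime, timedelta

FMT = "%d %b %Y %H:%M:%S"        # e.g. "17 Apr 2007 19:24:06" (English month names)
TURNAROUND = timedelta(days=10)  # assumption: the 10-day deadline quoted in this post


def lateness(sent: str, returned: str) -> timedelta:
    """Return how far past the deadline a result came back (negative if on time)."""
    deadline = datetime.strptime(sent, FMT) + TURNAROUND
    return datetime.strptime(returned, FMT) - deadline


# Timestamps copied from the two workunits above (all UTC).
wu_65985061 = lateness("17 Apr 2007 19:24:06", "29 Apr 2007 18:20:37")
wu_64752196 = lateness("20 Apr 2007 7:40:31", "30 Apr 2007 15:07:00")

print(wu_65985061)  # 1 day, 22:56:31
print(wu_64752196)  # 7:26:29
```

So the first result was just under 2 days late and the second about 7.5 hours late, which matches the rough figures given above.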
ID: 40138 · Rating: 0
komandar
Joined: 13 Dec 06
Posts: 4
Credit: 767,791
RAC: 0
Message 40144 - Posted: 1 May 2007, 13:34:03 UTC

The 64-bit version of BOINC doesn't work with the current Rosetta version:
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=3147
ID: 40144 · Rating: 0
Michael.L
Joined: 12 Nov 06
Posts: 67
Credit: 31,295
RAC: 0
Message 40161 - Posted: 1 May 2007, 17:02:46 UTC
Last modified: 1 May 2007, 17:43:13 UTC

According to BOINC Manager:
WU 2ztaA boinc symm fold and dock relax 2ztaa 1639 10600 completed today at 13.38.00, uploaded at 13.38.02, and finished at 13.38.05, with the BOINC LogX history showing 24 decoys. This WU started processing at 05.11.13.
In my Work Units I have two outstanding WUs, 68513611 and 68446495,
and the previously completed WU as shown in message 40138.
I see no trace of the WU completed at 13.38.00, either as completed or outstanding, in My Work Units (my account).

I have updated BOINC Manager twice since 13.38 but can't find the seemingly 'missing' WU.

That makes 2 successive WUs for which I get no credit.

At the same time my RAC has nosedived. BOINC LogX shows Credits/RAC have not moved for the last 2 completed WUs (including 2ztaA).
All above times are for today.
ID: 40161 · Rating: 0
[AF>Quebec] YMG
Joined: 13 Apr 07
Posts: 2
Credit: 2,685,648
RAC: 0
Message 40164 - Posted: 1 May 2007, 19:01:52 UTC

I'm getting a lot of computing errors here, most of this form:

<core_client_version>5.8.16</core_client_version>
<![CDATA[
<message>
- exit code 1073807364 (0x40010004)
</message>
<stderr_txt>
# cpu_run_time_pref: 10800
# random seed: 3013385
# cpu_run_time_pref: 10800


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C911C29 read attempt to address 0x00000024

Engaging BOINC Windows Runtime Debugger...


</stderr_txt>
]]>


What is going on???
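
As a small aid for reading output like this, here is a hedged Python sketch that converts the decimal exit code from the `<message>` block into hex and looks it up in a tiny table. The two names in the table are the Windows NTSTATUS constants as best I recall them, so treat the mapping as an assumption and check it against Microsoft's documentation; `describe` is only an illustrative helper, not part of BOINC.

```python
# Assumption: names recalled from the Windows NTSTATUS list; verify before relying on them.
KNOWN_CODES = {
    0x40010004: "DBG_TERMINATE_PROCESS",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def describe(exit_code: int) -> str:
    """Format an exit code as 32-bit hex plus a name, if the code is in the table."""
    code = exit_code & 0xFFFFFFFF  # also handles codes reported as negative integers
    name = KNOWN_CODES.get(code, "unknown")
    return f"0x{code:08X} ({name})"


print(describe(1073807364))  # 0x40010004 (DBG_TERMINATE_PROCESS)
```

The conversion at least confirms that the decimal and hex values in the log are the same number. It does not explain the underlying access violation.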
ID: 40164 · Rating: 0
Michael.L
Joined: 12 Nov 06
Posts: 67
Credit: 31,295
RAC: 0
Message 40171 - Posted: 1 May 2007, 22:29:51 UTC
Last modified: 1 May 2007, 22:35:59 UTC

Seems that Rosie staff do not look at this forum very often, so one has to be patient!
Rhiju Apr 2. 8.
and Mod Sense Apr 12. 18.
I expect everyone is very busy, but I would appreciate an answer.
ID: 40171 · Rating: 0
genes
Joined: 8 Oct 05
Posts: 60
Credit: 460,257
RAC: 0
Message 40177 - Posted: 2 May 2007, 1:33:18 UTC

Found graphics hung on this WU: resultid=76205644
Exited BOINC using Task Mgr, restarted.

ID: 40177 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 40181 - Posted: 2 May 2007, 2:32:49 UTC - in response to Message 40171.  
Last modified: 2 May 2007, 22:27:50 UTC

Seems that Rosie staff do not look at this forum very often, so one has to be patient!
Rhiju Apr 2. 8.
and Mod Sense Apr 12. 18.
I expect everyone is very busy, but I would appreciate an answer.


I only see one such WU. I've not seen such an issue before, so I had no advice to offer.

Rhiju, would this Workunit error - check skipped state be a problem on the validator? There don't appear to be any indications of a problem on the client side.

[edit] Here's a thread that discusses the check skipped error. But they didn't really seem to reach a concrete cause/resolution there.
Rosetta Moderator: Mod.Sense
ID: 40181 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5652
Credit: 5,622,096
RAC: 0
Message 40237 - Posted: 2 May 2007, 22:13:53 UTC
Last modified: 2 May 2007, 22:20:54 UTC

Why does the WU in progress return to 1% or less after ROH manager stops the computation to do CPU benchmarking? The time remains the same, but the percentage drops way back. I was at 72% before the CPU benchmarking began, but then it dropped back to 1% and is climbing slowly back up.

The funny thing is that the time remains the same: it is now 6+ hours into an 8-hour run time, and it is on model 14, step 63000+ and counting.
This does not correspond to the percentage shown.

Does 5.62 resolve this issue?
ID: 40237 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 40239 - Posted: 2 May 2007, 22:21:49 UTC - in response to Message 40237.  

Does 5.62 resolve this issue?

Yes. And suggest you set your General Preferences to leave applications in memory while suspended.

Rosetta Moderator: Mod.Sense
ID: 40239 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5652
Credit: 5,622,096
RAC: 0
Message 40244 - Posted: 2 May 2007, 23:12:09 UTC - in response to Message 40239.  

Does 5.62 resolve this issue?

Yes. And suggest you set your General Preferences to leave applications in memory while suspended.


I changed the general prefs, did an update, and tried a shutdown and resume. The percentage went to 82% and then dropped back to nearly 0%, with the elapsed time resuming where it left off. It is on model 18 now, at 12000 steps and counting. So nothing really changed despite the different settings in the prefs.
ID: 40244 · Rating: 0

©2023 University of Washington
https://www.bakerlab.org