Problems with Rosetta version 5.81

Message boards : Number crunching : Problems with Rosetta version 5.81

Ingemar
Joined: 28 Feb 06
Posts: 20
Credit: 1,680
RAC: 0
Message 48402 - Posted: 5 Nov 2007, 23:23:54 UTC

Please report problems with this version. Thanks!

Trey
Joined: 3 Oct 06
Posts: 11
Credit: 110,142
RAC: 0
Message 48404 - Posted: 6 Nov 2007, 0:07:45 UTC - in response to Message 48402.  
Last modified: 6 Nov 2007, 0:08:25 UTC

The home page news said: "Rosetta@home has been updated to version 5.81. This version contains small, but essential changes to the scientific protocols. For details, see this thread."

I don't see any details...

Brian Kidd
Joined: 9 Dec 06
Posts: 5
Credit: 327
RAC: 0
Message 48405 - Posted: 6 Nov 2007, 2:10:36 UTC

Thanks Trey. The thread link now points to the correct location.

P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 48408 - Posted: 6 Nov 2007, 3:57:45 UTC
Last modified: 6 Nov 2007, 3:58:37 UTC

Didn't you mean 5.82? That's what I've got. ;)

Pete.





Greg_BE
Joined: 30 May 06
Posts: 5662
Credit: 5,700,734
RAC: 2,091
Message 48413 - Posted: 6 Nov 2007, 7:27:15 UTC
Last modified: 6 Nov 2007, 7:28:40 UTC

5.69 is 5.82 now? and 5.81 is the "new" version?

zombie67 [MM]
Joined: 11 Feb 06
Posts: 316
Credit: 6,589,590
RAC: 317
Message 48414 - Posted: 6 Nov 2007, 8:00:45 UTC

I'm not sure I understand the point of this thread here. I understand why it exists on RALPH, of course. But with sufficient testing on RALPH, why is this thread needed here?
Reno, NV
Team: SETI.USA

Luuklag
Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 48421 - Posted: 6 Nov 2007, 16:20:11 UTC - in response to Message 48405.  

Thanks Trey. The thread link now points to the correct location.


Hint: you'd better use this as the address in your link ( https://boinc.bakerlab.org/rosetta/forum_thread.php?id=1449&nowrap=true#48406 ); it links directly to the post about 5.81.

Luuklag
Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 48422 - Posted: 6 Nov 2007, 16:20:52 UTC - in response to Message 48413.  

5.69 is 5.82 now? and 5.81 is the "new" version?


Yes, 5.69 has been renamed to 5.82; see the homepage. ;)

j2satx
Joined: 17 Sep 05
Posts: 97
Credit: 3,670,592
RAC: 0
Message 48444 - Posted: 7 Nov 2007, 17:51:28 UTC - in response to Message 48414.  

I'm not sure I understand the point of this thread here. I understand why it exists on RALPH, of course. But with sufficient testing on RALPH, why is this thread needed here?


Because they don't do "sufficient" testing on Ralph.....!


Greg_BE
Joined: 30 May 06
Posts: 5662
Credit: 5,700,734
RAC: 2,091
Message 48450 - Posted: 7 Nov 2007, 19:08:09 UTC - in response to Message 48444.  
Last modified: 7 Nov 2007, 19:09:50 UTC

Even if they did "sufficient" testing on Ralph, would it be perfect enough for each and every work unit generated on Rosie? The percentage of errors compared with successes is quite low, but an error gains a lot of visibility when it does show up.

I'm not sure I understand the point of this thread here. I understand why it exists on RALPH, of course. But with sufficient testing on RALPH, why is this thread needed here?


Because they don't do "sufficient" testing on Ralph.....!




j2satx
Joined: 17 Sep 05
Posts: 97
Credit: 3,670,592
RAC: 0
Message 48462 - Posted: 8 Nov 2007, 2:36:38 UTC - in response to Message 48450.  

Even if they did "sufficient" testing on Ralph, would it be perfect enough for each and every work unit generated on Rosie? The percentage of errors compared with successes is quite low, but an error gains a lot of visibility when it does show up.

I'm not sure I understand the point of this thread here. I understand why it exists on RALPH, of course. But with sufficient testing on RALPH, why is this thread needed here?


Because they don't do "sufficient" testing on Ralph.....!




What is the percentage of errors to successes?

That would be a good thing to see for status... fewer complaints if the error rate is really low.

Luuklag
Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 48471 - Posted: 8 Nov 2007, 19:11:08 UTC - in response to Message 48462.  

Even if they did "sufficient" testing on Ralph, would it be perfect enough for each and every work unit generated on Rosie? The percentage of errors compared with successes is quite low, but an error gains a lot of visibility when it does show up.

I'm not sure I understand the point of this thread here. I understand why it exists on RALPH, of course. But with sufficient testing on RALPH, why is this thread needed here?


Because they don't do "sufficient" testing on Ralph.....!




What is the percentage of errors to successes?

That would be a good thing to see for status... fewer complaints if the error rate is really low.


I thought the error rate for a batch of WUs is around 1% to 3%.

Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 48478 - Posted: 8 Nov 2007, 23:52:28 UTC

At the risk of continuing to clutter this thread with non-problems... To give you a feel for the number of successful results, have a look at the reports collected by dcdc. If the result name ends in _0, then it was the first to be sent out. You will see batches all with sequential names, followed by batches of a mixture of names. These mixtures of names are the result IDs created in between the project creating new work. They are generated automatically when a result misses its deadline or completes with an error. So you can see the number of WU names ending in _1 or _2 is very small. And many of those are probably due to missed deadlines, not failures.

Perhaps dcdc could run some more numbers and provide a more specific answer to the question.
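
For anyone who wants to tally this from a list of result names, here is a minimal Python sketch of the counting described above. It assumes only that every result name ends in _N, where N is the resend index (0 for the first copy sent out). The function names and the sample names in the usage note are made up for illustration; they are not project tools or real results.

```python
import re


def resend_index(result_name):
    """Return the trailing _N of a result name (0 = first copy sent out)."""
    match = re.search(r"_(\d+)$", result_name)
    if match is None:
        raise ValueError("no numeric suffix in %r" % result_name)
    return int(match.group(1))


def resend_share(result_names):
    """Fraction of results that are re-issues (suffix greater than 0)."""
    names = list(result_names)
    if not names:
        return 0.0
    resent = sum(1 for name in names if resend_index(name) > 0)
    return resent / len(names)
```

Calling resend_share on four hypothetical names such as ["jobA_100_1_0", "jobA_100_2_0", "jobA_100_3_1", "jobA_100_4_0"] gives the share of re-issued results. Keep in mind that a re-issue can come from a missed deadline as well as from an error, so this is an upper bound on failures, not an error rate.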
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/

Luuklag
Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 48507 - Posted: 9 Nov 2007, 18:34:30 UTC - in response to Message 48478.  
Last modified: 9 Nov 2007, 18:35:08 UTC

At the risk of continuing to clutter this thread with non-problems... To give you a feel for the number of successful results, have a look at the reports collected by dcdc. If the result name ends in _0, then it was the first to be sent out. You will see batches all with sequential names, followed by batches of a mixture of names. These mixtures of names are the result IDs created in between the project creating new work. They are generated automatically when a result misses its deadline or completes with an error. So you can see the number of WU names ending in _1 or _2 is very small. And many of those are probably due to missed deadlines, not failures.

Perhaps dcdc could run some more numbers and provide a more specific answer to the question.


I have an unstarted WU in my batch that ends with _1; the rest end with _0. So is that one doomed to fail?

But we shouldn't be discussing this kind of thing here. (a)

M.L.
Joined: 21 Nov 06
Posts: 182
Credit: 180,462
RAC: 0
Message 48509 - Posted: 9 Nov 2007, 19:31:02 UTC
Last modified: 9 Nov 2007, 19:42:32 UTC

Way off topic.
Luuklag - I think you have the wrong end of the stick!
I would hate to think you were aborting WUs unnecessarily.
The ...0 or ...1 at the end of each WU name means that it has been sent out previously ...0 times or ...1 times.
The WU to which I referred was one that had been sent to and returned by another PC, but returned after the deadline for that WU {Duplicated WU thread}. Credits had been applied to the original cruncher, so there would have been no point in my crunching that WU again: it had been happily accepted by Rosie, and if I had crunched it there would have been NO credits for me.
You can always check the reason for any ...1 WUs that you have in your results {awaiting processing} and see why they were sent out again. A few months ago, some WUs were sent out that had already been crunched but returned after the deadline, so the second cruncher would get no credits because the WU had already been accepted by Rosie.

Voyager
Joined: 9 May 07
Posts: 3
Credit: 6,870,796
RAC: 1,914
Message 48513 - Posted: 10 Nov 2007, 9:54:38 UTC

And now about reporting problems with Beta 5.81: 8 out of the 9 WUs I had ended in "compute error". All results start with "maximum disk usage exceeded" and then get an "unhandled exception detected".
The first message does not compute for me, because my disk is nowhere near full, nor is BOINC near its limit on disk usage.
The second msg is one for the specialists, my computer is not hidden so take a look and see what you can find.
I hope the problem is solved quickly, I'm wasting good CPU time here.
Will keep crunching for now.

Rob.

Path7
Joined: 25 Aug 07
Posts: 128
Credit: 61,751
RAC: 0
Message 48524 - Posted: 10 Nov 2007, 14:29:45 UTC - in response to Message 48513.  
Last modified: 10 Nov 2007, 14:42:01 UTC

Hi Rob,

The latest WU I crunched so far was the: 1fe6__BOINC_SYMM_FOLD_AND_DOCK_RELAX-1fe6_-crystal_foldanddock__2257_31479_0
Like all your "Maximum disk usage exceeded" WUs, its name contains __2257_.
The 1fe6__etc. WU showed odd behavior: the RMSD was very high and was not visible in its graphic. Memory usage was high too: 411060 kB of RAM and 606092 kB of swap page max (according to Process Explorer).
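
To check whether a set of failing WUs shares a batch, the batch number can be pulled out of each result name mechanically. The Python sketch below is only a guess at the naming layout: it assumes the last three underscore-separated fields are the batch number, a sequence number within the batch, and the resend index. That layout is inferred from the single name quoted above, not from project documentation, and split_result_name is a hypothetical helper.

```python
def split_result_name(name):
    """Split a result name into job label, batch, sequence and resend index.

    Assumed layout (inferred, not documented): <job>__<batch>_<sequence>_<resend>
    """
    head, batch, sequence, resend = name.rsplit("_", 3)
    return {
        "job": head.rstrip("_"),   # drop the underscore left over from the "__" separator
        "batch": int(batch),
        "sequence": int(sequence),
        "resend": int(resend),
    }


def same_batch(names):
    """True if every result name in the list carries the same batch number."""
    batches = {split_result_name(n)["batch"] for n in names}
    return len(batches) <= 1
```

Applied to the name above, the "batch" field comes out as 2257, which is the number the failing results have in common.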

I wonder if a low swap page might give you a Maximum disk usage exceeded error.

Keep on crunching,
Path7.

Luuklag
Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 48530 - Posted: 10 Nov 2007, 17:59:46 UTC - in response to Message 48507.  

At the risk of continuing to clutter this thread with non-problems... To give you a feel for the number of successful results, have a look at the reports collected by dcdc. If the result name ends in _0, then it was the first to be sent out. You will see batches all with sequential names, followed by batches of a mixture of names. These mixtures of names are the result IDs created in between the project creating new work. They are generated automatically when a result misses its deadline or completes with an error. So you can see the number of WU names ending in _1 or _2 is very small. And many of those are probably due to missed deadlines, not failures.

Perhaps dcdc could run some more numbers and provide a more specific answer to the question.


I have an unstarted WU in my batch that ends with _1; the rest end with _0. So is that one doomed to fail?

Guess what: it failed...
But we shouldn't be discussing this kind of thing here. (a)


Voyager
Joined: 9 May 07
Posts: 3
Credit: 6,870,796
RAC: 1,914
Message 48531 - Posted: 10 Nov 2007, 18:30:18 UTC - in response to Message 48524.  

Hi Rob,

The latest WU I crunched so far was the: 1fe6__BOINC_SYMM_FOLD_AND_DOCK_RELAX-1fe6_-crystal_foldanddock__2257_31479_0
Like all your "Maximum disk usage exceeded" WUs, its name contains __2257_.
The 1fe6__etc. WU showed odd behavior: the RMSD was very high and was not visible in its graphic. Memory usage was high too: 411060 kB of RAM and 606092 kB of swap page max (according to Process Explorer).

I wonder if a low swap page might give you a Maximum disk usage exceeded error.

Keep on crunching,
Path7.

Thanks for the insight, but the available swap space is 4.4 GB so that shouldn't be a problem.
To be sure I changed the 'use at most .... disk space' setting from 1 to 2 GB.
We'll see if that helps.

Rob.

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 48537 - Posted: 10 Nov 2007, 22:54:08 UTC

There is a maximum to the size that the output file for a WU is allowed to reach. That must be the limit that you hit.

The failures are in batch 2257, sometimes with less than 1000 seconds of processing completed.
Rosetta Moderator: Mod.Sense



©2024 University of Washington
https://www.bakerlab.org