Problems with Rosetta version 5.40

Chu
Joined: 23 Feb 06
Posts: 120
Credit: 112,439
RAC: 0
Message 31158 - Posted: 15 Nov 2006, 0:55:37 UTC

Please abort those WUs with names like "DOC_????_R061030_st_mode_??", but NOT "DOC_R061113_***_fa_relax_from_native_bound".

These WUs were added to the queue before the update to 5.40, and they work fine with 5.36. We later found that the new application has a backward-compatibility issue with these WUs, and since most of them are still in the queue, we chose to cancel the whole batch. However, this creates a new problem: results from WUs in the same batch that had already been sent out to run with 5.36 now fail validation. We are still investigating why this happens. For the moment, please abort these WUs, and we will try to come up with a plan to adjust credits for those validator errors later. Sorry for causing this mess.
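
For anyone sorting through a long task list, the name patterns above can be checked mechanically. This is only a rough sketch, not a project tool: it treats the patterns as shell-style globs via Python's `fnmatch`, assumes real WU names may carry extra trailing fields (hence the added `*`), and accepts both `mode` and the `model` spelling that is confirmed later in this thread. The helper name `should_abort` and the sample names are made up for illustration.

```python
from fnmatch import fnmatchcase

# Patterns for the cancelled batch, taken from the post above. The trailing "*"
# is an assumption, to allow for extra fields at the end of real WU names.
ABORT_PATTERNS = (
    "DOC_????_R061030_st_mode_??*",
    "DOC_????_R061030_st_model_??*",
)


def should_abort(wu_name: str) -> bool:
    """Return True if the WU name matches one of the cancelled-batch patterns."""
    return any(fnmatchcase(wu_name, pattern) for pattern in ABORT_PATTERNS)


# Hypothetical names, for illustration only.
names = [
    "DOC_1ABC_R061030_st_model_05_1234_567_0",
    "DOC_R061113_1ABC_fa_relax_from_native_bound_1234_567_0",
]
to_abort = [name for name in names if should_abort(name)]
```

Only the first sample name ends up in `to_abort`; the `DOC_R061113_..._fa_relax_from_native_bound` name is left alone, as requested.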

Jascha
Joined: 8 Apr 06
Posts: 1
Credit: 250,935
RAC: 0
Message 31171 - Posted: 15 Nov 2006, 6:29:37 UTC
Last modified: 15 Nov 2006, 6:43:21 UTC

Because of this mistake, we have spent our computer time for nothing.

http://boinc.bakerlab.org/rosetta/result.php?resultid=47022204
http://boinc.bakerlab.org/rosetta/result.php?resultid=47022165
http://boinc.bakerlab.org/rosetta/result.php?resultid=47002859
http://boinc.bakerlab.org/rosetta/result.php?resultid=47027030
http://boinc.bakerlab.org/rosetta/result.php?resultid=47027024
http://boinc.bakerlab.org/rosetta/result.php?resultid=47011104
http://boinc.bakerlab.org/rosetta/result.php?resultid=47038465
http://boinc.bakerlab.org/rosetta/result.php?resultid=41562380
http://boinc.bakerlab.org/rosetta/result.php?resultid=46980931
http://boinc.bakerlab.org/rosetta/result.php?resultid=47041386
http://boinc.bakerlab.org/rosetta/result.php?resultid=47005151
http://boinc.bakerlab.org/rosetta/result.php?resultid=46984783
http://boinc.bakerlab.org/rosetta/result.php?resultid=47009260
http://boinc.bakerlab.org/rosetta/result.php?resultid=46732871
http://boinc.bakerlab.org/rosetta/result.php?resultid=46874924
http://boinc.bakerlab.org/rosetta/result.php?resultid=46922811
http://boinc.bakerlab.org/rosetta/result.php?resultid=46953602
http://boinc.bakerlab.org/rosetta/result.php?resultid=46966672
http://boinc.bakerlab.org/rosetta/result.php?resultid=46994551
http://boinc.bakerlab.org/rosetta/result.php?resultid=47033064

Is there anything that can be done to get our credit back? I don't want to waste my computer power so that someone can test their software release. This kind of problem should be ironed out before it is released to the public. As a normal user, I really don't care about docking and so on, as this is not my concern. The people from Rosetta should do their job (put it into a beta or test project, e.g. SETI has its beta and test project).

anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 31174 - Posted: 15 Nov 2006, 7:10:11 UTC
Last modified: 15 Nov 2006, 7:19:56 UTC

Hi Jascha

Rosetta has a test project, Ralph@home.

Most of the time all the bugs get caught there, but not this time.

As for computer time for nothing, you did get some credit for the WUs, if you look at the bottom of the page on your links.

Happy crunching

Anders n


Chu
Joined: 23 Feb 06
Posts: 120
Credit: 112,439
RAC: 0
Message 31176 - Posted: 15 Nov 2006, 7:17:02 UTC - in response to Message 31171.  

The credits for jobs with validator errors or computational errors will be assigned and updated every 24 hours. We will double-check whether this actually happens tomorrow. If not, we will manually update the database to give the credits back.

We do have an alpha testing server: https://ralph.bakerlab.org and we appreciate your suggestions. We will try our best to prevent this type of problem from happening again.

SekeRob
Joined: 7 Sep 06
Posts: 35
Credit: 19,984
RAC: 0
Message 31178 - Posted: 15 Nov 2006, 8:12:36 UTC - in response to Message 31176.  

I did not quite get the assurance that the 5.36 WUs sitting in the queue alongside the 5.40 WUs would pick the proper version, so I canceled them all and requested a fresh batch.

thanks

PS. At WCG, a WU with a version X marker would run with version X software, i.e. multiple versions could be in the queue.
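
To make the idea concrete, here is a toy sketch of version-marked WUs, each routed to the application version it was queued for. It is purely illustrative and says nothing about how WCG or Rosetta@home actually schedule work; `WorkUnit`, `pick_app`, the `app_version` field, and the application names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkUnit:
    name: str
    app_version: str  # hypothetical marker: the version the WU was queued for


def pick_app(wu, installed):
    """Return the installed application matching the WU's marker, or None."""
    return installed.get(wu.app_version)


# Two application versions installed side by side (made-up names).
installed_apps = {"5.36": "rosetta_5.36", "5.40": "rosetta_5.40"}

# A mixed queue: each WU carries the version it expects.
queue = [WorkUnit("old_batch_wu", "5.36"), WorkUnit("new_batch_wu", "5.40")]
assignments = {wu.name: pick_app(wu, installed_apps) for wu in queue}
```

With a marker like this, a WU whose version is not installed gets `None` rather than being run by an incompatible application.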
Coelum Non Animum Mutant, Qui Trans Mare Currunt

anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 31183 - Posted: 15 Nov 2006, 9:11:25 UTC

Graphics error on this WU:

http://boinc.bakerlab.org/rosetta/result.php?resultid=47075177

After looking at the graphics, the window did not "close" as it should.

Anders n

Mod.Sense
Volunteer moderator
Project administrator
Joined: 22 Aug 06
Posts: 3550
Credit: 0
RAC: 0
Message 31190 - Posted: 15 Nov 2006, 16:22:53 UTC
Last modified: 15 Nov 2006, 16:23:24 UTC

Chu, was the WU name to cancel actually "model" rather than "mode"?
Rosetta Moderator: Mod.Sense

Chu
Joined: 23 Feb 06
Posts: 120
Credit: 112,439
RAC: 0
Message 31193 - Posted: 15 Nov 2006, 17:38:01 UTC - in response to Message 31190.  

Yes, it should be "model" instead of "mode". Thanks for pointing this out.

River~~
Joined: 15 Dec 05
Posts: 761
Credit: 285,578
RAC: 0
Message 31200 - Posted: 15 Nov 2006, 20:19:56 UTC - in response to Message 31171.  
Last modified: 15 Nov 2006, 20:21:47 UTC

Is there anything that can be done to get our credit back?

In my experience this project is by far the most diligent in granting approximate credit when this kind of thing happens.

I don't want to waste my computer power so that someone can test their software release. This kind of problem should be ironed out before it is released to the public. ... The people from Rosetta should do their job (put it into a beta or test project, e.g. SETI has its beta and test project).


I would suggest that this might not be the project for you, then.

Rosetta is a hybrid production and development project. Yes, some of the protein structures we crunch are really wanted by the bios, but other work we do is re-crunching known proteins using new methods, to see how the new methods match up to the old.

For example, Phil's new work will deliver results that relate to proteins important in Alzheimer's - that is the production aspect of this project. At the same time, what he is doing is even more interesting because he is trying out methods for crunching the structures of stacked proteins. This is the development aspect of this project.

The new work was tested on Ralph - the Rosetta Alpha project - and worked with v40. The DOC work that is now failing was also tested on Ralph and worked with v36. The new client, v40, was tested and worked on Ralph with the selection of WU types that were tried. It does not make sense to re-test every kind of WU on every new client on the Alpha project, and in my opinion a fair amount of testing was done.

Ever stop to think why SETI has a beta project while Rosetta has an Alpha?

We have an alpha because the development part of this project means that the mainstream Rosetta is always halfway to being a beta. The programmers here will make mistakes again; that is part of developing new methods. Almost all their mistakes are picked up on alpha - but once in a while mistakes seep through to here.

We seem to be going through a minor rough patch at present, but considering the last six or eight months (i.e. since you joined), the error rate over that time is going to continue. Not because people are not doing their jobs, but because they *are* doing their jobs.

If you want to run the same code for six months, with an app that has all the bugs shaken out of it five months ago, then this is not the project for you. You might want to try Einstein for example (where they run the same app for around a year at a time on data collected in the previous year), or CPDN (where the models are based on well tested mainframe code that has been in use for decades, and where the science is important but the IT is old hat).

On the other hand, if you will put up with the inconvenience of the occasional error, this project (in my own opinion) gets a good balance between doing useful work at the same time as pushing forward the computational techniques.

And this project adds to that a concern for awarding corrective credit when problems occur, a willingness to take the responsibility for their mistakes, and by far the best explanations of the science of any BOINC project. It upsets me, as a participant & volunteer, to see the staff accused of not doing their job. They are running a different sort of project to Einstein or SETI, and they are running their chosen sort of project well.

In my opinion.
River~~

Feet1st
Joined: 30 Dec 05
Posts: 1740
Credit: 3,655,614
RAC: 0
Message 31201 - Posted: 15 Nov 2006, 20:52:40 UTC

River, thanks for saying all of that, and taking the time to do so in such a tactful way... now I don't feel I must take the time to scratch up something half as explanatory.
If having a DC project with BOINC is of interest to you, with volunteer or cloud computing resources, but have no time for the BOINC learning curve,
use a hosting service that understands BOINC projects: http://DeepSci.com

MattDavis
Joined: 22 Sep 05
Posts: 206
Credit: 1,377,748
RAC: 0
Message 31208 - Posted: 15 Nov 2006, 22:58:46 UTC - in response to Message 31171.  

It was just a mistake. They didn't do it on purpose. An error ACCIDENTALLY slipped through. Geez.

MM Sihombing
Joined: 22 May 06
Posts: 15
Credit: 1,424,082
RAC: 0
Message 31246 - Posted: 16 Nov 2006, 10:14:03 UTC

11/16/2006 5:07:25 PM|rosetta@home|Unrecoverable error for result DOC_R061113_1DQJ_p1_fa_relax_from_native_unbound_1392_766_0 ( - exit code -1073741819 (0xc0000005))

47162899
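
A note for readers puzzled by the two numbers in that log line: they are the same value written two ways, the signed 32-bit exit code and its unsigned hex form. 0xC0000005 is the Windows status code for an access violation. The snippet below is just a sketch of the conversion; the regular expression is an assumption based on this one log line, not a documented BOINC log format.

```python
import re

# The log line from the post above.
LINE = (
    "11/16/2006 5:07:25 PM|rosetta@home|Unrecoverable error for result "
    "DOC_R061113_1DQJ_p1_fa_relax_from_native_unbound_1392_766_0 "
    "( - exit code -1073741819 (0xc0000005))"
)


def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


# Pull out the decimal exit code and the hex value printed next to it.
match = re.search(r"exit code (-?\d+) \((0x[0-9a-fA-F]+)\)", LINE)
signed_code = int(match.group(1))
reported_hex = int(match.group(2), 16)

# Both spellings describe the same 32-bit value.
same_value = to_unsigned32(signed_code) == reported_hex
```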

MM Sihombing
Joined: 22 May 06
Posts: 15
Credit: 1,424,082
RAC: 0
Message 31252 - Posted: 16 Nov 2006, 13:48:43 UTC

Validate error:

47076159

Taki G
Joined: 16 Nov 06
Posts: 1
Credit: 63,345
RAC: 0
Message 31284 - Posted: 17 Nov 2006, 6:48:09 UTC

Hi

I just joined today and need help. Every time the Rosetta program goes into screen saver mode it works for a while and then freezes. I have to reboot my computer to unlock it. Does anyone else have this problem? I tried decreasing the maximum CPU time before I wrote this post and it didn't work.
Thanks for your help.
Taki G

Killersocke@rosetta
Joined: 13 Nov 06
Posts: 26
Credit: 1,421,143
RAC: 2,740
Message 31290 - Posted: 17 Nov 2006, 9:09:33 UTC

Same problem here too

Client error
Result ID 47575512
Name FRA_2rio_RIO2_hom002_6_2rio_6_2bdwA_IGNORE_THE_REST_45_1400_11_0
Validate state Invalid
Basement:
The WU was ready, but the screen saver was running.
The PC and the application hang.



EW-3
Joined: 1 Sep 06
Posts: 27
Credit: 2,561,427
RAC: 0
Message 31295 - Posted: 17 Nov 2006, 12:59:10 UTC

Sign me up as having a problem using the screen saver too....
Things have been running great for the last week or two, and in the last 24s I enabled the screen saver again and had my first problem since I turned it off. My cure is to keep it off, but maybe you could look into it so you don't lose valuable CPU time....
Thanks,

Fynjy
Joined: 18 Sep 06
Posts: 8
Credit: 3,688,731
RAC: 0
Message 31296 - Posted: 17 Nov 2006, 13:20:59 UTC - in response to Message 31200.  


For example, Phil's new work will deliver results that relate to proteins important in Alzheimer's - that is the production aspect of this project. At the same time, what he is doing is even more interesting because he is trying out methods for crunching the structures of stacked proteins. This is the development aspect of this project.


River~~, do you know how I can get in touch with Phil? Some time ago there was a very strange file, stdout.txt, in my Rosetta folder, about 60 MB in size. It contained lots of lines like this:

DANGER:: AI chainbreak score does not match the derivative!!!!!!!!!!!!!!
Talk to Phil about fixing this.

DANGER:: 0-overlap chainbreak score does not match the derivative!!!!!!!!!!!!!!
Talk to Phil about fixing this.

Maybe someone knows something about this?
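
If a file like that turns up again, a short summary is easier to post than 60 MB of text. The sketch below counts how often each distinct line occurs, using only Python's standard `collections.Counter`. The function name `summarize_lines` is made up, and the sample list simply repeats the messages quoted above so the example runs without the real file.

```python
from collections import Counter


def summarize_lines(lines, top=5):
    """Count distinct non-empty lines and return the most frequent ones."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return counts.most_common(top)


# Stand-in for the real file: the messages quoted above, repeated.
sample = [
    "DANGER:: AI chainbreak score does not match the derivative!!!!!!!!!!!!!!",
    "Talk to Phil about fixing this.",
    "",
    "DANGER:: 0-overlap chainbreak score does not match the derivative!!!!!!!!!!!!!!",
    "Talk to Phil about fixing this.",
    "",
    "DANGER:: AI chainbreak score does not match the derivative!!!!!!!!!!!!!!",
    "Talk to Phil about fixing this.",
]
summary = summarize_lines(sample)
```

For the real file, something like `with open("stdout.txt", errors="replace") as f: summary = summarize_lines(f)` should work; the file is read line by line, so its size is not a problem.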
Help people! Join TSC!Russia!

Chu
Joined: 23 Feb 06
Posts: 120
Credit: 112,439
RAC: 0
Message 31310 - Posted: 17 Nov 2006, 16:55:06 UTC - in response to Message 31296.  

This message has been passed to Phil...

River~~
Send message
Joined: 15 Dec 05
Posts: 761
Credit: 285,578
RAC: 0
Message 31313 - Posted: 17 Nov 2006, 17:07:16 UTC - in response to Message 31296.  
Last modified: 17 Nov 2006, 17:11:29 UTC

Post it on the "Problems with" thread for the version of the application you are using - probably v5.40 - in which case you have already come to the right place :)

If it happens again, please post the details of the resultname or resultid, but it sounds like you might no longer have that info.

If Phil is not the person checking the postings on that thread, they will pass it on. My guess is that this is a debugging message left over from pre-Alpha testing...

R~~

rochester new york
Joined: 2 Jul 06
Posts: 2751
Credit: 1,710,893
RAC: 116
Message 31329 - Posted: 17 Nov 2006, 21:38:25 UTC

I got more errors in the last 5 days than I did in the last 100 days.

©2020 University of Washington
http://www.bakerlab.org