When and how is the

Message boards : Number crunching : When and how is the

Jose

Joined: 28 Mar 06
Posts: 820
Credit: 48,297
RAC: 0
Message 15107 - Posted: 30 Apr 2006, 23:50:25 UTC
Last modified: 30 Apr 2006, 23:53:29 UTC


“This and no other is the root from which a Tyrant springs; when he first appears he is a protector.”
Plato
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 15109 - Posted: 30 Apr 2006, 23:55:29 UTC - in response to Message 15107.  

Here are Jose's graphics:


BTW, the time to completion (at 7:49 PM AST) is now 11:18:54.



Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
Feet1st
Message 15110 - Posted: 1 May 2006, 0:01:58 UTC
Last modified: 1 May 2006, 0:02:36 UTC

Jose, those CASP6 proteins are HUGE! Your 4 hour preference is there, but keep in mind, you still have to complete model 1 in order to have something to report back. The time to completion is often wrong, so don't worry too much about that. If your step count is still increasing, I'd let 'er run a while longer. And once it completes model one, you'll zip to 100% done, because it will see it has passed your 4hr preference.

While this large one is running, could you bring up your task manager, go to the View pulldown menu, choose Select Columns, check "page faults" and "page faults delta", then click twice on the CPU column (so your active WUs sort to the top) and just see what you've got there for page faults (which is the total)? PF Delta will refresh every second. No need to post a screenshot; just a feel for the average PF Delta and the total page faults is all I'm looking for.
AMD_is_logical

Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 15117 - Posted: 1 May 2006, 2:54:21 UTC

The "time to completion" is just BOINC's wild guess at how much longer the WU will take. If the percentage complete doesn't increase linearly with time (as is the case with the big WUs we've been getting), then BOINC's guess will be a very very bad one. BOINC doesn't know that the percent complete will jump from 1% to 100% after the model finishes.

In other words, the "time to completion" contains no useful information.

I think you should just let the WU keep running.
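To see why a stuck percentage wrecks the estimate, here is a minimal Python sketch. It is not BOINC's actual code: it only assumes the client extrapolates linearly from the fraction done, and the function name and sample numbers are made up for illustration.

```python
def linear_remaining_seconds(elapsed_s: float, fraction_done: float) -> float:
    """Remaining time IF progress were linear in elapsed time (an assumption)."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_s * (1.0 - fraction_done) / fraction_done


# Hypothetical WU whose progress bar sits at 1% until the first model finishes.
for hours in (2, 6, 10):
    est = linear_remaining_seconds(hours * 3600, 0.01)
    print(f"{hours:>2} h elapsed at 1% -> about {est / 3600:.0f} h 'to completion'")

# Once the model completes and the bar jumps to 100%, nothing remains.
print(linear_remaining_seconds(11 * 3600, 1.0))
```

Under that assumption the guess keeps growing for as long as the percentage stays put, which is the behaviour described above: the number says nothing about when the model will actually finish.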
Jose

Message 15122 - Posted: 1 May 2006, 3:41:25 UTC
Last modified: 1 May 2006, 3:46:51 UTC

11:08 pm AST April 30

Barely any progress





The targeted 4-hour WU has become a 23-hour one.

In this lies one of the big issues many people I know are having with BOINC/Rosetta: when one asks for a 4-hour WU, one is not expecting a 24-hour one. Ditto for 1-hour ones. (And this is happening: 1-hour WUs are becoming huge-time WUs.)

The game is DISTRIBUTED COMPUTING: in the case of many of us, the computing time is distributed among other applications. To establish a rational strategy for distributing resources, it is a sine qua non requirement to have a good guess of what to expect (time-wise) in the length of the WUs, for the prime resource one is distributing is just that: time. Rosetta/BOINC have to do a better job when it comes to WU estimates vs. actual time. From what I have been hearing and experiencing myself, this is one of the major sources of frustration and one of the main reasons people drop out of Rosetta or are unwilling to join.

BTW: I am not going to abort this WU. Call me hardheaded, BUT I want to see how it pans out in the days it is scheduled for. (I will abort it only when it comes close to changing to red in the Results page.)


While this large one is running, could you bring up your task manager, go to the View pulldown menu, choose Select Columns, check "page faults" and "page faults delta", then click twice on the CPU column (so your active WUs sort to the top) and just see what you've got there for page faults (which is the total)? PF Delta will refresh every second. No need to post a screenshot; just a feel for the average PF Delta and the total page faults is all I'm looking for.


Can you explain this in simpler terms? Please remember: I am basically the type of person who has the knowledge to turn the computer on and off, and no more. What you asked of me reads like Greek to me. :)

Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 15123 - Posted: 1 May 2006, 3:49:16 UTC

Jose, it looks like you're 100K steps farther. Now if you see it go back to "model1, step 1", you might have a looping issue.
Jose

Message 15124 - Posted: 1 May 2006, 4:08:51 UTC - in response to Message 15122.  

Please remember that I was a compliance auditor. It is not in my nature to like or feel comfortable with deviations of estimates vs. actuals larger than .05%.
Jose

Message 15125 - Posted: 1 May 2006, 4:10:56 UTC - in response to Message 15123.  
Last modified: 1 May 2006, 4:12:00 UTC

Jose, it looks like you're 100K steps farther. Now if you see it go back to "model1, step 1", you might have a looping issue.

And in the time it took me to respond to your post, the time to completion estimate has gone up to approximately 15:18... making the estimated total time close to 26 HRS!!! (2 hours over the time preferences we have to choose from in the preference menu)
Astro
Message 15126 - Posted: 1 May 2006, 4:18:34 UTC
Last modified: 1 May 2006, 4:26:43 UTC

OK, your first screen shows 6:06:45 at step 190735? Now you're at 09:30:22 and step 292801. 190735/292801 = .6514151, and 6:06:45 = 22005 seconds, 09:30:22 = 34222 seconds. 22005/34222 = .6430. .6514 is close to .6430, so it's nearly proportional.

or

the first screen shows 8.67 steps/second avg, the second shows 8.56 steps/second avg. What any of what I just typed has to do with anything, I don't know. I see it working, it doesn't appear to be looping, and I don't know how many steps are in that result, but it looks like you'll find out, unless you hit the 4x watchdog.
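For anyone who wants to redo this check on their own screenshots, here is a small Python sketch of the same arithmetic. The helper name is hypothetical; the two readings are the ones quoted in this post. Converting the clock times in code avoids hand-arithmetic slips: 6:06:45 works out to 22005 seconds.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' CPU-time string to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# (CPU time, step count) from the two screenshots discussed above.
t1, s1 = hms_to_seconds("6:06:45"), 190735
t2, s2 = hms_to_seconds("09:30:22"), 292801

step_ratio = s1 / s2   # fraction of the later step count reached at the first reading
time_ratio = t1 / t2   # fraction of the later CPU time used at the first reading
rate1 = s1 / t1        # average steps per second up to the first reading
rate2 = s2 / t2        # average steps per second up to the second reading

print(f"step ratio {step_ratio:.4f}, time ratio {time_ratio:.4f}")
print(f"rates: {rate1:.2f} and {rate2:.2f} steps/second")
```

If the two ratios are close, steps are accumulating at a roughly steady pace, which is the sign that the work unit is progressing rather than looping.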
Moderator9
Volunteer moderator

Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 15134 - Posted: 1 May 2006, 6:49:55 UTC - in response to Message 15125.  

Jose, it looks like you're 100K steps farther. Now if you see it go back to "model1, step 1", you might have a looping issue.

And in the time it took me to respond to your post, the time to completion estimate has gone up to approximately 15:18... making the estimated total time close to 26 HRS!!! (2 hours over the time preferences we have to choose from in the preference menu)

Jose,

The time to completion counter in BOINC will ALWAYS rise until a model is completed. This is normal. The completion time estimates will become more accurate the longer the system is allowed to complete work units. That has to do with the way Rosetta handles the new time setting being incompatible with the way BOINC estimates the time to completion. There is nothing that can be done about that for now.

When the work unit finishes the first model BOINC will recalculate the time to completion and the percent complete to a more accurate value. This always happens at the end of each model. If the model has run longer than your time setting when it completes the first model, it will jump to 100% complete and report the results.

At this time the project is preparing for CASP7, and they are testing work units from CASP6 in preparation. These are all very large proteins compared to what we have normally been seeing, so they take a LOT longer to complete a model. Every work unit will complete at least one model before it stops, no matter how long that model may take and no matter what the time setting is.
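The stopping rule described above can be sketched roughly as follows. This is only an illustration, not Rosetta's code: the function name is hypothetical, every model is assumed to take the same amount of time, and the watchdog is left out.

```python
def models_completed(model_seconds: float, preferred_seconds: float) -> int:
    """Count models run when the time check happens only between models."""
    if model_seconds <= 0:
        raise ValueError("model_seconds must be positive")
    elapsed = 0.0
    completed = 0
    while True:
        elapsed += model_seconds      # a model always runs to completion
        completed += 1
        if elapsed >= preferred_seconds:
            return completed          # preference passed: jump to 100% and report


# A huge protein: 11 hours per model against a 4-hour preference.
print(models_completed(11 * 3600, 4 * 3600))
# A small protein: 20 minutes per model against the same preference.
print(models_completed(20 * 60, 4 * 3600))
```

With the 11-hour model the loop exits after a single model, so the work unit runs well past the 4-hour setting. With 20-minute models the same setting yields a dozen models and stops right at the preference.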

That is just the nature of the beast. These work units will produce normal credits so long as they complete. That is to say, when calculated as a function of credits per unit time, all the work units will produce the same ratio no matter how long they run. So when the long ones finish, you get more credit for that work unit than for a smaller one, but the ratio stays constant. So there is no impact to the scoring at all. (Not that you asked or even care about that.)

So the short of all this is, your BOINC scheduling system will see to it that ALL your other projects get their fair share of your computer time. There is no impact to credit scores, and the time you are providing to Rosetta is of critical importance to moving the science forward.

I personally have 2 of my systems set to 2 hours and a third set to 4. Many of the work units currently running on my systems are exceeding 6 hours. While this is not the norm right now, just 6 months ago I was seeing a large number of 30 hour work units. So things are a lot better now than they were then.

Dr. Baker has just posted this information about CASP6 Work units.

With the new watchdog feature on the job, you need not be concerned that a work unit will stick or hang on you. So long as it is proceeding, it is stepping forward, and you can see the model moving, it is running ok.

Jose

Message 15142 - Posted: 1 May 2006, 7:51:55 UTC - in response to Message 15134.  

Dr. Baker has just posted this information about CASP6 Work units.

With the new watchdog feature on the job, you need not be concerned that a work unit will stick or hang on you. So long as it is proceeding, it is stepping forward, and you can see the model moving, it is running ok.

The issue is not one about credits. I know you will grant them.

As to :
That has to do with the way Rosetta handles the new time setting being incompatible with the way BOINC estimates the time to completion. There is nothing that can be done about that for now.



Please refer to my message 15124. As long as this is happening, the compliance auditor that still lurks in me will develop more new gray hairs in the little amount of hair I have left. :)

Again, thank you for your patience and for your explanations.

And now for the GOOD news:

After 11+ hours one model was produced!!!!

https://boinc.bakerlab.org/rosetta/result.php?resultid=18698148
tralala

Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 15148 - Posted: 1 May 2006, 10:12:51 UTC - in response to Message 15142.  

And now for the GOOD news:

After 11+ hours one model was produced!!!!

https://boinc.bakerlab.org/rosetta/result.php?resultid=18698148


Whooa, what a thriller! But with a happy end! ;-)
Astro
Message 15150 - Posted: 1 May 2006, 11:14:16 UTC

Glad you had the patience to stick it out. Hope others do the same. woo hoo for you
Jose

Message 15151 - Posted: 1 May 2006, 11:28:31 UTC - in response to Message 15150.  
Last modified: 1 May 2006, 11:31:29 UTC

Glad you had the patience to stick it out. Hope others do the same. woo hoo for you


Well one of the reasons I am developing the patience is the fact that the Moderators, the Scientists/Developers and many members of the Rosetta @ Home community have been more than willing to provide clear information and provide ideas on what is happening to me (a small cog in the Rosetta @ home community of contributors). You all have been patient with me.

As a result my doubts and worries are being solved.

See, I discovered that there is only one way to have my doubts solved and my specific worries about the project dealt with in a fast and responsible way:

When in doubt : Post!!!
When you have a concern or a worry about a particular part of the project : Post!!!

Both the science and the number crunching board are the best resources that are available to the members of the Rosetta@home community : Post!!!

The only dumb question is the question not asked.

An unshared doubt, an unresolved worry not discussed with those that have the knowledge to solve it is not only dumb, it keeps you stuck in the quagmire you got stuck in. Post!!!

Okies: I sound like a broken record but, I think I made myself clear.

Again, my thanks and appreciation to all who have helped clear my doubts, calm my worries, and have taken some of their valuable time to share their knowledge and experience with me.

You all ROCK!!!

Astro
Message 15160 - Posted: 1 May 2006, 14:25:07 UTC
Last modified: 1 May 2006, 14:29:59 UTC

Jose, I'm doing an AB CASP6 t209. I watched it get to 340,000 steps before switching to the next model. My AMD64 3700 is at Model 4, step 0, at 01:27:14, 48.32%. So, 3 models x 340,000 steps (assuming 340K steps/model) = 1,020,000 steps in 1 hour 27 min, or about 195 steps/second.
Astro
Message 15162 - Posted: 1 May 2006, 15:00:49 UTC
Last modified: 1 May 2006, 15:19:26 UTC

I am also running an AB CASP6 t241, and it too has 340K steps. Each model I've seen on the t209 has been 340K steps (models 3, 4, 5, 6, and 7).
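As a rough cross-check, the step counts can be turned into a time-per-model estimate. The Python sketch below is only an illustration: it assumes Jose's work unit also had 340K steps per model (that figure was only observed on t209 and t241), and the function name is hypothetical. Recomputing from the figures quoted in this thread, 1,020,000 steps in 1:27:14 comes to about 195 steps/second.

```python
STEPS_PER_MODEL = 340_000  # seen on t209 and t241; ASSUMED for Jose's WU as well


def hours_per_model(steps_per_second: float,
                    steps_per_model: int = STEPS_PER_MODEL) -> float:
    """Estimated hours to finish one model at a steady step rate."""
    return steps_per_model / steps_per_second / 3600


# AMD64 3700: three full models finished at 01:27:14 of CPU time.
astro_rate = 3 * STEPS_PER_MODEL / (1 * 3600 + 27 * 60 + 14)
# Jose's second screenshot: step 292801 at 09:30:22 of CPU time.
jose_rate = 292_801 / (9 * 3600 + 30 * 60 + 22)

print(f"Astro: {astro_rate:.0f} steps/s, "
      f"{hours_per_model(astro_rate) * 60:.0f} min per model")
print(f"Jose:  {jose_rate:.2f} steps/s, "
      f"{hours_per_model(jose_rate):.1f} h per model")
```

If the 340K assumption holds, Jose's rate of about 8.56 steps/second works out to roughly 11 hours per model, which lines up with the "after 11+ hours one model was produced" report earlier in the thread.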
Feet1st
Message 15175 - Posted: 1 May 2006, 16:19:38 UTC

Sorry Jose, I just saw your request for more detail on my task manager questions to you. I have a theory that your PC does a lot of page faulting, and this makes it take even longer to get work done. It looks like you're now ready to wrap up this thread, so we can table that issue. I plan to open a new thread this week where I'll spell out details on how to gather what I'm asking about. Look for a thread here on the crunching board titled "High page faults for some WUs".
Jose

Message 15204 - Posted: 1 May 2006, 19:07:55 UTC - in response to Message 15175.  

Sorry Jose, I just saw your request for more detail on my task manager questions to you. I have a theory that your PC does a lot of page faulting, and this makes it take even longer to get work done. It looks like you're now ready to wrap up this thread, so we can table that issue. I plan to open a new thread this week where I'll spell out details on how to gather what I'm asking about. Look for a thread here on the crunching board titled "High page faults for some WUs".


Well, should you wish, you can send me an email at joseantonio@choicecable.net

Moderator9
Message 15216 - Posted: 1 May 2006, 21:26:54 UTC

Jose:

There are over 55,000 people attached to Rosetta. As I look around the boards, fewer than 600 are regular posters. In my experience, for every person that actually asks a question there will always be at least 10 that had the same question and for any number of reasons did not ask it. So if I multiply that out, you are representing at least 2,000-5,000 of these users all by yourself.

In that group of 55,000 users the scale of experience runs from none to guru. As I think you would agree the only stupid question is the one which goes unasked. I would prefer that people post the questions, than either suffer in silence or simply leave without asking first.

Rosetta is truly a community of users. Almost a family. If I do not have the answer (not uncommon) someone else will. I think you can see that the moment you began posting, people rushed to help you. While I wish more people would ask their questions sooner, so they would not be frustrated and angry by the time they post, any post at all is welcome.

For my part I extend thanks to you for your tolerance and willingness to work with people to find the answers for yourself and your 5,000 silent partners.

Moderator9
Jose

Message 15233 - Posted: 2 May 2006, 2:12:45 UTC - in response to Message 15216.  

Jose:

Rosetta is truly a community of users. Almost a family. If I do not have the answer (not uncommon) someone else will. I think you can see that the moment you began posting people rushed to help you.


Yes, and for that I am most grateful. That is why I will keep posting whenever a doubt or a problem arises.





©2024 University of Washington
https://www.bakerlab.org