Silly Newbie Tricks - Suspending a work unit

Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 47587 - Posted: 10 Oct 2007, 4:31:05 UTC - in response to Message 47582.  
Last modified: 10 Oct 2007, 4:32:18 UTC

Since then it has run up more than 37 hours. I propose to let it run another day or so and see what happens.


Looks like your preferred runtime is 3hrs. The watchdog should have killed that task some time ago. You've already exited and restarted BOINC and it did not complete the task, so I suggest you abort it. Sorry.

Also, please join the Linux problems discussion
Rosetta Moderator: Mod.Sense
Jean-David Beyer

Joined: 2 Nov 05
Posts: 178
Credit: 5,721,968
RAC: 3,451
Message 47597 - Posted: 10 Oct 2007, 16:08:35 UTC - in response to Message 47587.  

Since then it has run up more than 37 hours. I propose to let it run another day or so and see what happens.


Looks like your preferred runtime is 3hrs. The watchdog should have killed that task some time ago. You've already exited and restarted BOINC and it did not complete the task, so I suggest you abort it. Sorry.

Also, please join the Linux problems discussion


Note that when I exited BOINC it did not manage to kill the rosetta processes. I seem to remember that this is always the case. Could there be a problem in either the BOINC client, or the rosetta application that makes this happen?

I do not care what my preferred run time is. Would it make sense for me to increase it?
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 47600 - Posted: 10 Oct 2007, 18:59:09 UTC - in response to Message 47597.  

Note that when I exited BOINC it did not manage to kill the rosetta processes. I seem to remember that this is always the case. Could there be a problem in either the BOINC client, or the rosetta application that makes this happen?


Yes, there could be a problem. Exploring that possibility is the purpose of the other thread and why I asked you to contribute your symptoms and observations there as well.


I do not care what my preferred run time is. Would it make sense for me to increase it?


I mentioned it only because one of the watchdog's criteria for ending a task is that it has run for more than 4 times your preferred runtime. So if you had recently changed your runtime to 24 hours, for example, I wouldn't have expected the watchdog to kick in yet. The watchdog not ending the task is another symptom we need to study in the Linux preemption thread.
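
To make the arithmetic concrete, here is a rough Python sketch of that one rule. The function name, the parameter names, and the strict greater-than comparison are assumptions made for illustration; this is not the actual Rosetta watchdog code, which checks other criteria as well.

```python
def watchdog_should_end(cpu_hours, preferred_runtime_hours, factor=4):
    """Return True if a task has run past the watchdog's runtime limit.

    Hypothetical helper: models only the "4 times the preferred runtime"
    criterion described above, nothing else the real watchdog may check.
    """
    limit_hours = factor * preferred_runtime_hours
    return cpu_hours > limit_hours


# 3-hour preference: the limit is 12 hours, so a 37-hour task is overdue.
overdue_at_3h = watchdog_should_end(37, 3)

# 24-hour preference: the limit is 96 hours, so 37 hours is still in bounds.
overdue_at_24h = watchdog_should_end(37, 24)
```

Under these assumptions, a task at 37 hours is well past the limit for a 3-hour preference but nowhere near it for a 24-hour one, which is why the preference setting matters when judging whether the watchdog has failed.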

No, I am not suggesting a change to your preference. Some Linux users feel a shorter runtime tends to improve their success rate.
Rosetta Moderator: Mod.Sense
Boris

Joined: 11 Oct 07
Posts: 1
Credit: 11,120
RAC: 0
Message 47689 - Posted: 13 Oct 2007, 16:27:26 UTC

I have the same issue with two work units on my 64-bit Ubuntu 7.04 distro.
The two tasks in question both start with STM0082_BOINC_MFR_ABRELAX_PICKED_2175
I've tried restarting my system and suspending and resuming the tasks. BOINC has already given me more projects to work on, and I've started working on those instead. I've already spent 9 hours of CPU time on each of the 'broken' ones, and I should have only had to spend half of that.



©2024 University of Washington
https://www.bakerlab.org