Failed to stop applications; aborting CPU benchmarks

Message boards : Number crunching : Failed to stop applications; aborting CPU benchmarks

Profile proxima

Joined: 9 Dec 05
Posts: 44
Credit: 4,148,186
RAC: 0
Message 32642 - Posted: 14 Dec 2006, 16:27:12 UTC

For some time, I have been finding Rosetta/BOINC stops with errors like the following:

2006-12-09 18:44:16 [rosetta@home] Started download of file 1a19A_R30_R55_cheat.bar
2006-12-09 18:44:19 [rosetta@home] Finished download of file 1a19A_R30_R55_cheat.bar
2006-12-09 18:44:19 [rosetta@home] Throughput 145 bytes/sec
2006-12-09 18:44:20 [---] Rescheduling CPU: files downloaded
2006-12-09 19:36:54 [---] Suspending computation - running CPU benchmarks
2006-12-09 19:36:54 [rosetta@home] Pausing task PSH_0097_looprlx_GP120_OD1_138_147_7082_1414_13_2 (removed from memory)
2006-12-09 19:36:54 [---] Suspending network activity - running CPU benchmarks
2006-12-09 19:36:57 [---] Running CPU benchmarks
2006-12-09 19:37:05 [---] Failed to stop applications; aborting CPU benchmarks
2006-12-09 19:37:06 [---] Resuming computation
2006-12-09 19:37:06 [---] Rescheduling CPU: Resuming computation
2006-12-09 19:37:06 [---] Resuming network activity
2006-12-09 19:37:06 [---] Process 23015 not found


Nothing then happens until I manually stop and restart the processes.
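
For anyone who wants to catch this without watching the machine, here is a minimal Python sketch that scans saved client messages for the pattern in the excerpt above: an aborted benchmark followed shortly by "Process N not found". The function name `find_stalls`, the `window` parameter, and the idea of reading the messages from a saved text file are assumptions made for illustration; only the log line layout and the two message strings come from the excerpt.

```python
import re
from typing import Iterable, List

# Layout taken from the excerpt: "YYYY-MM-DD HH:MM:SS [project] message"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[([^\]]*)\] (.*)$")
ABORT_MSG = "Failed to stop applications; aborting CPU benchmarks"
LOST_RE = re.compile(r"^Process (\d+) not found$")


def find_stalls(lines: Iterable[str], window: int = 10) -> List[dict]:
    """Return one record per aborted benchmark that is followed, within
    `window` parsed lines, by a 'Process N not found' message.
    (Hypothetical helper; `window` is an arbitrary illustrative default.)"""
    parsed = []
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match:  # silently skip anything that is not a log line
            parsed.append(match.groups())

    stalls = []
    for i, (stamp, _project, msg) in enumerate(parsed):
        if msg != ABORT_MSG:
            continue
        # Look only at the next few messages after the abort.
        for _stamp, _proj, later_msg in parsed[i + 1 : i + 1 + window]:
            lost = LOST_RE.match(later_msg)
            if lost:
                stalls.append({"aborted_at": stamp, "pid": int(lost.group(1))})
                break
    return stalls


# Lines copied from the excerpt above, used as sample input.
SAMPLE = """\
2006-12-09 19:36:57 [---] Running CPU benchmarks
2006-12-09 19:37:05 [---] Failed to stop applications; aborting CPU benchmarks
2006-12-09 19:37:06 [---] Resuming computation
2006-12-09 19:37:06 [---] Rescheduling CPU: Resuming computation
2006-12-09 19:37:06 [---] Resuming network activity
2006-12-09 19:37:06 [---] Process 23015 not found
"""

if __name__ == "__main__":
    for stall in find_stalls(SAMPLE.splitlines()):
        print(stall)
```

To use it on a real machine, pass it the lines of whatever file the client's messages are captured in (the location depends on the installation). A non-empty result could then trigger an alert, or the manual stop and restart described above.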

There are a few references to this error on the forums, but no real solution posted as far as I could understand.

I'm using BOINC 5.4.9 on Fedora Core 5.

Any ideas? I love Rosetta@Home, and I'm trying to avoid moving to other projects if at all possible. However, if I cannot get this to run reliably without manual intervention every few days, I may need to.
Alver Valley Software Ltd - Contributing ALL our spare computing power to BOINC, 24x365.
Profile proxima

Joined: 9 Dec 05
Posts: 44
Credit: 4,148,186
RAC: 0
Message 32643 - Posted: 14 Dec 2006, 16:29:42 UTC

Hmm - I've just noticed, in conjunction with re-reading one of the other threads, that I'm set to "Leave applications in memory while suspended?: no". I'll try setting this to "yes", and see if that helps.


Alver Valley Software Ltd - Contributing ALL our spare computing power to BOINC, 24x365.
Profile Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 32645 - Posted: 14 Dec 2006, 18:30:46 UTC

Not sure on that one. I don't think that leaving the app in memory is going to affect the contact with the application. But it would be another case where Rosetta gets suspended, and work is lost if the application is not kept in memory.

The benchmarks are not all that meaningful for Rosetta work anymore, so I wouldn't let an issue like not having benchmarks run successfully determine which projects I run. There are other BOINC issues with losing contact with the application, so I'm hopeful that some of the upcoming BOINC releases will improve these areas. It's entirely likely the same thing occurs on other projects.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
Profile proxima

Joined: 9 Dec 05
Posts: 44
Credit: 4,148,186
RAC: 0
Message 32686 - Posted: 15 Dec 2006, 9:09:19 UTC - in response to Message 32645.  

"I wouldn't let an issue like not having benchmarks run successfully determine which projects I run."


Oh, I agree - me neither. But when this error occurs, all processing stops - no further work units are processed until I stop and restart BOINC.

As my machines run unattended most of the time, it can be days before I notice, and with this happening probably at least weekly, my machines are probably running less than 50% of the time overall - not the whole idea!

I'll keep an eye on the machines now I've changed that setting to "yes", and post what happens either way.

Thanks for your reply.
Tom
Alver Valley Software Ltd - Contributing ALL our spare computing power to BOINC, 24x365.
FluffyChicken
Joined: 1 Nov 05
Posts: 1260
Credit: 369,635
RAC: 0
Message 32691 - Posted: 15 Dec 2006, 11:05:55 UTC

If it still fails to work properly with it set to yes, I think some issues related to this were dealt with in newer (development) versions of BOINC. I can't remember when, but I do remember it cropping up.
Either hold out for it to be officially released or give the latest version a go now:
http://boinc.berkeley.edu/download_all.php (5.7.5, I think, for Linux)

Worth a try.

You can always revert back.

Team mauisun.org
Profile proxima

Joined: 9 Dec 05
Posts: 44
Credit: 4,148,186
RAC: 0
Message 32692 - Posted: 15 Dec 2006, 11:13:09 UTC

Will do - I'll see how this goes, and try the new version if necessary.

Thanks.
Tom
Alver Valley Software Ltd - Contributing ALL our spare computing power to BOINC, 24x365.
