Message boards : Number crunching : Problems with Rosetta version 5.43
Author | Message |
---|---|
adrianxw Joined: 18 Sep 05 Posts: 653 Credit: 11,840,739 RAC: 1 |
I wasn't sure it was a client problem so started a thread to highlight the issue. I'm seeing exactly the same problem. Wave upon wave of demented avengers march cheerfully out of obscurity into the dream. |
Tiago Joined: 11 Jul 06 Posts: 55 Credit: 2,538,721 RAC: 0 |
I'm having the same problem here. Something is wrong with that wu. |
paulcsteiner Joined: 15 Oct 05 Posts: 19 Credit: 3,144,322 RAC: 0 |
Ditto, I've gotten 15 client errors just today, all with this message:
<core_client_version>5.4.11</core_client_version>
<message> Incorrect function. (0x1) - exit code 1 (0x1) </message>
<stderr_txt> # random seed: 2035473 ERROR:: Exit at: .fragments.cc line:459
For this machine: 401324 |
Lynn Joined: 13 Jan 07 Posts: 3 Credit: 2,766,009 RAC: 0 |
Are you all using Windows or Linux? I've had no failures on 3 Windows XP Pro systems I am running, but I finally had to detach from Rosetta on *ALL* of my Linux systems (I have two running Ubuntu 6.06 and one running Ubuntu 6.10) as I was seeing nearly 90% failure and these systems were offering rosetta 50% of their time. I didn't mind the tasks that ran 10-15 seconds before failing, but one of these systems is a new P4 dual-core and I had WU's hogging 6 to 8 hours (or 4 days for one WU!!!!) before they failed. Better that I donate that CPU power to another project that produces something useful. Here is the dual-core's result page: 398561 - Lynn |
Lee Joined: 18 Apr 06 Posts: 4 Credit: 36,335 RAC: 0 |
I am getting the following message:
2007/01/23 12:16:22 AM|rosetta@home|Note: not requesting new work or reporting results
2007/01/23 12:16:27 AM|rosetta@home|Scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi succeeded
2007/01/23 12:16:31 AM|rosetta@home|Started download of PSH_0037_2761.loopfile
2007/01/23 12:16:33 AM|rosetta@home|Temporarily failed download of PSH_0037_2761.loopfile: error 500
2007/01/23 12:16:34 AM|rosetta@home|Started download of PSH_0037_2761.loopfile
2007/01/23 12:16:36 AM|rosetta@home|Temporarily failed download of PSH_0037_2761.loopfile: error 500
2007/01/23 12:16:37 AM|rosetta@home|Started download of PSH_0037_2761.loopfile
2007/01/23 12:16:39 AM|rosetta@home|Temporarily failed download of PSH_0037_2761.loopfile: error 500
2007/01/23 12:16:40 AM|rosetta@home|Started download of PSH_0037_2761.loopfile
2007/01/23 12:16:42 AM|rosetta@home|Temporarily failed download of PSH_0037_2761.loopfile: error 500
2007/01/23 12:16:42 AM|rosetta@home|Backing off 2 hours, 26 minutes, and 33 seconds on download of file PSH_0037_2761.loopfile
The WUs have been done & sitting on my machine for a while... |
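For anyone who wants to tally retries like the ones in the log above, here is a minimal Python sketch. It assumes the `timestamp|project|message` layout shown in the post; the function name `count_failed_downloads` and the shortened sample text are illustrative only and are not part of BOINC.

```python
from collections import Counter

# Shortened sample in the same layout as the log in the post (assumption:
# each entry is "timestamp|project|message" on its own line).
SAMPLE_LOG = """\
2007/01/23 12:16:31 AM|rosetta@home|Started download of PSH_0037_2761.loopfile
2007/01/23 12:16:33 AM|rosetta@home|Temporarily failed download of PSH_0037_2761.loopfile: error 500
2007/01/23 12:16:34 AM|rosetta@home|Started download of PSH_0037_2761.loopfile
2007/01/23 12:16:36 AM|rosetta@home|Temporarily failed download of PSH_0037_2761.loopfile: error 500
"""

FAILED_PREFIX = "Temporarily failed download of "


def count_failed_downloads(log_text):
    """Return a Counter mapping (file name, error text) to failure count."""
    failures = Counter()
    for line in log_text.splitlines():
        # Split into at most three fields so any "|" in the message survives.
        parts = line.split("|", 2)
        if len(parts) != 3:
            continue  # not a log entry in the assumed layout
        message = parts[2].strip()
        if message.startswith(FAILED_PREFIX):
            rest = message[len(FAILED_PREFIX):]
            name, _, error = rest.partition(": ")
            failures[(name, error)] += 1
    return failures


for (name, error), count in count_failed_downloads(SAMPLE_LOG).items():
    print(f"{name}: {count} failed attempts ({error})")
```

A count that keeps growing for the same file with the same HTTP error would suggest a server-side problem rather than something wrong on the host.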
Lee Joined: 18 Apr 06 Posts: 4 Credit: 36,335 RAC: 0 |
Oops message got duplicated. Does anyone know why this happens? |
netwraith Joined: 3 Sep 06 Posts: 80 Credit: 13,483,227 RAC: 0 |
Same with me. I had all 4 PSH_0131_looprlx work units I downloaded fail in a similar manner. Same here... about 20 WU's total... Looking for a team ??? Join BoincSynergy!! |
Chu Joined: 23 Feb 06 Posts: 120 Credit: 112,439 RAC: 0 |
A bad batch (PSH_003?_looprlx...) slipped through and we were purging it from the database this morning. Because these work units fail right after they start, we don't expect any to be left out there now, and the impact on RAC should be minimal. However, that is no excuse for making such a mistake, and we are very sorry for the inconvenience this caused all the BOINC users. Thank you for reporting it and for your support. Quoted post: "I am getting the following message:" |
Thomas F. Bates IV Joined: 10 May 06 Posts: 5 Credit: 2,853,254 RAC: 0 |
I hated to do it, but I had to kill the rosetta process. The last 5.43 WU I had was stuck at 100% CPU usage even though it was marked as 100% complete. Oh well...just uploaded a "computation error"... |
Chu Joined: 23 Feb 06 Posts: 120 Credit: 112,439 RAC: 0 |
Which one was it? I see you have four hosts running R@H, and if you can point me to the one you killed, it may give us a clue as to what went wrong. Thanks. Quoted post: "I hated to do it, but I had to kill the rosetta process. The last 5.43 WU I had was stuck at 100% CPU usage even though it was marked as 100% complete. Oh well...just uploaded a "computation error"..." |
BennyRop Joined: 17 Dec 05 Posts: 555 Credit: 140,800 RAC: 0 |
Quoted post: "Are you all using Windows or Linux? I've had no failures on 3 Windows XP Pro systems I am running, but I finally had to detach from Rosetta on *ALL* of my Linux systems (I have two running Ubuntu 6.06 and one running Ubuntu 6.10) as I was seeing nearly 90% failure and these systems were offering rosetta 50% of their time. I didn't mind the tasks that ran 10-15 seconds before failing, but one of these systems is a new P4 dual-core and I had WU's hogging 6 to 8 hours (or 4 days for one WU!!!!) before they failed. Better that I donate that CPU power to another project that produces something useful." Lynn: While I don't use Linux for crunching, do these machines run Rosetta fine if they're running Rosetta 100% of the time? We've had better luck with the Windows machines if we set the "keep in memory" setting to yes. Perhaps someone can confirm or deny whether that's been a problem with the Linux systems as well. (I'm trying to eliminate the keep-in-memory problem and possible interactions between Rosetta and whatever other BOINC app or apps take up the other 50%.) Since you've got 3 Linux machines with similar problems, have you tried a different distribution of Linux, like Red Hat? (I'm trying to rule out versions that can't identify your hardware properly.) Or verified that the 3 haven't managed to get infected? I'd ask if you'd tested the memory, but a memory fault wouldn't appear on 3 machines at once unless you recycled the RAM from dead machines. |
Feet1st Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
Chu, RE: Thomas Bates, looks like this wu. It is called 1ail__BOINC_NOFILTERS_ABRELAX_SAVE_ALL_OUT_NEWRELAXFLAGS_frags83__1505_1221_0 It SAYS the watchdog ended it, but apparently not so. Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
hedera Joined: 15 Jul 06 Posts: 76 Credit: 5,263,150 RAC: 48 |
I almost hate to say it (knock wood!) but my WinXP Pro machine has been happily cranking out successful computations, all day yesterday and again this evening for about an hour and a half. I don't know if this is because the problem didn't hit WinXP, or because my habit of turning the machine off during the day while I'm at work paid off by missing the bad batch of downloads. As I think about it, it's probably because I missed the bad batch... --hedera Never be afraid to try something new. Remember that amateurs built the ark. Professionals built the Titanic. |
cwc Joined: 18 Dec 06 Posts: 1 Credit: 35,128 RAC: 0 |
I just discovered that I've got a "computation error" problem. On two occasions just a short time apart, I opened full-screen graphics on a work unit that had just started. I didn't do anything but watch it for a while. Everything was OK during the initial backbone search, but within a minute or so of starting the all-atom search, the application froze. (It may have frozen at the very start of the all-atom search. I'm not sure.) On the first occasion, Windows put up a message saying the application wasn't responding, followed a few seconds later by the standard "send error report" message as the program closed. I don't remember whether I closed the graphics screen or it closed on its own. By then the BOINC manager was running two other work units, one of them belonging to R@H, and the failed work unit was showing a "computation error". I clicked the "update" button to clear it out. I then opened the graphics screen for the just-started unit and watched to see what would happen. Again, just after the all-atom search began, I noticed that the application had frozen. And again, by the time the graphics screen closed, the BOINC manager was running another R@H work unit. This time I didn't open the graphics screen, and it ran to completion.
Here are the error messages for the two failed work units:
1/23/2007 01:01:14 AM|rosetta@home|Unrecoverable error for result 1shfA_BOINC_NOFILTERS_ABRELAX_SAVE_ALL_OUT_NEWRELAXFLAGS_frags83__1505_2579_0 ( - exit code 1073807364 (0x40010004))
1/23/2007 01:07:35 AM|rosetta@home|Unrecoverable error for result 1c9oA_BOINC_NOFILTERS_ABRELAX_SAVE_ALL_OUT_NEWRELAXFLAGS_frags83__1505_2891_0 ( - exit code -1073741819 (0xc0000005))
Here's my system info:
Machine name: CWC-05
Operating System: Windows XP Professional (5.1, Build 2600) Service Pack 2 (2600.xpsp_sp2_gdr.050301-1519)
Language: English (Regional Setting: English)
System Manufacturer: INTEL_
System Model: D945GNT_
BIOS: Default System BIOS
Processor: Intel(R) Pentium(R) D CPU 3.00GHz (2 CPUs)
Memory: 2046MB RAM
Page File: 947MB used, 2991MB available
Windows Dir: C:\WINDOWS
DirectX Version: DirectX 9.0c (4.09.0000.0904)
DX Setup Parameters: Not found
DxDiag Version: 5.03.2600.2180 32bit Unicode
Graphics Card: NVIDIA GeForce 7900 GTX
Screen Saver: Windows Star-Field
cwc |
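The two exit codes in the post above are the same kind of value shown two ways: a signed 32-bit decimal and its unsigned hexadecimal form. The sketch below shows the conversion. The small lookup table is an assumption added for illustration: it uses the standard Windows NTSTATUS names for these two values, and nothing in the thread confirms how the Rosetta application itself produced them. The helper names are hypothetical.

```python
def to_unsigned32(exit_code):
    """Reinterpret a signed 32-bit exit code as an unsigned value."""
    return exit_code & 0xFFFFFFFF


# Assumption: standard Windows NTSTATUS names for the two codes in the post.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0x40010004: "DBG_TERMINATE_PROCESS",
}


def describe_exit_code(exit_code):
    """Format an exit code as hex plus a name when the table has one."""
    value = to_unsigned32(exit_code)
    name = KNOWN_STATUS.get(value, "unknown")
    return f"0x{value:08x} ({name})"


for code in (1073807364, -1073741819):
    print(code, "->", describe_exit_code(code))
```

Masking with `0xFFFFFFFF` is what turns -1073741819 into 0xc0000005, which matches the hex value the BOINC manager printed next to it.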
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
I got a compute error after 20 secs, and I am at work with my home computer running. Check the error message on result # 58499325. Basically this is the text: <core_client_version>5.4.11</core_client_version> <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # random seed: 2091953 ERROR:: Exit at: .fragments.cc line:459 </stderr_txt> |
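Several reports in this thread share the same exit location with different random seeds. A minimal sketch for pulling those two fields out of a pasted report might look like the following. The regular expressions are assumptions based only on the text quoted in the posts, and `summarize` is a hypothetical helper, not a BOINC or Rosetta tool.

```python
import re

# Report text as pasted in the post above.
SAMPLE = (
    "<core_client_version>5.4.11</core_client_version> "
    "<message> Incorrect function. (0x1) - exit code 1 (0x1) </message> "
    "<stderr_txt> # random seed: 2091953 "
    "ERROR:: Exit at: .fragments.cc line:459 </stderr_txt>"
)

# Patterns inferred from the quoted stderr text (assumption).
SEED_RE = re.compile(r"# random seed:\s*(\d+)")
EXIT_RE = re.compile(r"ERROR:: Exit at:\s*(\S+)\s+line:(\d+)")


def summarize(report):
    """Extract the random seed and exit location, or None where absent."""
    seed = SEED_RE.search(report)
    exit_at = EXIT_RE.search(report)
    return {
        "seed": int(seed.group(1)) if seed else None,
        "source_file": exit_at.group(1) if exit_at else None,
        "line": int(exit_at.group(2)) if exit_at else None,
    }


print(summarize(SAMPLE))
```

Comparing the `source_file` and `line` fields across reports is a quick way to check whether different hosts are hitting the same failure.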
zombie67 [MM] Joined: 11 Feb 06 Posts: 316 Credit: 6,621,003 RAC: 0 |
I am having serious errors across all of my Macs running 5.8.* (some Intel, some PPC). They started a day or two ago. Almost nothing is crunching successfully. https://boinc.bakerlab.org/rosetta/result.php?resultid=58758493 https://boinc.bakerlab.org/rosetta/results.php?hostid=204292 https://boinc.bakerlab.org/rosetta/results.php?hostid=269065 Edit: made URLs clickable. Reno, NV Team: SETI.USA |
Stevea Joined: 19 Dec 05 Posts: 50 Credit: 738,655 RAC: 0 |
This one lasted 16 seconds... PSH_0144_looprlx_GP120_OD1_115_136_2663_1506_5 BETA = Bahhh Way too many errors, killing both the credit & RAC. And I still think the (New and Improved) credit system is not ready for prime time... |
zombie67 [MM] Joined: 11 Feb 06 Posts: 316 Credit: 6,621,003 RAC: 0 |
Quoted post: "I am having serious errors, across all of my macs, running 5.8.* (some intel, some PPC). They started a day or two ago. Almost nothing is crunching successfully." I noticed that these are all error code -161. What's error code -161? Reno, NV Team: SETI.USA |
Chu Joined: 23 Feb 06 Posts: 120 Credit: 112,439 RAC: 0 |
That is the error code for result files not being transferred correctly, either because the result files were not generated or because the client was unable to send them back to the server. If you have only experienced this problem recently, I would suggest resetting the project on your hosts, since the current application has not changed since last December and these specific WUs are returning valid results from other hosts. It seems like a communication issue between your host and the server, but I am not exactly sure what is causing it. Quoted post: "I am having serious errors, across all of my macs, running 5.8.* (some intel, some PPC). They started a day or two ago. Almost nothing is crunching successfully." |
Chu Joined: 23 Feb 06 Posts: 120 Credit: 112,439 RAC: 0 |
Thanks for reporting this. It is a well-known graphics problem in the current Rosetta application, and we are working on a fix right now. Until it is fixed, please try to leave the graphics off. For details, read here. Quoted post: "I just discovered that I've got a "computation error" problem." |
©2025 University of Washington
https://www.bakerlab.org