Message boards : Number crunching : Report Problems with Rosetta Version 5.16 I
Author | Message |
---|---|
Nightbird Joined: 17 Sep 05 Posts: 70 Credit: 32,418 RAC: 0 |
|
Moderator9 Volunteer moderator Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
|
Seth Aaronson Joined: 5 Mar 06 Posts: 18 Credit: 3,976 RAC: 0 |
No Screen Saver, no errors. If anyone is interested, here are the running processes from the Hijack this log:
Running processes:
C:WINDOWSSystem32smss.exe
C:WINDOWSsystem32winlogon.exe
C:WINDOWSsystem32services.exe
C:WINDOWSsystem32lsass.exe
C:WINDOWSsystem32svchost.exe
C:WINDOWSSystem32svchost.exe
C:Program FilesAheadInCDInCDsrv.exe
C:WINDOWSExplorer.EXE
C:WINDOWSsystem32spoolsv.exe
C:WINDOWSzHotkey.exe
C:Program FilesAheadInCDInCD.exe
C:Program FileseMachines Bay Readershwiconem.exe
C:Program FilesCommon FilesMicrosoft SharedWorks SharedWkUFind.exe
C:PROGRA~1GrisoftAVGFRE~1avgcc.exe
C:PROGRA~1GrisoftAVGFRE~1avgemc.exe
C:Program FilesHPhpcoretechhpcmpmgr.exe
C:Program FilesiTunesiTunesHelper.exe
C:Program FilesQuickTime7qttask.exe
C:Program FilesQUICKENWQAGENT.EXE
C:WINDOWSsystem32mrtMngr.EXE
C:Program FilesMicrosoft SQL Server80ToolsBinnsqlmangr.exe
C:Program FilespalmOneHOTSYNC.EXE
C:Program FilesHPhpcoretechcomphptskmgr.exe
C:PROGRA~1GrisoftAVGFRE~1avgamsvr.exe
C:PROGRA~1GrisoftAVGFRE~1avgupsvc.exe
C:CFusionMX7runtimebinjrunsvc.exe
C:CFusionMX7dbslserver54binswagent.exe
C:CFusionMX7runtimebinjrun.exe
C:CFusionMX7dbslserver54binswstrtr.exe
C:CFusionMX7dbslserver54binswsoc.exe
C:CFusionMX7verityk2_nti40bink2admin.exe
C:Program FilesCisco SystemsVPN Clientcvpnd.exe
C:WINDOWSLogWatNT.exe
C:CFusionMX7verityk2_nti40bink2server.exe
C:CFusionMX7verityk2_nti40bink2index.exe
C:Program FilesMySQLMySQL Server 4.1.1.2abinmysqld-nt.exe
C:WINDOWSsystem32nvsvc32.exe
C:Program FilesMicrosoft SQL Server90Sharedsqlbrowser.exe
C:WINDOWSSystem32svchost.exe
C:WINDOWSsystem32UAService7.exe
C:Program FilesVMwareVMware Playervmware-authd.exe
C:Program FilesCommon FilesVMwareVMware Virtual Image Editingvmount2.exe
C:WINDOWSsystem32vmnat.exe
C:Program FilesMicrosoft SQL ServerMSSQL.1MSSQLBinnmsftesql.exe
C:WINDOWSsystem32vmnetdhcp.exe
C:Program FilesiPodbiniPodService.exe
C:Program FilesBOINCboincmgr.exe
C:Program FilesBOINCboinc.exe
C:Program FilesBOINCprojectssetiathome.berkeley.edusetiathome_5.15_windows_intelx86.exe
C:Program FilesBOINCprojectsboinc.bakerlab.org_rosettarosetta_5.16_windows_intelx86.exe
C:Program FilesMozilla Thunderbirdthunderbird.exe
C:PROGRA~1MOZILL~4FIREFOX.EXE |
K1100LTSE Joined: 28 Feb 06 Posts: 7 Credit: 192,387 RAC: 0 |
Result ID: 20893015
Name: t283_HOMOLOG_ABRELAX_hom001__515_20431_0
Workunit: 17437191
Created: 19 May 2006 20:14:40 UTC
Sent: 19 May 2006 22:20:09 UTC
Received: 20 May 2006 10:53:20 UTC
Server state: Over
Outcome: Client error
Client state: Computing
Exit status: -1073741811 (0xc000000d)
Computer ID: 193286
Report deadline: 2 Jun 2006 22:20:09 UTC
CPU time: 2431.078125
stderr out:
<core_client_version>5.4.9</core_client_version>
<message>
 - exit code -1073741811 (0xc000000d)
</message>
<stderr_txt>
# random seed: 3329570
# cpu_run_time_pref: 10800
</stderr_txt>
Validate state: Invalid
Claimed credit: 3.69662235638505
Granted credit: 0
application version: 5.16 |
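The exit status in the result above appears in two forms: the signed decimal -1073741811 and the hexadecimal 0xc000000d. As a minimal sketch (not part of the original post), the Python below shows how the two relate, by reinterpreting the signed value as an unsigned 32-bit number. The names `exit_status_to_hex`, `describe`, and `KNOWN_CODES` are illustrative assumptions; the symbolic labels are the standard Windows NTSTATUS names for these values, not something Rosetta or BOINC reported in this thread.

```python
# Hypothetical helper names; only the numeric codes come from the thread.
KNOWN_CODES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000000D: "STATUS_INVALID_PARAMETER",
    0x40010004: "DBG_TERMINATE_PROCESS",
}


def exit_status_to_hex(status: int) -> str:
    """Render a (possibly negative) 32-bit exit status as unsigned hex."""
    return f"0x{status & 0xFFFFFFFF:08x}"


def describe(status: int) -> str:
    """Return the hex form plus a symbolic name when one is known."""
    unsigned = status & 0xFFFFFFFF
    name = KNOWN_CODES.get(unsigned, "unknown")
    return f"{exit_status_to_hex(status)} ({name})"


if __name__ == "__main__":
    # Exit statuses quoted in this thread.
    for code in (-1073741811, 1073807364):
        print(code, "->", describe(code))
```

Masking with `0xFFFFFFFF` is what turns the negative decimal into the hex value shown next to it on the result page, so either form can be used when searching for the same error.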
Jimi@0wned.org.uk Joined: 10 Mar 06 Posts: 29 Credit: 335,252 RAC: 0 |
Failed. I accidentally knocked the power off the water pump without noticing and the CPU brewed up. No damage. Result ID 20840881 Name t283_HOMOLOG_ABRELAX_hom001__515_15191_0 Workunit 17390031 |
pieface Joined: 20 Sep 05 Posts: 17 Credit: 797,661 RAC: 0 |
This is probably the same 0xc0000005 problem I reported on 5.13 earlier here, but on a different machine: still Win XP, but this one is a Pentium M. The unit died overnight, i.e. no one was messing with the screensaver or anything, and then the security package tied things up with a dialog box because Rosetta was trying to access a DNS server. The output is in the result. I guess this means that with the 'new' debugger code, all of the executing programs have to be identified to security software in case they need to go out looking for symbols for a dump or something? |
webmaster777 Joined: 15 Apr 06 Posts: 1 Credit: 13,500 RAC: 0 |
2006-05-19 16:51:00 [rosetta@home] Unrecoverable error for result T0283_FACONTACTS_hom003_508_18704_0 ( - exit code 1073807364 (0x40010004))
2006-05-20 14:52:55 [rosetta@home] Unrecoverable error for result t287_HOMOLOG_ABRELAX_hom001__513_16762_0 ( - exit code 1073807364 (0x40010004))
Both ended automatically and claimed about 15 credits each. |
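For anyone collecting these messages from a BOINC log before posting, a rough sketch of pulling out the timestamp, result name, and exit code follows. The regular expression is written against the two log lines quoted above and is an assumption about the format, not a documented BOINC specification; `LOG_LINE` and `parse_unrecoverable` are hypothetical names.

```python
import re

# Pattern inferred from the two log lines quoted in this post (an assumption).
LOG_LINE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<project>[^\]]+)\] Unrecoverable error for result "
    r"(?P<result>\S+) "
    r"\( - exit code (?P<code>-?\d+) \((?P<hex>0x[0-9a-fA-F]+)\)\)"
)


def parse_unrecoverable(text: str) -> list:
    """Return one dict per 'Unrecoverable error' message found in text."""
    return [match.groupdict() for match in LOG_LINE.finditer(text)]


if __name__ == "__main__":
    sample = (
        "2006-05-19 16:51:00 [rosetta@home] Unrecoverable error for result "
        "T0283_FACONTACTS_hom003_508_18704_0 "
        "( - exit code 1073807364 (0x40010004))"
    )
    for entry in parse_unrecoverable(sample):
        print(entry["timestamp"], entry["result"], entry["hex"])
```

Because `finditer` scans the whole string, the sketch also works when several messages have been pasted onto a single line.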
truckpuller Joined: 5 Nov 05 Posts: 40 Credit: 229,134 RAC: 0 |
I keep getting the message "If this happens again you may need to reset the project." I reset the project about a week ago, and now I'm getting this message again. I have also noticed that my RAC on the same machine has dropped from 250 to about 220. Visit us at Christianboards.org |
Moderator9 Volunteer moderator Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
"I keep getting the message 'If this happens again you may need to reset the project.' I reset the project about a week ago, and now I'm getting this message again. I have also noticed that my RAC on the same machine has dropped from 250 to about 220." This error is usually accompanied by a mention of a missing file, and it can usually be ignored. See here. Moderator9 ROSETTA@home FAQ Moderator Contact |
truckpuller Joined: 5 Nov 05 Posts: 40 Credit: 229,134 RAC: 0 |
"I keep getting the message 'If this happens again you may need to reset the project.' I reset the project about a week ago, and now I'm getting this message again. I have also noticed that my RAC on the same machine has dropped from 250 to about 220." OK, I see where it said "exited with zero status but no finished files", and I have several of these. Does this mean I get no credit for these jobs? Thanks for the reply. Visit us at Christianboards.org |
anders n Joined: 19 Sep 05 Posts: 403 Credit: 537,991 RAC: 0 |
As you can see on this page, https://boinc.bakerlab.org/rosetta/result.php?resultid=19607341, you do get credit for the work done. :) Anders n |
Moderator9 Volunteer moderator Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
"I keep getting the message 'If this happens again you may need to reset the project.' I reset the project about a week ago, and now I'm getting this message again. I have also noticed that my RAC on the same machine has dropped from 250 to about 220." Let me apologize to everyone for what I am about to say. This is the single most Frequently Asked Question (FAQ) on these forums, so logically the answer might be found in a FAQ. The FAQs take a lot of time to prepare and maintain, and I am beginning to think they are not helping people very much. If you can't find the answer to your question there, I need to know that so I can add it if it would help a lot of people to see it. Moderator9 ROSETTA@home FAQ Moderator Contact |
Jose Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0 |
"Psssst... Jose, are you in here? How's it going?" When I returned I discovered that the unit was stuck. The task manager did not record the existence of the Rosetta exe file (it was not there), and yet the BOINC files were there and 99% of the CPU went to idle. Two attempts at reattaching had to be dumped, as the exe files, once in and started to work, disappeared. Result: 2 phantom files. Not a great day for me, Rosetta- or life-wise. I am so angry I came close to torching the computer. I tried reattaching now. It is reportedly working. I am going back to bed. My head hurts. If the machine fails again, it is torching time. |
Moderator9 Volunteer moderator Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
"Psssst... Jose, are you in here? How's it going?" Well, at least this is different. Keep us posted. I have no idea why the EXE would vanish. Moderator9 ROSETTA@home FAQ Moderator Contact |
Jose Joined: 28 Mar 06 Posts: 820 Credit: 48,297 RAC: 0 |
"Keep us posted." There is no way I can recover from this fiasco. I am frustrated as hell. So I think I must ask for a special favor. In my results page there are three WUs that are basically lost to me. There is no way they will be processed until they are removed from my account and sent to other computers. These are the units in question: https://boinc.bakerlab.org/rosetta/result.php?resultid=20874518 https://boinc.bakerlab.org/rosetta/result.php?resultid=20830429 https://boinc.bakerlab.org/rosetta/result.php?resultid=20525150 Please cancel them from my computer and send them to others to be processed. It seems that is going to be the only thing I can do to advance the project. I do apologize for all the time and resources I have wasted. Good luck to all. I will keep reading the boards, but I don't think I should try downloading units. These are CASP units, and all my errors and problems are but a drag. I am not a happy camper. |
Ian Joined: 14 Apr 06 Posts: 29 Credit: 326,863 RAC: 637 |
Couple of errors from yesterday: https://boinc.bakerlab.org/rosetta/result.php?resultid=20850737 https://boinc.bakerlab.org/rosetta/result.php?resultid=20846087 Ian Cundell, St Albans, UK |
Moderator9 Volunteer moderator Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0 |
"...Please cancel them from my computer and send them to others to be processed. It seems that is going to be the only thing I can do to advance the project." Jose, first, all Work Units of the same name are the same; only the random number that is used to initiate processing changes. So do not be concerned about them being lost to the project; they aren't. Second, you are not wasting anyone's time. If you were, we would have abandoned you to marinate in BBQ sauce and tequila long ago. However, we would like it very much if you could keep processing for RALPH. You are providing critical information that the project is using to find and kill the 107 error bug. The programmers are particularly interested in why turning on EDP helped. Anyone who is still having errors should attach to and run RALPH. The error rate is very low on RALPH, but we think this is because the people having errors on Rosetta are not running RALPH. These are the very systems we need to have over there. So please, ANY OF YOU STILL HAVING ERRORS, ATTACH TO RALPH! WE NEED YOU. Please see Dr. Baker's journal for a message from the program team on this point. Moderator9 ROSETTA@home FAQ Moderator Contact |
Aglarond Joined: 29 Jan 06 Posts: 26 Credit: 446,212 RAC: 0 |
LINUX problem: I need help with this problem. While running Rosetta on a Linux server with a Pentium 4 Hyper-Threading processor, Rosetta occasionally hangs in a very strange state: everything is running except Rosetta. BOINC is running. The application on the other thread (Simap@home) is running. Just Rosetta isn't. The processor shows 50% idle. After some time BOINC decides to switch apps, Rosetta is preempted, and there are 2 Simaps happily running. After some more time BOINC decides to switch apps again, and there are 2 Rosettas hanging and the processor is 100% idle. And so on. After 2 days I stopped BOINC and ran it again, and both Rosettas started to work normally. This is not the first time it has happened. I even tried attaching to RALPH some time ago, but it never occurred there. It happens once every 2-3 weeks. Here are the results (both are valid, but have something in stderr): 20587857 20574470 Do you have any idea what could be wrong? Is it a bug in Rosetta, or is there some problem with the server where I run it? (By the way, yes, I have permission from the server's admin to run BOINC there.) |
Dimitris Hatzopoulos Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0 |
"LINUX problem:" I encountered this particular issue back in Jan/Feb-06 (also under Linux), about 5-6 times overall. The BOINC log would show that BOINC restarted Rosetta, but the Rosetta process would just stay "idle" (ps flags were "SN" = sleep, nice; consuming no CPU time) for hours or days, until I manually killed it (I guess nowadays the "watchdog" thread will catch it). At the time, I thought it was an issue with the Rosetta+BOINC interaction, as I think it happened upon resuming a Rosetta WU (with leave-in-mem=yes). I also suspected some issue with the system's resources, as that PC had only 256MB RAM and I was running 6 BOINC projects and 100+ processes. It COULD have been a faulty WU, but when I ran that WU with the Rosetta command line outside BOINC, it completed fine. The things in common with your setup are BOINC 5.2.14 (optimised) and a 2.4.x kernel (mine was Debian Sarge). Trying to solve the problem, I reduced the number of BOINC projects to 4 (Rosetta, Ralph, Simap and LHC) and have not experienced any problems for the past 3+ months (since Feb-06), crunching 24/7:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
boinc 2120 0.0 0.5 7452 3760 ? S Apr27 0:28 ./boinc_client
boinc 25448 7.8 5.2 62208 38776 ? SN May19 240:32 sixtrack_4.66_i68
boinc 1078 12.5 18.4 191428 136776 ? SN May20 59:42 rosetta_beta_5.16
boinc 1079 0.0 18.4 191428 136776 ? SN May20 0:00 rosetta_beta_5.16
boinc 1080 0.0 18.4 191428 136776 ? SN May20 0:00 rosetta_beta_5.16
boinc 1081 0.0 18.4 191428 136776 ? SN May20 0:00 rosetta_beta_5.16
boinc 5254 59.9 0.9 11700 7260 ? SN 02:15 59:51 simap_5.07_i686-p
boinc 5255 0.0 0.9 11700 7260 ? SN 02:15 0:00 simap_5.07_i686-p
boinc 5256 0.0 0.9 11700 7260 ? SN 02:15 0:00 simap_5.07_i686-p
boinc 5828 99.5 8.8 94972 65656 ? RN 03:16 38:55 rosetta_5.16_i686
boinc 5829 0.0 8.8 94972 65656 ? SN 03:16 0:00 rosetta_5.16_i686
boinc 5830 0.0 8.8 94972 65656 ? SN 03:16 0:00 rosetta_5.16_i686
boinc 5831 0.0 8.8 94972 65656 ? SN 03:16 0:00 rosetta_5.16_i686
PS: I believe there was a SIGSEGV violation signal in my case also. You can search my post history for the details. Best UFO Resources Wikipedia R@h How-To: Join Distributed Computing projects that benefit humanity |
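The symptom described here (a Rosetta process that stays in the "SN" state and accumulates no CPU time) can be checked by comparing two ps snapshots taken some minutes apart. The Python below is only a sketch under stated assumptions: it expects text in the same column layout as the ps listing in this post, treats the TIME column as minutes:seconds, and sums CPU time per command name so that the extra thread entries showing 0:00 do not count as hangs on their own. The function names (`cpu_time_to_seconds`, `parse_ps`, `cpu_by_command`, `stalled_commands`) are hypothetical and not part of BOINC or Rosetta.

```python
from collections import defaultdict


def cpu_time_to_seconds(value: str) -> int:
    """Convert a ps TIME field such as '240:32' (minutes:seconds) to seconds."""
    minutes, seconds = value.split(":")
    return int(minutes) * 60 + int(seconds)


def parse_ps(text: str) -> list:
    """Parse ps output with the 11 columns shown in the post (header first)."""
    rows = []
    for line in text.strip().splitlines()[1:]:
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # skip blank or truncated lines
        rows.append({
            "pid": int(fields[1]),
            "stat": fields[7],
            "cpu_seconds": cpu_time_to_seconds(fields[9]),
            "command": fields[10],
        })
    return rows


def cpu_by_command(rows: list) -> dict:
    """Total CPU seconds per command name, so thread entries are combined."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["command"]] += row["cpu_seconds"]
    return dict(totals)


def stalled_commands(before: list, after: list, prefix: str = "rosetta") -> list:
    """Commands starting with prefix whose total CPU time did not advance."""
    old, new = cpu_by_command(before), cpu_by_command(after)
    return sorted(
        cmd for cmd in new
        if cmd.startswith(prefix) and cmd in old and new[cmd] <= old[cmd]
    )
```

A command is reported only when it appears in both snapshots, so a work unit that started between the two samples is not flagged. Note that a task BOINC has deliberately preempted would also show no progress, so a report from this sketch is a prompt to look at the BOINC log rather than proof of a hang.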
Hoelder1in Joined: 30 Sep 05 Posts: 169 Credit: 3,915,947 RAC: 0 |
This result exited with code "1", giving the error message: ERROR:: Exit at: dock_structure.cc line:401 This is a somewhat old Linux box with just 256 MB of memory, but it usually runs stably; this is its first error in, I guess, months... Team betterhumans.com - discuss and celebrate the future - hoelder1in.org |
©2024 University of Washington
https://www.bakerlab.org