Report Problems with Rosetta Version 5.16 I

Dimitris Hatzopoulos
Joined: 5 Jan 06
Posts: 336
Credit: 80,939
RAC: 0
Message 16649 - Posted: 19 May 2006, 18:04:32 UTC - in response to Message 16602.  

Here, I just caught one on Ralph. These two snaps were taken within 10 seconds of each other:
The words get stretched out lower and longer to the point that it just becomes a straight line.


Thanks Feet1st, I tried to describe it in my earlier post in this thread:
https://boinc.bakerlab.org/rosetta/forum_thread.php?id=1619#16552

But "one picture is worth 1000 words".


Best UFO Resources
Wikipedia R@h
How-To: Join Distributed Computing projects that benefit humanity
ID: 16649 · Rating: 0
Rhiju
Volunteer moderator
Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 16650 - Posted: 19 May 2006, 18:04:43 UTC - in response to Message 16648.  


ID: 16650 · Rating: 0
Rhiju
Volunteer moderator
Joined: 8 Jan 06
Posts: 223
Credit: 3,546
RAC: 0
Message 16651 - Posted: 19 May 2006, 18:08:01 UTC - in response to Message 16645.  

crossworks -- this might be VERY important. I think I saw something
similar the one time my Windows laptop crashed -- a lot of opening and
closing graphics caused the same error. Can I possibly ask you to
attach your computer to Ralph for say ~50% of the time? We are
seeing fewer and fewer errors over there, and it's been hard for
us to solve this 0xc000000d error. Your crashes may give us
the stderr error reports we need.

When viewing the graphics, rosetta.exe errors out. It happens at random, sometimes after several hours and sometimes within a few seconds of starting the graphics.

5/19/2006 11:25:46 AM|rosetta@home|Unrecoverable error for result t283_HOMOLOG_ABRELAX_hom003__515_10854_0 ( - exit code -1073741811 (0xc000000d))

Event Type: Error
Event Source: Application Error
Event Category: None
Event ID: 1000
Date: 5/19/2006
Time: 11:25:41 AM
User: N/A
Computer: NEW-02GCGES16U9
Description:
Faulting application rosetta_5.16_windows_intelx86.exe, version 0.0.0.0, faulting module rosetta_5.16_windows_intelx86.exe, version 0.0.0.0, fault address 0x0057104e.

This is the 4th time it errored in one day looking at the graphics.


ID: 16651 · Rating: 0
BennyRop
Joined: 17 Dec 05
Posts: 555
Credit: 140,800
RAC: 0
Message 16654 - Posted: 19 May 2006, 18:36:31 UTC - in response to Message 16632.  

After we get this thing running, we can start to make it use the rest of the machine. Something is taking cycles away from the CPU to do something else. We shall find out what that is.


If we could get Jose to run HiJackThis! and post the HiJackThis! log file (in another thread, since they're usually rather long), it'll show everything that is being loaded automatically by Windows.

And if he also pulls up Task Manager (Ctrl-Alt-Del and select Task Manager if using the Win2k appearance), clicks on the "Processes" tab, and lists all of the lines that have 5% or more of the CPU, we'll be able to identify what's sapping his CPU performance.




ID: 16654 · Rating: 0
Nightbird
Joined: 17 Sep 05
Posts: 70
Credit: 32,418
RAC: 0
Message 16656 - Posted: 19 May 2006, 19:06:49 UTC - in response to Message 16650.  
Last modified: 19 May 2006, 19:08:20 UTC




ID: 16656 · Rating: 0
crossworks
Joined: 18 May 06
Posts: 9
Credit: 7,570
RAC: 0
Message 16657 - Posted: 19 May 2006, 19:12:53 UTC - in response to Message 16651.  

Attached to ralph.



crossworks -- this might be VERY important. I think I saw something
similar the one time my Windows laptop crashed -- a lot of opening and
closing graphics caused the same error. Can I possibly ask you to
attach your computer to Ralph for say ~50% of the time? We are
seeing fewer and fewer errors over there, and it's been hard for
us to solve this 0xc000000d error. Your crashes may give us
the stderr error reports we need.

When viewing the graphics, rosetta.exe errors out. It happens at random, sometimes after several hours and sometimes within a few seconds of starting the graphics.

5/19/2006 11:25:46 AM|rosetta@home|Unrecoverable error for result t283_HOMOLOG_ABRELAX_hom003__515_10854_0 ( - exit code -1073741811 (0xc000000d))

Event Type: Error
Event Source: Application Error
Event Category: None
Event ID: 1000
Date: 5/19/2006
Time: 11:25:41 AM
User: N/A
Computer: NEW-02GCGES16U9
Description:
Faulting application rosetta_5.16_windows_intelx86.exe, version 0.0.0.0, faulting module rosetta_5.16_windows_intelx86.exe, version 0.0.0.0, fault address 0x0057104e.

This is the 4th time it errored in one day looking at the graphics.



ID: 16657 · Rating: 0
Nightbird
Joined: 17 Sep 05
Posts: 70
Credit: 32,418
RAC: 0
Message 16663 - Posted: 19 May 2006, 21:21:51 UTC - in response to Message 16656.  
Last modified: 19 May 2006, 21:24:00 UTC




ID: 16663 · Rating: 0
Seth Aaronson
Joined: 5 Mar 06
Posts: 18
Credit: 3,976
RAC: 0
Message 16677 - Posted: 20 May 2006, 4:35:38 UTC

No Screen Saver, no errors.
If anyone is interested, here are the running processes from the Hijack this log:

Running processes:
C:\WINDOWS\System32\smss.exe
C:\WINDOWS\system32\winlogon.exe
C:\WINDOWS\system32\services.exe
C:\WINDOWS\system32\lsass.exe
C:\WINDOWS\system32\svchost.exe
C:\WINDOWS\System32\svchost.exe
C:\Program Files\Ahead\InCD\InCDsrv.exe
C:\WINDOWS\Explorer.EXE
C:\WINDOWS\system32\spoolsv.exe
C:\WINDOWS\zHotkey.exe
C:\Program Files\Ahead\InCD\InCD.exe
C:\Program Files\eMachines Bay Reader\shwiconem.exe
C:\Program Files\Common Files\Microsoft Shared\Works Shared\WkUFind.exe
C:\PROGRA~1\Grisoft\AVGFRE~1\avgcc.exe
C:\PROGRA~1\Grisoft\AVGFRE~1\avgemc.exe
C:\Program Files\HP\hpcoretech\hpcmpmgr.exe
C:\Program Files\iTunes\iTunesHelper.exe
C:\Program Files\QuickTime7\qttask.exe
C:\Program Files\QUICKENW\QAGENT.EXE
C:\WINDOWS\system32\mrtMngr.EXE
C:\Program Files\Microsoft SQL Server\80\Tools\Binn\sqlmangr.exe
C:\Program Files\palmOne\HOTSYNC.EXE
C:\Program Files\HP\hpcoretech\comp\hptskmgr.exe
C:\PROGRA~1\Grisoft\AVGFRE~1\avgamsvr.exe
C:\PROGRA~1\Grisoft\AVGFRE~1\avgupsvc.exe
C:\CFusionMX7\runtime\bin\jrunsvc.exe
C:\CFusionMX7\db\slserver54\bin\swagent.exe
C:\CFusionMX7\runtime\bin\jrun.exe
C:\CFusionMX7\db\slserver54\bin\swstrtr.exe
C:\CFusionMX7\db\slserver54\bin\swsoc.exe
C:\CFusionMX7\verity\k2\_nti40\bin\k2admin.exe
C:\Program Files\Cisco Systems\VPN Client\cvpnd.exe
C:\WINDOWS\LogWatNT.exe
C:\CFusionMX7\verity\k2\_nti40\bin\k2server.exe
C:\CFusionMX7\verity\k2\_nti40\bin\k2index.exe
C:\Program Files\MySQL\MySQL Server 4.1.1.2a\bin\mysqld-nt.exe
C:\WINDOWS\system32\nvsvc32.exe
C:\Program Files\Microsoft SQL Server\90\Shared\sqlbrowser.exe
C:\WINDOWS\System32\svchost.exe
C:\WINDOWS\system32\UAService7.exe
C:\Program Files\VMware\VMware Player\vmware-authd.exe
C:\Program Files\Common Files\VMware\VMware Virtual Image Editing\vmount2.exe
C:\WINDOWS\system32\vmnat.exe
C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Binn\msftesql.exe
C:\WINDOWS\system32\vmnetdhcp.exe
C:\Program Files\iPod\bin\iPodService.exe
C:\Program Files\BOINC\boincmgr.exe
C:\Program Files\BOINC\boinc.exe
C:\Program Files\BOINC\projects\setiathome.berkeley.edu\setiathome_5.15_windows_intelx86.exe
C:\Program Files\BOINC\projects\boinc.bakerlab.org_rosetta\rosetta_5.16_windows_intelx86.exe
C:\Program Files\Mozilla Thunderbird\thunderbird.exe
C:\PROGRA~1\MOZILL~4\FIREFOX.EXE
ID: 16677 · Rating: 0
K1100LTSE
Joined: 28 Feb 06
Posts: 7
Credit: 192,387
RAC: 0
Message 16690 - Posted: 20 May 2006, 11:38:22 UTC

Result ID 20893015
Name t283_HOMOLOG_ABRELAX_hom001__515_20431_0
Workunit 17437191
Created 19 May 2006 20:14:40 UTC
Sent 19 May 2006 22:20:09 UTC
Received 20 May 2006 10:53:20 UTC
Server state Over
Outcome Client error
Client state Computing
Exit status -1073741811 (0xc000000d)
Computer ID 193286
Report deadline 2 Jun 2006 22:20:09 UTC
CPU time 2431.078125
stderr out <core_client_version>5.4.9</core_client_version>
<message>
- exit code -1073741811 (0xc000000d)
</message>
<stderr_txt>
# random seed: 3329570
# cpu_run_time_pref: 10800

</stderr_txt>


Validate state Invalid
Claimed credit 3.69662235638505
Granted credit 0
application version 5.16
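
For anyone comparing these reports: the negative exit status is the signed 32-bit view of a Windows NTSTATUS value, which is why -1073741811 and 0xc000000d appear side by side. The following is only a minimal Python sketch of that conversion. The helper name `decode_exit_status` and the lookup table are illustrative assumptions (nothing here comes from BOINC or Rosetta), and the table lists only the three codes mentioned in this thread.

```python
# Illustrative helper (an assumption, not part of BOINC or Rosetta):
# convert the signed exit status shown on a result page into the
# unsigned hexadecimal NTSTATUS form and look up a symbolic name.

# Symbolic names as defined in the Windows NTSTATUS headers.
# Only the codes that appear in this thread are listed.
KNOWN_CODES = {
    0xC000000D: "STATUS_INVALID_PARAMETER",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0x40010004: "DBG_TERMINATE_PROCESS",
}


def decode_exit_status(status):
    """Return (hex_string, symbolic_name) for a signed 32-bit exit status."""
    unsigned = status & 0xFFFFFFFF          # reinterpret as unsigned 32-bit
    hex_form = "0x{:08x}".format(unsigned)
    return hex_form, KNOWN_CODES.get(unsigned, "unknown")


if __name__ == "__main__":
    for status in (-1073741811, 1073807364):
        hex_form, name = decode_exit_status(status)
        print(status, hex_form, name)
```

A symbolic name only says how the process ended; it does not by itself identify the cause inside the application.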

ID: 16690 · Rating: 0
Jimi@0wned.org.uk
Joined: 10 Mar 06
Posts: 29
Credit: 335,252
RAC: 0
Message 16691 - Posted: 20 May 2006, 11:55:27 UTC

Failed. I accidentally knocked the power off the water pump without noticing and the CPU brewed up. No damage.

Result ID 20840881
Name t283_HOMOLOG_ABRELAX_hom001__515_15191_0
Workunit 17390031
ID: 16691 · Rating: 0
pieface
Joined: 20 Sep 05
Posts: 17
Credit: 797,661
RAC: 0
Message 16692 - Posted: 20 May 2006, 12:46:05 UTC

This is probably the same 0xc0000005 problem that I reported on 5.13 earlier, but on a different machine: still Windows XP, but this one is a Pentium M.
The unit died overnight (so no one was touching the screensaver or anything else), and then the security package tied things up with a dialog box because Rosetta was trying to access a DNS server. The output is in the result.
I guess this means that with the 'new' debugger code, all of the executing programs have to be identified to security software in case they need to go out looking for symbols for a dump or something?
ID: 16692 · Rating: 0
webmaster777
Joined: 15 Apr 06
Posts: 1
Credit: 13,500
RAC: 0
Message 16695 - Posted: 20 May 2006, 13:33:54 UTC

2006-05-19 16:51:00 [rosetta@home] Unrecoverable error for result T0283_FACONTACTS_hom003_508_18704_0 ( - exit code 1073807364 (0x40010004))
2006-05-20 14:52:55 [rosetta@home] Unrecoverable error for result t287_HOMOLOG_ABRELAX_hom001__513_16762_0 ( - exit code 1073807364 (0x40010004))

Both were ended automatically
claimed about 15 credit each
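
When collecting many such lines, it can help to pull the result name and exit code out of the BOINC messages automatically. The sketch below is one possible way to do that in Python. It assumes only the message wording visible in this thread ("Unrecoverable error for result ... ( - exit code N (0x...))"), and the names `ERROR_RE`, `parse_error_lines`, and `SAMPLE` are hypothetical, not part of any BOINC tool.

```python
import re
from collections import Counter

# Matches the part of the message shared by both log layouts seen in this
# thread (the "date|project|text" layout and the "date [project] text" one).
ERROR_RE = re.compile(
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\( - exit code (?P<code>-?\d+) \((?P<hex>0x[0-9a-fA-F]+)\)\)"
)


def parse_error_lines(lines):
    """Return a list of (result_name, exit_code, hex_code) tuples."""
    found = []
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            found.append(
                (match.group("result"),
                 int(match.group("code")),
                 match.group("hex").lower())
            )
    return found


# Log lines copied from posts in this thread.
SAMPLE = [
    "2006-05-19 16:51:00 [rosetta@home] Unrecoverable error for result "
    "T0283_FACONTACTS_hom003_508_18704_0 ( - exit code 1073807364 (0x40010004))",
    "2006-05-20 14:52:55 [rosetta@home] Unrecoverable error for result "
    "t287_HOMOLOG_ABRELAX_hom001__513_16762_0 ( - exit code 1073807364 (0x40010004))",
    "5/19/2006 11:25:46 AM|rosetta@home|Unrecoverable error for result "
    "t283_HOMOLOG_ABRELAX_hom003__515_10854_0 ( - exit code -1073741811 (0xc000000d))",
]

if __name__ == "__main__":
    errors = parse_error_lines(SAMPLE)
    # Tally how often each exit code occurs.
    counts = Counter(hex_code for _, _, hex_code in errors)
    for hex_code, count in counts.most_common():
        print(hex_code, count)
```

Lines that do not contain the error message are skipped, so a whole message log can be passed in unchanged.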
ID: 16695 · Rating: 0
truckpuller
Joined: 5 Nov 05
Posts: 40
Credit: 229,134
RAC: 0
Message 16696 - Posted: 20 May 2006, 13:56:43 UTC
Last modified: 20 May 2006, 13:58:09 UTC

I keep getting the message (If this happens again you may need to reset the project). I reset the project about a week ago, and now I'm getting this message again. Also, I have noticed that my RAC on the same machine has dropped from 250 to about 220.
Visit us at Christianboards.org
ID: 16696 · Rating: 0
truckpuller
Joined: 5 Nov 05
Posts: 40
Credit: 229,134
RAC: 0
Message 16698 - Posted: 20 May 2006, 14:37:10 UTC - in response to Message 16697.  

I keep getting the message (If this happens again you may need to reset the project). I reset the project about a week ago, and now I'm getting this message again. Also, I have noticed that my RAC on the same machine has dropped from 250 to about 220.

This error is usually accompanied by a mention of a missing file, and it can usually be ignored. See here.


OK, I see where it said "exited with zero status but no finished files", and I have several of these. Does this mean I get no credit for these jobs?

Thanks for reply
Visit us at Christianboards.org
ID: 16698 · Rating: 0
anders n
Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 16700 - Posted: 20 May 2006, 14:43:14 UTC - in response to Message 16698.  


OK, I see where it said "exited with zero status but no finished files", and I have several of these. Does this mean I get no credit for these jobs?

Thanks for reply


As you can see on this page, https://boinc.bakerlab.org/rosetta/result.php?resultid=19607341, you do get credit for the work done. :)

Anders n

ID: 16700 · Rating: 0
Jose
Joined: 28 Mar 06
Posts: 820
Credit: 48,297
RAC: 0
Message 16717 - Posted: 20 May 2006, 17:02:47 UTC - in response to Message 16676.  

Psssst... Jose, are you in here? Hows it going?


When I returned, I discovered that the unit was stuck. Task Manager did not show the Rosetta exe file (it was not there), and yet the BOINC files were there and 99% of the CPU went to idle. Two attempts at reattaching had to be dumped because the exe files, once in and working, disappeared. Result: 2 phantom files.

Not a great day for me, Rosetta-wise or life-wise. I am so angry that I came close to torching the computer.

I tried reattaching now. It is reportedly working. I am going back to bed. My head hurts. If the machine fails again, it is torching time.
ID: 16717 · Rating: 0
Jose
Joined: 28 Mar 06
Posts: 820
Credit: 48,297
RAC: 0
Message 16726 - Posted: 20 May 2006, 22:00:33 UTC - in response to Message 16718.  

Keep us posted.


There is no way I can recover from this fiasco. I am frustrated as hell.

So I think I must ask for a special favor. In my results page there are three WUs that are basically lost to me. There is no way they will be processed until they are removed from my account and sent to other computers. These are the units in question:

https://boinc.bakerlab.org/rosetta/result.php?resultid=20874518

https://boinc.bakerlab.org/rosetta/result.php?resultid=20830429

https://boinc.bakerlab.org/rosetta/result.php?resultid=20525150

Please cancel them from my computer and send them to others to be processed. It seems that is going to be the only thing I can do to advance the project.

I do apologize for all the time and resources I have wasted. Good luck to all.

I will keep reading the boards, but I don't think I should try downloading units. These are CASP units, and all my errors and problems are but a drag.

I am not a happy camper.
ID: 16726 · Rating: 0
Ian
Joined: 14 Apr 06
Posts: 29
Credit: 364,629
RAC: 500
Message 16727 - Posted: 20 May 2006, 22:13:46 UTC

Couple of errors from yesterday:

https://boinc.bakerlab.org/rosetta/result.php?resultid=20850737

https://boinc.bakerlab.org/rosetta/result.php?resultid=20846087
Ian Cundell, St Albans, UK
ID: 16727 · Rating: 0
Aglarond
Joined: 29 Jan 06
Posts: 26
Credit: 446,212
RAC: 0
Message 16730 - Posted: 21 May 2006, 0:29:28 UTC

LINUX problem:
I need help with this problem: while running Rosetta on a Linux server with a Pentium 4 Hyper-Threading processor, Rosetta occasionally hangs in a very strange state: everything is running except Rosetta. BOINC is running. The application on the other thread (Simap@home) is running. Just Rosetta isn't. The processor shows 50% idle. After some time BOINC decides to switch apps, Rosetta is preempted, and there are 2 Simaps happily running. After some more time BOINC decides to switch apps again, and there are 2 Rosettas hanging and the processor is 100% idle. And so on.
After 2 days I stopped BOINC and ran it again, and both Rosettas started to work normally. This is not the first time it has happened. I even tried attaching to RALPH some time ago, but it never occurred there. It happens once every 2-3 weeks.
Here are the results (both are valid, but have something in stderr):
20587857
20574470
Do you have any idea what could be wrong? Is it a bug in Rosetta, or is there some problem with the server where I run it? (By the way, yes, I have permission from the server's admin to run BOINC there.)
ID: 16730 · Rating: 0
Dimitris Hatzopoulos
Joined: 5 Jan 06
Posts: 336
Credit: 80,939
RAC: 0
Message 16733 - Posted: 21 May 2006, 1:09:10 UTC - in response to Message 16730.  
Last modified: 21 May 2006, 1:16:23 UTC

LINUX problem:
I need help with this problem: while running Rosetta on Linux server with PentiumIV HyperThreading processor, Rosetta occasionally hangs in a very strange state: everything is running except Rosetta. Boinc is running. Application on other thread (Simap@home) is running. Just Rosetta isn't.


I had encountered this particular issue back in Jan/Feb-06 (also under Linux). Overall about 5-6 times.

BOINC log would show that boinc restarted Rosetta, but the Rosetta process would just stay "idle" (ps flags were "SN"=sleep,nice consuming no CPU time) for hours/days, until I manually killed it (I guess nowadays the "watchdog" thread will catch it).

At the time, I thought it was an issue with Rosetta+BOINC interaction, as I think it happened upon resuming a Rosetta WU (with leave-in-mem=yes). At the time, I also suspected some issue with the system's resources, as that PC had only 256MB RAM and I was running 6 BOINC projects and 100+ processes.

It COULD have been a faulty WU, but when I ran that WU with the Rosetta command line outside BOINC, it completed fine.

The things in common with your setup are BOINC 5.2.14 (optimised) and 2.4.x kernel (mine was Debian Sarge).

Trying to solve the problem, I reduced the # of BOINC projects to 4 (Rosetta, Ralph, Simap and LHC) and never experienced any problems for the past 3+ months (since Feb-06), crunching 24/7:

USER    PID %CPU %MEM    VSZ    RSS TTY STAT START   TIME COMMAND
boinc  2120  0.0  0.5   7452   3760 ?   S    Apr27   0:28 ./boinc_client
boinc 25448  7.8  5.2  62208  38776 ?   SN   May19 240:32 sixtrack_4.66_i68
boinc  1078 12.5 18.4 191428 136776 ?   SN   May20  59:42 rosetta_beta_5.16
boinc  1079  0.0 18.4 191428 136776 ?   SN   May20   0:00 rosetta_beta_5.16
boinc  1080  0.0 18.4 191428 136776 ?   SN   May20   0:00 rosetta_beta_5.16
boinc  1081  0.0 18.4 191428 136776 ?   SN   May20   0:00 rosetta_beta_5.16
boinc  5254 59.9  0.9  11700   7260 ?   SN   02:15  59:51 simap_5.07_i686-p
boinc  5255  0.0  0.9  11700   7260 ?   SN   02:15   0:00 simap_5.07_i686-p
boinc  5256  0.0  0.9  11700   7260 ?   SN   02:15   0:00 simap_5.07_i686-p
boinc  5828 99.5  8.8  94972  65656 ?   RN   03:16  38:55 rosetta_5.16_i686
boinc  5829  0.0  8.8  94972  65656 ?   SN   03:16   0:00 rosetta_5.16_i686
boinc  5830  0.0  8.8  94972  65656 ?   SN   03:16   0:00 rosetta_5.16_i686
boinc  5831  0.0  8.8  94972  65656 ?   SN   03:16   0:00 rosetta_5.16_i686
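
The symptom described in this post (a Rosetta process sitting in state "SN" and using no CPU) could be checked roughly with a filter over `ps aux` style text. The following Python sketch is only an illustration under stated assumptions: the function name `find_idle_commands` is hypothetical, rows are grouped by command name because a listing like the one above shows several rows per program, and the last sample row is made up to demonstrate a hit. Since `ps` reports %CPU averaged over the lifetime of a process, a process that hung after running for a while may still show a non-zero value, so a single snapshot like this cannot prove or rule out a hang.

```python
from collections import defaultdict


def find_idle_commands(ps_text, name_prefix="rosetta"):
    """Return commands (matching name_prefix) whose rows are all asleep at 0.0 %CPU.

    Expects `ps aux` style text: one header line, then rows of
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND.
    """
    rows_by_command = defaultdict(list)
    for line in ps_text.strip().splitlines()[1:]:    # skip the header row
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue                                  # ignore malformed rows
        cpu, stat, command = float(fields[2]), fields[7], fields[10]
        if command.startswith(name_prefix):
            rows_by_command[command].append((cpu, stat))

    idle = []
    for command, rows in rows_by_command.items():
        # Flag a command only if every one of its rows is sleeping and idle.
        if all(cpu == 0.0 and stat.startswith("S") for cpu, stat in rows):
            idle.append(command)
    return sorted(idle)


# The first rows are abbreviated from the listing above; the final row is a
# made-up example of a stalled process.
SAMPLE = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
boinc 2120 0.0 0.5 7452 3760 ? S Apr27 0:28 ./boinc_client
boinc 5828 99.5 8.8 94972 65656 ? RN 03:16 38:55 rosetta_5.16_i686
boinc 5829 0.0 8.8 94972 65656 ? SN 03:16 0:00 rosetta_5.16_i686
boinc 9001 0.0 8.8 94972 65656 ? SN 03:16 0:00 rosetta_hypothetical_hung
"""

if __name__ == "__main__":
    print(find_idle_commands(SAMPLE))
```

Grouping by command avoids flagging the idle helper rows of a program whose main row is still running.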

PS: I believe there was a SIGSEGV violation signal in my case also. You can search my post history for the details.
Best UFO Resources
Wikipedia R@h
How-To: Join Distributed Computing projects that benefit humanity
ID: 16733 · Rating: 0
©2025 University of Washington
https://www.bakerlab.org