Rosetta 5.40 locks up

Ivor Cogdell

Joined: 7 Nov 06
Posts: 10
Credit: 18,073
RAC: 16
Message 31323 - Posted: 17 Nov 2006, 20:43:49 UTC

Message 31292 - Posted 17 Nov 2006 10:06:49 UTC
Hi folks,
Running SETI@home enhanced 5.15, Einstein@home and Rosetta@home 5.40 with BOINC 5.4.11 on a Windows XP PC. After an overnight run, Rosetta refuses to break out of screensaver mode to normal operation. On a mouse click or keypress, the screen freezes, apart from the cursor. I have to turn off the computer to get it active again.

Any thoughts?

Ivor

Fluffy Chicken replied and asked:
ATI or Intel graphics card?

What driver version?

Pop along to the Number crunching section of the main message board; you should find a thread called Report problems with Rosetta@home v5.40.

Post in there, including the answers to the questions I just asked.

ATI Graphics
Radeon 9200 Series
Primary and secondary
Driver ati2cqag.dll V 6.14.10.0265

Status shows as ok on both.

Quick message startup for stats.
17/11/2006 19:29:11||Starting BOINC client version 5.4.11 for windows_intelx86
17/11/2006 19:29:11||libcurl/7.15.3 OpenSSL/0.9.8a zlib/1.2.3
17/11/2006 19:29:11||Data directory: C:\Program Files\BOINC
17/11/2006 19:29:12||Processor: 1 GenuineIntel x86 Family 6 Model 8 Stepping 10 996MHz
17/11/2006 19:29:12||Memory: 510.48 MB physical, 1.22 GB virtual
17/11/2006 19:29:12||Disk: 128.00 GB total, 89.02 GB free
17/11/2006 19:29:12|rosetta@home|URL: https://boinc.bakerlab.org/rosetta/; Computer ID: 349316; location: home; project prefs: home
17/11/2006 19:29:12|Einstein@Home|URL: http://einstein.phys.uwm.edu/; Computer ID: 25471; location: home; project prefs: default
17/11/2006 19:29:12|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 1638859; location: home; project prefs: default
17/11/2006 19:29:12||General prefs: from SETI@home (last modified 2006-11-17 01:34:09)
17/11/2006 19:29:12||General prefs: using separate prefs for home
17/11/2006 19:29:12||Local control only allowed
17/11/2006 19:29:12||Listening on port 31416
17/11/2006 19:29:12|SETI@home|Resuming task 10jn03aa.8062.19266.804812.3.110_3 using setiathome_enhanced version 515
17/11/2006 19:29:12|rosetta@home|Deferring task FRA_t362_HOMOENV_hom001_8_t362_6_2gf6A_IGNORE_THE_REST_15_1398_4_0

17/11/2006 20:37:55||Suspending computation - running CPU benchmarks
17/11/2006 20:37:55|SETI@home|Pausing task 10jn03aa.8062.19266.804812.3.110_3 (removed from memory)
17/11/2006 20:37:55||Suspending network activity - running CPU benchmarks
17/11/2006 20:37:57||Running CPU benchmarks
17/11/2006 20:38:56||Benchmark results:
17/11/2006 20:38:56|| Number of CPUs: 1
17/11/2006 20:38:56|| 793 floating point MIPS (Whetstone) per CPU
17/11/2006 20:38:56|| 1411 integer MIPS (Dhrystone) per CPU
17/11/2006 20:38:56||Finished CPU benchmarks
17/11/2006 20:38:57||Resuming computation
17/11/2006 20:38:57||Rescheduling CPU: Resuming computation
17/11/2006 20:38:57||Resuming network activity
17/11/2006 20:38:57|SETI@home|Restarting task 10jn03aa.8062.19266.804812.3.110_3 using setiathome_enhanced version 515

Thanks,

Ivor

Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 3,957,808
RAC: 1,401
Message 31326 - Posted: 17 Nov 2006, 21:01:10 UTC
Last modified: 17 Nov 2006, 21:02:06 UTC

Is this when you are running the Rosetta screensaver?

Edit: my bad, you said it's when popping OUT of the screensaver? So you've set things up to not run when the user is active?
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/

hugothehermit
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 31359 - Posted: 18 Nov 2006, 4:13:18 UTC

G'day Ivor

Disable the Rosetta@Home screensaver (right click the desktop->Properties->Screen Saver(tab)-> Screen saver (drop down box) and choose None), and see if that fixes up the problem.

It seems to me that Rosetta@Home is having problems with some of the ATI cards.

I'm just guessing so if you could tell me if this worked it would be much appreciated.

Thanks Hugo.


Keith Akins
Joined: 22 Oct 05
Posts: 176
Credit: 71,779
RAC: 0
Message 31361 - Posted: 18 Nov 2006, 5:33:56 UTC

Actually, you can include the Intel 82865G Extreme Graphics v2 onboard video. It ran 5.16 - 5.2X beautifully. When 5.3X - 5.4X came out, well, I had to set the screensaver to blank because one in three WUs was failing. I suspect some coding issue, maybe with the sidechain display, could be part of it. That's when it started on my box.

MM Sihombing
Joined: 22 May 06
Posts: 15
Credit: 1,424,082
RAC: 0
Message 31363 - Posted: 18 Nov 2006, 6:15:43 UTC
Last modified: 18 Nov 2006, 6:37:13 UTC

It's having problems with NVIDIA cards too.

<-- 7950 GX2

Ivor Cogdell
Joined: 7 Nov 06
Posts: 10
Credit: 18,073
RAC: 16
Message 31425 - Posted: 19 Nov 2006, 17:35:49 UTC

Hi Gang,
Some additional information. Setup is to run in the background, then activate the screensaver at the stated time. I have now set the screensaver to None, as requested, to see if this alters anything. The lockup also occurs when SETI@home is running, so it might be an underlying BOINC problem. I am waiting to download the latest Einstein@home workunit to find out if that is affected too.

Ivor Cogdell
Joined: 7 Nov 06
Posts: 10
Credit: 18,073
RAC: 16
Message 31453 - Posted: 20 Nov 2006, 10:02:44 UTC

Hi gang,
Just had an overnight seven hour run with screensaver turned off, no problem at all getting to rest of system this morning. Hope that narrows it down a bit. Only half a haystack to go through.
Einstein@home workunit loaded, so I shall put screensaver back on and see if that has any effect.


Ivor Cogdell
Joined: 7 Nov 06
Posts: 10
Credit: 18,073
RAC: 16
Message 31483 - Posted: 21 Nov 2006, 0:25:53 UTC
Last modified: 21 Nov 2006, 0:28:06 UTC

Hi gang,
Just had a lockup using Rosetta, but I managed to get the Task Manager running with the Ctrl+Alt+Delete sequence. It stated that Rosetta was not responding. I ended the task and ran BOINC Manager. Rosetta flagged a computational error, loaded another work unit and started on that.
Are there any other debug logs that I can download to help?

20/11/2006 18:14:55||Starting BOINC client version 5.4.11 for windows_intelx86
20/11/2006 18:14:55||libcurl/7.15.3 OpenSSL/0.9.8a zlib/1.2.3
20/11/2006 18:14:55||Data directory: C:\Program Files\BOINC
20/11/2006 18:14:56||Processor: 1 GenuineIntel x86 Family 6 Model 8 Stepping 10 996MHz
20/11/2006 18:14:56||Memory: 510.48 MB physical, 1.22 GB virtual
20/11/2006 18:14:56||Disk: 128.00 GB total, 88.63 GB free
20/11/2006 18:14:56|rosetta@home|URL: https://boinc.bakerlab.org/rosetta/; Computer ID: 349316; location: home; project prefs: home
20/11/2006 18:14:56|Einstein@Home|URL: http://einstein.phys.uwm.edu/; Computer ID: 25471; location: home; project prefs: default
20/11/2006 18:14:56|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 1638859; location: home; project prefs: default
20/11/2006 18:14:56||General prefs: from SETI@home (last modified 2006-11-17 01:34:09)
20/11/2006 18:14:56||General prefs: using separate prefs for home
20/11/2006 18:14:56||Local control only allowed
20/11/2006 18:14:56||Listening on port 31416
20/11/2006 18:14:56|SETI@home|Deferring task 10jn03aa.8062.19266.804812.3.110_3
20/11/2006 18:14:56|Einstein@Home|Deferring task l1_1383.0_S5R1__38_S5R1a_1
20/11/2006 18:14:57|rosetta@home|Resuming task DOC_1CSE_R061114_pose_u_global_search_1402_2152_0 using rosetta version 540
20/11/2006 18:14:58||Using earliest-deadline-first scheduling because computer is overcommitted.
20/11/2006 18:14:58||Suspending work fetch because computer is overcommitted.
20/11/2006 19:53:08||Rescheduling CPU: application exited
20/11/2006 19:53:08|rosetta@home|Computation for task DOC_1CSE_R061114_pose_u_global_search_1402_2152_0 finished
20/11/2006 19:53:09|Einstein@Home|Restarting task l1_1383.0_S5R1__38_S5R1a_1 using einstein_S5R1 version 424
20/11/2006 19:53:11|rosetta@home|Started upload of file DOC_1CSE_R061114_pose_u_global_search_1402_2152_0_0
20/11/2006 19:53:21|rosetta@home|Finished upload of file DOC_1CSE_R061114_pose_u_global_search_1402_2152_0_0
20/11/2006 19:53:21|rosetta@home|Throughput 24767 bytes/sec
20/11/2006 21:15:00||Allowing work fetch again.
20/11/2006 21:15:01|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
20/11/2006 21:15:01|rosetta@home|Reason: To fetch work
20/11/2006 21:15:01|rosetta@home|Requesting 43200 seconds of new work, and reporting 1 completed tasks
20/11/2006 21:15:06|rosetta@home|Scheduler request succeeded
20/11/2006 21:15:09|rosetta@home|Started download of file hom003_s014_.fasta.gz
20/11/2006 21:15:09|rosetta@home|Started download of file hom003_s014_.psipred_ss2.gz
20/11/2006 21:15:10|rosetta@home|Finished download of file hom003_s014_.fasta.gz
20/11/2006 21:15:10|rosetta@home|Throughput 470 bytes/sec
20/11/2006 21:15:10|rosetta@home|Finished download of file hom003_s014_.psipred_ss2.gz
20/11/2006 21:15:10|rosetta@home|Throughput 2968 bytes/sec
20/11/2006 21:15:10|rosetta@home|Started download of file boinc_hom003_aas014_03_05.200_v1_3.gz
20/11/2006 21:15:10|rosetta@home|Started download of file boinc_hom003_aas014_09_05.200_v1_3.gz
20/11/2006 21:15:15|rosetta@home|Finished download of file boinc_hom003_aas014_09_05.200_v1_3.gz
20/11/2006 21:15:15|rosetta@home|Throughput 51837 bytes/sec
20/11/2006 21:15:15|rosetta@home|Started download of file sg_target_description.txt
20/11/2006 21:15:17|rosetta@home|Finished download of file sg_target_description.txt
20/11/2006 21:15:17|rosetta@home|Throughput 330 bytes/sec
20/11/2006 21:15:18|rosetta@home|Finished download of file boinc_hom003_aas014_03_05.200_v1_3.gz
20/11/2006 21:15:18|rosetta@home|Throughput 136672 bytes/sec
20/11/2006 21:15:19||Rescheduling CPU: files downloaded
20/11/2006 21:15:19|Einstein@Home|Pausing task l1_1383.0_S5R1__38_S5R1a_1 (removed from memory)
20/11/2006 21:15:19|rosetta@home|Starting task s014__BOINC_ABRELAX_SAVE_ALL_OUT_hom003__1406_1937_0 using rosetta version 540
20/11/2006 21:15:22||Suspending work fetch because computer is overcommitted.
21/11/2006 00:11:02|rosetta@home|Unrecoverable error for result s014__BOINC_ABRELAX_SAVE_ALL_OUT_hom003__1406_1937_0 ( - exit code 1073807364 (0x40010004))
21/11/2006 00:11:02|rosetta@home|Deferring scheduler requests for 1 minutes and 0 seconds
21/11/2006 00:11:02||Rescheduling CPU: application exited
21/11/2006 00:11:02|rosetta@home|Computation for task s014__BOINC_ABRELAX_SAVE_ALL_OUT_hom003__1406_1937_0 finished
21/11/2006 00:11:02||Resuming round-robin CPU scheduling.
21/11/2006 00:11:02|Einstein@Home|Restarting task l1_1383.0_S5R1__38_S5R1a_1 using einstein_S5R1 version 424
21/11/2006 00:11:05||Allowing work fetch again.
21/11/2006 00:12:05|rosetta@home|Sending scheduler request to https://boinc.bakerlab.org/rosetta_cgi/cgi
21/11/2006 00:12:05|rosetta@home|Reason: To fetch work
21/11/2006 00:12:05|rosetta@home|Requesting 43200 seconds of new work, and reporting 1 completed tasks
21/11/2006 00:12:10|rosetta@home|Scheduler request succeeded
21/11/2006 00:12:12|rosetta@home|Started download of file hom012_s018_.fasta.gz
21/11/2006 00:12:12|rosetta@home|Started download of file hom012_s018_.psipred_ss2.gz
21/11/2006 00:12:14|rosetta@home|Finished download of file hom012_s018_.fasta.gz
21/11/2006 00:12:14|rosetta@home|Throughput 435 bytes/sec
21/11/2006 00:12:14|rosetta@home|Finished download of file hom012_s018_.psipred_ss2.gz
21/11/2006 00:12:14|rosetta@home|Throughput 2442 bytes/sec
21/11/2006 00:12:14|rosetta@home|Started download of file boinc_hom012_aas018_03_05.200_v1_3.gz
21/11/2006 00:12:14|rosetta@home|Started download of file boinc_hom012_aas018_09_05.200_v1_3.gz
21/11/2006 00:12:18|rosetta@home|Finished download of file boinc_hom012_aas018_09_05.200_v1_3.gz
21/11/2006 00:12:18|rosetta@home|Throughput 52488 bytes/sec
21/11/2006 00:12:21|rosetta@home|Finished download of file boinc_hom012_aas018_03_05.200_v1_3.gz
21/11/2006 00:12:21|rosetta@home|Throughput 108299 bytes/sec
21/11/2006 00:12:23||Rescheduling CPU: files downloaded
21/11/2006 00:12:23||Using earliest-deadline-first scheduling because computer is overcommitted.
21/11/2006 00:12:23|Einstein@Home|Pausing task l1_1383.0_S5R1__38_S5R1a_1 (removed from memory)
21/11/2006 00:12:23|rosetta@home|Starting task s018__BOINC_ABRELAX_SAVE_ALL_OUT_hom012__1407_2120_0 using rosetta version 540
21/11/2006 00:12:27||Suspending work fetch because computer is overcommitted.
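
For anyone sifting through a message log like the one above, here is a minimal sketch of pulling out the failed results automatically. It assumes every line has the form `date time|project|message`, as in the paste; the name `find_errors` and the regular expression are illustrative and not part of BOINC.

```python
import re
from datetime import datetime

# Matches the tail of an "Unrecoverable error" message, e.g.
# "... exit code 1073807364 (0x40010004))"
EXIT_CODE = re.compile(r"exit code (-?\d+) \((0x[0-9A-Fa-f]+)\)")

def find_errors(log_text):
    """Return (timestamp, project, decimal_code, hex_code) for each failed result."""
    errors = []
    for line in log_text.splitlines():
        parts = line.split("|", 2)  # timestamp | project (may be empty) | message
        if len(parts) != 3:
            continue  # blank or malformed line
        stamp, project, message = parts
        match = EXIT_CODE.search(message)
        if "Unrecoverable error" in message and match:
            # Timestamps in the paste are day/month/year, 24-hour clock
            when = datetime.strptime(stamp.strip(), "%d/%m/%Y %H:%M:%S")
            errors.append((when, project, int(match.group(1)), match.group(2)))
    return errors
```

Run over the log above, this should pick out the single failed result at 21/11/2006 00:11:02 and skip the ordinary scheduling and download lines.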


River~~
Joined: 15 Dec 05
Posts: 761
Credit: 285,578
RAC: 0
Message 31511 - Posted: 21 Nov 2006, 13:56:21 UTC - in response to Message 31483.  
Last modified: 21 Nov 2006, 14:04:32 UTC

Hi gang,
Just had a lockup using Rosetta, but I managed to get the Task Manager running with the Ctrl+Alt+Delete sequence. It stated that Rosetta was not responding. I ended the task and ran BOINC Manager. Rosetta flagged a computational error, loaded another work unit and started on that.


The following advice may seem counter intuitive, and it is!

If you don't have time to do all I am suggesting and you also don't want to leave your box idle, then what you did is a good quick way to get going again.

I am not suggesting you did anything wrong - rather I am trying to say that, if you can spare the extra time, doing the longer procedure given below might be of more help to the project.

It is better to exit boinc rather than use task manager (or on Linux top or kill) to end the rosetta process, even when it is the rosetta process causing the problem.

First, try forcing boinc to exit the normal way:

- file->exit for most people,

- but for those running boinc as a windows service, use

control panel -> admin tools -> services -> right click boinc -> stop

- linux users will probably already know how to stop boinc on their installation. One of the many ways is to open a shell window, cd to the BOINC directory, and type ./boinc_cmd --quit

If that does not work then use task manager (top, kill) to kill the boinc process, still not the rosetta one.

However you ended boinc, please wait one minute after boinc is dead, use task manager to see if rosetta is still there. If it is still running, only then use task manager to kill off the rosetta process.
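
The order of operations above can be sketched in code. This is only an illustration under stated assumptions: the three callables (`quit_boinc`, `rosetta_running`, `kill_rosetta`) are hypothetical hooks you would supply for your own platform (for example, one that runs `./boinc_cmd --quit` on Linux), and the 60-second default is the one-minute wait suggested above.

```python
import time

def stop_boinc_then_rosetta(quit_boinc, rosetta_running, kill_rosetta,
                            wait_seconds=60, sleep=time.sleep):
    """Stop BOINC first; kill Rosetta only if it outlives BOINC.

    All three callables are hypothetical, platform-specific hooks:
      quit_boinc()      -- ask BOINC to exit (menu, service stop, boinc_cmd --quit)
      rosetta_running() -- return True if a Rosetta process still exists
      kill_rosetta()    -- last resort: end the Rosetta process
    Returns a short string describing which path was taken.
    """
    quit_boinc()
    sleep(wait_seconds)           # give Rosetta time to notice BOINC is gone
    if not rosetta_running():
        return "rosetta exited on its own"
    kill_rosetta()                # only now fall back to killing Rosetta directly
    return "rosetta had to be killed"
```

Passing the hooks in as arguments keeps the sketch independent of how processes are listed or killed on any particular system.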

The reason for this advice is that when you do things the more obvious way, as you did, the error report from the Rosetta app only records that Rosetta was killed by intervention from the operating system.

If you kill boinc, rosetta should die anyway after 30sec, which is why I suggest waiting 1min. The same work will restart from the previous checkpoint when boinc is restarted. Sometimes it will then run OK, sometimes it will die again - either way this is useful info to be reported to the project team when the work is finally reported.

If rosetta dies a second time, I would again suggest you try not to use task manager to abort it, but restart boinc and use the abort facility built into boinc. Again this gives the project team a better idea of what went wrong.

Sometimes in the past the team have asked us to preserve files from such situations - usually now they already have enough debug info sent back in the stderr output. If you are asked to save files you will be told which ones, and then a good point in the above sequence to copy them is while boinc is not running.


R~~

edit: moved things around, added how to stop boinc on linux for benefit of other readers

Ivor Cogdell
Joined: 7 Nov 06
Posts: 10
Credit: 18,073
RAC: 16
Message 31537 - Posted: 21 Nov 2006, 22:12:19 UTC

Hi River and the gang,
It is my usual policy to use the BOINC Manager Exit before closing down Windows, if I can gain access to it. I then try Task Manager next. I will aim for BOINC before Rosetta.

hugothehermit
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 31547 - Posted: 22 Nov 2006, 2:36:01 UTC

21/11/2006 00:11:02|rosetta@home|Unrecoverable error for result s014__BOINC_ABRELAX_SAVE_ALL_OUT_hom003__1406_1937_0 ( - exit code 1073807364 (0x40010004))

Though note: it's a minus and yours isn't, but with the numbers corresponding I would hazard a guess that it's the same thing.

Unofficial BOINC Wiki
Exit Code -1073807364 (0x40010004)

This is usually due to a graphics crash. This can happen because something else is using the graphics, or alternatively you're displaying the graphics and kill the graphics window via Task manager or answering 'Yes' to Microsoft's 'this program is not responding, kill it?' question.


If you're not using the screensaver, I don't know, maybe a bug in BOINC version 5.4.11.
I seem to remember that someone said one of the new BOINC versions used a different file name for one of its EXEs and was causing two BOINCs to load up, but that should be taken with a grain of salt. Though if you're clutching at straws you could have a look with the task manager.

I use BOINC 5.4.9 with the screensaver set to none and it works, but like I said I don't know :?

River~~
Joined: 15 Dec 05
Posts: 761
Credit: 285,578
RAC: 0
Message 31553 - Posted: 22 Nov 2006, 8:34:50 UTC - in response to Message 31547.  

Though note: it's a minus and yours isn't, but with the numbers corresponding I would hazard a guess that it's the same thing.

Exit Code -1073807364 (0x40010004)


Good spot Hugo!

This is a documentation bug - the decimal value of this error code is positive and the wiki is wrong to show it as negative. So you are quite right to guess that this is the right interpretation of the code.

Fyi, hex numbers that begin with digits 0-7 are positive when converted to signed decimal values, those that begin 8-F are negative.

(NB - you must make sure you have all the hex digits before applying this rule: 0xFFFF would be negative (-1) if it were a 16 bit value, but positive if it were shorthand for 0x0000FFFF)

Maybe someone with current wiki access could take out the - sign.

Under the same rule, the minus sign is correct on the error codes above, which start 0xC. No doubt that is how the mistake crept in.
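
The sign rule is easy to check in code. This is a minimal sketch, assuming the exit code is a 32-bit two's-complement value; the helper name `to_signed` is illustrative, and 0xC0000005 is just an example of a code beginning with 0xC, not one taken from this thread.

```python
def to_signed(value, bits=32):
    """Interpret an unsigned integer as a two's-complement value of the given width."""
    value &= (1 << bits) - 1         # keep only the low `bits` bits
    if value & (1 << (bits - 1)):    # top bit set, so the value is negative
        return value - (1 << bits)
    return value

print(to_signed(0x40010004))       # 1073807364  (leading hex digit 4: positive)
print(to_signed(0xC0000005))       # -1073741819 (leading hex digit C: negative)
print(to_signed(0xFFFF, bits=16))  # -1
print(to_signed(0xFFFF, bits=32))  # 65535
```

The last two lines show why the width matters: the same digits give -1 as a 16 bit value and 65535 as a 32 bit one.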

R~~

Ivor Cogdell
Joined: 7 Nov 06
Posts: 10
Credit: 18,073
RAC: 16
Message 31592 - Posted: 22 Nov 2006, 23:46:52 UTC

Hi gang,
This may fit into the dumb question category, but is there any way to check the status of the screensaver on a regular basis, to see when the problem occurs? In other words, does it lock up at the keypress or mouse input stage or just after a set time or process?
Just to clarify Hugo, the screensaver was active (running Rosetta or Seti) when the lockup occurs. I mentioned both because it might be a factor. The Screensaver is turned off at the moment, to preserve workunits, but I can put it back on for any tests anyone might like to suggest.

ravens
Joined: 25 Mar 06
Posts: 4
Credit: 85,122
RAC: 0
Message 31633 - Posted: 24 Nov 2006, 16:12:58 UTC - in response to Message 31592.  

Hi gang,
This may fit into the dumb question category, but is there any way to check the status of the screensaver on a regular basis, to see when the problem occurs? In other words, does it lock up at the keypress or mouse input stage or just after a set time or process?
Just to clarify Hugo, the screensaver was active (running Rosetta or Seti) when the lockup occurs. I mentioned both because it might be a factor. The Screensaver is turned off at the moment, to preserve workunits, but I can put it back on for any tests anyone might like to suggest.



I had the same lockup problem if I had the Rosetta graphics on, while running BOINC 5.4.x, and I was also having issues getting CPU throttling to work. The throttling option was shown in 5.4.x, but did not actually do anything. I tried some later beta versions; throttling worked in those (runs/suspends every second or so), but the tasks (both for Einstein and Rosetta) stopped running after a few minutes. Win XP Task Manager said it was still doing the running/suspended sequence, but the task was not running and showed no progress or CPU usage. In addition, I got the screen lockups with Rosetta graphics.
I'm trying beta 5.7.5 now. Throttling works fine on my laptop, Rosetta and Einstein tasks can both run right through to completion, and if Rosetta graphics are on too, they do NOT cause a lockup.
It is beta code, but I assume this fix will make it to the next official non-beta release.
Mike



©2020 University of Washington
https://www.bakerlab.org