Problems with Rosetta version 5.93

Luuklag
Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 50598 - Posted: 12 Jan 2008, 0:03:41 UTC

Argh, I'm getting angry.

Of my last 9 WUs, 7 errored out. 7!!!!
That's 77.77%.
Today 2 more WUs crashed, but I don't feel like posting links anymore. It's always the same thing: a sin or cos value that's out of range. When are you guys going to fix this, or at least give me a reply?

Michael Matthews
Joined: 12 Dec 05
Posts: 3
Credit: 37,852
RAC: 0
Message 50599 - Posted: 12 Jan 2008, 0:15:45 UTC - in response to Message 50594.  


The computer does not have any dust build up or fan problems. The shutting down problem only occurs with the Rosetta Beta 5.93 application and no other software (even ones with high CPU usage). The computer did not shutdown until the Rosetta Beta 5.93 was sent to my computer to run. As I stated before, the SETI@home application (version Enhanced 5.27) never causes this problem (it runs 80% of the time BOINC runs). The computer crashes only with Rosetta Beta 5.93.

-Michael


do you have graphics/screensaver enabled?


I only have the minimal graphics enabled for BOINC. All that is displayed is a graphic of the BOINC logo, the application that is running (Rosetta@home or SETI@home), the work unit name, and the percentage of the work unit completed so far. None of the 3D graphics is being used.

Rosetta@home Beta 5.93 crashed again this afternoon. I'm getting rid of it.


-Michael

Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 50611 - Posted: 12 Jan 2008, 12:47:39 UTC

It's been three days without any new watchdog errors.

Here's my scoreboard for 5.93.



Personally, I wonder what's different between my hosts and those of users like Luuklag, who also has an AMD64 host but IS getting computation errors. I haven't seen a single computation error yet, so something must be different.

Luuklag
Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 50618 - Posted: 12 Jan 2008, 18:15:06 UTC - in response to Message 50611.  

Well, I guess it's the type of WU I ran: they were all one type, and only 1 out of 7 or so finished successfully. So I guess the problem is in the type of WU.



Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 50621 - Posted: 12 Jan 2008, 20:38:17 UTC - in response to Message 50618.  
Last modified: 12 Jan 2008, 20:58:03 UTC



I took the liberty of running your host with my "Rosetta-Pal". Then I copied and color coded all the work from yours combined with all the work from my "windows" hosts. Then I sorted by WU name and weeded out work not of the same "Job type", so we'd be comparing apples with apples. You had windows xp, I had winxp. You had AMD64, I had AMD64. Etc, Etc.

Anyway, I found 4 instances where we did the same "job type", and you can see them below. I see that for the first job type you had many computation errors, but your host also did one of them successfully.

Your hosts are "Blue" when you had an error, and "Green" when you successfully completed one. Mine are various colors, so I added descriptions to the first column. My hosts can be identified from the previous chart, with the exception of my wife's laptop "M3700", which is a "Mobile AMD64 3700" using win xp (can't put linux on that one....lol).

So, from what I see, it's probably NOT the job type/wus, or at least my hosts aren't having trouble with them.

I wonder what else it could be??



[edit] On the second set of WUs I noticed a very early return date on your WU, so I rechecked: that computation error was with 5.90, whereas my hosts were using 5.93. Also, that one was not a computation error, but Invalid.

Also, look at the 'good' WU you returned (green text). It has the very next consecutive "task ID" and "Work unit ID" numbers after the previous one, which failed, so your own host managed to do a type that it had previously failed to do.[/edit]

Path7
Joined: 25 Aug 07
Posts: 128
Credit: 61,751
RAC: 0
Message 50622 - Posted: 12 Jan 2008, 21:43:10 UTC - in response to Message 50598.  
Last modified: 12 Jan 2008, 21:44:39 UTC



Hi Luuklag,

I looked into your tasks and opened the task details of WU 132125634
The Windows Runtime Debugger also shows:
ModLoad: 07280000 0000f000 C:\WINDOWS\system32\ATKOGL32.dll (6.14.10.138) (-exported- Symbols Loaded)
File Version : 6, 14, 10, 138
Company Name : ASUSTeK COMPUTER INC.
Product Name : ASUSTeK Computer Inc. AsusOGL
Product Version: 6, 14, 10, 138

ModLoad: 69500000 00574000 C:\WINDOWS\system32\nvoglnt.dll (6.14.10.9147) (-exported- Symbols Loaded)
File Version : 6.14.10.9147
Company Name : NVIDIA Corporation
Product Name : NVIDIA Compatible OpenGL ICD
Product Version: 6.14.10.9147
Those two, ATKOGL32.dll and nvoglnt.dll, both look like graphics card drivers. However, one of them might just as well come from some add-on software.
Perhaps you changed your graphics card and left an old driver installed?

I hope this information is useful to you.
Path7.

dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,860,059
RAC: 7,494
Message 50623 - Posted: 12 Jan 2008, 21:46:34 UTC

I also had a quick look, and this:

[01/08/08 21:42:17] TRACE [3172]: Retrieved the required window station

[01/08/08 21:42:17] TRACE [3172]: Retrieved the required desktop

[01/08/08 21:47:11] TRACE [3172]: Retrieved the required window station

[01/08/08 21:47:11] TRACE [3172]: Retrieved the required desktop

I would presume is a graphics issue, which would support Path7's detective work ;)

Luuklag
Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 50634 - Posted: 13 Jan 2008, 13:29:54 UTC - in response to Message 50622.  



Yes, I got a new card about 2 months ago from the same manufacturer, because my old card was recalled over cooling issues; it made an enormous noise because the fan bearings broke down. I just installed the new drivers, which IMHO were only an update of the old drivers, so I don't think there is a problem with that, because I can do everything, like play UT3 on high.

Barraud Denis
Joined: 8 May 06
Posts: 6
Credit: 1,258,677
RAC: 0
Message 50643 - Posted: 13 Jan 2008, 15:13:57 UTC

Rosetta failed and completely blocked BOINC on my Q6600, so I have stopped this project to protect my other WUs running on BOINC. The BOINC manager stays in memory but is not running, and no WU can work. Even with BOINC and all projects completely reinstalled after a reboot, Rosetta fails again and blocks BOINC.

The only way I found to recover BOINC was to kill the BOINC manager, restart it, and quickly remove the Rosetta project before it loaded a new WU.

I think Rosetta must be upgraded to detach more cleanly from BOINC when it fails with an error, to prevent BOINC from freezing.

Here is the information I have from the Event Viewer.

Type de l'événement : Erreur
Source de l'événement : Application Error
Catégorie de l'événement : Aucun
ID de l'événement : 1000
Date : 13/01/2008
Heure : 15:05:54
Utilisateur : N/A
Ordinateur : C2Q1
Description :
Application défaillante minirosetta_1.03_windows_intelx86.exe, version 0.0.0.0, module défaillant minirosetta_1.03_windows_intelx86.exe, version 0.0.0.0, adresse de défaillance 0x0027e8c2.

Pour plus d'informations, consultez le centre Aide et support à l'adresse http://go.microsoft.com/fwlink/events.asp.
Données :
0000: 41 70 70 6c 69 63 61 74 Applicat
0008: 69 6f 6e 20 46 61 69 6c ion Fail
0010: 75 72 65 20 20 6d 69 6e ure min
0018: 69 72 6f 73 65 74 74 61 irosetta
0020: 5f 31 2e 30 33 5f 77 69 _1.03_wi
0028: 6e 64 6f 77 73 5f 69 6e ndows_in
0030: 74 65 6c 78 38 36 2e 65 telx86.e
0038: 78 65 20 30 2e 30 2e 30 xe 0.0.0
0040: 2e 30 20 69 6e 20 6d 69 .0 in mi
0048: 6e 69 72 6f 73 65 74 74 nirosett
0050: 61 5f 31 2e 30 33 5f 77 a_1.03_w
0058: 69 6e 64 6f 77 73 5f 69 indows_i
0060: 6e 74 65 6c 78 38 36 2e ntelx86.
0068: 65 78 65 20 30 2e 30 2e exe 0.0.
0070: 30 2e 30 20 61 74 20 6f 0.0 at o
0078: 66 66 73 65 74 20 30 30 ffset 00
0080: 32 37 65 38 63 32 0d 0a 27e8c2..
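
The data block above is just ASCII text shown as hex bytes. As a minimal sketch (the helper name `decode_row` is made up for illustration, and only three rows of the dump are copied in), this is one way to turn a row back into readable text:

```python
# Hex bytes copied from the first two rows and the last row of the dump above.
rows = [
    "41 70 70 6c 69 63 61 74",
    "69 6f 6e 20 46 61 69 6c",
    "32 37 65 38 63 32 0d 0a",
]

def decode_row(hex_row):
    """Convert one row of space-separated hex bytes to ASCII text."""
    return bytes.fromhex(hex_row).decode("ascii")

decoded = [decode_row(row) for row in rows]
for text in decoded:
    # repr() makes the trailing carriage return and line feed visible.
    print(repr(text))
```

The last row ends in 0d 0a, which is a carriage return plus line feed, so the dump is simply the one-line failure message with a line ending.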

Luuklag
Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 50651 - Posted: 13 Jan 2008, 16:55:17 UTC - in response to Message 50643.  

Can anyone please translate it into English...



Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 50661 - Posted: 13 Jan 2008, 18:26:21 UTC - in response to Message 50651.  
Last modified: 13 Jan 2008, 18:27:04 UTC

See the English text in ( ) and after the dashes.

anyone please translate it into english...


Rosetta failed and completely blocked BOINC on my Q6600, so I have stopped this project to protect my other WUs running on BOINC. The BOINC manager stays in memory but is not running, and no WU can work. Even with BOINC and all projects completely reinstalled after a reboot, Rosetta fails again and blocks BOINC.

The only way I found to recover BOINC was to kill the BOINC manager, restart it, and quickly remove the Rosetta project before it loaded a new WU.

I think Rosetta must be upgraded to detach more cleanly from BOINC when it fails with an error, to prevent BOINC from freezing.

Here is the information I have from the Event Viewer.

Type de l'événement : Erreur - type of event: error
Source de l'événement : Application Error - source of event
Catégorie de l'événement : Aucun - category of event: none
ID de l'événement : 1000 - ID of the event
Date : 13/01/2008
Heure : 15:05:54
Utilisateur : N/A - user is N/A
Ordinateur : C2Q1 - computer (id or name?)
Description :
Application défaillante (failing application) minirosetta_1.03_windows_intelx86.exe, version 0.0.0.0, module défaillant (failing module) minirosetta_1.03_windows_intelx86.exe, version 0.0.0.0, adresse de défaillance (address of failure) 0x0027e8c2.

Pour plus d'informations, consultez le centre Aide et support à l'adresse http://go.microsoft.com/fwlink/events.asp. (the usual sentence: for more information, visit the Help and Support Center at .....)
Données (data):
0000: 41 70 70 6c 69 63 61 74 Applicat
0008: 69 6f 6e 20 46 61 69 6c ion Fail
0010: 75 72 65 20 20 6d 69 6e ure min
0018: 69 72 6f 73 65 74 74 61 irosetta
0020: 5f 31 2e 30 33 5f 77 69 _1.03_wi
0028: 6e 64 6f 77 73 5f 69 6e ndows_in
0030: 74 65 6c 78 38 36 2e 65 telx86.e
0038: 78 65 20 30 2e 30 2e 30 xe 0.0.0
0040: 2e 30 20 69 6e 20 6d 69 .0 in mi
0048: 6e 69 72 6f 73 65 74 74 nirosett
0050: 61 5f 31 2e 30 33 5f 77 a_1.03_w
0058: 69 6e 64 6f 77 73 5f 69 indows_i
0060: 6e 74 65 6c 78 38 36 2e ntelx86.
0068: 65 78 65 20 30 2e 30 2e exe 0.0.
0070: 30 2e 30 20 61 74 20 6f 0.0 at o
0078: 66 66 73 65 74 20 30 30 ffset 00
0080: 32 37 65 38 63 32 0d 0a 27e8c2..

Application Failure minirosetta_1.03_windows_intelx86.exe 0.0.0.0 in minirosetta_1.03_windows_intelx86.exe 0.0.0.0 at offset 0027e8c2

(I used http://babelfish.altavista.com/tr to translate the text.)
I am not a French expert.


Ingemar
Joined: 28 Feb 06
Posts: 20
Credit: 1,680
RAC: 0
Message 50671 - Posted: 14 Jan 2008, 0:32:17 UTC - in response to Message 50598.  



Hi Luuklag,

The overall error rates of the WUs that are crashing for you (around 2-5%) are much lower than what you observe. You may just be unlucky. On the other hand, the failures are all caused by the same problem (the cosine error), and not only for one type of WU, so we need to fix that. We are looking into this problem to find the bug.
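
As a rough way to put numbers on "unlucky", here is a small sketch that computes the chance of seeing at least 7 failures in 9 WUs if each WU failed independently at the overall rate. The independence assumption and the function name `prob_at_least` are mine, not something the project has stated, so treat the result as a back-of-the-envelope figure only.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), i.e. k or more failures in n tries."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# The two ends of the 2-5% overall error rate quoted above.
for rate in (0.02, 0.05):
    chance = prob_at_least(7, 9, rate)
    print(f"error rate {rate:.0%}: P(at least 7 of 9 fail) = {chance:.2e}")
```

If the failures on one host are not independent (for example, if something about that host triggers the bug), this simple model does not apply, which is one reason to compare hosts as done earlier in the thread.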

P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 50678 - Posted: 14 Jan 2008, 7:09:04 UTC

I just returned this task. It is marked as valid, but has this in the result file.

FYI

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=121084932

5croA_BOINC_ABRELAX_VF_IGNORE_THE_REST-S25-18-S3-11--5croA-vf__2597_848_0

sin_cos_range ERROR: 1.2851869 is outside of [-1,+1] sin and cos value legal range
sin_cos_range ERROR: 1.2833332 is outside of [-1,+1] sin and cos value legal range

pete.
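
For anyone wondering what a message like this means: a sine or cosine can never be larger than 1 in magnitude, so a value such as 1.2851869 cannot be passed to an inverse function like acos. The sketch below only illustrates that kind of range check. It is not Rosetta's code, and the names `checked_acos` and `TOLERANCE` are invented for the example.

```python
import math

# Assumed slack for floating-point rounding; not a value taken from Rosetta.
TOLERANCE = 1e-6

def checked_acos(x):
    """acos with an explicit legal-range check on its argument."""
    if abs(x) > 1.0 + TOLERANCE:
        # Far outside [-1, +1]: rounding cannot explain it, so report it.
        raise ValueError(
            f"{x} is outside of [-1,+1] sin and cos value legal range"
        )
    # Clamp tiny overshoots (such as 1.0000000001) before calling acos.
    return math.acos(max(-1.0, min(1.0, x)))

print(checked_acos(0.5))
try:
    checked_acos(1.2851869)
except ValueError as err:
    print("ERROR:", err)
```

A value 28% over the limit, as in the log above, is well beyond rounding noise, which fits the developers treating it as a real bug.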

Yeti
Joined: 2 Nov 05
Posts: 45
Credit: 14,945,062
RAC: 0
Message 50679 - Posted: 14 Jan 2008, 11:24:13 UTC

Here is a 5.93 WU that errored out with exit status -1073741819 (0xc0000005):

https://boinc.bakerlab.org/rosetta/result.php?resultid=133082830

The box is a dual quad-core Xeon, running Windows Server 2003 64-bit with 8 GB of memory.
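
The two numbers in that exit status are the same 32-bit value, once read as a signed integer and once as unsigned hex. A minimal sketch of the conversion (the function name `to_unsigned32` is just for illustration):

```python
def to_unsigned32(status):
    """Reinterpret a signed 32-bit exit status as an unsigned 32-bit value."""
    return status & 0xFFFFFFFF

code = to_unsigned32(-1073741819)
print(hex(code))
```

On Windows, 0xC0000005 is the status code for an access violation, meaning the application touched memory it was not allowed to.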




Supporting BOINC, a great concept !

Yeti
Joined: 2 Nov 05
Posts: 45
Credit: 14,945,062
RAC: 0
Message 50680 - Posted: 14 Jan 2008, 11:28:03 UTC

And one word from me:

Please discuss things like Rosetta versus Ralph in a different thread. I restarted crunching Rosetta with 5.93 and was looking to see whether there is anything relevant here about errors with 5.93, but I had to read through all of your discussion.

Yes, the content of that discussion is okay, but for me this thread is definitely the wrong place for it.


Supporting BOINC, a great concept !

Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 50681 - Posted: 14 Jan 2008, 13:16:53 UTC
Last modified: 14 Jan 2008, 13:19:21 UTC

I finally got a computation error, and strangely enough, I woke to find one WU stuck at 100% while gkrellm showed 0% CPU use for that core. I have suspended and resumed that WU and am now waiting for it to run again. The "stuck one" is 1zpy__BOINC_DEFAULT_SYMM_FOLD_AND_DOCK-1zpy_native_2_2519_22709_0. The one which has already been reported as a computation error is resultid=133308819 1zpy__BOINC_TWIST_RINGS_TWIST_ANGLE_SYMM_FOLD_AND_DOCK_RELAX-1zpy_-native__2477_294683_0 and shows:

<core_client_version>5.10.21</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
# random seed: 3191248
**********************************************************************
Rosetta score is stuck or going too long. Watchdog is ending the run!
Stuck at score -66.1132 for 900 seconds
**********************************************************************
GZIP SILENT FILE: ./xx1zpy.out
SIGSEGV: segmentation violation
Stack trace (22 frames):
[0x8da3037]
[0x8d9de2c]
[0xffffe500]
[0x89a1824]
[0x804c828]
[0x8a8ae99]
[0x8a8babf]
[0x8d0c170]
[0x8c12abe]
[0x8c14e33]
[0x804c7c2]
[0x8a835ed]
[0x8a8586f]
[0x89363de]
[0x89380e3]
[0x893ba27]
[0x898ad7a]
[0x85e96d6]
[0x87289d2]
[0x8728af2]
[0x8e07384]
[0x8048111]

Exiting...

So, it looks like I'm going to have two computation errors for my AMD64 X2 5200 under Linux.

Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 50684 - Posted: 14 Jan 2008, 15:21:00 UTC
Last modified: 14 Jan 2008, 15:28:20 UTC

Too late to edit.

The second one, which was stuck, remained stuck after the work scheduler got back around to it. I ended up exiting the manager, opening Konsole, and killing BOINC. I then restarted and opened the manager. The result showed "ready to report", so it must have uploaded before the manager displayed it.

Anyway, it was considered "Valid" and was granted credit as if this had never even happened. It's resultid=133326615
which shows:

<core_client_version>5.10.21</core_client_version>
<![CDATA[
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
# random seed: 3623102
======================================================
DONE :: 1 starting structures 9911.7 cpu seconds
This process generated 6 decoys from 6 attempts
======================================================


BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...

</stderr_txt>
]]>

That seems completely uneventful to me, but I know it was stuck, leaving my host using only one core for who knows how long.

Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 50687 - Posted: 14 Jan 2008, 20:17:38 UTC

Oops, I linked to the wrong work unit for the stuck one. It was really resultid=133258619, which showed this:

<core_client_version>5.10.21</core_client_version>
<![CDATA[
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
# random seed: 3630287
**********************************************************************
Rosetta score is stuck or going too long. Watchdog is ending the run!
Stuck at score -84.1725 for 900 seconds
**********************************************************************
GZIP SILENT FILE: ./xx1zpy.out
SIGSEGV: segmentation violation
Stack trace (21 frames):
[0x8da3037]
[0x8d9de2c]
[0xffffe500]
[0x8e2a1b9]
[0x8df8727]
[0x8dfaba1]
[0x8cb4a2c]
[0x8c1179b]
[0x8c14e33]
[0x804c7c2]
[0x8a835ed]
[0x8a8586f]
[0x89363de]
[0x893822e]
[0x893ba27]
[0x898ad7a]
[0x85e96d6]
[0x87289d2]
[0x8728af2]
[0x8e07384]
[0x8048111]

Exiting...
No heartbeat from core client for 31 sec - exiting
FILE_LOCK::unlock(): close failed.: Bad file descriptor
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
**********************************************************************
Rosetta score is stuck or going too long. Watchdog is ending the run!
Stuck at score -82.6613 for 900 seconds
**********************************************************************
GZIP SILENT FILE: ./xx1zpy.out
SIGSEGV: segmentation violation
Stack trace (22 frames):
[0x8da3037]
[0x8d9de2c]
[0xffffe500]
[0x89a1824]
[0x804c828]
[0x8a8ae99]
[0x8a8babf]
[0x8d0c170]
[0x8c12abe]
[0x8c14e33]
[0x804c7c2]
[0x8a835ed]
[0x8a8586f]
[0x89363de]
[0x893822e]
[0x893ba27]
[0x898ad7a]
[0x85e96d6]
[0x87289d2]
[0x8728af2]
[0x8e07384]
[0x8048111]

Exiting...
SIGSEGV: segmentation violation
SIGABRT: abort called
[insert] about 200 more of the "abort called", but I snipped it for brevity
SIGABRT: abort called

</stderr_txt>
]]>
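
When comparing traces like the two above by eye, it can help to line the frames up mechanically. The sketch below is only an illustration (the function names are made up, and only the first five frames of each trace are copied in): it pulls the addresses out of the `[0x...]` lines and reports how many frames at the top of the two traces are identical.

```python
import re

# Matches one frame such as "[0x8da3037]" and captures the hex address.
FRAME = re.compile(r"\[0x([0-9a-f]+)\]")

def parse_frames(text):
    """Return the frame addresses of one trace, in printed order."""
    return FRAME.findall(text)

def shared_leading_frames(a, b):
    """Return the frames two traces have in common, counted from the top."""
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    return shared

# First five frames of the 21-frame and the 22-frame trace quoted above.
trace_21 = parse_frames("[0x8da3037] [0x8d9de2c] [0xffffe500] [0x8e2a1b9] [0x8df8727]")
trace_22 = parse_frames("[0x8da3037] [0x8d9de2c] [0xffffe500] [0x89a1824] [0x804c828]")

print(shared_leading_frames(trace_21, trace_22))
```

The top three frames match and the traces diverge after that. Without symbols there is no telling from the addresses alone what those shared frames are, so this only narrows down where to look.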

AdeB
Joined: 12 Dec 06
Posts: 45
Credit: 4,428,086
RAC: 0
Message 50689 - Posted: 14 Jan 2008, 20:56:52 UTC

resultid 133097235 had some problems, but is valid after all - strange.

<core_client_version>5.8.15</core_client_version>
<![CDATA[
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
# random seed: 3031158
SIGSEGV: segmentation violation
Stack trace (12 frames):
[0x8da3037]
[0x8d9de2c]
[0xffffe420]
[0x8e28653]
[0x8df90a1]
[0x8dfaac9]
[0x83e8c0f]
[0x8e0e98f]
[0x8d9fab7]
[0x8da10d5]
[0x8d9a0c5]
[0x8e3aa1a]

Exiting...
SIGSEGV: segmentation violation
Stack trace (17 frames):
[0x8da3037]
[0x8d9de2c]
[0xffffe420]
[0x881d8ba]
[0x881f90a]
[0x88263b5]
[0x8827d6d]
[0x84fcf7a]
[0x84fd442]
[0x8b3e9c0]
[0x8b4134b]
[0x80d8efd]
[0x85eaa7e]
[0x8728a47]
[0x8728af2]
[0x8e07384]
[0x8048111]

Exiting...
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
SIGSEGV: segmentation violation
Stack trace (19 frames):
[0x8da3037]
[0x8d9de2c]
[0xffffe420]
[0x850ea02]
[0x8c12f90]
[0x876ba6c]
[0x876c3fe]
[0x87703bb]
[0x878176f]
[0x8787179]
[0x8cf4461]
[0x8b3e9dc]
[0x8b4134b]
[0x80d8efd]
[0x85eaa7e]
[0x8728a47]
[0x8728af2]
[0x8e07384]
[0x8048111]

Exiting...
Graphics are disabled due to configuration...
# cpu_run_time_pref: 10800
======================================================
DONE :: 1 starting structures 10809.5 cpu seconds
This process generated 8 decoys from 8 attempts
======================================================


BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...

</stderr_txt>
]]>

hedera
Joined: 15 Jul 06
Posts: 76
Credit: 5,263,150
RAC: 144
Message 50691 - Posted: 14 Jan 2008, 23:48:23 UTC

5.93 is eating my Windows machine alive. I tried to do something this afternoon and the box was so hung it was barely responding. Here's my system, from the opening log:

01/14/2008 8:11:54 AM||Starting BOINC client version 5.10.20 for windows_intelx86
01/14/2008 8:11:54 AM||log flags: task, file_xfer, sched_ops
01/14/2008 8:11:54 AM||Libraries: libcurl/7.16.4 OpenSSL/0.9.8e zlib/1.2.3
01/14/2008 8:11:54 AM||Data directory: C:\Program Files\BOINC
01/14/2008 8:11:56 AM||Processor: 2 GenuineIntel Intel(R) Pentium(R) 4 CPU 3.20GHz [x86 Family 15 Model 4 Stepping 1]
01/14/2008 8:11:56 AM||Processor features: fpu tsc pae nx sse sse2 mmx
01/14/2008 8:11:57 AM||OS: Microsoft Windows XP: Professional Edition, Service Pack 2, (05.01.2600.00)
01/14/2008 8:11:57 AM||Memory: 1022.09 MB physical, 2.40 GB virtual
01/14/2008 8:11:57 AM||Disk: 145.27 GB total, 106.66 GB free
01/14/2008 8:11:57 AM||Local time is UTC -8 hours

In mid-afternoon (around 3:30 PM local), first of all I had three WUs running at once; and when I looked at the task manager I saw that they were using a whole lot of memory:

319,896K
258,352K
34,636K

That's 612,884K, just for Rosetta! Add to this the fact that ZoneAlarm Internet Security (which I recently installed to replace Norton) was running some kind of update, and I could barely get the mouse to respond. I suspended Rosetta temporarily so I could post this and let ZA finish whatever it was doing. (I'll be discussing this with them.)
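
To put that total next to the physical memory reported in the startup log, here is a quick sketch of the arithmetic. It assumes Task Manager's "K" means units of 1024 bytes and that BOINC's "MB" means 1024 K, which are assumptions about the units rather than something stated in the log.

```python
# Memory use of the three Rosetta tasks, in K, as listed above.
task_memory_k = [319_896, 258_352, 34_636]

# Physical memory from the BOINC startup log, in MB.
physical_mb = 1022.09

total_k = sum(task_memory_k)
total_mb = total_k / 1024          # assumes 1 MB = 1024 K
share = total_mb / physical_mb     # fraction of physical memory in use

print(f"total: {total_k:,} K = {total_mb:.1f} MB")
print(f"share of physical memory: {share:.1%}")
```

Under those assumptions, Rosetta alone accounts for more than half of the machine's physical memory, before Windows and ZoneAlarm are counted.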

I've been running on the assumption that my computing preferences, which are pretty standard, would give me 2 WUs using, between them, 98-100% of CPU but NOT this much memory! Is there some tweak I should do to my settings? Should I expect to be running 3 or even 4 WUs at a time?


--hedera

Never be afraid to try something new. Remember that amateurs built the ark. Professionals built the Titanic.
