Problems with Rosetta version 5.98

Message boards : Number crunching : Problems with Rosetta version 5.98

AMD_is_logical

Joined: 20 Dec 05
Posts: 299
Credit: 31,460,681
RAC: 0
Message 56161 - Posted: 2 Oct 2008, 3:33:48 UTC - in response to Message 56160.  

I just watched this HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_39952_0 one die on me. 1:48 computation time and then it even pops up a Windows error box.

I had a HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539 die on one of my Linux nodes:

https://boinc.bakerlab.org/rosetta/result.php?resultid=195898771

ID: 56161 · Rating: 0
Qui-Gon Jinn

Joined: 10 Aug 08
Posts: 3
Credit: 4,683
RAC: 0
Message 56180 - Posted: 3 Oct 2008, 0:43:50 UTC

Weird, I had the same problem with a similar task. This is what BOINC said:
10/2/2008 7:15:37 PM|rosetta@home|Starting HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_162905_0

10/2/2008 7:15:40 PM|rosetta@home|Starting task HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_162905_0 using rosetta_beta version 598

10/2/2008 7:15:41 PM|rosetta@home|Started upload of abinitio_nohomfrag_70_A_2he4A_4482_20855_0_0

10/2/2008 7:15:46 PM|rosetta@home|Finished upload of abinitio_nohomfrag_70_A_2he4A_4482_20855_0_0

10/2/2008 7:32:34 PM|rosetta@home|Computation for task HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_162905_0 finished

10/2/2008 7:32:34 PM|rosetta@home|Output file HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_162905_0_0 for task HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_162905_0 absent

Note that the WU ran for only about 1000 seconds. I run Vista, and it said that the process was malfunctioning (and then stopped it).
ID: 56180 · Rating: 0
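The "only about 1000 seconds" figure in the post above can be checked from the log timestamps. Below is a minimal Python sketch, assuming the message-log format shown in that post (US-style date with AM/PM, fields separated by "|"). Other BOINC installations print timestamps differently, as the ISO-style logs later in this thread show. The helper names `parse_line` and `wall_clock_seconds` are made up for illustration, and the result is wall-clock time between log entries, not CPU time.

```python
from datetime import datetime

# Two lines copied from the log in the post above (fields separated by "|").
LOG = """\
10/2/2008 7:15:40 PM|rosetta@home|Starting task HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_162905_0 using rosetta_beta version 598
10/2/2008 7:32:34 PM|rosetta@home|Computation for task HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_162905_0 finished
"""


def parse_line(line):
    """Split one log line into (timestamp, project, message)."""
    stamp, project, message = line.split("|", 2)
    # Assumed timestamp format; it depends on the locale of the machine.
    when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
    return when, project, message


def wall_clock_seconds(log_text):
    """Seconds between the first 'Starting task' and the first 'finished' line."""
    start = end = None
    for line in log_text.splitlines():
        if not line.strip():
            continue
        when, _project, message = parse_line(line)
        if start is None and message.startswith("Starting task"):
            start = when
        elif end is None and message.startswith("Computation for task") \
                and message.endswith("finished"):
            end = when
    if start is None or end is None:
        raise ValueError("log needs both a start line and a finish line")
    return int((end - start).total_seconds())


elapsed = wall_clock_seconds(LOG)
print(elapsed)  # 1014
```

From 7:15:40 PM to 7:32:34 PM is 16 minutes 54 seconds, so the task ran for roughly the 1000 seconds the poster mentions.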
Fishead

Joined: 3 Sep 08
Posts: 7
Credit: 89,566
RAC: 0
Message 56185 - Posted: 3 Oct 2008, 8:10:47 UTC

Same problems here:

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=179380707
https://boinc.bakerlab.org/rosetta/workunit.php?wuid=177135943

One of them gave me a Windows error as well, but I'm afraid I don't remember which one it was.
ID: 56185 · Rating: 0
Griiim Dave

Joined: 4 Sep 08
Posts: 3
Credit: 386,218
RAC: 0
Message 56213 - Posted: 4 Oct 2008, 12:13:32 UTC

3 workunits starting with HR19 failed on me this morning!
ID: 56213 · Rating: 0
Greg_BE

Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 56214 - Posted: 4 Oct 2008, 15:19:23 UTC
Last modified: 4 Oct 2008, 15:22:25 UTC

HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_122324_0

10/4/2008 10:32:44 AM|rosetta@home|Starting task HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_122324_0 using rosetta_beta version 598
10/4/2008 2:28:18 PM|rosetta@home|Computation for task HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_122324_0 finished
10/4/2008 2:28:18 PM|rosetta@home|Output file HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_122324_0_0 for task HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_122324_0 absent

Exit status -1073741819 (0xc0000005)
CPU time 13778.61
stderr out

<core_client_version>6.2.19</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
# cpu_run_time_pref: 21600
# random seed: 2709137


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00BCCCF2 write attempt to address 0x1F4ED470

Engaging BOINC Windows Runtime Debugger...

Dump Timestamp : 10/04/08 14:22:59
LoadLibraryA( symsrv.dll ): GetLastError = 126
LoadLibraryA( srcsrv.dll ): GetLastError = 126
Debugger Engine : 4.0.5.0
Symbol Search Path: E:boincprojectsslots;E:boincprojectsprojectsboinc.bakerlab.org_rosetta;srv*C:WINDOWSTEMPsymbols*http://msdl.microsoft.com/download/symbols;srv*C:WINDOWSTEMPsymbols*https://boinc.bakerlab.org/rosetta/symstore


SymGetModuleInfo(): GetLastError = 87
ModLoad: 00000000 00000000 ( Symbols Loaded)

The above message repeats 22 times like last time

No credit granted; I'm wasting my time with these types of tasks. 2 out of 3 of these tasks have died. If it goes 50% I will abort all the HR19 tasks to save my RAC.
ID: 56214 · Rating: 0
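The two numbers in "Exit status -1073741819 (0xc0000005)" above are the same 32-bit value: the decimal is the signed reading and the hex is the unsigned one. On Windows, 0xC0000005 is the status code for an access violation, which matches the "Access Violation" line in the stderr output. A small Python sketch of the conversion follows; the function names are illustrative only.

```python
def to_unsigned32(exit_code):
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit exit code."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


status = to_unsigned32(-1073741819)
print(hex(status))              # 0xc0000005
print(to_signed32(0xC0000005))  # -1073741819
```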
adrianxw

Joined: 18 Sep 05
Posts: 653
Credit: 11,840,739
RAC: 2
Message 56221 - Posted: 4 Oct 2008, 16:31:26 UTC
Last modified: 4 Oct 2008, 16:32:41 UTC

HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_228065_1

<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
# cpu_run_time_pref: 10800
# random seed: 2603396


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00BCCCF2 write attempt to address 0x2015E48C

Engaging BOINC Windows Runtime Debugger...
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 56221 · Rating: 0
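The crash reports in Messages 56214 and 56221 can be compared mechanically. The sketch below pulls the addresses out of the two "Reason:" lines with a regular expression and checks whether both crashes happened at the same code address. The pattern is an assumption based only on the two lines quoted in this thread; the "read" alternative is a guess, since only "write" appears here. A shared code address suggests, but does not prove, the same faulting instruction, and that reading assumes the executable was loaded at the same base address on both machines.

```python
import re

# The two "Reason:" lines quoted in the posts above.
REPORTS = [
    "Reason: Access Violation (0xc0000005) at address 0x00BCCCF2 "
    "write attempt to address 0x1F4ED470",
    "Reason: Access Violation (0xc0000005) at address 0x00BCCCF2 "
    "write attempt to address 0x2015E48C",
]

# Assumed layout of the line; "read" is included as a guess.
PATTERN = re.compile(
    r"Access Violation \((0x[0-9a-fA-F]+)\) at address (0x[0-9a-fA-F]+) "
    r"(read|write) attempt to address (0x[0-9a-fA-F]+)"
)


def parse_reason(line):
    """Return the status, code address, access type and target address."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not an access-violation line: " + line)
    status, code_address, access, target = match.groups()
    return {
        "status": int(status, 16),
        "code_address": int(code_address, 16),
        "access": access,
        "target_address": int(target, 16),
    }


parsed = [parse_reason(report) for report in REPORTS]
same_site = len({p["code_address"] for p in parsed}) == 1
print(same_site)  # True: both reports fault at 0x00BCCCF2
```

Both reports show a write fault at code address 0x00BCCCF2 with different target addresses, which is consistent with a single bug in the 5.98 Windows build rather than two unrelated hardware problems.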
Sid Celery

Joined: 11 Feb 08
Posts: 2144
Credit: 41,550,899
RAC: 8,846
Message 56263 - Posted: 6 Oct 2008, 23:34:03 UTC - in response to Message 56068.  

Task ID 194793599 gave a Compute Error with the std err:

Sid Celery, I've fixed up the link to your result for you.

Thanks Speedy. Finger trouble here...

I've noticed Mgr 6.2.19 is out now. I'm updating in the vain hope that it helps my issue here or with the mini 1.34 errors. This post is really just a marker. If there's any improvement, I'll revert to 3 hour runs instead of 2 hour runs to help save on bandwidth.
ID: 56263 · Rating: 0
Matthias Lehmkuhl

Joined: 20 Nov 05
Posts: 10
Credit: 2,447,781
RAC: 202
Message 56291 - Posted: 8 Oct 2008, 6:46:55 UTC
Last modified: 8 Oct 2008, 6:49:31 UTC

I got this error on one machine. It brings up a window that needs user acknowledgement, and the CPU runs no other BOINC project until I press OK.

resultid=197473964

Exit status -1073741819 (0xc0000005)

<core_client_version>6.2.19</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
# cpu_run_time_pref: 14400
# random seed: 2595731


Unhandled Exception Detected...

application version 5.98

edit:
Name HR19__BOINC_LONGNOE_JUMPRELAX_BARCODE_SAVE_ALL_OUT_200-HR19_-_4539_235730_1
Matthias

ID: 56291 · Rating: 0
Narlanthrotep

Joined: 23 Jan 06
Posts: 2
Credit: 512,528
RAC: 0
Message 56501 - Posted: 29 Oct 2008, 1:32:30 UTC

Virus detected by NOD32 v2.7 definition 3564 (20081028)
IMON detected: "probably a variant of Win32/Statik application"
file:
hxxp://srv4.bakerlab.org/rosetta/download/rosetta_bate_5.98_windows_intelx86.exe

I've seen a similar detection before with a previous version of Rosetta Mini, but the problem was the AV software, not Rosetta itself.

ID: 56501 · Rating: 0
bzag00

Joined: 17 Aug 06
Posts: 4
Credit: 4,765
RAC: 0
Message 56612 - Posted: 1 Nov 2008, 23:08:44 UTC

My last 2 tasks completed with a Computation Error and I was not granted credit. I am using V5.98. I am not having any problems with any other BOINC serviced applications.

I am subscribing to this thread, so when a new release of Rosetta comes out please post a message in this thread and I will resume the Rosetta project locally.

Thank you.
ID: 56612 · Rating: 0
googloo

Joined: 15 Sep 06
Posts: 133
Credit: 22,813,645
RAC: 2,151
Message 56614 - Posted: 2 Nov 2008, 0:41:43 UTC - in response to Message 56612.  
Last modified: 2 Nov 2008, 0:42:15 UTC

My last 2 tasks completed with a Computation Error and I was not granted credit. I am using V5.98. I am not having any problems with any other BOINC serviced applications.

I am subscribing to this thread, so when a new release of Rosetta comes out please post a message in this thread and I will resume the Rosetta project locally.

Thank you.


The thread to subscribe to for notifications of new Rosetta versions is supposed to be Rosetta Application Version Release Log. However, the Rosetta team apparently has trouble remembering this.
ID: 56614 · Rating: 0
rochester new york

Joined: 2 Jul 06
Posts: 2842
Credit: 2,020,043
RAC: 0
Message 56791 - Posted: 9 Nov 2008, 19:20:30 UTC

ID: 56791 · Rating: 0
kctipton

Joined: 21 Nov 05
Posts: 2
Credit: 76,842
RAC: 0
Message 56904 - Posted: 13 Nov 2008, 13:40:22 UTC

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=185399690 -- your server sent it out 3x and thereby created an error condition of "too many results"
ID: 56904 · Rating: 0
Toby Broom

Joined: 15 Oct 08
Posts: 11
Credit: 18,732,062
RAC: 0
Message 57002 - Posted: 16 Nov 2008, 17:09:47 UTC
Last modified: 16 Nov 2008, 17:12:22 UTC

I have a computer running Windows Server 2008 Core.

This edition of Windows doesn't include opengl.dll, which is causing failed workunits on my computer.

As I run BOINC as a service, I assumed that opengl.dll was only used for the screensaver and hence wouldn't be needed.
ID: 57002 · Rating: 0
Aegis Maelstrom

Joined: 29 Oct 08
Posts: 61
Credit: 2,137,555
RAC: 0
Message 57192 - Posted: 24 Nov 2008, 0:11:28 UTC
Last modified: 24 Nov 2008, 0:17:09 UTC

Hi, I seem to have the following problem with the client: a unit resumes, only to complete after a couple of seconds and get uploaded as successful.

Unfortunately, I had a similar problem with minirosetta (but then it was after 3 restarts and probably new memory requirements for the recent tasks; I say probably because nobody replied to me directly).

Message 57035

Hi,

As I promised, I have come back, increased the memory amount and started to crunch again.

To my surprise, the process has suddenly finished with a "success". The log says:
2008-11-17 21:54:33|rosetta@home|Restarting task IL23p40_p40BrubYhbond_design_jecorn_SAVE_ALL_OUT_IGNORE_THE_REST_ip40_1wr2_4683_55_1 using minirosetta version 140
2008-11-17 21:56:12|rosetta@home|Computation for task IL23p40_p40BrubYhbond_design_jecorn_SAVE_ALL_OUT_IGNORE_THE_REST_ip40_1wr2_4683_55_1 finished

As I wrote in the posts above, it is impossible to finish this task in such a short time. Last time I needed two and a half "physical" hours just to crash, probably because the memory limits were too low.
(...)


This time the process was resumed after 3 hrs of work. Log:

2008-11-24 00:49:04|rosetta@home|Restarting task LZ21__BOINC_SYMM_FOLD_AND_DOCK_RELAX-LZ21_-foldanddock__4643_15895_2 using rosetta_beta version 598
2008-11-24 00:49:25|rosetta@home|Computation for task LZ21__BOINC_SYMM_FOLD_AND_DOCK_RELAX-LZ21_-foldanddock__4643_15895_2 finished
2008-11-24 00:49:26|rosetta@home|Starting cs_jumping_abrelax_6PNAS_proteins3_homo_bench_cs_jumping_abrelax_cs_nsp1_olange_4732_32919_1
2008-11-24 00:49:31|rosetta@home|Starting task cs_jumping_abrelax_6PNAS_proteins3_homo_bench_cs_jumping_abrelax_cs_nsp1_olange_4732_32919_1 using minirosetta version 140
2008-11-24 00:49:32|rosetta@home|Started upload of LZ21__BOINC_SYMM_FOLD_AND_DOCK_RELAX-LZ21_-foldanddock__4643_15895_2_0
2008-11-24 00:49:37|rosetta@home|Finished upload of LZ21__BOINC_SYMM_FOLD_AND_DOCK_RELAX-LZ21_-foldanddock__4643_15895_2_0

Is there a problem with the client or the machine? Does BOINC require a reinstall, more resources or what exactly?

EDIT: Unfortunately I can't tell what percentage was completed before resuming. If it was more than 90%, it could have been just a coincidence. :/
ID: 57192 · Rating: 0
PlaNed

Joined: 25 Sep 05
Posts: 3
Credit: 37,334
RAC: 0
Message 57386 - Posted: 1 Dec 2008, 7:55:17 UTC

Today NOD32 reported:

01.12.2008 09:49:22 HTTP filter file http://srv1.bakerlab.org/rosetta/download/rosetta_beta_5.98_windows_intelx86.exe probably a variant of Win32/Statik application connection terminated - quarantined

ID: 57386 · Rating: 0
ByRad

Joined: 12 Apr 08
Posts: 8
Credit: 15,869,002
RAC: 386
Message 57718 - Posted: 8 Dec 2008, 21:58:36 UTC

This message window occurred:

Log messages:
2008-12-08 21:22:52|rosetta@home|Starting t062_1_NMRREF_1_t062_1_id_model_04_coreIGNORE_THE_REST_idl_5434_6154_0
2008-12-08 21:22:53|rosetta@home|Starting task t062_1_NMRREF_1_t062_1_id_model_04_coreIGNORE_THE_REST_idl_5434_6154_0 using rosetta_beta version 598

2008-12-08 22:48:05|rosetta@home|Computation for task t062_1_NMRREF_1_t062_1_id_model_04_coreIGNORE_THE_REST_idl_5434_6154_0 finished
2008-12-08 22:48:05|rosetta@home|Output file t062_1_NMRREF_1_t062_1_id_model_04_coreIGNORE_THE_REST_idl_5434_6154_0_0 for task t062_1_NMRREF_1_t062_1_id_model_04_coreIGNORE_THE_REST_idl_5434_6154_0 absent

ID: 57718 · Rating: 0
robertmiles

Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 1,227
Message 57820 - Posted: 12 Dec 2008, 15:26:28 UTC - in response to Message 57192.  

Hi, I seem to have the following problem with the client: a unit resumes, only to complete after a couple of seconds and get uploaded as successful.

Unfortunately, I had a similar problem with minirosetta (but then it was after 3 restarts and probably new memory requirements for the recent tasks; I say probably because nobody replied to me directly).

Message 57035

Hi,

As I promised, I have come back, increased the memory amount and started to crunch again.

To my surprise, the process has suddenly finished with a "success". The log says:
2008-11-17 21:54:33|rosetta@home|Restarting task IL23p40_p40BrubYhbond_design_jecorn_SAVE_ALL_OUT_IGNORE_THE_REST_ip40_1wr2_4683_55_1 using minirosetta version 140
2008-11-17 21:56:12|rosetta@home|Computation for task IL23p40_p40BrubYhbond_design_jecorn_SAVE_ALL_OUT_IGNORE_THE_REST_ip40_1wr2_4683_55_1 finished

As I wrote in the posts above, it is impossible to finish this task in such a short time. Last time I needed two and a half "physical" hours just to crash, probably because the memory limits were too low.
(...)


Is there a problem with the client or the machine? Does BOINC require a reinstall, more resources or what exactly?

EDIT: Unfortunately I can't tell what percentage was completed before resuming. If it was more than 90%, it could have been just a coincidence. :/


I'm not sure what went wrong, but my machine has acted better after I increased the upper limit on how much disk space BOINC is allowed to use and also increased the fraction of the virtual memory BOINC is allowed to use. I'd consider increasing the amount of physical memory it has as well, except that I'm already at the limit of how much my motherboard can handle.

As far as I can tell, the check of whether to end processing of a workunit comes only at the beginning of a timeslice. Restarting a task also starts a new timeslice, so the check of whether to return only the results calculated so far comes soon afterwards.

Minirosetta seems to have problems showing the correct time remaining from the point where it gets within about 10 minutes of its initial estimated run time until it completes. So if you get a workunit with a bad estimate of how much time it needs, expect slowly changing estimates of the remaining CPU time during this period. I wouldn't be surprised if Rosetta 5.98 has the same problem.
ID: 57820 · Rating: 0
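The timeslice explanation in the post above can be illustrated with a toy model. This is only a sketch of the behaviour the poster describes ("as far as I can tell"), not Rosetta's or BOINC's actual code. The function name, the 600-second slice length, and the 10800-second run-time target are all hypothetical values chosen for the example.

```python
def stop_time(cpu_done, run_time_pref, slice_seconds):
    """Toy model of the described behaviour, not the real application logic.

    cpu_done:       CPU seconds already spent when the task (re)starts
    run_time_pref:  target CPU run time in seconds (hypothetical value)
    slice_seconds:  hypothetical length of one timeslice

    The "should I stop?" check runs only at the start of a timeslice.
    A restart begins a new timeslice, so the check runs right after it.
    """
    elapsed = cpu_done
    while elapsed < run_time_pref:   # check at the start of each slice
        elapsed += slice_seconds     # otherwise work for one full slice
    return elapsed


# Resumed with the target already reached: the task ends immediately.
immediate = stop_time(cpu_done=10800, run_time_pref=10800, slice_seconds=600)
print(immediate - 10800)  # 0 extra seconds after the restart

# Resumed shortly before the target: it runs whole slices past the target.
overshoot = stop_time(cpu_done=10000, run_time_pref=10800, slice_seconds=600)
print(overshoot - 10000)  # 1200 extra seconds after the restart
```

Under these assumptions, a task that is restarted after it has already used its target run time would finish within seconds and report success, which is the pattern described in Message 57192.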
frederick corse

Joined: 7 Oct 05
Posts: 10
Credit: 1,545,999
RAC: 0
Message 58397 - Posted: 3 Jan 2009, 1:17:43 UTC

Hello, there seems to be a problem with AT01: they all fail during download because they can't get their input files. I've had them download to my G5 and also to my Mac Pro, with the same error.
ID: 58397 · Rating: 0
Evan

Joined: 23 Dec 05
Posts: 268
Credit: 402,585
RAC: 0
Message 59495 - Posted: 9 Feb 2009, 23:20:01 UTC

error with these two - no heartbeat problems

227327788
(second time)
lasted .42 secs

</stderr_txt>
<message>
<file_xfer_error>
<file_name>t078_1_NMRREF_1_t078_1_idS_vnl_00024494_10961IGNORE_THE_REST_431_6701_745_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>

and with
227327677
(first time)
lasted .41 sec


</stderr_txt>
<message>
<file_xfer_error>
<file_name>t078_1_NMRREF_1_t078_1_idS_vnl_00024494_10961IGNORE_THE_REST_431_6701_1032_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>

and this one had a validate error

227327697

ID: 59495 · Rating: 0
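The `<file_xfer_error>` fragments above are not complete XML (the enclosing tags are cut off), so a strict XML parser would reject them. A minimal Python sketch that extracts the file name and error code with a regular expression instead is shown below; the function name `transfer_errors` is illustrative. As far as I know, -161 corresponds to ERR_NOT_FOUND in BOINC's error number list, but treat that as something to verify against the BOINC source rather than as established here.

```python
import re

# One fragment copied from the post above.
FRAGMENT = """\
<file_xfer_error>
<file_name>t078_1_NMRREF_1_t078_1_idS_vnl_00024494_10961IGNORE_THE_REST_431_6701_745_1_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
"""

# Matches one complete <file_xfer_error> block; surrounding text is ignored.
BLOCK = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(.*?)</file_name>\s*"
    r"<error_code>(-?\d+)</error_code>\s*"
    r"</file_xfer_error>",
    re.DOTALL,
)


def transfer_errors(text):
    """Return a list of (file_name, error_code) pairs found in the text."""
    return [(name, int(code)) for name, code in BLOCK.findall(text)]


errors = transfer_errors(FRAGMENT)
for name, code in errors:
    print(code, name)
```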



©2024 University of Washington
https://www.bakerlab.org