Rosetta@home

Minirosetta 2.00

Yifan Song
Forum moderator
Project administrator
Project developer
Project scientist

Joined: May 26 09
Posts: 62
ID: 318024
Credit: 7,322
RAC: 0
Message 63966 - Posted 6 Nov 2009 0:10:20 UTC

Minirosetta 2.00 is now up.
The energy function optimizations we've been working on for the last few months are now in.
This version also fixes some stability issues introduced in 1.98.

Chilean

Joined: Oct 16 05
Posts: 651
ID: 5008
Credit: 10,394,263
RAC: 2,476
Message 63967 - Posted 6 Nov 2009 2:59:25 UTC

GPU support yet?? :D

The new BOINC version apparently likes my GPU and keeps requesting work for it :(
____________

P . P . L .

Joined: Aug 20 06
Posts: 581
ID: 105843
Credit: 4,864,105
RAC: 0
Message 63968 - Posted 6 Nov 2009 3:05:09 UTC

Hi.

Is it right that the Linux GNU build is so big? At 31.4 MB that's huge,

plus the new db file.

Especially if you have a lot of rigs, look out.



____________


Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 63971 - Posted 6 Nov 2009 5:55:35 UTC

PPL has a point. Looks like both Linux versions are more than 4x larger than prior releases.
____________
Rosetta Moderator: Mod.Sense

Speedy

Joined: Sep 25 05
Posts: 159
ID: 1058
Credit: 521,019
RAC: 10
Message 63972 - Posted 6 Nov 2009 7:24:33 UTC - in response to Message ID 63971.

PPL has a point. Looks like both Linux versions are more than 4x larger than prior releases.

If this is the case I will wait until the size decreases again before I return to this project.
____________
Have a crunching good day!!

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 63973 - Posted 6 Nov 2009 18:56:51 UTC

Speedy, I should clarify: we are talking about the download size of the new program version. It is most likely just due to a compiler parameter that incorporated some debug capabilities or something. So it does not necessarily affect how much memory is required to run.

The concern would just be if you have very limited bandwidth available to perform the initial download. Or perhaps if you have very limited disk space available to BOINC.
____________
Rosetta Moderator: Mod.Sense

nick n

Joined: Aug 26 07
Posts: 49
ID: 201050
Credit: 219,102
RAC: 0
Message 63975 - Posted 7 Nov 2009 1:38:51 UTC

Seems to work fine, but the graphics lock up and need to be force-quit in Activity Monitor on OS X 10.6.1.

Michael G.R.

Joined: Nov 11 05
Posts: 263
ID: 11128
Credit: 8,385,240
RAC: 115
Message 63977 - Posted 8 Nov 2009 0:56:50 UTC - in response to Message ID 63975.

Seems to work fine, but the graphics lock up and need to be force-quit in Activity Monitor on OS X 10.6.1.


I'm on OS X 10.6.1 and the graphics are fine here (though I haven't left them running for very long -- did you experience a lock up after a long time?).
____________

DJStarfox

Joined: Jul 19 07
Posts: 140
ID: 191721
Credit: 575,994
RAC: 722
Message 63984 - Posted 9 Nov 2009 14:19:59 UTC

Did the disk space requirements for this new version change? I'm getting:
09-Nov-2009 09:09:12 [rosetta@home] Message from server: No work sent
09-Nov-2009 09:09:12 [rosetta@home] Message from server: There was work but you don't have enough disk space allocated.
09-Nov-2009 09:09:12 [rosetta@home] Message from server: An additional 8 MB is needed.

Disk tab in BoincMgr only says 10.2 MB used by BOINC, even after resetting the project. How much space needs to be free to run the project?

zpm

Joined: Mar 21 09
Posts: 6
ID: 306856
Credit: 349,801
RAC: 0
Message 63986 - Posted 9 Nov 2009 15:25:06 UTC - in response to Message ID 63984.

Did the disk space requirements for this new version change? I'm getting:
09-Nov-2009 09:09:12 [rosetta@home] Message from server: No work sent
09-Nov-2009 09:09:12 [rosetta@home] Message from server: There was work but you don't have enough disk space allocated.
09-Nov-2009 09:09:12 [rosetta@home] Message from server: An additional 8 MB is needed.

Disk tab in BoincMgr only says 10.2 MB used by BOINC, even after resetting the project. How much space needs to be free to run the project?


set the limit to 1 GB....

AMD_is_logical

Joined: Dec 20 05
Posts: 299
ID: 41207
Credit: 31,460,681
RAC: 0
Message 63987 - Posted 9 Nov 2009 16:32:14 UTC

These mix_score12 WUs gave validate errors for both crunchers:

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=268883328
http://boinc.bakerlab.org/rosetta/workunit.php?wuid=268921690

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 63989 - Posted 9 Nov 2009 16:35:16 UTC

DJStarfox if you have Linux boxes, see the discussion previously in this thread about how the size of the executable seems to have increased significantly in version 2.00.
____________
Rosetta Moderator: Mod.Sense

Gilles J. Seguin

Joined: Apr 3 09
Posts: 2
ID: 309494
Credit: 8,617,360
RAC: 0
Message 63993 - Posted 9 Nov 2009 21:12:44 UTC - in response to Message ID 63972.

PPL has a point. Looks like both Linux versions are more than 4x larger than prior releases.

If this is the case I will wait until the size decreases again before I return to this project.


The CLI command below would display debug symbols:
$ nm minirosetta_2.00_x86_64-pc-linux-gnu
That is, the file has not been stripped.
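
As a rough illustration of the check above, here is a minimal Python sketch that counts symbol lines in `nm` output. It assumes GNU binutils `nm` is on the PATH and that its default output puts a one-letter symbol type just before the symbol name; `count_symbols` and `looks_stripped` are hypothetical helper names, not part of BOINC or Rosetta.

```python
import subprocess


def count_symbols(nm_output):
    """Count lines that look like nm symbol entries ("[address] TYPE name")."""
    count = 0
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols: address, type, name. Undefined symbols: type, name.
        if len(parts) >= 2 and len(parts[-2]) == 1 and parts[-2].isalpha():
            count += 1
    return count


def looks_stripped(path):
    """Run nm on a binary (assumes nm is installed) and report whether it listed no symbols."""
    result = subprocess.run(["nm", path], capture_output=True, text=True)
    return count_symbols(result.stdout) == 0


# Synthetic two-line sample in the default nm layout (not taken from the real binary).
sample = "0000000000401000 T main\n                 U printf\n"
print(count_symbols(sample))  # prints 2
```

A large count on the released binary would be consistent with it not having been stripped; it does not by itself show how much of the download size the symbols account for.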

Another improvement would be to use the shared option
with the prelink utility. That is, static linking is far from optimal.
The CPU here is a quad core.

I would also like to know what is happening with the availability of a client
for GPU hardware.

Chilean

Joined: Oct 16 05
Posts: 651
ID: 5008
Credit: 10,394,263
RAC: 2,476
Message 63995 - Posted 10 Nov 2009 2:24:28 UTC

2.00 Works perfectly under Windows 7 Ultimate x64.

Including graphics.
____________

AMD_is_logical

Joined: Dec 20 05
Posts: 299
ID: 41207
Credit: 31,460,681
RAC: 0
Message 63996 - Posted 10 Nov 2009 3:40:18 UTC - in response to Message ID 63993.

Another improvement would be to use the shared option
with the prelink utility. That is, static linking is far from optimal.

Static linking is needed for portability. A non-static binary would probably not run properly on a lot of Linux machines.

Evan

Joined: Dec 23 05
Posts: 268
ID: 42505
Credit: 402,585
RAC: 0
Message 64000 - Posted 10 Nov 2009 8:35:30 UTC

This one had a validate error on both attempts.
mix_score12_correct_B_rlbd_1vie__IGNORE_THE_RESTlr13_DECOY_15624_476
____________

[DPC]DeApen~BaDu

Joined: Oct 17 09
Posts: 2
ID: 354688
Credit: 102,981
RAC: 0
Message 64004 - Posted 10 Nov 2009 17:02:18 UTC

Can it be that this version delivers somewhat less performance?
I'm down 20% since the introduction of the 2.00 version.

http://tadah.mine.nu/graphs/flushHistoryGraph.php?tabel=subteamoffset&prefix=rah&naam=BaDu&team=[DPC]DeApen
The introduction was on the 6th and my queue holds 3 days of work, so you can see the performance decrease since running 2.00.

Is there a way to switch back to 1.98?

transient

Joined: Sep 30 06
Posts: 376
ID: 115553
Credit: 7,834,811
RAC: 4,046
Message 64005 - Posted 10 Nov 2009 18:17:48 UTC

No, you can't switch versions. 2.00 has only been out for 4 days. That seems a bit short to me to have such an effect on your RAC.
____________

[DPC]DeApen~BaDu

Joined: Oct 17 09
Posts: 2
ID: 354688
Credit: 102,981
RAC: 0
Message 64006 - Posted 10 Nov 2009 20:12:06 UTC

The graph is not showing RAC but daily points over the last 7 days, generated by a stats engine.

transient

Joined: Sep 30 06
Posts: 376
ID: 115553
Credit: 7,834,811
RAC: 4,046
Message 64009 - Posted 11 Nov 2009 6:01:42 UTC - in response to Message ID 64006.

The graph is not showing RAC but daily points over the last 7 days, generated by a stats engine.


True, but when looking up your account at boincstats, I did not see such a drop.
____________

Yifan Song
Forum moderator
Project administrator
Project developer
Project scientist

Joined: May 26 09
Posts: 62
ID: 318024
Credit: 7,322
RAC: 0
Message 64012 - Posted 11 Nov 2009 9:01:21 UTC

The large increase in the executable size could be due to the inclusion of a number of protocols that have been developed over the last 2 years. Those protocols could not be compiled with the BOINC build until now.
There shouldn't be any difference in running the tasks, though; the only difference is the time it takes to update.

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 64019 - Posted 11 Nov 2009 18:50:02 UTC

If this is the cause, then how did the Windows version get by without any noticeable increase in size?
____________
Rosetta Moderator: Mod.Sense

P . P . L .

Joined: Aug 20 06
Posts: 581
ID: 105843
Credit: 4,864,105
RAC: 0
Message 64023 - Posted 11 Nov 2009 21:35:10 UTC - in response to Message ID 64012.

The large increase in the executable size could be due to the inclusion of a number of protocols that have been developed over the last 2 years. Those protocols could not be compiled with the BOINC build until now.
There shouldn't be any difference in running the tasks, though; the only difference is the time it takes to update.


Hi.

Does that mean that we Linux folk get to do more of the heavy lifting. ;) L.O.L.

____________


P . P . L .

Joined: Aug 20 06
Posts: 581
ID: 105843
Credit: 4,864,105
RAC: 0
Message 64028 - Posted 12 Nov 2009 2:00:27 UTC
Last modified: 12 Nov 2009 2:02:11 UTC

Hi. First error with mini 2.00. Well, sort of.

This is an odd one that only ran for 3 minutes; I don't know what happened.

No error in manager.

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=269405320

mix_score12_B_rlbd_1ttz__IGNORE_THE_RESTlr13_DECOY_15619_826_0

Over__Validate error__Done__180.24

# cpu_run_time_pref: 14400
======================================================
DONE :: 1 starting structures 1201 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish

</stderr_txt>
____________


Greg_BE

Joined: May 30 06
Posts: 4835
ID: 85645
Credit: 2,969,735
RAC: 81
Message 64030 - Posted 12 Nov 2009 20:43:51 UTC

Also got my first 2.00 error

http://boinc.bakerlab.org/rosetta/result.php?resultid=295408260
lr5_combine_smooth_torsion_it07_A_rlbd_1bm8_SAVE_ALL_OUT_IGNORE_THE_REST_DECOY_15460_190_2

<core_client_version>6.10.17</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
[2009-11-12 16:39:46:] :: BOINC:: Initializing ... ok.
[2009-11-12 16:39:46:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
ERROR: Option matching -new_icoor not found in command line top-level context

</stderr_txt>
]]>

[AF>Libristes] Dudumomo

Joined: Nov 30 06
Posts: 5
ID: 132369
Credit: 1,667,509
RAC: 0
Message 64031 - Posted 12 Nov 2009 21:03:11 UTC - in response to Message ID 64030.
Last modified: 12 Nov 2009 21:04:03 UTC

Hi.
I got a lot of errors too :
lr5_dun08_it04_A_rlbd_4icb_SAVE_ALL_OUT_IGNORE_THE_REST_DECOY_15799_439_0
<core_client_version>6.10.17</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
[2009-11-12 17:51:35:] :: BOINC:: Initializing ... ok.
[2009-11-12 17:51:35:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev33769.zip
Unpacking WU data ...
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/yfsong_lr5_dun08_it04_A.zip
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/lr5_4icb.out.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Fullatom mode ..
# cpu_run_time_pref: 86400
Fullatom mode ..
..
..
..
Fullatom mode ..
SIGSEGV: segmentation violation
Stack trace (27 frames):
[0x9667f13]
.
.
.
[0x8048121]

Exiting...

</stderr_txt>
]]>

And also :

lr5_dun08_it04_A_rlbd_1ugh_SAVE_ALL_OUT_IGNORE_THE_REST_DECOY_15799_445_0
<core_client_version>6.10.17</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
[2009-11-12 19:16:30:] :: BOINC:: Initializing ... ok.
[2009-11-12 19:16:30:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev33769.zip
Unpacking WU data ...
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/yfsong_lr5_dun08_it04_A.zip
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/lr5_1wdv.out.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Fullatom mode ..
# cpu_run_time_pref: 86400
Fullatom mode ..
Fullatom mode ..
Fullatom mode ..
*** glibc detected *** free(): invalid next size (fast): 0xef219138 ***
SIGABRT: abort called
Stack trace (30 frames):
[0x9667f13]
.
.
.
[0x8048121]

Exiting...

</stderr_txt>
]]>

Any idea why ?

And I got a lr5_dun08 blocked at 0.310% after 24h...I'm gonna cancel it I guess.
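
The three numbers in the "process exited with code 193 (0xc1, -63)" lines above appear to be one value written three ways: decimal, hexadecimal, and the same byte read as a signed 8-bit number. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and this is only a check of the numbers, not a statement about what the code means to BOINC.

```python
def exit_code_views(code):
    """Return the low byte of an exit code as (decimal, hex string, signed 8-bit)."""
    byte = code & 0xFF
    # Values of 128 and above wrap to negative when read as a signed byte.
    signed = byte - 256 if byte >= 128 else byte
    return byte, hex(byte), signed


print(exit_code_views(193))  # (193, '0xc1', -63)
```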
____________
MyUneo, the Cupid of Services

Hefto99

Joined: Oct 11 05
Posts: 5
ID: 3973
Credit: 1,312,635
RAC: 0
Message 64033 - Posted 13 Nov 2009 11:47:33 UTC

I have got several errors too (on 64-bit Linux):

===========
<core_client_version>6.2.15</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
[2009-11-13 13: 7:12:] :: BOINC:: Initializing ... ok.
[2009-11-13 13: 7:12:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev33769.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
# cpu_run_time_pref: 28800
*** glibc detected *** corrupted double-linked list: 0x11b99940 ***
SIGABRT: abort called
Stack trace (23 frames):
[0x9667f13]


============
<core_client_version>6.2.15</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
[2009-11-13 13:10: 5:] :: BOINC:: Initializing ... ok.
[2009-11-13 13:10: 5:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev33769.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
# cpu_run_time_pref: 28800
*** glibc detected *** free(): invalid next size (normal): 0x11212198 ***
SIGABRT: abort called
Stack trace (21 frames):
[0x9667f13]

____________

[AF>Libristes] Dudumomo

Joined: Nov 30 06
Posts: 5
ID: 132369
Credit: 1,667,509
RAC: 0
Message 64034 - Posted 13 Nov 2009 13:03:42 UTC - in response to Message ID 64033.

I'm on 64-bit Linux too.
I guess there is something wrong with our libs...?

My second laptop, also on 64-bit Linux, does not have any computation errors...

Do we have to install a particular lib? Or what is wrong?

Thanks
____________
MyUneo, the Cupid of Services

AMD_is_logical

Joined: Dec 20 05
Posts: 299
ID: 41207
Credit: 31,460,681
RAC: 0
Message 64035 - Posted 13 Nov 2009 17:34:04 UTC

Had some 3gbm WUs bomb out after 100 seconds or so. This was with 32bit Linux. In some cases the other cruncher returned a successful result (with Windows).

http://boinc.bakerlab.org/rosetta/result.php?resultid=295903692
http://boinc.bakerlab.org/rosetta/result.php?resultid=295911236
http://boinc.bakerlab.org/rosetta/result.php?resultid=295912057
http://boinc.bakerlab.org/rosetta/result.php?resultid=295912773

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 64036 - Posted 13 Nov 2009 17:34:56 UTC - in response to Message ID 64034.

I'm on 64-bit Linux too.
I guess there is something wrong with our libs...?

My second laptop, also on 64-bit Linux, does not have any computation errors...

Do we have to install a particular lib? Or what is wrong?

Thanks


Everything needed downloads with the work unit. It appears some specific tasks are having trouble and that is what this thread is for, to collect the descriptions of those so they can be corrected in future releases.
____________
Rosetta Moderator: Mod.Sense

[AF>Libristes] Dudumomo

Joined: Nov 30 06
Posts: 5
ID: 132369
Credit: 1,667,509
RAC: 0
Message 64037 - Posted 13 Nov 2009 22:09:33 UTC

Okay, thanks!
I'll leave my second computer running these WUs.
____________
MyUneo, the Cupid of Services

svincent

Joined: Dec 30 05
Posts: 202
ID: 44923
Credit: 4,404,794
RAC: 6,286
Message 64038 - Posted 14 Nov 2009 3:07:59 UTC

sel_core_2.0_low50_beta_low200_start0_hb_t286__IGNORE_THE_REST_15751_714_1

Task 295582440 failed on Windows 7.

ERROR: res1 != res2
ERROR:: Exit from: ..\..\src\core\kinematics\FoldTree.cc line: 2342
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

</stderr_txt>
]]>

____________

svincent

Joined: Dec 30 05
Posts: 202
ID: 44923
Credit: 4,404,794
RAC: 6,286
Message 64065 - Posted 17 Nov 2009 16:27:30 UTC

mix_score13_C_rlbd_1ttz__IGNORE_THE_RESTlr13_DECOY_15917_345_1 task 296879164 gave a Validate Error on Mac OS X 10.6 after generating one decoy. "Too many error results" according to the Workunit log: it had been sent out once before with a similar result.

____________

Greg_BE

Joined: May 30 06
Posts: 4835
ID: 85645
Credit: 2,969,735
RAC: 81
Message 64076 - Posted 18 Nov 2009 18:08:15 UTC

2 more errors - compute errors

http://boinc.bakerlab.org/rosetta/result.php?resultid=297254753
http://boinc.bakerlab.org/rosetta/result.php?resultid=296995752

ERROR: res1 != res2
ERROR:: Exit from: ..\..\src\core\kinematics\FoldTree.cc line: 2342
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

</stderr_txt>
]]>


svincent

Joined: Dec 30 05
Posts: 202
ID: 44923
Credit: 4,404,794
RAC: 6,286
Message 64084 - Posted 19 Nov 2009 5:29:59 UTC

again_sel_core_2.0_low50_beta_low200_nostart_hb_t286__IGNORE_THE_REST_15859_550_1 (task 296161309) failed on Windows 7

ERROR: res1 != res2
ERROR:: Exit from: ..\..\src\core\kinematics\FoldTree.cc line: 2342
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

</stderr_txt>
]]>
____________

Interboy

Joined: Sep 28 05
Posts: 3
ID: 1645
Credit: 694,409
RAC: 0
Message 64085 - Posted 19 Nov 2009 8:34:10 UTC
Last modified: 19 Nov 2009 8:35:07 UTC

I aborted task "threading_bong_promals_3_hb_t305__IGNORE_THE_REST_16009_335_0" with unhandled exception on task 297355887.
____________

svincent

Joined: Dec 30 05
Posts: 202
ID: 44923
Credit: 4,404,794
RAC: 6,286
Message 64089 - Posted 19 Nov 2009 17:08:26 UTC

A couple more sel_core* tasks failing on Windows 7. Looking at the forum, it seems tasks with names containing t313 are quite prone to failure.

sel_core_1.5_low200_beta_low200_nostart_hb_t313__IGNORE_THE_REST_15870_160_0 (task 296161514)
sel_core_1.5_low200_beta_low200_nostart_hb_t328__IGNORE_THE_REST_15873_167_0 (task 296161959)

ERROR: res1 != res2
ERROR:: Exit from: ..\..\src\core\kinematics\FoldTree.cc line: 2342
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

</stderr_txt>
]]>

____________

Telescope Adrian

Joined: Nov 14 06
Posts: 9
ID: 129278
Credit: 1,906,378
RAC: 0
Message 64091 - Posted 19 Nov 2009 19:03:25 UTC

Anybody noticed a new "facility" with 2.00 yet?
Run 2 jobs together (AMD Athlon 64 X2) and, after a while, one of the jobs goes idle, meaning that the system idle process sits at 50% utilisation. Suspending Rosetta, then restarting it, makes no difference to this behaviour.
I used to see this feature a while (many months) ago, but it went away, I think at about version 1.97.

Has anyone else seen this yet?

Best wishes
____________

Chilean

Joined: Oct 16 05
Posts: 651
ID: 5008
Credit: 10,394,263
RAC: 2,476
Message 64092 - Posted 19 Nov 2009 19:40:18 UTC - in response to Message ID 64091.
Last modified: 19 Nov 2009 19:41:40 UTC

Anybody noticed a new "facility" with 2.00 yet?
Run 2 jobs together (AMD Athlon 64 X2) and, after a while, one of the jobs goes idle, meaning that the system idle process sits at 50% utilisation. Suspending Rosetta, then restarting it, makes no difference to this behaviour.
I used to see this feature a while (many months) ago, but it went away, I think at about version 1.97.

Has anyone else seen this yet?

Best wishes


How much RAM do you have?

Edit: I figured it out myself (2 GB). I don't know what the problem could be... maybe you've given Rosetta too little available RAM in your settings?
____________

Telescope Adrian

Joined: Nov 14 06
Posts: 9
ID: 129278
Credit: 1,906,378
RAC: 0
Message 64093 - Posted 19 Nov 2009 20:19:54 UTC - in response to Message ID 64092.

Anybody noticed a new "facility" with 2.00 yet?
Run 2 jobs together (AMD Athlon 64 X2) and, after a while, one of the jobs goes idle, meaning that the system idle process sits at 50% utilisation. Suspending Rosetta, then restarting it, makes no difference to this behaviour.
I used to see this feature a while (many months) ago, but it went away, I think at about version 1.97.

Has anyone else seen this yet?

Best wishes


How much RAM do you have?

Edit: I figured it out myself (2 GB). I don't know what the problem could be... maybe you've given Rosetta too little available RAM in your settings?


Hello there. It's not a problem of store availability, since I allow BOINC to use 75% of my available real store when I'm running projects. (Virtual storage systems don't work like you seem to think!) On this machine I usually have other jobs from Rosetta and Spinhenge queuing to run, but when the Rosetta job goes "idle", no other job starts up to take up its engine time, so it's nothing to do with OCP time utilisation either. As I said earlier, this facility used to show itself earlier this year, but went away at about Rosetta 1.97. The workunit seems just to sit waiting for something, but I know not what!
Regards
____________

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 64096 - Posted 20 Nov 2009 4:54:39 UTC

Adrian, Rosetta does not decide what work runs at what time; BOINC decides this. It does this based on your preferences. Since BOINC does not have a configuration setting called "real store", you haven't really told us much about your settings. Even if you were indicating memory, you didn't tell us if this was the setting for when the machine is in use, or when it is idle.

The main thing to check is... what does BOINC say the reason for not running it is? The task's status or the messages should indicate what's going on. Since you seem familiar with the Windows task manager, another idea would be to suspend the task that is active, and see if the other resumes running. And then look at how much memory it is using. Or easier yet, it should appear in the task list if you sort it alphabetically and show you how much memory it is using.

If it is consuming too much memory then that would be something Rosetta might be able to address.

Do you know which task name is causing you problems?
____________
Rosetta Moderator: Mod.Sense

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 64097 - Posted 20 Nov 2009 5:00:57 UTC

Notes for Project Team:

Looking at Adrian's task list, it looks like this one had a very long running model on the third decoy
threading_bong_promals_4_hb_t328__IGNORE_THE_REST_16074_67_0
http://boinc.bakerlab.org/rosetta/result.php?resultid=297460595

Target runtime 14,400, 3 decoys ran in 23,000. The first two must have been done within 9,600 or it would have ended the task before starting the third. So that means the third ran for at least 13,400, which is nearly 4 hours.
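
The 9,600 figure above is consistent with one simple rule, sketched here in Python as an assumption rather than as Rosetta's actual logic: a new decoy is started only if the elapsed time plus the average time per decoy so far still fits within the target runtime. The function name is hypothetical, and times are taken to be in seconds (14,400 s is 4 hours).

```python
def latest_start_for_next_decoy(target_runtime, decoys_done):
    """Largest elapsed time at which one more decoy would still be started.

    Assumed rule (not confirmed by the project):
    start another decoy only if elapsed + elapsed / decoys_done <= target_runtime.
    """
    return target_runtime * decoys_done / (decoys_done + 1)


target_runtime = 14400   # seconds, the target runtime quoted above
total_runtime = 23000    # seconds, reported for 3 decoys

limit_after_two = latest_start_for_next_decoy(target_runtime, 2)
third_decoy_min = total_runtime - limit_after_two

print(limit_after_two)                   # 9600.0
print(third_decoy_min)                   # 13400.0
print(round(third_decoy_min / 3600, 2))  # 3.72
```

Under that assumed rule, the third decoy alone ran for at least about 3.7 hours, which matches the "nearly 4 hours" estimate.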
____________
Rosetta Moderator: Mod.Sense

Telescope Adrian

Joined: Nov 14 06
Posts: 9
ID: 129278
Credit: 1,906,378
RAC: 0
Message 64099 - Posted 20 Nov 2009 7:16:02 UTC - in response to Message ID 64098.

Adrian, Rosetta does not decide what work runs at what time; BOINC decides this. It does this based on your preferences. Since BOINC does not have a configuration setting called "real store", you haven't really told us much about your settings. Even if you were indicating memory, you didn't tell us if this was the setting for when the machine is in use, or when it is idle.

The main thing to check is... what does BOINC say the reason for not running it is? The task's status or the messages should indicate what's going on. Since you seem familiar with the Windows task manager, another idea would be to suspend the task that is active, and see if the other resumes running. And then look at how much memory it is using. Or easier yet, it should appear in the task list if you sort it alphabetically and show you how much memory it is using.

If it is consuming too much memory then that would be something Rosetta might be able to address.

Do you know which task name is causing you problems?


Yes, I am aware that BOINC is what is loosely termed a High-Level Scheduler.
You seem to be expert at reading the bleeding obvious; there is no message from BOINC and I've already checked all the obvious parameters that you've mentioned. Please be advised that I was a senior operating systems software engineer prior to my retirement, so I'm not one of the average computer illiterate cretins posting inane problems.
____________

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 64109 - Posted 20 Nov 2009 17:01:10 UTC

Adrian, I'm simply trying to understand what you've got set up there, and what you are expecting the behavior to be. Is Rosetta the only BOINC project with work on your machine? How many tasks do you have waiting to run? Has it begun each, run for about 30 seconds and then suspended it? Do you leave tasks in memory when suspended?

As a senior operating systems software engineer, you must realize that the questions I've asked are highly relevant, and that you've answered none of them. Not even the task name and how much memory it is using. So I am still unable to use your report to build any theories about specific tasks consuming excessive memory.

Doctors make the worst patients, and the ole "...it broke. I did everything and nothing worked" problem description certainly sounds familiar.

Since you speak English, and don't have to translate the displays for us, and you realize how intricate the configuration of the system can be, please use the terms written on the screen and describe your situation. Otherwise none of us cretins are going to be able to help you.
____________
Rosetta Moderator: Mod.Sense

Cesium_133*

Joined: Dec 1 08
Posts: 28
ID: 290631
Credit: 113,039
RAC: 2
Message 64115 - Posted 21 Nov 2009 11:35:20 UTC

So we're up to v2.00 on the Mini. I'm hoping that we're building upon a good foundation and cleaning up problems as we go... not just patching up this problem and that or finding work-arounds. I am very satisfied with the main bulk of the project, the computation... the graphic side isn't as important to me, but error creep, as it were, could always happen.

To perfection and no bugs :D
____________
The lovely lady you see isn't I, but Hayley Westenra, a classical crossover singer from Christchurch, NZ. There is no known voice as hers. Check her out- she's seraphic.

svincent

Joined: Dec 30 05
Posts: 202
ID: 44923
Credit: 4,404,794
RAC: 6,286
Message 64119 - Posted 21 Nov 2009 18:17:28 UTC

Task 3a9bB_rebuild_loop_no_no_relax_16088_612_0 ( 297537645 ) hung on Windows 7 at 18% completion. According to BOINC it was still going but the Task Manager said it was getting 0% time. Invoking Graphics showed a blank black screen and had to be aborted.

Restarting the computer caused the task to start behaving normally again, but it soon hung (with the same symptoms) and I aborted it.

____________

[af>FRANCE>44>Nantes] Einstein-Rosen-Podolsky

Joined: Jul 14 06
Posts: 4
ID: 99993
Credit: 425,264
RAC: 0
Message 64123 - Posted 22 Nov 2009 8:48:08 UTC

My computer crashes within a few minutes of running "Rosetta mini 2.0".

I know it is Mini 2.0 that causes the crash (no BSOD); the PC just behaves as if it were asleep, and when I push the power button the PC stops.

I have run several tests.

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 64129 - Posted 22 Nov 2009 16:30:58 UTC - in response to Message ID 64123.

My computer crashes within a few minutes of running "Rosetta mini 2.0".

I know it is Mini 2.0 that causes the crash (no BSOD); the PC just behaves as if it were asleep, and when I push the power button the PC stops.

I have run several tests.


Do you know which work unit you had problems with? Have you had this happen with more then one work unit?
____________
Rosetta Moderator: Mod.Sense

[af>FRANCE>44>Nantes] Einstein-Rosen-Podolsky

Joined: Jul 14 06
Posts: 4
ID: 99993
Credit: 425,264
RAC: 0
Message 64132 - Posted 22 Nov 2009 18:04:40 UTC - in response to Message ID 64129.
Last modified: 22 Nov 2009 18:07:03 UTC

My computer crashes after a few minutes of "Rosetta mini 2.0".

I know it is Mini 2.0 that causes the crash (no BSOD): the PC just seems to go to sleep, and when I push the power button the PC stops.

I did several tests.


Do you know which work unit you had problems with? Have you had this happen with more than one work unit?



Hmm, I believe it is with: frb_0_8_mike_chosen_csts.noloopclose_ideal_hb_t369__IGNORE_THE_REST_1RXQA_8_16202_22_0, but I'm not sure. It's an application which is suspended.
I did not really see which task it was; I only saw that while minirosetta was running, after a few minutes (2 or 3), my PC crashed.

I'm on Win7 x64 with 4 GB of DDR3.

AMD_is_logical

Joined: Dec 20 05
Posts: 299
ID: 41207
Credit: 31,460,681
RAC: 0
Message 64144 - Posted 23 Nov 2009 16:18:46 UTC

These mix_score12 WUs gave validate errors for both crunchers.

This one ended reporting one decoy:
http://boinc.bakerlab.org/rosetta/workunit.php?wuid=272164905

This one reported 12 decoys for both crunchers:
http://boinc.bakerlab.org/rosetta/workunit.php?wuid=272118041

Both WUs ended well before the specified runtime.

[af>FRANCE>44>Nantes] Einstein-Rosen-Podolsky

Joined: Jul 14 06
Posts: 4
ID: 99993
Credit: 425,264
RAC: 0
Message 64150 - Posted 23 Nov 2009 17:29:22 UTC

I'm French, so...?

Can I reset the project and start a new one without problems (crashes)?
____________

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 64157 - Posted 23 Nov 2009 19:31:52 UTC - in response to Message ID 64150.

I'm French, so...?

Can I reset the project and start a new one without problems (crashes)?


Rosen, people trying to translate BOINC terms often come up with words that do not match what the English BOINC displays, so let me answer the two possible ways I see to interpret your question.

"Project" is Rosetta@home.
"Task" is one of the things your machine has downloaded from Rosetta.

If you "reset" the project, it aborts all of the tasks you have and downloads all of the programs and files again (about 20MB of downloads). This is generally not necessary. If you "abort" a task, BOINC will try to download more work the next time it updates with the project.

So, either way, your machine won't crash. But you will not get any credit for the aborted tasks. In general, you should not have to reset or abort anything. Perhaps you could post (en français if you must) what you are seeing that causes you to want to force a change.
____________
Rosetta Moderator: Mod.Sense

Ace Casino Profile

Joined: Jul 16 07
Posts: 11
ID: 191064
Credit: 3,874,587
RAC: 4,066
Message 64169 - Posted 24 Nov 2009 10:12:25 UTC
Last modified: 24 Nov 2009 11:10:39 UTC

Downloaded 9 wu's and they all errored out. The wingman on the wu's errored out also.

update: downloaded 4 more wu's on another computer...errors

AdeB Profile
Avatar

Joined: Dec 12 06
Posts: 45
ID: 135244
Credit: 2,473,178
RAC: 1,976
Message 64173 - Posted 24 Nov 2009 13:11:44 UTC - in response to Message ID 64169.

Downloaded 9 wu's and they all errored out. The wingman on the wu's errored out also.

update: downloaded 4 more wu's on another computer...errors


Same here:

ERROR: Value of inactive option accessed: -score:dun08_dir

example: 298867941

AdeB
____________

AMD_is_logical

Joined: Dec 20 05
Posts: 299
ID: 41207
Credit: 31,460,681
RAC: 0
Message 64180 - Posted 24 Nov 2009 19:25:22 UTC

I'm getting a whole slew of errors from lr8_combine_smooth_torsion_it00_rama WUs. They quickly error out saying:

ERROR: Value of inactive option accessed: -score:dun08_dir

Here are a few examples:
http://boinc.bakerlab.org/rosetta/result.php?resultid=298962894
http://boinc.bakerlab.org/rosetta/result.php?resultid=298962944
http://boinc.bakerlab.org/rosetta/result.php?resultid=298984258
http://boinc.bakerlab.org/rosetta/result.php?resultid=299000848
http://boinc.bakerlab.org/rosetta/result.php?resultid=299050519

[B^S]Beremat

Joined: Nov 1 06
Posts: 18
ID: 126607
Credit: 196,820
RAC: 1,955
Message 64182 - Posted 24 Nov 2009 19:54:02 UTC

Confirmed.
11/24/2009 2:47:59 PM rosetta@home Task lr8_combine_smooth_torsion_it00_rama08_A_rlbd_2acy_IGNORE_THE_REST_DECOY_14893_560_1 exited with zero status but no 'finished' file
11/24/2009 2:47:59 PM rosetta@home If this happens repeatedly you may need to reset the project.
11/24/2009 2:48:00 PM rosetta@home Restarting task lr8_combine_smooth_torsion_it00_rama08_A_rlbd_2acy_IGNORE_THE_REST_DECOY_14893_560_1 using minirosetta version 200
11/24/2009 2:48:14 PM rosetta@home Computation for task lr8_combine_smooth_torsion_it00_rama08_A_rlbd_2acy_IGNORE_THE_REST_DECOY_14893_560_1 finished
11/24/2009 2:48:14 PM rosetta@home Output file lr8_combine_smooth_torsion_it00_rama08_A_rlbd_2acy_IGNORE_THE_REST_DECOY_14893_560_1_0 for task lr8_combine_smooth_torsion_it00_rama08_A_rlbd_2acy_IGNORE_THE_REST_DECOY_14893_560_1 absent

TONS of this.
____________

P . P . L .
Avatar

Joined: Aug 20 06
Posts: 581
ID: 105843
Credit: 4,864,105
RAC: 0
Message 64189 - Posted 25 Nov 2009 0:30:57 UTC
Last modified: 25 Nov 2009 0:31:42 UTC

Add two more with the same error for me, so far.


http://boinc.bakerlab.org/rosetta/workunit.php?wuid=272594235

Wed 25 Nov 2009 09:31:16 EST|rosetta@home|Starting task lr8_combine_smooth_torsion_it00_rama02_A_rlbd_1bgf_IGNORE_THE_REST_DECOY_14887_508_1 using minirosetta version 200

Wed 25 Nov 2009 09:31:29 EST|rosetta@home|Output file lr8_combine_smooth_torsion_it00_rama02_A_rlbd_1bgf_IGNORE_THE_REST_DECOY_14887_508_1_0 for task absent
=================================================================================

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=272841430

Wed 25 Nov 2009 09:31:29 EST|rosetta@home|Starting task lr8_combine_smooth_torsion_it00_rama09_A_rlbd_1b3a_IGNORE_THE_REST_DECOY_14894_617_1 using minirosetta version 200

Wed 25 Nov 2009 09:31:42 EST|rosetta@home|Output file lr8_combine_smooth_torsion_it00_rama09_A_rlbd_1b3a_IGNORE_THE_REST_DECOY_14894_617_1_0 for task absent

<core_client_version>6.2.14</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
ERROR: Value of inactive option accessed: -score:dun08_dir
____________


Greg_BE Profile
Avatar

Joined: May 30 06
Posts: 4835
ID: 85645
Credit: 2,969,735
RAC: 81
Message 64190 - Posted 25 Nov 2009 0:50:48 UTC - in response to Message ID 64182.

Confirmed.
11/24/2009 2:47:59 PM rosetta@home Task lr8_combine_smooth_torsion_it00_rama08_A_rlbd_2acy_IGNORE_THE_REST_DECOY_14893_560_1 exited with zero status but no 'finished' file
11/24/2009 2:47:59 PM rosetta@home If this happens repeatedly you may need to reset the project.
11/24/2009 2:48:00 PM rosetta@home Restarting task lr8_combine_smooth_torsion_it00_rama08_A_rlbd_2acy_IGNORE_THE_REST_DECOY_14893_560_1 using minirosetta version 200
11/24/2009 2:48:14 PM rosetta@home Computation for task lr8_combine_smooth_torsion_it00_rama08_A_rlbd_2acy_IGNORE_THE_REST_DECOY_14893_560_1 finished
11/24/2009 2:48:14 PM rosetta@home Output file lr8_combine_smooth_torsion_it00_rama08_A_rlbd_2acy_IGNORE_THE_REST_DECOY_14893_560_1_0 for task lr8_combine_smooth_torsion_it00_rama08_A_rlbd_2acy_IGNORE_THE_REST_DECOY_14893_560_1 absent

TONS of this.



Can you post the links to those tasks that errored out like that?
There is usually an underlying cause in the task status report.

P . P . L .
Avatar

Joined: Aug 20 06
Posts: 581
ID: 105843
Credit: 4,864,105
RAC: 0
Message 64191 - Posted 25 Nov 2009 1:37:56 UTC

Another three for me, same error.

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=272594647

Wed 25 Nov 2009 12:03:07 EST|rosetta@home|Starting task lr8_combine_smooth_torsion_it00_rama03_A_rlbd_1kpe_IGNORE_THE_REST_DECOY_14888_508_0 using minirosetta version 200

Wed 25 Nov 2009 12:03:19 EST|rosetta@home|Output file lr8_combine_smooth_torsion_it00_rama03_A_rlbd_1kpe_IGNORE_THE_REST_DECOY_14888_508_0_0 for task absent

--------------------------------------------------------------------------------------

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=272895767

Wed 25 Nov 2009 11:43:49 EST|rosetta@home|Starting task lr8_combine_smooth_torsion_it00_rama07_A_rlbd_1kpe_IGNORE_THE_REST_DECOY_14892_659_0 using minirosetta version 200

Wed 25 Nov 2009 11:44:10 EST|rosetta@home|Output file lr8_combine_smooth_torsion_it00_rama07_A_rlbd_1kpe_IGNORE_THE_REST_DECOY_14892_659_0_0 for task absent

------------------------------------------------------------------------------------------------

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=272895760

Wed 25 Nov 2009 11:44:11 EST|rosetta@home|Starting task lr8_combine_smooth_torsion_it00_rama07_A_rlbd_1ig5_IGNORE_THE_REST_DECOY_14892_659_0 using minirosetta version 200

Wed 25 Nov 2009 11:44:18 EST|rosetta@home|Output file lr8_combine_smooth_torsion_it00_rama07_A_rlbd_1ig5_IGNORE_THE_REST_DECOY_14892_659_0_0 for task absent

____________


Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 64200 - Posted 25 Nov 2009 4:00:23 UTC
Last modified: 25 Nov 2009 4:04:10 UTC

Many of these lr8_combine_smooth_torsion_it00_rama##... work units seem to be failing in the first 30 seconds of execution with:
ERROR: Value of inactive option accessed: -score:dun08_dir

The messages tab then shows no output file was present as well.
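
For anyone sorting through a pile of failed results, here is a minimal Python sketch that checks a saved stderr text for the error lines quoted in this thread. The labels and the function name `classify_stderr` are made up for illustration; only the two error strings come from the posted output, and this is not a Rosetta or BOINC tool.

```python
# Error lines copied from stderr output quoted in this thread.
# The dictionary keys are hypothetical labels chosen for this sketch.
KNOWN_SIGNATURES = {
    "inactive option dun08_dir": "ERROR: Value of inactive option accessed: -score:dun08_dir",
    "FoldTree res1 != res2": "ERROR: res1 != res2",
}


def classify_stderr(stderr_text):
    """Return the labels of every known error line found in stderr_text."""
    return [
        label
        for label, needle in KNOWN_SIGNATURES.items()
        if needle in stderr_text
    ]


sample = """Watchdog active.
Fullatom mode ..
ERROR: Value of inactive option accessed: -score:dun08_dir
"""

print(classify_stderr(sample))
```

An empty list only means none of these two lines were found; it does not mean the task succeeded.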

____________
Rosetta Moderator: Mod.Sense

Coolcow

Joined: Nov 23 09
Posts: 1
ID: 359568
Credit: 15,819
RAC: 0
Message 64212 - Posted 25 Nov 2009 11:37:08 UTC

I have the same problem with this error.
The task stops while initializing the calculation.

Paul

Joined: Oct 29 05
Posts: 155
ID: 7397
Credit: 12,429,801
RAC: 1,706
Message 64214 - Posted 25 Nov 2009 12:36:00 UTC - in response to Message ID 64212.

All:

I continue to have WUs fail with computation errors. When I look at the failed WU in the task details, I see this error:

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x005763AF write attempt to address 0x00000027


It looks like this happened with minirosetta 1.98 as well (I found a few posts)

This is a new computer, intel Core i7 and Win 7 64-bit. It would be great to get this CPU working for R@H.

Any help is greatly appreciated.
____________
Thx!

Paul

Sid Celery

Joined: Feb 11 08
Posts: 806
ID: 241409
Credit: 10,030,156
RAC: 9,347
Message 64216 - Posted 25 Nov 2009 15:48:04 UTC

Can I be clear on something:

The problems are with WUs with the name:

lr8_combine_smooth_torsion_it00_rama*

New WU's are coming down with the name:

lr5_combine_smooth_torsion_it00_redo*

Are these new ones ok? I think I aborted one by accident. That was wrong, wasn't it?
____________

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 64219 - Posted 25 Nov 2009 16:46:52 UTC - in response to Message ID 64216.
Last modified: 25 Nov 2009 16:49:25 UTC

Can I be clear on something:

The problems are with WUs with the name:

lr8_combine_smooth_torsion_it00_rama*

New WU's are coming down with the name:

lr5_combine_smooth_torsion_it00_redo*

Are these new ones ok? I think I aborted one by accident. That was wrong, wasn't it?


I've received no specific word, but it sounds very likely, yes. No biggie.

[edit]Yifan posted here confirming the rama batch had a problem in how it was created.
____________
Rosetta Moderator: Mod.Sense

Sid Celery

Joined: Feb 11 08
Posts: 806
ID: 241409
Credit: 10,030,156
RAC: 9,347
Message 64223 - Posted 25 Nov 2009 19:04:21 UTC

That's what I eventually realised before going into an abort-frenzy.

Ok, I think I'm clear and fully re-stocked now.
____________

bruce Profile

Joined: Sep 15 07
Posts: 10
ID: 205458
Credit: 839,797
RAC: 0
Message 64228 - Posted 26 Nov 2009 0:11:52 UTC

I'm also seeing a considerable number of WUs with errors similar to those posted by others recently.

Here is an example of the messages on the client:
11/25/2009 1:10:59 PM rosetta@home Starting sel_core_1.0_low200_beta_low200_nostart_hb_t313__IGNORE_THE_REST_16216_654_1
11/25/2009 1:11:00 PM rosetta@home Starting task sel_core_1.0_low200_beta_low200_nostart_hb_t313__IGNORE_THE_REST_16216_654_1 using minirosetta version 200
11/25/2009 1:12:43 PM rosetta@home Computation for task sel_core_1.0_low200_beta_low200_nostart_hb_t313__IGNORE_THE_REST_16216_654_1 finished
11/25/2009 1:12:43 PM rosetta@home Output file sel_core_1.0_low200_beta_low200_nostart_hb_t313__IGNORE_THE_REST_16216_654_1_0 for task sel_core_1.0_low200_beta_low200_nostart_hb_t313__IGNORE_THE_REST_16216_654_1 absent

Here are some examples from the results:
299750820
299623182
299623178
299623176
299623173
299623157
299621562
299618069
299540989
299540964
299524862
299522937
299509005
299508999
... and 11 more WUs downloaded today (Nov 25) errored with similar results.
24 downloaded yesterday (Nov 24) errored with similar results.

I'll monitor these boards for updates; until then I've suspended further WU downloads.


____________

darkpella

Joined: Sep 27 05
Posts: 13
ID: 1390
Credit: 66,840
RAC: 0
Message 64232 - Posted 26 Nov 2009 8:02:16 UTC - in response to Message ID 64228.

I'm also seeing a considerable number of WUs with errors similar to those posted by others recently.

.....


Similar here with the following WUs:
299885240
299643164
299547442
298811740

stderr is slightly different though. stderr from my WUs is like:
<core_client_version>6.6.38</core_client_version>
<![CDATA[
<message>
Funzione non corretta. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
[2009-11-24 9:22: 6:] :: BOINC:: Initializing ... ok.
[2009-11-24 9:22: 6:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev33769.zip
Unpacking WU data ...
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/yfsong_lr8_combine_smooth_torsion_it00_rama06_A.zip
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/lr8_1shf.out.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Fullatom mode ..
ERROR: Value of inactive option accessed: -score:dun08_dir


</stderr_txt>
]]>


while the one from some of bruce's WUs is like:
<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
[2009-11-25 13:11: 0:] :: BOINC:: Initializing ... ok.
[2009-11-25 13:11: 0:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev33769.zip
Unpacking WU data ...
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/sel_core_1.0_low200_beta_low200_nostart.broker_corebuild.t313_.olange.boinc_files.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.

ERROR: res1 != res2
ERROR:: Exit from: ..\..\src\core\kinematics\FoldTree.cc line: 2342
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish


</stderr_txt>
]]>

____________

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 64245 - Posted 27 Nov 2009 17:10:16 UTC
Last modified: 27 Nov 2009 17:22:19 UTC

darkpella, all four of the tasks you linked are the known problem described here with "...rama..." in the name. These tasks were later corrected and reissued with "...redo..." in the name.
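
A rough way to check a list of task names against the known-bad batch, as a Python sketch. The wildcard pattern is the one quoted earlier in this thread; the helper name `is_known_bad` is hypothetical and the two task names are examples taken from posts above.

```python
from fnmatch import fnmatchcase

# Pattern for the faulty batch, as quoted earlier in this thread.
KNOWN_BAD_PATTERN = "lr8_combine_smooth_torsion_it00_rama*"


def is_known_bad(task_name):
    """True if the task name belongs to the faulty 'rama' batch."""
    return fnmatchcase(task_name, KNOWN_BAD_PATTERN)


# Task names copied from posts in this thread.
tasks = [
    "lr8_combine_smooth_torsion_it00_rama08_A_rlbd_2acy_IGNORE_THE_REST_DECOY_14893_560_1",
    "sel_core_1.0_low200_beta_low200_nostart_hb_t313__IGNORE_THE_REST_16216_654_1",
]

for name in tasks:
    status = "known rama problem" if is_known_bad(name) else "not in the rama batch"
    print(f"{status}: {name}")
```

A task that is not in the rama batch can still fail for another reason, such as the res1 != res2 error reported on the sel_core tasks.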
____________
Rosetta Moderator: Mod.Sense

svincent

Joined: Dec 30 05
Posts: 202
ID: 44923
Credit: 4,404,794
RAC: 6,286
Message 64310 - Posted 30 Nov 2009 17:24:20 UTC

Two tasks failing on Windows 7

300429025 sel_core_1.5_low200_beta_low200_nostart_hb_t297__IGNORE_THE_REST_15865_1407_1
300429024 resa_sel_core_1.5_low200_beta_low200_nostart_hb_t313__IGNORE_THE_REST_16299_98_1

both with the error

ERROR: res1 != res2
ERROR:: Exit from: ..\..\src\core\kinematics\FoldTree.cc line: 2342
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

</stderr_txt>

Link
Avatar

Joined: May 4 07
Posts: 260
ID: 173059
Credit: 338,704
RAC: 3
Message 64311 - Posted 30 Nov 2009 17:25:53 UTC
Last modified: 30 Nov 2009 17:38:35 UTC

I've recently been getting many -1073741819 (0xc0000005) errors:

300067343
300945493 WU: 273907970. My wingman got same error.
301117881
301138967
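
The two numbers in that error are the same 32-bit value, once read as a signed integer and once as unsigned hex. A small Python sketch of the conversion (the function names are just for illustration); 0xc0000005 is the access violation code shown in the unhandled exception report posted earlier in this thread.

```python
def to_unsigned32(code):
    """Reinterpret a signed 32-bit exit code as an unsigned value."""
    return code & 0xFFFFFFFF


def to_signed32(code):
    """Reinterpret an unsigned 32-bit value as a signed exit code."""
    code &= 0xFFFFFFFF
    return code - 0x100000000 if code >= 0x80000000 else code


print(hex(to_unsigned32(-1073741819)))
print(to_signed32(0xC0000005))
```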
____________
.

Sid Celery

Joined: Feb 11 08
Posts: 806
ID: 241409
Credit: 10,030,156
RAC: 9,347
Message 64323 - Posted 1 Dec 2009 15:28:57 UTC - in response to Message ID 64310.

Two tasks failing on Windows 7

300429025 sel_core_1.5_low200_beta_low200_nostart_hb_t297__IGNORE_THE_REST_15865_1407_1
300429024 resa_sel_core_1.5_low200_beta_low200_nostart_hb_t313__IGNORE_THE_REST_16299_98_1

both with the error

ERROR: res1 != res2
ERROR:: Exit from: ..\..\src\core\kinematics\FoldTree.cc line: 2342
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

</stderr_txt>

My W7 laptop is error-free, but my Vista desktop had a few of the same errors:

sel_core_1.5_low200_beta_low200_nostart_hb_t297__IGNORE_THE_REST_15865_1075_0
sel_core_1.5_low200_beta_low200_nostart_hb_t313__IGNORE_THE_REST_15870_2182_1
sel_core_1.5_low200_beta_low200_nostart_hb_t313__IGNORE_THE_REST_15870_9716_1

All other WUs are fine.
____________

svincent

Joined: Dec 30 05
Posts: 202
ID: 44923
Credit: 4,404,794
RAC: 6,286
Message 64345 - Posted 3 Dec 2009 0:58:45 UTC

Some more failures on Win 7, all with the res1 != res2 error

resa_sel_core_1.5_low200_beta_low200_nostart_hb_t331__IGNORE_THE_REST_16303_113_1
rsel_core_1.5_low200_beta_low200_nostart_hb_t297__IGNORE_THE_REST_15865_7931_1
rsel_core_1.5_low200_beta_low200_nostart_hb_t313__IGNORE_THE_REST_15870_8110_1

and this one

sel_core_1.5_low200_beta_low200_nostart_hb_t297__IGNORE_THE_REST_15865_7946_0

which gave the same res1 != res2 error but ran for half an hour and returned an error status of success.

Again, it seems it's those tasks with t331 and t297 in their names that are causing problems.
____________

P . P . L .
Avatar

Joined: Aug 20 06
Posts: 581
ID: 105843
Credit: 4,864,105
RAC: 0
Message 64368 - Posted 4 Dec 2009 2:57:19 UTC

Mod Sense, Here you go.

These are the only ones still in my list.

Credit was about normal, mostly get less than claimed anyway.

Don't think there were any double headers as you call them, some may have restarted.
===============================================================
This one did 135 models. - CC_101.22 / GC_83.32

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=275075567
---------------------------------------------------------------
This did 112. - CC_101.69 / GC_81.83

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=274576093
---------------------------------------------------------------
This did 153. - CC_102.90 / GC_86.03

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=273886738
---------------------------------------------------------------
This did 116. - CC_103.40 / GC_85.25

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=273350890

All with mini 2.00, I've had some with older versions too.

____________


robertmiles Profile

Joined: Jun 16 08
Posts: 658
ID: 264600
Credit: 3,743,487
RAC: 7,157
Message 64376 - Posted 4 Dec 2009 19:15:56 UTC

A few recent minirosetta 2.00 workunits that went beyond
the usual 100 decoys limit:

http://boinc.bakerlab.org/rosetta/result.php?resultid=301662056

http://boinc.bakerlab.org/rosetta/result.php?resultid=301552794

http://boinc.bakerlab.org/rosetta/result.php?resultid=301231750

http://boinc.bakerlab.org/rosetta/result.php?resultid=301041709

http://boinc.bakerlab.org/rosetta/result.php?resultid=300975639

http://boinc.bakerlab.org/rosetta/result.php?resultid=300923679

http://boinc.bakerlab.org/rosetta/result.php?resultid=300923678

http://boinc.bakerlab.org/rosetta/result.php?resultid=300745181

http://boinc.bakerlab.org/rosetta/result.php?resultid=300695511

http://boinc.bakerlab.org/rosetta/result.php?resultid=300688521

http://boinc.bakerlab.org/rosetta/result.php?resultid=300574451

http://boinc.bakerlab.org/rosetta/result.php?resultid=300412255

http://boinc.bakerlab.org/rosetta/result.php?resultid=300278675

http://boinc.bakerlab.org/rosetta/result.php?resultid=300272462

No definite problem; those that got less credit than usual also used
less CPU time than usual.

aguiar@carrier.com.br

Joined: Feb 19 06
Posts: 6
ID: 59950
Credit: 357,496
RAC: 0
Message 64392 - Posted 7 Dec 2009 9:20:48 UTC

Good morning!

I have WU 3gbm_3g0l_0264_revert.php_dock_rmsd.xml__16270_181_1 now elapsed 13:25:10 with 0.789% progress. Should I let it go or delete it?

Thanks,

Valter Aguiar
Brazil.
____________

Sid Celery

Joined: Feb 11 08
Posts: 806
ID: 241409
Credit: 10,030,156
RAC: 9,347
Message 64395 - Posted 7 Dec 2009 13:41:47 UTC - in response to Message ID 64392.

I have WU 3gbm_3g0l_0264_revert.php_dock_rmsd.xml__16270_181_1 now elapsed 13:25:10 with 0.789% progress. Should I let it go or delete it?

With a 3-hour default runtime the watchdog ought to have closed it down already, but if you click properties on that WU I would expect the CPU time is minimal, so something seems to have stalled with that one. I'd abort it and hope the next person that picks it up has more success with it.
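
As a rough illustration of that check, here is a Python sketch that compares elapsed time with CPU time. The thresholds (one hour minimum elapsed, 5% CPU share) are arbitrary assumptions for this sketch and not anything BOINC or the watchdog uses, and the CPU time of 3 minutes is an assumed figure for illustration.

```python
def hms_to_seconds(hms):
    """Convert an 'HH:MM:SS' string, as shown by BOINC, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def looks_stalled(elapsed_s, cpu_s, min_elapsed_s=3600, max_cpu_fraction=0.05):
    """Flag a task whose CPU time is a tiny share of its elapsed time.

    Both thresholds are assumptions chosen for this sketch.
    """
    if elapsed_s < min_elapsed_s:
        return False  # too early to judge
    return cpu_s / elapsed_s < max_cpu_fraction


elapsed = hms_to_seconds("13:25:10")  # elapsed time from the post above
cpu = 3 * 60                          # assumed CPU time, for illustration
print(elapsed, looks_stalled(elapsed, cpu))
```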
____________

aguiar@carrier.com.br

Joined: Feb 19 06
Posts: 6
ID: 59950
Credit: 357,496
RAC: 0
Message 64396 - Posted 7 Dec 2009 14:19:21 UTC

Done, thanks. You were right, only 3 min of CPU time.

Valter.
____________

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3389
ID: 106194
Credit: 0
RAC: 0
Message 64397 - Posted 7 Dec 2009 14:42:16 UTC
Last modified: 7 Dec 2009 14:47:50 UTC

...and so it becomes a question of whether your machine has something else going on at a higher priority that is causing BOINC not to get any CPU time? Or is there a problem with BOINC or the task?

All other things being equal, starting a new task would also be impacted by other activity on the system (assuming the other activity is still running). Is your next task running normally? (i.e. check properties or task manager and see how many actual CPU seconds it has now used).

[edit] I don't see this task in your results and off-hand, the naming doesn't look like a Rosetta task. Can you post a link?
____________
Rosetta Moderator: Mod.Sense

Sid Celery

Joined: Feb 11 08
Posts: 806
ID: 241409
Credit: 10,030,156
RAC: 9,347
Message 64399 - Posted 7 Dec 2009 17:46:13 UTC - in response to Message ID 64397.

...and so it becomes a question of whether your machine has something else going on at a higher priority that is causing BOINC not to get any CPU time? Or is there a problem with BOINC or the task?

All other things being equal, starting a new task would also be impacted by other activity on the system (assuming the other activity is still running). Is your next task running normally? (i.e. check properties or task manager and see how many actual CPU seconds it has now used).

[edit] I don't see this task in your results and off-hand, the naming doesn't look like a Rosetta task. Can you post a link?

It appears to be this one:
3gbm_3g0l_0264_revert.pdb_dock_rmsd.xml__16270_181_1

I've seen this kind of thing very occasionally, even while other WUs appear to be running fine. In this case Valter appears to have been the wingman where the original cruncher failed as well.
____________

svincent

Joined: Dec 30 05
Posts: 202
ID: 44923
Credit: 4,404,794
RAC: 6,286
Message 64404 - Posted 8 Dec 2009 3:15:26 UTC

I've had a couple of tasks with names like 3a9bB* fail on Windows 7. In both cases I had to abort them because no progress was being made: they weren't getting any CPU time. My wingman in both cases successfully completed the tasks, one on Mac OS X and the other on Win XP. The first is reported above; the second is 271436170.

AdeB Profile
Avatar

Joined: Dec 12 06
Posts: 45
ID: 135244
Credit: 2,473,178
RAC: 1,976
Message 64406 - Posted 8 Dec 2009 10:10:47 UTC

Validate errors in workunits with the name: mix_score13_hb_rlbd_1ttz__IGNORE_THE_RESTlr13_DECOY_16324_*

- 1. ----------------------------------------------------------
Task: 303144429
Workunit: mix_score13_hb_rlbd_1ttz__IGNORE_THE_RESTlr13_DECOY_16324_936_0
CPU time: 85.64598
stderr out:
...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Fullatom mode ..
# cpu_run_time_pref: 43200
======================================================
DONE :: 1 starting structures 1201 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish

- 2. ----------------------------------------------------------
Task: 302775198
Workunit: mix_score13_hb_rlbd_1ttz__IGNORE_THE_RESTlr13_DECOY_16324_508_1
CPU time: 75.6415
stderr out:
...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Fullatom mode ..
# cpu_run_time_pref: 43200
======================================================
DONE :: 1 starting structures 1201 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down cleanly ...
called boinc_finish


AdeB
____________

Yifan Song
Forum moderator
Project administrator
Project developer
Project scientist

Joined: May 26 09
Posts: 62
ID: 318024
Credit: 7,322
RAC: 0
Message 64414 - Posted 9 Dec 2009 0:23:35 UTC - in response to Message ID 64406.

Validate errors in workunits with the name: mix_score13_hb_rlbd_1ttz__IGNORE_THE_RESTlr13_DECOY_16324_*
...
AdeB


Thanks!
There was a bug when we combined lr5, 8, 10 and 13 to make a large test. As a result, a few lr13 ones ended up with an input file that was too small and ran too fast for the validation server.
This should be fixed soon.
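
For anyone who wants to spot these short runs in their own results, a minimal Python sketch that pulls the summary numbers out of a stderr excerpt like the ones AdeB posted. The regular expressions are written against those two quoted lines only, and `parse_summary` is a hypothetical helper, not part of minirosetta or BOINC.

```python
import re

# Patterns written against the summary lines quoted in this thread.
DONE_RE = re.compile(r"DONE ::\s+(\d+) starting structures\s+(\d+) cpu seconds")
DECOY_RE = re.compile(r"This process generated\s+(\d+) decoys from\s+(\d+) attempts")


def parse_summary(stderr_text):
    """Extract the run summary from stderr, or return None if it is absent."""
    done = DONE_RE.search(stderr_text)
    decoys = DECOY_RE.search(stderr_text)
    if not (done and decoys):
        return None
    return {
        "starting_structures": int(done.group(1)),
        "cpu_seconds": int(done.group(2)),
        "decoys": int(decoys.group(1)),
        "attempts": int(decoys.group(2)),
    }


sample = """# cpu_run_time_pref: 43200
======================================================
DONE :: 1 starting structures 1201 cpu seconds
This process generated 1 decoys from 1 attempts
======================================================
"""

print(parse_summary(sample))
```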

Greg_BE Profile
Avatar

Joined: May 30 06
Posts: 4835
ID: 85645
Credit: 2,969,735
RAC: 81
Message 64456 - Posted 13 Dec 2009 10:53:56 UTC

lr8_combine_smooth_torsion_it00_rama06_A_rlbd_1tul_IGNORE_THE_REST_DECOY_14891_644_2

It ran 5 secs and crashed with this:

BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Fullatom mode ..
ERROR: Value of inactive option accessed: -score:dun08_dir

</stderr_txt>
]]>


Copyright © 2017 University of Washington

Last Modified: 10 Nov 2010 1:51:38 UTC