Rosetta@home

Rosetta@Home version 3.26

Message boards : Number crunching : Rosetta@Home version 3.26

cmiles

Joined: Jan 4 11
Posts: 9
ID: 407423
Credit: 0
RAC: 0
Message 72675 - Posted 5 Apr 2012 17:30:13 UTC

Rosetta@Home has been updated to version 3.26. If you encounter any problems, please let us know. Thank you for your continued support.

This update improves performance of the hybrid protocol for comparative modeling on symmetric targets.

Alberto

Joined: Jan 21 09
Posts: 1
ID: 298021
Credit: 88,596
RAC: 43
Message 72683 - Posted 5 Apr 2012 21:57:27 UTC - in response to Message ID 72675.

Rosetta@Home has been updated to version 3.26. If you encounter any problems, please let us know. Thank you for your continued support.

This update improves performance of the hybrid protocol for comparative modeling on symmetric targets.


Hi,
today I downloaded some new jobs of Rosetta Mini 3.24 and I can't upload the results.
Is it possible that the issue is the new version of project?
Thanks

Rocco Moretti

Joined: May 18 10
Posts: 66
ID: 381114
Credit: 585,745
RAC: 0
Message 72685 - Posted 6 Apr 2012 0:04:27 UTC - in response to Message ID 72683.

today I downloaded some new jobs of Rosetta Mini 3.24 and I can't upload the results.


During updates, the servers get busy from everyone automatically downloading the updates. Boinc should keep retrying to send the results, and they should get through later, when the servers are less busy.
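The retry behaviour described above can be pictured as a capped exponential backoff with jitter. This is a minimal sketch of that general technique, not BOINC's actual scheduler: the function name, the 60-second base, the 4-hour cap and the jitter range are all assumptions chosen for illustration.

```python
import random


def backoff_delays(base=60.0, cap=4 * 3600.0, attempts=6, rng=None):
    """Return hypothetical retry delays in seconds: doubling, capped, with jitter."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        # The upper bound doubles on every failed attempt until it hits the cap.
        ceiling = min(cap, base * (2 ** attempt))
        # Jitter spreads clients out so they do not all retry at the same moment.
        delays.append(rng.uniform(ceiling / 2, ceiling))
    return delays


if __name__ == "__main__":
    for i, delay in enumerate(backoff_delays(), start=1):
        print(f"retry {i}: wait about {delay:.0f} s")
```

The point of the growing, randomized delay is that a busy server sees fewer simultaneous retries over time, which is why uploads tend to get through once the update rush has passed.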

Greg_BE Profile

Joined: May 30 06
Posts: 4835
ID: 85645
Credit: 2,948,921
RAC: 243
Message 72691 - Posted 6 Apr 2012 23:55:02 UTC

Rocco, everything is running nicely now with this version.
The credits graph is climbing back up nicely now.
Well done to the team.

Snagletooth

Joined: Feb 22 07
Posts: 192
ID: 149031
Credit: 1,396,123
RAC: 1,318
Message 72695 - Posted 7 Apr 2012 20:26:58 UTC
Last modified: 7 Apr 2012 20:27:46 UTC

T0535_boinc_casp9_abinitio_smooth_abrelax_smooth_cmiles_SAVE_ALL_OUT_46221_481

both copies failed immediately with:

ERROR: unrecognized aa UNX
ERROR:: Exit from: src/core/io/pdb/file_data.cc line: 972
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

P . P . L .

Joined: Aug 20 06
Posts: 581
ID: 105843
Credit: 4,864,105
RAC: 0
Message 72696 - Posted 8 Apr 2012 3:30:54 UTC

Hi.

Got this error after 5 minutes of running.

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=453105707

rb_04_06_29941_60607__t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_46687_40_0

Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
# cpu_run_time_pref: 14400
SIGSEGV: segmentation violation
Stack trace (12 frames):
[0xa9660c7]
[0xf7702400]
[0x8a67102]
[0x8a03fa9]
[0x94100f0]
[0x9412d6a]
[0x958f221]
[0x95f6945]
[0x95f4175]
[0x80547ed]
[0xa9f6058]
[0x8048131]

Exiting...

</stderr_txt>
]]>

____________


Greg_BE Profile

Joined: May 30 06
Posts: 4835
ID: 85645
Credit: 2,948,921
RAC: 243
Message 72710 - Posted 9 Apr 2012 7:11:22 UTC
Last modified: 9 Apr 2012 7:13:26 UTC

Task ID 497264845
Name T0543_boinc_casp9_abinitio_smooth_abrelax_smooth_cmiles_SAVE_ALL_OUT_46232_627_0
CPU time 0
stderr out

<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
Maximum disk usage exceeded
</message>
]]>
application version 3.26

First one of the new version on my system to have problems.
Everything else is running smoothly.

Warped

Joined: Jan 15 06
Posts: 44
ID: 50853
Credit: 1,336,001
RAC: 225
Message 72757 - Posted 14 Apr 2012 13:13:15 UTC

Thank you for improving the interval between checkpoints. For those of us who reboot from time to time, the work lost is now minimal.
____________
Warped

robertmiles Profile

Joined: Jun 16 08
Posts: 656
ID: 264600
Credit: 3,462,248
RAC: 2,198
Message 72761 - Posted 14 Apr 2012 21:44:22 UTC - in response to Message ID 72710.

Task ID 497264845
Name T0543_boinc_casp9_abinitio_smooth_abrelax_smooth_cmiles_SAVE_ALL_OUT_46232_627_0
CPU time 0
stderr out

<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
Maximum disk usage exceeded
</message>
]]>
application version 3.26

First one of the new version on my system to have problems.
Everything else is running smoothly.


This suggests that a small change in future versions of minirosetta could be useful:

Have them send the amount of disk space they are allowed to use to an output file before doing much else. At least the lowest limit that can be determined this early in the workunit.

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3381
ID: 106194
Credit: 0
RAC: 0
Message 72762 - Posted 15 Apr 2012 0:01:40 UTC

All BOINC work units are shipped with a disk space limit. If a work unit runs past this, the BOINC Manager stops it.
____________
Rosetta Moderator: Mod.Sense

robertmiles Profile

Joined: Jun 16 08
Posts: 656
ID: 264600
Credit: 3,462,248
RAC: 2,198
Message 72764 - Posted 15 Apr 2012 0:35:25 UTC - in response to Message ID 72762.

All BOINC work units are shipped with a disk space limit. If a work unit runs past this, the BOINC Manager stops it.


Then just writing that limit to the log file may be enough.
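A minimal sketch of the suggestion above: read the disk limit at startup and write it to the log before doing anything else. It assumes the limit is available in an XML description under an element named `rsc_disk_bound` (the name BOINC uses for the workunit disk bound); the function name `log_disk_bound` and the sample XML are hypothetical, and the real minirosetta application is C++ and would obtain the value through the BOINC API rather than this way.

```python
import logging
import xml.etree.ElementTree as ET


def log_disk_bound(init_data_xml, logger):
    """Log the disk limit found in an init-data style XML string; return it in bytes."""
    root = ET.fromstring(init_data_xml)
    # Assumed element name; returns None if the description does not carry it.
    node = root.find(".//rsc_disk_bound")
    if node is None or not (node.text or "").strip():
        logger.warning("disk limit not found in init data")
        return None
    limit = float(node.text)
    logger.info("disk limit for this work unit: %.0f bytes (%.1f MB)", limit, limit / 1e6)
    return limit


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Hypothetical sample, not taken from a real work unit.
    sample = "<app_init_data><rsc_disk_bound>1000000000.000000</rsc_disk_bound></app_init_data>"
    log_disk_bound(sample, logging.getLogger("startup"))
```

With the limit in the log, a "Maximum disk usage exceeded" failure at CPU time 0 could be compared directly against the bound the task was given.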

[VENETO] boboviz Profile

Joined: Dec 1 05
Posts: 545
ID: 25524
Credit: 1,510,213
RAC: 1,277
Message 72767 - Posted 15 Apr 2012 6:58:22 UTC

A lot of errors like this:
498556225



BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Starting work on structure: _00001
# cpu_run_time_pref: 7200

</stderr_txt>
]]>

Validate state Invalid
____________

Greg_BE Profile

Joined: May 30 06
Posts: 4835
ID: 85645
Credit: 2,948,921
RAC: 243
Message 72768 - Posted 15 Apr 2012 11:56:09 UTC - in response to Message ID 72761.

Task ID 497264845
Name T0543_boinc_casp9_abinitio_smooth_abrelax_smooth_cmiles_SAVE_ALL_OUT_46232_627_0
CPU time 0
stderr out

<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
Maximum disk usage exceeded
</message>
]]>
application version 3.26

First one of the new version on my system to have problems.
Everything else is running smoothly.


This suggests that a small change in future versions of minirosetta could be useful:

Have them send the amount of disk space they are allowed to use to an output file before doing much else. At least the lowest limit that can be determined this early in the workunit.


What I forgot to add was that the CPU time was 0.
This makes it an even more bizarre crash.
http://boinc.bakerlab.org/rosetta/result.php?resultid=497264845

robertmiles Profile

Joined: Jun 16 08
Posts: 656
ID: 264600
Credit: 3,462,248
RAC: 2,198
Message 72780 - Posted 16 Apr 2012 3:48:30 UTC - in response to Message ID 72768.
Last modified: 16 Apr 2012 3:50:02 UTC

Task ID 497264845
Name T0543_boinc_casp9_abinitio_smooth_abrelax_smooth_cmiles_SAVE_ALL_OUT_46232_627_0
CPU time 0
stderr out

<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
Maximum disk usage exceeded
</message>
]]>
application version 3.26

First one of the new version on my system to have problems.
Everything else is running smoothly.


This suggests that a small change in future versions of minirosetta could be useful:

Have them send the amount of disk space they are allowed to use to an output file before doing much else. At least the lowest limit that can be determined this early in the workunit.


What I forgot to add was that the CPU time was 0.
This makes it an even more bizarre crash.
http://boinc.bakerlab.org/rosetta/result.php?resultid=497264845


I'd expect that to mean that it tried to reserve all the space it needed very early in the workunit startup and was unable to do so, so the failure was seen while the CPU time used was still low enough to round to zero.

P . P . L .

Joined: Aug 20 06
Posts: 581
ID: 105843
Credit: 4,864,105
RAC: 0
Message 72782 - Posted 16 Apr 2012 9:06:46 UTC

Hi.

I was the second lucky winner of this one, first run had same error.

http://boinc.bakerlab.org/rosetta/workunit.php?wuid=454867641

T0624_CASP9_br_11_starts_with_abmodels_SAVE_ALL_OUT_47293_33_1

Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.

ERROR: ERROR: FragmentIO: could not open file boinc_input_files/aat000_.3mers.gz
ERROR:: Exit from: src/core/fragment/FragmentIO.cc line: 233
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

</stderr_txt>
]]>

____________


Mike

Joined: Apr 30 09
Posts: 44
ID: 313723
Credit: 65,019
RAC: 0
Message 72791 - Posted 16 Apr 2012 16:48:11 UTC - in response to Message ID 72675.

Rosetta@Home has been updated to version 3.26. If you encounter any problems, please let us know. Thank you for your continued support.

This update improves performance of the hybrid protocol for comparative modeling on symmetric targets.



Please see my thread in the science forum about viewing structure predictions as PDB files. I am posting here since I got no reply there.

Around

Joined: Oct 16 11
Posts: 3
ID: 433464
Credit: 474,717
RAC: 0
Message 72820 - Posted 18 Apr 2012 20:26:51 UTC

With the upgrade to the latest version of BOINC (7.0.25 (x64)), I seem to have a steady stream of "computational errors" with Rosetta Mini 3.26.

Cheers,

Adrian

Torsten Persson

Joined: Feb 11 08
Posts: 5
ID: 241493
Credit: 14,359,146
RAC: 7,852
Message 72822 - Posted 18 Apr 2012 22:40:16 UTC

Hi!
Something happened around March 18. The daily points from my Macs, all three of them, dropped down to approx. 20% of normal and have since stayed there. My Windows machines have not been affected. What's up?

Torsten

Mike

Joined: Apr 30 09
Posts: 44
ID: 313723
Credit: 65,019
RAC: 0
Message 72824 - Posted 18 Apr 2012 22:59:15 UTC

The Result Graphs on the website is NOT WORKING... I hope I don't get the same slow response to this that I did with my other issue.

Greg_BE Profile

Joined: May 30 06
Posts: 4835
ID: 85645
Credit: 2,948,921
RAC: 243
Message 72825 - Posted 18 Apr 2012 23:11:21 UTC - in response to Message ID 72822.

Hi!
Something happened around March 18. The daily points from my Macs, all three of them, dropped down to approx. 20% of normal and have since stayed there. My Windows machines have not been affected. What's up?

Torsten



This thread has some discussion about Macintosh machines. However, the version they were talking about was 3.24, not 3.26. So whatever was happening in 3.24 may not have been fixed in 3.26 when it comes to Macs.

Rocco Moretti

Joined: May 18 10
Posts: 66
ID: 381114
Credit: 585,745
RAC: 0
Message 72835 - Posted 19 Apr 2012 18:32:32 UTC - in response to Message ID 72825.

This thread has some discussion about Macintosh machines. However, the version they were talking about was 3.24, not 3.26. So whatever was happening in 3.24 may not have been fixed in 3.26 when it comes to Macs.


I can confirm that 3.26 has the same Mac slowdown issue that 3.24 does.

We have some leads now on the issue, so <deity of your choice (or absence thereof)> willing, we may be able to correct things by the next release, though no promises.

Rocco Moretti

Joined: May 18 10
Posts: 66
ID: 381114
Credit: 585,745
RAC: 0
Message 72836 - Posted 19 Apr 2012 18:36:27 UTC - in response to Message ID 72824.

The Result Graphs on the website is NOT WORKING...


Hi Mike,

To which graph are you referring?

D J Blumer

Joined: Nov 6 05
Posts: 2
ID: 9615
Credit: 11,490,394
RAC: 6,656
Message 72844 - Posted 20 Apr 2012 15:55:23 UTC

As reported widely, the Mac OSX versions of 3.24 and 3.26 are showing extremely poor performance, which is reflected in abnormally low "granted credit" scores compared to all previous versions. The Granted Credit is ~4-5 times lower than the "Claimed Credit" now! Clearly, version 3.26 did not fix the problem introduced by 3.24. At first, I thought this might be a glitch in the scoring system, but after analyzing the result reports, I no longer think that is the case. I have a 21000-second time preference set, and usually a large number of decoys are attempted and finish within the time window on my 3.2 GHz Mac Pro. Many of my results have only one decoy attempt now, so the low granted credit is probably correct.

However, that implies that a serious bug was introduced in v3.24. Is this receiving priority attention from the Rosetta programmers? Not only is the calculation efficiency dramatically reduced, but I worry that the technical accuracy of the results may be poor, also. As a Mac OSX developer, I know there have been lots of changes in Xcode recently and thus more opportunities for code to break that used to work fine previously.
____________

Mike

Joined: Apr 30 09
Posts: 44
ID: 313723
Credit: 65,019
RAC: 0
Message 72846 - Posted 20 Apr 2012 17:54:56 UTC - in response to Message ID 72836.

The Result Graphs on the website is NOT WORKING...


Hi Mike,

To which graph are you referring?



Hey, sorry for blowing a fuse there... It is the results and data plots from active workunits

Thanks
Mike

Greg_BE Profile

Joined: May 30 06
Posts: 4835
ID: 85645
Credit: 2,948,921
RAC: 243
Message 72858 - Posted 22 Apr 2012 15:03:22 UTC

This task died at just under 2 hours due to no heartbeat.

CASP9_fb_benchmark_hybridization_run54_T0532_0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_47951_293_0

All the usual unpacking and checking went ok
then this:
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
# cpu_run_time_pref: 28800
No heartbeat from core client for 30 sec - exiting
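The failure reports posted in this thread so far carry a handful of distinct stderr signatures. As a rough sketch of how such logs could be sorted mechanically, the snippet below matches the literal strings quoted in the posts; the label names and the `classify_stderr` helper are hypothetical, and the list is not a complete catalogue of Rosetta or BOINC error messages.

```python
# Substrings copied from the stderr excerpts quoted in this thread,
# paired with made-up labels for illustration.
SIGNATURES = [
    ("Maximum disk usage exceeded", "disk_limit"),
    ("SIGSEGV", "segfault"),
    ("No heartbeat from core client", "no_heartbeat"),
    ("could not open file", "missing_input"),
    ("unrecognized aa", "bad_residue"),
]


def classify_stderr(text):
    """Return the label of the first known signature found in text, else 'unknown'."""
    for needle, label in SIGNATURES:
        if needle in text:
            return label
    return "unknown"


if __name__ == "__main__":
    print(classify_stderr("No heartbeat from core client for 30 sec - exiting"))
    print(classify_stderr("ERROR: unrecognized aa UNX"))
```

A log that ends without any of these strings, such as the "Validate state Invalid" report earlier in the thread, falls through to "unknown" and would still need to be read by hand.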

Stealth Eagle* Profile

Joined: Jan 1 07
Posts: 2
ID: 138824
Credit: 12,376
RAC: 0
Message 72859 - Posted 22 Apr 2012 16:12:21 UTC

I have noticed that the tasks are requiring about 1/2 Gig of memory to run.
I never noticed this before.
____________

What you do today you will have to live with tonight

robertmiles Profile

Joined: Jun 16 08
Posts: 656
ID: 264600
Credit: 3,462,248
RAC: 2,198
Message 72860 - Posted 22 Apr 2012 20:28:40 UTC

Rosetta Mini 3.26
rb_04_20_30733_61889__round2_t000__0_C1_SAVE_ALL_OUT_IGNORE_THE_REST_48383_2236

Now using 713 KB and has used as much as 963 KB.

Can you give users the option of limiting the number of such
memory-hungry workunits that can be on one of their computers
at any one time? Or would you prefer to recompile them to
run in 64-bit mode instead of 32-bit mode so that, under
64-bit versions of Windows, the workunits can run without any
additional memory being used for the SysWOW64 software
needed to run 32-bit workunits under 64-bit versions of
Windows? Note that if you choose the second option, you may
need to ask the BOINC developers to make it possible to set
a lower limit for memory used by all 32-bit workunits than for
the total memory used by all workunits, or if they prefer,
ensure that every 32-bit workunit gets a separate 4 GB of
32-bit memory space. I've found that my computer tends to slow
down user response quite a bit as the total memory used by
32-bit workunits and other 32-bit programs approaches 3.5 GB,
as I'd expect if all 32-bit programs must fit within a single
4 GB memory space. The computer has a total of 8 GB of memory.

Mod.Sense
Forum moderator
Project administrator

Joined: Aug 22 06
Posts: 3381
ID: 106194
Credit: 0
RAC: 0
Message 72863 - Posted 23 Apr 2012 2:56:34 UTC

robertmiles means "MB" not "KB".

Just a suggestion that may improve your memory usage in the meantime: consider turning off hyperthreading if you have it active. It causes twice as many threads to run, and try to utilize twice as much memory, but doesn't generally yield twice the processing (especially if there is contention for memory, such as you describe for the <4GB space). I'm sure every processor is different. But in years past, with fairly lengthy review of credit per second averages, it has been observed that contention for floating point operations between the 2 hyperthreaded WUs roughly equaled any benefits of running in two threads.

It could be that they've beefed up the floating point processing to be useful to twice as many threads simultaneously, so you'd want to watch your RAC over time to see whether changing the hyperthreading setting is helping or not. But that's a tough one because of specific variations in any given work unit. So, on the other hand, watch the results for a few days; don't panic at a variation any sooner than that.

I'm sure many would be curious to hear your findings.
____________
Rosetta Moderator: Mod.Sense
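The memory side of the hyperthreading argument is simple arithmetic: one task per logical processor, so doubling the logical processors doubles the memory the tasks try to use. The sketch below works that through; the function name and the 4-core machine are assumptions for illustration, and the 963 MB per task is the peak figure reported earlier in the thread (read as MB).

```python
def total_task_memory_mb(physical_cores, per_task_mb, hyperthreading, reserved_cores=0):
    """Estimate memory used when one task runs on each available logical processor."""
    # Hyperthreading presents two logical processors per physical core.
    logical = physical_cores * (2 if hyperthreading else 1)
    # Some users leave one or more processors free for interactive use.
    running = max(logical - reserved_cores, 0)
    return running * per_task_mb


if __name__ == "__main__":
    # Hypothetical 4-core machine, 963 MB per task.
    print(total_task_memory_mb(4, 963, hyperthreading=True))   # 8 tasks
    print(total_task_memory_mb(4, 963, hyperthreading=False))  # 4 tasks
```

On those assumed numbers, the hyperthreaded case asks for roughly 7.7 GB against about 3.9 GB without it, which is the kind of gap that matters on an 8 GB machine.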

robertmiles Profile

Joined: Jun 16 08
Posts: 656
ID: 264600
Credit: 3,462,248
RAC: 2,198
Message 72864 - Posted 23 Apr 2012 5:00:41 UTC - in response to Message ID 72863.

robertmiles means "MB" not "KB".

Just a suggestion that may improve your memory usage in the meantime: consider turning off hyperthreading if you have it active. It causes twice as many threads to run, and try to utilize twice as much memory, but doesn't generally yield twice the processing (especially if there is contention for memory, such as you describe for the <4GB space). I'm sure every processor is different. But in years past, with fairly lengthy review of credit per second averages, it has been observed that contention for floating point operations between the 2 hyperthreaded WUs roughly equaled any benefits of running in two threads.


The computer with the problems doesn't use hyperthreading within BOINC, and I already have BOINC using one less than the number of cores available.

I do have another computer that does use hyperthreading, but does not have this problem. It uses 64-bit Windows 7 Professional and has 16 GB, but is currently using only 3.85 GB. Could that mean that its version of Windows is better at running large numbers of 32-bit programs at once? I may have to watch it to see if it ever goes above 4 GB for all 32-bit programs.

The first computer still has some programs that just aren't ready for installing Windows 7.

Mike

Joined: Apr 30 09
Posts: 44
ID: 313723
Credit: 65,019
RAC: 0
Message 72872 - Posted 24 Apr 2012 11:52:08 UTC - in response to Message ID 72846.

The Result Graphs on the website is NOT WORKING...


Hi Mike,

To which graph are you referring?



Hey, sorry for blowing a fuse there... It is the results and data plots from active workunits

Thanks
Mike



??? BUMP

Rocco Moretti

Joined: May 18 10
Posts: 66
ID: 381114
Credit: 585,745
RAC: 0
Message 72883 - Posted 25 Apr 2012 18:04:29 UTC - in response to Message ID 72846.

The Result Graphs on the website is NOT WORKING...

It is the results and data plots from active workunits


It looks like the server side system that handles it is fundamentally broken (resource issues), and would take a non-trivial amount of effort to fix. Unfortunately, the "fix" we're probably going to implement is to remove that feature/link altogether. Sorry about that.

Mike

Joined: Apr 30 09
Posts: 44
ID: 313723
Credit: 65,019
RAC: 0
Message 72889 - Posted 26 Apr 2012 0:39:47 UTC - in response to Message ID 72883.

The Result Graphs on the website is NOT WORKING...

It is the results and data plots from active workunits


It looks like the server side system that handles it is fundamentally broken (resource issues), and would take a non-trivial amount of effort to fix. Unfortunately, the "fix" we're probably going to implement is to remove that feature/link altogether. Sorry about that.



When you take the fun out of doing a project, you lose customers quickly

Mike

Joined: Apr 30 09
Posts: 44
ID: 313723
Credit: 65,019
RAC: 0
Message 77925 - Posted 12 Feb 2015 17:37:00 UTC
Last modified: 12 Feb 2015 17:39:06 UTC

I know I'm going on this bumping thread thing again.

The not funny part is that it's not working again.

The funny part is that I did a Google search for the problem as usual, came across this thread, and only after reading the whole thing did I realize that I was actually the poster.

Oh.. PS sorry for blowing a fuse yet again in that last post a few years ago. I have learned to be less bitter since.. Although fixing this feature would really make it fun ;).

Copyright © 2017 University of Washington

Last Modified: 10 Nov 2010 1:51:38 UTC