Posts by Thomas Leibold

21) Message boards : Number crunching : Problems with version 5.90/5.91 (Message 49854)
Posted 21 Dec 2007 by Thomas Leibold
Post:

The Rosetta 5.90 task shows cpu time --- and progress 0.040% and 7:54:07 time to completion (I run Rosetta with 8 hour workunits). It has only run for about 10 minutes, so there is no telling yet how it will behave. While I'm typing this the progress and time to completion values have jumped up and down a couple of times (by as much as 2.5% progress and 30 minutes of time to completion), but cpu time remains just dashes.


Shortly after posting the above message I saw a brief flash of the cpu time for the Rosetta 5.90 task (which was confirmed by ps), but the display went back to --- immediately afterwards.

Here are all the lines starting with BOINC in the stdout.txt file for the indefinite Ralph 5.90 task. It seems actual cpu time is stuck at 0.000999 and therefore never approaches 14400 (4 hours). The Watchdog timer isn't kicking in because the client is making progress completing more and more decoys.

BOINC :: [2007-12-19 21:25:55:] :: mode: pose1 :: nstartnum: 1 :: number_of_output: 9999 :: num_decoys: 0 :: pct_complete: 0
BOINC :: [2007-12-19 22:24:24:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 1 :: num_decoys: 1 :: farlx_stage: 0
BOINC :: [2007-12-19 23:17:12:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 2 :: num_decoys: 2 :: farlx_stage: 0
BOINC :: [2007-12-19 23:17:12:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 0.0004995
BOINC :: [2007-12-20 0: 3:15:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 3 :: num_decoys: 3 :: farlx_stage: 0
BOINC :: [2007-12-20 0: 3:15:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 0.000333
BOINC :: [2007-12-20 0:51:41:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 4 :: num_decoys: 4 :: farlx_stage: 0
BOINC :: [2007-12-20 0:51:41:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 0.00024975
BOINC :: [2007-12-20 1:49:10:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 5 :: num_decoys: 5 :: farlx_stage: 0
BOINC :: [2007-12-20 1:49:10:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 0.0001998
BOINC :: [2007-12-20 2:43:43:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 6 :: num_decoys: 6 :: farlx_stage: 0
BOINC :: [2007-12-20 2:43:43:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 0.0001665
BOINC :: [2007-12-20 3:34:52:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 7 :: num_decoys: 7 :: farlx_stage: 0
BOINC :: [2007-12-20 3:34:52:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 0.000142714
BOINC :: [2007-12-20 4:24:11:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 8 :: num_decoys: 8 :: farlx_stage: 0
BOINC :: [2007-12-20 4:24:11:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 0.000124875
BOINC :: [2007-12-20 5:12:53:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 9 :: num_decoys: 9 :: farlx_stage: 0
BOINC :: [2007-12-20 5:12:53:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 0.000111
BOINC :: [2007-12-20 5:55:19:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 10 :: num_decoys: 10 :: farlx_stage: 0
BOINC :: [2007-12-20 5:55:19:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 9.99e-05
BOINC :: [2007-12-20 6:41:22:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 11 :: num_decoys: 11 :: farlx_stage: 0
BOINC :: [2007-12-20 6:41:22:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 9.08182e-05
BOINC :: [2007-12-20 7:46: 5:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 12 :: num_decoys: 12 :: farlx_stage: 0
BOINC :: [2007-12-20 7:46: 5:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 8.325e-05
BOINC :: [2007-12-20 8:30:52:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 13 :: num_decoys: 13 :: farlx_stage: 0
BOINC :: [2007-12-20 8:30:52:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 7.68462e-05
BOINC :: [2007-12-20 9:19:13:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 14 :: num_decoys: 14 :: farlx_stage: 0
BOINC :: [2007-12-20 9:19:13:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 7.13571e-05
BOINC :: [2007-12-20 10: 5:29:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 15 :: num_decoys: 15 :: farlx_stage: 0
BOINC :: [2007-12-20 10: 5:29:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 6.66e-05
BOINC :: [2007-12-20 11: 3:21:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 16 :: num_decoys: 16 :: farlx_stage: 0
BOINC :: [2007-12-20 11: 3:21:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 6.24375e-05
BOINC :: [2007-12-20 11:53:49:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 17 :: num_decoys: 17 :: farlx_stage: 0
BOINC :: [2007-12-20 11:53:49:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 5.87647e-05
BOINC :: [2007-12-20 12:40:32:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 18 :: num_decoys: 18 :: farlx_stage: 0
BOINC :: [2007-12-20 12:40:32:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 5.55e-05
BOINC :: [2007-12-20 13:24:42:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 19 :: num_decoys: 19 :: farlx_stage: 0
BOINC :: [2007-12-20 13:24:42:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 5.25789e-05
BOINC :: [2007-12-20 14:14:23:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 20 :: num_decoys: 20 :: farlx_stage: 0
BOINC :: [2007-12-20 14:14:23:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 4.995e-05
BOINC :: [2007-12-20 15: 6:58:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 21 :: num_decoys: 21 :: farlx_stage: 0
BOINC :: [2007-12-20 15: 6:58:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 4.75714e-05
BOINC :: [2007-12-20 15:51: 4:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 22 :: num_decoys: 22 :: farlx_stage: 0
BOINC :: [2007-12-20 15:51: 4:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 4.54091e-05
BOINC :: [2007-12-20 16:35: 6:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 23 :: num_decoys: 23 :: farlx_stage: 0
BOINC :: [2007-12-20 16:35: 6:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 4.34348e-05
BOINC :: [2007-12-20 17:24: 1:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 24 :: num_decoys: 24 :: farlx_stage: 0
BOINC :: [2007-12-20 17:24: 1:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 4.1625e-05
BOINC :: [2007-12-20 18: 7:38:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 25 :: num_decoys: 25 :: farlx_stage: 0
BOINC :: [2007-12-20 18: 7:38:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 3.996e-05
BOINC :: [2007-12-20 19: 1: 9:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 26 :: num_decoys: 26 :: farlx_stage: 0
BOINC :: [2007-12-20 19: 1: 9:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 3.84231e-05
BOINC :: [2007-12-20 19:50:50:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 27 :: num_decoys: 27 :: farlx_stage: 0
BOINC :: [2007-12-20 19:50:50:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 3.7e-05
BOINC :: [2007-12-20 20:35:27:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 28 :: num_decoys: 28 :: farlx_stage: 0
BOINC :: [2007-12-20 20:35:27:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 3.56786e-05
BOINC :: [2007-12-20 21:31:26:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 29 :: num_decoys: 29 :: farlx_stage: 0
BOINC :: [2007-12-20 21:31:26:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 3.44483e-05
BOINC :: [2007-12-20 22:17: 3:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 30 :: num_decoys: 30 :: farlx_stage: 0
BOINC :: [2007-12-20 22:17: 3:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 3.33e-05
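
For anyone who wants to check their own stdout.txt for the same symptom, here is a minimal sketch. It assumes the line layout shown above (fields separated by " :: ", each field written as "name: value"). The function names are made up for illustration and are not part of Rosetta or BOINC.

```python
# Sketch only: SAMPLE holds three lines copied from the log above.
# In practice the text would be read from the task's stdout.txt.
SAMPLE = """\
BOINC :: [2007-12-19 23:17:12:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 0.0004995
BOINC :: [2007-12-20 0: 3:15:] :: checkpoint_decoys() :: saved decoy info :: attempted_decoys: 3 :: num_decoys: 3 :: farlx_stage: 0
BOINC :: [2007-12-20 22:17: 3:] :: cpu_time_pref: 14400 :: cpu_time: 0.000999 :: cpu_time_per_nstruct: 3.33e-05
"""


def parse_cpu_lines(text):
    """Return one dict of numeric fields per line that reports cpu_time."""
    rows = []
    for line in text.splitlines():
        if not line.startswith("BOINC ::") or " :: cpu_time: " not in line:
            continue
        fields = {}
        # Skip the "BOINC" tag and the timestamp, then split "name: value".
        for part in line.split(" :: ")[2:]:
            key, sep, value = part.partition(": ")
            if sep:
                fields[key] = float(value)
        rows.append(fields)
    return rows


def cpu_time_is_stuck(rows):
    """True if several reports exist and all show the same cpu_time."""
    values = {row["cpu_time"] for row in rows}
    return len(rows) > 1 and len(values) == 1


rows = parse_cpu_lines(SAMPLE)
print(len(rows), cpu_time_is_stuck(rows))
```

With cpu_time frozen, cpu_time_per_nstruct simply shrinks as cpu_time divided by the decoy count (0.000999 / 2 = 0.0004995, 0.000999 / 3 = 0.000333), which matches the values in the log.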
22) Message boards : Number crunching : Problems with version 5.90/5.91 (Message 49853)
Posted 21 Dec 2007 by Thomas Leibold
Post:
I currently have both a Ralph 5.90 and a Rosetta 5.90 running on my home system (SuSE Linux 32-bit dual cpu).

The Ralph 5.90 task shows cpu time 00:00:00 and progress 0.000% and 3:59:02 time to completion (I run Ralph with 4 hour workunits). This task seems to have run for over 24 hours and will probably never finish.

The Rosetta 5.90 task shows cpu time --- and progress 0.040% and 7:54:07 time to completion (I run Rosetta with 8 hour workunits). It has only run for about 10 minutes, so there is no telling yet how it will behave. While I'm typing this the progress and time to completion values have jumped up and down a couple of times (by as much as 2.5% progress and 30 minutes of time to completion), but cpu time remains just dashes.
23) Message boards : Number crunching : Problems with Rosetta version 5.89 (Message 49767)
Posted 17 Dec 2007 by Thomas Leibold
Post:
Task ID 126856692
Name 1lis_WHS_ETABLE_SVM_TESTS-1lis_-frags83__2453_41_0
Workunit 115320625
Created 15 Dec 2007 4:37:53 UTC
Sent 15 Dec 2007 4:38:18 UTC
Received 15 Dec 2007 19:55:18 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 193 (0xc1)
Computer ID 687330
Report deadline 25 Dec 2007 4:38:18 UTC
CPU time 24587.680633
stderr out
<core_client_version>5.10.21</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 28800
# random seed: 1054640
SIGILL: illegal instruction
Stack trace (18 frames):
[0x8da95f7]
[0x8da43ec]
[0xffffe500]
[0x8c18d60]
[0x8775f6f]
[0x877651e]
[0x877a4cf]
[0x878b883]
[0x879128d]
[0x8cfb408]
[0x8b45e94]
[0x8b48f13]
[0x80d8c55]
[0x85f4d25]
[0x8732b67]
[0x8732c12]
[0x8e0d944]
[0x8048111]

Exiting...

</stderr_txt>
]]>

Validate state Invalid
Claimed credit 94.8036591472954
Granted credit 0
application version 5.89

This one died with an illegal instruction trap after more than 6.5 hours (I run with 8 hours per workunit), when it was almost finished. The computer has two Quad-Core Opteron 2346HE processors and 8GB of memory and runs OpenSuSE 10.3 in 64-bit mode. No other errors have been reported from this fairly new system as far as I can tell.
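
A note on reading the "process exited with code" line: judging only from the examples in these posts (193 shown as "0xc1, -63", and 1 shown as "0x1, -255"), the three numbers look like the same 8-bit status in decimal, in hex, and as decimal minus 256. The sketch below just reproduces that arithmetic. It is inferred from those examples and is not a statement about how the BOINC client computes the value. The function name is hypothetical.

```python
def describe_exit(code):
    """Format an exit status like the <message> lines quoted in these posts.

    Assumption: the third number is simply code - 256, as the two
    examples (193 and 1) suggest.
    """
    return f"{code} (0x{code:x}, {code - 256})"


print(describe_exit(193))
print(describe_exit(1))
```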
24) Message boards : Number crunching : Problems with Rosetta version 5.89 (Message 49737)
Posted 16 Dec 2007 by Thomas Leibold
Post:

Gee is this normal here in Rosetta? Things are much more polite in Seti.


I'm not sure what it is that you are considering impolite:
- Developers pushing insufficiently tested applications into the main project Rosetta without giving it enough exposure in the test project Ralph ?
- Project participants complaining about the insufficient testing ?

Mind you, I haven't seen anything yet that indicates that there are problems with 5.89. It is just that those quick releases without proper testing have caused problems before.
25) Message boards : Number crunching : Problems with Rosetta version 5.89 (Message 49717)
Posted 16 Dec 2007 by Thomas Leibold
Post:

So this was one of those one day tests again, I see. Released on ralph on the 12th, and released here on the 13th.

Ludicrous to call that "testing".


I have one system participating in Ralph and it received the first 5.89 workunit late on the 14th. This machine had gotten 'no work from project' from Ralph since the 12th.

That very same server also participates in Rosetta and received the first 5.89 workunit already on the 13th!

I think I agree with the 'ludicrous' sentiment. I find it unacceptable to experience problems in Rosetta that proper testing in Ralph could have avoided.
26) Message boards : Number crunching : Problems with Rosetta version 5.85 (or 5.86 for linux) (Message 49623)
Posted 11 Dec 2007 by Thomas Leibold
Post:
I'm still hoping to get some response regarding the validator errors that have been reported not just by myself but also several other users. I came to Rosetta from another DC project because I was disappointed in the way project members treated reports of problems by users and it sounded at the time that Baker Lab was doing a better job at that. I'm not so sure anymore that this is really the case.

In the meantime here is something else (to be ignored?):


Task ID 125965972
Name BAK_3chy_loop_model_2377_5470_0
Workunit 114511247
Created 11 Dec 2007 2:06:31 UTC
Sent 11 Dec 2007 2:07:35 UTC
Received 11 Dec 2007 10:18:34 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 679308
Report deadline 21 Dec 2007 2:07:35 UTC
CPU time 0
stderr out
<core_client_version>5.10.21</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 28800
ERROR:: Exit from: fragments.cc line: 691

</stderr_txt>
]]>

Validate state Invalid
Claimed credit 0
Granted credit 0
application version 5.86

This one didn't even 'get off the ground' and failed instantly.
27) Message boards : Number crunching : Problems with Rosetta version 5.85 (or 5.86 for linux) (Message 49341)
Posted 3 Dec 2007 by Thomas Leibold
Post:
Is there any way to find out what caused the validate error on workunit 112697569 ?
The server 679308 is a new machine with dual Quad-Core Opteron 2346HE and 16GB of memory running OpenSuSE 10.3 in 64-bit mode. All other results from the server completed without any errors.

The same workunit was assigned to another computer, but that result has not been returned yet.
28) Message boards : Number crunching : Problems with Rosetta version 5.68 (Message 41786)
Posted 3 Jun 2007 by Thomas Leibold
Post:
Out of the first dozen or more workunits with 5.68 (different computers) three failed.
Workunit #s: 73754058, 73753552, 73753540
ResultId #s: 83930186, 83928164, 83928154

The successful 5.68 workunits had 'TREEJUMP' in their name while all the failing ones had 'CHAINBREAKS' and 'ALTSECSTRUCT' in their name.

<core_client_version>5.8.17</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1)
</message>
<stderr_txt>
Graphics are disabled due to configuration...
# cpu_run_time_pref: 28800
trouble finding jump_templates_RNA_basepairs_v2.dat
ERROR:: Exit from: read_paths.cc line: 360

</stderr_txt>
]]>
29) Message boards : Number crunching : Problems with Rosetta version 5.64 (Message 40916)
Posted 14 May 2007 by Thomas Leibold
Post:
Workunit 71222475 is hanging on one of my systems (state is running, but no cpu time accumulates). The following is the contents of stderr.txt:

Graphics are disabled due to configuration...
# cpu_run_time_pref: 28800
# random seed: 2862215
ERROR:: Exit from: pose.cc line: 769
SIGABRT: abort called
Stack trace (21 frames):
[0x8cbf0fb]
[0x8cb9f2c]
[0xffffe420]
[0x8d2a0b4]
[0x8d3ef9f]
[0x8d44005]
[0x8d442e3]
[0x8d14d11]
[0x8d16739]
[0x84aacad]
[0x8d2a5ff]
[0x8cbbbbf]
[0x8063a3d]
[0x8064905]
[0x88baf95]
[0x83402ed]
[0x85b4a7f]
[0x86d8113]
[0x86d81be]
[0x8d22ff4]
[0x8048111]

Exiting...


I'm going to abort this workunit since it is obviously not going to go anywhere.
30) Message boards : Number crunching : Problems with Rosetta version 5.59 (Message 39092)
Posted 6 Apr 2007 by Thomas Leibold
Post:
I'm not sure it really is a 'Problem', better call it an observation. One of my servers was shut down for a couple of days. When I restarted Boinc (this quad (2 dual-core) cpu linux server is dedicated to Rosetta) all 4 Rosetta tasks were restarted and immediately terminated (successfully and for credit), despite the fact that the workunit deadline was still far away and despite being still more than 1 hour short of the requested 8 hour runtime.

The workunits (result ids) are 70914277,
70911561,
70911269,
70910655.

The number of decoys generated (37-42) is lower than the number of nstructs (48-49), but that is to be expected since they didn't run for the full 8 hours.
31) Message boards : Number crunching : Rosetta not wanting to crunch on my Linux Box (Message 38913)
Posted 3 Apr 2007 by Thomas Leibold
Post:
Should this be mentioned in the FAQ under a Linux heading or some such as it has led to many lost days of computation over the months? Perhaps it is there and I just couldn't find it


The recommendation for the preference setting can be found under "Additional Notes" in the Rosetta System Requirements. It doesn't say why, but there are a number of threads where this topic is raised in both "Questions and Answers" and "Message Boards".

Instead of documenting the workaround I would prefer to see this problem fixed.

Given that this is a Rosetta-specific problem (other projects have no such issues when task switching) I'm not sure that waiting for new versions of Boinc is going to magically fix the problem. To be clear, I'm not saying that the cause of the problem is necessarily in Rosetta code. It could be a unique way of using Boinc APIs that triggers a problem inside Boinc code. However, as long as this issue only affects Rosetta, why would any Boinc developer search for the cause in their code (if they even heard about it)? Right or wrong, I'm sure that they will assume it is something in the Rosetta code that is causing it.
32) Message boards : Number crunching : Wasting my time and energy (Message 38387)
Posted 25 Mar 2007 by Thomas Leibold
Post:
F@H last time I knew had no-deadline units you could specify.


Those have been discontinued for some time now. They wouldn't really have helped either since the client would only download one of those. When that one workunit is complete the computer would sit idle until bob transfers another workunit. The only benefit the timeless workunits would have given is that you would still get credit if the workunit is being returned after a long (not indefinite) time.
33) Message boards : Number crunching : Wasting my time and energy (Message 38358)
Posted 25 Mar 2007 by Thomas Leibold
Post:
F@H is none boinc


While F@H is not (yet?) using Boinc and has objectives similar to Rosetta's, for bob's purposes it is even worse. Like Rosetta, the F@H project requires a quick turnaround of its workunits since subsequent workunits depend on the previous results. While Rosetta/Boinc allows some caching of workunits to mitigate the not-always-online network issue, the F@H client permits only one workunit at a time.

Outside of Climate Prediction I don't know of any project that works well with systems that are mostly offline.
34) Message boards : Number crunching : Problems to run rosetta/boinc on debian etch (Message 38038)
Posted 20 Mar 2007 by Thomas Leibold
Post:
Why did you decide to compile your own Boinc client? Unless there are compelling reasons not to use them (compatibility issues), the ready-made binaries are easier to use.

A computer id will be generated if it doesn't exist when you start the client. However, you do need an account id for every project you want to participate in. If you already created an account, that id/key should have been sent to you. If not, then you do need to create an account first.
35) Message boards : Cafe Rosetta : Personal Milestones (Message 38036)
Posted 20 Mar 2007 by Thomas Leibold
Post:
I'm a millionaire :-)

1.000.000 Rosetta Cobblestones!
36) Message boards : Number crunching : Problems with Rosetta version 5.51 (Message 37927)
Posted 17 Mar 2007 by Thomas Leibold
Post:
I don't think I have seen this one before:

<core_client_version>5.4.9</core_client_version>
<message>
process exited with code 1 (0x1)
</message>
<stderr_txt>
Graphics are disabled due to configuration...
# random seed: 1057481
# cpu_run_time_pref: 28800
ERROR:: Exit at: refold.cc line:337

</stderr_txt>

This is the 5.51 Rosetta Linux client on workunit 60792243
37) Message boards : Number crunching : Rosetta not wanting to crunch on my Linux Box (Message 37926)
Posted 17 Mar 2007 by Thomas Leibold
Post:
on this system i have an Athlon XP 1700+, and 512 megs og ram, so it s not like theres not enough system resources avaliable. anyone have any ideas?

For some Rosetta workunits (the recent HINGE series) 512MB is the minimum, but most workunits use much less.


The issue you are reporting is a longstanding and still not resolved issue with Rosetta. The workaround is to change the BOINC General Preferences and enable (set to "Yes") the option:

"Leave applications in memory while suspended?
(suspended applications will consume swap space if 'yes')"

This is always required when using multiple projects since the Rosetta linux client doesn't handle the rescheduling properly unless it is permitted to remain in memory (it fails with segmentation violation but doesn't terminate properly).
38) Message boards : Number crunching : Problems with Rosetta version 5.51 (Message 37784)
Posted 14 Mar 2007 by Thomas Leibold
Post:
All of the RNA workunits processed with Rosetta 5.51 show the same message:

DONE :: 1 starting structures built 30 (nstruct) times
This process generated 0 decoys from 0 attempts

Also, all of them are running for much less time than the 8 hours I have in my preferences (the range is about 10 minutes to 2 hours). Most of them show a status (outcome/client state) of "Success Done", but there are also some with "Validate Error Done". Those with validation errors also fail the second time, when they are assigned to someone else.

All of those results are from Rosetta. I never got any 5.51 workunits from Ralph (RNA or otherwise).
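
To spot such empty results in bulk, something along these lines could be run over saved result output. It is only a sketch: it assumes the summary line has exactly the wording quoted above, and the function name is made up.

```python
import re

# The two summary lines quoted above, used as sample input.
SAMPLE = """\
DONE :: 1 starting structures built 30 (nstruct) times
This process generated 0 decoys from 0 attempts
"""

# Assumed wording of the summary line; adjust if other versions differ.
DONE_RE = re.compile(r"This process generated (\d+) decoys from (\d+) attempts")


def decoy_counts(text):
    """Return (decoys, attempts) from the summary line, or None if absent."""
    match = DONE_RE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


counts = decoy_counts(SAMPLE)
print(counts, "empty result" if counts == (0, 0) else "")
```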
39) Message boards : Number crunching : Change in CPUs (Message 37736)
Posted 12 Mar 2007 by Thomas Leibold
Post:
That is very weird:
At 6:25:27 PM you get preferences from your Boinc Account Manager that indicate that the Number of usable CPUs changed from 1 to 2 and also saying that the last time you changed preferences was at 18:25:35 (same day).
That part is fine as long as we assume that your local PC clock is a few seconds slow (or the time at the BAM server is a little fast).
Then at 6:25:45 PM Boinc is getting the preferences again, only this time the Boinc Account Manager responds that the number of CPUs changed from 2 to 1 and that the last time you changed your preferences was at 18:24:06 (which is ***older*** than the previously stated date/time of last change)!!!

Whatever the reason, it does appear that your problem has its roots at the Boinc Account Manager (bam.boincstats.com).

One speculative possibility might be that bam.boincstats.com resolves to multiple servers that aren't perfectly synchronized and therefore allow the occasional release of stale (old) preferences.
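
The inconsistency boils down to comparing the two "last changed" times reported by the account manager. A minimal sketch, using the two times from the log discussed above and assuming both fall on the same day (the function name is hypothetical):

```python
from datetime import datetime


def is_stale(previous_change, new_change):
    """True if a later fetch reports an older 'last changed' time."""
    return new_change < previous_change


# "Last changed" times as reported by the two consecutive fetches.
first_fetch = datetime.strptime("18:25:35", "%H:%M:%S")
second_fetch = datetime.strptime("18:24:06", "%H:%M:%S")

print(is_stale(first_fetch, second_fetch))
```

A client-side check like this would at least let stale preferences be ignored instead of flipping the CPU count back.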
40) Message boards : Number crunching : UPDATE BOINC............ (Message 37718)
Posted 12 Mar 2007 by Thomas Leibold
Post:
The post by Vijay Pande about the public beta for FAH is still on their Forum. I used to participate in F@H expecting the Boinc client 'soon' (as another post promised) and nothing happened.
Perhaps the hostility towards Boinc from a number of outspoken F@H participants is a factor in this. The Stanford - Berkeley rivalry is hard to understand for someone like me who didn't grow up in this country, let alone the San Francisco Bay Area.

Instead my team (DSLReports.com Team Helix) adopted Rosetta/Boinc as an additional project and I moved all my resources over to it. I'd consider contributing to F@H again if they ever release the promised Boinc client, but in the meantime their loss is Rosetta's gain (and if you look at my stats, it isn't all that insignificant) :-)





©2024 University of Washington
https://www.bakerlab.org