Posts by Prom

1) Message boards : Rosetta@home Science : FSHD (Message 76556)
Posted 26 Mar 2014 by Prom
Post:
Has Rosetta ever done any work in relation to FSHD? Despite being the most prevalent form of muscular dystrophy, estimated to affect at least 1 in 15,000 people, it has been the most poorly researched and understood MD. Only since 2010, when the full mechanism of the pathology was discovered, have research avenues for treatments opened up. It would be great to have Rosetta on board in helping find a cure for this disease.
2) Message boards : Rosetta@home Science : Need some information please (Message 76555)
Posted 25 Mar 2014 by Prom
Post:
Basically, yes.

When, for instance, a new protein is designed, a synthetic gene has to be ordered to make it. Rosetta is successfully used to predict whether a candidate will fold into the desired structure for treating a target disease. In science every discovery helps, even discovering that you're wrong. Knowing that a structure was never a candidate means no further time and money will be wasted on it.

Of course, many candidates also turn out to be 'duds', but because they are tested and eliminated much earlier, treatments arrive sooner than with traditional methods. The algorithm is also being refined, using known structures to make predictions quicker and more accurate.

So don't stop crunching those WUs.
3) Message boards : Number crunching : AMD VR INTEL (Message 71452)
Posted 21 Oct 2011 by Prom
Post:
The main problem I see with Bulldozer is that it has only four floating point units.
Each pair of integer cores shares one FPU, even though the FPUs have been `optimized` to take account of this and do have surprising performance as they are.
With BOINC projects that are FPU intensive, surely this will be a bottleneck for us.
In the few tests published so far, the 1100T can keep up in many of the ways we are likely to use these CPUs: under 24/7 100% load, with eight work units trying to use the limited number of FPUs at the same time. Only time will tell, once someone is brave enough to buy one and put it on Rosetta to see what it can do in real-world tests.
I would still like to buy one but will wait a bit longer to see how things work out.

There is also a thread on seti@home about Bulldozy
http://setiathome.berkeley.edu/forum_thread.php?id=65764


I'm fairly sure I recall someone stating that Rosetta has more integer work than float. Not sure if that's true, and I can't find the thread but have found this: http://boinc.bakerlab.org/rosetta/forum_thread.php?id=4907...

Either way it doesn't really matter. Bulldozer has one shared FPU per module for the new 256-bit instructions, but for the traditional 128-bit instructions it's still 2 FPUs, so no difference for Rosetta or any other app currently available. It would therefore make sense to restrict all the BOINC apps to 128-bit only.

I would like to see some Bulldozer Rosetta benchmarks as opposed to the synthetic ones we are seeing now.
4) Message boards : Number crunching : Not getting work. (Message 69380)
Posted 16 Jan 2011 by Prom
Post:
No, it's still skipping regularly here with server saying "No work from project" or just not sending any tasks. When it does not do that most of the time it says "(won't finish in time) Computer on 4.2% of time, BOINC on 99.4% of that, this project gets 100.0% of that".

The 4.2% started at 3% this morning. I don't understand it. Due to my previous problem it created a new "computer" and it's been on and running for about 12 hours a day since then. I can do at least 12 times what it's willing to give me.

I've even set my workunit runtime to 1 hour and it still says that. They still run for 6 hours even after clicking update, so the server seems to ignore that setting or doesn't send it to my client.
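
For a rough sense of why the server might say "won't finish in time", the three percentages in that message can simply be multiplied together. This is only a back-of-the-envelope sketch: it assumes the scheduler scales a task's runtime by the fraction of time the host is available, which may not be exactly how BOINC computes it. The function names are made up for illustration.

```python
def effective_fraction(on_frac, boinc_frac, project_share):
    """Fraction of wall-clock time the project actually gets on this host."""
    return on_frac * boinc_frac * project_share


def wall_hours_needed(cpu_hours, fraction):
    """Wall-clock hours needed to deliver cpu_hours at the given fraction."""
    return cpu_hours / fraction


# Values quoted in the server message: 4.2%, 99.4%, 100.0%.
frac = effective_fraction(0.042, 0.994, 1.0)
hours = wall_hours_needed(6.0, frac)  # the tasks here run for 6 hours

print(f"effective fraction: {frac:.4f}")
print(f"wall-clock hours for a 6-hour task: {hours:.1f}")
```

Under that assumption a 6-hour task would need roughly six days of wall-clock time at 4.2%, while a host that is really on for about half the day would need only about 12 hours.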
5) Message boards : Number crunching : Not getting work. (Message 69361)
Posted 15 Jan 2011 by Prom
Post:
I got four tasks, one at a time, starting early yesterday even before the announcement on the home page. Each of them completed, uploaded and reported normally.

That changed today. I am now not getting any new tasks.

You can assume that this problem is related to the recent system failures, but I'm going to try to let the managers know when there is what might be a new problem. They have been most appreciative of my letting them know about such problems in the past, and that's what I'm doing now.


Same here. Tried to set my stash to 20 days and decrease the runtime but still to no avail. Now I've set the few I got to 1 day just in case. How does one contact the managers? They don't seem to read these forums.

I've also had a problem with the merge function wiping my computer credit instead of merging it. Now it looks like my computer did almost nothing while it actually did all the work. I also got the standard "Didn't you see..." response. Nobody seems to realise these are additional problems and not solely because of the crash. They may be related but they will not magically disappear once everything is restored.

Backups are meant for one thing alone - DATA. All programs and settings should be restored by clean installs. I can't say for sure this was not done, but I have a sneaking suspicion the backups restored "functionality" they shouldn't have restored.
6) Message boards : Number crunching : Problems with web site (Message 69339)
Posted 14 Jan 2011 by Prom
Post:
They were there and they did show. It doesn't sound like anybody understands so let me illustrate:

Total Credit for 768175 before merge: 181,754.28
Total Credit for 965812 before merge: 6,008.17
Total Credit for 1003570 before merge: 0
Total Credit for 1026651 before merge: 0 <--- computer presumably being merged to

Total Credit for 768175 after merge: 0
Total Credit for 965812 after merge: 0
Total Credit for 1003570 after merge: 0
Total Credit for 1026651 after merge: 0 <--- notice no credit

So their credit was zeroed but it was never added to the destination host. They were effectively just removed. This looks like a software bug. If it can't be fixed right now I understand, but the merge function should then be disabled until it can.
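
To make the expectation concrete, here is a minimal sketch of what a merge should do with the numbers above: fold each source host's credit into the destination so the total stays the same. The `merge_hosts` function is hypothetical and is not taken from the BOINC server code.

```python
def merge_hosts(credits, sources, dest):
    """Return a new credit table with every source host folded into dest."""
    merged = dict(credits)
    for host in sources:
        # Move the source host's credit to the destination, then drop the source.
        merged[dest] = merged.get(dest, 0.0) + merged.pop(host, 0.0)
    return merged


before = {768175: 181754.28, 965812: 6008.17, 1003570: 0.0, 1026651: 0.0}
after = merge_hosts(before, [768175, 965812, 1003570], 1026651)

# A correct merge conserves the total; the result I got on the site did not.
assert abs(sum(after.values()) - sum(before.values())) < 0.01
print(after)
```

In this sketch only host 1026651 is left, holding the combined credit of the other three.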
7) Message boards : Number crunching : Problems with web site (Message 69308)
Posted 14 Jan 2011 by Prom
Post:
So I rejoin after almost 2 years of inactivity. I see 4 computers and only one of them with credit or perhaps another one with a little in the single figures. I can't remember there being 4 when I left. Could this be because of the backup? Anyway I try to merge them and they seem to have disappeared instead. Is this from the outage or is it a bug in the software?

Only the newest shows up. I can still access the other 3 by their numbers, only they have now been anonymized, all with zero credit. My account still looks like it has the total credit. People will now think I cheated having the credit but no computers that actually did it. :D Though I don't see how one could. I seem to have rejoined at the worst time.

So is this something that will automatically be fixed or should I just leave it and forget about it?


So you haven't noticed that Rosetta@Home recently had a file system crash bad enough that they expect it to take some time to get everything back to normal?

You could always try doing at least one more workunit, and seeing if that's enough to restore whichever computer you use to your records. But note that a number of other people who rejoined after an absence long enough that they needed to download a new version of the application software have had problems doing that download, since the restore from the file system crash hadn't proceeded far enough to restore that software from the backup they're using.

Hopefully, they can fix whatever other problems you're seeing, once the restore from the backup gets far enough.

Yes I did. But I don't think you understand my problem. The computers were all there and it looked like the totals added up. The only strange thing was that there were 4 when I was only using 1 during that period so I assumed it was a side-effect from the restore.

No problem so far and easy to solve, or so I thought. Then when I did the merge they just poof disappeared. Since I'm currently studying how databases work myself, I don't see this as a problem with the backup. The data WAS all there after the restore until I did the merge, then it disappeared. The app and workunits and everything are working fine, but I think anyone doing a merge, or who did one then, would have the same problem. This was something I did on the website alone and that is where it broke.
8) Message boards : Number crunching : Problems with web site (Message 69302)
Posted 14 Jan 2011 by Prom
Post:
So I rejoin after almost 2 years of inactivity. I see 4 computers and only one of them with credit or perhaps another one with a little in the single figures. I can't remember there being 4 when I left. Could this be because of the backup? Anyway I try to merge them and they seem to have disappeared instead. Is this from the outage or is it a bug in the software?

Only the newest shows up. I can still access the other 3 by their numbers, only they have now been anonymized, all with zero credit. My account still looks like it has the total credit. People will now think I cheated having the credit but no computers that actually did it. :D Though I don't see how one could. I seem to have rejoined at the worst time.

So is this something that will automatically be fixed or should I just leave it and forget about it?
9) Message boards : Number crunching : Problems with Rosetta version 5.85 (or 5.86 for linux) (Message 49042)
Posted 25 Nov 2007 by Prom
Post:
I aborted the 5.85 workunits. There weren't hundreds, they were in the minority. Hopefully this will be sorted when I need more.
10) Message boards : Number crunching : Problems with Rosetta version 5.85 (or 5.86 for linux) (Message 48995)
Posted 24 Nov 2007 by Prom
Post:
There is something seriously wrong with the application. It's not a virtual memory hog, it's a complete memory hog. I don't have virtual memory and have 2GiB of physical memory. Rosetta now uses 800MB of that for one wu. With the 600MB normally used this is not enough to run two or sometimes even one. I am aborting all version 5.85 tasks until they are disabled.
11) Message boards : Number crunching : Problems with Rosetta version 5.80 (Message 46656)
Posted 20 Sep 2007 by Prom
Post:
I don't really know if this is a 5.8 problem, a BOINC problem, or what..so..if I'm in the wrong thread, feel free to move this elsewhere.

Just happened to notice that a WU had just completed, so went to my results page to check credit, and in less than 5 minutes, the result was nowhere to be found. All my prior WU's from today's work are there, and my "in progress" work is there, but not this one. Here's the log (edited for brevity):

9/19/2007 2:33:13 PM|rosetta@home|Starting 1a68__SEARCH_PAIRINGS_ROUND2_RESCORE_150_SAVE_ALL_OUT_-1a68_-_BARCODE__2050_7166_0
9/19/2007 2:33:14 PM|rosetta@home|Starting task 1a68__SEARCH_PAIRINGS_ROUND2_RESCORE_150_SAVE_ALL_OUT_-1a68_-_BARCODE__2050_7166_0 using rosetta_beta version 580
9/19/2007 5:28:31 PM|rosetta@home|Computation for task 1a68__SEARCH_PAIRINGS_ROUND2_RESCORE_150_SAVE_ALL_OUT_-1a68_-_BARCODE__2050_7166_0 finished
9/19/2007 5:28:33 PM|rosetta@home|[file_xfer] Started upload of file 1a68__SEARCH_PAIRINGS_ROUND2_RESCORE_150_SAVE_ALL_OUT_-1a68_-_BARCODE__2050_7166_0_0
9/19/2007 5:28:40 PM|rosetta@home|[file_xfer] Finished upload of file 1a68__SEARCH_PAIRINGS_ROUND2_RESCORE_150_SAVE_ALL_OUT_-1a68_-_BARCODE__2050_7166_0_0
9/19/2007 5:28:40 PM|rosetta@home|[file_xfer] Throughput 4223 bytes/sec
9/19/2007 5:28:46 PM|rosetta@home|Sending scheduler request: To report completed tasks
9/19/2007 5:28:46 PM|rosetta@home|Reporting 1 tasks
9/19/2007 5:28:51 PM|rosetta@home|Scheduler RPC succeeded [server version 509]
9/19/2007 5:28:51 PM|rosetta@home|Deferring communication for 4 min 2 sec
9/19/2007 5:28:51 PM|rosetta@home|Reason: requested by project

By 5:33PM (PDT)(i.e., 00:33 UTC) this WU was not in my results list. Since I've never had one disappear THAT fast before, I'm wondering if something is amiss. Unfortunately I have no way to know the result # or WU#...

Thoughts, anyone?

This workunit by any chance?
It hasn't disappeared, it's on page 3. The workunits are ordered by wu# not date. You got an earlier workunit later than you normally should if you look at the date, and so you're likely to miss it without digging deeper. :) Funny, this usually only happens when a workunit is reassigned after the deadline.
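
As an aside, the quoted log is pipe-delimited (timestamp, project, message), so the run time of that task can be read off mechanically. The sketch below infers the format purely from the lines quoted above, not from any BOINC documentation, and `parse_line` is a made-up helper.

```python
from datetime import datetime

# Two lines copied from the quoted log.
LOG = """\
9/19/2007 2:33:14 PM|rosetta@home|Starting task 1a68__SEARCH_PAIRINGS_ROUND2_RESCORE_150_SAVE_ALL_OUT_-1a68_-_BARCODE__2050_7166_0 using rosetta_beta version 580
9/19/2007 5:28:31 PM|rosetta@home|Computation for task 1a68__SEARCH_PAIRINGS_ROUND2_RESCORE_150_SAVE_ALL_OUT_-1a68_-_BARCODE__2050_7166_0 finished"""


def parse_line(line):
    """Split one log line into (timestamp, project, message)."""
    stamp, project, message = line.split("|", 2)
    return datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p"), project, message


started = finished = None
for line in LOG.splitlines():
    stamp, project, message = parse_line(line)
    if message.startswith("Starting task"):
        started = stamp
    elif message.startswith("Computation for task") and message.endswith("finished"):
        finished = stamp

elapsed = finished - started
print(elapsed)
```

Sorting the results list by a timestamp parsed this way, rather than by wu#, would have put the missing result at the top.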
12) Message boards : Number crunching : Problems with Rosetta version 5.80 (Message 46650)
Posted 20 Sep 2007 by Prom
Post:
Just a theory here.

I noticed that the cpu usage increased to above 50%. I was told that this might happen with graphics running but it seems it continues to do that after the graphics are shut down. The cpu usage is erratic and despite showing high usage the time and results seem to be going slower. This only seems to happen with the capri workunits though, the last one just finished after which things returned to normal. I know there was a graphics problem that is supposed to be fixed in the new versions. Is there any chance that another problem was introduced where the graphics thread continues to run after exiting the graphics?

Spoke too soon. :/
cpu usage is 2/3 on one unit and 1/3 on the other. This after viewing both.
13) Message boards : Number crunching : CAPRI14? (Message 46465)
Posted 17 Sep 2007 by Prom
Post:
1g4u__BOINC_CAPRI14_DOCK_FIXBACKBONE-1g4u_-nodimerloop_plexinmonomer__2067_505 ... stuck ...
14) Message boards : Number crunching : CAPRI14? (Message 46447)
Posted 17 Sep 2007 by Prom
Post:
Another 1g4u aborted that got stuck. As suspected it was using between 50~60% cpu.
15) Message boards : Number crunching : CAPRI14? (Message 46434)
Posted 17 Sep 2007 by Prom
Post:
Hmm, it seems the only people with problems here have two or more CPUs. The only one without problems is Anonymous here but he hasn't been running a lot of these WUs and not for long. Maybe a cache inconsistency between processors? This is strange since a WU is usually only run on one processor at a time but my cpu usage sometimes goes well above 50% on one and well below on the other.
16) Message boards : Number crunching : CAPRI14? (Message 46376)
Posted 16 Sep 2007 by Prom
Post:
I am quite curious as to the relative success rate of the Capri14 WUs versus WUs of all other types.

My personal experience is that a Capri14 WU is nearly doomed to failure, no matter the Rosetta version on which it is running, while nearly all other WUs will succeed...

I sincerely hope that the CPU time that (to my layman's perspective) I seem to be wasting on Capri14 is not representative of the experience of the general population of Rosetta crunchers.

Respectfully,
David Emigh


David I believe you are having a bad run of luck on the WUs. I also have had several fail but overall better than 95% have finished with valid results. Like a few others I have had to make adjustments to memory allocations on my old machines but the CAPRI14 project will soon be over and we can get back to the same old boring number crunching.
I also don't like the "client error" result; it does give me a sense of loss.

Cheers Jim

It seems that only a few types of WUs will result in some sort of failure. Most have given me a success. I don't really see it as wasted CPU time but then credit should be allocated accordingly for the results.
17) Message boards : Number crunching : CAPRI14? (Message 46298)
Posted 15 Sep 2007 by Prom
Post:
This is puzzling -- those jobs aren't taking long. We'll look into it. In the meanwhile, can you or others post links to the appropriate results that were killed?


I have now done several CAPRI14 WUs with 5.80: Only one has a problem:
http://boinc.bakerlab.org/rosetta/workunit.php?wuid=95775480
with this result:
http://boinc.bakerlab.org/rosetta/result.php?resultid=105522388
I don't have a clue as to why it failed to Validate. It seems to have run the full length of time I have set for processing.

Thanks,
Jim

Ok, one of mine has now failed to validate as well. Rosetta owes me 132.92 for 11h57m47.5s of work and 177 decoys.
18) Message boards : Number crunching : CAPRI14? (Message 46192)
Posted 14 Sep 2007 by Prom
Post:
I agree. Did you happen to view the graphic prior to it getting ended by the watch dog? I'm wondering if it was one of those that took the whole 3.5hrs on that first model, and therefore never completed one and had no completed results to report... or if it was one of those running 6 models an hour, and so it had completed 20 models or whatever prior to the problem. If the latter, then definitely there is room for improvement.

Most WU types will report the completed models and the failure and show "success", and issue credit accordingly. But I'm not certain if these are proving the exception to the rule.

I didn't see it while it was stuck but viewing a couple of times before it was putting out results about every 5~10 minutes so there should have been at least 40 decoys by that time. The problem is that the system seems to give 20 credits for effort because of the error rather than giving credits for the reported results. Luckily mine only claimed about 40 but others have claimed anything from 4 to a couple hundred and they all got 20. Whether it's set up like this or a bug only surfacing now I think the developers should look at it.
If it were really stuck, the watch dog should detect that and end it. But there are cases where progress is slow to come and the watch dog knows that. More than an hour for a single step... ya that sounds a bit high. One of Rhiju's questions was "did you happen to notice if the screen looked totally stuck before the crash?" ... but then THAT one didn't "crash" with the stuck after 900 seconds condition.

Do you know? Was it still using CPU after that hour? There have been issues where BOINC shows a status of "running", but no CPU is being used.

It went to 500 steps in a few seconds and then just seemed to stay there indefinitely at the refinement stage. So I restarted the app after an hour and it seemed to stay at those same values before I ended it at 20 mins. Unfortunately I didn't look at the cpu usage but will do so if it happens again. Strange that the watchdog didn't end it. Maybe there could have been progress so little that it didn't show up on screen but I don't know how likely that is. Hmm... seems somebody else completed it... maybe starting with a different random number?

I also have to wonder why this didn't show up at ralph as there seem to be so many issues with so many workunits. The problem is that these all went out shortly after the blackout so I suspect there wasn't any testing done on them.
19) Message boards : Number crunching : CAPRI14? (Message 46161)
Posted 14 Sep 2007 by Prom
Post:
WU 94754759 was killed after being idle for 15 min. I only got 20 credits for it. I'm not upset at this but would have expected about 40 after doing about 3.5 hours of work with a lot of results. What happened to the results that were valid? Just something else I think should be looked into.

Now WU 94757175 goes up to 500 steps and stops at the refinement. This time the watchdog didn't shut it down. After going for an hour I restarted boinc but it simply got stuck again so I aborted it after 20 mins. Now someone else has it, will see for how long.

I'm watching the rest closely to abort any suspicious units.

EDIT: It seems the 1he8 and the 1g4u ones have the most problems, maybe remove these from the system until this is resolved.
20) Questions and Answers : Web site : Merge Not Working (Message 41836)
Posted 5 Jun 2007 by Prom
Post:
I made changes to three of my systems last week. One did not require merging, but two did. I was able to successfully merge one system, but not the other. Here are the three scenarios.

It appears the major problem is the cpu type. The new BOINC version not only lists your cpu type but also its features. After installing the new version (note that I did not upgrade the old version), I suddenly had three computers, and the two old ones, which list only the cpu type, could not be merged with the newest one. I found a hackaround using the old version and thankfully only have one pc now. I will try to explain what happened in your three scenarios.
Computer 177211 was upgraded from a 2.4GHz P4 to a 3.2GHz Dual Core P4. While Windows XP and some software required reregistration, I did nothing to the BOINC client. The Rosetta site reflects the new and correct hardware/OS configuration that is now in 177211.

The BOINC client still had your old computer ids and only updated the details. This should not cause a new computer to be generated.
Computer 203481 was running Windows XP Home, and I installed Windows XP Pro. The installation wiped the hard drive, and I installed the latest version of BOINC. This created computer 486792, and I was able to merge the results of 203481 into 486792. The merge results reflect the proper and current OS on this hardware.

Since the BOINC data was wiped it had to create new computer ids. You did not change the hardware so you were able to merge the old data into the new computer.
Computer 297677 is running Red Hat Linux on a dual core P4 whose hostname is redhat. Last week I ran the "update" to apply the latest patches. The Kernel version changed from 2.6.9-42.0.10.ELsmp to 2.6.9-55.ELsmp. During the update, I also had to use the Logical Volume Management tool to extend my root and /var file systems. After patching, my /usr/local file system disappeared, and that is where the BOINC client was installed. This time I installed the latest version of BOINC under the /home filesystem and it created computer 486792. When I attempt to merge computer 486792, no computers appear in the merge list. However, if I attempt to merge computer 297677, it displays the same hostname or hardware in its various incarnations, but not the current configuration for 486792. The merge list for 297677 includes a system named fedora that was running on a Pentium III. Fedora was later replaced with redhat on the same P3. Redhat was later upgraded from a P3 to a Dual Core P4, and that configuration also appears in the merge list. However, I cannot merge 297677 into 486792.

Were you perhaps running the old BOINC version before? If so, when you installed the new one it created new computer ids, and because it reports the cpus differently the selection criteria see them as different computers. If you look you will see that 297677 is reported as "GenuineIntel Intel(R) Pentium(R) D CPU 2.66GHz" and 486792 is reported as "GenuineIntel Intel(R) Pentium(R) D CPU 2.66GHz [x86 Family 15 Model 4 Stepping 7] [fpu tsc sse sse2 mmx]". These are two completely different systems as far as the merging is concerned.
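
To illustrate why the two entries fail to match, here is a small sketch. It assumes the merge page compares the reported cpu strings exactly, which is my guess from the behaviour and not something confirmed from the BOINC source; `cpu_model` is a hypothetical helper that strips the bracketed detail the new client appends.

```python
import re


def cpu_model(description):
    """Drop bracketed detail such as [x86 Family ...] and [fpu tsc ...]."""
    return re.sub(r"\s*\[[^\]]*\]", "", description).strip()


old = "GenuineIntel Intel(R) Pentium(R) D CPU 2.66GHz"
new = ("GenuineIntel Intel(R) Pentium(R) D CPU 2.66GHz "
       "[x86 Family 15 Model 4 Stepping 7] [fpu tsc sse sse2 mmx]")

exact_match = old == new                        # what seems to happen today
model_match = cpu_model(old) == cpu_model(new)  # comparing the model part only

print(exact_match, model_match)
```

A comparison on the model part alone would treat 297677 and 486792 as the same hardware, while the exact comparison does not.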

If you really want to merge the old 297677 system you can create a dummy install of the old BOINC version, run it, and merge the computer into the new computer it creates. You then have two options:
1. Note the "host_cpid" (cross project id) and "hostid" (the one reported) tag values in the "client_state.xml" file of the dummy version and place them into your main version over the values there after backing up the file. Do this while BOINC is not running, then start the main version and let it update the project and in the process change the cpu type without creating a new computer. Then, after exiting BOINC, restore the backed up file and restart BOINC as normal.
2. Install the new BOINC version while BOINC is not running over the dummy install, start the new dummy version and have it update the project. Then exit the dummy version and start the main version as normal.
After one of these you should be able to merge the new computer created by the dummy install into the computer created by your main version.

If you don't want this to happen again you should note the two ids from the installed version before making any big changes. Those two values seem to identify your computer.
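
Noting the two ids can be scripted. The sketch below only relies on the tag names mentioned above ("host_cpid" and "hostid"); the sample XML layout and the cpid value are invented for the example, so check them against your own client_state.xml. `read_host_ids` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Invented stand-in for client_state.xml; the real file holds far more data.
SAMPLE = """
<client_state>
  <host_info>
    <host_cpid>0123abcd</host_cpid>
  </host_info>
  <project>
    <hostid>486792</hostid>
  </project>
</client_state>
"""


def read_host_ids(xml_text):
    """Collect every host_cpid and hostid value found anywhere in the XML."""
    root = ET.fromstring(xml_text)
    ids = {}
    for tag in ("host_cpid", "hostid"):
        # Lists are used because each attached project may carry its own hostid.
        ids[tag] = [el.text.strip() for el in root.iter(tag) if el.text]
    return ids


ids = read_host_ids(SAMPLE)
print(ids)
```

For a real install, read the file's contents into a string while BOINC is not running and pass that string to the function.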





©2024 University of Washington
https://www.bakerlab.org