Help a newbie out. Work Unit explained?

Message boards : Rosetta@home Science : Help a newbie out. Work Unit explained?

Pece Kocovski

Joined: 7 Mar 09
Posts: 2
Credit: 7,602
RAC: 0
Message 60071 - Posted: 11 Mar 2009, 8:12:36 UTC

Hi,

Well, first of all, if this question has already been answered somewhere, please direct me to it, because I had trouble finding it.

So I started lending my computer out to all these science projects a few days ago (I decided to give Rosetta the largest share of resources). BOINC has been working away happily, letting Rosetta have access to my computer so it can process the Work Units.

Now, I have gotten a few modest results (my computer is not very good, and when I have it on, I tend to use it). I understand what proteins are, and what tertiary structure, hydrogen bonds, disulphide bonds etc. are, and I understand that the processing is trying to find the lowest energies for certain proteins. What I do not understand is the way the results are presented.

So I am looking at this page:

https://boinc.bakerlab.org/rosetta/results.php?userid=304847

(Not sure if others can access my stats)

So let's say I click on one of the completed tasks at random, like number 234126438 (Task ID) / 213439838 (Work unit ID):

https://boinc.bakerlab.org/rosetta/result.php?resultid=234126438

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=213439838

So on the first link, my first confusion is the name. Wth? Is that just dished out at random, or does the name actually mean something (like does that name somehow equal the Ubiquitin-protein ligase as an example)?

Then there is the "stderr out" bar that then goes on to list a whole bunch of text that at first glance seems like a whole bunch of blah. Second glance did not change my view:

<core_client_version>6.4.7</core_client_version>
<![CDATA[
<stderr_txt>
BOINC:: Initializing ... ok.
[2009- 3- 9 23:25:57:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing core...
Initializing options.... ok
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev26003.zip
<unzip> <-oq> <../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev26003.zip> <-d./>
Firstarg=true; pp=-d./
firstarg: <-d./>
End of unzipping.
Setting database description ...
Setting up checkpointing ...
Setting up folding (abrelax) ...
Beginning folding (abrelax) ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Starting work on structure: _U8X7X_00001
BOINC:: Initializing ... ok.
[2009- 3-10 0:25:14:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing core...
Initializing options.... ok
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev26003.zip
<unzip> <-oq> <../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev26003.zip> <-d./>
Firstarg=true; pp=-d./
firstarg: <-d./>
End of unzipping.
Setting database description ...
Setting up checkpointing ...
Setting up folding (abrelax) ...
Beginning folding (abrelax) ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Starting work on structure: _U8X7X_00001
Continuing computation from checkpoint: chk_S_U8X7X_00000001_ClassicAbinitio__stage_1 ... success!
Continuing computation from checkpoint: chk_S_U8X7X_00000001_ClassicAbinitio__stage_2 ... success!
Continuing computation from checkpoint: chk_S_U8X7X_00000001_ClassicAbinitio__stage_3_iter1_1 ... success!
Continuing computation from checkpoint: chk_S_U8X7X_00000001_ClassicAbinitio__stage_3_iter1_2 ... success!
Continuing computation from checkpoint: chk_S_U8X7X_00000001_ClassicAbinitio__stage_3_iter1_3 ... success!
Continuing computation from checkpoint: chk_S_U8X7X_00000001_ClassicAbinitio__stage_3_iter1_4 ... success!
Continuing computation from checkpoint: chk_S_U8X7X_00000001_ClassicAbinitio__stage_3_iter1_5 ... success!
Continuing computation from checkpoint: chk_S_U8X7X_00000001_ClassicAbinitio__stage_3_iter1_6 ... success!
Continuing computation from checkpoint: chk_S_U8X7X_00000001_ClassicAbinitio__stage_3_iter1_7 ... success!
Continuing computation from checkpoint: chk_S_U8X7X_00000001_ClassicAbinitio__stage_3_iter1_8 ... success!
Continuing computation from checkpoint: chk_S_U8X7X_00000001_ClassicAbinitio__stage_3_iter1_9 ... success!
Continuing computation from checkpoint: chk_S_U8X7X_00000001_ClassicAbinitio__stage_3_iter1_10 ... success!
Starting work on structure: _U8X7X_00002
BOINC:: Initializing ... ok.
[2009- 3-10 1:38:17:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
Registering options..
Registered extra options.
Initializing core...
Initializing options.... ok
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev26003.zip
<unzip> <-oq> <../../projects/boinc.bakerlab.org_rosetta/minirosetta_database_rev26003.zip> <-d./>
Firstarg=true; pp=-d./
firstarg: <-d./>
End of unzipping.
Setting database description ...
Setting up checkpointing ...
Setting up folding (abrelax) ...
Beginning folding (abrelax) ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
Starting work on structure: _U8X7X_00002
Continuing computation from checkpoint: chk_S_U8X7X_00000002_ClassicAbinitio__stage_1 ... success!
Continuing computation from checkpoint: chk_S_U8X7X_00000002_ClassicAbinitio__stage_2 ... success!
Continuing computation from checkpoint: chk_S_U8X7X_00000002_ClassicAbinitio__stage_3_iter1_1 ... success!
Continuing computation from checkpoint: chk_S_U8X7X_00000002_ClassicAbinitio__stage_3_iter1_2 ... success!
======================================================
DONE :: 1 starting structures 9401.06 cpu seconds
This process generated 2 decoys from 2 attempts
======================================================

BOINC :: Watchdog shutting down...
BOINC :: BOINC support services shutting down...
called boinc_finish

</stderr_txt>
]]>

Would someone here be able to explain to me what all of the above means? (Not in detail, just a general overview.) I have noticed similar lines of text (although some shorter than others) on my other completed work units. Or is the above data just the textual representation of the proteins we see on the screensaver?

So yeah, any help in clarifying the above would be great.

Finally (while I am here), I noticed that the homepage of Rosetta@home says that R@home is not for profit. But I then read Dr David Baker's journal (which I skimmed, and some parts are a great read!) and he mentioned something about manuscripts (which I assume means publishing articles in scientific journals). Don't those articles go on to be published for a profit (correct me if I am wrong here)? Or does the heading mean that all the data is released into the public domain, and then you can use the data however you wish (i.e. publish the data as results, make money etc.)?

Cheers!
ID: 60071 · Rating: 0
cenit

Joined: 1 Apr 07
Posts: 13
Credit: 1,630,287
RAC: 0
Message 60072 - Posted: 11 Mar 2009, 8:50:09 UTC - in response to Message 60071.  

my first confusion is the name. Wth? Is that just dished out at random, or does the name actually mean something (like does that name somehow equal the Ubiquitin-protein ligase as an example)?


As for the work unit names... I'm not able to explain them. They're funny...

Then there is the "stderr out" bar that then goes on to list a whole bunch of text that at first glance seems like a whole bunch of blah. Second glance did not change my view:

The stderr output was increased a lot in 1.54 to help discover bugs in the Rosetta software. It is not interesting from the protein point of view.

Finally (while I am here), I noticed that the homepage of Rosetta@home says that R@home is not for profit. But I then read Dr David Baker's journal (which I skimmed, and some parts are a great read!) and he mentioned something about manuscripts (which I assume means publishing articles in scientific journals). Don't those articles go on to be published for a profit (correct me if I am wrong here)? Or does the heading mean that all the data is released into the public domain, and then you can use the data however you wish (i.e. publish the data as results, make money etc.)?
Cheers!


Papers are published in scientific journals, NOT for profit (at most, the authors do it to receive funding). Journals are expensive because they have to pay people to peer-review papers (to ensure they're correct); no money goes to the author. This is the way science works: you produce a paper and try to get it published in a journal, which happens only if it is peer-reviewed and accepted.
ID: 60072 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 60076 - Posted: 11 Mar 2009, 16:35:33 UTC

Hello Pece, welcome to Rosetta!

In general terms, the messages you ask about are normal. They are status messages at various points in the run of your task that indicate how things are going. If any problems arise in the processing of the task on your machine, these will be helpful to determine which area of the program was having a problem.
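
For anyone who wants to skim such a log rather than read it line by line, here is a minimal Python sketch. It assumes the line formats seen in the log quoted earlier in this thread (the `boinc_init()`, `Continuing computation from checkpoint`, `DONE ::` and `generated ... decoys` lines); these are not a documented format and may differ between Rosetta versions. The function name `summarize_stderr` and the shortened excerpt are made up for illustration. Each `boinc_init()` line most likely marks a point where the program was (re)started, for example after the task was paused, and a "decoy" is one candidate structure.

```python
import re

# Shortened excerpt of the stderr text quoted earlier in this thread.
STDERR_EXCERPT = """\
BOINC:: Initializing ... ok.
[2009- 3- 9 23:25:57:] :: BOINC :: boinc_init()
Starting work on structure: _U8X7X_00001
BOINC:: Initializing ... ok.
[2009- 3-10 0:25:14:] :: BOINC :: boinc_init()
Starting work on structure: _U8X7X_00001
Continuing computation from checkpoint: chk_S_U8X7X_00000001_ClassicAbinitio__stage_1 ... success!
Starting work on structure: _U8X7X_00002
BOINC:: Initializing ... ok.
[2009- 3-10 1:38:17:] :: BOINC :: boinc_init()
Starting work on structure: _U8X7X_00002
Continuing computation from checkpoint: chk_S_U8X7X_00000002_ClassicAbinitio__stage_1 ... success!
DONE :: 1 starting structures 9401.06 cpu seconds
This process generated 2 decoys from 2 attempts
"""


def summarize_stderr(text):
    """Count (re)starts and checkpoint resumes, and read the final totals."""
    summary = {
        # One boinc_init() line is printed each time the program starts up.
        "program_starts": len(re.findall(r"boinc_init\(\)", text)),
        # One line per checkpoint that was reloaded after a restart.
        "checkpoints_resumed": len(
            re.findall(r"Continuing computation from checkpoint", text)
        ),
    }
    done = re.search(r"DONE :: (\d+) starting structures ([\d.]+) cpu seconds", text)
    if done:
        summary["cpu_seconds"] = float(done.group(2))
    made = re.search(r"generated (\d+) decoys from (\d+) attempts", text)
    if made:
        summary["decoys"] = int(made.group(1))
        summary["attempts"] = int(made.group(2))
    return summary


summary = summarize_stderr(STDERR_EXCERPT)
print(summary)
```

On the excerpt above, this reports three program starts and two completed decoys, which matches the "2 decoys from 2 attempts" line at the end of the real log.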

As for the WU names, yes, they get quite long sometimes :)
The one you mentioned is called
h001__BOINC_ABRELAX_SAVE_ALL_OUT_RANGE_t460_IGNORE_THE_REST -S25-8-S3-7--h001_-_7725_3756_0

In general terms, "abrelax", "save all out" and "ignore the rest" describe the methods that particular task will use to try to find the low energy. As for the ending digits: 7725 is a batch number, 3756 is the task within the batch, and 0 is the first replica sent out. If all goes well with your task and it is returned before the deadline, no further replication of the task will be done.

The rest of the WU name typically identifies the protein, but sometimes it is more generically a "target" number. One of your tasks has t313 in the name; that is probably a generic target number. Another one has 1vie in the name; that is usually the identifier in the Protein Data Bank.
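
The trailing fields described above can be pulled apart mechanically. This is only a sketch: it assumes every WU name ends in `_<batch>_<task>_<replica>` as this one does, which is an observation from the single name in this thread and not a documented naming rule. The helper name `split_wu_name` is made up for illustration.

```python
def split_wu_name(name):
    """Split a work unit name into its description and three trailing numbers.

    Assumes the name ends in _<batch>_<task>_<replica>, as in the example
    from this thread; other work units may not follow this pattern.
    """
    description, batch, task, replica = name.rsplit("_", 3)
    return {
        "description": description,
        "batch": int(batch),
        "task": int(task),
        "replica": int(replica),
    }


wu_name = (
    "h001__BOINC_ABRELAX_SAVE_ALL_OUT_RANGE_t460_IGNORE_THE_REST"
    " -S25-8-S3-7--h001_-_7725_3756_0"
)
fields = split_wu_name(wu_name)
print(fields["batch"], fields["task"], fields["replica"])
```

Splitting from the right with `rsplit` matters here, because the descriptive part of the name contains many underscores of its own.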

If you already know all of those things about proteins, you will probably be interested in spending some time on the Protein Data Bank website. For the few proteins with solved structures, you can download and view them in 3D with one of the viewers referenced there.

This will quickly lead to another question: if the structure of 1vie is already known, then why is Rosetta@home sending out work units that try to solve it? The basic answer is that proteins of known structure provide a frame of reference. You can compute until your CPUs are white hot, but if you have no way to know whether you have arrived at the desired solution, how do you know if you are improving? So, these tasks are run to determine if the program would/could have predicted the same structure as was found experimentally.
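
To make the "frame of reference" idea concrete, here is a small Python sketch of one widely used way to compare a predicted structure against a known one: the root-mean-square deviation (RMSD) between matching atom positions. How Rosetta@home actually scores its predictions is not described in this thread, so treat this purely as an illustration. The coordinates are made-up toy numbers, not real protein data, and the sketch assumes both structures are already superimposed and list their atoms in the same order.

```python
import math


def rmsd(predicted, reference):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumes the structures are already superimposed and that atom i in one
    list corresponds to atom i in the other.
    """
    if not predicted or len(predicted) != len(reference):
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (px, py, pz), (rx, ry, rz) in zip(predicted, reference):
        total += (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
    return math.sqrt(total / len(predicted))


# Toy coordinates, invented for illustration only.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
good_guess = [(0.0, 0.1, 0.0), (1.5, -0.1, 0.0), (3.0, 0.1, 0.0)]
poor_guess = [(0.0, 2.0, 0.0), (1.5, -2.0, 0.0), (3.0, 2.0, 0.0)]

print(rmsd(good_guess, reference))
print(rmsd(poor_guess, reference))
```

A lower value means the prediction sits closer to the experimental structure, which is exactly the kind of check that is only possible when the answer is already known.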

Ultimately, once the predictions from Rosetta are correct (or close enough), using Rosetta to solve structures will be much faster and less expensive than the experimental techniques. Rosetta is also used to aid the experimental techniques. With the experimental data gathered as a frame of reference, Rosetta's predictions are more accurate and can save some of the laborious parts of the experimental process as well.

As for the intellectual property rights... yes, as Rosetta is developed, the discoveries are published and other researchers are allowed to use (and contribute to) the program for academic purposes. Rosetta is a tool that will some day be used by others to produce vaccines and treatments. And those who use the tool to develop, test, produce, and promote such vaccines and treatments will most likely be in it for the profit motive. BakerLab is not made up of drug researchers, but they have done some collaboration with HIV researchers to help discover methods of attacking the virus.

Rosetta is sort of like an electron microscope. No one could say for sure how people would use it when it was still in development. It's only when people start to work with it, are exposed to it, and have ideas about other ways to use it that you really see progress. They start to test its capabilities beyond the original expectations and discover new and unexpected things to do with the tool.
Rosetta Moderator: Mod.Sense
ID: 60076 · Rating: 0
Pece Kocovski

Joined: 7 Mar 09
Posts: 2
Credit: 7,602
RAC: 0
Message 60093 - Posted: 12 Mar 2009, 10:30:23 UTC - in response to Message 60076.  

Thank you both cenit and Mod.Sense for your excellent answers! With the info you both gave me, I am now more than happy to be donating my idle computer time to Rosetta@home (among a few other projects of similar interest) and will try and convince many others to do the same!

All of you working on Rosetta@home (researchers, scientists, programmers, coders etc.): keep up the excellent work!

Cheers!
ID: 60093 · Rating: 0




©2024 University of Washington
https://www.bakerlab.org