Problems and Technical Issues with Rosetta@home

robertmiles
Joined: 16 Jun 08
Posts: 1194
Credit: 13,235,405
RAC: 1,087
Message 93775 - Posted: 7 Apr 2020, 20:51:20 UTC - in response to Message 93761.  

Hi all, new to Rosetta. I have been running Seti@Home since last fall with no issues. I added Rosetta@home on two PCs. Upon reboot, Rosetta is removed from the projects list. I have to add a new task, select Rosetta and re-enter my password info. Seti does not do this. What am I missing? I am using a different password from Seti, if that makes a difference.

Looking forward to the new project.

Bill S.

Something simple that seems worth trying: wait for at least one task to finish. Then, in the Advanced view, click Activity, then Suspend. Wait at least one more minute before shutting down BOINC, Windows, or the computer. Let us know if doing this even once helps.
ID: 93775 · Rating: 0
Kevin N. Carpenter
Joined: 6 Apr 20
Posts: 6
Credit: 6,614,362
RAC: 0
Message 93790 - Posted: 8 Apr 2020, 0:24:15 UTC - in response to Message 93788.  

The moderator removed my post from the "moderator contact point assistance post here" thread, so here goes.

Hi all, new to Rosetta. I have been running Seti@Home since last fall with no issues. I added Rosetta@home on two PCs. Upon reboot, Rosetta is removed from the projects list. I have to add a new task, select Rosetta and re-enter my password info. Seti does not do this. What am I missing? I am using a different password from Seti, if that makes a difference.

Looking forward to the new project.

Bill S.

Or, if I'm still in the wrong area, someone please post a useful link to the appropriate thread.

Thank you.


That is unusual. I added Rosetta to about a half-dozen machines yesterday and have not had that problem. Many have been restarted without issues. Sometimes it takes a few minutes for Rosetta to start back up, but that is all.
ID: 93790 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1366
Credit: 13,624,788
RAC: 0
Message 93792 - Posted: 8 Apr 2020, 0:31:00 UTC - in response to Message 93788.  

Hi all, new to Rosetta. I have been running Seti@Home since last fall with no issues. I added Rosetta@home on two PCs. Upon reboot, Rosetta is removed from the projects list. I have to add a new task, select Rosetta and re-enter my password info. Seti does not do this. What am I missing? I am using a different password from Seti, if that makes a difference.

Not sure what the issue is. I just used the BOINC Manager: Tools, Add project, select Rosetta@home, New user, with a different password from my other project, and everything went along OK from there.
I didn't reboot the system till a few days after joining up, and the project was still there when it did reboot.
Grant
Darwin NT
ID: 93792 · Rating: 0
Stephen "Heretic"
Joined: 2 Apr 20
Posts: 21
Credit: 11,028
RAC: 0
Message 93794 - Posted: 8 Apr 2020, 0:43:53 UTC - in response to Message 93766.  
Last modified: 8 Apr 2020, 0:46:23 UTC

A Windows XP PC connected to the internet? A Pentium 4? Yeah... well... the problem seems self-explanatory: too old.
Windows XP is horribly insecure; at least switch to Linux.


Well, I'm currently living at my dad's place and this is his PC, the only one I can have. It can still run WUs from WCG and Asteroid@Home. The problem is not how powerful the PC is; it's why current Rosetta WUs don't work with it. A compatibility issue? A month ago there was no problem; even my older PC, a PIII/WinXP, ran Rosetta WUs pretty well.


They have revised the app from 4.07/4.08 to 4.12. This new version asks a bit more from the hardware and seems to be a bit of a memory hog; see the thread on use of memory and Rosetta. I have an i5 with 8GB RAM and I have had problems getting it to run reliably. But there may be another cause.

Stephen

ID: 93794 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1366
Credit: 13,624,788
RAC: 0
Message 93795 - Posted: 8 Apr 2020, 0:49:51 UTC - in response to Message 93794.  

They have revised the app from 4.07/4.08 to 4.12. This new version asks a bit more from the hardware and seems to be a bit of a memory hog
The Task being processed determines how much RAM is required.
Present Tasks are using much less RAM (400MB to 800MB max) than ones from a week ago (400MB to 1.3GB max).
Grant
Darwin NT
ID: 93795 · Rating: 0
MarkJ
Joined: 28 Mar 20
Posts: 72
Credit: 24,846,907
RAC: 1,665
Message 93800 - Posted: 8 Apr 2020, 1:22:05 UTC

Got 56 CF_monomer/Rosetta Mini work units that all failed with an instant "Error while computing". Running Linux x64. Other work units seem fine, just not these ones. I ended up aborting the few that hadn't committed suicide.

I notice the scheduler is down now so maybe they are removing them from the queue.

Example: https://boinc.bakerlab.org/rosetta/result.php?resultid=1142716718

Stderr
<core_client_version>7.16.1</core_client_version>
<![CDATA[
<message>
process exited with code 255 (0xff, -1)</message>
<stderr_txt>
[2020- 4- 8 8:49:35:] :: BOINC:: Initializing ... ok.
[2020- 4- 8 8:49:35:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
command: ../../projects/boinc.bakerlab.org_rosetta/minirosetta_3.78_x86_64-pc-linux-gnu -abinitio::fastrelax 1 -ex2aro 1 -frag3 00001.200.3mers -in:file:native 00001.pdb -corrections::beta_nov16 -silent_gz 1 -frag9 00001.200.9mers -out:file:silent default.out -ex1 1 -abinitio::rsd_wt_loop 0.5 -relax::default_repeats 15 -abinitio::use_filters false -abinitio::increase_cycles 10 -abinitio::rsd_wt_helix 0.5 -abinitio::rg_reweight 0.5 -in:file:boinc_wu_zip CF_monomer_28_data.zip -out:file:silent default.out -silent_gz -mute all -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 2362735
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
ERROR: Option matching -corrections:beta_nov16 not found in command line top-level context

</stderr_txt>
]]>
BOINC blog
ID: 93800 · Rating: 0
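For anyone triaging a batch of failed tasks, the stderr above can be reduced to just its error lines. This is only a minimal sketch: it assumes the failure is reported on a line beginning with `ERROR:`, as in the log quoted in this post, and the helper name `extract_errors` is made up for illustration.

```python
import re
from typing import List


def extract_errors(stderr_text: str) -> List[str]:
    """Return the message part of every line that begins with 'ERROR:'."""
    pattern = re.compile(r"^ERROR:\s*(.*)$", flags=re.MULTILINE)
    return [match.group(1).strip() for match in pattern.finditer(stderr_text)]


# The last lines of the stderr quoted in the post above.
sample = """Options::initialize() Check specs.
Options::initialize() End reached
ERROR: Option matching -corrections:beta_nov16 not found in command line top-level context
"""

for message in extract_errors(sample):
    print(message)
```

If every failed task in a batch reports the same message, that points at the task parameters rather than at one particular host.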
shanen
Joined: 16 Apr 14
Posts: 195
Credit: 12,659,331
RAC: 4
Message 93809 - Posted: 8 Apr 2020, 3:33:48 UTC

Trying and even eager to be of help, but...

All these short deadline units are troublesome. Is it accomplishing anything if my contributions are just discarded? And discarded for the sake of deadlines that seem quite arbitrary, even silly. Exacerbated by more checkpoint problems, too.

Actually writing from the machine that has the most problems dealing with the deadlines, but even some of my bigger machines clearly have more queued tasks than they can possibly complete within the short deadlines. The obvious workaround (though it's tedious) is to manually abort the tasks that can't be completed, but that causes problems because the flow of tasks has become sporadic again... Plus it's wasting bandwidth at the project end when they send data that is just discarded.

On top of that, some of the machines wind up wasting time because of large batches of tasks with large memory requirements that cause the "Waiting for memory" status on some tasks. Again, selective nuking of tasks can get the CPUs busy again, but I'm NOT supposed to be spending time managing memory problems because the people running Rosetta@home can't figure it out... I'm fairly confident that BOINC has the capabilities to assess and manage memory, but it seems they are not being used by the Baker Lab people.

I've currently earned over 12 million points, which is supposed to indicate a moderate contribution, but I'm thinking about moving along. The reason I switched to Rosetta was because the projects I used to support were not well managed. I'm sure I could even shop around for projects that are also working on Covid projects.

In addition, if I were still supporting researchers, I would not recommend that they rely on data processed on Rosetta because such problems make the entire thing dubious... There were a couple of teams in the lab that are probably doing Covid stuff now (but I'm retired, so I have no idea).
#1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech)
ID: 93809 · Rating: 0
MarkJ
Joined: 28 Mar 20
Posts: 72
Credit: 24,846,907
RAC: 1,665
Message 93810 - Posted: 8 Apr 2020, 3:43:20 UTC
Last modified: 8 Apr 2020, 3:45:49 UTC

Now up to 68, with a few more waiting to run. Can we get these removed from the queue, please?

Looking at my wingmen on a few of them, they too are failing, so it's not just me. I don't want to get my hosts blocked from getting tasks due to what looks like a parameter error with these tasks.
BOINC blog
ID: 93810 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1194
Credit: 13,235,405
RAC: 1,087
Message 93812 - Posted: 8 Apr 2020, 3:53:24 UTC - in response to Message 93800.  

Got 56 CF_monomer/Rosetta Mini work units that all failed with an instant "Error while computing". Running Linux x64. Other work units seem fine, just not these ones. I ended up aborting the few that hadn't committed suicide.

I notice the scheduler is down now so maybe they are removing them from the queue.

[snip]

I got a few under 64 bit Windows 10. Those hit errors within a few seconds of starting.
ID: 93812 · Rating: 0
SIXER (L.Gammel)
Joined: 6 Apr 20
Posts: 1
Credit: 393,432
RAC: 0
Message 93814 - Posted: 8 Apr 2020, 4:59:46 UTC

I also switched from SETI to Rosetta but am getting a message about disk space. Please help:

Rosetta@home: Notice from server
Rosetta Mini needs 183.28MB more disk space. You currently have 1533.33 MB available and it needs 1716.61 MB.

What do I need to do?
Sixer
ID: 93814 · Rating: 0
Marcos Carot
Joined: 30 Dec 11
Posts: 3
Credit: 301,124
RAC: 0
Message 93815 - Posted: 8 Apr 2020, 5:11:21 UTC - in response to Message 93814.  

Yes, I got the same message. I just had to make space by deleting other stuff in the partition where BOINC stores its data. On Linux, that is not Home; I think it was /var.
ID: 93815 · Rating: 0
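The numbers in the server notice are easy to sanity-check, and the free space on the partition that holds the BOINC data can be read with Python's standard library. This is only a rough sketch: the helper names are invented for illustration, and the directory passed to `free_mb` is a placeholder, since the actual BOINC data directory differs between installs. The "available" figure in the notice may also be capped by the disk limits in the computing preferences, not only by free space on the partition.

```python
import shutil


def disk_shortfall_mb(needed_mb: float, available_mb: float) -> float:
    """How many more MB are required (0.0 if there is already enough)."""
    return max(0.0, round(needed_mb - available_mb, 2))


def free_mb(path: str) -> float:
    """Free space, in MB, on the partition that contains `path`."""
    return shutil.disk_usage(path).free / (1024 * 1024)


# Figures from the server notice quoted in this thread.
print(disk_shortfall_mb(1716.61, 1533.33))  # 183.28

# "." is a placeholder; point this at your own BOINC data directory.
print(round(free_mb("."), 1))
```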
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1366
Credit: 13,624,788
RAC: 0
Message 93817 - Posted: 8 Apr 2020, 5:37:13 UTC - in response to Message 93814.  

I also switched from SETI to Rosetta but am getting a message about disk space. Please help:

Rosetta@home: Notice from server
Rosetta Mini needs 183.28MB more disk space. You currently have 1533.33 MB available and it needs 1716.61 MB.

What do I need to do?
Give it more disk space.
In your account, under Computing preferences:

Disk
Use no more than 12GB
Leave at least 2GB free
Use no more than 60% of total

Memory
When computer is in use, use at most 90%
When computer is not in use, use at most 95%
Even so, you will also need to add more RAM, or reduce the number of CPU cores/threads you use (down to 6 or even 5) so you don't run out of RAM (as much as 1.3GB of RAM per task can be required, although present tasks are only using 800MB or less).
Grant
Darwin NT
ID: 93817 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1366
Credit: 13,624,788
RAC: 0
Message 93818 - Posted: 8 Apr 2020, 5:48:55 UTC - in response to Message 93809.  

Trying and even eager to be of help, but...

All these short deadline units are troublesome. Is it accomplishing anything if my contributions are just discarded? And discarded for the sake of deadlines that seem quite arbitrary, even silly. Exacerbated by more checkpoint problems, too.

Actually writing from the machine that has the most problems dealing with the deadlines, but even some of my bigger machines clearly have more queued tasks than they can possibly complete within the short deadlines. The obvious workaround (though it's tedious) is to manually abort the tasks that can't be completed, but that causes problems because the flow of tasks has become sporadic again... Plus it's wasting bandwidth at the project end when they send data that is just discarded.

On top of that, some of the machines wind up wasting time because of large batches of tasks with large memory requirements that cause the "Waiting for memory" status on some tasks. Again, selective nuking of tasks can get the CPUs busy again
Simple.
Set a small cache: 1 day or less, with additional days at 0.02 or so. Make sure your checkpointing request is set for 60 seconds.
Set the number of threads used for crunching to no more than the number your RAM can support, allowing 1.3GB per task (unless you wish to add more RAM, which also solves the problem), and in your computing preferences allow BOINC to make use of the RAM you have (set it to 90% or higher). Don't abort the Tasks; if they miss the deadline, the project will resend them to another host.

Work gets done, errors are reduced (if not eliminated), the Manager will figure out how much work you can and can't do and stop fetching too much, and it won't require frequent intervention on your part.
Grant
Darwin NT
ID: 93818 · Rating: 0
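The sizing rule in the post above comes down to a little arithmetic. Here is a rough sketch of it, using the figures quoted in this thread (up to 1.3GB per task, 90% of RAM made available to BOINC). The function name and its defaults are just for illustration, and actual memory use varies from task to task.

```python
import math
from typing import Optional


def max_concurrent_tasks(
    total_ram_gb: float,
    usable_fraction: float = 0.90,      # share of RAM BOINC is allowed to use
    ram_per_task_gb: float = 1.3,       # worst case per task quoted in this thread
    cpu_threads: Optional[int] = None,  # cap at the thread count, if known
) -> int:
    """Number of tasks that fit in RAM, never more than the CPU thread count."""
    by_ram = math.floor(total_ram_gb * usable_fraction / ram_per_task_gb)
    if cpu_threads is not None:
        by_ram = min(by_ram, cpu_threads)
    return max(by_ram, 0)


print(max_concurrent_tasks(8))                   # 5
print(max_concurrent_tasks(32, cpu_threads=12))  # 12
print(max_concurrent_tasks(3))                   # 2
```

On an 8GB machine this gives 5 tasks, and on a 32GB, 12-thread machine the thread count is the limit rather than the RAM.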
michelv
Joined: 28 Mar 20
Posts: 8
Credit: 216,762
RAC: 0
Message 93820 - Posted: 8 Apr 2020, 6:07:39 UTC - in response to Message 93812.  

I got a few under 64 bit Windows 10. Those hit errors within a few seconds of starting.

Same here, just a few minutes ago.
ID: 93820 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1366
Credit: 13,624,788
RAC: 0
Message 93821 - Posted: 8 Apr 2020, 6:14:27 UTC - in response to Message 93812.  
Last modified: 8 Apr 2020, 6:32:10 UTC

Got 56 CF_monomer/Rosetta Mini work units that all failed with an instant "Error while computing". Running Linux x64. Other work units seem fine, just not these ones. I ended up aborting the few that hadn't committed suicide.

I notice the scheduler is down now so maybe they are removing them from the queue.

[snip]

I got a few under 64 bit Windows 10. Those hit errors within a few seconds of starting.

Decided to give mine a try (Win10).
One Task ran with no problem and is crunching away using minirosetta_3.78_windows_x86_64.exe.

On the other system, every one was an instant Computation error. They were set to run with Rosetta Mini v3.78 windows_intelx86, and one Task using Rosetta Mini v3.78 windows_x86_64 errored out as well.

Presently processing normally (40 min, so far so good):
CF_monomer_12_fold_SAVE_ALL_OUT_905409_765_0      Rosetta Mini v3.78 windows_x86_64

All failed instantly:
CF_monomer_78_fold_SAVE_ALL_OUT_905546_126_1      Rosetta Mini v3.78 windows_intelx86
CF_monomer_37_fold_SAVE_ALL_OUT_905505_203_0      Rosetta Mini v3.78 windows_intelx86
CF_monomer_103_fold_SAVE_ALL_OUT_905615_59_0      Rosetta Mini v3.78 windows_intelx86
CF_monomer_103_fold_SAVE_ALL_OUT_905620_58_0      Rosetta Mini v3.78 windows_intelx86
CF_monomer_103_relax_SAVE_ALL_OUT_905662_69_0     Rosetta Mini v3.78 windows_x86_64
CF_monomer_103_fold_SAVE_ALL_OUT_905589_56_0      Rosetta Mini v3.78 windows_intelx86
CF_monomer_103_fold_SAVE_ALL_OUT_905646_12_1      Rosetta Mini v3.78 windows_intelx86

<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -1 (0xffffffff)
</message>
<stderr_txt>
[2020- 4- 8 15:26:42:] :: BOINC:: Initializing ... ok.
[2020- 4- 8 15:26:42:] :: BOINC :: boinc_init()
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully. 
command: projects/boinc.bakerlab.org_rosetta/minirosetta_3.78_windows_x86_64.exe -abinitio::fastrelax 1 -ex2aro 1 -frag3 00001.200.3mers -in:file:native 00001.pdb -corrections::beta_nov16 -silent_gz 1 -frag9 00001.200.9mers -out:file:silent default.out -ex1 1 -abinitio::rsd_wt_loop 0.5 -relax::default_repeats 15 -abinitio::use_filters false -abinitio::increase_cycles 10 -abinitio::rsd_wt_helix 0.5 -abinitio::rg_reweight 0.5 -in:file:boinc_wu_zip CF_monomer_103_data.zip -out:file:silent default.out -silent_gz -mute all -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 2047753
Registering options.. 
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok 
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize()  End reached
ERROR: Option matching -corrections:beta_nov16 not found in command line top-level context

</stderr_txt>
]]>

Grant
Darwin NT
ID: 93821 · Rating: 0
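To see at a glance whether the failures cluster on one application build, the task listing from the post above can be tallied by its last column. This is just an illustrative sketch over the seven failed tasks as listed; the helper name is made up. Both builds appear among the failures, so the build alone does not seem to explain them.

```python
from collections import Counter

# The "All failed instantly" listing from the post above.
FAILED_TASKS = """\
CF_monomer_78_fold_SAVE_ALL_OUT_905546_126_1      Rosetta Mini v3.78 windows_intelx86
CF_monomer_37_fold_SAVE_ALL_OUT_905505_203_0      Rosetta Mini v3.78 windows_intelx86
CF_monomer_103_fold_SAVE_ALL_OUT_905615_59_0      Rosetta Mini v3.78 windows_intelx86
CF_monomer_103_fold_SAVE_ALL_OUT_905620_58_0      Rosetta Mini v3.78 windows_intelx86
CF_monomer_103_relax_SAVE_ALL_OUT_905662_69_0     Rosetta Mini v3.78 windows_x86_64
CF_monomer_103_fold_SAVE_ALL_OUT_905589_56_0      Rosetta Mini v3.78 windows_intelx86
CF_monomer_103_fold_SAVE_ALL_OUT_905646_12_1      Rosetta Mini v3.78 windows_intelx86
"""


def count_by_platform(listing: str) -> Counter:
    """Count tasks per platform, taking the platform as the last field on each line."""
    return Counter(line.split()[-1] for line in listing.splitlines() if line.strip())


for platform, count in count_by_platform(FAILED_TASKS).most_common():
    print(platform, count)
```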
MarkJ
Joined: 28 Mar 20
Posts: 72
Credit: 24,846,907
RAC: 1,665
Message 93822 - Posted: 8 Apr 2020, 6:14:44 UTC - in response to Message 93818.  
Last modified: 8 Apr 2020, 6:17:45 UTC

<snipped>
Set a small cache: 1 day or less, with additional days at 0.02 or so. Make sure your checkpointing request is set for 60 seconds.
Set the number of threads used for crunching to no more than the number your RAM can support, allowing 1.3GB per task (unless you wish to add more RAM, which also solves the problem), and in your computing preferences allow BOINC to make use of the RAM you have (set it to 90% or higher). Don't abort the Tasks; if they miss the deadline, the project will resend them to another host.

Running a 0.33 day cache on my 6 core/12 thread machines with 32GB of memory.

On the couple of Pis that I have running Rosetta, I had to create an app_config to limit the number of tasks running at once.
BOINC blog
ID: 93822 · Rating: 0
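The post above does not show the app_config itself, so the following is only a guess at its general shape rather than the poster's actual file. It is a small sketch that builds an `app_config.xml` using BOINC's `project_max_concurrent` element, which caps how many of a project's tasks run at once. The limit of 2 and the function name are illustrative assumptions. The resulting file would go in the Rosetta project folder inside the BOINC data directory.

```python
import xml.etree.ElementTree as ET


def build_app_config(max_tasks: int) -> str:
    """Return app_config.xml content that caps concurrently running tasks."""
    if max_tasks < 1:
        raise ValueError("max_tasks must be at least 1")
    root = ET.Element("app_config")
    limit = ET.SubElement(root, "project_max_concurrent")
    limit.text = str(max_tasks)
    return ET.tostring(root, encoding="unicode")


# 2 is an arbitrary example value for a low-memory board.
print(build_app_config(2))
```

After saving the file, the client needs to re-read its config files (or be restarted) for the limit to take effect.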
jackielan2000
Joined: 5 Sep 06
Posts: 13
Credit: 14,079
RAC: 0
Message 93830 - Posted: 8 Apr 2020, 7:50:16 UTC - in response to Message 93794.  

A Windows XP PC connected to the internet? A Pentium 4? Yeah... well... the problem seems self-explanatory: too old.
Windows XP is horribly insecure; at least switch to Linux.


Well, I'm currently living at my dad's place and this is his PC, the only one I can have. It can still run WUs from WCG and Asteroid@Home. The problem is not how powerful the PC is; it's why current Rosetta WUs don't work with it. A compatibility issue? A month ago there was no problem; even my older PC, a PIII/WinXP, ran Rosetta WUs pretty well.


They have revised the app from 4.07/4.08 to 4.12. This new version asks a bit more from the hardware and seems to be a bit of a memory hog; see the thread on use of memory and Rosetta. I have an i5 with 8GB RAM and I have had problems getting it to run reliably. But there may be another cause.

Stephen



8G RAM? Well, that means it can only be run on x64 systems. OK, bye-bye Rosetta.
ID: 93830 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1366
Credit: 13,624,788
RAC: 0
Message 93835 - Posted: 8 Apr 2020, 8:25:33 UTC - in response to Message 93830.  
Last modified: 8 Apr 2020, 8:27:59 UTC

8G RAM? Well, that means it can only be run on x64 systems. OK, bye-bye Rosetta.
You need up to 1.3GB RAM per Task you run.
So even a system with less than 4GB can run 2 Tasks.
Grant
Darwin NT
ID: 93835 · Rating: 0
P35W4EKYw1tBcC12bsEDD2gQ68B
Joined: 12 Nov 16
Posts: 5
Credit: 2,729,332
RAC: 0
Message 93838 - Posted: 8 Apr 2020, 8:59:21 UTC

Hi there!

I am having calculation error problems too. I have two older 45nm Opteron servers, each with 24GB RAM and 12 threads. One can only run 5 Rosetta threads at the moment; all others fail. Both run Linux on old HDDs. The other server now has 2 running tasks, and I can report back later when its WCG projects are done, but I've already seen some errors.

My 32nm Xeon servers, which have less RAM per thread, are running fine with the latest Rosetta version, and so are my Ryzen-based computers.
ID: 93838 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1366
Credit: 13,624,788
RAC: 0
Message 93839 - Posted: 8 Apr 2020, 9:04:52 UTC - in response to Message 93838.  

I am having calculation error problems too.
With your systems hidden it's difficult to help, but there are faulty Rosetta Mini Tasks about at present that fail pretty much instantly.
Grant
Darwin NT
ID: 93839 · Rating: 0

©2022 University of Washington
https://www.bakerlab.org