Problems and Technical Issues with Rosetta@home

Peter Hucker of the Scottish Boinc Team
Joined: 12 Aug 06
Posts: 1313
Credit: 5,983,350
RAC: 11,707
Message 104310 - Posted: 18 Jan 2022, 19:52:49 UTC - in response to Message 104309.  

I never touched Science United because I like to choose exactly what I do. But I assumed they'd at least give you the genres you asked for.

I never use built-in GPUs (I assume you mean as part of the Ryzen CPU?) because I find they take some speed away from the main CPU and don't actually get through any more work than the CPU alone would. There's enough CPU-only stuff to do that I don't need to use the GPU part.
ID: 104310 · Rating: 0
.clair.
Joined: 2 Jan 07
Posts: 185
Credit: 23,192,571
RAC: 1,182
Message 104311 - Posted: 18 Jan 2022, 20:58:25 UTC - in response to Message 104308.  

Here are their numbers: https://boinc.netsoft-online.com/e107_plugins/boinc/bp.php?project=6
Mostly going down

I had seen the netsoft `fallout`.
Rosetta has fallen from grace.
There is bound to be a horse named Grace somewhere.
ID: 104311 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5570
Credit: 5,560,753
RAC: 586
Message 104312 - Posted: 18 Jan 2022, 23:27:30 UTC - in response to Message 104309.  

I had also selected Biology, but I would still get only Milkyway and Asteroid. I have two GPU boards running Einstein, a GTX 1060 with 3 GB RAM and a GTX 1650 with 4 GB RAM. My latest PC has an AMD Ryzen 4500U which has graphics capabilities, but it is slower than the two GTX boards. There was a period when Gravitational Waves GPU tasks required more than 3 GB RAM and I had to buy the GTX 1650 board.
Tullio



I ran Einstein with a 1050 as my most powerful GPU back when I started; now I can use a 1080 I got a few years ago.
These work fine for what I want to do.
I've already put enough money into this system, no need to upgrade anything that still works.

I think I have about 500 sitting here next to me. A new full case to handle the expanded radiator set me back a good chunk.
New CPU hurt. Burned the other one.
The 1080 was second-hand from a graphics design company server and was cheap as far as GPUs go.
The 1050 has been with me forever.
New MOBO and new digital PSU from a while back. I burned up one of those cheaper ones. Forgot who that was by.
It all adds up.
As much as I envy the Xeon guys, there is no way I will ever be able to afford that.
My dream is a Threadripper, but then that's a new MOBO again. Forget it.


I chose this project and Einstein because they are both based in my old home state of Washington (not DC).
LIGO is out at Hanford, and that is about 80 minutes or so from my parents' place.
This one, well, I used to live in Seattle, so I read about it in the Seattle Times and joined up.
It's a shame to see them shove us off to the side, but I guess that's what happens when you get to be big time.
ID: 104312 · Rating: 0
tullio
Joined: 10 May 20
Posts: 63
Credit: 617,972
RAC: 423
Message 104314 - Posted: 19 Jan 2022, 7:41:58 UTC

My son lives a few miles from the Virgo interferometer in Cascina, Pisa, one of the five active gravitational-wave detectors, but I live near Milano and could not visit it.
Tullio
ID: 104314 · Rating: 0
tullio
Joined: 10 May 20
Posts: 63
Credit: 617,972
RAC: 423
Message 104316 - Posted: 19 Jan 2022, 16:00:33 UTC

Rosetta 4.20 here again. I am running 5 of them and 1 rosetta python.
Tullio
ID: 104316 · Rating: 0
Killersocke@rosetta
Joined: 13 Nov 06
Posts: 29
Credit: 2,478,213
RAC: 6
Message 104317 - Posted: 19 Jan 2022, 16:55:40 UTC

still no tasks here :-(
ID: 104317 · Rating: 0
Peter Hucker of the Scottish Boinc Team
Joined: 12 Aug 06
Posts: 1313
Credit: 5,983,350
RAC: 11,707
Message 104319 - Posted: 19 Jan 2022, 17:29:11 UTC - in response to Message 104316.  
Last modified: 19 Jan 2022, 17:30:11 UTC

Rosetta 4.20 here again. I am running 5 of them and 1 rosetta python.
Tullio
They must come in regular small bursts, because there's always a fair amount running according to server status. I've only got 5 pythons on the only computer that will run them. For some reason it refuses to run 6 (it has 6 cores), even if there's loads of RAM left.
ID: 104319 · Rating: 0
tullio
Joined: 10 May 20
Posts: 63
Credit: 617,972
RAC: 423
Message 104320 - Posted: 19 Jan 2022, 17:45:11 UTC

I can run 2 rosetta pythons at most on my 12 GB RAM. If there is a third, it will be waiting for memory. My Intel i5 9400F has 6 cores, that is, six processors.
Tullio
ID: 104320 · Rating: 0
Peter Hucker of the Scottish Boinc Team
Joined: 12 Aug 06
Posts: 1313
Credit: 5,983,350
RAC: 11,707
Message 104322 - Posted: 19 Jan 2022, 18:03:26 UTC - in response to Message 104320.  

I can run 2 rosetta pythons at most on my 12 GB RAM. If there is a third, it will be waiting for memory. My Intel i5 9400F has 6 cores, that is, six processors.
Tullio
You could be right, they probably ask for more RAM than they actually use, just in case. I forgot that machine with 6 cores only had 16GB. I stole some of it to put in my new Ryzen. Must have 64GB on my gaming machine! It's currently running 5 pythons using 11.5/16GB. But it doesn't say "waiting for memory" like I've seen before. It just doesn't start them. And I'm sure I've seen it using under half the memory and still not starting one.
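The "ask for more RAM than they actually use" idea can be put into rough numbers. This is only a sketch under stated assumptions: it supposes the client reserves each task's full declared memory bound against whatever share of RAM the memory-usage preference allows. The 3 GB bound and the two fractions below are hypothetical values picked to mimic the counts reported in this thread, not figures taken from the project.

```python
import math

def max_concurrent_tasks(total_ram_gb, per_task_bound_gb, ram_fraction, cpu_slots=None):
    """Estimate how many tasks fit if each one reserves its full declared bound.

    per_task_bound_gb and ram_fraction are assumptions for illustration only.
    """
    by_memory = math.floor(total_ram_gb * ram_fraction / per_task_bound_gb)
    if cpu_slots is None:
        return by_memory
    # Never run more tasks than there are CPU slots, even with RAM to spare.
    return min(by_memory, cpu_slots)

# Hypothetical: 12 GB machine allowed to use 50% of RAM, 3 GB declared per task.
print(max_concurrent_tasks(12, 3.0, 0.5))
# Hypothetical: 16 GB machine allowed to use 95% of RAM, 6 CPU slots.
print(max_concurrent_tasks(16, 3.0, 0.95, cpu_slots=6))
```

Under this model a task can be held back while actual memory use looks low, because the limit is set by the declared bound rather than by what the running tasks really consume.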
ID: 104322 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5570
Credit: 5,560,753
RAC: 586
Message 104324 - Posted: 19 Jan 2022, 19:55:26 UTC

This is weird https://boinc.bakerlab.org/rosetta/result.php?resultid=1464088106
1.5 days processing for 20 minutes or so CPU time.

So here's the breakdown:

2022-01-18 10:33:11 (15556): Status Report: Elapsed Time: '15314.521130'
2022-01-18 10:33:11 (15556): Status Report: CPU Time: '29.109375'


2022-01-18 00:17:37 (15156): Creating new snapshot for VM.
2022-01-18 00:17:42 (15156): Deleting stale snapshot.
2022-01-18 00:17:43 (15156): Checkpoint completed.
2022-01-18 00:21:45 (15156): VM state change detected. (old = 'running', new = 'paused')
2022-01-18 00:22:01 (15156): Powering off VM.
2022-01-18 00:22:01 (15156): Successfully stopped VM

(End of my day, so I shut down: suspend, shut down the client (leave in memory), exit BOINC.)

Now I restart:
2022-01-18 08:12:38 (15556): VM state change detected. (old = 'poweredoff', new = 'running')
2022-01-18 08:12:38 (15556): Status Report: Elapsed Time: '9314.493395'
2022-01-18 08:12:38 (15556): Status Report: CPU Time: '18.328125'
2022-01-18 08:12:38 (15556): Preference change detected
2022-01-18 08:12:38 (15556): Setting CPU throttle for VM. (100%)
2022-01-18 08:12:38 (15556): Setting checkpoint interval to 600 seconds. (Higher value of (Preference: 180 seconds) or (Vbox_job.xml: 600 seconds))
2022-01-18 08:32:02 (15556): Creating new snapshot for VM.
2022-01-18 08:32:12 (15556): Deleting stale snapshot.

Then this point:
2022-01-18 10:33:11 (15556): Status Report: Elapsed Time: '15314.521130'
2022-01-18 10:33:11 (15556): Status Report: CPU Time: '29.109375'

Here is 6 hrs:
2022-01-18 12:31:57 (15556): Status Report: Elapsed Time: '21314.549383'
2022-01-18 12:31:57 (15556): Status Report: CPU Time: '39.125000'
2022-01-18 12:37:42 (15556): Creating new snapshot for VM.
2022-01-18 12:37:43 (15556): Deleting stale snapshot.

2022-01-18 14:28:55 (15556): Status Report: Elapsed Time: '27314.711735'
2022-01-18 14:28:55 (15556): Status Report: CPU Time: '49.218750'

2022-01-18 16:09:27 (15556): Status Report: Elapsed Time: '33315.182032'
2022-01-18 16:09:27 (15556): Status Report: CPU Time: '59.093750'

2022-01-18 18:11:20 (15556): Status Report: Elapsed Time: '39315.521685'
2022-01-18 18:11:20 (15556): Status Report: CPU Time: '68.562500'

Something went nuts, but does not show up in the report:

2022-01-18 19:27:47 (15556): Checkpoint completed.
2022-01-18 19:33:12 (11508): Detected: vboxwrapper 26202
2022-01-18 19:33:12 (11508): Detected: BOINC client v7.16.20
2022-01-18 19:33:13 (11508): Detected: VirtualBox VboxManage Interface (Version: 6.1.30)
2022-01-18 19:33:13 (11508): Feature: Checkpoint interval offset (88 seconds)
2022-01-18 19:33:13 (11508): Detected: Minimum checkpoint interval (600.000000 seconds)
2022-01-18 19:33:13 (11508): Restore from previously saved snapshot.
2022-01-18 19:33:14 (11508): Restore completed.


2022-01-18 19:33:19 (11508): Status Report: Elapsed Time: '43879.012785'
2022-01-18 19:33:19 (11508): Status Report: CPU Time: '75.468750'


2022-01-18 21:13:48 (11508): Status Report: Elapsed Time: '49879.776962'
2022-01-18 21:13:48 (11508): Status Report: CPU Time: '86.453125'

2022-01-18 22:59:05 (11508): Status Report: Elapsed Time: '55880.065147'
2022-01-18 22:59:05 (11508): Status Report: CPU Time: '96.125000'

2022-01-19 00:02:14 (11508): VM state change detected. (old = 'running', new = 'paused')
2022-01-19 00:02:44 (11508): Powering off VM.
2022-01-19 00:02:44 (11508): Successfully stopped VM.

*End of day 1*


Start day 2

2022-01-19 07:58:26 (16032): VM state change detected. (old = 'poweredoff', new = 'running')
2022-01-19 07:58:26 (16032): Status Report: Elapsed Time: '58981.617149'
2022-01-19 07:58:26 (16032): Status Report: CPU Time: '100.656250'

2022-01-19 10:00:34 (16032): Status Report: Elapsed Time: '64981.857656'
2022-01-19 10:00:34 (16032): Status Report: CPU Time: '112.250000'

2022-01-19 11:46:01 (16032): Status Report: Elapsed Time: '70982.433000'
2022-01-19 11:46:01 (16032): Status Report: CPU Time: '122.140625'

2022-01-19 13:26:46 (16032): Status Report: Elapsed Time: '76982.663074'
2022-01-19 13:26:46 (16032): Status Report: CPU Time: '132.531250'

2022-01-19 15:11:43 (16032): Status Report: Elapsed Time: '82982.833196'
2022-01-19 15:11:43 (16032): Status Report: CPU Time: '142.390625'

2022-01-19 17:17:08 (16032): Status Report: Elapsed Time: '88982.986887'
2022-01-19 17:17:08 (16032): Status Report: CPU Time: '152.312500'

2022-01-19 19:05:11 (16032): Status Report: Elapsed Time: '94983.557718'
2022-01-19 19:05:11 (16032): Status Report: CPU Time: '161.968750'

This is where I take the time to look and see how things are going and say WTF! 2 days! Come on! ABORT
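The mismatch in the log above can be quantified by pairing each "Elapsed Time" line with the "CPU Time" line that follows it. The sketch below assumes the line layout shown in this post (timestamp, process ID in parentheses, then "Status Report: ..."); the function name and the sample are illustrative and not part of vboxwrapper itself.

```python
import re

# Matches lines such as:
# 2022-01-18 10:33:11 (15556): Status Report: Elapsed Time: '15314.521130'
STATUS_RE = re.compile(
    r"^(?P<ts>\S+ \S+) \(\d+\): "
    r"Status Report: (?P<kind>Elapsed Time|CPU Time): '(?P<value>[\d.]+)'"
)

def cpu_efficiency(log_text):
    """Return (timestamp, elapsed_s, cpu_s, cpu_s / elapsed_s) per report pair."""
    rows = []
    elapsed = None
    for line in log_text.splitlines():
        m = STATUS_RE.match(line.strip())
        if not m:
            continue  # snapshot, checkpoint and state-change lines are ignored
        value = float(m.group("value"))
        if m.group("kind") == "Elapsed Time":
            elapsed = value
        elif elapsed is not None:
            rows.append((m.group("ts"), elapsed, value, value / elapsed))
            elapsed = None
    return rows

sample = """\
2022-01-18 10:33:11 (15556): Status Report: Elapsed Time: '15314.521130'
2022-01-18 10:33:11 (15556): Status Report: CPU Time: '29.109375'
2022-01-19 19:05:11 (16032): Status Report: Elapsed Time: '94983.557718'
2022-01-19 19:05:11 (16032): Status Report: CPU Time: '161.968750'
"""

for ts, elapsed, cpu, ratio in cpu_efficiency(sample):
    print(f"{ts}  elapsed={elapsed / 3600:6.2f} h  cpu={cpu:7.1f} s  efficiency={ratio:.3%}")
```

On the last report quoted here, that is about 162 seconds of CPU time over roughly 26.4 hours of elapsed time, so the task was doing useful work well under 1% of the time.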
ID: 104324 · Rating: 0
Peter Hucker of the Scottish Boinc Team
Joined: 12 Aug 06
Posts: 1313
Credit: 5,983,350
RAC: 11,707
Message 104325 - Posted: 19 Jan 2022, 20:39:54 UTC - in response to Message 104324.  

Welcome to the club. ALL tasks for 6 of my machines do that. 1 in 50 tasks for my "good" machine do that. Whatever the bug is, it can be visible sometimes on some hardware and always on other hardware. I think we can't see enough information unless we're inside the VM. Is that possible?

And wow, you were in bed for under 8 hours.
ID: 104325 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5570
Credit: 5,560,753
RAC: 586
Message 104327 - Posted: 19 Jan 2022, 23:08:23 UTC - in response to Message 104325.  

And wow, you were in bed for under 8 hours.


Yeah, and I am paying for that.

I didn't think of looking in the VM for info.
If I get stuck next time, I'll have a look.
ID: 104327 · Rating: 0
.clair.
Joined: 2 Jan 07
Posts: 185
Credit: 23,192,571
RAC: 1,182
Message 104328 - Posted: 20 Jan 2022, 1:30:30 UTC

They make snapshots even if the app is stuck in a loop going nowhere.
I have seen 30+ snapshots on these time wasters with only 5 minutes of CPU time.

By the way, what happened to the 700,000 workunits that vanished from the front page queue?
It's down to only 1.8 million.
Are they trying to find the buggy ones?
ID: 104328 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 1864
Credit: 34,366,447
RAC: 7,035
Message 104329 - Posted: 20 Jan 2022, 3:11:44 UTC - in response to Message 104291.  

Just checking in because I had a fair few Rosetta 4.20 tasks come down.
But I think they already ran out...

I'm useful like that
YOU!! You stole them! I wanted those. I'm going to hunt you down, and I mean physically!

I actually did. Full buffer on both machines I have near me before mentioning it.
No need to thank me.
I'll make tea - do you take sugar?
ID: 104329 · Rating: 0
Peter Hucker of the Scottish Boinc Team
Joined: 12 Aug 06
Posts: 1313
Credit: 5,983,350
RAC: 11,707
Message 104335 - Posted: 20 Jan 2022, 17:18:23 UTC - in response to Message 104329.  

Just checking in because I had a fair few Rosetta 4.20 tasks come down.
But I think they already ran out...

I'm useful like that
YOU!! You stole them! I wanted those. I'm going to hunt you down, and I mean physically!

I actually did. Full buffer on both machines I have near me before mentioning it.
No need to thank me.
I'll make tea - do you take sugar?
I don't like hot drinks. Orange juice or vodka please, or both.
ID: 104335 · Rating: 0
Peter Hucker of the Scottish Boinc Team
Joined: 12 Aug 06
Posts: 1313
Credit: 5,983,350
RAC: 11,707
Message 104336 - Posted: 20 Jan 2022, 17:18:52 UTC - in response to Message 104328.  

They make snapshots even if the app is stuck in a loop going nowhere.
I have seen 30+ snapshots on these time wasters with only 5 minutes of CPU time.

By the way, what happened to the 700,000 workunits that vanished from the front page queue?
It's down to only 1.8 million.
Are they trying to find the buggy ones?
Interesting, now up to 2.2 million. I'll try grabbing some and see what happens.
ID: 104336 · Rating: 0
Peter Hucker of the Scottish Boinc Team
Joined: 12 Aug 06
Posts: 1313
Credit: 5,983,350
RAC: 11,707
Message 104339 - Posted: 20 Jan 2022, 17:46:00 UTC - in response to Message 104336.  

Well, that didn't work. I tried 3 machines. Two of them failed pythons (no CPU time) and the other took four 4.20 tasks and got a computation error!
ID: 104339 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5570
Credit: 5,560,753
RAC: 586
Message 104341 - Posted: 20 Jan 2022, 19:05:46 UTC - in response to Message 104336.  
Last modified: 20 Jan 2022, 19:18:05 UTC

They make snapshots even if the app is stuck in a loop going nowhere.
I have seen 30+ snapshots on these time wasters with only 5 minutes of CPU time.

By the way, what happened to the 700,000 workunits that vanished from the front page queue?
It's down to only 1.8 million.
Are they trying to find the buggy ones?
Interesting, now up to 2.2 million. I'll try grabbing some and see what happens.



Ignore that big fancy number on the front page.
That is what they have in the queue for both the AI and RAH, of which 99% are AI tasks.

Go one layer deeper, where it breaks down 4.20 and python.

This is the real number for us lowly PC crunchers:
Application               Unsent   In progress   Runtime of last 100 tasks in hours: average (min - max)   Users in last 24 hours
Rosetta                        0         61887   6.62 (0.28 - 51.23)                                        2600
rosetta python projects     4999         13547   4.59 (0.71 - 57.86)                                        1059
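For anyone who wants to watch the per-application numbers rather than the front-page total, rows in the form quoted above (name, unsent, in progress, average runtime with min and max in parentheses, users) can be pulled apart with a regular expression. This is a sketch that assumes exactly that row layout; the function name is made up for illustration, and fetching the status page itself is left out.

```python
import re

# name, unsent, in progress, "avg (min - max)", users; whitespace-separated
ROW_RE = re.compile(
    r"^(?P<app>.+?)\s+(?P<unsent>\d+)\s+(?P<in_progress>\d+)\s+"
    r"(?P<avg>[\d.]+)\s+\((?P<min>[\d.]+)\s*-\s*(?P<max>[\d.]+)\)\s+(?P<users>\d+)$"
)

def parse_status_rows(text):
    """Parse status rows into dicts; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if m:
            rows.append({
                "app": m.group("app"),
                "unsent": int(m.group("unsent")),
                "in_progress": int(m.group("in_progress")),
                "avg_hours": float(m.group("avg")),
                "users": int(m.group("users")),
            })
    return rows

sample = """\
Rosetta 0 61887 6.62 (0.28 - 51.23) 2600
rosetta python projects 4999 13547 4.59 (0.71 - 57.86) 1059
"""

for row in parse_status_rows(sample):
    state = "nothing to send" if row["unsent"] == 0 else f'{row["unsent"]} ready to send'
    print(f'{row["app"]}: {state}, {row["in_progress"]} in progress')
```

The "unsent" column is the one that matters for getting work: with 0 unsent for Rosetta, a request for 4.20 tasks comes back empty no matter how large the front-page queue looks.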
ID: 104341 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5570
Credit: 5,560,753
RAC: 586
Message 104342 - Posted: 20 Jan 2022, 19:06:28 UTC

Check this out from a 4.20 task today:

<core_client_version>7.16.20</core_client_version>
<![CDATA[
<message>
Incorrect function.
(0x1) - exit code 1 (0x1)</message>
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.20_windows_x86_64.exe @rb_01_17_185861_181891_ab_t000__robetta_FLAGS -in::file::fasta t000_.fasta -jumps:pairing_file t000_.fasta.bbcontacts.jumps -jumps:random_sheets 1 -constraints::cst_file t000_.fasta.CB.cst -constraints:cst_weight 5.0 -constraints::cst_fa_file t000_.fasta.MIN.cst -constraints:cst_fa_weight 5.0 -in:file:boinc_wu_zip rb_01_17_185861_181891_ab_t000__robetta.zip -frag3 rb_01_17_185861_181891_ab_t000__robetta.200.3mers.index.gz -fragA rb_01_17_185861_181891_ab_t000__robetta.200.9mers.index.gz -fragB rb_01_17_185861_181891_ab_t000__robetta.200.5mers.index.gz -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 1484534
Using database: database_357d5d93529_n_methylminirosetta_database

[ ERROR ]: Caught exception:


File: C:cygwin64homeboinc4.17Rosettamainsourcesrccore/pack/dunbrack/SingleResidueDunbrackLibrary.hh:306
chi angle must be between -180 and 180: -nan(ind)
------------------------ Begin developer's backtrace -------------------------
BACKTRACE:
------------------------- End developer's backtrace --------------------------


AN INTERNAL ERROR HAS OCCURED. PLEASE SEE THE CONTENTS OF ROSETTA_CRASH.log FOR DETAILS.



</stderr_txt>
]]>

Gees...really?!?!?
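The "-nan(ind)" in that message is how the Microsoft C runtime prints an indeterminate NaN, so the range check was handed a value that is not a number at all, most likely the result of an earlier invalid calculation. The sketch below is not Rosetta's code (Rosetta is C++); it is a hypothetical Python check with the same wording, showing why NaN always fails a "between -180 and 180" test.

```python
import math

def check_chi(chi_degrees):
    """Raise ValueError unless chi_degrees lies in [-180, 180]."""
    # Written as a positive range test: every ordered comparison with NaN
    # is False, so NaN falls through to the error branch.
    if not (-180.0 <= chi_degrees <= 180.0):
        raise ValueError(f"chi angle must be between -180 and 180: {chi_degrees}")
    return chi_degrees

nan = float("nan")
print(nan < -180.0, nan > 180.0)  # False False: NaN is neither too low nor too high
print(math.isnan(nan))            # True

try:
    check_chi(nan)
except ValueError as err:
    print(err)
```

So the angle was not slightly out of range; the task had already produced garbage before the check ran, which points at the work unit or the application rather than anything on the cruncher's side.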
ID: 104342 · Rating: 0


©2022 University of Washington
https://www.bakerlab.org