Problems and Technical Issues with Rosetta@home

Jean-David Beyer
Joined: 2 Nov 05
Posts: 196
Credit: 6,613,600
RAC: 6,755
Message 105506 - Posted: 18 Mar 2022, 2:13:59 UTC

Hurray! I got a bunch of new tasks a few hours ago and six are currently running, and all have more than three hours on them instead of the 40 seconds or less I have gotten lately.
ID: 105506 · Rating: 0

Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 2,018
Message 105509 - Posted: 18 Mar 2022, 8:59:50 UTC - in response to Message 105506.  

Hurray! I got a bunch of new tasks a few hours ago and six are currently running, and all have more than three hours on them instead of the 40 seconds or less I have gotten lately.


That's because you received tasks from the Robetta server. Which means those are not from the Rosetta@home/Baker lab team but from someone else.
(Robetta is a public server which allows researchers from around the world to submit jobs that will either run on Rosetta@home [if they run on Rosetta 4.20] or on the Baker Lab computational resources [if they run RoseTTAFold]).
ID: 105509 · Rating: 0

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 105510 - Posted: 18 Mar 2022, 9:34:49 UTC - in response to Message 105509.  

I've never really looked at how to tell the difference.

What's in the task title that shows it's robetta and not something from the lab?

And then the CASP stuff, or what's that institute that Dr. B. has founded? Where is their stuff? Or does that not come here?
ID: 105510 · Rating: 0

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 105511 - Posted: 18 Mar 2022, 9:37:46 UTC - in response to Message 105510.  

I've never really looked at how to tell the difference.

What's in the task title that shows it's robetta and not something from the lab?

And then the CASP stuff, or what's that institute that Dr. B. has founded? Where is their stuff? Or does that not come here?
For Robetta, the task name begins RB.
ID: 105511 · Rating: 0

Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 2,018
Message 105512 - Posted: 18 Mar 2022, 10:12:56 UTC - in response to Message 105511.  

I've never really looked at how to tell the difference.

What's in the task title that shows it's robetta and not something from the lab?

And then the CASP stuff, or what's that institute that Dr. B. has founded? Where is their stuff? Or does that not come here?
For Robetta, the task name begins RB.



As for CASP, I believe those are submitted via Robetta under the username "casp".
As for the Institute for Protein Design, I believe there was a post from an admin/scientist back in the early days of the pandemic in which it was said that Rosetta@home was being given more attention because of the massive increase in resources - in any case, if anyone from one of the other labs at the IPD needs to run something on Rosetta@home, it's probably fairly straightforward to do so.
ID: 105512 · Rating: 0

[VENETO] boboviz
Joined: 1 Dec 05
Posts: 2002
Credit: 9,787,940
RAC: 5,329
Message 105513 - Posted: 18 Mar 2022, 10:40:42 UTC - in response to Message 105512.  

As for CASP, I believe those are submitted via Robetta under the username "casp".


Here we can see the queue
ID: 105513 · Rating: 0

Jean-David Beyer
Joined: 2 Nov 05
Posts: 196
Credit: 6,613,600
RAC: 6,755
Message 105514 - Posted: 18 Mar 2022, 12:46:17 UTC - in response to Message 105509.  

Hurray! I got a bunch of new tasks a few hours ago and six are currently running, and all have more than three hours on them instead of the 40 seconds or less I have gotten lately.

That's because you received tasks from the Robetta server. Which means those are not from the Rosetta@home/Baker lab team but from someone else.
(Robetta is a public server which allows researchers from around the world to submit jobs that will either run on Rosetta@home [if they run on Rosetta 4.20] or on the Baker Lab computational resources [if they run RoseTTAFold]).


That seems to be true. Mine are all rb_03_17_...
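
For anyone who wants to sort a task list by origin, the naming rule discussed here can be sketched roughly as below. This only assumes what has been said in this thread (Robetta task names start with "rb", as in rb_03_17_...); the function name and the sample task names are made up for illustration.

```python
def is_robetta_task(task_name: str) -> bool:
    """Return True if the name follows the Robetta prefix described in this thread."""
    # Assumption: Robetta work is named rb_<month>_<day>_..., going only by the posts here.
    return task_name.lower().startswith("rb_")


# Hypothetical task names, for illustration only.
sample = ["rb_03_17_example_task", "some_other_example_task"]
robetta = [name for name in sample if is_robetta_task(name)]
print(robetta)
```

Matching is case-insensitive here because the prefix has been written both as RB and rb in this thread.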

Does that mean that those using the Robetta server are more careful about what they offer us clients than those from the Rosetta@home/Baker lab team?
ID: 105514 · Rating: 0

MStenholm
Joined: 18 Apr 20
Posts: 18
Credit: 26,615,100
RAC: 19,972
Message 105525 - Posted: 18 Mar 2022, 21:10:52 UTC - in response to Message 105514.  

rb jobs seem to come from a wide variety of people. The memory required can reach 1.6 GB, though 300-500 MB is more normal. The current rb_03_18 batch is easy on the memory and on the CPU, and as a result the points are less than half of rb_03_17. Power draw at the wall is 155 W instead of my normal 230 W, and the CPU peaks at 55 °C, not the normal 63 °C (3900X with good 480 mm cooling). In future I'm tempted to do a 1-hour test run and then pass the easy jobs to others. Due to the lack of jobs I will let the remaining ones run to completion, but Rosetta has been a mixed bag of fun at best recently. WCG will be back soon...
ID: 105525 · Rating: 0

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 105527 - Posted: 18 Mar 2022, 23:21:34 UTC - in response to Message 105525.  

If you have virtualization and want to test your system, enable Python tasks, combine that with LHC ATLAS and your system will get a good work out.
ID: 105527 · Rating: 0

MStenholm
Joined: 18 Apr 20
Posts: 18
Credit: 26,615,100
RAC: 19,972
Message 105530 - Posted: 19 Mar 2022, 2:25:42 UTC - in response to Message 105527.  

If you have virtualization and want to test your system, enable Python tasks, combine that with LHC ATLAS and your system will get a good work out.


After reading about your and others' problems with Python, and given my 16 GB machine and my problem with programmers who don't give a f..k about our equipment, that will never happen. I like that I can run decent timings on my memory, and there is no way I will invest $500 in RAM for the sole purpose of bringing my SSD to a premature death.
ID: 105530 · Rating: 0

MStenholm
Joined: 18 Apr 20
Posts: 18
Credit: 26,615,100
RAC: 19,972
Message 105531 - Posted: 19 Mar 2022, 2:25:45 UTC - in response to Message 105527.  
Last modified: 19 Mar 2022, 2:26:52 UTC

Empty
ID: 105531 · Rating: 0

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 105533 - Posted: 19 Mar 2022, 9:15:07 UTC - in response to Message 105530.  

If you have virtualization and want to test your system, enable Python tasks, combine that with LHC ATLAS and your system will get a good work out.


After reading about your and others' problems with Python, and given my 16 GB machine and my problem with programmers who don't give a f..k about our equipment, that will never happen. I like that I can run decent timings on my memory, and there is no way I will invest $500 in RAM for the sole purpose of bringing my SSD to a premature death.



There were teething problems with Python. We got those sorted out.
The SSD premature death is hype in my opinion.
I've been running BOINC with various projects that write all the time to one of my SSDs for the last 5 years, and it is still in good health. Samsung SSDs are very rugged. I am just now at 65.1 TB of writes. According to web information, I still have at least another 60 TB to go before I would have to replace this drive. So that is about 10 years of writes in total. But it could go longer, since it is over-provisioned and has some unused cells.
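
To make the arithmetic behind that estimate explicit, here is a rough sketch using only the figures above (65.1 TB written in 5 years, at least 60 TB left). It assumes the write rate stays constant, which is a simplification, and the function name is made up for illustration.

```python
def years_remaining(written_tb: float, years_in_use: float, remaining_tb: float) -> float:
    """Estimate remaining drive life, assuming the past write rate continues unchanged."""
    rate_tb_per_year = written_tb / years_in_use  # average writes per year so far
    return remaining_tb / rate_tb_per_year


left = years_remaining(written_tb=65.1, years_in_use=5, remaining_tb=60)
print(f"{left:.1f} more years, {5 + left:.1f} years in total")
# prints "4.6 more years, 9.6 years in total"
```

So at the current rate the drive has roughly 4.6 years left, and the "about 10 years" figure is the whole service life rather than the time remaining.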
As for $500 in RAM, I spent about 150 to upgrade my RAM to handle 15 Pythons at one time if it came to that.

I have put a fair bit of money into this machine over the years. It's not the fastest out there, but it is a good build for what I want to participate in, and this is the final build.
48 GB RAM, 16 cores, 2 SSDs (1 dedicated to BOINC data, 1 for Windows that formerly held BOINC as well), 1 HDD for storage, one of the top liquid cooling systems on the market, and a 550 W digital power supply. I run 2 GPU projects (until WCG comes back) and 4 CPU projects, plus FAH. That's a fair amount of data being transferred, and yet my drive hasn't given up.

LHC ATLAS is a bit of a memory hog, but it's much more stable than Python and much more mature.
The group has a lot of gurus in it, including one from here who also runs a selection of projects similar to mine.

I can't tell you why the RAH group all of a sudden disappeared from here. I guess their funding does not allow for a person to monitor this board. That, or they figure they will just see any problems in the results. But the lack of response from their office, or from whichever member of staff is the tech person, is not right. They seem to be isolated.

BTW, have you looked at SIdock or TN-Grid or Quchempedia?
Quchem is a Vbox project. But I had troubles keeping their work stable with all my other projects.
I run SIdock and TN-Grid as well and they are both standard CPU projects and very rock solid and don't require that much memory.
ID: 105533 · Rating: 0

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 105534 - Posted: 19 Mar 2022, 9:16:28 UTC - in response to Message 105531.  

Empty



Next time, when you edit, just erase everything, type in two spaces, and then save your post. The server will automatically delete it.
ID: 105534 · Rating: 0

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 105535 - Posted: 19 Mar 2022, 13:17:10 UTC - in response to Message 105533.  

An additional comment: the Pythons are pretty much faultless now. I have not had any error out on me.
That is from a much larger database not within Baker lab.

As for the 4.20 stuff, the RB tasks seem OK; that's external work. The non-RB tasks can be hit and miss. For instance, one group (preetham_gen_ etc.) had one specific bug.

It's annoying to run something that errors out. In the past they would have heard about it via the mod and pulled it or fixed it. Now we have to burn through all of the tasks, and I guess they take what works and toss the rest.
Stupid, but that's how they work now. Thing is, 4.20 tasks are not that frequent anymore, and when they do appear, they are consumed quickly. I think I ran about 12 or 16 of these in total. They all errored out quickly, so not a lot of wasted time. Annoying? Yes. But that's just part of the package now.
ID: 105535 · Rating: 0

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 105536 - Posted: 19 Mar 2022, 13:24:30 UTC - in response to Message 105535.  
Last modified: 19 Mar 2022, 13:24:59 UTC

An additional comment: the Pythons are pretty much faultless now. I have not had any error out on me.
I have 7 computers and despite trying every version of VirtualBox, and stopping AVG messing with virtual machines (which fixed LHC), I can only run Pythons on 2 of the 7. The others just don't process: wall time passes, but CPU time stops at about 20 seconds.
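
The symptom (wall time keeps going, CPU time stuck) can be turned into a simple check. The sketch below is only an illustration: the function name and both thresholds are made-up assumptions, and the two numbers would have to be read off by hand, for example from the CPU time and elapsed time shown in a task's properties.

```python
def looks_stalled(cpu_seconds: float, elapsed_seconds: float,
                  min_elapsed: float = 600.0, min_ratio: float = 0.05) -> bool:
    """Flag a task whose CPU time is tiny compared with its elapsed (wall) time."""
    # Too early to judge: a task that has just started has little CPU time anyway.
    if elapsed_seconds < min_elapsed:
        return False
    # Hypothetical threshold: under 5% CPU use over the whole run counts as stalled.
    return cpu_seconds / elapsed_seconds < min_ratio


print(looks_stalled(20, 3600))    # CPU stuck at about 20 s after an hour -> True
print(looks_stalled(3500, 3600))  # CPU time tracking wall time -> False
```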
ID: 105536 · Rating: 0

kotenok2000
Joined: 22 Feb 11
Posts: 272
Credit: 507,897
RAC: 334
Message 105537 - Posted: 19 Mar 2022, 13:35:03 UTC - in response to Message 105536.  
Last modified: 19 Mar 2022, 13:39:17 UTC

Can you open the VirtualBox GUI and press Show to attach the GUI screen to the running VM and look at what it is writing to the screen?
Then you can detach the GUI from the VM and switch to the next one.
You can post screenshots to imgur.
You can make screenshots with Win+PrintScreen.
They are saved to C:\Users\[username]\Pictures\Screenshots
ID: 105537 · Rating: 0

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 105538 - Posted: 19 Mar 2022, 14:40:25 UTC - in response to Message 105537.  
Last modified: 19 Mar 2022, 14:46:14 UTC

Can you open the VirtualBox GUI and press Show to attach the GUI screen to the running VM and look at what it is writing to the screen?
Then you can detach the GUI from the VM and switch to the next one.
You can post screenshots to imgur.
You can make screenshots with Win+PrintScreen.
They are saved to C:\Users\[username]\Pictures\Screenshots

https://imgur.com/a/ENKUzDk
Does that Intel error mean my CPU doesn't support the command being used? I can do LHC VB stuff OK on it. Is Rosetta using CPU extensions that this one doesn't have? It's a Xeon X5650. The two which DO run Python OK are the newest two.

Mind you, I looked up that error and similar ones I found suggest incorrect libraries, but how is that possible on some of my machines and not others? Aren't the libraries inside the VM?
ID: 105538 · Rating: 0

zxcvbob
Joined: 4 Jan 06
Posts: 8
Credit: 830,878
RAC: 0
Message 105539 - Posted: 19 Mar 2022, 14:48:52 UTC

I have one work-unit that has been running for days. I thought it was stuck yesterday morning at 99.1%, but it was still climbing *very* slowly. Today it is at 99.9+%; maybe it will be finished by tomorrow. It was due yesterday so I don't know if it will even be accepted. It's a virtual-box one. I'm trying now to figure out how to post a screen shot. Apparently I can't just add an attachment. And I can't copy text from the Boinc Manager. It's one of the aagb-mNMPHE ones.
ID: 105539 · Rating: 0

kotenok2000
Joined: 22 Feb 11
Posts: 272
Credit: 507,897
RAC: 334
Message 105541 - Posted: 19 Mar 2022, 14:53:17 UTC - in response to Message 105539.  
Last modified: 19 Mar 2022, 14:53:34 UTC

Drag and drop it onto imgur.com and use the img or url tag.
ID: 105541 · Rating: 0

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 105542 - Posted: 19 Mar 2022, 14:56:29 UTC - in response to Message 105539.  

I have one work-unit that has been running for days. I thought it was stuck yesterday morning at 99.1%, but it was still climbing *very* slowly. Today it is at 99.9+%; maybe it will be finished by tomorrow. It was due yesterday so I don't know if it will even be accepted. It's a virtual-box one. I'm trying now to figure out how to post a screen shot. Apparently I can't just add an attachment. And I can't copy text from the Boinc Manager. It's one of the aagb-mNMPHE ones.
That percentage is sometimes wrong. Is it actually using your CPU? Check in Task Manager. Mine appear to run in Boinc Manager, but they don't do any calculations and the CPU is idle. BoincTasks is better; it shows CPU usage.
ID: 105542 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org