Problems and Technical Issues with Rosetta@home

kotenok2000
Joined: 22 Feb 11
Posts: 220
Credit: 291,990
RAC: 1,126
Message 106043 - Posted: 25 Apr 2022, 17:40:26 UTC - in response to Message 106042.  

nanoHUB@Home too.
ID: 106043 · Rating: 0
tullio
Joined: 10 May 20
Posts: 63
Credit: 630,125
RAC: 0
Message 106044 - Posted: 25 Apr 2022, 19:35:23 UTC

I am using VirtualBox to run a Linux virtual machine with OpenSuSE Tumbleweed, a development version with kernel 5.17.3. It is updated often, so I have to reboot it frequently. It runs Einstein@home CPU tasks, since it cannot use the nVidia GTX 1060 board of its Windows 10 host, and QuChem, which being a Linux project does not need VirtualBox.
Tullio
ID: 106044 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,670,291
RAC: 2,328
Message 106045 - Posted: 25 Apr 2022, 21:11:02 UTC
Last modified: 25 Apr 2022, 21:13:04 UTC

I can't run QuChem.
It goes to 60% and becomes unstable.
VM job unmanageable message.
2022-04-25 20:38:52 (2024): VM state change detected. (old = 'running', new = 'paused')
2022-04-25 21:05:27 (2024): VM state change detected. (old = 'paused', new = 'running')
2022-04-25 21:06:03 (2024): Creating new snapshot for VM.
2022-04-25 21:06:11 (2024): Deleting stale snapshot.
2022-04-25 21:06:12 (2024): Checkpoint completed.
2022-04-25 21:10:16 (2024): VM state change detected. (old = 'running', new = 'paused')
2022-04-25 21:14:52 (2024): VM state change detected. (old = 'paused', new = 'running')
2022-04-25 21:16:38 (2024): VM state change detected. (old = 'running', new = 'paused')
2022-04-25 21:32:41 (2024): ERROR: Vboxwrapper lost communication with VirtualBox, rescheduling task for a later time.

And GPU gives me the exit child error on their new pythons.
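
The state changes in the log above can be totalled to see how long the VM sat paused before the wrapper gave up. This is only a rough sketch: it assumes the line format is exactly as quoted in this post, and the names `STATE_CHANGE`, `paused_intervals` and `total_paused` are made up for illustration, not part of vboxwrapper or BOINC.

```python
import re
from datetime import datetime, timedelta

# Matches only the "VM state change detected" lines as quoted in the post;
# every other log line is skipped. The format is an assumption taken from
# that excerpt, not from any vboxwrapper specification.
STATE_CHANGE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \(\d+\): "
    r"VM state change detected\. \(old = '(\w+)', new = '(\w+)'\)"
)

SAMPLE_LOG = """\
2022-04-25 20:38:52 (2024): VM state change detected. (old = 'running', new = 'paused')
2022-04-25 21:05:27 (2024): VM state change detected. (old = 'paused', new = 'running')
2022-04-25 21:06:03 (2024): Creating new snapshot for VM.
2022-04-25 21:06:11 (2024): Deleting stale snapshot.
2022-04-25 21:06:12 (2024): Checkpoint completed.
2022-04-25 21:10:16 (2024): VM state change detected. (old = 'running', new = 'paused')
2022-04-25 21:14:52 (2024): VM state change detected. (old = 'paused', new = 'running')
2022-04-25 21:16:38 (2024): VM state change detected. (old = 'running', new = 'paused')
2022-04-25 21:32:41 (2024): ERROR: Vboxwrapper lost communication with VirtualBox, rescheduling task for a later time.
"""


def paused_intervals(log_text):
    """Return (pause_start, resume_time) pairs for every completed pause."""
    intervals = []
    pause_start = None
    for line in log_text.splitlines():
        match = STATE_CHANGE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        new_state = match.group(3)
        if new_state == "paused":
            pause_start = stamp
        elif new_state == "running" and pause_start is not None:
            intervals.append((pause_start, stamp))
            pause_start = None
    return intervals


def total_paused(log_text):
    """Sum the length of all completed pauses (a pause with no resume is ignored)."""
    return sum(
        (end - start for start, end in paused_intervals(log_text)),
        timedelta(),
    )


if __name__ == "__main__":
    print(total_paused(SAMPLE_LOG))
```

On this excerpt the VM spends far more time paused than running, and the last pause never resumes before the "lost communication" error, which may point at the host suspending the task rather than at QuChem itself.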
ID: 106045 · Rating: 0
kotenok2000
Joined: 22 Feb 11
Posts: 220
Credit: 291,990
RAC: 1,126
Message 106046 - Posted: 25 Apr 2022, 21:16:57 UTC - in response to Message 106045.  

Maybe there isn't enough ram for python gpu?
ID: 106046 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 1966
Credit: 38,184,495
RAC: 10,704
Message 106047 - Posted: 25 Apr 2022, 21:51:42 UTC - in response to Message 106032.  

Looking at this message was a reminder to do all this.
No new .tmp files, freed up a few GB here too, grabbed Treeview but it's not telling me anything I expect to find useful, so I removed it again.
I've got BoincTasks but hadn't set it up to run at startup, which I've now done. Yes, very useful in finding tasks that are very far behind in CPU time compared to Elapsed time.
More useful when running VBox tasks compared to running plain Rosetta tasks - I'll keep this going now.
All good, ta
I assume you used treeSIZE. That one is great, it's like a windows explorer tree, but with sizes. I removed a whole load of stuff *I* had put there that I didn't need. Games I no longer played, films that can be archived onto the rust spinner, etc.

No idea how people can manage with just the plain Boinc Manager, it's absolutely horrid, especially if you have a lot of tasks. No colour coding, no grouping of queued tasks, etc. And with me having 7 computers, I really need a central controller. At least Folding at Home supplies such a thing, but I don't think the Boinc Manager will look at many computers easily.

Oops, yes Treesize. I can see how useful it might be, but I keep a pretty tight ship at the best of times, so no need for it here.
I'd used Boinctasks before, but prior to installing VirtualBox, and I didn't have the kind of problems that BoincTasks would solve back then, so it just seemed an unnecessary duplication.
Nothing against it - just not enough going for it with my limited uses. Until now.
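
The check mentioned above, finding tasks whose CPU time is far behind their elapsed time, comes down to a simple ratio test. The sketch below is not how BoincTasks does it; the `Task` record, the thresholds and the task names are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical record; real values would come from whatever tool
    # reports per-task CPU time and elapsed (wall-clock) time.
    name: str
    cpu_seconds: float
    elapsed_seconds: float


def stalled_tasks(tasks, min_ratio=0.5, min_elapsed=600):
    """Return tasks whose CPU time is below min_ratio of their elapsed time.

    Tasks that have run for less than min_elapsed seconds are skipped,
    because start-up overhead makes their ratio meaningless.
    Both thresholds are arbitrary assumptions.
    """
    stalled = []
    for task in tasks:
        if task.elapsed_seconds < min_elapsed:
            continue
        if task.cpu_seconds / task.elapsed_seconds < min_ratio:
            stalled.append(task)
    return stalled


if __name__ == "__main__":
    example = [
        Task("vbox_task_a", cpu_seconds=3500, elapsed_seconds=3600),
        Task("vbox_task_b", cpu_seconds=200, elapsed_seconds=7200),
        Task("native_task_c", cpu_seconds=10, elapsed_seconds=60),
    ]
    for task in stalled_tasks(example):
        print(task.name)
```

A low ratio on a VBox task usually just means it is worth a look; it does not say why the task has stopped making progress.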
ID: 106047 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,670,291
RAC: 2,328
Message 106048 - Posted: 26 Apr 2022, 6:08:05 UTC - in response to Message 106046.  

Maybe there isn't enough ram for python gpu?

48 GB? Not enough?
I've got an ACEMD 3 task running alongside ATLAS and PrimeGrid, and I'm only using 32% of my total RAM.
I don't think it's RAM.

The STDERR goes on about memory leaks in its setup, but this is fresh RAM (well half new and half less new but not ancient)
ID: 106048 · Rating: 0
kotenok2000
Joined: 22 Feb 11
Posts: 220
Credit: 291,990
RAC: 1,126
Message 106050 - Posted: 26 Apr 2022, 9:02:10 UTC - in response to Message 106048.  

I have only 16 GB of RAM.
ID: 106050 · Rating: 0
tullio
Joined: 10 May 20
Posts: 63
Credit: 630,125
RAC: 0
Message 106056 - Posted: 26 Apr 2022, 11:33:39 UTC
Last modified: 26 Apr 2022, 11:35:43 UTC

I have only 12 GB RAM on this Windows 11 PC and can run both Rosetta python and QuChem, but not at the same time. QuChem also runs on my Linux virtual machine with 8 GB RAM, but I could not run Rosetta python on it. Now I am running Rosetta 4.20 on the Windows 11 PC.
Tullio
ID: 106056 · Rating: 0
kotenok2000
Joined: 22 Feb 11
Posts: 220
Credit: 291,990
RAC: 1,126
Message 106057 - Posted: 26 Apr 2022, 11:38:41 UTC - in response to Message 106056.  

They run up to 2% and freeze, with a wall of errors in stderr.txt.
ID: 106057 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 9,537,480
RAC: 776
Message 106058 - Posted: 26 Apr 2022, 16:24:59 UTC - in response to Message 106041.  

That's strange. How vbox processes tasks is above my head.
Computermeze or whatever his name is knows more about that kind of stuff.
Not on speaking terms ROFL! And I thought it was a woman.

Have you asked in Cosmo forum at all if anyone knows why 6 does not work?
Maybe post in Github and see what the experts say.
I think the program needs to be written to work in 6. From what the admin at Kryptos said, it's easier to program in 5. Kinda like a driver for Windows 7 might not work in 10.
ID: 106058 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 9,537,480
RAC: 776
Message 106059 - Posted: 26 Apr 2022, 16:26:25 UTC - in response to Message 106043.  

ID: 106059 · Rating: 0
kotenok2000
Joined: 22 Feb 11
Posts: 220
Credit: 291,990
RAC: 1,126
Message 106060 - Posted: 26 Apr 2022, 16:27:39 UTC - in response to Message 106059.  

tacc is empty too.
ID: 106060 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 9,537,480
RAC: 776
Message 106061 - Posted: 26 Apr 2022, 16:28:08 UTC - in response to Message 106045.  
Last modified: 26 Apr 2022, 16:29:23 UTC

I can't run QuChem.
It goes to 60% and becomes unstable.
VM job unmanageable message.
2022-04-25 20:38:52 (2024): VM state change detected. (old = 'running', new = 'paused')
2022-04-25 21:05:27 (2024): VM state change detected. (old = 'paused', new = 'running')
2022-04-25 21:06:03 (2024): Creating new snapshot for VM.
2022-04-25 21:06:11 (2024): Deleting stale snapshot.
2022-04-25 21:06:12 (2024): Checkpoint completed.
2022-04-25 21:10:16 (2024): VM state change detected. (old = 'running', new = 'paused')
2022-04-25 21:14:52 (2024): VM state change detected. (old = 'paused', new = 'running')
2022-04-25 21:16:38 (2024): VM state change detected. (old = 'running', new = 'paused')
2022-04-25 21:32:41 (2024): ERROR: Vboxwrapper lost communication with VirtualBox, rescheduling task for a later time.

And GPU gives me the exit child error on their new pythons.
Working here on a variety of Windows 11 machines, using VB 5. Oldest has 8GB of DDR2! I can run one QuChem per 2GB of RAM (almost). Something up with the QuChem tasks, I'm getting tasks ending in _9, so I checked and there are loads of (Linux) hosts churning through several thousand and failing them in 1 second. Missing libraries? I asked over there, but the forum is quiet.
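
As a rough illustration of the "one QuChem per 2GB of RAM (almost)" rule of thumb, the arithmetic might look like the sketch below. The 1 GB held back for the host is purely an assumption standing in for the "almost"; the function name is made up.

```python
def max_vm_tasks(total_ram_gb, ram_per_task_gb=2.0, reserved_gb=1.0):
    """Estimate how many VM tasks fit in RAM.

    reserved_gb is an assumed allowance for the host OS and BOINC itself;
    adjust it to whatever the machine actually needs.
    """
    usable = total_ram_gb - reserved_gb
    if usable <= 0:
        return 0
    return int(usable // ram_per_task_gb)


if __name__ == "__main__":
    for ram in (8, 12, 16):
        print(ram, "GB ->", max_vm_tasks(ram), "tasks")
```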
ID: 106061 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 9,537,480
RAC: 776
Message 106062 - Posted: 26 Apr 2022, 16:36:14 UTC - in response to Message 106046.  

Maybe there isn't enough ram for python gpu?
WHAT? GPU? On QuChem? Where? I want.
ID: 106062 · Rating: 0
kotenok2000
Joined: 22 Feb 11
Posts: 220
Credit: 291,990
RAC: 1,126
Message 106063 - Posted: 26 Apr 2022, 16:37:06 UTC - in response to Message 106062.  

gpugrid
ID: 106063 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 9,537,480
RAC: 776
Message 106064 - Posted: 26 Apr 2022, 16:37:52 UTC - in response to Message 106047.  

Oops, yes Treesize. I can see how useful it might be, but I keep a pretty tight ship at the best of times, so no need for it here.
I'd used Boinctasks before, but prior to installing VirtualBox, and I didn't have the kind of problems that BoincTasks would solve back then, so it just seemed an unnecessary duplication.
Nothing against it - just not enough going for it with my limited uses. Until now.
I have 7 computers, controlling all those individually would be ridiculous. AFAIK you have 4 active machines. That would be enough for me to use Boinctasks.
ID: 106064 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 9,537,480
RAC: 776
Message 106065 - Posted: 26 Apr 2022, 16:38:51 UTC - in response to Message 106048.  

Maybe there isn't enough ram for python gpu?

48 GB? Not enough?
I've got an ACEMD 3 task running alongside ATLAS and PrimeGrid, and I'm only using 32% of my total RAM.
I don't think it's RAM.

The STDERR goes on about memory leaks in its setup, but this is fresh RAM (well half new and half less new but not ancient)
You could run memtest (I always do for any new/used RAM I obtain), but I think a memory leak is a programming error, not a hardware fault.
ID: 106065 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 9,537,480
RAC: 776
Message 106066 - Posted: 26 Apr 2022, 16:40:07 UTC - in response to Message 106056.  

I have only 12 GB RAM on this Windows 11 PC and can run both Rosetta python and QuChem, but not at the same time. QuChem also runs on my Linux virtual machine with 8 GB RAM, but I could not run Rosetta python on it. Now I am running Rosetta 4.20 on the Windows 11 PC.
Tullio
I find VB on the main machine I'm trying to use gives a very sluggish Windows 11 interface. I run it on the 6 Boinc only machines and do native tasks on this one.
ID: 106066 · Rating: 0
kotenok2000
Joined: 22 Feb 11
Posts: 220
Credit: 291,990
RAC: 1,126
Message 106067 - Posted: 26 Apr 2022, 16:41:02 UTC - in response to Message 106066.  

It happens on windows 10 too.
ID: 106067 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 9,537,480
RAC: 776
Message 106068 - Posted: 26 Apr 2022, 16:41:05 UTC - in response to Message 106063.  
Last modified: 26 Apr 2022, 16:41:23 UTC

gpugrid
GPUGrid needs Nvidias. I don't own Nvidias, I find AMD gives more bang for the buck. I've got my SP GPUs on Folding@Home and my DP GPUs on Milkyway.
ID: 106068 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org