Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,708,745 RAC: 22,460 |
Well so far my i5 has worked perfectly, my Ryzen got banned, and my old Xeons I just noticed have spent 24 hours running python tasks with a total of 13 minutes CPU time. I wondered why they felt cold to the touch. There's something terribly wrong with these WUs. These are the two Xeons, I'm in the process of aborting the tasks, if anyone can look and interpret the outputs. https://boinc.bakerlab.org/rosetta/results.php?hostid=6169682 https://boinc.bakerlab.org/rosetta/results.php?hostid=6169697 Make sure you look at the right ones, the ones aborted just now, not the ones aborted yesterday (that was something else when I was trying to set things up). Here is a dodgy one, many errors, please interpret: https://boinc.bakerlab.org/rosetta/result.php?resultid=1463541284 It includes many of these lines: Hypervisor System Log: 24:11:34.575288 ERROR [COM]: aRC=E_ACCESSDENIED (0x80070005) aIID={85cd948e-a71f-4289-281e-0ca7ad48cd89} aComponent={MachineWrap} aText={The object functionality is limited}, preserve=false aResultDetail=0" I have asked over in the main Boinc forum too, https://boinc.berkeley.edu/dev/forum_thread.php?id=14532 |
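For anyone reading the output linked above: the value 0x80070005 in the quoted log line is a standard Windows HRESULT, and it can be decoded mechanically. The following is a minimal Python sketch, assuming the "ERROR [COM]" line layout matches the single sample quoted in the post. The names `COM_ERROR`, `decode_hresult` and `parse_com_error` are made up for illustration and are not part of VirtualBox or BOINC.

```python
import re

# Pattern for the "ERROR [COM]" lines quoted in the post. The field layout is
# inferred from that one sample line, so treat it as an assumption.
COM_ERROR = re.compile(
    r"ERROR \[COM\]: aRC=(?P<name>\w+) \((?P<code>0x[0-9a-fA-F]{8})\)"
    r".*?aComponent=\{(?P<component>[^}]*)\}"
    r".*?aText=\{(?P<text>[^}]*)\}"
)

def decode_hresult(value):
    """Split a 32-bit HRESULT into its severity, facility and code fields."""
    return {
        "failure": bool(value & 0x80000000),   # top bit set means failure
        "facility": (value >> 16) & 0x7FF,     # 11-bit facility field
        "code": value & 0xFFFF,                # low 16 bits: error code
    }

def parse_com_error(line):
    """Return the decoded fields of one log line, or None if it does not match."""
    match = COM_ERROR.search(line)
    if match is None:
        return None
    info = decode_hresult(int(match.group("code"), 16))
    info.update(
        name=match.group("name"),
        component=match.group("component"),
        text=match.group("text"),
    )
    return info

# The sample line copied from the post.
SAMPLE = (
    "Hypervisor System Log: 24:11:34.575288 ERROR [COM]: "
    "aRC=E_ACCESSDENIED (0x80070005) "
    "aIID={85cd948e-a71f-4289-281e-0ca7ad48cd89} aComponent={MachineWrap} "
    "aText={The object functionality is limited}, preserve=false aResultDetail=0"
)

if __name__ == "__main__":
    print(parse_com_error(SAMPLE))
```

Facility 7 is the Win32 facility and code 5 is the Win32 "access denied" error, so the number only restates the E_ACCESSDENIED name. It does not by itself say which object or permission was refused.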
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,708,745 RAC: 22,460 |
I've asked in the LHC forum, since they use vbox on almost all tasks and might know what the problem is: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5781 |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,265,544 RAC: 4,489 |
[quote]I've asked in the LHC forum, since they use vbox on almost all tasks and might know what the problem is: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5781[/quote] Don't confuse vbox (which handles 32-bit work) with vbox64 (which handles 64-bit work). |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,708,745 RAC: 22,460 |
I assume everyone is on vbox64 by now? LHC will be, and they seem to use the same wrapper as Rosetta. I'm not sure what it is you're trying to tell me. I only installed one piece of software, VirtualBox, from the Oracle site, same version that Boinc issues. Are you telling me there are two halves and Rosetta uses the other one to LHC? My i5, which does python ok, has vboxheadless and virtualbox interface listed in the Windows task manager as running, no mention of 32 or 64 bit. After following the advice from the LHC forum, I am no further forwards. My old Xeons don't do any CPU time, my Ryzen (I think, can't check as it's now banned) computes but is not validated, and my i5 runs them perfectly. Same version of everything on all of them. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,265,544 RAC: 4,489 |
RNA World is still on vbox, but they're down to 19 unfinished workunits. So, not everyone. VirtualBox (at least the latest versions) has two parts, the vbox part for 32-bit work and the vbox64 part for 64-bit work. Rosetta, and probably also LHC, use the vbox64 part. I don't participate in LHC, so I haven't seen what they use. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,708,745 RAC: 22,460 |
[quote]RNA World is still on vbox, but they're down to 19 unfinished workunits. So, not everyone.[/quote] But LHC and Rosetta are 64 bit? And how does RNA World work, do you have to download an old 32 bit version? |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
[quote]Well so far my i5 has worked perfectly, my Ryzen got banned, and my old Xeons I just noticed have spent 24 hours running python tasks with a total of 13 minutes CPU time. I wondered why they felt cold to the touch. There's something terribly wrong with these WUs.[/quote] It was chugging along just fine and then blows up with access denied? That's weird. Did Windows all of a sudden block it, or did it run into a fault with the data? That it ran 24 hours is really odd. These finish in 4 hours or less. A quick look at the object statement says something went wrong in Vbox. If that happens repeatedly, then you need to remove Vbox and reinstall it. Again, it's very late in the EU, so I will have to dig into it more later. Maybe our two experts can help you more. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,708,745 RAC: 22,460 |
The xeons never chugged along just fine. The CPU usage in Windows task manager was virtually zero, as was the number in brackets in Boinc showing the actual cpu usage. If left alone, it gradually moved to 100% over 24 hours, but did virtually no CPU usage (I believe it was about 13 minutes CPU time over 24 hours). On the working i5, I see full CPU usage almost immediately. If it's something in windows, do you have an idea what? Can I check some folder permissions? Vbox has been reinstalled already, didn't help. I tried 5.2 and 6.1. Both with the extension pack. I am also in the EU, in the UK, the founder of the world as we know it :-) When I say EU I mean geographically, since we told you lot to go forth and multiply :-P |
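To put a number on "virtually no CPU usage": the ratio of CPU time to wall-clock time for the figures given in the post can be worked out directly. This is a small Python sketch using only the 13 minutes and 24 hours quoted there. The function name `cpu_efficiency` is an illustrative choice, not a BOINC term.

```python
def cpu_efficiency(cpu_seconds, wall_seconds):
    """Fraction of the elapsed wall-clock time that was spent on the CPU."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return cpu_seconds / wall_seconds

# Figures from the post: about 13 minutes of CPU time over 24 hours.
xeon = cpu_efficiency(cpu_seconds=13 * 60, wall_seconds=24 * 3600)

if __name__ == "__main__":
    print(f"Xeon task CPU efficiency: {xeon:.1%}")  # prints 0.9%
```

A task that is actually computing would be expected to sit close to 100% per allocated core, so a figure under 1% suggests the virtual machine was mostly idle or stalled rather than merely slow.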
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,708,745 RAC: 22,460 |
[quote]RNA World is still on vbox, but they're down to 19 unfinished workunits. So, not everyone.[/quote] LHC's CMS simulations use the same wrapper as Rosetta, so I assume both are 64 bit. LHC are always bang up to date with everything. Still not sure what you thought I was confusing. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,265,544 RAC: 4,489 |
[quote][quote]RNA World is still on vbox, but they're down to 19 unfinished workunits. So, not everyone.[/quote] But LHC and Rosetta are 64 bit?[/quote] Probably. [quote]And how does RNA World work, do you have to download an old 32 bit version?[/quote] Their application is written to run in 32-bit mode. They haven't updated their application since the days when VirtualBox would only do 32-bit mode. With so few workunits left, they don't plan to. The recent versions of VirtualBox can do both 32-bit and 64-bit, so downloading the 32-bit application from RNA World is enough. It's not easy to make 32-bit applications run in 64-bit mode. You have to recompile them in 64-bit mode, but that's usually not enough. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,265,544 RAC: 4,489 |
[quote]The xeons never chugged along just fine. The CPU usage in Windows task manager was virtually zero, as was the number in brackets in Boinc showing the actual cpu usage. If left alone, it gradually moved to 100% over 24 hours, but did virtually no CPU usage (I believe it was about 13 minutes CPU time over 24 hours). On the working i5, I see full CPU usage almost immediately. [snip][/quote] It appears to be something in the workunits, not in the folders. I haven't spotted just what. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,708,745 RAC: 22,460 |
[quote]It appears to be something in the workunits, not in the folders. I haven't spotted just what.[/quote] What I don't understand is why it makes it either work or not on different machines. Ryzen 9 3900XT - no. i5-8600K - yes. Xeon X5650 - no. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,265,544 RAC: 4,489 |
[quote][quote]It appears to be something in the workunits, not in the folders. I haven't spotted just what.[/quote] What I don't understand is why it makes it either work or not on different machines.[/quote] It might mean that those three CPUs have slightly different instruction sets, and the workunits were not written to use only instructions that are available on all three of them. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,708,745 RAC: 22,460 |
Possible. The Ryzen is much newer though. Are there instructions AMD have omitted that older Intels had? [quote][quote]It appears to be something in the workunits, not in the folders. I haven't spotted just what.[/quote] What I don't understand is why it makes it either work or not on different machines.[/quote] |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,265,544 RAC: 4,489 |
[quote]Possible. The Ryzen is much newer though. Are there instructions AMD have omitted that older Intels had? [quote][quote]It appears to be something in the workunits, not in the folders. I haven't spotted just what.[/quote] What I don't understand is why it makes it either work or not on different machines.[/quote][/quote] Probably, if Intel put patents on these instructions. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,708,745 RAC: 22,460 |
[quote]Probably, if Intel put patents on these instructions.[/quote] That can't happen very often, or AMD and Intel CPUs would be totally incompatible. My AMD Ryzen 9 3900XT has, according to Boinc: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 svm sse4a osvw ibs skinit wdt tce topx page1gb r My Intel i5 8600K has, according to Boinc: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx smx tm2 pbe fsgsbase bmi1 hle Intel only: dts acpi ss tm vmx smx tm2 pbe fsgsbase bmi1 hle AMD only: svm sse4a osvw ibs skinit wdt tce topx page1gb r It's a miracle anything works at all. [shakes head]Capitalists....[/shakes head] This is an interesting read: https://itigic.com/x86-on-intel-and-amd-why-cant-anyone-else-make-cpus/ |
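The "Intel only" and "AMD only" lists in the post can be reproduced with a plain set difference over the two feature strings. This is a minimal Python sketch with the strings copied verbatim from the post. Both look cut off at the end (the AMD one finishes in a lone "r"), so the real lists are probably longer, and the fused token "rdrandsyscall" is kept exactly as reported. The helper name `only_in` is illustrative.

```python
# Feature strings copied verbatim from the post (as reported by BOINC).
AMD_3900XT = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 movebe "
    "popcnt aes f16c rdrandsyscall nx lm avx avx2 svm sse4a osvw ibs skinit "
    "wdt tce topx page1gb r"
)
INTEL_8600K = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 "
    "sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx smx tm2 "
    "pbe fsgsbase bmi1 hle"
)

def only_in(first, second):
    """Flags present in `first` but not in `second`, in their original order."""
    other = set(second.split())
    return [flag for flag in first.split() if flag not in other]

intel_only = only_in(INTEL_8600K, AMD_3900XT)
amd_only = only_in(AMD_3900XT, INTEL_8600K)

if __name__ == "__main__":
    print("Intel only:", " ".join(intel_only))
    print("AMD only:  ", " ".join(amd_only))
```

Many of the differing flags describe platform features (thermal monitoring, each vendor's own virtualization extension) rather than instructions ordinary application code would execute, so a difference in this list does not by itself show that a workunit uses an unsupported instruction.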
Stevie G Send message Joined: 15 Dec 18 Posts: 107 Credit: 817,125 RAC: 1,534 |
[quote]Lucky you, I have chronic fatigue :-([/quote] Epstein-Barr syndrome? SGaber |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,708,745 RAC: 22,460 |
I've had it since before I was her toyboy. [quote]Lucky you, I have chronic fatigue :-([/quote] |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,708,745 RAC: 22,460 |
From LHC: 7 GB is rather huge for this kind of usecase and it appears that the virtual disk has only 1 GB space left. Part of the error output on one of my machines failing a Rosetta VB Python task: 2022-01-07 22:37:22 (9144): Also, I'm wondering if "dynamic default" means it should auto-grow? |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 11,708,745 RAC: 22,460 |
[quote]Also, I'm wondering if "dynamic default" means it should auto-grow?[/quote] It doesn't. LHC tell me capacity is the upper limit. I think perhaps Rosetta need to check this. |
©2024 University of Washington
https://www.bakerlab.org