Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
"Some kind of a response from them would be nice, perhaps one of:" HAHAHAHA...yeah right. Take #1 and add "we don't care". |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Idiot server kicked me off after 2 aborts and 1 error, all from aagb. If they got things right the first time, I wouldn't have this problem. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2141 Credit: 41,538,222 RAC: 10,691 |
"Never paused once." That's fine. If they pause, abort them. They never unpause in my experience. But I have plenty of running and successfully completed aagb tasks too. They may well be the most susceptible, but it's not all of them. Just check them 10 minutes after they've started and you'll know which way it's heading, then take the appropriate action. |
Bruce Morse Send message Joined: 8 Oct 05 Posts: 5 Credit: 2,056,124 RAC: 38,533 |
I have two Rosetta python projects 1.03 (vbox64) tasks running: aagb-SAR_pp-….. and aaam-PRO_pp-…. They are currently showing (elapsed time; time remaining): 5d 14:45:56; 00:00:04 and 5d 14:39:50; 00:00:04. The elapsed timer is running. The time remaining has been taking progressively longer between changes, currently measured in hours. Any ideas? |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 8,091 |
"I have a two applications of ..." You need to see how much CPU time they're actually using. These tasks tend to sit doing nothing. If you have BoincTasks, it shows real CPU usage. Or you can use Windows Task Manager. If they aren't doing anything, abort them. |
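The check described above (compare the CPU time a task has actually used against the wall-clock time that has passed) can be sketched as follows. This is a minimal illustration, not a BOINC or BoincTasks API: the function names and the 5% threshold are assumptions, and the two CPU-time readings are values you would copy by hand from BoincTasks or Task Manager.

```python
def cpu_utilisation(cpu_before, cpu_after, wall_seconds):
    """Fraction of one core used between two readings of a task's CPU time.

    cpu_before / cpu_after: CPU seconds shown for the task at each reading.
    wall_seconds: real time that passed between the two readings.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return (cpu_after - cpu_before) / wall_seconds


def looks_stalled(cpu_before, cpu_after, wall_seconds, threshold=0.05):
    """True if the task used less than `threshold` of a core (assumed cut-off)."""
    return cpu_utilisation(cpu_before, cpu_after, wall_seconds) < threshold


# Example: only 4 CPU-seconds accumulated during a 10-minute check.
print(looks_stalled(0.0, 4.0, 600.0))
```

A task that gains close to 600 CPU-seconds over the same 10 minutes would not be flagged; where exactly to draw the line is a judgment call.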
Bruce Morse Send message Joined: 8 Oct 05 Posts: 5 Credit: 2,056,124 RAC: 38,533 |
Additional notes: Menu options in BOINC Manager are no longer functioning, including the snooze, about and exit options from the taskbar; Outlook will no longer start. Just noticed that the time remaining NOW reads three (3) seconds. The version of Vbox is the one distributed with BOINC and has not been updated. Are there some settings in Vbox that *I* should have modified? Vbox shows both tasks running, and a pop-up indicates a new version is available: 6.1.32. Current version: 6.0.14r133895 (Qt5.6.2). There is sporadic activity. |
Bruce Morse Send message Joined: 8 Oct 05 Posts: 5 Credit: 2,056,124 RAC: 38,533 |
Checking Windows 10 Task Manager: baseline - There is very little CPU usage, but there are some bursts of usage; memory - minimal changes; disk - some; Vbox Ethernet - zero; and LAN network - some. |
kotenok2000 Send message Joined: 22 Feb 11 Posts: 272 Credit: 507,897 RAC: 274 |
Can you open the VirtualBox GUI, press Show and look at what the VirtualBox VM screens are showing? Also open the task folder C:\programdata\BOINC\slots\[slotnumber]\shared and look at the file modification times. |
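Checking the file modification times in a slot's shared folder can also be scripted. The sketch below is only an illustration: `newest_files` is a hypothetical helper name, and the folder you pass in is whichever slot directory your task is using.

```python
from datetime import datetime
from pathlib import Path


def newest_files(directory, limit=10):
    """Return (file name, last-modified time) pairs, newest first."""
    entries = [
        (path.name, datetime.fromtimestamp(path.stat().st_mtime))
        for path in Path(directory).iterdir()
        if path.is_file()
    ]
    # Sort by modification time so the most recently written file comes first.
    entries.sort(key=lambda item: item[1], reverse=True)
    return entries[:limit]


def format_listing(entries):
    """Render the pairs as one 'YYYY-MM-DD HH:MM  name' line per file."""
    return "\n".join(f"{modified:%Y-%m-%d %H:%M}  {name}" for name, modified in entries)
```

Calling `print(format_listing(newest_files(folder)))` with the slot's shared folder shows at a glance whether anything has been written recently. If the newest timestamp is days old, the task has most likely stopped making progress.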
Bruce Morse Send message Joined: 8 Oct 05 Posts: 5 Credit: 2,056,124 RAC: 38,533 |
"can you open virtualbox gui, press show and look at what virtualbox vm screens are showing?" Looks like it never started? Last line: Intel MKL FATAL ERROR: Error on loading function mkl_lapack_ps_mc3_dsytrf_l_small. "Also open task" Most recent are 03/16/2022 05:46 PM (Um.. today: 03/22/2022). Kinda saddens me - it appears I have wasted many days. ETA: left it running for now in case anyone wants additional information. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 8,091 |
"can you open virtualbox gui, press show and look at what virtualbox vm screens are showing?" "Looks like it never started?" I get that every single time on 5 of my 7 computers. Nobody knows why. Which of your computers are having problems? From my end it looks like older computers don't work. Mine are: Ryzen 9 3900XT - works all the time on Rosetta Python VB; i5 8600K - works all the time on Rosetta Python VB; Core 2 Quad Q8400 - gets the same error as you every time; Pentium N3700 - gets the same error as you every time; Dual Xeon X5650 - gets the same error as you every time; Dual Xeon X5650 - gets the same error as you every time; i3 M350 - gets the same error as you every time. |
Bruce Morse Send message Joined: 8 Oct 05 Posts: 5 Credit: 2,056,124 RAC: 38,533 |
I currently have only two computers actively running Vbox: (1) Toshiba laptop: Intel Pentium CPU 2020 (two core hyper thread) @ 2.4GHz; 16.0 GB RAM; Win10/home. Doesn't want to play nice. (2) 6-core 3.2 GHz Intel Core i7-8700; 16.0 GB RAM; Win10/home. IS playing nice. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 8,091 |
"I currently have only two computers actively running Vbox:" You seem to be getting the same as me. Newer machines work, older machines don't. I'm going to guess the Python app is using newer instruction sets only available on newer processors, and the incompetent fools at Rosetta are handing them out to everybody instead of only those that can handle it. They must be relying on you failing a lot of them so it automatically switches off your computer from Python, but the trouble is they don't just quickly fail, they sit doing nothing for days. And you have to fail 100 of them (not just abort them) before it bans you from Python. |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 2002 Credit: 9,787,940 RAC: 5,329 |
"Newer machines work, older machines don't. I'm going to guess the Python app is using newer instruction sets only available on newer processors, and the incompetent fools at Rosetta are handing them out to everybody instead of only those that can handle it." If I'm not wrong, VirtualBox exposes instruction sets automatically to guest machines, so your idea is not so foolish. The Python app is running trRosetta simulations that are probably compiled against TensorFlow. Someone has this problem with an old CPU. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 8,091 |
"Newer machines work, older machines don't. I'm going to guess the Python app is using newer instruction sets only available on newer processors, and the incompetent fools at Rosetta are handing them out to everybody instead of only those that can handle it." CPUs have many different combinations of instruction sets; it should be tested for before running! |
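A pre-run test of the kind asked for above could look roughly like this on a Linux host, where the kernel lists CPU feature flags in /proc/cpuinfo. This is a sketch under assumptions: the helper names are made up for illustration, and `REQUIRED` is a placeholder, since nobody in this thread has confirmed which instruction sets the Python app actually needs.

```python
def parse_cpuinfo_flags(cpuinfo_text):
    """Extract the feature flags from /proc/cpuinfo-style text as a set."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(required, available):
    """Flags the application needs that the host does not report."""
    return set(required) - set(available)


# Placeholder assumption: NOT a confirmed requirement of the Python app.
REQUIRED = {"avx"}

sample = "model name : Example CPU\nflags : fpu sse sse2 ssse3 sse4_1\n"
print(missing_features(REQUIRED, parse_cpuinfo_flags(sample)))  # {'avx'}
```

On a real host you would pass in the contents of /proc/cpuinfo instead of `sample`. An empty result means every flag in the assumed list is present.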
JohnDK Send message Joined: 6 Apr 20 Posts: 33 Credit: 2,390,240 RAC: 0 |
I just can't get the python WUs to work properly on my Linux host; all too often they pause with the VM unmanageable error. I started with 9 WUs; it was suggested that I lower that, and one by one I'm down to 5. Right now I only have 4 left in cache and running, but 3 of them have already paused with the VM error message after a BOINC restart. So the question of having enough RAM doesn't seem to apply, to my PC anyway. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 1,653 |
VirtualBox comes in two major versions, vbox and vbox64. The Python tasks use only the newer of these, vbox64. Since vbox emulates a 32-bit instruction set and vbox64 emulates a 64-bit instruction set, they are not interchangeable. Each is a program, and therefore requires a certain list of instructions from the physical CPU core it runs on. BOINC makes a list of the major groups of instructions available as it starts up. It appears that vbox has been in use long enough that it only uses CPU instructions available on nearly all computers still in use, but vbox64 hasn't. VirtualBox https://www.virtualbox.org/wiki/Downloads https://www.virtualbox.org/ If some of you can identify specific emulated CPU instructions for which emulation fails and shuts down the emulation, you might give the details to Oracle and see if they will fix at least part of the problem, even if Rosetta@Home won't help. The details you send them should include the list of CPU instruction groups produced when BOINC starts up. One thing many of us might send them is a request that when the VM unmanageable error is given, vbox64 should give more details on why. |
zxcvbob Send message Joined: 4 Jan 06 Posts: 8 Credit: 830,878 RAC: 0 |
Are there no 32-bit work units? One of my better systems (that crunches WCG very well when that project is up) has Windows 10 Pro 32-bit. I attached it to R@H several hours ago and it's getting no tasks. It does not have vbox installed, but my 64-bit machine without vbox (or do you call it vbox64?) is getting new work. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 8,091 |
"If some of you can identify specific emulated CPU instructions for which emulation fails and shuts down the emulation, you might give the details to Oracle and see if they will fix at least part of the problem, even if Rosetta@Home won't help." The best way to do this would be for many of us to create a big list of CPUs, their instruction sets as reported by Boinc, and whether they run Python or not. We can then see which instruction is causing the problem. I'll start us off with my 7 machines, add your own please, and I'll shove them all in a spreadsheet and see what's what: Ryzen 9 3900XT, WORKS, fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 svm sse4a osvw ibs skinit wdt tce topx page1gb r (I think this is truncated?); i5-8600K, WORKS, fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx smx tm2 pbe fsgsbase bmi1 hle; Core 2 Quad Q8400, DOESN'T WORK, fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 syscall nx lm vmx tm2 pbe; Pentium N3700, DOESN'T WORK, fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 movebe popcnt aes rdrandsyscall nx lm vmx tm2 pbe smep; Xeon X5650, DOESN'T WORK, fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx smx tm2 dca pbe; i3 M350, DOESN'T WORK, fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt syscall nx lm vmx tm2 pbe |
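The spreadsheet comparison proposed above boils down to a set operation: take the flags that every working machine reports and subtract any flag that also appears on a failing machine. The sketch below does that with the six flag lists from this post, copied as posted (including the fused `rdrandsyscall` token and the truncated trailing `r`). `BASE`, `REPORTS` and `candidate_flags` are names invented for this illustration. The result only narrows down candidates; six hosts cannot prove which instruction set is actually responsible.

```python
# Flags that open every list in the post (the shared prefix).
BASE = "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush"

# CPU name -> (runs Python tasks?, flags as reported by BOINC in the post).
REPORTS = {
    "Ryzen 9 3900XT": (True, BASE + " mmx fxsr sse sse2 htt pni ssse3 fma cx16 "
                       "sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm "
                       "avx avx2 svm sse4a osvw ibs skinit wdt tce topx page1gb r"),
    "i5-8600K": (True, BASE + " dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 "
                 "fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx "
                 "lm avx avx2 vmx smx tm2 pbe fsgsbase bmi1 hle"),
    "Core 2 Quad Q8400": (False, BASE + " dts acpi mmx fxsr sse sse2 ss htt tm "
                          "pni ssse3 cx16 sse4_1 syscall nx lm vmx tm2 pbe"),
    "Pentium N3700": (False, BASE + " dts acpi mmx fxsr sse sse2 ss htt tm pni "
                      "ssse3 cx16 sse4_1 sse4_2 movebe popcnt aes rdrandsyscall "
                      "nx lm vmx tm2 pbe smep"),
    "Xeon X5650": (False, BASE + " dts acpi mmx fxsr sse sse2 ss htt tm pni "
                   "ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx smx "
                   "tm2 dca pbe"),
    "i3 M350": (False, BASE + " dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 "
                "cx16 sse4_1 sse4_2 popcnt syscall nx lm vmx tm2 pbe"),
}


def candidate_flags(reports):
    """Flags present on every working host and absent from every failing host."""
    working = [set(flags.split()) for ok, flags in reports.values() if ok]
    failing = [set(flags.split()) for ok, flags in reports.values() if not ok]
    if not working or not failing:
        raise ValueError("need at least one working and one failing host")
    return set.intersection(*working) - set.union(*failing)


print(sorted(candidate_flags(REPORTS)))  # ['avx', 'avx2', 'f16c', 'fma']
```

Adding more hosts to `REPORTS`, especially failing machines that do report some of these four flags, or working machines that lack them, would shrink the candidate list further.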
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1725 Credit: 18,417,319 RAC: 20,286 |
"Are there no 32-bit work units?" Work units are just data that can be processed by any software that has been written to process it. Rosetta has both 32-bit & 64-bit applications. Non-Python Rosetta 4.20 tasks are very rare; it's just the luck of the draw whether your system happens to request work when there are actually some available. Grant Darwin NT |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 8,091 |
"Are there no 32-bit work units?" "Work units are just data that can be processed by any software that has been written to process it." My non-Python-capable machines are attached to Rosetta, so if 4.20 appears, they grab it, since Boinc will have a work debt for that project. But I give them other projects to do as well. |
©2024 University of Washington
https://www.bakerlab.org