Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
Python tasks failing I don't see that you have VirtualBox installed. https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=6157362 But you are better off with VBox 5.2.44 anyway. Version 6.1 gives "Vm job unmanageable" suspensions. https://www.virtualbox.org/wiki/Download_Old_Builds_5_2 |
Jonathan Joined: 4 Oct 17 Posts: 43 Credit: 1,337,472 RAC: 0 |
Are you thinking of a different project that shows VirtualBox on the computer details page? I don't see it on either of the computer details pages for mine, nor on the two I checked listed under you. VirtualBox is working on both my computers, but I have been sticking to 6.1 since it is supported by VirtualBox. Support was dropped for the earlier versions. I just don't load up my computers to 100 percent processor usage nor juggle too many concurrent VM tasks. These Python / Rosettas are brutal, creating almost 8 GB images. I just can't figure out why the one computer is having problems now, as it was working with the previous Python-related tasks. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
Probably so. I usually see VirtualBox listed on most projects. The memory requirement for downloading the new pythons is now down to 3 GB, and the amount required to run is less than that. But the .vdi images in the slots are still large. Maybe they will be reduced eventually. Have you checked BOINC Manager for memory and disk usage allowed? It may not be enough. |
Jonathan Joined: 4 Oct 17 Posts: 43 Credit: 1,337,472 RAC: 0 |
It's using the Rosetta preferences, with RAM set to 75% both in use and not in use. So it can use 9 out of 12 GB. That seems correct, as it has 3 tasks. It doesn't keep non-running tasks in memory; that box is unchecked. I think it is something inside the VM. I kind of got spoiled with the LHC ATLAS tasks and being able to see the second and third terminals. One for tasks and one showing top. |
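For anyone wanting to redo that memory arithmetic on their own host, here is a minimal sketch. It assumes the simple model used in the post (RAM budget = total RAM times the preference percentage, divided by a per-task figure) and takes the 3 GB and 8 GB per-task numbers from this thread. The function name is made up for illustration, and the real BOINC scheduler takes more into account than this.

```python
def max_vm_tasks(total_ram_gb, ram_percent, per_task_gb):
    """Rough count of VM tasks that fit in the RAM budget.

    Simplified model (an assumption, not BOINC's actual scheduler logic):
    budget = total RAM * preference percentage, one fixed cost per task.
    """
    budget_gb = total_ram_gb * ram_percent / 100
    return int(budget_gb // per_task_gb)


# Figures from the thread: 12 GB host, 75% allowed.
print(max_vm_tasks(12, 75, 3))  # newer "aa..." pythons at about 3 GB each
print(max_vm_tasks(12, 75, 8))  # older boinc_cages tasks at about 8 GB each
```

With the thread's numbers the budget works out to 9 GB, which matches the three tasks reported above.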
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
"I kind of got spoiled with the LHC ATLAS tasks and being able to see the second and third terminals. One for tasks and one showing top." While the VBox version won't affect your ability to download, LHC is the only project that uses VBox 6.1 without the suspensions, from what I have seen at any rate. That is apparently because they use a different wrapper, which I think they compile themselves. At least it is different. But that is why I went to Win10. It allows the use of VBox 5.2.44, whereas Ubuntu 20.04.3 allows only 6.1. I haven't had a suspension yet in Win10, though it has been running only a day. But I would normally get several in that time with VBox 6.1. Unfortunately, it does not solve the "0 CPU" error, where a work unit uses very little (less than 1%) CPU power and goes on forever, or else times out. |
mmonnin Joined: 2 Jun 16 Posts: 61 Credit: 25,390,629 RAC: 13,030 |
Python tasks failing I have 6.x and never have this issue with LHC, but about half have these issues at Rosetta. Plenty of disk space and memory. Rosetta has never had an efficient app. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Python tasks failing I run LHC ATLAS and I had to downgrade to run RAH Python. Python is new to the RAH style of computing on PCs. I've been with this project since almost the beginning and they have never deviated from their base program. They always have bugs, that's a given. We saw that here, quite a few things went wrong before they got a stable working project. It's been the same with some of the projects they put out on normal Rosetta. It's just one of those things that we have to deal with. As far as the two versions of VBox go, I don't see any difference in the way ATLAS runs on 6 or on 5. So I will just stick with 5 until a newer version of 6 comes out that may make the errors go away, or maybe not. But it really doesn't seem to make any difference on either of the other 2 VBox projects I run. So just downgrade to 5 if you want to run Python. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Python tasks failing You might want to research that error. I found quite a few things about it, but it is way over my head to understand. It's quite technical stuff that comes back in the search results. https://www.google.com/search?client=firefox-b-d&q=%22Intel+MKL+FATAL+ERROR%3A+Error+on+loading+function+mkl_lapack_ps_mc3_dsytrf_l_small |
.clair. Joined: 2 Jan 07 Posts: 274 Credit: 26,399,595 RAC: 0 |
At the moment the only projects I am running on that computer are Moo on the GPU and Rosetta on the CPU. I have not had many disk space messages today, though some rosetta 42 work has found its way here, which eased the problem. I decided not to install any more programs to delete the disk junk, because they would take up more space on the disk :) though I know I can uninstall them later. So after the usual Microsoft `disk cleanup` and system files, I had a good uninstall of everything I don't need, had a play with the digital chainsaw and deleted everything that doesn't have to be on the disk, including everything from the Documents and Downloads folders. That got me 12 GB back. Even that didn't get rid of the "disk space" message, though the demands were less. The thing that finally shut it up was reducing the virtual memory size on the disk, because it was holding 39 GB to ransom and not using it. My account page still shows it as: Swap space 32784.33 MB. It's got 32 GB RAM in it, and Windows automatically creates a page file 1 1/2 times the size of fitted RAM, give or take a bit {having remembered the fun I had with Win98SE all those years ago with running out of memory when it only had 756 MB in it to start with}. But you didn't need gigabytes of memory just to boot the thing back then. So, having read up on how it's done these days, I chopped it down to a tenth of what it was using and now have 106 GB free disk space, even with several greedy Python tasks running. I will just have to keep an eye on it and see what happens ................. Just been to check on it: it last had a disk space moan ten hours ago, so that seems to be it for now. Einstein does use the most disk space at 330 MB [suspended]. Rosetta [we will all have much the same] is using 49 GB. So, yes, Rosetta is the disk hog. |
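If you would rather measure than guess which project is the disk hog, something like the sketch below may help. It only uses the Python standard library. The data directory path is an assumption (a common Windows default), and the `projects` and `slots` subfolder names are assumed from the usual BOINC layout, so adjust them for your install.

```python
import os
import shutil

# Assumed default BOINC data directory on Windows; change it for your machine.
BOINC_DATA_DIR = r"C:\ProgramData\BOINC"


def dir_size_bytes(path):
    """Sum the sizes of all files under path, skipping unreadable ones."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file vanished or is locked; ignore it
    return total


def subdir_sizes(parent):
    """Map each immediate subdirectory of parent to its size in bytes."""
    sizes = {}
    if os.path.isdir(parent):
        for entry in sorted(os.listdir(parent)):
            full = os.path.join(parent, entry)
            if os.path.isdir(full):
                sizes[entry] = dir_size_bytes(full)
    return sizes


def report(data_dir):
    """Print free space on the drive and per-folder usage for projects and slots."""
    usage = shutil.disk_usage(data_dir)
    print(f"free on drive: {usage.free / 1e9:.1f} GB of {usage.total / 1e9:.1f} GB")
    for section in ("projects", "slots"):
        for name, size in subdir_sizes(os.path.join(data_dir, section)).items():
            print(f"{section}/{name}: {size / 1e9:.2f} GB")


if __name__ == "__main__":
    if os.path.isdir(BOINC_DATA_DIR):
        report(BOINC_DATA_DIR)
    else:
        print("BOINC data directory not found; edit BOINC_DATA_DIR")
```

The slots are included because, as mentioned earlier in the thread, that is where the large .vdi images sit while the tasks run.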
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
"At the moment the only projects I am running on that computer are Moo on the gpu and rosetta on cpu" That's interesting... I had a look at my RAH folder and it's 30.7 GB in size, and in compressed form it is 13.7 GB (size on disk): 5,316 files and 424 folders. I have a smaller drive than you and yet I don't get errors, and I am running 7 BOINC projects and FAH plus Facebook and Firefox with many tabs, and I don't get a disk space error. I am beginning to think RAH is having issues with Win7. I use Win 10. Just a thought. Which one of your systems is having issues? |
Jonathan Joined: 4 Oct 17 Posts: 43 Credit: 1,337,472 RAC: 0 |
I aborted all the newer python jobs that started with 'aa'. I got a single 'boinc_cages_IL' job so I kept that one. That one runs fine. I set the computer to not receive VM jobs from Rosetta. |
.clair. Joined: 2 Jan 07 Posts: 274 Credit: 26,399,595 RAC: 0 |
It's windblows 7, opteron16, that has gone funky. I thought I had it fixed, but today it's back on Python-only work, 11 at once, and it is getting the disk space moan again, except even after all that clear-out, it's got worse !!!?? 04/12/2021 10:50:17 | Rosetta@home | Message from server: rosetta python projects needs 16200.20MB more disk space. You currently have 2873.28 MB available and it needs 19073.49 MB. 04/12/2021 12:19:11 | Rosetta@home | Message from server: rosetta python projects needs 16255.39MB more disk space. You currently have 2818.09 MB available and it needs 19073.49 MB. Right now, as far as the OS on drive C goes, it's got 91 GB of disk space free even with the 11 pythons running, so just for interest I have set it off on a full 5-pass disk check to see if it finds anything. Funny old world . . . . |
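One way to read those two server messages: the shortfall is simply "needs" minus "available", and "available" is presumably what BOINC believes it may use under its disk preferences after subtracting what it already holds, not what Windows reports as free on C:. That interpretation is an assumption on my part. The sketch below only pulls the three numbers out of a message line and checks that they are consistent with each other; the regular expression is written against the two lines quoted above and nothing else.

```python
import re

# Pattern matched to the message wording quoted in this thread (an assumption
# about the general format; other BOINC versions may word it differently).
DISK_MSG = re.compile(
    r"needs\s+([\d.]+)\s*MB more disk space\.\s+"
    r"You currently have\s+([\d.]+)\s*MB available and it needs\s+([\d.]+)\s*MB"
)


def parse_disk_message(line):
    """Return shortfall, available and required sizes in MB, or None if no match."""
    match = DISK_MSG.search(line)
    if match is None:
        return None
    shortfall, available, required = (float(g) for g in match.groups())
    return {"shortfall_mb": shortfall, "available_mb": available, "required_mb": required}


line = (
    "04/12/2021 10:50:17 | Rosetta@home | Message from server: rosetta python "
    "projects needs 16200.20MB more disk space. You currently have 2873.28 MB "
    "available and it needs 19073.49 MB."
)
info = parse_disk_message(line)
gap = info["required_mb"] - info["available_mb"]
print(f"required - available = {gap:.2f} MB, reported shortfall = {info['shortfall_mb']:.2f} MB")
```

If the two figures agree to within rounding, the thing to look at is the "available" number: with 91 GB free on the drive but under 3 GB "available", the limit is probably coming from the disk settings in the computing preferences or from what the running tasks already occupy, rather than from the drive itself.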
[VENETO] boboviz Joined: 1 Dec 05 Posts: 2002 Credit: 9,790,281 RAC: 2,986 |
Now my pcs have "got 0 new tasks" of python wus, but in the queue there are over 5000 wus... |
Falconet Joined: 9 Mar 09 Posts: 354 Credit: 1,276,393 RAC: 828 |
Now my pcs have "got 0 new tasks" of python wus, but in the queue there are over 5000 wus... I received 5 tasks just now. |
Jim1348 Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0 |
They have a mix of the old (8 GB) and new (3 GB) pythons. The old ones are "boinc_cages-Il", and the new ones are "aaxx-xxx". So maybe we will finish off the big ones at some point. |
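A small sketch of that naming rule, in case anyone wants to sort their task list by it. It assumes only what is said in this thread: names beginning with "boinc_cages" are the older, larger tasks and names beginning with "aa" are the newer, smaller ones. The function name and labels are invented for the example, and the project could change its naming at any time.

```python
def python_task_kind(task_name):
    """Classify a Rosetta python task by its name prefix (rule taken from this thread)."""
    name = task_name.lower()
    if name.startswith("boinc_cages"):
        return "old (about 8 GB)"
    if name.startswith("aa"):
        return "new (about 3 GB)"
    return "unknown"


# Task names as they appear elsewhere in this thread.
for task in (
    "boinc_cages_IL",
    "aaam-SAR_pp-mPRO_pp-PIP-AMACBEN2_pp_12_2570219_1_0",
    "MCM1_0185708_4424_2",
):
    print(task, "->", python_task_kind(task))
```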
trevG Joined: 5 Nov 13 Posts: 9 Credit: 687,475 RAC: 0 |
I've been trying to run these Pythons starting aa.. and made some progress, after finding Windows Update had pre-stalled my VM 'manual start up' setting in services. No warning or anything useful. I had previously struggled with LHC due to this. Checking the operation out showed that BoincMgr was freezing, same with the EFMER version, which I prefer [for visibility and function], but which is less good at access to BOINC lately, under password issues. I looked into the VM box setting and saw that the RAM allocation was warning that the max setting would cause system lag. Problem: I couldn't modify the setting for a while, till it suddenly went live. It seemed to reset back easily, though. In trying to sort this out, I lost half a dozen WUs after completing two OK, but on restarting I no longer get units, even after increasing my disk allocation a lot, as pointed out by others prior to the two successful runs. I waited 24 hours after the failures to see if finishing other work made getting downloads any better, but no change. I wonder if aborted work has led to blacklisting?? Annoying, as I lost 4 days of GPUGrid work in the process and spent hours sorting out the VM, which is pretty tricky to use. Any thoughts, Maestros? I never had issues over years with old RAH clients..
World Community Grid 03-12-2021 16:03 02:20:28 (01:44:22) 03-12-2021 16:04 MCM1_0185708_4424_2 74.30 Reported: OK + 7.61 Mapping Cancer Markers DESKTOP-
Rosetta@home 03-12-2021 14:50 00:00:00 (00:00:00) 03-12-2021 14:51 aaam-SAR_pp-mPRO_pp-PIP-AMACBEN2_pp_12_2570219_1_0 0.00 Aborted (203) 1.03 rosetta python projects (vbox64) DESKTOP
*** GPUGRID 03-12-2021 14:39 *RUN TIME 02d,01:17:02 (02d,01:55:03) 03-12-2021 14:41 e7s224_e1s376p0f362-ADRIA_BanditGPCR_APJ_b0-0-1-RND6911_1 0.956C + 1NV 100.00 Aborted (203) 2.19 New version of ACEMD (cuda1121) DESKTOP-
Rosetta@home 03-12-2021 14:30 00:00:27 (00:00:00) 03-12-2021 14:31 aagb-PRO_pp-SAR-ACPenC13T-mB3PHG_pp_11_2697622_1_0 0.00 **Reported: Computation error (1,) 1.03 rosetta python projects (vbox64) DESKTOP-
World Community Grid 03-12-2021 12:59 02:37:38 (02:27:14) 03-12-2021 13:01
World Community Grid 03-12-2021 11:07 03:03:09 (02:36:18) 03-12-2021 11:09 OPN1_0095285_00301_0 85.34 Reported: OK + 7.21 OpenPandemics - COVID 19 DESKTOP-
Rosetta@home 03-12-2021 10:54 00:59:04 (00:54:55) 03-12-2021 10:56 aaam-mNMVAL_pp-FPR-mPHE-AMACBEN2_pp_4_2496370_1_0 92.97 ** Reported: Computation error (0,) 1.03 rosetta python projects (vbox64) DESKTOP-
Rosetta@home 03-12-2021 09:52 03:53:41 (03:47:29) 03-12-2021 09:56 aaas-SAR-VAL_pp-NMVAL-SUGA_pp_12_2559723_1_0 97.35 Reported: OK * 1.03 rosetta python projects (vbox64) DESKTOP-
Rosetta@home 03-12-2021 05:57 07:03:18 (06:52:27) 03-12-2021 05:57 aaap-PIP_pp-mNMPHE_pp-TIC-AMACBEN3_pp_0_2502770_1_0 97.44 Reported: OK + 1.03 rosetta python projects (vbox64) DESKTOP-
Rosetta@home 02-12-2021 22:06 00:00:00 (00:00:00) 02-12-2021 22:08 aagb-mNMVAL-mPHE-GPN-B3PHG_pp_12_2632874_1_0 0.00 **Aborted (203) 1.03 rosetta python projects (vbox64) DESKTOP-I
Rosetta@home 02-12-2021 21:15 06:44:38 (06:22:05) 02-12-2021 21:17 aaas-PHE_pp-mTIC_pp-NMVAL-mSUGA_1_2517870_1_0 94.43 Reported: OK + 1.03 rosetta python projects (vbox64) DESKTOP |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Well, I use only the <project_max_concurrent> not <max_concurrent>, Jean - I thought you might be on to something. But it was a fluke. I put <name> in app_config and I set the project_concurrent to 2 and then to 1, but that is being ignored. Still running 3. I guess RAH will do what it wants to do no matter what commands you give it, short of cutting resource share which looks like the only way to get it to 2 tasks and maybe at 25% to get it to 1. Because they still want to use/reserve 7629 MB per task which times 3 is 22,887 MB which is 90% of my memory. That is with the boinc_cages_IL tasks. |
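For reference, a sketch of what an app_config.xml using the two elements named in that post could look like, generated from Python so the structure is easy to check. The `<project_max_concurrent>` and `<max_concurrent>` element names come from the post; the app name `rosetta_python_projects` is an assumption (look up the real short name in your client_state.xml), and `build_app_config` is just a helper invented for this example. Whether the client honours the limit for tasks that are already running is a separate question this does not answer.

```python
import xml.etree.ElementTree as ET


def build_app_config(project_limit, app_name=None, app_limit=None):
    """Return app_config.xml text with a project-wide limit and an optional per-app limit."""
    root = ET.Element("app_config")
    if app_name is not None and app_limit is not None:
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = app_name
        ET.SubElement(app, "max_concurrent").text = str(app_limit)
    ET.SubElement(root, "project_max_concurrent").text = str(project_limit)
    if hasattr(ET, "indent"):  # pretty-printing helper, available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Assumed app name; at most 2 Rosetta tasks overall and 1 python (VM) task.
print(build_app_config(2, app_name="rosetta_python_projects", app_limit=1))

# Memory figure quoted in the post: three tasks at 7629 MB each.
print(3 * 7629, "MB reserved for three boinc_cages tasks")
```

The generated text would be saved as app_config.xml in the Rosetta project folder inside the BOINC data directory, and the client has to re-read its config files (or be restarted) before it takes effect.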
[VENETO] boboviz Joined: 1 Dec 05 Posts: 2002 Credit: 9,790,281 RAC: 2,986 |
"Now my pcs have "got 0 new tasks" of python wus, but in the queue there are over 5000 wus..." Uh, I cannot understand. In the PC profiles the Python WUs are "disabled" (skip), but I didn't change this option. |
Jonathan Joined: 4 Oct 17 Posts: 43 Credit: 1,337,472 RAC: 0 |
You got Blacklisted |
Jonathan Joined: 4 Oct 17 Posts: 43 Credit: 1,337,472 RAC: 0 |
trevG, if your computer has only 4 GB of RAM, you don't have enough to run the VM tasks. |
©2024 University of Washington
https://www.bakerlab.org