Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
.clair. Joined: 2 Jan 07 Posts: 274 Credit: 26,399,595 RAC: 0 |
I have given up on manually aborting tasks, it takes longer than `just leave them to it` and anyway, it don't like me any more . . . 20/02/2022 00:33:22 | Rosetta@home | This computer has finished a daily quota of 37 tasks and so little time later 20/02/2022 01:13:43 | Rosetta@home | This computer has finished a daily quota of 1 tasks and I have left it alone today. Never mind, I will save on electric |
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,534,176 RAC: 10,708 |
I haven't got my head around what the specific issue is with rb tasks crashing out, but I do see people reporting problems with those too. I didn't get a chance to look at them but rb_02_18 tasks <seemed> to complete successfully. Which would be great, except I haven't seen any further rb tasks to run since <sigh> It looks like these movingstub tasks are going to be allowed to error out, seeing as they take up so very little user runtime. A project problem more than ours. Without having had any reply, it may be they've taken the view that linux systems will continue to run them successfully and Windows systems will error out and they get what they get. That's happened before. In the meantime, the queue that started off at 4.4m (inc 2.2m pythons) is now down to 2.8m (inc 2.2m pythons) so we're ~75% through them and it won't be too long before they're wiped out anyway. I know that's not satisfactory, but it may be how this episode turns out |
[VENETO] boboviz Joined: 1 Dec 05 Posts: 2002 Credit: 9,785,717 RAC: 5,211 |
Warning! RAH is sending out moving stubs in bulk now. I just aborted over 130 of them in the last 5 minutes. Despite the tweet of the Rosetta@Home account, I continue to receive "movingstubb" WUs (and the errors continue) |
Kissagogo27 Joined: 31 Mar 20 Posts: 86 Credit: 2,977,216 RAC: 1,905 |
It seems that the server side of BOINC has decided on its own to use the 32-bit app with my 64-bit W7 OS because of the huge number of bad tasks. An automatic process to eradicate a bad app? But 4.20 64-bit and 4.21 32-bit are not bad applications ... |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Warning! RAH is sending out moving stubs in bulk now. I just aborted over 130 of them in the last 5 minutes. One, because it's the weekend, and two, as Sid pointed out a few posts below, it is possible that since the Linux ones run just fine they will just let the Windows ones go through and error out. After 2 or 3 errors it's a dead task. Either they don't care or they don't have the time or understanding to purge these tasks. |
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,534,176 RAC: 10,708 |
The problem ones you guys are posting are rb_02_14 or rb_02_16 and the ones I have are rb_02_18 Seems like I forgot what PC they ran on. It was the PC I just left and the rb_02_18 tasks <did> run successfully. Still haven't received any others It looks like these movingstub tasks are going to be allowed to error out, seeing as they take up so very little user runtime. A project problem more than ours. Now down to 2.2m and none unsent, so looks like those left are all Python tasks. No more to come down - just returning 300k duff tasks (except for those running linux). Until next time... |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
The problem ones you guys are posting are rb_02_14 or rb_02_16 and the ones I have are rb_02_18. I killed the remaining 33 I had on my system. Good riddance! Now the other projects need to get some work done so maybe later I get python back. Oh..now I am insulted...the damn system knocked me off python for some reason (too many aborts of 4.2?) so I had to connect again. Can't they do anything right? |
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,534,176 RAC: 10,708 |
Despite the tweet of the Rosetta@Home account, I continue to receive "movingstubb" WUs (and the errors continue). It doesn't seem to be a case of that imo. It's not that there's anything wrong with the tasks - just that they won't run on Windows machines but will run on Linux. So how can they select "bad" ones to delete? They can't. The selection of "bad" tasks is very efficient. Allow Windows machines to download them and they're all rejected after 20 seconds. Linux machines run them successfully to completion. A perfect solution. All the machines that can run them do. All the ones that can't, don't. The price paid is in bandwidth of Windows users, that's true, but nothing more. I don't personally consider people's frustration to be worth a second thought |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
| Rosetta@home | Started download of AIMNet_minimization_python_project.py AIMNet - Atoms In Molecules Neural Network Potential. This repository contains reference AIMNet implementation along with some examples and benchmarks. Accurate and transferable multitask prediction of chemical properties with an atoms-in-molecules neural network. Anyone got details on this? I am just looking fast...not digging around yet. |
kotenok2000 Joined: 22 Feb 11 Posts: 272 Credit: 507,897 RAC: 334 |
Do you have problems with download speed? My download speed is 595 KBps |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1725 Credit: 18,399,907 RAC: 19,807 |
Well that's the end of that batch, 50% or more of which were errored out. What a waste. Grant Darwin NT |
entity Joined: 8 May 18 Posts: 19 Credit: 6,123,514 RAC: 4,804 |
Are the vbox tasks limited as to how many can run concurrently? Can only get 17 to run at the same time. All others are in "waiting to run" status in BOINC. No app config file in the projects directory. |
kotenok2000 Joined: 22 Feb 11 Posts: 272 Credit: 507,897 RAC: 334 |
Can you try to change the use at most memory setting in computing preferences > disk and memory? You have hidden your computers so I can't see them. |
[VENETO] boboviz Joined: 1 Dec 05 Posts: 2002 Credit: 9,785,717 RAC: 5,211 |
The selection of "bad" tasks is very efficient. VERY efficient. It's up to the volunteers to kill the bad tasks. VERY efficient. The price paid is in bandwidth of Windows users, that's true, but nothing more. Yep, 3 days of wasted time and bandwidth WITHOUT ANY APOLOGIES from the project is "nothing more". Thank you for your consideration of volunteers. Are you speaking on behalf of the project administrators? If yes, it's very serious. |
entity Joined: 8 May 18 Posts: 19 Credit: 6,123,514 RAC: 4,804 |
Can you try to change the use at most memory setting in computing preferences > disk and memory? Use at most setting for memory is set to 99% and 100% for the CPUs. Server has 128 threads and 256GB of memory yet only 17 tasks are running. No message in log indicating that BOINC is waiting for any resource. BOINC has copied the VDI file to 88 slots, resulting in about 697GB of used disk space. Disk is a 900GB disk. BOINC is told to leave 1% free as the most restrictive parameter. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Can you try to change the use at most memory setting in computing preferences > disk and memory? What else is running on your system? How much memory debt do you have to other programs and OS requirements? In theory, with nothing else running, you should be able to run around 100 pythons, but maybe Vbox can't handle that? 88 slots, so you have 88 tasks downloaded, but can run only 17? I'll leave this for the experts to work on...I think, but cannot say for sure, that Vbox might be the limiter. |
robertmiles Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 2,014 |
Are the vbox tasks limited as to how many can run concurrently? Can only get 17 to run at the same time. All others are in "waiting to run" status in BOINC. No app config file in the projects directory. Each of them reserves a large amount of memory, almost 8 GB. Unless that amount of memory is free, they won't start, even if they would shift to using much less memory if they did start. |
kotenok2000 Joined: 22 Feb 11 Posts: 272 Credit: 507,897 RAC: 334 |
My two tasks allocate 6144 |
entity Joined: 8 May 18 Posts: 19 Credit: 6,123,514 RAC: 4,804 |
Are the vbox tasks limited as to how many can run concurrently? Can only get 17 to run at the same time. All others are in "waiting to run" status in BOINC. No app config file in the projects directory. There is 202GB of free memory as of this writing. BOINC client was not acting correctly so I restarted the client. It took almost 15 minutes for the client to restart the 17 VBoxHeadless processes. During the restart the client runs 100% busy and BOINCMgr is totally unresponsive. However, once you let the processes complete the startup and the boinc process drops back to under 5% utilization you can start more processes. Starting 10 processes causes the boinc process to jump back to 100% busy for about 10 minutes. Once the client drops back to 5% the tasks show as running. It seems to be related to BOINC and VBox. I/O is negligible during the starting of tasks. I think I can babysit this thing and get it to where I want it. Thanks for the insights. |
Sid Celery Joined: 11 Feb 08 Posts: 2141 Credit: 41,534,176 RAC: 10,708 |
The selection of "bad" tasks is very efficient. I'm a bit confused. All contributors offer their time and resources to the project. So if it's for users (millions) to sort through runnable and non-runnable tasks so the researchers (a handful) don't have to, that conforms perfectly. The price paid is in bandwidth of Windows users, that's true, but nothing more. Lol, if I was I'd have been thrown out years ago. It's pure pragmatism. If it wasn't for the bad news, there wouldn't be any news at all. Sorry and thanks for your time and effort. Bygones. |
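
For anyone wanting to sanity-check the vbox limits discussed above, here is a rough back-of-envelope sketch using only the figures quoted in the thread for entity's server. It is not how BOINC schedules work. It assumes every task reserves the same amount of memory, that kotenok2000's "6144" is in MB, and that each further slot would use the same disk space as the 88 already copied. The function names are made up for illustration.

```python
# Back-of-envelope ceilings for entity's server, using only figures quoted in the thread.
# Assumptions (not confirmed by the project): every task reserves the same amount of RAM,
# kotenok2000's "6144" is in MB, and every slot uses the same disk as the 88 existing ones.

TOTAL_RAM_GB = 256        # "256GB of memory"
RAM_USE_FRACTION = 0.99   # "use at most" memory setting of 99%
THREADS = 128             # "Server has 128 threads"
DISK_GB = 900             # "Disk is a 900GB disk"
DISK_KEEP_FREE = 0.01     # BOINC told to leave 1% free
SLOTS_SO_FAR = 88         # slots that already hold a copy of the VDI file
SLOTS_DISK_GB = 697       # disk used by those slots


def max_tasks_by_memory(reserve_gb):
    """Tasks that fit if each one reserves reserve_gb of the RAM BOINC may use."""
    return int(TOTAL_RAM_GB * RAM_USE_FRACTION // reserve_gb)


def max_tasks_by_disk():
    """Slots that fit on disk at the per-slot size observed so far."""
    per_slot_gb = SLOTS_DISK_GB / SLOTS_SO_FAR
    usable_gb = DISK_GB * (1 - DISK_KEEP_FREE)
    return int(usable_gb // per_slot_gb)


if __name__ == "__main__":
    print("memory ceiling at 8 GB per task:", max_tasks_by_memory(8))
    print("memory ceiling at 6 GB per task:", max_tasks_by_memory(6))
    print("disk ceiling:", max_tasks_by_disk())
    print("thread ceiling:", THREADS)
```

Under these assumptions the ceilings come out at 31 or 42 tasks by memory and 112 by disk, all above the 17 tasks entity saw running. That fits entity's later finding that the slow client start-up, rather than memory or disk, was holding tasks back.
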
©2024 University of Washington
https://www.bakerlab.org