Not getting any python work

Message boards : Number crunching : Not getting any python work

Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,633,150
RAC: 945
Message 102956 - Posted: 14 Oct 2021, 18:33:30 UTC

It's not max concurrent I was having trouble with.
That is a job limiting function.
I was doing CPU allocation per project.
It was <ncpu>x</ncpu> that screwed me up.

I used max concurrent in LHC and that had no problems.
I don't need that function anymore in LHC since they upgraded their preferences page so you can set the number of CPUs you want.

They took away the need for max_concurrent because, again, they updated their web preferences page so you can do that yourself.

IF AND WHEN RAH does this...(most likely never) then we won't be having this discussion.

Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,633,150
RAC: 945
Message 102957 - Posted: 14 Oct 2021, 18:35:32 UTC

back to the topic I started this thread with....

I just aborted 21 tasks for 4.20 in the last 24hrs.
I still get only 4.20 tasks and no python.

Any other ideas?

Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,633,150
RAC: 945
Message 102958 - Posted: 14 Oct 2021, 18:45:06 UTC - in response to Message 102944.  

OK, that is useful. It may happen only when running multiple work units (or at least more than two).
In that case, smaller memory may be better. You can't use an app_config to limit the number of work units until they get the download bug fixed.

I can run Rosetta in a second BOINC instance and limit it to one or two work units at a time, but that affects both the pythons and the non-pythons.
They need to give us some way to select them.

Thanks.


You say that you *can’t* use app_config to limit the number of work units but I’ve been doing so for a couple of years with zero problems.

Each of my projects has :-

<app_config>
<project_max_concurrent>N</project_max_concurrent>
</app_config>


and it limits the processing as required with no runaway downloads.


*exactly*
Used this in LHC before they upgraded their preferences page.
Never had runaway downloads.
1 task at a time is how I had it set.
Way back when, I had to use ncpu as well, but that is now handled by their preferences page too.

So in my OLD app_config (no need to use it now, but for example purposes here) I used max_concurrent to limit it to 1 task at a time (no runaway downloads), and ncpu was set to 4 since that is what ATLAS likes on my system.

But as we established in another conversation, when you try to use ncpu here in RAH you get slammed with 50-100 tasks that you will never be able to complete in time.
I have not tried max_concurrent here, so no idea if there are any problems or not.
I can only base my experience with that function from LHC.
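
For reference, a minimal sketch of the kind of app_config.xml being discussed, assuming the BOINC client's documented `<max_concurrent>`, `<project_max_concurrent>` and `<avg_ncpus>` elements. The application name and plan class are placeholders, not values taken from this thread; the real ones would have to be looked up in client_state.xml or on the project's applications page. The `<ncpu>` tag mentioned above does not appear to be an app_config element; the documented per-task CPU setting is `<avg_ncpus>` inside `<app_version>`.

```xml
<app_config>
  <!-- Run at most one task of this application at a time.
       EXAMPLE_APP_NAME is a placeholder, not a real application name. -->
  <app>
    <name>EXAMPLE_APP_NAME</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <!-- Ask for 4 CPUs per task for one version of that application.
       EXAMPLE_PLAN_CLASS is also a placeholder. -->
  <app_version>
    <app_name>EXAMPLE_APP_NAME</app_name>
    <plan_class>EXAMPLE_PLAN_CLASS</plan_class>
    <avg_ncpus>4</avg_ncpus>
  </app_version>
  <!-- Cap the number of running tasks across the whole project. -->
  <project_max_concurrent>2</project_max_concurrent>
</app_config>
```

The file goes in the project's folder inside the BOINC data directory and is picked up after Options, Read config files in BOINC Manager. These elements limit how many tasks run at once; whether they also change how many tasks get downloaded is exactly the point in dispute in this thread.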

Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,633,150
RAC: 945
Message 102960 - Posted: 14 Oct 2021, 19:36:48 UTC - in response to Message 102957.  

back to the topic I started this thread with....

I just aborted 21 tasks for 4.20 in the last 24hrs.
I still get only 4.20 tasks and no python.

Any other ideas?


I still had some residual in my vbox from a long time ago.
Cleaned that out and now let's see what happens.

Werinbert
Joined: 22 Jul 13
Posts: 4
Credit: 1,001,196
RAC: 0
Message 103006 - Posted: 23 Oct 2021, 4:08:56 UTC

There was a lull due to lack of work, but python tasks are aplenty now...
Yet I still am not getting any python tasks. I tried aborting the 4.20 tasks, but that didn't work. I know that I can process python tasks, as I was doing so a week or so ago.

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103007 - Posted: 23 Oct 2021, 6:34:34 UTC - in response to Message 103006.  

Maybe now there is a minimum memory requirement. All of my machines that got them had at least 48 GB, though I have not tried with less.

Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,633,150
RAC: 945
Message 103009 - Posted: 24 Oct 2021, 10:05:30 UTC

4,900+ python tasks and I get nothing.
I don't get it.
I still have 60% memory free of 24 Gigs.
I don't get memory errors and I don't think anything has been published about Python memory requirements.

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103013 - Posted: 24 Oct 2021, 20:02:28 UTC - in response to Message 103009.  
Last modified: 24 Oct 2021, 20:25:59 UTC

I don't get memory errors and I don't think anything has been published about Python memory requirements.

(1) I have gotten them on every machine I have tried (four of them), and they all have at least 48 GB of memory.
(2) You have gotten none with less memory.
(3) You apparently imagine that they publish requirements for their work units. Where have you seen them for any of them?

PS - The obvious thing to do, as I indicated before, is to detach from your other projects to clear out more memory. Maybe it would work with what you have. Your current approach is not achieving results.

Werinbert
Joined: 22 Jul 13
Posts: 4
Credit: 1,001,196
RAC: 0
Message 103014 - Posted: 24 Oct 2021, 21:25:21 UTC - in response to Message 103013.  
Last modified: 24 Oct 2021, 21:25:37 UTC

I have 32GB of memory. Two weeks ago I was getting tasks without a problem. Now I am not getting any. I doubt that a single task requires 48GB to run.

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103015 - Posted: 24 Oct 2021, 21:49:07 UTC - in response to Message 103014.  
Last modified: 24 Oct 2021, 21:52:45 UTC

I have 32GB of memory. Two weeks ago I was getting tasks without a problem. Now I am not getting any. I doubt that a single task requires 48GB to run.

The ones that I have running now on Ubuntu are using less than 2 GB. But they show as 8 GB on BoincTasks, which probably means that they reserve that much.
Since you used to get them and now don't, it seems likely that they have imposed additional restrictions on the required minimum amount of memory for download.
That is not uncommon in various projects, in case you have not run into it before.

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103017 - Posted: 25 Oct 2021, 13:28:38 UTC - in response to Message 103015.  

By the way, you aren't missing much. Even on a Ryzen 3900X with 96 GB of memory, and limiting it to using only 50% of the cores, I still get "VM job unmanageable" suspensions. That means that no more work can be downloaded until the suspension lifts. That is about 24 hours, or a reboot.

I have removed VirtualBox from two of my machines to eliminate the pythons, reassigned another machine, and may take VBox out on the last one too.
They could be more trouble than they are worth. Rosetta needs to get its act together.
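
The post does not say how the 50% core limit was applied. One common way is the "Use at most N% of the CPUs" computing preference, which corresponds to `<max_ncpus_pct>` in global_prefs_override.xml in the BOINC data directory. A minimal sketch, assuming that file and element are what is in use here:

```xml
<global_preferences>
  <!-- Use at most 50% of the CPU cores.
       The value is an assumption chosen to match the limit described above. -->
  <max_ncpus_pct>50</max_ncpus_pct>
</global_preferences>
```

The same value can be set from BOINC Manager under Options, Computing preferences, which writes this file for you.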

Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,633,150
RAC: 945
Message 103018 - Posted: 25 Oct 2021, 21:36:21 UTC - in response to Message 103015.  

I have 32GB of memory. Two weeks ago I was getting tasks without a problem. Now I am not getting any. I doubt that a single task requires 48GB to run.

The ones that I have running now on Ubuntu are using less than 2 GB. But they show as 8 GB on BoincTasks, which probably means that they reserve that much.
Since you used to get them and now don't, it seems likely that they have imposed additional restrictions on the required minimum amount of memory for download.
That is not uncommon in various projects, in case you have not run into it before.



Again, no error messages, no information that I know of indicating what kind of machine is needed.
Big black hole of information.

Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,633,150
RAC: 945
Message 103019 - Posted: 25 Oct 2021, 21:38:24 UTC - in response to Message 103017.  

By the way, you aren't missing much. Even on a Ryzen 3900X with 96 GB of memory, and limiting it to using only 50% of the cores, I still get "VM job unmanageable" suspensions. That means that no more work can be downloaded until the suspension lifts. That is about 24 hours, or a reboot.

I have removed VirtualBox from two of my machines to eliminate the pythons, reassigned another machine, and may take VBox out on the last one too.
They could be more trouble than they are worth. Rosetta needs to get its act together.



Interesting. RAH getting its act together? Hah! Now you're dreaming!
I think our role in the project is leftovers and low level stuff.
All the rest is run on their neural network and high end systems.
Kind of reminds me of TAC. That's a joke of a project for BOINC users.

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103020 - Posted: 25 Oct 2021, 23:09:33 UTC - in response to Message 103019.  

I think our role in the project is leftovers and low level stuff.
All the rest is run on their neural network and high end systems.

I am sure they do a lot of their own stuff in-house now, with the AI they are using.
But it doesn't follow that we are doing low-level stuff. It may be everything that is sent for processing by groups that do not have much of an in-house capability.

So it could be good work. But they don't tell us.

Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,633,150
RAC: 945
Message 103021 - Posted: 26 Oct 2021, 6:05:25 UTC - in response to Message 103020.  

I think our role in the project is leftovers and low level stuff.
All the rest is run on their neural network and high end systems.

I am sure they do a lot of their own stuff in-house now, with the AI they are using.
But it doesn't follow that we are doing low-level stuff. It may be everything that is sent for processing by groups that do not have much of an in-house capability.

So it could be good work. But they don't tell us.
<--that is the thing about this project. No communication. It's a shame. They used to.

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103050 - Posted: 29 Oct 2021, 20:26:03 UTC
Last modified: 29 Oct 2021, 20:38:02 UTC

I have attached a Ryzen 3600 with 48 GB of memory, and another Ryzen 3600 with 32 GB of memory. Both are on Ubuntu 20.04.3, and have VirtualBox installed.
Both picked up the pythons right away; the 32 GB machine downloaded five pythons, and four regular Rosettas in the first batch.

I limited them to 9 virtual cores running at a time to try to prevent the "VM job unmanageable" errors. So far, I have not gotten any errors on the 48 GB machine, but am just starting on the 32 GB machine.
So if you are not getting any pythons with 32 GB, there is a problem somewhere. I usually just wipe out the OS and start over if I can't find it. Reinstalling is so fast now that it is easier than spending much time on troubleshooting.
Good luck.

Scottie McKinley
Joined: 14 Jan 21
Posts: 5
Credit: 2,376,461
RAC: 0
Message 103058 - Posted: 30 Oct 2021, 13:54:12 UTC

I couldn't get pythons regardless of settings. I just did a full disk wipe and reinstall of Windows 10 with the BOINC client dated October 21 (I think). Now pythons are running without me changing any settings. Weird that it works now.

Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103060 - Posted: 30 Oct 2021, 15:28:02 UTC - in response to Message 103058.  

One last thought for Win10. If you have ever enabled "Hyper-V", be sure to disable it (in "Turn Windows features on or off").
It is incompatible with VirtualBox.

Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,633,150
RAC: 945
Message 103074 - Posted: 31 Oct 2021, 23:26:40 UTC - in response to Message 103060.  

One last thought for Win10. If you have ever enabled "Hyper-V", be sure to disable it (in "Turn Windows features on or off").
It is incompatible with VirtualBox.



For me, Hyper-V is not on. Never has been.
There is no way I am wiping out my OS to get some stubborn tasks.
There has to be something else holding up Python tasks.
Something that can be done without wiping out Windows and spending hours reinstalling and reconfiguring.

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1465
Credit: 14,162,844
RAC: 15,905
Message 103075 - Posted: 31 Oct 2021, 23:33:21 UTC - in response to Message 103074.  
Last modified: 31 Oct 2021, 23:34:34 UTC

There has to be something else holding up Python tasks.
Something that can be done without wiping out windows and spending hours reinstalling and re configuring.
BOINC Manager, Options, Event log options:
work_fetch_debug, cpu_sched_debug and similar, rr_simulation, etc.
Enable those options (or just one of them) and expect a huge amount of output in the Event log. It shows what the Manager is actually doing: the values it is working with, the results it is producing, and the decisions it is making.
Grant
Darwin NT
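
The same flags can also be set by hand in cc_config.xml in the BOINC data directory. A minimal sketch, assuming the standard lowercase flag names and that no other settings need to be preserved in that file:

```xml
<cc_config>
  <log_flags>
    <!-- Show why work is or is not being requested from each project. -->
    <work_fetch_debug>1</work_fetch_debug>
    <!-- Show the CPU scheduler's decisions about which tasks to run. -->
    <cpu_sched_debug>1</cpu_sched_debug>
    <!-- Show the round-robin simulation used to estimate deadlines and shortfall. -->
    <rr_simulation>1</rr_simulation>
  </log_flags>
</cc_config>
```

After saving, use Options, Read config files in BOINC Manager, then set the flags back to 0 once you have captured a work request, since the log grows quickly.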
©2024 University of Washington
https://www.bakerlab.org