Problems and Technical Issues with Rosetta@home

Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103884 - Posted: 23 Dec 2021, 16:38:46 UTC - in response to Message 103882.  

most work requests get the `no tasks sent` and no reason why message.

Do you have any of the "Vm job unmanageable" ones on your machine? That will prevent any more from downloading.
You need to reboot to fix it. Or find another project.
ID: 103884
.clair.

Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 103885 - Posted: 23 Dec 2021, 20:19:02 UTC - in response to Message 103884.  

most work requests get the `no tasks sent` and no reason why message.

Do you have any of the "Vm job unmanageable" ones on your machine? That will prevent any more from downloading.
You need to reboot to fix it. Or find another project.

Luckily, I have never had any `unmanageable` jobs or things like that.
I only have 5 error tasks: one of them was `cancelled by server`, and another three were the `one-minute wonders` that run for hours and do nothing, so I aborted them.
I did reboot the computer earlier today anyway, just in case something had gone funky.
Just had a look at the server status: only three R4.2 jobs in the queue.
ID: 103885
Jonathan

Joined: 4 Oct 17
Posts: 43
Credit: 1,337,472
RAC: 0
Message 103886 - Posted: 23 Dec 2021, 20:29:37 UTC - in response to Message 103885.  

Check the details tab of your individual computer(s) on this website and see if "VirtualBox VM jobs" is showing Skip. If you want VM / Python jobs, that has to say Allow
ID: 103886
Greg_BE

Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103887 - Posted: 23 Dec 2021, 23:20:25 UTC - in response to Message 103884.  

most work requests get the `no tasks sent` and no reason why message.

Do you have any of the "Vm job unmanageable" ones on your machine? That will prevent any more from downloading.
You need to reboot to fix it. Or find another project.



Jim, have a look here at an older reddit thread.
It's too late at night here in the EU to try and understand what this person is saying, but maybe you understand it?
ID: 103887
.clair.

Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 103888 - Posted: 23 Dec 2021, 23:46:18 UTC - in response to Message 103886.  

Check the details tab of your individual computer(s) on this website and see if "VirtualBox VM jobs" is showing Skip. If you want VM / Python jobs, that has to say Allow

I have two computers that are worth running VM jobs on and for both of them the button "VirtualBox VM jobs" is showing "Allow"
For some strange reason they stopped getting ANY work, one has now got two ordinary R4.2 tasks ,
The pythons don't want to come out of their cages and get crunched.
It may just be a strange scheduler thing that will work itself out in time.
ID: 103888
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1725
Credit: 18,391,361
RAC: 19,589
Message 103891 - Posted: 24 Dec 2021, 6:50:55 UTC - in response to Message 103888.  

Check the details tab of your individual computer(s) on this website and see if "VirtualBox VM jobs" is showing Skip. If you want VM / Python jobs, that has to say Allow

I have two computers that are worth running VM jobs on and for both of them the button "VirtualBox VM jobs" is showing "Allow"
For some strange reason they stopped getting ANY work, one has now got two ordinary R4.2 tasks ,
The pythons don't want to come out of their cages and get crunched.
It may just be a strange scheduler thing that will work itself out in time.
Or there were issues with them & they were black listed by the server.
Grant
Darwin NT
ID: 103891
Mad_Max

Joined: 31 Dec 09
Posts: 209
Credit: 26,458,767
RAC: 18,205
Message 103893 - Posted: 24 Dec 2021, 8:16:04 UTC - in response to Message 103718.  
Last modified: 24 Dec 2021, 8:17:02 UTC

I had gone as far to untick all the disk space boxes to give it unlimited use of the disk
The boxes aren't tickable, they require values. And one value in any one of the options overrides the values in any of the other two when it comes to what disk space is actually available.

They are quite tickable - there are checkboxes to the left of each value box which turn off the corresponding limit.

I was referring to the web based settings.
If you've only got one system, local Setting are ok. More than one, web based settings make life much easier.

Web based settings also have the same checkboxes for disk and network usage limits as local settings do. At least here in the Rosetta server's web based settings for BOINC.
ID: 103893
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1725
Credit: 18,391,361
RAC: 19,589
Message 103894 - Posted: 24 Dec 2021, 9:01:04 UTC - in response to Message 103893.  

I had gone as far to untick all the disk space boxes to give it unlimited use of the disk
The boxes aren't tickable, they require values. And one value in any one of the options overrides the values in any of the other two when it comes to what disk space is actually available.

They are quite tickable - there are checkboxes to the left of each value box which turn off corresponding limit.

I was referring to the web based settings.
If you've only got one system, local Setting are ok. More than one, web based settings make life much easier.

web based settings also have same checkboxes for disk and network usage limits as local settings do. At least here on Rosetta server web based settings for BOINC.
Hmm.
That's new.
Clicked on Edit and up come the check boxes with the value boxes next to them.
Grant
Darwin NT
ID: 103894
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103896 - Posted: 24 Dec 2021, 14:44:04 UTC - in response to Message 103887.  

Jim, have a look here at an older reddit thread.

Thanks, but I think Process Lasso is for Windows. I am on Linux, and don't change the default priorities.

You may be able to avoid the problem by running only one or two cores, but that defeats the purpose for me.
They have to fix it at the Rosetta end. LHC has managed to do so with their own Vbox wrapper.
ID: 103896
.clair.

Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 103897 - Posted: 24 Dec 2021, 15:27:41 UTC - in response to Message 103891.  

Check the details tab of your individual computer(s) on this website and see if "VirtualBox VM jobs" is showing Skip. If you want VM / Python jobs, that has to say Allow

I have two computers that are worth running VM jobs on and for both of them the button "VirtualBox VM jobs" is showing "Allow"
For some strange reason they stopped getting ANY work, one has now got two ordinary R4.2 tasks ,
The pythons don't want to come out of their cages and get crunched.
It may just be a strange scheduler thing that will work itself out in time.
Or there were issues with them & they were black listed by the server.

Grant (SSSF), YOU GOT IT. I am now getting Pythons again, 35 running.
The buttons work backwards {sort of}.
If the button is showing "Allow", that computer is set to "Skip" and will NOT get VB/Python work.
To get VB/Python work again, click the "Allow" button; the button then changes to "Skip" {for those that don't want them etc}.
After doing the `click and collect` I got this message on the webpage:
------------------------------
Host updated
This host can now run VirtualBox VM jobs
This change will take effect the next time the host communicates with this project. If VM jobs cannot run due VirtualBox errors, this host will be flagged again to skip VM jobs.
_________________________
And so whoever has been blacklisted will have to click the "Allow" button again to get tasks again, to order in another takeaway of Pythons {oops sorry wrong website :) }
ID: 103897
Greg_BE

Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 103898 - Posted: 24 Dec 2021, 17:29:44 UTC - in response to Message 103896.  
Last modified: 24 Dec 2021, 18:20:48 UTC

Jim, have a look here at an older reddit thread.

Thanks, but I think Process Lasso is for Windows. I am on Linux, and don't change the default priorities.

You may be able to avoid the problem by running only one or two cores, but that defeats the purpose for me.
They have to fix it at the Rosetta end. LHC has managed to do so with their own Vbox wrapper.


I keep losing QuChem to that problem. I am now reducing it to a single task and core from 2.
I thought Python was causing the trouble, but that does not seem to be the case.
ID: 103898
computezrmle

Joined: 9 Dec 11
Posts: 63
Credit: 9,680,103
RAC: 0
Message 103903 - Posted: 25 Dec 2021, 11:52:29 UTC - in response to Message 103896.  

... Process Lasso is for Windows. I am on Linux ...

On Linux, systemd slices can be used to group processes and modify/control their priorities.
The following steps can be used on a Linux system running cgroups v2.
Cgroups v1, used by older kernels, has a slightly different syntax as well as different default values.

Shut down BOINC and create/modify the systemd configuration as follows:

1.
Create a slice description /etc/systemd/system/boinc.slice
Content:
[Unit]
Description=BOINC main Slice
Before=slices.target
Requires=-.slice
After=-.slice


2.
Add this to the [Service] section of your BOINC service file (usually /etc/systemd/system/boinc-client.service):
Slice=boinc.slice


3.
Run "systemctl daemon-reload" (or reboot) and restart BOINC.
To check whether BOINC (and all processes started by BOINC or below) is running as part of the boinc slice, run "systemctl status boinc-client.service" and check for:
CGroup: /boinc.slice



The default priority for all slices on the same level is 100.
This means system(slice), user(slice) and boinc(slice) are now running at the same priority relative to each other.

This simple example (permanently) modifies the mentioned scheduler parameters:
systemctl set-property boinc.slice CPUWeight=70 CPUQuota=300% CPUQuotaPeriodSec=400ms

CPUWeight=70: reduces the relative priority (CPU weight) of all processes running in this slice to 70; the weight only matters when slices compete for CPU time
=> System processes and interactive (user) processes stay responsive even on fully loaded systems
CPUQuota=300%: limits the slice to at most the CPU time of 3 cores, even if the system has many more and even if BOINC runs many more tasks.
CPUQuotaPeriodSec=400ms: sets the period over which the CPU quota is measured (Linux default: 100ms); a longer period lets a process run for longer stretches before it is throttled
=> usually better CPU cache efficiency for long running background processes

Further information can be found in the systemd manual and the kernel manual.
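
As a rough illustration of what the example values above mean in practice, here is a minimal Python sketch. It assumes cgroups v2 semantics (under contention, CPU time is shared among sibling slices in proportion to their weights, and a quota is a CPU-time budget per period) and assumes that only system.slice, user.slice and boinc.slice are competing. The function names are made up for this sketch and are not part of systemd or BOINC.

```python
def share_under_contention(weights, name):
    """Fraction of CPU time a slice gets when ALL listed sibling slices are busy."""
    return weights[name] / sum(weights.values())


def cpu_budget_per_period_us(quota_percent, period_ms):
    """Translate CPUQuota (percent of one core) and CPUQuotaPeriodSec (ms)
    into a (budget, period) pair in microseconds."""
    period_us = period_ms * 1000
    budget_us = quota_percent * period_us // 100
    return budget_us, period_us


# Assumed sibling slices: the two defaults (weight 100) plus the boinc slice at 70.
weights = {"system.slice": 100, "user.slice": 100, "boinc.slice": 70}

boinc_share = share_under_contention(weights, "boinc.slice")
budget_us, period_us = cpu_budget_per_period_us(300, 400)

print(f"boinc.slice share when all slices are busy: {boinc_share:.1%}")
print(f"CPU time budget: {budget_us} us per {period_us} us period")
```

Under these assumptions boinc.slice gets about 26% of the CPU while all three slices are busy (and more when the others are idle, up to the quota), and the quota works out to 1.2 s of CPU time per 0.4 s period, which is three cores' worth.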
ID: 103903
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 103905 - Posted: 25 Dec 2021, 16:52:36 UTC - in response to Message 103903.  

On Linux systemd slices can be used to group processes and modify/control their priorities.

I have tried modifying priorities on a temporary basis, but have never found any that correct the "Vm job unmanageable" problem.

This should be done by the project anyway, but thanks for the procedure.
ID: 103905
Mr P Hucker

Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 103911 - Posted: 26 Dec 2021, 15:30:45 UTC - in response to Message 103885.  

most work requests get the `no tasks sent` and no reason why message.

Do you have any of the "Vm job unmanageable" ones on your machine? That will prevent any more from downloading.
You need to reboot to fix it. Or find another project.

Luckily, I have never had any `unmanageable` jobs or things like that.
I only have 5 error tasks: one of them was `cancelled by server`, and another three were the `one-minute wonders` that run for hours and do nothing, so I aborted them.
I did reboot the computer earlier today anyway, just in case something had gone funky.
Just had a look at the server status: only three R4.2 jobs in the queue.


I got a "the VB environment needs cleaning up" message, which paused a task. I went into VBox itself and removed some images that LHC had left in there cluttering it up, then rebooted the computer. Everything is fine now.
ID: 103911
Mr P Hucker

Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 103912 - Posted: 26 Dec 2021, 15:31:43 UTC - in response to Message 103894.  

I had gone as far to untick all the disk space boxes to give it unlimited use of the disk
The boxes aren't tickable, they require values. And one value in any one of the options overrides the values in any of the other two when it comes to what disk space is actually available.

They are quite tickable - there are checkboxes to the left of each value box which turn off corresponding limit.

I was referring to the web based settings.
If you've only got one system, local Setting are ok. More than one, web based settings make life much easier.

web based settings also have same checkboxes for disk and network usage limits as local settings do. At least here on Rosetta server web based settings for BOINC.
Hmm.
That's new.
Clicked on Edit and up come the check boxes with the value boxes next to them.


What are these boxes? I just use local settings, even though I have 7 computers, because they're all different. Pythons run ok here.
ID: 103912
Falconet

Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 2,018
Message 103913 - Posted: 26 Dec 2021, 16:42:18 UTC
Last modified: 26 Dec 2021, 16:43:41 UTC

Looks like a small batch of Zika/West Nile stuff. Around 400,000 tasks.
That's the 3rd or 4th batch of work on those viruses in recent times.
ID: 103913
.clair.

Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 103917 - Posted: 27 Dec 2021, 0:06:42 UTC - in response to Message 103912.  
Last modified: 27 Dec 2021, 0:09:48 UTC

I had gone as far to untick all the disk space boxes to give it unlimited use of the disk
The boxes aren't tickable, they require values. And one value in any one of the options overrides the values in any of the other two when it comes to what disk space is actually available.

They are quite tickable - there are checkboxes to the left of each value box which turn off corresponding limit.

I was referring to the web based settings.
If you've only got one system, local Setting are ok. More than one, web based settings make life much easier.

web based settings also have same checkboxes for disk and network usage limits as local settings do. At least here on Rosetta server web based settings for BOINC.
Hmm.
That's new.
Clicked on Edit and up come the check boxes with the value boxes next to them.


What are these boxes? I just use local settings, even though I have 7 computers, because they're all different. Pythons run ok here.

I was setting the disk usage in BOINC Manager as large as possible (click the `Options` tab, then `computing preferences`, then the `Disk and memory` settings) to try and get rid of some messages about low disk space.
It did not work, even with 200GB+ of disk space "free, available to BOINC".
I still sometimes get the following, and this is only ten minutes after a reboot:
-----------------
26/12/2021 16:31:29 Rosetta@home Message from server : Rosetta needs 1907.35MB more disk space. You currently have 0.00 MB available and it needs 1907.35 MB.
26/12/2021 16:31:29 Rosetta@home Message from server : rosetta python projects needs 19073.49MB more disk space. You currently have 0.00 MB available and it needs 19073.49 MB.
--------------------
and that is with - on the `Disk` tab of boinc manager
used by boinc - 123.97GB
free . available to BOINC - 226.19GB
used by other programs - 114.99GB
-------------
I have given up caring about those messages so long as rosetta works.
ID: 103917
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1725
Credit: 18,391,361
RAC: 19,589
Message 103918 - Posted: 27 Dec 2021, 2:14:51 UTC - in response to Message 103917.  
Last modified: 27 Dec 2021, 2:19:52 UTC

I was setting the disk usage in Boinc Manager as large as possible - click the `Options` tab, then `computing preferences` then `Disk and memory` settings , to try and get rid of some messages about low disk space.
It did not work . even with 200GB+ of disk space "Free available to Boinc"
It doesn't matter how large the "Use no more than" value is, if the "Leave at least" & "Use no more than % of total" result in a lower amount being available.
As it says at the top of those options, the most restrictive setting is the one that is used.
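
To illustrate the "most restrictive setting wins" rule, here is a small Python sketch. It is an assumption about how the three limits combine, not BOINC's actual code, and the preference values in the example are hypothetical; only the disk figures are the ones posted earlier in this thread.

```python
def boinc_available_gb(total_gb, free_gb, boinc_used_gb,
                       max_used_gb=None, min_free_gb=None, max_used_pct=None):
    """Return how much more disk BOINC may use, in GB (sketch, not BOINC's code)."""
    # Each enabled preference gives an upper bound on BOINC's total disk usage.
    bounds = [boinc_used_gb + free_gb]  # cannot use more than physically exists
    if max_used_gb is not None:         # "Use no more than X GB"
        bounds.append(max_used_gb)
    if min_free_gb is not None:         # "Leave at least X GB free"
        bounds.append(boinc_used_gb + free_gb - min_free_gb)
    if max_used_pct is not None:        # "Use no more than X% of total"
        bounds.append(total_gb * max_used_pct / 100.0)
    # The most restrictive bound wins; what is already used counts against it.
    return max(0.0, min(bounds) - boinc_used_gb)


# Disk figures from the BOINC Manager `Disk` tab posted earlier (GB).
boinc_used, free, other = 123.97, 226.19, 114.99
total = boinc_used + free + other

# Hypothetical preferences: same huge "Use no more than", different percentage.
generous = boinc_available_gb(total, free, boinc_used,
                              max_used_gb=1000, min_free_gb=1, max_used_pct=90)
tight = boinc_available_gb(total, free, boinc_used,
                           max_used_gb=1000, min_free_gb=1, max_used_pct=25)

print(f"90% limit: {generous:.2f} GB available")
print(f"25% limit: {tight:.2f} GB available")
```

In the second case the large "Use no more than" value changes nothing, because the percentage limit is already below what BOINC is using, so nothing is left available.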



I have given up caring about those messages so long as rosetta works.
When it comes to Python, you're aborting more than you actually process. So I wouldn't consider that as working.
Grant
Darwin NT
ID: 103918
.clair.

Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 103920 - Posted: 27 Dec 2021, 12:15:41 UTC - in response to Message 103918.  

I have given up caring about those messages so long as rosetta works.
When it comes to Python, you're aborting more than you actually process. So i wouldn't consider that as it being working.

That is true.
I don't want to abort them; I am OK with running anything that will do something useful.
I take it you have looked at my returned work units.
The problem is that the ones I abort are the `one-minute wonders` that have only a few seconds of CPU time after several hours of elapsed time, and it seems pointless to continue running them.
I did see one of mine had run for 23 hours before I gave up on it;
in another thread one was seen to run for over three days and still not finish.
What do you do with them?
Do they ever finish and produce useful work?
Any idea if the overrun watchdog is working with Python work units? It does not look like it from what I can see.
ID: 103920
Tomcat雄猫

Joined: 20 Dec 14
Posts: 180
Credit: 5,386,173
RAC: 0
Message 103923 - Posted: 27 Dec 2021, 15:11:02 UTC - in response to Message 103920.  
Last modified: 27 Dec 2021, 15:13:22 UTC

I've been out of the loop for a little while. Did they recently fix the RAM requirements for the vBox tasks? I'm running 7 Rosetta Python tasks + 1 WCG ARP on 16 GB of RAM.

I'm not having issues, for once.
ID: 103923



©2024 University of Washington
https://www.bakerlab.org