Problems and Technical Issues with Rosetta@home

strombergFs
Joined: 18 Mar 21
Posts: 11
Credit: 150,490
RAC: 0
Message 100905 - Posted: 30 Mar 2021, 20:18:39 UTC - in response to Message 100903.  

Thank you very much for the information.
Is that high memory requirement normal?
In the week since the C4 started running, I had no problems.

I had read that it is possible to run Rosetta under Linux on the Raspberry Pi 4 and other ARM boards.
Now I have 15 (!) C4s waiting to run Rosetta.
It is a bit sad that the information is so uncertain.

Is it likely that tasks of around 1 GB will "normally" be available?
Or was I just lucky these last few days?

Thanks.

Does somebody have more information regarding the use of swap with Rosetta?
Is there a reason swap is not used (at least not on the C4 so far)?
Thanks

strombergFs
Joined: 18 Mar 21
Posts: 11
Credit: 150,490
RAC: 0
Message 100906 - Posted: 30 Mar 2021, 20:19:49 UTC - in response to Message 100903.  

The MacBook errors are okay. I had one freeze; now everything is okay again.

Brian Nixon
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 765
Message 100907 - Posted: 30 Mar 2021, 21:10:57 UTC - in response to Message 100905.  

Is that high memory requirement normal?
It wasn’t normal before today. Whether it becomes the new normal, we’ll have to wait and see. It’s also not yet clear whether these tasks do actually need all the memory and disk space they say they do, or whether it’s a misconfiguration.

Sid Celery
Joined: 11 Feb 08
Posts: 1815
Credit: 33,426,209
RAC: 8,508
Message 100909 - Posted: 30 Mar 2021, 23:29:46 UTC - in response to Message 100903.  
Last modified: 30 Mar 2021, 23:34:44 UTC

My Odroid C4 does not get any more tasks.
I got a note: Rosetta suddenly requires 6.x GB. Around 3.7 GB is available.
I don't understand why this is; on all the other days I could run 4 tasks without any problem.
Now I do not get even one task.
Also, I have 5 GB of swap but Rosetta does not use any of it, though I think that was already the case before.

My other two computers have again new tasks, all okay there.
There’s a new batch of work units today that are marked as needing 6.6 GB of memory, so the server won’t even send them to hosts it knows have less than that. I’m not sure swap space is taken into account in that decision. You might want to consider a different project until Rosetta comes back with tasks suited to smaller machines.

The i7 has only picked up resends of older tasks (with much smaller memory requirements).

The MacBook has returned a lot of errors lately. Might be a platform bug; who knows.

Interesting/weird...

Checking my main PC, which isn't reporting any disk or RAM errors: it has 18 GB of RAM free (60+%) and is running 16 tasks concurrently, but Rosetta tasks are using less than 6 GB of RAM in total and, for a while, 70% CPU in total (though I'm having some weird problems on the whole PC that may be affecting that CPU usage figure, so don't pay attention to that)

Edit: the available RAM and disk space are definitely tied to the settings in the Disk & Memory tab and don't account for swap space at all
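
To make the point about the Disk & Memory tab concrete, here is a minimal sketch, assuming the RAM budget is simply physical RAM multiplied by the "use at most N %" preference. The function name and the 90 % figure are hypothetical, chosen only for illustration; the 4 GB and 5 GB figures are the board and swap sizes mentioned in this thread.

```python
def usable_ram_gb(total_ram_gb: float, max_percent: float, swap_gb: float = 0.0) -> float:
    """RAM budget implied by a 'use at most N %' memory preference.

    swap_gb is accepted only to show that it plays no part in the result
    (an assumption based on the observation in the post above).
    """
    return total_ram_gb * max_percent / 100.0


# A 4 GB board with a hypothetical 90 % memory preference and 5 GB of swap.
without_swap = usable_ram_gb(4.0, 90)
with_swap = usable_ram_gb(4.0, 90, swap_gb=5.0)
print(without_swap, with_swap)
```

Under these assumptions the budget stays well below the 6.6 GB the new work units ask for, with or without swap.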

mrhastyrib
Joined: 18 Feb 21
Posts: 90
Credit: 2,454,871
RAC: 258
Message 100911 - Posted: 31 Mar 2021, 0:26:16 UTC - in response to Message 100883.  
Last modified: 31 Mar 2021, 0:26:54 UTC

Not just memory: I’m now seeing
Rosetta needs 5472.67MB more disk space
(on an admittedly very small partition set aside for the BOINC data directory, but it’s been fine for the last year) – so it looks like the new batch is unusually resource-hungry…


Rosetta needs 7437.13MB more disk space.

Isn't that adorable? Say hello to two less hosts after they finish their current tasks, @Rosetta. I don't know if I have the time that's required to provide the space that is needed.

"Makes use of unused CPU cycles" sounds so easy, doesn't it? I know that I have been here but a short time, but @Rosetta is higher maintenance than any turbulent girlfriend that I've ever had.

mippi
Joined: 26 Mar 20
Posts: 5
Credit: 52,692,779
RAC: 0
Message 100912 - Posted: 31 Mar 2021, 0:36:46 UTC - in response to Message 100886.  

I use a Linux workstation with 12GB memory and 12 cores. I have never seen such a message before.
With 12 cores you need more than 16GB of RAM: a single Task can require over 4GB of RAM, although 1 to 1.3GB is more common.
For a while now the memory requirements for Tasks have been very low, but we did get a batch of work that required over 1GB of RAM shortly before we ran out of work for a while there.
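
The arithmetic behind that "more than 16GB" figure can be sketched as follows, assuming one Task per core and using only the per-Task sizes quoted above (1 to 1.3GB typical, over 4GB worst case). The function name is hypothetical and the result ignores whatever the operating system itself needs.

```python
def ram_for_all_cores_gb(cores: int, per_task_gb: float) -> float:
    """RAM needed to keep every core busy with one Task of the given size."""
    return cores * per_task_gb


cores = 12
for per_task_gb in (1.0, 1.3, 4.0):
    needed = ram_for_all_cores_gb(cores, per_task_gb)
    print(f"{cores} Tasks x {per_task_gb} GB = {needed:.1f} GB")
```

At 1.3GB per Task, 12 cores already come to about 15.6GB before the OS takes its share, which is where the 16GB guideline comes from; a 12GB machine can only run all 12 cores when Tasks stay at about 1GB or less.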


I have a few workstations with similar parameters and they have worked perfectly fine for many years, so I don't think I need 16GB memory. Moreover, Rosetta should be a project which is run in the background. So, I should not equip my computer to meet Rosetta requirements, but Rosetta should try to use my resources.

mrhastyrib
Joined: 18 Feb 21
Posts: 90
Credit: 2,454,871
RAC: 258
Message 100913 - Posted: 31 Mar 2021, 1:19:44 UTC - in response to Message 100912.  

Moreover, Rosetta should be a project which is run in the background. So, I should not equip my computer to meet Rosetta requirements, but Rosetta should try to use my resources.


I hear you, but all programs specify the minimum hardware requirements. My complaint would be that those requirements for @Rosetta seem to change without notice.

On the other hand, it seems like they don't have enough work to consistently utilize the hardware available to them. If that's the case, then it makes no sense to spend time fine-tuning the program so that even more capacity is available.

Garry Heather
Joined: 23 Nov 20
Posts: 10
Credit: 297,086
RAC: 285
Message 100914 - Posted: 31 Mar 2021, 2:00:06 UTC

It is interesting to read how this is affecting other people. The Pi 4 rig I mentioned previously had acquired a cache of 2 days' worth of units but has since stopped downloading more due to the insufficient memory issue. I do rather hope that this is not going to become the new normal.

Bryn Mawr
Joined: 26 Dec 18
Posts: 308
Credit: 9,148,317
RAC: 1,408
Message 100915 - Posted: 31 Mar 2021, 3:00:34 UTC

Has anyone else noticed that since these problems with disk / memory space have been reported there have been a lot (maybe 50%) of 3 hour work units?

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1325
Credit: 13,624,379
RAC: 211
Message 100916 - Posted: 31 Mar 2021, 5:40:12 UTC - in response to Message 100912.  

I have a few workstations with similar parameters and they have worked perfectly fine for many years, so I don't think I need 16GB memory.
Why, do you think 640KB of RAM should be enough? 1MB? 4MB? And that was back in the days of single core systems. Now we have multiple core & thread systems, and each running application instance will require memory to support it.
Past memory limits were due to hardware & OS limitations. These days, most limitations are due to available finances, and to whether the work being done requires the extra RAM. It's your choice whether you equip your system with the resources necessary for it to be fully utilised.



Moreover, Rosetta should be a project which is run in the background. So, I should not equip my computer to meet Rosetta requirements, but Rosetta should try to use my resources.
Rosetta does use your resources, and it does run in the background. If you want it to use all of your CPU resources at the same time, then it needs to have enough memory to do so.
If you don't have enough RAM, it's not a problem: other Tasks will stop running until there is enough RAM for them to run. All taking place in the background.
Grant
Darwin NT

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1325
Credit: 13,624,379
RAC: 211
Message 100917 - Posted: 31 Mar 2021, 5:50:08 UTC - in response to Message 100915.  
Last modified: 31 Mar 2021, 5:52:12 UTC

Has anyone else noticed that since these problems with disk / memory space have been reported there have been a lot (maybe 50%) of 3 hour work units?
Yep, although all the ones i had running were using less than 300MB of RAM each.

For reference-
I've got 6c/12 thread systems with 32GB of RAM & a 1TB SSD, i never suspend processing.
Usage limits
    Use at most 100 % of the CPUs
    Use at most 100 % of CPU time

Disk
    Use no more than 20 GB
    Leave at least 2 GB free
    Use no more than 60 % of total

Memory
    When computer is in use, use at most 95 %
    When computer is not in use, use at most 95 %
    Leave non-GPU tasks in memory while suspended: N
    Page/swap file: use at most 75 %


I've had no issues with insufficient disk space or memory.
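
For anyone wondering how the three Disk settings above interact, here is a rough sketch, assuming BOINC applies whichever limit is most restrictive. The function, its parameter names, and the example free-space and usage figures are hypothetical; only the 20 GB / 2 GB / 60 % preferences and the 1TB disk come from this post.

```python
def disk_allowance_gb(total_gb, free_gb, boinc_used_gb,
                      max_used_gb, min_free_gb, max_percent):
    """Disk space BOINC may occupy: the smallest of the three preference limits."""
    limit_absolute = max_used_gb                          # "Use no more than X GB"
    limit_free = free_gb + boinc_used_gb - min_free_gb    # "Leave at least X GB free"
    limit_percent = total_gb * max_percent / 100.0        # "Use no more than X % of total"
    return max(0.0, min(limit_absolute, limit_free, limit_percent))


# 1 TB SSD with the preferences listed above; 800 GB free and 2 GB already
# used by BOINC are made-up figures for the example.
allowance = disk_allowance_gb(1000, 800, 2, max_used_gb=20, min_free_gb=2, max_percent=60)
print(allowance)
```

With these numbers the 20 GB cap is the binding limit, comfortably above the disk bound the new Tasks declare, which would be consistent with not seeing the "needs more disk space" message on this setup.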


EDIT- there was a batch of RB tasks that came out before those shorter running ones, and the RB Tasks often need 1GB+ each.
Grant
Darwin NT

Brian Nixon
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 765
Message 100922 - Posted: 31 Mar 2021, 9:20:31 UTC - in response to Message 100911.  
Last modified: 31 Mar 2021, 9:28:45 UTC

Say hello to two less hosts after they finish their current tasks, @Rosetta. I don't know if I have the time that's required to provide the space that is needed.
You’re not alone. Look at the recent results graphs – ‘tasks in progress’ has dropped by around 200,000 (a third)…

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1325
Credit: 13,624,379
RAC: 211
Message 100923 - Posted: 31 Mar 2021, 9:37:44 UTC - in response to Message 100922.  

Say hello to two less hosts after they finish their current tasks, @Rosetta. I don't know if I have the time that's required to provide the space that is needed.
You’re not alone. Look at the recent results graphs – ‘tasks in progress’ has dropped by around 200,000 (a third)…
In the past it has taken several days for "In progress" numbers to get back to their pre-work-shortage levels. And that's without running out of work again only a few hours after new work started coming through (which occurred this time).
If we don't run out of work again over the next few days, we should see how things actually are by early next week.


What is odd is that these messages are occurring now, with Tasks that don't require much RAM at all (less than 300MB) compared to many of the previous Tasks (around 800MB). Every one of my current Tasks is using less than 300MB.
Grant
Darwin NT

Brian Nixon
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 765
Message 100924 - Posted: 31 Mar 2021, 9:49:30 UTC - in response to Message 100917.  
Last modified: 31 Mar 2021, 9:58:42 UTC

I've had no issues with insufficient disk space or memory.
This points to a misconfiguration of the new batch of work units, as it seems unlikely it would be the project’s intention to cut off a third of its capacity…

Look in client_state.xml for the rsc_memory_bound and rsc_disk_bound settings of the new work units: they used to be 1,800,000,000 each; to yield the errors people are reporting they must now be set to 7,000,000,000 and 9,000,000,000.
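
A quick way to do that check is sketched below, assuming client_state.xml parses as ordinary XML and that each workunit element carries name, rsc_memory_bound and rsc_disk_bound children. The sample document and the work unit names in it are invented for illustration; in practice you would read the real file from your BOINC data directory instead.

```python
import xml.etree.ElementTree as ET

# Invented sample in the shape described above; not real work unit data.
SAMPLE = """<client_state>
  <workunit>
    <name>example_old_wu</name>
    <rsc_memory_bound>1800000000.000000</rsc_memory_bound>
    <rsc_disk_bound>1800000000.000000</rsc_disk_bound>
  </workunit>
  <workunit>
    <name>example_new_wu</name>
    <rsc_memory_bound>7000000000.000000</rsc_memory_bound>
    <rsc_disk_bound>9000000000.000000</rsc_disk_bound>
  </workunit>
</client_state>"""


def workunit_bounds(xml_text):
    """Return (name, memory_bound_bytes, disk_bound_bytes) for every workunit."""
    root = ET.fromstring(xml_text)
    bounds = []
    for wu in root.iter("workunit"):
        name = wu.findtext("name", default="?")
        mem = float(wu.findtext("rsc_memory_bound", default="0"))
        disk = float(wu.findtext("rsc_disk_bound", default="0"))
        bounds.append((name, mem, disk))
    return bounds


for name, mem, disk in workunit_bounds(SAMPLE):
    print(f"{name}: memory bound {mem / 1e9:.1f} GB, disk bound {disk / 1e9:.1f} GB")
```

To run it against a real host, replace SAMPLE with the contents of client_state.xml (for example via open(path).read()); the path depends on where your BOINC data directory lives.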

How big is your BOINC data directory now? Did the new batch need to download any unusually large files (such as a new protein database)? The issue I have is not so much disk space (though it will be a pain to have to repartition every machine) as download size, since I’m on a capped data plan.

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1325
Credit: 13,624,379
RAC: 211
Message 100925 - Posted: 31 Mar 2021, 10:15:45 UTC - in response to Message 100924.  
Last modified: 31 Mar 2021, 10:16:27 UTC

How big is your BOINC data directory now?
Unchanged.
1.77GB on one system, 2.32GB on the other- the largest i've ever seen it was around 2.7GB when i had a larger cache.

So it's very much looking like an error in the memory/disk space requirement values for the newly generated Tasks.
Still odd that with my number of cores/threads and available system RAM i haven't had issues.
Grant
Darwin NT

Brian Nixon
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 765
Message 100926 - Posted: 31 Mar 2021, 10:35:17 UTC - in response to Message 100925.  

Still odd that with my number of cores/threads and available system RAM i haven't had issues.
It must be the case that the server only considers a host’s total available RAM and disk space (not per core) in deciding whether a task is suitable.

So if a task tells the server it might need 6.6 GB of RAM, the server will never send it to any host with less (even if in practice it would not need anywhere near that much), but it will happily send you 24 of them because they can run (just maybe not all at the same time).
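
The distinction can be sketched like this, assuming (as guessed above) that the server compares a task's declared bound with the host's total RAM only, while the number that can actually run side by side depends on real usage. Both functions are hypothetical illustrations, not the scheduler's code; the 32 GB / 12 thread host and the 6.6 GB and 300 MB figures are the ones mentioned in this thread.

```python
def server_would_send(host_ram_gb: float, task_bound_gb: float) -> bool:
    """Eligibility as guessed above: one task's declared bound vs. total host RAM."""
    return host_ram_gb >= task_bound_gb


def max_concurrent(host_ram_gb: float, per_task_use_gb: float, threads: int) -> int:
    """How many tasks fit in RAM at once, capped by the number of threads."""
    return min(threads, int(host_ram_gb // per_task_use_gb))


print(server_would_send(32, 6.6))    # 32 GB host qualifies for a 6.6 GB task
print(server_would_send(4, 6.6))     # 4 GB board never does
print(max_concurrent(32, 6.6, 12))   # if each task really used its full bound
print(max_concurrent(32, 0.3, 12))   # at the ~300 MB actually observed
```

So a large host can be handed many such tasks even though only a few could run together if they ever used their full declared memory, while a small host is excluded regardless of what the tasks use in practice.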

There can’t be many machines with >6 GB RAM per core…

trevG
Joined: 5 Nov 13
Posts: 9
Credit: 680,580
RAC: 0
Message 100928 - Posted: 31 Mar 2021, 11:34:54 UTC - in response to Message 100903.  
Last modified: 31 Mar 2021, 11:43:48 UTC

Without knowing the actual system requirements (I really mean the post-dated info where it actually ran OK) of the work being 'unsent', it's not possible to see whether it's a bug or a true resource mismatch. This accords with the post above.
As units with larger memory demands are being sent out, that points to the low memory bar, which is fair enough.
All these projects are demand driven by default, but the lack of admin response on this forum leads to many duplicated queries.
Asteroids had a recent similar server problem- but the project put out an update message explaining glitches- saving many queries.
F@H has its faults- but the forum does solve many tech issues up front.

Brian Nixon
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 765
Message 100931 - Posted: 31 Mar 2021, 12:35:10 UTC - in response to Message 100928.  

the post-dated info where it actually ran OK
For some examples, look at Grant’s recent valid tasks. There doesn’t seem to be anything unusual about their memory and disk usage.

Richard Sun
Joined: 19 Feb 21
Posts: 1
Credit: 30,980,116
RAC: 29,859
Message 100932 - Posted: 31 Mar 2021, 15:04:53 UTC - in response to Message 100931.  

My Raspberry Pis (3B+, 4B 4GB, and 4B 8GB) all have lots of work today. If you don't have work, I suggest you go to the BOINC Manager, click on Projects, and click Update for Rosetta@home. I had definitely seen the same things others mentioned over the past few days (not getting new work because of insufficient memory, etc.), but today it's back to "normal".

Brian Nixon
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 765
Message 100933 - Posted: 31 Mar 2021, 15:34:11 UTC - in response to Message 100932.  

It’s largely luck of the draw. If you happen to contact the server when it has some ‘small’ tasks ready to send, you will get them. But if it only has ‘big’ ones ready to go at that moment, and your machine is outside the limits, you won’t. And in that case the client will back off (for longer and longer durations, up to 1½ days) before asking again.
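
The "longer and longer" back-off can be pictured with a simple capped doubling scheme. This is a sketch only: the 60-second starting delay and the doubling factor are assumptions made for illustration, and the real client's schedule may differ (for instance by adding randomness). Only the 1½-day ceiling comes from the paragraph above.

```python
MAX_BACKOFF_S = 1.5 * 24 * 3600  # the 1.5-day ceiling mentioned above, in seconds


def backoff_seconds(failed_requests: int, base_s: float = 60.0) -> float:
    """Delay before the next work request after N consecutive empty replies.

    base_s and the doubling rule are hypothetical, not BOINC's actual values.
    """
    exponent = min(failed_requests, 32)  # keep the power small; the cap applies long before
    return min(MAX_BACKOFF_S, base_s * 2 ** exponent)


for n in (0, 1, 5, 11, 12):
    print(n, backoff_seconds(n) / 3600, "hours")
```

Under these assumed numbers a host that misses a dozen times in a row is already waiting the full day and a half, which is why a manual Update in BOINC Manager can make work reappear sooner.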

Also there doesn’t appear to be any mechanism stopping ‘small’ tasks going to ‘big’ machines – so the more that happens the less likely it becomes that there is work available for others…

©2022 University of Washington
https://www.bakerlab.org