Discussion- Work Unit Distribution

tralala

Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 15591 - Posted: 5 May 2006, 20:42:23 UTC - in response to Message 15589.  
Last modified: 5 May 2006, 20:45:11 UTC

It still does not preemptively match the WU to the host's capabilities at the scheduling level. The large memory WU would go out to everybody and would error out on all the hosts with insufficient RAM. That is not an acceptable solution.


What's going on? How come you all understand the behaviour like that? I read:

"The workunit will only be sent to hosts with at least this much available RAM."

English is not my native language, but I understand it to mean that the WU will _not_ be sent out if the host does not have at least this much available RAM.

The second part talks about the possibility that the WU demands more memory on the host than the upper bound given by the scientists. If that is the case, it gets aborted. To put it in plain English:

If you define a big WU with minimum requirement = 512 MB, it will be sent out only to hosts which have 512 MB or more RAM. If for whatever reason that WU tries to use more than 512 MB RAM (which it really shouldn't), it gets aborted.
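
The two-stage behaviour described here can be sketched roughly as follows. This is only an illustration of the reading above, not BOINC's actual code: the `Host` and `WorkUnit` structs, the `available_ram_bytes` field, and both function names are hypothetical; only the name `rsc_memory_bound` comes from the BOINC source discussed in this thread.

```cpp
// Hypothetical stand-ins for illustration; not the real BOINC types.
struct Host {
    double available_ram_bytes;  // RAM the host reports as available
};

struct WorkUnit {
    double rsc_memory_bound;     // upper bound on memory use, set by the project
};

constexpr double MB = 1024.0 * 1024.0;

// Stage 1 (scheduler side): only send the WU to hosts with enough RAM.
bool should_send(const WorkUnit& wu, const Host& host) {
    return host.available_ram_bytes >= wu.rsc_memory_bound;
}

// Stage 2 (client side): abort a running WU that exceeds its declared bound.
bool should_abort(const WorkUnit& wu, double current_usage_bytes) {
    return current_usage_bytes > wu.rsc_memory_bound;
}
```

With a 512 MB bound, a host reporting 256 MB is never sent the WU, and a host that did receive it aborts the task only if its usage climbs past 512 MB.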

I found some other interesting work unit distribution info:

http://boinc-wiki.ath.cx/index.php?title=Work_Distribution
http://boinc.berkeley.edu/work_distribution.php
http://boinc.berkeley.edu/configuration.php

If what's written there isn't all untrue, you have some quite sophisticated possibilities to distribute work according to the hosts' characteristics.
hugothehermit

Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 15595 - Posted: 6 May 2006, 0:54:16 UTC
Last modified: 6 May 2006, 1:00:38 UTC

tralala seems right to me

If I'm reading sched_send_c correctly, then
[b]
wu.rsc_memory_bound  
[/b]

is what you need to set for each of the different types of WUs.

You could clean up the code a little so that
[b]
...
        reply.wreq.insufficient_mem = true;
        reason |= INFEASIBLE_MEM;
        reply.set_delay(24*3600);[/b]

...

I'm assuming that the reply.set_delay(24*3600) is a client back-off setting and you could get rid of the error messages etc., but that does mean messing with the BOINC code, so it's probably best to leave it alone, as updating would become messy.
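
To make the quoted fragment easier to follow: `reason |= INFEASIBLE_MEM` sets one bit in a bitmask, so several reasons can be recorded at once, and `24*3600` is 86,400 seconds, i.e. one day. The sketch below imitates that pattern under stated assumptions. The flag values, the `Reply` struct, and `check_memory` are hypothetical stand-ins, not the real BOINC definitions, and `set_delay` here merely stores a number of seconds.

```cpp
// Hypothetical flag values; the real BOINC constants may differ.
constexpr int INFEASIBLE_MEM  = 1 << 0;
constexpr int INFEASIBLE_DISK = 1 << 1;

// Hypothetical stand-in for the scheduler reply.
struct Reply {
    bool insufficient_mem = false;
    double request_delay = 0;    // seconds the client is asked to wait

    void set_delay(double seconds) {
        // Keep the longest delay requested so far.
        if (seconds > request_delay) request_delay = seconds;
    }
};

// Returns a bitmask of reasons the WU cannot run on this host (0 = feasible).
int check_memory(double rsc_memory_bound, double host_ram_bytes, Reply& reply) {
    int reason = 0;
    if (rsc_memory_bound > host_ram_bytes) {
        reply.insufficient_mem = true;
        reason |= INFEASIBLE_MEM;    // record the reason as one bit
        reply.set_delay(24 * 3600);  // 86,400 s; assumed here to be a client back-off
    }
    return reason;
}
```

A caller can then test `reason & INFEASIBLE_MEM` to see whether memory was the problem, independently of any other bits that might be set.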

The problem of multiple CPUs sharing memory is then easily fixed by the participant changing their preferences.

Edit: to add a little


