Changed "Target CPU run time": How does it take effect?

Message boards : Number crunching : Changed "Target CPU run time": How does it take effect?

xii5ku

Joined: 29 Nov 16
Posts: 22
Credit: 13,647,106
RAC: 1,965
Message 89165 - Posted: 28 Jun 2018, 13:37:18 UTC

I was just reading the FAQ and searching the forum, asking myself,

    when I change the "Target CPU run time" setting in the web prefs, when will it take effect?


(That is, take effect on the actual run time of tasks. I understood that it does not affect boinc-client's estimated time remaining directly.)

I arrived at a partial answer from what I saw in multiple posts:


  • After the setting was changed at the web interface, the user should trigger a project update in the client, and then the setting will be picked up.
  • It will not merely affect tasks downloaded after the change was made, but also tasks that were already downloaded before the change.


Furthermore, the wording of one or another post makes me think that not just a user-requested project update, but also any other scheduler request from the client will pick up the changed setting (e.g., when the client reports a completed task).

But what I could not find right away was a more detailed answer, particularly to:


    Does it affect only tasks in "Ready to start" state, or also already "Running" tasks, tasks suspended to RAM, tasks suspended to disk, tasks waiting for memory, ...?

    And for tasks in the various mentioned states, can it only lengthen them, or only shorten them, or can it both?


On these points, I found a post by River~~ on December 1, 2006. From this, I gather:


  • The changed setting will affect all tasks on the client that are not yet finished.
  • It is possible to lengthen the tasks this way. To an extent, it is also possible to shorten tasks:
  • Tasks which are suspended to disk or waiting for memory will be finished right away if they were already running longer than the new target time.
  • Tasks which are presently running or suspended to RAM will continue to run until their next checkpoint if they were already running longer than the new target time.
  • Generally, tasks may exceed the target time by a varying amount in order to reach their next checkpoint.


Is this correct and current?

rjs5

Joined: 22 Nov 10
Posts: 273
Credit: 21,465,703
RAC: 16,826
Message 89166 - Posted: 28 Jun 2018, 14:22:42 UTC - in response to Message 89165.  

Rosetta work units (WUs) are typically longer than the target time, so Rosetta chops the job up into pieces that are passed out to computers. The command line that starts the Rosetta WU on your computer has the "-cpu_run_time" value set as part of the WU. In a completed WU on my machine, the command line looks like the one below. The "-cpu_run_time 28800" option says to run the job for 28,800 seconds and then, once the current decoy is finished, terminate the job.

Rosetta WUs that are sent out will reflect the -cpu_run_time in effect when the WU is sent. All WUs already sent will use the value that was in effect when they were sent.


command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.07_x86_64-pc-linux-gnu -relax::minimize_bond_angles 1 -ignore_unrecognized_res 1 -abinitio::rsd_wt_helix 0.5 -abinitio::rg_reweight 0.5 -abinitio::rsd_wt_loop 0.5 -in:file:native 00001.pdb -abinitio::fastrelax 1 -abinitio::use_filters false -out:file:silent_struct_type binary -relax::dualspace 1 -ex2aro 1 -relax::minimize_bond_lengths 1 -beta_cart 1 -relax::default_repeats 2 -abinitio::detect_disulfide_before_relax 1 -fr
xii5ku

Joined: 29 Nov 16
Posts: 22
Credit: 13,647,106
RAC: 1,965
Message 89167 - Posted: 28 Jun 2018, 14:58:41 UTC - in response to Message 89166.  

I currently have 12 hours set, and the tasks are running for 12 hours indeed. Yet the command line says "-cpu_run_time 28800" (8 h), both in client_state.xml and in /proc/${PID}/cmdline. Example:
    <command_line>
-relax::minimize_bond_lengths 1 -frag9 00001.200.9mers -abinitio::use_filters false -ignore_unrecognized_res 1 -ex1 1 -out:file:silent_struct_type binary -ex2aro 1 -in:file:native 00001.pdb -abinitio::detect_disulfide_before_relax 1 -abinitio::rsd_wt_loop 0.5 -frag3 00001.200.3mers -beta 1 -optimization::default_max_cycles 200 -abinitio::fastrelax 1 -abinitio::rsd_wt_helix 0.5 -relax::default_repeats 2 -abinitio::increase_cycles 10 -beta_cart 1 -abinitio::rg_reweight 0.5 -relax::minimize_bond_angles 1 -relax::dualspace 1 -in:file:boinc_wu_zip DRH_curve_X_h30_l2_h20_l2_02149_2_2_loop_1_0001_one_capped_0001_fragments_data.zip  -out:file:silent default.out -silent_gz 1 -mute all -nstruct 10000  -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 2643916
    </command_line>
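
To compare the command-line value with the web preference, the `-cpu_run_time` argument can be pulled out of a command line like the one above. This is only a minimal sketch in Python; the helper name `cpu_run_time_from_cmdline` is made up for illustration, and only the `-cpu_run_time` flag itself comes from the task's command line.

```python
import shlex


def cpu_run_time_from_cmdline(cmdline):
    """Return the value after -cpu_run_time (in seconds), or None if the flag is absent."""
    tokens = shlex.split(cmdline)
    # Stop one token early so tokens[i + 1] always exists.
    for i, tok in enumerate(tokens[:-1]):
        if tok == "-cpu_run_time":
            return int(tokens[i + 1])
    return None


# Excerpt of the command line shown above.
example = "-nstruct 10000  -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600"
seconds = cpu_run_time_from_cmdline(example)
print(seconds, "s =", seconds / 3600, "h")
```

For the excerpt above this reports 28800 s, i.e. 8 h, which is the mismatch with the 12 h preference described in this post.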
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 89170 - Posted: 28 Jun 2018, 16:45:26 UTC

The BOINC Manager has to complete a few tasks at the new target runtime to update the duration correction factor; after that, its estimated runtimes will start to reflect your new runtime preference.

Just to add to the comments from rjs5: I believe that when each decoy (model) is completed, the application looks back at the average time your machine has taken to complete the decoys of this task so far and uses it to estimate the completion time of the next decoy. If that estimate exceeds your runtime preference, the task is ended. This helps the runtime adjust to the specific task and to the host running it. So, on average, the actual runtime should be slightly less than the target.
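
The stopping rule described above could be sketched roughly as follows. This is an illustration under assumptions, not the actual Rosetta code: the function name, its arguments, and the exact comparison (elapsed time plus the average decoy time against the target) are guesses at what the description implies.

```python
def should_stop_after_decoy(decoy_times, elapsed, target):
    """Decide whether to end the task after the decoy that just finished.

    decoy_times: CPU seconds spent on each decoy completed so far
    elapsed:     total CPU seconds the task has used so far
    target:      the runtime preference in seconds
    """
    average = sum(decoy_times) / len(decoy_times)
    # Assumed rule: stop if one more average-length decoy would overshoot the target.
    return elapsed + average > target


# Hypothetical task: every decoy takes 4000 s, target is 12 h (43200 s).
target = 43200
after_nine = should_stop_after_decoy([4000] * 9, 36000, target)   # 40000 <= 43200
after_ten = should_stop_after_decoy([4000] * 10, 40000, target)   # 44000 > 43200
print(after_nine, after_ten)
```

In this made-up case the task continues after the ninth decoy and ends after the tenth, at 40000 s, a little under the 43200 s target.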
Rosetta Moderator: Mod.Sense
xii5ku

Joined: 29 Nov 16
Posts: 22
Credit: 13,647,106
RAC: 1,965
Message 89172 - Posted: 28 Jun 2018, 18:22:39 UTC
Last modified: 28 Jun 2018, 18:44:08 UTC

Ah. I looked at 8 random tasks of mine: 7 finished slightly before the target run time, and only 1 task took a mere 20 seconds longer than the target time.

Furthermore, as I am on 12 h currently, I grep'ed the boinc data directory for 43200 and found:

  • in sched_reply_boinc.bakerlab.org_rosetta.xml: <cpu_run_time>43200</cpu_run_time>
  • in account_boinc.bakerlab.org_rosetta.xml: <cpu_run_time>43200</cpu_run_time>
  • in slots/${SLOTNO}/init_data.xml: <cpu_run_time>43200</cpu_run_time>, among other account properties
  • in stderr_out of finished tasks: # cpu_run_time_pref: 43200, seemingly as the last of the task's startup messages
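
The same check can be repeated after each preference change with a small script. The following is a sketch only: the file names are the ones listed above, while the function name `find_cpu_run_time` and the idea of passing the BOINC data directory as an argument are assumptions for illustration.

```python
import re
from pathlib import Path

TAG = re.compile(r"<cpu_run_time>(\d+)</cpu_run_time>")


def find_cpu_run_time(data_dir):
    """Map each known file under data_dir to the cpu_run_time values found in it."""
    root = Path(data_dir)
    candidates = [
        root / "sched_reply_boinc.bakerlab.org_rosetta.xml",
        root / "account_boinc.bakerlab.org_rosetta.xml",
        *sorted(root.glob("slots/*/init_data.xml")),
    ]
    found = {}
    for path in candidates:
        if path.is_file():
            values = TAG.findall(path.read_text(errors="replace"))
            if values:
                # Key by the path relative to the data directory.
                found[str(path.relative_to(root))] = [int(v) for v in values]
    return found
```

Calling `find_cpu_run_time(...)` with the path of your own BOINC data directory before and after a project update shows which files picked up the new value.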


So I suppose the mechanism is that the project server's response to a client's scheduler request carries the cpu_run_time element into the rosetta account properties, and the rosetta/minirosetta application picks it up from there. Hence, my original questions could be reformulated: Does (mini)rosetta read the cpu_run_time preference just once at startup, or does it reread it after completing each decoy? (I could find out myself empirically, but asking is so much more convenient. ;-)

--------
Edit:
OK. I switched to 10 h, requested the client to "update" the project, and now 36000 instead of 43200 appears in sched_reply_boinc.bakerlab.org_rosetta.xml, account_boinc.bakerlab.org_rosetta.xml, and slots/*/init_data.xml. If the software went through the trouble of updating even the slot files, I suppose the running job is going to look at the changed init_data.xml eventually. Either periodically, or at least if it is suspended to disk and then resumed.

Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 89173 - Posted: 28 Jun 2018, 19:38:07 UTC

Oh, yes, that's correct. I forgot to point out that rjs5 was mistaken on that point. When you change the runtime preference, and you do the BOINC Manager update to the server, it changes all of the uncompleted tasks on your machine. For this reason I generally suggest changes be made gradually over time rather than one big jump. If you've got a day of work downloaded and change the runtime preference from 3 hours to 12 hours, now it will take 4 days to complete the work already downloaded. This can lead to missed deadlines. And BOINC Manager still doesn't understand how long these tasks take to run and this can lead to it requesting more work than will actually be completed in the time before the deadline.
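
The arithmetic behind the 3-hour to 12-hour example, as a rough sketch: it assumes every downloaded task simply runs for the full target time, and the function name is made up for illustration.

```python
def days_to_clear_queue(days_downloaded, old_target_hours, new_target_hours):
    """Scale the already downloaded work by the ratio of new to old target runtime."""
    return days_downloaded * new_target_hours / old_target_hours


# One day of work downloaded at a 3 h target, preference then raised to 12 h.
print(days_to_clear_queue(1, 3, 12))  # 4.0
```

Going the other way, lowering the target from 12 hours to 3 hours would shrink the same queue to a quarter of a day under the same assumption.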
Rosetta Moderator: Mod.Sense
xii5ku

Joined: 29 Nov 16
Posts: 22
Credit: 13,647,106
RAC: 1,965
Message 89203 - Posted: 2 Jul 2018, 18:00:22 UTC

Maybe I should have titled this thread
    Changed "Target CPU run time": How does it take effect?

to make it clearer what my questions were about. Meanwhile I learned from experiments:

After a scheduler request, the target run time of the following tasks is modified:

    + tasks which hadn't been started yet (and are started after the scheduler request)
    + tasks which were suspended to disk (and are resumed after the scheduler request)

A note on suspend to disk: Tasks have a certain likelihood of failing with an error when resumed.

The target run time of the following tasks is not modified:

    - tasks which are running during the scheduler request

I did not systematically test tasks that were suspended to RAM during the scheduler request. It seemed that they are not modified either.





©2024 University of Washington
https://www.bakerlab.org