All tasks in scheduler state uninitialized

SGAI-CSIC
Joined: 4 Apr 20
Posts: 19
Credit: 15,069,615
RAC: 0
Message 94223 - Posted: 12 Apr 2020, 10:06:02 UTC

Hi, I have observed that all tasks are in scheduler state: uninitialized. Is that normal? What does it mean?

1) -----------
name: hgfp_dimer_5x_254_fold_SAVE_ALL_OUT_906972_605_0
WU name: hgfp_dimer_5x_254_fold_SAVE_ALL_OUT_906972_605
project URL: https://boinc.bakerlab.org/rosetta/
received: Fri Apr 10 17:07:41 2020
report deadline: Mon Apr 13 17:07:41 2020
ready to report: no
state: downloaded
scheduler state: uninitialized
active_task_state: UNINITIALIZED
app version num: 378
resources: 1 CPU
estimated CPU time remaining: 22168.405735
2) -----------
name: 7v1nm_gg_c274_7mer_gb_00510_SAVE_ALL_OUT_907410_372_0
WU name: 7v1nm_gg_c274_7mer_gb_00510_SAVE_ALL_OUT_907410_372
project URL: https://boinc.bakerlab.org/rosetta/
received: Fri Apr 10 17:07:41 2020
report deadline: Mon Apr 13 17:07:41 2020
ready to report: no
state: downloaded
scheduler state: uninitialized
active_task_state: UNINITIALIZED
app version num: 412
resources: 1 CPU
estimated CPU time remaining: 81737.392995
3) -----------
name: hgfp_dimer_3x_317_fold_SAVE_ALL_OUT_906914_611_0
WU name: hgfp_dimer_3x_317_fold_SAVE_ALL_OUT_906914_611
project URL: https://boinc.bakerlab.org/rosetta/
received: Fri Apr 10 17:23:20 2020
report deadline: Mon Apr 13 17:23:20 2020
ready to report: no
state: downloaded
scheduler state: uninitialized
active_task_state: UNINITIALIZED
app version num: 378
resources: 1 CPU

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 94241 - Posted: 12 Apr 2020, 13:57:37 UTC

I believe it just means that they have not yet started to run. Are there other tasks (perhaps from other BOINC projects) that are running now? Have you set the preferences to limit the hours when BOINC can run? Which host are you talking about?
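To get a quick overview on each host, the task list can be condensed into a count per state. This is only a minimal sketch, assuming a POSIX shell with awk. The function name count_task_states is made up for this example, and it relies only on the active_task_state lines shown in the output above:

```shell
# Summarise task states from `boinccmd --get_tasks` output read on stdin.
# Prints one line per distinct active_task_state, followed by its count.
count_task_states() {
    awk '
        # Default field splitting ignores leading and trailing whitespace.
        $1 == "active_task_state:" { count[$2]++ }
        END { for (s in count) printf "%s %d\n", s, count[s] }
    ' | sort
}
```

With a running client it would be fed as boinccmd --get_tasks | count_task_states. If every task shows UNINITIALIZED, boinccmd --get_cc_status may also be worth a look, since it should report the run modes and whether computing is currently suspended.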
Rosetta Moderator: Mod.Sense

SGAI-CSIC
Joined: 4 Apr 20
Posts: 19
Credit: 15,069,615
RAC: 0
Message 94244 - Posted: 12 Apr 2020, 14:11:17 UTC - in response to Message 94241.  
Last modified: 12 Apr 2020, 14:25:58 UTC

In the past few days the tasks were running normally, and the hosts don't run any other BOINC projects. I have 5 (cloned) systems with CentOS, and on all of them the tasks are in the uninitialized state.

ID: 4061992
ID: 4062010
ID: 4061905
ID: 4062012
ID: 4061963

I have not limited when BOINC can run. The output of boinccmd --get_messages seems normal:

294: 12-Apr-2020 12:48:47 (user notification) [Rosetta@home] This project is using an old URL. When convenient, remove the project, then add https://boinc.bakerlab.org/rosetta/
295: 12-Apr-2020 12:48:47 (low) [Rosetta@home] General prefs: from Rosetta@home (last modified 12-Apr-2020 11:51:30)
296: 12-Apr-2020 12:48:47 (low) [Rosetta@home] Host location: none
297: 12-Apr-2020 12:48:47 (low) [Rosetta@home] General prefs: using your defaults
298: 12-Apr-2020 12:48:47 (low) [] Preferences:
299: 12-Apr-2020 12:48:47 (low) [] max memory usage when active: 1894.50 MB
300: 12-Apr-2020 12:48:47 (low) [] max memory usage when idle: 3410.10 MB
301: 12-Apr-2020 12:48:47 (low) [] max disk usage: 7.56 GB
302: 12-Apr-2020 12:48:47 (low) [] don't use GPU while active
303: 12-Apr-2020 12:48:47 (low) [] suspend work if non-BOINC CPU load exceeds 75%
304: 12-Apr-2020 12:48:47 (low) [] (to change preferences, visit a project web site or select Preferences in the Manager)
305: 12-Apr-2020 12:48:49 (low) [Rosetta@home] Started download of hgfp_het2_215_data.zip
306: 12-Apr-2020 12:48:49 (low) [Rosetta@home] Started download of hgfp_good_frag_184_data.zip
307: 12-Apr-2020 12:49:00 (low) [Rosetta@home] Finished download of hgfp_het2_215_data.zip
308: 12-Apr-2020 12:49:03 (low) [Rosetta@home] Finished download of hgfp_good_frag_184_data.zip

In the Rosetta@home preferences I have set Target CPU run time to 1 day. Is that OK?

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 94270 - Posted: 12 Apr 2020, 20:03:34 UTC - in response to Message 94244.  

Your runtime preference will not affect when tasks run. BOINC Manager decides that, and is unaware of the runtime preferences.

Looks like your systems have 4 cores and 4 GB of memory. What happens if you set the CPU preference to use at most 50% of the CPUs?
Rosetta Moderator: Mod.Sense

SGAI-CSIC
Joined: 4 Apr 20
Posts: 19
Credit: 15,069,615
RAC: 0
Message 94329 - Posted: 13 Apr 2020, 9:49:38 UTC - in response to Message 94270.  

I have set "Use at most 50% of the CPUs" and reset the project on one host:

boinccmd --project "https://boinc.bakerlab.org/rosetta/" reset

but unfortunately the tasks remain uninitialized:

boinccmd --get_tasks

======== Tasks ========
1) -----------
name: hgfp_good_frag_52_fold_SAVE_ALL_OUT_909178_181_0
WU name: hgfp_good_frag_52_fold_SAVE_ALL_OUT_909178_181
project URL: https://boinc.bakerlab.org/rosetta/
received: Mon Apr 13 11:28:38 2020
report deadline: Thu Apr 16 11:28:37 2020
ready to report: no
state: downloaded
scheduler state: uninitialized
active_task_state: UNINITIALIZED
app version num: 378
resources: 1 CPU
estimated CPU time remaining: 27485.364576
2) -----------
[...]

The logs don't show anything strange:

# boinccmd --get_messages

13-Apr-2020 11:28:25 (low) [Rosetta@home] Resetting project
31: 13-Apr-2020 11:28:30 (low) [Rosetta@home] Master file download succeeded
32: 13-Apr-2020 11:28:35 (low) [Rosetta@home] Sending scheduler request: To fetch work.
33: 13-Apr-2020 11:28:35 (low) [Rosetta@home] Requesting new tasks for CPU
34: 13-Apr-2020 11:28:38 (low) [Rosetta@home] Scheduler request completed: got 4 new tasks
35: 13-Apr-2020 11:28:38 (user notification) [Rosetta@home] This project is using an old URL. When convenient, remove the project, then add https://boinc.bakerlab.org/rosetta/
36: 13-Apr-2020 11:28:38 (low) [Rosetta@home] General prefs: from Rosetta@home (last modified 13-Apr-2020 11:13:01)
37: 13-Apr-2020 11:28:38 (low) [Rosetta@home] Host location: none
38: 13-Apr-2020 11:28:38 (low) [Rosetta@home] General prefs: using your defaults
39: 13-Apr-2020 11:28:38 (low) [] Preferences:
40: 13-Apr-2020 11:28:38 (low) [] max memory usage when active: 1894.50 MB
41: 13-Apr-2020 11:28:38 (low) [] max memory usage when idle: 3410.10 MB
42: 13-Apr-2020 11:28:38 (low) [] max disk usage: 7.56 GB
43: 13-Apr-2020 11:28:38 (low) [] Number of usable CPUs has changed from 4 to 2.
44: 13-Apr-2020 11:28:38 (low) [] max CPUs used: 2
45: 13-Apr-2020 11:28:38 (low) [] don't use GPU while active
46: 13-Apr-2020 11:28:38 (low) [] suspend work if non-BOINC CPU load exceeds 75%
47: 13-Apr-2020 11:28:38 (low) [] (to change preferences, visit a project web site or select Preferences in the Manager)
48: 13-Apr-2020 11:28:40 (low) [Rosetta@home] Started download of rosetta_4.15_x86_64-pc-linux-gnu
49: 13-Apr-2020 11:28:40 (low) [Rosetta@home] Started download of rosetta_graphics_4.15_x86_64-pc-linux-gnu
50: 13-Apr-2020 11:30:44 (low) [Rosetta@home] Finished download of rosetta_graphics_4.15_x86_64-pc-linux-gnu
51: 13-Apr-2020 11:30:44 (low) [Rosetta@home] Started download of database_357d5d93529_n_methyl.zip
52: 13-Apr-2020 11:31:01 (low) [Rosetta@home] Finished download of rosetta_4.15_x86_64-pc-linux-gnu

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 94341 - Posted: 13 Apr 2020, 13:20:36 UTC

The log says that for R@h you are "...using an old URL". Yet the project URL has not changed. What URL did you attach to?
Rosetta Moderator: Mod.Sense

SGAI-CSIC
Joined: 4 Apr 20
Posts: 19
Credit: 15,069,615
RAC: 0
Message 94344 - Posted: 13 Apr 2020, 13:49:10 UTC - in response to Message 94341.  

I'm using this URL in order to attach to Rosetta:

boinccmd --project_attach "https://boinc.bakerlab.org/rosetta/"

What's the correct URL?

SGAI-CSIC
Joined: 4 Apr 20
Posts: 19
Credit: 15,069,615
RAC: 0
Message 94345 - Posted: 13 Apr 2020, 14:02:41 UTC - in response to Message 94344.  

OK, the URL is with http, not https. I have detached:

boinccmd --project https://boinc.bakerlab.org/rosetta/ detach

And attached again using http ...

# boinccmd --project_attach "https://boinc.bakerlab.org/rosetta/" "..."

I tried to resume:

# boinccmd --project https://boinc.bakerlab.org/rosetta/ resume

But again all tasks are uninitialized:

1) -----------
name: 5e2680b18d3ff769ed2d8d58de5013ef_start_1900_20_04_19_27_51_globalDocking_2_SAVE_ALL_OUT_913185_24_0
WU name: 5e2680b18d3ff769ed2d8d58de5013ef_start_1900_20_04_19_27_51_globalDocking_2_SAVE_ALL_OUT_913185_24
project URL: https://boinc.bakerlab.org/rosetta/
received: Mon Apr 13 15:56:02 2020
report deadline: Thu Apr 16 15:56:01 2020
ready to report: no
state: downloading
scheduler state: uninitialized
active_task_state: UNINITIALIZED
app version num: 415
resources: 1 CPU
estimated CPU time remaining: 20759.735125
2) -----------

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 94387 - Posted: 13 Apr 2020, 22:01:52 UTC - in response to Message 94345.  

On the website, in your profile, there is a link for Rosetta@home preferences. In there, you can define up to 4 venues. If you click on your hosts, you can see the "venue" shown in the "location" column. When you look at the Rosetta preferences for the venue of the host, is the box checked for "Use CPU"? (it should show a checkmark). In fact, for R@h, this box should be checked for all venues you have defined.

I see your hosts all have credit. When did they stop working? Do you have a cc_config or app_config file set up? If so, please show what is in them.
Rosetta Moderator: Mod.Sense

SGAI-CSIC
Joined: 4 Apr 20
Posts: 19
Credit: 15,069,615
RAC: 0
Message 94426 - Posted: 14 Apr 2020, 9:12:21 UTC - in response to Message 94387.  
Last modified: 14 Apr 2020, 9:12:52 UTC

Hi, thanks for the answer. I have defined the work venue and assigned it to all my hosts.

The tasks ran normally for approximately 3 days. The cc_config and app_config files are not present.

SGAI-CSIC
Joined: 4 Apr 20
Posts: 19
Credit: 15,069,615
RAC: 0
Message 94428 - Posted: 14 Apr 2020, 9:45:22 UTC - in response to Message 94387.  

I have noticed a change: on one of the hosts I see running tasks:

======== Tasks ========
1) -----------
name: Mini_Protein_binds_IL1R_COVID-19_design5_SAVE_ALL_OUT_IGNORE_THE_REST_4qo3dr9i_909067_3_0
WU name: Mini_Protein_binds_IL1R_COVID-19_design5_SAVE_ALL_OUT_IGNORE_THE_REST_4qo3dr9i_909067_3
project URL: https://boinc.bakerlab.org/rosetta/
received: Mon Apr 13 15:56:13 2020
report deadline: Thu Apr 16 15:56:12 2020
ready to report: no
state: downloaded
scheduler state: scheduled
active_task_state: EXECUTING
app version num: 415
resources: 1 CPU
estimated CPU time remaining: 33887.100296
CPU time at last checkpoint: 52676.290000
current CPU time: 52753.410000
fraction done: 0.609875
swap size: 898 MB
working set size: 723 MB
2) -----------
name: Mini_Protein_binds_IL1R_COVID-19_design4_SAVE_ALL_OUT_IGNORE_THE_REST_3gp4jq7y_908587_3_0
WU name: Mini_Protein_binds_IL1R_COVID-19_design4_SAVE_ALL_OUT_IGNORE_THE_REST_3gp4jq7y_908587_3
project URL: https://boinc.bakerlab.org/rosetta/
received: Mon Apr 13 15:56:13 2020
report deadline: Thu Apr 16 15:56:12 2020
ready to report: no
state: downloaded
scheduler state: scheduled
active_task_state: EXECUTING
app version num: 415
resources: 1 CPU
estimated CPU time remaining: 46711.422053
CPU time at last checkpoint: 39847.050000
current CPU time: 39937.950000
fraction done: 0.462027
swap size: 845 MB

SGAI-CSIC
Joined: 4 Apr 20
Posts: 19
Credit: 15,069,615
RAC: 0
Message 94430 - Posted: 14 Apr 2020, 9:59:56 UTC - in response to Message 94428.  

But only on one host; on the others the tasks remain in the uninitialized state :-(

Bryn Mawr
Joined: 26 Dec 18
Posts: 373
Credit: 10,598,568
RAC: 8,061
Message 94447 - Posted: 14 Apr 2020, 14:24:26 UTC - in response to Message 94430.  

But only on one host; on the others the tasks remain in the uninitialized state :-(

So, is that host, 4062010, either the one you changed the CPU limit to 50% on or the one you changed the URL on?

SGAI-CSIC
Joined: 4 Apr 20
Posts: 19
Credit: 15,069,615
RAC: 0
Message 94457 - Posted: 14 Apr 2020, 16:14:07 UTC - in response to Message 94447.  
Last modified: 14 Apr 2020, 16:17:55 UTC

It's the same... I have now changed the URL on all hosts, but the situation is unchanged. I don't understand what is happening...

I have forced an update on one host:

52: 14-Apr-2020 18:14:38 (low) [Rosetta@home] update requested by user
53: 14-Apr-2020 18:14:41 (low) [Rosetta@home] Sending scheduler request: Requested by user.
54: 14-Apr-2020 18:14:41 (low) [Rosetta@home] Requesting new tasks for CPU
55: 14-Apr-2020 18:14:42 (low) [Rosetta@home] Scheduler request completed: got 4 new tasks
56: 14-Apr-2020 18:14:44 (low) [Rosetta@home] Started download of local_docking_20_04_15_28_09.xml
57: 14-Apr-2020 18:14:44 (low) [Rosetta@home] Started download of chainA_chainB_20_04_15_28_09.pdb
58: 14-Apr-2020 18:14:47 (low) [Rosetta@home] Finished download of local_docking_20_04_15_28_09.xml
59: 14-Apr-2020 18:14:47 (low) [Rosetta@home] Finished download of chainA_chainB_20_04_15_28_09.pdb
60: 14-Apr-2020 18:14:47 (low) [Rosetta@home] Started download of Mini_Protein_binds_IL1R_COVID-19_design7_SAVE_ALL_OUT_IGNORE_THE_REST_0ys6zi4m.zip
61: 14-Apr-2020 18:14:47 (low) [Rosetta@home] Started download of Mini_Protein_binds_IL1R_COVID-19_design7_SAVE_ALL_OUT_IGNORE_THE_REST_0ys6zi4m.flags
62: 14-Apr-2020 18:14:49 (low) [Rosetta@home] Finished download of Mini_Protein_binds_IL1R_COVID-19_design7_SAVE_ALL_OUT_IGNORE_THE_REST_0ys6zi4m.flags
63: 14-Apr-2020 18:14:49 (low) [Rosetta@home] Started download of Mini_Protein_binds_IL1R_COVID-19_design7_SAVE_ALL_OUT_IGNORE_THE_REST_0uc4pt5i.zip
64: 14-Apr-2020 18:14:53 (low) [Rosetta@home] Finished download of Mini_Protein_binds_IL1R_COVID-19_design7_SAVE_ALL_OUT_IGNORE_THE_REST_0ys6zi4m.zip
65: 14-Apr-2020 18:14:53 (low) [Rosetta@home] Finished download of Mini_Protein_binds_IL1R_COVID-19_design7_SAVE_ALL_OUT_IGNORE_THE_REST_0uc4pt5i.zip
66: 14-Apr-2020 18:14:53 (low) [Rosetta@home] Started download of Mini_Protein_binds_IL1R_COVID-19_design7_SAVE_ALL_OUT_IGNORE_THE_REST_0uc4pt5i.flags
67: 14-Apr-2020 18:14:53 (low) [Rosetta@home] Started download of Mini_Protein_binds_IL1R_COVID-19_design8_SAVE_ALL_OUT_IGNORE_THE_REST_4zr9fy2g.zip
68: 14-Apr-2020 18:14:55 (low) [Rosetta@home] Finished download of Mini_Protein_binds_IL1R_COVID-19_design7_SAVE_ALL_OUT_IGNORE_THE_REST_0uc4pt5i.flags
69: 14-Apr-2020 18:14:55 (low) [Rosetta@home] Started download of Mini_Protein_binds_IL1R_COVID-19_design8_SAVE_ALL_OUT_IGNORE_THE_REST_4zr9fy2g.flags
70: 14-Apr-2020 18:14:56 (low) [Rosetta@home] Finished download of Mini_Protein_binds_IL1R_COVID-19_design8_SAVE_ALL_OUT_IGNORE_THE_REST_4zr9fy2g.flags
71: 14-Apr-2020 18:14:59 (low) [Rosetta@home] Finished download of Mini_Protein_binds_IL1R_COVID-19_design8_SAVE_ALL_OUT_IGNORE_THE_REST_4zr9fy2g.zip


And all the tasks are again uninitialized:

======== Tasks ========
1) -----------
   name: Mini_Protein_binds_IL1R_COVID-19_design7_SAVE_ALL_OUT_IGNORE_THE_REST_9nn9gy6g_909112_4_0
   WU name: Mini_Protein_binds_IL1R_COVID-19_design7_SAVE_ALL_OUT_IGNORE_THE_REST_9nn9gy6g_909112_4
   project URL: https://boinc.bakerlab.org/rosetta/
   received: Tue Apr 14 11:50:36 2020
   report deadline: Fri Apr 17 11:50:36 2020
   ready to report: no
   state: downloaded
   scheduler state: uninitialized
   active_task_state: UNINITIALIZED
   app version num: 415
   resources: 1 CPU
   estimated CPU time remaining: 15686.105082
2) -----------
   name: Mini_Protein_binds_IL1R_COVID-19_design4_SAVE_ALL_OUT_IGNORE_THE_REST_8rz6dh5f_908717_4_0
   WU name: Mini_Protein_binds_IL1R_COVID-19_design4_SAVE_ALL_OUT_IGNORE_THE_REST_8rz6dh5f_908717_4
   project URL: https://boinc.bakerlab.org/rosetta/
   received: Tue Apr 14 11:50:47 2020
   report deadline: Fri Apr 17 11:50:47 2020
   ready to report: no
   state: downloaded
   scheduler state: uninitialized
   active_task_state: UNINITIALIZED
   app version num: 415
   resources: 1 CPU
   estimated CPU time remaining: 15686.105082
3) -----------

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 94472 - Posted: 14 Apr 2020, 20:58:00 UTC

In your preferences (again, by venue of the machine), have you set up specific time periods during the day when BOINC is allowed to use the CPU?
Rosetta Moderator: Mod.Sense

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1467
Credit: 14,331,566
RAC: 16,669
Message 94491 - Posted: 14 Apr 2020, 23:40:06 UTC - in response to Message 94472.  
Last modified: 14 Apr 2020, 23:43:44 UTC

In your preferences (again, by venue of the machine), have you set up specific time periods during the day when BOINC is allowed to use the CPU?
A quick search shows you seem to be on the right track.


SGAI-CSIC, make sure "Use CPU" is selected.
Make sure no settings that block BOINC from processing work are selected, e.g. "Use at most xx % of the CPUs" and "Use at most x % of CPU time" should both be 100%, while "Suspend when computer is on battery", "Suspend when computer is in use", "Suspend when non-BOINC CPU usage is above ---%" and "Compute only between ---" should all be unselected/blank.

Edit: oh, and any local settings will override the web-based ones.
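On a headless host, local settings usually live in a file named global_prefs_override.xml in the BOINC data directory. Here is a small sketch to check for it, assuming the data directory is /var/lib/boinc as reported in the client startup log later in this thread (adjust the path if yours differs). The name has_prefs_override is made up for this example:

```shell
# has_prefs_override DATA_DIR
# Succeeds (exit status 0) when DATA_DIR contains a non-empty
# global_prefs_override.xml, i.e. local preferences that would take
# precedence over the web-based ones.
has_prefs_override() {
    [ -s "$1/global_prefs_override.xml" ]
}

# Example (path is an assumption, see above):
#   if has_prefs_override /var/lib/boinc; then
#       cat /var/lib/boinc/global_prefs_override.xml
#   fi
```

If the file exists on the hosts that are not running tasks but not on the one that is, that difference would be the first thing to look at.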
Grant
Darwin NT

SGAI-CSIC
Joined: 4 Apr 20
Posts: 19
Credit: 15,069,615
RAC: 0
Message 94530 - Posted: 15 Apr 2020, 11:01:00 UTC - in response to Message 94491.  

It's very strange: all my hosts have the same system and BOINC configuration, and now only one is working on tasks. All the hosts are in the venue "work", and these are my preferences:

Computing preferences:

Rosetta@home preferences

What can be wrong?

Grant (SSSF)
Joined: 28 Mar 20
Posts: 1467
Credit: 14,331,566
RAC: 16,669
Message 94533 - Posted: 15 Apr 2020, 11:42:38 UTC
Last modified: 15 Apr 2020, 11:43:12 UTC

From what you've posted there I can't see any reason for tasks not running. The default CPU Target time is 8 hours, but I can't see why 24 hours should cause things to not run.
I don't know how you would do it for a headless system, but you can select which functions get recorded to the Event log.

I'd exit BOINC, give it 10 sec or so and restart it. Then post the contents of the Event log here to see what messages are there, and someone with more experience might have a suggestion as to which options would be best to enable to sort this out.
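On a headless system the Event log options are set through the log_flags section of cc_config.xml in the BOINC data directory. A sketch, assuming the data directory is passed as an argument and that no cc_config.xml exists there yet (this would overwrite one). write_debug_cc_config is a hypothetical helper name, and the flag selection is only a guess at what is useful here:

```shell
# write_debug_cc_config DATA_DIR
# Writes DATA_DIR/cc_config.xml with extra scheduler and memory logging
# enabled. WARNING: replaces any existing cc_config.xml in DATA_DIR.
write_debug_cc_config() {
    cat > "$1/cc_config.xml" <<'EOF'
<cc_config>
  <log_flags>
    <task>1</task>
    <cpu_sched>1</cpu_sched>
    <cpu_sched_debug>1</cpu_sched_debug>
    <mem_usage_debug>1</mem_usage_debug>
  </log_flags>
</cc_config>
EOF
}
```

After writing the file, boinccmd --read_cc_config or a client restart should make the flags take effect. The cpu_sched_debug and mem_usage_debug messages are the ones most likely to say why a task is not being started.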


One more thought: if you click on Details for the problem system and compare it to a working system, near the bottom there is some info on when the system can process work (only you can see it for your own systems, not others). The ones of interest:
Fraction of time BOINC is running 99.91%
While BOINC is running, fraction of time computing is allowed 100.00%

Grant
Darwin NT

SGAI-CSIC
Joined: 4 Apr 20
Posts: 19
Credit: 15,069,615
RAC: 0
Message 94536 - Posted: 15 Apr 2020, 12:36:58 UTC - in response to Message 94533.  
Last modified: 15 Apr 2020, 12:37:59 UTC

Thank you for your help. I stopped the boinc daemon and started it again, and I see these messages in the log. Perhaps it's a bug in the boinc-client on CentOS?


# systemctl stop boinc-client 
# systemctl start boinc-client
# boinccmd --get_messages

1:  15-Apr-2020 14:33:14 (low) [] cc_config.xml not found - using defaults
2: 15-Apr-2020 14:33:14 (low) [] Starting BOINC client version 7.16.1 for x86_64-pc-linux-gnu
3: 15-Apr-2020 14:33:14 (low) [] log flags: file_xfer, sched_ops, task
4: 15-Apr-2020 14:33:14 (low) [] Libraries: libcurl/7.29.0 NSS/3.44 zlib/1.2.7 libidn/1.28 libssh2/1.8.0
5: 15-Apr-2020 14:33:14 (low) [] Data directory: /var/lib/boinc
6: 15-Apr-2020 14:33:14 (low) [] No usable GPUs found
7: 15-Apr-2020 14:33:14 (low) [] [libc detection] gathered: 2.17, GNU libc
8: 15-Apr-2020 14:33:14 (low) [] Host name: rosetta3
9: 15-Apr-2020 14:33:14 (low) [] Processor: 4 GenuineIntel QEMU Virtual CPU version 2.5+ [Family 6 Model 13 Stepping 3]
10: 15-Apr-2020 14:33:14 (low) [] Processor features: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl xtopology eagerfpu pni cx16 x2apic hypervisor lahf_lm
11: 15-Apr-2020 14:33:14 (low) [] OS: Linux CentOS Linux: CentOS Linux 7 (Core) [3.10.0-1062.18.1.el7.x86_64|libc 2.17 (GNU libc)]
12: 15-Apr-2020 14:33:14 (low) [] Memory: 3.70 GB physical, 1.20 GB virtual
13: 15-Apr-2020 14:33:14 (low) [] Disk: 10.22 GB total, 7.85 GB free
14: 15-Apr-2020 14:33:14 (low) [] Local time is UTC +2 hours
15: 15-Apr-2020 14:33:14 (low) [Rosetta@home] General prefs: from Rosetta@home (last modified 15-Apr-2020 09:58:03)
16: 15-Apr-2020 14:33:14 (low) [Rosetta@home] Computer location: work
17: 15-Apr-2020 14:33:14 (low) [] General prefs: using separate prefs for work
18: 15-Apr-2020 14:33:14 (low) [] Preferences:
19: 15-Apr-2020 14:33:14 (low) [] max memory usage when active: 1894.49 MB
20: 15-Apr-2020 14:33:14 (low) [] max memory usage when idle: 3410.09 MB
21: 15-Apr-2020 14:33:14 (low) [] max disk usage: 7.56 GB
22: 15-Apr-2020 14:33:14 (low) [] (to change preferences, visit a project web site or select Preferences in the Manager)
23: 15-Apr-2020 14:33:14 (low) [] Setting up project and slot directories
24: 15-Apr-2020 14:33:14 (low) [] Checking active tasks
25: 15-Apr-2020 14:33:14 (low) [Rosetta@home] URL https://boinc.bakerlab.org/rosetta/; Computer ID 4062012; resource share 100
26: 15-Apr-2020 14:33:14 (low) [] Setting up GUI RPC socket
27: 15-Apr-2020 14:33:14 (low) [] Checking presence of 23 project files

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 94542 - Posted: 15 Apr 2020, 13:47:50 UTC

It looks like you have 4-core systems with 4GB of memory. Try setting one system to use at most 25% of the CPUs, and set another to use at most 50% of the CPUs.

Some WUs are reserving more than a GB of memory to ensure they run well (once you get them to start running).
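As a rough back-of-the-envelope check (not the client's actual scheduling logic), dividing the memory BOINC may use by the memory one task needs gives an upper bound on how many tasks can run at once. The 1894.49 MB and 3410.09 MB limits come from the Preferences lines in the log above. The 1024 MB per task is an assumption standing in for "more than a GB", and tasks_that_fit is a hypothetical helper name:

```shell
# tasks_that_fit BUDGET_MB PER_TASK_MB
# Prints how many tasks needing PER_TASK_MB each fit into BUDGET_MB,
# rounded down. Fails if PER_TASK_MB is not positive.
tasks_that_fit() {
    awk -v budget="$1" -v per_task="$2" 'BEGIN {
        if (per_task <= 0) exit 1
        printf "%d\n", int(budget / per_task)
    }'
}

# Examples with the limits from the log and an assumed 1024 MB per task:
#   tasks_that_fit 1894.49 1024   # prints 1 ("when active" limit)
#   tasks_that_fit 3410.09 1024   # prints 3 ("when idle" limit)
```

So under the "when active" limit only one such task fits, and about three under the "when idle" limit, in both cases fewer than the four cores.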

I would suggest adding one or more other BOINC projects to these systems, where the WUs require less memory. The BOINC Manager will then find a mix of WUs that can run with the resources available.
Rosetta Moderator: Mod.Sense