Problems and Technical Issues with Rosetta@home

wolfman1360

Joined: 18 Feb 17
Posts: 72
Credit: 18,450,036
RAC: 0
Message 102469 - Posted: 26 Aug 2021, 16:50:52 UTC - in response to Message 102461.  

Several of these tasks are running for twice my set computation time, and not checkpointing to boot. I hope I get some sort of credit for these.


Application
Rosetta 4.20
Name
rb_08_23_108315_111529_ab_t000__robetta_cstwt_5.0_FT_IGNORE_THE_REST_06_05_1729195_676

[snip]

Rosetta@Home tasks have sections known as decoys. The decision on whether to end the task normally occurs only at the end of a decoy.

Your very long run time looks like you got at least one task with a very long time per decoy.

I have no information on whether checkpoints are also written only at the ends of decoys. However, if so, this is probably why you also had the long time with no checkpoints.

You might want to read the log file from that task to check whether it completed only one decoy.

Also check if you can read the log files from any other tasks for that workunit. If all of them were that slow, expect to get some credit as long as you either returned it by the deadline, or returned it before the quorum was met.

Thank you, this is super helpful and I will do so.
I don't think some of these tasks are going to complete in time for the deadline without checkpointing. I'm going to try to keep the client running, but they're also using pretty excessive amounts of RAM.
I thought the quorum for each task (number of machines to complete) needed to be 1? Or do you mean others, apart from myself, also get this task, in case I don't complete it first?
ID: 102469 · Rating: 0
Kissagogo27

Joined: 31 Mar 20
Posts: 83
Credit: 2,593,359
RAC: 2,397
Message 102470 - Posted: 26 Aug 2021, 19:13:42 UTC

Hi, for me, with a target run time of 12 h, some of them finish in just 8 h!
ID: 102470 · Rating: 0
robertmiles

Joined: 16 Jun 08
Posts: 1223
Credit: 13,824,497
RAC: 2,340
Message 102472 - Posted: 26 Aug 2021, 23:20:29 UTC - in response to Message 102469.  
Last modified: 26 Aug 2021, 23:21:35 UTC

[snip]

Thank you, this is super helpful and I will do so.
I don't think some of these tasks are going to complete in time for the deadline without checkpointing. I'm going to try to keep the client running, but they're also using pretty excessive amounts of RAM.
I thought the quorum for each task (number of machines to complete) needed to be 1? Or do you mean others, apart from myself, also get this task, in case I don't complete it first?

The usual quorum used to be two, but has often been 1 lately. A quorum of 1 is adequate only for tasks for which some quick method of checking the output of the task is available. If the quorum is 2, the first two sets of task output files returned must agree enough before they are considered validated. If they don't agree enough, one more task is sent out to determine which of the first two tasks is correct enough to be validated. The purpose of the quorum is to check whether the task or tasks returned correct outputs, even if the task did not detect an error. Sometimes, a workunit with an error in its input files will give some credit if other tasks for that same workunit agree on detecting the error.

Usually, the first group of tasks sent out has as many tasks as the quorum, so if the quorum is greater than one, at least one other person will also get a task for that workunit. For each task that goes past its deadline, one more task for that workunit will be sent out. You have a head start on any task sent due to another task reaching its deadline, and therefore some chance of still returning it in time.
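A rough sketch of the quorum check described above, in Python. This is not BOINC's actual validator code: `check_quorum`, `close`, and the 0.01 tolerance are made-up names and values for illustration, and a real validator compares output files, not single numbers.

```python
from itertools import combinations


def close(a, b, tolerance=0.01):
    """Hypothetical agreement test: two numeric results 'agree enough'."""
    return abs(a - b) <= tolerance


def check_quorum(results, quorum, agree=close):
    """Return (validated_result_or_None, extra_tasks_to_send).

    results: outputs returned so far, in order of arrival.
    quorum:  how many mutually agreeing results are required.
    """
    if len(results) < quorum:
        # Not enough results back yet; wait for (or send) the missing ones.
        return None, quorum - len(results)
    for group in combinations(results, quorum):
        # Every pair inside the group must agree.
        if all(agree(a, b) for a, b in combinations(group, 2)):
            return group[0], 0
    # Enough results came back, but no agreeing group: send one more task.
    return None, 1


print(check_quorum([1.000, 1.004], quorum=2))       # two results agree
print(check_quorum([1.0, 2.0], quorum=2))           # disagreement, one more task
print(check_quorum([1.0, 2.0, 2.005], quorum=2))    # third result breaks the tie
```

With a quorum of 1 the loop accepts the first result without any cross-check, which is why that setting only makes sense when the output can be checked some other way.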

If the tasks are using excessive amounts of RAM, you may need to tell BOINC to reduce the number of tasks it is allowed to run at the same time, so that the reduced number will fit in the amount of RAM you have available.
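As a back-of-the-envelope way to pick that reduced number, here is a small Python sketch. The function name and the example figures (16 GB of RAM, 4 GB kept free for the OS, 2.5 GB per task, 8 CPU threads) are assumptions for illustration only; check the actual memory use of your tasks in your task manager.

```python
def max_concurrent_tasks(total_ram_gb, reserved_gb, per_task_gb, cpu_threads):
    """Estimate how many tasks fit in RAM without exceeding the thread count."""
    if per_task_gb <= 0:
        raise ValueError("per_task_gb must be positive")
    usable_gb = max(total_ram_gb - reserved_gb, 0.0)
    fits_in_ram = int(usable_gb // per_task_gb)
    return max(0, min(fits_in_ram, cpu_threads))


# Hypothetical machine: 16 GB RAM, 4 GB reserved, 2.5 GB per task, 8 threads.
print(max_concurrent_tasks(16, 4, 2.5, 8))
```

The result can then be applied through BOINC's computing preferences, for example by lowering the "use at most N% of the CPUs" setting until no more than that many tasks run at once.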

I normally keep my computer running and doing BOINC work day and night, so it can handle tasks that go over 24 hours between checkpoints.
ID: 102472 · Rating: 0
robertmiles

Joined: 16 Jun 08
Posts: 1223
Credit: 13,824,497
RAC: 2,340
Message 102473 - Posted: 26 Aug 2021, 23:25:15 UTC - in response to Message 102470.  

Hi, for me, with a target run time of 12 h, some of them finish in just 8 h!

Typical if it finishes its list of possible decoys in 8 hours.

Also expected if, at the end of a decoy, it estimates the time needed for one more decoy and finds that this would push the total too far past the run time you set.
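A minimal Python sketch of that end-of-decoy decision. I don't know how Rosetta actually estimates the next decoy's time; using the average of the decoys finished so far is my assumption, and `should_start_another_decoy` and `simulate` are hypothetical names.

```python
def should_start_another_decoy(elapsed_s, decoys_done, target_s, slack_s=0.0):
    """Decide, at the end of a decoy, whether to begin another one."""
    if decoys_done == 0:
        return True  # no timing data yet, so the first decoy always starts
    avg_decoy_s = elapsed_s / decoys_done  # assumed estimate for the next decoy
    return elapsed_s + avg_decoy_s <= target_s + slack_s


def simulate(decoy_times_s, target_s):
    """Run decoys in order; return (decoys completed, total run time in seconds)."""
    elapsed = 0.0
    done = 0
    for decoy_time in decoy_times_s:
        if not should_start_another_decoy(elapsed, done, target_s):
            break
        elapsed += decoy_time  # the check only happens between decoys
        done += 1
    return done, elapsed


HOUR = 3600
# 5-hour decoys with a 12-hour target: stops after 2 decoys (10 hours).
print(simulate([5 * HOUR] * 10, 12 * HOUR))
# One 30-hour decoy with a 12-hour target: the task overshoots to 30 hours.
print(simulate([30 * HOUR, 1 * HOUR], 12 * HOUR))
```

Because the check only runs between decoys, the same rule gives both behaviours seen in this thread: tasks that end well before the target, and tasks that run far past it when a single decoy is very slow.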
ID: 102473 · Rating: 0
Kissagogo27

Joined: 31 Mar 20
Posts: 83
Credit: 2,593,359
RAC: 2,397
Message 102477 - Posted: 27 Aug 2021, 11:09:11 UTC

OK, thanks ;)
ID: 102477 · Rating: 0
Falconet

Joined: 9 Mar 09
Posts: 350
Credit: 1,000,634
RAC: 0
Message 102479 - Posted: 27 Aug 2021, 11:42:37 UTC
Last modified: 27 Aug 2021, 11:43:05 UTC

Funny, I'm running "degrader" units at Rosetta@home and also "degrader" units at Ralph@home.

2 of the Rosetta@home units finished very early after 18 and 56 minutes, respectively.
ID: 102479 · Rating: 0
wolfman1360

Joined: 18 Feb 17
Posts: 72
Credit: 18,450,036
RAC: 0
Message 102485 - Posted: 27 Aug 2021, 15:30:59 UTC - in response to Message 102472.  

[snip]

Thank you, this is super helpful and I will do so.
I don't think some of these tasks are going to complete in time for the deadline without checkpointing. I'm going to try to keep the client running, but they're also using pretty excessive amounts of RAM.
I thought the quorum for each task (number of machines to complete) needed to be 1? Or do you mean others, apart from myself, also get this task, in case I don't complete it first?

The usual quorum used to be two, but has often been 1 lately. A quorum of 1 is adequate only for tasks for which some quick method of checking the output of the task is available. If the quorum is 2, the first two sets of task output files returned must agree enough before they are considered validated. If they don't agree enough, one more task is sent out to determine which of the first two tasks is correct enough to be validated. The purpose of the quorum is to check whether the task or tasks returned correct outputs, even if the task did not detect an error. Sometimes, a workunit with an error in its input files will give some credit if other tasks for that same workunit agree on detecting the error.

Usually, the first group of tasks sent out has as many tasks as the quorum, so if the quorum is greater than one, at least one other person will also get a task for that workunit. For each task that goes past its deadline, one more task for that workunit will be sent out. You have a head start on any task sent due to another task reaching its deadline, and therefore some chance of still returning it in time.

If the tasks are using excessive amounts of RAM, you may need to tell BOINC to reduce the number of tasks it is allowed to run at the same time, so that the reduced number will fit in the amount of RAM you have available.

I normally keep my computer running and doing BOINC work day and night, so it can handle tasks that go over 24 hours between checkpoints.

Hi,
I normally do too, on all but one machine. Of course, that was the one that had these issues. The tasks ended up erroring out, though for some reason they displayed a large amount of credit, over 400.
Thanks for the explanation. That clears things up.
ID: 102485 · Rating: 0
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1467
Credit: 14,322,361
RAC: 16,418
Message 102496 - Posted: 30 Aug 2021, 7:32:50 UTC

Is it just my memory playing up, or has the Total queued jobs dropped from over 4 million to just over 1.7 million overnight?
Grant
Darwin NT
ID: 102496 · Rating: 0
[VENETO] boboviz

Joined: 1 Dec 05
Posts: 1847
Credit: 7,988,827
RAC: 8,538
Message 102497 - Posted: 30 Aug 2021, 8:18:46 UTC - in response to Message 102479.  

Funny, I'm running "degrader" units at Rosetta@home and also "degrader" units at Ralph@home.

I have a lot of errors on "degrader" on Ralph with this message:

ERROR: Error in core::conformation::Conformation::residue(): The sequence position requested was greater than the number of residues in the pose.
ERROR:: Exit from: C:cygwin64homeboinc4.17Rosettamainsourcesrccore/conformation/Conformation.hh line: 508
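For anyone wondering what that message is saying: something asked for a residue at a position beyond the end of the structure. The toy Python class below only illustrates that kind of range check. It is not Rosetta's code, `ToyPose` is a made-up name, and the 1-based numbering is an assumption of this sketch.

```python
class ToyPose:
    """Toy stand-in for a protein pose, for illustration only."""

    def __init__(self, residues):
        self._residues = list(residues)

    def size(self):
        return len(self._residues)

    def residue(self, seqpos):
        # Positions are 1-based in this sketch, so valid values are 1..size().
        if seqpos < 1 or seqpos > self.size():
            raise IndexError(
                f"sequence position {seqpos} requested, "
                f"but the pose has only {self.size()} residues"
            )
        return self._residues[seqpos - 1]


pose = ToyPose(["ALA", "GLY", "SER"])
print(pose.residue(3))  # last valid position
try:
    pose.residue(4)     # one past the end
except IndexError as err:
    print("ERROR:", err)
```

If the inputs of a workunit refer to positions that the input structure does not have, a check like this fails on every host, which would match a whole batch dying within minutes.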

ID: 102497 · Rating: 0
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1467
Credit: 14,322,361
RAC: 16,418
Message 102498 - Posted: 30 Aug 2021, 8:40:29 UTC - in response to Message 102497.  

Funny, I'm running "degrader" units at Rosetta@home and also "degrader" units at Ralph@home.

I have a lot of errors on "degrader" on Ralph with this message:

ERROR: Error in core::conformation::Conformation::residue(): The sequence position requested was greater than the number of residues in the pose.
ERROR:: Exit from: C:cygwin64homeboinc4.17Rosettamainsourcesrccore/conformation/Conformation.hh line: 508
I've got the same thing. It seems to be one particular group, _5nvx_, that crashes and burns in less than 2 minutes.
All the others are crunching without issues (or if they do have issues, they take longer and still validate).
Grant
Darwin NT
ID: 102498 · Rating: 0
Falconet

Joined: 9 Mar 09
Posts: 350
Credit: 1,000,634
RAC: 0
Message 102499 - Posted: 30 Aug 2021, 9:19:32 UTC - in response to Message 102496.  

Is it just my memory playing up, or has the Total queued jobs dropped from over 4 million to just over 1.7 million overnight?



It did. It was at 3.899 million when I last looked.
ID: 102499 · Rating: 0
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1467
Credit: 14,322,361
RAC: 16,418
Message 102500 - Posted: 30 Aug 2021, 9:31:39 UTC - in response to Message 102499.  

Is it just my memory playing up, or has the Total queued jobs dropped from over 4 million to just over 1.7 million overnight?
It did. It was at 3.899 million when I last looked.
Phew.
Nice to know I haven't completely lost it (yet).
Grant
Darwin NT
ID: 102500 · Rating: 0
lazyacevw

Joined: 18 Mar 20
Posts: 12
Credit: 93,576,463
RAC: 0
Message 102524 - Posted: 1 Sep 2021, 15:46:26 UTC - in response to Message 102496.  

Is it just my memory playing up, or has the Total queued jobs dropped from over 4 million to just over 1.7 million overnight?


Well, 371 of those were "relatively" quick task failures on my systems. I have a sneaky feeling I am going to run out of bandwidth a little sooner this month.
ID: 102524 · Rating: 0
Tomcat雄猫

Joined: 20 Dec 14
Posts: 180
Credit: 5,364,639
RAC: 0
Message 102537 - Posted: 4 Sep 2021, 8:50:19 UTC
Last modified: 4 Sep 2021, 9:04:43 UTC

Been getting lots of errors recently:

degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_1co8fa9r_1729406_12_1
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
Incorrect function.
 (0x1) - exit code 1 (0x1)</message>
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.20_windows_x86_64.exe -run:protocol jd2_scripting -parser:protocol pdblite_boinc_120_10_tfirst--fuse--predictor_v13_degrader_boinc--fuse--tslp_design_v2_degrader_boinc.xml @degrader_site_5nvx_jhr_bcov_flags2 -in:file:silent degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_1co8fa9r.silent -in:file:silent_struct_type binary -silent_gz -mute all -silent_read_through_errors true -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_1co8fa9r.zip @degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_1co8fa9r.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 3430233
Using database: database_357d5d93529_n_methylminirosetta_database

ERROR: Error in core::conformation::Conformation::residue(): The sequence position requested was greater than the number of residues in the pose.
ERROR:: Exit from: C:cygwin64homeboinc4.17Rosettamainsourcesrccore/conformation/Conformation.hh line: 508
BOINC:: Error reading and gzipping output datafile: default.out
06:16:21 (22764): called boinc_finish(1)

</stderr_txt>
]]>


degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_1mo7yf7k_1729668_18_1
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
Incorrect function.
 (0x1) - exit code 1 (0x1)</message>
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.20_windows_x86_64.exe -run:protocol jd2_scripting -parser:protocol pdblite_boinc_120_10_tfirst--fuse--predictor_v13_degrader_boinc--fuse--tslp_design_v2_degrader_boinc.xml @degrader_site_5nvx_jhr_bcov_flags2 -in:file:silent degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_1mo7yf7k.silent -in:file:silent_struct_type binary -silent_gz -mute all -silent_read_through_errors true -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_1mo7yf7k.zip @degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_1mo7yf7k.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 2084867
Using database: database_357d5d93529_n_methylminirosetta_database

ERROR: Error in core::conformation::Conformation::residue(): The sequence position requested was greater than the number of residues in the pose.
ERROR:: Exit from: C:cygwin64homeboinc4.17Rosettamainsourcesrccore/conformation/Conformation.hh line: 508
BOINC:: Error reading and gzipping output datafile: default.out
07:41:32 (35104): called boinc_finish(1)

</stderr_txt>
]]>


degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_2sv7td9t_1730197_15_0
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
Incorrect function.
 (0x1) - exit code 1 (0x1)</message>
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.20_windows_x86_64.exe -run:protocol jd2_scripting -parser:protocol pdblite_boinc_120_10_tfirst--fuse--predictor_v13_degrader_boinc--fuse--tslp_design_v2_degrader_boinc.xml @degrader_site_5nvx_jhr_bcov_flags2 -in:file:silent degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_2sv7td9t.silent -in:file:silent_struct_type binary -silent_gz -mute all -silent_read_through_errors true -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_2sv7td9t.zip @degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_2sv7td9t.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 2786048
Using database: database_357d5d93529_n_methylminirosetta_database

ERROR: Error in core::conformation::Conformation::residue(): The sequence position requested was greater than the number of residues in the pose.
ERROR:: Exit from: C:cygwin64homeboinc4.17Rosettamainsourcesrccore/conformation/Conformation.hh line: 508
BOINC:: Error reading and gzipping output datafile: default.out
09:22:19 (22736): called boinc_finish(1)

</stderr_txt>
]]>


degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_3qn1ob6y_1729329_16_0
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
Incorrect function.
 (0x1) - exit code 1 (0x1)</message>
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.20_windows_x86_64.exe -run:protocol jd2_scripting -parser:protocol pdblite_boinc_120_10_tfirst--fuse--predictor_v13_degrader_boinc--fuse--tslp_design_v2_degrader_boinc.xml @degrader_site_5nvx_jhr_bcov_flags2 -in:file:silent degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_3qn1ob6y.silent -in:file:silent_struct_type binary -silent_gz -mute all -silent_read_through_errors true -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_3qn1ob6y.zip @degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_3qn1ob6y.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 3805639
Using database: database_357d5d93529_n_methylminirosetta_database

ERROR: Error in core::conformation::Conformation::residue(): The sequence position requested was greater than the number of residues in the pose.
ERROR:: Exit from: C:cygwin64homeboinc4.17Rosettamainsourcesrccore/conformation/Conformation.hh line: 508
BOINC:: Error reading and gzipping output datafile: default.out
16:22:42 (9952): called boinc_finish(1)

</stderr_txt>
]]>


degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_7ue1xx0j_1729914_20_0
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
Incorrect function.
 (0x1) - exit code 1 (0x1)</message>
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.20_windows_x86_64.exe -run:protocol jd2_scripting -parser:protocol pdblite_boinc_120_10_tfirst--fuse--predictor_v13_degrader_boinc--fuse--tslp_design_v2_degrader_boinc.xml @degrader_site_5nvx_jhr_bcov_flags2 -in:file:silent degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_7ue1xx0j.silent -in:file:silent_struct_type binary -silent_gz -mute all -silent_read_through_errors true -out:file:silent_struct_type binary -out:file:silent default.out -in:file:boinc_wu_zip degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_7ue1xx0j.zip @degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_7ue1xx0j.flags -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 1228105
Using database: database_357d5d93529_n_methylminirosetta_database

ERROR: Error in core::conformation::Conformation::residue(): The sequence position requested was greater than the number of residues in the pose.
ERROR:: Exit from: C:cygwin64homeboinc4.17Rosettamainsourcesrccore/conformation/Conformation.hh line: 508
BOINC:: Error reading and gzipping output datafile: default.out
23:28:52 (3568): called boinc_finish(1)

</stderr_txt>
]]>


I'm going to assume these are known problems and are being investigated. I remember seeing these over at the test project, Ralph@home. These "degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST" sure aren't the most stable WUs I've seen. Has anyone seen a single "degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST" that didn't crash and burn?
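As a side note on the command lines in those logs, here is a small Python sketch that pulls out the time-related flags and converts them to hours. The excerpt is copied from the logs above; reading the values as seconds is my assumption, and `parse_flags` is just a helper written for this post.

```python
def parse_flags(command):
    """Collect '-flag value' pairs from a command line into a dict."""
    tokens = command.split()
    flags = {}
    for i, token in enumerate(tokens):
        has_value = (
            token.startswith("-")
            and i + 1 < len(tokens)
            and not tokens[i + 1].startswith(("-", "@"))
        )
        if has_value:
            flags[token] = tokens[i + 1]
    return flags


# Excerpt of the command line shown in the stderr output above.
excerpt = (
    "-nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 "
    "-checkpoint_interval 120 -boinc::watchdog -boinc::cpu_run_timeout 36000"
)

flags = parse_flags(excerpt)
run_time_h = int(flags["-cpu_run_time"]) / 3600            # assumed seconds
timeout_h = int(flags["-boinc::cpu_run_timeout"]) / 3600   # assumed seconds
print(f"cpu_run_time: {run_time_h} h, cpu_run_timeout: {timeout_h} h")
print(f"checkpoint_interval: {flags['-checkpoint_interval']} (assumed seconds)")
```

Under that assumption the tasks were asked for an 8-hour run, yet every one of them exited within minutes, so the run-time settings never came into play.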
ID: 102537 · Rating: 0
Grant (SSSF)

Joined: 28 Mar 20
Posts: 1467
Credit: 14,322,361
RAC: 16,418
Message 102538 - Posted: 4 Sep 2021, 9:02:10 UTC - in response to Message 102537.  
Last modified: 4 Sep 2021, 9:04:25 UTC

I'm going to assume these are known problems and are being investigated. I remember seeing these over at the test project, Ralph@home.
Whoever sent them out obviously didn't take notice of what was occurring at Ralph before releasing them here, so there has been no investigation. If there had been, they would all have been cancelled ages ago.
Not a single _5nvx_ has lasted more than a couple of minutes. A 100% failure rate.



As it is, we're now out of work again, so the only tasks we'll see for a while will be resends and the odd rb_ task. And you can bet many of the resends will be _5nvx_ and every last one of them will fail in minutes.
Grant
Darwin NT
ID: 102538 · Rating: 0
robertmiles

Joined: 16 Jun 08
Posts: 1223
Credit: 13,824,497
RAC: 2,340
Message 102542 - Posted: 4 Sep 2021, 17:58:01 UTC - in response to Message 102537.  
Last modified: 4 Sep 2021, 17:58:48 UTC

Been getting lots of errors recently:

degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_1co8fa9r_1729406_12_1
<core_client_version>7.16.11</core_client_version>

[snip]

I've been seeing a lot of those lately, all in tasks with _5nvx_ in their names. Does that mean that whoever created that group of workunits needs to pay more attention in class?
ID: 102542 · Rating: 0
Falconet

Joined: 9 Mar 09
Posts: 350
Credit: 1,000,634
RAC: 0
Message 102555 - Posted: 8 Sep 2021, 20:24:40 UTC

There's new work available and it looks like the 5nvx workunits.
Let's see if these actually run.
ID: 102555 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 1965
Credit: 38,174,417
RAC: 10,123
Message 102557 - Posted: 8 Sep 2021, 21:10:03 UTC - in response to Message 102542.  
Last modified: 8 Sep 2021, 21:13:22 UTC

Been getting lots of errors recently:

degrader_site_5nvx_jhr_bcov3_SAVE_ALL_OUT_IGNORE_THE_REST_1co8fa9r_1729406_12_1
<core_client_version>7.16.11</core_client_version>

[snip]

I've been seeing a lot of those lately, all in tasks with _5nvx_ in their names. Does that mean that whoever created that group of workunits needs to pay more attention in class?

Seems like the range error has been corrected and a whole pile are getting downloaded and running right now.

Edit: Spoke too soon...

(unknown error) - exit code 3221226356 (0xc0000374)</message>
ID: 102557 · Rating: 0
Falconet

Joined: 9 Mar 09
Posts: 350
Credit: 1,000,634
RAC: 0
Message 102558 - Posted: 8 Sep 2021, 22:00:58 UTC - in response to Message 102557.  

I have 4 running at between 11% and 16% progress. Seems better than before, but perhaps some are still going to error out.
ID: 102558 · Rating: 0
Sid Celery

Joined: 11 Feb 08
Posts: 1965
Credit: 38,174,417
RAC: 10,123
Message 102559 - Posted: 8 Sep 2021, 23:12:23 UTC - in response to Message 102558.  
Last modified: 8 Sep 2021, 23:17:39 UTC

I have 4 running at between 11% and 16% progress. Seems better than before, but perhaps some are still going to error out.

I've got 14 running between 0 and 23%.
A second has crashed out after 1h 58m, with the same error as above. No idea what it means.

Edit: And a 3rd crashed at 1h 35m, same error again. Other tasks are reaching 25%.
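On what the code means: 3221226356 is 0xC0000374 in hex, which on Windows is the NTSTATUS value STATUS_HEAP_CORRUPTION, i.e. the process was terminated because its heap was found to be damaged. That says nothing about the cause. A short Python sketch for decoding such exit codes follows; the lookup table is deliberately tiny and `describe_exit_code` is a made-up helper name.

```python
# A few well-known Windows NTSTATUS values; far from a complete list.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000374: "STATUS_HEAP_CORRUPTION",
}


def describe_exit_code(code):
    """Format an exit code as hex and attach a name if it is a known NTSTATUS."""
    code &= 0xFFFFFFFF  # also handles codes reported as negative 32-bit values
    name = KNOWN_NTSTATUS.get(code, "unknown")
    return f"0x{code:08X} {name}"


print(describe_exit_code(3221226356))
print(describe_exit_code(-1073740940))  # same code, signed representation
```

Some tools report the same value as a negative number, which is why the helper masks it to 32 bits first.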
ID: 102559 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org