Posts by Barraud Denis

1) Message boards : Number crunching : Report "hombench_..." issues here! (Message 56202)
Posted 3 Oct 2008 by Barraud Denis
Post:
Barraud, with your preferences, that task should have been ended by the watchdog because it has been running for more than 5 times your normal runtime preference.

Please abort it.



For the moment I prefer to let it run; I noticed that the WU's deadline is 6/10/2008 21:40.
But I would like to know whether you will need any of the WU's files for analysis.
I could zip the task's slot directory and send it to you later if you need it.

INFORMATION: in BOINC -> Messages, the WU seemed to restart about every 15 minutes. Confirmed: both Task Manager and the BOINC messages show that the WU restarts every 15 minutes. Is the WU restarting in a loop at the same point every 15 minutes?

Now: 27:03:30 - 99.387% - 00:09:56

I will abort the WU later; given how long I have already crunched it, I can let it run a little longer before aborting it.
2) Message boards : Number crunching : Report "hombench_..." issues here! (Message 56199)
Posted 3 Oct 2008 by Barraud Denis
Post:
Follow-up to Message 56174 - Posted 2 Oct 2008 20:21:07 UTC

hombench_mtyka_foldcst_loopbuild_boinctest3_foldcst_loopbuild_t328__IGNORE_THE_REST_1ET0A_11_4578_8_0
using minirosetta version 134

Q6600, XP 32-bit, 2*2 GB DDR2-8500, on my BOINC 6.3.10
I had to suspend the WU to check that it is not faulty. Apparently it is not, but I am waiting to see whether it resumes work.

Now, more information about my BOINC / Rosetta settings:

My BOINC preferences: switch between applications every 80 minutes
BOINC -> Projects -> resource share: 6.25%
My Rosetta preferences: Target CPU run time: 4 hours

The WU restarts each time its turn comes in BOINC! It always runs again with:

Now:
26:57:00 at 99.385%, 00:09:53 remaining
26:47:00 at 99.380%, 00:09:56 remaining
Before:
21:52:20 at 99.243%, 00:09:56 remaining

The only thing I have changed is the OS priority, from Low to Normal in XP's Task Manager, to see whether it changes anything for the WU.

More to follow.
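
To put the figures above in perspective, here is a minimal Python sketch of the arithmetic. It assumes the watchdog rule is exactly as worded in the reply quoted in Message 56202 (more than 5 times the runtime preference) and uses the 4-hour target CPU run time and the elapsed/progress readings from this post. The function names are hypothetical, and the real watchdog inside the Rosetta application may well work differently.

```python
from datetime import timedelta

# Values taken from the posts; the rule itself is an assumption based on
# the wording "more than 5 times your normal runtime preference".
TARGET_RUN_TIME = timedelta(hours=4)
WATCHDOG_FACTOR = 5


def parse_hms(text: str) -> timedelta:
    """Convert an 'HH:MM:SS' elapsed-time reading into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def exceeds_watchdog(elapsed: timedelta) -> bool:
    """True when elapsed time is beyond WATCHDOG_FACTOR times the target."""
    return elapsed > WATCHDOG_FACTOR * TARGET_RUN_TIME


def progress_per_hour(first, last) -> float:
    """Average progress gain (percentage points per hour) between two readings."""
    (t0, p0), (t1, p1) = first, last
    hours = (t1 - t0).total_seconds() / 3600
    return (p1 - p0) / hours


if __name__ == "__main__":
    before = (parse_hms("21:52:20"), 99.243)
    now = (parse_hms("26:57:00"), 99.385)
    print("past the assumed watchdog limit:", exceeds_watchdog(now[0]))
    print("progress per hour (points):", round(progress_per_hour(before, now), 3))
```

Under these assumptions the limit would be 20 hours, so both readings are already past it, and the progress figure is barely moving between readings.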
3) Message boards : Number crunching : Report "hombench_..." issues here! (Message 56174)
Posted 2 Oct 2008 by Barraud Denis
Post:
hombench_mtyka_foldcst_loopbuild_boinctest3_foldcst_loopbuild_t328__IGNORE_THE_REST_1ET0A_11_4578_8_0
using minirosetta version 134

Q6600, XP 32-bit, 2*2 GB DDR2-8500 -> 21:52:20 at 99.243%, 00:09:56 remaining
on my BOINC 6.3.10
I had to suspend the WU to check that it is not faulty. Apparently it is not, but I am waiting to see whether it resumes work.
4) Message boards : Number crunching : Problems with Rosetta version 5.93 (Message 50643)
Posted 13 Jan 2008 by Barraud Denis
Post:
Rosetta failed and completely blocked BOINC on my Q6600, so I have stopped this project to protect my other WUs running on BOINC. The BOINC Manager stays in memory but is not running, and no WU can work. Even with BOINC and all projects completely reinstalled after a reboot, Rosetta fails again and blocks BOINC.

The only way I found to recover BOINC was to kill the BOINC Manager, restart it, and remove the Rosetta project quickly, before it downloads a new WU.

I think Rosetta should be updated so that it detaches from BOINC more cleanly when it fails with an error, to prevent BOINC from freezing.

Here is the information I have from the Event Viewer.

Event Type: Error
Event Source: Application Error
Event Category: None
Event ID: 1000
Date: 13/01/2008
Time: 15:05:54
User: N/A
Computer: C2Q1
Description:
Faulting application minirosetta_1.03_windows_intelx86.exe, version 0.0.0.0, faulting module minirosetta_1.03_windows_intelx86.exe, version 0.0.0.0, fault address 0x0027e8c2.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 41 70 70 6c 69 63 61 74 Applicat
0008: 69 6f 6e 20 46 61 69 6c ion Fail
0010: 75 72 65 20 20 6d 69 6e ure min
0018: 69 72 6f 73 65 74 74 61 irosetta
0020: 5f 31 2e 30 33 5f 77 69 _1.03_wi
0028: 6e 64 6f 77 73 5f 69 6e ndows_in
0030: 74 65 6c 78 38 36 2e 65 telx86.e
0038: 78 65 20 30 2e 30 2e 30 xe 0.0.0
0040: 2e 30 20 69 6e 20 6d 69 .0 in mi
0048: 6e 69 72 6f 73 65 74 74 nirosett
0050: 61 5f 31 2e 30 33 5f 77 a_1.03_w
0058: 69 6e 64 6f 77 73 5f 69 indows_i
0060: 6e 74 65 6c 78 38 36 2e ntelx86.
0068: 65 78 65 20 30 2e 30 2e exe 0.0.
0070: 30 2e 30 20 61 74 20 6f 0.0 at o
0078: 66 66 73 65 74 20 30 30 ffset 00
0080: 32 37 65 38 63 32 0d 0a 27e8c2..
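
The data block above is simply the ASCII text of the failure record. A minimal Python sketch for decoding such a dump follows. The helper name `decode_event_data` is hypothetical, and the sketch assumes each row carries an offset, a colon, and then at most eight hex bytes before any text column.

```python
# Hex bytes copied from the event data above (text column omitted).
EVENT_DATA = """\
0000: 41 70 70 6c 69 63 61 74
0008: 69 6f 6e 20 46 61 69 6c
0010: 75 72 65 20 20 6d 69 6e
0018: 69 72 6f 73 65 74 74 61
0020: 5f 31 2e 30 33 5f 77 69
0028: 6e 64 6f 77 73 5f 69 6e
0030: 74 65 6c 78 38 36 2e 65
0038: 78 65 20 30 2e 30 2e 30
0040: 2e 30 20 69 6e 20 6d 69
0048: 6e 69 72 6f 73 65 74 74
0050: 61 5f 31 2e 30 33 5f 77
0058: 69 6e 64 6f 77 73 5f 69
0060: 6e 74 65 6c 78 38 36 2e
0068: 65 78 65 20 30 2e 30 2e
0070: 30 2e 30 20 61 74 20 6f
0078: 66 66 73 65 74 20 30 30
0080: 32 37 65 38 63 32 0d 0a
"""


def decode_event_data(dump: str) -> str:
    """Turn 'offset: hh hh ...' rows into the ASCII text they encode."""
    raw = bytearray()
    for line in dump.splitlines():
        if ":" not in line:
            continue  # skip blank or unrelated lines
        _offset, _, rest = line.partition(":")
        # Keep only the first eight tokens so a trailing text column is ignored.
        for token in rest.split()[:8]:
            raw.append(int(token, 16))
    return raw.decode("ascii")


if __name__ == "__main__":
    print(decode_event_data(EVENT_DATA).strip())
```

The decoded text restates the description: the application name, version 0.0.0.0, and the offset 0027e8c2, so the data block adds no extra detail beyond the description line.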
5) Message boards : Number crunching : Problem with your server: (Message 45859)
Posted 9 Sep 2007 by Barraud Denis
Post:
When I try to report my WUs after the repair of your SAN problems, BOINC shows this in its log:

Sending scheduler request: Requested by user
Requesting 23980 seconds of new work, and reporting 3 completed tasks
Scheduler RPC succeeded
Project encountered internal error : shared memory
Deferring communication for 1h ...
Reason: project is down

But I see that the status of your servers seems to be OK,
so I suppose there is a bug or another problem in your SAN.
Could you check it?

6) Message boards : Number crunching : exit code 1 at dock_structure.cc at line 401 ? (Message 15727)
Posted 9 May 2006 by Barraud Denis
Post:
Two of my recent units:

At 9 May 2006 14:19:46 UTC:
Result ID 19619762
Name FA_CASP6_u272__470_19905_0
Workunit 16283876
Received 9 May 2006 14:19:46 UTC

[ AND ]

Result ID 19569734
Name FA_CASP6_t212__470_16445_0
Workunit 16238895
Received 9 May 2006 0:33:43 UTC

both apparently failed with this:

<core_client_version>4.45</core_client_version>
<message>Fonction incorrecte. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# random seed: 1523116
# cpu_run_time_pref: 10800
# cpu_run_time_pref: 10800
# cpu_run_time_pref: 10800
# cpu_run_time_pref: 10800
ERROR:: Exit at: .dock_structure.cc line:401
</stderr_txt>

I found that this happens when BOINC pauses a climateprediction.net WU and tries to restart the Rosetta WU: the Rosetta WU fails with an error and is then reported as 'finished'.

-----
09/05/2006 01:03:37|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
09/05/2006 01:03:37|climateprediction.net|Requesting 0 seconds of work, returning 0 results
09/05/2006 01:03:38|climateprediction.net|Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
09/05/2006 02:03:36|climateprediction.net|Pausing result sulphur_j46i_200891882_0 (removed from memory)
09/05/2006 02:03:36|rosetta@home|Restarting result FA_CASP6_t212__470_16445_0 using rosetta version 5.07
09/05/2006 02:03:38||request_reschedule_cpus: process exited
09/05/2006 02:32:37|rosetta@home|Unrecoverable error for result FA_CASP6_t212__470_16445_0 (Fonction incorrecte. (0x1) - exit code 1 (0x1))
09/05/2006 02:32:37||request_reschedule_cpus: process exited
09/05/2006 02:32:37|rosetta@home|Deferring communication with project for 1 minutes and 0 seconds
09/05/2006 02:32:37|rosetta@home|Computation for result FA_CASP6_t212__470_16445_0 finished
09/05/2006 02:32:37|climateprediction.net|Restarting result sulphur_j46i_200891882_0 using sulphur_cycle version 4.22
09/05/2006 02:32:39|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
09/05/2006 02:32:39|climateprediction.net|Requesting 0 seconds of work, returning 0 results
09/05/2006 02:32:40|climateprediction.net|Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
09/05/2006 02:33:38|rosetta@home|Sending scheduler request to http://boinc.bakerlab.org/rosetta_cgi/cgi
09/05/2006 02:33:38|rosetta@home|Requesting 0 seconds of work, returning 1 results
09/05/2006 02:33:40|rosetta@home|Scheduler request to http://boinc.bakerlab.org/rosetta_cgi/cgi succeeded
09/05/2006 03:32:37|climateprediction.net|Pausing result sulphur_j46i_200891882_0 (removed from memory)

-----
09/05/2006 15:17:14|climateprediction.net|Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
09/05/2006 16:17:13|climateprediction.net|Pausing result sulphur_j46i_200891882_0 (removed from memory)
09/05/2006 16:17:13|rosetta@home|Restarting result FA_CASP6_u272__470_19905_0 using rosetta version 5.07
09/05/2006 16:17:16||request_reschedule_cpus: process exited
09/05/2006 16:18:40|rosetta@home|Unrecoverable error for result FA_CASP6_u272__470_19905_0 (Fonction incorrecte. (0x1) - exit code 1 (0x1))
09/05/2006 16:18:40||request_reschedule_cpus: process exited
09/05/2006 16:18:40|rosetta@home|Deferring communication with project for 1 minutes and 0 seconds
09/05/2006 16:18:40|rosetta@home|Computation for result FA_CASP6_u272__470_19905_0 finished
09/05/2006 16:18:40|rosetta@home|Starting result HBLR_1.0_1n0u_RDFLAGS_485_15349_0 using rosetta version 5.07
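
To check how long each Rosetta result survived after being restarted, the pipe-separated log lines above (timestamp|project|message) can be parsed with a short Python sketch like the one below. The function name `time_to_failure` is hypothetical, and the sketch assumes the day/month/year timestamp layout seen in these excerpts. "Fonction incorrecte" is the French Windows text for error 1, "Incorrect function".

```python
import re
from datetime import datetime, timedelta

# Four lines copied from the log excerpts above.
LOG = """\
09/05/2006 02:03:36|rosetta@home|Restarting result FA_CASP6_t212__470_16445_0 using rosetta version 5.07
09/05/2006 02:32:37|rosetta@home|Unrecoverable error for result FA_CASP6_t212__470_16445_0 (Fonction incorrecte. (0x1) - exit code 1 (0x1))
09/05/2006 16:17:13|rosetta@home|Restarting result FA_CASP6_u272__470_19905_0 using rosetta version 5.07
09/05/2006 16:18:40|rosetta@home|Unrecoverable error for result FA_CASP6_u272__470_19905_0 (Fonction incorrecte. (0x1) - exit code 1 (0x1))
"""

RESTART = re.compile(r"Restarting result (\S+)")
FAILURE = re.compile(r"Unrecoverable error for result (\S+)")


def time_to_failure(log: str) -> dict[str, timedelta]:
    """Map each failed result name to the delay between restart and error."""
    restarted: dict[str, datetime] = {}
    delays: dict[str, timedelta] = {}
    for line in log.splitlines():
        fields = line.split("|", 2)
        if len(fields) != 3:
            continue  # not a timestamp|project|message line
        stamp, _project, message = fields
        when = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")
        if (match := RESTART.match(message)):
            restarted[match.group(1)] = when
        elif (match := FAILURE.match(message)) and match.group(1) in restarted:
            delays[match.group(1)] = when - restarted[match.group(1)]
    return delays


if __name__ == "__main__":
    for name, delay in time_to_failure(LOG).items():
        print(name, delay)
```

In these excerpts each error follows a restart that happened right after a climateprediction.net result was paused, which fits the observation above.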






©2024 University of Washington
https://www.bakerlab.org