two "validate errors" in 24 hours

Message boards : Number crunching : two "validate errors" in 24 hours

Shoikan

Joined: 4 Apr 06
Posts: 14
Credit: 180,211
RAC: 0
Message 52481 - Posted: 15 Apr 2008, 15:32:16 UTC

If I'm not mistaken, this error is related to a problem on Rosetta's server end that loses the WUs.

Is this normal? Can I do something on my end to fix it?

Regards.
Sid Celery

Joined: 11 Feb 08
Posts: 2000
Credit: 38,569,962
RAC: 16,865
Message 52490 - Posted: 15 Apr 2008, 20:14:59 UTC

Got one here too:

https://boinc.bakerlab.org/rosetta/result.php?resultid=156081140

Am I just unlucky or could it be me? I've had loads of client errors recently due to a CPU/RAM/HD upgrade over the last few weeks, and I'm still getting problems with that too. I've reverted the RAM and HD upgrade, which seems to have settled things down quite a bit, but this is the first validation error I've noticed.
Shoikan

Joined: 4 Apr 06
Posts: 14
Credit: 180,211
RAC: 0
Message 52601 - Posted: 18 Apr 2008, 20:18:38 UTC

Another one today :(

It's really annoying. Can't anybody tell me what's happening?

Regards
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 52602 - Posted: 18 Apr 2008, 21:53:53 UTC

None of the tasks show any specific error. Yet it appears no result data was actually sent back, because no models are shown and there are no shutdown messages. Do you see any messages about these tasks in the messages tab of BOINC Manager's advanced view?

No, I can't think of anything to suggest that you try on your end.
Rosetta Moderator: Mod.Sense
Shoikan

Joined: 4 Apr 06
Posts: 14
Credit: 180,211
RAC: 0
Message 52609 - Posted: 19 Apr 2008, 7:19:01 UTC

Hi Mod Sense, thanks for replying.

These are the messages related to the last WU that failed to validate:

18/04/2008 7:08:37|rosetta@home|Starting 1bm8__CONTROL_ABRELAX_040808_FRAGS_UNCONDENSED_SAVE_ALL_OUT_-1bm8_-__3079_2612_0
18/04/2008 7:08:37|rosetta@home|Starting task 1bm8__CONTROL_ABRELAX_040808_FRAGS_UNCONDENSED_SAVE_ALL_OUT_-1bm8_-__3079_2612_0 using rosetta_beta version 596
18/04/2008 11:12:36|rosetta@home|Restarting task 1bm8__CONTROL_ABRELAX_040808_FRAGS_UNCONDENSED_SAVE_ALL_OUT_-1bm8_-__3079_2612_0 using rosetta_beta version 596
18/04/2008 12:44:27|rosetta@home|Computation for task 1bm8__CONTROL_ABRELAX_040808_FRAGS_UNCONDENSED_SAVE_ALL_OUT_-1bm8_-__3079_2612_0 finished
18/04/2008 12:44:33|rosetta@home|Started upload of 1bm8__CONTROL_ABRELAX_040808_FRAGS_UNCONDENSED_SAVE_ALL_OUT_-1bm8_-__3079_2612_0_0
18/04/2008 12:44:36|rosetta@home|Finished upload of 1bm8__CONTROL_ABRELAX_040808_FRAGS_UNCONDENSED_SAVE_ALL_OUT_-1bm8_-__3079_2612_0_0
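For anyone who wants to compare these timestamps against the CPU time shown on the task page, here is a minimal sketch that reads pipe-delimited message lines like the ones above and reports the wall-clock time between "Starting task" and "Computation for task ... finished". The helper names (parse_line, wall_clock_seconds) are made up for illustration, and the day/month/year timestamp layout is assumed from this particular log; other BOINC clients and locales print it differently.

```python
from datetime import datetime


def parse_line(line):
    """Split one 'timestamp|project|message' line (layout assumed from this log)."""
    stamp, project, message = line.split("|", 2)
    when = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")
    return when, project, message


def wall_clock_seconds(lines):
    """Seconds between the first 'Starting task' and the last 'Computation for task'."""
    start = end = None
    for line in lines:
        when, _project, message = parse_line(line)
        if message.startswith("Starting task") and start is None:
            start = when
        elif message.startswith("Computation for task"):
            end = when
    if start is None or end is None:
        return None  # the log does not cover the whole run
    return (end - start).total_seconds()


TASK = "1bm8__CONTROL_ABRELAX_040808_FRAGS_UNCONDENSED_SAVE_ALL_OUT_-1bm8_-__3079_2612_0"
LOG_LINES = [
    f"18/04/2008 7:08:37|rosetta@home|Starting task {TASK} using rosetta_beta version 596",
    f"18/04/2008 11:12:36|rosetta@home|Restarting task {TASK} using rosetta_beta version 596",
    f"18/04/2008 12:44:27|rosetta@home|Computation for task {TASK} finished",
]

elapsed = wall_clock_seconds(LOG_LINES)
print(f"wall-clock time: {elapsed:.0f} s")
```

Wall-clock time includes any period when the task was suspended or preempted, so it is only an upper bound on the CPU time actually spent.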

Seems pretty normal to me; nothing suggests that anything abnormal happened to the WU.

Regards
Shoikan

Joined: 4 Apr 06
Posts: 14
Credit: 180,211
RAC: 0
Message 52738 - Posted: 26 Apr 2008, 15:40:03 UTC

Validate errors keep happening on a regular basis. Is anyone else experiencing this same problem?

Regards.


Dr Who Fan

Joined: 28 May 06
Posts: 62
Credit: 231,858
RAC: 135
Message 52741 - Posted: 26 Apr 2008, 19:28:40 UTC

I also got a validate error on this workunit upon exit:
Task ID 158556233
Name 1bm8__CONTROL_ABRELAX_040808_FRAGS_SAVE_ALL_OUT_-1bm8_-__3054_13655_0
Workunit 144818041

Outcome Validate error
Client state Done
Exit status 0 (0x0)

Computer ID 720246
CPU time 5204.234
stderr out

<core_client_version>6.1.0</core_client_version>
<![CDATA[
<stderr_txt>

</stderr_txt>
]]>

Validate state Invalid
Claimed credit 16.3129116236015
Granted credit 0
application version 5.96
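The empty stderr section above matches the moderator's observation that no models or shutdown messages came back. As a rough sketch, a check like the following could flag results whose stderr text is blank when pasted from the task page. The function name is hypothetical, and the tag layout is assumed only from the snippet shown in this post.

```python
import re


def stderr_is_empty(stderr_out):
    """Return True when the <stderr_txt> element is missing or holds only whitespace."""
    match = re.search(r"<stderr_txt>(.*?)</stderr_txt>", stderr_out, re.DOTALL)
    if match is None:
        return True
    return match.group(1).strip() == ""


# The stderr block exactly as it appears on the task page quoted above.
SAMPLE = """<core_client_version>6.1.0</core_client_version>
<![CDATA[
<stderr_txt>

</stderr_txt>
]]>"""

print("stderr is empty:", stderr_is_empty(SAMPLE))
```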


Messages in BOINC relating to this workunit
4/26/2008 4:50:14 AM [file_xfer] Finished upload of file 1bm8__CONTROL_ABRELAX_040808_FRAGS_SAVE_ALL_OUT_-1bm8_-__3054_13655_0_0
4/26/2008 4:50:11 AM Requesting 923 seconds of new work
4/26/2008 4:50:11 AM Sending scheduler request: To fetch work
4/26/2008 4:50:11 AM [file_xfer] Started upload of file 1bm8__CONTROL_ABRELAX_040808_FRAGS_SAVE_ALL_OUT_-1bm8_-__3054_13655_0_0
4/26/2008 4:50:08 AM Computation for task 1bm8__CONTROL_ABRELAX_040808_FRAGS_SAVE_ALL_OUT_-1bm8_-__3054_13655_0 finished
4/26/2008 3:17:58 AM Resuming task 1bm8__CONTROL_ABRELAX_040808_FRAGS_SAVE_ALL_OUT_-1bm8_-__3054_13655_0 using rosetta_beta version 596
4/26/2008 1:02:23 AM Starting task 1bm8__CONTROL_ABRELAX_040808_FRAGS_SAVE_ALL_OUT_-1bm8_-__3054_13655_0 using rosetta_beta version 596
4/26/2008 1:02:23 AM Starting 1bm8__CONTROL_ABRELAX_040808_FRAGS_SAVE_ALL_OUT_-1bm8_-__3054_13655_0



This is a dump of the progress.csv file saved by the BoincView program
"checkpoint";"progress";"final"
0,000000;0,001243000000000000;12511,249579
0,000000;0,005456000000000000;12437,193911
0,000000;0,010158000000000000;12350,602210
0,000000;0,014308000000000000;12278,968929
114,218800;0,018440000000000002;12321,108354
114,218800;0,022591000000000000;12248,827591
114,218800;0,027564999999999999;12157,387858
114,218800;0,031726999999999998;12085,339312
234,687500;0,035864000000000000;12134,345702
234,687500;0,040564000000000003;12050,401848
234,687500;0,044713999999999997;11979,128522
355,171900;0,048876000000000003;12028,541468
355,171900;0,053033999999999998;11958,104207
399,750000;0,055516000000000003;11957,668503
399,750000;0,059568000000000003;11888,966702
399,750000;0,064266000000000004;11806,275324
399,750000;0,068403000000000006;11736,245434
518,437500;0,071981000000000003;11794,756464
518,437500;0,076690999999999995;11712,366996
518,437500;0,080836000000000005;11642,710868
518,437500;0,084972000000000006;11573,376708
635,187500;0,089665999999999996;11608,803187
635,187500;0,093812999999999994;11539,685461
635,187500;0,097956000000000001;11470,857822
635,187500;0,102107000000000003;11402,872760
755,546900;0,107080999999999996;11435,485336
755,546900;0,111213000000000006;11367,364484
755,546900;0,115358000000000002;11300,111292
755,546900;0,120055999999999996;11219,418677
875,828100;0,124203999999999995;11271,897823
875,828100;0,128333000000000003;11204,573984
875,828100;0,132480999999999988;11137,988765
875,828100;0,136624999999999996;11070,718500
996,078100;0,141594999999999999;11105,266418
996,078100;0,145726999999999995;11039,499587
996,078100;0,149886999999999992;10972,560735
996,078100;0,153874000000000011;10906,963752
1111,250000;0,157760000000000011;10959,173480
1111,250000;0,161902999999999991;10893,875826
1111,250000;0,166034999999999988;10828,110375
1230,250000;0,170973999999999987;10863,616035
1230,250000;0,175034999999999996;10800,229630
1230,250000;0,179140999999999995;10735,454829
1230,250000;0,183846000000000009;10658,859751
1349,766000;0,187994999999999995;10714,118264
1349,766000;0,192146000000000011;10649,916176
1349,766000;0,197105000000000002;10567,657606
1349,766000;0,201243000000000005;10504,906775
1470,109000;0,205947999999999992;10549,039354
1470,109000;0,210088999999999998;10486,644684
1470,109000;0,214232000000000006;10423,575319
1470,109000;0,219204000000000010;10342,314693
1590,406000;0,223345999999999989;10399,174927
1590,406000;0,227484999999999993;10336,741328
1590,406000;0,232187000000000004;10261,934463
1590,406000;0,235768000000000005;10209,201310
1707,266000;0,240352000000000010;10253,543480
1707,266000;0,244507000000000002;10190,797747
1707,266000;0,248654999999999987;10128,371301
1707,266000;0,252800000000000025;10066,910270
1827,750000;0,256931000000000020;10125,520876
1827,750000;0,261902999999999997;10046,761469
1827,750000;0,266052999999999984;9985,791890
1948,031000;0,270195000000000018;10044,637714
1948,031000;0,274888999999999994;9972,253914
1948,031000;0,279021000000000019;9912,090015
1948,031000;0,283173000000000008;9851,076226
1948,031000;0,287320000000000020;9790,321715
1948,031000;0,292296000000000000;9713,951696
1948,031000;0,296445000000000014;9653,572776
2139,516000;0,300495000000000012;9786,295746
2139,516000;0,305161000000000016;9716,608609
2139,516000;0,309306000000000025;9656,859427
2139,516000;0,313464000000000020;9597,066030
2277,625000;0,317605999999999999;9675,797344
2277,625000;0,320826999999999973;9630,570754
2277,625000;0,325793999999999972;9555,577625
2277,625000;0,329957000000000000;9496,450187
2409,109000;0,334110000000000018;9569,773198
2409,109000;0,338795999999999986;9500,890493
2409,109000;0,342938000000000021;9442,603612
2409,109000;0,347096000000000016;9384,276497
2409,109000;0,351231000000000015;9327,081187
2535,453000;0,355366999999999988;9395,750079
2535,453000;0,360327999999999982;9323,510849
2535,453000;0,364472999999999991;9266,088772
2535,453000;0,368601999999999985;9209,105151
2659,422000;0,373298999999999992;9266,485472
2659,422000;0,377450000000000008;9209,583066
2659,422000;0,381599000000000022;9152,847135
2780,219000;0,385736000000000023;9217,301581
2780,219000;0,390705000000000024;9146,645718
2780,219000;0,394855000000000012;9090,510973
2780,219000;0,398876000000000008;9036,800781
2780,219000;0,402498000000000022;8986,251741
2780,219000;0,405322999999999989;8948,048442
2926,313000;0,407532999999999979;9064,446405
2926,313000;0,409997999999999974;9031,328709
2926,313000;0,413111999999999979;8990,581619
2926,313000;0,416210999999999998;8947,102444
3018,938000;0,419648000000000021;8994,221712
3018,938000;0,423196999999999990;8947,068444
3018,938000;0,426559999999999995;8902,950711
3018,938000;0,430321000000000009;8851,577455
3018,938000;0,433493999999999990;8809,725819
3131,891000;0,436993999999999994;8875,590650
3131,891000;0,440004000000000006;8836,171371
3131,891000;0,443147000000000013;8795,669478
3131,891000;0,446043999999999996;8758,158888
3234,969000;0,449884999999999979;8809,317871
3234,969000;0,453079000000000010;8768,874941
3234,969000;0,457197999999999993;8715,822644
3234,969000;0,461953000000000002;8652,595887
3350,422000;0,465017000000000014;8729,942994
3350,422000;0,469115000000000004;8677,738334
3350,422000;0,473905999999999994;8614,631958
3350,422000;0,477883999999999975;8564,336255
3350,422000;0,482005000000000017;8512,806769
3475,828000;0,486783999999999994;8575,493850
3475,828000;0,490920000000000023;8524,187878
3475,828000;0,494487999999999983;8479,060470
3475,828000;0,497240000000000015;8445,109848
3587,484000;0,500852999999999993;8510,073533
3587,484000;0,504832999999999976;8461,145519
3587,484000;0,508932000000000051;8410,549971
3587,484000;0,513750000000000040;8349,301334
3587,484000;0,517199999999999993;8306,532896
3728,750000;0,520529000000000019;8407,253884
3728,750000;0,524298999999999960;8361,761526
3728,750000;0,527799000000000018;8318,783576
3728,750000;0,531159000000000048;8276,934508
3830,641000;0,534154000000000018;8342,592666
3830,641000;0,537638999999999978;8300,953693
3830,641000;0,540418999999999983;8266,965035
3830,641000;0,543822000000000028;8224,574291
3830,641000;0,546633999999999953;8191,169200
3830,641000;0,549512000000000000;8156,856504
3961,344000;0,552389000000000019;8253,926008
3961,344000;0,556257000000000001;8206,173800
3961,344000;0,559416000000000024;8168,844447
3961,344000;0,562966999999999995;8127,309310
4068,531000;0,565806999999999949;8201,162112
4068,531000;0,569027999999999978;8161,892008
4068,531000;0,571864000000000039;8128,761102
4068,531000;0,574938999999999978;8093,334864
4068,531000;0,578031999999999990;8057,439356
4175,297000;0,581879000000000035;8117,964557
4175,297000;0,584860999999999964;8083,982416
4175,297000;0,588033999999999946;8047,296333
4175,297000;0,591087000000000029;8012,371806
4272,578000;0,594423000000000035;8070,065227
4272,578000;0,597611000000000003;8034,018564
4272,578000;0,600241000000000024;8004,102195
4272,578000;0,603978000000000015;7960,470309
4272,578000;0,607208999999999999;7924,037420
4396,969000;0,610380000000000034;8012,295997
4396,969000;0,613418000000000019;7978,090869
4396,969000;0,616426000000000029;7943,335040
4396,969000;0,619654999999999956;7907,129802
4396,969000;0,622712999999999961;7873,325579
4493,688000;0,625920000000000032;7934,388091
4493,688000;0,629507000000000039;7893,173682
4493,688000;0,632363000000000008;7861,699849
4493,688000;0,635592000000000046;7826,295026
4589,578000;0,638634999999999953;7888,814723
4589,578000;0,642260999999999971;7847,684801
4589,578000;0,645268999999999982;7814,953173
4589,578000;0,648317999999999950;7782,036811
4688,656000;0,651367000000000029;7847,851019
4688,656000;0,655021999999999993;7807,190226
4688,656000;0,658078999999999970;7774,136456
4688,656000;0,660490000000000022;7748,328320
4688,656000;0,663583000000000034;7715,097907
4785,844000;0,666638999999999982;7779,627919
4785,844000;0,670308000000000015;7739,480870
4785,844000;0,673113999999999990;7709,679306
4785,844000;0,676073999999999953;7678,410319
4880,297000;0,680069000000000035;7729,668982
4880,297000;0,683382999999999963;7694,854572
4880,297000;0,687182999999999988;7655,085088
4880,297000;0,690989999999999993;7615,263812
4994,203000;0,695305999999999980;7683,132345
4994,203000;0,698628000000000027;7648,581954
4994,203000;0,701509999999999967;7618,641911
4994,203000;0,704983000000000026;7583,302573
5101,156000;0,708493999999999957;7654,143316
5101,156000;0,712555999999999967;7611,450274
5101,156000;0,716211000000000042;7574,153202
5101,156000;0,719249000000000027;7543,670210
5101,156000;1,000000000000000000;5101,156000
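The dump above uses semicolons as separators and commas as decimal marks, so it needs a little care to load. Below is a minimal sketch that parses it and finds the largest single jump in the "progress" column. The function names are made up for illustration, and the meaning of the three columns is not documented in this thread, so the code treats them only as numbers.

```python
import csv
import io


def parse_progress(text):
    """Parse semicolon-separated rows whose numbers use a decimal comma."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    header = next(reader)
    rows = []
    for record in reader:
        if not record:
            continue  # skip blank lines
        rows.append(tuple(float(field.replace(",", ".")) for field in record))
    return header, rows


def largest_progress_jump(rows):
    """Return (jump, row_index) for the biggest increase in the progress column."""
    jumps = [
        (later[1] - earlier[1], index + 1)
        for index, (earlier, later) in enumerate(zip(rows, rows[1:]))
    ]
    return max(jumps)


# Header plus the last three rows of the dump above.
SAMPLE = "\n".join([
    '"checkpoint";"progress";"final"',
    "5101,156000;0,716211000000000042;7574,153202",
    "5101,156000;0,719249000000000027;7543,670210",
    "5101,156000;1,000000000000000000;5101,156000",
])

header, rows = parse_progress(SAMPLE)
jump, index = largest_progress_jump(rows)
print(f"largest jump: {jump:.6f} at data row {index}")
```

On the full dump, the last row is the one that stands out: progress goes from about 0.72 straight to 1.0 with no intermediate samples.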



Shoikan

Joined: 4 Apr 06
Posts: 14
Credit: 180,211
RAC: 0
Message 52754 - Posted: 27 Apr 2008, 10:43:44 UTC

What I mean is that I'm getting one of these errors almost every day, and they don't look like compute errors: the WUs run to completion, then error out when they are uploaded.

It's a waste of cycles methinks.
Shoikan

Joined: 4 Apr 06
Posts: 14
Credit: 180,211
RAC: 0
Message 52759 - Posted: 27 Apr 2008, 19:17:50 UTC

Well, that's it: two in a row today convinced me to leave this project. It's a pity though, because I think the science behind it is very promising. But something has to be done about the software.

I'll gladly reattach when the serious issues with WU validation are addressed.

Regards

PS: Sorry about the lousy English



©2024 University of Washington
https://www.bakerlab.org