Message boards : Number crunching : Computation Error
Author | Message |
---|---|
wk4536 Send message Joined: 18 Mar 14 Posts: 5 Credit: 14,883 RAC: 0 |
I recently checked the outcome of the WUs I have completed, and a ton of them were never validated because of compute errors (about a 10% success rate on average). I am not sure why this is the case. Could anyone give me some feedback? Otherwise I will probably just move my support to other projects such as World Community Grid. I don't have the most powerful machines, but it's frustrating to see that most of the effort isn't actually resulting in progress for the project. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2124 Credit: 41,226,850 RAC: 11,023 |
Quote: "So I have recently checked the success of some of the WU I have completed and I have a ton of them that are never validated due to compute error (average about 10% success rate) and am not sure why this is the case. Could anyone provide me with some feedback because I will probably just move my support towards other projects like the world community grid. I don't have the most powerful machines but its frustrating to see that most of the efforts made aren't actually resulting in progress for the project" All your tasks seem to run OK, but those on your two Android devices seem to come up with file transfer errors. Do you see errors reported at your end? Is there an antivirus or firewall involved, or a restriction on mobile connectivity somewhere? I mention this because I'm having file upload problems from my laptop at the moment, even though all my other devices are working fine (and downloads are OK too). |
wk4536 Send message Joined: 18 Mar 14 Posts: 5 Credit: 14,883 RAC: 0 |
I don't see any error messages when I check my Android devices. One of them is an old Galaxy S3 that I leave plugged in and check every couple of days just to see if it's still crunching, and the other is my S6, which crunches when I charge it. I honestly did not think there were any issues until I thought to check my task list here on this site and saw that most of the tasks never actually earned any credits. I never installed any firewalls or anything on my phones like I do on my computer, and I don't think I know how to restrict the mobile connection. I live downtown in a big city on the East Coast, and one of the devices is completely stationary (it doesn't leave the room) in a building with Wi-Fi. It is possible that the Wi-Fi had occasional flickers, but I don't think that would explain the large number of errors seen. |
caffeineyellow5 Send message Joined: 12 Jun 16 Posts: 3 Credit: 4,876 RAC: 0 |
I have 26 successful results out of 111 on my ZTE Maven. It runs continuously, plugged in and on Wi-Fi, and has continual upload errors after seemingly good finished tasks. It is a "naked" phone, since the only app I have installed after buying it was BOINC. This is the phone: https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=2737228 This is the task list: https://boinc.bakerlab.org/rosetta/results.php?hostid=2737228 After getting: `====================================================== DONE :: 2 starting structures 2279.05 cpu seconds This process generated 2 decoys from 2 attempts ====================================================== BOINC :: WS_max 0 BOINC :: BOINC support services shutting down cleanly ... 01:54:19 (15023): called boinc_finish(0) </stderr_txt>` I get a message: `<message> upload failure: <file_xfer_error> <file_name>db_set3_7res_android_d_c.20.10_0001_SAVE_ALL_OUT_344080_2094_1_0</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> </message>` Please help me fix this so that I can get credit for every successful task and further the work of Rosetta@Home. Thank you - Mike |
Luigi R. Send message Joined: 7 Feb 14 Posts: 39 Credit: 2,045,527 RAC: 0 |
It looks like there is no db_set3_7res_android_d_c.20.10_0001_SAVE_ALL_OUT_344080_2094_1_0 file. Maybe no read permissions for some reason? |
caffeineyellow5 Send message Joined: 12 Jun 16 Posts: 3 Credit: 4,876 RAC: 0 |
Quote: "It looks like there is no db_set3_7res_android_d_c.20.10_0001_SAVE_ALL_OUT_344080_2094_1_0 file. Maybe no read permissions for some reason?" How can I fix this error then? |
wk4536 Send message Joined: 18 Mar 14 Posts: 5 Credit: 14,883 RAC: 0 |
Quote: "It looks like there is no db_set3_7res_android_d_c.20.10_0001_SAVE_ALL_OUT_344080_2094_1_0 file. Maybe no read permissions for some reason?" I would also like to know how to fix this, as this is currently the project that gets me most excited about BOINC. I've suspended my work on this and shifted to other projects in the meantime. |
Luigi R. Send message Joined: 7 Feb 14 Posts: 39 Credit: 2,045,527 RAC: 0 |
Sorry, I don't know. It's weird some successes occurred too. |
wk4536 Send message Joined: 18 Mar 14 Posts: 5 Credit: 14,883 RAC: 0 |
Quote: "Sorry, I don't know. It's weird some successes occurred too." Hmm, maybe it would be better to shift support to other projects until it's sorted out. At least that way work units don't go to waste. |
caffeineyellow5 Send message Joined: 12 Jun 16 Posts: 3 Credit: 4,876 RAC: 0 |
Quote: "Sorry, I don't know. It's weird some successes occurred too." When I look at the work units and their history on other systems, I notice that my failures are preceded or followed by other people's failures on the same work units. Perhaps the work units that can run on ARM7 systems all have a 25% success rate in general, and failure is part of the plan for 75% of them? The ones that fail for me seem to fail for many systems across different phone models, OS versions, etc. Is it possible that Rosetta@Home has a built-in 75% failure rate for certain types of work units, and that the failures and the successes both contribute to the success of the project as a whole? For example, a failure might prove something about the work inside the work unit that leads to conclusions about the work and directions of research. I'm just throwing it out there as an open question, since I don't know at all and the thought occurred to me. If that is true, then none of the work, failure or success, is wasted; it is just uncredited, and in the long term credit means nothing to the work being done, as long as it gets done. |
wk4536 Send message Joined: 18 Mar 14 Posts: 5 Credit: 14,883 RAC: 0 |
That would be true if that was indeed what was happening. I just think it actually means that the result never made it back. Based on the outcomes explanation screen: "Client error: The task was sent to a computer and an error occurred. Success: A computer completed and reported the task successfully." I thought it was also a mobile device issue, but the majority of the projects sent to my Vaio also ended in a compute error. Oh well, it's not really like my individual contribution was going to make or break anything. If the work being done does not produce the results they want, I don't think it would come up as a compute error, since they would have received data that they can use to modify their design or whatever. I don't necessarily care that my credit numbers aren't going up, since they don't mean anything outside of the app, but I do like seeing changes so that I know that my computer and phone are making contributions, as small as they seem. |
catavalon21 Send message Joined: 22 Oct 06 Posts: 1 Credit: 898,035 RAC: 0 |
I don't think this is entirely the reason. I have a very large percentage of Android tasks ending in compute failure, and they can't have "not made it back", as the deadline hasn't passed yet. It's possible the results made it back with an error, but as another post said earlier, tasks that fail appear to fail consistently across several users. Hopefully it'll get sorted out, but I will crunch something else in the interim. F@H has an Android client, though not on BOINC. |
Dr. Merkwürdigliebe Send message Joined: 5 Dec 10 Posts: 81 Credit: 2,657,273 RAC: 0 |
|
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Quote: "A whole bunch of compute errors on FFD_something jobs..." Same here. All of my FFD_ tasks failed rather quickly. I'll shoot an email to DK just in case they have not already noticed. Rosetta Moderator: Mod.Sense |
krypton Volunteer moderator Project developer Project scientist Send message Joined: 16 Nov 11 Posts: 108 Credit: 2,164,309 RAC: 0 |
Thanks for the alert you guys! I've contacted the responsible scientist. Hopefully we can avoid this in the future! |
jl91569 Send message Joined: 5 Jul 16 Posts: 1 Credit: 9,049 RAC: 0 |
Hey there. Just wanted to let you know that FFD tasks seem to be failing on my machine after a few seconds as well. Here's the error log: `ERROR: unrecognized residue SUG` `ERROR:: Exit from: ..\..\..\src\core\io\pose_from_sfr\PoseFromSFRBuilder.cc line: 1030` `BOINC:: Error reading and gzipping output datafile: default.out` `called boinc_finish` Thanks. |
rjs5 Send message Joined: 22 Nov 10 Posts: 273 Credit: 23,051,657 RAC: 8,071 |
Quote: "Thanks for the alert you guys! I've contacted the responsible scientist." Seems like someone could write a script that runs "hourly" to automatically detect errors and publish that to the list of scientists ... or is that what already happens? 8-) |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1994 Credit: 9,622,132 RAC: 9,522 |
Quote: "Seems like someone could write a script that runs "hourly" to automatically detect errors and publish that to the list of scientists .." +1 Great idea!! |
sinspin Send message Joined: 30 Jan 06 Posts: 29 Credit: 6,574,585 RAC: 0 |
Quote: "Seems like someone could write a script that runs "hourly" to automatically detect errors and publish that to the list of scientists ... or is that what already happens? 8-)" +1 I hope it comes before hell freezes over. |
rjs5 Send message Joined: 22 Nov 10 Posts: 273 Credit: 23,051,657 RAC: 8,071 |
Quote: "Seems like someone could write a script that runs "hourly" to automatically detect errors and publish that to the list of scientists ... or is that what already happens? 8-)" ... and boboviz ... The script is probably 5 lines long, and the results could be sorted into a "scientist quality" shame ranking to improve the quality of submissions. Scientist "rankings" could be a task "success percentage" or just ... my prank suggestion ... 1. Baker 2. Graduate 3. GED 4. WTF |
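
The hourly error check suggested above could look roughly like the sketch below. Nothing here comes from the Rosetta@Home server code: the function names, the `(task_name, outcome)` input format, the rule that a batch is the text before the first underscore, and the thresholds are all assumptions made for illustration. Fetching the results and notifying anyone are left out entirely.

```python
from collections import Counter

def batch_of(task_name):
    # Assumption: the batch is the text before the first underscore,
    # e.g. "FFD" for a task named "FFD_something".
    return task_name.split("_", 1)[0]

def flag_failing_batches(results, threshold=0.5, min_tasks=10):
    """Return {batch: failure_rate} for batches that fail too often.

    results   -- iterable of (task_name, outcome) pairs, where outcome is
                 "success" or anything else (counted as a failure)
    threshold -- minimum failure rate for a batch to be flagged
    min_tasks -- ignore batches with fewer reported tasks than this
    """
    totals = Counter()
    errors = Counter()
    for task_name, outcome in results:
        batch = batch_of(task_name)
        totals[batch] += 1
        if outcome != "success":
            errors[batch] += 1

    flagged = {}
    for batch, total in totals.items():
        rate = errors[batch] / total
        if total >= min_tasks and rate >= threshold:
            flagged[batch] = rate
    return flagged

# Made-up sample data: one batch that always fails, one that mostly works.
sample = [("FFD_example_%d" % i, "error") for i in range(12)]
sample += [("db_example_%d" % i, "success") for i in range(9)]
sample += [("db_example_9", "error")]

flagged = flag_failing_batches(sample)
print(flagged)
```

The `min_tasks` cutoff is there so that a batch with only one or two reported results does not trigger an alert. A scheduler could run this once an hour and pass the flagged batches to whoever submitted them.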
©2024 University of Washington
https://www.bakerlab.org