Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
I have not heard of others reporting BSODs and suspecting R@h as the root cause. I think it more likely that, since R@h is probably running in the background all of the time, R@h had a number of files open and unzipped when a BSOD occurred. Each task that runs handles its own files, so if there were storage problems, they should clear out as new tasks come in. I believe the underlying file structures used by SETI are probably not as extensive and complex as those used by R@h, which may be part of why you didn't see similar problems there. I'd suggest running the integrity tests several times in the coming days to see if any further corrupted files crop up. Since the operating system handles the file storage, orphaned fragments and the like are the type of thing that applications should not be able to cause. I mean the operating system should enforce such things, so even an erroneous application should not be able to cause such corruption. Rosetta Moderator: Mod.Sense |
David Duvall Send message Joined: 1 Oct 05 Posts: 1 Credit: 20,797,051 RAC: 334 |
"I used multiple tests as most of my BSOD errors repeatedly kept pointing to the drives as the reason for the system crashing." In the past 6 months I had a similar problem with 2 separate computers. After days of investigation I figured out that I had a marginal memory stick in one machine that would pass RAM testing but, after extended BOINC use would give me errors that eventually caused the blue screen of death. On the other machine I found that my North Bridge chip was getting excessively hot and causing the same type of symptom. I used zip-ties to place an extra fan that blows directly on the North Bridge. Soon, I will have no choice but to replace the motherboard. On my father's machine I actually replaced the SATA cable with a shorter cable and his BSOD errors stopped. On a side note, there is a BOINC setting change that seemed to help me extend the life of my hard drives. Under BOINC computing preferences - disk and memory usage tab you will find, "Tasks checkpoint to disk at most every ___ seconds" If you change that setting from the default 60 seconds to say 120 seconds you should have a lot less disk activity. I set my systems at 360 and have had no problems. I wish you good luck on finding the solution to your problem and please let us know what you determine. |
markj Send message Joined: 21 Jun 08 Posts: 6 Credit: 18,060,229 RAC: 0 |
job "foldit:2000402_1078_fold_and_dock_SAVE_ALL_OUT_245640_26607_0" appears to be "eternal", has been running for almost 10 hours with no sign of progress for the last hour. Any recommendation? |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
job "foldit:2000402_1078_fold_and_dock_SAVE_ALL_OUT_245640_26607_0" appears to be "eternal", has been running for almost 10 hours with no sign of progress for the last hour. I'd suggest reviewing the task's properties to confirm whether the recorded CPU time is increasing (the BOINC Manager will show elapsed time increasing even if the task is not being given CPU). If it is not, a higher-priority task may be consuming the CPU. The "watchdog" will clean the task up if it runs for more than 4 hours past your target runtime preference, so I'd let it run normally. I suspect your system is busy, so elapsed time is increasing faster than CPU time. The watchdog compares against the CPU time you see in the task's properties. Rosetta Moderator: Mod.Sense |
Scott Probst Send message Joined: 1 Apr 09 Posts: 1 Credit: 1,338,447 RAC: 0 |
I'm wondering if there are some problems currently. All but very few units are ending with 'computation error' messages. I've taken off and re-downloaded the Rosetta project but no change. thanks. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
I'm wondering if there are some problems currently. All but very few units are ending with 'computation error' messages. I've taken off and re-downloaded the Rosetta project but no change. Looking at your tasks, several have been successfully completed by other machines. Most show issues with missing files, which are sometimes being shown as download errors, and other times showing in the log when reported as a compute error. Have you checked your antivirus software? Or firewall settings? It seems your machine is often not getting all of the pieces that allow a task to run properly. Rosetta Moderator: Mod.Sense |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Hi. I'm having problems getting new tasks. Ready to send 37 |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Hi. Looks like more tasks now. Rosetta Moderator: Mod.Sense |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
NO new tasks again! Total queued jobs: 0 |
Old man Send message Joined: 10 Nov 07 Posts: 25 Credit: 1,122,372 RAC: 0 |
NO new tasks again! http://boincstats.com/en/stats/14/project/detail/user A lot of active users again. Maybe Charity Engine is here again. |
JohnH Send message Joined: 25 Mar 13 Posts: 43 Credit: 2,319,355 RAC: 0 |
Not getting any workunits either. All servers "green" but only one task available ... but not for me apparently. Anybody else copy this state? |
rochester new york Send message Joined: 2 Jul 06 Posts: 2842 Credit: 2,020,043 RAC: 0 |
Not getting any workunits either. All servers "green" but only one task available ... but not for me apparently. yes and likely no more until Monday morning |
Deepak Chandra Send message Joined: 19 Nov 14 Posts: 1 Credit: 819,919 RAC: 0 |
Hi, it has been more than 24 hours since my computer was last assigned new tasks. Is something wrong? From the log: 27/07/2015 18:53:58 | rosetta@home | Sending scheduler request: To fetch work. 27/07/2015 18:53:58 | rosetta@home | Requesting new tasks for CPU and AMD/ATI GPU 27/07/2015 18:54:01 | rosetta@home | Scheduler request completed: got 0 new tasks 27/07/2015 18:54:01 | rosetta@home | No work sent |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2146 Credit: 41,570,180 RAC: 8,210 |
It seems like a few tasks were made available this morning - I got 4, yay - but they've all been swallowed up. Everything's gone red on the Server-status page at this precise moment, so I assume people are still working on it. Hopefully it all gets sorted soon. I've got a few other projects to run in the short term. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2146 Credit: 41,570,180 RAC: 8,210 |
Another 200,000 tasks seem to have come through, of which I got 5 - yay - not much for an 8-core machine. All swallowed up again. Slowly, slowly... |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2146 Credit: 41,570,180 RAC: 8,210 |
Not sure about you, but I'm getting everything Boinc is asking for while I force my other projects to run down |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
I started getting this message now. Thu 30 Jul 2015 16:08:19 AEST | rosetta@home | Scheduler request completed Thu 30 Jul 2015 16:08:19 AEST | rosetta@home | Server error: can't attach shared memory I can't report any of your tasks. |
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,860,059 RAC: 3,073 |
Mine isn't uploading either... |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
It appears there has been another surge of users from Charity Engine. There are over a million tasks out being worked on right now. It looks like there are plenty available to send right now, but as they all start reporting back, I'd imagine some upload congestion is likely as well. The BOINC Manager will retry both requests for more work and any uploads that run into problems. One way to make the most of each task is to bump up your preferred work unit runtime. This is a setting in the Rosetta@home preferences, which are updated directly from the website. Be sure to set the value for the venue your host is attached to if you have more than one. Also, I suggest you make changes gradually so the BOINC Manager has time to adjust to the longer runtime and request a roughly correct amount of work. So it is best to adjust when you have a small number of days of work on hand. Rosetta Moderator: Mod.Sense |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Uploads are timing out for rosetta, all other projects are o.k. Downloads from rosetta are working for now! Sun 30 Aug 2015 08:21:17 AEST | | Project communication failed: attempting access to reference site Sun 30 Aug 2015 08:21:17 AEST | rosetta@home | Temporarily failed upload of gr081015_HEEHheeh_go3_nods_new_heeh_12.bp_r3_pass_20150721035246_fragments_fold_SAVE_ALL_OUT_277850_62_1_0: transient HTTP error Sun 30 Aug 2015 08:21:17 AEST | rosetta@home | Backing off 6 min 54 sec on upload of gr081015_HEEHheeh_go3_nods_new_heeh_12.bp_r3_pass_20150721035246_fragments_fold_SAVE_ALL_OUT_277850_62_1_0 Sun 30 Aug 2015 08:21:17 AEST | rosetta@home | Temporarily failed upload of FFD__8091b9a093b6f6d8e0c04cb95e4c345c_abinitioDocking_15_08_09_29_22_globalDocking_7_SAVE_ALL_OUT_301200_1_0_0: transient HTTP error Sun 30 Aug 2015 08:21:17 AEST | rosetta@home | Backing off 4 min 14 sec on upload of FFD__8091b9a093b6f6d8e0c04cb95e4c345c_abinitioDocking_15_08_09_29_22_globalDocking_7_SAVE_ALL_OUT_301200_1_0_0 Sun 30 Aug 2015 08:21:17 AEST | rosetta@home | Started upload of fd_t009___robetta_IGNORE_THE_REST_03_09_292878_4889_0_0 Sun 30 Aug 2015 08:21:19 AEST | | Internet access OK - project servers may be temporarily down. |
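
For anyone puzzled by the "Backing off 6 min 54 sec" and "Backing off 4 min 14 sec" lines in the log above: they look like a client waiting a randomized delay before retrying each failed upload, so that many hosts do not hit a congested server at the same moment. Below is a minimal sketch of randomized exponential backoff in Python. It is not BOINC's actual algorithm; the function names, the 60-second base, and the 4-hour cap are all assumptions picked for illustration.

```python
import random


def backoff_delay(failures, base=60.0, cap=4 * 3600.0, rng=random):
    """Return a randomized retry delay in seconds.

    `failures` is the number of consecutive failed attempts (>= 1).
    `base` and `cap` are illustrative values, not BOINC's real settings.
    """
    if failures < 1:
        raise ValueError("failures must be >= 1")
    # The upper bound doubles with each failure; the exponent is clamped
    # so the multiplication cannot overflow, and the result is capped.
    exponent = min(failures - 1, 32)
    upper = min(cap, base * 2 ** exponent)
    # Draw the delay from the upper half of the range so that clients
    # which failed together do not all retry at the same instant.
    return rng.uniform(upper / 2, upper)


def format_delay(seconds):
    """Format a delay the way the log lines above print it."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes} min {secs} sec"


if __name__ == "__main__":
    rng = random.Random()
    for n in range(1, 6):
        print(f"failure {n}: backing off {format_delay(backoff_delay(n, rng=rng))}")
```

With a scheme like this, two uploads that fail in the same second can still end up with different delays, which would match the two different back-off times in the log. The practical upshot is the same as in the moderator's post: leave the client alone and it will retry on its own.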
©2024 University of Washington
https://www.bakerlab.org