Problems and Technical Issues with Rosetta@home

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 77945 - Posted: 15 Feb 2015, 2:59:10 UTC

I have not heard of others reporting BSODs and suspecting R@h as the root cause. I think it is more likely that, since R@h is probably running in the background all of the time, it had a number of files open and unzipped when the BSOD occurred.

Each task that runs handles its own files. So if there were storage problems, they should clear out as new tasks come in. I believe the underlying file structures used by SETI are probably not as extensive and complex as those used by R@h. So that may be part of why you didn't see similar problems there.

I'd suggest running the integrity tests several times in the coming days to see whether any further corrupted files crop up. Since the operating system handles the file storage, orphaned fragments and the like are the type of thing that applications should not be able to cause. I mean the operating system should enforce such things, so even an erroneous application should not be able to cause such corruption.
Rosetta Moderator: Mod.Sense

David Duvall
Joined: 1 Oct 05
Posts: 1
Credit: 20,601,984
RAC: 1,355
Message 77973 - Posted: 24 Feb 2015, 1:56:02 UTC - in response to Message 77942.  

"I used multiple tests as most of my BSOD errors repeatedly kept pointing to the drives as the reason for the system crashing."

In the past 6 months I had a similar problem with 2 separate computers. After days of investigation I figured out that I had a marginal memory stick in one machine that would pass RAM testing but, after extended BOINC use, would give me errors that eventually caused the blue screen of death.

On the other machine I found that my North Bridge chip was getting excessively hot and causing the same type of symptom. I used zip-ties to place an extra fan that blows directly on the North Bridge. Soon, I will have no choice but to replace the motherboard.

On my father's machine I actually replaced the SATA cable with a shorter cable and his BSOD errors stopped.

On a side note, there is a BOINC setting change that seemed to help me extend the life of my hard drives. Under BOINC computing preferences, on the disk and memory usage tab, you will find "Tasks checkpoint to disk at most every ___ seconds". If you change that setting from the default 60 seconds to, say, 120 seconds, you should have a lot less disk activity. I set my systems at 360 and have had no problems.
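As a rough illustration of the checkpoint-interval point above, the arithmetic can be sketched in Python. The function name is made up for this example, and the figures are only an upper bound: the preference says "at most every N seconds", and how often a task really checkpoints depends on the application.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_checkpoints_per_day(interval_seconds: int) -> int:
    """Upper bound on checkpoint writes for one task running all day.

    Hypothetical helper: assumes the task would checkpoint as often as
    the "at most every N seconds" preference allows.
    """
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return SECONDS_PER_DAY // interval_seconds


# Intervals mentioned in the post: the 60 s default, 120 s, and 360 s.
for interval in (60, 120, 360):
    writes = max_checkpoints_per_day(interval)
    print(f"{interval:>3} s interval: at most {writes} checkpoints per task per day")
```

Under these assumptions, going from 60 to 360 seconds cuts the ceiling on checkpoint writes by a factor of six. The trade-off is that more work can be lost if a task is interrupted between checkpoints.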

I wish you good luck on finding the solution to your problem and please let us know what you determine.


markj
Joined: 21 Jun 08
Posts: 6
Credit: 18,060,229
RAC: 0
Message 78002 - Posted: 6 Mar 2015, 14:51:50 UTC

job "foldit:2000402_1078_fold_and_dock_SAVE_ALL_OUT_245640_26607_0" appears to be "eternal", has been running for almost 10 hours with no sign of progress for the last hour.
Any recommendation?

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 78003 - Posted: 6 Mar 2015, 16:40:43 UTC - in response to Message 78002.  

job "foldit:2000402_1078_fold_and_dock_SAVE_ALL_OUT_245640_26607_0" appears to be "eternal", has been running for almost 10 hours with no sign of progress for the last hour.
Any recommendation?


I'd suggest reviewing the task's properties to confirm whether the recorded CPU time is increasing (the BOINC Manager will show time elapsed increasing, even if the task is not being given CPU). If it is not, a higher-priority task may be consuming the CPU.

The "watchdog" will clean it up if it runs for more than 4 hours past your target runtime preference. So, I'd let it run normally. I suspect your system is busy and so elapsed time is increasing more than CPU time. The watchdog is comparing the CPU time you see in the task's properties.
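The rule described above can be sketched roughly as follows. This is only an illustration of the comparison as stated in the post (CPU time against target runtime plus 4 hours), not the actual watchdog code; the function names and the example numbers are hypothetical.

```python
WATCHDOG_GRACE_HOURS = 4.0  # "more than 4 hours past your target runtime", per the post


def watchdog_limit_hours(target_runtime_hours: float) -> float:
    """CPU time after which the watchdog would step in, as described above."""
    return target_runtime_hours + WATCHDOG_GRACE_HOURS


def exceeds_watchdog_limit(cpu_time_hours: float, target_runtime_hours: float) -> bool:
    """True if the task's CPU time (not elapsed time) is past the limit."""
    return cpu_time_hours > watchdog_limit_hours(target_runtime_hours)


def cpu_share(cpu_time_hours: float, elapsed_hours: float) -> float:
    """Fraction of elapsed wall-clock time in which the task actually got CPU."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed time must be positive")
    return cpu_time_hours / elapsed_hours


# Hypothetical numbers: 10 h elapsed, 5 h of CPU time, 6 h target runtime.
elapsed, cpu, target = 10.0, 5.0, 6.0
print(f"CPU share: {cpu_share(cpu, elapsed):.0%}")
print(f"Watchdog limit: {watchdog_limit_hours(target)} h of CPU time")
print(f"Past the limit: {exceeds_watchdog_limit(cpu, target)}")
```

With these made-up figures the task has been on screen for 10 hours but has only used half of that as CPU time, so it is still well short of the limit.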
Rosetta Moderator: Mod.Sense

Scott Probst
Joined: 1 Apr 09
Posts: 1
Credit: 1,338,447
RAC: 0
Message 78245 - Posted: 1 Jun 2015, 11:11:11 UTC

I'm wondering if there are some problems currently. All but very few units are ending with 'computation error' messages. I've taken off and re-downloaded the Rosetta project but no change.

thanks.

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 78258 - Posted: 4 Jun 2015, 0:09:00 UTC - in response to Message 78245.  

I'm wondering if there are some problems currently. All but very few units are ending with 'computation error' messages. I've taken off and re-downloaded the Rosetta project but no change.

thanks.


Looking at your tasks, several have been successfully completed by other machines. Most of your failures show issues with missing files, which are sometimes reported as download errors and at other times show up in the log of a task reported as a compute error.

Have you checked your antivirus software? Or firewall settings? It seems your machine is often not getting all of the pieces that allow a task to run properly.
Rosetta Moderator: Mod.Sense

P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 78301 - Posted: 14 Jun 2015, 6:26:01 UTC

Hi.

I'm having problems getting new tasks.

Ready to send 37

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 78309 - Posted: 15 Jun 2015, 15:49:52 UTC - in response to Message 78301.  

Hi.

I'm having problems getting new tasks.

Ready to send 37


Looks like more tasks now.
Rosetta Moderator: Mod.Sense

P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 78497 - Posted: 26 Jul 2015, 3:24:56 UTC
Last modified: 26 Jul 2015, 3:25:29 UTC

NO new tasks again!

Total queued jobs: 0

Old man
Joined: 10 Nov 07
Posts: 25
Credit: 1,122,372
RAC: 23
Message 78500 - Posted: 26 Jul 2015, 8:34:47 UTC - in response to Message 78497.  

NO new tasks again!

Total queued jobs: 0


http://boincstats.com/en/stats/14/project/detail/user

A lot of active users again. Maybe Charity Engine is here again.

JohnH
Joined: 25 Mar 13
Posts: 43
Credit: 2,319,355
RAC: 0
Message 78502 - Posted: 26 Jul 2015, 17:01:43 UTC

Not getting any workunits either. All servers "green" but only one task available ... but not for me apparently.
Anybody else copy this state?

rochester new york
Joined: 2 Jul 06
Posts: 2842
Credit: 2,020,043
RAC: 0
Message 78503 - Posted: 26 Jul 2015, 18:51:09 UTC - in response to Message 78502.  

Not getting any workunits either. All servers "green" but only one task available ... but not for me apparently.
Anybody else copy this state?



yes and likely no more until Monday morning

Deepak Chandra
Joined: 19 Nov 14
Posts: 1
Credit: 819,919
RAC: 0
Message 78506 - Posted: 27 Jul 2015, 13:28:03 UTC

Hi, it has been more than 24 hours since my computer was last assigned new tasks. Is something wrong?
from log -
27/07/2015 18:53:58 | rosetta@home | Sending scheduler request: To fetch work.
27/07/2015 18:53:58 | rosetta@home | Requesting new tasks for CPU and AMD/ATI GPU
27/07/2015 18:54:01 | rosetta@home | Scheduler request completed: got 0 new tasks
27/07/2015 18:54:01 | rosetta@home | No work sent
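For anyone wanting to check a longer log for the same pattern, here is a minimal Python sketch that splits event log lines like the ones above into timestamp, project and message, and adds up the "got N new tasks" replies. The function names are invented for this example, and it assumes every line uses the `timestamp | project | message` layout seen in this thread.

```python
import re

# Log excerpt copied from the post above.
LOG = """\
27/07/2015 18:53:58 | rosetta@home | Sending scheduler request: To fetch work.
27/07/2015 18:53:58 | rosetta@home | Requesting new tasks for CPU and AMD/ATI GPU
27/07/2015 18:54:01 | rosetta@home | Scheduler request completed: got 0 new tasks
27/07/2015 18:54:01 | rosetta@home | No work sent
"""

GOT_TASKS = re.compile(r"got (\d+) new tasks")


def parse_line(line: str) -> tuple[str, str, str]:
    """Split one log line into (timestamp, project, message)."""
    timestamp, project, message = (part.strip() for part in line.split("|", 2))
    return timestamp, project, message


def tasks_received(log_text: str) -> int:
    """Sum the task counts reported in 'got N new tasks' scheduler replies."""
    total = 0
    for line in log_text.splitlines():
        if line.count("|") < 2:
            continue  # not in the assumed three-field layout
        _, _, message = parse_line(line)
        match = GOT_TASKS.search(message)
        if match:
            total += int(match.group(1))
    return total


print(tasks_received(LOG))
```

A total of zero over a long stretch of scheduler requests points at the server having no work to send rather than at a problem on the client.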

Sid Celery
Joined: 11 Feb 08
Posts: 1960
Credit: 38,075,642
RAC: 7,255
Message 78507 - Posted: 27 Jul 2015, 14:50:34 UTC

It seems like a few tasks were made available this morning - I got 4, yay - but they've all been swallowed up.

Everything's gone red on the Server-status page at this precise moment, so I assume people are working on it still. Hopefully it all gets sorted soon. I've got a few other projects to run in the short term.

Sid Celery
Joined: 11 Feb 08
Posts: 1960
Credit: 38,075,642
RAC: 7,255
Message 78508 - Posted: 27 Jul 2015, 16:48:32 UTC

Another 200,000 tasks seem to have come through, of which I got 5 - yay - not much for an 8-core machine. All swallowed up again. Slowly, slowly...

Sid Celery
Joined: 11 Feb 08
Posts: 1960
Credit: 38,075,642
RAC: 7,255
Message 78509 - Posted: 27 Jul 2015, 18:20:50 UTC

Not sure about you, but I'm getting everything Boinc is asking for while I force my other projects to run down

P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 78515 - Posted: 30 Jul 2015, 6:20:15 UTC

I started getting this message now.

Thu 30 Jul 2015 16:08:19 AEST | rosetta@home | Scheduler request completed
Thu 30 Jul 2015 16:08:19 AEST | rosetta@home | Server error: can't attach shared memory

I can't report any of your tasks.


dcdc
Joined: 3 Nov 05
Posts: 1829
Credit: 113,931,835
RAC: 64,055
Message 78516 - Posted: 30 Jul 2015, 11:51:13 UTC

Mine isn't uploading either...

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 78517 - Posted: 30 Jul 2015, 16:01:21 UTC
Last modified: 30 Jul 2015, 16:03:56 UTC

It appears there has been another surge of users from Charity Engine. There are over a million tasks out being worked on right now. It looks like there are plenty available to send right now, but as they all start reporting back, I'd imagine some upload congestion is likely as well. The BOINC Manager will retry both getting more work and any uploads that run into problems.

One way to make the most of each task is to bump up your preferred work unit runtime. This is a setting in the Rosetta@home preferences, which are updated directly from the website. Be sure to set the value for the venue your host is attached to if you have more than one. Also, I suggest you make changes gradually so the BOINC Manager has time to adjust to the longer runtime and request a roughly correct amount of work. So it is best to adjust when your setting for the number of days of work to keep on hand is small.
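To see why a longer runtime preference stretches a limited supply of tasks, here is a back-of-the-envelope Python sketch. It assumes each task runs for about the target runtime and that all cores crunch around the clock; the function name, the 8-core host and the runtime values are illustrative only.

```python
HOURS_PER_DAY = 24


def tasks_per_day(cores: int, runtime_hours: float) -> float:
    """Approximate number of tasks a host needs per day to stay busy.

    Assumes every task runs for about the target runtime and every
    core is busy 24 hours a day.
    """
    if cores <= 0 or runtime_hours <= 0:
        raise ValueError("cores and runtime must be positive")
    return cores * HOURS_PER_DAY / runtime_hours


# Illustrative 8-core host with hypothetical runtime preferences.
for runtime in (3, 6, 12):
    needed = tasks_per_day(8, runtime)
    print(f"{runtime:>2} h runtime: about {needed:.0f} tasks per day")
```

Doubling the runtime halves the number of tasks the host has to fetch and report, which also means fewer scheduler requests and uploads when the servers are congested.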
Rosetta Moderator: Mod.Sense

P . P . L .
Joined: 20 Aug 06
Posts: 581
Credit: 4,865,274
RAC: 0
Message 78609 - Posted: 29 Aug 2015, 22:19:33 UTC
Last modified: 29 Aug 2015, 22:22:28 UTC

Uploads are timing out for rosetta, all other projects are o.k.

Downloads from rosetta are working for now!

Sun 30 Aug 2015 08:21:17 AEST | | Project communication failed: attempting access to reference site
Sun 30 Aug 2015 08:21:17 AEST | rosetta@home | Temporarily failed upload of gr081015_HEEHheeh_go3_nods_new_heeh_12.bp_r3_pass_20150721035246_fragments_fold_SAVE_ALL_OUT_277850_62_1_0: transient HTTP error
Sun 30 Aug 2015 08:21:17 AEST | rosetta@home | Backing off 6 min 54 sec on upload of gr081015_HEEHheeh_go3_nods_new_heeh_12.bp_r3_pass_20150721035246_fragments_fold_SAVE_ALL_OUT_277850_62_1_0
Sun 30 Aug 2015 08:21:17 AEST | rosetta@home | Temporarily failed upload of FFD__8091b9a093b6f6d8e0c04cb95e4c345c_abinitioDocking_15_08_09_29_22_globalDocking_7_SAVE_ALL_OUT_301200_1_0_0: transient HTTP error
Sun 30 Aug 2015 08:21:17 AEST | rosetta@home | Backing off 4 min 14 sec on upload of FFD__8091b9a093b6f6d8e0c04cb95e4c345c_abinitioDocking_15_08_09_29_22_globalDocking_7_SAVE_ALL_OUT_301200_1_0_0
Sun 30 Aug 2015 08:21:17 AEST | rosetta@home | Started upload of fd_t009___robetta_IGNORE_THE_REST_03_09_292878_4889_0_0
Sun 30 Aug 2015 08:21:19 AEST | | Internet access OK - project servers may be temporarily down.
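The backoff messages in a log like this can be pulled out with a short Python sketch to see how long each upload will wait before retrying. The regular expression only covers the "N min N sec" and "N sec" wording seen in this thread, which is an assumption about the format, and the function name is made up for the example.

```python
import re

# Two messages copied from the log above.
MESSAGES = [
    "Backing off 6 min 54 sec on upload of gr081015_HEEHheeh_go3_nods_new_heeh_12.bp_r3_pass_20150721035246_fragments_fold_SAVE_ALL_OUT_277850_62_1_0",
    "Backing off 4 min 14 sec on upload of FFD__8091b9a093b6f6d8e0c04cb95e4c345c_abinitioDocking_15_08_09_29_22_globalDocking_7_SAVE_ALL_OUT_301200_1_0_0",
]

# Assumed wording: optional "N min", then "N sec", then the file name.
BACKOFF = re.compile(r"Backing off (?:(\d+) min )?(\d+) sec on upload of (\S+)")


def backoff_seconds(message: str):
    """Return (file_name, delay_in_seconds), or None if the message does not match."""
    match = BACKOFF.search(message)
    if match is None:
        return None
    minutes = int(match.group(1) or 0)
    seconds = int(match.group(2))
    return match.group(3), minutes * 60 + seconds


for message in MESSAGES:
    name, delay = backoff_seconds(message)
    print(f"{delay:>4} s  {name}")
```

Short delays of a few minutes like these are normal retry behaviour; the client keeps the results and tries again on its own.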

©2024 University of Washington
https://www.bakerlab.org