Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 9,863 |
The important lines were: Have the programmers not heard of something called "timeout"? |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 9,863 |
OLDER? Really? Ryzen 3700x here, barely 2 years old in my system. ROTFPMSL at your computer being insulted. Project doesn't give a S--- about failures. As long as they get the data somehow from someone and if it's just one task somewhere that dies... oh well. There are a lot fewer pythons in the queue than there were. Either we've crunched them way faster than I thought we would, or they've been deleting some, or many have failed. Perhaps the next batch will have improvements. |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
<-- what is the source of these python tasks and has anybody ever seen the output from them? OLDER? Really? Ryzen 3700x here, barely 2 years old in my system. ROTFPMSL at your computer being insulted. <-- yeah, don't you know microchips have feelings? |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 9,863 |
The machine I'm trying to run python on keeps getting banned, despite completing most of them successfully. Is there ever an end to problems here? And yes I know they have feelings, that's why I buy "broken" GPUs on Ebay and try to get them to do something. Actually that's probably cruel as the poor things thought they'd retired. |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
The machine I'm trying to run python on keeps getting banned, despite completing most of them successfully. Is there ever an end to problems here? Probably your GPUs are complaining loudly and the RAH server is taking sympathy. Maybe you insulted it too many times? Or you're just lucky 13. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 9,863 |
I've lost a Radeon 6990 and a cockatiel in the last couple of days, I'm not happy. Things should be made to last forever. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 2,014 |
I've lost a Radeon 6990 and a cockatiel in the last couple of days, I'm not happy. Things should be made to last forever. Note that it would then take forever to determine if they actually last forever or not. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 9,863 |
Forever is not a fixed time. It could be 5 times longer than normal for example. I've lost a Radeon 6990 and a cockatiel in the last couple of days, I'm not happy. Things should be made to last forever. |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Forever is not a fixed time. It could be 5 times longer than normal for example. I've lost a Radeon 6990 and a cockatiel in the last couple of days, I'm not happy. Things should be made to last forever. You know those McDonald's trackers for your table? (well here in EU we have them) I looked at the underside last night: made in Thailand, assembled in China. And since most stuff these days is or was made in China, it's cheap and throwaway. The GPU mfgs would not be in business if their cards lasted forever. Cars used to last forever, but the "forever" went away in the 80s, I think, when we switched over to making things as cheap as possible and charging regular price to make more profit and get the consumer to buy more. |
Mr P Hucker Send message Joined: 12 Aug 06 Posts: 1600 Credit: 12,116,986 RAC: 9,863 |
You know those McDonald's trackers for your table? (well here in EU we have them) Damn, closed browser after checking preview thinking I'd posted it, so the following is shorter as I'm lazy. What's a McD tracker? I'm in the UK but rarely go there. Cars last me 20 years now, used to be 10. GPUs get replaced for the latest game. If the old one keeps value, those gamers have more money to buy the new one. If something breaks you don't buy from the same make again and write a nasty review, so making shit quality stuff harms your company. |
raymond Send message Joined: 27 Apr 20 Posts: 1 Credit: 418,877 RAC: 0 |
Why am I getting a notice "Waiting to contact project servers"? |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1725 Credit: 18,406,665 RAC: 19,920 |
Why am I getting a notice "Waiting to contact project servers"? No idea. In the Advanced view of BOINC Manager, select the Projects tab, select Rosetta & click on Update. Then check in Tools, Event log & see what messages are there. Grant Darwin NT |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2141 Credit: 41,534,176 RAC: 10,708 |
I've got one weird Python task that's been running for 26 hrs now, but it has used the CPU for 25.5 hrs and has checkpointed regularly, most recently 8 minutes ago. I've got no idea why it won't end itself. Does the watchdog no longer work? CPU time 1d 02:32:21. I'm going to abort it now and see what it reports. It should show here: aagb-HPR_pp-NMPHE-GPN_pp-BPRO_pp_6_2605012_6_1 |
kotenok2000 Send message Joined: 22 Feb 11 Posts: 272 Credit: 507,897 RAC: 334 |
Does the .out file in `c:\programdata\boinc\slots\[slot number here]\shared` change? |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2141 Credit: 41,534,176 RAC: 10,708 |
CPU time 1d 02:32:21. Apologies, it's this task, not the one shown above: aagb-PHE_pp-mPIP-GGLY-mB3LEU_3_2686388_6_0. Run time: 1 day 2 hours 37 min 11 sec. Can anyone spot the error in the task? |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2141 Credit: 41,534,176 RAC: 10,708 |
Does the .out file in `c:\programdata\boinc\slots\[slot number here]\shared` change? Sorry, I didn't see this, but I don't know which .out file I should look at, what slot it was running in, or if or how it might have changed. Task aborted now. I assume the info has gone? |
robertmiles Send message Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 2,014 |
CPU time 1d 02:32:21. No error I can spot before this line, then several: Hypervisor System Log. However, these can be due to the abort. It may be a task that ran much longer than expected, without anything going wrong. If so, just letting it run longer would have let it finish. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 2,014 |
Does the .out file in `c:\programdata\boinc\slots\[slot number here]\shared` change? To find the slot number, click on the task in the tasks column, then on Properties. The info is gone shortly after the output files are uploaded and the task is reported as finished. The probable change to look for is any change to the date and size of the .out file. If there is more than one .out file in the slot directory, look for changes in the date or size of all of them. |
kotenok2000 Send message Joined: 22 Feb 11 Posts: 272 Credit: 507,897 RAC: 334 |
You can copy the .out file twice, waiting several minutes between copies, and then compare the two copies with WinMerge. |
.clair. Send message Joined: 2 Jan 07 Posts: 274 Credit: 26,399,595 RAC: 0 |
Looks like Rosetta 4.2 just got a batch of miniprotein in, grab them while they iz hot. Front page job queue went up by millions. |
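
The check suggested earlier in the thread (see whether the date or size of any .out file in the slot directory changes over a few minutes) could be scripted roughly as below. This is only a sketch: the function names are made up for illustration, and the slot path you pass in is whatever BOINC shows under the task's Properties. Nothing here is part of BOINC or Rosetta itself.

```python
import time
from pathlib import Path


def snapshot_out_files(slot_dir):
    """Map each .out file under slot_dir to (size in bytes, mtime in ns)."""
    root = Path(slot_dir)
    snapshot = {}
    for path in sorted(root.rglob("*.out")):
        if path.is_file():
            info = path.stat()
            snapshot[str(path.relative_to(root))] = (info.st_size, info.st_mtime_ns)
    return snapshot


def changed_files(before, after):
    """Names of .out files that appeared, vanished, or changed size/date."""
    names = set(before) | set(after)
    return sorted(name for name in names if before.get(name) != after.get(name))


def watch_slot(slot_dir, wait_seconds=300):
    """Take two snapshots wait_seconds apart and return what changed."""
    before = snapshot_out_files(slot_dir)
    time.sleep(wait_seconds)  # hypothetical default: five minutes between looks
    return changed_files(before, snapshot_out_files(slot_dir))
```

Calling `watch_slot` with the slot directory of the long-running task returns a list of the .out files that changed during the wait. An empty list after several minutes would suggest the task is no longer writing output, although it does not prove the task is stuck.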
©2024 University of Washington
https://www.bakerlab.org