Problems and Technical Issues with Rosetta@home

Message boards : Number crunching : Problems and Technical Issues with Rosetta@home

Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 101123 - Posted: 7 Apr 2021, 16:55:01 UTC - in response to Message 101089.  

It seems the RAM problem is not yet solved. I just updated Rosetta now and got the same complaint about needing 6 GB of RAM and only having 3 GB.
It's not a problem, it's just some tasks needing more RAM.
It is a problem when a Work Unit says it will need 6+GB of RAM, when it really only needs 300MB (or less), as that results in almost a third of the project's computing resources becoming unavailable.
It's true all the ones that came through to my larger machines all say under 700MB RAM. Maybe they can occasionally need a lot more so they're playing safe and not crashing small machines with them?
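A rough sketch of why an overstated memory estimate locks out smaller machines. It assumes the server simply compares the work unit's declared memory requirement with the RAM the host makes available to BOINC; the function name and the exact comparison are hypothetical, not the project's actual scheduler code.

```python
GB = 1024 ** 3
MB = 1024 ** 2


def can_send(declared_bound_bytes: int, host_usable_bytes: int) -> bool:
    """Hypothetical eligibility check: the declared bound must fit in usable RAM."""
    return declared_bound_bytes <= host_usable_bytes


host_usable = 3 * GB   # the 3 GB host from the complaint above
declared = 6 * GB      # what the work unit says it needs
reported = 300 * MB    # what it reportedly uses when it actually runs

print(can_send(declared, host_usable))  # False
print(can_send(reported, host_usable))  # True
```

Under that assumption the host is refused on the declared figure alone, even though the task would fit many times over.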
ID: 101123 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 101124 - Posted: 7 Apr 2021, 16:56:49 UTC - in response to Message 101094.  

This is why I reduce my CPU time in this project...
I used to run Rosetta by itself. Now I run SiDock too (both at 100%) and don't have to worry about switching anything.
Of course, Rosetta then gets less total CPU time, but it appears that they don't need it/can't use it anyway.
You can always set SiDock to 0 priority, then it will only get that if Rosetta is broken.
By the way it's not percent. I can set priorities to add up to any number.
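A small sketch of how such priorities work out, assuming each project's slice of CPU time is its number divided by the total of all the numbers. The function name and the values are illustrative only.

```python
def cpu_fractions(shares: dict[str, float]) -> dict[str, float]:
    """Turn arbitrary priority numbers into fractions of total CPU time."""
    total = sum(shares.values())
    if total == 0:
        raise ValueError("at least one project needs a non-zero share")
    return {name: share / total for name, share in shares.items()}


print(cpu_fractions({"Rosetta": 100, "SiDock": 100}))  # {'Rosetta': 0.5, 'SiDock': 0.5}
print(cpu_fractions({"Rosetta": 300, "SiDock": 100}))  # {'Rosetta': 0.75, 'SiDock': 0.25}
print(cpu_fractions({"Rosetta": 100, "SiDock": 0}))    # {'Rosetta': 1.0, 'SiDock': 0.0}
```

The last case is the "0 priority" setup: SiDock is entitled to nothing while Rosetta has work, so it only fills in when Rosetta has none to give.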
ID: 101124 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 101125 - Posted: 7 Apr 2021, 16:59:12 UTC - in response to Message 101097.  

I must agree with Brian, pretty disappointing that I have yet to see a project admin come in even acknowledging there is an issue. I'll give it to the end of the week. If it doesn't appear anyone is working on the problem I'll most likely drop Rosetta and look for a different project to donate processing to.
Why do you expect them to spend time talking to us instead of working on the programming and science? If things go wrong, they'll notice when they don't get the tasks sent back completed, then they'll fix it; they need it fixed so they can get the science done. All you need to do is check in here to see if any other volunteers are experiencing the same problem. If we all are, then it's not your end at fault.
ID: 101125 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 101126 - Posted: 7 Apr 2021, 17:01:11 UTC - in response to Message 101102.  

Seeing as it seems to be whining season for the excessively-entitled it might be worth making myself as popular as usual by stating the obvious that I can't be the only one aware of.

1) Looking at my main PC, I've downloaded 54 tasks today across 10 separate downloads. Even my weirdly-running laptop managed 6 more. And my Android 4 more.
For all the comments about it (none) it seems like I'm the only one.
Of course, I'm not.

2) There are people here from all over the world, and a similar range of nationalities and creeds at UW from what I've observed, so some people may not be aware that UW is in the USA, which is a largely Christian country.
Traditionally their institutions and work-places close for the Easter holidays which have been taking place over the last 4 days, from "Good Friday" to "Easter Monday" even if some of the people working there don't personally celebrate.

But, of course, during this holiday period people here expect to demand the creation and issue of sufficient work to serve at least a third of a million tasks per day to the world at large, with no respite.
And the amazing thing is, a fair few do seem to have come down. Maybe my caches aren't quite completely full, but near enough.

And as thanks, I see the usual levels of appreciation here, to wit, "if you don't supply what I need 24hrs a day, even during the holiday season, and provide chapter and verse on progress to tell us what we already know, loyalty will be shown by reducing contribution levels or leaving altogether"

Maybe you should have the occasional day off from your disgusting levels of personal entitlement too.
Though tbf, there's never any shortage of that.

Yeah, save it. I heard last time too. And the time before that. And the time before that etc
Well put.
ID: 101126 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 101127 - Posted: 7 Apr 2021, 17:03:08 UTC - in response to Message 101104.  

Me, I have obsessive-compulsive tendencies
Then keep them to yourself.
ID: 101127 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 101128 - Posted: 7 Apr 2021, 17:04:34 UTC - in response to Message 101108.  

You’re missing some prerequisite system libraries (glibc-2.27)


Woah, dude, you can tell that even though my system was turned off?

Did you find the horse porn folder too? That was from...uh, my kid brother.
Apparently that's illegal. When I was a teenager, everyone had seen the Pamela Anderson with a horse video. Seems we're getting prudish nowadays.
ID: 101128 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2141
Credit: 41,518,559
RAC: 10,612
Message 101129 - Posted: 7 Apr 2021, 17:25:06 UTC - in response to Message 101122.  

Unless you have set it to use local settings, it will use whatever you have set in your account's Computing preferences section.
What causes it to switch to local settings?

My bet is, accidentally pressing the button to see what it does, not noticing anything immediate, then shrugging, followed by not remembering some years later whether you pressed the button or not for that host (puts hand up)
ID: 101129 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2141
Credit: 41,518,559
RAC: 10,612
Message 101130 - Posted: 7 Apr 2021, 17:35:46 UTC - in response to Message 101123.  

It seems the RAM problem is not yet solved. I just updated Rosetta now and got the same complaint about needing 6 GB of RAM and only having 3 GB.
It's not a problem, it's just some tasks needing more RAM.
It is a problem when a Work Unit says it will need 6+GB of RAM, when it really only needs 300MB (or less), as that results in almost a third of the project's computing resources becoming unavailable.
It's true all the ones that came through to my larger machines all say under 700MB RAM. Maybe they can occasionally need a lot more so they're playing safe and not crashing small machines with them?

What I noticed earlier today was, I had 3 good tasks running and 11 came down.
As they began to run for 10 or 15 seconds before erroring out, 3 more than the existing 3 would run at a time with the others saying "Waiting for memory"
As each crashed out, one of the waiting-for-memory tasks would start running until crashing out. And so on until they all had.
So that's 6 tasks running at a time from 28Gb of memory allocated to Boinc. SIX!

The vast majority of people won't stand a chance at that rate.
I'd need 75Gb of free RAM allocated to Boinc to run on all 16 cores.
Never going to happen.

Up to yesterday I seemed to be running 16 tasks comfortably within 28Gb. No idea what's going on now. Bizarre.
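Working through those numbers as a quick sketch, on the simplifying assumption that every running task reserves an equal slice of the memory allocated to Boinc (the function name is made up for illustration):

```python
def ram_needed_gb(allocated_gb: float, concurrent_tasks: int, cores: int) -> float:
    """Scale the observed per-task memory reservation up to one task per core."""
    per_task_gb = allocated_gb / concurrent_tasks
    return per_task_gb * cores


per_task = 28 / 6
needed = ram_needed_gb(28, 6, 16)
print(f"{per_task:.2f} GB per task, {needed:.1f} GB for 16 cores")
# 4.67 GB per task, 74.7 GB for 16 cores
```

That lands on the roughly 75Gb figure quoted above, against the 300MB or so per task that the tasks reportedly use when they run.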
ID: 101130 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 101131 - Posted: 7 Apr 2021, 17:42:23 UTC - in response to Message 101129.  

Unless you have set it to use local settings, it will use whatever you have set in your account's Computing preferences section.
What causes it to switch to local settings?

My bet is, accidentally pressing the button to see what it does, not noticing anything immediate, then shrugging, followed by not remembering some years later whether you pressed the button or not for that host (puts hand up)
Where is this button? I've never seen it. I assumed that it changed mine to local because I changed a local setting - eg the buffer size.
ID: 101131 · Rating: 0
Mr P Hucker
Joined: 12 Aug 06
Posts: 1600
Credit: 12,116,986
RAC: 9,863
Message 101132 - Posted: 7 Apr 2021, 17:44:38 UTC - in response to Message 101130.  

It seems the RAM problem is not yet solved. I just updated Rosetta now and got the same complaint about needing 6 GB of RAM and only having 3 GB.
It's not a problem, it's just some tasks needing more RAM.
It is a problem when a Work Unit says it will need 6+GB of RAM, when it really only needs 300MB (or less), as that results in almost a third of the project's computing resources becoming unavailable.
It's true all the ones that came through to my larger machines all say under 700MB RAM. Maybe they can occasionally need a lot more so they're playing safe and not crashing small machines with them?

What I noticed earlier today was, I had 3 good tasks running and 11 came down.
As they began to run for 10 or 15 seconds before erroring out, 3 more than the existing 3 would run at a time with the others saying "Waiting for memory"
As each crashed out, one of the waiting-for-memory tasks would start running until crashing out. And so on until they all had.
So that's 6 tasks running at a time from 28Gb of memory allocated to Boinc. SIX!

The vast majority of people won't stand a chance at that rate.
I'd need 75Gb of free RAM allocated to Boinc to run on all 16 cores.
Never going to happen.

Up to yesterday I seemed to be running 16 tasks comfortably within 28Gb. No idea what's going on now. Bizarre.
If the program needs the RAM, there's nothing they can do about it, unless they go back in time and find the real programmers who could write a game that ran in 48KB. Chances are some of the tasks will have less of a RAM need, and you'll get a decent mix. Or you get some from another project at once.

And why have you only got 48GB? I have 64 out of an allowed 128.
ID: 101132 · Rating: 0
Breno
Joined: 8 Apr 20
Posts: 30
Credit: 12,984,922
RAC: 1,209
Message 101133 - Posted: 7 Apr 2021, 17:54:01 UTC - in response to Message 81016.  

Maybe it has something to do with the recent SSL post they posted on the Forum News.
Maybe every client instance has to manually reset the URL like months ago.
I don't really know, but you are right, the project is at risk of losing a lot of WUs if they don't attend to this issue.
Keep the faith in this project!
ID: 101133 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2141
Credit: 41,518,559
RAC: 10,612
Message 101134 - Posted: 7 Apr 2021, 19:49:04 UTC - in response to Message 101131.  

Unless you have set it to use local settings, it will use whatever you have set in your account's Computing preferences section.
What causes it to switch to local settings?

My bet is, accidentally pressing the button to see what it does, not noticing anything immediate, then shrugging, followed by not remembering some years later whether you pressed the button or not for that host (puts hand up)
Where is this button? I've never seen it. I assumed that it changed mine to local because I changed a local setting - eg the buffer size.

It's at the very top of Computing Preferences - above all the tabs.
I know what you mean - it's so obvious I go blind to it
ID: 101134 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2141
Credit: 41,518,559
RAC: 10,612
Message 101135 - Posted: 7 Apr 2021, 19:53:34 UTC - in response to Message 101132.  

It seems the RAM problem is not yet solved. I just updated Rosetta now and got the same complaint about needing 6 GB of RAM and only having 3 GB.
It's not a problem, it's just some tasks needing more RAM.
It is a problem when a Work Unit says it will need 6+GB of RAM, when it really only needs 300MB (or less), as that results in almost a third of the project's computing resources becoming unavailable.
It's true all the ones that came through to my larger machines all say under 700MB RAM. Maybe they can occasionally need a lot more so they're playing safe and not crashing small machines with them?

What I noticed earlier today was, I had 3 good tasks running and 11 came down.
As they began to run for 10 or 15 seconds before erroring out, 3 more than the existing 3 would run at a time with the others saying "Waiting for memory"
As each crashed out, one of the waiting-for-memory tasks would start running until crashing out. And so on until they all had.
So that's 6 tasks running at a time from 28Gb of memory allocated to Boinc. SIX!

The vast majority of people won't stand a chance at that rate.
I'd need 75Gb of free RAM allocated to Boinc to run on all 16 cores.
Never going to happen.

Up to yesterday I seemed to be running 16 tasks comfortably within 28Gb. No idea what's going on now. Bizarre.
If the program needs the RAM, there's nothing they can do about it, unless they go back in time and find the real programmers who could write a game that ran in 48KB. Chances are some of the tasks will have less of a RAM need, and you'll get a decent mix. Or you get some from another project at once.
And why have you only got 48GB? I have 64 out of an allowed 128.

I allocate 28Gb from 32Gb total
They don't need the RAM. If they run, they generally use 300Mb, not 5 or 6Gb each. It's more than a bit crackers

Anyway, new news. I grabbed another few tasks while WCG is mainly running and they all seem new and are running normally without crashing.
I think the hint dropped pretty heavily when every task got sent back un-run.
Things are happening whether they say so here or not
ID: 101135 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1234
Credit: 14,338,560
RAC: 2,014
Message 101137 - Posted: 7 Apr 2021, 20:30:15 UTC - in response to Message 101135.  

It seems the RAM problem is not yet solved. I just updated Rosetta now and got the same complaint about needing 6 GB of RAM and only having 3 GB.
It's not a problem, it's just some tasks needing more RAM.
It is a problem when a Work Unit says it will need 6+GB of RAM, when it really only needs 300MB (or less), as that results in almost a third of the project's computing resources becoming unavailable.
It's true all the ones that came through to my larger machines all say under 700MB RAM. Maybe they can occasionally need a lot more so they're playing safe and not crashing small machines with them?
[snip]

I allocate 28Gb from 32Gb total
They don't need the RAM. If they run, they generally use 300Mb, not 5 or 6Gb each. It's more than a bit crackers

Anyway, new news. I grabbed another few tasks while WCG is mainly running and they all seem new and are running normally without crashing.
I think the hint dropped pretty heavily when every task got sent back un-run.
Things are happening whether they say so here or not

Have you considered the possibility that many of those creating workunits are not yet very good at estimating how much RAM they will need to run?

I suspect that many of them are also not yet very good at reading the task log files, recognizing the problems they show, and correcting them.
ID: 101137 · Rating: 0
Jim Martin
Joined: 9 Oct 05
Posts: 23
Credit: 1,443,682
RAC: 585
Message 101138 - Posted: 7 Apr 2021, 22:09:52 UTC

Hello. After approx. 15 years with Baker Lab, I've experienced an interesting problem, under the general category, computer errors.
Sorry to copy the entire err. report, but perhaps it will clarify.

Any ideas? The past three downloads gave, basically, the same results. If computing requirements have changed recently, then I'll have to change. Otherwise, perhaps, UW's end is with some new problems?

Good luck.

Jim Martin


Task 1364797245


Name ajzjTxIe_YBAABB_ABYBB_AAAAAAXB_AAY_CGGGGGGCCGGGGGCGGGGGGGGCGGGC_1-4_2-5_3-6.pdb_0001_abinitio_1_abinitio_SAVE_ALL_OUT_1389656_916_1
Workunit 1220204735
Created 7 Apr 2021, 15:30:32 UTC
Sent 7 Apr 2021, 15:32:35 UTC
Report deadline 10 Apr 2021, 15:32:35 UTC
Received 7 Apr 2021, 17:59:51 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 1 (0x00000001) Unknown error code
Computer ID 1324493
Run time 39 sec
CPU time 24 sec
Validate state Invalid
Credit 0.00
Device peak FLOPS 3.37 GFLOPS
Application version Rosetta v4.20 windows_x86_64
Peak working set size 153.00 MB
Peak swap size 124.98 MB
Peak disk usage 0.01 MB

Stderr output
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
Incorrect function.
(0x1) - exit code 1 (0x1)</message>
<stderr_txt>
command: projects/boinc.bakerlab.org_rosetta/rosetta_4.20_windows_x86_64.exe -fragA 00001.500.6mers -fragB 00001.500.4mers -in:file:fasta 00001.fasta -abinitio::increase_cycles 10 -mute all -abinitio::fastrelax -relax::default_repeats 15 -abinitio::rsd_wt_helix 0.5 -abinitio::rsd_wt_loop 0.5 -abinitio::use_filters false -ex1 -ex2aro -in:file:boinc_wu_zip cp_ajzjTxIe_YBAABB_ABYBB_AAAAAAXB_AAY_CGGGGGGCCGGGGGCGGGGGGGGCGGGC_1-4_2-5_3-6.pdb_0001_abinitio_1_fold_data.zip -out:file:silent default.out -silent_gz -in:file:native 00001.pdb -out:file:silent_struct_type binary -detect_disulf true -fix_disulf disulf -constraints::cst_file CB_cst -constraints:cst_weight 1 -number_9mer_frags 150 -number_3mer_frags 150 -nstruct 10000 -cpu_run_time 28800 -boinc:max_nstruct 20000 -checkpoint_interval 120 -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -boinc::cpu_run_timeout 36000 -run::rng mt19937 -constant_seed -jran 1985841
Using database: database_357d5d93529_n_methylminirosetta_database

ERROR: ERROR: FragmentIO: could not open file 00001.500.6mers
ERROR:: Exit from: ......srccorefragmentFragmentIO.cc line: 233
BOINC:: Error reading and gzipping output datafile: default.out
11:41:18 (10892): called boinc_finish(1)

</stderr_txt>
]]>
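For anyone comparing failed tasks, a sketch that pulls the error lines out of a stderr dump like the one above. The function name is hypothetical, and the only assumption is that failures are reported on lines beginning with `ERROR`.

```python
def error_lines(stderr_text: str) -> list[str]:
    """Return the stripped lines of a stderr dump that start with 'ERROR'."""
    return [
        line.strip()
        for line in stderr_text.splitlines()
        if line.strip().startswith("ERROR")
    ]


# Three lines copied from the report above.
sample = """ERROR: ERROR: FragmentIO: could not open file 00001.500.6mers
ERROR:: Exit from: ......srccorefragmentFragmentIO.cc line: 233
BOINC:: Error reading and gzipping output datafile: default.out
"""

for line in error_lines(sample):
    print(line)
```

Here it picks out the FragmentIO message and the exit location, which is the part worth comparing between hosts: the task stopped because the fragment file 00001.500.6mers could not be opened.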
ID: 101138 · Rating: 0
Brian Nixon
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 0
Message 101141 - Posted: 7 Apr 2021, 22:46:08 UTC - in response to Message 101138.  
Last modified: 7 Apr 2021, 22:47:14 UTC

perhaps, UW's end is with some new problems?
Yes: many people have reported the same issue recently. There’s nothing we can do about it other than let the bad work units fail, or stop running Rosetta until the problem has passed.
ID: 101141 · Rating: 0
mrhastyrib
Joined: 18 Feb 21
Posts: 90
Credit: 2,541,890
RAC: 0
Message 101142 - Posted: 7 Apr 2021, 22:49:40 UTC - in response to Message 101128.  

When I was a teenager,


If you're THAT old, you shouldn't be getting hot flashes every time someone says "dude."
ID: 101142 · Rating: 0
mrhastyrib
Joined: 18 Feb 21
Posts: 90
Credit: 2,541,890
RAC: 0
Message 101143 - Posted: 7 Apr 2021, 22:56:08 UTC - in response to Message 101141.  

There’s nothing we can do about it

We could ritualistically sacrifice a chicken, and then sprinkle its blood and entrails on @Peter Hucker.

The best part is, none of the staff of the nursing home would believe that it happened, when it's reported by the other residents.
ID: 101143 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2141
Credit: 41,518,559
RAC: 10,612
Message 101144 - Posted: 8 Apr 2021, 0:02:04 UTC - in response to Message 101137.  
Last modified: 8 Apr 2021, 0:08:45 UTC

It seems the RAM problem is not yet solved. I just updated Rosetta now and got the same complaint about needing 6 GB of RAM and only having 3 GB.
It's not a problem, it's just some tasks needing more RAM.
It is a problem when a Work Unit says it will need 6+GB of RAM, when it really only needs 300MB (or less), as that results in almost a third of the project's computing resources becoming unavailable.
It's true all the ones that came through to my larger machines all say under 700MB RAM. Maybe they can occasionally need a lot more so they're playing safe and not crashing small machines with them?
I allocate 28Gb from 32Gb total
They don't need the RAM. If they run, they generally use 300Mb, not 5 or 6Gb each. It's more than a bit crackers

Anyway, new news. I grabbed another few tasks while WCG is mainly running and they all seem new and are running normally without crashing.
I think the hint dropped pretty heavily when every task got sent back un-run.
Things are happening whether they say so here or not

Have you considered the possibility that many of those creating work-units are not yet very good at estimating how much RAM they will need to run?

I suspect that many of them are also not yet very good at reading the task log files, recognizing the problems they show, and correcting them.

I hadn't considered it because if someone can code for the kind of work we're getting I wouldn't be so grossly insulting as to suggest they're a bit thick.
I can easily imagine either the slip of a finger or maybe some kind of test that they didn't want to be limited by RAM or disk space to have accidentally been left in.

Honestly, of all the things to suggest... have a word with yourself

Aside from that, it would be nice if we could have a few more of those tasks that it seems I was lucky to pick up. They seem fine on my main PC with plenty of RAM

Edit again: Miraculously picked up 4 tasks on my laptop immediately after posting. None again when I tried on the desktop. They're trying, but hand to mouth.
ID: 101144 · Rating: 0
Jim Martin
Joined: 9 Oct 05
Posts: 23
Credit: 1,443,682
RAC: 585
Message 101145 - Posted: 8 Apr 2021, 0:55:10 UTC - in response to Message 101141.  

Thanks for the reply, Brian. I wonder why some have this problem, and others don't. Nothing has changed (computer) on this end of the line.
So, will just run SiDock@home for a while. Natalia has a cheerful and informative approach to running things.

jm
ID: 101145 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org