computation error

Message boards : Number crunching : computation error

H.

Send message
Joined: 20 Nov 07
Posts: 5
Credit: 1,013,932
RAC: 0
Message 78922 - Posted: 16 Oct 2015, 12:59:12 UTC

hi.
for the last week or so ALL my units (and i DO mean ALL of them!) ended in a computation error. yes, i've reset the project, but that didn't help. what's going on here?
ID: 78922 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
rjs5

Send message
Joined: 22 Nov 10
Posts: 273
Credit: 21,465,703
RAC: 16,826
Message 78924 - Posted: 16 Oct 2015, 13:16:25 UTC - in response to Message 78922.  

hi.
for the last week or so ALL my units (and i DO mean ALL of them!) ended in a computation error. yes, i've reset the project, but that didn't help. what's going on here?


if you are running version 3.65, it is dynamically linked and wants some new libraries, specifically libglut ....


../projects/boinc.bakerlab.org_rosetta/minirosetta_3.65_x86_64-pc-linux-gnu: error while loading shared libraries: libglut.so.3: cannot open shared object file: No such file or directory

you need to install the library.
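To confirm which libraries the binary cannot find, `ldd` lists each shared-library dependency and marks unresolved ones as `not found`. A minimal sketch, assuming the binary is at the relative path shown in the error above; the helper name `missing_libs` is made up for illustration:

```shell
# Hypothetical helper: reads `ldd` output on stdin and prints the names of
# the libraries the dynamic loader could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage (path taken from the error message; adjust to your BOINC directory):
# ldd ../projects/boinc.bakerlab.org_rosetta/minirosetta_3.65_x86_64-pc-linux-gnu | missing_libs
```

If this prints only `libglut.so.3`, installing that one library should be enough. Anything else in the list needs its own package.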

Minirosetta 3.62-3.65

ID: 78924 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Chilean
Avatar

Send message
Joined: 16 Oct 05
Posts: 711
Credit: 26,694,507
RAC: 0
Message 78926 - Posted: 16 Oct 2015, 13:48:36 UTC

Run:

sudo apt-get install freeglut3
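After the install, one way to check that the loader can now see the library is to search the `ldconfig` cache. This is a sketch, assuming a glibc-based distribution where `ldconfig -p` is available (it may live in `/sbin`); the function name `has_lib` is illustrative only:

```shell
# Hypothetical helper: succeeds if the given library name appears in
# `ldconfig -p` style output supplied on stdin.
has_lib() {
    grep -qF -- "$1"
}

# Usage:
# ldconfig -p | has_lib libglut.so.3 && echo "libglut.so.3 is visible to the loader"
```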

ID: 78926 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Mod.Sense
Volunteer moderator

Send message
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 78927 - Posted: 16 Oct 2015, 14:31:36 UTC

DK is looking into this issue with Linux installations. See his post.
Rosetta Moderator: Mod.Sense
ID: 78927 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Bjorn Munch
Avatar

Send message
Joined: 5 Oct 13
Posts: 3
Credit: 3,599,545
RAC: 0
Message 78930 - Posted: 16 Oct 2015, 16:21:53 UTC

Dang, one of my computers is running Linux Mint 14 (based on Ubuntu 12.10), apparently too old to get freeglut3. And yes all my tasks are failing. I guess I have to disable rosetta on that box and revert to 100% SETI...
ID: 78930 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Dr. Merkwürdigliebe
Avatar

Send message
Joined: 5 Dec 10
Posts: 81
Credit: 2,657,273
RAC: 0
Message 78931 - Posted: 16 Oct 2015, 16:52:12 UTC - in response to Message 78930.  

Dang, one of my computers is running Linux Mint 14 (based on Ubuntu 12.10), apparently too old to get freeglut3. And yes all my tasks are failing. I guess I have to disable rosetta on that box and revert to 100% SETI...


Still looking for aliens, huh? All those CPU/GPU cycles burned in vain...

What's the error output of the apt-get command? The package ought to be in the repositories.

All those SSE and screen saver errors...they just distract the developers from the real necessities.
ID: 78931 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile dcdc

Send message
Joined: 3 Nov 05
Posts: 1829
Credit: 115,902,918
RAC: 61,143
Message 78933 - Posted: 16 Oct 2015, 18:12:00 UTC - in response to Message 78930.  

Dang, one of my computers is running Linux Mint 14 (based on Ubuntu 12.10), apparently too old to get freeglut3. And yes all my tasks are failing. I guess I have to disable rosetta on that box and revert to 100% SETI...

I had the same problem with DotschUX which is based on Ubuntu 8 I think. (running from a compactflash card so had to keep it small). Switched to Ubuntu 15 and ran Chilean's string and all appears to be working now.
ID: 78933 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile shanen
Avatar

Send message
Joined: 16 Apr 14
Posts: 195
Credit: 12,662,308
RAC: 0
Message 78934 - Posted: 16 Oct 2015, 19:40:00 UTC - in response to Message 78922.  

hi.
for the last week or so ALL my units (and i DO mean ALL of them!) ended in a computation error. yes, i've reset the project, but that didn't help. what's going on here?


Same problem on the Ubuntu side of this dual boot machine. All work units immediately have a computation error. No apparent problems when the same machine is running under Windows 10. (It's an oldish Toshiba that started as Windows 7, was configured with the Ubuntu dual boot a few years ago, and recently upgraded to Windows 10 on the Windows side.)

They have said that bandwidth is a problem, but this is just a massive waste of bandwidth.

The website is also broken. The message sort feature is NOT sorting the messages. Or is it broken as designed, and there are just a bunch of pegged posts at the top?

Complexities I don't need. Such annoyances led me to abandon several previous BOINC projects.
#1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech)
ID: 78934 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
rjs5

Send message
Joined: 22 Nov 10
Posts: 273
Credit: 21,465,703
RAC: 16,826
Message 78935 - Posted: 16 Oct 2015, 19:49:51 UTC - in response to Message 78934.  

hi.
for the last week or so ALL my units (and i DO mean ALL of them!) ended in a computation error. yes, i've reset the project, but that didn't help. what's going on here?


Same problem on the Ubuntu side of this dual boot machine. All work units immediately have a computation error. No apparent problems when the same machine is running under Windows 10. (It's an oldish Toshiba that started as Windows 7, was configured with the Ubuntu dual boot a few years ago, and recently upgraded to Windows 10 on the Windows side.)

They have said that bandwidth is a problem, but this is just a massive waste of bandwidth.

The website is also broken. The message sort feature is NOT sorting the messages. Or is it broken as designed, and there are just a bunch of pegged posts at the top?

Complexities I don't need. Such annoyances led me to abandon several previous BOINC projects.



The task details indicate that v3.65 Linux tasks are aborting because the new binary needs access to libglut.so

Just install the libglut.so library and continue if you want.

sudo apt-get install freeglut3


<stderr_txt>
../../projects/boinc.bakerlab.org_rosetta/minirosetta_3.65_x86_64-pc-linux-gnu: error while loading shared libraries: libglut.so.3: cannot open shared object file: No such file or directory

</stderr_txt>


ID: 78935 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Bjorn Munch
Avatar

Send message
Joined: 5 Oct 13
Posts: 3
Credit: 3,599,545
RAC: 0
Message 78942 - Posted: 18 Oct 2015, 9:11:58 UTC - in response to Message 78933.  

Dang, one of my computers is running Linux Mint 14 (based on Ubuntu 12.10), apparently too old to get freeglut3. And yes all my tasks are failing. I guess I have to disable rosetta on that box and revert to 100% SETI...

I had the same problem with DotschUX which is based on Ubuntu 8 I think. (running from a compactflash card so had to keep it small). Switched to Ubuntu 15 and ran Chilean's string and all appears to be working now.


apt-get would confirm there was a freeglut3 but when it proceeded to actually download the deb file it was not there. However, I was able to find the freeglut3 for the LTS Ubuntu 12.04 and install from the web. I enabled rosetta again and now tasks seem to be running. :-)

There is no such thing as Ubuntu 15, it's 15.04 or 15.10 though the latter hasn't been released just yet.
ID: 78942 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile shanen
Avatar

Send message
Joined: 16 Apr 14
Posts: 195
Credit: 12,662,308
RAC: 0
Message 78948 - Posted: 20 Oct 2015, 2:14:39 UTC - in response to Message 78942.  

[quote][quote]Dang, one of my computers is running Linux Mint 14 (based on Ubuntu 12.10), apparently too old to get freeglut3. And yes all my tasks are failing. I guess I have to disable rosetta on that box and revert to 100% SETI...

I had the same problem with DotschUX which is based on Ubuntu 8 I think. (running from a compactflash card so had to keep it small). Switched to Ubuntu 15 and ran Chilea
ID: 78948 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile shanen
Avatar

Send message
Joined: 16 Apr 14
Posts: 195
Credit: 12,662,308
RAC: 0
Message 78949 - Posted: 20 Oct 2015, 2:19:55 UTC - in response to Message 78948.  

[quote][quote]Dang, one of my computers is running Linux Mint 14 (based on Ubuntu 12.10), apparently too old to get freeglut3. And yes all my tasks are failing. I guess I have to disable rosetta on that box and revert to 100% SETI...

I had the same problem with DotschUX which is based on Ubuntu 8 I think. (running from a compactflash card so had to keep it small). Switched to Ubuntu 15 and ran Chilea


I did NOT post that previous message. Buggy website, complicated and buggy projects, and I wouldn't trust any scientific results running on such a foundation. It lost TWO previous attempts to update the status of this thread, but maybe it was confused by the other person (whoever it was) using DotshUX?

Time to say goodbye to rosetta@home, but too busy this month to do the new project research, so probably will pass 1.5 million units of "work done" credit before switching completely...

Unlikely that anyone reading this will have any recommendations for alternative projects, but I'll try to check back.
#1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech)
ID: 78949 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Link
Avatar

Send message
Joined: 4 May 07
Posts: 354
Credit: 382,349
RAC: 0
Message 78951 - Posted: 20 Oct 2015, 10:56:21 UTC - in response to Message 78949.  
Last modified: 20 Oct 2015, 11:05:16 UTC

I did NOT post that previous message. Buggy website, complicated and buggy projects, and I wouldn't trust any scientific results running on such a foundation. It lost TWO previous attempts to update the status of this thread, but maybe it was confused by the other person (whoever it was) using DotshUX?

The forum software may be old and missing some features, which we are all used to, but it definitely won't create posts for you. Considering all the pretty weird issues you have using this board, I'd say either a buggy browser or, and that's more likely here IMHO, a problem between keyboard and chair.
.
ID: 78951 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile shanen
Avatar

Send message
Joined: 16 Apr 14
Posts: 195
Credit: 12,662,308
RAC: 0
Message 78956 - Posted: 21 Oct 2015, 1:11:56 UTC - in response to Message 78951.  

I did NOT post that previous message. Buggy website, complicated and buggy projects, and I wouldn't trust any scientific results running on such a foundation. It lost TWO previous attempts to update the status of this thread, but maybe it was confused by the other person (whoever it was) using DotshUX?

The forum software may be old and missing some features, which we are all used to, but it definitely won't create posts for you. Considering all the pretty weird issues you have using this board, I'd say either a buggy browser or, and that's more likely here IMHO, a problem between keyboard and chair.


Look, I'm telling you that I did NOT write that post about DotshUX (which I've never heard of), though I was trying to write a completely DIFFERENT post at about the same time. Maybe a failure in the semaphores? (Actually reminds me of a significant bug in an email system I wrote several decades ago. It took several months before anyone called it to my attention, when it was a 10-minute fix.)

Browser is Firefox, even on my Ubuntu machines.

The suggestion of "a problem between keyboard and chair" is a rude ad hominem comment. REAL science doesn't work that way.
#1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech)
ID: 78956 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Link
Avatar

Send message
Joined: 4 May 07
Posts: 354
Credit: 382,349
RAC: 0
Message 78958 - Posted: 21 Oct 2015, 8:59:16 UTC - in response to Message 78956.  

Look, I'm telling you that I did NOT write that post about DotshUX (which I've never heard of), though I was trying to write a completely DIFFERENT post at about the same time.

After clicking on the "reply" button you (and not the forum software) deleted some UBB tags, that's why it looks like your post instead of a quote.

Before posting you can use the preview button to see if everything is the way you want it to be, or you can even edit your post within one hour after posting it, instead of quoting it and saying you didn't want it that way. So everything works just like with any other forum software.



The suggestion of "a problem between keyboard and chair" is a rude ad hominem comment. REAL science doesn't work that way.

What suggestion do you expect? Nobody else here has the "issues" you reported so far and also nobody else is able to reproduce them (I tried that when you said the topic order is wrong).
.
ID: 78958 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Chilean
Avatar

Send message
Joined: 16 Oct 05
Posts: 711
Credit: 26,694,507
RAC: 0
Message 78964 - Posted: 21 Oct 2015, 16:38:04 UTC - in response to Message 78956.  

I did NOT post that previous message. Buggy website, complicated and buggy projects, and I wouldn't trust any scientific results running on such a foundation. It lost TWO previous attempts to update the status of this thread, but maybe it was confused by the other person (whoever it was) using DotshUX?

The forum software may be old and missing some features, which we are all used to, but it definitely won't create posts for you. Considering all the pretty weird issues you have using this board, I'd say either a buggy browser or, and that's more likely here IMHO, a problem between keyboard and chair.


Look, I'm telling you that I did NOT write that post about DotshUX (which I've never heard of), though I was trying to write a completely DIFFERENT post at about the same time. Maybe a failure in the semaphores? (Actually reminds me of a significant bug in an email system I wrote several decades ago. It took several months before anyone called it to my attention, when it was a 10-minute fix.)

Browser is Firefox, even on my Ubuntu machines.

The suggestion of "a problem between keyboard and chair" is a rude ad hominem comment. REAL science doesn't work that way.

ID: 78964 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Chilean
Avatar

Send message
Joined: 16 Oct 05
Posts: 711
Credit: 26,694,507
RAC: 0
Message 78965 - Posted: 21 Oct 2015, 16:39:18 UTC - in response to Message 78964.  

I did NOT post that previous message. Buggy website, complicated and buggy projects, and I wouldn't trust any scientific results running on such a foundation. It lost TWO previous attempts to update the status of this thread, but maybe it was confused by the other person (whoever it was) using DotshUX?

The forum software may be old and missing some features, which we are all used to, but it definitely won't create posts for you. Considering all the pretty weird issues you have using this board, I'd say either a buggy browser or, and that's more likely here IMHO, a problem between keyboard and chair.


Look, I'm telling you that I did NOT write that post about DotshUX (which I've never heard of), though I was trying to write a completely DIFFERENT post at about the same time. Maybe a failure in the semaphores? (Actually reminds me of a significant bug in an email system I wrote several decades ago. It took several months before anyone called it to my attention, when it was a 10-minute fix.)

Browser is Firefox, even on my Ubuntu machines.

The suggestion of "a problem between keyboard and chair" is a rude ad hominem comment. REAL science doesn't work that way.


You see, I didn't write that either. I did tho, delete the /quote tag. So it looks as if I wrote it.

ID: 78965 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile shanen
Avatar

Send message
Joined: 16 Apr 14
Posts: 195
Credit: 12,662,308
RAC: 0
Message 78966 - Posted: 22 Oct 2015, 0:57:00 UTC - in response to Message 78965.  
Last modified: 22 Oct 2015, 0:58:21 UTC

So your point is apparently that this website is complicated and buggy? Sorry, but I already noticed that. If you were attempting to motivate me to be interested in doing something to solve their [your?] problems, then you failed. Again.

The problem as I see it... Oh wait, I see so MANY problems that it is pretty much impossible to pick one as "the problem" to focus on.

One of YOUR major problems is that few people are noticing problems. I think that is related to a pretty large lack of concern on the side of the people "managing" this system. Indifference transmitted, indifference returned.

Actually, one of my pieces of evidence for the "benign neglect" hypothesis is in the 13 threads that are locked at the top of this "Number crunching" message board. Four of them have apparently had no activity in over 3,000 days?

I'm vaguely curious about the excuses at rosetta@home, but mostly my guess is that they feel they have such an abundance of computing resources that they don't care about wasting them, which creates an unnecessary conflict situation when I would prefer not to waste my own computing resources if they could be donated to some other project that actually appreciates them.

Case in point is the implicit complaint addressed in the "increasing the default run time" discussion. They say the "problem" is a waste of bandwidth at their end. Okay, so maybe they could stop sending me hundreds (more likely thousands) of work units that my machines can't process before their deadlines?

I have wasted a lot of time and energy trying to set the parameters to avoid such waste. In spite of my best efforts, right now this machine has received several screens of pending work units, but it's a relatively slow machine that only handles two at a time and can only finish about 6 per day, so I know that most of that downloaded data will be discarded. A couple of my machines run 4 at a time, and one of them will only store a few extra work cycles at a time, so sometimes it runs out and does nothing. My most powerful machine runs 8 at a time, but I mostly run it on weekends, so it "loses" a large chunk of its work.

Waste, waste, waste. The people running rosetta@home have convinced me they care very little. Or less.
#1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech)
ID: 78966 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Timo
Avatar

Send message
Joined: 9 Jan 12
Posts: 185
Credit: 45,644,940
RAC: 271
Message 78967 - Posted: 22 Oct 2015, 1:07:56 UTC - in response to Message 78966.  

... right now this machine has received several screens of pending work units, but it's a relatively slow machine that only handles two at a time and can only finish about 6 per day, so I know that most of that downloaded data will be discarded. A couple of my machines run 4 at a time..


Looking at your tasks it seems A LOT of them are 'ABORTED BY USER' - not sure why you're randomly aborting tasks?

If you feel you're having issues with the work queue being overly ambitious, it would help to let BOINC finish more tasks rather than abort them, and so get a better feel for how much work your machines can actually complete per week.

The work queue / forecasting of how much work to cache is actually a function of BOINC more than Rosetta@Home itself.

I have a bunch of different machines crunching for Rosetta and haven't experienced any of the problems you describe, so I can only guess that your local settings/outdated BOINC software, or something else specific to your setup is at fault.
ID: 78967 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Chilean
Avatar

Send message
Joined: 16 Oct 05
Posts: 711
Credit: 26,694,507
RAC: 0
Message 78969 - Posted: 22 Oct 2015, 5:00:17 UTC - in response to Message 78966.  

So your point is apparently that this website is complicated and buggy? Sorry, but I already noticed that. If you were attempting to motivate me to be interested in doing something to solve their [your?] problems, then you failed. Again.


No, I simply replicated the error you made. You deleted the /quote tag by mistake, thus it looked as if you wrote something you didn't. There is no bug. BBCode is widely used in forums, it isn't a "complicated" format to master.
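The missing-tag mistake described above can be caught before posting by counting opening and closing quote tags in the draft. A rough sketch, assuming the draft is plain text with `[quote]` / `[/quote]` BBCode tags; the function name `check_quotes` is hypothetical and not part of the forum software:

```shell
# Hypothetical check: compares the number of opening and closing quote tags
# in a post draft read from stdin and prints "balanced" or "unbalanced".
check_quotes() {
    awk '{
        line = tolower($0)
        # gsub returns the number of substitutions, i.e. the tag count
        opens  += gsub(/\[quote[^]]*\]/, "", line)
        closes += gsub(/\[\/quote\]/, "", line)
    } END { print ((opens == closes) ? "balanced" : "unbalanced") }'
}

# Usage:
# check_quotes < draft.txt
```

A draft that starts with `[quote][quote]` and has no closing tags, like the truncated post earlier in this thread, would be reported as unbalanced.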

Case in point is the implicit complaint addressed in the "increasing the default run time" discussion. They say the "problem" is a waste of bandwidth at their end. Okay, so maybe they could stop sending me hundreds (more likely thousands) of work units that my machines can't process before their deadlines?


This is you not setting your WU cache properly. The project doesn't "send you hundreds of WUs" out of the blue, YOUR machine is asking for hundreds of WUs. Although, I remember there being a quota to avoid sending "thousands" of WUs to machines not configured properly.

I have wasted a lot of time and energy trying to set the parameters to avoid such waste. In spite of my best efforts, right now this machine has received several screens of pending work units, but it's a relatively slow machine that only handles two at a time and can only finish about 6 per day, so I know that most of that downloaded data will be discarded. A couple of my machines run 4 at a time, and one of them will only store a few extra work cycles at a time, so sometimes it runs out and does nothing. My most powerful machine runs 8 at a time, but I mostly run it on weekends, so it "loses" a large chunk of its work.


This is also you not configuring your work cache/buffer accordingly. Increase the work cache in your BOINC manager on your 8-thread machine and decrease the cache on the 2-threaded machine. Reducing your runtime setting could help as well.
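As a back-of-the-envelope check on cache size, the number of tasks a buffer holds is roughly the machine's daily throughput times the number of days buffered. The sketch below uses the "about 6 per day" figure from the quoted post; the cache lengths are hypothetical, since the thread does not say what the actual settings were, and BOINC's own work-fetch estimate is more involved than this:

```shell
# Rough estimate of how many tasks a given work cache holds.
# Both inputs are supplied by you; this is not BOINC's scheduler logic.
cache_tasks() {
    # $1 = tasks the machine completes per day
    # $2 = days of work the client is set to buffer (may be fractional)
    awk -v per_day="$1" -v days="$2" 'BEGIN { print per_day * days }'
}

cache_tasks 6 2      # slow machine, hypothetical 2-day cache: prints 12
cache_tasks 6 0.5    # same machine, hypothetical half-day cache: prints 3
```

If the queue holds far more tasks than this estimate for the configured buffer, the client's runtime estimates or the buffer setting are worth a second look.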

And just like Timo mentioned, I too have a bunch of different machines running R@H and I've never had the problems you are describing.
ID: 78969 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote



©2024 University of Washington
https://www.bakerlab.org