Problems and Technical Issues with Rosetta@home

Profile Jo

Send message
Joined: 16 May 20
Posts: 10
Credit: 3,813,274
RAC: 0
Message 100205 - Posted: 27 Dec 2020, 21:19:59 UTC - in response to Message 100201.  

Sure, people are just donating hardware, time, power bills and more for free

Yes.
Except the 2nd sentence of your 1st post in these forums was about a loss of "good will" which you'd never established in the first place.
And a few messages later, in response to entirely plain, honest and accurate replies, if disappointing, you were trying to claim it as a loss of community.
At which point you immediately detach and send indignant PMs around. So much for it being about community and not entirely about you.
Which you confirm by sending huffy forum posts even after you detached. Using the word "we" of all things.

Really, you're not making yourself look any more justified. Rather, it's self-indulgent and pathetic.


Really? Let us look at my first post:

"
Are you saying that no one can remote into and reboot the servers? If so, that is not good. There is a _lot_ of computing and good will lost if there are millions of jobs available and no one can download them.
"

Here is the thing: if people see no new work and a lot of jobs in the queue, they will go elsewhere. Never did I use "I"; I used "no one". You see, the latter points at something plural. So how you make this about me is inside your own head.

Then you reply:


In the past, some holidays have been out of all and any contact.
That may not be the case this time, but it's a possibility.
Point being, it'll be fixed when it's fixed and complaining repeatedly in the forums has rarely ever solved it


If no one complains or in any other way raises concern, how do obvious errors get fixed? Why should one just give up and let errors continue forever? How does that improve anything?

Then I try yet another effort at a constructive reply:

I see why the feedback feels like complaints, but it is meant as a heads-up that people want to do work and that idle CPUs are being wasted. People think that this is too important not to be mentioned because people care, and they get a bit bummed out when the reply can be interpreted as "whatever..." I guess no one is expecting people to leave their families to fix this, but it is a disappointment that there is no one who can spend ten minutes in a remote session to reboot the servers, just in case that helps.

To build a strong and lasting community there must be an "us". There must be some feeling of "we are all in on this together". Right now that feeling is diminishing, and that is not good.

I do hope some effort will be made both to stop this current bug from reappearing and to find a solution where low-effort incident resolution can be done.


Then you reply

- "Idle CPUs" is 100% always a user failure - or "bug" as you mystifyingly want to call it
- If "people think this is too important not to be mentioned" then they'll have missed it already having been mentioned a dozen times
- "ten minutes" - yeah, ok...
- "community", "us" - <puke>


Now, how is this mocking from you helpful? How does this make people feel welcome and useful? Try to explain why an idle CPU is _always_ a user failure.

All you achieved is making me not want to contribute. You made me feel anything but welcome. You behave in a way that is just rude and sad. People in general like communities. People in general like an "us". People in general like to contribute to a common good. This is one of the most important ways humans differ from all other animals. And no, this is not about _me_. It is about Rosetta, the lack of a community, and the lack of simple low-effort moves that could be made to get even more people on board.

So tell me, do you think Rosetta could use more processing power donated to the project? Do you care if Rosetta gains or loses computing power?
ID: 100205 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
mikey
Avatar

Send message
Joined: 5 Jan 06
Posts: 1894
Credit: 8,769,229
RAC: 5,729
Message 100206 - Posted: 27 Dec 2020, 21:43:04 UTC - in response to Message 100205.  
Last modified: 27 Dec 2020, 21:45:34 UTC

deleted as it's not worth my time.
ID: 100206 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
SoulReaper

Send message
Joined: 23 Nov 20
Posts: 2
Credit: 323,612
RAC: 0
Message 100212 - Posted: 28 Dec 2020, 3:29:30 UTC - in response to Message 100205.  

All you achieved is making me not wanting to contribute.

Ignore Sid Celery, he's a tool.

My only response to the Celery guy would be that people are entitled to their opinions, just like you are. And they have a right to state it. You're such a fellow who's attacking everyone just because their opinion doesn't match yours. This is a place to discuss and you can't shut down people okay? So grow up and perhaps try using a neutral tone.

No wonder there's no community when people like him roam around in these unmoderated forums. Check out F@H forums to see how they treat people, even for noob questions. And here we have people sitting on high horses and swinging maces for no reason smh. It's so funny lmao.
ID: 100212 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile robertmiles

Send message
Joined: 16 Jun 08
Posts: 1225
Credit: 13,869,469
RAC: 2,739
Message 100213 - Posted: 28 Dec 2020, 5:32:30 UTC - in response to Message 100177.  

People deserve a proper break and I'm not inclined to demand they're dragged in to suit, what are essentially, hobbyists.
No need to go in, just do a remote login & restart. If it fixes it, good. If not, then it can wait till they do go in.

Restart what? I looked at the server status and saw no signs that the server has anything at a point where just restarting a program would do much to the rate at which workunits are created. All of the programs seem to be running already, ready to create new workunits from the outputs of previous workunits once those outputs are returned if the original instructions to the server had been to do a series of workunits from a smaller number of starting points. The server needs more starting points to create workunits faster, and it cannot create those starting points. People with the right knowledge can create starting points remotely using the Robetta interface, but there don't seem to be enough of those people who aren't tied up with Christmas activities.

How many of the people demanding more workunits have been nice enough that they should get any, rather than having the server send them links to pictures of LARGE lumps of coal?
ID: 100213 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile robertmiles

Send message
Joined: 16 Jun 08
Posts: 1225
Credit: 13,869,469
RAC: 2,739
Message 100214 - Posted: 28 Dec 2020, 5:40:46 UTC - in response to Message 100205.  

[snip]
Here is the thing: if people see no new work and a lot of jobs in the queue, they will go elsewhere. Never did I use "I"; I used "no one". You see, the latter points at something plural. So how you make this about me is inside your own head.

[snip]
Where do you see lots of jobs in queue? When I looked at the server status, I saw lots of jobs in progress (i.e. already sent to computers that haven't finished and returned them), but no sign of a queue.
ID: 100214 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Bryn Mawr

Send message
Joined: 26 Dec 18
Posts: 376
Credit: 10,747,574
RAC: 5,986
Message 100215 - Posted: 28 Dec 2020, 6:19:09 UTC - in response to Message 100214.  


Where do you see lots of jobs in queue? When I looked at the server status, I saw lots of jobs in progress (i.e. already sent to computers that haven't finished and returned them), but no sign of a queue.


On the project home page it shows 11,144,555 work units available. These are the output of the generator and the feeder then ensures that there are always about 30,000 in the pot that is used by the downloader.

The feeder is shown as running but is presumably not working correctly.
ID: 100215 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Grant (SSSF)

Send message
Joined: 28 Mar 20
Posts: 1494
Credit: 14,706,505
RAC: 15,631
Message 100216 - Posted: 28 Dec 2020, 6:25:08 UTC - in response to Message 100213.  
Last modified: 28 Dec 2020, 6:33:28 UTC

People deserve a proper break and I'm not inclined to demand they're dragged in to suit, what are essentially, hobbyists.
No need to go in, just do a remote login & restart. If it fixes it, good. If not, then it can wait till they do go in.

Restart what? I looked at the server status and saw no signs that the server has anything at a point where just restarting a program would do much to the rate at which workunits are created. All of the programs seem to be running already,
That's part of the problem.
It shows it's working, but nothing is happening.

It would be easier to see if all the information was on the one page, but it isn't.
On the main page it shows 11,144,555 queued jobs. On the Server status page it shows Tasks ready to send is 0.

The rah_make_work1 server takes the Queued jobs and turns them in to Tasks we can actually process. Once done, they go to the Tasks ready to send queue, then get sent out when people request work.

Tasks in progress has gone from 5.5 million to less than 65k, even with all those millions of Queued jobs. That shows the system is broken. And one of the easiest & quickest fixes when a computer system has issues is to restart it. And for most server systems, management is done remotely. Hence no need for someone to go in, they just need to log in & give things a kick.
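
The reasoning above can be written down as a rough check. This is only a sketch: the function name and the rule itself are assumptions made for illustration, not anything the project runs, and the "in progress" figures are the approximate ones quoted in this post.

```python
def pipeline_looks_stalled(queued_jobs: int,
                           ready_to_send: int,
                           in_progress_before: int,
                           in_progress_now: int) -> bool:
    """Heuristic only: jobs are queued, nothing is ready to send,
    and the number of tasks in progress has fallen."""
    return (queued_jobs > 0
            and ready_to_send == 0
            and in_progress_now < in_progress_before)


# Figures quoted in this thread; the two "in progress" values are approximate.
print(pipeline_looks_stalled(11_144_555, 0, 5_500_000, 65_000))
```

A healthy system with millions of queued jobs would normally keep "Tasks ready to send" above zero, which is why the combination of numbers points at the step between the two queues.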


Edit- it would also help if the server status was updated more often than every few hours- and that's when it is being updated.

As near as I can tell it's presently 06:33 UTC, the time of the main page server status is 4:07:26 UTC, and on the Server status page it's 5:40:03 UTC.
Grant
Darwin NT
ID: 100216 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Grant (SSSF)

Send message
Joined: 28 Mar 20
Posts: 1494
Credit: 14,706,505
RAC: 15,631
Message 100217 - Posted: 28 Dec 2020, 6:37:14 UTC - in response to Message 100215.  

The feeder is shown as running but is presumably not working correctly.
For all we know it is- but until there is work to actually send out it doesn't matter if it is or isn't working.
rah_make_work1 also shows as working, but as there is no work being produced you can be sure that isn't really the case.
Grant
Darwin NT
ID: 100217 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile [VENETO] boboviz

Send message
Joined: 1 Dec 05
Posts: 1866
Credit: 8,186,159
RAC: 6,319
Message 100218 - Posted: 28 Dec 2020, 8:26:53 UTC - in response to Message 100214.  

Where do you see lots of jobs in queue?


Home page of Rosetta:
Total queued jobs: 11,148,825

On server status page:
Tasks ready to send: 0

ID: 100218 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Bryn Mawr

Send message
Joined: 26 Dec 18
Posts: 376
Credit: 10,747,574
RAC: 5,986
Message 100219 - Posted: 28 Dec 2020, 8:33:03 UTC - in response to Message 100217.  

The feeder is shown as running but is presumably not working correctly.
For all we know it is- but until there is work to actually send out it doesn't matter if it is or isn't working.
rah_make_work1 also shows as working, but as there is no work being produced you can be sure that isn't really the case.


OK, I’d assumed that the feeder was the unit that fed work from the output of the generator to the ready to be sent queue.

As the saying goes, assume makes an ass of u and me
ID: 100219 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile [VENETO] boboviz

Send message
Joined: 1 Dec 05
Posts: 1866
Credit: 8,186,159
RAC: 6,319
Message 100220 - Posted: 28 Dec 2020, 8:42:36 UTC - in response to Message 100216.  
Last modified: 28 Dec 2020, 8:44:15 UTC

That's part of the problem.
It shows it's working, but nothing is happening.
It would be easier to see if all the information was on the one page, but it isn't.


There are some daemons to queue jobs, like
./bin/xadd
sudo a2enmod cgi
or
./bin/update_versions
and others
Maybe there are some problems in these steps....
ID: 100220 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile [VENETO] boboviz

Send message
Joined: 1 Dec 05
Posts: 1866
Credit: 8,186,159
RAC: 6,319
Message 100221 - Posted: 28 Dec 2020, 8:48:58 UTC - in response to Message 100217.  

For all we know it is- but until there is work to actually send out it doesn't matter if it is or isn't working.

./bin/stop
./bin/start

With these 2 commands you can restart ALL daemons. 5 seconds of work and you can see if the problem is here...
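
For anyone who prefers to script it, here is a minimal sketch of the same stop/start sequence in Python. The project directory path is hypothetical (the real location is not given in this thread), and by default the function only reports what it would run rather than touching anything.

```python
import subprocess
from pathlib import Path

# Hypothetical BOINC project directory; the real path is not known here.
PROJECT_DIR = Path("/path/to/project")


def restart_daemons(project_dir: Path = PROJECT_DIR, dry_run: bool = True) -> list:
    """Run bin/stop, then bin/start, from the project directory.

    With dry_run=True nothing is executed; the commands are only returned.
    """
    commands = [str(project_dir / "bin" / step) for step in ("stop", "start")]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError if a script exits non-zero.
            subprocess.run([command], cwd=project_dir, check=True)
    return commands


print(restart_daemons(dry_run=True))
```

Whether a restart would actually cure the problem is a separate question; this only shows how small the step itself is.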
ID: 100221 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Brian Nixon

Send message
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 0
Message 100223 - Posted: 28 Dec 2020, 11:09:47 UTC - in response to Message 100216.  

it would also help if the server status was updated more often than every few hours- and that's when it is being updated
AFAICT it is pretty regular: the front page gets updated every four hours (around 4, 8 and 12 Pacific time), and on the status page the Server status and Computing status get updated every ≈62 minutes (separately, at different offsets).
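
As a quick sanity check of those intervals against the timestamps quoted above (06:33 UTC now, front page at 4:07:26 UTC, status page at 5:40:03 UTC), the ages can be worked out like this. The seconds on the "now" value are assumed to be zero, since only hours and minutes were given.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S"


def age(now: str, last_update: str) -> timedelta:
    """Time elapsed between two same-day clock readings."""
    return datetime.strptime(now, FMT) - datetime.strptime(last_update, FMT)


now = "06:33:00"  # seconds assumed; the post only says 06:33 UTC
front_page_age = age(now, "04:07:26")
status_page_age = age(now, "05:40:03")

# Compare each age with the update interval described above.
print(front_page_age, front_page_age < timedelta(hours=4))
print(status_page_age, status_page_age < timedelta(minutes=62))
```

Both ages come out shorter than their respective intervals, so the timestamps are consistent with the pages being refreshed on schedule.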
ID: 100223 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
mikey
Avatar

Send message
Joined: 5 Jan 06
Posts: 1894
Credit: 8,769,229
RAC: 5,729
Message 100224 - Posted: 28 Dec 2020, 12:10:16 UTC - in response to Message 100221.  

For all we know it is- but until there is work to actually send out it doesn't matter if it is or isn't working.


./bin/stop
./bin/start

With these 2 commands you can restart ALL daemons. 5 seconds of work and you can see if the problem is here...


But you are assuming someone is there, or at home, to do the work... there isn't! They do not work on weekends unless they have in-lab stuff to do, or server updates, or something that can't be done during the week. They take the weekends off because for them it's a 9-to-5, five-day-a-week job, not a hobby as it is for us. Sure, they enjoy their work, but so did you when you were working. Did YOU go in on weekends to do stuff, knowing you weren't going to get paid for it and knowing that if things got worse no one was there to help fix it? I know I didn't. I took every day off to its fullest and spent it with my family and friends, and then come the day I had to go back to work I started worrying about work stuff again. At one of my jobs they even had it set up so that most people could NOT go in to work, because they disabled the access badges on weekends; you couldn't even get in the front door, let alone do any work. In fact, when I had to work late one night they had to have a security guard babysit me so I could get to the different places I needed to be, as his badge worked and mine didn't.
ID: 100224 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile [VENETO] boboviz

Send message
Joined: 1 Dec 05
Posts: 1866
Credit: 8,186,159
RAC: 6,319
Message 100226 - Posted: 28 Dec 2020, 12:31:51 UTC - in response to Message 100224.  

But you are assuming someone is there, or at home, to do the work... there isn't! They do not work on weekends unless they have in-lab stuff to do, or server updates, or something that can't be done during the week. They take the weekends off because for them it's a 9-to-5, five-day-a-week job, not a hobby as it is for us. Sure, they enjoy their work, but so did you when you were working. Did YOU go in on weekends to do stuff, knowing you weren't going to get paid for it and knowing that if things got worse no one was there to help fix it?

Hey, hey, I'm NOT angry with the admins. I KNOW they have their vacations, and rightly so.
Mine are only considerations about the BOINC server and its daemons.
As I wrote some days ago, I hope someone can find time these days to look at the problems rather than waiting until the middle of January....
ID: 100226 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
kn4mwd

Send message
Joined: 15 Apr 20
Posts: 3
Credit: 32,767
RAC: 0
Message 100229 - Posted: 28 Dec 2020, 20:05:22 UTC

12/28/2020 12:07:54 AM | Rosetta@home | Sending scheduler request: To fetch work.
12/28/2020 12:07:54 AM | Rosetta@home | Requesting new tasks for CPU
12/28/2020 12:07:56 AM | Rosetta@home | Scheduler request completed: got 0 new tasks
12/28/2020 12:07:56 AM | Rosetta@home | No tasks sent
12/28/2020 12:07:56 AM | Rosetta@home | Project requested delay of 31 seconds
12/28/2020 9:37:22 AM | | Suspending computation - CPU is busy
12/28/2020 9:37:32 AM | | Resuming computation
12/28/2020 2:32:40 PM | Rosetta@home | Sending scheduler request: To fetch work.
12/28/2020 2:32:40 PM | Rosetta@home | Requesting new tasks for CPU
12/28/2020 2:32:42 PM | Rosetta@home | Scheduler request completed: got 0 new tasks
12/28/2020 2:32:42 PM | Rosetta@home | No tasks sent
12/28/2020 2:32:42 PM | Rosetta@home | Project requested delay of 31 seconds
12/28/2020 2:49:55 PM | | Suspending computation - CPU is busy
12/28/2020 2:50:05 PM | | Resuming computation
what gives?
ID: 100229 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Brian Nixon

Send message
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 0
Message 100230 - Posted: 28 Dec 2020, 22:24:48 UTC - in response to Message 100229.  

12/28/2020 2:32:42 PM | Rosetta@home | Scheduler request completed: got 0 new tasks
what gives?
There is a problem on the server such that it is not sending out any new tasks to anybody. It is probably temporary, but might not be rectified until after the holiday.
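
If you want to confirm this from your own event log rather than by eye, a small script can count the scheduler replies that came back empty. This is a sketch only: the regular expression is written against the line format shown in the log excerpt above and may not match other BOINC client versions or locales.

```python
import re

# Sample lines copied from the log excerpt quoted in this thread.
LOG = """\
12/28/2020 12:07:54 AM | Rosetta@home | Sending scheduler request: To fetch work.
12/28/2020 12:07:56 AM | Rosetta@home | Scheduler request completed: got 0 new tasks
12/28/2020 2:32:40 PM | Rosetta@home | Sending scheduler request: To fetch work.
12/28/2020 2:32:42 PM | Rosetta@home | Scheduler request completed: got 0 new tasks
"""

PATTERN = re.compile(
    r"^(?P<when>.+?) \| (?P<project>.*?) \| "
    r"Scheduler request completed: got (?P<count>\d+) new tasks$"
)


def tasks_received(log_text: str) -> list:
    """Return (timestamp, project, task_count) for each completed scheduler request."""
    results = []
    for line in log_text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            results.append((match["when"], match["project"], int(match["count"])))
    return results


replies = tasks_received(LOG)
empty = [reply for reply in replies if reply[2] == 0]
print(f"{len(empty)} of {len(replies)} scheduler replies contained no tasks")
```

If every reply shows zero tasks while the client is otherwise healthy, the cause is on the server side, as described above.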
ID: 100230 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Profile Greg_BE
Avatar

Send message
Joined: 30 May 06
Posts: 5664
Credit: 5,711,666
RAC: 1,996
Message 100232 - Posted: 28 Dec 2020, 23:46:49 UTC - in response to Message 100230.  

Still 0

12/28/2020 2:32:42 PM | Rosetta@home | Scheduler request completed: got 0 new tasks
what gives?
There is a problem on the server such that it is not sending out any new tasks to anybody. It is probably temporary, but might not be rectified until after the holiday.
ID: 100232 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
Brian Nixon

Send message
Joined: 12 Apr 20
Posts: 293
Credit: 8,432,366
RAC: 0
Message 100233 - Posted: 28 Dec 2020, 23:59:38 UTC - in response to Message 100232.  

Still 0
Still holiday
ID: 100233 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote
mikey
Avatar

Send message
Joined: 5 Jan 06
Posts: 1894
Credit: 8,769,229
RAC: 5,729
Message 100234 - Posted: 29 Dec 2020, 1:43:26 UTC - in response to Message 100233.  

Still 0
Still holiday


Maybe not even until after the 1st, i.e. the 4th, since the 1st is a Friday. Lots of other fish in the sea: catch what you can elsewhere and keep checking back here so you can get some when they have some.
ID: 100234 · Rating: 0 · rate: Rate + / Rate - Report as offensive    Reply Quote

©2024 University of Washington
https://www.bakerlab.org