Problems and Technical Issues with Rosetta@home

Message boards : Number crunching : Problems and Technical Issues with Rosetta@home

[DPC] BlueTooth76
Joined: 23 Mar 20
Posts: 4
Credit: 47,577,853
RAC: 0
Message 105131 - Posted: 22 Feb 2022, 14:42:25 UTC
Last modified: 22 Feb 2022, 14:51:29 UTC

Now that WCG is down for 2 months, I've moved back to Rosetta.

A lot of VirtualBox tasks get stuck for days and don't seem to end. They expire on the server side, the work is lost, and I get no points.
I don't want to babysit my computers...

Is there any fix for that? The other option is to disable VirtualBox work.
ID: 105131 · Rating: 0
Jim1348
Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 105132 - Posted: 22 Feb 2022, 14:59:07 UTC - in response to Message 105131.  
Last modified: 22 Feb 2022, 15:00:38 UTC

A lot of VirtualBox tasks get stuck for days and don't seem to end. They expire on the server side, the work is lost, and I get no points.
I don't want to babysit my computers...

Is there any fix for that? The other option is to disable VirtualBox work.

They are easier to spot if you install BoincTasks. It shows the CPU usage, and when you see it in red, that means the CPU is not doing much.
You can delete them in the first five minutes if you catch them.

Otherwise, the project has to fix it, if there is anyone at Rosetta at all.

(It would be nice if some software expert could automate the process.)
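A minimal sketch of how that automation might look, assuming the task list is available as plain "key: value" text (for example from `boinccmd --get_tasks`). The field names `elapsed task time` and `current CPU time`, the sample task names, and the 10% ratio are assumptions for illustration, not something confirmed in this thread; the 300-second minimum mirrors the "first five minutes" above.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    elapsed_s: float  # wall-clock seconds the task has been running
    cpu_s: float      # CPU seconds the task has consumed


def parse_tasks(text: str) -> list[Task]:
    """Parse 'key: value' task records; a 'name:' line opens a new record."""
    records: list[dict[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if ": " not in line:
            continue  # skip separators such as '1) -----------'
        key, value = line.split(": ", 1)
        if key == "name":
            records.append({})
        if records:
            records[-1][key] = value.strip()
    return [
        Task(
            name=r["name"],
            # Assumed field names; adjust them to match your client's output.
            elapsed_s=float(r.get("elapsed task time", 0)),
            cpu_s=float(r.get("current CPU time", 0)),
        )
        for r in records
    ]


def find_stalled(tasks, min_elapsed_s=300.0, max_cpu_ratio=0.1):
    """Return tasks that ran at least min_elapsed_s but used little CPU."""
    return [
        t for t in tasks
        if t.elapsed_s >= min_elapsed_s and t.cpu_s < max_cpu_ratio * t.elapsed_s
    ]


# Hypothetical sample input, not real client output.
SAMPLE = """\
1) -----------
   name: healthy_task_0
   elapsed task time: 3600.000000
   current CPU time: 3500.000000
2) -----------
   name: stuck_task_0
   elapsed task time: 3600.000000
   current CPU time: 12.000000
"""

stalled = find_stalled(parse_tasks(SAMPLE))
print([t.name for t in stalled])  # prints ['stuck_task_0']
```

The sketch only reports suspicious tasks. Whether to abort them automatically is a separate decision, since a task that is briefly idle would be flagged as well.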
ID: 105132 · Rating: 0
kotenok2000
Joined: 22 Feb 11
Posts: 272
Credit: 507,897
RAC: 334
Message 105133 - Posted: 22 Feb 2022, 15:01:44 UTC - in response to Message 105132.  

https://efmer.com/boinctasks/download-boinctasks/
ID: 105133 · Rating: 0
[DPC] BlueTooth76
Joined: 23 Mar 20
Posts: 4
Credit: 47,577,853
RAC: 0
Message 105135 - Posted: 22 Feb 2022, 15:24:32 UTC - in response to Message 105132.  

They are easier to spot if you install BoincTasks. It shows the CPU usage, and when you see it in red, that means the CPU is not doing much.
You can delete them in the first five minutes if you catch them.

Otherwise, the project has to fix it, if there is anyone at Rosetta at all.

(It would be nice if some software expert could automate the process.)


Thanks!

I'm used to looking at my rigs once a week, sometimes twice a month... Before the VB work, I ran Rosetta for a year without many issues.

I guess they'll have to fix it while I focus on non-VB work or another project....
ID: 105135 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 105140 - Posted: 22 Feb 2022, 19:41:57 UTC - in response to Message 105135.  

Thanks!

I'm used to looking at my rigs once a week, sometimes twice a month... Before the VB work, I ran Rosetta for a year without many issues.

I guess they'll have to fix it while I focus on non-VB work or another project....


Check your Vbox version and extension pack version. You may need to update to get it stable.

I forgot about WCG going down. I guess I need to disable that project for now.
You can try another Vbox project. I can't get it to be stable on my rig for some reason, but others have lots of success. Look up QuChemPedia.
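For the version check, a rough sketch that compares the VirtualBox version with the extension pack version. It assumes `VBoxManage --version` prints something like `6.1.32r149290` and that `VBoxManage list extpacks` includes a `Version:` line; the sample strings below are illustrative placeholders for those formats, not output captured from any machine in this thread.

```python
import re
import subprocess


def parse_vbox_version(output: str) -> str:
    """Extract 'major.minor.patch' from text such as '6.1.32r149290'."""
    match = re.search(r"\d+\.\d+\.\d+", output)
    if match is None:
        raise ValueError(f"no version found in {output!r}")
    return match.group(0)


def parse_extpack_version(output: str):
    """Return the first 'Version:' value from an extension pack listing, or None."""
    match = re.search(r"^Version:\s*(\d+\.\d+\.\d+)", output, re.MULTILINE)
    return match.group(1) if match else None


def versions_match(vbox_output: str, extpack_output: str) -> bool:
    """True when the extension pack version equals the VirtualBox version."""
    return parse_vbox_version(vbox_output) == parse_extpack_version(extpack_output)


def read_installed():
    """Query the local install; needs VBoxManage on PATH (not called here)."""
    vbox = subprocess.run(["VBoxManage", "--version"],
                          capture_output=True, text=True, check=True).stdout
    ext = subprocess.run(["VBoxManage", "list", "extpacks"],
                         capture_output=True, text=True, check=True).stdout
    return vbox, ext


# Illustrative sample text in the assumed formats.
SAMPLE_VBOX = "6.1.32r149290\n"
SAMPLE_EXT = (
    "Extension Packs: 1\n"
    "Pack no. 0:   Oracle VM VirtualBox Extension Pack\n"
    "Version:      6.1.32\n"
)

print(versions_match(SAMPLE_VBOX, SAMPLE_EXT))  # prints True
```

On a real machine, `versions_match(*read_installed())` would run the same comparison against the installed copy. A mismatch is only a hint that an update may be needed, not proof that it causes the stuck tasks.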
ID: 105140 · Rating: 0
[DPC] BlueTooth76
Joined: 23 Mar 20
Posts: 4
Credit: 47,577,853
RAC: 0
Message 105142 - Posted: 22 Feb 2022, 21:10:50 UTC - in response to Message 105140.  
Last modified: 22 Feb 2022, 21:11:25 UTC

Check your Vbox version and extension pack version. You may need to update to get it stable.

I forgot about WCG going down. I guess I need to disable that project for now.
You can try another Vbox project. I can't get it to be stable on my rig for some reason, but others have lots of success. Look up QuChemPedia.


I turned off VB work for now; it's not worth it. I just had to abort another 22 units that got stuck: a waste of CPU cycles and no points.
I will start looking for another project in the coming days, or just shut the machines down until WCG is back.
ID: 105142 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1725
Credit: 18,398,287
RAC: 19,677
Message 105147 - Posted: 23 Feb 2022, 6:21:43 UTC

Another batch of work released, and only some of these error out straight away. Not great, but still better than all of them doing it.
Grant
Darwin NT
ID: 105147 · Rating: 0
tullio
Joined: 10 May 20
Posts: 63
Credit: 630,125
RAC: 0
Message 105152 - Posted: 23 Feb 2022, 9:34:05 UTC
Last modified: 23 Feb 2022, 9:35:22 UTC

Five 4.20 tasks running, two failed immediately, one Rosetta Python task waiting to run.
Tullio
ID: 105152 · Rating: 0
Erich56
Joined: 11 Jan 16
Posts: 35
Credit: 1,437,503
RAC: 0
Message 105153 - Posted: 23 Feb 2022, 9:34:22 UTC

For the past 2 days I have tried Python tasks on several of my machines; the success is modest :-(
Too many tasks are still faulty (e.g. they keep running but use no CPU, which is a waste).
Since I cannot, as someone else wrote above, babysit my computers just to find out whether a newly started Python task is okay or not, I will abandon Rosetta for the time being. Which is too bad, but I have no other choice :-(
ID: 105153 · Rating: 0
Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 2,018
Message 105154 - Posted: 23 Feb 2022, 10:21:05 UTC - in response to Message 105153.  

You can disable VirtualBox work on the details page of each of your devices.
ID: 105154 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 2002
Credit: 9,783,459
RAC: 5,082
Message 105155 - Posted: 23 Feb 2022, 10:25:39 UTC - in response to Message 105147.  

Another batch of work released, and only some of these error out straight away. Not great, but still better than all of them doing it.


+1
It seems that "h3_3stub" WUs have problems:

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00007FF767908316 read attempt to address 0xFFFFFFFF


- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00007FF76B0A9D88


- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0000000000000000


Also, these WUs were NOT tested on Ralph.
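To compare crash reports like the ones above across many tasks, a small parsing sketch can help. It assumes every record follows the same "Reason: Access Violation (...) at address ..." layout as the three quoted lines; the function and field names are made up for illustration. The code 0xc0000005 is the Windows status code for an access violation.

```python
import re

# Matches the "Reason:" lines quoted above; the read-target part is optional.
RECORD = re.compile(
    r"Access Violation \((0x[0-9A-Fa-f]+)\) at address (0x[0-9A-Fa-f]+)"
    r"(?: read attempt to address (0x[0-9A-Fa-f]+))?"
)


def parse_records(log: str):
    """Return one dict per access-violation record found in the log text."""
    records = []
    for m in RECORD.finditer(log):
        code, fault, target = m.groups()
        records.append({
            "code": int(code, 16),
            "fault_address": int(fault, 16),
            "read_target": int(target, 16) if target else None,
        })
    return records


# The three records quoted in this post.
LOG = """\
Reason: Access Violation (0xc0000005) at address 0x00007FF767908316 read attempt to address 0xFFFFFFFF
Reason: Access Violation (0xc0000005) at address 0x00007FF76B0A9D88
Reason: Access Violation (0xc0000005) at address 0x0000000000000000
"""

records = parse_records(LOG)
for r in records:
    # A fault address of 0 may point to a call through a null pointer,
    # but that is a guess without a stack trace.
    note = "fault at address 0" if r["fault_address"] == 0 else ""
    print(hex(r["fault_address"]), note)
```

Grouping the parsed records by fault address would show whether different workunits crash at the same place, which is the kind of detail that might be worth passing on to the project.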
ID: 105155 · Rating: 0
kotenok2000
Joined: 22 Feb 11
Posts: 272
Credit: 507,897
RAC: 334
Message 105156 - Posted: 23 Feb 2022, 10:27:20 UTC - in response to Message 105155.  

Some PcrV8MER WUs crashed too.
ID: 105156 · Rating: 0
xii5ku
Joined: 29 Nov 16
Posts: 22
Credit: 13,889,918
RAC: 1,647
Message 105159 - Posted: 23 Feb 2022, 11:04:43 UTC - in response to Message 105131.  

BlueTooth76 wrote:
A lot of VirtualBox tasks get stuck for days and don't seem to end. They expire on the server side, the work is lost, and I get no points.
I don't want to babysit my computers...

Is there any fix for that? The other option is to disable VirtualBox work.
There is no fix, but there is a workaround. See the thread "Summary of issues with VirtualBox tasks", message 104802.
ID: 105159 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2141
Credit: 41,532,131
RAC: 10,607
Message 105166 - Posted: 23 Feb 2022, 14:32:10 UTC - in response to Message 105129.  

Rosetta has always been an experimental project imo.
Asking questions that have never been asked before, using tasks that have never been written before, with parameters whose limits may not be entirely obvious from the outset.
So if things go wrong, it should hardly be a surprise to anyone and no-one should get themselves worked up about it, especially when failures are a bigger problem for the project than they are for any one of us.
And that's the case here. How they chose to solve the problem is down to them, not us. Because they <can't> solve it and only users can in this instance.
Same as it ever was.

This is cutting edge science. But... they usually use Ralph first to test their ideas. This time they didn't. Such is life at the 'new' RAH.

It should, you're right, but it's never really worked. I've never bothered with Ralph.

Some people take the view there's no such thing as betatest software - you only need to look at the assurances you get from finished product, no guarantee it'll do what it's claimed to do.
It makes sense not to have any sense of entitlement as to the reliability of anything we get issued. That approach certainly saves time.

If we didn't have perpetual indignance on these message boards, the traffic would certainly be a lot less. From some accounts a lot more than others.
ID: 105166 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 2002
Credit: 9,783,459
RAC: 5,082
Message 105173 - Posted: 23 Feb 2022, 16:22:30 UTC - in response to Message 105129.  

This is cutting edge science. But...they usually use Ralph first to test their ideas. This time they didn't. Such is life at the 'new' RAH.


I also participate in Ralph, and there it's "not a problem to have problems".
But here in Rosetta I would like stable and tested work.
ID: 105173 · Rating: 0
[VENETO] boboviz
Joined: 1 Dec 05
Posts: 2002
Credit: 9,783,459
RAC: 5,082
Message 105174 - Posted: 23 Feb 2022, 16:28:21 UTC - in response to Message 105166.  

If we didn't have perpetual indignance on these message boards, the traffic would certainly be a lot less. From some accounts a lot more than others.


I don't know if you're referring to me, but it's not important.
I don't have a "perpetual indignance"; I often write about science, about new languages or CPU/GPU features, etc.
An occasional request for explanations is not, I think, "indignance".

About "message board traffic", I agree with you: the admins have vanished :-P
ID: 105174 · Rating: 0
kotenok2000
Joined: 22 Feb 11
Posts: 272
Credit: 507,897
RAC: 334
Message 105176 - Posted: 23 Feb 2022, 17:49:55 UTC - in response to Message 105174.  

When I changed the target CPU time from 1d12h to 2h and updated the project in BOINC Manager, workunits that were already in progress at 7 hours didn't change their target CPU time to 2 hours and didn't finish.
ID: 105176 · Rating: 0
Falconet
Joined: 9 Mar 09
Posts: 354
Credit: 1,276,393
RAC: 2,018
Message 105177 - Posted: 23 Feb 2022, 18:23:15 UTC - in response to Message 105176.  

When I changed the target CPU time from 1d12h to 2h and updated the project in BOINC Manager, workunits that were already in progress at 7 hours didn't change their target CPU time to 2 hours and didn't finish.



I recommend stopping them with LAIM (leave applications in memory) off, or exiting BOINC with the "stop all applications" box ticked.
ID: 105177 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 105178 - Posted: 23 Feb 2022, 19:31:48 UTC - in response to Message 105131.  

Now that WCG is down for 2 months, I've moved back to Rosetta.

A lot of VirtualBox tasks get stuck for days and don't seem to end. They expire on the server side, the work is lost, and I get no points.
I don't want to babysit my computers...

Is there any fix for that? The other option is to disable VirtualBox work.



Which Vbox versions have you been using?
What other Vbox projects are you running?
Which computer are you referring to?
What is the load on your system?
How much memory is being used in terms of %? Do you get any error messages?
Do they go into "waiting to run" status a lot or are there any "need more memory" errors?
Do your machines run 24/7?
If not, before you close BOINC, do you suspend your work (and is the keep in memory option checked?) and then use the shut down client option before you exit?

All these things can be a factor.
ID: 105178 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 105179 - Posted: 23 Feb 2022, 19:34:02 UTC - in response to Message 105173.  

This is cutting edge science. But...they usually use Ralph first to test their ideas. This time they didn't. Such is life at the 'new' RAH.


I also participate in Ralph, and there it's "not a problem to have problems".
But here in Rosetta I would like stable and tested work.



Rosetta is supposed to have stable and tested tasks, with only minor errors that can be corrected quickly upon notification. That a team member just dumped the tasks on Rosetta and did not respond to any messages shows the lack of commitment from the team towards their non-neural-network base.
ID: 105179 · Rating: 0


©2024 University of Washington
https://www.bakerlab.org