3.43 is causing pop-ups

Message boards : Number crunching : 3.43 is causing pop-ups

Polian
Joined: 21 Sep 05
Posts: 152
Credit: 10,141,266
RAC: 0
Message 74452 - Posted: 16 Nov 2012, 21:22:13 UTC - in response to Message 74450.  

So far, I have seen the problem only under 64-bit Windows 7 with BOINC 7.0.28.


Same here -- BOINC 7.0.28, 64-bit Windows 7 (SP1), Minirosetta 3.43. Pops up 12 windows (Core i7-3930K CPU, 12 virtual cores) each of which looks like what you'd see if you had the screen-saver running, none of which are doing anything other than taking a core away from other BOINC projects.

It seems to have started with the Minirosetta 3.43 update...

L.



Why not abort these tasks as has been reported as the fix on the front page and all over these forums?

KWSN - Roger the Shrubber
Joined: 16 Sep 07
Posts: 2
Credit: 6,610,363
RAC: 3,937
Message 74453 - Posted: 16 Nov 2012, 22:10:24 UTC

Doing it to me, too.

Mad_Max
Joined: 31 Dec 09
Posts: 192
Credit: 13,944,596
RAC: 11,648
Message 74456 - Posted: 17 Nov 2012, 5:11:17 UTC - in response to Message 74436.  


This would definitely save disk space. I don't know if it's an option with standard BOINC but I'll look into it and I'll also see if we might be able to implement this somehow.

thanks for pointing this out!



It seems a standard way to do this exists:
http://boinc.berkeley.edu/trac/wiki/PhysicalFileManagement
http://boinc.berkeley.edu/trac/wiki/BoincFiles
BOINC allows apps to write and read data directly from the project folder. This is recommended in situations like the R@H database (Your application uses a large number of files, and you supply these as a single archive that is unpacked by your application).
So you need to find a way to unpack it in the project folder only once (at each app update) and to delete it after the app version (and the corresponding database revision) becomes obsolete, since the BOINC client does not manage such files automatically. The deletion is preferable but optional.

http://boinc.berkeley.edu/trac/wiki/DeleteFile - the standard way to delete old files on the BOINC client by command from the server

If you do not know (or cannot find) an "elegant" way to unpack the database only once per app version, a simple workaround is to unpack the DB at WU startup (as it works now), but into the project folder instead of the slot folder (where files are deleted automatically when the WU finishes), and to check before unpacking whether the database folder already exists. If it exists, skip the unpack. The database folder would need to be renamed to include the DB revision (like minirosetta_database_rev52077) to avoid mixing up files at an app version change (when two app versions and database revisions are stored in the same folder for a short time).
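
The workaround described above could be sketched roughly as follows. This is a minimal Python illustration, not Rosetta@home's or BOINC's actual code. The function names ensure_database and remove_obsolete are hypothetical, as is the assumption that the database ships as a zip archive. The staging-folder step is also an addition for the example, so that tasks starting at the same time on several cores never see a half-unpacked folder. Only the folder naming scheme (minirosetta_database_rev52077) comes from the post.

```python
import os
import shutil
import tempfile
import zipfile
from pathlib import Path


def ensure_database(project_dir: Path, archive: Path, revision: int) -> Path:
    """Unpack `archive` into a revision-named folder once; reuse it afterwards.

    Hypothetical helper: names and archive format are assumptions.
    """
    target = project_dir / f"minirosetta_database_rev{revision}"
    if target.is_dir():
        return target  # already unpacked by an earlier work unit, skip

    # Unpack into a private staging folder first, so an interrupted
    # unpack is never mistaken for a complete database folder.
    staging = Path(tempfile.mkdtemp(prefix="unpack_tmp_", dir=project_dir))
    try:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(staging)
        try:
            os.rename(staging, target)  # publish the finished folder
        except OSError:
            # Another task may have published the folder first.
            if not target.is_dir():
                raise
    finally:
        # No-op after a successful rename; cleans up otherwise.
        shutil.rmtree(staging, ignore_errors=True)
    return target


def remove_obsolete(project_dir: Path, keep_revision: int) -> None:
    """Optional cleanup: delete database folders of other revisions."""
    keep = f"minirosetta_database_rev{keep_revision}"
    for path in project_dir.glob("minirosetta_database_rev*"):
        if path.is_dir() and path.name != keep:
            shutil.rmtree(path, ignore_errors=True)
```

A work unit would call ensure_database at startup and read from the returned folder. remove_obsolete should only run once no task of the older app version is still using its folder, which is why the post treats deletion as optional.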

Jonathan Brier
Joined: 1 Dec 05
Posts: 12
Credit: 1,335,896
RAC: 645
Message 74458 - Posted: 17 Nov 2012, 17:05:25 UTC - in response to Message 74447.  

Could you guys possibly send abort commands, from the server, for 3.43 jobs? There are many computers that aren't monitored 24x7 by people.



Chris Miles, a graduate student in the lab, has been doing our application updates. He's the one to thank for getting a fix up as soon as we identified the issues (with help from everyone here!).

Thanks!


Yes, you have to be patient and ideally let the boinc client take care of getting new jobs. However, you should manually kill and abort 3.43 jobs on Windows 64 bit platforms.




The run time limit on one of our test machines was less than 3 days (depends on the client benchmarks) so these jobs should flush out eventually.


I agree with KSMooney about the server task abort. Setting up a process that ensures only quality work units are in the pipeline is essential. Removal of bad work units and applications should occur upon their detection. Letting units just "flush out" is not something a nontechnical audience handles well, and not removing these issues causes the perception of volunteer computing to suffer.

At GridRepublic and Charity Engine we receive complaints and many support tickets when issues are left to flush through projects. Taking advantage of the server cancel mechanism is important for guaranteeing user experience once a problem is detected. Leaving bad units in place makes our brands suffer and lowers the number of potential devices available to Rosetta@home and other projects, as it can damage the perception of the overall system.

Misbehaving work units are one reason why Charity Engine removed Rosetta@home as a back-fill research project. We need projects that ensure only high quality units run on the volunteer machines and that any misbehaving units are removed at the first sign of an issue. There were instances where work units were not exiting properly on machines and were taking up too much RAM... we never did figure out the source of the issue.
GridRepublic - bringing BOINC mainstream: http://www.gridrepublic.org

GridRepublic Fan Page: http://www.facebook.com/GridRepublic

Progress Thru Processors Facebook: http://www.facebook.com/progressthruprocessors

BadThad
Joined: 8 Nov 05
Posts: 30
Credit: 59,316,669
RAC: 14,666
Message 74459 - Posted: 17 Nov 2012, 18:00:10 UTC

Got up this morning to multiple windows on the desktop. I repaired BOINC and reset the project. Now I can't get work. Houston....we have a problem!

Wayne Mattox
Joined: 12 Apr 09
Posts: 1
Credit: 4,880,139
RAC: 0
Message 74460 - Posted: 17 Nov 2012, 18:26:45 UTC

Since no one has stated it: after deleting the 3.43 jobs, 3.45 seems to be running fine. Thanks.

Jacob Klein
Joined: 3 Jul 07
Posts: 15
Credit: 4,354,633
RAC: 6,589
Message 74466 - Posted: 18 Nov 2012, 14:46:02 UTC - in response to Message 74447.  
Last modified: 18 Nov 2012, 14:46:51 UTC

David,

Could you please look into why Ralph is still sending out busted 3.43 tasks?
http://ralph.bakerlab.org/forum_thread.php?id=539#5615

I've lost days worth of crunching because of this.

Also, please consider implementation and execution of a server-side abort.
People are uninstalling BOINC, repairing BOINC, writing bug reports, wasting tons of CPU cycles across all projects, all for a problem with your projects that you guys could have already cleaned up if you really wanted to.

This is ridiculous.

TPCBF
Joined: 29 Nov 10
Posts: 108
Credit: 4,043,926
RAC: 402
Message 74470 - Posted: 19 Nov 2012, 1:02:33 UTC - in response to Message 74452.  

Why not abort these tasks as has been reported as the fix on the front page and all over these forums?
Because that isn't really a "solution". You simply forget that not all machines crunching for Rosetta@Home are easily accessible. While I could do just that, aborting the batch of faulty WUs on one of my own hosts, it created a mess on two remote systems on which I had, in the past, permission to run it. Not any longer, as those users now felt interrupted and I had to remove BOINC/R@H from those systems once I got on-site...

Sorry, if for whatever reason such faulty WUs make it out "into the wild", there needs to be a way to have a server side abort on those, not requiring user interaction. WCG can do this just fine...

Ralf

Mad_Max
Joined: 31 Dec 09
Posts: 192
Credit: 13,944,596
RAC: 11,648
Message 74471 - Posted: 19 Nov 2012, 3:26:33 UTC - in response to Message 74466.  

David,

Could you please look into why Ralph is still sending out busted 3.43 tasks?
http://ralph.bakerlab.org/forum_thread.php?id=539#5615


I am not David, but I can explain.
Ralph is not actually sending NEW WUs to the 3.43 app. These are resends after a failure on another computer. (Up to 500 such WUs are left.)
I also received a batch of WUs for version 3.43. Checking them, I confirmed that they are all resubmitted WUs that hit computers with Windows x64 as the first wingman.

Mad_Max
Joined: 31 Dec 09
Posts: 192
Credit: 13,944,596
RAC: 11,648
Message 74472 - Posted: 19 Nov 2012, 3:55:34 UTC


And I join the request for remote (server-side) removal of the 3.43 WUs. In our team alone, more than 50 machines are connected to R@H without access (direct or remote) to the client manager. And the errors with pop-up windows continue as the corresponding WUs move up in the queue.
For some computers (with a large task cache in the BOINC settings) the pop-up problem is even still to come, as they are only now completing the last WUs for version 3.41.

I do not know if there is a standard way to remotely (server-side) cancel WUs already issued to clients...
But even if there is not, you can just replace the file that is causing the problems (https://boinc.bakerlab.org/rosetta/download/minirosetta_3.43_windows_x86_64.exe) with a correct app file and instruct the BOINC client (all connected to R@H, or just those with a Windows x64 OS) to delete the local copy (http://boinc.berkeley.edu/trac/wiki/DeleteFile).
That will force BOINC to re-download the correct app file.

Bart
Joined: 8 Oct 11
Posts: 2
Credit: 476,125
RAC: 0
Message 74473 - Posted: 19 Nov 2012, 7:05:10 UTC

So is 3.4.5
I have suspended and no new jobs.

Stephen Miller
Joined: 18 Sep 05
Posts: 13
Credit: 16,294,215
RAC: 0
Message 74483 - Posted: 20 Nov 2012, 7:33:48 UTC

A BOINC project message would have been a good way to let everyone (who's running a current BOINC, anyway) know about the announcement that's on the main page.

I updated BOINC about the same time Rosetta went into stupid mode and thought it was a new "feature" of BOINC. Today I see that it's a 3.43 issue. Thanks to Microsoft training, we have learned to live with bugs and issues until they are resolved. I'm embarrassed.

Never in the field of human conflict was so much owed by so many to so few - Winston Churchill

BOINC version: Never in the field of distributed computing was so much wasted by so many for so few credit.

Doug_Hood
Joined: 15 Dec 05
Posts: 2
Credit: 3,416,526
RAC: 0
Message 74607 - Posted: 28 Nov 2012, 13:55:17 UTC

I am just going to say this to any of the moderators/Rosetta staff who might still be listening:

If anything like this 3.43 fiasco happens again, I will drop Rosetta. Here is a big F'N hint: BOINC has a message feature. USE IT. If I don't see a message in BOINC the next time that something is wrong with Rosetta, I will delete it immediately from my project list and move on. Period.

Mad_Max
Joined: 31 Dec 09
Posts: 192
Credit: 13,944,596
RAC: 11,648
Message 74612 - Posted: 28 Nov 2012, 17:06:26 UTC

Stephen Miller
Doug_Hood


If you mean the Messages tab in the BOINC client (Notices), that would be nice. But they (R@H staff) cannot use this feature, because the version of the BOINC server software currently used by the R@H project does not support it. They would first have to update (replace) all the server software with a new version, and the project team does not want to do this (as it is quite difficult and could potentially cause other problems).

Gary Charpentier
Joined: 2 Oct 07
Posts: 3
Credit: 3,186,985
RAC: 4,004
Message 74654 - Posted: 3 Dec 2012, 20:12:45 UTC - in response to Message 74612.  

If you mean the Messages tab in the BOINC client (Notices), that would be nice. But they (R@H staff) cannot use this feature, because the version of the BOINC server software currently used by the R@H project does not support it. They would first have to update (replace) all the server software with a new version, and the project team does not want to do this (as it is quite difficult and could potentially cause other problems).

What are they, undergrads? Too busy with coffee and donuts? Do they like security holes? Do they want people mad at them? Can't afford backup storage?


dcdc
Joined: 3 Nov 05
Posts: 1658
Credit: 75,261,974
RAC: 49,471
Message 74658 - Posted: 4 Dec 2012, 10:21:52 UTC - in response to Message 74654.  

If you mean the Messages tab in the BOINC client (Notices), that would be nice. But they (R@H staff) cannot use this feature, because the version of the BOINC server software currently used by the R@H project does not support it. They would first have to update (replace) all the server software with a new version, and the project team does not want to do this (as it is quite difficult and could potentially cause other problems).

What are they, undergrads? Too busy with coffee and donuts? Do they like security holes? Do they want people mad at them? Can't afford backup storage?


Are you suggesting that you think the server upgrade is trivial? If so, what are you basing that on?

TJ
Joined: 29 Mar 09
Posts: 127
Credit: 4,799,890
RAC: 0
Message 74664 - Posted: 4 Dec 2012, 15:49:02 UTC

Indeed, the BOINC server code at Rosie is outdated. You can see that in several things. You cannot see all your different PCs on the project. You cannot sort tasks per application or per type. The link in BOINC that should bring you to your "home account" doesn't do that. And indeed the message tab is not working.
For active crunchers Rosetta can indeed be a pain, with not the best support. However, I'm here for the science it does. I have a lot of criticisms of the project as well, but I'll stick with it.
Greetings,
TJ.

Link
Joined: 4 May 07
Posts: 295
Credit: 359,460
RAC: 0
Message 74668 - Posted: 4 Dec 2012, 19:39:38 UTC - in response to Message 74664.  

You cannot see all your different PCs on the project.

Hmm... I can see all my PCs when I click here, should work for everyone.
.

TJ
Joined: 29 Mar 09
Posts: 127
Credit: 4,799,890
RAC: 0
Message 74670 - Posted: 5 Dec 2012, 15:15:25 UTC - in response to Message 74668.  

You cannot see all your different PCs on the project.

Hmm... I can see all my PCs when I click here, should work for everyone.


Of course we can, but that is not what I meant. With the new BOINC server code, you can see on the "all tasks" page all the PCs running for the project.
Greetings,
TJ.

Link
Joined: 4 May 07
Posts: 295
Credit: 359,460
RAC: 0
Message 74672 - Posted: 5 Dec 2012, 17:35:52 UTC - in response to Message 74670.  

You cannot see all your different PCs on the project.

Hmm... I can see all my PCs when I click here, should work for everyone.


Of course we can, but that is not what I meant. With the new BOINC server code, you can see on the "all tasks" page all the PCs running for the project.

THAT page you mean... yeah, there I don't see them either.
.



©2020 University of Washington
https://www.bakerlab.org