Posts by Markus Elfring

1) Message boards : Number crunching : Checking VM hypervisor communication failures for BOINC applications (Message 103921)
Posted 27 Dec 2021 by Markus Elfring
Post:

  • What do you think about gaining further insight into the program “/var/lib/boinc/projects/boinc.bakerlab.org_rosetta/vboxwrapper_26198_x86_64-pc-linux-gnu”?
  • How should the software components evolve according to information like the following?


/var/lib/boinc/slots/0/stderr.txt:
“…

2021-12-27 13:33:32 (3837): Detected: VirtualBox VboxManage Interface (Version: 6.1.30)
2021-12-27 13:33:38 (3837): Error in host info for VM: -1041038848
Command:
VBoxManage -q list hostinfo 
Output:
VBoxManage: error: Failed to create the VirtualBox object!
VBoxManage: error: Code NS_ERROR_SOCKET_FAIL (0xC1F30200) - IPC daemon socket error (extended info not available)
VBoxManage: error: Most likely, the VirtualBox COM server is not running or failed to start.

2021-12-27 13:33:38 (3837): WARNING: Communication with VM Hypervisor failed.

…”
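
One detail in the excerpt above can be checked by arithmetic: the status value -1041038848 reported by the wrapper is the signed 32-bit view of the result code 0xC1F30200 (NS_ERROR_SOCKET_FAIL) printed by VBoxManage. The following is a minimal sketch of that conversion, together with a small probe that re-runs the command from the log. The function names and the returned labels are hypothetical and are not part of BOINC or VirtualBox. The outcome may also depend on which user account runs the probe.

```python
import shutil
import subprocess


def to_unsigned32(status: int) -> int:
    """Reinterpret a signed 32-bit status value as an unsigned one."""
    return status & 0xFFFFFFFF


def classify_vboxmanage_output(output: str) -> str:
    """Return a coarse label for VBoxManage output (labels are hypothetical)."""
    if "NS_ERROR_SOCKET_FAIL" in output:
        return "ipc-socket-failure"
    if "VBoxManage: error:" in output:
        return "other-error"
    return "ok"


def probe_hostinfo() -> str:
    """Run the command shown in the log, if VBoxManage is on the PATH."""
    if shutil.which("VBoxManage") is None:
        return "vboxmanage-not-found"
    proc = subprocess.run(
        ["VBoxManage", "-q", "list", "hostinfo"],
        capture_output=True,
        text=True,
        check=False,  # a failing exit status is an expected outcome here
    )
    return classify_vboxmanage_output(proc.stdout + proc.stderr)


# The wrapper's status value and VBoxManage's result code are the same number.
print(hex(to_unsigned32(-1041038848)))
```

Calling `probe_hostinfo()` under the account that runs the BOINC client would show whether the same IPC socket error can be reproduced outside the wrapper.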
2) Message boards : Number crunching : Checking VM hypervisor communication failures for BOINC applications (Message 103914)
Posted 26 Dec 2021 by Markus Elfring
Post:
I would like to share some computation resources for selected scientific research.
So I am trying once more to run an application like “rosetta python projects 1.03 (vbox64)” together with the software “BOINC 7.19.0” on an openSUSE Tumbleweed system.

However, I ran into the status message “Moved: Communication with VM Hypervisor failed.” for tasks such as the following, which gave me the impression that the desired data processing is not working as expected at the moment.

  • aaai-SAR_pp-mNMABU-AGLY-ACBEN2_pp_8_2756113_1
  • aaqb-SAR-mALA-AMACBEN4-mACPenC12C_pp_5_2805265_1
  • boinc_cages_IL_2728657_84362
  • boinc_cages_IL_2728657_84423



How can the remaining open issues be resolved here?

3) Message boards : Number crunching : Side effects from computation errors on other applications? (Message 73655)
Posted 19 Aug 2012 by Markus Elfring
Post:
I saw few situations where R@H caused errors in other applications.

I find it generally difficult to correlate Rosetta application error codes with my observations of unpleasant software behaviour. I hope that it will become easier to narrow down the actual causes of the errors.
4) Message boards : Number crunching : Side effects from computation errors on other applications? (Message 73571)
Posted 4 Aug 2012 by Markus Elfring
Post:
I observe a large number of computation errors in the results from Rosetta applications on my PC system. I also notice that when such failures appear in my Microsoft Windows desktop environment, other programs such as games or internet browsers become likely to behave in unexpected ways as well, with hangs or crashes.
Have any other users run into such side effects, where failing data processing in one application affects other software?
5) Message boards : Number crunching : Client Errors (Message 72454)
Posted 5 Mar 2012 by Markus Elfring
Post:
[...], and without exception, every unit results in "client error."

Are you interested in any improvements for a topic like "Statistics for computation errors"?
6) Message boards : Number crunching : Looking into Rosetta's source code? (Message 72142)
Posted 15 Jan 2012 by Markus Elfring
Post:
... - have you thought of external "consulting" from the volunteering programmers from the BOINC or Free/Libre/Open Source Software community?

I am volunteering for some other applications already.

I do realize there can be many secrets and fears of being copied involved

I hope that such fears can be reduced.

- but additional IT resources could be very potent: both in generating new and higher quality, reviewed code as well as higher reliability and recognition within our communities.

I would also appreciate it if Rosetta's applications could benefit from such effects.

These are understandable concerns which we sometimes have a hard time answering.

Do you get any answers on these issues from the project leaders?

I remember seeing on this forum a very brief remark of a volunteer who saw some actual code and I thought it was a positive experience for both sides.

Would you like to improve the experience a bit more in this direction?
7) Message boards : Number crunching : Looking into Rosetta's source code? (Message 72141)
Posted 15 Jan 2012 by Markus Elfring
Post:
I've read elsewhere that they charge for a commercial version of the source code.

Yes. - This university is also interested in the commercialisation of "its assets".

That appears likely to provide a major part of their income, so do you expect them to make it free?

Support services and education can also be sold for free software, can't they?

Would you like to benefit from more freedom in this science and software domain?
8) Message boards : Number crunching : Looking into Rosetta's source code? (Message 72029)
Posted 9 Jan 2012 by Markus Elfring
Post:
You could, in principle, get a copy of the base code through the UW - as you know.

Jennifer Owens informed me that she cannot help me with the requested source code access so far. Who on your team can decide to grant a look into the source files in use?

Would you like to turn your offer into free software?
9) Message boards : Number crunching : Looking into Rosetta's source code? (Message 71835)
Posted 20 Dec 2011 by Markus Elfring
Post:
If you're serious and committed about volunteering to do testing/debugging of the Rosetta@home code, it might be worth emailing David Baker to volunteer in such a capacity, but I can't make any guarantees that anything will come of it.

Have you got a more direct contact to him?

Are you interested in further clarification of access ways to the original source files that drive software like the application "Rosetta Mini 3.19" currently?
10) Message boards : Number crunching : Statistics for computation errors? (Message 71759)
Posted 6 Dec 2011 by Markus Elfring
Post:
Markus, it sounds like you are mixing the ideas of the project team (which needs to gather information across the whole of the project) with an individual user (who needs information about their own machines and potential problems they may be having).

That is partly the case. - It seems that the processing on my computer was affected by a poor implementation of task selection. It is hard to find a useful pattern in the variation of error reasons.
A few other users also gave feedback (in the forum here) on unexpected program behaviour in Rosetta's applications. I guess that a couple of users would like to see further explanations of the observed failure rates.

Each BOINC project also tends to have a variety of types of work, each of which might warrant its own bucket of statistics.

I guess that Rosetta researchers and software developers can become overwhelmed by the sheer number of computation errors. Various work results are sent back by a potentially growing number of users and hosts.

On the other hand, some symptoms and messages and outcomes are common to general causes, such as memory exceptions, or common to specific Operating Systems.

Would you like to publish such details in a public report or issue tracker?

If some logic to analyze the output in general terms were incorporated into BOINC, all of the projects could benefit from such a thing.

Are you aware of any approaches to improve the software infrastructure for automatic analysis of computation errors?

Is anybody besides me interested in a kind of public computation health indicator in addition to the existing statistics views?
11) Message boards : Number crunching : Looking into Rosetta's source code? (Message 71745)
Posted 4 Dec 2011 by Markus Elfring
Post:
The code that's being run on Rosetta@home corresponds to a fairly recent snapshot of the main Rosetta development branch.

How many software developers have got direct access to the original source files for Rosetta's applications?

Would you like to consider further ways to support easier cooperation and improvements for a higher contribution probability?
12) Message boards : Number crunching : Statistics for computation errors? (Message 71636)
Posted 24 Nov 2011 by Markus Elfring
Post:
I can see in the task list for my PC system that a couple of computations have the client state "Compute error". I can look into each of them through the results web display. But I do not find this user interface as convenient for finding the corresponding error reasons as I imagine it could be.

The BOINC software has got an infrastructure to generate some statistics. Now I am looking for tools which can visualise the error distribution in an improved way to increase the chances for fixing involved open issues.

Is any automatic analysis performed on the returned exit codes within work units?
Is an automatic categorisation performed for computation failures so that an efficient drill-down into interesting issues would be supported?

How often do you notice error reasons like "Out Of Memory (C++ Exception)" and "Access Violation" at the moment?
13) Message boards : Number crunching : Switch off error message dialogues for background processes (Message 71626)
Posted 20 Nov 2011 by Markus Elfring
Post:
The application "Rosetta Mini 3.17" crashes so often on my current computer that I get the impression that successful computations have become a minority on this system. I also get notifications in a dialogue for events like invalid memory accesses. This disturbs me (especially in my Microsoft Windows desktop environment).
I would appreciate it if such application pop-ups could be switched off so that error reporting for the usual computations would be performed in the background only.
14) Message boards : Number crunching : Looking into Rosetta's source code? (Message 71614)
Posted 15 Nov 2011 by Markus Elfring
Post:
The code that's being run on Rosetta@home corresponds to a fairly recent snapshot of the main Rosetta development branch.

Thanks for your clarification.

I believe that there are provisions for interested third parties to get access to the development source code as something like a "consultant" (...).

I appreciate your explanation for the current situation.

Are there any chances to improve access to the program sources actually in use for interested free software developers like me?

Do the involved universities care for "Open Access" (with implementation documentation and source files)?
15) Message boards : Number crunching : Looking into Rosetta's source code? (Message 71613)
Posted 15 Nov 2011 by Markus Elfring
Post:
Probably an easier course of action is to post whatever information about the crash you have here to the forums.

I assume that it will be very hard to find the actual causes of the errors when so many computations fail. I imagine that automatic categorisation of computation failures would be needed to concentrate on interesting open issues, wouldn't it?

Hopefully one of the other posters will recognize it as an existing issue and offer you a workaround, or perhaps one of the developers will be able to diagnose and fix it.

I would be glad if somebody recognizes useful patterns.

Issue candidates:
1. Task 463437031: Compute error
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Out Of Memory (C++ Exception)

2. Task 462767468: Compute error
SIGBUS: bus error

3. Task 462746389: Validate error

4. Task 463638378: Compute error
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation
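
The kind of automatic categorisation asked about above could start as small as the following sketch, which groups the four issue candidates by their reported reason. It assumes that the excerpts keep the line layout shown in this post. The regular expressions and function names are hypothetical and are not taken from BOINC or Rosetta.

```python
import re
from collections import Counter

# The issue candidates from this post, copied as plain text.
SAMPLE = """\
1. Task 463437031: Compute error
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Out Of Memory (C++ Exception)

2. Task 462767468: Compute error
SIGBUS: bus error

3. Task 462746389: Validate error

4. Task 463638378: Compute error
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation
"""

HEADER = re.compile(r"^\d+\.\s+Task\s+(\d+):\s+(.+)$")
REASON = re.compile(r"^Reason:\s+(.+)$")
SIGNAL = re.compile(r"^(SIG[A-Z]+):")


def categorise(text):
    """Map each task id to (outcome, detail); detail is None if no reason is given."""
    tasks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            tasks[current] = (header.group(2), None)
            continue
        if current is None:
            continue  # ignore text before the first task header
        detail = REASON.match(line) or SIGNAL.match(line)
        if detail:
            tasks[current] = (tasks[current][0], detail.group(1))
    return tasks


def tally(tasks):
    """Count tasks per detail, falling back to the outcome when no detail exists."""
    return Counter(detail or outcome for outcome, detail in tasks.values())


result = categorise(SAMPLE)
for key, count in tally(result).most_common():
    print(count, key)
```

With only four tasks every bucket holds a single entry. The grouping becomes useful once many returned results are fed through the same functions.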
16) Message boards : Number crunching : Looking into Rosetta's source code? (Message 71610)
Posted 15 Nov 2011 by Markus Elfring
Post:
That might not help you exactly, as the version which is on Rosetta@home isn't the same as the released version.

I would appreciate it if I could look into the source files for exactly the revision currently in use, through a (distributed) content management system. What are the chances of such a wish being fulfilled?
17) Message boards : Number crunching : Looking into Rosetta's source code? (Message 71604)
Posted 14 Nov 2011 by Markus Elfring
Post:
I am surprised how often the application "Rosetta Mini 3.17" crashed on my computer recently.
Is it possible to inspect corresponding source code for the affected executable files?






©2024 University of Washington
https://www.bakerlab.org