Posts by robertmiles

1) Message boards : Number crunching : Rosetta 4.0+ (Message 91143)
Posted 21 Sep 2019 by Profile robertmiles
Post:
I was referring to the post by James W. His machine is 4 GB.
https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=3710630


Ah, ok.
I notice that, despite the "out of memory" error, I have over 6 GB of free RAM while crunching, so it does not seem to be a problem of insufficient memory.

Check if you've limited how much memory BOINC is allowed to use.

Also check if the operating system (probably Windows) is running in 32-bit mode, where it can't reach more than 4 GB of memory even if more is present.

Also, is the application program running in 32-bit mode?
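
One rough way to start answering the 32-bit questions above is a short Python check. This is only a sketch: it reports the machine type the operating system advertises and the pointer width of the Python interpreter running the script, not of the Rosetta application itself. The function name pointer_bits is made up for this example.

```python
import platform
import struct


def pointer_bits() -> int:
    """Return the pointer width, in bits, of the running Python interpreter."""
    # struct.calcsize("P") is the size in bytes of a C pointer (void *).
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    # platform.machine() reports the hardware type, e.g. "AMD64" or "x86_64".
    print("Machine type:", platform.machine())
    print("Interpreter pointer width:", pointer_bits(), "bits")
    # A 32-bit process is limited to a 4 GB address space at most.
    if pointer_bits() == 32:
        print("This process cannot address more than 4 GB.")
```

A 32-bit result here only says that this particular interpreter is 32-bit; the version of the Rosetta application that was run is listed on the task's result page.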
2) Message boards : Number crunching : Rosetta 4.0+ (Message 90999)
Posted 7 Aug 2019 by Profile robertmiles
Post:
1084416855

ERROR: ERROR: FragmentIO: could not open file 00001.200.9mers
ERROR:: Exit from: ..\..\..\src\core\fragment\FragmentIO.cc line: 233
BOINC:: Error reading and gzipping output datafile: default.out
14:35:27 (15216): called boinc_finish(1)


For this task, it looks like input file 00001.200.9mers was not sent along with the task and therefore could not be opened.
The other recently reported errors look more like the program found an error in one of the input files, and therefore could not continue properly past that point.


And, still....1086460738
No one can send the correct input file?

Rather few people have write access to the server, which they need in order to add that file to the list of input files that need to be sent along with these tasks. Even fewer have a copy of that file or even know what it should contain.

The real problem is that too few admins are reading this thread, seeing the problem reports, and knowing who to contact in order to get them fixed.
3) Message boards : Number crunching : Rosetta 4.0+ (Message 90990)
Posted 6 Aug 2019 by Profile robertmiles
Post:
Do we need to start threatening to move to other BOINC projects if we don't get more admin attention?
4) Message boards : Number crunching : Rosetta 4.0+ (Message 90987)
Posted 6 Aug 2019 by Profile robertmiles
Post:
We haven't seen admins in this thread since... I don't remember. I don't even know if they are reading this thread.

They hang out over here:
http://boinc.bakerlab.org/rosetta/forum_thread.php?id=1000

Not since January, though!
5) Message boards : Number crunching : Rosetta 4.0+ (Message 90981)
Posted 5 Aug 2019 by Profile robertmiles
Post:
1084416855

ERROR: ERROR: FragmentIO: could not open file 00001.200.9mers
ERROR:: Exit from: ..\..\..\src\core\fragment\FragmentIO.cc line: 233
BOINC:: Error reading and gzipping output datafile: default.out
14:35:27 (15216): called boinc_finish(1)


Again.
Please, is there no one who can debug this many errors (cannot open files, out of memory, etc.)?

The last few errors reported all look like there was an error in the input files, not an error in the application program or an error specific to your computer.

Usually the first line containing ERROR: gives the error, and any others are only the result of the program not being able to continue properly after the error. Sometimes, additional lines of error messages are needed to pin down why the error occurred, especially if the error was in the program.

For this task, it looks like input file 00001.200.9mers was not sent along with the task and therefore could not be opened.

The other recently reported errors look more like the program found an error in one of the input files, and therefore could not continue properly past that point.
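
The advice above about reading the first line containing ERROR: can be sketched in a few lines of Python. This is a minimal illustration, not a project tool: the function name first_error_line is hypothetical, and the sample text is the task output quoted in this post.

```python
from typing import Optional


def first_error_line(stderr_text: str) -> Optional[str]:
    """Return the first line containing 'ERROR:', or None if there is none."""
    for line in stderr_text.splitlines():
        if "ERROR:" in line:
            return line.strip()
    return None


# Sample taken from the task output quoted in this post.
sample = r"""ERROR: ERROR: FragmentIO: could not open file 00001.200.9mers
ERROR:: Exit from: ..\..\..\src\core\fragment\FragmentIO.cc line: 233
BOINC:: Error reading and gzipping output datafile: default.out
14:35:27 (15216): called boinc_finish(1)"""

print(first_error_line(sample))
```

For this sample the function returns the FragmentIO line, which names the missing input file; the later lines only describe how the program shut down afterwards.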
6) Message boards : Number crunching : Rosetta 4.0+ (Message 90951)
Posted 26 Jul 2019 by Profile robertmiles
Post:
Hi, I'm new on this forum but want to ask the developers why I have been getting so many errors in the last few days. Most of them are on 4.08 under Linux; this is my host to check. Most of the WUs have this error, but only about 10% of all WUs are invalid. A few days ago only a few, under 1%, were invalid.

<core_client_version>7.9.3</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>
command: ../../projects/boinc.bakerlab.org_rosetta/rosetta_4.08_x86_64-pc-linux-gnu @luosc_3His_18res-3_symm_c.200.10_0001_HDZ_3_442_0011_0001.flags -nstruct 10000 -cpu_run_time 28800 -watchdog -boinc:max_nstruct 600 -checkpoint_interval 120 -mute all -database minirosetta_database -in::file::zip minirosetta_database.zip -boinc::watchdog -run::rng mt19937 -constant_seed -jran 2078213

ERROR: Error in simple_cycpep_predict app! The imported native pose has a different number of residues than the sequence provided.
ERROR:: Exit from: src/protocols/cyclic_peptide_predict/SimpleCycpepPredictApplication.cc line: 1700
BACKTRACE:
[0x5a57db6]
[0xaea70f]
[0xaea9c0]
[0xaff25b]
[0x412a94]
[0x5ff3ccc]
[0x6108e7]
BOINC:: Error reading and gzipping output datafile: default.out
19:55:39 (4350): called boinc_finish(1)

</stderr_txt>
]]>

What is wrong? Is it my host, or some error in the app or in these WUs? Thank you for your help, Dzordzik.

Look at the first line starting with ERROR:. It appears to show that the program found an error in an input file. If so, there are often many other workunits sharing at least one of the same input files, so this error could also affect many other tasks, mostly for other users.

The server knows it has already sent your computer that input file, so it is likely to also send the same computer many other tasks that use that input file.
7) Message boards : Number crunching : Rosetta 4.0+ (Message 90950)
Posted 26 Jul 2019 by Profile robertmiles
Post:
BOINC:: Error reading and gzipping output datafile: default.out

I see the same thing on two of them.
https://boinc.bakerlab.org/rosetta/result.php?resultid=1084260100
https://boinc.bakerlab.org/rosetta/result.php?resultid=1084270467

And two more on another machine. It seems to be just in the last two days.

To me, that looks like an error resulting from some other error listed first, which prevented the default.out file from being produced.

You may need to show any other errors listed first, to show what caused this error.
8) Message boards : Number crunching : Rosetta 4.0+ (Message 90933)
Posted 24 Jul 2019 by Profile robertmiles
Post:
This may be due to a bad input file, rather than a program flaw.

Rosetta v4.07 windows_x86_64


Task 1084021530

http://boinc.bakerlab.org/rosetta/result.php?resultid=1084021530

ERROR: Unrecognized residue: HDZ


Task 1084021690

http://boinc.bakerlab.org/rosetta/result.php?resultid=1084021690

ERROR: Unrecognized residue: HDZ


At least, it didn't waste much CPU time.
9) Message boards : Rosetta@home Science : An idea for future research - proteins that bind toxic minerals (Message 90785)
Posted 24 May 2019 by Profile robertmiles
Post:
At least one plant absorbs large amounts of a toxic mineral. You can identify proteins binding toxic minerals, in order to make more plants that can help remove toxic minerals.


Transport processes allow arsenic hyperaccumulation

https://asknature.org/strategy/transport-processes-allow-arsenic-hyperaccumulation/

Arsenic hyperaccumulation in the Chinese brake fern (Pteris vittata) deters grasshopper (Schistocerca americana) herbivory.

https://www.ncbi.nlm.nih.gov/pubmed/17587384

Growth and arsenic uptake by Chinese brake fern inoculated with an arbuscular mycorrhizal fungus

https://www.sciencedirect.com/science/article/abs/pii/S0098847209000586
10) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 90514)
Posted 15 Mar 2019 by Profile robertmiles
Post:
I am seeing all too many errors from work units at the end of their processing cycle (after 12 hours of processing) and would like some advice as to whether there are any changes I can make to stop them.

Examples can be seen in WUs 1062692421 and 1062687362 but basically they show exit status 139 (unknown error) with signal 11 and a message saying that default.out.gz already exists with size -1.

Any suggestions would be gratefully received.

You might check if decreasing the time that workunits can run on your computers to ten hours has any effect on this.
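
The exit status 139 together with signal 11 in the quoted report is consistent with the common Unix shell convention of reporting a process killed by signal N as status 128 + N; signal 11 is SIGSEGV (a segmentation fault) on Linux. The sketch below decodes a status under that convention. The name decode_exit_status is invented here, and whether BOINC derives its status exactly this way is an assumption.

```python
import signal


def decode_exit_status(status: int) -> str:
    """Describe an exit status using the shell's 128 + signal convention."""
    if status > 128:
        signum = status - 128
        try:
            # signal.Signals maps a signal number to its symbolic name.
            return signal.Signals(signum).name
        except ValueError:
            return f"unknown signal {signum}"
    return f"normal exit with code {status}"


print(decode_exit_status(139))
```

A segmentation fault at the very end of a run would point at a crash in the application rather than a timeout, which is why shortening the run time is only something to try, not a known fix.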
11) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 89956)
Posted 3 Dec 2018 by Profile robertmiles
Post:
I am getting a message of "Abandoned by Project" on too many workunits. With 8 hour workunits this is unacceptable and since I compute in the Gridcoin pool I cannot change my settings.


Could this mean that your computer is so slow that two other computers have finished the workunit before yours does?

Does your computer finish workunits before their deadlines?
12) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 89837)
Posted 5 Nov 2018 by Profile robertmiles
Post:
Dang it, I'm still getting the same error.

I tried to find the file app_control.cpp, but couldn't find it - is this a file I can edit?

Thanks!!

Franko

[snip]

Files with the .cpp extension are usually C++ source files, which can be edited. However, doing so is not useful unless:

1. You have a copy of the file. Most BOINC downloads do not include the source files - you have to know where to find the source files and download the entire package of source files.

2. You know enough C++ to make useful edits.

3. You have all of the compilers installed to compile the entire program for your operating system.

4. You have the instructions to compile all source files needed, and then link them into a new version of the program.

5. You know how to substitute the new version of the program for the old version.
13) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 89782)
Posted 26 Oct 2018 by Profile robertmiles
Post:
If you look at the top ten computers https://boinc.bakerlab.org/rosetta/top_hosts.php?sort_by=expavg_credit&offset=0, the first 4 places are occupied by [DPC] Nifhack with AMD:

[snip]

Looks like the main limitation of CPUs with this many processors is not the number of processors, but the speed of the memory that all the processors in the same package share.

If so, some of these processors could even be beyond the point where deciding which processor is allowed to make the next memory access takes up enough of the run time to cause a significant slowdown.

You might also look up the cache size inside each of these CPUs - competing for cache space could also cause a significant slowdown.
14) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 89053)
Posted 4 Jun 2018 by Profile robertmiles
Post:
I agree with RJS5. If your acct was just started on 27th, how could you have crunched any WUs at all? You MUST have a duplicate or separate acct as well. Do you have other computers/devices also crunching Rosetta?

Did you notice the 27th of what month?
15) Message boards : Number crunching : Rosetta 4.0+ (Message 88621)
Posted 3 Apr 2018 by Profile robertmiles
Post:
I've had 8 tasks cancelled by server today, when they were, on the average, about half finished. Are you planning to issue any credit for the CPU time they used, or should I think of reducing the share of CPU time I offer to Rosetta@Home?

Most of them used the 32-bit version of 4.07, even though they were running under 64-bit Windows and BOINC. The computer has 32 GB of memory, so the 64-bit version of 4.07 should have been able to give them all enough memory.
16) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 88320)
Posted 19 Feb 2018 by Profile robertmiles
Post:
Are you sure that 1 hour is even an allowed value for CPU time? I haven't checked lately, but 3 hours used to be the lowest allowed value.

3hrs used to be the default, but 1hr was (and still is) the minimum allowed.

I agree the 1hr option should be removed. And with so many multi-core processors out there, the minimum should probably be 3hrs. 2hrs is also a current option.


Due to the tasks running for 6 hours and then having computation errors, it's better to have the 1 hour task. Then once it reaches 2-3 hours it's known to be bad and can be manually aborted instead of wasting a full 6 hours.

Depends on what caused the computation error.

Each Rosetta@Home task is composed of, usually, 100 subtasks. The first of these only checks that the computer is handling such tasks properly; if it is the only one completed, the results of the task are useless.

The other 99 are either from 99 different starting points, or 99 iterations from one starting point. Only as many are actually done as will fit into the time allowed.

If the cause of the computation error is in only one starting point, it's probably best to run as many subtasks as will complete before reaching this starting point, since that many subtasks are not leading to a computation error. I don't think the project has mentioned whether they can recover output from all the properly completed subtasks if a later subtask gives a computation error.

On the other hand, if the cause of the computation error is in an input file shared by all 99 of these subtasks, it is best for the first one to detect the error and stop the whole task.

Note that allowing longer runs reduces the amount of communications time required to get input files from the server to your computer and get the output files back.
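
The trade-off described above can be put into rough numbers. The sketch below is only an illustration with made-up figures: the per-subtask time, the transfer time, and both function names are assumptions, not values published by the project.

```python
from typing import Sequence


def completed_subtasks(durations: Sequence[float], budget: float) -> int:
    """Count how many subtasks finish, in order, within the time budget."""
    elapsed = 0.0
    done = 0
    for d in durations:
        if elapsed + d > budget:
            break
        elapsed += d
        done += 1
    return done


def transfer_overhead(transfer_seconds: float, run_seconds: float) -> float:
    """Fraction of total wall time spent moving files rather than computing."""
    return transfer_seconds / (transfer_seconds + run_seconds)


# Hypothetical numbers: 100 subtasks of 1000 s each, 120 s of transfers per task.
subtasks = [1000.0] * 100
for hours in (1, 3, 6):
    budget = hours * 3600
    n = completed_subtasks(subtasks, budget)
    share = transfer_overhead(120.0, budget)
    print(f"{hours} h: {n} subtasks, transfer overhead {share:.1%}")
```

With these assumed figures, a longer time allowance completes more subtasks per download and spends a smaller share of the total time on transfers, at the cost of wasting more time when a task fails late.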
17) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 88285)
Posted 14 Feb 2018 by Profile robertmiles
Post:
Are you sure that 1 hour is even an allowed value for CPU time? I haven't checked lately, but 3 hours used to be the lowest allowed value.
18) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 87903)
Posted 14 Dec 2017 by Profile robertmiles
Post:
Thanks for pointing out the issues with really long jobs to us. Some of these jobs are intended to predict the structures of cyclic peptides, and invoke a few different filters during their runs. For some peptides, passing all these filters is a very low-probability event, and therefore no structure makes it through even after hours of running. We are looking further into it, and will update you with more information very soon.

Some ideas you may want to think about:

Grant some credit for work that does not pass the filters, even if not as much as for work that does pass the filters.

Check for an end-of-run every time a starting point reaches a final decision on whether it produces good results, not just if the work on that starting point has passed the filters.

This means that users get some credit even if all they do is remove at least one starting point from the list of starting points that would otherwise be sent to at least one more user.
19) Message boards : Number crunching : Minirosetta 3.73-3.78 (Message 87863)
Posted 8 Dec 2017 by Profile robertmiles
Post:
If you have an invalid pointer in your sw it's your problem, not a cpu problem.
I'm missing the word "because" in that sentence.

I just saw a similar problem but under Windows 10 and on an Intel CPU.

7H2LD3_51C703_fold_and_dock_SAVE_ALL_OUT_538615_1685
http://boinc.bakerlab.org/workunit.php?wuid=864346673

Rosetta Mini 3.78

64-bit Windows 10
Intel i7-5950X, 32 GB, SSD

Perhaps someone could check if it's the same problem, but under conditions much less likely to have the problem become visible.
20) Message boards : Number crunching : Problems and Technical Issues with Rosetta@home (Message 87797)
Posted 3 Dec 2017 by Profile robertmiles
Post:
New member here.

BOINC is not transferring data for Rosetta@home to my laptop. Communication has been continually deferred all night long.

Other Boinc projects are working fine.

Any ideas?

You might mention some details that could affect transfers. For example, what version of BOINC are you running? What operating system are you using (for example, Windows 10)? 32-bit or 64-bit? What CPU does your computer use?

How much memory does your computer have? Rosetta@Home requires more than most other BOINC projects.

Have Rosetta@Home tasks been failing on your computer lately? If enough have, expect rather few tasks for the next few days.

Note that if you're running Windows 10 and have installed the recent Fall Update, expect some things not to work until you turn your computer off for at least one minute, then turn it on and let it reboot. The usual restart without power off is NOT adequate just after installing this update.

What share of time on your computer did you assign Rosetta@Home? If 0, then expect Rosetta@Home tasks only when BOINC has tried and failed to get tasks from all other BOINC projects it is participating in.





©2019 University of Washington
http://www.bakerlab.org