Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
I get an instant 'compute error' on all my work units for the last few days now. No problems with other projects from WCG. I assume you mean this computer. See this thread. In short: you might need to downgrade to BOINC v6.12.34, which you use on your other computers (and as you see they don't have such issues). |
Snags Send message Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
I get an instant 'compute error' on all my work units for the last few days now. No problems with other projects from WCG. There might be a simpler solution than downgrading. He's getting -185 errors with "couldn't start Input file minirosetta_3.31_windows_intelx86.exe missing or invalid: -123: -123". Perhaps simply rebooting the computer and/or resetting Rosetta will do the trick. Best, Snags |
Polian Send message Joined: 21 Sep 05 Posts: 152 Credit: 10,141,266 RAC: 0 |
I get an instant 'compute error' on all my work units for the last few days now. No problems with other projects from WCG. I agree. Maybe the application became corrupted somehow and/or failed to download? Try resetting the project. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
More likely that an anti-virus or firewall has interfered with the executable in some way, or the permissions for the BOINC user are not sufficient to allow it to run. Rosetta Moderator: Mod.Sense |
ex_brit Send message Joined: 12 Dec 09 Posts: 15 Credit: 100,070 RAC: 0 |
More likely that an anti-virus or firewall has interfered with the executable in some way, or the permissions for the BOINC user are not sufficient to allow it to run. Why would that suddenly be a problem though? I admit I'm only observing this as I haven't lately had any problems at all with Rosetta, but I just had to ditch Poem@Home because every single WU I ever got from them failed - computation error and all they seem to want to blame is Boinc, or me. Is anyone working with the Boinc people if indeed it is a Boinc problem? Peter. Toronto, Canada |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Each BOINC project periodically sends out new versions of its application. This means there is a new executable file that must pass through a/v or firewall protections. It also means a new allowed exception might need to be defined, depending on how specifically the exceptions are named. So that might be a reason why it would be working one day and not the next, and why I suggested it as something to check into. Rosetta Moderator: Mod.Sense |
ex_brit Send message Joined: 12 Dec 09 Posts: 15 Credit: 100,070 RAC: 0 |
Each BOINC project periodically sends out new versions of their application. This would mean that there is an executable file that must pass through a/v or firewall protections. It also means a new name of allowed exception might need to be defined, depending on how specifically the exceptions are named. Understood. In the case of my security software I'd get a popup requesting permission, which would have to be OK'd. I'm wondering whether all software acts similarly, or whether people are ignoring the popups. Peter. Toronto, Canada |
Mad_Max Send message Joined: 31 Dec 09 Posts: 209 Credit: 26,464,799 RAC: 18,152 |
Hi all. Can you pass a wish (proposal) for optimizing the minirosetta app on to the project's programmers? One of the minor drawbacks of the Rosetta@home project compared to other distributed computing projects (in addition to very high RAM usage) is the heavy load on the hard disk at startup. When you are running 2 or 4 computational threads it is usually not a serious problem, but with a higher number of threads (e.g. our team has a few machines with 24 threads and one machine with 48 threads), having BOINC start and load all these R@H threads is a problem: loading may take a few tens of minutes when a large number of threads compete for disk access at the same time. I analyzed how the application uses the disk and found that almost all of the load comes from decompressing the minirosetta_database_rev48292.zip archive into the working folder (...boincDataslots...), because the archive currently contains 1517 files (while all the other files needed to process one WU typically number fewer than 100). In addition, this constant unpacking (writing) and removal (deleting) of ~1500 files for each WU (for example, ~64 times per day with a standard Intel i7 CPU and the default runtime) results in rapid file system fragmentation and slows the disk down further. If I understand correctly (correct me if I'm wrong), this file is a complete archive of the main minirosetta database, and the processing of a specific WU uses it read-only (no writing) and requires only a (relatively small?) part of the files in it. Now, how to optimize the disk work. I do not know whether the BOINC architecture permits access to files outside the slots folder... If so, the best solution would be to extract and store the database once (for example in the ...boincDataprojectsboinc.bakerlab.org_rosetta folder) without unpacking it into a slots folder at every startup of each WU.
If this is not possible (as I suspect), then there is the option of copying the archive to the slots folder without unpacking it and reading only the necessary files directly from the zip/gzip archive. The relevant functions should be included in the standard libraries of many programming languages. I have only a little programming experience, but I have used these features, and the implementation is relatively simple (usually just a few extra lines of code compared to reading from flat files). The difference in volume is not so great (62 MB vs 147 MB), but the difference in the number of files is huge (1 vs 1517). So the speedup on SSDs will not be very significant, but on conventional HDDs it would be by orders of magnitude (as they do not cope well with processing a large number of small files). |
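To illustrate the second option in the post above (reading individual files straight out of the archive instead of unpacking all of them), here is a minimal sketch in Python using the standard `zipfile` module. It is not the project's code and says nothing about how minirosetta actually loads its database; the archive is a small in-memory stand-in, and the member names (`database/file_0042.txt` and so on) and the helpers `read_member` and `build_demo_archive` are hypothetical names made up for the example.

```python
import io
import zipfile


def read_member(archive, member_name):
    """Return the bytes of one archive member without extracting anything to disk."""
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member_name) as member:
            return member.read()


def build_demo_archive(file_count=1517):
    """Build an in-memory zip standing in for the database archive.

    The member names and contents are hypothetical; only the file count
    mirrors the number quoted in the post.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        for i in range(file_count):
            zf.writestr(f"database/file_{i:04d}.txt", f"parameters for entry {i}\n")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    archive = build_demo_archive()
    # Only the requested member is decompressed; the other 1516 stay packed.
    data = read_member(archive, "database/file_0042.txt")
    print(data.decode(), end="")
```

With this approach the slot folder would hold one archive file rather than roughly 1500 small ones, which is the point of the proposal. Whether it would actually be faster for a given work unit depends on how many members are read and how often, which the post does not specify.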
Trotador Send message Joined: 30 May 09 Posts: 108 Credit: 291,214,977 RAC: 0 |
I second this proposal/wish. As an owner of several 24/32-thread machines, I've suffered from this effect. It is also a blocking issue if someone wants to set up a system based on a USB stick with no HDD, since it can't keep up with this movement of data. |
Clark Williams Send message Joined: 25 Nov 09 Posts: 3 Credit: 271,975 RAC: 0 |
I've been having trouble getting work downloaded from Rosetta for more than a week. Something happening? C. Williams |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
The tasks that are labeled hyb_ac_bench_3rdeD_10_SAVE_ALL_OUT_IGNORE_THE_REST (and all the rest) are giving errors and giving me only 20 credits. OINC:: CPU time: 36569.4s, 14400s + 21600s[2012- 8-26 6: 0:30:] :: BOINC WARNING! cannot get file size for default.out.gz: could not open file. Output exists: default.out.gz Size: -1 InternalDecoyCount: 0 (GZ) ----- 0 ----- Stream information inconsistent. Writing W_0000001 ====================================================== DONE :: 1 starting structures 36569.4 cpu seconds This process generated 1 decoys from 1 attempts ====================================================== called boinc_finish </stderr_txt> ]]> What's this all about? More badly written code? |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
I've been having trouble getting work downloaded from Rosetta for more than a week. Nothing looking wrong on your host profile. Quota per day 100. What are you seeing for BOINC messages when you update to the project? Rosetta Moderator: Mod.Sense |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2141 Credit: 41,533,485 RAC: 10,732 |
The tasks that are labeled hyb_ac_bench_3rdeD_10_SAVE_ALL_OUT_IGNORE_THE_REST (and all the rest) are giving errors and giving me only 20 credits. Sorry for the late report, but we're getting this intermittently too: On this computer, these tasks: hyb_ab_bench_3rojD_SAVE_ALL_OUT_IGNORE_THE_REST_53909_1171_0 hybrid_ac_bench_4dg9A_SAVE_ALL_OUT_IGNORE_THE_REST_53491_67_0 hyb_ad_bench_T0528_SAVE_ALL_OUT_IGNORE_THE_REST_56539_32_1 hyb_ag_bench_2yeqB_SAVE_ALL_OUT_IGNORE_THE_REST_57261_513_0 These were all cut short by the watchdog and didn't complete 1 decoy That said, that computer did complete several other tasks of this type without problem as shown here: Full task list |
Mad_Max Send message Joined: 31 Dec 09 Posts: 209 Credit: 26,464,799 RAC: 18,152 |
These were all cut short by the watchdog and didn't complete 1 decoy I encountered this bug too. I reported it in another thread: https://boinc.bakerlab.org/rosetta/forum_thread.php?id=6055&nowrap=true#73741 |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2141 Credit: 41,533,485 RAC: 10,732 |
These were all cut short by the watchdog and didn't complete 1 decoy Yup, the same. I think I should've posted in the thread you did too as it's not a rosetta@home issue |
ex_brit Send message Joined: 12 Dec 09 Posts: 15 Credit: 100,070 RAC: 0 |
The latest versions of BOINC are causing lots of different issues with many projects. Peter. Toronto, Canada |
Keith Jillings Send message Joined: 26 Sep 06 Posts: 7 Credit: 536,631 RAC: 0 |
I've been having trouble getting work downloaded from Rosetta for more than a week. Likewise. I was having the same problem with SETI, which has just (this afternoon) downloaded several WUs after weeks of silence. I think it may be something to do with the latest BOINC update. The PC is busy crunching other stuff, so I'm not too bothered. I can't find a way to "force" it to download new work. |
rochester new york Send message Joined: 2 Jul 06 Posts: 2842 Credit: 2,020,043 RAC: 0 |
I've been having trouble getting work downloaded from Rosetta for more than a week. It shows: the TFLOPS estimate is down over 10% for the whole Rosetta project. |
Clark Williams Send message Joined: 25 Nov 09 Posts: 3 Credit: 271,975 RAC: 0 |
I've been having trouble getting work downloaded from Rosetta for more than a week. Scheduler Request Pending, followed by communications deferred for about 4 minutes followed by nothing under tasks. My statistics are flatlined and have been since 2012 Aug 19. C. Williams |
robertmiles Send message Joined: 16 Jun 08 Posts: 1234 Credit: 14,338,560 RAC: 2,014 |
I've been having trouble getting work downloaded from Rosetta for more than a week. Do you happen to have a graphics board in your computer that BOINC can use for GPU workunits? For my computers with such boards, I've found that BOINC will not download any CPU workunits (such as those for Rosetta@Home) until it has at least one GPU workunit. |
©2024 University of Washington
https://www.bakerlab.org