Minirosetta 3.73-3.78

Author	Message
sgaboinc Send message Joined: 2 Apr 14 Posts: 282 Credit: 208,966 RAC: 0	Message 79609 - Posted: 24 Feb 2016, 5:22:40 UTC - in response to Message 79582. Last modified: 24 Feb 2016, 5:24:08 UTC What I am seeing is that the project happily goes along for a while Requesting new tasks for CPU and gets the Scheduler request completed: got 1 task message. Then after a few hours it gets the Scheduler request completed: got 0 tasks. No work sent. Rosetta Mini for Android is not available for your type of computer. Finally, the message Rosetta Mini needs 57220.46 MB RAM but only 7363.62 MB is available for use. After that it stops updating. Remaining tasks will continue to upload until it runs out. Rosetta does not automatically download any more tasks or report any that were finished. You can manually update and get it to reset and start again however it will just run through to the same result in a few hours. actually, i'm wondering if limiting the number of concurrent tasks may help. for r@h, i normally see the number of tasks running as one task/thread per core. hence it nicely use all 8 cores with 8 tasks/threads (incl HT cores) of my i7 4771 cpu. i'm running on 16 GB of ram in linux. i've yet to encounter the 'needs xxx MB of RAM' with r@h, but with a different project (atlas@home from cern), the memory requirements are quite huge and i often see only 4 threads / tasks running and hit the memory limit. coming to think about ram, i think linux and windows o/s are able to utilize swap for virtual memory hence disk space as swap memory if you have allocated sufficient space for that. but for atlas@home, i think the use of virtualbox probably limits what could be swappable. you may like to see if disk swap spaces may be somewhat tunable in that respects. 
the other thing i think has to do with the boinc client itself, i'm thinking an updated or more recent boinc client may possibly resolve some of these issues as what you are seeing is probably a behavior of boinc client rather than r@h ID: 79609 · Rating: 0 · rate: / Reply Quote
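
A rough sketch of the "limit the number of concurrent tasks" idea above: run one task per logical core unless memory runs out first. The function name and the per-task RAM figures are hypothetical, chosen only for illustration; this is not how the BOINC client itself decides what to run.

```python
import math


def max_concurrent_tasks(logical_cores: int,
                         usable_ram_mb: float,
                         per_task_ram_mb: float) -> int:
    """Return how many tasks can run at once without exceeding usable RAM.

    Hypothetical helper: one task per logical core, capped by how many
    tasks of the given size fit into the RAM that is available for use.
    """
    if per_task_ram_mb <= 0:
        raise ValueError("per_task_ram_mb must be positive")
    fits_in_memory = math.floor(usable_ram_mb / per_task_ram_mb)
    return max(0, min(logical_cores, fits_in_memory))


# 8 logical cores and 16 GB of RAM, with an assumed 1000 MB per task:
print(max_concurrent_tasks(8, 16384, 1000))         # prints 8
# Same machine with an assumed 4000 MB per task: memory is the limit.
print(max_concurrent_tasks(8, 16384, 4000))         # prints 4
# The figures from the quoted log message: nothing fits.
print(max_concurrent_tasks(8, 7363.62, 57220.46))   # prints 0
```

With the numbers from the quoted message (57220.46 MB needed, 7363.62 MB available) the result is zero, which matches the client refusing to start anything.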

robertmiles Send message Joined: 16 Jun 08 Posts: 1237 Credit: 14,421,737 RAC: 195	Message 79615 - Posted: 24 Feb 2016, 11:53:23 UTC - in response to Message 79609. [snip] coming to think about ram, i think linux and windows o/s are able to utilize swap for virtual memory hence disk space as swap memory if you have allocated sufficient space for that. but for atlas@home, i think the use of virtualbox probably limits what could be swappable. you may like to see if disk swap spaces may be somewhat tunable in that respects. BOINC tasks usually have swapping turned off, in an effort to make them run faster. This means that there is often no effort to make the applications able to stand the address changes caused by swapping something out of memory, and then swapping it back in at a different address because the original address is still in use by some other program. ID: 79615 · Rating: 0 · rate: / Reply Quote

rjs5 Send message Joined: 22 Nov 10 Posts: 274 Credit: 23,665,706 RAC: 3,752	Message 79617 - Posted: 24 Feb 2016, 14:13:35 UTC - in response to Message 79615. [snip] coming to think about ram, i think linux and windows o/s are able to utilize swap for virtual memory hence disk space as swap memory if you have allocated sufficient space for that. but for atlas@home, i think the use of virtualbox probably limits what could be swappable. you may like to see if disk swap spaces may be somewhat tunable in that respects. BOINC tasks usually have swapping turned off, in an effort to make them run faster. This means that there is often no effort to make the applications able to stand the address changes caused by swapping something out of memory, and then swapping it back in at a different address because the original address is still in use by some other program. The OS (Windows, all variants of Linux, Mac OS, ...) provides the program with VIRTUAL memory. Virtual addresses are translated into PHYSICAL addresses through the page tables, with the TLB caching recent translations. A virtual page of memory can get swapped to disk and then be brought back into a different PHYSICAL memory location by updating its page-table entry. The executing program does not even know that the page has been swapped out to disk. The last time I looked, Windows allocated a disk swap file the same size as memory (C:\pagefile.sys). You can explicitly set the size of this file, even to 0 bytes, but then when you run low on memory, the OS will kill stuff with "Out of Memory". Virtualbox is just a program in memory that runs on top of your OS, and you set the memory size that Virtualbox is allowed to use. I usually set Virtualbox to be able to use about 50% of my physical memory, BUT I have 16 GB or more on my systems. ID: 79617 · Rating: 0 · rate: / Reply Quote
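
To illustrate the point above, here is a toy model of virtual-to-physical translation. It is only a sketch: the class and method names are made up, the page table is a plain dictionary, and a real OS handles the page fault and swap-in automatically instead of raising an error. What it shows is that a page can come back from swap in a different physical frame while the program keeps using the same virtual address.

```python
PAGE_SIZE = 4096  # bytes per page; a common size, assumed here


class ToyMemory:
    """Toy virtual memory: a page table, physical frames, and a swap area."""

    def __init__(self):
        self.page_table = {}  # virtual page number -> physical frame number
        self.frames = {}      # physical frame number -> page contents
        self.swap = {}        # virtual page number -> contents saved "on disk"

    def map_page(self, vpn, frame):
        """Make a virtual page resident in the given physical frame."""
        self.page_table[vpn] = frame
        self.frames[frame] = bytearray(PAGE_SIZE)

    def swap_out(self, vpn):
        """Save the page to swap and release its physical frame."""
        frame = self.page_table.pop(vpn)
        self.swap[vpn] = bytes(self.frames.pop(frame))

    def swap_in(self, vpn, frame):
        """Restore the page from swap, possibly into a different frame."""
        self.frames[frame] = bytearray(self.swap.pop(vpn))
        self.page_table[vpn] = frame

    def _locate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.page_table:
            raise LookupError(f"page fault on virtual page {vpn}")
        return self.frames[self.page_table[vpn]], offset

    def write(self, vaddr, value):
        page, offset = self._locate(vaddr)
        page[offset] = value

    def read(self, vaddr):
        page, offset = self._locate(vaddr)
        return page[offset]


mem = ToyMemory()
mem.map_page(5, frame=0)
addr = 5 * PAGE_SIZE + 12   # a virtual address inside virtual page 5
mem.write(addr, 42)
mem.swap_out(5)             # the page now lives only in swap
mem.swap_in(5, frame=7)     # it returns in a different physical frame
print(mem.read(addr))       # prints 42: same virtual address, same data
```

The program-side address (`addr`) never changes; only the page-table entry does.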

robertmiles Send message Joined: 16 Jun 08 Posts: 1237 Credit: 14,421,737 RAC: 195	Message 79624 - Posted: 25 Feb 2016, 2:44:08 UTC - in response to Message 79617. [snip] Virtualbox is just a program in memory that runs on top of your OS and you set the memory size that virtualbox is allowed to use. I usually set virtualbox to be able to use about 50% of my physical memory BUT I have 16gb or more on my systems. 
It's hard to get Virtualbox working correctly - once the versions of BOINC available so far have detected that virtualization is not enabled in the BIOS or the UEFI, they will remember this forever and prevent the test of whether it is enabled from being run again. Also, all of the Virtualbox workunits I've seen much about so far seize 4 GB of physical memory, and won't allow any of it to be paged. I'm hoping that a new version of Virtualbox will remove this restriction. As far as I know, Virtualbox can handle 32-bit workunits, but not 64-bit workunits. ID: 79624 · Rating: 0 · rate: / Reply Quote

rjs5 Send message Joined: 22 Nov 10 Posts: 274 Credit: 23,665,706 RAC: 3,752	Message 79630 - Posted: 25 Feb 2016, 15:00:13 UTC - in response to Message 79624. [snip] It's hard to get Virtualbox working correctly - once the versions of BOINC available so far have detected that virtualization is not enabled in the BIOS or the UEFI, they will remember this forever and prevent the test of whether it is enabled from being run again. Also, all of the Virtualbox workunits I've seen much about so far seize 4 GB of physical memory, and won't allow any of it to be paged. I'm hoping that a new version of Virtualbox will remove this restriction. As far as I know, Virtualbox can handle 32-bit workunits, but not 64-bit workunits. 
I regularly use Virtualbox to build Linux images on machines, and none of my comments were about the pre-configured BOINC Virtualbox implementation. I have no experience with the BOINC-packaged Virtualbox. I imagine that BOINC projects choose to use the BOINC Virtualbox so they can control the execution environment and the quality of the data generated very closely. 32-bit only probably makes sense for BOINC Virtualbox in that case. ID: 79630 · Rating: 0 · rate: / Reply Quote

[FI] OIKARINEN Send message Joined: 16 Nov 13 Posts: 6 Credit: 131,483 RAC: 0	Message 79669 - Posted: 1 Mar 2016, 14:15:41 UTC I've been running the 3.71 version of Rosetta for 2 days, and I just noticed a lot of crashing workunits running on different computers. All of those WUs have this attached: ERROR: unrecognized residue AX1 ERROR:: Exit from: ..\..\..\src\core\io\pdb\file_data.cc line: 2077 BOINC:: Error reading and gzipping output datafile: default.out Life is too short to live concerned about its mysteries. ID: 79669 · Rating: 0 · rate: / Reply Quote

Jim1348 Send message Joined: 19 Jan 06 Posts: 881 Credit: 52,257,545 RAC: 0	Message 79674 - Posted: 1 Mar 2016, 19:15:19 UTC - in response to Message 79624. Last modified: 1 Mar 2016, 19:16:18 UTC It's hard to get Virtualbox working correctly - once the versions of BOINC available so far have detected that virtualization is not enabled in the BIOS or the UEFI, they will remember this forever and prevent the test of whether it is enabled from being run again. They have a solution to that problem in the Cosmology FAQs: I enabled VT-x/AMD-v but jobs say “Scheduler wait: Please upgrade BOINC” Also, all of the Virtualbox workunits I've seen much about so far seize 4 GB of physical memory, and won't allow any of it to be paged. I'm hoping that a new version of Virtualbox will remove this restriction. I think that just depends on the application. ATLAS and vLHC take a lot of memory, but Cosmology does not that I recall. I have had some problems with VirtualBox interfering with some other programs (both CPU and GPU, even non-BOINC ones), but not with the VBox programs themselves. I just use the pre-packaged versions on the CERN projects and Cosmology, but they all went easily enough, though you do need to watch the memory. If VBox would be of any use for Rosetta, I would be willing to try it here. ID: 79674 · Rating: 0 · rate: / Reply Quote

Snags Send message Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0	Message 79704 - Posted: 7 Mar 2016, 16:42:12 UTC Last modified: 7 Mar 2016, 16:55:38 UTC Both of my computers received 24 hour back-offs after a single request for work resulted in this reply: Sun Mar 6 03:51:29 2016 | rosetta@home | Rosetta Mini for Android is not available for your type of computer. When I noticed the problem several hours after the back-off began I simply hit the update button and successfully retrieved new tasks. ***Wild Speculation Alert*** If the number of Android tasks exceeds the number of available devices by too great a margin and/or they fail at too high a rate, then the new tasks/resends could be clogging the queue. As long as there are in fact plenty of cpu tasks to crunch, a 24 hour back-off would seem excessive. I should add that I only became concerned because I recently reduced my preferred cpu runtime and my cache and set other projects to no new tasks (preparing for a possible imminent shut down of computers for an indeterminate period of time), so this 24 hour back-off actually led to no tasks crunching at all. Otherwise I might have noticed but not been concerned enough to explore the possible causes or to comment. I only comment now on the possibility that this back-off interval could be changed to something shorter. I know the project doesn't want a bunch of computers asking every 5 minutes while there's a clog, but if it is a predictable clog and you can see how long it typically lasts, perhaps you could adjust the back-off accordingly. Would anything longer than the 6 hour default target runtime really be necessary? Although not a big deal in the overall scheme of things, would things run somewhat smoother, on both sides of the connection, if crunchers weren't left to go idle unnecessarily for such a long time? Best, Snags edit: I just saw additional posts in this thread that suggest rosie really did run out of cpu tasks. Ah, well. 
I suppose I should see if I can find BOINC documentation on the back-off settings (documentation that I could actually understand, that is) : / ID: 79704 · Rating: 0 · rate: / Reply Quote
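
For what it's worth, the suggestion above (keep backing off, but never longer than the 6 hour default target runtime) can be sketched as a capped exponential back-off. This is an illustration only: the function name, the 60 second base, and the doubling rule are assumptions, not how the BOINC client or the Rosetta server actually computes its back-off.

```python
def backoff_seconds(consecutive_failures: int,
                    base: float = 60.0,
                    cap: float = 6 * 3600.0) -> float:
    """Delay before the next work request, doubling per failure up to a cap.

    Hypothetical policy: the first failed request waits `base` seconds,
    each further failure doubles the wait, and the wait never exceeds `cap`
    (6 hours here, the default target runtime mentioned in the post).
    """
    if consecutive_failures < 1:
        return 0.0
    return min(cap, base * 2 ** (consecutive_failures - 1))


for failures in (1, 5, 20):
    print(failures, backoff_seconds(failures))
# 1 60.0
# 5 960.0
# 20 21600.0
```

Under a rule like this, a single "no work sent" reply would cost a minute rather than a day, while a long outage would still settle at one request every 6 hours per host.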

robertmiles Send message Joined: 16 Jun 08 Posts: 1237 Credit: 14,421,737 RAC: 195	Message 79708 - Posted: 7 Mar 2016, 20:29:32 UTC - in response to Message 79704. Both of my computers received 24 hour back-offs after a single request for work resulted in this reply: Sun Mar 6 03:51:29 2016 | rosetta@home | Rosetta Mini for Android is not available for your type of computer. When I noticed the problem several hours after the back-off began I simply hit the update button and successfully retrieved new tasks. ***Wild Speculation Alert*** If the number of Android tasks exceeds the number of available devices by too great a margin and/or they fail at too high a rate, then the new tasks/resends could be clogging the queue. As long as there are in fact plenty of cpu tasks to crunch, a 24 hour back-off would seem excessive. I should add that I only became concerned because I recently reduced my preferred cpu runtime and my cache and set other projects to no new tasks (preparing for a possible imminent shut down of computers for an indeterminate period of time), so this 24 hour back-off actually led to no tasks crunching at all. Otherwise I might have noticed but not been concerned enough to explore the possible causes or to comment. I only comment now on the possibility that this back-off interval could be changed to something shorter. I know the project doesn't want a bunch of computers asking every 5 minutes while there's a clog, but if it is a predictable clog and you can see how long it typically lasts, perhaps you could adjust the back-off accordingly. Would anything longer than the 6 hour default target runtime really be necessary? Although not a big deal in the overall scheme of things, would things run somewhat smoother, on both sides of the connection, if crunchers weren't left to go idle unnecessarily for such a long ID: 79708 · Rating: 0 · rate: / Reply Quote

robertmiles Send message Joined: 16 Jun 08 Posts: 1237 Credit: 14,421,737 RAC: 195	Message 79710 - Posted: 7 Mar 2016, 20:31:57 UTC - in response to Message 79704. Both of my computers received 24 hour back-offs after a single request for work resulted in this reply: Sun Mar 6 03:51:29 2016 | rosetta@home | Rosetta Mini for Android is not available for your type of computer. When I noticed the problem several hours after the back-off began I simply hit the update button and successfully retrieved new tasks. [snip] I've seen a similar problem twice. I have an Android device in addition to my Windows devices, but so far I have BOINC installed only on the Windows devices. ID: 79710 · Rating: 0 · rate: / Reply Quote

iriemon Send message Joined: 16 Jan 16 Posts: 6 Credit: 776,294 RAC: 3	Message 79715 - Posted: 8 Mar 2016, 15:26:32 UTC Any news on getting the communication problem fixed? Been sitting here for 2 days without any new work being available..... ID: 79715 · Rating: 0 · rate: / Reply Quote

iriemon Send message Joined: 16 Jan 16 Posts: 6 Credit: 776,294 RAC: 3	Message 79717 - Posted: 8 Mar 2016, 15:34:11 UTC - in response to Message 79715. Any news on getting the communication problem fixed? Been sitting here for 2 days without any new work being available..... For some reason, I decided to clear my IE cache and then tried to dl a new work unit and to my surprise IT WORKED! Happily crunching...... ID: 79717 · Rating: 0 · rate: / Reply Quote

Dr. Merkwürdigliebe Send message Joined: 5 Dec 10 Posts: 81 Credit: 2,657,273 RAC: 0	Message 79721 - Posted: 8 Mar 2016, 17:10:06 UTC Last modified: 8 Mar 2016, 17:11:04 UTC Failed task ID: 79721 · Rating: 0 · rate: / Reply Quote

robertmiles Send message Joined: 16 Jun 08 Posts: 1237 Credit: 14,421,737 RAC: 195	Message 79733 - Posted: 8 Mar 2016, 21:24:02 UTC - in response to Message 79717. Any news on getting the communication problem fixed? Been sitting here for 2 days without any new work being available..... For some reason, I decided to clear my IE cache and then tried to dl a new work unit and to my surprise IT WORKED! Happily crunching...... I decided to try that on my Windows 10 computer. Surprise - if Windows 10 even includes IE, it is very well hidden. I told BOINC Manager to update for Rosetta@home anyway - it downloaded a workunit. It looks likely that the problem is fixed on the server and IE is not involved. ID: 79733 · Rating: 0 · rate: / Reply Quote

[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 2058 Credit: 10,978,603 RAC: 12,915	Message 79756 - Posted: 15 Mar 2016, 20:51:49 UTC 801194890 Starting work on structure: _00002 [2016- 3-15 20:35:13:] :: BOINC:: Initializing ... ok. [2016- 3-15 20:35:13:] :: BOINC :: boinc_init() BOINC:: Setting up shared resources ... ok. failed to create shared mem segment: minirosetta Size: 25001672 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x0085EEB0 write attempt to address 0x017D7EC1 ID: 79756 · Rating: 0 · rate: / Reply Quote

ArcSedna Send message Joined: 23 Oct 11 Posts: 16 Credit: 84,719,482 RAC: 69,945	Message 79792 - Posted: 23 Mar 2016, 21:23:21 UTC Some workunits hang for many hours until they are terminated manually. They have a string like EN_MAP_hyb_cst, EN_MAP_cst, RE_MAP_hyb_cst, or RE_MAP_cst in the middle of the name. Sample (Already aborted) Their behavior is 'do nothing for a long time'. It looks like this: Elapsed real time: 32 hours; elapsed CPU time: 15 minutes. This is happening on my Mac computers. Windows and Linux seem to be OK. OS: Mac OS X 10.11.3; BOINC: 7.2.42; Memory: 8 GB to 16 GB. Thanks. ID: 79792 · Rating: 0 · rate: / Reply Quote
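
A simple way to spot tasks like the one described above is to compare CPU time against elapsed wall-clock time. The sketch below is only a heuristic with made-up thresholds (at least an hour elapsed, under 5% CPU use); the function name and defaults are assumptions, not anything BOINC provides.

```python
def looks_stalled(elapsed_seconds: float,
                  cpu_seconds: float,
                  min_elapsed: float = 3600.0,
                  min_cpu_fraction: float = 0.05) -> bool:
    """Flag a task whose CPU time is a tiny fraction of its elapsed time.

    Hypothetical heuristic: ignore tasks that have run for less than
    `min_elapsed` seconds, then flag any task that used less than
    `min_cpu_fraction` of its wall-clock time on the CPU.
    """
    if elapsed_seconds < min_elapsed:
        return False
    return cpu_seconds / elapsed_seconds < min_cpu_fraction


# The reported case: 32 hours elapsed, 15 minutes of CPU time.
print(looks_stalled(32 * 3600, 15 * 60))        # prints True
# A task that has been busy for nearly all of its 6 hours.
print(looks_stalled(6 * 3600, 5.8 * 3600))      # prints False
```

The reported task used under 1% of its elapsed time on the CPU, so it would be flagged long before the 32 hour mark.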

James Adrian Send message Joined: 27 Apr 12 Posts: 5 Credit: 1,916,171 RAC: 1,090	Message 79799 - Posted: 26 Mar 2016, 17:17:12 UTC Has anyone else gotten work units for Minirosetta 3.71 that are estimated to run 14 days? I'm running on an old (2009) Mac with 8 GB of memory and lately I've gotten these here and there. Thanks. BOINC 7.6.22, Mac OS 10.11.4 ID: 79799 · Rating: 0 · rate: / Reply Quote

James Adrian Send message Joined: 27 Apr 12 Posts: 5 Credit: 1,916,171 RAC: 1,090	Message 79800 - Posted: 26 Mar 2016, 17:51:16 UTC - in response to Message 79792. ArcSedna, I just saw your post, once I sorted to see newest first. My problem seems slightly different but like you I see the problem with work units named as in your post. One other observation: I have a newer Mac laptop but so far I have not seen the problem with the work units on it, just on my older iMac. ID: 79800 · Rating: 0 · rate: / Reply Quote