Problems and Technical Issues with Rosetta@home

googloo
Avatar

Send message
Joined: 15 Sep 06
Posts: 133
Credit: 21,688,476
RAC: 5,328
Message 79981 - Posted: 30 Apr 2016, 9:55:26 UTC - in response to Message 79979.  

You've broken something, I'm getting this message now on all my Ubuntu rigs.

Fri 29 Apr 2016 12:13:59 AEST | rosetta@home | Sending scheduler request: Requested by user.
Fri 29 Apr 2016 12:13:59 AEST | rosetta@home | Reporting 6 completed tasks, requesting new tasks for CPU
Fri 29 Apr 2016 12:14:04 AEST | rosetta@home | Scheduler request completed: got 0 new tasks
Fri 29 Apr 2016 12:14:04 AEST | rosetta@home | No work sent
Fri 29 Apr 2016 12:14:04 AEST | rosetta@home | Rosetta Mini for Android is not available for your type of computer.


Same here, on my Win7 Ultimate!!

30.04.2016 11:19:55 | rosetta@home | Sending scheduler request: To fetch work.
30.04.2016 11:19:55 | rosetta@home | Requesting new tasks for CPU and NVIDIA GPU and Intel GPU
30.04.2016 11:19:58 | rosetta@home | Scheduler request completed: got 0 new tasks
30.04.2016 11:19:58 | rosetta@home | No work sent
30.04.2016 11:19:58 | rosetta@home | Rosetta Mini for Android is not available for your type of computer.


Same here again, now it's

4/30/2016 5:50:23 AM | rosetta@home | update requested by user
4/30/2016 5:50:24 AM | rosetta@home | Sending scheduler request: Requested by user.
4/30/2016 5:50:24 AM | rosetta@home | Requesting new tasks for Intel GPU
4/30/2016 5:50:26 AM | rosetta@home | Scheduler request completed: got 0 new tasks

on my Windows 7 computer.
ID: 79981 · Rating: 0 · rate: Rate + / Rate - Report as offensive
iancantwell

Send message
Joined: 7 Jul 13
Posts: 4
Credit: 376,360
RAC: 181
Message 79984 - Posted: 30 Apr 2016, 14:39:29 UTC

According to my event log: Task 06_optimize0001_fold_SAVE_ALL_OUT_344614_3237_2 exited with zero status but no 'finished' file. Another message says that "if this happens repeatedly you may need to reset the project".
As I haven't had this problem before, it may be that the task is somehow corrupt and should be withdrawn for analysis.
ID: 79984 · Rating: 0 · rate: Rate + / Rate - Report as offensive
Profile [VENETO] boboviz

Send message
Joined: 1 Dec 05
Posts: 1860
Credit: 8,160,158
RAC: 8,440
Message 79989 - Posted: 1 May 2016, 8:04:50 UTC - in response to Message 79979.  

30.04.2016 11:19:58 | rosetta@home | Scheduler request completed: got 0 new tasks
30.04.2016 11:19:58 | rosetta@home | No work sent
30.04.2016 11:19:58 | rosetta@home | Rosetta Mini for Android is not available for your type of computer.


It's time to update the scheduler (and the server code).
But I think that is impossible during CASP.
ID: 79989 · Rating: 0 · rate: Rate + / Rate - Report as offensive
Link
Avatar

Send message
Joined: 4 May 07
Posts: 352
Credit: 382,349
RAC: 0
Message 79991 - Posted: 1 May 2016, 8:54:28 UTC - in response to Message 79977.  
Last modified: 1 May 2016, 9:03:01 UTC

What that means for me is that when I know I won't be available to manually push reports at least once during the day, I essentially end up having to shift to another project for those systems. That's what I do during a vacation away from the systems.

You could simply increase your cache to 4-6 days, or even better, set up the other project as a backup project. When away on vacation, it's better to have more than one project active, so in case your main project runs out of work, your computer has some other projects to choose from.
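
For anyone who prefers to set the cache locally, here is a minimal sketch that builds the contents of a BOINC preferences override file (`global_prefs_override.xml` in the BOINC data directory). The two tag names are the standard work-buffer preferences. The values of 4 days plus 2 extra days are only an assumption chosen to match the "4-6 days" suggestion, and the helper name `build_override` is made up for illustration.

```python
import xml.etree.ElementTree as ET

def build_override(min_days: float, extra_days: float) -> str:
    """Build override XML that asks the client to keep a larger work cache.

    min_days   -> <work_buf_min_days>        (store at least this much work)
    extra_days -> <work_buf_additional_days> (store up to this much more)
    """
    root = ET.Element("global_preferences")
    ET.SubElement(root, "work_buf_min_days").text = f"{min_days:.2f}"
    ET.SubElement(root, "work_buf_additional_days").text = f"{extra_days:.2f}"
    return ET.tostring(root, encoding="unicode")

# Assumed values: 4 days minimum plus up to 2 more, i.e. a 4-6 day cache.
override_xml = build_override(4.0, 2.0)
print(override_xml)
```

The sketch only prints the XML. Saving it over an existing override file would discard any other local overrides, and the client has to re-read its preferences before the change takes effect.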
ID: 79991 · Rating: 0 · rate: Rate + / Rate - Report as offensive
BarryAZ

Send message
Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 39
Message 79996 - Posted: 1 May 2016, 18:34:26 UTC - in response to Message 79991.  

Thanks for the reply

It is an option -- but there are times (vacations) when I'm away for a week or more and that would not work well for that.

The thing is, I believe there is a problem a bit further up the chain than simply at my workstations since this behavior is Rosetta project specific.

Further, my project mix is more balanced (sometimes with more than one CPU project running) than simply putting a backup project in place would address, even as a workaround.





ID: 79996 · Rating: 0 · rate: Rate + / Rate - Report as offensive
Sid Celery

Send message
Joined: 11 Feb 08
Posts: 1982
Credit: 38,450,760
RAC: 14,492
Message 79998 - Posted: 2 May 2016, 1:20:40 UTC

Looking at something else, I glanced at the number of active users on Rosetta.

On March 3rd it was 79,000 users.
On May 2nd it's now 142,000.

Since the start of April, the number of new hosts has consistently been over 1,000, and up to 3,900, each day.

This is a lot.

Rosetta Users Overview
ID: 79998 · Rating: 0 · rate: Rate + / Rate - Report as offensive
BarryAZ

Send message
Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 39
Message 80001 - Posted: 2 May 2016, 21:56:42 UTC - in response to Message 79996.  

I appreciate the suggestions for workarounds -- I do wonder, though, what specifically is going on at the server side that might be causing this Rosetta-specific problem for me.






ID: 80001 · Rating: 0 · rate: Rate + / Rate - Report as offensive
ThrowerGB

Send message
Joined: 4 Dec 05
Posts: 3
Credit: 12,259,708
RAC: 0
Message 80017 - Posted: 4 May 2016, 15:28:40 UTC
Last modified: 4 May 2016, 15:32:19 UTC

I'm not getting new tasks either. I've been getting the same message about Rosetta Mini for Android, yet I'm running OS X. I've just removed BOINC from my system, including its data files, and reinstalled it. The log is below.
In addition to Rosetta, I'm running SETI@home.
--------------
Wed May 4 11:10:24 2016 | | Starting BOINC client version 7.6.22 for x86_64-apple-darwin
Wed May 4 11:10:24 2016 | | log flags: file_xfer, sched_ops, task
Wed May 4 11:10:24 2016 | | Libraries: libcurl/7.39.0 OpenSSL/1.0.1j zlib/1.2.5 c-ares/1.10.0
Wed May 4 11:10:24 2016 | | Data directory: /Library/Application Support/BOINC Data
Wed May 4 11:10:24 2016 | | CUDA: NVIDIA GPU 0: GeForce GTX 675MX (driver version 7.5.26, CUDA version 7.5, compute capability 3.0, 1024MB, 3MB available, 1933 GFLOPS peak)
Wed May 4 11:10:24 2016 | | OpenCL: NVIDIA GPU 0: GeForce GTX 675MX (driver version 10.10.5.2 310.42.25f01, device version OpenCL 1.2, 1024MB, 3MB available, 1933 GFLOPS peak)
Wed May 4 11:10:24 2016 | | OpenCL CPU: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz (OpenCL driver vendor: Apple, driver version 1.1, device version OpenCL 1.2)
Wed May 4 11:10:25 2016 | | Host name: iMac.home
Wed May 4 11:10:25 2016 | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz [x86 Family 6 Model 58 Stepping 9]
Wed May 4 11:10:25 2016 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clfsh ds acpi mmx fxsr sse sse2 ss htt tm pbe pni pclmulqdq dtes64 mon dscpl vmx smx est tm2 ssse3 cx16 tpr pdcm sse4_1 sse4_2 x2apic popcnt aes pcid xsave osxsave tsctmr avx rdrand f16c
Wed May 4 11:10:25 2016 | | OS: Mac OS X 10.11.4 (Darwin 15.4.0)
Wed May 4 11:10:25 2016 | | Memory: 16.00 GB physical, 559.94 GB virtual
Wed May 4 11:10:25 2016 | | Disk: 1.01 TB total, 559.70 GB free
Wed May 4 11:10:25 2016 | | Local time is UTC -4 hours
Wed May 4 11:10:25 2016 | rosetta@home | URL https://boinc.bakerlab.org/rosetta/; Computer ID 1528837; resource share 48
Wed May 4 11:10:25 2016 | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 6107788; resource share 47
Wed May 4 11:10:25 2016 | rosetta@home | General prefs: from rosetta@home (last modified 01-May-2016 17:00:45)
Wed May 4 11:10:25 2016 | rosetta@home | Computer location: home
Wed May 4 11:10:25 2016 | rosetta@home | General prefs: no separate prefs for home; using your defaults
Wed May 4 11:10:25 2016 | | Reading preferences override file
Wed May 4 11:10:25 2016 | | Preferences:
Wed May 4 11:10:25 2016 | | max memory usage when active: 13107.20MB
Wed May 4 11:10:25 2016 | | max memory usage when idle: 15728.64MB
Wed May 4 11:10:25 2016 | | max disk usage: 100.00GB
Wed May 4 11:10:25 2016 | | (to change preferences, visit a project web site or select Preferences in the Manager)
Wed May 4 11:10:25 2016 | rosetta@home | Sending scheduler request: To fetch work.
Wed May 4 11:10:25 2016 | rosetta@home | Requesting new tasks for CPU and NVIDIA GPU
Wed May 4 11:10:26 2016 | rosetta@home | Scheduler request completed: got 0 new tasks
Wed May 4 11:10:26 2016 | rosetta@home | No work sent
Wed May 4 11:10:26 2016 | rosetta@home | Rosetta Mini for Android is not available for your type of computer.
ID: 80017 · Rating: 0 · rate: Rate + / Rate - Report as offensive
googloo
Avatar

Send message
Joined: 15 Sep 06
Posts: 133
Credit: 21,688,476
RAC: 5,328
Message 80018 - Posted: 4 May 2016, 15:45:48 UTC

Two (and maybe more) of my posts have disappeared. What's up?

I repeat: the problem with "Rosetta Mini for Android is not available for your type of computer" appears to be the 24-hour back off period that results. Can this be adjusted?
ID: 80018 · Rating: 0 · rate: Rate + / Rate - Report as offensive
Mod.Sense
Volunteer moderator

Send message
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 80020 - Posted: 4 May 2016, 16:47:17 UTC - in response to Message 80018.  



Your posts are over in the other thread

It's a BOINC question, not R@h. I am not aware of any means of tailoring the backoff behavior of BOINC Manager.
Rosetta Moderator: Mod.Sense
ID: 80020 · Rating: 0 · rate: Rate + / Rate - Report as offensive
googloo
Avatar

Send message
Joined: 15 Sep 06
Posts: 133
Credit: 21,688,476
RAC: 5,328
Message 80022 - Posted: 4 May 2016, 20:38:54 UTC - in response to Message 80020.  



Sorry for my confusion, and thanks for the answer.
ID: 80022 · Rating: 0 · rate: Rate + / Rate - Report as offensive
danosavi

Send message
Joined: 20 Apr 07
Posts: 1
Credit: 58,586
RAC: 0
Message 80028 - Posted: 6 May 2016, 7:10:40 UTC

Hello, this morning I noticed a sudden decrease in the task count for all my connected computers in my account's computer list. Machine and total credits are correct; it's just the number of tasks that decreased.

Is that a bug or something else?

Thanks.
ID: 80028 · Rating: 0 · rate: Rate + / Rate - Report as offensive
Profile Dr. Merkwürdigliebe
Avatar

Send message
Joined: 5 Dec 10
Posts: 81
Credit: 2,657,273
RAC: 0
Message 80033 - Posted: 6 May 2016, 14:28:07 UTC

Occasionally I click on the "Show graphics" button in the BOINC client to see the proteins spin. Sometimes I close the window before the animation starts, and the window is gone -- but not from memory.

ps aux | grep defunc


merkwuerdig    4889  0.0  0.0      0     0 ?        Z    16:16   0:00 [minirosetta_gra] <defunct>


I can't kill those zombie processes. Not even with kill -9.
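
Some background on why `kill -9` has no effect: a zombie has already exited, so there is nothing left to signal. Its process-table entry stays until the parent collects the exit status, or until the parent itself exits and init reaps the orphan. The useful information is therefore the parent PID. Below is a minimal sketch that lists zombies together with their parents by parsing `ps` output. The function names are made up for illustration, and it assumes a `ps` that accepts `-A -o pid=,ppid=,stat=,comm=`.

```python
import subprocess

def find_zombies(ps_output: str):
    """Return (pid, ppid, command) for each line whose state starts with 'Z'.

    Expects lines in the form: PID PPID STAT COMMAND
    """
    zombies = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)   # the command may contain spaces
        if len(fields) < 4:
            continue                   # skip blank or malformed lines
        pid, ppid, stat, comm = fields
        if stat.startswith("Z"):       # 'Z' marks a defunct (zombie) process
            zombies.append((int(pid), int(ppid), comm.strip()))
    return zombies

def list_zombies():
    """Run ps without headers and return the zombies it reports."""
    out = subprocess.run(
        ["ps", "-A", "-o", "pid=,ppid=,stat=,comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_zombies(out)
```

Calling `list_zombies()` on the affected machine should show which process is the parent of `minirosetta_gra`. Restarting that parent is what makes the entry go away.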
ID: 80033 · Rating: 0 · rate: Rate + / Rate - Report as offensive
BarryAZ

Send message
Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 39
Message 80035 - Posted: 6 May 2016, 17:49:28 UTC - in response to Message 80020.  

Not directly related to the Android work units, but I do wonder about the answer suggesting that it is a BOINC question regarding back off behavior.

The reason for my follow-up is this: when a (perhaps incorrect) 'server not responding' report happens on my workstation while reporting results, other projects go to a progressive back-off cycle -- approximately 1 hour, then 2 hours, then 3 hours, then 4 hours, then back to a 1 hour back-off -- whereas Rosetta, and ONLY Rosetta, goes immediately to a 24 hour back-off.

I'd note further, that every time I go to a workstation at any stage of the 24 hour back off cycle and manually push an update, the update goes through.

So I remain a bit bewildered here at what from my experience with a number of workstations certainly appears to be Rosetta specific behavior.

Since I run multiple projects, and have encountered this over the past month or more, I have ended up shifting a bit over to other projects which do not exhibit what certainly appears as a Rosetta specific behavior, particularly when I anticipate leaving a workstation in an unattended mode.





ID: 80035 · Rating: 0 · rate: Rate + / Rate - Report as offensive
Mod.Sense
Volunteer moderator

Send message
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 80036 - Posted: 6 May 2016, 19:18:09 UTC

I see what you mean, Barry. So I did some searching and found this description of per-processor-type backoff. It seems to indicate that the backoff is determined by the client's tracking of scheduler requests for each project and processor type. From this I conclude that there must be periods of time when client machines can fail to get work on several scheduler requests in a row, and this leads to the 24hr backoff.
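
To make the idea concrete, here is a minimal sketch of a capped, growing backoff. It is not BOINC's actual algorithm. It assumes the wait doubles after each consecutive scheduler reply that sends no work. The 10-minute starting delay, the doubling, and the function name are assumptions for illustration; only the 24-hour ceiling comes from what people report in this thread.

```python
from datetime import timedelta

# Hypothetical parameters: only the 24-hour ceiling is taken from the thread.
MIN_BACKOFF = timedelta(minutes=10)
MAX_BACKOFF = timedelta(hours=24)

def backoff_after(failures: int) -> timedelta:
    """Wait before the next work request after `failures` consecutive
    scheduler replies that sent no work (assumed doubling, capped)."""
    if failures <= 0:
        return timedelta(0)
    exponent = min(failures - 1, 16)   # keep the multiplier bounded
    delay = MIN_BACKOFF * (2 ** exponent)
    return min(delay, MAX_BACKOFF)

for n in range(1, 10):
    print(n, backoff_after(n))
```

Under these assumed numbers the cap is reached on the ninth consecutive empty reply, so a short run of "no work" answers is enough to leave a host idle for a day.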

It is hard to catch a server status snapshot showing the server is out of work. But my theory is that there are points within the 10 minute interval where the jobs that generate the tasks from the queue of work are unable to keep up. The project team has (apparently) increased the cache of work to try and avoid running out. But I think there are still times when there are no tasks available to send.

I have been watching the number of available tasks on the server status page quite frequently over the past month or so, and I suspect that the task volumes are now getting so high that the snapshot you see every 10 minutes of the number of available tasks is not reflective of the underlying reality. I mean if it shows 200,000 tasks are available, it doesn't mean much when the server sends out 400,000 tasks over the next 10 minutes. I have no idea what the actual rates are, but when you look at the number of outstanding work units change between 10 minute intervals, it seems the numbers must be very high.

Obviously, if the number of outstanding tasks is 1.0 million at one time interval and 1.4 million at the next, the project must have sent out at least 400,000 tasks. But since you don't know how many tasks were reported as completed during that interval, the real number could be much higher.
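
The bookkeeping can be written down as a small sketch. It assumes the only ways the outstanding count changes are tasks being sent and tasks being reported back, and the 250,000 figure is purely hypothetical.

```python
def tasks_sent(outstanding_start: int, outstanding_end: int, reported: int) -> int:
    """Tasks sent during an interval, from the balance
    outstanding_end = outstanding_start + sent - reported."""
    if reported < 0:
        raise ValueError("reported must be non-negative")
    return (outstanding_end - outstanding_start) + reported

# Figures from the post: 1.0 million outstanding, then 1.4 million.
lower_bound = tasks_sent(1_000_000, 1_400_000, reported=0)
# Hypothetical: 250,000 results were also reported during the interval.
with_reports = tasks_sent(1_000_000, 1_400_000, reported=250_000)
print(lower_bound, with_reports)
```

With no reports at all the figure is the 400,000 lower bound, and every result reported during the interval adds one more task that must have been sent.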
Rosetta Moderator: Mod.Sense
ID: 80036 · Rating: 0 · rate: Rate + / Rate - Report as offensive
BarryAZ

Send message
Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 39
Message 80037 - Posted: 6 May 2016, 20:33:38 UTC - in response to Message 80036.  

Thanks for your detailed response.

One thing I'd note here -- I believe it isn't the request for new work which yields the 24 hour back off that I see, rather it is the reporting of completed work units.

I could be wrong there, but I believe the sequence is to report completed work and then request new work. When I check the workstations, I see between 3 and 12 completed work units which have not been reported. If the problem is a lack of available work, then I would expect the completed work units to be reported and a 'no new work' message. I do see that sort of report with some other projects which are periodically sparse with work units (GPUGrid for example or MilkyWay)

I will note at this point I see that Rosetta is getting a lot of new users and that likely will be draining down the available work.

In any event, should this persist, I figure to shift systems to other projects temporarily when I can't regularly access the systems (say when I am out of town).

ID: 80037 · Rating: 0 · rate: Rate + / Rate - Report as offensive
Mod.Sense
Volunteer moderator

Send message
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 80038 - Posted: 6 May 2016, 21:18:06 UTC

Tasks are reported and requested in the same scheduler request. However, if the task results are still uploading at the time of the scheduler request, they cannot be reported as completed yet. So perhaps the ones that you see are ready to report were just recently completed, or their uploads just recently completed. So they were ready to report after the series of work requests caused the backoff to run up to 24hrs.
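
As a minimal sketch of that rule, with a made-up `Task` record rather than BOINC's own data structures: a task is only eligible for the next scheduler request once both its computation and its uploads have finished.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    computed: bool   # computation has finished
    uploaded: bool   # all output files have been uploaded

def ready_to_report(tasks):
    """Tasks that can be reported in the next scheduler request."""
    return [t for t in tasks if t.computed and t.uploaded]

# Hypothetical queue at the moment a scheduler request is made.
queue = [
    Task("task_a", computed=True, uploaded=True),
    Task("task_b", computed=True, uploaded=False),   # still uploading
    Task("task_c", computed=False, uploaded=False),  # still running
]
print([t.name for t in ready_to_report(queue)])
```

Here only `task_a` is reported. If `task_b` finishes uploading after the backoff has started, it waits as "ready to report" until the next scheduler request.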

In email with DK today, he confirmed that there are short periods where no work is available, even though you don't typically see that reflected on the status page. New work becomes available soon enough that the status typically shows lots of work at the ready. This is why the problem is intermittent.
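
A toy simulation shows how a status snapshot can miss these gaps. Every number in it is invented for illustration (batch size, refill interval, demand rate). It is only a sketch of the sampling effect, not a model of the real server.

```python
def simulate(duration_s=3600, refill_every_s=120, batch=200_000,
             demand_per_s=2_000, snapshot_every_s=600):
    """Return (snapshots, empty_seconds) for a toy work queue."""
    queue = 0
    snapshots = []
    empty_seconds = 0
    for t in range(duration_s):
        if t % refill_every_s == 0:
            queue += batch             # work generator adds a batch
        if t % snapshot_every_s == 0:
            snapshots.append(queue)    # what a status page would show
        if queue == 0:
            empty_seconds += 1         # requests in this second get no work
        queue = max(0, queue - demand_per_s)
    return snapshots, empty_seconds

snapshots, empty_seconds = simulate()
print(snapshots)
print(empty_seconds)
```

In this contrived setup every 10-minute snapshot lands right after a refill and shows a full queue, yet the queue is empty for 20 seconds of every 2-minute cycle. Any host that asks for work in one of those windows gets a "no work" reply.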
Rosetta Moderator: Mod.Sense
ID: 80038 · Rating: 0 · rate: Rate + / Rate - Report as offensive
BarryAZ

Send message
Joined: 27 Dec 05
Posts: 153
Credit: 30,843,285
RAC: 39
Message 80039 - Posted: 6 May 2016, 23:40:02 UTC - in response to Message 80038.  

OK -- fair enough -- it is possible that given the large Rosetta client population and its growth that I see it as a "Rosetta specific" issue.

As I noted, the thing which surprises me is that apparently, the instant this happens on a workstation, it goes to a 24 hour back off -- which I don't see with other projects. Then again, most of the other projects (aside from GPUGrid) are not in a 'no work available' mode all that often.

As it is, for me the trade off is that World Grid is getting happier with me as the mix is shifting a bit toward it with my work units.




ID: 80039 · Rating: 0 · rate: Rate + / Rate - Report as offensive
Profile robertmiles

Send message
Joined: 16 Jun 08
Posts: 1224
Credit: 13,847,398
RAC: 1,953
Message 80226 - Posted: 23 Jun 2016, 5:21:10 UTC
Last modified: 23 Jun 2016, 5:26:24 UTC

Rosetta@Home appears to have had a problem that has blocked uploads for the last several hours.

This workunit is affected:

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=759093541

It appears that downloads and getting new workunits to clients are probably not affected.
ID: 80226 · Rating: 0 · rate: Rate + / Rate - Report as offensive
Emerald42

Send message
Joined: 9 Jun 08
Posts: 9
Credit: 1,534,626
RAC: 0
Message 80227 - Posted: 23 Jun 2016, 5:33:45 UTC
Last modified: 23 Jun 2016, 5:34:06 UTC

I must agree here. I can't upload several completed result files to the server either.
ID: 80227 · Rating: 0 · rate: Rate + / Rate - Report as offensive



©2024 University of Washington
https://www.bakerlab.org