Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
dagamier Joined: 12 Dec 05 Posts: 8 Credit: 2,942,707 RAC: 1,311 |
Why is it that in the Tasks tab, all of my work units show that they will be late (Completion before deadline) and already are late based on the deadline column? I'm on a 12-core Mac and all of my other projects burn through units, but the Rosetta one consistently has them all run late. I've even tried tweaking settings to give it a higher priority, but they are still consistently late. |
Joined: 29 Jan 08 Posts: 54 Credit: 1,778,862 RAC: 1,111 |
"Why is it that in the Tasks tab, all of my work units show that they will be late (Completion before deadline) and already are late based on the deadline column? I'm on a 12-core Mac and all of my other projects burn through units, but the Rosetta one consistently has them all run late. I've even tried tweaking settings to give it a higher priority, but they are still consistently late." The Tasks that your computer has completed have a very long compute time. Under your account for Rosetta, look at your Preferences for Rosetta and see what you have for the Targeted "CPU Run Time". Standard is 8 hours. |
Sid Celery Joined: 11 Feb 08 Posts: 2292 Credit: 43,161,777 RAC: 25,399 |
"Why is it that in the Tasks tab, all of my work units show that they will be late (Completion before deadline) and already are late based on the deadline column? I'm on a 12-core Mac and all of my other projects burn through units, but the Rosetta one consistently has them all run late. I've even tried tweaking settings to give it a higher priority, but they are still consistently late." Yes, it's definitely that. It looks like the Targeted CPU runtime is set at "1 day" (24hrs) while it's also taking 30hrs of 'wall clock time' to complete that 24hrs of CPU runtime. While I've just argued that runtimes below the 8hrs BOINC schedules at should be increased to an explicit 8hrs, runtimes that are more than 8hrs - and, importantly, are missing deadlines - do need to be reduced, but not necessarily all the way down to 8hrs. I run all my tasks *successfully* with a 12hr runtime, so I'd reduce to that level first, and I think it will work even if the wall clock time requires 15hrs to complete them. The downside is that more tasks will be required to run Rosetta for the same throughput at a time when they're few and far between, but it's other users who are the far bigger problem, as described in my earlier post this evening. |
Bill Swisher Joined: 10 Jun 13 Posts: 59 Credit: 46,885,830 RAC: 134,589 |
"I currently have my buffer set to store at least 0.15 and up to 0.25 additional days of work... ... just a 0.1 plus 0.1 cache size and 50% CPUs... ... 3 hours (Rosetta Beta I think) should be set explicitly at 8h" Me jumping in late, per normal. I realize there are strong opinions about the buffer size. But I have mine defined as 3 days +.01 for one simple reason... I'm often away from the computers for 4 days at a time and I have had them run out of work. I firmly believe that idle jiffies are the devil's playground. Having said that, I keep mine, don't even need to take my socks off so I can count all 8* of them, at 100% CPU 24x7. I've also gone in and removed all the limitations on run-time; what they want is what they get. Thanks to information provided by the wise folks here, I'm controlling which projects get priority via other means. *2 will go back to Alaska where there's already 4 running, where I live, in a little less than 2 weeks and 2 of them will go into summer hibernation, here in Arizona, until around Halloween when I return. |
Sid Celery Joined: 11 Feb 08 Posts: 2292 Credit: 43,161,777 RAC: 25,399 |
"I currently have my buffer set to store at least 0.15 and up to 0.25 additional days of work..." Sorry, but this really makes no sense at all. If deadlines are 3 days and you fill your buffer to 3 days and they have to run for the default 8 hours after downloading, then *ALL* will miss deadline by design. They don't for you, for one reason and one reason alone. The default runtime for these majority Rosetta Beta tasks turns out to be 3hrs, not 8hrs, so while BOINC *thinks* it's filling your cache to 3 days at the point of download, it only turns out to be ~1.125 days. You must see this. My small cache thing is just me. You can have larger if you want. That's not the problem, up to a total of, say, 2.5 days (+8hrs runtime = < 3 days). If they ever get round to fixing this 3hr default issue, you'll find out pretty quickly. The fact you're away for 4 days at a time shouldn't be relevant. BOINC polls for tasks several times a day. It will only fail if there are no tasks to grab, in which case that'll apply whether you're there or not. |
Sid Celery Joined: 11 Feb 08 Posts: 2292 Credit: 43,161,777 RAC: 25,399 |
I'm sure everyone already knows that boinc-process is down again - being Wednesday. I'd estimate about 10hrs ago. It's happened again. Early Thursday am UTC and boinc-process is back for validating (still 128k backlog atm). And Assimilators are still running (zero backlog). Plus, In progress tasks have jumped ~40k from 106k to 148k. Still 0 ready to send, but I've somehow already got a full cache here. Positive news right now at least. I'll look again in the morning. Edit: In progress just updated to 162k with 4674 showing ready to send. Edit 2: And validation backlog down to just 11k. Could everything really be fixed, or is it just a false dawn? |
Bill Swisher Joined: 10 Jun 13 Posts: 59 Credit: 46,885,830 RAC: 134,589 |
Consider me confused. Here are the properties of the last Rosetta that I downloaded, hopefully the formatting isn't terrible, it says it wants 8 hours of processing time. Application: Rosetta Beta 6.06; Name: 11mrredo_11_hallucinated_138_30_SAVE_ALL_OUT_3013178_65; State: Ready to start; Received: Wed 26 Mar 2025 10:07:34 AM MST; Report deadline: Sat 29 Mar 2025 10:07:34 AM MST; Estimated computation size: 80,000 GFLOPs; Executable: rosetta_beta_6.06_x86_64-pc-linux-gnu. I looked at another computer and it's pretty much a mirror of this also. In the meantime this computer is running WCG jobs that appear to have been downloaded 3 days ago. |
Joined: 28 Mar 20 Posts: 1804 Credit: 18,534,891 RAC: 2 |
"Edit: weird thing, but the assimilation backlog seems to have cleared down to zero at about the same time as validators went down. No idea what that's about" The Assimilators clear Tasks (move them from here to the main science database) once they have been Validated. No more Validation, no more new work for the Assimilators to do. Grant Darwin NT |
Joined: 28 Mar 20 Posts: 1804 Credit: 18,534,891 RAC: 2 |
"Why is it that in the Tasks tab, all of my work units show that they will be late (Completion before deadline) and already are late based on the deadline column? I'm on a 12-core Mac and all of my other projects burn through units, but the Rosetta one consistently has them all run late. I've even tried tweaking settings to give it a higher priority, but they are still consistently late." Because it takes you 30 hours to do only 24 hours of work, e.g. Run time 1 days 6 hours 11 min 55 sec, CPU time 23 hours 59 min 22 sec. And it takes you 3.5 days to return work, when the deadlines are only 3 days. Running more than one project, there is no need for a cache; 0.1 days and 0.01 additional days is plenty. You should also figure out why it's taking you so long to process a Task: either your system is busy doing other computationally intensive work as well as BOINC, or you have set in your Usage Limits settings "Use at most 100 % of CPU time" to something less than 100% (this is an option that really should be removed). Set it to 100% and be done with it; if you have cooling issues with your system, fix them. If it's a laptop, then just limit the number of cores/threads BOINC can use. On a lightly used system, the difference between CPU time and Run time shouldn't be much more than 5-10 mins for a Target time of 24hrs. Grant Darwin NT |
Joined: 28 Mar 20 Posts: 1804 Credit: 18,534,891 RAC: 2 |
"Consider me confused. Here are the properties of the last Rosetta that I downloaded, hopefully the formatting isn't terrible, it says it wants 8 hours of processing time." It is hard coded by the project, regardless of how long it may actually run, and regardless of what your Target CPU time is. Years back, due to problems with the initial Estimated completion time estimates, people were getting smashed with 1,000s of Tasks they had no hope of doing in time. One of the suggestions was to set the initial Estimated completion time to the project's default value (which is 8 hours). The other suggestion (and the preferred option) was to set it to the project's default value if no Target CPU time was set by the account holder. If they had a set Target CPU time, then use that time for their initial Estimated completion time. Unfortunately, they went with the first option. Grant Darwin NT |
Joined: 28 Mar 20 Posts: 1804 Credit: 18,534,891 RAC: 2 |
"Edit: In progress just updated to 162k with 4674 showing ready to send." I'm thinking false dawn. The Assimilators appeared to do OK early on as the Validation backlog was cleared. And the amount of work In progress climbed nicely as there were Tasks Ready to Send, for a while. But now the In progress work appears to have plateaued and is dropping again as the Tasks Ready to Send is pretty much back to 0 again, and the assimilator backlog continues to grow, again. Grant Darwin NT |
Joined: 28 Mar 20 Posts: 1804 Credit: 18,534,891 RAC: 2 |
Ready to send is still pretty much 0, In progress continues to decline, Assimilator backlog continues to grow. It's still borked. Grant Darwin NT |
Matthias Lehmkuhl Joined: 20 Nov 05 Posts: 13 Credit: 2,573,510 RAC: 1,555 |
I can't report my finished and uploaded work. Message: Rosetta@home 27.03.2025 12:19:56 (CET) Server error: feeder not running. Project status says the feeder is running. Matthias |
Sid Celery Joined: 11 Feb 08 Posts: 2292 Credit: 43,161,777 RAC: 25,399 |
"Consider me confused. Here are the properties of the last Rosetta that I downloaded, hopefully the formatting isn't terrible, it says it wants 8 hours of processing time." What Grant said is right. Before the task is started, the runtime shown to BOINC is hard-coded to be 8hrs. That's what you're showing us above. But the moment the task starts running, the Rosetta Beta task's own internal runtime, mistakenly imo, is set to 3hrs and takes over. Look at a running task to confirm that - as it progresses the remaining runtime veers toward 3hrs. That's the *only* reason you don't miss deadlines. Which is why I make the suggested changes to Target CPU runtime that I have. Your tasks all run too short, you grab more tasks than you should, and if it wasn't for the large queue we currently have, which is an anomaly tbh, we'd all run out much quicker than we should. |
Sid Celery Joined: 11 Feb 08 Posts: 2292 Credit: 43,161,777 RAC: 25,399 |
"Edit: In progress just updated to 162k with 4674 showing ready to send." "I'm thinking false dawn." I'm now agreeing with you. In progress has dropped back to currently 133k, Assimilator backlog increasing again and ready to send dropping nearer to zero. *sigh* It's the hope that kills you... Thanks for explaining Assimilators - I never really understood that. More the project's problem than ours, so I'll stop caring about that one and leave it to them. Except to the extent that it indicates an ongoing problem everywhere. Nice while it lasted (a few hours). |
Lem Novantotto Joined: 13 Sep 23 Posts: 7 Credit: 1,713,174 RAC: 39,093 |
"I can't report my finished and uploaded work" I have the exact same issue with 2 of my 3 computers (all running Linux). Oddly enough, the third one is reporting its work without any problem. All three of my computers are behind the same router. -- Bye, Lem |
Jean-David Beyer Joined: 2 Nov 05 Posts: 217 Credit: 7,399,159 RAC: 7,462 |
"I can't report my finished and uploaded work" Me too. Running Linux. |
Joined: 1 Dec 05 Posts: 2054 Credit: 10,773,622 RAC: 11,574 |
C'mon, guys of Rosetta, there are over 9 million WUs to crunch. Open the floodgates and let it flow. |
Joined: 30 May 06 Posts: 5726 Credit: 5,966,803 RAC: 1,731 |
9 mill? I don't see that. Be sure to subtract what is going to their AI system. We just used up 1.33+ million tasks. With the current total of users with credit, there are barely 9 tasks per user that were processed. And for the system... well, what else is new? Seriously... SOS, if you know what I mean, and not the distress symbol. |
Joined: 1 Dec 05 Posts: 2054 Credit: 10,773,622 RAC: 11,574 |
"9 mill? I don't see that" Home page. "Be sure to subtract what is going to their AI system." Their AI system is internal, so, no, you can't see the queue. |
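
The deadline arithmetic argued over in this thread can be checked with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical and not part of BOINC or Rosetta, and the "last cached task waits for the whole cache, then runs" model is a deliberate simplification of how the client schedules work. The figures (3-day deadline, 8-hour default, 3-hour actual runtime, and the Run time / CPU time pair) are the ones quoted in the posts above.

```python
from datetime import timedelta

# Report deadline quoted in the thread (3 days after a task is received).
DEADLINE = timedelta(days=3)


def wall_clock_for_target(target_cpu, observed_cpu, observed_wall):
    """Scale a Target CPU run time by an observed wall-clock/CPU ratio."""
    ratio = observed_wall / observed_cpu  # timedelta / timedelta -> float
    return target_cpu * ratio


def worst_case_turnaround(cache, task_wall):
    """Simplified model: the last cached task waits `cache`, then runs."""
    return cache + task_wall


def effective_cache(cache, estimated_runtime, actual_runtime):
    """Real depth of a cache that was sized from a too-long estimate."""
    return cache * (actual_runtime / estimated_runtime)


# Run time and CPU time quoted for the 12-core Mac.
observed_wall = timedelta(days=1, hours=6, minutes=11, seconds=55)
observed_cpu = timedelta(hours=23, minutes=59, seconds=22)

# Wall-clock time to expect if the target is cut from 24 hours to 12 hours.
wall_12h = wall_clock_for_target(timedelta(hours=12), observed_cpu, observed_wall)
print("12h target, wall clock:", wall_12h)

# A 3-day cache plus the 8-hour default overshoots the deadline,
# while a 2.5-day cache plus 8 hours stays inside it.
for cache_days in (3, 2.5):
    total = worst_case_turnaround(timedelta(days=cache_days), timedelta(hours=8))
    print(cache_days, "day cache:", total, "late" if total > DEADLINE else "on time")

# A cache sized as 3 days at 8h per task holds less when tasks stop at 3h.
print("effective cache:", effective_cache(
    timedelta(days=3), timedelta(hours=8), timedelta(hours=3)))
```

Under these assumptions the 12-hour target works out to a little over 15 hours of wall-clock time, and the nominal 3-day cache shrinks to 1 day 3 hours (1.125 days), which is consistent with the numbers given in the posts.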
©2025 University of Washington
https://www.bakerlab.org