Problems with web site

hedera
Joined: 15 Jul 06
Posts: 76
Credit: 5,138,180
RAC: 840
Message 58447 - Posted: 4 Jan 2009, 6:29:10 UTC

I just realized my CPU usage was essentially nil, and when I checked BOINC, the message log shows it's been requesting tasks from Rosetta all day (since 8:15 AM local) and never gotten any tasks to work on.

I'm running BOINC Manager 6.2.19 on Windows XP SP2, with 4 GB of RAM and a 3.2 GHz Pentium 4 CPU. I've never seen this problem before. Every request for tasks gets this:

01/03/2009 10:23:32 PM|rosetta@home|Sending scheduler request: Requested by user. Requesting 60480 seconds of work, reporting 0 completed tasks
01/03/2009 10:23:37 PM|rosetta@home|Scheduler request succeeded: got 0 new tasks

Is there something wrong with requesting 60480 seconds of work? Do I need to upgrade BOINC Manager again? Have you no tasks available? Help?
--hedera

Never be afraid to try something new. Remember that amateurs built the ark. Professionals built the Titanic.
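
For anyone who wants to check a log like the one above without reading it line by line, here is a minimal sketch. It assumes the pipe-delimited layout shown in this post (timestamp|project|message); the function names are made up for illustration and are not part of BOINC.

```python
import re

# Layout assumed from the two log lines quoted above: timestamp|project|message
REQUEST_RE = re.compile(r"Requesting (\d+) seconds of work")
REPLY_RE = re.compile(r"got (\d+) new tasks")

def parse_line(line):
    """Split one message-log line into (timestamp, project, message)."""
    timestamp, project, message = line.strip().split("|", 2)
    return timestamp, project, message

def summarize(lines):
    """Return (requested seconds, tasks received), one tuple per request/reply pair."""
    pairs = []
    pending = None  # seconds asked for in the most recent unanswered request
    for line in lines:
        _, _, message = parse_line(line)
        m = REQUEST_RE.search(message)
        if m:
            pending = int(m.group(1))
            continue
        m = REPLY_RE.search(message)
        if m and pending is not None:
            pairs.append((pending, int(m.group(1))))
            pending = None
    return pairs

log = [
    "01/03/2009 10:23:32 PM|rosetta@home|Sending scheduler request: Requested by user. Requesting 60480 seconds of work, reporting 0 completed tasks",
    "01/03/2009 10:23:37 PM|rosetta@home|Scheduler request succeeded: got 0 new tasks",
]

for seconds, tasks in summarize(log):
    print(f"asked for {seconds / 3600:.1f} hours of work, got {tasks} tasks")
```

60480 seconds works out to 16.8 hours, which is less than a day of work, so the size of the request on its own does not look unusual.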

Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,670,291
RAC: 2,328
Message 58452 - Posted: 4 Jan 2009, 9:33:09 UTC - in response to Message 58447.  

There is no work available for the time being due to a system problem. Perhaps the team will correct it by sometime Monday morning, Pacific time. Hang in there. See the "server problems" and "not getting any work" threads to follow what's going on. You can also tell from the home page stats whether there is any work, or look at the server status page via the link on the lower left-hand side of the home page.


robertmiles
Joined: 16 Jun 08
Posts: 1223
Credit: 13,824,497
RAC: 2,340
Message 58455 - Posted: 4 Jan 2009, 11:13:40 UTC - in response to Message 58447.  

The server status page shows that one of the two work generator programs is down, and the other one can't keep up with the demand for more workunits. There were only 22 workunits available the last time I looked, not necessarily including any suitable for your machine.

For cases like this, you might want to add another project that is fairly reliable at supplying workunits, but give it only a small fraction of your available CPU time:

http://boinc.fzk.de/poem/

This one, at least, offers rather short workunits compared to many BOINC projects. Also, it is working on protein folding, as Rosetta@home is. It doesn't claim to be targeting any specific diseases, though.

Giving it only a small share of your available CPU time means that even if you keep it enabled, it normally won't use much of your machine, but it will still fill in any times when you can't get workunits from Rosetta@home.
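
To make "a small share" concrete: BOINC splits CPU time among attached projects roughly in proportion to each project's resource share setting, at least over the long run. The sketch below only does that arithmetic. The share values (100 and 5) and the project label "backup project" are example numbers and names, not recommendations from this thread.

```python
def cpu_fractions(shares):
    """Map each project to its fraction of CPU time, proportional to its resource share."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive resource share")
    return {name: share / total for name, share in shares.items()}

# Example values only: Rosetta@home at 100, a backup project at 5.
shares = {"rosetta@home": 100, "backup project": 5}
for name, fraction in cpu_fractions(shares).items():
    print(f"{name}: {fraction:.1%}")
```

With these numbers the backup project would get a little under 5% of the CPU time while both projects have work available.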

mikey
Joined: 5 Jan 06
Posts: 1894
Credit: 8,533,765
RAC: 9,759
Message 58457 - Posted: 4 Jan 2009, 11:28:26 UTC - in response to Message 58455.  

Thanks, I was looking for another project with short turnaround times AND one that was doing something for science. I always thought Poem had to do with poetry, silly me!!! I couldn't right poetry if my life depended on it, so I always ignored the site. Thanks again!!!

robertmiles
Joined: 16 Jun 08
Posts: 1223
Credit: 13,824,497
RAC: 2,340
Message 58465 - Posted: 4 Jan 2009, 13:15:28 UTC - in response to Message 58457.  
Last modified: 4 Jan 2009, 13:16:41 UTC

You're welcome.

The SIMAP site also has short workunits, but is only active perhaps one week every month. Active today, though.

http://boinc.bio.wzw.tum.de/boincsimap/

The malariacontrol.net site also has short workunits, but is becoming less active and often doesn't have any workunits available.

http://www.malariacontrol.net/

Neither is as closely related to what Rosetta@home is doing, though.

You might also want to watch Cels@home. It is currently inactive for some changes, including a move to another server, and it had fewer workunits than requested even when it was active. Typical workunits were about 6 CPU hours on my machine.

http://cels-at-home-dev.dyndns.org/cels/

This website is related to Cels@home, but it's not very clear if that's where they plan to move:

http://ficp.engr.utexas.edu/cels/

Predictor@home has been essentially inactive for several months; I don't know how long the workunits are.

World Community Grid tends to provide long workunits, but is about to go inactive in order to move to another server site.

Sid Celery
Joined: 11 Feb 08
Posts: 1965
Credit: 38,160,504
RAC: 9,210
Message 58487 - Posted: 4 Jan 2009, 18:23:00 UTC - in response to Message 58457.  

"I couldn't right poetry if my life depended on it..."

Surely you mean you couldn't write poetry if your life depended on it. That sounds right.

Ok, sorry. I'm waiting for work and have nothing better to say. You may beat me with a stick...

upstatelabs
Joined: 22 Jun 06
Posts: 10
Credit: 516,767
RAC: 0
Message 58700 - Posted: 9 Jan 2009, 15:56:18 UTC

Seems as though there aren't any new WUs available....

Any idea when it'll be back to normal?

Greg_BE
Joined: 30 May 06
Posts: 5658
Credit: 5,670,291
RAC: 2,328
Message 58702 - Posted: 9 Jan 2009, 17:22:34 UTC - in response to Message 58700.  

No one but the team does. It's 9:20 AM back in Seattle, so they should be looking into the problem by now.

upstatelabs
Joined: 22 Jun 06
Posts: 10
Credit: 516,767
RAC: 0
Message 58703 - Posted: 9 Jan 2009, 19:34:12 UTC - in response to Message 58702.  

Things seem to be fixed... I have WUs again :)


Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 58708 - Posted: 9 Jan 2009, 22:16:24 UTC

I have not been able to tell if they are issuing again, but WCG is on the air and taking work back and the web site is up again ...

I am not sure if they are issuing work again or not ... I can't tell from my buffers ... and I just lowered their priority so I am probably not asking for work yet ...

But most of their projects are Bio related ...

robertmiles
Joined: 16 Jun 08
Posts: 1223
Credit: 13,824,497
RAC: 2,340
Message 58713 - Posted: 10 Jan 2009, 6:02:05 UTC - in response to Message 58708.  

They are issuing workunits again, although perhaps not as many. They're still checking for any unexpected effects of their new environment, such as running their server software on new servers.

Kurre
Joined: 12 Apr 06
Posts: 9
Credit: 69,240
RAC: 0
Message 58806 - Posted: 14 Jan 2009, 15:20:27 UTC

My computer can't report work. This is the last error indication I could find in my logs. I have about 5 workunits that my computer tried to report but couldn't. They just disappear from my computer and must be floating out there somewhere in the big cloud.
Is it my installation, or is this a general issue?

Running BOINC 6.4.5

2009-01-14 15:32:23||Internet access OK - project servers may be temporarily down.
2009-01-14 15:32:23|rosetta@home|Finished upload of abinitio_norelax_homfrag_129_B_2ccvA_SAVE_ALL_OUT_4626_12633_0_0
2009-01-14 15:32:25|rosetta@home|Scheduler request failed: Transferred a partial file
2009-01-14 15:33:25|rosetta@home|Sending scheduler request: To fetch work.

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 58808 - Posted: 14 Jan 2009, 16:07:46 UTC

If the tasks were removed from the list in your BOINC Manager, then another scheduler request went through successfully. BOINC will automatically retry until it goes through.
Rosetta Moderator: Mod.Sense
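
The automatic retry described above can be pictured as a loop that waits longer after each failure. This is only an illustrative sketch of retry with exponential backoff. The `send_request` callable, the delays, and the attempt limit are hypothetical and are not taken from the BOINC client source.

```python
import time

def retry_with_backoff(send_request, max_attempts=5, base_delay=1.0,
                       max_delay=60.0, sleep=time.sleep):
    """Call send_request() until it returns True, doubling the wait after each failure."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        if send_request():
            return attempt  # number of attempts it took
        if attempt < max_attempts:
            sleep(delay)
            delay = min(delay * 2, max_delay)  # cap the wait at max_delay
    return None  # gave up after max_attempts failures

# Simulated scheduler that fails twice (e.g. "Transferred a partial file") and then succeeds.
outcomes = iter([False, False, True])
waits = []  # record the waits instead of really sleeping
attempts = retry_with_backoff(lambda: next(outcomes), sleep=waits.append)
print(attempts, waits)
```

Passing `sleep=waits.append` keeps the demonstration instant; with the default `time.sleep` the loop would really pause between attempts.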

Kurre
Joined: 12 Apr 06
Posts: 9
Credit: 69,240
RAC: 0
Message 58810 - Posted: 14 Jan 2009, 16:44:53 UTC - in response to Message 58808.  
Last modified: 14 Jan 2009, 16:47:44 UTC

The thing is that they disappear from my local BOINC client, but they are still marked as in progress and new on the website. This is an example:
220365003 200760705 12 Jan 2009 19:28:50 UTC 22 Jan 2009 19:28:50 UTC In Progress Unknown New

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 58813 - Posted: 14 Jan 2009, 18:26:48 UTC

The only way I know of for tasks to show on the website, but not appear on your local BOINC Manager display is when your machine never received the work in the first place. The server thinks that it assigned it to you, but your machine never saw it. Some people called these "ghost WUs".

Many of the task names are very similar. The simplest way to tell them apart is by the numbers at the very end of the name. Are you certain these are the tasks that you completed?

Here is a link to your host.
Rosetta Moderator: Mod.Sense
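
One way to do the comparison suggested above is to collect the task names shown on the website and the names that appear in your local message log, then take the difference. A minimal sketch, assuming both lists have been copied by hand; the helper names are invented for illustration, and the two sample names are borrowed from logs posted in this thread purely as examples.

```python
import re

def trailing_numbers(task_name):
    """Return the run of numeric fields at the very end of a task name."""
    match = re.search(r"(?:_\d+)+$", task_name)
    return match.group(0).lstrip("_") if match else ""

def find_ghosts(website_tasks, local_tasks):
    """Tasks the server lists for this host that the local client has never seen."""
    return sorted(set(website_tasks) - set(local_tasks))

website = [
    "abinitio_norelax_homfrag_129_B_2ccvA_SAVE_ALL_OUT_4626_12633_0_0",
    "abinitio_norelax_homfrag_129_B_1ten__SAVE_ALL_OUT_4626_10396_0_0",
]
local = ["abinitio_norelax_homfrag_129_B_2ccvA_SAVE_ALL_OUT_4626_12633_0_0"]

for name in find_ghosts(website, local):
    # The trailing numbers are the quickest way to tell similar names apart.
    print(trailing_numbers(name), name)
```

Anything this prints is a candidate "ghost WU": the server believes it was sent, but the local log has no record of it.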

Kurre
Joined: 12 Apr 06
Posts: 9
Credit: 69,240
RAC: 0
Message 58814 - Posted: 14 Jan 2009, 19:29:25 UTC - in response to Message 58813.  
Last modified: 14 Jan 2009, 19:36:38 UTC

Seems like I have 2 different problems.
Today I had the same problem that I had a few years ago: the server was down, but my client didn't care about that and just flushed the results.

13-Jan-2009 17:28:02 [rosetta@home] Finished upload of t075_1_NMRREF_1_t075_1_S_00002_0000200IGNORE_THE_REST_070000_6211_20_0_0
13-Jan-2009 20:50:14 [rosetta@home] Finished upload of abrelax_nofilter_-1n0u_-SAVE_ALL_OUT_6206_14591_0_0
13-Jan-2009 22:07:34 [rosetta@home] Finished upload of MaR214A_t071_1_RDC_NMR_NESG_SAVE_ALL_OUT_6215_13455_0_0
14-Jan-2009 15:32:23 [rosetta@home] Finished upload of abinitio_norelax_homfrag_129_B_2ccvA_SAVE_ALL_OUT_4626_12633_0_0
14-Jan-2009 17:21:43 [rosetta@home] Finished upload of abinitio_norelax_homfrag_129_B_1ten__SAVE_ALL_OUT_4626_10396_0_0
14-Jan-2009 19:25:12 [rosetta@home] Finished upload of t076_1_NMRREF_1_t076_1_idid_model_05_coreIGNORE_THE_REST_idl_6217_7848_0_0


Then in another log I found these notes, which indicate problems at reboot or uncontrolled shutdowns of the client. I had that feeling before today, that the error happened around or after reboots. I had to reboot the computer a few times after patches and upgrades following a 3-week vacation.

cant find C:ProgramBOINC\RebootPending.txt

Let's hope that the double \ is just an error in the log string ;-)

So no, I can't be sure yet that my client got all those jobs. I have to dig a bit deeper into the log files before I can say that, but I don't have the time for that right now.

Could there be some problems in the code that handles restarting the workunits? It is probably a common piece of code shared by all your projects.

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 58815 - Posted: 14 Jan 2009, 20:23:07 UTC

Mike posted yesterday some of the issues they have been working on for the next release.

I don't know the details of exactly what he means, but as I read his comment about "Bug fix in checkpointing machinery, states were not being correctly restored", I would say it sounds possible that this is exactly what you are talking about. If so, then yes, some problems were uncovered, and fixes are being tested and should be available in the next release.
Rosetta Moderator: Mod.Sense

Kurre
Joined: 12 Apr 06
Posts: 9
Credit: 69,240
RAC: 0
Message 58818 - Posted: 14 Jan 2009, 23:00:59 UTC - in response to Message 58815.  

Ah, OK. The one talking about instability in handling text files might fit my earlier problem, because my machine gets an error number 2 from XP. That is "can't find file", and it's text files that don't seem to be handled OK when a restart is done. It seems to affect any WU, so it's probably fixed now.

The error I had today is probably a harder one to isolate, because it might not be easily reproducible. Your server or my connection has to be in a special state so that the BOINC client thinks everything is OK until it's too late (files already deleted from my client). It's not easy to get a two face commit to work properly, or whatever it's called today.

Some wasted CPU time, but you can't get them all.

MM Sihombing
Joined: 22 May 06
Posts: 15
Credit: 1,424,082
RAC: 0
Message 58820 - Posted: 14 Jan 2009, 23:23:15 UTC

User of the day has been stuck for several days I think.

Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 58821 - Posted: 14 Jan 2009, 23:37:44 UTC - in response to Message 58818.  

It is called Two Phase Commit, and actually it is not hard to make work at all ... it is just that the BOINC developers probably did not think it was really important to implement a proper two phase commit.

By their lights, it isn't ... the actual science is, and has been loaded, so, the data that they care about has been moved to the server. The trivia of proper accounting of credit and things like that are not that important to them ...

As to the last statement ... well ... in relational databases, the two phase commit protocol is core to all activities ... including MySQL and SQL Server which are used by projects for BOINC ...
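
For readers who have not met the term, here is a minimal sketch of two phase commit, applied to the situation described above: a client should only delete a result once the server has accepted it. The classes and method names are hypothetical, and this is not how the BOINC client or server is written. It only illustrates the shape of the protocol: everyone votes in phase one, and the coordinator commits in phase two only if every vote was yes.

```python
class Participant:
    """One side of the transaction, e.g. the client's file store or the server's database."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: promise to commit if asked, or vote no.
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self):
        # Phase 2 (success): make the change permanent.
        self.state = "committed"

    def abort(self):
        # Phase 2 (failure): undo anything done while preparing.
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if every participant committed, False if the transaction was aborted."""
    votes = [p.prepare() for p in participants]  # ask everyone before deciding
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False


client = Participant("client: delete uploaded result")
server = Participant("server: record result", can_commit=False)  # simulate a server that is down
print(two_phase_commit([client, server]), client.state, server.state)
```

Because the simulated server votes no, the client aborts as well and would keep its files instead of deleting them.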



©2024 University of Washington
https://www.bakerlab.org