Well, you said you wanted feedback ...

Author	Message
Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 4800 - Posted: 30 Nov 2005, 15:47:44 UTC
Y'all said you wanted our impressions after running the graphics for a bit. So, I have some ... probably uninformed and wrong ... but ... what the heck ... after 4 hours watching, luckily on two different passes on the same end target ...
1) Are the vector creations at a set number of steps? I have not been able to figure this out yet. If so, I, for one, would like a countdown to the next vector.
2) Looking at an example, it seems that there are places that should articulate, but don't. Could we get a "dot" at those places where it is possible for the model to articulate the "joint"? When I looked at the processing of an example, it seemed that there should be more "joints".
3) Also looking at this example, it seems that the lengths of the segments are different in the working model and in the "known" solution. Is this significant? Or an artifact? Or a bug?
4) When I look at the example while processing, some of it was very close. Yet, at a number of the "joints" where it would have been nice to see some movement, nothing changed. For example, if we start at the red end, there is a "plateau" followed by an angle joint. Yet, the "native" shows the opposite. Yet, not once in the testing did the program change the sequencing at these locations. Another example is the blue end: not once did it try the blue end pointing out in the other direction.
5) There was a lot more "flailing" about than I would have expected. Worse, this seemed more prevalent than attempts to test movement in more "joints". For example, the blue end would "kick out", yet I saw little change to the bends in the length of the blue segment. Again, not knowing for sure where the articulation points might be hampers my vision, but it sure seemed to me that many possible changes were never attempted. For this, might it be possible to retain statistics on which "joints" are moved? Then those that have not had any movement could gradually become more likely candidates for additional adjustment. In this example, there was little productive change in the general structure that seemed to be very close to the target. I am of course assuming that the orientation in space is not relevant. Again, in my observation there was little seeming progress in the adjustment of the portion of the structure that held the most promise.
6) The step-by-step changes seemed to be global; in other words, each step was the end result of several changes. I know, or think I know, we are looking at a two-dimensional representation of a three-dimensional action space, but there seemed to be little rationality in the application of changes.
7) "Successful" structures did not seem to be preserved across vectors. I am guessing that this is by intent. However, again in this example, the red and blue portions are "close", but this area of 3D space was not retained as a potential partial solution. This might require that the energy solution for an area of 3D space be tracked independently of the total space. If, for example, the energy quotient of a particular "tube" following the structure was maintained, with a hierarchy, the secondary structure "tube" would show, and allow retention of potential partial solutions. What I mean is that the blue tube, red-brown tube, and brown-green tubes, lying "next" to each other in 3D space (or seeming to), would be enclosed by another larger "tube".
======
I am not sure if any of these observations make sense. But, again, for what it is worth ... my main suspicion though is that I was seduced by the native representation and may have read more into what is being done than is there (or less?).
ID: 4800 · Rating: 2 · rate: / Reply Quote
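Point 5 above (keep statistics on which "joints" are moved, and gradually favour the ones that have not moved) could be sketched roughly as follows. This is a toy illustration only: the `JointMoveTracker` class, the joint count of 20, and the 1 / (1 + moves) weighting rule are assumptions made up for the example and are not taken from Rosetta.

```python
import random
from collections import Counter

class JointMoveTracker:
    """Toy bookkeeping: count moves per joint and favour neglected joints."""

    def __init__(self, n_joints, rng=None):
        self.n_joints = n_joints
        self.moves = Counter()              # joint index -> number of moves so far
        self.rng = rng or random.Random()

    def pick_joint(self):
        # Weight each joint by 1 / (1 + moves): the less a joint has moved,
        # the more likely it is to be picked next.
        weights = [1.0 / (1 + self.moves[j]) for j in range(self.n_joints)]
        joint = self.rng.choices(range(self.n_joints), weights=weights, k=1)[0]
        self.moves[joint] += 1
        return joint

    def never_moved(self):
        # Joints that have not been selected even once.
        return [j for j in range(self.n_joints) if self.moves[j] == 0]

tracker = JointMoveTracker(n_joints=20, rng=random.Random(42))
for _ in range(2000):
    tracker.pick_joint()

print("moves per joint:", dict(sorted(tracker.moves.items())))
print("never moved:", tracker.never_moved())
```

The same counters could just as well be used passively, only to report which joints a trajectory touched, without changing how joints are chosen.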

Christian Diepold Send message Joined: 23 Sep 05 Posts: 37 Credit: 300,225 RAC: 0	Message 4930 - Posted: 2 Dec 2005, 9:41:36 UTC Any comments from a dev on this, in my opinion, great list by Paul? ID: 4930 · Rating: 0 · rate: / Reply Quote

Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 5036 - Posted: 3 Dec 2005, 15:35:47 UTC Well, I guess not ... ID: 5036 · Rating: 0 · rate: / Reply Quote

Christian Diepold Send message Joined: 23 Sep 05 Posts: 37 Credit: 300,225 RAC: 0	Message 5057 - Posted: 3 Dec 2005, 22:06:48 UTC :( Thx for the list Paul, it's a shame it's going to waste. ID: 5057 · Rating: 0 · rate: / Reply Quote

Jack Schonbrun Send message Joined: 1 Nov 05 Posts: 115 Credit: 5,954 RAC: 0	Message 5085 - Posted: 4 Dec 2005, 5:04:52 UTC - in response to Message 5057.
:( Thx for the list Paul, it's a shame it's going to waste.
Hey, there's a lot to digest in there! I'm still processing it. :) Mostly I'm thinking about how to make it easy for us to speak a common language. One thing I found quite interesting, though, in Paul's screen shot was that you can see the RMSD improve quite a lot during the "Full Atom Relax" stage. This means that, despite some weaknesses you might see in this trajectory, our protocol is actually working to make the structure more like the native.
ID: 5085 · Rating: 0 · rate: / Reply Quote

Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 5097 - Posted: 4 Dec 2005, 11:24:13 UTC
AH, ok, the lesson is ... "Hey, there's a lot to digest in there! I'm going to need time to finish processing it." As in, let us know you saw it ... :) When silence is deafening, it does get noticed. The appearance was that my post was not seen, or it was of such insufficient interest that ... well ... the horse is down and I will put away the stick ... :)
I agree that there was progress. The more general point was that the progress potential was limited, AND was not significantly improving the conformation. Or, that is the impression I had. In very early threads we tried to discuss why we were not getting coverage of the total plotted "space" on the RMSD charts as shown in the top predictions. Now, to my simple mind, watching the convergence chart you extracted, it seemed that the trend was to the good, but only for a period of time; then it spent a lot of time "repeating" itself. Or more specifically, it seemed to get itself stuck in a "local minimum", where the cross hairs are at this point.
Again, these are only "impressions", but there seem to be a lot of repeated movements, which to my programmer's mind calls the effective randomness into question. What I mean here is that, because we do not truly generate random numbers, the extraction of random values and their application can, in effect, remove some of the effective randomness. So, because we pull off numbers and apply them to an algorithm in a static pattern, if that pattern interacts with the RND function we do not truly do random searches. This can be complicated by the fact that many of the RND functions provided, um, stink. Did you write your own? Or are you using a library function? If you are using a library function, did you test it for its period and quality? Another task for the math mavens. My best bad example is the Apple II, which was supposed to have a period in the millions but had a period of only 17,000 and change.
New conceptual thought. Does the program have the ability to track the general direction of the trends?
ID: 5097 · Rating: 0 · rate: / Reply Quote
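The "coupling" worry raised above can be made concrete with a deliberately weak generator. The sketch below is not ran3 and not anything from Rosetta: it uses a textbook multiply-and-add generator with illustrative constants, and a hypothetical application loop (`joints_visited`) that draws two numbers per step, one to pick a joint and one to pick an angle. Because the low two bits of this particular generator simply cycle, the fixed two-draws-per-step pattern means half of the four joints are never selected.

```python
import random

def lcg(seed, a=1103515245, c=12345, m=2**32):
    """Textbook linear congruential generator (illustrative constants, not ran3)."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state

def joints_visited(next_int, n_joints=4, steps=1000):
    """Hypothetical application loop: two draws per step, always in the same order."""
    visited = set()
    for _ in range(steps):
        joint = next_int() % n_joints   # first draw: which joint to move
        _angle = next_int() % 360       # second draw: new angle (unused in this toy)
        visited.add(joint)
    return visited

weak = lcg(seed=1)
weak_visited = joints_visited(lambda: next(weak))

good = random.Random(1)                 # Python's built-in Mersenne Twister
good_visited = joints_visited(lambda: good.getrandbits(32))

print("weak generator visits joints:   ", sorted(weak_visited))
print("Mersenne Twister visits joints: ", sorted(good_visited))
```

The generator on its own would pass a casual glance; the problem only appears once the application's static draw pattern lines up with the generator's structure, which is why testing the number stream as the program actually consumes it is a separate question from trusting the algorithm.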

Hoelder1in Send message Joined: 30 Sep 05 Posts: 169 Credit: 3,915,947 RAC: 0	Message 5103 - Posted: 4 Dec 2005, 13:17:28 UTC - in response to Message 5097. Last modified: 4 Dec 2005, 13:18:18 UTC
What I mean here is that, because we do not truly generate random numbers, the extraction of random values and their application can, in effect, remove some of the effective randomness. So, because we pull off numbers and apply them to an algorithm in a static pattern, if that pattern interacts with the RND function we do not truly do random searches. This can be complicated by the fact that many of the RND functions provided, um, stink. Did you write your own? Or are you using a library function? If you are using a library function, did you test it for its period and quality?
See this post by David K. where he states that RAN3 from Numerical Recipes (Knuth's portable subtractive generator) is used. Numerical Recipes states about RAN3: "One might hope that its weaknesses, if any, are therefore of a highly different character from the weaknesses, if any, of RAN1 above. If you ever suspect trouble with one routine, it is a good idea to try the other in the same application." I doubt, though, that that would make a lot of difference...
ID: 5103 · Rating: 0 · rate: / Reply Quote

Jack Schonbrun Send message Joined: 1 Nov 05 Posts: 115 Credit: 5,954 RAC: 0	Message 5133 - Posted: 4 Dec 2005, 22:53:17 UTC
Vanita is correct that we don't expect to see every joint moving with the same frequency. In fact, one of the strengths of the Rosetta algorithm is that it can often predict with some accuracy which parts of the chain are more flexible than others. This means that we don't have to search all possible chain configurations, which would take an impossibly long time. So you shouldn't expect to see completely random motion of the chain. The downside of such an algorithm is that it can err, and not flex the right parts. This may be what's happening when you see regions of the chain that are wrong, yet never change.
The great thing about Rosetta@home is that we can use all of your computer power to tip the balance a little bit. We now don't have to rely so much on our initial predictions of what is flexible and what isn't. This may not be obvious in any particular trajectory. One of our methods for adding more flexibility is, paradoxically, to make some regions of the chain less flexible, by restraining the joints to certain angles. But we restrain them to different angles in each WU. So we force many different values, and across all the Work Units we sample many different possibilities. By definition, most will be wrong. But our energy function should be able to pick out the ones that are right.
ID: 5133 · Rating: 0 · rate: / Reply Quote

Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 5154 - Posted: 5 Dec 2005, 7:41:41 UTC
Ok, how about the small suggestions about changing the graphics? Well, at least I created food for thought. And I would certainly have my math mavens checking the RND function. You would be surprised at how easy it is to fall into a trap where the RND function actually is so effectively non-random that you will never get to where you want to go. I only suggest this based on my suspicions, but, in my defense, I used to be a systems engineer ... so, it is an "informed" suspicion.
We as a collective of scientists and participants have been staring at a problem space where we are not seeing "coverage" as expected (the Top Predictions) and with the graphics, MY eyes are telling me that things are not as random as I would have expected. I grant that this could very well be because of the explanations of the available conformations. However, I have been concentrating on looking at this one protein and how it works, and I have to tell you that the changes to the red end either never vary from the initial positions, or the variations never bring the trial closer to the expected.
ID: 5154 · Rating: 0 · rate: / Reply Quote

Hoelder1in Send message Joined: 30 Sep 05 Posts: 169 Credit: 3,915,947 RAC: 0	Message 5164 - Posted: 5 Dec 2005, 8:57:35 UTC - in response to Message 5154.
We as a collective of scientists and participants have been staring at a problem space where we are not seeing "coverage" as expected (the Top Predictions) and with the graphics, MY eyes are telling me that things are not as random as I would have expected. I grant that this could very well be because of the explanations of the available conformations.
Presumably you are referring to the lack of coverage below 2 Å RMSD? Well, you have to keep in mind that we "are staring at" a one-dimensional representation of a parameter space of very high dimensionality. So the volume of parameter space below, say, 2 Å RMSD is very much smaller (by many orders of magnitude) than you might suspect by looking at the RMSD vs. energy plot, and it is really no wonder that it is so hard to find structures that fall into that range.
As to the random number generator that is being used, this is probably one of the most widely used for similar Monte Carlo applications. So if it had a serious flaw, a lot of projects would be in trouble.
ID: 5164 · Rating: 0 · rate: / Reply Quote
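A rough way to see why the region below 2 Å is so small: in a space of d dimensions, the fraction of a ball of radius R that lies within radius r of its centre is (r/R)^d. Treating RMSD as a radius and taking 10 Å as the outer radius are simplifying assumptions chosen purely for illustration; this is not a model of the real conformational space.

```python
def fraction_within(r, R, d):
    """Fraction of a d-dimensional ball of radius R lying within radius r of its centre."""
    return (r / R) ** d

# Assumed numbers for illustration: inner radius 2, outer radius 10.
for d in (1, 2, 3, 10, 50):
    frac = fraction_within(2.0, 10.0, d)
    print(f"d = {d:3d}: fraction within radius 2 of 10 = {frac:.3e}")
```

On a one-dimensional axis the low-RMSD region looks like a fifth of the range, but the share of the underlying volume shrinks geometrically with every added dimension.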

Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 5167 - Posted: 5 Dec 2005, 9:44:18 UTC - in response to Message 5164.
As to the random number generator that is being used, this is probably one of the most widely used for similar Monte Carlo applications. So if it had a serious flaw, a lot of projects would be in trouble.
Well, to be honest, I am not concerned about the other projects ... :) Seriously, the Apple II RND function was bad for quite some time before it was noticed. No one bothered to check. For me, this is a simple and obvious thing to look at ... especially when there was an indication that they had some tame mathematicians available. We should all remember the points about the word "assume". :)
Anyway, the project is free to do what they wish. I just followed up on their asking for observations. I used to analyze systems looking for problems, and I like to think I was pretty good at it, and thus I noted a suspicion. Easy enough to do the RND testing to eliminate that as a possible cause. From there it gets harder. The point that the restricted set of operations may be the result of limited conformations is well taken. However, the other point I made is that *AT THE MOMENT* we (or maybe only me) do not know if the distribution of the actions is as random as expected. Which is why I asked if the program tracks the actual events it tries and, from there, if they match theoretical expectations. If not, then this is another place that needs attention.
I do grasp the point that we are looking at a large dimensional space "translated" into a "flat" notation and this may cover up some of the behaviors. Also, I am still learning about what we are doing, so I have tried to make the point that I may be mis-reading the observations. On the other hand, the boy did notice that the emperor did not have any clothes ...
Very last point, perhaps not relevant. A surprise "gift" in my very late 40's was that I am autistic. Much of the down side is covered up by a pretty high intelligence. But, looking back on my life, it can/does explain why I could "see" things that others could not. Because I cannot always say why, well, it is easy to discount the observation. With that in mind, all I can say is that the behavior of the system I have been watching does not "feel right". That of course does not mean that I am right, though my recollection of history is that it is wise not to be too hasty to bet against me ... :)
Anyway, I will keep looking as I have time, especially if I see an indication that we are running back over the problem space as noted by, hmmm, I can't find out where the suggestion was that we would be stepping back over this territory using the "best" results as a start for the next trials ...
ID: 5167 · Rating: 0 · rate: / Reply Quote

dgnuff Send message Joined: 1 Nov 05 Posts: 350 Credit: 24,773,605 RAC: 0	Message 5201 - Posted: 5 Dec 2005, 18:25:53 UTC - in response to Message 5164.
We as a collective of scientists and participants have been staring at a problem space where we are not seeing "coverage" as expected (the Top Predictions) and with the graphics, MY eyes are telling me that things are not as random as I would have expected. I grant that this could very well be because of the explanations of the available conformations.
Presumably you are referring to the lack of coverage below 2 Å RMSD? Well, you have to keep in mind that we "are staring at" a one-dimensional representation of a parameter space of very high dimensionality. So the volume of parameter space below, say, 2 Å RMSD is very much smaller (by many orders of magnitude) than you might suspect by looking at the RMSD vs. energy plot, and it is really no wonder that it is so hard to find structures that fall into that range. As to the random number generator that is being used, this is probably one of the most widely used for similar Monte Carlo applications. So if it had a serious flaw, a lot of projects would be in trouble.
This goes back to a thought that I had looking at the graph shown [url=https://boinc.bakerlab.org/rosetta/rah_energy_vs_rmsd_plot.php?data=2005-11-15_1pvaA_abrelax&userid=8170]here[/url]. The problem I have is that there's no indication of density above that apparent curve that bounds the left and lower sides of the red area. I would like to see that same graph with some sort of logarithmic display of density showing. I'd take a guess that if you figure the RMSD for an exact match as zero (it should be), and determine the RMSD for the protein stretched out in a straight line, the probability graph will be a very steep bell curve type thing. It would be very interesting to see if we're hitting that distribution in real life.
Echoing Paul's comments, there's an age-old proverb in the computer biz: "The generation of random numbers is far too important a matter to be left to chance." To put that another way, there are many RNGs out there that actually have very non-random properties. Starting with the old rand() function from the days of Bell Labs V6 Unix on a PDP 11/34a in 1980 or so (*), which was simply the classic "multiply and add" type thing, whose lower 8 bits cycled. Not very random at all.
(*) Showing my age now, but that's what I cut my teeth on. :)
ID: 5201 · Rating: 0 · rate: / Reply Quote

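The "lower 8 bits cycled" behaviour of a classic "multiply and add" generator is easy to reproduce. The sketch below is a minimal illustration under stated assumptions: the constants are illustrative textbook values and are not claimed to be the ones used by the V6 Unix rand(), and `low_bits_period` is a helper invented for this example. With a power-of-two modulus, the lowest k bits of such a generator repeat with a period of at most 2^k, however long the full period is.

```python
def lcg(seed, a=1103515245, c=12345, m=2**32):
    """Classic multiply-and-add generator with a power-of-two modulus (illustrative constants)."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state

def low_bits_period(bits, seed=1):
    """Number of steps before the lowest `bits` bits of the output first repeat."""
    mask = (1 << bits) - 1
    gen = lcg(seed)
    first = next(gen) & mask
    steps = 1
    # The low bits form their own small generator, so they return to the
    # starting value after exactly one period.
    while (next(gen) & mask) != first:
        steps += 1
    return steps

for bits in (1, 4, 8, 16):
    print(f"lowest {bits:2d} bits repeat every {low_bits_period(bits)} draws")
```

A common remedy with generators of this type is to take the high bits of each value rather than the low ones.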
Jack Schonbrun Send message Joined: 1 Nov 05 Posts: 115 Credit: 5,954 RAC: 0	Message 5205 - Posted: 5 Dec 2005, 18:46:51 UTC Last modified: 5 Dec 2005, 19:17:10 UTC
Paul, I think you bring up a good point. We should always be checking to see whether our programs are doing what we think they are. And visualization is a great way to do it. A great side effect of designing the screen saver is that we are spending more time ourselves studying the trajectories, and making sure that our sampling protocols behave as we expect.
I do want to try to put your mind at ease with respect to our random number generator. You can read the chapter of Numerical Recipes that describes it for free on-line. It's actually pretty short. Towards the end, they describe ran3. You will also see that there is much discussion throughout the chapter about not assuming any random number generator is truly random. These guys are no slouches. Ran3 was designed by Donald Knuth, a Computer Science luminary.
In the chapter you will find the complete source for ran3. Anybody can code it up and try their hand at demonstrating that it has a flaw. In fact, you will see that they offer $1000 to anybody who can show that their ran2 algorithm is flawed. I think for ran3, Knuth would be quite interested. He is known to pay people who find errors in his work. I should point out that these algorithms were first published 20 years ago, and (to my knowledge) nobody has found a problem.
ID: 5205 · Rating: 0 · rate: / Reply Quote

Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 5237 - Posted: 5 Dec 2005, 22:27:17 UTC
Jack, it is not a matter of putting my mind at ease, and I recognize the eminence of the originator. However, we have completely skirted the question: are we getting the random numbers we need? My Numerical Recipes in FORTRAN book on page 276 also says: "Knuth's subtractive routine ran3 seems to be the timing winner among portable routines. Unfortunately the subtractive method is not so well studied, and not a standard." Then they go on to say they keep ran3 as a backup if they suspect the randomness of the other generators.
And even if I were to believe that the algorithm is flawless, well, translation of that algorithm into numbers requires coding, compiling, and execution. None of which have been validated. The only reason I bring it up is that something that does not look right makes me suspicious. One of my friends in the past was a mathematician; he taught me NOT to trust the systems without testing. Case in point was when we were testing an iterative system and found that under some conditions the floating point numbers returned were flawed between bits 9-14 (I think it was); the leading and trailing bits of the numbers were correct. I do not want to tell you how long it took to run the numbers using a pocket calculator. It was a bad compiler doing this to us.
Anyway, my opinion was solicited, I responded. You don't think there is a problem. That's fine. Just remember, if I ever find out that there was a problem, well, you will not hear the end of it for a decade or so. :)
Oh, and the $1,000 reward is for ran2, which you are not using. I have not found out what the projected period is for either ran2 or ran3, not that it matters, I guess. Oh, and you also have not addressed the question about coupling between the rnd function and the application, which is a reservation pointed out in the text also ... :)
ID: 5237 · Rating: 0 · rate: / Reply Quote

Housing and Food Services Send message Joined: 1 Jul 05 Posts: 85 Credit: 155,098,531 RAC: 0	Message 5238 - Posted: 5 Dec 2005, 22:36:50 UTC - in response to Message 5237. Last modified: 5 Dec 2005, 22:38:37 UTC You could always go from ran3 to using lava lamps ... imagine how cool that would look in the graphics app :) -E http://www.lavarnd.org/index.html ID: 5238 · Rating: 0 · rate: / Reply Quote

Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 5280 - Posted: 6 Dec 2005, 9:51:47 UTC
Or to atmospheric noise ... of course, now we get into the "type" of noise issue ... which is back to my questions about the effective randomness of the application. :) Not being math oriented, I don't get into the esoterica of "white" vs. "pink" vs. whatever. I just know that it is important. I also know that "coupling" is a potential source of problems in that it converts what should be random into a non-random effect. Ironically enough, this replicates what drove me into depression and eventually through anxiety and panic disorders ... :)
Not to pick on Jack too hard, but the question is not whether we selected the correct algorithms. It is the questions about what is really going on. Originally I asked if there was tracking of the applied changes so that it could be examined to see if we were truly getting a random distribution of attempted changes. If the distribution of the things tried is NOT random, then the system will never work. Since that instrumentation is not likely, and will be hard to add, hard to track, and hard to test ... well ... the appearance of repetitious behavior raised the question of randomness. A simpler source of non-randomness is of course in the RND generator. So, that is why I asked whether the IMPLEMENTATION of the function has been tested. And if those tests indicated that the distribution of random numbers is Gaussian and non-Gaussian is needed, well, that is an issue.
Since I have been pondering this, the latest irony (at least to my mind), I came up with a short list of untested assumptions:
1) The ran3 algorithm works as specified by Knuth
2) That others have tested ran3 and no flaws have been found
3) The implementation of ran3 works correctly
4) There is no coupling between ran3 and the remainder of the program
5) The output of random numbers is of the correct mathematical properties needed by Rosetta@Home
#1 Taken on faith, but is probably correct, yet WE have not independently tested this.
#2 Taken on faith.
#3 Taken on faith. And my coding and doing the tests does not prove anything more than that #1 and #2 are probably correct.
#4 This is much harder to test, as now we have to make statistical studies of the behavior of the system as a whole.
#5 Also hard to prove, but that is why mathematicians were invented.
Anyway, I don't like assumptions. Good famous assumptions:
1) The FDIV bug in the Pentium processor is of no interest to the community. And even if it was, numerical accuracy of answers is not required by the "New Math" and so, close enough is acceptable.
2) If two of three "voting" systems agree, they are correct. Of course this does not explain the delayed launch because the majority opinion was wrong. I will note here also that three "redundant results" only prove that the three systems have agreed on an answer; there is no implication that the answer is in fact correct. If we have "gospel out" from three machines, it matters not what "garbage in" happened. But I digress ...
3) Submitting an angle of 91 degrees (one degree past vertical) on the OTH radar will still allow the radar to illuminate anything. In fact, it is a physical impossibility for the radar to angle the beam past 90 degrees (not having tested it, I am not even sure it can get to 90 degrees, and I would not want to be anywhere near 8+ megawatts coming back down on my head, but I digress again). This particular item was an assumption by the programmer that submitted inputs should always result in an output; the programmer, when confronted with a divide by zero, responded with a "fix" ... Oops! Actually this was a specification error, in that the systems engineers did not test all of the limits correctly.
At this point, I think it is time for me to drop it. As usual, my eloquence seems inadequate for the task at hand. The original question raised early after the start of the program was *"Why are we not getting the coverage expected?"* We discussed algorithms, we discussed operational parameters. However, we never questioned if we are really getting the randomness needed to get the operational effects needed.
So ... I guess I am done. And no, I am not mad or even disappointed. The project has to decide what to do and I have no problems with that. If anyone wants to discuss why I have these questions and/or wants more "sea stories", well, no problem ...
ID: 5280 · Rating: 0 · rate: / Reply Quote
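Testing the implementation rather than the algorithm (assumption 3 above) can start with something as simple as a chi-square check on the numbers the compiled code actually returns. The sketch below is only a starting point under stated assumptions: `chi_square_uniform` is a helper invented for this example, Python's built-in generator stands in for the implementation under test, and the check covers uniformity only. It says nothing about period, serial correlation, or coupling with the application.

```python
import random

def chi_square_uniform(next_float, n_samples=100_000, n_bins=10):
    """Chi-square statistic for values in [0, 1) sorted into equal-width bins."""
    counts = [0] * n_bins
    for _ in range(n_samples):
        x = next_float()
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    expected = n_samples / n_bins
    return sum((c - expected) ** 2 / expected for c in counts)

# Stand-in for the generator under test: Python's Mersenne Twister, fixed seed.
rng = random.Random(12345)
stat_good = chi_square_uniform(rng.random)

# A deliberately broken "generator" that is stuck on a single value.
stat_stuck = chi_square_uniform(lambda: 0.05)

print(f"chi-square, stand-in generator: {stat_good:.2f}")
print(f"chi-square, stuck generator:    {stat_stuck:.2f}")
```

With 10 bins there are 9 degrees of freedom, so for a healthy generator a statistic above roughly 21.7 should turn up only about 1% of the time; values far beyond that, as with the stuck generator, point to a problem. To test a real implementation, `rng.random` would be replaced by a wrapper around the routine as it is actually built and called.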

George Send message Joined: 19 Nov 05 Posts: 1 Credit: 36,881 RAC: 0	Message 5342 - Posted: 7 Dec 2005, 6:24:17 UTC - in response to Message 5205.
In the chapter you will find the complete source for ran3. Anybody can code it up and try their hand at demonstrating that it has a flaw. In fact, you will see that they offer $1000 to anybody who can show that their ran2 algorithm is flawed. I think for ran3, Knuth would be quite interested. He is known to pay people who find errors in his work. I should point out that these algorithms were first published 20 years ago, and (to my knowledge) nobody has found a problem.
Don't be using a 20-year-old algorithm in the first place; there are a lot better things now. Take a look at the Mersenne Twister, which is the most popular non-cryptographic PRNG today.
ID: 5342 · Rating: 0 · rate: / Reply Quote

Jack Schonbrun Send message Joined: 1 Nov 05 Posts: 115 Credit: 5,954 RAC: 0	Message 5346 - Posted: 7 Dec 2005, 7:20:58 UTC - in response to Message 5342. Last modified: 7 Dec 2005, 7:22:33 UTC
Don't be using a 20-year-old algorithm in the first place; there are a lot better things now. Take a look at the Mersenne Twister, which is the most popular non-cryptographic PRNG today.
Looks interesting. Thanks again to everybody for their concern on this issue. Hopefully someday we'll have time to explain the algorithm in enough detail that you will see it is very unlikely that any flaws are due to our random number generation.
I do have to disagree slightly with the sentiment on 20-year-old random number generators. I'm sure technology has advanced, but the age does imply a certain level of battle testing. I would be very happy if somebody could point me to a link or an article, or their own research, showing that ran3 from Numerical Recipes is flawed. It's not that I doubt it, but I would like to see some evidence before we decide to abandon it.
In fact, I hope we can move on from the discussion of random number generation, and talk about what perhaps is the real issue: whether the non-randomness that we intentionally introduce to bias our search is doing what we want.
ID: 5346 · Rating: 0 · rate: / Reply Quote

Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0	Message 5351 - Posted: 7 Dec 2005, 11:47:24 UTC
Jack, sorry, but this will be the last time, I promise. I never asked you to abandon it. I asked you to test it. *Big* difference. And I *still* think you should test it. And, by the way, regardless of the rnd function I would have suggested the same thing. This has almost nothing to do with the function, but with the number stream. So, if you will pardon me, you missed the whole point of the questions.
When you have the appearance of non-random behavior where random behavior is expected, this *should* be one of the first things looked at. Since it is not that difficult to test, I am actually surprised that you are surprised we are "stuck" on (by implication) a "non-real" issue. My starting point is that I see a coverage pattern in Top Predictions that is always a cluster. I watch models where the same places are adjusted over and over again with other places ignored. Both, to my mind, are evidence of non-random behavior ...
It is to MY *considerable* surprise that a scientist is resistant to testing the assumptions in the design and implementation of an experiment. Perhaps my suspicion is unfounded. But we do *not* know that, and likely never will.
In fact, I hope we can move on from the discussion of random number generation, and talk about what perhaps is the real issue: whether the non-randomness that we intentionally introduce to bias our search is doing what we want.
Um, now this is a first for me ... can you explain this? This could be a source of the behavior I am thinking I am seeing. Perhaps ...
ID: 5351 · Rating: 0 · rate: / Reply Quote

dgnuff Send message Joined: 1 Nov 05 Posts: 350 Credit: 24,773,605 RAC: 0	Message 5521 - Posted: 8 Dec 2005, 4:33:28 UTC - in response to Message 5351.
My starting point is that I see a coverage pattern in Top Predictions that is always a cluster. I watch models where the same places are adjusted over and over again with other places ignored. Both, to my mind, are evidence of non-random behavior ...
I can't speak for the second of these, but the "cluster" in the predictions graphs presents absolutely zero hard evidence of non-random behavior. I know that this flies directly in the face of what you see on that graph, but once you understand what that graph shows, you'll see why.
The best way to describe the problem is like this. Suppose you had a bag containing 1,000,000 marbles, where 10 of them have the number 1, 499,995 have the number 2, and the remaining 499,995 have the number 3. If you now pull marbles at random from the bag, and plot the distribution (count of marbles vs. number on the marble), you'll get a graph that is very skewed. It'll have a "cluster" at 2 and 3, and almost nothing at 1. This skewed distribution is not a result of a faulty random number generator; it's a result of a skewed distribution of numbers in the bag in the first place. Doing this, the only way you'd ever see a flat output distribution is with about 333,333 marbles in the bag for each of the numbers 1, 2 and 3.
We're up against the same problem. We have a large space to work in (David Baker's 500-dimensional space) that contains an astronomical number of points (marbles in the bag). Each of these has an energy and an RMSD associated with it, and we're simply plotting a graph of RMSD vs. energy for each point that we've managed to find. For your argument to hold up, you'd have to prove that the distribution of energy and RMSD across this 500D space is uniform, and all the evidence suggests it's not.
ID: 5521 · Rating: 0 · rate: / Reply Quote
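The marble bag is easy to simulate. The sketch below uses the bag contents from the post; the seed, the number of draws, and the choice to draw with replacement (so the bag never runs low) are assumptions made to keep the example short. The generator here is Python's built-in one, so any skew in the tally comes from the bag, not from the random numbers.

```python
import random
from collections import Counter

# Bag contents from the example: ten 1s, and 499,995 each of 2s and 3s.
bag_values = [1, 2, 3]
bag_counts = [10, 499_995, 499_995]

rng = random.Random(2005)                      # arbitrary fixed seed
n_draws = 100_000
draws = rng.choices(bag_values, weights=bag_counts, k=n_draws)   # with replacement

tally = Counter(draws)
for value in bag_values:
    print(f"marbles numbered {value}: drawn {tally[value]} times")
```

With these proportions only about one draw in 100,000 is expected to be a 1, so the tally clusters on 2 and 3 even though every marble in the bag is equally likely to be picked.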