I get tired of repeating this over and over again, so I thought I would write it down.
- Gathering data.
- Spotting patterns in the data.
- Making up a possible causal relationship from the patterns. This is called abduction.
- Testing that hypothesis. Showing that when the premise is true the predicted result actually follows is called deduction.
- Creating a falsifiable hypothesis. That means naming a concrete result which, if it showed up, would prove the hypothesis false. Many would say this is the difference between elaborate guessing and science.
- Testing that. Deduction again.
- Updating the probability that the cause is related to the effect based on the experiments. Stipulating that a finite number of results can be extrapolated to a larger infinite set is called induction.
Repeat and rinse.
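The last step, updating the probability after each experiment, can be sketched in a few lines of Python. This is only an illustration using Bayes' rule. The function names and the 80%/30% pass rates are made up for the example and do not come from any real experiment.

```python
def update(prior, p_result_if_true, p_result_if_false):
    """Bayes' rule: probability of the hypothesis after seeing one result."""
    evidence = prior * p_result_if_true + (1 - prior) * p_result_if_false
    return prior * p_result_if_true / evidence


# Hypothetical numbers: the experiment "passes" 80% of the time when the
# hypothesis is true and 30% of the time when it is false.
P_PASS_IF_TRUE = 0.8
P_PASS_IF_FALSE = 0.3


def run_experiments(prior, results):
    """Return the belief before any experiment and after each one."""
    belief = prior
    history = [belief]
    for passed in results:
        if passed:
            belief = update(belief, P_PASS_IF_TRUE, P_PASS_IF_FALSE)
        else:
            # A failed experiment uses the complementary probabilities.
            belief = update(belief, 1 - P_PASS_IF_TRUE, 1 - P_PASS_IF_FALSE)
        history.append(belief)
    return history


# Start undecided (0.5), then see pass, pass, fail, pass.
history = run_experiments(0.5, [True, True, False, True])
for step, belief in enumerate(history):
    print(f"after {step} experiments: {belief:.3f}")
```

Each pass nudges the belief up and each failure knocks it back down. Nothing here ever flips to a flat "yes" or "no".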
- This is not a sequential system where each step is completed perfectly. Instead, all of the steps are going on at the same time, and each step is often done only “well enough” for whatever goal the practitioner has.
- The system builds on itself. Whatever you learned previously (new nouns, new actions, new relationships) you carry forward into the current activity.
- A baby does all of that the first time he cries on purpose and somebody brings him a bottle. This is not only scientific learning, this is real-world human learning.
- As chains of causality are created, the universe of nouns and actions also increases. This means that the search for “why did that happen?” is the driving and integral force behind learning new systems of knowledge. Everything we know, we know because we asked “why does this happen?” — that is, we looked for causes.
- The result is that the probability is updated, not that we “know” something. Real-world knowledge is inherently Bayesian.
- The human brain is a master of working with reality using a bunch of heuristics that may at times be only 5% correct. We are wrong in zillions of ways in which it does not matter that we are wrong.
- The system has a tendency to “stick” running down bad paths for a while and then making quantum jumps. This is the way the system is supposed to operate. It does not move directly in a straight line. It finds false paths, sticks with those, then jumps over to new paths. (Humans and politics play a big role in these jumps. People have a hard time changing their attitudes.)
- When given a new situation, we can do two things: 1) create an analogy between this situation and one for which we have much more causal data mapped out, or 2) take the rules of the universe as we “know” them and try to extend them into this domain. Both approaches are fraught with difficulty.
- Because this learning process increases our knowledge domain, and because that domain is not increased until the process is finished, we can only ever really begin learning something by analogy. After the simple analogy is understood, we can begin looking for differences between the system being used as an example and the new system.
- You never really know. Science is always provisional. The best you can say is that it is ludicrously improbable that you are mistaken about something.
- When somebody says, “That’s Science!” the appropriate response is to question him on each of these items. Most times people confuse what most scientists believe with science. Remember that there is a big difference between a series of experiments which may show X to a 61.2% likelihood and the fact that 95% of scientists agree on X. Remember that the brain is a master at jumping to “yes” or “no”. People don’t like to be unsure about something, so their opinions will aggregate around certainties. Real science does not do this. If you want an honest, reliable answer, stick with the real numbers. Science is not a popularity contest.
- Check a priori data. The assumptions people bring into an experiment have a lot to do with the conclusions they draw from it. Hidden assumptions get you every time.
- “Test” doesn’t mean to simply do the experiment again. It means to describe the experiment in neutral terms and have somebody who does not know what they are testing for try to reproduce the results.
- If you don’t know why data and algorithms should all be public then my explaining it won’t help any.
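To put a number on "you never really know": here is a small Python sketch, using exact fractions so rounding cannot hide anything. The 9-in-10 and 1-in-10 figures, and the name `doubt_after`, are invented for illustration. It assumes every experiment confirms the hypothesis and asks how much doubt is left.

```python
from fractions import Fraction


def update(prior, p_if_true, p_if_false):
    """Bayes' rule: probability of the hypothesis after one result."""
    evidence = prior * p_if_true + (1 - prior) * p_if_false
    return prior * p_if_true / evidence


def doubt_after(confirmations, prior=Fraction(1, 2)):
    """Chance we are still mistaken after a run of confirming results.

    Assumed for illustration: a confirming result shows up 9 times in 10
    when the hypothesis is true and 1 time in 10 when it is false.
    """
    belief = prior
    for _ in range(confirmations):
        belief = update(belief, Fraction(9, 10), Fraction(1, 10))
    return 1 - belief


for n in (1, 5, 20):
    print(f"{n} confirmations, remaining doubt: {doubt_after(n)}")
```

Under these assumptions the doubt shrinks fast, but it is always one over some very large number, never zero. Ludicrously improbable is as good as it gets.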
I wonder if I am being pedantic here. I don’t even know why this is necessary (it all seems pretty obvious), but folks keep getting hung up on it over and over again. Might as well write it out.