Information Processing

Pessimism of the Intellect, Optimism of the Will

Friday, August 22, 2014

Two reflections on SCI FOO 2014

Two excellent blog posts on SCI FOO, by Jacob Vanderplas (astronomer and data scientist at the University of Washington) and Dominic Cummings (former director of strategy for the Conservative Party in the UK).

Hacking Academia: Data Science and the University (Vanderplas)

Almost a year ago, I wrote a post I called the Big Data Brain Drain, lamenting the ways that academia is neglecting the skills of modern data-intensive research, and in doing so is driving away many of the men and women who are perhaps best equipped to enable progress in these fields. This seemed to strike a chord with a wide range of people, and has led me to some incredible opportunities for conversation and collaboration on the subject. One of those conversations took place at the recent SciFoo conference, and this article is my way of recording some reflections on that conversation. ...

The problem we discussed is laid out in some detail in my Brain Drain post, but a quick summary is this: scientific research in many disciplines is becoming more and more dependent on the careful analysis of large datasets. This analysis requires a skill-set as broad as it is deep: scientists must be experts not only in their own domain, but in statistics, computing, algorithm building, and software design as well. Many researchers are working hard to attain these skills; the problem is that academia's reward structure is not well-poised to reward the value of this type of work. In short, time spent developing high-quality reusable software tools translates to less time writing and publishing, which under the current system translates to little hope for academic career advancement. ...

Few scientists know how to use the political system to effect change. We need help from people like Cummings.

... It was interesting that some very eminent scientists, all much cleverer than ~100% of those in politics [INSERT: better to say 'all with higher IQ than ~100% of those in politics'], have naive views about how politics works. In group discussions, there was little focused discussion about how they could influence politics better even though it is clearly a subject that they care about very much. (Gershenfeld said that scientists have recently launched a bid to take over various local government functions in Barcelona, which sounds interesting.)

... To get things changed in politics, scientists need mechanisms a) to agree priorities in order to focus their actions on b) roadmaps with specifics. Generalised whining never works. The way to influence politicians is to make it easy for them to fall down certain paths without much thought, and this means having a general set of goals but also a detailed roadmap the politicians can apply, otherwise they will drift by default to the daily fog of chaos and moonlight.


3. High status people have more confidence in asking basic / fundamental / possibly stupid questions. One can see people thinking ‘I thought that but didn’t say it in case people thought it was stupid and now the famous guy’s said it and everyone thinks he’s profound’. The famous guys don’t worry about looking stupid and they want to get down to fundamentals in fields outside their own.

4. I do not mean this critically but watching some of the participants I was reminded of Freeman Dyson’s comment:

‘I feel it myself, the glitter of nuclear weapons. It is irresistible if you come to them as a scientist. To feel it’s there in your hands. To release the energy that fuels the stars. To let it do your bidding. And to perform these miracles, to lift a million tons of rock into the sky, it is something that gives people an illusion of illimitable power, and it is in some ways responsible for all our troubles, I would say, this is what you might call ‘technical arrogance’ that overcomes people when they see what they can do with their minds.’

People talk about rationales for all sorts of things but looking in their eyes the fundamental driver seems to be – am I right, can I do it, do the patterns in my mind reflect something real? People like this are going to do new things if they can and they are cleverer than the regulators. As a community I think it is fair to say that outside odd fields like nuclear weapons research (which is odd because it still requires not only a large collection of highly skilled people but also a lot of money and all sorts of elements that are hard (but not impossible) for a non-state actor to acquire and use without detection), they believe that pushing the barriers of knowledge is right and inevitable. ...

Sunday, August 17, 2014

Genetic Architecture of Intelligence (arXiv:1408.3421)

This paper is based on talks I've given in the last few years. See here and here for video. Although there isn't much that hasn't already appeared in the talks or on this blog (other than some Compressed Sensing results for the nonlinear case) it's nice to have it in one place. The references are meant to be useful to people seriously interested in this subject, although I imagine they are nowhere near comprehensive. Apologies to anyone whose work I missed.

If you don't like the word "intelligence" just substitute "height" and everything will be OK. We live in strange times.
On the genetic architecture of intelligence and other quantitative traits (arXiv:1408.3421)
Categories: q-bio.GN
Comments: 30 pages, 13 figures

How do genes affect cognitive ability or other human quantitative traits such as height or disease risk? Progress on this challenging question is likely to be significant in the near future. I begin with a brief review of psychometric measurements of intelligence, introducing the idea of a "general factor" or g score. The main results concern the stability, validity (predictive power), and heritability of adult g. The largest component of genetic variance for both height and intelligence is additive (linear), leading to important simplifications in predictive modeling and statistical estimation. Due mainly to the rapidly decreasing cost of genotyping, it is possible that within the coming decade researchers will identify loci which account for a significant fraction of total g variation. In the case of height analogous efforts are well under way. I describe some unpublished results concerning the genetic architecture of height and cognitive ability, which suggest that roughly 10k moderately rare causal variants of mostly negative effect are responsible for normal population variation. Using results from Compressed Sensing (L1-penalized regression), I estimate the statistical power required to characterize both linear and nonlinear models for quantitative traits. The main unknown parameter s (sparsity) is the number of loci which account for the bulk of the genetic variation. The required sample size is of order 100s, or roughly a million in the case of cognitive ability.
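A note on the last estimate: "100s" means 100 times the sparsity s, not "hundreds". The snippet below is a minimal sketch of that arithmetic, taking s to be the roughly 10k causal variants mentioned in the abstract. The function name and the default factor of 100 are illustrative choices, and the result is an order-of-magnitude figure only.

```python
def required_sample_size(sparsity, factor=100):
    """Order-of-magnitude sample size n ~ factor * s (s = number of causal loci)."""
    return factor * sparsity


s = 10_000  # ~10k causal variants, the estimate quoted in the abstract
n = required_sample_size(s)
print(f"s = {s:,} loci -> n ~ {n:,} individuals")
```

With s around 10k this gives about a million individuals, which matches the figure quoted for cognitive ability.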

Saturday, August 16, 2014

Neural Networks and Deep Learning

One of the SCI FOO sessions I enjoyed the most this year was a discussion of deep learning by AI researcher Juergen Schmidhuber. For an overview of recent progress, see this recent paper. Also of interest: Michael Nielsen's pedagogical book project.

An application which especially caught my attention is described by Schmidhuber here:
Many traditional methods of Evolutionary Computation [15-19] can evolve problem solvers with hundreds of parameters, but not millions. Ours can [1,2], by greatly reducing the search space through evolving compact, compressed descriptions [3-8] of huge solvers. For example, a Recurrent Neural Network [34-36] with over a million synapses or weights learned (without a teacher) to drive a simulated car based on a high-dimensional video-like visual input stream.
More details here. They trained a deep neural net to drive a car using visual input (pixels from the driver's perspective, generated by a video game); output consists of steering orientation and accelerator/brake activation. There was no hard-coded structure corresponding to physics -- the neural net optimized a utility function primarily defined by time between crashes. It learned how to drive the car around the track after fewer than 10k training sessions.
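To give a feel for what "evolving compact, compressed descriptions" can mean, here is a toy sketch. It is not the method of the cited work. It assumes a hypothetical encoding in which a handful of cosine coefficients is expanded into a long weight vector, and a simple elitist mutation search that only ever touches the coefficients. The fitness function (distance to a made-up target weight vector) stands in for something like time between crashes.

```python
import math
import random


def make_basis(n_coeffs, n_weights):
    """Fixed cosine basis: row k holds cos(pi * k * x) sampled at n_weights points."""
    return [[math.cos(math.pi * k * (i + 0.5) / n_weights) for i in range(n_weights)]
            for k in range(n_coeffs)]


def decode(coeffs, basis):
    """Expand a short genome of coefficients into the full weight vector."""
    n_weights = len(basis[0])
    return [sum(c * row[i] for c, row in zip(coeffs, basis)) for i in range(n_weights)]


def fitness(weights, target):
    """Toy objective: negative squared distance to a target weight vector."""
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


def evolve(n_weights=500, n_coeffs=4, generations=100, offspring=8, sigma=0.1, seed=0):
    """Elitist mutation search over the compressed genome only."""
    rng = random.Random(seed)
    basis = make_basis(n_coeffs, n_weights)
    target = decode([0.5, -1.0, 0.3, 0.8], basis)  # hypothetical "good" network
    best = [0.0] * n_coeffs
    best_fit = fitness(decode(best, basis), target)
    history = [best_fit]
    for _ in range(generations):
        for _ in range(offspring):
            child = [c + rng.gauss(0.0, sigma) for c in best]
            child_fit = fitness(decode(child, basis), target)
            if child_fit > best_fit:  # keep a child only if it improves on the best
                best, best_fit = child, child_fit
        history.append(best_fit)
    return best, history


best, history = evolve()
print(f"searched {len(best)} numbers to set 500 weights; "
      f"fitness {history[0]:.1f} -> {history[-1]:.3f}")
```

The search space here has 4 dimensions rather than 500. The same idea is what lets an evolutionary method reach networks with a million weights, provided good solutions happen to be compressible under the chosen encoding.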

For some earlier discussion of deep neural nets and their application to language translation, see here. Schmidhuber has also worked on Solomonoff universal induction.

These TED videos give you some flavor of Schmidhuber's sense of humor :-) Apparently his younger brother (mentioned in the first video) has transitioned from theoretical physics to algorithmic finance. Schmidhuber on China.

Friday, August 15, 2014

Y Combinator: "fund for the pivot"

I'm catching up on podcasts a bit now that I'm back in Michigan. I had an iTunes problem and was waiting for the next version release while on the road.

EconTalk did a nice interview with Y Combinator President Sam Altman. Y Combinator has always been entrepreneur-centric, to the point that the quality of the founders is one of the main factors they consider (i.e., more important than startup idea or business plan). At around 19 minutes, Altman reveals that they often "fund for the pivot" -- meaning that sometimes they want to place a bet on the entrepreneur even if they think the original idea is doomed. Altman also reveals that Y Combinator never looks at business plans or revenue projections. I can't count the number of times an idiot MBA demanded a detailed revenue projection from one of my startups, at a stage where the numbers and projections were completely meaningless.

Another good observation is about the importance of communication skills in a founder. The leadership team is a central nexus that has to informationally equilibrate the rest of the company + investors + partners + board members + journalists + customers ... It helps tremendously to have someone who is articulate, succinct, and can "code switch" so as to speak the native language of an engineer or sales rep or VC.

@30 min or so:
Russ: ... one of the things that happens to me when I come out here in the summer--I live outside of Washington, D.C. and I come out every 6 or 7 weeks in the summer, and come to Stanford--I feel like I'm at the center of the universe. You know, Washington is--everyone in Washington, except for me--

Guest: Thinks they are--

Russ: Thinks they are in the center. And there are things they are in the center in. Obviously. But it's so placid there. And when I come to Stanford, the intellectual, the excitement about products and transforming concepts into reality, is palpable. And then I run into start-up people and venture capitalists. And they are so alive, compared to, say, a lobbyist in Washington, say, just to pick a random example. And there are certain things that just--again, it's almost palpable. You can almost feel them. So the thing is that I notice being here--which are already the next big thing, which at least they feel like they are.  [ Visiting Washington DC gives me hives! ]
I recall a Foo Camp (the O'Reilly one, not SCI FOO at Google; perhaps 2007-2010 or so) session led by Paul Graham and some of the other Y Combinator founders/funders. At the time they weren't sure at all that their model would work. It was quite an honest discussion and I think even they must be surprised at how successful they've been since then.

Wednesday, August 13, 2014

Designer babies: selection vs editing

The discussion in this video is sophisticated enough to make the distinction between embryo selection -- the parents get a baby whose DNA originates from them, but the "best baby possible" -- and active genetic editing, which can give the child genes that neither parent had.

The movie GATTACA focuses on selection -- the director made a deliberate decision to eliminate reference to splicing or editing of genes. (Possibly because Ethan Hawke's character Vincent would have no chance competing against edited people.)

At SCI FOO, George Church seemed confident that editing would be an option in the near future. He is convinced that off-target mutations are not a problem for CRISPR. I have not yet seen this demonstrated in the literature, but of course George knows a lot more than what has been published. (Warning: I may have misunderstood his comments as there was a lot of background noise when we were talking.)

One interesting genetic variant (Lrp5?) that I learned about at the meeting, of obvious interest to future splicers and editors, apparently confers a +8 SD increase in bone strength!

My views on all of this:
... given sufficient phenotype|genotype data, genomic prediction of traits such as cognitive ability will be possible. If, for example, 0.6 or 0.7 of total population variance is captured by the predictor, the accuracy will be roughly plus or minus half a standard deviation (e.g., a few cm of height, or 8 IQ points). The required sample size to extract a model of this accuracy is probably on the order of a million individuals. As genotyping costs continue to decline, it seems likely that we will reach this threshold within five years for easily acquired phenotypes like height (self-reported height is reasonably accurate), and perhaps within the next decade for more difficult phenotypes such as cognitive ability. At the time of this writing SNP genotyping costs are below $50 USD per individual, meaning that a single super-wealthy benefactor could independently fund a crash program for less than $100 million.
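The "plus or minus half a standard deviation" figure follows from the unexplained variance: if a predictor captures a fraction R² of the population variance, the residual standard deviation is sqrt(1 - R²) in population SD units. A minimal sketch of that arithmetic follows. The IQ scale with SD 15 is the conventional one and is an assumption here, and the cost line simply multiplies the $50 per-genotype ceiling quoted above by a million samples.

```python
import math


def residual_sd(variance_explained):
    """Residual SD (in population SD units) left after a predictor with this R^2."""
    return math.sqrt(1.0 - variance_explained)


IQ_SD = 15.0  # assumption: conventional IQ scale with SD 15

for r2 in (0.6, 0.7):
    sd = residual_sd(r2)
    print(f"R^2 = {r2}: residual SD = {sd:.2f} SD = {sd * IQ_SD:.1f} IQ points")

# Upper bound on genotyping cost for a million individuals at $50 each.
genotyping_cost = 50 * 1_000_000
print(f"genotyping cost <= ${genotyping_cost:,}")
```

Both values of R² leave a residual SD between roughly 0.5 and 0.65 SD, and the genotyping bill stays under the $100 million mentioned in the text.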

Once predictive models are available, they can be used in reproductive applications, ranging from embryo selection (choosing which IVF zygote to implant) to active genetic editing (e.g., using powerful new CRISPR techniques). In the former case, parents choosing between 10 or so zygotes could improve their expected phenotype value by a population standard deviation. For typical parents, choosing the best out of 10 might mean the difference between a child who struggles in school, versus one who is able to complete a good college degree. Zygote genotyping from single cell extraction is already technically well developed [25], so the last remaining capability required for embryo selection is complex phenotype prediction. The cost of these procedures would be less than tuition at many private kindergartens, and of course the consequences will extend over a lifetime and beyond.
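The best-of-10 estimate can be sanity-checked with a small Monte Carlo sketch. It rests on assumptions not spelled out in the text: predicted scores of sibling embryos are treated as independent normal draws around the family mean, with a variance equal to half the population variance of the predictor (the standard within-family share for an additive score under random mating). The function names are illustrative.

```python
import math
import random


def expected_max_normal(n, trials=20_000, seed=0):
    """Monte Carlo estimate of the expected maximum of n standard normal draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, 1.0) for _ in range(n))
    return total / trials


def selection_gain(r2, n_embryos):
    """Expected gain (population SD units) from picking the top-scoring embryo.

    Assumes the predictor explains r2 of population variance and that half of
    the predictor's variance is found among siblings.
    """
    sibling_sd = math.sqrt(r2 / 2.0)
    return expected_max_normal(n_embryos) * sibling_sd


for r2 in (0.6, 0.7):
    print(f"R^2 = {r2}: expected gain from best of 10 = {selection_gain(r2, 10):.2f} SD")
```

Under these assumptions the expected gain relative to the average embryo comes out somewhat below one population SD, which is in line with the rough figure quoted.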

The corresponding ethical issues are complex and deserve serious attention in what may be a relatively short interval before these capabilities become a reality. Each society will decide for itself where to draw the line on human genetic engineering, but we can expect a diversity of perspectives. Almost certainly, some countries will allow genetic engineering, thereby opening the door for global elites who can afford to travel for access to reproductive technology. As with most technologies, the rich and powerful will be the first beneficiaries. Eventually, though, I believe many countries will not only legalize human genetic engineering, but even make it a (voluntary) part of their national healthcare systems [26]. The alternative would be inequality of a kind never before experienced in human history.

Here is the version of the GATTACA scene that was cut. The parents are offered the choice of edited or spliced genes conferring rare mathematical or musical ability.

Monday, August 11, 2014

SCI FOO 2014: photos

The day before SCI FOO I visited Complete Genomics, which is very close to the Googleplex.

Self-driving cars:

SCI FOO festivities:

I did an interview with O'Reilly. It should appear in podcast form at some point and I'll post a link.

Obligatory selfie:

Friday, August 08, 2014

Next Super Collider in China?

If you're in particle physics you may have heard rumors that the Chinese government is considering getting into the collider business. Since no one knows what will happen in our field post-LHC, this is a very interesting development. A loose international collaboration has been pushing a new linear collider for some time, perhaps to be built in Japan. But since (1) the results from LHC are thus far not as exciting as some had anticipated, and (2) colliders are very very expensive, the future is unclear.

While in China and Taiwan I was told that it was very likely that a next-generation collider project would make it into the coming five-year science plan. It was even said that the location for the new machine (combining both linear and hadronic components) would be in my maternal ancestral homeland of Shandong province. (Korean physicists will be happy about the proximity of the site :-)

Obviously for the Chinese government the symbolic value of taking the lead in high energy physics is very high -- perhaps on par with putting a man on the moon. In the case of a collider, we're talking about 20-year timescales, so this is a long-term project. Stay tuned!

On the importance of experiments, from Voting and Weighing:
There is an old saying in finance: in the short run, the market is a voting machine, but in the long run it's a weighing machine. ...

You might think science is a weighing machine, with experiments determining which theories survive and which ones perish. Healthy sciences certainly are weighing machines, and the imminence of weighing forces honesty in the voting. However, in particle physics the timescale over which voting is superseded by weighing has become decades -- the length of a person's entire scientific career. We will very likely (barring something amazing at the LHC, like the discovery of mini-black holes) have the first generation of string theorists retiring soon with absolutely no experimental tests of their *lifetime* of work. Nevertheless, some have been lavishly rewarded by the academic market for their contributions.

Thursday, August 07, 2014

@ SCI FOO 2014

Sorry for the lack of blog activity. I just returned from Asia and am in Palo Alto for SCI FOO 2014. Hopefully I'll post some cool photos from the event, which starts tomorrow evening. If you are there and read this blog then come over and say hello. If I had free t-shirts I'd give you one, but don't get your hopes up!

Earlier SCI FOO posts.

Saturday, August 02, 2014

It's all in the gene: cows

Some years ago a German driver took me from the Perimeter Institute to the Toronto airport. He was an immigrant to Canada and had a background in dairy farming. During the ride he told me all about driving German farmers to buy units of semen produced by highly prized Canadian bulls. The use of linear polygenic models in cattle breeding is already widespread, and the review article below gives some idea as to the accuracy.

See also Genomic Prediction: No Bull and Plenty of room at the top.
Invited Review: Reliability of genomic predictions for North American Holstein bulls

Journal of Dairy Science Volume 92, Issue 1, Pages 16–24, January 2009.

Genetic progress will increase when breeders examine genotypes in addition to pedigrees and phenotypes. Genotypes for 38,416 markers and August 2003 genetic evaluations for 3,576 Holstein bulls born before 1999 were used to predict January 2008 daughter deviations for 1,759 bulls born from 1999 through 2002. Genotypes were generated using the Illumina BovineSNP50 BeadChip and DNA from semen contributed by US and Canadian artificial-insemination organizations to the Cooperative Dairy DNA Repository. Genomic predictions for 5 yield traits, 5 fitness traits, 16 conformation traits, and net merit were computed using a linear model with an assumed normal distribution for marker effects and also using a nonlinear model with a heavier tailed prior distribution to account for major genes. The official parent average from 2003 and a 2003 parent average computed from only the subset of genotyped ancestors were combined with genomic predictions using a selection index. Combined predictions were more accurate than official parent averages for all 27 traits. The coefficients of determination (R²) were 0.05 to 0.38 greater with nonlinear genomic predictions included compared with those from parent average alone. Linear genomic predictions had R² values similar to those from nonlinear predictions but averaged just 0.01 lower. The greatest benefits of genomic prediction were for fat percentage because of a known gene with a large effect. The R² values were converted to realized reliabilities by dividing by mean reliability of 2008 daughter deviations and then adding the difference between published and observed reliabilities of 2003 parent averages. When averaged across all traits, combined genomic predictions had realized reliabilities that were 23% greater than reliabilities of parent averages (50 vs. 27%), and gains in information were equivalent to 11 additional daughter records.
Reliability increased more by doubling the number of bulls genotyped than the number of markers genotyped. Genomic prediction improves reliability by tracing the inheritance of genes even with small effects.
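A linear model with a normal prior on marker effects amounts to ridge regression on genotypes. The sketch below illustrates that idea on simulated data only. It is not the paper's pipeline: marker counts, effect sizes, noise level, and the penalty `lam` are all hypothetical, and the solver is a plain Gaussian elimination to keep the example dependency-free.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_fit(genotypes, phenotypes, lam):
    """Marker effects from (X'X + lam * I) beta = X'y on centered data."""
    n, p = len(genotypes), len(genotypes[0])
    col_mean = [sum(row[j] for row in genotypes) / n for j in range(p)]
    y_mean = sum(phenotypes) / n
    x = [[row[j] - col_mean[j] for j in range(p)] for row in genotypes]
    y = [v - y_mean for v in phenotypes]
    xtx = [[sum(x[i][a] * x[i][b] for i in range(n)) + (lam if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    xty = [sum(x[i][a] * y[i] for i in range(n)) for a in range(p)]
    return solve(xtx, xty), col_mean, y_mean


def predict(row, beta, col_mean, y_mean):
    """Predicted phenotype for one genotype row (allele counts 0/1/2)."""
    return y_mean + sum(b * (g - m) for b, g, m in zip(beta, row, col_mean))


def correlation(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def simulate(n, effects, noise_sd, rng):
    """Simulated animals: allele counts at frequency 0.5, additive trait plus noise."""
    genos, phenos, true_vals = [], [], []
    for _ in range(n):
        row = [rng.randint(0, 1) + rng.randint(0, 1) for _ in effects]
        g = sum(b * x for b, x in zip(effects, row))
        genos.append(row)
        true_vals.append(g)
        phenos.append(g + rng.gauss(0.0, noise_sd))
    return genos, phenos, true_vals


rng = random.Random(1)
effects = [rng.gauss(0.0, 1.0) for _ in range(20)]  # hypothetical marker effects
train_g, train_y, _ = simulate(300, effects, 3.0, rng)
test_g, _, test_true = simulate(200, effects, 3.0, rng)
beta, col_mean, y_mean = ridge_fit(train_g, train_y, lam=10.0)
preds = [predict(row, beta, col_mean, y_mean) for row in test_g]
r2 = correlation(preds, test_true) ** 2
print(f"validation R^2 against true genetic values: {r2:.2f}")
```

The toy has far more animals than markers, so it is much easier than the real problem (tens of thousands of markers, a few thousand bulls). It only shows the mechanics: estimate shrunken marker effects on older animals, then score younger ones from genotype alone.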

Results and Discussion: ... Marker effects for most other traits were evenly distributed across all chromosomes with only a few regions having larger effects, which may explain why the infinitesimal model and standard quantitative genetic theories have worked well. The distribution of marker effects indicates primarily polygenic rather than simple inheritance and suggests that the favorable alleles will not become homozygous quickly, and genetic variation will remain even after intense selection. Thus, dairy cattle breeders may expect genetic progress to continue for many generations.

... Most animal breeders will conclude that these gains in reliability are sufficient to make genotyping profitable before breeders invest in progeny testing or embryo transfer. Rates of genetic progress should increase substantially as breeders take advantage of these new tools for improving animals (Schaeffer, 2008). Further increases in number of genotyped bulls, revisions to the statistical methods, and additional edits should increase the precision of future genomic predictions.

Table 3

Trait                 Parent average    Genomic prediction    Gain from nonlinear genomic prediction compared with published parent average
Net merit             30  14            67  53  53            23
Milk yield            35  32            69  56  58            23
Fat yield             35  17            69  65  68            33
Protein yield         35  31            69  58  57            22
Fat percentage        35  29            69  69  78            43
Protein percentage    35  32            69  62  69            34
Productive life       27  28            55  42  45            18

"Horses ain't like people, man. They can't make themselves better than they're born. See, with a horse, it's all in the gene. It's the fucking gene that does the running. The horse has got absolutely nothing to do with it." --- Paulie (Eric Roberts) in The Pope of Greenwich Village

Tuesday, July 29, 2014


I have a new candidate for coolest research institute architecture. HKUST's Institute for Advanced Study is housed in an amazing building with a view of Clearwater Bay in HK. The members of the institute will be mostly theoretical physicists and mathematicians :-)

Stiff competition from Benasque's Center and the Perimeter Institute, however. Also Caltech's IQIM!

Compare to Dr. No's interrogation chamber :-)

Sunday, July 27, 2014


This paper provides additional support that the GWAS hits found by SSGAC affect cognitive ability. My guess is that UK age 14 SATS scores are pretty g-loaded. Note this is an ethnically homogeneous sample of students.

If the effect size per allele is about 1/30 SD, it would take ~1000 such alleles to account for normal population variation. These are the first loci detected, so the typical effect size of alleles affecting cognitive ability is probably smaller. This seems consistent with my estimate of ~10k causal variants.
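One way to reproduce the order of magnitude, as a sketch under stated assumptions rather than the calculation behind the post: treat loci as independent with equal effects, so that each contributes additive variance 2p(1-p)β², and ask how many are needed to make up a given share of the population variance. The allele frequency p = 0.5 and the variance shares used here are hypothetical inputs.

```python
def loci_needed(effect_sd, allele_freq, variance_share):
    """Number of equal-effect, independent loci needed to explain variance_share.

    Each locus contributes additive variance 2 * p * (1 - p) * beta^2,
    with beta in population SD units.
    """
    per_locus = 2.0 * allele_freq * (1.0 - allele_freq) * effect_sd ** 2
    return variance_share / per_locus


beta = 1.0 / 30.0  # effect per allele, from the text
for share in (0.5, 1.0):  # hypothetical shares of total variance to account for
    n_loci = loci_needed(beta, 0.5, share)
    print(f"variance share {share}: ~{n_loci:.0f} loci")
```

Under these assumptions the answer lands at roughly one to two thousand loci. Smaller typical effects push the count up quadratically, which is how an estimate near 10k can arise.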

Genetic Variation Associated with Differential Educational Attainment in Adults Has Anticipated Associations with School Performance in Children (PLoS ONE, July 17, 2014, DOI: 10.1371/journal.pone.0100248)

Genome-wide association study results have yielded evidence for the association of common genetic variants with crude measures of completed educational attainment in adults. Whilst informative, these results do not inform as to the mechanism of these effects or their presence at earlier ages and where educational performance is more routinely and more precisely assessed. Single nucleotide polymorphisms exhibiting genome-wide significant associations with adult educational attainment were combined to derive an unweighted allele score in 5,979 and 6,145 young participants from the Avon Longitudinal Study of Parents and Children with key stage 3 national curriculum test results (SATS results) available at age 13 to 14 years in English and mathematics respectively. Standardised (z-scored) results for English and mathematics showed an expected relationship with sex, with girls exhibiting an advantage over boys in English (0.433 SD (95%CI 0.395, 0.470), p<10⁻¹⁰) with more similar results (though in the opposite direction) in mathematics (0.042 SD (95%CI 0.004, 0.080), p = 0.030). Each additional adult educational attainment increasing allele was associated with 0.041 SD (95%CI 0.020, 0.063), p = 1.79×10⁻⁴ and 0.028 SD (95%CI 0.007, 0.050), p = 0.01 increases in standardised SATS score for English and mathematics respectively. Educational attainment is a complex multifactorial behavioural trait which has not had heritable contributions to it fully characterised. We were able to apply the results from a large study of adult educational attainment to a study of child exam performance marking events in the process of learning rather than realised adult end product. Our results support evidence for common, small genetic contributions to educational attainment, but also emphasise the likely lifecourse nature of this genetic effect.
Results here also, by an alternative route, suggest that existing methods for child examination are able to recognise early life variation likely to be related to ultimate educational attainment.

Saturday, July 26, 2014

Success, Ability, and all that

I came across this nice discussion at LessWrong which is similar to my old post Success vs Ability. The illustration below shows why even a strong predictor of outcome is seldom able to pick out the very top performer: e.g., taller people are on average better at basketball, but the best player in the world is not the tallest; smarter people are on average better at making money, but the richest person in the world is not the smartest, etc.
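The point is easy to check numerically. The sketch below assumes the predictor and the outcome are standard normal with correlation rho (a modeling assumption, as are the population size and trial count) and counts how often the person with the highest predictor also has the highest outcome.

```python
import math
import random


def top_match_rate(rho, n_people=1000, trials=200, seed=0):
    """Fraction of simulated populations in which the top-ranked person on the
    predictor is also the top-ranked person on the outcome."""
    rng = random.Random(seed)
    noise = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(trials):
        best_x = best_y = -math.inf
        arg_x = arg_y = -1
        for i in range(n_people):
            x = rng.gauss(0.0, 1.0)                    # predictor, e.g. height
            y = rho * x + noise * rng.gauss(0.0, 1.0)  # outcome, correlated with x
            if x > best_x:
                best_x, arg_x = x, i
            if y > best_y:
                best_y, arg_y = y, i
        hits += arg_x == arg_y
    return hits / trials


for rho in (0.5, 0.7, 0.9):
    print(f"rho = {rho}: top predictor is top performer in "
          f"{top_match_rate(rho):.0%} of populations")
```

Even at a correlation that would count as a very strong predictor in social science, the top scorer on the predictor is usually not the top performer. The match rate only approaches 100% as rho approaches 1.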

This seems like a trivial point (as are most things, when explained clearly); however, it still eludes the vast majority. For example, in the Atlantic article I linked to in the earlier post The Creative Mind, the neuroscience professor who studies creative genius misunderstands the implications of the Terman study. She repeats the common claim that Terman's study fails to support the importance of high cognitive ability to "genius"-level achievement: none of the Termites won a Nobel prize, whereas Shockley and Alvarez, who narrowly missed the (verbally loaded) Stanford-Binet cut for the study, each won for work in experimental physics. But luck, drive, creativity, and other factors, all at least somewhat independent of intelligence, influence success in science. Combine this with the fact that there are exponentially more people a bit below the Terman cut than above it, and Terman's results do little more than confirm that cognitive ability is positively but not perfectly correlated with creative output.

In the SMPY study, the probability of having published a literary work or earned a patent increased with ability even within the top 1%. The "IQ over 120 doesn't matter" meme falls apart if one measures individual likelihood of success, as opposed to the total number of individuals at, e.g., IQ 120 vs IQ 145 who have achieved some milestone. The base population of the former is 100 times that of the latter!
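The base-rate gap can be computed directly. This sketch assumes a normal distribution with mean 100 and SD 15 (the post does not specify a model) and compares the share of the population above each threshold.

```python
import math


def tail_above(iq, mean=100.0, sd=15.0):
    """Share of a normal(mean, sd) population scoring above iq."""
    z = (iq - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


above_120 = tail_above(120)
above_145 = tail_above(145)
ratio = above_120 / above_145
print(f"above 120: {above_120:.2%}, above 145: {above_145:.3%}, ratio ~ {ratio:.0f}")
```

Under this assumption the ratio comes out at several tens, the same order of magnitude as the round factor of 100 in the text. Either way the conclusion holds: counts of achievers are dominated by the far larger lower group unless the per-person success rate at the top is higher by a comparable factor, so raw counts say little about individual odds.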

This topic came up last night in Hong Kong, at dinner with two hedge funders (Caltech/MIT guys with PhDs) who have had long careers in finance. Both observed that 20 years ago it was nearly impossible to predict which of their colleagues and peers would go on to make vast fortunes, as opposed to becoming merely rich.

Tuesday, July 22, 2014

Genome editing excises HIV

See also CRISPR Symposium at MSU and Genetic engineering of monkeys using CRISPR.
The Scientist: ... The researchers, led by Kamel Khalili at Temple University in Philadelphia, Pennsylvania, used the CRISPR/Cas9 genome-editing system to excise HIV from several human cell lines, including microglia and T cells. They targeted both the 5’ and 3’ ends of the virus, called the long terminal repeats (LTRs), so that the entire viral genome was removed.

“We were extremely happy with the outcome,” Khalili told The Scientist. “It was a little bit . . . mind-boggling how this system really can identify a single copy of the virus in a chromosome, which is highly packed DNA, and exactly cleave that region.”

His team showed that not only could Cas9 excise one copy of the HIV genome, but—operating in the same cell—it could also clip out another copy lurking in a different chromosome. Often, Khalili said, a cell can have several copies of latent HIV distributed across various chromosomes. “Most likely the technology is going to clean up the viral DNA” in a cell, he said.

... One limitation of the CRISPR/Cas9 approach is that it can chop up unintended regions of the genome, producing so-called off-target effects. Khalili’s group performed whole-genome sequencing to look for off-target effects, but didn’t find any. T.J. Cradick, the director of the protein engineering core facility at Georgia Tech, said that a more thorough analysis of potential off-target effects is still required to make sure nothing has been overlooked. Nonetheless, “latent HIV provirus is a very exciting target and . . . a very promising way forward,” said Cradick, who did not participate in the study.

W. Hu et al., “RNA-directed gene editing specifically eradicates latent and prevents new HIV-1 infection,” PNAS, doi:10.1073/pnas.1405186111, 2014

Monday, July 21, 2014

The Creative Mind

See also Anne Roe's The Making of a Scientist.
The Atlantic: ... One after another, my writer subjects came to my office and spent three or four hours pouring out the stories of their struggles with mood disorder—mostly depression, but occasionally bipolar disorder. A full 80 percent of them had had some kind of mood disturbance at some time in their lives, compared with just 30 percent of the control group—only slightly less than an age-matched group in the general population. (At first I had been surprised that nearly all the writers I approached would so eagerly agree to participate in a study with a young and unknown assistant professor—but I quickly came to understand why they were so interested in talking to a psychiatrist.) 
The Vonneguts turned out to be representative of the writers’ families, in which both mood disorder and creativity were overrepresented—as with the Vonneguts, some of the creative relatives were writers, but others were dancers, visual artists, chemists, architects, or mathematicians. This is consistent with what some other studies have found. When the psychologist Kay Redfield Jamison looked at 47 famous writers and artists in Great Britain, she found that more than 38 percent had been treated for a mood disorder; the highest rates occurred among playwrights, and the second-highest among poets. When Joseph Schildkraut, a psychiatrist at Harvard Medical School, studied a group of 15 abstract-expressionist painters in the mid-20th century, he found that half of them had some form of mental illness, mostly depression or bipolar disorder; nearly half of these artists failed to live past age 60. ... 
This time around, I wanted to examine a more diverse sample of creativity, from the sciences as well as the arts. My motivations were partly selfish—I wanted the chance to discuss the creative process with people who might think and work differently, and I thought I could probably learn a lot by listening to just a few people from specific scientific fields. After all, each would be an individual jewel—a fascinating study on his or her own. Now that I’m about halfway through the study, I can say that this is exactly what has happened. My individual jewels so far include, among others, the filmmaker George Lucas, the mathematician and Fields Medalist William Thurston, the Pulitzer Prize–winning novelist Jane Smiley, and six Nobel laureates from the fields of chemistry, physics, and physiology or medicine. Because winners of major awards are typically older, and because I wanted to include some younger people, I’ve also recruited winners of the National Institutes of Health Pioneer Award and other prizes in the arts. 
Apart from stating their names, I do not have permission to reveal individual information about my subjects. And because the study is ongoing (each subject can take as long as a year to recruit, making for slow progress), we do not yet have any definitive results—though we do have a good sense of the direction that things are taking. By studying the structural and functional characteristics of subjects’ brains in addition to their personal and family histories, we are learning an enormous amount about how creativity occurs in the brain, as well as whether these scientists and artists display the same personal or familial connections to mental illness that the subjects in my Iowa Writers’ Workshop study did. ... 
As I hypothesized, the creative people have shown stronger activations in their association cortices during all four tasks than the controls have. (See the images on page 74.) This pattern has held true for both the artists and the scientists, suggesting that similar brain processes may underlie a broad spectrum of creative expression. Common stereotypes about “right brained” versus “left brained” people notwithstanding, this parallel makes sense. Many creative people are polymaths, people with broad interests in many fields—a common trait among my study subjects.

Saturday, July 19, 2014

Bell Curve @20 @Harvard

The host is Harvard professor Harvey Mansfield. I'm not sure who all of the other panelists are, but they seem to include a professor of government and another of economics. The Asian physics guy is probably Peter Lu.
The Program on Constitutional Government at Harvard University

March 14, 2014: Charles Murray, on “The Bell Curve Revisited.” Charles Murray is a Fellow at the American Enterprise Institute, and the author of famous and influential books, among them, Losing Ground (1984), The Bell Curve: Intelligence and Class Structure in American Life (1994, with Richard Herrnstein), and most recently, Coming Apart: the State of White America, 1960-2010 (2013). He declares himself a libertarian, has written for many journals, and has received the Irving Kristol award from AEI and the Bradley Prize from the Bradley Foundation. He is Harvard ’65 and received a PhD in political science from M. I. T. in 1974. He is also the author of several “Murray’s laws” of social behavior.

Hail Britannia -- 100k whole genomes

Progress! Genotyping of large, well-phenotyped samples.
TechnologyReview: The British government says that it plans to hire the U.S. gene-sequencing company Illumina to sequence 100,000 human genomes in what is the largest national project to decode the DNA of a populace. ...

Some other countries are also considering large national sequencing projects. The U.K. project will focus on people with cancer, as well as adults and children with rare diseases. Because all Britons are members of the National Health Service, the project expects to be able to compare DNA data with detailed centralized health records (see “Why the U.K. Wants a Genomic National Health Service”).

While the number of genomes to be sequenced is 100,000, the total number of Britons participating in the study is smaller, about 70,000. That is because for cancer patients Genomics England intends to obtain the sequence of both their inherited DNA as well as that of their cancers.
BGI bid for this work but their transition to the upgraded Complete Genomics technology is still in progress. This delay has affected our cognitive genomics project as well.

Big data sets are also being assembled in the US (note in this case only SNP genotyping; cost is less than $100 per person now):
AKESOgen announced today that it has been awarded a $7.5M contract by the U.S. Department of Veterans Affairs (VA) for genotyping samples from U.S. veterans as part of the Million Veteran Program (MVP). This award covers the genotyping of 105,000 veterans in the first year of a five year contract.

"The VA's Million Veteran Program is one of the largest genetic initiatives ever undertaken in the US and its visionary genomics and genetics approach will provide new insights about how genes affect health. The goal is to improve healthcare for veterans by understanding the genetic basis of many common conditions. The data will ultimately be beneficial to the healthcare of all veterans and of the wider community. We are delighted to have been selected by the VA for this unique endeavor and we will provide genetic data of the highest quality to the VA." said Bob Boisjoli, CEO of AKESOgen. To fulfill the VA contract, AKESOgen will utilize a custom designed array based genotyping solution from Affymetrix, Inc. ...
My prediction is that of order a million phenotype:genotype pairs will be enough to deduce the genetic architecture of complex traits such as height or cognitive ability. SNPs will be enough to solve most of the problem, so that cost is now ~ $100M or less -- interested billionaires please contact me :-)
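The arithmetic behind that estimate can be checked in a few lines. This is only a sketch: it assumes the full $7.5M VA award pays for exactly the 105,000 first-year samples (the announcement does not say so explicitly), and it takes $100 as the per-person SNP genotyping price.

```python
# Back-of-the-envelope genotyping costs (illustrative only).
va_award_usd = 7.5e6       # contract value quoted in the announcement
va_samples = 105_000       # veterans genotyped in the first year
# Assumption: the whole award covers exactly these samples.
cost_per_sample = va_award_usd / va_samples

target_pairs = 1_000_000   # "of order a million" phenotype:genotype pairs
assumed_price_usd = 100    # upper end of "less than $100 per person"
total_cost = target_pairs * assumed_price_usd

print(f"implied VA cost per sample: ${cost_per_sample:.2f}")
print(f"cost of 1M SNP genotypes:   ${total_cost / 1e6:.0f}M")
```

Under those assumptions the VA price comes out near $71 per sample, consistent with the "less than $100" figure, and a million genotypes at $100 each is $100M.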

Wednesday, July 16, 2014

Conor McGregor

Win or lose, he's entertaining. Definitely the biggest character in the UFC.

Friday, July 11, 2014

Minds and Machines

HLMI = ‘high–level machine intelligence’ = one that can carry out most human professions at least as well as a typical human. I'm more pessimistic than the average researcher in the poll. My 95 percent confidence interval has earliest HLMI about 50 years from now, putting me at ~ 80-90th percentile in this group as far as pessimism. I think human genetic engineering will be around for at least a generation or so before machines pass a "strong" Turing test. Perhaps a genetically enhanced team of researchers will be the ones who finally reach the milestone, ~ 100 years after Turing proposed it :-)
These are the days of miracle and wonder
This is the long-distance call
The way the camera follows us in slo-mo
The way we look to us all
The way we look to a distant constellation
That’s dying in a corner of the sky
These are the days of miracle and wonder
And don’t cry baby don’t cry
Don’t cry -- Paul Simon

Future Progress in Artificial Intelligence: A Poll Among Experts

Vincent C. Müller & Nick Bostrom

Abstract: In some quarters, there is intense concern about high–level machine intelligence and superintelligent AI coming up in a few decades, bringing with it significant risks for humanity; in other quarters, these issues are ignored or considered science fiction. We wanted to clarify what the distribution of opinions actually is, what probability the best experts currently assign to high–level machine intelligence coming up within a particular time–frame, which risks they see with that development and how fast they see these developing. We thus designed a brief questionnaire and distributed it to four groups of experts. Overall, the results show an agreement among experts that AI systems will probably reach overall human ability around 2040-2050, and move on to super-intelligence in less than 30 years thereafter. The experts say the probability is about one in three that this development turns out to be ‘bad’ or ‘extremely bad’ for humanity.

Thursday, July 10, 2014

Chimp intelligence is heritable

A natural place to look for alleles of large effect is among the otherwise conserved (from mouse through chimp) variants that are different in humans. See The Genetics of Humanness and The Essential Difference.

My guess (without checking the paper to see if they report it) is that test-retest correlation for chimps is well below the 0.9--0.95 often found for (human) g. Thus the h2 = 0.5 figure reported below could be significantly higher if corrected for reliability.
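To make the reliability correction concrete, here is a minimal sketch using classical test theory. It assumes measurement error is uncorrelated with genotype, so dividing the observed heritability by the test-retest reliability gives the heritability of the underlying "true" score. The chimp reliabilities below are hypothetical placeholders, not values from the paper.

```python
def correct_for_reliability(h2_observed: float, reliability: float) -> float:
    """Disattenuate a heritability estimate.

    reliability = Var(true score) / Var(observed score), so
    h2_true = Var(genetic) / Var(true score) = h2_observed / reliability.
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must lie in (0, 1]")
    return h2_observed / reliability

h2_reported = 0.5  # figure quoted for the chimp study
# Hypothetical test-retest reliabilities; 0.9 is typical of human g.
for r in (0.9, 0.8, 0.7):
    print(f"reliability {r:.1f} -> corrected h2 = "
          f"{correct_for_reliability(h2_reported, r):.2f}")
```

The lower the reliability of the chimp tests, the larger the upward correction.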
Nature News: Smart chimpanzees often have smart offspring, researchers suggest in one of the first analyses of the genetic contribution to intelligence in apes. The findings, published online today in Current Biology, could shed light on how human intelligence evolved, and might even lead to discoveries of genes associated with mental capacity.

A team led by William Hopkins, a psychologist at Georgia State University in Atlanta, tested the intelligence of 99 chimpanzees aged 9 to 54 years old, most of them descended from the same group of animals housed at the Yerkes National Primate Research Center in Atlanta. The chimps faced cognitive challenges such as remembering where food was hidden in a rotating object, following a human’s gaze and using tools to solve problems.

A subsequent statistical analysis revealed a correlation between the animals' performance on these tests and their relatedness to other chimpanzees participating in the study. About half of the difference in performance between individual apes was genetic, the researchers found.

In humans, about 30% of intelligence in children can be explained by genetics; for adults, who are less vulnerable to environmental influences, that figure rises to 70%. Those numbers are comparable to the new estimate of the heritability of intelligence across a wide age range of chimps, says Danielle Posthuma, a behavioural geneticist at VU University in Amsterdam, who was not involved in the research.

“This study is much overdue,” says Rasmus Nielsen, a computational biologist at the University of California, Berkeley. “There has been enormous focus on understanding heritability of intelligence in humans, but very little on our closest relatives.”

Tuesday, July 08, 2014

James Simons: Mathematics, Common Sense, and Good Luck

A great MIT colloquium by Jim Simons (intro by I. Singer). Interesting discussion @28 min about how Simons (after leaving mathematics at 38) became an investor. Initially, he relied on both fundamental / event-driven analysis (reading the newspaper ;-) and computer models. But Simons eventually decided on a completely model-driven approach, and the rest is history.

@38 min: on RenTech's secret: We start with first rate scientists ... Great infrastructure ... New ideas shared and discussed as soon as possible in an open environment ... Compensation based on overall firm performance ...

@44 min: Be guided by beauty ... Try to do it RIGHT ... Don't give up and hope for some good luck!

@48 min: a defense of HFT ... the cost of liquidity?

@55 min: world's greatest investor is a Keynesian :-)

@58 min: brief precis of financial crisis ... See also here.

See also Jim Simons is my hero.

Do Standardized Tests Matter?

Thanks to a reader for pointing me to this TEDx talk by Nathan Kuncel. See also SAT and GRE Validity.

Saturday, July 05, 2014

Physics and the Horizons of Truth

I came across a PDF version of this book online. It contains a number of fine essays, including the ones excerpted below. A recurring question concerning Godel's incompleteness results is whether they impact "interesting" mathematical questions.
CHAPTER 21 The Godel Phenomenon in Mathematics: A Modern View: ... Hilbert believed that all mathematical truths are knowable, and he set the threshold for mathematical knowledge at the ability to devise a “mechanical procedure.” This dream was shattered by Godel and Turing. Godel’s incompleteness theorem exhibited true statements that can never be proved. Turing formalized Hilbert’s notion of computation and of finite algorithms (thereby initiating the computer revolution) and proved that some problems are undecidable – they have no such algorithms.

Though the first examples of such unknowables seemed somewhat unnatural, more and more natural examples of unprovable or undecidable problems were found in different areas of mathematics. The independence of the continuum hypothesis and the undecidability of Diophantine equations are famous early examples. This became known as the Godel phenomenon, and its effect on the practice of mathematics has been debated since. Many argued that though some of the inaccessible truths above are natural, they are far from what is really of interest to most working mathematicians. Indeed, it would seem that in the seventy-five years since the incompleteness theorem, mathematics has continued thriving, with remarkable achievements such as the recent settlement of Fermat’s last “theorem” by Wiles and the Poincare conjecture by Perelman. Are there interesting mathematical truths that are unknowable?

The main point of this chapter is that when knowability is interpreted by modern standards, namely, via computational complexity, the Godel phenomenon is very much with us. We argue that to understand a mathematical structure, having a decision procedure is but a first approximation; a real understanding requires an efficient algorithm. Remarkably, Godel was the first to propose this modern view in a letter to von Neumann in 1956, which was discovered only in the 1990s.

Meanwhile, from the mid-1960s on, the field of theoretical computer science has made formal Godel’s challenge and has created a theory that enables quantification of the difficulty of computational problems. In particular, a reasonable way to capture knowable problems (which we can efficiently solve) is the class P, and a reasonable way to capture interesting problems (which we would like to solve) is the class NP. Moreover, assuming the widely believed P ≠ NP conjecture, the class NP-complete captures interesting unknowable problems. ...
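The gap between checking and finding that underlies P versus NP can be illustrated with subset sum, a standard NP-complete problem. The sketch below is my own illustration, not from the chapter: verifying a proposed answer takes time polynomial in the input size, while the naive search tries up to 2^n subsets. The exponential search here shows only that the obvious method is slow; whether every method must be slow is exactly the open P ≠ NP question.

```python
from itertools import combinations

def verify(numbers, target, indices):
    """Polynomial-time check of a proposed certificate (a tuple of indices)."""
    distinct = len(set(indices)) == len(indices)
    in_range = all(0 <= i < len(numbers) for i in indices)
    return distinct and in_range and sum(numbers[i] for i in indices) == target

def brute_force(numbers, target):
    """Exhaustive search over all non-empty subsets: up to 2**n - 1 candidates."""
    n = len(numbers)
    for size in range(1, n + 1):
        for indices in combinations(range(n), size):
            if sum(numbers[i] for i in indices) == target:
                return indices
    return None

numbers = [3, 34, 4, 12, 5, 2]
certificate = brute_force(numbers, 9)
print(certificate, verify(numbers, 9, certificate))
```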
This volume also includes Paul Cohen's essay (chapter 19) on his work on the Continuum Hypothesis and his interactions with Godel. See also Horizons of Truth.
Cohen: ... I still had a feeling of skepticism about Godel's work, but skepticism mixed with awe and admiration.

I can say my feeling was roughly this: How can someone thinking about logic in almost philosophical terms discover a result that had implications for Diophantine equations? ... I closed the book and tried to rediscover the proof, which I still feel is the best way to understand things. I totally capitulated. The Incompleteness Theorem was true, and Godel was far superior to me in understanding the nature of mathematics.

Although the proof was basically simple, when stripped to its essentials I felt that its discoverer was above me and other mere mortals in his ability to understand what mathematics -- and even human thought, for that matter -- really was. From that moment on, my regard for Godel was so high that I almost felt it would be beyond my wildest dreams to meet him and discover for myself how he thought about mathematics and the fount from which his deep intuition flowed. I could imagine myself as a clever mathematician solving difficult problems, but how could I emulate a result of the magnitude of the Incompleteness Theorem? There it stood, in splendid isolation and majesty, not allowing any kind of completion or addition because it answered the basic questions with such finality.
My recent interest in this topic parallels a remark by David Deutsch
The reason why we find it possible to construct, say, electronic calculators, and indeed why we can perform mental arithmetic, cannot be found in mathematics or logic. The reason is that the laws of physics "happen" to permit the existence of physical models for the operations of arithmetic such as addition, subtraction and multiplication.
that suggests the primacy of physical reality over mathematics (usually the opposite assumption is made!) -- the parts of mathematics which are simply models or abstractions of "real" physical things are most likely to be free of contradiction or misleading intuition. Aspects of mathematics which have no physical analog (e.g., infinite sets) are prone to problems in formalization or mechanization. Physics (models which can be compared to experimental observation; actual "effective procedures") does not ever require infinity, although it may be of some conceptual convenience. Hence one suspects, along the lines above, that mathematics without something like the "axiom of infinity" might be well-defined. Is there some sort of finiteness restriction (e.g., upper bound on Godel number) that evades Godel's theorem? If one only asks arithmetical questions about numbers below some upper bound, can't one avoid undecidability?
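One narrow piece of this is easy to demonstrate: a statement whose quantifiers all range below a fixed bound can be decided by finite search. The sketch below uses a bounded version of Goldbach's conjecture as a stand-in (my choice of example). It does not settle the broader question of whether a useful theory can be built under such a restriction, and the search can still be computationally infeasible for large bounds.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for small illustrative bounds."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def goldbach_holds_up_to(bound: int) -> bool:
    """Decide: every even n with 4 <= n <= bound is a sum of two primes.

    Every quantifier is bounded, so the loop always terminates with an answer.
    """
    for n in range(4, bound + 1, 2):
        if not any(is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1)):
            return False
    return True

print(goldbach_holds_up_to(1000))
```

The unbounded statement ("for every even n ...") has no such guaranteed procedure; only the bounded version is mechanically decidable this way.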

Tuesday, July 01, 2014

Snowden finale

Anyone care to make predictions?
Alternet: ... According to The Sunday Times of London, Glenn Greenwald will publish the names of Americans targeted by the NSA.

“One of the big questions when it comes to domestic spying is, ‘Who have been the NSA’s specific targets?’” he told the Times. “Are they political critics and dissidents and activists? Are they genuinely people we’d regard as terrorists? What are the metrics and calculations that go into choosing those targets and what is done with the surveillance that is conducted? Those are the kinds of questions that I want to still answer.”

Greenwald has promised that this will be the “biggest” revelation of the nearly two million classified files he received from Edward Snowden, and that “Snowden’s legacy would be ‘shaped in large part’ by this ‘finishing piece’ still to come.” In a May interview with GQ, Greenwald spoke of this “finale:”

"I think we will end the big stories in about three months or so [June or July 2014]. I like to think of it as a fireworks show: You want to save your best for last. There's a story that from the beginning I thought would be our biggest, and I'm saving that. The last one is the one where the sky is all covered in spectacular multicolored hues. This will be the finale, a big missing piece. Snowden knows about it and is excited about it."

Loyalty: Ames High School fight song

My high school fight song. Legend says it was written in the counterculture 60's and that some wag managed to slip "comrades at work and at play" into the lyrics under the noses of the school administrators. It's just the kind of thing clever AHS students might attempt :-)

Thursday, June 26, 2014

Theoreticians as Professional Outsiders

The book also contains essays on Schrodinger, Fisher, Pauling, George Price, and Rashevsky.
Theoreticians as Professional Outsiders: The Modeling Strategies of John von Neumann and Norbert Wiener (Ehud Lamm in Biology Outside the Box: Boundary Crossers and Innovation in Biology, Oren Harman and Michael R. Dietrich (eds.))

Both von Neumann and Wiener were outsiders to biology. Both were inspired by biology and both proposed models and generalizations that proved inspirational for biologists. Around the same time in the 1940s von Neumann developed the notion of self reproducing automata and Wiener suggested an explication of teleology using the notion of negative feedback. These efforts were similar in spirit. Both von Neumann and Wiener used mathematical ideas to attack foundational issues in biology, and the concepts they articulated had lasting effect. But there were significant differences as well. Von Neumann presented a how-possibly model, which sparked interest by mathematicians and computer scientists, while Wiener collaborated more directly with biologists, and his proposal influenced the philosophy of biology. The two cases illustrate different strategies by which mathematicians, the “professional outsiders” of science, can choose to guide their engagement with biological questions and with the biological community, and illustrate different kinds of generalizations that mathematization can contribute to biology. The different strategies employed by von Neumann and Wiener and the types of models they constructed may have affected the fate of von Neumann’s and Wiener’s ideas – as well as the reputation, in biology, of von Neumann and Wiener themselves.
For and Against theory in biology:
... E.B. Wilson articulated the reserved attitude of biologists towards uninvited theoreticians. Wilson’s remarks at the Cold Spring Harbor Symposia on Quantitative Biology in 1934 were ostensibly about the “Mathematics of Growth” but it is impossible to fail to notice their tone and true scope. Wilson suggested orienting the discussion around five axioms or “platitudes” as he called them. The first two are probably enough to get his point across. Axiom 1 states that “science need not be mathematical,” and if that’s not bad enough, axiom 2 solidifies the reserved attitude towards mathematization by stating that “simply because a subject is mathematical it need not therefore be scientific.”

... While the idea of self-reproduction seems incredible, and some might even have thought it to involve a self-contradiction, with objects creating something as complex as they are themselves, von Neumann’s solution to the problem of self-reproduction was remarkably simple. It is based on two operations: (1) constructing an object according to a list of instructions, and (2) copying a list of instructions as is ... This procedure is trivial for anyone computer-literate to understand; it was a remarkable theoretical result in 1948. What, however, does it tell us about biology? It is often observed that von Neumann’s explanation, which involves treating the genetic material both as instructions and as data that is copied as-is, is analogous to the reproduction of cells, since DNA, the analogue of the instruction list, is passively replicated. Von Neumann compared the construction instructions that direct the automaton to genes, noting that genes probably do not constitute instructions fully specifying the construction of the objects their presence stimulates. He warned that genes are probably only general pointers or cues that affect development, a warning that alas did not curtail the “genetic program” metaphor that became dominant in years to come.

Von Neumann noted that his model explained how mutations that do not affect self-replication are possible. If the instruction list specifies not only the self-replicating automaton but also an additional structure, this structure will also be replicated. ...
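The two operations, and the remark about heritable mutations, can be sketched in a toy model. This is a loose illustration rather than von Neumann's cellular-automaton construction; the names Automaton, construct, and copy_instructions are hypothetical, and the "body" is just a tuple of labels.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Automaton:
    body: tuple          # parts built by interpreting the instructions
    instructions: tuple  # passive description, copied without interpretation

def construct(instructions: tuple) -> tuple:
    """Operation (1): build an object according to a list of instructions."""
    return tuple(f"built:{step}" for step in instructions)

def copy_instructions(instructions: tuple) -> tuple:
    """Operation (2): copy the list of instructions as is."""
    return tuple(instructions)

def reproduce(parent: Automaton) -> Automaton:
    """Offspring = freshly constructed body + uninterpreted copy of the list."""
    return Automaton(construct(parent.instructions),
                     copy_instructions(parent.instructions))

plan = ("constructor", "copier", "controller")
founder = Automaton(construct(plan), plan)
child = reproduce(founder)

# A "mutation" that appends an extra structure leaves replication intact,
# so the extra structure is inherited by later generations.
mutant_plan = plan + ("extra_structure",)
mutant = Automaton(construct(mutant_plan), mutant_plan)
grandchild = reproduce(reproduce(mutant))

print(child == founder, "built:extra_structure" in grandchild.body)
```

The instruction list is used twice, once as a program to be interpreted and once as data to be copied, which is the point of the analogy to DNA.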

... As Claude Shannon put it in a 1958 review of von Neumann’s contributions to automata theory, and specifically self-reproducing automata:

If reality is copied too closely in the model we have to deal with all of the complexity of nature, much of which is not particularly relevant to the self-reproducing question. However, by simplifying too much, the structure becomes so abstract and simplified that the problem is almost trivial and the solution is un-impressive with regard to solving the philosophical point that is involved. In one place, after a lengthy discussion of the difficulties of formulating the problem satisfactorily, von Neumann remarks: "I do not want to be seriously bothered with the objection that (a) everybody knows that automata can reproduce themselves (b) everybody knows that they cannot."
See also On Crick and Watson and Reliable Organization of Unreliable Components
