Alexander Turner: Year-long stipend for research into shard theory and mechanistic interpretability in reinforcement learning
Amount: $220,000.00
Award date: January 1, 2023
Focus area: Technical research
  • Turner is an independent researcher with a PhD in Computer Science from Oregon State University.

  • Turner has an excellent track record of producing insightful research. Examples of his past original research include formalizing a notion of power-seeking behavior and steering models via activation engineering.

  • Turner has provided mentorship to various promising researchers, increasing the pool of expertise within AI safety.

  • While this stipend is on the higher side for the fund, we believe the rate was justified in this case by Turner’s competitive private-sector earning potential and the quality of his previous research.

Outcomes: While there is usually some lag between grant time and research results, Turner has already produced one paper from this grant.

Note: this grant was made by the same grantmaking team under the Long-Term Future Fund. Read more about the AI Risk Mitigation Fund Team here.