Formal Methods for AI Safety

Self-Aware Systems

Future intelligent systems could cause great harm to humanity. Because of the large risks, if we are to responsibly build intelligent systems, they must not only be safe but we must be very convinced that they are safe. For example, an AI which is taught human morality by reinforcement learning might be safe, but it’s hard to see how we could become sufficiently convinced to responsibly deploy it.

Before deploying an advanced intelligent system, we should have a very convincing argument for its safety. If that argument is rigorously correct, then it is a mathematical proof. This is not a trivial endeavor! It’s just the only path that appears to be open to us.

The mathematical foundations of computing have been known since Church and Turing‘s work in 1936. Both created computational models which were simultaneously logical models about which theorems could be proved. Church created the lambda calculus

View original post 360 more words

A brilliant example of communication about science and humanity

dw2

Mathematical Universe

Do you enjoy great detective puzzles? Do you like noticing small anomalies, and turning them into clues to an unexpected explanation? Do you like watching world-class scientists at work, piecing together insights to create new theories, and coping with disappointments when their theories appear to be disproved?

In the book “Our mathematical universe”, the mysteries being addressed are some of the very biggest imaginable:

  • What is everything made out of?
  • Where does the universe come from? For example, what made the Big Bang go “bang”?
  • What gives science its authority to speak with so much confidence about matters such as the age and size of the universe?
  • Is it true that the constants of nature appear remarkably “fine-tuned” so as to allow the emergence of life – in a way suggesting a miracle?
  • What does modern physics (including quantum mechanics) have to teach us about mind and consciousness?
  • What are…

View original post 490 more words