Ask Uncle Colin: why does the Newton-Raphson method work?

Ask Uncle Colin is a chance to ask your burning, possibly embarrassing, maths questions – and to show off your skills at coming up with clever acronyms. Send your questions to colin@flyingcoloursmaths.co.uk and Uncle Colin will do what he can.

Dear Uncle Colin,

I know how to use the Newton-Raphson method – but I don’t know why it works and I’m worried nobody will like me because of it.

-- Getting An Understanding Starts Somewhere

But of course, GAUSS! First up, just for the sake of clarity, not knowing where the Newton-Raphson method comes from is perfectly normal. In fact, not many people know that the version we know and love was developed by Simpson¹ – I didn’t know that until recently, and a few people like me (I think).

If you want to find the root of a function with the Newton-Raphson method, starting from a sensible guess $x_0$, here’s what you do:

1. Work out the value of the function at $x_0$
2. Work out the value of the derivative of the function at $x_0$
3. Divide the first by the second
4. Take the result away from $x_0$ – and you get your next guess, $x_1$.
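If it helps to see the recipe as code, here is a minimal Python sketch of a single step. The name `newton_step` and its arguments `f`, `df` and `x0` are made up for illustration, and the zero-gradient check is an added assumption (the recipe has nothing to divide by when the derivative is zero).

```python
def newton_step(f, df, x0):
    """One Newton-Raphson update from the guess x0 (illustrative sketch)."""
    value = f(x0)    # step 1: the function at x0
    slope = df(x0)   # step 2: the derivative at x0
    if slope == 0:
        # Assumption: refuse to continue rather than divide by zero.
        raise ZeroDivisionError("derivative is zero at x0; try another guess")
    ratio = value / slope  # step 3: divide the first by the second
    return x0 - ratio      # step 4: take the result away from x0


# Example: one step for f(x) = x^2 - 123, starting from 11.
x1 = newton_step(lambda x: x * x - 123, lambda x: 2 * x, 11)
print(x1)
```

Feeding the result back in as the new `x0` gives the guess after that, and so on.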

For example, if I wanted to work out $\sqrt{123}$ and figured out that $f(x) = x^2-123$ would give me a zero at that point, I’d guess that the answer was about 11 (because $11^2=121$).

I’d work out: $f(11) = -2$.

I’d differentiate to get $f'(x) = 2x$, so $f'(11) = 22$.

I’d divide the first by the second to get $-\frac{1}{11}$.

I’d take that away from 11 for a second, improved guess of $x_1 = 11 - \left(-\frac{1}{11}\right) = 11 + \frac{1}{11} = 11.\dot 0 \dot 9$. (The true value is $\sqrt{123} \approx 11.0905$, so that’s good to three decimal places.)
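To see what happens if you keep going, here is a rough Python sketch that repeats the update a few times for this particular example. The choice of four repeats and the names `guesses`, `f` and `df` are assumptions for illustration; `math.sqrt` is only there to measure how far off each guess is.

```python
import math


def f(x):
    return x * x - 123  # zero at sqrt(123)


def df(x):
    return 2 * x  # derivative of f


x = 11.0
guesses = [x]
for _ in range(4):  # four repeats is an arbitrary choice
    x = x - f(x) / df(x)  # the Newton-Raphson update
    guesses.append(x)

# Print each guess alongside its distance from the true root.
for n, guess in enumerate(guesses):
    print(n, guess, abs(guess - math.sqrt(123)))
```

The error column should shrink very quickly, which is a large part of why the method is so popular.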

But why does it work? That’s a more interesting question. It’s actually a very simple idea: draw a tangent to the curve at your best guess, and use where that crosses the $x$-axis as your next guess.

Not convinced? Well, the tangent at $x_0$ has a gradient of $f'(x_0)$ and goes through $(x_0, f(x_0))$, so the equation of the line is $(y-f(x_0)) = f'(x_0) (x - x_0)$ – and the point $(x_1, 0)$ lies on this line.

Substituting, you get $- f(x_0) = f'(x_0) (x_1 - x_0)$. Divide by the gradient (which works as long as $f'(x_0) \ne 0$), add $x_0$, and you get:

$x_0 - \frac{f(x_0)}{f'(x_0)} = x_1$, which is the Newton-Raphson recipe. Neat, eh?
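If you would rather check the tangent story numerically than take it on trust, something like the following Python sketch would do it, reusing the $\sqrt{123}$ example. The helper name `tangent` is invented here; it is just the tangent line at $x_0$ written as a function.

```python
def f(x):
    return x * x - 123


def df(x):
    return 2 * x


x0 = 11.0


def tangent(x):
    """Height of the tangent line to f at x0, evaluated at x."""
    return f(x0) + df(x0) * (x - x0)


x1 = x0 - f(x0) / df(x0)  # the Newton-Raphson recipe

print(tangent(x1))  # the tangent should cross the x-axis here, up to rounding
print(f(x0), f(x1))  # the curve itself should be closer to zero at x1
```

The tangent hits zero at $x_1$ while the curve only gets closer to it, which is why you repeat the process.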

-- Uncle Colin

¹ Newton and Raphson had similar, but much more limited, processes for finding polynomial roots.