Why doesn't the normalize() method work in some cases?

By default, String.prototype.normalize() uses NFC as its argument. NFC replaces multiple characters with a single one.

MDN

You can specify “NFC” to get the composed canonical form, in which multiple code points are replaced with single code points where possible.

And here’s an example from MDN. It works.

let str = '\u006E\u0303'; // n + combining tilde
str = str.normalize();
console.log(`${str}: ${str.length}`); // ñ: 1

But then I decided to try this method with other characters. For example:

let str = '\u0057\u0303'; // W + combining tilde
str = str.normalize();
console.log(`${str}: ${str.length}`); // W̃: 2

What’s wrong in the second example? Why doesn’t it work?

Answer

It doesn't replace multiple characters; it replaces multiple code points, and only where possible.

ñ, being a character used in Spanish, has its own code point in Unicode (U+00F1), so you can just say ñ instead of "Take an n and then put a ~ on top of it".
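A quick way to check this yourself is to compare the string before and after normalization. This is only a small sketch; the variable names are mine, not from MDN.

```javascript
// n (U+006E) followed by a combining tilde (U+0303): two code points.
const decomposed = '\u006E\u0303';

// NFC, the default form, composes the pair into one code point.
const composed = decomposed.normalize('NFC');

console.log(decomposed.length); // 2
console.log(composed.length); // 1
console.log(composed.codePointAt(0).toString(16)); // f1
console.log(composed === '\u00F1'); // true
```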

W̃, being a representation of a phonic sound, doesn't have its own code point. It is used comparatively rarely, so it hasn't been given precious space in the more efficient bits of Unicode. The only way you can have one is to say "Take a W and then put a ~ on top of it", so normalize() has nothing to compose it into and the string keeps its two code points.
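To see the difference side by side, you can list the code points each string contains after NFC. The codePoints helper below is a hypothetical name I made up for illustration, not a built-in.

```javascript
// Hypothetical helper: format each code point of a string as U+XXXX.
function codePoints(str) {
  // Array.from iterates a string by code point, not by UTF-16 code unit.
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  ).join(' ');
}

const wTilde = '\u0057\u0303'; // W + combining tilde
const nTilde = '\u006E\u0303'; // n + combining tilde

// No precomposed code point exists for W with a tilde, so NFC leaves it alone.
console.log(codePoints(wTilde.normalize('NFC'))); // U+0057 U+0303

// ñ has a precomposed code point, so NFC collapses the pair.
console.log(codePoints(nTilde.normalize('NFC'))); // U+00F1
```

If you need to compare strings regardless of how they were typed, normalizing both sides to the same form is usually enough; the length just won't always shrink to 1.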