Unicode characters

Ok, this one is completely useless, but here it is anyway. I came across some text online that contained several unusual characters from outside the plain ASCII range. I was curious to see the character codes of those characters. While I was at it, I also wanted to count how many times each of them occurred.

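// Count the characters in s that pass `filter`, keyed by UTF-16 code unit.
// By default, keep anything outside printable ASCII (codes 32 to 126).
function getChars(s, filter) {
    if (!filter) {
        filter = function (ch) { return ch.charCodeAt(0) < 32 || ch.charCodeAt(0) > 126; };
    }
    // split('') splits on UTF-16 code units, so characters outside the
    // Basic Multilingual Plane (most emoji) show up as two surrogate halves.
    const chars = s.split('').filter(filter);
    const map = {};
    for (const ch of chars) {
        const c = ch.charCodeAt(0);
        map[c] = map[c] ? map[c] + 1 : 1;
    }
    return map;
}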

Now you can pass in a string like this:

getChars('the brown fox jumped over the lazy dog€₹🍇👞')

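and get this (each emoji shows up as two entries, because JavaScript stores it as a pair of UTF-16 surrogate code units):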

{
  "8364": 1,
  "8377": 1,
  "55356": 1,
  "55357": 1,
  "56414": 1,
  "57159": 1
}
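If you would rather see one entry per actual character, here is a rough sketch of a code-point-aware variant. The name getCodePoints is made up for this example and is not part of the original snippet. It relies on two standard features: iterating a string with for...of yields whole code points, and codePointAt(0) returns the full code point instead of a surrogate half.

```javascript
// Hypothetical variant of getChars that counts whole Unicode code points.
// The function name is an assumption made for illustration.
function getCodePoints(s, filter) {
    if (!filter) {
        // Same default as before: keep anything outside printable ASCII.
        filter = function (ch) {
            const cp = ch.codePointAt(0);
            return cp < 32 || cp > 126;
        };
    }
    const map = {};
    // for...of walks the string by code point, so a surrogate pair
    // arrives as a single two-unit string instead of two halves.
    for (const ch of s) {
        if (!filter(ch)) continue;
        const cp = ch.codePointAt(0);
        map[cp] = (map[cp] || 0) + 1;
    }
    return map;
}

console.log(getCodePoints('the brown fox jumped over the lazy dog€₹🍇👞'));
```

With the same input, the two emoji should now come back as one key each (127815 and 128094) next to 8364 and 8377. Note that this still counts code points, not what a reader perceives as a single character: a sequence such as a letter followed by a combining accent would be counted as two entries.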

Also, if you aren’t 100% clear on how character encoding works, go read this article by Joel Spolsky right now.
