diff --git a/_posts/2013-11-06-unicode-codepoints-in-ruby.md b/_posts/2013-11-06-unicode-codepoints-in-ruby.md
new file mode 100644
index 0000000..4219b52
--- /dev/null
+++ b/_posts/2013-11-06-unicode-codepoints-in-ruby.md
@@ -0,0 +1,44 @@
+---
+layout: post
+title: Unicode codepoints in ruby
+date: 06.11.2013 12:04
+---
+Another post in the category "better write it down before you forget it".
+
+I ❤ Unicode. At least most of the time. That's why I have things like ✓, ✗ and
+ツ mapped directly on my keyboard.
+
+But sometimes you need not only the symbol itself, but its codepoint as well. That's easy in Ruby:
+
+    irb> "❤".codepoints
+    => [10084]
+
+Got some codepoints and need to map them back to their symbols? Easy:
+
+    irb> [10084, 10003].pack("U*")
+    => "❤✓"
+
+Oh, of course the usual `\uXYZ` syntax works as well, but it expects the codepoint as a hex string:
+
+    irb> 10084.to_s 16
+    => "2764"
+    irb> "\u{2764}"
+    => "❤"
+
+Sometimes you may need to see the actual bytes of the encoded string. This is easy in Ruby as well:
+
+    irb> "❤".bytes
+    => [226, 157, 164]
+
+There is documentation on these things:
+
+* [each_codepoint][]
+* [codepoints][]
+* [bytes][]
+
+Enjoy the world of Unicode! [❤][unicode-heart]
+
+[each_codepoint]: http://www.ruby-doc.org/core-2.0.0/String.html#method-i-each_codepoint
+[codepoints]: http://www.ruby-doc.org/core-2.0.0/String.html#method-i-codepoints
+[bytes]: http://www.ruby-doc.org/core-2.0.0/String.html#method-i-bytes
+[unicode-heart]: http://codepoints.net/U+2764
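
The irb snippets in the post can be combined into a small round-trip sketch. This is only an illustration, assuming UTF-8 strings and a reasonably recent Ruby: the helper names `to_codepoints`, `from_codepoints`, `escape_notation` and `hex_bytes` are hypothetical and not part of the post, and they only wrap the core methods the post already uses (`String#codepoints`, `Array#pack("U*")`, `String#each_codepoint`, `String#bytes`).

```ruby
# Hypothetical helpers (names are illustrative, not from the post).
# Assumes the strings involved are UTF-8 encoded.

# String -> array of integer codepoints.
def to_codepoints(str)
  str.codepoints
end

# Array of integer codepoints -> UTF-8 string.
def from_codepoints(points)
  points.pack("U*")
end

# String -> its "\u{...}" escape notation, one escape per codepoint,
# using uppercase hex digits.
def escape_notation(str)
  str.each_codepoint.map { |cp| format("\\u{%X}", cp) }.join
end

# String -> space-separated hex view of its encoded bytes.
def hex_bytes(str)
  str.bytes.map { |b| format("%02X", b) }.join(" ")
end

heart = "\u{2764}"
puts to_codepoints(heart).inspect   # the heart's codepoint as an integer
puts escape_notation(heart)         # the escape you could paste into a string literal
puts hex_bytes(heart)               # the three UTF-8 bytes in hex
```

Going through `each_codepoint` rather than `bytes` for the escape notation matters: a `\u{...}` escape names a codepoint, while `bytes` exposes the encoded form, which for the heart is three bytes rather than one number.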