extends: post.liquid
title: Unicode codepoints in ruby
date: 06 Nov 2013 12:04:00 +0100
path: /:year/:month/:day/unicode-codepoints-in-ruby
route: blog
---
Another post in the category "better write it down before you forget it".

I ❤ Unicode. At least most of the time. That's why I have things like ✓, ✗ and
ツ mapped directly on my keyboard.

But sometimes you need not only the symbol itself, but the codepoint as well. That's easy in Ruby:

~~~ruby
irb> "❤".codepoints
=> [10084]
~~~
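
If you only want to look at the codepoints one by one, `each_codepoint` (linked at the end of this post) walks the string without building the whole array first. A small sketch, where the sample string and the `non_ascii` variable are just made up for illustration:

```ruby
text = "I ❤ Ruby"

# Collect every codepoint outside the ASCII range (0..127).
non_ascii = []
text.each_codepoint do |cp|
  non_ascii << cp if cp > 127
end

p non_ascii
```

That is handy for spotting the one odd character hiding in an otherwise plain string.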

Got some codepoints and need to map them back to their symbols? Easy:

~~~ruby
irb> [10084, 10003].pack("U*")
=> "❤✓"
~~~
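
To convince yourself that `codepoints` and `pack("U*")` really undo each other, here is a quick round trip. The variable names are only for illustration. For a single codepoint, `Integer#chr` with an explicit encoding should do the job as well:

```ruby
text = "❤✓"

points   = text.codepoints    # one Integer per character
restored = points.pack("U*")  # "U*" packs each Integer as a UTF-8 character

# A single codepoint can also be converted with Integer#chr,
# as long as the target encoding is given explicitly.
heart = 10084.chr(Encoding::UTF_8)

p points
p restored == text
p heart
```

Without the encoding argument, `chr` may raise a `RangeError` for values above 255, so passing `Encoding::UTF_8` is the safe choice here.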

Oh, and of course the usual `\uXXXX` escape syntax works as well, but you need the codepoint as a hex string for that:

~~~ruby
irb> 10084.to_s 16
=> "2764"
irb> "\u{2764}"
=> "❤"
~~~
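
If you jump between the `U+2764` notation and real characters a lot, two tiny helpers can save some typing. This is only a sketch: `to_u_notation` and `from_u_notation` are names I made up, not anything Ruby ships with.

```ruby
# Hypothetical helper: first character of a string -> "U+XXXX" notation.
def to_u_notation(char)
  format("U+%04X", char.ord)
end

# Hypothetical helper: "U+XXXX" notation -> character.
def from_u_notation(notation)
  hex = notation.sub(/\AU\+/, "")  # strip the leading "U+"
  [hex.hex].pack("U")              # String#hex parses the hex digits
end

puts to_u_notation("❤")
puts from_u_notation("U+2764")
```

`String#hex` is the reverse of `to_s 16` from above, so the hex string goes straight back to the integer codepoint.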

Sometimes you may need to see the actual bytes. This is easy in Ruby as well:

~~~ruby
irb> "❤".bytes
=> [226, 157, 164]
~~~
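
Those three bytes are the UTF-8 encoding of the single codepoint, which is why the character count and the byte count differ. A small sketch, assuming the string is UTF-8 encoded (the default for Ruby source files since 2.0); the variable names are just for illustration:

```ruby
heart = "❤"

chars = heart.length    # counts characters
size  = heart.bytesize  # counts bytes

# The same bytes in the hex form you usually see in encoding tables.
hex_bytes = heart.bytes.map { |b| format("%02x", b) }

# Going back: pack("C*") builds a binary string, so it has to be
# relabelled as UTF-8 before it compares equal to the original.
restored = heart.bytes.pack("C*").force_encoding(Encoding::UTF_8)

p chars, size, hex_bytes, restored == heart
```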

There is documentation on these things:

* [each_codepoint][]
* [codepoints][]
* [bytes][]

Enjoy the world of Unicode! [❤][unicode-heart]

[each_codepoint]: http://www.ruby-doc.org/core-2.0.0/String.html#method-i-each_codepoint
[codepoints]: http://www.ruby-doc.org/core-2.0.0/String.html#method-i-codepoints
[bytes]: http://www.ruby-doc.org/core-2.0.0/String.html#method-i-bytes
[unicode-heart]: http://codepoints.net/U+2764