Unicode in VIM

Enable Unicode and Korean in your VIM on Mac and Windows:

  1. Compile your VIM with multi-byte support
  2. Configure VIM to use Unicode encoding
  3. Configure your terminal for Unicode

Compile your VIM with multi-byte support:

To check whether your VIM was compiled with multi-byte support, type :version in VIM or run vim --version in a shell. If +multi_byte appears in the feature list, multi-byte support is already compiled in; if you see -multi_byte instead, it is not. To compile VIM with it enabled, download the source, configure it with ‘big’ features, build it, and install it:

tar xvjf vim7x.tar.bz2
cd vim7x/src
./configure --with-features=big --prefix=/usr/local/or/whatever/you/want
make
sudo make install

Change the --prefix path to wherever you want VIM installed.

On Windows, you probably have a pre-compiled version, which by default comes with multi-byte support.
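
The check described above can also be scripted. What follows is only a minimal sketch: has_multi_byte is a hypothetical helper name (it is not part of VIM), and it assumes the feature list looks like the usual vim --version output, where each feature is printed as +name or -name.

```shell
#!/bin/sh
# has_multi_byte: hypothetical helper, not part of Vim.
# Reads feature-list text on stdin (for example the output of
# `vim --version`) and succeeds only if +multi_byte is listed.
has_multi_byte() {
    # -q: no output, exit status only
    # -F: treat the pattern as a fixed string, not a regex
    # -e: needed because the pattern starts with '+'
    grep -qF -e '+multi_byte'
}

# Usage (only meaningful where vim is on the PATH):
#   vim --version | has_multi_byte && echo "multi-byte support is compiled in"
```

A line containing -multi_byte does not match, because the fixed string includes the leading plus sign.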

Configure VIM to use Unicode encoding:

VIM inherits the system locale for its default encoding. Thus in most cases it is enough to set LANG so as to use Unicode, most commonly UTF-8:

export LANG=en_GB.UTF-8 # for bash or sh family
setenv LANG en_GB.UTF-8 # for csh or its family
LANG=en_GB.UTF-8 # on Windows

for instance. To make VIM use Unicode by default regardless of the locale, both internally and when loading and saving files, run the following commands or add them to your .vimrc. (Whenever encoding is set to a Unicode encoding, VIM stores the text as UTF-8 internally.)

:set encoding=utf-8
:setglobal fileencoding=utf-8

setglobal makes utf-8 the default for newly created files: the fileencoding option is local to each buffer, and a new buffer inherits the global value.
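
Before relying on the locale, it can help to confirm which one VIM will actually inherit. The sketch below is an illustration under a few assumptions: both function names are hypothetical, the precedence used is the POSIX one (LC_ALL, then LC_CTYPE, then LANG), and a locale counts as UTF-8 only if its name carries a UTF-8 codeset such as en_GB.UTF-8 or en_GB.utf8.

```shell
#!/bin/sh
# effective_ctype_locale: hypothetical helper that prints the locale name
# governing character encoding, using the POSIX precedence
# LC_ALL, then LC_CTYPE, then LANG.
effective_ctype_locale() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    else
        printf '%s\n' "${LANG:-}"
    fi
}

# is_utf8_locale: hypothetical helper that succeeds if the locale name in $1
# names a UTF-8 codeset (case-insensitive), e.g. en_GB.UTF-8 or ko_KR.utf8.
is_utf8_locale() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        utf-8|utf8|*.utf-8|*.utf8|*.utf-8@*|*.utf8@*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   if is_utf8_locale "$(effective_ctype_locale)"; then
#       echo "UTF-8 locale"
#   else
#       echo "not a UTF-8 locale"
#   fi
```

This only inspects the names of the environment variables; it does not verify that the named locale is installed on the system.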

If your terminal uses an unusual encoding, you may also need to set termencoding, the encoding your terminal uses, so that VIM translates correctly between the terminal and itself. Otherwise you may leave it empty. As a general rule, if your system locale is set correctly in an environment variable, it is most likely the encoding your terminal uses, and VIM takes this locale as its default encoding. You can therefore copy that default into termencoding, unless it is already set, before you change encoding to Unicode:

:if &termencoding == ""
:let &termencoding = &encoding
:endif

On Windows (tested only on Windows 7), the terminal encoding was set correctly and automatically when VIM started (it is usually a code page compatible with latin1). It is therefore usually better not to change it or, if you do, to check first whether it is already set, as shown above. On Mac (Mac OS X 10.6 or later; I am not sure about earlier versions), you can configure the terminal to use Unicode (see below), and then either set termencoding to UTF-8 or leave it empty.
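
Because the order matters here (termencoding has to be captured while encoding still holds the locale default), it may help to see the settings from this section collected in one place. The following is a sketch only: append_unicode_settings is a hypothetical helper name, and the target file is whatever vimrc you pass in. The VIM commands themselves are the ones shown above, written without the optional leading colons.

```shell
#!/bin/sh
# append_unicode_settings: hypothetical helper that appends the settings
# discussed above to the vimrc file given as $1.
append_unicode_settings() {
    # The quoted 'EOF' keeps the here-document body literal:
    # the shell performs no expansion inside it.
    cat >> "$1" <<'EOF'
" Remember the terminal's encoding while 'encoding' still holds the
" locale default, i.e. before it is switched to UTF-8.
if &termencoding == ""
  let &termencoding = &encoding
endif
set encoding=utf-8
setglobal fileencoding=utf-8
EOF
}

# Usage:
#   append_unicode_settings "$HOME/.vimrc"
```

The function appends rather than overwrites, so running it twice duplicates the block; review the file afterwards.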

Configure your terminal for Unicode:

With the settings described above in place and a decent Unicode font in use, you should be able to see most (Western) Unicode characters on screen. If you still cannot see or enter Unicode characters, it is very likely that your terminal does not support Unicode or is not configured for it. In particular, multi-byte multi-space characters, which occupy more than a single character cell on screen (CJK characters, for example), might need additional setup.

On Mac, make sure that your terminal is configured to use Unicode (UTF-8) for Character encoding in Preferences > Settings > Advanced. With that set, monospace fonts such as Menlo and Andale Mono usually provide good support for single-space Unicode characters, and the terminal should locate and use a matching font for multi-space characters such as Korean. If everything is set correctly, you will see Unicode characters in VIM as well as in the shell. If you can see them but not enter them, uncheck Escape non-ASCII input in Preferences > Settings > Advanced.
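
A quick way to see whether the terminal gives multi-space characters two cells each is to print a couple of them above a run of ASCII characters. This is a rough sketch, not a definitive test: print_width_test is a hypothetical helper name, and the two Hangul syllables (U+D55C and U+AE00) are written as octal UTF-8 bytes so the script itself stays plain ASCII.

```shell
#!/bin/sh
# print_width_test: hypothetical helper that prints two Hangul syllables
# above four ASCII digits, each line wrapped in vertical bars.
print_width_test() {
    # U+D55C is ED 95 9C and U+AE00 is EA B8 80 in UTF-8 (3 bytes each).
    printf '|\355\225\234\352\270\200|\n'
    printf '|1234|\n'
}

print_width_test
```

If the terminal decodes UTF-8 and draws each syllable two cells wide, the closing bars of the two lines should line up. Garbage characters on the first line suggest the terminal is not decoding UTF-8; misaligned bars suggest a font or width problem.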

On Windows, the default terminal with its default monospace font often does not handle multi-space characters very well. It is usually easier to edit Unicode text in gVIM, where you can specify fonts for single-space and multi-space (or wide, in VIM’s terms) characters separately. Note that the multi-space font must be exactly twice as wide as the monospaced single-space font. VIM tries to locate such a font automatically from its default (single-space) GUI font, but if that fails you should specify one yourself. Windows usually ships with monospaced double-width fonts for multi-space characters such as CJK. For Korean, the fonts whose names end in ‘Che’, such as BatangChe and GulimChe, work nicely, and for Japanese, MS Mincho and MS Gothic should work. The guifontwide option sets the wide font, and guifont sets its single-space counterpart:

:if has("gui_running")
:set guifont=Consolas:h9:cANSI
:set guifontwide=BatangChe
:endif

On both systems, you need an appropriate input method (IME) to enter such characters.

Reference:

As usual, the most useful and comprehensive source of information is VIM’s built-in help, accessible via :help mbyte, or the online copy of the VIM documentation if you prefer reading online.
