Discussion:
WM_CHAR and charsets?
Quantum
2006-10-09 11:08:30 UTC
Hi all,

I'm struggling with a couple of things re: WM_CHAR and charsets.

When I get a WM_CHAR message, I can get the character code as an integer
by:

int intVal=(int)wParam;

However, that integer value depends on what character set is currently
being used, correct?

So, let's say I'm using the standard Windows character set, WINDOWS-1252.
Then intVal will hold the value that character has in the WINDOWS-1252
character set. Yes?

However, if I change character sets in Windows (I don't know how, but I
know you can...) to say ISO-8859-8 (the Hebrew one), then pressing
exactly the same physical key on the keyboard would generate a
completely different char and int value?

If that lot is correct, then I've got a question if I may:


1) Is there a way I can convert the intVal I get from a WM_CHAR message
into the equivalent character in a separate character set I specify? So
maybe some function like this:

-----------------------------------------------------
// aVal is just the (int)wParam value.
// aCharSet is the charset to convert aVal to.
// Returns -1 if there is no equivalent char in aCharSet;
// otherwise returns the equivalent int value in aCharSet.
int convertIntoCharset(int aVal, CharSet aCharSet) {
    CharSet origCharSet = GetCurrentCharSet();
    int newVal = /* some calculation with aVal, origCharSet and aCharSet */;
    return newVal;
}
-----------------------------------------------------

Thanks for your time.
Q
Heinz Ozwirk
2006-10-09 17:13:17 UTC
Post by Quantum
Hi all,
I'm struggling with a couple of things re: WM_CHAR and charsets.
When I get a WM_CHAR message, I can get the character code as an integer
int intVal=(int)wParam;
However, that integer value depends on what character set is currently
being used, correct?
Yes, but it also depends on the keyboard layout you are using. The keyboard
driver is used in various places to translate from scan codes to virtual key
codes and finally to Unicode. It is even used to translate virtual key codes
into strings that can be displayed in a menu as an accelerator for a
command.
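
To illustrate (a rough, untested sketch; the helper name onKeyDown and the
buffer size are mine, but ToUnicode and GetKeyboardState are real Win32
calls): inside a WM_KEYDOWN handler you can ask the active keyboard layout
which character a keystroke would produce, before any codepage is involved.

-----------------------------------------------------
#include <windows.h>

// Sketch: map a WM_KEYDOWN keystroke to the Unicode character(s) it
// would produce under the active keyboard layout and shift state.
void onKeyDown(WPARAM wParam, LPARAM lParam)
{
    BYTE keyState[256];
    if (!GetKeyboardState(keyState))        // current shift/ctrl/alt state
        return;

    UINT scanCode = (lParam >> 16) & 0xFF;  // bits 16-23 of lParam
    WCHAR buf[4];
    int n = ToUnicode((UINT)wParam, scanCode, keyState, buf, 4, 0);
    if (n > 0)
    {
        // buf[0..n-1] holds the Unicode character(s) for this keystroke
    }
}
-----------------------------------------------------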
Post by Quantum
So, let's say I'm using the standard Windows character set, WINDOWS-1252.
Then intVal will hold the value that character has in the WINDOWS-1252
character set. Yes?
When you press a key, the keyboard driver will translate its scan code into
a virtual key code. You'll get that code with the WM_KEYDOWN message. When
this message is passed to TranslateMessage, it will translate the virtual
key code into a Unicode character and send it with a WM_UNICHAR message to
your program. Only if your app doesn't handle this message is the Unicode
character translated to its nearest match in the current character set
(Windows-1252, for example) and passed to your app with a WM_CHAR message.
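
In code, that flow is roughly the usual message loop plus a window procedure
(an untested sketch; MyWndProc and runMessageLoop are placeholder names, the
rest are standard Win32 calls):

-----------------------------------------------------
#include <windows.h>

// Placeholder window procedure; by the time WM_CHAR arrives, wParam is
// already the translated character code.
LRESULT CALLBACK MyWndProc(HWND hWnd, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
    switch (uMsg)
    {
    case WM_CHAR:
    {
        int intVal = (int)wParam;   // character code after TranslateMessage
        (void)intVal;
        return 0;
    }
    }
    return DefWindowProc(hWnd, uMsg, wParam, lParam);
}

// The standard message loop; TranslateMessage is the step that turns
// WM_KEYDOWN into the character messages described above.
void runMessageLoop(void)
{
    MSG msg;
    while (GetMessage(&msg, NULL, 0, 0) > 0)
    {
        TranslateMessage(&msg);
        DispatchMessage(&msg);
    }
}
-----------------------------------------------------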
Post by Quantum
However, if I change character sets in Windows (I don't know how, but I
know you can...) to say ISO-8859-8 (the Hebrew one), then pressing exactly
the same physical key on the keyboard would generate a completely
different char and int value?
If you only change the default ANSI codepage, everything works as above.
Only when the Unicode character is translated into an 8-bit character is the
default codepage used. So, if you type the key for the letter ä (a-umlaut)
on a German keyboard, this keystroke will result in a WM_UNICHAR message
with the Unicode value of "lower case letter a umlaut". Then it will perhaps
be discarded, or translated into whatever letter is closest in ISO-8859-8,
so you would probably get a WM_CHAR message with the letter a (without those
fancy dots).

So, yes, you might get completely different codes, but not those for Hebrew
characters. When you change the default codepage, you should also select a
matching keyboard layout.
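
Switching the layout programmatically is possible too. A minimal sketch
(untested; "0000040D" is, as far as I know, the standard identifier of the
Hebrew keyboard layout, and the function name is mine):

-----------------------------------------------------
#include <windows.h>

// Sketch: load and activate the Hebrew keyboard layout for this thread.
// Returns NULL if the layout is not installed or the call fails.
HKL switchToHebrewLayout(void)
{
    return LoadKeyboardLayout(TEXT("0000040D"), KLF_ACTIVATE);
}
-----------------------------------------------------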
Post by Quantum
1) Is there a way I can convert the intVal I get from a WM_CHAR message
into the equivalent character in a separate character set I specify? So
When you get a WM_CHAR message, a lot of processing has already been done.
You don't really know which key was originally pressed, and you cannot be
sure that you'll receive a WM_CHAR message for all keys, not even for those
that are supposed to generate some character (as opposed to a function key
like Insert). For a program using characters from different languages or
character sets, it might be easier to use Unicode instead of "ANSI" and
process WM_UNICHAR instead of WM_CHAR. If you really have to translate a
Unicode string into a single- or multi-byte character string, you could use
WideCharToMultiByte to do such translations.
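
Something along these lines (an untested sketch; the function name and the
-1 convention are mine, borrowed from your pseudocode, while
WideCharToMultiByte itself is the real API):

-----------------------------------------------------
#include <windows.h>

// Sketch: convert a UTF-16 string into an 8-bit string in a caller-chosen
// codepage (e.g. 1252 = Western European, 1255 = Windows Hebrew).
// Returns -1 if the conversion fails or some character has no equivalent
// in that codepage; otherwise returns the number of bytes written.
int convertToCodePage(const wchar_t* src, UINT codePage,
                      char* dst, int dstSize)
{
    BOOL usedDefault = FALSE;
    int n = WideCharToMultiByte(codePage, 0, src, -1,
                                dst, dstSize, NULL, &usedDefault);
    if (n == 0 || usedDefault)
        return -1;
    return n;   // includes the terminating zero because of the -1 length
}
-----------------------------------------------------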

HTH
Heinz
Quantum
2006-10-09 18:40:14 UTC
Post by Heinz Ozwirk
[...]
Thanks for your help. :)
