Implicit conversion between UTF string encodings
Initially recorded by DerekParnell
Problem
Coders are used to implicit conversions between some datatypes, especially when there is no data loss, such as integer to floating point. Consequently, a common practice is to write ...
    char[] A;
    dchar[] B;
    A = "test string";
    B = A;
and expect that D will do the conversion. However, D rejects this with the message ...
cannot implicitly convert expression A of type char[] to dchar[]
This forces the coder to rewrite the statement as an explicit conversion call and to explicitly import the std.utf module.
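As an illustration, the explicit form might look like the sketch below. It assumes a current D2 compiler with Phobos, where string literals are immutable (hence the `.dup`) and `std.utf.toUTF32` returns an immutable `dstring`; the snippets on this page appear to predate that. The helper name `widen` is made up for this example.

```d
import std.utf : toUTF32;

// Hypothetical helper: explicitly transcode UTF-8 to UTF-32.
dchar[] widen(const(char)[] utf8)
{
    // toUTF32 yields an immutable dstring; .dup makes a mutable copy.
    return toUTF32(utf8).dup;
}

void main()
{
    char[] A = "test string".dup; // literals are immutable in D2
    dchar[] B = widen(A);         // the step the proposal wants to be implicit

    assert(B.length == 11);
    assert(B == "test string"d);
}
```

The import and the call are exactly the two pieces of boilerplate the proposal wants the compiler to supply.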
Side note: this code works, since D converts string literals:
    char[] A;
    dchar[] B;
    A = "test string";
    B = "test string";
Also, when selecting overloaded methods, D uses automatic conversion for real and integer parameters:
    void Foo(real X){}
    . . .
    int i;
    Foo(i);
but not for UTF strings:
    void Foo(dchar[] X) { . . . }
    . . .
    char[] Y;
    Foo(Y);
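Without implicit conversion, one way to make such a call compile is a forwarding overload that transcodes by hand. The sketch below is only an illustration under the same assumptions as before (current D2 compiler, Phobos `std.utf.toUTF32`); the function bodies and the return value are invented so the result can be checked.

```d
import std.utf : toUTF32;

// The "real" function, which works on UTF-32.
size_t Foo(dchar[] X)
{
    return X.length; // number of code points
}

// Hypothetical forwarding overload: accepts UTF-8 and converts explicitly.
size_t Foo(char[] X)
{
    return Foo(toUTF32(X).dup);
}

void main()
{
    char[] Y = "grüße".dup;
    assert(Y.length == 7); // UTF-8 code units
    assert(Foo(Y) == 5);   // code points, via the forwarding overload
}
```

Every function taking a UTF string would need such a wrapper for each encoding it wants to accept, which is the repetition this request is aimed at.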
Proposal
D should silently import std.utf and perform these conversions implicitly.
Cons
- Implies that the std.utf module is accessible at compile and link times.
- Implies that the conversion routines in std.utf are present.
- Implies that the conversion routines in std.utf are sufficient to do the conversions.
- Converting between UTF encodings is slower than converting integer <-> floating point. But if you have to do it anyway, what's the difference? I think this is a red herring. (DerekParnell)
- I meant that it is slow enough that it should not be done automatically. It is still faster than converting to and from other encodings, since there are no lookups involved. It's a little like autoboxing, I think? (AndersFBjörklund)
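To make the cost argument concrete: unlike an int-to-real conversion, a UTF conversion has to walk the whole string and produce a new array of a different size. A minimal sketch, again assuming a current D2 compiler and Phobos; `byteSize` is a helper made up for this example.

```d
import std.utf : toUTF32;

// Hypothetical helper: bytes occupied by an array's elements.
size_t byteSize(T)(T[] s)
{
    return s.length * T.sizeof;
}

void main()
{
    char[] utf8 = "test string".dup;
    dchar[] utf32 = toUTF32(utf8).dup; // a new array, filled by scanning utf8

    assert(byteSize(utf8) == 11);  // 11 code units of 1 byte each
    assert(byteSize(utf32) == 44); // 11 code units of 4 bytes each
}
```

An implicit conversion would hide this allocation and linear scan behind a plain assignment or function call.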
Pros
- Reduces the extraneous coding requirement.
- Enhances the 'D is cool' factor.