Last update January 6, 2019

# Lexical


## Messages

Minor typo:

| Decimal Literal, U Suffix | Type |
|---|---|
| 0U .. 4_294_967_296U | uint |
| 4_294_967_296U .. 18_446_744_073_709_551_615UL | ulong |

Should presumably be

| Decimal Literal, U Suffix | Type |
|---|---|
| 0U .. 4_294_967_295U | uint |
| 4_294_967_296U .. 18_446_744_073_709_551_615UL | ulong |
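The corrected bound is just 2^32 - 1, the largest value a 32-bit unsigned integer can hold. As a rough cross-check in C (not D), the sketch below assumes that D's uint corresponds to C's uint32_t and ulong to uint64_t; the helper name fits_in_uint32 is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* A 32-bit unsigned integer covers 0 .. 2^32 - 1, so its maximum is
   4294967295, not 4294967296. Checked at compile time (C11). */
static_assert(UINT32_MAX == 4294967295u,
              "largest 32-bit unsigned value is 2^32 - 1");

/* Hypothetical helper: returns 1 if v fits in 32 unsigned bits
   (assumed to match D's uint), 0 if it needs 64 bits (D's ulong). */
int fits_in_uint32(uint64_t v) {
    return v <= UINT32_MAX;
}
```

Under these assumptions, 4294967295 is the last value that fits and 4294967296 is the first that does not, which matches the corrected table.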

### As a mathematician I would like:

If I could make two changes to the C programming language, they would be:

1. Add an exponentiation operator, a^b or a**b, so that 2**3 is 8. For a numerics programmer this would be really useful, and it is one way that Fortran really scores over C. I know that C has the pow function, but it always treats the exponent as a float or double, whereas an integer exponent should really be handled differently. The optimizer should also be able to do clever things with a^2 (i.e. emit it as the inline code a*a). Also, using "pow" is just plain ugly. I do appreciate that a^b and a**b already have meanings in C (the first is exclusive or, and the second is a*(*b)), but I think this is sufficiently worthwhile for numerics programmers that you should either find a whole new symbol or deprecate the current use of ^ or ** (e.g. one could insist that the current a**b always be written a* *b; when does a**b ever actually appear in real code?).

2. Mathematically, a%b has a definite and unambiguous meaning when a is negative and b is positive: the result should be non-negative. This is something Perl gets right.
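For illustration, here is a minimal C sketch of what the two requests above amount to when written as ordinary functions. The names ipow and mod_nonneg are hypothetical helpers, not part of any standard library. The sketch assumes a C99 or later compiler (where % truncates toward zero) and does no overflow checking.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: integer power by repeated squaring.
   Uses only integer multiplications, never floating point.
   Overflow is not checked; the caller must keep the result in range. */
int64_t ipow(int64_t base, unsigned exp) {
    int64_t result = 1;
    while (exp > 0) {
        if (exp & 1u) {
            result *= base;   /* this bit of the exponent is set */
        }
        exp >>= 1;
        if (exp > 0) {
            base *= base;     /* square only if more bits remain */
        }
    }
    return result;
}

/* Hypothetical helper: remainder that is non-negative whenever b > 0. */
int64_t mod_nonneg(int64_t a, int64_t b) {
    assert(b > 0);
    int64_t r = a % b;        /* C99: r takes the sign of a */
    return r < 0 ? r + b : r; /* shift a negative remainder into 0 .. b-1 */
}
```

With these, ipow(2, 3) is 8 and mod_nonneg(-7, 3) is 2, whereas the plain expression -7 % 3 evaluates to -1 in C99 and later.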

### Another math suggestion

I support the creation of a ** operator, as suggested in the comment above. I would also like to have the possibility of writing complex numbers as 1+2j as well as 1+2i. (While this may look like a silly requirement, 1i and 1j have different meanings in some fields, such as in electronics, where they indicate the time convention used.) I think these two suggestions would make math coding even more comfortable. Thx.

### identifiers & "unialpha"

Why does D reference "ISO/IEC 9899:1999 (E) Appendix D" to define "universal alpha"? "ISO/IEC 9899:1999 (E) Appendix D" does not list "universal alpha".

Example: \u00B7 (MIDDLE DOT, Other_Punctuation) is not a "universal alpha", yet Appendix D allows it in identifiers.

"ISO/IEC 9899:1999 (E) Appendix D" itself references "ISO/IEC TR 10176:1998" for the character data. I strongly suggest dropping the indirection via "Appendix D" and using "ISO/IEC TR 10176 (current)" instead of the dated "ISO/IEC TR 10176:1998". The 1998 version does not yet include a sizeable set of CJK and math characters that can be found in the current version.

(from NG:digitalmars.D/42263 by Thomas Kuehne)

### Suggestion

I recommend that the D Programming Language lexical specification reserve (or weakly reserve) all lowercase ASCII identifiers that are two to six characters long ( [a-z]{2,6} ) and contain at least one [aeiouy] character, for future language expansion. The reservation could perhaps be relaxed further by saying that only English words are reserved. That should accommodate future needs as the language grows. --Eljay
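To make the proposed rule concrete, the following C sketch tests whether an identifier would fall into the reserved class as literally stated (two to six lowercase ASCII letters, at least one of them in [aeiouy]). The function name is_weakly_reserved is invented for this example. The "English words only" relaxation is not modeled, and the range check assumes an ASCII-compatible character set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check for the proposed rule: returns 1 if id matches
   [a-z]{2,6} and contains at least one of a, e, i, o, u, y; else 0. */
int is_weakly_reserved(const char *id) {
    assert(id != NULL);
    size_t len = strlen(id);
    if (len < 2 || len > 6) {
        return 0;                      /* wrong length */
    }
    int has_vowel = 0;
    for (size_t i = 0; i < len; i++) {
        char c = id[i];
        if (c < 'a' || c > 'z') {
            return 0;                  /* not a lowercase ASCII letter */
        }
        if (strchr("aeiouy", c) != NULL) {
            has_vowel = 1;
        }
    }
    return has_vowel;
}
```

Under this reading, short names such as "foo" or "my" would be reserved, while "bcd" (no vowel), "Foo" (uppercase letter), and "foreach" (seven letters) would not.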

### Bare escape sequences

The text and at least one example say that a bare escape sequence is a string. (Example: "1" ~ \n ~ "2" is the same as "1\n2".) However, the formal grammar does not mention this. Which is wrong? (I hope it's the grammar; I like this idea.) ChrisChittleborough, June 16, 2011 0:39 CET

(Years later, while cleaning up old bookmarks ...) The formal grammar does not allow bare escape sequences. The words "an escape sequence" probably should be removed from the sentence "A string literal is either a double quoted string, a wysiwyg quoted string, an escape sequence, a delimited string, a token string, or a hex string".