I think they are a design flaw in Awk; I'm going to look into that and recommend changes to POSIX via the Austin Group mailing list if it still exists.
Awk has some newline sensitivities due to the following ambiguities:
condition # condition with no action allowed: default { print } action
{ action } # action with no condition allowed
condition { action } # both
Therefore, this is not allowed (or rather, it is allowed, but it specifies two separate rules: a condition with the default action, followed by an unconditional action):
condition
{ action }
So there can be no newline between a condition and the opening { of its action, and actions must be brace-enclosed.
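To see the difference concretely, here is a small shell sketch. The function names split_rule and joined_rule are just illustrative, and it assumes a POSIX-style awk is on the PATH. The same condition and action are written once with a newline between them and once on a single line:

```shell
# Two rules: "NR == 1" with the default print action,
# followed by an unconditional action that runs for every record.
split_rule() {
  printf 'a\nb\n' | awk 'NR == 1
{ print "seen: " $0 }'
}

# One rule: the action runs only for the first record.
joined_rule() {
  printf 'a\nb\n' | awk 'NR == 1 { print "seen: " $0 }'
}

split_rule    # prints three lines: a / seen: a / seen: b
joined_rule   # prints one line:    seen: a
```

The split version echoes the first record through the default action and then tags every record, which is rarely what someone who wrote it that way intended.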
And thus (IIRC) the awk lexical analyzer (in the original One True Awk implementation) returns an explicit newline token to the Yacc parser. In any phrase structure that doesn't deal with that token, a newline will cause a syntax error:
function( # no good
arg
)
function("string " # no good
foo + bar
" catenation")
When the lexer produces the token for the opening brace of an action, it could shift into a freeform state in which it consumes newlines internally. Once the action has been parsed, the lexer can return to the newline-sensitive mode.
The newline sensitivities don't seem to serve a purpose in the C-like language within the actions.
That language also occurs outside of actions via the function construct:
function whatever(...) {
}
Here, too, the lexer would shift into the freeform mode at the opening brace.
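A rough way to picture the freeform idea without touching a real Awk lexer: the toy filter below (my own illustration, not code from any Awk implementation; the name freeform_newlines is made up) tracks brace depth and treats a newline as significant only at depth 0. It deliberately ignores braces inside strings, regexes and comments, and it assumes statements inside braces are separated by semicolons, since newlines there no longer terminate anything.

```shell
# Toy model of the proposed lexer modes: inside { ... } a newline is
# swallowed (replaced by a space); outside braces it is kept.
freeform_newlines() {
  awk '{
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c == "{") depth++
      else if (c == "}") depth--
    }
    # depth > 0 means we are still inside an action or function body.
    printf "%s%s", $0, (depth > 0 ? " " : "\n")
  }'
}

printf 'cond {\nfoo(\narg\n)\n}\nnext\n' | freeform_newlines
# prints two lines:
#   cond { foo( arg ) }
#   next
```

The multi-line call inside the braces comes out as a single line, while the newline separating the rule from the next top-level item survives, which is the split between the two modes described above.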