guanhuaing
C: How to Define Complex Type Declarations

 

Even relatively new C programmers have no trouble reading simple C declarations such as

int      foo[5];     // foo is an array of 5 ints
char    *foo;        // foo is a pointer to char
double   foo();      // foo is a function returning a double

but as the declarations get a bit more involved, it's more difficult to know exactly what you're looking at.

char *(*(**foo[][8])())[]; // huh ?????

It turns out that the rules for reading an arbitrarily-complex C variable declaration are easily learned by even beginning programmers (though how to actually use the variable so declared may be well out of reach).

This Tech Tip shows how to do it.

Basic and Derived Types

In addition to one variable name, a declaration is composed of one "basic type" and zero or more "derived types", and it's crucial to understand the distinction between them.

The complete list of basic types is:

char, signed char, unsigned char
short, unsigned short
int, unsigned int
long, unsigned long
float, double, void
struct tag, union tag, enum tag
long long, unsigned long long, long double (ANSI/ISO C only)

A declaration can have exactly one basic type, and it's always on the far left of the expression.

The "basic types" are augmented with "derived types", and C has three of them:

* pointer to...
    This is denoted by the familiar * character, and it should be self-evident that a pointer always has to point to something.

[] array of...
    "Array of" can be undimensioned -- [] -- or dimensioned -- [10] -- but the sizes don't really play significantly into reading a declaration. We typically include the size in the description. It should be clear that arrays have to be "arrays of" something.

() function returning...
    This is usually denoted by a pair of parentheses together - () - though it's also possible to find a prototype parameter list inside. Parameter lists (if present) don't really play into reading a declaration, and we typically ignore them. We'll note that parens used to represent "function returning" are different from those used for grouping: grouping parens surround the variable name, while "function returning" parens are always on the right.
    Functions are meaningless unless they return something (and we accommodate the void type by waving the hand and pretending that it's "returning" void).

A derived type always modifies something that follows, whether it be the basic type or another derived type, and to make a declaration read properly one must always include the preposition ("to", "of", "returning"). Saying "pointer" instead of "pointer to" will make your declarations fall apart.

It's possible that a type expression may have no derived types (e.g., "int i" describes "i is an int"), or it can have many. Interpreting the derived types is usually the sticking point when reading a complex declaration, but this is resolved with operator precedence in the next section.
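As a minimal sketch of the three derived types, the declarations below apply each one to the basic type int. The names (value, plain, ptr, arr, func, arr_count) are illustrative only and do not come from the original article.

```c
#include <assert.h>
#include <stddef.h>

static int value = 42;

int  plain;                 /* plain is an int (no derived types) */
int *ptr = &value;          /* ptr is a pointer to int            */
int  arr[3] = { 1, 2, 3 };  /* arr is an array of 3 int           */

int func(void)              /* func is a function returning int   */
{
    return 7;
}

/* Number of elements in arr, recovered from the "array of 3" part of its type. */
size_t arr_count(void)
{
    return sizeof arr / sizeof arr[0];
}
```

In each case the English description ends with the basic type (int) and everything before it is a derived type with its preposition.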

Operator Precedence

Almost every C programmer is familiar with the operator precedence tables, which give rules that say (for instance) that multiply and divide have higher precedence than ("are performed before") addition or subtraction, and that parentheses can be used to alter the grouping. This seems natural for "normal" expressions, but the same rules do indeed apply to declarations: they are type expressions rather than computational ones.

The "array of" [] and "function returning" () type operators have higher precedence than "pointer to" *, and this leads to some fairly straightforward rules for decoding.

Always start with the variable name:

foo is ...

and always end with the basic type:

foo is ... int

The "filling in the middle" part is usually the trickier part, but it can be summarized with this rule:

"go right when you can, go left when you must"

Working your way out from the variable name, honor the precedence rules and consume derived-type tokens to the right as far as possible without bumping into a grouping parenthesis. Then go left to the matching paren.
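A small sketch of why precedence matters: the two declarations below use the same tokens, but grouping parentheses force the "pointer to" to be read first in the second one. The variable and helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

int *a[3];      /* go right first:  a is array of 3 pointer to int */
int (*p)[3];    /* parens stop us:  p is pointer to array of 3 int */

/* Returns 1 if a really is an array holding three pointers. */
int a_is_array_of_pointers(void)
{
    return sizeof a == 3 * sizeof(int *);
}

/* Returns 1 if *p really is an array of three ints.
   sizeof does not evaluate *p, so the null pointer is never dereferenced. */
int p_points_to_array(void)
{
    return sizeof *p == 3 * sizeof(int);
}
```

Both helpers return 1 on a conforming compiler, which is a cheap way to confirm a reading before relying on it.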

A simple example

We'll start with a simple example:

long **foo[7];

We'll approach this systematically, focusing on just one or two small parts at a time as we develop the description in English. Each step below repeats the declaration, explains which part we consume next, and shows the description so far.

long **foo[7];
Start with the variable name and end with the basic type:
foo is ... long

long **foo[7];
At this point, the variable name is touching two derived types: "array of 7" and "pointer to", and the rule is to go right when you can, so in this case we consume the "array of 7"
foo is array of 7 ... long

long **foo[7];
Now we've gone as far right as possible, so the innermost part is only touching the "pointer to" - consume it.
foo is array of 7 pointer to ... long

long **foo[7];
The innermost part is now only touching a "pointer to", so consume it also.
foo is array of 7 pointer to pointer to long

This completes the declaration!
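The reading "array of 7 pointer to pointer to long" can be checked with a short sketch. The helper names foo_count and foo_roundtrip are hypothetical, and the slot index 2 is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

long **foo[7];   /* foo is array of 7 pointer to pointer to long */

/* Element count, which comes from the "array of 7" part of the type. */
size_t foo_count(void)
{
    return sizeof foo / sizeof foo[0];
}

/* Store a pointer-to-pointer in one slot and follow it back to the long. */
long foo_roundtrip(long v)
{
    long  n = v;             /* the long at the end of the chain          */
    long *p = &n;            /* pointer to long                           */
    long  result;

    foo[2] = &p;             /* each element: pointer to pointer to long  */
    result = **foo[2];       /* index, then dereference twice             */
    foo[2] = NULL;           /* do not keep the address of a local around */
    return result;
}
```

Using the variable mirrors the declaration: index first (the "array of"), then two dereferences (the two "pointer to" steps).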

A hairy example

To really test our skills, we'll try a very complex declaration that very well may never appear in real life (indeed: we're hard-pressed to think of how this could actually be used). But it shows that the rules scale to very complex declarations.

char *(*(**foo[][8])())[];
All declarations start out this way: "variable name is .... basictype"
foo is ... char

char *(*(**foo[][8])())[];
The innermost part touches "array of" and "pointer to" - go right.
foo is array of ... char

char *(*(**foo[][8])())[];
It's common in a declaration to alternate right and left, but this is not the rule: the rule is to go as far right as we can, and here we find that the innermost part still touches "array of" and "pointer to". Again, go right.
foo is array of array of 8 ... char

char *(*(**foo[][8])())[];
Now we've hit a parenthesis used for grouping, and this halts our march to the right. So we have to backtrack to collect all the parts to the left (but only as far as the paren). This consumes the "pointer to":
foo is array of array of 8 pointer to ... char

char *(*(**foo[][8])())[];
Again we are backtracking to the left, so we consume the next "pointer to":
foo is array of array of 8 pointer to pointer to ... char

char *(*(**foo[][8])())[];
Consuming the "pointer to" in the previous step finished off the entire parenthesized subexpression, so we "consume" the parens too. This leaves the innermost part touching "function returning" on the right, and "pointer to" on the left - go right:
foo is array of array of 8 pointer to pointer to function returning ... char

char *(*(**foo[][8])())[];
Again we hit a grouping parenthesis, so backtrack to the left:
foo is array of array of 8 pointer to pointer to function returning pointer to ... char

char *(*(**foo[][8])())[];
Consuming the grouping parentheses, we then find that the innermost part is touching "array of" on the right, and "pointer to" on the left. Go right:
foo is array of array of 8 pointer to pointer to function returning pointer to array of ... char

char *(*(**foo[][8])())[];
Finally we're left with only "pointer to" on the left: consume it to finish the declaration.
foo is array of array of 8 pointer to pointer to function returning pointer to array of pointer to char

We have no idea how this variable is useful, but at least we can describe the type correctly.
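One way to double-check a reading like this is to rebuild the type with typedefs and let the compiler compare the two spellings. The sketch below is an illustration under a few assumptions: the typedef names (str_array, getter, cell), the helper functions, and the outer dimension of 2 are invented for the demo, and the empty parameter lists are written as (void) to keep modern compilers quiet.

```c
#include <assert.h>
#include <string.h>

/* Build the type starting from the basic-type end of the description. */
typedef char *str_array[];          /* array of pointer to char              */
typedef str_array *getter(void);    /* function returning pointer to that    */
typedef getter **cell;              /* pointer to pointer to such a function */

/* Both lines declare the same object. If the typedef chain disagreed with
   the raw declaration, the compiler would report conflicting types. */
extern cell foo[][8];
char *(*(**foo[2][8])(void))[];     /* outer size 2 chosen only for the demo */

static char *names[] = { "alpha", "beta" };

static str_array *get_names(void)
{
    return &names;                  /* pointer to array of pointer to char */
}

static getter *fp = get_names;      /* pointer to function */

/* Walk the chain: index twice, dereference twice, call, dereference, index. */
const char *second_name(void)
{
    foo[1][3] = &fp;
    return (*(**foo[1][3])())[1];
}
```

The expression in second_name uses the operators in the same order as the English description, which is the usual payoff of being able to read the declaration.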

Abstract Declarators

The C standard describes an "abstract declarator", which is used when a type needs to be described but not associated with a variable name. These occur most often in two places -- casts, and as arguments to sizeof -- though they also appear in function prototypes that omit parameter names, and they can look intimidating:

int (*(*)())()

To the obvious question of "where does one start?", the answer is "find where the variable name would go, then treat it like a normal declaration". There is only one place where a variable name could possibly go, and locating it is actually straightforward. Using the syntax rules, we know that the name must be:

  • to the right of all the "pointer to" derived type tokens
  • to the left of all "array of" derived type tokens
  • to the left of all "function returning" derived type tokens
  • inside all the grouping parentheses

Looking at the example, we see that the rightmost "pointer to" sets one boundary, and the leftmost "function returning" sets another one:

int (*(* • ) • ())()

The • indicators show the only two places that could possibly hold the variable name, but the leftmost one is the only one that fits the "inside the grouping parens" rule. This gives us our declaration as:

int (*(*foo)())()

which our "normal" rules describe as:

foo is a pointer to function returning pointer to function returning int
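A short sketch of the same abstract declarator in its two typical homes, a sizeof operand and a cast. The function names (answer, inner, abstract_size, call_through_cast) and the variable named_foo are hypothetical, and the parameter lists are written as (void) rather than () so the code compiles cleanly under current C standards.

```c
#include <assert.h>
#include <stddef.h>

static int answer(void) { return 42; }

/* inner is a function returning pointer to function returning int */
static int (*inner(void))(void) { return answer; }

/* The named form of the type, for comparison with the abstract form. */
int (*(*named_foo)(void))(void);

/* The abstract declarator used as a sizeof operand (no variable name). */
size_t abstract_size(void)
{
    return sizeof(int (*(*)(void))(void));
}

/* Round-trip through a generic function pointer, then cast back
   using the abstract declarator, and call through both levels. */
int call_through_cast(void)
{
    void (*generic)(void) = (void (*)(void))inner;
    int (*(*foo)(void))(void) = (int (*(*)(void))(void))generic;
    return foo()();
}
```

Note that the cast is simply the declaration of foo with the name removed, which is the rule of thumb described above run in reverse.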

Semantic restrictions/notes

Not all combinations of derived types are allowed, and it's possible to create a declaration that perfectly follows the syntax rules but is nevertheless not legal in C (i.e., syntactically valid but semantically invalid). We'll touch on them here.

Can't have arrays of functions
    Use "array of pointer to function returning..." instead.

Functions can't return functions
    Use "function returning pointer to function returning..." instead.

Functions can't return arrays
    Use "function returning pointer to array of..." instead.

In arrays, only the leftmost [] can be undimensioned
    C supports multi-dimensional arrays (e.g., char foo[1][2][3][4]), though in practice this often suggests poor data structuring. Nevertheless, when there is more than one array dimension, only the leftmost one is allowed to be empty. char foo[] and char foo[][5] are legal, but char foo[5][] is not.

"void" type is restricted
    Since void is a special pseudo-type, a variable with this basic type is only legal with a final derived type of "pointer to" or "function returning". It's not legal to have "array of void" or to declare a variable of just type "void" without any derived types.

    void *foo;            // legal
    void foo();           // legal
    void foo;             // not legal
    void foo[];           // not legal
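The legal alternatives named in this list can be sketched as follows. All identifiers here (one, two, table, pick, grid, get_grid, read_via_void) are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

static int one(void) { return 1; }
static int two(void) { return 2; }

/* array of 2 pointer to function returning int (not "array of functions") */
int (*table[2])(void) = { one, two };

/* function returning pointer to function returning int
   (not "function returning function") */
int (*pick(int i))(void)
{
    return table[i];
}

static int grid[4] = { 10, 20, 30, 40 };

/* function returning pointer to array of 4 int
   (not "function returning array") */
int (*get_grid(void))[4]
{
    return &grid;
}

/* void is usable behind "pointer to": a void * can carry any object address,
   but it must be converted back before it is dereferenced. */
int read_via_void(void *p)
{
    return *(int *)p;
}
```

In every case the fix is the same: insert a "pointer to" between the two derived types that C refuses to combine directly.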

Adding calling-convention types

On the Windows platform, it's common to decorate a function declaration with an indication of its calling convention. These tell the compiler which mechanism should be used to call the function in question, and the method used to call the function must be the same one which the function expects. They look like:

extern int __cdecl main(int argc, char **argv);

extern BOOL __stdcall DrvQueryDriverInfo(DWORD dwMode, PVOID pBuffer,
                              DWORD cbBuf, PDWORD pcbNeeded);

These decorations are very common in Win32 development, and are straightforward enough to understand. More information can be found in the Unixwiz.net Tech Tip "Using Win32 calling conventions".

Where it gets somewhat more tricky is when the calling convention must be incorporated into a pointer (including via a typedef), because the tag doesn't seem to fit into the normal scheme of things. These are often used (for instance) when dealing with the LoadLibrary() and GetProcAddress() API calls to call a function from a freshly-loaded DLL.

We commonly see this with typedefs:

typedef BOOL (__stdcall *PFNDRVQUERYDRIVERINFO)(
    DWORD   dwMode,
    PVOID   pBuffer,
    DWORD   cbBuf,
    PDWORD  pcbNeeded
    );

...

/* get the function address from the DLL */
pfnDrvQueryDriverInfo = (PFNDRVQUERYDRIVERINFO)
	GetProcAddress(hDll, "DrvQueryDriverInfo");

The calling convention is an attribute of the function, not of the pointer, so the usual reading places it after "pointer to", even though it is written inside the grouping parentheses:

BOOL (__stdcall *foo)(...);

is read as:

foo is a pointer
to a __stdcall function
returning BOOL.
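The typedef-and-cast pattern can be tried without a real DLL. The sketch below is not Win32 code: DEMO_CALL is a hypothetical macro that expands to __stdcall only when _WIN32 is defined, and demo_lookup is a made-up stand-in for GetProcAddress() that returns a generic function pointer the caller must cast.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: __stdcall exists only on Windows compilers, so the sketch
   hides it behind a macro that is empty everywhere else. */
#if defined(_WIN32)
#define DEMO_CALL __stdcall
#else
#define DEMO_CALL
#endif

/* PFNADD is a pointer to a DEMO_CALL function returning int */
typedef int (DEMO_CALL *PFNADD)(int a, int b);

/* Generic function pointer type, playing the role of the lookup result. */
typedef void (*GENERICPROC)(void);

static int DEMO_CALL demo_add(int a, int b)
{
    return a + b;
}

/* Hypothetical stand-in for GetProcAddress(): find a function by name. */
static GENERICPROC demo_lookup(const char *name)
{
    if (strcmp(name, "demo_add") == 0)
        return (GENERICPROC)demo_add;
    return NULL;
}

/* Look the function up, cast to the typed pointer, and call it.
   Returns -1 if the name is unknown. */
int call_by_name(const char *name, int a, int b)
{
    PFNADD pfn = (PFNADD)demo_lookup(name);
    return pfn ? pfn(a, b) : -1;
}
```

The cast target PFNADD carries the calling convention, so the call through pfn uses the convention the function was compiled with.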

source url:http://www.unixwiz.net/techtips/reading-cdecl.html
