demangle(3C++) demangle(3C++)
dem, demangle - demangle C++ external names to a readable format
#include <dem.h>
cc [flag ...] file ... -lmangle [library ...]
typedef struct DEMARG DEMARG;
typedef struct DEMCL DEMCL;
typedef struct DEM DEM;
int demangle(const char *in, char *out);
int dem(char *s, DEM *p, char *buf);
void dem_printcl(DEMCL *p, char *buf);
void dem_printarg(DEMARG *p, char *buf, int f);
void dem_printarglist(DEMARG *p, char *buf, int sv);
int dem_print(DEM *p, char *buf);
void dem_printfunc(DEM *dp, char *buf);
dem and demangle are interfaces that let user programs ``demangle'' the
mangled external names that C++ produces for functions, class members,
and so on.
A description of the C++ mangling scheme is provided on page 122 and
following of the Annotated C++ Reference Manual.
The simplest interface to the library is to call the demangle() function,
as follows:
int ret;
char inbuf[1024];       /* holds the mangled name */
char outbuf[MAXDBUF];   /* receives the demangled name */
if ((ret = demangle(inbuf, outbuf)) < 0) {
    /* error! */
}
The demangle() function returns 0 if it successfully demangles the
name. If demangling fails, it returns -1 and copies the input string
(inbuf) to the output buffer (outbuf).
To attain a finer level of control over the demangling operation, call
the dem() function as follows:
int ret;
char inbuf[1024];
DEM d;
char sbuf[MAXDBUF];
ret = dem(inbuf, &d, sbuf);
where inbuf is the input name, d is the data structure that dem() fills
in, and sbuf is a scratch buffer from which the demangler allocates the
parts of this data structure (d will contain pointers into sbuf).
Note that the first parameter to dem() is of type char *, not const char
*: a call to dem() may alter its input.
The constant MAXDBUF, defined in <dem.h>, is the maximum buffer size
required for a demangled name's data structure.
dem() returns -1 on error, otherwise 0.
The include file <dem.h> has comments describing each field in the data
structures. The data structures are somewhat complicated by the need to
handle nested types and function arguments which themselves are function
pointers with their own arguments.
To format this data structure in various ways, there are several
functions:
dem_print() formats a complete demangled name from the contents of the
DEM structure.
dem_printcl() formats just a class name.
dem_printfunc() formats just a function name.
dem_printarg() formats a single function argument.
dem_printarglist() formats a complete function argument list.
demangle(), dem() and dem_print() return 0 if they succeed, and -1 if the
input name is not a valid mangled name (or if there are any other error
conditions, like passing in invalid arguments).
The following example program reads mangled names from standard input,
one per line, and displays the class name for each. It prints "(none)"
on errors and for C functions and data.
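#include <stdio.h>
#include <string.h>
#include <dem.h>

int main(void)
{
    char sbuf[MAXDBUF];
    DEM d;
    int ret;
    char buf[1024];
    char buf2[1024];

    /* fgets() replaces the unsafe gets(); strip the trailing newline */
    while (fgets(buf, sizeof buf, stdin) != NULL) {
        buf[strcspn(buf, "\n")] = '\0';
        ret = dem(buf, &d, sbuf);
        if (ret || d.cl == NULL) {
            /* not a valid mangled name, or no class part */
            printf("%s --> (none)\n", buf);
        }
        else {
            dem_printcl(d.cl, buf2);
            printf("%s --> %s\n", buf, buf2);
        }
    }
    return 0;
}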
The demangler handles mangled class typenames, whether they are simple,
nested, or template classes. For example:
A__pt__2_i --> A<int>
__Q2_1A1B --> A::B
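As an illustration of the nested form only, the following is a minimal sketch that decodes names such as __Q2_1A1B by hand. It is not the library's implementation and does not use <dem.h>; the function name decode_nested() is hypothetical. It assumes the cfront encoding ("__Q", one digit giving the number of classes, "_", then length-prefixed class names) and deliberately ignores template names and the variant where a length precedes the "Q".

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of <dem.h>.
 * Decodes "__Q" digit "_" followed by length-prefixed class names
 * into "A::B::..." form.  Template names (__pt__) are not handled.
 * Returns 0 on success, -1 if the input does not match this form
 * or if out is too small.
 */
int decode_nested(const char *in, char *out, size_t outsz)
{
    size_t used = 0;
    int count, i;

    if (in == NULL || strncmp(in, "__Q", 3) != 0 ||
        !isdigit((unsigned char)in[3]) || in[4] != '_')
        return -1;
    count = in[3] - '0';
    if (count == 0)
        return -1;
    in += 5;

    for (i = 0; i < count; i++) {
        size_t len = 0;
        int ndigits = 0;

        /* length prefix: one to three decimal digits */
        while (ndigits < 3 && isdigit((unsigned char)*in)) {
            len = len * 10 + (size_t)(*in - '0');
            in++;
            ndigits++;
        }
        if (ndigits == 0 || len == 0 || strlen(in) < len)
            return -1;

        /* room for "::", the name, and the terminating NUL */
        if (used + (i ? 2 : 0) + len + 1 > outsz)
            return -1;
        if (i) {
            memcpy(out + used, "::", 2);
            used += 2;
        }
        memcpy(out + used, in, len);
        used += len;
        in += len;
    }
    if (*in != '\0')
        return -1;              /* trailing characters */
    out[used] = '\0';
    return 0;
}
```

For real programs, dem() followed by dem_printcl() remains the supported route; the sketch only makes the encoding concrete.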
The demangler also handles local variables of the form:
__nnnxxx
For example:
__2x --> x
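The local-variable form can be recognized with a few lines of C. The sketch below is illustrative only and is not taken from the library; local_name() is a hypothetical helper. It assumes the form is "__", one or more digits, then an ordinary identifier, which matches the __2x example.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of <dem.h>.
 * If s has the form "__" digits identifier, returns a pointer to the
 * identifier inside s; otherwise returns NULL.
 */
const char *local_name(const char *s)
{
    const char *p, *start;

    if (s == NULL || s[0] != '_' || s[1] != '_')
        return NULL;
    p = s + 2;
    if (!isdigit((unsigned char)*p))
        return NULL;            /* at least one digit is required */
    while (isdigit((unsigned char)*p))
        p++;

    /* the identifier must start with a letter or underscore */
    if (!isalpha((unsigned char)*p) && *p != '_')
        return NULL;
    start = p;

    /* and continue with letters, digits, or underscores only */
    for (; *p != '\0'; p++)
        if (!isalnum((unsigned char)*p) && *p != '_')
            return NULL;
    return start;
}
```

The helper returns a pointer into its argument rather than copying, so the result is valid only as long as the input string is.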
NOTES
1. "signed" and "volatile" encodings are not handled.
2. The encoding for nested classes as mentioned on page 123 of the ARM
is handled slightly differently in cfront; there is a "_" after the
digit after the "Q".
3. A nested class starting with "Q" sometimes has the length encoded
before it; the demangler handles either case.
4. The "Tnn" and "Nnnn" notations mentioned on page 124 are not fully
supported. It is assumed that the number of the designated argument is
less than or equal to 9. So if you have 11 or more arguments, and you
want to repeat argument 10 or greater, the demangler will reject the
encoded name.
5. All literal arguments to templates are assumed to be const. For
example, the non-const literal value "37" is encoded as "Ci".
6. Some compilers will add a gratuitous "_" before external names.
7. The grammar allows class names up to 999 characters. This is
considered important for handling templates.
GRAMMAR FOR EXTERNAL NAMES
start --> name
################# COMPLETE NAMES #################
name --> sti | std | ptbl | func | data | vtbl |
         cname3 | local
sti --> "__sti" "__" id
std --> "__std" "__" id
ptbl --> "__ptbl_vec" "__" id
func --> "__op" arg funcpost | id funcpost
funcpost --> "__" funcpost2 | "__" cname funcpost2
funcpost2 --> csv "F" arglist
csv --> "" | "C" | "S" | "V"
data --> id | id "__" cname
vtbl --> "__vtbl" "__" cname
local --> "__" num regid
################# CLASS NAMES #################
cname --> cname2 | nest
nest --> "Q" digit "_" cnamelist
cnamelist --> cname2 | cnamelist cname2
cname2 --> cnlen cnid
cname3 --> cnid | "__" nest
cnlen --> digit | digit digit | digit digit digit
cnid --> id | id "__pt__" cnlen "_" arglist
################# ARGUMENT LISTS #################
arglist --> arg | arglist arg
arg --> modlist arg2 | "X" modlist arg2 lit
modlist --> mod | modlist mod
mod --> "" | "U" | "C" | "V" | "S" | "P" | "R" |
        arr | mptr
arr --> "A" num "_"
mptr --> "M" cname
arg2 --> fund | cname | funcp | repeat1 | repeat2
fund --> "v" | "c" | "s" | "i" | "l" | "f" |
         "d" | "r" | "e"
funcp --> "F" arglist "_" arg
repeat1 --> "T" digit | "T" digit digit
repeat2 --> "N" digit digit | "N" digit digit digit
lit --> litnum | zero | litmptr | cnlen id | sptr
litnum --> "L" digit lnum | "L" digit digit "_" lnum
litmptr --> "LM" num "_" litnum "_" cnlen id
lnum --> num | "n" num
sptr --> cnlen id "__" cname
zero --> 0
################# LOW LEVEL STUFF #################
digit --> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
id --> special | regid
special --> "__ct" | "__pp" # etc.
regid --> letter | letter restid
restid --> letter | digit | restid letter |
           restid digit
letter --> "A"-"Z" | "a"-"z" | "_"
num --> digit | num digit
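To make the mod and fund productions concrete, here is a small sketch that decodes a single argument made only of the modifiers "P", "R", "C", and "U" followed by one fundamental type code. It is an illustration under stated assumptions, not the library's code: decode_simple_arg() and fund_name() are hypothetical names, the letter meanings follow the ARM encoding (for example "r" for long double and "e" for an ellipsis), and class names, arrays, member pointers, function pointers, and repeats are rejected.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Map a fundamental type code to its C++ spelling (ARM encoding). */
static const char *fund_name(char c)
{
    switch (c) {
    case 'v': return "void";
    case 'c': return "char";
    case 's': return "short";
    case 'i': return "int";
    case 'l': return "long";
    case 'f': return "float";
    case 'd': return "double";
    case 'r': return "long double";
    case 'e': return "...";
    default:  return NULL;
    }
}

/*
 * Hypothetical helper, not part of <dem.h>.
 * Decodes modifiers (P, R, C, U only) followed by one fundamental
 * type code.  Each modifier applies to everything to its right.
 * Returns 0 on success, -1 on unsupported input or a short buffer.
 */
int decode_simple_arg(const char *in, char *out, size_t outsz)
{
    char inner[128];
    const char *name;
    char last;
    int n;

    if (in == NULL || in[0] == '\0')
        return -1;

    name = fund_name(in[0]);
    if (name != NULL) {
        if (in[1] != '\0')
            return -1;          /* only a single argument is handled */
        n = snprintf(out, outsz, "%s", name);
        return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
    }

    if (strchr("PRCU", in[0]) == NULL)
        return -1;
    if (decode_simple_arg(in + 1, inner, sizeof inner) != 0)
        return -1;
    last = inner[strlen(inner) - 1];

    switch (in[0]) {
    case 'P':
    case 'R':
        /* append the declarator, with a space after a type name */
        n = snprintf(out, outsz, "%s%s%c", inner,
                     (last == '*' || last == '&') ? "" : " ",
                     in[0] == 'P' ? '*' : '&');
        break;
    case 'U':
        n = snprintf(out, outsz, "unsigned %s", inner);
        break;
    default:                    /* 'C' */
        if (last == '*' || last == '&')
            n = snprintf(out, outsz, "%sconst", inner);
        else
            n = snprintf(out, outsz, "const %s", inner);
        break;
    }
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Under these assumptions "PCc" decodes as a pointer to const char and "CPc" as a const pointer to char. The library's dem_printarg() is the function to use in practice; its exact output format is not specified here.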