Proof of concept: automatically import C header files

H. S. Teoh hsteoh at quickfur.ath.cx
Wed Jul 17 15:59:13 PDT 2013


On Wed, Jul 17, 2013 at 03:36:15PM -0700, Walter Bright wrote:
> On 7/17/2013 3:20 PM, H. S. Teoh wrote:
> >Though about trigraphs... I've to admit I've never actually seen
> >*real* C code that uses trigraphs, but yeah, needing to account for
> >them can significantly complicate your code.
> 
> Building a correct C front end is a known technology, doing a
> half-baked job isn't going to impress people.

IOW either you don't do it at all, or you have to go all the way and
implement a fully-functional C frontend?

If so, libclang is starting to sound rather attractive...


> >But as for preprocessor-specific stuff, couldn't we just pipe it
> >through a standalone C preprocessor and be done with it? It can't be
> >*that* hard, right?
> 
> You could, but then you are left with failing to recognize:
> 
>     #define FOO 3
> 
> and converting it to:
> 
>     enum FOO = 3;

Hmm. We *could* run a pass of our own ahead of the C preprocessor: pick
out the #define's we understand, record them, and suppress them from the
input to the C preprocessor. Something like this:

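	import std.regex : matchFirst, regex;

	bool isSimpleValue(string s) {
		// basically, return true if s is something compilable
		// when put on the right side of "enum x = ...".
	}

	// spawnCPreprocessor() and inputFile are placeholders here
	auto pipe = spawnCPreprocessor();
	string[string] manifestConstants;
	// byLine() strips the newline, so anchor the value at end of
	// line and allow (but don't require) trailing whitespace.
	auto defineRe = regex(`^\s*#define\s+(\w+)\s+(.*?)\s*$`);
	foreach (line; inputFile.byLine()) {
		if (auto m = matchFirst(line, defineRe))
		{
			// byLine() reuses its buffer, so copy the
			// captures before storing them.
			auto name = m[1].idup;
			auto value = m[2].idup;
			if (isSimpleValue(value)) {
				manifestConstants[name] = value;

				// Suppress the #define's that we picked out
				continue;
			}
			// whatever we don't understand, hand over to
			// the C preprocessor
		}
		pipe.writeln(line);
	}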
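
For illustration, here is one very conservative way isSimpleValue could
be filled in. This is only a sketch under my own assumptions: it accepts
a single unsuffixed integer, simple floating-point, character, or string
literal and rejects everything else, including expressions like
(1 << 4) that would in fact compile in D. The set of patterns is a
guess at a reasonable starting point, not a complete treatment of C
literals (no suffixes such as UL, no exponents, and C octal like 010 is
rejected because D does not accept it).

```d
import std.regex : matchFirst, regex;
import std.string : strip;

/// Hypothetical conservative check: true only if s is a single literal
/// that is spelled the same way in C and in D.
bool isSimpleValue(string s)
{
    // Built on every call for brevity; real code would cache it.
    auto literal = regex(
        `^(?:0[xX][0-9a-fA-F]+` ~    // hex integer
        `|[0-9]+\.[0-9]+` ~          // simple floating-point
        `|0|[1-9][0-9]*` ~           // decimal integer (no C octal)
        `|'(?:[^'\\]|\\.)'` ~        // character literal
        `|"(?:[^"\\]|\\.)*")$`);     // string literal
    return !matchFirst(s.strip, literal).empty;
}

unittest
{
    assert(isSimpleValue("3"));
    assert(isSimpleValue("0xFF"));
    assert(isSimpleValue(`"hello"`));
    assert(!isSimpleValue("010"));       // octal in C, invalid in D
    assert(!isSimpleValue("(FOO + 1)")); // left to the C preprocessor
}
```

Erring on the side of rejecting is the safe default here, since
anything rejected simply goes through the C preprocessor as before.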

Basically, whatever #define's we can understand, we handle, and anything
else we let the C preprocessor deal with. By suppressing the #define's
we've picked out, we force the C preprocessor to leave any reference to
them as unexpanded identifiers, so that later on we can just generate
the enums and the resulting code will match up correctly.
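
To make the "match up" part concrete, here is a small self-contained
sketch that runs the same split over an in-memory string instead of a
pipe, then renders the collected constants as enums. The names Split,
splitDefines and toEnums are made up for this example, and the
isSimpleValue here is a stand-in that accepts decimal integers only.

```d
import std.algorithm.sorting : sort;
import std.format : format;
import std.regex : matchFirst, regex;
import std.string : lineSplitter;

/// Hypothetical result type for the splitting pass.
struct Split
{
    string[string] manifestConstants; // name -> value text
    string[] forPreprocessor;         // lines still sent to the C preprocessor
}

/// Stand-in for isSimpleValue: decimal integers only.
bool isSimpleValue(string s)
{
    return !matchFirst(s, regex(`^[0-9]+$`)).empty;
}

/// Separate the #define's we understand from everything else.
Split splitDefines(string source)
{
    auto defineRe = regex(`^\s*#define\s+(\w+)\s+(.*?)\s*$`);
    Split result;
    foreach (line; source.lineSplitter)
    {
        auto m = matchFirst(line, defineRe);
        if (!m.empty && isSimpleValue(m[2]))
        {
            result.manifestConstants[m[1]] = m[2];
            continue; // suppressed: the preprocessor never sees it
        }
        result.forPreprocessor ~= line;
    }
    return result;
}

/// Render the collected constants as D manifest constants, sorted by name.
string[] toEnums(string[string] constants)
{
    auto names = constants.keys;
    sort(names);
    string[] decls;
    foreach (name; names)
        decls ~= format("enum %s = %s;", name, constants[name]);
    return decls;
}
```

Given the three lines "#define FOO 3", "#define BAR (FOO + 1)" and
"int a[FOO];", only the first is picked out. The other two go to the C
preprocessor, which has never seen a definition of FOO and so leaves
both references to it alone; the generated "enum FOO = 3;" then
supplies the value on the D side.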


T

-- 
Prosperity breeds contempt, and poverty breeds consent. -- Suck.com

