Have Your Efficiency, and Flexibility Too
Metaprogramming Techniques For No-Compromise Code
by Nick Sabalausky

Full source code for this article is available on GitHub, or can be downloaded here.
Metaprogramming Plus: The Flexibility Enhancements

If you recall from earlier, Flexibility was concerned that the metaprogramming approach seemed to prevent complex configurability. He didn't think he could use complex logic to decide what types of Gizmos needed to be made and how many. The problem is that Gizmo's settings are specified at compile-time, but the logic to determine the configuration may need to happen at runtime. Dr. Metaprogramming knew that could be worked around and promised to show various methods of handling this. These methods will be demonstrated by making two basic changes to the existing metaprogramming example:
Naturally, if you want to compare the time and memory usage with all the previous versions, then these values should be set to bigPort = 5, extrasNumPorts = 2 and extrasIsSpinnable = true.

Note that none of these require any changes to the Gizmo type itself. Only the code in UltraGiz and main() is affected. That is to say, the only changes are in setting up and using the same Gizmo types that we've already written.

Method #1: Compile-Time Function Execution

Frequently abbreviated as CTFE, this method can't be done in all languages. And in a language that does support it (like D) it can be the least powerful method. But it's the simplest and easiest, and is perfectly sufficient in many situations.

All that needs to be done is assign the return value of a function to a compile-time value. The compiler will execute the function itself (if it can) instead of waiting until runtime. Simple. Changes from ex4_metaprogramming.d are highlighted:

From ex6_meta_flex1_ctfe.d:
struct UltraGiz
{
template gizmos(int numPorts, bool isSpinnable)
{
Gizmo!(numPorts, isSpinnable)[] gizmos;
}
int numTimesUsedSpinny;
int numTimesUsedTwoPort;
void useGizmo(T)(ref T gizmo)
{
gizmo.doStuff();
gizmo.spin();
if(gizmo.isSpinnable)
numTimesUsedSpinny++;
if(gizmo.numPorts == 2)
numTimesUsedTwoPort++;
}
static int generateBigPorts()
{
// Big fancy computation to determine number of ports
int num=0;
for(int i=0; i<10; i++)
{
if(i >= 5)
num++;
}
return num; // Ultimately, the result is 5
}
static int generateExtrasNumPorts(int input)
{
return input - 3;
}
static bool generateExtrasIsSpinnable(int input=9)
{
if(input == 0)
return false;
return !generateExtrasIsSpinnable(input-1);
}
static immutable bigPort = generateBigPorts();
static immutable extrasNumPorts = generateExtrasNumPorts(bigPort);
static immutable extrasIsSpinnable = generateExtrasIsSpinnable();
void run()
{
StopWatch stopWatch;
stopWatch.start();
// Create gizmos
gizmos!(1, false).length = 10_000;
gizmos!(1, true ).length = 10_000;
gizmos!(2, false).length = 10_000;
// Use extrasNumPorts and extrasIsSpinnable
// so 8,000 more of these will be made down below.
gizmos!(2, true ).length = 2_000;
gizmos!(bigPort, false).length = 5_000;
gizmos!(bigPort, true ).length = 5_000;
// Add in the extra Gizmos
gizmos!(extrasNumPorts, extrasIsSpinnable).length += 8_000;
// Use gizmos
foreach(i; 0..10_000)
{
foreach(ref gizmo; gizmos!(1, false)) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(1, true )) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(2, false)) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(2, true )) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(bigPort, false)) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(bigPort, true )) useGizmo(gizmo);
}
writeln(stopWatch.peek.msecs, "ms");
}
}
void main()
{
UltraGiz ultra;
ultra.run();
// Compile time error: A portless Gizmo is useless!
//auto g = Gizmo!(0, true);
}
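
To isolate the CTFE idea from the rest of the Gizmo listing, here is a minimal sketch. The function name countUpperHalf and the array ports are made up for illustration and are not part of the article's source. The point is only that initializing an enum (or a static immutable, as above) forces the compiler to run the function itself, and that the result can then be used anywhere a compile-time constant is required.

```d
import std.stdio;

// Hypothetical stand-in for a "big fancy computation".
int countUpperHalf(int limit)
{
    int num = 0;
    foreach (i; 0 .. limit)
    {
        if (i >= limit / 2)
            num++;
    }
    return num;
}

// An enum initializer must be known at compile time,
// so the compiler has to execute countUpperHalf itself.
enum bigPort = countUpperHalf(10);

// This check is performed during compilation, not at runtime.
static assert(bigPort == 5);

// Because bigPort is a compile-time constant, it can size a static
// array or be passed as a template argument.
int[bigPort] ports;

void main()
{
    // The very same function is still callable at runtime.
    writeln(bigPort, " ", countUpperHalf(20));
}
```

If the function did something the compiler cannot evaluate (file I/O, for instance), the enum initialization would be rejected at compile time rather than silently deferred to runtime.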
Method #2: Compiling at Runtime

On the downside, this method takes extra time (potentially very noticeable) whenever a setting needs to be changed. That may or may not be a problem depending on the nature of the program and the setting. Also, you'll need to distribute your configurable source code along with your program. Finally, this method requires either:
Note that rules out using this method for most embedded targets.

So ok, maybe this doesn't sound very good so far. However, this method is extremely powerful, viable for a wide variety of languages, and only requires very simple changes to the code being configured. Additionally, the compiler requirement may not be as much of a problem as it may seem if you have permission to redistribute the compiler, or if you're targeting Linux (which generally has pretty good package management and dependency tools), or if your program is only intended for private use.

The trick here is to generate a small amount of source code at runtime, recompile your program, and then run the result.

For simplicity, this example will use a separate "frontend" program that will configure, compile and run the real example program. But you could also have it all in one program: After your program issues the command to recompile itself, it would then relaunch itself (possibly saving and restoring any important state in the process) much like auto-updating programs that download and launch newer versions of themselves would do. Or, you could keep the configurable routines in a DLL or .so, unload the DLL or .so, recompile it, and then reload it.

From ex6_meta_flex2_frontend.d, the frontend program:
import std.conv;
import std.file;
import std.process;
import std.stdio;
void main(string[] args)
{
immutable configFile = "ex6_meta_flex2_config.d";
immutable mainProgram = "ex6_meta_flex2_compilingAtRuntime";
immutable mainProgramSrc = "ex6_meta_flex2_compilingAtRuntime.d";
version(Windows)
immutable exeSuffix = ".exe";
else
immutable exeSuffix = "";
// Number of ports on each of the many-port Gizmos.
// Normally 5
int bigPort;
// 8,000 extra Gizmos will be created with
// this many ports and this spinnability.
// Normally 2-port spinnable
int extrasNumPorts;
bool extrasIsSpinnable;
try
{
bigPort = to!int (args[1]);
extrasNumPorts = to!int (args[2]);
extrasIsSpinnable = to!bool(args[3]);
}
catch(Throwable e)
{
writeln("Usage:");
writeln(" ex6_meta_flex2_frontend "~
"{bigPort} {extrasNumPorts} {extrasIsSpinnable}");
writeln("Example: ex6_meta_flex2_frontend 5 2 true");
return;
}
// This is the content of the "ex6_meta_flex2_config.d" file to be generated.
auto configContent = `
immutable conf_bigPort = `~to!string(bigPort)~`;
immutable conf_extrasNumPorts = `~to!string(extrasNumPorts)~`;
immutable conf_extrasIsSpinnable = `~to!string(extrasIsSpinnable)~`;
`;
// Load old configuration
writefln("Checking \t%s...", configFile);
string oldContent;
if(exists(configFile))
oldContent = cast(string)std.file.read(configFile);
// Did the configuration change?
bool configChanged = false;
if(configContent != oldContent)
{
writefln("Saving \t%s...", configFile);
std.file.write(configFile, configContent);
configChanged = true;
}
// Need to recompile?
if(configChanged || !exists(mainProgram~exeSuffix))
{
writefln("Compiling \t%s...", mainProgramSrc);
system("dmd "~mainProgramSrc~" -release -inline -O -J.");
}
// Run the main program
writefln("Running \t%s...", mainProgram);
version(Windows)
system(mainProgram);
else
system("./"~mainProgram);
}

And the main program, with changes from ex4_metaprogramming.d highlighted:

From ex6_meta_flex2_compilingAtRuntime.d, the main program:
struct UltraGiz(int bigPort, int extrasNumPorts, bool extrasIsSpinnable)
{
template gizmos(int numPorts, bool isSpinnable)
{
Gizmo!(numPorts, isSpinnable)[] gizmos;
}
int numTimesUsedSpinny;
int numTimesUsedTwoPort;
void useGizmo(T)(ref T gizmo)
{
gizmo.doStuff();
gizmo.spin();
if(gizmo.isSpinnable)
numTimesUsedSpinny++;
if(gizmo.numPorts == 2)
numTimesUsedTwoPort++;
}
void run()
{
StopWatch stopWatch;
stopWatch.start();
// Create gizmos
gizmos!(1, false).length = 10_000;
gizmos!(1, true ).length = 10_000;
gizmos!(2, false).length = 10_000;
// Use the template parameters extrasNumPorts and extrasIsSpinnable
// so 8,000 more of these will be made down below.
gizmos!(2, true ).length = 2_000;
gizmos!(bigPort, false).length = 5_000;
gizmos!(bigPort, true ).length = 5_000;
// Add in the extra Gizmos
gizmos!(extrasNumPorts, extrasIsSpinnable).length += 8_000;
// Use gizmos
foreach(i; 0..10_000)
{
foreach(ref gizmo; gizmos!(1, false)) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(1, true )) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(2, false)) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(2, true )) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(bigPort, false)) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(bigPort, true )) useGizmo(gizmo);
}
writeln(stopWatch.peek.msecs, "ms");
}
}
void main()
{
mixin(import("ex6_meta_flex2_config.d"));
UltraGiz!(conf_bigPort, conf_extrasNumPorts, conf_extrasIsSpinnable) ultra;
ultra.run();
// Compile time error: A portless Gizmo is useless!
//auto g = Gizmo!(0, true);
}
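
The key line in that listing is mixin(import("ex6_meta_flex2_config.d")), which reads the generated file at compile time (hence the -J. switch passed to dmd) and pastes its text into the program as code. The sketch below shows the same mechanism in a self-contained form by substituting a string literal for the imported file. The Settings type and its describe() function are hypothetical stand-ins for UltraGiz, not part of the article's source.

```d
import std.conv : text;
import std.stdio;

// Stand-in for the text that import("ex6_meta_flex2_config.d") would
// return. In the real example the frontend program writes this file.
enum configText = `
    immutable conf_bigPort           = 5;
    immutable conf_extrasNumPorts    = 2;
    immutable conf_extrasIsSpinnable = true;
`;

// Hypothetical type that needs its settings at compile time.
struct Settings(int bigPort, int extrasNumPorts, bool extrasIsSpinnable)
{
    static string describe()
    {
        return text(bigPort, " ", extrasNumPorts, " ", extrasIsSpinnable);
    }
}

// The string mixin turns the text into ordinary declarations.
mixin(configText);

void main()
{
    // The mixed-in constants are usable as template arguments.
    Settings!(conf_bigPort, conf_extrasNumPorts, conf_extrasIsSpinnable) s;
    writeln(s.describe());
}
```

Swapping the literal for import("...") changes only where the text comes from, which is why the main program needs so few edits to become configurable this way.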
Method #3: Convert a Runtime Value to Compile-Time

Yes, you read that right. Though it may sound bizarre, like it would require time-travel, it is possible to convert a runtime value to compile-time. Although, it does have some restrictions:
What essentially happens is you take all the compile-time code paths you may want to trigger at runtime, and you trigger all of them at compile-time. Each one of them will produce a result that can be accessed at runtime. Then, at runtime, you just "choose your effect".

If you don't understand that, don't worry. It's really a much simpler, more obvious concept than it sounds. Here's a simple example:

From example_runtimeToCompileTime.d:
import std.conv;
import std.stdio;
// Remember, this is a completely different type
// for every value of compileTimeValue.
class Foo(int compileTimeValue)
{
static immutable theCompileTimeValue = compileTimeValue;
static int count = 0;
this()
{
count++;
}
static void display()
{
writefln("Foo!(%s).count == %s", theCompileTimeValue, count);
}
}
void main(string[] args)
{
foreach(arg; args[1..$])
{
int runtimeValue = to!int(arg);
// Dispatch runtime value to compile-time
switch(runtimeValue)
{
// Note:
// case {runtime value}: new Foo!{equivalent compile time value}();
case 0: new Foo!0(); break;
case 1: new Foo!1(); break;
case 2: new Foo!2(); break;
case 3: new Foo!3(); break;
case 10: new Foo!10(); break;
case 99: new Foo!99(); break;
default:
throw new Exception(text("Value ",runtimeValue," not supported."));
}
}
Foo!( 0).display();
Foo!( 1).display();
Foo!( 2).display();
Foo!( 3).display();
Foo!(10).display();
Foo!(99).display();
}

Of course, given the repetition in there, metaprogramming can be used to automatically generate the code to handle large numbers of possible values. Or even the entire range of certain types, such as enum, bool, byte or maybe even a 16-bit value. A 32-bit value would be unrealistic on modern hardware, though. And arbitrary strings would be out of the question unless you limited them to a predetermined set of strings, or to a couple bytes in length (or more if you limited the allowable characters). But even with these limitations, this can still be a useful technique.
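
As one possible way to generate that repetitive switch automatically, here is a minimal sketch. It assumes a D compiler recent enough to support static foreach, which the article's own listings do not rely on. The names SupportedValues, timesTen and dispatch are hypothetical and chosen only for illustration.

```d
import std.conv : text, to;
import std.meta : AliasSeq;
import std.stdio;

// The supported values, listed exactly once.
alias SupportedValues = AliasSeq!(0, 1, 2, 3, 10, 99);

// Hypothetical template whose behavior depends on a compile-time value.
int timesTen(int compileTimeValue)()
{
    return compileTimeValue * 10;
}

// Takes a runtime value and calls the matching template instantiation.
int dispatch(int runtimeValue)
{
    switch (runtimeValue)
    {
        // static foreach stamps out one case per supported value.
        static foreach (value; SupportedValues)
        {
            case value:
                return timesTen!value();
        }

        default:
            throw new Exception(text("Value ", runtimeValue, " not supported."));
    }
}

void main(string[] args)
{
    foreach (arg; args[1 .. $])
        writeln(dispatch(to!int(arg)));
}
```

Adding a supported value now means editing only the SupportedValues list. Every listed value still causes its own template instantiation to be compiled, which is why very large ranges remain impractical.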
In any case, the fact remains: With certain restrictions, it is possible to convert a runtime value into a compile-time value. Here's how it can be applied to our Gizmo example (as usual, changes from ex4_metaprogramming.d are highlighted):

From ex6_meta_flex3_runtimeToCompileTime1.d:
struct UltraGiz
{
template gizmos(int numPorts, bool isSpinnable)
{
Gizmo!(numPorts, isSpinnable)[] gizmos;
}
int numTimesUsedSpinny;
int numTimesUsedTwoPort;
void useGizmo(T)(ref T gizmo)
{
gizmo.doStuff();
gizmo.spin();
if(gizmo.isSpinnable)
numTimesUsedSpinny++;
if(gizmo.numPorts == 2)
numTimesUsedTwoPort++;
}
// Note this is templated
void addGizmosTo(int numPorts, bool isSpinnable)(int numGizmos)
{
gizmos!(numPorts, isSpinnable).length += numGizmos;
}
void addGizmos(int numPorts, bool isSpinnable, int numGizmos)
{
// Dispatch to correct version of addGizmosTo.
// Effectively converts a runtime value to compile-time.
if(numPorts == 1)
{
if(isSpinnable)
addGizmosTo!(1, true )(numGizmos);
else
addGizmosTo!(1, false)(numGizmos);
}
else if(numPorts == 2)
{
if(isSpinnable)
addGizmosTo!(2, true )(numGizmos);
else
addGizmosTo!(2, false)(numGizmos);
}
else if(numPorts == 3)
{
if(isSpinnable)
addGizmosTo!(3, true )(numGizmos);
else
addGizmosTo!(3, false)(numGizmos);
}
else if(numPorts == 5)
{
if(isSpinnable)
addGizmosTo!(5, true )(numGizmos);
else
addGizmosTo!(5, false)(numGizmos);
}
else if(numPorts == 10)
{
if(isSpinnable)
addGizmosTo!(10, true )(numGizmos);
else
addGizmosTo!(10, false)(numGizmos);
}
else
throw new Exception(to!string(numPorts)~"-port Gizmo not supported.");
}
void run(int bigPort)(int extrasNumPorts, bool extrasIsSpinnable)
{
StopWatch stopWatch;
stopWatch.start();
// Create gizmos
gizmos!(1, false).length = 10_000;
gizmos!(1, true ).length = 10_000;
gizmos!(2, false).length = 10_000;
// Use the commandline parameters extrasNumPorts and extrasIsSpinnable
// so 8,000 more of these will be made down below.
gizmos!(2, true ).length = 2_000;
gizmos!(bigPort, false).length = 5_000;
gizmos!(bigPort, true ).length = 5_000;
// Add in the extra Gizmos
addGizmos(extrasNumPorts, extrasIsSpinnable, 8_000);
// Use gizmos
foreach(i; 0..10_000)
{
foreach(ref gizmo; gizmos!(1, false)) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(1, true )) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(2, false)) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(2, true )) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(bigPort, false)) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(bigPort, true )) useGizmo(gizmo);
}
writeln(stopWatch.peek.msecs, "ms");
}
}
void main(string[] args)
{
// Number of ports on each of the many-port Gizmos.
// Normally 5
int bigPort;
// 8,000 extra Gizmos will be created with
// this many ports and this spinnability.
// Normally 2-port spinnable
int extrasNumPorts;
bool extrasIsSpinnable;
try
{
bigPort = to!int (args[1]);
extrasNumPorts = to!int (args[2]);
extrasIsSpinnable = to!bool(args[3]);
if(bigPort != 3 && bigPort != 5 && bigPort != 10)
throw new Exception("Invalid choice for bigPort");
}
catch(Throwable e)
{
writeln("Usage:");
writeln(" ex6_meta_flex3_runtimeToCompileTime1 "~
"{bigPort} {extrasNumPorts} {extrasIsSpinnable}");
writeln("bigPort must be 3, 5 or 10");
writeln("Example: ex6_meta_flex3_runtimeToCompileTime1 5 2 true");
return;
}
UltraGiz ultra;
// Dispatch to correct version of UltraGiz.run.
// Effectively converts a runtime value to compile-time.
if(bigPort == 3)
ultra.run!3(extrasNumPorts, extrasIsSpinnable);
else if(bigPort == 5)
ultra.run!5(extrasNumPorts, extrasIsSpinnable);
else if(bigPort == 10)
ultra.run!10(extrasNumPorts, extrasIsSpinnable);
// Compile time error: A portless Gizmo is useless!
//auto g = Gizmo!(0, true);
}
That will work, but there are two potential problems with it.

The first problem is that it involves extra runtime code. That could cut into, or possibly even eliminate, the efficiency savings from metaprogramming. However, the extra runtime code is only run once when setting up the Gizmos, not while the Gizmos are actually being used. So as long as the Gizmo usage is enough to overshadow the extra overhead, it should still be worth it.

The second problem is that the addGizmos() function is an incredibly repetitive mess of copy-pasted code. It's a total violation of DRY: Don't Repeat Yourself. Maintaining that function would be very error-prone. Fortunately, that's easily fixed with a preprocessor, macros, or in D's case, string mixins:

From ex6_meta_flex3_runtimeToCompileTime2.d:
void addGizmos(int numPorts, bool isSpinnable, int numGizmos)
{
// Dispatch to correct version of addGizmosTo.
// Effectively converts a runtime value to compile-time.
string dispatch(int[] numPortsArray)
{
auto str = "";
foreach(numPorts; numPortsArray)
{
auto numPortsStr = to!string(numPorts);
str ~= `
if(numPorts == `~numPortsStr~`)
{
if(isSpinnable)
addGizmosTo!(`~numPortsStr~`, true )(numGizmos);
else
addGizmosTo!(`~numPortsStr~`, false)(numGizmos);
}
else
`;
}
str ~=
`throw new Exception(
to!string(numPorts)~"-port Gizmo not supported."
);`;
return str;
}
mixin(dispatch( [1, 2, 3, 5, 10] ));
}
If you wish, you can see the generated code by simply outputting the result of dispatch:

From snippet_outputGeneratedCode.d:
// In the UltraGiz.addGizmos() function of
// 'ex6_meta_flex3_runtimeToCompileTime2.d',
// see the generated code by replacing this:
mixin(dispatch( [1, 2, 3, 5, 10] ));
// With this:
immutable code = dispatch( [1, 2, 3, 5, 10] );
pragma(msg, "code:\n"~code); // Displayed at compile-time
mixin(code);

Method #4: Dynamic Fallback

Just like the town elder who made the handcrafted version, we can fall back on a dynamic version that uses runtime options instead of compile-time options.

This is easier and more flexible than the previous method. In fact, method #1, compile-time function execution, is probably the only method that's easier than this, but this one is more powerful and supported by more languages. So this is a pretty good option.

However, the downside is this would naturally be the least efficient of all the methods, since some of the Gizmos would forgo the metaprogramming benefits. But as long as you don't need runtime configurability for all your Gizmos, then you can still get a net savings over the original non-metaprogramming version.

To do this, we'll use the same metaprogramming Gizmo we've been using for all the other methods in this section. But we'll also add a DynamicGizmo which is identical to the original Gizmo in ex1_original.d, just with a different name. Then, the UltraGiz will look like this (as usual, changes from ex4_metaprogramming.d are highlighted):

From ex6_meta_flex4_dynamicFallback1.d:
struct UltraGiz
{
template gizmos(T)
{
T[] gizmos;
}
// Shortcut for non-dynamic gizmos, so we can still say:
// gizmos!(2, true)
// instead of needing to use the more verbose:
// gizmos!( Gizmo!(2, true) )
template gizmos(int numPorts, bool isSpinnable)
{
alias gizmos!( Gizmo!(numPorts, isSpinnable) ) gizmos;
}
int numTimesUsedSpinny;
int numTimesUsedTwoPort;
void useGizmo(T)(ref T gizmo)
{
gizmo.doStuff();
gizmo.spin();
if(gizmo.isSpinnable)
numTimesUsedSpinny++;
if(gizmo.numPorts == 2)
numTimesUsedTwoPort++;
}
void run(int bigPort, int extrasNumPorts, bool extrasIsSpinnable)
{
StopWatch stopWatch;
stopWatch.start();
// Create gizmos
gizmos!(1, false).length = 10_000;
gizmos!(1, true ).length = 10_000;
gizmos!(2, false).length = 10_000;
// Use the commandline parameters extrasNumPorts and extrasIsSpinnable
// so 8,000 more of these will be made down below as dynamic gizmos.
gizmos!(2, true ).length = 2_000;
gizmos!(DynamicGizmo).length = 18_000;
foreach(i; 0..5_000)
gizmos!(DynamicGizmo)[i] = DynamicGizmo(bigPort, false);
foreach(i; 5_000..10_000)
gizmos!(DynamicGizmo)[i] = DynamicGizmo(bigPort, true);
foreach(i; 10_000..18_000)
gizmos!(DynamicGizmo)[i] = DynamicGizmo(extrasNumPorts, extrasIsSpinnable);
// Use gizmos
foreach(i; 0..10_000)
{
foreach(ref gizmo; gizmos!(1, false)) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(1, true )) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(2, false)) useGizmo(gizmo);
foreach(ref gizmo; gizmos!(2, true )) useGizmo(gizmo);
foreach(ref gizmo; gizmos!DynamicGizmo) useGizmo(gizmo);
}
writeln(stopWatch.peek.msecs, "ms");
}
}
void main(string[] args)
{
// Number of ports on each of the many-port Gizmos.
// Normally 5
int bigPort;
// 8,000 extra Gizmos will be created with
// this many ports and this spinnability.
// Normally 2-port spinnable
int extrasNumPorts;
bool extrasIsSpinnable;
try
{
bigPort = to!int (args[1]);
extrasNumPorts = to!int (args[2]);
extrasIsSpinnable = to!bool(args[3]);
}
catch(Throwable e)
{
writeln("Usage:");
writeln(" ex6_meta_flex4_dynamicFallback1 "~
"{bigPort} {extrasNumPorts} {extrasIsSpinnable}");
writeln("Example: ex6_meta_flex4_dynamicFallback1 5 2 true");
return;
}
UltraGiz ultra;
ultra.run(bigPort, extrasNumPorts, extrasIsSpinnable);
// Compile time error: A portless Gizmo is useless!
//auto g1 = Gizmo!(0, true);
// Runtime error: A portless Gizmo is useless!
//auto g2 = DynamicGizmo(0, true);
}
The original Gizmo with the runtime options, i.e. DynamicGizmo, is used where necessary, while the more common cases are optimized with metaprogramming techniques. Not a bad compromise.

As you can see in the full code listing for ex6_meta_flex4_dynamicFallback1.d, I opted to make a completely separate definition for the dynamic version of Gizmo; that is, the DynamicGizmo. It would have also been possible to use a single definition for both the metaprogramming Gizmo and the DynamicGizmo. To do that, you'd just need to add another compile-time parameter, say bool dynamicGizmo, to go along with numPorts and isSpinnable. Doing so would probably be a good idea if only part of your struct is affected by the change from runtime options to compile-time options. But with Gizmo, the metaprogramming version converted practically everything to compile-time options, so in this case it was a little cleaner to just leave DynamicGizmo defined separately.

One other notable change I made was to the gizmos template (i.e., the arrays that had been named gizmosA, gizmosB, etc. in the earlier handcrafted version). In all the other metaprogramming versions, gizmos had been templated on number of ports and spinnability. That worked fine, but now we have DynamicGizmo which doesn't really fit into that. So now gizmos is templated on the Gizmo's type so the dynamic Gizmos can be accessed with gizmos!(DynamicGizmo). Unfortunately, that also means the nice simple:

gizmos!(2, true)

Becomes the ugly:

gizmos!( Gizmo!(2, true) )

So I created an overload of the gizmos template which maps the nice simple old syntax to the new one.
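
For the single-definition alternative mentioned above, a rough sketch might look like the following. This is not the article's Gizmo: the parameter names isDynamic, ctNumPorts and ctIsSpinnable and the spinCost() function are assumptions made up for illustration. The idea is that a static if selects between runtime fields and compile-time constants, while the shared code reads numPorts and isSpinnable the same way in both modes.

```d
import std.stdio;

// Hypothetical merged Gizmo. When isDynamic is true the settings are
// ordinary fields; otherwise they are compile-time constants.
struct Gizmo(bool isDynamic, int ctNumPorts = 1, bool ctIsSpinnable = false)
{
    static if (isDynamic)
    {
        int numPorts;
        bool isSpinnable;

        this(int numPorts, bool isSpinnable)
        {
            // Runtime check, since the value is only known at runtime.
            if (numPorts < 1)
                throw new Exception("A portless Gizmo is useless!");
            this.numPorts = numPorts;
            this.isSpinnable = isSpinnable;
        }
    }
    else
    {
        // Compile-time check, since the value is known at compile time.
        static assert(ctNumPorts >= 1, "A portless Gizmo is useless!");
        enum numPorts = ctNumPorts;
        enum isSpinnable = ctIsSpinnable;
    }

    // Shared code: identical source for both modes.
    int spinCost() const
    {
        return isSpinnable ? numPorts * 2 : 0;
    }
}

alias DynamicGizmo = Gizmo!true;

void main()
{
    Gizmo!(false, 2, true) fixedGizmo;
    auto dynamicGizmo = DynamicGizmo(5, true);
    writeln(fixedGizmo.spinCost(), " ", dynamicGizmo.spinCost());
}
```

Whether this is cleaner than two separate definitions depends on how much code the two modes can actually share, which is the trade-off described above.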
As an extra benefit, templating gizmos on type makes it easy to clean up all those repetitive foreach statements in UltraGiz.run():

From ex6_meta_flex4_dynamicFallback2.d:
void run(int bigPort, int extrasNumPorts, bool extrasIsSpinnable)
{
StopWatch stopWatch;
stopWatch.start();
// Create gizmos
gizmos!(1, false).length = 10_000;
gizmos!(1, true ).length = 10_000;
gizmos!(2, false).length = 10_000;
// Use the commandline parameters extrasNumPorts and extrasIsSpinnable
// so 8,000 more of these will be made down below as dynamic gizmos.
gizmos!(2, true ).length = 2_000;
gizmos!(DynamicGizmo).length = 18_000;
foreach(i; 0..5_000)
gizmos!(DynamicGizmo)[i] = DynamicGizmo(bigPort, false);
foreach(i; 5_000..10_000)
gizmos!(DynamicGizmo)[i] = DynamicGizmo(bigPort, true);
foreach(i; 10_000..18_000)
gizmos!(DynamicGizmo)[i] = DynamicGizmo(extrasNumPorts, extrasIsSpinnable);
// Use gizmos
foreach(i; 0..10_000)
{
// Think of this as an array of types:
alias TypeTuple!(
Gizmo!(1, false),
Gizmo!(1, true ),
Gizmo!(2, false),
Gizmo!(2, true ),
DynamicGizmo,
) AllGizmoTypes;
foreach(T; AllGizmoTypes)
foreach(ref gizmo; gizmos!T)
useGizmo(gizmo);
}
writeln(stopWatch.peek.msecs, "ms");
}
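
The foreach over AllGizmoTypes above is not an ordinary runtime loop: the compiler unrolls it, compiling the body once per type. The following self-contained sketch illustrates that behavior with made-up types. SmallGizmo, LargeGizmo, store and totalPorts are hypothetical names, and std.meta.AliasSeq is used here as the newer Phobos name for what the listing calls TypeTuple.

```d
import std.meta : AliasSeq;
import std.stdio;

// Hypothetical stand-ins for different Gizmo types.
struct SmallGizmo { enum numPorts = 1; }
struct LargeGizmo { enum numPorts = 5; }

// One array per type, in the spirit of the gizmos template.
template store(T)
{
    T[] store;
}

alias AllTypes = AliasSeq!(SmallGizmo, LargeGizmo);

int totalPorts()
{
    int total = 0;
    // Unrolled at compile time: the body is compiled once per type,
    // so T is a concrete type in each copy.
    foreach (T; AllTypes)
        foreach (ref item; store!T)
            total += item.numPorts;
    return total;
}

void main()
{
    store!SmallGizmo.length = 3;
    store!LargeGizmo.length = 2;
    writeln(totalPorts());
}
```

Because each copy of the body sees a concrete T, item.numPorts resolves to a compile-time constant in every copy, just as gizmo.numPorts does inside useGizmo.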
Next: The Last Remaining Elephant In The Room: Runtime Conversion