Scriptlike: Shell Scripting in D: Annoyances and a Library Solution

Due to its power and general-purpose intent, D has great potential as a faster, easier (in terms of flow-control), and cross-platform alternative to shell scripting. But after quite a bit of experience using it as such, I've noticed several annoyances that consistently slow me down and over-complicate my D scripts:

  • Managing paths as stings is messy and error-prone: There are a lot of fantastic tools in std.path to deal with paths, but there are still annoyances:

    • No static compile-time checks to keep non-path strings out of an API that requires a path.

    • Extra naming conventions are sometimes needed, to help distinguish paths from other stings. Not a major problem, but it's still one more thing to keep consistent, and one more thing making your variable names longer.

    • Extra care required to make sure directories are properly separated by slashes: The buildPath function can help with this, but it's easy to forget and often very tempting to omit.

    • Slashes vs backslashes: Posix uses forward slashes. Windows almost always accepts forward or backslashes interchangeably, but there are a few occasional exceptions. Plus, displaying forward-slashed paths in Windows is bad style.

      D's recommended style is to always use the platform-native slash in all your internal variables, but this can complicate algorithms and make it harder to guarantee your software properly accepts both versions on any OS (which is good style). It also leads to awkward, verbose, easy-to-forget-or-deliberately-stop-caring ugliness like writing "some/big/path" as buildPath("some", "big", "path") or as "some"~dirSeparator~"big"~dirSeparator~"path".

      Or you can use forward-slash-only internally (like I do) and only convert to backslashes as needed. But then you have to obsessively sanitize all path inputs, and many of std.path's outputs. Either way, it's not exactly trivial script-like coding.

    • Trailing slashes? You can ignore the matter of whether trailing slashes exist by relying on the error-prone convention of always using buildPath, even in trivial cases where it verbosifies code and tempts you to omit it (and accept buildPath's unwillingness you let you use "all paths are internally forward-slashed" convention to not complicate your internal logic with "forward and/or back" concerns). Or you can use an "always trailing slash" or "never trailing slash" convention, try to remember which one you chose, and obsessively sanitize all function inputs and external return values while maybe avoiding ugly double-slashes and definitely taking care not to accidentally convert null paths into root directories. Neither options are particularly appealing for shell-like scripts.

  • Invoking a tool that's in the current directory is platform-specific: On Posix, you must prefix `./`. On Windows you must not prefix `./` (but `.\` will work and so will omitting the path). So more buildPath and dirSeparator, just to invoke a tool in the current directory! Whee! Obviously more the fault of the system's command interpreters than D, but still an annoyance that could be improved.

  • Invoking a command shell-style: D's std.process was overhauled a few versions ago and, for the most part, it's fantastic. Easy to capture the child process's output if you want, and any other way you might want to handle the child's standard pipes is fully configurable, as well as invoking the process synchronously or asynchronously.

    But want the standard shell-script style of "synchronously run a shell command and automatically forward all the child's std pipes through the parent's"? You can do it, but for something so common, it's not very obvious how. Examining the documentation, the correct way is supposed to be spawnShell(cmd).wait();. But currently, that fails on Windows when the command's path has spaces, even when the path is properly quoted. Until that's fixed (it's actually fixed in master as of two days ago and should appear in v2.066), the only real solution, AFAICT, is to use the old system(cmd), even though it's "scheduled for deprecation".

  • Proper quoting: std.process has convenient tools for properly quoting commands and arguments. Unfortunately, it's easy to forget to call these functions whenever appropriate, which leads to easy "fails on spaces" bugs. Additionally, on Windows the quoting is done in an oddball way that's rather ugly when echoed (for tracing) and doesn't even work for command names given to system, which as explained above, is necessary for invoking commands shell-style.

  • Command tracing: Optionally echiong all the commands your script runs can be helpful, but it means writing a wrapper to use for all your std.process calls. Not a big problem, but it's one more thing to do and one more thing to clutter your script.

  • Managing imports: Adding proper imports for every part of the standard library I commonly use is perfectly fine for bigger software, but for shell-like scripts it's largely just pedantic boilerplate.

  • Safety when changing the working directory: Often you'll need to run a command in a specific directory. But as soon as you change directories, all your relative paths silently become invalid (They're relative paths, after all). Which means being extra careful to not use the wrong relative paths, and extra work and code to either maintain absolute versions of those paths or make sure everything stays in absolute form (which then results in longer, noisier command lines and tracing output...unless you go to the extra work of sanitizing that too).

    Some of that work and danger can be mitigated by using this idiom every time you change the current working directory:

    { auto saveCwd = getcwd(); chdir("some/dir"); scope(exit) chdir(saveCwd); work... }

    But beyond occasional usage, that gets to be quite a bit of clutter and bookkeeping.

  • Ignoring pedantic filesystem exceptions: Sometimes you just need to make sure a directory exists, creating it if necessary. Or check whether something is a file or directory while counting "doesn't exist" as "duh, no it isn't a file/directory". The exceptions std.path throws in such cases are often a very good sanity check, guaranteeing careless errors don't get through, so I'm glad they exist. But when doing shell-like scripting, they're often more of a bother than they're worth. There are always easy ways to "do it the right way", such as checking whether a path exists before calling isFile or mkdirRecurse, but for script-like purposes these are often unnecessary concerns and only add clutter to otherwise straightforward code. Wrappers are easy to make, but that's still one more thing to do, and that much more boilerplate for your scripts.

  • Deleting an entire directory tree: Normally as easy as rm foo -rf (if you don't care about Windows compatibility), there's no similarly simple equivalent in std.file.

  • Copy and rename aren't as powerful: In a shell script, copy and move can optionally take a glob or directory as the source (instead of just a file), and a directory as the destination. But std.file's equivalents can only take individual files as both source and destination. And std.file's remove and rmdir don't accept globs either.

  • Remove isn't as powerful: Like copy and rename, remove and rmdir don't accept globs. And unlike Posix's rm, a single command can't be used to delete either files or directories. To delete a path generically, you need to first check if it's a file or a directory and call the appropriate function.

  • Accessing the filesystem with UTF-16 or UTF-32 paths: The functions in std.file only accept paths and file names in UTF-8. Although it hasn't been an issue for me personally, if you do have paths in UTF-16 or UTF-32, they need to be converted manually with to!string() or text(). This is a reasonable default since it reduces hidden allocations, but for script-like uses it could be a nuisance.

Some of these things could, and probably should, be addressed in D's standard library, Phobos. But many other issues are arguably less appropriate for Phobos for various reasons. To address all of these issues, and perhaps others that may surface, I've created a simple one-file library, Scriptlike.

At the moment, not all of the issues above are addressed by Scriptlike just yet. But the rest is ready-to-go, and you can check the issue tracker to see what work still remains. Scriptlike is also available in DUB as package scriptlike.

Here are the current features in Scriptlike:

  • A thin wrapper over std.path and std.file that provides a dedicated Path type specifically designed for managing file paths in a simple, reliable, cross-platform way. No more dealing with slashes, paths-with-spaces, calling buildPath, normalizing, or getting paths mixed up with ordinary strings.
  • Optionally enable automatic command echoing (including shell commands, changing/creating directories and deleting/copying/moving/linking/renaming both directories and files) by setting one simple flag: bool scriptlikeTraceCommands
  • Most typical Phobos modules automatically imported. Who needs rows and rows of standard lib imports for a mere script?
  • Less-pedantic filesystem operations for when you don't care whether it exists or not: existsAsFile, existsAsDir, existsAsSymlink, tryRename, trySymlink, tryCopy, tryMkdir, tryMkdirRecurse, tryRmdir, tryRmdirRecurse, tryRemove: All check whether the source path exists and return WITHOUT throwing if there's nothing to do.
  • One simple call, runShell, to run a shell command script-style (ie, synchronously with forwarded stdout/in/err) from any working directory. (Also automatically works around DMD #10863 without waiting for v2.066.)
  • One simple function, fail(string msg), to help you exit with an error message in an exception-safe way. (Does require some minor boilerplate added to your main().)
  • More to come!

Leave a comment