Differences between revisions 14 and 15
Revision 14 as of 2010-11-22 21:37:13
Size: 4221
Editor: Lhunath
Comment: Large scripts.
Revision 15 as of 2011-04-28 14:34:52
Size: 4816
Editor: GreyCat
Comment: code reusability; also expand on binary data

This is a stub. Please fill in the missing pieces.

There are certain things Bash is not very good at, and certain tasks you shouldn't do in Bash unless you really, truly have to. For most of these tasks, it's better to switch to a different language.

  1. Speed. Do we really have to say it? Bash is slow. If speed is an important consideration, then Bash may not be the best choice.

  2. Floating point math. Bash has only integer math. Use bc(1) or AWK instead.

  3. Data structures. Bash does not have Pascal-style records (C-style structs); nor does it have pointers. Any attempt to create advanced data structures (stacks, queues, linked lists, binary trees...) will have to be done with extremely primitive hacks.

  4. Code reusability. I know, I know. Everyone wants to write a library of reusable bash functions. It's just not feasible. Before bash 4.3 you can't pass variables by reference at all (especially not arrays), so any function that acts on a variable has to know the name of the global variable it's acting on. This means you can't reuse functions in other projects without pasting the function and changing the names of the global variables. Bash 4.3 added name references (declare -n), but the function's own local names can still collide with the caller's variable names, so they only partly solve the problem.

  5. Fancy ProcessManagement. Bash has nothing analogous to select(2) or poll(2). Use C instead.

  6. XML and HTML (or the like) parsing. You need external tools or libraries to do this correctly. Use xslt, tidy, xmlstarlet, perl, or some other suitable tool.

  7. Binary data. Bash has no way to store the NUL byte in a variable, so binary data either has to be encoded (and decoded), or kept in a file. You also can't pass the NUL byte as an argument to a program, because the kernel uses C strings for those. Parsing binary data from a file is also a nontrivial problem. Try perl or C instead.

  8. Database queries. When retrieving a tuple from a relational database, there is no way for Bash to understand where one element of the tuple ends and the next begins. In general, Bash is not suited to any sort of data retrieval that extracts multiple data values in a single operation, unless there is a clearly defined delimiter between fields. For database queries (SQL or otherwise), switch to a language that supports the database's query API.

  9. Variable typing. Like most scripting languages, Bash does not really support strong variable types. Variables are loosely categorized as scalar or array, with partial support for an integer type. But really, everything is a string.

  10. Dropping permissions. It can be tough to make a bash script safe to execute as root. In languages like C, perl, and python, you can easily drop privileges at a certain point. With bash, this is tricky, because while you can run su or sudo, you lose variables, and even the executing environment. Use a proper programming language if you have security worries.

  11. Try/catch. Some programming languages let you wrap a command in a try ... catch block. The command then runs in a sort of "sandbox", where errors that would normally cause an abort are "caught" and trigger some sort of error-handling code. Bash does not have anything analogous to this. Any bash code you run is real code: every command runs for real, and you have to check its exit status yourself.

  12. Sorting. Bash can't sort data sets. If you need to sort an array, you can either write your own sorting algorithm in pure bash, or you can serialize the data set, pipe it to sort, and then parse it back in. Either way is painful, particularly if your sort doesn't have -z (NUL-terminated records), because then elements containing newlines can't be serialized safely.
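The floating point item (2) can be seen in a couple of lines. This is only a minimal sketch; the variable names are made up for illustration, and it assumes an awk is installed.

```shell
# Bash arithmetic is integer-only: the fractional part is discarded.
int_result=$(( 10 / 4 ))

# Hand the same division to awk, which computes in floating point.
float_result=$(awk 'BEGIN { printf "%.2f", 10 / 4 }')

printf 'bash: %s  awk: %s\n' "$int_result" "$float_result"
```

Note that the awk result comes back as a string. Bash can store it and print it, but can't do further arithmetic on it.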
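For the code reusability item (4), newer shells offer a partial way out. The following is a sketch that assumes bash 4.3 or later, where local -n creates a name reference. The function name append_to and the variable _append_target are hypothetical, chosen only for this example.

```shell
# Hypothetical helper: append the remaining arguments to the array whose
# NAME is passed as the first argument. Assumes bash 4.3 or later.
append_to() {
    local -n _append_target=$1   # nameref: an alias for the caller's array
    shift
    _append_target+=("$@")
}

fruits=(apple)
append_to fruits banana cherry
printf '%s\n' "${fruits[@]}"
```

The odd-looking name _append_target is deliberate. If the caller's array happened to have the same name as the function's local nameref, the reference would point at itself and the call would not work as intended, so the function still has to guess which names its callers won't use.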
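The binary data item (7) is easy to reproduce. This sketch assumes od and tr are available, as they are on any POSIX-like system; the variable names are illustrative only. It shows the NUL byte vanishing from a variable, and hex encoding as one possible way to keep every byte.

```shell
# Three bytes: 'a', NUL, 'b'. The command substitution drops the NUL.
# Newer bash versions warn about this on stderr, hence the redirection.
{ raw=$(printf 'a\0b'); } 2>/dev/null
raw_len=${#raw}                      # 2, not 3

# Encode the same bytes as hex text, which a variable can hold safely.
hex=$(printf 'a\0b' | od -An -tx1 | tr -d ' \n')
byte_count=$(( ${#hex} / 2 ))        # 3: all bytes accounted for

printf 'raw length: %s  hex: %s  bytes: %s\n' "$raw_len" "$hex" "$byte_count"
```

The encoded form survives, but every operation on the data now has to go through an encode or decode step, which is exactly the awkwardness described above.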
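The "serialize, pipe to sort, parse back in" approach from the sorting item (12) might look roughly like this. It is a sketch that assumes a sort with -z support (GNU sort has it) and uses made-up array names. LC_ALL=C is set only to make the ordering predictable.

```shell
# One element contains a newline, which is why NUL terminators are needed.
unsorted=("pear" $'two\nlines' "apple pie" "fig")
sorted=()

# Serialize with NUL terminators, sort, then read back element by element.
while IFS= read -r -d '' item; do
    sorted+=("$item")
done < <(printf '%s\0' "${unsorted[@]}" | LC_ALL=C sort -z)

printf '<%s>\n' "${sorted[@]}"
```

Without -z the only safe delimiter is gone, and the element containing a newline would be split into two records on the way through sort.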

On top of these, Bash is not ideal for large programs. If your program is going to be responsible for a lot of tasks, especially interactively, then you may want to consider another interpreter or switch to a compiled language altogether. Large Bash scripts very quickly get in trouble because Bash is slow at a lot of things other interpreters are fast at. Large chunks of Bash code quickly become opaque, since there are few ways other than functions to bring structure to your code. Bash scripts are nearly untestable. Even the most purist of bash programmers (and there aren't many!) write code that, when it all adds up, becomes difficult to maintain. Bash has almost no concept of code safety, which lets sneaky little bugs crawl in really easily without warning or notice. And when things go wrong (and things will go wrong), really large scripts are very difficult to debug.

If you do plan to write large Bash scripts, pay even more attention than usual to every single good practice rule, and keep a consistent style throughout the entire code base to spare yourself headaches later on.


CategoryShell

BashWeaknesses (last edited 2022-09-01 18:14:07 by 188)