Table of contents
1. Introduction
2. Background
3. CLI Design
-
3.1 Lexical analysis
3.2 Syntax analysis
3.3 Data structure
4. CLI Development
-
4.1 Scanner
4.2 Parser
4.3 The graph and the command execution
5. Conclusions
6. References
1. Introduction
The configuration, maintenance a monitoring of embedded devices or dedicated servers that need human interaction at least once, such as routers, switchers and firewalls, can provide several interfaces that allow these operations. These interfaces usually are of two types: Command Line Interface (CLI) and Graphical User Interface (GUI).
A GUI, given its user-friendly nature, it’s becoming a common tool for interacting with an embedded system, specially when the device is a home appliance such as a Set Top Box or a mobile telephone.
However, the default and usually the most powerful tool is the CLI. Most embedded devices provide it given that is extremely useful during the development stages of the system and usually allows more advanced operations that are not present in the GUI. Virtually all telecommunications devices provide a CLI for management. Another very useful case of a CLI in which is the only way for interacting with the system is a bootloader.
For the above reasons, the CLI becomes an important topic during the development cycle of an embedded system or a server.
A CLI can be designed and implemented in many different ways. The CLI described in this article comprehends an organized, efficient and dynamic approach that takes advantage of scanners, parsers and graphs.
Normally, embedded devices provide a standard shell such a bash or ash (used in Busybox), but the administrators that will have access to the final product must not have a shell access to the system. They must have access only to a reduced set of operations that include the control of specific peripherals such as the management of a microwave channel in a long-haul radio bridge.
Notice that a CLI is not the same as a shell. The later is a different and very powerful tool that comprehends a command language interpreter that may support concepts such as shell variables, redirection and aliases. A CLI provides just a set of required commands for executing a specific task.
However, there are some similarities between a CLI a shell. For example, both have to take control of the terminal, both have a set of reserved words and both have a specific grammar that allows them to interpret the given commands.
The examples in this guide are based on the Tarasca Project, a proof-of-concept (alpha stage) and Open Source project and that is completely written in C, lex and yacc for *nix. It contains the infrastructure for creating a CLI with user-defined commands that supports privileged menus, tab completion, command completion, commands history and some other features.
2. Background
A CLI can be designed and developed without any specific knowledge in a given area, but the design and development steps can be accomplished faster and can be better structured with a basic knowledge of compiler theory and a programming language.
Notice that a CLI can be designed and developed without the use of any scanner or parser, but as we will see later, this tools are quite helpful when the programmer needs to implement a considerable amount of commands for configuring, monitoring and maintaining a complex system such as a router or a long-haul microwave bridge.
Compiler theory is large and complex, but there are only two concepts from this theory that we apply to CLIs in a simplified manner: lexical analysis and syntax analysis. Lexical analysis is the processing of an input sequence of characters to produce, as output, a sequence of symbols called tokens. Syntax analysis is the process of analyzing a sequence of tokens in order to determine its grammatical structure with respect to a given formal grammar. Sections 3.1 and 3.2 give a brief and, hopefully, a clear explanation.
This site and this book [1] give a further explanation.
The last useful element for a working CLI is the knowledge of a programming language. There is no need for knowing everything about it, but some experience with data structures and pointers is definitely useful. Also, there is no need of a particular programming language, a CLI can be developed in the language of your choice such as C, Perl, Python or Lisp.
Since the Tarasca Project is written in C, we will use this programming language for developing the CLI.
3. CLI design
We can design and develop a CLI in many different ways. The selected choice, that it is used in the Tarasca Project, allows to:
Define the commands in configuration files that can be part of the filesystem or compiled into the executable.
- Process the commands until a convenient data structure is generated.
- Accept input from a terminal that supports add-on functionality such as tab-completion and command history.
- Match the given input from the terminal to the data structure in memory.
- Execute the command.
- Send the output to the terminal.
An illustration explains better the chosen design. Figure 1 shows the building blocks of the CLI:
In this article we will focus on the core issues, the red-highlighted block, for building a CLI. We will address other topics, such as the terminal control and add-on functionality in further articles.
In order to analyze the highlighted blocks of Figure 1, we will assume that we already have the other blocks working properly.
An example of a quick and simple CLI’s command is the following: suppose that we have a radio bridge with multiple communication channels in which we want the command
1 |
> config <channel_number> <bandwidth> |
to set the bandwidth of the given channel number.
This command, called config, receives two parameters, also called arguments. The first argument, channel_number, is an integer number greater than 0; and the second argument, bandwidth, is an integer number in the range [0 – 4096]. The bandwidth is measured in KB. Any other value is considered an error.
3.1 Lexical analysis
Lexical analysis is the processing of an input sequence of characters to produce, as output, a sequence of symbols called tokens. A scanner is the tool used for this procedure. For example, a typical scanner that receives as input the character sequence config 2 2000, will produce as output the tokens config, 2 and 2000. These tokens become the input of the parser.
A dictionary allows the characters to be in any given sequence and to follow a specific order. This is, we need to define the characters that will be accepted by our CLI and we must order them in a particular manner.
For example, we can decide that the allowed characters will be the digits [0-9], all letters from the English alphabet [a-zA-Z] and the characters -, _, space, tab and enter. We can decide also, that all characters can start and end with a sequence of letters, and can have zero or more digits, a – or a _ in the middle of the letter sequences. The spaces and tabs are discarded, but do not produce any error as it would be with any other character such as the & or the %. The enter tells the parser that the sequence is over and that it has to produce a token for the scanner.
Using regular expressions syntax, the allowed syntax would be:
1 |
[^s*[a-zA-Z][-|_|[0-9]*][a-zA-Z]s*$] |
Therefore the following two sequences are valid:
1 2 |
pwd-enable t activate |
but the following two sequences are not valid:
1 2 |
_meminfo (the sequence cannot start with a _) ifup_0 (the sequence cannot end with a digit) |
A finite state machine, that graphically can be seen as a connected graph, is the best way for designing a scanner.
The graph in Figure 2 shows several vertices, also called nodes, that are connected through unidirectional edges. Each node represents a state and each edge represents an input character. Thus, for producing the token pwd-enable when receiving the sequence
1 |
pwd-enabletenter ( t means tab ) |
we have to move through the directed graph in the following way:
We always start from the initial state 0, then when we receive the character p we move to state 1 because when a letter is received in state 0, the matching edge takes us to state 1. When we receive the characters w and d we remain in state 1 because of the edge that points to itself. Then , when we receive the character –, we move to state 2, then when we receive the character e we return to state 1 and remain there for the characters n, a, b, l and e . Then the t or tab is discarded. Finally, when we receive the enter we check if we are in an acceptable state or in an error state, this is, we check if it is a valid sequence of characters. If it is valid, we produce the token, otherwise we produce an error.
3.2 Syntax analysis
Syntax analysis is the process of analyzing a sequence of tokens in order to determine its grammatical structure with respect to a given formal grammar.
A parser is the tools used for this procedure and their input are the output tokens produced by the scanner.
Defining a formal grammar could be a complex process, the dragon book [1] explains it in great detail. Here, we will just describe the necessary concepts for developing the CLI.
A formal grammar is the precise description of a set of strings. There are two main categories of formal grammars: an analytic grammar describes how to recognize when strings are members in the set, whereas a generative grammar only describes how to write those strings in the set.
Since we are interested in writing a set of strings, we will use a generative grammar.
These kinds of grammars are classified in several types known as the Chomsky hierarchy. The most relevant for our purposes are the context-free grammars. The languages that can be described with such a grammar are called context-free languages in which it must be possible to tell how to parse any portion of an input string with just a single token of look-ahead. Strictly speaking, that is a description of an LR grammar.
In the formal grammatical rules for a language, each kind of syntactic unit or grouping is named by a symbol. Those which are built by grouping smaller constructs according to grammatical rules are called nonterminal symbols whilst those which cannot be subdivided are called terminal symbols or token types. An input corresponding to a single terminal symbol is called token, and a piece corresponding to a single nonterminal symbol is called grouping.
For example, the tokens of C are identifiers, constants, the various keywords such as if and for, arithmetic operators and punctuation marks. So the terminal symbols of a grammar for C include identifier, number, string, one symbol for each keyword, operator or punctuation mark. Examples of terminal symbols are if, return, static, int, the plus-sign, the semi-colon and others.
The following C function subdivided into tokens explains the above paragraphs:
1 2 3 4 5 6 7 |
int /* keyword int */ sum (int x, int y) /* identifier, open-parenthesis, keyword int, identifier, * coma, keyword int, identifier, close-parenthesis */ { /* open-brace */ return x + y; /* keyword return, identifier, plus sign, identifier, * semicolon */ } /* close-brace */ |
Listing 1. The tokens of a simple C function
The syntactic groupings of C include the expression, the statement, the declaration, and the function definition. These are represented in the grammar of C by nonterminal symbols expression, statement, declaration and function definition. The full grammar uses many additional language constructs, each with its own nonterminal symbol, in order to express the meanings of these four.
For example, Listing 1 is a function definition that contains one declaration and one statement. In the statement there are three expressions, the first one is x + y that is composed of the two expression x and y.
The start symbol is the only nonterminal symbol that must be distinguished as the special one which
defines a complete utterance in the language.
In C, the start symbol is the sequence of definitions and declarations.
Each time a terminal or a nonterminal symbol is validated, the parser calls a function that will add an element to the data structure that the CLI will use. Therefore, at the end of the syntax analysis, if there were no errors, we will have a validated group of tokens (the terminal and nonterminal symbols) and the data structure for the CLI. The creation and the description of the elements of this data structure is explained in the following section.
3.3 Data structure
In order to execute a command, we must create the necessary structures for finding, executing and eventually extracting information from these commands.
The way we chose for creating the CLI structure is quite fast and organized, although there could be several other ways.
This structure can be created on the fly at the same time at which the syntax analysis takes place. Once the parser has validated and identified the token type, it calls a specific function that will create a specific part of this structure.
Before we describe the creation of this structure and the way its elements interact between them, we identify and describe these elements:
Modes
The modes are used for classifying in levels the different types of commands. For example, the commands that configure a radio channel could be inside the mode channel-conf, whilst the command that change the privileged password could be inside the mode privileged. Between one mode and another there could be a password that allows a privileged user to execute certain commands.
Commands
The commands describe an instruction that will be executed. For example, the command activate could execute the action of activating a radio channel. A command has several properties, the most important ones are the function it calls when it is entered, which actually executes it, and its description that could be used for generating its help. Each command belongs to a specific mode, but it can be exported to other modes becoming available to several modes at the same time. This is useful for generic commands such as help and quit that must be available at any moment.
Arguments
The argument is a parameter that is given to a command. For example, the command config has the argument channel_number that identifies a specific channel and the argument bandwidth that sets the desired bandwidth. Depending on the command, some arguments can be mandatory, and others can be optional.
Prompt
Identifies the functional level in which we are at any given moment. It could be a static or a dynamic string. For example, using the host name as a prompt would be a dynamic string.
The graph
The structure that puts all these elements together in an organized and efficient manner is a graph.
This graph contains 3 types of single-linked lists:
- List 1: The first list is composed of modes, each node in the list represent a mode. For example, the first node could be used as a user mode for basic commands, second node could be used as privileged mode for sensitive commands such as the device configuration operations, third mode could be used as privileged mode for the configuration of radio channels only, and so on. There is only one list of modes.
- List 2: Since each mode can have several commands, each node of the mode list has a list of commands. For example, typical commands for the third node would be activate channel and lockout channel that are typical commands for configuring a network interface and only a privileged user can have access to them.
Some other commands could allow us to move from one mode to another.
For example, the command enable could move us from the user mode to the privileged mode, an the command exit could move us to the previous mode. - List 3: Since each command can have several mandatory and/or optional arguments, each node of the command list has a list of arguments. For example, the command config channel needs as arguments the radio channel number and its bandwidth.
Figure 3 show the graph as the main data structure for the CLI with its three types of lists: modes, commands and arguments.
Having a graph of this kind facilitates significantly how to execute a command, how to extract its data, or simply how to move from one element to another.
For example, taking as reference Figure 3, if we are in the node that represents the channel-conf mode and from the CLI we enter the config 2 2000 command, we have to move through the list of commands until we find the node of the config command. If the command is not found an error is issued. If it is found, we have to move along the arguments list of this command for finding each one of its arguments. If at least one of the arguments is not found, an error is issued. First we search for the channel_number argument. If we found it, we move ahead for searching the bandwidth argument. If we found it, we stop, otherwise we issue an error.
After this procedure, we can execute the command or extract information about it and its arguments.
Notice that one of the properties of a command node is the function that must be called when this command is given.
The following sections describe how the user can implement the scanner, the parser and how to create the graph and implement the functions of each command.
4 CLI development
4.1 Scanner
A scanner is a tool for performing lexical analysis, this is, a scanner processes an input sequence of characters to produce, as output, a sequence of tokens.
The Tarasca Project uses the widespread open source scanner called Flex, therefore we will use it for explaining how to generate our tokens. The code in this section is taken from the file rhal/rhal_lexer.l
A Flex input file consists of three sections, separated by a line with %% in it:
1 2 3 4 5 6 |
definitions %% rules %% user code %% |
Listing 2. The structure of a Flex file
The definitions section contains declarations of simple name definitions to simplify the scanner specification and declarations of start conditions. Listing 3 shows the definitions of the start and end identifiers, quoted string, comments, space and new line character that are the elements that our scanner needs for the CLI:
1 2 3 4 5 6 7 |
identifier_start [A-Za-z_] identifier_end [ |