3 Replies Latest reply on Nov 12, 2014 11:49 AM by Bill Robinson

    [ NSH ] Piped redirections are buffered and not flushed properly (data can be lost)

    Clément BARRET



      Let me show you something very simple (every step below was tested and works flawlessly in standard bash):


      1) Create a /tmp/youpi.sh simple script

      --- (content starts below) ---


      echo youpi;

      sleep 42;

      echo plop;

      --- (content ends just above) ---


      2) Make it executable


      % chmod +x /tmp/youpi.sh


      3) Try this simple thing :


      % /tmp/youpi.sh | cat -e


      You should see the first line ("youpi") appear right away, with a "$" at the end of the line (added by cat -e).


      4) Now try to just add a redirection :


      % /tmp/youpi.sh | cat -e > /tmp/my_youpi_output.log


      Now interrupt it with CTRL+C (don't wait for the 42 seconds).


      % cat /tmp/my_youpi_output.log



      it's EMPTY :X


      5) You can do the same but wait for the script to end and you'll see that the redirection worked.


      % /tmp/youpi.sh | cat -e > /tmp/my_youpi_output.log; cat /tmp/my_youpi_output.log
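Steps 3 to 5 can also be scripted so that no manual CTRL+C is needed. The following is only a sketch under a few assumptions: it inlines the script body, shortens the 42-second sleep to 3 seconds, and looks at the log file while the pipeline is still running instead of killing it. In a shell whose redirections are not buffered, the early snapshot should already contain the first line.

```shell
#!/bin/sh
# Sketch: inspect the redirected file while the producer is still sleeping.
out=/tmp/my_youpi_output.log      # same log path as in the steps above
: > "$out"                        # start from an empty file

# Same output as /tmp/youpi.sh, with a shorter sleep (assumption: 3 s is enough).
( echo youpi; sleep 3; echo plop ) | cat -e > "$out" &

sleep 1                           # stand-in for "don't wait for the script to end"
early=$(cat "$out")               # what has reached the file so far
wait                              # now let the pipeline finish

echo "after 1 second: '$early'"
echo "at the end:"
cat "$out"
```

Run it with the shell you want to check; an empty early snapshot corresponds to the behaviour described in step 4.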





      So, two conclusions here :


      => Redirections aren't working as they're supposed to (the same test works flawlessly in bash, ksh and even regular zsh).

      => You can lose data if your program is terminated early (killed, for instance), because output that is still buffered in the pipe never reaches the file.


      Then I wanted to know what would happen if I kept pushing more "echo" output through, to see whether it would flush at some point.


      Actually, it does flush the output, in 4096-byte chunks... but any data written and not yet flushed is lost if the program is killed...
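One way to observe the flush granularity is to write fixed-size lines slowly and sample the size of the redirected file. This is a rough sketch, not the exact test I ran: the scratch path /tmp/flush_probe.log, the 20-byte line size and the line count are all assumptions chosen to keep it short.

```shell
#!/bin/sh
# Sketch: emit 20-byte lines once per second and sample the output file size.
out=/tmp/flush_probe.log          # hypothetical scratch file
lines=4                           # raise to 205 or more to cross 4096 bytes
: > "$out"

i=0
while [ "$i" -lt "$lines" ]; do
    printf '%019d\n' "$i"         # 19 digits + newline = 20 bytes
    sleep 1
    i=$((i + 1))
done | cat > "$out" &

sizes=""
n=0
while [ "$n" -lt "$lines" ]; do
    sleep 1
    sizes="$sizes $(wc -c < "$out" | tr -d ' ')"   # size seen at this moment
    n=$((n + 1))
done
wait                              # make sure the writer has finished
final=$(wc -c < "$out" | tr -d ' ')

echo "sizes sampled while running:$sizes"
echo "final size: $final"
```

With unbuffered redirection the sampled sizes grow by 20 bytes at a time. If the sizes stay at 0 and then jump, the output is being held back; to see a jump at 4096 bytes the line count has to be raised as noted in the comment (and the sleep shortened to keep the run time reasonable).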


      So, I went to the NSH manual page from there :


      % man /opt/bmc/bladelogic/NSH/man/man1/nsh.1


      and I noticed these two important things:


      First : NSH is ZSH with "the following differences"  (I quote)




             This manual page outlines the differences between the Network Shell and a regular shell. It does not provide a detailed description of Network Shell behavior. See the man pages for zsh to obtain detailed information on how the Network Shell works. The Network Shell is a link to a distributed version of zsh.



      then on redirections :




             Redirection in the Network Shell is implemented with pipes rather than the usual dup() or dup2 () system calls. This is necessary to properly implement redirection to files on remote hosts. There are a few limitations when using redirection. First, only the file descriptors 1 (standard output) and 2 (standard error) are supported for redirection. Other values may produce unexpected results. Next, the redirection type <>, which causes the output file to be opened for both read and write, is treated the same as the < redirection type. The remaining types of redirections work (with the restrictions described above).



      So basically, the internal mechanism isn't the same, and "there are a few limitations when using redirection".

      Well, the man page author left out some other limitations here...


      Since nothing else is listed as a difference from zsh in the man page provided with NSH, I expect everything described in the zsh manual to work.


      So, open the zsh manual and look for the "PROCESS SUBSTITUTION" section:

      % man zshexpn




             Each command argument of the form ‘<(list)’, ‘>(list)’ or ‘=(list)’ is subject to process substitution. In the case of the < or > forms, the shell runs process list asynchronously. If the system supports the /dev/fd mechanism, the command argument is the name of the device file corresponding to a file descriptor; otherwise, if the system supports named pipes (FIFOs), the command argument will be a named pipe. If the form with > is selected then writing on this special file will provide input for list. If < is used, then the file passed as an argument will be connected to the output of the list process. For example,


                    paste <(cut -f1 file1) <(cut -f3 file2) |
                    tee >(process1) >(process2) >/dev/null


             cuts fields 1 and 3 from the files file1 and file2 respectively, pastes the results together, and sends it to the processes process1 and process2.


             If =(...) is used instead of <(...), then the file passed as an argument will be the name of a temporary file containing the output of the list process. This may be used instead of the < form for a program that expects to lseek (see lseek(2)) on the input file.


             There is an optimisation for substitutions of the form =(<<<arg), where arg is a single-word argument to the here-string redirection <<<. This form produces a file name containing the value of arg after any substitutions have been performed. This is handled entirely within the current shell. This is effectively the reverse of the special form $(<arg) which treats arg as a file name and replaces it with the file’s contents.


             The = form is useful as both the /dev/fd and the named pipe implementation of <(...) have drawbacks. In the former case, some programmes may automatically close the file descriptor in question before examining the file on the command line, particularly if this is necessary for security reasons such as when the programme is running setuid. In the second case, if the programme does not actually open the file, the subshell attempting to read from or write to the pipe will (in a typical implementation, different operating systems may have different behaviour) block for ever and have to be killed explicitly. In both cases, the shell actually supplies the information using a pipe, so that programmes that expect to lseek (see lseek(2)) on the file will not work.


             Also note that the previous example can be more compactly and efficiently written (provided the MULTIOS option is set) as:


                    paste <(cut -f1 file1) <(cut -f3 file2) \
                    > >(process1) > >(process2)


             The shell uses pipes instead of FIFOs to implement the latter two process substitutions in the above example.


             There is an additional problem with >(process); when this is attached to an external command, the parent shell does not wait for process to finish and hence an immediately following command cannot rely on the results being complete. The problem and solution are the same as described in the section MULTIOS in zshmisc(1). Hence in a simplified version of the example above:


                    paste <(cut -f1 file1) <(cut -f3 file2) > >(process)


             (note that no MULTIOS are involved), process will be run asynchronously.  The workaround is:


                    { paste <(cut -f1 file1) <(cut -f3 file2) } > >(process)


             The extra processes here are spawned from the parent shell which will wait for their completion.



      Basically, try it any way you like: it never works. I couldn't get a single one of those examples to work in NSH.
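For anyone who wants to check this on their own install, here is a minimal smoke test of the two forms that do not need zsh-only syntax. It is a sketch with assumptions: SH_UNDER_TEST is a hypothetical variable naming the shell to probe, that shell is assumed to accept -c, and the default is bash only so that the sketch runs somewhere. The =(list) form is left out because it is zsh-specific.

```shell
#!/bin/sh
# Sketch: probe <(list) and >(list) in a chosen shell.
# SH_UNDER_TEST is a hypothetical variable; it defaults to bash.
sh_under_test=${SH_UNDER_TEST:-bash}

# <(list): cat reads the output of "echo" through a file name.
in_result=$("$sh_under_test" -c 'cat <(echo in-ok)' 2>&1)

# >(list): the output of "echo" is fed to cat.
out_result=$("$sh_under_test" -c 'echo out-ok > >(cat)' 2>&1)

echo "<(...) gave: $in_result"
echo ">(...) gave: $out_result"
```

In a shell where process substitution works, each line echoes its marker back; anything else (an error message, or nothing at all) shows which form is failing.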


      To summarize:


      - Redirections in NSH aren't reliable, you can lose data.

      - Process substitution doesn't work in NSH.


      Now I'm a bit stuck, because I have to retrieve and process the output of some programs in real time, line by line (as I can with any regular shell that reads line by line between pipes), without having to wait for them either to terminate or to send enough data to fill a 4096-byte chunk. And I need to do that in NSH...
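To make the requirement concrete, this is the kind of per-line processing I mean, written for a regular shell. It is only an illustration: stamp_lines is a hypothetical helper name, and whether NSH delivers each line to such a loop promptly is exactly the open question.

```shell
#!/bin/sh
# Sketch: handle each line as soon as it arrives on the pipe.
# stamp_lines is a hypothetical helper that prefixes lines with the time.
stamp_lines() {
    while IFS= read -r line; do
        printf '[%s] %s\n' "$(date +%H:%M:%S)" "$line"
    done
}

# Producer with a pause between lines (shortened from the 42 s above).
{ echo youpi; sleep 2; echo plop; } | stamp_lines
```

In bash, ksh or regular zsh the two timestamps differ by about two seconds, which shows that the first line was processed before the producer finished.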