Automatic argument parsers for Python

A recurrent problem I used to have when writing argument parsers is that, after refactoring code, I also had to update the parser options. Forgetting to do so led to inconsistencies between the arguments of the function and the options of the argument parser. The following example illustrates the problem:

def main(a,b):
  """
  This function adds together two numbers a and b
  param a: first number
  param b: second number
  """
  print(a+b)

if __name__ == "__main__":
  import argparse
  parser = argparse.ArgumentParser()
  parser.add_argument("--a", type=int, required=True, help="first number")
  parser.add_argument("--b", type=int, required=True, help="second number")
  args = parser.parse_args()
  main(**vars(args))

This code is just a simple function that prints a+b, with an argument parser that asks for a and b. The perhaps less obvious part is the invocation of the function, which uses ** and vars. vars converts args, the argparse.Namespace object returned by parse_args, into a dictionary of the form {"a": 1, "b": 2}, and ** expands that dictionary into keyword arguments for the function. So main(**{"a": 1, "b": 2}) is equivalent to main(a=1, b=2).
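As a quick illustration of the vars and ** combination, here is a minimal sketch. The Namespace is built by hand (instead of calling parse_args) only so that the example does not depend on the command line, and main returns the sum rather than printing it so the result is easy to inspect.

```python
import argparse


def main(a, b):
    # Return the sum instead of printing it, to make the result easy to check.
    return a + b


# parse_args() returns an argparse.Namespace; here one is built by hand.
args = argparse.Namespace(a=1, b=2)

as_dict = vars(args)      # the Namespace attributes as a plain dictionary
result = main(**as_dict)  # same call as main(a=1, b=2)

print(as_dict, result)
```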

Let’s refactor the function so that we change the name of the argument a to num.

def main(num,b):
  """
  This function adds together two numbers num and b
  param num: first number
  param b: second number
  """
  print(num+b)

if __name__ == "__main__":
  import argparse
  parser = argparse.ArgumentParser()
  parser.add_argument("--num", type=int, required=True, help="first number")
  parser.add_argument("--b", type=int, required=True, help="second number")
  args = parser.parse_args()
  main(**vars(args))

Easy, but it required changing the code in two places: the function and the argument parser definition. Now let’s say that I start adding new arguments, as in the following.

def main(num, b, prefix, prologue=None):
  """
  This function adds together two numbers num and b
  param: num: first number
  param: b: second number
  param: prefix: print something before the summation
  param: prologue: print a line before the summation
  """
  if prologue:
    print(prologue)
  print(prefix+str(num+b))

if __name__ == "__main__":
  import argparse
  parser = argparse.ArgumentParser()
  parser.add_argument("--num", type=int, required=True, help="first number")
  parser.add_argument("--b", type=int, required=True, help="second number")
  parser.add_argument("--prefix", type=str, required=True, help="some prefix to print before the summation")
  parser.add_argument("--prologue", type=str, required=False, help="print a line before the summation")
  args = parser.parse_args()
  main(**vars(args))

You are probably starting to see my problem. Any time I have to edit two different regions of the code, the chances of forgetting one of them or botching a copy-paste increase.
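To make the failure mode concrete, here is a small sketch of what happens when the function is renamed but the parser is not. The stale "--a" option and the explicit argument list passed to parse_args are assumptions made for the demonstration; in a real script the arguments would come from the command line.

```python
import argparse


def main(num, b):  # the parameter was renamed from a to num
    return num + b


parser = argparse.ArgumentParser()
parser.add_argument("--a", type=int, required=True)  # stale: still the old name
parser.add_argument("--b", type=int, required=True)

# Explicit argument list so the sketch does not read sys.argv.
args = parser.parse_args(["--a", "1", "--b", "2"])

try:
    main(**vars(args))
    error = None
except TypeError as exc:
    # The parser happily produced "a", but main no longer accepts it.
    error = str(exc)

print(error)
```

The parser itself raises no complaint. The mismatch only surfaces as a TypeError when the function is finally called, which is why it is easy to miss.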

So what can we do to avoid this? Automatic argument parsers can save the day.

One of the available alternatives is plac, which allows you to move the argument parser definition to the very same region of the code where you define the function. The following shows you how to use plac for the previous example.

import plac

@plac.opt('num', "first number", type=int)
@plac.opt('b', "second number", type=int)
@plac.opt('prefix', "print something before the summation", type=str)
@plac.opt('prologue', "print a line before the summation", type=str, abbrev="l")
def main(num, b, prefix, prologue):
  """
  This function adds together two numbers num and b
  param: num: first number
  param: b: second number
  param: prefix: print something before the summation
  param: prologue: print a line before the summation
  """
  if prologue:
    print(prologue)
  print(prefix+str(num+b))

if __name__ == "__main__":
    plac.call(main)

This code is functionally equivalent to the previous example (with some minor differences, such as how optional arguments are enforced), and the main problem has been reduced: the function and the argument parser definition are now in the same place. Still, there is quite a bit of duplication. We have repeated the name of each argument three times (plac.opt, signature, and docstring) and its description twice (plac.opt and docstring). Consequently, it is still quite likely that we will forget to update one of those places. Wouldn’t it be great if we could get a parser that is created automatically from the documentation (e.g. the docstring)?

funcargparse is one such automatic parser builder: all the details required for the argument parser are provided in the docstring and, since you should always be documenting your functions anyway, the extra effort is minimal. You can see how easy it is to translate the previous example to funcargparse:

def main(num=None, b=None, prefix=None, prologue=None):
  """
  This function adds together two numbers num and b
  
  Parameters
  ----------
  num: int
      first number
  b: int
      second number
  prefix: str
      print something before the summation
  prologue: str
      print a line before the summation
  """
  
  if prologue:
    print(prologue)
  print(prefix+str(num+b))

if __name__ == "__main__":
    
    from funcargparse import FuncArgParser
    parser = FuncArgParser()
    parser.setup_args(main)
    parser.create_arguments()
    args = parser.parse_args()
    main(**vars(args))
    

The names of the variables and their types are part of the docstring, which is always good practice. With this kind of tool, you no longer need to worry about copy-pasting, and the overall quality of your code improves because you are forced to keep the documentation up to date.

I also want to present my own alternative, argParseFromDoc, which adds another good practice on top of the functionality of funcargparse. In argParseFromDoc, argument types are inferred from the type hints in the signature, so you are forced to include both type hints and argument descriptions. It is also compatible with several docstring formats and automatically checks the signature against the docstring. The following snippet shows how to use argParseFromDoc.

from typing import Optional

def main(num: int, b: int, prefix: str, prologue: Optional[str]):
  """
  This function adds together two numbers num and b
  
  Parameters
  ----------
  num: int
      first number
  b: int
      second number
  prefix: str
      print something before the summation
  prologue: str
      print a line before the summation
  """
  
  if prologue:
    print(prologue)
  print(prefix+str(num+b))

if __name__ == "__main__":
    
    from argParseFromDoc import get_parser_from_function
    parser = get_parser_from_function(main)
    args = parser.parse_args()
    main(**vars(args))
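To give a rough idea of how type hints can drive a parser, here is a minimal sketch built only on the standard library. It is not how argParseFromDoc is implemented: build_parser is a hypothetical helper written for this illustration, it ignores the docstring descriptions entirely, and it only handles simple types such as int and str plus Optional.

```python
import argparse
import inspect
from typing import Optional, Union, get_args, get_origin, get_type_hints


def build_parser(func):
    """Hypothetical helper: one --option per parameter, typed from the hints."""
    parser = argparse.ArgumentParser()
    hints = get_type_hints(func)
    for name, param in inspect.signature(func).parameters.items():
        hint = hints.get(name, str)  # fall back to str when there is no hint
        optional = False
        # Optional[X] is Union[X, None]: unwrap it and mark the option optional.
        if get_origin(hint) is Union:
            hint = [t for t in get_args(hint) if t is not type(None)][0]
            optional = True
        has_default = param.default is not inspect.Parameter.empty
        parser.add_argument(
            f"--{name}",
            type=hint,
            required=not (optional or has_default),
            default=param.default if has_default else None,
        )
    return parser


def main(num: int, b: int, prefix: str, prologue: Optional[str]):
    if prologue:
        print(prologue)
    print(prefix + str(num + b))


parser = build_parser(main)
# Explicit argument list so the sketch does not read sys.argv.
args = parser.parse_args(["--num", "1", "--b", "2", "--prefix", "sum="])
main(**vars(args))
```

Because the options are derived from the signature, renaming a parameter renames the command-line option with it, and there is no second place to update.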

More advanced features are also implemented, including variable numbers of arguments, optional and mandatory arguments, and default values. Take a look at the documentation and examples if you are interested. And if you use my package, any feedback is welcome.
