Tutorials¶
Some tutorials for using and extending Good Tables.
1. Implementing a custom processor¶
# TODO: This is unfinished.
Implementing a custom validator that can be invoked in a pipeline is easy.
Let’s write one that checks that values are in a certain range.
For data, see the file custom-range.csv in the examples directory.
In our data, we have name, age and city data for a group of people.
We want to ensure that all the people in our data are in the 25-50 age range.
For demonstration, we’ll write a pretty specific validator for this.
Of course the implementation could be made more generic for a range scenarios.
Our validator class:
class AgeRangeValidator(object):
column_name = 'age'
column_type = int
column_range = (25, 50)
report = {}
def run_row(self, index, headers, row):
valid = True
return valid
def run():
valids = []
return valid, report
def generate_report():
return report
As you can see, we hard coded column_name, column_type and column_range.
Also, we are running our validation through the run_row method, which is the most common method used for validations.
However, we could easily run the same validation in the run_column instead:
# stuff
def run_column():
valid = True
return valid
So, let’s see it in action. First, we’ll run the validator in ‘stand alone’ via its run method:
validator = AgeRangeValidator()
filepath = 'examples/custom-range.csv'
valid, report = validator.run(filepath)
And the same, but part of a validation pipeline using the structure validator with our AgeRangeValidator:
validators = ('structure', 'my_module.AgeRangeValidator')
filepath = 'examples/custom-range.csv'
validation_pipeline = ValidationPipeline(filepath, validators)
valid, report = validation_pipeline.run()