When working with APIs and other data sources, I've often run into questionable entries for timezones and currency codes. I thought it would be helpful to have these expectations readily available to GE users, so that cross-referencing dictionaries or third-party APIs wouldn't be necessary.
What it does
The timezone expectation checks whether the values in a column are valid IANA timezones (using the pytz library).
The currency expectation checks whether a column contains valid 3-letter currency codes (using the py-moneyed library).
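As a rough illustration of the two checks, here is a minimal sketch. It is not the code of the contributed expectations: the helper names (`valid_timezone_names`, `valid_currency_codes`, `check_column`) are hypothetical, and it assumes pytz and py-moneyed are installed.

```python
from typing import Iterable, List, Set


def valid_timezone_names() -> Set[str]:
    """IANA timezone names known to pytz (imported lazily; requires pytz)."""
    import pytz

    return set(pytz.all_timezones_set)


def valid_currency_codes() -> Set[str]:
    """Currency codes known to py-moneyed (imported lazily; requires py-moneyed)."""
    import moneyed

    # moneyed.CURRENCIES is a dict keyed by currency code.
    return set(moneyed.CURRENCIES)


def check_column(values: Iterable[object], allowed: Set[str]) -> List[bool]:
    """Return one pass/fail flag per value; non-string values always fail."""
    return [isinstance(v, str) and v in allowed for v in values]


# Example usage (needs the third-party libraries):
# check_column(["Europe/Paris", "Mars/Olympus"], valid_timezone_names())
# check_column(["USD", "usd"], valid_currency_codes())
```

Both checks are exact string matches, so a lowercase entry such as "usd" fails.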
How we built it
- With the Great Expectations documentation and a computer :)
Challenges we ran into
- Deciding on whether to use pytz or the zoneinfo standard library package included in Python 3.9.
- Finding a simple, up-to-date money library that could be relied on. It would be possible to just use a static tuple/list of currency codes, but that would create problems whenever a currency code was changed or added.
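For comparison, the standard-library alternative mentioned in the first challenge could look roughly like the sketch below. The function names are hypothetical, and this is not what the expectation uses (it uses pytz). It assumes Python 3.9+ and that a timezone database is available, either from the operating system or from the `tzdata` package; without one, the set of known names is empty.

```python
from functools import lru_cache
from typing import FrozenSet
from zoneinfo import available_timezones


@lru_cache(maxsize=1)
def _known_timezones() -> FrozenSet[str]:
    """Cache the name set, since available_timezones() rescans on every call."""
    return frozenset(available_timezones())


def is_valid_timezone(name: object) -> bool:
    """True if name is a string matching a known IANA timezone key."""
    return isinstance(name, str) and name in _known_timezones()
```

The trade-off is roughly this: zoneinfo avoids a third-party dependency but needs Python 3.9 or later and depends on the timezone data installed on the machine, while pytz ships its own copy of the database.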
Accomplishments that we're proud of
First-time contributor to Great Expectations.
What we learned
The main thing I learned was the steps needed to contribute a new expectation to GE. This will be very helpful in the future if I think of more ideas.
What's next for Lucas Smith's New Expectations
- In the future, the expectations could allow for inexact entries (e.g. "usd" instead of "USD").
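One possible shape for that relaxed matching, sketched for currency codes only: the `exact` flag and the function name are hypothetical, and the small `allowed` set in the usage lines stands in for the codes that py-moneyed would provide.

```python
from typing import Set


def is_valid_currency(value: object, allowed: Set[str], exact: bool = True) -> bool:
    """Check a currency code, optionally ignoring case and surrounding spaces."""
    if not isinstance(value, str):
        return False
    # In inexact mode, normalize to the canonical upper-case form first.
    candidate = value if exact else value.strip().upper()
    return candidate in allowed


# Illustrative subset of codes; a real check would load the full list.
allowed = {"USD", "EUR", "JPY"}
strict_result = is_valid_currency("usd", allowed)
relaxed_result = is_valid_currency(" usd ", allowed, exact=False)
```

Upper-casing works for currency codes because the canonical form is all capitals. Timezone names are mixed case (e.g. "Europe/Paris"), so an inexact timezone match would need a different approach, such as a lookup keyed on the lower-cased name.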