Tag: Airflow

OAuth Authentication in Apache Airflow (2.8.1)

There’s a rather innocuous-sounding bug in Apache Airflow 2.8.1 (https://github.com/apache/airflow/pull/36538, which should be corrected in 2.8.2) that means you absolutely cannot set up SSO using OAuth with FabAirflowSecurityManagerOverride. Using the deprecated AirflowSecurityManager would work, as would manually updating your Apache Airflow code with the fix. But there’s no point in trying to set up SSO with FabAirflowSecurityManagerOverride as your custom security manager: whatever lovely code you write won’t be invoked. You’ll get an error saying the username or email address is not present, even though you thoughtfully wrote custom code to map those exact attributes and it all looks like it should be working!
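For reference, the kind of mapping code that silently goes unused looks something like the sketch below. It is only an illustration: the claim names (preferred_username, given_name, family_name, groups) are assumptions about what an OIDC-style provider returns, and map_oauth_claims is a hypothetical helper, not part of Airflow. The keys it returns (username, email, first_name, last_name, role_keys) are the ones Flask-AppBuilder reads when it creates or updates the user.

```python
from typing import Any, Dict


def map_oauth_claims(claims: Dict[str, Any]) -> Dict[str, Any]:
    """Map provider claims (hypothetical names) to the user-info keys Flask-AppBuilder reads."""
    email = claims.get("email", "")
    return {
        # Fall back to the email address when the provider sends no username claim.
        "username": claims.get("preferred_username") or email,
        "email": email,
        "first_name": claims.get("given_name", ""),
        "last_name": claims.get("family_name", ""),
        # Group names that AUTH_ROLES_MAPPING can translate into Airflow roles.
        "role_keys": claims.get("groups", []),
    }


# Assumption: in webserver_config.py a helper like this would be called from the
# custom security manager's user-info hook, which is the code path the bug skips.
example = map_oauth_claims({"email": "jane@example.com", "given_name": "Jane"})
```

If the result comes back without a username or an email, you get exactly the "not present" error described above, which is why the bug is so confusing: the mapping is fine, it just never runs.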

Apache Airflow — No Backfill

A lot of software seems to be designed to save the user from themselves. This is great 90% of the time, when you mess up and really want the help (or when the help is merely cosmetic; my gripe against auto-correcting smart quotes is an example). But I seem to fall into the other 10% a lot. And I mean a LOT. Apache Airflow jobs try to catch up all.of.the.time: the scheduler wants to run every scheduled interval that has passed since the DAG’s start date, not just the current one. It’s a feature called “backfill”, and I’m sure it helps all sorts of people do exactly what they really wanted done. Not me 🙁

Having updated to 1.8, though, I now see a configuration parameter to instruct a DAG not to do me any favors. Just do what you’re asked when you’re asked to do it: catchup = False

DAG('testjob', default_args=default_args, schedule_interval='0 * * * *', catchup=False)
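To make the difference concrete, here is a small pure-Python sketch of the behavior as I understand it. It does not use Airflow at all; intervals_to_run is a hypothetical function that only imitates the scheduler's choice, and the dates are made up.

```python
from datetime import datetime, timedelta
from typing import List


def intervals_to_run(
    start_date: datetime, now: datetime, interval: timedelta, catchup: bool
) -> List[datetime]:
    """Return the start of each schedule interval that would get a run (simplified model)."""
    completed = []
    cursor = start_date
    # An interval [cursor, cursor + interval) is only eligible once it has fully elapsed.
    while cursor + interval <= now:
        completed.append(cursor)
        cursor += interval
    if catchup:
        # Catch up on every completed interval since start_date.
        return completed
    # With catchup disabled, only the most recent completed interval gets a run.
    return completed[-1:]


start = datetime(2024, 1, 1, 0, 0)
now = datetime(2024, 1, 1, 5, 30)
hourly = timedelta(hours=1)

with_catchup = intervals_to_run(start, now, hourly, catchup=True)
without_catchup = intervals_to_run(start, now, hourly, catchup=False)
```

With an hourly schedule switched on five and a half hours after the start date, the catchup=True case produces five runs (00:00 through 04:00) while catchup=False produces only the 04:00 one. Scale that to a start date months in the past and you can see why I wanted the switch.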