Apache Airflow User Survey 2019

I’ve done a lot of work with Apache Airflow over the last two years, but my experience of how people use it is limited to the teams I’ve worked on and a few glances through talks I’ve seen. I have plenty of ideas of things I’d like to improve, but I have no idea what might be bugging everyone else.

So last week (okay, the week before, actually; I got busy) I sent out this tweet:

Do you use @ApacheAirflow? Want to help me get an indication of how people use it? Mind taking a (very short, 7 question) survey?

https://ashberlintaylor.typeform.com/to/hIO0Ks

Ta!

—https://twitter.com/AshBerlin/status/1096083434538180608

Thank you to the 152 people who responded! As promised, here is a summary of the results.

This post doesn’t really have a conclusion as such - I’ve just presented and summarized the results.

The questions

Before we go any further, a recap of the questions I asked. I picked these fairly arbitrarily, and I wanted the survey to be quick to answer, so I didn’t ask too many.

On a scale of 0-10 how likely are you to recommend Apache Airflow? (0 being not at all)
How do you expect your use of Airflow to evolve in 2019? Increase, Stay about the same, Not sure yet, Decrease
How many active DAGs do you have in your Airflow cluster(s)? 1—5, 6—20, 21—50, 51+
Roughly how many Tasks do you have defined in your DAGs? 1—10, 11—50, 51—200, 201+
What executor do you use? Sequential, Local, Celery, Kubernetes, Dask, Mesos
What would you like to see added/changed in Airflow for version 2.0 and beyond? Free text input
Anything else you’d like to mention? Free text input

The results

1. Airflow’s Net Promoter Score

The average score was 8.3, which is pretty good (though the channels I used to find respondents probably impart a large chunk of selection bias to the results). Still, it’s a pretty good figure. There were some “detractors”, and I’m glad they responded too!

> NPS = Percent_of_9_and_10s - Percent_of_0_to_6s
> NPS = (70 - 17)/152
> NPS = 0.348684

(65 responses, or about 43%, were in the “passive” 7-8 range.)
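The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `net_promoter_score` is made up for this example, the counts are the ones quoted in this post (70 promoters, 65 passives, 17 detractors), and the result is scaled to the conventional -100 to 100 range.

```python
def net_promoter_score(promoters: int, passives: int, detractors: int) -> float:
    """Return the NPS on the conventional -100..100 scale.

    Hypothetical helper for illustration; not part of any survey tool.
    """
    total = promoters + passives + detractors
    if total == 0:
        raise ValueError("cannot compute an NPS with no responses")
    # NPS = percentage of promoters minus percentage of detractors.
    return 100.0 * (promoters - detractors) / total


# Counts as quoted in this post:
# 70 promoters (9-10), 65 passives (7-8), 17 detractors (0-6).
score = net_promoter_score(promoters=70, passives=65, detractors=17)
print(round(score, 1))  # 34.9
```

Passives don’t appear in the numerator, but they do count towards the total, which is why a large “passive” group pulls the score towards zero.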

A Net Promoter Score of 35 is good (0 to 50 counts as good, 50+ as fantastic), though I have no idea what to compare this to: the NPS for an open-source project is an entirely different thing to the score for a commercial company.

Still, nearly 89% of people gave a 7+ rating, which I think is pretty good.

Rating	Responses	Percent
0	0	0%
1	0	0%
2	1	0.7%
3	2	1.3%
4	3	2%
5	2	1.3%
6	8	5.3%
7	14	9.2%
8	51	33.6%
9	28	18.4%
10	42	27.6%

2. Change of use in Airflow for 2019

Answer	Responses	Percent
Increase	118 responses	77.6%
Stay about the same	28 responses	18.4%
Not sure yet	4 responses	2.6%
Decrease	2 responses	1.3%

Interestingly, only one of the detractors (people who rated 0 to 6) expected their use to decrease, and 9 (of the 17) expected it to increase.

3 and 4. How many DAGs and Tasks people have

DAGs:

# DAGs	Responses	Percent
1—5	25 responses	16.6%
6—20	44 responses	29.1%
21—50	27 responses	17.9%
51+	55 responses	36.4%

Tasks:

# Tasks	Responses	Percent
1—10	41 responses	27.3%
11—50	44 responses	29.3%
51—200	23 responses	15.3%
201+	42 responses	28%

Some people asked for a Kubernetes Executor. Well, there is one!

5. What executor do you use?

Executor	Responses	Percent
Celery	95 responses	63.8%
Local	30 responses	20.1%
Kubernetes	16 responses	10.7%
Sequential	7 responses	4.7%
Dask	1 response	0.7%
Mesos	0 responses	0%

It’s probably no surprise that many people use the Celery executor: it’s widely written about and is the “default” horizontal scaling approach. A few people using the Sequential executor commented on why: they mostly just call APIs from their tasks (EMR, Cloud Dataproc, etc.), so their tasks don’t present a heavy load.

That’s it for the quantitative answers. Now for the harder work of summarizing the free-text fields.

6. What would you like to see added/changed in Airflow for version 2.0 and beyond?

To summarize these answers in any useful format I’ve had to classify the responses. I classified each response against the following categories and sub-categories. In total 70% of the responses had

Scheduler - 23 comments

High-availability or run multiple schedulers: 8 comments

Performance of scheduler: 8 comments

Ed: Comments about the CPU use of the scheduler when running, or the time it takes the scheduler to queue tasks

Reparsing of DAG files: 5 comments

The scheduler currently re-parses the DAG files in a fairly tight loop, which can be a bit heavy on external systems if you have a dynamic DAG.

Improvements to SubDAGs: 2 comments

General requests for “improve subdags”. Ed: I agree, and I’m surprised more people didn’t ask for this.

Webserver and WebUI - 39 comments

Accessibility: 3 comments

Colour blind/high contrast mode. General accessibility improvements. Absolutely, we should be better about this.

User Experience: 11 comments

Lots of comments around asking for a “Better UI” or a “Cleaner UI”

Performance: 7 comments

Comments about the UI being slow - especially for large DAGs or a large number of DAGs.

The Web server shouldn’t have to parse the DAGs. Ed: Agreed, and AIP-12 will go a large way towards that

Auto-updating: 3 comments

Having to refresh the page to see tasks changing state is so 2001 ;)

Ed: this would make a huge difference to the feel of the UI, but sadly it might need larger architectural changes to make it happen.

Operational Visibility: 2 comments

Requests to make it easier to see the state of the whole Airflow system from within the UI, i.e. helping work out why tasks in a DAG might not be progressing, etc.

Ed: people after my own heart!

Timezone handling: 5 comments

Better handling of timezones in the UI, specifically better support for the local timezone. Ed: not clear if “local” means the viewer’s timezone, or just the configured timezone, i.e. do people access Airflow from multiple TZs?

Misc Feature Request: 8 comments

Comments that didn’t fit elsewhere: things like parameterized DAG triggers from the UI, more control, keyboard shortcuts, and grouping/collapsing rows.

Core - 15 comments

The “core” of Airflow, excluding the scheduler or the webserver.

Plugins: 4 comments

Requests for a more clearly defined plugin architecture, splitting Airflow into core and plugins. Ed: they may not need to be plugins to be split out; plain Python modules would work.

More Operators: 11 comments

Requests for more operators/sensors. One good request was to have “composable” operators to avoid an explosion of XtoY operators. Ed: this would be nice! If someone wants to start an Airflow Improvement Proposal for this, that would be ace.

Pull Request review/merge time - 3 comments

Three people commented about how long it takes to get PRs reviewed or merged. Ed: Absolutely, and we’d love to get through them quicker, but there is only so much time the volunteer-based committers can spend on this in a day without getting fired ;)

DAGs - 16 comments

Inter-DAG dependencies: 3 comments

A better way of declaring cross-DAG dependencies. Ed: None of the comments specifically said what the current ExternalTaskSensor was lacking.

Event-based Sensors: 4 comments

The ability for sensors to respond to external events without polling. Ed: the new mode="reschedule" on sensors goes a little way to helping with this, but this could still be improved.

Versioned DAGs: 4 comments

Asking for better handling of DAGs as they change over time. Ed: Again, AIP-12 will go a large way towards that.

Misc: 4 comments

Various DAG API changes such as more flexibility in retry, SLA, timeout. Better isolation between DAGs. Ed: PythonVirtualenvOperator might help a little bit with this.

Documentation - 21 comments

Lots of requests for better docs (Ed: yes please!), many mentioning “best practice” around deployment, the upgrade process, etc. Also requests for clearer write-ups of what new features each release brings.

Kubernetes - 10 comments

Better/tighter Kubernetes integration. Easier deployments of DAGs on Kube. Further customization of pods that are run.

Ed: Some comments like “integration with Kubernetes” probably tie back to the previous point about docs: we have a Kubernetes executor and PodOperators too. Maybe people don’t know about them.

Alternative ways of authoring DAGs - 5 comments

Ed: these are, I’m afraid, low-priority for the Airflow core team. One of the selling points of Airflow is that the DAGs are Python code. This could be added via a plugin though.

Add a DSL (Domain Specific Language): 1 comment

A request to describe DAGs in YAML/JSON and then submit via the API - helpful for non-Python teams. Ed: JustEat described something similar (without the API) in their talk at the London Airflow Meetup #1.

GUI editor for DAGs: 4 comments

Various “UI to edit from Web”, “drag-and-drop” etc.

Other - 20 comments

Improved HTTP API: 5 comments

Calls for a better/more fully-featured HTTP API - anything you can do via the Web UI or CLI should be possible via the HTTP API too. Ed: Totally!

Test Framework for end users: 3 comments

Three people asked for “ways to test DAGs locally” or variations of that. Ed: Bas at GoDataDriven wrote https://blog.godatadriven.com/testing-and-debugging-apache-airflow which provides some useful tips.

Miscellaneous: 12 comments

Things that didn’t fit elsewhere, or didn’t deserve their own category: “Better security” (Ed: yes, security could always be improved, but what specifically?), multi-tenant clusters (Ed: RBAC helps a tiny bit there), execution_date is confusing to newcomers, Airflow should be on the Amazon Marketplace, etc.