Excellent points.

2 min read · May 25, 2020

Excellent points. I’m not actually fond of using this everywhere; I only meant to show that it is possible in Python and to supply a systematic way of doing it, not to suggest that it should become the new standard.

That said, I’d argue that while it’s a bad pattern for production code, it’s a useful one for exploratory analyses in data science, where the debugging tradeoff can be worth it. The primary advantage is more readable code in places where traditional Python is less clear at a glance. For example, while exploring data assets, one frequently has to make ad hoc transformations simply to visualize the data in some other way. You don’t want to assign these to new variables; you just want to look at the output and move on. And because the chains are built up sequentially, debugging is often pretty easy: in a Jupyter notebook, you’ll often run a >> b, check the output, re-run with another link as a >> b >> c, check the output again, and then run a >> b >> c >> Plot.
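To make the incremental workflow concrete, here is a minimal sketch of one way such a chain could be built in plain Python. The `Step` class and the three example steps are hypothetical names made up for illustration, and this is not necessarily how any particular library implements it.

```python
class Step:
    """Hypothetical wrapper that lets a plain function sit on the right of `>>`."""

    def __init__(self, func):
        self.func = func

    def __rrshift__(self, value):
        # Python falls back to this for `value >> step` when `value`
        # itself does not know how to handle `>>` (e.g. a list).
        return self.func(value)


# Illustrative steps; each takes a list and returns a new value.
drop_negatives = Step(lambda xs: [x for x in xs if x >= 0])
square = Step(lambda xs: [x * x for x in xs])
total = Step(sum)

data = [3, -1, 2]

# Grow the chain one link at a time, checking the output at each stage.
first = data >> drop_negatives                       # [3, 2]
second = data >> drop_negatives >> square            # [9, 4]
result = data >> drop_negatives >> square >> total   # 13
```

One caveat with this sketch: it relies on the left-hand value not defining `>>` itself. Objects that already implement the operator (NumPy arrays, for instance) would need a different approach, such as wrapping the starting value.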

If you disagree that it makes the code more readable, you could instead view this as supplying the necessary abstraction, so that others who want to use this pattern in their libraries don’t have to code the operator overloading themselves. A number of libraries already leverage this sort of overloading to make code visually reflect its underlying logical structure (e.g. Airflow uses the bit shift operators to enable visual structuring of DAGs). :)
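As a rough illustration of the DAG-style use, here is a toy sketch in which `>>` records a dependency instead of transforming data. The `Task` class is a hypothetical stand-in written for this example; it mimics the look of the idiom and is not Airflow’s actual implementation.

```python
class Task:
    """Toy task node; `a >> b` records that b runs after a."""

    def __init__(self, name):
        self.name = name
        self.downstream = []

    def __rshift__(self, other):
        # Record the edge, then return the right-hand task so that
        # `a >> b >> c` wires a -> b and then b -> c.
        self.downstream.append(other)
        return other


extract = Task("extract")
transform = Task("transform")
load = Task("load")

# The line reads like the pipeline it describes.
extract >> transform >> load

extract_next = [t.name for t in extract.downstream]      # ["transform"]
transform_next = [t.name for t in transform.downstream]  # ["load"]
```

Returning the right-hand operand is what makes the chain read left to right; returning `self` instead would attach every later task directly to the first one.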

Written by Robert Yi
