The SQL IDE should die

Why I hate the IDE — at least, for analytics work.

Robert Yi
Published in Hyperquery
4 min read · Jan 3, 2023


Article originally published in Win With Data — subscribe for regular updates!

Before I condemn the IDE, I want to say that I do look back fondly on my days writing in SQL IDEs. The process was spare and brutally technical in a way that nourished my obsession with Unix fundamentalism. While I was at Airbnb, I even wrote an open-source library to turn my CLI setup into a proper SQL IDE, data discovery and all.

But it’s high time I admit something I’ve always known deep in my shell-script heart:

The IDE was not built for analytics.

Let’s talk about why.

Note: if you’re searching for the bias, I am a co-founder of Hyperquery, where we’re building a notebook for analytics. My intent isn’t to convince you to use us, per se, but to share the argument that motivated us in the first place. I’ll try to stay measured.

Why the IDE isn’t for analytics:
analytics isn’t about development.

Well, the IDE is not for analytics, by definition. The IDE is an integrated development environment. It consolidates the needs of the development workflow.

But analytics is not about development. To the extent that we use a programming language at all, it is not to develop an application but to access and manipulate data. And everything else that happens in analytics revolves around interpreting the resulting data payload, not hardening the code into a codebase.


Analytics is primarily about alignment, interpretation, communication — the non-SQL behaviors that enable us to establish an interface between data and impact. While our scripting chops open the door to a world of data inaccessible to the rest of the business, our subsequent behaviors unlock the value therein.

What the solution should look like:
the notebook, where data and interpretation mix.

So what’s the solution?

We don’t need an integrated development environment, because analytics isn’t primarily about development. We need an integrated analytics environment that addresses the needs of analytics, not just SQL. It’s time we stopped co-opting an interface from another field when our needs are different.

We need a proper analytics notebook.

Why?

1. Notebooks fit the analytics workflow better.

While SQL IDEs push you towards consolidation (one, final query), notebooks push you towards exploration. And the latter is the preferred pattern for analytics: your queries are rarely ends in and of themselves. They always deserve, at minimum, a line or two of explanation and context. Notebooks are better for this.

2. Notebooks reinforce better behaviors.

It might seem that dumpster diving into your IDE is the fastest way to get going, but it’s seldom the best solution: it’s the fast food of analytics work. It may work in a pinch, but relying on it for the bulk of your work will only reinforce bad habits and degrade the quality of your work in the long term. The technical work should always be bracketed by alignment beforehand and interpretation afterward.

3. Notebooks elevate data to knowledge, and that’s what we care about.

Notebooks represent knowledge, and knowledge is the currency of the business that analytics teams should peddle (not data!). SQL queries deal in data. Orienting around the thing that matters brings every ancillary process into line with it more coherently. Knowledge should be organized, not data. Knowledge should be shared, not data.

Final comments

A few caveats:

  1. There are certainly workflows where development is appropriate: building pipelines, data models, etc. But these fall within the realm of data engineering and analytics engineering. While these are often within the scope of analytics work, they are not analytics.
  2. Some of you may be chanting “Jupyter” or its derivatives at this point, but I don’t think this is the optimal solution. It’s not built from first principles for analytics, meaning its shape will inevitably bear fundamental shortcomings and clumsy vestiges. But that’s another post for another time.

And all that said, certainly I’m biased. We have a lot of sunk cost here. But I hope you find the original reasoning sound (and if not, let me know — there is precious little I care about more than challenging this line of reasoning).

Notebook or not, an upheaval is overdue. Not everything is a nail. We deserve a tool purpose-built for analytics, not another re-purposed development tool.

Tweet @imrobertyi / @hyperquery to say hi.👋
Follow us on LinkedIn. 🙂

To learn more about how Hyperquery can help, check out hyperquery.ai.


Chief Product Officer, Hyperquery (hyperquery.ai). Former ds @ Airbnb, Wayfair; Ph.D. @ MIT, physics @ Harvard. twitter.com/imrobertyi Also at win.hyperquery.ai