Follow-up to “Graduating from ETL Developer to Data Engineer”

Eric McCarty
4 min read · Aug 9, 2023

Good Intentions Can Lead to Broken Promises

In January 2022, I wrote a post titled “Graduating from ETL Developer to Data Engineer.” At the time, I was a data analytics specialist customer engineer at Google, working with customers who desperately wanted to modernize their data platforms. I was assigned to the “Select Insurance” vertical, so my customers were the biggest insurance carriers in the US: all Fortune 500, most close to or exceeding 100 years old, all struggling to run their businesses on legacy ETL tools and platforms while wanting to get more value out of their data.

I wrote that piece because I saw a disconnect between what I was seeing in the industry (on blogs, at conferences, in industry whitepapers, and from vocal folks on LinkedIn) and what I was seeing in the real world (at least, at Fortune 500 financial services companies) with regard to data engineering. If you only saw the former, you would think the whole data world was running on streaming, cloud-native, modern data pipelines. But I worked mostly with the latter: people working with GUI-based ETL tools. And I saw these people dealing with a lack of respect: they were running the pipelines that drove the business decisions of some of the biggest companies on earth, yet they were being passed over for promotions or recognition because they “weren’t real programmers.”

So I wrote the post for those people so they could level up their careers. I couldn’t have imagined how much engagement it was going to get. To this day, a year and a half later, I still have people reaching out to talk to me about it. I am not one to use the word “viral” because, let’s face it, “GUI-based ETL developers who actually use LinkedIn and Medium regularly” is a pretty niche audience. But for the people I wrote it for, I couldn’t be happier with how much the message resonated. Nothing out there really caters to us, usually. So thank you to everyone who read, liked, and interacted with that post.

Part 2?

However, I made a mistake: at the end, I talked about a part 2. And “Where is part 2?” is the most common question I get. It was supposed to be a collaboration with a colleague where we showed an end-to-end pipeline using modern data engineering principles, geared directly to an audience of GUI-based ETL developers. And we actually did create this pipeline. It contained an RDBMS with a batch input pipeline, a streaming dataset with a streaming input pipeline, and an ELT workflow to curate the data into a target table. It leveraged cloud-native, modern engineering principles and included actual integrated data. More importantly, we wrote up exactly how each transformation translates to its GUI-based counterpart so you could see that, while the tech and processes may be different, the end goal is the same, and you can apply what you already know to a modern architecture.

But a few things happened that kept me from ever publishing it:

  • As with most things, work like this was outside of our day jobs so it always fell to the bottom of the priority list.
  • Even though this was using fake data, the code was developed during work hours, so we had to get approvals to actually publish. For technologists, the paperwork is always the hardest part.
  • Less than two months after part 1 was published, I had a new job at Google as the global solution manager for data warehouse modernization. The job was great but very in-depth, taking me further away from this work.
  • Then a year after that, I was caught up in the big-tech layoffs of early 2023.
  • So then my full-time job was looking for a job, which led me to my new career as a principal data engineer at Walgreens.

None of these are excuses; at the end of the day it’s all about prioritization, and I didn’t prioritize properly. I let that audience down, no matter how niche it is. Since I don’t work at Google anymore, I don’t have access to the code we developed, so it won’t see the light of day. And if I’m being honest, I could say I would rebuild it on my own time by myself, but then I’d just be lying to myself and to you.

So, this is me saying there won’t be a formal part 2. But to make up for it, I’m offering my time to anyone interested in making the leap from GUI-based ETL developer to data engineer. Whether that’s a discussion about your career, advice on where to start, or a review of your code when you tackle your first non-GUI pipeline, I want to give back where I can.

I will be writing more in the future. Some of the career changes of the past couple of years have given me a fresh perspective. And this time, I’ll have no excuses.


Eric McCarty

Data Specialist Engineer at Google and former Technical Architect at USAA and Walgreens. Opinions are my own and not those of Google.