Fragments of Memoirs

May 01, 2023 16:08


From various discussions on LiveJournal


life, topical issues


Comments 6

(The comment has been removed)

grumbler October 31 2023, 12:18:52 UTC

II. It Fucking Sucks

It's an insane dumpster fire spiderweb of technical debt and it's only like one week old. Here are some fun details.

I get a friend of mine hired (big fan of nepotism), and he finds, on day one, a file in the project's repository that deletes prod using our CI/CD pipelines if it is ever moved into the wrong folder. It comes complete with the key and password required for an admin account. It was produced by the former lead engineer, who moved on to a new role before his sins could catch up with him.

The entire thing is stitched together by spreadsheets that are parsed by Python, dropped into S3, parsed by Lambdas into more S3, the S3 files are picked up by MongoDB, then MongoDB records are passed by another Lambda into S3, the S3 files are pulled into Snowflake via Snowpipe, the new Snowflake data is pivoted by a JavaScript stored procedure into a relational format... and that's how you edit someone's database access. That whole process exists to upload something like a 2KB CSV to a database that has people's database roles in it ( ... )



grumbler October 31 2023, 12:19:24 UTC

I Accidentally Saved Half A Million Dollars

Published on October 29, 2023

I saved my company half a million dollars in about five minutes. This is more money than I've made for my employers over the course of my entire career because this industry is a sham. I clicked about five buttons.

Let's talk about why this happened and why it's a disgrace that it was even possible.

I. Background

Let's start with some background, because it is fucking wild that an inefficiency that took me five minutes to solve in a GUI configuration panel was allowed to persist. We cancelled someone's contract the week before I did this. Someone lost their job because no one could get their act together long enough to click the button I told them to click.

A few years ago, this company decided that it wanted to create an analytics platform, following the decision to become more "data driven". They hired some incredibly talented people to make this happen, and then like five times as many idiots.

At the time this was happening, I had just graduated and joined the ( ... )


grumbler October 31 2023, 12:20:00 UTC

II. The Budget

The next thing to realize is that this platform never really had a chance of making any money for the organization. They do a little accounting trick (read: lying) which I'll talk about in another post that makes it seem like they've had huge wins, but really this is just many times more expensive than our previous operational model.

The deal is that we pretend the whole team is doing something or other, and we stay within budget because the organization can't afford to spend infinite money on this social fiction. However, the budget for our database costs was being drastically overrun. I'm not sure what the original estimate was; I think it was intended to cost something like 200K for a year of operations, and we were now close to a million dollars.

Some quick facts:
  1. We use Snowflake as our database, which charges you based on the size of the computer you use to run your queries.
  2. You only pay for computers while they're on.
  3. We probably run a few thousand queries per week, mostly developers experimenting with little ( ... )
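
A rough sketch of how facts 1 and 2 combine into a bill. The credits-per-hour table follows Snowflake's standard warehouse sizing, where each step up in size doubles the rate. The price per credit, the warehouse size, and the hours are made-up numbers for illustration, not figures from this story.

```python
# Credits consumed per hour by a standard warehouse; each size up doubles the rate.
CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8, "XL": 16}


def yearly_cost(size, hours_on_per_day, price_per_credit=3.0, days=365):
    """Estimated yearly spend for one warehouse.

    price_per_credit is an assumed figure; real prices depend on the contract.
    """
    return CREDITS_PER_HOUR[size] * hours_on_per_day * days * price_per_credit


# Hypothetical comparison: the same warehouse left on all day
# versus one that only runs for eight hours a day.
always_on = yearly_cost("L", 24)
work_hours = yearly_cost("L", 8)

print(f"always on:   ${always_on:,.0f}")
print(f"8 hours/day: ${work_hours:,.0f}")
```

Because you only pay while the computer is on, cutting the hours it stays on cuts the bill in the same proportion, whatever the size.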


grumbler October 31 2023, 12:20:22 UTC

V. Chaos Reigns

I return to work the following Monday. I suspected that this would save a bunch of money, and guess what, our projected bill dropped from a million to half a million dollars, and everyone is losing their fucking minds.

My team has spun this as a huge cost saving, when really we just applied a fire extinguisher to the pile of money that we had set alight.

Other teams are attacking my team, insisting that it can't be a coincidence that the one new guy joined exactly as we did this, and how was it possible we didn't know how to generate that kind of saving without his help? They are saying this because it makes them seem higher status and their teams only produce money in the land where you lie all day, but it is a fair question.

While my managers are very happy, they quietly suggest it may be unwise to roll out the changes to all the computers (I only did a few to be safe) because it would oversaturate the department to hear about us all day. And invite unwelcome questions. The subtext is that if we do this all slowly ( ... )



grumbler March 30 2024, 12:01:07 UTC

Oracle Database 12.2.

It is close to 25 million lines of C code.

What an unimaginable horror! You can't change a single line of code in the product without breaking 1000s of existing tests. Generations of programmers have worked on that code under difficult deadlines and filled the code with all kinds of crap.

Very complex pieces of logic, memory management, context switching, etc. are all held together with thousands of flags. The whole code is riddled with mysterious macros that one cannot decipher without picking up a notebook and expanding the relevant parts of the macros by hand. It can take a day or two to really understand what a macro does.

Sometimes one needs to understand the values and the effects of 20 different flags to predict how the code would behave in different situations. Sometimes 100s too! I am not exaggerating.
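
To put a number on that, here is a minimal sketch that assumes every flag is an independent on/off switch (the post does not say so; real flags may take more values or constrain one another). Under that assumption, the number of distinct flag states doubles with each flag.

```python
def flag_combinations(n_flags: int) -> int:
    """Number of distinct states of n independent boolean flags (an assumption)."""
    return 2 ** n_flags


print(flag_combinations(20))   # 1048576
print(flag_combinations(100))  # a 31-digit number
```

Twenty such flags already give over a million possible states, which is one way to see why nobody can predict the behavior by reading the code alone.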

The only reason this product still survives and still works is its literally millions of tests!

Here is what the life of an Oracle Database developer looks like:

- Start working on a new bug.

- Spend ( ... )


grumbler March 30 2024, 12:02:17 UTC

https://hardsign.livejournal.com/566694.html

The famous post on YCombinator about what this product's code looks like from the inside. Naturally, it has long been written by Indians, so its quality is what you would expect (incidentally, a long time ago I was flying to America on a business trip, got talking on the plane with a young woman from India, and explained to her what we mean by the idiom "Indian code"; strangely enough, she had not heard of it). The post describes the process of fixing a bug:
  • You find the place where the error occurs.
  • You spend a couple of weeks working out which values of the numerous flags and global variables trigger it.
  • You add a new flag to handle the situation and write a couple of lines of code.
  • You run the tests. There are a lot of them: a full run on a cluster of two hundred machines takes a day, sometimes longer.
  • If you are lucky, a few dozen tests turn "red"; if you are not, a few thousand.


