Spyke

Replies

Comment on

Jeff Geerling: Self-hosting your own media considered harmful (updated). Youtube removed his content, saying that self hosting content is "dangerous or harmful content"

I think a few folks haven't read the article or don't know who Jeff Geerling is. The title of this article is confusing.

Jeff posted a video on YT about how to self-host your own media in 2024. He recently got a violation from YT saying it considers the video harmful and dangerous. He appealed and got denied, but the update is that YT has since removed the violation.

Comment on

Ladybird Browser Team Selects Swift as Preferred Language

Reply in thread

Please read this and try again.

https://www.gnu.org/philosophy/free-sw.en.html#packaging

Rules about how to package a modified version are acceptable, if they don't substantively limit your freedom to release modified versions, or your freedom to make and use modified versions privately. Thus, it is acceptable for the license to require that you change the name of the modified version, remove a logo, or identify your modifications as yours. As long as these requirements are not so burdensome that they effectively hamper you from releasing your changes, they are acceptable; you're already making other changes to the program, so you won't have trouble making a few more.

Comment on

The Tragic Death of Inheritance

Most of us have bad memories of over-complex hierarchies we regret seeing, but this is probably due to the dominance of OOP in recent decades.

This sentence here is why inheritance gets a bad reputation, rightly or wrongly. Inheritance sounds intuitive when you're inheriting Vehicle in your Bicycle class, but it falls apart when dealing with more abstract ideas. Thus, it's not immediately clear when and why you should use inheritance, and it soon becomes a tangled mess.

Thus, OO programs can easily fall into the trap of organizing code into false hierarchies. And a hierarchy that makes sense to the developer who wrote it may not make sense to the next developer reading the code.

I'm not a fan of OO programming, but I do think it can occasionally be a useful tool.
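To make the Vehicle/Bicycle point concrete, here is a rough Python sketch. All class names here (EmailSender, ReportViaInheritance, Report) are made up for illustration and are not from the article. It shows one hierarchy that reads naturally, one that is arguably false, and a composition version of the same thing.

```python
class Vehicle:
    def __init__(self, wheels):
        self.wheels = wheels

    def describe(self):
        return f"{type(self).__name__} with {self.wheels} wheels"


class Bicycle(Vehicle):
    # Reads naturally: a bicycle is a kind of vehicle.
    def __init__(self):
        super().__init__(wheels=2)


class EmailSender:
    # Hypothetical helper, only here to have something to inherit from.
    def send(self, to, body):
        return f"to={to}: {body}"


class ReportViaInheritance(EmailSender):
    # False hierarchy: a report is not a kind of email sender,
    # it was subclassed only to borrow send().
    def __init__(self, title):
        self.title = title

    def deliver(self, to):
        return self.send(to, self.title)


class Report:
    # Composition: the report holds a sender instead of being one.
    def __init__(self, title, sender):
        self.title = title
        self.sender = sender

    def deliver(self, to):
        return self.sender.send(to, self.title)
```

Both report classes deliver the same output, but only the inherited one claims to be an EmailSender, which is the kind of relationship the next reader has to puzzle over.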

Comment on

You might as well timestamp it

Ehhh, I don't quite agree with this. I've done the same thing where I used a timestamp field to replace a boolean. However, they are technically not the same thing. In databases, boolean fields can be nullable, so you actually have 3-valued boolean logic: true, false, and null. You can technically only replace a non-nullable boolean field with a timestamp column, because you are treating null in the timestamp as false.

Two examples:

  1. A table of generated documents for employees to sign. There's a field where they need to agree to something, but it's optional. You want to differentiate between employees who agreed, employees who disagreed, and employees who have yet to agree. You can't change the column from is_agreed to agreed_at.

  2. Adding a boolean column to an existing table. These columns need to either default to a value (which is fair) or be nullable.
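Here is a small sketch of the first example using Python's built-in sqlite3 module. The table layout, employee names, and state labels are assumptions I made up for illustration. The same three rows are read once through the nullable is_agreed column and once through agreed_at alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    " employee TEXT PRIMARY KEY,"
    " is_agreed INTEGER,"  # nullable boolean: 1, 0, or NULL
    " agreed_at TEXT)"     # the timestamp that would replace it
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [
        ("alice", 1, "2024-05-01T09:00:00"),  # agreed
        ("bob", 0, None),                     # explicitly disagreed
        ("carol", None, None),                # has not answered yet
    ],
)


def states_from_boolean(conn):
    # Three distinct states are recoverable from the nullable boolean.
    query = """
        SELECT employee,
               CASE WHEN is_agreed = 1 THEN 'agreed'
                    WHEN is_agreed = 0 THEN 'disagreed'
                    ELSE 'pending' END
        FROM documents ORDER BY employee
    """
    return dict(conn.execute(query).fetchall())


def states_from_timestamp(conn):
    # With only agreed_at, NULL has to stand for both 'disagreed' and 'pending'.
    query = """
        SELECT employee,
               CASE WHEN agreed_at IS NOT NULL THEN 'agreed'
                    ELSE 'not agreed' END
        FROM documents ORDER BY employee
    """
    return dict(conn.execute(query).fetchall())
```

With the boolean, bob and carol come back as different states. With the timestamp alone, they are indistinguishable.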

Comment on

parquet vs csv

Do you use it? When?

Parquet is really used for big data batch processing. It's a columnar file format optimized for large aggregation queries. It's not human-readable, so you need a library like Apache Arrow to read and write it.

I would use parquet in the following circumstances (or combination of circumstances):

  • The data is very large
  • I'm integrating this into an analytical query engine (Presto, etc.)
  • I'm transporting data that needs to land in an analytical data warehouse (Snowflake, BigQuery, etc.)
  • Consumed by data scientists, machine learning engineers, or other data engineers

Since the data is stored by column, a query like select sum(sales) from revenue only has to read the sales column, so it is much cheaper and faster if the underlying data is in parquet rather than csv.
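A toy illustration of why that is, using only the Python standard library. This is not how parquet is actually implemented (no encoding, compression, or row groups), and the sample data and function names are made up. It just counts how many fields each layout has to touch to answer sum(sales).

```python
import csv
import io

# Made-up sample standing in for the revenue table.
CSV_TEXT = (
    "region,product,sales\n"
    "west,widget,100\n"
    "east,gadget,250\n"
    "west,gadget,75\n"
)


def sum_sales_row_oriented(text):
    """CSV-style: every field of every row is parsed to reach one column."""
    total = 0
    fields_touched = 0
    for row in csv.DictReader(io.StringIO(text)):
        fields_touched += len(row)
        total += int(row["sales"])
    return total, fields_touched


def to_columns(text):
    """Pivot rows into one list per column, roughly how a columnar file lays data out."""
    columns = {}
    for row in csv.DictReader(io.StringIO(text)):
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_sales_column_oriented(columns):
    """Columnar-style: only the 'sales' column is read."""
    sales = columns["sales"]
    return sum(int(value) for value in sales), len(sales)
```

Both functions return the same total, but the row-oriented one touches every field in the table while the column-oriented one touches only the sales values. On a wide table with millions of rows, that gap is the whole point.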

The big advantage of csv is that it's more portable. csv as a data file format has been around forever, so it is used in a lot of places where parquet can't be used.