Laravel: 10 tips for your new database
From the obvious, to the guilty misconceptions
To me, the first and most important part to create an application is to know what to save, and how to save it. SQL Databases are the most used, battle-tested, and mature of all solutions around the web, and even embedded devices.
For that reason alone, I wanted to share my 10 tips that I have mentally engraven for long time when I have to deal with starting a database schema. It also serves me to archive and point to when someone needs help starting from zero.
1. Pen & paper first
Always try to make an schematic about your database tables and Models if you’re planning a MVC-style application. Make boxes, draw lines, and detail which tables are connected with which columns.
This may look like a kindergarten assignment, or feel like you’re wasting time if it’s already “in your head”, but having a clear representation of how the tables in your database relate to each other before any code hits a computer will make easy to make changes instead of refactoring multiple files.
Let’s be clear on this: refactoring a database already tied to the application will be more time demanding, and later changes will be more expensive than just redrawing a line with an eraser.
2. Be explicit
Don’t go saving some keystrokes on your tables or columns just because you know what data type is it. If a column stores an API token, then use api_token
as the name.
Going for cryptic names like for tkn
, or too short like token
, only messes up your ability to maintain the app in the longer term. Going for large names like api_token_string_for_service_in_cloud
will only make your editor display a nice ellipsis after some characters. In the age of software collaboration, you will only make your partners more confused.
This doesn’t mean to avoid the context of the table entirely. A column named article_title
is fine, but it’s better to name it just title
if is under the articles
table.
3. Be vertical
This may reminisce the first database management class in university, but not everyone has experience in software development. There are still people who make the error of expanding data horizontally.
Never, ever, try to expand information by adding more columns. Always try to create child records that reference the parent, even if it’s in the same table.
For example, adding columns like child_1
, child_2
and so on, should be avoided at all costs. Instead, do a children
table and reference the parents
for each record that it belongs to using parent_id
.
Not only this allows also a given record to scale back and forth in the number of children, but also remove the burder of retrieving the children with the SELECT * from parents
query.
4. Follow conventions
There are few important conventions for naming columns in a database. While these are not “standard”, it’s what I personally use to have a clear guess of the data type a column represents through its name, most of the time.
- Primary keys are always named
id
whatever the data type is. - Timestamps are past tense, suffixed
…_at
- Date, Time, and Datetime are not timestamps, name them freely.
- Booleans are always positive and start with
is_…
- Numeric columns should be plural, like
guesses
orfailures
- Lists as JSON columns are plural, and may end with
…_list
- Complex JSON trees can be named freely, but may end with
…_tree
- Foreign columns always end with
…_id
, likeauthor_id
- Foreign columns using non-id foreign columns can be suffixed as
…_as_id
- Multiple text columns may end with
…_body
- Multiple text columns for binary-encoded data may end with
…_data
- Binary columns always end with
…_blob
or…_binary
- Columns for encoded data always end with
…_encoded
- Columns for encrypted data always end with
…_encrypted
Again, these are not mandatory to follow, but when collaborating, explicitly naming these column with their purpose can help a lot of coworkers deal correctly with the column data without having to ask to someone to pick up the phone.
5. Don’t go overboard with data sizes
If you’re planning to shove in big columns (like big walls of text or raw binary data) in tables accessed constantly, most of the time these can be moved into their own tables to avoid taking them out by accident with the classic SELECT * FROM my_table
.
For example, let’s make a users
table with a column called biography
. Unless you have total control on the query at all times, which often is not, you can easily consume a lot of memory in your app when retrieving multiple users from the database, as a SELECT * FROM users
will also retrieve the biography of each.
May be an user_biographies
table will help you keep both separately, along a left-join to explicitly incorporate the big column.
6. Always think SQL-agnostic
Someone once told me, “marry or live free, there is no in-between”. Same happens with DB engines.
You may want to know the difference between SQLite, MySQL, MariaDB, PostgreSQL and Microsoft SQL Server, so you can choose the best option for your application. The true is: you will know once you need to.
If your project doesn’t require any exclusive feature from these databases, you may well create an application that is database-agnostic. Otherwise, you WILL know which feature can be useful to you, and marry your application to a vendor. For example, you may want to use MariaDB with Galera Cluster, or SQL Server for your Windows ecosystem.
There is no guarantee you will ever need to change database engine in the future. Sometimes, migrating from one cloud provider to other means changing versions of the same database engine. Sometimes you may found that what you’re migrating from it’s not longer offered, supported, or just costlier to operate. When these things happen, and you need to migrate as fast as possible, having an agnostic-SQL app helps a lot.
Starting with SQLite is faster, but consider always spinning up a real testing database to be 100% sure your application will work on production, as SQLite has many caveats that are not reproducible with a real database.
7. Have a backup plan
Even if your cloud provider offers backups, you may still want to test how to restore one. Don’t be shy and always try to go the whole round of backup-and-restore so when a catastrophe happens you’re not caught with your pants down. It will take less time and sweat checking what to do on testing rather than under a real scenario with thousands of clients.
About backups and snapshots, always try to check if the cloud provider supports incremental or differential snapshots. In either case, these are way better than complete snapshots as copying a whole database takes time.
In the worst case, you may want to execute an SQL script to save a dump of certain database data in cold storage, periodically, by querying certain records after a window of time, and keep it on the cheapest storage plan your cloud provider offers.
8. SQLite can do so much
SQLite is usually a good idea to start if you need to prototype a database instantly, but also consider that is not meant for production in 90% of cases. A lot of problems with SQLite comes from three things:
- It can be very slow outside a transaction.
- Testing in memory may be slower than other database engines.
- It only works with few types of columns, as other more complex are emulated.
The other 10% of applications that will use SQLite will be perfectly fine under the few data-types and not expecting fast speeds or concurrency guarantees. Hell, I even use it for my Larawiz site to create Laravel apps in 1 minute, and works flawlessly.
For that reasons alone, you may want to use it for a quick start, but after that you will have better piece of mind making the final test on a database engine instance.
9. Avoid enums
Enums have been largely a problem for databases when these columns need to change, like adding, removing or updating an option.
As general rule, keep Enums as variable character length column (varchar
) or integers if you are very sure you need numbers to relate to lengthy pieces of data (like large error messages), and use your app to enforce Enums, not the database.
This way you can change the enum dataset without modifying the database directly, and migrate old values with a very simple query. Otherwise, you’re bound to what shenanigans the database wants you to handle to modify an enum column.
10. Keep binaries under control
Binaries can be your friends, but most of the time these are not.
While binaries can be useful to store inert raw data, like cryptographic keys, small images, or even encoded text, it will be always difficult to query or compare since it’s binary data at the end of the day. Also, some databases run in circles if you use them as a primary key, or try to compare them at all outside a null
.
Personally, I use binary columns when I need the raw value of something. For example, a public key that is later fed to the app to check a the validity of a message signature.
You may always resort to saving binary data to the application storage that is too big and reference that through a column that points its exact location. In some scenarios it may not be the faster way to retrieve data, specially if you need to go through an external API (network ↔ file system), but your DB won’t suffer maintaining thousands of records of megabytes each.
If you’re curious about some good tips on databases, at least not deep enough to be lost, you may check these additional resources: