Org ID, Org Attach & Better Folder Names

You might have heard of org-mode header IDs. By default, these are Universally Unique Identifiers (UUIDs). In this post, I want to talk about what they are, why I use them (and you should, too), and how to make them into slugs: human-readable IDs that make more sense. This will be a long explanation of what I discovered in org-mode, so buckle up!

What are UUIDs and why should I care?

The org-mode manual doesn’t exactly point at header IDs unless you know where to look. Because of this, a casual user of org-mode might overlook these powerful organizational tools.

In org-mode, we can link directly to a header inside a file: <path to file>/parent header/header. This is risky, though, because we refile (move) headers between files: old projects get archived, meeting notes get moved under other relevant projects, and so on. Any link that uses a path like the one above breaks as soon as its target moves¹.

For this reason, we have header IDs: Org-IDs. They require that we include org-id in our init file (as part of org-mode’s modules), and then we can generate them for any header by calling org-id-get-create:² “create an ID for the current entry and return it. If the entry already has an ID, just return it.”

By default, org-id-get-create creates a string of random numbers and letters: a UUID. Now, every time we want to reference the header, we link to the UUID instead of the header’s path, like so: [[id:our UUID here][description here]].
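To make this concrete, here is a minimal sketch for an init file. The command name my/org-copy-id-link is made up for illustration and is not part of Org; it assumes point is somewhere under a header when you call it:

    (require 'org-id)

    ;; Hypothetical helper, not part of Org: make sure the current header
    ;; has an ID, then put a ready-made ID link on the kill ring.
    (defun my/org-copy-id-link ()
      "Copy an ID link to the current header, creating the ID if needed."
      (interactive)
      (let ((id (org-id-get-create))             ; reuses an existing ID
            (title (org-get-heading t t t t)))   ; text without tags, TODO, etc.
        (kill-new (format "[[id:%s][%s]]" id title))
        (message "Copied ID link to \"%s\"" title)))

Yank the result anywhere in your org files and the link keeps working after the header is refiled.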

This is great, but these UUIDs are a string of random characters that make no sense to us. A header with an ID like 05576976-a33c-11ec-9da6-020017000b7b doesn’t tell us anything. We will return to this problem in a bit, because first I want to discuss another great (and perhaps overlooked) org-mode feature that makes this even more problematic.

Referencing Files with Org-Attach

Org-mode comes with a built-in attachment mechanism called org-attach. You can summon it with C-c C-a when inside org-mode under a header. Org-attach was one of the features I always knew existed but never used.

We already have a system to navigate and organize our files ingrained deep into our mind, be it Finder, Windows Explorer, or whatever GUI file manager we have in front of us. For us dedicated Emacs users, there’s of course the excellent Dired.

What’s more, org-attach seems to use a weird system that stores our files deep in folders that don’t make sense to us, and the only way to find them later is to use org-attach again to go to these attachments (by calling org-attach with C-c C-a, then choosing f or F, depending on the case). At least, that’s what I thought.

Assuming you haven’t tweaked your org IDs, org-attach will nest an attachment under three folders. First, the default data folder, which is where all the attachments are stored. Second, under it, a folder with the first two characters of the header’s UUID. Then, third, a folder with the rest of that UUID as its name.

This is a bit confusing, so let’s break it down a bit. Again, the following workflow assumes you don’t have any settings affecting org-attach. Launch Emacs with -q if you have any doubts:

Go visit an org file, and navigate to one of its headers.
Summon org-attach with C-c C-a. This will bring up a menu. For now, just use the first option, a.
Next, org-attach will ask you which file you want to attach. Navigate to one and select it. Don’t worry, this will create a copy. The original will stay where it is.
With the point on the header you just worked with, which should now have an :ATTACH: tag, summon org-attach again, but this time call option F, which will open the directory the file is in with Dired.

What you’re going to see is that you’re inside your org file’s folder, inside a data folder, and then inside a weird two-letter folder, and then inside a long string representing the rest of the UUID. Something like this: /home/user/orgfiles/data/ab/f4b2cf-4b38-45ec-9333-346b42861d24.
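If you want to see where that split comes from, you can evaluate the helper org-attach uses for UUIDs yourself (with M-: or in the scratch buffer). This is only a sketch: the UUID below is simply the one from the example path above, glued back together.

    (require 'org-attach)

    ;; Split a UUID the way org-attach does for its default folder layout:
    ;; the first two characters become the parent folder, the rest the child.
    (org-attach-id-uuid-folder-format "abf4b2cf-4b38-45ec-9333-346b42861d24")
    ;; => "ab/f4b2cf-4b38-45ec-9333-346b42861d24"

The result is relative; org-attach places it under the data folder next to your org file.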

I’d argue that this way of storing attachments pushes newcomers to Emacs toward other methods for keeping their files. It looks odd, and it doesn’t make sense when you need to find something later. That’s too bad, because you miss out on org-mode’s excellent ability to organize your projects with their files attached right to them. Then again, this habit tends to come only after years of using org-mode (at least that’s how it was in my case).

If only you could make these folders make sense. Well, this is Emacs. Of course you can.

From Org-IDs to Timestamp Slugs

The idea is simple: change how these IDs are created in org-mode by modifying org-id-method. By default, its value is uuid. But we can change it to ts³: (setq org-id-method 'ts). That’s it. The next time we create an ID using org-id-get-create, it will produce something like 20220315T083403.413614. Still a bit confusing to read, but much better than UUIDs! The format is: year, month, day, followed by T for time, and then the current time down to the fraction of a second. Why do we need it broken down to these tiny time fractions? So that we get different org-ids even if we create several of them in quick succession.
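You can try the timestamp method without touching your configuration. The sketch below binds org-id-method only for the duration of one call. The format string on the last line is my assumption about what the ts method uses internally (recent Org versions expose it as org-id-ts-format), so treat it as an illustration rather than a reference:

    (require 'org-id)

    ;; Generate one timestamp-style ID without changing any setting.
    (let ((org-id-method 'ts))
      (org-id-new))
    ;; returns a string shaped like "20220315T083403.413614"

    ;; Roughly the same shape built by hand: date, a literal T, time,
    ;; and microseconds (%6N) after the dot.
    (format-time-string "%Y%m%dT%H%M%S.%6N")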

This will set up our unique IDs as timestamps⁴, but we still need to configure org-attach to use them. Because org-mode is set up to use UUIDs by default, org-attach is set to create directories that are meant to work with UUIDs. The directory structure is determined by org-attach-id-to-path-function-list. Specifically, it points to two functions: org-attach-id-uuid-folder-format and org-attach-id-ts-folder-format. You can go into org-attach.el and see that they break down the folder structure in a pretty straightforward way: the UUID function (which is the one used by default) takes the first two characters of the UUID and makes a parent folder out of those (as seen above), while the ts function takes the first six. The first six characters make sense because they include the year and the month.

By default, if we use the above example of 20220315T083403.413614 as a timestamp, we will get the following directory structure: /home/user/orgfiles/data/20/220315T083403.413614. The two-character parent folder holds only the first two digits of the year, 2022. This is not very useful, as you would need to use org-mode until the year 2100 for a new sub-folder to be created!
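The difference between the two functions is easy to check by feeding both of them the same example timestamp:

    (require 'org-attach)

    ;; The UUID function cuts after two characters...
    (org-attach-id-uuid-folder-format "20220315T083403.413614")
    ;; => "20/220315T083403.413614"

    ;; ...while the ts function cuts after six (year and month).
    (org-attach-id-ts-folder-format "20220315T083403.413614")
    ;; => "202203/15T083403.413614"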

This is what happened to me, and it required some head-scratching and diving into org-attach to figure out. I tried to mess around with the functions in org-attach directly, but that didn’t go well. Eventually someone on IRC pointed me to what I missed: what needs to be changed is org-attach-id-to-path-function-list. It is as simple as changing the order of the functions on this list, so org-attach will know to use the ts function first.

Together with org-id-method, which we defined above, we can write the whole thing like so:

    (setq org-id-method 'ts)
    (setq org-attach-id-to-path-function-list
          '(org-attach-id-ts-folder-format
            org-attach-id-uuid-folder-format))

Now when you use org-attach, it will use the ts function and create the following directory (to use the example above): /home/user/orgfiles/data/202203/15T083403.413614. This makes much more sense. You could also build your own function that would look like org-attach-id-ts-folder-format, perhaps for using the first 4 characters to create a parent directory for the year only. You will just need to make sure your custom function shows up first in org-attach-id-to-path-function-list.
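As a rough sketch of that last idea, here is what a year-only variant could look like. The function name my/org-attach-id-year-folder-format is hypothetical (it does not exist in Org), and the code assumes your IDs are timestamps that start with a four-digit year:

    (require 'org-attach)

    ;; Hypothetical custom function: use only the year as the parent folder.
    (defun my/org-attach-id-year-folder-format (id)
      "Translate a timestamp ID into a YYYY/rest folder path."
      (if (> (length id) 4)
          (format "%s/%s" (substring id 0 4) (substring id 4))
        id))                        ; too short to split, use it as-is

    ;; Put the custom function first so org-attach tries it before the others.
    (setq org-attach-id-to-path-function-list
          '(my/org-attach-id-year-folder-format
            org-attach-id-ts-folder-format
            org-attach-id-uuid-folder-format))

    (my/org-attach-id-year-folder-format "20220315T083403.413614")
    ;; => "2022/0315T083403.413614"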

A few Extra Things

I mentioned I used an excellent package called org-super-links. In a nutshell, this package automates creating org-IDs, linking to an org header by its ID, and creating a backlink from that header to the one the link came from. You should read Karl’s post about it and how he uses it to get a better idea than what I’m letting on here if you’re interested. As a matter of fact, if you want to get some more background, read the previous post mentioned above and you’ll see I’ve been trying to change org-IDs into timestamp slugs for a while. So much so, in fact, that I wrote my own function to do that for me until I discovered org has a system built-in already. As it goes with Emacs though, there’s no wrong answer. The previous attempt took me deeper into Elisp, which was a fun learning experience in itself.

With org-super-links, the process above is quicker because I don’t need to deal with org-id-get-create. I just search for the header I want to link to, and everything’s created automatically: an ID for the header I’m on, an ID for the header I’m linking to, a link to that ID at point, and a backlink at the header I am linking to, pointing back to the header I’m on. You may need to read that last sentence again. The bottom line is that if you’re serious about organizing your org headers, you should probably check out this package.

Another post that’s been floating around since 2017 by Matthew Lee Hinman approaches org-mode IDs from a different angle: exporting to HTML. In the post, Hinman explains that if you use IDs in your org file, you will also benefit when you export it to an HTML file: the header links you will use will link where you need them to go, and that’s even after you move headers around.

I’m actually using HTML more and more at work when I want to export my org-mode files into KB articles that go into wikis. This is especially helpful when you include a table of contents: the headers in the TOC will not break if you use IDs. This is maybe a bit more of a niche use, but having a TOC in a how-to article makes a lot of sense, and org-mode creates one for you by default when you export to HTML.

Footnotes

¹ As a matter of fact, I think breaking links to headers and losing information is one of the reasons for org-roam’s popularity. Of course, the new versions come up with a lot more than just linking notes across a database, but at its core, I think this is why people started adopting it.

² In newer versions of Org, storing a link with org-store-link (commonly bound to C-c l) can automatically generate an org-id for a header, if one doesn’t exist, provided that org-id-link-to-org-use-id is set to t. This can be done in the init file in Emacs.

³ Emacs documentation specifies a third method, org: “Org’s own internal method, using an encoding of the current time to microsecond accuracy, and optionally the current domain of the computer. See the variable ‘org-id-include-domain’.” This generates what seems to be a random string of text that is also not human-friendly. I’m not sure about the computer’s domain part, but this might be interesting for folks who have several computers on a domain using Emacs.

⁴ Giving headers unique IDs as timestamps is also useful because it leaves “clues” to help locate lost information later. For example, when you create a project with this unique ID, and later on you forget what it was, you can use the date to clue you in. It gives you another layer of search on your agenda (“202203” for example) to show all the projects created in March, provided you create an ID for each project, which you totally should. Because this is a simple text string, you can also use this outside of Emacs with other scripts to automate tasks that will look for this ID. It opens a world of options now that you have a range of unique IDs that you understand in your head, AKA, slugs. Also, it just looks better.