Research Notes with Org-Mode

Here's my workflow for math research notes (spoiler: lots of org-mode). I was inspired to write this by a recent blog post from the ever-popular Gilles Castel, My Mathematics PhD Research Workflow. I want to show off Org's capabilities as an outlining and research tool, in contrast with the more modular approach of Vim and its extensions.

1. Why Org?

And by extension, why Emacs? There are several good reasons.

  1. Emacs is long-term software. GNU Emacs has been around since 1985, and has a religious (or cult, depending on your editor preferences) following. Gwern explains in great detail what you should consider when picking software to work with, and convenience is not a consideration on their list.
  2. Org is made for language-agnostic document structuring. The creator, Carsten Dominik, explains why he made Org mode, and why its features contribute to the intended purpose of Org, "Note Taking and Project Planning". As I'll show, Org combines lightweight markup and LaTeX into something that is easy to type and can be distributed in a variety of formats, with advanced formatting available when you need it.
  3. Both parts of the duo are hackable and extensible. Your ability to change defaults ranges from changing keybindings to writing code that strings together existing functionality, or changes it altogether. The result is that you can turn your Emacs into a document creator that is as personalized as you would like and can do almost anything.

Now, if you read these and weren't convinced, don't worry, there's still hope for you. Org is not just an Emacs package: it is currently being implemented in Neovim, has syntax highlighting in many different editors, is a Pandoc-supported file format, and some people would like to see it become a standardized format for markdown-esque writing. So you still might be able to save yourself. But, if you feel Org is not right for you… keep reading.

2. A Crash Course

Org is a document outliner. It creates headings and subheadings that contain content. A headline can carry many properties, e.g. a marker that indicates the section's completeness, or a tag indicating what kind of content is there. On its own, Org is just a syntax. You could type it in any editor, but Org mode was created for Emacs and gives you many extra convenience functions there. A sample of what you could type into an Org file is below, written so that you know what kind of text you're looking at:

* Top-Level Headline
** Second-Level Headline                                               :tag1:
DEADLINE: <2022-04-17 Sun> SCHEDULED: <2022-04-10 Sun>

Here is some regular text, *bold text*, /italic text/, =code=, +strikethrough+, etc. I'm going to break the line. \\

 - here's a bullet point-ed list
   - sublisting is implemented

 | Pros of Org-mode             | More Pros                |
 | Org-mode can do tables       | with automatic alignment |
 | and spreadsheet capabilities | (but only in Emacs)      |

Insert inline math with LaTeX syntax, $\int_{\partial M} \omega = \int_M d \omega$ or use an equation environment. Here's a link [[Top-Level Headline][to a website, file, section in any org document... you name it]].

You can even put in blocks that are export-agnostic: whether the final output is LaTeX or HTML, this block of text is always centered.

There are some Emacs-specific capabilities, for example

#+BEGIN_SRC python :results output
  print("This is python code executable from the org file. It runs on your computer and you can have the result automagically put into your document.")
  print("This is an editor dependent thing; not all editors support it. But in Emacs, you even get syntax highlighting for languages!")
#+END_SRC

#+RESULTS:
: This is python code executable from the org file. It runs on your computer and you can have the result automagically put into your document.
: This is an editor dependent thing; not all editors support it. But in Emacs, you even get syntax highlighting for languages!

Emacs can also filter headlines based on their properties, for example whether a task is still marked TODO, or which tags it carries. You can track time on tasks via org-clock, which logs when you start and stop them. There is org-capture, which lets you insert into an Org document from anywhere in Emacs. You can also do way-out-there stuff, like make a Zettelkasten. Trust me, people have gone and continue to go nuts over Org, and new stuff is being added all the time with extensive backward compatibility, so notes that work now will continue to work in the years to come.

Another great feature of org-mode is that an Org file is plain text, so no specialized program is needed to read it. If you don't want to use Emacs for your org-mode features, use Python! Use Perl! Use anything you want, because all of it is made to be read by computers and people. This also makes it easy to version control, for example, through Git.
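To make the plain-text point concrete, here is a minimal Python sketch that pulls the headlines out of Org text using only the standard library. The function name parse_headlines and the list of TODO keywords are my own assumptions (Org lets you customize the keywords), and the regular expression only covers the simple headline form from the sample above, not the full Org syntax.

```python
import re

# Assumed TODO keywords; TODO and DONE are Org's defaults, but they are configurable.
TODO_KEYWORDS = ("TODO", "DONE")

# A headline: leading stars, optional keyword, a title, optional :tag1:tag2: suffix.
HEADLINE = re.compile(
    r"^(?P<stars>\*+)\s+"
    r"(?:(?P<todo>" + "|".join(TODO_KEYWORDS) + r")\s+)?"
    r"(?P<title>.*?)"
    r"(?:\s+:(?P<tags>[\w@#%:]+):)?\s*$"
)


def parse_headlines(text):
    """Return one dict per headline, in document order."""
    headlines = []
    for line in text.splitlines():
        match = HEADLINE.match(line)
        if match is None:
            continue  # body text, lists, tables, etc.
        tags = match.group("tags")
        headlines.append({
            "level": len(match.group("stars")),
            "todo": match.group("todo"),
            "title": match.group("title"),
            "tags": tags.split(":") if tags else [],
        })
    return headlines


sample = """\
* Top-Level Headline
** Second-Level Headline                                               :tag1:
Here is some regular text, *bold text*, /italic text/.
** TODO Read the paper                                          :reading:math:
"""

for headline in parse_headlines(sample):
    print(headline["level"], headline["todo"], headline["title"], headline["tags"])
```

From here, filtering for unfinished tasks is one list comprehension over the result, e.g. keeping the entries whose "todo" field equals "TODO".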

With some Org-mode syntax out of the way, let's talk about how to use it for research notes.

3. Writing Research Notes

There are many different inputs and outputs of research notes. You read papers, listen to talks, have conversations with other researchers, and in the end you have publishable papers and code to use. These all require very different needs in terms of recording information. I'll handle these cases and show how org-mode can give you functionality for all of them.

3.1. Note-Taking from Papers

At the most basic level, Org's hierarchical structure makes notes from different sources easy to organize: for example, put research topics on top-level headlines and individual documents on second-level headlines, and type away! You could also use tags to organize by project, TODO markers to show which papers you still need to finish reading, scheduling properties to plan when each paper should be read… etc.

Org's full compatibility with LaTeX ensures you can type any and all equations you want, and code blocks let you test code yourself (in small batches). Cross-referencing makes it easy to draw connections between works… etc.

You could link to a PDF of the paper on your computer for easy reference. There's even a package out there for annotating PDFs in org-mode… see 4.3.

3.2. Note-Taking from Lectures

While Vim is known for its speed of input and editing, you can get pretty fast with some extra Emacs packages; see 4.1. A mixture of lightweight markup and full-on LaTeX support lets you do things faster than, say, writing in complete LaTeX all the time (cough, cough, Gilles). But I think you need Emacs or an org-mode equivalent to do this well: I wouldn't recommend fast note-taking in org-mode in an editor without plenty of keyboard shortcuts.

Other than that, just type as fast as you can! I would not be surprised to find an Emacs package that makes it easy to draw commutative diagrams, the bane of every TeX'ed homework.

3.3. Document Drafting

Here's probably where I would use Org-mode the most. Org-mode in Emacs supports exporting the markdown-like elements to a variety of formats, including LaTeX, HTML, beamer presentations, and a host of others. That means your Org mode headlines get turned into \section commands, LaTeX equations get converted as is, and lists are formatted the right way.

But more powerfully, you can also include arbitrary LaTeX header code. That means you can \usepackage to your heart's content, redefine commands, or use your standard preamble, so there are no formatting losses from drafting in Org-mode. If you set up everything right (which takes some learning), you can completely reproduce any LaTeX document in Org. You heard me right, folks. Nothing is lost by using Org-mode, and nothing is gained by writing in LaTeX.

Additionally, if you have a big document of notes, you can write functions to export based on headline properties. Leave out every item that is still TODO rather than DONE, or produce separate drafts or multiple versions based on the :tag: in the headline. For example,

* Introduction                                                          :AMS:
* Abstract                                                           :Annals:
* Bounds in Sobolev Space                                               :AMS:
* TODO Weak Compactness Results                                  :Annals:AMS:
** A New Proof of Banach-Alaoglu
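Inside Emacs, Org's exporter can already do this kind of selection with its select and exclude tags. Just to illustrate the idea, here is a rough Python sketch that keeps only the subtrees carrying a given tag, treating tags as inherited by sub-headlines the way Org does. The name select_by_tag is hypothetical, the headline pattern is simplified, and any text before the first headline is dropped.

```python
import re

# Simplified headline pattern: stars, title text, optional :tag1:tag2: suffix.
HEADLINE = re.compile(r"^(\*+)\s+(.*?)(?:\s+:([\w@#%:]+):)?\s*$")


def select_by_tag(text, tag):
    """Keep only subtrees whose headline has `tag`, directly or by inheritance."""
    kept = []
    tag_stack = []  # tag_stack[i] holds the tags of the open headline at level i + 1
    keep = False
    for line in text.splitlines():
        match = HEADLINE.match(line)
        if match:
            level = len(match.group(1))
            own_tags = set(match.group(3).split(":")) if match.group(3) else set()
            # Close every headline at this level or deeper, then open this one.
            del tag_stack[level - 1:]
            while len(tag_stack) < level - 1:
                tag_stack.append(set())  # pad if the outline skips a level
            tag_stack.append(own_tags)
            keep = any(tag in tags for tags in tag_stack)
        if keep:
            kept.append(line)
    return "\n".join(kept)


draft = """\
* Introduction                                                          :AMS:
* Abstract                                                           :Annals:
* Bounds in Sobolev Space                                               :AMS:
* TODO Weak Compactness Results                                  :Annals:AMS:
** A New Proof of Banach-Alaoglu
"""

print(select_by_tag(draft, "Annals"))
```

With the headlines above, asking for "Annals" keeps the abstract and the weak compactness section together with its untagged subsection, since the subsection inherits its parent's tags.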

And now you can also project plan! Schedule and deadline specific sections so that you are not overwhelmed by keeping a huge document updated.

You can also insert references that get exported to BibTeX! You need the org-contrib package and a reference manager; see 4.2.

4. Packages and Tools

Org-mode by itself is nice, but extra functionality can be helpful. I'll show off packages that speed up LaTeX input, manage and insert references, and annotate PDFs.

4.1. LaTeX Input: CDLaTeX

For direct LaTeX editing in Emacs, there is a package for quickly inserting predefined LaTeX symbols, macros, and environments called CDLaTeX (CD for the aforementioned Carsten Dominik). Org-mode comes with similar built-in functionality, called org-cdlatex-mode, which allows CDLaTeX keybindings to work in Org-mode. I'll just give a sample, and refer to a thorough explanation here.

For example, fr<TAB> puts in a fraction macro, ` puts in math symbols, and ' puts in formatting macros (bold, italics, \mathbb, etc.), and you can tab through the various fields to move past the curly braces. I can put these together to type fr<TAB>`a<TAB>`b<TAB>`i Q'b to get \(\frac{\alpha}{\beta}\in \mathbb{Q}\), using my predefined keys. This is so much faster than, say, Overleaf's autocomplete, the caveat being that it needs to be ingrained into muscle memory first. So some work now for a payoff later… much later (perhaps the rest of your career).

4.2. Managing and Inserting References: Zotero and Ivy-BibTeX

This workflow was introduced to me by my wonderful friend, Anish. Zotero is a desktop application that displays a searchable and categorizable list of published works you have found. There's even a browser extension which can automatically find BibTeX entries and PDFs of papers from the website you're on (say, arXiv). There's a menu option for exporting every reference in BibTeX or BibLaTeX format to a classic .bib file. This pairs nicely with a package called ivy-bibtex, which gives an Emacs function to do a variety of things with a pre-specified .bib file. You can fuzzy-find a reference (i.e. you don't need to type it in exactly), insert a cite command with the correct key, insert a formatted bibliography entry for the reference, open the stored PDF, open the link where it was found, etc… This makes reference management painless.

I should mention, Zotero is quite overkill for this one little task; the same could be done with a shell script. For example, in Bash,

#!/usr/bin/env bash

# Append every .bib file given as an argument to one master bibliography.
for file in "$@"; do
    if [ -f "$file" ]; then
        cat "$file" >> ~/mybibfile.bib
        echo "inserting references of $file into ~/mybibfile.bib"
    else
        echo "Couldn't find file $file"
    fi
done

Save the file as bibly, make it executable, and put it somewhere on your PATH. Now you can do stuff like bibly file1.bib file2.bib, or you could make bibly the default program to open .bib files, in which case, when you download a .bib file, say from Google Books or arXiv, the entry is automatically added to your .bib file. Then you just hook up ivy-bibtex to use this single massive file, and it just works.
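One caveat with plain appending: if you open the same download twice, the entry lands in the master file twice, and BibTeX complains about repeated entries. Below is a rough Python sketch of the same idea that skips citation keys already present. The function names are hypothetical, the sample entries are made-up placeholders, and the entry detection is a simple regular expression rather than a real BibTeX parser, so anything sitting between two entries (comments, @string definitions) travels with the entry before it.

```python
import re
from pathlib import Path

# Matches the start of an entry such as "@article{somekey," and captures the key.
ENTRY_START = re.compile(r"^@(\w+)\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)


def entry_keys(bib_text):
    """Return the set of citation keys found in a chunk of BibTeX text."""
    return {m.group(2) for m in ENTRY_START.finditer(bib_text)}


def split_entries(bib_text):
    """Split BibTeX text into (key, entry_text) pairs, one per entry."""
    starts = list(ENTRY_START.finditer(bib_text))
    entries = []
    for i, m in enumerate(starts):
        end = starts[i + 1].start() if i + 1 < len(starts) else len(bib_text)
        entries.append((m.group(2), bib_text[m.start():end].strip()))
    return entries


def merge_bib(new_text, master_text):
    """Append entries from new_text whose keys are not already in master_text."""
    seen = entry_keys(master_text)
    merged = master_text.rstrip("\n")
    for key, entry in split_entries(new_text):
        if key in seen:
            continue  # already in the master file, skip the duplicate
        seen.add(key)
        merged += ("\n\n" if merged else "") + entry
    return merged + "\n"


def bibly(paths, master_path):
    """File-based wrapper: merge each file in paths into master_path."""
    master = Path(master_path)
    text = master.read_text() if master.exists() else ""
    for path in paths:
        text = merge_bib(Path(path).read_text(), text)
    master.write_text(text)


# Placeholder entries, only for demonstration.
master = "@book{example1984,\n  title = {A Placeholder Book},\n}\n"
download = (
    "@book{example1984,\n  title = {A Placeholder Book},\n}\n\n"
    "@article{example2020,\n  title = {A Placeholder Article},\n}\n"
)
print(merge_bib(download, master))
```

Running the merge a second time with the same download leaves the master text unchanged, which is the property the plain cat version lacks.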

4.3. PDF Annotation: Org-Noter

Anish turned me onto this one as well. Org-Noter is an Emacs package that displays a PDF and lets you make page-by-page annotations of it in an Org file, while navigating either the PDF or the Org file. It's the digital equivalent of taking margin notes in a book, but with full Org mode capabilities, like LaTeX support, cross-referencing, etc. I'll leave it to you to decide how to use this best.

5. My Workflow

Here I'll show how I am writing a PhD thesis and taking notes in the same Org file.

5.1. Ground Floor: OS and Desktop

I run off of two laptops, a refurbished 2020 ThinkPad and a 2015 Dell Inspiron. I run Arch Linux and don't play any games on them, so the two laptops do everything just about the same, modulo hardware differences. I use Emacs as a desktop through EXWM, a package that turns Emacs into a complete desktop on Linux. I love it.

I sync research notes and homework via Git through GitHub. This lets me work on the same document on two different computers. In the event of conflicts, I can choose which version to use via Git, and in the event of catastrophic failure of a laptop, all my notes are backed up. This aspect is still a work in progress, but does a good job, I think.

5.2. First Floor: File Organization

I have two main directories for schoolwork: one for classes and one for research. In the research directory, I have folders organized by project, and in each of them, a single file. In this one file, I collect everything on the project: I take org-noter notes, write up sections for publishable papers, and record notes from meetings with my advisor. It all gets hierarchically organized in roughly the final document structure. Utility functions like org-refile allow me to move sections and whole trees of content between headlines, so that there are no errors or slowdowns from copy-pasting information.

I keep an external file for my personal LaTeX commands: samsymb.sty. It does formatting, common commands, and document structure all in one file. Then at the top of my Org file, I put #+LATEX_HEADER: \usepackage{../samsymb.sty}, and all of a sudden, my LaTeX export works like magic. I have these files in my org-agenda-files so that I can see which sections still need attention.

5.3. Top Floor: Content Entry

I use org-cdlatex all the time and have customized its commands quite a bit. I rebound 'b from \mathbf to \mathbb, which is a crazy improvement. I commonly cite with ivy-bibtex, which requires a \usepackage in samsymb.sty, or a command at the top of the Org file.

5.4. The Final Product

I can then export to LaTeX and compile with my compiler of choice (LuaLaTeX), using latexmk so that references resolve. In the end, I get a professional PDF, and it was painless to write, without having to mess too finely with LaTeX or its editing.

6. What Else is Org Good For?

Many, many things: a TODO list, blogging, the previously mentioned Zettelkasten… But since org-mode and Emacs are programmable, you can make it do anything a computer does. Autocomplete and code checking, pushing your files to a server, automating git commands… the list goes on.

One plan I have for myself is, when I finish a research project, to export my big Org file of research to a webpage and let interested parties peruse it for their own research.

7. Edits and Discussion

  • <2022-04-12 Tue> Grammar and spelling edits
  • <2022-04-11 Mon> Irreal Post

Author: Sam Wallace

Created: 2022-04-12 Tue 09:15