# How to estimate the number of false positives

### Prelude: the topological gap protocol preprint

In arXiv:2207.02472 Microsoft Quantum reported validating and then applying the topological gap protocol. I summarized my overall opinion about it in a commentary. However, even after finishing the commentary, I was confused about how the authors estimate the chance that they found false positives. Because I am not trained in statistics, I've used this opportunity to learn how to do such an analysis properly. Here's what I found.

# Qsymm workflow

Qsymm is a great tool, developed here in the group, but not everyone is familiar with it. This Qsymm tutorial is designed to change that. Qsymm already has extensive documentation and a few tutorials on Read the Docs, but this tutorial shows a start-to-finish workflow. It covers using the nonhermitian branch of Qsymm to create real-space Hamiltonians, saving and loading models, testing models for various symmetries, and using models in Kwant systems.

# How I designed the cover of my Ph.D. thesis

I will hand out around a hundred copies of my thesis to my defense committee, colleagues, family, and friends. Let's be honest, most people will probably not get further than attempting to read the summary and appreciating the cover. Four years of work have gone into the content of the thesis, so I figured that at least some thought should go into the design of its cover. Unfortunately, I am by no means an expert in graphic design, or even competent enough to attempt to use any kind of graphic design software. But luckily for me, I do consider myself an expert in Python, so why not make the thesis design a fun process?

# Moving a theory group online

Most people are experiencing a major disruption in their routine because of the COVID-19 pandemic and the social distancing measures that counteract it. Research groups like ours are no different, although as a theory group we have it much easier: we don't have lab equipment to look after.

Here is the situation: you are leading a research group, or perhaps you are a member of one. A pandemic strikes, and everyone is told to work from home. What should you do?

I love open science. Since you are reading a scientific blog, I believe it is likely that you also support many open science ideas. Indeed, easy access to publications, code, and research data makes research easier to reuse, while also ensuring transparency of the process and better quality control. Unfortunately, the academic community is extremely conservative, and it takes forever for new standards to become commonplace.

There is, however, one easy way to help …

### Let's set the scene

You're a researcher doing numerical modelling. You're an old hand. You use Python (and therefore are awesome).

Your days are spent constructing mathematical models, implementing them in code, and exploring the models as you change different parameters. You realize that simplicity is key, so you make sure to write your models as pure functions of the parameters, maybe a little something like this:

    def complicated_model(x, y):
        ...
        return result


Beautiful; now it's time to do some science! Of course, you'll want to plot your model as a function of the parameters. Python makes this super simple but, as we'll see, this simplicity has a price.
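As a minimal sketch of the simple approach hinted at above, here is one way to evaluate a pure two-parameter model on a regular grid before plotting it. The body of `complicated_model` is a hypothetical stand-in (the real model is elided in the text), and `linspace` and `evaluate_on_grid` are illustrative helper names, not part of any library.

```python
import math


def complicated_model(x, y):
    # Hypothetical stand-in for the elided model body.
    return math.sin(x) * math.exp(-y ** 2)


def linspace(start, stop, num):
    """Return `num` evenly spaced points from start to stop (inclusive)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def evaluate_on_grid(model, xs, ys):
    """Call model(x, y) at every grid point; one row per value of y."""
    return [[model(x, y) for x in xs] for y in ys]


xs = linspace(0.0, math.pi, 5)
ys = linspace(-1.0, 1.0, 3)
values = evaluate_on_grid(complicated_model, xs, ys)
```

Note that the number of model calls is `len(xs) * len(ys)`, so refining the grid in both directions quickly becomes expensive when the model itself is slow.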

# In the footsteps of Einstein

I teach the undergraduate solid state physics course, where we just switched to a shiny new book, "The Oxford Solid State Basics" by Steve Simon.

Steve's story of condensed matter physics starts with the heat capacity of solid materials. It's a great way to dive into how quantum mechanics combines with lucky guesses to improve our understanding of what is happening. It is also what we do in our course.

A great source of experimental data showing the problem is Einstein's original work, and Steve's book reproduces the plot from Einstein (see also the English translation). Unfortunately, that plot belongs to the current publisher of Annalen der Physik and cannot be republished under a free license. So, in order to provide this data in the lecture notes and to make it available to anyone who wants it, I decided to take the original data Einstein used and repeat the exercise. Because we are living in an enlightened age, I also wanted to see if the more advanced Debye model would fit Einstein's data any better.
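For reference, here is a minimal sketch of the two textbook models being compared, written as heat capacity per atom in units of $3k_B$. The function names, the parameters `theta_E` and `theta_D` (the Einstein and Debye temperatures), and the simple midpoint integration are my own illustrative choices, not the code used for the actual fit.

```python
import math


def einstein_heat_capacity(T, theta_E):
    """Einstein model: C / (3 k_B) = x^2 e^x / (e^x - 1)^2 with x = theta_E / T."""
    x = theta_E / T
    # Equivalent form with e^{-x}, which avoids overflow at low temperature.
    return x ** 2 * math.exp(-x) / math.expm1(-x) ** 2


def debye_heat_capacity(T, theta_D, steps=2000):
    """Debye model: C / (3 k_B) = 3 (T/theta_D)^3 * integral_0^{theta_D/T} of
    x^4 e^x / (e^x - 1)^2 dx, evaluated here with the midpoint rule."""
    x_max = theta_D / T
    dx = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx  # midpoints never hit the x = 0 endpoint
        total += x ** 4 * math.exp(-x) / math.expm1(-x) ** 2 * dx
    return 3.0 * total / x_max ** 3
```

Both functions approach 1 at high temperature (the Dulong-Petit law), while at low temperature the Einstein result is exponentially suppressed and the Debye result falls off as $T^3$.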

# Machine learning analysis of scientific articles

Neural networks have an advantage over humans because they have access to a much larger body of information. This is why what looks like random noise to a human turns out, after correct processing by a machine learning algorithm, to be a signature of the Higgs boson.

While analysing physics data with machine learning is definitely a great direction of research, another intriguing possibility is trying to infer what the researchers themselves think. For example, the field of sentiment analysis tries to extract not only the information contained in a text, but also the author's attitude towards this information.