Why Empirical?

empirical is a platform where researchers can execute their experiments and share their complete research environment, including data, framework, solution model, and results.

I would like to start this blog with a little background on empirical and why I’m working on it.

First, a little bit about me: I’m Alan, the founder of empirical. I studied mechatronics in college and then started a PhD in AI, where I specialized in computer vision. During that time, I began consulting for a startup, optimizing computer vision algorithms that tag videos to produce interactive experiences. I left my PhD before finishing my dissertation to join the startup full-time, leading the tech side.

Science is inefficient

Both during my PhD as a researcher and in my job as a software engineer, I worked with empirical science and realized there are multiple inefficiencies slowing down modern science.

Discovery

Usually, before you jump into solving any problem, you start by surveying the different ways people have tried to solve it. However, in science, even when information is not locked behind paywalls, it’s not simple to distinguish which approaches are best. Searching for the best approach to your problem is like looking for a needle in a haystack: all the information is out there, but it’s buried under a pile of long, hard-to-read papers.

Reproducibility

Once you have found a few approaches that you want to try, there are no easy means to reproduce the research. Even when the source code and results are available, each project has its own framework with multiple dependencies. They’re implemented in different programming languages, in different environments, and on different operating systems. All of this makes it really cumbersome and time-consuming to test a new approach.

Experimentation workflow

Finally, there isn’t really an established way to setup an experiment so everybody kind of invents their own way. As a researcher, you usually have to write scripts to execute, log and compare results across multiple experiments. This currently is a manual, custom and repetitive process.

All these problems add up to long feedback loops and slow research iteration cycles, which often lead to subpar research.

There has to be a better way

Now imagine that you have a database where every experiment is cataloged and ranked by a performance metric relevant to the problem it solves (e.g., accuracy for object recognition or classification). You could easily see which approach is best for the problem you want to solve and start from there.
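As a toy sketch of that idea (the approaches and scores below are made up, and the catalog is just an in-memory list rather than a real database), discovery reduces to a query over a ranked collection:

# Hypothetical catalog entries; in practice this would be a shared database.
experiments = [
    {"approach": "baseline-cnn", "accuracy": 0.87},
    {"approach": "resnet-variant", "accuracy": 0.93},
    {"approach": "hog-svm", "accuracy": 0.78},
]

# "Which approach should I start from?" becomes a query, not a literature hunt.
best = max(experiments, key=lambda e: e["accuracy"])
print(f"Start from: {best['approach']} ({best['accuracy']:.0%} accuracy)")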

Then let’s say that you want to reproduce that approach. Instead of spending days setting up and configuring your environment, you could replicate the experiment with a single command, no matter whether you’re on Windows, Mac, or Linux, or what environment the authors were using when they developed it. If that were the case, then maybe you wouldn’t try just one approach but several, and see for yourself which one you like better.

Finally, say that you modify the approach and want to compare it against the original. You could do this with one command that pushes your experiment to the same database, where it is ranked among all the other approaches and all the versions of your own. Now your newly developed approach is there for somebody else to find and improve upon.

Why now?

“Software is eating the world” — Marc Andreessen

“We will move from mobile first to an AI first world.” — Sundar Pichai

Software is eating the world and AI is eating software. Artificial Intelligence is going to touch every industry in the near future, and developers are going to have to adopt it. This implies a big change in the way software is developed: algorithms are transitioning from being programmed to being trained, from deterministic to stochastic.

AI is an empirical science, and the tools we need for developing and testing it are very different from the ones used for web and mobile applications. Your tests are no longer pass/fail: you need to evaluate on a performance metric and, ideally, compare against a public benchmark.
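To illustrate the contrast, here is a minimal sketch; the numbers are made up, and benchmark stands in for a hypothetical public benchmark score:

def traditional_test(output):
    # Classic software test: deterministic, binary pass/fail.
    assert output == "expected"


def empirical_evaluation(predictions, labels, benchmark=0.92):
    # Empirical "test": measure a metric and report it against a benchmark.
    correct = sum(p == y for p, y in zip(predictions, labels))
    accuracy = correct / len(labels)
    print(f"accuracy={accuracy:.3f} ({accuracy - benchmark:+.3f} vs benchmark)")
    return accuracy


traditional_test("expected")
empirical_evaluation(predictions=[1, 0, 1, 1], labels=[1, 0, 0, 1])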

This means that inefficiencies that are affecting research in academia today will soon be affecting every other industry.

Introducing empirical

At empirical we want to build the tools scientists and engineers need to be productive when working with empirical science. A lot of work has gone into this already and so far we have:

  • A framework for portable computational experiments
  • An open-source client to optimize the development workflow and replication of experiments
  • An online dashboard to keep track of your experiments (currently in private alpha)

What’s next?

There’s still a lot of work to be done. Some things we’re currently working on include:

  • GPU support (Edit: done as of v0.5)
  • Build a framework for modular evaluation/benchmarking
  • Make it easy for organizations to collaborate internally
  • Hosting datasets and models
  • Allow running experiments in the cloud

Finally, we plan to support saving different types of results that can be uploaded and visualized on the dashboard. These results will make it possible to rank, sort, and compare experiments, which leads to our ultimate goal:

Use the results for better discovery. For any computable problem, be able to answer: What’s the best solution for this?

Get involved

If any of this sounds interesting, please reach out; we would love your feedback. You can use the tool and file issues, propose new features, or simply join the conversation on our Gitter chat.
