I’ve recently spent a good few hours learning to use a Python package for gene regulatory network prediction. I was finally successful tonight, but it took me a lot of debugging. I thought it would be helpful to share some general tips I’ve learned over the years about how to avoid going crazy when using other people’s code:
Read the documentation first about how to install and use the package. If you’re lucky it will be clearly explained. Unfortunately this is usually not the case, but reading incomplete documentation is better than going full LEEROY JENKINS.
Installing the software (and its dependencies) is half the battle. Always set things up in a new Conda environment so you don’t nuke anything by mistake.
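A quick guard at the top of a script can catch the moment you forget to activate the environment. This is only a sketch: it relies on the `CONDA_DEFAULT_ENV` variable that Conda sets on activation, and the helper names `active_environment` and `is_isolated` are made up for illustration.

```python
import os


def active_environment(environ=None):
    """Return the name of the active Conda environment, or None.

    Conda sets CONDA_DEFAULT_ENV when an environment is activated.
    """
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV") or None


def is_isolated(environ=None):
    """True if a Conda environment other than 'base' is active."""
    name = active_environment(environ)
    return name is not None and name != "base"


if not is_isolated():
    # Warn rather than abort: this is a reminder, not a hard requirement.
    print("Warning: no dedicated Conda environment appears to be active.")
```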
If you get stuck during installation, Google the error messages, and if you still can’t figure it out, delete everything and try again. If it still doesn’t work, email the authors and pray.
Don’t be afraid to make changes to the code to get things to run — but always keep a backup of the original!
Did I mention backups are important?
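One way to make the backup habit automatic is a tiny helper that copies a file before you touch it. This is a minimal sketch: the `.orig` suffix and the `backup_file` name are my own choices, not a convention of any particular package.

```python
import shutil
from pathlib import Path


def backup_file(path):
    """Copy `path` to `<path>.orig`, unless that backup already exists.

    Skipping an existing backup means the pristine original is never
    overwritten by an already-patched copy.
    """
    source = Path(path)
    backup = source.with_name(source.name + ".orig")
    if not backup.exists():
        shutil.copy2(source, backup)  # copy2 also preserves timestamps
    return backup
```

Call it once on each file you are about to edit. Calling it again later is harmless, because the first backup is kept.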
Always make notes about how you installed the package, and where you got any auxiliary files. You will likely need to repeat the process at some point in the future (for example, if you’re showing someone else how to do it).
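Part of those notes can be generated rather than typed. The sketch below records the Python version and the installed version of each package you name, using the standard library’s `importlib.metadata`. The function names and the example package list are assumptions for illustration.

```python
import platform
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_notes(packages):
    """Build a plain-text summary suitable for pasting into your notes."""
    lines = [f"python {platform.python_version()}"]
    for name in packages:
        lines.append(f"{name} {installed_version(name) or 'NOT INSTALLED'}")
    return "\n".join(lines)


# Example package names only; list whatever your tool actually depends on.
print(environment_notes(["numpy", "pandas"]))
```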
If you get an error when you call a function from the package, follow these steps in order:
Identify the problem. What function is causing the error? What is the immediate cause?
Check for a simple solution. (Spelling error / typo, version mismatch, other silly mistakes)
Check if others have had this problem and solved it. (Google, StackOverflow, etc.)
Check the documentation. What is the root cause of the error? How can it be avoided?
Consider alternatives. For example, if there’s an error saving a file as a particular format, try saving as a different format.
Perform a detailed inspection of the code that’s giving the error, and modify it as needed to fix it.
Write your own code to perform the function without error.
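For the first step, identifying which function fails and why, Python’s `traceback` module can pull out the innermost frame for you. This is a rough sketch: `locate_error` is a hypothetical helper, and the two functions below it are stand-ins for whatever the package actually calls.

```python
import traceback


def locate_error(func, *args, **kwargs):
    """Call func; if it raises, report where the exception came from.

    Returns None when the call succeeds.
    """
    try:
        func(*args, **kwargs)
    except Exception as exc:
        # The last traceback entry is the frame where the exception was raised.
        innermost = traceback.extract_tb(exc.__traceback__)[-1]
        return {
            "type": type(exc).__name__,
            "message": str(exc),
            "function": innermost.name,
            "file": innermost.filename,
            "line": innermost.lineno,
        }
    return None


# Hypothetical stand-ins for a package function and its internal helper.
def parse_header(path):
    raise ValueError(f"unexpected header in {path}")


def load_expression_matrix(path):
    return parse_header(path)


report = locate_error(load_expression_matrix, "counts.tsv")
print(report["type"], "raised in", report["function"])
```

Here the report points at `parse_header` rather than the function you called, which is usually the more useful thing to search for or inspect.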
If you run out of RAM, download some more. Or in my case, ssh into my university’s high-performance computing environment, install everything again, and re-run the job, this time requesting 250GB.
Always do a sanity check on your final results. If they seem weird, you probably misconfigured something.
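A sanity check doesn’t have to be elaborate. As a sketch, suppose the results are a list of (regulator, target, score) edges. That format is my assumption for illustration, not the output of any specific package. A few cheap checks already catch the most obvious misconfigurations:

```python
import math


def sanity_check_edges(edges):
    """Return a list of human-readable problems found in the predicted edges.

    `edges` is assumed to be a list of (regulator, target, score) tuples.
    """
    problems = []
    if not edges:
        problems.append("no edges predicted")
    seen = set()
    for regulator, target, score in edges:
        if not math.isfinite(score):
            problems.append(f"non-finite score for {regulator}->{target}")
        if (regulator, target) in seen:
            problems.append(f"duplicate edge {regulator}->{target}")
        seen.add((regulator, target))
    return problems
```

An empty list means nothing obviously wrong was found, not that the results are right. It’s still worth eyeballing a few known regulator-target pairs.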
If you’re really struggling, check to see if there are alternative packages that can do the same thing. If there aren’t any, consider writing your own code, publishing it as a paper, and then abandoning it so that others in the future can share your pain.
Happy coding, everyone!
This should be required reading! And not just for biosciences folk.
Loved the last point.