How do I solve [insert problem here] with Python?

You have a project that inspires you, you’ve started learning a little Python and looking for more resources, and now you’re probably thinking “but how do I do [thing I specifically need]?”

That’s where libraries come in. Libraries get their name from the physical equivalent – just like the GC library provides books that can help a project along, programming libraries provide code encapsulated as functions and objects to carry out tasks common to the domain of the library.

hamster robot that you can program with Python
sourced from: https://xkcd.com/413/

In this comic, the pair has a new pet and they’d really like it to have a soul. They find a solution in the (fictional) Soul library, and they want to use it in the script they’re writing to control the robot. They can do that because of the Python keyword import, which tells Python to go get the (installed!) library whose name follows the keyword. The full Python statement to import the Soul library is:

import soul

They can program their new pet via the Python interface for ROS (Robot Operating System).
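Since the Soul library is fictional, here is a minimal sketch of the same import statement using libraries that ship with Python, so nothing needs to be installed first. The variable names and the example path are made up for illustration.

```python
import math                  # import a whole library
import statistics as stats   # import a library under a shorter alias
from pathlib import Path     # import a single name from a library

# Use a constant from math, reached through the library name
radius = 2
area = math.pi * radius ** 2

# Use a function through the alias chosen above
mean_value = stats.mean([1, 2, 3, 4])

# Use the imported name directly; the path is only an example string
suffix = Path("notes/draft.txt").suffix

print(area, mean_value, suffix)
```

All three forms load the library once; they differ only in how you refer to its contents afterwards.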

What if your problem isn’t robots? The following is a highly biased sampling of common tasks that the Digital Fellows are asked about, and it isn’t remotely exhaustive. The library recommendations are also biased toward well-maintained open source projects with large, sustainable communities, which you should opt for whenever possible. So what if you want to…

create a […] using a web-framework?

  • single task web application
    • Bottle – Micro web-framework with no dependencies
  • single task web application that other people will actually use
    • Flask – More robust micro web-framework
  • web application that has to do a lot of things
    • Pyramid – The Start Small, Finish Big, Stay Finished Framework
  • content management system
    • Django – Web Framework for Perfectionists With Deadlines

get data from…

  • most government/open datasets?
    • sodapy – Python client for the Socrata Open Data API (Application Programming Interface: how you talk to the site via code)
  • what if there’s no good library?
    • Requests – HTTP for Humans. Talk to the internet without having to assemble query strings and raw HTTP requests by hand
  • sites with no public API?
    • Beautiful Soup – Web scraper designed for quick turnaround projects like screen-scraping.
    • Scrapy – Framework for extracting the data you need from websites in a fast, simple, yet extensible way.
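To make the “HTTP for Humans” point concrete, here is a rough sketch of what building a query URL by hand looks like with only the standard library. The endpoint and parameter names are hypothetical. With Requests installed, the same dictionary can instead be passed as the params argument of requests.get, which does this encoding for you.

```python
from urllib.parse import urlencode


def build_query_url(base_url, params):
    """Return base_url with params appended as an encoded query string."""
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and parameters, for illustration only
url = build_query_url(
    "https://example.com/api/records",
    {"city": "New York", "limit": 5},
)
print(url)  # https://example.com/api/records?city=New+York&limit=5
```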

work with…

  • spreadsheets and tables?
    • Pandas – Python Data Analysis Library
  • arrays and matrices?
    • NumPy – N-dimensional Array Library
    • xarray – N-D labeled (timestamps, locations, etc) arrays and datasets
    • SciPy – Scientific Computing Library
  • shapefiles and geographic data?
    • GeoPandas – Extends Pandas to allow spatial operations on geometric types.
  • images and video?
    • Pillow – Friendly Fork of the Python Imaging Library
    • OpenCV – Open Source Computer Vision Library
  • audio?
    • librosa – Library for audio and music analysis
    • Parselmouth – Praat in Python, the Pythonic way
  • databases?
    • SQLAlchemy – SQL Toolkit and Object Relational Mapper
  • data that’s a little too big?
    • Dask – Flexible parallel computing library for analytic computing
    • PySpark – Fast and general purpose cluster computing system

explore data using…

  • the scientific ecosystem?
  • textual analysis?
    • NLTK – Natural Language Toolkit
    • spaCy – Industrial-Strength Natural Language Processing
  • descriptive and inferential statistics?
  • machine learning?
    • scikit-learn – Tools for data mining and data analysis
    • Keras – Deep Learning Library
    • PyTorch – Tensors and Dynamic neural networks
  • network analysis?
    • NetworkX – Software for complex networks

visualize data…

  • in publication-worthy plots?
    • matplotlib – Static and interactive data visualization
    • seaborn – Statistical data visualization
  • in interactive dashboards?
    • bokeh – Interactive visualization library that targets modern web browsers for presentation.
    • glue – Explore relationships within and among related datasets
  • on a map?

test code?

  • pytest – write clean unit and functional tests
  • hypothesis – a modern implementation of property based testing
  • splinter – automate browser actions like interacting with site buttons
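As a rough sketch of what a pytest-style test looks like: pytest collects functions whose names start with test_ and reports any failing assert. The slugify function is a made-up example of code under test, not part of any library listed here.

```python
def slugify(title):
    """Lowercase a title and join its words with hyphens."""
    return "-".join(title.lower().split())


def test_slugify_joins_words_with_hyphens():
    assert slugify("Hello World") == "hello-world"


def test_slugify_collapses_extra_whitespace():
    assert slugify("  Python   Libraries ") == "python-libraries"
```

Saved in a file such as test_slugify.py (a hypothetical name), running pytest in that folder should find and run both test functions.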

What if you need a library not listed here? Searching “[my problem] + Python” in a search engine will almost certainly yield code, but there are also indexes of packaged code (code that has been turned into an installable library) that make installation much easier:

  • If you’re using the Anaconda distribution of Python (which is the one the fellows recommend), you should use its package channels when possible because the packages there are built to work with that distribution. The channels to check are Anaconda’s default channel and the community-maintained conda-forge.
  • If the package isn’t in the conda channels, or you’re not using Anaconda, then you should look in PyPI (the Python Package Index).
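Before installing anything, it can help to check whether a library is already available. The sketch below uses only the standard library; describe_package is a hypothetical helper name. Note that the name you import can differ from the name you install: Beautiful Soup, for example, is installed as beautifulsoup4 but imported as bs4.

```python
from importlib import metadata, util


def describe_package(import_name, dist_name=None):
    """Report whether a library can be imported and, if known, its version.

    import_name is the top-level name used in an import statement;
    dist_name is the name used when installing, if it differs.
    """
    if util.find_spec(import_name) is None:
        return f"{import_name}: not installed"
    try:
        version = metadata.version(dist_name or import_name)
    except metadata.PackageNotFoundError:
        # Importable, but no installed distribution under that name
        # (for example, modules that ship with Python itself)
        return f"{import_name}: installed (version unknown)"
    return f"{import_name}: installed, version {version}"


print(describe_package("json"))                   # ships with Python
print(describe_package("bs4", "beautifulsoup4"))  # result depends on your setup
```

If a library is reported as not installed, conda install or pip install followed by the package name is the usual next step.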

What resources are there to help you find the right libraries and learn how to use them?