The Most Underrated Python Packages
In my experience as a Python user, I’ve come across a lot of different packages and curated lists. Some are in my bookmarks, like the great awesome-python-data-science and awesome-python curated lists. If you don’t know them, go check them out ASAP.
In this post, I’d like to show you something else. These are the results of late-night GitHub/Reddit browsing, and cool stuff shared by colleagues.
Some of these packages are really unique; others are just fun to use and are real underdogs among the data scientists and statisticians I’ve worked with.
Misc (the weird ones)
- Knock Knock: Send notifications from Python to mobile devices, the desktop, or email.
- tqdm: Extensible Progress Bar for Python and CLI, with built-in support for pandas.
- Colorama: Simple cross-platform colored terminal text.
- Pandas-log: Provides feedback about basic pandas operations. Great for debugging long pipe chains.
- Pandas-flavor: The easy way to extend pandas DataFrame/Series.
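To give a feel for how little code a progress bar takes, here is a minimal sketch of wrapping a loop with tqdm. The function name `sum_of_squares` and the `try`/`except` fallback are my own additions for illustration; the fallback only exists so the snippet still runs if tqdm isn't installed.

```python
# Optional dependency: fall back to a no-op wrapper if tqdm is not installed.
# (The fallback is an assumption for this sketch, not part of tqdm itself.)
try:
    from tqdm import tqdm
except ImportError:
    def tqdm(iterable, **kwargs):
        return iterable


def sum_of_squares(n):
    """Sum i*i for i in range(n), showing a progress bar while looping."""
    total = 0
    # Wrapping any iterable in tqdm() is enough to get a progress bar.
    for i in tqdm(range(n), desc="squaring"):
        total += i * i
    return total


result = sum_of_squares(1000)
print(result)
```

The bar is written to stderr, so it doesn't get mixed into whatever your script prints to stdout.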
Data Cleaning and Manipulation
- ftfy: Fixes mojibake and other glitches in Unicode text, after the fact.
- janitor: A lot of cool functions to clean data.
- Optimus: Another package for data cleaning.
- Great-expectations: A great package to check if your data obeys your expectations.
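If you've never run into mojibake, here is a small standard-library sketch of how it appears, so it's clearer what ftfy is fixing. This is not ftfy's algorithm: the manual round trip below only works because I know exactly which wrong encoding was applied (Windows-1252 here, an assumption of the example). As far as I understand it, ftfy's value is that it guesses this for you.

```python
original = "café ✔"

# Simulate the bug: UTF-8 bytes are wrongly decoded as Windows-1252.
garbled = original.encode("utf-8").decode("cp1252")
print(garbled)

# Manual repair: reverse the wrong decode, then decode correctly.
# This only works when the wrong encoding is known in advance.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired)
```

In real data you rarely know which encodings were mixed up, or how many times, which is where a dedicated package earns its place.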
Data Exploration and Modelling
- Pandas-profiling: Create an HTML report full of statistics from a pandas DataFrame.
- pydqc: Compare statistics between two datasets.
- Pandas-summary: An extension to the pandas DataFrame describe function.
- pivottablejs: Drag’n’drop pivot tables for pandas inside Jupyter notebooks.
Performance Checking and Optimization
- Py-spy: Sampling profiler for Python programs.
- pyperf: Toolkit to run Python benchmarks.
- snakeviz: An in-browser Python profile viewer with great support for Jupyter notebooks.
- Cachier: Persistent, stale-free, local and cross-machine caching for Python functions.
- Faiss: A library for efficient similarity search and clustering of dense vectors.
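A viewer like snakeviz needs a profile file to look at, so here is a minimal sketch of producing one with the standard library's cProfile. The function `slow_sum` and the file name `slow_sum.prof` are made up for the example.

```python
import cProfile
import os
import pstats
import tempfile


def slow_sum(n):
    """A deliberately simple workload to profile."""
    return sum(i * i for i in range(n))


# Collect profiling data only around the call we care about.
profiler = cProfile.Profile()
profiler.enable()
slow_sum(200_000)
profiler.disable()

# Save the stats to a file that a profile viewer can open.
prof_path = os.path.join(tempfile.gettempdir(), "slow_sum.prof")
profiler.dump_stats(prof_path)

# Quick text summary of the five most expensive entries.
stats = pstats.Stats(prof_path)
stats.sort_stats("cumulative").print_stats(5)
```

With snakeviz installed, running `snakeviz` followed by the path of the saved file should open the same data in your browser.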
I hope you found something useful or fun for your work. I’m going to expand the post in the future, so stay tuned for new updates!