Python / data sci / ML + AI resources

Hi all,

Posting this because I keep sending people emails that are some version of it. I’ll try to keep this post updated and just send people links to it instead.
It’s a list of resources I found helpful when learning Python, data science, writing software for research, machine learning, deep learning, etc.
Thought it might be worth doing now after chatting with @njourjine and @ralphpeterson.
Some of this might be old news but if any of it helps someone learn without having to find out the hard way like I did, it’s worth it.

Python Programming:

Think Python is a great intro to Python.

Read this and you know 90% of what you need to know to write Python.

https://greenteapress.com/wp/think-python-2e/

The second book I would read is Fluent Python.

(I access through a university O’Reilly subscription but you can probably find this book with some clever Google searches as well.)

Read that and you will know 150% of what you need.

Some of the later chapters are stuff you might never use but the first ~half of the book on the Python data model will open your eyes.
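
To give a flavor of what I mean by the data model, here is a minimal sketch of my own (not taken from the book). The `Playlist` class and its contents are made up for illustration. The point is that by defining a couple of special methods, a plain class starts working with built-ins like `len()`, indexing, slicing, `in`, and `for` loops.

```python
class Playlist:
    """Hypothetical example class, only for illustration."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        # Called by the built-in len()
        return len(self._songs)

    def __getitem__(self, index):
        # Called for playlist[0] and playlist[1:]; delegating to the
        # underlying list means both integers and slices work.
        return self._songs[index]

    def __repr__(self):
        # Called by repr() and by the interactive prompt
        return f"Playlist({self._songs!r})"


playlist = Playlist(["intro", "verse", "outro"])

print(len(playlist))        # uses __len__
print(playlist[0])          # uses __getitem__ with an int
print(playlist[1:])         # uses __getitem__ with a slice
print("verse" in playlist)  # no __contains__, so Python falls back to iterating

# No __iter__ either: Python falls back to calling __getitem__ with
# 0, 1, 2, ... until it gets an IndexError.
titles = [song.title() for song in playlist]
print(titles)
```

Nothing here inherits from `list`. Python looks for the special methods and the built-in syntax just works, which is the idea the first half of the book builds on.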

Reproducible research:

For all my projects I basically use the structure suggested in this paper:

Machine learning / deep learning:

For a very good, intuitive introduction to machine learning, I very much recommend Andrew Ng’s Coursera course:

The exercises are in Matlab but you can find Python versions online, e.g.

The Andrew Ng deep learning course is also good.

Other links:

I really like Allen Downey’s books in general as examples of code to read and for a more computer science-like intro to Python

https://greenteapress.com/wp/

e.g., the example code in his Digital Signal Processing book

https://greenteapress.com/wp/think-dsp/

Jake VanderPlas has some excellent books as well:
This would be good for someone coming to Python from Matlab

and this one focuses on the libraries that a lot of data scientists use in Python

I also very much like

especially the introduction and first 2-3 chapters that sum up everything I’d want to tell someone coming to the scientific/data science Python ecosystem.

Good practical books on Python packages and on deep learning from Tomas Beuzen:

(I’m not a huge fan of Poetry, the packaging tool this book uses, but the book still gives a nice high-level intro.)

https://www.tomasbeuzen.com/deep-learning-with-pytorch

Research software engineering + reproducibility:

https://the-turing-way.netlify.app/reproducible-research/reproducible-research.html

A lot of stuff in these I wish I had learned sooner. Material from the RSE book has some overlap with Carpentries courses but later chapters will be new.

Also turns out to be good for making life easy in an industry job, cuz a lot of these ideas originate in software engineering anyway.

Some books I’m working through now because I’m realizing it would be good to have a better understanding of ML from a probabilistic POV:

https://probability4datascience.com/
