===== Machine Learning 101 =====

Goal: //Answer the question: What is Machine Learning and how do you use it?//

==== Prepare for the lab ====

  * [[wp>All_models_are_wrong|All models are wrong]]
  * [[https://artint.info/2e/html/ArtInt2e.Ch7.html|Supervised Machine Learning]] -- //this is a fairly comprehensive but relatively easy introduction to machine learning. It's not necessary to memorize everything - just get a general idea of what machine learning is and what the basic techniques are//

==== Materials ====

  - Q&A Session -- a short series of questions to "warm up" (based on the textbook):
    - What is the difference between supervised learning, unsupervised learning, and reinforcement learning?
    - What is the difference between regression and classification?
    - What is linear regression?
    - How do we select the best linear regression model? Do you know what MSE is?
    - What is a decision tree?
    - What is a neural network?
    - What is overfitting and why is it a problem?
    - How do we deal with this problem in linear regression? And in other models?
    - Why do we separate the training set from the test set?
    - What is cross-validation?
    - All models are wrong. Is this a problem?
    - Is 80% accuracy a good result?
  - Practice session:
    - Today we will practice two basic models: linear regression (for regression problems) and logistic regression (for classification problems). There will also be a short bonus on neural networks. To do the tasks, go to the Jupyter Notebooks listed below.
      - [[https://colab.research.google.com/drive/1nn1W8SKMAIbnHj2GHBBYTOLIyob2eGL5?usp=sharing|ML. Regression]]
      - [[https://colab.research.google.com/drive/1qmAvC8DIb6ESfQogsIDmVv2GKGHuuJQ8?usp=sharing|ML. Classification]]
  - Advanced practice session:
    - If you want to tackle additional topics, do the optional Advanced section in both notebooks. This will give you the opportunity to generate artificial features (for linear regression) and learn about decision trees (for classification problems).

==== Learn more! ====

  * [[https://github.com/afshinea/stanford-cs-229-machine-learning/|Machine Learning cheatsheets]] -- may be very useful during the class :-)
  * Plotz - [[https://doi.org/10.1145/3459666|Applying Machine Learning for Sensor Data Analysis in Interactive Systems: Common Pitfalls of Pragmatic Use and Ways to Avoid Them]] (ACM, 2022)
  * [[https://www.kdnuggets.com/2018/04/10-machine-learning-algorithms-data-scientist.html|Ten Machine Learning Algorithms You Should Know to Become a Data Scientist]]
  * [[http://www.kdnuggets.com/2017/05/guerrilla-guide-machine-learning-python.html|The Guerrilla Guide to Machine Learning with Python]] ("a complete course for the quick study hacker with no time (or patience) to spare")
  * [[https://www.lenwood.cc/2014/05/13/12-free-data-mining-books/|14 Free (as in beer) Data Mining Books]]
  * ML courses at Coursera.org:
    * [[https://www.coursera.org/learn/machine-learning|Machine learning]]
    * [[https://www.coursera.org/specializations/machine-learning|Machine learning specialization]]
    * [[https://www.coursera.org/specializations/jhu-data-science|Data Science specialization]]
  * [[https://steelkiwi.com/blog/what-is-machine-learning/|What is Machine Learning and What is It Not?]]
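
To make the warm-up questions on linear regression, MSE, and the train/test split more concrete, here is a minimal sketch in Python. It is not taken from the lab notebooks: the synthetic data (y = 2x + 1 plus noise), the 80/20 split, and the helper names ''mse'' and ''fit_linear'' are all assumptions chosen for illustration, and only NumPy is used.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return float(np.mean((y_true - y_pred) ** 2))


def fit_linear(x, y):
    """Least-squares fit of y ~ slope * x + intercept (hypothetical helper)."""
    # Design matrix with a column of ones so the intercept is fitted too.
    design = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1]


# Synthetic data (an assumption for illustration): y = 2x + 1 plus Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=200)
y = 2.0 * x + 1.0 + rng.normal(0.0, 1.0, size=200)

# Hold out the last 20% as a test set; the model never sees it while fitting.
# The samples are drawn independently, so no extra shuffling is needed here.
split = int(0.8 * len(x))
x_train, y_train = x[:split], y[:split]
x_test, y_test = x[split:], y[split:]

slope, intercept = fit_linear(x_train, y_train)
train_mse = mse(y_train, slope * x_train + intercept)
test_mse = mse(y_test, slope * x_test + intercept)

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"train MSE={train_mse:.2f}, test MSE={test_mse:.2f}")
```

The test MSE is the number to look at when comparing candidate models: a model that overfits can drive the training MSE down while the error on held-out data stays the same or grows.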