What Is Regression: Supervised Machine Learning Tips


We are all about democratizing access to learning #DataScience and #CognitiveComputing. Register today to gain the skills you need to change the world. Proudly an IBM initiative.

What Is Supervised Machine Learning and What Is Regression?

Machine Learning is a subset of Artificial Intelligence, defined by the ability of a machine to learn from data. Supervised Machine Learning is, in turn, a subcategory of Machine Learning, defined by its use of labelled datasets to train algorithms that can accurately predict outcomes or classify data. Regression is a supervised technique used to understand the relationship between a target variable of interest and the independent predictor variables that influence it. Regression analysis not only captures the important relationships between variables but also reduces the complexity of real-world problems, so that we can understand them, interpret them, and put that interpretation to use.

Why Is It Important?

Regression is commonly used to predict a numeric outcome. Examples include predicting the sale price of a house in real estate, understanding the relationship between drug dosage and blood pressure in medical research, relating advertising spending to revenue in business, and measuring the effect of fertilizer and water on crop yields in agriculture. Many more real-life applications have become commonplace in quantitative analysis. Linear regression, logistic regression, and polynomial regression are among the most popular regression algorithms, although logistic regression, despite its name, is typically used to classify data rather than to predict a numeric value.
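To make the advertising example concrete, here is a minimal sketch of a simple linear regression fitted by ordinary least squares in plain Python. The data values, variable names, and the function name `fit_simple_linear_regression` are made up for illustration and do not come from any real study.

```python
# Made-up illustration data (not from any real study):
# advertising spend and revenue, both in thousands of dollars.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [2.1, 3.9, 6.2, 7.8, 10.0]


def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The slope is the co-variation of x and y divided by the variation of x.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_simple_linear_regression(ad_spend, revenue)

# Predict revenue for a spend level that is not in the data.
predicted_revenue = intercept + slope * 6.0

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"predicted revenue at spend 6.0: {predicted_revenue:.2f}")
```

The slope estimates how much revenue changes for each additional unit of advertising spend, which is the kind of relationship regression is meant to quantify. In practice a library such as scikit-learn would usually handle the fitting; the hand-written version is only meant to show what happens underneath.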

Some Tips for a Good Quality Regression Analysis

There are a few strategies in regression analysis that can help us obtain the most accurate results. The first tip is to start with a simple model containing just a few predictor variables. As we progress, we can make the model more complex as needed. But let’s be sure that the added complexity (more predictors or higher-order terms of existing predictors) truly improves the precision. While complexity tends to increase the model fit (how closely the model matches the data on which it was trained), it also tends to lower the precision of the predictions (wider prediction intervals).

The next tip is to use prior studies to determine which variables to include in the regression model. If previous data and analysis results are available, we can often apply the same strategy to the current data. For example, previous studies may already tell us which variables are the most significant in determining a house sale price.

Another important tip is to make sure that our predictor variables are not strongly correlated with one another. Each predictor variable should have its own independent effect on the target variable. An example of correlated predictor variables would be the garage area and the number of cars that can fit in the garage, as both reflect the size of the garage.

The last tip for regression modelling is to present both the confidence interval and the statistical significance with the results. The two provide consistent information; however, which one is presented changes how people perceive that information. Some studies suggest that reporting only the confidence interval may lead readers to misinterpret the results.
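The check for correlated predictors described above can be sketched as follows, using the garage example. This is only an illustration: the data values, the extra `house_age` predictor, and the 0.8 cutoff are assumptions made for this example, not values from the article or from any real dataset.

```python
from itertools import combinations
from math import sqrt

# Made-up illustration data for six houses (not from any real dataset).
predictors = {
    "garage_area": [200, 250, 400, 450, 600, 650],  # square feet
    "garage_cars": [1, 1, 2, 2, 3, 3],              # car capacity
    "house_age": [30, 5, 22, 40, 12, 18],           # years
}

# Assumed cutoff for "strongly correlated"; choose one that suits your data.
THRESHOLD = 0.8


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# Compare every pair of predictors and keep the strongly correlated ones.
flagged = []
for name_a, name_b in combinations(predictors, 2):
    r = pearson(predictors[name_a], predictors[name_b])
    print(f"{name_a} vs {name_b}: r = {r:.2f}")
    if abs(r) > THRESHOLD:
        flagged.append((name_a, name_b))

print("Strongly correlated pairs:", flagged)
```

With this data, only the garage area and garage capacity pair is flagged, so one of the two could be dropped or the two could be combined before fitting the model. Pairwise correlation is a first screen only; it will not reveal a predictor that is a combination of several others.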

How Would I Learn Supervised Machine Learning & Regression?

The most effective way I can recommend to build the skills you need in Supervised Machine Learning and Regression techniques is to take a course. Courses are effective because the information you seek is already collected and organized for you, and qualified instructors from leading-edge companies around the world are there to take you from beginner to expert.

I highly suggest taking this course created by IBM, as it is a valuable option for individuals of all levels of knowledge:

IBM Supervised Machine Learning: Regression

What's Included?

The course contains videos and reading material, as well as interactive practice labs where learners can explore and apply the skills they learned in the course. You will work in Python in Jupyter Notebook, within a cloud-based Skills Network environment that is preconfigured for you, with the packages and libraries you need available to download. The course introduces topics such as linear and polynomial regression, cross-validation, and regularization techniques. The practice labs use the ‘scikit-learn’ and ‘scipy’ libraries, among many others, to perform the analysis.

Start Now:

IBM Supervised Machine Learning: Regression

About The Author

Svitlana (Lana) Kramar

I am a Data Science Intern at IBM and a Master's student in Data Science and Analytics at the University of Calgary. I enjoy travelling, learning new languages and cultures, and sharing my passion for Data Science.
